Antigravity Basics / 2026-04-09 / Intermediate

Setting Up Local LLMs in Antigravity for Practical Use

Step-by-step guide to configuring local LLMs in Antigravity. Covers Ollama and LM Studio integration, recommended models, Gemma 4 local setup, and troubleshooting tips for a privacy-first development environment.


Antigravity is a powerful AI development environment, but many developers worry about the cost and privacy implications of cloud-based LLM services. Using a local LLM with Antigravity addresses both.

This guide shows you how to set up Antigravity with local LLM runners like Ollama and LM Studio, enabling a privacy-first development workflow while maximizing productivity.

Why Use Local LLMs with Antigravity?

1. Privacy and Data Security

Process code, documents, and sensitive data locally without sending anything to cloud services. Perfect for companies and regulated industries where data privacy is non-negotiable.

2. Cost Savings and Unlimited Usage

No API billing means unlimited experimentation and bulk code generation. Especially valuable during prompt refinement and parallel multi-file edits.

3. Low Latency and Offline Capability

With no network round-trip, responses can feel much faster, although actual speed depends on your hardware and model size. You can also work offline on planes, trains, or anywhere without internet.

4. Customization and Fine-Tuning

Adjust inference parameters, fine-tune models, and optimize for your specific workflow. Build AI assistants tailored to your development style.

5. Gemma 4 Local Edition on the Horizon

Google is expected to release a locally runnable version of Gemma 4, giving free access to a high-performance open model.

Comparing Local LLM Runners

Ollama (Recommended)

Strengths:

  • Easiest setup (under 10 minutes)
  • Supports Mistral, Llama 2, Neural Chat, and more
  • Memory-efficient, runs on modest hardware
  • Simple command-line interface

OS Support: macOS, Linux, Windows
Recommended Specs: 8GB RAM (16GB ideal)

LM Studio

Strengths:

  • Rich GUI, beginner-friendly
  • Built-in OpenAI API-compatible server
  • Intuitive model management
  • Complete workflow in GUI

OS Support: macOS, Linux, Windows
Recommended Specs: 16GB RAM+

LocalAI

Strengths:

  • Docker-native, container-friendly
  • Parallel multi-model execution
  • Microservice integration
  • Enterprise-grade features

OS Support: All (Docker required)
Recommended Specs: Depends on model

Quick Comparison

Runner      Setup Ease      GUI   Performance     Best For
Ollama      ★★★ Easy        No    ★★★ Fast        Everyone
LM Studio   ★★☆ Moderate    Yes   ★★☆ Moderate    GUI preference
LocalAI     ★☆☆ Complex     Yes   ★★☆ Moderate    DevOps/containers

Conclusion: For most developers, Ollama is the optimal choice — simple setup, excellent performance, strong community support.

Setting Up Local LLMs with Ollama

Step 1: Install Ollama

Download from the official site (https://ollama.ai) for your operating system.

macOS:

# Via Homebrew
brew install ollama
 
# Or download directly from https://ollama.ai/download/mac

Linux (Ubuntu/Debian):

# Install
curl -fsSL https://ollama.ai/install.sh | sh
 
# Start service
sudo systemctl start ollama
sudo systemctl enable ollama

Windows: Download the Windows installer from the official site and run it. Ollama then starts automatically in the background.

Step 2: Download and Run a Model

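After installation, pull a model using the command line. The initial download takes several minutes depending on model size. The pull command talks to the Ollama server, so if Ollama is not already running in the background, run ollama serve first in a separate terminal.

# Mistral 7B (lightweight, fast)
ollama pull mistral
 
# Or Llama 2 13B (higher quality); the bare "llama2" tag is the 7B variant
ollama pull llama2:13b
 
# Start the Ollama API server (skip if it is already running in the background)
ollama serve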

You'll see a log line similar to:

time=2026-04-09T10:00:00.000Z level=INFO msg="Listening on 127.0.0.1:11434"

Ollama now serves an API on http://127.0.0.1:11434.

Step 3: Test the Model

In another terminal, verify the model is working:

curl http://127.0.0.1:11434/api/generate \
  -d '{
    "model": "mistral",
    "prompt": "What is the capital of France?",
    "stream": false
  }' | jq '.response'

The server replies with a JSON object, and jq prints only its response field. The exact wording varies by model, for example:

"The capital of France is Paris."
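The same request can be made from a script. The following is a minimal sketch in Python using only the standard library; the function names are illustrative, and it assumes the server is listening on the default address shown above.

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # endpoint from the curl example


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_response(raw_body: bytes) -> str:
    """Return the generated text from a non-streaming reply body."""
    return json.loads(raw_body.decode("utf-8"))["response"]


def ask(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the local server and return the answer text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_response(reply.read())
```

With the server running, ask("mistral", "What is the capital of France?") returns the answer as a plain string.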

Connecting Local LLM to Antigravity

Step 1: Open Antigravity Settings

Launch Antigravity and go to Settings → LLM Provider.

Step 2: Add Custom Provider

Click "Add Custom Provider" and fill in:

Field            Value
Provider Name    Local Ollama
API Endpoint     http://127.0.0.1:11434/api/generate
Model            mistral
API Key          (leave blank)
Authentication   None
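A typo in the endpoint or port is an easy mistake to make here, so it can help to sanity-check the values before saving. The sketch below is a hypothetical helper, not part of Antigravity: ProviderSettings and check_settings are names invented for illustration, and the checks simply assume the default Ollama address used in this guide.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class ProviderSettings:
    """Hypothetical container mirroring the form fields; not an Antigravity API."""
    name: str
    endpoint: str
    model: str
    api_key: str = ""


def check_settings(settings: ProviderSettings) -> list[str]:
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    url = urlparse(settings.endpoint)
    if url.scheme != "http":
        problems.append("endpoint should use plain http for a local server")
    if url.hostname not in ("127.0.0.1", "localhost"):
        problems.append("endpoint host is not the local machine")
    if url.port != 11434:
        problems.append("port differs from Ollama's default 11434")
    if url.path != "/api/generate":
        problems.append("path is not /api/generate")
    if not settings.model:
        problems.append("model name is empty")
    return problems
```

An empty result for the values in the table means they at least agree with the Ollama setup from the previous section.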

Step 3: Test Connection

Click "Test Connection" to verify Antigravity can reach your local model. Once successful, click "Save".

Step 4: Set as Default

Select Local Ollama from the dropdown and click "Set as Default". All future code generation and completions will use your local model.

Recommended Models and Use Cases

Mistral 7B (Recommended, General Purpose)

Specs:

  • Size: ~4GB
  • Speed: Fast (seconds on GPU)
  • Quality: Moderate (limited Japanese support)
  • Use cases: Code completion, simple comments, quick debugging

Install:

ollama pull mistral

Best for lightweight tasks with minimal memory usage.
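The size figures quoted for these models can be roughly reproduced from the parameter count. The sketch below assumes about 4.5 bits per weight, which is an assumption typical of 4-bit quantized formats rather than a figure from this guide. Memory use at runtime is higher than the file size because of the context cache.

```python
def estimate_model_size_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate on-disk size in GB: parameters x bits per weight / 8."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


for label, params in [("7B", 7.0), ("13B", 13.0)]:
    print(f"{label}: ~{estimate_model_size_gb(params):.1f} GB")
# prints:
# 7B: ~3.9 GB
# 13B: ~7.3 GB
```

These estimates line up with the ~4GB and ~7GB sizes listed for the 7B and 13B models.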

Llama 2 13B (Balanced)

Specs:

  • Size: ~7GB
  • Speed: Moderate (5-10 seconds on GPU)
  • Quality: High (better instruction understanding)
  • Use cases: Complex code generation, documentation, error analysis

Install:

ollama pull llama2:13b

Great for complex development tasks and multi-file edits.

Neural Chat (Japanese Support)

Specs:

  • Size: ~4GB
  • Speed: Fast
  • Quality: Good Japanese support
  • Use cases: Japanese comments, explanations

Install:

ollama pull neural-chat

Perfect for Japanese-language prompts and outputs.

Code Llama (Code Specialist)

Specs:

  • Size: ~4GB for the default 7B tag (~7GB for the 13B variant)
  • Speed: Moderate
  • Quality: Excellent (deep code understanding)
  • Use cases: Complex algorithms, security-critical code

Install:

ollama pull codellama

Use for production-grade code generation.

Selection Guide

What's your task?

→ "Simple completions, quick comments"
  Use: Mistral 7B (lightest, fastest)

→ "General code generation, debugging"
  Use: Llama 2 13B (recommended)

→ "Complex algorithms, secure code"
  Use: Code Llama (code-focused)

→ "Japanese interaction"
  Use: Neural Chat (Japanese-aware)
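If you script against the local server, the guide above reduces to a small lookup table. This is only a sketch: the task keys are made-up labels, and llama2:13b is used because it is the tag for the 13B variant.

```python
# Task categories and model tags taken from the selection guide above.
MODEL_BY_TASK = {
    "completion": "mistral",
    "general": "llama2:13b",
    "algorithms": "codellama",
    "japanese": "neural-chat",
}


def pick_model(task: str, default: str = "mistral") -> str:
    """Return the Ollama model tag for a task category, or the default."""
    return MODEL_BY_TASK.get(task.strip().lower(), default)
```

Unknown categories fall back to the lightest model, so a mistyped key does not accidentally load the heaviest one.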

Gemma 4 Local Edition: What to Expect

Google plans to release Gemma 4, an open-source model competitive with Llama 2 and Mistral. Based on Google's AI research, Gemma 4 stands out for:

Expected Features

  • Japanese Support: Built on Google's multilingual NLP, excellent Japanese output
  • Code Understanding: Deep familiarity with Python, Java, JavaScript
  • Safety: Built-in safety filters for responsible AI use

Using Gemma 4 Locally with Antigravity

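Once released, pulling it should look something like the following. The tag names shown are placeholders, so check the Ollama model library for the actual ones: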

ollama pull gemma:4b
# or
ollama pull gemma:13b

Access Google's advanced AI capabilities for free, offline, and without usage limits.

Troubleshooting: Common Connection Issues

Error: "API connection timeout"

Cause: Ollama server not running or port mismatch

Check:

# Is Ollama running? (the [o] keeps grep from matching itself)
ps aux | grep '[o]llama'
 
# Test local connection
curl http://127.0.0.1:11434/api/generate -d '{"model":"mistral","prompt":"test"}'

Fix:

ollama serve
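To separate "server not running" from "wrong port", it can help to test the TCP port directly before involving any model. This is a minimal sketch using Python's standard socket module; is_port_open is an illustrative name and the defaults assume Ollama's standard address.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 11434, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

If this returns False for port 11434, the server is not listening there. If it returns True but Antigravity still times out, the endpoint configured in Antigravity is the more likely culprit.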

Error: "Model not found"

Cause: Model not downloaded

Check:

ollama list

Fix:

ollama pull mistral
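The same check can be done over the API, which is useful when the model name comes from a config file. The sketch below parses the reply of Ollama's /api/tags endpoint, which lists local models; the helper names are illustrative.

```python
import json
import urllib.request

TAGS_URL = "http://127.0.0.1:11434/api/tags"  # lists locally available models


def installed_models(tags_body: bytes) -> list[str]:
    """Extract model names such as 'mistral:latest' from an /api/tags reply."""
    return [entry["name"] for entry in json.loads(tags_body)["models"]]


def has_model(tags_body: bytes, wanted: str) -> bool:
    """True if `wanted` is installed; a bare name matches any tag of that model."""
    names = installed_models(tags_body)
    if ":" in wanted:
        return wanted in names
    return any(name.split(":", 1)[0] == wanted for name in names)


def fetch_tags(timeout: float = 5.0) -> bytes:
    """Download the raw /api/tags reply from the local server."""
    with urllib.request.urlopen(TAGS_URL, timeout=timeout) as reply:
        return reply.read()
```

With the server running, has_model(fetch_tags(), "mistral") tells you whether a pull is still needed.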

Issue: Slow Performance (10+ seconds)

Cause: Running on CPU only, or memory constraints

Check:

top        # Check CPU and memory usage
ollama ps  # Check whether the loaded model is running on CPU or GPU

Fix:

  • Switch to lighter model (Mistral 7B)
  • Increase system RAM
  • Update GPU drivers
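To see whether a change actually helped, the timing fields in a non-streaming /api/generate reply can be turned into a tokens-per-second figure. This is a rough sketch that assumes the reply has already been parsed into a dict; the durations are reported in nanoseconds, and the numbers in the usage note are invented for illustration.

```python
def tokens_per_second(reply: dict) -> float:
    """Generation speed from eval_count (tokens) and eval_duration (nanoseconds)."""
    seconds = reply["eval_duration"] / 1e9
    return reply["eval_count"] / seconds if seconds > 0 else 0.0


def load_share(reply: dict) -> float:
    """Fraction of total time spent loading the model rather than generating."""
    return reply["load_duration"] / reply["total_duration"]
```

For example, a reply with eval_count 120 and eval_duration 4000000000 works out to 30 tokens per second. A high load_share suggests the wait comes from reloading the model rather than from slow generation.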

Error: "Antigravity cannot connect to localhost"

Cause: Security settings blocking local connections

Fix:

  1. Go to Antigravity Preferences → Privacy
  2. Enable "Allow local connections"
  3. Restart Antigravity

Windows: Ollama Service Won't Start

Cause: Windows Defender or antivirus blocking

Fix:

  1. Windows Defender → "Virus & threat protection"
  2. Add ollama.exe to exclusions
  3. Restart Ollama: relaunch the app, or run ollama serve in a terminal

Practical Example: Multi-File Refactoring with Local LLM

Scenario: React Component Modernization

You have:

  • Button.jsx — Old class component
  • Input.jsx — Old class component
  • FormContainer.jsx — Parent using both

Goal: Convert to functional components + hooks.

Step 1: Open Files in Antigravity

Open all three files simultaneously. With a local LLM there is no API round-trip.

Step 2: Analyze Structure

Antigravity's AI analyzes relationships across all files:

Analysis Result:
- Both use deprecated class pattern
- FormContainer uses prop drilling
- Recommendation: Convert to hooks, useCallback for handlers

Step 3: Run Auto-Refactoring

Send "Convert all to functional components + hooks" to your local LLM. No billing, unlimited iterations.

Step 4: Review and Polish

Check results, adjust as needed. Complete privacy — nothing leaves your machine.
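Outside the editor, the same multi-file request can be assembled by hand and sent to the local model. The sketch below shows one possible way to pack several files into a single prompt; the header format and function names are assumptions made for illustration, not something Antigravity prescribes.

```python
from pathlib import Path


def build_refactor_prompt(instruction: str, sources: dict[str, str]) -> str:
    """Combine an instruction and several files into a single prompt string."""
    parts = [instruction.strip(), ""]
    for filename, code in sources.items():
        parts.append(f"--- {filename} ---")
        parts.append(code.rstrip())
        parts.append("")
    parts.append("Return each file in full, preceded by its '--- name ---' header.")
    return "\n".join(parts)


def read_sources(paths: list[str]) -> dict[str, str]:
    """Read the given files into a {filename: contents} mapping."""
    return {Path(p).name: Path(p).read_text(encoding="utf-8") for p in paths}
```

Calling build_refactor_prompt("Convert all to functional components + hooks", read_sources(["Button.jsx", "Input.jsx", "FormContainer.jsx"])) yields one prompt that keeps the three components together, so the model can see how they relate.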

Best Practices for Local LLM Use

1. Model Selection by Task

# Lightweight tasks
ollama run mistral
 
# Complex generation
ollama run codellama

Switch models per project in Antigravity settings.

2. Prompt Optimization

With no API costs, iterate freely:

"Rewrite this function in TypeScript with strict types."
[code]

Better version:
"Convert this JavaScript function to TypeScript:
- Enable strict type checking
- Explicit return types
- Type function parameters"
[code]

More detailed prompts yield better results locally too.
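When iterating on many prompt variants, keeping the requirements as a list makes each variant easy to adjust. This is a small sketch; build_prompt is an illustrative helper that reproduces the layout of the more detailed prompt above.

```python
def build_prompt(task: str, requirements: list[str], code: str) -> str:
    """Format a task, its explicit requirements, and the code into one prompt."""
    lines = [f"{task.rstrip(':')}:"]
    lines += [f"- {item}" for item in requirements]
    lines += ["", code.strip()]
    return "\n".join(lines)
```

Adding or removing a requirement is then a one-line change, which suits the try-it-again workflow that free local inference makes practical.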

3. GPU Memory Management

# Unload a running model (frees memory, keeps the server running)
ollama stop mistral
 
# Reload fresh
ollama run mistral
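Memory can also be released through the HTTP API while the server keeps running. A minimal sketch: sending /api/generate a request with keep_alive set to 0 and no prompt asks Ollama to unload the model right away. The function names are illustrative and the URL assumes the default address.

```python
import json
import urllib.request

GENERATE_URL = "http://127.0.0.1:11434/api/generate"


def unload_payload(model: str) -> bytes:
    """Body for a request that asks Ollama to unload `model` immediately."""
    # No prompt is sent; keep_alive=0 tells the server to evict the model now.
    return json.dumps({"model": model, "keep_alive": 0}).encode("utf-8")


def unload(model: str, timeout: float = 10.0) -> None:
    """POST the unload request to the local server."""
    request = urllib.request.Request(
        GENERATE_URL,
        data=unload_payload(model),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout):
        pass
```

Calling unload("mistral") before switching to a larger model keeps two models from competing for the same GPU memory.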

4. Offline Development

Local LLMs work completely offline — perfect for planes, trains, and remote work.

Summary

Local LLMs with Antigravity enable:

  • Privacy: All code and data stays on your machine
  • Cost Savings: No API billing, unlimited experiments
  • Speed: No network round-trips
  • Offline Work: Develop anywhere, anytime
  • Customization: Fine-tune and optimize for your needs

Setup takes just 10 minutes with Ollama, and you'll immediately experience a more responsive, private development environment.

Next: Learn Antigravity's multi-agent system and leverage local LLMs across multiple agents for autonomous development.
