Antigravity is a powerful AI development environment, but many developers worry about the cost of cloud-based LLM services and privacy concerns. Using a local LLM with Antigravity solves both problems at once.
This guide shows you how to set up Antigravity with local LLM runners like Ollama and LM Studio, enabling a privacy-first development workflow while maximizing productivity.
Why Use Local LLMs with Antigravity?
1. Privacy and Data Security
Process code, documents, and sensitive data locally without sending anything to cloud services. Perfect for companies and regulated industries where data privacy is non-negotiable.
2. Cost Savings and Unlimited Usage
No API billing means unlimited experimentation and bulk code generation. Especially valuable during prompt refinement and parallel multi-file edits.
3. Zero Latency and Offline Capability
Network latency nearly disappears, making responses lightning-fast. Work offline on planes, trains, or anywhere without internet.
4. Customization and Fine-Tuning
Adjust inference parameters, fine-tune models, and optimize for your specific workflow. Build AI assistants tailored to your development style.
5. Gemma 4 Local Edition on the Horizon
Google is expected to release a local-runnable version of Gemma 4, opening access to high-performance open-source models for free.
Comparing Local LLM Runners
Ollama (Recommended)
Strengths:
- Easiest setup (under 10 minutes)
- Supports Mistral, Llama 2, Neural Chat, and more
- Memory-efficient, runs on modest hardware
- Simple command-line interface
OS Support: macOS, Linux, Windows
Recommended Specs: 8GB RAM (16GB ideal)
LM Studio
Strengths:
- Rich GUI, beginner-friendly
- Built-in OpenAI API-compatible server
- Intuitive model management
- Complete workflow in GUI
OS Support: macOS, Linux, Windows
Recommended Specs: 16GB RAM+
LocalAI
Strengths:
- Docker-native, container-friendly
- Parallel multi-model execution
- Microservice integration
- Enterprise-grade features
OS Support: All (Docker required)
Recommended Specs: Depends on model
Quick Comparison
| Runner | Setup Ease | GUI | Performance | Best For |
|---|---|---|---|---|
| Ollama | ★★★ Easy | No | ★★★ Fast | Everyone |
| LM Studio | ★★☆ Moderate | Yes | ★★☆ Moderate | GUI preference |
| LocalAI | ★☆☆ Complex | Yes | ★★☆ Moderate | DevOps/containers |
Conclusion: For most developers, Ollama is the optimal choice — simple setup, excellent performance, strong community support.
Setting Up Local LLMs with Ollama
Step 1: Install Ollama
Download from the official site (https://ollama.ai) for your operating system.
macOS:
# Via Homebrew
brew install ollama
# Or download directly from https://ollama.ai/download/macLinux (Ubuntu/Debian):
# Install
curl -fsSL https://ollama.ai/install.sh | sh
# Start service
sudo systemctl start ollama
sudo systemctl enable ollamaWindows:
Download Ollama-windows-installer.exe from the official site and run it. Ollama starts automatically as a background service.
Step 2: Download and Run a Model
After installation, pull a model using the command line. Initial download takes several minutes depending on model size.
# Mistral 7B (lightweight, fast)
ollama pull mistral
# Or Llama 2 13B (higher quality)
ollama pull llama2
# Start Ollama API server
ollama serveYou'll see:
time=2026-04-09T10:00:00.000Z level=INFO msg="Listening on 127.0.0.1:11434"
Ollama now serves an API on http://127.0.0.1:11434.
Step 3: Test the Model
In another terminal, verify the model is working:
curl http://127.0.0.1:11434/api/generate \
-d '{
"model": "mistral",
"prompt": "What is the capital of France?",
"stream": false
}' | jq '.response'You'll get a JSON response:
{
"response": "The capital of France is Paris."
}Connecting Local LLM to Antigravity
Step 1: Open Antigravity Settings
Launch Antigravity and go to Settings → LLM Provider.
Step 2: Add Custom Provider
Click "Add Custom Provider" and fill in:
| Field | Value |
|---|---|
| Provider Name | Local Ollama |
| API Endpoint | http://127.0.0.1:11434/api/generate |
| Model | mistral |
| API Key | (leave blank) |
| Authentication | None |
Step 3: Test Connection
Click "Test Connection" to verify Antigravity can reach your local model. Once successful, click "Save".
Step 4: Set as Default
Select Local Ollama from the dropdown and click "Set as Default". All future code generation and completions will use your local model.
Recommended Models and Use Cases
Mistral 7B (Recommended, General Purpose)
Specs:
- Size: ~4GB
- Speed: Fast (seconds on GPU)
- Quality: Moderate (limited Japanese support)
- Use cases: Code completion, simple comments, quick debugging
Install:
ollama pull mistralBest for lightweight tasks with minimal memory usage.
Llama 2 13B (Balanced)
Specs:
- Size: ~7GB
- Speed: Moderate (5-10 seconds on GPU)
- Quality: High (better instruction understanding)
- Use cases: Complex code generation, documentation, error analysis
Install:
ollama pull llama2Great for complex development tasks and multi-file edits.
Neural Chat (Japanese Support)
Specs:
- Size: ~4GB
- Speed: Fast
- Quality: Good Japanese support
- Use cases: Japanese comments, explanations
Install:
ollama pull neural-chatPerfect for Japanese-language prompts and outputs.
Code Llama (Code Specialist)
Specs:
- Size: ~7GB
- Speed: Moderate
- Quality: Excellent (deep code understanding)
- Use cases: Complex algorithms, security-critical code
Install:
ollama pull codellamaUse for production-grade code generation.
Selection Guide
What's your task?
→ "Simple completions, quick comments"
Use: Mistral 7B (lightest, fastest)
→ "General code generation, debugging"
Use: Llama 2 13B (recommended)
→ "Complex algorithms, secure code"
Use: Code Llama (code-focused)
→ "Japanese interaction"
Use: Neural Chat (Japanese-aware)
Gemma 4 Local Edition: What to Expect
Google plans to release Gemma 4, an open-source model competitive with Llama 2 and Mistral. Based on Google's AI research, Gemma 4 stands out for:
Expected Features
- Japanese Support: Built on Google's multilingual NLP, excellent Japanese output
- Code Understanding: Deep familiarity with Python, Java, JavaScript
- Safety: Built-in safety filters for responsible AI use
Using Gemma 4 Locally with Antigravity
Once released:
ollama pull gemma:4b
# or
ollama pull gemma:13bAccess Google's advanced AI capabilities for free, offline, and unlimited.
Troubleshooting: Common Connection Issues
Error: "API connection timeout"
Cause: Ollama server not running or port mismatch
Check:
# Is Ollama running?
ps aux | grep ollama
# Test local connection
curl http://127.0.0.1:11434/api/generate -d '{"model":"mistral","prompt":"test"}'Fix:
ollama serveError: "Model not found"
Cause: Model not downloaded
Check:
ollama listFix:
ollama pull mistralIssue: Slow Performance (10+ seconds)
Cause: Running on CPU only, or memory constraints
Check:
top # Check memory usage and GPU utilizationFix:
- Switch to lighter model (Mistral 7B)
- Increase system RAM
- Update GPU drivers
Error: "Antigravity cannot connect to localhost"
Cause: Security settings blocking local connections
Fix:
- Go to Antigravity Preferences → Privacy
- Enable "Allow local connections"
- Restart Antigravity
Windows: Ollama Service Won't Start
Cause: Windows Defender or antivirus blocking
Fix:
- Windows Defender → "Virus & threat protection"
- Add
ollama.exeto exclusions - Restart:
net start ollama
Practical Example: Multi-File Refactoring with Local LLM
Scenario: React Component Modernization
You have:
Button.jsx— Old class componentInput.jsx— Old class componentFormContainer.jsx— Parent using both
Goal: Convert to functional components + hooks.
Step 1: Open Files in Antigravity
Open all three files simultaneously. Local LLM means zero API latency.
Step 2: Analyze Structure
Antigravity's AI analyzes relationships across all files:
Analysis Result:
- Both use deprecated class pattern
- FormContainer uses prop drilling
- Recommendation: Convert to hooks, useCallback for handlers
Step 3: Run Auto-Refactoring
Send "Convert all to functional components + hooks" to your local LLM. No billing, unlimited iterations.
Step 4: Review and Polish
Check results, adjust as needed. Complete privacy — nothing leaves your machine.
Best Practices for Local LLM Use
1. Model Selection by Task
# Lightweight tasks
ollama run mistral
# Complex generation
ollama run codellamaSwitch models per project in Antigravity settings.
2. Prompt Optimization
With no API costs, iterate freely:
"Rewrite this function in TypeScript with strict types."
[code]
Better version:
"Convert this JavaScript function to TypeScript:
- Enable strict type checking
- Explicit return types
- Type function parameters"
[code]
More detailed prompts yield better results locally too.
3. GPU Memory Management
# Stop running models (free memory)
pkill -f ollama
# Reload fresh
ollama run mistral4. Offline Development
Local LLMs work completely offline — perfect for planes, trains, and remote work.
Looking back
Local LLMs with Antigravity enable:
- Privacy: All code and data stays on your machine
- Cost Savings: No API billing, unlimited experiments
- Speed: Zero network latency
- Offline Work: Develop anywhere, anytime
- Customization: Fine-tune and optimize for your needs
Setup takes just 10 minutes with Ollama, and you'll immediately experience a more responsive, private development environment.
Next: Learn Antigravity's multi-agent system and leverage local LLMs across multiple agents for autonomous development.