Gemini 3.1 Pro Comes to Antigravity — 2x Reasoning, 65K Token Output, and What It Means for Your Workflow

Overview

On February 19, 2026, Google DeepMind released Gemini 3.1 Pro, and it became available in Antigravity IDE the same day. It scores 77.1% on ARC-AGI-2, more than double the roughly 38% of its predecessor, Gemini 3 Pro. Maximum output has grown to 65,536 tokens, and the context window now spans 2 million tokens. For an agent-first IDE like Antigravity, this integration is a significant leap forward. This article breaks down what changed and how to make the most of it.

What Is Gemini 3.1 Pro?

Gemini 3.1 Pro is Google DeepMind's latest multimodal reasoning model. It processes text, images, audio, video, and entire code repositories simultaneously, and is specifically optimized for complex, multi-step reasoning tasks.

Specification Comparison

Feature	Gemini 3 Pro	Gemini 3.1 Pro
ARC-AGI-2 Score	~38%	77.1%
Max Output Tokens	32,768	65,536
Context Window	1 million tokens	2 million tokens
Thinking Modes	Low / High	Low / Medium / High

The 2 million token context window is especially noteworthy. You can now load an entire large-scale codebase alongside documentation and recent search results in a single context — making whole-project refactors and architectural reviews genuinely practical to delegate to an agent.
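Before handing a whole repository to an agent, it can help to check roughly whether it fits. The sketch below is an estimate only: it assumes about four characters per token, which is a common rule of thumb and not the model's real tokenizer, and it takes the 2 million token figure from the table above. The function names and the `safety_margin` parameter are hypothetical.

```python
import os

CONTEXT_WINDOW_TOKENS = 2_000_000  # figure quoted in the comparison table
CHARS_PER_TOKEN = 4                # rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def estimate_repo_tokens(root: str, extensions: tuple = (".py", ".md")) -> int:
    """Sum the estimated tokens of every matching file under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total += estimate_tokens(handle.read())
    return total


def fits_in_context(token_count: int, safety_margin: int = 100_000) -> bool:
    """Check the estimate against the window, leaving a hypothetical margin
    for the prompt itself and for estimation error."""
    return token_count + safety_margin <= CONTEXT_WINDOW_TOKENS
```

Calling `fits_in_context(estimate_repo_tokens("."))` from a project root gives a quick yes/no answer. If the result is close to the limit, treat it as a no, since the heuristic can be off by a wide margin for code-heavy files.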

The Three-Tier Thinking System

One of the headline features of Gemini 3.1 Pro is its three-tier thinking system: Low, Medium, and High.

Low (Latency-First)

Best for fast-response scenarios like code completion, simple refactors, or inline suggestions. Latency stays minimal, making it ideal for continuous development tasks.

Medium (Balanced)

The newly added middle tier in Gemini 3.1 Pro. It balances reasoning depth against response time, making it well-suited for moderately complex tasks — code review, test generation, migration planning, and similar work.

High (Depth-First)

For tasks where thoroughness matters more than speed: complex architectural design, security audits, large-scale bug investigations. The 77.1% ARC-AGI-2 score was measured under this mode.
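The tier guidance above can be summarized as a simple lookup. This is a minimal sketch for reasoning about which tier to pick, not part of Antigravity or the Gemini API: the task labels and the `pick_tier` helper are hypothetical, and the groupings simply mirror the examples given for each tier.

```python
from enum import Enum


class ThinkingTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical task labels, grouped the same way as the tier descriptions.
TIER_BY_TASK = {
    "code_completion": ThinkingTier.LOW,
    "simple_refactor": ThinkingTier.LOW,
    "inline_suggestion": ThinkingTier.LOW,
    "code_review": ThinkingTier.MEDIUM,
    "test_generation": ThinkingTier.MEDIUM,
    "migration_planning": ThinkingTier.MEDIUM,
    "architecture_design": ThinkingTier.HIGH,
    "security_audit": ThinkingTier.HIGH,
    "bug_investigation": ThinkingTier.HIGH,
}


def pick_tier(task: str, default: ThinkingTier = ThinkingTier.MEDIUM) -> ThinkingTier:
    """Return the tier for a known task label, or the default for unknown ones."""
    return TIER_BY_TASK.get(task, default)
```

Defaulting unknown tasks to Medium is a design choice made here for illustration: it avoids both the latency of High and the shallower reasoning of Low when the task type is unclear.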

Using Gemini 3.1 Pro in Antigravity

Switching the Model

In either the Editor or Manager view, open the model selector and choose Gemini 3.1 Pro. As of March 2026, Antigravity remains in public preview, so the model is free to use within Gemini's rate limits.

Setting the Thinking Mode

The thinking mode dropdown appears next to the model name in Antigravity's chat panel:

Model: Gemini 3.1 Pro  |  Think: [Low ▼]  [Medium]  [High]

You can switch modes manually based on the complexity of each task, or set it to "Auto" and let Antigravity decide based on the prompt.

Practical Use Cases

Use Case 1: Large-Scale Refactoring

With a 2 million token context window, you can scan an entire monorepo and delegate dependency cleanup or naming convention unification to an agent in one shot.

Example prompt:
"Analyze this entire repository, list all deprecated API calls,
create a migration plan to the new SDK as an Artifact,
then auto-apply changes starting from the lowest-risk ones."
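To sanity-check what the agent reports, a rough local scan can be run first. The sketch below is one possible approach, not how Antigravity works internally: the deprecated function names are placeholders, and using the number of hits per file as a proxy for risk is an assumption made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical deprecated function names; replace with those from your SDK.
DEPRECATED_CALLS = ("old_connect", "legacy_send")

# Match a deprecated name used as a call, e.g. "old_connect(".
_CALL_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, DEPRECATED_CALLS)) + r")\s*\("
)


def count_deprecated_calls(source: str) -> Counter:
    """Count how often each deprecated function is called in the source text."""
    return Counter(match.group(1) for match in _CALL_PATTERN.finditer(source))


def order_by_risk(files: dict) -> list:
    """Return (path, hit count) pairs with the fewest hits first.

    Files without any deprecated calls are skipped. Hit count is only a
    stand-in for risk; a real plan would also weigh test coverage and usage.
    """
    hits = {
        path: sum(count_deprecated_calls(source).values())
        for path, source in files.items()
    }
    flagged = [(path, count) for path, count in hits.items() if count > 0]
    return sorted(flagged, key=lambda item: (item[1], item[0]))
```

Here `files` maps a path to that file's contents. Comparing this list with the agent's migration plan is a cheap way to spot calls the agent missed or files it touched unnecessarily.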

Use Case 2: Parallel Agents in Manager View

Manager View lets you spawn multiple agents running different tasks simultaneously. For example:

Agent A: UI bug fixes on feat/bug-fix-ui
Agent B: Unit test additions on feat/add-tests
Agent C: Documentation updates on docs/update-readme

Assigning Gemini 3.1 Pro to each agent lets the three tasks proceed in parallel, so work that would take hours when done sequentially can finish in a fraction of the time.
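Parallel agents work best when each one owns its own branch. The following sketch models the plan above as plain data and checks that no branch is shared. The `AgentTask` type and `find_branch_conflicts` helper are hypothetical and exist only for planning on paper; they do not call Antigravity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentTask:
    agent: str
    description: str
    branch: str


# The three assignments from the example above.
PLAN = [
    AgentTask("Agent A", "UI bug fixes", "feat/bug-fix-ui"),
    AgentTask("Agent B", "Unit test additions", "feat/add-tests"),
    AgentTask("Agent C", "Documentation updates", "docs/update-readme"),
]


def find_branch_conflicts(plan: list) -> dict:
    """Return every branch that is assigned to more than one agent."""
    owners = {}
    for task in plan:
        owners.setdefault(task.branch, []).append(task.agent)
    return {branch: agents for branch, agents in owners.items() if len(agents) > 1}
```

An empty result means every agent has a branch to itself. Any entry in the result names a branch where two agents could overwrite each other's changes.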

Use Case 3: Security Audit

Invoke High thinking mode for a thorough security review. The agent will analyze your code from an OWASP Top 10 perspective and output findings with remediation patches as Artifacts.

Example prompt:
"Review this API server code from an OWASP Top 10 perspective.
List all vulnerabilities, then create fix patches in order of priority.
Use High thinking mode."

Gemini 3 Pro vs. 3.1 Pro — Side by Side

Here's how the two models compare on common development tasks:

Task	Gemini 3 Pro	Gemini 3.1 Pro (High unless noted)
Multi-file bug root cause	Needed 2–3 follow-up prompts	Identified correctly on first pass
Test coverage planning	Frequent edge-case gaps	Comprehensive coverage proposals
Output truncation on long tasks	Common on complex output	Significantly reduced with 65K limit
Latency (Medium mode)	~10 seconds	~14 seconds (within acceptable range)

Gemini 3.1 Pro clearly leads on complex multi-step reasoning and long-form output. That said, for simple autocomplete or quick lookups, Gemini 3 Flash still delivers faster responses — so mixing models strategically remains the right approach.
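The mixing strategy can be written down as a small routing rule. This is a sketch under stated assumptions: the task labels are hypothetical, and the returned strings are the display names used in this article, not API model identifiers.

```python
# Hypothetical labels for tasks where response speed matters most.
SIMPLE_TASKS = {"autocomplete", "quick_lookup"}


def pick_model(task: str) -> str:
    """Route simple, latency-sensitive tasks to Flash and everything else to Pro."""
    if task in SIMPLE_TASKS:
        return "Gemini 3 Flash"
    return "Gemini 3.1 Pro"
```

Sending unknown tasks to the stronger model is a deliberate choice in this sketch: a slow but correct answer is usually cheaper than a fast one that needs follow-up prompts.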

Wrapping Up

The Gemini 3.1 Pro integration in Antigravity marks a shift from agents that iterate through trial and error toward agents that more often get it right on the first pass. The combination of three-tier thinking and a 2-million-token context window makes large-scale refactoring, parallel multi-agent development, and deep security audits practical rather than aspirational. If you haven't tried it yet, today is a great time to open Antigravity and switch to Gemini 3.1 Pro.