ANTIGRAVITY LABJP
Articles/Antigravity Basics
Antigravity Basics/2026-06-23Advanced

When the Default Model Changes Underneath You: Pinning and Diff-Gating Scheduled Runs

Antigravity 2.0 promoted Gemini 3.5 Flash to the default fast model. It is a welcome upgrade, but any scheduled run that leaned on the default starts producing subtly different output one morning. Here is how I pin the model explicitly, fingerprint the output, and gate drift, sized for a solo developer's pipeline.

Antigravity 2.05Gemini 3.5 Flash4model pinningdrift detectionscheduled runsreproducibility4indie development13regression testing

Premium Article

One morning I opened a draft from a publishing job that should have run exactly as it always did, and noticed the headings were styled just slightly differently. Nothing in the body was wrong. But the rhythm of the bullet lists and the way paragraphs broke belonged to someone else compared to the week before.

The cause was neither my prompt nor my settings. Antigravity 2.0 had quietly promoted Gemini 3.5 Flash to the default fast model. On the benchmarks it is described as beating the previous-generation Pro, with 76.2% on Terminal-Bench 2.1 and 83.6% on MCP Atlas (source: Google Developers Blog). As raw capability, the change is welcome. But for a scheduled run that had been "trusting the default," it meant the character of my output had been swapped out somewhere I wasn't watching.

This article is about the design that keeps that swap from becoming an incident. It draws on my own pipeline as an indie developer, where I auto-generate drafts for several sites every day.

Why "trusting the default" is risky only for scheduled runs

When you work interactively, a default model change rarely causes trouble. You see the output, and if it looks off you resubmit on the spot. A human performs the final inspection.

Scheduled runs are different. They run overnight, and the result flows straight into the next stage with nobody inspecting it. So even a change that is "better on average" drifts silently if your downstream was built around the previous model's habits. If a formatting script waits for a particular heading marker, or a minimum-length gate assumes the old model's volume, you can end up with the perverse outcome of a gate failing precisely because quality went up.

The problem, then, is not that the new model is bad. It is that the assumptions behind your output moved at a time you weren't aware of. What deserves attention is not which model is better, but the invisibility of the change.

Pin the model explicitly first

The first move is simple. Don't let the scheduled agent lean on the default; write out the model it uses.

{
  "agent": "nightly-draft",
  "model": "gemini-3.1-pro",
  "fallback": "gemini-3.5-flash",
  "temperature": 0.2,
  "note": "Model is explicit so a default change can't reach this agent. Fallback only when the pinned model is briefly unavailable."
}

With model declared, this agent keeps using the model you named even when the platform's default moves. The fallback is a statement that says "drop here only when the pinned model can't be reached." That is different in meaning from following the default automatically. The former is a choice you made between two options; the latter is one option the platform chose for you.

Pinning, though, only buys time. Leave it pinned forever and the old model's support eventually ends. The real purpose of pinning is to let you choose the timing of the change. For that, you need the diff gate that follows.

Thank you for reading this far.

Continue Reading

What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.

WHAT YOU'LL LEARN
A minimal config that pins the model and declares a fallback so a default swap can't quietly change your output
A complete diff gate that normalizes and fingerprints output, failing with exit 1 only on changes you didn't sanction
A migration procedure for welcoming upgrades on your own terms: snapshot updates and promoting one agent at a time
Secure payment via Stripe · Cancel anytime

Unlock This Article

Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.

or
Unlock all articles with Membership →
Share

Thank You for Reading

Antigravity Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.

  • Copy-paste ready implementation code
  • New advanced guides published daily
  • $5/mo or $10 for lifetime access
View Membership →

Related Articles

Antigravity2026-06-14
After Gemini 3.5 Flash Became the Default, Route Flash and Pro Per Task
Now that Antigravity's default Flash is Gemini 3.5 Flash, leaving everything on Flash wastes accuracy and forcing everything onto Pro wastes time. Here is a two-axis decision table for splitting work between Flash and Pro, plus the routing setup to wire it into your agents.
Antigravity2026-06-18
Choosing Among Desktop, CLI, SDK, and Managed Agents for the Same Job
Antigravity 2.0 has several surfaces: desktop, CLI, SDK, and the Managed Agents API. Which one should run a given task? Here is a framework for choosing the surface from the nature of the work.
Antigravity2026-06-14
Budgeting Quota So Parallel Agents in Antigravity 2.0 Don't Run Dry
Run several agents at once in Antigravity 2.0 and your quota can be gone by mid-afternoon, right when you need it for the real work. Here is how I measure per-agent consumption, find the Pro-vs-Ultra break-even, and budget so I never hit the ceiling.
📚RECOMMENDED BOOKS
Build a Large Language Model (From Scratch)
Sebastian Raschka
LLM Dev
Prompt Engineering for LLMs
Berryman & Ziegler
Prompting
AI Engineering
Chip Huyen
AI Eng
* Contains affiliate links
See all →