A few days into Antigravity 2.0.0, with the Gemini CLI shutdown set for June 18, I spent this week reorganizing my setup and circling one question: how much should an agent be trusted to do? Weeks like this reward principles over feature lists — when the tooling keeps shifting, what carries over is your own answer to the delegation question. The five posts we published at Antigravity Lab this week each contribute a piece of that answer, so here they are, with notes on how they fit together.
Start by sizing the unit of delegation
Size Antigravity Agent Tasks by What You Can Review — a Practical Cure for Rework
The observation at the heart of this post is that large requests fail in proportion to the number of judgment calls they contain. The proposed ceiling — one task is what you can review in fifteen minutes — sounds modest until you see the worked example, where a single CSV-export request gets re-cut into three. Writing acceptance criteria as three lines describing the finished state is a habit that transfers well beyond agents; it is simply how to ask anyone for work. If you read one agent-operations post this week, I would make it this one.
Widen the scope: updates and execution platforms
Handing Dependency Updates to Antigravity Agents — Risk Tiers, Verification, and Rollback
Dependency updates are the classic "easy to delegate, dangerous to delegate fully" chore. This post designs the whole operation: four risk tiers instead of blind trust in semver, updates split into small lots that are isolated in worktrees, and a rollback path planned before anything updates. The playbook written into AGENTS.md is shown in full, which makes it easy to adapt. The line I adopted immediately: a passing build is only half of verification.
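To make "risk tiers instead of blind trust in semver" concrete, here is a minimal sketch of what a tier classifier could look like. The four tier names, the `runtime` and `critical` flags, and the rules themselves are my own illustrative assumptions, not the tiers defined in the post; the only point is that the version bump is one input among several.

```python
from enum import IntEnum


class Tier(IntEnum):
    # Hypothetical tiers for illustration; the post defines its own four.
    AUTO = 1         # agent may update and merge once checks pass
    REVIEW = 2       # agent updates, a human reviews the diff
    MANUAL_TEST = 3  # agent updates, a human exercises the affected feature
    HUMAN_ONLY = 4   # agent only prepares notes; a human does the update


def parse(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch


def classify(old: str, new: str, *, runtime: bool, critical: bool) -> Tier:
    """Assign a tier from the version bump plus what the package touches."""
    old_v, new_v = parse(old), parse(new)
    if critical:
        # Assumed rule: packages on payment/auth-style paths never go to the agent.
        return Tier.HUMAN_ONLY
    if new_v[0] != old_v[0] or old_v[0] == 0:
        # Major bumps, and any 0.x release, may break regardless of the label.
        return Tier.MANUAL_TEST
    if runtime:
        # Minor/patch bumps that ship to users still get a human diff review.
        return Tier.REVIEW
    return Tier.AUTO


if __name__ == "__main__":
    print(classify("1.2.3", "1.2.4", runtime=False, critical=False).name)
    print(classify("0.4.1", "0.4.2", runtime=True, critical=False).name)
```

In this sketch a patch bump on a 0.x runtime dependency lands in a stricter tier than a minor bump on a dev-only tool, which is the kind of distinction semver alone does not give you.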
Running Gemini's Managed Agents API: Where Cloud Execution Ends and My Local Agents Begin
This one asks where the work should run. Treating an agent run as a job rather than a request reframes the practical details — polling intervals, timeouts, idempotency — as scheduling design instead of afterthoughts. The ordering advice stuck with me: put a budget cap on cost first, and only then promote the job to a recurring schedule.
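The "job, not request" framing can be sketched as a small wait loop in which the polling interval, the timeout, and the budget cap are explicit parameters. This is an illustration under stated assumptions: `JobStatus`, `wait_for_job`, and the `cost_usd` field are hypothetical names, and `fetch_status` stands in for whatever call your platform provides. It does not use the Managed Agents API itself.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class JobStatus:
    done: bool
    cost_usd: float  # cumulative spend so far (hypothetical field)


class BudgetExceeded(RuntimeError):
    pass


class JobTimeout(RuntimeError):
    pass


def wait_for_job(
    fetch_status: Callable[[], JobStatus],
    *,
    budget_usd: float,
    timeout_s: float,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> JobStatus:
    """Poll until the job finishes, the budget is exceeded, or time runs out."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        # Check the budget first, so an overspending job is never reported as fine.
        if status.cost_usd > budget_usd:
            raise BudgetExceeded(f"spent {status.cost_usd} of {budget_usd}")
        if status.done:
            return status
        if clock() >= deadline:
            raise JobTimeout(f"job not finished after {timeout_s}s")
        sleep(poll_interval_s)


if __name__ == "__main__":
    # Fake status source standing in for a real platform call.
    fake = iter([JobStatus(False, 0.1), JobStatus(True, 0.3)])
    final = wait_for_job(lambda: next(fake), budget_usd=1.0, timeout_s=60,
                         sleep=lambda _s: None)
    print(final)
```

Checking the budget before anything else mirrors the ordering advice: the cap exists from the first run, so promoting the job to a recurring schedule later does not change its worst-case cost per run.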
Check the principles against a real migration
After the theory, a field report. Ten-year-old purchase code across four wallpaper apps gets moved to StoreKit 2, and the post keeps the texture of real indie developer work: what was handed to the agent, what was verified by hand, and where the boundary sat. That division of labor lines up neatly with the task-sizing and risk-tier posts above. The three snags it documents from App Store sandbox testing are ready-made answers for anyone facing the same migration.
Make the tool comfortable in your hands
Localizing Antigravity 2.0 to Japanese — The Three-Layer Setup for Menus, Commands, and AI Responses
To close, something lighter. This setup guide splits localization into three layers — UI language pack, command palette, and the language of AI responses — and is honest about what to leave in English for searchability. The most useful warning is that the AI response language is the layer everyone forgets; if you finished your setup and the agent still answers in English, this is why.
The Gemini CLI shutdown date falls next week. If your migration is still pending, getting comfortable with the new CLI this week turns the deadline into an ordinary day. Then take one request you were about to hand an agent and re-cut it to a reviewable size; the task-sizing post shows you exactly how.