Antigravity Lab · Agents & Manager · 2026-06-18 · Advanced

Spending Less on Failure Without Swallowing It: A Retry Budget for Agents Built Around Gemini 3.5 Flash

A design that separates an agent's retries from quietly swallowing errors: classify the failure first, then retry within a budget. Grounded in the speed and price of Gemini 3.5 Flash, with per-task caps, logging, and a weekly tightening routine.

When you hand work to an agent, it does not always succeed on the first shot. A test fails, a tool call errors out, the output is malformed. Telling it to "try again" is the natural reflex. But if you allow retries without thinking, the agent repeats the same failure at high speed, and before you notice, the only things that have grown are your quota usage and your bill.

I run four blogs as an indie developer, and most of my overnight automation is handed to agents. What that taught me is that a retry is one step away from swallowing a failure. With a fast, cheap model like Gemini 3.5 Flash at the core, each retry costs less, and that makes it easier to drift into the sloppy habit of "just keep it running." That is exactly why retries need a budget around them.

Swallowing and retrying are not the same

The first thing to separate is swallowing versus retrying. Swallowing means "pretend the failure never happened and move on"; retrying means "acknowledge the failure, change a condition, and try once more." Mix the two and the same errors keep recurring without ever reaching a log, so you can no longer trace the cause afterward.
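
The contrast can be sketched in Python roughly as follows. This is a minimal illustration, not a definitive implementation: the names `swallow`, `retry_once`, and `flaky_step`, and the timeout values, are hypothetical and chosen only to show the difference.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("agent.retry")


def swallow(step: Callable[[float], str], timeout_s: float) -> Optional[str]:
    """Anti-pattern: the failure disappears. Nothing is logged, nothing changes."""
    try:
        return step(timeout_s)
    except Exception:
        return None  # the caller cannot tell "failed" from "returned nothing"


def retry_once(step: Callable[[float], str], timeout_s: float, longer_timeout_s: float) -> str:
    """Retry: record the failure, change one condition, try exactly once more."""
    try:
        return step(timeout_s)
    except TimeoutError as exc:
        # Acknowledge the failure so it can be traced afterward.
        log.warning("step timed out after %.1fs: %s", timeout_s, exc)
    # Changed condition: a longer timeout. A second failure propagates to the caller.
    return step(longer_timeout_s)


def flaky_step(timeout_s: float) -> str:
    """Hypothetical stand-in for a tool call that needs at least 5 seconds."""
    if timeout_s < 5.0:
        raise TimeoutError(f"needed 5.0s, had {timeout_s}s")
    return "ok"
```

With this stand-in, `swallow(flaky_step, 1.0)` returns `None` and leaves no trace, while `retry_once(flaky_step, 1.0, 10.0)` logs one warning and returns `"ok"`.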

I enforce this distinction at the code level. Before any retry, I classify why it failed, and a failure I cannot classify does not get retried. An unclassified failure tells me nothing about which condition to change, and throwing it back under the same conditions is unlikely to change the outcome.
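
One way such a gate might look is sketched below. It assumes failures surface as Python exceptions; `RateLimitError`, `classify`, `should_retry`, and the label strings are hypothetical names for illustration, not part of any real SDK.

```python
import logging
from typing import Optional

log = logging.getLogger("agent.retry")


class RateLimitError(Exception):
    """Hypothetical error raised when the API reports a rate limit."""


def classify(exc: Exception) -> Optional[str]:
    """Return a label for a known failure cause, or None if the cause is unknown."""
    if isinstance(exc, RateLimitError):
        return "rate_limit"
    if isinstance(exc, TimeoutError):
        return "timeout"
    return None


def should_retry(exc: Exception) -> bool:
    """Gate: a failure that cannot be classified is logged and never retried."""
    label = classify(exc)
    if label is None:
        log.error("unclassified failure, not retrying: %r", exc)
        return False
    log.warning("classified as %s, eligible for retry", label)
    return True
```

The gate logs on both branches, so even a failure that is not retried still leaves a record to trace.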

Sort failures into three kinds first

In practice, agent failures settle into roughly three kinds. As a rule, only the transient ones earn a retry.

| Kind | Example | Retry | Condition to change |
| --- | --- | --- | --- |
| Transient | Rate limit, timeout, brief network drop | Yes | Wait time (exponential backoff) |
| Input-driven | Broken JSON, missing context, vague instructions | Conditional | Prompt and supplied context |
| Permanent | Missing permission, nonexistent API, logically impossible | No | Hold until a human steps in |

Sending permanent failures into a retry is the most typical way to waste a budget. An agent will not say "I can't"; it fails plausibly, over and over. Simply refusing to retry permanent failures visibly lowers the bill.
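
The three kinds and their retry rules can be expressed as a small decision function. The sketch below is one possible shape, under stated assumptions: the cap of 3 attempts, the 1-second base and 30-second ceiling for backoff, and the choice to hand a task to a human once the budget runs out are all illustrative defaults, and `Kind`, `Decision`, `backoff_s`, and `decide` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    TRANSIENT = "transient"
    INPUT_DRIVEN = "input_driven"
    PERMANENT = "permanent"


@dataclass(frozen=True)
class Decision:
    action: str          # "wait_and_retry", "revise_and_retry", or "hold_for_human"
    wait_s: float = 0.0  # seconds to wait before the next attempt


def backoff_s(attempt: int, base_s: float = 1.0, cap_s: float = 30.0) -> float:
    """Exponential backoff: base * 2^(attempt - 1), capped. attempt starts at 1."""
    return min(cap_s, base_s * 2 ** (attempt - 1))


def decide(kind: Kind, attempt: int, context_changed: bool, max_attempts: int = 3) -> Decision:
    """Map a classified failure to the next step.

    attempt is the number of attempts that have already failed.
    """
    # Permanent failures never retry; an exhausted budget is treated the same way
    # (an assumption of this sketch).
    if kind is Kind.PERMANENT or attempt >= max_attempts:
        return Decision("hold_for_human")
    if kind is Kind.TRANSIENT:
        # The only condition changed is the wait time.
        return Decision("wait_and_retry", backoff_s(attempt))
    # Input-driven: retry only if the prompt or supplied context was actually changed.
    if context_changed:
        return Decision("revise_and_retry")
    return Decision("hold_for_human")
```

Under these assumed defaults, a transient failure waits 1 second after the first failed attempt and 2 seconds after the second, and a third failure exhausts the budget. A permanent failure goes straight to `hold_for_human` without spending any of it.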
