ANTIGRAVITY LAB
Agents & Manager / 2026-06-27 / Advanced

Before Your Dynamic Sub-agents Branch Out Too Far: Designing a Depth Budget and Fan-out Cap

Antigravity 2.0's dynamic sub-agents can spawn their own sub-agents at runtime. Handy, but without depth and fan-out control they can burn through your quota overnight. Here are three guards, with concrete code.

Antigravity 2.0, dynamic sub-agents, multi-agent, autonomous runs, cost control

One night I put the updates for four blog sites onto a single scheduled run. The next morning, the quota screen showed nearly three times the consumption I had expected. The jobs all succeeded; only the spend had ballooned. Tracing the logs, I found that a dynamic sub-agent spawned by the parent had spawned another sub-agent, whose child had spawned a grandchild. Nobody had entered an infinite loop. The decision "this is a bit involved, let me split it one more level" had simply repeated itself, quietly, at every node of the tree.

Antigravity 2.0's dynamic sub-agents let a parent agent spin up child agents on demand at runtime. Operationally, that is a completely different beast from static parallelism where you fix the degree of concurrency up front. The upside is flexibility; the catch is that you hand the "when and how much to branch" decision to the agent itself, so left alone the tree grows deep and wide. Starting from the trap I stepped on in my own off-peak automation, this article shares a concrete design for controlling three things: depth, width, and cancellation.

Treat static parallelism and dynamic sub-agents as different things

The "parallel agents" we are used to had a fixed width — like firing five tasks at once with Promise.all. A human decides the maximum concurrency in advance. In tree terms, it's a shallow thicket of depth 1 and width 5.
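For contrast, here is a minimal sketch of that static shape. `runTask` is a hypothetical stand-in for one agent task, not an Antigravity API; the point is only that the width is a number a human wrote down before the run started.

```typescript
// Hypothetical stand-in for one agent task; a real run would call the model here.
async function runTask(id: number): Promise<string> {
  return `task-${id} done`;
}

// Static parallelism: the caller fixes the width before anything starts.
async function runStaticBatch(width: number): Promise<string[]> {
  const ids = Array.from({ length: width }, (_, i) => i + 1);
  // Depth is always 1: nothing in here can spawn further tasks.
  return Promise.all(ids.map((id) => runTask(id)));
}
```

Calling `runStaticBatch(5)` always produces exactly five tasks, so the cost of the run can be estimated before it begins.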

Dynamic sub-agents differ at the root. If a child agent decides mid-task that "this refactor splits cleanly into three independent modules," it can stand up three grandchildren on the spot. The grandchildren make the same call. As a result, both the depth and the width of the tree are decided at runtime and cannot be read ahead of time.

If you grasp only "it's parallel, so it should be fast" without understanding this property, you will misjudge both spend and latency. With depth 3 and a fan-out of 3 per node, the tree has 3 to the power of 3, or 27, leaf branches, and 3 + 9 + 27 = 39 sub-agents in total once the intermediate levels are counted. Each sub-agent calls the model, so token consumption scales with that total. The reason I burned nearly three times the quota in one night was precisely that I had overlooked this exponential spread.
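The arithmetic is easy to check with a few lines. This is only a worst-case estimate under the assumption that every node spawns the same number of children; `countSubAgents` is an illustrative helper, not part of any SDK.

```typescript
// Count sub-agents in a tree where every node spawns `fanOut` children,
// down to `depth` levels below the parent (the parent itself is not counted).
function countSubAgents(
  depth: number,
  fanOut: number,
): { perLevel: number[]; leaves: number; total: number } {
  const perLevel: number[] = [];
  let width = 1;
  for (let level = 1; level <= depth; level++) {
    width *= fanOut;
    perLevel.push(width);
  }
  const total = perLevel.reduce((sum, n) => sum + n, 0);
  const leaves = perLevel[perLevel.length - 1] ?? 0;
  return { perLevel, leaves, total };
}

console.log(countSubAgents(3, 3));
// perLevel: [3, 9, 27], leaves: 27, total: 39
```

Raising the depth by one level to 4 adds another 81 sub-agents, which is why the depth is the first thing worth capping.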

So the first premise to hold is simple: when you use dynamic sub-agents, assume the tree will grow on its own, and put explicit ceilings on depth and width.

Three guards: depth budget, fan-out cap, cancellation propagation

The axes worth controlling break down into three.

First, the depth budget: the maximum number of levels of sub-agents you can spawn, counting from the parent. Keep the depth shallow and the exponential growth is cut off after a known number of levels, so the worst-case size of the tree is bounded.
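One way to enforce this is to carry the budget in the context that each parent hands to its child. The sketch below assumes that shape; `AgentContext`, `childContext`, and `spawnWithBudget` are hypothetical names, and `launch` stands in for whatever actually starts a sub-agent.

```typescript
// Hypothetical context passed from parent to child; not an Antigravity SDK type.
interface AgentContext {
  depth: number; // 0 for the parent, 1 for its children, and so on
  maxDepth: number; // deepest level that is allowed to exist
}

class DepthBudgetExceeded extends Error {}

// Derive the child's context, or refuse if the budget is already spent.
function childContext(parent: AgentContext): AgentContext {
  if (parent.depth >= parent.maxDepth) {
    throw new DepthBudgetExceeded(`depth budget of ${parent.maxDepth} exhausted`);
  }
  return { ...parent, depth: parent.depth + 1 };
}

// Hypothetical wrapper: every spawn goes through here, so the check cannot be skipped.
async function spawnWithBudget<T>(
  parent: AgentContext,
  launch: (ctx: AgentContext) => Promise<T>,
): Promise<T> {
  return launch(childContext(parent));
}
```

With `maxDepth: 2`, children and grandchildren can be created, and a grandchild's attempt to spawn again throws instead of silently adding a level. Whether to throw or to make the agent do the work inline at that point is a design choice this sketch leaves open.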

Second, the fan-out cap: the maximum number of sub-agents running concurrently across the whole tree. Even if you allow depth, capping the simultaneous count keeps the peak spend down.
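A cap like this can be sketched as a small counting semaphore with a queue, shared by every node in the tree. `FanOutLimiter` is a hypothetical name and the class is an illustration of the idea, not the article's or the SDK's implementation.

```typescript
// A minimal counting semaphore with a FIFO queue. One instance serves the whole tree.
class FanOutLimiter {
  private running = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  private acquire(): Promise<void> {
    if (this.running < this.limit) {
      this.running++;
      return Promise.resolve();
    }
    // No free slot: park until a finishing task hands its slot over.
    return new Promise<void>((resolve) => {
      this.waiting.push(resolve);
    });
  }

  private release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // transfer the slot directly, so `running` never exceeds `limit`
    } else {
      this.running--;
    }
  }

  // Run `task` once a slot is free; the slot is returned even if the task throws.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

One caveat with a tree-wide cap: if a parent holds its slot while it waits for children that need slots too, the tree can deadlock once the slots are all held by waiting parents. Wrapping only the model call itself in `run`, rather than the whole lifetime of a sub-agent, avoids that.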

Third, cancellation propagation: a mechanism to tell every descendant to stop when a branch stalls or is no longer needed. Without it, grandchildren keep calling the model long after the parent has given up.
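With the standard `AbortController`, propagation can be sketched by giving each sub-agent its own controller that also aborts whenever its parent's signal does. `childController` and `runSteps` are hypothetical helpers; how a real sub-agent checks the signal depends on the wrapper around the launch call.

```typescript
// Create a controller for a child that aborts whenever its parent aborts.
// Aborting the child does not affect the parent or its siblings.
function childController(parentSignal: AbortSignal): AbortController {
  const child = new AbortController();
  if (parentSignal.aborted) {
    child.abort();
  } else {
    parentSignal.addEventListener("abort", () => child.abort(), { once: true });
  }
  return child;
}

// Hypothetical stand-in for a sub-agent's work loop: it checks the signal
// before each step, which is where a model call would otherwise happen.
async function runSteps(signal: AbortSignal, steps: number): Promise<number> {
  let done = 0;
  for (let i = 0; i < steps; i++) {
    if (signal.aborted) break;
    await new Promise<void>((resolve) => setTimeout(resolve, 1));
    done++;
  }
  return done;
}
```

Because each level derives its controller from the level above, aborting one branch stops everything beneath it and nothing beside it. In a long-lived parent, the listener stays registered until the parent aborts, so a production version would likely remove it when the child finishes.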

These look independent, but operationally they are easier to handle if you fold them into a single orchestration layer. Below I show each implementation in Node.js (TypeScript), assuming a thin wrapper around the Antigravity SDK's sub-agent launch.

WHAT YOU'LL LEARN
An implementation that embeds a depth budget into the agent context to stop sub-agents from multiplying recursively
A fan-out cap with queuing that manages how many branches run at once from a single place
A structure that propagates cancellation to descendants via AbortSignal, so a stuck branch doesn't drag the whole job down