How to Orchestrate Multiple Agents: Drawing the Line Between Parallel and Serial Work
Antigravity 2.0 brings true parallel execution across multiple agents. But making everything parallel does not make it faster. Which work should fan out in parallel, and which should stay serial? This article lays out an orchestration design that does not fall apart, viewed through dependencies and contention.
"Everything in parallel is faster" is usually wrong
In Antigravity 2.0, one agent writing a React component, another configuring an API route, and a third running visual regression tests in a headless browser can all happen at once. It is the clearest sign that its character has shifted from "code editor" to "platform that orchestrates and runs agents."
The first thing many people try is to throw every task they can think of into parallel. But that does not reliably make things faster. More often it multiplies accidents: two agents rewrite the same file at once and corrupt it, or a task that depends on another's result starts too early and comes back empty.
As an indie developer, I run several blog sites in parallel and have spent a long time probing how far the processing can be parallelized. What sank in is that designing for parallelism is not "designing to go fast" but "designing to avoid collisions."
Three axes for deciding what may run in parallel
Whether a task may be split into parallel work should be judged by criteria, not by gut. I check these three in order.
First, dependencies. If task B needs the output of task A, the two cannot run in parallel. This seems obvious yet is easy to miss. "Write the type definitions" and "write the implementation that uses those types" look like separate tasks but are serial.
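The dependency check can be made mechanical. Below is a minimal sketch, assuming each task declares which other tasks' outputs it needs; the task names and the `parallel_batches` helper are hypothetical, and the grouping uses Python's standard `graphlib` module rather than anything specific to Antigravity.

```python
from graphlib import TopologicalSorter


def parallel_batches(deps):
    """Group tasks into batches; tasks in one batch do not depend on each other.

    deps maps each task name to the set of tasks whose output it needs.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose inputs are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical task names, for illustration only
deps = {
    "types": set(),
    "implementation": {"types"},  # needs the type definitions first
    "docs": set(),
}
batches = parallel_batches(deps)
print(batches)
```

Each inner list is a group that could fan out together; the order of the outer list is the part that has to stay serial.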
Second, shared resources. If multiple agents write to the same file, the same database, or the same build artifact, parallelism causes contention. Reading is fine in parallel, but the moment writes intersect, things break.
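One way to catch write contention before dispatch is to compare the targets each agent intends to write. This is a sketch under the assumption that every task can list its write targets up front; the paths and the `write_conflicts` helper are made up for illustration.

```python
from itertools import combinations


def write_conflicts(write_sets):
    """Return (task_a, task_b, shared_targets) for every pair whose writes overlap."""
    conflicts = []
    for a, b in combinations(sorted(write_sets), 2):
        shared = write_sets[a] & write_sets[b]
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts


# Hypothetical write targets per agent
write_sets = {
    "ui": {"src/components/Card.tsx"},
    "api": {"src/routes/posts.ts", "src/types.ts"},
    "schema": {"db/schema.sql", "src/types.ts"},
}
conflicts = write_conflicts(write_sets)
print(conflicts)
```

Any pair that shows up in the result should either be serialized or have the shared target assigned to a single owner.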
Third, idempotency. If a step fails partway and is re-run, is running it twice safe? In parallel execution, failures and retries happen independently, so placing non-idempotent work in parallel piles up side effects with every retry.
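The difference between the two kinds of step is easiest to see by simulating a retry. In this small sketch, the function names and the "post-42" record are hypothetical; an append stands in for a non-idempotent step and a keyed write for an idempotent one.

```python
def append_entry(log, entry):
    """Non-idempotent: every call adds another copy."""
    log.append(entry)


def upsert_entry(store, key, entry):
    """Idempotent: repeating the call leaves the same final state."""
    store[key] = entry


def run_with_retry(step, attempts):
    """Simulate a step that gets re-run, e.g. after a timeout hid a success."""
    for _ in range(attempts):
        step()


log = []
store = {}
run_with_retry(lambda: append_entry(log, "post-42 published"), attempts=3)
run_with_retry(lambda: upsert_entry(store, "post-42", "published"), attempts=3)

print(len(log))    # one entry per attempt
print(len(store))  # a single record no matter how many attempts
```

Keying a write by a stable identifier is one common way to make a step safe to re-run; it is not the only one.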
| Lens | Safe to parallelize | Keep serial |
| --- | --- | --- |
| Dependencies | Neither needs the other's output | One takes the other's output as input |
| Shared resources | Each writes to a distinct target | They write to the same file/DB |
| Idempotency | Re-running yields the same result | Each run adds side effects |
A mix of parallel and serial is clearest when designed as a fan: one line in at the entry, widening into several (fan-out), then bundled back into one at the end (fan-in).
The "plan" at the entry is owned by a single agent that decides what to build and how to divide it. This must not be parallel — overall coherence has to be established in one place. The middle implementation fans out only what does not collide on the three axes above. And the verification and deploy at the exit are always bundled back into one.
Making the exit serial is the crux. Provide one place to verify everything built in parallel. Even in my article automation, I distribute generation but always funnel the quality gates and deploy into a single place at the end.
A skeleton for aggregating the gate
The fan-in part runs verification only after every agent's output is in. Here is the skeleton in pseudo-code.
```python
import asyncio


async def run_agent(name, task):
    # Launch each agent in parallel
    result = await dispatch_agent(name, task)
    return name, result


async def orchestrate(plan):
    # Fan-out: dispatch only non-colliding work in parallel
    parallel = [
        run_agent("ui", plan["ui"]),
        run_agent("api", plan["api"]),
        run_agent("schema", plan["schema"]),
    ]
    results = dict(await asyncio.gather(*parallel))

    # Fan-in: verify in one place only after everything is in
    if not all_outputs_present(results):
        raise RuntimeError("Some agent outputs missing. Aborting deploy")

    gate_ok = run_quality_gates(results)  # serial, exactly once
    if not gate_ok:
        raise RuntimeError("Quality gate failed. Sending back")

    return deploy(results)
```
asyncio.gather runs the agents concurrently and returns only once all of them have finished; all_outputs_present then confirms that every output is in before the gate runs. This single "wait until complete" step prevents the partial, gap-ridden outputs that parallel execution tends to produce. It is tempting to scatter the gate inside each agent, but then the pass/fail criteria drift apart bit by bit, so I aggregate it in one place.
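One detail worth knowing about the waiting step: by default, asyncio.gather propagates the first exception to the caller as soon as it is raised, so a presence check placed after it never runs when an agent crashes. A possible variation, sketched below, passes `return_exceptions=True` so that every agent settles first and failures can be inspected alongside outputs. `fake_agent` and `collect` are hypothetical stand-ins, not part of the skeleton above.

```python
import asyncio


async def fake_agent(name, fail=False):
    """Stand-in for a real agent call; hypothetical, for illustration only."""
    await asyncio.sleep(0.01)
    if fail:
        raise RuntimeError(f"{name} crashed")
    return f"{name}-output"


async def collect(agents):
    """Wait for every agent to settle, then split outputs from failures."""
    names = list(agents)
    settled = await asyncio.gather(*agents.values(), return_exceptions=True)
    outputs, failures = {}, {}
    for name, result in zip(names, settled):
        if isinstance(result, Exception):
            failures[name] = result
        else:
            outputs[name] = result
    return outputs, failures


outputs, failures = asyncio.run(collect({
    "ui": fake_agent("ui"),
    "api": fake_agent("api", fail=True),  # simulate one agent failing
    "schema": fake_agent("schema"),
}))
print(sorted(outputs), sorted(failures))
```

With the failures collected in one dictionary, the single gate can decide whether to abort or send work back, instead of stopping at whichever agent happened to fail first.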
A concrete example of what I kept serial
In my automation, I process four sites in sequence. Per-site article generation is mutually independent and could be parallel, yet I do not run it fully parallel. The reason is shared resources. Because generation uses the same work disk and the same external rate limit midway, running everything at once creates windows where disk pressure or rate limits take the whole thing down together.
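When independent jobs share a disk or a rate limit, one way to keep them from all peaking at once is a concurrency cap. The sketch below uses `asyncio.Semaphore` for that; it is an assumption about how such a limit might be expressed, not a description of the setup described above, and the site names, `generate_site`, and `run_sites` are hypothetical.

```python
import asyncio


async def generate_site(name, limiter, state):
    """Hypothetical per-site job; the sleep stands in for disk- and API-heavy work."""
    async with limiter:  # at most max_concurrent jobs get past this point
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)
        state["running"] -= 1
    return name


async def run_sites(sites, max_concurrent):
    limiter = asyncio.Semaphore(max_concurrent)
    state = {"running": 0, "peak": 0}  # tracks how many jobs overlap
    done = await asyncio.gather(
        *(generate_site(site, limiter, state) for site in sites)
    )
    return done, state["peak"]


sites = ["site-a", "site-b", "site-c", "site-d"]
done, peak = asyncio.run(run_sites(sites, max_concurrent=2))
print(done, peak)
```

Setting `max_concurrent=1` makes the sites fully serial, and raising it loosens the limit without touching the per-site code.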
So within a site I fan out the division of labor, and between sites I stagger the runs so they stay close to serial. It is easy to fixate on raising the degree of parallelism, but the decision to deliberately keep some work serial is the backbone of operations that run stably for a long time. Now that Antigravity 2.0 has become an agent orchestration platform, this line can finally be written explicitly as design.
If you start designing now, first sketch your workflow as a single fan. The moment you can see where to fan out and where to bundle back in, the parallelism discussion becomes far more grounded. Thank you for reading.