How to Orchestrate Multiple Agents: Drawing the Line Between Parallel and Serial Work

Antigravity 2.0 brings true parallel execution across multiple agents. But making everything parallel does not make it faster. Which work should fan out in parallel, and which should stay serial? This article lays out an orchestration design that does not fall apart, framed in terms of dependencies and contention.

"Everything in parallel is faster" is usually wrong

In Antigravity 2.0, one agent writing a React component, another configuring an API route, and a third running visual regression tests in a headless browser can all happen at once. It is the clearest sign that Antigravity's character has shifted from "code editor" to "platform that orchestrates and runs agents."

The first thing many people try is to throw every task they can think of into parallel. But that does not reliably make things faster. More often it multiplies accidents: two agents rewrite the same file at once and corrupt it, or a task that depends on another task's result starts too early and comes back empty.

As an indie developer, I run several blog sites in parallel and have spent a long time probing how far the processing can be parallelized. What sank in is that designing for parallelism is not "designing to go fast" but "designing to avoid collisions."

Three axes for deciding what may run in parallel

Whether a task may be split into parallel work should be judged by criteria, not by gut. I check these three in order.

First, dependencies. If task B needs the output of task A, the two cannot run in parallel. This seems obvious yet is easy to miss. "Write the type definitions" and "write the implementation that uses those types" look like separate tasks but are serial.
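As a minimal sketch of the dependency check, the standard library's graphlib can group tasks into waves, where everything in one wave may run in parallel and each wave waits for the one before it. The task names and the parallel_waves helper are hypothetical, chosen only to mirror the type-definition example above.

```python
from graphlib import TopologicalSorter

def parallel_waves(depends_on):
    """Group tasks into waves; tasks in the same wave have no unmet dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unlock its successors
    return waves

# Hypothetical task graph: each key lists the tasks whose output it needs.
tasks = {
    "types": [],
    "implementation": ["types"],
    "docs": [],
    "tests": ["implementation"],
}

waves = parallel_waves(tasks)
print(waves)  # [['docs', 'types'], ['implementation'], ['tests']]
```

Here "types" and "docs" can fan out together, while "implementation" and "tests" each form their own serial step.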

Second, shared resources. If multiple agents write to the same file, the same database, or the same build artifact, parallelism causes contention. Concurrent reads are fine, but as soon as two writes hit the same target, one can overwrite or corrupt the other.
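One way to make this check mechanical, assuming each agent can declare its write targets up front, is to look for any target claimed by more than one agent before fanning out. The file paths and the find_write_conflicts helper below are illustrative, not part of any Antigravity API.

```python
from collections import defaultdict

def find_write_conflicts(write_targets):
    """Return {target: [agents]} for every target written by more than one agent."""
    writers = defaultdict(list)
    for agent, targets in write_targets.items():
        for target in targets:
            writers[target].append(agent)
    return {t: sorted(a) for t, a in writers.items() if len(a) > 1}

# Hypothetical declaration of what each agent writes (reads are not listed).
write_targets = {
    "ui": {"src/components/Button.tsx"},
    "api": {"src/routes/users.ts", "src/types.ts"},
    "schema": {"db/schema.sql", "src/types.ts"},
}

conflicts = find_write_conflicts(write_targets)
print(conflicts)  # {'src/types.ts': ['api', 'schema']}
```

A non-empty result means those agents should either be serialized or have the shared target assigned to a single owner.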

Third, idempotency. If a step fails partway and is re-run, is running it twice safe? In parallel execution, failures and retries happen independently, so placing non-idempotent work in parallel piles up side effects with every retry.
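The difference is easy to see in a toy example. The sketch below simulates one attempt plus two retries against an append-style write and a keyed write; both functions and the data are hypothetical.

```python
def append_entry(log, slug):
    """Non-idempotent: every retry adds another entry."""
    log.append(slug)

def upsert_entry(index, slug, title):
    """Idempotent: retries overwrite the same key, so the end state is identical."""
    index[slug] = title

log, index = [], {}
for _ in range(3):  # simulate one attempt plus two retries
    append_entry(log, "hello-world")
    upsert_entry(index, "hello-world", "Hello, World")

print(len(log), len(index))  # 3 1
```

The keyed write ends in the same state no matter how many times it runs, which is the property that makes a step safe to retry independently.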

Axis	Safe to parallelize	Keep serial
Dependencies	Neither needs the other's output	One takes the other's output as input
Shared resources	Each writes to a distinct target	They write to the same file/DB
Idempotency	Re-running yields the same result	Each run adds side effects

Think in fan structures

A mix of parallel and serial is clearest when designed as a fan: one line in at the entry, widening into several (fan-out), then bundled back into one at the end (fan-in).

            ┌─ Agent A: UI components ─┐
entry(plan)─┼─ Agent B: API routes    ─┼─ gate aggregation → deploy
            └─ Agent C: schema defs    ─┘
                 (fan-out = parallel)       (fan-in = serial)

The "plan" at the entry is owned by a single agent that decides what to build and how to divide it. This step must not be parallel, because overall coherence has to be established in one place. The implementation in the middle fans out only the work that does not collide on the three axes above. The verification and deploy at the exit are always bundled back into one.

Making the exit serial is the crux. Provide one place to verify everything built in parallel. Even in my article automation, I distribute generation but always funnel the quality gates and deploy into a single place at the end.

A skeleton for aggregating the gate

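The fan-in part runs verification only after every agent's output is in. Here is the skeleton in Python. dispatch_agent, all_outputs_present, run_quality_gates, and deploy are placeholders for your own implementations.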

import asyncio

async def run_agent(name, task):
    # Run a single agent and tag its result with the agent's name
    result = await dispatch_agent(name, task)
    return name, result

async def orchestrate(plan):
    # Fan-out: dispatch only non-colliding work in parallel
    parallel = [
        run_agent("ui", plan["ui"]),
        run_agent("api", plan["api"]),
        run_agent("schema", plan["schema"]),
    ]
    # gather runs the coroutines concurrently and returns results in input order
    results = dict(await asyncio.gather(*parallel))

    # Fan-in: verify in one place only after everything is in
    if not all_outputs_present(results):
        raise RuntimeError("Some agent outputs missing. Aborting deploy")

    gate_ok = run_quality_gates(results)   # serial, exactly once
    if not gate_ok:
        raise RuntimeError("Quality gate failed. Sending back")

    return deploy(results)

asyncio.gather runs the agents concurrently, while all_outputs_present confirms everything is in before proceeding to the gate. This single "wait until complete" step prevents the patchy, partially missing outputs that parallel execution tends to produce. It is tempting to scatter the gate inside each agent, but doing so lets the pass/fail criteria drift apart bit by bit, so I choose to aggregate it in one place.
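The skeleton above leaves its helpers undefined, so here is a small self-contained sketch of the "wait until complete" check that can be run as is. The dispatch_agent stand-in, the missing_outputs and gate helpers, and the sample plans are all assumptions for illustration. It also passes return_exceptions=True to asyncio.gather, which is a variation on the skeleton: a failed agent is then collected as a result instead of aborting the gather on the first exception.

```python
import asyncio

async def dispatch_agent(name, task):
    # Hypothetical stand-in: a real version would call out to the agent platform.
    await asyncio.sleep(0.01)
    if task is None:
        return None  # simulate an agent that produced nothing
    return f"{name} built: {task}"

async def fan_out(plan):
    names = list(plan)
    outputs = await asyncio.gather(
        *(dispatch_agent(n, plan[n]) for n in names),
        return_exceptions=True,  # collect failures instead of raising on the first one
    )
    return dict(zip(names, outputs))  # gather preserves input order

def missing_outputs(results):
    """Names whose output is absent or is an exception."""
    return sorted(n for n, r in results.items() if r is None or isinstance(r, Exception))

def gate(results):
    # Single place that refuses to proceed unless every output is present.
    missing = missing_outputs(results)
    if missing:
        raise RuntimeError(f"Missing outputs: {missing}. Aborting deploy")
    return results

complete = asyncio.run(fan_out({"ui": "Button", "api": "/users", "schema": "users table"}))
partial = asyncio.run(fan_out({"ui": "Button", "api": None}))
print(missing_outputs(complete), missing_outputs(partial))  # [] ['api']
```

Calling gate(partial) raises RuntimeError, so nothing downstream ever sees an incomplete set of outputs.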

A concrete example of what I kept serial

In my automation, I process four sites in sequence. Per-site article generation is mutually independent and could be parallel, yet I do not run it fully parallel. The reason is shared resources. Generation for every site uses the same working disk and is subject to the same external rate limit, so running everything at once creates windows where disk pressure or a rate limit takes down all the sites together.

So within a site I fan out the division of labor, and between sites I stagger the runs so that they lean toward serial. It is easy to fixate only on raising the degree of parallelism, but the decision to "deliberately keep some work serial" is the backbone of operations that run stably for a long time. Now that Antigravity 2.0 has become an agent orchestration platform, this boundary can finally be written down explicitly as part of the design.
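A rough sketch of that shape, under the assumption that each site's steps are independent of one another: the outer loop walks the sites one at a time, and only the inner steps fan out. The site names, step names, and the pause parameter are hypothetical, and run_step merely simulates work with a short sleep.

```python
import asyncio

async def run_step(site, step, log):
    # Hypothetical stand-in for one unit of work inside a site.
    log.append(("start", site, step))
    await asyncio.sleep(0.01)
    log.append(("end", site, step))

async def run_site(site, steps, log):
    # Within a site: fan out the steps that do not collide.
    await asyncio.gather(*(run_step(site, s, log) for s in steps))

async def run_all(sites, steps, log, pause=0.0):
    # Between sites: one at a time, so disk and rate-limit load never stacks up.
    for site in sites:
        await run_site(site, steps, log)
        await asyncio.sleep(pause)  # optional gap between sites

log = []
asyncio.run(run_all(["site-a", "site-b"], ["draft", "images"], log))
sites_in_order = [site for _, site, _ in log]
print(sites_in_order)
```

Every "site-a" entry lands in the log before any "site-b" entry, while the two steps inside a site both start before either finishes.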

If you start designing now, first sketch your workflow as a single fan. The moment you can see where to fan out and where to bundle back in, the parallelism discussion becomes far more grounded. Thank you for reading.
