Watching Parallel Agents on One Screen Made Me Redesign My Workflow

Watching several agents run in parallel on a single screen, I did not feel faster. If anything, I spent more time with my hands still. That was the most surprising part of folding the Antigravity 2.0 desktop app into my own workflow for about a week. The model got faster and I could run three or four agents at once, yet overall progress did not climb the way I expected. Chasing down the reason led me to rebuild how my automation is wired in the first place. This is that record.

I used to avoid collisions by staggering the clock

As an indie developer, I run a handful of technical sites under the name Dolice Labs, and I spread routine work, like generating article drafts and checking for broken links, across the open pockets of the day. This site in the early morning, another one before noon. I staggered the timing on purpose.

The reason seemed simple: stacking heavy jobs in the same window clogged both the machine and me. When several generation tasks ran at once, the side that fell behind was me, reviewing what came out. So I was not really spacing things out "so the processing would not overlap." I was spacing them out so my own review would not overlap. At the time I had not put it that plainly to myself.

The Antigravity 2.0 desktop app challenges that assumption head-on. You can run multiple agents in parallel and have background tasks scheduled automatically. The Manager Surface gives you a single view of what each agent is planning and where it currently stands. In other words, "running at the same time" became the default behavior. So my first thought was that the staggered schedule was now unnecessary.

Running three at once showed me the real bottleneck

As a test, I fired off draft generation for three sites at the same time. It was fast. At least, the time until each agent produced text was clearly shorter. Maybe because it shares the same agent harness as the new Go-based CLI, the responsiveness felt crisp.

But the moment three drafts came back at once, my hands stopped. None of them can be published without my eyes on them. While I read one, the second agent was already waiting to ask whether it could proceed. The third was about to start something else entirely. Before I noticed, I had become the one being kept waiting by the agents, not the other way around.

That is when it clicked. What got faster was the agents' generation, not my review. The more I raised the number of concurrent agents, the more clearly the bottleneck shifted from compute to my own judgment. The instant I went three-wide, the number of points in a day where "nothing moves until a human checks" tripled. The benefit of speed gets carried right up to that traffic jam, and stops there.

I re-chose what to delegate by "how cheap the failure is"

So I redrew the line between what goes to the parallel agents and what stays in my hands. My criteria were two questions about each task: if it fails, is it reversible, and is checking it cheap?

Work that is reversible and takes only a glance to verify goes to the background without hesitation. Internal link integrity checks, for example, return a mechanical pass or fail, so all I do at the end is look for the green check. First-draft generation is the same: if a draft is wrong, throwing it away ends the matter, so running it in parallel is fine. The lower the cost of failure and the lighter the review, the more directly the gains of concurrency reach you.

Conversely, I pulled the irreversible work out of the parallel lane: confirming a publish and deleting a page. Once executed, these reach all the way into search engine indexing, and you cannot quietly take them back later. So for just these two, no matter how many agents have piled up, I keep them in my own hands, looking at each one and pressing the button myself. Deciding from the start that they are simply not candidates for parallelization means I can watch agents waiting on screen without feeling rushed.
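As a minimal sketch, the two questions can be encoded as a tiny helper that sorts a task into a lane. The function name choose_lane and its yes/no and cheap/costly arguments are hypothetical, made up for illustration. Sending a reversible but costly-to-check task to the manual lane is also an assumption, since the text above only settles the two clear-cut cases.

```shell
# Hypothetical helper: pick a lane from the two questions.
#   $1 = is a failure reversible?  (yes|no)
#   $2 = how expensive is the check? (cheap|costly)
choose_lane() {
  local reversible="$1" check_cost="$2"
  if [ "$reversible" = "yes" ] && [ "$check_cost" = "cheap" ]; then
    echo "background"   # safe to run in parallel without a human in the loop
  else
    echo "manual"       # assumption: anything else stays in human hands
  fi
}

choose_lane yes cheap   # e.g. a link integrity check or a first draft
choose_lane no cheap    # e.g. confirming a publish or deleting a page
```

Writing the rule down as data rather than deciding case by case is what makes it possible to stop re-deciding every time an agent is waiting.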

I wrote this boundary into the task definitions themselves. Work sent to the background is allowed to run all the way through without a human in the loop, but in exchange it must pass a mechanical gate one step before anything is finalized.

# A gate every background task must pass just before "finalizing."
# It lists only mechanically decidable checks; one failure blocks finalization.
run_safety_gates() {
  local repo="$1"
 
  # Does it meet the quality bar for publishing? (machine-decided)
  quality_check "$repo" || return 1
 
  # Do all internal link targets actually exist?
  link_integrity "$repo" || return 1
 
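  # Do the Japanese and English counts match? (a missing side holds publishing)
  # Equal counts are only a proxy: they do not prove each article has its pair.
  local ja en
  ja=$(find "$repo/content" -type f -path '*/ja/*.mdx' | wc -l | tr -d ' ')
  en=$(find "$repo/content" -type f -path '*/en/*.mdx' | wc -l | tr -d ' ')
  [ "$ja" -eq "$en" ] || return 1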
 
  return 0
}
 
# Only what clears the gate rises into the human review queue
run_safety_gates "$REPO" && enqueue_for_human_review "$REPO"

The key here is that passing the gate does not publish anything automatically. The gate only judges whether something is finished enough to be worth a human's eyes; the publish itself stays in my hands. However many parallel agents I add, I keep that single point of finalization serial. After a week of trying, this is the shape the workflow settled into.
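One way to picture that single serial point of finalization is a loop that walks the review queue one item at a time and publishes only on an explicit "y". This is a sketch under assumptions: review_serially and publish_item are hypothetical names, and publish_item is a placeholder that only prints, standing in for whatever actually performs the publish.

```shell
# Placeholder (assumption): stands in for the real publish step.
publish_item() { echo "published: $1"; }

# Walk the queue one item at a time; anything but an explicit y/Y is held.
review_serially() {
  local item answer
  for item in "$@"; do
    printf 'Publish %s? [y/N] ' "$item" >&2
    read -r answer || answer=""   # treat end of input as "no"
    case "$answer" in
      y|Y) publish_item "$item" ;;
      *)   echo "held: $item" ;;
    esac
  done
}

# Usage (interactive): review_serially site-a site-b
```

Defaulting to "held" means a stray keypress or an empty answer never finalizes anything.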

In the end, I kept the staggered schedule

The surprising conclusion is that I did not fully abandon the old practice of spacing things out across the clock. I reconsidered: just because I can run them at the same time does not mean they need to come back at the same time.

The agents can start in parallel, but I let the results arrive in my review queue with slight offsets. That way I avoid the situation where three drafts surge in at once and my hands freeze. Compute in parallel, review more or less in series. That split fit my current capacity best, and the desktop app's parallel execution is exactly the foundation that lets me tune it. I used to think I was staggering for the machine's sake; now I stagger deliberately for the sake of my own review capacity. It is the same "staggering," but the reason swapped places, which I found a genuinely interesting change.
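The split between parallel compute and offset review can be sketched in plain shell. Everything named here is an assumption for illustration: run_parallel_review_serial is a hypothetical wrapper, and generate_draft and enqueue_for_human_review are stubbed placeholders for the real generation step and review queue. The sketch starts every job at once, then releases finished results to the queue one at a time with a fixed gap between them.

```shell
# Placeholders (assumptions): replace with the real generation and queue steps.
generate_draft() { :; }
enqueue_for_human_review() { echo "queued: $1"; }

# $1 = seconds between results reaching the review queue; rest = site names.
run_parallel_review_serial() {
  local gap="$1" pids="" site pid first=1
  shift
  for site in "$@"; do
    generate_draft "$site" &      # compute: every job starts together
    pids="$pids $!"
  done
  for pid in $pids; do
    site="$1"; shift              # pids were collected in argument order
    if wait "$pid"; then          # only successful jobs reach the queue
      [ "$first" -eq 1 ] || sleep "$gap"
      enqueue_for_human_review "$site"
      first=0
    fi
  done
}

# Usage: run_parallel_review_serial 1200 site-a site-b site-c
```

The gap is a knob for human capacity, not machine load, so it makes sense to set it from how long one review actually takes.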

What I will try next

If you happen to be reading this while running several projects of your own, I would start by picking just one routine task, the one whose failure is cheapest, and handing it to a background agent. Something like a link check, where the result comes back as a clear pass or fail, is a good fit. Seeing once where your own checking gets stuck shows you where your personal bottleneck sits before you raise the number of parallel agents.

For my part, the real sweet spot of concurrency is "increasing the work a human does not need to look at," not "speeding up the work a human does look at." That obvious truth only landed for me when I was finally staring at a screen full of agents. I would be glad to keep figuring this out together.