Antigravity Lab Highlights (June 1–12) — Two Weeks of Turning 'How Much Should I Delegate?' into Design Language

It has been a little while since the last weekly roundup, so this one looks back at the first half of June. The gap was deliberate in one sense: over these two weeks I focused on making each individual piece denser rather than publishing more of them.

Rereading everything published since June 1, I noticed that almost every article was circling the same underlying question from a different angle: how do you decide what an agent is allowed to do — not by gut feeling, but in explicit design terms? Rather than expanding an agent's autonomy on instinct and getting burned, the idea is to write down the conditions under which expansion is safe, before you expand. These two weeks were essentially a record of working out those conditions.

Four Yardsticks for Deciding What to Delegate

The backbone of this period was a cluster of articles about the criteria themselves.

Separate Reversible from Irreversible Operations — Deciding Agent Autonomy by Reversibility turned out to be the centerpiece. Replacing the vague question "is this task safe to hand over?" with "can this operation be rolled back if it fails?" makes the decision suddenly concrete. Running my own wallpaper apps, I still hesitate sometimes before handing work to an agent — and writing this piece helped me see that the hesitation almost always traces back to an unclear reversibility assessment.
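
As a rough illustration of the reversibility question, the check can be written down as a small lookup. This is only a sketch under assumptions: the operation names, the POLICY table, and the two outcomes ("auto" and "ask_human") are hypothetical and do not come from the article.

```python
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = "reversible"      # can be rolled back if it fails
    IRREVERSIBLE = "irreversible"  # cannot be undone once executed


# Hypothetical operation names, chosen only for illustration.
POLICY = {
    "edit_file_in_worktree": Reversibility.REVERSIBLE,
    "create_branch": Reversibility.REVERSIBLE,
    "publish_store_release": Reversibility.IRREVERSIBLE,
    "delete_remote_assets": Reversibility.IRREVERSIBLE,
}


def decide(operation: str) -> str:
    """Return 'auto' for reversible operations, 'ask_human' for everything else."""
    kind = POLICY.get(operation)
    if kind is Reversibility.REVERSIBLE:
        return "auto"
    # Irreversible and unclassified operations both stop for a human.
    return "ask_human"
```

Treating an unclassified operation the same as an irreversible one is a deliberate choice in this sketch: an unclear reversibility assessment is exactly the case where the hesitation described above shows up.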

Let the Agent Swing and Miss Before It Touches Production — Designing a Zero-Side-Effect Dry-Run Layer covers the safety device that sits one step earlier. Having the agent emit only its execution plan first, for a human to read, is an unglamorous extra layer — but it quietly widens the range of work you can delegate.
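
A dry-run layer of this kind might look like the following minimal sketch. The Step type and the run function are assumptions made for illustration, not the article's implementation; the only point shown is that the plan is produced without any side effect unless dry_run is switched off.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str            # what a human reads in the plan
    action: Callable[[], None]  # the side effect, skipped during a dry run


def run(steps: List[Step], dry_run: bool = True) -> List[str]:
    """Return the plan as text; execute side effects only when dry_run is False."""
    plan = []
    for number, step in enumerate(steps, start=1):
        plan.append(f"{number}. {step.description}")
        if not dry_run:
            step.action()
    return plan
```

Defaulting dry_run to True means a forgotten argument produces a plan rather than a change.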

Rolling Back the Side Effects of a Partially Failed Agent — Implementation Notes on Compensation Transactions deals with cleanup when things fail midway anyway. Applying the Saga pattern from the database world to an agent's sequence of operations felt like a genuinely productive transplant.
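
The core of the Saga idea can be sketched in a few lines: each step is paired with a compensating action, and when a step fails, the compensations for the steps that already completed run in reverse order. The run_saga name and the (do, undo) pair shape are assumptions for this example, not the article's code.

```python
from typing import Callable, List, Tuple

Action = Callable[[], None]


def run_saga(steps: List[Tuple[Action, Action]]) -> bool:
    """Run (do, undo) pairs in order; on failure, undo completed steps in reverse."""
    completed: List[Action] = []
    for do, undo in steps:
        try:
            do()
        except Exception:
            # Compensate only the steps that actually finished.
            for compensate in reversed(completed):
                compensate()
            return False
        completed.append(undo)
    return True
```

This sketch does not handle a compensation that itself fails; a real setup would need to decide whether to retry, log, or stop for a human at that point.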

Keeping Parallel Agent Token Costs Under a Budget — Designing a Guard That Stops Runaway Spend asks the same question from the financial side. The wider the delegation, the more a mechanical cost ceiling earns its keep.
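
A mechanical ceiling shared by parallel agents could be as simple as a locked counter. The TokenBudget class below is a hypothetical sketch, assuming each agent reports its token usage through charge before or after a call; it is not tied to any particular provider's API.

```python
import threading


class BudgetExceeded(Exception):
    """Raised when a charge would push the shared total past the limit."""


class TokenBudget:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.spent = 0
        self._lock = threading.Lock()  # agents may charge from several threads

    def charge(self, tokens: int) -> None:
        """Record usage; refuse the charge if the total would pass the limit."""
        with self._lock:
            if self.spent + tokens > self.limit:
                raise BudgetExceeded(f"{self.spent + tokens} > {self.limit}")
            self.spent += tokens
```

The check and the increment sit under the same lock so that two agents cannot both slip under the limit at the same moment.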

How You Hand Over Tools, and How Diffs Get Sliced

Once the scope of delegation is settled, the next problem is how to hand things over.

Choosing the Granularity of the Tools You Give an Agent — Coarse Bundles or Fine Slices tackles a question anyone who has wired up an MCP integration has faced: should tool definitions be broad or narrow? There is no single right answer, but the framing in this piece — work backwards from "at which layer do I want to notice a failure?" — is one I now use directly in my own projects.

Teaching the Agent to Slice Diffs Small — How One Month of Operation Changed My Review Habits is about the receiving end. When large diffs arrive all at once, review degenerates into a ritual. Forcing smaller slices gave the human check its substance back.

Keeping Secrets Out of Agent Output — Multi-Layer Redaction for Logs, Diffs, and PR Bodies belongs in the same context. The more you delegate, the more paths agent output travels through. Placing a defense at each of the three exits — logs, diffs, and PR text — is unglamorous work, but I consider it non-negotiable.
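
To make the three-exit idea concrete, here is a small sketch in which every exit goes through the same redaction step. The patterns are illustrative assumptions only and will not match every secret format; the channel names mirror the three exits named above.

```python
import re

# Illustrative patterns only; a real setup needs patterns for its own secret formats.
PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]

CHANNELS = {"log", "diff", "pr_body"}


def redact(text: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def emit(channel: str, text: str) -> str:
    """Apply redaction at every exit; reject exits that were never registered."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return redact(text)
```

Rejecting unknown channels is the part that matters as delegation grows: a new output path has to be registered before anything can leave through it.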

Managed Agents API — Drawing the Cloud/Local Boundary

Published at the end of this period, Running the Managed Agents API and Thinking About the Boundary Between Cloud and Local Execution connects all of this design talk to the question of where execution should live. Cloud-run agents are wonderfully low-maintenance, yet there is a kind of reassurance only local execution provides. The article isn't an argument for either side — it is about drawing the boundary per type of work.

Running Six Sites Solo with Autonomous Agents — A Timetable Design That Avoids Collisions and Spam Signals uses my own operation as its raw material. When one person runs multiple properties, agents whose execution windows merely overlap can collide in surprising ways. It was a pleasant discovery that the timetable — a thoroughly classical tool — still works in the world of autonomous systems.
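
Checking a timetable for overlapping execution windows is a small computation. The sketch below assumes each site has a single daily window expressed in minutes since midnight, with the end exclusive; the site names and the find_overlaps function are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# A window is (start_minute, end_minute) within one day, end exclusive.
Window = Tuple[int, int]


def find_overlaps(schedule: Dict[str, Window]) -> List[Tuple[str, str]]:
    """Return every pair of sites whose execution windows overlap."""
    clashes = []
    for (a, (a_start, a_end)), (b, (b_start, b_end)) in combinations(schedule.items(), 2):
        # Two half-open intervals overlap when each starts before the other ends.
        if a_start < b_end and b_start < a_end:
            clashes.append((a, b))
    return clashes
```

Windows that cross midnight are not handled here; they would need to be split into two windows first.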

Hands-On Records — Worktrees, App Migration, Delivery Pipelines

Alongside the design essays, this period also had its share of hands-on records.

Two Months of Maintaining Four Wallpaper Apps in Parallel with Antigravity and git worktree is an honest account of a worktree workflow I adopted to escape branch-switching wait times — including where it pinched, around disk usage and IDE indexing.

Migrating a Wallpaper App to Mandatory Edge-to-Edge Under targetSdk 36 with Antigravity documents Android 16-era compliance work on real devices. Enforced platform changes are deadline work, so the division of labor — agent does the research and drafts the diffs, human concentrates on verification — paid off once again.

Rebuilding Wallpaper Image Delivery Around Resolution Buckets — Letting an Antigravity Agent Handle Conversion and Validation covers the delivery side: instead of fitting images to each device resolution one by one, bundle them into buckets and let the agent run conversion and validation.
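
The bucket lookup itself can be sketched as picking the smallest bucket that still covers the device. The bucket widths below are made-up values for illustration, since the actual buckets are not listed here, and pick_bucket is a hypothetical helper.

```python
import bisect
from typing import List

# Hypothetical bucket widths in pixels, sorted ascending.
BUCKETS: List[int] = [720, 1080, 1440, 2160]


def pick_bucket(device_width: int) -> int:
    """Return the smallest bucket that covers the device, or the largest bucket."""
    index = bisect.bisect_left(BUCKETS, device_width)
    if index == len(BUCKETS):
        # Wider than every bucket: fall back to the largest one available.
        return BUCKETS[-1]
    return BUCKETS[index]
```

With a fixed bucket list, conversion and validation only have to cover a handful of sizes rather than every device resolution.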

The Quiet Troubleshooting Set

As always, this period included the practical articles that readers tend to arrive at via search, mid-frustration: the MCP server failing to start with "spawn npx ENOENT", connections failing with "self-signed certificate in chain", and agent edits failing with "patch does not apply". Every one of these stopped me in my own work at some point, and each article starts from the diagnostic order I actually used.

For those still setting up their environment, there were also two orientation pieces: Antigravity vs OpenAI Codex — An AI Coding Agent Comparison for 2026 and Configuring Antigravity 2.0 for a Comfortable Japanese UI. The comparison piece isn't there to crown a winner — it's material for deciding which parts of your own workflow to hand to which tool.

Looking Ahead

The question that ran through early June — deciding delegation scope in design language rather than by feel — is far from settled. The Managed Agents API in particular has only just entered real operation here, and how the work I've handed to cloud execution matures is something I'll be watching over the coming weeks. I'll share the follow-up records as they take shape.

For now, pick the one article closest to your own operation and start there. Thank you for reading.