ANTIGRAVITY LAB
Agents & Manager · 2026-06-14 · Intermediate

Before You Trust 'It's Fixed' — Make the Agent Confirm Your Live URL Actually Renders

The agent reported it had fixed the bug and deployed successfully, yet the production page was blank. To prevent the empty-body-with-200 trap, here is how to add a completion gate that makes Antigravity 2.0's Browser Sub-Agent open the live URL and confirm the main content selector is actually filled.


On a quiet Saturday morning I asked an agent to fix a minor layout glitch. It came back with the usual completion report: "Fixed the relevant section and confirmed the build and deploy both succeeded." Reassured, I moved on to other work. A few hours later I opened the production page and the entire article body was empty. The HTTP status was 200. The build log showed no errors. And yet what readers were seeing was a page with nothing in it.

What makes this kind of failure so slippery is that every word of the agent's report was true. The code was fixed, the build passed, the deploy finished. There was no lie anywhere. The only thing nobody had checked was whether the fixed code was actually rendering on the reader's screen at the live URL. Today I want to share how to hand that final step to Antigravity 2.0's Browser Sub-Agent as a completion gate.

A successful build and a visible page are two different guarantees

The first thing to internalize is that a successful deploy does not mean successful rendering. The "empty body with a 200" I ran into has a few typical causes.

When an exception happens partway through server-side rendering, some frameworks still stream the response back as a 200 while swapping the body for an error boundary. If fetching a static asset fails for a moment, a page generated with an empty body can be served as-is. And if an edge cache happens to grab that broken HTML at exactly that moment, it stays frozen for a while. In every one of these cases the build succeeded, so nothing looks wrong from the build log.

What the agent looks at is usually that same build log plus the deploy API's response. In other words, it declares completion based solely on "success" signals. What we actually want to confirm sits one step further out: when you open the public URL for real, is there text inside the element that holds the body? That question lived outside the agent's field of view.

Redefine "done" as "the main selector is filled at the live URL"

The direction of the fix is simple. Push the agent's definition of "done" one step past build success, so that it includes confirming the render at the live URL. Antigravity 2.0's Browser Sub-Agent can drive a real browser separately from the main agent, so you can have it do this check itself.

The completion criteria I keep in AGENTS.md (the instruction file for agents at the project root) read roughly like this:

## Definition of done (changes that involve a deploy)
 
A successful build and deploy alone do NOT count as "done".
Report completion only after all of the following hold:
 
1. Open the production URL (the affected page) with the Browser Sub-Agent
2. Confirm the body container `.article-content` exists and
   its text length is at least 200 characters
3. Confirm no element with a `data-error-boundary` attribute
   exists on the page
4. If the above cannot be met, do NOT write "done"; instead report
   the observed state (empty body, presence of an error boundary)

The key is that the criteria name the selector to check explicitly. A vague instruction like "confirm the page displays correctly" leads the agent to take a screenshot and reply "it appears to be displaying." On a screenshot, an empty page is still "a white screen being displayed," so that check never catches it. Name the element that holds the body and require that it contain text of sufficient length, and empty pages get rejected reliably.

The verification script you hand to the Browser Sub-Agent

Whether the selector exists and how long its contents are can be checked mechanically by running the following script in the Browser Sub-Agent's console. After opening the production URL, I have the agent evaluate it and use its return value as the basis for the completion decision.

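// Evaluate in the page with the production URL open
(() => {
  // Body container; swap the selector for your own site's
  const main = document.querySelector('.article-content');
  const text = main ? main.innerText.trim() : '';
  // An error boundary on the page means rendering threw an exception
  const hasErrorBoundary =
    document.querySelector('[data-error-boundary]') !== null;
  // Note: the parsed DOM always serializes a closing </html>, so this
  // flag cannot detect a truncated response; inspect the raw response
  // body if truncation is a concern
  const htmlClosed = document.documentElement.outerHTML
    .includes('</html>');

  return {
    selectorFound: !!main,
    textLength: text.length,
    hasErrorBoundary,
    htmlClosed,
    // RENDERED only when the container exists, holds at least
    // 200 characters, and no error boundary is present
    verdict:
      !!main && text.length >= 200 &&
      !hasErrorBoundary && htmlClosed
        ? 'RENDERED'
        : 'EMPTY_OR_BROKEN',
  };
})();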

Completion is granted only when verdict is RENDERED; if EMPTY_OR_BROKEN comes back, the agent returns to the fix phase. I keep hasErrorBoundary as a separate flag to help isolate the cause. If an error boundary is showing, suspect an exception during rendering; if the body is empty but there is no error boundary either, an asset-fetch failure is the more likely culprit. The return value alone narrows down where to look next.
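
The branching described above can be sketched as a small helper that turns the verification script's return value into a suggested next step. The function name `nextStepFor` and the step labels are illustrative assumptions, not part of Antigravity, and the mapping from flags to causes is a starting hypothesis rather than a diagnosis.

```javascript
// Hypothetical helper: maps the verification result to a suggested next step.
// `result` is the object returned by the verification script above.
// `minLength` mirrors the 200-character floor used in that script.
function nextStepFor(result, minLength = 200) {
  if (result.verdict === 'RENDERED') {
    return 'done';
  }
  if (result.hasErrorBoundary) {
    // An error boundary is showing: look for an exception during rendering
    return 'inspect-render-exception';
  }
  if (!result.selectorFound) {
    // The container itself is missing: the selector or template may be wrong
    return 'check-selector-or-template';
  }
  if (result.textLength < minLength) {
    // Container exists but is (nearly) empty and no boundary is showing
    return 'inspect-asset-fetch';
  }
  // Anything else (for example htmlClosed being false) needs a human look
  return 'inspect-manually';
}
```

Keeping the labels coarse is deliberate: the agent only needs to know where to look first, not a full root cause.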

Why require not just "the selector exists" but "text length above a threshold"? Because a very common failure mode is that the element itself is generated but its contents are empty. The .article-content box is there, yet there isn't a single character inside it. If you look only at whether the selector exists, this state slips right through. Set the length threshold to match your minimum content volume; for body pages I use 200 characters as the floor.

Avoiding the "you're still seeing the old version" trap right after deploy

There is one pitfall here. If you open the production URL immediately after the deploy completes, the edge may still have the previous version cached, so you end up confirming the pre-fix HTML. This produces the hardest-to-catch slip of all: "rendering succeeded, but what you verified was the old code."

The workaround I use is to have the verification script compare a value that changes with every deploy, such as a deploy identifier embedded in the HTML. Only when the identifier you just deployed matches the one read from the production URL can you be certain you're looking at the new version. If your site has no such identifier, pick something that necessarily changes with the fix (a heading's new wording, or a newly added element), treat it as a passphrase, and add its presence to the conditions. The point is to confirm not just "it succeeded" but "what I fixed this time is visible right now."
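
One way to sketch the identifier comparison, assuming the build writes its ID into a `<meta name="deploy-id">` tag. The tag name, the function name `matchesDeploy`, and the `expectedId` value are all hypothetical; use whatever marker your own pipeline actually emits. The document is passed in as a parameter so the function can be called with `document` in the browser or exercised against a stub.

```javascript
// Hypothetical check: is the page at the live URL the build we just deployed?
// Assumes the build embeds <meta name="deploy-id" content="..."> in the HTML.
function matchesDeploy(doc, expectedId) {
  const meta = doc.querySelector('meta[name="deploy-id"]');
  // null when the tag is absent, so a missing marker never counts as a match
  const liveId = meta ? meta.getAttribute('content') : null;
  return {
    liveId,
    expectedId,
    isCurrent: liveId !== null && liveId === expectedId,
  };
}
```

In the browser this would be called as `matchesDeploy(document, idFromDeployStep)`, with `isCurrent` added to the conditions for `RENDERED`. Returning both IDs, not just the boolean, lets the agent report which version it actually saw.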

Operational notes to make the gate stick

If you write this into AGENTS.md once and forget about it, the agent may skip it when it's in a hurry. As an indie developer running several sites in parallel, I pair deploy-bearing tasks with the practice of slicing work into "reviewable units," keeping changes small so the check itself stays light. The thinking on granularity is covered in slicing agent requests into reviewable units. When only one or two pages need confirming, the Browser Sub-Agent's round trip stays short too.

If you stumble on the render check itself — especially the case where a single-page app's body generation lags and gets misread as an empty page — the waiting strategy covered in why the Browser Sub-Agent misreads SPAs as empty pages helps. Just inserting the small step of waiting for the body container to appear before evaluating the verification script cuts false negatives considerably.
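
A minimal sketch of that waiting step, assuming simple polling is acceptable. The name `waitForText`, the default timeout, and the polling interval are illustrative choices, not Antigravity settings. It waits for the container to be populated, not merely present, and takes the document as a parameter so it runs in the browser with `document` or against a stub.

```javascript
// Hypothetical helper: poll until the body container holds enough text,
// or give up after timeoutMs. Resolves to true on success, false on timeout.
function waitForText(doc, selector, minLength = 200, timeoutMs = 10000, intervalMs = 250) {
  const deadline = Date.now() + timeoutMs;
  return new Promise((resolve) => {
    const check = () => {
      const el = doc.querySelector(selector);
      const text = el ? (el.innerText || '').trim() : '';
      if (text.length >= minLength) {
        resolve(true);
      } else if (Date.now() >= deadline) {
        resolve(false);
      } else {
        // Not ready yet: try again after a short pause
        setTimeout(check, intervalMs);
      }
    };
    check();
  });
}
```

The verification script would then be evaluated only after `waitForText(document, '.article-content')` resolves. A `false` result is itself worth reporting, since it means the body never filled within the timeout.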

If this gate does detect an empty page, it's often faster to roll back to the pre-break state and calmly read the diff than to dive straight into root-cause analysis. The rollback procedure is organized in Antigravity Checkpoints & Rollback.

Your next step

If you currently take the agent's "deployed" at face value and move straight to the next task, start by adding one line to AGENTS.md: "Do not treat as done until the body selector at the production URL is confirmed to contain text." You can use the verification script above as-is; just swap the selector name for your own site's body container. A green build check and text being visible on the reader's screen are two different guarantees. Bake that single fact into the agent's completion criteria, and you will reliably cut down on blank pages reaching readers.

Thank you for reading. I hope this offers a bit of quiet reassurance to anyone who, like me running several pages at Dolice Labs, updates their sites automatically in the same way.
