Make the Self-Debugging Agent Walk the Logged-In and Post-Paywall Screens

By default, Antigravity 2.0's real-browser self-debug only sees the logged-out free view and reports success. To catch billing regressions, inject an authenticated session and paid state into the agent's browser and force coverage with assertions.

Last week an Antigravity 2.0 agent came back with "Verified every screen in a real browser, from the home page to article detail. No issues found," complete with evidence. I merged it, relaxed, and the next morning a reader emailed to say a premium article's paywall had lifted for free members and the full text was visible to everyone.

The bug was not in the code. It was in where the check started. The Chrome the agent launched was a first-time visitor with no cookies, so it only ever walked the free, logged-out state. The one screen that must never break, the billing path, was never opened during self-debug. As an indie developer whose revenue rests on premium articles and in-app purchases, I cannot leave that blind spot alone.

Why default self-debug only sees the free, first-visit state

Real-browser self-debug launches an actual Chrome mid-build, drives elements, and checks its own work with screenshots. The speed is genuinely useful, but that Chrome starts from a clean profile every time. It carries no login session and no cookie that marks the visitor as paid.

So the agent always looks at the public face of the site. On my sites, article body gating runs across three states.

State	What decides it	Walked by default?
Logged out, free article	No cookie	Yes (the only one)
Logged-in member	premium_token cookie	No
Single-article purchase	article_purchases cookie	No

If two of three states are never seen, "all screens OK" does not mean all screens. The agent's well-meaning pass report quietly breeds complacency. Where to keep the evidence and where to draw the approval boundary is covered in "Designing evidence and approval for real-browser self-debug"; here I dig into the step before that: whether the agent is even being shown the right state.

Three entry points for showing the agent a state

To make the agent walk the paid screen, you provide an entry point that injects the state. I adopt them in this order.

1. Pre-seeding the cookie (first choice)

On preview only, seed a premium_token cookie before any navigation. This reproduces "how a member sees it" without touching the real Stripe flow at all, so it is the fastest and safest option.

2. Deep-linking to the post-purchase URL

Send the browser straight to the unlocked article URL and confirm the rendered result. Use it together with cookie seeding.

3. Flag-forced unlock (last resort)

Disable the paywall with an environment variable. This bypasses the decision logic itself, so I treat it strictly as a visual check and never as verification of the billing decision.
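
If you do reach for the flag, one way to keep it from leaking out of preview is to honor it only when the environment also says "preview". The sketch below is a minimal illustration, not the site's actual code: the names PAYWALL_FORCE_UNLOCK and DEPLOY_ENV are hypothetical, and the function takes a plain object so it does not depend on any particular runtime.

```typescript
// Hypothetical guard for the flag-forced unlock (all variable names are assumptions).
type Env = Record<string, string | undefined>;

function paywallForcedOpen(env: Env): boolean {
  // The unlock must be requested explicitly...
  const requested = env.PAYWALL_FORCE_UNLOCK === "1";
  // ...and is honored only on a preview deployment, never in production.
  const isPreview = env.DEPLOY_ENV === "preview";
  return requested && isPreview;
}
```

Requiring both conditions means a flag that is accidentally copied into production settings still leaves the paywall closed.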

Injecting an authenticated session into the agent's browser

Prepare a single preview verification harness and instruct the agent to check only through it. Below is a minimal Playwright setup that seeds a member cookie into the preview and walks both the free and premium states.

// verify-paywall.ts — a preview-only state injection harness
import { chromium, Browser, BrowserContext } from "playwright";
 
const PREVIEW = process.env.PREVIEW_URL ?? "https://preview.example.pages.dev";
const PREMIUM_ARTICLE = "/articles/agents/some-premium-slug";
 
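// A verification-only token for preview (separate from the production signing key)
const PREVIEW_PREMIUM_TOKEN = process.env.PREVIEW_PREMIUM_TOKEN ?? "";
if (PREVIEW_PREMIUM_TOKEN === "") {
  // An empty cookie value would only ever render the free view, so fail fast
  throw new Error("PREVIEW_PREMIUM_TOKEN is not set");
}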
 
async function seedMemberContext(browser: Browser): Promise<BrowserContext> {
  const ctx = await browser.newContext();
  // Burn in the member cookie before any navigation
  await ctx.addCookies([
    {
      name: "premium_token",
      value: PREVIEW_PREMIUM_TOKEN,
      domain: new URL(PREVIEW).hostname, // a cookie with a mismatched domain is silently dropped, so be exact
      path: "/",
      httpOnly: true,
      secure: true,
      sameSite: "Lax",
    },
  ]);
  return ctx;
}
 
async function main() {
  const browser = await chromium.launch();
  try {
    // (A) Confirm the paywall shows for a free visitor and the premium body does not
    const guest = await browser.newContext();
    const guestPage = await guest.newPage();
    await guestPage.goto(PREVIEW + PREMIUM_ARTICLE, { waitUntil: "networkidle" });
    const paywallShown = await guestPage.locator("[data-paywall]").count();
    const guestBodyLeaked = await guestPage.locator("[data-premium-body]").count();

    // (B) Confirm the full body shows for a member
    const member = await seedMemberContext(browser);
    const memberPage = await member.newPage();
    await memberPage.goto(PREVIEW + PREMIUM_ARTICLE, { waitUntil: "networkidle" });
    const fullBodyShown = await memberPage.locator("[data-premium-body]").count();

    console.log(JSON.stringify({ paywallShown, guestBodyLeaked, fullBodyShown }));

    // The paywall must show for free visitors, the premium body must not leak
    // to them, and the body must show for members, all in the same run
    if (paywallShown < 1 || guestBodyLeaked > 0 || fullBodyShown < 1) {
      process.exitCode = 1;
    }
  } finally {
    // Close the browser even when a navigation throws
    await browser.close();
  }
}

// Any unexpected error (timeout, bad URL) also fails the run
main().catch((err) => {
  console.error(err);
  process.exit(1);
});

The point is to keep the free and member visits in separate BrowserContexts and always walk both in the same run. Checking only one side leaves one of two regressions uncaught: the "full text leaks to free visitors" regression from the opening incident, or the opposite "body never renders for members" regression.

Fail the run if the post-paywall DOM was never reached

A harness is not enough if the agent shrugs and says "I was short on time, so I only looked at the free side." So make passing through each required state a machine-enforced assertion. Place markers in the DOM and fail the whole run if there is no record of hitting them.

// coverage-assert.ts — check required-state arrival as coverage
type Visited = { key: string; seen: boolean };
 
const REQUIRED: Visited[] = [
  { key: "guest_paywall", seen: false },    // saw data-paywall while free
  { key: "member_full_body", seen: false }  // saw data-premium-body as a member
];
 
export function markVisited(key: string) {
  const target = REQUIRED.find((v) => v.key === key);
  if (target) target.seen = true;
}
 
export function assertCoverage(): void {
  const missed = REQUIRED.filter((v) => !v.seen).map((v) => v.key);
  if (missed.length > 0) {
    // failing here structurally forbids "passed after only seeing the free screen"
    throw new Error("uncovered paywall states: " + missed.join(", "));
  }
}

Wire these two files into the preview procedure and spell them out in the agent's guide. Putting a state-coverage contract in the project's guide file keeps the agent from settling for the free side on its own.

## Required self-debug coverage
- Verify any billing-related screen only through verify-paywall.ts
- Reach both guest_paywall and member_full_body, then pass assertCoverage()
- If either state is unreached, report that run as a failure
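
As listed, verify-paywall.ts never calls markVisited or assertCoverage, so the two files still need glue. The sketch below shows one possible shape for it. To stay runnable on its own it inlines a small tracker instead of importing coverage-assert.ts, and the names createCoverage and checkRun are hypothetical. Unlike the original markVisited, this tracker rejects unknown keys, which is my own assumption about what is useful: a typo in a key then fails loudly instead of leaving a state unmarked.

```typescript
// coverage-wiring.ts: a self-contained sketch (helper names are hypothetical)
type Counts = { paywallShown: number; fullBodyShown: number };

function createCoverage(requiredKeys: string[]) {
  const seen: Record<string, boolean> = {};
  return {
    markVisited(key: string): void {
      // Reject unknown keys so a typo cannot silently leave a state unmarked
      if (requiredKeys.indexOf(key) === -1) {
        throw new Error("unknown coverage key: " + key);
      }
      seen[key] = true;
    },
    assertCoverage(): void {
      const missed = requiredKeys.filter((k) => seen[k] !== true);
      if (missed.length > 0) {
        throw new Error("uncovered paywall states: " + missed.join(", "));
      }
    },
  };
}

// Turn the locator counts collected by the harness into coverage marks
function checkRun(counts: Counts): void {
  const coverage = createCoverage(["guest_paywall", "member_full_body"]);
  if (counts.paywallShown > 0) coverage.markVisited("guest_paywall");
  if (counts.fullBodyShown > 0) coverage.markVisited("member_full_body");
  coverage.assertCoverage(); // throws if either required state was never reached
}
```

In the harness, calling checkRun({ paywallShown, fullBodyShown }) right after the two navigations would make an unreached state throw, which the run then reports as a failure.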

Pitfalls I hit as an indie developer

Reproducing and debugging this surfaced several places that break if you build them naively. Here are three that actually bit me.

First, the edge cache served free HTML to the agent. My cache worker bypasses caching for premium_token holders, but if the first request, made before cookie injection, gets pinned in the cache, a member navigation can still return the free version. The fix was either to bust the cache with a DEPLOY_VERSION-style query parameter during verification or to send the member context's first request with no-store. The collision between edge caching and paid state runs deep, and the composite-check thinking from "Designing a source of truth for ad-free and paid state" applies directly.
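
The query-based cache bust can be as small as the helper below. This is a sketch under two assumptions: the parameter name "v" is arbitrary, and the edge cache includes the query string in its cache key, which it must for this trick to work at all. The function name bustCache is hypothetical.

```typescript
// Hypothetical helper: append a per-deploy version so verification requests
// do not reuse an HTML response cached before the cookie was injected.
function bustCache(url: string, deployVersion: string): string {
  const u = new URL(url);
  // "v" is an assumed parameter name; any key the cache treats as significant works
  u.searchParams.set("v", deployVersion);
  return u.toString();
}
```

In the harness, the navigation target would then be bustCache(PREVIEW + PREMIUM_ARTICLE, version), with the version taken from whatever identifies the current deploy.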

Second, cookie domain mismatch. If the preview subdomain and the cookie's domain attribute differ by even one character, the browser silently discards the cookie. There is no error, just the free view, so it burned 30 minutes before I traced it. Pulling the host name mechanically from the URL is the reliable move.
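
Deriving the cookie from the URL can live in one small pure function, which also makes it easy to unit test. The sketch below is illustrative: memberCookieFor and the SeedCookie type are hypothetical names, and setting secure from the URL scheme is my own assumption so that an http:// local preview is handled explicitly.

```typescript
// Hypothetical helper: build the member cookie from the preview URL so the
// domain can never drift from the host actually being visited.
type SeedCookie = {
  name: string;
  value: string;
  domain: string;
  path: string;
  httpOnly: boolean;
  secure: boolean;
  sameSite: "Strict" | "Lax" | "None";
};

function memberCookieFor(previewUrl: string, token: string): SeedCookie {
  if (token === "") {
    // An empty value would look exactly like the silent-drop failure mode
    throw new Error("member token is empty");
  }
  const u = new URL(previewUrl);
  return {
    name: "premium_token",
    value: token,
    domain: u.hostname, // taken mechanically from the URL, never typed by hand
    path: "/",
    httpOnly: true,
    secure: u.protocol === "https:", // assumption: mark Secure only on https previews
    sameSite: "Lax",
  };
}
```

The returned object has the same fields as the cookie passed to addCookies in the harness. Reading the cookie back with the context's cookies() method after seeding is a cheap way to turn a silent drop into a visible failure.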

Third, bfcache. Pressing back restored the pre-paywall state and erased the member navigation. You need a pageshow handler to detect restore and refetch. Containing these side effects pairs well with pointing self-debug at a throwaway preview environment.
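
A pageshow handler for this can be very small. The sketch below assumes only the standard behavior that a pageshow event carries persisted === true when the page was restored from bfcache. The function name installBfcacheGuard and the narrow WindowLike type are hypothetical, chosen so the logic can be exercised without a real browser.

```typescript
// Hypothetical bfcache guard: refetch when the page is restored from the
// back/forward cache instead of being loaded fresh.
type PageShowLike = { persisted: boolean };
type WindowLike = {
  addEventListener(type: "pageshow", listener: (e: PageShowLike) => void): void;
};

function installBfcacheGuard(win: WindowLike, refetch: () => void): void {
  win.addEventListener("pageshow", (e) => {
    // persisted is true only when the page came back from bfcache
    if (e.persisted) refetch();
  });
}
```

In the page itself, this would be installed with the real window object and a refetch callback such as a full reload or a re-request of the paid-state check.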

How much to delegate, and where a human looks

I delegate the state walk and marker checks to the agent, but I always review the two paywall screenshots, before and after unlock, with my own eyes. An assertion only guarantees that a state was reached, not that the state looks right. A half-broken render, where the marker is present but part of the body is missing, can only be caught by looking.

Push the automation rate up, but keep human approval on the two revenue-critical screens. Since drawing that line, no billing regression has shipped to production. It is not a flashy mechanism, but removing the setup that lets you relax after seeing only the free side is the highest-leverage investment for a small shop.

Start by opening a free context and a member context in the same run on your local preview, so both screenshots end up side by side. If only one of them ever shows up, that is exactly where your self-debug is blind. Thank you for reading.
