When Your Antigravity Agent Opens a PR That Just Says "Update files" — and a Gate That Forces a Reviewable Summary

Pull requests opened automatically by an Antigravity agent tend to carry empty descriptions like "Update files." Here is a validation gate, with working code, that estimates risk from the diff and rejects vacuous descriptions so a human can actually review them.

Running agents unattended across several repositories means I spend my mornings reviewing a stack of pull requests. Between the apps I maintain as an indie developer (monetized with AdMob) and the blogs I keep running, I have a handful of repos turning over in parallel, so this "morning PR review" has become a daily ritual.

The frustrating part was that the description field was almost useless. "Update files." "Fix issues." "Apply changes." Whatever the agent auto-filled, it converged on one of those three.

When the description is empty, the only option left is to read the whole diff from scratch every time. A couple of PRs is fine, but as the count grows you can't keep up — and one morning you mutter "this is probably fine" and merge without really looking. That is almost always the moment something breaks.

The problem was not the diff itself. It was that nothing had been summarized into a form a human could judge safely and quickly. So I started making the agent write a structured summary, and added a gate that mechanically rejects empty ones. Let me walk through the design and the working code.

Why agent PR descriptions tend to be empty

An agent is optimized to mark its assigned task as "done." Once the code is written and the tests pass, the objective is met. The PR description is, to the agent, a byproduct.

And unlike code, a description has no pass/fail check. "Update files" doesn't throw a syntax error, so it slips through and bothers no one.

There is a second reason. An agent can output what it did, but it does not spontaneously write what a human should verify. Those are entirely different pieces of information — the former is a record of the change, the latter is a blueprint for review. Review needs the latter, yet left alone, even the former gets trimmed away.

In short, nothing in the agent's loop rewards a good description, so its quality sinks to the minimum unless you demand it explicitly, through both the prompt and a gate.

The five elements a reviewable description needs

After re-reading many PRs, the descriptions that let a human judge safely and quickly shared a common shape — these five elements.

Element	Question it answers	Example (vacuous → reviewable)
What	Which files changed, and what was done	"Update files" → "Extracted ad-frequency control into `AdFrequencyController`"
Why	Which problem this change solves	(none) → "Fixes a bug where the interstitial showed twice under a specific condition"
Risk	Where the blast radius is largest if it breaks	(none) → "Did not touch the billing path, but changed the default ad frequency, which can affect revenue metrics"
Test	How correctness was confirmed	"Tested" → "Added a reproduction test for the double display and checked three boundary cases of the frequency cap"
Review focus	Where you want the human to look hardest	(none) → "The boundary condition in `shouldShowAd()` and whether changing the default is appropriate"

Of these, Risk and Review focus matter most. What and Why can be reconstructed from the diff if you spend the time, but Risk and Review focus are knowledge only the author of the change holds, so making the agent write them down is what cuts review load the most.

Put the other way: a PR with those two blank is one that has dumped the judgment entirely on the reader. That is exactly where the gate should focus its defense.

Prompting the agent to write a structured summary

First, make the agent fill fixed headings rather than free text. I append this template as a requirement at the end of the task I hand the Antigravity agent.

Write the PR description in Markdown and fill in ALL of the headings below.
Each heading needs at least one sentence; if a heading does not apply,
write "Not applicable" together with the reason.
 
## What
The files changed and the gist of the change, as bullet points.
 
## Why
The problem this change solves. Include the related issue number if any.
 
## Risk
Where the impact is largest if this breaks. If the change touches any of
billing, authentication, data migration, or configuration values, you MUST
state that and describe the blast radius.
 
## Test
How correctness was confirmed. The tests added/changed and the boundary
conditions you checked.
 
## Review focus
One to three places you specifically want the reviewer to inspect.

The crux of this template is the concrete trigger baked into Risk: "if you touched billing, auth, migration, or config values, you must write it." Ask abstractly to "describe the risk" and the agent tends to answer "none." Name what should trigger a write, and the output stabilizes.

But the prompt alone won't hold. Agents sometimes ignore instructions, and in unattended runs there is no one there to correct them in the moment. So I always place a gate behind it that verifies the output mechanically.

A gate that rejects vacuous descriptions

The script below reads the PR description from stdin and the list of changed files from a text file (produced with git diff --name-only and passed via --changed), and exits with code 1 if the requirements aren't met. It is dependency-free Node.js, meant to sit in front of an Antigravity CLI hook or in CI.

It checks three things automatically: that every required heading is filled, that Why isn't just a restatement of What, and that Risk isn't effectively blank when a high-risk path was touched.

#!/usr/bin/env node
// validate-pr-description.mjs
// Usage:
//   git diff --name-only origin/main...HEAD > /tmp/changed.txt
//   cat pr-body.md | node validate-pr-description.mjs --changed /tmp/changed.txt
 
import { readFileSync } from "node:fs";
 
// --- 1. Read inputs ------------------------------------------------
function readStdin() {
  try {
    return readFileSync(0, "utf8");
  } catch {
    return "";
  }
}
 
function getChangedFiles() {
  const idx = process.argv.indexOf("--changed");
  if (idx === -1) return [];
  const path = process.argv[idx + 1];
  return readFileSync(path, "utf8")
    .split("\n")
    .map((l) => l.trim())
    .filter(Boolean);
}
 
// --- 2. Required headings ------------------------------------------
const REQUIRED_SECTIONS = ["What", "Why", "Risk", "Test", "Review focus"];
 
// Phrases treated as "effectively empty." If this is all that is written, fail.
const VACUOUS = [
  /^update files?\.?$/i,
  /^fix(es|ed)? issues?\.?$/i,
  /^apply changes\.?$/i,
  /^tested\.?$/i, // the vacuous Test answer called out in the five-elements table
  /^n\/?a\.?$/i,
  /^none\.?$/i,
  /^not applicable\.?$/i, // bare "not applicable" with no reason is not enough
];
 
// --- 3. High-risk path patterns ------------------------------------
const HIGH_RISK = [
  { label: "billing", re: /(billing|payment|checkout|stripe|pricing|subscription)/i },
  { label: "auth", re: /(auth|login|session|token|oauth|credential)/i },
  { label: "migration", re: /(migration|migrate|schema|\.sql$)/i },
  { label: "config", re: /(config|\.env|settings|\.ya?ml$|\.toml$)/i },
];
 
// --- 4. Split the body by heading ----------------------------------
function parseSections(body) {
  const map = {};
  const re = /^##\s+(.+?)\s*$/gm;
  const heads = [...body.matchAll(re)];
  for (let i = 0; i < heads.length; i++) {
    const name = heads[i][1].trim();
    const start = heads[i].index + heads[i][0].length;
    const end = i + 1 < heads.length ? heads[i + 1].index : body.length;
    map[name.toLowerCase()] = body.slice(start, end).trim();
  }
  return map;
}
 
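function isVacuous(text) {
  if (!text) return true;
  // Vacuous only if every non-empty line, minus any list marker, is a stock phrase.
  const lines = text
    .split("\n")
    .map((l) => l.replace(/^\s*(?:[-*+]|\d+\.)\s+/, "").trim())
    .filter(Boolean);
  return lines.every((l) => VACUOUS.some((re) => re.test(l)));
}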
 
// A naive word-overlap measure to flag "Why is just a paraphrase of What"
function jaccard(a, b) {
  const norm = (s) =>
    new Set(
      s
        .toLowerCase()
        .replace(/[^\p{L}\p{N}\s]/gu, " ")
        .split(/\s+/)
        .filter((w) => w.length > 1)
    );
  const sa = norm(a);
  const sb = norm(b);
  if (sa.size === 0 || sb.size === 0) return 0;
  let inter = 0;
  for (const w of sa) if (sb.has(w)) inter++;
  return inter / (sa.size + sb.size - inter);
}
 
// --- 5. Validation -------------------------------------------------
function validate(body, changed) {
  const errors = [];
  const sections = parseSections(body);
 
  // 5-1. Required headings exist and are non-empty
  for (const name of REQUIRED_SECTIONS) {
    const text = sections[name.toLowerCase()];
    if (text === undefined) {
      errors.push(`Missing heading "## ${name}"`);
    } else if (isVacuous(text)) {
      errors.push(`"## ${name}" is effectively empty: "${text.split("\n")[0]}"`);
    }
  }
 
  // 5-2. Why must not be a paraphrase of What
  const what = sections["what"] ?? "";
  const why = sections["why"] ?? "";
  if (what && why && jaccard(what, why) > 0.8) {
    errors.push("Why is almost identical to What. Describe the motivation (why).");
  }
 
  // 5-3. If a high-risk path was touched, Risk must be non-empty.
  // We do not require the label name itself (to avoid over-detection);
  // we only guarantee Risk is non-empty.
  const hitLabels = new Set();
  for (const f of changed) {
    for (const { label, re } of HIGH_RISK) {
      if (re.test(f)) hitLabels.add(label);
    }
  }
  if (hitLabels.size > 0 && isVacuous(sections["risk"] ?? "")) {
    const labels = [...hitLabels].join(", ");
    errors.push(`Changed files related to ${labels}, but Risk is empty`);
  }
 
  return errors;
}
 
// --- 6. Run --------------------------------------------------------
const body = readStdin();
const changed = getChangedFiles();
 
if (!body.trim()) {
  console.error("PR description is empty. Fill in the template.");
  process.exit(1);
}
 
const errors = validate(body, changed);
if (errors.length > 0) {
  console.error("PR description does not meet the review requirements:\n");
  for (const e of errors) console.error("  - " + e);
  console.error("\nFill in every heading, and make Risk and Review focus concrete.");
  process.exit(1);
}
 
console.log("PR description meets the review requirements.");

One note on why it is shaped this way. The jaccard check for "Why is a paraphrase of What" is deliberately kept to naive word overlap. Try to judge meaning strictly and false positives multiply, sending the agent into a rewrite loop. In my own use, the 0.8 threshold catches the obvious copy-paste while letting legitimate descriptions through.
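To make the threshold concrete, here is a small sketch that runs the same word-overlap measure on two hypothetical Why lines: one that merely repeats What, and one that states a motivation. The sample sentences are borrowed from the five-elements table; the variable names and the pairing are illustrative assumptions, not output from a real PR.

```javascript
// Same naive word-overlap measure as in validate-pr-description.mjs,
// copied here so the sketch runs on its own.
function jaccard(a, b) {
  const norm = (s) =>
    new Set(
      s
        .toLowerCase()
        .replace(/[^\p{L}\p{N}\s]/gu, " ")
        .split(/\s+/)
        .filter((w) => w.length > 1)
    );
  const sa = norm(a);
  const sb = norm(b);
  if (sa.size === 0 || sb.size === 0) return 0;
  let inter = 0;
  for (const w of sa) if (sb.has(w)) inter++;
  return inter / (sa.size + sb.size - inter);
}

const THRESHOLD = 0.8; // the value used in the gate

const what = "Extracted ad-frequency control into AdFrequencyController";
// Hypothetical Why that only repeats What with different punctuation.
const whyCopied = "Extracted ad-frequency control into AdFrequencyController.";
// Hypothetical Why that states the motivation instead.
const whyMotivated =
  "Fixes a bug where the interstitial showed twice under a specific condition";

const copiedScore = jaccard(what, whyCopied);
const motivatedScore = jaccard(what, whyMotivated);

console.log("copied:", copiedScore, copiedScore > THRESHOLD ? "rejected" : "accepted");
console.log("motivated:", motivatedScore, motivatedScore > THRESHOLD ? "rejected" : "accepted");
```

These two samples sit at the extremes: identical word sets on one side, no shared words on the other. Real descriptions land somewhere in between, which is why the threshold is worth checking against your own PR history before relying on it.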

Estimating risk from the diff and handing it to the human

Beyond rejecting at the gate, it helped to estimate "how much attention this deserves" from the diff and append it to the PR, to support the human reviewer's judgment. This small script produces a rough risk score from changed line counts and contact with high-risk paths.

// risk-score.mjs — a rough risk estimate from git diff numbers and paths
import { execFileSync } from "node:child_process";
 
const base = process.argv[2] ?? "origin/main";
// Pass arguments as an array so the base ref is never interpreted by a shell.
const stat = execFileSync("git", ["diff", "--numstat", `${base}...HEAD`], { encoding: "utf8" })
  .split("\n")
  .filter(Boolean)
  .map((l) => {
    const [add, del, file] = l.split("\t");
    return { add: Number(add) || 0, del: Number(del) || 0, file };
  });
 
const PATTERNS = [
  { label: "billing", re: /(billing|payment|checkout|stripe|pricing)/i, weight: 5 },
  { label: "auth", re: /(auth|session|token|credential)/i, weight: 4 },
  { label: "migration", re: /(migration|schema|\.sql$)/i, weight: 5 },
  { label: "config", re: /(config|\.env|\.ya?ml$|\.toml$)/i, weight: 3 },
];
 
let score = 0;
const reasons = [];
let totalDel = 0;
 
for (const { add, del, file } of stat) {
  totalDel += del;
  for (const { label, re, weight } of PATTERNS) {
    if (re.test(file)) {
      score += weight;
      reasons.push(`${label}: ${file} (+${add}/-${del})`);
    }
  }
}
 
// Large deletions add to the score on their own (deletions are more likely to break things than additions)
if (totalDel > 200) {
  score += 3;
  reasons.push(`large deletion: -${totalDel} lines total`);
}
 
const level = score >= 8 ? "HIGH" : score >= 4 ? "MEDIUM" : "LOW";
console.log(`Risk estimate: ${level} (score=${score})`);
for (const r of reasons) console.log("  - " + r);

This score is only an estimate; I never use it for pass/fail. What I care about is that the machine does not take over the judgment. The score stays as a "look here especially" prompt, and whether to merge is always decided by a human. The more I automate the agent, the more that sense of keeping the final line in human hands is what brings peace of mind.
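If you want to try different weights or thresholds without a git checkout at hand, one option is to pull the scoring into a pure function. The sketch below does that under the assumption that rows have the same `{ add, del, file }` shape risk-score.mjs builds; the function name `scoreRows` and the sample paths are hypothetical.

```javascript
// Patterns, weights, and thresholds copied from risk-score.mjs.
const PATTERNS = [
  { label: "billing", re: /(billing|payment|checkout|stripe|pricing)/i, weight: 5 },
  { label: "auth", re: /(auth|session|token|credential)/i, weight: 4 },
  { label: "migration", re: /(migration|schema|\.sql$)/i, weight: 5 },
  { label: "config", re: /(config|\.env|\.ya?ml$|\.toml$)/i, weight: 3 },
];

// rows: [{ add, del, file }], as parsed from `git diff --numstat`.
function scoreRows(rows) {
  let score = 0;
  let totalDel = 0;
  const reasons = [];
  for (const { add, del, file } of rows) {
    totalDel += del;
    for (const { label, re, weight } of PATTERNS) {
      // Each pattern counts at most once per file.
      if (re.test(file)) {
        score += weight;
        reasons.push(`${label}: ${file} (+${add}/-${del})`);
      }
    }
  }
  if (totalDel > 200) {
    score += 3;
    reasons.push(`large deletion: -${totalDel} lines total`);
  }
  const level = score >= 8 ? "HIGH" : score >= 4 ? "MEDIUM" : "LOW";
  return { level, score, reasons };
}

// Hand-written rows for illustration only.
const docsOnly = scoreRows([{ add: 3, del: 250, file: "README.md" }]);
const billingTouch = scoreRows([{ add: 10, del: 2, file: "src/billing/checkout.js" }]);
const schemaAndAuth = scoreRows([
  { add: 40, del: 5, file: "db/migrations/001_add_plan.sql" },
  { add: 12, del: 3, file: "src/auth/session.js" },
]);

console.log(docsOnly.level, billingTouch.level, schemaAndAuth.level);
```

Note how a large deletion alone (3 points) stays LOW with these numbers, while a single billing file already reaches MEDIUM. Whether that balance suits your repositories is a judgment call the sketch does not settle.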

Wiring the gate into the agent loop

Validation runs right before the agent opens a PR. If you run the Antigravity CLI unattended, the natural spot is in front of the PR-creation command.

#!/usr/bin/env bash
# open-pr.sh — open a PR only if it passes validation
set -euo pipefail
 
BASE="origin/main"
BODY_FILE="pr-body.md"   # the description the agent wrote
 
git diff --name-only "${BASE}...HEAD" > /tmp/changed.txt
 
# 1. Description gate (stop here if it fails)
if ! node validate-pr-description.mjs --changed /tmp/changed.txt < "${BODY_FILE}"; then
  echo "Description does not meet review requirements; not opening a PR. Have the agent rewrite it."
  exit 1
fi
 
# 2. Append the risk estimate to the description (a heads-up for humans)
{
  echo ""
  echo "---"
  echo "### Automated risk estimate"
  echo '```'
  node risk-score.mjs "${BASE}"
  echo '```'
} >> "${BODY_FILE}"
 
# 3. Only now open the PR
gh pr create --base main --title "$(git log -1 --pretty=%s)" --body-file "${BODY_FILE}"

When validation fails, I feed the error message straight back to the agent and have it rewrite the description. The important thing is that even though the gate and PR creation form one flow, a failed gate reliably stops it: the `if !` check ends in an explicit `exit 1`, and `set -e` covers the remaining commands. If a PR opens with an empty description anyway, the gate served no purpose.
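The feedback step can be sketched as a bounded loop. Everything in this block is an assumption for illustration: `validate` stands in for the gate's check, `rewrite` stands in for however you ask the agent for a new description (the actual Antigravity CLI invocation is not shown here), and the cap of three attempts is arbitrary. A real agent call would be asynchronous; the sketch stays synchronous so the control flow is easy to follow.

```javascript
// validate(body) -> string[] of error messages (empty array means pass).
// rewrite(body, errors) -> a new body; a hypothetical hook for the agent call.
function gateWithRetries(body, validate, rewrite, maxAttempts = 3) {
  let current = body;
  let errors = validate(current);
  let attempts = 1;
  while (errors.length > 0 && attempts < maxAttempts) {
    // Feed the gate's own messages back so the rewrite targets the actual gaps.
    current = rewrite(current, errors);
    errors = validate(current);
    attempts++;
  }
  return { ok: errors.length === 0, body: current, errors, attempts };
}

// Stand-in validator: only checks that a Risk heading exists.
const validate = (body) =>
  /^## Risk\b/m.test(body) ? [] : ['Missing heading "## Risk"'];

// Stand-in "agent" that adds the missing heading when asked.
const rewrite = (body) =>
  body + "\n\n## Risk\nChanged the default ad frequency; revenue metrics may shift.";

const fixed = gateWithRetries("## What\nUpdate files", validate, rewrite);
// An "agent" that ignores the feedback never passes and hits the cap.
const stuck = gateWithRetries("## What\nUpdate files", validate, (b) => b);

console.log(fixed.ok, fixed.attempts);
console.log(stuck.ok, stuck.attempts);
```

The cap matters in unattended runs: when `ok` is still false after the last attempt, the caller should exit non-zero and leave the PR unopened, just as the shell script does.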

What I learned from running it

At first I strictly required all five elements, but forcing Risk and Test even on trivial changes — a typo fix, an added comment — made the description longer than the change itself, which defeats the point.

Now I leave an escape hatch for PRs that are small and touch no high-risk paths. Concretely, at the top of the gate I added a branch: "if the change is under 10 lines and touches no high-risk path, pass on What alone." What I want to protect against is a heavy change slipping through empty — not turning every PR into a heavyweight.
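A minimal sketch of that branch, under stated assumptions: "under 10 lines" is read as added plus deleted lines across the whole diff, the rows come from `git diff --numstat`, and the function name `requiredSectionsFor` is hypothetical. The high-risk patterns are copied from the gate script.

```javascript
// High-risk path patterns copied from validate-pr-description.mjs.
const HIGH_RISK = [
  /(billing|payment|checkout|stripe|pricing|subscription)/i,
  /(auth|login|session|token|oauth|credential)/i,
  /(migration|migrate|schema|\.sql$)/i,
  /(config|\.env|settings|\.ya?ml$|\.toml$)/i,
];

const ALL_SECTIONS = ["What", "Why", "Risk", "Test", "Review focus"];
const TRIVIAL_MAX_LINES = 10; // assumption: counts added + deleted lines

// rows: [{ add, del, file }], as parsed from `git diff --numstat`.
function requiredSectionsFor(rows) {
  const totalLines = rows.reduce((n, r) => n + r.add + r.del, 0);
  const touchesHighRisk = rows.some((r) => HIGH_RISK.some((re) => re.test(r.file)));
  const trivial = totalLines < TRIVIAL_MAX_LINES && !touchesHighRisk;
  return trivial ? ["What"] : ALL_SECTIONS;
}

// Hand-written rows for illustration only.
const typoFix = requiredSectionsFor([{ add: 1, del: 1, file: "README.md" }]);
const tinyAuthChange = requiredSectionsFor([{ add: 1, del: 1, file: "src/auth/login.js" }]);
const largeUiChange = requiredSectionsFor([{ add: 30, del: 5, file: "src/ui/Button.js" }]);

console.log(typoFix, tinyAuthChange.length, largeUiChange.length);
```

The gate would then loop over the returned list instead of the fixed `REQUIRED_SECTIONS`. Because the patterns are substring matches, a path such as docs/authors.md also counts as auth; erring toward the full template on a false match is the safer direction for an escape hatch.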

The other lesson: since I started requiring a Risk field, the agent's own changes have grown a little more cautious. The step of putting risk into words seems to nudge it toward not touching too much. A gate is not only a thing that rejects; it is also a quiet device that changes the writer's behavior.

If you adopt this, the realistic start is one repository with only the Risk field required. Get a feel for false positives there, then expand to Review focus and Test. Don't demand the perfect shape from day one — grow it while confirming that review has genuinely gotten lighter.
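One way to make that staged rollout explicit is a per-repository stage setting that decides which headings the gate enforces. The sketch below is an assumption throughout: the stage names, the `PR_GATE_STAGE` environment variable, and the choice to add Review focus and Test in a single step are all hypothetical.

```javascript
// Hypothetical rollout stages, from most lenient to the full template.
const STAGES = {
  "risk-only": ["Risk"],
  "risk-focus-test": ["Risk", "Review focus", "Test"],
  full: ["What", "Why", "Risk", "Test", "Review focus"],
};

function requiredSections(stage) {
  // Object.hasOwn avoids accidentally accepting inherited keys like "constructor".
  if (!Object.hasOwn(STAGES, stage)) {
    throw new Error(`Unknown gate stage: ${stage}`);
  }
  return STAGES[stage];
}

// PR_GATE_STAGE is a made-up variable name; default to the most lenient stage.
const stage = process.env.PR_GATE_STAGE ?? "risk-only";
console.log(`Enforcing: ${requiredSections(stage).join(", ")}`);
```

Moving a repository to the next stage is then a one-line configuration change rather than an edit to the gate itself, which makes it easier to back off if false positives spike.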

If you are wrestling with the same hollowing-out of review under unattended agents, I hope this offers a useful thread to pull on.
