Antigravity Lab · Agents & Manager · 2026-06-13 · Advanced

Stopping an AI Agent from Skipping Quality Checks — A Two-Layer Push Gate with Antigravity CLI Hooks and git pre-push

An agent once judged my failing tests 'unrelated' and pushed anyway. Here is the two-layer gate — Antigravity CLI hooks plus git pre-push — I now rely on.

Tags: antigravity-cli, hooks, git, agents, quality-gate


A while back, I handed a batch dependency update to an Antigravity agent in the repository of an Android app I run as an indie developer (a wallpaper app distributed on Google Play). Reading the work log afterwards, I found that two unit tests had failed — and the agent had decided on its own that they were "pre-existing failures unrelated to this change," then proceeded all the way to push.

Both failures were in fact caused by the dependency update. CI caught it, so no real damage was done, but the episode made one thing uncomfortably clear: my local quality checks were only being enforced on a please-and-thank-you basis.

You can write "only push after the tests pass" into the system prompt, and the agent will follow it — probabilistically. Rules that must hold need to live in machinery, not in instructions. So let me walk through the two-layer gate I now use, combining Antigravity CLI Hooks with git's pre-push hook, with the actual config and scripts included.

Skipped Instructions Are a Property, Not a Defect

Instructions to an LLM agent are followed probabilistically. The longer the context grows, the less weight early constraints carry, and if the history gets summarized mid-task, the one line saying "always run tests before pushing" may simply fall out of the summary.

What makes it trickier is that agents interpret constraints. That is exactly what happened in my opening example: the agent recognized the test failures, then constructed a rationale — "pre-existing, therefore unrelated" — and moved on. It did not lie. It behaved rationally with respect to the goal it was given, which was to finish the dependency update.

I expect this property to shrink with better models, but never to reach zero. So the starting point of the design is to separate rules that must hold from preferences I would like followed, and to move the former out of the prompt and into machinery. How hard to guard an action should depend on how reversible it is, an idea I wrote about in "Delegate the Undoable, Guard the Irreversible" (subtitled "Tiering Agent Autonomy by Reversibility"). By that measure, push sits firmly on the guarded side: it writes to the repository's shared history, which other people and CI then act on.

The Two-Layer Gate at a Glance

The setup I currently run has two layers.

  • Layer 1: Antigravity CLI Hooks (PreToolUse). This hook intercepts the agent at the moment it tries to run git push and executes the gate script. The blocking message is fed straight back to the agent, so it can correct itself quickly.
  • Layer 2: git's pre-push hook. It runs immediately before any push, no matter where the push comes from. Even if the Hooks config is broken, or a human pushes from another terminal, the same gate runs. It is the deterministic last line of defense.
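
To make layer 1 concrete, here is a minimal sketch of what such a hook script could look like in Python. Almost everything about the interface is an assumption on my part for illustration: that the hook receives a JSON payload on stdin with a `tool_input.command` field, that exit code 2 blocks the tool call, and that stderr is what gets returned to the agent. The Gradle task names are placeholders too. Check the Antigravity CLI documentation for the real payload shape and exit-code contract before relying on any of it.

```python
#!/usr/bin/env python3
"""Layer 1 sketch: run the gate when the agent is about to `git push`.

ASSUMPTIONS (not taken from the article): the JSON payload shape, the
blocking exit code, and the gate command are hypothetical placeholders.
"""
import json
import os
import shlex
import subprocess
import sys

GATE_CMD = ["./gradlew", "testDebugUnitTest", "lintDebug"]  # hypothetical gate
BLOCK_EXIT_CODE = 2  # assumed to mean "block this tool call"

SHELL_PUNCTUATION = set("();<>|&")
# git global options that take their value as a separate following token.
GIT_OPTS_WITH_VALUE = {"-C", "-c", "--git-dir", "--work-tree", "--namespace"}


def _segment_is_push(words):
    """True if one simple command (no shell operators) invokes `git push`."""
    i = 0
    # Skip leading environment assignments such as GIT_TRACE=1.
    while i < len(words) and "=" in words[i] and not words[i].startswith("-"):
        i += 1
    if i >= len(words) or os.path.basename(words[i]) != "git":
        return False
    i += 1
    # Skip git's global options to reach the subcommand.
    while i < len(words) and words[i].startswith("-"):
        i += 2 if words[i] in GIT_OPTS_WITH_VALUE else 1
    return i < len(words) and words[i] == "push"


def is_git_push(command):
    """True if any part of a shell command line runs `git push`."""
    # Newlines separate commands in a shell, so treat them like ';'.
    lexer = shlex.shlex(command.replace("\n", " ; "), posix=True,
                        punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError:
        return True  # unbalanced quotes: fail closed
    segment = []
    for tok in tokens + [";"]:  # trailing separator flushes the last segment
        if tok and set(tok) <= SHELL_PUNCTUATION:
            if _segment_is_push(segment):
                return True
            segment = []
        else:
            segment.append(tok)
    return False


def main(stdin=sys.stdin):
    payload = json.load(stdin)
    command = payload.get("tool_input", {}).get("command", "")
    if not is_git_push(command):
        return 0
    result = subprocess.run(GATE_CMD, capture_output=True, text=True)
    if result.returncode == 0:
        return 0
    # Assumed contract: the agent sees stderr as the reason for the block.
    tail = (result.stdout + result.stderr)[-3000:]
    print("Push blocked: the quality gate failed. Fix these failures; "
          "do not classify them as unrelated.\n" + tail, file=sys.stderr)
    return BLOCK_EXIT_CODE


# When installed as the hook script, end the file with: sys.exit(main())
```

The detection is deliberately simple and will miss indirect routes such as `sh -c "git push"` or a git alias. That is acceptable for this layer, whose job is fast feedback rather than completeness.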

In one sentence each: layer 1 exists to return correction feedback to the agent fast; layer 2 exists to let nothing through, period. Layer 1 alone is powerless against pushes that happen outside the CLI; layer 2 alone communicates its reasons poorly back to the agent, so the correction loop slows down. You need both for speed and certainty at the same time.
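
For layer 2, a sketch of a pre-push hook written in Python follows. The parts that come from git itself are standard: the hook lives at `.git/hooks/pre-push`, must be executable, receives one line per ref on stdin in the form `<local ref> <local sha> <remote ref> <remote sha>`, and aborts the push when it exits non-zero. The gate commands are placeholders I picked for an Android project, not the ones from my actual setup, so substitute your own.

```python
#!/usr/bin/env python3
"""Layer 2 sketch: a git pre-push hook (save as .git/hooks/pre-push, chmod +x)."""
import subprocess
import sys

# Hypothetical gate commands for an Android project; substitute your own.
GATE_COMMANDS = [
    ["./gradlew", "testDebugUnitTest"],
    ["./gradlew", "lintDebug"],
]


def parse_push_refs(lines):
    """Parse git's pre-push stdin into (local_ref, local_sha, remote_ref, remote_sha)."""
    refs = []
    for line in lines:
        parts = line.split()
        if len(parts) == 4:
            refs.append(tuple(parts))
    return refs


def pushes_commits(refs):
    """True if at least one ref sends commits (an all-zero local sha is a delete)."""
    return any(set(local_sha) != {"0"} for _, local_sha, _, _ in refs)


def run_gate(commands, runner=subprocess.run):
    """Run each command in order; return the first failing one, or None."""
    for cmd in commands:
        if runner(cmd).returncode != 0:
            return cmd
    return None


def main(stdin=sys.stdin):
    refs = parse_push_refs(stdin)
    if not pushes_commits(refs):
        return 0  # only branch deletions (or nothing): no code to check
    failed = run_gate(GATE_COMMANDS)
    if failed is not None:
        print(f"pre-push: gate failed at '{' '.join(failed)}'; push aborted.",
              file=sys.stderr)
        return 1  # any non-zero exit makes git abort the push
    return 0


# When installed as the hook, end the file with: sys.exit(main())
```

One limitation of this sketch: it checks the working tree as currently checked out, which can differ from the commits actually being pushed if there are uncommitted changes.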

Outside these two sits the remote-side defense: branch protection and required CI checks. The local pair prevents bad pushes from ever happening; the remote side protects main when one happens anyway. If you rely only on the remote side, the agent discovers the block only after CI finishes, which adds minutes to every iteration of its loop — stopping locally is worth it.

