Antigravity Lab · Agents & Manager · 2026-06-21 · Advanced

Letting a Background Agent Work Overnight Without Regretting It by Morning — Guardrails for Unattended Runs

When you hand overnight refactoring to Antigravity's Background Agent, the morning brings as much anxiety as convenience. From three angles — blast radius, completion criteria, and detecting silent regressions — here are the guardrails that let me run unattended jobs with confidence.

The first time overnight automation burned me, the agent was being helpful. It found two near-duplicate utility functions and merged them into one. The trouble was that one version assumed callers swallowed errors, while the other threw exceptions. Every test stayed green, the diff read cleanly, and the commit message was polite. Trusting all of that, morning-me merged it — and that afternoon a production job quietly stopped.

The danger of unattended work isn't that it fails loudly. It's that it looks like it succeeded. If a human is in the review loop, that kind of mix-up surfaces in conversation. But a Background Agent running overnight makes confident mistakes at an hour when nobody is watching and nobody can stop it. Below are the design choices that, across roughly thirty unattended nights with Antigravity's Background Agent, actually reduced the damage. This isn't a story about impressive automation — it's a record of the unglamorous mechanisms that keep you from regretting things by morning.

Decide the blast radius before the scope of work

When designing a nightly task, it's tempting to start from "what should it do." I had the order backwards. The first thing to settle is how much can break if it fails — the blast radius.

When you maintain several repositories alone as an indie developer, you start wanting to hand the agent, overnight, the upkeep you never get around to yourself. The principle I landed on is simple: fence the agent in three ways. First, never let it touch main; it always works on a disposable branch. Second, name the files it may rewrite in an allowlist, and forbid writes anywhere else. Third, cap the amount of diff a single run may produce. All three work regardless of how clever the agent is.

I write those fences directly into the task definition.

# .antigravity/tasks/nightly-maintenance.md
 
## Goal
Improve readability and type safety of allowlisted files WITHOUT changing behavior.
 
## Where you may write (writing anywhere else is forbidden)
- src/lib/**/*.ts
- src/utils/**/*.ts
 
## Never do this
- Change any public API signature (params, return type, exception kind)
- Commit directly to main / develop
- Add or update dependencies (package.json is read-only)
- Produce more than 120 lines of diff per file
 
## Definition of done
- Existing tests stay green
- For each changed file, one line stating what changed and why
- If you can't satisfy the above, finish with "no change" and record the reason

What matters is giving "Never do this" and "Definition of done" the same weight as the goal itself. An agent optimizes toward its objective, so if you leave the constraints fuzzy, it will sacrifice the constraints to reach the goal. After getting a public API changed out from under me, I now spend an explicit line forbidding signature changes.
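
The write allowlist in the task file is also simple enough to re-check by script once a run finishes. The following is a minimal sketch under stated assumptions, not part of the setup described here: `path_allowed` and `find_violations` are hypothetical helper names, and the two patterns simply mirror the allowlist entries above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: a mechanical counterpart to the task file's allowlist.

# Return 0 if the path falls inside the allowlist from the task definition.
# In a `case` pattern, `*` also matches `/`, so nested directories are covered.
path_allowed() {
  case "$1" in
    src/lib/*.ts|src/utils/*.ts) return 0 ;;
    *) return 1 ;;
  esac
}

# Read changed paths on stdin (one per line), print every path that is
# outside the allowlist, and return non-zero if at least one was found.
find_violations() {
  local path status=0
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    if ! path_allowed "$path"; then
      printf '%s\n' "$path"
      status=1
    fi
  done
  return "$status"
}
```

Piping `git diff --name-only origin/main...HEAD` into `find_violations` would list any file touched outside the fence; an empty result with exit status 0 means the run stayed inside it.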

Fence it at launch too — don't over-trust the prompt

Constraints in the task definition are still only a request. Since this runs unattended, I also keep mechanically enforceable fences in the launch script. It wakes a session through the Antigravity CLI and cuts a working branch before handing anything over.

#!/usr/bin/env bash
# scripts/nightly-agent.sh — invoked by cron, one task per launch
set -euo pipefail
 
TASK_FILE="$1"                       # e.g. .antigravity/tasks/nightly-maintenance.md
DATE="$(date +%Y%m%d)"
BRANCH="agent/nightly-${DATE}-$(basename "$TASK_FILE" .md)"
 
# Disposable working branch. main stays a clean starting point.
git fetch origin main --quiet
git switch -c "$BRANCH" origin/main
 
# Start the session; receive the result as JSON when done.
SESSION_ID="$(antigravity sessions create \
  --task "$TASK_FILE" \
  --sandbox isolated \
  --timeout 45m \
  --max-output-tokens 64000 \
  --format json | jq -r '.session_id')"
 
echo "started ${SESSION_ID} on ${BRANCH}"
 
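antigravity sessions wait "$SESSION_ID" --timeout 50m || {
  echo "session did not finish cleanly: ${SESSION_ID}" >&2
  # Throw away half-done work, branch and all. Reset first so uncommitted
  # edits don't follow us off the branch (assumes a checkout dedicated to
  # this job), then detach so no local main branch is required.
  git reset --hard --quiet
  git switch --detach --quiet origin/main
  git branch -D "$BRANCH"
  exit 0
}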

Two things matter here. Always set a timeout. And discard, without inspection, the output of any session that didn't finish cleanly. The time it takes to pick through half-processed code in the morning costs more than just re-running the job. Early on I'd salvage partial work because throwing it away felt wasteful; it was almost never usable, and all it left me was decision fatigue.
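
The per-file diff cap from the task definition can also be verified by a script once a session does finish cleanly. This is a minimal sketch under assumptions of my own: "lines of diff" is taken to mean added plus deleted lines as reported by `git diff --numstat`, binary files (which numstat reports as `-`) are treated as over the cap, and `files_over_cap` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `git diff --numstat` output on stdin
# (added<TAB>deleted<TAB>path) and print every file whose added + deleted
# line count exceeds the cap. Returns non-zero if any file is over.
files_over_cap() {
  local cap="${1:-120}"   # default mirrors the 120-line limit in the task file
  awk -F '\t' -v cap="$cap" '
    $1 == "-"       { print $3; found = 1; next }  # binary file: assume over the cap
    ($1 + $2) > cap { print $3; found = 1 }
    END             { exit (found ? 1 : 0) }
  '
}
```

Running `git diff --numstat origin/main...HEAD | files_over_cap 120` after the session would print the offending files, so the branch can be discarded instead of reviewed.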
