Running Antigravity CLI Unattended: Notify Only on Real Failures
A small wrapper for scheduled Antigravity CLI runs that stays silent on success and alerts you only on failures a human needs to act on, covering exit codes, transient-error triage, and duplicate suppression.
A few nights ago, my own automation kept me from sleeping. A scheduled job stalled for just a moment, and my phone buzzed again and again. When I opened the logs the next morning, exactly one failure had needed my attention. Everything else was a transient hiccup that had cleared itself within minutes.
When Gemini CLI was retired on June 18 and I moved my unattended scripts over to the Go-rewritten Antigravity CLI, all of those scripts had to be ported at once. The new CLI is fast and starts lightly. But the question of what to notify on is still mine to design, regardless of which CLI sits underneath. Today I want to share the small mechanism I use to stay silent on success and quietly surface only the failures that actually need a person.
Over-alerting is as dangerous as under-alerting
When I first started building unattended jobs, I set them to "notify on everything." Success and failure alike landed in my pocket. It felt reassuring; in practice it was the opposite.
Once your eyes adjust to a constant stream of success pings, the single failure buried among them slips past. During a stretch when I was running automated updates for four blogs alongside AdMob revenue checks for several apps, dozens of notifications piled up each day, and the one anomaly that mattered scrolled off into the distance.
The value of a notification lives not in volume but in trust — the trust that when it arrives, you will stop what you're doing. So the design starts with subtraction, not addition. Decide first what you will never send, then ring only what's left.
Anchor silence-on-success in the exit code
The first foundation is the Antigravity CLI's exit code. Launched from a shell, the CLI returns 0 on success and non-zero on failure. That value is the primary gate that decides whether to notify at all.
Collect stdout and stderr into a single log file. Whether you're reading the failure later or assembling the body of an alert, that one log is what you'll lean on.
#!/usr/bin/env bashset -uo pipefailLOG_DIR="${HOME}/.agy-runs"STATE_DIR="${HOME}/.agy-state"mkdir -p "$LOG_DIR" "$STATE_DIR"# Take the first argument as the task name; pass the rest through to the CLITASK="${1:?task name required}"shiftLOG="${LOG_DIR}/${TASK}-$(date +%Y%m%d-%H%M%S).log"# Run the Antigravity CLI, funneling all output into the logagy run "$@" >"$LOG" 2>&1CODE=$?echo "task=${TASK} exit=${CODE} log=${LOG}"
The deliberate choice here is to not use set -e. If the whole script halted the moment the CLI failed, you'd never reach the part that classifies the failure and decides whether to notify. A failure isn't something to halt on — it's something to observe and handle. Keep only set -uo pipefail and capture the exit code yourself.
✦
Thank you for reading this far.
Continue Reading
What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.
WHAT YOU'LL LEARN
✦An unattended-run wrapper built on exit codes and logs that stays silent on success
✦Classification logic that separates transient rate limits and auth expiry from failures worth a human's time
✦Duplicate suppression that won't re-alert the same failure for 6 hours, preventing alert fatigue
Secure payment via Stripe · Cancel anytime
✦
Unlock This Article
Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.
Separate "transient hiccups" from "failures worth a human"
A non-zero exit code doesn't always mean a person is needed. Rate limits and brief network drops are the kind of wobble that passes on a retry a few minutes later. An expired auth token or a syntax error, by contrast, won't fix itself no matter how many times the job runs.
So I sort failures by character into three kinds: transient, auth, and the ones worth reading.
classify() { local code="$1" log="$2" # Exit code 0 is success. Not a notification candidate [ "$code" -eq 0 ] && { echo "ok"; return; } # Transient: rate limits, timeouts, brief disconnects if grep -qiE 'rate limit|429|timeout|temporarily|ETIMEDOUT|ECONNRESET|50[23]' "$log"; then echo "transient"; return fi # Auth: can't recover unattended; needs a human to re-login if grep -qiE '401|unauthorized|token .*(expired|invalid)|re-?auth' "$log"; then echo "auth"; return fi # Everything else is a failure you should read in the morning echo "actionable"}
This sorting doesn't need to be perfect. Rough is fine at the start. As you run it, every time you find a real failure you mistakenly filed as transient — or the reverse — you grow the grep patterns one at a time. In my case, over about half a year of running this, the patterns grew to roughly a dozen.
Laying the handling out per class makes the intent easier to see.
Failure kind
Typical cause
Handling when unattended
ok
Normal exit (code 0)
Silence. Reset the transient counter
transient
Rate limit, timeout, brief disconnect
Skip by default. Notify only after 3 in a row
auth
Token expired, re-login required
Notify immediately, suppressed for 6 hours
actionable
Syntax error, unexpected exception
Notify immediately; read it in the morning
Ring on transient failures only when they persist
If you notify on the very first transient failure, you slide right back to over-ringing. Ignore the wobble, but do learn when the wobble is continuing. Express that line with a consecutive-count counter.
# Decide whether transient failures are persisting. True only at 3 in a rowtransient_persisted() { local task="$1" local f="${STATE_DIR}/${task}.transient" local n n="$(cat "$f" 2>/dev/null || echo 0)" n=$((n + 1)) echo "$n" > "$f" [ "$n" -ge 3 ]}# Reset the transient counter on successreset_transient() { rm -f "${STATE_DIR}/${1}.transient"}
A single success in between wipes the counter. So a one-off stall is quietly let go, and only when the job stalls three times in a row does the signal — "maybe this isn't just a wobble" — reach your hand.
Don't ring the same failure twice or three times
Failures that reproduce until a human fixes them — auth expiry, a syntax error — will generate a notification on every run if left alone. Ten identical alerts in one night is noise. So take a "fingerprint" of the failure and suppress the same fingerprint for a set window.
# Fingerprint: kind + tail of the log, hashed shortshould_notify() { local task="$1" kind="$2" log="$3" local fp f now last fp="$(printf '%s|%s' "$kind" "$(tail -n 20 "$log")" | shasum | cut -c1-12)" f="${STATE_DIR}/${task}.${fp}" now="$(date +%s)" last="$(cat "$f" 2>/dev/null || echo 0)" # Don't re-notify the same fingerprint within 6 hours (21600 seconds) if [ "$((now - last))" -lt 21600 ]; then return 1 fi echo "$now" > "$f" return 0}
Including the log tail in the fingerprint means that even if the error type matches, a different target file is treated as distinct. The six-hour window is simply tuned to how often I run jobs. Shorten it for hourly jobs, lengthen it for daily ones — set it to your own schedule.
Write the alert body for your half-asleep self
The person an alert reaches is usually a slightly earlier version of you. So that the situation can be taken in at a glance even with the context forgotten, pack the body with only the hostname, task name, failure kind, and the log tail — kept short.
notify() { local task="$1" kind="$2" log="$3" local tail_text tail_text="$(tail -n 8 "$log" | sed 's/"/\\"/g')" # Pass WEBHOOK_URL via env (e.g. a Slack Incoming Webhook) curl -fsS -X POST "$WEBHOOK_URL" \ -H 'Content-Type: application/json' \ -d "$(printf '{"text":"[%s] %s failed (%s)\n%s"}' \ "$(hostname)" "$task" "$kind" "$tail_text")" \ >/dev/null || echo "notify itself failed" >&2}
Finally, thread the parts into one flow.
run_and_watch() { local task="$1"; shift local log code kind log="${LOG_DIR}/${task}-$(date +%Y%m%d-%H%M%S).log" agy run "$@" >"$log" 2>&1 code=$? kind="$(classify "$code" "$log")" if [ "$kind" = "ok" ]; then reset_transient "$task" exit 0 fi # Transient: only proceed once it has happened 3 times in a row if [ "$kind" = "transient" ] && ! transient_persisted "$task"; then exit 0 fi if should_notify "$task" "$kind" "$log"; then notify "$task" "$kind" "$log" fi exit "$code"}run_and_watch "$@"
After that, call it from cron or macOS launchd as run_and_watch.sh blog-update agy run --task publish. On a successful night, nothing arrives. That quiet is the proof the mechanism is working.
Verify the alert path without waiting for a real failure
The scariest thing about a notification setup is that the alert fails to fire exactly when you need it. But you can't confirm anything while waiting for a real failure to happen. So run a small command that fails on purpose in place of the CLI, and verify just the path first.
# Inject a command that always exits non-zero in place of agy run to test the pathWEBHOOK_URL="$WEBHOOK_URL" \ run_and_watch "smoke-test" bash -c 'echo "401 unauthorized: token expired" >&2; exit 1'
With 401 in the log, the classifier lands on auth, and exactly one notification fires after passing duplicate suppression. If it reaches your phone, you've confirmed that a real failure will travel the same path reliably. As an indie developer, I've made this smoke test a rule before putting any new task on the schedule. Running it through once changes how safe unattended operation feels.
Start small, and grow the number of silent nights
You don't have to aim for perfect classification from day one. What I'd suggest is starting with just the exit code and duplicate suppression, and running it. Add the transient logic later — grow a pattern only after you see an alert that fired by mistake. Through that repetition, the mechanism settles into your own way of working.
I've come to measure an unattended job not by the hours it ran, but by the number of nights it stayed silent. Tonight, pick one job from your schedule and wrap it in this quiet mechanism. Tomorrow morning should be a little calmer.
Share
Thank You for Reading
Antigravity Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.