One Saturday morning, while skimming the overnight job logs as usual, I noticed the output was formatted just slightly differently from the day before. The prompt hadn't changed. The input data hadn't changed. Exactly one thing had: agy had auto-updated from v2.1.3 to v2.1.4 during the night.
The rewritten, Go-based Antigravity CLI is now rolling out to all users, and v2.1.x releases keep arriving at short intervals. When you use it interactively, it is welcome that it quietly stays current. But when a CLI wired into your automation silently becomes a different version, that is a small trap for reproducibility.
As an indie developer, I run the four Dolice Labs sites automatically during off-peak hours overnight. Each job runs while no one is watching, so when something that passed yesterday looks subtly different this morning, I need to be able to tell whether the cause is the prompt, the input, or the tool itself. Otherwise the investigation alone can eat the whole morning.
When an automated run is "only slightly off," suspect the tool first
If I write reproducibility as one equation, an automated pipeline's output is determined by "binary × prompt × input." Most people watch the prompt and the input carefully. They keep them in Git and read the diffs. Yet the third term, the binary itself, tends to slip out of the operating model, on the assumption that it simply updates itself whenever a release lands.
This is especially easy to miss when the CLI is embedded in automation. Interactively you'd catch a new version from the startup banner, but a headless run discards the banner and keeps only the result in the log. That is precisely why you need to record, actively and from the operations side, "which binary is running right now."
A single Go binary makes pinning remarkably straightforward
This is where it helps that the Antigravity CLI ships as a single Go binary. Trying to pin a Node-based tool drags in the state of node_modules, the global install, and even the Node version itself, so the problem keeps growing. A single binary is different: "keep that one file" is nearly the whole job.
The idea is simple. Pick one version for automation, place that binary at a known path, and have your jobs always call that path. Keep the latest release separately as your interactive copy, and never mix it with automation. That alone severs the "upgrades on its own overnight" route.
The minimal steps to pin a version
First, decide on one binary location dedicated to automation. Create a pinning directory under your home directory and stash the chosen version there.
# Prepare the fixed path that automation will reference
PIN_DIR="$HOME/.agy-pinned"
mkdir -p "$PIN_DIR"
# Stash the agy you currently have, labeled with its version
# (head -n 1 keeps only the first match if the output contains several)
VER="$(agy --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
cp "$(command -v agy)" "$PIN_DIR/agy-$VER"
ln -sf "$PIN_DIR/agy-$VER" "$PIN_DIR/agy"
echo "pinned: agy-$VER -> $PIN_DIR/agy"
The key is that $PIN_DIR/agy is a symlink whose target carries the version name. When you later want to bump the version, a relink is all it takes, and the old binary stays behind as agy-2.1.3, so you can roll back immediately.
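The relink and rollback described here can be wrapped in a small helper. This is only a sketch under the directory layout used above: `switch_pin` is a hypothetical name, not something agy provides, and it assumes every version you might switch to has already been stashed as `agy-<version>`.

```shell
# Sketch: repoint the automation symlink at a stashed version, or roll back.
# PIN_DIR and the agy-<version> naming follow the layout used in this article;
# switch_pin itself is a hypothetical helper, not an agy feature.
PIN_DIR="${PIN_DIR:-$HOME/.agy-pinned}"

switch_pin() (
  ver="$1"
  target="$PIN_DIR/agy-$ver"
  # Refuse to relink to a version that was never stashed.
  if [ ! -x "$target" ]; then
    echo "switch_pin: $target is missing or not executable" >&2
    exit 1
  fi
  # Build the new link under a temporary name, then rename it over the old
  # one, so a job that starts mid-switch never finds the path missing.
  ln -s "$target" "$PIN_DIR/agy.tmp.$$" &&
    mv -f "$PIN_DIR/agy.tmp.$$" "$PIN_DIR/agy" &&
    echo "pinned: agy-$ver -> $PIN_DIR/agy"
)
```

With this in place, `switch_pin 2.1.4` moves automation forward and `switch_pin 2.1.3` rolls it back. A version that was never stashed is rejected and the existing link is left untouched.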
On the scheduled-job side, don't leave it to PATH. Call this fixed path explicitly.
# ❌ Not ideal: calling agy on PATH (you won't notice if it bumped overnight)
agy run ./pipeline.task
# ✅ Preferred: call the pinned binary explicitly
AGY="$HOME/.agy-pinned/agy"
"$AGY" run ./pipeline.task
These two look almost identical, but they mean very different things. The first depends on "whatever agy happens to be installed on that machine." The second is pinned to "the one binary you chose and stashed." For automation, the second is what you want to protect.
Record the binary's identity in every run's log
Pinning alone is only half of it. So that you can later confirm "was this output really produced by that version," stamp the binary's identity into the log at the start of each run. I recommend keeping not just the version string but the file's hash alongside it. Even with the same version name, if the contents were swapped, the hash changes and you notice.
AGY="$HOME/.agy-pinned/agy"
# At the top of the run, record the identity of the binary you use
{
echo "run_started: $(TZ=Asia/Tokyo date '+%Y-%m-%d %H:%M:%S %Z')"
echo "agy_version: $("$AGY" --version | tr -d '\n')"
echo "agy_sha256: $(sha256sum "$AGY" | awk '{print $1}')"
} >> run.log
"$AGY" run ./pipeline.task >> run.log 2>&1
Once this becomes a habit, in the "slightly different from yesterday" situation from the opening, you just line up two days of agy_version and agy_sha256 and can tell at a glance whether the tool moved. The time it takes to isolate a cause is what shrank the most for me.
Here is how operations change with and without pinning.
| Aspect | No pinning (left to PATH) | Pinned (stashed + recorded) |
|---|---|---|
| Overnight auto-update | Behavior can change unnoticed | Updates only via a manual relink |
| Cause of changed output | Hard to separate tool from input | Tell instantly from logged version and hash |
| Effort to roll back | Hard, because no past binary is kept | Just relink to the old version |
When you do upgrade, push just one job through first
Pinning does not mean "never update." You want to take in the CLI's improvements quickly. What I keep in mind is the order. Rather than moving every job to the new version at once, I point only the lowest-impact job at the new binary first, run it for a few days to confirm the output doesn't break, and then widen to the rest.
# Stash the new version in the pinned directory (don't relink yet)
agy --version # e.g. a local agy already updated to 2.1.5
cp "$(command -v agy)" "$HOME/.agy-pinned/agy-2.1.5"
# Try just one job against the new binary, named explicitly
NEW="$HOME/.agy-pinned/agy-2.1.5"
"$NEW" run ./low-risk.task >> trial.log 2>&1
# After a few quiet days, relink the main line and roll it out
ln -sf "$HOME/.agy-pinned/agy-2.1.5" "$HOME/.agy-pinned/agy"
This "push just one through first" approach is the same sequencing I leaned on when moving from Gemini CLI to the Antigravity CLI. If you want to compare outputs before switching, the matching approach I built in "Does the New CLI Do the Same Job? An Output-Parity Gate Before Switching to Antigravity CLI" applies just as well to comparing old and new versions. If responsiveness is on your mind, pair it with "Measuring the Go-Based Antigravity CLI's Responsiveness to Rethink My Nightly Batch."
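If several jobs share one wrapper script, the one-job-first rollout can be expressed as a per-job override instead of editing the trial job by hand. This is a sketch only: the `trial-<job>` link name and the `agy_for_job` helper are hypothetical conventions layered on the `~/.agy-pinned` layout from this article.

```shell
# Sketch: let one job opt into a trial binary while the rest stay pinned.
# The trial-<job> link name and agy_for_job are hypothetical conventions.
PIN_DIR="${PIN_DIR:-$HOME/.agy-pinned}"

agy_for_job() {
  # Prefer a per-job trial link if one exists; otherwise use the main pin.
  if [ -x "$PIN_DIR/trial-$1" ]; then
    printf '%s\n' "$PIN_DIR/trial-$1"
  else
    printf '%s\n' "$PIN_DIR/agy"
  fi
}

# Usage:
#   ln -s "$PIN_DIR/agy-2.1.5" "$PIN_DIR/trial-low-risk"   # start the trial
#   AGY="$(agy_for_job low-risk)"; "$AGY" run ./low-risk.task
#   rm "$PIN_DIR/trial-low-risk"                           # end the trial
```

Removing the trial link sends the job back to the main pin, so abandoning a bad release costs no more than adopting a good one.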
For your next step, pick one job that runs automatically today, create the pinning directory and stash today's version, and add one line each for agy_version and agy_sha256 to the run log. Reproducibility builds up less from elaborate machinery and more from the small habit of recording the one binary that's running now. If you also want to rethink how much you delegate to scheduled runs, "After Migrating to Antigravity CLI: Deciding How Much to Delegate to Scheduled Runs" is a useful starting point.
If you live alongside overnight automation the way I do, I hope this trims even a little off your investigation time.