Antigravity Lab · Tips & Best Practices · 2026-06-21 · Intermediate

Tracing What a Long Agent Run Actually Did: Review That Starts From In-Conversation Search

How to use the in-conversation search added in Antigravity v2.1.4 as the starting point for reviewing long agent runs: choosing search terms, deciding which decision points to inspect, and reconciling with background-agent logs, in concrete steps.

Have you ever scrolled an agent conversation that ran for hundreds of steps, top to bottom, trying to find where it made a decision? The output looks right, but you cannot tell which decision was made where, so you end up dragging the scrollbar back and forth. For me, that was the most wasteful time in a review.

Antigravity v2.1.4 added cmd/ctrl+F search inside the conversation view. It looks like a small feature, but for reviewing long agent runs it changes the starting point entirely. Instead of reading everything, you jump to the decision points by search and read only those closely. This article lays out a review workflow built around that search, from choosing search terms to reconciling with background-agent logs.

Stop reading the whole thing

Reading a long conversation from the top costs the same time regardless of how important each decision was. What you really want to see in an agent review is not the output itself but "where it set the direction." Once you frame search as the tool for jumping straight to those branch points, how to use it becomes clear.

I split a review into three stages.

Stage | What to inspect | Example search terms
Direction branches | Where the agent narrowed its options | "instead", "for the following reason", "rather than"
External effects | Where a write, run, or send happened | "git push", "rm ", "created"
Uncertainty | Where the agent hesitated or guessed | "probably", "I assume", "likely"

Of these three, inspect "external effects" first. What the agent rewrote and what it executed are the parts a review cannot undo. Judging the quality of its reasoning can wait; first pin down the effects by search.

Search from the words that caused side effects

Concretely, search first for words that correspond to file writes and command execution. When an Antigravity agent uses execution tools, the conversation holds the commands it ran and the paths it created. Picking those up by search gives you a list of side effects in tens of seconds.

The first search terms I type are mostly fixed.

  • Execution: Running, Bash, Terminal, executed
  • Writes: Created, Edited, Wrote, updated
  • Destructive ops: rm , DROP, --force, deleted
  • Outbound: push, POST, deploy, published

I type the destructive-op terms partly to confirm there are zero hits. Being able to verify "it did nothing" by search is faster and more reliable than eyeballing. When there is a hit, I read only the surrounding lines and judge whether the operation was intended.
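The same zero-hit check can be scripted once the text is available as a file. What follows is a minimal sketch, assuming a log or a conversation you have saved to a text file yourself (the article does not describe an export feature). The function name `check_destructive` is hypothetical, and the terms are the destructive-op list above.

```shell
# Hypothetical helper: check a saved transcript or log for destructive-op terms.
# Exit status: 0 = zero hits, 1 = hits found (printed with line numbers), 2 = grep error.
check_destructive() {
  target="$1"
  # A literal "rm " also matches words such as "confirm ", so require a
  # non-word character (or line start) before it.
  # -n adds line numbers, -i ignores case, -E enables extended regex.
  hits=$(grep -n -i -E \
    -e '(^|[^[:alnum:]_])rm[[:space:]]' \
    -e 'drop' -e '--force' -e 'deleted' \
    -- "$target") && status=0 || status=$?
  case "$status" in
    0) printf '%s\n' "$hits"; return 1 ;;
    1) echo "zero hits"; return 0 ;;
    *) echo "search failed for $target" >&2; return 2 ;;
  esac
}
```

Treating a grep error separately from "no match" matters here: a mistyped path should never be reported as zero hits.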

Pick up the traces of hesitation

Once side effects are pinned, the next set is the uncertainty terms. When an agent is not confident, it leaves characteristic phrasing. Searching for "probably", "it seems", "I assume", "likely" surfaces the spots where it proceeded on a guess.

This is where the correctness of the output is ultimately decided. In my experience, a large share of decisions that caused problems later were near these hesitation words. If the agent wrote "the config file is probably here" and moved on, I verify against the real thing whether that guess was right. If it was, no problem; if it missed, everything downstream may be off.

Jump to these words by search and read only one or two steps around each. With this approach, even a conversation of hundreds of steps needs close reading of perhaps 5 to 10 spots. Compared with reading all of it, the review time dropped to a fraction in practice.
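When the text is in a file rather than in the conversation view, reading only a step or two around each hit corresponds to grep's context option. This is a minimal sketch, assuming a log or a transcript you saved yourself. The function name `hesitation_context` is hypothetical, and the terms are the hesitation words listed above.

```shell
# Hypothetical helper: show each hesitation hit with N lines of context on both sides.
hesitation_context() {
  file="$1"
  context="${2:-2}"   # lines of context before and after each hit (default 2)
  # -C adds context lines, -n line numbers, -i ignores case, -E enables alternation.
  # Matching lines are printed as "N:text", context lines as "N-text".
  grep -n -i -E -C "$context" -e 'probably|it seems|i assume|likely' -- "$file"
}
```

Called as `hesitation_context run.log 1` (a hypothetical file name), it prints each guess together with the line before and after it, which is usually enough to see what the guess was about.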

Reconcile background agents against logs

In-conversation search is powerful in the chat view, but for an agent that ran unattended in the background, it is faster to look at the logs before opening the conversation. Antigravity's background and scheduled runs leave execution logs, so I apply the same search terms to the log side, form a hypothesis, and then open the conversation.

# Extract only side effects and traces of hesitation from background-agent logs
LOG_DIR="$HOME/.antigravity/agent-runs"

# On recent run logs, confirm destructive ops and outbound sends first
# (-r recurse, -n line numbers, -i ignore case, -E extended regex)
grep -rniE "rm -rf|--force|drop table|git push|deploy" "$LOG_DIR" \
  | tail -40

# Count the lines where it proceeded on a guess (a starting point to suspect drift)
grep -rniE "probably|it seems|i assume|maybe|likely" "$LOG_DIR" \
  | wc -l

Get the count and location from the logs first, then open the matching conversation and jump to the same words with cmd/ctrl+F. With this two-step approach, even when several background agents run in parallel, you can narrow down which run and which spot to look at first. As an indie developer, when I run wallpaper-app asset generation in the background, I check in this order every time: logs first, then the conversation. If the logs confirm zero destructive ops, I can review the conversation calmly, focused on just the hesitation words.
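To decide which run to open first, it can help to see hit counts per log file rather than a single total. This is a minimal sketch, assuming each run writes its own file under the log directory (an assumption about the layout, not something the article states). The function name `rank_runs` is hypothetical.

```shell
# Hypothetical helper: list log files by number of hesitation lines, highest first.
rank_runs() {
  log_dir="$1"
  # grep -c prints "path:count" for every file; -r recurses, -i ignores case.
  # awk keeps files with at least one hit and prefixes the count for sorting,
  # sort orders numerically in descending order, cut drops the prefix again.
  grep -r -c -i -E -e 'probably|it seems|i assume|maybe|likely' -- "$log_dir" \
    | awk -F: '$NF > 0 { print $NF "\t" $0 }' \
    | sort -rn \
    | cut -f2-
}
```

Running `rank_runs "$LOG_DIR"` prints one `path:count` line per run that contains a guess, so the run at the top is the one to open in the conversation view first.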

Make the search terms your own defaults

Finally, a note on habit. If you invent search terms on the spot for every review, you will miss things. I recommend deciding on about ten of your own default terms across the three categories—side effects, hesitation, and direction branches. Even when the project or language changes, this skeleton carries over.

The benefit of fixed defaults is that the review becomes reproducible. If anyone can jump to the same branch points with the same words, review quality stops depending on the person. Even as an indie developer working alone, being able to review with the same criteria as my past self is quietly valuable.
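One way to keep the defaults fixed is to store them in a file and search with that file every time. This is a minimal sketch, assuming a terms file you maintain yourself with one term per line. The function name `review_with_defaults` and the file name `review-terms.txt` in the usage note are hypothetical.

```shell
# Hypothetical helper: search a log or saved transcript with a fixed terms file,
# so every review uses the same vocabulary.
review_with_defaults() {
  terms_file="$1"
  target="$2"
  # -f reads one pattern per line from the file, -F matches them as fixed strings,
  # -i ignores case, -n prints line numbers.
  # Keep the terms file free of blank lines: an empty pattern matches every line.
  grep -n -i -F -f "$terms_file" -- "$target"
}
```

With the terms kept in something like `review-terms.txt` under version control, `review_with_defaults review-terms.txt run.log` gives the same jump list no matter who runs it or when.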

Next time you let an agent run long, try searching one "destructive-op word" instead of reading from the top. Just knowing there are zero hits settles where the review should begin. From there, add terms for side effects, hesitation, and branches, and you will stop losing your starting point even in a long conversation.
