Turning a throwaway prompt into a reusable sub-agent
When a one-off prompt to an Antigravity 2.0 dynamic sub-agent works beautifully, it usually vanishes into your chat history. Here is how to capture it as a reusable definition, with the actual file layout and the distillation steps.
One night a prompt worked unusually well. It lined up my pre-release screenshots and flagged the truncated labels and the uneven margins per device. A week later I tried to run the same task again, and I could not find that prompt. It had sunk somewhere deep in the conversation scroll.
In Antigravity 2.0, dynamic sub-agents pick up parallel tasks for you. That is convenient, but the instructions you hand a child agent are born inside a conversation and they die inside it. The more useful a one-off result is, the more it pays to decide how you will keep it.
Why a prompt that worked once won't work again
The reason is plain: a prompt you typed in the moment carries no record of its assumptions.
The file that happened to be open, the exchange right before it, the unspoken bar for "good enough" you were holding in your head. The instruction only worked because all of that was present. Copy the string alone and none of it comes with you. Next week's version of you sits in a different context entirely.
So the thing worth keeping is not the wording of the prompt. It is the conditions under which that instruction held. A sub-agent definition is nothing more than those conditions, written down.
Four parts every reusable definition needs
The definitions I actually reused all shared the same four parts.
Role and boundaries
State in one sentence what this agent is responsible for. Just as important is what it must not do.
"Reviews the layout of screenshots. Does not touch code or assets." Spell out both the permitted scope and the forbidden one. Skip the boundary, and the child agent will helpfully propose changes you never asked for, which only adds to your review load.
Inputs and assumptions
Which files, which directory, which state does it need in order to work?
Leave this vague and every run begins with "which target did you mean?" — and the whole point of reuse evaporates. When the input cannot be fixed, declare it as a placeholder you pass in at call time.
Definition of done
What counts as finished? This is the heart of the definition.
"For each device size, check whether labels fit within the frame; list any that don't, together with the screen name. If everything is fine, reply only 'no issues'." Fix the shape of the output, and you can handle the result mechanically afterward.
Failure behavior
Cannot decide, target not found, assumption broken. Write down that, in those cases, it must not press ahead on its own.
This matters most in unattended runs. An agent that stops and reports is far cheaper to clean up after than one that charges forward on a shaky premise.
✦
Thank you for reading this far.
Continue Reading
What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.
WHAT YOU'LL LEARN
✦A concrete template that breaks a working prompt into four parts: role, inputs, definition of done, and failure behavior
✦Three steps to distill a throwaway prompt into a reusable definition, plus how to grow its accuracy over versions
✦A real indie-dev example: folding it into recurring release chores like screenshot review
Secure payment via Stripe · Cancel anytime
✦
Unlock This Article
Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.
I keep a docs/agents/ directory in the repo, with one markdown file per agent. It works whether you reference it from Antigravity's AGENTS.md or paste the body in at call time.
# screenshot-reviewer## RoleReviews the layout of pre-release screenshots.Does not modify code or image assets (no suggestions either).## Inputs- Target: {{ scshots_dir }} (default: ./fastlane/screenshots)- Devices: iPhone 16 Pro / iPhone SE / iPad Pro 13 (three sizes)## Steps1. Inspect every image for each device size2. Detect any labels that overflow the frame3. Detect screens whose margins are wildly uneven across edges## Definition of done- If there are issues: list "screen / device / symptom", one per line- If there are none: reply only "no issues"- Do not write fixes from guesswork (detection only)## On failure- Target directory empty or missing -> do not proceed; report it- Fewer than three device sizes present -> report the missing size and stop
The trick is to keep the steps to about three and fix the output to a single shape. The more an agent is asked to do, the harder its definition is to reuse. Keep each one small and own many of them — that has been my lived experience as an indie developer who keeps adding tools.
From throwaway prompt to definition — three steps
Here is how I turn an in-the-moment prompt into the shape above.
Pin down the fact that it worked. Keep one output where the prompt genuinely helped. Build a definition from a vague feeling, and you get an agent that never reproduces.
Surface the hidden assumptions. Ask yourself "what was I looking at?" and "where did I decide it was good enough?", then copy those answers into the inputs and the definition of done. The bar you couldn't put into words is exactly what kept drifting run to run.
Add the things it should not do. Write down, as prohibitions, the behaviors that actually annoyed you. My first definition always omitted "do not write fixes," and the reviewer quietly mutated into an editor.
Finish those three, and you are left not with a copied string but with an agent you can carry around, conditions included.
Calling it, and bringing results back to the main thread
Once the definition exists, you can throw the task sideways without stopping your main conversation. It pairs well with a disposable detour like /btw: hand over the definition body and swap only the input.
# Run just the screenshot review once, with the definition, from the CLIagy run \ --agent docs/agents/screenshot-reviewer.md \ --var scshots_dir=./fastlane/screenshots/en \ --output-format text# Only carry it back into the main work when the reply is not "no issues"
If the reply is "no issues," there is no need to interrupt what you were doing. Only when something comes back do you paste those lines into the parent conversation and let a human make the fix. Separating detection from judgment is what lets you add more child agents without losing the thread.
Growing it — raising accuracy over versions
A definition is not written once and left alone. Each time you use it, add the miss back as a single line.
I keep an agent that checks AdMob banner placement. At first it only looked for ads overlapping the body text. After a real-device overlap slipped through, I added "also check the distance to the bottom safe-area edge"; after a run of false positives, I added "treat dark-mode screenshots as a separate pass." Around the third revision, the rework all but disappeared.
Because the definition lives in the repo like code, its change history doubles as the agent's growth record. Keeping a trace of which version learned what turns out to support long-running automation more than I expected.
For a next step, pick the single task you most dread re-typing the prompt for, and break it into the four parts. Don't aim for a perfect definition — starting from one output that actually worked is enough. Thank you for reading.
Share
Thank You for Reading
Antigravity Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.