It Did Things I Never Asked For — Binding an Agent's Task Scope With a Contract

Ask an agent to fix a button color and you get a refactor, renames, and a dependency bump too. This is a scope problem, not a permission one. Here is a contract that makes the agent stop at the scope boundary and ask.

I thought I had only asked to "fix the button color." The returned changes included the color fix plus a refactor of a nearby function, a variable rename, and a dependency version bump for good measure. All of it was well-meant, and all of it worked. But what I wanted to see was a one-line color change; the rest only bloated the review and introduced diffs I never intended.

When you run several apps in indie development, this "over-helpfulness" quietly adds up. Each instance is small, but stacked together you lose track of which change you actually intended. What I want to bind here is not permission. The write access is fine. What I want to bind is acting beyond what was asked.

This Is a Scope Problem, Not a Permission One

Talk of reining in a runaway agent tends to start with permissions: what can it write to, what can it execute. That matters too, but this problem sits on a different layer. Having permission to fix the color is fine. The problem is doing other things while fixing the color — being over-helpful beyond the task's scope.

Binding with permissions reduces what the agent can do. Binding with scope leaves what it can do unchanged, and limits only "what is allowed this time." The latter is what I wanted.

Sign a Task-Scope Contract

So I put a short contract that states the scope per request into the rules or prompt. It has only three parts.

1. State what to do (in-scope) in one sentence.
2. List what not to do (out-of-scope), only the things that tend to happen.
3. Declare that if it wants to cross the boundary, it proposes without executing and stops.

## Scope of this task
- Do: change the submit button color to brand-primary.
- Do not: refactor nearby code, rename variables/functions,
  bump dependency versions, apply the formatter wholesale, ripple into other files.
- If tempted to cross the boundary: do not execute. List it as a one-line "proposal"
  and hand the decision to the human.

The third part is the crux. A blanket ban on "while I'm at it" changes also blocks ripple that is genuinely necessary. So rather than banning, the rule becomes "if tempted, leave a proposal rather than doing it." This keeps the noticing and confines only the execution to scope.
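The three parts can also be assembled programmatically, so the contract text stays identical across requests. The following is a minimal sketch, not the author's tooling: the `ScopeContract` class and its field names are hypothetical, and the rendered text simply mirrors the contract shown above.

```python
from dataclasses import dataclass, field


@dataclass
class ScopeContract:
    """Hypothetical container for the three parts of a task-scope contract."""

    # Part 1: the single in-scope sentence.
    do: str
    # Part 2: only the out-of-scope actions that tend to happen.
    do_not: list[str] = field(default_factory=list)
    # Part 3: what to do instead of crossing the boundary.
    at_boundary: str = (
        'do not execute. List it as a one-line "proposal" '
        "and hand the decision to the human."
    )

    def render(self) -> str:
        """Return the contract as text to paste into the rules or prompt."""
        lines = ["## Scope of this task", f"- Do: {self.do}"]
        if self.do_not:
            lines.append("- Do not: " + ", ".join(self.do_not) + ".")
        lines.append(f"- If tempted to cross the boundary: {self.at_boundary}")
        return "\n".join(lines)


contract = ScopeContract(
    do="change the submit button color to brand-primary.",
    do_not=[
        "refactor nearby code",
        "rename variables/functions",
        "bump dependency versions",
    ],
)
print(contract.render())
```

Keeping the boundary clause as a default value means only the first two parts change from request to request.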

Decide the Behavior at the Boundary

A contract alone is weak, so write the behavior at the boundary as an acceptance rule.

| Request | In scope | Tempting out-of-scope | Behavior when crossed |
| --- | --- | --- | --- |
| Change button color | The one color line | Nearby refactor, renames | Record as a proposal, do not execute |
| Fix one bug | The cause and necessary tests | Clearing unrelated warnings wholesale | Carve out as a separate task |
| Edit copy | The target string | Surrounding formatting, import cleanup | Report the count only, do not touch |
| Add one dependency | That dependency and minimal wiring | Updating other dependencies together | Stop and confirm with the human |

Build this table once and "what to do when crossed" becomes standardized per request type. You no longer write a contract from scratch each time — you pick a type and paste it.
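One way to make "pick a type and paste" concrete is to store the table as data and render a contract from it. This is a sketch under assumptions: the `BOUNDARY_RULES` name, the request-type keys, and the `contract_for` helper are all hypothetical; only the row contents come from the table above.

```python
# Hypothetical registry: one entry per row of the table above.
BOUNDARY_RULES = {
    "change-button-color": {
        "in_scope": "the one color line",
        "tempting": "nearby refactor, renames",
        "when_crossed": "record as a proposal, do not execute",
    },
    "fix-one-bug": {
        "in_scope": "the cause and necessary tests",
        "tempting": "clearing unrelated warnings wholesale",
        "when_crossed": "carve out as a separate task",
    },
    "edit-copy": {
        "in_scope": "the target string",
        "tempting": "surrounding formatting, import cleanup",
        "when_crossed": "report the count only, do not touch",
    },
    "add-one-dependency": {
        "in_scope": "that dependency and minimal wiring",
        "tempting": "updating other dependencies together",
        "when_crossed": "stop and confirm with the human",
    },
}


def contract_for(request_type: str) -> str:
    """Render the pasteable contract text for one request type."""
    # An unknown type raises KeyError on purpose: better to fail loudly
    # than to send a request with no scope attached.
    rule = BOUNDARY_RULES[request_type]
    return (
        "## Scope of this task\n"
        f"- In scope: {rule['in_scope']}.\n"
        f"- Out of scope: {rule['tempting']}.\n"
        f"- If tempted to cross the boundary: {rule['when_crossed']}."
    )


print(contract_for("edit-copy"))
```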

Granularity So It Does Not Stop Too Much

Make the scope too strict and it stops to ask for everything. That is draining in its own way. The key is to fix the granularity of allowed ripple in advance.

I default to a line: "formatting that stays within the same function is in scope; ripple that crosses files is out of scope." If the indentation gets messy around the line you fixed, fixing it within that function is fine. But going off to fix the same kind of spot in another file wholesale is out of scope.
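The cross-file half of that line is easy to check after the fact. The sketch below is an assumption on my part rather than part of the original workflow: it compares the files a change touched (for example, the output of `git diff --name-only`) against the files the request named. The function name and the example paths are hypothetical, and it does not attempt the "within the same function" half, which would need language-aware parsing.

```python
from pathlib import PurePosixPath


def files_outside_scope(changed_files, allowed_files):
    """Return changed paths that are not among the files the request named."""
    allowed = {PurePosixPath(p) for p in allowed_files}
    return sorted(
        str(PurePosixPath(p))
        for p in changed_files
        if PurePosixPath(p) not in allowed
    )


# Hypothetical paths, e.g. collected from `git diff --name-only`.
changed = [
    "src/components/SubmitButton.tsx",
    "src/components/CancelButton.tsx",
    "package.json",
]
allowed = ["src/components/SubmitButton.tsx"]

for path in files_outside_scope(changed, allowed):
    print(f"out of scope: {path}")
```

A non-empty result does not mean the change is wrong, only that it crossed the file boundary and deserves a look before merging.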

I vary the granularity by the weight of the request. For a throwaway small fix the scope is narrow; for design-bearing work I allow some ripple. In my own work I run two tiers: the narrowest scope for requests with external effects, such as AdMob-related config, and wider ripple for internal test work. Treat scope not as a fixed value but as a dial you select per request, and you find the middle between stopping too much and running too wild.
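The two-tier dial can be written down as a small lookup so the choice is made once per request rather than argued each time. This is only a sketch: the `ScopeTier` names, the `pick_tier` helper, and the exact wording of each ripple rule are my assumptions, not the author's text.

```python
from enum import Enum


class ScopeTier(Enum):
    NARROW = "narrow"  # requests with external effects
    WIDE = "wide"      # internal work such as tests


# Assumed wording for the ripple rule pasted into the contract per tier.
RIPPLE_RULE = {
    ScopeTier.NARROW: (
        "Touch only the lines named in the request; propose everything else."
    ),
    ScopeTier.WIDE: (
        "Ripple within the files named in the request is allowed; "
        "propose anything beyond them."
    ),
}


def pick_tier(has_external_effects: bool) -> ScopeTier:
    """Choose the narrow tier whenever the change can affect the outside world."""
    return ScopeTier.NARROW if has_external_effects else ScopeTier.WIDE


tier = pick_tier(has_external_effects=True)
print(tier.value, "->", RIPPLE_RULE[tier])
```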

Run Without Discarding Proposals

Having the agent leave proposals instead of executing out-of-scope work turns those observations into seeds for good next tasks. "Noticed while fixing the color: this function would be better split" is a line you do not want executed now, but it has value as a separate task.

I transcribe proposals to the end of my work log and review them on the weekend. As the un-executed observations accumulate, they become a list of the next places worth touching. Binding scope is not discarding the agent's powers of observation. Receive the observation, confine only the execution to scope. That separation protects both a light review and the clarity of change intent at once.
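The transcription step can be automated if the agent marks each proposal consistently. The sketch below assumes a convention that is not in the original: every proposal sits on its own line starting with `Proposal:`. The function names and the log layout are likewise hypothetical.

```python
from datetime import date
from pathlib import Path

# Assumed convention: the agent prefixes each proposal line with "Proposal:".
PROPOSAL_PREFIX = "proposal:"


def extract_proposals(agent_reply: str) -> list[str]:
    """Collect the text of every line marked as a proposal."""
    proposals = []
    for line in agent_reply.splitlines():
        # Drop leading list markers and spaces such as "- " or "* ".
        text = line.strip().lstrip("-* ").strip()
        if text.lower().startswith(PROPOSAL_PREFIX):
            body = text[len(PROPOSAL_PREFIX):].strip()
            if body:
                proposals.append(body)
    return proposals


def format_log_entry(proposals: list[str], day: date) -> str:
    """Format proposals as a dated block for the end of the work log."""
    header = f"\nProposals ({day.isoformat()})\n"
    return header + "".join(f"- {item}\n" for item in proposals)


def append_to_work_log(log_path: Path, proposals: list[str], day: date) -> None:
    """Append the dated block to the log; do nothing when there are no proposals."""
    if not proposals:
        return
    with log_path.open("a", encoding="utf-8") as log:
        log.write(format_log_entry(proposals, day))


reply = """Changed the submit button color to brand-primary.
- Proposal: this function would be better split
- Proposal: bump the dependency version
"""
found = extract_proposals(reply)
print(format_log_entry(found, date(2025, 1, 4)))
```

Calling `append_to_work_log(Path("worklog.md"), found, date.today())` would then add the block to a log file; the file name here is only an example.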

As a next step, pick the one request type you issue most often and write out just three of its "do not" items. In my experience, adding those three out-of-scope lines alone cuts over-helpful changes considerably.
