Designing Safe Background Tasks with the Managed Agents API
Antigravity 2.0's Managed Agents API launches an agent in an isolated Linux environment with a single API call, handling reasoning, tool use, and code execution. Convenient, but left unattended it invites runaways and cost overruns. Here is a design for running it safely as a background task.
"Fire and forget" is only allowed when you're interactive
When you try the API interactively, you can simply wait for the result to come back. It really is fire and forget.
But the moment you start running it unattended as a long-lived background task, the story changes entirely. In hours when no one is watching, it can keep running longer than expected, launch the same job twice, or quietly run up usage. As an indie developer running automation, I have been burned by unattended tasks, and almost every time it was because I had not built in a way to stop them.
The more convenient something is, the more its stop mechanism should be designed first. I consider this the first principle of background tasks.
Three guardrails I always add to unattended tasks
For agents that run unattended, I add the safety devices before writing the feature. These three are always present.
First, the timeout. An agent can decide it is "almost done" and keep trying indefinitely. Impose a wall-clock cap from the outside and cut it off without negotiation once exceeded.
Second, the budget cap. Set a ceiling on how much a single launch may use, and stop new calls when it is about to be exceeded. Most cost overruns come not from a single runaway but from an accumulation of small calls.
Third, the idempotency key. Double launches from retries or scheduler overlaps will happen. Give the task a unique key and make it do nothing if already processed.
| Guardrail | Accident it prevents | Implementation point |
|---|---|---|
| Timeout | An agent that never finishes lingering on | Impose a wall-clock cap externally |
| Budget cap | Cost overruns from accumulated small calls | Cut off usage per launch |
| Idempotency key | Duplicate processing and side effects from double launches | Detect "already done" via a unique key |
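The budget-cap guardrail can be sketched as a small counter that is checked before every call, so that many small calls cannot add up past the ceiling. This is a minimal sketch under assumptions: the `Budget` class, the `BudgetExceeded` exception, and the notion of "units" are hypothetical names for illustration, not part of the Managed Agents API.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push usage past the per-launch cap."""


class Budget:
    """Tracks usage for one launch and refuses charges beyond the cap."""

    def __init__(self, cap_units: int) -> None:
        self.cap_units = cap_units
        self.used_units = 0

    def charge(self, units: int) -> None:
        # Check before spending, so the cap is never crossed.
        if self.used_units + units > self.cap_units:
            raise BudgetExceeded(
                f"{self.used_units} used + {units} requested > cap {self.cap_units}"
            )
        self.used_units += units


budget = Budget(cap_units=50)
stopped = False
for _ in range(5):
    budget.charge(10)  # five small calls reach the cap exactly
try:
    budget.charge(1)  # the next call, however small, is refused
except BudgetExceeded:
    stopped = True
```

Checking before the spend rather than after is the design choice that matters here: the refused call never runs, so the recorded usage stays at or below the cap.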
Rather than calling the Managed Agents API directly, launch it through a wrapper that carries the three guardrails above. Here is the skeleton in pseudo-code. Confirm the actual endpoint and parameter names in your SDK docs.
```python
import hashlib
import time


def idempotency_key(task: dict) -> str:
    # Build a stable unique key from the task content
    raw = f"{task['type']}:{task['target']}:{task['date']}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]


def run_managed_agent(task: dict, agents, store, *, max_seconds=600, budget_units=50):
    # `agents` (SDK client) and `store` (persistent key store) are placeholders;
    # their method names are pseudo-code, not the real API.
    key = idempotency_key(task)

    # Idempotency check: do nothing if already processed
    if store.already_done(key):
        return {"status": "skipped", "key": key}

    started = time.monotonic()
    handle = agents.launch(  # Launch the agent in an isolated env
        prompt=task["prompt"],
        budget_units=budget_units,  # Pass the budget cap at launch
    )

    while True:
        state = agents.poll(handle.id)
        if state.finished:
            store.mark_done(key)
            return {"status": "ok", "result": state.result, "key": key}

        # Timeout: cut off externally by wall clock
        if time.monotonic() - started > max_seconds:
            agents.cancel(handle.id)
            store.mark_failed(key, reason="timeout")
            return {"status": "timeout", "key": key}

        time.sleep(5)
```
What matters here is that the timeout decision is made by the "outer wall clock," not by "the agent's own report." Asking a runaway agent "are you out of time yet?" will not get a correct answer. Cancellation must always be imposed from outside. And rejecting at launch via the idempotency key quietly prevents double-launch accidents.
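One caveat on rejecting at launch: if the check and the launch are separate steps, two overlapping launches can both pass the check before either records anything. A common way to close that gap is to claim the key atomically and let the database's uniqueness constraint decide the winner. The sketch below uses SQLite from the Python standard library; the `runs` table and the `open_store` and `try_claim` helpers are assumptions for illustration, not something the original wrapper or the API provides.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the key store; ':memory:' is only for the demo."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs (key TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def try_claim(conn: sqlite3.Connection, key: str) -> bool:
    """Return True only for the first caller to claim this key."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO runs (key, status) VALUES (?, 'running')", (key,)
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists: another launch got there first.
        return False


conn = open_store()
first = try_claim(conn, "report:2025-01-01")   # first launch wins the claim
second = try_claim(conn, "report:2025-01-01")  # duplicate launch is rejected
```

In a wrapper like the one above, a failed claim would map to the "skipped" result, and the row's status would be updated when the run finishes or times out.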
What you cannot observe, you cannot operate
Even with guardrails in place, you cannot keep operating if you cannot see what is happening. For unattended tasks, at minimum record "when it launched, how much it used, and how it ended."
```python
def log_run(record: dict):
    # Leave a structured log on one line (easy to aggregate later)
    line = (
        f"ts={record['ts']} key={record['key']} "
        f"status={record['status']} elapsed={record['elapsed']}s "
        f"units={record['units']}"
    )
    append_log(line)  # append_log: placeholder for your own log sink
```
In my automation, I always record how much time and usage each day's task consumed. With this, when usage suddenly rises one day, I can trace afterward which task changed and when. Without it, you are left with "it feels like it went up somehow" and never reach the cause. Observation exists less to stop a runaway than to notice one.
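To make that kind of tracing concrete, here is a minimal sketch that reads one-line `key=value` logs and totals usage per day. It assumes the `ts` field is an ISO 8601 timestamp and that values contain no spaces; both are assumptions about the log format, and `parse_line` and `units_per_day` are hypothetical helper names.

```python
from collections import defaultdict


def parse_line(line: str) -> dict:
    """Split 'k=v k=v ...' into a dict (values must not contain spaces)."""
    return dict(field.split("=", 1) for field in line.split())


def units_per_day(lines: list[str]) -> dict[str, int]:
    """Sum the units field per calendar day."""
    totals: dict[str, int] = defaultdict(int)
    for line in lines:
        rec = parse_line(line)
        day = rec["ts"][:10]  # assumes ISO 8601: YYYY-MM-DD comes first
        totals[day] += int(rec["units"])
    return dict(totals)


sample = [
    "ts=2025-01-01T03:00:00 key=a1 status=ok elapsed=120s units=30",
    "ts=2025-01-01T04:00:00 key=b2 status=timeout elapsed=600s units=50",
    "ts=2025-01-02T03:00:00 key=c3 status=ok elapsed=110s units=28",
]
totals = units_per_day(sample)
```

Grouping by `key` or `status` instead of by day works the same way, and is usually the next step once a day's total looks unusually high.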
Go unattended in stages
I do not recommend running fully unattended from the start. I always acclimate in order: run once by hand, then semi-automatic with observation, then fully unattended. The Managed Agents API is powerful and lets a single call carry a lot — which is exactly why, before handing it everything, I want to see for myself at least once what happens in the hours I am not watching.
If you take one step now, put just the timeout and the idempotency key into your wrapper. Those two alone make the night of an unattended task a much quieter one.