When Managed Agents Run in the Cloud, How Do You Hand Them Credentials?

The Antigravity 2.0 Managed Agents API runs agents in the cloud, away from your machine. That is convenient, but credential handling that was trivial on your own laptop suddenly gets hard. This article describes a design that avoids handing over long-lived tokens: issue a token per run and expire it quickly.

As long as you run agents locally, credentials rarely demand attention. An API key sits in your machine's environment variables and the agent just reads it. You know where the key is, and closing the laptop stops the run.

The first day I offloaded a task to the cloud via the Managed Agents API, I realized that this comfort existed only on my desk. I was about to put a production deploy token into an environment variable for a cloud-run agent, the same way I always had, and I stopped short.

That token would leave my machine and be used in a place I couldn't see, at a time I wasn't watching. If it leaked mid-flight, I would have no quick way to limit how far the damage reached or how long it lasted.

The essence of offloading an agent to the cloud is that execution leaves your hands. If execution leaves your hands, the credentials have to be reshaped into a form that stays safe after leaving them too.

The fear of a long-lived token leaving your desk

Most credentials we use day to day are built to live a long time. Issue one, and it's valid until you explicitly revoke it. On the assumption that only you use it locally, that's fine.

The problem is that this property is a poor fit for cloud execution.

A long-lived token has three weaknesses. First, a leak doesn't stop: it can be reused until revoked, so the later you notice, the wider the wound. Second, the privilege is too broad: it's tempting to hand over a strong key "so it just works," and it reaches resources the agent never needed to touch. Third, you lose the trail: reuse the same token across runs and you can no longer separate which run did what.

Locally, all three stay latent. The blast radius is confined to your machine, broad privilege is an extension of your own actions, and your memory fills in the trail. In the cloud, all three turn directly into operational risk.

So the design stance is simple. Keep the long-lived strong key on your desk, and hand the cloud a different credential that lives briefly and reaches narrowly. That's the whole of it.

Issue per run, throw away when done

The first pillar is matching the token's lifetime to the run's lifetime.

Before starting an agent, issue a token meant only for that run. When the task ends, success or failure, revoke it. The next run receives a fresh token. With this in place, even if a token leaks mid-run, it is valid only for the short window until that run finishes.

In practice this means inserting a broker. Only the broker layer holds the strong key, and it stays on your desk; the agent receives only the short-lived token the broker issued.

# token_broker.py — a broker that issues and revokes short-lived tokens per run
import time
import secrets
from dataclasses import dataclass
 
 
@dataclass
class LeasedToken:
    value: str
    scope: tuple[str, ...]      # only the actions this run is allowed
    expires_at: float
    run_id: str
 
 
class TokenBroker:
    """Hide the strong key inside; lend out only short-lived, narrow tokens."""
 
    def __init__(self, master_key: str, default_ttl: int = 600):
        self._master_key = master_key      # never leaves the desk
        self._default_ttl = default_ttl
        self._active: dict[str, LeasedToken] = {}
 
    def issue(self, run_id: str, scope: tuple[str, ...], ttl: int | None = None) -> LeasedToken:
        if ttl is None:                      # an explicit ttl=0 must not fall back to the default
            ttl = self._default_ttl
        token = LeasedToken(
            value=f"agt_{secrets.token_urlsafe(24)}",
            scope=scope,
            expires_at=time.time() + ttl,
            run_id=run_id,
        )
        self._active[token.value] = token
        return token
 
    def authorize(self, token_value: str, action: str) -> bool:
        token = self._active.get(token_value)
        if token is None:
            return False
        if time.time() > token.expires_at:
            self.revoke(token_value)        # expired means immediately invalid
            return False
        return action in token.scope         # reject anything outside the granted scope
 
    def revoke(self, token_value: str) -> None:
        self._active.pop(token_value, None)
 
 
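# Usage
# The master key string is a placeholder; load the real one from local secret storage.
broker = TokenBroker(master_key="hide the strong production key here")
 
def run_agent_task(run_id: str):
    # pass only the actions this run needs
    token = broker.issue(run_id, scope=("read:articles", "write:draft"), ttl=300)
    try:
        # dispatch_to_cloud_agent is a placeholder for your own call that starts the
        # cloud run; it is assumed to block until the run has finished
        dispatch_to_cloud_agent(run_id, token.value)   # the agent gets only the short-lived token
    finally:
        broker.revoke(token.value)                     # always revoke, success or failure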

The key point is that revoke always runs in finally. Whether it crashes on an exception or finishes cleanly, the token dies at the end of the run. The short ttl is a second layer of insurance: even if the revocation itself never runs, the token stops working on its own once time runs out.
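The TTL backstop can be checked without waiting for real time to pass. The sketch below is a stripped-down stand-in for the broker above, with one assumption that is not in the original code: the clock is injected as a `now` callable so that time can be moved forward by hand. `ExpiringTokens` and `FakeClock` are hypothetical names used only for this illustration.

```python
import secrets
from typing import Callable


class ExpiringTokens:
    """Minimal stand-in for the broker: tokens expire even if revoke is never called."""

    def __init__(self, now: Callable[[], float]):
        self._now = now                       # injected clock (assumption, for testability)
        self._expires: dict[str, float] = {}

    def issue(self, ttl: float) -> str:
        value = f"agt_{secrets.token_urlsafe(24)}"
        self._expires[value] = self._now() + ttl
        return value

    def is_valid(self, value: str) -> bool:
        expires_at = self._expires.get(value)
        if expires_at is None:
            return False
        if self._now() > expires_at:
            del self._expires[value]          # expired tokens are dropped on first check
            return False
        return True


class FakeClock:
    """Hand-cranked clock so the example does not need to sleep."""

    def __init__(self) -> None:
        self.t = 0.0

    def __call__(self) -> float:
        return self.t


clock = FakeClock()
tokens = ExpiringTokens(now=clock)
leaked = tokens.issue(ttl=300)               # imagine revoke never runs for this token

clock.t = 299
still_valid = tokens.is_valid(leaked)        # inside the window: still usable

clock.t = 301
valid_after_ttl = tokens.is_valid(leaked)    # past the TTL: dead without any revoke
```

Injecting the clock is only a testing convenience; the expiry rule itself is the same comparison the broker's authorize method makes.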

The agent-side code doesn't even need to be aware that its token is short-lived. It uses it as an ordinary token, and it naturally goes invalid when the run ends. The complexity is sealed inside the broker.

Trade a strong key for a narrow one

The second pillar is the breadth of privilege. Even with a short lifetime, a token that can touch every resource during that window still makes a leak costly.

What helps here is a habit: at each issuance, put into scope only the actions this run genuinely needs. The earlier example used ("read:articles", "write:draft") because a draft-writing task needs only to read articles and write a draft. Publishing to production or reaching billing data is impossible with this token.
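One way to turn this into a habit rather than a per-call decision is to keep the scopes in a small table keyed by task type. The sketch below is only one possible arrangement: the task names, the `publish` entry, and the `scope_for` helper are all hypothetical and not part of the broker.

```python
# Hypothetical table: each task type maps to the only actions it may perform.
SCOPES_BY_TASK: dict[str, tuple[str, ...]] = {
    "write_draft": ("read:articles", "write:draft"),
    "publish": ("read:articles", "publish:production"),
}


def scope_for(task: str) -> tuple[str, ...]:
    """Return the scope for a known task; unknown tasks get nothing (default deny)."""
    try:
        return SCOPES_BY_TASK[task]
    except KeyError:
        raise ValueError(f"no scope defined for task {task!r}; add one explicitly") from None


draft_scope = scope_for("write_draft")
```

Because an unknown task raises instead of falling back to some broad default, adding a new kind of run forces a deliberate decision about what it may touch.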

The usual hesitation about narrowing privilege is "what if it's not enough?" My answer to that worry is a fixed order: don't hand over broad privilege and trim it later; hand over narrow privilege and add to it when it falls short. Broad privilege that goes unused still sits there as risk indefinitely. Narrow privilege fails loudly the moment it falls short, so the necessary scope emerges naturally in operation.

Keeping action names from becoming too coarse matters as well. Instead of a single catch-all write, separating write:draft from publish:production structurally prevents a publish privilege from sneaking into a draft agent. That's why authorize decides on an exact match against the allow list. It rules out the vague "it's a write, so close enough" at the level of code.
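To see what the exact match buys, it helps to compare it with a looser check. The sketch below is illustrative only: `loose_authorize` is a deliberately flawed counterexample, `write:production` is a made-up action name, and `require` is a hypothetical helper that turns a denied action into an error naming the missing scope.

```python
GRANTED = ("read:articles", "write:draft")


def exact_authorize(scope: tuple[str, ...], action: str) -> bool:
    """Same rule as the broker: the action must appear verbatim in the allow list."""
    return action in scope


def loose_authorize(scope: tuple[str, ...], action: str) -> bool:
    """Counterexample: matching on the verb alone lets related actions slip through."""
    verb = action.split(":", 1)[0]
    return any(granted.split(":", 1)[0] == verb for granted in scope)


def require(scope: tuple[str, ...], action: str) -> None:
    """Fail loudly so a missing scope shows up as soon as it is needed."""
    if not exact_authorize(scope, action):
        raise PermissionError(f"action {action!r} is not in granted scope {scope}")


exact_result = exact_authorize(GRANTED, "write:production")   # denied
loose_result = loose_authorize(GRANTED, "write:production")   # wrongly allowed
```

The error message from `require` names the action that was refused, which is what lets the needed scope surface during operation instead of being guessed up front.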

Cap a leak's damage by time

Combine the two pillars and even the worst case stays bounded.

Suppose a token leaks somewhere in cloud execution. With a long-lived token, that's serious. But with a short-lived, narrow token, what the attacker holds is "a key valid for a few more minutes that can only read and write drafts." Production is untouched, and it dies on its own at timeout. The leak itself is a problem, but the ceiling on the damage was fixed by design from the start.
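The ceiling can be written down as simple arithmetic. A minimal sketch, assuming all times are in seconds measured from the moment the token is issued, and that revocation happens when the run ends; `exposure_seconds` is a hypothetical helper, not part of the broker.

```python
def exposure_seconds(leaked_at: float, run_ends_at: float, ttl: float) -> float:
    """Seconds a leaked token stays usable, with the token issued at t=0.

    The token dies at whichever comes first: revocation at run end, or TTL expiry.
    """
    dies_at = min(run_ends_at, ttl)
    return max(0.0, dies_at - leaked_at)


# Leak 60 s into a run that finishes at 180 s, with a 300 s TTL.
normal_case = exposure_seconds(leaked_at=60, run_ends_at=180, ttl=300)

# Same leak, but the run hangs and revoke never fires: the TTL is the backstop.
hung_run = exposure_seconds(leaked_at=60, run_ends_at=float("inf"), ttl=300)
```

In the first case the attacker has 120 seconds; in the second, 240. Either way the bound is known before the run starts, which is the point of fixing the ceiling by design.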

Being able to follow the trail also pays off in real operations. Because every token is issued against a run_id, each token maps to exactly one run. When you later ask "what did that automated run last week actually touch?", the token is the thread that leads you back to the run. A long-lived token reused across runs makes that separation impossible.

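# Keeping an issuance and revocation log makes later tracing easier.
# Assumes this continues token_broker.py above, so TokenBroker and time are in scope.
import logging
 
audit = logging.getLogger("token.audit")   # attach a handler at INFO level to see these records
 
class AuditedBroker(TokenBroker):
    def issue(self, run_id, scope, ttl=None):
        token = super().issue(run_id, scope, ttl)
        # round() avoids logging 299 for a 300-second lease after float truncation
        audit.info("issued run=%s scope=%s ttl=%ds", run_id, scope, round(token.expires_at - time.time()))
        return token
 
    def revoke(self, token_value):
        token = self._active.get(token_value)
        if token:
            audit.info("revoked run=%s", token.run_id)   # log only if the token was still active
        super().revoke(token_value)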

This log is a record for reconstructing "when, to which run, and with what scope a key was lent" if something happens. You don't read it day to day, but whether it exists on the day an investigation is needed completely changes how far you can look.
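One gap worth noting: the log lines above identify the run but not the token, so a token value found in a leak cannot be matched back to a run from the log alone. A possible refinement, which is an addition and not part of the broker above, is to log a short hash of the token instead of the token itself. `token_fingerprint` and `log_issued` are hypothetical helpers, and the 12-character truncation is an arbitrary choice.

```python
import hashlib
import logging

audit = logging.getLogger("token.audit")


def token_fingerprint(token_value: str) -> str:
    """Short, non-reversible identifier that is safe to write to a log."""
    return hashlib.sha256(token_value.encode("utf-8")).hexdigest()[:12]


def log_issued(run_id: str, token_value: str, scope: tuple[str, ...]) -> str:
    """Record the fingerprint, never the token value itself."""
    fp = token_fingerprint(token_value)
    audit.info("issued run=%s token_fp=%s scope=%s", run_id, fp, scope)
    return fp


fp = log_issued("run-001", "agt_example_value", ("read:articles", "write:draft"))
```

During an investigation, hashing the leaked value the same way and searching the log for that fingerprint would lead back to the run, while the log itself never holds a usable credential.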

What to do as the first step

When you offload agents to the cloud, building a perfect credential platform all at once is daunting for an indie developer. My suggestion is to start with just the broker that issues per run and revokes when done. Fine-grained privilege splitting can come next.
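As a concrete starting point, the issue-then-revoke pairing can be packaged as a context manager so that callers cannot forget the revoke. A minimal sketch, assuming an in-memory store; `leased_token` and `ACTIVE` are hypothetical names, and a real setup would revoke against whatever component actually validates the token.

```python
import secrets
import time
from contextlib import contextmanager
from typing import Iterator

ACTIVE: dict[str, float] = {}   # token value -> expiry time (in-memory, illustration only)


@contextmanager
def leased_token(ttl: int = 300) -> Iterator[str]:
    """Issue a token for the duration of the with-block and revoke it on exit."""
    value = f"agt_{secrets.token_urlsafe(24)}"
    ACTIVE[value] = time.time() + ttl
    try:
        yield value
    finally:
        ACTIVE.pop(value, None)   # runs on normal exit and on exceptions alike


with leased_token() as token:
    active_during_run = token in ACTIVE   # the agent run would happen here

active_after_run = token in ACTIVE
```

This is the same try/finally shape as run_agent_task, just moved into one reusable place so every run gets it by default.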

Just matching the token's lifetime to the run already caps a leak's damage by time. Layer least privilege on top and add the audit log, and execution that has left your hands feels nearly as safe as it did when it was still in them.

The freedom to offload tasks to the cloud and the caution of never letting go of your credentials can coexist. In fact, it's precisely because the caution is baked into the design that I feel able to trust an agent with the work.
