Antigravity Lab · Integrations · 2026-06-16 · Advanced

When a Cloud Nightly Batch Drifts From Yesterday's Result — An Input Contract and Snapshot Design for Reproducibility

When you move a batch job to an ephemeral cloud worker via the Managed Agents API, the environment assumptions you took for granted locally vanish. Here is a three-layer design (environment snapshot, input contract, seed pinning) that keeps the same input producing the same result.

This article covers moving a batch job I had been running on my own machine to a cloud worker via the Managed Agents API. The first few days were comfortable: the job did not occupy my local CPU, it ran overnight on its own, and the report was ready by morning. After about a week, though, I noticed something odd. The input was supposedly identical, yet the report's structure differed from one day to the next.

An ephemeral worker spins up in a fresh environment every time and disappears when done. That is a design benefit, but the flip side is that everything that was "obvious" locally disappears too. The config files, environment variables, caches, and small bits of input state that lived on my machine are not carried over to the cloud.

Here I share how I separated the causes of drift in a cloud batch and protected reproducibility across three layers: input contract, environment snapshot, and seed pinning. This batch also rolls up AdMob mediation numbers alongside each site's weekly report. I want the rollup to keep running unattended while I am busy with App Store and Google Play releases, which is exactly why daily drift was unacceptable. This is the practical setup I arrived at as an indie developer auto-generating reports across several sites.

First, split "drift" into four causes

"It is not reproducible" cannot be fixed as one undifferentiated problem. I split it into four causes and closed them one at a time.

  1. Environment differences. The ephemeral worker's runtime and library versions can differ subtly from one spin-up to the next, which is easy to miss.
  2. Missing implicit context. Files and data I could reference locally were never handed to the cloud.
  3. Model updates. An alias like gemini-3.5-flash can have its underlying model quietly swapped.
  4. The model's inherent non-determinism. Temperature and sampling jitter remain.

These four call for entirely different fixes. Environment differences are handled by snapshots, implicit context by the input contract, model updates by version pinning, and non-determinism by fixing seed and temperature. Mix them together and each gets only half-fixed.
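One way to keep the four causes apart in practice is to record a small manifest per run and compare consecutive runs field by field. The sketch below is only an illustration: the `RunManifest` fields and the `classifyDrift` helper are hypothetical names, not part of the Managed Agents API, and it assumes the worker can record an environment digest, a hash of its input, and the model version that was actually served.

```typescript
// Hypothetical per-run manifest. Field names are illustrative assumptions.
interface RunManifest {
  environmentDigest: string;  // e.g. digest of the worker's image or lockfile
  contractSha256: string;     // hash of the serialized input handed to the job
  servedModelVersion: string; // model version the run reports having used
  outputSha256: string;       // hash of the generated report
}

type DriftCause =
  | "none"
  | "environment"
  | "input"
  | "model-update"
  | "non-determinism";

// Compare two runs and name the first recorded layer that differs.
function classifyDrift(prev: RunManifest, curr: RunManifest): DriftCause {
  if (prev.outputSha256 === curr.outputSha256) return "none";
  if (prev.environmentDigest !== curr.environmentDigest) return "environment";
  if (prev.contractSha256 !== curr.contractSha256) return "input";
  if (prev.servedModelVersion !== curr.servedModelVersion) return "model-update";
  // Every recorded layer matches, yet the output differs.
  return "non-determinism";
}
```

The "input" bucket corresponds to implicit context only once everything the job needs is actually part of the hashed input. If every recorded field matches and the output still differs, the remainder is either sampling jitter or context the manifest does not capture, so that bucket deserves a second look.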

Input contract: turn implicit context into explicit arguments

What helped most was pinning the input contract as JSON. A cloud worker can assume nothing about "what should be on the machine." So I write everything the job needs into one contract object and pass only that.

// Input contract — state everything this job requires
interface BatchInputContract {
  contractVersion: "1";          // version of the contract itself
  task: "weekly-report";
  // Pass data as content or an immutable pointer, never as a loose reference
  inputs: {
    siteId: string;
    periodStart: string;         // ISO 8601 — no vague "last week"
    periodEnd: string;
    datasetUri: string;          // fixed URI on immutable object storage
    datasetSha256: string;       // hash to verify after fetch
  };
  // Specify the model by a pinned version, not an alias
  model: {
    id: "gemini-3.5-flash";
    pinnedVersion: string;       // e.g. "gemini-3.5-flash-002"
    temperature: 0;
    seed: number;
  };
  // Pin the prompt template by version as well
  promptTemplateId: string;      // e.g. "weekly-report@7"
}

Three points matter. First, do not pass the period as a relative expression like "last week"; pass absolute start and end dates. Second, pass data as an immutable URI whose contents you can verify by hash, not as a loose reference. Third, specify the model by a pinned version, not an alias. These three alone sharply raised the odds that the same contract yields the same result.
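These three points can be checked mechanically before a job is submitted. The following is a minimal sketch, not the validation this batch actually runs: `ContractLike` and `contractProblems` are hypothetical names, the regular expressions are my own assumptions about what counts as an absolute date and a SHA-256 hex string, and the alias check only catches the simple case where the pinned version was left equal to the alias.

```typescript
// Only the fields the checks need; mirrors the contract shape shown above.
interface ContractLike {
  inputs: {
    periodStart: string;
    periodEnd: string;
    datasetUri: string;
    datasetSha256: string;
  };
  model: { id: string; pinnedVersion: string };
}

// Accepts "2026-06-08" or "2026-06-08T00:00:00Z" (optionally with an offset).
const ISO_DATE =
  /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2}))?$/;
const SHA256_HEX = /^[0-9a-f]{64}$/;

// Returns a list of human-readable problems; an empty list means "looks OK".
function contractProblems(c: ContractLike): string[] {
  const problems: string[] = [];
  for (const key of ["periodStart", "periodEnd"] as const) {
    const value = c.inputs[key];
    if (!ISO_DATE.test(value) || Number.isNaN(Date.parse(value))) {
      problems.push(`${key} must be an absolute ISO 8601 date, got "${value}"`);
    }
  }
  // Only compare the two dates when both parsed cleanly.
  if (
    problems.length === 0 &&
    Date.parse(c.inputs.periodStart) > Date.parse(c.inputs.periodEnd)
  ) {
    problems.push("periodStart is after periodEnd");
  }
  if (!SHA256_HEX.test(c.inputs.datasetSha256)) {
    problems.push("datasetSha256 must be 64 lowercase hex characters");
  }
  if (
    c.model.pinnedVersion.trim() === "" ||
    c.model.pinnedVersion === c.model.id
  ) {
    problems.push("model.pinnedVersion must name a specific version, not the alias");
  }
  return problems;
}
```

Returning a list rather than throwing on the first failure lets the submitting side report every problem in one pass and refuse to enqueue the job if the list is non-empty.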

The dataset hash check especially mattered. Even if datasetUri is the same, the result changes if its contents were swapped. The worker verifies datasetSha256 right after fetching and halts the job on mismatch. This catches the hardest-to-notice drift — "the input changed without anyone noticing."
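The check itself is small. Here is a minimal sketch using Node's built-in `node:crypto` module, assuming the worker runs on Node.js and already holds the fetched dataset as bytes. `sha256Hex` and `verifyDataset` are hypothetical helper names, and fetching from object storage is deliberately left out.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the fetched bytes.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Throws if the fetched dataset does not match the hash in the contract,
// so the job halts before any report is generated from swapped input.
function verifyDataset(data: Uint8Array, expectedSha256: string): void {
  const actual = sha256Hex(data);
  if (actual !== expectedSha256.toLowerCase()) {
    throw new Error(
      `dataset hash mismatch: expected ${expectedSha256}, got ${actual}`
    );
  }
}
```

Calling `verifyDataset(bytes, contract.inputs.datasetSha256)` immediately after the fetch, and before anything reads the data, keeps a silently swapped object from ever reaching the model.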
