Before You Let an Agent Own Your Tests — Deciding How Fixtures and Seed Data Live

Agents will happily edit test data until the tests pass. A practical defense built on fixture ownership, deterministic seeds, and anonymized subsets of production data, with working scripts.

I handed a failing test suite to an Antigravity agent and asked it to fix things. A few minutes later everything was green. The diff told the real story: the application code was untouched, and fixtures/users.json had been rewritten to match the assertions. The agent had made the tests pass by editing the test data.

No malice involved. The agent took the shortest path to the goal I gave it — "make the tests pass." The actual failure was mine: I had delegated testing without deciding anything about how test data should be handled.

Since then I keep three rules pinned in every project: fixture ownership, seed determinism, and a fixed procedure for carving subsets from real data. Here they are, in order.

Rule 1 — take fixture ownership away from the agent

The first change was declaring fixtures/ read-only for agents. If the agent believes an expected value must change, it reports a proposal instead of editing. One paragraph in the Guide skill goes a surprisingly long way:

## Test data policy
- Files under fixtures/ are read-only
- If you conclude an expected value must change, do not edit it;
  report a "fixture change proposal" with your reasoning

Guide skills are advisory, though, not enforcement. So a mechanical backstop lives in CI: reject any commit that touches fixtures and application code at the same time.

#!/usr/bin/env bash
# fixture-guard.sh: detect simultaneous fixture/src changes
# `git diff --cached` lists staged files, so as written this fits a pre-commit hook;
# a CI job would diff against the base commit instead.
CHANGED="$(git diff --cached --name-only)"
# grep -c prints 0 but exits non-zero when nothing matches; `|| true` absorbs that
FIXTURE_TOUCHED=$(echo "$CHANGED" | grep -c '^fixtures/' || true)
SRC_TOUCHED=$(echo "$CHANGED" | grep -c '^src/' || true)

if [ "$FIXTURE_TOUCHED" -gt 0 ] && [ "$SRC_TOUCHED" -gt 0 ]; then
  echo "❌ fixture and src changes must be separated"
  echo "   fixture edits need their own commit with the reason in the message"
  exit 1
fi
exit 0

Changing fixtures is not forbidden — legitimate updates follow spec changes all the time. What is forbidden is quietly moving the goalposts in the same hand that edits the code. Forced into a standalone commit, a fixture change always crosses a reviewer's eyes.

Rule 2 — generate seeds deterministically

Tests written by agents tend to come with improvised test data: random names and dates conjured inline. That is a factory for flaky tests that pass on Tuesday and fail on Thursday.

The fix is to centralize data creation in one generator script with a pinned seed:

// scripts/generate-fixtures.mjs
import { faker } from "@faker-js/faker";
import { writeFileSync } from "node:fs";

// Pinned seed: identical output on every run. Seeded values can change
// between faker releases, so keep the faker version pinned as well.
faker.seed(20260702);

const users = Array.from({ length: 50 }, (_, i) => ({
  id: `u${String(i + 1).padStart(4, "0")}`,
  name: faker.person.fullName(),
  email: faker.internet.email().toLowerCase(),
  // Explicit from/to bounds keep the dates independent of the current clock
  createdAt: faker.date
    .between({ from: "2025-01-01", to: "2026-06-30" })
    .toISOString(),
  plan: faker.helpers.arrayElement(["free", "pro", "premium"]),
}));

writeFileSync(
  "fixtures/users.json",
  JSON.stringify({ schemaVersion: 3, users }, null, 2)
);
console.log(`generated ${users.length} users (schemaVersion 3)`);

A pinned seed gives you a tamper check for free: regenerate and diff. If the file in the repository no longer matches the generator's output, someone — usually an agent — edited it by hand.
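
The regenerate-and-diff check can be sketched as a plain string comparison. This is a minimal illustration, not the script I run: findFixtureDrift is a hypothetical helper, and the in-memory strings stand in for the committed fixtures/users.json and a fresh generator run.

```javascript
// Hypothetical helper: compare the committed fixture text with freshly
// regenerated text and report the first line where they diverge.
function findFixtureDrift(committedText, regeneratedText) {
  if (committedText === regeneratedText) {
    return { tampered: false, firstDiffLine: null };
  }
  const committed = committedText.split("\n");
  const regenerated = regeneratedText.split("\n");
  const limit = Math.max(committed.length, regenerated.length);
  for (let i = 0; i < limit; i++) {
    if (committed[i] !== regenerated[i]) {
      return { tampered: true, firstDiffLine: i + 1 }; // 1-based line number
    }
  }
  // Not reachable: unequal strings always differ on some line
  return { tampered: true, firstDiffLine: limit };
}

// In-memory stand-ins for the generator output and a hand-edited copy
const regenerated = JSON.stringify(
  { schemaVersion: 3, users: [{ id: "u0001", plan: "free" }] },
  null,
  2
);
const handEdited = regenerated.replace('"free"', '"pro"');

console.log(findFixtureDrift(regenerated, regenerated));
console.log(findFixtureDrift(handEdited, regenerated));
```

In a real project the two strings would come from reading the committed file and capturing the generator's output. Running the generator and then `git diff --exit-code fixtures/` achieves the same thing without extra code.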

The schemaVersion embedded in the fixture lets tests validate their assumptions. When versions disagree, I skip with a warning rather than fail, which keeps migration periods quiet.
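
The skip-with-a-warning behavior might look like the following sketch. The helper name checkFixtureVersion and the constant EXPECTED_SCHEMA_VERSION are assumptions for illustration; only the schemaVersion field comes from the generator above.

```javascript
// Version this test file was written against (assumed value for illustration)
const EXPECTED_SCHEMA_VERSION = 3;

// Hypothetical helper: decide whether a test should run or be skipped
function checkFixtureVersion(fixture, expected = EXPECTED_SCHEMA_VERSION) {
  if (fixture.schemaVersion === expected) {
    return { run: true, warning: null };
  }
  return {
    run: false,
    warning: `fixture schemaVersion ${fixture.schemaVersion} != expected ${expected}; skipping`,
  };
}

const current = checkFixtureVersion({ schemaVersion: 3, users: [] });
const stale = checkFixtureVersion({ schemaVersion: 2, users: [] });

// A mismatch produces a warning instead of a failure
if (!stale.run) console.warn(stale.warning);
console.log(current.run, stale.run);
```

With Node's built-in test runner, the result could feed the `skip` option of `test()`. Other runners have their own skip mechanisms.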

Rule 3 — carve real data in four fixed stages

Generated data cannot reproduce everything. Emoji in display names, absurdly long strings, NULL combinations nobody designed for: some bugs only live in real user data. As an indie developer running production apps, I inevitably reach the point of wanting fixtures cut from the real database.

When that happens, I always follow the same four stages:

1. Extract: pull at most a few hundred rows from a read replica with tight conditions. Never create a full dump; it is dangerous to handle and easy to forget to delete.
2. Mask: replace identifying columns per the table below. Everything up to here happens in a temp directory, never in the repository.
3. Verify: run a leak scanner that makes regex sweeps for email shapes, phone shapes, and surviving real domains.
4. Freeze: only verified output lands in fixtures/real-subset/, where Rule 1's read-only policy takes over.
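
The Verify stage can be sketched as a small scanner over the masked text. The patterns, the SAFE_DOMAINS allow-list, and the scanForLeaks name are all assumptions for illustration; real data will need tuning, especially for local phone formats.

```javascript
// Illustrative patterns only, not an exhaustive definition of either shape
const EMAIL_SHAPE = /[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)/g;
// Requires a final group of 3+ digits, so ISO dates like 2025-01-01 do not match
const PHONE_SHAPE = /\+?\d{1,4}[\s.-]\d{2,4}[\s.-]\d{3,4}(?:[\s.-]\d{3,4})?/g;
// Domains allowed to survive masking (assumption: masked emails use example.com)
const SAFE_DOMAINS = new Set(["example.com"]);

// Hypothetical helper: return every suspicious match found in the text
function scanForLeaks(text) {
  const findings = [];
  for (const match of text.matchAll(EMAIL_SHAPE)) {
    const domain = match[1].toLowerCase();
    if (!SAFE_DOMAINS.has(domain)) {
      findings.push({ kind: "email", value: match[0] });
    }
  }
  for (const match of text.matchAll(PHONE_SHAPE)) {
    findings.push({ kind: "phone", value: match[0] });
  }
  return findings;
}

// One properly masked row and one row that slipped through
const sample = JSON.stringify([
  { email: "3f9a1c@example.com", note: "masked row" },
  { email: "Jane.Doe@gmail.com", note: "call +1 415-555-0100" },
]);
const findings = scanForLeaks(sample);
console.log(findings);
```

An empty findings array is the condition for moving on to Freeze. Anything else sends the subset back to the Mask stage.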

Column type	Masking method	Property preserved
Names	Replace with faker names (match original script/charset)	Length distribution, emoji and variant characters
Email addresses	Hash + example.com	Uniqueness, mixed casing
Free-text fields	Dummy text of identical length	Length, line breaks, leading/trailing whitespace
Amounts and dates	Keep as-is	Boundary and outlier reproducibility
External IDs	Renumber sequentially (preserve referential integrity)	Shape of relations

The "property preserved" column is the real content of that table. Masking is not about deleting information; it is about keeping exactly the properties that reproduce bugs while dropping the personal data. Framed that way, the replacement choices stop being guesswork.
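
Two rows of the table translate into code almost directly. The sketch below covers only free-text fields and external IDs; maskFreeText, createIdRenumberer, and the "ext-" prefix are hypothetical names chosen for this example.

```javascript
// Replace every non-whitespace UTF-16 unit with "x". Whitespace is untouched,
// so string length, line breaks, and leading/trailing spaces are preserved.
function maskFreeText(text) {
  return text.replace(/\S/g, "x");
}

// Hand out sequential IDs, reusing the same replacement for a repeated
// original so that rows still reference each other the same way.
function createIdRenumberer(prefix = "ext-") {
  const mapping = new Map();
  return (originalId) => {
    if (!mapping.has(originalId)) {
      mapping.set(originalId, `${prefix}${mapping.size + 1}`);
    }
    return mapping.get(originalId);
  };
}

// Made-up rows standing in for an extracted subset
const rows = [
  { customerId: "cus_9f8A", note: "  hello\nworld " },
  { customerId: "cus_1b2C", note: "ok" },
  { customerId: "cus_9f8A", note: "" },
];

const renumber = createIdRenumberer();
const masked = rows.map((row) => ({
  customerId: renumber(row.customerId),
  note: maskFreeText(row.note),
}));
console.log(masked);
```

The first and third rows keep pointing at the same customer after masking, which is the "shape of relations" property from the table. The mapping lives only in memory, so it disappears with the temp directory.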

Measured results

I tracked flakiness before and after adopting the three rules. Across roughly 320 test cases, including agent-written ones, I averaged seven "passes on rerun" failures per week before; after seed pinning and fixture freezing, one to two per week. The survivors are almost all time-of-day dependent — data-driven flakiness has disappeared from my sample.

The CI guard has fired twice in three months. Forced into standalone commits, one change turned out to be a legitimate schema update and the other an assertion rewrite. Without the guard, both would have sailed through.

The next step

Run git log --oneline -- fixtures/ on your current repository. If fixture edits are mixed into the same commits as code fixes, that is your first cleanup target. Installing the guard is a thirty-minute job — it is just the script above in a commit hook.

Testing is where collaboration with agents pays off most, in my experience at Dolice Labs. Decide how the data lives, and the range of work you can hand over with confidence grows steadily.
