When Parallel Agents Corrupt Your Lockfile: Serializing Dependency Installs in Antigravity

Antigravity's parallel agents racing on the same pnpm store and lockfile can corrupt both. Here is how I kept code generation parallel while serializing only the installs with a file lock.

One morning, two of the four tasks I had handed to overnight agents had stalled with a "dependency not found" error. As an indie developer running several apps and the Dolice Labs blogs in parallel, I batch my agents during off-peak hours. That night I had split the work across four agents at once: a React screen, a new API route, a lint cleanup, and some test coverage.

What greeted me was a corrupted pnpm-lock.yaml and a half-extracted node_modules. The code diffs themselves were fine. The problem was that all four agents had fired pnpm install at nearly the same instant and fought over the same store and the same lockfile.

This article is the fix that has kept things stable since. The idea is simple: leave code generation parallel, and serialize exactly one thing — the dependency install.

What Actually Broke — Two Shared Resources

When we think of parallel-agent collisions, we picture two agents editing the same source file. But that was not the failure here. What broke were two resources that agents rarely see.

The first is the pnpm content-addressable store. Its default location depends on the pnpm version and OS: recent versions use ~/.local/share/pnpm/store on Linux and ~/Library/pnpm/store on macOS, while older versions used ~/.pnpm-store. pnpm keeps one copy of each package in a central store and hard-links its files into each project's node_modules. When two processes write to the store at once, one can link into a package another is still extracting, leaving a directory with missing contents.

The second is pnpm-lock.yaml. pnpm rewrites the lockfile when resolution changes during an install. Two installs writing it back simultaneously leave a YAML with one side's intermediate state mixed in, and every later resolution inherits the damage.

In my setup, the worktrees were separated with git worktree, but the store and lockfile were still shared. I had isolated the source but not the dependency resolution — I had drawn the parallelism boundary one level too high.

How to Tell It Apart in Logs

The same "install failed" message looks different depending on cause. The signs specific to a parallel race were these.

Observed sign	What it means
`ENOENT` on a path under the store	Another process linked to a still-extracting copy
`Cannot read properties of undefined (reading 'integrity')`	The lockfile lost an integrity hash mid-rewrite
A huge, disordered `git diff` on `pnpm-lock.yaml`	Two resolutions interleaved
A solo re-run always succeeds	Strong evidence the code is fine and the cause is contention
Empty directories left under `node_modules/.pnpm`	An interrupted extraction

The solo re-run row was the decisive test. If a solo re-run always passes, it is not a content bug; it is a timing problem. I now treat that observation as my first triage step.
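The log signs in the table above can be checked mechanically. The following is a minimal sketch, not the author's tooling: the function name triage_install_race and its two arguments (a saved install log and a project directory) are assumptions made for illustration, and the grep patterns are only the strings quoted in the table.

```shell
# Hypothetical triage helper: prints one line per race signature it finds.
# Usage: triage_install_race <install-log> <project-dir>
triage_install_race() {
  local log_file="$1" project_dir="$2"

  # ENOENT anywhere in the log (confirm by eye that the path is under the store)
  if grep -q 'ENOENT' "$log_file"; then
    echo "sign: ENOENT in install log"
  fi

  # The integrity error quoted in the table
  if grep -qF "reading 'integrity'" "$log_file"; then
    echo "sign: lockfile integrity error"
  fi

  # Empty directories left behind under node_modules/.pnpm
  local virtual_store="$project_dir/node_modules/.pnpm"
  if [ -d "$virtual_store" ] &&
     [ -n "$(find "$virtual_store" -mindepth 1 -type d -empty -print | head -n 1)" ]; then
    echo "sign: empty directories under node_modules/.pnpm"
  fi
}
```

A hit here only suggests contention. The solo re-run remains the stronger test, since each of these signs can also have other causes.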

The "Fixes" That Failed First

My first move was to drop the agent count from four to two. The frequency fell, but it never reached zero. Lowering parallelism only thins the probability; it does not address the cause.

Next I tried giving each worktree its own store (pnpm config set store-dir per worktree). The lockfile collisions dropped, but separate stores lose hard-link sharing: disk usage ballooned and first installs slowed noticeably. With four worktrees on fully separated stores, dependency disk usage grew about 3.4x.

Those two detours settled the design: keep one store, but allow only one process to touch it at a time. In other words, mutual exclusion.

Serializing Only Installs With a File Lock

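A single layer of flock is enough to prevent concurrent installs. flock is a command-line wrapper around the flock(2) advisory lock; it ships with util-linux on Linux, while on macOS the command is not included by default and has to be installed separately (for example via Homebrew). Agents call pnpm install through this wrapper instead of directly.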

#!/usr/bin/env bash
# locked-install.sh — serialize only the dependency install
set -euo pipefail
 
# One lock file for every worktree. The default path is machine-wide, like the shared store
LOCK_FILE="${PNPM_INSTALL_LOCK:-/tmp/pnpm-install.lock}"
TIMEOUT_SEC="${INSTALL_LOCK_TIMEOUT:-600}"
 
exec 9>"$LOCK_FILE"
 
# -w gives a timeout. If we cannot get the lock, exit non-zero so the caller retries
if ! flock -w "$TIMEOUT_SEC" 9; then
  echo "install lock timeout after ${TIMEOUT_SEC}s" >&2
  exit 75   # EX_TEMPFAIL: treat as a transient failure
fi
 
echo "[locked-install] acquired lock, running: pnpm install $*" >&2
pnpm install "$@"
# The lock is released when this script exits and fd 9 is closed

Three things matter here.

One lock per repository, sized to the resources you protect — the store and the lockfile. A per-project lock would not prevent the shared-store race.
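A timeout, so that if a lock-holder hangs, waiters do not wait forever. (If the holder dies outright, the kernel releases the flock lock once its file descriptor is closed.) Returning exit 75 (transient failure) lets your agents' retry policy treat the timeout as retryable rather than as a real install error.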
Code generation, edits, and tests stay outside the lock. Only the install is serialized, so the benefit of parallelism is preserved.
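On the caller side, the exit-75 convention can be honored with a small retry loop. This is a sketch under assumptions: retry_on_tempfail and the RETRY_DELAY_SEC variable are hypothetical names, not part of pnpm or Antigravity, and how an agent actually retries depends on your task setup.

```shell
# Hypothetical caller-side helper. Re-runs a command only while it exits 75
# (EX_TEMPFAIL); any other status, success or failure, is returned as-is.
# Usage: retry_on_tempfail <max-attempts> <command> [args...]
retry_on_tempfail() {
  local max_attempts="$1"; shift
  local attempt=1 status=0

  while :; do
    status=0
    "$@" || status=$?          # capture the status without tripping `set -e`
    if [ "$status" -ne 75 ]; then
      return "$status"         # success or a real failure: stop retrying
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 75                # still contended after the last attempt
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY_SEC:-5}"
  done
}

# Example: retry_on_tempfail 3 ./locked-install.sh --frozen-lockfile
```

Retrying only on 75 keeps genuine install errors visible. A broken dependency fails once and surfaces immediately instead of being retried.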

On the Antigravity side, each agent's workflow swaps pnpm install for ./locked-install.sh. Spelling this out in the task definition (your AGENTS.md or task setup steps) stops an agent from quietly reverting to a bare pnpm install.
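One way to catch a regression to a bare pnpm install is a simple text check over the task definitions and setup scripts. The sketch below is an illustration only: find_bare_installs is a hypothetical helper, and which files to scan is an assumption about your repository layout.

```shell
# Hypothetical lint: list lines in the given files that call `pnpm install`
# directly. The wrapper itself is expected to match, so it is skipped by name.
# Usage: find_bare_installs <file> [file...]
find_bare_installs() {
  local file
  for file in "$@"; do
    case "$file" in
      *locked-install.sh) continue ;;
    esac
    # -n prints line numbers; -H prints the file name even for a single file
    grep -nHE 'pnpm[[:space:]]+install' "$file" || true
  done
}

# Example: find_bare_installs AGENTS.md scripts/*.sh
```

Expect false positives in prose that merely mentions the command. The output is meant to be reviewed by a person, not used as a hard gate.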

Per-Worktree node_modules, Shared Store

Serialization stops the collisions, but one more adjustment adds stability: make node_modules independent per worktree while keeping a single shared store. This is close to pnpm's default, but worth stating explicitly when you use git worktree.

# .npmrc at the root of each worktree
# Share the store (hard links save disk); keep node_modules worktree-local
store-dir=~/.pnpm-store
# Use pnpm's isolated layout: packages are symlinked from node_modules/.pnpm rather than hoisted flat
node-linker=isolated
# Parallel downloads stay on; write contention is handled by flock

With node-linker=isolated, each worktree's dependency tree is independent, so a half-finished state in one worktree cannot leak into another. The store stays shared, so repeat installs remain fast via hard links.
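To check that a worktree really shares files with the store instead of holding private copies, you can count files whose hard-link count is greater than one. This is a rough sketch: count_hardlinked_files is a hypothetical helper, and it assumes pnpm hard-linked the files. On filesystems that support copy-on-write clones, pnpm may clone instead, in which case the count can be zero even though the store is shared.

```shell
# Count regular files under a directory that have more than one hard link.
# In a pnpm project, a non-zero count under node_modules/.pnpm suggests files
# are shared with the store rather than copied.
# Usage: count_hardlinked_files <dir>
count_hardlinked_files() {
  local dir="$1"
  find "$dir" -type f -links +1 | wc -l | tr -d ' '
}

# Example: count_hardlinked_files node_modules/.pnpm
```

Comparing the result across worktrees gives a quick sanity check after changing store-dir. A worktree that suddenly reports zero may have fallen back to copying.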

I chose this because the set of repos I touch changes day to day. Separate stores pay the first-install cost every time, but a shared store reuses whatever I have already downloaded. In daily work, that disk-versus-first-run trade-off quietly adds up.

What Changed, Measured

On the overnight batch (four parallel agents, a dozen-plus installs per night), I compared roughly two weeks before and after.

Metric	Before	After
Stalls from lockfile corruption	3 of 12	0
Manual morning recoveries (2 weeks)	5	0
Max wait on a parallel install	—	~40s
Dependency disk use (vs separate-store option)	—	~1/3.4 of it
Agent parallelism	4	4 (unchanged)

The ~40-second worst case is the last of four installs waiting on the prior three. Since code generation never waits, perceived total time barely moved. The bigger difference for me was that morning recoveries dropped to zero.

Where the Same Idea Applies

This shape — stay parallel, serialize only the one moment that touches a shared resource — is not limited to installs. I apply the same pattern to:

Database migrations: lock only the moment the schema changes; keep read/write query generation parallel.
Final repo-wide formatting: each agent's edits run in parallel, but a single repo-wide format pass runs serially.
Shared cache rebuilds: take the lock only while regenerating the build cache.

What they share is a mindset: not "reduce parallelism" but "shrink the serialized region to the smallest single point." Look one level finer at your unit of parallelism and you can erase the accidents without slowing the whole down.
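The same shape can be wrapped around any command. The sketch below deliberately swaps in a different technique from the install wrapper: it uses mkdir, which creates a directory atomically, as the lock, so it does not depend on the flock utility. The name with_lock, the lock path, and the example command are all hypothetical.

```shell
# Hypothetical generic wrapper: run one command while holding a directory lock.
# Usage: with_lock <lock-dir> <timeout-seconds> <command> [args...]
with_lock() {
  local lock_dir="$1" timeout_sec="$2"; shift 2
  local waited=0

  # mkdir succeeds for exactly one caller; everyone else polls until it is free
  until mkdir "$lock_dir" 2>/dev/null; do
    if [ "$waited" -ge "$timeout_sec" ]; then
      echo "lock timeout after ${timeout_sec}s: $lock_dir" >&2
      return 75                    # same transient-failure code as the install wrapper
    fi
    sleep 1
    waited=$((waited + 1))
  done

  local status=0
  "$@" || status=$?                # only the serialized step runs inside the lock
  rmdir "$lock_dir"                # release the lock
  return "$status"
}

# Example (assumes a "format" script exists in package.json):
#   with_lock /tmp/repo-format.lock.d 300 pnpm run format
```

The trade-off against flock is cleanup. If the holder is killed before rmdir runs, the directory stays behind and has to be removed by hand, whereas a flock lock disappears with the process.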

As you add parallel agents, collisions will surface somewhere. But the cause is usually not the source files — it is the store, the lockfile, or the cache hiding behind them. Start there. I hope this saves someone else the same morning cleanup.
