Taking Stock of the Dependencies Your Agent Added — A Design for Keeping License and Provenance Traceable
A few months of letting agents work, and your package.json quietly grows dependencies you don't remember adding. Here is a design for taking stock — recovering what was added, when, and why, in a form you can still trace later.
One morning I opened package.json and stopped. Sitting next to date-fns was a small date library I had no memory of adding. A quick git blame traced it to an agent commit from three weeks earlier. It had been pulled in to make a test pass. It works. But why that package, under what license, and whether it was safe to remove — none of that was written down anywhere.
Handing code to an agent genuinely raises your throughput. Underneath that, the dependency tree grows quietly, and faster than your memory of it. Each decision is small, but across a few months and several projects they pile up into a state where nobody holds the whole picture. This is the kind of debt that hits hardest in long-term operation.
What follows is a design for taking stock of the dependencies your agent added — pulling license and provenance back into a form you can trace.
What happens when you can no longer trace it
"One mystery dependency" is a joke you can live with. The trouble shows up when it accumulates.
The first cost is a license mismatch. An agent picks the package that satisfies the feature; it does not necessarily weigh the license terms each time. GPL-family code can slip into a closed commercial app, and without an auditing mechanism you may never notice.
The second is a widening attack surface. One direct dependency can drag ten or twenty transitive ones along. The longer unused dependencies sit there, the more of your time goes to chasing vulnerability notices.
The third is lost judgment. When you wonder, six months on, "can I drop this dependency?" — if the reason it was added is gone, so is your nerve to remove it. Dependencies nobody dares touch end up squatting forever.
As an indie developer running several sites and apps in parallel at Dolice, I feel all three. The third — lost judgment — carries a weight specific to agent-driven work. Dependencies a human added usually leave behind a memory, or a message in a chat log. Dependencies an agent added often leave nothing beyond the commit message.
Break the audit into three questions
Start an audit vaguely and the sheer number of dependencies stalls you. Splitting it into three questions draws a clear line between what the machine handles and what a person reviews.
Question | What you want to know                                | Automatable?
What     | Which direct dependencies did the agent add?         | Mostly yes
When     | In which commit, as part of which work, did it land? | Yes
Why      | Was there a real reason it had to be this one?       | Needs a human
"What" and "when" live in git history. Only "why" needs a person to fill in after the fact — which is exactly why recording the "why" at the moment of adding pays off. We'll come back to that.
Implementation 1: Extract the dependencies the agent added
First, mechanically pull out when and what was added. The approach: chase the commits that added lines to the dependencies block of package.json with git log -p. If you keep your setup so agent commits are identifiable by author or committer email, the filtering stays accurate.
The script below lists each line added with + to the dependency sections, alongside the introducing commit's date, author, and subject. A version bump also shows up as an added line, so the same package can appear more than once.
#!/usr/bin/env bash
# scan-added-deps.sh: extract dependencies added to package.json from history
set -euo pipefail

PKG="package.json"

# Pick only lines added to dependencies / devDependencies, with commit context
git log --diff-filter=AM -p --date=short \
  --pretty=format:'@@COMMIT@@%h|%ad|%an|%s' -- "$PKG" \
| awk '
  /^@@COMMIT@@/ {
    sub(/^@@COMMIT@@/, "");
    n = split($0, m, "|");
    sha=m[1]; date=m[2]; author=m[3]; subject=m[4];
    # keep any "|" that appears inside the subject itself
    for (i = 5; i <= n; i++) subject = subject "|" m[i];
    next;
  }
  # Target only added lines of the form "  \"pkg\": \"^x.y.z\","
  /^\+[[:space:]]+"[^"]+":[[:space:]]*"[~^]?[0-9]/ {
    line=$0;
    sub(/^\+[[:space:]]+"/, "", line);
    sub(/".*/, "", line);
    # skip the top-level version field, which matches the pattern but is not a dependency
    if (line == "version") next;
    printf "%-28s | %s | %-18s | %s\n", line, date, author, subject;
  }
' | sort -u
The author column tells you at a glance which ones came from the agent. The ones you added yourself, like p-retry, you'll remember. The problem children are the quiet agent additions like date-fns-tz. Those are the stars of the audit.
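If agent commits carry a recognizable author name, the script's output can be narrowed to agent additions in a few lines. The following is a minimal sketch, assuming the `name | date | author | subject` row format printed by scan-added-deps.sh; the `AGENT_AUTHORS` list and the `agent-bot` name are hypothetical placeholders for whatever identity your agent commits under.

```javascript
// Hypothetical agent identities; replace with the author names your agent commits under.
const AGENT_AUTHORS = ["agent-bot"];

// Parse one row of scan-added-deps.sh output: "name | date | author | subject".
function parseRow(row) {
  const parts = row.split(" | ");
  if (parts.length < 4) return null; // blank or malformed line
  const [name, date, author, ...rest] = parts.map((p) => p.trim());
  // A subject may itself contain " | ", so glue the remainder back together.
  return { name, date, author, subject: rest.join(" | ") };
}

// Keep only the rows whose author is one of the known agent identities.
function agentAdditions(output, agents = AGENT_AUTHORS) {
  return output
    .split("\n")
    .map(parseRow)
    .filter((r) => r !== null && agents.includes(r.author));
}
```

Feeding the script's output through `agentAdditions` leaves only the rows that deserve a closer look.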
Worth noting: devDependencies come through the same mechanism. Build tools and type definitions are numerous and easy for an agent to grow, so viewing them separately from runtime dependencies lightens the load.
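One way to get that separate view is to bucket the extracted names against the current package.json. This is a sketch under the assumption that package.json has already been parsed into an object; `splitByKind` and the `gone` bucket are names introduced here for illustration.

```javascript
// Split dependency names into runtime and dev buckets using the current package.json.
// Names found in neither section (for example, already removed) land in "gone".
function splitByKind(names, pkg) {
  const runtime = pkg.dependencies ?? {};
  const dev = pkg.devDependencies ?? {};
  const buckets = { runtime: [], dev: [], gone: [] };
  for (const name of names) {
    if (Object.hasOwn(runtime, name)) buckets.runtime.push(name);
    else if (Object.hasOwn(dev, name)) buckets.dev.push(name);
    else buckets.gone.push(name);
  }
  return buckets;
}
```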
Implementation 2: Collect licenses and check them against a policy
Once you know "what" and "when," the next question is whether the license is acceptable. This part can be fully automated: collect licenses from the installed dependencies and sort them into tiers against your policy.
License data lives in the package.json of each package under node_modules. You can tally it with stock Node, no dedicated tool required.
// audit-licenses.mjs: check installed dependency licenses against a policy
import { readFile, readdir } from "node:fs/promises";
import { join } from "node:path";

// Your own policy. Tune it to the nature of the project.
const POLICY = {
  allow: ["MIT", "ISC", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "0BSD"],
  review: ["MPL-2.0", "LGPL-3.0", "CC0-1.0", "Unlicense"],
  // Everything else (GPL family, unknown licenses) is treated as deny.
};

async function collect(dir) {
  const out = [];
  for (const name of await readdir(dir)) {
    if (name.startsWith(".")) continue;
    const base = join(dir, name);
    if (name.startsWith("@")) {
      for (const scoped of await readdir(base)) {
        out.push(...(await readOne(join(base, scoped), `${name}/${scoped}`)));
      }
    } else {
      out.push(...(await readOne(base, name)));
    }
  }
  return out;
}

async function readOne(base, name) {
  try {
    const pkg = JSON.parse(await readFile(join(base, "package.json"), "utf8"));
    const license = typeof pkg.license === "string" ? pkg.license : "UNKNOWN";
    return [{ name, version: pkg.version ?? "?", license }];
  } catch {
    return [];
  }
}

function classify(license) {
  if (POLICY.allow.includes(license)) return "allow";
  if (POLICY.review.includes(license)) return "review";
  return "deny";
}

const deps = await collect("node_modules");
const flagged = deps
  .map((d) => ({ ...d, tier: classify(d.license) }))
  .filter((d) => d.tier !== "allow")
  .sort((a, b) => a.tier.localeCompare(b.tier));

for (const d of flagged) {
  console.log(`[${d.tier.toUpperCase()}] ${d.name}@${d.version} - ${d.license}`);
}

const denied = flagged.filter((d) => d.tier === "deny");
if (denied.length > 0) {
  console.error(`\n⚠️ Action needed: ${denied.length} package(s) hit the deny policy`);
  process.exit(1);
}
Anything in allow is not printed; the only things a person should look at are review and deny. Wire it into CI and the build stops the moment a deny slips in. A pull request where the agent added a dependency under a denied license fails this check, and that is the moment an audit shifts from a periodic chore to a mechanism that stops contamination before it lands.
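When the check fails on a pull request, it helps to know which dependencies that pull request actually introduced. The sketch below compares two parsed package.json objects; it assumes you have already obtained the base and head versions (for instance with `git show <rev>:package.json`), and `newDependencies` is a name made up for this example.

```javascript
// Return the dependency names present in head but absent from base,
// looking at both dependencies and devDependencies.
function newDependencies(basePkg, headPkg) {
  const names = (pkg) =>
    new Set([
      ...Object.keys(pkg.dependencies ?? {}),
      ...Object.keys(pkg.devDependencies ?? {}),
    ]);
  const before = names(basePkg);
  return [...names(headPkg)].filter((n) => !before.has(n)).sort();
}
```

Printing this list next to the license result points the reviewer straight at the addition that tripped the policy.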
License strings vary by package. Some put an object in license instead of a string, or use the old licenses array form. You'll see a flood of UNKNOWN at first, so the realistic move is to handle the most frequent ones first. Prioritize never dropping a deny over perfect classification.
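A small normalizer covers the most frequent variants. This is a sketch, not a complete SPDX parser: it handles a string, an object with a `type` field, and the old `licenses` array. Joining several array entries into an `(A OR B)` expression is an assumption made here; since that string is in no allow list, it still falls through to deny and gets a human look.

```javascript
// Reduce the license metadata of a parsed package.json to a single string.
function normalizeLicense(pkg) {
  // Accept either "MIT" or { type: "MIT", url: "..." }.
  const one = (v) => {
    if (typeof v === "string") return v.trim() || null;
    if (v && typeof v === "object" && typeof v.type === "string") {
      return v.type.trim() || null;
    }
    return null;
  };

  const direct = one(pkg.license);
  if (direct) return direct;

  // Old form: "licenses": [{ type: "MIT" }, ...]
  if (Array.isArray(pkg.licenses)) {
    const types = pkg.licenses.map(one).filter(Boolean);
    if (types.length === 1) return types[0];
    // Assumption: keep every entry visible rather than silently picking one.
    if (types.length > 1) return `(${types.join(" OR ")})`;
  }
  return "UNKNOWN";
}
```

In audit-licenses.mjs, the `license` line in `readOne` could call `normalizeLicense(pkg)` instead of checking only for a string.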
In my own work, the policy runs roughly in three tiers.
Tier   | Policy                                  | Examples
allow  | Use without review                      | MIT, ISC, Apache-2.0, BSD family
review | Confirm the use case first              | MPL-2.0, LGPL family
deny   | Avoid in principle, find an alternative | GPL family, unknown licenses
These tiers shift with the project. A closed commercial app and an OSS project you plan to publish will, of course, draw the deny line differently. What matters is writing it down somewhere and applying the same standard to agents and humans alike.
Record the "why" the moment you add it
We mechanized "what, when, and which license." What remains is "why," and that alone cannot be recovered automatically later. So you need a mechanism that records it — however thinly — at the moment of adding.
What I settled on is appending a provenance trailer to the commit. Git trailers are key: value lines at the end of a commit message, which you can pull back out mechanically with git log.
feat: add input validation with zod

Consolidate schema definitions in one place and validate at the API boundary.

Dep-Added: zod@^3.23
Dep-Reason: hand-written type guards were multiplying; replace them declaratively
Dep-Reviewed-By: masaki
Add one line to your agent instructions (a rule file such as AGENTS.md) — "whenever you add a dependency, always write a Dep-Added and Dep-Reason trailer" — and provenance starts accumulating as a record. You don't need perfection. A single line of reasoning dramatically changes how an audit feels six months out.
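An instruction in a rule file is easier to trust when something checks it. The following sketch reports what is missing from a commit message, given the names of the dependencies that commit introduces. `provenanceProblems` is a hypothetical helper, and how you obtain the message and the added names (a hook, a CI step) is left open.

```javascript
// Return a list of problems for a commit that adds dependencies.
// message:   full commit message text
// addedDeps: names of the dependencies the commit introduces, e.g. ["zod"]
function provenanceProblems(message, addedDeps) {
  const lines = message.split("\n");
  // Collect the values of every "Key: value" line for a given key.
  const values = (key) =>
    lines
      .filter((l) => l.startsWith(`${key}:`))
      .map((l) => l.slice(key.length + 1).trim());

  // "zod@^3.23" -> "zod", "@scope/pkg@1.0.0" -> "@scope/pkg"
  const declared = values("Dep-Added").map((v) => {
    const at = v.lastIndexOf("@");
    return at > 0 ? v.slice(0, at) : v;
  });

  const problems = [];
  for (const dep of addedDeps) {
    if (!declared.includes(dep)) problems.push(`missing Dep-Added for ${dep}`);
  }
  // At least one non-empty Dep-Reason is required when anything was added.
  if (addedDeps.length > 0 && values("Dep-Reason").every((v) => v === "")) {
    problems.push("missing Dep-Reason");
  }
  return problems;
}
```

An empty result means the commit carries the provenance it should; anything else can be shown to the agent as a request to amend the message.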
When you want to tally the trailers, read them back out of git log.
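One way to do it, sketched in JavaScript to match audit-licenses.mjs: ask git log for each commit's hash, date, and raw message body (`%h`, `%ad`, `%B`), then pick out the Dep-* lines. The `@@COMMIT@@` marker mirrors the one used in scan-added-deps.sh; the function names are illustrative, and the sketch assumes at most one Dep-Added trailer per commit.

```javascript
const MARK = "@@COMMIT@@";
const KEYS = ["Dep-Added", "Dep-Reason", "Dep-Reviewed-By"];

// Parse text produced by:
//   git log --date=short --format='@@COMMIT@@%h|%ad%n%B'
function parseDepTrailers(logText) {
  const records = [];
  // Everything before the first marker is empty, so skip it.
  for (const chunk of logText.split(MARK).slice(1)) {
    const [header, ...body] = chunk.split("\n");
    const [sha, date] = header.split("|");
    const rec = { sha, date };
    for (const line of body) {
      for (const key of KEYS) {
        if (line.startsWith(`${key}:`)) {
          rec[key] = line.slice(key.length + 1).trim();
        }
      }
    }
    // Only commits that declare an added dependency are of interest.
    if (rec["Dep-Added"]) records.push(rec);
  }
  return records;
}

// Run git in the current repository and parse its output.
async function readDepTrailers() {
  const { execFileSync } = await import("node:child_process");
  const text = execFileSync(
    "git",
    ["log", "--date=short", `--format=${MARK}%h|%ad%n%B`],
    { encoding: "utf8" },
  );
  return parseDepTrailers(text);
}
```

Calling `await readDepTrailers()` inside a repository yields one record per commit that declared a dependency, ready to print or to join against the history scan.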
You don't need to run all of this by hand every day. Deciding how it runs is the real substance of the design. Here is how I split it.
The part that stops contamination lives permanently in CI. The license check runs on every pull request and always fails on a deny. No human discretion enters here; the machine owns it.
For the audit itself, once a month is plenty. Run scan-added-deps.sh, look over the dependencies the agent added that month, fill in a reason only for the ones missing a Dep-Reason, and drop what isn't used. Half an hour covers it.
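The monthly pass boils down to one join: additions from history on one side, recorded reasons on the other. A minimal sketch, assuming both have already been loaded into plain objects; the shapes described in the comments and the name `missingReasons` are assumptions for illustration.

```javascript
// additions: [{ name, date }] taken from the history scan (date as "YYYY-MM-DD")
// trailers:  [{ "Dep-Added": "zod@^3.23", "Dep-Reason": "..." }] taken from commit trailers
// month:     "YYYY-MM" prefix selecting the month under review
function missingReasons(additions, trailers, month) {
  // Names that already have a non-empty reason on record.
  const explained = new Set(
    trailers
      .filter((t) => (t["Dep-Reason"] ?? "").trim() !== "")
      .map((t) => {
        const spec = t["Dep-Added"] ?? "";
        const at = spec.lastIndexOf("@"); // strip the version, keep any scope
        return at > 0 ? spec.slice(0, at) : spec;
      }),
  );
  const names = additions
    .filter((a) => a.date.startsWith(month))
    .filter((a) => !explained.has(a.name))
    .map((a) => a.name);
  return [...new Set(names)]; // a package bumped twice should be listed once
}
```

The returned names are the short list that still needs a sentence of reasoning from a person.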
And the review cases that genuinely need judgment, I look at myself, each time. Try to automate that and the policy ends up hollow. Keeping the human's scope narrow is, I think, the trick to sustaining this for the long run.
Agents work remarkably conscientiously within a standard you make explicit. The trouble is leaving them to "just handle it" without ever handing over the standard. Auditing dependencies is, in part, the practice of putting that standard into words and splitting the work between machine and human.
As a first step, run scan-added-deps.sh once against your own repository. How many dependencies from the past month do you not remember? That number is the tangible case for turning the audit into a mechanism. I hope it helps anyone juggling several projects the same way.