Accounting for Which Agent Spent What: A Cost Attribution Design by Task
Your month-end bill is one number, but running multiple agents on Gemini 3.5 Flash hides which task ate the cost. Separate from a budget guard, I share a cost-attribution accounting design that maps usage to per-task and per-site cost, with a solo-operator implementation and numbers.
The bill that arrived each month was always one total. Gemini 3.5 Flash is fast and cheap, and even running agents in parallel the per-call cost is small — or so I assumed. But running tasks across six sites together, I had no idea which site's which task ate how much of that total. A cheap operation can quietly dominate the bill through sheer volume, and you'd never notice.
Before lowering cost, I needed to make where it goes visible. This is a third kind of work, distinct from token-saving optimization and from a budget guard that cuts off spend. In accounting terms it is cost attribution: assigning incurred cost correctly to the unit that produced it. As an indie developer who watches the books alone, I can't make investment decisions while the unit cost stays invisible.
A budget guard and cost attribution are different jobs
They're easy to conflate, so let me separate them first. A budget guard "stops when a cap is exceeded" — it prevents runaway spend. Cost attribution "assigns spent cost to the task and site that produced it" — it produces material for decisions.
With a guard but no attribution, you can't see the state where "the total is within budget, but cost skews toward inefficient tasks." I myself ran with only a guard for a long time and felt safe. It didn't break the bank, but whether the allocation was optimal stayed unknown.
Attribution starts by tagging every model call with "who is this call for."
Tag every call with a cost tag
Attach three tags to each call — site, agent, and task type — so you can later slice cost along these axes.
Fill the tags automatically from the call-site context. Tag by hand and you will forget, so generate the CostTag where the agent is launched and pass it into the call wrapper.
✦
Thank you for reading this far.
Continue Reading
What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.
WHAT YOU'LL LEARN
✦Get the full TypeScript implementation of a cost ledger that tags every call and converts usage to currency
✦Learn the aggregation logic that rolls up cost by task and by site and exposes cost per published article
✦See the decision criteria and rules that cut monthly spend about 28% once unit cost became visible
Secure payment via Stripe · Cancel anytime
✦
Unlock This Article
Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.
Convert the usage (token counts) the model returns into currency at separate input and output rates. Rates vary by model, so collect them in one table.
// Price per million tokens (JPY). Real values vary by contract and FX, so centralize as constantsconst PRICE_PER_MTOK_JPY: Record<string, { in: number; out: number }> = { "gemini-3.5-flash": { in: 11, out: 44 }, "gemini-3.1-pro": { in: 190, out: 760 },};function toCostJpy(usage: Usage): number { const p = PRICE_PER_MTOK_JPY[usage.model]; if (!p) throw new Error(`unknown model price: ${usage.model}`); const inCost = (usage.inputTokens / 1_000_000) * p.in; const outCost = (usage.outputTokens / 1_000_000) * p.out; return Math.round((inCost + outCost) * 100) / 100; // 2 decimals}
What matters here is reflecting that the output rate is several times the input rate. When a "cheap" operation turns out expensive, the cause is usually output volume. Recording input and output separately lets you later name the tasks whose output has ballooned.
Record to a ledger and aggregate
Accumulate tagged cost entries in an append-only ledger. One JSON per line is plenty. Then just aggregate by axis.
class CostLedger { private entries: CostEntry[] = []; record(tag: CostTag, usage: Usage): void { this.entries.push({ ...tag, ts: new Date().toISOString(), model: usage.model, inputTokens: usage.inputTokens, outputTokens: usage.outputTokens, costJpy: toCostJpy(usage), }); } // group-sum along any axis sumBy(key: keyof CostTag): Record<string, number> { const acc: Record<string, number> = {}; for (const e of this.entries) { acc[e[key]] = (acc[e[key]] ?? 0) + e.costJpy; } return acc; } // cost per artifact, by task type unitCost(taskType: string, count: number): number { const total = this.entries .filter((e) => e.taskType === taskType) .reduce((s, e) => s + e.costJpy, 0); return count > 0 ? Math.round((total / count) * 100) / 100 : 0; }}
sumBy("taskType") gives cost by task and sumBy("site") by site in one shot. unitCost drops you to unit economics like "cost per premium article." It was only by looking at this unit cost that I noticed a certain routine task was eating several times its expected spend.
Make investment decisions on unit cost
Once cost is visible by axis, decisions shift from feel to numbers. Here is how I actually read it.
First, line up each task type's unit cost against the value that task produces. If a rarely-read meta aggregation task costs as much as one premium article, it's a candidate to stop. Unit cost exposes whether spend skews toward low-value work.
Second, reconcile per-site cost against that site's revenue (App Store / Google Play / AdMob, or subscriptions). If cost skews across sites while revenue doesn't match, that's a signal to rebalance allocation.
Third, name the tasks with outsized output tokens and tighten their prompts. Output costs several times input, so prompts that demand verbose output quietly inflate the bill. Check whether you're asking for long free-form writing where a summary or a bulleted list would do.
These revisions cut my monthly spend about 28%. The main driver was being able to stop low-value tasks I'd unconsciously multiplied "because it's cheap and fast" — using unit cost as the evidence. That's a kind of waste a guard alone can't catch.
As long as the bill looks like one number, your optimization moves rely on intuition. Start by tagging every call with site, agent, and task, and aggregate a month by axis. The moment you can see where the money flows, what to stop becomes concrete.
Share
Thank You for Reading
Antigravity Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.