App Development · 2026-07-03 · Intermediate

Catching Download Size Regressions Before Submission Day — A Weekly Agent Gate for AAB/IPA Size Budgets

Mediation SDKs and bundled assets quietly inflate download size. A design for size ledgers, budget gates, and agent-driven delta attribution using bundletool and App Thinning reports.

In May 2026 I expanded the ad mediation stack in the wallpaper apps I run as an indie developer, adding Liftoff, InMobi, and Unity Ads adapters. The revenue configuration worked exactly as intended. What I only noticed after pushing an internal-test build was that the per-device download size had grown by more than 8MB. Functional tests passed. No crashes. The regression that actually mattered to users — a heavier download — sailed through every check I had, because none of my checks measured it.

Size regressions don't break builds and don't throw exceptions, so unless you measure them deliberately, they are invisible. I had gates for dependency updates and audit scripts for most other things, yet size was still on a "glance at the Play Console occasionally" basis. Since then I've given size its own budget and put an Antigravity agent on weekly watch duty. This article is the implementation record.

Where Size Quietly Grows — Three Typical Paths

Looking back over a year of size increases in my own apps, nearly all of them came from three paths.

Path	Typical example	Delta per event	How easy to miss
Dependency additions/updates	Ad adapters, analytics SDKs, UI libraries	0.5–4MB	High (lockfile diffs don't show megabytes)
Asset additions	Bundled wallpapers, onboarding videos, fonts	0.2–10MB	Medium (the person adding them knows, but nothing records it)
Shrinker/config drift	Looser R8/ProGuard keep rules, App Thinning misconfiguration	1–8MB	Highest (nothing was "added", yet size grew)

The third path is the nasty one. Widening a -keep rule while chasing a crash, or touching build settings, looks size-neutral in a code review. Any detection scheme based on classifying changes will eventually miss one of these. The only approach that doesn't leak is measuring the resulting number itself, every week.

Don't Measure the AAB File Size

This is the mistake I made in my first implementation: recording the size of the .aab artifact from CI. That number doesn't correspond to anything a user experiences. Play generates split APKs per device configuration from the AAB, so the actual download is much smaller than the bundle. What you want is bundletool's per-device download size.

# Get the per-device download size from an AAB
# (generate the .apks archive with build-apks first)
# --overwrite lets repeated unattended runs reuse the same output path.
bundletool build-apks \
  --bundle=app/build/outputs/bundle/release/app-release.aab \
  --output=/tmp/app.apks \
  --overwrite \
  --ks="$KEYSTORE" --ks-key-alias="$ALIAS" \
  --ks-pass="pass:$KS_PASS" --key-pass="pass:$KEY_PASS"
 
# Prints MIN and MAX across device configs. Gate on MAX.
bundletool get-size total --apks=/tmp/app.apks
# Example output:
# MIN,MAX
# 21436512,24893440

I gate on MAX rather than MIN because the budget should hold even for the least favorable device configuration. On iOS, Xcode's App Thinning Size Report plays the same role: export the archive with thinning enabled and parse the largest compressed variant from App Thinning Size Report.txt.

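# iOS: extract the largest thinned download size from the report.
# Match only the "compressed" figure; each variant line also lists an uncompressed size,
# which a plain grep for "compressed" would pick up as well.
grep -oE '[0-9.]+ MB compressed' "App Thinning Size Report.txt" \
  | sort -rn | head -1
# Example output: 31.2 MB compressed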

Both measurements must be reproducible from the command line rather than read off a store dashboard — that's the precondition for the unattended runs described below.
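
To feed the bundletool number into a ledger, the two-line CSV has to become a single figure in megabytes. Here is a minimal sketch, assuming the header-plus-values shape shown in the example output. `parseGetSizeTotal` is a hypothetical helper name, and the 1024 × 1024 divisor is an assumption about what "MB" means here; swap in 1,000,000 if your budget is defined in decimal megabytes.

```javascript
// Hypothetical helper: turn `bundletool get-size total` CSV into a MAX figure in MB.
// Assumes a header line containing MAX, followed by one line of byte counts.
function parseGetSizeTotal(csv, bytesPerMb = 1024 * 1024) {
  const lines = csv.trim().split(/\r?\n/).map(l => l.trim()).filter(Boolean);
  if (lines.length < 2) return null; // nothing to parse: measurement unavailable

  const header = lines[0].split(",");
  const values = lines[1].split(",");

  // Look the column up by name instead of by position.
  const maxIndex = header.indexOf("MAX");
  if (maxIndex === -1) return null; // unexpected shape: let the caller report it

  const maxBytes = Number(values[maxIndex]);
  if (!Number.isFinite(maxBytes) || maxBytes <= 0) return null;

  return { maxBytes, maxMb: Math.round((maxBytes / bytesPerMb) * 100) / 100 };
}

console.log(parseGetSizeTotal("MIN,MAX\n21436512,24893440"));
// { maxBytes: 24893440, maxMb: 23.74 }
```

Returning `null` rather than throwing leaves it to the caller to decide whether an unparseable measurement should stop the run.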

The Size Ledger Schema — Budgets, Measurements, and Exceptions in One JSON

Weekly measurements go into a Git-tracked ledger, size-ledger.json. The design decision that mattered: budgets, history, and approved exceptions live in one file. Split them across files and one of them will stop being updated — I've watched it happen.

{
  "budgets": {
    "android_download_max_mb": 26.0,
    "ios_thinned_max_mb": 34.0,
    "warn_ratio": 0.92
  },
  "exceptions": [
    {
      "id": "2026-05-mediation-adapters",
      "delta_mb": 3.1,
      "reason": "Liftoff/InMobi/Unity Ads adapters (revenue requirement)",
      "approved": "2026-05-21"
    }
  ],
  "history": [
    { "date": "2026-06-26", "android_mb": 23.7, "ios_mb": 31.2, "commit": "a1b2c3d" },
    { "date": "2026-07-03", "android_mb": 23.9, "ios_mb": 31.2, "commit": "e4f5a6b" }
  ]
}
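
Because one file carries three kinds of data, a small sanity check before each run can catch hand-editing mistakes. The following is a sketch, not part of the original tooling: `validateLedger` is a hypothetical name, and the rules it enforces (positive budgets, a ratio between 0 and 1, dated exceptions, chronological history) are assumptions inferred from the example ledger.

```javascript
// Hypothetical validator for the ledger shape shown above.
// Returns a list of human-readable problems; an empty list means nothing was flagged.
function validateLedger(ledger) {
  const problems = [];
  const b = ledger.budgets ?? {};

  for (const key of ["android_download_max_mb", "ios_thinned_max_mb"]) {
    if (!(Number.isFinite(b[key]) && b[key] > 0)) {
      problems.push(`budgets.${key} must be a positive number`);
    }
  }
  if (!(b.warn_ratio > 0 && b.warn_ratio <= 1)) {
    problems.push("budgets.warn_ratio must be in (0, 1]");
  }

  // Every exception needs a written reason and a YYYY-MM-DD approval date.
  for (const e of ledger.exceptions ?? []) {
    if (!e.reason || !/^\d{4}-\d{2}-\d{2}$/.test(e.approved ?? "")) {
      problems.push(`exception ${e.id ?? "(no id)"} needs a reason and an approval date`);
    }
  }

  // Assumption: history is append-only, so ISO dates must never go backwards.
  const dates = (ledger.history ?? []).map(h => h.date);
  for (let i = 1; i < dates.length; i++) {
    if (dates[i] < dates[i - 1]) problems.push(`history out of order at ${dates[i]}`);
  }
  return problems;
}
```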

The budget values themselves are app-specific, but as a starting point I used the current measurement plus six months of growth at the past year's rate. A budget that's too tight turns exception requests into routine, and a routine exception is no gate at all. warn_ratio fires a warning at 92% of budget: a buffer zone that tells you you're drifting before anything blocks.
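
The arithmetic behind those two numbers can be sketched in a few lines. `startingBudget` and `warnThreshold` are hypothetical helper names, and the inputs to `startingBudget` below are illustrative values, not measurements from my apps.

```javascript
// Hypothetical helper: starting budget = current size + half of the past year's growth
// (that is, six months of growth at last year's rate), rounded to 0.1 MB.
function startingBudget(currentMb, growthPastYearMb) {
  return Math.round((currentMb + growthPastYearMb / 2) * 10) / 10;
}

// The size above which the gate starts warning, rounded to 0.01 MB.
function warnThreshold(budgetMb, warnRatio) {
  return Math.round(budgetMb * warnRatio * 100) / 100;
}

// Illustrative inputs only.
console.log(startingBudget(20.0, 3.0)); // 21.5

// With the example ledger's budgets and warn_ratio:
console.log(warnThreshold(26.0, 0.92)); // 23.92
console.log(warnThreshold(34.0, 0.92)); // 31.28
```

Printing the warning threshold next to the budget once makes it obvious how much headroom is left before the first WARN.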

Implementing size-gate.mjs — From Measurement to Verdict

Measurement, ledger append, and verdict fit in one dependency-free Node script.

// size-gate.mjs — download size budget gate
// usage: node size-gate.mjs --android 23.9 --ios 31.2 --commit e4f5a6b
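import { readFileSync, writeFileSync } from "node:fs";
import { parseArgs } from "node:util";
 
// Parse --android, --ios, and --commit; unknown flags throw.
const { values: args } = parseArgs({
  options: {
    android: { type: "string" },
    ios: { type: "string" },
    commit: { type: "string" },
  },
});
const ledger = JSON.parse(readFileSync("size-ledger.json", "utf8"));
const { budgets, history } = ledger;
const android = parseFloat(args.android);
const ios = parseFloat(args.ios);
// A missing or non-numeric size would compare as NaN and slip through as PASS.
if (!Number.isFinite(android) || !Number.isFinite(ios)) {
  console.error("measurement unavailable: pass numeric --android and --ios values in MB");
  process.exit(2); // distinct from the BLOCK exit code
}
const prev = history[history.length - 1];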
 
const checks = [
  { name: "android", value: android, budget: budgets.android_download_max_mb, prev: prev?.android_mb },
  { name: "ios", value: ios, budget: budgets.ios_thinned_max_mb, prev: prev?.ios_mb },
];
 
let status = "PASS";
for (const c of checks) {
  const delta = c.prev ? (c.value - c.prev).toFixed(2) : "n/a";
  if (c.value > c.budget) {
    status = "BLOCK";
    console.log(`❌ ${c.name}: ${c.value}MB > budget ${c.budget}MB (Δ ${delta}MB)`);
  } else if (c.value > c.budget * budgets.warn_ratio) {
    if (status === "PASS") status = "WARN";
    console.log(`⚠️ ${c.name}: ${c.value}MB (${(c.value / c.budget * 100).toFixed(0)}% of budget, Δ ${delta}MB)`);
  } else {
    console.log(`✅ ${c.name}: ${c.value}MB (Δ ${delta}MB)`);
  }
}
 
// Append to the ledger regardless of verdict — gaps in history hurt the most
ledger.history.push({
  date: new Date().toISOString().slice(0, 10),
  android_mb: android, ios_mb: ios, commit: args.commit ?? "unknown",
});
writeFileSync("size-ledger.json", JSON.stringify(ledger, null, 2) + "\n");
console.log(status);
process.exit(status === "BLOCK" ? 1 : 0);

Notice the script only judges and records. Investigating why the gate failed is handed to the agent in the next section. Keeping judgment and investigation in separate layers turned out to matter: the judging side stays boring and trustworthy.
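
One way to keep the judging side boring is to isolate the comparison as a pure function that can be unit-tested without a ledger file. This is a sketch that mirrors the comparisons in size-gate.mjs; `verdict` and `worst` are hypothetical names.

```javascript
// Sketch: the per-platform verdict as a pure function.
// Uses the same strict comparisons as size-gate.mjs.
function verdict(valueMb, budgetMb, warnRatio) {
  if (valueMb > budgetMb) return "BLOCK";
  if (valueMb > budgetMb * warnRatio) return "WARN";
  return "PASS";
}

// The overall status is the worst of the per-platform verdicts.
function worst(statuses) {
  const rank = { PASS: 0, WARN: 1, BLOCK: 2 };
  return statuses.reduce((a, s) => (rank[s] > rank[a] ? s : a), "PASS");
}

console.log(verdict(26.0, 26.0, 0.92)); // WARN
console.log(worst([verdict(23.9, 26.0, 0.92), verdict(34.5, 34.0, 0.92)])); // BLOCK
```

Because the comparisons are strict, a build that lands exactly on the budget is a WARN, not a BLOCK. If you want "at budget" to block, that is a one-character change worth making deliberately.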

Letting the Agent Attribute the Delta — Two Constraints That Stop Hallucination

When the gate reports WARN or BLOCK, I hand the attribution work to an Antigravity agent. bundletool alone won't tell you which module or asset grew, so the agent compares the current build against the previous baseline with apkanalyzer, which ships with the Android SDK.

# Per-file size comparison between baseline and current builds
apkanalyzer apk compare --different-only baseline.apk current.apk
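
The comparison output can be long, so it helps to pre-sort it before anyone reads it. Below is a rough sketch under one stated assumption: that each output line has the form old size, new size, difference, path, separated by whitespace, with directory entries ending in a slash. Check that against your apkanalyzer version before relying on it. `topGrowth` is a hypothetical helper.

```javascript
// Hypothetical helper: rank the largest growths in `apkanalyzer apk compare` output.
// Assumed line shape: <old size> <new size> <difference> <path>
function topGrowth(compareOutput, limit = 3) {
  const rows = [];
  for (const raw of compareOutput.split(/\r?\n/)) {
    const line = raw.trim();
    const m = line.match(/^(\d+)\s+(\d+)\s+(-?\d+)\s+(.+)$/);
    if (!m) continue; // skip anything that doesn't fit the assumed shape

    const path = m[4];
    if (path.endsWith("/")) continue; // directory totals would double-count their files

    // Keep the verbatim line so it can be quoted as evidence later.
    rows.push({ path, deltaBytes: Number(m[3]), line });
  }
  return rows
    .filter(r => r.deltaBytes > 0)
    .sort((a, b) => b.deltaBytes - a.deltaBytes)
    .slice(0, limit);
}
```

Keeping the original line alongside each parsed row means whoever does the attribution, human or agent, can cite the exact evidence rather than a paraphrase.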

The instructions live in a fixed guidance file in the repository so every run gets the same constraints. Two constraints earned their keep:

Verbatim citation required. Any claim like "the growth comes from X" must quote the exact apkanalyzer output line or the exact git log entry. If it can't cite, it must label the claim "unverified". Before I added this, the agent occasionally named plausible-sounding libraries that simply weren't in the diff.
At most three candidates, ranked by confidence. Exhaustive listings cancel out the speed the gate bought you. Three ranked candidates, each with a single command a human should run next to confirm.
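
Both constraints can also be checked mechanically after the agent responds, instead of trusting the guidance file alone. Here is a sketch, assuming the agent returns candidates as objects with `claim`, `citation`, and `nextCommand` fields; that report shape and the `checkAttribution` name are hypothetical, not something Antigravity prescribes.

```javascript
// Hypothetical check on an agent report: at most three candidates, and every
// citation must appear verbatim in the evidence (apkanalyzer output or git log).
function checkAttribution(candidates, evidenceText) {
  if (candidates.length > 3) {
    return { ok: false, reason: "more than three candidates", candidates: [] };
  }
  const evidenceLines = new Set(
    evidenceText.split(/\r?\n/).map(l => l.trim()).filter(Boolean)
  );
  const checked = candidates.map(c => ({
    ...c,
    // A claim that cannot be matched to a real evidence line is downgraded, not dropped.
    status: c.citation && evidenceLines.has(c.citation.trim()) ? "cited" : "unverified",
  }));
  return { ok: true, candidates: checked };
}
```

Matching whole lines is deliberately strict: a citation that is only "close" to the evidence gets labeled unverified, which is the behavior the first constraint asks for.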

With this shape, going from a BLOCK to the offending commit takes roughly half the time it took by hand: around 20 minutes instead of 40 in my experience. The agent's real contribution is the first move, which is knowing where to start looking.

Wiring It into a Weekly Unattended Run — WARN and BLOCK, Nothing Else

The measurement runs once a week, Friday mornings, as a background run. I deliberately did not run it on every build: a full bundletool pass takes several minutes, and size regressions are not a problem you need to catch within hours. Weekly is early enough, and it keeps the cost trivial.

The operating rules are two-tier and boring on purpose:

WARN (above 92% of budget): recorded in the ledger and put on the weekly review agenda. Nothing stops.
BLOCK (over budget): the next store submission is held. To let it through, you write the reason and approval date into exceptions first, then revise the budget.

Writing exceptions down instead of silently raising the budget is a gift to your future self. The +8.4MB from the three mediation adapters got squeezed to +3.1MB — R8 rule cleanup plus excluding unused adapter resources — before being recorded as an exception. Keeping that order, shrink first and except second, is what keeps the budget honest.
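
The "exception first, budget second" order can be enforced in code rather than by convention. The following is a sketch only: `approveException` is a hypothetical helper, and its validation rules are assumptions based on the fields in the example ledger.

```javascript
// Hypothetical helper: record an exception, then raise the budget, in that order.
// Returns a new ledger object; throws if the paperwork is incomplete.
function approveException(ledger, exception, budgetKey, newBudgetMb) {
  const { id, delta_mb, reason, approved } = exception;
  if (!id || !reason || !/^\d{4}-\d{2}-\d{2}$/.test(approved ?? "")) {
    throw new Error("exception needs an id, a reason, and an approval date (YYYY-MM-DD)");
  }
  if (!Number.isFinite(delta_mb) || delta_mb <= 0) {
    throw new Error("exception needs a positive delta_mb");
  }
  if (!(newBudgetMb > ledger.budgets[budgetKey])) {
    throw new Error("new budget must be higher than the current one");
  }
  // The input ledger is left untouched so a failed write cannot leave it half-updated.
  return {
    ...ledger,
    exceptions: [...ledger.exceptions, { id, delta_mb, reason, approved }],
    budgets: { ...ledger.budgets, [budgetKey]: newBudgetMb },
  };
}
```

With a helper like this as the only sanctioned way to change a budget, a silent budget bump shows up in review as a direct edit to the JSON.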

The scheduling itself reuses the setup from After Migrating to Antigravity CLI: Deciding How Much to Delegate to Scheduled Runs. And since dependency updates are the top source of size growth, this gate pairs naturally with Stop Treating Dependency Updates as a Monthly Chore — Weekly Agent Runs with Semver Risk Triage and Verification Gates: the dependency gate guards the entrance, the size gate guards the exit.

Six Weeks of Real Numbers from the Wallpaper Apps

Here are six weeks of measurements after rollout. Wallpaper apps are asset-heavy, so the absolute deltas run large, but the pattern should transfer to most apps.

Week	Android DL (MB)	iOS thinned (MB)	Verdict	Notes
W1	23.6	31.0	PASS	Baseline established
W2	23.7	31.2	PASS	—
W3	25.1	31.2	WARN	Four bundled wallpapers added; WebP recompression fixed it by W4
W4	24.0	31.2	PASS	Recompressed assets landed
W5	24.0	33.8	WARN	iOS only; two font families embedded twice
W6	24.0	31.4	PASS	Duplicate fonts removed

W5's duplicate fonts were a pure size regression with zero functional symptoms — exactly the kind of thing I would have shipped and never noticed without the gate. Put differently: six weeks surfaced two regressions I'm confident I would previously have missed. That's the effect size, measured.
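
The same two regressions fall out of the numbers mechanically. As a sketch, the snippet below takes the rows from the table above and lists every week-over-week increase of at least 1.0 MB; the `jumps` helper and that 1.0 MB threshold are hypothetical and separate from the budget gate itself.

```javascript
// Rows copied from the six-week table above.
const weeks = [
  { week: "W1", android: 23.6, ios: 31.0 },
  { week: "W2", android: 23.7, ios: 31.2 },
  { week: "W3", android: 25.1, ios: 31.2 },
  { week: "W4", android: 24.0, ios: 31.2 },
  { week: "W5", android: 24.0, ios: 33.8 },
  { week: "W6", android: 24.0, ios: 31.4 },
];

// Hypothetical helper: list week-over-week increases at or above a threshold (MB).
function jumps(rows, thresholdMb = 1.0) {
  const found = [];
  for (let i = 1; i < rows.length; i++) {
    for (const platform of ["android", "ios"]) {
      // Round to 0.1 MB to avoid floating-point noise in the subtraction.
      const delta = Math.round((rows[i][platform] - rows[i - 1][platform]) * 10) / 10;
      if (delta >= thresholdMb) found.push({ week: rows[i].week, platform, delta });
    }
  }
  return found;
}

console.log(jumps(weeks));
// [ { week: 'W3', platform: 'android', delta: 1.4 },
//   { week: 'W5', platform: 'ios', delta: 2.6 } ]
```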

Pitfalls Worth Knowing in Advance

bundletool version drift: the CSV shape of get-size total has shifted subtly between versions. For unattended runs, pin the bundletool jar in the repository rather than resolving it implicitly.
Measuring a universal APK: an APK built with --mode=universal packs the code and resources for every device configuration into a single file, so it overstates what any one device downloads and is not a proxy for download size. Run get-size against a default-mode .apks archive.
iOS report format changes: the App Thinning Size Report format can change across major Xcode releases. Don't hang the parse on one regex. If extraction comes back empty, report "measurement unavailable" instead of failing the gate, so that a format change in a future Xcode release surfaces as a visible gap rather than silently killing the weekly run.
Confusing CDN compression with your numbers: loosening the budget because store delivery compresses transfers is tempting and unstable — compression ratios are content-dependent. Compare bundletool and thinning-report numbers against the budget directly.
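
For the iOS report pitfall, the "measurement unavailable" fallback might look like the sketch below. It assumes variant lines of the form "... size: N MB compressed, N MB uncompressed"; `largestCompressedMb` is a hypothetical helper, and the 1024-based unit conversion is an assumption to verify against your own reports.

```javascript
// Hypothetical helper: largest compressed size (in MB) from an App Thinning Size Report.
// Matches "<number> <unit> compressed" only, so uncompressed figures are ignored.
function largestCompressedMb(reportText) {
  const toMb = { KB: 1 / 1024, MB: 1, GB: 1024 }; // assumed binary units
  const sizes = [];
  for (const m of reportText.matchAll(/([0-9]+(?:\.[0-9]+)?) (KB|MB|GB) compressed/g)) {
    sizes.push(Number(m[1]) * toMb[m[2]]);
  }
  // No match means the format changed or the file is empty: say so, don't guess.
  if (sizes.length === 0) return { status: "measurement unavailable", mb: null };
  return { status: "ok", mb: Math.round(Math.max(...sizes) * 10) / 10 };
}
```

A caller can then record the gap in the ledger and flag it for review, which keeps a parse failure distinct from a real size regression.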

Bundled assets disappearing on specific density buckets is a different failure mode, covered separately in A Few Low-Density Phones Lost Their Bundled Wallpaper — The drawable vs nodpi Boundary in Play's Density Splits.

Start by Writing One Line in the Ledger

You don't need the whole machine on day one. Run bundletool get-size total once, today, and write the number into size-ledger.json as the first history entry. Budgets, gates, and agents can all come later. The difference between having one recorded baseline and having none is the difference between checking a fact and reconstructing one, the next time you wonder whether the app got heavier.

Size regressions are quiet problems, easy to defer precisely because nothing visibly breaks. If I hadn't shipped an 8MB increase without noticing, I'd probably still be glancing at dashboards occasionally. I hope this design saves you that particular detour.
