Catching Download Size Regressions Before Submission Day — A Weekly Agent Gate for AAB/IPA Size Budgets
Mediation SDKs and bundled assets quietly inflate download size. A design for size ledgers, budget gates, and agent-driven delta attribution using bundletool and App Thinning reports.
In May 2026 I expanded the ad mediation stack in the wallpaper apps I run as an indie developer, adding Liftoff, InMobi, and Unity Ads adapters. The revenue configuration worked exactly as intended. What I only noticed after pushing an internal-test build was that the per-device download size had grown by more than 8MB. Functional tests passed. No crashes. The regression that actually mattered to users — a heavier download — sailed through every check I had, because none of my checks measured it.
Size regressions don't break builds and don't throw exceptions, so unless you measure them deliberately, they are invisible. I had gates for dependency updates and audit scripts for most other things, yet size was still on a "glance at the Play Console occasionally" basis. Since then I've given size its own budget and put an Antigravity agent on weekly watch duty. This article is the implementation record.
Where Size Quietly Grows — Three Typical Paths
Looking back over a year of size increases in my own apps, nearly all of them came from three paths.
| Path | Typical example | Delta per event | How easy to miss |
|---|---|---|---|
| Dependency additions/updates | Ad adapters, analytics SDKs, UI libraries | 0.5–4MB | High (lockfile diffs don't show megabytes) |
| Asset additions | Bundled wallpapers, onboarding videos, fonts | 0.2–10MB | Medium (the person adding them knows, but nothing records it) |
The third path, build configuration changes, is the nasty one. Widening a -keep rule while chasing a crash, or touching build settings, looks size-neutral in a code review. Any detection scheme based on classifying changes will eventually miss one of these. The only approach that doesn't leak is measuring the resulting number itself, every week.
Don't Measure the AAB File Size
This is the mistake I made in my first implementation: recording the size of the .aab artifact from CI. That number doesn't correspond to anything a user experiences. Play generates split APKs per device configuration from the AAB, so the actual download is much smaller than the bundle. What you want is bundletool's per-device download size.
# Get the per-device download size from an AAB
# (generate the .apks archive with build-apks first)
bundletool build-apks \
  --bundle=app/build/outputs/bundle/release/app-release.aab \
  --output=/tmp/app.apks \
  --ks="$KEYSTORE" --ks-key-alias="$ALIAS" \
  --ks-pass="pass:$KS_PASS" --key-pass="pass:$KEY_PASS"

# Prints MIN and MAX across device configs. Gate on MAX.
bundletool get-size total --apks=/tmp/app.apks
# Example output:
# MIN,MAX
# 21436512,24893440
I gate on MAX rather than MIN because the budget should hold even for the least favorable device configuration. On iOS, Xcode's App Thinning Size Report plays the same role: export the archive with thinning enabled and parse the largest compressed variant from App Thinning Size Report.txt.
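# iOS: extract the largest thinned download size from the report.
# Match only the "compressed" figure: each report line also carries an
# uncompressed size, which would otherwise win the sort.
grep -oE '[0-9.]+ MB compressed' "App Thinning Size Report.txt" \
  | sort -rn | head -1
# Example output: 31.2 MB compressed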
Both measurements must be reproducible from the command line rather than read off a store dashboard — that's the precondition for the unattended runs described below.
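A minimal sketch of turning the bundletool output into a single number for the gate. The function name parseGetSizeTotal is hypothetical. It finds the MAX column by header name rather than by position, on the assumption that a header row is always printed, and the MiB conversion in the usage line is only one possible choice of unit.

```javascript
// Hypothetical helper: extract the MAX download size (bytes) from the CSV
// printed by `bundletool get-size total`. Assumes a header row containing
// a MAX column, as in the example output above.
function parseGetSizeTotal(csvText) {
  const lines = csvText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length < 2) {
    throw new Error("expected a header row and at least one data row");
  }
  const header = lines[0].split(",").map((name) => name.trim().toUpperCase());
  const maxIndex = header.indexOf("MAX");
  if (maxIndex === -1) {
    throw new Error(`no MAX column in header: ${lines[0]}`);
  }
  // If the output has several data rows, keep the largest MAX value.
  const values = lines.slice(1).map((line) => Number(line.split(",")[maxIndex]));
  if (values.some((value) => !Number.isFinite(value))) {
    throw new Error("non-numeric MAX value in get-size output");
  }
  return Math.max(...values);
}

const sampleCsv = "MIN,MAX\n21436512,24893440\n";
const maxBytes = parseGetSizeTotal(sampleCsv);
console.log(maxBytes, (maxBytes / (1024 * 1024)).toFixed(1) + " MiB");
```

Throwing on an unexpected shape, instead of returning 0, keeps a format change from being recorded as a suspiciously good measurement.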
The Size Ledger Schema — Budgets, Measurements, and Exceptions in One JSON
Weekly measurements go into a Git-tracked ledger, size-ledger.json. The design decision that mattered: budgets, history, and approved exceptions live in one file. Split them across files and one of them will stop being updated — I've watched it happen.
The budget values themselves are app-specific, but as a starting point I used the current measurement plus six months of growth at the past year's rate. A budget that's too tight turns exception requests into routine, and a routine exception is no gate at all. warn_ratio fires a warning at 92% of budget: a buffer zone that tells you you're drifting before anything blocks.
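One way the ledger and the starting budget could look, sketched as a JavaScript object that serializes to size-ledger.json. Only warn_ratio, exceptions, and the idea of a history list come from the text. Every other field name (budgets, budget_bytes, platform, bytes, verdict), the helper startingBudget, and all numbers are illustrative assumptions.

```javascript
// Hypothetical ledger shape. Field names other than `warn_ratio`,
// `exceptions`, and `history` are assumptions; all numbers are placeholders.
const MB = 1024 * 1024;

// Starting budget: current measurement plus six months of growth at the
// past year's rate, i.e. half of the last twelve months' increase.
function startingBudget(currentBytes, bytesOneYearAgo) {
  const yearlyGrowth = Math.max(0, currentBytes - bytesOneYearAgo);
  return Math.round(currentBytes + yearlyGrowth / 2);
}

const exampleLedger = {
  warn_ratio: 0.92,
  budgets: {
    android: { budget_bytes: startingBudget(24 * MB, 20 * MB) },
  },
  history: [
    { date: "2026-06-05", platform: "android", bytes: 24 * MB, verdict: "PASS" },
  ],
  // Each exception would carry a reason and an approval date.
  exceptions: [],
};

console.log(JSON.stringify(exampleLedger, null, 2));
```

With these placeholder inputs the budget is 26 MB: 24 MB today plus half of the 4 MB gained over the year.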
Implementing size-gate.mjs — From Measurement to Verdict
Measurement, ledger append, and verdict fit in one dependency-free Node script.
Notice the script only judges and records. Investigating why the gate failed is handed to the agent in the next section. Keeping judgment and investigation in separate layers turned out to matter: the judging side stays boring and trustworthy.
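The judge-and-record core might look roughly like the sketch below. This is not the actual size-gate.mjs: the function names judge and record, the ledger field names, and the sample numbers are all assumptions, and file I/O is left out so the logic stays easy to test.

```javascript
// Hypothetical verdict function following the rules in this article:
// WARN above warn_ratio of budget, BLOCK above the budget itself.
function judge(measuredBytes, budgetBytes, warnRatio = 0.92) {
  if (!Number.isFinite(measuredBytes) || !(budgetBytes > 0)) {
    throw new Error("measurement must be a number and budget must be positive");
  }
  if (measuredBytes > budgetBytes) return "BLOCK";
  if (measuredBytes > budgetBytes * warnRatio) return "WARN";
  return "PASS";
}

// Append one measurement without mutating the input ledger, so the caller
// decides when (and whether) to write the file back.
function record(ledger, platform, measuredBytes, date) {
  const budgetBytes = ledger.budgets[platform].budget_bytes;
  const verdict = judge(measuredBytes, budgetBytes, ledger.warn_ratio);
  const entry = { date, platform, bytes: measuredBytes, verdict };
  return { verdict, ledger: { ...ledger, history: [...ledger.history, entry] } };
}

// Illustrative numbers only: a 26,000,000-byte budget is made up.
const gateLedger = {
  warn_ratio: 0.92,
  budgets: { android: { budget_bytes: 26000000 } },
  history: [],
};
const gateResult = record(gateLedger, "android", 24893440, "2026-06-05");
console.log(gateResult.verdict, gateResult.ledger.history.length);
```

Nothing in this layer knows about dependencies, assets, or commits, which is what lets it stay boring.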
Letting the Agent Attribute the Delta — Two Constraints That Stop Hallucination
When the gate reports WARN or BLOCK, I hand the attribution work to an Antigravity agent. bundletool alone won't tell you which module or asset grew, so the agent compares the current build against the previous baseline with apkanalyzer, which ships with the Android SDK.
# Per-file size comparison between baseline and current builds
apkanalyzer apk compare --different-only baseline.apk current.apk
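Before the agent sees it, the comparison output can be reduced to its largest increases. The sketch below assumes each output line carries old size, new size, difference, and path in that order, separated by whitespace, and that directory roll-ups end in a slash. Verify both assumptions against your apkanalyzer version. The name topGrowers and the sample paths are hypothetical.

```javascript
// Hypothetical helper: rank the largest increases from
// `apkanalyzer apk compare --different-only` output.
// Assumption: each line is "<old size> <new size> <difference> <path>".
// Lines that do not match that shape are skipped.
function topGrowers(compareOutput, limit = 3) {
  const rowPattern = /^(\d+)\s+(\d+)\s+(-?\d+)\s+(.+)$/;
  return compareOutput
    .split(/\r?\n/)
    .map((line) => rowPattern.exec(line.trim()))
    .filter((match) => match !== null)
    .map((match) => ({
      oldBytes: Number(match[1]),
      newBytes: Number(match[2]),
      deltaBytes: Number(match[3]),
      path: match[4],
      raw: match[0], // verbatim line, kept so it can be quoted later
    }))
    // Drop directory roll-ups (paths ending in "/") to avoid double counting,
    // and anything that shrank or stayed the same.
    .filter((row) => !row.path.endsWith("/") && row.deltaBytes > 0)
    .sort((a, b) => b.deltaBytes - a.deltaBytes)
    .slice(0, limit);
}

// Made-up sample in the assumed format.
const sampleCompare = [
  "100\t180\t80\t/",
  "40\t90\t50\t/classes.dex",
  "10\t40\t30\t/res/",
  "10\t40\t30\t/res/raw/intro.mp4",
  "20\t10\t-10\t/assets/old.bin",
].join("\n");

for (const row of topGrowers(sampleCompare)) {
  console.log(`+${row.deltaBytes}\t${row.path}`);
}
```

Keeping the raw line alongside the parsed fields gives the agent something exact to quote.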
The instructions live in a fixed guidance file in the repository so every run gets the same constraints. Two constraints earned their keep:
Verbatim citation required. Any claim like "the growth comes from X" must quote the exact apkanalyzer output line or the exact git log entry. If it can't cite, it must label the claim "unverified". Before I added this, the agent occasionally named plausible-sounding libraries that simply weren't in the diff.
At most three candidates, ranked by confidence. Exhaustive listings cancel out the speed the gate bought you. Three ranked candidates, each with a single command a human should run next to confirm.
With this shape, going from a BLOCK to the offending commit takes roughly half what it took by hand — around 20 minutes instead of 40 in my experience. The agent's real contribution is the first move: knowing where to start looking.
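Both constraints can also be checked mechanically once the agent has answered. A minimal sketch, assuming the agent returns a JSON report with a candidates array whose entries have claim, citation, status, confidence, and nextCommand fields. That shape is invented here for illustration and is not Antigravity's actual output format.

```javascript
// Hypothetical validator for an attribution report. `evidenceText` is the
// concatenated apkanalyzer output and git log that the agent was given.
function checkReport(report, evidenceText) {
  const problems = [];
  const evidenceLines = new Set(
    evidenceText.split(/\r?\n/).map((line) => line.trim()).filter(Boolean)
  );
  if (report.candidates.length > 3) {
    problems.push(`too many candidates: ${report.candidates.length}`);
  }
  report.candidates.forEach((candidate, index) => {
    const label = `candidate ${index + 1}`;
    const citation = (candidate.citation || "").trim();
    const cited = citation !== "" && evidenceLines.has(citation);
    // A claim must quote an evidence line verbatim or be marked unverified.
    if (!cited && candidate.status !== "unverified") {
      problems.push(`${label}: citation not found verbatim in evidence`);
    }
    if (!candidate.nextCommand) {
      problems.push(`${label}: missing confirmation command`);
    }
    const previous = report.candidates[index - 1];
    if (previous && candidate.confidence > previous.confidence) {
      problems.push(`${label}: not ranked by confidence`);
    }
  });
  return problems;
}

// Made-up evidence and report for illustration.
const sampleEvidence = "40\t90\t50\t/classes.dex\nabc1234 Add mediation adapters";
const sampleAgentReport = {
  candidates: [
    {
      claim: "classes.dex grew after the adapter commit",
      citation: "40\t90\t50\t/classes.dex",
      confidence: 0.8,
      nextCommand: "git show --stat abc1234",
    },
  ],
};
console.log(checkReport(sampleAgentReport, sampleEvidence));
```

An empty array means the report respects both constraints. Anything else goes back to the agent rather than to a human.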
Wiring It into a Weekly Unattended Run — WARN and BLOCK, Nothing Else
The measurement runs once a week, Friday mornings, as a background run. I deliberately did not run it on every build: a full bundletool pass takes several minutes, and size regressions are not a problem you need to catch within hours. Weekly is early enough, and it keeps the cost trivial.
The operating rules are two-tier and boring on purpose:
WARN (above 92% of budget): recorded in the ledger and put on the weekly review agenda. Nothing stops.
BLOCK (over budget): the next store submission is held. To let it through, you write the reason and approval date into exceptions first, then revise the budget.
Writing exceptions down instead of silently raising the budget is a gift to your future self. The +8.4MB from the three mediation adapters got squeezed to +3.1MB — R8 rule cleanup plus excluding unused adapter resources — before being recorded as an exception. Keeping that order, shrink first and except second, is what keeps the budget honest.
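The hold rule could be expressed as a small check before submission. This is a sketch under stated assumptions: the function name canSubmit, the field names platform, reason, and approved, and the rule that an exception must be approved on or after the blocking measurement are all guesses for illustration. The dates in the sample are placeholders.

```javascript
// Hypothetical pre-submission check. A BLOCK verdict holds the submission
// unless a written exception (reason + approval date) covers it.
// Dates are ISO strings (YYYY-MM-DD), so string comparison orders them.
function canSubmit(latestEntry, exceptions) {
  if (latestEntry.verdict !== "BLOCK") {
    return { allowed: true, why: latestEntry.verdict };
  }
  const covering = exceptions.find(
    (ex) =>
      ex.platform === latestEntry.platform &&
      typeof ex.reason === "string" &&
      ex.reason.trim() !== "" &&
      typeof ex.approved === "string" &&
      ex.approved >= latestEntry.date
  );
  return covering
    ? { allowed: true, why: `exception approved ${covering.approved}` }
    : { allowed: false, why: "over budget with no recorded exception" };
}

const blockedEntry = { date: "2026-06-05", platform: "android", verdict: "BLOCK" };
const recordedExceptions = [
  {
    platform: "android",
    approved: "2026-06-08",
    reason: "three mediation adapters, +3.1MB after R8 cleanup",
  },
];
console.log(canSubmit(blockedEntry, []));
console.log(canSubmit(blockedEntry, recordedExceptions));
```

Requiring a non-empty reason is the point: the check cannot be satisfied by quietly editing a number.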
Here are six weeks of measurements after rollout. Wallpaper apps are asset-heavy, so the absolute deltas run large, but the pattern should transfer to most apps.
| Week | Android DL (MB) | iOS thinned (MB) | Verdict | Notes |
|---|---|---|---|---|
| W1 | 23.6 | 31.0 | PASS | Baseline established |
| W2 | 23.7 | 31.2 | PASS | - |
| W3 | 25.1 | 31.2 | WARN | Four bundled wallpapers added; WebP recompression fixed it by W4 |
| W4 | 24.0 | 31.2 | PASS | Recompressed assets landed |
| W5 | 24.0 | 33.8 | WARN | iOS only; two font families embedded twice |
| W6 | 24.0 | 31.4 | PASS | Duplicate fonts removed |
W5's duplicate fonts were a pure size regression with zero functional symptoms — exactly the kind of thing I would have shipped and never noticed without the gate. Put differently: six weeks surfaced two regressions I'm confident I would previously have missed. That's the effect size, measured.
Pitfalls Worth Knowing in Advance
bundletool version drift: the CSV shape of get-size total has shifted subtly between versions. For unattended runs, pin the bundletool jar in the repository rather than resolving it implicitly.
Measuring a universal APK: an APK built with --mode=universal bundles the code and resources for every device configuration into one unsplit APK, so it is not a proxy for download size. Run get-size against a default-mode .apks archive.
iOS report format changes: the App Thinning Size Report format can change across major Xcode releases. Don't hang the parse on one regex. If extraction comes back empty, report "measurement unavailable" instead of failing the gate, so a format change in an Xcode 26-series update can't silently kill the weekly run.
Confusing CDN compression with your numbers: loosening the budget because store delivery compresses transfers is tempting and unstable — compression ratios are content-dependent. Compare bundletool and thinning-report numbers against the budget directly.
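For the iOS report pitfall, a tolerant parser might try a strict pattern first, fall back to a looser one, and return null when neither matches. The sketch assumes report lines of the form "App size: 31.2 MB compressed, 78.4 MB uncompressed". The function name largestCompressedMB and the sample report text are hypothetical.

```javascript
// Hypothetical parser for "App Thinning Size Report.txt". Returns the
// largest compressed size in MB, or null when nothing can be extracted
// (treated as "measurement unavailable", not as a gate failure).
function largestCompressedMB(reportText) {
  const unitToMB = { KB: 1 / 1024, MB: 1, GB: 1024 };
  const patterns = [
    // Strict: only the per-variant "App size:" lines.
    /^App size:\s*([\d.]+)\s*(KB|MB|GB) compressed/gm,
    // Looser fallback in case the line prefix changes in a future Xcode.
    /([\d.]+)\s*(KB|MB|GB) compressed/g,
  ];
  for (const pattern of patterns) {
    const sizes = [...reportText.matchAll(pattern)]
      .map((match) => Number(match[1]) * unitToMB[match[2]])
      .filter((size) => Number.isFinite(size));
    if (sizes.length > 0) return Math.max(...sizes);
  }
  return null;
}

// Made-up report excerpt in the assumed format.
const sampleThinningReport = [
  "Variant: Example-A.ipa",
  "App size: 31.2 MB compressed, 78.4 MB uncompressed",
  "Variant: Example-B.ipa",
  "App size: 28.9 MB compressed, 70.1 MB uncompressed",
].join("\n");

console.log(largestCompressedMB(sampleThinningReport) ?? "measurement unavailable");
console.log(largestCompressedMB("format changed entirely") ?? "measurement unavailable");
```

Returning null rather than 0 matters: a zero would pass the gate and enter the ledger as a real measurement.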
You don't need the whole machine on day one. Run bundletool get-size total once, today, and write the number into size-ledger.json as the first history entry. Budgets, gates, and agents can all come later. The difference between having one recorded baseline and having none is the difference between checking a fact and reconstructing one, the next time you wonder whether the app got heavier.
Size regressions are quiet problems, easy to defer precisely because nothing visibly breaks. If I hadn't shipped an 8MB increase without noticing, I'd probably still be glancing at dashboards occasionally. I hope this design saves you that particular detour.