Only Slow Right After Install — Cutting Android Cold-Start Time with Baseline Profiles, Measured
Why an Android app stutters on launch only right after install or update, explained through JIT and Cloud Profiles, plus a measured walkthrough of cutting cold-start time with Baseline Profiles — from building a Macrobenchmark harness to staged rollout, from an indie developer's perspective.
On a fresh device, the very first tap into my wallpaper app left half a beat of silence before the home grid appeared. Every launch after that was instant. Slow only in release builds, and only when the install was brand new — a symptom that nagged at me quietly for about half a year.
When you run the same app for years as an indie developer, your own device has a fully warmed profile, so you never feel this slowness yourself. I only caught it when a review said "heavy the first time." The cause lived in the timing of JIT and Cloud Profiles, and a Baseline Profile shrank it in a way I could actually measure. Here is the path I took, starting with how to measure before you change anything.
Why it's slow only when the install is new
Right after installation, most of an Android app's code runs interpreted or through the JIT (just-in-time) compiler. As hot paths get exercised, ART records them in a profile and compiles them ahead of time in the background, which is why a device you've used for a while launches faster. That is the real reason "it doesn't stutter after the first time."
Google Play has a mechanism called Cloud Profiles that aggregates execution profiles from many users and ships a somewhat warm state to new installs. But it only kicks in once enough usage has accumulated after release. A new install right after launch — or an app update, where the profile is reset — sees little of that benefit.
A Baseline Profile lets you enumerate the hot paths for startup and your busiest screens ahead of time, bundle them into the APK / AAB, and have them AOT-compiled at install. Instead of waiting for Cloud Profiles to mature, you ship a warm state from the very first launch. My symptom was exactly this "cold only at the start" state, so it lined up well.
Build something measurable first — the Macrobenchmark module
Before touching any optimization, get to a state you can reproduce in numbers. Skip this and add only the profile, and you'll never know whether it helped or you imagined it. The first time I added a Baseline Profile, I could only describe it by feel, which made for an unconvincing note to myself.
For measurement I use Macrobenchmark. Separate from the app itself, you add a dedicated com.android.test module.
// build.gradle.kts of the :macrobenchmark module
plugins {
    id("com.android.test")
    id("org.jetbrains.kotlin.android")
}

android {
    namespace = "com.example.wallpaper.macrobenchmark"
    compileSdk = 35

    defaultConfig {
        minSdk = 24
        targetSdk = 35
        // Runs on a physical device against a release-like build
        testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
    }

    targetProjectPath = ":app"
    experimentalProperties["android.experimental.self-instrumenting"] = true
}

dependencies {
    implementation("androidx.test.ext:junit:1.2.1")
    implementation("androidx.test.uiautomator:uiautomator:2.3.0")
    implementation("androidx.benchmark:benchmark-macro-junit4:1.3.3")
}
On the app side, make it profileable so the benchmark can launch and measure it. Use a release-like benchmark build type rather than a debug build, and the numbers reflect reality much better.
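As a rough sketch of what "release-like" can mean here, the app module could declare a dedicated build type along the lines below. The name `benchmark` and the debug signing config are assumptions for illustration, not this app's confirmed setup.

```kotlin
// app/build.gradle.kts (sketch, not the article's actual configuration)
android {
    buildTypes {
        // "benchmark" is an assumed name; any release-like, non-debuggable type works.
        create("benchmark") {
            initWith(getByName("release"))                    // inherit the release settings (R8, shrinking)
            signingConfig = signingConfigs.getByName("debug") // installable without a release key
            matchingFallbacks += listOf("release")            // library modules fall back to release
            isDebuggable = false                              // debuggable builds distort timings
        }
    }
}

// The <application> element in AndroidManifest.xml additionally needs:
//   <profileable android:shell="true" />
```

If you go this route, the test module needs a build type with the same name so Gradle can pair the two variants.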
The test that measures startup itself looks like this. StartupTimingMetric reports the time from tap to first frame (Time To Initial Display) with a median.
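// StartupBenchmark.kt
import androidx.benchmark.macro.BaselineProfileMode
import androidx.benchmark.macro.CompilationMode
import androidx.benchmark.macro.StartupMode
import androidx.benchmark.macro.StartupTimingMetric
import androidx.benchmark.macro.junit4.MacrobenchmarkRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class StartupBenchmark {
    @get:Rule
    val rule = MacrobenchmarkRule()

    private fun measure(mode: CompilationMode) = rule.measureRepeated(
        packageName = "com.example.wallpaper",
        metrics = listOf(StartupTimingMetric()),
        iterations = 15, // higher count so the spread is visible
        startupMode = StartupMode.COLD,
        compilationMode = mode,
    ) {
        pressHome()
        startActivityAndWait()
    }

    // Bare state with no profile (the worst-case baseline)
    @Test
    fun coldStartupNone() = measure(CompilationMode.None())

    // With the Baseline Profile required
    @Test
    fun coldStartupBaselineProfile() =
        measure(CompilationMode.Partial(BaselineProfileMode.Require))
}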
I set iterations to 15 because cold start has wide variance depending on device state. With one or two runs the median won't settle, and you'll misjudge the improvement.
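To make the median and P90 concrete, here is a minimal sketch of a nearest-rank percentile over 15 samples. Macrobenchmark computes its own statistics, so this is only an illustration. The function name `percentile` and the sample timings are made up for the sketch and are not measurements from the app.

```kotlin
import kotlin.math.ceil

// Hypothetical cold-start timings in ms (15 runs, one slow outlier). Not real data.
val sampleRuns = listOf(
    510.0, 498.0, 530.0, 505.0, 890.0, 515.0, 502.0, 520.0,
    508.0, 512.0, 499.0, 525.0, 507.0, 518.0, 503.0,
)

/** Nearest-rank percentile over an unsorted sample list (p in 1..100). */
fun percentile(samples: List<Double>, p: Int): Double {
    require(samples.isNotEmpty()) { "need at least one sample" }
    require(p in 1..100) { "p must be in 1..100" }
    val sorted = samples.sorted()
    // Nearest rank: the smallest value with at least p% of samples at or below it.
    val rank = ceil(p / 100.0 * sorted.size).toInt()
    return sorted[rank - 1]
}

fun main() {
    println("P50 = ${percentile(sampleRuns, 50)} ms")
    println("P90 = ${percentile(sampleRuns, 90)} ms")
}
```

With 15 samples the single 890 ms outlier leaves P50 and P90 untouched, whereas with only one or two runs it could be the whole result.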
This is the crux. CompilationMode.None() is the worst-case baseline that uses no profile at all; CompilationMode.Partial(BaselineProfileMode.Require) represents the app with the Baseline Profile bundled and required. Run the same test under both modes and you isolate exactly the difference the profile made.
Specifying Require is deliberate. With the lenient BaselineProfileMode.UseIfAvailable (the mode CompilationMode.DEFAULT uses), measurement passes silently even when no profile is present, so you can "measure" a state where nothing was actually bundled. During measurement, use Require so you only ever measure a state where the profile truly applied. I once misjudged a run as "no effect" because the profile wasn't applied, and lost half a day to it.
You can run it from the Android Studio gutter or the command line. For CI, the command is easier to wire up.
Once you can measure, generate the profile itself. Generation also happens in a dedicated com.android.test module, handled by the androidx.baselineprofile plugin.
// build.gradle.kts of the :baselineprofile module
plugins {
    id("com.android.test")
    id("org.jetbrains.kotlin.android")
    id("androidx.baselineprofile")
}

dependencies {
    implementation("androidx.test.ext:junit:1.2.1")
    implementation("androidx.test.uiautomator:uiautomator:2.3.0")
    implementation("androidx.benchmark:benchmark-macro-junit4:1.3.3")
}
The generator reproduces the routes real users take. Launch, scroll the grid a little, open a detail — trace your busiest path honestly. Whatever you enumerate here becomes the target for AOT compilation.
// BaselineProfileGenerator.kt
import androidx.benchmark.macro.junit4.BaselineProfileRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import androidx.test.uiautomator.By
import androidx.test.uiautomator.Direction
import androidx.test.uiautomator.Until
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class BaselineProfileGenerator {
    @get:Rule
    val rule = BaselineProfileRule()

    @Test
    fun generate() = rule.collect(
        packageName = "com.example.wallpaper",
        // also optimize dex layout for the startup classes
        includeInStartupProfile = true,
    ) {
        pressHome()
        startActivityAndWait()

        // reproduce the actual primary route
        val grid = device.findObject(By.res(packageName, "wallpaper_grid"))
        grid.fling(Direction.DOWN)
        grid.fling(Direction.UP)
        device.findObject(By.res(packageName, "thumbnail_0")).click()
        device.wait(Until.hasObject(By.res(packageName, "detail_image")), 5_000)
    }
}
Adding includeInStartupProfile = true also produces a Startup Profile that pulls startup-route classes toward the front of the dex. That is a separate axis from AOT: it improves the on-disk locality of the classes read at launch. It's subtle, but it helps most on low-end devices.
Here is the generation command. When it finishes, a text-form profile is written into the release source set.
./gradlew :app:generateReleaseBaselineProfile
# example output: app/src/release/generated/baselineProfiles/baseline-prof.txt
Bundle the generated profile into the app
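You add only two things on the app side. One is androidx.profileinstaller, which is responsible for handing the profile to ART at install time. Many Jetpack libraries pull it in transitively, but declaring it explicitly is safer. The other is the baselineProfile(...) dependency on the :baselineprofile module, which pulls the generated profile into the app build.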
// app/build.gradle.kts
plugins {
    id("com.android.application")
    id("org.jetbrains.kotlin.android")
    id("androidx.baselineprofile")
}

dependencies {
    implementation("androidx.profileinstaller:profileinstaller:1.4.1")
    baselineProfile(project(":baselineprofile"))
}

baselineProfile {
    // don't regenerate on every release build; the profile committed
    // to the release source set is bundled as-is
    automaticGenerationDuringBuild = false
}
Setting automaticGenerationDuringBuild to false is an indie-developer call. Regenerating the profile on every build makes a physical device mandatory and slows both CI and local builds. I regenerate by hand only before a release and commit the artifact. With baseline-prof.txt in Git, even builds that skip regeneration always bundle the latest profile.
With that in place, build the release AAB and confirm assets/dexopt/baseline.prof is present. If it isn't, nothing you measure will show an effect.
./gradlew :app:bundleRelease
unzip -l app/build/outputs/bundle/release/app-release.aab | grep baseline.prof
# seeing base/assets/dexopt/baseline.prof and baseline.profm means it bundled
How much it shrank in my wallpaper app
For one of the wallpaper apps I run, the kind that assembles a grid on launch, I measured cold start 15 times each on a physical Pixel 6a. The numbers are the median (P50) of Time To Initial Display, plus P90 to show the spread.
| Condition | P50 (ms) | P90 (ms) | vs. bare |
| --- | --- | --- | --- |
| CompilationMode.None (no profile) | 724 | 918 | baseline |
| Baseline Profile applied (Require) | 503 | 612 | ~30% faster |
| Well-used device (matured Cloud Profile) | 486 | 574 | reference |
What stands out is that the "brand new" number with a Baseline Profile (503ms) nearly matched the well-used device whose profile had matured (486ms). In other words, the Baseline Profile front-loads, from the first install, the state Cloud Profiles take time to reach. The P90 shrank a lot too, and that calmer worst case is what directly answered the "heavy the first time" review.
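For anyone who wants to check the arithmetic behind "~30% faster," the sketch below recomputes the relative gains from the table's own numbers. The helper name `percentFaster` is made up for illustration.

```kotlin
import java.util.Locale

/** Relative reduction from beforeMs to afterMs, as a percentage of beforeMs. */
fun percentFaster(beforeMs: Double, afterMs: Double): Double {
    require(beforeMs > 0) { "beforeMs must be positive" }
    return (beforeMs - afterMs) / beforeMs * 100.0
}

fun main() {
    // P50 and P90 values taken from the measurement table.
    val p50Gain = percentFaster(724.0, 503.0)
    val p90Gain = percentFaster(918.0, 612.0)
    // Remaining P50 gap between the profiled fresh install and the well-used device.
    val gapToMaturedMs = 503.0 - 486.0

    println(String.format(Locale.US, "P50 gain: %.1f%%", p50Gain))
    println(String.format(Locale.US, "P90 gain: %.1f%%", p90Gain))
    println(String.format(Locale.US, "Gap to matured profile: %.0f ms", gapToMaturedMs))
}
```

The P50 gain works out to about 30.5% and the P90 gain to about 33.3%, so the tail improved slightly more than the median, with 17 ms left between the profiled fresh install and the matured device.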
One caveat about the numbers. The TTID from StartupTimingMetric is "until the first frame," so if you want to measure until the app is genuinely usable — images loaded — call reportFullyDrawn() from the screen and watch Time To Full Display as well. I treat TTFD as the primary metric only on screens where thumbnail loading is heavy.
Where I tripped, and how I keep it going solo
The first thing that stumped me was a profile that looked bundled but had no effect. The cause was running measurement with CompilationMode.Partial(BaselineProfileMode.UseIfAvailable), which passed through even with no profile applied. Always use Require during measurement and you avoid it.
The second is profile freshness. If you don't regenerate after a big change to a screen, you optimize the old hot paths and leave the new route cold. I forgot this after rebuilding the wallpaper detail screen, and took the long way around when a startup I thought I'd fixed felt heavy again. Since adding one line — "regenerate the Baseline Profile" — to my pre-release checklist, it's been stable.
Roll out gradually while confirming you haven't broken startup metrics. I always start at 5% and widen to 25% → 50% → 100% while watching Play Console startup time and ANR rate. Profile-related defects are rare, but profileinstaller can fail to install on a small subset of devices, and a staged release lets you stop there. For the same family of release-only Android pitfalls, reading about tracking down a release-only R8 and Gson crash alongside this covers the obfuscation-side traps. If you're adding localization, switching the app's language independently of device settings touches the startup route too, so include it in the generator's path.
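The 5% → 25% → 50% → 100% widening can be written down as a small decision rule, which helps when you are the only person watching the dashboard. The sketch below is hypothetical: the type `RolloutMetrics`, the function `nextStage`, and both tolerance values are assumptions for illustration, not Play Console APIs or thresholds from this app.

```kotlin
// Hypothetical sketch: names and tolerances are assumptions, not Play Console values.

data class RolloutMetrics(val startupP50Ms: Double, val anrRatePercent: Double)

// Rollout percentages used in the staged release.
val stages = listOf(5, 25, 50, 100)

/** Returns the next rollout percentage, or the current one if metrics regressed. */
fun nextStage(
    currentPercent: Int,
    baseline: RolloutMetrics,
    observed: RolloutMetrics,
    maxStartupRegression: Double = 0.05, // assumed tolerance: up to 5% slower
    maxAnrIncrease: Double = 0.05,       // assumed tolerance: +0.05 percentage points
): Int {
    val startupOk =
        observed.startupP50Ms <= baseline.startupP50Ms * (1 + maxStartupRegression)
    val anrOk = observed.anrRatePercent <= baseline.anrRatePercent + maxAnrIncrease
    // Hold at the current stage when either metric regressed beyond tolerance.
    if (!startupOk || !anrOk) return currentPercent
    return stages.firstOrNull { it > currentPercent } ?: 100
}

fun main() {
    val baseline = RolloutMetrics(startupP50Ms = 500.0, anrRatePercent = 0.20)
    val healthy = RolloutMetrics(startupP50Ms = 510.0, anrRatePercent = 0.21)
    val regressed = RolloutMetrics(startupP50Ms = 600.0, anrRatePercent = 0.21)

    println(nextStage(5, baseline, healthy))    // widens to the next stage
    println(nextStage(25, baseline, regressed)) // holds at the current stage
}
```

Writing the rule down, even informally, makes "stop there" a decision made in advance rather than one made while staring at a noisy graph.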
If you try just one thing, add a measurement module and run CompilationMode.None and Partial(Require) 15 times each to see how far apart your app's P50 sits. That gap is the headroom a Baseline Profile can win back. Thanks for reading.