Vetting AI Studio's Native Android Code Before It Reaches Your Live App

AI Studio's native Android vibe coding produces working screens at startling speed. But before it goes into a live app, it needs its own vetting. Here is a pre-merge review design for generated Kotlin.

The first time I tried AI Studio's native Android vibe coding, a single prompt stood up an entire settings screen and I caught my breath. Layout and navigation both worked. Then I went to drop that generated code straight into an app that had been running for years, and I hesitated. Working in a fresh project and behaving correctly as part of a live app are two different things.

The app I maintain as an indie developer carries conventions settled over years of operation, things you cannot infer from a screen alone. Generated code knows none of that context. So here I will design the vetting that AI Studio's Kotlin passes through before it enters a live app, split into what a machine filters and what a human reviews.

Why "works in a fresh project" is not "safe in production"

Vibe-coded output is correct in isolation. The question is whether it meshes with an existing app's assumptions. The generator does not know your established dependency-injection style, how you share state across screens, your custom Activity base class, or the threading contract the whole app honors. None of that is visible in a screenshot, so generated code tends to be written in a way that works in the moment but causes incidents in the app's context.

The three areas that break quietly in production

Three areas came up again and again in pre-merge review.

Area	What generated code tends to do	What happens in production
Lifecycle	Holds state without accounting for Activity recreation	State vanishes on rotation or resume
Memory leaks	Passes Context or a View to a long-lived object	Memory climbs as you move between screens
Threading	Calls I/O on the main thread	ANRs and jank on slow devices

None of these surface in a short emulator session. That is exactly why you need a machine layer before relying on human eyes.

Make the pre-merge gate act only on the diff

The first defense is static analysis. But running Detekt over the whole app buries the generated-code issues under existing warnings. So apply a stricter ruleset to only the files in this import.

# detekt-generated.yml: strict profile for imported generated code
# Each rule sets active explicitly so it runs even when this file is used without the default config.
complexity:
  LongMethod:
    active: true
    threshold: 40
  TooManyFunctions:
    active: true
    thresholdInClasses: 12
potential-bugs:
  Deprecation:
    active: true
performance:
  SpreadOperator:
    active: true
style:
  ForbiddenComment:
    active: true
    comments: ['TODO', 'FIXME', 'STOPSHIP']

Narrow the run to changed files with a shell wrapper.

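#!/usr/bin/env bash
set -euo pipefail
# Only the .kt files added or modified on the import branch (deleted files are skipped)
CHANGED=$(git diff --name-only --diff-filter=ACMR origin/main...HEAD -- '*.kt')
if [ -z "$CHANGED" ]; then echo "nothing to check"; exit 0; fi

# The detekt CLI takes its targets as a comma-separated --input list, not as positional arguments
INPUT=$(echo "$CHANGED" | paste -sd, -)

# Run the generated-only profile against the changed files only.
# detekt exits non-zero when it reports issues, so set -e stops the script before the success message.
detekt \
  --config detekt-generated.yml \
  --input "$INPUT" \
  --report txt:build/detekt-generated.txt
echo "✅ diff-only static analysis passed"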

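The three-dot form origin/main...HEAD compares HEAD against its merge base with origin/main, so you get only what changed since the branch point. Narrowing to the diff rather than scanning everything is the key; it alone removes most of the noise from pre-existing warnings and leaves the findings specific to the generated code.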

Lifecycle and leaks need more than a machine

Static analysis catches problems with a fixed shape, such as main-thread I/O, but it cannot fully answer the design question of whether a given way of holding state will survive recreation. That part needs a human reviewer, so fix the lens they look through.

The four points I always check:

Does state that must survive rotation live somewhere recreation-proof (a ViewModel, etc.)?
Is everything that receives a Context or a View shorter-lived than the screen?
Is each coroutine's scope tied to the screen's lifetime and reliably canceled on exit?
Does it bypass the existing base class or shared navigation with its own implementation?

Fixing the lens to four points turned generated-code review from "read it and see" into "knock down these four in order," and misses dropped.
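One way to make the four points harder to skip is a small pre-screen that lists the lines a reviewer should open first. The sketch below is a heuristic text match, not a substitute for the review: the patterns are my own illustrative guesses (they assume the generated screens use Jetpack Compose and coroutines), and a clean result says nothing about whether the code is safe.

```kotlin
// Heuristic pre-screen for the four review points.
// It only indexes where a human should look; every pattern is an assumption.

data class Hint(val point: Int, val lineNumber: Int, val reason: String)

// (review point, pattern, why the line deserves a look)
val hintPatterns: List<Triple<Int, Regex, String>> = listOf(
    Triple(1, Regex("""\bremember\s*[({]"""),
        "Compose state that is neither rememberSaveable nor ViewModel-backed"),
    Triple(2, Regex("""\bva[lr]\s+\w+\s*:\s*(Context|Activity|View)\b"""),
        "property holding a Context/Activity/View; check the owner's lifetime"),
    Triple(3, Regex("""\bGlobalScope\b|\bCoroutineScope\s*\("""),
        "coroutine scope not obviously tied to the screen"),
    Triple(4, Regex(""":\s*(AppCompatActivity|ComponentActivity)\s*\("""),
        "extends a framework Activity directly; compare with the app's base class"),
)

// Returns hints ordered by review point, then by line, so they can be worked through in order.
fun reviewHints(source: String): List<Hint> =
    source.lines().withIndex()
        .flatMap { (index, line) ->
            hintPatterns
                .filter { (_, regex, _) -> regex.containsMatchIn(line) }
                .map { (point, _, reason) -> Hint(point, index + 1, reason) }
        }
        .sortedWith(compareBy<Hint>({ it.point }, { it.lineNumber }))
```

Feeding it the text of each changed file gives the reviewer a short list per point. The questions themselves still need someone who knows the app.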

Do not land 5,000 lines at once

The thing that helped most was not technique but how I imported. Vibe coding generates groups of screens, but landing them as one big change makes both review and rollback heavy all at once.

1. Split the output along feature boundaries (settings, list, detail...)
2. Branch one feature at a time and pass the pre-merge gate
3. Pass the human four-point review
4. Ship one feature to production; watch crash rate and memory for 2-3 days
5. If nothing's wrong, move to the next feature
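Step 1 can be partly mechanized when the generated code already sits in per-feature packages. A minimal sketch, assuming each feature is a directory directly under a hypothetical `featureRoot` (the path below is a placeholder, not my app's real layout); anything outside that root goes into a "shared" bucket to be sorted by hand.

```kotlin
// Groups changed Kotlin files by feature so each group can become its own branch.
// Assumption: features are package directories directly under `featureRoot`.

fun groupByFeature(
    changedFiles: List<String>,
    featureRoot: String = "app/src/main/java/com/example/app/ui/", // hypothetical layout
): Map<String, List<String>> =
    changedFiles
        .filter { it.endsWith(".kt") }
        .groupBy { path ->
            // Empty string when the file lives outside the feature root.
            val relative = path.substringAfter(featureRoot, missingDelimiterValue = "")
            // The first directory under the root names the feature; no directory means shared code.
            if ('/' in relative) relative.substringBefore('/') else "shared"
        }
```

The input can be the same file list the diff gate uses. The "shared" bucket deserves the most care, because code that several features depend on cannot be rolled back one feature at a time.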

In my case I landed the generated settings screen as the first feature, confirmed the crash rate was no different from usual, and only then moved on. Staging it means that if a problem appears, the cause is confined to one feature, so isolation is fast. Land everything at once and destabilize the whole app, and just figuring out which generated piece is at fault can eat days.
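The "watch for 2-3 days" step is easier to hold to when the comparison against the usual numbers is written down. The sketch below only summarizes the data; the go/no-go call stays with a human. The type names and both tolerance values are placeholders I made up for illustration, so tune them to your own baseline.

```kotlin
// Summarizes post-release metrics for one feature against the app's usual numbers.
// All thresholds are hypothetical defaults, not measured criteria.

data class DailyMetrics(val crashRatePercent: Double, val peakMemoryMb: Double)

enum class Signal { CLEAR, KEEP_WATCHING, INVESTIGATE }

fun rolloutSignal(
    baseline: DailyMetrics,
    observed: List<DailyMetrics>,         // one entry per day since the feature shipped
    minDays: Int = 2,                     // lower end of the 2-3 day watch window
    crashTolerancePoints: Double = 0.1,   // hypothetical: allowed rise in percentage points
    memoryToleranceRatio: Double = 1.10,  // hypothetical: allowed growth over baseline
): Signal {
    // Any single bad day is enough to stop and look.
    val regressed = observed.any { day ->
        day.crashRatePercent > baseline.crashRatePercent + crashTolerancePoints ||
            day.peakMemoryMb > baseline.peakMemoryMb * memoryToleranceRatio
    }
    return when {
        regressed -> Signal.INVESTIGATE
        observed.size < minDays -> Signal.KEEP_WATCHING
        else -> Signal.CLEAR
    }
}
```

A regression on any day wins over the day count, so a feature that looks bad on day one is flagged immediately instead of waiting out the window.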

Where to delegate and where a human takes over

Finally, a decision table. Vibe coding is fast, but it blurs where responsibility sits, so drawing the line up front saves hesitation.

Step	Owner	Why
Screen scaffolding	AI Studio	The speed benefit is largest here
Static analysis / diff gate	Machine	Problems with a fixed shape are caught reliably by a machine
Lifecycle / design review	Human	Only a human holds the app-specific context
Go/no-go for production	Human	Reads the observed data and owns the final call

If the speed of generation pulls you along and you hand even those last two rows to a machine, a bad release in a live app is hard to walk back. Use generation freely; lock down the vetting with a static gate, the four points, and staged rollout. That combination is the approach I now use in my own indie development. I hope it helps steady your footing if you want to bring this new generation experience into production.
