ANTIGRAVITY LAB
App Development · 2026-06-13 · Advanced

Making Apple Foundation Models and Gemini Interchangeable: A Three-Tier Abstraction for In-App AI

After WWDC26 opened Apple Foundation Models to qualifying developers and announced server-side Gemini integration, I redesigned my apps around a three-tier abstraction — on-device, Private Cloud Compute, and third-party APIs — behind a single Swift protocol.


The day after WWDC26 wrapped, I opened the source of one of my production apps and winced. The Gemini API client was being called directly from view-layer code in far more places than I remembered.

With Apple Foundation Models now opened free of charge to qualifying developers, and a server-side integration announced that lets you call Claude or Gemini through the same Swift API, hard-wiring a specific model deep into your codebase is a debt your future self will have to pay.

For an independent developer, switching model providers is not a contingency — it is an annual event. Over the past two years I have swapped the backends for summarization, translation, and image description more times than I care to count, and each swap meant editing app code. I wanted that to stop. What follows is the three-tier abstraction I settled on, with the Swift code to back it up.

Why provider-coupled code breaks down within a year

The problem with direct coupling is not that the code stops working. It keeps working — and becomes impossible to change.

Three stiffening points show up in practice.

  • Provider-specific request shapes leak into the UI layer. If your view models assemble Gemini request structs directly, a provider change becomes a UI-layer refactor
  • The code cannot keep pace with pricing and eligibility changes. Apple's free tier reportedly draws the line at two million first-time downloads. Lines like that move as your app grows and policies get revised. Rewriting every call site each time eligibility shifts is not realistic
  • Fallbacks become unwritable. A cascade like "try on-device, then fall back to the cloud" is only implementable when every call goes through one unified entry point
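
The single entry point and the fallback cascade described above can be sketched roughly as follows. This is a minimal illustration, not the article's actual implementation: `AIRequest`, `AIResponse`, `AIProvider`, `StubProvider`, and `FallbackProvider` are all hypothetical names, and the stub stands in for whatever a real backend would do, so no vendor SDK is called.

```swift
import Foundation

// Hypothetical provider-neutral request: no Gemini- or Apple-specific fields,
// so view models never assemble a vendor request shape.
struct AIRequest {
    var prompt: String
    var maxOutputTokens: Int = 256
}

struct AIResponse {
    var text: String
    var providerName: String
}

enum AIError: Error {
    case unavailable(String)
    case allProvidersFailed
}

// The one entry point that call sites depend on (assumed name).
protocol AIProvider: Sendable {
    var name: String { get }
    func generate(_ request: AIRequest) async throws -> AIResponse
}

// Stub backend for illustration; a real conformer would wrap a vendor SDK here.
struct StubProvider: AIProvider {
    let name: String
    let isAvailable: Bool

    func generate(_ request: AIRequest) async throws -> AIResponse {
        guard isAvailable else { throw AIError.unavailable(name) }
        return AIResponse(text: "[\(name)] \(request.prompt)", providerName: name)
    }
}

// Tries each provider in order and returns the first success.
// Every error is treated as "try the next one"; a production version
// would likely distinguish cancellation and non-retryable failures.
struct FallbackProvider: AIProvider {
    let name = "fallback"
    let chain: [any AIProvider]

    func generate(_ request: AIRequest) async throws -> AIResponse {
        for provider in chain {
            do {
                return try await provider.generate(request)
            } catch {
                continue
            }
        }
        throw AIError.allProvidersFailed
    }
}
```

Because `FallbackProvider` itself conforms to `AIProvider`, a call site cannot tell a single backend from a cascade, which is what makes the cascade writable in one place.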

In my own app, direct imports of the Gemini client were scattered across 14 files. Seeing that number is what finally pushed me to build the abstraction layer.

Think in three tiers from summer 2026 onward

After WWDC26, the AI execution environments available to an iOS app settle into three tiers.

  • Tier 1, on-device (the Foundation Models framework). Lowest latency, works offline, costs nothing extra. It trails the upper tiers in vocabulary and long-form coherence, but it is plenty for classification, short generation, and keyword extraction
  • Tier 2, Private Cloud Compute. The tier covered by the newly announced free access. It accepts image input and, while off-device, runs on Apple's privacy infrastructure
  • Tier 3, third-party APIs such as the Gemini API. The highest performance ceiling, billed per use. The announced server-side integration is expected to expose this tier through the same Swift API as well
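
One way to make these tiers usable in code is to model them as a type whose properties mirror the traits listed above. The sketch below is an assumption-laden illustration: `AITier`, `TierConstraints`, and `eligibleTiers(for:)` are hypothetical names, and the trait values encode only what the list states (Tier 1 works offline, Tiers 2 and 3 leave the device, Tier 3 is billed per use). Free access to Tier 2 is assumed to apply, which holds only for a qualifying developer.

```swift
// Hypothetical model of the three tiers, in declaration order from
// closest-to-device to furthest.
enum AITier: String, CaseIterable {
    case onDevice
    case privateCloudCompute
    case thirdPartyAPI

    // Only the on-device tier is described as working offline.
    var worksOffline: Bool { self == .onDevice }
    // Tiers 2 and 3 both send the request off the device.
    var leavesDevice: Bool { self != .onDevice }
    // Only the third-party tier is described as billed per use
    // (assumes the developer qualifies for free Tier 2 access).
    var billedPerUse: Bool { self == .thirdPartyAPI }
}

// Constraints a single request might impose (assumed shape).
struct TierConstraints {
    var mustWorkOffline = false
    var mayLeaveDevice = true
    var mayIncurPerUseCost = true
}

// Returns the tiers that satisfy the constraints, cheapest and closest first.
func eligibleTiers(for constraints: TierConstraints) -> [AITier] {
    AITier.allCases.filter { tier in
        (!constraints.mustWorkOffline || tier.worksOffline)
            && (constraints.mayLeaveDevice || !tier.leavesDevice)
            && (constraints.mayIncurPerUseCost || !tier.billedPerUse)
    }
}
```

For example, a request that must not incur per-use cost is left with the on-device and Private Cloud Compute tiers, in that order.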

The key design move is not deciding which tier each feature uses. It is to first build a structure that can move a feature between tiers, so that the decision itself stays swappable. The decision criteria come later in this article.
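
To illustrate what a swappable decision could look like, the sketch below keeps the feature-to-tier mapping as data rather than code, so changing it means editing a table instead of call sites. Everything here is hypothetical: the `RoutingTable` type, the JSON layout, and the feature keys are assumptions, and the tier assignments in the sample are placeholders, not the article's recommendations.

```swift
import Foundation

// Hypothetical tier identifiers, decodable from plain strings.
enum AITier: String, Codable {
    case onDevice
    case privateCloudCompute
    case thirdPartyAPI
}

// Maps a feature name to its ordered tier preference.
// Features without an entry use the fallback order.
struct RoutingTable: Codable {
    var routes: [String: [AITier]]
    var fallback: [AITier]

    func tiers(for feature: String) -> [AITier] {
        routes[feature] ?? fallback
    }
}

// Placeholder configuration; in an app this could be bundled or fetched remotely.
let json = """
{
  "routes": {
    "summarization": ["onDevice", "privateCloudCompute"],
    "imageDescription": ["privateCloudCompute", "thirdPartyAPI"]
  },
  "fallback": ["onDevice"]
}
"""

var table = try JSONDecoder().decode(RoutingTable.self, from: Data(json.utf8))

// Moving a feature to another tier is a data change, not a call-site change.
table.routes["summarization"] = [.thirdPartyAPI]
```

Call sites would ask only `table.tiers(for:)`, so which tier serves a feature can change without touching them.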

