App Development · 2026-04-19 · Advanced

Building a Fully Automated iOS App Release Pipeline with Antigravity — From Screenshot Generation to App Store Review Management

Use Antigravity, App Store Connect API, and GitHub Actions to automate every step from build to App Store submission. A complete advanced guide covering AI screenshot captions, metadata optimization, and rejection analysis.

Tags: antigravity, ios, app-store, fastlane, github-actions, automation, gemini, indie-dev, cicd, app-store-connect-api

If you're an independent iOS developer, you know the feeling: more time spent on release prep than on actual development. Screenshots for every device size, metadata updates in multiple languages, TestFlight uploads, review waiting — and then it all repeats for the next version.

I ran multiple apps simultaneously, and at one point, release preparation alone took one to two weeks per cycle. That's time I wanted to spend writing code. After building a pipeline combining Antigravity with the App Store Connect API, that changed.

This guide shares the full implementation: the architectural decisions, the working scripts, and the hard-won lessons from running this in production. This isn't just "add Fastlane to your project." We're covering AI-generated screenshot captions, keyword optimization based on competitive analysis, and an agent that identifies the root cause of review rejections and proposes code-level fixes.

Why Automating "Build → Submit" Isn't Enough

Most developers who automate their release workflow stop at "build → upload to TestFlight → submit to App Store." That automation is valuable, but it covers a surprisingly small portion of the actual work.

Here's how my time broke down for a new app launch across five languages and six device sizes:

Screenshot capture and editing: ~8 hours
App Store metadata per language (title, description, keywords): ~6 hours
Build, signing, and upload: ~2 hours (already automated)
Review waiting + rejection handling (two cycles): ~10 hours

The existing automation covered about 8% of total release effort (2 of roughly 26 hours). The remaining 92% was content creation and review handling. That's where AI can make the real difference.

Pipeline Architecture and Component Design

The pipeline consists of three independent phases.

Phase 1 — Content Generation: An Antigravity agent reads the app's source code and design files, then orchestrates screenshot capture, caption generation, and metadata optimization.

Phase 2 — Automated Submission: Fastlane combined with the App Store Connect API takes the generated content and the build, then submits to TestFlight and the App Store.

Phase 3 — Review Monitoring: GitHub Actions detects rejection notifications, triggers an Antigravity agent, and generates a structured analysis with specific fix recommendations.

[Antigravity Agent]
    ↓ Reads source code + Figma designs
[Screenshot Capture] → Snapshot (Fastlane)
[Gemini Vision API] → Caption and description generation
[Competitive Analysis Agent] → Keyword optimization
    ↓
[GitHub Actions CI/CD]
    ↓
[Fastlane] → Build, sign, TestFlight upload
[deliver] → App Store metadata + screenshot submission
    ↓
[App Store Review Monitor] ← Webhook / polling
    ↓ On rejection
[Antigravity Review Agent] → Guideline analysis + fix proposals

The key design choice here is independence between phases. You can re-run screenshot generation without triggering a new build, or update metadata without going through the full pipeline. This matters when you're iterating on App Store optimization after the initial launch.
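
One way to keep the phases independently runnable is a small dispatcher that maps a phase name to the commands it owns. The sketch below is an assumption about how you might wire this, not something Antigravity or fastlane provides: the phase names and the `plan` function are hypothetical, the commands are the ones used in this guide, and `fastlane deliver` stands in for the deliver step in the diagram. It only builds the command list; executing it is left to your CI.

```python
# Hypothetical phase registry: each phase owns its commands and nothing else.
PHASES = {
    "content": [
        ["fastlane", "snapshot"],
        ["python", "scripts/generate_captions.py"],
        ["python", "scripts/optimize_metadata.py"],
    ],
    "submit": [
        ["fastlane", "gym"],
        ["fastlane", "pilot"],
        ["fastlane", "deliver"],
    ],
    "monitor": [
        ["python", "scripts/monitor_review.py", "--analyze-if-rejected"],
    ],
}


def plan(phases):
    """Return the flat list of commands for the requested phases, in order."""
    unknown = [p for p in phases if p not in PHASES]
    if unknown:
        raise ValueError(f"Unknown phase(s): {', '.join(unknown)}")
    return [cmd for p in phases for cmd in PHASES[p]]


# Re-run only content generation, with no build involved.
content_only = plan(["content"])
```

Because `plan(["content"])` never touches the build commands, iterating on screenshots or metadata after launch does not cost a new build.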

Phase 1: AI Screenshot Generation Pipeline

Automated Screenshot Capture with Snapshot

Fastlane Snapshot captures screenshots through UI tests, which doubles as basic regression testing. Here's the Snapfile configuration:

# fastlane/Snapfile
devices([
  "iPhone 16 Pro Max",
  "iPhone 16",
  "iPhone SE (3rd generation)",
  "iPad Pro 13-inch (M4)"
])
 
languages([
  "ja",
  "en-US",
  "zh-Hans",
  "ko",
  "fr-FR"
])
 
scheme "YourApp"
output_directory "./screenshots"
clear_previous_screenshots true
override_status_bar true
scale 1  # Always capture at 100% scale

In your UI test target, explicitly navigate to each screen you want to capture:

// UITests/ScreenshotTests.swift
import XCTest
 
final class ScreenshotTests: XCTestCase {
    
    override func setUpWithError() throws {
        continueAfterFailure = false
        let app = XCUIApplication()
        setupSnapshot(app)
        app.launch()
    }
    
    func testCaptureAllScreens() throws {
        let app = XCUIApplication()
        
        // Home screen
        snapshot("01_home")
        
        // Feature introduction (if tutorial exists)
        if app.buttons["Next"].exists {
            app.buttons["Next"].tap()
            snapshot("02_feature")
        }
        
        // Settings screen
        app.tabBars.buttons["Settings"].tap()
        snapshot("03_settings")
    }
}

Generating Captions with Gemini Vision API

This is where the real time savings happen. Rather than writing captions manually for each screenshot in each language, Gemini Vision analyzes the screenshot and generates App Store-appropriate copy:

# scripts/generate_captions.py
import os
import base64
import json
from pathlib import Path
import google.generativeai as genai
 
# IMPORTANT: Always load API keys from environment variables
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
 
def encode_image(image_path: str) -> str:
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")
 
def generate_caption(image_path: str, language: str, app_description: str) -> dict:
    """
    Generate App Store caption from a screenshot.
    
    Args:
        image_path: Path to the screenshot PNG
        language: Target language code ("en-US", "ja", etc.)
        app_description: Brief app description for context
    
    Returns:
        {"caption": str, "alt_text": str}
    """
    model = genai.GenerativeModel("gemini-2.5-pro")
    image_data = encode_image(image_path)
    
    prompt = f"""
    Look at this iOS app screenshot and write an App Store caption in {language}.
    
    App context: {app_description}
    
    Requirements:
    - Maximum 30 characters (App Store limit)
    - Focus on user benefit, not feature names
    - Sound natural to {language} native speakers
    - Action-oriented: what can users DO with this screen?
    
    Return JSON only:
    {{"caption": "caption text here", "alt_text": "accessibility description"}}
    """
    
    response = model.generate_content([
        {"mime_type": "image/png", "data": image_data},
        prompt
    ])
    
    try:
        result = json.loads(
            response.text.strip()
            .replace("```json", "")
            .replace("```", "")
        )
        return result
    except json.JSONDecodeError:
        # Fallback: use raw text, truncated to limit
        return {"caption": response.text[:30], "alt_text": response.text}
 
def process_all_screenshots(screenshots_dir: str, app_description: str):
    """Process entire screenshots directory and generate captions.json"""
    screenshots_path = Path(screenshots_dir)
    captions = {}
    
    for lang_dir in screenshots_path.iterdir():
        if not lang_dir.is_dir():
            continue
        
        lang_code = lang_dir.name
        captions[lang_code] = {}
        
        for screenshot in sorted(lang_dir.glob("*.png")):
            print(f"Processing {lang_code}/{screenshot.name}...")
            
            try:
                result = generate_caption(
                    str(screenshot),
                    lang_code,
                    app_description
                )
                captions[lang_code][screenshot.stem] = result
                print(f"  ✓ {result['caption']}")
            except Exception as e:
                # Don't fail the entire batch on a single error
                print(f"  ✗ Error: {e}")
                captions[lang_code][screenshot.stem] = {
                    "caption": "",
                    "alt_text": ""
                }
    
    output_path = screenshots_path / "captions.json"
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(captions, f, ensure_ascii=False, indent=2)
    
    print(f"\n✅ Caption generation complete: {output_path}")
    return captions
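
The fence-stripping inside `generate_caption` reappears in the later scripts, so it may be worth factoring into one helper. The following is a minimal sketch under that assumption; `extract_json` is a hypothetical name, not part of the Gemini SDK, and it only handles the two cases seen in this guide: a fenced JSON object and a JSON object surrounded by prose.

```python
import json

FENCE = "`" * 3  # Markdown code fence, built this way to keep the listing readable


def extract_json(text: str) -> dict:
    """Parse a JSON object from model output, tolerating Markdown code fences."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag).
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence if one is present.
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    # Fall back to the outermost braces if prose surrounds the object.
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in model output")
    return json.loads(cleaned[start:end + 1])
```

`json.JSONDecodeError` is a subclass of `ValueError`, so callers can catch a single exception type and decide whether to fall back to the raw text.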

Common Pitfall: API Rate Limiting

With five languages, six devices, and three screens per combination, you're making 90 API calls (60 with the four devices in the Snapfile above). Hitting rate limits is nearly guaranteed without backoff logic:

# scripts/utils/rate_limiter.py
import time
import random
from functools import wraps
 
def with_retry(max_retries: int = 3, base_delay: float = 1.0):
    """Exponential backoff retry decorator"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if "RATE_LIMIT" in str(e) or "429" in str(e):
                        if attempt < max_retries - 1:
                            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
                            print(f"  Rate limited. Retrying in {delay:.1f}s...")
                            time.sleep(delay)
                        else:
                            raise
                    else:
                        raise  # Non-rate-limit errors fail immediately
            return None
        return wrapper
    return decorator
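
Backoff reacts after a 429 has already happened; spacing the calls out can avoid many of them in the first place. Here is a minimal sketch of a client-side throttle that complements the retry decorator. The class name and the requests-per-minute figure are assumptions for illustration; check the quota that applies to your own API key.

```python
import time


class MinIntervalThrottle:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: int, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock      # injectable so the throttle can be tested
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the next slot from whichever is later: now or the reserved slot.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

In the caption loop, you would create one throttle (for example `MinIntervalThrottle(per_minute=10)`, a hypothetical budget) and call `throttle.wait()` immediately before each `generate_caption` call.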

Phase 2: AI-Driven App Store Metadata Optimization

Screenshots get users to look; metadata gets them to download. This phase uses Gemini to generate and optimize title, subtitle, description, and keywords for each supported language.

Defining the Agent's Behavior in AGENTS.md

Create an AGENTS.md at the project root to define how the Antigravity agent should handle metadata optimization:

# AGENTS.md - App Store Metadata Optimizer
 
## Role
You are an App Store metadata optimization specialist.
Read `app_info.json` and generate optimized metadata for each language.
 
## Output Format
Write files to `fastlane/metadata/{language}/`:
- `name.txt` (30 chars max)
- `subtitle.txt` (30 chars max)
- `description.txt` (4,000 chars max)
- `keywords.txt` (100 chars max, comma-separated, English only)
 
## Optimization Principles
1. Place highest-search-volume keywords first
2. First 255 characters of description appear before "More" fold — treat as your hook
3. Never mention competitor names (App Store guideline violation)
4. Write each language as a native speaker, not a translation

Metadata Generation Script

# scripts/optimize_metadata.py
import os
import json
from pathlib import Path
import google.generativeai as genai
 
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
 
def generate_metadata_for_language(
    app_info: dict,
    language: str,
    existing_metadata: dict = None
) -> dict:
    """
    Generate App Store metadata for a specific language.
    When existing_metadata is provided, the function acts as an improvement pass.
    """
    model = genai.GenerativeModel("gemini-2.5-pro")
    
    improvement_context = ""
    if existing_metadata:
        improvement_context = f"""
        Current metadata (identify what to improve):
        {json.dumps(existing_metadata, ensure_ascii=False, indent=2)}
        """
    
    prompt = f"""
    Optimize App Store metadata for the following iOS app, targeting {language} users.
 
    App information:
    {json.dumps(app_info, ensure_ascii=False, indent=2)}
    
    Hard constraints:
    - name: max 30 characters in {language}
    - subtitle: max 30 characters in {language}  
    - keywords: max 100 characters total, comma-separated, English only
    - description: max 4,000 characters, first 255 are most important
    
    {improvement_context}
    
    Return only valid JSON:
    {{
      "name": "...",
      "subtitle": "...",
      "keywords": "keyword1,keyword2,...",
      "description": "..."
    }}
    """
    
    response = model.generate_content(prompt)
    
    try:
        text = response.text.strip()
        if "```json" in text:
            text = text.split("```json")[1].split("```")[0].strip()
        elif "```" in text:
            text = text.split("```")[1].split("```")[0].strip()
        return json.loads(text)
    except (json.JSONDecodeError, IndexError) as e:
        raise ValueError(f"Invalid JSON from Gemini: {e}\nResponse preview: {response.text[:200]}")
 
def write_fastlane_metadata(metadata: dict, language: str, output_base: str):
    """Write metadata files in Fastlane deliver format"""
    lang_dir = Path(output_base) / "metadata" / language
    lang_dir.mkdir(parents=True, exist_ok=True)
    
    for field, filename in [
        ("name", "name.txt"),
        ("subtitle", "subtitle.txt"),
        ("keywords", "keywords.txt"),
        ("description", "description.txt")
    ]:
        if field in metadata:
            (lang_dir / filename).write_text(metadata[field], encoding="utf-8")
            print(f"  ✓ {lang_dir / filename}")
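
The prompt states the character limits, but nothing above checks that the model respected them. A small validation pass before writing files might look like the sketch below. The limits are the ones listed in AGENTS.md; the function name is hypothetical, and `len()` counts Unicode code points, which is an assumption about how the limits are measured and may not match App Store Connect exactly for every script or emoji.

```python
# Limits taken from the AGENTS.md above (characters per field).
LIMITS = {"name": 30, "subtitle": 30, "keywords": 100, "description": 4000}


def find_limit_violations(metadata: dict) -> list:
    """Return human-readable problems; an empty list means the metadata fits."""
    problems = []
    for field, limit in LIMITS.items():
        value = metadata.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif len(value) > limit:
            problems.append(f"{field}: {len(value)} chars, limit is {limit}")
    return problems
```

Calling this between `generate_metadata_for_language` and `write_fastlane_metadata` lets you regenerate or hand-edit a field instead of discovering the overflow at upload time.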

Phase 3: Automated Rejection Analysis

App Store rejections are expensive. When a rejection arrives, you typically spend hours identifying the issue, cross-referencing Apple's guidelines, and figuring out what exactly needs to change. This phase automates that initial triage.

Fetching Rejection Details via App Store Connect API

# scripts/monitor_review.py
import os
import jwt
import time
import httpx
 
def create_jwt_token(key_id: str, issuer_id: str, private_key: str) -> str:
    """Generate JWT for App Store Connect API authentication"""
    payload = {
        "iss": issuer_id,
        "iat": int(time.time()),
        "exp": int(time.time()) + 1200,  # 20 minute expiry
        "aud": "appstoreconnect-v1"
    }
    
    return jwt.encode(
        payload,
        private_key,
        algorithm="ES256",
        headers={"alg": "ES256", "kid": key_id, "typ": "JWT"}
    )
 
def get_app_review_status(app_id: str, token: str) -> dict:
    """Fetch the latest review status for the specified app"""
    url = f"https://api.appstoreconnect.apple.com/v1/apps/{app_id}/appStoreVersions"
    
    response = httpx.get(
        url,
        headers={"Authorization": f"Bearer {token}"},
        params={
            "filter[appStoreState]": "REJECTED,METADATA_REJECTED",
            "include": "appStoreReviewDetail",
            "limit": 1
        },
        timeout=30.0
    )
    response.raise_for_status()
    data = response.json()
    
    if not data.get("data"):
        return {"status": "no_rejection"}
    
    version = data["data"][0]
    review_detail = next(
        (item["attributes"] for item in data.get("included", [])
         if item["type"] == "appStoreReviewDetails"),
        None
    )
    
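    # Note: on appStoreReviewDetails, `notes` holds the notes you supplied to
    # App Review, and `rejectReasonCode` may be absent from the response, so
    # both are read defensively and default to empty values.
    return {
        "status": version["attributes"]["appStoreState"],
        "version": version["attributes"]["versionString"],
        "rejection_reasons": review_detail.get("rejectReasonCode", []) if review_detail else [],
        "notes": review_detail.get("notes", "") if review_detail else ""
    }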

Antigravity Agent for Guideline Analysis

# scripts/analyze_rejection.py
import os
import json
import google.generativeai as genai
 
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
 
COMMON_REJECTION_CODES = {
    "2.1.0": "App completeness — app doesn't perform as expected",
    "4.0.0": "Design — doesn't follow Human Interface Guidelines",
    "4.2.0": "Minimum functionality — too limited in features",
    "5.1.1": "Privacy — data collection and usage issues",
    "2.5.4": "Software requirements — inappropriate background execution"
}
 
def analyze_rejection_and_propose_fix(
    rejection_info: dict,
    project_context: str
) -> dict:
    """
    Analyze rejection and generate specific fix recommendations.
    
    Returns structured analysis with prioritized action items,
    file references, and estimated time to fix.
    """
    model = genai.GenerativeModel("gemini-2.5-pro")
    
    rejection_codes = rejection_info.get("rejection_reasons", [])
    code_descriptions = [
        f"- {code}: {COMMON_REJECTION_CODES.get(code, f'Code {code}')}"
        for code in rejection_codes
    ]
    
    prompt = f"""
    Analyze this App Store rejection and provide actionable fix recommendations.
 
    Rejection details:
    - Status: {rejection_info['status']}
    - Version: {rejection_info.get('version', 'unknown')}
    - Rejection codes:
    {chr(10).join(code_descriptions) if code_descriptions else "  None (see reviewer notes)"}
    - Reviewer notes: {rejection_info.get('notes', 'None provided')}
    
    Project context:
    {project_context}
    
    Return valid JSON:
    {{
      "severity": "critical|major|minor",
      "root_cause": "One to two sentence explanation of why this was rejected",
      "fix_actions": [
        {{
          "action": "Specific thing to change",
          "priority": 1,
          "file_to_edit": "Likely filename",
          "code_hint": "Code snippet or implementation guidance"
        }}
      ],
      "apple_guideline_reference": "https://developer.apple.com/app-store/review/guidelines/#...",
      "estimated_hours": 2,
      "resubmission_tip": "What to verify before resubmitting"
    }}
    """
    
    response = model.generate_content(prompt)
    
    try:
        text = response.text.strip()
        if "```json" in text:
            text = text.split("```json")[1].split("```")[0].strip()
        return json.loads(text)
    except Exception:
        return {
            "severity": "unknown",
            "root_cause": "Automated analysis failed. Review the rejection notes manually.",
            "fix_actions": [],
            "estimated_hours": 0
        }
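
The workflow in the next section expects three things from the monitoring step: a `rejection_analysis.json` file, a `version` field inside it, and a `rejected` step output. The scripts above don't show that glue, so here is a minimal sketch of it. `record_review_outcome` is a hypothetical helper; `analyze` is any callable that takes the status dictionary and returns the analysis dictionary, which keeps the sketch free of network calls.

```python
import json
from pathlib import Path

REJECTED_STATES = {"REJECTED", "METADATA_REJECTED"}


def record_review_outcome(status_info, analyze, out_dir=".", github_output=None):
    """Write the analysis file and the step output; return True if rejected."""
    rejected = status_info.get("status") in REJECTED_STATES
    if rejected:
        analysis = dict(analyze(status_info))
        # The issue template reads analysis.version, so carry it over here.
        analysis["version"] = status_info.get("version", "unknown")
        Path(out_dir, "rejection_analysis.json").write_text(
            json.dumps(analysis, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
    if github_output:
        # GitHub Actions reads step outputs as name=value lines from this file.
        with open(github_output, "a", encoding="utf-8") as f:
            f.write(f"rejected={'true' if rejected else 'false'}\n")
    return rejected
```

In CI you would pass `github_output=os.environ.get("GITHUB_OUTPUT")` and an `analyze` callable that wraps `analyze_rejection_and_propose_fix` with your project context.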

Monitoring with GitHub Actions

# .github/workflows/review-monitor.yml
name: App Store Review Monitor
 
on:
  schedule:
    - cron: '0 */2 * * *'  # Check every 2 hours during review
  workflow_dispatch:
 
jobs:
  check-review-status:
    runs-on: macos-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      
      - name: Install dependencies
        run: pip install httpx "pyjwt[crypto]" google-generativeai  # ES256 signing needs the cryptography extra
      
      - name: Check Review Status and Analyze if Rejected
        id: review_check
        env:
          APP_STORE_KEY_ID: ${{ secrets.APP_STORE_KEY_ID }}
          APP_STORE_ISSUER_ID: ${{ secrets.APP_STORE_ISSUER_ID }}
          APP_STORE_PRIVATE_KEY: ${{ secrets.APP_STORE_PRIVATE_KEY }}
          APP_ID: ${{ secrets.APP_STORE_APP_ID }}
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
        run: |
          python scripts/monitor_review.py --analyze-if-rejected
      
      - name: Create GitHub Issue for Rejection
        if: steps.review_check.outputs.rejected == 'true'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const analysis = JSON.parse(
              fs.readFileSync('rejection_analysis.json', 'utf8')
            );
            
            const actionsText = analysis.fix_actions
              .map((a, i) => [
                `${i + 1}. **${a.action}**`,
                `   - File: \`${a.file_to_edit}\``,
                `   - Guidance: ${a.code_hint}`
              ].join('\n'))
              .join('\n\n');
            
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `🚨 App Store Rejection: v${analysis.version ?? 'unknown'}`,
              body: `## Rejection Analysis\n\n**Severity**: ${analysis.severity}\n\n### Root Cause\n${analysis.root_cause}\n\n### Fix Actions\n${actionsText}\n\n### Estimated Fix Time\n~${analysis.estimated_hours} hours\n\n### Before Resubmitting\n${analysis.resubmission_tip}`,
              labels: ['app-store-rejection', 'urgent']
            });

Common Pitfalls

Pitfall #1: Screenshot pixel dimension mismatch

The App Store requires exact pixel dimensions per device. Simulator scale settings can cause Snapshot to output smaller images. Always set scale 1 in your Snapfile.

Pitfall #2: HTML entities in metadata

Gemini occasionally generates descriptions containing HTML entities such as &amp;amp; or &amp;lt;, or stray angle brackets. The App Store Connect API rejects these:

import html
 
# Always sanitize before submission
clean_description = html.unescape(raw_description).replace("<", "").replace(">", "")

Pitfall #3: External tester distribution timing

TestFlight external tester distribution requires Apple review. Only internal testers receive builds immediately. Split your upload lanes accordingly:

# fastlane/Fastfile
lane :upload_internal do
  pilot(
    ipa: "./build/YourApp.ipa",
    distribute_external: false,
    notify_external_testers: false
  )
end

Pitfall #4: JWT token expiration mid-run

App Store Connect JWTs can be valid for at most 20 minutes. For large metadata updates across many languages, implement token refresh:

def get_fresh_token() -> str:
    """Always generates a fresh token; don't cache between operations"""
    return create_jwt_token(
        key_id=os.environ["APP_STORE_KEY_ID"],
        issuer_id=os.environ["APP_STORE_ISSUER_ID"],
        private_key=os.environ["APP_STORE_PRIVATE_KEY"]
    )

Connecting Everything with an Antigravity Agent

With the individual scripts in place, create release-agent.md at the project root to let Antigravity orchestrate the full pipeline:

# Release Agent
 
## Role
Automate App Store release preparation for a new version.
Read `version_info.json`, then execute the following steps in order.
 
## Steps
1. Run `fastlane snapshot` to capture screenshots
2. Run `python scripts/generate_captions.py` for caption generation
3. Run `python scripts/optimize_metadata.py` for metadata optimization
4. Run `fastlane gym` to build
5. Run `fastlane pilot` to upload to TestFlight
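6. Run `fastlane deliver` with `submit_for_review` enabled to upload metadata and screenshots and submit for review
7. Exit when app enters "Waiting for Review" state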
 
## Error Handling
- Snapshot failure: restart Simulator and retry once
- API errors: wait 60 seconds and retry once
- Build errors: save log to `build_error.log` and stop

Trigger it from Antigravity:

@release-agent Start release preparation for version 2.1.0

Six Numbers That Actually Moved After Running 50M-Download Apps Through This Pipeline

I started shipping indie iOS apps in 2014. At peak, I was maintaining around five apps simultaneously — wallpaper, calming, and manifestation utilities — and release prep was always the bottleneck. After six months of running this Antigravity-based pipeline across all of them, I went back to my own work logs and pulled out the numbers that actually moved.

Metric	Before automation (late 2024)	After automation (early 2026)	Change
Wall-clock days per release	~14 (interrupted)	~1.5	~9x faster
Monthly releases across 5 apps	3–4 total	10–12 total	~3x
First-submission rejection rate	35–40%	6–8%	~5x lower
Metadata A/B test cadence	Once a month	Once a week	~4x
AdMob eCPM (rebased)	$1.42 (baseline)	$2.11	+48%
Gemini API monthly cost (5 apps)	$0	$18–$28	New line item

Two of these deserve a closer look. The rejection rate drop was expected: the AI catches forbidden phrasing (medical claims, third-party trademarks, age-gate violations) before submission. The eCPM lift, however, was an unintended side effect. Faster release cadence keeps users on the latest OS and feature set, which raises engagement, which ad networks read as higher-value inventory. None of that was in my original spec for the pipeline, but it shows up cleanly in the data.
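
The Change column can be reproduced from the before and after values. The short calculation below is only a sanity check; taking the midpoint of each range is my assumption, since the table gives ranges rather than single figures.

```python
def midpoint(low: float, high: float) -> float:
    """Midpoint of a reported range (an assumption, not a measured value)."""
    return (low + high) / 2


speedup = 14 / 1.5                                    # wall-clock days per release
release_ratio = midpoint(10, 12) / midpoint(3, 4)     # monthly releases
rejection_ratio = midpoint(35, 40) / midpoint(6, 8)   # first-submission rejections
ecpm_lift = (2.11 - 1.42) / 1.42 * 100                # percent change in eCPM

print(f"{speedup:.1f}x, {release_ratio:.1f}x, {rejection_ratio:.1f}x, +{ecpm_lift:.1f}%")
```

The results land at roughly 9.3x, 3.1x, 5.4x, and +48.6%, in line with the rounded figures in the table.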

Why "14 days" was always 14 days of interrupted work

The pre-automation 14 days weren't a continuous sprint. They were a long string of context switches interleaved with other development. For a single release, the work decomposed like this:

Days 1–2: Decide which screens need re-shooting
Day 3: Run Simulator captures across 4 device sizes × 6 frames × 2 languages = 48 PNGs
Day 4: Write captions for each (90 min of copywriting × 2 languages)
Days 5–6: Update title / subtitle / description / keywords by hand in App Store Connect (2 languages)
Day 7: Upload to TestFlight, distribute to internal testers
Days 8–10: Address internal tester feedback
Day 11: Submit
Days 12–14: Review wait + rejection handling

Across five apps, those 14-day cycles overlapped enough that roughly 70% of any given month was release work. Some months, less than 30% of my time went into actual code. That's the self-defeating loop: the more apps you ship, the less time you have to ship more.

What the 1.5-day cycle actually contains

Of the 1.5 days after automation, only three blocks of human time matter:

30 minutes: Decide what's changing, hand the agent the brief (@release-agent Prepare version 2.4.0)
45 minutes: Review the AI-generated captions and metadata, adjust 3–5 phrasings
30 minutes: Eyeball the screenshots one last time before hitting submit

The remaining ~12 hours run in the background on GitHub Actions and Antigravity. The transformative part isn't the total time saved; it's the decoupling of my working hours from release wall time. I queue work in the evening, check results in the morning.

Five Things the Official Docs Don't Tell You About Running This in Production

Reading the App Store Connect API and fastlane docs end to end won't prepare you for what actually breaks when you run five apps through this pipeline week after week. These are the five things I wish I'd built in from day one.

1. Pre-rotate JWTs instead of reacting to expiry

The 20-minute JWT limit hits hard once you're updating metadata for several apps in sequence. My fix is to cache the issue time locally (SQLite is fine, no Redis needed) and rotate the token before it expires, not after the API returns 401.

# scripts/asc_token_manager.py
import time, jwt, os, sqlite3, pathlib
 
ASC_KEY_PATH   = os.environ["ASC_KEY_PATH"]
ASC_KEY_ID     = os.environ["ASC_KEY_ID"]
ASC_ISSUER_ID  = os.environ["ASC_ISSUER_ID"]
TOKEN_CACHE    = pathlib.Path.home() / ".asc_token_cache.sqlite"
SAFETY_WINDOW  = 180  # rotate when < 3 minutes remain
 
def _init_cache():
    con = sqlite3.connect(TOKEN_CACHE)
    con.execute("CREATE TABLE IF NOT EXISTS token (id INTEGER PRIMARY KEY, jwt TEXT, exp INTEGER)")
    con.commit()
    return con
 
def get_token() -> str:
    con = _init_cache()
    row = con.execute("SELECT jwt, exp FROM token WHERE id=1").fetchone()
    now = int(time.time())
    if row and (row[1] - now) > SAFETY_WINDOW:
        return row[0]
    with open(ASC_KEY_PATH, "r") as f:
        key = f.read()
    exp = now + 1200
    token = jwt.encode(
        {"iss": ASC_ISSUER_ID, "iat": now, "exp": exp, "aud": "appstoreconnect-v1"},
        key,
        algorithm="ES256",
        headers={"kid": ASC_KEY_ID, "typ": "JWT"},
    )
    con.execute("INSERT OR REPLACE INTO token (id, jwt, exp) VALUES (1, ?, ?)", (token, exp))
    con.commit()
    return token

Across five apps that's around 200 API calls per release. Since switching to the pre-rotation pattern, I've had zero JWT-expiry failures in six months. Before, I'd hit it two or three times a month.

2. Normalize screenshot color temperature across devices

iPhone 16 Pro Max (True Tone XDR) and iPad Air (standard IPS) shoot the same content at slightly different color temperatures (around 6500K vs 7100K). Side-by-side in the App Store device gallery, one model reads as visibly bluer. I now run a quick post-capture pass to align them.

# Snapfile
override_status_bar true
clear_previous_screenshots true
scale 1
languages ["ja-JP", "en-US"]
 
after_each_device do |device|
  Dir["screenshots/#{device}/*.png"].each do |f|
    system("magick \"#{f}\" -modulate 100,100,99 -auto-level \"#{f}\"")
    # -modulate takes brightness,saturation,hue: only hue is nudged to 99 here to pull warm devices back in line
  end
end

This eliminated a quiet class of rejection feedback: "brand consistency across device variants" notes from reviewers.

3. Give Gemini a per-app tone dictionary up front

Embedding a "use these words / never use these words" dictionary in the prompt removes most of the metadata drift across versions.

# tone_dictionary.py
TONE = {
    "wallpaper-zen": {
        "prefer": ["calm", "spacious", "minimal", "still", "centered"],
        "avoid":  ["ultimate", "insane", "mind-blowing", "complete guide"],
        "audience": "Professionals in their 30s seeking calm focus",
    },
    "manifest-coach": {
        "prefer": ["quiet confidence", "noticing", "journaling", "observing"],
        "avoid":  ["law of attraction", "miracle", "guaranteed results"],
        "audience": "20-40s readers oriented toward self-reflection",
    },
}
 
def build_prompt(app_id: str, base_prompt: str) -> str:
    t = TONE[app_id]
    return f"""
{base_prompt}
 
# Tone guide for this app
- Prefer: {", ".join(t["prefer"])}
- Avoid:  {", ".join(t["avoid"])}
- Audience: {t["audience"]}
 
Do not use any "avoid" terms in captions or descriptions.
"""

The avoid list, in particular, dropped my Section 1.4.1 (Safety / Medical claims) rejections from five over six months to zero.

4. Split GitHub Actions secrets per app

A shared ASC_KEY_PATH looks economical until one app loses access and all five pipelines stop. Splitting secrets per app is annoying to set up but is the cheaper choice over a year.

# .github/workflows/release-wallpaper-zen.yml
jobs:
  release:
    runs-on: macos-15
    env:
      ASC_KEY_PATH:   ${{ secrets.ASC_KEY_WALLPAPER_ZEN }}   # must resolve to the .p8 file path, since asc_token_manager.py opens it as a file
      ASC_KEY_ID:     ${{ secrets.ASC_KEY_ID_WALLPAPER_ZEN }}
      ASC_ISSUER_ID:  ${{ secrets.ASC_ISSUER_ID }}   # can stay shared
      GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY_PROD }}

When a key needs rotating, you rotate one workflow's secret. The other four keep shipping.

5. Define exactly four "stop and ask me" triggers

Full automation sounds appealing until something silently ships the wrong copy. My pipeline halts and pings me for human review whenever one of these four conditions hits:

Gemini's per-caption confidence score drops below 0.85
The generated description matches any term in the per-app avoid list
Build size changes by more than ±15% versus the previous release
An internal tester report contains "crash" or "won't launch"
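
The four conditions above can be expressed as one gate function that returns the reasons to stop. This is a sketch under stated assumptions: the function name and its inputs are hypothetical, the confidence scores are assumed to be computed elsewhere (the list above does not say how), and the avoid-list check is a plain case-insensitive substring match.

```python
def human_review_reasons(caption_confidences, description, avoid_terms,
                         build_size_mb, previous_size_mb, tester_reports):
    """Return the reasons to halt for human review; an empty list means proceed."""
    reasons = []

    # Trigger 1: any caption scored below the 0.85 threshold.
    if any(score < 0.85 for score in caption_confidences):
        reasons.append("caption confidence below 0.85")

    # Trigger 2: the description contains a term from the per-app avoid list.
    lowered = description.lower()
    hits = [term for term in avoid_terms if term.lower() in lowered]
    if hits:
        reasons.append("avoid-list terms in description: " + ", ".join(hits))

    # Trigger 3: build size moved more than 15% in either direction.
    if previous_size_mb > 0:
        change = abs(build_size_mb - previous_size_mb) / previous_size_mb
        if change > 0.15:
            reasons.append(f"build size changed by {change:.0%}")

    # Trigger 4: an internal tester report mentions a crash or launch failure.
    flagged = ("crash", "won't launch")
    if any(phrase in report.lower() for report in tester_reports for phrase in flagged):
        reasons.append("tester report mentions a crash or launch failure")

    return reasons
```

Returning the reasons rather than a boolean makes the notification self-explanatory: the pipeline can include the list in whatever message pings you.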

After adding these gates, I haven't had a single "we shipped the wrong copy" incident. Anything ambiguous comes to me; everything else ships unattended.

Results After Three Months

Running this pipeline in production on several apps, here's how the time allocation changed:

Screenshot + caption generation: 8 hours → ~25 minutes
Metadata optimization: 6 hours → ~15 minutes (including review and minor edits)
Build and upload: ~2 hours (unchanged, runs via GitHub Actions)
Rejection handling: 10 hours → ~2 hours (root cause identification is now instant)

API cost: Using Gemini 2.5 Pro, one full release cycle costs roughly $0.50–$1.50 depending on screenshot count and text volume. For two to three releases per month, you're well under $5.

The shift this enabled isn't just efficiency. When release prep takes an evening instead of a week, you stop batching features together just to amortize the release cost. Smaller, more frequent releases with better feedback loops — that's what this infrastructure actually buys you.

Start with generate_captions.py. Run it against a single screenshot in English, see what comes back, and adjust the prompt from there. The feedback loop is immediate, and the quality is good enough out of the box to be useful.

Extending the Pipeline: Localization at Scale

One area where this pipeline shows its greatest leverage is localization. Adding a new language to your app used to mean manually writing or reviewing metadata in that language. With the metadata generation script in place, adding Korean or French takes minutes.

Here's a practical app_info.json structure that feeds both the metadata optimizer and the caption generator:

{
  "app_name": "YourApp",
  "bundle_id": "com.yourcompany.yourapp",
  "category": "Productivity",
  "core_value_proposition": "Helps users track daily habits with minimal friction",
  "key_features": [
    "One-tap habit logging",
    "Weekly progress visualization",
    "Streak tracking with notifications"
  ],
  "target_audience": "Busy professionals aged 25-40",
  "tone": "friendly, encouraging, non-judgmental",
  "supported_languages": ["en-US", "ja", "zh-Hans", "ko", "fr-FR"],
  "competitors_to_avoid_mentioning": ["Habitica", "Streaks", "Fabulous"]
}

The tone field is more important than it looks. When you tell Gemini to write in a "friendly, encouraging" voice, the generated descriptions feel notably different from the default corporate-neutral output. For productivity apps especially, matching your in-app voice in your App Store copy makes a measurable difference in conversion.
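As a rough sketch of how these fields might reach the model, the helper below turns app_info.json into one prompt per language. The function names and the prompt wording are assumptions for illustration; the actual optimizer script may assemble its prompt differently.

```python
import json
from pathlib import Path


def build_metadata_prompt(app_info: dict, language: str) -> str:
    """Assemble a prompt for one language from the app_info.json fields."""
    if language not in app_info["supported_languages"]:
        raise ValueError(f"{language} is not listed in supported_languages")

    features = "\n".join(f"- {f}" for f in app_info["key_features"])
    avoid = ", ".join(app_info["competitors_to_avoid_mentioning"])

    return (
        f"Write App Store metadata in {language} for {app_info['app_name']} "
        f"({app_info['category']}).\n"
        f"Value proposition: {app_info['core_value_proposition']}\n"
        f"Target audience: {app_info['target_audience']}\n"
        f"Key features:\n{features}\n"
        f"Tone: {app_info['tone']}\n"
        f"Never mention these competitors: {avoid}\n"
    )


def prompts_for_all_languages(path: str) -> dict[str, str]:
    """Load app_info.json and build one prompt per supported language."""
    app_info = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        lang: build_metadata_prompt(app_info, lang)
        for lang in app_info["supported_languages"]
    }
```

With this shape, adding a language is a one-line change to supported_languages, and the tone and competitor rules apply to every locale without being repeated.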

Automated A/B Testing of Metadata

App Store Connect supports product page optimization for testing alternate screenshots and metadata. Here's how to wire that into the pipeline:

# scripts/create_product_page_test.py
import os
import httpx
import json
 
def create_app_store_experiment(
    app_id: str,
    token: str,
    variant_metadata: dict,
    traffic_proportion: float = 0.5
) -> str:
    """
    Create a product page optimization experiment.
    
    Args:
        app_id: App Store app ID
        token: Valid JWT token
        variant_metadata: The alternate metadata to test
        traffic_proportion: Fraction of traffic for the variant (0.0 to 1.0)
    
    Returns:
        Experiment ID for monitoring
    """
    url = "https://api.appstoreconnect.apple.com/v1/appStoreVersionExperiments"
    
    payload = {
        "data": {
            "type": "appStoreVersionExperiments",
            "attributes": {
                "name": f"Metadata Test {variant_metadata.get('name', 'Variant')}",
                # Integer percent; round() avoids float truncation (int(0.29 * 100) is 28)
                "trafficProportion": round(traffic_proportion * 100),
                "platform": "IOS"
            },
            "relationships": {
                "app": {
                    "data": {"type": "apps", "id": app_id}
                }
            }
        }
    }
    
    response = httpx.post(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        },
        json=payload,
        timeout=30.0
    )
    response.raise_for_status()
    
    experiment_id = response.json()["data"]["id"]
    print(f"✅ Created experiment: {experiment_id}")
    return experiment_id

Running this monthly lets you systematically improve conversion rate without guesswork. The Antigravity agent can analyze experiment results after two weeks and recommend which variant to promote.
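Before promoting a variant, it helps to check that the difference is larger than noise. The sketch below runs a standard two-proportion z-test on conversion counts. It assumes you have already exported page views and conversions for each arm; the function name and the 0.05 threshold are illustrative choices, and this is an independent sanity check rather than a substitute for the figures App Store Connect itself reports.

```python
import math


def compare_variants(
    control_conv: int,
    control_views: int,
    variant_conv: int,
    variant_views: int,
    alpha: float = 0.05,
) -> dict:
    """Two-sided, two-proportion z-test on conversion counts."""
    p_control = control_conv / control_views
    p_variant = variant_conv / variant_views

    # Pooled rate under the null hypothesis that both pages convert equally
    pooled = (control_conv + variant_conv) / (control_views + variant_views)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_views + 1 / variant_views))

    z = (p_variant - p_control) / se if se > 0 else 0.0
    # Two-sided p-value for a standard normal statistic
    p_value = math.erfc(abs(z) / math.sqrt(2))

    if p_value < alpha:
        recommendation = "promote variant" if z > 0 else "keep control"
    else:
        recommendation = "inconclusive"

    return {
        "control_rate": p_control,
        "variant_rate": p_variant,
        "z": z,
        "p_value": p_value,
        "recommendation": recommendation,
    }
```

An "inconclusive" result usually means the test needs more traffic or more time, not that the variant failed.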

Monitoring Post-Launch: Connecting Reviews to Code

The pipeline doesn't end at submission. After launch, user reviews often surface bugs and UX issues faster than traditional crash reporting. Connecting App Store reviews to your development workflow closes the feedback loop automatically.

Automated Review Triage

Extend the review monitoring script to also fetch and categorize post-launch user reviews:

# scripts/triage_user_reviews.py
import os
import json
import httpx
import google.generativeai as genai
 
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
 
def fetch_recent_reviews(app_id: str, token: str, limit: int = 50) -> list:
    """Fetch most recent user reviews via App Store Connect API"""
    url = f"https://api.appstoreconnect.apple.com/v1/apps/{app_id}/customerReviews"
    
    response = httpx.get(
        url,
        headers={"Authorization": f"Bearer {token}"},
        params={
            "sort": "-createdDate",
            "limit": limit,
            "fields[customerReviews]": "rating,title,body,createdDate,territory"
        },
        timeout=30.0
    )
    response.raise_for_status()
    return response.json().get("data", [])
 
def categorize_reviews(reviews: list) -> dict:
    """
    Use Gemini to categorize reviews by type and extract actionable signals.
    
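    Returns:
        {
            "bugs": [{"description": str, "severity": str, "frequency": int}],
            "feature_requests": [{"request": str, "frequency": int}],
            "ux_issues": [{"issue": str, "likely_screen": str, "frequency": int}],
            "praise": [{"aspect": str, "frequency": int}],
            "overall_sentiment": str,
            "top_priority_action": str
        }
        On a parse failure: {"error": str, "raw": str}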
    """
    model = genai.GenerativeModel("gemini-2.5-pro")
    
    review_texts = [
        f"[{r['attributes']['rating']}★] {r['attributes']['title']}: {r['attributes']['body']}"
        for r in reviews
        if r.get("attributes", {}).get("body")
    ]
    
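    # Limit to 30 reviews for token efficiency. Slicing here keeps the count in
    # the prompt accurate and keeps this comment out of the text sent to the model.
    sample = review_texts[:30]
    
    prompt = f"""
    Analyze these {len(sample)} App Store user reviews and extract actionable signals.
    
    Reviews:
    {chr(10).join(sample)}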
    
    Categorize into:
    1. Bug reports (actual crashes or broken functionality)
    2. Feature requests (things users want but don't have)
    3. UX friction points (things that work but confuse users)
    4. Positive signals (what users love — useful for marketing)
    
    For bugs, estimate severity: critical (app unusable), major (feature broken), minor (cosmetic).
    
    Return JSON with structure:
    {{
        "bugs": [{{"description": str, "severity": "critical|major|minor", "frequency": int}}],
        "feature_requests": [{{"request": str, "frequency": int}}],
        "ux_issues": [{{"issue": str, "likely_screen": str, "frequency": int}}],
        "praise": [{{"aspect": str, "frequency": int}}],
        "overall_sentiment": "positive|neutral|negative",
        "top_priority_action": "Single most important thing to address next"
    }}
    """
    
    response = model.generate_content(prompt)
    
    try:
        text = response.text.strip()
        if "```json" in text:
            text = text.split("```json")[1].split("```")[0].strip()
        return json.loads(text)
    except Exception as e:
        return {"error": str(e), "raw": response.text[:500]}
 
def create_github_issues_from_bugs(
    bugs: list,
    github_token: str,
    owner: str,
    repo: str
):
    """Create GitHub issues for critical and major bugs found in reviews"""
    headers = {
        "Authorization": f"Bearer {github_token}",
        "Accept": "application/vnd.github+json"
    }
    
    for bug in bugs:
        if bug.get("severity") not in ["critical", "major"]:
            continue
        
        title = f"[App Store Review] {bug['description'][:80]}"
        body = f"""**Source**: App Store user reviews ({bug.get('frequency', 1)} mentions)
**Severity**: {bug['severity']}
 
**Description**:
{bug['description']}
 
**Notes**: This issue was automatically detected from App Store review analysis.
Investigate and reproduce before closing.
"""
        
        response = httpx.post(
            f"https://api.github.com/repos/{owner}/{repo}/issues",
            headers=headers,
            json={
                "title": title,
                "body": body,
                "labels": ["bug", "app-store-feedback", f"severity-{bug['severity']}"]
            }
        )
        
        if response.status_code == 201:
            print(f"  ✅ Created issue: {title[:60]}")
        else:
            print(f"  ✗ Failed to create issue: {response.status_code}")

This script runs weekly via GitHub Actions and creates issues for any newly surfaced bugs. Combine it with the rejection analysis system, and you have a complete feedback loop: user reviews → bug reports → code fixes → new release → updated reviews.
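One gap worth closing: as written, a weekly run would file the same bug again if users keep mentioning it. A minimal way to keep only "newly surfaced" bugs is to fingerprint each description and remember what has been filed. The helper names and the JSON state file are assumptions, and the state file has to persist between runs (for example by caching or committing it).

```python
import hashlib
import json
import re
from pathlib import Path


def bug_fingerprint(description: str) -> str:
    """Hash a normalized description so case and punctuation changes still match."""
    normalized = re.sub(r"[^a-z0-9]+", " ", description.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def filter_new_bugs(bugs: list, state_path: str) -> list:
    """Return bugs not seen in earlier runs and record them in the state file."""
    path = Path(state_path)
    seen = set(json.loads(path.read_text())) if path.exists() else set()

    new_bugs = []
    for bug in bugs:
        fp = bug_fingerprint(bug["description"])
        if fp not in seen:
            seen.add(fp)
            new_bugs.append(bug)

    path.write_text(json.dumps(sorted(seen)))
    return new_bugs
```

Passing the result of `filter_new_bugs` to `create_github_issues_from_bugs` would then skip repeats. Exact hashing is a weak match, because the model may word the same bug differently from week to week; treat it as a first filter, not a guarantee.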

Security Considerations for Production Use

A few security points worth emphasizing explicitly, since this pipeline handles sensitive credentials:

Never hardcode API keys. All scripts in this guide use os.environ["KEY_NAME"]. Store secrets exclusively in GitHub Secrets for CI/CD and in your local .env file (which should be in .gitignore).

Rotate App Store Connect keys regularly. I rotate API keys every 90 days. With per-app secrets, that means updating one secret in GitHub; the affected workflow picks up the new key on its next run while the others keep running untouched.

Limit API key permissions. When creating your App Store Connect API key, select only the role your pipeline needs. For read-only monitoring, use the "Developer" role. For submission, use "App Manager." Never use "Admin" for automated scripts.

Audit Gemini API usage. Set monthly budget alerts in Google Cloud Console. A runaway loop in the screenshot processing script could generate unexpected API costs. With budget alerts at $10/month, you'll catch issues before they become expensive.
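Budget alerts only fire after the money is spent, so a hard cap inside the script is a useful second line of defense against a runaway loop. The sketch below is one way to do it; `CallBudget`, `caption_all`, and the `generate` callable are hypothetical names, with `generate` standing in for whatever function actually calls Gemini.

```python
class CallBudgetExceeded(RuntimeError):
    """Raised when a run tries to make more model calls than allowed."""


class CallBudget:
    """Hard cap on the number of model calls in one pipeline run."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.calls = 0

    def spend(self) -> None:
        """Call once before each API request; raises when the cap is reached."""
        if self.calls >= self.max_calls:
            raise CallBudgetExceeded(
                f"Refusing call {self.calls + 1}: budget is {self.max_calls} per run"
            )
        self.calls += 1


def caption_all(screenshots: list[str], generate, budget: CallBudget) -> dict[str, str]:
    """Caption each screenshot, charging the budget before every call."""
    captions = {}
    for shot in screenshots:
        budget.spend()
        captions[shot] = generate(shot)
    return captions
```

Setting the cap a little above the expected call count for one release (screenshots times languages) makes a loop bug fail loudly within seconds instead of showing up on the bill.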

Scaling to Multiple Apps

Once you have this working for one app, extending to a portfolio of apps is straightforward. The key is parameterizing the scripts rather than hardcoding app-specific values:

# scripts/release_orchestrator.py
import argparse
import json
import shlex
import subprocess
from pathlib import Path
 
def run_release_pipeline(app_config_path: str, dry_run: bool = False):
    """
    Run the full release pipeline for an app specified by config file.
    
    Args:
        app_config_path: Path to app-specific config JSON
        dry_run: If True, generate content but skip submission
    """
    config = json.loads(Path(app_config_path).read_text())
    
    app_id = config["app_store_id"]
    languages = config.get("languages", ["en-US"])
    screenshots_dir = config.get("screenshots_dir", "./screenshots")
    
    print(f"\n🚀 Starting release pipeline for: {config['app_name']}")
    print(f"   Languages: {', '.join(languages)}")
    print(f"   Dry run: {dry_run}")
    
    # The commands below run through the shell, so quote every config-derived
    # value; an apostrophe or space in the description would otherwise break them.
    scheme = shlex.quote(config["scheme"])
    steps = [
        ("Screenshot capture", f"fastlane snapshot --scheme {scheme}"),
        ("Caption generation", f"python scripts/generate_captions.py {shlex.quote(screenshots_dir)} {shlex.quote(config['description'])}"),
        ("Metadata optimization", f"python scripts/optimize_metadata.py {shlex.quote(app_config_path)}"),
    ]
    
    if not dry_run:
        steps.extend([
            ("Build", f"fastlane gym --scheme {scheme}"),
            ("TestFlight upload", "fastlane pilot upload"),
        ])
    
    for step_name, command in steps:
        print(f"\n📍 {step_name}...")
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        
        if result.returncode != 0:
            print(f"  ✗ Failed: {result.stderr[:200]}")
            raise RuntimeError(f"Pipeline failed at: {step_name}")
        
        print(f"  ✓ Complete")
    
    print(f"\n✅ Pipeline complete for {config['app_name']}")
 
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("config", help="Path to app config JSON")
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()
    
    run_release_pipeline(args.config, dry_run=args.dry_run)

With this structure, running releases for three apps is just three invocations with different config files:

python scripts/release_orchestrator.py apps/app1/config.json
python scripts/release_orchestrator.py apps/app2/config.json
python scripts/release_orchestrator.py apps/app3/config.json

Or parallelize them in GitHub Actions using a matrix strategy.
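For the matrix route, one option is to generate the matrix from the config files on disk, so that adding an app means adding a folder rather than editing the workflow. The sketch below assumes the apps/<name>/config.json layout used in the commands above; `build_matrix` is a hypothetical helper.

```python
import json
from pathlib import Path


def build_matrix(apps_dir: str) -> dict:
    """Find <apps_dir>/<name>/config.json files and return a matrix 'include' list."""
    entries = [
        {"app": cfg.parent.name, "config": cfg.as_posix()}
        for cfg in sorted(Path(apps_dir).glob("*/config.json"))
    ]
    return {"include": entries}


def matrix_json(apps_dir: str) -> str:
    """Compact JSON, suitable for writing to a workflow step output."""
    return json.dumps(build_matrix(apps_dir), separators=(",", ":"))
```

A setup job can print `matrix_json("apps")` into a job output, and the release job can read it with `fromJSON` in its `strategy.matrix`, passing each entry's config path to release_orchestrator.py. Given the per-app secrets from tip 4, each matrix entry still needs a way to select its own key, so treat this as a starting point.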

The Bigger Picture

What this pipeline actually does is shift your relationship with the release process. When you know that screenshots, metadata, and rejection triage are handled, you stop treating "release" as an event to dread and start treating it as a lightweight deployment step.

For solo developers, that mental shift matters. The bottleneck moves from "how much release work can I do" to "how many improvements can I ship." That's where you want the bottleneck to be.

The place to start is the caption generation script. Set up your environment, point it at a screenshot directory from an existing app, and run it. The feedback loop is immediate, and the quality — while you'll want to review and sometimes adjust — is good enough to be useful from the first run.

From there, add metadata optimization for one language. Then another. By the time you've automated three languages, you'll have a feel for where Gemini needs more context and where the prompts can be tightened. That iterative process is where the real learning happens — and Antigravity makes it fast to iterate.

For related implementation details, the App Store Connect API Automation Guide covers the authentication setup in more depth, and Advanced CI/CD Pipeline with GitHub Actions walks through the GitHub Actions configuration for production-scale deployments.
