Don't Let Your Automation Lean on AI Ultra's 5x Ceiling

The $100/month AI Ultra plan raises Antigravity's usage limits to 5x AI Pro. But if you architect automation around that ceiling, it collapses the moment you drop back a tier. Here is a design that degrades gracefully regardless of the limit, along with the pain points I actually hit.

When the $100/month AI Ultra plan appeared, the first thing I weighed was a single line: Antigravity's usage limit becomes 5x that of AI Pro. As an indie developer running my own apps alongside several Dolice blogs in parallel, a 5x ceiling makes me want to fire off more agents at once.

But I stopped myself. A quota tier is not permanent. Next month I might review costs and drop back to AI Pro, and there are days the Ultra allowance simply runs out. Build your automation on the assumption that "there's 5x headroom" and the moment the tier steps down, half your nightly batch fails and you spend the morning cleaning it up. I did exactly that once.

This article lays out how to build automation that does not tie its correctness to a quota tier, together with the pain points I only understood by raising and lowering the tier myself.

Read the ceiling as headroom, not horsepower

Seeing "5x" tricks you into thinking throughput went up fivefold. What actually grows is only the number of agents you can run at once and the headroom on total volume per window. The intelligence or accuracy of any single run does not change.

Miss this and you end up designing around "Ultra means ten in parallel." The way I think of it, the ceiling is like savings: the more you have, the calmer you feel, but if you fix your lifestyle to the balance, a drop hurts all at once.

The ceiling is safest read as a cushion that keeps peaks from jamming. Your steady-state count should be decided well before you reach it.

What breaks first when the tier drops

Whether I moved from Ultra back to Pro or burned through the Ultra allowance, the same spots in my setup failed first.

Where it breaks	Symptom	Root cause
Concurrency	The later half of a batch of simultaneously launched agents fails at startup with a 429 (rate limit) error	Concurrency pinned right at the ceiling
Retries	A failed run retries immediately, eats more allowance, and cascades	Retry logic ignores remaining quota
Nightly batch	Jobs clustered at the same late hour all fail together	Load concentrated at one point in time, exhausting shared headroom

The common thread is that each quietly leaned on the assumption that the ceiling was plentiful. The moment the ceiling shrank, the assumption broke, and failure bred failure.
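
The retry row is the easiest of the three to make concrete. The following is a minimal sketch rather than a definitive implementation: it checks remaining quota before every attempt and defers instead of retrying into an empty allowance. The names retry_with_quota, QUOTA_REMAINING_PCT, MIN_PCT, and RETRY_BASE_DELAY are assumptions made for illustration, and the return code 75 is borrowed from the sysexits convention for a temporary failure.

```shell
#!/usr/bin/env bash
# Sketch: retry only while quota remains; otherwise defer to the next window.
# All names below are illustrative assumptions, not part of any real tool.

# Placeholder: replace with a real read of the remaining allowance (0-100).
remaining_pct() {
  echo "${QUOTA_REMAINING_PCT:-100}"
}

# retry_with_quota MAX_ATTEMPTS CMD [ARGS...]
# Returns 0 on success, 75 when deferred for lack of quota,
# and 1 when every attempt failed.
retry_with_quota() {
  local max="$1"; shift
  local min_pct="${MIN_PCT:-10}"
  local delay="${RETRY_BASE_DELAY:-30}"
  local attempt=1
  while [ "$attempt" -le "$max" ]; do
    # Check quota before spending anything on this attempt.
    if [ "$(remaining_pct)" -lt "$min_pct" ]; then
      echo "quota below ${min_pct}%: defer to next window" >&2
      return 75
    fi
    if "$@"; then
      return 0
    fi
    # Wait with exponential backoff before the next attempt, if there is one.
    if [ "$attempt" -lt "$max" ]; then
      sleep "$delay"
      delay=$(( delay * 2 ))
    fi
    attempt=$(( attempt + 1 ))
  done
  return 1
}
```

A caller can treat 75 as "not now" and 1 as a real failure, which keeps a dry allowance from looking like a bug to whatever is watching the job.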

Principle 1 — never tie correctness to the ceiling

The rule I hold most tightly is to never make the correctness of a result depend on the ceiling. The limit should only affect whether things finish fast or slow, never whether they finish or fail.

Aspect	Ceiling-dependent (fragile)	Ceiling-independent (robust)
Concurrency	Pinned at 10 to match the Ultra allowance	A baseline of 3 derived from redo cost, bumped up only when headroom exists
On failure	Push through with immediate retries	Check remaining quota, wait, or defer to the next window
Completion	Assumes everything finishes within the window	Can resume from where it stopped next time

With the right-hand design, a tier drop only makes things "slower." The left-hand one turns a drop into "never finishes." That difference decided whether I had morning cleanup or not.
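
The "can resume from where it stopped" cell in the table can be sketched with one marker file per finished job. This is a hedged illustration, not the setup described in this article: STATE_DIR, run_pending, and the runner passed to it are hypothetical, and job names are assumed to be plain identifiers that are safe to use as file names.

```shell
#!/usr/bin/env bash
# Sketch: record each finished job so the next window resumes where this one stopped.
# STATE_DIR and all function names are illustrative assumptions.

STATE_DIR="${STATE_DIR:-./.batch-state}"

# A job counts as done once its marker file exists.
is_done()   { [ -f "$STATE_DIR/$1.done" ]; }
mark_done() { mkdir -p "$STATE_DIR" && : > "$STATE_DIR/$1.done"; }

# run_pending RUNNER JOB...
# Calls RUNNER once per job that has no marker yet. On the first failure it
# stops and leaves the remaining jobs for the next window. Always returns 0,
# so an unfinished batch reads as "slower", not "failed".
run_pending() {
  local runner="$1"; shift
  local job
  for job in "$@"; do
    if is_done "$job"; then
      continue
    fi
    if "$runner" "$job"; then
      mark_done "$job"
    else
      echo "stopped at ${job}: resume next window" >&2
      return 0
    fi
  done
  return 0
}
```

Running the same command again in the next window skips everything already marked, so the only cost of a tier drop is that the batch spans more windows.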

Principle 2 — set concurrency by redo cost, not the ceiling

Decide how many agents run at once from the cost of redoing a failed run, not by working backward from the ceiling.

Estimate how many minutes and how much allowance a single failed run costs to rerun
Factor in that the more runs are in flight at once, the higher the chance that one failure drags others down
Take the count where "even if all of them fail, I can live with it" as the steady-state baseline
On days with real headroom, add just a little on top of that baseline
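
The steps above can be reduced to a small piece of arithmetic. This is a rough sketch under stated assumptions: baseline_concurrency is a hypothetical helper, both inputs are integers in the same unit (minutes, or percent of a window's allowance), and a floor of 1 (serial) is my own choice. It does not model the cascade risk from the second step, so padding the redo cost a little is one way to account for it.

```shell
#!/usr/bin/env bash
# Sketch: derive the steady-state baseline from redo cost instead of the ceiling.
# baseline_concurrency is an illustrative helper, not an existing tool.

# baseline_concurrency REDO_COST TOLERABLE_LOSS
# Prints the largest N such that N * REDO_COST <= TOLERABLE_LOSS,
# with a floor of 1 so the batch can still run serially.
baseline_concurrency() {
  local cost="$1" budget="$2"
  if [ "$cost" -le 0 ]; then
    echo "redo cost must be a positive integer" >&2
    return 1
  fi
  local n=$(( budget / cost ))
  [ "$n" -ge 1 ] || n=1
  echo "$n"
}

# Example with made-up numbers: one redo costs 4% of the window's allowance,
# and losing 12% in the worst case is acceptable.
#   baseline_concurrency 4 12
```

Note that the ceiling never appears as an input. Raising the tier changes how much headroom sits above the result, not the result itself.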

In my case, this thinking led me to actually cut the parallel count. I spread jobs that had clustered at the same late hour across different times, and cut the total number of daily runs at the source. Even with 5x the ceiling, my steady-state count barely changed. The feeling is close to this: raise the ceiling, but keep the rhythm of daily life.

Guard it in code — step-down by remaining quota

Holding the principle in your head is not enough to protect automation that runs at 3 a.m. I bake a step-down mechanism into the script itself. Here is a simplified shell version of the idea; swap the real remaining-quota read for whatever fits your environment.

#!/usr/bin/env bash
# Step concurrency down according to remaining quota
set -euo pipefail

# remaining_pct: replace with a function returning remaining allowance (0-100)
remaining_pct() {
  # e.g. derive the remaining share from the quota screen or an API response
  echo "${QUOTA_REMAINING_PCT:-100}"
}

decide_concurrency() {
  local pct="$1"
  if   [ "$pct" -ge 60 ]; then echo 3   # 60% or more: run the baseline
  elif [ "$pct" -ge 30 ]; then echo 2   # 30-59%: drop one step
  elif [ "$pct" -ge 10 ]; then echo 1   # 10-29%: fall back to serial
  else                         echo 0   # under 10%: skip this window
  fi
}

PCT="$(remaining_pct)"
# Treat an empty or non-numeric reading as exhausted so a bad read never over-launches
case "$PCT" in
  ''|*[!0-9]*) PCT=0 ;;
esac
N="$(decide_concurrency "$PCT")"

if [ "$N" -eq 0 ]; then
  echo "quota exhausted (${PCT}%): skip this window, defer to next" >&2
  exit 0   # a skip, not a failure. do not create a cascade
fi

echo "remaining ${PCT}% -> concurrency ${N}"
# launch jobs capped at N here

The crux is the exit 0 in the skip branch. If you exit with an error when the allowance runs dry, monitoring and retries react and eat even more quota. Exhaustion is not an anomaly but an expected state, so treating "skip and defer to the next window" as the happy path is what kept cascades from forming.
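
The script leaves the launch step as a comment. One possible way to fill it in, offered as a sketch rather than as the method used here, is to let xargs enforce the cap. run_capped and run_agent_job.sh are hypothetical names. The -P option is available in GNU and BSD xargs but is not part of POSIX, and job names are assumed to contain no whitespace or quotes.

```shell
#!/usr/bin/env bash
# Sketch: launch jobs with at most N running at once.
# run_capped and run_agent_job.sh are illustrative assumptions.

# run_capped N CMD [ARGS...]
# Reads one job name per line from stdin and runs "CMD ARGS job" for each,
# keeping at most N processes in flight. CMD must be an executable, not a
# shell function. The exit status is whatever xargs reports, so the caller
# decides what a failed job means.
run_capped() {
  local n="$1"; shift
  # -n 1 hands each job to its own invocation; -P caps the parallelism.
  xargs -n 1 -P "$n" "$@"
}

# Example, using the N computed by the step-down script:
#   printf '%s\n' job-a job-b job-c | run_capped "$N" ./run_agent_job.sh
```

Because the cap is an argument, stepping down from 3 to 2 to 1 changes only how long the batch takes, not which jobs run.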

What I learned by raising and lowering the tier

After going up to Ultra and later revising my baseline, a few things clicked into place for me.

Right after raising the ceiling you want to add parallelism, but every added lane adds collateral damage on failure. In the end I judged that spreading across times is steadier than raising concurrency.
If your read on remaining quota is loose, the step-down comes too late. Consumption often becomes visible after the fact, so being a little conservative and dropping a step early lost fewer runs.
Once "skip" became the happy path, morning cleanup nearly vanished. Jobs that did not clear overnight quietly resume from where they left off in the next window.
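
The second point, dropping a step early because consumption shows up late, can be sketched as a safety margin subtracted from the reported figure before any decision is made. effective_pct and QUOTA_SAFETY_MARGIN are hypothetical names, and the default margin of 10 points is an arbitrary illustration, not a measured value.

```shell
#!/usr/bin/env bash
# Sketch: read remaining quota conservatively, since usage is often reported late.
# effective_pct and QUOTA_SAFETY_MARGIN are illustrative assumptions.

# effective_pct REPORTED_PCT [MARGIN]
# Prints the reported percentage minus a safety margin, floored at 0.
effective_pct() {
  local reported="$1"
  local margin="${2:-${QUOTA_SAFETY_MARGIN:-10}}"
  local eff=$(( reported - margin ))
  [ "$eff" -ge 0 ] || eff=0
  echo "$eff"
}

# Example: feed the padded value into the step-down decision.
#   N="$(decide_concurrency "$(effective_pct "$(remaining_pct)")")"
```

With the thresholds from the step-down script, a reported 65% and a margin of 10 gives 55, so the run starts at concurrency 2 rather than 3.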

In summary: my steady-state parallel count stayed put even at 5x the ceiling, and once I unclustered the late-hour peak and cut the total run count, cascading failures nearly stopped. The extra headroom only had to absorb peaks, and that was enough.

Your next step

If you currently pin concurrency to your quota tier, try one thing. Re-derive your steady-state count from "everything can fail and I'll cope" rather than by working backward from the ceiling. Then add the one line that treats exhaustion as an exit 0 skip. Even if you drop a tier next month, your nightly automation will only slow down quietly instead of ambushing you in the morning.

I hope it helps the design of anyone else running several processes in parallel.
