
Integrations · 2026-04-29 · Advanced

Building a Stripe Dunning Recovery Pipeline with Antigravity AI Agents — Stop Failed Payments from Becoming Churn

Wire Antigravity AI agents into Stripe Smart Retries to recover failed payments before they turn into churn. Webhooks, idempotency, AI-tailored email copy, Slack escalation, and winback offers — all in one production-ready pattern.



I run a small SaaS, and one month our MRR dropped sharply for no obvious reason. New signups were growing. Engagement was stable. When I finally chased it down inside the Stripe dashboard, the answer was unsexy: expired cards and one-off network declines. Roughly six out of ten cancellations that month traced back to those.

What hit me wasn't the loss of revenue. It was the realization that how I handle failed payments is a lifeline for the product, equal in weight to anything I'd ship as a feature. If I could automate that recovery flow well, it would compound just like new growth.

This article walks through the dunning recovery pipeline I rebuilt afterward, using Antigravity's AI agents at the center. I'll cover capturing failure events, controlling retries, branching email copy by reason, Slack escalation, and a final winback step — all in patterns I run in production today.

Why an AI Agent Belongs at the Center of Dunning

"Dunning" is the polite name for the dance you do with a customer after a charge fails. It looks small from the outside but the decision tree is surprisingly thick:

An expired card, an insufficient-funds decline, and a 3DS authentication failure all need different copy and different timing.
Pushing more retries lifts recovery rate, but past a point it makes customers feel hounded.
If the same customer fails twice in a week, sending another email usually backfires.
Corporate cards and personal cards often need different recipients on the notification side.

I tried to encode all this in a single webhook handler with if branches. It got past 150 lines before I lost track of what was firing when. I threw it out.

The shift that worked was moving the decision logic out of the handler entirely and into an Antigravity sub-agent. The webhook handler shrinks down to "validate, normalize, hand off." The agent owns the policy.

Pipeline Overview

The complete shape:

1. Receive a Stripe webhook (invoice.payment_failed, customer.subscription.updated, etc.)
2. Idempotency check + event normalization (Cloudflare D1 or Postgres)
3. Hand off to the Dunning Agent with normalized failure reason, customer profile, and retry history
4. Decide: keep retrying via Stripe, notify the customer, or trigger a winback offer
5. Execute: send email through Resend, alert internal Slack, update KV/DB state
6. Observe: log everything as structured JSON and chart it in Looker Studio

Steps 3 and 4 are where the agent earns its place. Stripe's built-in Smart Retries are good, but they don't speak your domain — "skip notifications during free trial," "give annual subscribers extra grace," "different copy for customers who came back after canceling once before." Encoding those rules in a prompt + tool contract beats an ever-growing nest of conditionals.


Step 1: Receive Webhooks Idempotently

The webhook handler comes first. Stripe will sometimes deliver the same event multiple times, so deduplicate before you touch any business logic.

// app/api/webhook/stripe-billing/route.ts
import Stripe from "stripe";
import { getCloudflareContext } from "@opennextjs/cloudflare";
 
// Set as Cloudflare Worker secrets:
//   STRIPE_SECRET_KEY=sk_live_... (or sk_test_...)
//   STRIPE_WEBHOOK_BILLING_SECRET=whsec_...
const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!, {
  apiVersion: "2025-08-27.basil",
});
 
export async function POST(req: Request) {
  const sig = req.headers.get("stripe-signature");
  if (!sig) return new Response("missing signature", { status: 400 });
 
  const body = await req.text();
  let event: Stripe.Event;
  try {
    event = await stripe.webhooks.constructEventAsync(
      body,
      sig,
      process.env.STRIPE_WEBHOOK_BILLING_SECRET!,
      undefined,
      Stripe.createSubtleCryptoProvider() // required on Cloudflare Workers
    );
  } catch (err) {
    console.error("webhook signature verification failed", err);
    return new Response("invalid signature", { status: 400 });
  }
 
  // Idempotency: event.id is the dedup key
  const { env } = getCloudflareContext();
  const seen = await env.BILLING_KV.get(`evt:${event.id}`);
  if (seen) {
    return new Response("duplicate", { status: 200 });
  }
  await env.BILLING_KV.put(`evt:${event.id}`, "1", { expirationTtl: 60 * 60 * 24 * 7 });
 
  if (event.type === "invoice.payment_failed") {
    await enqueueDunning(event.data.object as Stripe.Invoice, env);
  }
 
  return new Response("ok", { status: 200 });
}

The handler makes no business decisions. If your endpoint times out or returns a non-2xx status, Stripe redelivers the event with backoff for up to three days, so a slow handler turns one failure into a stream of duplicates. Aim for under 100 ms: validate, dedupe, enqueue, return.

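Expected behavior: if the same evt_xxx arrives five times in sequence, enqueueDunning runs once. Two caveats apply. KV is eventually consistent and the get-then-put is not atomic, so near-simultaneous duplicates can both pass the check. And because the key is written before enqueueDunning runs, a failed enqueue is treated as a duplicate on redelivery and never retried. If you need a strict guarantee, keep the dedup record behind a unique key in D1 or Postgres.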
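The ordering problem in the handler's dedup check can be sketched without any Cloudflare dependency. This is a minimal illustration, assuming a store that offers an atomic insert-if-absent (in D1 that could be an INSERT with ON CONFLICT DO NOTHING on a table keyed by event id). ClaimStore, InMemoryClaimStore, and processOnce are hypothetical names, the in-memory Set only stands in for the database, and the code is synchronous for brevity where a real store would be async.

```typescript
// Hypothetical store contract: claim() must be atomic (insert-if-absent).
interface ClaimStore {
  claim(eventId: string): boolean; // true only for the first caller
  release(eventId: string): void;  // undo a claim so a redelivery can reprocess
}

// Stand-in for a database table with a unique key on the event id.
class InMemoryClaimStore implements ClaimStore {
  private readonly seen = new Set<string>();

  claim(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    return true;
  }

  release(eventId: string): void {
    this.seen.delete(eventId);
  }
}

type Outcome = "processed" | "duplicate";

// Claim first, do the work, and release the claim if the work throws,
// so a redelivery of the same event is not swallowed as a duplicate.
function processOnce(store: ClaimStore, eventId: string, work: () => void): Outcome {
  if (!store.claim(eventId)) return "duplicate";
  try {
    work();
    return "processed";
  } catch (err) {
    store.release(eventId);
    throw err;
  }
}

// Usage: five deliveries of the same event run the work once.
const claimStore = new InMemoryClaimStore();
let runCount = 0;
for (let i = 0; i < 5; i++) {
  processOnce(claimStore, "evt_123", () => { runCount += 1; });
}
console.log(runCount); // 1
```

Releasing the claim on failure trades a possible double run for never losing an event, which is usually the safer side for billing as long as the downstream work is itself idempotent.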

Step 2: Normalize the Failure into Domain Vocabulary

The Stripe Invoice object is information-dense and not friendly to feed to an AI agent raw. I run it through a normalization layer.

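// lib/dunning/normalize.ts
import type Stripe from "stripe";
// Assumption: a shared module (hypothetical path) exports the Stripe client created in route.ts.
import { stripe } from "../stripe";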
export type DunningContext = {
  customerId: string;
  customerEmail: string;
  amountDue: number; // smallest unit (JPY = yen, USD = cents)
  currency: string;
  failureReason: "card_expired" | "insufficient_funds" | "authentication_required" | "do_not_honor" | "unknown";
  attemptCount: number;
  planTier: "basic" | "pro" | "team";
  isAnnual: boolean;
  customerSegment: "trial" | "new" | "loyal" | "winback";
  lastSuccessfulPaymentAt: number | null; // unix seconds
  totalLifetimeValueJpy: number;
};
 
export async function buildDunningContext(invoice: Stripe.Invoice, env: Env): Promise<DunningContext> {
  const customer = await stripe.customers.retrieve(invoice.customer as string);
  const subs = await stripe.subscriptions.list({ customer: invoice.customer as string, limit: 1 });
  const sub = subs.data[0];
 
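  // The Invoice object has no last_payment_error field. The card decline code lives on the
  // PaymentIntent behind the invoice (last_payment_error.decline_code). fetchDeclineCode is a
  // hypothetical helper, not shown here; how you reach the PaymentIntent depends on your API version.
  const code = await fetchDeclineCode(invoice);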
  const failureReason = mapDeclineCode(code);
 
  if (!failureReason) {
    console.warn("unmapped decline_code", code, "invoice", invoice.id);
  }
 
  return {
    customerId: invoice.customer as string,
    customerEmail: (customer as Stripe.Customer).email ?? "",
    amountDue: invoice.amount_due,
    currency: invoice.currency,
    failureReason: failureReason ?? "unknown",
    attemptCount: invoice.attempt_count,
    planTier: detectPlanTier(sub),
    // subs.data can be empty, so guard against a missing subscription or item
    isAnnual: sub?.items.data[0]?.price.recurring?.interval === "year",
    customerSegment: await detectSegment(invoice.customer as string, env),
    lastSuccessfulPaymentAt: await getLastSuccessfulPaymentAt(invoice.customer as string, env),
    totalLifetimeValueJpy: await getLtvJpy(invoice.customer as string, env),
  };
}
 
function mapDeclineCode(code?: string | null): DunningContext["failureReason"] | undefined {
  switch (code) {
    case "expired_card": return "card_expired";
    case "insufficient_funds": return "insufficient_funds";
    case "authentication_required": return "authentication_required";
    case "do_not_honor": return "do_not_honor";
    default: return undefined;
  }
}

Why translate Stripe codes into your own vocabulary? Stripe's set of decline_code values evolves. Pinning the agent prompt directly to those strings makes it brittle. Cap your domain to four or five reasons, send anything new to unknown, and alert on it. That single discipline keeps the agent stable for years.

If you're already deep in Stripe webhook plumbing, my full SaaS pattern — including metering and invoice issuance — is in Antigravity x Stripe Full-Stack SaaS Deployment Guide. Reading both side by side gives a clearer picture of how dunning fits into the broader billing surface.

Step 3: Define the Dunning Agent in Antigravity

This is where the AI agent comes in. Drop a file at agents/dunning-orchestrator.md and make the tools and policy explicit.

# Dunning Orchestrator Agent
 
## Role
Receive a Stripe payment failure event and choose exactly one recovery action that protects both customer experience and revenue.
 
## Available tools
- send_email(template_id, customer_email, variables): send via Resend
- post_slack(channel, blocks): post to Slack
- update_customer_state(customer_id, state, note): write to internal KV
- request_stripe_smart_retry(invoice_id, schedule): reschedule Stripe Smart Retries
- offer_winback_discount(customer_id, percent_off, valid_days): issue coupon + send email
- escalate_to_human(reason): hand off to support
 
## Decision policy
1. failureReason = "card_expired" -> send card update link (template "card_update_request")
2. failureReason = "insufficient_funds" AND attemptCount <= 2 -> let Stripe Smart Retries continue + send a soft reminder
3. attemptCount >= 3 AND planTier in ["pro","team"] AND totalLifetimeValueJpy > 30000 -> escalate_to_human
4. customerSegment = "loyal" AND failureReason != "card_expired" -> offer_winback_discount(percent_off=20, valid_days=14)
5. unknown cases -> escalate_to_human + Slack notification
 
## Hard constraints
- Never send more than one email to the same customer within 24 hours.
- Always check that customer_email is non-empty before calling send_email.
- offer_winback_discount can fire at most once per 90 days for the same customer.

The "hard constraints" section is the part I refuse to compromise on. AI agents are flexible, but flexibility without guardrails turns into surprise behavior in production. The single line "never send more than one email per 24 hours" was the difference between launching nervously and launching with confidence on day one.
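Prompt-level constraints are easier to trust when the same rules also exist as plain code that runs before any tool executes. The sketch below is one way to express the three hard constraints as a pure function. ProposedAction, ContactHistory, and violatedConstraint are hypothetical names, and it assumes you record the time of the last email and last winback offer per customer somewhere you can read back.

```typescript
type ProposedAction =
  | "send_email"
  | "offer_winback_discount"
  | "post_slack"
  | "update_customer_state"
  | "request_stripe_smart_retry"
  | "escalate_to_human";

interface ContactHistory {
  lastEmailAt: number | null;   // unix seconds of the last email to this customer
  lastWinbackAt: number | null; // unix seconds of the last winback offer
}

const DAY_SECONDS = 24 * 60 * 60;

// Returns the name of the violated constraint, or null if the action may proceed.
function violatedConstraint(
  action: ProposedAction,
  customerEmail: string,
  contact: ContactHistory,
  nowSeconds: number,
): string | null {
  // Only these two actions put an email in the customer's inbox.
  const sendsEmail = action === "send_email" || action === "offer_winback_discount";
  if (!sendsEmail) return null;

  if (customerEmail.trim() === "") return "missing_email";

  if (contact.lastEmailAt !== null && nowSeconds - contact.lastEmailAt < DAY_SECONDS) {
    return "email_within_24h";
  }
  if (
    action === "offer_winback_discount" &&
    contact.lastWinbackAt !== null &&
    nowSeconds - contact.lastWinbackAt < 90 * DAY_SECONDS
  ) {
    return "winback_within_90d";
  }
  return null;
}

const nowSeconds = 1800000000;
console.log(
  violatedConstraint("send_email", "a@example.com", { lastEmailAt: nowSeconds - 3600, lastWinbackAt: null }, nowSeconds),
); // "email_within_24h"
```

Returning the constraint name rather than a boolean makes the skip reason loggable, which helps when you later ask why a customer received nothing.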

Step 4: Invoke the Agent and Handle Results

Don't call the agent directly from the webhook handler. Push the work onto a queue (Cloudflare Queues or D1 + Cron) because agent calls can take seconds.

// lib/dunning/run-agent.ts
import { GoogleGenAI, Type, type FunctionDeclaration } from "@google/genai";

const tools: FunctionDeclaration[] = [
  {
    name: "send_email",
    description: "Send an email using a fixed template id and variables",
    parameters: {
      // The SDK's Schema type expects the Type enum rather than lowercase JSON Schema strings
      type: Type.OBJECT,
      properties: {
        template_id: { type: Type.STRING, enum: ["card_update_request", "soft_payment_reminder", "winback_offer", "final_warning"] },
        customer_email: { type: Type.STRING },
        variables: { type: Type.OBJECT },
      },
      required: ["template_id", "customer_email"],
    },
  },
  // ... declare the other tools the same way
];
 
export async function runDunningAgent(ctx: DunningContext, env: Env) {
  // Hard constraints (always run before the agent decision)
  if (!ctx.customerEmail) {
    await postSlack(env, `[dunning] customer ${ctx.customerId} has no email - skipping`);
    return { action: "skipped", reason: "no_email" };
  }
 
  const recent = await env.BILLING_KV.get(`mail-throttle:${ctx.customerId}`);
  if (recent) {
    return { action: "skipped", reason: "throttled" };
  }
 
  const ai = new GoogleGenAI({ apiKey: env.GEMINI_API_KEY });
  const response = await ai.models.generateContent({
    model: "gemini-3-pro",
    contents: [
      { role: "user", parts: [{ text: buildPrompt(ctx) }] },
    ],
    config: {
      tools: [{ functionDeclarations: tools }],
      systemInstruction: await loadAgentMd(env, "dunning-orchestrator.md"),
      temperature: 0.2, // tighten determinism
    },
  });
 
  // Only the first function call is executed, which enforces one action per invocation.
  const call = response.functionCalls?.[0];
  if (!call?.name) {
    await postSlack(env, `[dunning] agent returned no action for ${ctx.customerId}`);
    return { action: "no_action" };
  }

  const result = await dispatchTool(call.name, call.args ?? {}, ctx, env);
  // Start the 24h throttle only when the action actually emailed the customer.
  if (call.name === "send_email" || call.name === "offer_winback_discount") {
    await env.BILLING_KV.put(
      `mail-throttle:${ctx.customerId}`,
      "1",
      { expirationTtl: 60 * 60 * 24 } // 24h throttle
    );
  }
 
  return { action: call.name, ...result };
}

The deliberate constraint here is one action per agent call. If the agent could fire "send email + Slack alert + issue coupon" in the same response, you immediately face partial-failure questions: what state is the customer in if the email succeeded but the coupon issue failed? Limit it to one step, then call again if you need a follow-up. Rollback design becomes trivial.

Expected behavior: within 30 seconds of a failed charge, the agent picks one action. When that action is an email, Resend delivers it and a 24h throttle entry lands in KV.

Step 5: Let the AI Tailor Email Copy

I don't ship fully static templates. Inside Resend + React Email, the agent gets a small slot — usually a "hero paragraph" — that it adapts to the situation while the rest of the structure stays fixed.

// emails/CardUpdateRequest.tsx
import { Body, Container, Heading, Text, Button, Html } from "@react-email/components";
 
export default function CardUpdateRequest({
  firstName,
  amountFormatted,
  updateLink,
  empathyParagraph, // generated by Gemini for the specific context
}: { firstName: string; amountFormatted: string; updateLink: string; empathyParagraph: string }) {
  return (
    <Html>
      <Body>
        <Container>
          <Heading as="h2">Hi {firstName}, could you take a moment to update your card?</Heading>
          <Text>{empathyParagraph}</Text>
          <Text>The amount we tried to charge was {amountFormatted}. The button below opens a secure form where you can add a new card.</Text>
          <Button href={updateLink}>Update card</Button>
          <Text>If you've already taken care of this, you can ignore this email.</Text>
        </Container>
      </Body>
    </Html>
  );
}

The trick is keeping the AI's surface area small. Asking it to write the entire email creates inconsistency; asking it for nothing makes the message feel automated, and automated-feeling messages go unread. A single paragraph it can warm up, based on plan tier, segment, and time since last successful payment, is the sweet spot.

For the broader emailing pattern (delivery retries, ordering, sandbox routing), see Building a React Email Pipeline with Antigravity and Resend. Sharing template infrastructure between dunning and routine product emails dramatically lowers operational load.

Step 6: Slack Escalation, Designed for Speed

The agent must hand off cleanly when it isn't sure. The Slack message is the contract for that handoff, and it has to give the on-call person enough to decide in three seconds.

// lib/dunning/slack.ts
export function buildEscalationBlocks(ctx: DunningContext, reason: string) {
  return [
    {
      type: "header",
      text: { type: "plain_text", text: ":rotating_light: Dunning escalation" },
    },
    {
      type: "section",
      fields: [
        { type: "mrkdwn", text: `*Customer*\n<https://dashboard.stripe.com/customers/${ctx.customerId}|${ctx.customerEmail}>` },
        { type: "mrkdwn", text: `*Plan*\n${ctx.planTier} (${ctx.isAnnual ? "annual" : "monthly"})` },
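        // Use the same formatAmount helper as dispatch.ts; dividing by 100 is wrong for zero-decimal currencies such as JPY
        { type: "mrkdwn", text: `*Amount*\n${formatAmount(ctx.amountDue, ctx.currency)}` },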
        { type: "mrkdwn", text: `*Reason*\n${ctx.failureReason}` },
        { type: "mrkdwn", text: `*Attempt*\n${ctx.attemptCount}` },
        { type: "mrkdwn", text: `*LTV (JPY)*\n${ctx.totalLifetimeValueJpy.toLocaleString()}` },
      ],
    },
    {
      type: "section",
      text: { type: "mrkdwn", text: `*Why escalated*\n${reason}` },
    },
    {
      type: "actions",
      elements: [
        { type: "button", text: { type: "plain_text", text: "Send winback offer" }, action_id: "dunning_winback" },
        { type: "button", text: { type: "plain_text", text: "Mark as lost" }, action_id: "dunning_mark_lost", style: "danger" },
      ],
    },
  ];
}

I've redesigned this Slack block twice because my early version "just notified that something failed." That's useless — the on-call ends up opening the Stripe dashboard anyway. Including the LTV, plan, reason, and attempt count in the same block is what made it actually decision-ready.
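The Slack block and the email template both need a display amount. Stripe reports amounts in the currency's smallest unit, which for zero-decimal currencies such as JPY is already the whole unit, so a blanket division by 100 would understate yen amounts by a factor of 100. The article references a formatAmount helper in dispatch.ts without showing it, so what follows is only a sketch of how such a helper might look. The ZERO_DECIMAL set is deliberately abbreviated (an assumption to extend from Stripe's currency documentation), toMajorUnits is a hypothetical name, and three-decimal currencies are not handled.

```typescript
// Abbreviated set of zero-decimal currencies; extend it from Stripe's documentation.
const ZERO_DECIMAL = new Set(["jpy", "krw", "vnd", "clp"]);

// Convert a smallest-unit amount into the currency's major unit.
function toMajorUnits(amount: number, currency: string): number {
  return ZERO_DECIMAL.has(currency.toLowerCase()) ? amount : amount / 100;
}

// Format for display. The exact symbol and grouping come from the runtime's locale data.
function formatAmount(amount: number, currency: string): string {
  return new Intl.NumberFormat("en-US", {
    style: "currency",
    currency: currency.toUpperCase(),
  }).format(toMajorUnits(amount, currency));
}

console.log(toMajorUnits(1200, "jpy")); // 1200
console.log(toMajorUnits(1999, "usd")); // 19.99
console.log(formatAmount(1999, "usd"));
```

Keeping the unit conversion separate from the formatting makes the part that can silently go wrong (the division) trivially unit-testable.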

Common Pitfalls I Have Hit Personally

A few I rewrote my way out of:

1. Doubling up Stripe Smart Retries with your own retry loop

Stripe runs Smart Retries on its own schedule, and both the number of attempts and the retry window are Dashboard settings. If you don't realize that and add an app-side cron retry on top, you can charge the customer's card twice as often as you intended for one invoice. Always check Settings -> Billing -> Subscriptions -> Retries first, and reserve any app-side retry for the moment Stripe gives up.

2. Listening to `invoice.payment_failed` only

There is more than one failure shape. Don't merge them.

invoice.payment_failed: automatic collection failed
invoice.payment_action_required: customer needs to complete 3DS
customer.subscription.updated (status: past_due to unpaid): all retries exhausted
customer.subscription.deleted: fully canceled

3DS-pending customers should not get "please update your card." Different reason, different copy. Branch them at the webhook layer.
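Branching at the webhook layer can be as small as one routing function. The sketch below maps the four event shapes listed above onto separate lanes. Lane, BillingEventLike, and routeBillingEvent are hypothetical names, and the sketch assumes you have already pulled the old and new subscription status out of the event (for updated events Stripe carries the prior values under data.previous_attributes).

```typescript
type Lane = "payment_failed" | "action_required" | "retries_exhausted" | "canceled" | "ignore";

interface BillingEventLike {
  type: string;
  // Only meaningful for customer.subscription.updated
  previousStatus?: string;
  currentStatus?: string;
}

// Decide which downstream flow an event belongs to. Unlisted events are ignored.
function routeBillingEvent(e: BillingEventLike): Lane {
  switch (e.type) {
    case "invoice.payment_failed":
      return "payment_failed";
    case "invoice.payment_action_required":
      return "action_required"; // 3DS: guide the customer to verify, not to update the card
    case "customer.subscription.updated":
      return e.previousStatus === "past_due" && e.currentStatus === "unpaid"
        ? "retries_exhausted"
        : "ignore";
    case "customer.subscription.deleted":
      return "canceled";
    default:
      return "ignore";
  }
}

console.log(routeBillingEvent({ type: "invoice.payment_action_required" })); // "action_required"
```

Each lane can then build its own context and pick its own templates, so a 3DS event never reaches the card-update copy by accident.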

3. Sending live emails from the test environment

Inspect STRIPE_SECRET_KEY for the sk_test_ vs sk_live_ prefix and route Resend differently. A safe default: if (process.env.STRIPE_SECRET_KEY?.startsWith("sk_test_")) { /* sandbox */ } rerouting to an internal-only inbox. I learned this the hard way after a single test event leaked to a real customer.
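A variant of that check that fails closed is sketched below: real customers are emailed only when the key is positively identified as live, and anything else (test keys, a missing variable, an unexpected prefix) goes to the internal inbox. resolveRecipient and the sandbox address are hypothetical, and note that under this rule a restricted live key with a different prefix would also be routed to the sandbox unless you extend the check.

```typescript
// Return the address an email should actually be sent to.
// Only a key with the sk_live_ prefix unlocks delivery to the real customer.
function resolveRecipient(
  stripeSecretKey: string | undefined,
  customerEmail: string,
  sandboxInbox: string,
): string {
  const isLive = stripeSecretKey?.startsWith("sk_live_") === true;
  return isLive ? customerEmail : sandboxInbox;
}

console.log(resolveRecipient("sk_test_abc", "customer@example.com", "billing-sandbox@example.com"));
// "billing-sandbox@example.com"
```

Testing for the live prefix rather than the test prefix means a misconfigured environment degrades to "nobody outside the team gets mail" instead of the reverse.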

4. The "reason: unknown" swamp

Run SELECT failure_reason, count(*) FROM dunning_events GROUP BY 1 weekly. The unknown bucket grows quietly as Stripe adds new decline codes. Once it crosses 10% of total failures, it's time to extend mapDeclineCode.
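Once that query's result is in hand, the 10% rule is a two-line calculation. A small sketch follows, assuming the rows have been collapsed into a reason-to-count map. unknownShare and shouldExtendMapping are hypothetical names and the counts are made up for illustration.

```typescript
// Share of failures whose reason is "unknown" (0 when there are no failures).
function unknownShare(counts: Record<string, number>): number {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  return total === 0 ? 0 : (counts["unknown"] ?? 0) / total;
}

// True once the unknown bucket exceeds the threshold (10% by default).
function shouldExtendMapping(counts: Record<string, number>, threshold = 0.1): boolean {
  return unknownShare(counts) > threshold;
}

// Illustrative weekly counts, not real data.
const weeklyCounts = { card_expired: 12, insufficient_funds: 6, unknown: 2 };
console.log(unknownShare(weeklyCounts));        // 0.1
console.log(shouldExtendMapping(weeklyCounts)); // false (exactly 10% has not crossed it)
```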

5. Coupon overuse

Firing a winback coupon on every failure trains your most willing-to-pay customers to wait for discounts. Gate offer_winback_discount on three things together: segment, LTV, and last payment date. After I added customerSegment === "loyal" && totalLifetimeValueJpy > 30000, my monthly coupon issuance dropped to one-fifth without hurting recovery rate.

Observability I Always Set Up

Five charts I keep on permanent display in Looker Studio:

Failure events per day, broken down by reason
Recovery rate (payment_failed -> payment_succeeded within 14 days)
Click-through rate from email to card update form
Slack escalations and what fraction got resolved within 24h
Share of unknown failure reasons

Recovery rate is the one metric I check first. The agent rollout in my own product moved this from 38% to 61% in the first month. When I did the math, that recovery delta was as valuable as a sizable chunk of new MRR — without any acquisition spend.
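For reference, the recovery-rate definition used on that chart (a failure followed by a success within 14 days) can be written as a short pure function. InvoiceOutcome and recoveryRate are hypothetical names. The sketch assumes one record per failed invoice with the time of its first failure and, if any, its eventual success. To avoid understating the rate, feed it only invoices whose 14-day window has already closed.

```typescript
interface InvoiceOutcome {
  failedAt: number;            // unix seconds of the first failure
  succeededAt: number | null;  // unix seconds of the eventual success, if any
}

const RECOVERY_WINDOW_SECONDS = 14 * 24 * 60 * 60;

// Fraction of failed invoices that were paid within the 14-day window.
function recoveryRate(outcomes: InvoiceOutcome[]): number {
  if (outcomes.length === 0) return 0;
  const recovered = outcomes.filter(
    (o) =>
      o.succeededAt !== null &&
      o.succeededAt >= o.failedAt &&
      o.succeededAt - o.failedAt <= RECOVERY_WINDOW_SECONDS,
  ).length;
  return recovered / outcomes.length;
}

const DAY = 24 * 60 * 60;
console.log(
  recoveryRate([
    { failedAt: 0, succeededAt: 2 * DAY },  // recovered
    { failedAt: 0, succeededAt: 14 * DAY }, // recovered, on the boundary
    { failedAt: 0, succeededAt: 20 * DAY }, // paid, but too late to count
    { failedAt: 0, succeededAt: null },     // never paid
  ]),
); // 0.5
```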

Wiring the Tool Dispatcher

The tool dispatcher is the place where agent intent becomes side effects. Keep it small and well-tested, because if any of these branches is wrong you'll discover it through customers, not unit tests.

// lib/dunning/dispatch.ts
type Args = Record<string, unknown>;
 
export async function dispatchTool(name: string, args: Args, ctx: DunningContext, env: Env) {
  switch (name) {
    case "send_email": {
      const { template_id, customer_email, variables } = args as {
        template_id: string;
        customer_email: string;
        variables?: Record<string, unknown>;
      };
      // Hard-validate one more time at the dispatch boundary
      if (!customer_email || customer_email !== ctx.customerEmail) {
        throw new Error(`email mismatch: agent=${customer_email} ctx=${ctx.customerEmail}`);
      }
      const empathy = await composeEmpathyParagraph(ctx, env);
      return await sendResendEmail(env, template_id, customer_email, {
        // Spread agent-supplied variables first so they cannot override the trusted fields below.
        ...variables,
        firstName: ctx.customerSegment === "trial" ? "there" : (variables?.firstName || "there"),
        amountFormatted: formatAmount(ctx.amountDue, ctx.currency),
        updateLink: await mintCardUpdateLink(ctx.customerId, env),
        empathyParagraph: empathy,
      });
    }
    case "post_slack":
      return await postSlackBlocks(env, (args as any).channel, (args as any).blocks);
    case "update_customer_state":
      return await env.BILLING_KV.put(
        `cust:${(args as any).customer_id}`,
        JSON.stringify({ state: (args as any).state, note: (args as any).note, updatedAt: Date.now() })
      );
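    case "request_stripe_smart_retry":
      // Note: this only records the requested schedule in invoice metadata.
      // It does not change Stripe's own retry timing; something else has to act on the metadata.
      return await stripe.invoices.update((args as any).invoice_id, {
        metadata: { dunning_reschedule: (args as any).schedule },
      });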
    case "offer_winback_discount":
      return await issueWinbackCoupon(ctx, args as any, env);
    case "escalate_to_human":
      return await escalate(ctx, (args as any).reason, env);
    default:
      throw new Error(`unknown tool: ${name}`);
  }
}

The double-validation on customer_email exists because I want to fail loudly if the agent ever fabricates a different address. It hasn't happened in production, but the check costs nothing and the alternative — a misdirected email — costs trust.

Expected behavior: any tool the agent picks resolves to a single, named, traceable side effect, and an unknown tool name throws instead of silently passing.

Composing the Empathy Paragraph

Empathy paragraphs are the one slot of natural language I let the agent generate per email. Keeping the prompt narrow keeps the output reliable.

// lib/dunning/empathy.ts
import { GoogleGenAI } from "@google/genai";
import type { DunningContext } from "./normalize";

export async function composeEmpathyParagraph(ctx: DunningContext, env: Env) {
  const ai = new GoogleGenAI({ apiKey: env.GEMINI_API_KEY });
  const segmentTone =
    ctx.customerSegment === "loyal" ? "thank them for the years they've been with us" :
    ctx.customerSegment === "trial" ? "lower the pressure; this is their first billing experience" :
    "stay friendly and concise";
 
  const reasonHint =
    ctx.failureReason === "card_expired" ? "Mention that cards expire and it happens to everyone." :
    ctx.failureReason === "insufficient_funds" ? "Avoid any phrasing that sounds judgmental about funds." :
    ctx.failureReason === "authentication_required" ? "Explain that the bank needs an extra confirmation step." :
    "Keep the cause vague; we don't have certainty about what happened.";
 
  const res = await ai.models.generateContent({
    model: "gemini-3-flash",
    contents: [{ role: "user", parts: [{ text:
      `Write one short paragraph (40-70 words, English, second person) for an email about a failed charge. ` +
      `Tone: ${segmentTone}. Hint: ${reasonHint}. Avoid technical Stripe terminology. ` +
      `End with a soft action prompt to update payment.`
    }]}],
    config: { temperature: 0.6 },
  });
  return res.text?.trim() ?? "We had trouble processing your latest payment. Could you take a quick moment to update your card details?";
}

I use gemini-3-flash here intentionally: cheaper, faster, and the output quality is more than enough for a 60-word paragraph. The slow model belongs in the orchestration step, not in copywriting.

Migrating from a Manual Process

If you're starting from a fully manual flow, here's the migration order I'd recommend, distilled from my own missteps:

Week 1: deploy only the webhook handler and event normalization. Log everything, do nothing else. Two purposes: confirm your idempotency works under real Stripe traffic, and start the data series for recovery-rate baselining.
Week 2: introduce the agent in dry-run mode. Have it produce its decision and the would-be tool call, but route every action to Slack instead of executing. Watch for at least 50 events. Reject the rollout if more than 5% of decisions look wrong to you.
Week 3: enable email and KV updates only. Hold winback coupons and human escalations on Slack approval. This is where you'll find your unknown rate, your throttle hits, and the edge cases your prompt missed.
Week 4: turn on the rest, including the winback coupon arm. Add a kill switch (a single KV key the handler reads) so you can pause the agent in five seconds if something looks wrong.

Don't compress this. Fast failure recovery feels like the kind of code you'd ship in a weekend, but the cost of a misfire is real human distrust. I lost a few weeks' worth of MRR to mistakes during my own initial rollout because I skipped the week 3 stage.
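The week 2 gate (at least 50 reviewed events, reject above 5% wrong) is simple enough to encode so the decision is not made by feel. A minimal sketch, assuming each dry-run decision posted to Slack gets a human thumbs-up or thumbs-down that you record. ReviewedDecision and dryRunVerdict are hypothetical names.

```typescript
interface ReviewedDecision {
  eventId: string;
  agentAction: string; // the tool the agent would have called
  looksRight: boolean; // the human reviewer's judgment
}

type Verdict = "keep_watching" | "reject" | "proceed";

// Apply the rollout gate: enough samples first, then the error-share threshold.
function dryRunVerdict(
  reviews: ReviewedDecision[],
  minEvents = 50,
  maxWrongShare = 0.05,
): Verdict {
  if (reviews.length < minEvents) return "keep_watching";
  const wrong = reviews.filter((r) => !r.looksRight).length;
  return wrong / reviews.length > maxWrongShare ? "reject" : "proceed";
}

// Helper for the usage example: `total` reviews, of which the first `wrong` look wrong.
function sampleReviews(total: number, wrong: number): ReviewedDecision[] {
  return Array.from({ length: total }, (_, i) => ({
    eventId: `evt_${i}`,
    agentAction: "send_email",
    looksRight: i >= wrong,
  }));
}

console.log(dryRunVerdict(sampleReviews(49, 0))); // "keep_watching"
console.log(dryRunVerdict(sampleReviews(50, 3))); // "reject" (6% wrong)
console.log(dryRunVerdict(sampleReviews(50, 2))); // "proceed" (4% wrong)
```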

What Actually Changed in My Numbers

Numbers help, so let me share what shifted in the product I run, comparing the period before this pipeline with its first two months in production.

Recovery rate (failed → succeeded within 14 days) moved from 38% to 61%, then settled at around 58% after the novelty of the email faded.
Average time-to-recovery dropped from 6.4 days to 1.9 days, mostly because card-update emails now go out within minutes instead of whenever I opened the dashboard.
Customer support tickets containing the word "billing" dropped by roughly half. Most of the long thread cleanups I used to handle by hand simply stopped happening because the agent caught the issue first.
Coupon issuance dropped to one-fifth after I gated winback offers on segment + LTV. Recovery rate on those eligible cohorts stayed flat, which told me the constraint was healthy.

These aren't huge SaaS numbers and they aren't meant to impress. They're mid-product, mid-budget numbers from one indie developer's account. The point is that the levers exist even at small scale, and the agent pattern compresses the work to the point where one person can run the recovery surface that small teams used to staff.

A side benefit I didn't expect: writing the agent prompt forced me to articulate my own customer policy in plain language. I had never written down "we don't pressure trial users" or "loyal customers get a winback offer up to once a quarter." The agent prompt became a living policy document that my future self can edit in five minutes.

Testing Strategy

A few patterns I now consider non-negotiable:

Replay tests: keep a folder of real Stripe webhook payloads (with PII redacted) and replay them through your handler in CI. Easier to maintain than synthetic fixtures, and far more honest.
Dispatcher snapshots: for each tool branch, snapshot the side-effect arguments. Diffing the snapshot is faster than reading through assertions.
Stub the agent: in unit tests, replace the agent call with a fixed function-call response. The agent itself has its own evaluation loop separate from the pipeline tests.
End-to-end smoke: every deploy, fire one synthetic invoice.payment_failed against a sentinel customer and confirm the test inbox receives the right template within 60 seconds. This single guardrail caught a regression for me when I changed Resend SDK versions.
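For the replay fixtures, redaction is worth automating so that no raw payload is ever committed by hand. The sketch below walks a JSON payload and blanks values under a set of key names. The key list is an assumption to tune against your own payloads (a broad key such as name will also blank non-personal fields like product names), and redactPii is a hypothetical helper.

```typescript
// Keys whose values are replaced. An assumption; adjust to the fields you actually receive.
const PII_KEYS = new Set([
  "email", "name", "phone", "address",
  "customer_email", "customer_name", "customer_phone", "customer_address",
  "receipt_email",
]);

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Return a deep copy of the payload with PII values replaced by a placeholder.
function redactPii(value: Json): Json {
  if (Array.isArray(value)) return value.map((item) => redactPii(item));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, inner] of Object.entries(value)) {
      // Keep nulls as null so fixtures still reflect which fields were empty.
      out[key] = PII_KEYS.has(key) && inner !== null ? "[redacted]" : redactPii(inner);
    }
    return out;
  }
  return value;
}

const redacted = redactPii({
  id: "evt_123",
  data: { object: { customer_email: "jane@example.com", amount_due: 1200, customer_phone: null } },
});
console.log(JSON.stringify(redacted));
// {"id":"evt_123","data":{"object":{"customer_email":"[redacted]","amount_due":1200,"customer_phone":null}}}
```

Because ids, amounts, and event types are left untouched, the redacted payload still exercises the same handler branches as the original, though signature verification has to be bypassed or re-signed in tests since the body has changed.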

For broader testing strategy on agent-driven pipelines, I keep coming back to the patterns from Antigravity AgentKit 2.0 Unit Testing with Vitest Guide — most of those techniques apply almost unchanged here.

Beyond Dunning: One Agent Pattern, Many Lifecycles

Once the dunning agent works, the same shape generalizes:

Trial-end reminders: trigger on customer.subscription.trial_will_end, classify usage as "engaged" or "lapsed," and let the agent write the right nudge.
Upgrade prompts: when usage approaches a tier ceiling, draft the upgrade pitch contextually.
Cancellation winback: catch customer.subscription.deleted and queue a tasteful winback the next day, informed by what the customer actually used.

I'm building the next layer right now: copying dunning-orchestrator.md into lifecycle-orchestrator.md and growing it into a single agent system that owns the entire SaaS revenue cycle. The leverage of one well-shaped agent is much higher than I expected when I started.

For metering on the other side of billing, I covered it in Antigravity AI Agents and Stripe Meter Events for Usage-Based Billing. And for revenue strategy across the funnel, Antigravity Subscription Revenue Optimization (Advanced) pairs nicely with this pipeline.

Match the Message to the Decline Reason

Handing judgment to an agent only works if you, the human, hold a clear opinion about what each reason deserves. Here is the table I keep underneath the agent's Decision policy. Retry timing stays with Stripe Smart Retries; what I manage is when, through which channel, and in what tone a human-readable message reaches the customer.

| Decline reason | Likely cause | First touch | Channel | Tone |
|---|---|---|---|---|
| card_expired | Customer forgot to update | Immediate | Email + in-app banner | Matter-of-fact, update link first |
| insufficient_funds | Temporary cash-flow | Wait until next business day | Email only | Gentle, state the retry date |
| authentication_required (3DS) | Pending verification | Immediate | In-app + email | Guide the action, never say "update card" |
| do_not_honor / network decline | Issuing bank's call | After the 2nd failure | Email + human | Assume it is not the customer's fault |
| unknown (unmapped) | Unclassified | Escalate to Slack at once | Human | No template; a person reviews |

The single most important row is insufficient_funds: do not contact immediately. Emailing someone the same hour their card bounced, just before payday, rushes a customer who still intends to pay and sours the relationship. I sent those notices instantly at first and watched churn tick up. Simply staying quiet until the next business day lifted recovery in that category noticeably.
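The "wait until the next business day" rule needs a concrete send time. One way to compute it is sketched below. It works in UTC, treats only Saturday and Sunday as non-business days, and picks 09:00 as the send hour. All three are simplifying assumptions (public holidays and the customer's own time zone are ignored), and nextBusinessDay is a hypothetical helper.

```typescript
// Return 09:00 UTC (by default) on the first weekday strictly after `from`.
function nextBusinessDay(from: Date, hourUtc = 9): Date {
  // Start at the following calendar day; Date.UTC handles month and year rollover.
  const d = new Date(
    Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate() + 1, hourUtc),
  );
  // 0 = Sunday, 6 = Saturday
  while (d.getUTCDay() === 0 || d.getUTCDay() === 6) {
    d.setUTCDate(d.getUTCDate() + 1);
  }
  return d;
}

// A failure on Friday afternoon is held until Monday morning.
console.log(nextBusinessDay(new Date("2026-05-01T15:30:00Z")).toISOString());
// "2026-05-04T09:00:00.000Z"
```

The computed timestamp can be stored with the queued job so the reminder is delayed at the queue level rather than by sleeping inside the agent call.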

How the Numbers Moved Over Three Months

Earlier I mentioned month-one recovery climbing from 38% to 61%. Broken down by reason across a full quarter, it is clearer where the leverage actually lived:

| Decline reason | Share of failures | Recovery before | After 3 months | What moved it |
|---|---|---|---|---|
| Expired card | ~41% | 52% | 74% | Immediate in-app banner + update link |
| Insufficient funds | ~27% | 34% | 58% | Waiting a day; naming the retry date |
| 3DS authentication | ~18% | 29% | 63% | Saying "verify," not "update" |
| Network decline | ~9% | 22% | 31% | Human follow-up after two failures |
| unknown | ~5% | n/a | n/a | Instant Slack escalation |

The biggest gain came from insufficient funds, and the lever was neither code nor AI — it was the discipline to wait. Network declines, by contrast, stay stubborn no matter what; deciding not to pour human time into them mattered more for my workload than any copy change. As an indie developer running this across the small Dolice products I maintain, the break-even was concrete: once monthly payment failures pass roughly 30, the build pays for itself. Below that, doing it by hand is honestly cheaper. Automation here is not a rescue — it is an investment that only pays once volume shows up.

One Thing to Do Today

If you read this far, here's the single move I'd recommend:

Open Stripe's Reports -> Revenue -> Failed payments and look at the past 30 days. Note the count and the top three reasons.

That number tells you whether dunning is worth automating yet for your product. A handful of failures per month is fine to handle by hand. Once it crosses a few dozen, the investment described in this article starts paying back fast. That's exactly the scale at which I started writing my own — the day after I first opened that report.
