
AI agent idempotency: preventing duplicate vendor charges on retries

Idempotency is the property that running the same operation multiple times produces the same result as running it once. For a human-driven web form, idempotency is a nice-to-have — the user usually doesn't double-submit. For an AI agent, idempotency is mandatory: workflow engines retry failed steps automatically, process crashes resume mid-pipeline, serverless cold starts re-execute in-flight tasks, and stuck loops trigger the same action hundreds of times before a human notices. Each retry without an idempotency key creates a duplicate Stripe charge, duplicate Twilio SMS, or duplicate Resend email — real money out the door, real customer-facing harm. This page covers how to build stable idempotency keys from workflow run IDs, what each vendor's deduplication guarantee actually covers, and how per-run spend caps add a second layer of protection when idempotency keys expire or fail.

TL;DR

Build your idempotency key from immutable, context-stable identifiers: workflow_run_id + "-" + item_id is the standard pattern. The workflow run ID is stable across all retries of the same run. The item ID scopes it to the specific record being processed. Set your idempotency key before making the vendor call, not after; the common mistake is computing the key from the API response, which doesn't exist yet on the first call. Then add a spend cap as a backstop: vendor-side deduplication is time-limited (24 hours for Stripe), and some vendors, such as Twilio, offer no native request deduplication at all. A stuck agent retrying the same action over multiple days will eventually outlast the deduplication window and produce duplicate charges regardless.

Why agent retries are different from human retries

When a human double-clicks "Submit Payment", the browser typically blocks the second click with a disabled button, and the worst case is one duplicate charge. Agent retries are structurally different in three ways:

Automatic and silent. Workflow engines like Temporal, Inngest, Hatchet, and Trigger.dev retry failed steps automatically. There's no human in the loop to notice that a step is retrying — it just does. The agent code logs the retry but the engineer is asleep.
Potentially unbounded. A stuck agent loop can retry the same vendor call thousands of times. Temporal's default retry policy places no limit on the number of attempts and backs off exponentially, so a broken Stripe call keeps retrying until a configured timeout or attempt limit stops it, and a generously configured timeout can keep it going for a very long time. An agent calling Stripe in a loop with a fixed 1-second delay between attempts can make 86,400 attempts in a day.
Cross-process and cross-restart. Agent failures aren't just network timeouts — they include process crashes, OOM kills, Kubernetes pod evictions, and deployment restarts. When a workflow step resumes after a process crash, it doesn't know whether the previous Stripe call succeeded before the crash. Without an idempotency key, it calls Stripe again.
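
The crash-and-resume case can be illustrated with a small simulation. This is only a sketch: `FakeVendor` and `run_step_with_crash` are hypothetical names invented for this example, and the in-memory vendor merely imitates key-based deduplication rather than any real vendor's behavior.

```python
class FakeVendor:
    """Hypothetical stand-in for a payment API; records every charge it creates."""

    def __init__(self):
        self.charges = []  # every charge actually created
        self._seen = {}    # idempotency key -> charge id

    def charge(self, customer_id, amount_cents, idempotency_key=None):
        # A repeated key replays the earlier result instead of charging again
        if idempotency_key is not None and idempotency_key in self._seen:
            return self._seen[idempotency_key]
        charge_id = f"ch_{len(self.charges) + 1}"
        self.charges.append((charge_id, customer_id, amount_cents))
        if idempotency_key is not None:
            self._seen[idempotency_key] = charge_id
        return charge_id


def run_step_with_crash(vendor, run_id, customer_id, use_key):
    """Call the vendor, lose the result to a simulated crash, then retry once."""
    key = f"{run_id}-{customer_id}" if use_key else None
    for attempt in range(2):
        charge_id = vendor.charge(customer_id, 1000, idempotency_key=key)
        if attempt == 0:
            continue  # simulated crash: the result is lost before it is persisted
        return charge_id


without_key = FakeVendor()
run_step_with_crash(without_key, "run_42", "cus_1", use_key=False)

with_key = FakeVendor()
run_step_with_crash(with_key, "run_42", "cus_1", use_key=True)

print(len(without_key.charges), len(with_key.charges))  # prints "2 1"
```

The resumed step has no memory of the first call in either case; the only difference is whether the vendor can recognize the second request as a repeat.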

How to build a stable idempotency key

The core requirement: the idempotency key must be the same value across all retries of the same logical operation, and different across different logical operations.

import time
import uuid

# Pattern: workflow_run_id + item_id (here the item is a customer)
idempotency_key = f"{workflow_run_id}-{customer_id}"

# Wrong: using a timestamp, which is different on every retry
idempotency_key = f"{customer_id}-{int(time.time())}"

# Wrong: using a random UUID, which is different on every retry
idempotency_key = f"{customer_id}-{uuid.uuid4()}"

# Wrong: using only the customer ID, which collides across different billing runs
idempotency_key = customer_id  # Two billing runs for the same customer → same key → second run is deduplicated!

The workflow run ID component ensures uniqueness across different logical runs for the same customer. The customer ID component ensures uniqueness within a run that processes multiple customers. Together they produce a key that's stable across retries of the same step within the same run, and unique across different runs.
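
A minimal sketch of a key builder that follows this pattern is shown here. The function name `build_idempotency_key` and the extra `operation` component are assumptions added for illustration (the latter so that separate API calls for the same item, such as creating and finalizing an invoice, get distinct keys). The 255-character limit is the one Stripe documents for idempotency keys; other vendors may differ.

```python
import hashlib

MAX_KEY_LENGTH = 255  # Stripe's documented limit; an assumption for other vendors


def build_idempotency_key(workflow_run_id: str, item_id: str, operation: str = "charge") -> str:
    """Derive a key from stable inputs only: no clocks, no randomness."""
    raw = f"{workflow_run_id}-{item_id}-{operation}"
    if len(raw) <= MAX_KEY_LENGTH:
        return raw
    # Over-long inputs are hashed; the digest is still deterministic
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


first_attempt = build_idempotency_key("run_42", "cus_1")
retry_attempt = build_idempotency_key("run_42", "cus_1")
next_run = build_idempotency_key("run_43", "cus_1")

print(first_attempt == retry_attempt)  # same run, same item: identical key
print(first_attempt == next_run)       # different run: different key
```

Because every input is known before the vendor call is made, the key can be computed up front and recomputed identically after a crash.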

Idempotency key sources by workflow framework

Framework	Stable run ID to use	Key pattern
Temporal	`workflow.GetInfo(ctx).WorkflowExecution.ID` + `RunID`	`f"{workflow_id}-{run_id}-{customer_id}"`
Inngest	`runId` (available in function context, stable across step retries)	`${runId}-${customerId}`
Trigger.dev	`context.run.id` (stable across task retries)	`${context.run.id}-${customerId}`
Prefect	`FlowRunContext.get().flow_run.id`	`f"{flow_run_id}-{customer_id}"`
Hatchet	`ctx.workflow_run_id()`	`f"{run_id}-{customer_id}"`
Dagster	`context.run_id` (in op context)	`f"{run_id}-{customer_id}"`
Airflow	`context['run_id']` (task context variable)	`f"{dag_id}-{run_id}-{customer_id}"`
Apache Beam	Job ID (passed as pipeline argument before submission)	`f"{job_id}-{customer_id}"`

What each vendor's idempotency guarantee actually covers

Vendor idempotency keys deduplicate identical requests within a time window. The window and exact semantics vary:

Vendor	How to send	Dedup window	What's deduplicated
Stripe	`Idempotency-Key` HTTP header, or `idempotency_key` SDK parameter	24 hours	Same key + same endpoint + same API key → returns cached response, no new charge. Different API key (e.g., rotating keys mid-run) voids the deduplication even for the same idempotency key string.
Twilio	No built-in idempotency key; use application-level dedup (check a sent log before calling, or maintain a sent-message ID table).	Application-managed	Nothing natively. Before calling `messages.create()`, check whether a message was already sent for this `run_id + recipient` combination. Store the Twilio SID after send. On retry, if a SID exists, skip the send.
Resend	No built-in idempotency key. Application-level dedup required (store sent email IDs against your run ID).	Application-managed	Resend has no per-request deduplication. Application-level dedup: before calling `resend.emails.send()`, check if an email was already sent for this `run_id + recipient` combination. Store the Resend email ID after send.
Stripe Billing (invoices)	`Idempotency-Key` header on `POST /v1/invoices`	24 hours	Same as Stripe above — covers invoice creation, not finalization or payment. Creating and finalizing are separate API calls, each needing their own idempotency key.
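
The application-level dedup described for Twilio and Resend could be sketched as follows. This is an illustration under assumptions: the `sent_log` table, `open_sent_log`, `send_once`, and `fake_send` are hypothetical names, SQLite stands in for whatever durable store the pipeline already uses, and `send_fn` represents the real vendor call.

```python
import sqlite3


def open_sent_log(path=":memory:"):
    """Open (or create) the table that records which sends already happened."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sent_log ("
        " dedup_key TEXT PRIMARY KEY,"
        " vendor_id TEXT NOT NULL)"
    )
    return conn


def send_once(conn, run_id, recipient, send_fn):
    """Send only if this run has not already sent to this recipient."""
    dedup_key = f"{run_id}-{recipient}"
    row = conn.execute(
        "SELECT vendor_id FROM sent_log WHERE dedup_key = ?", (dedup_key,)
    ).fetchone()
    if row is not None:
        return row[0]  # already sent: return the stored vendor ID, skip the call
    vendor_id = send_fn(recipient)
    conn.execute(
        "INSERT INTO sent_log (dedup_key, vendor_id) VALUES (?, ?)",
        (dedup_key, vendor_id),
    )
    conn.commit()
    return vendor_id


calls = []


def fake_send(recipient):
    """Hypothetical stand-in for the vendor call; returns a made-up message ID."""
    calls.append(recipient)
    return f"SM{len(calls):04d}"


conn = open_sent_log()
first = send_once(conn, "run_42", "+15550100", fake_send)
retry = send_once(conn, "run_42", "+15550100", fake_send)
print(first, retry, len(calls))  # prints "SM0001 SM0001 1"
```

A crash between the vendor call and the insert can still produce one duplicate, so this narrows the window rather than closing it. Unlike a vendor-side key, though, the log does not expire after 24 hours.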

The gap idempotency keys don't cover: the 24-hour cliff

Stripe's 24-hour idempotency window is the most common point of failure for long-running or stuck agents. If a billing pipeline fails at step 500 of 1,000 and isn't retried until 26 hours later (after incident investigation, on-call escalation, and remediation), the idempotency keys for steps 1–499 have expired. Retrying the pipeline from scratch will re-charge customers 1–499, who were already successfully charged in the first run. Resuming from step 500 avoids that only if the pipeline durably recorded which steps completed, and even then step 500 itself may have charged the customer before the failure was reported.

The idempotency key cliff means you need a second layer of protection: a spend cap that prevents the total dollar amount of vendor calls from exceeding a threshold per run, regardless of how many retries occur or when they happen. Take a vault key with a $5,000 cap on a 1,000-customer billing run at $10 each, and a 26-hour-late retry that tries to re-charge all 1,000 customers. The first 500 charges of the retry go through (the cap has no record of the charges made in the first attempt), and the cap rejects the 501st. Total spend in the retry stops at $5,000 rather than running to the full $10,000.

This isn't a complete solution — the cap doesn't know about charges made in a previous run with a different vault key — but it limits the blast radius of an idempotency expiry event from "re-charged every customer" to "re-charged at most N dollars worth of customers."
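
The arithmetic of that example can be checked with a small local simulation. `SpendCap` is a hypothetical class written for this sketch, not Keybrake's implementation; it only models a counter that refuses any charge that would push the total past the cap. Amounts are kept in integer cents to avoid floating-point rounding.

```python
class SpendCap:
    """Hypothetical per-run spend counter; rejects charges that would exceed the cap."""

    def __init__(self, cap_cents):
        self.cap_cents = cap_cents
        self.spent_cents = 0

    def try_spend(self, amount_cents):
        if self.spent_cents + amount_cents > self.cap_cents:
            return False  # cap exhausted: the charge is not made
        self.spent_cents += amount_cents
        return True


cap = SpendCap(cap_cents=5_000_00)  # $5,000 cap
charged = 0
blocked_at = None

# A late retry walks all 1,000 customers again at $10 each
for customer_number in range(1, 1001):
    if not cap.try_spend(10_00):
        blocked_at = customer_number
        break
    charged += 1

print(charged, blocked_at, cap.spent_cents)  # prints "500 501 500000"
```

The counter starts at zero because it has no record of the first attempt, which is exactly the limitation described above: the cap bounds the retry's spend but cannot tell which of those 500 charges are duplicates.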

Scoping vault keys as idempotency backstops

import os

import requests
import stripe

# Read the Keybrake API key from the environment rather than hard-coding it
KEYBRAKE_API_KEY = os.environ["KEYBRAKE_API_KEY"]

def run_billing_step(run_id: str, customers: list, budget_usd: float = 1000.0):
    # Issue vault key once per run; the cap is the backstop if idempotency keys expire.
    # As written, every call to this function requests a new key, so a retry should
    # reuse the key from the first attempt instead of re-entering here.
    resp = requests.post(
        "https://proxy.keybrake.com/vault/keys",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
        json={
            "vendor": "stripe",
            "daily_usd_cap": budget_usd,
            "allowed_endpoints": ["POST /v1/payment_intents"],
            "expires_in": "48h",  # longer than Stripe's 24h idempotency window
            "agent_run_label": f"billing/{run_id}",
        },
        timeout=10,  # never hang the workflow step on key issuance
    )
    resp.raise_for_status()  # fail loudly if no key was issued
    vault_key = resp.json()["vault_key"]

    # Route Stripe SDK traffic through the proxy using the capped vault key
    stripe.api_key = vault_key
    stripe.api_base = "https://proxy.keybrake.com/stripe"

    for customer in customers:
        # Idempotency key is stable across retries of this specific run
        idempotency_key = f"{run_id}-{customer['id']}"

        try:
            stripe.PaymentIntent.create(
                amount=customer["amount_cents"],
                currency="usd",
                customer=customer["id"],
                idempotency_key=idempotency_key,
            )
        except stripe.error.RateLimitError as e:
            if "cap_exhausted" in str(e):
                # Cap reached: stop the whole run instead of retrying
                raise RuntimeError(
                    f"Spend cap hit at customer {customer['id']}; stopping run"
                ) from e
            raise  # Transient rate limit: let the workflow engine retry

The vault key TTL is set to 48 hours — longer than Stripe's 24-hour idempotency window. This means a run that's retried up to 48 hours after its first attempt still has a live cap, limiting re-charges to the configured dollar ceiling even if idempotency keys for early customers have expired. The agent_run_label ties every vendor call in the audit log to the specific run ID, so you can reconstruct exactly which customers were charged and when — essential for incident investigation when idempotency does fail.

How Keybrake fits

Keybrake acts as the idempotency backstop that operates at the dollar level rather than the per-request level. Idempotency keys prevent individual duplicate calls; vault key caps prevent duplicate spend at the run level. Together they form two-layer protection: idempotency keys handle the common case (same call, same window), caps handle the edge cases (expired keys, different keys, stuck loops that outlast the deduplication window). The Keybrake audit log gives you a complete record of every vendor call with its amount, timestamp, and run label — so when an idempotency failure does occur, you have the data to identify which customers were doubled and by how much.
