Stripe deep dive · 12 min read

Why your Stripe Restricted Key probably isn't restricted enough (and what to do about it)

Keybrake team · May 1, 2026

Stripe Restricted Keys are the most underrated production-safety primitive Stripe ships. They will, with about ten minutes of clicking, lock an AI agent out of four-fifths of the catastrophic moves it can make on your account. They will also let it burn $4,000 in twenty minutes if it gets stuck in a refund loop, because there are four things a Restricted Key does not do, and three of those four are exactly what you want from a key you handed an autonomous agent. This is the post for the engineer who has read "Limit access with restricted API keys", ticked the right boxes, and now wants to know what is still uncovered — and what to reach for to close each gap.

What a Restricted Key actually restricts

Before the gaps, the wins. A Stripe Restricted Key (prefix rk_live_ or rk_test_) lets you build, per key, a permission matrix with three axes:

Resource — the noun: Charges, Customers, Refunds, Subscriptions, PaymentIntents, SetupIntents, and about sixty more. Stripe's resource list maps roughly one-to-one to the API's top-level objects.
Permission — the verb, per resource: None, Read, or Write. Write on most resources implies Read; for a few resources Write is the only meaningful state because the only operation is mutation (e.g. Refunds with no read view).
Endpoint allowlist — for some resources, you can pin specific endpoints rather than the whole resource. This is documented thinly and surfaces only in the dashboard for a subset of resources, but it is how, for instance, you can grant Charges:Write but only POST /v1/charges/{id}/capture.
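One way to keep the three axes straight is to model them locally. The sketch below is our own illustration, not a Stripe API: the class names, the lowercase resource names, and the rule that a pinned allowlist excludes every other endpoint on that resource are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Permission(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2  # modeled as implying READ, as on most resources


@dataclass
class RestrictedKeyScope:
    """Hypothetical local model of one key's permission matrix."""

    # resource name -> granted permission level
    permissions: dict[str, Permission] = field(default_factory=dict)
    # resource name -> pinned (method, path template) pairs;
    # a resource missing from this dict is not pinned at all
    endpoint_allowlist: dict[str, set[tuple[str, str]]] = field(default_factory=dict)

    def allows(self, resource: str, needed: Permission, method: str, path: str) -> bool:
        granted = self.permissions.get(resource, Permission.NONE)
        if granted < needed:
            return False
        pinned = self.endpoint_allowlist.get(resource)
        if pinned is None:
            return True  # whole resource is in scope
        return (method, path) in pinned


# Charges:Write pinned to capture only, plus Customers:Read.
capture_only = RestrictedKeyScope(
    permissions={"charges": Permission.WRITE, "customers": Permission.READ},
    endpoint_allowlist={"charges": {("POST", "/v1/charges/{id}/capture")}},
)
```

With this scope, a capture call passes, while creating a new charge or touching Refunds is refused before any question of parameters comes up.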

If you would like to see the resulting matrix for fourteen common AI agent use cases — pick the use cases you intend to support, get the minimum scope back — we built the Stripe Restricted Key picker. It runs entirely in your browser, scores blast radius for each combination, and warns on the four resource pairs that compound into irreversible cash exfiltration. Use it as the starting point for any scope decision.

What you get from a well-scoped Restricted Key is real. A read-only reporting agent cannot create a charge. A subscription-management agent cannot trigger a payout to a new bank account. A checkout-link generator cannot read your customer email list. These are the failure modes that previously required a privately-held secret key plus discipline; now they require checkboxes, and Stripe will reject the request at the edge if the agent steps outside.

So far, so good. The trouble starts when you ask: what about the operations that are inside scope, but I want to cap them anyway?

The four gaps

I have spent the last six months reading stripe/agent-toolkit issues, the comments under stripe/ai#356, and the postmortems engineers have published from real agent-on-Stripe incidents. The same four gaps come up every time. They are not bugs in Restricted Keys — they are out of the primitive's scope by design — but if your agent is autonomous, all four of them matter.

Gap 1 of 4

No per-day, per-key spend cap

A Restricted Key with Charges:Write can fire as many charge requests as Stripe's rate limit will accept. Stripe documents the default ceiling as roughly 100 requests per second in live mode and 25 per second in test mode. At an average charge size of $40, a stuck loop that saturates the live ceiling attempts up to $4,000 in charges every second, which passes the price of a small car in well under a minute.
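To put numbers on that, here is the back-of-the-envelope calculation for the two request rates quoted above (25 and 100 per second) at a $40 average charge. It is an upper bound on dollars attempted, assuming the loop saturates the limit and every request carries the average amount; declines and retries would lower the real figure.

```python
def attempted_usd(requests_per_second: float, avg_charge_usd: float, seconds: float) -> float:
    """Upper bound on dollars attempted by a loop that saturates the rate limit."""
    return requests_per_second * avg_charge_usd * seconds


for rate in (25, 100):
    per_minute = attempted_usd(rate, avg_charge_usd=40, seconds=60)
    print(f"{rate} req/s at $40 -> ${per_minute:,.0f} attempted per minute")
```

That works out to $60,000 per minute at 25 requests per second and $240,000 per minute at 100.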

Stripe has rate limits, but rate limits cap requests, not dollars. They will keep you from exhausting your customers' card networks; they will not keep you from charging the same card twenty times for $40 in eight seconds. There is no max_usd_per_day field on a Restricted Key, and there is no dashboard switch that says "stop accepting Write calls from this key after $500 in 24 hours."
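For a sense of what such a ceiling has to track, here is a minimal sketch of a trailing-window dollar cap enforced in your own code before a request leaves. Everything in it is hypothetical (the class, its method names, the single-process in-memory state), and none of it is a Stripe feature. Amounts are integer cents to avoid floating-point drift.

```python
import time
from collections import deque
from typing import Optional


class RollingSpendCap:
    """Refuses spend once the trailing-window total would exceed the ceiling."""

    def __init__(self, max_cents: int, window_seconds: float = 24 * 60 * 60):
        self.max_cents = max_cents
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, cents), oldest first
        self._total_cents = 0

    def try_spend(self, cents: int, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        cutoff = now - self.window_seconds
        # Drop entries that have aged out of the window.
        while self._events and self._events[0][0] <= cutoff:
            _, expired = self._events.popleft()
            self._total_cents -= expired
        if self._total_cents + cents > self.max_cents:
            return False  # caller must not send the request upstream
        self._events.append((now, cents))
        self._total_cents += cents
        return True
```

With a $500 ceiling (`max_cents=50_000`) and $40 charges, the first twelve calls return True and the thirteenth returns False. A real deployment would need this state to survive restarts and be shared across every process that holds the key.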

Gap 2 of 4

No parameter-level scope

A Restricted Key with Refunds:Write can refund any charge on your account. There is no way to restrict it to refunds for a specific customer, a specific connected account on a Standard Connect platform, charges below a specific amount, or charges with a specific metadata tag. The endpoint allowlist gates which URL the key can hit; it does not gate the body of the request.

For a customer-support agent that is supposed to issue partial refunds on tickets escalated to it, this is the difference between "the agent can refund Acme Corp's last invoice" and "the agent can refund any of your 40,000 customers' last invoices." Stripe's stance, reasonable from their position, is that parameter-level authorization is application logic, not API logic. Reasonable, and yet you still need it.

Gap 3 of 4

No sub-second mid-run revoke

You can delete a Restricted Key from the dashboard. The deletion is final, but the propagation tail to Stripe's edge is in the 30-second to 5-minute range, depending on which region your traffic terminates in and how cache invalidation lines up across their fleet. We covered this in the 2am playbook — the gist is that during the propagation window, requests using the deleted key continue to succeed, and Stripe's documentation does not promise an upper bound on that window.

For a stuck-loop incident that you catch in real time, those minutes are exactly the window during which the loss compounds. You want a kill switch that takes effect on the next packet from the agent, not on the next cache eviction at the vendor's edge.

Gap 4 of 4

No per-call audit with parsed cost

The Stripe Dashboard's API logs are excellent for debugging — you can filter by Restricted Key, see request and response payloads, replay. They are not built for the operational question your CFO will ask the morning after: "show me every dollar the agent spent yesterday, grouped by use case, with the policy decision logged next to each call."

Specifically: there is no per-call cost field in the log (you have to derive it from the request body or the resulting Charge object), no policy_decision_at column (because there is no policy layer to record one), no agent_run_id correlation, and no SQL access. We covered the schema you actually want in the audit trail post.
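To make the missing columns concrete, here is an illustrative table that carries them, with the CFO's question written as a query. This is only a sketch: apart from agent_run_id and policy_decision_at, which are named above, the table and column names are our own assumptions, and SQLite stands in for whatever store you actually use.

```python
import sqlite3

SCHEMA = """
CREATE TABLE agent_calls (
    id INTEGER PRIMARY KEY,
    agent_run_id TEXT NOT NULL,
    use_case TEXT NOT NULL,
    method TEXT NOT NULL,
    path TEXT NOT NULL,
    status_code INTEGER,                -- NULL when the call was never forwarded
    cost_cents INTEGER NOT NULL DEFAULT 0,
    policy_decision TEXT NOT NULL,      -- 'allow' or 'deny'
    policy_decision_at TEXT NOT NULL    -- ISO 8601 UTC timestamp
)
"""


def spend_by_use_case(conn: sqlite3.Connection, day: str) -> dict:
    """Total allowed spend in cents per use case for one UTC day (YYYY-MM-DD)."""
    rows = conn.execute(
        """
        SELECT use_case, SUM(cost_cents)
        FROM agent_calls
        WHERE policy_decision = 'allow'
          AND substr(policy_decision_at, 1, 10) = ?
        GROUP BY use_case
        ORDER BY use_case
        """,
        (day,),
    ).fetchall()
    return {use_case: cents for use_case, cents in rows}


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO agent_calls (agent_run_id, use_case, method, path, status_code,"
    " cost_cents, policy_decision, policy_decision_at) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("run_1", "support_refund", "POST", "/v1/refunds", 200, 4000, "allow", "2026-04-30T09:15:00Z"),
        ("run_1", "support_refund", "POST", "/v1/refunds", 200, 1500, "allow", "2026-04-30T09:16:00Z"),
        ("run_2", "checkout", "POST", "/v1/charges", None, 0, "deny", "2026-04-30T11:00:00Z"),
    ],
)
print(spend_by_use_case(conn, "2026-04-30"))
```

The denied call still gets a row, so the policy decision is visible even though no money moved.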

What to reach for, gap by gap

Three of the four gaps have native Stripe workarounds that are good enough for many teams. Walk through each and decide honestly whether it covers your actual exposure.

Gap 1 (spend cap): Radar rules and idempotency

Stripe Radar is a fraud-detection tool, but its rule engine can be re-used as a coarse spend governor. A rule like :total_charges_in_last_24h_for_customer: > 1000 will block charges from a single customer above $1,000/day. The rule fires on attempt, not after the fact, so it is a real cap, not a notification. The trade-offs:

Rules are per-Stripe-account, not per-Restricted-Key. Your support agent's $1,000-per-customer cap applies to your dashboard users' charges, too. There is no namespace for "rules that only apply to traffic from key rk_live_xyz."
Radar's per-customer aggregation does not give you per-key, per-day-total caps. A stuck loop that hits 1,000 different customers for $0.50 each will sail through.
Rule changes can take a minute to propagate. Acceptable for most use cases; not what you want when the agent is loose.

Combine Radar rules with idempotency keys (your agent should always be sending an Idempotency-Key header derived from the underlying business event, not the agent retry counter) and you get a dollar-bounded surface for the kind of drift that comes from upstream loops. You do not get a true per-key daily ceiling.
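Deriving the key from the business event can be as simple as hashing a stable identifier. In this minimal sketch the function name and the ticket identifier are hypothetical; the one deliberate choice is that nothing retry-related goes into the hash, so every retry of the same event produces the same key.

```python
import hashlib


def idempotency_key(event_type: str, business_event_id: str) -> str:
    """Stable key for one business event; identical on every retry."""
    digest = hashlib.sha256(f"{event_type}:{business_event_id}".encode("utf-8")).hexdigest()
    # Truncated digest keeps the header short while staying collision-resistant.
    return f"{event_type}-{digest[:32]}"


# Hypothetical usage: one support ticket maps to at most one refund.
headers = {"Idempotency-Key": idempotency_key("refund", "ticket-8841")}
```

A loop that re-sends the refund for ticket-8841 forty times presents the same key forty times, which is what lets the vendor collapse them into one operation.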

Gap 2 (parameter-level scope): wrapper at the call site

The honest answer for parameter-level scope is that you build it yourself, in the agent's code, before the request hits Stripe. A wrapper function that takes the agent's intended Charge.create arguments, validates that customer is in an allowlist your application maintains, and refuses if it is not, is a perfectly fine implementation. The trouble is twofold. First, every team builds its own version and the implementations drift. Second, the wrapper lives in the same process as the agent, so if the agent is the one constructing the wrapper call (think of a tool-use loop where the LLM decides the parameters), a sufficiently clever prompt injection can get around it.
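A minimal sketch of such a wrapper follows. The allowlist contents, the amount ceiling, and the names guarded_create and PolicyViolation are all hypothetical. The vendor call is passed in as create_fn so the example does not depend on any particular client library.

```python
class PolicyViolation(Exception):
    """Raised when the agent's intended call falls outside local policy."""


# Hypothetical policy data maintained by your application.
ALLOWED_CUSTOMERS = frozenset({"cus_hypothetical_acme"})
MAX_AMOUNT_CENTS = 5_000


def guarded_create(create_fn, **params):
    """Validate parameters, then delegate to the real create call."""
    customer = params.get("customer")
    amount = params.get("amount")
    if customer not in ALLOWED_CUSTOMERS:
        raise PolicyViolation(f"customer {customer!r} is not in the allowlist")
    # bool is a subclass of int, so reject it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool):
        raise PolicyViolation("amount must be an integer number of cents")
    if not 0 < amount <= MAX_AMOUNT_CENTS:
        raise PolicyViolation(f"amount {amount} is outside 1..{MAX_AMOUNT_CENTS}")
    return create_fn(**params)
```

In practice create_fn would be your Stripe client's charge-creation call. The check is only as strong as the guarantee that nothing in the process can reach that call without going through guarded_create.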

The pattern that scales better is to enforce parameter scope at a point the agent does not control: a proxy in front of Stripe that the agent's outbound traffic must traverse, with policy that reads the body. The governance stack post walks through where in your topology that proxy belongs.
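As a sketch of what "policy that reads the body" means, the function below decides on a raw request the way a proxy would see it. It assumes the form-encoded bodies Stripe's API accepts, and the refund ceiling and the decide function are hypothetical. One detail worth copying: a refund request with no amount refunds the whole charge, so a missing amount is a deny, not a pass.

```python
from urllib.parse import parse_qs

MAX_REFUND_CENTS = 2_500  # hypothetical per-call ceiling


def decide(method: str, path: str, body: bytes) -> tuple:
    """Return (allowed, reason) for one outbound request."""
    if (method, path) != ("POST", "/v1/refunds"):
        return False, "endpoint not covered by policy"
    fields = parse_qs(body.decode("utf-8"))
    amounts = fields.get("amount", [])
    # Require exactly one explicit amount made of ASCII digits.
    if len(amounts) != 1 or not (amounts[0].isascii() and amounts[0].isdigit()):
        return False, "explicit integer amount required"
    amount = int(amounts[0])
    if not 0 < amount <= MAX_REFUND_CENTS:
        return False, "amount outside allowed range"
    return True, "ok"
```

Because this runs outside the agent's process, a prompt injection that rewrites the agent's tool call still has to produce a body that passes the check.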

Gap 3 (sub-second revoke): you can't, with a Restricted Key alone

There is no native Stripe workaround for sub-second mid-run revoke, because the propagation tail is a property of Stripe's edge cache, not a property you can configure. The two options people reach for are:

Block the agent's egress at the network layer. If the agent runs in a process you control, drop its outbound TCP to api.stripe.com. This works in seconds and gives you a true kill. It also requires that the agent and its network are not on a serverless-ish substrate where you do not own the firewall.
Hand the agent a key you can revoke locally instead of upstream. Issue a vault key from a proxy you control; revoke that key in your own database; the next request from the agent fails closed because the proxy refuses to forward. Latency is the time it takes the proxy to read its own table — single-digit milliseconds.
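The second option can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any shipping product: the vk_ prefix, the VaultKeyTable class, and the forward function are hypothetical, and an in-memory dict stands in for the proxy's database.

```python
import secrets
from typing import Callable, Optional


class VaultKeyTable:
    """Maps locally issued vault keys to the upstream key the proxy holds."""

    def __init__(self):
        self._active = {}  # vault key -> upstream key

    def issue(self, upstream_key: str) -> str:
        vault_key = "vk_" + secrets.token_urlsafe(24)
        self._active[vault_key] = upstream_key
        return vault_key

    def revoke(self, vault_key: str) -> None:
        self._active.pop(vault_key, None)

    def resolve(self, vault_key: str) -> Optional[str]:
        return self._active.get(vault_key)


def forward(table: VaultKeyTable, vault_key: str, send: Callable[[str], tuple]) -> tuple:
    """Forward with the upstream key, or fail closed if the vault key is gone."""
    upstream_key = table.resolve(vault_key)
    if upstream_key is None:
        return 401, "vault key revoked or unknown"
    return send(upstream_key)
```

The agent only ever sees the vault key. Revocation is a local delete, so the very next call to forward is refused without waiting on anything upstream.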

This is the gap that is hardest to close with native Stripe primitives, and the one teams most often discover during an actual incident.

Gap 4 (per-call audit): export to your own store

Subscribe to charge.*, refund.*, payment_intent.*, and customer.subscription.* webhooks; persist them with the request metadata and a derived cost; index by agent_run_id stored in Charge metadata. This gives you a real audit log. It does not give you a row for the calls that 4xx'd at Stripe's edge (no event fires for an unauthorized request), and it does not give you a policy decision column unless you write your own policy decisions to a separate store and join. For most teams the gap-4 workaround is "build a small ETL." That is fine, but recognize you are now shipping a logging and storage pipeline alongside your agent, and that pipeline is now load-bearing for compliance.
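The transform step of that ETL might look like the sketch below. It works on an already-parsed and already-verified event payload, handles only two event types for brevity, and assumes the agent wrote agent_run_id into the object's metadata when it made the call. The function name and output field names are hypothetical.

```python
from typing import Optional

# Event types this sketch knows how to cost; extend for your own use cases.
COSTED_EVENTS = {"charge.succeeded", "refund.created"}


def audit_row_from_event(event: dict) -> Optional[dict]:
    """Flatten one webhook event into an audit row, or None if not costed."""
    if event.get("type") not in COSTED_EVENTS:
        return None
    obj = event["data"]["object"]
    metadata = obj.get("metadata") or {}
    return {
        "event_id": event["id"],
        "event_type": event["type"],
        "object_id": obj["id"],
        "amount_minor_units": obj["amount"],  # cents for USD
        "currency": obj["currency"],
        "agent_run_id": metadata.get("agent_run_id"),  # None if the agent did not tag it
        "occurred_at": event["created"],  # Unix timestamp from the event
    }
```

Rows with a missing agent_run_id are worth alerting on, because they mark money movement that cannot be tied back to a run.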

When the workarounds are enough — and when they aren't

Restricted Keys plus the workarounds above are enough if at least three of the following are true for your team:

The agent's traffic is bounded to a single Stripe account, with one set of customers, and the parameter-level scope you need can be expressed in a Radar rule.
You own the network the agent runs on and can drop egress in seconds during an incident.
You have or can build a webhook-driven audit pipeline and your compliance team is happy with after-the-fact reconciliation.
The agent's worst-case stuck-loop spend is within an order of magnitude of an amount you can write off without escalation.
The agent runs on infrastructure you control end-to-end (no third-party computer-use platform that owns the network).

If most of those are true, you do not need a separate proxy product — Restricted Keys are sufficient and the engineering cost of layering more is not paid back.

The workarounds stop being enough when one or more of the following holds:

The agent transacts on behalf of multiple end-customers (e.g. a SaaS product where each customer brings their own Stripe account via Connect, and parameter-level scope means "key X can only refund customer X's charges").
The agent runs on a substrate where you do not own the firewall, and a sub-second kill matters because the worst-case loss exceeds your no-questions-asked spend authority.
You need a single audit row per agent action that joins policy decision, vendor request, vendor response, and parsed cost — for SOC 2, for an internal incident review process, or because your team has been bitten and wants the trail.
You are running more than one money-moving SaaS API per agent (Stripe + Twilio + Resend + Shopify), and reproducing the workaround stack for each vendor is starting to feel like a small product of its own.

That last one is the inflection point we built Keybrake for. The four gaps above are not Stripe-specific; they exist on Twilio, Resend, Shopify Admin, Postmark, and every other money-moving SaaS API your agent is touching. Building five copies of "Radar rules + egress firewall + webhook ETL" is roughly the work of building one proxy that closes all four gaps once, applies the same policy across all vendors, and lets you revoke the agent's access in milliseconds when it goes wrong.

Bottom line

Use the Stripe Restricted Key picker for any agent you are about to deploy. Tick the boxes for the use cases the agent is supposed to handle. Use the resulting key. That decision alone — picking the minimum permissions, not the convenient maximum — closes the majority of the agent risk surface and costs you ten minutes.

Then, before you ship, walk the four gaps above against your specific scenario. For each gap, decide: does the native workaround cover it for me, or do I need a layer in front? If the answer is "I need a layer" for two or more gaps, you are about to build a proxy. We have already done that work.

Get Keybrake when v1 ships

Pre-launch waitlist for the SaaS-API governance proxy. Per-vendor daily spend cap, parameter-level policy, sub-second mid-run revoke, per-call audit with parsed cost — across Stripe, Twilio, and Resend on day one. We'll email you a working code sample for each vendor the day v1 lands.