Giving Stripe Agent Toolkit an off-switch

Keybrake · May 31, 2026

Stripe Agent Toolkit lets Claude issue refunds and charges through MCP in under 30 seconds of config. The off-switch — a spend cap, a kill switch, a per-call audit trail — takes two minutes to add. Most people don't add it until after the first incident.

Here's the walkthrough: the minimal setup, what goes wrong, and the single config change that closes the gap.

The baseline setup

Stripe ships an official MCP server — @stripe/mcp — that exposes Stripe payment operations as tools any MCP client can call. Wiring it to Claude Desktop takes a one-time edit to claude_desktop_config.json:

{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": [
        "-y",
        "@stripe/mcp",
        "--tools=create_refund,list_customers,retrieve_balance"
      ],
      "env": {
        "STRIPE_API_KEY": "sk_live_51..."
      }
    }
  }
}

Restart Claude Desktop and you'll see a hammer icon in the toolbar — Claude can now issue refunds, list customers, and retrieve the Stripe balance on your behalf. The toolkit handles the Stripe API call; Claude decides when to call it based on the conversation.

This is genuinely useful. A support agent can triage "why did this charge fail?" by calling list_payment_intents and list_customers without anyone writing an integration. A billing workflow can auto-draft invoices. A back-office Claude can issue refunds without the support rep opening Stripe at all.

The --tools flag is the toolkit's built-in access control: it limits which of the 14 default tools Claude can see. We used it above to expose only refunds, customer reads, and balance retrieval — not charges, not invoice creation, not product writes. That's the right starting point. It is not enough on its own.

Two failure modes the tools flag doesn't stop

Failure mode 1 — the refund loop

Suppose you're running a support agent that handles "can you refund my order?" messages. The happy path works fine: Claude calls create_refund, it succeeds, the customer is happy. Now imagine the agent receives an ambiguous message ("refund all the failed charges") or a malformed context that causes Claude to retry a refund call that already succeeded. The Stripe API rejects a refund on an already fully refunded charge with a 400, but that protection is per charge: it stops a duplicate refund on the same charge, not a loop that walks across many charges. If the agent's loop calls create_refund again before processing the response, or if the 400 is parsed as "try again," each call that hits a distinct charge ID succeeds and money leaves your Stripe account.

The --tools flag has nothing to say about this. The tool is exposed. Each call is individually valid. The problem is the call volume and dollar total, not the tool choice — and there is no dollar total control anywhere in the toolkit.
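A small simulation makes the gap concrete. This is a sketch only: FakeStripe is a hypothetical in-memory stand-in written for this example, not the real Stripe SDK, and the fifty $20 charges are made-up data. It reproduces the behavior described above, where a duplicate refund on the same charge gets a 400 but every distinct charge is refunded once.

```python
class FakeStripe:
    """Hypothetical in-memory stand-in for the refunds API, not the real SDK."""

    def __init__(self, charges):
        self.charges = dict(charges)  # charge_id -> amount in cents
        self.refunded = set()
        self.total_refunded_cents = 0

    def create_refund(self, charge_id):
        # A second refund on the same charge is rejected with a 400.
        if charge_id in self.refunded:
            return 400
        self.refunded.add(charge_id)
        self.total_refunded_cents += self.charges[charge_id]
        return 200


def run_stuck_loop(api, charge_ids, passes=2):
    """Simulate an agent that walks the same list of charges more than once."""
    statuses = []
    for _ in range(passes):
        for charge_id in charge_ids:
            statuses.append(api.create_refund(charge_id))
    return statuses


charges = {f"ch_{i}": 2_000 for i in range(50)}  # fifty made-up $20 charges
api = FakeStripe(charges)
statuses = run_stuck_loop(api, list(charges))

print(statuses.count(200), "refunds succeeded")
print(statuses.count(400), "duplicates rejected")
print(f"${api.total_refunded_cents / 100:,.2f} refunded")
```

The duplicate check fires on the second pass, but by then every charge in the list has already been refunded once. Nothing in the loop or in the tool list bounds the total.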

Failure mode 2 — customer scope bleed

A support agent should probably only refund customers whose tickets it is currently handling. But the toolkit's create_refund accepts any charge ID your Stripe key can see. There is no way to say "the agent may only touch charges belonging to customer cus_A" without writing that check into your own application code. If Claude constructs a charge ID from partial context, or if a user pastes a charge ID belonging to a different customer into the chat, the refund goes through.

For a Connect platform supporting many merchants, this failure mode scales with the number of merchants you manage. The exposed tool plus a Stripe secret key with Refunds: Write equals write access to every charge on every connected account.
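For reference, the application-side check mentioned above might look something like the following. It is a minimal sketch under stated assumptions: Charge, ScopeError, assert_in_scope, and guarded_refund are hypothetical names invented for this example, and refund_fn stands in for whatever actually issues the refund. The only Stripe fact it relies on is that a charge object carries the ID of the customer it belongs to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charge:
    """Hypothetical local view of a charge: just the fields the check needs."""
    id: str
    customer: str
    amount_cents: int


class ScopeError(Exception):
    """Raised when a refund targets a charge outside the active ticket."""


def assert_in_scope(charge: Charge, ticket_customer: str) -> None:
    if charge.customer != ticket_customer:
        raise ScopeError(
            f"{charge.id} belongs to {charge.customer}, not {ticket_customer}"
        )


def guarded_refund(charge: Charge, ticket_customer: str, refund_fn):
    """Run the ownership check first, then delegate to the refund call."""
    assert_in_scope(charge, ticket_customer)
    return refund_fn(charge.id)


issued = []  # stand-in for the real refund call: records which IDs got through
charge = Charge(id="ch_example_1", customer="cus_A", amount_cents=2_000)

guarded_refund(charge, "cus_A", issued.append)      # ticket matches: allowed
try:
    guarded_refund(charge, "cus_B", issued.append)  # ticket mismatch: blocked
except ScopeError as err:
    print("blocked:", err)

print(issued)
```

The weakness of this approach is that it lives in each agent's code path. Any agent that skips the wrapper is unscoped again.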

Why these aren't edge cases

Both failure modes share a root cause: the toolkit's security model is which tool can be called, not how much damage the tool can do. The tools flag is a coarse filter at the protocol boundary. It is genuinely useful for preventing category errors — you wouldn't expose create_charge to a read-only analytics agent. But it isn't designed to cap dollar exposure, scope by customer, or cut the agent off mid-run. Those are runtime enforcement problems, and the toolkit punts them to you.

Stripe's own Restricted Keys help with the scoping problem — you can issue a key with Refunds: Write and strip every other permission. That narrows the blast radius considerably. But a Restricted Key still has no per-day dollar cap, no per-customer scope for non-Connect accounts, and no sub-second revoke path separate from rotating the key itself (which takes up to five minutes to propagate across Stripe's infrastructure). For a deeper breakdown of exactly what Restricted Keys cover and miss, see Why your Stripe Restricted Key probably isn't restricted enough.

The fix: a proxy between the toolkit and Stripe

The cleanest solution is a governance proxy that sits between the MCP server and the Stripe API. The toolkit thinks it's talking to Stripe. The proxy enforces the policy you defined — spend cap, customer allowlist, endpoint allowlist, kill switch — before forwarding the call. If the policy blocks the call, the proxy returns an error and the call never reaches Stripe.

Here's what the before/after looks like in Claude Desktop config:

Before — direct to Stripe:

{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp", "--tools=create_refund,list_customers"],
      "env": {
        "STRIPE_API_KEY": "sk_live_51..."
      }
    }
  }
}

After — routed through a governance proxy:

{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp", "--tools=create_refund,list_customers"],
      "env": {
        "STRIPE_API_KEY": "vk_support_7f3a9b...",
        "STRIPE_BASE_URL": "https://proxy.keybrake.com/stripe"
      }
    }
  }
}

Two env var changes. The toolkit itself is unchanged — same @stripe/mcp binary, same tools flag, same Claude Desktop wiring. The difference is where the API calls go.

Setting up the vault key

Before updating the config, you issue a vault key — a scoped token that the proxy maps to your real Stripe secret. The vault key carries the policy: which endpoints are allowed, what the daily dollar cap is, when it expires. You set the policy once; the proxy enforces it on every call.

curl -X POST https://proxy.keybrake.com/keys \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: your-admin-key" \
  -d '{
    "name": "claude-support-agent",
    "stripe_secret": "sk_live_51...",
    "daily_usd_cap": 500,
    "allowed_endpoints": ["refunds.create", "customers.list", "balance.retrieve"]
  }'

The response includes the vault key string (vk_support_7f3a9b...) — that's what goes into STRIPE_API_KEY in the Claude Desktop config. Your real Stripe secret never leaves the proxy server.

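Three policy fields do most of the work against the failure modes above. The first two appear in the request; expiry is an optional extra. Customer scoping (failure mode 2) relies on the proxy's customer allowlist, which this example request does not set. The three fields: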

daily_usd_cap: 500 — after the agent has issued $500 in refunds today, every subsequent create_refund returns 429. The loop stops. You get an audit log row explaining why.
allowed_endpoints: ["refunds.create", "customers.list", "balance.retrieve"] — even if the toolkit were to expose additional tools, any call to an endpoint not on this list returns 403. Defense-in-depth with the --tools flag.
Expiry — you can add "expires_in": "8h" to issue a key that stops working at the end of the business day, without touching the Claude Desktop config.
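To make the enforcement order concrete, here is a rough sketch of how a proxy could evaluate these three fields on each call. It is not Keybrake's implementation. Policy, parse_expires_in, and check are hypothetical names, and three behaviors are assumptions made for illustration: an expired key answers 401, a refund that would push the day's total over the cap is blocked rather than allowed through, and only the "8h" hours format is parsed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Policy:
    allowed_endpoints: frozenset
    daily_usd_cap: float
    expires_at: Optional[datetime] = None
    # UTC date -> dollars already forwarded that day.
    # Floats keep the sketch short; a real ledger would use integer cents.
    spent_by_day: dict = field(default_factory=dict)


def parse_expires_in(text: str, now: datetime) -> datetime:
    """Turn an "8h"-style string into an absolute expiry (hours only here)."""
    if not text.endswith("h"):
        raise ValueError(f"unsupported expiry format: {text!r}")
    return now + timedelta(hours=int(text[:-1]))


def check(policy: Policy, endpoint: str, amount_usd: float, now: datetime) -> int:
    """Return the HTTP status to answer with; 200 means forward to Stripe."""
    if policy.expires_at is not None and now >= policy.expires_at:
        return 401  # assumption: expired keys are treated like revoked ones
    if endpoint not in policy.allowed_endpoints:
        return 403
    day = now.date()
    spent = policy.spent_by_day.get(day, 0.0)
    if spent + amount_usd > policy.daily_usd_cap:
        return 429  # assumption: a call that would cross the cap is blocked
    policy.spent_by_day[day] = spent + amount_usd
    return 200


now = datetime(2026, 5, 31, 9, 0, tzinfo=timezone.utc)
policy = Policy(
    allowed_endpoints=frozenset(
        {"refunds.create", "customers.list", "balance.retrieve"}
    ),
    daily_usd_cap=500,
    expires_at=parse_expires_in("8h", now),
)

results = [
    check(policy, "refunds.create", 300, now),                        # first refund
    check(policy, "refunds.create", 300, now),                        # would total $600
    check(policy, "charges.create", 50, now),                         # not on the list
    check(policy, "refunds.create", 10, now + timedelta(hours=9)),    # past expiry
]
print(results)
```

Checking expiry and the endpoint list before touching the ledger means a blocked call never counts against the day's cap.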

The kill switch

This is the part that the toolkit, Restricted Keys, and Stripe Projects do not offer in this form: a way to stop the agent mid-run in under a second.

Rotating a Stripe Restricted Key takes about 30 seconds of dashboard work plus up to 5 minutes of propagation. During that window, any in-flight call with the old key still succeeds. For a stuck loop at 100 calls per second, the 300-second propagation window alone allows up to 30,000 charges or refunds before the key goes cold. The rotate-vs-revoke playbook covers the math in detail. The short version is that rotation is the right move when you have time; it is not a reliable kill switch when you don't.
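The worst-case figure is a call rate multiplied by a window, so it is easy to recompute for your own numbers. The helper below is a hypothetical one-liner written for this example. It uses the article's figures (100 calls per second, 5 minutes of propagation, 30 seconds of dashboard work) and shows that the 30,000 number counts the propagation window only.

```python
def worst_case_calls(calls_per_second: float, window_seconds: float) -> int:
    """Upper bound on calls that can land before a rotated key stops working."""
    return int(calls_per_second * window_seconds)


propagation_only = worst_case_calls(100, 5 * 60)
with_dashboard_time = worst_case_calls(100, 30 + 5 * 60)

print(propagation_only)     # 30000
print(with_dashboard_time)  # 33000
```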

With a vault key, the kill switch is a single API call:

curl -X DELETE https://proxy.keybrake.com/keys/vk_support_7f3a9b... \
  -H "X-Admin-Key: your-admin-key"

The proxy marks the key inactive in SQLite and returns 401 on every subsequent call, including requests it has already received but not yet forwarded to Stripe. The Stripe secret is unchanged. Other agents using different vault keys are unaffected. You can re-issue a new vault key with a different policy without any Stripe dashboard work.
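A revoke of this kind can be a single-row update. The sketch below assumes a hypothetical vault_keys table with an active flag; the table layout, the function names, and the example key IDs are all invented for illustration and are not Keybrake's schema. It uses an in-memory SQLite database so it runs on its own.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vault_keys ("
        " id TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " active INTEGER NOT NULL DEFAULT 1)"
    )
    return conn


def issue(conn: sqlite3.Connection, key_id: str, name: str) -> None:
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO vault_keys (id, name) VALUES (?, ?)", (key_id, name)
        )


def revoke(conn: sqlite3.Connection, key_id: str) -> bool:
    """Flip the key to inactive; returns False if the key does not exist."""
    with conn:
        cur = conn.execute(
            "UPDATE vault_keys SET active = 0 WHERE id = ?", (key_id,)
        )
    return cur.rowcount == 1


def authorize(conn: sqlite3.Connection, key_id: str) -> int:
    """Status for an incoming call: 200 to forward, 401 if revoked or unknown."""
    row = conn.execute(
        "SELECT active FROM vault_keys WHERE id = ?", (key_id,)
    ).fetchone()
    return 200 if row is not None and row[0] == 1 else 401


conn = open_store()
issue(conn, "vk_support_example", "claude-support-agent")
issue(conn, "vk_billing_example", "claude-billing-agent")

revoked = revoke(conn, "vk_support_example")
print(revoked)
print(authorize(conn, "vk_support_example"))  # revoked key
print(authorize(conn, "vk_billing_example"))  # untouched key
```

Because the lookup happens on every call, the revoke takes effect on the next request without restarting anything, and the second key keeps working.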

Among AI agent kill-switch patterns (network egress blocks, circuit breakers, human-in-the-loop approval), a proxy-layer revoke is the cleanest because it doesn't require application code changes or firewall rules. It works regardless of where the agent is running.

What you get after the one config change

Control	Toolkit alone	Restricted Key alone	Proxy-routed toolkit
Limit which tools Claude can call	✓ `--tools` flag	Partial (key scopes)	✓ both layers
Per-day dollar cap	✗	✗	✓ enforced pre-call
Endpoint allowlist (beyond tool names)	✗	✗	✓
Sub-second mid-run revoke	✗	✗ (5-min propagation)	✓ single API call
Per-call audit log with parsed cost	✗	✗	✓
Key rotation requires code change	Yes (update `.env`)	Yes (update `.env`)	No (proxy holds secret)

The audit log deserves a sentence. Every call through the proxy writes a row: vault key name, vendor, HTTP method, Stripe endpoint, amount parsed from the response body, status code, block reason if any. You can query it from the dashboard or directly with GET /audit. For a support team that wants to answer "what did the agent actually do between 14:00 and 14:15?" it's the only place that answer lives.
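As an illustration of the "what happened between 14:00 and 14:15" question, here is a sketch of an audit table and a time-window query. The column names follow the fields listed above, but the schema itself, the calls_between helper, the sample rows, and the block-reason text are assumptions made for this example, not Keybrake's actual storage format.

```python
import sqlite3

SCHEMA = """
CREATE TABLE audit (
    ts           TEXT NOT NULL,   -- ISO 8601 UTC timestamp
    vault_key    TEXT NOT NULL,
    vendor       TEXT NOT NULL,
    method       TEXT NOT NULL,
    endpoint     TEXT NOT NULL,
    amount_usd   REAL,
    status       INTEGER NOT NULL,
    block_reason TEXT
)
"""


def calls_between(conn, vault_key, start, end):
    """Rows for one vault key in the half-open window [start, end)."""
    return conn.execute(
        "SELECT ts, method, endpoint, amount_usd, status, block_reason "
        "FROM audit WHERE vault_key = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (vault_key, start, end),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
key = "claude-support-agent"
conn.executemany(
    "INSERT INTO audit VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [  # made-up sample rows
        ("2026-05-31T13:59:40Z", key, "stripe", "POST", "refunds.create", 40.0, 200, None),
        ("2026-05-31T14:03:12Z", key, "stripe", "POST", "refunds.create", 120.0, 200, None),
        ("2026-05-31T14:09:55Z", key, "stripe", "POST", "refunds.create", 400.0, 429, "daily cap reached"),
        ("2026-05-31T14:20:01Z", key, "stripe", "GET", "customers.list", None, 200, None),
    ],
)

rows = calls_between(conn, key, "2026-05-31T14:00:00Z", "2026-05-31T14:15:00Z")
for row in rows:
    print(row)
```

Storing timestamps as fixed-format ISO 8601 strings in UTC lets plain string comparison do the range filter.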

Does this work with the other two toolkit modes?

The MCP server mode is the cleanest integration because the proxy URL override is just an env var. For the LangChain, Vercel AI SDK, and CrewAI integration modes, you pass the proxy base URL when initializing the toolkit client. The exact parameter name differs by SDK, but the toolkit is built on the standard Stripe Node and Python SDKs, both of which accept a custom API base (the Python SDK exposes stripe.api_base; the Node SDK takes host, port, and protocol options), so every integration mode can be pointed at the proxy. For the raw tool-definition mode, you call the proxy endpoint instead of api.stripe.com directly.

The Stripe Agent Toolkit + MCP reference page has the full breakdown of all three integration modes and which Stripe API scopes each default tool requires — useful for building the minimal allowed_endpoints list for your use case before issuing the vault key.
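For the raw tool-definition mode, the change amounts to swapping the host and the credential. The sketch below builds a refund request without sending it. The proxy base URL comes from the config shown earlier; everything else is an assumption for illustration: that the proxy mirrors Stripe's /v1/refunds path under that base, that it accepts the vault key as a Bearer token the way Stripe accepts a secret key, and the placeholder key and charge IDs.

```python
from urllib.parse import urlencode
from urllib.request import Request

PROXY_BASE = "https://proxy.keybrake.com/stripe"  # from the config shown earlier


def build_refund_request(
    vault_key: str, charge_id: str, base_url: str = PROXY_BASE
) -> Request:
    """Build (but do not send) a refund request addressed to the proxy."""
    body = urlencode({"charge": charge_id}).encode()  # Stripe uses form encoding
    return Request(
        url=f"{base_url}/v1/refunds",  # assumption: proxy mirrors Stripe's paths
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {vault_key}",  # vault key, not the Stripe secret
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


req = build_refund_request("vk_example_key", "ch_example")
print(req.get_method(), req.full_url)
```

The request body and auth scheme are the same ones Stripe expects, which is what lets the calling code stay unchanged apart from the base URL and the key.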

Frequently asked questions

What's the latency overhead?

The proxy is a Node.js server sitting on the same network path as your agent — typically on the same VPS or in the same region. The overhead is one additional HTTPS roundtrip to the proxy before the call goes to Stripe. In practice this is 5–15ms if the proxy is co-located with your agent, 20–50ms if it's in a different region. For a support agent handling conversational turns, this is indistinguishable from normal Stripe API latency. For a batch agent processing 50 records per second, you'd want the proxy in the same region.

What happens if the proxy goes down?

Your agent gets a connection error, the same as it would if Stripe itself were unreachable. The toolkit surfaces it as a tool error. Claude's behavior depends on your system prompt — typically it retries once and then reports failure to the user. For production deployments you'd run the proxy under systemd or a process supervisor so it restarts automatically on crash.

Does this add a new secret management problem?

The proxy holds your Stripe secret — you swap one secret management problem (Stripe key in .env) for another (admin key + Stripe key on the proxy server). The trade is usually worth it because the proxy server is a single, controlled point: you manage one secret there instead of distributing the Stripe secret to every agent that needs it. For a team running three different agents, that's three places the Stripe secret could leak versus one.

Is this relevant now that Stripe Projects shipped?

Stripe Projects (announced at Stripe Sessions 2026) is a token-issuing layer for agents, not a proxy. It handles monthly billing aggregation caps at the Stripe level and is limited to Stripe's 32 named partner vendors. It doesn't give you per-call enforcement, sub-second revoke, per-call audit logs, or coverage of any API outside that 32-vendor list. The two are complementary: Stripe Projects for monthly spend aggregation across multiple vendors, a governance proxy for per-call enforcement and instant kill-switch on any API your agent touches.

Bottom line

Stripe Agent Toolkit is genuinely the easiest way to give an LLM payment capabilities. The MCP integration in particular takes under a minute to wire up. The gap is runtime enforcement — once the tool is exposed, the toolkit has no mechanism to limit how much the agent spends, scope it to specific customers, or cut it off without rotating the underlying key.

The two-env-var change above closes all three gaps without touching your agent code or your Claude Desktop tool list. The cost is running a proxy server and managing one additional admin key. For any agent running against a production Stripe account, that trade is worth making before the first incident, not after.
