Stripe Agent Toolkit + MCP: how to restrict what the agent can do

Stripe's Agent Toolkit ships an MCP server that hands your LLM a set of payment tools. The --tools flag controls which tools exist at the protocol layer, but not how much the agent spends, which customers it can touch, or how fast you can revoke its access. Here's what the toolkit itself does, where the cliff is, and a five-minute config change that closes the gap.

TL;DR

Stripe's Agent Toolkit exposes 14 default tools (create charge, refund, list customers, create product, create price, create invoice, etc.) through a Model Context Protocol server. You pick which tools to expose via a --tools flag. That is the toolkit's only access control, and it is tool-level, not dollar-level. All exposed tools run against one Stripe API key (secret or Restricted), and the MCP server inherits that key's scopes exactly. There is no per-call dollar cap, no customer allowlist, and no way to revoke mid-run short of rotating the key. The clean fix is a two-line Claude Desktop config change that routes the toolkit through a governance proxy: no code, no fork.

What Stripe Agent Toolkit actually is

The Stripe Agent Toolkit (stripe/agent-toolkit, published as @stripe/agent-toolkit on npm and stripe-agent-toolkit on PyPI) is Stripe's first-party library for exposing payment operations to LLM agents. It has three integration modes:

MCP server (npx -y @stripe/mcp --tools=...): a standards-based MCP server that any MCP client (Claude Desktop, Cursor, Windsurf, or an agent framework built on the modelcontextprotocol SDKs) can connect to.
LangChain / Vercel AI SDK / CrewAI tools: language-native tool objects that drop into each framework's tool-calling flow.
Raw tool definitions: JSON schemas you hand to any LLM with function calling and wire up to the toolkit's handler yourself.

Across all three, the same design principle holds: each Stripe API method becomes a named tool with a JSON-schema input, and the toolkit executes that method using the Stripe API key you configured. The LLM doesn't see the key; the LLM sees the tool schema plus the result.
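That principle can be sketched in a few lines. This is an illustration only, not the toolkit's actual code: the Tool dataclass, make_refund_tool, and describe_for_llm are hypothetical names, and the handler is a stub that never contacts Stripe. It shows the key living inside the handler closure while the model receives only the name and schema.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tool:
    """Hypothetical container: what the LLM sees plus what actually runs."""
    name: str
    input_schema: Dict[str, Any]  # JSON schema shown to the model
    handler: Callable[[Dict[str, Any]], Dict[str, Any]]


def make_refund_tool(api_key: str) -> Tool:
    """Build a create_refund tool whose handler closes over the API key."""

    def handler(args: Dict[str, Any]) -> Dict[str, Any]:
        # A real handler would call Stripe here, authenticated with api_key.
        # This stub only echoes the arguments so the sketch runs offline.
        return {"object": "refund", "status": "stubbed", **args}

    schema = {
        "type": "object",
        "properties": {
            "payment_intent": {"type": "string"},
            "amount": {"type": "integer", "description": "Amount in cents"},
        },
        "required": ["payment_intent"],
    }
    return Tool(name="create_refund", input_schema=schema, handler=handler)


def describe_for_llm(tool: Tool) -> str:
    """Serialize only the name and schema; the key is never included."""
    return json.dumps({"name": tool.name, "input_schema": tool.input_schema})


tool = make_refund_tool(api_key="rk_test_placeholder")
llm_view = describe_for_llm(tool)
result = tool.handler({"payment_intent": "pi_123", "amount": 500})
```

The model can request create_refund with arguments that fit the schema, but nothing it receives contains the credential.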

The 14 default tools — and the agent's scope inherits from the key

The default tool list (as of the 0.5.x version line) is roughly:

Tool	Calls	Restricted Key scope needed	Blast radius if abused
create_payment_link	`paymentLinks.create`	Payment Links: Write	Low — link has to be clicked
create_product	`products.create`	Products: Write	Low — catalog clutter
list_products	`products.list`	Products: Read	None
create_price	`prices.create`	Prices: Write	Low — pricing clutter
list_prices	`prices.list`	Prices: Read	None
create_customer	`customers.create`	Customers: Write	Medium — PII in Stripe
list_customers	`customers.list`	Customers: Read	Medium — PII exfil
create_invoice	`invoices.create`	Invoices: Write	High — dollar amounts, auto-collect
create_invoice_item	`invoiceItems.create`	Invoices: Write	High
finalize_invoice	`invoices.finalizeInvoice`	Invoices: Write	High — triggers collection
retrieve_balance	`balance.retrieve`	Balance: Read	None
create_refund	`refunds.create`	Refunds: Write	High — reversal loop
list_payment_intents	`paymentIntents.list`	PaymentIntents: Read	Medium — PII
create_charge	`charges.create`	Charges: Write	Critical — money out

The toolkit's --tools flag lets you expose a subset, e.g. --tools=list_customers,create_refund for a support agent that should only issue refunds. That's a useful coarse-grained filter, but it is not a substitute for scoping the key itself. If the key has Charges: Write, anything running with that key can still call charges.create directly, whether or not the create_charge tool is exposed.
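One way to keep the flag and the key in step is to derive the key's scopes from the tool subset. The sketch below copies the scope column of the table above into a dictionary and returns the minimal scope set for a given --tools value. The function name required_scopes is hypothetical, and the code assumes a Write permission on a resource also covers reads of it.

```python
from typing import Dict

# Tool name -> (Restricted Key resource, permission level), from the table above.
TOOL_SCOPES = {
    "create_payment_link": ("Payment Links", "Write"),
    "create_product": ("Products", "Write"),
    "list_products": ("Products", "Read"),
    "create_price": ("Prices", "Write"),
    "list_prices": ("Prices", "Read"),
    "create_customer": ("Customers", "Write"),
    "list_customers": ("Customers", "Read"),
    "create_invoice": ("Invoices", "Write"),
    "create_invoice_item": ("Invoices", "Write"),
    "finalize_invoice": ("Invoices", "Write"),
    "retrieve_balance": ("Balance", "Read"),
    "create_refund": ("Refunds", "Write"),
    "list_payment_intents": ("PaymentIntents", "Read"),
    "create_charge": ("Charges", "Write"),
}


def required_scopes(tools_flag: str) -> Dict[str, str]:
    """Return the minimal {resource: level} map for a --tools value."""
    needed: Dict[str, str] = {}
    for raw in tools_flag.split(","):
        name = raw.strip()
        if name not in TOOL_SCOPES:
            raise ValueError(f"unknown tool: {name}")
        resource, level = TOOL_SCOPES[name]
        # Assumption: Write supersedes Read on the same resource.
        if level == "Write" or resource not in needed:
            needed[resource] = level
    return needed


support_agent = required_scopes("list_customers,create_refund")
```

Every resource missing from the returned map should be set to None on the Restricted Key.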

The 2025-11 MCP config block

A typical Claude Desktop config for running Stripe Agent Toolkit over MCP looks like this:

{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": [
        "-y",
        "@stripe/mcp",
        "--tools=create_refund,list_customers,retrieve_balance"
      ],
      "env": {
        "STRIPE_API_KEY": "sk_live_..."
      }
    }
  }
}

This gives the LLM three tools (refund, list customers, check balance), but the key behind them still carries the full scope of whatever sk_live_... is. Mis-scoping the key is the most common footgun: people paste in their secret key because it's the one they have in .env, and the agent's process now holds Write access to everything Stripe offers, even though the three exposed tools need only Refunds: Write, Customers: Read, and Balance: Read.
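A cheap guard against that footgun is to lint the config before it ships. The sketch below relies on Stripe's key prefixes (sk_ for secret keys, rk_ for Restricted Keys) and flags any MCP server entry that carries an unrestricted secret key. The function name key_warnings and the sample config are illustrative, not part of the toolkit.

```python
import json
from typing import List


def key_warnings(config_text: str) -> List[str]:
    """Flag MCP server entries whose STRIPE_API_KEY is an unrestricted secret key."""
    config = json.loads(config_text)
    warnings = []
    for name, server in config.get("mcpServers", {}).items():
        key = server.get("env", {}).get("STRIPE_API_KEY", "")
        if key.startswith("sk_live_"):
            warnings.append(f"{name}: unrestricted live secret key")
        elif key.startswith("sk_test_"):
            warnings.append(f"{name}: unrestricted test secret key")
    return warnings


SAMPLE = """
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp", "--tools=create_refund"],
      "env": {"STRIPE_API_KEY": "sk_live_..."}
    },
    "stripe-scoped": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp", "--tools=create_refund"],
      "env": {"STRIPE_API_KEY": "rk_live_..."}
    }
  }
}
"""

found = key_warnings(SAMPLE)
```

A prefix check only tells you the key type. It cannot tell you which scopes a Restricted Key actually has, so it complements scope review rather than replacing it.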

The three governance gaps the toolkit doesn't close

Even with the right Restricted Key and the tightest --tools subset, three things that matter for a production money-moving agent remain open:

Gap 1 — no per-day USD cap

If create_refund is exposed and the agent hallucinates a double-refund loop, each call succeeds until something external breaks the loop. Stripe Restricted Keys do not support a "max $X per day" limit for refunds or charges, and the toolkit doesn't add one. You'd have to do one of the following:

Add the cap in the agent's own code (cooperative, leaks if the agent's code changes)
Add a Stripe Radar rule (fraud engine, awkward semantic match for refunds, requires a paid Radar tier)
Insert a proxy between the toolkit and Stripe that enforces the cap
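The first and third options come down to the same check, just enforced in different places. Here is a minimal sketch of a daily cap, assuming amounts in cents and an in-memory counter. The class name DailyUsdCap is hypothetical, and a real enforcement point would need persistent storage, a fixed timezone, and locking for concurrent callers.

```python
from datetime import date
from typing import Dict, Optional


class DailyUsdCap:
    """Track spend per calendar day and refuse calls that would exceed the cap."""

    def __init__(self, cap_cents: int) -> None:
        self.cap_cents = cap_cents
        self._spent: Dict[date, int] = {}  # in-memory only; lost on restart

    def try_spend(self, amount_cents: int, today: Optional[date] = None) -> bool:
        """Record the spend and return True, or return False if it would breach the cap."""
        today = today or date.today()
        spent = self._spent.get(today, 0)
        if spent + amount_cents > self.cap_cents:
            return False
        self._spent[today] = spent + amount_cents
        return True


cap = DailyUsdCap(cap_cents=500_00)  # $500 per day
day = date(2025, 11, 1)
results = [cap.try_spend(50_00, today=day) for _ in range(11)]
```

With a $500 cap and $50 refunds, the first ten calls pass and the eleventh is refused. The counter starts again from zero on the next date.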

Gap 2 — no customer-level allowlist

A support agent should probably only refund customers whose tickets it's currently handling. The toolkit has no way to say "the agent may only touch customers in {cus_A, cus_B, cus_C}." You can't filter a Restricted Key to specific customer IDs outside of Stripe Connect. So if the agent hallucinates a customer ID and issues a refund, the call succeeds.
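The missing check is small once something sits in the request path. The sketch below assumes the refund is created from a PaymentIntent and that the enforcement point can resolve which customer owns it. The customer_of lookup is a hypothetical stand-in for that resolution step, backed here by a dictionary so the example runs offline.

```python
from typing import Any, Callable, Dict, Optional, Set


def refund_allowed(
    params: Dict[str, Any],
    allowlist: Set[str],
    customer_of: Callable[[str], Optional[str]],
) -> bool:
    """Allow a refund only if its PaymentIntent belongs to an allowlisted customer."""
    payment_intent = params.get("payment_intent")
    if payment_intent is None:
        return False  # fail closed: no way to tell whose money this is
    customer = customer_of(payment_intent)
    return customer is not None and customer in allowlist


# Hypothetical lookup table standing in for a real PaymentIntent retrieval.
OWNERS = {"pi_1": "cus_A", "pi_2": "cus_Z"}
ALLOWLIST = {"cus_A", "cus_B", "cus_C"}

ok = refund_allowed({"payment_intent": "pi_1"}, ALLOWLIST, OWNERS.get)
blocked = refund_allowed({"payment_intent": "pi_2"}, ALLOWLIST, OWNERS.get)
```

A hallucinated PaymentIntent ID resolves to no customer, so it is refused along with IDs that belong to customers outside the list.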

Gap 3 — mid-run revoke is slow

If the agent goes wrong at 2am, you're in the Stripe Dashboard → Developers → API Keys → Delete flow. In our own tests, the median time for the deletion to propagate and start rejecting calls was 45 seconds, with a p95 of 3 minutes 12 seconds. If the loop makes one call every 400ms, that's roughly 110 to 480 extra calls that get through. Rotating the key also breaks every other legitimate consumer of it.
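The estimate is simply the propagation window divided by the call interval. A quick check using the figures quoted above, kept in integer milliseconds to avoid floating-point rounding:

```python
INTERVAL_MS = 400                      # one call every 400 ms
MEDIAN_MS = 45 * 1000                  # median propagation: 45 s
P95_MS = (3 * 60 + 12) * 1000          # p95 propagation: 3 min 12 s


def calls_during(window_ms: int, interval_ms: int) -> int:
    """Number of whole call intervals that fit inside the propagation window."""
    return window_ms // interval_ms


median_calls = calls_during(MEDIAN_MS, INTERVAL_MS)  # 112
p95_calls = calls_during(P95_MS, INTERVAL_MS)        # 480

print(f"median: {median_calls} calls, p95: {p95_calls} calls")
```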

The five-minute fix

You don't need to fork the toolkit or build custom middleware. The toolkit's MCP server passes the STRIPE_API_BASE environment variable through to stripe-node's config, which rewrites the base URL for every outbound HTTP call. Set that variable to point at a governance proxy, and every call the MCP server's tools make goes through policy enforcement:

{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": [
        "-y",
        "@stripe/mcp",
        "--tools=create_refund,list_customers,retrieve_balance"
      ],
      "env": {
        "STRIPE_API_KEY": "vault_live_...",
        "STRIPE_API_BASE": "https://proxy.keybrake.com/stripe"
      }
    }
  }
}

Two changed lines, no toolkit fork, no code modification, no LLM retraining. The vault key is issued with a policy attached, for example { daily_usd_cap: 500, customer_allowlist: ["cus_A", "cus_B"], endpoint_allowlist: ["/v1/refunds", "/v1/customers"], max_refund_usd: 50, expires_at: "2026-05-01" }. The MCP server has no idea it's being proxied; the Stripe SDK simply sends its requests to a different base URL. Revoke is a single HTTP call that completes in under a second against any proxy instance.
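To make the policy concrete, here is a rough sketch of how a proxy might evaluate three of those fields on each request. It is not Keybrake's implementation: the evaluate function, its return shape, and the choice to treat expires_at as exclusive are all assumptions. Refund amounts are in cents, as in the Stripe API, and a refund with no amount is rejected because Stripe would otherwise refund the full charge.

```python
from datetime import date
from fnmatch import fnmatchcase
from typing import Any, Dict, Tuple

POLICY = {
    "endpoint_allowlist": ["/v1/refunds", "/v1/customers"],
    "max_refund_usd": 50,
    "expires_at": "2026-05-01",
}


def evaluate(
    policy: Dict[str, Any],
    method: str,
    path: str,
    params: Dict[str, Any],
    today: date,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for one outbound Stripe request."""
    # Assumption: the key stops working at the start of the expires_at date.
    if today >= date.fromisoformat(policy["expires_at"]):
        return False, "key expired"
    # Patterns may be exact paths or globs; note "*" also matches "/".
    if not any(fnmatchcase(path, pat) for pat in policy["endpoint_allowlist"]):
        return False, "endpoint not allowed"
    if method == "POST" and path == "/v1/refunds":
        amount = params.get("amount")
        if amount is None:
            return False, "refund amount required"  # fail closed
        if amount > policy["max_refund_usd"] * 100:
            return False, "refund exceeds per-call maximum"
    return True, "forward to Stripe"


day = date(2026, 4, 1)
small = evaluate(POLICY, "POST", "/v1/refunds", {"amount": 2500}, day)
large = evaluate(POLICY, "POST", "/v1/refunds", {"amount": 7500}, day)
```

A denied request would be answered by the proxy with an error instead of being forwarded, so Stripe never sees it.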

What this protects against that the toolkit alone doesn't

Control	Toolkit alone	Toolkit + Restricted Key	Toolkit + RK + proxy
LLM sees the key	No	No	No
Scope-limit by tool	Partial (--tools flag)	Yes (key scopes)	Yes + endpoint allowlist
Per-day USD cap	No	No	Yes
Customer allowlist	No	No (non-Connect)	Yes
Max-amount-per-call	No	No	Yes
Mid-run revoke	Rotate key only	Rotate key only	Sub-second policy change
Per-call audit	Whatever you log	Stripe Events API	Full request/response/cost/policy-result

An opinionated Claude Desktop setup for a support agent

Concrete example: an agent that handles refund-request tickets. It should only touch customers whose tickets are currently assigned to it, and refund at most $50 per request with a $500/day ceiling:

Mint a Stripe Restricted Key with: Refunds: Write, Customers: Read, PaymentIntents: Read, Balance: Read, everything else None. Save as rk_live_....
Create a Keybrake vault key pointing at that RK. Attach policy: { endpoint_allowlist: ["/v1/refunds", "/v1/customers/*", "/v1/payment_intents/*", "/v1/balance"], max_refund_usd: 50, daily_usd_cap: 500, customer_allowlist: <sync'd from ticket system>, expires_at: end-of-shift }. Save as vault_live_....
Update claude_desktop_config.json with the MCP block above, using the vault key and the proxy base URL. Restrict --tools to create_refund,list_customers,retrieve_balance.
Restart Claude Desktop. Every tool-call the agent makes now hits the proxy, checks the policy, and either forwards to Stripe or 403s. Audit log accumulates in Keybrake.
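The config update in the third step can also be scripted rather than edited by hand. A minimal sketch, assuming the config has already been loaded into a dictionary; the function name route_through_proxy is hypothetical, and the key values are left as the same placeholders used in this post.

```python
import copy
from typing import Any, Dict


def route_through_proxy(
    config: Dict[str, Any], server: str, vault_key: str, base_url: str
) -> Dict[str, Any]:
    """Return a copy of the config with one MCP server pointed at the proxy."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    env = updated["mcpServers"][server].setdefault("env", {})
    env["STRIPE_API_KEY"] = vault_key
    env["STRIPE_API_BASE"] = base_url
    return updated


before = {
    "mcpServers": {
        "stripe": {
            "command": "npx",
            "args": [
                "-y",
                "@stripe/mcp",
                "--tools=create_refund,list_customers,retrieve_balance",
            ],
            "env": {"STRIPE_API_KEY": "rk_live_..."},
        }
    }
}

after = route_through_proxy(
    before, "stripe", "vault_live_...", "https://proxy.keybrake.com/stripe"
)
```

Only the two env entries differ between before and after; the command and the --tools argument are carried over unchanged.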

Blast radius of a stuck loop is now capped at $500 per day and bounded to customers the agent was allowed to touch. Revoking is a policy change in a dashboard. The toolkit code never changed.

How Keybrake fits

Keybrake is the proxy in that config: the thing STRIPE_API_BASE points at. We issue vault keys with policies, enforce them on every call, parse the real cost from Stripe's response, and log everything against your agent-run ID. It works with Stripe Agent Toolkit over MCP (as shown), over LangChain, over Vercel AI SDK, or raw. The governance happens at the network boundary, so every toolkit integration mode gets it for free. v1 covers Stripe, Twilio, Resend.
