Stripe Agent Toolkit + MCP: how to restrict what the agent can do
Stripe's Agent Toolkit ships an MCP server that hands your LLM a set of payment tools. The --tools flag controls which tools exist at the protocol layer, but not how much the agent spends, which customers it can touch, or how fast you can revoke access. Here's what the toolkit itself does, where the cliff is, and a five-minute config change that closes the gap.
TL;DR
Stripe's Agent Toolkit exposes 14 default tools (create charge, refund, list customers, create product, create price, create invoice, etc.) through a Model Context Protocol server. You pick which tools to expose via a --tools flag. That is the toolkit's only access control, and it is tool-level, not dollar-level. All exposed tools run against one Stripe API key (Secret or Restricted) whose scopes the MCP server inherits exactly. There is no per-call dollar cap, no customer allowlist, and no mid-run revoke short of rotating the key. The clean fix is a two-line Claude Desktop config change that routes the toolkit through a governance proxy: no code, no fork.
What Stripe Agent Toolkit actually is
The Stripe Agent Toolkit (stripe/agent-toolkit, published as @stripe/agent-toolkit on npm and stripe-agent-toolkit on PyPI) is Stripe's first-party library for exposing payment operations to LLM agents. It has three integration modes:
- MCP server (npx -y @stripe/mcp --tools=...): a standards-based MCP server that any MCP client (Claude Desktop, Cursor, Windsurf, agent frameworks like modelcontextprotocol) can connect to.
- LangChain / Vercel AI SDK / CrewAI tools: language-native tool objects that drop into each framework's tool-calling flow.
- Raw tool definitions: JSON schemas you hand to any LLM with function calling, which you wire up to the toolkit's handler yourself.
Across all three, the same design principle holds: each Stripe API method becomes a named tool with a JSON-schema input, and the toolkit executes that method using the Stripe API key you configured. The LLM doesn't see the key; the LLM sees the tool schema plus the result.
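As a rough illustration of that principle, the sketch below pairs one tool schema with a handler that holds the key in a closure. The schema shape, the `make_dispatcher` helper, and the stub handler are all hypothetical and are not taken from the toolkit's source. They only show that the model-facing side carries a name, a schema, and a result, never the key.

```python
from typing import Any, Callable, Dict

# Hypothetical tool definition: field names are illustrative, not the toolkit's.
CREATE_REFUND_TOOL: Dict[str, Any] = {
    "name": "create_refund",
    "description": "Refund a payment intent, fully or partially.",
    "input_schema": {
        "type": "object",
        "properties": {
            "payment_intent": {"type": "string"},
            "amount": {"type": "integer", "description": "Amount in cents"},
        },
        "required": ["payment_intent"],
    },
}


def make_dispatcher(api_key: str) -> Callable[[str, Dict[str, Any]], Dict[str, Any]]:
    """Return a dispatcher whose handlers close over the API key."""

    def create_refund(payment_intent: str, amount: int = 0) -> Dict[str, Any]:
        # A real handler would call Stripe with api_key here. This stub only
        # echoes its inputs so the example runs offline.
        _ = api_key
        return {"object": "refund", "payment_intent": payment_intent, "amount": amount}

    handlers = {"create_refund": create_refund}

    def dispatch(tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        # Tools that were never registered cannot be called by name.
        if tool_name not in handlers:
            raise KeyError(f"tool not exposed: {tool_name}")
        return handlers[tool_name](**arguments)

    return dispatch


dispatch = make_dispatcher("rk_test_placeholder")
result = dispatch("create_refund", {"payment_intent": "pi_123", "amount": 500})
```

The LLM would be handed `CREATE_REFUND_TOOL` and, after a call, `result`. The key stays inside the dispatcher.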
The 14 default tools — and the agent's scope inherits from the key
The default tool list (as of the 0.5.x version line) is roughly:
| Tool | Calls | Restricted Key scope needed | Blast radius if abused |
|---|---|---|---|
| create_payment_link | paymentLinks.create | Payment Links: Write | Low — link has to be clicked |
| create_product | products.create | Products: Write | Low — catalog clutter |
| list_products | products.list | Products: Read | None |
| create_price | prices.create | Prices: Write | Low — pricing clutter |
| list_prices | prices.list | Prices: Read | None |
| create_customer | customers.create | Customers: Write | Medium — PII in Stripe |
| list_customers | customers.list | Customers: Read | Medium — PII exfil |
| create_invoice | invoices.create | Invoices: Write | High — dollar amounts, auto-collect |
| create_invoice_item | invoiceItems.create | Invoices: Write | High |
| finalize_invoice | invoices.finalizeInvoice | Invoices: Write | High — triggers collection |
| retrieve_balance | balance.retrieve | Balance: Read | None |
| create_refund | refunds.create | Refunds: Write | High — reversal loop |
| list_payment_intents | paymentIntents.list | PaymentIntents: Read | Medium — PII |
| create_charge | charges.create | Charges: Write | Critical — money out |
The toolkit's --tools flag lets you expose a subset (e.g. --tools=list_customers,create_refund) for a support agent that should only issue refunds. That's a useful coarse-grained filter. It is not a substitute for scoping the key itself — if the key has Charges: Write, any code path that calls charges.create still works even if the tool isn't exposed, since the toolkit (or anything else running with that key) can make the call directly.
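One way to keep the key and the flag aligned is to derive the key's scope list from the --tools value. The sketch below does that with the table above as its only data source. The `required_scopes` helper is hypothetical, and the scope labels are the ones listed in the table, which is itself described as approximate.

```python
from typing import Dict, List

# Tool name -> Restricted Key scope, copied from the table above.
TOOL_SCOPES: Dict[str, str] = {
    "create_payment_link": "Payment Links: Write",
    "create_product": "Products: Write",
    "list_products": "Products: Read",
    "create_price": "Prices: Write",
    "list_prices": "Prices: Read",
    "create_customer": "Customers: Write",
    "list_customers": "Customers: Read",
    "create_invoice": "Invoices: Write",
    "create_invoice_item": "Invoices: Write",
    "finalize_invoice": "Invoices: Write",
    "retrieve_balance": "Balance: Read",
    "create_refund": "Refunds: Write",
    "list_payment_intents": "PaymentIntents: Read",
    "create_charge": "Charges: Write",
}


def required_scopes(tools_flag: str) -> List[str]:
    """Return the distinct scopes needed for a comma-separated --tools value."""
    names = [n.strip() for n in tools_flag.split(",") if n.strip()]
    unknown = [n for n in names if n not in TOOL_SCOPES]
    if unknown:
        raise ValueError(f"unknown tools: {unknown}")
    return sorted({TOOL_SCOPES[n] for n in names})


print(required_scopes("list_customers,create_refund"))
# ['Customers: Read', 'Refunds: Write']
```

Anything the key grants beyond that list is scope the exposed tools do not need.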
The 2025-11 MCP config block
A typical Claude Desktop config for running Stripe Agent Toolkit over MCP looks like this:
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": [
        "-y",
        "@stripe/mcp",
        "--tools=create_refund,list_customers,retrieve_balance"
      ],
      "env": {
        "STRIPE_API_KEY": "sk_live_..."
      }
    }
  }
}
This gives the LLM three tools (refund, list customers, check balance), while the key it uses has the full scope of whatever sk_live_... is. Mis-scoping the key is the most common footgun: people paste in their secret key because it's the one they have in .env, and the agent now has Write access to everything Stripe offers even though the exposed tools need only one write scope and two read scopes.
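A small check can catch this before the config ships. The sketch below scans a Claude Desktop config for MCP servers whose STRIPE_API_KEY is a live secret key rather than a Restricted Key. The `find_live_secret_keys` name is made up for this example, and the check relies only on the sk_live_ prefix.

```python
import json
from typing import List


def find_live_secret_keys(config_text: str) -> List[str]:
    """Return names of MCP servers configured with an sk_live_ secret key."""
    config = json.loads(config_text)
    flagged = []
    for name, server in config.get("mcpServers", {}).items():
        key = server.get("env", {}).get("STRIPE_API_KEY", "")
        # Restricted Keys start with rk_; sk_live_ means a full-scope live key.
        if key.startswith("sk_live_"):
            flagged.append(name)
    return flagged


SAMPLE = """
{
  "mcpServers": {
    "stripe": {"command": "npx", "env": {"STRIPE_API_KEY": "sk_live_placeholder"}},
    "stripe_support": {"command": "npx", "env": {"STRIPE_API_KEY": "rk_live_placeholder"}}
  }
}
"""

print(find_live_secret_keys(SAMPLE))  # ['stripe']
```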
The three governance gaps the toolkit doesn't close
Even with the right Restricted Key and the tightest --tools subset, three things that matter for a production money-moving agent remain open:
Gap 1 — no per-day USD cap
If create_refund is exposed and the agent hallucinates a double-refund loop, each call succeeds until something external breaks the loop. Stripe's Restricted Key does not support "max $X per day" for refunds or charges. The toolkit doesn't add one. You'd have to:
- Add the cap in the agent's own code (cooperative, leaks if the agent's code changes)
- Add a Stripe Radar rule (fraud engine, awkward semantic match for refunds, requires a paid Radar tier)
- Insert a proxy between the toolkit and Stripe that enforces the cap
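Whichever layer enforces it, the cap itself is a small piece of state: a running total per calendar day that refuses any call that would cross the limit. A minimal in-memory sketch follows, assuming amounts in cents as in Stripe's API. The `DailyCap` class is hypothetical, and a real deployment would need shared, durable storage rather than a dict.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict


@dataclass
class DailyCap:
    """Tracks spend per day and rejects anything that would exceed the cap."""

    cap_cents: int
    spent: Dict[date, int] = field(default_factory=dict)

    def try_spend(self, amount_cents: int, today: date) -> bool:
        used = self.spent.get(today, 0)
        # Reject non-positive amounts and anything that would cross the cap.
        if amount_cents <= 0 or used + amount_cents > self.cap_cents:
            return False
        self.spent[today] = used + amount_cents
        return True


cap = DailyCap(cap_cents=500_00)  # $500 per day
day = date(2026, 1, 15)
first = cap.try_spend(300_00, day)   # allowed: $300 of $500 used
second = cap.try_spend(300_00, day)  # rejected: would reach $600
third = cap.try_spend(200_00, day)   # allowed: exactly $500 used
```

A runaway refund loop then stops at the cap instead of at whatever external event happens to break it.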
Gap 2 — no customer-level allowlist
A support agent should probably only refund customers whose tickets it's currently handling. The toolkit has no way to say "the agent may only touch customers in {cus_A, cus_B, cus_C}." You can't filter a Restricted Key to specific customer IDs outside of Stripe Connect. So if the agent hallucinates a customer ID and issues a refund, the call succeeds.
Gap 3 — mid-run revoke is slow
If the agent goes wrong at 2am, you're in the Stripe Dashboard → Developers → API Keys → Delete flow. Measured in our own tests: median 45 seconds for the propagation to start rejecting calls, p95 of 3 minutes 12 seconds. If the loop is making one call every 400ms, that's roughly 110 to 480 extra calls that get through. Rotating also breaks any other legitimate consumer of that key.
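The exposure during a slow revoke is just the revoke window divided by the call interval. Working it through with the latencies and call rate quoted above (integer milliseconds, to avoid floating-point rounding):

```python
def calls_in_window(window_ms: int, interval_ms: int) -> int:
    """Number of whole call intervals that fit inside the revoke window."""
    return window_ms // interval_ms


INTERVAL_MS = 400  # one call every 400 ms
median_calls = calls_in_window(45 * 1000, INTERVAL_MS)  # 45 s median
p95_calls = calls_in_window(192 * 1000, INTERVAL_MS)    # 3 min 12 s p95

print(median_calls, p95_calls)  # 112 480
```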
The five-minute fix
You don't need to fork the toolkit or build custom middleware. The toolkit's MCP server passes the STRIPE_API_BASE environment variable through to stripe-node's config, which rewrites the base URL for every outbound HTTP call. Point that env var at a governance proxy, and every tool call the MCP server makes goes through policy enforcement:
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": [
        "-y",
        "@stripe/mcp",
        "--tools=create_refund,list_customers,retrieve_balance"
      ],
      "env": {
        "STRIPE_API_KEY": "vault_live_...",
        "STRIPE_API_BASE": "https://proxy.keybrake.com/stripe"
      }
    }
  }
}
Two changed lines, no toolkit fork, no code modification, no LLM retraining. The vault key is issued with a policy attached: { daily_usd_cap: 500, customer_allowlist: ["cus_A", "cus_B"], endpoint_allowlist: ["/v1/refunds", "/v1/customers"], max_refund_usd: 50, expires_at: "2026-05-01" }. The MCP server has no idea it's being proxied; the Stripe SDK simply sends the same requests to a different host. Revoke is a single HTTP call that completes in under a second against any proxy instance.
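To make the policy fields concrete, here is a sketch of how a proxy might evaluate a single request against them. This is not Keybrake's implementation: the `check_request` function, its return convention, and the choice to treat the expires_at date itself as already expired are all assumptions. How the proxy resolves which customer a refund belongs to is left outside the sketch.

```python
from datetime import date
from fnmatch import fnmatchcase
from typing import Any, Dict, Optional

POLICY: Dict[str, Any] = {
    "customer_allowlist": ["cus_A", "cus_B"],
    "endpoint_allowlist": ["/v1/refunds", "/v1/customers"],
    "max_refund_usd": 50,
    "expires_at": "2026-05-01",
}


def check_request(
    policy: Dict[str, Any],
    path: str,
    today: date,
    customer: Optional[str] = None,
    refund_cents: Optional[int] = None,
) -> Optional[str]:
    """Return None if the request is allowed, else a short denial reason."""
    # Assumption: the policy stops working at the start of the expires_at date.
    if today >= date.fromisoformat(policy["expires_at"]):
        return "policy expired"
    # Glob match, so a pattern like "/v1/customers/*" would also work.
    if not any(fnmatchcase(path, p) for p in policy["endpoint_allowlist"]):
        return "endpoint not allowed"
    if customer is not None and customer not in policy["customer_allowlist"]:
        return "customer not allowed"
    # Stripe amounts are in cents; the policy limit is in dollars.
    if refund_cents is not None and refund_cents > policy["max_refund_usd"] * 100:
        return "refund exceeds per-call maximum"
    return None


today = date(2026, 4, 1)
print(check_request(POLICY, "/v1/refunds", today, "cus_A", 2500))  # None
print(check_request(POLICY, "/v1/refunds", today, "cus_Z", 2500))  # customer not allowed
```

A denial reason would map to the 403 the proxy returns instead of forwarding the call to Stripe.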
What this protects against that the toolkit alone doesn't
| Control | Toolkit alone | Toolkit + Restricted Key | Toolkit + RK + proxy |
|---|---|---|---|
| LLM sees the key | No | No | No |
| Scope-limit by tool | Partial (--tools flag) | Yes (key scopes) | Yes + endpoint allowlist |
| Per-day USD cap | No | No | Yes |
| Customer allowlist | No | No (non-Connect) | Yes |
| Max-amount-per-call | No | No | Yes |
| Mid-run revoke | Rotate key only | Rotate key only | Sub-second policy change |
| Per-call audit | Whatever you log | Stripe Events API | Full request/response/cost/policy-result |
An opinionated Claude Desktop setup for a support agent
Concrete example — an agent that handles refund-request tickets, should only touch customers whose tickets are currently assigned, refund at most $50 per request with a $500/day ceiling:
- Mint a Stripe Restricted Key with: Refunds: Write, Customers: Read, PaymentIntents: Read, Balance: Read, everything else None. Save as rk_live_....
- Create a Keybrake vault key pointing at that RK. Attach policy: { endpoint_allowlist: ["/v1/refunds", "/v1/customers/*", "/v1/payment_intents/*", "/v1/balance"], max_refund_usd: 50, daily_usd_cap: 500, customer_allowlist: <sync'd from ticket system>, expires_at: end-of-shift }. Save as vault_live_....
- Update claude_desktop_config.json with the MCP block above, using the vault key and the proxy base URL. Restrict --tools to create_refund,list_customers,retrieve_balance.
- Restart Claude Desktop. Every tool-call the agent makes now hits the proxy, checks the policy, and either forwards to Stripe or 403s. Audit log accumulates in Keybrake.
Blast radius of a stuck loop is now capped at $500 and bounded to customers the agent was allowed to touch. Revoke is policy-change-in-a-dashboard. The toolkit code never changed.
How Keybrake fits
Keybrake is the proxy in that config, the thing STRIPE_API_BASE points at. We issue vault keys with policies, enforce them on every call, parse the real cost from Stripe's response, and log everything tied to your agent-run ID. It works with Stripe Agent Toolkit over MCP (as shown), over LangChain, over Vercel AI SDK, or raw: the governance happens at the network boundary, so every toolkit integration mode gets it for free. v1 covers Stripe, Twilio, Resend.
Related questions
Does the --tools flag prevent the agent from using tools I didn't list?
At the MCP layer, yes — tools not listed aren't advertised to the LLM, so it doesn't know to call them. But the underlying Stripe API key still has whatever scope you gave it. If the LLM writes raw curl commands to shell tools, or a different MCP server uses the same key, everything the key is scoped for still works. The --tools flag is a discoverability filter, not a security boundary; the key's own Restricted scope is the real security boundary.
Can I use a Stripe Restricted Key with the toolkit?
Yes — rk_live_... or rk_test_... works in the STRIPE_API_KEY env var. The toolkit doesn't care whether it's an sk_ secret key or an rk_ Restricted Key; stripe-node treats them identically on the wire. See our working example for scope choices.
Does the toolkit handle idempotency keys automatically?
Partially. The toolkit passes an Idempotency-Key on create-style calls when it can construct one deterministically from the tool arguments. In practice, if the LLM retries a tool call with slightly different arguments (temperature-induced variation), the idempotency key changes too and Stripe treats it as a new operation. For strong dedupe, set your own idempotency strategy — typically "<agent_run_id>:<tool_name>:<stable_hash_of_args>" — upstream of the toolkit.
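A minimal sketch of that key format is below. It hashes only the fields that define the operation, so a retry that differs in free-text arguments still maps to the same key. The `idempotency_key` helper and the choice of identity fields are illustrative assumptions, not part of the toolkit.

```python
import hashlib
import json
from typing import Any, Dict, Sequence


def idempotency_key(
    agent_run_id: str,
    tool_name: str,
    args: Dict[str, Any],
    identity_fields: Sequence[str],
) -> str:
    """Build '<agent_run_id>:<tool_name>:<stable_hash_of_args>'."""
    # Keep only the fields that identify the operation, then serialize them
    # canonically so key order and whitespace cannot change the hash.
    identity = {k: args[k] for k in identity_fields if k in args}
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
    return f"{agent_run_id}:{tool_name}:{digest}"


REFUND_FIELDS = ("payment_intent", "amount")

first_try = idempotency_key(
    "run_1", "create_refund",
    {"payment_intent": "pi_1", "amount": 500, "reason": "duplicate"},
    REFUND_FIELDS,
)
retry = idempotency_key(
    "run_1", "create_refund",
    {"reason": "customer asked", "amount": 500, "payment_intent": "pi_1"},
    REFUND_FIELDS,
)
assert first_try == retry  # same operation, same key
```

Which fields count as identity is a per-tool decision. For refunds, the payment and the amount are a reasonable starting point.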
What about Stripe Connect? Does the toolkit support it?
Yes — pass stripeAccount in the toolkit's configuration and the MCP server will add the Stripe-Account header on every call. This is how most Connect platforms use the toolkit: each agent runs scoped to one connected account, inheriting that account's Restricted Key permissions. You still want a proxy in front for per-day caps and revoke, but you get customer-allowlist for free because the connected-account scope already limits the universe of customers the call can touch.
Is this safe to run against a production Stripe account?
Not on the default config, no. The minimum floor for production is: Restricted Key (not Secret Key), narrow --tools list, idempotency strategy, monitoring on Stripe events, and a documented revoke procedure. With those in place it's about as safe as any manually-used Restricted Key. With a proxy layer added, the blast radius of mistakes is explicitly bounded by the policy you set — which is what most teams actually want before letting an LLM anywhere near create_charge.
Further reading
- MCP API key auth — the four credential-flow patterns MCP servers use; the toolkit uses pattern 1 and benefits from pattern 4.
- Stripe Restricted Key example — the five-tick scope set you should be using with the toolkit.
- Stripe API key with restricted access — full 10-control coverage matrix; rows that say "no" are the rows this proxy closes.
- AI agent kill switch — what "mid-run revoke" actually means in latency terms, and why it's the gap the toolkit alone can't fill.
- How to give an AI agent a Stripe API key — the five-control checklist with code for both SDK-wrapper and reverse-proxy patterns.