MCP API key auth: how credentials actually flow through an MCP server
There are four patterns in the wild. Three leak, one doesn't. Here is how each one looks inside Claude Desktop, Cursor, and Windsurf, what actually exposes your key, and where a governance proxy fits when the MCP tool calls a money-moving API.
TL;DR
The Model Context Protocol (MCP) does not standardize how the MCP server authenticates to the downstream API it wraps — only how the MCP client talks to the server. In practice: about 80% of public MCP servers read a downstream API key from an environment variable set by the client config, 15% expect the client to pass the key per tool-call as a parameter, and a small and growing number use OAuth. Each pattern has a different leak model. If the downstream API is Stripe, Twilio, Resend, or any money-moving SaaS, none of these patterns give you a spend cap, a customer-level allowlist, or a sub-second revoke — the MCP spec is deliberately silent on those, and that's where a proxy goes.
Where auth lives in the MCP architecture
An MCP deployment has three moving parts:
- The MCP client — Claude Desktop, Cursor, Windsurf, Zed, etc. It launches (or connects to) the MCP server, forwards tool calls to it, shows the results back to the user or model.
- The MCP server — a process (local binary, npm package, remote HTTPS endpoint) that exposes a list of tools and handles each tool-call by making real API calls to some downstream service.
- The downstream API — Stripe, Twilio, Resend, Postgres, GitHub, Shopify. What the MCP server is actually wrapping.
The MCP spec covers auth between the client and the server (stdio transport inherits OS-level trust; HTTP+SSE transport is standardizing OAuth as of the 2025-11 spec). It explicitly leaves auth between the server and the downstream API up to the server author. That's where every real credential-handling decision happens — and where every public MCP server has quietly picked a different approach.
Pattern 1: environment-variable secret (the common one)
This is how @stripe/mcp, @modelcontextprotocol/server-postgres, @twilio/mcp, and most community servers work. The client config looks like this in Claude Desktop:
```json
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp"],
      "env": {
        "STRIPE_API_KEY": "sk_live_..."
      }
    }
  }
}
```
When Claude Desktop spawns the server, it injects STRIPE_API_KEY into the child process environment. The server reads it on boot, holds it in memory, and uses it for every tool-call. The key lives on disk in claude_desktop_config.json (plaintext), in the process environment (readable by any process running as the same user), and in memory for the lifetime of the server.
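Server-side, the pattern reduces to a one-time environment read at boot. A minimal sketch (illustrative names, not the actual @stripe/mcp source):

```typescript
// Pattern 1 from the server's side: the key is read once from the process
// environment at boot and held in memory for every subsequent tool-call.
function readDownstreamKey(env: Record<string, string | undefined>): string {
  const key = env.STRIPE_API_KEY;
  if (!key) {
    // Fail fast at boot rather than on the first tool-call.
    throw new Error("STRIPE_API_KEY is not set");
  }
  return key;
}

// At boot the server would call:
//   const apiKey = readDownstreamKey(process.env);
// Every tool handler then closes over `apiKey` for its HTTP calls,
// so the key never appears in tool schemas, inputs, or outputs.
```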
What this protects against: the LLM itself seeing the key. Claude never gets STRIPE_API_KEY as part of tool inputs or outputs — it sees only tool schemas and tool results. That's a real security boundary.
What this does not protect against: anything else. The key is long-lived. It has whatever scope you gave it when you minted it in the Stripe Dashboard. A re-prompted Claude agent calling charges.create 80 times in a loop will succeed every time because the MCP server obediently forwards the call with the full-scope key. There is no spend cap, no per-call audit, no mid-run revoke.
Pattern 2: client-supplied per-call header
A smaller number of MCP servers expose a "set the auth header per tool-call" pattern, usually framed as "bring your own key." The server takes the credential as a tool-call parameter or reads it from a per-session header on an HTTP+SSE transport. @shopify/mcp does this for admin access tokens, for example.
This pushes the decision about which key to use onto the client. If the client is Claude Desktop configured by a human, the human picks. If the client is an agent framework, the agent picks — and now the LLM has to be trusted with the credential. That's usually a regression from pattern 1.
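To see why pattern 2 is a regression, note where the credential sits: inside the arguments object the model constructs, which the model can therefore read and echo. A sketch with a hypothetical Shopify-style tool (schema and names are illustrative, not @shopify/mcp's actual interface):

```typescript
// Pattern 2: the credential travels inside the tool-call arguments,
// which are part of the model-visible message payload.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

function buildOrderLookup(accessToken: string, orderId: string): ToolCall {
  return {
    name: "shopify_get_order",
    arguments: {
      order_id: orderId,
      access_token: accessToken, // <-- credential in model-visible input
    },
  };
}

// Anything the agent serializes into a tool-call, the LLM has seen:
const call = buildOrderLookup("shpat_example_token", "1001");
const leaks = JSON.stringify(call).includes("shpat_example_token"); // true
```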
Pattern 3: OAuth-per-tool-call
The 2025-11 MCP spec standardized an OAuth flow for HTTP+SSE transports. Remote MCP servers can gate tool-calls behind an OAuth access token the client holds (obtained via a standard OAuth 2.1 code flow on first connect and refreshed as needed). The token is bearer auth between client and server; the server still has to figure out how to map that token to a downstream API credential.
In practice, an OAuth-aware remote MCP server does one of:
- Mints a short-lived downstream credential per user, stored keyed by the OAuth user ID (typical SaaS-integration pattern)
- Holds a single long-lived downstream key and trusts the OAuth gate as authorization (punts governance entirely)
- Issues a downstream "vault-style" key to the user out of band and expects them to pass it
The first option is clean but depends on the downstream API having a programmatic key-issuance endpoint scoped by user (most don't — Stripe Connect is the closest Stripe has; Twilio has sub-accounts; Resend doesn't). The second option doesn't help with governance at all. The third option is what pattern 4 is doing.
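The first option can be sketched as a per-user credential cache keyed by the OAuth user ID. All names here are hypothetical; the `mint` callback stands in for whatever key-issuance endpoint the downstream API offers (a Stripe Connect call, a Twilio sub-account token, etc.):

```typescript
// Per-user store of short-lived downstream credentials, keyed by OAuth
// user ID; re-mints when the cached credential has expired.
interface DownstreamCredential {
  key: string;
  expiresAt: number; // epoch ms
}

const credentialStore = new Map<string, DownstreamCredential>();

function credentialFor(
  oauthUserId: string,
  now: number,
  mint: (userId: string) => DownstreamCredential,
): string {
  const existing = credentialStore.get(oauthUserId);
  if (existing && existing.expiresAt > now) return existing.key;
  const fresh = mint(oauthUserId); // downstream key-issuance call
  credentialStore.set(oauthUserId, fresh);
  return fresh.key;
}
```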
Pattern 4: proxy-enforced vault keys (what we're building)
The idea: issue a vault key that looks like a downstream API key but is scoped, capped, and revocable by policy in a proxy you own. The MCP server reads the vault key from its environment exactly the same way pattern 1 uses a real key, so existing MCP servers (including @stripe/mcp, @twilio/mcp, and @resend/mcp) work without any code change. The proxy sits between the MCP server and api.stripe.com / api.twilio.com / api.resend.com.
```json
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp"],
      "env": {
        "STRIPE_API_KEY": "vault_live_...",
        "STRIPE_API_BASE": "https://proxy.keybrake.com/stripe"
      }
    }
  }
}
```
Now the governance layer is enforced at the network boundary: a spend cap per day, a customer allowlist, parameter limits (e.g. max_amount_usd: 50 on charge creates), mid-run revoke of the vault key (sub-second, no MCP server restart), and a complete audit log tied to the agent run ID. The MCP server doesn't know it's being proxied. The real Stripe key never leaves the vault.
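The per-request policy check such a proxy runs might look like the following sketch (field names are illustrative of the controls listed above, not a real Keybrake API):

```typescript
// Policy evaluated by the proxy before forwarding a charge to Stripe.
interface Policy {
  dailyCapUsd: number;
  customerAllowlist: string[];
  maxAmountPerCallUsd: number;
  revoked: boolean; // flipping this blocks the next call, no restart needed
}

interface ChargeRequest {
  amountUsd: number;
  customerId: string;
}

function authorize(
  policy: Policy,
  spentTodayUsd: number,
  req: ChargeRequest,
): { allow: boolean; reason?: string } {
  if (policy.revoked) return { allow: false, reason: "key revoked" };
  if (req.amountUsd > policy.maxAmountPerCallUsd)
    return { allow: false, reason: "per-call amount cap" };
  if (spentTodayUsd + req.amountUsd > policy.dailyCapUsd)
    return { allow: false, reason: "daily spend cap" };
  if (!policy.customerAllowlist.includes(req.customerId))
    return { allow: false, reason: "customer not on allowlist" };
  return { allow: true };
}
```

Because the check runs per request at the network boundary, a stuck loop hits the daily cap instead of draining the account, and a revoke takes effect on the very next call.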
This is pattern 1 plus an enforcement layer. You keep what pattern 1 got right (LLM never sees the credential) and add what pattern 1 missed (everything else).
Comparison table
| Pattern | LLM sees key? | Spend cap | Customer scope | Mid-run revoke | Per-call audit |
|---|---|---|---|---|---|
| 1. Env-var secret | No | No | No | Rotate only (slow) | No |
| 2. Per-call header | Usually yes | No | No | Drop the call parameter | No |
| 3. OAuth per call | No | Depends on server | Depends on server | Revoke OAuth token | Depends on server |
| 4. Proxy-enforced vault | No | Yes | Yes | Sub-second, policy-level | Yes |
What to do today if you're shipping an MCP integration with real money downstream
- If the downstream is read-only (search, fetch, Postgres SELECTs), pattern 1 is fine. The blast radius of a stuck loop is wasted CPU, not money.
- If the downstream can move money through the MCP tools it ships (charges.create, refunds.create, transfers.create) and you're running through Claude Desktop or Cursor for interactive use, pattern 1 with a tightly-scoped Restricted Key is the current floor. See the Restricted Key example for the five-tick scope set.
- If the downstream can move money and you're running the MCP server against an autonomous agent (no human in the loop), pattern 1 is insufficient — the missing controls matter. Either build your own proxy or use one.
How Keybrake fits
Keybrake is pattern 4, prebuilt. Register your Stripe / Twilio / Resend key once; issue per-MCP-server vault_… keys; attach a policy (daily USD cap, customer allowlist, endpoint allowlist, max-amount-per-call, expires-at); swap the MCP server's API_BASE to proxy.keybrake.com/<vendor>; done. Every tool-call the MCP server makes is governed, logged, and killable. The MCP server code is untouched.
Related questions
Does the MCP spec itself define how the server auths to the downstream API?
No. The spec covers client-to-server authentication (stdio inherits OS trust, HTTP+SSE uses OAuth 2.1 as of the 2025-11 revision). Server-to-downstream-API authentication is treated as an implementation detail of each server. The upside is that MCP can wrap anything; the downside is that governance is not a spec concern and every server author picks their own trade-offs.
If I use a vault key with Stripe's official MCP server, does the server know?
No. The Stripe MCP server uses stripe-node under the hood and accepts a STRIPE_API_KEY env var plus an optional STRIPE_API_BASE override. It treats the key as an opaque credential and sends it to whatever base URL it was given, so a vault key (vault_live_…) passes through unchanged — it's the proxy, not Stripe, that validates it. From the server's perspective, it's calling the same endpoints it always did.
Can I use pattern 4 with an MCP server I wrote myself?
Yes — easier, actually. You get to choose the API_BASE in your own code. Swap the SDK's baseURL (Stripe apiBase, Twilio edge, Resend baseUrl) to the proxy endpoint and ship the vault key as its credential.
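A sketch of that swap in a hand-written server (illustrative helper, not a specific SDK's option names). One detail worth a comment: plain concatenation, not URL resolution, is needed so a proxy base keeps its path prefix:

```typescript
// Build the downstream request URL from a configurable base.
function buildRequestUrl(baseUrl: string, path: string): string {
  // Plain concatenation on purpose: a proxy base such as
  // https://proxy.keybrake.com/stripe must keep its /stripe path prefix,
  // which `new URL(path, base)` would discard for absolute paths.
  return baseUrl.replace(/\/+$/, "") + path;
}

// Default to the vendor; a single env var redirects everything:
//   const base = process.env.STRIPE_API_BASE ?? "https://api.stripe.com";
const direct = buildRequestUrl("https://api.stripe.com", "/v1/charges");
const proxied = buildRequestUrl("https://proxy.keybrake.com/stripe", "/v1/charges");
```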
What about MCP servers that run as a remote HTTPS endpoint instead of a local subprocess?
Same three patterns apply, with one caveat: with a remote server you do not control the environment, so pattern 1 becomes "the server operator holds your key" — a delegation of trust many teams aren't willing to make. OAuth (pattern 3) is better suited here. Combining remote MCP with a proxy (pattern 4) means the remote operator holds your vault key, not the real one, which makes the delegation much cheaper to reason about.
Is this only a concern for money-moving APIs?
Mostly. MCP servers wrapping read-only Postgres, GitHub, or search APIs don't benefit much from pattern 4 — there's nothing to cap in dollars. The cases that need governance are the ones where a stuck loop causes a real-world effect: charging, refunding, sending SMS, sending email, creating Shopify orders, issuing payouts. That's our scope; it's also why MCP servers wrapping those APIs are where you want a proxy in front.
Further reading
- Stripe Restricted Key example — the tightly-scoped upstream key pattern 1 should really be using.
- Stripe API key with restricted access — coverage matrix: what Restricted Keys do and don't give you for an AI agent.
- LiteLLM alternative for Stripe? — why LLM proxies don't solve the SaaS-tool key-governance problem, and what does.
- How to give an AI agent a Stripe API key — the five-control checklist, with code for both SDK-wrapper and reverse-proxy patterns.