MCP API key auth: how credentials actually flow through an MCP server
There are four patterns in the wild. Three leak, one doesn't. Here is how each one looks inside Claude Desktop, Cursor, and Windsurf, what actually exposes your key, and where a governance proxy fits when the MCP tool calls a money-moving API.
TL;DR
The Model Context Protocol (MCP) does not standardize how the MCP server authenticates to the downstream API it wraps — only how the MCP client talks to the server. In practice: about 80% of public MCP servers read a downstream API key from an environment variable set by the client config, 15% expect the client to pass the key per tool-call as a parameter, and a small and growing number use OAuth. Each pattern has a different leak model. If the downstream API is Stripe, Twilio, Resend, or any money-moving SaaS, none of these patterns give you a spend cap, a customer-level allowlist, or a sub-second revoke — the MCP spec is deliberately silent on those, and that's where a proxy goes.
Where auth lives in the MCP architecture
An MCP deployment has three moving parts:
- The MCP client — Claude Desktop, Cursor, Windsurf, Zed, etc. It launches (or connects to) the MCP server, forwards tool calls to it, shows the results back to the user or model.
- The MCP server — a process (local binary, npm package, remote HTTPS endpoint) that exposes a list of tools and handles each tool-call by making real API calls to some downstream service.
- The downstream API — Stripe, Twilio, Resend, Postgres, GitHub, Shopify. What the MCP server is actually wrapping.
The MCP spec covers auth between the client and the server (stdio transport inherits OS-level trust; HTTP+SSE transport is standardizing OAuth as of the 2025-11 spec). It explicitly leaves auth between the server and the downstream API up to the server author. That's where every real credential-handling decision happens — and where every public MCP server has quietly picked a different approach.
Pattern 1: environment-variable secret (the common one)
This is how @stripe/mcp, @modelcontextprotocol/server-postgres, @twilio/mcp, and most community servers work. The client config looks like this in Claude Desktop:
```json
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp"],
      "env": {
        "STRIPE_API_KEY": "sk_live_..."
      }
    }
  }
}
```
When Claude Desktop spawns the server, it injects STRIPE_API_KEY into the child process environment. The server reads it on boot, holds it in memory, and uses it for every tool-call. The key lives on disk in claude_desktop_config.json (plaintext), in the process environment (readable by any process running as the same user), and in memory for the lifetime of the server.
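Server-side, the pattern reduces to a one-time environment read at boot. A minimal sketch (illustrative names, not the actual @stripe/mcp source):

```typescript
// Pattern 1 from the server's side: the key is read once from the process
// environment at boot and held in memory for every subsequent tool-call.
function readDownstreamKey(env: Record<string, string | undefined>): string {
  const key = env.STRIPE_API_KEY;
  if (!key) {
    // Fail fast at boot rather than on the first tool-call.
    throw new Error("STRIPE_API_KEY is not set");
  }
  return key;
}

// At boot the server would call:
//   const apiKey = readDownstreamKey(process.env);
// Every tool handler then closes over `apiKey` for its HTTP calls,
// so the key never appears in tool schemas, inputs, or outputs.
```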
What this protects against: the LLM itself seeing the key. Claude never gets STRIPE_API_KEY as part of tool inputs or outputs — it sees only tool schemas and tool results. That's a real security boundary.
What this does not protect against: anything else. The key is long-lived. It has whatever scope you gave it when you minted it in the Stripe Dashboard. A re-prompted Claude agent calling charges.create 80 times in a loop will succeed every time because the MCP server obediently forwards the call with the full-scope key. There is no spend cap, no per-call audit, no mid-run revoke.
Pattern 2: client-supplied per-call header
A smaller number of MCP servers expose a "set the auth header per tool-call" pattern, usually framed as "bring your own key." The server takes the credential as a tool-call parameter or reads it from a per-session header on an HTTP+SSE transport. @shopify/mcp does this for admin access tokens, for example.
This pushes the decision about which key to use onto the client. If the client is Claude Desktop configured by a human, the human picks. If the client is an agent framework, the agent picks — and now the LLM has to be trusted with the credential. That's usually a regression from pattern 1.
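To see why pattern 2 is a regression, note where the credential sits: inside the arguments object the model constructs, which the model can therefore read and echo. A sketch with a hypothetical Shopify-style tool (schema and names are illustrative, not @shopify/mcp's actual interface):

```typescript
// Pattern 2: the credential travels inside the tool-call arguments,
// which are part of the model-visible message payload.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

function buildOrderLookup(accessToken: string, orderId: string): ToolCall {
  return {
    name: "shopify_get_order",
    arguments: {
      order_id: orderId,
      access_token: accessToken, // <-- credential in model-visible input
    },
  };
}

// Anything the agent serializes into a tool-call, the LLM has seen:
const call = buildOrderLookup("shpat_example_token", "1001");
const leaks = JSON.stringify(call).includes("shpat_example_token"); // true
```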
Pattern 3: OAuth-per-tool-call
The 2025-11 MCP spec standardized an OAuth flow for HTTP+SSE transports. Remote MCP servers can gate tool-calls behind an OAuth access token the client holds (obtained via a standard OAuth 2.1 code flow on first connect and refreshed as needed). The token is bearer auth between client and server; the server still has to figure out how to map that token to a downstream API credential.
In practice, an OAuth-aware remote MCP server does one of:
- Mints a short-lived downstream credential per user, stored keyed by the OAuth user ID (typical SaaS-integration pattern)
- Holds a single long-lived downstream key and trusts the OAuth gate as authorization (punts governance entirely)
- Issues a downstream "vault-style" key to the user out of band and expects them to pass it
The first option is clean but depends on the downstream API having a programmatic key-issuance endpoint scoped by user (most don't — Stripe Connect is the closest Stripe has; Twilio has sub-accounts; Resend doesn't). The second option doesn't help with governance at all. The third option is what pattern 4 is doing.
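The first option can be sketched as a per-user credential cache keyed by the OAuth user ID. All names here are hypothetical; the `mint` callback stands in for whatever key-issuance endpoint the downstream API offers (a Stripe Connect call, a Twilio sub-account token, etc.):

```typescript
// Per-user store of short-lived downstream credentials, keyed by OAuth
// user ID; re-mints when the cached credential has expired.
interface DownstreamCredential {
  key: string;
  expiresAt: number; // epoch ms
}

const credentialStore = new Map<string, DownstreamCredential>();

function credentialFor(
  oauthUserId: string,
  now: number,
  mint: (userId: string) => DownstreamCredential,
): string {
  const existing = credentialStore.get(oauthUserId);
  if (existing && existing.expiresAt > now) return existing.key;
  const fresh = mint(oauthUserId); // downstream key-issuance call
  credentialStore.set(oauthUserId, fresh);
  return fresh.key;
}
```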
Pattern 4: proxy-enforced vault keys (what we're building)
The idea: issue a vault key that looks like a downstream API key but is scoped, capped, and revocable by policy in a proxy you own. The MCP server reads the vault key from its environment exactly the same way pattern 1 uses a real key, so existing MCP servers (including @stripe/mcp, @twilio/mcp, and @resend/mcp) work without any code change. The proxy sits between the MCP server and api.stripe.com / api.twilio.com / api.resend.com.
```json
{
  "mcpServers": {
    "stripe": {
      "command": "npx",
      "args": ["-y", "@stripe/mcp"],
      "env": {
        "STRIPE_API_KEY": "vault_live_...",
        "STRIPE_API_BASE": "https://proxy.keybrake.com/stripe"
      }
    }
  }
}
```
Now the governance layer is enforced at the network boundary: a spend cap per day, a customer allowlist, parameter limits (e.g. max_amount_usd: 50 on charge creates), mid-run revoke of the vault key (sub-second, no MCP server restart), and a complete audit log tied to the agent run ID. The MCP server doesn't know it's being proxied. The real Stripe key never leaves the vault.
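The per-request policy check such a proxy runs might look like the following sketch (field names are illustrative of the controls listed above, not a real Keybrake API):

```typescript
// Policy evaluated by the proxy before forwarding a charge to Stripe.
interface Policy {
  dailyCapUsd: number;
  customerAllowlist: string[];
  maxAmountPerCallUsd: number;
  revoked: boolean; // flipping this blocks the next call, no restart needed
}

interface ChargeRequest {
  amountUsd: number;
  customerId: string;
}

function authorize(
  policy: Policy,
  spentTodayUsd: number,
  req: ChargeRequest,
): { allow: boolean; reason?: string } {
  if (policy.revoked) return { allow: false, reason: "key revoked" };
  if (req.amountUsd > policy.maxAmountPerCallUsd)
    return { allow: false, reason: "per-call amount cap" };
  if (spentTodayUsd + req.amountUsd > policy.dailyCapUsd)
    return { allow: false, reason: "daily spend cap" };
  if (!policy.customerAllowlist.includes(req.customerId))
    return { allow: false, reason: "customer not on allowlist" };
  return { allow: true };
}
```

Because the check runs per request at the network boundary, a stuck loop hits the daily cap instead of draining the account, and a revoke takes effect on the very next call.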
This is pattern 1 plus an enforcement layer. You keep what pattern 1 got right (LLM never sees the credential) and add what pattern 1 missed (everything else).
Comparison table
| Pattern | LLM sees key? | Spend cap | Customer scope | Mid-run revoke | Per-call audit |
|---|---|---|---|---|---|
| 1. Env-var secret | No | No | No | Rotate only (slow) | No |
| 2. Per-call header | Usually yes | No | No | Drop the call parameter | No |
| 3. OAuth per call | No | Depends on server | Depends on server | Revoke OAuth token | Depends on server |
| 4. Proxy-enforced vault | No | Yes | Yes | Sub-second, policy-level | Yes |
What to do today if you're shipping an MCP integration with real money downstream
- If the downstream is read-only (search, fetch, Postgres SELECTs), pattern 1 is fine. The blast radius of a stuck loop is wasted CPU, not money.
- If the downstream can move money through the MCP tools it ships (charges.create, refunds.create, transfers.create) and you're running through Claude Desktop or Cursor for interactive use, pattern 1 with a tightly-scoped Restricted Key is the current floor. See the Restricted Key example for the five-tick scope set.
- If the downstream can move money and you're running the MCP server against an autonomous agent (no human in the loop), pattern 1 is insufficient — the missing controls matter. Either build your own proxy or use one.
How Keybrake fits
Keybrake is pattern 4, prebuilt. Register your Stripe / Twilio / Resend key once; issue per-MCP-server vault_… keys; attach a policy (daily USD cap, customer allowlist, endpoint allowlist, max-amount-per-call, expires-at); swap the MCP server's API_BASE to proxy.keybrake.com/<vendor>; done. Every tool-call the MCP server makes is governed, logged, and killable. The MCP server code is untouched.
Related questions
Does the MCP spec itself define how the server auths to the downstream API?
No. The spec covers client-to-server authentication (stdio inherits OS trust, HTTP+SSE uses OAuth 2.1 as of the 2025-11 revision). Server-to-downstream-API authentication is treated as an implementation detail of each server. The upside is that MCP can wrap anything; the downside is that governance is not a spec concern and every server author picks their own trade-offs.
If I use a vault key with Stripe's official MCP server, does the server know?
No. The Stripe MCP server uses stripe-node under the hood and accepts a STRIPE_API_KEY env var plus an optional STRIPE_API_BASE override. It treats the key as an opaque credential and sends it to whatever base URL it was given, so a vault key (vault_live_…) passes through unchanged — it's the proxy, not Stripe, that validates it. From the server's perspective, it's calling the same endpoints it always did.
Can I use pattern 4 with an MCP server I wrote myself?
Yes — easier, actually. You get to choose the API_BASE in your own code. Swap the SDK's baseURL (Stripe apiBase, Twilio edge, Resend baseUrl) to the proxy endpoint and ship the vault key as its credential.
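A sketch of that swap in a hand-written server (illustrative helper, not a specific SDK's option names). One detail worth a comment: plain concatenation, not URL resolution, is needed so a proxy base keeps its path prefix:

```typescript
// Build the downstream request URL from a configurable base.
function buildRequestUrl(baseUrl: string, path: string): string {
  // Plain concatenation on purpose: a proxy base such as
  // https://proxy.keybrake.com/stripe must keep its /stripe path prefix,
  // which `new URL(path, base)` would discard for absolute paths.
  return baseUrl.replace(/\/+$/, "") + path;
}

// Default to the vendor; a single env var redirects everything:
//   const base = process.env.STRIPE_API_BASE ?? "https://api.stripe.com";
const direct = buildRequestUrl("https://api.stripe.com", "/v1/charges");
const proxied = buildRequestUrl("https://proxy.keybrake.com/stripe", "/v1/charges");
```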
What about MCP servers that run as a remote HTTPS endpoint instead of a local subprocess?
Same three patterns apply, with one caveat: with a remote server you do not control the environment, so pattern 1 becomes "the server operator holds your key" — a delegation of trust many teams aren't willing to make. OAuth (pattern 3) is better suited here. Combining remote MCP with a proxy (pattern 4) means the remote operator holds your vault key, not the real one, which makes the delegation much cheaper to reason about.
Is this only a concern for money-moving APIs?
Mostly. MCP servers wrapping read-only Postgres, GitHub, or search APIs don't benefit much from pattern 4 — there's nothing to cap in dollars. The cases that need governance are the ones where a stuck loop causes a real-world effect: charging, refunding, sending SMS, sending email, creating Shopify orders, issuing payouts. That's our scope; it's also why MCP servers wrapping those APIs are where you want a proxy in front.
Further reading
- Stripe Restricted Key example — the tightly-scoped upstream key pattern 1 should really be using.
- Stripe API key with restricted access — coverage matrix: what Restricted Keys do and don't give you for an AI agent.
- LiteLLM alternative for Stripe? — why LLM proxies don't solve the SaaS-tool key-governance problem, and what does.
- How to give an AI agent a Stripe API key — the five-control checklist, with code for both SDK-wrapper and reverse-proxy patterns.