AI agent API key scope: the right way to issue credentials to autonomous systems
When someone says "scope your API keys," they usually mean: use a read-only key, or a key restricted to the endpoints you actually need. That's good advice for human operators. For AI agents, it's necessary but not sufficient. Agents introduce three additional risk dimensions that traditional key scoping doesn't address: spending (how much the agent can charge), time (how long the credential is valid), and revocability (how fast you can stop it mid-run). This page covers the four-dimension scoping framework and the vault-key pattern that implements all four without requiring your vendors to change their APIs.
TL;DR
Traditional API key scope = permission scope (which endpoints). Agent API key scope = permission scope + spending cap + time boundary + revocability. The first dimension is what vendors give you natively. The other three require a proxy layer. The vault-key pattern: issue a short-lived, endpoint-allowlisted, spend-capped key to the agent; the proxy enforces the policy and can revoke the key mid-run without touching the real vendor credential.
Why traditional API key scoping was designed for humans
The canonical API key scoping pattern — restricting a key to read-only or to a specific set of endpoints — was designed for a specific threat model: a developer makes a mistake, or a key leaks, and the attacker can only do what the key allows. If the key is read-only, the attacker can read data but can't modify it. If the key is scoped to GET /v1/customers, the attacker can't create charges.
This model assumes a human is on the other end of the key: someone who makes deliberate API calls, pauses between requests, and stops when they've done what they came to do. The risk being mitigated is unauthorized access — a key in the wrong hands.
AI agents break this model. An agent with a scoped "mail send" key is authorized to send email — and it will send as much email as its reasoning produces until something stops it. The risk isn't unauthorized access; it's authorized excess. The agent has the key legitimately. The problem is that it might use it in a way you didn't intend, at a scale you didn't anticipate, for longer than you wanted.
The four dimensions of agent API key scope
Dimension 1: Permission scope (what endpoints)
This is what vendor API key systems give you natively. Stripe's restricted API keys let you limit a key to specific permission groups (e.g., "Charges: Write" but not "Customers: Write"). Twilio lets you create API keys scoped to specific services. SendGrid's restricted keys can be limited to "Mail Send" only.
You should always use this. A key scoped to only the endpoints the agent actually needs limits the blast radius of both mistakes and abuse. If the billing agent only needs to create payment intents, give it a key that can only create payment intents — not one that can also refund, dispute, or modify customer records.
Limitation for agents: Permission scope controls which operations are possible. It doesn't limit how many times the agent performs those operations, or how much it spends doing so.
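At a proxy, permission scope reduces to an allowlist check on the request's method and path. A minimal sketch, assuming allowlist entries in the "METHOD /path" form with a trailing * wildcard (the form the vault key policy example on this page uses) and assuming shell-style matching, where * matches any remaining characters including further slashes. `is_allowed` is a hypothetical helper, not Keybrake's implementation.

```python
from fnmatch import fnmatchcase

# Example allowlist, copied from the policy example on this page.
ALLOWED_ENDPOINTS = [
    "POST /v1/payment_intents",
    "GET /v1/customers/*",
]


def is_allowed(method: str, path: str, allowed=ALLOWED_ENDPOINTS) -> bool:
    """Return True if "METHOD path" matches at least one allowlist pattern."""
    request = f"{method.upper()} {path}"
    # fnmatchcase is case-sensitive; "*" matches any run of characters.
    return any(fnmatchcase(request, pattern) for pattern in allowed)
```

Under these assumptions, a GET on /v1/customers/cus_123 passes and a POST on /v1/refunds is rejected. Because the wildcard spans slashes here, nested paths under /v1/customers/ would pass too; a stricter matcher may be preferable if that is too broad.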
Dimension 2: Spending scope (how much per run)
This is the dimension vendors almost universally don't provide at the API key level. Stripe's restricted keys have permission scope but no dollar cap per key. Twilio's API keys have no per-key message count limit. Resend's API keys have no per-key email volume cap.
For AI agents, spending scope is often more important than permission scope. An agent that's authorized to create Stripe payment intents but has no spending cap can create $100,000 worth of payment intents in a stuck loop — legitimately, from Stripe's perspective. The key had the right permissions. Stripe did what it was asked.
Spending scope requires enforcement at the proxy layer — a system that tracks cumulative spend per key per day (or per run, or per hour), and blocks calls that would exceed the cap before they reach the vendor.
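A minimal sketch of that pre-call check, assuming the proxy knows the dollar amount of each call before forwarding it. `SpendCap` and `try_reserve` are hypothetical names, the counter lives in memory, and spend is bucketed by calendar day to mirror a daily cap. A real proxy would need an atomic, persistent counter shared across processes.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass
class SpendCap:
    """Tracks cumulative spend for one key and blocks calls that would exceed the cap."""

    daily_usd_cap: Decimal
    _spent: dict = field(default_factory=dict)  # day -> USD already reserved that day

    def try_reserve(self, amount_usd: Decimal, today: date) -> bool:
        """Reserve amount_usd against today's budget; return False if it would exceed the cap."""
        spent_today = self._spent.get(today, Decimal("0"))
        if spent_today + amount_usd > self.daily_usd_cap:
            return False  # blocked before the call reaches the vendor
        self._spent[today] = spent_today + amount_usd
        return True
```

The check runs before the vendor call, so a rejected call costs nothing. With a cap of 500, a 300 call followed by a 250 call on the same day lets the first through and blocks the second.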
Dimension 3: Time scope (how long the credential is valid)
Traditional long-lived API keys (Stripe's 30+ character secrets, Twilio's auth tokens) don't expire. They're designed to be stored in environment variables and used indefinitely. This is fine for stable infrastructure; for agent runs, it's a mismatch.
An agent run has a natural time boundary: the duration of the conversation, workflow, or task. A key issued for that run should expire when the run ends. If the run lasts 2 hours, the key should expire after 2 hours — not after 90 days.
Short-lived keys reduce the blast radius of key leakage. If an agent's key is captured in a log, a prompt injection attack, or a debugging output, the window of exposure is the key's TTL, not indefinite. They also naturally enforce run isolation: a key issued for run A can't be reused for run B.
Time scope is something you can implement yourself (generate a random token, store it with an expiry, check on each use) — but doing it correctly, securely, and at scale for every agent run across multiple vendors is what the vault-key proxy is for.
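The do-it-yourself version described above can be sketched in a few lines. This assumes an in-memory store and a hypothetical `ExpiringKeyStore` class. The `vault_key_` prefix only mirrors the token shape shown on this page. A production store would persist entries and would likely keep a hash of the token rather than the token itself.

```python
import secrets
import time


class ExpiringKeyStore:
    """In-memory store of random tokens, each with an absolute expiry time."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._expires_at = {}    # token -> expiry timestamp in seconds

    def issue(self, ttl_seconds: float) -> str:
        """Generate a random token that is valid for ttl_seconds from now."""
        token = "vault_key_" + secrets.token_urlsafe(32)
        self._expires_at[token] = self._clock() + ttl_seconds
        return token

    def is_valid(self, token: str) -> bool:
        """Check on each use: unknown and expired tokens are both rejected."""
        expiry = self._expires_at.get(token)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expires_at[token]  # drop expired entries lazily
            return False
        return True
```

For a 2-hour run, `issue(2 * 60 * 60)` returns a token that stops validating once the two hours have passed, whether or not the agent is still holding it.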
Dimension 4: Revocability (how fast can you stop it)
Rotating a vendor API key takes time and disrupts every service that uses that key. The Stripe key rotation workflow: generate a new key → deploy it to all services → confirm the old key is no longer used → delete the old key. Done carefully, this takes at least 30–60 minutes. Under incident pressure (a runaway agent at 3 a.m.), it takes longer.
For agents, you need per-run revocability: the ability to stop a specific agent run's access to a specific vendor without touching any other agent or service. The revoke operation needs to be fast — sub-second ideally, certainly under 5 seconds — because an agent in a loop can make many calls in the time it takes a human to navigate a dashboard and click a button.
The vault-key pattern achieves this: the vault key is a token that the proxy checks on every call. Revoking it is a database write (key status → revoked) that takes effect on the next call. No vendor key rotation required.
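A minimal sketch of that per-call status check, assuming an in-memory dictionary in place of the proxy's database. `VaultKeyRegistry` and `proxy_call` are hypothetical names, and `forward` stands for whatever function actually sends the request to the vendor.

```python
class VaultKeyRegistry:
    """Hypothetical in-memory stand-in for the proxy's key-status table."""

    def __init__(self):
        self._status = {}  # vault key -> "active" or "revoked"

    def register(self, vault_key: str) -> None:
        self._status[vault_key] = "active"

    def revoke(self, vault_key: str) -> None:
        # A single status write; no vendor credential is touched.
        if vault_key in self._status:
            self._status[vault_key] = "revoked"

    def is_active(self, vault_key: str) -> bool:
        return self._status.get(vault_key) == "active"


def proxy_call(registry: VaultKeyRegistry, vault_key: str, forward):
    """Check key status on every call and forward only while the key is active."""
    if not registry.is_active(vault_key):
        return 401  # rejected at the proxy; the vendor never sees the call
    return forward()
```

Because the status is read on every call, revoking one run's key takes effect on that run's next request and leaves every other registered key untouched.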
The four-dimension scoping matrix
| Dimension | Vendor native keys | Vault key proxy | What it prevents |
|---|---|---|---|
| Permission scope | Yes — most vendors support restricted keys with endpoint/permission filtering | Yes — endpoint allowlist in vault key policy | Agent calling endpoints it shouldn't |
| Spending scope | No — no vendor provides per-key dollar caps at the API layer | Yes — daily_usd_cap enforced pre-call | Agent exceeding per-run budget, stuck loop charges |
| Time scope | No — vendor keys are long-lived by default; Stripe's short-lived keys require separate OAuth flow | Yes — expires_in field, TTL enforced at proxy | Leaked key used after run ends, cross-run key reuse |
| Revocability | No — revoking a vendor key affects all users of that key; takes minutes to hours to rotate safely | Yes — single API call, sub-second effect, no blast radius to other agents | Inability to stop a specific agent mid-run without breaking other services |
Implementing the four dimensions with a vault key
A vault key policy encodes all four dimensions in a single JSON object. The // comments below are annotations only; standard JSON has no comment syntax, so remove them from a real request body:

```http
POST https://proxy.keybrake.com/vault/keys
Authorization: Bearer keybrake_api_key_xxx
Content-Type: application/json

{
  "vendor": "stripe",
  // Dimension 1: Permission scope
  "allowed_endpoints": [
    "POST /v1/payment_intents",
    "GET /v1/customers/*"
  ],
  // Dimension 2: Spending scope
  "daily_usd_cap": 500,
  // Dimension 3: Time scope
  "expires_in": "4h",
  // Dimension 4: Revocability (label for the revoke call)
  "agent_run_label": "billing-agent/run-8f3a2c"
}
```
The response contains the vault key — a random token that looks like vault_key_a1b2c3.... This is what the agent gets. The real Stripe key stays on the Keybrake side.
To revoke mid-run:
```http
DELETE https://proxy.keybrake.com/vault/keys/vault_key_a1b2c3
Authorization: Bearer keybrake_api_key_xxx
```
The next call the agent makes through the proxy with that vault key gets a 401. The real Stripe key is unchanged. Every other agent and service continues working normally.
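The two requests above can be built from Python with the standard library alone. This sketch only constructs the requests and does not send them, because the response format is not specified here. `build_issue_request` and `build_revoke_request` are hypothetical helper names; the URL, headers, and policy fields are the ones shown above.

```python
import json
import urllib.parse
import urllib.request

VAULT_KEYS_URL = "https://proxy.keybrake.com/vault/keys"


def build_issue_request(api_key: str, policy: dict) -> urllib.request.Request:
    """Build the POST that asks the proxy for a vault key under the given policy."""
    return urllib.request.Request(
        VAULT_KEYS_URL,
        data=json.dumps(policy).encode("utf-8"),  # plain JSON, no comments
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_revoke_request(api_key: str, vault_key: str) -> urllib.request.Request:
    """Build the DELETE that revokes one vault key."""
    return urllib.request.Request(
        f"{VAULT_KEYS_URL}/{urllib.parse.quote(vault_key, safe='')}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )
```

Passing either request to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the policy easy to inspect and test before anything reaches the network.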
When to issue vault keys
The right issuance pattern depends on your agent architecture:
- Per-conversation (chatbots): Issue at conversation start, expire at conversation end or after a fixed TTL (1h, 2h). Each conversation gets its own cap and its own audit trail segment.
- Per-workflow-run (Temporal, Prefect, Airflow): Issue in the first activity of the workflow run, pass as a parameter to subsequent activities. The vault key's cap covers the entire workflow run.
- Per-agent-session (long-running agents): Issue at session start, expire at session end. If the session runs longer than expected, the time-scope kicks in and the key expires — a natural forcing function to issue a fresh key for the next session.
- Per-tool-call (highest isolation): Issue a vault key, make one vendor call, let the key expire. Maximum isolation, maximum overhead. Only appropriate for high-value, high-risk calls where the per-call overhead is acceptable.
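Whichever pattern you pick, the key's lifetime should be tied to the unit of work. One way to sketch that is a context manager that issues a key on entry and revokes it on exit, even if the run raises. `issue` and `revoke` are hypothetical callables wrapping whatever client you use to talk to the proxy.

```python
from contextlib import contextmanager


@contextmanager
def vault_key_for_run(issue, revoke, policy: dict):
    """Issue a vault key when the run starts and revoke it when the run ends."""
    vault_key = issue(policy)
    try:
        yield vault_key
    finally:
        # Runs on normal exit and on exceptions, so a crashed run
        # does not leave a live key behind.
        revoke(vault_key)
```

Explicit revocation complements the TTL: the key dies as soon as the run ends, and the expiry remains as a backstop if the process is killed before the cleanup runs.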
How Keybrake fits
Keybrake is the vault-key proxy for Stripe, Twilio, and Resend — the three vendors whose costs are most directly exposed to agent behavior. You issue vault keys via the Keybrake API, point your agent at the proxy endpoint, and the proxy enforces all four scoping dimensions on every call. The dashboard shows per-key spend, per-call audit log, and one-click revoke. Vault key policy is per-key JSON — no infrastructure to run, no database to manage.
Related questions
Don't vendor "restricted keys" already solve the scoping problem?
They solve dimension 1 (permission scope) — and you should use them. But the other three dimensions (spending, time, revocability) are not covered by any major vendor's native key system today. Stripe's restricted keys are the most advanced: you can restrict them to specific permission groups. But there's no per-key dollar cap, no TTL on the key itself, and no per-key revoke that doesn't affect all users of that key. The vault-key proxy adds the three missing dimensions on top of the vendor's native permission scoping.
Can I implement the four dimensions myself without a proxy?
Yes — each dimension is implementable independently: (1) use vendor restricted keys for permission scope, (2) add a pre-call spend-tracking middleware to your agent that queries a database and aborts if over cap, (3) rotate keys on a schedule using a secret manager, (4) add a kill-switch flag in your database that your middleware checks before each call. The proxy pattern packages all four into one place so you don't maintain separate systems for each dimension, and moves the enforcement outside the agent process so a crashing or misbehaving agent can't bypass it.
What's the right daily_usd_cap for a billing agent?
The right cap is 1.5–2× the maximum expected legitimate spend for one agent run. If your billing agent processes at most 100 invoices averaging $50 each, the expected maximum is $5,000. Set the cap at $7,500–$10,000. This allows for normal variance without blocking legitimate runs, while capping the damage from a bug that causes double-processing. The cap is per day, so a run that starts near midnight and crosses it gets two days' worth of cap; account for this if your agent runs span midnight.
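The same arithmetic as a small helper, in case you want to derive caps from workload figures rather than hard-code them. `cap_range` is a hypothetical name, and the 1.5 and 2 multipliers are the rule of thumb from the answer above.

```python
from decimal import Decimal


def cap_range(max_items: int, avg_amount_usd: Decimal) -> tuple[Decimal, Decimal]:
    """Return the (low, high) cap in USD: 1.5x and 2x the expected maximum spend."""
    expected_max = max_items * avg_amount_usd
    return expected_max * Decimal("1.5"), expected_max * Decimal("2")


# 100 invoices averaging $50 each: expected maximum of $5,000.
low, high = cap_range(100, Decimal("50"))
# low == 7500 and high == 10000
```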
Further reading
- AI agent kill switch patterns — the four revocation mechanisms and their real stop latencies; which one is fast enough for each scenario.
- AI agent API key best practices — the full checklist: naming conventions, rotation schedules, and storage patterns for agent credentials.
- AI agent Stripe spend cap — the dimension 2 deep dive: why pre-charge enforcement is different from post-charge alerts and how the proxy-layer cap works.
- AI agent governance tools — the broader governance stack beyond key scoping: policy frameworks, audit trails, and monitoring for autonomous agent deployments.