AI agent multi-tenant isolation: per-tenant API key scoping and spend caps
When you build a SaaS product that gives each customer their own AI agent (a billing agent that charges their customers, an outreach agent that sends their emails, a fulfillment agent that triggers their Shopify orders), you face an architectural choice: give each tenant their own Stripe and Twilio keys, or route every vendor call through your platform's shared credentials. Shared platform keys are simpler to build but create hard isolation problems: one tenant's stuck billing agent can exhaust Stripe's rate limits and slow every other tenant's requests, a runaway outreach agent can burn your Resend send limit for the month, and revoking a shared key to stop a bad actor breaks every tenant at once. Separate vendor keys per tenant give the cleanest isolation, but they require storing, rotating, and auditing dozens or hundreds of vendor secrets. Per-tenant vault keys provide that isolation without the secret-management burden: one real Stripe key at your platform level, N vault keys at the tenant level, each with its own cap and its own revoke lifecycle.
TL;DR
Issue one vault key per tenant per agent run. Each vault key has its own dollar cap (reflecting that tenant's plan limits), its own endpoint allowlist (reflecting what actions that tenant's agent is permitted to take), and its own revoke lifecycle (revoking one tenant's key doesn't affect any other tenant). The audit log for each vendor call includes the tenant ID as part of the agent_run_label, so you can partition call history, spend reports, and incident reviews by tenant without cross-contamination. One real Stripe key at the platform level, isolated exposure at the tenant level.
The multi-tenant API key problem
Consider a B2B SaaS that offers a "billing agent" feature. Each customer (tenant) has their own subscribers, and the billing agent charges those subscribers on a schedule. The platform has a single Stripe account with a single secret key:
# Naive multi-tenant billing agent
import os

import stripe

STRIPE_KEY = os.environ["STRIPE_SECRET_KEY"]  # shared platform key

def run_billing_agent(tenant_id: str, plan_id: str):
    stripe.api_key = STRIPE_KEY  # Same key for every tenant
    # `db` is the application's own database handle
    customers = db.query(
        "SELECT * FROM customers WHERE tenant_id = ? AND plan_id = ?",
        tenant_id, plan_id,
    )
    for customer in customers:
        stripe.PaymentIntent.create(
            amount=customer["amount_cents"],
            currency="usd",
            customer=customer["stripe_customer_id"],
        )
This works until it doesn't. The failure modes are specific and expensive:
- Tenant A's agent runs away. A bug in Tenant A's data pipeline returns 10,000 customers instead of 100, and the billing agent charges all 10,000. You want to stop it without affecting Tenants B, C, and D, who are running their billing agents at the same time. There's no way to revoke Tenant A's access specifically, because the key is shared: rotating the Stripe key stops Tenant A but also breaks B, C, and D.
- Rate limit bleed across tenants. Stripe has no notion of your tenants: its rate limits apply to the shared credentials making the calls. Tenant A's 10,000-customer run hits Stripe's rate limit. Tenants B and C, making legitimate charges with the same key, start receiving 429 responses. Their billing runs fail through no fault of their own.
- Audit log cross-contamination. When Tenant B asks for a spend report, you have to filter Stripe's API logs by customer IDs that belong to Tenant B. There's no shared identifier that cleanly maps Stripe calls to tenant at the API key level. Incidents require manual cross-referencing.
- Breach blast radius. If a single tenant's agent payload is compromised (e.g., prompt injection that exfiltrates environment variables), the attacker gets the shared Stripe key with access to all tenants' data.
Three isolation requirements for multi-tenant agents
| Requirement | What it means in practice | Why it's hard with shared keys |
|---|---|---|
| Spend isolation | Tenant A's agent can spend at most $X per run (where X reflects A's plan tier). Tenant A hitting $X doesn't affect Tenant B's ability to spend up to their own $Y limit. | Stripe's rate limits and your own Stripe spending limits apply to the key, not the tenant. One tenant's spend behavior affects all others sharing the key. |
| Revoke isolation | Revoking Tenant A's agent access (because their run is stuck, their account is suspended, or there's a security incident) doesn't affect any other tenant's running agents. | Rotating the shared Stripe key is the only way to revoke access — but this affects every agent on every tenant simultaneously, causing operational chaos during an incident. |
| Audit isolation | Tenant A's agent's full call history is queryable without any Tenant B records appearing. Incidents, compliance reviews, and customer-facing spend reports can be run per tenant without cross-contamination. | Stripe's API logs are keyed by payment intent, customer, and timestamp — not by your tenant ID. Cross-referencing requires matching Stripe customer IDs to your tenant table, which is brittle and slow at scale. |
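The three requirements in the table can be made concrete with a toy in-memory model. This is a minimal sketch for illustration, not Keybrake's implementation: `ToyVault`, `VaultKey`, and their methods are hypothetical names, the result strings are made up, and a real proxy would persist state and forward calls to the vendor.

```python
from dataclasses import dataclass, field


@dataclass
class VaultKey:
    tenant_id: str
    cap_usd: float
    spent_usd: float = 0.0
    revoked: bool = False


@dataclass
class ToyVault:
    """Hypothetical stand-in for a proxy that fronts one real vendor key."""
    keys: dict = field(default_factory=dict)
    audit: list = field(default_factory=list)

    def issue(self, key_id: str, tenant_id: str, cap_usd: float) -> None:
        self.keys[key_id] = VaultKey(tenant_id, cap_usd)

    def revoke(self, key_id: str) -> None:
        # Revoke isolation: only this one key is affected.
        self.keys[key_id].revoked = True

    def charge(self, key_id: str, amount_usd: float) -> str:
        key = self.keys[key_id]
        if key.revoked:
            return "revoked"
        # Spend isolation: the cap is checked against this key's own spend.
        if key.spent_usd + amount_usd > key.cap_usd:
            return "cap_exhausted"
        key.spent_usd += amount_usd
        # Audit isolation: every record carries the tenant ID.
        self.audit.append({"tenant_id": key.tenant_id, "amount_usd": amount_usd})
        return "ok"

    def history(self, tenant_id: str) -> list:
        return [row for row in self.audit if row["tenant_id"] == tenant_id]


vault = ToyVault()
vault.issue("key-a", tenant_id="tenant-a", cap_usd=500)
vault.issue("key-b", tenant_id="tenant-b", cap_usd=5000)

results = [
    vault.charge("key-a", 400),  # within A's cap
    vault.charge("key-a", 200),  # would exceed A's cap
    vault.charge("key-b", 200),  # B is unaffected by A's cap
]
vault.revoke("key-a")
results.append(vault.charge("key-a", 10))  # A is revoked
results.append(vault.charge("key-b", 10))  # B still works
```

In this model Tenant A hitting its cap or being revoked never changes the outcome of Tenant B's calls, and `vault.history("tenant-a")` contains only Tenant A's records.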
Per-tenant vault key architecture
The vault key pattern scales to multi-tenancy naturally. Issue one vault key per tenant per agent run, using the tenant's plan tier to set the cap:
import requests
import stripe

# KEYBRAKE_API_KEY, db, and notify_operations come from the surrounding application.

def run_billing_agent(tenant_id: str, plan_id: str, tenant_plan: str):
    # Cap reflects tenant's plan tier; unknown tiers fall back to the lowest cap
    plan_caps = {"starter": 500, "growth": 5000, "enterprise": 50000}
    cap = plan_caps.get(tenant_plan, 500)

    # Issue vault key scoped to this tenant's run
    resp = requests.post(
        "https://proxy.keybrake.com/vault/keys",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
        json={
            "vendor": "stripe",
            "daily_usd_cap": cap,
            "allowed_endpoints": ["POST /v1/payment_intents", "POST /v1/refunds"],
            "expires_in": "2h",
            "agent_run_label": f"billing-agent/tenant:{tenant_id}/plan:{plan_id}",
            "metadata": {"tenant_id": tenant_id, "plan_tier": tenant_plan},
        },
        timeout=10,
    )
    resp.raise_for_status()  # don't start charging if key issuance failed
    key = resp.json()
    vault_key = key["vault_key"]
    run_id = key["key_id"]

    # Route Stripe calls through the proxy. The vault key is passed per request
    # below instead of being set globally, so concurrent runs in the same
    # process can't overwrite each other's key.
    stripe.api_base = "https://proxy.keybrake.com/stripe"

    customers = db.query(
        "SELECT * FROM customers WHERE tenant_id = ? AND plan_id = ?",
        tenant_id, plan_id,
    )
    for customer in customers:
        try:
            stripe.PaymentIntent.create(
                api_key=vault_key,
                amount=customer["amount_cents"],
                currency="usd",
                customer=customer["stripe_customer_id"],
                idempotency_key=f"{run_id}-{customer['id']}",
            )
        except Exception as e:
            if "cap_exhausted" in str(e):
                notify_operations(f"Tenant {tenant_id} billing agent hit spend cap: {cap} USD")
                break
            raise
Each run_billing_agent invocation issues a new vault key scoped to that specific tenant and run. The cap reflects the tenant's plan tier. The agent_run_label embeds the tenant ID so every Stripe call is attributable to the correct tenant in the Keybrake audit log without any post-hoc cross-referencing. The metadata field stores the tenant ID and plan tier for structured queries.
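Because later queries match on the label, it can help to build and parse it in one place. The following is a minimal sketch, assuming the `billing-agent/tenant:{tenant_id}/plan:{plan_id}` format used above; `build_run_label` and `parse_tenant_id` are hypothetical helpers, not part of any Keybrake client.

```python
from typing import Optional


def build_run_label(agent: str, tenant_id: str, plan_id: str) -> str:
    """Build a label like 'billing-agent/tenant:t_42/plan:monthly'."""
    for name, value in (("agent", agent), ("tenant_id", tenant_id), ("plan_id", plan_id)):
        # Reject the delimiters so one ID can never spill into another segment.
        if not value or "/" in value or ":" in value:
            raise ValueError(f"{name} must be non-empty and contain no '/' or ':'")
    return f"{agent}/tenant:{tenant_id}/plan:{plan_id}"


def parse_tenant_id(label: str) -> Optional[str]:
    """Return the tenant ID from a label, or None if it has no tenant segment."""
    for segment in label.split("/"):
        if segment.startswith("tenant:"):
            return segment[len("tenant:"):]
    return None


label = build_run_label("billing-agent", "t_42", "monthly")
tenant = parse_tenant_id(label)
```

Rejecting `/` and `:` inside IDs keeps every label unambiguous, which matters once labels are the key used to partition audit records.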
Revoking one tenant without affecting others
import requests

# When Tenant A's billing agent goes rogue:
# 1. Look up the active vault keys for Tenant A (one per run in progress)
resp = requests.get(
    "https://proxy.keybrake.com/vault/keys",
    headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
    params={"metadata.tenant_id": tenant_a_id, "status": "active"},
)
resp.raise_for_status()
# 2. Revoke just Tenant A's keys. Tenants B, C, D are unaffected.
for key in resp.json()["keys"]:
    requests.delete(
        f"https://proxy.keybrake.com/vault/keys/{key['key_id']}",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
    ).raise_for_status()
# Tenant A's next Stripe call gets 401. Tenants B, C, D continue normally.
The revoke operation takes milliseconds and propagates to the proxy immediately. Tenant A's billing agent receives a 401 on its next Stripe call and fails cleanly. Tenants B and C, whose agents are running simultaneously with their own vault keys, are completely unaffected. No Stripe key rotation, no worker restart, no operational blast radius.
Audit log partitioning by tenant
With the agent_run_label including tenant:{tenant_id}, the Keybrake audit log gives you tenant-partitioned call history out of the box. For a compliance review of Tenant A's activity:
# Query Keybrake audit log for Tenant A only
resp = requests.get(
    "https://proxy.keybrake.com/audit",
    headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
    params={
        # The trailing "/" closes the tenant segment of the label,
        # so tenant "12" cannot also match tenant "123".
        "label_contains": f"tenant:{tenant_a_id}/",
        "vendor": "stripe",
        "from": "2026-06-01T00:00:00Z",
        "to": "2026-06-02T00:00:00Z",
    },
)
# Returns: [{timestamp, vault_key_id, endpoint, amount_usd, stripe_request_id, agent_run_label}]
# Tenant B's calls are never in this result set; they have a different label.
Every Stripe call made by Tenant A's agents is in the result set. Every call made by Tenant B, C, or D is excluded — not by filtering your own database, but because the vault keys were issued with tenant-specific labels and the audit log is already partitioned by label. Spend reports, incident timelines, and compliance reviews are per-tenant queries, not full-table scans with application-level filtering.
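Once the rows are back, a per-tenant spend report is a small aggregation. The sketch below assumes the row shape from the comment above (`endpoint`, `amount_usd`, `agent_run_label`); the exact response schema, the `spend_by_endpoint` helper, and the sample rows are assumptions for illustration. It re-checks the tenant segment on the client as a defensive measure.

```python
from collections import defaultdict


def spend_by_endpoint(rows: list, tenant_id: str) -> dict:
    """Sum amount_usd per endpoint for rows whose label names this tenant."""
    wanted = f"tenant:{tenant_id}"
    totals = defaultdict(float)
    for row in rows:
        # Exact segment match, so tenant "12" never picks up tenant "123".
        if wanted in row["agent_run_label"].split("/"):
            totals[row["endpoint"]] += row["amount_usd"]
    return dict(totals)


# Made-up sample rows in the assumed audit-log shape.
sample_rows = [
    {"endpoint": "POST /v1/payment_intents", "amount_usd": 20.0,
     "agent_run_label": "billing-agent/tenant:12/plan:monthly"},
    {"endpoint": "POST /v1/payment_intents", "amount_usd": 30.0,
     "agent_run_label": "billing-agent/tenant:12/plan:monthly"},
    {"endpoint": "POST /v1/refunds", "amount_usd": 5.0,
     "agent_run_label": "billing-agent/tenant:12/plan:monthly"},
    {"endpoint": "POST /v1/payment_intents", "amount_usd": 99.0,
     "agent_run_label": "billing-agent/tenant:123/plan:monthly"},
]
report = spend_by_endpoint(sample_rows, "12")
```

The row for tenant `123` is excluded even though its label contains the substring `tenant:12`, which is the case a plain substring filter would get wrong.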
How Keybrake fits
Keybrake is the multi-tenant isolation layer between your platform's single Stripe account and N tenants' independent agent runs. One Keybrake admin key at your platform level. N vault keys at the tenant level — one per run, one per tenant, one per allowed action set. The spend cap on each vault key enforces your plan tier limits at the proxy level, not just in application code that can be bypassed. The audit log is already partitioned by tenant, so compliance, incident response, and customer-facing spend reports are first-class operations. Revoking one tenant's agent access is a single API call that affects only that tenant.
Related questions
Should I issue one vault key per tenant per session, or one per tenant total?
One per run is the right granularity. A "per-tenant total" key that lasts for the account lifetime means the cap reflects cumulative spend across all runs, not per-run spend — a tenant who makes 10 billing runs per month needs a cap 10× the single-run cap to avoid false positives. A per-run key has a cap that matches the expected spend of one billing run, expires when the run is done (limiting exposure), and gives you per-run audit resolution. Issue the key at the start of the agent run and let it expire after the run completes (set TTL to 2× expected run duration as a safety margin).
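The sizing advice can be written down as simple arithmetic. This is a minimal sketch: `per_run_key_params` and `lifetime_key_cap` are hypothetical helpers, the 2× TTL factor comes from the answer above, and the cap headroom factor is an assumption you would tune for your own workload.

```python
import math


def per_run_key_params(expected_spend_usd: float,
                       expected_minutes: float,
                       cap_headroom: float = 1.25) -> dict:
    """Size a per-run vault key from what one run is expected to do."""
    return {
        # The cap tracks a single run, so it stays tight however many runs a month has.
        "cap_usd": math.ceil(expected_spend_usd * cap_headroom),
        # TTL is 2x the expected duration, rounded up to whole minutes.
        "ttl_minutes": math.ceil(expected_minutes * 2),
    }


def lifetime_key_cap(per_run_cap_usd: float, runs_per_month: int) -> float:
    """Cap a long-lived key would need to avoid false positives over a month."""
    return per_run_cap_usd * runs_per_month


params = per_run_key_params(expected_spend_usd=400, expected_minutes=45)
monthly = lifetime_key_cap(params["cap_usd"], runs_per_month=10)
```

With these example numbers a per-run key is capped at 500 USD, while a single long-lived key covering ten runs would need a 5,000 USD cap, ten times the exposure if one run misbehaves.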
How do I handle tenants who use their own Stripe accounts, not my platform's?
If tenants provide their own Stripe keys (common in "bring your own credentials" architectures), the multi-tenant isolation is at the Stripe account level — each tenant's key can only charge their own Stripe customers. In this case, the vault key pattern still adds value: store the tenant's Stripe key in Keybrake as a vendor credential, issue vault keys that reference the tenant's credential, and add per-run caps on top of their Stripe account. The audit log is still per-tenant (each vault key is for one tenant's credential), and you can still revoke access without the tenant rotating their own Stripe key.
Can vault keys restrict which Stripe customers an agent can charge?
The vault key's allowed_endpoints field restricts API endpoints (e.g., only POST /v1/payment_intents, not POST /v1/customers or DELETE /v1/customers/{id}). It doesn't restrict which Stripe customer IDs are valid — that's enforced at the application level by querying your tenant's customer table before making the call. For stronger isolation, combine endpoint allowlists with application-level pre-flight checks: before calling Stripe, verify that customer.tenant_id == current_tenant_id in your database. This prevents prompt injection attacks that might try to charge a different tenant's customer.
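A pre-flight ownership check can be as small as the sketch below. It assumes customer rows are dicts with `id` and `tenant_id` fields; `assert_owned_by`, `chargeable`, and `CrossTenantChargeError` are hypothetical names introduced for illustration.

```python
class CrossTenantChargeError(Exception):
    """Raised when an agent tries to charge a customer owned by another tenant."""


def assert_owned_by(customer: dict, current_tenant_id: str) -> None:
    """Refuse to proceed unless the customer row belongs to the current tenant."""
    if customer.get("tenant_id") != current_tenant_id:
        raise CrossTenantChargeError(
            f"customer {customer.get('id')} does not belong to tenant {current_tenant_id}"
        )


def chargeable(customers: list, current_tenant_id: str) -> list:
    """Validate every row up front so a bad row stops the run before any charge."""
    for customer in customers:
        assert_owned_by(customer, current_tenant_id)
    return customers
```

Calling `chargeable(customers, tenant_id)` before the charge loop means a single foreign customer ID aborts the run before the first vendor call, rather than partway through.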
Further reading
- AI agent API key scope — the general framework for scoping any vendor API key for agent use: endpoint allowlists, merchant allowlists, and time-bound expiry.
- AI agent audit trail — how to structure the audit log that captures every vendor call with tenant context, amount, and timestamp for compliance and incident response.
- AI agent API key rotation — when you rotate your platform Stripe key (for example, after a security event), the per-tenant vault key architecture means you only need to update one credential in Keybrake; tenant agent sessions using vault keys continue uninterrupted.
- AI agent governance platform — the broader governance stack for multi-tenant agent deployments: per-tenant policy enforcement, cross-tenant spend visibility, and kill-switch operations.