
AI agent multi-tenant isolation: per-tenant API key scoping and spend caps

When you build a SaaS product that gives each customer their own AI agent — a billing agent that charges their customers, an outreach agent that sends their emails, a fulfillment agent that triggers their Shopify orders — you face an architectural choice: give each tenant their own Stripe and Twilio keys, or use your platform keys and route all vendor calls through shared credentials. Using your platform keys is simpler to build but creates hard isolation problems: one tenant's stuck billing agent can exhaust Stripe's rate limits and slow every other tenant's requests, a runaway outreach agent burns your Resend send limit for the month, and revoking a shared key to stop a bad actor breaks every tenant simultaneously. Giving each tenant separate Stripe keys is the correct architecture but requires storing, rotating, and auditing dozens or hundreds of vendor secrets. Per-tenant vault keys solve this: one real Stripe key at your platform level, N vault keys at the tenant level, each with its own cap and its own revoke lifecycle.

TL;DR

Issue one vault key per tenant per agent run. Each vault key has its own dollar cap (reflecting that tenant's plan limits), its own endpoint allowlist (reflecting what actions that tenant's agent is permitted to take), and its own revoke lifecycle (revoking one tenant's key doesn't affect any other tenant). The audit log for each vendor call includes the tenant ID as part of the agent_run_label, so you can partition call history, spend reports, and incident reviews by tenant without cross-contamination. One real Stripe key at the platform level, isolated exposure at the tenant level.

The multi-tenant API key problem

Consider a B2B SaaS that offers a "billing agent" feature. Each customer (tenant) has their own subscribers, and the billing agent charges those subscribers on a schedule. The platform has a single Stripe account with a single secret key:

# Naive multi-tenant billing agent
import os

import stripe

STRIPE_KEY = os.environ["STRIPE_SECRET_KEY"]  # shared platform key

def run_billing_agent(tenant_id: str, plan_id: str):
    stripe.api_key = STRIPE_KEY  # Same key for every tenant
    # `db` is the application's own database handle (not shown)
    customers = db.query("SELECT * FROM customers WHERE tenant_id = ? AND plan_id = ?", tenant_id, plan_id)

    for customer in customers:
        # One charge per customer, with no cap and no per-tenant attribution
        stripe.PaymentIntent.create(
            amount=customer["amount_cents"],
            currency="usd",
            customer=customer["stripe_customer_id"],
        )

This works until it doesn't. The failure modes are specific and expensive:

Tenant A's agent runs away. Tenant A's data pipeline has a bug that returns 10,000 customers instead of 100, and the billing agent charges all 10,000. You want to stop it without affecting Tenants B, C, and D, which are running their billing agents at the same time. Because the key is shared, there is no way to revoke Tenant A's access specifically: rotating the Stripe key stops Tenant A but also breaks B, C, and D.
Rate limit bleed across tenants. Stripe rate-limits the platform's shared credentials as a whole and has no notion of your tenants. Tenant A's 10,000-customer run hits the limit, and Tenants B and C, making legitimate charges with the same key, start receiving 429 responses. Their billing runs fail through no fault of their own.
Audit log cross-contamination. When Tenant B asks for a spend report, you have to filter Stripe's API logs by the customer IDs that belong to Tenant B. Nothing at the API key level maps a Stripe call to a tenant, so incidents require manual cross-referencing.
Breach blast radius. If a single tenant's agent is compromised (e.g., by a prompt injection that exfiltrates environment variables), the attacker gets the shared Stripe key and, with it, access to every tenant's data.

Three isolation requirements for multi-tenant agents

Requirement	What it means in practice	Why it's hard with shared keys
Spend isolation	Tenant A's agent can spend at most $X per run (where X reflects A's plan tier). Tenant A hitting $X doesn't affect Tenant B's ability to spend up to their own $Y limit.	Stripe's rate limits and your own Stripe spending limits apply to the key, not the tenant. One tenant's spend behavior affects all others sharing the key.
Revoke isolation	Revoking Tenant A's agent access (because their run is stuck, their account is suspended, or there's a security incident) doesn't affect any other tenant's running agents.	Rotating the shared Stripe key is the only way to revoke access — but this affects every agent on every tenant simultaneously, causing operational chaos during an incident.
Audit isolation	Tenant A's agent's full call history is queryable without any Tenant B records appearing. Incidents, compliance reviews, and customer-facing spend reports can be run per tenant without cross-contamination.	Stripe's API logs are keyed by payment intent, customer, and timestamp — not by your tenant ID. Cross-referencing requires matching Stripe customer IDs to your tenant table, which is brittle and slow at scale.
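The three requirements can be read as invariants on per-key state. The toy model below is an in-memory sketch, not Keybrake's implementation: `VaultKey`, `Ledger`, and the returned status strings are hypothetical names chosen for illustration. It shows that when the cap, the revocation flag, and the call history live on the key rather than on the shared credential, one tenant's state cannot affect another's.

```python
from dataclasses import dataclass, field


@dataclass
class VaultKey:
    """Per-tenant key state (hypothetical model for illustration)."""
    tenant_id: str
    cap_cents: int
    spent_cents: int = 0
    revoked: bool = False
    calls: list = field(default_factory=list)  # this key's own audit trail


class Ledger:
    def __init__(self):
        self.keys = {}

    def issue(self, key_id, tenant_id, cap_cents):
        self.keys[key_id] = VaultKey(tenant_id, cap_cents)

    def revoke(self, key_id):
        # Revoke isolation: only this key's flag changes.
        self.keys[key_id].revoked = True

    def authorize(self, key_id, amount_cents):
        key = self.keys[key_id]
        if key.revoked:
            return "revoked"
        # Spend isolation: the check reads only this key's running total.
        if key.spent_cents + amount_cents > key.cap_cents:
            return "cap_exhausted"
        key.spent_cents += amount_cents
        # Audit isolation: the record is stored under this key alone.
        key.calls.append(amount_cents)
        return "ok"


ledger = Ledger()
ledger.issue("key_a", "tenant_a", cap_cents=50_000)
ledger.issue("key_b", "tenant_b", cap_cents=500_000)

results = [
    ledger.authorize("key_a", 40_000),  # within A's cap
    ledger.authorize("key_a", 20_000),  # would exceed A's cap
    ledger.authorize("key_b", 20_000),  # B is unaffected by A's cap
]
ledger.revoke("key_a")
results.append(ledger.authorize("key_a", 100))  # A is now revoked
results.append(ledger.authorize("key_b", 100))  # B keeps working
```

A real proxy would persist this state and handle concurrent requests atomically; the sketch only illustrates where each piece of state has to live.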

Per-tenant vault key architecture

The vault key pattern scales to multi-tenancy naturally. Issue one vault key per tenant per agent run, using the tenant's plan tier to set the cap:

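import os

import requests
import stripe

# Assumes the platform-level admin key is provided via the environment
KEYBRAKE_API_KEY = os.environ["KEYBRAKE_API_KEY"]

# Route all Stripe traffic through the proxy; this is the same for every tenant
stripe.api_base = "https://proxy.keybrake.com/stripe"

def run_billing_agent(tenant_id: str, plan_id: str, tenant_plan: str):
    # Cap (in USD) reflects tenant's plan tier; unknown tiers get the lowest cap
    plan_caps = {"starter": 500, "growth": 5000, "enterprise": 50000}
    cap = plan_caps.get(tenant_plan, 500)

    # Issue vault key scoped to this tenant's run
    resp = requests.post(
        "https://proxy.keybrake.com/vault/keys",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
        json={
            "vendor": "stripe",
            "daily_usd_cap": cap,
            "allowed_endpoints": ["POST /v1/payment_intents", "POST /v1/refunds"],
            "expires_in": "2h",
            "agent_run_label": f"billing-agent/tenant:{tenant_id}/plan:{plan_id}",
            "metadata": {"tenant_id": tenant_id, "plan_tier": tenant_plan},
        },
        timeout=10,
    )
    resp.raise_for_status()  # don't start charging without a valid vault key
    issued = resp.json()
    vault_key = issued["vault_key"]
    run_id = issued["key_id"]  # unique per run; reused in idempotency keys

    # `db` and `notify_operations` are the application's own helpers (not shown)
    customers = db.query(
        "SELECT * FROM customers WHERE tenant_id = ? AND plan_id = ?",
        tenant_id, plan_id
    )

    for customer in customers:
        try:
            stripe.PaymentIntent.create(
                amount=customer["amount_cents"],
                currency="usd",
                customer=customer["stripe_customer_id"],
                idempotency_key=f"{run_id}-{customer['id']}",
                # Pass the key per request: a global stripe.api_key would be
                # overwritten when tenants run concurrently in one process
                api_key=vault_key,
            )
        except Exception as e:
            # Stop the run when the proxy rejects a call for exceeding the cap
            if "cap_exhausted" in str(e):
                notify_operations(f"Tenant {tenant_id} billing agent hit spend cap: {cap} USD")
                break
            raise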

Each run_billing_agent invocation issues a new vault key scoped to that specific tenant and run. The cap reflects the tenant's plan tier. The agent_run_label embeds the tenant ID so every Stripe call is attributable to the correct tenant in the Keybrake audit log without any post-hoc cross-referencing. The metadata field stores the tenant ID and plan tier for structured queries.
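Because the label is the join key between a vendor call and a tenant, it helps to build and parse it in one place rather than with ad hoc f-strings. The helpers below are a minimal sketch: `build_run_label` and `parse_run_label` are hypothetical names, and the only assumption is the `agent/tenant:<id>/plan:<id>` layout used in the example above.

```python
def build_run_label(agent: str, tenant_id: str, plan_id: str) -> str:
    """Build a label in the article's layout: agent/tenant:<id>/plan:<id>."""
    for part in (agent, tenant_id, plan_id):
        # Reject separators so one field can never spill into another.
        if "/" in part or ":" in part:
            raise ValueError(f"label part may not contain '/' or ':': {part!r}")
    return f"{agent}/tenant:{tenant_id}/plan:{plan_id}"


def parse_run_label(label: str) -> dict:
    """Split a label back into its agent name and key:value fields."""
    agent, *fields = label.split("/")
    parsed = {"agent": agent}
    for item in fields:
        key, _, value = item.partition(":")
        parsed[key] = value
    return parsed


label = build_run_label("billing-agent", "t_42", "pro_monthly")
parsed = parse_run_label(label)
```

Rejecting `/` and `:` inside the parts is a design choice of this sketch: it keeps a tenant ID from masquerading as an extra label segment when the label is later used for filtering.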

Revoking one tenant without affecting others

# When Tenant A's billing agent goes rogue:
# 1. Look up the active vault keys for Tenant A (there is one per run)
resp = requests.get(
    "https://proxy.keybrake.com/vault/keys",
    headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
    params={"metadata.tenant_id": tenant_a_id, "status": "active"},
    timeout=10,
)
resp.raise_for_status()

# 2. Revoke just Tenant A's keys; Tenants B, C, D are unaffected
for key in resp.json()["keys"]:
    requests.delete(
        f"https://proxy.keybrake.com/vault/keys/{key['key_id']}",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
        timeout=10,
    ).raise_for_status()
# Tenant A's next Stripe call gets 401. Tenants B, C, D continue normally.

The revoke operation takes milliseconds and propagates to the proxy immediately. Tenant A's billing agent receives a 401 on its next Stripe call and fails cleanly. Tenants B, C, and D, whose agents are running at the same time with their own vault keys, are completely unaffected. No Stripe key rotation, no worker restart, no operational blast radius.
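For the agent to fail cleanly on revocation, the worker loop should stop at the first rejected call and record what was left undone, rather than retrying. The sketch below assumes a hypothetical `KeyRevoked` exception that your own wrapper raises when the proxy answers 401; `charge` stands in for whatever function performs the vendor call, and `run_until_revoked` is likewise an illustrative name.

```python
class KeyRevoked(Exception):
    """Hypothetical error raised by your wrapper on a 401 from the proxy."""


def run_until_revoked(customers, charge):
    """Charge customers in order; stop at the first revoked-key error.

    Returns (charged_ids, remaining_ids) so the run can be reviewed
    or resumed with a fresh key after the incident.
    """
    charged = []
    for index, customer in enumerate(customers):
        try:
            charge(customer)
        except KeyRevoked:
            remaining = [c["id"] for c in customers[index:]]
            return charged, remaining
        charged.append(customer["id"])
    return charged, []


# Simulated run: the key is revoked after two successful charges.
calls = {"n": 0}


def fake_charge(customer):
    if calls["n"] >= 2:
        raise KeyRevoked(customer["id"])
    calls["n"] += 1


customers = [{"id": f"cus_{i}"} for i in range(5)]
charged, remaining = run_until_revoked(customers, fake_charge)
```

Returning the remaining IDs makes the stopped run auditable: operations can see exactly which customers were not charged before deciding whether to resume.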

Audit log partitioning by tenant

With the agent_run_label including tenant:{tenant_id}, the Keybrake audit log gives you tenant-partitioned call history out of the box. For a compliance review of Tenant A's activity:

# Query Keybrake audit log for Tenant A only
resp = requests.get(
    "https://proxy.keybrake.com/audit",
    headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
    params={
        # Trailing "/" ends the tenant segment, so tenant "12" does not also match "123"
        "label_contains": f"tenant:{tenant_a_id}/",
        "vendor": "stripe",
        "from": "2026-06-01T00:00:00Z",
        "to": "2026-06-02T00:00:00Z",
    },
)
# Returns: [{timestamp, vault_key_id, endpoint, amount_usd, stripe_request_id, agent_run_label}]
# Tenant B's calls are never in this result set; they carry a different label.

Every Stripe call made by Tenant A's agents is in the result set. Every call made by Tenant B, C, or D is excluded — not by filtering your own database, but because the vault keys were issued with tenant-specific labels and the audit log is already partitioned by label. Spend reports, incident timelines, and compliance reviews are per-tenant queries, not full-table scans with application-level filtering.
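Once audit records are fetched, a per-tenant spend report is a simple aggregation. The sketch below works on plain dictionaries shaped like the record fields listed in the example above (`agent_run_label`, `amount_usd`); `tenant_of` and `spend_by_tenant` are hypothetical helpers, and the sample records are made up. It matches the `tenant:<id>` segment exactly, so tenant `12` is never confused with tenant `123`.

```python
from collections import defaultdict
from decimal import Decimal


def tenant_of(label: str):
    """Return the tenant ID from a label like 'agent/tenant:<id>/plan:<id>'."""
    for segment in label.split("/"):
        key, sep, value = segment.partition(":")
        if key == "tenant" and sep:
            return value
    return None


def spend_by_tenant(records):
    """Sum amount_usd per tenant; Decimal avoids float rounding drift."""
    totals = defaultdict(Decimal)
    for record in records:
        tenant = tenant_of(record["agent_run_label"])
        if tenant is not None:
            totals[tenant] += Decimal(str(record["amount_usd"]))
    return dict(totals)


# Sample records with made-up values, for illustration only.
records = [
    {"agent_run_label": "billing-agent/tenant:12/plan:p1", "amount_usd": "19.99"},
    {"agent_run_label": "billing-agent/tenant:123/plan:p1", "amount_usd": "5.00"},
    {"agent_run_label": "billing-agent/tenant:12/plan:p2", "amount_usd": "0.01"},
]
report = spend_by_tenant(records)
```

Parsing the label into segments, instead of substring matching, is what keeps the report correct when one tenant ID is a prefix of another.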

How Keybrake fits

Keybrake is the multi-tenant isolation layer between your platform's single Stripe account and N tenants' independent agent runs. You hold one Keybrake admin key at the platform level and issue N vault keys at the tenant level, each scoped to one tenant, one run, and one set of allowed actions. The spend cap on each vault key enforces your plan tier limits at the proxy level, not just in application code that can be bypassed. The audit log is already partitioned by tenant, so compliance, incident response, and customer-facing spend reports are first-class operations. Revoking one tenant's agent access is a single API call that affects only that tenant.
