
AI agent multi-tenant isolation: per-tenant API key scoping and spend caps

When you build a SaaS product that gives each customer their own AI agent — a billing agent that charges their customers, an outreach agent that sends their emails, a fulfillment agent that triggers their Shopify orders — you face an architectural choice: give each tenant their own Stripe and Twilio keys, or use your platform keys and route all vendor calls through shared credentials. Using your platform keys is simpler to build but creates hard isolation problems: one tenant's stuck billing agent can exhaust Stripe's rate limits and slow every other tenant's requests, a runaway outreach agent burns your Resend send limit for the month, and revoking a shared key to stop a bad actor breaks every tenant simultaneously. Giving each tenant separate Stripe keys is the correct architecture but requires storing, rotating, and auditing dozens or hundreds of vendor secrets. Per-tenant vault keys solve this: one real Stripe key at your platform level, N vault keys at the tenant level, each with its own cap and its own revoke lifecycle.

TL;DR

Issue one vault key per tenant per agent run. Each vault key has its own dollar cap (reflecting that tenant's plan limits), its own endpoint allowlist (reflecting what actions that tenant's agent is permitted to take), and its own revoke lifecycle (revoking one tenant's key doesn't affect any other tenant). The audit log for each vendor call includes the tenant ID as part of the agent_run_label, so you can partition call history, spend reports, and incident reviews by tenant without cross-contamination. One real Stripe key at the platform level, isolated exposure at the tenant level.

The multi-tenant API key problem

Consider a B2B SaaS that offers a "billing agent" feature. Each customer (tenant) has their own subscribers, and the billing agent charges those subscribers on a schedule. The platform has a single Stripe account with a single secret key:

# Naive multi-tenant billing agent
import os

import stripe

STRIPE_KEY = os.environ["STRIPE_SECRET_KEY"]  # shared platform key

def run_billing_agent(tenant_id: str, plan_id: str):
    stripe.api_key = STRIPE_KEY  # Same key for every tenant
    # `db` is the application's database handle (not shown here)
    customers = db.query("SELECT * FROM customers WHERE tenant_id = ? AND plan_id = ?", tenant_id, plan_id)

    # Charge every matching customer; nothing caps spend or isolates this tenant
    for customer in customers:
        stripe.PaymentIntent.create(
            amount=customer["amount_cents"],
            currency="usd",
            customer=customer["stripe_customer_id"],
        )

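This works until it doesn't. The failure modes are specific and expensive: one tenant's stuck agent can exhaust the shared key's Stripe rate limits and slow every other tenant, nothing caps what a runaway run can spend, and the only way to stop a misbehaving tenant is to rotate the key that every other tenant depends on.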

Three isolation requirements for multi-tenant agents

| Requirement | What it means in practice | Why it's hard with shared keys |
| --- | --- | --- |
| Spend isolation | Tenant A's agent can spend at most $X per run (where X reflects A's plan tier). Tenant A hitting $X doesn't affect Tenant B's ability to spend up to their own $Y limit. | Stripe's rate limits and your own Stripe spending limits apply to the key, not the tenant. One tenant's spend behavior affects all others sharing the key. |
| Revoke isolation | Revoking Tenant A's agent access (because their run is stuck, their account is suspended, or there's a security incident) doesn't affect any other tenant's running agents. | Rotating the shared Stripe key is the only way to revoke access, but this affects every agent on every tenant simultaneously, causing operational chaos during an incident. |
| Audit isolation | Tenant A's agent's full call history is queryable without any Tenant B records appearing. Incidents, compliance reviews, and customer-facing spend reports can be run per tenant without cross-contamination. | Stripe's API logs are keyed by payment intent, customer, and timestamp, not by your tenant ID. Cross-referencing requires matching Stripe customer IDs to your tenant table, which is brittle and slow at scale. |
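The three requirements can be illustrated with a toy in-memory model. This is only a sketch of the semantics, not Keybrake's implementation: the `VaultKey` class, its fields, and the refusal strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VaultKey:
    """Hypothetical in-memory stand-in for one tenant-scoped vault key."""
    tenant_id: str
    cap_usd: float
    allowed_endpoints: frozenset
    spent_usd: float = 0.0
    revoked: bool = False
    calls: list = field(default_factory=list)  # audit trail for this key only

    def authorize(self, endpoint: str, amount_usd: float) -> str:
        """Return "ok" and record the call, or return a refusal reason."""
        if self.revoked:
            return "revoked"
        if endpoint not in self.allowed_endpoints:
            return "endpoint_not_allowed"
        if self.spent_usd + amount_usd > self.cap_usd:
            return "cap_exhausted"
        self.spent_usd += amount_usd
        self.calls.append((endpoint, amount_usd))
        return "ok"


endpoints = frozenset({"POST /v1/payment_intents"})
key_a = VaultKey("tenant-a", cap_usd=500, allowed_endpoints=endpoints)
key_b = VaultKey("tenant-b", cap_usd=5000, allowed_endpoints=endpoints)

print(key_a.authorize("POST /v1/payment_intents", 400))  # within A's cap
print(key_a.authorize("POST /v1/payment_intents", 200))  # would exceed A's cap
print(key_b.authorize("POST /v1/payment_intents", 200))  # B has its own budget
key_a.revoked = True
print(key_a.authorize("POST /v1/payment_intents", 1))    # A is cut off
print(key_b.authorize("POST /v1/payment_intents", 1))    # B keeps running
```

Because spend, revocation state, and call history all live on the key rather than on a shared credential, nothing Tenant A does changes what Tenant B's key allows.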

Per-tenant vault key architecture

The vault key pattern scales to multi-tenancy naturally. Issue one vault key per tenant per agent run, using the tenant's plan tier to set the cap:

import os

import requests
import stripe

# Platform-level Keybrake admin key (assumed here to come from the environment)
KEYBRAKE_API_KEY = os.environ["KEYBRAKE_API_KEY"]

def run_billing_agent(tenant_id: str, plan_id: str, tenant_plan: str):
    # Cap (USD) reflects tenant's plan tier; unknown tiers get the lowest cap
    plan_caps = {"starter": 500, "growth": 5000, "enterprise": 50000}
    cap = plan_caps.get(tenant_plan, 500)

    # Issue vault key scoped to this tenant's run
    resp = requests.post(
        "https://proxy.keybrake.com/vault/keys",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
        json={
            "vendor": "stripe",
            "daily_usd_cap": cap,
            "allowed_endpoints": ["POST /v1/payment_intents", "POST /v1/refunds"],
            "expires_in": "2h",
            "agent_run_label": f"billing-agent/tenant:{tenant_id}/plan:{plan_id}",
            "metadata": {"tenant_id": tenant_id, "plan_tier": tenant_plan},
        },
        timeout=10,
    )
    resp.raise_for_status()  # stop before charging anyone if key issuance failed
    key = resp.json()
    vault_key = key["vault_key"]
    run_id = key["key_id"]  # also used as the idempotency-key prefix for this run

    # These are module-level settings in stripe-python: run one tenant per
    # process so concurrent runs never pick up another tenant's vault key.
    stripe.api_key = vault_key
    stripe.api_base = "https://proxy.keybrake.com/stripe"

    # `db` is the application's database handle (not shown here)
    customers = db.query(
        "SELECT * FROM customers WHERE tenant_id = ? AND plan_id = ?",
        tenant_id, plan_id
    )

    for customer in customers:
        try:
            stripe.PaymentIntent.create(
                amount=customer["amount_cents"],
                currency="usd",
                customer=customer["stripe_customer_id"],
                idempotency_key=f"{run_id}-{customer['id']}",
            )
        except Exception as e:
            # The proxy rejects calls once the cap is reached; stop this run only
            if "cap_exhausted" in str(e):
                notify_operations(f"Tenant {tenant_id} billing agent hit spend cap: {cap} USD")
                break
            raise

Each run_billing_agent invocation issues a new vault key scoped to that specific tenant and run. The cap reflects the tenant's plan tier. The agent_run_label embeds the tenant ID so every Stripe call is attributable to the correct tenant in the Keybrake audit log without any post-hoc cross-referencing. The metadata field stores the tenant ID and plan tier for structured queries.

Revoking one tenant without affecting others

# When Tenant A's billing agent goes rogue:
# 1. Look up the active vault keys for Tenant A
resp = requests.get(
    "https://proxy.keybrake.com/vault/keys",
    headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
    params={"metadata.tenant_id": tenant_a_id, "status": "active"},
    timeout=10,
)
resp.raise_for_status()

# 2. Revoke every active key for Tenant A (there may be one per in-flight run).
#    Tenants B, C, D are unaffected.
for key in resp.json()["keys"]:
    requests.delete(
        f"https://proxy.keybrake.com/vault/keys/{key['key_id']}",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
        timeout=10,
    ).raise_for_status()
# Tenant A's next Stripe call gets 401. Tenants B, C, D continue normally.

The revoke operation takes milliseconds and propagates to the proxy immediately. Tenant A's billing agent receives a 401 on its next Stripe call and fails cleanly. Tenants B, C, and D, whose agents are running simultaneously with their own vault keys, are completely unaffected. No Stripe key rotation, no worker restart, no operational blast radius.

Audit log partitioning by tenant

With the agent_run_label including tenant:{tenant_id}, the Keybrake audit log gives you tenant-partitioned call history out of the box. For a compliance review of Tenant A's activity:

# Query Keybrake audit log for Tenant A only
resp = requests.get(
    "https://proxy.keybrake.com/audit",
    headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
    params={
        # Trailing "/" matches the label format used at key issuance, so a
        # substring match on "tenant:12" cannot also pick up "tenant:123"
        "label_contains": f"tenant:{tenant_a_id}/",
        "vendor": "stripe",
        "from": "2026-06-01T00:00:00Z",
        "to": "2026-06-02T00:00:00Z",
    },
    timeout=10,
)
resp.raise_for_status()
# Returns: [{timestamp, vault_key_id, endpoint, amount_usd, stripe_request_id, agent_run_label}]
# Tenant B's calls are never in this result set: they have a different label.

Every Stripe call made by Tenant A's agents is in the result set. Every call made by Tenant B, C, or D is excluded — not by filtering your own database, but because the vault keys were issued with tenant-specific labels and the audit log is already partitioned by label. Spend reports, incident timelines, and compliance reviews are per-tenant queries, not full-table scans with application-level filtering.
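As a rough sketch of how a spend report could be built from such records, the snippet below groups audit rows by the tenant ID embedded in `agent_run_label`. The record fields follow the shape listed above; the helper names, the sample rows, and the assumption that tenant IDs never contain "/" are illustrative and not part of any Keybrake API.

```python
from collections import defaultdict


def tenant_from_label(label: str):
    """Extract the tenant ID from a label like 'billing-agent/tenant:t_1/plan:p_9'."""
    for segment in label.split("/"):
        if segment.startswith("tenant:"):
            return segment[len("tenant:"):]
    return None  # label carries no tenant segment


def spend_by_tenant(records):
    """Sum amount_usd per tenant, matching tenant IDs exactly (not by prefix)."""
    totals = defaultdict(float)
    for record in records:
        tenant = tenant_from_label(record["agent_run_label"])
        if tenant is not None:
            totals[tenant] += record["amount_usd"]
    return dict(totals)


# Hypothetical audit rows; tenants "12" and "123" share a prefix on purpose.
sample_records = [
    {"agent_run_label": "billing-agent/tenant:12/plan:basic", "amount_usd": 10.0},
    {"agent_run_label": "billing-agent/tenant:12/plan:basic", "amount_usd": 20.0},
    {"agent_run_label": "billing-agent/tenant:123/plan:pro", "amount_usd": 5.0},
]

print(spend_by_tenant(sample_records))
```

Splitting the label on its delimiter, rather than doing a substring search, keeps tenants whose IDs share a prefix from being merged into one total.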

How Keybrake fits

Keybrake is the multi-tenant isolation layer between your platform's single Stripe account and N tenants' independent agent runs. One Keybrake admin key at your platform level. N vault keys at the tenant level — one per run, one per tenant, one per allowed action set. The spend cap on each vault key enforces your plan tier limits at the proxy level, not just in application code that can be bypassed. The audit log is already partitioned by tenant, so compliance, incident response, and customer-facing spend reports are first-class operations. Revoking one tenant's agent access is a single API call that affects only that tenant.


Related questions

Should I issue one vault key per tenant per run, or one per tenant total?

One per run is the right granularity. A "per-tenant total" key that lasts for the account lifetime means the cap reflects cumulative spend across all runs, not per-run spend — a tenant who makes 10 billing runs per month needs a cap 10× the single-run cap to avoid false positives. A per-run key has a cap that matches the expected spend of one billing run, expires when the run is done (limiting exposure), and gives you per-run audit resolution. Issue the key at the start of the agent run and let it expire after the run completes (set TTL to 2× expected run duration as a safety margin).
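The arithmetic behind this recommendation can be sketched as follows. The plan caps are the ones from the billing-agent example; the function names and the choice to express the TTL in minutes are assumptions for illustration, and the value would still need converting to whatever format the key-issuance call expects.

```python
import math

# Plan caps in USD, taken from the billing-agent example in this article.
PLAN_CAPS_USD = {"starter": 500, "growth": 5000, "enterprise": 50000}


def run_key_params(plan_tier: str, expected_run_minutes: float) -> dict:
    """Per-run cap plus a TTL of twice the expected run duration."""
    per_run_cap = PLAN_CAPS_USD.get(plan_tier, PLAN_CAPS_USD["starter"])
    ttl_minutes = math.ceil(2 * expected_run_minutes)
    return {"cap_usd": per_run_cap, "ttl_minutes": ttl_minutes}


def lifetime_key_cap(plan_tier: str, runs_per_month: int) -> int:
    """Cap a single long-lived key would need to cover a month of runs."""
    per_run_cap = PLAN_CAPS_USD.get(plan_tier, PLAN_CAPS_USD["starter"])
    return runs_per_month * per_run_cap


params = run_key_params("starter", expected_run_minutes=45)
print(params)                                          # cap 500 USD, TTL 90 minutes
print(lifetime_key_cap("starter", runs_per_month=10))  # 5000 USD
```

With a long-lived key, a single runaway run could spend the full monthly figure before the cap trips; with per-run keys, the worst case for any one run is the per-run cap.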

How do I handle tenants who use their own Stripe accounts, not my platform's?

If tenants provide their own Stripe keys (common in "bring your own credentials" architectures), the multi-tenant isolation is at the Stripe account level — each tenant's key can only charge their own Stripe customers. In this case, the vault key pattern still adds value: store the tenant's Stripe key in Keybrake as a vendor credential, issue vault keys that reference the tenant's credential, and add per-run caps on top of their Stripe account. The audit log is still per-tenant (each vault key is for one tenant's credential), and you can still revoke access without the tenant rotating their own Stripe key.

Can vault keys restrict which Stripe customers an agent can charge?

The vault key's allowed_endpoints field restricts API endpoints (e.g., only POST /v1/payment_intents, not POST /v1/customers or DELETE /v1/customers/{id}). It doesn't restrict which Stripe customer IDs are valid — that's enforced at the application level by querying your tenant's customer table before making the call. For stronger isolation, combine endpoint allowlists with application-level pre-flight checks: before calling Stripe, verify that customer.tenant_id == current_tenant_id in your database. This prevents prompt injection attacks that might try to charge a different tenant's customer.
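A minimal sketch of such a pre-flight check is shown below. It assumes customer rows are dictionaries with `id` and `tenant_id` fields, as in the billing-agent example; `TenantMismatchError` and `check_customer_tenant` are hypothetical names.

```python
class TenantMismatchError(Exception):
    """Raised when a customer row does not belong to the tenant being billed."""


def check_customer_tenant(customer: dict, current_tenant_id: str) -> None:
    """Refuse to proceed unless the customer belongs to the current tenant."""
    if customer.get("tenant_id") != current_tenant_id:
        raise TenantMismatchError(
            f"customer {customer.get('id')} does not belong to tenant {current_tenant_id}"
        )


def chargeable_customers(customers, current_tenant_id: str):
    """Yield only customers that pass the check; call this before any Stripe call."""
    for customer in customers:
        check_customer_tenant(customer, current_tenant_id)
        yield customer


# Hypothetical rows: the second one belongs to a different tenant.
rows = [
    {"id": "c1", "tenant_id": "tenant-a", "stripe_customer_id": "cus_1"},
    {"id": "c2", "tenant_id": "tenant-b", "stripe_customer_id": "cus_2"},
]

try:
    for customer in chargeable_customers(rows, "tenant-a"):
        print("would charge", customer["stripe_customer_id"])
except TenantMismatchError as err:
    print("blocked:", err)
```

Raising instead of silently skipping the row makes a cross-tenant customer ID show up as an incident rather than as a quietly dropped charge.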

Further reading