
AI agent zero trust: applying zero-trust principles to autonomous system credentials

Zero trust was formalized for a different problem: human users accessing internal network resources from untrusted locations. "Never trust, always verify" in that context means requiring identity verification (MFA, certificates) even on the corporate network, because the network perimeter is no longer the security boundary. AI agents that make autonomous vendor API calls — Stripe charges, Twilio SMS, Resend emails — create a credential problem that traditional zero-trust architecture (ZTNA, BeyondCorp) doesn't address. The agent is authorized to call Stripe; the problem isn't unauthorized access. The problem is authorized excess: the agent is stuck in a loop and calls Stripe 2,000 times, or the agent's prompt was injected to execute a larger transaction than intended. This page covers the four zero-trust principles that apply specifically to AI agent credential design for vendor API access.

TL;DR

Traditional zero trust is about network access and identity verification — it doesn't cover AI agent vendor API spend. Agent zero trust has four specific requirements: (1) per-run credentials, not long-lived keys; (2) endpoint allowlists, not just network segments; (3) pre-call spend caps, not post-charge billing alerts; (4) per-call audit with agent context, not just access logs. Vault keys implement all four: scoped, short-lived, revocable credentials with call-level audit, issued per agent run against a proxy that enforces the policy pre-call.

Where traditional zero trust stops and agent zero trust begins

Traditional zero-trust architecture (Google BeyondCorp, ZTNA, Zscaler, Cloudflare Access) addresses one threat model: unauthorized access to internal resources. Its controls are:

Identity verification on every request — MFA, device posture, identity-aware proxy
Network segmentation — micro-segments that prevent lateral movement after a breach
Least-privilege access — users get the minimum network/resource access needed for their role
Continuous validation — access isn't assumed because a session was authenticated earlier

Apply these to an AI agent calling Stripe and the gaps are immediate. BeyondCorp controls which network resources the agent's service account can reach — but it doesn't control what the agent does once it reaches Stripe. The agent has authorized access to api.stripe.com; it uses a valid Stripe API key; it's making legitimate API calls. BeyondCorp sees "authorized request to authorized endpoint" and approves it — even if the agent is calling POST /v1/payment_intents 2,000 times instead of once.

The threat model for AI agent vendor access isn't unauthorized access; it's authorized excess. The agent is permitted to call Stripe; the risk is that it calls too often, for too large an amount, against the wrong endpoints, or while no human is watching.

The four zero-trust principles for AI agent credentials

Principle 1: Per-run credentials, not long-lived keys

Traditional zero trust principle: "never trust, always verify" — don't assume a session is valid; verify on every request.

Agent equivalent: never issue a long-lived key to an agent; issue a credential scoped to a single run, with an expiry matching the run's expected duration.

A long-lived Stripe API key given to a billing agent is trusted implicitly for its entire lifetime, which can be months or years. If a bug makes the agent loop, the key stays valid for every iteration. Per-run credentials limit the blast radius to a single execution: a billing run that should last 2 hours gets a vault key that expires in 3 hours, leaving a one-hour buffer. Whether the run ends normally or abnormally, the key stops working at its expiry. A key leaked in logs or error messages is usable for hours at most, not months.

# Traditional approach: long-lived key in the environment (.env)
#   STRIPE_SECRET_KEY=sk_live_abc123   valid for years, full account access

# Zero-trust approach: per-run vault key
vault_key = issue_vault_key({
    "vendor": "stripe",
    "expires_in": "3h",          # expected 2h run plus a 1h buffer
    "daily_usd_cap": 500,        # spend boundary; the key lives under a day, so this bounds the run
    "allowed_endpoints": [       # only what this run legitimately needs
        "POST /v1/payment_intents",
    ],
    "agent_run_label": f"billing/{run_id}",  # run_id comes from the orchestrator
})
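To make the expiry behaviour concrete, here is a minimal sketch of how a proxy might validate a per-run key on each call. The `VaultKey` record and the `issue_key` and `is_valid` helpers are hypothetical names invented for illustration; they are not Keybrake's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class VaultKey:
    """Hypothetical in-memory record for a per-run credential."""
    run_label: str
    expires_at: datetime


def issue_key(run_label: str, expected_run: timedelta, buffer: timedelta,
              now: datetime) -> VaultKey:
    # Expiry = expected run duration plus a safety buffer (2h + 1h = 3h in the text).
    return VaultKey(run_label=run_label, expires_at=now + expected_run + buffer)


def is_valid(key: VaultKey, now: datetime) -> bool:
    # Checked on every call, so a leaked key stops working once the window closes.
    return now < key.expires_at


start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
key = issue_key("billing/run_abc123", timedelta(hours=2), timedelta(hours=1), start)
print(is_valid(key, start + timedelta(hours=2, minutes=30)))  # True: inside the 3h window
print(is_valid(key, start + timedelta(hours=3, minutes=1)))   # False: key has expired
```

Passing `now` explicitly keeps the check deterministic and easy to test; a real proxy would read the clock itself.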

Principle 2: Endpoint allowlists, not just network segments

Traditional zero trust principle: micro-segmentation — restrict which internal network resources each workload can reach.

Agent equivalent: restrict which API endpoints the agent can call within an authorized vendor, not just which vendors it can reach.

Network segmentation says "this agent's pod can reach api.stripe.com." That's 350+ endpoints: POST /v1/payment_intents, POST /v1/refunds, DELETE /v1/customers/{id}, POST /v1/transfers. A billing agent that legitimately needs POST /v1/payment_intents should not be able to call POST /v1/refunds (refund loop risk) or POST /v1/transfers (fund movement). Network segmentation can't express this distinction — all of those endpoints are on the same hostname.

Endpoint allowlists within the credential enforce least privilege at the API level: the vault key policy specifies exactly which paths are permitted. Any call to an unlisted path gets 403 before the request reaches Stripe — before money can move in unintended directions.

Control	What it prevents	What it doesn't prevent
Network segmentation (ZTNA)	Agent calling unauthorized vendors (e.g., a billing agent connecting to an unrelated internal service)	Agent calling `POST /v1/refunds` when it should only call `POST /v1/payment_intents`
Stripe Restricted Key scope	Agent calling broad categories of endpoints (e.g., no Payout Write permission = no wire transfers)	Agent calling `POST /v1/payment_intents` 5,000 times; no per-day dollar cap; no mid-run revoke
Vault key endpoint allowlist	Agent calling any endpoint not in the explicit allowlist; blocked at the proxy before the vendor receives the request	Any misuse within the allowed endpoints; the pre-call spend cap provides the backstop there
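The allowlist check itself can be sketched in a few lines. This is an illustrative assumption about how such a proxy might work, not Keybrake's implementation: `ALLOWED` and `check_endpoint` are hypothetical names, matching is exact on method and path, and parameterized paths such as /v1/customers/{id} would need pattern matching that is not shown here.

```python
from typing import FrozenSet, Tuple
from urllib.parse import urlsplit

# Hypothetical policy for a billing run: one method + path pair.
ALLOWED: FrozenSet[str] = frozenset({"POST /v1/payment_intents"})


def check_endpoint(method: str, target: str,
                   allowed: FrozenSet[str] = ALLOWED) -> Tuple[int, str]:
    """Return (status, reason); 200 here stands in for 'forward to the vendor'."""
    path = urlsplit(target).path          # ignore any query string
    key = f"{method.upper()} {path}"
    if key in allowed:
        return 200, "forward"
    # Rejected at the proxy, so the vendor never receives the request.
    return 403, f"{key} is not in the allowlist"


print(check_endpoint("POST", "/v1/payment_intents"))
print(check_endpoint("POST", "/v1/refunds"))
```

Default-deny is the design choice that matters: anything not listed is rejected, so new vendor endpoints are blocked until someone explicitly allows them.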

Principle 3: Pre-call spend caps, not post-charge alerts

Traditional zero trust principle: continuous validation — enforce policy on every request, not just at session start.

Agent equivalent: enforce a dollar cap before each vendor call, not after the billing cycle aggregates charges.

Post-charge billing alerts (Stripe Dashboard budget alerts, AWS Budgets alerts, Twilio spend threshold notifications) fire after the vendor has already recorded the transaction. For example, if a $1,000/day alert is only evaluated 8 hours after midnight, a stuck agent that started at 11 PM has nine hours to run and can generate $900 in charges before the alert is even checked. The operator then still has to stop the agent manually, by which time more charges may be in flight.

A pre-call spend cap fires before the request reaches the vendor. The proxy checks the vault key's cumulative spend against the cap; if the next call would exceed it, the proxy returns 429 and the vendor API never receives the request. No charge is applied and no money moves. The check runs inline on every request, so enforcement takes effect immediately rather than hours later.
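A rough sketch of the check-then-forward step follows. It assumes the cost of the next call is known before forwarding (for example, from the amount in the request body), which the text does not specify; `SpendCap` and `handle_call` are hypothetical names, and a production proxy would keep the counter in shared storage rather than in process memory.

```python
import threading
from decimal import Decimal


class SpendCap:
    """Hypothetical in-process cap tracker for a single vault key."""

    def __init__(self, cap_usd: Decimal) -> None:
        self._cap = cap_usd
        self._spent = Decimal("0")
        self._lock = threading.Lock()

    def try_reserve(self, amount_usd: Decimal) -> bool:
        # Check and increment under one lock so concurrent calls cannot both pass.
        with self._lock:
            if self._spent + amount_usd > self._cap:
                return False
            self._spent += amount_usd
            return True


def handle_call(cap: SpendCap, amount_usd: Decimal) -> int:
    # 429 is returned without contacting the vendor; 200 stands in for "forwarded".
    return 200 if cap.try_reserve(amount_usd) else 429


cap = SpendCap(Decimal("500"))
statuses = [handle_call(cap, Decimal("200")) for _ in range(3)]
print(statuses)  # [200, 200, 429]
```

The third call is rejected because 400 + 200 would exceed the 500 cap. Doing the comparison and the increment in one critical section is what keeps two simultaneous calls from each seeing room under the cap.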

Principle 4: Per-call audit with agent context, not just access logs

Traditional zero trust principle: assume breach — log everything so you can investigate after an incident.

Agent equivalent: log every vendor API call with agent run context (run ID, agent name, policy verdict, parsed cost) so you can reconstruct what an agent did and what it cost, not just that a request was made.

Standard access logs (network flow logs, API gateway request logs) record that a request was made: source IP, target endpoint, timestamp, HTTP status code. They don't record which agent run triggered the request, what the vendor charged, whether the call was within policy, or how this call relates to a broader agent operation. Post-incident investigation requires manually correlating API gateway logs with orchestration system run histories — two systems with no shared identifier.

A proxy audit log with agent context answers post-incident questions directly:

-- What did run X do and what did it cost?
SELECT vendor, endpoint, cost_usd, vendor_txn_id, called_at, policy_verdict
FROM audit_log
WHERE agent_run_id = 'billing/run_abc123'
ORDER BY called_at;

-- What called Stripe yesterday, and how much did it spend? (SQLite date syntax)
SELECT agent_name, SUM(cost_usd) AS total_usd, COUNT(*) AS calls
FROM audit_log
WHERE vendor = 'stripe' AND date(called_at) = date('now', '-1 day')
GROUP BY agent_name
ORDER BY total_usd DESC;
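To try the per-run query locally, the sketch below builds an in-memory SQLite table with the columns the queries reference. The column types, the verdict strings, and the sample rows are all invented for illustration; the actual audit schema is not given in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE audit_log (
        agent_run_id   TEXT NOT NULL,
        agent_name     TEXT NOT NULL,
        vendor         TEXT NOT NULL,
        endpoint       TEXT NOT NULL,
        cost_usd       REAL NOT NULL,
        vendor_txn_id  TEXT,
        called_at      TEXT NOT NULL,   -- ISO 8601 UTC timestamp
        policy_verdict TEXT NOT NULL
    )
    """
)
# Made-up sample rows: one allowed charge, one blocked refund, one other run.
rows = [
    ("billing/run_abc123", "billing", "stripe", "POST /v1/payment_intents",
     120.0, "pi_1", "2025-01-01T09:00:00Z", "allow"),
    ("billing/run_abc123", "billing", "stripe", "POST /v1/refunds",
     0.0, None, "2025-01-01T09:00:05Z", "deny_endpoint"),
    ("dunning/run_xyz789", "dunning", "stripe", "POST /v1/payment_intents",
     35.0, "pi_2", "2025-01-01T09:01:00Z", "allow"),
]
conn.executemany("INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Same shape as the first query above, with the run ID bound as a parameter.
run_calls = conn.execute(
    "SELECT endpoint, cost_usd, policy_verdict FROM audit_log "
    "WHERE agent_run_id = ? ORDER BY called_at",
    ("billing/run_abc123",),
).fetchall()
total_usd = sum(cost for _, cost, _ in run_calls)
print(run_calls)
print(total_usd)
```

Logging denied calls alongside allowed ones, as the second sample row does, is what lets the same table answer both "what did the run cost" and "what did the run try to do".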

The gap between enterprise zero trust and agent zero trust

Enterprise zero trust products (Zscaler, Cloudflare Access, Palo Alto Prisma Access, Okta) provide ZTNA — identity-verified, device-checked access to internal network resources. They are the right tool for controlling which services your agent's deployment environment can reach at the network layer.

But they cannot provide:

Per-run spend caps on vendor API calls (they don't parse Stripe response bodies)
Endpoint-level allowlists within a single vendor's API (they work at the hostname level)
Sub-second vault key revocation that stops a specific agent run (they revoke network access, not per-run credentials)
Per-call audit with parsed dollar cost and agent run context (they log network flows, not vendor transaction amounts)

The two layers are complementary: ZTNA controls which network resources your agent reaches; the vault key proxy controls what the agent can do at those resources and how much it can spend. Neither replaces the other. A complete agent zero-trust posture needs both.

How Keybrake implements agent zero trust

Keybrake is the proxy that enforces the four agent zero-trust principles at the vendor API layer:

Per-run credentials — vault keys with expires_in set to the run duration, issued fresh per run
Endpoint allowlists — allowed_endpoints policy enforced at the proxy before each forwarded request
Pre-call spend caps — daily_usd_cap checked against cumulative spend atomically before forwarding; 429 returned before money moves
Per-call audit — every proxied call logged with agent_run_id, vendor, endpoint, cost_usd_parsed, and policy_verdict
