
AI agent zero trust: applying zero-trust principles to autonomous system credentials

Zero trust was formalized for a different problem: human users accessing internal network resources from untrusted locations. In that context, "never trust, always verify" means requiring identity verification (MFA, certificates) even on the corporate network, because the network perimeter is no longer the security boundary. AI agents that make autonomous vendor API calls (Stripe charges, Twilio SMS, Resend emails) create a credential problem that traditional zero-trust architecture (ZTNA, BeyondCorp) doesn't address. The agent is authorized to call Stripe, so the problem isn't unauthorized access. The problem is authorized excess: the agent gets stuck in a loop and calls Stripe 2,000 times, or a prompt injection steers it into executing a larger transaction than intended. This page covers the four zero-trust principles that apply specifically to AI agent credential design for vendor API access.

TL;DR

Traditional zero trust is about network access and identity verification — it doesn't cover AI agent vendor API spend. Agent zero trust has four specific requirements: (1) per-run credentials, not long-lived keys; (2) endpoint allowlists, not just network segments; (3) pre-call spend caps, not post-charge billing alerts; (4) per-call audit with agent context, not just access logs. Vault keys implement all four: scoped, short-lived, revocable credentials with call-level audit, issued per agent run against a proxy that enforces the policy pre-call.

Where traditional zero trust stops and agent zero trust begins

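Traditional zero-trust architecture (Google BeyondCorp, ZTNA, Zscaler, Cloudflare Access) addresses one threat model: unauthorized access to internal resources. Its controls are identity verification on every request, micro-segmentation of network resources, continuous policy validation, and assume-breach logging.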

Apply these to an AI agent calling Stripe and the gaps are immediate. BeyondCorp controls which network resources the agent's service account can reach — but it doesn't control what the agent does once it reaches Stripe. The agent has authorized access to api.stripe.com; it uses a valid Stripe API key; it's making legitimate API calls. BeyondCorp sees "authorized request to authorized endpoint" and approves it — even if the agent is calling POST /v1/payment_intents 2,000 times instead of once.

The threat model for AI agent vendor access isn't unauthorized access; it's authorized excess. The agent is permitted to call Stripe; the risk is that it calls it too often, for too large an amount, against the wrong endpoints, or while no human is watching.

The four zero-trust principles for AI agent credentials

Principle 1: Per-run credentials, not long-lived keys

Traditional zero trust principle: "never trust, always verify" — don't assume a session is valid; verify on every request.

Agent equivalent: never issue a long-lived key to an agent; issue a credential scoped to a single run, with an expiry matching the run's expected duration.

A long-lived Stripe API key given to a billing agent is trusted implicitly for its entire lifetime — months or years. If the agent has a bug that causes it to loop, the key is valid for every iteration of the loop. Per-run credentials limit the blast radius to a single execution: a billing run that should last 2 hours gets a vault key that expires in 3 hours. When the run ends (normally or abnormally), the key expires. A key leaked in logs or error messages expires within hours, not months.

# Traditional approach: long-lived key in the environment
STRIPE_SECRET_KEY = "sk_live_abc123"  # valid for years, full account access

# Zero-trust approach: per-run vault key
# (issue_vault_key and run_id come from the surrounding orchestration code)
vault_key = issue_vault_key({
    "vendor": "stripe",
    "expires_in": "3h",          # expected 2h run plus margin
    "daily_usd_cap": 500,        # run-level spend boundary
    "allowed_endpoints": [       # only what this run legitimately needs
        "POST /v1/payment_intents",
    ],
    "agent_run_label": f"billing/{run_id}",
})
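To make the lifecycle concrete, here is a minimal sketch of how a proxy might decide whether a per-run key is still usable. The `VaultKey` record, `issue_for_run` helper, and field names are hypothetical illustrations, not the product's actual data model; the only behavior taken from the text is that a key stops working once it expires or is revoked.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class VaultKey:
    """Hypothetical per-run credential record held by the proxy."""
    run_label: str
    expires_at: datetime
    revoked: bool = False

    def is_valid(self, now: datetime) -> bool:
        # Usable only while unexpired and not revoked.
        return not self.revoked and now < self.expires_at


def issue_for_run(run_label: str, ttl: timedelta, now: datetime) -> VaultKey:
    # Expiry is fixed at issue time; nothing extends it later.
    return VaultKey(run_label=run_label, expires_at=now + ttl)


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
key = issue_for_run("billing/run_abc123", timedelta(hours=3), start)

print(key.is_valid(start + timedelta(hours=2)))    # inside the 3h window
print(key.is_valid(start + timedelta(hours=4)))    # after expiry
key.revoked = True
print(key.is_valid(start + timedelta(minutes=5)))  # revoked mid-run
```

Passing `now` explicitly keeps the check deterministic and easy to test; a real proxy would read the clock itself and persist the revoked flag somewhere shared.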

Principle 2: Endpoint allowlists, not just network segments

Traditional zero trust principle: micro-segmentation — restrict which internal network resources each workload can reach.

Agent equivalent: restrict which API endpoints the agent can call within an authorized vendor, not just which vendors it can reach.

Network segmentation says "this agent's pod can reach api.stripe.com." That hostname exposes 350+ endpoints, including POST /v1/payment_intents, POST /v1/refunds, DELETE /v1/customers/{id}, and POST /v1/transfers. A billing agent that legitimately needs POST /v1/payment_intents should not be able to call POST /v1/refunds (refund loop risk) or POST /v1/transfers (fund movement). Network segmentation can't express this distinction, because all of those endpoints sit on the same hostname.

Endpoint allowlists within the credential enforce least privilege at the API level: the vault key policy specifies exactly which paths are permitted. Any call to an unlisted path gets 403 before the request reaches Stripe — before money can move in unintended directions.
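The allowlist check itself can be small. The sketch below assumes policy entries are written as "METHOD /path" strings (the form used in the vault key example) and that a `{name}` segment matches any single path segment; both the matching rules and the function names are assumptions for illustration, not a description of any particular proxy.

```python
def _matches(template: str, path: str) -> bool:
    """Compare a path against a template segment by segment."""
    t_parts = template.strip("/").split("/")
    p_parts = path.strip("/").split("/")
    if len(t_parts) != len(p_parts):
        return False
    return all(
        (t.startswith("{") and t.endswith("}") and p != "") or t == p
        for t, p in zip(t_parts, p_parts)
    )


def is_allowed(method: str, path: str, allowed_endpoints: list[str]) -> bool:
    """Return True only if (method, path) matches an explicit allowlist entry."""
    path = path.split("?", 1)[0]  # ignore the query string
    for entry in allowed_endpoints:
        allowed_method, template = entry.split(" ", 1)
        if method.upper() == allowed_method and _matches(template, path):
            return True
    return False  # default deny: the proxy would answer 403


policy = ["POST /v1/payment_intents"]
print(is_allowed("POST", "/v1/payment_intents", policy))
print(is_allowed("POST", "/v1/refunds", policy))
print(is_allowed("POST", "/v1/transfers", policy))
```

The important design choice is default deny: anything not listed is rejected, so a new vendor endpoint never becomes reachable by accident.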

| Control | What it prevents | What it doesn't prevent |
| --- | --- | --- |
| Network segmentation (ZTNA) | Agent calling unauthorized vendors (e.g., a billing agent connecting to an unrelated internal service) | Agent calling POST /v1/refunds when it should only call POST /v1/payment_intents |
| Stripe Restricted Key scope | Agent calling broad categories of endpoints (e.g., no Payout Write permission = no wire transfers) | Agent calling POST /v1/payment_intents 5,000 times; no per-day dollar cap; no mid-run revoke |
| Vault key endpoint allowlist | Agent calling any endpoint not in the explicit allowlist; blocks at the proxy before the vendor receives the request | Nothing within the allowed envelope; the pre-call spend cap provides the backstop there |

Principle 3: Pre-call spend caps, not post-charge alerts

Traditional zero trust principle: continuous validation — enforce policy on every request, not just at session start.

Agent equivalent: enforce a dollar cap before each vendor call, not after the billing cycle aggregates charges.

Post-charge billing alerts (Stripe Dashboard budget alerts, AWS Budgets alerts, Twilio spend threshold notifications) fire after the vendor has already recorded the transaction. Take a $1,000/day billing alert whose notification does not arrive until 8 AM: a stuck agent that started at 11 PM the night before has nine hours to run and can generate $900 in charges before the alert fires. Even then, the operator still has to stop the agent manually, by which time more charges may be in flight.
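The exposure in that example is simply burn rate multiplied by notification delay. The sketch below reproduces the arithmetic under one assumption the paragraph implies but does not state: a steady burn of $100 per hour, which is what $900 over the nine hours from 11 PM to 8 AM works out to. The function name and dates are illustrative.

```python
from datetime import datetime


def exposure_before_alert(started: datetime, alerted: datetime,
                          usd_per_hour: float) -> float:
    """Charges accumulated between the agent starting and the alert arriving."""
    hours = (alerted - started).total_seconds() / 3600
    return hours * usd_per_hour


started = datetime(2024, 1, 1, 23, 0)  # stuck agent starts at 11 PM
alerted = datetime(2024, 1, 2, 8, 0)   # alert arrives at 8 AM
print(exposure_before_alert(started, alerted, usd_per_hour=100.0))  # 900.0
```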

A pre-call spend cap fires before the request reaches the vendor. The proxy checks the cumulative spend for the vault key against the cap; if the next call would exceed it, the proxy returns 429 and the vendor API never receives the request. No charge is applied, no money moves. The cap fires in microseconds, not hours.
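A minimal sketch of that check, assuming the proxy can determine each call's dollar amount before forwarding it (for a payment intent, from the request body). The `SpendCap` class and its method names are hypothetical; a production proxy would also need an atomic counter shared across instances and a way to release reserved spend when the vendor call fails.

```python
from decimal import Decimal


class SpendCap:
    """Hypothetical per-key spend tracker checked before each forward."""

    def __init__(self, cap_usd: Decimal) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = Decimal("0")

    def allow(self, amount_usd: Decimal) -> bool:
        # Reject if this call would push cumulative spend past the cap.
        if self.spent_usd + amount_usd > self.cap_usd:
            return False  # proxy would answer 429; the vendor is never called
        # Reserve the amount up front so later calls see the updated total.
        self.spent_usd += amount_usd
        return True


cap = SpendCap(Decimal("500"))
for amount in (Decimal("200"), Decimal("250"), Decimal("100"), Decimal("50")):
    verdict = "forward" if cap.allow(amount) else "429"
    print(amount, verdict)
```

With a $500 cap, the third call is refused because it would bring the total to $550, while the fourth still fits exactly. `Decimal` is used rather than `float` so that currency sums stay exact.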

Principle 4: Per-call audit with agent context, not just access logs

Traditional zero trust principle: assume breach — log everything so you can investigate after an incident.

Agent equivalent: log every vendor API call with agent run context (run ID, agent name, policy verdict, parsed cost) so you can reconstruct what an agent did and what it cost, not just that a request was made.

Standard access logs (network flow logs, API gateway request logs) record that a request was made: source IP, target endpoint, timestamp, HTTP status code. They don't record which agent run triggered the request, what the vendor charged, whether the call was within policy, or how this call relates to a broader agent operation. Post-incident investigation requires manually correlating API gateway logs with orchestration system run histories — two systems with no shared identifier.

A proxy audit log with agent context answers post-incident questions directly:

-- What did run X do and what did it cost?
SELECT vendor, endpoint, cost_usd, vendor_txn_id, called_at, policy_verdict
FROM audit_log
WHERE agent_run_id = 'billing/run_abc123'
ORDER BY called_at;

-- What called Stripe yesterday and how much?
-- (date('now', '-1 day') is SQLite syntax; adjust for other databases)
SELECT agent_name, SUM(cost_usd) AS total_usd, COUNT(*) AS calls
FROM audit_log
WHERE vendor = 'stripe' AND date(called_at) = date('now', '-1 day')
GROUP BY agent_name
ORDER BY total_usd DESC;
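For readers who want to try these queries, here is a small sketch that builds an `audit_log` table in an in-memory SQLite database and runs the per-run query. The column names are taken from the queries above; the column types, the verdict values ("allow", "deny_endpoint"), and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE audit_log (
        agent_run_id   TEXT NOT NULL,
        agent_name     TEXT NOT NULL,
        vendor         TEXT NOT NULL,
        endpoint       TEXT NOT NULL,
        cost_usd       REAL NOT NULL,
        vendor_txn_id  TEXT,           -- NULL when the proxy blocked the call
        called_at      TEXT NOT NULL,  -- ISO 8601 timestamp
        policy_verdict TEXT NOT NULL
    )
""")

# One forwarded call and one blocked call from the same (made-up) run.
rows = [
    ("billing/run_abc123", "billing", "stripe", "POST /v1/payment_intents",
     49.0, "pi_example_1", "2024-01-01T09:00:00Z", "allow"),
    ("billing/run_abc123", "billing", "stripe", "POST /v1/refunds",
     0.0, None, "2024-01-01T09:00:05Z", "deny_endpoint"),
]
conn.executemany("INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)


def run_history(conn: sqlite3.Connection, run_id: str) -> list:
    """Every call a run made, in order, with cost and policy verdict."""
    return conn.execute(
        "SELECT vendor, endpoint, cost_usd, vendor_txn_id, called_at, policy_verdict "
        "FROM audit_log WHERE agent_run_id = ? ORDER BY called_at",
        (run_id,),
    ).fetchall()


for row in run_history(conn, "billing/run_abc123"):
    print(row)
```

Recording blocked calls alongside forwarded ones is what lets a reviewer see attempts, not only successes.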

The gap between enterprise zero trust and agent zero trust

Enterprise zero trust products (Zscaler, Cloudflare Access, Palo Alto Prisma Access, Okta) provide ZTNA — identity-verified, device-checked access to internal network resources. They are the right tool for controlling which services your agent's deployment environment can reach at the network layer.

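But they cannot provide per-call spend caps on a vendor API, endpoint allowlists within a single vendor hostname such as api.stripe.com, per-run credential expiry and revoke, or a per-call audit trail that carries agent run context.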

The two layers are complementary: ZTNA controls which network resources your agent reaches; vault key proxy controls what the agent can do at those resources and how much it can spend. Neither replaces the other. A complete agent zero-trust posture needs both.

How Keybrake implements agent zero trust

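Keybrake is the proxy that enforces the four agent zero-trust principles at the vendor API layer: per-run vault keys, endpoint allowlists, pre-call spend caps, and per-call audit with agent context.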


Related questions

Does zero trust for AI agents replace Stripe Restricted Keys?

No — they address different parts of the credential problem. Stripe Restricted Keys enforce permission scope: which Stripe API resource types the key can access (Charges, Refunds, Customers, Payouts, etc.). This is the first dimension of least privilege and is genuinely useful. Vault keys on a proxy enforce three additional dimensions that Stripe Restricted Keys don't: per-run spend cap (dollar amount per run, not per day across the account), per-run expiry (key expires when the run ends, not on a rotation schedule), and sub-second per-run revoke (one API call stops a specific run without rotating the underlying Stripe key). Use both: Stripe Restricted Key for permission-level scoping, vault key proxy for run-level spend and lifecycle control.

Is a service account with IAM policies the zero-trust answer for agents?

IAM policies (AWS IAM, GCP IAM, Azure RBAC) control which cloud services your agent's service account can call — S3, BigQuery, Pub/Sub, etc. They're the right tool for cloud API access and satisfy network-layer zero trust. But for third-party vendor APIs (Stripe, Twilio, Resend, Shopify), IAM policies have no reach — those APIs use their own authentication (Stripe API keys, Twilio Auth Tokens). You can store those vendor API keys in Secrets Manager (controlled by IAM), but IAM has no mechanism to enforce per-call spend caps on Stripe, endpoint allowlists within api.stripe.com, or per-run revoke without rotating the underlying Stripe key. IAM handles the cloud infrastructure layer; vault key proxy handles the vendor API layer.

How do prompt injection attacks interact with zero-trust agent credentials?

Prompt injection is the threat model where malicious content in an agent's inputs causes it to take unintended actions. Zero-trust credential design limits the blast radius: if an attacker injects a prompt that causes a billing agent to attempt a large unauthorized transfer, the vault key's endpoint allowlist blocks POST /v1/transfers (if only POST /v1/payment_intents is allowed), the spend cap stops it if the transfer amount would exceed the daily_usd_cap, and the audit log records the attempt with the full request context for incident review. Zero trust doesn't prevent the injection (that's a prompt safety / input sanitization problem), but it limits what the injected instruction can actually accomplish at the vendor API level.
