AI agent API key lifecycle: issuance, enforcement, expiration, and revocation for autonomous agents

A shared API key has no meaningful lifecycle. It's created once, distributed to every process that needs it, and rarely expires — sometimes lasting years until a security incident forces a rotation. For an autonomous AI agent calling Stripe, Twilio, or Resend, this non-lifecycle creates a class of governance failures that grow more costly as agents become more capable: a key with no TTL can be used by a stuck loop for days; a key with no scope can be used for operations far outside the agent's intended task; a key that can't be selectively revoked requires taking down the entire system to stop a misbehaving agent. Per-run vault keys introduce a real credential lifecycle aligned to the agent run's lifecycle — four phases that prevent each failure mode.

TL;DR

A vault key's lifecycle has four phases: issuance (create a key with policy before the run starts), active enforcement (every proxied call checks the policy atomically), expiration (auto-expire via TTL when the run's expected window ends), and revocation (emergency kill switch that voids the key immediately). These phases map directly to the agent run lifecycle: setup → execution → teardown → emergency. A shared API key has none of these phases — it's always in the "active with no policy" state, regardless of what the agent is currently doing.

Why shared keys fail the lifecycle model

| Lifecycle property | Shared API key | Per-run vault key |
|---|---|---|
| Issuance scope | Created once for the entire platform; shared across all agents, environments, and runs | Created per run with a policy that names the vendor, endpoint allowlist, spend cap, and TTL |
| Active enforcement | None: the vendor accepts any request the key is authorized for, with no per-run limits | Every proxied call checks remaining cap and allowed endpoints before forwarding |
| Expiration | Rarely; manual rotation only, usually triggered by an incident | Automatic via TTL: the key becomes invalid after the specified duration even if unused |
| Revocation | Revokes access for all agents simultaneously, an operational risk during an emergency | Per-run revocation: kill one run's access without affecting others |
| Attribution | All runs share one key, so the vendor audit log can't distinguish which run made which call | Key label = run ID, so every audit log entry is attributed to the specific run |

Phase 1: Issuance

A vault key is issued at the start of an agent run, before any vendor API calls are made. Issuance attaches a policy to the key that governs every subsequent request the key makes. The policy has four components:

  1. Vendor — which SaaS API this key proxies. A key issued for Stripe cannot be used against Twilio; each vendor has a distinct proxy namespace.
  2. Allowed endpoints — the specific API paths the agent is permitted to call. A billing agent gets ["/v1/payment_intents", "/v1/payment_intents/*"]; an account lookup agent gets ["/v1/customers", "/v1/customers/*"]. Requests to endpoints outside this list are blocked at the proxy with 403.
  3. Spend cap — the maximum dollar amount this key may cause to be charged in its lifetime. The cap is checked atomically before each request is forwarded; the request is blocked with 429 if the cap would be exceeded.
  4. TTL (expires_in) — how long the key is valid from the time of issuance. The TTL is the expected maximum duration of the agent run, with a buffer. A nightly batch job that typically takes 2 hours might get a 4-hour TTL.
// Phase 1: Issuance, at the start of an agent run
// Assumes `keybrake` is an already-initialized client and `runId` identifies the current run.
const key = await keybrake.keys.create({
  label: `billing-run-${runId}-${new Date().toISOString()}`,
  vendor: "stripe",
  allowed_endpoints: [
    "/v1/payment_intents",
    "/v1/payment_intents/*"
  ],
  daily_usd_cap: 10000,
  expires_in: "4h"  // TTL from issuance
});

// The vault key token is the only credential the agent code receives
const agentConfig = { vaultKey: key.token };

Phase 2: Active enforcement

During the agent run, every vendor API call passes through the Keybrake proxy. The proxy performs three checks on each request before forwarding it:

  1. Key validity — is the key still active, not expired, not revoked? If not, return 401.
  2. Endpoint check — does the requested path match the key's allowed_endpoints policy? If not, return 403 with code: endpoint_not_allowed.
  3. Spend check — would the request cost (parsed from the request body) push the cumulative spend past the daily_usd_cap? If yes, return 429 with code: cap_exhausted. The check is atomic — a compare-and-swap on the spend counter prevents parallel requests from racing past the cap.
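The three checks above can be sketched as a small pure function. This is an illustrative in-memory model, not Keybrake's implementation: the field names on `key` and `request`, the return shape, and the glob semantics (a trailing `/*` matches any sub-path) are assumptions. Only the status codes and `code` strings come from the text.

```javascript
// Hypothetical model of the proxy's per-request checks.
// Assumed glob semantics: "/v1/x" matches exactly; "/v1/x/*" matches any sub-path.
function endpointAllowed(path, allowedEndpoints) {
  return allowedEndpoints.some((pattern) => {
    if (pattern.endsWith("/*")) {
      const prefix = pattern.slice(0, -1); // keep the trailing "/"
      return path.startsWith(prefix) && path.length > prefix.length;
    }
    return path === pattern;
  });
}

function checkRequest(key, request, nowMs) {
  // 1. Key validity: revoked or expired keys are rejected with 401.
  if (key.revoked) return { allowed: false, status: 401, code: "key_revoked" };
  if (nowMs >= key.expiresAtMs) return { allowed: false, status: 401, code: "key_expired" };

  // 2. Endpoint check: the path must match the allowlist.
  if (!endpointAllowed(request.path, key.allowedEndpoints)) {
    return { allowed: false, status: 403, code: "endpoint_not_allowed" };
  }

  // 3. Spend check: block if this request would push spend past the cap.
  //    (Not atomic here; a real proxy would need an atomic update.)
  if (key.spentUsd + request.costUsd > key.capUsd) {
    return { allowed: false, status: 429, code: "cap_exhausted" };
  }

  return { allowed: true };
}
```

The order matters in this sketch: validity is checked first so that an expired or revoked key never reveals whether an endpoint or amount would have been permitted.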

If all three checks pass, the proxy forwards the request to the vendor with the real vendor API key, logs the request (including cost, endpoint, and vault key label), and returns the vendor's response to the agent. The agent code handles vendor responses exactly as it would without the proxy — the response format is unchanged.
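Before a request is forwarded, the spend check has to be atomic so that parallel requests cannot both pass it. A minimal sketch of the compare-and-swap idea, using JavaScript's built-in `Atomics` on a shared integer counter; the function name `tryReserve` and the choice to track spend in integer cents are assumptions for illustration, not a description of how the proxy stores its counter.

```javascript
// Reserve `costCents` against a shared spend counter, or refuse if the cap
// would be exceeded. `counter` is an Int32Array over a SharedArrayBuffer.
function tryReserve(counter, costCents, capCents) {
  for (;;) {
    const current = Atomics.load(counter, 0);
    const next = current + costCents;
    if (next > capCents) return false; // would exceed the cap: block the request

    // Write only if no other writer changed the counter since we read it.
    if (Atomics.compareExchange(counter, 0, current, next) === current) {
      return true;
    }
    // Another writer won the race; re-read and try again.
  }
}

// One counter slot per vault key (illustrative).
const spendCounter = new Int32Array(new SharedArrayBuffer(4));
```

With a cap of 10000 cents, `tryReserve(spendCounter, 6000, 10000)` succeeds and a following `tryReserve(spendCounter, 5000, 10000)` is refused, because the second reservation would bring the total to 11000.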

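The enforcement phase handles the most common agent failure modes: the spend check stops a stuck loop from charging past its cap, and the endpoint check blocks operations outside the agent's intended task.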

Phase 3: Expiration

A vault key expires automatically when its TTL elapses from the time of issuance. After expiration, any request using the key receives 401 with code: key_expired. The agent run should handle this as a terminal condition — if the agent is still running when the key expires, it means the run exceeded its expected duration, which is itself a signal that something is wrong.
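One way an agent could treat a 401 from the proxy as terminal is to convert it into a dedicated error that stops the run instead of triggering a retry. The sketch below assumes each step returns an object shaped like `{ status, body: { code } }`; the error class and helper names are hypothetical.

```javascript
// Raised when the proxy reports that the vault key is expired or revoked.
class VaultKeyTerminatedError extends Error {
  constructor(code) {
    super(`vault key is no longer valid (${code})`);
    this.name = "VaultKeyTerminatedError";
    this.code = code;
  }
}

// Pass the response through unless it is a 401, which is terminal.
function ensureKeyValid(response) {
  if (response.status === 401) {
    const code = response.body && response.body.code ? response.body.code : "unknown";
    throw new VaultKeyTerminatedError(code);
  }
  return response;
}

// Run steps in order; stop at the first terminal key error, with no retry.
function runSteps(steps) {
  const completed = [];
  for (const step of steps) {
    try {
      completed.push(ensureKeyValid(step()).status);
    } catch (err) {
      if (err instanceof VaultKeyTerminatedError) {
        return { completed, stoppedBy: err.code };
      }
      throw err; // unrelated failures are not swallowed
    }
  }
  return { completed, stoppedBy: null };
}
```

Reporting `stoppedBy` separately from ordinary failures lets the orchestrator distinguish "the run overran its window" from "a vendor call failed".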

TTL design patterns for different agent run types:

| Agent run type | Typical duration | Recommended TTL | Rationale |
|---|---|---|---|
| Nightly billing batch | 1–3 hours | 6 hours | 2× expected duration; rare retries push toward 4h, 6h prevents stuck reruns from running past business hours |
| Per-user LLM conversation turn | 5–30 seconds | 5 minutes | Long enough to handle slow LLM inference + tool call latency; short enough that a leaked key is useless within minutes |
| Multi-step workflow (Temporal, Prefect) | Minutes to hours | Workflow SLA + 20% | Match the workflow's maximum expected duration from SLA; the proxy key expiration becomes an external timeout on the workflow itself |
| Event-driven agent (webhook handler) | Seconds to 2 minutes | 10 minutes | Conservative; handles retry storms but limits blast radius if a webhook is replayed incorrectly |
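The TTL recommendations above amount to "expected maximum duration plus a buffer". A small helper makes that arithmetic explicit; the function name and the percent-based buffer parameter are illustrative assumptions, and the result is in seconds rather than in the `expires_in` string format.

```javascript
// TTL in seconds = expected maximum run duration plus a percentage buffer.
// Integer arithmetic on the percentage avoids floating-point drift.
function ttlSeconds(expectedMaxSeconds, bufferPercent) {
  if (!(expectedMaxSeconds > 0) || !(bufferPercent >= 0)) {
    throw new RangeError("duration must be positive and buffer non-negative");
  }
  return Math.ceil((expectedMaxSeconds * (100 + bufferPercent)) / 100);
}

// Nightly batch: 3-hour worst case with a 100% buffer (2x) gives 6 hours.
const nightlyBatchTtl = ttlSeconds(3 * 3600, 100);

// Workflow with a 1-hour SLA plus 20% gives 72 minutes.
const workflowTtl = ttlSeconds(3600, 20);
```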

Phase 4: Revocation

Revocation is the emergency phase, triggered when an agent run needs to be stopped before its TTL expires, for example when a run is misbehaving or its key may have leaked.

// Phase 4: Revocation — emergency stop for a specific run
await keybrake.keys.revoke(key.id);

// Or revoke all keys with a specific label prefix (emergency stop for an agent type)
await keybrake.keys.revokeByLabelPrefix("billing-run-");

// After revocation, any request using the key returns:
// HTTP 401 { "code": "key_revoked", "message": "This vault key has been revoked." }

Revocation is immediate — there is no propagation delay. The revocation is recorded in the audit log with a timestamp and (optionally) the reason string provided in the revocation call. All in-flight requests that arrive after the revocation timestamp are rejected; requests that were forwarded to the vendor before revocation were already executed and are not reversed (Stripe idempotency keys handle the retry-deduplication concern separately).
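To make the difference between revoking one run and revoking an agent type concrete, here is a toy in-memory model. It borrows the method names `revoke` and `revokeByLabelPrefix` from the snippet above, but the class, its storage, the explicit `nowMs` argument, and the audit entry shape are assumptions made for illustration only.

```javascript
// Toy key store: records revocation time and an audit entry per revoked key.
class InMemoryKeyStore {
  constructor() {
    this.keys = new Map(); // id -> { label, revokedAtMs }
    this.auditLog = [];
  }

  add(id, label) {
    this.keys.set(id, { label, revokedAtMs: null });
  }

  // Revoke a single key. Returns false if unknown or already revoked.
  revoke(id, nowMs, reason) {
    const key = this.keys.get(id);
    if (!key || key.revokedAtMs !== null) return false;
    key.revokedAtMs = nowMs;
    this.auditLog.push({ event: "key_revoked", id, atMs: nowMs, reason: reason ?? null });
    return true;
  }

  // Revoke every key whose label starts with `prefix`. Returns the count.
  revokeByLabelPrefix(prefix, nowMs, reason) {
    let count = 0;
    for (const [id, key] of this.keys) {
      if (key.label.startsWith(prefix) && this.revoke(id, nowMs, reason)) count++;
    }
    return count;
  }

  isRevoked(id) {
    const key = this.keys.get(id);
    return key !== undefined && key.revokedAtMs !== null;
  }
}
```

Revoking by ID leaves every other run untouched, while the prefix form stops all runs of one agent type and still leaves keys with other label prefixes valid.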

Lifecycle alignment with orchestration frameworks

The four lifecycle phases map cleanly to lifecycle hooks in major agent orchestration frameworks:

| Framework | Issuance hook | Revocation hook |
|---|---|---|
| Temporal Workflow | workflow.start() → issue vault key; pass to activities via signal or side-effect | workflow.cancel() or compensation activity → revoke vault key |
| Prefect Flow | @flow body before first task → issue vault key; store in Prefect Variable | Flow on_failure hook → revoke vault key |
| LangGraph StateGraph | Entry node → issue vault key; add to state | END node cleanup or exception handler → revoke vault key |
| Celery task chain | Task setup before chord → issue vault key; pass as task arg | on_failure callback → revoke vault key |

Related questions

What happens to in-flight requests when a key expires or is revoked mid-run?

Requests that have already been forwarded to the vendor and received a response are not affected by expiration or revocation — the charge has been created and logged. Only requests that arrive at the proxy after the expiration or revocation timestamp are rejected. If an agent makes a request and the key expires between the moment the request is sent and the moment the proxy receives it, the proxy rejects it. This means there is a small window (network round-trip time, typically under 100ms) during which a request sent just before expiration may arrive after expiration and be rejected. Agents should handle 401 responses as terminal — not retry with the same key — which is consistent with how most HTTP clients handle authentication failures.

Can I extend a key's TTL after issuance?

Yes — vault keys support a TTL extension via PATCH /keys/{key_id} with a new expires_at timestamp. Extensions must be to a later time than the current expiration; a key cannot be extended after it has already expired (you would issue a new key instead). Extensions are logged in the audit trail. A common pattern is to extend the key's TTL when the agent emits a heartbeat — confirming it's still making progress and not stuck — and allow it to expire if the heartbeat stops.
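The extension rules can be captured in a small planning function that a heartbeat handler might call. This is a sketch under assumptions: the function name, the returned request descriptor, and the ISO 8601 format for `expires_at` are not confirmed by the text; only the `PATCH /keys/{key_id}` route and the two rules (must be later than the current expiration, cannot extend an expired key) come from it.

```javascript
// Decide whether a heartbeat should extend the key, and build the request.
// Returns null when no extension is possible or needed.
function planExtension(keyId, currentExpiresAtMs, nowMs, extendByMs) {
  // An expired key cannot be extended; issue a new key instead.
  if (nowMs >= currentExpiresAtMs) return null;

  const newExpiresAtMs = nowMs + extendByMs;
  // Extensions must move the expiration later, never earlier.
  if (newExpiresAtMs <= currentExpiresAtMs) return null;

  return {
    method: "PATCH",
    path: `/keys/${keyId}`,
    body: { expires_at: new Date(newExpiresAtMs).toISOString() }, // format assumed
  };
}
```

If heartbeats stop, no further extensions are planned and the key lapses at its last `expires_at`, which is the intended fail-closed behavior.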

Should I revoke keys at the end of a successful run, or let them expire?

Explicit revocation at the end of a successful run is best practice even if the key would expire shortly. The window between run completion and TTL expiration is a window where the key is valid but the run is done — a leaked or captured key could be used during this window. Revoking immediately on completion closes this window. The practical overhead is one API call per run. For very short-lived keys (under 5 minutes), letting them expire is reasonable. For keys with TTLs of hours, explicit revocation is recommended.
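Revoke-on-completion is easy to forget on error paths, so one option is to wrap the run in a helper that always revokes. The sketch assumes a client exposing `keys.create(policy)` and `keys.revoke(id)` as in the snippets earlier in this article; `withVaultKey` itself is a hypothetical name.

```javascript
// Issue a key, run the agent with its token, and always revoke afterwards,
// whether the run succeeds or throws.
async function withVaultKey(client, policy, run) {
  const key = await client.keys.create(policy);
  try {
    return await run(key.token);
  } finally {
    await client.keys.revoke(key.id);
  }
}
```

The TTL still matters with this pattern: if the process crashes before `finally` runs, expiration is what eventually closes the window.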

How do I manage vault keys for long-running Temporal workflows that span multiple days?

For workflows that span multiple days, issuing a single vault key with a multi-day TTL creates a key with a longer blast radius than necessary. The recommended pattern is to issue short-lived vault keys per activity rather than per workflow — each activity that calls a vendor API gets its own vault key with a TTL equal to the activity's expected duration. The workflow state machine issues and revokes vault keys as activities start and complete. This way, at any given moment, only the currently-executing activity holds a valid vault key, minimizing the blast radius of a compromised or misbehaving key.
