
Cloudflare Workers AI agent API key management: vault keys for edge AI workflows

Cloudflare Workers is increasingly the runtime for edge AI agent logic: Workers AI for inference, Cloudflare Workflows for multi-step orchestration, AI Gateway for LLM routing, and Durable Objects for agent state. When these Workers need to call Stripe, Twilio, or Resend, the API key story is the same as every other serverless runtime: the key lives in Workers Secrets, shared across all invocations of the Worker, with no per-invocation spend cap or endpoint scope. Vault keys fit naturally into Cloudflare's architecture: store the Keybrake API token in Workers Secrets, issue per-invocation vault keys at the start of each workflow step, and the real Stripe secret never travels to the edge at all.

TL;DR

Store the Keybrake API token (not the Stripe secret) in a Cloudflare Workers Secret. At the start of each Worker invocation or Workflow step, call the Keybrake API to issue a vault key scoped to the workflow's spend cap and allowed endpoints. Use the vault key token as the Authorization: Bearer header when proxying vendor calls through https://proxy.keybrake.com/stripe/.... The real Stripe secret lives in Keybrake's infrastructure — not in Workers Secrets, not in the Worker's memory, and not in any fetch request that could be logged at the edge.

Cloudflare's AI agent primitives and where credentials fit

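Cloudflare has assembled a set of primitives for building AI agents at the edge: Workers AI for inference, Workflows for multi-step orchestration, AI Gateway for LLM routing, and Durable Objects for agent state.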

AI Gateway handles LLM API key governance. Keybrake handles vendor SaaS API key governance (Stripe, Twilio, Resend). They're complementary: an AI agent can route its LLM calls through AI Gateway and its Stripe calls through Keybrake simultaneously.

The vault key pattern for Cloudflare Workers

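// Worker: issue vault key, proxy Stripe call, revoke key
interface Env {
  KEYBRAKE_TOKEN: string; // Keybrake API token, stored as a Workers Secret
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {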
    const body = await request.json() as { customerId: string; amountCents: number };
    const workflowId = request.headers.get("X-Workflow-Id") ?? crypto.randomUUID();

    // Step 1: Issue vault key using the Keybrake token from Workers Secrets
    const keyRes = await fetch("https://api.keybrake.com/v1/keys", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${env.KEYBRAKE_TOKEN}`,  // from Workers Secret
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        label: `workers-charge-${workflowId}`,
        vendor: "stripe",
        allowed_endpoints: [
          "/v1/payment_intents",
          "/v1/payment_intents/*"
        ],
        daily_usd_cap: 500,
        expires_in: "5m"
      }),
    });

    if (!keyRes.ok) {
      return new Response(JSON.stringify({ error: "vault_key_issuance_failed" }), {
        status: 500, headers: { "Content-Type": "application/json" }
      });
    }

    const { token: vaultKey, id: keyId } = await keyRes.json() as { token: string; id: string };

    try {
      // Step 2: Proxy Stripe call through Keybrake using the vault key
      const stripeRes = await fetch(
        "https://proxy.keybrake.com/stripe/v1/payment_intents",
        {
          method: "POST",
          headers: {
            "Authorization": `Bearer ${vaultKey}`,  // vault key, not Stripe secret
            "Content-Type": "application/json",
          },
          body: JSON.stringify({
            amount: body.amountCents,
            currency: "usd",
            customer: body.customerId,
          }),
        }
      );

      // Read the body once: a Response body cannot be consumed twice
      const payload = await stripeRes.json() as { code?: string };

      // Map an exhausted spend cap to 402 so callers can tell it apart from rate limiting
      if (stripeRes.status === 429 && payload.code === "cap_exhausted") {
        return new Response(
          JSON.stringify({ error: "spend_cap_exceeded" }),
          { status: 402, headers: { "Content-Type": "application/json" } }
        );
      }

      return new Response(JSON.stringify(payload), {
        status: stripeRes.status,
        headers: { "Content-Type": "application/json" }
      });

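    } finally {
      // Step 3: Revoke vault key. Best effort: a failed revoke is swallowed so it
      // cannot replace the response, and the 5-minute TTL remains the safety net.
      await fetch(`https://api.keybrake.com/v1/keys/${keyId}`, {
        method: "DELETE",
        headers: { "Authorization": `Bearer ${env.KEYBRAKE_TOKEN}` },
      }).catch(() => undefined);
    }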
  }
};

Vault keys in Cloudflare Workflows

Cloudflare Workflows provides durable multi-step orchestration with automatic retry. Each workflow step is a distinct execution unit — vault keys should be issued per step rather than per workflow, because a workflow can span minutes to hours and individual steps are independently retried:

import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers';

interface BillingWorkflowParams {
  customerId: string;
  amountCents: number;
  runId: string;
}

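interface Env {
  KEYBRAKE_TOKEN: string; // Keybrake API token, stored as a Workers Secret
}

export class BillingWorkflow extends WorkflowEntrypoint<Env, BillingWorkflowParams> {
  async run(event: WorkflowEvent<BillingWorkflowParams>, step: WorkflowStep) {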
    const { customerId, amountCents, runId } = event.payload;

    // Issue a vault key scoped to this workflow step (not the entire workflow)
    const vaultKey = await step.do("issue-vault-key", async () => {
      const res = await fetch("https://api.keybrake.com/v1/keys", {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${this.env.KEYBRAKE_TOKEN}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          label: `workflow-${runId}-charge-step`,
          vendor: "stripe",
          allowed_endpoints: ["/v1/payment_intents", "/v1/payment_intents/*"],
          daily_usd_cap: 1000,
          expires_in: "10m"  // Longer TTL for workflow steps with retry
        }),
      });
      // Throw on failure so the Workflows runtime retries this step
      if (!res.ok) throw new Error(`Vault key issuance failed: ${res.status}`);
      const key = await res.json() as { token: string; id: string };
      return key;
    });

    // Charge the customer using the per-step vault key
    const payment = await step.do("charge-customer", async () => {
      const res = await fetch("https://proxy.keybrake.com/stripe/v1/payment_intents", {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${vaultKey.token}`,  // vault key, not Stripe secret
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ amount: amountCents, currency: "usd", customer: customerId }),
      });
      return await res.json();
    });

    // Revoke the step's vault key
    await step.do("revoke-vault-key", async () => {
      await fetch(`https://api.keybrake.com/v1/keys/${vaultKey.id}`, {
        method: "DELETE",
        headers: { "Authorization": `Bearer ${this.env.KEYBRAKE_TOKEN}` },
      });
    });

    return payment;
  }
}

Cloudflare AI Gateway vs. Keybrake: complementary, not competing

| Property | Cloudflare AI Gateway | Keybrake |
| --- | --- | --- |
| What it proxies | LLM providers: OpenAI, Anthropic, Mistral, Cohere, Workers AI | Vendor SaaS APIs: Stripe, Twilio, Resend |
| Spend enforcement | LLM token cost caps | Dollar spend caps on vendor charges (Stripe amount, Twilio SMS price) |
| Audit log | LLM request/response logs with token counts | Vendor API call logs with dollar cost per call |
| Per-execution scoping | Rate limiting by AI Gateway key, not per-workflow-run | Per-run vault keys with individual spend caps and endpoint allowlists |
| Revocation | No per-execution revocation | Per-vault-key revocation; kills one run without affecting others |

How they interact: Both can be active simultaneously. AI Gateway routes LLM calls, and Keybrake proxies vendor SaaS calls. A Cloudflare Workflow step can call Workers AI via AI Gateway and Stripe via Keybrake in the same step.

Durable Objects for vault key state caching

For high-throughput Workers that process many short requests (e.g., a webhook handler receiving thousands of events per minute), issuing a vault key per-request adds overhead. Durable Objects can cache vault keys for a session or user, reducing issuance to once per session rather than once per request:

// Durable Object: caches vault key for a user session
import { DurableObject } from "cloudflare:workers";

export class AgentSession extends DurableObject<Env> {
  private vaultKey: { token: string; id: string; expiresAt: number } | null = null;

  async getVaultKey(env: Env): Promise<string> {
    const now = Date.now();

    // Reuse cached key if it expires more than 2 minutes from now
    if (this.vaultKey && this.vaultKey.expiresAt > now + 120_000) {
      return this.vaultKey.token;
    }

    // Issue new vault key (15-minute session key, reused across requests)
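    const res = await fetch("https://api.keybrake.com/v1/keys", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${env.KEYBRAKE_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        label: `session-${this.ctx.id}`,
        vendor: "stripe",
        allowed_endpoints: ["/v1/payment_intents", "/v1/payment_intents/*"],
        daily_usd_cap: 200,  // Per-session cap
        expires_in: "15m"
      }),
    });
    if (!res.ok) throw new Error(`Vault key issuance failed: ${res.status}`);

    // The issuance response used elsewhere carries only token and id, so the
    // expiry is derived locally from the requested 15-minute TTL.
    const key = await res.json() as { token: string; id: string };
    this.vaultKey = { ...key, expiresAt: now + 15 * 60_000 };
    return this.vaultKey.token;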
  }
}


Related questions

Does Cloudflare's edge network add latency to requests going to proxy.keybrake.com?

Cloudflare Workers run on Cloudflare's edge network in 200+ cities. The fetch from a Worker to proxy.keybrake.com exits Cloudflare's network and travels to Keybrake's infrastructure as a standard HTTPS request. If Keybrake's proxy is hosted on a major cloud provider (GCP, AWS, or Fly.io), the latency from a Cloudflare PoP to the proxy is typically 10–50ms — comparable to any cross-cloud HTTPS call. Keybrake's proxy then forwards to Stripe at similar latency. The total round-trip overhead versus direct-to-Stripe is approximately one extra HTTPS hop, typically 20–80ms depending on geographic proximity.

Can I use Cloudflare KV to cache vault keys across Workers invocations?

Yes. Cloudflare KV is appropriate for caching vault keys that are valid for multiple minutes and shared across many Worker invocations for the same user session or workflow run. Store the vault key token and its expiration timestamp in KV with the user session ID or workflow run ID as the key. Set the KV entry's TTL to match the vault key's TTL minus a safety buffer (e.g., if the vault key expires in 15 minutes, set KV TTL to 12 minutes). On each Worker invocation, read from KV first and only call the Keybrake API if no cached key exists. This reduces vault key issuance to once per session rather than once per request.
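
The read-through flow described above might look like the following minimal sketch. The `KVLike` interface, the `getCachedVaultKey` function, and the `issueKey` callback are hypothetical names introduced for illustration; `KVLike` covers only the `get` and `put` calls used here, which a Workers KV binding should satisfy, and `issueKey` stands in for the POST to the Keybrake API shown earlier.

```typescript
// Hypothetical structural type: just the two KV methods this sketch needs.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

interface VaultKey {
  token: string;
  id: string;
}

// Returns the cached vault key for a session, issuing a new one only on a cache miss.
// issueKey is an assumed helper that calls the Keybrake API and returns a fresh key.
async function getCachedVaultKey(
  kv: KVLike,
  sessionId: string,
  issueKey: () => Promise<VaultKey>,
  vaultKeyTtlSeconds: number = 15 * 60,
  safetyBufferSeconds: number = 3 * 60,
): Promise<VaultKey> {
  const cacheKey = `vault-key:${sessionId}`;

  // Read from KV first
  const cached = await kv.get(cacheKey);
  if (cached !== null) {
    return JSON.parse(cached) as VaultKey;
  }

  // Cache miss: issue a new key and store it with a TTL shorter than the key's own
  const fresh = await issueKey();
  // KV does not accept an expirationTtl below 60 seconds
  const ttl = Math.max(60, vaultKeyTtlSeconds - safetyBufferSeconds);
  await kv.put(cacheKey, JSON.stringify(fresh), { expirationTtl: ttl });
  return fresh;
}
```

With the defaults, a 15-minute vault key is cached for 12 minutes. Because KV is eventually consistent, two invocations in different locations can both miss the cache and each issue a key, so "once per session" is a typical case rather than a guarantee.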

How does this work with Cloudflare Workers calling multiple vendor APIs (Stripe and Twilio) in the same invocation?

Issue a separate vault key for each vendor. A Workers invocation that needs to both charge a customer (Stripe) and send an SMS confirmation (Twilio) issues two vault keys: one with vendor: "stripe" and one with vendor: "twilio". Each key has its own spend cap and endpoint allowlist. The two key issuance calls can be made in parallel (Promise.all) to avoid sequential latency. Both keys are revoked in a finally block when the invocation completes. The audit log in Keybrake shows both the Stripe charge and the Twilio SMS as separate entries attributed to the same workflow run label.
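
As a rough sketch of that flow, the helper below issues both keys in parallel, runs the caller's work, and revokes whatever was issued. Everything named here is an assumption for illustration: the `withVendorKeys`, `issueVaultKey`, and `revokeVaultKey` helpers, the `FetchLike` type, the cap values, and the Twilio endpoint pattern (whether Keybrake accepts a wildcard in the middle of a path is not confirmed by this article). Each issuance is wrapped in `.catch` inside `Promise.all` so that a key issued successfully is still revoked when the other issuance fails.

```typescript
interface VaultKey {
  token: string;
  id: string;
}

interface KeyPolicy {
  vendor: "stripe" | "twilio";
  allowed_endpoints: string[];
  daily_usd_cap: number;
  expires_in: string;
}

// Minimal fetch shape so the sketch can run outside a Worker; pass `fetch` in a Worker.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical policies: caps and the Twilio path pattern are illustrative only.
const STRIPE_POLICY: KeyPolicy = {
  vendor: "stripe",
  allowed_endpoints: ["/v1/payment_intents", "/v1/payment_intents/*"],
  daily_usd_cap: 500,
  expires_in: "5m",
};
const TWILIO_POLICY: KeyPolicy = {
  vendor: "twilio",
  allowed_endpoints: ["/2010-04-01/Accounts/*/Messages.json"],
  daily_usd_cap: 5,
  expires_in: "5m",
};

async function issueVaultKey(
  fetchFn: FetchLike, apiToken: string, label: string, policy: KeyPolicy,
): Promise<VaultKey> {
  const res = await fetchFn("https://api.keybrake.com/v1/keys", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiToken}`, "Content-Type": "application/json" },
    body: JSON.stringify({ label, ...policy }),
  });
  if (!res.ok) throw new Error(`issuance failed for ${policy.vendor}: ${res.status}`);
  return (await res.json()) as VaultKey;
}

async function revokeVaultKey(fetchFn: FetchLike, apiToken: string, keyId: string): Promise<void> {
  await fetchFn(`https://api.keybrake.com/v1/keys/${keyId}`, {
    method: "DELETE",
    headers: { Authorization: `Bearer ${apiToken}` },
  });
}

// Issues one key per vendor under the same run label, runs `work`, then revokes both.
async function withVendorKeys<T>(
  fetchFn: FetchLike,
  apiToken: string,
  runLabel: string,
  work: (keys: { stripe: VaultKey; twilio: VaultKey }) => Promise<T>,
): Promise<T> {
  const [stripe, twilio] = await Promise.all([
    issueVaultKey(fetchFn, apiToken, runLabel, STRIPE_POLICY).catch(() => null),
    issueVaultKey(fetchFn, apiToken, runLabel, TWILIO_POLICY).catch(() => null),
  ]);
  try {
    if (stripe === null || twilio === null) throw new Error("vault_key_issuance_failed");
    return await work({ stripe, twilio });
  } finally {
    // Best-effort revocation of every key that was actually issued
    const issued = [stripe, twilio].filter((k): k is VaultKey => k !== null);
    await Promise.all(
      issued.map((k) => revokeVaultKey(fetchFn, apiToken, k.id).catch(() => undefined)),
    );
  }
}
```

Inside a Worker, a call would look roughly like `withVendorKeys(fetch, env.KEYBRAKE_TOKEN, runLabel, async ({ stripe, twilio }) => { ... })`, with the Stripe and Twilio requests made inside the callback using each key's token.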

What's the relationship between Cloudflare Workers Secrets and the Stripe secret key?

In the vault key pattern, the Stripe secret key is not stored in Workers Secrets at all. It lives in Keybrake's encrypted secrets vault and never travels to Cloudflare's edge. Workers Secrets holds the Keybrake API token — a credential that can issue short-lived vault keys. The security property is that a compromised Workers Secret (the Keybrake token) can only issue vault keys, not make direct Stripe API calls. An attacker with the Keybrake token could issue vault keys with the policies you've configured (vendor: stripe, allowed_endpoints, spend cap), but cannot call Stripe's API without routing through Keybrake's proxy — where every call is logged, capped, and can be blocked by revoking all outstanding keys via the Keybrake dashboard.
