Next.js AI agent API key management: the concurrency problem and how to fix it

When ten users each trigger an AI agent in your Next.js app at the same time, process.env.STRIPE_SECRET_KEY stops being a configuration detail and starts being a liability. Here is the vault key pattern for Route Handlers, Server Actions, and the Vercel AI SDK — and why you need a proxy layer too.

By Keybrake · June 6, 2026 · 9 min read

Next.js is one of the most common frameworks for building AI agent tool backends today. If you are using the OpenAI Agents SDK, LangChain.js, or the Vercel AI SDK, your agent's tool calls are very likely landing in Route Handlers under app/api/ or in Server Actions. Those handlers read process.env.STRIPE_SECRET_KEY, call the Stripe SDK, and return the result. For a single user running a single agent, this works fine, and you would never notice the problem.

The problem surfaces the moment two users trigger agents at the same time. Because process.env is process-global, every concurrent Route Handler invocation shares the same Stripe key. That means:

A stuck agent from user A will burn spend capacity until you manually rotate the key — which also kills user B's live agent session.
Your Stripe audit log shows charges from both users as coming from the same key, with no way to tell which agent was responsible for which call.
You cannot set a spend cap of "$50 per agent session" — the key has no concept of per-request budget.

This post walks through all three failure modes, shows you the vault key pattern that fixes them for Next.js specifically (including the Vercel AI SDK streaming edge case that trips most implementations), and explains the one thing vault keys alone cannot give you.

The three failure modes of `process.env` in a concurrent agent backend

1. The stuck loop drains the entire account

An AI agent gets into a retry loop: the LLM keeps calling the charge tool because the previous response was ambiguous, or the client code catches a 402 and retries immediately, or the agent interprets a timeout as "the charge might not have gone through, try again." Under a normal SaaS backend, a stuck loop might cause duplicate charges but will eventually hit rate limits or exhaust a pre-authorized budget. Under a shared env var key, the only ceiling is your Stripe account's global rate limit: 100 API calls per second by default. At a $20 average transaction size, that is up to $120,000 per minute (100 calls × 60 seconds × $20), and the rate limiter only holds the loop at that rate rather than stopping it.

You cannot stop this mid-run by revoking the key without also taking down every other agent session that is currently in flight — all of them are using the same process.env.STRIPE_SECRET_KEY.
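The arithmetic behind the $120,000 figure, and the contrast with a capped key, can be sketched as follows. This is a back-of-the-envelope calculation rather than a model of Stripe's rate limiter. The function names are illustrative, the 100 calls per second and $20 average are the figures quoted above, and the $50 cap is the per-session example from the introduction.

```typescript
// Worst-case spend for a runaway loop whose only ceiling is an API rate limit.
function worstCaseSpendPerMinute(callsPerSecond: number, avgChargeUsd: number): number {
  const callsPerMinute = callsPerSecond * 60;
  return callsPerMinute * avgChargeUsd;
}

// Number of whole charges that fit under a spend cap before the next one is refused.
function chargesAllowedUnderCap(capUsd: number, avgChargeUsd: number): number {
  return Math.floor(capUsd / avgChargeUsd);
}

const perMinute = worstCaseSpendPerMinute(100, 20);
console.log(`$${perMinute.toLocaleString("en-US")} per minute`); // prints "$120,000 per minute"

const allowed = chargesAllowedUnderCap(50, 20);
console.log(`${allowed} charges before a $50 cap is reached`); // prints "2 charges before a $50 cap is reached"
```

With a per-session cap, the same loop is bounded by the cap rather than by the account-wide rate limit.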

2. Audit entries are indistinguishable by user or agent

Stripe's Restricted Key audit log shows you which key made each request. When every request comes from the same key, the audit log is useless for answering "which agent charged this customer?" or "how much did user A's session spend today?" You are left reconstructing agent actions from your application logs — which are only reliable if you never have a deploy, pod restart, or log rotation at the wrong moment.

For regulated industries or any scenario where you need to demonstrate to a customer that "our agent did not make unauthorized charges," an indistinguishable audit trail is not just a debugging inconvenience. It is a liability.

3. Key rotation takes down all active sessions

When you detect that an agent is misbehaving, the nuclear option is rotating STRIPE_SECRET_KEY. In Vercel or any container-based deployment, this means updating the environment variable, triggering a redeploy or function restart, and waiting for the change to propagate. During that window, every in-flight agent request — from all users, all sessions — is dropped. Depending on your deployment configuration, active streaming responses will be interrupted mid-stream.

What you actually want is to revoke exactly one agent's credential, in under a second, without touching any other running session. That requires per-session credentials, not a shared env var.

Vault keys: the per-request credential model

A vault key is a short-lived, scoped token you issue at the start of each Route Handler request and revoke in the finally block. Keybrake's vault key API is two calls:

// Issue at request start
POST https://api.keybrake.com/v1/keys
{
  "vendor": "stripe",
  "daily_usd_cap": 50,
  "allowed_endpoints": ["/v1/charges", "/v1/payment_intents"],
  "ttl_seconds": 120
}
// Returns: { "id": "vk_xxx", "token": "sk_vault_xxx" }

// Revoke in finally block
DELETE https://api.keybrake.com/v1/keys/vk_xxx

Instead of using process.env.STRIPE_SECRET_KEY directly, you use the returned token as the Stripe key and point the Stripe SDK at the Keybrake proxy URL instead of api.stripe.com. The proxy enforces the policy (spend cap, endpoint allowlist, TTL) and logs every call with the vault key ID, which ties back to the user session that issued it.

For the full technical breakdown of the pattern, see our Next.js vault key reference page — this post focuses on the failure modes and the places the pattern breaks down in practice.

The basic Route Handler pattern

Here is the vault key lifecycle in a minimal Next.js Route Handler for an AI agent Stripe tool:

// app/api/tools/charge/route.ts
import { NextRequest, NextResponse } from "next/server";
import Stripe from "stripe";

export async function POST(req: NextRequest) {
  const { amount, currency, customerId } = await req.json();

  // Issue a vault key scoped to this one agent request
  const keyRes = await fetch("https://api.keybrake.com/v1/keys", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${process.env.KEYBRAKE_API_KEY}`,
    },
    body: JSON.stringify({
      vendor: "stripe",
      daily_usd_cap: 50,
      allowed_endpoints: ["/v1/payment_intents"],
      ttl_seconds: 30,
    }),
  });
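  if (!keyRes.ok) {
    // Fail closed: without a vault key there is no credential to call Stripe with
    return NextResponse.json({ error: "vault_key_unavailable" }, { status: 502 });
  }
  const { id: vaultKeyId, token: vaultToken } = await keyRes.json();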

  try {
    const stripe = new Stripe(vaultToken, {
      apiVersion: "2024-06-20",
      // Route all calls through the Keybrake proxy
      host: "proxy.keybrake.com",
      port: 443,
      protocol: "https",
      basePath: "/stripe",
    });

    const paymentIntent = await stripe.paymentIntents.create({
      amount,
      currency,
      customer: customerId,
      confirm: true,
      automatic_payment_methods: { enabled: true, allow_redirects: "never" },
    });

    return NextResponse.json({ clientSecret: paymentIntent.client_secret });
  } finally {
    // Always revoke — even on error paths
    await fetch(`https://api.keybrake.com/v1/keys/${vaultKeyId}`, {
      method: "DELETE",
      headers: { "Authorization": `Bearer ${process.env.KEYBRAKE_API_KEY}` },
    });
  }
}

The try/finally pattern is the key detail. JavaScript's finally block runs whether the try succeeds, throws, or returns early — so the vault key is always revoked at the end of the request, even if the Stripe call throws a 402 or the agent tool returns an error to the model.
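Because the issue/try/finally/revoke sequence repeats in every handler, it can be factored into a small helper. The following is a minimal sketch, not Keybrake's own helper: the name withVaultKey, the VaultKey type, and the injected issue and revoke functions are all hypothetical, with the two injected functions standing in for the POST and DELETE calls shown above. One deliberate difference from the inline version is that a failed revocation is swallowed so it cannot mask the result of the work itself.

```typescript
// Hypothetical shape of an issued vault key, mirroring the { id, token } response.
interface VaultKey {
  id: string;
  token: string;
}

// Hypothetical helper: issue a key, run `work` with its token, always attempt revocation.
// `issue` and `revoke` are injected so the helper does not assume a particular HTTP client.
async function withVaultKey<T>(
  issue: () => Promise<VaultKey>,
  revoke: (id: string) => Promise<void>,
  work: (token: string) => Promise<T>,
): Promise<T> {
  const key = await issue(); // if issuance throws, there is nothing to revoke
  try {
    return await work(key.token);
  } finally {
    try {
      await revoke(key.id);
    } catch {
      // A failed revocation should not replace the result or error from `work`;
      // the key's TTL is the backstop in that case.
    }
  }
}

// In-memory fakes standing in for the two HTTP calls.
const revokedIds: string[] = [];
const fakeIssue = async (): Promise<VaultKey> => ({ id: "vk_demo", token: "sk_vault_demo" });
const fakeRevoke = async (id: string): Promise<void> => {
  revokedIds.push(id);
};

withVaultKey(fakeIssue, fakeRevoke, async (token) => token.length).then((length) => {
  console.log(length, revokedIds);
});
```

In a real handler, issue and revoke would wrap the fetch calls from the example above, and work would construct the Stripe client from the token.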

If the Keybrake proxy detects that this vault key has already reached its $50 daily cap, it returns HTTP 429 with a cap_exceeded error body. Your tool handler should catch this specifically and return a structured error that prevents the LLM from retrying:

} catch (err: any) {
  if (err?.raw?.code === "cap_exceeded") {
    // Return a terminal error — tell the model not to retry this tool
    return NextResponse.json(
      { error: "spend_cap_reached", message: "Daily budget for this session has been exhausted. Do not retry." },
      { status: 429 }
    );
  }
  throw err;
}

The "Do not retry" message in the error body matters. Without it, the LLM may interpret a 429 as a transient rate limit and loop the tool call, which is exactly the failure mode you are trying to prevent. For more on this pattern, see AI agent rate limiting and the retry storm problem.
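One way to keep that signal consistent across tools is to route every caught error through a single mapping function. The sketch below is illustrative only: the ToolFailure shape and the toToolFailure name are hypothetical, and it assumes the proxy's cap_exceeded code surfaces on the error's raw.code field as in the catch block above. Other errors are marked retryable here; tighten that default if duplicate charges are a concern.

```typescript
// Hypothetical failure shape returned to the model; `retry` is the explicit signal.
interface ToolFailure {
  success: false;
  error: string;
  retry: boolean;
  message: string;
}

// Map a caught error to a structured tool failure.
// Assumption: the proxy's error body is exposed as `err.raw.code`.
function toToolFailure(err: unknown): ToolFailure {
  const code = (err as { raw?: { code?: string } } | null)?.raw?.code;
  if (code === "cap_exceeded") {
    return {
      success: false,
      error: "spend_cap_reached",
      retry: false,
      message: "Daily budget for this session has been exhausted. Do not retry.",
    };
  }
  const message = err instanceof Error ? err.message : "Unknown error";
  return { success: false, error: "tool_failed", retry: true, message };
}

console.log(toToolFailure({ raw: { code: "cap_exceeded" } }).retry); // prints false
console.log(toToolFailure(new Error("card_declined")).retry); // prints true
```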

The Vercel AI SDK streaming edge case

The pattern above works for synchronous Route Handlers. When you add streaming via the Vercel AI SDK's streamText, the vault key lifecycle needs special handling — and this is where most implementations break.

The problem: streamText returns a result object immediately, before the model has produced any output. The actual tool calls happen as the model generates them, which can be seconds or minutes after streamText() returns. If you put your vault key issuance before streamText() and your revocation after it (awaiting the stream), the timeline looks correct on paper but fails in two ways:

The vault key has a TTL. If the model takes 45 seconds to reach the tool call, a 30-second TTL has already expired by the time the tool fires.
Awaiting the stream to completion inside a Route Handler ties up the function invocation for the full duration, which works against the streaming model and hits Vercel's maximum execution time on some plans.

The correct pattern is to issue the vault key inside the tool handler, not in the outer streaming handler. Each tool invocation issues and revokes its own vault key:

// app/api/chat/route.ts
import { streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import Stripe from "stripe";
import { z } from "zod";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai("gpt-4o"),
    messages,
    tools: {
      chargeStripe: tool({
        description: "Create a Stripe payment intent for the given amount",
        parameters: z.object({
          amountCents: z.number(),
          currency: z.string().default("usd"),
          customerId: z.string(),
        }),
        // Each tool invocation gets its own vault key — TTL is per-call, not per-stream
        execute: async ({ amountCents, currency, customerId }) => {
          const keyRes = await fetch("https://api.keybrake.com/v1/keys", {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
              "Authorization": `Bearer ${process.env.KEYBRAKE_API_KEY}`,
            },
            body: JSON.stringify({
              vendor: "stripe",
              daily_usd_cap: 50,
              allowed_endpoints: ["/v1/payment_intents"],
              ttl_seconds: 30,
              // Tag the key with the user session for audit correlation
              labels: { sessionId: req.headers.get("x-session-id") ?? "unknown" },
            }),
          });
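          if (!keyRes.ok) {
            // No vault key means no credential; report a terminal failure to the model
            return { success: false, error: "vault_key_unavailable", retry: false };
          }
          const { id: vaultKeyId, token: vaultToken } = await keyRes.json();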

          try {
            const stripe = new Stripe(vaultToken, {
              apiVersion: "2024-06-20",
              host: "proxy.keybrake.com",
              port: 443,
              protocol: "https",
              basePath: "/stripe",
            });

            const intent = await stripe.paymentIntents.create({
              amount: amountCents,
              currency,
              customer: customerId,
              confirm: true,
              automatic_payment_methods: { enabled: true, allow_redirects: "never" },
            });

            return { success: true, paymentIntentId: intent.id, status: intent.status };
          } catch (err: any) {
            if (err?.raw?.code === "cap_exceeded") {
              return { success: false, error: "spend_cap_reached", retry: false };
            }
            return { success: false, error: err.message, retry: true };
          } finally {
            await fetch(`https://api.keybrake.com/v1/keys/${vaultKeyId}`, {
              method: "DELETE",
              headers: { "Authorization": `Bearer ${process.env.KEYBRAKE_API_KEY}` },
            });
          }
        },
      }),
    },
    onFinish: ({ usage, finishReason }) => {
      // onFinish fires when the model stops generating — safe to do cleanup here
      console.log(`Stream finished: ${finishReason}, tokens: ${JSON.stringify(usage)}`);
    },
  });

  return result.toDataStreamResponse();
}

Three things to note in this pattern:

TTL is per-tool-call, not per-stream: a 30-second TTL on a vault key issued inside execute() is more than enough for a single Stripe API call — you never need a vault key that outlives the operation it was issued for.
The labels field: tagging the vault key with the session ID means every Stripe call in the Keybrake audit log is tied to the user session, not just to the application. This is what makes the audit trail actually useful.
retry: false in the cap_exceeded return: returning a structured response that the model can pattern-match on ("retry: false") is more reliable than hoping the model interprets a 429 error string correctly. Some models will retry tool calls on any error-looking response unless the tool explicitly signals "do not retry."
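The first note can be checked with a small expiry calculation. This is a sketch under simplifying assumptions: the IssuedKey type and isExpired function are hypothetical, time is modeled as plain milliseconds, and the 45-second model delay and 30-second TTL are the figures used earlier in this section. How the proxy actually evaluates TTLs is not specified here.

```typescript
// Hypothetical model of a vault key's lifetime; times are milliseconds from an arbitrary origin.
interface IssuedKey {
  issuedAtMs: number;
  ttlSeconds: number;
}

// A key is treated as expired once its TTL has fully elapsed.
function isExpired(key: IssuedKey, nowMs: number): boolean {
  return nowMs >= key.issuedAtMs + key.ttlSeconds * 1000;
}

// Key issued before streamText(); the model reaches the tool call 45 s later.
const issuedUpFront: IssuedKey = { issuedAtMs: 0, ttlSeconds: 30 };
const toolCallAtMs = 45_000;
console.log(isExpired(issuedUpFront, toolCallAtMs)); // prints true

// Key issued inside execute(); the Stripe call lands 2 s after issuance.
const issuedInTool: IssuedKey = { issuedAtMs: toolCallAtMs, ttlSeconds: 30 };
console.log(isExpired(issuedInTool, toolCallAtMs + 2_000)); // prints false
```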

Server Actions

Server Actions in Next.js 14+ run on the server as form handlers or directly-invoked async functions from client components. The vault key pattern is identical: try/finally around the vendor call. There is one Next.js-specific risk to watch for. Next.js extends fetch with its own caching layer, and the defaults have changed between versions. Rather than rely on those defaults, mark the vault key issuance and revocation calls as uncached explicitly, particularly if the action also uses revalidatePath or revalidateTag.

// app/actions/charge.ts
"use server";

import Stripe from "stripe";

export async function chargeCustomer(customerId: string, amountCents: number) {
  const keyRes = await fetch("https://api.keybrake.com/v1/keys", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${process.env.KEYBRAKE_API_KEY}`,
    },
    body: JSON.stringify({
      vendor: "stripe",
      daily_usd_cap: 100,
      allowed_endpoints: ["/v1/payment_intents"],
      ttl_seconds: 30,
    }),
    cache: "no-store", // Never cache the key issuance request
  });
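  if (!keyRes.ok) {
    // Fail closed: without a vault key there is no credential to call Stripe with
    throw new Error("Vault key issuance failed");
  }
  const { id: vaultKeyId, token: vaultToken } = await keyRes.json();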

  try {
    const stripe = new Stripe(vaultToken, {
      apiVersion: "2024-06-20",
      host: "proxy.keybrake.com",
      port: 443,
      protocol: "https",
      basePath: "/stripe",
    });

    const intent = await stripe.paymentIntents.create({
      amount: amountCents,
      currency: "usd",
      customer: customerId,
      confirm: true,
      automatic_payment_methods: { enabled: true, allow_redirects: "never" },
    });

    return { success: true, paymentIntentId: intent.id };
  } finally {
    await fetch(`https://api.keybrake.com/v1/keys/${vaultKeyId}`, {
      method: "DELETE",
      headers: { "Authorization": `Bearer ${process.env.KEYBRAKE_API_KEY}` },
      cache: "no-store",
    });
  }
}

The cache: "no-store" on both the issuance and revocation calls ensures Next.js's built-in fetch caching never serves a stale vault key token or skips a revocation call.

The one thing vault keys alone cannot give you

Vault keys fix the per-session scoping problem: each agent request gets its own credential, with its own TTL, that can be revoked without affecting other sessions. But a vault key issued directly against the Stripe API — without a proxy in the middle — cannot enforce a spend cap.

Stripe's Restricted Keys let you scope which API endpoints a key can call, but they do not let you set a per-key daily spend cap. A vault key that forwards to Stripe directly can be issued per-session, but if the underlying real Stripe key has access to the /v1/payment_intents endpoint with Write permission, nothing stops the vault key from making $10,000 of payment intent calls before the TTL expires.

Spend cap enforcement requires a proxy that:

Parses the response body of each Stripe API call to extract the charge amount
Accumulates the running total for the vault key
Returns HTTP 429 with cap_exceeded when the cap is hit
Does this atomically (a counter increment per call, not a periodic polling job) to prevent race conditions in concurrent agents
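The accumulate-and-reject step in that list can be sketched as a per-key counter with a single check-and-increment operation. This is a simplified, in-memory illustration and not Keybrake's implementation: the SpendLedger class and its method names are hypothetical, amounts are integer cents, and the check runs before the amount is recorded. It is atomic only because the method is synchronous within one JavaScript process; a proxy running on several instances would need a shared store with an atomic increment (Redis INCRBY is one example).

```typescript
type RecordResult =
  | { ok: true; totalCents: number }
  | { ok: false; code: "cap_exceeded" };

// Hypothetical in-memory ledger keyed by vault key ID.
class SpendLedger {
  private readonly spentCents = new Map<string, number>();
  private readonly capCents: number;

  constructor(capCents: number) {
    this.capCents = capCents;
  }

  // Check and increment in one synchronous step. There is no `await` between
  // the read and the write, so no other request can interleave in this process.
  tryRecord(keyId: string, amountCents: number): RecordResult {
    const current = this.spentCents.get(keyId) ?? 0;
    const next = current + amountCents;
    if (next > this.capCents) {
      return { ok: false, code: "cap_exceeded" };
    }
    this.spentCents.set(keyId, next);
    return { ok: true, totalCents: next };
  }
}

const ledger = new SpendLedger(5_000); // $50 cap, in cents
console.log(ledger.tryRecord("vk_a", 2_000)); // accepted, running total 2000
console.log(ledger.tryRecord("vk_a", 2_000)); // accepted, running total 4000
console.log(ledger.tryRecord("vk_a", 2_000)); // rejected, would exceed the cap
console.log(ledger.tryRecord("vk_b", 2_000)); // a different key has its own total
```

A periodic polling job would leave a window between the overspend and its detection, which is the race the list above warns about.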

This is what the Keybrake proxy layer does. The vault key API and the proxy enforcement are two halves of the same system: issuing a vault key without the proxy gives you scoped credential management but not budget enforcement. The AI agent budget enforcement architecture guide covers the full design, including the race condition failure mode of application-side counters.

Comparison: three approaches side by side

Here is how the three approaches stack up against the failure modes described at the top of this post:

Approach	Per-session spend cap	Per-agent revoke	Attribution in audit log	Revoke without redeploy
`process.env` directly	No	No	No	No
Stripe Restricted Key (no proxy)	No	Per-role only	Per-key only	Rotate role key
Vault key via Keybrake proxy	Yes — per-session cap	Yes — instant, no collateral	Yes — per-request + session label	Yes — revoke vault key only

Stripe Restricted Keys are a necessary first step — they scope which API endpoints a key can call, which prevents an agent from accidentally creating customers or deleting subscriptions when it only needs to issue refunds. For the five most common agent use-case configurations, see Stripe Restricted API key examples. But Restricted Keys do not solve concurrency, spend caps, or per-session revocation in a multi-user Next.js app. For those, you need a vault key layer on top.

A note on Edge Runtime compatibility

If you are running Route Handlers in the Vercel Edge Runtime (with export const runtime = "edge"), the vault key issuance pattern described here works without changes — fetch is native to the Edge Runtime. However, the Stripe Node.js SDK uses http.request under the hood, which is not available in Edge. For edge-deployed agent tool handlers that call Stripe, you have two options:

Use the Stripe SDK's fetch-based client (available in Stripe SDK v12.4+): new Stripe(token, { httpClient: Stripe.createFetchHttpClient() })
Or proxy the Stripe call through a Node.js Route Handler (drop export const runtime = "edge") — edge runtime is primarily a latency optimization and is not required for agent tool handlers that are already waiting on model responses.

The vault key issuance call itself is a plain fetch to an HTTPS endpoint and is fully edge-compatible in both cases.

Summary

The standard Next.js pattern of reading process.env.STRIPE_SECRET_KEY directly in a Route Handler is correct for a single-user app. It fails silently in a multi-user agent backend because the credential is shared across all concurrent sessions — no per-session spend cap, no per-agent revocation, no attribution in the audit log.

The fix is a vault key issued at the top of each Route Handler invocation (or inside each Vercel AI SDK tool's execute() function for streaming responses) and revoked in the finally block. Pairing the vault key with a proxy that enforces the spend cap at call time — not just at issuance — closes the remaining gap that Stripe Restricted Keys alone cannot fill.

The Next.js vault key reference page covers the pattern in full, including middleware-level issuance and shared withVaultKey helper functions. The general AI agent credential management architecture guide applies the same four-layer model to non-Next.js backends.
