Restate AI agent API key: scoping vendor calls in durable service invocations

Restate is a durable execution framework for TypeScript, Go, Python, and Java. It persists a journal of each handler's execution so that on any failure (network partition, process crash, pod eviction) the handler resumes from where it left off rather than restarting from scratch. ctx.run() wraps non-deterministic side effects, including vendor API calls: once a closure completes, its result is journaled and replayed on every later retry instead of the call being made again. A closure that is interrupted before its result is journaled can still run again, which is why the vendor call also needs an idempotency key. This is genuinely useful for AI agent reliability. But durability features interact with vendor spend in two ways that create risk: (1) fan-out via ctx.serviceSendClient() dispatches N concurrent child invocations, each making independent vendor calls with no shared cap, and (2) vendor API calls placed outside ctx.run() are re-executed on every retry, creating duplicate charges. Restate has no built-in per-invocation dollar cap. This page covers the vault-key pattern that bounds vendor spend per Restate handler invocation.

TL;DR

Issue the vault key inside a ctx.run("issue-vault-key", ...) call. Restate's journal memoizes the result, so every retry of the handler gets the same vault key instead of issuing a new one. Pass the vault key as a parameter to child service invocations for fan-out. For the Stripe idempotency key, use the invocation ID combined with the item ID (or ctx.key for Virtual Objects); both are stable across handler retries. The real Stripe or Twilio secret stays in Keybrake, never in the Restate journal or the service environment that handlers read. Revoking a runaway invocation is a single API call: no Restate invocation cancellation, no credential rotation.

How Restate AI agent services call vendor APIs

A typical Restate billing service uses ctx.run() to wrap Stripe calls as journaled side effects:

import * as restate from "@restatedev/restate-sdk";
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

const billingService = restate.service({
  name: "billing",
  handlers: {
    async chargeCustomers(ctx: restate.Context, args: { planId: string; budgetUsd: number }) {
      // Fetch customers inside ctx.run() so the result is journaled and reused on retry
      const customers = await ctx.run("fetch-customers", async () => {
        const resp = await fetch(`https://your-api.run.app/customers?plan_id=${args.planId}`);
        return resp.json() as Promise<Array<{ id: string; amountCents: number }>>;
      });

      // Fan-out: dispatch one child invocation per customer
      const client = ctx.serviceSendClient(chargeOneService);
      for (const customer of customers) {
        await client.chargeCustomer({ customerId: customer.id, amountCents: customer.amountCents });
      }
    }
  }
});

const chargeOneService = restate.service({
  name: "charge-one",
  handlers: {
    async chargeCustomer(ctx: restate.Context, args: { customerId: string; amountCents: number }) {
      await ctx.run("stripe-charge", async () => {
        // Full-access key: caps bind at account level, not per-invocation
        const intent = await stripe.paymentIntents.create({
          amount: args.amountCents,
          currency: "usd",
          customer: args.customerId,
        });
        return intent.id;
      });
    }
  }
});

The fan-out creates N concurrent chargeOneService invocations. Each invocation has independent access to the full-access Stripe key via the service environment variable. If customers returns 2,000 records due to a data issue, 2,000 Stripe calls are dispatched with no per-invocation cap and no shared cap across the fan-out. Each child invocation is independent: if the first 500 have already spent what you intended as the budget, the remaining 1,500 still proceed, because nothing tells them to stop.

Three gaps Restate's native tooling doesn't fill for vendor spend control

Gap: No per-invocation spend cap
What happens in practice: Restate's durability guarantees are about execution reliability, not spend control. A handler that fans out to 5,000 child invocations each calling Stripe completes all 5,000: Restate has no mechanism to stop the fan-out when cumulative vendor spend crosses a dollar threshold. Even with a parent handler that wants to stop the fan-out on cap breach, the child invocations are already durably enqueued.
Restate's answer: Restate provides invocation cancellation and termination APIs. No per-invocation dollar cap for vendor API spend within handlers.

Gap: No mid-invocation vendor revoke without service restart
What happens in practice: The Stripe API key lives in the service's environment variable (STRIPE_SECRET_KEY). Rotating this secret requires redeploying the Restate service, which kills all currently executing handlers on the old service instances, breaking every in-flight invocation, not just the runaway one. Restate's invocation cancellation API sends a cancellation signal, but a handler executing a ctx.run() closure can complete that closure before acknowledging the cancellation.
Restate's answer: Invocation cancellation and termination are available via the Restate Admin API. No per-invocation API key scoping or mid-execution vendor termination.

Gap: No per-call audit with invocation context
What happens in practice: Restate's journal records the execution steps of each handler: ctx.run() results are stored as journal entries. But the journal doesn't parse dollar amounts from Stripe responses, correlate PaymentIntent.id values with the Restate invocation ID in a queryable cost table, or provide a per-invocation spend summary. Debugging an overcharge requires cross-referencing Restate's execution UI with the Stripe dashboard.
Restate's answer: Restate's observability shows handler execution steps and journal entries. No structured vendor cost tracking or invocation-ID-to-charge correlation.

The ctx.run() memoization interaction with vault keys

Restate's ctx.run() is the memoization boundary: the first time the closure completes, its result is persisted in the journal. On any subsequent retry of the handler, ctx.run() returns the journaled result without re-executing the closure. This memoization property is essential for issuing vault keys correctly.

If you issue the vault key inside ctx.run("issue-vault-key", ...), the same vault key is returned on every retry of the handler, because Restate reads it back from the journal entry instead of calling Keybrake again. The cap therefore accumulates correctly across retries: a second attempt to charge a customer does not start with a fresh cap; it continues counting against the cap set during the first handler execution. The vault key's TTL must cover the full expected handler duration, including potential retry delays.
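The memoization behavior can be illustrated without a Restate server. The sketch below is a toy model, not Restate's implementation: the `Journal` map, the `run` helper, and `makeIssuer` are hypothetical stand-ins for the journal, ctx.run(), and the vault-key request respectively.

```typescript
// Toy stand-in for Restate's journal: step name -> stored result.
type Journal = Map<string, unknown>;

// Runs the closure the first time a step name is seen; on every later
// attempt it returns the stored result without calling the closure.
async function run<T>(journal: Journal, name: string, fn: () => Promise<T>): Promise<T> {
  if (journal.has(name)) {
    return journal.get(name) as T;
  }
  const result = await fn();
  journal.set(name, result);
  return result;
}

// Hypothetical issuer standing in for the POST that creates a vault key.
function makeIssuer(): { issue: () => Promise<string>; issuedCount: () => number } {
  let issued = 0;
  return {
    issue: async () => {
      issued += 1;
      return `vault_key_${issued}`;
    },
    issuedCount: () => issued,
  };
}

// Simulates two attempts of the same handler invocation sharing one journal.
async function simulateRetry(): Promise<{ first: string; second: string; issued: number }> {
  const journal: Journal = new Map();
  const issuer = makeIssuer();
  const first = await run(journal, "issue-vault-key", issuer.issue);
  const second = await run(journal, "issue-vault-key", issuer.issue); // simulated retry
  return { first, second, issued: issuer.issuedCount() };
}
```

Both attempts see the same key and the issuer is called once, which is the property that keeps one cap in force across retries.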

Scoping vault keys per Restate handler invocation

import * as restate from "@restatedev/restate-sdk";
import Stripe from "stripe";

const billingService = restate.service({
  name: "billing",
  handlers: {
    async chargeCustomers(ctx: restate.Context, args: { planId: string; budgetUsd: number }) {
      // Issue vault key once — memoized by Restate's journal on retries
      const { vaultKey } = await ctx.run("issue-vault-key", async () => {
        const resp = await fetch("https://proxy.keybrake.com/vault/keys", {
          method: "POST",
          headers: {
            "Authorization": `Bearer ${process.env.KEYBRAKE_API_KEY}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({
            vendor: "stripe",
            daily_usd_cap: args.budgetUsd,
            allowed_endpoints: ["POST /v1/payment_intents"],
            expires_in: "2h",
            agent_run_label: `restate/billing/${ctx.request().id}`,
          }),
        });
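        if (!resp.ok) {
          // A thrown Error makes Restate retry this step; nothing is journaled on failure
          throw new Error(`Vault key issuance failed with status ${resp.status}`);
        }
        const data = (await resp.json()) as { vault_key: string };
        return { vaultKey: data.vault_key };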
      });

      // Fetch customers
      const customers = await ctx.run("fetch-customers", async () => {
        const resp = await fetch(`https://your-api.run.app/customers?plan_id=${args.planId}`);
        return resp.json() as Promise<Array<{ id: string; amountCents: number }>>;
      });

      // Fan-out: pass vault key to each child invocation
      const client = ctx.serviceSendClient(chargeOneService);
      for (const customer of customers) {
        await client.chargeCustomer({
          customerId: customer.id,
          amountCents: customer.amountCents,
          vaultKey,  // shared across all children — shared cap
        });
      }
    }
  }
});

const chargeOneService = restate.service({
  name: "charge-one",
  handlers: {
    async chargeCustomer(
      ctx: restate.Context,
      args: { customerId: string; amountCents: number; vaultKey: string }
    ) {
      await ctx.run("stripe-charge", async () => {
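        // NOTE: stripe-node documents `host`, `port`, and `protocol` for overriding the
        // API endpoint, not `baseURL`; confirm the client configuration the proxy expects.
        const stripe = new Stripe(args.vaultKey, {
          baseURL: "https://proxy.keybrake.com/stripe/v1",
        });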
        const intent = await stripe.paymentIntents.create(
          {
            amount: args.amountCents,
            currency: "usd",
            customer: args.customerId,
          },
          // The idempotency key is a request option (second argument), not a body parameter
          { idempotencyKey: `${ctx.request().id}-${args.customerId}` }
        );
        return intent.id;
      });
    }
  }
});

The vault key is issued inside ctx.run("issue-vault-key", ...) so it is memoized: on handler retries, Restate returns the same vault key from the journal without issuing a new one. The vault key is passed explicitly to child invocations via the chargeCustomer input — all children share the same vault key and the same cap accumulates atomically across all concurrent child calls.

The idempotency key uses the Restate invocation ID (from ctx.request().id) plus the customer ID — stable across handler retries of the child invocation, unique across different handler runs. The Stripe client points at the Keybrake proxy with the vault key as the API key.
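As a small sketch of that key derivation, the helper below builds the key from the two IDs and guards against Stripe's documented 255-character limit on idempotency keys. The function name and the example IDs are hypothetical; only the `invocationId-customerId` shape comes from the handler above.

```typescript
// Stripe documents a 255-character limit on idempotency keys.
const MAX_IDEMPOTENCY_KEY_LENGTH = 255;

// Builds a key that is identical on every retry of one child invocation
// and differs between invocations or customers.
function stripeIdempotencyKey(invocationId: string, customerId: string): string {
  const key = `${invocationId}-${customerId}`;
  if (key.length > MAX_IDEMPOTENCY_KEY_LENGTH) {
    throw new Error(`Idempotency key too long: ${key.length} characters`);
  }
  return key;
}
```

Because the inputs are fixed for the lifetime of an invocation, the helper can be called inside or outside ctx.run() without affecting determinism.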

How Keybrake fits

Keybrake is the proxy layer between your Restate service handlers and Stripe, Twilio, or Resend. The vault key issued inside ctx.run() replaces the full-access STRIPE_SECRET_KEY environment variable. The real Stripe secret stays in Keybrake. Restate's journal stores ctx.run() results, so it does contain the vault key, but that key is a scoped credential, not the real secret. For Virtual Objects, include ctx.key in the agent_run_label to correlate audit log entries with the specific object key and invocation. Revoking a runaway invocation is a single DELETE /vault/keys/{key_id} call: no Restate cancellation required, no environment variable rotation.
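A revoke call might look like the sketch below. Only the DELETE /vault/keys/{key_id} path comes from this page. The rest is assumed for illustration: the host is taken to be the same one used to issue keys, the call is taken to use the same bearer-token header, and the function names are hypothetical. It also assumes you recorded a key ID when the vault key was issued.

```typescript
interface RevokeRequest {
  url: string;
  init: { method: string; headers: Record<string, string> };
}

// Builds the revoke request without sending it, so it can be inspected or logged.
// Assumption: same host and bearer auth as the key-issuance call.
function buildRevokeRequest(
  keyId: string,
  apiKey: string,
  baseUrl: string = "https://proxy.keybrake.com",
): RevokeRequest {
  return {
    url: `${baseUrl}/vault/keys/${encodeURIComponent(keyId)}`,
    init: { method: "DELETE", headers: { Authorization: `Bearer ${apiKey}` } },
  };
}

// Sends the revoke request and fails loudly on a non-2xx response.
async function revokeVaultKey(keyId: string, apiKey: string): Promise<void> {
  const { url, init } = buildRevokeRequest(keyId, apiKey);
  const resp = await fetch(url, init);
  if (!resp.ok) {
    throw new Error(`Revoke failed with status ${resp.status}`);
  }
}
```

Revocation runs from an operator script or an alert hook rather than from inside the runaway handler, so it does not need to go through ctx.run().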

Related questions

Does the vault key in the Restate journal expose the real Stripe secret?

No — the vault key is a scoped credential (vault_key_xxx), not your real Stripe secret key (sk_live_xxx). If the journal entry is accessed by an attacker, they can make vendor calls only up to the configured cap (daily_usd_cap) until the key expires (expires_in). The real Stripe secret stays in Keybrake's encrypted store. That said, treat the vault key as a secret in the journal: restrict access to Restate's journal storage (the Restate server's persistent storage) with the same controls you'd apply to any sensitive runtime output.

What vault key TTL should I use when Restate suspends handlers for hours?

Restate can suspend handlers that are waiting on ctx.sleep() or waiting for durable promises — the handler resumes from the journal after an arbitrarily long suspension. If the vault key expires during a suspension, the handler will resume and the first vendor call will get a 401. Options: (1) set expires_in longer than the maximum expected suspension duration (e.g., 24h for handlers that sleep overnight); (2) issue the vault key after the last suspension point (in a step that runs just before the vendor calls begin); (3) add a key-refresh step that issues a new vault key if the current one is within 10 minutes of expiry. For short-lived billing handlers (no sleeps, completes in minutes), use a 30-minute TTL as a safe default.
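Option (3) comes down to a timestamp comparison. The sketch below assumes the handler keeps the key's expiry time (for example, issue time plus the TTL it requested) alongside the vault key; the helper name and the stored-expiry convention are hypothetical.

```typescript
// Refresh when the key is within 10 minutes of expiry, as suggested above.
const REFRESH_MARGIN_MS = 10 * 60 * 1000;

// Returns true when a new vault key should be issued before the next vendor call.
function needsRefresh(
  expiresAtMs: number,
  nowMs: number,
  marginMs: number = REFRESH_MARGIN_MS,
): boolean {
  return expiresAtMs - nowMs <= marginMs;
}
```

Inside a handler, the current time is itself non-deterministic, so read it through a journaled step (ctx.run() or the SDK's deterministic clock helper) rather than calling Date.now() directly, and give the refresh step its own ctx.run() name so the replacement key is memoized like the original.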

How do I handle cap exhaustion when Restate retries the handler?

When the proxy returns 429 due to cap exhaustion, the Stripe SDK call inside ctx.run() throws an exception. Since the closure failed, Restate does not journal the result and will retry the closure on the next handler retry. But the cap is still exhausted — the retry gets another 429. To prevent an infinite retry loop: catch the 429 inside the ctx.run() closure and check for the X-Keybrake-Cap-Hit: true response header; if it's a cap hit (not a transient Stripe error), throw a terminal error type that Restate recognizes as non-retryable. In TypeScript, this means throwing a restate.TerminalError for cap exhaustion vs. a regular Error for transient failures. Cap exhaustion is an intentional stop — don't retry it.
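The classification step can be kept as a pure function, separate from the handler. The sketch below assumes the error thrown by the vendor SDK exposes the HTTP status and response headers as `statusCode` and `headers`; verify that for the SDK version you use. The type and function names are hypothetical.

```typescript
// Minimal shape of the error fields this sketch relies on (an assumption).
interface VendorErrorLike {
  statusCode?: number;
  headers?: Record<string, string | undefined>;
}

type RetryDecision = "terminal" | "retry";

// Cap exhaustion (429 plus the cap-hit header) is terminal; everything else
// is left to Restate's normal retry behavior.
function classifyVendorError(err: VendorErrorLike): RetryDecision {
  const headers = err.headers ?? {};
  // Header names are case-insensitive, so compare in lower case.
  const capHit = Object.entries(headers).some(
    ([name, value]) => name.toLowerCase() === "x-keybrake-cap-hit" && value === "true",
  );
  return err.statusCode === 429 && capHit ? "terminal" : "retry";
}
```

In the ctx.run() closure, a "terminal" result would map to throwing restate.TerminalError and a "retry" result to rethrowing the original error.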

Further reading