Hatchet · AI agents · API key security

Hatchet AI agent API key: scoping vendor calls in durable workflow steps

Hatchet is an open-source durable workflow engine for Python and TypeScript — step-based workflows with explicit dependencies, spawn_workflow() for triggering child workflow runs, max_retries per step, and concurrency controls at the workflow and step level. AI agent teams adopt Hatchet because it solves the hard parts of reliable async AI workloads: durable execution that survives worker restarts, child workflow fan-out for parallel processing, and step-level retries for transient failures. When those workflow steps call Stripe, Twilio, or Resend, Hatchet's durability features become vendor spend amplifiers: spawn_workflow() fans out to many child runs each making vendor calls in parallel, step max_retries multiplies vendor calls on failures, and there is no per-workflow-run dollar cap built into the engine. This page covers the vault-key pattern that bounds vendor spend per Hatchet workflow run.

TL;DR

Issue a vault key in the first step of your Hatchet workflow, then pass it through step_output to all downstream steps that make vendor calls. For child workflows spawned via spawn_workflow(), include the vault key in the child workflow's input — each parent run issues one vault key shared by all its children. The cap accumulates atomically across all parallel child runs. The real Stripe or Twilio secret stays in Keybrake, never in your Hatchet worker environment or workflow input. Revoking a runaway workflow run is a single API call — no step cancellation, no worker restart, no credential rotation.

How Hatchet AI agent workflows call vendor APIs

A typical Hatchet AI agent workflow uses a multi-step design where one step prepares work and downstream steps make vendor API calls:

from hatchet_sdk import Hatchet, Context
import stripe

hatchet = Hatchet()

@hatchet.workflow(name="billing-agent", on_events=["billing:process"])
class BillingAgentWorkflow:

    @hatchet.step()
    def fetch_customers(self, ctx: Context) -> dict:
        plan_id = ctx.workflow_input()["plan_id"]
        customers = db.query("SELECT id, amount_cents FROM customers WHERE plan_id = ?", plan_id)
        return {"customers": customers}

    @hatchet.step(parents=["fetch_customers"], max_retries=3)
    def charge_customers(self, ctx: Context) -> dict:
        stripe.api_key = os.environ["STRIPE_SECRET_KEY"]  # full-access key
        customers = ctx.step_output("fetch_customers")["customers"]

        results = []
        for customer in customers:
            charge = stripe.PaymentIntent.create(
                amount=customer["amount_cents"],
                currency="usd",
                customer=customer["id"],
            )
            results.append({"customer_id": customer["id"], "charge_id": charge["id"]})

        return {"results": results}

worker = hatchet.worker("billing-worker")
worker.register_workflow(BillingAgentWorkflow())

This is standard Hatchet. The charge_customers step runs with max_retries=3 — if the step raises an exception after some Stripe calls succeed, Hatchet retries the entire step. All successful charges from the first attempt re-run without idempotency keys, creating duplicate charges. If fetch_customers returns 2,000 customers due to a query bug, all 2,000 Stripe calls run with no cap before the step returns. Each retry adds another 2,000 potential calls.

Three gaps Hatchet's native tooling doesn't fill for vendor spend control

Gap	What happens in practice	Hatchet's answer
No per-workflow spend cap	Hatchet's concurrency controls limit how many workflow runs execute simultaneously (useful for rate limiting vendor API calls at the workflow level), and `max_runs` limits total concurrent runs. But neither controls the dollar amount of vendor calls made within a single workflow run. A single workflow run processing 5,000 customers makes 5,000 Stripe calls regardless of concurrency settings. There is no mechanism to fail the workflow when cumulative vendor spend within that run crosses a dollar threshold.	Hatchet's concurrency slots control parallelism at the run level. No per-run dollar cap for vendor API spend inside steps.
No step-level vendor revoke without worker restart	Hatchet allows cancelling a workflow run from the dashboard or API. Cancellation marks the run as cancelled but doesn't interrupt a step that is currently executing on a worker. The Stripe API key set in the worker environment (`STRIPE_SECRET_KEY`) can be rotated, but this requires restarting all Hatchet workers — breaking every other in-flight workflow on those workers, not just the runaway one. For child workflows spawned with `spawn_workflow()`, each child run needs to be cancelled individually.	Workflow and step cancellation are available via the Hatchet API and dashboard. No per-step API key scoping or mid-execution vendor termination.
No per-call audit with workflow step context	Hatchet's run history captures step inputs, outputs, and status. It doesn't parse dollar amounts from Stripe responses, correlate Stripe `PaymentIntent.id` values with Hatchet run ID and step name in a queryable cost table, or provide a per-run spend summary. Debugging an overcharge requires cross-referencing Hatchet run history (which shows step outputs) with the Stripe dashboard, matching on timestamps since there's no shared identifier between the two systems.	Hatchet's observability shows step inputs and outputs. No structured vendor cost tracking or external transaction ID correlation.

The spawn_workflow() fan-out amplification risk

Hatchet's spawn_workflow() (Python) or ctx.spawnWorkflow() (TypeScript) creates independent child workflow runs. This is the right pattern for parallel per-customer processing — each customer gets its own child workflow run with isolated step history, retries, and concurrency controls. But each child run also has its own access to the shared worker environment, including the full-access Stripe API key.

A parent workflow that spawns 500 child workflows, each with a step that calls Stripe, creates 500 simultaneous Stripe calls with no cap across the child runs. If each child step has max_retries=3, a transient failure in all children could result in up to 2,000 Stripe calls (500 initial + 1,500 retries) before the retry budget is exhausted. The parent workflow has no visibility into the cumulative dollar spend across its children.

Scoping vault keys per Hatchet workflow run

from hatchet_sdk import Hatchet, Context
import requests

hatchet = Hatchet()

@hatchet.workflow(name="billing-agent", on_events=["billing:process"])
class BillingAgentWorkflow:

    @hatchet.step()
    def issue_vault_key(self, ctx: Context) -> dict:
        workflow_input = ctx.workflow_input()
        resp = requests.post(
            "https://proxy.keybrake.com/vault/keys",
            headers={"Authorization": f"Bearer {os.environ['KEYBRAKE_API_KEY']}"},
            json={
                "vendor": "stripe",
                "daily_usd_cap": workflow_input.get("budget_usd", 500),
                "allowed_endpoints": ["POST /v1/payment_intents"],
                "expires_in": "2h",
                "agent_run_label": f"hatchet/billing-agent/{ctx.workflow_run_id()}",
            },
        )
        return {"vault_key": resp.json()["vault_key"]}

    @hatchet.step(parents=["issue_vault_key"])
    def fetch_customers(self, ctx: Context) -> dict:
        plan_id = ctx.workflow_input()["plan_id"]
        customers = db.query("SELECT id, amount_cents FROM customers WHERE plan_id = ?", plan_id)
        return {"customers": customers}

    @hatchet.step(parents=["fetch_customers", "issue_vault_key"], max_retries=3)
    def charge_customers(self, ctx: Context) -> dict:
        vault_key = ctx.step_output("issue_vault_key")["vault_key"]
        run_id = ctx.workflow_run_id()
        customers = ctx.step_output("fetch_customers")["customers"]

        stripe.api_key = vault_key
        stripe.api_base = "https://proxy.keybrake.com/stripe"

        results = []
        for customer in customers:
            try:
                charge = stripe.PaymentIntent.create(
                    amount=customer["amount_cents"],
                    currency="usd",
                    customer=customer["id"],
                    idempotency_key=f"{run_id}-{customer['id']}",
                )
                results.append({"customer_id": customer["id"], "charge_id": charge["id"]})
            except stripe.error.RateLimitError as e:
                if "cap_exhausted" in str(e):
                    raise Exception("VENDOR_CAP_EXHAUSTED")  # non-retriable: let Hatchet fail the step
                raise  # retriable: let Hatchet retry

        return {"results": results}

worker = hatchet.worker("billing-worker")
worker.register_workflow(BillingAgentWorkflow())

The issue_vault_key step runs first and returns the vault key via step_output. The charge_customers step declares issue_vault_key as a parent, guaranteeing the vault key is available before any charges run. step_output("issue_vault_key")["vault_key"] retrieves the key, which is then used as the Stripe API key pointed at the Keybrake proxy.

The idempotency key uses workflow_run_id() + customer_id — stable across step retries (Hatchet retries the step with the same run ID), so a failed-and-retried step doesn't charge the same customer twice. The agent_run_label includes the Hatchet workflow run ID so every vendor call in the audit log is traceable to the specific workflow execution.

How Keybrake fits

Keybrake is the proxy layer between your Hatchet workflow steps and Stripe, Twilio, or Resend. The vault key issued in the first step replaces the full-access key that was previously read from os.environ["STRIPE_SECRET_KEY"] in the charge step. The real Stripe secret stays in Keybrake — never in your worker environment variables or Hatchet step outputs (which are stored in Hatchet's database). For spawn_workflow() fan-out, pass the vault key in the child workflow input — all child runs share the same vault key, and the cap accumulates atomically across all parallel children. Revoking a runaway workflow is a single DELETE /vault/keys/{key_id} call — no worker restart, no per-child cancellation, no environment rotation.

Get early access