Prefect AI agent API key: scoping vendor calls in flows and tasks
Prefect gives you Python-native orchestration: declarative retries, event-driven automations, and concurrent execution. Those same features create spending risks when your tasks call Stripe, Twilio, or Resend. A task with retries=3 can make up to four vendor API calls (the first attempt plus three retries), and an automation that triggers a flow on every inbound event can run the same payment task dozens of times before anyone notices. This page covers what Prefect's native tooling doesn't handle for vendor API spend control, and the vault-key pattern that does.
TL;DR
Prefect's @task(retries=3) is great for idempotent data processing — but not for vendor API calls that cost money. A vault key proxy sits between your Prefect tasks and the vendor: issue one vault key at flow-run start, enforce a per-run dollar cap, and get a structured per-call audit log with flow run ID attached. If a flow run goes wrong, revoke the vault key without rotating the real Stripe key that every other flow depends on.
How Prefect AI agents call vendor APIs
In a Prefect flow, vendor API calls live inside @task functions. The flow coordinates which tasks run and in what order; the tasks do the work. An AI billing agent built on Prefect might look like:
import os

import stripe
from prefect import flow, task


@task(retries=3, retry_delay_seconds=10)
def charge_customer(customer_id: str, amount_cents: int) -> dict:
    stripe.api_key = os.environ["STRIPE_SECRET_KEY"]  # full-access key
    return stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
    )


@flow(name="billing-flow")
def billing_flow(customer_ids: list[str], amount_cents: int):
    for customer_id in customer_ids:
        charge_customer(customer_id, amount_cents)
This is standard Prefect code. Setting retries=3 is reasonable practice for network failures. The problem is the API key: STRIPE_SECRET_KEY is a long-lived, full-access key with no per-run cap, and there is no way to stop just this flow run from making Stripe calls without rotating the key for everything else.
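To make the exposure concrete, here is a minimal sketch of the worst-case call count for one flow run. It assumes one task run per customer and that every attempt reaches the vendor; the function name max_vendor_calls is made up for illustration.

```python
def max_vendor_calls(num_customers: int, retries: int) -> int:
    """Upper bound on vendor API calls for one flow run.

    Each task run makes one initial attempt plus up to `retries` retries.
    """
    if num_customers < 0 or retries < 0:
        raise ValueError("counts must be non-negative")
    return num_customers * (1 + retries)


# 50 customers with retries=3: up to 200 calls to the vendor.
print(max_vendor_calls(50, 3))
```

With a buggy list of 500 customers the same bound becomes 2,000 calls, all made with the full-access key.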
Three gaps Prefect's native tooling doesn't fill for vendor spend control
| Gap | What happens in practice | Prefect's answer |
|---|---|---|
| No per-run spend cap | A billing flow has a bug in the customer list and charges 500 customers instead of 50. Prefect faithfully executes every task, retrying failures. The cap on real-money damage is your Stripe account limit — not the $500 you intended for this flow run. | None. Flow run logs show what happened; they don't stop it while it's happening. |
| No per-run revoke | You cancel a Prefect flow run at 2am. Tasks that are currently executing may have already made Stripe calls. Rotating the Stripe key to stop them breaks every other flow on your infrastructure. | Flow cancellation halts task scheduling but doesn't cancel in-flight vendor API calls or revoke credential access. |
| No per-call audit with flow context | Prefect's task logs show task inputs and outputs, but don't parse dollar amounts from Stripe responses or attach flow run IDs to individual Stripe charges in a queryable way. | Task run logs and artifacts. No cost parsing, no cross-referencing Stripe charges by flow run ID. |
The retry risk: why Prefect's retries need a spend guard
Prefect's task retry logic is designed for idempotent operations: fetch data, transform a file, call a safe API. When the task makes a vendor API call that costs money, the retry semantics become a liability:
- A charge_customer task fails with a network timeout (not a Stripe error; your connection dropped).
- Prefect waits retry_delay_seconds and retries the task.
- The retry succeeds, but the original call may have reached Stripe before the timeout. You now have a duplicate charge unless you're passing a stable idempotency key.
- Most teams don't wire up idempotency keys correctly when first building Prefect flows, because the retry decorator hides the risk.
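The sequence above can be reproduced with a small simulation. This is a sketch using only the standard library: FakeVendor is a hypothetical stand-in for a payments API that deduplicates on an idempotency key, and charge_with_retry models a first attempt whose request is delivered but whose response is lost. Neither is Stripe's or Prefect's actual code.

```python
from typing import Optional


class FakeVendor:
    """Hypothetical payments API that deduplicates on idempotency key."""

    def __init__(self) -> None:
        self.charges: list = []
        self._by_key: dict = {}

    def charge(self, customer_id: str, amount_cents: int,
               idempotency_key: Optional[str] = None) -> dict:
        if idempotency_key is not None and idempotency_key in self._by_key:
            return self._by_key[idempotency_key]  # replay: no new charge
        record = {"customer": customer_id, "amount": amount_cents}
        self.charges.append(record)
        if idempotency_key is not None:
            self._by_key[idempotency_key] = record
        return record


def charge_with_retry(vendor: FakeVendor, customer_id: str, amount_cents: int,
                      key: Optional[str] = None, retries: int = 3) -> dict:
    """Retry loop where the first attempt reaches the vendor but times out."""
    for attempt in range(1 + retries):
        result = vendor.charge(customer_id, amount_cents, idempotency_key=key)
        if attempt == 0:
            continue  # simulate a timeout after the request was delivered
        return result
    raise RuntimeError("all attempts timed out")


no_key = FakeVendor()
charge_with_retry(no_key, "cus_123", 2500)
print(len(no_key.charges))  # 2: the retry created a duplicate charge

with_key = FakeVendor()
charge_with_retry(with_key, "cus_123", 2500, key="run-42-cus_123-2500")
print(len(with_key.charges))  # 1: the retry replayed the first result
```

The key only helps if it is identical across attempts, which is why it should be derived from values that do not change between retries.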
A per-run vault key adds a second safety layer. If the retry would double-charge past the run's cap, the proxy returns a 429 instead of completing the charge. Your task raises an exception, Prefect logs the event, and a human reviews — rather than discovering the duplicate on the next bank reconciliation.
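The cap check itself is simple to model. The sketch below is an illustration of the behavior described here, not Keybrake's implementation: RunBudget and authorize are hypothetical names, and amounts are tracked in cents to avoid floating-point drift.

```python
from dataclasses import dataclass


@dataclass
class RunBudget:
    """Tracks spend for one flow run against a fixed cap (hypothetical model)."""
    cap_cents: int
    spent_cents: int = 0

    def authorize(self, amount_cents: int) -> int:
        """Return 200 if the call may be forwarded, 429 if it would exceed the cap."""
        if self.spent_cents + amount_cents > self.cap_cents:
            return 429  # rejected: nothing is forwarded, nothing is spent
        self.spent_cents += amount_cents
        return 200


budget = RunBudget(cap_cents=50_000)  # a $500 cap for this run
statuses = [budget.authorize(20_000) for _ in range(3)]
print(statuses)            # [200, 200, 429]
print(budget.spent_cents)  # 40000
```

The third $200 charge is refused because it would take the run to $600, so the run ends at $400 spent rather than over its cap.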
The automation risk: event-triggered flows that compound spend
Prefect automations let you trigger flow runs based on events — a new webhook, a file landing in S3, a schedule, or a state change. An AI agent that sends Twilio notifications on every incoming event can trigger the same flow dozens of times during a traffic spike. Each run issues tasks that call Twilio. Without a per-run cap, the spend scales linearly with event volume.
The vault key pattern isolates each flow run: each run gets its own vault key with its own cap. A traffic spike that triggers 50 concurrent flow runs doesn't bypass the cap. Each run hits its individual limit, and the proxy returns 429s rather than forwarding every Twilio call to the vendor. Note that the caps are per run: total spend during the spike can still reach the number of runs times the per-run cap.
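A rough way to size the per-run cap is to compute what a spike could forward in total. This sketch assumes each run has an independent cap and, as a simplification, treats spend as divisible so a run forwards at most min(attempted, cap); forwarded_spend is a made-up helper name.

```python
def forwarded_spend(attempted_cents_per_run: list, cap_cents: int) -> int:
    """Total spend forwarded to the vendor when every run has its own cap."""
    return sum(min(attempted, cap_cents) for attempted in attempted_cents_per_run)


# 50 runs each try to spend $25.00, but every run is capped at $10.00.
spike = [2_500] * 50
print(forwarded_spend(spike, cap_cents=1_000))  # 50000 cents, i.e. $500.00
```

Without the cap the same spike would forward 125,000 cents, so the per-run limit bounds the damage per run but the number of runs still matters.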
Scoping vault keys per flow run in Prefect
Issue the vault key at flow start (not in individual tasks) so the same key and its cap span the entire run:
import os

import httpx
import stripe
from prefect import flow, task
from prefect.runtime import flow_run


@task
def issue_vault_key(run_id: str, budget_usd: float) -> str:
    r = httpx.post(
        "https://proxy.keybrake.com/vault/keys",
        headers={"Authorization": f"Bearer {os.environ['KEYBRAKE_API_KEY']}"},
        json={
            "vendor": "stripe",
            "daily_usd_cap": budget_usd,
            "allowed_endpoints": ["POST /v1/payment_intents", "GET /v1/customers/*"],
            "expires_in": "4h",
            "agent_run_label": f"prefect-billing/{run_id}",
        },
    )
    r.raise_for_status()  # fail the flow early if no key was issued
    return r.json()["vault_key"]


@task(retries=3, retry_delay_seconds=10)
def charge_customer(customer_id: str, amount_cents: int, vault_key: str) -> dict:
    stripe.api_key = vault_key  # scoped key, not the real secret
    stripe.api_base = "https://proxy.keybrake.com/stripe"
    return stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        # same key on every retry of this charge within this flow run
        idempotency_key=f"{flow_run.id}-{customer_id}-{amount_cents}",
    )


@flow(name="billing-flow")
def billing_flow(customer_ids: list[str], amount_cents: int, budget_usd: float = 500.0):
    run_id = flow_run.id
    vault_key = issue_vault_key(run_id, budget_usd)  # one key for the whole run
    for customer_id in customer_ids:
        charge_customer(customer_id, amount_cents, vault_key)
Three things changed: (1) STRIPE_SECRET_KEY is gone from the task — only the Keybrake side holds it; (2) the vault key travels through the flow via parameter, so every task shares the same per-run cap; (3) flow_run.id is used as the stable idempotency key prefix, so retries are genuinely idempotent. The Keybrake audit log records every call with agent_run_label: "prefect-billing/{run_id}", giving you a queryable per-flow-run spend view.
How Keybrake fits
Keybrake is the proxy layer between your Prefect tasks and Stripe, Twilio, or Resend. You swap stripe.api_key for the vault key and set stripe.api_base to https://proxy.keybrake.com/stripe. The real Stripe secret stays in Keybrake, not in your Prefect infrastructure. Each flow run gets its own vault key with its own dollar cap, endpoint allowlist, and expiry. Retries that would exceed the cap return 429s — catchable exceptions in your Prefect tasks, not silent duplicate charges.
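One consequence worth handling explicitly: a task with retries=3 will also retry a cap rejection unless told otherwise. The sketch below is a plain predicate for deciding which failures deserve a retry. It assumes a 429 seen by the task means the run's cap was hit; if the vendor's own rate-limit 429s can also pass through the proxy, you would need another signal to tell them apart. The name should_retry and the status set are illustrative choices, not part of Prefect or Keybrake.

```python
from typing import Optional

# Statuses that usually indicate a transient failure worth retrying.
RETRYABLE_STATUSES = {408, 500, 502, 503, 504}


def should_retry(status_code: Optional[int]) -> bool:
    """Decide whether a failed vendor call should be retried.

    None models a network error where no HTTP response was received.
    """
    if status_code is None:
        return True  # connection dropped; safe only with a stable idempotency key
    if status_code == 429:
        return False  # assumed to mean the per-run cap was exceeded
    return status_code in RETRYABLE_STATUSES


print(should_retry(None))  # True
print(should_retry(503))   # True
print(should_retry(429))   # False
```

Prefect's task decorator has a retry_condition_fn parameter that could call a predicate like this; check the expected signature in the documentation for your Prefect version before wiring it in.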
Related questions
Does the vault key approach work with Prefect's task caching?
Yes, with a small caveat. Prefect's task result caching stores the task output and skips re-execution on cache hits. If a charge_customer task hits the cache, it returns the cached result without making a Stripe call — so the vault key is never used for that invocation. This is actually correct behavior: if Prefect cached the successful charge result, the customer was already charged. The vault key is relevant only for actual task executions (cache misses or tasks with no caching configured), which is exactly where you want spend enforcement.
What happens to the vault key if the Prefect flow run is cancelled?
The vault key expires at its configured expires_in time. For immediate revocation on flow cancellation, you can call the Keybrake revoke endpoint in a Prefect state hook — on_cancellation accepts a list of callables that run when the flow transitions to Cancelled. This ensures tasks that were scheduled but haven't executed yet cannot use the vault key, even if they somehow start after cancellation is signalled.
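A cancellation hook for this could be sketched as follows. Prefect flow state hooks are called with (flow, flow_run, state). Everything on the Keybrake side is an assumption here: the revoke callable is a hypothetical placeholder for whatever call revokes a key, and looking the key up by its agent_run_label is one possible design, not a documented API.

```python
from types import SimpleNamespace
from typing import Any, Callable


def make_revoke_hook(revoke: Callable[[str], None],
                     label_prefix: str = "prefect-billing") -> Callable[[Any, Any, Any], None]:
    """Build a state hook that revokes the vault key labelled for this flow run.

    `revoke` is a hypothetical callable that takes the run label and performs
    the actual revocation request.
    """
    def revoke_on_cancel(flow: Any, flow_run: Any, state: Any) -> None:
        revoke(f"{label_prefix}/{flow_run.id}")

    return revoke_on_cancel


# Exercise the hook with stand-ins for the objects Prefect would pass.
revoked = []
hook = make_revoke_hook(revoked.append)
hook(None, SimpleNamespace(id="1234-abcd"), None)
print(revoked)  # ['prefect-billing/1234-abcd']
```

In a real flow the hook would be registered with something like @flow(on_cancellation=[make_revoke_hook(my_revoke_fn)]), where my_revoke_fn is your own function that calls the revoke endpoint.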
Can I use the same vault key across subflows?
Yes. Pass the vault key as a parameter to subflows just as you would to tasks. A shared vault key means the parent flow and all its subflows share the same per-run dollar cap. That is usually what you want for a billing flow, where the cap represents the total authorized spend for that run regardless of which subflow did the charging. If you want independent caps per subflow, issue a separate vault key at the start of each subflow with its own budget.
Further reading
- AI agent kill switch patterns — the four ways to stop a runaway agent and their real stop latencies, including mid-flow scenarios.
- AI agent audit trail schema — what belongs in a structured per-call log and the SQL queries that matter when reviewing a billing incident.
- AI agent cost management — the full cost map: LLM spend, SaaS tool spend, and infra spend — which tool handles which.
- AI agent API key rotation — why short-lived vault keys are better than rotation schedules for agent workloads.