OpenAI Agents SDK + Stripe: wiring function tools safely
The OpenAI Agents SDK makes adding a Stripe function tool trivial. The same property — easy to add — makes the spend-cap gap easy to miss. This page covers the gap, the two-line SDK change that routes calls through a vault-key proxy, and what the audit log reveals that Stripe's own dashboard can't show you.
TL;DR
OpenAI Agents SDK function tools wrap any Python callable, so a Stripe tool is 10-20 lines. The gap: the SDK gives the agent unrestricted access to whatever Stripe key you pass, with no per-run budget, no mid-run revoke, and no per-call audit with agent context. Fix: swap stripe.api_key for a vault_key_xxx and set stripe.api_base = "https://proxy.keybrake.com/stripe" (stripe-python appends the /v1/... path itself, so requests land under https://proxy.keybrake.com/stripe/v1). The proxy holds the real secret, enforces the policy, and logs every call with agent_run_id, the identifier you need for incident forensics.
Setting up a Stripe function tool in the Agents SDK
The OpenAI Agents SDK defines tools as decorated Python functions. A Stripe charge tool looks like this:
import os
import stripe
from agents import function_tool
# Read the secret key from the environment rather than hard-coding it.
stripe.api_key = os.environ["STRIPE_SECRET_KEY"]
@function_tool
def charge_customer(customer_id: str, amount_cents: int, currency: str = "usd") -> str:
    """Charge a Stripe customer. Returns the PaymentIntent ID and status."""
    intent = stripe.PaymentIntent.create(
        amount=amount_cents,  # smallest currency unit, e.g. 4900 for $49.00
        currency=currency,
        customer=customer_id,
        confirm=True,  # create and confirm in a single call
        automatic_payment_methods={"enabled": True, "allow_redirects": "never"},
    )
    return f"PaymentIntent {intent.id} status: {intent.status}"
Pass it to an agent:
from agents import Agent, Runner
billing_agent = Agent(
    name="Billing Agent",
    instructions="You process customer payments. When asked to charge a customer, call charge_customer.",
    tools=[charge_customer],
)
result = Runner.run_sync(billing_agent, "Charge customer cus_ABC123 for $49.")
That's the working demo. The problem is STRIPE_SECRET_KEY: it has no per-run budget, and if the agent enters a reasoning loop ("I should retry this charge"), it can call charge_customer dozens of times before the loop resolves.
The three gaps a Stripe restricted key doesn't fill
Most engineers reach for a Stripe restricted key first. Restricting the key to payment_intents:write is the right first step — it prevents the agent from issuing refunds or changing account settings. But three gaps remain:
- No per-run dollar cap. A restricted key can be scoped by endpoint, not by spend. If your agent is supposed to charge $49 and instead charges $49 forty times, the restricted key allows all forty charges.
- No mid-run revoke without production impact. Revoking the restricted key stops the runaway agent — but it also stops every other consumer of that key. If you share the key between staging and production agents, or between this agent and a cron job, you've taken down everything to stop one thing.
- No per-call audit with agent context. Stripe logs API calls by IP. You can't join on agent_run_id, agent_name, or the model invocation that triggered the call. Post-incident forensics is manual timeline reconstruction.
The two-line fix: vault key + proxy base_url
The fix is two lines in the function tool setup:
import os
import stripe
from agents import function_tool
# Two-line change from the original setup:
stripe.api_key = os.environ["VAULT_KEY"]  # vault_key_xxx instead of STRIPE_SECRET_KEY
# stripe-python's module-level override is api_base; the library appends /v1/... itself,
# so calls go to https://proxy.keybrake.com/stripe/v1/... instead of api.stripe.com
stripe.api_base = "https://proxy.keybrake.com/stripe"
@function_tool
def charge_customer(customer_id: str, amount_cents: int, currency: str = "usd") -> str:
    """Charge a Stripe customer. Returns the PaymentIntent ID and status."""
    intent = stripe.PaymentIntent.create(
        amount=amount_cents,  # smallest currency unit, e.g. 4900 for $49.00
        currency=currency,
        customer=customer_id,
        confirm=True,
        automatic_payment_methods={"enabled": True, "allow_redirects": "never"},
    )
    return f"PaymentIntent {intent.id} status: {intent.status}"
The vault key policy you configure on the Keybrake dashboard:
{
  "vendor": "stripe",
  "daily_usd_cap": 200,
  "allowed_endpoints": [
    "POST /v1/payment_intents"
  ],
  "max_amount_per_call_usd": 500,
  "expires_in": "4h",
  "metadata": {
    "agent_name": "billing_agent",
    "agent_run_id": "run_20260531_abc"
  }
}
With this policy, the agent can charge up to $200/day total, each individual charge can't exceed $500, it can only call POST /v1/payment_intents (no refunds, no customer mutations), and the vault key expires after 4 hours. All of this fires before the call reaches Stripe — the 429 on cap breach is a pre-spend rejection.
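To make the order of checks concrete, here is a minimal sketch of how a pre-spend policy check could behave. It is a hypothetical model, not Keybrake's implementation: the function name `check_policy`, the evaluation order, and the `amount_denied` and `key_expired` labels are assumptions. Only `allowed`, `cap_exceeded`, and `endpoint_denied` come from this page.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical model of a pre-spend policy check. Field names mirror the
# policy JSON above; everything else is an assumption for illustration.
POLICY = {
    "daily_usd_cap": 200,
    "allowed_endpoints": ["POST /v1/payment_intents"],
    "max_amount_per_call_usd": 500,
}


def check_policy(policy, endpoint, amount_usd, spent_today_usd,
                 issued_at, now, ttl=timedelta(hours=4)):
    """Return a policy_check_result string for one proposed call."""
    if now >= issued_at + ttl:
        return "key_expired"      # assumed label
    if endpoint not in policy["allowed_endpoints"]:
        return "endpoint_denied"
    if amount_usd > policy["max_amount_per_call_usd"]:
        return "amount_denied"    # assumed label
    if spent_today_usd + amount_usd > policy["daily_usd_cap"]:
        return "cap_exceeded"
    return "allowed"


issued = datetime(2026, 5, 31, 9, 0, tzinfo=timezone.utc)
now = issued + timedelta(minutes=5)

# Replay a stuck loop: the agent keeps trying to charge $49.
spent = 0
results = []
for _ in range(6):
    outcome = check_policy(POLICY, "POST /v1/payment_intents", 49, spent, issued, now)
    results.append(outcome)
    if outcome == "allowed":
        spent += 49  # only allowed calls count toward the daily total

print(results)  # four "allowed", then two "cap_exceeded"
print(spent)    # 196
```

Under this model the fifth $49 charge is rejected because 196 + 49 exceeds the $200 cap. Note that with a $200 daily cap, the $500 per-call limit only becomes the binding constraint if the daily cap is raised above it.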
What the audit log reveals
The proxy records every call with the columns you can't get from Stripe alone:
| Column | From Stripe | From Keybrake audit log |
|---|---|---|
| request_id | Yes (req_xxx) | Yes (plus mapped to vault_key) |
| timestamp | Yes | Yes |
| amount charged | Yes (per call) | Yes + running daily total |
| agent_run_id | No | Yes (if passed in vault key metadata) |
| agent_name | No | Yes (from vault key metadata) |
| policy_check_result | No | Yes (allowed / cap_exceeded / endpoint_denied) |
| calls that hit the cap | No | Yes (policy_check_result = cap_exceeded) |
The query that matters most after an incident:
-- All calls from a specific agent run, in order
SELECT ts, endpoint, amount_usd, policy_check_result, cap_usage_after
FROM calls
WHERE metadata->>'agent_run_id' = 'run_20260531_abc'
ORDER BY ts ASC;
This gives you a chronological view of exactly what the agent did, what the cap was at each step, and which call (if any) triggered the enforcement. No manual timestamp correlation with separate logs required.
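The same reconstruction can be tried locally. The sketch below uses an in-memory SQLite table with made-up sample rows; the column names follow the query above, but `agent_run_id` is stored as a plain column (an assumption made for portability) rather than inside a JSON metadata field, and the timestamps and the second run ID are illustrative only.

```python
import sqlite3

# Hypothetical stand-in for the audit log; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE calls (
        ts TEXT, endpoint TEXT, amount_usd REAL,
        policy_check_result TEXT, cap_usage_after REAL, agent_run_id TEXT
    )"""
)
ENDPOINT = "POST /v1/payment_intents"
rows = [
    # Inserted out of order on purpose; ORDER BY restores the timeline.
    ("2026-05-31T09:00:15Z", ENDPOINT, 49, "allowed", 147, "run_20260531_abc"),
    ("2026-05-31T09:00:03Z", ENDPOINT, 49, "allowed", 49, "run_20260531_abc"),
    ("2026-05-31T09:00:27Z", ENDPOINT, 49, "cap_exceeded", 196, "run_20260531_abc"),
    ("2026-05-31T09:00:09Z", ENDPOINT, 49, "allowed", 98, "run_20260531_abc"),
    ("2026-05-31T09:00:21Z", ENDPOINT, 49, "allowed", 196, "run_20260531_abc"),
    ("2026-05-31T09:00:10Z", ENDPOINT, 20, "allowed", 20, "run_20260531_xyz"),
]
conn.executemany("INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?)", rows)

timeline = conn.execute(
    """SELECT ts, endpoint, amount_usd, policy_check_result, cap_usage_after
       FROM calls
       WHERE agent_run_id = ?
       ORDER BY ts ASC""",
    ("run_20260531_abc",),
).fetchall()

# The first row where enforcement fired marks the end of the spend.
first_blocked = next((r for r in timeline if r[3] == "cap_exceeded"), None)

for row in timeline:
    print(row)
print("first blocked call:", first_blocked[0] if first_blocked else None)
```

ISO 8601 timestamps in a single timezone sort correctly as text, which is why `ORDER BY ts` works here without a date type.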
Handling the 429 in the Agents SDK
When the vault key's daily cap is exceeded, the proxy returns a 429 with a Reason: daily_spend_cap_exceeded header. The Stripe SDK will raise a stripe.error.RateLimitError. Your function tool should surface this clearly:
@function_tool
def charge_customer(customer_id: str, amount_cents: int, currency: str = "usd") -> str:
    """Charge a Stripe customer. Returns the PaymentIntent ID or an error."""
    try:
        intent = stripe.PaymentIntent.create(
            amount=amount_cents,
            currency=currency,
            customer=customer_id,
            confirm=True,
            automatic_payment_methods={"enabled": True, "allow_redirects": "never"},
        )
        return f"PaymentIntent {intent.id} status: {intent.status}"
    except stripe.error.RateLimitError as e:
        # Return a string instead of raising so the model sees why the call failed.
        return f"Charge blocked: daily spend cap reached. Error: {e.user_message}"
The OpenAI Agents SDK surfaces the tool's return value to the model. When the model sees "Charge blocked: daily spend cap reached," it should stop retrying and inform the user. If the model keeps retrying, the proxy keeps returning 429 — the cap is enforced regardless of how many times the tool is called.
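One caveat: a 429 can also be a genuine Stripe rate limit, where retrying later is reasonable. A minimal sketch of telling the two apart by the Reason header follows. The helper name `describe_429` is hypothetical, and it assumes the proxy's response headers reach your code unchanged.

```python
def describe_429(http_status, headers):
    """Turn a 429 from the proxy or from Stripe into a message the model can act on."""
    if http_status != 429:
        return f"Unexpected status {http_status}; not a rate-limit response."
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {str(k).lower(): v for k, v in (headers or {}).items()}
    if normalized.get("reason") == "daily_spend_cap_exceeded":
        return "Charge blocked: daily spend cap reached. Do not retry; tell the user."
    return "Stripe rate limit hit. Retrying later may succeed."


print(describe_429(429, {"Reason": "daily_spend_cap_exceeded"}))
print(describe_429(429, {}))
```

Inside the except block of the tool you could then return `describe_429(e.http_status, e.headers)`, since stripe-python error objects carry both attributes. Whether the Reason header survives into `e.headers` in your setup is worth verifying before relying on it.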
How Keybrake fits
Keybrake is the proxy: a two-line change per agent and one dashboard to configure vault key policies. The Free tier handles 1,000 proxied calls/month; the Hobby tier ($29/month) adds all vendors (Stripe, Twilio, Resend) and 30-day audit log retention.
Related questions
Does this work with the OpenAI Agents SDK's hosted tool execution (not local Python)?
If you're running tools locally (in your Python process), yes: the base URL override works exactly as described. If you're using OpenAI's hosted tool execution (where OpenAI runs the tool in their infrastructure), the vault key and the base URL override need to be set in that environment. Check whether your hosted tool execution environment allows custom configuration such as env vars; if so, the same two-line change applies.
Can I use per-run vault keys with the Agents SDK's streaming mode?
Yes. The vault key is a per-run credential; you issue it before starting the agent run and revoke it when the run completes (or set expires_in to the expected run duration). The streaming mode affects how the agent's output is consumed, not how tool calls are issued — the tool calls still go through the proxy synchronously.
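The issue-then-revoke lifecycle maps naturally onto a context manager. The sketch below is hypothetical: this page does not show an API for issuing or revoking vault keys, so `issue_key` and `revoke_key` are placeholder callables you would replace with whatever interface is available, and the fake versions exist only so the example runs offline.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def per_run_vault_key(issue_key, revoke_key, agent_name, expires_in="4h"):
    """Issue a vault key for one run and revoke it afterwards, even on error."""
    run_id = f"run_{uuid.uuid4().hex[:12]}"
    policy = {
        "vendor": "stripe",
        "expires_in": expires_in,
        "metadata": {"agent_name": agent_name, "agent_run_id": run_id},
    }
    key = issue_key(policy)  # hypothetical callable
    try:
        yield key, run_id
    finally:
        revoke_key(key)      # hypothetical callable; runs on success and failure


# Fake issuer and revoker so the sketch runs without network access.
issued, revoked = [], []


def fake_issue(policy):
    issued.append(policy)
    return f"vault_key_{len(issued)}"


def fake_revoke(key):
    revoked.append(key)


with per_run_vault_key(fake_issue, fake_revoke, "billing_agent") as (key, run_id):
    # In a real run: set stripe.api_key = key, then start the agent.
    active_key = key

print(active_key, run_id, revoked)
```

The `finally` clause is the point of the pattern: the key is revoked whether the run finishes, raises, or is cancelled, and `expires_in` remains as a backstop if the process dies before the revoke executes.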
What's the latency overhead of the proxy hop?
6–14ms when the proxy is in the same region as your agent, 40–80ms cross-region. Stripe's own median response time for payment_intents.create is 250–600ms, so the same-region hop adds roughly 1–6% overhead and the cross-region hop roughly 7–32%. For the use cases this page covers (preventing runaway spend), that overhead is worth it: a single stuck-loop incident that repeats a $49 charge forty times costs $1,960, more than five years of Hobby tier.
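The overhead percentage depends on which ends of the two latency ranges you pair. A quick sketch of the bounds, treating the quoted figures as hard minimums and maximums (an assumption; they are more likely typical values):

```python
def overhead_range(proxy_ms, stripe_ms):
    """Return (min, max) proxy overhead as a percentage of Stripe's latency."""
    proxy_lo, proxy_hi = proxy_ms
    stripe_lo, stripe_hi = stripe_ms
    best = 100 * proxy_lo / stripe_hi   # fastest proxy hop over slowest Stripe call
    worst = 100 * proxy_hi / stripe_lo  # slowest proxy hop over fastest Stripe call
    return round(best, 1), round(worst, 1)


STRIPE_MS = (250, 600)  # payment_intents.create range quoted above
same_region = overhead_range((6, 14), STRIPE_MS)
cross_region = overhead_range((40, 80), STRIPE_MS)

print(same_region)   # (1.0, 5.6)
print(cross_region)  # (6.7, 32.0)
```

If cross-region overhead matters for your workload, the practical lever is placing the agent in the same region as the proxy.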
Further reading
- Stripe API key with restricted access — what Stripe's native restricted keys cover and the 10-control gap analysis.
- Giving Stripe Agent Toolkit an off-switch — the MCP-based Stripe Agent Toolkit has the same gap; this post covers the same proxy pattern for that setup.
- AI agent kill switch patterns — the four ways to stop a running agent, with real latency numbers.
- AI agent audit trail — the audit schema and SQL queries for reconstructing what an agent did with a key.