Pydantic AI + Stripe: giving your agent a payment tool without handing it an uncapped key
Pydantic AI's structured tool system makes it easy to write type-safe Stripe tools: validated inputs, explicit return types, dependency injection. What it doesn't give you is a per-run dollar cap on what the agent can spend. That gap lives at the HTTP layer, below Pydantic AI's abstraction. This page covers the tool setup, the structural gap, and the two-line proxy fix.
TL;DR
Pydantic AI's @agent.tool decorator validates tool inputs against a Pydantic model. It catches type errors before the Stripe SDK call, not spend overruns after it. The spend-cap gap exists in every agent framework that calls Stripe directly, including Pydantic AI, and the fix is the same in each: point the Stripe SDK's API base URL (stripe.api_base in stripe-python) at a proxy, swap the key for a vault key with a daily cap, and the proxy enforces the cap before each forwarded request. The rest of the Pydantic AI tool code is unchanged.
Setting up a Stripe tool in Pydantic AI
Pydantic AI (from the Pydantic team) uses a decorator-based tool API with dependency injection via RunContext. A minimal Stripe charge tool looks like this:
import stripe
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext

# Dependencies injected into the agent run
class AgentDeps(BaseModel):
    stripe_key: str
    customer_id: str

agent = Agent('openai:gpt-4o', deps_type=AgentDeps)

# Tool arguments; Pydantic AI validates them before the function body runs
class ChargeInput(BaseModel):
    amount_cents: int
    currency: str = 'usd'
    description: str

@agent.tool
async def charge_customer(
    ctx: RunContext[AgentDeps],
    charge: ChargeInput,
) -> str:
    stripe.api_key = ctx.deps.stripe_key
    # Synchronous SDK call: it blocks the event loop while the request is in flight
    intent = stripe.PaymentIntent.create(
        amount=charge.amount_cents,
        currency=charge.currency,
        customer=ctx.deps.customer_id,
        description=charge.description,
    )
    return f"PaymentIntent {intent.id} created"
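Pydantic AI validates ChargeInput before calling the tool. If the model passes something that cannot be parsed as an integer for amount_cents (say, "twenty dollars"), the call fails with a validation error before hitting Stripe. Note that Pydantic's default lax mode will still coerce a numeric string such as "500" to 500. This is genuine value: malformed arguments are stopped at the tool boundary.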
What validation doesn't catch: a model that calls charge_customer 40 times in a single run, for example once per cart item with the full order total each time, because the system prompt never said to charge once. Each call is individually valid. The total spend is 40× what was intended.
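To make the gap concrete, here is a minimal sketch using only the standard library. The `Charge` dataclass and `is_valid` check are hypothetical stand-ins for a tool schema, and the $25.00 amount is made up; the point is only that every call passes its own check while the run total goes unexamined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charge:
    """Hypothetical stand-in for the tool's input schema."""
    amount_cents: int
    currency: str = "usd"
    description: str = ""


def is_valid(charge: Charge) -> bool:
    """Per-call check: the only kind of rule a tool schema can express."""
    return (
        isinstance(charge.amount_cents, int)
        and charge.amount_cents > 0
        and len(charge.currency) == 3
    )


intended_cents = 2_500  # one $25.00 charge was the intent (made-up figure)

# The model issues the same valid charge 40 times in one run.
calls = [Charge(intended_cents, description=f"call {i}") for i in range(1, 41)]

all_valid = all(is_valid(c) for c in calls)    # every call passes on its own
total_cents = sum(c.amount_cents for c in calls)
overrun_factor = total_cents // intended_cents  # how far past the intent the run went
```

Nothing in `is_valid` can see `total_cents`, because that number only exists across calls.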
The structured-output gap
Pydantic AI's structured outputs enforce schema correctness, not spending correctness. Those are different problems:
| What Pydantic AI validates | What it doesn't validate |
|---|---|
| Input field types and constraints | How many times the tool is called per run |
| Required fields are present | Total dollar value of charges this agent run |
| Enum values are valid | Whether a PaymentIntent was already created for this purchase |
| Nested object structure | Cumulative spend across concurrent agent runs |
The spend-cap problem lives at the API transport layer, not the tool schema layer. This is why the fix is a proxy rather than a validator.
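As a rough illustration of why this check belongs at the transport layer, the sketch below models a cap as running state keyed by API key. `SpendCapLedger` and `authorize` are hypothetical names invented for this example and say nothing about how any real proxy is implemented; the $200 cap and $25 charge are also made up.

```python
from dataclasses import dataclass, field


@dataclass
class SpendCapLedger:
    """Hypothetical proxy-side ledger: one running total per key."""
    cap_cents: int
    spent_cents: dict[str, int] = field(default_factory=dict)

    def authorize(self, key: str, amount_cents: int) -> int:
        """Return an HTTP-style status: 200 if the charge fits under the cap, else 429."""
        spent = self.spent_cents.get(key, 0)
        if spent + amount_cents > self.cap_cents:
            return 429  # rejected before anything would be forwarded upstream
        self.spent_cents[key] = spent + amount_cents
        return 200


ledger = SpendCapLedger(cap_cents=20_000)  # a $200 cap

# Forty individually valid $25.00 charges arrive on the same key.
statuses = [ledger.authorize("vault_key_demo", 2_500) for _ in range(40)]

forwarded = statuses.count(200)  # charges that fit under the cap
rejected = statuses.count(429)   # charges stopped at the transport layer
```

The decision depends on state accumulated across requests, which is exactly what a per-call schema validator never holds.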
The vault-key fix for Pydantic AI
The key change is pointing the Stripe SDK's base URL, stripe.api_base in stripe-python, at the proxy. The agent's dependency injection makes it clean to swap in the vault key per run:
import stripe
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext

class AgentDeps(BaseModel):
    vault_key: str  # vault_key_xxx from Keybrake
    customer_id: str
    agent_run_id: str

agent = Agent('openai:gpt-4o', deps_type=AgentDeps)

# ChargeInput is the same model as in the first example

@agent.tool
async def charge_customer(
    ctx: RunContext[AgentDeps],
    charge: ChargeInput,
) -> str:
    stripe.api_key = ctx.deps.vault_key
    # stripe-python appends /v1/... to api_base itself, so the proxy URL stops
    # before /v1 (confirm the exact path against Keybrake's docs)
    stripe.api_base = "https://proxy.keybrake.com/stripe"
    intent = stripe.PaymentIntent.create(
        amount=charge.amount_cents,
        currency=charge.currency,
        customer=ctx.deps.customer_id,
        description=charge.description,
        metadata={"agent_run_id": ctx.deps.agent_run_id},
    )
    return f"PaymentIntent {intent.id} created"
Issue the vault key before the agent run starts, and pass it in via AgentDeps:
import os
import httpx

# Management API key for Keybrake, read from the environment
KEYBRAKE_API_KEY = os.environ["KEYBRAKE_API_KEY"]

def create_vault_key(agent_run_id: str) -> str:
    resp = httpx.post(
        "https://api.keybrake.com/vault-keys",
        headers={"Authorization": f"Bearer {KEYBRAKE_API_KEY}"},
        json={
            "vendor": "stripe",
            "daily_usd_cap": 200,
            "allowed_endpoints": ["POST /v1/payment_intents"],
            "expires_in": "1h",
            "agent_run_id": agent_run_id,
        },
    )
    resp.raise_for_status()  # fail loudly if the key could not be issued
    return resp.json()["vault_key"]

# Inside your async checkout handler, where order_id and customer are in scope:
run_id = f"checkout_{order_id}"
deps = AgentDeps(
    vault_key=create_vault_key(run_id),
    customer_id=customer.stripe_id,
    agent_run_id=run_id,
)
result = await agent.run("Process the checkout", deps=deps)
With this setup, the agent can call charge_customer as many times as the model decides, but the total charges across the run are capped at $200. Once the cap is reached, the proxy returns 429 and the Stripe SDK raises an error inside the tool. If the tool catches that error and returns its message as the tool result, the agent sees "daily spend cap reached" as a tool error and can stop or escalate. Every call is logged with agent_run_id.
Why Pydantic AI's RunContext makes this particularly clean
Because vault keys should be per-run (issued at run start, expire at run end), Pydantic AI's RunContext dependency injection is a natural fit: the vault key is injected as a dependency alongside other run-specific data, and every tool in the agent can read it from context without global state. In frameworks that use class-level or module-level key initialization, per-run key rotation is awkward. In Pydantic AI, it's idiomatic.
How Keybrake fits
Keybrake issues vault keys, enforces the policy at the Stripe HTTP layer, and logs every call. The Pydantic AI tool code changes by two lines: the API key (stripe.api_key) and the API base URL (stripe.api_base). The Free tier covers 1,000 proxied requests/month; the Hobby tier ($29/month) adds all vendors and 30-day audit log retention.
Related questions
Can I use Pydantic AI's structured result types to capture the vault key policy response?
Yes. If you define a VaultKeyResponse Pydantic model matching Keybrake's API response, the vault key creation call can return a typed result. The daily_usd_cap, expires_at, and allowed_endpoints fields all serialize cleanly. This is particularly useful if you want to surface the remaining cap to the agent in the tool result — the agent can then decide whether to batch differently.
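A minimal sketch of such a model, assuming Pydantic v2. The field names come from the request and response shown earlier on this page, but their types, the sample payload, and the `describe_policy` helper are assumptions made for illustration, not Keybrake's documented schema.

```python
from datetime import datetime

from pydantic import BaseModel


class VaultKeyResponse(BaseModel):
    """Assumed shape of the vault-key creation response (types are guesses)."""
    vault_key: str
    daily_usd_cap: int
    allowed_endpoints: list[str]
    expires_at: datetime


def describe_policy(policy: VaultKeyResponse) -> str:
    """Hypothetical helper: a short policy summary a tool could return to the agent."""
    endpoints = ", ".join(policy.allowed_endpoints)
    return f"Spend cap: ${policy.daily_usd_cap}/day; allowed endpoints: {endpoints}"


# Made-up payload standing in for resp.json() from the key-creation call.
payload = {
    "vault_key": "vault_key_demo",
    "daily_usd_cap": 200,
    "allowed_endpoints": ["POST /v1/payment_intents"],
    "expires_at": "2030-01-01T12:00:00Z",
}

policy = VaultKeyResponse.model_validate(payload)
summary = describe_policy(policy)
```

This surfaces only the configured cap. Whether the API also reports the remaining cap is not something this sketch assumes.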
Does this work with Pydantic AI's streaming mode?
Yes. The proxy sits at the HTTP transport layer, so it intercepts requests before they reach Stripe regardless of whether Pydantic AI is streaming the model's output. Streaming affects how the model's text response is returned; it doesn't affect how tool calls are executed. From the proxy's perspective, each tool call is still an ordinary request/response round trip, even when the surrounding text response is streamed.
What's the difference between using Pydantic AI tool validation vs. using the proxy?
Pydantic AI validation fires before the Stripe call and catches schema errors (wrong types, missing fields, invalid enum values). The proxy fires at the HTTP layer and catches behavioral errors (too many calls, too much spend, disallowed endpoints). They're complementary layers addressing different failure modes. Run both — validation prevents bad requests from reaching Stripe at all; the proxy caps the blast radius of valid-but-excessive requests.
Further reading
- OpenAI Agents SDK + Stripe — the same vault-key pattern applied to OpenAI's function tool system.
- LangChain Stripe API key — StripeTool setup and the three gaps Stripe's restricted keys leave open.
- AI agent API key best practices — framework-agnostic checklist for every SaaS key your agent touches.
- CrewAI API key management — per-agent vault keys in multi-agent CrewAI systems.