AutoGen agent API key management: spend caps and per-agent credentials
AutoGen's AssistantAgent and UserProxyAgent execute tool calls in a shared environment, which means every agent in a group chat shares the same Stripe or Twilio API key you put in your environment variables. This is fine for experiments. In production, it creates attribution gaps, revoke coupling, and blast-radius problems, all of which this page covers.
TL;DR
In AutoGen's default setup, all agents in a GroupChat share the process environment, including API keys. For Stripe and Twilio, that means shared spend attribution, shared rate limits, and coupled revoke. The vault key pattern issues each agent role its own vault_key_xxx with a per-agent spend cap and endpoint scope, backed by a proxy that holds the real secret. The override happens at the vendor SDK level rather than in AutoGen itself: each agent's tool client is constructed with its own vault key and the proxy base URL, and the call logic inside each tool stays essentially the same.
How AutoGen tools see API keys
AutoGen tools are Python functions registered through an agent's register_for_execution and register_for_llm decorators. When the LLM decides to call a tool, AutoGen executes the function directly in the Python process. The function sees whatever environment variables are set in that process:
import os

import stripe
from autogen import AssistantAgent, UserProxyAgent

stripe.api_key = os.environ["STRIPE_SECRET_KEY"]  # shared by all agents

def charge_customer(customer_id: str, amount_cents: int) -> str:
    # Runs in the host Python process and uses the module-level key above.
    intent = stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
    )
    return f"Charged {amount_cents} cents. PaymentIntent: {intent.id}"

billing_agent = AssistantAgent("billing_agent", system_message="You handle billing.")
user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER")

# The user proxy executes the tool; the billing agent's LLM can propose calling it.
user_proxy.register_for_execution()(charge_customer)
billing_agent.register_for_llm(name="charge_customer", description="Charge a Stripe customer")(charge_customer)
If you have a refund agent in the same GroupChat, it executes tools in the same process with the same STRIPE_SECRET_KEY. Both agents can call any Stripe endpoint the key permits.
The three problems this creates at scale
Problem 1: Spend attribution is account-level, not agent-level
Stripe records every API call under the account, not under the agent that made it. If your billing agent and refund agent both fire calls in the same time window, your Stripe Dashboard shows total charges and total refunds — with no way to split spend by which agent generated it. Post-incident forensics requires cross-referencing Stripe's Request-Id headers with your application logs, hoping your logs captured agent identity at the right granularity.
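To make the cross-referencing step concrete, here is a minimal sketch of joining application logs to Stripe-side records on the request ID. The record shapes, field names, and sample values are hypothetical; they stand in for whatever your logging and Stripe export actually produce.

```python
from collections import defaultdict

# Hypothetical application log entries: one per Stripe call, written by your own code.
app_log = [
    {"request_id": "req_001", "agent": "billing_agent"},
    {"request_id": "req_002", "agent": "refund_agent"},
    {"request_id": "req_003", "agent": "billing_agent"},
]

# Hypothetical export of Stripe-side records keyed by the same Request-Id.
stripe_records = [
    {"request_id": "req_001", "amount_cents": 1200},
    {"request_id": "req_002", "amount_cents": 500},
    {"request_id": "req_003", "amount_cents": 800},
    {"request_id": "req_004", "amount_cents": 9900},  # never logged by the app
]

def spend_by_agent(app_log, stripe_records):
    """Sum Stripe amounts per agent; requests with no log entry land under 'unknown'."""
    agent_for_request = {entry["request_id"]: entry["agent"] for entry in app_log}
    totals = defaultdict(int)
    for record in stripe_records:
        agent = agent_for_request.get(record["request_id"], "unknown")
        totals[agent] += record["amount_cents"]
    return dict(totals)

totals = spend_by_agent(app_log, stripe_records)
```

The "unknown" bucket is the attribution gap: any call the application failed to log with an agent identity cannot be assigned after the fact.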
Problem 2: One misbehaving agent forces a revoke for all
If the billing agent's behavior looks suspicious (loop detected, unexpected call pattern), you rotate the Stripe key. That rotation immediately breaks the refund agent, the webhook handler, the cron job that checks subscription statuses, and any other process using the same key. A security incident in one agent requires a production-wide key rotation.
Problem 3: No per-agent spend cap
A single STRIPE_SECRET_KEY has no spend limit. AutoGen's conversational orchestration can lead an agent into a reasoning loop — "I should try charging again, the error might be transient" — which results in repeated calls without any policy layer stopping it. Stripe's billing alerts fire hours after the excess spend; by then, the damage is done.
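The missing policy layer can be illustrated with a small in-process sketch. This is not how a proxy is implemented; it only shows the idea of checking a cap before each call. `CappedCharger`, `SpendCapExceeded`, and `fake_charge` are hypothetical names, the charge function is a stand-in that never contacts Stripe, and the sketch does not reset the counter daily.

```python
class SpendCapExceeded(Exception):
    """Raised before a call that would push cumulative spend past the cap."""

class CappedCharger:
    def __init__(self, daily_cap_cents, charge_fn):
        self.daily_cap_cents = daily_cap_cents
        self.spent_cents = 0
        self._charge_fn = charge_fn

    def charge(self, customer_id, amount_cents):
        # The check happens before the underlying call, so a blocked
        # attempt never reaches the payment API at all.
        if self.spent_cents + amount_cents > self.daily_cap_cents:
            raise SpendCapExceeded(
                f"{amount_cents} cents would exceed cap of {self.daily_cap_cents}"
            )
        result = self._charge_fn(customer_id, amount_cents)
        self.spent_cents += amount_cents
        return result

calls = []

def fake_charge(customer_id, amount_cents):
    """Stand-in for a real charge; records the call instead of contacting Stripe."""
    calls.append((customer_id, amount_cents))
    return f"pi_fake_{len(calls)}"

charger = CappedCharger(daily_cap_cents=5000, charge_fn=fake_charge)

# Simulate an agent stuck in a retry loop: ten attempts of 2000 cents each.
blocked = 0
for _ in range(10):
    try:
        charger.charge("cus_123", 2000)
    except SpendCapExceeded:
        blocked += 1
```

With a 5000-cent cap, only the first two attempts go through; the remaining eight are rejected before any call is made.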
The vault key pattern for AutoGen
The vault key pattern issues each agent role a scoped, time-limited credential. In AutoGen's architecture, the cleanest implementation is per-agent tool configuration — each agent's tools are initialized with a different vault key:
import os

import stripe
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

# stripe-python's StripeClient takes API host overrides through base_addresses.
# The SDK appends the /v1/... resource path itself, so check the proxy's
# documentation for whether its base should include the trailing /v1.
PROXY_BASE = "https://proxy.keybrake.com/stripe/v1"

# Billing agent tools: use the billing vault key
def make_billing_tools():
    billing_stripe = stripe.StripeClient(
        api_key=os.environ["BILLING_VAULT_KEY"],
        base_addresses={"api": PROXY_BASE},
    )

    def charge_customer(customer_id: str, amount_cents: int) -> str:
        intent = billing_stripe.payment_intents.create(params={
            "amount": amount_cents, "currency": "usd", "customer": customer_id
        })
        return f"PaymentIntent {intent.id}"

    return charge_customer

# Refund agent tools: use the refund vault key
def make_refund_tools():
    refund_stripe = stripe.StripeClient(
        api_key=os.environ["REFUND_VAULT_KEY"],
        base_addresses={"api": PROXY_BASE},
    )

    def issue_refund(payment_intent_id: str, amount_cents: int) -> str:
        refund = refund_stripe.refunds.create(params={
            "payment_intent": payment_intent_id, "amount": amount_cents
        })
        return f"Refund {refund.id} issued"

    return issue_refund

charge_customer = make_billing_tools()
issue_refund = make_refund_tools()
Each vault key has its own policy on the Keybrake dashboard:
# Billing vault key policy
{
  "vendor": "stripe",
  "daily_usd_cap": 5000,
  "allowed_endpoints": ["POST /v1/payment_intents"],
  "expires_in": "24h"
}

# Refund vault key policy
{
  "vendor": "stripe",
  "daily_usd_cap": 1000,
  "allowed_endpoints": ["POST /v1/refunds"],
  "expires_in": "8h"
}
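To show how such a policy could be applied to a single request, here is a rough sketch of a pre-call check covering expiry, endpoint allowlist, and daily cap. It is an illustration under assumptions, not Keybrake's implementation: `evaluate` and `parse_duration` are hypothetical helpers, and the "8h" / "30m" duration format is inferred from the policies above.

```python
from datetime import datetime, timedelta, timezone

def parse_duration(text):
    """Parse '24h' or '30m' style strings (assumed format) into a timedelta."""
    units = {"h": "hours", "m": "minutes"}
    value, unit = int(text[:-1]), text[-1]
    return timedelta(**{units[unit]: value})

def evaluate(policy, issued_at, now, method, path, amount_usd, spent_today_usd):
    """Return (allowed, reason) for one proposed request, checked before forwarding."""
    if now >= issued_at + parse_duration(policy["expires_in"]):
        return False, "expired"
    if f"{method} {path}" not in policy["allowed_endpoints"]:
        return False, "endpoint_not_allowed"
    if spent_today_usd + amount_usd > policy["daily_usd_cap"]:
        return False, "daily_cap_exceeded"
    return True, "ok"

refund_policy = {
    "vendor": "stripe",
    "daily_usd_cap": 1000,
    "allowed_endpoints": ["POST /v1/refunds"],
    "expires_in": "8h",
}

issued = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
now = issued + timedelta(hours=1)

ok = evaluate(refund_policy, issued, now, "POST", "/v1/refunds", 50, 0)
wrong_endpoint = evaluate(refund_policy, issued, now, "POST", "/v1/payment_intents", 50, 0)
over_cap = evaluate(refund_policy, issued, now, "POST", "/v1/refunds", 50, 980)
expired = evaluate(refund_policy, issued, issued + timedelta(hours=9), "POST", "/v1/refunds", 50, 0)
```

The refund key can issue refunds within its cap and lifetime, but the same key is rejected when it tries to create a payment intent, which is the endpoint scoping the policy describes.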
How this fixes all three problems
| Problem | Shared secret key | Per-agent vault key |
|---|---|---|
| Spend attribution | Account-level total only | Per-vault-key daily spend in audit log; queryable by agent role |
| Revoke coupling | Rotate key = break all agents | Revoke billing vault key = billing agent stops; refund agent unaffected |
| Spend cap | No cap; billing alert fires 1–24h late | Per-agent daily cap; fires before the call at zero lag |
Using AutoGen's nested chats with vault keys
AutoGen supports nested chats — a subagent spawned by a parent agent to handle a subtask. Each nested chat is a new agent context. If the nested billing subagent needs Stripe access, issue a child vault key derived from the parent run's vault key, with a tighter scope (lower cap, narrower endpoint allowlist, shorter expiry). When the nested chat completes, the child vault key expires automatically.
import os

import requests

def spawn_billing_subagent(parent_run_id: str) -> str:
    # Issue a child vault key via the Keybrake API, scoped tighter than its parent
    response = requests.post("https://api.keybrake.com/vault_keys", json={
        "parent_vault_key": os.environ["BILLING_VAULT_KEY"],
        "daily_usd_cap": 200,
        "allowed_endpoints": ["POST /v1/payment_intents"],
        "expires_in": "30m",
        "metadata": {"parent_run_id": parent_run_id, "role": "billing_subagent"}
    }, timeout=10)
    response.raise_for_status()  # fail loudly instead of returning a missing key
    return response.json()["vault_key"]
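Before requesting a child key, it can help to check locally that the child policy really is tighter than its parent on all three dimensions. The sketch below is a client-side sanity check only; the source does not say whether the Keybrake API enforces this server-side. `is_tighter` and `parse_duration` are hypothetical helpers.

```python
from datetime import timedelta

def parse_duration(text):
    """Parse '24h' or '30m' style strings (assumed format) into a timedelta."""
    units = {"h": "hours", "m": "minutes"}
    value, unit = int(text[:-1]), text[-1]
    return timedelta(**{units[unit]: value})

def is_tighter(parent, child):
    """True if the child never exceeds the parent on cap, endpoints, or lifetime."""
    return (
        child["daily_usd_cap"] <= parent["daily_usd_cap"]
        and set(child["allowed_endpoints"]) <= set(parent["allowed_endpoints"])
        and parse_duration(child["expires_in"]) <= parse_duration(parent["expires_in"])
    )

parent = {
    "daily_usd_cap": 5000,
    "allowed_endpoints": ["POST /v1/payment_intents"],
    "expires_in": "24h",
}
child = {
    "daily_usd_cap": 200,
    "allowed_endpoints": ["POST /v1/payment_intents"],
    "expires_in": "30m",
}
# A child that tries to add the refunds endpoint widens the scope and should fail.
widened = dict(child, allowed_endpoints=["POST /v1/payment_intents", "POST /v1/refunds"])

child_ok = is_tighter(parent, child)
widened_ok = is_tighter(parent, widened)
```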
How Keybrake fits
Keybrake issues vault keys and enforces their policies. For AutoGen, the setup is: one vault key per agent role, configured with the right scope and cap in the Keybrake dashboard, then passed as the API key in that agent's tool initializer with the proxy base_url. Free tier covers 1,000 proxied requests/month; Hobby ($29/month) adds all vendors and 30-day log retention.
Related questions
Does this work with AutoGen's Docker-based code execution sandbox?
Yes. AutoGen can execute tool code inside a Docker container for isolation. The vault key and base_url environment variables need to be passed into the container's environment — the same way you'd pass any env var to a Docker run command. The proxy call goes out from the container to the proxy endpoint; no changes to the Docker setup beyond the env var injection.
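As an illustration of the env var injection, the sketch below builds a `docker run` argument list that forwards only the named variables. It is a generic Docker pattern rather than an AutoGen API; `docker_run_args`, `STRIPE_PROXY_BASE_URL`, and `tool_runner.py` are hypothetical names, and the command is constructed but not executed here.

```python
import os

def docker_run_args(image, command, env_names, environ=os.environ):
    """Build a `docker run` argv that forwards only the named env vars."""
    args = ["docker", "run", "--rm"]
    for name in env_names:
        if name not in environ:
            raise KeyError(f"missing environment variable: {name}")
        # `-e NAME` with no value tells Docker to copy NAME from the caller's
        # environment, so the vault key itself never appears in the argv.
        args += ["-e", name]
    return args + [image, *command]

args = docker_run_args(
    "python:3.11-slim",
    ["python", "tool_runner.py"],
    ["BILLING_VAULT_KEY", "STRIPE_PROXY_BASE_URL"],
    environ={
        "BILLING_VAULT_KEY": "vault_key_example",
        "STRIPE_PROXY_BASE_URL": "https://proxy.keybrake.com/stripe/v1",
    },
)
```

Passing the list to `subprocess.run(args)` from a process that has both variables set would start the container with them available inside.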
What about AutoGen's built-in rate limiting (max_consecutive_auto_reply)?
AutoGen's max_consecutive_auto_reply limits conversation turns, not API calls within a single tool execution. A tool can make 50 Stripe calls in one AutoGen "turn" if the function loops. The proxy's pre-call spend cap is the complementary control — it fires at the API-call level, inside the tool function, before any of those calls reach Stripe.
How do I handle vault key issuance in a multi-tenant system where each customer gets their own AutoGen agent instance?
Issue a vault key per customer per session at agent initialization time. Each vault key carries the customer's context in its metadata fields and has an expires_in matching the session timeout. The audit log is then filterable by customer — one query per customer ID shows their complete API call history with spend totals. Cap is set per customer based on their plan tier, not a single shared cap across all customers.
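A per-customer issuance request could be assembled along these lines. The field names mirror the policy examples earlier on this page; the plan tiers, cap values, and the `vault_key_request` helper are hypothetical and would come from your own pricing model.

```python
# Hypothetical plan tiers and daily caps in USD; replace with your own values.
PLAN_DAILY_CAP_USD = {"free": 50, "pro": 500, "enterprise": 5000}

def vault_key_request(customer_id, plan, session_timeout_minutes):
    """Build the JSON body for issuing one vault key per customer session."""
    return {
        "vendor": "stripe",
        "daily_usd_cap": PLAN_DAILY_CAP_USD[plan],
        "allowed_endpoints": ["POST /v1/payment_intents"],
        # Expiry tracks the session timeout so the key dies with the session.
        "expires_in": f"{session_timeout_minutes}m",
        # Customer context in metadata is what makes the audit log filterable.
        "metadata": {"customer_id": customer_id, "plan": plan},
    }

body = vault_key_request("cust_42", "pro", 30)
```

Looking the cap up by plan tier keeps one customer's limit independent of every other customer's usage.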
Further reading
- CrewAI API key management — the same five problems covered for CrewAI's crew-based architecture, including hierarchical mode.
- AI agent API key best practices — the 7-control checklist that applies across all agent frameworks.
- AI agent governance tools — the broader governance landscape beyond key management.
- AI agent payment infrastructure in 2026 — the full category map for multi-agent systems that handle money.