Django AI agent API key management: scoped vault keys for agent views and tasks
Django is a popular choice for building AI agent tool backends: a DRF view or class-based view receives tool call arguments from a LangChain, OpenAI Agents SDK, or CrewAI agent, calls Stripe or Twilio using settings.STRIPE_API_KEY, and returns a JSON response. The problem is structural: settings.py is loaded once per process at startup, so every gunicorn worker, and every request those workers serve, uses the same credential. If twenty users run agents concurrently, all twenty share one Stripe key, with no per-user spend cap, no per-request endpoint allowlist, and no way to stop one runaway agent without rotating the key for everyone. Vault keys add per-request credential scoping through Django middleware without changing your view signatures or ORM patterns.
TL;DR
Store the Keybrake API token in settings.py instead of the raw vendor secret. Write a Django middleware class that issues a vault key before calling get_response, attaches it to request.vault_key, and revokes it once the response comes back. Views and DRF serializers read request.vault_key and call https://proxy.keybrake.com/stripe/... rather than the Stripe SDK directly. View signatures stay the same, because the vault key travels on the request object Django already passes everywhere.
The Django AI agent tool backend pattern
A typical Django AI agent tool backend looks like this:
```python
from django.conf import settings
import stripe
from rest_framework.views import APIView
from rest_framework.response import Response

stripe.api_key = settings.STRIPE_API_KEY  # shared across ALL workers


class ChargeTool(APIView):
    def post(self, request):
        intent = stripe.PaymentIntent.create(
            amount=request.data["amount_cents"],
            currency="usd",
            customer=request.data["customer_id"],
        )
        return Response(intent)
```
This is clean and idiomatic Django. The problem: stripe.api_key is a module-level global. When gunicorn spins up four workers, all four share the same Stripe credential. An agent stuck in a retry loop in one user's session can rack up charges using any other user's customer data — there's no per-request spending boundary. And rotating the key to stop the runaway agent disrupts every other active session simultaneously.
Adding vault keys via Django middleware
Django middleware is the right integration point — it wraps every request and response, runs before your views, and has access to the request object that views already receive:
```python
import logging

import requests as http
from django.conf import settings

logger = logging.getLogger(__name__)


class VaultKeyMiddleware:
    """Issues a per-request vault key and attaches it to request.vault_key."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Skip non-agent endpoints (admin, static, health checks)
        if not request.path.startswith("/tools/"):
            return self.get_response(request)

        user_id = request.headers.get("X-User-Id", "anonymous")
        run_id = request.headers.get("X-Agent-Run-Id", "unknown")
        try:
            resp = http.post(
                "https://api.keybrake.com/v1/keys",
                headers={"Authorization": f"Bearer {settings.KEYBRAKE_TOKEN}"},
                json={
                    "label": f"django-{user_id}-{run_id}",
                    "vendor": "stripe",
                    "allowed_endpoints": [
                        "/v1/payment_intents",
                        "/v1/payment_intents/*",
                    ],
                    "daily_usd_cap": 500,
                    "expires_in": "5m",
                },
                timeout=5,
            )
            resp.raise_for_status()
            key_data = resp.json()
            request.vault_key = key_data["token"]
            request.vault_key_id = key_data["id"]
        except Exception:
            # Log and fail open or closed depending on your risk tolerance.
            # Here the view sees vault_key = None and decides what to return.
            logger.exception("Vault key issuance failed")
            request.vault_key = None
            request.vault_key_id = None

        response = self.get_response(request)

        # Revoke vault key after the view has finished
        if getattr(request, "vault_key_id", None):
            try:
                http.delete(
                    f"https://api.keybrake.com/v1/keys/{request.vault_key_id}",
                    headers={"Authorization": f"Bearer {settings.KEYBRAKE_TOKEN}"},
                    timeout=5,
                )
            except Exception:
                pass  # TTL is the safety net if revocation fails
        return response
```
Register the middleware in settings.py:
```python
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # ... other middleware ...
    "yourapp.middleware.VaultKeyMiddleware",  # add after auth middleware
]
```
Now views receive the vault key on the request object:
```python
import requests as http
from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView


class ChargeTool(APIView):
    def post(self, request):
        # The middleware sets vault_key to None when issuance fails
        if not request.vault_key:
            return Response(
                {"error": "vault_key_unavailable"},
                status=status.HTTP_503_SERVICE_UNAVAILABLE,
            )
        resp = http.post(
            "https://proxy.keybrake.com/stripe/v1/payment_intents",
            headers={"Authorization": f"Bearer {request.vault_key}"},
            json={
                "amount": request.data["amount_cents"],
                "currency": "usd",
                "customer": request.data["customer_id"],
            },
            timeout=10,  # never let a tool call hang the worker
        )
        if resp.status_code == 429 and resp.json().get("code") == "cap_exhausted":
            return Response(
                {"error": "spend_cap_exceeded"},
                status=status.HTTP_402_PAYMENT_REQUIRED,
            )
        resp.raise_for_status()
        return Response(resp.json())
```
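An agent usually needs to know whether a failed tool call is worth retrying. The following is a minimal sketch of one way to centralize that mapping in a pure function. Only the `429` plus `cap_exhausted` case comes from the view above; the other branches, the `retryable` field, and the function name `map_proxy_response` are illustrative assumptions, not documented Keybrake behavior.

```python
from typing import Any, Mapping, Optional, Tuple


def map_proxy_response(
    status_code: int, body: Optional[Mapping[str, Any]]
) -> Tuple[dict, int]:
    """Translate a proxy response into (payload, HTTP status) for the agent."""
    body = body or {}
    if status_code == 429 and body.get("code") == "cap_exhausted":
        # Spend cap hit: retrying will not help until the cap resets.
        return {"error": "spend_cap_exceeded", "retryable": False}, 402
    if status_code == 429:
        # Assumption: any other 429 is ordinary rate limiting.
        return {"error": "rate_limited", "retryable": True}, 429
    if status_code >= 500:
        # Assumption: upstream 5xx errors are transient.
        return {"error": "upstream_unavailable", "retryable": True}, 502
    if status_code >= 400:
        # Assumption: remaining 4xx errors mean the request itself was rejected.
        return {"error": "upstream_rejected", "retryable": False}, 400
    return dict(body), 200
```

A view could then end with `payload, code = map_proxy_response(resp.status_code, resp.json())` followed by `return Response(payload, status=code)`, which keeps the error vocabulary identical across every tool endpoint.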
Vault keys with Django REST Framework serializers and viewsets
DRF viewsets and serializers work cleanly with the middleware approach because request is passed through the DRF call stack just as it is in class-based views. If you use DRF serializer context to pass the request into nested serializers, the vault key travels with it:
```python
import requests
from rest_framework import serializers, viewsets


class BillingSerializer(serializers.Serializer):
    customer_id = serializers.CharField()
    amount_cents = serializers.IntegerField()

    def create(self, validated_data):
        # DRF's Request proxies unknown attributes to the underlying
        # HttpRequest, so the middleware's vault_key is visible here
        request = self.context["request"]
        vault_key = request.vault_key
        resp = requests.post(
            "https://proxy.keybrake.com/stripe/v1/payment_intents",
            headers={"Authorization": f"Bearer {vault_key}"},
            json={
                "amount": validated_data["amount_cents"],
                "currency": "usd",
                "customer": validated_data["customer_id"],
            },
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()


class BillingViewSet(viewsets.GenericViewSet):
    serializer_class = BillingSerializer

    def get_serializer_context(self):
        # GenericAPIView already puts "request" in the context;
        # the override only makes the dependency explicit
        ctx = super().get_serializer_context()
        ctx["request"] = self.request  # vault_key is on self.request
        return ctx
```
Django + Celery: vault keys for async agent tasks
Many Django AI agent backends enqueue Celery tasks from views. The vault key issued in Django middleware does not cross the Celery task boundary — Celery workers don't have access to the request object. Issue a fresh vault key at the start of the Celery task instead:
```python
from celery import shared_task
import requests as http
from django.conf import settings


@shared_task(bind=True, max_retries=3)
def run_billing_agent(self, run_id: str, customer_ids: list):
    auth = {"Authorization": f"Bearer {settings.KEYBRAKE_TOKEN}"}

    # Issue vault key at task start, not in the Django request
    resp = http.post(
        "https://api.keybrake.com/v1/keys",
        headers=auth,
        json={
            "label": f"celery-billing-{run_id}",
            "vendor": "stripe",
            "allowed_endpoints": ["/v1/payment_intents", "/v1/payment_intents/*"],
            "daily_usd_cap": 5000,
            "expires_in": "30m",  # longer TTL for async tasks
        },
        timeout=5,
    )
    resp.raise_for_status()
    key_data = resp.json()
    vault_key, key_id = key_data["token"], key_data["id"]

    try:
        for customer_id in customer_ids:
            # process_single_customer is your own per-customer helper (not shown)
            process_single_customer(customer_id, vault_key)
    except Exception as exc:
        # The finally block revokes this key before the retry issues a new one
        raise self.retry(exc=exc, countdown=60)
    finally:
        # Revoke exactly once, whether the task succeeded, failed, or retries
        try:
            http.delete(
                f"https://api.keybrake.com/v1/keys/{key_id}",
                headers=auth,
                timeout=5,
            )
        except http.RequestException:
            pass  # TTL is the safety net if revocation fails
```
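The issue-at-start, revoke-on-exit shape of the task above is not specific to Celery. As a rough sketch, it can be captured in a context manager so the revocation cannot be forgotten or run twice. The name `vault_key_scope` and the `issue` and `revoke` callables are hypothetical; in practice they would wrap the Keybrake HTTP calls shown in the task.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Tuple


@contextmanager
def vault_key_scope(
    issue: Callable[[], Tuple[str, str]],
    revoke: Callable[[str], None],
) -> Iterator[str]:
    """Issue a key on entry and revoke it exactly once on exit.

    `issue` returns (token, key_id); `revoke` takes the key_id.
    Both are supplied by the caller (hypothetical helpers).
    """
    token, key_id = issue()
    try:
        yield token
    finally:
        try:
            revoke(key_id)
        except Exception:
            # Revocation is best-effort; the key's TTL is the safety net.
            pass
```

Inside a task body this reads as `with vault_key_scope(issue, revoke) as vault_key:` followed by the per-customer loop. An exception raised inside the block still triggers the revocation before it propagates to Celery's retry handling.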
Performance and concurrency considerations
| Concern | Impact | Mitigation |
|---|---|---|
| Middleware latency | ~30–80ms per request (synchronous HTTPS to api.keybrake.com) | Switch to Django async views + aiohttp for high-throughput; or pre-issue keys per user session with 15-minute TTL and cache in Django's session store |
| Middleware failure | If issuance fails, request.vault_key = None; views must check and return 503 | Add middleware-level circuit breaker; Keybrake API uptime SLA is 99.9%, so treat failures as transient and return a retryable error code to the agent |
| Gunicorn worker isolation | Each worker issues independent vault keys, with no shared state between workers | Expected behavior. KEYBRAKE_TOKEN in settings.py is the master credential; vault keys are the per-request layer. |
| Stripe SDK | The Stripe SDK's module-level stripe.api_key is still set from settings; must use raw HTTP calls to the proxy instead | Use requests.post("https://proxy.keybrake.com/stripe/...") instead of stripe.PaymentIntent.create(). Not every version of the Stripe SDK supports pointing at a custom base URL for the proxy pattern; check whether StripeClient in stripe-python 8.x lets you override the API base address. |
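The "middleware-level circuit breaker" mitigation can be sketched in a few lines. This is a minimal, process-local version under stated assumptions: the class name, the thresholds, and the injectable `clock` parameter are all illustrative, and because each gunicorn worker is a separate process, each worker would keep its own breaker state.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stops calling a failing dependency for `reset_after` seconds."""

    def __init__(
        self,
        max_failures: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True  # closed: calls flow normally
        # Open: block until the cooldown has elapsed, then allow trial
        # calls; the next recorded failure re-opens the breaker.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the breaker and forget past failures."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the breaker once the limit is reached."""
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()
```

In the middleware, the idea would be to check `breaker.allow()` before the issuance call, skip straight to `request.vault_key = None` when it returns False, and call `record_success()` or `record_failure()` based on the outcome. That way a Keybrake outage costs each request microseconds rather than a full five-second timeout.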
Related questions
Can I use vault keys with Django's async views (ASGI + Daphne/Uvicorn)?
Yes. With async views you can use httpx.AsyncClient for vault key issuance so the call does not block the event loop. Django async middleware takes the same request argument as sync middleware: mark the class as async-capable and define async def __call__(self, request), rather than the raw ASGI signature (scope, receive, send). Note that calling the synchronous ORM directly from an async view raises SynchronousOnlyOperation. If your tool views call the ORM and then the proxy, wrap the ORM calls in sync_to_async (or use the async queryset methods available since Django 4.1), or keep vault key issuance as a synchronous middleware layer and let gunicorn (sync) or uvicorn (async) manage concurrency at the worker level. For pure async Django setups, httpx async is recommended over the synchronous requests library.
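The control flow of an async variant can be sketched without any framework imports. The coroutine below is a framework-agnostic core, not a complete Django middleware: `call_with_vault_key`, `issue`, and `revoke` are hypothetical names, and in a real project `issue` and `revoke` would wrap httpx.AsyncClient calls while an async-capable middleware's `__call__(self, request)` would delegate to this function.

```python
import asyncio
from types import SimpleNamespace


async def call_with_vault_key(request, get_response, issue, revoke):
    """Issue a key, run the downstream handler, then revoke the key.

    `issue(request)` returns (token, key_id); `revoke(key_id)` returns None.
    Both are async callables supplied by the caller (hypothetical helpers).
    """
    try:
        request.vault_key, request.vault_key_id = await issue(request)
    except Exception:
        # Issuance failed: the view sees None and should answer with a 503.
        request.vault_key = request.vault_key_id = None
    try:
        return await get_response(request)
    finally:
        if request.vault_key_id is not None:
            try:
                await revoke(request.vault_key_id)
            except Exception:
                pass  # the key's TTL is the safety net


async def demo():
    """Run the wrapper against stand-in issue, revoke, and view coroutines."""
    async def issue(request):
        return ("vk_demo", "key_1")

    async def revoke(key_id):
        return None

    async def view(request):
        return {"used_key": request.vault_key}

    return await call_with_vault_key(SimpleNamespace(), view, issue, revoke)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The `finally` block matters more here than in the sync version, since an async handler can be cancelled mid-flight when the client disconnects, and the key should still be revoked.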
How do vault keys work with Django's multi-tenancy patterns (tenant-per-schema, django-tenant-schemas)?
Vault keys complement multi-tenant Django setups cleanly. In a tenant-per-schema setup, the request object typically carries a tenant attribute set by the tenant middleware. Your vault key middleware runs after the tenant middleware and can include the tenant identifier in the vault key label: "label": f"tenant-{request.tenant.schema_name}-{run_id}". Set per-tenant daily caps and allowed endpoints according to each tenant's plan. The audit log in Keybrake will be filterable by tenant label, giving you per-tenant spend visibility on top of your existing per-tenant schema isolation.
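As an illustration of per-tenant caps, the key request body can be built from a plan lookup. This is a sketch only: the plan names, the cap values, and the helper `build_tenant_key_request` are placeholders, while the payload fields mirror the ones used in the middleware earlier in this article.

```python
from typing import Any, Dict

# Hypothetical plan table: plan names and limits are placeholders.
PLAN_LIMITS: Dict[str, Dict[str, Any]] = {
    "starter": {
        "daily_usd_cap": 100,
        "allowed_endpoints": ["/v1/payment_intents"],
    },
    "pro": {
        "daily_usd_cap": 500,
        "allowed_endpoints": ["/v1/payment_intents", "/v1/payment_intents/*"],
    },
}


def build_tenant_key_request(schema_name: str, run_id: str, plan: str) -> Dict[str, Any]:
    """Build the JSON body for a vault key scoped to one tenant's plan."""
    limits = PLAN_LIMITS[plan]  # raises KeyError for an unknown plan
    return {
        "label": f"tenant-{schema_name}-{run_id}",
        "vendor": "stripe",
        "allowed_endpoints": list(limits["allowed_endpoints"]),
        "daily_usd_cap": limits["daily_usd_cap"],
        "expires_in": "5m",
    }
```

In the middleware, the call would look like `build_tenant_key_request(request.tenant.schema_name, run_id, plan)`, where `plan` comes from wherever your tenant model stores it. Failing loudly on an unknown plan is deliberate: a silent default cap is harder to notice than a 500.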
What's the right approach for Django management commands that call vendor APIs?
Management commands don't go through the request/response lifecycle, so Django middleware doesn't run. Issue vault keys at the start of the management command's handle() method and revoke them at the end — similar to the Celery task pattern. For management commands that fan out work across multiple threads or processes (using multiprocessing or concurrent.futures), issue a separate vault key per worker process so each has an independent spend cap. The per-process approach matches the vault key model: one cap per unit of autonomous work, regardless of whether that unit is an HTTP request, a Celery task, or a management command worker.
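The fan-out case can be sketched with concurrent.futures. To keep the example short and runnable it uses threads, but the same structure applies to worker processes. `fan_out`, `run_chunk`, and the `issue`, `revoke`, and `process` callables are hypothetical names, standing in for your Keybrake calls and per-item work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence, Tuple


def run_chunk(
    chunk: Sequence[Any],
    issue: Callable[[], Tuple[str, str]],
    revoke: Callable[[str], None],
    process: Callable[[Any, str], Any],
) -> List[Any]:
    """One unit of autonomous work: its own key, and so its own spend cap."""
    token, key_id = issue()
    try:
        return [process(item, token) for item in chunk]
    finally:
        try:
            revoke(key_id)
        except Exception:
            pass  # the key's TTL is the safety net


def fan_out(
    items: Sequence[Any],
    workers: int,
    issue: Callable[[], Tuple[str, str]],
    revoke: Callable[[str], None],
    process: Callable[[Any, str], Any],
) -> List[Any]:
    """Split items across workers; each worker issues and revokes its own key."""
    workers = max(1, workers)
    # Round-robin split; drop empty chunks so no key is issued for nothing.
    chunks = [list(items[i::workers]) for i in range(workers)]
    chunks = [chunk for chunk in chunks if chunk]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_chunk, c, issue, revoke, process) for c in chunks]
        # Results come back grouped by chunk, not in the original item order.
        return [result for future in futures for result in future.result()]
```

A management command's handle() method would call `fan_out(customer_ids, 4, issue, revoke, process)`. If one worker's agent misbehaves, only that worker's cap is consumed and the other chunks continue under their own keys.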
Further reading
- Flask AI agent API key: the same vault key pattern for Flask's WSGI request model, using before_request hooks and Flask's g context object.
- FastAPI AI agent API key: async ASGI vault key integration for FastAPI using dependency injection, which avoids the middleware-level blocking HTTP call.
- Celery AI agent API key: vault key lifecycle for Celery workers used in async Django agent architectures, where keys must be issued at task start rather than at request start.
- AI agent API key lifecycle: the four lifecycle phases (issuance, enforcement, expiration, revocation) and how they map to Django's middleware request lifecycle.
- AI agent credential management: broader credential management architecture for Python-based agent backends, including Django settings management and secret rotation patterns.