
AI agent API gateway: routing, policy enforcement, and spend control for multi-vendor agent calls

An LLM gateway like LiteLLM handles the token budget and model routing for the AI side of your agents. But your agents don't just call LLMs — they call Stripe to charge customers, Twilio to send SMS, Resend to deliver email, and Shopify to fulfill orders. These vendor calls carry real money and real side effects. An AI agent API gateway sits between your agent and these vendor APIs, enforcing per-agent spend caps, scoping credentials to individual runs, logging every call to an audit trail, and providing a kill-switch that works in under one second. This guide covers what that gateway looks like, how to build a minimal version, and when a managed proxy makes more sense than DIY.

TL;DR

An AI agent API gateway is a reverse proxy that: (1) authenticates requests using short-lived vault keys, not long-lived vendor API keys; (2) enforces per-agent policies (spend caps, endpoint allowlists, TTLs); (3) translates vault key requests to real vendor API calls; and (4) logs every request and its cost. The agent never sees the real Stripe or Twilio API key — only the vault key that expires when the agent run ends.

Why agents need a different kind of API gateway

Traditional API gateways (Kong, AWS API Gateway, nginx) are designed for human-facing services: they rate-limit by IP, authenticate users via JWT or OAuth, and route traffic between microservices. They are not designed for the specific risks that autonomous agents introduce:

Unbounded spend in a tight loop. An agent that encounters an error and retries a Stripe charge can spend thousands of dollars before the next human review. Traditional gateways enforce request-per-second limits, not total dollar spend caps.
Credential scope creep. Agents are usually handed the full vendor API key, which grants access to every endpoint. An agent given permission to create invoices doesn't need, and shouldn't have, permission to delete payment methods or modify webhook endpoints.
Side-effect attribution. When 50 users' agents are all calling Stripe through the same API key, there is no way to attribute which charge came from which agent run without instrumentation. Traditional access logs don't parse vendor response bodies for cost data.
Emergency revocation speed. Rotating a vendor API key to stop a runaway agent takes minutes and disrupts all other users' agents. A vault key kill-switch revokes one agent's credential in milliseconds without affecting others.
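To make the first risk concrete, here is a minimal sketch of a dollar-denominated cap, as opposed to a requests-per-second limit. The `SpendCap` class and its `tryReserve` method are hypothetical names used only for illustration, and the in-memory counter stands in for whatever persistent store a real gateway would use.

```typescript
// Hypothetical in-memory tracker; a real gateway would persist spend per vault key.
type SpendDecision =
  | { allowed: true; remainingUsd: number }
  | { allowed: false; reason: "cap_exhausted" };

class SpendCap {
  private spentUsd = 0;
  private readonly capUsd: number;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  // Reserve the amount before the vendor call so a retry loop cannot overshoot the cap.
  tryReserve(amountUsd: number): SpendDecision {
    if (this.spentUsd + amountUsd > this.capUsd) {
      return { allowed: false, reason: "cap_exhausted" };
    }
    this.spentUsd += amountUsd;
    return { allowed: true, remainingUsd: this.capUsd - this.spentUsd };
  }
}

// An agent stuck retrying a $100 charge against a $500 cap:
// the first five attempts pass, every later attempt is refused.
const cap = new SpendCap(500);
const attempts = Array.from({ length: 7 }, () => cap.tryReserve(100).allowed);
console.log(attempts); // [true, true, true, true, true, false, false]
```

A rate limiter would let all seven attempts through as long as they were spaced out; the cap refuses them based on cumulative dollars regardless of timing.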

The AI agent API gateway architecture

The gateway sits at the edge of your vendor API calls:

┌─────────────────────────────────────────────────────────────┐
│                     Agent (LLM + tools)                      │
│                                                              │
│  chargeCustomer({ amount: 100, customer: "cus_xxx" })        │
└──────────────────────────┬───────────────────────────────────┘
                           │ vault_key_xxx (short-lived, scoped)
                           ▼
┌─────────────────────────────────────────────────────────────┐
│               AI Agent API Gateway (proxy)                   │
│                                                              │
│  1. Authenticate: vault_key_xxx → look up policy             │
│  2. Enforce: daily_usd_cap=$500, allowed=/v1/payment_intents │
│  3. Forward: swap vault key → real STRIPE_SECRET_KEY         │
│  4. Log: request + response + cost (parsed from response)    │
│  5. Revoke: mark vault_key_xxx spent after response returns  │
└──────────────────────────┬───────────────────────────────────┘
                           │ real Stripe API key (never leaves gateway)
                           ▼
                    api.stripe.com

The agent has the vault key. The gateway has the real vendor API key. The two never meet in the agent's process memory.

Minimal self-hosted gateway implementation

A minimal AI agent API gateway in Node.js requires three components:

1. Vault key store and policy enforcement

// gateway/vault.ts
import Database from "better-sqlite3";

const db = new Database("./data.db");

db.exec(`
  CREATE TABLE IF NOT EXISTS vault_keys (
    id TEXT PRIMARY KEY,
    token TEXT UNIQUE NOT NULL,
    vendor TEXT NOT NULL,
    allowed_endpoints TEXT NOT NULL,  -- JSON array
    daily_usd_cap REAL NOT NULL,
    daily_usd_spent REAL DEFAULT 0,
    expires_at INTEGER NOT NULL,
    revoked INTEGER DEFAULT 0
  );
  CREATE TABLE IF NOT EXISTS audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    vault_key_id TEXT NOT NULL,
    vendor TEXT NOT NULL,
    method TEXT NOT NULL,
    path TEXT NOT NULL,
    status_code INTEGER,
    cost_usd REAL,
    request_at INTEGER NOT NULL
  );
`);

export function enforcePolicy(
  token: string,
  method: string,
  path: string,
): { valid: true; keyId: string } | { valid: false; reason: string } {
  const key = db
    .prepare("SELECT * FROM vault_keys WHERE token = ?")
    .get(token) as any;

  if (!key) return { valid: false, reason: "unknown_token" };
  if (key.revoked) return { valid: false, reason: "revoked" };
  if (Date.now() / 1000 > key.expires_at) return { valid: false, reason: "expired" };
  if (key.daily_usd_spent >= key.daily_usd_cap)
    return { valid: false, reason: "cap_exhausted" };

  const allowed: string[] = JSON.parse(key.allowed_endpoints);
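  const pathAllowed = allowed.some((pattern) => {
    if (pattern.endsWith("/*")) {
      // Keep the trailing slash so "/v1/payment_intents/*" cannot match
      // a sibling path such as "/v1/payment_intents_search".
      return path.startsWith(pattern.slice(0, -1));
    }
    return path === pattern;
  });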

  if (!pathAllowed) return { valid: false, reason: "endpoint_not_allowed" };

  return { valid: true, keyId: key.id };
}
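The listing above checks keys but does not show how one is issued or revoked. The following is a rough sketch of that missing half, assuming tokens are random strings and `expires_at` is stored in Unix seconds. The names `issueVaultKey`, `revokeVaultKey`, and `issuedKeys` are hypothetical, and a `Map` stands in for the `vault_keys` table so the example stays self-contained.

```typescript
import { randomBytes } from "node:crypto";

interface VaultKeyRecord {
  id: string;
  token: string;
  vendor: string;
  allowedEndpoints: string[];
  dailyUsdCap: number;
  expiresAt: number; // Unix seconds, matching the expires_at column
  revoked: boolean;
}

// Stand-in for the vault_keys table, keyed by token.
const issuedKeys = new Map<string, VaultKeyRecord>();

function issueVaultKey(
  vendor: string,
  allowedEndpoints: string[],
  dailyUsdCap: number,
  ttlSeconds: number,
  nowMs: number = Date.now(),
): VaultKeyRecord {
  const record: VaultKeyRecord = {
    id: `vk_${randomBytes(8).toString("hex")}`,
    // 32 random bytes: the token is a bearer credential, so it must be unguessable.
    token: `vault_key_${randomBytes(32).toString("hex")}`,
    vendor,
    allowedEndpoints,
    dailyUsdCap,
    expiresAt: Math.floor(nowMs / 1000) + ttlSeconds,
    revoked: false,
  };
  issuedKeys.set(record.token, record);
  return record;
}

// Kill switch: flips one flag for one key and leaves every other key untouched.
function revokeVaultKey(token: string): boolean {
  const record = issuedKeys.get(token);
  if (!record) return false;
  record.revoked = true;
  return true;
}
```

Because revocation is a single-row write, it takes effect on the next `enforcePolicy` lookup for that token.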

2. Vendor proxy handler

// gateway/proxy.ts
import http from "http";
import https from "https";
import { enforcePolicy } from "./vault.ts";

// Each vendor expects its own Authorization scheme: Stripe and Resend take a
// Bearer secret key, while Twilio takes HTTP Basic auth (AccountSid:AuthToken).
const VENDOR_TARGETS: Record<string, { host: string; authHeader: string }> = {
  stripe: {
    host: "api.stripe.com",
    authHeader: `Bearer ${process.env.STRIPE_SECRET_KEY}`,
  },
  twilio: {
    host: "api.twilio.com",
    authHeader: `Basic ${Buffer.from(
      `${process.env.TWILIO_ACCOUNT_SID}:${process.env.TWILIO_AUTH_TOKEN}`,
    ).toString("base64")}`,
  },
  resend: {
    host: "api.resend.com",
    authHeader: `Bearer ${process.env.RESEND_API_KEY}`,
  },
};

function sendJson(res: http.ServerResponse, status: number, body: object): void {
  res.writeHead(status, { "content-type": "application/json" }).end(JSON.stringify(body));
}

export function createProxyServer() {
  return http.createServer((req, res) => {
    // URL pattern: /stripe/v1/payment_intents → vendor=stripe, path=/v1/payment_intents
    const match = req.url?.match(/^\/(\w+)(\/.*)/);
    if (!match) return sendJson(res, 404, { error: "unknown_vendor" });

    const [, vendor, vendorPath] = match;
    const target = VENDOR_TARGETS[vendor];
    if (!target) return sendJson(res, 404, { error: "unsupported_vendor" });

    // Extract vault key from Authorization header
    const auth = req.headers["authorization"] ?? "";
    const token = auth.startsWith("Bearer ") ? auth.slice(7) : "";
    // Check the allowlist against the path only, not the query string
    const check = enforcePolicy(token, req.method!, vendorPath.split("?")[0]);

    if (!check.valid) {
      // 429 = over cap, 403 = valid key but endpoint out of scope, 401 = bad credential
      const status =
        check.reason === "cap_exhausted" ? 429
        : check.reason === "endpoint_not_allowed" ? 403
        : 401;
      return sendJson(res, status, { error: check.reason });
    }

    // Forward to vendor, replacing the vault key with the real credential
    const proxyReq = https.request(
      {
        hostname: target.host,
        path: vendorPath,
        method: req.method,
        headers: { ...req.headers, host: target.host, authorization: target.authHeader },
      },
      (proxyRes) => {
        res.writeHead(proxyRes.statusCode!, proxyRes.headers);
        proxyRes.pipe(res);
        // Cost parsing and audit logging happen on response body (vendor-specific)
      },
    );

    // Without an error handler, a failed vendor connection would crash the process
    proxyReq.on("error", () => {
      if (!res.headersSent) sendJson(res, 502, { error: "vendor_unreachable" });
      else res.destroy();
    });

    req.pipe(proxyReq);
  });
}
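The proxy handler streams the vendor response straight to the agent, so the gateway needs its own copy of the body before it can parse cost or write to `audit_log`. One way to do that is sketched below: a small buffer that collects chunks alongside the pipe. `ResponseTap` is a hypothetical helper, the 1 MB limit is an arbitrary assumption, and the sketch assumes the vendor replies with uncompressed JSON (if the agent's `accept-encoding` header is forwarded, the vendor may reply gzipped and the body would need decompressing first).

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: keeps a bounded copy of a streamed response body.
class ResponseTap {
  private chunks: Buffer[] = [];
  private size = 0;
  private overflowed = false;
  private readonly maxBytes: number;

  constructor(maxBytes: number = 1_000_000) {
    this.maxBytes = maxBytes;
  }

  // Called once per "data" event; stops buffering once the limit is passed.
  push(chunk: Buffer): void {
    if (this.overflowed) return;
    if (this.size + chunk.length > this.maxBytes) {
      this.overflowed = true;
      this.chunks = [];
      return;
    }
    this.chunks.push(chunk);
    this.size += chunk.length;
  }

  // Returns the parsed body, or null if it was too large or not valid JSON.
  json(): any {
    if (this.overflowed) return null;
    try {
      return JSON.parse(Buffer.concat(this.chunks).toString("utf8"));
    } catch {
      return null;
    }
  }
}

// Possible wiring inside the https.request callback (names from the listing above):
//   const tap = new ResponseTap();
//   proxyRes.on("data", (chunk) => tap.push(chunk));
//   proxyRes.on("end", () => {
//     const costUsd = parseCost(vendor, proxyRes.statusCode ?? 0, tap.json());
//     // then insert an audit_log row and add costUsd to daily_usd_spent
//   });
```

Attaching a `data` listener does not interfere with `proxyRes.pipe(res)`; both receive every chunk, so the agent still gets the response as it arrives.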

3. Cost parsing per vendor

Each vendor exposes cost differently — you need vendor-specific parsers:

// Cost parsing per vendor
function parseCost(vendor: string, statusCode: number, responseBody: any): number {
  // Failed calls and unparseable bodies carry no billable cost
  if (statusCode >= 400 || !responseBody) return 0;

  if (vendor === "stripe") {
    // Stripe: parse amount from PaymentIntent responses.
    // Amounts are in the currency's smallest unit, so dividing by 100
    // assumes a two-decimal currency such as USD.
    if (responseBody.object === "payment_intent" && responseBody.amount) {
      return responseBody.amount / 100;
    }
    return 0;
  }

  if (vendor === "twilio") {
    // Twilio: price is a negative decimal string in the response body for
    // messages and calls; it can be null in the initial create response.
    if (responseBody.price) {
      return Math.abs(parseFloat(responseBody.price));
    }
    return 0;
  }

  if (vendor === "resend") {
    // Resend: fixed rate ~$0.001 per email, no per-request cost in response
    if (statusCode === 200 && responseBody.id) return 0.001;
    return 0;
  }

  return 0;
}
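Parsing cost from the response means spend is only recorded after the money has moved, so a single large charge can pass a key that is still under its cap. A possible complement is a pre-flight estimate taken from the request body. The sketch below assumes Stripe's form-encoded request format and USD-only accounting; `estimateStripeRequestUsd` and `wouldExceedCap` are hypothetical names, not part of the listing above.

```typescript
// Estimate the dollar value of a Stripe request before forwarding it.
// Assumption: only POST /v1/payment_intents in USD is priced; everything else is 0.
function estimateStripeRequestUsd(method: string, path: string, formBody: string): number {
  if (method !== "POST" || path !== "/v1/payment_intents") return 0;

  // Stripe request bodies are application/x-www-form-urlencoded.
  const params = new URLSearchParams(formBody);
  const amount = Number(params.get("amount"));
  const currency = (params.get("currency") ?? "").toLowerCase();

  // Note: returning 0 for non-USD is fail-open; a stricter gateway might reject instead.
  if (!Number.isFinite(amount) || amount <= 0 || currency !== "usd") return 0;
  return amount / 100; // cents to dollars
}

// True when forwarding the request would push the key past its daily cap.
function wouldExceedCap(spentUsd: number, capUsd: number, estimateUsd: number): boolean {
  return spentUsd + estimateUsd > capUsd;
}

const estimate = estimateStripeRequestUsd(
  "POST",
  "/v1/payment_intents",
  "amount=12500&currency=usd&customer=cus_xxx",
);
console.log(estimate, wouldExceedCap(400, 500, estimate)); // 125 true
```

Using this in the proxy would mean buffering the request body before forwarding it, which trades a little latency for a cap that holds on the first oversized call rather than the second.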

Build vs buy decision matrix

Factor	Build self-hosted	Use managed (Keybrake)
Time to first scoped call	2–5 days (proxy + vault key store + policy enforcement)	~30 minutes (POST /v1/keys, change API base URL)
Ongoing maintenance	You own TLS cert renewal, SQLite backups, Node.js upgrades, and vendor API schema changes	Managed — vendor schema changes handled by Keybrake
Audit log compliance	You build storage, retention, and querying; GDPR deletion is your problem	90-day retention on Team plan; one-click data export
Vendor expansion	Each new vendor requires a new proxy handler, cost parser, and policy type	New vendors added by Keybrake; same vault key API across all vendors
Control and customization	Full control — custom policy types, custom cost parsing, internal LDAP integration	Standard policy types (spend cap, endpoint allowlist, TTL, user attribution)
Appropriate for	Teams with compliance requirements that prevent third-party proxies; 10+ vendors with non-standard APIs	Most AI agent teams calling Stripe, Twilio, Resend, Shopify, Postmark, Segment

Vault key API used by agents

Regardless of whether you build or use a managed gateway, the agent-facing API should follow the same pattern:

# Issue a vault key (one per agent run or per tool call)
POST https://api.keybrake.com/v1/keys
Authorization: Bearer ${KEYBRAKE_TOKEN}
{
  "label": "agent-run-${runId}",
  "vendor": "stripe",
  "allowed_endpoints": ["/v1/payment_intents", "/v1/payment_intents/*"],
  "daily_usd_cap": 500,
  "expires_in": "10m"
}
→ { "id": "vk_xxx", "token": "vault_key_xxx" }

# Use the vault key against the proxy
POST https://proxy.keybrake.com/stripe/v1/payment_intents
Authorization: Bearer vault_key_xxx
(standard Stripe API request body)
→ standard Stripe API response (or 429 with {"code":"cap_exhausted"} if over cap)

# Revoke when done
DELETE https://api.keybrake.com/v1/keys/vk_xxx
Authorization: Bearer ${KEYBRAKE_TOKEN}
→ 204 No Content

The agent only needs the vault_key_xxx token and the proxy URL. The real vendor API key is never distributed to the agent process.
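For agent code written in TypeScript, the three calls above could be assembled roughly as follows. This is a sketch only: the URLs and JSON fields are copied from the transcript above, while the function names and the `HttpRequest` shape are hypothetical. Each function returns a plain request description so it can be handed to whichever HTTP client the agent already uses.

```typescript
interface HttpRequest {
  url: string;
  method: "POST" | "DELETE";
  headers: Record<string, string>;
  body?: string;
}

const API_BASE = "https://api.keybrake.com";
const PROXY_BASE = "https://proxy.keybrake.com";

// Step 1: issue a vault key scoped to one agent run.
function issueKeyRequest(apiToken: string, runId: string): HttpRequest {
  return {
    url: `${API_BASE}/v1/keys`,
    method: "POST",
    headers: { authorization: `Bearer ${apiToken}`, "content-type": "application/json" },
    body: JSON.stringify({
      label: `agent-run-${runId}`,
      vendor: "stripe",
      allowed_endpoints: ["/v1/payment_intents", "/v1/payment_intents/*"],
      daily_usd_cap: 500,
      expires_in: "10m",
    }),
  };
}

// Step 2: call Stripe through the proxy using only the vault key.
function createPaymentIntentRequest(
  vaultToken: string,
  amountCents: number,
  customer: string,
): HttpRequest {
  return {
    url: `${PROXY_BASE}/stripe/v1/payment_intents`,
    method: "POST",
    headers: {
      authorization: `Bearer ${vaultToken}`,
      "content-type": "application/x-www-form-urlencoded",
    },
    // Stripe expects form-encoded bodies rather than JSON.
    body: new URLSearchParams({
      amount: String(amountCents),
      currency: "usd",
      customer,
    }).toString(),
  };
}

// Step 3: revoke the key when the run ends.
function revokeKeyRequest(apiToken: string, keyId: string): HttpRequest {
  return {
    url: `${API_BASE}/v1/keys/${keyId}`,
    method: "DELETE",
    headers: { authorization: `Bearer ${apiToken}` },
  };
}
```

With Node 18 or later, each description can be sent with the built-in client, for example `await fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.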
