Agent governance · Buyer's guide
AI agent governance platform: why governance is not a single platform (and what to buy instead)
There is no single product on the market that does end-to-end AI agent governance. The vendors that pitch themselves as "agent governance platforms" cover at most one or two of the four jobs the term actually contains — and the layer they almost always skip is the one that ships incidents. This is the honest map of what governance is, what each vendor covers, and the four-line shopping list a CTO needs for an agent that already touches Stripe, Twilio, or a customer database.
TL;DR
Agent governance is four decoupled jobs, not one platform:

- Layer 1 — identity: who is the agent, what credentials does it carry.
- Layer 2 — runtime policy enforcement: what can the agent do at the moment of action (per-day USD caps, endpoint allowlists, customer scope, mid-run revoke).
- Layer 3 — audit and cost: what did the agent actually do, joined on agent_run_id.
- Layer 4 — post-hoc evaluation: was the output safe, correct, on-brand.

Every vendor that names itself a "governance platform" covers Layer 4 and parts of Layer 1. Almost none cover Layer 2 — the layer where money moves. The right play in 2026 is to buy four narrow tools (one per layer) and join them on agent_run_id, not to wait for a single platform that doesn't exist.
Why the "platform" framing fails for agent governance
The phrase AI agent governance platform follows the naming pattern of cloud platform or data platform — singular, all-encompassing, one vendor to buy from. That naming worked because the underlying jobs were tractable from a single vantage point: a cloud platform owns the substrate, a data platform owns the storage. Agent governance has no equivalent single vantage point. The four jobs that "governance" actually contains live on four different technical surfaces — the agent's runtime, the model's input/output, the SaaS API's HTTP request, and a post-hoc evaluation harness — and no current vendor straddles even three of them.
Two empirical claims behind that. First, the vendors that market themselves as governance platforms — Lasso Security, CalypsoAI, Robust Intelligence (now part of Cisco), Credo AI, Holistic AI, Lakera Guard — cluster heavily on Layer 4 (post-hoc evaluation) and parts of Layer 1 (identity-via-key-management). Their dashboards show prompt-injection detection rates, jailbreak attempts blocked, model-eval scorecards, and risk-register exports for compliance. None of them sit between the agent and api.stripe.com with a per-day USD cap. Second, the controls a CTO actually reaches for at 2am — a working off-switch with measurable propagation latency, a per-vendor spend cap that fires before the charge, a queryable per-run audit row — are conspicuously absent from the feature lists on those product pages. They aren't lying; they shipped a different product than the search query implies.
The mistake to avoid: buying any one of these and assuming "agent governance" is now covered. It isn't. The runaway-loop scenarios that show up in incident postmortems live almost entirely on Layer 2, and Layer 2 is the layer the platforms skip.
The four layers, named and scoped
Layer 1 — Identity
Who is the agent, and what credentials does it carry. In practice this is two sub-problems: credential management (the agent has API keys; those keys need to be stored, rotated, scoped) and provenance (a downstream system asking "is this caller a human user or one of our agents?" needs an answer). Today this is mostly handled by an existing secret-store (1Password, Vault, AWS Secrets Manager, Doppler) plus per-agent service accounts in your SSO provider. There is no agent-specific Layer-1 vendor with critical mass yet. The 2026 incumbents in standard credential management are good enough.
Where Layer 1 leaks into Layer 2: a long-lived API key handed directly to an agent is a Layer-1 control, but the moment it's in the agent's hands, you have a Layer-2 problem because Layer 1 has no concept of how that key is used. Stripe Restricted Keys blur the line — they're a Layer-1 artefact (they live in Stripe's dashboard) that does some Layer-2 enforcement (scope-by-endpoint), but only some. The full-fat Layer-2 controls — per-day USD cap, parameter-level allowlist, sub-second mid-run revoke — don't exist in Layer 1.
Layer 2 — Runtime policy enforcement
What can the agent do at the moment of action. This is a request-time, in-the-hot-path layer that sits between the agent and the SaaS API and inspects each call before it goes out. Concretely:
- Per-vendor daily USD cap — no more than $X spent on Stripe today, regardless of which run.
- Endpoint allowlist — the agent can call `POST /v1/refunds` but not `POST /v1/transfers`.
- Customer-scope allowlist — refunds only for customers in `{cus_*}`, not arbitrary customers.
- Parameter-level allowlist — the `amount` on a refund cannot exceed the original charge.
- Mid-run revoke — flip a flag, the next call from this agent 401s, with a measurable propagation latency. The four real kill-switch patterns document the propagation numbers (Stripe key revoke median 45s / p95 3m12s; Twilio 30s-2m; Resend near-instant; OpenAI 1-5m).
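As a sketch, the five controls above collapse into one request-time check in front of each outbound call. The `Policy` shape, function names, and denial reasons here are illustrative, not any vendor's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    # Hypothetical per-vendor policy shape (e.g. for Stripe).
    daily_cap_usd: float = 50.0
    allowed_endpoints: set = field(default_factory=lambda: {("POST", "/v1/refunds")})
    allowed_customers: set = field(default_factory=set)  # e.g. {"cus_ABC123"}
    revoked: bool = False

def check(policy: Policy, spent_today_usd: float,
          method: str, path: str, params: dict) -> tuple:
    """Return (allowed, reason) for one outbound call. Order matters:
    the revoke flag is checked first, so a mid-run kill wins over everything."""
    if policy.revoked:
        return False, "revoked"                    # mid-run revoke: next call fails
    if (method, path) not in policy.allowed_endpoints:
        return False, "endpoint_not_allowed"       # e.g. blocks POST /v1/transfers
    cust = params.get("customer")
    if policy.allowed_customers and cust not in policy.allowed_customers:
        return False, "customer_out_of_scope"
    amount_usd = params.get("amount", 0) / 100     # Stripe amounts are in cents
    if spent_today_usd + amount_usd > policy.daily_cap_usd:
        return False, "daily_cap_exceeded"         # fires before the charge, not after
    return True, "ok"
```

With a $50 cap, a first $30 refund passes and the second one is denied before it reaches the vendor — which is the whole point of the layer.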
This is where the runaway-loop incidents get caught. It is also the layer where, by our count, zero out of the six named "governance platforms" enforce a Stripe spend cap. The category that does cover Layer 2 has its own name — SaaS-tool governance proxy, or just agent governance proxy — and consists of Keybrake, parts of agentgateway.dev, and DIY proxies people build in-house. LiteLLM and Portkey sit one layer to the left (LLM-traffic governance) and don't cover Layer 2 either.
Layer 3 — Audit and cost
What did the agent actually do, and what did each call cost. The output is a queryable table whose rows are individual API calls, joined to a per-run grouping key (agent_run_id) so you can ask "what did run run_2026_04_30_8a3f spend?" and get a single number across all vendors. The four-column MVP schema is agent_run_id + policy_verdict + cost_usd_parsed + customer_scope_id. The full reference is our sixteen-column schema post with the indexes and the five operational queries.
Some Layer-3 capability falls out of Layer 2 for free: the proxy that enforces the cap is also the natural place to record the call. The remainder is the join with Layer 1's identity and Layer 4's evaluation, plus retention policy. Vendors who do Layer 4 well typically do Layer 3 partially — they record their layer's calls, but the per-run cost row only covers what their layer saw, which is rarely the Stripe charge.
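A minimal sketch of the four-column MVP and the "what did run X spend" question, using SQLite; the table name and sample rows are illustrative:

```python
import sqlite3

# Four-column MVP audit table: one row per API call, grouped by agent_run_id.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE agent_audit (
        agent_run_id      TEXT NOT NULL,
        policy_verdict    TEXT NOT NULL,  -- 'allowed' or a denial reason
        cost_usd_parsed   REAL,           -- parsed from the vendor response
        customer_scope_id TEXT
    )
""")
rows = [
    ("run_2026_04_30_8a3f", "allowed", 15.00, "cus_ABC123"),   # Stripe refund
    ("run_2026_04_30_8a3f", "allowed",  0.02, None),           # Twilio SMS
    ("run_2026_04_30_8a3f", "daily_cap_exceeded", 0.0, "cus_ABC123"),
]
db.executemany("INSERT INTO agent_audit VALUES (?, ?, ?, ?)", rows)

# "What did run run_2026_04_30_8a3f spend?" — one number across all vendors.
(total,) = db.execute(
    "SELECT SUM(cost_usd_parsed) FROM agent_audit WHERE agent_run_id = ?",
    ("run_2026_04_30_8a3f",),
).fetchone()
print(round(total, 2))   # 15.02
```

The full sixteen-column schema adds indexes and operational queries on top, but the join key and the per-run sum are the load-bearing parts.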
Layer 4 — Post-hoc evaluation
Was the output safe, correct, on-brand, free of prompt injection. This runs after the action, on samples or every call, and feeds back into prompt updates, model retraining, and risk registers for compliance. This is the layer with the most vendor density today: Lasso Security, Lakera Guard, CalypsoAI, Robust Intelligence, Credo AI, Holistic AI, plus open-source projects like Garak, Promptfoo, and DeepEval. Many of these are excellent at what they do.
The catch: Layer 4 is preventative against output harm (a model says something it shouldn't, an agent gets prompt-injected into leaking secrets) and detective against process harm (the eval scorecard gets worse over time). It is not preventative against action harm — the agent doing the wrong thing on Stripe — because by the time evaluation runs, the action has happened. Layer 2 is where action-harm gets caught, and Layer 2 is what the "governance platform" pitch implies but doesn't deliver.
The capability matrix — six vendors against four layers
Six products that show up under the "AI agent governance platform" search, scored against the four layers. Yes = full coverage of that layer's controls. Partial = covers some of the layer's jobs but leaves the most expensive ones uncovered. No = does not address that layer.
| Vendor | L1 Identity | L2 Runtime policy | L3 Audit + cost | L4 Post-hoc eval |
|---|---|---|---|---|
| Lasso Security | Partial | No | Partial (LLM calls only) | Yes |
| CalypsoAI | Partial | No | Partial (LLM calls only) | Yes |
| Robust Intelligence (Cisco) | No | No | Partial (LLM calls only) | Yes |
| Credo AI | No | No | No (registry-shape audit) | Yes |
| Holistic AI | No | No | No | Yes |
| Lakera Guard | No | Partial (prompt-level only) | Partial (LLM calls only) | Yes |
Two columns carry the story. The L2 column is mostly "No": the runaway loop on Stripe is invisible to all six because none of them sit on the SaaS-API request path. The L4 column is mostly "Yes". This is consistent — the vendors ship excellent evaluation products that have been re-marketed as governance products. They are not lying about what they do; the words "governance platform" carry more freight than the features warrant.
Lakera Guard is the closest to a Layer-2 player on this list, and even there the policy enforcement is at prompt-level (does this prompt try to jailbreak the model) not at action-level (does this POST /v1/refunds stay under the daily cap). Different unit, different control surface.
Where Keybrake sits, plainly
We are a Layer-2 product. We sit between your agent and the SaaS API; we enforce the per-day USD cap, the endpoint allowlist, the customer scope, the parameter allowlist, the mid-run revoke. We produce the Layer-3 audit row as a side effect because the proxy is the natural place to record the call. We do not do Layer 4. We do not score prompt injection, run model evaluations, or produce a risk register. If your blast-radius worry is "the model will say something embarrassing", we are the wrong tool — buy Lakera or Lasso. If your worry is "the agent will spend $4,000 on Stripe before standup", we are the right tool, and the platform vendors above are not.
This is also why we don't market Keybrake as a "governance platform". The honest description is scoped API-key proxy for the SaaS APIs your agent calls. That maps cleanly to one layer, leaves the other three alone, and tells you whether you have a problem we solve in under a sentence.
The honest 2026 stack to buy
If you have an agent in production and want full governance coverage, the four-line shopping list is:
- Layer 1 (identity) — your existing secret store + SSO provider. No new vendor required for most teams.
- Layer 2 (runtime policy) — a SaaS-tool governance proxy in front of money-moving APIs. Keybrake for the proxy; LLM gateways like LiteLLM, Portkey, or Helicone for token-level controls on model traffic. Two proxies, not one — see the 2026 agent governance stack post for the dual-proxy architecture.
- Layer 3 (audit and cost) — falls out of Layer 2 for free at the SaaS-tool axis; emit `agent_run_id` from the agent and join the LLM gateway's rows on the same key. Result: per-run cost across all vendors. Schema reference: the sixteen-column post.
- Layer 4 (post-hoc evaluation) — Lakera Guard, Lasso Security, Promptfoo, or whatever fits your industry's compliance regime. This is where the named "governance platforms" actually shine.
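The emit side of the join key is cheap. A minimal sketch, assuming a hypothetical `X-Agent-Run-Id` header — the header name is a convention you pick, not a standard; what matters is that both proxies log the same value:

```python
AGENT_RUN_ID = "run_2026_04_30_8a3f"

def with_run_id(headers: dict, run_id: str = AGENT_RUN_ID) -> dict:
    # "X-Agent-Run-Id" is a hypothetical header name; any stable key works
    # as long as both proxies record it, so their audit rows join on one value.
    return {**headers, "X-Agent-Run-Id": run_id}

# Every outbound call carries the same key: Stripe through the SaaS-tool
# proxy, the model through the LLM gateway.
stripe_headers = with_run_id({"Authorization": "Bearer rk_live_..."})
llm_headers    = with_run_id({"Authorization": "Bearer sk-..."})
assert stripe_headers["X-Agent-Run-Id"] == llm_headers["X-Agent-Run-Id"]
```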
Total vendor count: typically three (your existing secret store doesn't count, Keybrake covers L2+L3 on the SaaS axis, an LLM gateway covers L2+L3 on the model axis, an evaluation product covers L4). Total to spend on agent governance for a 50-person team: under $500/month for the proxies and gateways, plus whatever your compliance commitments demand on Layer 4. There is no fifth thing to buy.
Three antipatterns the "platform" framing creates
Buying a Layer-4 product and calling it your governance answer. The team buys Lasso Security, gets a beautiful prompt-injection dashboard, marks the governance epic complete, and then gets paged at 3am because the agent issued $48,000 in refunds in eleven minutes. The Layer-4 dashboard didn't see any of it because Stripe charges aren't model outputs. Different layer. Different control surface.
Putting the governance epic on the data-science team. "Governance" sounds like a model-eval thing because Layer 4 is the most visible layer, so the work gets routed to the team that owns evaluation. They build excellent Layer-4 controls — and have no jurisdiction over the SaaS API request path, which is owned by the platform team, which doesn't have a governance epic. The SaaS-axis bleed continues. The fix is to scope the epic by layer up front and assign each layer to the team that owns its surface — Layer 2 belongs to the team that already operates your API gateway, not to data science.
Waiting for a unified platform that doesn't exist. "Let's hold off until [vendor] launches their full agent governance platform" is a decision to ship Layer 2 in zero months instead of three. The companies pitching the unified platform are at varying stages of building it; the public roadmaps suggest 2027-2028 for production-ready Layer-2 enforcement on non-LLM APIs from any of them. Meanwhile every week your agent is in production without Layer 2 is a $3.24M-per-day worst-case exposure (one stuck Stripe refund loop at $15 × 216,000 calls/day — see the three-axis cost decomposition).
Get early access to Keybrake (Layer 2)
Related questions
Why doesn't a vendor like CalypsoAI just add the Layer-2 controls?
Mostly architectural inertia. The Layer-4 vendors built their pipelines around model input/output streams — that's the data they ingest and act on. Layer 2 needs a different shape: an HTTP proxy on the egress path, vendor-specific response parsers (Stripe's amount, Twilio's price, Resend's tier table), and per-vendor policy schemas. It's not a feature flag away; it's a different product. The vendor would need either to acquire a SaaS-tool governance proxy or build one from scratch, and both options compete with the existing roadmap. Expect this to start happening in 2027 once the Layer-2 incidents are public enough to force the priority.
Is there a single open-source platform that does all four layers?
Not as of April 2026. The closest assemblies are Garak + Promptfoo for Layer 4, LiteLLM for LLM-axis Layer 2, and agentgateway.dev experimenting at the SaaS-axis Layer 2 boundary. What's missing is an open-source audit-table schema for Layer 3 — most teams roll their own table from references like ours. A unified open-source platform would need a coordinator who owns all four problem statements; nobody is currently doing that. Until someone does, the practical answer is "join four narrow tools on agent_run_id".
Can I just build all four in-house?
Layer 1 is mostly already done if you have a secret store and SSO. Layer 4 is buildable in-house if your compliance regime is permissive — wrap calls to the model, run a regex pass and a small classifier on the output, log the results. Layer 3 is straightforward — one table, one join key. Layer 2 is the one to look at carefully before building in-house: vendor-specific cost parsers (Stripe / Twilio / Resend / Shopify each have a different shape), sub-second policy lookup, and mid-run revoke with measured propagation. Our Stripe-key blog post sketches the SDK-wrapper version, which is the cheapest in-house path; the reverse-proxy version is what we sell because it scales across vendors without re-implementing per language. The build-vs-buy decision is per-team, but most teams underestimate the per-vendor parser work.
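To make "each vendor has a different shape" concrete, here is a sketch of the parser surface. The field semantics noted in the comments (Stripe's integer cents, Twilio's signed decimal string, Resend's tier-based billing) are assumptions to verify against each vendor's current docs:

```python
def parse_cost_usd(vendor: str, body: dict) -> float:
    """Per-vendor cost extraction from an API response body. This is the
    parser work that in-house builds tend to underestimate."""
    if vendor == "stripe":
        # Stripe's `amount` is an integer in the smallest currency unit
        # (cents for USD).
        return body["amount"] / 100
    if vendor == "twilio":
        # Twilio's `price` is a decimal string, typically negative for
        # charges, and may be null until the message is finalised.
        return abs(float(body["price"])) if body.get("price") is not None else 0.0
    if vendor == "resend":
        # Resend bills by plan tier, not per call — no per-response cost
        # field, so per-call cost has to come from a rate table you maintain.
        return 0.0
    raise ValueError(f"no parser for vendor: {vendor}")

print(parse_cost_usd("stripe", {"amount": 1500}))      # 15.0
print(parse_cost_usd("twilio", {"price": "-0.0079"}))  # 0.0079
```

Three vendors is three code paths already; every new tool the agent touches adds another, which is the core of the build-vs-buy math.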
What's the order of operations if I'm starting from zero?
Layer 2 first, on the SaaS-tool axis. The blast radius is largest there and the controls are absent in most stacks. Then Layer 2 on the LLM axis (an LLM gateway with token caps). Then Layer 3 (audit join key — costs you almost nothing once Layer 2 is in place). Then Layer 4 if compliance demands or if you've had a Layer-4-shaped incident. Layer 1 is whatever your secret store is doing today; revisit only if you discover an actual identity gap. Reverse-order investment — Layer 4 first because the vendor pitches are loudest — is the most expensive ordering choice teams make.
Where does an MCP server fit in this map?
An MCP server is mostly a Layer-1 artefact (it provides a discoverable, scoped credential surface for tools the agent can call). The MCP server itself doesn't enforce per-day USD caps or mid-run revoke — those are Layer 2. Our MCP-auth page covers the auth-handshake side; the Stripe agent toolkit page covers what Stripe's own MCP server gives you and where it stops. Net: MCP solves discovery and structured tool definitions; it doesn't replace Layer 2.
Is "agent governance" the same thing as "AI governance"?
Overlapping but not identical. "AI governance" historically refers to model risk, regulatory compliance, fairness audits, and policy frameworks at the organisation level — Credo AI, Holistic AI, the EU AI Act compliance market. "Agent governance" is narrower and more operational: the runtime controls that keep an autonomous agent's actions inside a known-safe envelope. The first is about model outputs and policy posture; the second is about the agent's hand on the wheel. Both are real, both are needed, but they're solved by different tools. The cross-pollination of vocabulary is the source of half the confusion in the buyer journey.
Further reading
- The 2026 agent governance stack: which proxy goes where — long-form companion to this page; the four-layer composition with measures-in / prevents framing per layer and the dual-proxy architecture diagram.
- AI agent cost management — three-axis decomposition — the cost-axis decomposition that motivates why Layer 2 is the most expensive layer to skip.
- AI agent kill-switch — patterns and stop-latency — the four real Layer-2 enforcement patterns with measured propagation latencies per vendor.
- AI agent audit trail — what belongs in one — the four-column MVP schema for Layer 3.
- Anatomy of an AI agent audit trail (long form) — sixteen-column reference, six indexes, five operational queries.
- LiteLLM alternative for Stripe — why the LLM-gateway category does not extend into SaaS-tool Layer 2.
- LiteLLM alternatives — honest open-source review — the LLM-axis Layer-2 toolbox.
- AI agent payment gateway — 2026 category map — the three-category split for payment-axis tooling and where governance fits.
- How to give an AI agent a Stripe API key without losing $4,000 — the practical Layer-2 implementation walkthrough.
- Rotate vs revoke: a 2am playbook for a stuck agent — Layer-2 incident response with two side-by-side timelines.
- Agent blowout calculator — interactive: pick a vendor and a calls-per-minute slider, see the 24-hour Layer-2 cost.
- Newsletter issue #01 — How long your kill switch actually takes to kill — per-vendor revoke latency measurements and the leaked-call counts they imply.