LangGraph AI agent API key: scoping tool calls in stateful agent graphs

LangGraph gives you explicit control over agent execution flow — stateful graphs, persistent checkpoints, and multi-actor supervisor patterns. That control is what makes LangGraph powerful for complex agent workloads. It's also what makes uncapped API keys dangerous: a cyclic graph can revisit the same Stripe tool node ten times before the LLM decides it's done, and a supervisor that delegates to a billing subagent can invoke the same payment tool across multiple worker runs. This page covers what LangGraph's native tooling doesn't provide for vendor spend enforcement, and the vault-key pattern that does.

TL;DR

LangGraph's cycles and multi-actor patterns mean a single graph run can call a payment tool many more times than you intended. A vault key proxy adds the per-run constraint: issue one vault key at graph-run start (using the thread ID as the run identifier), enforce a per-run dollar cap across all node executions, and get a structured per-call audit log with graph context attached. The 429 from an exceeded cap is a normal LangGraph tool exception — your error node or conditional edge handles it.

How LangGraph agents call vendor APIs

In LangGraph, vendor API calls happen inside tool functions that are called by ToolNode or directly inside agent nodes. A billing agent graph might look like:

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_core.tools import tool
import stripe, os

@tool
def charge_customer(customer_id: str, amount_cents: int) -> str:
    """Charge a customer via Stripe."""
    stripe.api_key = os.environ["STRIPE_SECRET_KEY"]  # full-access key
    intent = stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
    )
    return f"Charged {amount_cents} cents. Intent: {intent.id}"

tools = [charge_customer]
tool_node = ToolNode(tools)

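# Graph with a cycle: agent → tools → agent → ... → END
# AgentState, call_model, should_continue, and memory_saver are defined elsewhere.
builder = StateGraph(AgentState)
builder.add_node("agent", call_model)
builder.add_node("tools", tool_node)
builder.set_entry_point("agent")  # without an entry point the graph will not compile
builder.add_edge("tools", "agent")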
builder.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph = builder.compile(checkpointer=memory_saver)

The graph is valid LangGraph: the cycle is intentional, letting the agent refine its actions based on tool results. The problem is the API key. Every time the graph traverses the tools node, the tool authenticates with the full-access STRIPE_SECRET_KEY, and nothing caps what a single run can spend. An LLM that gets confused or over-optimistic about billing will keep calling charge_customer until the graph hits a terminal state or you kill it externally.

Three gaps LangGraph's native tooling doesn't fill for vendor spend control

Gap	What happens in practice	LangGraph's answer
No per-graph-run spend cap	A billing agent with a cycle bug charges a customer 12 times before reaching a terminal state. LangGraph faithfully executes every tool node traversal. The cap on damage is your Stripe account limit, not the single charge you expected.	You can add custom graph logic to count tool calls, but there's no built-in spend enforcement at the tool level.
No per-run revoke	You interrupt a graph run (via `interrupt_before` or external signal). The node that was executing may have already made the Stripe call. Rotating the real Stripe key breaks every other graph sharing that key.	Graph interruption halts future node execution but cannot cancel in-flight vendor calls or revoke credential access.
No per-tool-call audit with graph context	LangGraph's state history and checkpoint data show node inputs/outputs, but don't parse dollar amounts from Stripe responses or cross-reference Stripe charges with the graph thread ID and step count that made them.	State snapshots and LangSmith traces. No vendor cost parsing, no per-run charge aggregation.
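The "custom graph logic to count tool calls" mentioned in the first row can be sketched as a routing function for the agent node's conditional edge. This is a minimal illustration, not a LangGraph built-in: the function names and MAX_TOOL_CALLS are hypothetical, and messages are modeled as plain dicts with a tool_calls list (real LangGraph message objects expose tool_calls as an attribute instead).

```python
from typing import Any

MAX_TOOL_CALLS = 3  # hypothetical per-run ceiling, chosen for illustration


def count_tool_calls(messages: list[dict[str, Any]]) -> int:
    """Count every tool call the model has requested so far in this run."""
    return sum(len(m.get("tool_calls") or []) for m in messages)


def should_continue_capped(state: dict[str, Any]) -> str:
    """Return "tools" to execute the pending tool calls, or "end" to stop."""
    messages = state["messages"]
    last = messages[-1] if messages else {}
    if not last.get("tool_calls"):
        return "end"  # the model produced a final answer
    if count_tool_calls(messages) > MAX_TOOL_CALLS:
        return "end"  # ceiling reached: skip the pending tool call
    return "tools"
```

A guard like this counts calls, not dollars: three calls of $5 and three calls of $5,000 look identical to it, which is the gap the table describes.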

The cycle risk: why LangGraph's reactive loops create overspend scenarios

The cycle in a LangGraph agent is intentional — it lets the agent use tool results to decide what to do next. For retrieval, search, and read operations, cycling is safe. For tool calls that charge money, each cycle iteration is a potential new vendor charge.

Consider a billing agent that's been told to "invoice all customers from the March cohort." If the LLM misjudges which customers qualify, or receives ambiguous context about whether a charge succeeded, it may retry the same charge across multiple cycle iterations before reaching a confident terminal state. Each retry is a real Stripe call. Your state history shows the attempts, but nothing stopped them.
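One tool-level mitigation for the retry case is a deterministic idempotency key: Stripe accepts an idempotency key on POST requests and returns the original result when the same key is sent again. The sketch below derives such a key from the run and the charge parameters. The helper name and key layout are assumptions made for illustration.

```python
import hashlib


def charge_idempotency_key(thread_id: str, customer_id: str, amount_cents: int) -> str:
    """Derive a stable key so repeated attempts at the same charge share one key."""
    raw = f"{thread_id}:{customer_id}:{amount_cents}"
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    # Truncated digest keeps the key short; the "lg-" prefix is only a label.
    return f"lg-{digest[:32]}"
```

This deduplicates identical retries within a run, but it is not a spend cap. A confused agent that varies the amount or the customer produces a new key each time, and an intentional second identical charge in the same run would be collapsed as well.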

LangGraph's recursion_limit setting, passed in the run config, puts a hard cap on graph steps and raises GraphRecursionError when the cap is reached, which indirectly limits tool calls. But the recursion limit is designed to prevent infinite loops, not to enforce a dollar cap. It counts steps, not dollars, so you'd need to set it unreasonably low to meaningfully constrain spend, which would break legitimate multi-step agents.
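A rough worked example shows why a step limit is a weak spend control. The sketch assumes a simple agent → tools → agent loop in which each node visit consumes one graph step, and the limit and charge values are illustrative rather than taken from any real configuration.

```python
def worst_case_spend_usd(
    recursion_limit: int,
    calls_per_tool_step: int,
    max_charge_usd: float,
) -> float:
    """Approximate upper bound on spend for an agent -> tools -> agent loop.

    Each loop iteration uses two graph steps (agent, then tools), so roughly
    recursion_limit // 2 tool steps can run before the limit stops the graph.
    """
    tool_steps = recursion_limit // 2
    return tool_steps * calls_per_tool_step * max_charge_usd


# Assumed limit of 25 steps, one charge per tool step, charges up to $50 each.
print(worst_case_spend_usd(25, 1, 50.0))  # 600.0
```

Under these assumptions the limit would have to drop to 2 or 3 steps to hold a run to a single charge, at which point the agent can no longer do multi-step work.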

Scoping vault keys per graph run in LangGraph

Issue the vault key at graph-run start, using the thread ID as the run identifier. The vault key travels through graph state alongside other run context:

import httpx
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import InjectedState, ToolNode
from langchain_core.tools import tool
from typing import Annotated, TypedDict, Optional
import stripe, os

class AgentState(TypedDict):
    messages: list
    vault_key: Optional[str]       # added to state
    thread_id: str

def issue_vault_key_node(state: AgentState) -> dict:
    """First node: issue a scoped vault key for this graph run."""
    r = httpx.post(
        "https://proxy.keybrake.com/vault/keys",
        headers={"Authorization": f"Bearer {os.environ['KEYBRAKE_API_KEY']}"},
        json={
            "vendor": "stripe",
            "daily_usd_cap": 200.0,   # key is issued per run, so this bounds the run
            "allowed_endpoints": ["POST /v1/payment_intents", "GET /v1/customers/*"],
            "expires_in": "2h",
            "agent_run_label": f"langgraph-billing/{state['thread_id']}",
        },
    )
    r.raise_for_status()              # fail here rather than continue without a key
    return {"vault_key": r.json()["vault_key"]}   # partial state update

@tool
def charge_customer(
    customer_id: str,
    amount_cents: int,
    vault_key: Annotated[str, InjectedState("vault_key")],
    thread_id: Annotated[str, InjectedState("thread_id")],
) -> str:
    """Charge a customer via Stripe."""
    stripe.api_key = vault_key                           # scoped key
    stripe.api_base = "https://proxy.keybrake.com/stripe"
    intent = stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        # Scoped to the run so a legitimate charge in another run is not deduplicated.
        idempotency_key=f"lg-{thread_id}-{customer_id}-{amount_cents}",
    )
    return f"Charged {amount_cents} cents. Intent: {intent.id}"

# InjectedState arguments are hidden from the tool schema: the model supplies only
# customer_id and amount_cents, and ToolNode fills in the rest from graph state.
def call_model(state: AgentState):
    # ... LLM call with the charge_customer tool bound goes here
    pass

builder = StateGraph(AgentState)
builder.add_node("setup", issue_vault_key_node)     # vault key issuance first
builder.add_node("agent", call_model)
builder.add_node("tools", ToolNode([charge_customer]))
builder.set_entry_point("setup")                     # the run starts at key issuance
builder.add_edge("setup", "agent")
builder.add_edge("tools", "agent")
builder.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph = builder.compile(checkpointer=memory_saver)

The vault key is issued once per graph run in the setup node, stored in graph state, and injected into tool calls that need it. Every cycle through the tools node uses the same vault key and the same per-run cap. When the cap is hit, the proxy returns a 429 and the tool call fails with an exception. Replacing the fixed tools → agent edge with a conditional edge lets you route that failure to an error-handling node rather than back into the agent loop.
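The routing step can be sketched as a function for a conditional edge out of the tools node. This is an illustration under stated assumptions: the function name, the "budget_exceeded" route, and the idea that the tool's error text contains the HTTP status are all hypothetical, and the exact error text depends on how your tools and ToolNode report failures.

```python
from typing import Any

# Hypothetical marker: assumes the tool's error text includes the HTTP status.
CAP_EXCEEDED_MARKER = "429"


def _field(message: Any, name: str, default: Any = None) -> Any:
    """Read a field from either a dict-shaped or attribute-shaped message."""
    if isinstance(message, dict):
        return message.get(name, default)
    return getattr(message, name, default)


def route_after_tools(state: dict[str, Any]) -> str:
    """Send cap failures to a dedicated node; everything else returns to the agent."""
    messages = state.get("messages") or []
    if not messages:
        return "agent"
    last = messages[-1]
    failed = _field(last, "status") == "error"
    content = str(_field(last, "content", ""))
    if failed and CAP_EXCEEDED_MARKER in content:
        return "budget_exceeded"
    return "agent"
```

Wired in with builder.add_conditional_edges("tools", route_after_tools, {"agent": "agent", "budget_exceeded": "budget_exceeded"}), where "budget_exceeded" is a node you define, the agent loop stops retrying as soon as the cap is reached.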

Multi-actor LangGraph: supervisor patterns and credential isolation

LangGraph's supervisor pattern uses a parent agent to orchestrate multiple worker subagents. Each worker can have its own tools, including payment tools. In a naive implementation, all workers share the same underlying API keys, which means a single misbehaving worker can run up vendor spend on behalf of the entire multi-agent system, with nothing attributing or limiting it per worker.

The vault key pattern adds the missing isolation: issue a vault key for the entire supervisor run, and optionally allocate sub-caps per worker. Worker A gets a vault key with a $100 cap; Worker B gets a vault key with a $50 cap; the supervisor run's total cap is $200. If Worker A exhausts its cap, Worker B can still operate. If the supervisor run exceeds $200, all workers get 429s and the supervisor handles the budget exception.
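The cap semantics described above can be modeled locally to make the isolation concrete. This is a toy model of the behavior, not Keybrake's implementation: RunBudget and CapExceeded are hypothetical names, amounts are in cents, and CapExceeded stands in for the 429 a proxy would return.

```python
from dataclasses import dataclass, field


class CapExceeded(Exception):
    """Raised where a proxy would return HTTP 429."""


@dataclass
class RunBudget:
    total_cap_cents: int
    worker_caps_cents: dict[str, int]
    spent_cents: dict[str, int] = field(default_factory=dict)

    def charge(self, worker: str, amount_cents: int) -> None:
        """Record a charge, or raise if it would break the worker or run cap."""
        worker_spent = self.spent_cents.get(worker, 0)
        if worker_spent + amount_cents > self.worker_caps_cents[worker]:
            raise CapExceeded(f"{worker} cap exceeded")
        if sum(self.spent_cents.values()) + amount_cents > self.total_cap_cents:
            raise CapExceeded("run cap exceeded")
        self.spent_cents[worker] = worker_spent + amount_cents


# Worker A: $100 cap, Worker B: $50 cap, whole run: $200 cap.
budget = RunBudget(20_000, {"worker_a": 10_000, "worker_b": 5_000})
budget.charge("worker_a", 10_000)   # Worker A is now at its cap
budget.charge("worker_b", 2_500)    # Worker B is unaffected
```

Any further charge from worker_a raises CapExceeded while worker_b keeps working, which is the isolation a shared key cannot provide.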

How Keybrake fits

Keybrake is the proxy layer between your LangGraph tool nodes and Stripe, Twilio, or Resend. You add a setup node to issue a vault key at graph-run start, pass it through graph state, and inject it into tool calls that need it. The real Stripe secret stays in Keybrake, not in your LangGraph environment. The per-run cap fires as a catchable tool exception — your graph's conditional edges can route it to a human-review node, a retry-with-lower-amount path, or a clean terminal state.
