LangGraph Stripe Integration: Restricted API Keys, Spend Caps, and Agent Governance

LangGraph's graph-based agent design is a clean model for stateful AI workflows. The same features that make it powerful (conditional retry edges, persistent thread checkpoints, and parallel Send dispatch) are the ones that can silently create duplicate Stripe charges, replay billing on resumed threads, and re-bill an entire fan-out when the orchestration is retried.

LangGraph is LangChain's framework for building stateful, multi-actor agent workflows as explicit graphs. Instead of a linear agent loop, you define nodes (functions), edges (transitions), and state (a typed dict that persists across nodes). The result is a workflow where you can inspect state at every step, pause and resume mid-run, and fan out to parallel subgraphs. For agents that process Stripe billing — subscription charges, usage-based invoicing, payment reconciliation — LangGraph's structured approach is well-suited. The catch is that none of LangGraph's built-in behaviors (conditional routing, checkpointing, Send dispatch) know anything about Stripe idempotency or spend limits. That blind spot is invisible until a conditional edge routes the agent back to the billing node it just ran.

This post covers three failure modes specific to LangGraph's architecture, with Python code for each, and the two-layer governance pattern — content-hash idempotency keys plus per-run vault keys via a spend-cap proxy — that closes all three.

Failure mode 1: Conditional retry edge routes back to the billing node

The most common LangGraph pattern for tool-calling agents is the ReAct loop: an agent node calls the LLM, a tools node executes whatever tools the LLM requested, and a conditional edge routes back to the agent when tool calls are present or to END when they're not. This pattern is the foundation of LangGraph's introductory tutorial and is common in production deployments.

The billing failure emerges when the tools node executes a billing tool that succeeds at the Stripe layer but raises a Python exception afterward. A common case: stripe.Charge.create() completes and returns a charge ID, but the follow-up operation (writing to a database, calling a webhook, sending a confirmation email) raises an exception. When ToolNode's handle_tool_errors option is enabled, it catches this exception and returns a ToolMessage with status="error" containing the exception text. The agent node receives this message, the LLM reads the error, and, depending on the system prompt, generates a new AIMessage with tool_calls requesting the billing tool again. The conditional edge routes to tools. The billing tool runs again. Stripe creates a second charge.

# graph.py — UNSAFE: conditional edge re-fires billing on tool error
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_core.messages import BaseMessage
from typing import Annotated, Sequence
from typing_extensions import TypedDict
import operator
import stripe

stripe.api_key = "sk_live_..."  # Bare production key

def charge_customer(customer_id: str, amount_cents: int, billing_period: str) -> dict:
    """Charge a customer for their subscription."""
    # Stripe charge succeeds — charge ID is returned
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        # No idempotency_key — every call creates a new charge object
    )

    # If this raises, ToolNode returns a ToolMessage with status="error" to the LLM
    record_charge_in_db(customer_id, charge["id"], billing_period)

    return {"charge_id": charge["id"], "status": "succeeded"}

class State(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]

def should_continue(state: State):
    last = state["messages"][-1]
    # Any tool call in the last message → route to tools node
    if last.tool_calls:
        return "tools"
    return END

tools = [charge_customer]
tool_node = ToolNode(tools)

builder = StateGraph(State)
builder.add_node("agent", call_model)       # calls LLM
builder.add_node("tools", tool_node)        # executes tool calls
builder.set_entry_point("agent")
builder.add_conditional_edges("agent", should_continue)
builder.add_edge("tools", "agent")          # always route back to agent
graph = builder.compile()

When record_charge_in_db raises a database connection error, ToolNode returns a ToolMessage with status="error" whose content carries the exception text (for example "connection refused"). The LLM sees a failed tool call and calls charge_customer again with the same parameters. Stripe creates a second charge, call it ch_B, next to the original ch_A: the customer has been charged twice for the same billing period.

The fix adds a content-hash idempotency key derived from the billing parameters, and separates the database write failure from the charge result so the LLM can retry only the write without re-firing the charge:

# graph.py — SAFE: content-hash idempotency key + separated error handling
import hashlib
import stripe

def make_idempotency_key(
    customer_id: str, amount_cents: int, billing_period: str
) -> str:
    payload = f"{customer_id}:{amount_cents}:{billing_period}:langgraph-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]

def charge_customer(customer_id: str, amount_cents: int, billing_period: str) -> dict:
    """Charge a customer for their subscription."""
    idempotency_key = make_idempotency_key(customer_id, amount_cents, billing_period)

    # Stripe returns the original charge object on any retry with the same key
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        idempotency_key=idempotency_key,
    )

    db_error = None
    try:
        record_charge_in_db(customer_id, charge["id"], billing_period)
    except Exception as e:
        db_error = str(e)

    if db_error:
        # Return the charge ID regardless of DB failure — the charge is real.
        # The LLM can call a separate record_charge tool without re-billing.
        return {
            "charge_id": charge["id"],
            "status": "succeeded",
            "db_error": db_error,
            "message": "Charge completed. DB write failed — call record_charge to retry.",
        }

    return {"charge_id": charge["id"], "status": "succeeded"}

The idempotency key is stable across all retries: the same customer, amount, and billing period always hash to the same key. While Stripe still retains the key (keys can be pruned once they are at least 24 hours old), it returns the original charge object and no second charge is created. Returning the charge ID alongside the DB error prevents the LLM from conflating "the DB write failed" with "the charge failed," breaking the retry loop that would otherwise re-fire Stripe.
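The error message in the safe tool points the LLM at a record_charge tool that the listing does not define. A minimal sketch of what such a tool could look like, assuming a SQLite table named charges (the schema, the init_db helper, and the connection handling are hypothetical, not part of the original code). The important property is that it never calls Stripe and that writing the same charge twice is harmless:

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per Stripe charge, keyed by charge ID.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS charges (
               charge_id TEXT PRIMARY KEY,
               customer_id TEXT NOT NULL,
               billing_period TEXT NOT NULL
           )"""
    )


def record_charge(
    conn: sqlite3.Connection, customer_id: str, charge_id: str, billing_period: str
) -> dict:
    """Record an existing Stripe charge locally. Never calls Stripe."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT OR IGNORE INTO charges (charge_id, customer_id, billing_period) "
                "VALUES (?, ?, ?)",
                (charge_id, customer_id, billing_period),
            )
    except sqlite3.Error as exc:
        # Structured error: the LLM can retry this write without touching billing.
        return {"recorded": False, "error": str(exc)}
    # rowcount is 0 when the row already existed and the insert was ignored.
    return {"recorded": True, "already_recorded": cur.rowcount == 0}
```

Exposed as a separate tool, this gives the LLM a retry target whose worst case is a redundant database write rather than a second charge.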

Failure mode 2: MemorySaver checkpoint replays billing on resumed thread

LangGraph's persistence layer — MemorySaver, SqliteSaver, PostgresSaver — stores the complete message list for every thread. This enables human-in-the-loop workflows, multi-turn conversations, and mid-run resume after failure. The billing risk is that every prior tool call and its result are visible in the thread history when the agent continues on the same thread_id.

The failure pattern: a billing agent runs on June 1, charges customer cus_abc for May 2026, and writes the successful charge ID to the thread history. The thread is persisted in the checkpointer. On July 1, an operator runs the same agent on the same thread_id (a common pattern when agents are associated with a customer ID or account). The message history now includes AIMessage(tool_calls=[charge_customer(cus_abc, 9900, "May-2026")]) and the corresponding ToolMessage(charge_id=ch_xxx, status=succeeded). The new message, "Process billing for cus_abc, same amount", is ambiguous: it doesn't specify a billing period. The LLM sees the prior May charge in context and may reuse "May-2026", creating a duplicate May charge. If the system prompt is underspecified, it may instead charge "July-2026" and also repeat the May charge.

# graph.py — UNSAFE: same thread_id on second month re-exposes prior billing context
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, END

checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)

# June 1: charges customer for May 2026
result = graph.invoke(
    {"messages": [("user", "Charge cus_abc $99 for May 2026")]},
    config={"configurable": {"thread_id": "cus_abc"}},  # customer ID as thread
)
# Persisted: charge_customer(cus_abc, 9900, "May-2026") → {charge_id: ch_xxx}

# July 1: ambiguous continuation on the same thread
result = graph.invoke(
    {"messages": [("user", "Process billing for cus_abc, same amount")]},
    config={"configurable": {"thread_id": "cus_abc"}},  # SAME thread_id
    # LLM sees prior May charge in context — may infer "same amount" = May period
    # and re-execute charge for already-billed period
)

There are two compounding risks: the LLM may use "May-2026" from prior context (duplicate May charge), or it may use "July-2026" from the current date but also re-confirm the May charge based on prior context (extra charge for a period that wasn't requested). Both create billing errors the customer will dispute.

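The fix operates at three levels: use period-scoped thread IDs so each billing period gets a clean conversation context, require an explicit billing period in every invocation, and use content-hash idempotency keys so that a repeated call for an already-billed period produces the same key. Stripe returns the original charge object for a repeated key only while it still retains that key (keys can be pruned once they are at least 24 hours old), so the key covers same-day replays while the first two levels cover a replay that happens a month later: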

# graph.py — SAFE: period-scoped thread_id + explicit period + idempotency key
from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3

conn = sqlite3.connect("billing_checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)
graph = builder.compile(checkpointer=checkpointer)

# System prompt for the agent node. call_model (defined elsewhere) is assumed
# to prepend it to the message list before calling the LLM.
BILLING_SYSTEM_PROMPT = """You are a billing agent. When processing billing:
1. You MUST use the billing_period from the current message in YYYY-MM format.
2. NEVER infer the billing period from prior conversation history.
3. If billing_period is not specified in the current message, ask for it.
4. Each billing period is independent: prior charges do not imply current charges."""

# July 1: use a period-scoped thread_id — fresh context per billing cycle
result = graph.invoke(
    {"messages": [("user", "Charge cus_abc $99 for billing period 2026-07")]},
    config={"configurable": {"thread_id": "cus_abc:2026-07"}},
    #                                       ^^^^^^^^^^^^^^^^^^^
    #           Period-scoped thread: prior billing history is not in this context
)

# The idempotency key in charge_customer ensures that even if the same
# thread is invoked again, Stripe returns the original charge — no duplicate.

Period-scoped thread IDs are the most important defense: "cus_abc:2026-07" starts with an empty message list regardless of what happened in "cus_abc:2026-06". The LLM cannot see prior billing history because the prior billing happened in a different thread. The idempotency key provides a second layer for cases where the same period-scoped thread is invoked twice (concurrent runs, operator error, double-click).
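The system prompt asks the LLM for a YYYY-MM period, but a prompt is a request rather than a guarantee. A small sketch of enforcing the same rule in code when building the thread config; the helper names billing_thread_id and billing_config are hypothetical. It rejects anything that is not strictly YYYY-MM (including the "May-2026" style used in the unsafe example) instead of guessing, because two spellings of the same month would otherwise produce two thread IDs and two idempotency keys:

```python
import re

# Strict YYYY-MM with a valid month; anything else is refused rather than guessed.
_PERIOD_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def billing_thread_id(customer_id: str, billing_period: str) -> str:
    """Build a period-scoped thread ID such as 'cus_abc:2026-07'."""
    if not _PERIOD_RE.fullmatch(billing_period):
        raise ValueError(
            f"billing_period must be YYYY-MM, got {billing_period!r}"
        )
    return f"{customer_id}:{billing_period}"


def billing_config(customer_id: str, billing_period: str) -> dict:
    """Config dict in the shape passed to graph.invoke(..., config=...)."""
    return {"configurable": {"thread_id": billing_thread_id(customer_id, billing_period)}}
```

Calling billing_config("cus_abc", "2026-07") yields the same config as the safe example above, and a caller that omits or misformats the period fails before the graph runs.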

Failure mode 3: Send API fan-out re-bills on dispatch retry

LangGraph's Send API enables dynamic parallel dispatch: an orchestrator node can return a list of Send objects, each spawning a parallel subgraph instance with its own state. This is the standard pattern for processing a batch of customers in parallel — one Send per customer, each handled by a billing_agent subgraph that runs independently.

The failure mode emerges when the orchestrating node that generates the Send list is retried after partial completion. If the parent graph's orchestrator raises an exception after some subgraph instances have been dispatched, or if the parent graph is interrupted and re-invoked from the start, the orchestrator re-executes and generates a new batch of Send objects for all customers, including those whose billing subgraphs already completed successfully in the prior dispatch. Each new Send spawns a fresh subgraph instance with no knowledge of the prior subgraph's Stripe charge.

# batch_graph.py — UNSAFE: Send fan-out re-bills on orchestrator retry
from langgraph.graph import StateGraph, START, END
from langgraph.types import Send
from typing import Annotated
from typing_extensions import TypedDict
import operator
import stripe

class BatchState(TypedDict):
    customers: list[dict]
    billing_results: Annotated[list, operator.add]  # accumulates from parallel nodes

class CustomerState(TypedDict):
    customer_id: str
    amount_cents: int
    billing_period: str
    charge_id: str | None

def dispatch(state: BatchState) -> dict:
    # Orchestrator node: a real batch would load or validate the customer list here
    return {"customers": state["customers"]}

def dispatch_billing(state: BatchState):
    # Routing function on the edge out of "dispatch".
    # Generates one Send per customer: all fire in parallel
    return [
        Send("charge_customer_node", {
            "customer_id": c["id"],
            "amount_cents": c["amount_cents"],
            "billing_period": c["billing_period"],
        })
        for c in state["customers"]
    ]

def charge_customer_node(state: CustomerState) -> dict:
    # No idempotency key: new charge on every invocation
    charge = stripe.Charge.create(
        amount=state["amount_cents"],
        currency="usd",
        customer=state["customer_id"],
        description=f"Subscription {state['billing_period']}",
    )
    # Append to BatchState.billing_results through its operator.add reducer
    return {"billing_results": [
        {"customer_id": state["customer_id"], "charge_id": charge["id"]}
    ]}

builder = StateGraph(BatchState)
builder.add_node("dispatch", dispatch)
builder.add_node("charge_customer_node", charge_customer_node)
builder.add_edge(START, "dispatch")
# Send objects must come from the conditional-edge function, not from a node's return value
builder.add_conditional_edges("dispatch", dispatch_billing, ["charge_customer_node"])
builder.add_edge("charge_customer_node", END)
graph = builder.compile()

# If graph.invoke() fails after some customers are charged and is retried,
# dispatch_billing regenerates all Sends → all customers are re-charged

In a batch of 50 customers where 30 completed before the orchestrator failed, a retry re-bills those 30. Without idempotency keys, 30 customers receive duplicate charges that will generate disputes, chargebacks, and support escalations.

The fix adds content-hash idempotency keys inside the subgraph node so that any re-dispatch of an already-completed customer returns the original Stripe charge object rather than creating a new one:

# batch_graph.py — SAFE: idempotency key inside each subgraph node
import hashlib
import stripe

def charge_customer_node(state: CustomerState) -> dict:
    idempotency_key = hashlib.sha256(
        f"{state['customer_id']}:{state['amount_cents']}:{state['billing_period']}:langgraph-batch".encode()
    ).hexdigest()[:36]

    # Stripe returns the original charge on any re-dispatch with the same key
    charge = stripe.Charge.create(
        amount=state["amount_cents"],
        currency="usd",
        customer=state["customer_id"],
        description=f"Subscription {state['billing_period']}",
        idempotency_key=idempotency_key,
    )

    # Append to BatchState.billing_results through its operator.add reducer
    return {"billing_results": [{
        "customer_id": state["customer_id"],
        "charge_id": charge["id"],
        "status": charge["status"],
    }]}

# Now a retry of dispatch_billing safely re-sends to all customers:
# - Already-charged customers: Stripe returns existing charge object (no duplicate)
# - Not-yet-charged customers: Stripe creates new charge (correct behavior)
# While Stripe retains the keys (they can be pruned once they are 24 hours old),
# re-running the batch produces the same charges, however many times it runs.

Each customer's idempotency key is deterministic from their billing parameters. A re-dispatched Send for an already-charged customer hits Stripe with the same key and receives the same charge object. The billing result is identical to the first run — no duplicate charge, no customer dispute.
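Stripe only guarantees the replay behavior while it still holds the key, and keys can be pruned once they are at least 24 hours old. A batch that is retried the next day needs a second check. One possible sketch, assuming a local SQLite ledger keyed by the same content hash; the billing_ledger table, init_ledger, and charge_once are hypothetical names, and create_charge stands for whatever function performs the real Stripe call with the key passed as idempotency_key:

```python
import sqlite3
from typing import Callable


def init_ledger(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per billing intent that has been charged.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS billing_ledger (
               billing_key TEXT PRIMARY KEY,
               charge_id TEXT NOT NULL
           )"""
    )


def charge_once(
    conn: sqlite3.Connection,
    billing_key: str,
    create_charge: Callable[[str], str],
) -> dict:
    """Charge at most once per billing_key, consulting the local ledger first."""
    row = conn.execute(
        "SELECT charge_id FROM billing_ledger WHERE billing_key = ?",
        (billing_key,),
    ).fetchone()
    if row is not None:
        # Already billed: return the stored charge ID without calling Stripe.
        return {"charge_id": row[0], "replayed": True}

    # Not in the ledger: call Stripe, still passing billing_key as the
    # idempotency key so a crash before the insert below is covered on retry.
    charge_id = create_charge(billing_key)
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO billing_ledger (billing_key, charge_id) VALUES (?, ?)",
            (billing_key, charge_id),
        )
    return {"charge_id": charge_id, "replayed": False}
```

The two layers cover different time scales in this design: the Stripe key handles retries and races that happen close together, and the ledger handles re-dispatch long after Stripe has forgotten the key.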

Adding vault key isolation and a spend cap

The idempotency key prevents duplicate charges within a billing cycle, but it doesn't prevent a stuck LangGraph loop from creating charges across different billing periods, or a prompt injection from calling stripe.Refund.create() on a customer's prior charges. A raw Stripe key gives the agent access to every Stripe endpoint with no spend limit of any kind. A single misconfigured conditional edge or an injected billing period can run through the month's billing budget before the on-call engineer is paged.

The second layer replaces the raw Stripe key with a scoped vault key issued by a spend-cap proxy. The proxy holds the real Stripe key, enforces a daily USD cap per vault key, restricts the agent to specific Stripe endpoints, and logs every call with parsed cost:

# tools.py — SAFE: vault key factory + proxy + idempotency key
import hashlib
import stripe
from langchain_core.tools import tool

def make_billing_tool(vault_key: str, proxy_url: str = "https://proxy.keybrake.com"):
    """Create a billing tool scoped to a specific vault key."""

    # Point the Stripe client at the proxy instead of api.stripe.com
    stripe_client = stripe.StripeClient(
        api_key=vault_key,
        base_addresses={"api": proxy_url},
        # The proxy reads the vault key from the Authorization: Bearer <vault_key> header,
        # looks up the real Stripe key, enforces the daily cap and endpoint allowlist,
        # forwards the request, and logs the parsed cost to the audit table.
    )

    @tool
    def charge_customer(customer_id: str, amount_cents: int, billing_period: str) -> dict:
        """Charge a customer for their subscription via the governed proxy.

        Args:
            customer_id: Stripe customer ID (cus_xxx)
            amount_cents: Amount to charge in cents (e.g., 9900 for $99.00)
            billing_period: Billing period in YYYY-MM format (e.g., 2026-07)
        """
        idempotency_key = hashlib.sha256(
            f"{customer_id}:{amount_cents}:{billing_period}:langgraph-billing".encode()
        ).hexdigest()[:36]

        try:
            charge = stripe_client.charges.create(
                params={
                    "amount": amount_cents,
                    "currency": "usd",
                    "customer": customer_id,
                    "description": f"Subscription {billing_period}",
                },
                options={"idempotency_key": idempotency_key},
            )
        except stripe.StripeError as e:
            # Return structured error — do not re-raise, which would trigger retry loop
            return {"error": e.user_message, "error_code": e.code}

        return {"charge_id": charge.id, "status": charge.status}

    return charge_customer

# Per-run vault key: issued at Keybrake dashboard with:
# { vendor: 'stripe', allowed_endpoints: ['POST /v1/charges'], daily_usd_cap: 500 }
import os
billing_tool = make_billing_tool(os.environ["KEYBRAKE_VAULT_KEY"])

The vault key is scoped to POST /v1/charges — the agent cannot call POST /v1/refunds, DELETE /v1/customers/{id}, or any other Stripe endpoint. The daily cap means the proxy returns a structured error once the billing budget is exhausted, and because the charge_customer tool returns this as a structured dict rather than re-raising, the LangGraph agent receives a clean error message instead of routing back to the tool node in an endless retry loop.
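To make the two proxy-side checks concrete, here is a toy model of an endpoint allowlist combined with a daily cap. It is an illustration of the behavior described above under stated assumptions, not Keybrake's implementation: the VaultPolicy class, its fields, and the "endpoint_not_allowed" reason string are all hypothetical ("cap_exhausted" mirrors the error code used in the tests below).

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class VaultPolicy:
    """Toy per-vault-key policy: endpoint allowlist plus a daily spend cap."""

    allowed_endpoints: frozenset[str]
    daily_cap_cents: int
    # Running total of authorized spend per calendar day.
    spent_cents: dict[date, int] = field(default_factory=dict)

    def authorize(
        self, method: str, path: str, amount_cents: int, today: date
    ) -> tuple[bool, str]:
        endpoint = f"{method.upper()} {path}"
        if endpoint not in self.allowed_endpoints:
            return False, "endpoint_not_allowed"
        spent = self.spent_cents.get(today, 0)
        if spent + amount_cents > self.daily_cap_cents:
            # Refuse before forwarding, so the cap is never exceeded.
            return False, "cap_exhausted"
        self.spent_cents[today] = spent + amount_cents
        return True, "ok"
```

With a cap of 50,000 cents and $99 charges, the first five requests on a given day are authorized and the sixth is refused. A request to any endpoint outside the allowlist is refused regardless of the remaining budget.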

Governance comparison

Each cell reads as Exposed (the failure can occur), Partial, or Mitigated.

| Concern | Raw Stripe key | Restricted key only | Idempotency key only | Vault key + proxy only | Idempotency + vault key |
|---|---|---|---|---|---|
| Conditional retry edge re-fires billing | Exposed: new charge on every retry | Exposed: restriction doesn't prevent duplication | Mitigated: Stripe deduplicates on key | Exposed: proxy forwards all charges | Mitigated: key prevents duplication at Stripe layer |
| Thread checkpoint replays prior billing | Exposed: same period re-billed | Exposed: restriction doesn't prevent replay | Partial: deduplicates only if same period inferred | Partial: cap limits total damage | Partial: period-scoped thread + key + cap all reduce risk |
| Send fan-out re-bills on orchestrator retry | Exposed: all customers re-charged | Exposed: restriction doesn't help fan-out | Mitigated: Stripe deduplicates each re-dispatched customer | Exposed: proxy forwards all re-dispatched charges | Mitigated: idempotency key deduplicates re-dispatch |
| Runaway loop exhausts Stripe budget | Exposed: unbounded spend | Partial: key scope limits endpoints | Exposed: new billing period = new key = new charge | Mitigated: proxy enforces daily USD cap | Mitigated: proxy cap stops loop after budget |
| Prompt injection reaches Stripe endpoints | Exposed: any endpoint accessible | Partial: key restricts endpoint scope | Exposed: doesn't restrict endpoints | Mitigated: proxy allowlist enforces endpoint scope | Mitigated: allowlist + cap enforce boundaries |
| Audit log of agent billing activity | Manual Stripe dashboard query | Manual Stripe dashboard query | Manual Stripe dashboard query | Automatic: proxy logs every call with cost | Automatic: proxy logs every call with cost |

Pytest enforcement suite

These tests verify the governance layer without hitting the live Stripe API or the proxy:

# tests/test_billing.py
import pytest
import hashlib
from unittest.mock import MagicMock, patch
from tools import make_billing_tool

@pytest.fixture
def mock_stripe_client():
    with patch("tools.stripe.StripeClient") as MockClient:
        client_instance = MagicMock()
        client_instance.charges.create.return_value = MagicMock(
            id="ch_test_123", status="succeeded"
        )
        MockClient.return_value = client_instance
        yield client_instance

def test_idempotency_key_stable_across_retries(mock_stripe_client):
    """Same billing params → same idempotency key on every call."""
    tool = make_billing_tool("vk_test_key")
    args = {"customer_id": "cus_abc", "amount_cents": 9900, "billing_period": "2026-06"}

    tool.invoke(args)
    tool.invoke(args)  # Simulated retry

    calls = mock_stripe_client.charges.create.call_args_list
    assert len(calls) == 2
    key1 = calls[0].kwargs["options"]["idempotency_key"]
    key2 = calls[1].kwargs["options"]["idempotency_key"]
    assert key1 == key2, "Retry must use same idempotency key"

def test_different_periods_produce_different_keys(mock_stripe_client):
    """Different billing periods → different idempotency keys."""
    tool = make_billing_tool("vk_test_key")
    base = {"customer_id": "cus_abc", "amount_cents": 9900}

    tool.invoke({**base, "billing_period": "2026-05"})
    tool.invoke({**base, "billing_period": "2026-06"})

    calls = mock_stripe_client.charges.create.call_args_list
    key1 = calls[0].kwargs["options"]["idempotency_key"]
    key2 = calls[1].kwargs["options"]["idempotency_key"]
    assert key1 != key2, "Different billing periods must produce different keys"

def test_stripe_error_returned_not_raised(mock_stripe_client):
    """StripeError is returned as structured dict, not re-raised."""
    import stripe as stripe_lib
    mock_stripe_client.charges.create.side_effect = stripe_lib.StripeError(
        "cap_exhausted", code="cap_exhausted"
    )
    tool = make_billing_tool("vk_test_key")
    result = tool.invoke({"customer_id": "cus_abc", "amount_cents": 9900, "billing_period": "2026-06"})
    assert "error" in result
    assert result.get("error_code") == "cap_exhausted"

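def test_proxy_url_used_not_stripe_direct():
    """Billing tool must point at proxy, not api.stripe.com."""
    with patch("tools.stripe.StripeClient") as MockClient:
        MockClient.return_value = MagicMock()
        make_billing_tool("vk_test", "https://proxy.keybrake.com")
        MockClient.assert_called_once()
        assert MockClient.call_args.kwargs["api_key"] == "vk_test"
        # Check the proxy URL without depending on which keyword carries it
        call_repr = repr(MockClient.call_args)
        assert "https://proxy.keybrake.com" in call_repr
        assert "api.stripe.com" not in call_repr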

def test_send_fanout_idempotency_across_customers():
    """Each customer's idempotency key is unique per period."""
    def compute_key(customer_id, amount, period):
        return hashlib.sha256(
            f"{customer_id}:{amount}:{period}:langgraph-batch".encode()
        ).hexdigest()[:36]

    customers = [
        {"id": "cus_001", "amount_cents": 9900, "billing_period": "2026-06"},
        {"id": "cus_002", "amount_cents": 9900, "billing_period": "2026-06"},
    ]
    keys = [compute_key(c["id"], c["amount_cents"], c["billing_period"]) for c in customers]
    assert len(set(keys)) == len(keys), "Each customer must get a unique idempotency key"

Gap analysis

Interrupt-before resume re-executes the billing node

LangGraph supports human-in-the-loop via interrupt_before=["billing_node"]: the graph pauses before executing the specified node and waits for a human to approve or modify state before resuming. The billing risk in this pattern is a duplicated resume. If the resume request is sent twice (double-click, network retry, operator error), the billing node can execute twice. Without an idempotency key, two Stripe charges are created for the same billing state. The content-hash idempotency key handles this: both resume executions produce the same key from the same state, and Stripe returns the original charge for the second request. Verify that your interrupt-resume flow passes the same state to the billing node on both resume attempts. If state mutation between resumes changes the billing parameters, the idempotency key changes and the second resume creates a second charge.

Subgraph delegation creates two independent billing paths

LangGraph subgraphs compile as independent graphs invoked from a parent node. If both the parent graph and a subgraph have access to a billing tool — a pattern that emerges when a generic tools node is shared across graph levels — both can independently trigger a Stripe charge for the same customer in the same session. This is not a retry issue; it's a configuration issue where two independent execution paths each call the billing tool once. The idempotency key deduplicates this if both paths use the same key material (same customer, amount, period). If the parent charges for a full month and the subgraph charges for a pro-rated partial month, they produce different keys and both charges go through — which may be correct, but is worth verifying explicitly.

Streaming graph execution may mask billing completion before client disconnect

LangGraph's graph.stream() yields events as they're produced. A client that streams a billing graph run may disconnect mid-stream after the billing node has completed but before the stream is fully consumed. If the caller retries the full graph run after a disconnect, the billing node re-executes. The content-hash idempotency key handles this: the retry produces the same key and Stripe returns the original charge. The subtler risk is that the caller may log the retry run as a fresh billing event in their own system without checking whether a Stripe charge already exists for that period. Always derive your billing system's record of truth from Stripe's charge objects (for example, by storing the billing key in the charge's metadata and matching on it) rather than from the number of times your graph executed.

LangGraph Cloud's durable execution retries failed nodes automatically

LangGraph Cloud (the hosted execution platform) provides durable execution with automatic retry of failed nodes. This is transparent to your graph code — a node that raised an exception is retried by the platform, not by your conditional edge logic. If your billing node raises an exception (network timeout, database write failure), LangGraph Cloud retries the node without the LLM being involved. The content-hash idempotency key handles this retry at the Stripe layer — the platform retry produces the same key and Stripe deduplicates. However, if you're using LangGraph Cloud and also have a conditional retry edge in your graph, you have two independent retry paths for the same billing node: the platform retries the node on exception, and the LLM routes back to the node on tool error. Make sure your billing node returns a structured result rather than raising on partial failure — raising causes platform retry; returning an error dict routes the LLM's conditional edge decision. Use the StripeError catch-and-return pattern shown above to stay on the LLM path and avoid the platform retry path for billing specifically.

FAQ

Can I use LangGraph's run ID as the idempotency key instead of a content hash?

No. LangGraph assigns a new run ID to every graph.invoke() call, including retries. A conditional edge retry or a re-invoked graph both get new run IDs — using the run ID as the idempotency key means every retry creates a new Stripe charge. The idempotency key must be derived from the billing intent (customer, amount, period, charge type), which is stable across all retries for the same billing period. The run ID is useful for tracing which graph execution created a charge, but it should be stored alongside the idempotency key in your audit log, not used as the key itself.
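The difference is easy to see against a stand-in for Stripe's same-key replay behavior. In this sketch FakeStripe is an in-memory toy (not the Stripe SDK), and uuid4 stands in for a fresh run ID on every invocation; both are assumptions made for illustration:

```python
import hashlib
import uuid


class FakeStripe:
    """In-memory toy that returns the stored charge when a key repeats."""

    def __init__(self) -> None:
        self._charges_by_key: dict[str, str] = {}

    def create_charge(self, idempotency_key: str) -> str:
        if idempotency_key not in self._charges_by_key:
            new_id = f"ch_fake_{len(self._charges_by_key) + 1}"
            self._charges_by_key[idempotency_key] = new_id
        return self._charges_by_key[idempotency_key]

    @property
    def charge_count(self) -> int:
        return len(self._charges_by_key)


def content_key(customer_id: str, amount_cents: int, billing_period: str) -> str:
    payload = f"{customer_id}:{amount_cents}:{billing_period}:langgraph-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]


def bill_with_run_id(api: FakeStripe, attempts: int) -> int:
    """Each attempt uses a fresh ID, as a new run would. Returns charges created."""
    for _ in range(attempts):
        api.create_charge(uuid.uuid4().hex)
    return api.charge_count


def bill_with_content_key(api: FakeStripe, attempts: int) -> int:
    """Each attempt derives the key from the billing intent. Returns charges created."""
    for _ in range(attempts):
        api.create_charge(content_key("cus_abc", 9900, "2026-07"))
    return api.charge_count
```

Three attempts keyed by run ID leave three charges in the toy store; three attempts keyed by content hash leave one.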

What's the right thread_id scheme for recurring billing agents?

Period-scoped thread IDs — {customer_id}:{billing_period}, for example "cus_abc:2026-07" — are the safest default. Each billing period starts with a clean message history containing no prior charges, eliminating checkpoint replay as a source of billing errors. The downside is that multi-turn human-in-the-loop flows that span billing periods (an agent that collects payment information in one session and charges in another) require explicit state passing between threads rather than relying on thread history. If your use case requires continuity across periods, scope the thread to the customer but inject the billing period as an explicit parameter in every message rather than allowing the LLM to infer it from history.

We use LangGraph's subgraph pattern for billing. Does the idempotency key work across subgraph boundaries?

Yes. Idempotency keys are a Stripe-level concept, not a LangGraph concept. Whether the stripe.charges.create() call originates in a parent graph node, a subgraph node, or a tool inside a subgraph node, Stripe deduplicates on the idempotency key regardless of the caller's graph depth. The key is derived from billing parameters — customer, amount, period — which are available at every graph level if passed correctly through subgraph state. The pattern to avoid: deriving the idempotency key from any runtime value that changes when the invocation path changes (session ID, node execution count). Derive it only from the business-level billing parameters.

How do we handle the case where a customer's amount changes mid-period?

If a customer upgrades or downgrades mid-period, the billing parameters change (different amount), producing a different idempotency key. This is correct behavior — a charge for cus_abc:9900:2026-07 and a charge for cus_abc:4950:2026-07 (pro-rated downgrade) are genuinely different billing events and should each create a new Stripe charge. Include a charge type suffix in the idempotency key to distinguish the cases: "cus_abc:9900:2026-07:full" and "cus_abc:4950:2026-07:prorated-downgrade". This prevents the downgrade charge from being deduplicated against the full charge while still preventing duplicate charges within the same billing event type.
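A sketch of a key builder that carries the charge type, following the "cus_abc:9900:2026-07:full" layout described above. The function name billing_key and the ALLOWED_CHARGE_TYPES set are hypothetical; restricting the type to a fixed vocabulary is a design choice added here so that a free-form string supplied by the LLM cannot mint a new key for the same billing event:

```python
import hashlib

# Hypothetical fixed vocabulary of charge types for this sketch.
ALLOWED_CHARGE_TYPES = frozenset({"full", "prorated-upgrade", "prorated-downgrade"})


def billing_key(
    customer_id: str, amount_cents: int, billing_period: str, charge_type: str
) -> str:
    """Derive an idempotency key from the billing intent, including charge type."""
    if charge_type not in ALLOWED_CHARGE_TYPES:
        raise ValueError(f"unknown charge_type: {charge_type!r}")
    payload = f"{customer_id}:{amount_cents}:{billing_period}:{charge_type}"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]
```

The full-month charge and the pro-rated downgrade now hash to different keys, while a repeat of either one hashes to the same key as its first attempt.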

The proxy rejected a charge with a 402 (daily cap exhausted). How do we resume billing?

Raise the daily cap at the Keybrake dashboard for the affected vault key, then re-trigger the billing graph. Because all prior charges used idempotency keys, a re-triggered graph that runs while Stripe still retains those keys (they can be pruned once they are at least 24 hours old) gets the original charge back for already-billed customers and only creates new charges for customers not yet billed. Within that window the billing run is idempotent end-to-end: the graph can be run multiple times without double-billing any customer. The proxy audit log shows exactly which customers were billed before the cap was hit and which were rejected, so you can verify completeness without querying Stripe's dashboard.

Our LangGraph agent is multi-tenant — multiple customers run billing in parallel. Does the idempotency key scheme handle concurrent runs?

Yes, because each customer's key includes their customer ID. Two concurrent billing runs for cus_abc and cus_xyz in the same billing period produce different idempotency keys and Stripe processes both independently. The risk case is two concurrent runs for the same customer in the same period, for example when a webhook delivery fires twice and two LangGraph instances start simultaneously for the same customer. Both instances compute the same idempotency key and both send POST /v1/charges to Stripe. Stripe creates one charge: the second request either receives the same saved charge object or, if it arrives while the first is still in flight, a conflict error that is safe to retry with the same key. Either way the customer is billed once. At the proxy layer, two requests with the same idempotency key arriving within the 24-hour idempotency window are counted once against the daily cap.

Keybrake: scoped vault keys + spend caps for LangGraph → Stripe workflows

Issue a vault key per graph run. Set a daily USD cap. Point your LangGraph billing tool at proxy.keybrake.com. Every charge is logged, capped, and revocable — without changing your graph structure or your Stripe account setup.