Mastra Stripe Integration: Restricted API Keys, Spend Caps, and Agent Governance

Mastra's TypeScript-native agent loop is one of the cleanest ways to build production agents. The same design patterns that make it clean — automatic tool retry, parallel workflow steps, persistent memory — are the ones that silently create duplicate Stripe charges, parallel billing races, and billing replay on resumed agents.

Mastra is an open-source TypeScript framework for building AI agents and multi-step workflows. It handles the agent loop, tool registration, memory management, and workflow orchestration so you can focus on what your tools actually do. For agents that touch Stripe — subscription billing, usage-based charges, invoice generation — Mastra is a natural fit. The problem is that none of the built-in behaviors (retry, parallelism, memory) have any awareness of external API idempotency or spend limits. That gap is invisible until a stuck loop charges a customer twice.

This post covers three failure modes specific to Mastra's architecture, with TypeScript code for each, and the two-layer governance pattern — content-hash idempotency keys plus per-run vault keys via a spend-cap proxy — that closes all three.

Failure mode 1: Agent retry loop re-fires a completed Stripe charge

Mastra agents follow the standard tool-use loop: generate a response, call a tool, observe the result, continue. When a tool call returns an error — a network timeout, a downstream API failure, a validation exception — the LLM sees the error as an observation and is implicitly encouraged to retry. Depending on your system prompt and the error message, the LLM will often call the billing tool a second time.

The failure pattern emerges when the Stripe charge succeeded but the tool function threw an error afterward. A common case: the Stripe charge is created, but the follow-up step (writing to a database, calling a webhook, sending a confirmation email) fails and the tool function propagates the exception. From the LLM's perspective, the tool returned an error — so the charge must not have gone through. It calls the billing tool again. The customer is billed twice.

A minimal Mastra billing tool without idempotency protection looks like this:

// tools/billing.ts — UNSAFE: no idempotency key
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export const chargeCustomerTool = createTool({
  id: 'charge_customer',
  description: 'Charge a customer for their subscription',
  inputSchema: z.object({
    customerId: z.string(),
    amountCents: z.number(),
    billingPeriod: z.string(),
  }),
  execute: async ({ context }) => {
    const { customerId, amountCents, billingPeriod } = context;

    // Stripe charge created — but if anything below throws,
    // the LLM sees an error and may retry the entire tool call
    const charge = await stripe.charges.create({
      amount: amountCents,
      currency: 'usd',
      customer: customerId,
      description: `Subscription ${billingPeriod}`,
      // No idempotencyKey — every call creates a new charge object
    });

    // If this CRM call fails, the charge already happened
    await updateCRMPaymentStatus(customerId, charge.id);

    return { chargeId: charge.id, status: 'succeeded' };
  },
});

If updateCRMPaymentStatus throws a network error, the tool returns an error to the agent. The LLM's next action is often to retry charge_customer with the same arguments — and Stripe creates a new charge object with a new charge ID. Two charges, one customer, one billing period.

The fix is a content-hash idempotency key derived from the billing parameters, not from the tool invocation:

// tools/billing.ts — SAFE: content-hash idempotency key
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import Stripe from 'stripe';
import { createHash } from 'crypto';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

function billingIdempotencyKey(
  customerId: string,
  amountCents: number,
  billingPeriod: string
): string {
  return createHash('sha256')
    .update(`${customerId}:${amountCents}:${billingPeriod}:mastra-billing`)
    .digest('hex')
    .slice(0, 36); // Stripe idempotency key max length is 255, but 36 is readable
}

export const chargeCustomerTool = createTool({
  id: 'charge_customer',
  description: 'Charge a customer for their subscription',
  inputSchema: z.object({
    customerId: z.string(),
    amountCents: z.number(),
    billingPeriod: z.string(),
  }),
  execute: async ({ context }) => {
    const { customerId, amountCents, billingPeriod } = context;
    const idempotencyKey = billingIdempotencyKey(customerId, amountCents, billingPeriod);

    // Stripe returns the original charge object on any retry with the same key
    const charge = await stripe.charges.create(
      {
        amount: amountCents,
        currency: 'usd',
        customer: customerId,
        description: `Subscription ${billingPeriod}`,
      },
      { idempotencyKey }
    );

    try {
      await updateCRMPaymentStatus(customerId, charge.id);
    } catch (err) {
      // Return the charge ID even if CRM update failed — the charge is real
      // The LLM can retry the CRM update separately without re-charging
      return {
        chargeId: charge.id,
        status: 'succeeded',
        crmUpdateFailed: true,
        message: 'Charge completed. CRM update failed — retry updateCRM separately.',
      };
    }

    return { chargeId: charge.id, status: 'succeeded' };
  },
});

The idempotency key is stable: the same customer, amount, and billing period always produce the same key, regardless of how many times the agent retries. While Stripe retains the key (keys may be pruned once they are at least 24 hours old), every retry after the first returns the original Charge object and no duplicate charge is created. Separating the CRM update error from the charge return prevents the LLM from retrying the charge when only the CRM call failed.
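To make the stability property concrete, here is a minimal standalone sketch that exercises only the hashing helper, with no Stripe client involved. It mirrors billingIdempotencyKey from the tool; the customer ID and billing periods are illustrative values.

```typescript
import { createHash } from 'node:crypto';

// Same derivation as billingIdempotencyKey in the tool: the key depends only
// on the billing intent, never on the tool invocation or a timestamp.
function billingIdempotencyKey(
  customerId: string,
  amountCents: number,
  billingPeriod: string
): string {
  return createHash('sha256')
    .update(`${customerId}:${amountCents}:${billingPeriod}:mastra-billing`)
    .digest('hex')
    .slice(0, 36);
}

// Two attempts at the same billing intent: the first call, then an LLM retry
const firstAttempt = billingIdempotencyKey('cus_abc123', 9900, '2026-05');
const retryAttempt = billingIdempotencyKey('cus_abc123', 9900, '2026-05');

// A genuinely new billing intent: same customer and amount, next period
const nextPeriod = billingIdempotencyKey('cus_abc123', 9900, '2026-06');

console.log(firstAttempt === retryAttempt); // true: the retry reuses the key
console.log(firstAttempt === nextPeriod);   // false: new period, new key
console.log(firstAttempt.length);           // 36
```

One consequence of hashing the amount into the key: if the amount changes for the same customer and period, the key changes too, and Stripe treats the request as a new charge. Whether that is desirable depends on whether a corrected amount should count as a new billing intent in your system.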

Failure mode 2: Parallel Workflow steps fire concurrent Stripe calls

Mastra Workflows support parallel step execution via .parallel(). This is useful for running independent operations simultaneously — fetching enrichment data, updating multiple downstream systems, or processing a batch of customers. The problem arises when two steps that both call Stripe are placed in a parallel branch without cross-call deduplication.

A typical scenario: a billing workflow charges a subscription fee and a setup fee for a new customer. An engineer structures these as parallel steps because they seem independent: two separate Stripe charges for different line items. The parallel execution fires both Stripe calls simultaneously, and neither result is registered in the workflow context until both complete. Without idempotency keys, Stripe treats every POST /v1/charges request as new, so any re-execution of the branch (a step retry, a workflow retry, or a duplicated step with identical parameters) creates additional charge objects.

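// workflows/onboard.ts: UNSAFE (parallel Stripe calls without dedup)
import { Workflow, Step } from '@mastra/core';
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
// validateCustomer and sendWelcomeEmail are other Step instances, omitted here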

const chargeSubscription = new Step({
  id: 'charge_subscription',
  execute: async ({ context }) => {
    // Fires POST /v1/charges — creates charge A
    return stripe.charges.create({
      amount: context.subscriptionAmountCents,
      currency: 'usd',
      customer: context.customerId,
      description: `Subscription ${context.billingPeriod}`,
    });
  },
});

const chargeSetupFee = new Step({
  id: 'charge_setup_fee',
  execute: async ({ context }) => {
    // Fires POST /v1/charges — creates charge B simultaneously
    return stripe.charges.create({
      amount: context.setupFeeAmountCents,
      currency: 'usd',
      customer: context.customerId,
      description: `Setup fee ${context.billingPeriod}`,
    });
  },
});

// Both steps run at the same time — if either fails and the
// workflow is retried, both fire again with no idempotency key
const onboardingWorkflow = new Workflow({ name: 'customer-onboarding' })
  .step(validateCustomer)
  .parallel([chargeSubscription, chargeSetupFee])  // Concurrent Stripe calls
  .step(sendWelcomeEmail)
  .commit();

If the workflow fails at sendWelcomeEmail and is retried, both parallel steps re-execute. Without idempotency keys, the customer receives four charges — two subscription fees and two setup fees — from two workflow runs.

The fix adds unique idempotency keys scoped to each charge type, ensuring parallel execution and workflow retries both produce stable keys:

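// workflows/onboard.ts: SAFE (per-charge-type idempotency keys)
import { Step } from '@mastra/core';
import Stripe from 'stripe';
import { createHash } from 'crypto';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);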

function chargeKey(
  customerId: string,
  amountCents: number,
  billingPeriod: string,
  chargeType: string
): string {
  return createHash('sha256')
    .update(`${customerId}:${amountCents}:${billingPeriod}:${chargeType}`)
    .digest('hex')
    .slice(0, 36);
}

const chargeSubscription = new Step({
  id: 'charge_subscription',
  execute: async ({ context }) => {
    return stripe.charges.create(
      {
        amount: context.subscriptionAmountCents,
        currency: 'usd',
        customer: context.customerId,
        description: `Subscription ${context.billingPeriod}`,
      },
      {
        idempotencyKey: chargeKey(
          context.customerId,
          context.subscriptionAmountCents,
          context.billingPeriod,
          'subscription'  // Unique per charge type
        ),
      }
    );
  },
});

const chargeSetupFee = new Step({
  id: 'charge_setup_fee',
  execute: async ({ context }) => {
    return stripe.charges.create(
      {
        amount: context.setupFeeAmountCents,
        currency: 'usd',
        customer: context.customerId,
        description: `Setup fee ${context.billingPeriod}`,
      },
      {
        idempotencyKey: chargeKey(
          context.customerId,
          context.setupFeeAmountCents,
          context.billingPeriod,
          'setup'  // Different type → different key → different Stripe charge
        ),
      }
    );
  },
});

Now parallel execution is safe: both steps fire simultaneously, and Stripe deduplicates on idempotency key. A workflow retry produces the same keys — Stripe returns the original charge objects. Two charges are created for two different charge types, exactly as intended, regardless of how many times the workflow runs.
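The four-charges-versus-two arithmetic can be checked with a small simulation. This is a sketch only: FakeStripe is a hypothetical in-memory stand-in that models a single behavior (a repeated idempotency key returns the first charge), it runs the two steps sequentially rather than concurrently, and it uses readable raw strings where the workflow would use the hashed chargeKey output.

```typescript
interface FakeCharge {
  id: string;
  amount: number;
  description: string;
}

// Hypothetical stand-in for Stripe's charge creation, not the real client
class FakeStripe {
  readonly charges: FakeCharge[] = [];
  private readonly byKey = new Map<string, FakeCharge>();

  createCharge(amount: number, description: string, idempotencyKey?: string): FakeCharge {
    if (idempotencyKey !== undefined) {
      const existing = this.byKey.get(idempotencyKey);
      if (existing) return existing; // repeated key: hand back the original charge
    }
    const charge: FakeCharge = { id: `ch_${this.charges.length + 1}`, amount, description };
    this.charges.push(charge);
    if (idempotencyKey !== undefined) this.byKey.set(idempotencyKey, charge);
    return charge;
  }
}

// One pass over the two billing steps; calling it twice simulates a workflow retry
function runBillingSteps(stripe: FakeStripe, useKeys: boolean): void {
  const period = '2026-06';
  stripe.createCharge(
    9900,
    `Subscription ${period}`,
    useKeys ? `cus_abc:9900:${period}:subscription` : undefined
  );
  stripe.createCharge(
    2500,
    `Setup fee ${period}`,
    useKeys ? `cus_abc:2500:${period}:setup` : undefined
  );
}

const unsafeStripe = new FakeStripe();
runBillingSteps(unsafeStripe, false);
runBillingSteps(unsafeStripe, false); // retry after sendWelcomeEmail failed

const safeStripe = new FakeStripe();
runBillingSteps(safeStripe, true);
runBillingSteps(safeStripe, true); // same retry, now with per-charge-type keys

console.log(unsafeStripe.charges.length); // 4
console.log(safeStripe.charges.length);   // 2
```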

Failure mode 3: Agent memory replay re-executes billing on resumed context

Mastra agents support persistent memory — conversation history and tool call results stored across sessions. This is essential for long-running agents that need context from prior interactions. The billing risk emerges when a resumed agent sees a prior billing tool call in its memory context and an ambiguous follow-up message causes the LLM to re-execute the charge.

Consider a billing agent that ran successfully last week: it called charge_customer, received a chargeId, and completed the billing cycle. The conversation history stored in Mastra's memory includes the tool call and the successful response. When the same agent is resumed for the next billing cycle, the user sends a message like "process this month's billing" — but the LLM's context window includes the prior cycle's successful charge. On an ambiguous or insufficiently specific follow-up prompt, the LLM may interpret the existing memory as an incomplete prior run and re-execute the charge tool for the period already billed.

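// agent.ts: UNSAFE (memory context enables billing replay)
import { Agent } from '@mastra/core';
import { Memory } from '@mastra/memory';
import { anthropic } from '@ai-sdk/anthropic';
import { chargeCustomerTool } from './tools/billing'; // path assumed from the tool file name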

const billingAgent = new Agent({
  name: 'billing-agent',
  instructions: `You are a billing agent. When asked to process billing,
    call charge_customer with the customer ID, amount, and billing period.`,
  model: anthropic('claude-sonnet-4-6'),
  tools: { chargeCustomerTool },
  memory: new Memory(),  // Persists all tool calls — including prior charges
});

// First run — June 1: charges customer for May billing
await billingAgent.generate([
  { role: 'user', content: 'Process billing for customer cus_abc123, $99, May 2026' }
]);
// Memory now contains: charge_customer(cus_abc123, 9900, 'May-2026') → {chargeId: 'ch_xxx'}

// Second run — July 1: ambiguous message, prior context in memory
await billingAgent.generate([
  { role: 'user', content: 'Process billing for customer cus_abc123, same amount' }
  // No billing period specified — LLM sees prior May charge in memory
  // May interpret 'same amount' as 'redo the previous charge' and re-execute
]);

The LLM has no innate understanding that a chargeId in prior context means the charge is settled. It may reason that the prior charge was "the last billing" and the new request is for the current period with the same amount — and call charge_customer again with "May 2026" from context, creating a duplicate May charge.

The fix operates at two levels: make the billing period explicit in every agent invocation, and keep the content-hash idempotency key in the billing tool so that a re-executed charge for an already-billed period reaches Stripe with the original key instead of a new one:

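// agent.ts: SAFE (explicit period + idempotency key in the tool)
import { Agent } from '@mastra/core';
import { Memory } from '@mastra/memory';
import { anthropic } from '@ai-sdk/anthropic';
import { chargeCustomerTool } from './tools/billing'; // path assumed from the tool file name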

const billingAgent = new Agent({
  name: 'billing-agent',
  instructions: `You are a billing agent. When asked to process billing,
    you MUST call charge_customer with the customer ID, the exact amount in cents,
    and the billing period in YYYY-MM format. Never infer the billing period from
    prior conversation history — always use the period specified in the current message.
    If the billing period is not specified in the current message, ask for it before
    calling any billing tools.`,
  model: anthropic('claude-sonnet-4-6'),
  tools: { chargeCustomerTool },  // Tool routes through proxy with idempotency key
  memory: new Memory(),
});

// Explicit period in every invocation prevents LLM from inferring from context
await billingAgent.generate([
  {
    role: 'user',
    content: 'Process billing for customer cus_abc123, $99, billing period 2026-07'
    //                                                           ^^^^^^^^^^^^^^^^^^^
    //                                          Always specify — never rely on memory
  }
]);

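Combined with the content-hash idempotency key in the tool itself, this creates a two-layer defense: the system prompt makes the LLM unlikely to infer the billing period from memory, and the idempotency key ensures that a re-executed charge for a prior period reaches Stripe with the original key, so Stripe returns the existing charge object rather than creating a duplicate. The second layer is time-limited: Stripe may prune idempotency keys once they are at least 24 hours old, so a replay that arrives weeks after the original charge also needs a durable record of billed periods on your side.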
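Because Stripe only remembers idempotency keys for a limited time, a replay of a weeks-old billing period is best caught before the request is sent. The following is a minimal sketch of such a guard, not part of Mastra or Stripe: BillingLedger and chargeOnce are hypothetical names, and the Map stands in for a database table with a unique constraint on (customerId, billingPeriod).

```typescript
// Hypothetical durable record of billed periods (in-memory here for illustration)
class BillingLedger {
  private readonly billed = new Map<string, string>(); // ledger key -> chargeId

  private static key(customerId: string, billingPeriod: string): string {
    return `${customerId}:${billingPeriod}`;
  }

  // Returns the recorded charge ID if this customer and period were already billed
  lookup(customerId: string, billingPeriod: string): string | undefined {
    return this.billed.get(BillingLedger.key(customerId, billingPeriod));
  }

  record(customerId: string, billingPeriod: string, chargeId: string): void {
    this.billed.set(BillingLedger.key(customerId, billingPeriod), chargeId);
  }
}

type ChargeResult = { chargeId: string; status: 'succeeded' | 'already_billed' };

// createCharge is injected so the sketch runs without a Stripe client
function chargeOnce(
  ledger: BillingLedger,
  customerId: string,
  billingPeriod: string,
  createCharge: () => string
): ChargeResult {
  const prior = ledger.lookup(customerId, billingPeriod);
  if (prior !== undefined) {
    return { chargeId: prior, status: 'already_billed' };
  }
  const chargeId = createCharge();
  ledger.record(customerId, billingPeriod, chargeId);
  return { chargeId, status: 'succeeded' };
}

let stripeCalls = 0;
const fakeCreateCharge = (): string => {
  stripeCalls += 1;
  return `ch_${stripeCalls}`;
};

const ledger = new BillingLedger();
const firstRun = chargeOnce(ledger, 'cus_abc123', '2026-05', fakeCreateCharge);
const replay = chargeOnce(ledger, 'cus_abc123', '2026-05', fakeCreateCharge);
const nextCycle = chargeOnce(ledger, 'cus_abc123', '2026-06', fakeCreateCharge);

console.log(firstRun.status, replay.status, nextCycle.status); // succeeded already_billed succeeded
console.log(stripeCalls); // 2
```

The lookup-then-record sequence here is not atomic. In a real store, the unique constraint (or an insert-if-absent operation) should do the claiming, with the idempotency key still covering the short window between the charge and the ledger write.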

Adding vault key isolation and a spend cap

The idempotency key fixes the duplicate-charge problem, but it doesn't limit the blast radius of a runaway agent or prompt injection. A Mastra agent with a production Stripe key can call any Stripe endpoint with any amount. A stuck retry loop, a malicious prompt in customer data, or a misconfigured workflow step can exhaust a day's Stripe budget before the on-call engineer is paged.

The second layer replaces the raw Stripe key with a scoped vault key issued by a spend-cap proxy. The proxy holds the real Stripe key, enforces a daily USD cap, restricts the agent to specific Stripe endpoints, and logs every call with parsed cost:

// tools/billing.ts: SAFE (vault key + proxy + idempotency key)
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import Stripe from 'stripe';
import { createHash } from 'crypto';
// Assumed location of the CRM helper; the test suite mocks this same path
import { updateCRMPaymentStatus } from '../lib/crm';

// Exported so workflow steps and tests derive keys the same way
export function chargeKey(
  customerId: string,
  amountCents: number,
  billingPeriod: string,
  chargeType: string
): string {
  return createHash('sha256')
    .update(`${customerId}:${amountCents}:${billingPeriod}:${chargeType}`)
    .digest('hex')
    .slice(0, 36);
}

export function makeBillingTool(vaultKey: string, proxyUrl = 'https://proxy.keybrake.com') {
  // Point the Stripe client at the proxy instead of api.stripe.com
  const stripe = new Stripe(vaultKey, {
    host: new URL(proxyUrl).hostname,
    protocol: 'https',
    port: 443,
    // The proxy extracts the vault key from "Authorization: Bearer <vault key>",
    // looks up the real Stripe key, enforces the policy, and forwards the request
  });

  return createTool({
    id: 'charge_customer',
    description: 'Charge a customer for their subscription via the governed proxy',
    inputSchema: z.object({
      customerId: z.string(),
      amountCents: z.number().int().positive(),
      billingPeriod: z.string().regex(/^\d{4}-\d{2}$/),
    }),
    execute: async ({ context }) => {
      const { customerId, amountCents, billingPeriod } = context;
      // Same key material as the earlier tool: customer, amount, period, 'mastra-billing'
      const idempotencyKey = chargeKey(customerId, amountCents, billingPeriod, 'mastra-billing');

      const charge = await stripe.charges.create(
        {
          amount: amountCents,
          currency: 'usd',
          customer: customerId,
          description: `Subscription ${billingPeriod}`,
        },
        { idempotencyKey }
      );

      try {
        await updateCRMPaymentStatus(customerId, charge.id);
      } catch {
        // The charge is real even if the CRM update failed; report both facts
        return { chargeId: charge.id, status: charge.status, crmUpdateFailed: true };
      }

      return { chargeId: charge.id, status: charge.status };
    },
  });
}

// Per-run vault key: issued at Keybrake dashboard with:
// { vendor: 'stripe', allowed_endpoints: ['POST /v1/charges'], daily_usd_cap: 500 }
export const chargeCustomerTool = makeBillingTool(process.env.KEYBRAKE_VAULT_KEY!);

The vault key is scoped to POST /v1/charges only — the agent cannot call POST /v1/refunds, GET /v1/customers, or any other Stripe endpoint. The daily cap means the proxy returns a 402 once the billing budget is exhausted, preventing a stuck loop from spending beyond the expected range. Every call is logged in the proxy audit table with parsed cost from the Stripe response, giving you a per-agent-per-day spend view without needing to query Stripe's own dashboard.
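To illustrate the decision a spend-cap proxy has to make on each request, here is a sketch of the policy check as a pure function. The policy fields come from the vault key comment in the tool; everything else is an assumption for illustration, including the function name, the running-total parameter, and the 403 status for a disallowed endpoint. It does not describe Keybrake's actual implementation.

```typescript
// Policy shape as shown in the vault key comment
interface VaultPolicy {
  vendor: 'stripe';
  allowed_endpoints: string[]; // entries such as 'POST /v1/charges'
  daily_usd_cap: number;
}

interface ProxyDecision {
  status: 200 | 402 | 403;
  reason: string;
}

// Hypothetical check: spentTodayCents is the running total the proxy
// would keep per vault key for the current day.
function evaluateRequest(
  policy: VaultPolicy,
  method: string,
  path: string,
  amountCents: number,
  spentTodayCents: number
): ProxyDecision {
  if (!policy.allowed_endpoints.includes(`${method} ${path}`)) {
    return { status: 403, reason: 'endpoint not in allowlist' }; // status code assumed
  }
  const capCents = Math.round(policy.daily_usd_cap * 100);
  if (spentTodayCents + amountCents > capCents) {
    return { status: 402, reason: 'daily cap would be exceeded' };
  }
  return { status: 200, reason: 'forwarded to Stripe' };
}

const policy: VaultPolicy = {
  vendor: 'stripe',
  allowed_endpoints: ['POST /v1/charges'],
  daily_usd_cap: 500,
};

const withinCap = evaluateRequest(policy, 'POST', '/v1/charges', 9900, 0);
const overCap = evaluateRequest(policy, 'POST', '/v1/charges', 9900, 45000);
const wrongEndpoint = evaluateRequest(policy, 'POST', '/v1/refunds', 9900, 0);

console.log(withinCap.status);     // 200
console.log(overCap.status);       // 402
console.log(wrongEndpoint.status); // 403
```

Checking the cap before forwarding, rather than after the Stripe response, means a request that would cross the cap is never sent. The cost is that the proxy has to parse the amount from the request body.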

Governance comparison

Concern (Yes = risk remains) | Raw Stripe key | Restricted key only | Idempotency key only | Vault key + proxy only | Idempotency + vault key
Retry creates duplicate charge | Yes: new charge on every retry | Yes: restriction doesn't prevent duplication | No: Stripe deduplicates on key | Yes: proxy forwards all charges | No: key prevents duplication at Stripe layer
Parallel steps create duplicate charge | Yes: concurrent POSTs create two charges | Yes: restriction doesn't help concurrency | No: same key → same charge object | Yes: proxy forwards both concurrent calls | No: idempotency key deduplicates concurrent calls
Memory replay re-executes billing | Yes: prior period rebilled | Yes: restriction doesn't prevent replay | Partial: key deduplicates if period is same | Partial: cap limits total damage | Partial: system prompt + key + cap all reduce risk
Runaway loop exhausts Stripe budget | Yes: unbounded spend | Partial: key scope limits endpoints | Yes: new billing period → new key → new charge | No: proxy enforces daily USD cap | No: proxy cap stops loop after budget
Prompt injection reaches billing endpoint | Yes: any Stripe endpoint accessible | Partial: key restricts endpoint scope | Yes: doesn't restrict endpoints | Partial: proxy allowlist enforces endpoint scope | Partial: allowlist + cap both enforce boundaries
Audit log of agent billing activity | Manual Stripe dashboard query | Manual Stripe dashboard query | Manual Stripe dashboard query | Automatic: proxy logs every call with cost | Automatic: proxy logs every call with cost

Vitest enforcement suite

These tests verify the governance layer without hitting the live Stripe API:

// tests/billing.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { makeBillingTool } from '../tools/billing';
import Stripe from 'stripe';

vi.mock('stripe');

describe('Mastra billing tool governance', () => {
  let mockCreate: ReturnType<typeof vi.fn>;

  beforeEach(() => {
    mockCreate = vi.fn().mockResolvedValue({ id: 'ch_test_123', status: 'succeeded' });
    (Stripe as any).mockImplementation(() => ({
      charges: { create: mockCreate },
    }));
  });

  it('sends the same idempotency key on retry', async () => {
    const tool = makeBillingTool('vk_test_key');
    const args = { customerId: 'cus_abc', amountCents: 9900, billingPeriod: '2026-06' };

    await tool.execute({ context: args });
    await tool.execute({ context: args }); // Simulated retry

    expect(mockCreate).toHaveBeenCalledTimes(2);
    const [, opts1] = mockCreate.mock.calls[0];
    const [, opts2] = mockCreate.mock.calls[1];
    expect(opts1.idempotencyKey).toBe(opts2.idempotencyKey);
  });

  it('produces different keys for different billing periods', async () => {
    const tool = makeBillingTool('vk_test_key');
    const base = { customerId: 'cus_abc', amountCents: 9900 };

    await tool.execute({ context: { ...base, billingPeriod: '2026-05' } });
    await tool.execute({ context: { ...base, billingPeriod: '2026-06' } });

    const [, opts1] = mockCreate.mock.calls[0];
    const [, opts2] = mockCreate.mock.calls[1];
    expect(opts1.idempotencyKey).not.toBe(opts2.idempotencyKey);
  });

  it('produces different keys for different charge types', async () => {
    // Verifies parallel step keys are distinct
    const { chargeKey } = await import('../tools/billing');
    const base = { customerId: 'cus_abc', amountCents: 9900, billingPeriod: '2026-06' };
    const subscriptionKey = chargeKey(base.customerId, base.amountCents, base.billingPeriod, 'subscription');
    const setupKey = chargeKey(base.customerId, base.amountCents, base.billingPeriod, 'setup');
    expect(subscriptionKey).not.toBe(setupKey);
  });

  it('routes through the proxy URL, not api.stripe.com', async () => {
    const tool = makeBillingTool('vk_test_key', 'https://proxy.keybrake.com');
    await tool.execute({ context: { customerId: 'cus_abc', amountCents: 9900, billingPeriod: '2026-06' } });
    expect(Stripe).toHaveBeenCalledWith('vk_test_key', expect.objectContaining({
      host: 'proxy.keybrake.com',
    }));
  });

  it('returns charge ID even when CRM update fails', async () => {
    // vi.doMock only affects modules imported after it runs, and the tool module
    // is already loaded by the static import above, so spy on the CRM export instead
    const crm = await import('../lib/crm');
    vi.spyOn(crm, 'updateCRMPaymentStatus').mockRejectedValueOnce(new Error('CRM timeout'));

    const tool = makeBillingTool('vk_test_key');
    const result = await tool.execute({
      context: { customerId: 'cus_abc', amountCents: 9900, billingPeriod: '2026-06' },
    });
    expect(result.chargeId).toBe('ch_test_123');
    expect(result.crmUpdateFailed).toBe(true);
  });
});

Gap analysis

Mastra's model parallelism within a single agent.generate() call

When using models that support parallel function calling (Claude Sonnet, GPT-4o, Gemini 1.5 Pro), a single agent.generate() call can produce multiple tool call requests in one model response. If the LLM decides to call charge_customer twice in one turn, for example when asked to "charge Alice and Bob for their June subscriptions", both tool calls fire simultaneously. When each call is for a different customer, the key material differs and both charges are created as intended. When the model calls charge_customer twice for the same customer with the same parameters, which can happen due to hallucination or ambiguous agent instructions, both calls carry the same idempotency key and Stripe creates only one charge. Stripe can answer the second request with a 409 conflict while the first is still in flight, so treat that response as "already being processed" rather than as a failure to retry with different parameters. Verify your agent instructions are explicit about one-charge-per-call semantics.

Workflow step retry semantics differ from agent retry semantics

In Mastra Workflows, step-level retry is configured separately from agent loop retry. A Workflow step can be configured with retries: N — when the step's execute function throws, Mastra re-executes the entire step up to N times. This is different from the agent loop retry described in failure mode 1, where the LLM decides to call the tool again. Both retry paths are active if you use an agent tool inside a workflow step: the workflow retries the step, and within each step execution, the agent may also retry the tool. The idempotency key handles both — any combination of workflow-level and agent-level retries produces the same key for the same billing parameters.

Mastra's MCP integration may expose billing tools to multiple callers

Mastra supports Model Context Protocol (MCP) server integration, allowing agents to use tools served over MCP. If a billing tool is exposed via an MCP server, multiple Mastra agents (or external MCP clients) can call it concurrently. The idempotency key provides Stripe-level deduplication, but the proxy audit log is the only way to see that two different agents called the same billing tool in the same billing period. Set up proxy audit alerts for duplicate idempotency key hits — they indicate a configuration error (two agents billing the same customer) that the idempotency key silently masks at the Stripe layer.
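An alert for cross-agent key reuse can be as simple as grouping audit rows by idempotency key. The sketch below assumes a hypothetical AuditEntry shape (agentId, idempotencyKey, timestamp); the real proxy audit table will have its own field names, so treat this as an outline of the grouping logic only.

```typescript
// Hypothetical shape of one proxy audit row; real field names will differ
interface AuditEntry {
  agentId: string;
  idempotencyKey: string;
  timestamp: string; // ISO 8601
}

// Returns every idempotency key that was sent by more than one distinct agent.
// Repeats from a single agent are ordinary retries and are not flagged.
function findCrossAgentDuplicates(entries: AuditEntry[]): Map<string, string[]> {
  const agentsByKey = new Map<string, Set<string>>();
  entries.forEach((entry) => {
    const agents = agentsByKey.get(entry.idempotencyKey) ?? new Set<string>();
    agents.add(entry.agentId);
    agentsByKey.set(entry.idempotencyKey, agents);
  });

  const duplicates = new Map<string, string[]>();
  agentsByKey.forEach((agents, key) => {
    if (agents.size > 1) duplicates.set(key, Array.from(agents).sort());
  });
  return duplicates;
}

const sampleLog: AuditEntry[] = [
  { agentId: 'billing-agent', idempotencyKey: 'key-may', timestamp: '2026-06-01T09:00:00Z' },
  { agentId: 'billing-agent', idempotencyKey: 'key-may', timestamp: '2026-06-01T09:00:02Z' }, // same-agent retry
  { agentId: 'onboarding-agent', idempotencyKey: 'key-may', timestamp: '2026-06-01T09:05:00Z' },
  { agentId: 'billing-agent', idempotencyKey: 'key-june', timestamp: '2026-07-01T09:00:00Z' },
];

const flagged = findCrossAgentDuplicates(sampleLog);
console.log(Array.from(flagged.entries()));
// [ [ 'key-may', [ 'billing-agent', 'onboarding-agent' ] ] ]
```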

TypeScript compilation does not enforce billing period format

The input schema in the vault-key example uses z.string().regex(/^\d{4}-\d{2}$/) to validate billing period format. Without this regex, billing periods written as "June 2026", "2026-6", and "2026-06" each produce a different idempotency key for the same billing intent, meaning the LLM can create three Stripe charges for the same customer in the same month by varying the period format. Always validate the billing period format in the Zod schema, not just at the application layer. Zod validation in Mastra tool schemas runs before the execute function, so a malformed period fails fast with a schema error rather than creating a charge with a non-deduplicatable key.
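The pattern /^\d{4}-\d{2}$/ still accepts impossible months such as "2026-13". As a sketch of a slightly stricter check, the pattern below constrains the month to 01 through 12. The constant and function names are illustrative, and the same regular expression could be passed to z.string().regex(...) in the tool schema.

```typescript
// Four-digit year, hyphen, month constrained to 01..12
const BILLING_PERIOD_PATTERN = /^\d{4}-(0[1-9]|1[0-2])$/;

function isValidBillingPeriod(value: string): boolean {
  return BILLING_PERIOD_PATTERN.test(value);
}

// The three spellings from the paragraph above, plus an impossible month
for (const candidate of ['June 2026', '2026-6', '2026-06', '2026-13']) {
  console.log(candidate, isValidBillingPeriod(candidate));
}
// June 2026 false
// 2026-6 false
// 2026-06 true
// 2026-13 false
```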

FAQ

Can I use the Mastra tool invocation ID as the idempotency key?

No. Mastra doesn't expose a stable per-invocation ID across retries in the same way a workflow run ID is stable. More importantly, the agent loop assigns a new invocation context on each tool call attempt — so a retry of the same tool produces a different invocation context. Using any runtime-generated ID as the idempotency key means a retry generates a new key and Stripe creates a second charge. The idempotency key must be derived from the billing parameters (customer, amount, period, charge type), which are stable across all retries for the same billing intent.

What happens if the proxy's daily USD cap is hit mid-batch?

The proxy returns a 402 response for any request that would exceed the daily cap. The Stripe SDK surfaces that response as a thrown StripeError with statusCode 402. Stripe itself also uses 402 for requests that fail at the card level (for example a declined card), so check a proxy-specific field in the error body, not the status code alone, before concluding that the cap was hit. Your execute function should catch the cap error specifically and return a structured result instead of re-throwing, because a thrown error triggers the agent retry loop. Return { error: 'CAP_EXHAUSTED', remainingBudget: 0 } and let the agent report the cap exhaustion to the caller. The caller raises the cap at the proxy dashboard and re-triggers the batch. Because each prior charge used an idempotency key, a batch re-triggered while Stripe still retains those keys replays the already-completed charges without creating new ones and only bills the remaining customers.
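The catch-and-return step might look like the sketch below. It relies only on the numeric statusCode field that Stripe SDK errors carry. The helper names are hypothetical, and the error objects in the usage lines are plain stand-ins rather than real StripeError instances.

```typescript
type CapExhausted = { error: 'CAP_EXHAUSTED'; remainingBudget: 0 };

// True when the thrown value is an object with the given numeric statusCode
function hasStatusCode(err: unknown, code: number): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { statusCode?: unknown }).statusCode === code
  );
}

// Call from the catch block of execute: returns a structured result for a
// 402 and rethrows anything else unchanged. A production version should also
// check whatever cap-specific marker the proxy puts in the error body.
function capResultOrRethrow(err: unknown): CapExhausted {
  if (hasStatusCode(err, 402)) {
    return { error: 'CAP_EXHAUSTED', remainingBudget: 0 };
  }
  throw err;
}

// Stand-in errors for illustration
const capError = Object.assign(new Error('daily cap reached'), { statusCode: 402 });
const serverError = Object.assign(new Error('upstream failure'), { statusCode: 500 });

const capped = capResultOrRethrow(capError);
console.log(capped.error); // CAP_EXHAUSTED

let rethrown: unknown = null;
try {
  capResultOrRethrow(serverError);
} catch (err) {
  rethrown = err;
}
console.log(rethrown === serverError); // true
```

Returning a plain object keeps the cap rejection inside the tool result, where the agent can read it as a terminal state instead of an error observation that invites another attempt.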

Does Mastra's streaming mode affect tool call idempotency?

Mastra's agent.stream() mode streams tokens as they're generated but tool calls are still executed atomically — the tool's execute function runs to completion before the result is streamed back. A disconnected stream (client closes the connection) may cause the caller to retry the outer agent.stream() call. If the tool call had already completed before the disconnect, the retry re-executes the agent turn from the beginning, re-calling charge_customer. The idempotency key handles this: the retry produces the same key, and Stripe returns the original charge object. The caller sees a successful charge on the retry turn, not a duplicate.

We issue a new vault key per deployment. Do we need a new key per agent run?

Per-deployment keys are better than a single shared key, but per-run keys are better still. A deployment-scoped vault key is shared across every run of that deployment — a runaway loop in one run consumes the daily cap for all other runs in the same deployment. A per-run vault key (issued at the start of each agent run, scoped to the current customer and billing period, with a cap equal to the maximum expected charge for that run) gives you granular containment. If that run's loop spins, it exhausts only its own cap, leaving all other concurrent runs unaffected.

How do I test the proxy integration without hitting live Stripe?

Issue a vault key with the proxy policy set to Stripe test mode ("stripe_mode": "test") — the proxy forwards requests to api.stripe.com/v1 using your Stripe test secret key rather than the live key. All Stripe test-mode objects are created (charges, customers, events), and the proxy enforces the daily cap against test-mode amounts. Your Vitest suite can also mock the proxy host directly using Stripe's httpClient option to point at a local server that returns the expected Stripe response shapes. This lets you test cap exhaustion (return a 402 after N requests) and idempotency deduplication (return the same charge object for repeated idempotency keys) without any live Stripe or proxy dependency.

Our Mastra agent uses Anthropic's Claude and sometimes generates multiple tool calls in one response. Does the idempotency key handle that?

Yes. For different customers, each call uses a different customer ID, producing a different key, and both charges are created as intended. For the same customer in the same period with the same amount, both tool calls produce the same idempotency key, so only one charge is created. The proxy audit log shows two billing requests with the same idempotency key hitting the proxy in the same second, which is a signal that your agent instructions are generating duplicate tool calls and should be tightened. If you also want an explicit pre-check in the tool's execute function, note that stripe.charges.retrieve takes a charge ID and cannot look up a charge by idempotency key; store the key in the charge's metadata and check your own records (or Stripe's search API) instead. The idempotency key alone is sufficient to prevent double billing at the Stripe layer.

Keybrake: scoped vault keys + spend caps for Mastra → Stripe workflows

Issue a vault key per agent run. Set a daily USD cap. Point your Mastra billing tool at proxy.keybrake.com. Every charge is logged, capped, and revocable — without changing your Mastra agent design or your Stripe account setup.