
AWS Lambda AI agent API key: scoping vendor calls in event-driven agent functions

AWS Lambda is the foundational compute unit for event-driven AI agents on AWS: triggered by SQS, EventBridge, API Gateway, or Step Functions, it runs agent logic on demand and scales to hundreds of concurrent instances automatically. When agent Lambda functions call Stripe, Twilio, or Resend, that automatic scaling becomes a vendor spend amplifier. An SQS queue burst triggers hundreds of simultaneous invocations, each calling Stripe with the same STRIPE_SECRET_KEY from environment variables; retries re-execute functions that may have already made vendor calls; and the Lambda runtime has no per-invocation dollar cap. Rotating the Stripe key means updating the function configuration or the stored secret, and an invocation that has already read the key keeps using it until it finishes, so there is no mid-run revoke short of rolling the key at Stripe. This page covers the vault-key pattern that bounds vendor spend for event-driven AI agent functions on AWS Lambda.

TL;DR

Store KEYBRAKE_API_KEY in SSM Parameter Store as an encrypted parameter, not in Lambda environment variables, and never put the real Stripe key in environment variables. Issue a vault key at the start of each Lambda invocation (or reuse a cached one if still valid) with the event's messageId as the agent_run_label. Use the SQS messageId as both the vault key label and the Stripe idempotency key: it is stable across SQS redeliveries, so a redelivered message replays the same Stripe request instead of creating a second charge. For EventBridge-triggered functions, the event's id field can play the same role across retries. Revoking a runaway function invocation is a single DELETE /vault/keys/{key_id} call, with no function redeploy, no Secrets Manager rotation, and no reserved concurrency change.

How Lambda AI agent functions call vendor APIs

A typical event-driven agent Lambda reads STRIPE_SECRET_KEY from environment variables and calls Stripe for each SQS message:

// Typical pattern — problematic for agent workloads
exports.handler = async (event) => {
  const stripeKey = process.env.STRIPE_SECRET_KEY;  // same key for ALL invocations

  for (const record of event.Records) {
    const body = JSON.parse(record.body);

    const res = await fetch('https://api.stripe.com/v1/payment_intents', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${stripeKey}`,
        'Content-Type': 'application/x-www-form-urlencoded'
      },
      body: new URLSearchParams({
        amount: body.amount_cents,
        currency: 'usd',
        customer: body.customer_id
      })
    });

    if (!res.ok) throw new Error(`Stripe error: ${res.status}`);
  }
};

This pattern has three compounding risks specific to Lambda. First, if the SQS queue contains 1,000 messages, Lambda's automatic scaling can fan the backlog out across hundreds of concurrent invocations (each handling a batch of up to 10 messages by default), all using the same STRIPE_SECRET_KEY, with no dollar cap across the invocations. Reserved concurrency limits the count, not the spend. Second, when an invocation fails (throws an unhandled exception), the messages in its batch become visible again and SQS redelivers them until the queue's maxReceiveCount is reached, typically 3–5 receives when a dead-letter queue is configured. If the function threw after Stripe had applied a charge, the redelivery repeats the Stripe call without an idempotency key, creating duplicate charges. Third, the EventBridge path retries as well: EventBridge retries delivery to a target for up to 24 hours by default, and Lambda's asynchronous invocation model retries a failed function twice by default, so the same vendor API call is re-attempted across multiple invocations without charge deduplication.

Three gaps Lambda's native tooling doesn't fill for vendor spend control

Gap	What happens in practice	Lambda's answer
No per-invocation spend cap	Lambda reserved concurrency caps the number of simultaneous invocations, not the dollar spend per invocation or across all concurrent invocations. AWS Cost Anomaly Detection and CloudWatch billing alarms fire after spend has occurred — typically hours after — too late to stop a concurrency burst that clears the SQS queue in minutes. Lambda function timeout caps how long a single invocation can run (max 15 minutes), not how much money it spends during that time. A function that makes 100 Stripe API calls within 15 minutes is within timeout but may have charged $10,000.	Reserved concurrency limits concurrent invocations by count. AWS Cost Anomaly Detection alerts after spend. No per-invocation dollar cap in the Lambda runtime.
No mid-invocation vendor revoke without function redeployment	Lambda environment variables are part of the function configuration, so changing them requires a function configuration update. New invocations pick up the new values once the update completes, but invocations already in flight keep the values they started with. Storing the Stripe key in Secrets Manager and fetching it at invocation start improves rotation speed, but invocations that have already fetched the key, and warm environments that cached it, continue using it. There is no mechanism to revoke a Stripe key mid-invocation without rotating the key entirely.	Lambda supports Secrets Manager and SSM Parameter Store for secret rotation. No mid-invocation key revocation that takes effect within a running function execution.
No per-invocation audit with event context	CloudWatch Logs and Lambda Insights capture invocation duration, memory usage, and custom log lines — but they don't parse dollar amounts from Stripe response bodies, correlate Stripe `PaymentIntent.id` values with the Lambda `requestId` and SQS `messageId` in a structured cost table, or provide a queryable per-agent-run spend summary. X-Ray traces Lambda duration and downstream HTTP calls but records latency, not vendor dollar cost. Reconstructing what a runaway burst charged requires cross-referencing CloudWatch Logs and the Stripe dashboard with manual timestamp correlation.	CloudWatch Logs records custom log output. Lambda Insights tracks compute metrics. No structured vendor cost tracking or requestId-to-charge correlation natively.
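One way to narrow the audit gap from inside the function is to emit a single structured log line per vendor call that ties the three identifiers together. The sketch below is illustrative only: buildAuditRecord and its field names are hypothetical, context.awsRequestId is the Lambda context property, and id, amount, and currency are fields of a Stripe PaymentIntent response.

```javascript
// Hypothetical helper: one structured audit record per vendor call.
function buildAuditRecord(context, record, paymentIntent) {
  return {
    type: 'vendor_charge',
    requestId: context.awsRequestId,   // Lambda invocation ID
    messageId: record.messageId,       // SQS message that triggered the call
    paymentIntentId: paymentIntent.id, // Stripe object created by the call
    amountCents: paymentIntent.amount, // Stripe reports amounts in the smallest currency unit
    currency: paymentIntent.currency
  };
}

// Example with placeholder values; in a handler these come from the
// invocation context, the SQS record, and the parsed Stripe response.
const auditLine = JSON.stringify(buildAuditRecord(
  { awsRequestId: 'req-0001' },
  { messageId: 'msg-0001' },
  { id: 'pi_placeholder', amount: 2500, currency: 'usd' }
));
console.log(auditLine);
```

Because the line is JSON, a log query tool can filter and sum by field. This is still logging after the fact: it helps reconstruct a burst but does not cap spend.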

The concurrency burst amplification risk

Lambda's automatic scaling is designed to process queue backlogs as fast as possible. When an SQS queue accumulates a backlog, Lambda scales the event source aggressively, adding hundreds of concurrent batch processors per minute (AWS documents up to 300 per minute for standard queues at the time of writing) until the queue is cleared or the function's concurrency limit is reached. For agent billing functions, this means a queue that backed up overnight due to an upstream delay can trigger 500 concurrent Lambda invocations when it is unblocked: 500 concurrent streams of Stripe calls with no dollar cap. The speed that makes Lambda efficient for processing becomes a liability when each execution has an unbounded cost.
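To make the amplification concrete, the rough arithmetic can be sketched as follows. The numbers are illustrative assumptions, not measurements: 500 concurrent invocations (the figure used above), a batch size of 10 (the default for an SQS event source), and a hypothetical $200 maximum charge per message.

```javascript
// Back-of-the-envelope exposure estimate. All inputs are assumptions.
function inFlightMessages({ concurrentInvocations, batchSize }) {
  // Each concurrent invocation holds one batch of messages.
  return concurrentInvocations * batchSize;
}

function worstCaseExposureUsd(burst, maxChargeUsdPerMessage) {
  // Upper bound: every in-flight message results in the maximum charge.
  return inFlightMessages(burst) * maxChargeUsdPerMessage;
}

const burst = { concurrentInvocations: 500, batchSize: 10 };
console.log(inFlightMessages(burst));          // 5000
console.log(worstCaseExposureUsd(burst, 200)); // 1000000
```

Without a cap, the per-message term is whatever the message payload asks for, so the product has no upper bound. A per-message cap fixes that term in advance.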

SQS's at-least-once delivery adds a deduplication gap. SQS Standard queues can deliver the same message more than once, and Lambda may be invoked twice for the same message if the first invocation does not finish before the visibility timeout expires. Without a stable idempotency key, two invocations processing the same message make two independent Stripe calls. The SQS messageId is stable across redeliveries, which makes it a natural choice for that key.
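As a minimal sketch of that idea, the request headers can be derived from the SQS record so that every delivery of the same message carries the same Idempotency-Key. The helper name stripeHeadersFor and the placeholder token are hypothetical; the Idempotency-Key header itself is part of Stripe's API.

```javascript
// Hypothetical helper: build Stripe request headers from an SQS record.
function stripeHeadersFor(record, bearerToken) {
  if (!record.messageId) {
    throw new Error('SQS record is missing messageId');
  }
  return {
    Authorization: `Bearer ${bearerToken}`,
    // messageId does not change when SQS redelivers the same message,
    // so a redelivered charge reuses the original key.
    'Idempotency-Key': record.messageId,
    'Content-Type': 'application/x-www-form-urlencoded'
  };
}

// Two deliveries of the same message produce the same key.
const first = stripeHeadersFor({ messageId: 'msg-123', body: '{}' }, 'token-placeholder');
const second = stripeHeadersFor({ messageId: 'msg-123', body: '{}' }, 'token-placeholder');
console.log(first['Idempotency-Key'] === second['Idempotency-Key']); // true
```

A message that is sent to the queue twice by the producer gets two different messageId values, so this guards against redelivery, not against duplicate sends upstream.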

Scoping vault keys per Lambda event

const { SSMClient, GetParameterCommand } = require('@aws-sdk/client-ssm');

const ssm = new SSMClient({ region: process.env.AWS_REGION });
const KEYBRAKE_BASE = 'https://proxy.keybrake.com';

// Process-level cache — survives warm starts, not cold starts
let keybrakeApiKey;

async function getKeybrakeApiKey() {
  if (!keybrakeApiKey) {
    const { Parameter } = await ssm.send(new GetParameterCommand({
      Name: '/keybrake/api-key',
      WithDecryption: true
    }));
    keybrakeApiKey = Parameter.Value;
  }
  return keybrakeApiKey;
}

async function issueVaultKey(apiKey, messageId, budgetUsd) {
  const res = await fetch(`${KEYBRAKE_BASE}/vault/keys`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      vendor: 'stripe',
      daily_usd_cap: budgetUsd,
      allowed_endpoints: ['POST /v1/payment_intents'],
      expires_in: '15m',  // matches Lambda max timeout
      agent_run_label: `lambda/${process.env.AWS_LAMBDA_FUNCTION_NAME}/${messageId}`
    })
  });
  if (!res.ok) throw new Error(`Keybrake error: ${res.status}`);
  return res.json();
}

exports.handler = async (event) => {
  const apiKey = await getKeybrakeApiKey();
  const failedItems = [];

  for (const record of event.Records) {
    const body = JSON.parse(record.body);
    const messageId = record.messageId;  // stable across SQS redeliveries

    try {
      // Issue a vault key scoped to this message / agent run
      const vault = await issueVaultKey(apiKey, messageId, body.budget_usd ?? 100);

      const res = await fetch(`${KEYBRAKE_BASE}/stripe/v1/payment_intents`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${vault.vault_key}`,
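          'Idempotency-Key': messageId,  // same value on every SQS redelivery, so Stripe deduplicates the charge
          // JSON body assumes the Keybrake proxy accepts JSON; Stripe's own API expects form-encoded bodies
          'Content-Type': 'application/json'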
        },
        body: JSON.stringify({
          amount: body.amount_cents,
          currency: 'usd',
          customer: body.customer_id
        })
      });

      if (res.status === 429) {
        const err = await res.json();
        if (err.code === 'cap_exhausted') {
          // Cap hit: intentional, not transient. Reporting the item as failed
          // lets SQS redeliver it until maxReceiveCount, then route it to the DLQ.
          failedItems.push({ itemIdentifier: messageId });
          continue;
        }
      }

      if (!res.ok) throw new Error(`Stripe error: ${res.status}`);
    } catch (err) {
      // Report only this message as failed so records that already succeeded
      // in the batch are not redelivered along with it.
      console.error(`message ${messageId} failed: ${err.message}`);
      failedItems.push({ itemIdentifier: messageId });
    }
  }

  // Partial batch response: SQS retries only the items listed here.
  // Requires ReportBatchItemFailures on the SQS event source mapping.
  return { batchItemFailures: failedItems };
};

The KEYBRAKE_API_KEY is fetched from SSM Parameter Store once per cold start and cached at process level, so warm invocations reuse the cached value without an SSM call. A separate vault key is issued per SQS message (per agent run), with a 15-minute TTL matching the Lambda maximum timeout. The vault key's agent_run_label includes the function name and messageId, making every vendor call in the audit log traceable to the specific SQS message that triggered it. Using messageId as the Stripe Idempotency-Key as well means a redelivered message replays the same Stripe request rather than creating a new charge. Partial batch response (batchItemFailures) reports cap-exhausted messages individually: SQS redelivers only those messages and moves them to the DLQ once the queue's maxReceiveCount is reached, instead of forcing a retry of the entire batch. This relies on ReportBatchItemFailures being enabled on the SQS event source mapping; without it, Lambda ignores the returned list.
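The TL;DR mentions reusing a cached vault key while it is still valid. A minimal sketch of such a cache, kept in process memory so it survives warm starts only, might look like the following. The helper names, the entry shape, and the 30-second safety margin are assumptions made for illustration; how Keybrake reports a key's expiry is not specified here, so the TTL is passed in by the caller.

```javascript
// Process-level cache of issued vault keys, indexed by agent_run_label.
const vaultKeyCache = new Map();

// Store a vault key along with the time at which it should stop being reused.
function cacheVaultKey(label, vault, ttlMs, nowMs = Date.now()) {
  vaultKeyCache.set(label, { vault, expiresAtMs: nowMs + ttlMs });
}

// Return a cached vault key, or undefined if none exists or it is too
// close to expiry to be used safely.
function getCachedVaultKey(label, nowMs = Date.now(), safetyMarginMs = 30_000) {
  const entry = vaultKeyCache.get(label);
  if (!entry) return undefined;
  if (entry.expiresAtMs - safetyMarginMs <= nowMs) {
    vaultKeyCache.delete(label); // expired or about to expire
    return undefined;
  }
  return entry.vault;
}

// Usage inside a handler (sketch):
//   const vault = getCachedVaultKey(label) ?? await issueVaultKey(...);
```

Reusing a key for a redelivered message keeps its spend under one cap, whereas issuing a fresh key on every delivery would start a fresh budget each time. The cache is lost on cold start, so it reduces issuance calls without guaranteeing a single key per message.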

How Keybrake fits

Keybrake is the proxy layer between your agent Lambda functions and Stripe, Twilio, or Resend. The vault key issued per SQS message replaces the STRIPE_SECRET_KEY previously stored in Lambda environment variables or fetched from Secrets Manager. The real Stripe secret stays in Keybrake — it is never present in Lambda environment variables, CloudWatch Logs, or SSM exports. Revoking a runaway Lambda invocation mid-execution is a single DELETE /vault/keys/{key_id} call — effective on the next proxied request, with no function redeploy, no Secrets Manager rotation, and no reserved concurrency change that would affect other functions sharing the same concurrency pool.
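The revoke call described above can be sketched as follows. The DELETE /vault/keys/{key_id} path and the base URL come from this page; the Bearer authorization header mirrors the key-issuing request and is an assumption here, as are the helper names and the idea that key_id is returned when the vault key is issued.

```javascript
const KEYBRAKE_BASE = 'https://proxy.keybrake.com';

// Build the revoke request separately so it can be inspected or logged.
function buildRevokeRequest(apiKey, keyId) {
  return {
    url: `${KEYBRAKE_BASE}/vault/keys/${encodeURIComponent(keyId)}`,
    options: {
      method: 'DELETE',
      headers: { Authorization: `Bearer ${apiKey}` } // assumed auth scheme
    }
  };
}

// Send the revoke request. fetchImpl defaults to the global fetch
// available in Node.js 18 and later.
async function revokeVaultKey(apiKey, keyId, fetchImpl = fetch) {
  const { url, options } = buildRevokeRequest(apiKey, keyId);
  const res = await fetchImpl(url, options);
  if (!res.ok) throw new Error(`Keybrake revoke failed: ${res.status}`);
}

// Usage (sketch): await revokeVaultKey(keybrakeApiKey, vault.key_id);
```

Because the revoke targets one vault key, it stops the vendor calls of one agent run and leaves other concurrent invocations, each with its own key, untouched.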
