AWS Lambda AI agent API key: scoping vendor calls in event-driven agent functions
AWS Lambda is the foundational compute unit for event-driven AI agents on AWS: triggered by SQS, EventBridge, API Gateway, or Step Functions, it executes agent logic on demand and scales to hundreds of concurrent instances automatically. When agent Lambda functions call Stripe, Twilio, or Resend, that automatic concurrency scaling becomes a vendor spend amplifier. An SQS queue burst triggers hundreds of simultaneous function invocations, each calling Stripe with the same STRIPE_SECRET_KEY from environment variables; EventBridge retries re-execute functions that may have already made vendor calls; and the Lambda runtime has no per-invocation dollar cap. Rotating the Stripe key requires a function redeploy or a forced recycle of all warm instances, so there is no mid-run revoke. This page covers the vault-key pattern that bounds vendor spend for event-driven AI agent functions on AWS Lambda.
TL;DR
Store KEYBRAKE_API_KEY in SSM Parameter Store rather than in Lambda environment variables, and never put the real Stripe key in environment variables. Issue a vault key for each SQS message the invocation processes (or reuse a cached one if still valid), with the message's messageId in the agent_run_label. Use the SQS messageId as both the vault key label and the Stripe idempotency key: it is stable across SQS redeliveries, so a redelivered message repeats the same request instead of creating a second charge. For EventBridge-triggered functions, use the event ID in the same role. Revoking a runaway function invocation is a single DELETE /vault/keys/{key_id} call, with no function redeploy, no Secrets Manager rotation, and no reserved concurrency change.
How Lambda AI agent functions call vendor APIs
A typical event-driven agent Lambda reads STRIPE_SECRET_KEY from environment variables and calls Stripe for each SQS message:
// Typical pattern: problematic for agent workloads
exports.handler = async (event) => {
  const stripeKey = process.env.STRIPE_SECRET_KEY; // same key for ALL invocations
  for (const record of event.Records) {
    const body = JSON.parse(record.body);
    const res = await fetch('https://api.stripe.com/v1/payment_intents', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${stripeKey}`,
        'Content-Type': 'application/x-www-form-urlencoded'
      },
      body: new URLSearchParams({
        amount: body.amount_cents,
        currency: 'usd',
        customer: body.customer_id
      })
    });
    if (!res.ok) throw new Error(`Stripe error: ${res.status}`);
  }
};
This pattern has three compounding risks specific to Lambda. First, if the SQS queue contains 1,000 messages, Lambda's automatic concurrency scaling can dispatch up to 1,000 concurrent function invocations, all using the same STRIPE_SECRET_KEY, with no dollar cap across them. Reserved concurrency limits the count, not the spend. Second, when a Lambda invocation fails (throws an unhandled exception), SQS redelivers the message until it has been received maxReceiveCount times, typically 3–5. If the function threw after Stripe had applied a charge, the redelivery repeats the Stripe call without an idempotency key and creates a duplicate charge. Third, EventBridge retries event delivery to a Lambda target for up to 24 hours by default, and Lambda's asynchronous invocation path retries function errors on top of that, so the same vendor API call can be re-attempted across multiple invocations without charge deduplication.
Three gaps Lambda's native tooling doesn't fill for vendor spend control
| Gap | What happens in practice | Lambda's answer |
|---|---|---|
| No per-invocation spend cap | Lambda reserved concurrency caps the number of simultaneous invocations, not the dollar spend per invocation or across all concurrent invocations. AWS Cost Anomaly Detection and CloudWatch billing alarms fire after spend has occurred — typically hours after — too late to stop a concurrency burst that clears the SQS queue in minutes. Lambda function timeout caps how long a single invocation can run (max 15 minutes), not how much money it spends during that time. A function that makes 100 Stripe API calls within 15 minutes is within timeout but may have charged $10,000. | Reserved concurrency limits concurrent invocations by count. AWS Cost Anomaly Detection alerts after spend. No per-invocation dollar cap in the Lambda runtime. |
| No mid-invocation vendor revoke without function redeployment | Lambda environment variables are baked into the function configuration — changing them requires a function update (deploy), which creates a new function version. Existing warm execution environments continue running with the old environment variables until they are recycled by Lambda (typically within 15 minutes of inactivity). Storing the Stripe key in Secrets Manager and fetching it at invocation start improves rotation speed — but already-running invocations that already fetched the key continue using it for the invocation's lifetime. There is no mechanism to revoke a Stripe key mid-invocation without rotating the key entirely. | Lambda supports Secrets Manager and SSM Parameter Store for secret rotation. No mid-invocation key revocation that takes effect within a running function execution. |
| No per-invocation audit with event context | CloudWatch Logs and Lambda Insights capture invocation duration, memory usage, and custom log lines, but they don't parse dollar amounts from Stripe response bodies, correlate Stripe PaymentIntent.id values with the Lambda requestId and SQS messageId in a structured cost table, or provide a queryable per-agent-run spend summary. X-Ray traces Lambda duration and downstream HTTP calls but records latency, not vendor dollar cost. Reconstructing what a runaway burst charged requires cross-referencing CloudWatch Logs and the Stripe dashboard with manual timestamp correlation. | CloudWatch Logs records custom log output. Lambda Insights tracks compute metrics. No structured vendor cost tracking or requestId-to-charge correlation natively. |
The concurrency burst amplification risk
Lambda's automatic scaling is designed to process queue backlogs as fast as possible. When an SQS queue accumulates a backlog, Lambda scales aggressively, adding hundreds of concurrent executions per minute until the queue is cleared or the function's reserved concurrency limit is reached. For agent billing functions, this means a queue that backed up overnight due to an upstream delay can trigger 500 concurrent Lambda invocations when it is unblocked: 500 concurrent Stripe calls with no dollar cap. The speed that makes Lambda efficient for processing becomes a liability when each execution has an unbounded cost.
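To make the amplification concrete, a rough upper bound on exposure can be computed from the backlog size, the charge per message, and the number of delivery attempts. The sketch below is illustrative arithmetic only, not a Lambda or Stripe API. The helper `worstCaseExposureUsd` and its parameter names are hypothetical, and it assumes the worst case in which every delivery attempt creates a new charge because no idempotency key is sent.

```javascript
// Hypothetical helper: upper bound on spend for a backlog processed without
// idempotency keys, assuming every delivery attempt creates a new charge.
function worstCaseExposureUsd({ backlogMessages, amountCentsPerMessage, maxReceiveCount }) {
  // Each message may be delivered up to maxReceiveCount times before the DLQ takes it
  const attempts = backlogMessages * maxReceiveCount;
  return (attempts * amountCentsPerMessage) / 100;
}

// 500 queued messages, $20.00 each, queue configured with maxReceiveCount = 3
const exposure = worstCaseExposureUsd({
  backlogMessages: 500,
  amountCentsPerMessage: 2000,
  maxReceiveCount: 3
});
console.log(exposure); // 30000
```

Reserved concurrency changes how quickly this bound can be reached, not the bound itself.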
SQS's at-least-once delivery adds a deduplication gap. Standard queues can deliver the same message more than once, and Lambda may invoke the function twice for the same message if the first invocation did not finish and delete it before the visibility timeout expired. Without a stable idempotency key, two Lambda invocations processing the same message make two independent Stripe calls. The SQS messageId is stable across redeliveries, which makes it a suitable key.
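A minimal sketch of that idea follows: derive the idempotency key from the SQS record itself rather than from anything generated at invocation time. The helper name `idempotencyKeyFor` is hypothetical, and the sample records are made up for illustration. The field names (`messageId`, `receiptHandle`, `attributes.ApproximateReceiveCount`) are the ones Lambda passes in an SQS event record.

```javascript
// Hypothetical helper: the idempotency key comes from the SQS record, never
// from a value generated per invocation (for example a random UUID).
function idempotencyKeyFor(record) {
  if (!record || typeof record.messageId !== 'string' || record.messageId === '') {
    throw new Error('SQS record is missing messageId');
  }
  return record.messageId;
}

// Two deliveries of the same message: receiptHandle and the receive count
// change on redelivery, messageId does not.
const firstDelivery = {
  messageId: '059f36b4-87a3-44ab-83d2-661975830a7d',
  receiptHandle: 'handle-1',
  attributes: { ApproximateReceiveCount: '1' }
};
const redelivery = {
  ...firstDelivery,
  receiptHandle: 'handle-2',
  attributes: { ApproximateReceiveCount: '2' }
};

console.log(idempotencyKeyFor(firstDelivery) === idempotencyKeyFor(redelivery)); // true
```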
Scoping vault keys per Lambda event
const { SSMClient, GetParameterCommand } = require('@aws-sdk/client-ssm');

const ssm = new SSMClient({ region: process.env.AWS_REGION });
const KEYBRAKE_BASE = 'https://proxy.keybrake.com';

// Module-level cache: reused across warm invocations, refetched after a cold start
let keybrakeApiKey;

async function getKeybrakeApiKey() {
  if (!keybrakeApiKey) {
    const { Parameter } = await ssm.send(new GetParameterCommand({
      Name: '/keybrake/api-key',
      WithDecryption: true
    }));
    keybrakeApiKey = Parameter.Value;
  }
  return keybrakeApiKey;
}

async function issueVaultKey(apiKey, messageId, budgetUsd) {
  const res = await fetch(`${KEYBRAKE_BASE}/vault/keys`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      vendor: 'stripe',
      daily_usd_cap: budgetUsd,
      allowed_endpoints: ['POST /v1/payment_intents'],
      expires_in: '15m', // matches Lambda max timeout
      agent_run_label: `lambda/${process.env.AWS_LAMBDA_FUNCTION_NAME}/${messageId}`
    })
  });
  if (!res.ok) throw new Error(`Keybrake error: ${res.status}`);
  return res.json();
}

exports.handler = async (event) => {
  const apiKey = await getKeybrakeApiKey();
  const failedItems = [];

  for (const record of event.Records) {
    const messageId = record.messageId; // stable across SQS redeliveries
    try {
      const body = JSON.parse(record.body);
      // Issue a vault key scoped to this message / agent run
      const vault = await issueVaultKey(apiKey, messageId, body.budget_usd ?? 100);
      const res = await fetch(`${KEYBRAKE_BASE}/stripe/v1/payment_intents`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${vault.vault_key}`,
          'Idempotency-Key': messageId, // same value on every redelivery, so Stripe deduplicates
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          amount: body.amount_cents,
          currency: 'usd',
          customer: body.customer_id
        })
      });
      if (res.status === 429) {
        const err = await res.json();
        if (err.code === 'cap_exhausted') {
          // Cap hit is intentional, not transient. Report the item as failed so SQS
          // redelivers it until maxReceiveCount is exceeded and the DLQ takes it.
          failedItems.push({ itemIdentifier: messageId });
          continue;
        }
      }
      if (!res.ok) throw new Error(`Stripe error: ${res.status}`);
    } catch (err) {
      // Transient or unexpected error: fail only this item. Re-throwing here would
      // fail the whole invocation and make SQS redeliver every message in the batch.
      console.error(`message ${messageId} failed: ${err.message}`);
      failedItems.push({ itemIdentifier: messageId });
    }
  }

  // Partial batch response: only failed items are retried by SQS
  return { batchItemFailures: failedItems };
};
The KEYBRAKE_API_KEY is fetched from SSM Parameter Store once per cold start and cached at module level, so warm invocations reuse the cached value without an SSM call. A separate vault key is issued per SQS message (per agent run), with a 15-minute TTL matching the Lambda maximum timeout. The vault key's agent_run_label includes the function name and messageId, making every vendor call in the audit log traceable to the specific SQS message that triggered it. Using messageId as both the vault key label and the Stripe Idempotency-Key means an SQS redelivery repeats the request with the same key instead of creating a second charge. Partial batch response (batchItemFailures) reports cap-exhausted messages individually: they are redelivered until they exceed maxReceiveCount and move to the DLQ, while the rest of the batch is deleted as processed. This only takes effect when ReportBatchItemFailures is enabled on the SQS event source mapping.
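The TL;DR mentions reusing a cached vault key while it is still valid. One way this could look is a small in-memory cache keyed by messageId, sketched below. Everything here is an assumption made for illustration: `createVaultKeyCache` is a hypothetical helper, `issue` stands for whatever function requests a new vault key, and `expires_at` (epoch milliseconds) is an assumed field on the issuance response, not a documented one.

```javascript
// Hypothetical in-memory cache of vault keys, keyed by SQS messageId.
// `issue(messageId, budgetUsd)` must return a promise of the vault key object.
// ASSUMPTION: that object carries `expires_at` as epoch milliseconds.
function createVaultKeyCache(issue, now = () => Date.now(), safetyMarginMs = 30000) {
  const cache = new Map();

  return async function getVaultKey(messageId, budgetUsd) {
    const t = now();
    // Drop entries that are expired or about to expire, so the cache does
    // not grow for the whole life of the execution environment.
    for (const [id, entry] of cache) {
      if (entry.expires_at - safetyMarginMs <= t) cache.delete(id);
    }

    const hit = cache.get(messageId);
    if (hit) return hit; // a redelivery in this environment draws on the same cap

    const fresh = await issue(messageId, budgetUsd);
    // Only cache keys whose expiry is known
    if (Number.isFinite(fresh.expires_at)) cache.set(messageId, fresh);
    return fresh;
  };
}
```

A cache like this lives in a single execution environment, so a redelivery routed to a different environment would still issue a new key. The Idempotency-Key, not the cache, is what prevents a duplicate charge.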
How Keybrake fits
Keybrake is the proxy layer between your agent Lambda functions and Stripe, Twilio, or Resend. The vault key issued per SQS message replaces the STRIPE_SECRET_KEY previously stored in Lambda environment variables or fetched from Secrets Manager. The real Stripe secret stays in Keybrake — it is never present in Lambda environment variables, CloudWatch Logs, or SSM exports. Revoking a runaway Lambda invocation mid-execution is a single DELETE /vault/keys/{key_id} call — effective on the next proxied request, with no function redeploy, no Secrets Manager rotation, and no reserved concurrency change that would affect other functions sharing the same concurrency pool.
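The revoke call described above might be wrapped as follows. This is a sketch under stated assumptions, not Keybrake's documented client: the DELETE /vault/keys/{key_id} path comes from this page, while the Bearer authorization header is assumed to mirror the issuance request, and `revokeVaultKey` with its injectable `fetchImpl` parameter is a hypothetical helper.

```javascript
// Hypothetical wrapper around the revoke endpoint named on this page.
// ASSUMPTION: revoke uses the same Bearer authorization as key issuance.
async function revokeVaultKey(apiKey, keyId, fetchImpl = fetch, baseUrl = 'https://proxy.keybrake.com') {
  const res = await fetchImpl(`${baseUrl}/vault/keys/${encodeURIComponent(keyId)}`, {
    method: 'DELETE',
    headers: { Authorization: `Bearer ${apiKey}` }
  });
  if (!res.ok) throw new Error(`Keybrake revoke failed: ${res.status}`);
  return true;
}
```

The `fetchImpl` parameter defaults to the global `fetch` available in current Node.js Lambda runtimes and exists only so the call can be exercised with a stub.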
Related questions
Should I issue one vault key per Lambda invocation or per SQS message within a batch?
Per SQS message, not per invocation. Lambda processes SQS messages in batches — a single invocation may receive up to 10,000 messages (with batch size and batch window configured). Each message represents a separate agent run with its own budget. Issuing one vault key per invocation would create one cap shared across all messages in the batch, which means the first few messages could exhaust the cap and all subsequent messages in the batch fail — even if each individual run should have its own budget. Issue a vault key per messageId and set the cap to the per-run budget (body.budget_usd). Each message's cap accumulates independently.
How do I handle cap exhaustion so SQS doesn't endlessly retry the same message?
Use the partial batch response feature of the SQS event source (batchItemFailures in the return value), which requires ReportBatchItemFailures to be enabled on the event source mapping. When a message hits a cap, add its messageId to batchItemFailures instead of throwing an exception. Lambda treats items not in batchItemFailures as successfully processed and deletes them from the queue; items in batchItemFailures become visible again and are redelivered. Cap-exhausted messages should end up in the dead-letter queue rather than retry indefinitely, so configure a redrive policy and let them exceed maxReceiveCount naturally. Alternatively, explicitly send the message to your DLQ via the SQS API and return it as successfully processed. Never throw an unhandled exception for cap exhaustion, because that retries the entire batch.
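The explicit-DLQ alternative can be sketched as a small routing step that runs after each message has been attempted. The function `buildBatchResponse`, the `status` values, and the `sendToDlq` callback are hypothetical names chosen for illustration. In a real handler `sendToDlq` would wrap an SQS SendMessage call to the dead-letter queue, which is left out here so the sketch stays self-contained.

```javascript
// Hypothetical outcome router for one SQS batch.
// outcomes:  [{ record, status }] with status 'ok' | 'cap_exhausted' | 'transient_error'
// sendToDlq: async (record) => void, for example a wrapper around SQS SendMessage
async function buildBatchResponse(outcomes, sendToDlq) {
  const batchItemFailures = [];

  for (const { record, status } of outcomes) {
    if (status === 'ok') continue; // not reported, so Lambda deletes the message

    if (status === 'cap_exhausted') {
      try {
        await sendToDlq(record); // park the message explicitly
        continue;                // and report success so it is not redelivered
      } catch (err) {
        // Could not park it: fall through and report a failure, so the
        // message is redelivered rather than lost.
      }
    }

    batchItemFailures.push({ itemIdentifier: record.messageId });
  }

  return { batchItemFailures };
}
```

Falling back to a reported failure when the DLQ send fails trades an extra redelivery for not dropping the message.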
What's the right pattern for EventBridge Scheduler triggering Lambda for periodic agent runs?
For scheduled Lambda invocations (e.g. a nightly billing run), issue the vault key at the start of the Lambda handler using a stable run identifier derived from the schedule time — agent_run_label: `scheduler/${functionName}/${event.time}`. The event.time from EventBridge Scheduler is the scheduled invocation time, not the actual invocation time, so it is stable across retries triggered by Lambda failures. Use event.id (the EventBridge event ID) as the Stripe idempotency key — it is unique per scheduled event and stable across EventBridge retries. Set the vault key TTL to the expected maximum run duration. For runs that generate significant Stripe volume, set daily_usd_cap to the maximum expected spend for that scheduled run.
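As a small sketch of the scheduled case, the run label and idempotency key can be derived in one place from the incoming event. The helper `scheduledRunIdentifiers` is hypothetical, and it assumes the event carries `time` and `id` fields as described above. The sample event values are made up.

```javascript
// Hypothetical helper: derive both identifiers from the scheduled event.
// ASSUMPTION: the event has `time` (scheduled time) and `id` (event ID).
function scheduledRunIdentifiers(event, functionName) {
  if (!event || !event.time || !event.id) {
    throw new Error('scheduled event is missing time or id');
  }
  return {
    agentRunLabel: `scheduler/${functionName}/${event.time}`,
    idempotencyKey: event.id
  };
}

// A retry of the same scheduled event yields the same identifiers
const sampleEvent = { id: 'example-event-id', time: '2024-01-01T02:00:00Z' };
console.log(scheduledRunIdentifiers(sampleEvent, 'nightly-billing'));
```

Inside a handler, `functionName` would typically come from process.env.AWS_LAMBDA_FUNCTION_NAME, the same variable the SQS example uses for its label.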
Further reading
- AWS Step Functions AI agent API key — for Lambda functions orchestrated by Step Functions Map states, where the vault key is issued in a preceding Task state rather than within the Lambda handler.
- AI agent idempotency — why SQS messageId as the idempotency key is safe and why random UUIDs at invocation time create duplicate charges on redelivery.
- AI agent spend reporting — the four reporting queries that give per-run cost visibility using the vault_key audit log that Lambda's built-in tooling doesn't provide.
- AI agent API key best practices — the seven operational controls for event-driven agent functions, including how to handle Lambda's process-level env var caching.