Conductor AI agent API key: scoping vendor calls in Netflix Conductor workflows
Netflix Conductor is a microservices workflow orchestration engine — workers poll a central server for tasks, execute them, and report results back. AI agent teams adopt Conductor because it handles complex multi-step workflows with FORK_JOIN parallel execution, conditional branches, sub-workflows, and configurable retry policies, without coupling workflow logic to worker process code. When those worker tasks call Stripe, Twilio, or Resend, Conductor's architecture creates three specific spending risks: FORK_JOIN dispatches N parallel tasks each making independent vendor calls with no shared dollar cap; worker task retry policies (retryCount, retryDelaySeconds) re-execute failed tasks that may have already applied Stripe charges (duplicate charges without stable idempotency keys); and already-polled tasks continue executing on workers even after the workflow is terminated at the Conductor server level. There is no per-workflow-execution dollar cap built into Conductor's task model. This page covers the vault-key pattern that bounds vendor spend per Conductor workflow execution.
TL;DR
Add an issue_vault_key task as the first task in your Conductor workflow definition. The task issues a vault key via the Keybrake API and exposes it in its task output. Downstream tasks receive the vault key through their inputParameters (${issue_vault_key_task.output.response.body.vault_key}) and use it to call Stripe, Twilio, or Resend via the proxy. For FORK_JOIN fan-out, pass the vault key in each forked task's input: all parallel tasks share the same key, and the same cap accumulates atomically across concurrent task executions. The real vendor secret stays in Keybrake, never in worker environment variables or Conductor task outputs. Revoking a runaway workflow is a single DELETE /vault/keys/{key_id} call, effective on the next proxied request without requiring workflow termination.
How Conductor AI agent workers call vendor APIs
In Conductor, workers poll for tasks of their registered type, execute the task logic, and report success or failure. A typical billing workflow uses a FORK_JOIN to charge multiple customers in parallel:
# Workflow definition (JSON)
{
  "name": "billing_agent_workflow",
  "tasks": [
    {
      "name": "fetch_customers",
      "taskReferenceName": "fetch_customers_task",
      "type": "SIMPLE",
      "inputParameters": {
        "plan_id": "${workflow.input.plan_id}"
      }
    },
    {
      "name": "FORK_JOIN",
      "taskReferenceName": "charge_fork",
      "type": "FORK_JOIN",
      "forkTasks": [
        // Dynamically generated — one per customer from fetch_customers output
      ]
    },
    {
      "name": "JOIN",
      "taskReferenceName": "charge_join",
      "type": "JOIN",
      "joinOn": ["charge_fork"]
    }
  ]
}
# Python worker
import os

from conductor.client.http.models.task import Task
from conductor.client.http.models.task_result import TaskResult
from conductor.client.http.models.task_result_status import TaskResultStatus
from stripe import StripeClient

stripe_client = StripeClient(api_key=os.environ["STRIPE_SECRET_KEY"])


def charge_customer_worker(task: Task) -> TaskResult:
    customer_id = task.input_data["customer_id"]
    amount_cents = task.input_data["amount_cents"]
    # Full-access key: no per-workflow cap
    intent = stripe_client.payment_intents.create(
        params={
            "amount": amount_cents,
            "currency": "usd",
            "customer": customer_id,
        }
    )
    return TaskResult(
        task_id=task.task_id,
        workflow_instance_id=task.workflow_instance_id,
        status=TaskResultStatus.COMPLETED,
        output_data={"charge_id": intent.id},
    )
The FORK_JOIN dispatches one task per customer simultaneously. Each charge_customer worker polls independently and calls Stripe with the full-access key from its environment. If fetch_customers returns 3,000 customers due to a bug, 3,000 task executions are queued and processed by however many workers are running — with no workflow-level dollar cap to stop them. Each task's retryCount: 3 means each failed task can generate up to 4 Stripe calls before the retry budget is exhausted, with duplicate charge risk on any task that reached Stripe before failing.
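The amplification described above is simple arithmetic, and it can help to compute the ceiling before choosing a retry policy. The following is a minimal sketch; the function name is hypothetical and it assumes the worst case in which every task fails and uses its full retry budget.

```python
def worst_case_vendor_calls(task_count: int, retry_count: int) -> int:
    """Upper bound on vendor calls if every task exhausts its retries."""
    attempts_per_task = 1 + retry_count  # first attempt plus each retry
    return task_count * attempts_per_task


# 3,000 forked tasks with retryCount: 3, as in the paragraph above
print(worst_case_vendor_calls(task_count=3000, retry_count=3))  # 12000
```

The bound counts calls, not dollars: how many of those calls result in a charge depends on where each attempt failed.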
Three gaps Conductor's native tooling doesn't fill for vendor spend control
| Gap | What happens in practice | Conductor's answer |
|---|---|---|
| No per-workflow spend cap | Conductor has rate limiting at the task type level (tasks per second, concurrent executions) and workflow-level timeout. Neither controls the dollar amount of vendor calls made within task executions. A rate limit of 100 tasks/second on charge_customer still allows $45,000 in Stripe charges per minute (100 × $7.50 average charge × 60 seconds), far beyond what most budgets intend. Workflow timeout terminates the workflow after a wall-clock duration but doesn't retroactively cancel already-dispatched tasks. | Task execution rate limits and workflow timeouts are available. No per-workflow-execution dollar cap for vendor API spend within tasks. |
| No mid-execution vendor revoke without worker restart | Conductor's workflow termination API marks the workflow as terminated and stops scheduling new tasks. But workers that have already polled a task (acknowledged it with PUT /tasks/{taskId}/ack) continue executing until they report completion. The Stripe API key in the worker's environment can only be rotated by restarting all workers of that type, which breaks every other in-flight task, not just the runaway workflow's tasks. For FORK_JOIN branches, dozens of tasks may already be polled and executing simultaneously when termination is triggered. | Workflow termination is available via the Conductor API. No per-task API key scoping or mid-execution vendor termination for already-polled tasks. |
| No per-call audit with workflow execution context | Conductor's execution history records task inputs, outputs, start/end times, and status for each task in a workflow run. It doesn't parse dollar amounts from Stripe responses, correlate Stripe PaymentIntent.id values with the Conductor workflowInstanceId and task reference name in a structured cost table, or produce a per-workflow spend summary. Debugging an overcharge requires cross-referencing Conductor's execution UI with the Stripe dashboard, matching on timestamps. | Conductor's execution history captures task inputs/outputs/status. No structured vendor cost tracking or workflow-ID-to-charge correlation. |
The FORK_JOIN amplification risk
Conductor's FORK_JOIN system task is designed for parallel execution — it's the right pattern for processing many customers simultaneously. But parallel execution means many vendor API calls hit Stripe simultaneously with no shared cap. The join task (JOIN) waits for all forked tasks to complete before proceeding, but it doesn't aggregate spend or enforce a budget across the fork. If 500 forked tasks each call Stripe for $50, the total spend is $25,000 from a single workflow execution — with no mechanism in Conductor to stop the fork at $10,000.
The retry amplification compounds this. Conductor's task retry policy (retryCount, retryDelaySeconds) re-executes failed tasks after a delay. If a charge_customer task fails after Stripe applied a charge but before the worker returned success (network timeout, worker crash), the retry re-runs the entire task — calling Stripe again with no idempotency key unless you explicitly generate a stable one from the workflow and customer ID.
Scoping vault keys per Conductor workflow execution
# Workflow definition with vault key issuance
{
  "name": "billing_agent_workflow",
  "tasks": [
    {
      "name": "issue_vault_key",
      "taskReferenceName": "issue_vault_key_task",
      "type": "HTTP",
      "inputParameters": {
        "http_request": {
          "uri": "https://proxy.keybrake.com/vault/keys",
          "method": "POST",
          "headers": {
            "Authorization": "Bearer ${workflow.input.keybrake_api_key}",
            "Content-Type": "application/json"
          },
          "body": {
            "vendor": "stripe",
            "daily_usd_cap": "${workflow.input.budget_usd}",
            "allowed_endpoints": ["POST /v1/payment_intents"],
            "expires_in": "2h",
            "agent_run_label": "conductor/${workflow.workflowId}"
          }
        }
      }
    },
    {
      "name": "fetch_customers",
      "taskReferenceName": "fetch_customers_task",
      "type": "SIMPLE",
      "inputParameters": {
        "plan_id": "${workflow.input.plan_id}"
      }
    },
    {
      "name": "FORK_JOIN",
      "taskReferenceName": "charge_fork",
      "type": "FORK_JOIN",
      "forkTasks": [
        // Each forked task includes vault_key in inputParameters
        {
          "name": "charge_customer",
          "taskReferenceName": "charge_customer_task_N",
          "type": "SIMPLE",
          "inputParameters": {
            "customer_id": "${fetch_customers_task.output.customers[N].id}",
            "amount_cents": "${fetch_customers_task.output.customers[N].amountCents}",
            "vault_key": "${issue_vault_key_task.output.response.body.vault_key}",
            "workflow_id": "${workflow.workflowId}"
          }
        }
      ]
    }
  ]
}
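# Python worker with vault key
from conductor.client.http.models.task import Task
from conductor.client.http.models.task_result import TaskResult
from conductor.client.http.models.task_result_status import TaskResultStatus
from stripe import StripeClient


def charge_customer_worker(task: Task) -> TaskResult:
    customer_id = task.input_data["customer_id"]
    amount_cents = task.input_data["amount_cents"]
    vault_key = task.input_data["vault_key"]
    workflow_id = task.input_data["workflow_id"]
    # The vault key stands in for the Stripe secret and requests go to the proxy.
    # StripeClient appends /v1/... to the "api" base address.
    stripe_client = StripeClient(
        api_key=vault_key,
        base_addresses={"api": "https://proxy.keybrake.com/stripe"},
    )
    try:
        intent = stripe_client.payment_intents.create(
            params={
                "amount": amount_cents,
                "currency": "usd",
                "customer": customer_id,
            },
            # Stable across retries of the same task in the same workflow run
            options={"idempotency_key": f"{workflow_id}-{customer_id}"},
        )
        return TaskResult(
            task_id=task.task_id,
            workflow_instance_id=task.workflow_instance_id,
            status=TaskResultStatus.COMPLETED,
            output_data={"charge_id": intent.id},
        )
    except Exception as e:
        if "cap_exhausted" in str(e):
            # Non-retryable: cap hit is intentional, don't retry
            return TaskResult(
                task_id=task.task_id,
                workflow_instance_id=task.workflow_instance_id,
                status=TaskResultStatus.FAILED_WITH_TERMINAL_ERROR,
                reason_for_incompletion="Vendor spend cap exhausted",
            )
        raise  # Retriable: network error, Stripe 500, etc.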
The issue_vault_key task uses Conductor's built-in HTTP system task to call the Keybrake API, so no custom worker is needed for key issuance. The vault key is stored in the task's output and referenced by downstream tasks via ${issue_vault_key_task.output.response.body.vault_key}. Each forked task receives the vault key in its inputParameters, so all parallel tasks share the same vault key and the cap accumulates atomically.
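To make "accumulates atomically" concrete, here is a toy in-process model of one cap shared by many parallel tasks. It is only a sketch of the idea, not Keybrake's implementation: the `SharedCap` class and `run_fork` helper are hypothetical, and a real proxy would enforce this server-side rather than with a thread lock. The numbers reuse the earlier example of 500 forked tasks at $50 each against a $10,000 budget.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCap:
    """Toy model of a single spend cap shared by many parallel tasks."""

    def __init__(self, cap_cents: int) -> None:
        self.cap_cents = cap_cents
        self.spent_cents = 0
        self._lock = threading.Lock()

    def try_spend(self, amount_cents: int) -> bool:
        # Check and increment under one lock, so two tasks cannot both
        # pass the check against the same remaining budget.
        with self._lock:
            if self.spent_cents + amount_cents > self.cap_cents:
                return False
            self.spent_cents += amount_cents
            return True


def run_fork(cap: SharedCap, task_count: int, amount_cents: int) -> int:
    """Run simulated charges on 20 threads and return how many were approved."""
    with ThreadPoolExecutor(max_workers=20) as pool:
        results = list(
            pool.map(lambda _: cap.try_spend(amount_cents), range(task_count))
        )
    return sum(results)


cap = SharedCap(cap_cents=1_000_000)  # $10,000 cap
approved = run_fork(cap, task_count=500, amount_cents=5_000)  # 500 tasks x $50
print(approved, cap.spent_cents)  # 200 1000000
```

Because the check and the increment happen as one step, the approved total never exceeds the cap regardless of how the 20 threads interleave.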
The idempotency key uses the Conductor workflowId plus the customer ID — stable across task retries (Conductor retries the same task with the same workflowInstanceId), unique across different workflow executions. On cap exhaustion, the worker returns FAILED_WITH_TERMINAL_ERROR to prevent Conductor from retrying the task.
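The stability property can be checked in isolation. This is a minimal sketch with a hypothetical helper name and made-up IDs; it mirrors the f-string used in the worker above.

```python
def charge_idempotency_key(workflow_id: str, customer_id: str) -> str:
    """Same workflow run and customer always yield the same key."""
    return f"{workflow_id}-{customer_id}"


# A retry of the same task sees the same workflow ID, so the key repeats.
first_attempt = charge_idempotency_key("wf-001", "cus_A")
retry_attempt = charge_idempotency_key("wf-001", "cus_A")
# A later workflow execution gets a different key for the same customer.
next_run = charge_idempotency_key("wf-002", "cus_A")

print(first_attempt == retry_attempt)  # True
print(first_attempt == next_run)  # False
```

Avoid mixing anything attempt-specific (timestamps, retry counters, random values) into the key, since that would give each retry a fresh key and defeat the deduplication.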
How Keybrake fits
Keybrake is the proxy layer between your Conductor workers and Stripe, Twilio, or Resend. The vault key issued in the first task replaces the full-access STRIPE_SECRET_KEY that was previously stored in worker environment variables. The real Stripe secret stays in Keybrake — never in Conductor task inputs/outputs or execution history. For FORK_JOIN fan-out, the vault key is passed in each forked task's input, giving all parallel tasks access to the same cap. Even if 20 worker processes poll tasks simultaneously, the cap is enforced atomically at the proxy level. Revoking a runaway workflow is a single DELETE /vault/keys/{key_id} call — effective on the next proxied request from any worker holding that vault key, without requiring workflow termination or worker restart.
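The revoke call can be sketched with the standard library alone. The host is taken from the workflow definition above; the Bearer authorization scheme is an assumption carried over from the key-issuance request, and the key ID and API key shown are placeholders. The sketch only builds the request so it can be inspected without touching the network.

```python
import urllib.request

# Host as used by the key-issuance task in the workflow definition
KEYBRAKE_BASE_URL = "https://proxy.keybrake.com"


def build_revoke_request(key_id: str, keybrake_api_key: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE /vault/keys/{key_id} request."""
    return urllib.request.Request(
        url=f"{KEYBRAKE_BASE_URL}/vault/keys/{key_id}",
        method="DELETE",
        # Assumption: same Bearer scheme as the issuance call
        headers={"Authorization": f"Bearer {keybrake_api_key}"},
    )


req = build_revoke_request("key_123", "kb_api_key_placeholder")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=10)
```

Storing the key ID alongside the workflowId at issuance time makes it possible to find the right key to revoke when a specific workflow execution misbehaves.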
Related questions
Is the vault key stored in Conductor task output visible in the execution history?
Yes. Conductor's execution history includes task input and output data, and the issue_vault_key task output includes the vault key. This is a scoped credential (vault_key_xxx), not your real Stripe secret: anyone who extracts it from the execution history can make vendor calls only up to the configured cap, and only until the key expires. For higher security, keep the vault key out of task output: point the HTTP system task at a key-distribution endpoint, and have workers fetch the key from Keybrake directly using the Conductor workflowInstanceId as a lookup key, so they never receive it via task input. Also restrict who can read execution histories, using the access controls your Conductor deployment provides.
How does vault key scoping interact with Conductor's dynamic fork on large customer lists?
Conductor's dynamic fork (the FORK_JOIN_DYNAMIC task type, whose task list is generated at runtime) creates as many parallel tasks as there are items in the input list, potentially thousands. All those tasks share the same vault key and cap. If the fork generates 5,000 tasks with a $500 cap, the first 1,000 tasks to complete (assuming a $0.50 average) will exhaust the cap; the remaining 4,000 tasks get 429 responses and should fail with a terminal error. This is the intended behavior: the cap stopped the runaway. Set daily_usd_cap to the legitimate maximum spend for a single workflow execution, not an account-level daily budget.
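The numbers above, and the decision each worker has to make when the cap is hit, can be sketched as two small functions. Both names are hypothetical. The status strings are Conductor task status names; treating a 429 as terminal only when the error body carries a cap_exhausted marker is an assumption based on the worker code earlier on this page, since a plain vendor rate-limit 429 is usually worth retrying.

```python
def tasks_until_cap(cap_cents: int, avg_charge_cents: int) -> int:
    """How many average-sized charges fit under the cap."""
    return cap_cents // avg_charge_cents


def conductor_status_for(http_status: int, error_body: str = "") -> str:
    """Map a proxied vendor response to a Conductor task status name."""
    if 200 <= http_status < 300:
        return "COMPLETED"
    if http_status == 429 and "cap_exhausted" in error_body:
        # Cap exhausted: a retry cannot succeed, so fail without retrying.
        return "FAILED_WITH_TERMINAL_ERROR"
    # Everything else is left to the task's retry policy.
    return "FAILED"


succeeded = tasks_until_cap(cap_cents=50_000, avg_charge_cents=50)  # $500 / $0.50
print(succeeded, 5_000 - succeeded)  # 1000 4000
print(conductor_status_for(429, "cap_exhausted"))  # FAILED_WITH_TERMINAL_ERROR
```

Returning the terminal status matters for cost as well as correctness: with retryCount: 3, each of the 4,000 rejected tasks would otherwise make three more proxied calls that cannot succeed.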
Does Orkes Conductor (the managed version) change this pattern?
No — Orkes Conductor is a fully managed, enterprise version of Netflix Conductor with the same task model, FORK_JOIN semantics, and worker polling architecture. The vault-key pattern works identically: the HTTP system task issues the vault key in the first workflow step, downstream tasks read it from ${issue_vault_key_task.output.response.body.vault_key}, and workers use it as their Stripe API key pointed at the Keybrake proxy. Orkes adds RBAC and audit logging at the Conductor level, but it doesn't add per-workflow vendor spend caps or per-task API key scoping.
Further reading
- Temporal AI agent API key: Temporal Activities map to Conductor worker tasks; the vault-key-per-workflow-execution approach is equivalent across both engines.
- Airflow AI agent API key: Airflow's dynamic task mapping creates the same FORK_JOIN fan-out risk; the per-DAG-run vault key pattern is equivalent to per-workflow-execution.
- AI agent idempotency: why stable idempotency keys derived from the Conductor workflowInstanceId are essential when task retry policies re-execute failed tasks.
- AI agent API key best practices: the seven operational controls that reduce vendor spend risk across all orchestration engines including Conductor.