
Conductor AI agent API key: scoping vendor calls in Netflix Conductor workflows

Netflix Conductor is a microservices workflow orchestration engine: workers poll a central server for tasks, execute them, and report results back. AI agent teams adopt Conductor because it handles complex multi-step workflows with FORK_JOIN parallel execution, conditional branches, sub-workflows, and configurable retry policies, without coupling workflow logic to worker process code. When those worker tasks call Stripe, Twilio, or Resend, Conductor's architecture creates three specific spending risks. First, FORK_JOIN dispatches N parallel tasks, each making independent vendor calls with no shared dollar cap. Second, task retry policies (retryCount, retryDelaySeconds) re-execute failed tasks that may already have applied Stripe charges, which produces duplicate charges unless the worker sends a stable idempotency key. Third, tasks that workers have already polled keep executing even after the workflow is terminated at the Conductor server level. Conductor's task model has no built-in per-workflow-execution dollar cap. This page covers the vault-key pattern that bounds vendor spend per Conductor workflow execution.

TL;DR

Add an issue_vault_key task as the first task in your Conductor workflow definition. The task issues a vault key via the Keybrake API and exposes it in its task output. Downstream tasks receive the vault key through their inputParameters (${issue_vault_key_task.output.response.body.vault_key}) and use it to call Stripe, Twilio, or Resend via the proxy. For FORK_JOIN fan-out, pass the vault key in each forked task's input: all parallel tasks share the same key, and spend accumulates atomically against the same cap across concurrent task executions. The real vendor secret stays in Keybrake, never in worker environment variables or Conductor task outputs. Revoking a runaway workflow is a single DELETE /vault/keys/{key_id} call, effective on the next proxied request without requiring workflow termination.

How Conductor AI agent workers call vendor APIs

In Conductor, workers poll for tasks of their registered type, execute the task logic, and report success or failure. A typical billing workflow uses a FORK_JOIN to charge multiple customers in parallel:

# Workflow definition (JSON)
{
  "name": "billing_agent_workflow",
  "tasks": [
    {
      "name": "fetch_customers",
      "taskReferenceName": "fetch_customers_task",
      "type": "SIMPLE",
      "inputParameters": {
        "plan_id": "${workflow.input.plan_id}"
      }
    },
    {
      "name": "FORK_JOIN",
      "taskReferenceName": "charge_fork",
      "type": "FORK_JOIN",
      "forkTasks": [
        // Dynamically generated — one per customer from fetch_customers output
      ]
    },
    {
      "name": "JOIN",
      "taskReferenceName": "charge_join",
      "type": "JOIN",
      "joinOn": ["charge_fork"]
    }
  ]
}

# Python worker
import os

from conductor.client.http.models.task import Task
from conductor.client.http.models.task_result import TaskResult
from conductor.client.http.models.task_result_status import TaskResultStatus
from stripe import StripeClient

# Full-access key read from the worker environment: no per-workflow cap
stripe_client = StripeClient(os.environ["STRIPE_SECRET_KEY"])

def charge_customer_worker(task: Task) -> TaskResult:
    customer_id = task.input_data["customer_id"]
    amount_cents = task.input_data["amount_cents"]

    # No idempotency key: a retried task creates a second PaymentIntent
    intent = stripe_client.payment_intents.create(
        params={
            "amount": amount_cents,
            "currency": "usd",
            "customer": customer_id,
        }
    )
    return TaskResult(
        task_id=task.task_id,
        workflow_instance_id=task.workflow_instance_id,
        status=TaskResultStatus.COMPLETED,
        output_data={"charge_id": intent.id},
    )

The FORK_JOIN dispatches one task per customer simultaneously. Each charge_customer worker polls independently and calls Stripe with the full-access key from its environment. If fetch_customers returns 3,000 customers due to a bug, 3,000 task executions are queued and processed by however many workers are running, with no workflow-level dollar cap to stop them. If the task definition sets retryCount to 3, each failing task can make up to 4 Stripe calls (the first attempt plus three retries) before the retry budget is exhausted, and any attempt that reached Stripe before failing risks a duplicate charge.
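The amplification is simple multiplication, and a short sketch makes the worst case concrete. The function name below is hypothetical, and the figures assume every task fails and uses its full retry budget, which is an upper bound rather than a typical outcome.

```python
def worst_case_vendor_calls(task_count: int, retry_count: int) -> int:
    """Upper bound on vendor calls if every task fails and exhausts its retries."""
    attempts_per_task = 1 + retry_count  # first attempt plus each retry
    return task_count * attempts_per_task


# 3,000 forked tasks with retryCount = 3
print(worst_case_vendor_calls(task_count=3000, retry_count=3))  # 12000
```

A bug that inflates the customer list is therefore multiplied again by the retry policy before any human sees it.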

Three gaps Conductor's native tooling doesn't fill for vendor spend control

Gap	What happens in practice	Conductor's answer
No per-workflow spend cap	Conductor has rate limiting at the task type level (tasks per second, concurrent executions) and workflow-level timeout. Neither controls the dollar amount of vendor calls made within task executions. A rate limit of 100 tasks/second on `charge_customer` still allows $45,000 in Stripe charges per minute (100 × $7.50 average charge × 60 seconds) — far beyond what most budgets intend. Workflow timeout terminates the workflow after a wall-clock duration but doesn't retroactively cancel already-dispatched tasks.	Task execution rate limits and workflow timeouts are available. No per-workflow-execution dollar cap for vendor API spend within tasks.
No mid-execution vendor revoke without worker restart	Conductor's workflow termination API marks the workflow as terminated and stops scheduling new tasks. But workers that have already polled a task (acknowledged it with `POST /tasks/{taskId}/ack`) continue executing until they report completion. The Stripe API key in the worker's environment can only be rotated by restarting all workers of that type, which breaks every other in-flight task, not just the runaway workflow's tasks. For `FORK_JOIN` branches, dozens of tasks may already be polled and executing simultaneously when termination is triggered.	Workflow termination is available via the Conductor API. No per-task API key scoping or mid-execution vendor termination for already-polled tasks.
No per-call audit with workflow execution context	Conductor's execution history records task inputs, outputs, start/end times, and status for each task in a workflow run. It doesn't parse dollar amounts from Stripe responses, correlate Stripe `PaymentIntent.id` values with the Conductor `workflowInstanceId` and task reference name in a structured cost table, or produce a per-workflow spend summary. Debugging an overcharge requires cross-referencing Conductor's execution UI with the Stripe dashboard, matching on timestamps.	Conductor's execution history captures task inputs/outputs/status. No structured vendor cost tracking or workflow-ID-to-charge correlation.
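To illustrate the third gap, here is a minimal sketch of the kind of structured cost record that Conductor's execution history does not provide. The class names, field names, and sample IDs are all hypothetical, and the ledger is kept in memory purely for illustration; a real deployment would persist these rows somewhere durable.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargeRecord:
    """One vendor charge, tagged with the Conductor context that caused it."""
    workflow_instance_id: str
    task_ref_name: str
    payment_intent_id: str
    amount_cents: int


class SpendLedger:
    """In-memory ledger that answers 'how much did this workflow run spend?'."""

    def __init__(self) -> None:
        self._records: list[ChargeRecord] = []

    def record(self, rec: ChargeRecord) -> None:
        self._records.append(rec)

    def total_cents_by_workflow(self) -> dict[str, int]:
        totals: dict[str, int] = defaultdict(int)
        for rec in self._records:
            totals[rec.workflow_instance_id] += rec.amount_cents
        return dict(totals)


ledger = SpendLedger()
ledger.record(ChargeRecord("wf-1", "charge_customer_task_0", "pi_example_1", 750))
ledger.record(ChargeRecord("wf-1", "charge_customer_task_1", "pi_example_2", 750))
ledger.record(ChargeRecord("wf-2", "charge_customer_task_0", "pi_example_3", 5000))
print(ledger.total_cents_by_workflow())
```

With a record like this, an overcharge can be traced by workflow ID instead of by matching timestamps across two dashboards.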

The FORK_JOIN amplification risk

Conductor's FORK_JOIN system task is designed for parallel execution — it's the right pattern for processing many customers simultaneously. But parallel execution means many vendor API calls hit Stripe simultaneously with no shared cap. The join task (JOIN) waits for all forked tasks to complete before proceeding, but it doesn't aggregate spend or enforce a budget across the fork. If 500 forked tasks each call Stripe for $50, the total spend is $25,000 from a single workflow execution — with no mechanism in Conductor to stop the fork at $10,000.
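What the fork lacks is a budget that every branch checks and decrements in one atomic step. The sketch below is a toy, in-process model of that idea using a lock; the class and method names are hypothetical, and it says nothing about how any particular proxy implements its cap. It replays the numbers above: 500 concurrent $50 charges against a $10,000 budget.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedBudget:
    """A spend cap shared by concurrent callers; check-and-add is atomic."""

    def __init__(self, cap_cents: int) -> None:
        self._cap_cents = cap_cents
        self._spent_cents = 0
        self._lock = threading.Lock()

    def try_spend(self, amount_cents: int) -> bool:
        # The check and the increment happen under one lock, so two
        # concurrent callers can never both take the last of the budget.
        with self._lock:
            if self._spent_cents + amount_cents > self._cap_cents:
                return False
            self._spent_cents += amount_cents
            return True

    @property
    def spent_cents(self) -> int:
        with self._lock:
            return self._spent_cents


budget = SharedBudget(cap_cents=10_000 * 100)
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(lambda _: budget.try_spend(50 * 100), range(500)))

approved = sum(results)
print(approved, budget.spent_cents)  # 200 1000000
```

Exactly 200 charges fit under the cap regardless of thread scheduling; the remaining 300 are refused rather than billed.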

The retry amplification compounds this. Conductor's task retry policy (retryCount, retryDelaySeconds) re-executes failed tasks after a delay. If a charge_customer task fails after Stripe applied a charge but before the worker returned success (network timeout, worker crash), the retry re-runs the entire task — calling Stripe again with no idempotency key unless you explicitly generate a stable one from the workflow and customer ID.
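The difference between a stable key and a per-attempt key is easy to see in a small sketch. Both helper names are hypothetical; the stable form simply joins the workflow ID and customer ID, and the random form stands in for any key that is regenerated on each attempt.

```python
import uuid


def stable_idempotency_key(workflow_id: str, customer_id: str) -> str:
    """Same inputs always give the same key, so a retry reuses it."""
    return f"{workflow_id}-{customer_id}"


def per_attempt_idempotency_key() -> str:
    """A fresh key per attempt: the vendor sees each retry as a new request."""
    return str(uuid.uuid4())


first_attempt = stable_idempotency_key("wf-123", "cus_abc")
retry_attempt = stable_idempotency_key("wf-123", "cus_abc")
other_run = stable_idempotency_key("wf-456", "cus_abc")

print(first_attempt == retry_attempt)  # True: retry is deduplicated
print(first_attempt == other_run)      # False: next billing run still charges
print(per_attempt_idempotency_key() == per_attempt_idempotency_key())  # False
```

Only the stable form lets the vendor recognize a retry as the same charge while still treating a later workflow execution as a new one.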

Scoping vault keys per Conductor workflow execution

# Workflow definition with vault key issuance
{
  "name": "billing_agent_workflow",
  "tasks": [
    {
      "name": "issue_vault_key",
      "taskReferenceName": "issue_vault_key_task",
      "type": "HTTP",
      "inputParameters": {
        "http_request": {
          "uri": "https://proxy.keybrake.com/vault/keys",
          "method": "POST",
          "headers": {
            "Authorization": "Bearer ${workflow.input.keybrake_api_key}",
            "Content-Type": "application/json"
          },
          "body": {
            "vendor": "stripe",
            "daily_usd_cap": "${workflow.input.budget_usd}",
            "allowed_endpoints": ["POST /v1/payment_intents"],
            "expires_in": "2h",
            "agent_run_label": "conductor/${workflow.workflowId}"
          }
        }
      }
    },
    {
      "name": "fetch_customers",
      "taskReferenceName": "fetch_customers_task",
      "type": "SIMPLE",
      "inputParameters": {
        "plan_id": "${workflow.input.plan_id}"
      }
    },
    {
      "name": "FORK_JOIN",
      "taskReferenceName": "charge_fork",
      "type": "FORK_JOIN",
      "forkTasks": [
        // Each forked task includes vault_key in inputParameters
        {
          "name": "charge_customer",
          "taskReferenceName": "charge_customer_task_N",
          "type": "SIMPLE",
          "inputParameters": {
            "customer_id": "${fetch_customers_task.output.customers[N].id}",
            "amount_cents": "${fetch_customers_task.output.customers[N].amountCents}",
            "vault_key": "${issue_vault_key_task.output.response.body.vault_key}",
            "workflow_id": "${workflow.workflowId}"
          }
        }
      ]
    }
  ]
}

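# Python worker with vault key
from conductor.client.http.models.task import Task
from conductor.client.http.models.task_result import TaskResult
from conductor.client.http.models.task_result_status import TaskResultStatus
from stripe import StripeClient

def charge_customer_worker(task: Task) -> TaskResult:
    customer_id = task.input_data["customer_id"]
    amount_cents = task.input_data["amount_cents"]
    vault_key = task.input_data["vault_key"]
    workflow_id = task.input_data["workflow_id"]

    # The vault key replaces the Stripe secret and requests go to the proxy.
    # stripe-python appends /v1/... to the base address itself, so the base
    # address stops at /stripe (assumes the proxy mirrors Stripe's /v1 paths).
    stripe_client = StripeClient(
        vault_key,
        base_addresses={"api": "https://proxy.keybrake.com/stripe"},
    )

    try:
        intent = stripe_client.payment_intents.create(
            params={
                "amount": amount_cents,
                "currency": "usd",
                "customer": customer_id,
            },
            # Stable across retries of the same workflow execution
            options={"idempotency_key": f"{workflow_id}-{customer_id}"},
        )
        return TaskResult(
            task_id=task.task_id,
            workflow_instance_id=task.workflow_instance_id,
            status=TaskResultStatus.COMPLETED,
            output_data={"charge_id": intent.id},
        )
    except Exception as e:
        if "cap_exhausted" in str(e):
            # Non-retryable: cap hit is intentional, don't retry
            return TaskResult(
                task_id=task.task_id,
                workflow_instance_id=task.workflow_instance_id,
                status=TaskResultStatus.FAILED_WITH_TERMINAL_ERROR,
                reason_for_incompletion="Vendor spend cap exhausted",
            )
        raise  # Retriable: network error, Stripe 500, etc.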

The issue_vault_key task uses Conductor's built-in HTTP system task to call the Keybrake API, so key issuance needs no custom worker. The vault key is stored in the task's output and referenced by downstream tasks via ${issue_vault_key_task.output.response.body.vault_key}. Each forked task receives the vault key in its inputParameters, so all parallel tasks share the same vault key and their spend accumulates atomically against one cap.
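When the number of customers is only known at runtime, one way to generate the forked tasks is Conductor's FORK_JOIN_DYNAMIC task, which reads a task list and a per-task input map from an upstream task's output. The sketch below assumes that approach; the function name and the output keys dynamicTasks and dynamicTasksInput are illustrative and must match whatever your workflow definition names in dynamicForkTasksParam and dynamicForkTasksInputParamName.

```python
from typing import Any


def build_dynamic_fork(
    customers: list[dict[str, Any]], vault_key: str, workflow_id: str
) -> dict[str, Any]:
    """Build one charge_customer task per customer, each carrying the vault key."""
    tasks: list[dict[str, str]] = []
    inputs: dict[str, dict[str, Any]] = {}
    for index, customer in enumerate(customers):
        ref = f"charge_customer_task_{index}"
        tasks.append(
            {"name": "charge_customer", "taskReferenceName": ref, "type": "SIMPLE"}
        )
        # Every branch gets the same vault key, so all share one cap.
        inputs[ref] = {
            "customer_id": customer["id"],
            "amount_cents": customer["amountCents"],
            "vault_key": vault_key,
            "workflow_id": workflow_id,
        }
    return {"dynamicTasks": tasks, "dynamicTasksInput": inputs}


fork = build_dynamic_fork(
    customers=[{"id": "cus_a", "amountCents": 750}, {"id": "cus_b", "amountCents": 900}],
    vault_key="vk_example",
    workflow_id="wf-123",
)
print(len(fork["dynamicTasks"]), sorted(fork["dynamicTasksInput"]))
```

A worker placed between fetch_customers and the fork could return this dictionary as its output data.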

The idempotency key uses the Conductor workflowId plus the customer ID — stable across task retries (Conductor retries the same task with the same workflowInstanceId), unique across different workflow executions. On cap exhaustion, the worker returns FAILED_WITH_TERMINAL_ERROR to prevent Conductor from retrying the task.

How Keybrake fits

Keybrake is the proxy layer between your Conductor workers and Stripe, Twilio, or Resend. The vault key issued in the first task replaces the full-access STRIPE_SECRET_KEY that was previously stored in worker environment variables. The real Stripe secret stays in Keybrake — never in Conductor task inputs/outputs or execution history. For FORK_JOIN fan-out, the vault key is passed in each forked task's input, giving all parallel tasks access to the same cap. Even if 20 worker processes poll tasks simultaneously, the cap is enforced atomically at the proxy level. Revoking a runaway workflow is a single DELETE /vault/keys/{key_id} call — effective on the next proxied request from any worker holding that vault key, without requiring workflow termination or worker restart.
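The revoke call can be sketched with the standard library alone. This assumes the DELETE endpoint lives on the same host as the key-issuance endpoint and accepts the same Bearer authorization; the function name and the placeholder key ID and API key are hypothetical. The request is built but not sent, so the example runs without network access.

```python
import urllib.request

KEYBRAKE_BASE_URL = "https://proxy.keybrake.com"  # host used for key issuance


def build_revoke_request(key_id: str, api_key: str) -> urllib.request.Request:
    """Build the DELETE /vault/keys/{key_id} request for a runaway workflow."""
    return urllib.request.Request(
        url=f"{KEYBRAKE_BASE_URL}/vault/keys/{key_id}",
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )


request = build_revoke_request(key_id="vk_example", api_key="kb_example")
print(request.get_method(), request.full_url)

# To actually revoke, send it (left commented out in this sketch):
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```

Because the key is scoped to one workflow execution, revoking it leaves every other in-flight workflow untouched.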
