Google Cloud Workflows AI agent API key: scoping vendor calls in managed GCP workflows

Google Cloud Workflows is a fully managed serverless workflow service that executes YAML-defined orchestration logic: HTTP steps, parallel branches, condition blocks, retry policies, and variable passing between steps. AI agent teams running on GCP adopt Cloud Workflows because it handles durable execution, automatic retries, and parallel fan-out without managing infrastructure. When those workflow steps call Stripe, Twilio, or Resend, the same reliability features become vendor spend amplifiers. A parallel block fans out into N concurrent HTTP calls, each one a vendor API request. A try/retry block re-sends failed vendor calls that may already have reached Stripe, which risks duplicate charges when no idempotency key is sent. And Cloud Workflows has no built-in per-execution dollar cap. This page covers the vault-key pattern that bounds vendor spend per Cloud Workflows execution.

TL;DR

Issue a vault key in the first step of your Cloud Workflows workflow via an HTTP call to Keybrake, store it as a workflow variable, and pass it into every downstream HTTP step that calls Stripe, Twilio, or Resend. Parallel branches read that variable from the enclosing workflow scope, so all branches share the same vault key and the same cap, which accumulates atomically across all concurrent branches. The real vendor secret stays in Keybrake (or Secret Manager fetched only at proxy startup), never in workflow variables or execution logs. Revoking a runaway workflow execution is a single DELETE /vault/keys/{key_id} call: no workflow termination command, no Secret Manager rotation, no infrastructure change.

How Cloud Workflows AI agent workflows call vendor APIs

A typical agent workflow uses Cloud Workflows HTTP steps to call vendor APIs directly. Here is a billing workflow that charges a list of customers using Stripe:

main:
  params: [args]
  steps:
    - fetch_customers:
        call: http.get
        args:
          url: https://your-api.run.app/customers
          auth:
            type: OIDC
          query:
            plan_id: ${args.plan_id}
        result: customers_response
    - charge_customers:
        parallel:
          for:
            value: customer
            in: ${customers_response.body.customers}
            steps:
              - charge_one:
                  call: http.post
                  args:
                    url: https://api.stripe.com/v1/payment_intents
                    headers:
                      # Full-access Stripe key read straight from the workflow environment
                      Authorization: ${"Bearer " + sys.get_env("STRIPE_SECRET_KEY")}
                    body:
                      amount: ${customer.amount_cents}
                      currency: usd
                      customer: ${customer.id}
                  result: charge_result

This pattern has two compounding risks. First, if customers_response.body.customers contains 3,000 records because of a query bug, the parallel block issues 3,000 Stripe calls with nothing to stop it. Concurrency limits throttle how many calls run at once, not how many run in total. Second, the parallel block has no native dollar-spend stop condition: it runs every iteration regardless of cumulative cost. If you add a retry policy to charge_one, each failed branch retries independently. When Stripe returns a 500 after it has already applied the charge, a retry sent without an idempotency key creates a duplicate charge.
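
One way to blunt the first risk is to check the list size before fanning out. The fragment below is a minimal sketch, not part of the original workflow: it would sit between fetch_customers and charge_customers, and the max_customers argument and its default of 200 are assumptions chosen for illustration.

```yaml
# Hypothetical guard steps, placed between fetch_customers and charge_customers.
- init_guard:
    assign:
      # max_customers is an assumed runtime argument; 200 is an illustrative default
      - max_customers: ${default(map.get(args, "max_customers"), 200)}
- check_customer_count:
    switch:
      # Abort before any vendor call is made if the batch is unexpectedly large
      - condition: ${len(customers_response.body.customers) > max_customers}
        steps:
          - abort_oversized_batch:
              raise: ${"Refusing to fan out over " + string(len(customers_response.body.customers)) + " customers"}
```

A count guard bounds the number of calls, not the dollars charged, so it complements a spend cap rather than replacing one.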

Three gaps Cloud Workflows' native tooling doesn't fill for vendor spend control

Gap	What happens in practice	Cloud Workflows' answer
No per-execution spend cap	Cloud Workflows has no built-in mechanism to stop an execution when cumulative vendor API spend reaches a dollar threshold. The `parallel` block runs all iterations to completion (or failure). Cloud Billing budgets track Google Cloud costs rather than charges made on a third-party vendor account, and their alerts can lag spend by up to a day, too slow to stop a runaway execution that completes in minutes. Timeouts (for example the `timeout` argument on an HTTP step) cap wall-clock time, not dollars spent.	Cloud Billing budgets send email alerts after the fact. No pre-call, per-execution dollar cap in the Workflows runtime.
No mid-execution vendor revoke without Secret Manager rotation	The Stripe API key is typically loaded from Secret Manager at workflow execution start or injected as a runtime argument. Rotating the Secret Manager secret version prevents new executions from fetching the old key, but already-running executions that cached the secret in a workflow variable continue using it until the execution completes. Cancelling the execution (`gcloud workflows executions cancel` or the equivalent Workflows API call) sends a cancellation signal, but parallel branches that are already dispatching HTTP calls may complete before the cancellation propagates.	Execution cancellation is available via CLI or API. No per-execution API key scoping or mid-execution vendor termination that takes effect on in-flight HTTP steps.
No per-call audit with workflow execution context	Cloud Logging captures Workflows execution events (step names, HTTP request/response bodies if `--call-log-level=LOG_ALL_CALLS` is set). But it doesn't parse dollar amounts from Stripe responses, correlate Stripe `PaymentIntent.id` values with the Workflows execution ID and step name in a structured cost table, or provide a queryable per-execution spend summary. Reconstructing what a runaway execution charged requires cross-referencing Cloud Logging (Workflows events) with the Stripe dashboard, matching on timestamps since no shared identifier is propagated.	Cloud Logging with `LOG_ALL_CALLS` records HTTP step inputs and outputs. No structured vendor cost tracking or execution-ID-to-charge correlation.
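
The correlation gap in the last row can be narrowed from inside the workflow by writing one structured log entry per charge. This is a sketch under assumptions: the step would follow charge_one inside the parallel loop, it assumes the response body exposes id and amount as Stripe PaymentIntent objects do, and the field names in the log payload are illustrative.

```yaml
# Hypothetical step placed directly after charge_one inside the parallel loop.
- log_charge:
    call: sys.log
    args:
      severity: INFO
      json:
        # Shared identifier that ties this log entry to the workflow execution
        execution_id: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_EXECUTION_ID")}
        step: charge_one
        customer_id: ${customer.id}
        payment_intent_id: ${charge_result.body.id}
        amount_cents: ${charge_result.body.amount}
```

Entries like this can be filtered by execution_id in Cloud Logging, but they only record calls whose response reached the workflow. A call that errored after the vendor applied the charge would not appear.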

The parallel branch amplification risk

Cloud Workflows' parallel block is the primary fan-out mechanism and the primary spend amplifier. In a sequential for loop you can check a running total before each call; parallel branches execute concurrently, and nothing in the block tracks cumulative spend across iterations. Each branch has independent access to the shared Stripe key stored in the workflow variable. A parallel block iterating over 500 customer IDs issues 500 Stripe calls, as many at a time as the concurrency limit allows. The execution completes when the last branch finishes, but by then the vendor charges have already been applied.
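
The sequential alternative mentioned above can be sketched as follows. It trades throughput for a running-total check before each call. The budget_cents argument and its default are assumptions, and the loop counts requested amounts rather than confirmed charges.

```yaml
# Sketch: sequential loop with a running-total check (replaces the parallel block).
- init_totals:
    assign:
      - total_cents: 0
      # budget_cents is an assumed runtime argument; 50000 is an illustrative default
      - budget_cents: ${default(map.get(args, "budget_cents"), 50000)}
- charge_sequentially:
    for:
      value: customer
      in: ${customers_response.body.customers}
      steps:
        - check_budget:
            switch:
              # Leave the loop before the call that would exceed the budget
              - condition: ${total_cents + customer.amount_cents > budget_cents}
                next: break
        - charge_one:
            call: http.post
            args:
              url: https://api.stripe.com/v1/payment_intents
              headers:
                Authorization: ${"Bearer " + sys.get_env("STRIPE_SECRET_KEY")}
              body:
                amount: ${customer.amount_cents}
                currency: usd
                customer: ${customer.id}
            result: charge_result
        - add_to_total:
            assign:
              - total_cents: ${total_cents + customer.amount_cents}
```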

Retries compound the amplification. A retry block on an HTTP step re-sends the call whenever its predicate matches the error, and a broad predicate matches any non-2xx response. A Stripe 500 returned after the charge was partially applied will then be retried, potentially creating a duplicate charge. Retries are only dedup-safe with a stable idempotency key derived from the execution ID and customer ID (not a random UUID, not a timestamp).
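
A retry-safe version of the charge step might look like the sketch below. It wraps the call in try/retry and sends a key built from the execution ID and customer ID in Stripe's Idempotency-Key header. The retry count and backoff values are illustrative assumptions, not recommendations, and the request body simply mirrors the billing example above.

```yaml
# Sketch: charge step with a bounded retry policy and a stable idempotency key.
- charge_one:
    try:
      call: http.post
      args:
        url: https://api.stripe.com/v1/payment_intents
        headers:
          Authorization: ${"Bearer " + sys.get_env("STRIPE_SECRET_KEY")}
          # Same value on every retry of this customer within this execution
          Idempotency-Key: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_EXECUTION_ID") + "-" + customer.id}
        body:
          amount: ${customer.amount_cents}
          currency: usd
          customer: ${customer.id}
      result: charge_result
    retry:
      # Built-in predicate; it retries only a fixed set of transient errors
      predicate: ${http.default_retry_predicate}
      max_retries: 3
      backoff:
        initial_delay: 1
        max_delay: 30
        multiplier: 2
```

Retrying other status codes would need a custom predicate, which makes the stable key more important, not less.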

Scoping vault keys per Cloud Workflows execution

main:
  params: [args]
  steps:
    - issue_vault_key:
        call: http.post
        args:
          url: https://proxy.keybrake.com/vault/keys
          headers:
            Authorization: ${"Bearer " + sys.get_env("KEYBRAKE_API_KEY")}
          body:
            vendor: stripe
            daily_usd_cap: ${default(map.get(args, "budget_usd"), 500)}
            allowed_endpoints:
              - POST /v1/payment_intents
            expires_in: 2h
            agent_run_label: ${"cloud-workflows/" + sys.get_env("GOOGLE_CLOUD_WORKFLOW_EXECUTION_ID")}
        result: vault_key_response
    - set_vault_key:
        assign:
          - vault_key: ${vault_key_response.body.vault_key}
    - fetch_customers:
        call: http.get
        args:
          url: https://your-api.run.app/customers
          auth:
            type: OIDC
          query:
            plan_id: ${args.plan_id}
        result: customers_response
    - charge_customers:
        parallel:
          for:
            value: customer
            in: ${customers_response.body.customers}
            steps:
              - charge_one:
                  call: http.post
                  args:
                    url: https://proxy.keybrake.com/stripe/v1/payment_intents
                    headers:
                      Authorization: ${"Bearer " + vault_key}
                      # Stripe reads idempotency keys from the Idempotency-Key header, not the request body
                      Idempotency-Key: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_EXECUTION_ID") + "-" + customer.id}
                    body:
                      amount: ${customer.amount_cents}
                      currency: usd
                      customer: ${customer.id}
                  result: charge_result

The issue_vault_key step runs before the parallel block and stores the vault key as a workflow variable. All parallel branches read the same vault key from the workflow variable — no per-branch key issuance, no per-branch cap. The cap accumulates atomically across all concurrent branches: if branches 1–499 have already charged $490 against a $500 cap, branch 500's call is rejected by the proxy before money moves.
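
When the cap is reached mid-fan-out, the remaining branches receive an error from the proxy. The sketch below shows one way a branch could absorb that error and record the skipped customer instead of failing the whole parallel step. It assumes the proxy answers a rejected call with a non-2xx status, which Workflows raises as an HttpError. The exact status code and response body are not specified on this page, so the handler matches on the error tag only. The rejected_customers variable is introduced for illustration, and the idempotency key is omitted for brevity.

```yaml
# Sketch: charge_customers rewritten so rejected calls are collected, not fatal.
- init_rejected:
    assign:
      - rejected_customers: []
- charge_customers:
    parallel:
      # Variables written from inside parallel branches must be declared shared
      shared: [rejected_customers]
      for:
        value: customer
        in: ${customers_response.body.customers}
        steps:
          - charge_one:
              try:
                call: http.post
                args:
                  url: https://proxy.keybrake.com/stripe/v1/payment_intents
                  headers:
                    Authorization: ${"Bearer " + vault_key}
                  body:
                    amount: ${customer.amount_cents}
                    currency: usd
                    customer: ${customer.id}
                result: charge_result
              except:
                as: e
                steps:
                  - classify_error:
                      switch:
                        # Assumption: any HttpError here is a proxy or vendor rejection
                        - condition: ${"HttpError" in e.tags}
                          steps:
                            - record_rejection:
                                assign:
                                  - rejected_customers: ${list.concat(rejected_customers, customer.id)}
                        # Anything else (for example a connection failure) is re-raised
                        - condition: true
                          steps:
                            - rethrow_other_errors:
                                raise: ${e}
```

After the parallel step, rejected_customers holds the IDs that were not charged, which a later step could log or hand to a follow-up run.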

The idempotency key combines the execution ID (stable across retries of the parallel block) with the customer ID. This makes retries safe: if a branch fails after Stripe applied a charge, the retry's duplicate call is deduped by Stripe using the same idempotency key. The agent_run_label includes the execution ID so every vendor call in the audit log is traceable to the specific workflow execution.

How Keybrake fits

Keybrake is the proxy layer between your Cloud Workflows HTTP steps and Stripe, Twilio, or Resend. The vault key issued in the first step replaces the full-access Stripe key previously stored in workflow variables or passed from Secret Manager. The real Stripe secret stays in Keybrake — it never appears in workflow variable logs or Cloud Logging HTTP call records. For parallel block fan-out, the vault key is passed as a shared workflow variable — all branches use the same key and the same cap accumulates across all concurrent calls. Revoking a runaway execution is a single DELETE /vault/keys/{key_id} call — effective on the next proxied request, with no execution cancellation command, no Secret Manager rotation, and no impact on other running executions.
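
The revoke call can itself be a small workflow that an operator or an alerting hook runs. This is a minimal sketch, assuming the key identifier is passed in as a key_id argument (how that identifier is obtained from the issuance response is not shown on this page) and that the same KEYBRAKE_API_KEY authorizes the request.

```yaml
# Sketch: standalone workflow that revokes one vault key.
main:
  params: [args]
  steps:
    - revoke_vault_key:
        call: http.delete
        args:
          # key_id is an assumed runtime argument
          url: ${"https://proxy.keybrake.com/vault/keys/" + args.key_id}
          headers:
            Authorization: ${"Bearer " + sys.get_env("KEYBRAKE_API_KEY")}
        result: revoke_response
    - done:
        # Return the HTTP status code of the revoke call
        return: ${revoke_response.code}
```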
