Google Cloud Workflows AI agent API key: scoping vendor calls in managed GCP workflows

Google Cloud Workflows is a fully managed serverless workflow service that executes YAML-defined orchestration logic: HTTP steps, parallel branches, condition blocks, retry policies, and variable passing between steps. AI agent teams running on GCP adopt Cloud Workflows because it handles durable execution, automatic retries, and parallel fan-out without managing infrastructure. When those workflow steps call Stripe, Twilio, or Resend, the same reliability features become vendor spend amplifiers. A parallel block fans out into N HTTP calls, each making a vendor API request. A try/retry block can re-send a vendor call that already reached Stripe, which risks a duplicate charge unless the call carries an idempotency key. And Cloud Workflows has no built-in per-execution dollar cap. This page covers the vault-key pattern that bounds vendor spend per Cloud Workflows execution.

TL;DR

Issue a vault key in the first step of your workflow via an HTTP call to Keybrake, store it as a workflow variable, and use it in every downstream HTTP step that calls Stripe, Twilio, or Resend. Parallel branches read the same workflow variable, so all branches share one vault key and one cap, which accumulates atomically across concurrent branches. The real vendor secret stays in Keybrake (or in Secret Manager, fetched only at proxy startup), never in workflow variables or execution logs. Revoking a runaway workflow execution is a single DELETE /vault/keys/{key_id} call: no workflow termination command, no Secret Manager rotation, no infrastructure change.

How Cloud Workflows AI agent workflows call vendor APIs

A typical agent workflow uses Cloud Workflows HTTP steps to call vendor APIs directly. Here is a billing workflow that charges a list of customers using Stripe:

main:
  params: [args]
  steps:
    - fetch_customers:
        call: http.get
        args:
          url: https://your-api.run.app/customers
          auth:
            type: OIDC
          query:
            plan_id: ${args.plan_id}
        result: customers_response
    - charge_customers:
        parallel:
          for:
            value: customer
            in: ${customers_response.body.customers}
            steps:
              - charge_one:
                  call: http.post
                  args:
                    url: https://api.stripe.com/v1/payment_intents
                    headers:
                      # Full-access Stripe secret, readable by every parallel branch
                      Authorization: ${"Bearer " + sys.get_env("STRIPE_SECRET_KEY")}
                    body:
                      amount: ${customer.amount_cents}
                      currency: usd
                      customer: ${customer.id}
                  result: charge_result

This pattern has two compounding risks. First, if customers_response.body.customers contains 3,000 records due to a query bug, the parallel block works through all 3,000 Stripe calls, limited only by the service's concurrency limits and never by cost. Second, Cloud Workflows' parallel block has no native dollar-spend stop condition: it runs every iteration regardless of cumulative cost. If you add a retry policy to charge_one, each failed branch retries independently. A Stripe 500 returned after the charge was already applied is then retried without an idempotency key, creating a duplicate charge.

Three gaps Cloud Workflows' native tooling doesn't fill for vendor spend control

Gap 1: No per-execution spend cap
What happens in practice: Cloud Workflows has no built-in mechanism to stop an execution when cumulative vendor API spend reaches a dollar threshold. The parallel block runs all iterations to completion (or failure). Cloud Billing budget alerts can lag spend by hours, which is too slow to stop a runaway execution that completes in minutes.
Cloud Workflows' answer: Execution and HTTP step timeouts cap wall-clock time, not dollars spent. Cloud Billing budgets send email alerts after the fact. No pre-call, per-execution dollar cap in the Workflows runtime.

Gap 2: No mid-execution vendor revoke without Secret Manager rotation
What happens in practice: The Stripe API key is typically loaded from Secret Manager at workflow execution start or injected as a runtime argument. Rotating the Secret Manager secret version prevents new executions from fetching the old key, but already-running executions that cached the secret in a workflow variable continue using it until the execution completes. Terminating the execution via the Cloud Workflows API (gcloud workflows executions cancel) sends a cancellation signal, but parallel branches that are already dispatching HTTP calls may complete before the cancellation propagates.
Cloud Workflows' answer: Execution cancellation is available via CLI or API. No per-execution API key scoping or mid-execution vendor termination that takes effect on in-flight HTTP steps.

Gap 3: No per-call audit with workflow execution context
What happens in practice: Cloud Logging captures Workflows execution events (step names, plus HTTP request/response bodies if --call-log-level=log-all-calls is set). But it doesn't parse dollar amounts from Stripe responses, correlate Stripe PaymentIntent.id values with the Workflows execution ID and step name in a structured cost table, or provide a queryable per-execution spend summary. Reconstructing what a runaway execution charged requires cross-referencing Cloud Logging (Workflows events) with the Stripe dashboard, matching on timestamps since no shared identifier is propagated.
Cloud Workflows' answer: Cloud Logging with call logging for all calls enabled records HTTP step inputs and outputs. No structured vendor cost tracking or execution-ID-to-charge correlation.

The parallel branch amplification risk

Cloud Workflows' parallel block is the primary fan-out mechanism and the primary spend amplifier. In a sequential for loop you could add a running-total check between calls. Parallel branches execute concurrently, and nothing in the loop compares cumulative spend against a budget before the next call goes out. Every branch uses the same full-access Stripe key. A parallel block iterating over 500 customer IDs dispatches all 500 Stripe calls, many of them concurrently. The execution completes when the last branch finishes, but by then the vendor charges have already been applied.
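For contrast, here is a minimal sketch of the running-total check that a sequential for loop allows. The variable names total_cents and budget_cents and the 50000-cent ceiling are assumptions made for illustration; the customer fields follow the billing example above.

```yaml
# Sketch only: a sequential alternative to the parallel block.
# total_cents and budget_cents are hypothetical names, not part of the original workflow.
- init_budget:
    assign:
      - total_cents: 0
      - budget_cents: 50000   # assumed $500 ceiling for this execution
- charge_sequentially:
    for:
      value: customer
      in: ${customers_response.body.customers}
      steps:
        - check_budget:
            switch:
              # Leave the loop before the call that would exceed the ceiling
              - condition: ${total_cents + customer.amount_cents > budget_cents}
                next: break
        - charge_one:
            call: http.post
            args:
              url: https://api.stripe.com/v1/payment_intents
              headers:
                Authorization: ${"Bearer " + sys.get_env("STRIPE_SECRET_KEY")}
              body:
                amount: ${customer.amount_cents}
                currency: usd
                customer: ${customer.id}
            result: charge_result
        - add_to_total:
            # Counts the requested amount once the call has returned without error
            assign:
              - total_cents: ${total_cents + customer.amount_cents}
```

The trade-off is throughput: the loop issues one call at a time, and the check lives inside the workflow itself, so it protects only this loop and only as long as nobody converts it back to a parallel block.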

Retries compound this. A retry block on an HTTP step re-sends the call whenever the error matches its retry predicate. A Stripe 500 returned after the charge was partially applied will be retried, potentially creating a duplicate charge. Retries are only dedup-safe with a stable idempotency key derived from the execution ID and customer ID (not a random UUID, not a timestamp).
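A stable key of that shape could look like the following sketch, written against the direct Stripe call from the billing example. The retry settings (max_retries and the backoff values) are illustrative assumptions, and the key assumes customer.id is a string.

```yaml
# Sketch only: charge_one wrapped in try/retry with a stable idempotency key.
- charge_one:
    try:
      call: http.post
      args:
        url: https://api.stripe.com/v1/payment_intents
        headers:
          Authorization: ${"Bearer " + sys.get_env("STRIPE_SECRET_KEY")}
          # Same value on every retry for this customer within this execution
          Idempotency-Key: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_EXECUTION_ID") + "-" + customer.id}
        body:
          amount: ${customer.amount_cents}
          currency: usd
          customer: ${customer.id}
      result: charge_result
    retry:
      # Built-in predicate; a custom predicate is needed to retry other status codes
      predicate: ${http.default_retry_predicate}
      max_retries: 3          # assumed value
      backoff:
        initial_delay: 2      # seconds, assumed
        max_delay: 30         # seconds, assumed
        multiplier: 2
```

Because the key is built from the execution ID and the customer ID, a retried call carries the same key as the original attempt, while a new execution for the same customer gets a different key.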

Scoping vault keys per Cloud Workflows execution

main:
  params: [args]
  steps:
    - issue_vault_key:
        call: http.post
        args:
          url: https://proxy.keybrake.com/vault/keys
          headers:
            Authorization: ${"Bearer " + sys.get_env("KEYBRAKE_API_KEY")}
          body:
            vendor: stripe
            daily_usd_cap: ${default(map.get(args, "budget_usd"), 500)}
            allowed_endpoints:
              - POST /v1/payment_intents
            expires_in: 2h
            agent_run_label: ${"cloud-workflows/" + sys.get_env("GOOGLE_CLOUD_WORKFLOW_EXECUTION_ID")}
        result: vault_key_response
    - set_vault_key:
        assign:
          - vault_key: ${vault_key_response.body.vault_key}
    - fetch_customers:
        call: http.get
        args:
          url: https://your-api.run.app/customers
          auth:
            type: OIDC
          query:
            plan_id: ${args.plan_id}
        result: customers_response
    - charge_customers:
        parallel:
          for:
            value: customer
            in: ${customers_response.body.customers}
            steps:
              - charge_one:
                  call: http.post
                  args:
                    url: https://proxy.keybrake.com/stripe/v1/payment_intents
                    headers:
                      Authorization: ${"Bearer " + vault_key}
                      # Stripe reads idempotency keys from this header, not from the request body
                      Idempotency-Key: ${sys.get_env("GOOGLE_CLOUD_WORKFLOW_EXECUTION_ID") + "-" + customer.id}
                    body:
                      amount: ${customer.amount_cents}
                      currency: usd
                      customer: ${customer.id}
                  result: charge_result

The issue_vault_key step runs before the parallel block and stores the vault key as a workflow variable. All parallel branches read the same vault key from that variable: no per-branch key issuance, no per-branch cap. The cap accumulates atomically across all concurrent branches. If branches 1–499 have already charged $490 against a $500 cap, a call from branch 500 that would exceed the remaining $10 is rejected by the proxy before money moves.

The idempotency key combines the execution ID (stable across retries of the parallel block) with the customer ID. This makes retries safe: if a branch fails after Stripe applied a charge, the retry's duplicate call is deduped by Stripe using the same idempotency key. The agent_run_label includes the execution ID so every vendor call in the audit log is traceable to the specific workflow execution.

How Keybrake fits

Keybrake is the proxy layer between your Cloud Workflows HTTP steps and Stripe, Twilio, or Resend. The vault key issued in the first step replaces the full-access Stripe key previously stored in workflow variables or passed from Secret Manager. The real Stripe secret stays in Keybrake — it never appears in workflow variable logs or Cloud Logging HTTP call records. For parallel block fan-out, the vault key is passed as a shared workflow variable — all branches use the same key and the same cap accumulates across all concurrent calls. Revoking a runaway execution is a single DELETE /vault/keys/{key_id} call — effective on the next proxied request, with no execution cancellation command, no Secret Manager rotation, and no impact on other running executions.

Related questions

How do I pass the vault key to parallel branches in Cloud Workflows?

Workflow variables defined before the parallel block are accessible inside parallel iterations — the vault_key variable set by the assign step is readable from inside the parallel for-each loop's steps. You don't need to pass it as an explicit argument; simply reference ${vault_key} inside the branch step definitions. All branches share the same variable, which means they share the same vault key and the same cap. Don't issue a separate vault key per branch — that creates N independent caps and defeats the purpose of bounding total spend for the execution.

How do I handle cap exhaustion inside a Cloud Workflows try/retry block?

When the proxy returns a 429 due to cap exhaustion, the HTTP step raises an error for the non-2xx response. If the step has a retry policy that matches 429, Cloud Workflows retries the call, but the cap is still exhausted, so the retry gets a 429 as well. To prevent a retry storm, distinguish cap exhaustion from transient errors: check for the X-Keybrake-Cap-Hit: true response header on the error, or inspect the response body. Make that check in the retry predicate so it returns false for a cap hit, because the except block only runs once retries are finished. In the except block, raise a non-retryable exception if the error is a cap hit. Cap exhaustion is an intentional stop, and retrying it just burns time before getting the same result.
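A minimal sketch of that split between transient errors and cap hits follows. The subworkflow names charge_with_cap_guard and retry_unless_cap_hit are hypothetical, and so are the retry settings. It also assumes the header surfaces on the error map as headers["X-Keybrake-Cap-Hit"] with the string value "true"; verify the exact key casing against a real response before relying on it.

```yaml
# Sketch only: both subworkflows sit at the top level, alongside main.
charge_with_cap_guard:
  params: [vault_key, customer]
  steps:
    - charge:
        try:
          call: http.post
          args:
            url: https://proxy.keybrake.com/stripe/v1/payment_intents
            headers:
              Authorization: ${"Bearer " + vault_key}
            body:
              amount: ${customer.amount_cents}
              currency: usd
              customer: ${customer.id}
          result: charge_result
        retry:
          predicate: ${retry_unless_cap_hit}
          max_retries: 3        # assumed value
          backoff:
            initial_delay: 2    # seconds, assumed
            max_delay: 30       # seconds, assumed
            multiplier: 2
        except:
          as: e
          steps:
            - stop_on_cap_hit:
                switch:
                  - condition: ${map.get(e, ["headers", "X-Keybrake-Cap-Hit"]) == "true"}
                    steps:
                      - raise_cap_hit:
                          raise: "vendor spend cap reached"
            - rethrow_other_errors:
                raise: ${e}
    - done:
        return: ${charge_result}

retry_unless_cap_hit:
  params: [e]
  steps:
    - classify:
        switch:
          # Errors without an HTTP status code are not retried here
          - condition: ${not("code" in e)}
            return: false
          # A cap hit is deliberate; retrying would only repeat the 429
          - condition: ${map.get(e, ["headers", "X-Keybrake-Cap-Hit"]) == "true"}
            return: false
          - condition: ${e.code == 429 or e.code >= 500}
            return: true
    - otherwise:
        return: false
```

Inside the parallel loop, the charge step would then become a call to charge_with_cap_guard with vault_key and customer passed as arguments. Retrying 5xx responses is only safe here if the call also carries a stable idempotency key.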

What vault key TTL should I use for long-running Cloud Workflows executions?

Set expires_in to the expected maximum execution duration plus a 20% safety margin. A billing workflow that processes 10,000 customers with parallel branches might take 5–15 minutes; 15 minutes plus 20% is 18 minutes, so expires_in: "30m" leaves comfortable headroom. If your workflow has human approval steps or waits for external events (for example, with callbacks and events.await_callback), the execution can pause for hours or days. In that case, issue the vault key after the execution resumes from the approval step, not at the start. A vault key that expires mid-execution causes all subsequent vendor calls to get 401 errors. The alternative is to set a generous TTL (e.g. expires_in: "24h") and explicitly revoke the key when the execution completes normally.
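The explicit revoke could be a final step like the sketch below, appended after charge_customers in main. The key_id field on the issuance response is an assumption, since the response shape is not shown on this page; the host and Authorization header mirror the issue_vault_key step.

```yaml
# Sketch only: revoke the vault key once the charges have finished.
- revoke_vault_key:
    call: http.delete
    args:
      # key_id is an assumed field name in the issuance response body
      url: ${"https://proxy.keybrake.com/vault/keys/" + vault_key_response.body.key_id}
      headers:
        Authorization: ${"Bearer " + sys.get_env("KEYBRAKE_API_KEY")}
    result: revoke_response
```

As written, this step runs only when the preceding steps succeed. If the execution fails earlier, the key stays valid until expires_in elapses, so the TTL still acts as the backstop.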
