
Rust AI agent API key management: vault keys via Axum middleware and task-local storage

Rust AI agent backends typically store vendor API keys in an Arc<AppState> struct shared via Axum's State extractor, so every async Tokio task reads the same Stripe key from shared state. This is idiomatic Rust: Arc gives cheap clones across task boundaries, and sharing the state behind an Arc keeps the key immutable. But from Stripe's perspective, all concurrent agent tasks look identical. There's no per-task spend cap, no endpoint allowlist, and no way to revoke one runaway task's access without restarting the server. Vault keys issued via Axum middleware fix this, and Rust's Drop trait triggers revocation at a predictable point: when the last reference to the key is released at the end of the request.

TL;DR

Create a VaultKey struct that implements Drop to automatically revoke itself. Write an Axum middleware that issues the vault key, wraps it in an Arc, and inserts it into the request's extensions before the handler runs. Handlers extract it with Extension(vault_key): Extension<Arc<VaultKey>> and use vault_key.token as the Authorization bearer token for proxy.keybrake.com calls. When the Arc reference count hits zero at the end of the request, the Drop impl spawns a task that revokes the key.

The Rust AI agent tool backend pattern

The canonical Axum pattern for vendor API keys wraps them in shared application state:

use std::sync::Arc;
use axum::{extract::State, routing::post, Router};
use reqwest::Client;

#[derive(Clone)]
struct AppState {
    stripe_key: String, // shared Arc clone across ALL async tasks
    http: Client,
}

#[tokio::main]
async fn main() {
    let state = Arc::new(AppState {
        stripe_key: std::env::var("STRIPE_SECRET_KEY").unwrap(),
        http: Client::new(),
    });

    let app = Router::new()
        .route("/tools/charge", post(charge_handler))
        .with_state(Arc::clone(&state));

    // axum 0.7+: axum::Server was removed; bind a Tokio listener and use axum::serve
    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

async fn charge_handler(State(state): State<Arc<AppState>>) -> impl axum::response::IntoResponse {
    // Uses state.stripe_key — shared by every concurrent agent task
    let response = state.http
        .post("https://api.stripe.com/v1/payment_intents")
        .bearer_auth(&state.stripe_key)
        .form(&[("amount", "5000"), ("currency", "usd")])
        .send()
        .await;
    // ...
}

Every concurrent Tokio task that serves an agent tool call shares state.stripe_key. Stripe sees the same API key for all of them — no per-task attribution, no per-task spend limits, no way to cut off one specific task's access mid-run.

VaultKey with Drop-based revocation

Rust's ownership model is ideal for automatic vault key cleanup. Define a VaultKey struct whose Drop impl revokes the key when it goes out of scope:

use std::sync::Arc;
use reqwest::Client;

// No Clone: every copy would run Drop and revoke the key early. Share it via Arc instead.
#[derive(Debug)]
pub struct VaultKey {
    pub id: String,
    pub token: String,
    keybrake_token: String,
    http: Client,
}

impl VaultKey {
    pub async fn issue(
        http: &Client,
        keybrake_token: &str,
        label: &str,
        vendor: &str,
        daily_cap_usd: u32,
        allowed_endpoints: &[&str],
        expires_in: &str,
    ) -> anyhow::Result<Self> {
        let body = serde_json::json!({
            "label":             label,
            "vendor":            vendor,
            "daily_usd_cap":     daily_cap_usd,
            "allowed_endpoints": allowed_endpoints,
            "expires_in":        expires_in,
        });

        let resp = http
            .post("https://api.keybrake.com/v1/keys")
            .bearer_auth(keybrake_token)
            .json(&body)
            .timeout(std::time::Duration::from_millis(2000))
            .send()
            .await?;

        if !resp.status().is_success() {
            anyhow::bail!("keybrake issue: status {}", resp.status());
        }

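        let vk: serde_json::Value = resp.json().await?;

        // Fail issuance if either field is missing rather than build an unusable key
        let (Some(id), Some(token)) = (vk["id"].as_str(), vk["token"].as_str()) else {
            anyhow::bail!("keybrake issue: response missing id or token");
        };

        Ok(VaultKey {
            id:              id.to_string(),
            token:           token.to_string(),
            keybrake_token:  keybrake_token.to_string(),
            http:            http.clone(),
        })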
    }

    pub async fn revoke(&self) {
        let _ = self.http
            .delete(format!("https://api.keybrake.com/v1/keys/{}", self.id))
            .bearer_auth(&self.keybrake_token)
            .timeout(std::time::Duration::from_millis(2000))
            .send()
            .await;
        // Non-critical — TTL is the safety net; ignore errors
    }
}

// Drop can't be async, so it hands revocation to a fire-and-forget spawned task
impl Drop for VaultKey {
    fn drop(&mut self) {
        let http    = self.http.clone();
        let id      = self.id.clone();
        let token   = self.keybrake_token.clone();

        // tokio::spawn panics outside a runtime, so look up the current handle first.
        // If no runtime is available (e.g. during shutdown), the key's TTL is the safety net.
        if let Ok(handle) = tokio::runtime::Handle::try_current() {
            handle.spawn(async move {
                let _ = http
                    .delete(format!("https://api.keybrake.com/v1/keys/{id}"))
                    .bearer_auth(&token)
                    .timeout(std::time::Duration::from_millis(2000))
                    .send()
                    .await;
            });
        }
    }
}
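
The link between Arc reference counting and Drop can be checked without any network calls. The following is a minimal sketch using only the standard library: DemoKey, its revocation log, and simulate_request are hypothetical stand-ins, and the push into the log takes the place of the DELETE request the real VaultKey would send.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for the real key: records its id in a shared log on drop
/// instead of sending a DELETE request.
struct DemoKey {
    id: String,
    revoked: Arc<Mutex<Vec<String>>>, // hypothetical revocation log
}

impl Drop for DemoKey {
    fn drop(&mut self) {
        // Runs once, when the last Arc pointing at this key is released
        self.revoked.lock().unwrap().push(self.id.clone());
    }
}

/// Simulates one request: the middleware creates the key, the handler
/// receives a clone of the Arc, and both handles are released in turn.
/// Returns the number of revocations recorded after each release.
fn simulate_request(log: &Arc<Mutex<Vec<String>>>) -> (usize, usize) {
    let in_extensions = Arc::new(DemoKey {
        id: "vk_demo".to_string(),
        revoked: Arc::clone(log),
    });
    let in_handler = Arc::clone(&in_extensions);

    drop(in_extensions); // one owner gone, the key is still alive
    let after_first = log.lock().unwrap().len();

    drop(in_handler); // last owner gone: Drop fires here
    let after_second = log.lock().unwrap().len();

    (after_first, after_second)
}
```

The practical consequence is that any extra Arc clone, for example one moved into a background task, postpones revocation until that clone is released too.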

Axum middleware for per-request vault keys

Axum middleware runs as a tower::Layer wrapping each request. Issue the vault key here and attach it as an Extension:

use axum::{
    extract::{Request, State},
    middleware::Next,
    response::Response,
};
// Assumes AppState also carries a `keybrake_token: String` field next to `http`
use std::sync::Arc;

pub async fn vault_key_middleware(
    State(state): State<Arc<AppState>>,
    mut req: Request,
    next: Next,
) -> Response {
    let user_id = req.headers()
        .get("x-user-id")
        .and_then(|v| v.to_str().ok())
        .unwrap_or("unknown");
    let run_id = req.headers()
        .get("x-agent-run-id")
        .and_then(|v| v.to_str().ok())
        .unwrap_or("unknown");

    let label = format!("rust-{user_id}-{run_id}");

    match VaultKey::issue(
        &state.http,
        &state.keybrake_token,
        &label,
        "stripe",
        500,
        &["/v1/payment_intents", "/v1/payment_intents/*"],
        "5m",
    ).await {
        Ok(vk) => {
            req.extensions_mut().insert(Arc::new(vk));
        }
        Err(e) => {
            tracing::warn!("vault key issuance failed: {e}");
            // Fail open: proceed without vault key
        }
    }

    next.run(req).await
}

// Register middleware for agent tool routes only
let app = Router::new()
    .route("/tools/charge", post(charge_handler))
    .route_layer(axum::middleware::from_fn_with_state(
        Arc::clone(&state),
        vault_key_middleware,
    ))
    .with_state(state);

Handler extracting the vault key

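use axum::{extract::State, http::StatusCode, Extension, Json};
use serde::Deserialize;
use std::sync::Arc;

// Assumed request body shape: the handler only shows that it reads `amount`
#[derive(Deserialize)]
struct ChargeRequest {
    amount: u64,
}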

async fn charge_handler(
    Extension(vault_key): Extension<Arc<VaultKey>>,
    State(state): State<Arc<AppState>>,
    Json(body): Json<ChargeRequest>,
) -> Result<Json<serde_json::Value>, (StatusCode, String)> {

    let resp = state.http
        .post("https://proxy.keybrake.com/stripe/v1/payment_intents")
        .bearer_auth(&vault_key.token)
        .json(&serde_json::json!({
            "amount":   body.amount,
            "currency": "usd",
        }))
        .send()
        .await
        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

    // Capture the status first: resp.json() consumes the response, so the body can be read only once
    let status = resp.status();
    let result: serde_json::Value = resp.json().await
        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

    if status.as_u16() == 429 && result["code"] == "cap_exhausted" {
        return Err((StatusCode::PAYMENT_REQUIRED, "spend_cap_exceeded".into()));
    }

    Ok(Json(result))
    // Last Arc clone dropped here → Drop impl spawns revocation task
}
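
The status handling in the handler can also be pulled out into a pure function, which makes the mapping easy to unit test. This is a minimal sketch, assuming (as the handler does) that the proxy signals an exhausted cap with HTTP 429 and a JSON `code` of "cap_exhausted". ProxyOutcome and classify_proxy_response are hypothetical names, not part of any library.

```rust
/// Outcome of inspecting a proxy response before forwarding it.
#[derive(Debug, PartialEq)]
enum ProxyOutcome {
    /// Pass the upstream JSON body through to the caller.
    Forward,
    /// The vault key's spend cap is used up; the handler maps this to 402.
    SpendCapExceeded,
}

/// Hypothetical helper: classifies a proxy response by its HTTP status and
/// the optional `code` field of its JSON body.
fn classify_proxy_response(status: u16, code: Option<&str>) -> ProxyOutcome {
    match (status, code) {
        (429, Some("cap_exhausted")) => ProxyOutcome::SpendCapExceeded,
        // Any other combination is passed through unchanged
        _ => ProxyOutcome::Forward,
    }
}
```

In the handler this would be called as classify_proxy_response(status.as_u16(), result["code"].as_str()) once the body has been parsed.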

Rust async pattern	Vault key propagation	Revocation mechanism
Axum HTTP handler	`Extension(vault_key): Extension<Arc<VaultKey>>`	Arc drop at handler return → spawned async revoke task
Tokio task (spawned worker)	Pass `Arc<VaultKey>` as task parameter	Arc drop when task completes (or on `JoinHandle` abort)
tonic gRPC service	Intercept in `tower::Layer`, insert into `Extensions` on `Request<B>`	Arc drop at end of service handler
Temporal Rust SDK activity	Issue in activity function, pass by reference to sub-calls	Explicit revoke + Drop at end of activity function
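
Where explicit revocation and Drop are combined, as in the last row, the key would be revoked twice unless the two paths coordinate. The following is one possible sketch of an idempotent guard using only the standard library. GuardedKey, its call counter, and run_activity are hypothetical, and the counter increment stands in for the DELETE request.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

/// Stand-in key that counts revocations instead of calling the API.
struct GuardedKey {
    revoked: AtomicBool,
    calls: Arc<AtomicUsize>, // hypothetical counter standing in for DELETE requests
}

impl GuardedKey {
    fn new(calls: Arc<AtomicUsize>) -> Self {
        Self { revoked: AtomicBool::new(false), calls }
    }

    /// Explicit revocation; only the first call performs the work.
    fn revoke(&self) {
        // swap returns the previous value, so `false` means this call is first
        if !self.revoked.swap(true, Ordering::AcqRel) {
            self.calls.fetch_add(1, Ordering::Relaxed);
        }
    }
}

impl Drop for GuardedKey {
    fn drop(&mut self) {
        // Safety net: does nothing if revoke() already ran
        self.revoke();
    }
}

/// Runs a fake activity that optionally revokes explicitly, then drops the key.
/// Returns how many revocations were performed in total.
fn run_activity(explicit: bool) -> usize {
    let calls = Arc::new(AtomicUsize::new(0));
    {
        let key = GuardedKey::new(Arc::clone(&calls));
        if explicit {
            key.revoke();
        }
    } // key dropped here
    calls.load(Ordering::Relaxed)
}
```

Either way exactly one revocation is performed, so the explicit call gives timely cleanup while Drop covers early returns and panics.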


Related questions

Does this work with the async-stripe crate, or only raw reqwest calls?

The vault key pattern as shown uses raw reqwest calls to proxy.keybrake.com; it doesn't integrate with the async-stripe crate directly. The async-stripe crate builds its own HTTP client and points it at api.stripe.com. You can override the base URL by constructing the client with a custom URL (stripe::Client::from_url), but this is more complex than just using reqwest directly. For agent tool backends, raw HTTP to proxy.keybrake.com is cleaner: the JSON shapes are the same as Stripe's API, and you get vault key enforcement without SDK configuration gymnastics.

Should I use Axum request extensions or task_local! for storing the vault key?

Request extensions (the Extension extractor) are the right choice for per-request vault keys in an HTTP handler context, because they hold values that exist for the duration of a single request. Axum's State, by contrast, is application-wide and shared by every request. tokio::task_local! is useful when you want a value to be implicitly available throughout a task without passing it as a parameter, but it requires careful scoping with TASK_LOCAL.scope(value, future), it is not inherited by tasks started with tokio::spawn, and it doesn't integrate as cleanly with Axum's extractor system. Stick with Extension<Arc<VaultKey>> populated in middleware: the dependency is visible in handler signatures, it composes naturally with other Axum extractors, and the Arc reference count drives cleanup automatically.

How do I handle the case where the vault key extension is missing from the request?

If vault key issuance fails and the middleware fails open (doesn't insert the extension), a handler that takes Extension(vault_key): Extension<Arc<VaultKey>> never runs: Axum rejects the request with a 500 because the required extension is missing. To fall back gracefully instead, declare the parameter as vault_key: Option<Extension<Arc<VaultKey>>>. Then match on Some vs None to decide whether to use the proxy or fall back to the shared key. This gives you a gradual rollout path: start with fail-open (optional extension + fallback), monitor the fallback rate, then switch to fail-closed (required extension) once issuance is stable.
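
The Some/None decision can be kept in one small function so the handler body stays the same on both paths. This is a minimal standard-library sketch: the VaultKey here is a reduced stand-in with only the token field, and Upstream and choose_upstream are hypothetical names. The base URLs are the ones used in the examples above.

```rust
use std::sync::Arc;

/// Reduced stand-in for VaultKey (only the field used here).
struct VaultKey {
    token: String,
}

/// Where the handler should send the Stripe call, and with which credential.
#[derive(Debug, PartialEq)]
struct Upstream {
    base_url: &'static str,
    bearer: String,
    via_proxy: bool,
}

/// Hypothetical helper: prefer the per-request vault key, otherwise fall back
/// to the shared Stripe key (the fail-open rollout phase).
fn choose_upstream(vault_key: Option<Arc<VaultKey>>, shared_key: &str) -> Upstream {
    match vault_key {
        Some(vk) => Upstream {
            base_url: "https://proxy.keybrake.com/stripe",
            bearer: vk.token.clone(),
            via_proxy: true,
        },
        None => Upstream {
            base_url: "https://api.stripe.com",
            bearer: shared_key.to_string(),
            via_proxy: false,
        },
    }
}
```

In a handler, the optional extractor can be unwrapped with vault_key.map(|Extension(vk)| vk) before calling the helper, and the via_proxy flag is a natural place to hang a fallback-rate metric.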