
AI agent Twilio security: four controls that prevent the $1,200 SMS bill

Keybrake · May 31, 2026 · 9 min read

Handing an AI agent your Twilio key is equivalent to handing it a corporate credit card with no limit, valid in every country, that charges $0.0082 per US message and up to $0.30 per premium-route international message. Twilio has no per-key spend cap. Your agent has no retry budget. The $1,200 SMS bill is not a hypothetical — it's what a stuck retry loop at international rates looks like by the time anyone checks the dashboard.

This post covers three failure modes that produce runaway Twilio spend, why Twilio's own safety features don't prevent them, and the four controls that do — with the specific Keybrake proxy configuration for each.

How agents call Twilio today

The standard pattern for wiring Twilio SMS into an AI agent uses the Twilio Python helper library inside a LangChain tool, a CrewAI tool, or an OpenAI function-calling handler. A minimal support agent that sends confirmation texts looks something like this:

import os

from twilio.rest import Client
from langchain.tools import BaseTool

# Full-access account credentials, read from the environment
twilio = Client(
    os.environ["TWILIO_ACCOUNT_SID"],
    os.environ["TWILIO_AUTH_TOKEN"]
)

class SendSMSTool(BaseTool):
    # BaseTool is a pydantic model, so overridden fields need type annotations
    name: str = "send_sms"
    description: str = (
        "Send a text message to a customer's mobile number. "
        "Use the customer's E.164 phone number as 'to'."
    )

    def _run(self, to: str, body: str) -> str:
        msg = twilio.messages.create(
            to=to,
            from_=os.environ["TWILIO_FROM_NUMBER"],
            body=body
        )
        return f"Sent: {msg.sid}"

This is clean and idiomatic. The tool description is tight, the return value gives the agent a receipt, and the pattern composes naturally with any orchestration framework. The problem isn't the code. The problem is what TWILIO_AUTH_TOKEN actually grants once the agent starts running unsupervised.

A standard Twilio auth token gives the holder full account access: send SMS to any number, any country, any volume; make voice calls; provision new phone numbers; access recordings; query all message history. Even a purpose-built SMS agent with a tool definition that only exposes send_sms is one bad prompt — or one network error — away from a four-digit bill.

Three ways the bill gets out of hand

Failure mode 1 of 3

The retry storm

The agent sends a confirmation SMS. The network call to api.twilio.com times out before a response arrives. The agent's retry logic fires, whether it is built into the tool, the framework, or the LLM's own "I didn't get a success response, so I'll try again" reasoning. Twilio received the original call and queued the message; the retry is a new, duplicate message. The agent sees another timeout. It retries again. By the time the connection stabilizes, the same message has been sent four to six times to the same recipient. At $0.0082 per US message this is noise; at $0.0877 per message to a mobile number in the UK, six messages per customer × 5,000 customers in a batch = $2,631 in total sends, of which $2,192.50 (five of every six messages) is pure duplication.

Failure mode 2 of 3

International routing bleed

Your agent is built to text US customers. Its input data (a CRM export, a ticket queue, an order list) is supposed to contain only US numbers in E.164 format starting with +1. Someone adds a UK customer. Or a test record with a Nigerian number slips through data validation. Or the agent is handed a phone number from user input that was never validated. The agent calls send_sms with a +234 number. Twilio routes it. The SMS costs $0.0551, which is 6.7× the US rate. Rates vary widely by destination: India runs $0.0098, and premium-rate routes in the Caribbean run up to $0.30 per message. A batch of 10,000 messages at the expected $0.0082 costs $82. The same batch with 200 international numbers accidentally included costs between about $91 (all 200 to Nigeria) and about $140 (all 200 on $0.30 premium routes), and a larger stray share or a retry loop multiplies the overage. There is no way to express "this Twilio key may only send to +1 numbers" in Twilio's Auth Token configuration.
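A short sketch for working out what a mixed batch costs, using only the per-message rates quoted in this post. The `RATES` table and the `batch_cost` helper are made up for illustration, and real Twilio pricing changes over time.

```python
# Per-message rates quoted in this post (USD). Illustrative, not a price list.
RATES = {"US": 0.0082, "NG": 0.0551, "IN": 0.0098, "PREMIUM": 0.30}


def batch_cost(counts: dict[str, int]) -> float:
    """Total cost in USD of a batch, given message counts per destination."""
    return round(sum(RATES[dest] * n for dest, n in counts.items()), 2)


# The expected all-US batch, then the same batch with 200 stray numbers.
expected = batch_cost({"US": 10_000})
with_nigeria = batch_cost({"US": 9_800, "NG": 200})
with_premium = batch_cost({"US": 9_800, "PREMIUM": 200})

print(expected, with_nigeria, with_premium)
```

The overage grows linearly with the number of stray destinations and with every retry, which is why a per-destination check is cheaper than discovering the mix on the invoice.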

Failure mode 3 of 3

The unsubscribed-list broadcast

A marketing agent is told to "notify opted-in customers about the new feature." The data pipeline query returns all customers — not just opted-in ones — because a WHERE clause was dropped or because the opted-in flag isn't on the table the agent queries. The agent sends 50,000 messages instead of 2,000. At $0.0082 per message, that's $410 in unexpected spend before anyone checks the Twilio console. More critically, texting unsubscribed users in the US violates TCPA regulations, with per-message penalties up to $1,500 in a class-action context. The financial risk from a regulatory violation can dwarf the direct Twilio cost. There is no mechanism in Twilio to express "this key may only send to numbers that appear in this opted-in table."

All three failure modes have the same structural shape: the Twilio auth token grants capabilities (send to any number, any volume), and nothing between the agent and Twilio's API enforces constraints (spend limit, destination scope, deduplication). The agent tool definition tells the model when to send an SMS. It cannot tell Twilio how much damage a bad call can do.

Why Twilio's own safety features don't close the gap

Twilio ships several features that look relevant here. None of them solve the problem at the right layer:

Twilio spend alerts — Dashboard spend alerts fire an email when your account crosses a dollar threshold you configure. By the time the email arrives — typically 15–30 minutes after the threshold is crossed — a retry storm or batch broadcast has already completed. The alert is post-hoc; it doesn't stop the next message from sending.
Sub-accounts — You can create a Twilio sub-account per agent and cap its funding. This is the closest Twilio gets to per-agent spend control: fund the sub-account with $20, and the agent can send at most ~2,400 US messages before running dry. The limitation is operational complexity: sub-accounts require manual provisioning, each needs its own credentials, and they have no per-call audit log accessible without custom querying. For a team running 10+ agents, managing 10+ sub-accounts with individually funded balances is significant overhead.
Geographic Permissions — Twilio lets you disable entire world regions at the account level (block all messages to Africa, for example). This is coarse-grained and permanent — it's not a per-agent, per-run control, and it doesn't protect against the retry storm or the unsubscribed-list broadcast. It's also a dashboard-only setting, not configurable in code.
Messaging Services — Twilio Messaging Services provide a pool of numbers and add opt-out handling (STOP keyword → do-not-contact list), but that opt-out list lives in Twilio's infrastructure and only contains users who replied STOP. An agent that queries your CRM for opted-in users and then sends to everyone on the CRM list isn't protected by Twilio's STOP handling: users who unsubscribed through any other channel never texted STOP, so Twilio has no record of them and delivers the message anyway.
SDK-layer rate limiting — You can add a sleep or a semaphore to the _run method to limit calls per second. This controls throughput but not dollar exposure: a burst of 100 messages per second for three seconds is 300 messages, but 300 international-rate messages at $0.30 each is $90. Rate limiting counts calls, not dollars.

Four controls that actually prevent the damage

The common thread in all three failure modes is that enforcement needs to happen at the API call layer — after the agent constructs the request, before Twilio receives it. That's the proxy layer. Here are the four controls, mapped to the failure modes they address:

Control 1 of 4

Per-day USD spend cap

Not a message count — a dollar amount. The proxy parses Twilio's response body, which includes "price": "-0.0085" (negative, per Twilio convention) and "price_unit": "USD" on every sent message. The proxy accumulates the parsed cost toward a daily cap. When the cap is hit, the next POST /Messages call returns 429 before it reaches Twilio. The cap resets at UTC midnight.

This limits the damage from failure modes 2 and 3 rather than preventing them outright: international routing bleed runs into the cap long before it becomes a four-figure bill, and an unsubscribed-list broadcast stops mid-batch at the cap rather than completing the full 50,000-message run.
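One way the cap logic could look, as a minimal in-memory sketch rather than Keybrake's actual implementation. The `DailySpendCap` class and its method names are hypothetical; the only things taken from the text are the `price` and `price_unit` fields, the negative-price convention, and the UTC-midnight reset. A response that carries no price yet is skipped here rather than guessed.

```python
from datetime import date, datetime, timezone
from decimal import Decimal
from typing import Optional


class DailySpendCap:
    """Tracks spend per UTC day and refuses calls once the cap is reached."""

    def __init__(self, cap_usd: str) -> None:
        self.cap = Decimal(cap_usd)
        self.day: Optional[date] = None
        self.spent = Decimal("0")

    def _roll(self, now: datetime) -> None:
        today = now.astimezone(timezone.utc).date()
        if today != self.day:  # new UTC day: reset the counter
            self.day, self.spent = today, Decimal("0")

    def allow(self, now: datetime) -> bool:
        """Check before forwarding. False would map to an HTTP 429."""
        self._roll(now)
        return self.spent < self.cap

    def record(self, response: dict, now: datetime) -> None:
        """Add the cost reported in a Twilio-style response body."""
        self._roll(now)
        price = response.get("price")
        if price is not None and response.get("price_unit") == "USD":
            self.spent += abs(Decimal(price))  # price arrives as a negative string


# Usage: a tiny $0.02 cap so the cutoff is visible after a few messages.
cap = DailySpendCap("0.02")
noon = datetime(2026, 5, 31, 12, 0, tzinfo=timezone.utc)
sent = 0
for _ in range(5):
    if cap.allow(noon):
        cap.record({"price": "-0.0085", "price_unit": "USD"}, noon)
        sent += 1

blocked_today = not cap.allow(noon)
allowed_tomorrow = cap.allow(datetime(2026, 6, 1, 0, 0, tzinfo=timezone.utc))
```

Because the check runs before the call and the cost is only known after it, the last permitted message can push spend slightly past the cap. `Decimal` keeps the running total free of floating-point drift.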

Control 2 of 4

Destination prefix allowlist

The proxy inspects the To field in every POST /Messages request. If the destination number's E.164 prefix doesn't match the allowlist, the call is rejected with 403 before forwarding. A US-only agent gets "allowed_prefixes": ["+1"] in its policy; a UK-US agent gets ["+1", "+44"]. An accidentally included +234 number in the batch is blocked at the proxy — it never reaches Twilio, never incurs a charge, and the agent receives a clear policy-violation error it can log and surface.

This directly closes failure mode 2. It also provides defense-in-depth against a compromised agent that might try to exfiltrate data via SMS to an attacker-controlled number outside the allowed prefix range.
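The prefix check itself is small. The sketch below assumes the proxy has already extracted the To field from the request; `check_destination` is a hypothetical helper name, and the status codes follow the behavior described above.

```python
def check_destination(to: str, allowed_prefixes: list[str]) -> tuple[int, str]:
    """Return (status, reason): 200 to forward the call, 403 to reject it."""
    number = to.strip().replace(" ", "")
    # Reject anything that is not a plus sign followed by digits only.
    if not (number.startswith("+") and number[1:].isdigit()):
        return 403, "destination is not in E.164 format"
    if not any(number.startswith(prefix) for prefix in allowed_prefixes):
        return 403, "destination prefix is not in the allowlist"
    return 200, "ok"


us_only = ["+1"]
ok = check_destination("+14155550123", us_only)
nigeria = check_destination("+2348012345678", us_only)
malformed = check_destination("4155550123", us_only)
```

One caveat worth keeping in mind when writing the policy: "+1" covers the whole North American Numbering Plan, which includes several Caribbean countries, so a strictly US-only agent may want longer prefixes than the country code alone.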

Control 3 of 4

Deduplication window

The proxy maintains a short-TTL deduplication cache keyed on (vault_key, to_number, body_hash). If the same (destination, body) pair arrives within a configurable window (default: 60 seconds), the proxy returns a synthetic success response — the same structure as a real Twilio success, with a note in the response metadata — without forwarding to Twilio. The agent's retry logic receives the success it was waiting for; no duplicate message is sent; no duplicate charge is incurred.

This closes failure mode 1. For agents using LangChain's built-in retry handlers or frameworks that retry on timeout, the deduplication window eliminates the retry-storm cost without requiring any change to the agent's retry configuration.
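A minimal sketch of the deduplication cache, assuming a single proxy process and an in-memory dictionary; a production proxy would more likely use a shared store with key expiry. The `DedupWindow` class is hypothetical. The key shape (vault_key, to_number, body_hash) and the 60-second default come from the description above.

```python
import hashlib
import time
from typing import Callable


class DedupWindow:
    """Remembers recent (vault_key, to_number, body_hash) keys for a short TTL."""

    def __init__(self, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window = window_seconds
        self.clock = clock
        self._seen: dict[tuple[str, str, str], float] = {}

    def is_duplicate(self, vault_key: str, to_number: str, body: str) -> bool:
        now = self.clock()
        # Drop expired entries so the cache stays small.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.window}
        body_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
        key = (vault_key, to_number, body_hash)
        if key in self._seen:
            return True  # the caller returns a synthetic success, no forward
        self._seen[key] = now
        return False


# Usage with a fake clock so the window can be stepped through by hand.
fake_now = [0.0]
dedup = DedupWindow(60, clock=lambda: fake_now[0])
args = ("vault_key_demo", "+14155550123", "Your order has shipped.")

first = dedup.is_duplicate(*args)   # original send
fake_now[0] = 10.0
retry = dedup.is_duplicate(*args)   # retry inside the window
fake_now[0] = 75.0
later = dedup.is_duplicate(*args)   # same message after the window closed
```

The window here is measured from the first send and is not extended by retries, so a legitimately repeated message goes through once the TTL has elapsed.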

Control 4 of 4

Sub-second revoke

Every vault key has an active flag in the proxy's database. A single DELETE call to the proxy management API flips the flag; subsequent calls using that vault key return 401 immediately, before any forwarding occurs. The real Twilio auth token never changes — there's no rotation, no propagation delay, no code change. If a batch agent starts sending to the wrong list, you kill the vault key; messages mid-flight complete; the next message in the queue is blocked.

This is the manual override for all three failure modes. When automated controls fail — the cap was set too high, the allowlist was configured too broadly — the kill switch is the last line of defense. Sub-second response time means you can stop a batch mid-run, not after it finishes.
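The revoke path amounts to one flag lookup before anything else happens. The sketch below uses an in-memory dictionary as a stand-in for the proxy's database; the `VaultKeys` class and its methods are hypothetical names, not the proxy's management API.

```python
class VaultKeys:
    """In-memory stand-in for the proxy's vault key table."""

    def __init__(self) -> None:
        self._active: dict[str, bool] = {}

    def issue(self, vault_key: str) -> None:
        self._active[vault_key] = True

    def revoke(self, vault_key: str) -> None:
        """What a DELETE on the management API would do: flip the flag."""
        if vault_key in self._active:
            self._active[vault_key] = False

    def status_for(self, vault_key: str) -> int:
        """401 for unknown or revoked keys, 200 otherwise."""
        return 200 if self._active.get(vault_key, False) else 401


keys = VaultKeys()
keys.issue("vault_key_demo")
before = keys.status_for("vault_key_demo")
keys.revoke("vault_key_demo")
after = keys.status_for("vault_key_demo")
```

Since the flag is checked on every request and the real Twilio credentials are never handed to the agent, revoking takes effect on the very next call without touching the upstream account.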

Setting up a Twilio vault key

You issue a vault key once per agent or per agent run. The key carries the real Twilio credentials (account SID and auth token) inside the proxy, along with the policy that governs this agent's calls:

curl -X POST https://proxy.keybrake.com/keys \
  -H "X-Admin-Key: $KEYBRAKE_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "support-bot-prod",
    "vendor": "twilio",
    "twilio_account_sid": "'"$TWILIO_ACCOUNT_SID"'",
    "twilio_auth_token": "'"$TWILIO_AUTH_TOKEN"'",
    "policy": {
      "daily_usd_cap": 50,
      "allowed_prefixes": ["+1"],
      "dedup_window_seconds": 60,
      "expires_in": "24h"
    }
  }'

The response is a vault_key_xxx token. Your agent's environment gets two variables instead of the real Twilio credentials:

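TWILIO_ACCOUNT_SID=vault_key_xxx   # the vault key replaces the real SID
TWILIO_PROXY_URL=https://proxy.keybrake.com/twilio

# In your agent code, route the Twilio client through the proxy:
import os

from twilio.http.http_client import TwilioHttpClient
from twilio.rest import Client

class ProxyHttpClient(TwilioHttpClient):
    # Sketch: swap the Twilio API host for the proxy URL; adjust to the proxy's routing
    def request(self, method, url, *args, **kwargs):
        url = url.replace("https://api.twilio.com", os.environ["TWILIO_PROXY_URL"], 1)
        return super().request(method, url, *args, **kwargs)

twilio = Client(
    os.environ["TWILIO_ACCOUNT_SID"],  # vault key
    os.environ["TWILIO_ACCOUNT_SID"],  # vault key used as both SID and token
    http_client=ProxyHttpClient(),
)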

The proxy receives the call with the vault key in the Authorization header, looks up the real Twilio account SID and auth token, enforces the policy (cap check, prefix check, dedup check), and if all checks pass, forwards the request to api.twilio.com using the real credentials. The Twilio response comes back verbatim; the proxy logs the cost from the response body and stores the call in the audit log before returning.

The audit log captures: vault key name, destination number prefix (not the full number — the last four digits are masked), message body length, price from Twilio's response, price_unit, message SID, whether the call was blocked and why, and a timestamp. For a batch run, you can query the log with GROUP BY vault_key_name, DATE(created_at) to get the cost breakdown per agent per day — something Twilio's own console only shows at the account level.
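To make the per-agent, per-day breakdown concrete, here is a sketch that runs the GROUP BY against an in-memory SQLite table. The table name, the column set beyond `vault_key_name` and `created_at`, and the sample rows are all assumptions for illustration; the real audit schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE audit_log (
           vault_key_name TEXT,
           to_prefix      TEXT,     -- masked destination, never the full number
           price_usd      REAL,
           blocked        INTEGER,  -- 1 if the proxy rejected the call
           created_at     TEXT
       )"""
)
conn.executemany(
    "INSERT INTO audit_log VALUES (?, ?, ?, ?, ?)",
    [
        ("support-bot-prod", "+1415555", 0.0082, 0, "2026-05-30 09:00:00"),
        ("support-bot-prod", "+1415555", 0.0082, 0, "2026-05-30 09:01:00"),
        ("support-bot-prod", "+234801", 0.0, 1, "2026-05-31 10:00:00"),
        ("marketing-bot", "+1212555", 0.0082, 0, "2026-05-31 11:00:00"),
    ],
)

# Cost and blocked-call count per agent per day.
report = conn.execute(
    """SELECT vault_key_name,
              DATE(created_at)          AS day,
              COUNT(*)                  AS calls,
              SUM(blocked)              AS blocked_calls,
              ROUND(SUM(price_usd), 4)  AS cost_usd
       FROM audit_log
       GROUP BY vault_key_name, DATE(created_at)
       ORDER BY vault_key_name, day"""
).fetchall()

for row in report:
    print(row)
```

Counting blocked calls alongside cost makes a misconfigured input list visible even on days when the policy kept spend at zero.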

How the four controls map to each failure mode

Failure mode	Primary control	Backup control
Retry storm (duplicates)	Deduplication window	Per-day USD cap
International routing bleed	Destination prefix allowlist	Per-day USD cap
Unsubscribed-list broadcast	Per-day USD cap	Sub-second revoke
Any runtime anomaly	Sub-second revoke	—

The dollar cap is the backstop for every failure mode because it operates regardless of what caused the excess spend. The prefix allowlist and dedup window are more specific — they prevent the failure from happening rather than limiting the blast radius after it starts. The kill switch is the override when you need to stop an agent mid-run for any reason, including reasons that have nothing to do with spend.

What this doesn't address

Two things are outside the scope of the proxy approach:

Content-based filtering. The proxy enforces policies on destination and cost. It does not inspect message content for compliance — profanity, personally identifiable information, or content that violates Twilio's Acceptable Use Policy. Content filtering is a separate layer, typically handled at the prompt or output-validation stage before the SMS tool is called. The proxy is not a content moderation system.

Opt-in list management. The unsubscribed-list failure mode in this post is caused by a bad data pipeline query — the agent receives a list of numbers it shouldn't be texting. The proxy can limit the blast radius (the cap stops the batch mid-run), but the correct fix is enforcing the opted-in filter at the data layer before the agent's input is constructed. The proxy is a last line of defense for spend, not a substitute for correct input validation.

For a broader look at the audit trail structure that catches data-pipeline errors like this — including the agent_run_id pattern that lets you trace a bad batch back to the specific run and input data — see the audit trail schema post.

Bottom line

Twilio's auth token model is binary: you have it, you can send to anywhere at any volume, and nothing stops you until Twilio's post-hoc alert fires. For an AI agent running a batch job unsupervised, that's an unacceptable risk profile. The four controls — per-day USD cap, destination prefix allowlist, deduplication window, sub-second revoke — cover the three failure modes that actually produce large unexpected bills. All four operate at the proxy layer, before calls reach Twilio, without requiring any change to the agent's tool logic.

If your agent sends ten SMS messages a day in response to user requests, the risk profile is low and the controls are nice-to-have. If your agent runs nightly batches, processes ticket queues, or responds to external events with outbound messages, adding the proxy before going to production costs you nothing and prevents the class of incident where you're explaining a $1,200 Twilio charge to a CFO at 9am on a Tuesday.

For the Stripe equivalent of this problem — spend caps, endpoint allowlists, and kill switches for agents calling the Stripe API — see LangChain + Stripe: the spend-cap your agent doesn't have. For the general question of what to log and how to structure an audit trail that covers both Twilio and Stripe calls in a single query, see AI agent audit trail schema. And for the question of why even a Restricted Key (on Stripe) or a purpose-limited sub-account (on Twilio) isn't sufficient without a proxy enforcement layer, see Why your Stripe Restricted Key probably isn't restricted enough.
