Your OpenClaw Agent Is Burning Money on Heartbeats. Here Is How Smart Model Routing Fixes That.
Your agent checks its inbox 48 times a day. Every check runs on Claude Opus 4.6. At roughly $0.082 per check, that is about $3.94/day for "nothing new." Smart model routing fixes this at the config level, not the instruction level.
If you are running a self-hosted OpenClaw agent with a single model configured, every task your agent performs uses the same model: conversations, heartbeats, sub-agents, background checks. A Claude Opus heartbeat costs roughly $0.082 per cycle. A Claude Haiku heartbeat costs $0.004. Same result, 20x cheaper.
This is the single biggest cost optimization most OpenClaw operators miss. Not because the feature does not exist, but because it requires config-level routing, not prompt-level instructions.
The Problem: One Model for Everything
A typical self-hosted OpenClaw setup looks like this:
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
That single line means every agent activity runs on Opus: your conversations, the 30-minute heartbeat checks, sub-agent coordination, memory synthesis. Opus is the right model for complex reasoning and conversations. It is the wrong model for "check inbox, nothing new, done."
The Real Cost of Unrouted Models
| Task | Frequency | Cost on Opus | Cost on Haiku | Monthly Savings |
|---|---|---|---|---|
| Heartbeat checks | 48x/day | $0.082/cycle | $0.004/cycle | $112/mo |
| Sub-agent tasks | 10-50x/day | $0.05-0.15/task | $0.002-0.008/task | $40-130/mo |
| Memory synthesis | 1x/day | $0.20-0.50 | $0.01-0.03 | $6-14/mo |
Total potential savings: $150-250/month for a single agent. Multiply by your fleet size.
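The heartbeat row is easy to check by hand. This short sketch redoes the arithmetic using the per-cycle costs quoted above; the 30-day month is an assumption made for the example.

```python
# Per-cycle costs are the article's figures; DAYS_PER_MONTH is an assumption.
HEARTBEATS_PER_DAY = 48          # one check every 30 minutes
OPUS_COST_PER_CYCLE = 0.082      # USD
HAIKU_COST_PER_CYCLE = 0.004     # USD
DAYS_PER_MONTH = 30


def monthly_heartbeat_cost(cost_per_cycle: float) -> float:
    """Return the monthly heartbeat spend in USD for one agent."""
    return cost_per_cycle * HEARTBEATS_PER_DAY * DAYS_PER_MONTH


opus_monthly = monthly_heartbeat_cost(OPUS_COST_PER_CYCLE)
haiku_monthly = monthly_heartbeat_cost(HAIKU_COST_PER_CYCLE)
monthly_savings = opus_monthly - haiku_monthly

print(f"Opus:    ${opus_monthly:.2f}/mo")    # Opus:    $118.08/mo
print(f"Haiku:   ${haiku_monthly:.2f}/mo")   # Haiku:   $5.76/mo
print(f"Savings: ${monthly_savings:.2f}/mo") # Savings: $112.32/mo
```

Swap in your own heartbeat interval and per-cycle cost to see what your agent spends on idle checks.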
The Solution: 3-Tier Model Routing
The fix is not telling your agent to use a cheaper model in its instructions. OpenClaw reads model assignments from its config file, not from markdown files. Adding "use Haiku for heartbeats" to HEARTBEAT.md does nothing.
Proper model routing requires three tiers in the OpenClaw config:
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
      fallbacks:
        - "anthropic/claude-sonnet-4-6"
        - "anthropic/claude-haiku-4-5-20251001"
    heartbeat:
      every: "30m"
      model: "anthropic/claude-haiku-4-5-20251001"
    subagents:
      model: "anthropic/claude-haiku-4-5-20251001"
      maxConcurrent: 4
What Each Tier Does
| Tier | Config Key | Used For | Recommended Model |
|---|---|---|---|
| Primary | model.primary | Direct conversations, complex tasks | Claude Opus 4.6 or Sonnet 4.6 |
| Heartbeat | heartbeat.model | Periodic inbox/channel checks | Claude Haiku 4.5 (cheapest reliable) |
| Sub-agents | subagents.model | Background tasks, file ops, coordination | Claude Haiku 4.5 (cheapest reliable) |
| Fallbacks | model.fallbacks | Automatic failover on primary outage | Sonnet 4.6 then Haiku 4.5 |
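To make the tiers concrete, here is a minimal sketch of how a task kind could map to a model. The `resolve_model` helper and the task names are hypothetical and written for illustration; this is not OpenClaw's source code. It assumes a tier with no explicit model inherits the primary, which matches the single-model behavior described earlier.

```python
# Hypothetical resolver that mirrors the table above; not OpenClaw's code.
defaults = {
    "model": {
        "primary": "anthropic/claude-opus-4-6",
        "fallbacks": [
            "anthropic/claude-sonnet-4-6",
            "anthropic/claude-haiku-4-5-20251001",
        ],
    },
    "heartbeat": {"every": "30m", "model": "anthropic/claude-haiku-4-5-20251001"},
    "subagents": {"model": "anthropic/claude-haiku-4-5-20251001", "maxConcurrent": 4},
}

# Maps an illustrative task name to the config key that holds its model.
TIER_KEYS = {"heartbeat": "heartbeat", "subagent": "subagents"}


def resolve_model(config: dict, task: str) -> str:
    """Return the model for a task kind, falling back to the primary model."""
    primary = config["model"]["primary"]
    tier_key = TIER_KEYS.get(task)
    if tier_key is None:
        return primary  # conversations and anything unrecognized
    return config.get(tier_key, {}).get("model", primary)


print(resolve_model(defaults, "conversation"))  # anthropic/claude-opus-4-6
print(resolve_model(defaults, "heartbeat"))     # anthropic/claude-haiku-4-5-20251001
```

With only `model.primary` set, every call returns the primary model, which is exactly the unrouted setup this article warns about.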
Why Instruction-Level Routing Does Not Work
This is the most common mistake. Operators add something like this to their agent's HEARTBEAT.md:
## Model Routing
- Switch to Haiku for this heartbeat check
- Only escalate to Opus if something needs action
This has zero effect. Here is why:
- OpenClaw reads model assignments from the config file at startup. The heartbeat model is set in the JSON config, not in the agent's markdown instructions.
- Mid-session model switching is blocked. OpenClaw restricts session_status(model=X) to the configured primary model. This is a security measure in OpenClaw itself that prevents prompt injection from downgrading to a weaker model.
- The agent cannot override config-level settings. Even if the agent tried to switch models via an API call, OpenClaw would reject it.
The correct approach is always config-level: set heartbeat.model and subagents.model to your cheapest reliable model in the OpenClaw config file.
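If you self-host, a quick pre-deploy check can catch a config that still sends background work to the primary model. The sketch below is one way to do it; `unrouted_tiers` is a hypothetical helper, and it assumes you have already loaded the `agents.defaults` section of your config into a Python dict.

```python
def unrouted_tiers(defaults: dict) -> list[str]:
    """Return the tier keys that have no explicit model configured."""
    missing = []
    for tier in ("heartbeat", "subagents"):
        if not defaults.get(tier, {}).get("model"):
            missing.append(f"{tier}.model")
    return missing


single_model = {"model": {"primary": "anthropic/claude-opus-4-6"}}
print(unrouted_tiers(single_model))  # ['heartbeat.model', 'subagents.model']
```

An empty list means both background tiers have their own model assigned.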
How ClawTrust Handles This Automatically
If you are using ClawTrust managed hosting, you do not need to touch any config files. Smart model routing is configured automatically for every agent from the moment it is provisioned.
Here is exactly what happens:
- You pick your primary model in the dashboard. This is the model for direct conversations. Pick the best model you can afford.
- ClawTrust generates a 3-tier config. The platform automatically sets heartbeat.model and subagents.model to the cheapest reliable model based on your connected providers.
- The config deploys to your agent's VPS. Every time you change your primary model, a config push updates the routing in seconds.
- Fallback chains are built automatically. Based on which providers you have connected (Anthropic, OpenAI, Google, MiniMax), ClawTrust builds the optimal fallback order.
Provider-Aware Cheap Model Selection
ClawTrust does not just pick Haiku every time. It selects the cheapest reliable model based on what you have connected:
| Connected Provider | Heartbeat Model | Cost per Cycle |
|---|---|---|
| Anthropic (subscription or BYOK) | Claude Haiku 4.5 | ~$0.004 |
| OpenAI BYOK | GPT-4.1 Mini | ~$0.003 |
| Google AI BYOK | Gemini 2.5 Flash | ~$0.001 |
| OpenRouter only (no BYOK) | Claude Haiku 4.5 via OpenRouter | ~$0.004 |
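The selection in the table can be sketched as a lookup. This is an illustration only, not ClawTrust's implementation: the provider keys and the `pick_heartbeat_model` helper are hypothetical, and picking the lowest per-cycle cost when several providers are connected is an assumption.

```python
# Per-cycle costs are the approximate values from the table above.
HEARTBEAT_OPTIONS = {
    "anthropic": ("Claude Haiku 4.5", 0.004),
    "openai": ("GPT-4.1 Mini", 0.003),
    "google": ("Gemini 2.5 Flash", 0.001),
}
OPENROUTER_DEFAULT = ("Claude Haiku 4.5 via OpenRouter", 0.004)


def pick_heartbeat_model(connected: set[str]) -> tuple[str, float]:
    """Return (model name, cost per cycle) for the connected providers."""
    candidates = [HEARTBEAT_OPTIONS[p] for p in connected if p in HEARTBEAT_OPTIONS]
    if not candidates:
        return OPENROUTER_DEFAULT  # no BYOK keys connected
    # Assumption: with several providers connected, the cheapest one wins.
    return min(candidates, key=lambda option: option[1])


print(pick_heartbeat_model({"anthropic", "google"}))  # ('Gemini 2.5 Flash', 0.001)
print(pick_heartbeat_model(set()))                    # ('Claude Haiku 4.5 via OpenRouter', 0.004)
```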
Models to Avoid for Background Tasks
Not every cheap model works for agent background tasks. Some models break OpenClaw's tool calling in ways that are invisible until production.
Llama 3.3 70B: Phantom Tool Calls
Llama 3.3 70B generates phantom tts tool calls instead of responding to messages. The agent tries to invoke a text-to-speech tool that does not exist, fails silently, and the user gets no response. This happens intermittently, making it hard to diagnose. Never use Llama 3.3 as a default primary model.
Gemini 2.5 Flash Lite: Malformed Function Calls
Gemini 2.5 Flash Lite returns MALFORMED_FUNCTION_CALL errors on roughly 15-20% of tool invocations. The model formats the function arguments incorrectly, causing OpenClaw to reject the call. This breaks any workflow that depends on reliable tool use: file operations, API calls, scheduling. Do not use Flash Lite in any tier of your routing config.
Safe Choices by Provider
| Provider | Safe for Heartbeats | Avoid |
|---|---|---|
| Anthropic | Claude Haiku 4.5 | None (all models stable) |
| OpenAI | GPT-4.1 Mini | GPT-4.1 Nano (limited tool use) |
| Google | Gemini 2.5 Flash | Gemini 2.5 Flash Lite (broken tool calls) |
| Meta (via OpenRouter) | Llama 4 Scout | Llama 3.3 70B (phantom tts calls) |
ClawTrust's config generator excludes these broken models automatically. If you are self-hosting, you need to test every model in your routing config against OpenClaw's tool calling before deploying it.
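A denylist scan is no substitute for testing tool calls, but it is a cheap first pass for self-hosters. In this sketch the substrings are assumptions about how these models show up in model IDs, and `flag_risky_models` is a hypothetical helper.

```python
# Assumed ID fragments for the models in the "Avoid" column above.
AVOID_SUBSTRINGS = {
    "llama-3.3-70b": "phantom tts tool calls",
    "gemini-2.5-flash-lite": "malformed function calls",
    "gpt-4.1-nano": "limited tool use",
}


def flag_risky_models(model_ids: list[str]) -> dict[str, str]:
    """Map each risky model ID found in the routing config to the reason."""
    flagged = {}
    for model_id in model_ids:
        lowered = model_id.lower()
        for needle, reason in AVOID_SUBSTRINGS.items():
            if needle in lowered:
                flagged[model_id] = reason
    return flagged


routing = ["google/gemini-2.5-flash", "google/gemini-2.5-flash-lite"]
print(flag_risky_models(routing))
# {'google/gemini-2.5-flash-lite': 'malformed function calls'}
```

Feed it every model ID in your primary, fallback, heartbeat, and sub-agent settings before you restart the agent.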
Monthly Cost Projections: Routed vs. Unrouted
Here is what a typical agent costs per month with and without smart routing, broken out by usage tier.
| Usage Level | Conversations/Day | Unrouted (Opus for all) | Routed (3-tier) | Savings |
|---|---|---|---|---|
| Light (personal) | 2-5 | $180/mo | $35/mo | 81% |
| Moderate (small biz) | 10-20 | $320/mo | $65/mo | 80% |
| Heavy (agency) | 50+ | $700/mo | $150/mo | 79% |
The savings percentage is consistent across usage levels because the majority of API costs come from background tasks (heartbeats, sub-agents), not conversations. Even heavy users see roughly 79% savings because conversations are a small fraction of total API calls.
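The percentages in the table follow directly from the two cost columns. A short sketch of the calculation, using only the table's figures:

```python
# (unrouted, routed) monthly cost in USD, taken from the table above.
USAGE_TIERS = {
    "light": (180, 35),
    "moderate": (320, 65),
    "heavy": (700, 150),
}


def savings_percent(unrouted: float, routed: float) -> int:
    """Return the savings as a whole-number percentage of the unrouted cost."""
    return round((unrouted - routed) / unrouted * 100)


for name, (unrouted, routed) in USAGE_TIERS.items():
    print(f"{name}: {savings_percent(unrouted, routed)}%")
# light: 81%
# moderate: 80%
# heavy: 79%
```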
OpenRouter vs. Direct API Pricing
OpenRouter sits between your agent and the model provider, which raises the question of markup. For the models most ClawTrust users rely on, the listed per-token price matches direct API pricing, and a single key buys multi-provider access.
| Model | Direct API (input / output, per 1M tokens) | OpenRouter (input / output, per 1M tokens) | Markup |
|---|---|---|---|
| Claude Opus 4.6 | $15 / $75 | $15 / $75 | 0% |
| Claude Haiku 4.5 | $0.80 / $4 | $0.80 / $4 | 0% |
| GPT-4.1 Mini | $0.40 / $1.60 | $0.40 / $1.60 | 0% |
| Gemini 2.5 Flash | $0.15 / $0.60 | $0.15 / $0.60 | 0% |
OpenRouter currently passes through Anthropic, OpenAI, and Google pricing at 0% markup on most models. The benefit of routing through OpenRouter is automatic failover: if Anthropic goes down, your agent can fall back to OpenAI or Google without changing config. BYOK users who connect their own API keys bypass OpenRouter entirely for that provider, getting direct API pricing with zero intermediary.
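Per-token prices turn into per-request costs once you know how many tokens a request uses. The sketch below reads each price pair in the table as input / output, and the heartbeat size of 5,000 input tokens and 100 output tokens is an illustrative assumption, not a measured figure.

```python
# (input, output) USD per 1M tokens, from the table above.
PRICES_PER_MILLION = {
    "Claude Opus 4.6": (15.00, 75.00),
    "Claude Haiku 4.5": (0.80, 4.00),
    "GPT-4.1 Mini": (0.40, 1.60),
    "Gemini 2.5 Flash": (0.15, 0.60),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the table's prices."""
    price_in, price_out = PRICES_PER_MILLION[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Assumed heartbeat size, chosen only for illustration.
INPUT_TOKENS, OUTPUT_TOKENS = 5_000, 100

for model in PRICES_PER_MILLION:
    cost = request_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.4f} per heartbeat")
```

At that assumed size, Opus and Haiku land close to the per-cycle figures quoted earlier. Measure your own heartbeat token counts before relying on the result.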
Bring Your Own Key: Insurance Against Budget Exhaustion
Every ClawTrust plan includes an AI budget via OpenRouter. When that budget runs out, your agent's fallback chain determines what happens next.
Without BYOK keys: the agent tries cheaper OpenRouter models until those also hit the budget limit, then stops responding until the next billing cycle or a manual top-up.
With BYOK keys: the fallback chain tries your own provider keys first. Since BYOK uses your own billing, there is no ClawTrust budget limit to hit. Your agent keeps running on your dime, but it keeps running.
This is why connecting a BYOK key is the single best thing you can do for agent reliability, even if you never select a BYOK model as your primary.
BYOK Security
Your API keys never touch the agent's VPS. They are encrypted in the ClawTrust database and decrypted only by a Cloudflare Worker proxy at request time. The agent authenticates with a platform token and sends requests through the proxy. Because the keys are never stored on the VPS, even a full compromise of that server cannot leak them.
Supported providers: Anthropic (subscription or API key), OpenAI, Google AI, and MiniMax. Connect them in your agent's Integrations page.
The Fallback Chain: Zero-Downtime Model Failover
Model outages happen. Anthropic goes down, OpenRouter hits capacity limits, Google has a bad deploy. When your primary model fails, the fallback chain takes over automatically.
How the Chain Is Built
ClawTrust builds a custom fallback chain for each agent based on connected providers:
Anthropic subscription users:
Primary (your pick) → Claude Sonnet 4.6 → Claude Sonnet 4.5 → Claude Haiku 4.5
OpenRouter + BYOK users:
Primary (your pick) → BYOK fallbacks (Anthropic, OpenAI, Google if connected) → Claude Haiku 4.5 via OpenRouter → Llama 3.3 70B → Gemini 2.0 Flash
BYOK fallbacks are tried before OpenRouter models. This means if your OpenRouter budget is exhausted, the agent seamlessly falls back to your own provider keys without any interruption.
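The ordering for OpenRouter + BYOK users can be sketched as a small list builder. This is a hypothetical illustration of the order described above, not ClawTrust's generator; the labels are display names rather than real model IDs.

```python
# Order taken from the chain described above.
BYOK_ORDER = ["Anthropic", "OpenAI", "Google"]
OPENROUTER_TAIL = [
    "Claude Haiku 4.5 via OpenRouter",
    "Llama 3.3 70B",
    "Gemini 2.0 Flash",
]


def build_fallback_chain(primary: str, byok_connected: set[str]) -> list[str]:
    """Return primary, then connected BYOK providers, then OpenRouter models."""
    chain = [primary]
    chain += [f"{p} (BYOK)" for p in BYOK_ORDER if p in byok_connected]
    # Skip a tail entry if it is already the primary.
    chain += [model for model in OPENROUTER_TAIL if model != primary]
    return chain


print(build_fallback_chain("Claude Opus 4.6", {"OpenAI"}))
# ['Claude Opus 4.6', 'OpenAI (BYOK)', 'Claude Haiku 4.5 via OpenRouter',
#  'Llama 3.3 70B', 'Gemini 2.0 Flash']
```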
What ClawTrust Controls vs. What OpenClaw Controls
This is the part most people get confused about. ClawTrust is the managed hosting platform. OpenClaw is the open-source agent runtime. They each control different things.
ClawTrust (the platform)
- Generates the 3-tier model config based on your primary model and connected providers
- Validates model selection against the provider allowlist (dashboard only)
- Deploys the config to your agent's VPS via secure config push
- Encrypts and proxies BYOK API keys through a Cloudflare Worker
- Sets heartbeat and sub-agent models to the cheapest available option
- Does not restrict which models OpenClaw can use at runtime
OpenClaw (the agent runtime)
- Reads the config file at startup and assigns models to each task tier
- Routes heartbeats to the configured heartbeat model
- Routes sub-agents to the configured sub-agent model
- Uses the fallback chain when the primary model fails
- Blocks mid-session model switching (security measure in OpenClaw, not a ClawTrust restriction)
If your agent tries to call session_status(model="claude-haiku") to switch models mid-conversation, OpenClaw will reject it. This is intentional. Mid-session switching is a security risk because prompt injection could downgrade the agent to a weaker, more manipulable model. The routing must happen at the config level, where it cannot be influenced by adversarial prompts.
Multi-Agent Fleet Savings
The numbers above are for a single agent. If you run multiple agents (one per department, one per client, or one per location), the savings multiply linearly. The table below uses the light-usage figures: $180/mo unrouted and $35/mo routed per agent.
| Fleet Size | Unrouted Monthly Cost | Routed Monthly Cost | Annual Savings |
|---|---|---|---|
| 3 agents | $540/mo | $105/mo | $5,220 |
| 5 agents | $900/mo | $175/mo | $8,700 |
| 10 agents | $1,800/mo | $350/mo | $17,400 |
For agencies running an agent per client, 10 agents is not unusual. Without routing, that is $1,800/mo in API costs. With routing, it drops to $350/mo. The $17,400 annual savings exceeds the cost of the ClawTrust subscriptions themselves.
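The fleet table is the light-usage row ($180/mo unrouted, $35/mo routed) multiplied out. A short sketch to run the same calculation for your own fleet size:

```python
# Per-agent monthly costs in USD, from the light-usage row of the projections table.
UNROUTED_PER_AGENT = 180
ROUTED_PER_AGENT = 35


def annual_fleet_savings(agents: int) -> int:
    """Return yearly savings in USD for a fleet of identical agents."""
    monthly_savings = (UNROUTED_PER_AGENT - ROUTED_PER_AGENT) * agents
    return monthly_savings * 12


for fleet_size in (3, 5, 10):
    print(f"{fleet_size} agents: ${annual_fleet_savings(fleet_size):,}/year")
# 3 agents: $5,220/year
# 5 agents: $8,700/year
# 10 agents: $17,400/year
```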
Common Mistakes and Troubleshooting
Even with the right config structure, there are several ways model routing can fail silently. These are the issues we see most often in self-hosted setups.
Mistake 1: Using OpenRouter Model IDs with Anthropic API
OpenRouter and the Anthropic API use different model ID formats for the same model. This is the single most common cause of 400 errors in model routing configs.
| Provider | Correct Format | Example |
|---|---|---|
| Anthropic API | claude-{model}-{version}-{date} | claude-haiku-4-5-20251001 |
| OpenRouter | anthropic/claude-{model}-{version} | anthropic/claude-haiku-4.5 |
Using anthropic/claude-haiku-4.5 in a direct Anthropic API config returns a 400 error. Using claude-haiku-4-5-20251001 on OpenRouter also fails. Match the format to the provider, or let ClawTrust handle it automatically.
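A format check before deploy can catch this mismatch. The patterns below are inferred only from the two example rows in the table, so treat them as assumptions; they cover Anthropic models and may not match every ID either provider accepts.

```python
import re

# Patterns inferred from the table's examples; assumptions, not official specs.
ID_PATTERNS = {
    "anthropic": re.compile(r"^claude-[a-z]+-\d+-\d+-\d{8}$"),
    "openrouter": re.compile(r"^anthropic/claude-[a-z]+-\d+\.\d+$"),
}


def id_matches_provider(model_id: str, provider: str) -> bool:
    """Return True if model_id looks like the format the provider expects."""
    return bool(ID_PATTERNS[provider].match(model_id))


print(id_matches_provider("claude-haiku-4-5-20251001", "anthropic"))   # True
print(id_matches_provider("anthropic/claude-haiku-4.5", "anthropic"))  # False
print(id_matches_provider("anthropic/claude-haiku-4.5", "openrouter")) # True
```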
Mistake 2: Setting openrouter/auto as Default
The openrouter/auto meta-model lets OpenRouter pick the "best" model for each request. In theory, this sounds ideal. In practice, it routes heartbeat checks to expensive frontier models. We measured $4.70/day in idle burn from a single agent using openrouter/auto as its primary model. That is $141/mo for an agent that is mostly checking empty inboxes.
Always set an explicit primary model. Never use openrouter/auto for any tier of your routing config.
Mistake 3: Not Restarting After Config Changes
OpenClaw reads its model config at startup. If you edit the config file while the agent is running, the changes do not take effect until the next restart. On a self-hosted setup, you need to manually restart the Docker container. On ClawTrust, the config push triggers an automatic restart within seconds.
Mistake 4: Mixing Providers in Fallback Chains
A fallback chain like anthropic/claude-opus-4-6 -> openai/gpt-4.1 -> google/gemini-2.5-flash sounds robust. But each provider has different tool calling formats, context window sizes, and system prompt handling. Switching providers mid-conversation can cause the agent to lose context or misformat tool calls. Best practice: use same-provider fallbacks (Opus to Sonnet to Haiku) as your primary chain, with cross-provider fallbacks only as a last resort.
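One way to apply that best practice is to sort a candidate list so same-provider models come first. The `order_fallbacks` helper is a hypothetical sketch and assumes IDs use the `provider/model` form shown in the chain above.

```python
def provider_of(model_id: str) -> str:
    """Return the provider prefix of a 'provider/model' ID."""
    return model_id.split("/", 1)[0]


def order_fallbacks(primary: str, candidates: list[str]) -> list[str]:
    """Put same-provider fallbacks first, cross-provider ones last."""
    home = provider_of(primary)
    # sorted() is stable, so the original order is kept within each group.
    return sorted(candidates, key=lambda model: provider_of(model) != home)


chain = order_fallbacks(
    "anthropic/claude-opus-4-6",
    [
        "openai/gpt-4.1",
        "anthropic/claude-sonnet-4-6",
        "google/gemini-2.5-flash",
        "anthropic/claude-haiku-4-5-20251001",
    ],
)
print(chain)
# ['anthropic/claude-sonnet-4-6', 'anthropic/claude-haiku-4-5-20251001',
#  'openai/gpt-4.1', 'google/gemini-2.5-flash']
```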
Getting Started
If you are already on ClawTrust, smart model routing is already active. You can verify by checking your agent's config in the dashboard under Settings.
If you are self-hosting OpenClaw, add the heartbeat.model and subagents.model keys to your config file and restart your agent. The savings are immediate.
If you want all of this handled automatically, with BYOK support, encrypted key proxying, automatic fallback chains, and zero config management, start a free 5-day trial. Your agent will be provisioned with smart routing from minute one.
Read the full model routing documentation for technical details on provider selection, fallback chains, and BYOK security architecture.
Frequently Asked Questions
Can my OpenClaw agent switch AI models mid-conversation?
No. OpenClaw blocks mid-session model switching as a security measure. This prevents prompt injection attacks from downgrading your agent to a weaker, more exploitable model. The routing happens at the config level: your primary model handles conversations, while heartbeats and sub-agents automatically use the cheapest reliable model.
Does ClawTrust restrict which models my agent can use?
ClawTrust does not restrict runtime model usage. The model allowlist in the dashboard only validates what you can select as your primary model. Once deployed, OpenClaw handles all runtime behavior. Mid-session switching is blocked by OpenClaw itself, not ClawTrust.
How much does OpenClaw model routing save?
For an agent using Claude Opus 4.6 as primary, smart routing saves roughly 80% of monthly API spend. Heartbeats drop from $0.082 to $0.004 per cycle (20x cheaper). Sub-agent tasks see similar savings. Direct conversations still use your full-power primary model.
Do I need to configure model routing in my agent's instructions?
No. Adding model routing instructions to HEARTBEAT.md, AGENTS.md, or any agent instruction file has no effect. OpenClaw reads the heartbeat and sub-agent model from its config file, not from the agent's markdown files. ClawTrust sets this config automatically when you pick your primary model.
What happens when my primary AI model has an outage?
ClawTrust configures an automatic fallback chain. For Anthropic subscription users: Sonnet 4.6, then Sonnet 4.5, then Haiku 4.5. For OpenRouter users: BYOK providers first (if connected), then Claude Haiku 4.5, Llama 3.3, Gemini 2.0 Flash. Failover is automatic with zero downtime.
Can I bring my own API keys for multiple providers?
Yes. ClawTrust supports Bring Your Own Key (BYOK) for Anthropic, OpenAI, Google AI, and MiniMax. Your keys are encrypted in the database and proxied through a Cloudflare Worker. The agent authenticates with a platform token and never sees your real API key.
What model does OpenClaw use for heartbeats?
ClawTrust automatically configures the cheapest reliable model for heartbeats. With Anthropic connected: Claude Haiku 4.5. With OpenAI BYOK: GPT-4.1 Mini. With Google BYOK: Gemini 2.5 Flash. With no BYOK keys: Claude Haiku 4.5 via OpenRouter. This is set at the platform config level and cannot be overridden by the agent.
Is there a free trial to test model routing?
Yes. ClawTrust offers a 5-day free trial on Starter and Pro plans. Your agent runs on a fully provisioned dedicated VPS with smart model routing active from minute one. The trial includes $5 in AI budget so you can see the routing in action.