# LLM Configuration and Model Strategy
Direct API, unified routing via OpenRouter, or fully local inference with Ollama — choose based on cost, privacy, and capability.
## Choosing a Model Strategy
- **Maximum capability.** Use Claude or GPT-4o via direct API or OpenRouter. Best for complex instructions and prompt-injection resistance.
- **Cost efficiency at volume.** Use OpenRouter with model failover. Route simple tasks to cheap models and complex tasks to heavy ones.
- **Full data privacy.** Use Ollama locally. No API calls leave your VPS. Tradeoff: local models are less capable and need more RAM (8 GB+ for 7B models).
- **Flexibility.** Use OpenRouter. One API key, one billing dashboard, 300+ models. Swap models without reconfiguring OpenClaw.
## Option A: Direct Anthropic Connection
The simplest configuration and the recommended starting point. Sign up at console.anthropic.com and set a monthly spending limit immediately.
```shell
openclaw config set agent.provider anthropic
openclaw config set agent.api_key "sk-ant-your-key-here"
openclaw config set agent.model claude-sonnet-4-5
```

### Available Models
| Model | Best For | Cost |
|---|---|---|
| claude-opus-4-5 | Complex reasoning, autonomous task chains | High |
| claude-sonnet-4-5 | Everyday tasks, balanced speed and capability | Medium |
| claude-haiku-4-5 | High-volume simple tasks, fast responses | Low |
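Before wiring the key into OpenClaw, you can sanity-check it with curl. A minimal sketch against Anthropic's public Messages API (the key is a placeholder; the model name comes from the table above):

```shell
# Verify the Anthropic key works at all, independent of OpenClaw
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: sk-ant-your-key-here" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-haiku-4-5", "max_tokens": 32,
       "messages": [{"role": "user", "content": "ping"}]}'
```

A valid key returns a JSON message; an invalid one returns an `authentication_error`.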
```shell
openclaw config verify
openclaw gateway restart
```

## Option B: OpenRouter as a Unified Layer
OpenRouter accepts a single API key and routes requests to the model you specify. One billing account, automatic failover, and access to 300+ models.
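OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so a key can be sanity-checked with curl before touching OpenClaw. A minimal sketch (the key is a placeholder):

```shell
# Verify the OpenRouter key and model routing work
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer sk-or-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-sonnet-4-5",
       "messages": [{"role": "user", "content": "ping"}]}'
```

Once the key responds, point OpenClaw at it: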
```shell
openclaw config set agent.provider openrouter
openclaw config set agent.api_key "sk-or-your-key-here"
openclaw config set agent.model anthropic/claude-sonnet-4-5
```

### Configure Model Failover
```shell
openclaw config set agent.model anthropic/claude-sonnet-4-5
openclaw config set agent.fallback_model openai/gpt-4o
openclaw config set agent.fallback_on_error true
```

### Route Different Agents to Different Models
```shell
# High-capability agent for complex work
openclaw agents update work-agent --model anthropic/claude-opus-4-5

# Lightweight agent for simple tasks
openclaw agents update reminder-agent --model google/gemini-flash-1.5
```

## Option C: Ollama for Local Inference
Ollama serves open-weight models locally. No API costs, no data leaving the server.
### System Requirements
| Model Size | Minimum RAM | Recommended RAM |
|---|---|---|
| 7B params | 8 GB | 12 GB |
| 13B params | 16 GB | 20 GB |
| 33B params | 32 GB | 48 GB |
| 70B params | 64 GB | 80 GB |
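A quick pre-flight check against the table above can save a wasted download. A minimal sketch (`fits` is a hypothetical helper, not part of Ollama; thresholds come from the Minimum RAM column):

```shell
# Return yes/no for whether $1 GB of RAM meets the minimum for model size $2
fits() {
  case "$2" in
    7b)  need=8  ;;
    13b) need=16 ;;
    33b) need=32 ;;
    70b) need=64 ;;
    *)   echo "unknown model size" >&2; return 2 ;;
  esac
  [ "$1" -ge "$need" ] && echo "yes" || echo "no"
}

# Compare against what this machine actually has available
avail_gb=$(free -g | awk '/^Mem:/ {print $7}')
echo "available: ${avail_gb} GB"
fits "$avail_gb" 7b
```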
```shell
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2:3b        # 2 GB download, fits in 4 GB RAM
ollama pull qwen2.5-coder:7b   # 4.7 GB download, fits in 8 GB RAM

# Test the model
ollama run llama3.2:3b "Hello, are you working?"
```

### Secure Ollama
Restrict Ollama to localhost:
```shell
sudo systemctl edit ollama
```

Add to the `[Service]` section:

```ini
[Service]
Environment="OLLAMA_HOST=127.0.0.1:11434"
```

Then reload and restart:

```shell
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

Point OpenClaw at the local instance:

```shell
openclaw config set agent.provider ollama
openclaw config set agent.base_url "http://127.0.0.1:11434"
openclaw config set agent.model qwen2.5-coder:7b
openclaw config verify
```

### Ollama on a Separate GPU Server
For faster inference, run Ollama on a GPU server (Vast.ai, Lambda Labs, Hetzner GPU) and point your VPS at it:
```shell
# On the GPU server
export OLLAMA_HOST=0.0.0.0:11434
ollama serve

# On your OpenClaw VPS
openclaw config set agent.provider ollama
openclaw config set agent.base_url "http://gpu-server-ip:11434"
openclaw config set agent.model llama3.3:70b
```

Add your VPS IP to the GPU server's firewall allowlist. Never expose Ollama to the public internet without authentication.
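The firewall rule can be sketched with ufw, assuming ufw is the firewall in use (`203.0.113.10` stands in for your VPS IP, and `gpu-server-ip` is a placeholder):

```shell
# On the GPU server: allow only the VPS to reach Ollama's port,
# deny everyone else (ufw evaluates the allow rule first)
sudo ufw allow from 203.0.113.10 to any port 11434 proto tcp
sudo ufw deny 11434/tcp

# From the VPS: confirm Ollama responds and lists its pulled models
curl -s http://gpu-server-ip:11434/api/tags
```

If the curl from any other host hangs or is refused while the VPS gets a JSON model list, the rule is working.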
## Session Pruning and Context Management
```shell
# Maximum context tokens before pruning
openclaw config set agent.max_context_tokens 100000

# Strategy: 'sliding' keeps recent context; 'summarize' compresses old context
openclaw config set agent.context_strategy sliding
```

The `summarize` strategy is more expensive but produces more coherent long-running memory.
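When sizing `max_context_tokens`, a rough rule of thumb is about four characters per token of English text. A minimal sketch (`estimate_tokens` is a hypothetical helper, not part of OpenClaw; the sample file is just for illustration):

```shell
# Estimate token count of a text file using the ~4 chars/token heuristic
estimate_tokens() {
  chars=$(wc -c < "$1")
  echo $(( chars / 4 ))
}

printf 'How many tokens does this sentence use?' > /tmp/sample.txt
estimate_tokens /tmp/sample.txt
```

Real tokenizers vary by model, so treat the result as an order-of-magnitude estimate, not an exact budget.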
## Model Verification Checklist
```shell
# Run doctor for overall health
openclaw doctor

# Verify model connection
openclaw config verify

# Check channel-agent assignments
openclaw channels list
openclaw agents list
```

Every channel should show `connected`, and every channel should have an agent assigned. Send a test message on each channel and confirm responses arrive.
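Assuming each `openclaw` command exits nonzero on failure, the checklist can be chained into a single smoke-test script that stops at the first problem:

```shell
#!/bin/sh
set -e   # abort on the first failing check

openclaw doctor
openclaw config verify
openclaw channels list
openclaw agents list

echo "All checks passed"
```

Run it after any provider or model change, and before relying on the agent unattended.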
