Inside Clawdbot: How the Agent System Routes Models, Tools, and Skills

By Avasdream (@avasdream_)
This is the second post in my Clawdbot deep dive series (first: How Clawdbot Remembers). Here I'm looking at the agent system itself — how Clawdbot selects models, gates tool access through multiple policy layers, manages the context window under pressure, loads skills dynamically, and isolates execution in Docker containers.
The source analysis covers ~47K lines of code across the open-source codebase. Let's get into it.
Architecture Overview
Clawdbot runs a single long-lived Gateway daemon that owns all messaging surfaces (Telegram, Discord, WhatsApp, Slack, Signal, iMessage). The Gateway connects to LLM providers, manages sessions, and routes tool calls through a policy engine before executing them.
```
┌──────────────────────────────────────────────────────┐
│                    Gateway Daemon                    │
│                                                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐            │
│  │ Telegram │  │ Discord  │  │ WhatsApp │  ...       │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘            │
│       │             │             │                  │
│       ▼             ▼             ▼                  │
│  ┌─────────────────────────────────────────┐         │
│  │             Session Router              │         │
│  │    (per-user DM sessions, per-group)    │         │
│  └──────────────────┬──────────────────────┘         │
│                     │                                │
│                     ▼                                │
│  ┌─────────────────────────────────────────┐         │
│  │              Agent Engine               │         │
│  │                                         │         │
│  │  ┌────────────┐    ┌──────────────────┐ │         │
│  │  │  Provider  │    │   Tool Policy    │ │         │
│  │  │   Router   │    │ Engine (9-layer) │ │         │
│  │  └──────┬─────┘    └────────┬─────────┘ │         │
│  │         │                   │           │         │
│  │         ▼                   ▼           │         │
│  │  ┌────────────┐    ┌──────────────────┐ │         │
│  │  │  15+ LLM   │    │  Skills Loader   │ │         │
│  │  │ Providers  │    │ (3 directories)  │ │         │
│  │  └────────────┘    └──────────────────┘ │         │
│  └──────────────────┬──────────────────────┘         │
│                     │                                │
│                     ▼                                │
│  ┌─────────────────────────────────────────┐         │
│  │             Execution Layer             │         │
│  │  ┌─────────┐  ┌───────────────────────┐ │         │
│  │  │  Host   │  │    Docker Sandbox     │ │         │
│  │  │ (bare)  │  │  (read-only, no net,  │ │         │
│  │  │         │  │     cap drop ALL)     │ │         │
│  │  └─────────┘  └───────────────────────┘ │         │
│  └─────────────────────────────────────────┘         │
└──────────────────────────────────────────────────────┘
```
One Gateway per host. It's the single source of truth for sessions, credentials, and channel state. Clients (CLI, macOS app, web UI) and nodes (iOS, Android, headless) connect via WebSocket.
AI Providers: 15+ Backends with Auto-Discovery
Clawdbot ships with over 15 built-in model providers. You don't configure most of them — they're in the pi-ai catalog and activate when you set an API key.
Built-in Providers
| Provider | Auth | Example Model |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | anthropic/claude-opus-4-5 |
| OpenAI | OPENAI_API_KEY | openai/gpt-5.2 |
| OpenAI Codex | OAuth (ChatGPT) | openai-codex/gpt-5.2 |
| Google Gemini | GEMINI_API_KEY | google/gemini-3-pro-preview |
| OpenCode Zen | OPENCODE_API_KEY | opencode/claude-opus-4-5 |
| OpenRouter | OPENROUTER_API_KEY | openrouter/anthropic/claude-sonnet-4-5 |
| xAI | XAI_API_KEY | xai/grok-3 |
| Groq | GROQ_API_KEY | groq/llama-3.3-70b |
| Mistral | MISTRAL_API_KEY | mistral/mistral-large |
| Cerebras | CEREBRAS_API_KEY | cerebras/llama-3.3-70b |
| Z.AI | ZAI_API_KEY | zai/glm-4.7 |
| Vercel AI Gateway | AI_GATEWAY_API_KEY | vercel-ai-gateway/anthropic/claude-opus-4.5 |
| GitHub Copilot | GITHUB_TOKEN | github-copilot/gpt-4o |
| Venice AI | API key | venice/llama-3.3-70b |
| Ollama | None (local) | ollama/llama3.3 |
Auto-Discovery
Some providers are discovered automatically:
- Ollama: Detected when running locally at `http://127.0.0.1:11434/v1`. Pull a model with `ollama pull llama3.3` and Clawdbot finds it.
- Amazon Bedrock: Uses the AWS SDK credential chain — if your environment has valid AWS creds, Bedrock models are available.
- GitHub Copilot: Token exchange from `GITHUB_TOKEN` — reuses your existing Copilot subscription without separate API keys.
Custom Providers
Any OpenAI-compatible or Anthropic-compatible endpoint can be added via models.providers:
```json5
{
  models: {
    mode: "merge",
    providers: {
      "my-proxy": {
        baseUrl: "http://localhost:1234/v1",
        apiKey: "local-key",
        api: "openai-completions",
        models: [{ id: "my-model", name: "Local Model" }]
      }
    }
  }
}
```
This covers LM Studio, vLLM, LiteLLM, text-generation-webui, and any other local inference server. Model refs always use the provider/model format — anthropic/claude-opus-4-5, ollama/llama3.3, my-proxy/my-model.
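Since provider ids contain no slashes but model ids can (note the OpenRouter row in the table above), a ref decomposes on the first slash. Here is a sketch of that decomposition in TypeScript — illustrative only, not Clawdbot's actual parser:

```typescript
// Decompose a model ref: the provider is everything before the first
// slash; the rest is the provider-scoped model id, which may itself
// contain slashes.
function parseModelRef(ref: string): { provider: string; model: string } {
  const slash = ref.indexOf("/");
  if (slash < 0) throw new Error(`invalid model ref: ${ref}`);
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

parseModelRef("anthropic/claude-opus-4-5");
// → { provider: "anthropic", model: "claude-opus-4-5" }
parseModelRef("openrouter/anthropic/claude-sonnet-4-5");
// → { provider: "openrouter", model: "anthropic/claude-sonnet-4-5" }
```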
The 9-Layer Tool Policy Engine
This is where Clawdbot gets interesting from a security perspective. Every tool call passes through 9 layers of filtering before it can execute. The layers are evaluated in order — each can narrow the tool set but never widen it beyond what the previous layer allowed.
The Layers
- Layer 1: Tool Profile (base allowlist — minimal/coding/messaging/full)
- Layer 2: Provider Policy (per-provider or per-model restrictions)
- Layer 3: Global Allow/Deny (tools.allow / tools.deny in config)
- Layer 4: Agent Override (agents.list[].tools — per-agent policy)
- Layer 5: Group Policy (group chat restrictions)
- Layer 6: Sandbox Policy (sandboxed sessions get reduced tools)
- Layer 7: Sub-agent Policy (spawned sub-agents inherit restrictions)
- Layer 8: Skill Gating (skills check bins, env vars, config at load time)
- Layer 9: Runtime Guards (elevated exec, host vs sandbox routing)
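Conceptually, the engine folds the layers over the full tool set, and each layer can only shrink what it receives. A minimal sketch of that invariant (names are invented for illustration, not Clawdbot's actual API):

```typescript
// Sketch of monotonic narrowing: each layer receives the currently
// allowed set and may only shrink it.
type Layer = (allowed: Set<string>) => Set<string>;

function resolveTools(allTools: string[], layers: Layer[]): Set<string> {
  let allowed = new Set(allTools);
  for (const layer of layers) {
    const next = layer(allowed);
    // Enforce the invariant: intersect, so a layer can never re-add a
    // tool that an earlier layer removed.
    allowed = new Set([...next].filter((tool) => allowed.has(tool)));
  }
  return allowed;
}

// Example layers: a base profile, then a global deny list.
const codingProfile: Layer = () =>
  new Set(["read", "write", "edit", "exec", "process", "message"]);
const globalDeny: Layer = (allowed) =>
  new Set([...allowed].filter((tool) => tool !== "browser"));
```

The intersection is the whole trick: layer ordering affects readability, but no layer can widen the set it was handed.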
How They Work Together
Layer 1 — Tool Profiles set the base. Four profiles are available:
- `minimal` — Only `session_status`. For agents that shouldn't touch anything.
- `coding` — File I/O (`read`, `write`, `edit`), runtime (`exec`, `process`), sessions, memory, image analysis.
- `messaging` — Message sending, session management. No file or shell access.
- `full` — Everything. No restrictions.
Layer 2 — Provider Policy narrows per-model. If you're using a less capable model for some tasks, you can restrict its tools:
```json5
{
  tools: {
    profile: "coding",
    byProvider: {
      "google-antigravity": { profile: "minimal" }
    }
  }
}
```
This means: coding tools for most models, but only session_status when using Google Antigravity.
Layer 3 — Global Allow/Deny applies config-level overrides. deny always wins:
```json5
{
  tools: {
    deny: ["browser"] // No browser control, ever
  }
}
```
Layer 4 — Agent Override is per-agent. A support agent might only need messaging:
```json5
{
  agents: {
    list: [{
      id: "support",
      tools: { profile: "messaging", allow: ["slack"] }
    }]
  }
}
```
Layers 5-7 handle group chats, sandboxed sessions, and sub-agents — each applying additional restrictions.
Layer 8 — Skill Gating is load-time filtering based on metadata. Skills declare what they need:
```yaml
metadata: {"clawdbot":{"requires":{"bins":["uv"],"env":["GEMINI_API_KEY"],"config":["browser.enabled"]}}}
```
If uv isn't on PATH, the skill never loads.
Layer 9 — Runtime Guards handle the final routing decision: does this tool call run on the host, in a sandbox, or on a remote node?
Tool Groups
Tools are organized into semantic groups for policy declarations:
| Group | Tools |
|---|---|
| group:runtime | exec, bash, process |
| group:fs | read, write, edit, apply_patch |
| group:sessions | sessions_list, sessions_history, sessions_send, sessions_spawn, session_status |
| group:memory | memory_search, memory_get |
| group:web | web_search, web_fetch |
| group:ui | browser, canvas |
| group:automation | cron, gateway |
| group:messaging | message |
| group:nodes | nodes |
This means you can write deny: ["group:runtime"] to block all shell access — exec, bash, and process — in one line.
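Group references expand to their member tools before the allow/deny sets are compared. A sketch of that expansion (the table here is abbreviated and the names are illustrative):

```typescript
// Expand "group:*" entries in a policy list into concrete tool names.
const TOOL_GROUPS: Record<string, string[]> = {
  "group:runtime": ["exec", "bash", "process"],
  "group:fs": ["read", "write", "edit", "apply_patch"],
  "group:web": ["web_search", "web_fetch"],
};

function expandPolicy(entries: string[]): Set<string> {
  // Non-group entries pass through unchanged.
  return new Set(entries.flatMap((entry) => TOOL_GROUPS[entry] ?? [entry]));
}

expandPolicy(["group:runtime", "browser"]);
// → Set { "exec", "bash", "process", "browser" }
```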
Context Window Management
LLMs have finite context windows. Clawdbot runs long-lived sessions that can span hours or days, so it needs aggressive context management to avoid exceeding limits.
Cache-Aware Pruning
Anthropic's API supports prompt caching — previously sent content is cached and billed at a steep discount on subsequent turns. Clawdbot's pruning system is aware of this:
- Soft trim: When context approaches the limit, old tool results and verbose outputs are trimmed first. Cached content is preserved preferentially because it's cheaper to keep.
- Hard trim: When soft trimming isn't enough, older turns are dropped entirely. The system prioritizes keeping recent conversation and cached system prompt content.
- Cache TTL: Cached content has a time-to-live. The pruner tracks which content is still within the cache window and avoids dropping it unnecessarily.
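As a rough sketch, the soft-trim pass might prefer dropping uncached tool results oldest-first (field and function names here are invented, not Clawdbot's internals):

```typescript
// Rough sketch of cache-aware soft trimming.
interface Turn {
  tokens: number;
  cached: boolean;      // still within the provider's cache TTL
  isToolResult: boolean;
}

function softTrim(turns: Turn[], budget: number): Turn[] {
  let total = turns.reduce((sum, turn) => sum + turn.tokens, 0);
  const kept: Turn[] = [];
  for (const turn of turns) {
    // Drop uncached tool results oldest-first: they are the cheapest to
    // lose, and cached content is cheaper to keep than to resend later.
    if (total > budget && turn.isToolResult && !turn.cached) {
      total -= turn.tokens;
      continue;
    }
    kept.push(turn);
  }
  return kept; // if still over budget, hard trim drops whole turns next
}
```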
Pre-Compaction Memory Flush
Before conversation history is compacted (summarized to save space), Clawdbot automatically scans the about-to-be-dropped turns for durable information — preferences, decisions, context — and writes them to memory files on disk. This is the safety net described in the memory deep dive.
Compaction
When the context window fills beyond a threshold, Clawdbot compacts the oldest portion of conversation into a summary. The summary replaces the original turns, freeing context space while retaining the essential information. This lets sessions run indefinitely without losing critical context.
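Putting compaction and the pre-compaction flush together, the flow looks roughly like this (every name is an illustrative stand-in):

```typescript
// Sketch of compaction plus the pre-compaction memory flush.
type Turn = { role: string; content: string };

async function flushDurableFacts(turns: Turn[]): Promise<void> {
  // Stand-in: scan turns for preferences/decisions, write memory files.
}

async function summarize(turns: Turn[]): Promise<Turn> {
  // Stand-in: ask the model for a summary of the dropped turns.
  return { role: "system", content: `summary of ${turns.length} turns` };
}

async function compact(turns: Turn[], keepRecent: number): Promise<Turn[]> {
  const dropped = turns.slice(0, turns.length - keepRecent);
  const recent = turns.slice(turns.length - keepRecent);
  await flushDurableFacts(dropped);         // safety net: persist durable info first
  const summary = await summarize(dropped); // one summary turn replaces the originals
  return [summary, ...recent];
}
```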
Skills System
Skills are how Clawdbot learns to use external tools. Each skill is a directory with a SKILL.md containing YAML frontmatter and instructions. The agent reads the skill file on demand and follows the instructions.
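A skill directory might look something like the following hypothetical example (not one of the bundled skills; the exact frontmatter fields may differ):

```markdown
---
name: transcribe
description: Transcribe audio from video files using ffmpeg.
metadata: {"clawdbot":{"requires":{"bins":["ffmpeg"]}}}
---

# Transcribe

1. Extract the audio track: `ffmpeg -i input.mp4 -vn -ar 16000 audio.wav`
2. Run the transcription script in `scripts/transcribe.py` and return the text.
```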
Loading Pipeline
Skills load from three directories in priority order:
```
<workspace>/skills/    (highest priority — per-agent)
~/.clawdbot/skills/    (managed/local — shared across agents)
<install>/skills/      (bundled — shipped with Clawdbot)
```
If the same skill name exists in multiple locations, workspace wins. This lets you override bundled skills with customized versions.
Additional directories can be added via skills.load.extraDirs in config.
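For example (assuming `extraDirs` takes a list of paths, mirroring the config style above):

```json5
{
  skills: {
    load: {
      extraDirs: ["~/team-skills"] // scanned alongside the three defaults
    }
  }
}
```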
Eligibility Checks
Not every skill loads every time. At load time, Clawdbot checks:
- Binary requirements (`requires.bins`): Do `uv`, `gemini`, `ffmpeg`, etc. exist on PATH?
- Any-binary requirements (`requires.anyBins`): At least one of the listed binaries must exist.
- Environment variables (`requires.env`): Is `GEMINI_API_KEY` set?
- Config requirements (`requires.config`): Is `browser.enabled` truthy in the config?
- OS requirements (`os`): Is the skill limited to `darwin`, `linux`, or `win32`?
Skills that fail eligibility checks are silently excluded. The agent never sees them.
ClawdHub
ClawdHub is the public skills registry. Install skills directly:
```bash
clawdhub install <skill-name>
clawdhub update --all
```
Skills are treated as trusted code — they inject instructions into the agent's prompt and can reference scripts that run on the host. Only install skills you trust.
How Skills Reach the Agent
Skills don't run continuously. They're injected as descriptions in the system prompt. When the agent encounters a task that matches a skill description, it reads the full SKILL.md and follows the instructions. This is lazy loading — skills only consume context when actually used.
The system prompt includes an `<available_skills>` block listing each skill's name and description. The agent scans this list, picks the most relevant skill, reads it, and follows it. One skill at a time — no bulk loading.
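Conceptually, the injected block looks something like this (an illustrative shape, not the exact format):

```xml
<available_skills>
  <skill name="transcribe">Transcribe audio from video files using ffmpeg.</skill>
  <skill name="image-analysis">Analyze images with the Gemini API.</skill>
</available_skills>
```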
Docker Sandbox
Clawdbot can isolate tool execution in Docker containers. This is the "assume the model will do something dumb" defense layer.
Default Container Configuration
The sandbox runs with aggressive restrictions:
| Setting | Value | Effect |
|---|---|---|
| Root filesystem | Read-only | No persistent writes to system |
| Network | None | No outbound connections |
| Capabilities | Drop ALL | No privileged operations |
| User | Non-root | No uid 0 access |
This is a zero-trust container. The agent can read and write to its workspace directory (mounted as a volume), but cannot install packages, phone home, or escalate privileges.
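In plain Docker terms, the restrictions are roughly equivalent to running with these flags (the image name and mount path are placeholders; the Gateway manages the real invocation):

```bash
# Read-only rootfs, no network, all capabilities dropped, non-root user,
# with only the workspace mounted read-write.
docker run --rm \
  --read-only \
  --network none \
  --cap-drop ALL \
  --user 1000:1000 \
  -v "$WORKSPACE:/workspace:rw" \
  clawdbot-sandbox
```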
Sandbox Modes
Configuration controls when sandboxing activates:
"off"— No sandboxing. Everything runs on the host."non-main"— Sandbox only non-main sessions (group chats, sub-agents). Your private DM runs on the host."all"— Every session is sandboxed.
```json5
{
  agents: {
    defaults: {
      sandbox: {
        mode: "non-main",
        scope: "session", // One container per session
        workspaceAccess: "rw"
      }
    }
  }
}
```
Scope
"session"— One container per session. Full isolation between sessions."agent"— One container per agent. Sessions share state."shared"— One container for everything sandboxed.
Workspace Access
"none"— Sandbox gets its own workspace under~/.clawdbot/sandboxes. Completely isolated."ro"— Agent workspace mounted read-only. Can see files but not modify them."rw"— Agent workspace mounted read-write. Full access to the workspace.
Escape Hatches
Some tools can bypass the sandbox when explicitly allowed:
- Elevated exec (`tools.elevated`): Runs on the host even when the session is sandboxed. Gated by both global and per-agent policy.
- Host browser: `sandbox.browser.allowHostControl` lets sandboxed sessions target the host's browser.
- Custom bind mounts: Mount specific host directories into the container for controlled access.
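Loosely, in config terms (the value shapes here are assumptions, not the documented schema; only the key names come from the list above):

```json5
{
  tools: {
    elevated: ["exec"] // assumed shape: allowlist of tools that may run on the host
  },
  agents: {
    defaults: {
      sandbox: {
        browser: { allowHostControl: true } // sandboxed sessions may drive the host browser
      }
    }
  }
}
```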
The sandbox is not a perfect security boundary — Docker escapes exist. But it materially limits blast radius when the model does something unexpected, which is the realistic threat model for most users.
How It All Connects
A single message from Telegram flows through the full stack:
1. Telegram delivers the message to the Gateway via long-polling.
2. The Session Router maps it to the correct session (DM → main session, group → isolated session).
3. The Agent Engine assembles the system prompt: bootstrap files, project context, available skills.
4. The Provider Router selects the model (primary, with fallback if configured).
5. The Tool Policy Engine computes the available tool set for this session (all 9 layers).
6. The model generates a response, potentially requesting tool calls.
7. Tool calls pass through the policy engine again — denied tools are blocked before execution.
8. Execution routes to the host or the Docker sandbox depending on sandbox config.
9. Results flow back to the model. The cycle repeats until the model produces a final response.
10. The response routes back to the same Telegram chat. Deterministic — the model never chooses channels.
The entire flow is stateless from the model's perspective. The Gateway handles all the statefulness — sessions, memory, channel routing, tool policy.
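Stripped to its essentials, steps 6 through 9 are the standard tool-calling loop. A sketch (every name here is an invented stand-in, not Clawdbot's internals):

```typescript
// The tool-calling cycle from steps 6-9, reduced to a sketch.
type ToolCall = { name: string; args: unknown };
type ModelOutput = { text: string; toolCalls: ToolCall[] };
type Message = { role: string; content: unknown };

declare function runModel(messages: Message[], tools: string[]): Promise<ModelOutput>;
declare function isAllowed(tool: string): boolean;          // the 9-layer policy
declare function execute(call: ToolCall): Promise<unknown>; // host or sandbox

async function agentTurn(messages: Message[], tools: string[]): Promise<string> {
  while (true) {
    const out = await runModel(messages, tools);
    if (out.toolCalls.length === 0) return out.text; // final response, back to the channel
    for (const call of out.toolCalls) {
      // Policy is enforced again at execution time, not just at prompt time.
      const result = isAllowed(call.name)
        ? await execute(call)
        : { error: `tool ${call.name} denied by policy` };
      messages.push({ role: "tool", content: { call: call.name, result } });
    }
  }
}
```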
Practical Implications
For Security
The 9-layer tool policy isn't just defensive — it's how you run multi-agent setups safely. A support bot that only needs Slack access gets profile: "messaging" with allow: ["slack"]. It physically cannot run shell commands or read files, even if prompt-injected. The policy is enforced at the Gateway level, not by the model's willingness to comply.
Combined with Docker sandboxing for non-main sessions, you get defense-in-depth that doesn't depend on the model being well-behaved.
For Flexibility
With 15+ providers, you're not locked into any vendor. Run Opus for complex tasks, Llama locally for quick lookups, and a Groq-hosted model for fast iteration. Switch with a single config change or a /model command.
For Reliability
Context management with cache-aware pruning and pre-compaction memory flushes means sessions can run for days without degradation. The agent doesn't "forget" because it ran out of context window — it gracefully compacts and saves what matters.
What's Next
The next post in this series will cover Clawdbot's channel system and messaging architecture — how a single Gateway manages WhatsApp, Telegram, Discord, Slack, Signal, and iMessage with deterministic routing and per-channel policy.
Series
- How Clawdbot Remembers: Memory Architecture
- Inside Clawdbot: Agent System & AI Providers (this post)
- Channel System & Messaging (coming soon)