Inside Clawdbot: How the Agent System Routes Models, Tools, and Skills


This is the second post in my Clawdbot deep dive series (first: How Clawdbot Remembers). Here I'm looking at the agent system itself — how Clawdbot selects models, gates tool access through multiple policy layers, manages the context window under pressure, loads skills dynamically, and isolates execution in Docker containers.

The source analysis covers ~47K lines of code across the open-source codebase. Let's get into it.


Architecture Overview

Clawdbot runs a single long-lived Gateway daemon that owns all messaging surfaces (Telegram, Discord, WhatsApp, Slack, Signal, iMessage). The Gateway connects to LLM providers, manages sessions, and routes tool calls through a policy engine before executing them.

┌────────────────────────────────────────────────────────┐
│                     Gateway Daemon                     │
│                                                        │
│  ┌──────────┐  ┌──────────┐  ┌────────────┐            │
│  │ Telegram │  │ Discord  │  │ WhatsApp…  │            │
│  └────┬─────┘  └────┬─────┘  └─────┬──────┘            │
│       │             │              │                   │
│       ▼             ▼              ▼                   │
│  ┌─────────────────────────────────────────┐           │
│  │             Session Router              │           │
│  │    (per-user DM sessions, per-group)    │           │
│  └──────────────────┬──────────────────────┘           │
│                     │                                  │
│                     ▼                                  │
│  ┌─────────────────────────────────────────┐           │
│  │              Agent Engine               │           │
│  │                                         │           │
│  │  ┌────────────┐  ┌──────────────────┐   │           │
│  │  │  Provider  │  │  Tool Policy     │   │           │
│  │  │  Router    │  │  Engine (9-layer)│   │           │
│  │  └──────┬─────┘  └────────┬─────────┘   │           │
│  │         │                 │             │           │
│  │         ▼                 ▼             │           │
│  │  ┌────────────┐  ┌──────────────────┐   │           │
│  │  │  15+ LLM   │  │  Skills Loader   │   │           │
│  │  │  Providers │  │  (3 directories) │   │           │
│  │  └────────────┘  └──────────────────┘   │           │
│  └──────────────────┬──────────────────────┘           │
│                     │                                  │
│                     ▼                                  │
│  ┌─────────────────────────────────────────┐           │
│  │             Execution Layer             │           │
│  │  ┌─────────┐  ┌───────────────────────┐ │           │
│  │  │  Host   │  │  Docker Sandbox       │ │           │
│  │  │  (bare) │  │  (read-only, no net,  │ │           │
│  │  │         │  │   cap drop ALL)       │ │           │
│  │  └─────────┘  └───────────────────────┘ │           │
│  └─────────────────────────────────────────┘           │
└────────────────────────────────────────────────────────┘

One Gateway per host. It's the single source of truth for sessions, credentials, and channel state. Clients (CLI, macOS app, web UI) and nodes (iOS, Android, headless) connect via WebSocket.


AI Providers: 15+ Backends with Auto-Discovery

Clawdbot ships with over 15 built-in model providers. You don't configure most of them — they're in the pi-ai catalog and activate when you set an API key.

Built-in Providers

Provider             Auth                  Example Model
Anthropic            ANTHROPIC_API_KEY     anthropic/claude-opus-4-5
OpenAI               OPENAI_API_KEY        openai/gpt-5.2
OpenAI Codex         OAuth (ChatGPT)       openai-codex/gpt-5.2
Google Gemini        GEMINI_API_KEY        google/gemini-3-pro-preview
OpenCode Zen         OPENCODE_API_KEY      opencode/claude-opus-4-5
OpenRouter           OPENROUTER_API_KEY    openrouter/anthropic/claude-sonnet-4-5
xAI                  XAI_API_KEY           xai/grok-3
Groq                 GROQ_API_KEY          groq/llama-3.3-70b
Mistral              MISTRAL_API_KEY       mistral/mistral-large
Cerebras             CEREBRAS_API_KEY      cerebras/llama-3.3-70b
Z.AI                 ZAI_API_KEY           zai/glm-4.7
Vercel AI Gateway    AI_GATEWAY_API_KEY    vercel-ai-gateway/anthropic/claude-opus-4.5
GitHub Copilot       GITHUB_TOKEN          github-copilot/gpt-4o
Venice AI            API key               venice/llama-3.3-70b
Ollama               None (local)          ollama/llama3.3

Auto-Discovery

Some providers are discovered automatically:

  • Ollama: Detected when running locally at http://127.0.0.1:11434/v1. Pull a model with ollama pull llama3.3 and Clawdbot finds it.
  • Amazon Bedrock: Uses AWS SDK credential chain — if your environment has valid AWS creds, Bedrock models are available.
  • GitHub Copilot: Token exchange from GITHUB_TOKEN — reuses your existing Copilot subscription without separate API keys.

Custom Providers

Any OpenAI-compatible or Anthropic-compatible endpoint can be added via models.providers:

{
  models: {
    mode: "merge",
    providers: {
      "my-proxy": {
        baseUrl: "http://localhost:1234/v1",
        apiKey: "local-key",
        api: "openai-completions",
        models: [{ id: "my-model", name: "Local Model" }]
      }
    }
  }
}

This covers LM Studio, vLLM, LiteLLM, text-generation-webui, and any other local inference server. Model refs always use the provider/model format — anthropic/claude-opus-4-5, ollama/llama3.3, my-proxy/my-model.
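One subtlety: model ids can themselves contain slashes (the OpenRouter example above), so a ref has to be split on the first slash only. A minimal sketch of that parsing; `parseModelRef` is a hypothetical helper, not Clawdbot's actual API:

```typescript
// Split a "provider/model" ref on the FIRST slash only, since model ids
// like "anthropic/claude-sonnet-4-5" can contain slashes themselves.
function parseModelRef(ref: string): { provider: string; model: string } {
  const i = ref.indexOf("/");
  if (i < 0) throw new Error(`invalid model ref: ${ref}`);
  return { provider: ref.slice(0, i), model: ref.slice(i + 1) };
}
```

So `parseModelRef("openrouter/anthropic/claude-sonnet-4-5")` yields provider `openrouter` and model `anthropic/claude-sonnet-4-5`.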


The 9-Layer Tool Policy Engine

This is where Clawdbot gets interesting from a security perspective. Every tool call passes through 9 layers of filtering before it can execute. The layers are evaluated in order — each can narrow the tool set but never widen it beyond what the previous layer allowed.

The Layers

Layer 1: Tool Profile      (base allowlist — minimal/coding/messaging/full)
Layer 2: Provider Policy   (per-provider or per-model restrictions)
Layer 3: Global Allow/Deny (tools.allow / tools.deny in config)
Layer 4: Agent Override    (agents.list[].tools — per-agent policy)
Layer 5: Group Policy      (group chat restrictions)
Layer 6: Sandbox Policy    (sandboxed sessions get reduced tools)
Layer 7: Sub-agent Policy  (spawned sub-agents inherit restrictions)
Layer 8: Skill Gating      (skills check bins, env vars, config at load time)
Layer 9: Runtime Guards    (elevated exec, host vs sandbox routing)
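The narrowing-only rule can be sketched as successive set intersections plus deletions, with deny winning over allow at each step. This is an illustrative model of the evaluation order, not Clawdbot's implementation:

```typescript
// Each layer may narrow the tool set but never widen it beyond what the
// previous layer allowed. Within a layer, deny entries win over allow.
type Layer = { allow?: string[]; deny?: string[] };

function applyLayers(base: Set<string>, layers: Layer[]): Set<string> {
  let tools = new Set(base);
  for (const layer of layers) {
    if (layer.allow) {
      // Intersect: keep only tools this layer explicitly allows.
      const allowed = new Set(layer.allow);
      tools = new Set(Array.from(tools).filter((t) => allowed.has(t)));
    }
    // Deny always wins, even over an allow in the same layer.
    for (const d of layer.deny ?? []) tools.delete(d);
  }
  return tools;
}
```

A later layer can never resurrect a tool an earlier layer removed, because each layer only ever starts from the previous layer's output.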

How They Work Together

Layer 1 — Tool Profiles set the base. Four profiles are available:

  • minimal — Only session_status. For agents that shouldn't touch anything.
  • coding — File I/O (read, write, edit), runtime (exec, process), sessions, memory, image analysis.
  • messaging — Message sending, session management. No file or shell access.
  • full — Everything. No restrictions.

Layer 2 — Provider Policy narrows per-model. If you're using a less capable model for some tasks, you can restrict its tools:

{
  tools: {
    profile: "coding",
    byProvider: {
      "google-antigravity": { profile: "minimal" }
    }
  }
}

This means: coding tools for most models, but only session_status when using Google Antigravity.

Layer 3 — Global Allow/Deny applies config-level overrides. deny always wins:

{
  tools: {
    deny: ["browser"]  // No browser control, ever
  }
}

Layer 4 — Agent Override is per-agent. A support agent might only need messaging:

{
  agents: {
    list: [{
      id: "support",
      tools: { profile: "messaging", allow: ["slack"] }
    }]
  }
}

Layers 5-7 handle group chats, sandboxed sessions, and sub-agents — each applying additional restrictions.

Layer 8 — Skill Gating is load-time filtering based on metadata. Skills declare what they need:

metadata: {"clawdbot":{"requires":{"bins":["uv"],"env":["GEMINI_API_KEY"],"config":["browser.enabled"]}}}

If uv isn't on PATH, the skill never loads.

Layer 9 — Runtime Guards handle the final routing decision: does this tool call run on the host, in a sandbox, or on a remote node?

Tool Groups

Tools are organized into semantic groups for policy declarations:

Group               Tools
group:runtime       exec, bash, process
group:fs            read, write, edit, apply_patch
group:sessions      sessions_list, sessions_history, sessions_send, sessions_spawn, session_status
group:memory        memory_search, memory_get
group:web           web_search, web_fetch
group:ui            browser, canvas
group:automation    cron, gateway
group:messaging     message
group:nodes         nodes

This means you can write deny: ["group:runtime"] to block all shell access — exec, bash, and process — in one line.
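Group expansion is straightforwardly a lookup-and-flatten over the policy list; a sketch under the assumption that unknown entries are passed through as plain tool names (only a few groups from the table are included here):

```typescript
// Expand "group:*" entries in an allow/deny list into concrete tool names.
const TOOL_GROUPS: Record<string, string[]> = {
  "group:runtime": ["exec", "bash", "process"],
  "group:fs": ["read", "write", "edit", "apply_patch"],
  "group:web": ["web_search", "web_fetch"],
};

function expandPolicy(entries: string[]): string[] {
  // Non-group entries (plain tool names) pass through unchanged.
  return entries.flatMap((e) => TOOL_GROUPS[e] ?? [e]);
}
```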


Context Window Management

LLMs have finite context windows. Clawdbot runs long-lived sessions that can span hours or days, so it needs aggressive context management to avoid exceeding limits.

Cache-Aware Pruning

Anthropic's API supports prompt caching — previously sent content is cached and doesn't count against token costs on subsequent turns. Clawdbot's pruning system is aware of this:

  • Soft trim: When context approaches the limit, old tool results and verbose outputs are trimmed first. Cached content is preserved preferentially because it's cheaper to keep.
  • Hard trim: When soft trimming isn't enough, older turns are dropped entirely. The system prioritizes keeping recent conversation and cached system prompt content.
  • Cache TTL: Cached content has a time-to-live. The pruner tracks which content is still within the cache window and avoids dropping it unnecessarily.
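The soft-trim/hard-trim interplay might look roughly like this; the token budget, the 100-token cap, and the `Turn` shape are made-up illustrations, not Clawdbot's real data model:

```typescript
interface Turn { tokens: number; cached: boolean; isToolResult: boolean }

function prune(turns: Turn[], budget: number): Turn[] {
  const total = (ts: Turn[]) => ts.reduce((n, t) => n + t.tokens, 0);
  const out = turns.map((t) => ({ ...t })); // don't mutate the caller's history
  // Soft trim: cap verbose, non-cached tool results first.
  for (const t of out) {
    if (total(out) <= budget) break;
    if (t.isToolResult && !t.cached) t.tokens = Math.min(t.tokens, 100);
  }
  // Hard trim: drop the oldest non-cached turns entirely.
  while (total(out) > budget) {
    const i = out.findIndex((t) => !t.cached);
    if (i < 0) break; // everything left is cached; keep it
    out.splice(i, 1);
  }
  return out;
}
```

The key property is the ordering: cheap reductions (truncating tool output) happen before destructive ones (dropping turns), and cached content is the last thing to go.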

Pre-Compaction Memory Flush

Before conversation history is compacted (summarized to save space), Clawdbot automatically scans the about-to-be-dropped turns for durable information — preferences, decisions, context — and writes them to memory files on disk. This is the safety net described in the memory deep dive.

Compaction

When the context window fills beyond a threshold, Clawdbot compacts the oldest portion of conversation into a summary. The summary replaces the original turns, freeing context space while retaining the essential information. This lets sessions run indefinitely without losing critical context.


Skills System

Skills are how Clawdbot learns to use external tools. Each skill is a directory with a SKILL.md containing YAML frontmatter and instructions. The agent reads the skill file on demand and follows the instructions.

Loading Pipeline

Skills load from three directories in priority order:

<workspace>/skills/     (highest priority — per-agent)
~/.clawdbot/skills/     (managed/local — shared across agents)
<install>/skills/       (bundled — shipped with Clawdbot)

If the same skill name exists in multiple locations, workspace wins. This lets you override bundled skills with customized versions.

Additional directories can be added via skills.load.extraDirs in config.
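The first-wins resolution across directories can be sketched as a single pass in priority order; `resolveSkills` is a hypothetical helper:

```typescript
// Given directories in priority order (each paired with the skill names it
// contains), the first directory that defines a name wins.
function resolveSkills(dirs: Array<[dir: string, skills: string[]]>): Map<string, string> {
  const resolved = new Map<string, string>();
  for (const [dir, names] of dirs) {
    for (const name of names) {
      if (!resolved.has(name)) resolved.set(name, dir);
    }
  }
  return resolved;
}
```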

Eligibility Checks

Not every skill loads every time. At load time, Clawdbot checks:

  1. Binary requirements (requires.bins): Does uv, gemini, ffmpeg, etc. exist on PATH?
  2. Any-binary requirements (requires.anyBins): At least one of the listed binaries must exist.
  3. Environment variables (requires.env): Is GEMINI_API_KEY set?
  4. Config requirements (requires.config): Is browser.enabled truthy in the config?
  5. OS requirements (os): Is the skill limited to darwin, linux, or win32?

Skills that fail eligibility checks are silently excluded. The agent never sees them.
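The five checks above compose into a single predicate. A sketch with the environment injected as a context object so the logic is testable; the `Requires` shape mirrors the metadata fields, but the function itself is illustrative:

```typescript
interface Requires {
  bins?: string[];     // every listed binary must be on PATH
  anyBins?: string[];  // at least one must be on PATH
  env?: string[];      // every listed env var must be set
  config?: string[];   // every listed config key must be truthy
  os?: string[];       // platform must be one of these, if given
}

function isEligible(
  req: Requires,
  ctx: { bins: Set<string>; env: Set<string>; config: Record<string, unknown>; platform: string },
): boolean {
  if (req.bins?.some((b) => !ctx.bins.has(b))) return false;
  if (req.anyBins && !req.anyBins.some((b) => ctx.bins.has(b))) return false;
  if (req.env?.some((e) => !ctx.env.has(e))) return false;
  if (req.config?.some((k) => !ctx.config[k])) return false;
  if (req.os && !req.os.includes(ctx.platform)) return false;
  return true;
}
```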

ClawdHub

ClawdHub is the public skills registry. Install skills directly:

clawdhub install <skill-name>
clawdhub update --all

Skills are treated as trusted code — they inject instructions into the agent's prompt and can reference scripts that run on the host. Only install skills you trust.

How Skills Reach the Agent

Skills don't run continuously. They're injected as descriptions in the system prompt. When the agent encounters a task that matches a skill description, it reads the full SKILL.md and follows the instructions. This is lazy loading — skills only consume context when actually used.

The system prompt includes an <available_skills> block listing each skill's name and description. The agent scans this list, picks the most relevant skill, reads it, and follows it. One skill at a time — no bulk loading.
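Rendering that block is essentially a name-plus-description listing; a sketch of what the injected text might look like (the exact format inside `<available_skills>` is an assumption):

```typescript
interface SkillMeta { name: string; description: string }

// Only names and one-line descriptions reach the system prompt; the full
// SKILL.md is read on demand when the agent picks a skill.
function renderAvailableSkills(skills: SkillMeta[]): string {
  const lines = skills.map((s) => `- ${s.name}: ${s.description}`);
  return `<available_skills>\n${lines.join("\n")}\n</available_skills>`;
}
```

This is what keeps skills cheap: a hundred installed skills cost a hundred description lines, not a hundred full instruction files.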


Docker Sandbox

Clawdbot can isolate tool execution in Docker containers. This is the "assume the model will do something dumb" defense layer.

Default Container Configuration

The sandbox runs with aggressive restrictions:

Setting            Value        Effect
Root filesystem    Read-only    No persistent writes to system
Network            None         No outbound connections
Capabilities       Drop ALL     No privileged operations
User               Non-root     No uid 0 access

This is a zero-trust container. The agent can read and write to its workspace directory (mounted as a volume), but cannot install packages, phone home, or escalate privileges.
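Those defaults correspond to a `docker run` invocation roughly like the one assembled below; the image name `clawdbot-sandbox` and the uid/gid are illustrative, not the project's actual values:

```typescript
// Build the argument list for a zero-trust `docker run` matching the table:
// read-only rootfs, no network, all capabilities dropped, non-root user,
// with the workspace volume as the only writable mount.
function sandboxArgs(workspace: string): string[] {
  return [
    "run", "--rm",
    "--read-only",          // read-only root filesystem
    "--network", "none",    // no outbound connections
    "--cap-drop", "ALL",    // no privileged operations
    "--user", "1000:1000",  // non-root uid/gid (illustrative)
    "-v", `${workspace}:/workspace:rw`,
    "clawdbot-sandbox",     // illustrative image name
  ];
}
```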

Sandbox Modes

Configuration controls when sandboxing activates:

  • "off" — No sandboxing. Everything runs on the host.
  • "non-main" — Sandbox only non-main sessions (group chats, sub-agents). Your private DM runs on the host.
  • "all" — Every session is sandboxed.

{
  agents: {
    defaults: {
      sandbox: {
        mode: "non-main",
        scope: "session",  // One container per session
        workspaceAccess: "rw"
      }
    }
  }
}

Scope

  • "session" — One container per session. Full isolation between sessions.
  • "agent" — One container per agent. Sessions share state.
  • "shared" — One container for everything sandboxed.

Workspace Access

  • "none" — Sandbox gets its own workspace under ~/.clawdbot/sandboxes. Completely isolated.
  • "ro" — Agent workspace mounted read-only. Can see files but not modify them.
  • "rw" — Agent workspace mounted read-write. Full access to the workspace.

Escape Hatches

Some tools can bypass the sandbox when explicitly allowed:

  • Elevated exec (tools.elevated): Runs on the host even when the session is sandboxed. Gated by both global and per-agent policy.
  • Host browser: sandbox.browser.allowHostControl lets sandboxed sessions target the host's browser.
  • Custom bind mounts: Mount specific host directories into the container for controlled access.

The sandbox is not a perfect security boundary — Docker escapes exist. But it materially limits blast radius when the model does something unexpected, which is the realistic threat model for most users.


How It All Connects

A single message from Telegram flows through the full stack:

  1. Telegram delivers the message to the Gateway via long-polling.
  2. Session Router maps it to the correct session (DM → main session, group → isolated session).
  3. Agent Engine assembles the system prompt: bootstrap files, project context, available skills.
  4. Provider Router selects the model (primary, with fallback if configured).
  5. Tool Policy Engine computes the available tool set for this session (all 9 layers).
  6. The model generates a response, potentially requesting tool calls.
  7. Tool calls pass through the policy engine again — denied tools are blocked before execution.
  8. Execution routes to host or Docker sandbox depending on sandbox config.
  9. Results flow back to the model. The cycle repeats until the model produces a final response.
  10. The response routes back to the same Telegram chat. Deterministic — the model never chooses channels.

The entire flow is stateless from the model's perspective. The Gateway handles all the statefulness — sessions, memory, channel routing, tool policy.


Practical Implications

For Security

The 9-layer tool policy isn't just defensive — it's how you run multi-agent setups safely. A support bot that only needs Slack access gets profile: "messaging" with allow: ["slack"]. It physically cannot run shell commands or read files, even if prompt-injected. The policy is enforced at the Gateway level, not by the model's willingness to comply.

Combined with Docker sandboxing for non-main sessions, you get defense-in-depth that doesn't depend on the model being well-behaved.

For Flexibility

15+ providers means you're not locked into any vendor. Run Opus for complex tasks, Llama locally for quick lookups, and a Groq-hosted model for fast iteration. Switch with a single config change or a /model command.

For Reliability

Context management with cache-aware pruning and pre-compaction memory flushes means sessions can run for days without degradation. The agent doesn't "forget" because it ran out of context window — it gracefully compacts and saves what matters.


What's Next

The next post in this series will cover Clawdbot's channel system and messaging architecture — how a single Gateway manages WhatsApp, Telegram, Discord, Slack, Signal, and iMessage with deterministic routing and per-channel policy.

Series

  1. How Clawdbot Remembers: Memory Architecture
  2. Inside Clawdbot: Agent System & AI Providers (this post)
  3. Channel System & Messaging (coming soon)

Resources