Inside Clawdbot: The Channel & Messaging System That Speaks Seven Protocols

This is the fourth post in my Clawdbot deep dive series. Previous: Agent System & AI Providers. Here I'm looking at the channel and messaging system — how Clawdbot abstracts seven messaging platforms behind a single contract, routes messages through a unified pipeline, and handles the wildly different capabilities of each platform.

The source analysis covers the channel abstraction layer, plugin registry, all platform implementations, and the full message routing pipeline across the open-source codebase. Let's get into it.


Architecture Overview

Clawdbot's messaging system is built around a channel plugin architecture with three layers: the agent/LLM layer at the top, a channel abstraction layer in the middle, and platform-specific protocol implementations at the bottom.

┌──────────────────────────────────────────────────────────────┐
│ Agent / LLM Layer  (auto-reply, commands, tools, dispatch)   │
├──────────────────────────────────────────────────────────────┤
│ Channel Abstraction Layer                                    │
│  ┌──────────┐  ┌──────────┐  ┌───────────────┐  ┌─────────┐  │
│  │   Dock   │  │ Registry │  │ Outbound Adpt │  │ Actions │  │
│  │ (light)  │  │  (full)  │  │  (send/poll)  │  │(msg ops)│  │
│  └──────────┘  └──────────┘  └───────────────┘  └─────────┘  │
├──────────────────────────────────────────────────────────────┤
│ Platform Protocol Layer                                      │
│  ┌────────┐ ┌─────────┐ ┌──────┐ ┌────────┐ ┌─────┐ ┌────┐   │
│  │Telegram│ │ Discord │ │Signal│ │WhatsApp│ │Slack│ │LINE│   │
│  │ grammY │ │ Carbon  │ │ RPC  │ │Baileys │ │Bolt │ │SDK │   │
│  └────────┘ └─────────┘ └──────┘ └────────┘ └─────┘ └────┘   │
└──────────────────────────────────────────────────────────────┘

Four design principles drive this:

  • Two-tier weight system: "Dock" (lightweight metadata) vs "Plugin" (full implementation) — shared code never imports heavy platform SDKs
  • Plugin registry: All channels register through a unified PluginRegistry, whether bundled or installed via npm
  • Outbound adapters: Separate, lazy-loadable send adapters avoid importing heavy monitors just to send a message
  • Unified inbound pipeline: Every platform normalizes messages into the same MsgContext format before hitting the agent

The Channel Abstraction Layer

The ChannelPlugin Interface

Every channel implements a single contract: ChannelPlugin. It's defined at src/channels/plugins/types.plugin.ts and contains over 20 adapter slots:

export type ChannelPlugin<ResolvedAccount = any> = {
  id: ChannelId;                           // "telegram" | "discord" | etc.
  meta: ChannelMeta;                       // UI labels, docs paths, ordering
  capabilities: ChannelCapabilities;       // Feature flags
  defaults?: { queue?: { debounceMs?: number } };
  reload?: { configPrefixes: string[] };   // Hot-reload triggers

  // Adapter slots (all optional except config):
  config: ChannelConfigAdapter;            // Account resolution (required)
  onboarding?: ChannelOnboardingAdapter;   // CLI setup wizard
  pairing?: ChannelPairingAdapter;         // DM pairing flow
  security?: ChannelSecurityAdapter;       // DM policy, warnings
  groups?: ChannelGroupAdapter;            // Group mention/policy
  mentions?: ChannelMentionAdapter;        // @mention stripping
  outbound?: ChannelOutboundAdapter;       // Send text/media/polls
  gateway?: ChannelGatewayAdapter;         // Start/stop account lifecycle
  streaming?: ChannelStreamingAdapter;     // Block streaming coalescing
  threading?: ChannelThreadingAdapter;     // Reply-to-mode, tool context
  messaging?: ChannelMessagingAdapter;     // Target normalization
  directory?: ChannelDirectoryAdapter;     // User/group directory
  actions?: ChannelMessageActionAdapter;   // Message action dispatch
  // ... plus auth, elevated, commands, status, agentPrompt, heartbeat
};

Only config is required. Everything else is optional — channels implement what they support and skip the rest. This keeps lightweight channels (like a hypothetical Matrix plugin) from needing to stub out 20 empty methods.

ChannelCapabilities

Each channel declares its feature set with a capabilities object:

export type ChannelCapabilities = {
  chatTypes: Array<"direct" | "group" | "channel" | "thread">;
  polls?: boolean;
  reactions?: boolean;
  edit?: boolean;
  unsend?: boolean;
  reply?: boolean;
  effects?: boolean;        // Message effects (iMessage tapbacks)
  threads?: boolean;        // Discord/Slack forum threads
  media?: boolean;
  nativeCommands?: boolean; // Slash commands
  blockStreaming?: boolean; // Streaming response delivery
};

Here's how this plays out across platforms:

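Capability     | Telegram | Discord | Signal | WhatsApp | Slack | iMessage | LINE
---------------|----------|---------|--------|----------|-------|----------|-----
polls          |          |         |        |          |       |          |
reactions      | Limited¹ |         |        |          |       |          |
nativeCommands |          |         |        |          |       |          |
threads        |          |         |        |          |       |          |
blockStreaming |          |         |        |          |       |          |
edit           |          |         |        |          |       |          |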

¹ Telegram receives reaction events, but sending reactions uses the setMessageReaction API rather than standard message compose.

The agent layer reads these capabilities to decide what's possible — it won't try to send a poll on Signal or edit a message on WhatsApp.
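
To make the gating concrete, here is a minimal sketch of how a caller might consult these flags before attempting an operation. Only the shape of ChannelCapabilities comes from the type above. The supports and pickPollStrategy helpers, the "text-fallback" strategy, and the sample capability values are hypothetical illustrations, not code from the Clawdbot source.

```typescript
type ChatType = "direct" | "group" | "channel" | "thread";

// Trimmed copy of the capabilities type shown above.
type ChannelCapabilities = {
  chatTypes: ChatType[];
  polls?: boolean;
  reactions?: boolean;
  edit?: boolean;
  media?: boolean;
};

type BooleanCapability = Exclude<keyof ChannelCapabilities, "chatTypes">;

// Hypothetical helper: an omitted flag is treated as "not supported".
function supports(caps: ChannelCapabilities, feature: BooleanCapability): boolean {
  return caps[feature] === true;
}

// Hypothetical decision point: use a native poll only where the channel declares one.
function pickPollStrategy(caps: ChannelCapabilities): "native-poll" | "text-fallback" {
  return supports(caps, "polls") ? "native-poll" : "text-fallback";
}

// Illustrative values only; the real declarations live in each channel's plugin.
const signalCaps: ChannelCapabilities = {
  chatTypes: ["direct", "group"],
  reactions: true,
  media: true,
};
const discordCaps: ChannelCapabilities = {
  chatTypes: ["direct", "channel", "thread"],
  polls: true,
  reactions: true,
};
```

Treating a missing flag as unsupported keeps the check safe for plugins that declare only a few capabilities.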


Dock vs Plugin: The Two-Tier Weight System

This is one of the more interesting architectural decisions. Clawdbot splits every channel into two tiers:

Dock (src/channels/dock.ts) — lightweight metadata and behavior that can be imported anywhere without pulling in platform SDKs:

export type ChannelDock = {
  id: ChannelId;
  capabilities: ChannelCapabilities;
  commands?: ChannelCommandAdapter;
  outbound?: { textChunkLimit?: number };
  streaming?: ChannelDockStreaming;
  groups?: ChannelGroupAdapter;
  mentions?: ChannelMentionAdapter;
  threading?: ChannelThreadingAdapter;
  agentPrompt?: ChannelAgentPromptAdapter;
};

Plugin — the full ChannelPlugin with gateway lifecycle, outbound adapters, and monitor implementations that import grammY, Carbon, Baileys, etc.

Why? Because shared code paths — reply flow, command auth, sandbox explain — need to know things about channels (mention patterns, text limits, threading defaults) without importing 50MB of platform SDKs. The dock contains inline implementations for every core channel. The plugin only loads at execution boundaries.

// Shared code imports from dock (cheap):
import { getChannelDock } from "./channels/dock.js";

// Only execution boundaries import full plugins (expensive):
import { getChannelPlugin } from "./channels/plugins/index.js";

The dock for each channel is either hardcoded in a DOCKS map (for core channels) or auto-generated from plugin metadata via buildDockFromPlugin().


Plugin Registry and Channel Registration

The PluginRegistry

All channel plugins register through the unified PluginRegistry at src/plugins/registry.ts. The registry isn't just for channels — it holds tools, hooks, providers, HTTP handlers, CLI registrars, and services. But channels get their own typed slot:

export type PluginRegistry = {
  plugins: PluginRecord[];
  channels: PluginChannelRegistration[];      // ← Channel plugins
  tools: PluginToolRegistration[];
  hooks: PluginHookRegistration[];
  providers: PluginProviderRegistration[];
  services: PluginServiceRegistration[];
  // ... httpHandlers, cliRegistrars, diagnostics, etc.
};

A channel registration pairs the full plugin with an optional dock override:

export type PluginChannelRegistration = {
  pluginId: string;
  plugin: ChannelPlugin;       // Full channel implementation
  dock?: ChannelDock;          // Optional lightweight override
  source: string;              // File path origin
};

Registration API

Plugins register via the ClawdbotPluginApi interface during initialization:

export function init(api: ClawdbotPluginApi) {
  api.registerChannel({
    plugin: myChannelPlugin,
    dock: myChannelDock,
  });
}

The registry enforces uniqueness — duplicate channel IDs produce diagnostic errors.
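
The uniqueness rule can be sketched roughly as follows. This is a simplified model, not the real registry: the types are cut down, the Diagnostic shape is my own, and I am assuming the first registration wins while the duplicate is rejected.

```typescript
type ChannelPlugin = { id: string };
type ChannelRegistration = { pluginId: string; plugin: ChannelPlugin; source: string };

// Hypothetical diagnostic record; the real shape is not shown in this post.
type Diagnostic = { level: "error"; pluginId: string; message: string };

type Registry = { channels: ChannelRegistration[]; diagnostics: Diagnostic[] };

function createRegistry(): Registry {
  return { channels: [], diagnostics: [] };
}

// Returns true when the channel was registered. If the channel ID is already
// taken, records a diagnostic error and leaves the existing entry untouched.
function registerChannel(registry: Registry, reg: ChannelRegistration): boolean {
  const existing = registry.channels.find((e) => e.plugin.id === reg.plugin.id);
  if (existing) {
    registry.diagnostics.push({
      level: "error",
      pluginId: reg.pluginId,
      message: `channel "${reg.plugin.id}" is already registered by ${existing.pluginId}`,
    });
    return false;
  }
  registry.channels.push(reg);
  return true;
}
```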

Plugin Discovery Pipeline

Plugins are discovered from four sources in priority order:

┌─────────────────────────────────────────────────────┐
│ 1. config     - paths in user config                │
│ 2. workspace  - local project node_modules          │
│ 3. global     - system-wide plugin directory        │
│ 4. bundled    - built into clawdbot                 │
└──────────────────────┬──────────────────────────────┘
                       ▼
               discoverPlugins()
                       ▼
              loadPluginModule()
                       ▼
        init(api) → api.registerChannel()
                       ▼
            PluginRegistry.channels[]

External channel plugins declare their metadata in package.json:

{
  "name": "clawdbot-channel-matrix",
  "clawdbot": {
    "channel": {
      "id": "matrix",
      "label": "Matrix",
      "selectionLabel": "Matrix (matrix-js-sdk)",
      "blurb": "Decentralized messaging via Matrix protocol.",
      "docsPath": "/channels/matrix",
      "aliases": ["element"],
      "order": 10
    },
    "install": {
      "npmSpec": "clawdbot-channel-matrix",
      "localPath": "../clawdbot-channel-matrix"
    }
  }
}

Lazy Loading

Channel plugins load lazily and cache per-registry:

export async function loadChannelPlugin(id: ChannelId) {
  const registry = getActivePluginRegistry();
  ensureCacheForRegistry(registry);
  const cached = cache.get(id);
  if (cached) return cached;
  const entry = registry?.channels.find((e) => e.plugin.id === id);
  if (entry) {
    cache.set(id, entry.plugin);
    return entry.plugin;
  }
  return undefined;
}

The cache invalidates automatically when the plugin registry changes (e.g., on hot reload).
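
One plausible way to get that invalidation is to key the cache on registry identity. The sketch below assumes exactly that; the module-level variables and setActivePluginRegistry are hypothetical stand-ins, and the real ensureCacheForRegistry may work differently.

```typescript
type ChannelPlugin = { id: string };
type PluginRegistry = { channels: Array<{ plugin: ChannelPlugin }> };

let activeRegistry: PluginRegistry | null = null;
let cachedFor: PluginRegistry | null = null;
const cache = new Map<string, ChannelPlugin>();

// Hypothetical setter, standing in for whatever swaps registries on hot reload.
function setActivePluginRegistry(registry: PluginRegistry): void {
  activeRegistry = registry;
}

// Drop every cached entry as soon as a different registry object is active.
function ensureCacheForRegistry(registry: PluginRegistry | null): void {
  if (registry !== cachedFor) {
    cache.clear();
    cachedFor = registry;
  }
}

function loadChannelPlugin(id: string): ChannelPlugin | undefined {
  ensureCacheForRegistry(activeRegistry);
  const cached = cache.get(id);
  if (cached) return cached;
  const entry = activeRegistry?.channels.find((e) => e.plugin.id === id);
  if (entry) cache.set(id, entry.plugin);
  return entry?.plugin;
}
```

Comparing object identity means no explicit invalidation call is needed: replacing the registry is enough.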


Message Routing & Delivery Pipeline

Inbound: Platform Event to Agent

Every channel follows the same inbound pipeline, regardless of platform:

Platform Event → Monitor/Handler → Normalize → Gate → Dispatch
                                       │
                                       ▼
                                  MsgContext
                                       │
                              ┌────────┴───────┐
                              │  resolveRoute  │ → agent + session key
                              │  allowList     │ → DM policy check
                              │  mentionCheck  │ → group require mention
                              │  commandCheck  │ → native command detection
                              └────────┬───────┘
                                       ▼
                           dispatchInboundMessage()
                                       ▼
                            createReplyDispatcher()
                                       ▼
                       LLM call → response → send reply

The normalization step is key. Whether the source is a grammY update, a Carbon gateway event, or a Baileys message, it all gets transformed into the same envelope format:

formatInboundEnvelope({
  channel: "signal",
  accountId: "default",
  senderId: "+14155551234",
  senderName: "Alice",
  chatType: "group",
  groupLabel: "Friends",
  text: "Hey @bot what's up?",
  mediaPath: "/tmp/image.jpg",
  replyToId: "1234567890",
});

Session routing uses the channel, account, peer kind, and peer ID to produce a deterministic session key:

const route = resolveAgentRoute({
  cfg,
  channel: "telegram",
  accountId: account.accountId,
  peer: { kind: isGroup ? "group" : "dm", id: peerId },
});
// → { agentId: "main", sessionKey: "agent:main:telegram:dm:123456" }
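
Judging by the example output, the key looks like a colon-joined tuple. The sketch below reproduces only the fields visible in that example key. buildSessionKey is a hypothetical helper, and how accountId factors into the real key is not shown here, so I leave it out.

```typescript
type PeerKind = "dm" | "group";
type RoutePeer = { kind: PeerKind; id: string };

// Hypothetical helper: the same inputs always yield the same key, which is
// what makes the session lookup deterministic.
function buildSessionKey(agentId: string, channel: string, peer: RoutePeer): string {
  return ["agent", agentId, channel, peer.kind, peer.id].join(":");
}

const exampleKey = buildSessionKey("main", "telegram", { kind: "dm", id: "123456" });
```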

Outbound: Agent Response to Platform

Outbound delivery goes through the ChannelOutboundAdapter:

Agent Response → resolveTarget() → chunk text → format → send
                                                          │
                                                          ▼
                                             Platform-specific API call

Three delivery modes exist:

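Mode    | Description                      | Used by
--------|----------------------------------|-------------------------------------------
direct  | Direct API call from the process | Telegram, Discord, Signal, Slack, iMessage
gateway | Route through the gateway daemon | WhatsApp
hybrid  | Try direct, fall back to gateway | Future use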

WhatsApp uses gateway delivery because the Baileys socket (headless WhatsApp Web) runs inside the gateway daemon — outbound messages route through it rather than making independent API calls.

Target Resolution

Each channel normalizes targets differently:

Channel  | Target format                       | Example
---------|-------------------------------------|---------------------------
Telegram | telegram:<chatId> or @username      | telegram:123456789
Discord  | Snowflake channel ID                | 1234567890123456789
Signal   | E.164 or group:<groupId>            | +14155551234
WhatsApp | E.164 JID or group JID              | 14155551234@s.whatsapp.net
Slack    | Channel ID                          | C0123456789
iMessage | Phone, email, chat_id, or chat_guid | +14155551234
LINE     | User/group/room ID                  | U1234...
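
As an illustration of what per-channel target normalization can look like, here is a small sketch for the WhatsApp row. toWhatsAppJid is a hypothetical helper written for this post; it only relies on the two JID suffixes named in the table and is not the function Clawdbot uses.

```typescript
// Hypothetical helper: turn a user-supplied target into a WhatsApp JID.
function toWhatsAppJid(target: string): string {
  const trimmed = target.trim();

  // Already a JID (direct chat or group): pass it through unchanged.
  if (trimmed.endsWith("@s.whatsapp.net") || trimmed.endsWith("@g.us")) {
    return trimmed;
  }

  // Otherwise treat the input as a phone number and keep only the digits.
  const digits = trimmed.replace(/\D/g, "");
  if (digits.length === 0) {
    throw new Error(`Not a WhatsApp target: ${target}`);
  }
  return `${digits}@s.whatsapp.net`;
}
```

With this shape, "+1 (415) 555-1234" and "14155551234@s.whatsapp.net" resolve to the same JID, so configuration can accept either form.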

Platform Implementations

Telegram (grammY)

The most feature-rich integration. Uses grammy with @grammyjs/runner for concurrent update processing and @grammyjs/transformer-throttler for rate limiting.

Key architecture decisions:

  • Dual mode: Long-polling via @grammyjs/runner (default) or webhook mode. Polling includes automatic restart with exponential backoff on conflicts or network errors.
  • Sequentialized processing: The sequentialize middleware routes updates to per-chat processing queues. Forum topics get their own sub-keys.
  • Update deduplication: A recentUpdates tracker plus lastUpdateId persistence prevents reprocessing after restarts.
  • Markdown → HTML pipeline: Converts markdown through an intermediate IR before rendering to Telegram's HTML subset. Auto-falls back to plain text when HTML parsing fails.
// Markdown → Telegram HTML via intermediate representation
const ir = markdownToIR(markdown, { linkify: true, headingStyle: "none" });
const html = renderMarkdownWithMarkers(ir, {
  styleMarkers: {
    bold: { open: "<b>", close: "</b>" },
    italic: { open: "<i>", close: "</i>" },
    code_block: { open: "<pre><code>", close: "</code></pre>" },
  },
  escapeText: escapeHtml,
});

Telegram-specific features: forum topic routing, inline keyboard buttons, sticker sending/search, message editing with keyboard updates, silent sends, HTTP proxy support, and block streaming for real-time response delivery.

Discord (Carbon)

Uses @buape/carbon with gateway WebSocket and discord-api-types/v10 for REST.

Gateway intents are configurable: Guilds | GuildMessages | MessageContent | DirectMessages | GuildMessageReactions | DirectMessageReactions, with optional GuildPresences and GuildMembers.

Discord has the broadest action surface: 50 registered message actions covering guild management, thread CRUD, moderation (timeout/kick/ban), role management, event creation, search, and more. Its hard limit of 2000 characters per message is the lowest of the platforms covered here.

Slash commands deploy via client.handleDeployRequest() with retry logic. Interactive buttons handle sandboxed command approvals.

WhatsApp (Baileys)

Uses @whiskeysockets/baileys — a headless WhatsApp Web client. This is the only channel that uses gateway-based delivery by default, since the Baileys socket must run persistently inside the gateway daemon.

Login uses QR code authentication rendered as images. The reconnection logic handles exponential backoff, connection state tracking, and crypto error detection.

WhatsApp-specific features: native polls (up to 12 options), configurable ack reactions on message receipt, auto JPEG compression for images, broadcast group support, and proper JID normalization between @s.whatsapp.net (DMs) and @g.us (groups).

Signal (signal-cli JSON-RPC)

A custom integration over the signal-cli daemon's HTTP interface: JSON-RPC 2.0 requests over HTTP, with SSE (Server-Sent Events) for inbound messages.

signal-cli daemon ← HTTP POST (JSON-RPC 2.0) ← Clawdbot
signal-cli daemon → SSE stream (/api/v1/events) → Clawdbot

Signal requires media to be saved to the local filesystem before sending, because file paths rather than URLs or buffers are passed to signal-cli. The SSE client reconnects automatically with backoff. Targets can be phone numbers, group IDs, or Signal usernames.
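
For readers unfamiliar with JSON-RPC 2.0, the outbound half boils down to POSTing an envelope like the one built below. The envelope fields (jsonrpc, id, method, params) are standard JSON-RPC 2.0. The "send" method and the recipient, message, and attachments parameter names are my assumptions about the signal-cli side, and both helper functions are hypothetical, so check them against the signal-cli documentation before relying on them.

```typescript
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
};

let nextRequestId = 1;

// Standard JSON-RPC 2.0 envelope; the id lets a response be matched to its request.
function buildRpcRequest(method: string, params: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextRequestId++, method, params };
}

// Assumed method and parameter names (not confirmed by this post).
function buildSendRequest(
  recipient: string,
  message: string,
  attachmentPaths: string[] = [],
): JsonRpcRequest {
  const params: Record<string, unknown> = { recipient: [recipient], message };
  if (attachmentPaths.length > 0) {
    // Local filesystem paths, matching the "save to disk first" rule above.
    params["attachments"] = attachmentPaths;
  }
  return buildRpcRequest("send", params);
}
```

The serialized request would then go out as the body of an HTTP POST with a JSON content type.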

Slack (Bolt)

Uses @slack/bolt with dual mode: Socket Mode (default, requires appToken) or HTTP Receiver (requires signingSecret).

Threading uses Slack's native thread_ts mechanism. The buildSlackThreadingToolContext() helper ensures tool responses stay in the correct thread. Slash commands are fully supported, including argument parsing and interactive menus. The text limit is 4000 characters.

iMessage (imsg RPC)

macOS only. Spawns an imsg rpc child process and communicates via newline-delimited JSON-RPC over stdin/stdout.

Supports service routing between iMessage and SMS (auto-selection). Targets can be phone numbers, email addresses, chat IDs, or chat GUIDs. Media is saved to disk and passed as file paths.

LINE (Bot SDK)

Webhook-based using @line/bot-sdk. LINE has a unique dual delivery model:

  • Reply: Uses a replyToken from the webhook event (free, expires quickly)
  • Push: Proactive sends to a user/group ID (consumes message quota)
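
The choice between the two modes can be sketched as a small decision function. Everything here is hypothetical: the PendingReply shape, the chooseLineDelivery name, and especially the 30-second freshness window, which is a placeholder I picked for illustration rather than a documented LINE value.

```typescript
type LineDelivery =
  | { mode: "reply"; replyToken: string }
  | { mode: "push"; to: string };

type PendingReply = {
  replyToken?: string;   // present only when the message came from a webhook event
  receivedAtMs: number;  // when the webhook event arrived
  sourceId: string;      // user, group, or room ID to push to
};

// Placeholder freshness window (assumption, not a LINE-documented number).
const REPLY_TOKEN_MAX_AGE_MS = 30000;

// Prefer the free reply path while the token is still fresh; otherwise
// fall back to a quota-consuming push.
function chooseLineDelivery(pending: PendingReply, nowMs: number): LineDelivery {
  const fresh = nowMs - pending.receivedAtMs <= REPLY_TOKEN_MAX_AGE_MS;
  if (pending.replyToken && fresh) {
    return { mode: "reply", replyToken: pending.replyToken };
  }
  return { mode: "push", to: pending.sourceId };
}
```

A slow LLM response is the typical case where the token has gone stale and the push path takes over.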

Rich message support includes Flex Messages (complex JSON layouts), Template Messages (carousels, confirms), Quick Reply buttons (up to 13 items, 20-char label limit), and Rich Menu management. Loading animation provides a typing indicator for up to 20 seconds.


Typing Indicators, Reactions & Polls

Typing Indicators

src/channels/typing.ts provides a unified typing callback factory. Each channel implements typing differently:

Channel  | Mechanism                       | Notes
---------|---------------------------------|--------------------------------
Telegram | sendChatAction("typing")        | Expires ~5s, needs refresh
Discord  | channel.sendTyping() REST       | Expires ~10s
Signal   | sendTyping RPC                  | Has explicit stop support
WhatsApp | sendPresenceUpdate("composing") | Explicit stop with "paused"
Slack    | No typing API                   |
iMessage | No typing API                   |
LINE     | showLoadingAnimation()          | Up to 20s or until next message

Reactions

Reactions serve two purposes: ack reactions (processing indicator when the bot starts working) and expressive reactions (agent-initiated emoji responses).

Ack reactions are scope-gated:

export type AckReactionScope =
  "all" | "direct" | "group-all" | "group-mentions" | "off" | "none";
// Default: "group-mentions" — only ack in groups when @mentioned
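
Here is how I read those scope values, written as a predicate. The AckReactionScope type is copied from above; the shouldAck helper and the AckContext shape are hypothetical, and the meaning I assign to each value (for example, "group-all" meaning every group message) is my interpretation of the names rather than confirmed behavior.

```typescript
type AckReactionScope =
  "all" | "direct" | "group-all" | "group-mentions" | "off" | "none";

type AckContext = { isGroup: boolean; wasMentioned: boolean };

// Hypothetical predicate: should the bot add an ack reaction to this message?
function shouldAck(scope: AckReactionScope, ctx: AckContext): boolean {
  if (scope === "all") return true;
  if (scope === "direct") return !ctx.isGroup;
  if (scope === "group-all") return ctx.isGroup;
  if (scope === "group-mentions") return ctx.isGroup && ctx.wasMentioned;
  // "off" and "none" both disable ack reactions.
  return false;
}
```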

Each platform sends reactions through different APIs:

Channel  | Method
---------|-------------------------------------------------------------------
Telegram | api.setMessageReaction(chatId, messageId, [{type:"emoji", emoji}])
Discord  | REST PUT /channels/{id}/messages/{id}/reactions/{emoji}/@me
Signal   | sendReaction RPC with sender + timestamp targeting
WhatsApp | Baileys sendMessage(jid, {react: {text, key}})
Slack    | reactions.add({channel, timestamp, name})
iMessage | imsg rpc react method

Polls

Only Discord and WhatsApp support native polls:

Channel  | Max Options | Implementation
---------|-------------|-------------------------------
Discord  | 10          | sendPollDiscord() via REST API
WhatsApp | 12          | sendPollWhatsApp() via Baileys

Media Handling Across Channels

Media flows through a common pipeline:

Media URL → loadWebMedia() → Buffer + ContentType
    ↓
mediaKindFromMime() → "image" | "video" | "audio" | "document"
    ↓
Channel-specific send method
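
The classification step in the middle of that pipeline is simple enough to sketch. The function name and the four kinds come from the flow above; the body is my own guess at a reasonable implementation (prefix matching on the MIME type, with "document" as the catch-all), not the actual source.

```typescript
type MediaKind = "image" | "video" | "audio" | "document";

function mediaKindFromMime(contentType: string | undefined): MediaKind {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mime = (contentType ?? "").replace(/;.*$/, "").trim().toLowerCase();

  if (mime.startsWith("image/")) return "image";
  if (mime.startsWith("video/")) return "video";
  if (mime.startsWith("audio/")) return "audio";

  // Anything else (PDFs, archives, unknown types) is sent as a generic file.
  return "document";
}
```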

Per-channel media limits are resolved through resolveChannelMediaMaxBytes():

Channel  | Default Max | Notes
---------|-------------|-------------------------------------
Telegram | 5 MB        | Bot API: ~50MB download, 10MB upload
Discord  | 8 MB        | Boost-dependent (25/50/100 MB)
Signal   | 8 MB        | File attachment limit
WhatsApp | 16 MB       | Auto-compresses images to JPEG
Slack    | 8 MB        | Varies by workspace plan
iMessage | 16 MB       | Local filesystem path-based
LINE     | 10 MB       | Content URLs must be HTTPS

Telegram has the richest media type system — it distinguishes between photos, videos, audio, voice notes, animations (GIFs), stickers, and generic documents, using different API methods for each. Caption splitting handles the 1024-char caption limit by putting the caption on the media and sending the remainder as a follow-up text message.

Signal and iMessage require media to be saved to the local filesystem first — they pass file paths rather than URLs or buffers.

LINE requires publicly accessible HTTPS URLs for all media content.

Text Chunk Limits

When a response exceeds the platform limit, Clawdbot chunks it:

Telegram  ████████████████████████████████████████  4096 chars
WhatsApp  ████████████████████████████████████████  4000 chars
Signal    ████████████████████████████████████████  4000 chars
Slack     ████████████████████████████████████████  4000 chars
iMessage  ████████████████████████████████████████  4000 chars
Discord   ████████████████████                      2000 chars
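
A chunker along these lines would do the job. This is a minimal sketch under my own assumptions: it prefers to break at a newline, then at a space, and only hard-splits when neither exists. chunkText is a hypothetical name, and the real splitter may handle code fences or markup that this one ignores.

```typescript
// Split text into pieces of at most `limit` UTF-16 code units each.
function chunkText(text: string, limit: number): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");

  const chunks: string[] = [];
  let rest = text;

  while (rest.length > limit) {
    const head = rest.slice(0, limit);

    // Prefer the last newline in the window, then the last space.
    let cut = head.lastIndexOf("\n");
    if (cut <= 0) cut = head.lastIndexOf(" ");
    if (cut <= 0) cut = limit; // no natural break: hard split

    const piece = rest.slice(0, cut).trimEnd();
    if (piece.length > 0) chunks.push(piece);
    rest = rest.slice(cut).trimStart();
  }

  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Calling chunkText(reply, 2000) for Discord and chunkText(reply, 4096) for Telegram gives each platform pieces it will accept, at the cost of dropping the whitespace at each break point.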

DM Access Model & Group Policy

DM Access Modes

Clawdbot controls who can DM the bot through a configurable access model per channel:

  • Pairing — Users must exchange a pairing code before they can interact. The ChannelPairingAdapter handles code generation, validation, and session binding.
  • Allowlist — Only explicitly allowed user IDs can interact. Configured per-account with allowFrom arrays.
  • Open — Anyone can DM the bot.
  • Disabled — DMs are turned off entirely.

The security adapter (ChannelSecurityAdapter) enforces these policies before messages reach the agent. Each channel's dock includes resolveAllowFrom() and formatAllowFrom() helpers for reading and displaying allowlist configuration.
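
The four modes map naturally onto a single gate function. The sketch below is hypothetical throughout: the policy names mirror the list above, but checkDmAccess, the DmGateInput shape, and the "needs-pairing" outcome are my own simplification of what the security and pairing adapters do.

```typescript
type DmPolicy = "pairing" | "allowlist" | "open" | "disabled";
type DmDecision = "allow" | "deny" | "needs-pairing";

type DmGateInput = {
  policy: DmPolicy;
  senderId: string;
  allowFrom: string[];        // explicit allowlist for this account
  pairedSenders: Set<string>; // senders that already completed pairing
};

// Hypothetical gate: decide what happens to a DM before it reaches the agent.
function checkDmAccess(input: DmGateInput): DmDecision {
  if (input.policy === "open") return "allow";
  if (input.policy === "disabled") return "deny";
  if (input.policy === "allowlist") {
    return input.allowFrom.includes(input.senderId) ? "allow" : "deny";
  }
  // policy === "pairing": unknown senders are asked to pair instead of being dropped.
  return input.pairedSenders.has(input.senderId) ? "allow" : "needs-pairing";
}
```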

Group Policy

Groups have their own policy layer through the ChannelGroupAdapter:

  • Mention required — The bot only responds when @mentioned (default for most channels)
  • Always respond — The bot responds to every message in the group
  • Command only — Only responds to explicit commands

The mention adapter (ChannelMentionAdapter) handles stripping the bot's @mention from message text before it reaches the agent — so the LLM sees clean text without bot username noise.

Group policies are enforced during the inbound gate phase, after monitor handlers extract the message but before dispatching to the agent.
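
Mention detection and stripping can be done in one pass. The following is a rough sketch with hypothetical names (stripBotMention, escapeRegExp) and a deliberately simple rule: match @username only when it starts the text or follows whitespace, so email-like strings are left alone. Real platforms often provide structured mention entities, which this does not model.

```typescript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: report whether the bot was mentioned and return the
// text with the mention removed.
function stripBotMention(
  text: string,
  botUsername: string,
): { text: string; mentioned: boolean } {
  const pattern = new RegExp(`(^|\\s)@${escapeRegExp(botUsername)}\\b`, "gi");
  const mentioned = pattern.test(text);
  pattern.lastIndex = 0; // test() advances lastIndex on global regexes

  const cleaned = text
    .replace(pattern, "$1")     // keep the leading whitespace, drop the mention
    .replace(/[ \t]{2,}/g, " ") // collapse the gap the mention leaves behind
    .trim();
  return { text: cleaned, mentioned };
}
```

In a "mention required" group, the mentioned flag decides whether the message passes the gate, and the cleaned text is what the LLM sees.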


Message Actions: 50 Operations

The dispatchChannelMessageAction() function routes named actions to channel plugins. The full catalog:

Core:        send, broadcast
Polls:       poll
Reactions:   react, reactions
Messages:    read, edit, unsend, delete, reply, sendWithEffect, sendAttachment
Stickers:    sticker, sticker-search, sticker-upload
Threading:   thread-create, thread-list, thread-reply
Search:      search
Pins:        pin, unpin, list-pins
Permissions: permissions
Groups:      renameGroup, setGroupIcon, addParticipant, removeParticipant, leaveGroup
Members:     member-info, timeout, kick, ban
Roles:       role-info, role-add, role-remove
Channels:    channel-info, channel-list, channel-create, channel-edit,
             channel-delete, channel-move
Categories:  category-create, category-edit, category-delete
Voice:       voice-status
Events:      event-list, event-create
Emoji:       emoji-list, emoji-upload

Each action is gated by supportsAction checks — the agent won't try to create a Discord event on Telegram, or manage roles on Signal. Discord has the broadest action surface (guild management, moderation, threads, events), while Signal and iMessage support only the basics (send, react, read).


Channel Registry & ID Resolution

The core channel roster is defined with a fixed order:

export const CHAT_CHANNEL_ORDER = [
  "telegram", "whatsapp", "discord", "googlechat",
  "slack", "signal", "imessage",
] as const;

Aliases make configuration forgiving:

const CHAT_CHANNEL_ALIASES: Record<string, ChatChannelId> = {
  imsg: "imessage",
  "google-chat": "googlechat",
  gchat: "googlechat",
};

normalizeAnyChannelId() resolves both core and external plugin channel IDs, handling aliases and case normalization. The buildChannelUiCatalog() function generates a presentation-ready catalog for settings UIs with labels, detail labels, system images, and ordering.
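
The resolution order is easy to picture with a small sketch. The channel list and aliases are copied from the snippets above; normalizeChannelId and its externalIds parameter are hypothetical simplifications of normalizeAnyChannelId(), which presumably reads external IDs from the plugin registry instead.

```typescript
const CORE_CHANNELS = [
  "telegram", "whatsapp", "discord", "googlechat",
  "slack", "signal", "imessage",
] as const;

type ChatChannelId = (typeof CORE_CHANNELS)[number];

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const CHANNEL_ALIASES = new Map<string, ChatChannelId>([
  ["imsg", "imessage"],
  ["google-chat", "googlechat"],
  ["gchat", "googlechat"],
]);

// Hypothetical resolver: trim and lowercase, apply aliases, then accept
// core IDs first and external plugin IDs second.
function normalizeChannelId(
  raw: string,
  externalIds: readonly string[] = [],
): string | undefined {
  const key = raw.trim().toLowerCase();
  if (key.length === 0) return undefined;

  const resolved = CHANNEL_ALIASES.get(key) ?? key;
  if ((CORE_CHANNELS as readonly string[]).includes(resolved)) return resolved;
  return externalIds.includes(resolved) ? resolved : undefined;
}
```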


Putting It All Together

Here's the complete flow for a single message — from platform event to agent response to delivery:

                INBOUND
        ┌──────────┼──────────┐
        ▼          ▼          ▼
    ┌────────┐ ┌────────┐ ┌────────┐
    │Telegram│ │Discord │ │Signal  │ ...
    │ grammY │ │ Carbon │ │  SSE   │
    └───┬────┘ └───┬────┘ └───┬────┘
        │          │          │
        ▼          ▼          ▼
┌──────────────────────────────────────────┐
│ Channel Monitor / Handler                │
│  ├─ Extract text, media, sender, group   │
│  ├─ Allowlist / pairing check            │
│  ├─ Mention detection & stripping        │
│  ├─ Group policy enforcement             │
│  └─ Command detection                    │
└──────────────────┬───────────────────────┘
                   ▼
┌──────────────────────────────────────────┐
│ Inbound Dispatch Pipeline                │
│  ├─ resolveAgentRoute() → session key    │
│  ├─ Debounce (per-sender, configurable)  │
│  ├─ Build MsgContext (normalized)        │
│  ├─ recordInboundSession()               │
│  └─ dispatchInboundMessage()             │
└──────────────────┬───────────────────────┘
                   ▼
┌──────────────────────────────────────────┐
│ LLM / Agent Processing                   │
│  ├─ System prompt (with channel hints)   │
│  ├─ Tool execution                       │
│  └─ Response generation                  │
└──────────────────┬───────────────────────┘
                   ▼
               OUTBOUND
                   ▼
┌──────────────────────────────────────────┐
│ Outbound Delivery                        │
│  ├─ resolveTarget() (normalize)          │
│  ├─ Chunk text (per-channel limit)       │
│  ├─ Format (HTML / mrkdwn / plain)       │
│  ├─ Attach media (URL or file)           │
│  └─ Send via channel outbound adapter    │
└──────────────────┬───────────────────────┘
        ┌──────────┼──────────┐
        ▼          ▼          ▼
    ┌────────┐ ┌────────┐ ┌────────┐
    │Telegram│ │Discord │ │Signal  │ ...
    │Bot API │ │ REST   │ │  RPC   │
    └────────┘ └────────┘ └────────┘

The system handles 7+ platforms through one abstraction, supports 50 message actions, and keeps import weight low through the dock/plugin split. Adding a new channel means implementing a ChannelPlugin object with the relevant adapters, declaring a package.json manifest, and registering through the plugin system. The rest — routing, session management, agent dispatch — works automatically.


Series

  1. Core Architecture & Gateway
  2. Memory System
  3. Agent System & AI Providers
  4. Channel & Messaging (this post)
  5. Sessions & Multi-Agent
  6. CLI, Commands & TUI
  7. Browser, Media & Canvas
  8. Infrastructure & Security

Resources