Inside Clawdbot: Core Architecture & the Gateway Control Plane

This is the first post in a series exploring how Clawdbot works under the hood. Clawdbot is a personal AI assistant that runs locally and bridges multiple messaging channels — WhatsApp, Telegram, Discord, Slack, Signal, iMessage — through a single process. The architectural center of gravity is the Gateway: a WebSocket server that manages agent sessions, routes messages, handles authentication, serves a Control UI, and coordinates with companion apps on macOS, iOS, and Android.

This post covers the Gateway's architecture, the bootstrap sequence that starts it, the WebSocket protocol that clients speak, the configuration system that drives it, and the design patterns that hold everything together. The analysis is based on the open-source codebase.


The Gateway: Central Hub

Everything in Clawdbot flows through the Gateway daemon. It's a single Node.js process (Node 22+, TypeScript ESM) that binds an HTTP + WebSocket server on port 18789 and orchestrates every subsystem. Here's the high-level topology:

┌──────────────────────────────────────────────────────────────┐
│                   Gateway Daemon (:18789)                    │
│                                                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
│  │ Telegram │  │ WhatsApp │  │ Discord  │  │ Slack... │      │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘      │
│       │             │             │             │            │
│       ▼             ▼             ▼             ▼            │
│  ┌──────────────────────────────────────────────────────┐    │
│  │                   Channel Manager                    │    │
│  │            (plugin lifecycle, hot-reload)            │    │
│  └──────────────────────────┬───────────────────────────┘    │
│                             │                                │
│  ┌──────────────────────────┴───────────────────────────┐    │
│  │                    Agent Runtime                     │    │
│  │     Sessions · Models · Tools · Memory · Skills      │    │
│  └──────────────────────────┬───────────────────────────┘    │
│                             │                                │
│  ┌──────────────────────────┴───────────────────────────┐    │
│  │               WebSocket Control Plane                │    │
│  │ 80+ RPC methods · Event broadcast · Auth · Presence  │    │
│  └──────────────────────────┬───────────────────────────┘    │
│                             │                                │
│            ┌────────────────┼────────────────┐               │
│            ▼                ▼                ▼               │
│      ┌───────────┐    ┌───────────┐    ┌───────────┐         │
│      │    CLI    │    │ Control UI│    │   Nodes   │         │
│      │  (local)  │    │ (Web App) │    │(iOS/macOS)│         │
│      └───────────┘    └───────────┘    └───────────┘         │
└──────────────────────────────────────────────────────────────┘

The Gateway isn't just a message relay. It owns:

  • Agent sessions — one per conversation context (DM sessions, group sessions, subagent sessions)
  • Channel connections — each messaging platform connects through a channel plugin
  • Node registry — companion apps (iOS, macOS, Android) register as nodes
  • Cron service — scheduled tasks with isolated agent execution
  • Browser control — headless Chrome automation via Playwright
  • Canvas host — A2UI rendering for rich UI output
  • Config watcher — hot-reload when clawdbot.json changes on disk

All clients — the CLI, the Control UI web app, companion apps — speak the same WebSocket protocol. There's no REST API. Everything is WebSocket RPC + event streaming.


Bootstrap Sequence

Getting from typing clawdbot in your terminal to a running Gateway involves three stages.

Stage 1: Entry Point (src/entry.ts)

The CLI binary starts at src/entry.ts. Before doing anything useful, it handles Node.js environment concerns:

// src/entry.ts
process.title = "clawdbot";
installProcessWarningFilter();

// Color mode
if (process.argv.includes("--no-color")) {
  process.env.NO_COLOR = "1";
  process.env.FORCE_COLOR = "0";
}

The most interesting thing here is self-respawn for warning suppression. Node.js emits ExperimentalWarning for certain APIs. Rather than letting those leak into output, the entry point checks whether --disable-warning=ExperimentalWarning is in NODE_OPTIONS. If not, it respawns itself as a child process with the flag injected:

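import { spawn } from "node:child_process";

function ensureExperimentalWarningSuppressed(): boolean {
  if (isTruthyEnvValue(process.env.CLAWDBOT_NO_RESPAWN)) return false;
  if (isTruthyEnvValue(process.env.CLAWDBOT_NODE_OPTIONS_READY)) return false;

  // Skip the respawn when the flag is already present in NODE_OPTIONS.
  const nodeOptions = process.env.NODE_OPTIONS ?? "";
  if (nodeOptions.includes(EXPERIMENTAL_WARNING_FLAG)) return false;

  // Re-run the same command line in a child with the flag injected.
  process.env.NODE_OPTIONS = `${nodeOptions} ${EXPERIMENTAL_WARNING_FLAG}`.trim();
  const child = spawn(process.execPath, [...process.execArgv, ...process.argv.slice(1)], {
    stdio: "inherit",
    env: process.env,
  });
  attachChildProcessBridge(child); // Forward signals and the exit code
  return true; // Parent must not continue; the child takes over
}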

The parent becomes a transparent bridge, forwarding exit codes and signals. The respawn adds roughly 50ms of startup overhead; invocations that already have the flag in NODE_OPTIONS skip it entirely.

The entry point also handles --profile <name> for profile-specific environment overrides and Windows argv normalization (stripping duplicate node.exe entries and control characters).

Stage 2: CLI Program (src/index.ts)

Once the entry point is ready, it dynamically imports the CLI runner:

import("./cli/run-main.js")
  .then(({ runCli }) => runCli(process.argv))

The library entry at src/index.ts runs core initializations at import time:

loadDotEnv({ quiet: true });
normalizeEnv();
ensureClawdbotCliOnPath();
enableConsoleCapture();     // Structured logging wrapping console.*
assertSupportedRuntime();   // Node 22+ check

import { buildProgram } from "./cli/program.js";
const program = buildProgram();  // Commander.js program

Global error handlers are installed, and Commander.js dispatches to the appropriate subcommand — gateway start, gateway status, chat, etc.

Stage 3: Gateway Start (startGatewayServer())

The startGatewayServer() function in src/gateway/server.impl.ts is a ~350-line orchestration function that wires together every subsystem:

startGatewayServer(port=18789)
  ├─ Read & validate config snapshot
  │   ├─ Auto-migrate legacy config entries
  │   ├─ Validate against Zod schema
  │   └─ Auto-enable plugins from env vars
  ├─ Initialize subsystems
  │   ├─ Subagent registry
  │   ├─ Plugin registry + gateway method extensions
  │   ├─ Channel manager
  │   ├─ Node registry (iOS/Android/macOS nodes)
  │   ├─ Cron service
  │   ├─ Heartbeat runner
  │   ├─ Exec approval manager
  │   └─ Skills remote registry
  ├─ Create runtime state
  │   ├─ HTTP server(s) + optional TLS
  │   ├─ WebSocket server
  │   ├─ Client set + broadcast function
  │   └─ Canvas host handler
  ├─ Attach WS handlers (auth, routing, methods)
  ├─ Start sidecars
  │   ├─ Browser control server
  │   ├─ Plugin services
  │   └─ Channel connections
  ├─ Start service discovery (Bonjour/mDNS)
  ├─ Start Tailscale exposure (Serve/Funnel)
  ├─ Start config file watcher (hot reload)
  ├─ Start maintenance timers (tick, health, dedupe)
  └─ Return { close() } handle

Every subsystem gets a named child logger from a subsystem logger factory:

const log = createSubsystemLogger("gateway");
const logCanvas = log.child("canvas");
const logDiscovery = log.child("discovery");
const logTailscale = log.child("tailscale");
const logChannels = log.child("channels");
const logBrowser = log.child("browser");
const logHealth = log.child("health");
const logCron = log.child("cron");

This enables fine-grained log filtering — you can tail just gateway:channels or gateway:ws in production without noise from other subsystems.
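
To make the prefix-based filtering concrete, here is a minimal sketch of a child-logger factory. The names createLogger and LogSink, and the filter argument, are hypothetical and chosen for illustration; the real createSubsystemLogger may be structured quite differently.

```typescript
// Hypothetical sketch: names and filter behaviour are assumptions,
// not Clawdbot's actual logger implementation.
type LogSink = (line: string) => void;

interface SubsystemLogger {
  name: string;
  info(msg: string): void;
  child(suffix: string): SubsystemLogger;
}

// Children inherit the parent name as a prefix, e.g. "gateway" -> "gateway:channels".
function createLogger(name: string, sink: LogSink, filter?: string): SubsystemLogger {
  // Emit only when no filter is set, or when this logger sits at or below the filter.
  const enabled = !filter || name === filter || name.startsWith(`${filter}:`);
  return {
    name,
    info(msg: string): void {
      if (enabled) sink(`[${name}] ${msg}`);
    },
    child(suffix: string): SubsystemLogger {
      return createLogger(`${name}:${suffix}`, sink, filter);
    },
  };
}

const lines: string[] = [];
const root = createLogger("gateway", (l) => lines.push(l), "gateway:channels");
root.info("boot");                          // filtered out
root.child("channels").info("telegram up"); // kept
root.child("cron").info("job scheduled");   // filtered out
```

With the filter set to gateway:channels, only the channel line reaches the sink.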


WebSocket Protocol

The Gateway speaks a custom WebSocket protocol with three frame types, strict schema validation, and a challenge-based handshake.

Frame Types

All communication uses JSON frames discriminated by type:

// Request frame (client → gateway)
{ type: "req", id: "<uuid>", method: "connect", params: { ... } }

// Response frame (gateway → client)
{ type: "res", id: "<uuid>", ok: true, payload: { ... } }
{ type: "res", id: "<uuid>", ok: false, error: { code: "...", message: "..." } }

// Event frame (gateway → client, broadcast)
{ type: "event", event: "tick", payload: { ts: 1234 }, seq: 42 }

Every frame is validated at parse time using AJV-compiled TypeBox schemas:

const ajv = new Ajv({ allErrors: true, strict: false, removeAdditional: false });
export const validateConnectParams = ajv.compile<ConnectParams>(ConnectParamsSchema);
export const validateRequestFrame = ajv.compile<RequestFrame>(RequestFrameSchema);

This gives strict wire-level validation with little per-message overhead, because AJV compiles each schema into an optimized validation function once at startup.
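
The three frame shapes map naturally onto a discriminated union. The sketch below is illustrative only: the field names follow the frames shown above, but parseFrame is a hand-written stand-in for the AJV validators, and the real TypeBox schemas may contain fields not shown here.

```typescript
// Illustrative types; the real schemas are defined with TypeBox in the codebase.
type RequestFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResponseFrame = {
  type: "res";
  id: string;
  ok: boolean;
  payload?: unknown;
  error?: { code: string; message: string };
};
type EventFrame = { type: "event"; event: string; payload?: unknown; seq?: number };
type Frame = RequestFrame | ResponseFrame | EventFrame;

// Returns the parsed frame, or null when the text is not JSON
// or lacks the fields required for its declared type.
function parseFrame(raw: string): Frame | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const f = data as Record<string, unknown>;
  switch (f.type) {
    case "req":
      return typeof f.id === "string" && typeof f.method === "string"
        ? (data as RequestFrame)
        : null;
    case "res":
      return typeof f.id === "string" && typeof f.ok === "boolean"
        ? (data as ResponseFrame)
        : null;
    case "event":
      return typeof f.event === "string" ? (data as EventFrame) : null;
    default:
      return null;
  }
}
```

Switching on the type field lets the compiler narrow each branch, which is the same property the compiled validators give the Gateway at the wire boundary.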

Connection Handshake

The connection lifecycle is a strict state machine:

Client connects via WebSocket
Gateway sends:  { type: "event", event: "connect.challenge",
                  payload: { nonce, ts } }
Client sends:   { type: "req", method: "connect",
                  params: ConnectParams }
Gateway validates:
  1. Protocol version negotiation (minProtocol/maxProtocol range)
  2. Role validation (operator | node)
  3. Scope resolution
  4. Device identity (Ed25519 signature + nonce verification)
  5. Auth check (token | password | tailscale | device-token)
  6. Device pairing check (is this device known?)
  7. Node command filtering (if role=node)
Gateway responds: { type: "res", id: "...", ok: true, payload: HelloOk }
Normal RPC cycle begins
Gateway sends periodic ticks every 30s for liveness

If the handshake doesn't complete within 10 seconds, the connection is dropped. Protocol version negotiation is strict — if the client's supported range doesn't include the server's PROTOCOL_VERSION, the connection is rejected with the expected version in the error payload.
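
The range check behind protocol negotiation can be sketched as follows. The version number 3, the PROTOCOL_MISMATCH code, and the shape of the error payload are assumptions made for illustration; only the minProtocol/maxProtocol range and the "expected version in the error" behaviour come from the description above.

```typescript
// Assumption: the client advertises an inclusive [minProtocol, maxProtocol] range
// and the server supports exactly one version. The value 3 is made up.
const PROTOCOL_VERSION = 3;

type NegotiationResult =
  | { ok: true; protocol: number }
  | { ok: false; error: { code: string; message: string; expected: number } };

function negotiateProtocol(minProtocol: number, maxProtocol: number): NegotiationResult {
  if (minProtocol <= PROTOCOL_VERSION && PROTOCOL_VERSION <= maxProtocol) {
    return { ok: true, protocol: PROTOCOL_VERSION };
  }
  return {
    ok: false,
    error: {
      code: "PROTOCOL_MISMATCH", // hypothetical error code
      message: `client supports ${minProtocol}..${maxProtocol}, server requires ${PROTOCOL_VERSION}`,
      expected: PROTOCOL_VERSION,
    },
  };
}
```

Returning the expected version in the error lets an outdated client report a precise upgrade hint instead of a generic connection failure.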

HelloOk: The Initial Snapshot

On successful connection, the client receives a rich HelloOk payload with everything it needs to bootstrap its state:

const helloOk = {
  type: "hello-ok",
  protocol: PROTOCOL_VERSION,
  server: {
    version: "2026.1.25",
    commit: "abc123",
    host: "my-server",
    connId: "<uuid>",
  },
  features: {
    methods: ["health", "agent", "chat.send", ...],   // ~80+ methods
    events: ["tick", "agent", "chat", "presence", ...], // ~14 event types
  },
  snapshot: {
    presence: [...],           // Currently connected clients
    stateVersion: { presence: 42, health: 7 },
    health: { ... },           // Cached health snapshot
  },
  canvasHostUrl: "http://...",
  auth: {
    deviceToken: "...",        // Per-device rotating token
    role: "operator",
    scopes: ["operator.admin"],
  },
  policy: {
    maxPayload: 524288,        // 512KB frame limit
    maxBufferedBytes: 1572864, // 1.5MB buffer limit
    tickIntervalMs: 30000,
  },
};

This "snapshot on connect" pattern means clients never need a separate bootstrap request — the handshake response contains presence, health, feature flags, and policy constraints.

Authentication & Device Identity

The Gateway supports three auth modes:

| Mode | Mechanism | Use Case |
| --- | --- | --- |
| token | Shared secret via timingSafeEqual | Default; set via config or CLAWDBOT_GATEWAY_TOKEN |
| password | Shared password via timingSafeEqual | Required for Tailscale Funnel |
| tailscale | Tailscale Serve identity headers + tailscale whois | Auto-enabled for Serve mode |

Every client must also present a device identity — an Ed25519 keypair. During handshake:

  1. Gateway sends a random nonce in the challenge
  2. Client signs a deterministic payload (deviceId + clientId + mode + role + scopes + timestamp + token + nonce) with its private key
  3. Gateway verifies the device ID is derived from the public key, the timestamp is within a 10-minute skew window, the nonce matches, and the Ed25519 signature is valid

New devices must be paired before they can connect. Local connections (loopback) are auto-approved. Remote connections trigger a pairing request broadcast to existing clients, where an operator can approve or reject. Approved devices receive a rotating device token for subsequent connections.
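
The non-cryptographic parts of the identity check (building the deterministic payload, then checking the nonce and the timestamp window) can be sketched like this. The field order, the "|" separator, and all names are assumptions for illustration; the real canonical encoding lives in the codebase, and the Ed25519 signature verification itself is omitted.

```typescript
// Assumptions: field order and separator are illustrative, not the real encoding.
interface DeviceAuthFields {
  deviceId: string;
  clientId: string;
  mode: string;
  role: string;
  scopes: string[];
  signedAtMs: number;
  token: string;
  nonce: string;
}

const MAX_SKEW_MS = 10 * 60 * 1000; // 10-minute window, as described above

// Both sides must produce byte-identical output for the signature to verify.
function buildSignaturePayload(f: DeviceAuthFields): string {
  return [
    f.deviceId,
    f.clientId,
    f.mode,
    f.role,
    f.scopes.join(","),
    String(f.signedAtMs),
    f.token,
    f.nonce,
  ].join("|");
}

// Returns null when the request is fresh, otherwise a reason string.
function checkFreshness(f: DeviceAuthFields, expectedNonce: string, nowMs: number): string | null {
  if (f.nonce !== expectedNonce) return "nonce mismatch";
  if (Math.abs(nowMs - f.signedAtMs) > MAX_SKEW_MS) return "timestamp outside skew window";
  return null;
}
```

Binding the server-issued nonce into the signed payload is what prevents a captured handshake from being replayed on a new connection.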

Role-Based Authorization

After authentication, every method call is authorized by role and scopes:

// Scope hierarchy:
// operator.admin  → full access (default for operators)
// operator.read   → read-only methods (health, status, lists)
// operator.write  → send messages, invoke agents, etc.
// operator.approvals → exec approval workflow
// operator.pairing   → device/node pairing management

Nodes (companion devices) can only call node-specific methods. The operator.admin scope is a superset that grants everything. Methods are checked before dispatch — unauthorized calls never reach the handler.
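
A pre-dispatch check along these lines might look like the sketch below. The method-to-scope table is hypothetical (device.pair.approve in particular is an invented method name), as is the choice to treat unmapped methods as admin-only; the source only states that admin grants everything and that nodes are limited to node-specific methods.

```typescript
type Role = "operator" | "node";

// Hypothetical mapping for illustration; the real table is in the codebase.
const METHOD_SCOPES: Record<string, string | undefined> = {
  "health": "operator.read",
  "chat.send": "operator.write",
  "exec.approval.resolve": "operator.approvals",
  "device.pair.approve": "operator.pairing", // invented method name
};

const NODE_METHODS = new Set(["node.invoke.result", "node.event"]);

// Returns null when the call is allowed, otherwise a reason string.
function authorize(method: string, role: Role, scopes: string[]): string | null {
  if (role === "node") {
    return NODE_METHODS.has(method) ? null : "method not allowed for nodes";
  }
  if (scopes.includes("operator.admin")) return null; // superset scope
  const required = METHOD_SCOPES[method];
  // Assumption: methods without an explicit mapping are admin-only.
  if (required === undefined) return "operator.admin required";
  return scopes.includes(required) ? null : `missing scope ${required}`;
}
```

Because the check returns before any handler lookup, an unauthorized caller cannot learn anything from handler-specific error messages.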


Gateway Methods & Events

The Gateway exposes 80+ RPC methods organized by domain, registered as a flat handler map:

| Domain | Key Methods | Purpose |
| --- | --- | --- |
| connect | connect | WebSocket handshake |
| health | health, status, system-presence | Gateway health & presence |
| agent | agent, agent.wait, wake | Agent run management |
| chat | chat.send, chat.history, chat.abort | WebChat real-time chat |
| sessions | sessions.list, sessions.patch, sessions.reset, sessions.delete, sessions.compact | Session CRUD |
| config | config.get, config.set, config.apply, config.schema | Configuration management |
| channels | channels.status, channels.logout | Channel lifecycle |
| nodes | node.list, node.describe, node.invoke | Companion device control |
| cron | cron.list, cron.add, cron.update, cron.remove, cron.run | Scheduled tasks |
| browser | browser.request | Browser automation |
| tts | tts.status, tts.convert | Text-to-speech |
| skills | skills.status, skills.install, skills.update | Skills management |
| exec approvals | exec.approval.request, exec.approval.resolve | Exec approval flow |
| pairing | node.pair.*, device.pair.*, device.token.* | Device/node pairing |

Methods are registered as flat handler records per domain, then spread into a single map:

export const coreGatewayHandlers: GatewayRequestHandlers = {
  ...connectHandlers,
  ...sessionsHandlers,
  ...channelsHandlers,
  ...chatHandlers,
  ...cronHandlers,
  ...configHandlers,
  ...healthHandlers,
  // ... ~20 more handler modules
};

Plugins can extend this map with additional methods — channel plugins add channel-specific endpoints, and feature plugins can register entirely new RPC domains.

Event Broadcasting

The Gateway broadcasts 18+ event types to connected clients:

export const GATEWAY_EVENTS = [
  "connect.challenge",      // Handshake nonce
  "agent",                  // Agent run progress/completion
  "chat",                   // WebChat streaming deltas
  "presence",               // Client connect/disconnect
  "tick",                   // Liveness heartbeat (30s)
  "shutdown",               // Gateway shutting down
  "health",                 // Health snapshot update
  "cron",                   // Cron job events
  "node.pair.requested",    // Node pairing request
  "exec.approval.requested",// Exec approval request
  // ...
];

The broadcast function is the Gateway's event bus. It sends events to all connected clients with scope-based filtering and backpressure:

const broadcast = (event: string, payload: unknown, opts?) => {
  const eventSeq = ++seq;  // Monotonic sequence number
  const frame = JSON.stringify({ type: "event", event, payload, seq: eventSeq });

  for (const c of params.clients) {
    if (!hasEventScope(c, event)) continue;          // Scope filtering
    const slow = c.socket.bufferedAmount > MAX_BUFFERED_BYTES;
    if (slow && opts?.dropIfSlow) continue;           // Drop non-critical
    if (slow) { c.socket.close(1008, "slow consumer"); continue; }
    try { c.socket.send(frame); } catch { /* ignore */ }
  }
};

Key patterns:

  • Monotonic sequence numbers: every event gets an incrementing seq for gap detection on the client side
  • Scope guards: exec.approval.* events require the operator.approvals scope; device.pair.* events require operator.pairing
  • Backpressure: when a client's bufferedAmount exceeds the 1.5MB limit, events marked dropIfSlow are skipped for that client; any other event causes the slow consumer to be disconnected with close code 1008
  • State versioning: presence and health events carry version numbers for optimistic reconciliation

Configuration System

Clawdbot's configuration lives at ~/.clawdbot/clawdbot.json in JSON5 format (JSON with comments, trailing commas, and unquoted keys). The path is resolved via:

CLAWDBOT_CONFIG_PATH             → explicit override
CLAWDBOT_STATE_DIR/clawdbot.json → state dir override
~/.clawdbot/clawdbot.json        → default

Dual Schema System

The config uses Zod v4 for runtime validation and TypeBox for JSON Schema generation (protocol schemas for AJV):

// Zod schema — runtime parsing and validation
export const ClawdbotSchema = z.object({
  meta: z.object({
    lastTouchedVersion: z.string().optional(),
    lastTouchedAt: z.string().optional(),
  }).strict().optional(),
  gateway: z.object({ /* auth, port, tls, reload, ... */ }).optional(),
  agents: AgentsSchema.optional(),
  channels: ChannelsSchema.optional(),
  models: ModelsConfigSchema.optional(),
  tools: ToolsSchema.optional(),
  // ... 30+ top-level sections
}).strict();

The type definitions are split across ~30 focused type files (types.agents.ts, types.gateway.ts, types.channels.ts, etc.) to keep files small and improve edit locality.

Loading Pipeline

Config loading runs a multi-stage pipeline:

Read file from disk (JSON5)
Resolve $include directives (recursive file merging)
Apply config.env to process.env (config-defined env vars)
Substitute ${VAR} references in string values
Warn on miskeys (e.g., "gateway.token" → "gateway.auth.token")
Validate against Zod schema (with plugin schemas merged)
Apply defaults chain:
  applyMessageDefaults → applySessionDefaults →
  applyLoggingDefaults → applyAgentDefaults →
  applyContextPruningDefaults → applyCompactionDefaults →
  applyModelDefaults
Normalize file paths (resolve ~, relative paths)
Apply runtime overrides (env var overrides)
Cache result for 200ms

The $include directive lets you split config across files — useful for separating channel credentials from general settings, or sharing base configs across machines.

Environment variable substitution means you can write "token": "${TELEGRAM_BOT_TOKEN}" in your config and have it resolved at load time without committing secrets.
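
A recursive substitution pass of this kind could be sketched as below. The accepted variable-name pattern and the decision to throw on a missing variable are assumptions; the real loader may handle both differently. The environment is passed in as a parameter so the function stays pure.

```typescript
// Assumptions: only ${UPPER_SNAKE_CASE} names are substituted,
// and a missing variable is treated as an error.
function substituteEnv(value: unknown, env: Record<string, string | undefined>): unknown {
  if (typeof value === "string") {
    return value.replace(/\$\{([A-Z_][A-Z0-9_]*)\}/g, (_match: string, name: string) => {
      const resolved = env[name];
      if (resolved === undefined) throw new Error(`Missing env var: ${name}`);
      return resolved;
    });
  }
  if (Array.isArray(value)) {
    return value.map((item) => substituteEnv(item, env));
  }
  if (typeof value === "object" && value !== null) {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = substituteEnv(child, env);
    }
    return out;
  }
  return value; // numbers, booleans, null pass through unchanged
}
```

Only string leaves are rewritten, so keys, numbers, and booleans are never touched by the substitution.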

Atomic Writes & Backup Rotation

Config writes are atomic with automatic backup rotation:

async function writeConfigFile(cfg: ClawdbotConfig) {
  // 1. Validate before writing
  const validated = validateConfigObjectWithPlugins(cfg);
  if (!validated.ok) throw new Error(...)

  // 2. Stamp version metadata
  const stamped = stampConfigVersion(cfg);

  // 3. Write to temp file first
  const tmp = path.join(dir, `clawdbot.json.${pid}.${uuid}.tmp`);
  await fs.writeFile(tmp, json, { mode: 0o600 });

  // 4. Rotate backups (up to 5: .bak, .bak.1, .bak.2, ...)
  await rotateConfigBackups(configPath);
  await fs.copyFile(configPath, `${configPath}.bak`);

  // 5. Atomic rename (with Windows fallback)
  await fs.rename(tmp, configPath);
}

File permissions are set to 0o600 (owner read/write only) — config files often contain API keys and tokens.

Hot Reload

The startGatewayConfigReloader() watches the config file using chokidar and applies changes without restart when possible:

const watcher = chokidar.watch(opts.watchPath, {
  ignoreInitial: true,
  awaitWriteFinish: { stabilityThreshold: 200, pollInterval: 50 },
});

When a change is detected, it is debounced (300ms by default) and the new config is read and validated. A reload plan is then built by matching the changed config paths against reload rules:

const BASE_RELOAD_RULES: ReloadRule[] = [
  { prefix: "gateway.remote",     kind: "none" },    // No action needed
  { prefix: "hooks",              kind: "hot", actions: ["reload-hooks"] },
  { prefix: "cron",               kind: "hot", actions: ["restart-cron"] },
  { prefix: "browser",            kind: "hot", actions: ["restart-browser-control"] },
  { prefix: "channels.telegram",  kind: "hot", actions: ["restart-channel:telegram"] },
  { prefix: "gateway",            kind: "restart" }, // Full restart needed
  { prefix: "models",             kind: "none" },    // Dynamic, no restart
  { prefix: "agents",             kind: "none" },    // Dynamic
];

This means changing your Telegram bot token hot-reloads just the Telegram channel, while changing a model setting takes effect immediately with no restart at all. Only core Gateway settings like port or TLS trigger a full restart.
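
Turning a set of changed paths into a plan can be sketched as follows. The rule list is a subset of the one above; the first-match-wins ordering and the "unmatched paths force a restart" fallback are assumptions, inferred from gateway.remote being listed before gateway.

```typescript
type ReloadRule =
  | { prefix: string; kind: "none" | "restart" }
  | { prefix: string; kind: "hot"; actions: string[] };

interface ReloadPlan {
  restart: boolean;
  actions: string[];
}

const RULES: ReloadRule[] = [
  { prefix: "gateway.remote", kind: "none" },
  { prefix: "cron", kind: "hot", actions: ["restart-cron"] },
  { prefix: "channels.telegram", kind: "hot", actions: ["restart-channel:telegram"] },
  { prefix: "gateway", kind: "restart" },
  { prefix: "models", kind: "none" },
];

// Assumptions: the first matching rule wins; a path with no rule forces a restart.
function planReload(changedPaths: string[], rules: ReloadRule[]): ReloadPlan {
  const actions = new Set<string>();
  let restart = false;
  for (const path of changedPaths) {
    const rule = rules.find((r) => path === r.prefix || path.startsWith(`${r.prefix}.`));
    if (!rule || rule.kind === "restart") {
      restart = true;
    } else if (rule.kind === "hot") {
      rule.actions.forEach((a) => actions.add(a));
    }
  }
  return { restart, actions: Array.from(actions) };
}
```

Collecting actions in a Set means that several changed keys under the same prefix trigger the corresponding restart only once.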

The reload mode is configurable via gateway.reload.mode:

| Mode | Behavior |
| --- | --- |
| hybrid (default) | Hot-reload what's possible, restart for the rest |
| hot | Only hot-reload; ignore changes requiring restart |
| restart | Always restart on any change |
| off | Ignore all config changes |

Config Schema for UI

The Gateway generates a JSON Schema with UI hints for the Control UI via buildConfigSchema():

export function buildConfigSchema(params?) {
  const base = buildBaseConfigSchema();     // Zod → JSON Schema
  const merged = applyPluginSchemas(base, plugins);   // Plugin schemas
  const withChannels = applyChannelSchemas(merged, channels);
  const hints = applyPluginHints(base.uiHints, plugins);
  return { schema: withChannels, uiHints: hints, version, generatedAt };
}

UI hints include human-readable labels, help text, sensitivity markers (auto-detected via /token|password|secret|api.?key/i), logical groupings, and placeholder values. This powers the Control UI's settings editor — it renders form fields dynamically from the schema rather than hardcoding UI for each config option.
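
The sensitivity auto-detection is easy to picture with the pattern quoted above. In this sketch the regex is taken from the text, while isSensitivePath, UiHint, and buildHints are hypothetical names, and matching against the full dotted path (rather than just the final key) is an assumption.

```typescript
// Pattern quoted in the text; everything else here is a hypothetical sketch.
const SENSITIVE_KEY = /token|password|secret|api.?key/i;

interface UiHint {
  label: string;
  sensitive: boolean;
}

function isSensitivePath(path: string): boolean {
  return SENSITIVE_KEY.test(path);
}

// Derive a hint for each dotted config path; the label is just the last segment.
function buildHints(paths: string[]): Record<string, UiHint> {
  const hints: Record<string, UiHint> = {};
  for (const path of paths) {
    const segments = path.split(".");
    hints[path] = {
      label: segments[segments.length - 1] ?? path,
      sensitive: isSensitivePath(path),
    };
  }
  return hints;
}
```

A settings form can then render any field flagged sensitive as a masked input without the UI knowing which keys hold credentials.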


Connection Management

Client Representation

Each connected client is tracked as a GatewayWsClient:

export type GatewayWsClient = {
  socket: WebSocket;
  connect: ConnectParams;    // Handshake parameters (role, scopes, client info)
  connId: string;            // Connection UUID
  presenceKey?: string;      // Presence tracking key
};

All clients live in a Set<GatewayWsClient>. On disconnect, presence is updated and broadcast, and node registrations are cleaned up.

Request Routing

After handshake, requests are routed through handleGatewayRequest():

export async function handleGatewayRequest(opts) {
  const { req, client, respond, context } = opts;
  // 1. Authorization check (role + scopes)
  const authError = authorizeGatewayMethod(req.method, client);
  if (authError) { respond(false, undefined, authError); return; }

  // 2. Handler lookup (core handlers + plugin handlers)
  const handler = opts.extraHandlers?.[req.method] ?? coreGatewayHandlers[req.method];
  if (!handler) {
    respond(false, undefined, errorShape(ErrorCodes.INVALID_REQUEST, "unknown method"));
    return;
  }

  // 3. Dispatch
  await handler({ req, params: req.params, client, respond, context });
}

The routing is flat — no middleware chains, no routing trees. Method name → handler function, with an authorization check before dispatch.

Node Registry

Companion devices (iOS, Android, macOS) connect as nodes with role: "node". The NodeRegistry tracks active nodes and provides bidirectional communication:

  • Node → Gateway: node.invoke.result, node.event
  • Gateway → Node: Events sent via nodeRegistry.sendEvent(nodeId, event, payload)

Node commands are filtered on connect against a configurable allowlist:

if (role === "node") {
  const allowlist = resolveNodeCommandAllowlist(cfg, {
    platform: connectParams.client.platform,
    deviceFamily: connectParams.client.deviceFamily,
  });
  connectParams.commands = declared.filter(cmd => allowlist.has(cmd));
}

This means a node can declare it supports 50 commands, but the Gateway will only allow the subset that's explicitly permitted for that platform.

Maintenance Timers

Three recurring timers keep the Gateway healthy:

| Timer | Interval | Purpose |
| --- | --- | --- |
| Tick | 30s | Broadcasts tick events for liveness detection |
| Health | 60s | Refreshes health snapshot (channel status, model availability) |
| Dedupe cleanup | 5min TTL | Prunes expired deduplication entries (max 1000) |
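
As an illustration of the dedupe cleanup, here is a minimal TTL cache with a size cap. The 5-minute TTL and the 1000-entry cap come from the table; the class name, its methods, and the oldest-first eviction policy are assumptions rather than the Gateway's actual data structure.

```typescript
// Hypothetical sketch of a TTL-bounded dedupe cache.
class DedupeCache {
  private seen = new Map<string, number>(); // key -> timestamp (ms) when recorded
  private ttlMs: number;
  private maxEntries: number;

  constructor(ttlMs: number = 5 * 60_000, maxEntries: number = 1000) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
  }

  // Returns true when the key was already seen within the TTL; otherwise records it.
  check(key: string, nowMs: number): boolean {
    const ts = this.seen.get(key);
    if (ts !== undefined && nowMs - ts < this.ttlMs) return true;
    this.seen.delete(key); // re-insert so the key moves to the end of iteration order
    this.seen.set(key, nowMs);
    return false;
  }

  // Intended to be called from the maintenance timer.
  prune(nowMs: number): void {
    this.seen.forEach((ts, key) => {
      if (nowMs - ts >= this.ttlMs) this.seen.delete(key);
    });
    // Map iterates in insertion order, so the first key is the oldest.
    while (this.seen.size > this.maxEntries) {
      const oldest = this.seen.keys().next().value;
      if (oldest === undefined) break;
      this.seen.delete(oldest);
    }
  }

  get size(): number {
    return this.seen.size;
  }
}
```

Pruning on a timer instead of on every lookup keeps the hot path to a single Map access.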

Client SDK: GatewayClient

The GatewayClient class is the client-side counterpart. It handles:

Automatic reconnection with exponential backoff:

private scheduleReconnect() {
  const delay = this.backoffMs;
  this.backoffMs = Math.min(this.backoffMs * 2, 30_000);
  setTimeout(() => this.start(), delay).unref();
}

Request/response correlation via Promise map:

async request<T>(method: string, params?: unknown): Promise<T> {
  const id = randomUUID();
  const p = new Promise<T>((resolve, reject) => {
    this.pending.set(id, { resolve, reject });
  });
  this.ws.send(JSON.stringify({ type: "req", id, method, params }));
  return p;
}
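
The receiving half of this correlation is not shown above. A minimal sketch, assuming the pending map stores resolve/reject pairs keyed by request id (settlePending and the Pending type are hypothetical names):

```typescript
type Pending = {
  resolve: (value: unknown) => void;
  reject: (err: Error) => void;
};

type ResFrame = {
  type: "res";
  id: string;
  ok: boolean;
  payload?: unknown;
  error?: { code: string; message: string };
};

// Settles the promise that matches frame.id. Returns false for unknown ids,
// for example a late response to a request that was already abandoned.
function settlePending(pending: Map<string, Pending>, frame: ResFrame): boolean {
  const entry = pending.get(frame.id);
  if (!entry) return false;
  pending.delete(frame.id); // each id settles at most once
  if (frame.ok) {
    entry.resolve(frame.payload);
  } else {
    entry.reject(new Error(frame.error?.message ?? "request failed"));
  }
  return true;
}
```

Deleting the entry before settling keeps the map from growing and guarantees a duplicate response frame is ignored.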

Tick-based liveness detection — if no tick arrives within 2x the interval, the client closes and reconnects.

Sequence gap detection — if a received event seq is greater than lastSeq + 1, the client fires an onGap callback so it can request a full state refresh.
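
The gap check itself is small. A sketch, assuming the tracker treats the first observed event as the baseline (SeqTracker and the callback signature are hypothetical):

```typescript
class SeqTracker {
  private lastSeq: number | null = null;
  private onGap: (expected: number, received: number) => void;

  constructor(onGap: (expected: number, received: number) => void) {
    this.onGap = onGap;
  }

  // Call for every event frame that carries a seq.
  observe(seq: number): void {
    if (this.lastSeq !== null && seq > this.lastSeq + 1) {
      // One or more events were missed; the caller can request a full refresh.
      this.onGap(this.lastSeq + 1, seq);
    }
    this.lastSeq = seq;
  }
}

const gaps: Array<[number, number]> = [];
const tracker = new SeqTracker((expected, received) => gaps.push([expected, received]));
[1, 2, 5, 6].forEach((seq) => tracker.observe(seq));
// gaps now holds a single entry: expected 3, received 5
```

Reporting both the expected and the received number lets the client log how many events were skipped before it resynchronizes.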

TLS fingerprint pinning — for remote connections, the client can pin a specific TLS certificate fingerprint rather than relying on the CA system.


The Control UI

The Control UI is a web application served by the Gateway's HTTP server. It connects via WebSocket just like any other client — there's no special backend-for-frontend. The UI speaks the same protocol, authenticates the same way, and receives the same events.

This design means the Control UI gets real-time updates for free. When an agent runs, the UI receives agent events and streams the response. When a channel reconnects, the health event updates the dashboard. When another client connects, presence events update the connected devices list.

The config editor is powered by the JSON Schema + UI hints system described above — the Gateway sends config.schema with field labels, help text, sensitivity markers, and groupings, and the UI renders form fields dynamically. No hardcoded settings pages.


Architecture Patterns

A few design decisions are worth calling out explicitly.

Single-User, Local-First

Clawdbot is designed for one person. There's no multi-tenant database, no user management, no permission boundaries between users. The Gateway runs on your machine, your data stays local, and authentication is about verifying devices rather than users.

This simplifies everything. Sessions don't need user IDs. Config changes take effect immediately for the one person who cares. The threat model is "keep other people out" rather than "keep users isolated from each other."

Flat Handler Registry

Methods are a flat Record<string, Handler> — not class hierarchies, not decorator-based routing, not middleware chains. Each domain exports a handler record, they're all spread into one map, and dispatch is a lookup + call. This is simple, extensible (plugins just add entries), and easy to understand.

Dependency Injection via Context Bag

Rather than DI containers or singletons, the Gateway passes a context bag through the call chain:

export type GatewayRequestContext = {
  deps: CliDeps;
  cron: CronService;
  broadcast: BroadcastFn;
  nodeRegistry: NodeRegistry;
  getHealthCache: () => HealthSnapshot | null;
  // ... 30+ context entries
};

Every handler receives this context. Dependencies are explicit and visible in the type signature.

Graceful Degradation

  • Config loading returns {} on parse errors — the Gateway starts with defaults rather than crashing
  • Broadcast ignores per-client send errors, so one failing client doesn't affect the others
  • Skills refresh uses 30s debounce to prevent feedback loops
  • Client reconnection uses exponential backoff up to 30s
  • Legacy config keys are auto-migrated on startup

Graceful Shutdown

The createGatewayCloseHandler() tears down subsystems in order: discovery → Tailscale → canvas → channels → plugins → cron → heartbeat → node timers → broadcast shutdown event → close all WebSocket connections → stop config reloader → stop browser control → close WebSocket server → close HTTP server.


Series

This is the first in an 8-part series on Clawdbot's architecture:

  1. Core Architecture & Gateway (this post)
  2. Memory System
  3. Agent System & AI Providers
  4. Channel & Messaging
  5. Sessions & Multi-Agent
  6. CLI, Commands & TUI
  7. Browser, Media & Canvas
  8. Infrastructure & Security

Resources