How Clawdbot Remembers: A Deep Dive into AI Agent Memory Architecture

Every AI assistant has the same fundamental problem: amnesia. ChatGPT forgets your conversation the moment you close the tab. Claude starts fresh each session. Even the most capable models are stateless — they process your input, generate a response, and forget everything.

Clawdbot solves this differently. Instead of bolting memory onto a chat interface as an afterthought, it builds memory into the agent's operating model from the ground up. The result is an AI assistant that genuinely remembers — your preferences, decisions, ongoing projects, and the context that makes conversations feel continuous rather than repetitive.

This post explains how the memory system works from a user perspective, then takes a technical deep dive into the architecture — complete with source code references and implementation details from the open-source codebase.


The Problem: Stateless Models in a Stateful World

Large Language Models are powerful but fundamentally stateless. Each API call is independent — the model receives a prompt, generates tokens, and the weights remain unchanged. There's no built-in mechanism for persistence.

Most AI products handle this with conversation history: they replay previous messages into the context window on each turn. This works for short conversations but breaks down in three ways:

  1. Context window limits: Even with 200K token windows, long-running conversations eventually exceed capacity. Old messages get dropped, and the model "forgets."
  2. Cross-session amnesia: Start a new chat, and everything from the previous session is gone.
  3. No semantic retrieval: The model can only reference what's currently in its context window. If you mentioned your preferred deployment strategy six weeks ago, it's gone.

Clawdbot addresses all three with a layered memory architecture built on plain files, vector search, and automatic persistence.


How Memory Works (User Guide)

If you're running Clawdbot, here's what you need to know about how memory works in practice.

The Two-Layer System

Clawdbot uses two types of memory files, both stored as plain Markdown in your workspace (default: ~/clawd):

Daily Notes (memory/YYYY-MM-DD.md)

These are append-only daily logs. Clawdbot writes running context here — what happened during the day, decisions made, tasks completed. Think of them as a journal.

At the start of each session, Clawdbot automatically reads today's and yesterday's daily notes for recent context. You don't need to do anything — it just knows what happened recently.

Long-Term Memory (MEMORY.md)

This is curated, distilled knowledge. While daily notes capture everything, MEMORY.md holds what matters long-term: your preferences, important decisions, recurring context, and lessons learned.

The distinction matters: daily notes are raw logs; MEMORY.md is the agent's curated understanding of your world.
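
To make the split concrete, here is a hypothetical pair of entries (illustrative content only, not from a real workspace):

memory/2026-01-27.md (daily note, raw log)
  - Deployed v2.3.1 to staging; all integration tests passing
  - Discussed deployment options; leaning toward Coolify

MEMORY.md (curated, long-term)
  - Prefers TypeScript over JavaScript for new projects
  - Staging deploys go through Coolify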

What Gets Remembered

Clawdbot writes to memory in several scenarios:

  • Explicit requests: "Remember that I prefer TypeScript over JavaScript for new projects"
  • Automatic capture: Decisions, preferences, and context that emerge naturally during conversations
  • Pre-compaction flush: Before old conversation history is compacted (summarized), Clawdbot automatically saves any durable information to disk — a silent safety net that prevents context loss
  • Periodic curation: During heartbeat cycles, the agent can review recent daily notes and distill insights into MEMORY.md — like a human reviewing their journal

Searching Memory

You don't need to manage memory files manually. When you ask Clawdbot something that relates to past context, it uses hybrid semantic + keyword search to find relevant notes — even if your wording is completely different from what was originally written.

Ask "What was that deployment tool we discussed?" and Clawdbot will find notes about Coolify, Vercel, or whatever was discussed — even if the word "deployment" never appears in the original notes. But it also handles exact matches: search for an error code like YN0028 and BM25 keyword search catches it instantly.

Security: What Gets Loaded Where

MEMORY.md contains personal context — your preferences, private project details, sensitive decisions. Clawdbot only loads it in your private, main session. In group chats (Discord servers, Telegram groups), MEMORY.md is never injected. This prevents your personal context from leaking to other participants.

Daily notes follow a similar pattern: they're read at session start for continuity but aren't broadcast into shared contexts.


The Architecture (Technical Deep Dive)

Under the hood, Clawdbot's memory system is a composition of several subsystems: workspace files, bootstrap injection, a vector+keyword search engine backed by SQLite, an embedding pipeline with three provider backends, and pre-compaction memory flush. Here's how they fit together.

System Architecture

The full memory architecture spans several source files. Here's how they connect:

            Agent Session
                 │
                 │  memory_search("What did we discuss about X?")
                 ▼
       ┌────────────────────┐
       │ memory-tool        │  (src/agents/tools/memory-tool.ts)
       │  • memory_search   │
       │  • memory_get      │
       └─────────┬──────────┘
                 │
                 ▼
       ┌────────────────────┐      ┌────────────────────────┐
       │ MemoryIndexManager │◄─────│ search-manager.ts      │
       │ (manager.ts)       │      │ (lazy factory / cache) │
       └─────────┬──────────┘      └────────────────────────┘
                 │
        ┌────────┴───────────────┐
        ▼                        ▼
┌─────────────────┐     ┌──────────────────────┐
│ Sync Layer      │     │ Search Layer         │
│ • Memory files  │     │ • Vector search      │
│ • Session files │     │   (cosine sim)       │
│ • File watcher  │     │ • FTS5 keyword       │
└────────┬────────┘     │ • Hybrid merge       │
         │              └──────────┬───────────┘
         ▼                         ▼
┌──────────────────────────────────────────────┐
│               SQLite Database                │
│  ┌──────┐ ┌───────┐ ┌────────┐               │
│  │ meta │ │ files │ │ chunks │               │
│  └──────┘ └───────┘ └────────┘               │
│  ┌────────────┐ ┌────────────┐               │
│  │ chunks_vec │ │ chunks_fts │               │
│  │ (vec0 ext) │ │ (FTS5 ext) │               │
│  └────────────┘ └────────────┘               │
│  ┌─────────────────┐                         │
│  │ embedding_cache │                         │
│  └─────────────────┘                         │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│             Embedding Providers              │
│  ┌─────────┐ ┌────────┐ ┌────────┐           │
│  │ OpenAI  │ │ Gemini │ │ Local  │           │
│  │ +Batch  │ │ +Batch │ │ GGUF   │           │
│  └─────────┘ └────────┘ └────────┘           │
└──────────────────────────────────────────────┘

Source: The core implementation lives across ~20 files under src/memory/ and src/agents/. Key files:

File                   Role
manager.ts             Central MemoryIndexManager — sync, index, search, lifecycle
manager-search.ts      Search execution — vector + keyword queries
hybrid.ts              Hybrid merge logic (BM25 + vector scoring)
embeddings.ts          Provider abstraction — OpenAI, Gemini, or local
internal.ts            Chunking, hashing, cosine similarity
memory-schema.ts       SQLite schema creation and migrations
sync-memory-files.ts   File sync and change detection
sqlite-vec.ts          sqlite-vec extension loader for native vector search

Workspace as Memory Substrate

The core design decision is radical in its simplicity: memory is plain Markdown files on disk. No proprietary format, no vendor lock-in. The agent's memory is literally a directory of .md files that you can read, edit, grep, and version-control with Git. (The SQLite index described later is derived from these files and can be rebuilt from them at any time.)

~/clawd/
├── AGENTS.md          # Operating instructions
├── SOUL.md            # Persona and boundaries
├── USER.md            # User profile
├── IDENTITY.md        # Agent identity
├── TOOLS.md           # Tool-specific notes
├── MEMORY.md          # Curated long-term memory
├── HEARTBEAT.md       # Periodic task checklist
└── memory/
    ├── 2026-01-25.md  # Daily notes
    ├── 2026-01-26.md
    └── 2026-01-27.md

At the start of every session, Clawdbot injects the bootstrap files into the system prompt under a Project Context section. The system prompt assembly is handled in src/agents/system-prompt.ts, with file loading in src/agents/bootstrap-files.ts. Large files are truncated at a configurable limit (default: 20,000 characters per file).
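
As a rough sketch of that injection step (helper and constant names here are illustrative; the real assembly in src/agents/system-prompt.ts is more involved):

// Sketch only: read bootstrap files, truncate, and assemble a "Project Context" section
import { readFile } from "node:fs/promises";
import { join } from "node:path";

const BOOTSTRAP_FILES = ["AGENTS.md", "SOUL.md", "USER.md", "IDENTITY.md", "TOOLS.md", "MEMORY.md"];
const MAX_CHARS_PER_FILE = 20_000; // configurable truncation limit

async function buildProjectContext(workspaceDir: string): Promise<string> {
  const sections: string[] = [];
  for (const name of BOOTSTRAP_FILES) {
    let text: string;
    try {
      text = await readFile(join(workspaceDir, name), "utf8");
    } catch {
      continue; // missing files are simply skipped
    }
    if (text.length > MAX_CHARS_PER_FILE) {
      text = `${text.slice(0, MAX_CHARS_PER_FILE)}\n[...truncated]`;
    }
    sections.push(`## ${name}\n\n${text}`);
  }
  return ["# Project Context", ...sections].join("\n\n");
}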

Why Markdown? The choice over a database or structured store is deliberate:

  1. LLM-native format: Models are trained on Markdown. They read it naturally, write it well, and don't need serialization/deserialization logic.
  2. Human-readable: You can open MEMORY.md in any editor and see exactly what your agent "knows." No opaque embeddings or binary formats.
  3. Git-friendly: The entire memory store can live in a private Git repo. Version history, diffs, and backup for free.
  4. Tool-agnostic: Any file-reading tool the agent has can access memory. No special memory API needed.
  5. Portable: Moving to a new machine is git clone. No database migrations.

The tradeoff is that pure file-based memory doesn't support fast semantic lookup at scale. That's where vector search comes in.


Vector Search Over Memory Files

Reading files sequentially doesn't scale. If your MEMORY.md grows to thousands of lines across months of use, grep-style search breaks down — especially when the query uses different wording than the stored content.

Clawdbot builds a vector index over memory files, backed by a per-agent SQLite database with optional hardware-accelerated search via sqlite-vec.

Chunking

Memory files are split into chunks using a line-by-line accumulator in internal.ts:

// Simplified from internal.ts
function chunkMarkdown(content: string, chunking: { tokens: number; overlap: number }) {
  const maxChars = Math.max(32, chunking.tokens * 4);       // default: 1600 chars
  const overlapChars = Math.max(0, chunking.overlap * 4);   // default: 320 chars
  const lines = content.split("\n");

  for (const line of lines) {
    // Split very long lines into maxChars segments
    // When accumulated chars > maxChars, flush chunk
    // Carry overlap lines into next chunk
  }
}

Default: 400 tokens (~1600 chars) per chunk, 80 token (~320 chars) overlap. Each chunk records startLine, endLine, text, and a SHA-256 hash for deduplication.
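
In TypeScript terms, each chunk record looks roughly like this (an illustrative shape, not the exact type from internal.ts):

import { createHash } from "node:crypto";

// Illustrative chunk record: the fields stored per chunk
type MemoryChunk = {
  startLine: number;
  endLine: number;
  text: string;
  hash: string; // SHA-256 of the chunk text, used for deduplication
};

function hashChunk(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}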

The SQLite Schema

The database schema (memory-schema.ts) stores everything needed for search:

-- Tracked files with content hashes for change detection
CREATE TABLE IF NOT EXISTS files (
  path TEXT PRIMARY KEY,
  source TEXT NOT NULL DEFAULT 'memory',
  hash TEXT NOT NULL,
  mtime INTEGER NOT NULL,
  size INTEGER NOT NULL
);

-- Text chunks with embeddings
CREATE TABLE IF NOT EXISTS chunks (
  id TEXT PRIMARY KEY,
  path TEXT NOT NULL,
  source TEXT NOT NULL DEFAULT 'memory',
  start_line INTEGER NOT NULL,
  end_line INTEGER NOT NULL,
  hash TEXT NOT NULL,
  model TEXT NOT NULL,
  text TEXT NOT NULL,
  embedding TEXT NOT NULL,      -- JSON array of floats
  updated_at INTEGER NOT NULL
);

-- Native vector index (sqlite-vec extension)
CREATE VIRTUAL TABLE IF NOT EXISTS chunks_vec USING vec0(
  id TEXT PRIMARY KEY,
  embedding FLOAT[{dimensions}]  -- dynamically sized
);

-- Full-text search index
CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
  text, id UNINDEXED, path UNINDEXED, source UNINDEXED
);

-- Embedding cache (survives reindexing)
CREATE TABLE IF NOT EXISTS embedding_cache (
  provider TEXT NOT NULL,
  model TEXT NOT NULL,
  provider_key TEXT NOT NULL,
  hash TEXT NOT NULL,
  embedding TEXT NOT NULL,
  PRIMARY KEY (provider, model, provider_key, hash)
);

The vector table dimensions are dynamically created based on the first embedding returned by the provider. If the model changes and produces different dimensions, the table is dropped and rebuilt automatically.
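
Conceptually, that rebuild check looks something like this (a sketch assuming a better-sqlite3-style handle; the stored dimension count is passed in rather than read from the meta table):

// Sketch: drop and recreate chunks_vec when the embedding dimension changes
function ensureVectorTable(
  db: { exec(sql: string): unknown },
  dimensions: number,
  storedDimensions: number | null
) {
  if (storedDimensions !== null && storedDimensions !== dimensions) {
    db.exec("DROP TABLE IF EXISTS chunks_vec"); // old vectors are unusable with a new model
  }
  db.exec(`CREATE VIRTUAL TABLE IF NOT EXISTS chunks_vec USING vec0(
    id TEXT PRIMARY KEY,
    embedding FLOAT[${dimensions}]
  )`);
}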

Embedding Providers

Clawdbot supports three embedding backends through a unified EmbeddingProvider interface (embeddings.ts):

type EmbeddingProvider = {
  id: string;          // "openai" | "gemini" | "local"
  model: string;       // e.g. "text-embedding-3-small"
  embedQuery: (text: string) => Promise<number[]>;
  embedBatch: (texts: string[]) => Promise<number[][]>;
};

Provider   Default Model                   Notes
OpenAI     text-embedding-3-small          Supports Batch API (50% cheaper). Custom endpoints supported
Gemini     gemini-embedding-001            Native Gemini API with async batch. Uses x-goog-api-key header
Local      embeddinggemma-300M-Q8_0.gguf   Runs via node-llama-cpp, ~0.6 GB, zero API cost

Auto-selection priority: local (if configured) → OpenAI (if key available) → Gemini (if key available). If the primary provider fails, a configurable fallback kicks in automatically.
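
The selection logic amounts to a priority check (illustrative sketch, not the exact embeddings.ts code):

// Sketch: auto-select an embedding backend in priority order: local, then OpenAI, then Gemini
function pickEmbeddingProvider(cfg: {
  localModelPath?: string;
  openaiApiKey?: string;
  geminiApiKey?: string;
}): "local" | "openai" | "gemini" {
  if (cfg.localModelPath) return "local";
  if (cfg.openaiApiKey) return "openai";
  if (cfg.geminiApiKey) return "gemini";
  throw new Error("no embedding provider configured");
}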

Batch Embedding APIs

Both OpenAI and Gemini support asynchronous batch embedding for significant cost savings. The OpenAI flow (batch-openai.ts):

1. Upload JSONL file with embedding requests → POST /files
2. Create batch job referencing file → POST /batches
3. Poll for completion → GET /batches/{id}
4. Download output → GET /files/{output_file_id}/content
5. Parse JSONL → extract embeddings by custom_id

Key: max 50,000 requests per batch (auto-splits larger sets), configurable concurrency, and a circuit breaker that disables batch after 2 consecutive failures:

┌────────────────┐   success    ┌────────────────┐
│ Batch Enabled  │◄─────────────│ Batch Running  │
│ (failures: 0)  │─────────────►│                │
└───────┬────────┘              └───────┬────────┘
        │ failure                       │ failure
        ▼                               ▼
┌────────────────┐   ≥2 fails   ┌────────────────┐
│  failures: 1   │─────────────►│ Batch DISABLED │
└────────────────┘              │  (use direct)  │
                                └────────────────┘
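
In code, the breaker is little more than a consecutive-failure counter (a sketch with assumed names, not the actual implementation):

// Sketch: disable batch embedding after 2 consecutive failures, fall back to direct calls
class BatchCircuitBreaker {
  private consecutiveFailures = 0;
  private disabled = false;

  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= 2) this.disabled = true;
  }

  canUseBatch(): boolean {
    return !this.disabled; // when disabled, embeddings go through the synchronous API
  }
}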

Hybrid Search (BM25 + Vector)

Pure vector search is great at semantic matching — "Mac Studio gateway host" matches "the machine running the gateway." But it's weak at exact tokens: IDs like a828e60, code symbols like memorySearch.query.hybrid, error strings. BM25 keyword search is the opposite: strong at exact matches, weak at paraphrases.

Clawdbot combines both in hybrid.ts:

// Simplified from hybrid.ts
function mergeHybridResults(params) {
  const { vector, keyword, vectorWeight, textWeight } = params;
  const byId = new Map();

  // Add all vector results
  for (const r of vector) {
    byId.set(r.id, { ...r, textScore: 0 });
  }

  // Merge keyword results
  for (const r of keyword) {
    const existing = byId.get(r.id);
    if (existing) {
      existing.textScore = r.textScore;
      existing.snippet = r.snippet;  // prefer keyword snippet
    } else {
      byId.set(r.id, { ...r, vectorScore: 0 });
    }
  }

  // Compute weighted score and sort
  return Array.from(byId.values())
    .map(e => ({
      ...e,
      score: vectorWeight * e.vectorScore + textWeight * e.textScore
    }))
    .sort((a, b) => b.score - a.score);
}

Default weights: 70% vector, 30% keyword. BM25 scores from SQLite FTS5 (which returns negative ranks where lower = better) are normalized to 0–1:

// BM25 rank → 0-1 score (from hybrid.ts)
function bm25RankToScore(rank: number): number {
  const normalized = Math.max(0, rank);
  return 1 / (1 + normalized);
  // rank=0 → 1.0 (perfect), rank=1 → 0.5, rank=10 → ~0.09
}

FTS5 queries are built by tokenizing input and AND-joining terms:

function buildFtsQuery(raw: string): string | null {
  const tokens = raw.match(/[A-Za-z0-9_]+/g)?.map(t => t.trim()).filter(Boolean) ?? [];
  if (tokens.length === 0) return null;
  return tokens.map(t => `"${t.replaceAll('"', "")}"`).join(" AND ");
}
// "hello world" → '"hello" AND "world"'

Both search paths fetch maxResults × candidateMultiplier candidates (default: 6 × 4 = 24) before merging, ensuring enough high-quality candidates survive the weighting.

When sqlite-vec is available, vector search uses hardware-accelerated SQL:

SELECT c.id, c.path, c.start_line, c.end_line, c.text, c.source,
       vec_distance_cosine(v.embedding, ?) AS dist
  FROM chunks_vec v
  JOIN chunks c ON c.id = v.id
 WHERE c.model = ?
 ORDER BY dist ASC LIMIT ?

Without it, Clawdbot falls back to brute-force cosine similarity in JavaScript — O(n) per query, but functional.
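
The fallback is the textbook cosine formula applied to every stored embedding, roughly this (the actual helper lives in internal.ts):

// Brute-force fallback: cosine similarity between a query vector and a chunk vector
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}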


File Synchronization & Change Detection

The sync system keeps the vector index fresh without unnecessary re-embedding. Source: sync-memory-files.ts.

File discovery recursively walks MEMORY.md (or memory.md) plus everything under memory/, indexing only .md files.

Change detection uses SHA-256 content hashing — not timestamps:

const record = db.prepare(`SELECT hash FROM files WHERE path = ? AND source = ?`)
  .get(entry.path, "memory");
if (!needsFullReindex && record?.hash === entry.hash) return;  // skip — unchanged

Sync triggers run through multiple paths:

Trigger         Default   Description
Session start             Ensures fresh index when agent wakes
Before search             Fire-and-forget sync (non-blocking)
File watcher              chokidar with 1.5s debounce
Interval        Off       Optional periodic sync

Critical design: search never blocks on sync. The sync is fire-and-forget, so searches return immediately against the current index:

async search(query, opts?) {
  if (this.settings.sync.onSearch && this.dirty) {
    void this.sync({ reason: "search" });   // non-blocking!
  }
  // Proceed with search immediately against current index
}

Stale cleanup: after sync, files no longer on disk are automatically removed from the database (files, chunks, vector table, and FTS index).
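
In sketch form (assuming a better-sqlite3-style handle; table names match the schema above):

// Sketch: delete index rows for files that no longer exist on disk
type Db = { prepare(sql: string): { all(...p: unknown[]): unknown[]; run(...p: unknown[]): unknown } };

function removeStaleFiles(db: Db, presentPaths: Set<string>) {
  const indexed = db.prepare(`SELECT path FROM files WHERE source = 'memory'`).all() as { path: string }[];
  for (const { path } of indexed) {
    if (presentPaths.has(path)) continue;
    db.prepare(`DELETE FROM chunks_vec WHERE id IN (SELECT id FROM chunks WHERE path = ?)`).run(path);
    db.prepare(`DELETE FROM chunks_fts WHERE path = ?`).run(path);
    db.prepare(`DELETE FROM chunks WHERE path = ?`).run(path);
    db.prepare(`DELETE FROM files WHERE path = ?`).run(path);
  }
}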


Atomic Reindexing & Fault Tolerance

When the embedding model or provider changes, the entire index needs rebuilding. This is inherently dangerous — a crash mid-reindex could leave a corrupted database. Clawdbot uses an atomic swap strategy:

1. Create temp database (original.sqlite.tmp-{uuid})
2. Initialize schema in temp DB
3. Seed embedding cache from original DB (reuse existing embeddings!)
4. Index all files into temp DB
5. Close both DBs
6. Atomic swap: original → backup, temp → original
7. Delete backup
8. Open new DB from original path

The swap handles SQLite's WAL files (the -wal and -shm companions):

// Simplified from manager.ts
async swapIndexFiles(targetPath, tempPath) {
  const backupPath = `${targetPath}.backup-${randomUUID()}`;
  await moveIndexFiles(targetPath, backupPath);    // original → backup
  try {
    await moveIndexFiles(tempPath, targetPath);    // temp → original
  } catch (err) {
    await moveIndexFiles(backupPath, targetPath);  // restore on failure!
    throw err;
  }
  await removeIndexFiles(backupPath);              // cleanup backup
}

async moveIndexFiles(source, target) {
  for (const suffix of ["", "-wal", "-shm"]) {    // SQLite file triplet
    await fs.rename(`${source}${suffix}`, `${target}${suffix}`);
  }
}

If the reindex fails at any point — embedding API down, disk full, whatever — the original database is preserved intact. The test suite explicitly verifies this:

// From test suite: "preserves existing index when forced reindex fails"
await expect(manager.sync({ force: true })).rejects.toThrow(/mock embeddings failed/i);
const after = manager.status();
expect(after.files).toBe(before.files);    // original data intact
expect(after.chunks).toBe(before.chunks);  // no data loss

Embedding Cache

Embeddings are expensive (API calls) or slow (local inference). The cache (embedding_cache table) stores embeddings keyed by content hash, so identical text is never re-embedded:

Cache key: (provider, model, provider_key, content_hash)

The provider_key is a fingerprint of the API endpoint configuration — so switching from OpenAI's API to an OpenAI-compatible proxy correctly invalidates the cache.

Cache benefits:

  • Forced reindexes reuse existing embeddings (the atomic reindex seeds the cache from the original DB)
  • Minor file edits only re-embed changed chunks
  • Provider fallback triggers cache miss (correct behavior — different model = different embeddings)

Cache pruning removes oldest entries when maxEntries is exceeded.
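
Put together, the lookup path is roughly this (a sketch; EmbeddingProvider is the type shown earlier, and the db handle is assumed to be better-sqlite3-style):

import { createHash } from "node:crypto";

// Sketch: return a cached embedding for identical text, otherwise embed and store it
async function embedWithCache(
  db: { prepare(sql: string): { get(...p: unknown[]): unknown; run(...p: unknown[]): unknown } },
  provider: EmbeddingProvider,
  providerKey: string,
  text: string
): Promise<number[]> {
  const hash = createHash("sha256").update(text).digest("hex");
  const hit = db.prepare(
    `SELECT embedding FROM embedding_cache
      WHERE provider = ? AND model = ? AND provider_key = ? AND hash = ?`
  ).get(provider.id, provider.model, providerKey, hash) as { embedding: string } | undefined;
  if (hit) return JSON.parse(hit.embedding);

  const vector = await provider.embedQuery(text);
  db.prepare(
    `INSERT OR REPLACE INTO embedding_cache (provider, model, provider_key, hash, embedding)
     VALUES (?, ?, ?, ?, ?)`
  ).run(provider.id, provider.model, providerKey, hash, JSON.stringify(vector));
  return vector;
}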


Pre-Compaction Memory Flush

This is the most elegant piece of the memory system. When a session approaches the model's context window limit, Clawdbot needs to compact (summarize) older conversation history. But compaction is destructive — the original messages are replaced with a summary, and nuance is lost.

Before compaction triggers, Clawdbot runs a silent, agentic turn (src/auto-reply/reply/memory-flush.ts) that gives the model one last chance to save anything important to disk:

Normal conversation flow:

  User:  "Deploy to staging"
  Agent: "Done. Deployed v2.3.1 to staging"
  User:  "Run the integration tests"
  Agent: "All 47 tests passing"
  ...
  [context window filling up]

🧠 Memory flush (silent turn):

  System: "Session nearing compaction. Store durable memories now."
  Agent:  *writes to memory/2026-01-27*
  Agent:  "NO_REPLY"

🧹 Compaction (src/agents/compaction.ts):

  Older messages → summary

[Session continues with summary + recent messages]

The flush is completely invisible to the user — the NO_REPLY response is swallowed by the Gateway. The agent quietly persists what matters before the context window is reclaimed.

Timing: The flush triggers when the session token estimate crosses contextWindow - reserveTokensFloor - softThresholdTokens. Only one flush runs per compaction cycle (tracked in the session store). If the workspace is read-only (sandboxed sessions), the flush is skipped entirely.
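
The trigger condition itself is simple arithmetic (a sketch; option names mirror the config below):

// Sketch: run the flush once per compaction cycle, when the token estimate crosses the soft threshold
function shouldFlushMemory(opts: {
  estimatedTokens: number;
  contextWindow: number;
  reserveTokensFloor: number;
  softThresholdTokens: number;
  alreadyFlushedThisCycle: boolean;
  workspaceReadOnly: boolean;
}): boolean {
  if (opts.alreadyFlushedThisCycle || opts.workspaceReadOnly) return false;
  const threshold = opts.contextWindow - opts.reserveTokensFloor - opts.softThresholdTokens;
  return opts.estimatedTokens >= threshold;
}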

Configuration:

{
  agents: {
    defaults: {
      compaction: {
        reserveTokensFloor: 20000,
        memoryFlush: {
          enabled: true,
          softThresholdTokens: 4000,
          systemPrompt: "Session nearing compaction. Store durable memories now.",
          prompt: "Write lasting notes to memory/YYYY-MM-DD.md; reply NO_REPLY."
        }
      }
    }
  }
}

Session Memory Search (Experimental)

Beyond explicit memory files, Clawdbot can optionally index session transcripts — the raw JSONL conversation history. This lets memory_search surface things that were discussed but never explicitly written to a memory file.

{
  agents: {
    defaults: {
      memorySearch: {
        experimental: { sessionMemory: true },
        sources: ["memory", "sessions"]
      }
    }
  }
}

Session transcripts are parsed from JSONL, extracting only user and assistant messages:

// From session-files.ts
if (message.role !== "user" && message.role !== "assistant") continue;
const text = extractSessionText(message.content);
collected.push(`${message.role === "user" ? "User" : "Assistant"}: ${text}`);

Session indexing uses delta thresholds (100KB or 50 new messages) to avoid constant re-embedding. An efficient newline counter reads only the changed portion of each file:

// Count newlines in just the new bytes (64KB chunks)
async function countNewlines(absPath: string, start: number, end: number): Promise<number> {
  const handle = await fs.open(absPath, "r");
  const buffer = Buffer.alloc(64 * 1024);
  let count = 0;
  let offset = start;
  while (offset < end) {
    const toRead = Math.min(buffer.length, end - offset);
    const { bytesRead } = await handle.read(buffer, 0, toRead, offset);
    if (bytesRead === 0) break;
    for (let i = 0; i < bytesRead; i++) {
      if (buffer[i] === 10) count++;  // 0x0A = newline
    }
    offset += bytesRead;
  }
  await handle.close();
  return count;
}

Cost Optimization: Seven Layers Deep

The memory system implements aggressive cost optimization at every level:

  1. Embedding cache: Content-hash keyed. Identical text chunks never re-embedded, even across reindexes.
  2. SHA-256 change detection: Files only re-indexed when content actually changes (not on timestamp updates).
  3. Batch APIs: OpenAI and Gemini batch embedding is ~50% cheaper than synchronous calls. Auto-splits at 50K requests.
  4. Delta-based session indexing: Only re-embeds sessions that grew significantly (100KB+ or 50+ messages).
  5. Debounced sync: File watcher uses 1.5s debounce; session events use 5s. No rapid-fire re-indexing.
  6. Concurrent indexing: 4 files indexed in parallel by default (worker pool pattern).
  7. Local embeddings: node-llama-cpp with a 0.6GB GGUF model means zero API cost for complete offline operation.

And when things go wrong, the provider fallback automatically switches to an available backend and triggers a reindex with the new embeddings.


The Agent Tools

The agent interacts with memory through two tools defined in memory-tool.ts:

memory_search — Semantically searches all memory Markdown files. Returns snippets (capped at 700 chars) with file path, line range, score, and source:

{
  "results": [
    {
      "path": "memory/2026-01-15.md",
      "startLine": 5,
      "endLine": 12,
      "score": 0.87,
      "snippet": "Discussed deployment strategy...",
      "source": "memory"
    }
  ],
  "provider": "openai",
  "model": "text-embedding-3-small"
}

memory_get — Reads specific lines from a memory file. Strict path safety rejects anything outside MEMORY.md / memory/:

if (!absPath.startsWith(this.workspaceDir)) {
  throw new Error("path escapes workspace");
}

The two-step pattern (search → targeted read) is intentional: search finds relevant chunks with line numbers, then memory_get retrieves precise sections. This keeps context windows small.


The Full Memory Lifecycle

Putting it all together:

1. CONVERSATION
   User says something → Agent processes and responds

2. IMPLICIT CAPTURE
   Agent decides to write to memory/YYYY-MM-DD.md

3. PRE-COMPACTION FLUSH
   Context window full → silent turn saves durable notes

4. INDEXING
   File watcher detects change → chunk → embed → SQLite
   (1.5s debounce, concurrent)

5. RETRIEVAL
   Future question → memory_search → hybrid BM25+vector lookup
   → memory_get for full context

6. CURATION
   Periodically, agent reviews daily notes → updates MEMORY.md
   with distilled long-term insights

Step 6 is particularly interesting: Clawdbot can use its heartbeat mechanism (periodic background check-ins) to review recent daily notes and update MEMORY.md with distilled learnings — like a human reviewing their journal and updating their mental model.


Comparison: How Other Systems Handle Memory

System            Memory Approach                               Limitations / Strengths
ChatGPT           Conversation history + "Memory" feature       Limited slots, no semantic search, no user control over format
Claude Projects   Attached docs + conversation                  No cross-session memory, no automatic persistence
Cursor/Windsurf   .cursorrules / project context                Static files, no dynamic memory, no search
Custom RAG        Vector DB + retrieval pipeline                Complex setup, separate infrastructure, no file-level transparency
Clawdbot          Markdown + SQLite vector index + auto-flush   Human-readable, Git-friendly, hybrid search, atomic reindex, zero external deps

The key differentiator is the combination of transparency (plain files), resilience (atomic reindex, circuit breakers, provider fallback), and zero external dependencies (everything runs in-process with SQLite).


Practical Tips

For everyday users:

  • Say "remember this" when you want something persisted
  • Check MEMORY.md occasionally to see what your agent "knows"
  • Use /compact if conversations feel stale or repetitive

For power users:

  • Put your workspace in a private Git repo for backup and history
  • Configure local embeddings (provider: "local") for fully offline memory search
  • Tune vectorWeight / textWeight for your use case (more keyword-heavy for code, more vector-heavy for natural language)
  • Enable experimental.sessionMemory to search past conversations
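
For example, a tuned memorySearch block might look roughly like this (the experimental/sources keys are from the example above; the provider and weight keys are illustrative and may differ from the actual config schema):

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local",                                        // fully offline embeddings
        query: {
          hybrid: { vectorWeight: 0.6, textWeight: 0.4 }          // keyword-heavier mix for code-y notes
        },
        experimental: { sessionMemory: true },
        sources: ["memory", "sessions"]
      }
    }
  }
}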

For developers:


What's Next

This is the first post in a series exploring Clawdbot's architecture. The memory system is the foundation — without persistent, searchable memory, an AI assistant is just a fancy autocomplete. With it, you get something that genuinely improves over time.

Upcoming posts will cover:

  • Session management and compaction: How Clawdbot keeps long-running conversations within context limits
  • The agent runtime: How the embedded runtime orchestrates tools, models, and channels
  • Multi-channel architecture: How one assistant serves WhatsApp, Telegram, Discord, and more simultaneously