The MemPalace Memory System

MemPalace is Crawbl's persistent, per-workspace memory system. Agents use it to build context before every response. This page covers the palace data model, the 3-phase ingestion pipeline, the 4-layer retrieval stack, the knowledge graph, and every MCP tool.

1. Overview

LLMs have no memory between conversations. MemPalace gives each workspace a persistent, searchable memory system backed by PostgreSQL (with pgvector for embeddings) and Redis (for palace-graph room aggregation caching). There is no external vector database and no separate microservice.

All agents in a workspace share one memory pool. Topic separation comes from the wing/room taxonomy, not agent boundaries. Every memory operation is scoped by workspace_id -- agents cannot read memories from other workspaces.

Codebase location

crawbl-backend/internal/orchestrator/memory/
├── types.go              # Core domain types (Drawer, Entity, Triple, Identity,
│                         #   HybridSearchResult, TraversalResult, Tunnel,
│                         #   PipelineTier constants, HeuristicKillSwitchValue)
├── repo/                 # All persistence (consumer-side interfaces per consumer)
│   ├── drawerrepo/       # pgvector drawers + hybrid CTE search
│   ├── centroidrepo/     # memory_type_centroids (Phase 2 k-NN)
│   ├── kgrepo/           # Knowledge graph: entities + temporal triples
│   ├── palacegraphrepo/  # BFS traversal + Redis-cached room aggregation
│   └── identityrepo/     # memory_identities upsert/read
├── layers/               # 4-layer retrieval stack (L0–L3)
│   ├── stack.go          # Composes layers into WakeUp / Recall / Search
│   ├── l0_identity.go    # L0: workspace identity (via identityrepo)
│   ├── l1_essential.go   # L1: top memories by importance
│   ├── l2_ondemand.go    # L2: filtered by wing/room
│   ├── l3_search.go      # L3: fallback pgvector-only search
│   └── retrieval.go      # HybridRetrieve — one CTE, no goroutines
├── autoingest/           # In-process pond pool for the hot path (NOT River)
│   ├── types.go          # Service, Work, Deps, Config, Metrics interfaces
│   ├── service.go        # NewService — wires pond.TypedPool; Submit + Shutdown
│   ├── worker.go         # per-chunk pipeline (classify → embed → centroid? → persist)
│   └── helpers.go        # isNoise, chunkText, buildDrawer, autoIngestDrawerID
├── jobs/                 # Business logic for the cold pipeline (driver-agnostic)
│   ├── process.go        # RunProcess — LLM reclassification
│   ├── maintain.go       # RunMaintain — decay + prune
│   ├── enrich.go         # RunEnrich — KG backfill
│   └── centroids.go      # RunCentroidRecompute — weekly centroid rebuild
├── extract/              # Heuristic + LLM memory classifiers
│   └── classify.go       # Regex-based heuristic classifier
└── config/               # Embedded JSON config (noise_patterns, classify_patterns)

River adapters for the cold pipeline live in internal/orchestrator/queue/memory_workers.go, keeping the jobs/ business logic free of River imports.

The SQL migration is at migrations/orchestrator/000005_memory_palace.up.sql.

2. Data Model

The palace metaphor

Every memory chunk is called a drawer -- a piece of verbatim text filed into a location in the palace:

Wing is the broadest category (like a department). Room is a topic within that wing. Hall is optional extra granularity -- most drawers skip it. The combination of wing + room is the primary navigation path.

Schema

`memory_drawers`

Field	Type	Purpose
`id`	`TEXT PK`	MD5-based deterministic ID
`workspace_id`	`TEXT`	Tenant isolation (all queries are scoped)
`wing`	`TEXT`	Top-level category
`room`	`TEXT`	Subtopic within the wing
`hall`	`TEXT`	Optional granular grouping
`content`	`TEXT`	Verbatim memory text (max 10,000 chars)
`embedding`	`vector(1536)`	pgvector embedding (`text-embedding-3-small`)
`importance`	`FLOAT`	Priority score 0--5 (default 3.0)
`memory_type`	`TEXT`	`decision|preference|milestone|problem|emotional|fact|task`
`pipeline_tier`	`TEXT`	`heuristic|centroid|llm` -- which classifier made the final call
`state`	`TEXT`	`raw|processed|merged|failed`
`summary`	`TEXT`	LLM-generated one-line summary (cold path only)
`source_file`	`TEXT`	Where the memory originated
`added_by`	`TEXT`	`"auto-ingest"`, `"mobile"`, agent name, etc.
`added_by_agent`	`TEXT`	Agent UUID for affinity ranking
`filed_at`	`TIMESTAMP`	When it was filed
`last_accessed_at`	`TIMESTAMPTZ`	Updated on retrieval (`TouchAccess`)
`access_count`	`INT`	Incremented on retrieval
`superseded_by`	`TEXT`	Points to newer contradicting drawer
`cluster_id`	`TEXT`	Canonical drawer ID for merged clusters
`retry_count`	`INT`	Cold worker failure counter (max 3)
`entity_count`	`INT`	Filled by `memory_enrich` worker
`triple_count`	`INT`	Filled by `memory_enrich` worker

`memory_entities`

Field	Purpose
`(workspace_id, id)`	Composite PK. ID is SHA256 of normalized name
`name`	Display name (e.g., "PostgreSQL", "Alice")
`type`	Classification (e.g., "technology", "person", "service")
`properties`	JSON metadata bag
`embedding`	`vector(1536)` -- column exists, embedding fallback retrieval not yet implemented

`memory_triples`

Field	Purpose
`(workspace_id, id)`	Composite PK
`subject`, `predicate`, `object`	Entity IDs forming a directed relationship
`valid_from`, `valid_to`	Temporal range (`NULL` valid_to = current fact)
`confidence`	Relationship confidence score
`source_closet`	Origin drawer reference

`memory_identities`

One row per workspace. Holds the L0 identity text (max 2,000 characters).

`memory_type_centroids`

Field	Purpose
`memory_type`	PK -- one row per memory type
`centroid`	`vector(1536)` -- element-wise average of LLM-labelled embeddings
`sample_count`	Rows used; below 50 the centroid is ignored
`computed_at`	Last recompute timestamp
`source_hash`	Recompute is a no-op when hash is unchanged

State machine

                  +-- retry < 3 ---+
                  |                |
  [INSERT] --> raw ----> processed |
                |    ^       |     |
                |    +-------+     |
                |                  |
                +-- retry >= 3 --> failed

  processed ----> merged  (cluster canonical absorbs this drawer)
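
The retry loop in the diagram can be sketched as a small transition function. This is a minimal illustration, not the actual worker code: `afterProcess` and the `State` type are hypothetical names, and it assumes the retry counter is incremented before it is compared with the cap of 3.

```go
package main

import "fmt"

// State mirrors the values of memory_drawers.state.
type State string

const (
	StateRaw       State = "raw"
	StateProcessed State = "processed"
	StateMerged    State = "merged"
	StateFailed    State = "failed"
)

// maxRetries matches the documented cold-worker retry cap.
const maxRetries = 3

// afterProcess returns the next state and retry count for a raw drawer
// after one cold-worker attempt. Hypothetical helper for illustration.
func afterProcess(retryCount int, succeeded bool) (State, int) {
	if succeeded {
		return StateProcessed, retryCount
	}
	retryCount++
	if retryCount >= maxRetries {
		return StateFailed, retryCount
	}
	return StateRaw, retryCount // stays raw; the next sweep retries it
}

func main() {
	// Simulate a drawer whose classification keeps failing.
	state, retries := StateRaw, 0
	for state == StateRaw {
		state, retries = afterProcess(retries, false)
		fmt.Println(state, retries)
	}
}
```

The `merged` state is not reachable from this function; per the diagram it is only entered from `processed` when clustering absorbs the drawer.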

3. Ingestion Pipeline

flowchart TD
    A["User Message → Agent Reply\nstream.go finalize()"] --> B

    subgraph HOT["HOT PATH (request goroutine)"]
        B["chatservice.autoIngestConversation\nbuild exchange, trim noise"] --> C["autoingest.Pool.Submit\nnon-blocking"]
        C --> D{queue full?}
        D -- yes --> E["drop + warn log\n+ Dropped counter"]
        D -- no --> F["queued in pond pool"]
    end

    subgraph POOL["AUTO-INGEST POOL (in-process alitto/pond)"]
        F --> G["chunk + heuristic classify"]
        G --> H{confidence >= 0.8?}
        H -- yes --> I["embed + dedup + persist\nstate=processed\npipeline_tier=heuristic"]
        H -- no --> J{confidence >= 0.5?}
        J -- yes --> K["embed + centroid NearestType"]
        K --> L{cosine > 0.85?}
        L -- yes --> M["persist\nstate=processed\npipeline_tier=centroid"]
        L -- no --> N["persist\nstate=raw\npipeline_tier=llm"]
        J -- no --> N
        I --> O["publish NATS MemoryEvent"]
        M --> O
        N --> O
    end

    subgraph RIVER["RIVER WORKERS (periodic, not on hot path)"]
        P["memory_process\n1-min sweep\nclaim raw drawers\nLLM batch classify\nentity link + cluster\nstate: raw → processed"]
        Q["memory_enrich\n10-min sweep\nKG backfill for\nheuristic/centroid drawers\nimportance >= 3"]
        R["memory_maintain\ndaily midnight\ndecay + prune"]
        S["memory_centroid_recompute\nSunday 03:00 UTC\nrebuild prototype vectors\nfrom llm-labelled drawers"]
    end

    N -.->|"picked up within 60s"| P
    I -.->|"high-importance only"| Q
    M -.->|"high-importance only"| Q

Hot path: auto-ingest pool

After stream.go finalize() returns an agent reply, chatservice.autoIngestConversation builds the exchange pair, trims noise, and calls autoingest.Pool.Submit(). The request goroutine returns immediately -- orchestrator response latency is unaffected.

The pool is backed by github.com/alitto/pond (v2) with bounded capacity and non-blocking submit. If the queue is full, the work is dropped with a metric increment and warn log. The original messages remain in the messages table for potential future replay.
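
The drop-on-full behaviour can be illustrated without pond. The sketch below uses a plain buffered channel as a stand-in for the pool queue; it is not the pond API, and `boundedQueue`, `Work`, and the `dropped` counter are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Work is a hypothetical payload; the real type lives in autoingest/types.go.
type Work struct{ WorkspaceID, Text string }

// boundedQueue is a stand-in for the pond pool: a fixed-capacity queue
// whose Submit never blocks the caller.
type boundedQueue struct {
	ch      chan Work
	dropped atomic.Int64
}

func newBoundedQueue(capacity int) *boundedQueue {
	return &boundedQueue{ch: make(chan Work, capacity)}
}

// Submit enqueues w if there is room and reports whether it was accepted.
// When the queue is full the work is dropped and counted, never awaited.
func (q *boundedQueue) Submit(w Work) bool {
	select {
	case q.ch <- w:
		return true
	default:
		q.dropped.Add(1)
		return false
	}
}

func main() {
	q := newBoundedQueue(2)
	for i := 0; i < 3; i++ {
		fmt.Println(q.Submit(Work{WorkspaceID: "ws-1", Text: "hello"}))
	}
	fmt.Println("dropped:", q.dropped.Load())
}
```

Because `Submit` returns immediately in both branches, the request goroutine pays at most one channel operation per chat turn, which is the property the hot path depends on.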

Why not River? Every chat turn would write one river_job row on the critical path. At scale, that is O(messages/second) Postgres writes just to hand a payload to a worker in the same process. River is used for periodic/cross-pod/must-survive-restart work; pond handles hot-path fan-out inside one pod.

Per-chunk pipeline (autoingest/worker.go):

1. Noise filter -- drop greetings and very short messages (configurable via the embedded noise_patterns.json)
2. Chunk -- split content longer than 800 chars at sentence boundaries with a 100-char overlap
3. Heuristic classify -- regex-based scoring in extract/classify.go
4. Embed -- text-embedding-3-small via the configured embedding provider
5. Dedup -- skip the chunk if its cosine similarity to an existing drawer exceeds 0.85
6. Tier decision (pickTier):
   - confidence >= HeuristicConfidenceHigh (0.8 when Phase 1 is enabled; see Kill switches) → state=processed, pipeline_tier=heuristic -- done
   - confidence in [HeuristicConfidenceLow, HeuristicConfidenceHigh) AND the centroid lookup finds similarity > 0.85 → state=processed, pipeline_tier=centroid -- done
   - otherwise → state=raw, pipeline_tier=llm -- the cold path picks it up
7. Persist -- idempotent insert via DrawerRepo.AddIdempotent
8. Publish -- emit a NATS MemoryEvent
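
The tier decision above can be sketched as a pure function. The real `pickTier` signature is not shown in this document, so the `Thresholds` struct and the parameters below are assumptions made for illustration.

```go
package main

import "fmt"

// Thresholds bundles the gates; the field names are illustrative.
type Thresholds struct {
	High     float64 // HeuristicConfidenceHigh
	Low      float64 // HeuristicConfidenceLow
	Centroid float64 // minimum cosine similarity to trust a centroid match
}

// pickTier returns (state, pipeline_tier) for one chunk.
// centroidFound is false when no usable centroid exists for the lookup.
func pickTier(t Thresholds, confidence, centroidSim float64, centroidFound bool) (string, string) {
	switch {
	case confidence >= t.High:
		return "processed", "heuristic"
	case confidence >= t.Low && centroidFound && centroidSim > t.Centroid:
		return "processed", "centroid"
	default:
		return "raw", "llm"
	}
}

func main() {
	enabled := Thresholds{High: 0.8, Low: 0.5, Centroid: 0.85}
	fmt.Println(pickTier(enabled, 0.9, 0, false))   // heuristic tier
	fmt.Println(pickTier(enabled, 0.6, 0.91, true)) // centroid tier
	fmt.Println(pickTier(enabled, 0.6, 0.70, true)) // falls through to llm

	// With both gates at 999.0 every chunk falls through to the cold path.
	disabled := Thresholds{High: 999, Low: 999, Centroid: 0.85}
	fmt.Println(pickTier(disabled, 1.0, 0.99, true))
}
```

Setting `Low` equal to `High` makes the second case unreachable, which is how the centroid band collapses to zero width.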

Pool sizing (env-configurable):

Env var	Default	Purpose
`CRAWBL_AUTOINGEST_WORKERS`	16	Concurrent goroutines (sized for I/O-bound embedding calls)
`CRAWBL_AUTOINGEST_CAPACITY`	1024	Queue depth (~1s head-room at 1K msg/sec per pod)

Cold path: River workers

All cold workers run as River periodic jobs inside the orchestrator binary. No separate scheduler component.

`memory_process` (1-minute sweep)

jobs/process.go → RunProcess. Claims state=raw drawers with FOR UPDATE SKIP LOCKED (multi-pod safe). Batch classifies all drawers per workspace in one gpt-4o-mini structured output call. Falls back to individual calls on parse failure.

Steps per drawer:

1. The LLM returns memory_type, importance (0--1, scaled to 0--5), entities, summary, and relationship triples
2. Entities are upserted into the KG; triples create relationship edges
3. pipeline_tier is set to 'llm'
4. Clustering: drawers with cosine > 0.85 are merged (the canonical drawer absorbs cluster members, the others get state=merged)
5. Conflict detection: drawers in the 0.75--0.90 cosine range are checked for contradiction via the LLM; the older drawer gets superseded_by = new_id
6. State transition: raw → processed (or failed after 3 retries)

`memory_enrich` (10-minute sweep)

jobs/enrich.go → RunEnrich. High-confidence drawers that bypassed the cold pipeline miss entity linking. For drawers with importance >= 3, this worker runs LLM extract to backfill KG entities and triples. Updates entity_count and triple_count on the drawer.

Query: state=processed AND pipeline_tier <> 'llm' AND entity_count=0 AND importance >= 3.0 ORDER BY created_at ASC LIMIT 100

Heuristic- or centroid-classified drawers with importance < 3 never match this query, so they stay entity-less permanently.

`memory_maintain` (daily at midnight UTC)

jobs/maintain.go → RunMaintain. Only processes workspaces with activity in the last 24 hours.

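Decay: importance = max(importance * 0.98, 0.3) for drawers older than 30 days and not accessed within 7 days. Applied once per daily run, importance halves after about 34 passes, so in an active workspace a never-accessed drawer reaches half its importance roughly two months after filing (30-day grace period plus ~34 days of decay).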
Pruning: deletes drawers with importance < 0.5 AND access_count < 3, keeping minimum 100 per workspace.
Access-based reinforcement: retrieval calls TouchAccess(), resetting the decay clock.
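
The decay rule and its half-life can be checked with a few lines of arithmetic. This is a sketch under the assumption that one decay pass runs per day for an eligible drawer; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

const (
	decayFactor = 0.98 // DecayFactor
	decayFloor  = 0.3  // DecayFloor
)

// decayOnce applies one maintenance pass to an eligible drawer.
func decayOnce(importance float64) float64 {
	return math.Max(importance*decayFactor, decayFloor)
}

// passesToHalve returns how many passes it takes for importance to halve,
// ignoring the floor: solve 0.98^n = 0.5 for n.
func passesToHalve() float64 {
	return math.Log(0.5) / math.Log(decayFactor)
}

func main() {
	imp := 3.0
	for pass := 0; pass < 35; pass++ {
		imp = decayOnce(imp)
	}
	fmt.Printf("importance after 35 passes: %.2f\n", imp)
	fmt.Printf("passes to halve: %.1f\n", passesToHalve())
}
```

Since the floor (0.3) sits below the prune threshold (0.5), a drawer that keeps decaying eventually becomes a pruning candidate unless it has been accessed at least 3 times.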

`memory_centroid_recompute` (Sunday 03:00 UTC)

jobs/centroids.go → RunCentroidRecompute. Aggregates up to 500 LLM-labelled drawers per type from the last 90 days, averages embeddings in Go, upserts into memory_type_centroids. The source_hash conditional update makes the job a no-op when no new LLM-labelled drawers exist.

Feedback-loop prevention: centroids are trained only on pipeline_tier='llm' drawers. Centroid-labelled drawers are excluded from all recomputes.

Sample floor: sample_count < 50 causes NearestType to return found=false. Phase 2 falls through to the cold LLM path. New workspaces stay safe until enough LLM-labelled history accumulates.
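
The element-wise average and the sample floor can be sketched as follows. `meanVector` and `usable` are hypothetical helpers; the real job also handles the 90-day window, the 500-row cap, and the `source_hash` check, none of which are modelled here.

```go
package main

import "fmt"

const minSamples = 50 // MemoryCentroidMinSamples

// meanVector returns the element-wise average of vecs.
// ok is false when vecs is empty or the dimensions do not match.
func meanVector(vecs [][]float32) (mean []float32, ok bool) {
	if len(vecs) == 0 {
		return nil, false
	}
	dim := len(vecs[0])
	sum := make([]float64, dim) // accumulate in float64 to limit rounding error
	for _, v := range vecs {
		if len(v) != dim {
			return nil, false
		}
		for i, x := range v {
			sum[i] += float64(x)
		}
	}
	mean = make([]float32, dim)
	for i := range sum {
		mean[i] = float32(sum[i] / float64(len(vecs)))
	}
	return mean, true
}

// usable reports whether a centroid built from sampleCount rows
// may be used for classification.
func usable(sampleCount int) bool {
	return sampleCount >= minSamples
}

func main() {
	centroid, ok := meanVector([][]float32{{1, 3}, {3, 5}})
	fmt.Println(centroid, ok)
	fmt.Println(usable(12), usable(50))
}
```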

Kill switches

Both phase gates are read once at boot from env vars. Defaults are 999.0 (disabled -- every chunk falls to LLM path).

Env var	Default	Effect when set
`CRAWBL_MEM_HEURISTIC_HIGH`	999.0	Set to `0.8` to enable Phase 1 (heuristic trust)
`CRAWBL_MEM_HEURISTIC_LOW`	999.0	Set to `0.5` to enable Phase 2 (centroid k-NN band)

To disable Phase 2 only: set CRAWBL_MEM_HEURISTIC_LOW = CRAWBL_MEM_HEURISTIC_HIGH. The centroid band collapses to zero width. Requires pod restart.

Crash recovery

Failure point	What happens	Recovery
Pod crashes between Submit and worker pickup	Chunk lost	`messages` row exists for future replay
Crash after persist (`state=raw`)	Drawer sits raw	`memory_process` sweep picks it up within 60s
Crash after persist (`state=processed`)	Drawer done; entity linking pending	`memory_enrich` sweep picks it up within 10 min

4. Memory Classification

The seven memory types

Type	What it captures
`decision`	Architecture choices, technology picks, trade-offs
`preference`	Personal or team style rules
`milestone`	Achievements, breakthroughs, completed work
`problem`	Bugs, errors, root causes, and their fixes
`emotional`	Personal feelings, team morale moments
`fact`	Factual statements about the user, project, or domain
`task`	Pending or in-progress work items

Heuristic classifier

extract/classify.go scores segments against regex marker patterns loaded from config/classify_patterns.json:

rawScore    = sum of regex marker hits across all memory types
lengthBonus = +2 if segment > 500 chars, +1 if segment > 200 chars, else 0
confidence  = min(1.0, (bestTypeScore + lengthBonus) / 5.0)

The classifier also runs sentiment analysis (positive/negative word lists from config) and disambiguation logic -- e.g., a "problem" with resolution cues and positive sentiment may be reclassified as "milestone".
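
The confidence formula translates directly into code. The sketch below assumes `bestTypeScore` is the highest per-type marker score and that segment length is measured in characters; sentiment analysis and disambiguation are left out.

```go
package main

import (
	"fmt"
	"math"
)

// lengthBonus rewards longer segments, as in the formula above.
func lengthBonus(segmentLen int) float64 {
	switch {
	case segmentLen > 500:
		return 2
	case segmentLen > 200:
		return 1
	default:
		return 0
	}
}

// confidence maps a marker score and segment length to a value in [0, 1].
func confidence(bestTypeScore float64, segmentLen int) float64 {
	return math.Min(1.0, (bestTypeScore+lengthBonus(segmentLen))/5.0)
}

func main() {
	fmt.Println(confidence(1, 100)) // short segment, one marker hit
	fmt.Println(confidence(3, 250)) // medium segment, three hits
	fmt.Println(confidence(6, 600)) // capped at 1.0
}
```

With these numbers, a segment needs a combined score of 4 to reach the 0.8 heuristic gate, for example three marker hits in a segment longer than 200 characters.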

Pipeline tier column

memory_drawers.pipeline_tier records which classifier made the final type decision:

Value	Set by	Meaning
`heuristic`	Auto-ingest pool	Regex confidence >= 0.8; cold LLM skipped
`centroid`	Auto-ingest pool	Embedding nearest-centroid above cosine 0.85; cold LLM skipped
`llm`	Cold pipeline (`memory_process`)	Fell through both classifiers; LLM made the call

5. Retrieval: 4-Layer Stack

When an agent needs context, the memory system provides it through a layered stack (layers/stack.go). Each layer adds progressively more detail within a total character budget.

L0 -- Identity

The workspace's personality and context. Set once via memory_set_identity, injected at the start of every conversation via WakeUp().

Budget: 400 characters (never truncated)
Source: memory_identities table (one row per workspace)
Renderer: layers/l0_identity.go → renderL0

L1 -- Essential Story

The most important memories across the workspace, from the top 15 drawers ranked by importance.

Budget: 2,000 characters (truncated with "... (more in L3 search)" if exceeded)
Source: DrawerRepo.GetTopByImportance() -- sorted by importance descending, optionally filtered by wing
Grouped by: Room, sorted alphabetically for deterministic output
Snippet limit: 200 characters per drawer
Renderer: layers/l1_essential.go → renderL1
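
A rough sketch of how such a renderer might group, truncate, and budget its output is shown below. The output layout (`[room]` headers and `-` bullets), the `Drawer` struct, and `renderL1Sketch` are invented for the example and are not taken from `renderL1`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const (
	l1Budget      = 2000 // budget for the whole L1 block
	maxSnippetLen = 200  // per-drawer snippet cap
	moreMarker    = "... (more in L3 search)"
)

// Drawer is a trimmed-down stand-in for the real domain type.
type Drawer struct{ Room, Content string }

// snippet truncates s to at most n runes.
func snippet(s string, n int) string {
	r := []rune(s)
	if len(r) <= n {
		return s
	}
	return string(r[:n])
}

// renderL1Sketch groups drawers by room in alphabetical order and stops
// once the budget is reached. Length is counted in bytes, which equals
// characters for ASCII text.
func renderL1Sketch(drawers []Drawer) string {
	byRoom := map[string][]string{}
	for _, d := range drawers {
		byRoom[d.Room] = append(byRoom[d.Room], snippet(d.Content, maxSnippetLen))
	}
	rooms := make([]string, 0, len(byRoom))
	for room := range byRoom {
		rooms = append(rooms, room)
	}
	sort.Strings(rooms) // deterministic output

	var b strings.Builder
	for _, room := range rooms {
		for i, s := range byRoom[room] {
			line := "  - " + s + "\n"
			if i == 0 {
				line = "[" + room + "]\n" + line
			}
			if b.Len()+len(line) > l1Budget-len(moreMarker) {
				b.WriteString(moreMarker)
				return b.String()
			}
			b.WriteString(line)
		}
	}
	return b.String()
}

func main() {
	fmt.Print(renderL1Sketch([]Drawer{
		{Room: "db", Content: "uses postgres"},
		{Room: "auth", Content: "jwt tokens"},
	}))
}
```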

L2 -- On-Demand Recall

Retrieved when an agent explicitly asks for memories from a specific wing and room. Not injected automatically.

Budget: 1,200 characters
Source: DrawerRepo.GetByWingRoom() -- filtered retrieval
Default limit: 10 drawers, 300 chars per snippet
Renderer: layers/l2_ondemand.go → renderL2

L3 -- Hybrid Search

Semantic search that combines pgvector ANN with knowledge graph entity lookup in a single Postgres CTE query (drawerrepo.SearchHybrid). Falls back to pure vector search (renderL3) if hybrid retrieval fails.

Budget: 14,000 characters (hard cap on total output)
Default limit: 5 results (max 50)
Ranking formula (layers/retrieval.go → rankHybridResults):
```
finalScore = importance × recencyFactor × max(similarity, graphScore) + agentAffinityBoost(0.1)
```
Where recencyFactor = 1.0 / (1.0 + daysSinceAccess / 30.0)
KG branch: query words >= 4 chars are forwarded as KG entity lookup terms
Access tracking: all returned drawers get TouchAccessBatch() -- updates last_accessed_at, increments access_count, keeping hot memories alive against decay
Renderer: layers/l3_search.go → renderL3 (pure vector) or stack.Search (hybrid)
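
The ranking formula can be written out as a small function. This sketch assumes the 0.1 affinity boost is added only when the drawer was filed by the requesting agent; `finalScore` and its parameters are illustrative names, not the signature of `rankHybridResults`.

```go
package main

import (
	"fmt"
	"math"
)

const agentAffinityBoost = 0.1

// recencyFactor decays from 1.0 toward 0 as the last access gets older.
func recencyFactor(daysSinceAccess float64) float64 {
	return 1.0 / (1.0 + daysSinceAccess/30.0)
}

// finalScore combines importance, recency, and the stronger of the two
// retrieval signals (vector similarity or graph score).
func finalScore(importance, similarity, graphScore, daysSinceAccess float64, sameAgent bool) float64 {
	score := importance * recencyFactor(daysSinceAccess) * math.Max(similarity, graphScore)
	if sameAgent {
		score += agentAffinityBoost
	}
	return score
}

func main() {
	// Same drawer, accessed today versus 30 days ago.
	fmt.Println(finalScore(4, 0.5, 0.25, 0, false))
	fmt.Println(finalScore(4, 0.5, 0.25, 30, false))
	// A graph hit can outrank a weak vector match.
	fmt.Println(finalScore(4, 0.2, 0.5, 30, true))
}
```

Because the recency factor is 0.5 at 30 days, a drawer untouched for a month needs twice the importance or match strength to keep its rank, which is why `TouchAccessBatch` matters for frequently used memories.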

Token budgets (in characters, ~4 chars per token)

Layer	Budget	Behavior
L0 -- Identity	400	Never truncated
L1 -- Essential Story	2,000	Truncated first, shows "more in L3"
L2 -- On-Demand	1,200	Returned as-is
L3 -- Search	14,000 (hard cap)	Result count limited

6. Knowledge Graph

Entities are identified by the SHA256 hash of their normalized name. Triples are temporal: valid_from/valid_to bound the period during which a fact holds.

Entities and triples

An entity is a named thing (person, service, concept, project). A triple is a temporal relationship:

[Subject] --predicate--> [Object]
   with valid_from / valid_to timestamps

For example:

[Crawbl Backend] --uses--> [PostgreSQL]       valid_from: 2025-01-15, valid_to: NULL (current)
[Crawbl Backend] --uses--> [MongoDB]          valid_from: 2024-06-01, valid_to: 2025-01-14 (expired)
[Alice]          --owns--> [Auth Module]      valid_from: 2025-03-01, valid_to: NULL (current)

The valid_from/valid_to fields let agents answer questions like "what database did we use before PostgreSQL?" or "who owned the auth module in Q4?". Facts expire naturally when valid_to is set -- they are not deleted.
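
A point-in-time lookup over such triples might look like the sketch below. It uses readable names instead of hashed entity IDs and assumes both ends of the validity range are inclusive; `Triple`, `validAt`, and `objectsAt` are hypothetical and not part of kgrepo.

```go
package main

import (
	"fmt"
	"time"
)

// Triple is a simplified stand-in for a memory_triples row.
type Triple struct {
	Subject, Predicate, Object string
	ValidFrom                  time.Time
	ValidTo                    *time.Time // nil means the fact is still current
}

// day builds a UTC date; a small helper for the examples.
func day(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

// validAt reports whether t held at the given instant.
// Assumption: both ends of the range are inclusive.
func validAt(t Triple, at time.Time) bool {
	if at.Before(t.ValidFrom) {
		return false
	}
	return t.ValidTo == nil || !at.After(*t.ValidTo)
}

// objectsAt returns the objects linked to subject by predicate at a point in time.
func objectsAt(triples []Triple, subject, predicate string, at time.Time) []string {
	var out []string
	for _, t := range triples {
		if t.Subject == subject && t.Predicate == predicate && validAt(t, at) {
			out = append(out, t.Object)
		}
	}
	return out
}

// sampleTriples mirrors the example facts listed above.
func sampleTriples() []Triple {
	mongoEnd := day(2025, time.January, 14)
	return []Triple{
		{"Crawbl Backend", "uses", "PostgreSQL", day(2025, time.January, 15), nil},
		{"Crawbl Backend", "uses", "MongoDB", day(2024, time.June, 1), &mongoEnd},
		{"Alice", "owns", "Auth Module", day(2025, time.March, 1), nil},
	}
}

func main() {
	ts := sampleTriples()
	fmt.Println(objectsAt(ts, "Crawbl Backend", "uses", day(2024, time.December, 1)))
	fmt.Println(objectsAt(ts, "Crawbl Backend", "uses", day(2025, time.February, 1)))
}
```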

Palace graph navigation

The PalaceGraph layer (palacegraphrepo) adds spatial reasoning on top of drawers, with Redis-cached room aggregation via internal/pkg/redisclient:

Traverse -- BFS from a starting room, hopping through shared wings to find connected rooms
FindTunnels -- discover rooms that appear in multiple wings (cross-cutting concerns)
GraphStats -- room count, tunnel count, edges, rooms per wing
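
The traversal idea can be sketched in memory. The example below treats two rooms as neighbours when some wing contains both and runs a hop-limited BFS; the `palace` type and both functions are illustrative assumptions, not the palacegraphrepo implementation, which works against Postgres and Redis.

```go
package main

import (
	"fmt"
	"sort"
)

// palace maps each wing to the rooms filed under it.
type palace map[string][]string

// traverse returns the hop distance from start to every room reachable
// within maxHops.
func traverse(p palace, start string, maxHops int) map[string]int {
	roomWings := map[string][]string{}
	for wing, rooms := range p {
		for _, r := range rooms {
			roomWings[r] = append(roomWings[r], wing)
		}
	}
	dist := map[string]int{start: 0}
	queue := []string{start}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if dist[cur] == maxHops {
			continue // do not expand beyond the hop limit
		}
		for _, wing := range roomWings[cur] {
			for _, next := range p[wing] {
				if _, seen := dist[next]; !seen {
					dist[next] = dist[cur] + 1
					queue = append(queue, next)
				}
			}
		}
	}
	return dist
}

// findTunnels returns, in sorted order, the rooms that appear in more than one wing.
func findTunnels(p palace) []string {
	count := map[string]int{}
	for _, rooms := range p {
		seen := map[string]bool{}
		for _, r := range rooms {
			if !seen[r] {
				seen[r] = true
				count[r]++
			}
		}
	}
	var out []string
	for r, n := range count {
		if n > 1 {
			out = append(out, r)
		}
	}
	sort.Strings(out)
	return out
}

// samplePalace is a made-up taxonomy for the demo.
func samplePalace() palace {
	return palace{
		"backend":  {"auth", "database"},
		"security": {"auth", "secrets"},
		"mobile":   {"ui"},
	}
}

func main() {
	p := samplePalace()
	fmt.Println(traverse(p, "database", 2))
	fmt.Println(findTunnels(p))
}
```

In the sample, `auth` is the tunnel that connects `database` to `secrets`, while `ui` stays unreachable because the `mobile` wing shares no room with the others.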

Workspace limits

Resource	Limit
Drawers per workspace	10,000
Entities per workspace	5,000
Triples per workspace	50,000
Drawer content length	10,000 characters
Identity (L0) length	2,000 characters

7. MCP Tools

19 MCP tools registered in internal/orchestrator/server/mcp/tools_memory.go.

Read tools

Tool	Purpose
`memory_status`	Total drawer count, number of wings and rooms
`memory_list_wings`	List all wings with drawer counts
`memory_list_rooms`	List rooms, optionally filtered by wing
`memory_get_taxonomy`	Full wing → room hierarchy with counts
`memory_search`	Semantic vector search by natural language query
`memory_check_duplicate`	Find drawers similar to a given text (threshold 0.9)
`memory_traverse`	BFS room traversal from a starting room
`memory_find_tunnels`	Find rooms bridging two wings
`memory_graph_stats`	Palace graph overview (rooms, tunnels, edges)

Write tools

Tool	Purpose
`memory_add_drawer`	Store a new memory with auto-classification and embedding
`memory_delete_drawer`	Remove a drawer by ID
`memory_set_identity`	Set or update the L0 identity text

Knowledge graph tools

Tool	Purpose
`memory_kg_query`	Query entity relationships (incoming, outgoing, or both)
`memory_kg_add`	Add a temporal triple (auto-creates entities if missing)
`memory_kg_invalidate`	Mark a relationship as ended (set `valid_to`)
`memory_kg_timeline`	Chronological view of all facts about an entity
`memory_kg_stats`	Entity and triple counts, relationship type list

Diary tools

Tool	Purpose
`memory_diary_write`	Write an agent-scoped diary entry (hall = agent name)
`memory_diary_read`	Read an agent's recent diary entries

Diary tools are a convenience wrapper around drawers. They auto-set wing = "diary" and hall = agent name, giving each agent a private journal within the shared workspace memory.

8. Backend Wiring

The memory system is wired up in cmd/crawbl/platform/orchestrator/orchestrator.go:

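// Repositories are Postgres-backed; the palace graph repo also takes the Redis client.
drawerRepo := drawerrepo.NewPostgres()
kgRepo := kgrepo.NewPostgres()
palaceGraphRepo := palacegraphrepo.NewPostgres(redisClient, logger)
identityRepo := identityrepo.NewPostgres()
classifier := extract.NewClassifier()

// The embedder and the retrieval stack are only built when baseURL is set.
// Both variables are declared outside this excerpt.
if baseURL != "" {
    embedder    = embed.NewProvider(...)
    memoryStack = layers.NewStack(drawerRepo, identityRepo, embedder)
}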

These are passed to three services:

Service	What it uses	Why
ChatService	`memoryStack` + `ingestPool`	Calls `WakeUp()` to inject L0+L1 context; submits work to `autoingest.Pool` after each turn
AgentService	`drawerRepo`	Lists memories for the agent detail UI
MCPService	All repos + classifier + embedder	Exposes the MCP tools to agents

Graceful shutdown order

1. Socket.IO teardown -- stop accepting new client connections
2. ingestPool.Shutdown(shutdownCtx) -- drain in-flight pond tasks
3. pkgriver.Shutdown(riverClient) -- three-phase River shutdown (20s/10s/force)
4. DB connection close

9. Configuration

Variable	Required	Default	Purpose
`CRAWBL_EMBED_BASE_URL`	Yes	--	Embedding API endpoint
`CRAWBL_EMBED_API_KEY`	Yes	--	Embedding API key
`CRAWBL_EMBED_MODEL`	No	`text-embedding-3-small`	Embedding model
`CRAWBL_LLM_BASE_URL`	No	`CRAWBL_EMBED_BASE_URL`	Chat completions API
`CRAWBL_LLM_API_KEY`	No	`CRAWBL_EMBED_API_KEY`	Chat completions key
`CRAWBL_CLASSIFY_MODEL`	No	`gpt-4o-mini`	Classification model
`CRAWBL_AUTOINGEST_WORKERS`	No	16	Pool worker count
`CRAWBL_AUTOINGEST_CAPACITY`	No	1024	Pool queue depth
`CRAWBL_MEM_HEURISTIC_HIGH`	No	999.0 (disabled)	Phase 1 gate
`CRAWBL_MEM_HEURISTIC_LOW`	No	999.0 (disabled)	Phase 2 gate

Embedded JSON configs (config/): noise_patterns.json (noise words/patterns), classify_patterns.json (heuristic regex markers). Both loaded via go:embed -- changes require recompilation.

10. Key Constants

Constant	Value	Location
`l1MaxDrawers`	15	`layers/l1_essential.go`
`maxSnippetLen`	200	`layers/l1_essential.go`
`l2MaxSnippetLen`	300	`layers/l2_ondemand.go`
`l3MaxSnippetLen`	300	`layers/l3_search.go`
`DefaultImportance`	3.0	`types.go`
`AutoIngestChunkSize`	800	`types.go`
`AutoIngestChunkOverlap`	100	`types.go`
`AutoIngestDupThreshold`	0.85	`types.go`
`AutoIngestMinConfidence`	0.3	`types.go`
`ColdWorkerClusterThreshold`	0.85	`types.go`
`ColdWorkerConflictLow`	0.75	`types.go`
`ColdWorkerConflictHigh`	0.90	`types.go`
`ColdWorkerMaxRetries`	3	`types.go`
`DecayFactor`	0.98	`types.go`
`DecayFloor`	0.3	`types.go`
`DecayAgeDays`	30	`types.go`
`PruneThreshold`	0.5	`types.go`
`PruneMinAccessCount`	3	`types.go`
`PruneKeepMin`	100	`types.go`
`MemoryCentroidThreshold`	0.85	`types.go`
`MemoryCentroidMinSamples`	50	`types.go`
`ReinforcementThreshold`	0.7	`types.go`
`ReinforcementBoost`	0.5	`types.go`
`MaxImportance`	5.0	`types.go`
Embedding dimensions	1536	pgvector column size

11. Known Limitations

High:

IVFFlat index (migration 000008) untested on DigitalOcean CPUs. May SIGILL like HNSW. Sequential scan is the fallback, acceptable at fewer than 10K drawers.
No cost tracking for cold path LLM calls. No per-workspace attribution.

Medium:

Batch classification cannot use JSON mode (array incompatibility). Falls back to N+1 calls on parse failure.
memory_drawers.id is TEXT PK, not composite (workspace_id, id). Cross-workspace isolation relies on code, not schema.
Phase 2 is dormant until the centroid table has >= 50 samples per type (roughly the first week of LLM-labelled traffic on a new deployment). This is expected behavior.

Low:

NATS worker migration incomplete (publisher only, no consumer).
KG entity embedding fallback not implemented (column exists, retrieval deferred).
added_by_agent field name implies slug but stores UUID.
