Table Definitions

Full column specifications for all tables in the orchestrator schema.

users

Column	Type	Constraints	Description
`id`	UUID	PK	User identifier
`subject`	TEXT	NOT NULL, UNIQUE	Auth provider subject (Firebase UID or dev token)
`email`	TEXT	NOT NULL	User email
`nickname`	TEXT	NOT NULL, default `''`	Display nickname
`name`	TEXT	NOT NULL, default `''`	First name
`surname`	TEXT	NOT NULL, default `''`	Last name
`avatar_url`	TEXT	NULL	Profile avatar URL
`country_code`	TEXT	NULL	ISO country code
`date_of_birth`	TIMESTAMPTZ	NULL	Date of birth
`is_banned`	BOOLEAN	NOT NULL, default `false`	Ban flag
`has_agreed_with_terms`	BOOLEAN	NOT NULL, default `false`	Terms acceptance
`has_agreed_with_privacy_policy`	BOOLEAN	NOT NULL, default `false`	Privacy acceptance
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp
`deleted_at`	TIMESTAMPTZ	NULL	Soft delete timestamp

Indexes: idx_users_subject on subject

user_preferences

Column	Type	Constraints	Description
`user_id`	UUID	PK, FK -> users(id) CASCADE	User reference
`platform_theme`	TEXT	NULL	UI theme preference
`platform_language`	TEXT	NULL	Language code
`currency_code`	TEXT	NULL	Preferred currency
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

user_push_tokens

Column	Type	Constraints	Description
`user_id`	UUID	PK, FK -> users(id) CASCADE	User reference
`push_token`	TEXT	NOT NULL	FCM token
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

workspaces

Column	Type	Constraints	Description
`id`	UUID	PK	Workspace identifier
`user_id`	UUID	NOT NULL, FK -> users(id) CASCADE	Owner
`name`	TEXT	NOT NULL	Workspace display name
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

Indexes: idx_workspaces_user_id on user_id

Runtime status comes from UserSwarm.status in Kubernetes, not this table.

agents

Column	Type	Constraints	Description
`id`	UUID	PK	Agent identifier
`workspace_id`	UUID	NOT NULL, FK -> workspaces(id) CASCADE	Parent workspace
`name`	TEXT	NOT NULL	Agent display name
`slug`	TEXT	NOT NULL	URL-safe identifier (e.g., `wally`, `manager`)
`role`	TEXT	NOT NULL	Agent role (e.g., `default`)
`description`	TEXT	NOT NULL, default `''`	Agent description
`avatar_url`	TEXT	NOT NULL, default `''`	Agent avatar
`sort_order`	INTEGER	NOT NULL, default `0`	Display order
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

Indexes: idx_agents_workspace_slug UNIQUE on (workspace_id, slug), idx_agents_workspace_id on workspace_id

conversations

Column	Type	Constraints	Description
`id`	UUID	PK	Conversation identifier
`workspace_id`	UUID	NOT NULL, FK -> workspaces(id) CASCADE	Parent workspace
`agent_id`	UUID	NULL, FK -> agents(id) SET NULL	Associated agent
`type`	TEXT	NOT NULL	Conversation type
`title`	TEXT	NOT NULL	Conversation title
`unread_count`	INTEGER	NOT NULL, default `0`	Unread message count
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

Indexes: idx_conversations_workspace_id on workspace_id, idx_conversations_workspace_type on (workspace_id, type)

messages

Column	Type	Constraints	Description
`id`	UUID	PK	Message identifier
`conversation_id`	UUID	NOT NULL, FK -> conversations(id) CASCADE	Parent conversation
`role`	TEXT	NOT NULL	Message role (user, assistant, system)
`content`	JSONB	NOT NULL	Message content (structured)
`status`	TEXT	NOT NULL	Message status
`local_id`	TEXT	NULL	Client-generated ID for deduplication
`agent_id`	UUID	NULL, FK -> agents(id) SET NULL	Agent that sent/received
`attachments`	JSONB	NOT NULL, default `'[]'`	File attachments
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

Indexes: idx_messages_conversation_created_at on (conversation_id, created_at DESC, id DESC), idx_messages_local_id on local_id

mcp_audit_logs

Audit log for every MCP tool call made by Agent Runtime pods. Writes are async (goroutine with 5s timeout) to avoid response latency. Required for ISO 27001 compliance.

Column	Type	Constraints	Description
`id`	UUID	PK	Audit log identifier
`user_id`	TEXT	NOT NULL	User who triggered the tool call
`workspace_id`	TEXT	NOT NULL	Workspace context
`session_id`	TEXT	NULL	MCP session identifier
`tool_name`	TEXT	NOT NULL	MCP tool name (e.g., `send_push_notification`)
`input`	JSONB	NOT NULL	Tool call input parameters
`output`	JSONB	NULL	Tool call output
`api_calls`	JSONB	NULL	External API calls made during execution
`success`	BOOLEAN	NOT NULL	Whether the tool call succeeded
`error_message`	TEXT	NULL	Error message on failure
`duration_ms`	INTEGER	NOT NULL	Execution time in milliseconds
`remote_addr`	TEXT	NULL	Calling pod's IP address
`created_at`	TIMESTAMPTZ	NOT NULL	Timestamp of the tool call

tools

Global tool catalog. Seeded from migrations/orchestrator/seed/tools.json on startup.

Column	Type	Constraints	Description
`name`	TEXT	PK	Tool identifier
`display_name`	TEXT	NOT NULL	Human-readable name
`description`	TEXT	NOT NULL	Tool description
`category`	TEXT	NOT NULL	Category grouping
`icon_url`	TEXT	NOT NULL	Tool icon URL
`sort_order`	INTEGER	NOT NULL	Display order
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp

agent_settings

Per-agent model and response configuration. One row per agent.

Column	Type	Constraints	Description
`agent_id`	UUID	PK, FK -> agents(id) CASCADE	Agent reference
`model`	TEXT	NOT NULL, default `'auto'`	LLM model selection
`response_length`	TEXT	NOT NULL, default `'auto'`	Response length preference
`allowed_tools`	TEXT[]	NOT NULL, default `'{}'`	Tools this agent can use
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

agent_prompts

System prompt templates per agent.

Column	Type	Constraints	Description
`id`	UUID	PK	Prompt identifier
`agent_id`	UUID	NOT NULL, FK -> agents(id) CASCADE	Agent reference
`name`	TEXT	NOT NULL	Prompt display name
`description`	TEXT	NOT NULL	Prompt description
`content`	TEXT	NOT NULL	System prompt text
`sort_order`	INTEGER	NOT NULL	Display order
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

Indexes: idx_agent_prompts_agent_id on agent_id

agent_history

Manager-created conversation history summaries for an agent.

Column	Type	Constraints	Description
`id`	UUID	PK	History entry identifier
`agent_id`	UUID	NOT NULL, FK -> agents(id) CASCADE	Agent reference
`conversation_id`	UUID	NULL, FK -> conversations(id) SET NULL	Associated conversation
`title`	TEXT	NOT NULL	History entry title
`subtitle`	TEXT	NOT NULL	History entry subtitle
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp

Indexes: idx_agent_history_agent_id on (agent_id, created_at DESC)

memory_drawers

Core vector store for the MemPalace memory system. Each drawer is a chunk of verbatim content with a pgvector embedding for semantic search. See the MemPalace Memory Pipeline for the full state machine and processing flow.

Column	Type	Constraints	Description
`id`	TEXT	PK	MD5-based deterministic identifier
`workspace_id`	UUID	NOT NULL	Workspace scope
`wing`	TEXT	NOT NULL	Top-level topic category
`room`	TEXT	NOT NULL	Subtopic within the wing
`hall`	TEXT	NOT NULL, default `''`	Optional granular grouping
`content`	TEXT	NOT NULL	Verbatim memory text
`embedding`	vector(1536)	NULL	pgvector embedding for semantic search
`importance`	REAL	NOT NULL, default `3.0`	Priority score (0.0–5.0)
`memory_type`	TEXT	NOT NULL, default `''`	Classified type: `decision`, `preference`, `milestone`, `problem`, `emotional`, `fact`, `task`
`source_file`	TEXT	NOT NULL, default `''`	Origin file reference
`added_by`	TEXT	NOT NULL, default `'system'`	Who stored it (e.g., `auto-ingest`, `mobile`, agent name)
`added_by_agent`	TEXT	NOT NULL, default `''`	Agent UUID for affinity ranking
`state`	TEXT	NOT NULL, default `'raw'`	Pipeline state: `raw`, `processed`, `merged`, `failed`
`summary`	TEXT	NOT NULL, default `''`	LLM-generated one-line summary
`last_accessed_at`	TIMESTAMPTZ	NULL	Updated on retrieval (TouchAccess)
`access_count`	INT	NOT NULL, default `0`	Incremented on retrieval, drives decay reset
`superseded_by`	TEXT	NULL	Points to newer contradicting drawer
`cluster_id`	TEXT	NULL	Canonical drawer ID for merged clusters
`retry_count`	INT	NOT NULL, default `0`	Cold worker failure counter (max 3)
`pipeline_tier`	TEXT	NOT NULL, default `'llm'`	Which arm of the cold pipeline labelled this drawer: `heuristic`, `centroid`, or `llm`. Added by migration 000010.
`entity_count`	INT	NOT NULL, default `0`	Number of KG entities the enrichment worker wired up. Added by migration 000010.
`triple_count`	INT	NOT NULL, default `0`	Number of KG triples the enrichment worker wired up. Added by migration 000010.
`filed_at`	TIMESTAMPTZ	NOT NULL	When the memory was filed
`created_at`	TIMESTAMPTZ	NOT NULL	Row creation timestamp

Indexes:

idx_drawers_workspace on (workspace_id)
idx_drawers_workspace_wing on (workspace_id, wing)
idx_drawers_workspace_room on (workspace_id, wing, room)
idx_drawers_state on (state) WHERE state = 'raw' (cold worker polling)
idx_drawers_superseded on (superseded_by) WHERE superseded_by IS NOT NULL
idx_drawers_enrich on (workspace_id, created_at) WHERE state = 'processed' AND pipeline_tier <> 'llm' AND entity_count = 0 AND importance >= 3.0 — drives the memory_enrich River sweep. Predicate must match DrawerRepo.ListEnrichCandidates exactly.

примечание

No vector index exists on memory_drawers.embedding today. Migration 000008 is an intentional no-op: both HNSW and IVFFlat were tried first and both caused SIGILL crashes on Digital Ocean CPUs without AVX2. Sequential scan is the current backend, which is acceptable at the 10K-drawer-per-workspace ceiling enforced at write time by DrawerRepo.Add. See the commentary in migrations/orchestrator/000008_ivfflat_index.up.sql for the three workarounds to try once we move to AVX2-capable nodes.

memory_entities

Knowledge graph entity nodes. Each entity is a named thing (person, service, concept) scoped to a workspace.

Column	Type	Constraints	Description
`id`	TEXT	PK (composite)	SHA256-based stable entity ID from normalized name
`workspace_id`	UUID	PK (composite)	Workspace scope
`name`	TEXT	NOT NULL	Display name
`type`	TEXT	NOT NULL, default `'unknown'`	Entity classification (`person`, `tool`, `concept`, `project`, `organization`)
`properties`	JSONB	NOT NULL, default empty object	Additional metadata
`embedding`	vector(1536)	NULL	Embedding for semantic KG lookup fallback
`created_at`	TIMESTAMPTZ	NOT NULL	Row creation timestamp

Primary key: (workspace_id, id)

memory_triples

Knowledge graph temporal relationship edges. Each triple links two entities with a time-bounded predicate.

Column	Type	Constraints	Description
`id`	TEXT	PK (composite)	Triple identifier
`workspace_id`	UUID	PK (composite)	Workspace scope
`subject`	TEXT	NOT NULL	Subject entity ID
`predicate`	TEXT	NOT NULL	Relationship type
`object`	TEXT	NOT NULL	Object entity ID
`valid_from`	TEXT	NULL	When fact became true (YYYY-MM-DD)
`valid_to`	TEXT	NULL	When fact ended (NULL = current)
`confidence`	REAL	NOT NULL, default `1.0`	Confidence score
`source_closet`	TEXT	NOT NULL, default `''`	Source drawer ID
`source_file`	TEXT	NOT NULL, default `''`	Origin file
`extracted_at`	TIMESTAMPTZ	NOT NULL	Extraction timestamp

Primary key: (workspace_id, id)

Indexes:

idx_triples_subject on (workspace_id, subject)
idx_triples_object on (workspace_id, object)
idx_triples_predicate on (workspace_id, predicate)
idx_triples_validity on (workspace_id, valid_from, valid_to)

memory_identities

Per-workspace L0 identity text. One row per workspace, used to inject workspace context at the start of every conversation.

Column	Type	Constraints	Description
`workspace_id`	UUID	PK	Workspace scope
`content`	TEXT	NOT NULL, default `''`	L0 identity text (max 2,000 chars)
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

memory_type_centroids

Seven prototype vectors (one per memory type) used by the Phase 2 nearest-centroid classifier in the autoingest worker. A chunk in the [HeuristicConfidenceLow, HeuristicConfidenceHigh) confidence band is classified by picking the closest centroid; if cosine similarity exceeds MemoryCentroidThreshold (0.85) and the centroid has at least MemoryCentroidMinSamples (50) training samples, the drawer is persisted with pipeline_tier = 'centroid' and state = 'processed', bypassing the LLM cold path entirely. Added by migration 000011.

Centroids are rebuilt by the weekly memory_centroid_recompute River worker (Sunday 03:00 UTC) from the last 90 days of pipeline_tier = 'llm' drawers — training never pulls its own predictions, which prevents a feedback loop. See the MemPalace Memory Pipeline for the full rationale.

Column	Type	Constraints	Description
`memory_type`	TEXT	PK	One of the seven memory types: `decision`, `preference`, `milestone`, `problem`, `emotional`, `fact`, `task`
`centroid`	vector(1536)	NOT NULL	Element-wise average of the training-sample embeddings
`sample_count`	INT	NOT NULL	Size of the training cohort. Rows with `sample_count < 50` are ignored by `NearestType` lookups
`computed_at`	TIMESTAMPTZ	NOT NULL, default `NOW()`	When this centroid was last recomputed
`source_hash`	TEXT	NOT NULL	MD5 over the sorted training sample IDs. The upsert is a no-op when this hash has not changed since the previous run, so unchanged centroids do not churn dead tuples

примечание

The table is empty on first deploy. Phase 2 is dormant until at least one memory type has accumulated 50 LLM-labelled samples — roughly the first week of real traffic on a new workspace. A best-effort 30-second startup seed (centroidSeedTimeout in cmd/crawbl/platform/orchestrator/orchestrator.go) warms the table once after riverClient.Start so operators do not have to wait until Sunday 03:00 UTC for the first recompute.

integration_connections

OAuth connections between users and external providers.

Column	Type	Constraints	Description
`id`	UUID	PK	Connection identifier
`user_id`	UUID	NOT NULL, FK -> users(id) CASCADE	User reference
`workspace_id`	UUID	NOT NULL, FK -> workspaces(id) CASCADE	Workspace scope
`provider`	TEXT	NOT NULL	Provider name (e.g., `gmail`, `slack`)
`status`	TEXT	NOT NULL, default `'active'`	Connection status
`access_token_encrypted`	TEXT	NULL	Encrypted OAuth access token
`refresh_token_encrypted`	TEXT	NULL	Encrypted OAuth refresh token
`token_expires_at`	TIMESTAMPTZ	NULL	Token expiry
`scopes`	TEXT[]	NOT NULL	Granted OAuth scopes
`provider_user_id`	TEXT	NULL	User ID at provider
`provider_user_email`	TEXT	NULL	Email at provider
`metadata`	JSONB	NOT NULL	Additional provider metadata
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

Indexes: idx_integration_connections_unique UNIQUE on (user_id, workspace_id, provider)

model_pricing

Per-token costs for LLM models. Append-only — new rows inserted on price changes, old rows preserved for historical cost computation.

Column	Type	Constraints	Description
`id`	UUID	PK	Row identifier
`provider`	TEXT	NOT NULL	LLM provider (e.g., `openai`)
`model`	TEXT	NOT NULL	Model identifier (e.g., `gpt-4o`)
`region`	TEXT	NOT NULL, default `'global'`	Pricing region
`input_cost_per_token`	NUMERIC(18,12)	NOT NULL	Cost per input token (USD)
`output_cost_per_token`	NUMERIC(18,12)	NOT NULL	Cost per output token (USD)
`cached_cost_per_token`	NUMERIC(18,12)	NOT NULL	Cost per cached token (USD)
`source`	TEXT	NOT NULL, default `'manual'`	Price source
`effective_at`	TIMESTAMPTZ	NOT NULL	When this price takes effect
`created_at`	TIMESTAMPTZ	NOT NULL	Row creation timestamp

Indexes: idx_model_pricing_lookup on (provider, model, region, effective_at DESC)

usage_plans

Plan definitions. Seeded from migrations/orchestrator/seed/usage_plans.json.

Column	Type	Constraints	Description
`plan_id`	TEXT	PK	Plan identifier (`free`, `pro`)
`name`	TEXT	NOT NULL	Display name
`monthly_token_limit`	BIGINT	NOT NULL, default `500000`	Tokens per month
`daily_request_limit`	INT	NULL	Daily request cap (unused)
`max_tokens_per_request`	INT	NULL	Per-request cap (unused)
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

usage_quotas

Per-user plan assignment. Active quota has expires_at IS NULL.

Column	Type	Constraints	Description
`id`	UUID	PK	Row identifier
`user_id`	UUID	NOT NULL, FK -> users(id) CASCADE	User reference
`plan_id`	TEXT	NOT NULL, FK -> usage_plans DEFAULT `'free'`	Plan reference
`effective_at`	TIMESTAMPTZ	NOT NULL	When assignment started
`expires_at`	TIMESTAMPTZ	NULL	NULL = active
`created_at`	TIMESTAMPTZ	NOT NULL	Creation timestamp
`updated_at`	TIMESTAMPTZ	NOT NULL	Last update timestamp

Indexes: idx_usage_quotas_user_active UNIQUE on (user_id, plan_id) WHERE expires_at IS NULL

usage_counters

Monthly running totals per user. One row per (user_id, period).

Column	Type	Constraints	Description
`user_id`	UUID	PK (composite), FK -> users(id) CASCADE	User reference
`period`	TEXT	PK (composite)	Billing period (`YYYY-MM`)
`tokens_used`	BIGINT	NOT NULL, default `0`	Total tokens this month
`prompt_tokens_used`	BIGINT	NOT NULL, default `0`	Input tokens
`completion_tokens_used`	BIGINT	NOT NULL, default `0`	Output tokens
`cost_usd`	NUMERIC(12,6)	NOT NULL, default `0`	Accumulated cost (internal only)
`request_count`	INT	NOT NULL, default `0`	Number of LLM calls
`last_updated_at`	TIMESTAMPTZ	NOT NULL	Last increment timestamp

agent_usage_counters

Lifetime usage totals per agent.

Column	Type	Constraints	Description
`agent_id`	UUID	PK, FK -> agents(id) CASCADE	Agent reference
`workspace_id`	UUID	NOT NULL, FK -> workspaces(id) CASCADE	Workspace reference
`tokens_used`	BIGINT	NOT NULL, default `0`	Total tokens ever
`prompt_tokens_used`	BIGINT	NOT NULL, default `0`	Lifetime input tokens
`completion_tokens_used`	BIGINT	NOT NULL, default `0`	Lifetime output tokens
`cost_usd`	NUMERIC(12,6)	NOT NULL, default `0`	Accumulated cost (internal only)
`request_count`	INT	NOT NULL, default `0`	Number of LLM calls
`last_updated_at`	TIMESTAMPTZ	NOT NULL	Last increment timestamp

Indexes: idx_agent_usage_counters_workspace on workspace_id

users​

user_preferences​

user_push_tokens​

workspaces​

agents​

conversations​

messages​

mcp_audit_logs​

tools​

agent_settings​

agent_prompts​

agent_history​

memory_drawers​

memory_entities​

memory_triples​

memory_identities​

memory_type_centroids​

integration_connections​

model_pricing​

usage_plans​

usage_quotas​

usage_counters​

agent_usage_counters​