Memory System - Z-M-Huang/openhive GitHub Wiki

Memory System

Each team has persistent memory stored in SQLite. Memory entries are typed, versioned via a supersede mechanism, and searchable through a hybrid FTS5 + vector pipeline. A subset of memories (identity, lesson, decision, context) is auto-injected into each session's system prompt; the rest (reference, historical) is accessible on demand via the memory_search tool.

For team configuration and provider profiles, see Team-Configuration. For persistence and crash recovery, see Durability-Recovery. For the architectural decision behind this system, see Architecture-Decisions#ADR-29.


Architecture Overview

The memory system is a single-layer SQLite store. The memories table is the source of truth; search tables (memory_chunks, memory_chunks_fts, embedding_cache) are derived from it. There is no filesystem memory directory — all memory lives in the shared .run/openhive.db database.

```mermaid
graph TB
    sessionHandler["Session Handler"]
    promptBuilder["Prompt Builder<br/>(prompt-builder.ts)"]
    memoryLoader["Memory Loader<br/>(memory-loader.ts)"]
    memoriesTable["memories table<br/>(SQLite)"]
    memorySearch["memory_search Tool"]
    memoryChunks["memory_chunks +<br/>memory_chunks_fts<br/>(SQLite FTS5)"]
    embeddingCache["embedding_cache<br/>(SQLite)"]
    embeddingProvider["Embedding Provider<br/>(providers.yaml profile)"]

    sessionHandler --> promptBuilder
    promptBuilder --> memoryLoader
    memoryLoader -->|"SELECT WHERE type IN<br/>(identity,lesson,decision,context)<br/>AND is_active = 1"| memoriesTable
    memorySearch --> memoryChunks
    memorySearch --> embeddingCache
    memoryChunks -->|"chunks built from"| memoriesTable
    embeddingProvider --> embeddingCache
```

Source of truth (memories table): Stores all memory entries with type, supersede chain, and audit metadata. All reads and writes go through a data access layer that enforces WHERE team_name = ? on every query.

Search layer (memory_chunks + memory_chunks_fts + embedding_cache): Derived from memories.content. The memory_search tool queries the FTS5 index for keyword search and the embedding cache for vector similarity. See Search Pipeline.


Memory Types

Six types, each with aliases for natural writes:

| Type | Injected? | Purpose | Example Aliases |
|------|-----------|---------|-----------------|
| identity | Yes (never dropped) | Discovered identity — capabilities/scope learned during operation (prescribed identity lives in team-rules) | "core", "self" |
| lesson | Yes | Mistakes to avoid, patterns that work | "warning", "insight", "learning" |
| decision | Yes (may overflow budget) | Key decisions with rationale | "commitment", "choice" |
| context | Yes (may overflow budget) | Current state, ongoing work | "active", "background" |
| reference | No (search only) | Where to find things, how-tos | "pointer", "link" |
| historical | No (search only) | Past events, completed work | "archive", "past" |

Type aliases map user-friendly names to canonical types at write time. The alias resolution happens in memory_save — the agent can write type: "warning" and it maps to lesson. This reduces classification burden without losing semantic precision.
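The alias mapping can be sketched as a small lookup table. The alias names and canonical types come from the table above; the map structure and function name (`resolveType`) are illustrative, not the actual memory_save implementation.

```typescript
type MemoryType =
  | "identity" | "lesson" | "decision" | "context" | "reference" | "historical";

// Alias names from the type table above.
const TYPE_ALIASES: Record<string, MemoryType> = {
  core: "identity", self: "identity",
  warning: "lesson", insight: "lesson", learning: "lesson",
  commitment: "decision", choice: "decision",
  active: "context", background: "context",
  pointer: "reference", link: "reference",
  archive: "historical", past: "historical",
};

const CANONICAL: ReadonlySet<string> = new Set([
  "identity", "lesson", "decision", "context", "reference", "historical",
]);

// Resolve a user-supplied type (canonical name or alias) to a canonical type.
function resolveType(input: string): MemoryType {
  const t = input.toLowerCase().trim();
  if (CANONICAL.has(t)) return t as MemoryType;
  const mapped = TYPE_ALIASES[t];
  if (!mapped) throw new Error(`Unknown memory type or alias: ${input}`);
  return mapped;
}
```

Resolution happens once at write time, so the stored row always carries a canonical type and search filters never need to know about aliases.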

Type Boundaries

| Type | Test Question | Example |
|------|---------------|---------|
| identity | "Is this something I discovered about my role?" | "I'm also responsible for monitoring the payment gateway" |
| lesson | "Is this a generalizable insight from experience?" | "Never deploy on Fridays — caused incident #42" |
| decision | "Does this commit to a specific action pattern?" | "We chose JWT with 1h expiry + refresh tokens" |
| context | "Is this situational but currently active?" | "Auth service migration is in progress, payment gateway is next" |
| reference | "Is this a pointer to external information?" | "Grafana dashboard at grafana.internal/d/api-latency" |
| historical | "Is this a past event or completed work?" | "Q1 migration completed successfully on 2026-03-15" |

Identity vs team-rules: Team-rules define prescribed identity (set at spawn time via team-rules/team-context.md). Memory identity entries capture capabilities or scope the agent discovers during operation. If they diverge, team-rules are authoritative for prescribed behavior; memory is authoritative for learned context.

Injection vs Search-Only

| Category | Types | Behavior |
|----------|-------|----------|
| Injected (budget-capped, ~50 entries) | identity (always first, never dropped), lesson (always second), decision (newest first, may overflow), context (newest first, may overflow) | Auto-included in the session system prompt. Type is never changed by budget overflow. |
| Search only (no limit) | reference, historical, plus overflow decision/context entries | Accessible on demand via memory_search |

Entries beyond the injection budget cap are not injected but remain searchable with their original type. The agent explicitly reclassifies entries when appropriate (e.g., decision → historical when superseded). The system never auto-changes types.


Memory vs Vault

Teams have access to both the memory system (this page) and the team vault (key-value store in team_vault table). They serve different purposes and are not interchangeable.

When to Use Memory

Use memory for knowledge that should appear in the agent's reasoning context or be semantically searchable:

  • Auto-injected types (identity, lesson, decision, context) are included in the system prompt each session. Store learned capabilities, operational insights, active decisions, and current state here.
  • Search-only types (reference, historical) are retrievable on demand via memory_search. Store pointers, how-tos, and completed-work records here.

Memory content is visible to the agent, participates in hybrid search (FTS5 + vector), and influences reasoning through prompt injection.

When to Use Vault

Use vault for machine-consumed operational state and secrets that should never appear in prompts:

  • Configuration state -- runtime settings, feature flags, progress counters, cached intermediate data.
  • Secrets -- API keys, tokens, credentials. Vault entries with is_secret = 1 are registered with the credential scrubber and redacted from logs and tool outputs.

Vault data is never injected into the system prompt and is not semantically searchable. It is read programmatically by key, not discovered through natural language queries.

Learning-Originated Entries

When the autonomous learning system stores research findings as memories, it uses a deterministic key format: learn:{topic_slug}:{claim_hash} — where topic_slug is a kebab-case topic identifier and claim_hash is a short hash of the normalized claim for deduplication.

Each learning-originated memory entry includes a structured confidence tag with four fields: confidence (high/medium/low), corroborated (number of independent root domains), domains (comma-separated list of corroborating domains), and verified (ISO date of last verification).

| Tag Field | Values | Meaning |
|-----------|--------|---------|
| confidence | high, medium, low | Overall confidence in the claim |
| corroborated | integer | Number of independent root domains that corroborate the claim |
| domains | comma-separated list | Root domains that provided corroborating evidence |
| verified | ISO date | Date the claim was last verified |

These entries are typically stored as lesson or reference type and follow normal supersede rules -- a re-verified claim with updated confidence supersedes the prior entry.

For the full decision framework on when to use memory, vault, or the filesystem, see Team-Configuration#Data Store Decision Tree.

Vault vs Memory Anti-Patterns

| Anti-Pattern | Problem | Correct Approach |
|--------------|---------|------------------|
| Storing operational counters or caches in memory | Pollutes semantic search; injected into prompts unnecessarily; creates version bloat (e.g., 251 versions of a reported_posts key) | Use vault (is_secret=0) for counters, progress state, and operational caches |
| Using vault for knowledge that should be searchable | Knowledge invisible to hybrid search; agent cannot discover it via memory_search | Use memory (lesson/reference) for searchable knowledge |
| Storing large JSON blobs in memory content | Degrades FTS5 ranking; wastes prompt token budget | Store structured data in vault; store a summary in memory with a vault key reference |

Self-reflection cycles (see Self-Evolution#Self-Reflection) should detect operational-cache-in-memory anti-patterns and propose migration to vault.


Supersede Mechanism

Memories cannot be silently overwritten. When an active entry with the same key exists, the agent must provide a supersede_reason explaining why the old entry was wrong or outdated. Two modes:

Hard Supersede

For substantive corrections — the old understanding was wrong or circumstances changed.

  • Agent must provide supersede_reason
  • Old entry marked is_active = 0
  • New entry links via supersedes_id
  • Agent should verify contradictions with user before superseding

Soft Supersede

For trivial corrections — typos, formatting, minor clarifications.

  • Flagged with a standardized reason like "minor correction"
  • Same schema mechanics, lighter ceremony
  • No user verification required

Flow

When memory_save is called: if no active entry exists with that key, a new entry is inserted. If an active entry exists and supersede_reason is provided, the old entry is marked inactive and a new entry is inserted linked via supersedes_id. If an active entry exists but no reason is provided, the call is rejected with an error listing the current entry.
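The three branches above can be sketched with an in-memory store. The real implementation writes to SQLite through the data access layer; the class and field names here are illustrative.

```typescript
interface MemoryEntry {
  id: number;
  key: string;
  content: string;
  isActive: boolean;
  supersedesId?: number;
  supersedeReason?: string;
}

// Minimal sketch of the memory_save flow: insert, supersede, or reject.
class MemoryStore {
  private entries: MemoryEntry[] = [];
  private nextId = 1;

  save(key: string, content: string, supersedeReason?: string): MemoryEntry {
    const current = this.entries.find((e) => e.key === key && e.isActive);
    if (current && !supersedeReason) {
      // Reject: an active entry exists and no supersede_reason was given.
      throw new Error(`Active entry exists for "${key}": "${current.content}"`);
    }
    if (current) {
      current.isActive = false; // mark the old entry superseded
    }
    const entry: MemoryEntry = {
      id: this.nextId++,
      key,
      content,
      isActive: true,
      supersedesId: current?.id,
      supersedeReason: current ? supersedeReason : undefined,
    };
    this.entries.push(entry);
    return entry;
  }
}
```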

Superseded entries remain searchable for audit trail. The supersede_reason often embeds a lesson itself — a supersede that says "we thought X but actually Y because Z" is valuable searchable context.


Schema

memories Table

The source of truth for all memory entries.

| Column | Purpose |
|--------|---------|
| id | Auto-incrementing primary key |
| team_name | Owning team (isolation boundary) |
| key | Short identifier for the entry |
| content | Memory content text |
| type | One of: identity, lesson, decision, context, reference, historical (default: context) |
| is_active | 1 = active, 0 = superseded or deleted |
| supersedes_id | Foreign key to the entry this one replaced |
| supersede_reason | Why the old entry was wrong or outdated |
| updated_by | Agent/session ID for audit |
| created_at, updated_at | ISO timestamps |

Key indexes: a partial unique index enforces one active entry per key per team. A history index supports supersede chain queries. An injection index supports the budget-capped prompt injection query.

Search Index Tables

These tables are derived from memories.content. When a memory is inserted or updated, the application splits content into ~400-token chunks with 80-token overlap at paragraph boundaries.
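The chunking step can be sketched as follows. Token counting via whitespace-split words is an approximation of the real tokenizer, the paragraph-boundary preference is omitted for brevity, and the function name `chunkContent` is illustrative.

```typescript
// Split content into ~maxTokens-token chunks, carrying `overlap` tokens
// from the end of each chunk into the start of the next.
function chunkContent(
  content: string,
  maxTokens = 400,
  overlap = 80,
): string[] {
  const words = content.split(/\s+/).filter(Boolean);
  if (words.length <= maxTokens) return [content.trim()];
  const chunks: string[] = [];
  let start = 0;
  while (start < words.length) {
    const end = Math.min(start + maxTokens, words.length);
    chunks.push(words.slice(start, end).join(" "));
    if (end === words.length) break;
    start = end - overlap; // overlap preserves context across chunk boundaries
  }
  return chunks;
}
```

Each chunk is then hashed (SHA-256) so embeddings can be cached by content hash and reused when unchanged text is re-indexed.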

| Table | Purpose |
|-------|---------|
| memory_chunks | Stores content chunks with memory_id reference, team_name, chunk content, chunk index, and content hash (SHA-256) |
| memory_chunks_fts | FTS5 virtual table over memory_chunks for full-text keyword search |
| embedding_cache | Stores pre-computed embeddings keyed by content hash, with model name and timestamp |

Indexed by: memory_id (for cascading deletes), team_name (for isolation), and content_hash (for embedding lookups).

Cleanup on Team Shutdown

When shutdown_team is called, all rows in memories, memory_chunks, and related FTS entries for that team are deleted alongside other team-scoped data (see Durability-Recovery#What Is Removed on shutdown_team).


Injection

Prompt Assembly

buildMemorySection() (memory-loader.ts) queries the memories table for active injected types, ordered by priority: identity first, then lesson, then decision (newest first), then context (newest first), limited to 50 entries.
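The priority ordering and cap can be sketched as a sort-and-slice. The entry shape and function name here are illustrative, not the actual memory-loader.ts API.

```typescript
interface InjectableEntry {
  type: "identity" | "lesson" | "decision" | "context";
  key: string;
  content: string;
  updatedAt: string; // ISO timestamp
}

// Injection priority from the text: identity, lesson, decision, context.
const TYPE_PRIORITY = { identity: 0, lesson: 1, decision: 2, context: 3 };

function selectForInjection(
  entries: InjectableEntry[],
  cap = 50,
): InjectableEntry[] {
  return [...entries]
    .sort((a, b) => {
      const byType = TYPE_PRIORITY[a.type] - TYPE_PRIORITY[b.type];
      if (byType !== 0) return byType;
      // decision and context entries are ordered newest first.
      if (a.type === "decision" || a.type === "context") {
        return b.updatedAt.localeCompare(a.updatedAt);
      }
      return 0;
    })
    .slice(0, cap); // entries beyond the cap stay searchable, never re-typed
}
```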

Injection Format

Injected memories appear in the system prompt's dynamic suffix as structured blocks grouped by type — each entry shown with its type label and key, e.g., [IDENTITY] [team-scope]: I am the operations team.... The ordering matches the injection priority: identity, lesson, decision, context.

Budget

  • Entry count cap: ~50 active entries across injected types
  • If budget exceeded: the injection query skips entries beyond the cap. Type is never changed — a decision remains a decision even when not injected. It remains searchable via memory_search.
  • Injection priority: identity first, then lesson, then decision (newest first), then context (newest first)
  • Warning logged when active injected entries reach 80% of cap

Search Pipeline

The memory_search tool retrieves relevant memory fragments using two retrieval paths that run in parallel:

  1. FTS5 keyword search — full-text search with BM25 scoring via memory_chunks_fts. No embeddings required. Always available.
  2. Vector similarity search — embeds the query via a configured provider and finds memory chunks by cosine similarity against pre-indexed embeddings in embedding_cache.

Results from both paths are merged, then refined through temporal decay and diversity re-ranking.

Search Flow

```mermaid
graph TB
    memorySearchTool["memory_search(query)"] --> keywordPath["FTS5 keyword search<br/>(always runs)"]
    memorySearchTool --> checkProvider{"Embedding provider<br/>available?"}
    checkProvider -->|Yes| embeddingPath["Embed query → cosine similarity"]
    checkProvider -->|No| keywordOnly["Keyword results only"]
    keywordPath --> hybridMerge["Hybrid merge<br/>(0.7 vector + 0.3 keyword)"]
    embeddingPath --> hybridMerge
    keywordOnly --> temporalDecay
    hybridMerge --> temporalDecay["Temporal decay"]
    temporalDecay --> mmrRerank["MMR re-ranking"]
    mmrRerank --> results["Top-k results"]
```

Hybrid Merge

When both paths produce results, scores are combined:

  • Vector score: Cosine similarity (0-1) between the query embedding and each chunk embedding.
  • Keyword score: BM25 rank normalized to 0-1.
  • Combined: 0.7 * vector_score + 0.3 * keyword_score (weights configurable).
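The combined score from the bullets above, written out with the default weights (the function name is illustrative):

```typescript
// Weighted hybrid score: 0.7 * vector + 0.3 * keyword by default.
function hybridScore(
  vectorScore: number,   // cosine similarity, 0-1
  keywordScore: number,  // BM25 rank normalized to 0-1
  vectorWeight = 0.7,
  keywordWeight = 0.3,
): number {
  return vectorWeight * vectorScore + keywordWeight * keywordScore;
}
```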

Temporal Decay

An exponential decay function reduces scores for older content:

  • Half-life: 30 days (default). A memory chunk last modified 30 days ago has its score multiplied by 0.5.
  • Exempt types: identity and lesson entries are exempt from decay — they are considered evergreen.
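The decay described above is a standard exponential half-life, which a small helper can express directly (names are illustrative):

```typescript
// Exponential decay with a 30-day half-life; a chunk 30 days old keeps
// half its score. identity and lesson entries are evergreen and skip decay.
function decayedScore(
  score: number,
  ageDays: number,
  type: string,
  halfLifeDays = 30,
): number {
  if (type === "identity" || type === "lesson") return score;
  return score * Math.pow(0.5, ageDays / halfLifeDays);
}
```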

MMR Re-ranking

Maximal Marginal Relevance prevents redundant results:

  • Lambda: 0.7 (balances relevance vs. diversity).
  • Similarity metric: Jaccard similarity between chunk token sets.
  • After scoring, results are iteratively selected to maximize diversity among the top-k.
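A minimal MMR sketch under the parameters above (lambda 0.7, Jaccard similarity over token sets). The chunk shape and function names are illustrative; the real pipeline operates on scored memory chunks.

```typescript
interface ScoredChunk { text: string; score: number }

// Jaccard similarity between two token sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  let inter = 0;
  for (const t of a) if (b.has(t)) inter++;
  const union = a.size + b.size - inter;
  return union === 0 ? 0 : inter / union;
}

// Iteratively pick the chunk maximizing relevance minus similarity to
// already-selected chunks (Maximal Marginal Relevance).
function mmrSelect(
  candidates: ScoredChunk[],
  k: number,
  lambda = 0.7,
): ScoredChunk[] {
  const tokens = candidates.map(
    (c) => new Set(c.text.toLowerCase().split(/\s+/).filter(Boolean)),
  );
  const remaining = candidates.map((_, i) => i);
  const selected: number[] = [];
  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestVal = -Infinity;
    for (let r = 0; r < remaining.length; r++) {
      const i = remaining[r];
      const maxSim = selected.length
        ? Math.max(...selected.map((j) => jaccard(tokens[i], tokens[j])))
        : 0;
      const val = lambda * candidates[i].score - (1 - lambda) * maxSim;
      if (val > bestVal) { bestVal = val; bestPos = r; }
    }
    selected.push(remaining.splice(bestPos, 1)[0]);
  }
  return selected.map((i) => candidates[i]);
}
```

The effect is that a near-duplicate of an already-selected chunk loses to a less relevant but novel chunk, so the top-k covers more distinct memories.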

Search Scope

Search queries memory_chunks_fts and embedding_cache, then joins back to memories for metadata (type, team_name, is_active). Both active and superseded entries are searchable — superseded results are marked as such in the response. Deleted entries (via memory_delete) are excluded from search results.

Deferred Improvements (Post-MVP)

These enhancements were validated as valuable during the design review but deferred to reduce v1 complexity:

  • access_count / last_accessed columns for usage-based ranking signal
  • Cross-encoder reranking on top-20 results (~10ms, +15-30% MRR@5)
  • LLM-based query expansion (2-3 alternative phrasings per query)
  • Query-time type classification (predict relevant types from natural language)

Anthropic Provider Limitation

The @ai-sdk/anthropic package declares textEmbeddingModel(): never — Anthropic models do not support embeddings. When memory.embedding_provider_profile references an Anthropic profile, the system falls back to keyword-only search. Only OpenAI-compatible providers (@ai-sdk/openai) expose a working textEmbeddingModel() (provider-registry.ts:33-35).


Tool Specification

All memory tools are inline AI SDK tool() definitions following the same pattern as organization tools (see Organization-Tools). They receive team_name implicitly from OrgToolContext.teamName via closure — the agent never passes it as a parameter.

memory_save

Upsert a memory entry. If an active entry with the same key exists, supersede_reason is required.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| key | string | yes | | Short identifier (e.g., "deploy-lesson", "auth-approach") |
| content | string | yes | | Memory content |
| type | string | no | "context" | One of: identity, lesson, decision, context, reference, historical. Accepts aliases. |
| supersede_reason | string | no | | Required when an active entry with the same key exists |

Implicit parameters (not passed by agent):

  • team_name: from OrgToolContext.teamName
  • updated_by: from session/correlation ID

Behavior:

  • No existing active entry: INSERT
  • Existing active entry + supersede_reason: mark old inactive, INSERT new with supersedes_id + reason
  • Existing active entry + no reason: REJECT with error listing current entry
  • Soft validation: if content suggests a different type than specified, log a non-blocking suggestion

memory_delete

Soft-delete a memory entry.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| key | string | yes | Key of the memory entry to delete |

Marks the entry is_active = 0. Deleted entries are excluded from search results (unlike superseded entries, which remain searchable for audit trail).

memory_search

Search team memory using hybrid FTS5 + vector similarity.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| query | string | yes | | Natural language search query |
| max_results | number | no | 5 | Maximum number of results to return |

Return shape: An array of result objects, each containing: key, snippet (~400-token matching text fragment), score (combined relevance 0-1), type, is_active (false if superseded), and source (hybrid or keyword). The response also includes a search_mode field indicating whether hybrid or keyword-only search was used.

Scoped to calling team's data only. Returns active + superseded entries (superseded entries marked as such). Deleted entries are excluded.
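The return shape described above can be written out as a type with a sample response. Field names follow the text; the exact serialization in the real tool may differ.

```typescript
interface MemorySearchResult {
  key: string;
  snippet: string;        // ~400-token matching text fragment
  score: number;          // combined relevance, 0-1
  type: string;
  is_active: boolean;     // false if superseded
  source: "hybrid" | "keyword";
}

interface MemorySearchResponse {
  search_mode: "hybrid" | "keyword"; // keyword when no embedding provider
  results: MemorySearchResult[];
}

// Hypothetical example response for query "deployment timing":
const example: MemorySearchResponse = {
  search_mode: "hybrid",
  results: [
    {
      key: "deploy-lesson",
      snippet: "Never deploy on Fridays (caused incident #42)",
      score: 0.82,
      type: "lesson",
      is_active: true,
      source: "hybrid",
    },
  ],
};
```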

memory_list

List active memory entries for the calling team.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| type | string | no | Filter by type (e.g., "lesson", "context") |

Returns all active entries, optionally filtered by type. Useful for reviewing current memory state before making changes.


Configuration

The embedding provider is configured via memory.embedding_provider_profile in the team's config.yaml, referencing a profile name from /data/config/providers.yaml. The referenced profile must use an OpenAI-compatible provider (provider: openai) since Anthropic models do not support embeddings. The profile's model field determines which embedding model is used (e.g., text-embedding-3-small). No additional configuration is needed — the profile encapsulates provider, API key, and model. See Team-Configuration#Provider Profiles for the full providers.yaml format.
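A hypothetical configuration sketch under the description above. The key path memory.embedding_provider_profile comes from the text; the profile name "embeddings-default" and the exact providers.yaml layout are assumptions — see Team-Configuration#Provider Profiles for the authoritative format.

```yaml
# Team config.yaml (fragment)
memory:
  embedding_provider_profile: embeddings-default

# /data/config/providers.yaml (hypothetical profile entry)
embeddings-default:
  provider: openai               # must be OpenAI-compatible for embeddings
  model: text-embedding-3-small  # determines the embedding model used
```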

Failure Modes

| Failure | Behavior | Recovery |
|---------|----------|----------|
| Profile name not found in providers.yaml | Startup validation warning | Falls back to keyword-only search |
| Profile references Anthropic provider | textEmbeddingModel() returns never at runtime | Falls back to keyword-only search with log warning |
| API key invalid or expired | Embedding API call fails | Falls back to keyword-only search with error log |

Embedding Provider Integration

Only OpenAI-compatible providers support embeddings. The @ai-sdk/openai package exposes textEmbeddingModel() (provider-registry.ts:33-35), while the @ai-sdk/anthropic package declares this method as returning never. The AI SDK's embedMany() function is used for batch embedding during indexing.

The provider profile referenced by memory.embedding_provider_profile must use provider: openai in providers.yaml. Any profile using provider: anthropic will trigger a graceful fallback to keyword-only search.
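The provider check and fallback can be sketched as a single decision (type and function names are illustrative):

```typescript
type SearchMode = "hybrid" | "keyword";

interface ProviderProfile { provider: "openai" | "anthropic"; model: string }

// Decide the search mode for a query: hybrid when a usable embedding
// provider is configured, keyword-only otherwise. The fallback keeps
// memory_search available even when embeddings are not.
function resolveSearchMode(profile?: ProviderProfile): SearchMode {
  if (!profile || profile.provider !== "openai") return "keyword";
  return "hybrid";
}
```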

Supported embedding models (via OpenAI-compatible providers):

  • text-embedding-3-small (1536 dimensions, recommended default)
  • text-embedding-3-large (3072 dimensions, higher quality)
  • text-embedding-ada-002 (1536 dimensions, legacy)

Graceful Degradation

The memory system is designed to degrade gracefully — search always returns results, even when components fail.

| Failure | Behavior | Recovery |
|---------|----------|----------|
| No embedding profile configured | Keyword-only search; startup log info | Configure memory.embedding_provider_profile |
| Embedding API unreachable | Fallback to keyword search + warning log | Check API connectivity |
| No memories exist | Empty injection section, empty search results | Team starts fresh; memories accumulate during operation |
| Memory budget exceeded | Oldest context/decision entries not injected (still searchable) | Agent consolidates or reclassifies entries |

The principle is fail-open for reads, fail-loud for writes. A missing or empty memory store does not crash the session — the team continues without memory context. But a failed write operation is surfaced to the agent so it can retry or escalate.


Security and Isolation

Memory is isolated per-team. Each team can only access its own memory entries through a data access layer.

  • Data access layer: All team-scoped queries go through a single enforced path that adds WHERE team_name = ?. No raw SQL access to memory tables.
  • Search boundaries: The memory_search tool only returns results from the calling team's memory. Cross-team memory access is not possible.
  • Team name validation: SLUG_RE validation applies at the API boundary, preventing injection via team names.
  • API key management: Embedding provider API keys are centrally managed in providers.yaml by the system administrator, not configured per-team.
  • Credential redaction: Memory content passes through the credential scrubber, which redacts sensitive values from logs and tool outputs.
  • CI verification: Test coverage must include cross-team access attempts that verify isolation cannot be bypassed.

Edge Cases

  • Budget overflow: Entries beyond the ~50 cap are not injected but remain their original type and are searchable. The system never auto-changes types.
  • Concurrent writes: SQLite WAL mode handles concurrent readers. Under ADR-41, writes to memories are class 3 (per-key lock): concurrent writes to different subject_key values proceed in parallel, while writes to the same subject_key are serialised by a per-key advisory lock so the append-then-supersede sequence is atomic. The partial unique index on (team_name, subject_key, is_active=1) enforces the invariant of at most one active entry per key. memory_chunks + memory_chunks_fts are class 4 — they follow the parent memories row's lock, so FTS re-indexing cannot race the supersede. See Architecture-Decisions#ADR-41.
  • Team deletion: All memory rows (memories, memory_chunks) for the team are deleted with shutdown_team. This is irreversible.
  • Empty memory: A newly spawned team starts with no memory entries. Memories accumulate during operation as the agent learns.
  • Supersede chains: Frequently superseded keys create chains visible via idx_memory_history. Active queries remain O(1) via the partial index. History queries are opt-in.
  • Deleted vs superseded: Both set is_active = 0. The search query distinguishes them: superseded entries have another row's supersedes_id pointing to them (i.e., EXISTS (SELECT 1 FROM memories m2 WHERE m2.supersedes_id = m.id)). Deleted entries have no successor. The search pipeline includes superseded entries but excludes deleted ones.