Memory System
Each team has persistent memory stored in SQLite. Memory entries are typed, versioned via a supersede mechanism, and searchable through a hybrid FTS5 + vector pipeline. A subset of memories (`identity`, `lesson`, `decision`, `context`) is auto-injected into each session's system prompt; the rest (`reference`, `historical`) is accessible on demand via the `memory_search` tool.
For team configuration and provider profiles, see Team-Configuration. For persistence and crash recovery, see Durability-Recovery. For the architectural decision behind this system, see Architecture-Decisions#ADR-29.
- Architecture Overview
- Memory Types
- Memory vs Vault
- Supersede Mechanism
- Schema
- Injection
- Search Pipeline
- Tool Specification
- Configuration
- Embedding Provider Integration
- Graceful Degradation
- Security and Isolation
- Edge Cases
The memory system is a single-layer SQLite store. The `memories` table is the source of truth; search tables (`memory_chunks`, `memory_chunks_fts`, `embedding_cache`) are derived from it. There is no filesystem memory directory — all memory lives in the shared `.run/openhive.db` database.
```mermaid
graph TB
  sessionHandler["Session Handler"]
  promptBuilder["Prompt Builder<br/>(prompt-builder.ts)"]
  memoryLoader["Memory Loader<br/>(memory-loader.ts)"]
  memoriesTable["memories table<br/>(SQLite)"]
  memorySearch["memory_search Tool"]
  memoryChunks["memory_chunks +<br/>memory_chunks_fts<br/>(SQLite FTS5)"]
  embeddingCache["embedding_cache<br/>(SQLite)"]
  embeddingProvider["Embedding Provider<br/>(providers.yaml profile)"]
  sessionHandler --> promptBuilder
  promptBuilder --> memoryLoader
  memoryLoader -->|"SELECT WHERE type IN<br/>(identity,lesson,decision,context)<br/>AND is_active = 1"| memoriesTable
  memorySearch --> memoryChunks
  memorySearch --> embeddingCache
  memoryChunks -->|"chunks built from"| memoriesTable
  embeddingProvider --> embeddingCache
```
**Source of truth (`memories` table)**: Stores all memory entries with type, supersede chain, and audit metadata. All reads and writes go through a data access layer that enforces `WHERE team_name = ?` on every query.

**Search layer (`memory_chunks` + `memory_chunks_fts` + `embedding_cache`)**: Derived from `memories.content`. The `memory_search` tool queries the FTS5 index for keyword search and the embedding cache for vector similarity. See Search Pipeline.
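For illustration, a minimal sketch of that enforced path, assuming a better-sqlite3-style API (the helper name is hypothetical):

```typescript
import Database from "better-sqlite3";

const db = new Database(".run/openhive.db");

// All team-scoped reads bind team_name as a parameter; callers never
// interpolate it into SQL themselves.
function listActiveMemories(teamName: string, types: string[]) {
  const placeholders = types.map(() => "?").join(", ");
  return db
    .prepare(
      `SELECT key, content, type FROM memories
       WHERE team_name = ? AND is_active = 1 AND type IN (${placeholders})`
    )
    .all(teamName, ...types);
}
```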
Six types, each with aliases for natural writes:
| Type | Injected? | Purpose | Example Aliases |
|---|---|---|---|
| `identity` | Yes (never dropped) | Discovered identity — capabilities/scope learned during operation (prescribed identity lives in team-rules) | "core", "self" |
| `lesson` | Yes | Mistakes to avoid, patterns that work | "warning", "insight", "learning" |
| `decision` | Yes (may overflow budget) | Key decisions with rationale | "commitment", "choice" |
| `context` | Yes (may overflow budget) | Current state, ongoing work | "active", "background" |
| `reference` | No (search only) | Where to find things, how-tos | "pointer", "link" |
| `historical` | No (search only) | Past events, completed work | "archive", "past" |
Type aliases map user-friendly names to canonical types at write time. The alias resolution happens in `memory_save` — the agent can write `type: "warning"` and it maps to `lesson`. This reduces classification burden without losing semantic precision.
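A sketch of the resolution step, using the aliases from the table above (helper names are illustrative):

```typescript
const CANONICAL_TYPES = ["identity", "lesson", "decision", "context", "reference", "historical"] as const;
type MemoryType = (typeof CANONICAL_TYPES)[number];

// Alias → canonical type, per the table above.
const TYPE_ALIASES: Record<string, MemoryType> = {
  core: "identity", self: "identity",
  warning: "lesson", insight: "lesson", learning: "lesson",
  commitment: "decision", choice: "decision",
  active: "context", background: "context",
  pointer: "reference", link: "reference",
  archive: "historical", past: "historical",
};

function resolveType(input: string): MemoryType {
  const t = input.toLowerCase().trim();
  if ((CANONICAL_TYPES as readonly string[]).includes(t)) return t as MemoryType;
  return TYPE_ALIASES[t] ?? "context"; // unknown types fall back to the default
}
```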
| Type | Test Question | Example |
|---|---|---|
| `identity` | "Is this something I discovered about my role?" | "I'm also responsible for monitoring the payment gateway" |
| `lesson` | "Is this a generalizable insight from experience?" | "Never deploy on Fridays — caused incident #42" |
| `decision` | "Does this commit to a specific action pattern?" | "We chose JWT with 1h expiry + refresh tokens" |
| `context` | "Is this situational but currently active?" | "Auth service migration is in progress, payment gateway is next" |
| `reference` | "Is this a pointer to external information?" | "Grafana dashboard at grafana.internal/d/api-latency" |
| `historical` | "Is this a past event or completed work?" | "Q1 migration completed successfully on 2026-03-15" |
**Identity vs team-rules**: Team-rules define prescribed identity (set at spawn time via team-rules/team-context.md). Memory `identity` entries capture capabilities or scope the agent discovers during operation. If they diverge, team-rules are authoritative for prescribed behavior; memory is authoritative for learned context.
| Category | Types | Behavior |
|---|---|---|
| Injected (budget-capped ~50 entries) | `identity` (always first, never dropped), `lesson` (always second), `decision` (newest first, may overflow), `context` (newest first, may overflow) | Auto-included in session system prompt. Type is never changed by budget overflow. |
| Search only (no limit) | `reference`, `historical`, plus overflow `decision`/`context` entries | Accessible on demand via `memory_search` |
Entries beyond the injection budget cap are not injected but remain searchable with their original type. The agent explicitly reclassifies entries when appropriate (e.g., `decision` → `historical` when superseded). The system never auto-changes types.
Teams have access to both the memory system (this page) and the team vault (key-value store in team_vault table). They serve different purposes and are not interchangeable.
Use memory for knowledge that should appear in the agent's reasoning context or be semantically searchable:
- **Auto-injected types** (`identity`, `lesson`, `decision`, `context`) are included in the system prompt each session. Store learned capabilities, operational insights, active decisions, and current state here.
- **Search-only types** (`reference`, `historical`) are retrievable on demand via `memory_search`. Store pointers, how-tos, and completed-work records here.
Memory content is visible to the agent, participates in hybrid search (FTS5 + vector), and influences reasoning through prompt injection.
Use vault for machine-consumed operational state and secrets that should never appear in prompts:
- **Configuration state** -- runtime settings, feature flags, progress counters, cached intermediate data.
- **Secrets** -- API keys, tokens, credentials. Vault entries with `is_secret = 1` are registered with the credential scrubber and redacted from logs and tool outputs.
Vault data is never injected into the system prompt and is not semantically searchable. It is read programmatically by key, not discovered through natural language queries.
When the autonomous learning system stores research findings as memories, it uses a deterministic key format: `learn:{topic_slug}:{claim_hash}` — where `topic_slug` is a kebab-case topic identifier and `claim_hash` is a short hash of the normalized claim for deduplication.
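A hypothetical key builder showing the shape; slug normalization and hash length are assumptions, only the key format itself is documented:

```typescript
import { createHash } from "node:crypto";

// Builds a learn:{topic_slug}:{claim_hash} key for deduplication.
function learningKey(topic: string, claim: string): string {
  const topicSlug = topic
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-|-$/g, "");
  const claimHash = createHash("sha256")
    .update(claim.trim().toLowerCase().replace(/\s+/g, " ")) // normalize the claim
    .digest("hex")
    .slice(0, 8); // short hash; exact length is an assumption
  return `learn:${topicSlug}:${claimHash}`;
}

// learningKey("SQLite FTS5", "BM25 is the default ranking function")
//   → "learn:sqlite-fts5:<8 hex chars>"
```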
Each learning-originated memory entry includes a structured confidence tag with four fields: `confidence` (high/medium/low), `corroborated` (number of independent root domains), `domains` (comma-separated list of corroborating domains), and `verified` (ISO date of last verification).
| Tag Field | Values | Meaning |
|---|---|---|
| `confidence` | `high`, `medium`, `low` | Overall confidence in the claim |
| `corroborated` | integer | Number of independent root domains that corroborate the claim |
| `domains` | comma-separated list | Root domains that provided corroborating evidence |
| `verified` | ISO date | Date the claim was last verified |
These entries are typically stored as lesson or reference type and follow normal supersede rules -- a re-verified claim with updated confidence supersedes the prior entry.
For the full decision framework on when to use memory, vault, or the filesystem, see Team-Configuration#Data Store Decision Tree.
| Anti-Pattern | Problem | Correct Approach |
|---|---|---|
| Storing operational counters or caches in memory | Pollutes semantic search; injected into prompts unnecessarily; creates version bloat (e.g., 251 versions of a `reported_posts` key) | Use vault (`is_secret = 0`) for counters, progress state, and operational caches |
| Using vault for knowledge that should be searchable | Knowledge invisible to hybrid search; agent cannot discover it via `memory_search` | Use memory (`lesson`/`reference`) for searchable knowledge |
| Storing large JSON blobs in memory content | Degrades FTS5 ranking; wastes prompt token budget | Store structured data in vault; store a summary in memory with a vault key reference |
Self-reflection cycles (see Self-Evolution#Self-Reflection) should detect operational-cache-in-memory anti-patterns and propose migration to vault.
Memories cannot be silently overwritten. When an active entry with the same key exists, the agent must provide a `supersede_reason` explaining why the old entry was wrong or outdated. Two modes:
For substantive corrections — the old understanding was wrong or circumstances changed.
- Agent must provide `supersede_reason`
- Old entry marked `is_active = 0`
- New entry links via `supersedes_id`
- Agent should verify contradictions with user before superseding
For trivial corrections — typos, formatting, minor clarifications.
- Flagged with a standardized reason like `"minor correction"`
- Same schema mechanics, lighter ceremony
- No user verification required
When `memory_save` is called: if no active entry exists with that key, a new entry is inserted. If an active entry exists and `supersede_reason` is provided, the old entry is marked inactive and a new entry is inserted linked via `supersedes_id`. If an active entry exists but no reason is provided, the call is rejected with an error listing the current entry.

Superseded entries remain searchable for audit trail. The `supersede_reason` often embeds a lesson itself — a supersede that says "we thought X but actually Y because Z" is valuable searchable context.
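A sketch of that decision flow, assuming the better-sqlite3-style `db` handle from the earlier sketch (the real implementation also holds the per-key lock described under Edge Cases):

```typescript
function memorySave(teamName: string, key: string, content: string,
                    type: string, supersedeReason?: string, updatedBy?: string) {
  const existing = db.prepare(
    `SELECT id, content FROM memories
     WHERE team_name = ? AND key = ? AND is_active = 1`
  ).get(teamName, key) as { id: number; content: string } | undefined;

  if (existing && !supersedeReason) {
    // Reject: silent overwrites are not allowed.
    throw new Error(
      `Active entry exists for "${key}": ${existing.content}. Provide supersede_reason.`
    );
  }
  if (existing) {
    db.prepare(`UPDATE memories SET is_active = 0 WHERE id = ?`).run(existing.id);
  }
  // Timestamps are assumed to default via the schema.
  db.prepare(
    `INSERT INTO memories (team_name, key, content, type, is_active,
                           supersedes_id, supersede_reason, updated_by)
     VALUES (?, ?, ?, ?, 1, ?, ?, ?)`
  ).run(teamName, key, content, type,
        existing?.id ?? null, supersedeReason ?? null, updatedBy ?? null);
}
```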
The source of truth for all memory entries.
| Column | Purpose |
|---|---|
| `id` | Auto-incrementing primary key |
| `team_name` | Owning team (isolation boundary) |
| `key` | Short identifier for the entry |
| `content` | Memory content text |
| `type` | One of: `identity`, `lesson`, `decision`, `context`, `reference`, `historical` (default: `context`) |
| `is_active` | 1 = active, 0 = superseded or deleted |
| `supersedes_id` | Foreign key to the entry this one replaced |
| `supersede_reason` | Why the old entry was wrong or outdated |
| `updated_by` | Agent/session ID for audit |
| `created_at`, `updated_at` | ISO timestamps |
Key indexes: a partial unique index enforces one active entry per key per team. A history index supports supersede chain queries. An injection index supports the budget-capped prompt injection query.
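A DDL sketch consistent with that description; `idx_memory_history` is named in the Edge Cases section, while the other index names and column choices are assumptions:

```typescript
db.exec(`
  -- One active entry per key per team (partial unique index)
  CREATE UNIQUE INDEX IF NOT EXISTS idx_memory_active_key
    ON memories (team_name, key) WHERE is_active = 1;

  -- Supersede chain traversal (history queries)
  CREATE INDEX IF NOT EXISTS idx_memory_history
    ON memories (supersedes_id);

  -- Budget-capped injection query over active injected types
  CREATE INDEX IF NOT EXISTS idx_memory_injection
    ON memories (team_name, type, is_active, updated_at);
`);
```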
These tables are derived from `memories.content`. When a memory is inserted or updated, the application splits content into ~400-token chunks with 80-token overlap at paragraph boundaries.
| Table | Purpose |
|---|---|
| `memory_chunks` | Stores content chunks with `memory_id` reference, `team_name`, chunk content, chunk index, and content hash (SHA-256) |
| `memory_chunks_fts` | FTS5 virtual table over `memory_chunks` for full-text keyword search |
| `embedding_cache` | Stores pre-computed embeddings keyed by content hash, with model name and timestamp |
Indexed by: `memory_id` (for cascading deletes), `team_name` (for isolation), and `content_hash` (for embedding lookups).
When `shutdown_team` is called, all rows in `memories`, `memory_chunks`, and related FTS entries for that team are deleted alongside other team-scoped data (see Durability-Recovery#What Is Removed on shutdown_team).
`buildMemorySection()` (memory-loader.ts) queries the `memories` table for active injected types, ordered by priority: `identity` first, then `lesson`, then `decision` (newest first), then `context` (newest first), limited to 50 entries. A sketch of this query appears after the budget list below.

Injected memories appear in the system prompt's dynamic suffix as structured blocks grouped by type — each entry shown with its type label and key, e.g., `[IDENTITY] [team-scope]: I am the operations team...`. The ordering matches the injection priority: `identity`, `lesson`, `decision`, `context`.
- Entry count cap: ~50 active entries across injected types
- If budget exceeded: the injection query skips entries beyond the cap. Type is never changed — a `decision` remains a `decision` even when not injected. It remains searchable via `memory_search`.
- Injection priority: `identity` first, then `lesson`, then `decision` (newest first), then `context` (newest first)
- Warning logged when active injected entries reach 80% of cap
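A sketch of the injection query implied by these rules (the exact SQL in memory-loader.ts may differ):

```typescript
// Active injected types, priority-ordered, capped at 50 entries.
const injected = db.prepare(`
  SELECT type, key, content FROM memories
  WHERE team_name = ? AND is_active = 1
    AND type IN ('identity', 'lesson', 'decision', 'context')
  ORDER BY
    CASE type WHEN 'identity' THEN 0 WHEN 'lesson' THEN 1
              WHEN 'decision' THEN 2 ELSE 3 END,
    updated_at DESC   -- "newest first" is documented for decision/context
  LIMIT 50
`).all(teamName);
```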
The memory_search tool retrieves relevant memory fragments using two retrieval paths that run in parallel:
- **FTS5 keyword search** — full-text search with BM25 scoring via `memory_chunks_fts`. No embeddings required. Always available.
- **Vector similarity search** — embeds the query via a configured provider and finds memory chunks by cosine similarity against pre-indexed embeddings in `embedding_cache`.
Results from both paths are merged, then refined through temporal decay and diversity re-ranking.
```mermaid
graph TB
  memorySearchTool["memory_search(query)"] --> keywordPath["FTS5 keyword search<br/>(always runs)"]
  memorySearchTool --> checkProvider{"Embedding provider<br/>available?"}
  checkProvider -->|Yes| embeddingPath["Embed query → cosine similarity"]
  checkProvider -->|No| keywordOnly["Keyword results only"]
  keywordPath --> hybridMerge["Hybrid merge<br/>(0.7 vector + 0.3 keyword)"]
  embeddingPath --> hybridMerge
  keywordOnly --> temporalDecay
  hybridMerge --> temporalDecay["Temporal decay"]
  temporalDecay --> mmrRerank["MMR re-ranking"]
  mmrRerank --> results["Top-k results"]
```
When both paths produce results, scores are combined:
- Vector score: Cosine similarity (0-1) between the query embedding and each chunk embedding.
- Keyword score: BM25 rank normalized to 0-1.
- Combined: `0.7 * vector_score + 0.3 * keyword_score` (weights configurable).
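A sketch of the merge, where the BM25 normalization scheme is an assumption; only the 0.7/0.3 weighting is documented:

```typescript
function combinedScore(vectorScore: number | null, bm25Rank: number | null,
                       wVector = 0.7, wKeyword = 0.3): number {
  // Assumed normalization: lower BM25 ranks are better, so map rank to 0-1.
  const keywordScore = bm25Rank === null ? 0 : 1 / (1 + Math.max(0, bm25Rank));
  if (vectorScore === null) return keywordScore; // keyword-only fallback
  return wVector * vectorScore + wKeyword * keywordScore;
}
```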
An exponential decay function reduces scores for older content:
- Half-life: 30 days (default). A memory chunk last modified 30 days ago has its score multiplied by 0.5.
- Exempt types: `identity` and `lesson` entries are exempt from decay — they are considered evergreen.
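The decay function these parameters imply, as a sketch:

```typescript
// Exponential decay with a 30-day half-life: score * 0.5^(ageDays / 30).
const HALF_LIFE_DAYS = 30;
const DECAY_EXEMPT = new Set(["identity", "lesson"]); // evergreen types

function applyDecay(score: number, type: string,
                    updatedAt: Date, now = new Date()): number {
  if (DECAY_EXEMPT.has(type)) return score;
  const ageDays = (now.getTime() - updatedAt.getTime()) / 86_400_000;
  return score * Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}
```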
Maximal Marginal Relevance prevents redundant results:
- Lambda: 0.7 (balances relevance vs. diversity).
- Similarity metric: Jaccard similarity between chunk token sets.
- After scoring, results are iteratively selected to maximize diversity among the top-k.
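A sketch of the selection loop (standard MMR; how chunks are tokenized into the Jaccard sets is not specified here):

```typescript
function jaccard(a: Set<string>, b: Set<string>): number {
  let inter = 0;
  for (const t of a) if (b.has(t)) inter++;
  const union = a.size + b.size - inter;
  return union === 0 ? 0 : inter / union;
}

function mmrSelect(candidates: { score: number; tokens: Set<string> }[],
                   k: number, lambda = 0.7) {
  const selected: typeof candidates = [];
  const pool = [...candidates];
  while (selected.length < k && pool.length > 0) {
    let bestIdx = 0, bestVal = -Infinity;
    for (let i = 0; i < pool.length; i++) {
      // Relevance minus redundancy against already-selected results.
      const maxSim = selected.length
        ? Math.max(...selected.map((s) => jaccard(pool[i].tokens, s.tokens)))
        : 0;
      const val = lambda * pool[i].score - (1 - lambda) * maxSim;
      if (val > bestVal) { bestVal = val; bestIdx = i; }
    }
    selected.push(pool.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```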
Search queries `memory_chunks_fts` and `embedding_cache`, then joins back to `memories` for metadata (`type`, `team_name`, `is_active`). Both active and superseded entries are searchable — superseded results are marked as such in the response. Deleted entries (via `memory_delete`) are excluded from search results.
These enhancements were validated as valuable during the design review but deferred to reduce v1 complexity:
- `access_count`/`last_accessed` columns for usage-based ranking signal
- Cross-encoder reranking on top-20 results (~10ms, +15-30% MRR@5)
- LLM-based query expansion (2-3 alternative phrasings per query)
- Query-time type classification (predict relevant types from natural language)
The `@ai-sdk/anthropic` package declares `textEmbeddingModel(): never` — Anthropic models do not support embeddings. When `memory.embedding_provider_profile` references an Anthropic profile, the system falls back to keyword-only search. Only OpenAI-compatible providers (`@ai-sdk/openai`) expose a working `textEmbeddingModel()` (provider-registry.ts:33-35).
All memory tools are inline AI SDK `tool()` definitions following the same pattern as organization tools (see Organization-Tools). They receive `team_name` implicitly from `OrgToolContext.teamName` via closure — the agent never passes it as a parameter.
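For illustration, the closure pattern for one tool, assuming the AI SDK's `tool()` helper with a Zod parameters schema; `memoryDelete` stands in for the real data access call:

```typescript
import { tool } from "ai";
import { z } from "zod";

// Hypothetical DAL call standing in for the real delete implementation.
declare function memoryDelete(teamName: string, key: string): Promise<string>;

function makeMemoryDeleteTool(ctx: { teamName: string }) {
  return tool({
    description: "Soft-delete a memory entry by key.",
    parameters: z.object({
      key: z.string().describe("Key of the memory entry to delete"),
    }),
    execute: async ({ key }) => {
      // ctx.teamName is captured via closure; the agent never supplies it.
      return memoryDelete(ctx.teamName, key);
    },
  });
}
```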
Upsert a memory entry. If an active entry with the same `key` exists, `supersede_reason` is required.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `key` | `string` | yes | — | Short identifier (e.g., `"deploy-lesson"`, `"auth-approach"`) |
| `content` | `string` | yes | — | Memory content |
| `type` | `string` | no | `"context"` | One of: `identity`, `lesson`, `decision`, `context`, `reference`, `historical`. Accepts aliases. |
| `supersede_reason` | `string` | no | — | Required when an active entry with same key exists |
Implicit parameters (not passed by agent):
- `team_name`: from `OrgToolContext.teamName`
- `updated_by`: from session/correlation ID
Behavior:
- No existing active entry: INSERT
- Existing active entry + `supersede_reason`: mark old inactive, INSERT new with `supersedes_id` + reason
- Existing active entry + no reason: REJECT with error listing current entry
- Soft validation: if content suggests a different type than specified, log a non-blocking suggestion
Soft-delete a memory entry.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `key` | `string` | yes | Key of the memory entry to delete |
Marks the entry `is_active = 0`. Deleted entries are excluded from search results (unlike superseded entries, which remain searchable for audit trail).
Search team memory using hybrid FTS5 + vector similarity.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `query` | `string` | yes | — | Natural language search query |
| `max_results` | `number` | no | `5` | Maximum number of results to return |
Return shape: An array of result objects, each containing: `key`, `snippet` (~400-token matching text fragment), `score` (combined relevance 0-1), `type`, `is_active` (false if superseded), and `source` (`hybrid` or `keyword`). The response also includes a `search_mode` field indicating whether hybrid or keyword-only search was used.
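Expressed as illustrative TypeScript types (the exact nesting of `results` under the response object is an assumption):

```typescript
interface MemorySearchResult {
  key: string;
  snippet: string;   // ~400-token matching text fragment
  score: number;     // combined relevance, 0-1
  type: "identity" | "lesson" | "decision" | "context" | "reference" | "historical";
  is_active: boolean; // false if superseded
  source: "hybrid" | "keyword";
}

interface MemorySearchResponse {
  search_mode: "hybrid" | "keyword";
  results: MemorySearchResult[];
}
```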
Scoped to the calling team's data only. Returns active + superseded entries (superseded entries marked as such). Deleted entries are excluded.
List active memory entries for the calling team.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `type` | `string` | no | Filter by type (e.g., `"lesson"`, `"context"`) |
Returns all active entries, optionally filtered by type. Useful for reviewing current memory state before making changes.
The embedding provider is configured via `memory.embedding_provider_profile` in the team's config.yaml, referencing a profile name from /data/config/providers.yaml. The referenced profile must use an OpenAI-compatible provider (`provider: openai`) since Anthropic models do not support embeddings. The profile's `model` field determines which embedding model is used (e.g., `text-embedding-3-small`). No additional configuration is needed — the profile encapsulates provider, API key, and model. See Team-Configuration#Provider Profiles for the full providers.yaml format.
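A hypothetical pairing of the two files; the profile's field names are illustrative, see Team-Configuration#Provider Profiles for the authoritative format:

```yaml
# Team config.yaml
memory:
  embedding_provider_profile: embeddings-default

# /data/config/providers.yaml (illustrative field names)
profiles:
  embeddings-default:
    provider: openai              # must be OpenAI-compatible
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}    # hypothetical; the profile carries the key
```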
| Failure | Behavior | Recovery |
|---|---|---|
| Profile name not found in providers.yaml | Startup validation warning | Falls back to keyword-only search |
| Profile references Anthropic provider | `textEmbeddingModel()` returns `never` at runtime | Falls back to keyword-only search with log warning |
| API key invalid or expired | Embedding API call fails | Falls back to keyword-only search with error log |
Only OpenAI-compatible providers support embeddings. The `@ai-sdk/openai` package exposes `textEmbeddingModel()` (provider-registry.ts:33-35), while the `@ai-sdk/anthropic` package declares this method as returning `never`. The AI SDK's `embedMany()` function is used for batch embedding during indexing.

The provider profile referenced by `memory.embedding_provider_profile` must use `provider: openai` in providers.yaml. Any profile using `provider: anthropic` will trigger a graceful fallback to keyword-only search.
Supported embedding models (via OpenAI-compatible providers):
- `text-embedding-3-small` (1536 dimensions, recommended default)
- `text-embedding-3-large` (3072 dimensions, higher quality)
- `text-embedding-ada-002` (1536 dimensions, legacy)
The memory system is designed to degrade gracefully — search always returns results, even when components fail.
| Failure | Behavior | Recovery |
|---|---|---|
| No embedding profile configured | Keyword-only search, startup log info | Configure `memory.embedding_provider_profile` |
| Embedding API unreachable | Fallback to keyword search + warning log | Check API connectivity |
| No memories exist | Empty injection section, empty search results | Team starts fresh; memories accumulate during operation |
| Memory budget exceeded | Oldest `context`/`decision` entries not injected (still searchable) | Agent consolidates or reclassifies entries |
The principle is fail-open for reads, fail-loud for writes. A missing or empty memory store does not crash the session — the team continues without memory context. But a failed write operation is surfaced to the agent so it can retry or escalate.
Memory is isolated per-team. Each team can only access its own memory entries through a data access layer.
- **Data access layer**: All team-scoped queries go through a single enforced path that adds `WHERE team_name = ?`. No raw SQL access to memory tables.
- **Search boundaries**: The `memory_search` tool only returns results from the calling team's memory. Cross-team memory access is not possible.
- **Team name validation**: `SLUG_RE` validation applies at the API boundary, preventing injection via team names.
- **API key management**: Embedding provider API keys are centrally managed in `providers.yaml` by the system administrator, not configured per-team.
- **Credential redaction**: Memory content passes through the credential scrubber, which redacts sensitive values from logs and tool outputs.
- **CI verification**: Test coverage must include cross-team access attempts that verify isolation cannot be bypassed.
- **Budget overflow**: Entries beyond the ~50 cap are not injected but retain their original type and remain searchable. The system never auto-changes types.
- **Concurrent writes**: SQLite WAL mode handles concurrent readers. Under ADR-41, writes to `memories` are class 3 (per-key lock): concurrent writes to different `subject_key` values proceed in parallel, while writes to the same `subject_key` are serialised by a per-key advisory lock so the append-then-supersede sequence is atomic. The partial unique index on `(team_name, subject_key, is_active=1)` enforces the invariant of at most one active entry per key. `memory_chunks` + `memory_chunks_fts` are class 4 — they follow the parent `memories` row's lock, so FTS re-indexing cannot race the supersede. See Architecture-Decisions#ADR-41.
- **Team deletion**: All memory rows (`memories`, `memory_chunks`) for the team are deleted with `shutdown_team`. This is irreversible.
- **Empty memory**: A newly spawned team starts with no memory entries. Memories accumulate during operation as the agent learns.
- **Supersede chains**: Frequently superseded keys create chains visible via `idx_memory_history`. Active queries remain O(1) via the partial index. History queries are opt-in.
- **Deleted vs superseded**: Both set `is_active = 0`. The search query distinguishes them: superseded entries have another row's `supersedes_id` pointing to them (i.e., `EXISTS (SELECT 1 FROM memories m2 WHERE m2.supersedes_id = m.id)`). Deleted entries have no successor. The search pipeline includes superseded entries but excludes deleted ones; a sketch follows this list.
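A sketch of that filter; the `EXISTS` predicate is quoted from the item above, while the surrounding query is illustrative:

```typescript
// Include active and superseded rows; exclude deleted ones.
function searchableRows(teamName: string) {
  return db.prepare(`
    SELECT m.* FROM memories m
    WHERE m.team_name = ?
      AND (m.is_active = 1
           OR EXISTS (SELECT 1 FROM memories m2 WHERE m2.supersedes_id = m.id))
  `).all(teamName);
  // Rows with is_active = 0 and no successor were deleted via memory_delete;
  // they match neither branch and never appear in results.
}
```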