# Memory and Storage Systems

_uw-ssec/llmaven GitHub Wiki_
LLMaven is designed to provide a robust, context-aware AI research assistant by integrating advanced memory and storage systems. This architecture combines temporal knowledge graphs, vector search, and both short-term and long-term memory mechanisms to enhance the capabilities of language models.
## Core Design Principles
- Temporal Awareness: Utilizing time-stamped knowledge graphs to track the evolution of information.
- Hybrid Retrieval: Combining vector search with graph-based retrieval for comprehensive context.
- Agentic Modularity: Employing modular agents for specialized tasks, enhancing scalability and maintainability.
- Human-in-the-Loop: Incorporating user feedback mechanisms to refine and guide AI behavior.
- Open Source and Deployability: Ensuring the system is open-source and can be deployed across various environments.
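To make the hybrid-retrieval principle concrete, the sketch below blends vector similarity with a simple graph-neighborhood signal. The scoring rule, the `alpha` weight, and the data layout are illustrative assumptions, not LLMaven's actual implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_rank(query_vec, docs, edges, alpha=0.7):
    """Rank documents by a blend of vector similarity and a graph signal.

    docs:  maps doc id -> embedding vector
    edges: maps doc id -> set of neighbor doc ids in the knowledge graph
    The graph signal is the mean query-similarity of a doc's neighbors,
    so well-connected context can lift a doc that vector search alone
    would rank lower.
    """
    sim = {d: cosine(query_vec, v) for d, v in docs.items()}
    scores = {}
    for d in docs:
        nbrs = edges.get(d, set())
        graph_signal = sum(sim[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        scores[d] = alpha * sim[d] + (1 - alpha) * graph_signal
    return sorted(scores, key=scores.get, reverse=True)
```

In a real deployment both signals would come from Neo4j (the vector index and Cypher traversals respectively); this toy version only shows how the two scores compose.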
## Knowledge Graphs and Vector Search
### Neo4j: The Dual-Purpose Database
Neo4j serves as both a knowledge graph and a vector database, enabling the storage and retrieval of structured relationships and semantic embeddings. This dual capability allows for efficient querying of interconnected data and similarity-based searches.
- Vector Indexing: Neo4j implements Hierarchical Navigable Small World (HNSW) indexing for approximate nearest-neighbor search, enabling fast and scalable vector queries.
- Cypher Query Language: Cypher is Neo4j's declarative graph query language, enabling expressive and efficient querying of property graphs.
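For illustration, the Cypher below sketches how a vector index might be created and queried (syntax follows Neo4j 5.x; the `Document` label, `embedding` property, and 1536-dimension setting are assumptions, not LLMaven's actual schema):

```cypher
// Create an HNSW-backed vector index over document embeddings
// (label, property, and dimensions are illustrative)
CREATE VECTOR INDEX doc_embeddings IF NOT EXISTS
FOR (d:Document) ON (d.embedding)
OPTIONS {indexConfig: {
  `vector.dimensions`: 1536,
  `vector.similarity_function`: 'cosine'
}};

// Approximate nearest-neighbor query: top 5 documents for $queryEmbedding
CALL db.index.vector.queryNodes('doc_embeddings', 5, $queryEmbedding)
YIELD node, score
RETURN node.title AS title, score
ORDER BY score DESC;
```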
### Temporal Knowledge Graphs with Graphiti
Graphiti is a framework for building and querying temporally aware knowledge graphs, tailored to AI agents operating in dynamic environments.
- Time-Stamped Relationships: Graphiti allows for the creation of knowledge graphs where relationships are annotated with temporal information, enabling the tracking of changes over time.
- Integration with Neo4j: Graphiti stores its temporal knowledge graphs in Neo4j, leveraging its graph database capabilities for efficient storage and retrieval.
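The core idea of a time-stamped relationship can be sketched as an edge with a validity interval, so the graph can be queried "as of" any point in time. The names below are illustrative, not Graphiti's actual API:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class TemporalEdge:
    """A relationship annotated with its validity interval, in the spirit
    of Graphiti's time-stamped edges (illustrative, not Graphiti's API)."""
    source: str
    relation: str
    target: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means still valid

    def valid_at(self, t: datetime) -> bool:
        return self.valid_from <= t and (self.valid_to is None or t < self.valid_to)

def edges_at(edges: List[TemporalEdge], t: datetime) -> List[TemporalEdge]:
    """Snapshot of the graph as it was known at time t."""
    return [e for e in edges if e.valid_at(t)]
```

Instead of overwriting a fact when it changes, the old edge gets a `valid_to` timestamp and a new edge is added, so the history of the information remains queryable.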
## Mem0 and OpenMemory
### Short-Term Memory

Mem0 provides a self-improving memory layer for LLM applications, enabling personalized AI experiences while reducing costs and improving user satisfaction.
- Selective Extraction: Mem0 extracts the most salient facts from interactions, reducing the amount of data stored and processed.
- Asynchronous Updates: The system updates its memory asynchronously, preventing memory management from interfering with real-time interactions.
- OpenMemory MCP: A private, local-first memory layer powered by Mem0, enabling persistent, context-aware AI across MCP-compatible clients.
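The selective-extraction idea can be illustrated with a toy memory class that keeps only short salient statements rather than full transcripts. The heuristic and class below mirror the concept behind Mem0's fact extraction, not its actual API:

```python
import re

class ShortTermMemory:
    """Toy illustration of selective fact extraction: remember concise
    declarative facts instead of whole conversations. Illustrative only;
    not Mem0's real interface."""

    def __init__(self, max_facts: int = 100):
        self.facts: list[str] = []
        self.max_facts = max_facts

    def extract(self, message: str) -> None:
        # Naive salience heuristic: sentences containing "is", "prefers",
        # or "uses" are treated as facts worth keeping. A real system
        # would use an LLM for this step.
        for sentence in re.split(r"[.!?]\s*", message):
            if re.search(r"\b(is|prefers|uses)\b", sentence):
                self.facts.append(sentence.strip())
        self.facts = self.facts[-self.max_facts:]  # bounded storage

    def recall(self, keyword: str) -> list[str]:
        return [f for f in self.facts if keyword.lower() in f.lower()]
```

Note that only the extracted facts are retained, which is what keeps storage and downstream prompt sizes small; an asynchronous variant would run `extract` off the request path, as the bullet above describes.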
### Long-Term Memory: Persistent Knowledge Graphs
Long-term memory in LLMaven is managed through persistent knowledge graphs stored in Neo4j. These graphs capture and maintain structured information over extended periods, allowing the AI to recall and reason over past interactions and data.
- Graphiti Integration: By leveraging Graphiti, the system ensures that long-term memory includes temporal context, enhancing the AI's ability to understand the evolution of information.
## Agentic Framework and Memory Integration
LLMaven employs a modular agentic framework, where specialized agents handle different aspects of the system. Memory components are integrated into this framework to provide context and continuity across interactions.
| Agent | Role Description |
|---|---|
| Supervisor | Orchestrates task planning and coordination |
| Docs Agent | Manages document retrieval and knowledge graph interaction |
| Coding Agent | Interacts with code repositories and generates code |
| Data Agent | Handles data retrieval, transformation, and storage |
| Pipeline Agent | Executes and monitors pipeline workflows |
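A minimal sketch of supervisor-style routing between specialized agents follows. The agent names match the table above, but the keyword-based routing, handlers, and class shape are illustrative assumptions, not LLMaven's actual orchestration logic:

```python
from typing import Callable, Dict

class Supervisor:
    """Toy supervisor that routes a task to the first agent whose
    registered keyword appears in the task text (illustrative only)."""

    def __init__(self):
        self.agents: Dict[str, Callable[[str], str]] = {}
        self.routes: Dict[str, str] = {}  # keyword -> agent name

    def register(self, name: str, handler: Callable[[str], str], keywords):
        self.agents[name] = handler
        for kw in keywords:
            self.routes[kw] = name

    def dispatch(self, task: str) -> str:
        for kw, name in self.routes.items():
            if kw in task.lower():
                return self.agents[name](task)
        return "no agent matched"

sup = Supervisor()
sup.register("Docs Agent", lambda t: "retrieving documents", ["document", "paper"])
sup.register("Coding Agent", lambda t: "generating code", ["code", "repo"])
```

In the real framework, shared memory (short-term via Mem0, long-term via the Neo4j knowledge graph) gives each agent the context of prior interactions; here the handlers are stubs to keep the routing idea visible.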
## Observability and Evaluation
To ensure transparency and facilitate debugging, LLMaven incorporates observability tools.
- Logfire: Provides detailed logs of agent interactions, aiding debugging and performance tracking.
- Grafana: Offers dashboards for monitoring system metrics, user feedback, and performance.
## Deployment and Scalability
LLMaven is designed for flexible deployment across various environments.
- Kubernetes & Helm: Container orchestration and package management for scalable deployment.
- Cloud-Native Tools: Integration with Open Web UI, vLLM, and MinIO enhances functionality and extensibility.
## Future Work
One planned extension of the memory and storage architecture is the addition of a context sufficiency evaluation module. This module will:
- Automatically evaluate whether the retrieved memory provides enough relevant context to satisfy user needs or complete agent tasks.
- Score the relevance and completeness of retrieved memory chunks based on task-specific criteria.
- Enable dynamic querying or fallback strategies when insufficient memory context is detected.
This enhancement will support more robust agent behavior by improving how LLMaven detects and adapts to incomplete or under-contextualized inputs.
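Since this module is still planned, the sketch below only illustrates the intent: score how well retrieved memory covers a query and flag when a fallback retrieval is needed. The term-overlap scoring rule and the threshold value are placeholder assumptions:

```python
def context_sufficiency(query_terms, chunks, threshold=0.6):
    """Toy context-sufficiency check: coverage is the fraction of query
    terms appearing in at least one retrieved chunk. Below `threshold`,
    the caller should trigger dynamic re-querying or a fallback strategy.
    The rule and threshold are illustrative, not the planned design."""
    text = " ".join(chunks).lower()
    covered = [t for t in query_terms if t.lower() in text]
    score = len(covered) / len(query_terms) if query_terms else 0.0
    return {
        "score": score,
        "sufficient": score >= threshold,
        "missing": [t for t in query_terms if t.lower() not in text],
    }
```

A production version would presumably use semantic rather than lexical matching and task-specific criteria, as the bullets above describe; the useful part of the sketch is the output shape, where `missing` tells the agent what to re-query for.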
## Additional Resources
- Graphiti GitHub Repository
- Mem0 GitHub Repository
- Neo4j Vector Search Documentation
- Cypher Query Language Introduction
By integrating temporal knowledge graphs, vector search, and modular memory components, LLMaven provides a comprehensive and scalable solution for AI-driven research assistance.