Memory and Storage Systems - uw-ssec/llmaven GitHub Wiki

🧠 Memory and Storage Systems

LLMaven is designed to provide a robust, context-aware AI research assistant by integrating advanced memory and storage systems. This architecture combines temporal knowledge graphs, vector search, and both short-term and long-term memory mechanisms to enhance the capabilities of language models.

🧭 Core Design Principles

  • Temporal Awareness: Utilizing time-stamped knowledge graphs to track the evolution of information.
  • Hybrid Retrieval: Combining vector search with graph-based retrieval for comprehensive context.
  • Agentic Modularity: Employing modular agents for specialized tasks, enhancing scalability and maintainability.
  • Human-in-the-Loop: Incorporating user feedback mechanisms to refine and guide AI behavior.
  • Open Source and Deployability: Ensuring the system is open-source and can be deployed across various environments.

🧠 Knowledge Graphs and Vector Search

Neo4j: The Dual-Purpose Database

Neo4j serves as both a knowledge graph and a vector database, enabling the storage and retrieval of structured relationships and semantic embeddings. This dual capability allows for efficient querying of interconnected data and similarity-based searches.

  • Vector Indexing: Neo4j implements Hierarchical Navigable Small World (HNSW) indexing for approximate nearest-neighbor searches, facilitating fast and scalable vector queries.
  • Cypher Query Language: Cypher is Neo4j’s declarative graph query language, allowing expressive and efficient data querying in a property graph.
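As a sketch of how the two capabilities combine, the snippet below builds the Cypher statements for creating an HNSW-backed vector index and querying it in Neo4j 5.x. The index name `doc_embeddings`, label `Chunk`, and property `embedding` are illustrative placeholders, not LLMaven's actual schema, and the helper assumes a session from the official `neo4j` Python driver.

```python
# Illustrative Cypher for Neo4j 5.x vector indexes (HNSW-backed).
# Index, label, and property names below are placeholders.

CREATE_INDEX = """
CREATE VECTOR INDEX doc_embeddings IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS {indexConfig: {
  `vector.dimensions`: 768,
  `vector.similarity_function`: 'cosine'
}}
"""

# Approximate nearest-neighbor lookup over the index.
QUERY_TOP_K = """
CALL db.index.vector.queryNodes('doc_embeddings', $k, $query_embedding)
YIELD node, score
RETURN node.text AS text, score
ORDER BY score DESC
"""

def run_similarity_search(session, query_embedding, k=5):
    """Run the vector query with a neo4j Python driver session."""
    result = session.run(QUERY_TOP_K, k=k, query_embedding=query_embedding)
    return [(record["text"], record["score"]) for record in result]
```

Because the same database holds both the graph and the embeddings, a single Cypher query can mix similarity search with relationship traversal.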

šŸ•°ļø Temporal Knowledge Graphs with Graphiti

Graphiti is a framework designed to build and query temporally-aware knowledge graphs, tailored for AI agents operating in dynamic environments.

  • Time-Stamped Relationships: Graphiti allows for the creation of knowledge graphs where relationships are annotated with temporal information, enabling the tracking of changes over time.
  • Integration with Neo4j: Graphiti stores its temporal knowledge graphs in Neo4j, leveraging its graph database capabilities for efficient storage and retrieval.
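The core idea of a time-stamped relationship can be sketched in a few lines: each edge carries a validity interval, and queries are evaluated against a point in time. This is a conceptual illustration only, not Graphiti's actual API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class TemporalEdge:
    # A relationship annotated with the interval during which it held.
    source: str
    relation: str
    target: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means still valid

def facts_at(edges, when):
    """Return the edges that were valid at a given point in time."""
    return [
        e for e in edges
        if e.valid_from <= when and (e.valid_to is None or when < e.valid_to)
    ]

edges = [
    TemporalEdge("alice", "WORKS_ON", "projectA",
                 datetime(2023, 1, 1), datetime(2024, 1, 1)),
    TemporalEdge("alice", "WORKS_ON", "projectB", datetime(2024, 1, 1)),
]

# Asking "what was true in mid-2024" returns only the projectB edge.
current = facts_at(edges, datetime(2024, 6, 1))
```

Invalidating an old fact means closing its interval rather than deleting it, which is what lets the graph track how information evolves.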

🧠 Short-Term Memory: Mem0 and OpenMemory

Mem0 provides a self-improving memory layer for LLM applications, enabling personalized responses while reducing the amount of context that must be stored and re-sent to the model.

  • Selective Extraction: Mem0 focuses on extracting the most salient facts from interactions, reducing the amount of data stored and processed.
  • Asynchronous Updates: The system updates its memory asynchronously, preventing memory management from interfering with real-time interactions.
  • OpenMemory MCP: A private, local-first memory layer powered by Mem0, enabling persistent, context-aware AI across MCP-compatible clients.
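The two bullets above, selective extraction and asynchronous updates, describe a pattern that can be sketched as follows. The keyword-based extractor is a deliberately crude stand-in (a real system like Mem0 would use an LLM for this step), and nothing here reflects Mem0's actual API.

```python
import re

class ShortTermMemory:
    """Toy memory layer: extract salient facts, defer the writes.

    Illustrates the pattern only; Mem0's real interface differs.
    """

    def __init__(self):
        self.store = []       # committed memories
        self._pending = []    # updates queued off the hot path

    def extract_salient(self, message):
        # Placeholder heuristic: keep sentences that look like
        # preferences or facts. A real system would use an LLM here.
        return [s.strip() for s in re.split(r"[.!?]", message)
                if any(kw in s.lower() for kw in ("prefer", "always", "use"))]

    def observe(self, message):
        # Called during the interaction: only enqueue, never block.
        self._pending.extend(self.extract_salient(message))

    def flush(self):
        # Run asynchronously (e.g. in a background task) after the
        # reply has been sent, so memory writes never add latency.
        self.store.extend(self._pending)
        self._pending.clear()

mem = ShortTermMemory()
mem.observe("I prefer short answers. The weather was nice today!")
mem.flush()
```

Separating `observe` from `flush` is what keeps memory management out of the real-time interaction path.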

🧠 Long-Term Memory: Persistent Knowledge Graphs

Long-term memory in LLMaven is managed through persistent knowledge graphs stored in Neo4j. These graphs capture and maintain structured information over extended periods, allowing the AI to recall and reason over past interactions and data.

  • Graphiti Integration: By leveraging Graphiti, the system ensures that long-term memory includes temporal context, enhancing the AI's ability to understand the evolution of information.
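One way long-term facts stay consistent across sessions is idempotent upserts: re-observing a known fact refreshes it instead of duplicating it. The Cypher below sketches this with MERGE; the `Entity`/`RELATES` names and timestamp properties are placeholders, not LLMaven's schema.

```python
# Illustrative idempotent upsert of a long-term fact into Neo4j.
# Labels, relationship types, and properties are placeholders.

UPSERT_FACT = """
MERGE (s:Entity {name: $source})
MERGE (t:Entity {name: $target})
MERGE (s)-[r:RELATES {type: $relation}]->(t)
ON CREATE SET r.first_seen = datetime($observed_at)
SET r.last_seen = datetime($observed_at)
"""

def remember(session, source, relation, target, observed_at):
    """Upsert a fact with the neo4j Python driver. Re-observing an
    existing fact only refreshes last_seen, so writes are idempotent."""
    session.run(UPSERT_FACT, source=source, relation=relation,
                target=target, observed_at=observed_at)
```

The `first_seen`/`last_seen` pair gives each fact a minimal temporal footprint even outside Graphiti's richer interval model.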

🧩 Agentic Framework and Memory Integration

LLMaven employs a modular agentic framework, where specialized agents handle different aspects of the system. Memory components are integrated into this framework to provide context and continuity across interactions.

| Agent | Role Description |
|-------|------------------|
| Supervisor | Orchestrates task planning and coordination |
| Docs Agent | Manages document retrieval and KG interaction |
| Coding Agent | Interacts with code repositories and generates code |
| Data Agent | Handles data retrieval, transformation, and storage |
| Pipeline Agent | Executes and monitors pipeline workflows |
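A supervisor routing tasks to specialized agents can be reduced to a small dispatch table. The agent names mirror the table above, but the routing logic itself is an illustrative sketch, not LLMaven's implementation.

```python
# Minimal supervisor-style dispatch; each agent is stubbed as a function.

AGENTS = {
    "docs": lambda task: f"DocsAgent retrieved context for: {task}",
    "code": lambda task: f"CodingAgent generated code for: {task}",
    "data": lambda task: f"DataAgent processed data for: {task}",
    "pipeline": lambda task: f"PipelineAgent ran workflow: {task}",
}

def supervisor(task, kind):
    """Route a task to the specialized agent registered for its kind."""
    agent = AGENTS.get(kind)
    if agent is None:
        raise ValueError(f"no agent registered for kind {kind!r}")
    return agent(task)

result = supervisor("summarize paper X", "docs")
```

Keeping agents behind a registry like this is what makes the framework modular: memory components can be injected per agent without touching the supervisor.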

🔍 Observability and Evaluation

To ensure transparency and facilitate debugging, LLMaven incorporates observability tools.

  • LogFire: Provides detailed logs of agent interactions, aiding in debugging and performance tracking.
  • Grafana: Offers dashboards for monitoring system metrics, user feedback, and performance.
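The kind of per-call agent trace these tools consume can be sketched with a plain stdlib-logging decorator. This is a generic stand-in, not LogFire's API; a real deployment would swap the logger for a LogFire span.

```python
import functools
import logging
import time

logger = logging.getLogger("llmaven.agents")  # illustrative logger name

def traced(fn):
    """Log the duration and outcome of an agent call (stdlib stand-in
    for a real tracer such as LogFire)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            logger.info("%s ok in %.3fs", fn.__name__,
                        time.perf_counter() - start)
            return result
        except Exception:
            logger.exception("%s failed", fn.__name__)
            raise
    return wrapper

@traced
def answer(question):
    return f"answer to: {question}"
```

Emitting one structured record per call is enough for a Grafana dashboard to chart latency and error rates per agent.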

🚀 Deployment and Scalability

LLMaven is designed for flexible deployment across various environments.

  • Kubernetes & Helm: Container orchestration and package management for scalable deployment.
  • Cloud-Native Tools: Integration with Open WebUI, vLLM, and MinIO enhances functionality and extensibility.

🧭 Future Work

One planned extension of the memory and storage architecture is the addition of a context sufficiency evaluation module. This module will:

  • Automatically evaluate whether the retrieved memory provides enough relevant context to satisfy user needs or complete agent tasks.
  • Score the relevance and completeness of retrieved memory chunks based on task-specific criteria.
  • Enable dynamic querying or fallback strategies when insufficient memory context is detected.

This enhancement will support more robust agent behavior by improving how LLMaven detects and adapts to incomplete or under-contextualized inputs.
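The planned evaluation loop can be sketched with a toy coverage score: the fraction of query terms present in the retrieved chunks, compared against a threshold. A real module would likely use an LLM judge or a trained relevance model; this only illustrates the control flow, and the fallback call in the comment is hypothetical.

```python
def context_sufficiency(query_terms, retrieved_chunks, threshold=0.6):
    """Toy sufficiency check: fraction of query terms covered by the
    retrieved text. Stand-in for a learned relevance/completeness model.
    """
    text = " ".join(retrieved_chunks).lower()
    covered = sum(1 for t in query_terms if t.lower() in text)
    score = covered / len(query_terms) if query_terms else 0.0
    return score, score >= threshold

score, sufficient = context_sufficiency(
    ["neo4j", "vector", "index"],
    ["Neo4j supports vector indexes via HNSW."],
)

# An insufficient score would trigger a fallback strategy, e.g.
# re-querying with an expanded search or walking the knowledge graph
# for additional neighbors.
```

Returning both the score and the boolean lets agents choose between graded behavior (ask for clarification) and hard fallbacks (re-retrieve).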

🧭 Summary

By integrating temporal knowledge graphs, vector search, and modular memory components, LLMaven provides a comprehensive and scalable solution for AI-driven research assistance.