Design Philosophy - uw-ssec/llmaven GitHub Wiki
🔍 Design Philosophy
Welcome to the design philosophy for LLMaven—a platform (not just a framework) for building flexible, reproducible, and extensible multi-agent systems for research workflows. Built by and for research software engineers, LLMaven aims to bring intelligent tooling to the hands of scientists, developers, and institutions who demand rigor, transparency, and purpose.
✨ Core Philosophy
LLMaven is founded on eight design principles that guide every decision, from high-level architecture to individual agent behavior. These are not abstract ideals—they are practical standards derived from real research needs and software engineering practices.
1. 🧭 Human-Centered, Purpose-Driven Design
LLMaven is a platform created for researchers, not for LLM novelty. Everything in the system starts from a question:
"What would a research software engineer need to accomplish their task?"
From flexible UI integrations like OpenWebUI to human-in-the-loop features and output guardrails, every layer is meant to support scientific understanding, control, and trust.
- Collections UI simplifies document input and integrates retrieval-augmented generation (RAG).
- Supervisor Agent enables human-style planning and oversight.
- Grafana and LogFire dashboards ensure transparency for sysadmins and researchers alike.
2. 🧩 Composable, Modular Architecture
LLMaven isn’t a monolith. It’s a platform of interchangeable parts:
- Agents specialize in tasks (Docs, Code, Data, Pipeline) and communicate using a standardized A2A (Agent-to-Agent) protocol.
- MCPs structure interactions with external resources, like Neo4j or Azure Blob.
- Every component is containerized, deployed with Helm on Kubernetes.
This modularity means:
- Agents can be added or removed without breaking the system.
- MCPs can be extended to support new data types.
- Researchers can plug in their own pipelines without touching the core logic.
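The modular agent architecture described above can be sketched as a typed message envelope plus a pluggable registry. This is a minimal illustration only: the field names and registry API are assumptions for this sketch, not the actual A2A protocol or LLMaven internals.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class A2AMessage:
    """Hypothetical A2A-style envelope; fields are illustrative."""
    sender: str        # e.g. "supervisor"
    recipient: str     # e.g. "docs-agent"
    task: str          # what the recipient should do
    payload: dict = field(default_factory=dict)

class AgentRegistry:
    """Agents can be added or removed without touching callers."""
    def __init__(self) -> None:
        self._agents: Dict[str, Callable[[A2AMessage], str]] = {}

    def register(self, name: str, handler: Callable[[A2AMessage], str]) -> None:
        self._agents[name] = handler

    def dispatch(self, msg: A2AMessage) -> str:
        handler = self._agents.get(msg.recipient)
        if handler is None:
            return f"no agent named {msg.recipient!r}"
        return handler(msg)

registry = AgentRegistry()
registry.register("docs-agent", lambda m: f"docs-agent handling {m.task}")
reply = registry.dispatch(A2AMessage("supervisor", "docs-agent", "summarize"))
```

Because callers only address agents by name through the registry, swapping one agent implementation for another never ripples into the rest of the system.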
3. 🎛️ Opinionated but Extensible
LLMaven makes strong design choices:
- vLLM is chosen over Ollama for inference because it scales to many concurrent users.
- Neo4j replaces a dedicated vector database, doing double duty as vector store and knowledge graph.
- OpenWebUI is forked and customized rather than building a UI from scratch.
But we also ensure escape hatches:
- Researchers can write and deploy custom MCPs.
- System prompts are configurable.
- Plugins can wrap any tool, API, or workflow.
You get a clear path to productivity without hitting a wall when your use case is unique.
4. 🔁 Reproducibility by Default
Scientific software demands reproducibility:
- All interactions are logged via LogFire, giving visibility into system decisions.
- All user prompts and feedback are persisted in PostgreSQL.
- Agents communicate using typed protocols (e.g., Graffiti MCP, Cypher MCP) that enforce structure and traceability.
- Containerization ensures that every deployment is self-contained and reproducible anywhere—locally or in the cloud.
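One way to picture the persistence side of this is a structured, timestamped interaction record with a deterministic serialization. The schema here is an assumption for illustration, not the actual LLMaven/PostgreSQL table layout.

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class InteractionRecord:
    """Hypothetical record of one prompt/response exchange."""
    session_id: str
    timestamp: str      # ISO-8601, recorded at request time
    prompt: str
    response: str
    model: str

    def to_json(self) -> str:
        # sort_keys gives a stable, byte-identical serialization,
        # which makes records easy to diff, audit, and replay
        return json.dumps(asdict(self), sort_keys=True)

rec = InteractionRecord("s1", "2025-01-01T00:00:00Z",
                        "summarize the paper", "A summary.", "vllm-model")
```

Freezing the dataclass and sorting keys means two identical exchanges always serialize to the same bytes, which is the property reproducibility auditing depends on.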
5. 🛡️ Guardrails Before Glamour
Before showcasing fancy AI capabilities, LLMaven ensures its outputs are safe, valid, and useful:
- Input Guardrails validate prompt structure.
- Output Guardrails filter junk, enforce formatting, and perform safety checks.
- Fallback Nodes provide graceful degradation in the face of failure.
- Human-in-the-loop Nodes enable users to confirm or reject decisions.
These features reflect the reality that AI is not always right—especially in high-stakes, high-rigor contexts like science.
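The guardrail flow above can be sketched as a simple pipeline: validate the input, call the model, filter the output, and fall back deterministically on failure. The function names and specific checks are assumptions for illustration, not LLMaven's actual guardrail implementation.

```python
from typing import Callable, Optional

def input_guardrail(prompt: str) -> Optional[str]:
    """Return an error string if the prompt is malformed, else None."""
    if not prompt.strip():
        return "empty prompt"
    if len(prompt) > 4000:
        return "prompt too long"
    return None

def output_guardrail(text: str) -> bool:
    """Reject junk output: empty or single-token degenerate text."""
    return bool(text.strip()) and len(set(text.split())) > 1

def run_with_guardrails(prompt: str,
                        model: Callable[[str], str],
                        fallback: Callable[[str], str]) -> str:
    err = input_guardrail(prompt)
    if err:
        return f"rejected: {err}"
    try:
        out = model(prompt)
    except Exception:
        return fallback(prompt)   # fallback node: graceful degradation
    return out if output_guardrail(out) else fallback(prompt)

result = run_with_guardrails("summarize this paper",
                             model=lambda p: "A concise summary.",
                             fallback=lambda p: "Sorry, please retry.")
```

A human-in-the-loop node would slot in as one more stage after the output guardrail, holding the result until a user confirms or rejects it.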
6. 🏗️ Infrastructure as Product
Your developer experience is our product:
- LLMaven uses GitHub Actions for CI/CD.
- Helm charts allow for one-command deployments.
- All core services (PostgreSQL, Neo4j, MinIO, Grafana, etc.) are cloud-native and portable.
No proprietary lock-in. No tangled monoliths. Just tools that work.
7. 💡 Strategic Technical Debt
LLMaven embraces rapid prototyping but keeps a clear boundary between MVP shortcuts and long-term architecture:
- We prioritize shipping core agents (Docs, Code, Data) early.
- We allow deterministic fallback for key actions.
- We track weak areas in model performance to guide fine-tuning.
Everything is designed to scale—but we start lean to validate utility.
8. 🔬 Science-Aware Abstractions
LLMaven models the structure of research:
- Knowledge graphs (Neo4j) reflect evolving knowledge bases.
- MCPs abstract over data modalities: documents, graphs, code, storage.
- Temporal tracking (e.g., Graffiti MCP) lets us monitor how memory changes over time.
- The architecture explicitly separates reasoning (LLMs), control (rules), tools (MCPs), and memory (short- and long-term).
These abstractions aren’t arbitrary. They grow out of an understanding of how research actually works and what software must do to support it.
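The temporal-tracking idea can be illustrated with a toy versioned store: each fact keeps a history of timestamped values, so we can ask what the system "believed" at any point in time. This shows the concept only; it is not the Graffiti MCP's actual data model.

```python
from collections import defaultdict
from typing import Optional

class TemporalMemory:
    """Toy versioned store: each key maps to (timestamp, value) history."""
    def __init__(self) -> None:
        self._history = defaultdict(list)

    def set(self, key: str, value: str, ts: int) -> None:
        self._history[key].append((ts, value))

    def as_of(self, key: str, ts: int) -> Optional[str]:
        """Return the value of `key` as it stood at time `ts`, if any."""
        best_t, best_v = None, None
        for t, v in self._history[key]:
            # keep the latest version recorded at or before ts
            if t <= ts and (best_t is None or t >= best_t):
                best_t, best_v = t, v
        return best_v

mem = TemporalMemory()
mem.set("best_model", "llama-2", ts=1)
mem.set("best_model", "llama-3", ts=5)
```

Querying `as_of` at different timestamps returns different answers, which is exactly the property that lets a system monitor how its memory evolves instead of only seeing the latest state.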
🌍 Conclusion
LLMaven is built for clarity, control, and collaboration in the messy, beautiful world of research. We believe that AI tools should serve people—especially those doing the hard work of scientific discovery. These principles guide how we build, improve, and maintain this platform.
We hope you’ll join us in making it better.