Infrastructure and Deployment - uw-ssec/llmaven GitHub Wiki
🛠️ Infrastructure and Deployment
LLMaven is architected as a modular, cloud-native, and agent-driven platform designed to support a wide range of research workflows. Its infrastructure emphasizes portability, observability, and scalability, enabling deployment across cloud, on-premises, and local environments with minimal friction.
🚀 Kubernetes: The Foundation of Orchestration
Kubernetes serves as the backbone of LLMaven's infrastructure, providing robust container orchestration capabilities.
- Why Kubernetes?
  - Declarative configuration for reproducibility
  - Native support for service discovery and load balancing
  - Scalable, fault-tolerant deployments
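To illustrate the declarative model, the sketch below builds a minimal Kubernetes Deployment manifest as a plain Python dictionary and serializes it to JSON, a format `kubectl apply -f` accepts alongside YAML. The service name `llmaven-ui`, the image reference, and the replica count are hypothetical placeholders, not values taken from the LLMaven charts.

```python
import json


def build_deployment(name: str, image: str, replicas: int = 2) -> dict:
    """Return a minimal apps/v1 Deployment manifest as a dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # The desired state: Kubernetes keeps this many pods running.
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical name and image, for illustration only.
manifest = build_deployment("llmaven-ui", "example.org/llmaven-ui:0.1.0", replicas=3)
manifest_json = json.dumps(manifest, indent=2)
```

Because the manifest describes a desired state rather than a sequence of commands, the same file can be kept in version control and reapplied to any cluster.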
📦 Helm: Streamlining Deployment
Helm is used as the package manager for Kubernetes.
- Helm Charts: Encapsulate deployment logic
- Environment-Specific Configs: Use templating to manage staging, dev, and production
- Repeatability: Ensure version-controlled, consistent deployments
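The environment-specific configuration described above can be pictured as layered values: a base file plus a per-environment override. The following is a rough Python sketch of how Helm combines values files passed with repeated `-f` flags, where later files take precedence. The keys `replicaCount` and `postgres.storage` are assumptions made for illustration and may not match the actual LLMaven chart.

```python
from copy import deepcopy


def merge_values(base: dict, override: dict) -> dict:
    """Deep-merge two values mappings; keys in `override` take precedence.

    Nested mappings are merged recursively; scalars and lists are replaced.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical values; the real chart's keys may differ.
base_values = {"replicaCount": 1, "postgres": {"enabled": True, "storage": "10Gi"}}
production_values = {"replicaCount": 3, "postgres": {"storage": "100Gi"}}

rendered = merge_values(base_values, production_values)
```

Here the production layer overrides only the replica count and the storage size, while `postgres.enabled` is inherited from the base values.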
☁️ Cloud-Native Components
LLMaven integrates several modular, open-source, and cloud-ready tools:
Component | Purpose |
---|---|
Open WebUI | Web user interface; backed by PostgreSQL instead of the default SQLite for scalability |
PostgreSQL | SQL database storing Open WebUI interaction data |
Neo4j | Vector-enabled graph database for RAG & temporal reasoning |
vLLM | High-throughput inference engine for LLMs |
MinIO | S3-compatible object storage (local or remote) |
Grafana | Observability dashboards, PostgreSQL-integrated metrics |
Pydantic Logfire | Agent interaction logging and fine-grained traceability |
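As an illustration of the SQLite-to-PostgreSQL switch listed in the table, the sketch below assembles a PostgreSQL connection URL. It assumes the web UI reads its database location from a `DATABASE_URL` environment variable, as the Open WebUI documentation describes. The user, password, host, and database names are hypothetical.

```python
from urllib.parse import quote


def postgres_url(user: str, password: str, host: str, database: str, port: int = 5432) -> str:
    """Build a PostgreSQL connection URL, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Hypothetical credentials and host, for illustration only.
env = {
    "DATABASE_URL": postgres_url("openwebui", "p@ss/word", "postgres.llmaven.svc", "openwebui"),
}
```

Percent-encoding matters when a password contains characters such as `@` or `/`, which would otherwise be parsed as URL delimiters.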
🌐 Deployment Flexibility
LLMaven is deployment-agnostic and works in multiple environments:
- Cloud Deployments
  - Compatible with AWS, Azure, and GCP
  - Uses managed Kubernetes clusters (e.g., AKS, EKS, GKE)
- On-Premises Deployments
  - For organizations with strict compliance or data localization requirements
- Local Development
  - Fast iteration and debugging workflows with reproducible setups
💠 Initial Deployment: Azure Example
The initial deployment of LLMaven is configured for Microsoft Azure:
- Orchestration: Azure Kubernetes Service (AKS)
- Authentication: Azure Active Directory, now named Microsoft Entra ID (optional)
- Storage: Azure Blob Storage, optionally replaced by MinIO
📊 Observability and Monitoring
Observability is a first-class concern within LLMaven:
- Grafana
  - System-wide dashboards for latency, usage, and model activity
- Pydantic Logfire
  - Logs detailed agent flows and decision chains for debugging and evaluation
🔐 Security and Authentication
LLMaven uses federated authentication methods:
- Google, GitHub, Microsoft OAuth for user sign-in
- GitHub Tokens for coding agent interactions with private/public repositories
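As a sketch of token-based repository access, the snippet below prepares an authenticated GitHub REST API request without sending it. How LLMaven's coding agent actually calls GitHub is not described here, so the helper `github_request` is hypothetical and the token is a placeholder.

```python
import urllib.request


def github_request(path: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated GitHub REST API request."""
    request = urllib.request.Request(f"https://api.github.com{path}")
    # GitHub accepts personal access tokens as Bearer credentials.
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Accept", "application/vnd.github+json")
    return request


# Placeholder token; in practice it would come from a secret store, never from source code.
request = github_request("/repos/uw-ssec/llmaven", token="<token>")
```

Passing the prepared request to `urllib.request.urlopen` would perform the call. Scoping the token to the repositories the agent needs limits the impact of a leak.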
📚 Additional Resources
- Kubernetes Official Docs
- Helm Documentation
- Neo4j Vector Search
- Grafana Dashboards
- Pydantic Logfire
LLMaven’s infrastructure is intentionally flexible and modular, built to support reproducible, extensible, and secure scientific workflows at scale.