30_LLD_Configuration System - asavschaeffer/globule GitHub Wiki
# Configuration System - Low Level Design
*Version: 1.0*
*Date: 2025-07-11*
*Status: Draft for Review*
## 1. Introduction
This document provides the detailed low-level design for Globule's Configuration System, translating the high-level requirements into concrete technical specifications. It builds upon the extensive research conducted and documented in "Low-Level Design of Globule's Configuration System.txt".
### 1.1 Scope
This LLD covers:
- Storage format and file locations
- Loading and validation strategies
- Configuration cascade implementation
- User-facing API design
- Performance optimizations
- Error handling and recovery
### 1.2 Dependencies from HLD
From the High Level Design document:
- Three-tier cascade model (System → User → Context)
- Support for hot-reloading in development
- Schema validation and type safety
- User empowerment through progressive configuration
## 2. Storage Format Decision
### 2.1 Selected Format: YAML with ruamel.yaml
**Decision**: YAML is selected as the primary configuration format.
**Rationale**:
- Superior human readability for complex nested structures
- Native comment support with preservation via ruamel.yaml
- Matches user expectations from similar tools (Kubernetes, CI/CD)
- Handles deeply nested structures required for schemas and LLM prompts
**Security Mitigation**:
- Mandatory use of yaml.safe\_load()
for all parsing operations
- No direct execution of YAML content
- Strict Pydantic validation post-parsing
### 2.2 File Locations (XDG Compliance)
Configuration files follow the XDG Base Directory Specification:
System defaults: /etc/globule/config.yaml (read-only)
User preferences: $XDG\_CONFIG\_HOME/globule/config.yaml
(defaults to ~/.config/globule/config.yaml)
Context configs: $XDG\_CONFIG\_HOME/globule/contexts/\*.yaml
Schema definitions: $XDG\_CONFIG\_HOME/globule/schemas/\*.yaml
### 2.3 File Structure Example
\# System defaults (/etc/globule/config.yaml)
version: "1.0"
system:
processing\_transparency: "concise"
file\_organization: "semantic"
ai\_models:
embedding: "mxbai-embed-large"
parsing: "llama3.2:3b"
hot\_reload:
enabled: false # Production default
\# User preferences (~/.config/globule/config.yaml)
version: "1.0"
user:
processing\_transparency: "verbose"
theme: "dark"
synthesis:
default\_cluster\_view: true
ai\_suggestions\_aggression: "moderate"
\# Context override (~/.config/globule/contexts/creative\_writing.yaml)
version: "1.0"
context:
name: "creative\_writing"
parent: null # No inheritance for MVP
overrides:
processing\_transparency: "verbose"
synthesis:
ai\_suggestions\_aggression: "proactive"
show\_semantic\_connections: true
## 3. Configuration Loading Strategy
### 3.1 Load Sequence
1. **System defaults** loaded from package-bundled YAML
2. **User preferences** loaded from XDG config directory
3. **Active context** (if any) loaded from contexts directory
4. **Environment variables** processed via Pydantic BaseSettings
5. **Merge cascade** applied in precedence order
6. **Validation** against GlobalConfig Pydantic model
### 3.2 Hot-Reload Mechanism
**Development Mode** (hot\_reload.enabled: true
):
- Uses watchdog to monitor configuration files
- Detection latency: 500-700ms
- Triggers full application restart for consistency
- All reload events logged with timestamp
**Production Mode** (hot\_reload.enabled: false
):
- No file watching
- Configuration changes require explicit restart
- Ensures stability and predictable behavior
### 3.3 Error Handling
class ConfigurationError(Exception):
"""Base exception for configuration-related errors"""
pass
class ConfigurationLoadError(ConfigurationError):
"""Raised when configuration files cannot be loaded"""
pass
class ConfigurationValidationError(ConfigurationError):
"""Raised when configuration fails Pydantic validation"""
pass
**Recovery Strategy**:
1. Log detailed error with file location and line number
2. Fall back to Last Known Good (LKG) configuration
3. If no LKG exists, use system defaults only
4. Notify user via CLI of degraded configuration
## 4. Configuration Cascade Implementation
### 4.1 Cascade Resolution Order
1\. Environment variables (highest precedence)
2\. Active context configuration
3\. User preferences
4\. System defaults (lowest precedence)
### 4.2 Nested Key Access
Keys use dot notation for hierarchical access:
- synthesis.ai\_suggestions.aggression
- ai\_models.embedding
Implementation uses recursive dictionary traversal with graceful handling of missing intermediate keys.
### 4.3 Context Inheritance (Post-MVP)
For MVP, contexts are independent. Future enhancement will support one-level inheritance:
context:
name: "creative\_writing.novel"
parent: "creative\_writing"
## 5. Schema Definition
### 5.1 Core Pydantic Models
from pydantic import BaseModel, Field, validator
from pydantic\_settings import BaseSettings
from typing import Literal, Optional, Dict, Any
from pathlib import Path
class AIModelsConfig(BaseModel):
"""AI model configuration"""
embedding: str = Field(
default="mxbai-embed-large",
description="Model for semantic embeddings"
)
parsing: str = Field(
default="llama3.2:3b",
description="Model for structural parsing"
)
class SynthesisConfig(BaseModel):
"""Interactive synthesis engine configuration"""
default\_cluster\_view: bool = True
ai\_suggestions\_aggression: Literal\["passive", "moderate", "proactive"] = "moderate"
show\_semantic\_connections: bool = True
progressive\_discovery\_depth: int = Field(default=2, ge=1, le=5)
class HotReloadConfig(BaseModel):
"""Hot reload configuration"""
enabled: bool = False
watch\_delay: float = Field(default=0.5, ge=0.1, le=5.0)
class GlobalConfig(BaseSettings):
"""Root configuration model with environment variable support"""
version: str = Field(default="1.0", const=True)
processing\_transparency: Literal\["silent", "concise", "verbose"] = "concise"
file\_organization: Literal\["semantic", "chronological", "hybrid"] = "semantic"
theme: Literal\["light", "dark", "auto"] = "auto"
ai\_models: AIModelsConfig = Field(default\_factory=AIModelsConfig)
synthesis: SynthesisConfig = Field(default\_factory=SynthesisConfig)
hot\_reload: HotReloadConfig = Field(default\_factory=HotReloadConfig)
class Config:
env\_prefix = "GLOBULE\_"
env\_nested\_delimiter = "\_\_"
env\_file = ".env"
env\_file\_encoding = "utf-8"
### 5.2 Validation Rules
- Version field ensures configuration compatibility
- Literal types enforce valid enum values
- Nested models provide structured validation
- Field constraints (e.g., ge=1, le=5
) ensure reasonable values
### 5.3 Environment Variable Mapping
\# Maps to processing\_transparency
export GLOBULE\_PROCESSING\_TRANSPARENCY=verbose
\# Maps to ai\_models.embedding
export GLOBULE\_AI\_MODELS\_\_EMBEDDING=all-MiniLM-L6-v2
\# Maps to synthesis.ai\_suggestions\_aggression
export GLOBULE\_SYNTHESIS\_\_AI\_SUGGESTIONS\_AGGRESSION=proactive
## 6. User-Facing API
### 6.1 CLI Commands
\# Get configuration value
globule config get <key>
globule config get synthesis.ai\_suggestions\_aggression
\# Set configuration value (saves to user config)
globule config set <key> <value>
globule config set theme dark
\# Set context-specific value
globule config set --context creative\_writing synthesis.ai\_suggestions\_aggression proactive
\# List all configuration
globule config list \[--show-source]
\# Open configuration in editor
globule config edit \[--context <name>]
\# Validate configuration
globule config validate \[--file <path>]
\# Show active configuration cascade
globule config cascade
### 6.2 Programmatic API
class ConfigManager:
"""Main configuration management interface"""
def \_\_init\_\_(self,
system\_config\_path: Optional\[Path] = None,
user\_config\_path: Optional\[Path] = None,
context\_configs\_dir: Optional\[Path] = None):
"""Initialize with optional custom paths"""
def get(self, key: str, context: Optional\[str] = None) -> Any:
"""Get configuration value with cascade resolution"""
def set(self, key: str, value: Any, target: str = "user") -> None:
"""Set configuration value in specified target"""
def validate(self) -> GlobalConfig:
"""Validate and return typed configuration object"""
def reload(self) -> None:
"""Manually trigger configuration reload"""
def export(self, format: str = "yaml") -> str:
"""Export effective configuration"""
### 6.3 Schema Management API
\# Create new schema from template
globule schema create <name> --template <type>
\# Validate schema file
globule schema validate <file>
\# List available schemas
globule schema list
\# Edit schema
globule schema edit <name>
## 7. Performance Specifications
### 7.1 Caching Strategy
- In-memory cache using Python dictionaries
- Cache invalidation on file change detection
- Lazy loading for context configurations
### 7.2 Performance Targets
| Operation | Target Latency | Notes |
|-----------|---------------|-------|
| Config key access (cached) | <1μs | Direct memory access |
| Config key access (miss) | <50ms | File parse + validation |
| Initial load | <200ms | All configs + validation |
| Hot reload detection | 500-700ms | Watchdog file system events |
| CLI get command | <100ms | Including process startup |
| CLI set command | <150ms | Including file write + validation |
### 7.3 Schema Compilation (Future)
Post-MVP optimization using Pydantic's create\_model()
to dynamically generate models from user schemas.
## 8. Error Messages and Logging
### 8.1 User-Facing Error Messages
\# Validation error
Error: Invalid configuration value
File: ~/.config/globule/config.yaml
Line: 15
Field: synthesis.ai\_suggestions\_aggression
Value: "aggressive"
Valid options: passive, moderate, proactive
\# File not found
Error: Configuration file not found
Expected: ~/.config/globule/contexts/work.yaml
Suggestion: Run 'globule config create-context work'
\# Malformed YAML
Error: Invalid YAML syntax
File: ~/.config/globule/config.yaml
Line: 22
Issue: Inconsistent indentation (expected 2 spaces, found 3)
### 8.2 Logging Format
\# Configuration reload
2025-07-11 10:23:45 INFO \[config] Configuration reloaded from ~/.config/globule/config.yaml
2025-07-11 10:23:45 INFO \[config] Active context: creative\_writing
2025-07-11 10:23:45 DEBUG \[config] Cascade: env(2) > creative\_writing(5) > user(3) > system(10)
\# Validation failure with fallback
2025-07-11 10:24:12 ERROR \[config] Validation failed for user config, using LKG
2025-07-11 10:24:12 ERROR \[config] Details: synthesis.progressive\_discovery\_depth = 10 (max: 5)
## 9. Thread Safety
### 9.1 Concurrent Access
- ConfigManager instances are thread-safe for read operations
- Write operations acquire a threading.Lock
- Cache updates are atomic
### 9.2 Implementation
import threading
class ConfigManager:
def \_\_init\_\_(self):
self.\_lock = threading.RLock()
self.\_cache = {}
def get(self, key: str) -> Any:
# Read-only, no lock needed if cache is populated
return self.\_get\_from\_cache(key)
def set(self, key: str, value: Any) -> None:
with self.\_lock:
# Validate, update file, invalidate cache
self.\_update\_config(key, value)
## 10. Migration and Compatibility
### 10.1 Version Management
- Configuration files include version field
- System validates version compatibility on load
- Future versions will include migration tools
### 10.2 Backward Compatibility
- New optional fields have defaults
- Deprecated fields logged but still functional
- Major version changes require explicit migration
## 11. Testing Requirements
### 11.1 Unit Tests
- Cascade resolution with all precedence combinations
- YAML parsing with malformed inputs
- Pydantic validation edge cases
- Environment variable parsing and type conversion
- Thread safety under concurrent access
### 11.2 Integration Tests
- Hot reload with actual file changes
- CLI commands with various inputs
- Schema validation pipeline
- Error recovery scenarios
### 11.3 Performance Tests
- Cache hit rate measurement
- Load time with various config sizes
- Memory usage profiling
- Concurrent access stress testing
## 12. Security Considerations
### 12.1 Input Validation
- All YAML parsing uses safe_load
- File paths sanitized to prevent directory traversal
- Environment variables filtered by prefix
### 12.2 Access Control
- System config files require elevated permissions to modify
- User cannot modify system defaults via CLI
- Context files isolated to user directory
## 13. Future Enhancements (Post-MVP)
1. **Context Inheritance**: One-level parent-child relationships
2. **Schema Compiler**: Dynamic Pydantic model generation
3. **Configuration Server**: For distributed deployments
4. **Encryption**: For sensitive configuration values
5. **Audit Trail**: Detailed change history with rollback
## 14. Decision Log
| Decision | Rationale | Date |
|----------|-----------|------|
| YAML over TOML | Better for deeply nested schemas, comment preservation | 2025-07-11 |
| Full restart for hot-reload | Reliability over speed in development | 2025-07-11 |
| No deep context inheritance | Simplicity and predictability for MVP | 2025-07-11 |
| Pydantic for validation | Type safety and IDE support | 2025-07-11 |
| XDG compliance | Standard locations, user expectations | 2025-07-11 |
## 15. Open Questions for Review
1. Should we support YAML anchors/aliases for configuration reuse?
2. Do we need a globule config diff
command to compare configurations?
3. Should context activation be explicit or derived from current directory?
4. Is the 500-700ms hot-reload latency acceptable for development?
5. Should we implement partial config exports (e.g., only synthesis settings)?
*This LLD is ready for review. Once approved, it will serve as the definitive specification for implementing the Configuration System.*