Memory Technologies - antimetal/system-agent GitHub Wiki
Memory Leak Detection Technologies Documentation
This directory documents 23 memory leak detection technologies. Each page includes an implementation plan for system-agent integration, production examples, and academic references.
Quick Navigation
🟢 Production-Ready (Low Overhead)
- Memory-Technologies-Production-Ready-Page-Fault-Tracing - <1% overhead, eBPF-based page fault monitoring
- Memory-Technologies-Production-Ready-PSI-Metrics - 0% overhead, Pressure Stall Information monitoring
- Memory-Technologies-Production-Ready-Brk-Mmap-Tracing - <1% overhead, system call tracing
- Memory-Technologies-Production-Ready-Jemalloc-Profiling - ~4% overhead, statistical sampling profiler
- Memory-Technologies-Production-Ready-TCMalloc-Profiling - ~5% overhead, Google's allocator profiler
- Memory-Technologies-Production-Ready-Mimalloc-Profiling - ~2% overhead, Microsoft's lightweight allocator
- Memory-Technologies-Production-Ready-Hardware-PMC - 0% overhead, CPU performance counters
🟡 Production-Limited (Moderate Overhead)
- Memory-Technologies-Production-Limited-SWAT-Statistical - <5% overhead, Microsoft's stale object detection
- Memory-Technologies-Production-Limited-PRECOG-ML - ~1% overhead, machine learning detection
- Memory-Technologies-Production-Limited-BCC-Memleak-Sampled - 10-30% overhead, sampled eBPF tracing
- Memory-Technologies-Production-Limited-Bytehound - ~20% overhead, Rust-based complete tracking
- Memory-Technologies-Production-Limited-Time-Series-Analysis - 0-5% overhead, statistical analysis
🔴 Development-Only (High Overhead)
- Memory-Technologies-Development-Only-Valgrind-Massif - 2000% overhead, comprehensive heap profiling
- Memory-Technologies-Development-Only-Heaptrack - 50-100% overhead, heap memory profiler
- Memory-Technologies-Development-Only-ASan - 200-300% overhead, AddressSanitizer
- Memory-Technologies-Development-Only-LSan - 150-200% overhead, LeakSanitizer
- Memory-Technologies-Development-Only-BCC-Memleak-Full - 30-400% overhead, complete malloc/free tracing
🔬 Research Prototypes
- Memory-Technologies-Research-Prototypes-LeakGuard - 5-10% overhead, reported zero false positives
- Memory-Technologies-Research-Prototypes-Gencount - 5-15% overhead, age distribution analysis
- Memory-Technologies-Research-Prototypes-Sleigh - 10-20% overhead, stale object detection
☁️ Platform-Specific
- Memory-Technologies-Platform-Specific-Parca - Kubernetes continuous profiling platform
- Memory-Technologies-Platform-Specific-Pixie - Kubernetes-native observability
- Memory-Technologies-Platform-Specific-Pyroscope - Multi-language continuous profiling
Technology Categories
By Detection Method
Direct Allocation Tracking:
- BCC memleak (full and sampled)
- Bytehound
- Valgrind/Massif
- Heaptrack
- ASan/LSan
Statistical Profiling:
- jemalloc
- tcmalloc
- mimalloc
- SWAT
Indirect Signals:
- Page fault tracing
- PSI metrics
- Hardware PMCs
- brk/mmap tracing
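
As a rough illustration of an indirect signal, per-process fault counters can be polled from procfs. This is a swapped-in technique, not the eBPF-based tracing the linked page describes: it only reads the cumulative `minflt` and `majflt` fields (fields 10 and 12) of `/proc/<pid>/stat`. The function names are hypothetical and not part of system-agent.

```python
from __future__ import annotations


def parse_fault_counters(stat_line: str) -> tuple[int, int]:
    """Return (minor_faults, major_faults) from one /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces or ')',
    # so split on the last ')' before tokenizing the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field n lives at rest[n - 3].
    minflt = int(rest[10 - 3])
    majflt = int(rest[12 - 3])
    return minflt, majflt


def fault_rate(before: tuple[int, int], after: tuple[int, int], seconds: float) -> float:
    """Combined minor + major faults per second between two samples."""
    return ((after[0] - before[0]) + (after[1] - before[1])) / seconds
```

A fault rate that keeps climbing while the workload is steady is a hint worth correlating with RSS growth; on its own it does not prove a leak.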
Pattern Analysis:
- Time series analysis
- Precog ML
- LeakGuard
- GenCount
- Sleigh
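
Time series analysis, the first entry in the list above, can be as simple as fitting a line to memory samples and checking its slope. The sketch below is one possible approach, not the method from the linked page: it assumes samples of `(timestamp_seconds, rss_bytes)` collected elsewhere, and the 1 KiB/s threshold is a made-up placeholder that would need tuning per workload.

```python
from __future__ import annotations


def growth_slope(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of (timestamp_seconds, rss_bytes) samples, in bytes/second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("timestamps must not all be equal")
    return num / den


def looks_like_leak(samples: list[tuple[float, float]],
                    min_slope_bytes_per_s: float = 1024.0) -> bool:
    """Flag sustained growth; the threshold is a hypothetical default."""
    return growth_slope(samples) >= min_slope_bytes_per_s
```

Steady growth can also come from cache warm-up or a legitimately growing working set, so a positive result is a reason to escalate to profiling rather than a diagnosis.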
By Deployment Model
Always-On Monitoring:
- PSI metrics
- Page fault tracing
- Hardware PMCs
- brk/mmap tracing
Triggered Profiling:
- jemalloc/tcmalloc
- BCC memleak (sampled)
- Bytehound
Continuous Profiling Platforms:
- Parca
- Pixie
- Pyroscope
Development/CI Only:
- Valgrind
- ASan/LSan
- Heaptrack
Three-Layer Detection Strategy
No single tool covers every case, so the recommended approach layers several technologies, escalating to higher-overhead tools only when cheaper ones report a problem:
Layer 1: Continuous Monitoring (Always On)
- Primary: Memory-Technologies-Production-Ready-PSI-Metrics + Memory-Technologies-Production-Ready-Page-Fault-Tracing
- Overhead: <1%
- Purpose: Early warning and anomaly detection
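
The PSI signal used in Layer 1 is exposed by the kernel as a small text file. Below is a minimal sketch of parsing it, assuming a Linux kernel built with PSI support and the standard `some`/`full` line layout of `/proc/pressure/memory`. The names `parse_psi` and `read_memory_psi` are illustrative and not part of system-agent.

```python
from __future__ import annotations


def parse_psi(text: str) -> dict[str, dict[str, float]]:
    """Parse PSI text such as 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    result: dict[str, dict[str, float]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # First token is the kind ("some" or "full"); the rest are key=value pairs.
        kind, *pairs = line.split()
        result[kind] = {k: float(v) for k, v in (p.split("=", 1) for p in pairs)}
    return result


def read_memory_psi(path: str = "/proc/pressure/memory") -> dict[str, dict[str, float]]:
    """Read and parse the memory pressure file (Linux only)."""
    with open(path) as f:
        return parse_psi(f.read())
```

A rising `some` `avg10` value means tasks are stalling on memory. That indicates pressure, not a leak by itself, so it is best treated as a trigger for the next layer.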
Layer 2: Triggered Investigation (On Anomaly)
- Primary: Memory-Technologies-Production-Ready-Jemalloc-Profiling or Memory-Technologies-Production-Ready-TCMalloc-Profiling
- Overhead: 4-5%
- Duration: 5-10 minutes
- Purpose: Identify leak sources with stack traces
Layer 3: Deep Analysis (Critical Issues)
- Primary: Memory-Technologies-Production-Limited-Bytehound or Memory-Technologies-Production-Limited-BCC-Memleak-Sampled
- Overhead: 20-30%
- Duration: Brief periods
- Purpose: Complete tracking for root cause analysis
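
The escalation between the three layers can be sketched as a small decision function. This is only an illustration: the overhead ceilings are copied from the layer descriptions above, while the `Layer` type, the `select_layer` name, and the two boolean inputs are hypothetical and say nothing about how system-agent actually decides.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    tools: tuple[str, ...]
    max_overhead_pct: float


LAYERS = (
    Layer("continuous", ("PSI metrics", "page fault tracing"), 1.0),
    Layer("triggered", ("jemalloc profiling", "TCMalloc profiling"), 5.0),
    Layer("deep", ("Bytehound", "BCC memleak (sampled)"), 30.0),
)


def select_layer(anomaly_detected: bool, critical: bool) -> Layer:
    """Pick the cheapest layer that matches the current situation."""
    if anomaly_detected and critical:
        return LAYERS[2]  # brief, high-overhead root cause analysis
    if anomaly_detected:
        return LAYERS[1]  # 5-10 minute profiling window
    return LAYERS[0]      # always-on monitoring


print(select_layer(True, False).tools)
```

In this sketch a critical flag without a detected anomaly stays at the continuous layer, so the expensive tools never run without evidence from the cheap ones.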
Selection Guide
For Different Scenarios
If you need... | Use this technology | Why |
---|---|---|
Zero overhead monitoring | Memory-Technologies-Production-Ready-Hardware-PMC or Memory-Technologies-Production-Ready-PSI-Metrics | Hardware-native or kernel tracking |
Production profiling | Memory-Technologies-Production-Ready-Jemalloc-Profiling | Best overhead/accuracy trade-off |
Kubernetes monitoring | Memory-Technologies-Platform-Specific-Pixie or Memory-Technologies-Platform-Specific-Parca | Native K8s integration |
Development debugging | Memory-Technologies-Development-Only-Valgrind-Massif | Most comprehensive |
Quick leak check | Memory-Technologies-Production-Ready-Page-Fault-Tracing | Low overhead, good signals |
Statistical analysis | Memory-Technologies-Production-Limited-SWAT-Statistical | Proven at Microsoft |
Complete tracking | Memory-Technologies-Production-Limited-Bytehound | 100% allocation coverage |
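
When the constraint is an overhead budget rather than a scenario, the same choice can be made by filtering on the overhead figures from the Quick Navigation list. The sketch below uses the upper bound of each stated range (so "<1%" is treated as 1.0) and covers only the production-ready and production-limited entries. The table and function names are illustrative.

```python
from __future__ import annotations

# Upper bound of the overhead range stated for each technology, in percent.
OVERHEAD_UPPER_BOUND_PCT = {
    "PSI metrics": 0.0,
    "Hardware PMC": 0.0,
    "Page fault tracing": 1.0,
    "brk/mmap tracing": 1.0,
    "PRECOG ML": 1.0,
    "mimalloc profiling": 2.0,
    "jemalloc profiling": 4.0,
    "TCMalloc profiling": 5.0,
    "SWAT": 5.0,
    "Time series analysis": 5.0,
    "Bytehound": 20.0,
    "BCC memleak (sampled)": 30.0,
}


def within_budget(max_overhead_pct: float) -> list[str]:
    """Technologies whose stated worst-case overhead fits the budget, cheapest first."""
    table = OVERHEAD_UPPER_BOUND_PCT
    names = [name for name, pct in table.items() if pct <= max_overhead_pct]
    return sorted(names, key=lambda name: (table[name], name))
```

Overhead is only one axis: a technology that fits the budget may still lack the stack-trace detail needed to locate a leak, which is what the scenario table above captures.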
Performance vs Accuracy Trade-offs
High Accuracy, High Overhead:
Valgrind > ASan/LSan > Heaptrack > Bytehound > BCC full
Balanced (Production-suitable):
jemalloc ≈ tcmalloc > SWAT > Parca/Pixie > BCC sampled
Low Overhead, Lower Accuracy:
Hardware PMCs ≈ PSI < Page faults < brk/mmap < Time series
Implementation Status
Status | Count | Technologies |
---|---|---|
✅ Production Ready | 12 | PSI, Page faults, jemalloc, tcmalloc, mimalloc, PMCs, SWAT, brk/mmap, Time series, Continuous platforms (Parca, Pixie, Pyroscope) |
⚠️ Limited Production | 6 | BCC sampled, Bytehound, Precog, Research prototypes (LeakGuard, GenCount, Sleigh) |
❌ Development Only | 5 | Valgrind, Heaptrack, ASan, LSan, BCC full |
Key Findings
- Page fault tracing is underutilized - Provides excellent signals at <1% overhead
- Modern allocators are production-ready - jemalloc/tcmalloc's 4-5% overhead is acceptable
- Hardware PMCs offer zero overhead - But require expertise to interpret
- Statistical approaches work - SWAT proves <5% overhead is achievable
- Layered approach is optimal - No single tool solves all cases
References
- Memory-Technologies-Comparison-Matrix - Complete comparison matrix
- Memory-Technologies-Findings-Summary - Research findings summary
- Memory-Technologies-PMC-Research-Papers-Bibliography - Academic papers and references
Contributing
When adding new technologies:
- Follow the established document template
- Include production examples where available
- Provide implementation code for system-agent
- Add academic references and benchmarks
- Update this index and the comparison matrix