Exploratory Matrix - antimetal/system-agent GitHub Wiki

Memory Leak Detection Technologies Comparison Matrix

Overview

This matrix compares all researched memory leak detection approaches across multiple dimensions including performance overhead, accuracy, deployment complexity, and production readiness.

Scoring Legend

Overhead: Percentage performance impact (lower is better)
Accuracy: 🟢 High (>90%) | 🟡 Medium (60-90%) | 🔴 Low (<60%)
False Positives: 🟢 Low (<10%) | 🟡 Medium (10-30%) | 🔴 High (>30%)
Setup Complexity: 🟢 Easy | 🟡 Moderate | 🔴 Complex
Production Ready: ✅ Yes | ⚠️ Limited | ❌ No
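
The thresholds in the legend can be read as simple bucketing rules. The sketch below is only an illustration of that mapping; the function names are hypothetical, and the handling of the exact boundary values (90%, 10%, 30%) follows the ranges as written above.

```python
def rate_accuracy(percent: float) -> str:
    """Map an accuracy percentage to the legend's rating (hypothetical helper)."""
    if percent > 90:
        return "High"      # legend: >90%
    if percent >= 60:
        return "Medium"    # legend: 60-90%
    return "Low"           # legend: <60%


def rate_false_positives(percent: float) -> str:
    """Map a false-positive percentage to the legend's rating (hypothetical helper)."""
    if percent < 10:
        return "Low"       # legend: <10%
    if percent <= 30:
        return "Medium"    # legend: 10-30%
    return "High"          # legend: >30%
```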

Comprehensive Comparison Matrix

Technology	Overhead	Accuracy	False Positives	Granularity	Setup	Prod Ready	Restart Required	Stack Traces	Platform Requirements	Key Limitations
Page Fault Tracing	<1%	🟡 Medium	🟢 Low	Coarse	🟢 Easy	✅ Yes	No	Yes*	Linux 4.14+, Frame pointers	Indirect detection only
jemalloc Profiling	~4%	🟢 High	🟢 Low	Fine	🟢 Easy	✅ Yes	No**	Yes	LD_PRELOAD support	Sampling may miss small leaks
tcmalloc Profiling	~5%	🟢 High	🟢 Low	Fine	🟢 Easy	✅ Yes	No**	Yes	LD_PRELOAD support	Google ecosystem focused
mimalloc Profiling	~2%	🟡 Medium	🟢 Low	Medium	🟢 Easy	✅ Yes	Yes	Limited	Windows/Linux/macOS	Limited profiling features
PSI + Metrics	0%	🔴 Low	🔴 High	Very Coarse	🟢 Easy	✅ Yes	No	No	Linux 4.20+	Detection only, no root cause
Hardware PMCs	0%	🟡 Medium	🟡 Medium	Coarse	🔴 Complex	⚠️ Limited	No	No	Intel/AMD CPUs, root access	Requires expertise to interpret
SWAT (Statistical)	<5%	🟢 High	🟢 Low	Fine	🟡 Moderate	✅ Yes	No	Yes	Windows/Linux	Requires baseline period
Precog (ML)	~1%	🟡 Medium	🟡 Medium	Coarse	🟡 Moderate	⚠️ Limited	No	No	Training data required	Needs historical data
BCC memleak (sampled)	10-30%	🟢 High	🟢 Low	Fine	🟡 Moderate	⚠️ Limited	No	Yes	Linux 4.6+, BCC tools	Still significant overhead
BCC memleak (full)	30-400%	🟢 High	🟢 Low	Very Fine	🟡 Moderate	❌ No	No	Yes	Linux 4.6+, BCC tools	Unsuitable for production
ByteHound	~20%	🟢 High	🟢 Low	Very Fine	🟡 Moderate	⚠️ Limited	Yes	Yes	Linux, Rust runtime	Requires process restart
Parca	1-2%	🟡 Medium	🟢 Low	Fine	🔴 Complex	✅ Yes	No	Yes	Kubernetes, eBPF	Additional infrastructure
Pixie	1-2%	🟡 Medium	🟢 Low	Fine	🔴 Complex	✅ Yes	No	Yes	Kubernetes only	K8s specific
Pyroscope	1-2%	🟡 Medium	🟢 Low	Fine	🟡 Moderate	✅ Yes	No	Yes	Multi-platform	Server infrastructure needed
Valgrind/Massif	2000%	🟢 High	🟢 Low	Very Fine	🟢 Easy	❌ No	Yes	Yes	Linux/macOS	Dev only, serializes threads
Heaptrack	50-100%	🟢 High	🟢 Low	Very Fine	🟢 Easy	❌ No	Yes	Yes	Linux	Dev/debug only
ASAN	200-300%	🟢 High	🟢 Low	Very Fine	🟡 Moderate	❌ No	Rebuild	Yes	Compiler support	Requires recompilation
LSAN	150-200%	🟢 High	🟢 Low	Very Fine	🟡 Moderate	❌ No	Rebuild	Yes	LLVM/GCC	Requires recompilation
brk/mmap Tracing	<1%	🔴 Low	🔴 High	Very Coarse	🟢 Easy	✅ Yes	No	Yes	Linux, eBPF	Only heap expansion
LeakGuard	5-10%	🟢 High	🟢 Low	Fine	🟡 Moderate	⚠️ Limited	No	Yes	Research prototype	Not widely available
GenCount	5-15%	🟡 Medium	🟡 Medium	Fine	🟡 Moderate	⚠️ Limited	No	Yes	Research prototype	Academic tool
Sleigh	10-20%	🟡 Medium	🟡 Medium	Fine	🟡 Moderate	⚠️ Limited	No	Yes	Research prototype	Limited deployment

*Requires frame pointers to be enabled (-fno-omit-frame-pointer)
**Profiling can be enabled at runtime with mallctl() or environment variables, provided the allocator is already loaded into the process
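
One way to use the matrix is to filter it by an overhead budget and then rank what remains by accuracy. The sketch below transcribes a handful of rows for illustration; the `Tool` type, the `shortlist` function, and the choice to record the upper bound of each overhead range (for example "<1%" as 1.0 and "10-30%" as 30.0) are assumptions made here, not part of the original research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    max_overhead_pct: float  # upper bound of the Overhead column (assumption)
    accuracy: str            # "High", "Medium", or "Low"
    restart_required: bool


# A few rows transcribed from the matrix above.
TOOLS = [
    Tool("Page Fault Tracing", 1.0, "Medium", False),
    Tool("jemalloc Profiling", 4.0, "High", False),
    Tool("PSI + Metrics", 0.0, "Low", False),
    Tool("BCC memleak (sampled)", 30.0, "High", False),
    Tool("ByteHound", 20.0, "High", True),
    Tool("Valgrind/Massif", 2000.0, "High", True),
]

ACCURACY_RANK = {"High": 0, "Medium": 1, "Low": 2}


def shortlist(tools, overhead_budget_pct, allow_restart=False):
    """Return tools within the overhead budget, most accurate first."""
    candidates = [
        t for t in tools
        if t.max_overhead_pct <= overhead_budget_pct
        and (allow_restart or not t.restart_required)
    ]
    # Ties in accuracy are broken by lower overhead.
    return sorted(candidates, key=lambda t: (ACCURACY_RANK[t.accuracy], t.max_overhead_pct))


if __name__ == "__main__":
    for tool in shortlist(TOOLS, overhead_budget_pct=5.0):
        print(tool.name, tool.max_overhead_pct, tool.accuracy)
```

With a 5% budget and no restarts allowed, this shortlist keeps jemalloc profiling, page fault tracing, and PSI, in that order.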

Detailed Capability Matrix

Technology	Detects Slow Leaks	Detects Fast Leaks	Kernel Memory	User Memory	Language Agnostic	Real-time Detection	Historical Analysis	Root Cause Analysis
Page Fault Tracing	✅	✅	❌	✅	✅	✅	❌	⚠️
jemalloc Profiling	⚠️	✅	❌	✅	✅	✅	✅	✅
tcmalloc Profiling	⚠️	✅	❌	✅	✅	✅	✅	✅
mimalloc Profiling	⚠️	✅	❌	✅	✅	✅	⚠️	⚠️
PSI + Metrics	✅	✅	✅	✅	✅	✅	❌	❌
Hardware PMCs	✅	✅	✅	✅	✅	✅	❌	❌
SWAT (Statistical)	✅	⚠️	❌	✅	❌	✅	✅	✅
Precog (ML)	✅	⚠️	✅	✅	✅	⚠️	✅	❌
BCC memleak	✅	✅	✅*	✅	✅	✅	❌	✅
ByteHound	✅	✅	❌	✅	✅	✅	✅	✅
Parca	✅	✅	❌	✅	⚠️	✅	✅	✅
Pixie	✅	✅	❌	✅	⚠️	✅	✅	✅
Pyroscope	✅	✅	❌	✅	✅	✅	✅	✅
Valgrind/Massif	✅	✅	❌	✅	✅	✅	✅	✅
Heaptrack	✅	✅	❌	✅	✅	✅	✅	✅
ASAN/LSAN	✅	✅	❌	✅	❌	✅	❌	✅

*BCC memleak can trace kernel allocations (kmalloc/kfree)

Use Case Recommendations

By Deployment Scenario

Scenario	Primary Choice	Secondary Choice	Avoid
Always-on Production	PSI + Page Faults	jemalloc (if ~4% overhead is acceptable)	Valgrind, ASAN, Full memleak
Kubernetes	Pixie	Parca	Non-container aware tools
High-Performance Systems	Hardware PMCs	Page Fault Tracing	Any allocator instrumentation
Development/Testing	Valgrind/ASAN	Heaptrack	-
Quick Investigation	jemalloc profiling	Sampled BCC memleak	Full tracing
Deep Root Cause Analysis	ByteHound	Full BCC memleak	Surface-level metrics
Java Applications	JVM Native Tools	-	C/C++ specific tools
Embedded Systems	Custom lightweight	mimalloc	Heavy profilers

By Leak Characteristics

Leak Type	Best Tools	Why
Slow, gradual leaks	PSI + Metrics, Page Faults	Low overhead for long-term monitoring
Fast, obvious leaks	jemalloc, tcmalloc	Quick detection with stack traces
Small, intermittent	ByteHound, Full memleak	Need complete tracking
Unknown source	SWAT, Statistical approaches	Pattern recognition helps
Container escapes	Pixie, Parca	Container-aware
Kernel memory	BCC memleak (kernel mode)	Specialized for kernel
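
For slow, gradual leaks, long-term monitoring usually comes down to spotting a sustained upward trend in a cheap metric. The sketch below fits a least-squares line to resident set size (RSS) samples and flags growth above a threshold. It is an illustrative assumption, not a method taken from any tool in the matrix; the function names and the threshold value are hypothetical and would need tuning against a baseline.

```python
def growth_rate(samples):
    """Least-squares slope of (time_seconds, rss_bytes) samples, in bytes per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    return cov / var_t


def looks_like_slow_leak(samples, min_bytes_per_sec=1024.0):
    """Flag a sustained upward trend; the default threshold is an arbitrary placeholder."""
    return growth_rate(samples) >= min_bytes_per_sec
```

A positive slope alone does not prove a leak (caches and warm-up also grow), so a flag like this is best treated as a trigger for one of the finer-grained tools above.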

Quantitative Performance Comparison

Metric	Best Performers	Acceptable	Poor
CPU Overhead	PMCs (0%), PSI (0%)	Page Faults (<1%), Parca (1-2%)	Valgrind (2000%)
Memory Overhead	PSI, PMCs, Page Faults	jemalloc (~10%)	ASAN (2-3x)
Latency Impact	PMCs, PSI	jemalloc (+10% P99)	Valgrind (serialization)
Detection Speed	Direct tracing	Statistical (minutes)	ML approaches (hours)
Accuracy	Valgrind, ASAN, ByteHound	jemalloc, BCC	PSI, Metrics only

Implementation Effort Comparison

Approach	Lines of Code	Dependencies	Maintenance	Expertise Required
PSI Monitoring	~50	/proc/pressure	Low	Low
Page Fault eBPF	~200	BCC/bpftrace	Medium	Medium
jemalloc Integration	~100	jemalloc lib	Low	Low
PMC Analysis	~500	perf, PAPI	High	High
ML Detection	~1000+	sklearn, data pipeline	High	High
ByteHound	~50 (usage)	ByteHound binary	Medium	Medium
Parca/Pixie	~200	K8s, operators	High	Medium
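
To give a sense of why PSI monitoring sits at the low end of the effort scale, the sketch below parses the `some` and `full` lines exposed under /proc/pressure/memory on kernels with PSI enabled. The function names and the alert threshold are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def parse_psi(text: str) -> dict:
    """Parse PSI lines such as 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def read_memory_psi(path: str = "/proc/pressure/memory") -> dict:
    """Read and parse the memory pressure file (requires Linux with PSI enabled)."""
    return parse_psi(Path(path).read_text())


def memory_pressure_alert(psi: dict, some_avg60_threshold: float = 10.0) -> bool:
    """Return True when the 60-second 'some' average meets the (placeholder) threshold."""
    return psi.get("some", {}).get("avg60", 0.0) >= some_avg60_threshold
```

As the matrix notes, this only signals that memory pressure exists; it does not identify which process or call site is responsible.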

Cost-Benefit Analysis

High Value (Low Cost, High Benefit)

Page Fault Tracing: <1% overhead, good detection
PSI Monitoring: 0% overhead, early warning
jemalloc (existing users): If jemalloc is already deployed, enable its profiling

Medium Value

Allocator Switch: ~4% overhead, but requires migrating to a new allocator
Sampled BCC: 10-30% overhead, periodic use only
Statistical Approaches: Need tuning and baseline

Low Value (High Cost, Limited Benefit)

Full malloc/free tracing: 30-400% overhead
Valgrind in production: 2000% overhead
Custom ML solutions: High development cost

Decision Tree

Is this production?
├─ Yes
│  ├─ Can tolerate 4% overhead?
│  │  ├─ Yes → jemalloc/tcmalloc profiling
│  │  └─ No → Page fault tracing + PSI
│  └─ Kubernetes environment?
│     ├─ Yes → Pixie or Parca
│     └─ No → Standard Linux tools
└─ No (Development)
   ├─ Need complete accuracy?
   │  ├─ Yes → Valgrind or ASAN
   │  └─ No → Heaptrack or ByteHound
   └─ Quick check only?
      └─ Yes → jemalloc one-time profile
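
The tree above can also be written as a small function. This is a sketch under one assumption: the sibling questions under each branch are treated as independent, so the function returns every recommendation that applies rather than a single answer. The function and parameter names are hypothetical.

```python
def recommend(production, tolerates_4pct_overhead=False, kubernetes=False,
              needs_complete_accuracy=False, quick_check_only=False):
    """Return the recommendations from the decision tree as a list of strings."""
    picks = []
    if production:
        # First production question: is ~4% overhead tolerable?
        if tolerates_4pct_overhead:
            picks.append("jemalloc/tcmalloc profiling")
        else:
            picks.append("Page fault tracing + PSI")
        # Second production question: is this a Kubernetes environment?
        picks.append("Pixie or Parca" if kubernetes else "Standard Linux tools")
    else:
        # Development branch.
        if needs_complete_accuracy:
            picks.append("Valgrind or ASAN")
        else:
            picks.append("Heaptrack or ByteHound")
        if quick_check_only:
            picks.append("jemalloc one-time profile")
    return picks


if __name__ == "__main__":
    print(recommend(production=True, kubernetes=True))
```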

Key Insights from Matrix

No Silver Bullet: No single tool excels at all dimensions
Overhead vs Accuracy Trade-off: Across the approaches compared, finer-grained and more accurate detection generally costs more overhead
Production Viability Threshold: ~5% overhead is the practical limit
Layered Approach Optimal: Combine low-overhead detection with targeted deep analysis
Platform Matters: Kubernetes environments have specialized, superior tools
Frame Pointers Critical: Most eBPF tools require them, but modern compilers omit them by default in optimized builds
Statistical Sampling Works: Sampling allocators such as jemalloc show that ~4% overhead for 90%+ accuracy is achievable
Hardware Counters Underutilized: Zero overhead but require expertise