Linear Regression Detector - antimetal/system-agent GitHub Wiki
⚠️ DRAFT/WIP: Documentation for in-development feature on mem_monitor branch
The Linear Regression Detector (ebpf/src/memgrowth.bpf.c) uses statistical trend analysis to identify steady memory growth patterns. It implements integer-only linear regression directly in eBPF kernel space to detect slow, consistent memory leaks with minimal overhead.
The detector's most important optimization is 5-second event coalescing, which prevents burst events from overwhelming the 16-element circular buffer:
const __u32 COALESCE_THRESHOLD_DS = 50; // 5 seconds
if (time_since_last < COALESCE_THRESHOLD_DS) {
// UPDATE last entry instead of adding new
state->rss_history_mb[last_idx] = new_rss_mb;
return; // Don't advance buffer or recalculate
}
- 16 slots × 5 seconds = 80+ seconds minimum coverage
- Handles kernel RSS batching gracefully
- 95% reduction in regression calculations
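The interaction between coalescing and the circular buffer can be sketched in plain user-space C. This is a minimal illustration, not the detector's actual code: `record_sample` is a hypothetical helper, the `uint32_t`/`uint8_t` types stand in for the kernel's `__u32`/`__u8`, and keeping the slot's original timestamp when coalescing is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define RSS_HISTORY_SIZE 16
#define RSS_HISTORY_MASK (RSS_HISTORY_SIZE - 1)
#define COALESCE_THRESHOLD_DS 50 /* 50 deciseconds = 5 seconds */

struct process_memory_state {
    uint32_t rss_history_mb[RSS_HISTORY_SIZE];  /* RSS in MB */
    uint32_t time_history_ds[RSS_HISTORY_SIZE]; /* deciseconds */
    uint8_t history_head;                       /* next write position */
    uint8_t history_count;                      /* valid entries, at most 16 */
};

/* Hypothetical helper. Returns true when a new slot was consumed and
 * false when the sample was coalesced into the newest entry. */
static bool record_sample(struct process_memory_state *s,
                          uint32_t now_ds, uint32_t rss_mb)
{
    if (s->history_count > 0) {
        uint32_t last_idx = (s->history_head - 1) & RSS_HISTORY_MASK;
        uint32_t since_last = now_ds - s->time_history_ds[last_idx];

        if (since_last < COALESCE_THRESHOLD_DS) {
            /* Assumption: overwrite the value but keep the slot's
             * timestamp, so a burst cannot postpone the next advance. */
            s->rss_history_mb[last_idx] = rss_mb;
            return false;
        }
    }

    /* Enough time has passed: write a new slot and advance the head. */
    s->rss_history_mb[s->history_head] = rss_mb;
    s->time_history_ds[s->history_head] = now_ds;
    s->history_head = (s->history_head + 1) & RSS_HISTORY_MASK;
    if (s->history_count < RSS_HISTORY_SIZE)
        s->history_count++;
    return true;
}
```

A caller would only need to rerun the regression when `record_sample` returns true, which is where the reduction in calculations comes from.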
#define RSS_HISTORY_SIZE 16 // Power of 2 for efficient masking
#define RSS_HISTORY_MASK 15 // Binary AND for wrap-around
struct process_memory_state {
__u32 rss_history_mb[RSS_HISTORY_SIZE]; // RSS in MB
__u32 time_history_ds[RSS_HISTORY_SIZE]; // Deciseconds
__u8 history_head; // Next write position (0-15)
__u8 history_count; // Valid entries (max 16)
};
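Because the head wraps with a binary AND, the samples are not stored in chronological order once the buffer has filled. The sketch below shows one way to map a logical position (0 = oldest) to a physical slot. `history_index` is a hypothetical helper, and the fixed-width types stand in for `__u32`/`__u8`.

```c
#include <stdint.h>

#define RSS_HISTORY_SIZE 16
#define RSS_HISTORY_MASK 15

struct process_memory_state {
    uint32_t rss_history_mb[RSS_HISTORY_SIZE];
    uint32_t time_history_ds[RSS_HISTORY_SIZE];
    uint8_t history_head;  /* next write position (0-15) */
    uint8_t history_count; /* valid entries (max 16) */
};

/* Hypothetical helper: physical slot of the i-th oldest sample,
 * for 0 <= i < history_count. Unsigned wrap-around plus the mask
 * handles the case where head is smaller than count. */
static uint32_t history_index(const struct process_memory_state *s, uint32_t i)
{
    return ((uint32_t)s->history_head - s->history_count + i) & RSS_HISTORY_MASK;
}
```

When the buffer is full, the oldest sample sits at `history_head` itself, since that slot is the next one to be overwritten.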
Why megabyte resolution?
- Natural noise filtering - ignores <1MB changes
- Smaller integers - avoids arithmetic overflow
- Meaningful threshold - 1MB matters in production
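The noise-filtering effect of megabyte resolution can be illustrated with a small sketch. `rss_bytes_to_mb` is a hypothetical helper, and it assumes the RSS is available as a byte count.

```c
#include <stdint.h>

/* Hypothetical helper: convert an RSS byte count to whole megabytes.
 * Shifting right by 20 divides by 1024 * 1024 and drops the remainder,
 * so changes that stay within the same megabyte are invisible. */
static uint32_t rss_bytes_to_mb(uint64_t rss_bytes)
{
    return (uint32_t)(rss_bytes >> 20);
}
```

One consequence of truncation is that a change of only a few bytes can still register as 1MB if it happens to cross a megabyte boundary.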
The detector adjusts sampling based on process behavior:
Process State | Growth Rate | Sampling Interval | Rationale |
---|---|---|---|
Idle | 0 bytes/s | 10 seconds | Minimize overhead |
Slow Growth | <100KB/s | 5 seconds | Balance coverage |
Active Leak | >100KB/s | 1 second | Tight monitoring |
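The table above could be expressed as a small selection function. This is a sketch under assumptions: `sampling_interval_ds` and the macro names are hypothetical, a shrinking process is treated as idle, and a rate of exactly 100KB/s is treated as active because the table leaves that boundary unspecified.

```c
#include <stdint.h>

#define IDLE_INTERVAL_DS   100 /* 10 seconds */
#define SLOW_INTERVAL_DS    50 /* 5 seconds */
#define ACTIVE_INTERVAL_DS  10 /* 1 second */
#define ACTIVE_RATE_BYTES_PER_S (100 * 1024) /* 100KB/s boundary */

/* Hypothetical helper: pick the next sampling interval, in deciseconds,
 * from the current growth rate in bytes per second. */
static uint32_t sampling_interval_ds(int64_t slope_bytes_per_s)
{
    if (slope_bytes_per_s <= 0)
        return IDLE_INTERVAL_DS;   /* no growth (assumption: shrinking too) */
    if (slope_bytes_per_s < ACTIVE_RATE_BYTES_PER_S)
        return SLOW_INTERVAL_DS;   /* slow growth */
    return ACTIVE_INTERVAL_DS;     /* active leak */
}
```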
// No floating point in eBPF - use integer arithmetic
for (i = 0; i < n; i++) {
x = (time_history_ds[i] - t0) / 10; // Convert to seconds
y = rss_history_mb[i]; // Already in MB
sum_x += x;
sum_y += y;
sum_xy += x * y;
sum_x2 += x * x;
}
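// Calculate slope directly in bytes/s. Scaling the numerator before dividing
// keeps slopes below 1 MB/s from truncating to zero in integer division.
denom = n * sum_x2 - sum_x * sum_x;
if (denom != 0) // Zero when all samples share one timestamp
state->trend_slope = (n * sum_xy - sum_x * sum_y) * 1024 * 1024 / denom;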
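For experimenting outside the kernel, the same least-squares arithmetic can be written as a self-contained user-space function. This is a sketch, not the eBPF implementation: `trend_slope_bytes_per_s` is a hypothetical name, the inputs are assumed to be in chronological order, and the numerator is scaled to bytes before the division so that slopes below 1 MB/s survive integer truncation.

```c
#include <stdint.h>

/* Hypothetical helper: least-squares slope of RSS over time.
 * Returns 0 and stores the slope in bytes per second, or returns -1
 * when the slope is undefined (fewer than two samples, or no spread
 * in the timestamps). */
static int trend_slope_bytes_per_s(const uint32_t *time_ds,
                                   const uint32_t *rss_mb,
                                   uint32_t n, int64_t *slope_out)
{
    int64_t sum_x = 0, sum_y = 0, sum_xy = 0, sum_x2 = 0;

    if (n < 2)
        return -1;

    for (uint32_t i = 0; i < n; i++) {
        int64_t x = (time_ds[i] - time_ds[0]) / 10; /* seconds since first sample */
        int64_t y = rss_mb[i];                      /* megabytes */

        sum_x += x;
        sum_y += y;
        sum_xy += x * y;
        sum_x2 += x * x;
    }

    int64_t num = (int64_t)n * sum_xy - sum_x * sum_y;
    int64_t den = (int64_t)n * sum_x2 - sum_x * sum_x;

    if (den == 0)
        return -1;

    /* Multiply before dividing to keep sub-MB/s precision. */
    *slope_out = num * 1024 * 1024 / den;
    return 0;
}
```

Feeding it nine samples taken 10 seconds apart that grow by 5MB each yields 524288 bytes/s, that is 0.5 MB/s.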
The detector calculates R² to measure trend quality:
- R² > 0.95: Strong linear correlation (high confidence)
- R² > 0.90: Good correlation (medium confidence)
- R² < 0.80: Poor correlation (low confidence)
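R² can be computed without floating point by scaling it to an integer, which matches the fixed-point style of a threshold such as 900 for 0.90. The sketch below uses the standard identity R² = (nΣxy - ΣxΣy)² / ((nΣx² - (Σx)²)(nΣy² - (Σy)²)). `r_squared_x1000` is a hypothetical name, and the code makes no attempt to guard against 64-bit overflow for very large inputs, which a production version would need to address.

```c
#include <stdint.h>

/* Hypothetical helper: R² scaled by 1000 (900 means 0.90).
 * Returns -1 when R² is undefined: fewer than two samples, no spread
 * in the timestamps, or a completely flat RSS series. */
static int32_t r_squared_x1000(const uint32_t *time_ds,
                               const uint32_t *rss_mb, uint32_t n)
{
    int64_t sx = 0, sy = 0, sxy = 0, sx2 = 0, sy2 = 0;

    if (n < 2)
        return -1;

    for (uint32_t i = 0; i < n; i++) {
        int64_t x = (time_ds[i] - time_ds[0]) / 10; /* seconds */
        int64_t y = rss_mb[i];                      /* megabytes */

        sx += x;
        sy += y;
        sxy += x * y;
        sx2 += x * x;
        sy2 += y * y;
    }

    int64_t num = (int64_t)n * sxy - sx * sy;
    int64_t den_x = (int64_t)n * sx2 - sx * sx;
    int64_t den_y = (int64_t)n * sy2 - sy * sy;

    if (den_x == 0 || den_y == 0)
        return -1;

    /* Note: num * num * 1000 can overflow int64_t for extreme values. */
    return (int32_t)(num * num * 1000 / (den_x * den_y));
}
```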
The detector uses a multi-factor scoring system:
Factor | Weight | Criteria |
---|---|---|
Growth Rate | 0-25 points | >10MB/s: 25, >1MB/s: 20, >100KB/s: 15 |
Pattern Quality | 0-35 points | R²>0.95: 35, R²>0.90: 25, R²>0.80: 15 |
Duration | 0-25 points | 15+ samples: 25, 10+: 20, 6+: 10 |
Relative Growth | 0-15 points | 2x initial: 15, 1.5x: 10, 1x: 5 |
Total Score: 0-100 (threshold typically set at 60)
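The scoring table could be translated into code along these lines. This is a sketch under assumptions rather than the detector's actual scoring: `confidence_score` is a hypothetical function, R² is passed as a value scaled by 1000, relative growth compares the newest sample to the oldest, and the "1x" row is read as "not below the initial RSS".

```c
#include <stdint.h>

/* Hypothetical helper: combine the four factors into a 0-100 score. */
static uint8_t confidence_score(int64_t slope_bytes_per_s, uint32_t r2_x1000,
                                uint32_t samples,
                                uint32_t first_mb, uint32_t last_mb)
{
    const int64_t KB = 1024, MB = 1024 * 1024;
    uint32_t score = 0;

    /* Growth rate: 0-25 points */
    if (slope_bytes_per_s > 10 * MB)
        score += 25;
    else if (slope_bytes_per_s > 1 * MB)
        score += 20;
    else if (slope_bytes_per_s > 100 * KB)
        score += 15;

    /* Pattern quality: 0-35 points */
    if (r2_x1000 > 950)
        score += 35;
    else if (r2_x1000 > 900)
        score += 25;
    else if (r2_x1000 > 800)
        score += 15;

    /* Duration: 0-25 points */
    if (samples >= 15)
        score += 25;
    else if (samples >= 10)
        score += 20;
    else if (samples >= 6)
        score += 10;

    /* Relative growth: 0-15 points, using integer ratios only */
    if (first_mb > 0) {
        if ((uint64_t)last_mb >= 2ULL * first_mb)
            score += 15;                              /* 2x initial */
        else if (2ULL * last_mb >= 3ULL * first_mb)
            score += 10;                              /* 1.5x initial */
        else if (last_mb >= first_mb)
            score += 5;                               /* assumption for "1x" */
    }

    return (uint8_t)score;
}
```

A caller would then compare the result against the configured threshold, for example `confidence_score(...) >= 60`.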
Time (s): 0 10 20 30 40 50 60 70 80
RSS (MB): 100 105 110 115 120 125 130 135 140
└─────────────────────────────────────────┘
Slope: 0.5 MB/s
R² = 0.99
Confidence: 85/100 ✅
Time (s): 0 10 20 30 40 50 60 70 80
RSS (MB): 100 120 118 122 119 121 120 119 121
└─────────────────────────────────────────┘
No clear trend
R² ≈ 0.32 (computed from the samples above)
Confidence: 5/100 ✅
Metric | Value | Notes |
---|---|---|
Event Rate | 0.1-0.2/sec per process | After coalescing |
CPU per Event | <100 instructions | Highly optimized |
Memory per Process | 164 bytes | Compact state |
Total Memory (10K procs) | ~1.6MB | Scales linearly |
Regression Calculations | Every 5+ seconds | Only on buffer advance |
✅ Statistical robustness - Filters noise through regression
✅ Extended history - 80+ seconds with coalescing
✅ Adaptive behavior - Responds to process patterns
✅ Low overhead - <0.05% CPU for 1000 processes
✅ Accurate measurement - Direct RSS from kernel
❌ Minimum samples - Needs 6+ data points
❌ Linear assumption - May miss exponential growth initially
❌ MB resolution - Won't detect <1MB/min leaks
❌ Single pattern - Only detects linear trends
struct linear_regression_config {
__u64 min_rss_threshold; // Default: 10MB
__u64 mb_change_threshold; // Default: 1MB
__u32 steady_interval_ds; // Default: 100 (10s)
__u32 active_interval_ds; // Default: 10 (1s)
__u32 min_samples; // Default: 6
__u32 r2_threshold; // Default: 900 (0.90)
__u8 confidence_threshold; // Default: 60
};
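The documented defaults could be collected in an initializer like the one below. This is a sketch: `default_config` and `meets_thresholds` are hypothetical helpers, the fixed-width types stand in for `__u64`/`__u32`/`__u8`, and expressing the two RSS thresholds in bytes is an assumption based on their 64-bit type.

```c
#include <stdbool.h>
#include <stdint.h>

struct linear_regression_config {
    uint64_t min_rss_threshold;   /* assumed to be bytes */
    uint64_t mb_change_threshold; /* assumed to be bytes */
    uint32_t steady_interval_ds;  /* deciseconds */
    uint32_t active_interval_ds;  /* deciseconds */
    uint32_t min_samples;
    uint32_t r2_threshold;        /* R² scaled by 1000 */
    uint8_t confidence_threshold; /* 0-100 */
};

/* Hypothetical helper: the defaults listed in the struct comments. */
static struct linear_regression_config default_config(void)
{
    struct linear_regression_config c = {
        .min_rss_threshold = 10ULL * 1024 * 1024,
        .mb_change_threshold = 1ULL * 1024 * 1024,
        .steady_interval_ds = 100, /* 10 seconds */
        .active_interval_ds = 10,  /* 1 second */
        .min_samples = 6,
        .r2_threshold = 900,       /* 0.90 */
        .confidence_threshold = 60,
    };
    return c;
}

/* Hypothetical helper: check a regression result against the config. */
static bool meets_thresholds(const struct linear_regression_config *c,
                             uint32_t samples, uint32_t r2_x1000,
                             uint8_t confidence)
{
    return samples >= c->min_samples &&
           r2_x1000 >= c->r2_threshold &&
           confidence >= c->confidence_threshold;
}
```

Storing R² as an integer scaled by 1000 keeps the comparison free of floating point, which eBPF does not support.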
Test with test/memory-leak-simulators/linear_growth.c:
// Simulates 1MB/s steady leak
for (i = 0; i < 60; i++) {
void* leak = malloc(1024 * 1024);
memset(leak, 0, 1024 * 1024); // Touch pages
sleep(1);
}
Expected detection: 30-60 seconds, R² > 0.90, Confidence 70-85
The Linear Regression Detector provides:
- Trend validation for threshold-based detections
- Long-term perspective complementing short-term spikes
- Statistical confidence for decision making
- RSS Ratio Detector - Memory composition analysis
- Threshold Detector - Scientific heuristics
- Testing Methodology - Validation procedures
Last updated: 2025-01-19 | Branch: mem_monitor | Status: DRAFT