brk/mmap System Call Tracing

Overview

System call tracing for memory management operations provides a coarse-grained but production-safe approach to detecting memory leaks and growth patterns. This method focuses on tracing heap expansion (brk/sbrk) and memory mapping (mmap/munmap) system calls to monitor virtual memory allocation patterns at the kernel level.

  • Traces heap expansion (brk) and memory mapping (mmap) syscalls
  • Very low overhead (<1%)
  • Coarse-grained but production-safe
  • Detects heap growth patterns
  • Shows virtual memory expansion patterns

Unlike fine-grained malloc/free tracing, system call tracing operates at the kernel boundary where memory segments are actually allocated or expanded, providing insights into underlying memory management behavior without the overhead of user-space function hooking.

Performance Characteristics

| Metric | Value |
|---|---|
| Overhead | <1% |
| Accuracy | Low (very coarse) |
| False Positives | High |
| Production Ready | Yes |
| Platform | Linux with eBPF |
| Frequency | Low (syscalls are infrequent) |
| Granularity | System-level, not allocation-specific |

The extremely low overhead makes this approach suitable for continuous monitoring in production environments. However, the coarse granularity means it serves best as an early warning system rather than a precise diagnostic tool.

System Calls Traced

Core Memory Management System Calls

brk() - Heap Segment Expansion

  • Purpose: Changes the end of the data segment (program break)
  • Usage: Traditional heap growth mechanism
  • Modern Context: Less common due to allocator changes
  • Kernel Function: sys_brk() or SyS_brk() (__x64_sys_brk() on x86-64 since Linux 4.17)
  • Tracepoint: syscalls:sys_enter_brk (Linux 4.14+)

sbrk() - Heap Size Changes

  • Purpose: Incremental heap size adjustment
  • Usage: Wrapper around brk() for relative changes
  • Return Value: Previous program break address
  • Implementation: A C library function rather than a separate Linux system call; glibc implements it via brk(), so it appears in traces as brk()

mmap() - Memory Mapping

  • Purpose: Maps files or anonymous memory into address space
  • Usage: Large allocations (>MMAP_THRESHOLD, typically 128KB)
  • Kernel Function: sys_mmap() or SyS_mmap() (__x64_sys_mmap() on x86-64 since Linux 4.17)
  • Tracepoint: syscalls:sys_enter_mmap (Linux 4.14+)
  • Flags: MAP_ANONYMOUS for heap-like allocations

munmap() - Memory Unmapping

  • Purpose: Removes memory mappings
  • Usage: Frees large allocated blocks
  • Kernel Function: sys_munmap()
  • Tracepoint: syscalls:sys_enter_munmap (Linux 4.14+)

mremap() - Memory Remapping

  • Purpose: Expands or moves existing memory mappings
  • Usage: Realloc operations on large blocks
  • Kernel Function: sys_mremap()
  • Tracepoint: syscalls:sys_enter_mremap (Linux 4.14+)

Allocation Strategy Context

Modern malloc implementations use different strategies:

  • Small allocations: Usually from pre-allocated pools
  • Medium allocations: Traditional brk/sbrk heap expansion
  • Large allocations: Direct mmap calls (glibc default >128KB)
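
As a rough illustration of the size-based split above, the sketch below guesses which kernel path a request is likely to take under glibc's default threshold. The function name and the fixed 128 KiB constant are assumptions made for this example; real allocators adjust the threshold dynamically, reuse freed memory, and use additional arenas, so this is only a first approximation.

```python
# Hypothetical helper: the name and the fixed threshold are illustrative only.
DEFAULT_MMAP_THRESHOLD = 128 * 1024  # glibc's default M_MMAP_THRESHOLD, in bytes


def likely_kernel_path(request_bytes: int,
                       threshold: int = DEFAULT_MMAP_THRESHOLD) -> str:
    """Guess which syscall a malloc() of this size may eventually trigger."""
    if request_bytes >= threshold:
        # Large request: typically served by a dedicated anonymous mapping
        return "mmap"
    # Smaller request: served from the heap, which grows via brk when needed
    return "brk"
```

Many small requests never reach the kernel at all, which is one reason syscall tracing stays cheap and also why it is coarse.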

System-Agent Implementation Plan

eBPF Programs for Syscall Tracing

// Pseudo-code structure for eBPF program
struct syscall_event {
    u32 pid;
    u32 tid;
    u64 timestamp;
    u64 size;
    u64 addr;
    u32 syscall_id;
    char comm[16];
};

// Attach to syscall tracepoints
SEC("tracepoint/syscalls/sys_enter_brk")
int trace_brk_enter(struct trace_event_raw_sys_enter *ctx);

SEC("tracepoint/syscalls/sys_enter_mmap") 
int trace_mmap_enter(struct trace_event_raw_sys_enter *ctx);

SEC("tracepoint/syscalls/sys_exit_mmap")
int trace_mmap_exit(struct trace_event_raw_sys_exit *ctx);

Growth Pattern Detection Algorithm

  1. Baseline Establishment: Track normal allocation patterns per process
  2. Trend Analysis: Detect sustained growth over time windows
  3. Threshold Monitoring: Alert on size or frequency anomalies
  4. Correlation: Match with process memory metrics and PSI data
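
The trend-analysis step can be sketched in user space as follows. This is a minimal illustration, assuming the agent already holds timestamped samples of a process's cumulative net growth; the function name, the sample layout, and the "at least three samples" rule are assumptions, not part of the planned implementation.

```python
from typing import List, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, cumulative net bytes)


def sustained_growth(samples: List[Sample], window_s: float,
                     min_bytes: int) -> bool:
    """Return True if size never shrank inside the most recent window
    and grew by at least min_bytes over that window."""
    if not samples:
        return False
    latest = samples[-1][0]
    recent = [s for s in samples if latest - s[0] <= window_s]
    if len(recent) < 3:
        return False  # too few points to call it a trend
    non_decreasing = all(b[1] >= a[1] for a, b in zip(recent, recent[1:]))
    return non_decreasing and (recent[-1][1] - recent[0][1]) >= min_bytes
```

Requiring monotonic growth is deliberately strict; a production detector would probably tolerate small dips, at the cost of more false positives.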

Integration Points

  • Layer 1 Monitoring: Feed into existing metrics pipeline
  • Process Correlation: Link with process memory stats from /proc
  • Alert Generation: Trigger detailed profiling when patterns detected
  • Historical Analysis: Store trends for capacity planning

How It Works

Syscall Entry/Exit Tracing

The eBPF programs attach to kernel tracepoints that fire when system calls are invoked:

  1. Entry Hook: Capture parameters (size, flags, addresses)
  2. Exit Hook: Capture return values and success/failure status
  3. Event Generation: Package data for user-space analysis
  4. Filtering: Apply PID/process filters to reduce noise
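
The entry and exit hooks produce separate records, so something has to join them. The sketch below shows one way to do that in user space, keyed by thread ID, since a thread can only be inside one syscall at a time. The class and field names are hypothetical, and it assumes return values arrive as signed integers where negative values are -errno.

```python
from typing import Dict, Optional


class EnterExitPairer:
    """Joins sys_enter and sys_exit records for the same thread."""

    def __init__(self) -> None:
        self._pending: Dict[int, dict] = {}  # tid -> arguments seen at entry

    def on_enter(self, tid: int, syscall: str, size: int) -> None:
        self._pending[tid] = {"syscall": syscall, "size": size}

    def on_exit(self, tid: int, ret: int) -> Optional[dict]:
        entry = self._pending.pop(tid, None)
        if entry is None:
            # Exit without a recorded entry, e.g. tracer attached mid-call
            return None
        entry["ret"] = ret
        entry["ok"] = ret >= 0  # negative values are treated as -errno
        return entry
```

The same pairing can be done inside the eBPF program with a hash map keyed by the value of bpf_get_current_pid_tgid(), which avoids sending two events per call.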

Size Tracking Methodology

# Track cumulative allocation sizes per process
brk_size[pid] += new_brk - old_brk
mmap_size[pid] += allocation_size
munmap_size[pid] += deallocation_size
net_growth[pid] = mmap_size[pid] - munmap_size[pid] + brk_size[pid]
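
A user-space version of this bookkeeping might look like the sketch below. The class and method names are hypothetical. It assumes brk values are non-zero break addresses (brk(0) is only a query) and that the first value seen for a process is a baseline rather than growth.

```python
from collections import defaultdict


class ProcessAccounting:
    """Per-PID running totals for brk, mmap and munmap activity."""

    def __init__(self) -> None:
        self.last_brk = {}                    # pid -> last seen break address
        self.brk_bytes = defaultdict(int)     # pid -> net heap change
        self.mmap_bytes = defaultdict(int)    # pid -> bytes mapped
        self.munmap_bytes = defaultdict(int)  # pid -> bytes unmapped

    def on_brk(self, pid: int, new_brk: int) -> None:
        if new_brk == 0:
            return  # brk(0) just asks for the current break
        old = self.last_brk.get(pid)
        if old is not None:
            # Negative when the heap shrinks
            self.brk_bytes[pid] += new_brk - old
        self.last_brk[pid] = new_brk  # first sighting only seeds the baseline

    def on_mmap(self, pid: int, length: int) -> None:
        self.mmap_bytes[pid] += length

    def on_munmap(self, pid: int, length: int) -> None:
        self.munmap_bytes[pid] += length

    def net_growth(self, pid: int) -> int:
        return (self.brk_bytes[pid] + self.mmap_bytes[pid]
                - self.munmap_bytes[pid])
```

Keeping mapped and unmapped bytes as two separate non-negative totals, rather than one signed counter, also makes the ratio-based checks later in this page straightforward.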

Frequency Analysis

  • Call Rate Monitoring: Syscalls per second per process
  • Burst Detection: Unusual allocation patterns
  • Periodicity Analysis: Regular allocation cycles
  • Growth Rate Calculation: Size increase over time
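
Call-rate monitoring and burst detection can be illustrated with a sliding window over event timestamps. The class name, the one-second default window, and the "five times baseline" burst factor are all assumptions chosen for the example.

```python
from collections import deque


class CallRateMonitor:
    """Sliding-window syscall rate for a single process."""

    def __init__(self, window_s: float = 1.0) -> None:
        self.window_s = window_s
        self._times = deque()

    def record(self, ts: float) -> float:
        """Record a call at ts (seconds); return calls/second in the window."""
        self._times.append(ts)
        # Drop timestamps that have fallen out of the window
        while self._times and ts - self._times[0] > self.window_s:
            self._times.popleft()
        return len(self._times) / self.window_s


def is_burst(rate: float, baseline_rate: float, factor: float = 5.0) -> bool:
    """Flag a rate that exceeds the baseline by the given factor."""
    return baseline_rate > 0 and rate > factor * baseline_rate
```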

Code Examples

bpftrace Script for Basic Monitoring

#!/usr/bin/env bpftrace

// brk-mmap-monitor.bt - Monitor memory allocation syscalls
BEGIN {
    printf("Monitoring brk/mmap syscalls. Ctrl-C to end.\n");
    printf("%-8s %-16s %-8s %-12s %-8s\n", "TIME", "COMM", "PID", "SYSCALL", "SIZE");
}

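// Trace brk() system calls
// brk(0) only queries the current break and is ignored. The first call seen
// for a process records a baseline instead of being counted as growth.
// Note: this is the requested break; the kernel may refuse the request.
tracepoint:syscalls:sys_enter_brk
/args->brk != 0/
{
    $brk = args->brk;
    $old_brk = @brk_size[pid];

    if ($old_brk != 0 && $brk > $old_brk) {
        $delta = $brk - $old_brk;
        printf("%-8u %-16s %-8d %-12s %8d\n",
               elapsed / 1000000, comm, pid, "brk", $delta);
        @total_brk[pid] += $delta;
    }
    @brk_size[pid] = $brk;
}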

// Trace mmap() system calls for anonymous mappings
tracepoint:syscalls:sys_enter_mmap {
    $flags = args->flags;
    $size = args->len;
    
    // Focus on anonymous mappings (heap-like allocations)
    if ($flags & 0x20) { // MAP_ANONYMOUS
        printf("%-8u %-16s %-8d %-12s %8d\n",
               elapsed / 1000000, comm, pid, "mmap", $size);
        @mmap_size[pid] += $size;
        @mmap_count[pid]++;
    }
}

// Trace munmap() system calls  
tracepoint:syscalls:sys_enter_munmap {
    $size = args->len;
    printf("%-8u %-16s %-8d %-12s %8d\n",
           elapsed / 1000000, comm, pid, "munmap", $size);
    @munmap_size[pid] += $size;
    @munmap_count[pid]++;
}

// Summary on exit
END {
    printf("\nSummary by PID (bytes and call counts):\n");

    // bpftrace prints every remaining map automatically on exit, so the
    // per-process totals follow this header. Drop the last-seen break
    // addresses, which are bookkeeping rather than results.
    clear(@brk_size);
}

BCC Python Program for Advanced Analysis

#!/usr/bin/env python3
# brk-mmap-tracer.py - Advanced memory syscall tracer

from bcc import BPF
import argparse

# eBPF program
bpf_source = """
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

struct event_t {
    u32 pid;
    u32 tid;
    u64 timestamp;
    u64 size;
    u64 addr;
    u32 syscall;
    char comm[TASK_COMM_LEN];
};

BPF_PERF_OUTPUT(events);
BPF_HASH(brk_size, u32, u64);
BPF_HASH(process_stats, u32, u64);

// Trace brk syscall
TRACEPOINT_PROBE(syscalls, sys_enter_brk) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 curr_brk = args->brk;

    // brk(0) only queries the current break; nothing to record
    if (curr_brk == 0)
        return 0;

    u64 *prev_brk = brk_size.lookup(&pid);
    if (prev_brk && curr_brk > *prev_brk) {
        struct event_t event = {};
        event.pid = pid;
        event.tid = bpf_get_current_pid_tgid() & 0xffffffff;
        event.timestamp = bpf_ktime_get_ns();
        event.size = curr_brk - *prev_brk;
        event.addr = curr_brk;
        event.syscall = 1; // brk
        bpf_get_current_comm(&event.comm, sizeof(event.comm));

        // TRACEPOINT_PROBE exposes its context as args, not ctx
        events.perf_submit(args, &event, sizeof(event));
    }

    // Always remember the latest requested break so the next call has a
    // baseline; the first call seen for a PID only seeds the map.
    brk_size.update(&pid, &curr_brk);
    return 0;
}

// Trace mmap syscall
TRACEPOINT_PROBE(syscalls, sys_enter_mmap) {
    struct event_t event = {};
    u32 flags = args->flags;
    
    // Only trace anonymous mappings
    if (flags & 0x20) { // MAP_ANONYMOUS
        event.pid = bpf_get_current_pid_tgid() >> 32;
        event.tid = bpf_get_current_pid_tgid() & 0xffffffff;
        event.timestamp = bpf_ktime_get_ns();
        event.size = args->len;
        event.addr = 0; // Will be filled by return probe
        event.syscall = 2; // mmap
        bpf_get_current_comm(&event.comm, sizeof(event.comm));
        
        events.perf_submit(args, &event, sizeof(event));
    }
    return 0;
}
"""

class MemoryTracer:
    def __init__(self, pid=None):
        self.pid = pid
        self.bpf = BPF(text=bpf_source)
        self.syscall_names = {1: 'brk', 2: 'mmap', 3: 'munmap'}
        
    def handle_event(self, cpu, data, size):
        event = self.bpf["events"].event(data)
        # Apply the optional --pid filter in user space
        if self.pid is not None and event.pid != self.pid:
            return
        syscall = self.syscall_names.get(event.syscall, 'unknown')

        print(f"{event.timestamp/1e9:.6f} {event.comm.decode(errors='replace'):16} "
              f"{event.pid:8d} {syscall:8} {event.size:12d}")
    
    def run(self):
        print("Tracing memory allocation syscalls... Ctrl-C to exit")
        print(f"{'TIME':>16} {'COMM':16} {'PID':8} {'SYSCALL':8} {'SIZE':12}")
        
        self.bpf["events"].open_perf_buffer(self.handle_event)
        
        try:
            while True:
                self.bpf.perf_buffer_poll()
        except KeyboardInterrupt:
            pass

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Trace brk/mmap syscalls")
    parser.add_argument("-p", "--pid", type=int, help="Process ID to trace")
    args = parser.parse_args()
    
    tracer = MemoryTracer(pid=args.pid)
    tracer.run()

Growth Pattern Analysis Script

#!/bin/bash
# analyze-growth-patterns.sh - Analyze collected syscall data

# Run bpftrace and collect data
sudo bpftrace brk-mmap-monitor.bt > /tmp/syscall-trace.log &
TRACE_PID=$!

# Let it run for monitoring period
sleep 300  # 5 minutes

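# Stop tracing and wait for buffered output to be flushed
kill $TRACE_PID
wait $TRACE_PID 2>/dev/null

# Analysis: column 3 is the PID, column 4 the syscall name, the last column
# the size. Matching on column 4 skips banner, header and summary lines.
# Process names containing spaces shift the columns and are not handled.
echo "Top processes by total brk growth:"
awk '$4 == "brk" {brk[$3]+=$NF} END {for(pid in brk) print brk[pid], pid}' \
    /tmp/syscall-trace.log | sort -nr | head -10

echo "Top processes by mmap count:"
awk '$4 == "mmap" {mmap[$3]++} END {for(pid in mmap) print mmap[pid], pid}' \
    /tmp/syscall-trace.log | sort -nr | head -10

echo "Processes with unbalanced mmap/munmap:"
awk '$4 == "mmap" {mmap[$3]++} $4 == "munmap" {munmap[$3]++}
     END {for(pid in mmap)
         if(mmap[pid] - munmap[pid] > 5)
             print pid, mmap[pid] - munmap[pid]}' \
    /tmp/syscall-trace.log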

Detection Patterns

Continuous brk() Increases

  • Pattern: Steady, incremental brk() calls over time
  • Indication: Traditional heap growth, possibly from malloc fragmentation
  • Threshold: >10MB total growth without corresponding shrinkage
  • Time Window: 5-minute intervals for trend detection

Large mmap() Allocations

  • Pattern: Individual mmap() calls >1MB
  • Indication: Large object allocations or buffer creation
  • Threshold: Single allocations >128MB (configurable)
  • Frequency: >100 large allocations per minute

Unmatched mmap/munmap Ratios

  • Pattern: mmap calls significantly exceed munmap calls
  • Calculation: (mmap_count - munmap_count) / mmap_count > threshold
  • Threshold: >50% imbalance over 10-minute window
  • Weighting: Consider allocation sizes, not just call counts
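
One way to combine call counts with allocation sizes is sketched below. The function names are hypothetical, and requiring both the count-based and the byte-based ratio to exceed the threshold is a design assumption intended to reduce false positives, not a rule taken from this page.

```python
def imbalance_ratio(mapped: float, unmapped: float) -> float:
    """Fraction of mapped volume not yet unmapped.

    0.0 means balanced, 1.0 means nothing has been released.
    """
    if mapped <= 0:
        return 0.0
    return max(0.0, (mapped - unmapped) / mapped)


def is_unbalanced(mmap_count: int, munmap_count: int,
                  mmap_bytes: int, munmap_bytes: int,
                  threshold: float = 0.5) -> bool:
    """Flag a process only if both counts and bytes look unbalanced."""
    by_count = imbalance_ratio(mmap_count, munmap_count)
    by_bytes = imbalance_ratio(mmap_bytes, munmap_bytes)
    return by_count > threshold and by_bytes > threshold
```

A process that maps many small regions and releases a few large ones looks unbalanced by count but not by bytes, which is the case the size weighting is meant to catch.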

Frequency Anomalies

  • Pattern: Unusual syscall frequency compared to baseline
  • Detection: Statistical deviation from historical averages
  • Baseline: 7-day moving average per process
  • Threshold: >3 standard deviations from baseline
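
The deviation check can be sketched with the standard library alone. The function names are hypothetical, and the history is assumed to be a list of per-interval call rates drawn from the baseline period.

```python
from statistics import mean, stdev
from typing import Sequence


def frequency_zscore(history: Sequence[float], current: float) -> float:
    """Number of sample standard deviations between current and the mean."""
    if len(history) < 2:
        return 0.0  # not enough data to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        return 0.0  # a perfectly flat baseline cannot be scored this way
    return (current - mean(history)) / sigma


def is_frequency_anomaly(history: Sequence[float], current: float,
                         max_sigma: float = 3.0) -> bool:
    return abs(frequency_zscore(history, current)) > max_sigma
```

Syscall rates are often bursty rather than normally distributed, so a fixed three-sigma rule may need tuning per service.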

Growth Rate Analysis

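# Calculate growth velocity (bytes/s) and acceleration (bytes/s^2)
def growth_rate(current_size, previous_size, time_delta):
    return (current_size - previous_size) / time_delta

def acceleration(current_rate, previous_rate, time_delta):
    return (current_rate - previous_rate) / time_delta

# Alert conditions; both thresholds are deployment-specific
def check_growth(rate, accel, threshold_rate, threshold_acceleration):
    alerts = []
    if rate > threshold_rate:
        alerts.append("High growth rate detected")
    if accel > threshold_acceleration:
        alerts.append("Accelerating memory growth")
    return alerts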

Monitoring & Alerting

Growth Rate Thresholds

Tier 1 - Information

  • brk growth: >1MB/minute sustained for 5 minutes
  • mmap growth: >10MB/minute sustained for 5 minutes
  • Action: Log event, no immediate alert

Tier 2 - Warning

  • brk growth: >5MB/minute sustained for 10 minutes
  • mmap growth: >50MB/minute sustained for 10 minutes
  • Unbalanced ratio: >70% mmap without corresponding munmap
  • Action: Generate warning alert, tag for review

Tier 3 - Critical

  • brk growth: >20MB/minute sustained for 15 minutes
  • mmap growth: >200MB/minute sustained for 15 minutes
  • Memory exhaustion risk: Growth rate projects to 90% memory usage
  • Action: Critical alert, trigger detailed profiling
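
The three tiers can be expressed as a small lookup, as in the sketch below. Only the rate and duration thresholds from the lists above are encoded; the ratio and exhaustion-projection conditions are left out. The function name, the tier labels, and the use of binary megabytes are assumptions.

```python
from typing import Optional

MB = 1024 * 1024

# (tier, brk MB/min, mmap MB/min, minutes sustained), most severe first
TIERS = [
    ("critical", 20, 200, 15),
    ("warning", 5, 50, 10),
    ("info", 1, 10, 5),
]


def classify_growth(brk_bytes_per_min: float, mmap_bytes_per_min: float,
                    sustained_min: float) -> Optional[str]:
    """Return the most severe tier whose rate and duration are both met."""
    for name, brk_mb, mmap_mb, minutes in TIERS:
        if sustained_min < minutes:
            continue  # growth has not lasted long enough for this tier
        if (brk_bytes_per_min > brk_mb * MB
                or mmap_bytes_per_min > mmap_mb * MB):
            return name
    return None
```

Note that very fast growth which has only lasted ten minutes is reported as a warning, not as critical, because the critical tier requires fifteen minutes.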

Allocation Size Limits

# Alert configuration
size_thresholds:
  single_mmap_warning: 128MB
  single_mmap_critical: 512MB
  cumulative_brk_warning: 100MB
  cumulative_brk_critical: 500MB
  
frequency_thresholds:
  mmap_per_second_warning: 10
  mmap_per_second_critical: 50
  brk_per_second_warning: 5
  brk_per_second_critical: 20
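
Size values such as 128MB have to be converted to bytes before they can be compared with syscall arguments. The helper below is a hypothetical sketch that treats the suffixes as binary units (1MB = 1,048,576 bytes); that interpretation is an assumption about the configuration format.

```python
import re

_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a string such as '128MB' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB)\s*", text,
                         flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]
```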

Pattern Matching Rules

# Alert rule examples
def evaluate_memory_patterns(process_data):
    alerts = []
    
    # Sustained growth pattern
    if detect_sustained_growth(process_data, window=300):
        alerts.append("sustained_memory_growth")
    
    # Allocation without deallocation
    if calculate_allocation_balance(process_data) < 0.3:
        alerts.append("poor_deallocation_ratio")
    
    # Frequency spike
    if detect_frequency_anomaly(process_data):
        alerts.append("syscall_frequency_anomaly")
    
    return alerts

Integration with Existing Systems

  • Prometheus Metrics: Export syscall statistics as time series
  • Grafana Dashboards: Visualize growth patterns and trends
  • PagerDuty Integration: Route critical alerts to on-call teams
  • Log Aggregation: Ship detailed events to centralized logging
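
For the Prometheus integration, the sketch below renders per-process counters in the Prometheus text exposition format using only the standard library. The function name, metric name, and label set are hypothetical; a real exporter would more likely use a client library, and per-PID labels can create high cardinality on busy hosts.

```python
def render_prometheus(metric: str, help_text: str, values: dict) -> str:
    """Render {(pid, comm): value} as Prometheus text exposition lines."""
    lines = [f"# HELP {metric} {help_text}", f"# TYPE {metric} counter"]
    for (pid, comm), value in sorted(values.items()):
        # Escape backslash, double quote and newline in the label value
        safe = (comm.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
        lines.append(f'{metric}{{pid="{pid}",comm="{safe}"}} {value}')
    return "\n".join(lines) + "\n"
```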

Limitations

Very Coarse Granularity

  • No Allocation Sites: Cannot identify specific code locations causing leaks
  • No Call Stacks: Missing context about what triggered allocations
  • Virtual vs Physical: Tracks virtual memory allocation, not actual usage
  • Aggregated View: Cannot distinguish between many small vs few large allocations

High False Positive Rate

  • Normal Growth: Applications legitimately growing memory usage
  • Caching Behavior: Memory maps used for caching appear as leaks
  • Batch Processing: Periodic large allocations appear anomalous
  • Initialization Phase: Startup allocations trigger false alerts

Missing Critical Information

  • No Leak Source Identification: Cannot pinpoint leaking functions
  • No Object-Level Tracking: Cannot track specific data structures
  • No Allocation Lifetime: Cannot determine how long allocations persist
  • Limited Context: Missing application-level semantics

Platform Limitations

  • Linux-Specific: eBPF implementation tied to Linux kernel
  • Kernel Version: Requires modern kernel for tracepoint support
  • Permission Requirements: Needs root, or CAP_BPF plus CAP_PERFMON on Linux 5.8+ (CAP_SYS_ADMIN on older kernels)
  • Architecture Dependencies: Some features may vary by CPU architecture

Use Cases

Early Warning System

  • Primary Role: First-line defense against memory exhaustion
  • Integration: Trigger more expensive detailed profiling tools
  • Baseline Establishment: Learn normal allocation patterns per service
  • Capacity Planning: Track long-term memory growth trends

Heap Growth Monitoring

  • Traditional Allocators: Monitor brk-based heap expansion
  • Modern Allocators: Track mmap-based large allocations
  • Fragmentation Detection: Identify inefficient heap usage patterns
  • Allocator Performance: Compare allocation strategies across processes

Large Allocation Detection

  • Buffer Management: Detect oversized buffer allocations
  • Memory-Intensive Operations: Identify processes consuming large memory blocks
  • Resource Planning: Understand peak memory requirements
  • Anomaly Detection: Flag unusual large allocation patterns

Supplementary Signal for Comprehensive Monitoring

  • Multi-Layer Approach: Combine with malloc tracing, PSI metrics, page fault analysis
  • Correlation Analysis: Cross-reference with application metrics
  • Root Cause Analysis: Provide high-level context for detailed investigations
  • Historical Trends: Long-term memory usage pattern analysis

Comparison with Alternatives

vs malloc() Tracing (BCC memleak, etc.)

| Aspect | System Call Tracing | malloc() Tracing |
|---|---|---|
| Overhead | <1% | 5-20% |
| Granularity | Very coarse | Fine-grained |
| Production Use | Always safe | Risky in high-throughput |
| Call Stack | No | Yes |
| Allocation Sites | No | Yes |
| False Positives | High | Low |
| Best Use Case | Early warning | Precise diagnosis |

vs Page Fault Analysis

| Aspect | System Call Tracing | Page Fault Tracing |
|---|---|---|
| Signal Type | Virtual allocation | Physical access |
| Timing | Allocation time | Access time |
| Memory Pressure | Indirect | Direct |
| Write vs Read | No distinction | Can distinguish |
| Performance Impact | Very low | Low-medium |
| Use Case | Growth patterns | Usage patterns |

vs PSI (Pressure Stall Information)

| Aspect | System Call Tracing | PSI Metrics |
|---|---|---|
| Granularity | Per-process | System-wide |
| Real-time | Event-based | Polling |
| Memory Pressure | Predictive | Current |
| Overhead | Minimal | Near zero |
| Actionability | High | Medium |
| Complement | Yes | Yes |

Best Combined Strategy

System call tracing works best as part of a layered approach:

  1. Layer 1: System call tracing (continuous, low overhead)
  2. Layer 2: PSI metrics + page fault analysis (context)
  3. Layer 3: Detailed malloc tracing (triggered by Layer 1/2 alerts)
  4. Layer 4: Application profiling (when precise diagnosis needed)

eBPF Implementation

Tracepoint Attachment Strategy

// Modern approach using tracepoints (Linux 4.14+)
SEC("tracepoint/syscalls/sys_enter_brk")
int trace_brk_enter(struct trace_event_raw_sys_enter *ctx) {
    u64 brk_addr = ctx->args[0];
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    
    // Process brk syscall
    return handle_brk_syscall(ctx, pid, brk_addr);
}

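// Fallback for older kernels using kprobes (before Linux 4.17; newer x86-64
// kernels expose __x64_sys_brk, whose only argument is a struct pt_regs *)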
SEC("kprobe/sys_brk")
int kprobe_sys_brk(struct pt_regs *ctx) {
    u64 brk_addr = PT_REGS_PARM1(ctx);
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    
    return handle_brk_syscall(ctx, pid, brk_addr);
}

Syscall Arguments Extraction

// mmap syscall arguments, in the order they appear in ctx->args[0..5]
struct mmap_args {
    unsigned long addr;
    unsigned long len;
    unsigned long prot;
    unsigned long flags;
    unsigned long fd;
    unsigned long offset;
};

// MAP_ANONYMOUS is a macro, so vmlinux.h does not provide it. 0x20 is the
// value on x86 and arm64; some architectures use a different value.
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS 0x20
#endif

SEC("tracepoint/syscalls/sys_enter_mmap")
int trace_mmap_enter(struct trace_event_raw_sys_enter *ctx) {
    struct mmap_args *args = (struct mmap_args *)ctx->args;
    
    // Filter for anonymous mappings
    if (args->flags & MAP_ANONYMOUS) {
        struct event_t event = {};
        event.size = args->len;
        // ... populate the remaining event fields and submit
    }
    
    return 0;
}

Return Value Processing

SEC("tracepoint/syscalls/sys_exit_mmap")
int trace_mmap_exit(struct trace_event_raw_sys_exit *ctx) {
    long ret = ctx->ret;
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    
    if (ret > 0) {
        // Successful allocation
        struct allocation_t alloc = {};
        alloc.addr = ret;
        alloc.timestamp = bpf_ktime_get_ns();
        
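        // Store for tracking. allocations is assumed to be a hash map keyed
        // by PID, so each new mapping overwrites the previous entry.
        bpf_map_update_elem(&allocations, &pid, &alloc, BPF_ANY);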
    }
    
    return 0;
}

Error Handling and Edge Cases

// Handle edge cases and errors
static int handle_syscall_error(long ret_code) {
    switch (ret_code) {
        case -ENOMEM:
            // Out of memory - important signal
            increment_oom_counter();
            return 1;
        case -EINVAL:
            // Invalid arguments - likely application bug
            increment_invalid_args_counter();
            return 0;
        default:
            return 0;
    }
}

Memory Management for eBPF Maps

// Efficient map structures
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);      // PID
    __type(value, struct process_memory_stats);
} process_stats SEC(".maps");

// Cleanup old entries to prevent map overflow
static void cleanup_old_entries(void) {
    u64 current_time = bpf_ktime_get_ns();
    u64 cutoff_time = current_time - (5 * 60 * 1000000000ULL); // 5 minutes
    
    // Iterate and cleanup (pseudo-code, actual implementation varies)
    // bpf_for_each_map_elem(&process_stats, cleanup_callback, &cutoff_time, 0);
}

Integration Strategies

Combine with PSI (Pressure Stall Information)

class IntegratedMemoryMonitor:
    def __init__(self):
        self.syscall_tracer = SyscallTracer()
        self.psi_monitor = PSIMonitor()
        
    def analyze_memory_pressure(self):
        # Get current PSI metrics
        psi_data = self.psi_monitor.get_memory_pressure()
        
        # Get recent syscall activity
        syscall_data = self.syscall_tracer.get_recent_activity()
        
        # Correlate signals
        if psi_data.memory_pressure > 0.1 and syscall_data.growth_rate > threshold:
            return self.trigger_detailed_analysis()
            
    def trigger_detailed_analysis(self):
        """Launch more expensive profiling when signals align"""
        return DetailedProfiler().start_profiling()
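
The PSI side of this correlation reads /proc/pressure/memory, which is available on Linux 4.20+ when PSI is enabled. The sketch below parses that file; the function names are hypothetical. The avg10, avg60 and avg300 fields are percentages of wall-clock time (0 to 100), and total is cumulative stall time in microseconds, so any threshold compared against them should be expressed in those units.

```python
def parse_psi(text: str) -> dict:
    """Parse PSI file contents into {'some': {...}, 'full': {...}}."""
    result = {}
    for line in text.strip().splitlines():
        # Example line: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
        kind, *fields = line.split()
        result[kind] = {key: float(value)
                        for key, value in (f.split("=", 1) for f in fields)}
    return result


def read_memory_psi(path: str = "/proc/pressure/memory") -> dict:
    """Read and parse the system-wide memory pressure file."""
    with open(path) as fh:
        return parse_psi(fh.read())
```

For example, `read_memory_psi()["some"]["avg10"]` gives the share of the last ten seconds in which at least one task was stalled on memory.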

Correlate with Process Metrics

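#!/bin/bash
# Correlation script

# Collect syscall data
sudo bpftrace syscall-tracer.bt > /tmp/syscalls.log &
TRACER_PID=$!

# Collect process memory stats (VmRSS is reported in kB)
while true; do
    for pid in $(pgrep -f "target_process"); do
        echo "$(date +%s) $pid $(awk '/VmRSS/ {print $2}' /proc/$pid/status)"
    done
    sleep 10
done > /tmp/process_memory.log &
PROC_PID=$!

# Run for collection period
sleep 300

# Cleanup, then wait so buffered output reaches the log files
kill $TRACER_PID $PROC_PID
wait $TRACER_PID $PROC_PID 2>/dev/null

# Correlate data (the quoted delimiter stops the shell from expanding the Python source)
python3 << 'EOF'
import pandas as pd

# Columns are separated by runs of spaces; malformed lines are skipped
syscall_df = pd.read_csv('/tmp/syscalls.log', sep=r'\s+', on_bad_lines='skip',
                         names=['timestamp', 'comm', 'pid', 'syscall', 'size'])
memory_df = pd.read_csv('/tmp/process_memory.log', sep=r'\s+',
                        names=['timestamp', 'pid', 'rss_kb'])

# Drop header and summary rows, which have no numeric PID or size
syscall_df['pid'] = pd.to_numeric(syscall_df['pid'], errors='coerce')
syscall_df['size'] = pd.to_numeric(syscall_df['size'], errors='coerce')
syscall_df = syscall_df.dropna(subset=['pid', 'size']).astype({'pid': int})

# Align both series on PID before correlating syscall volume with peak RSS
joined = pd.concat([syscall_df.groupby('pid')['size'].sum(),
                    memory_df.groupby('pid')['rss_kb'].max()],
                   axis=1, join='inner')
print(f"Correlation coefficient: {joined['size'].corr(joined['rss_kb']):.3f}")
EOF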

Trigger Detailed Profiling

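import subprocess
import time

# Assumption: the BCC memleak tool is installed under this name. Packaging
# differs between distributions, so adjust the command as needed.
MEMLEAK_CMD = 'bcc-memleak'

# SyscallMonitor, GROWTH_THRESHOLD and FREQ_THRESHOLD are defined elsewhere.
class AdaptiveProfiler:
    def __init__(self, target_pid):
        self.target_pid = target_pid
        self.syscall_monitor = SyscallMonitor()
        self.profiler_proc = None  # handle of the running profiler, if any
        self.profiler_active = False
        
    def monitor_loop(self):
        while True:
            metrics = self.syscall_monitor.get_metrics()
            
            if self.should_trigger_profiling(metrics):
                self.start_detailed_profiling()
            elif self.profiler_active and self.should_stop_profiling(metrics):
                self.stop_detailed_profiling()
                
            time.sleep(30)
    
    def should_trigger_profiling(self, metrics):
        """Decide when to start expensive profiling"""
        return (metrics.growth_rate > GROWTH_THRESHOLD and 
                metrics.allocation_frequency > FREQ_THRESHOLD and
                not self.profiler_active)
    
    def should_stop_profiling(self, metrics):
        """Assumed stop condition: growth has fallen back below the threshold"""
        return metrics.growth_rate <= GROWTH_THRESHOLD
    
    def start_detailed_profiling(self):
        """Start malloc-level tracing"""
        self.profiler_proc = subprocess.Popen(
            [MEMLEAK_CMD, '-p', str(self.target_pid)])
        self.profiler_active = True
        
    def stop_detailed_profiling(self):
        """Stop only the profiler this object started"""
        if self.profiler_proc is not None:
            self.profiler_proc.terminate()
            self.profiler_proc.wait()
            self.profiler_proc = None
        self.profiler_active = False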

Dashboard Integration

# Grafana dashboard configuration
dashboard:
  title: "Memory System Call Monitoring"
  panels:
    - title: "brk() Growth Rate by Process"
      type: "graph"
      targets:
        - expr: 'rate(brk_total_bytes[5m])'
          legendFormat: '{{process}}'
    
    - title: "mmap/munmap Balance"
      type: "graph" 
      targets:
        - expr: 'mmap_total_count - munmap_total_count'
          legendFormat: 'Unbalanced {{process}}'
    
    - title: "Large Allocation Events"
      type: "table"
      targets:
        - expr: 'mmap_large_allocations > 134217728' # >128MB

In summary, brk/mmap system call tracing is a low-overhead, coarse-grained signal for memory leak detection. It works best as a first line of defense that triggers more detailed profiling within a multi-layered memory monitoring strategy.