NUMA Memory Collector
Overview
The NUMA (Non-Uniform Memory Access) Memory Collector continuously monitors memory access patterns and allocation statistics on NUMA-enabled systems. NUMA is a memory architecture in which access time depends on where the memory sits relative to the processor: each processor reaches its own local memory faster than memory attached to another node, with remote access typically 2-3x slower.
This collector is essential for:
- Performance optimization: Identifying memory locality issues that can significantly impact application performance
- Resource allocation: Understanding CPU and memory topology for better workload placement
- Troubleshooting: Detecting cross-node memory access patterns that indicate suboptimal configurations
- Capacity planning: Monitoring per-node memory usage and allocation patterns in real-time
Unlike the other hardware info collectors, the NUMA collector monitors runtime statistics continuously, because memory allocation patterns change dynamically while the system runs.
Technical Details
| Property | Value |
|---|---|
| MetricType | MetricTypeNUMA ("numa") |
| Collection Mode | Continuous (periodic) |
| Data Sources | `/sys/devices/system/node/`, `/proc/sys/kernel/` |
Data Sources
- `/sys/devices/system/node/node*/numastat` - Per-node allocation statistics (runtime)
- `/sys/devices/system/node/node*/meminfo` - Per-node memory usage (runtime)
- `/sys/devices/system/node/node*/cpulist` - CPUs assigned to each node (static)
- `/sys/devices/system/node/node*/distance` - Distance matrix between nodes (static)
- `/proc/sys/kernel/numa_balancing` - Auto-balancing configuration
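To make the file format concrete, here is a minimal sketch (not the collector's actual implementation) of reading one of these sources. `readNumastat` is a hypothetical helper; it parses the whitespace-separated `name value` pairs in a node's `numastat` file, assuming `sysPath` points at the host's `/sys` mount.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readNumastat parses /sys/devices/system/node/node<N>/numastat into a map of
// counter name -> value (e.g. "numa_hit", "numa_miss", "local_node").
func readNumastat(sysPath string, nodeID int) (map[string]uint64, error) {
	path := filepath.Join(sysPath, "devices/system/node", fmt.Sprintf("node%d", nodeID), "numastat")
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	stats := make(map[string]uint64)
	for _, line := range strings.Split(strings.TrimSpace(string(data)), "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue // skip anything that is not a "name value" pair
		}
		if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
			stats[fields[0]] = v
		}
	}
	return stats, nil
}

func main() {
	stats, err := readNumastat("/sys", 0) // use HostSysPath in container setups
	if err != nil {
		fmt.Println("node0 numastat not available:", err)
		return
	}
	fmt.Printf("node0 numa_hit=%d numa_miss=%d\n", stats["numa_hit"], stats["numa_miss"])
}
```

Note that `meminfo` uses a different layout (`Node <N> <Field>: <value> kB`), so it cannot be parsed the same way.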
Capabilities
- SupportsOneShot: true
- SupportsContinuous: true (runs periodically)
- RequiresRoot: false
- RequiresEBPF: false
- MinKernelVersion: 2.6.7
Source Code
Primary implementation: pkg/performance/collectors/numa.go
Collected Metrics
System-Level Metrics
| Metric | Type | Description |
|---|---|---|
| Enabled | bool | Whether NUMA is enabled on this system |
| NodeCount | int | Number of NUMA nodes in the system |
| AutoBalance | bool | Whether automatic NUMA balancing is enabled |
Per-Node Metrics
| Metric | Type | Description |
|---|---|---|
| ID | int | Node ID (0-based) |
| CPUs | []int | List of CPU cores assigned to this node |
| **Memory Metrics** | | |
| MemTotal | uint64 | Total memory on this node (bytes) |
| MemFree | uint64 | Free memory on this node (bytes) |
| MemUsed | uint64 | Used memory on this node (bytes) |
| FilePages | uint64 | File-backed pages / page cache (bytes) |
| AnonPages | uint64 | Anonymous pages / process memory (bytes) |
| **Allocation Counters** | | |
| NumaHit | uint64 | Memory successfully allocated on the intended node |
| NumaMiss | uint64 | Memory allocated here despite a different preferred node |
| NumaForeign | uint64 | Memory intended for this node but allocated elsewhere |
| InterleaveHit | uint64 | Interleaved memory successfully allocated here |
| LocalNode | uint64 | Memory allocated here while the process was running here |
| OtherNode | uint64 | Memory allocated here while the process was running on another node |
| **Topology** | | |
| Distances | []int | Distance to other nodes (10 = local, 20+ = remote) |
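To show how the `Distances` values read in practice, here is a small hypothetical example (not collector code) that prints a two-node distance matrix like the one in the sample output below. Firmware-reported distances are normalized so that 10 means local; as a rough rule of thumb, a distance of 21 suggests roughly 2.1x the local access cost.

```go
package main

import "fmt"

func main() {
	// Each row is one node's Distances slice as reported by the collector
	// (values taken from the sample output later on this page).
	distances := [][]int{
		{10, 21}, // node0 -> node0, node0 -> node1
		{21, 10}, // node1 -> node0, node1 -> node1
	}
	for from, row := range distances {
		for to, d := range row {
			fmt.Printf("node%d -> node%d: distance %d (~%.1fx local cost)\n",
				from, to, d, float64(d)/10.0)
		}
	}
}
```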
Key Performance Indicators
- High `NumaHit` + low `NumaMiss`: indicates good NUMA locality
- High `NumaForeign`: suggests memory pressure causing cross-node allocations
- High `OtherNode`: processes frequently accessing memory from remote nodes
Data Structure
```go
type NUMAStats struct {
    Enabled     bool
    NodeCount   int
    Nodes       []NUMANodeStats
    AutoBalance bool
}

type NUMANodeStats struct {
    ID            int
    CPUs          []int
    MemTotal      uint64
    MemFree       uint64
    MemUsed       uint64
    FilePages     uint64
    AnonPages     uint64
    NumaHit       uint64
    NumaMiss      uint64
    NumaForeign   uint64
    InterleaveHit uint64
    LocalNode     uint64
    OtherNode     uint64
    Distances     []int
}
```
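The Key Performance Indicators above can be derived directly from these counters. Below is a hedged sketch, not part of the agent; `nodeCounters` and `localityRatio` are hypothetical names, and only the two fields the calculation needs are declared so the snippet compiles on its own.

```go
package main

import "fmt"

// nodeCounters holds the subset of NUMANodeStats fields used here.
type nodeCounters struct {
	NumaHit  uint64 // allocations satisfied on the intended node
	NumaMiss uint64 // allocations satisfied here despite another preferred node
}

// localityRatio is NumaHit / (NumaHit + NumaMiss); values near 1.0 mean the
// node is serving almost all of its allocations locally.
func localityRatio(c nodeCounters) float64 {
	total := c.NumaHit + c.NumaMiss
	if total == 0 {
		return 1.0 // no allocations recorded yet
	}
	return float64(c.NumaHit) / float64(total)
}

func main() {
	// Counters taken from node 0 of the sample output below.
	fmt.Printf("node0 locality: %.6f\n",
		localityRatio(nodeCounters{NumaHit: 1234567890, NumaMiss: 12345}))
}
```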
Configuration
Enabling the Collector
The NUMA collector runs continuously at the configured interval:
```yaml
performance:
  enabled: true
  interval: "60s"  # Collection interval
  collectors:
    - numa
```
For programmatic configuration:
```go
config := performance.CollectionConfig{
    EnabledCollectors: map[performance.MetricType]bool{
        performance.MetricTypeNUMA: true,
    },
    HostProcPath: "/proc",
    HostSysPath:  "/sys",
}
```
Required Paths
- `HostProcPath`: Must be an absolute path (default: `/proc`)
- `HostSysPath`: Must be an absolute path (default: `/sys`)
Container Environments
When running in containers, mount the host filesystems:
```yaml
volumes:
  - /proc:/host/proc:ro
  - /sys:/host/sys:ro
```
Then configure:
```go
config.HostProcPath = "/host/proc"
config.HostSysPath = "/host/sys"
```
Platform Considerations
Linux Kernel Requirements
- Minimum kernel version: 2.6.7 (NUMA support in /sys)
- NUMA hardware must be present
- NUMA support must be enabled in the kernel
Non-NUMA Systems
On systems without NUMA or with only one node:
- `Enabled` will be `false`
- `NodeCount` will be 0 or 1
- `Nodes` array will be empty
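One way to check this condition outside the agent is to count the `node<N>` directories the collector itself reads. The sketch below is illustrative only; `countNUMANodes` is a hypothetical helper, and `/sys` should be replaced with the configured `HostSysPath` in container setups.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// countNUMANodes counts node<N> directories under <sysPath>/devices/system/node.
// Zero or one node means the system is effectively non-NUMA, matching the
// behaviour described above.
func countNUMANodes(sysPath string) (int, error) {
	matches, err := filepath.Glob(filepath.Join(sysPath, "devices/system/node/node[0-9]*"))
	if err != nil {
		return 0, err
	}
	return len(matches), nil
}

func main() {
	n, err := countNUMANodes("/sys")
	if err != nil {
		fmt.Println("cannot inspect NUMA topology:", err)
		return
	}
	if n <= 1 {
		fmt.Println("NUMA not enabled (or not visible); node count:", n)
		return
	}
	fmt.Println("NUMA enabled with", n, "nodes")
}
```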
Container Considerations
- Requires read access to `/sys/devices/system/node/`
- NUMA topology is system-wide, not container-specific
- Container CPU/memory limits don't affect NUMA topology visibility
Common Issues
Issue: Collector reports NUMA disabled
Symptoms: `Enabled: false` even on NUMA hardware
Possible causes:
- Single socket system (not NUMA)
- NUMA disabled in BIOS
- Kernel compiled without NUMA support
- Missing `/sys/devices/system/node/` directory
Resolution: Check BIOS settings and kernel configuration
Issue: Missing or incomplete node data
Symptoms: Some nodes have zero values for memory metrics
Possible causes:
- Partial `/sys` mount in container
- Insufficient permissions
- Memory hot-plug operations
Resolution: Ensure the full `/sys` filesystem is mounted with read permissions
Issue: High NUMA miss rates
Symptoms: High `NumaMiss` or `NumaForeign` values
Possible causes:
- Poor process/thread affinity
- Memory pressure on specific nodes
- Suboptimal memory allocation policies
Resolution: Review application NUMA policies and consider enabling auto-balancing
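As a starting point for that resolution, the current auto-balancing setting can be read from the `/proc/sys/kernel/numa_balancing` file listed under Data Sources. The snippet below is a hedged sketch, not part of the agent; `autoBalanceEnabled` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// autoBalanceEnabled reads <procPath>/sys/kernel/numa_balancing. A non-zero
// value means the kernel migrates pages automatically to improve locality.
func autoBalanceEnabled(procPath string) (bool, error) {
	data, err := os.ReadFile(filepath.Join(procPath, "sys/kernel/numa_balancing"))
	if err != nil {
		return false, err
	}
	return strings.TrimSpace(string(data)) != "0", nil
}

func main() {
	enabled, err := autoBalanceEnabled("/proc") // use HostProcPath in containers
	if err != nil {
		fmt.Println("cannot read numa_balancing:", err)
		return
	}
	fmt.Println("automatic NUMA balancing enabled:", enabled)
}
```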
Examples
Sample Output
```json
{
  "Enabled": true,
  "NodeCount": 2,
  "AutoBalance": true,
  "Nodes": [
    {
      "ID": 0,
      "CPUs": [0, 1, 2, 3, 4, 5, 6, 7],
      "MemTotal": 68719476736,
      "MemFree": 12636856320,
      "MemUsed": 56082620416,
      "FilePages": 22829453312,
      "AnonPages": 33244332032,
      "NumaHit": 1234567890,
      "NumaMiss": 12345,
      "NumaForeign": 54321,
      "InterleaveHit": 9876,
      "LocalNode": 1234567000,
      "OtherNode": 890,
      "Distances": [10, 21]
    },
    {
      "ID": 1,
      "CPUs": [8, 9, 10, 11, 12, 13, 14, 15],
      "MemTotal": 68719476736,
      "MemFree": 32954277888,
      "MemUsed": 35765198848,
      "FilePages": 12636856320,
      "AnonPages": 23128342528,
      "NumaHit": 987654321,
      "NumaMiss": 54321,
      "NumaForeign": 12345,
      "InterleaveHit": 5432,
      "LocalNode": 987600000,
      "OtherNode": 54321,
      "Distances": [21, 10]
    }
  ]
}
```
Performance Impact
Since this collector runs continuously (unlike other hardware info collectors), consider the ongoing resource usage:
- CPU Usage: Negligible - only reads text files from sysfs
- Memory Usage: Small - proportional to number of NUMA nodes (typically < 1KB)
- I/O Operations: Few file reads per collection interval (5-10 files per node)
- Collection Time: Fast - typically < 1ms for 2-4 node systems
- Frequency Impact: Default 60s interval has minimal impact; can be adjusted based on monitoring needs
Related Collectors
- CPU Info Collector - CPU topology including NUMA node assignments
- Memory Info Collector - Static memory hardware configuration
- Memory Collector - Runtime memory usage statistics
- Process Collector - Per-process NUMA node affinity
- Disk Info Collector - Storage device information
- Network Info Collector - Network interface hardware