Disk Collector - antimetal/system-agent GitHub Wiki
Disk Collector
Overview
The Disk Collector is a performance monitoring component of the Antimetal System Agent that collects disk I/O statistics from Linux systems. It reads raw counter values from /proc/diskstats to provide detailed metrics about disk operations, throughput, and queue performance. This collector is essential for:
- Performance Monitoring: Track disk I/O patterns and identify bottlenecks
- Capacity Planning: Understand disk utilization and throughput requirements
- Troubleshooting: Identify disks with high latency or queue depths
- SLA Compliance: Monitor disk performance against service level objectives
The collector reports statistics for whole disk devices only, filtering out partitions to avoid duplicate metrics.
Technical Details
MetricType
MetricTypeDisk
Data Source
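- Primary: /proc/diskstats - Linux kernel disk statistics interface
- Format: Whitespace-separated values, one line per device, with 14 fields on older kernels (Linux 4.18 and later append discard counters, and 5.5 and later append flush counters)
- Documentation: Linux kernel iostats documentation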
Capabilities
CollectorCapabilities{
    SupportsOneShot:    true,
    SupportsContinuous: false, // Wrapped by ContinuousPointCollector
    RequiresRoot:       false,
    RequiresEBPF:       false,
    MinKernelVersion:   "2.6.0",
}
Registration
The collector is registered as a continuous collector that wraps the point collector:
performance.Register(
    performance.MetricTypeDisk,
    performance.PartialNewContinuousPointCollector(...),
)
Collected Metrics
The collector returns a slice of type []*performance.DiskStats, with the following fields for each disk:
Field | Type | Description | Unit |
---|---|---|---|
Device | string | Device name (e.g., sda, nvme0n1) | - |
Major | uint32 | Major device number | - |
Minor | uint32 | Minor device number | - |
ReadsCompleted | uint64 | Number of reads completed successfully | count |
ReadsMerged | uint64 | Number of reads merged before queuing | count |
SectorsRead | uint64 | Total sectors read (×512 for bytes) | sectors |
ReadTime | uint64 | Total time spent reading | milliseconds |
WritesCompleted | uint64 | Number of writes completed successfully | count |
WritesMerged | uint64 | Number of writes merged before queuing | count |
SectorsWritten | uint64 | Total sectors written (×512 for bytes) | sectors |
WriteTime | uint64 | Total time spent writing | milliseconds |
IOsInProgress | uint64 | Current number of I/Os in progress | count |
IOTime | uint64 | Total time spent doing I/Os | milliseconds |
WeightedIOTime | uint64 | Weighted time spent doing I/Os | milliseconds |
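The mapping from one /proc/diskstats line to the fields in the table above can be sketched as follows. This is a minimal illustration, not the code in pkg/performance/collectors/disk.go: the `diskCounters` type and `parseDiskstatsLine` function are hypothetical names, and the sketch assumes that any columns after the first 14 (which newer kernels append) can be ignored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// diskCounters is a hypothetical stand-in for performance.DiskStats that
// holds only the raw counters listed in the table above.
type diskCounters struct {
	Device       string
	Major, Minor uint32

	ReadsCompleted, ReadsMerged, SectorsRead, ReadTime       uint64
	WritesCompleted, WritesMerged, SectorsWritten, WriteTime uint64
	IOsInProgress, IOTime, WeightedIOTime                    uint64
}

// parseDiskstatsLine parses one line of /proc/diskstats.
// Columns beyond the first 14 are ignored.
func parseDiskstatsLine(line string) (diskCounters, error) {
	f := strings.Fields(line)
	if len(f) < 14 {
		return diskCounters{}, fmt.Errorf("expected at least 14 fields, got %d", len(f))
	}
	major, err := strconv.ParseUint(f[0], 10, 32)
	if err != nil {
		return diskCounters{}, fmt.Errorf("major: %w", err)
	}
	minor, err := strconv.ParseUint(f[1], 10, 32)
	if err != nil {
		return diskCounters{}, fmt.Errorf("minor: %w", err)
	}
	// Columns 4 through 14 are the eleven counters, in table order.
	var c [11]uint64
	for i := range c {
		c[i], err = strconv.ParseUint(f[3+i], 10, 64)
		if err != nil {
			return diskCounters{}, fmt.Errorf("column %d: %w", 4+i, err)
		}
	}
	return diskCounters{
		Device: f[2], Major: uint32(major), Minor: uint32(minor),
		ReadsCompleted: c[0], ReadsMerged: c[1], SectorsRead: c[2], ReadTime: c[3],
		WritesCompleted: c[4], WritesMerged: c[5], SectorsWritten: c[6], WriteTime: c[7],
		IOsInProgress: c[8], IOTime: c[9], WeightedIOTime: c[10],
	}, nil
}

func main() {
	line := "   8       0 sda 123456 567 890123 4567 890 123 456789 1234 0 5678 9012"
	d, err := parseDiskstatsLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Sectors in /proc/diskstats are always 512 bytes.
	fmt.Printf("%s: %d reads, %d bytes read\n", d.Device, d.ReadsCompleted, d.SectorsRead*512)
}
```

Returning an error for short lines, rather than a zero-valued record, makes malformed input visible to the caller instead of surfacing as a device with all-zero metrics.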
Calculated Fields (Not Populated by Point Collector)
The following fields exist in the DiskStats structure but are set to zero by this collector:
- IOPS - I/O operations per second
- ReadBytesPerSec - Read throughput
- WriteBytesPerSec - Write throughput
- Utilization - Disk utilization percentage
- AvgQueueSize - Average queue size
- AvgReadLatency - Average read latency
- AvgWriteLatency - Average write latency
These fields are intended for rate calculation by continuous collectors or downstream processors.
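As an illustration of how a downstream processor might fill these fields, the sketch below derives them from two snapshots of the raw counters. The formulas are the conventional iostat-style ones and are an assumption, not necessarily what the agent uses; the `snapshot` and `rates` types and the `computeRates` function are hypothetical. It also assumes the counters did not reset or wrap between the two snapshots.

```go
package main

import "fmt"

// snapshot is a hypothetical subset of the raw counters for one device.
type snapshot struct {
	ReadsCompleted, WritesCompleted uint64
	SectorsRead, SectorsWritten     uint64
	ReadTime, WriteTime             uint64 // milliseconds
	IOTime, WeightedIOTime          uint64 // milliseconds
}

// rates holds the derived values named in the list above.
type rates struct {
	IOPS             float64
	ReadBytesPerSec  float64
	WriteBytesPerSec float64
	Utilization      float64 // percent of wall-clock time the device was busy
	AvgQueueSize     float64
	AvgReadLatency   float64 // milliseconds per read
	AvgWriteLatency  float64 // milliseconds per write
}

const sectorSize = 512 // /proc/diskstats counts 512-byte sectors

// computeRates derives rates from two snapshots taken elapsedSec seconds apart.
func computeRates(prev, cur snapshot, elapsedSec float64) rates {
	if elapsedSec <= 0 {
		return rates{}
	}
	reads := float64(cur.ReadsCompleted - prev.ReadsCompleted)
	writes := float64(cur.WritesCompleted - prev.WritesCompleted)
	elapsedMs := elapsedSec * 1000

	r := rates{
		IOPS:             (reads + writes) / elapsedSec,
		ReadBytesPerSec:  float64(cur.SectorsRead-prev.SectorsRead) * sectorSize / elapsedSec,
		WriteBytesPerSec: float64(cur.SectorsWritten-prev.SectorsWritten) * sectorSize / elapsedSec,
		Utilization:      100 * float64(cur.IOTime-prev.IOTime) / elapsedMs,
		AvgQueueSize:     float64(cur.WeightedIOTime-prev.WeightedIOTime) / elapsedMs,
	}
	// Latency is undefined when no operations completed in the interval.
	if reads > 0 {
		r.AvgReadLatency = float64(cur.ReadTime-prev.ReadTime) / reads
	}
	if writes > 0 {
		r.AvgWriteLatency = float64(cur.WriteTime-prev.WriteTime) / writes
	}
	return r
}

func main() {
	prev := snapshot{}
	cur := snapshot{
		ReadsCompleted: 100, WritesCompleted: 50,
		SectorsRead: 2048, SectorsWritten: 1024,
		ReadTime: 200, WriteTime: 150,
		IOTime: 500, WeightedIOTime: 2000,
	}
	fmt.Printf("%+v\n", computeRates(prev, cur, 1.0))
}
```

Because every input is a cumulative counter, the result only describes the interval between the two snapshots; a single collection cannot produce these values, which is consistent with the point collector leaving them at zero.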
Data Structure
The implementation is located at: pkg/performance/collectors/disk.go
The data structure is defined in: pkg/performance/types.go
Configuration
The collector requires minimal configuration:
config := performance.CollectionConfig{
    HostProcPath: "/proc", // Must be absolute path
}
collector, err := collectors.NewDiskCollector(logger, config)
Container Environments
When running in containers, ensure the host's /proc filesystem is mounted:
volumes:
  - name: host-proc
    hostPath:
      path: /proc
      type: Directory
volumeMounts:
  - name: host-proc
    mountPath: /host/proc
    readOnly: true
Then configure with:
config.HostProcPath = "/host/proc"
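One way the configured path could be turned into the file the collector reads is sketched below. The `diskstatsPath` helper is hypothetical, and the validation shown is an assumption inferred from the "HostProcPath must be an absolute path" error documented under Common Issues; the agent's actual check may differ.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// diskstatsPath is a hypothetical helper that validates HostProcPath and
// returns the location of the diskstats file beneath it.
func diskstatsPath(hostProcPath string) (string, error) {
	if !filepath.IsAbs(hostProcPath) {
		return "", errors.New("HostProcPath must be an absolute path")
	}
	return filepath.Join(hostProcPath, "diskstats"), nil
}

func main() {
	for _, p := range []string{"/proc", "/host/proc", "./proc"} {
		path, err := diskstatsPath(p)
		fmt.Printf("%q -> %q, err=%v\n", p, path, err)
	}
}
```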
Platform Considerations
Linux Kernel Requirements
- Minimum Version: 2.6.0 (when /proc/diskstats was introduced)
- Required Files: /proc/diskstats must be available and readable
- Permissions: No root privileges required
Device Filtering
The collector automatically filters out partitions to report only whole disk devices:
- Standard disks: Filters out names ending with a partition number (sda1, sdb2)
- NVMe devices: Filters out names with a 'pN' suffix (nvme0n1p1); the whole device (nvme0n1) is kept
- MMC devices: Filters out names with a 'pN' suffix (mmcblk0p1)
- Special devices: Includes loop and device-mapper devices (loop0, dm-0), even though their names end with a digit
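The rules above can be approximated with a few name patterns. This is a sketch under stated assumptions, not the collector's actual logic: the `isPartition` function is hypothetical, and the set of standard-disk prefixes (sd, hd, vd, xvd) is an assumption chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// partitionPatterns approximates the filtering rules listed above.
// The prefixes in the first pattern are an assumption for illustration.
var partitionPatterns = []*regexp.Regexp{
	regexp.MustCompile(`^(sd|hd|vd|xvd)[a-z]+[0-9]+$`), // sda1, sdb2
	regexp.MustCompile(`^nvme[0-9]+n[0-9]+p[0-9]+$`),   // nvme0n1p1
	regexp.MustCompile(`^mmcblk[0-9]+p[0-9]+$`),        // mmcblk0p1
}

// isPartition reports whether a device name looks like a partition.
// Names that match no pattern (sda, nvme0n1, loop0, dm-0) are treated as
// whole devices.
func isPartition(device string) bool {
	for _, re := range partitionPatterns {
		if re.MatchString(device) {
			return true
		}
	}
	return false
}

func main() {
	for _, d := range []string{"sda", "sda1", "nvme0n1", "nvme0n1p1", "mmcblk0p1", "loop0", "dm-0"} {
		fmt.Printf("%-10s partition=%v\n", d, isPartition(d))
	}
}
```

Matching on explicit patterns, rather than on a trailing digit alone, is what keeps nvme0n1, loop0, and dm-0 in the result set.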
Container Considerations
- Must mount host /proc filesystem
- Use HostProcPath configuration to specify mount point
- Read-only mount is sufficient
Common Issues
1. Missing /proc/diskstats
Error: failed to open /proc/diskstats: no such file or directory
- Cause: Running on non-Linux system or /proc not mounted
- Solution: Ensure running on Linux with /proc filesystem available
2. Empty Results
Symptom: Collector returns empty array
- Cause: All devices filtered as partitions or no block devices present
- Solution: Check that the system has block devices with lsblk
3. Parse Errors
Symptom: Some devices have zero values for all metrics
- Cause: Malformed lines in /proc/diskstats or kernel format changes
- Solution: Check kernel version compatibility and file format
4. Container Path Issues
Error: HostProcPath must be an absolute path
- Cause: Relative path provided in configuration
- Solution: Use an absolute path such as /host/proc, not ./proc
Examples
Sample Output
[
  {
    "Device": "sda",
    "Major": 8,
    "Minor": 0,
    "ReadsCompleted": 123456,
    "ReadsMerged": 567,
    "SectorsRead": 890123,
    "ReadTime": 4567,
    "WritesCompleted": 890,
    "WritesMerged": 123,
    "SectorsWritten": 456789,
    "WriteTime": 1234,
    "IOsInProgress": 0,
    "IOTime": 5678,
    "WeightedIOTime": 9012
  },
  {
    "Device": "nvme0n1",
    "Major": 259,
    "Minor": 0,
    "ReadsCompleted": 345678,
    "ReadsMerged": 789,
    "SectorsRead": 1234567,
    "ReadTime": 6789,
    "WritesCompleted": 1234,
    "WritesMerged": 567,
    "SectorsWritten": 890123,
    "WriteTime": 3456,
    "IOsInProgress": 2,
    "IOTime": 7890,
    "WeightedIOTime": 11234
  }
]
Performance Impact
The Disk Collector has minimal performance impact:
- CPU Usage: Negligible - simple file parsing
- Memory Usage: O(n) where n is number of block devices
- I/O Operations: Single read of /proc/diskstats per collection
- Collection Time: Typically < 1ms for systems with < 100 disks
Optimization Notes
- Partitions are filtered early to reduce memory allocation
- No system calls beyond file reading
- Efficient line-by-line parsing without loading entire file
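Line-by-line parsing of this kind is commonly done in Go with bufio.Scanner, which reads from the underlying file in chunks and yields one line at a time. The sketch below illustrates the pattern only; `deviceNames` and `deviceNamesFromText` are hypothetical helpers, and the choice to skip short lines silently is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// deviceNames streams r one line at a time and returns the device name
// (third column) of every line that has at least 14 fields.
func deviceNames(r io.Reader) ([]string, error) {
	var names []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 14 {
			continue // skip blank or malformed lines
		}
		names = append(names, f[2])
	}
	// sc.Err reports read errors; it is nil at a normal end of input.
	return names, sc.Err()
}

// deviceNamesFromText is a convenience wrapper for trying the scanner on
// an in-memory sample instead of an opened diskstats file.
func deviceNamesFromText(text string) ([]string, error) {
	return deviceNames(strings.NewReader(text))
}

func main() {
	sample := "   8       0 sda 123456 567 890123 4567 890 123 456789 1234 0 5678 9012\n" +
		"truncated line\n" +
		" 259       0 nvme0n1 345678 789 1234567 6789 1234 567 890123 3456 2 7890 11234\n"
	names, err := deviceNamesFromText(sample)
	fmt.Println(names, err)
}
```

In real use the reader would be the opened diskstats file under HostProcPath; taking an io.Reader keeps the parsing testable without touching /proc.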
Related Collectors
- Disk Info Collector: Collects static disk hardware information (model, size, queue configuration)
- File System Collector: Monitors mounted filesystem usage and capacity
- IO Stat Collector: Advanced I/O statistics if available
- Process Collector: Per-process disk I/O statistics
Collector Relationships
- Disk Collector: Runtime I/O statistics (dynamic)
- Disk Info Collector: Hardware configuration (static)
- Together they provide complete disk monitoring coverage
Troubleshooting Tips
- Verify Data Source:
  cat /proc/diskstats
- Check Device Filtering:
  # List all block devices
  lsblk
  # Approximate which devices would be collected (assumes sdX-style, NVMe, and MMC partition naming)
  awk '$3 !~ /^(sd|hd|vd|xvd)[a-z]+[0-9]+$/ && $3 !~ /^(nvme[0-9]+n[0-9]+|mmcblk[0-9]+)p[0-9]+$/ {print $3}' /proc/diskstats
- Monitor Collection:
  # Watch diskstats changes
  watch -n 1 'grep -E "sda |nvme0n1 " /proc/diskstats'
- Calculate Rates Manually:
  # Simple IOPS calculation: the difference between the two printed totals is the number of I/Os completed in one second
  awk '/sda / {print $4+$8}' /proc/diskstats; sleep 1; awk '/sda / {print $4+$8}' /proc/diskstats