Kernel Collector - antimetal/system-agent GitHub Wiki

Kernel Collector

Overview

The Kernel Collector monitors and collects kernel messages from the Linux kernel's ring buffer via /dev/kmsg. This collector provides real-time insights into kernel-level events, errors, warnings, and system behavior. It's particularly valuable for:

  • System Diagnostics: Identifying hardware failures, driver issues, and kernel panics
  • Security Monitoring: Detecting kernel-level security events and anomalies
  • Performance Analysis: Understanding system bottlenecks and resource constraints
  • Troubleshooting: Correlating application issues with kernel events

The collector can operate in both one-shot mode (collecting recent messages) and continuous mode (streaming new messages as they occur).

Technical Details

MetricType

  • Type: performance.MetricTypeKernel ("kernel")
  • Registry: Auto-registered via init() function

Data Source

  • Primary: /dev/kmsg - Kernel message ring buffer interface
  • Secondary: /proc/stat - Used to determine system boot time for timestamp calculations
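
The boot-time lookup can be sketched as follows. This is a minimal illustration, not the collector's actual code: parseBootTime and absoluteTime are hypothetical names, and the sketch assumes the boot time is taken from the btime line of /proc/stat (seconds since the Unix epoch) and that /dev/kmsg timestamps are microseconds since boot.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseBootTime finds the "btime" line in /proc/stat contents. The value is
// the boot time in seconds since the Unix epoch.
func parseBootTime(statContents string) (time.Time, error) {
	for _, line := range strings.Split(statContents, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "btime" {
			secs, err := strconv.ParseInt(fields[1], 10, 64)
			if err != nil {
				return time.Time{}, fmt.Errorf("parse btime: %w", err)
			}
			return time.Unix(secs, 0).UTC(), nil
		}
	}
	return time.Time{}, fmt.Errorf("btime line not found")
}

// absoluteTime adds a /dev/kmsg timestamp (microseconds since boot) to the
// boot time to get a wall-clock timestamp.
func absoluteTime(boot time.Time, usecSinceBoot int64) time.Time {
	return boot.Add(time.Duration(usecSinceBoot) * time.Microsecond)
}

func main() {
	// Inline sample; a real caller would read <HostProcPath>/stat instead.
	sample := "cpu  10 20 30 40\nbtime 1705312800\nprocesses 4242\n"
	boot, err := parseBootTime(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(absoluteTime(boot, 5678901234).Format(time.RFC3339Nano))
}
```

Kernel timestamps come from a monotonic clock, so a conversion like this may drift from wall-clock time after a suspend or a clock adjustment.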

Capabilities

CollectorCapabilities{
    SupportsOneShot:    true,   // Can collect recent messages
    SupportsContinuous: true,   // Can stream new messages
    RequiresRoot:       true,   // /dev/kmsg requires CAP_SYSLOG or root
    RequiresEBPF:       false,
    MinKernelVersion:   "3.5.0", // /dev/kmsg introduced in Linux 3.5
}

Collector Configuration

  • Message Limit: Default 50 messages (configurable via WithMessageLimit())
  • Buffer Size: 8KB for reading kernel messages
  • Channel Buffer: 100 messages for continuous collection

Collected Metrics

Field | Type | Description | Example
--- | --- | --- | ---
Timestamp | time.Time | Absolute time when the message was generated | 2024-01-15 10:30:45.123456
Facility | uint8 | Syslog facility (priority >> 3) | 0 (kernel)
Severity | uint8 | Message severity level (0-7) | 6 (INFO)
SequenceNum | uint64 | Kernel sequence number | 12345
Message | string | Raw kernel message text | "usb 1-1: new high-speed USB device..."
Subsystem | string | Parsed kernel subsystem | "usb", "ext4", "network"
Device | string | Parsed device identifier | "1-1", "sda1", "eth0"

Severity Levels

  • 0 - KERN_EMERG: System is unusable
  • 1 - KERN_ALERT: Action must be taken immediately
  • 2 - KERN_CRIT: Critical conditions
  • 3 - KERN_ERR: Error conditions
  • 4 - KERN_WARNING: Warning conditions
  • 5 - KERN_NOTICE: Normal but significant condition
  • 6 - KERN_INFO: Informational
  • 7 - KERN_DEBUG: Debug-level messages
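
As a small illustration of how the Facility and Severity fields relate to the raw priority value, the sketch below splits a priority into its two parts and maps the severity to the names listed above. decodePriority and severityName are hypothetical helpers, not part of the collector's API.

```go
package main

import "fmt"

// severityNames is indexed by severity level (0-7), following the list above.
var severityNames = [8]string{
	"EMERG", "ALERT", "CRIT", "ERR", "WARNING", "NOTICE", "INFO", "DEBUG",
}

// decodePriority splits a syslog-style priority into facility (upper bits)
// and severity (lowest three bits).
func decodePriority(priority uint32) (facility, severity uint8) {
	return uint8(priority >> 3), uint8(priority & 7)
}

// severityName returns the short name for a severity level, or "UNKNOWN"
// for values outside 0-7.
func severityName(severity uint8) string {
	if int(severity) >= len(severityNames) {
		return "UNKNOWN"
	}
	return severityNames[severity]
}

func main() {
	facility, severity := decodePriority(6) // priority 6: kernel facility, INFO
	fmt.Println(facility, severity, severityName(severity))
}
```

Lower numbers are more severe, so a threshold such as "ERR and worse" means severity <= 3.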

Data Structure

The collector returns []*performance.KernelMessage in one-shot mode or individual *performance.KernelMessage objects in continuous mode.

Source Code: pkg/performance/collectors/kernel.go
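
For orientation, the fields in the Collected Metrics table could map onto a Go struct like the one below. This is a sketch reconstructed from the table only; the authoritative definition is performance.KernelMessage in the repository and may differ in tags, ordering, or extra fields. decodeMessage is a hypothetical helper added for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// KernelMessage mirrors the documented fields. It is a local stand-in, not
// the type exported by the performance package.
type KernelMessage struct {
	Timestamp   time.Time
	Facility    uint8
	Severity    uint8
	SequenceNum uint64
	Message     string
	Subsystem   string
	Device      string
}

// decodeMessage parses one JSON-encoded message using the default field names.
func decodeMessage(data string) (KernelMessage, error) {
	var m KernelMessage
	err := json.Unmarshal([]byte(data), &m)
	return m, err
}

func main() {
	m, err := decodeMessage(`{"Timestamp":"2024-01-15T10:30:45.123456Z",` +
		`"Facility":0,"Severity":6,"SequenceNum":12345,` +
		`"Message":"usb 1-1: new high-speed USB device number 2 using xhci_hcd",` +
		`"Subsystem":"usb","Device":"1-1"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.SequenceNum, m.Subsystem, m.Device)
}
```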

Configuration

Basic Usage

config := performance.CollectionConfig{
    HostProcPath: "/proc",
    HostDevPath:  "/dev",
}
collector, err := collectors.NewKernelCollector(logger, config)

Custom Message Limit

collector, err := collectors.NewKernelCollector(
    logger, 
    config,
    collectors.WithMessageLimit(100), // Collect up to 100 messages
)

Container Configuration

When running in containers, mount the host's /dev directory into the container and point HostDevPath at the mount path (/host/dev in this example):

volumes:
  - name: dev
    hostPath:
      path: /dev
      type: Directory
volumeMounts:
  - name: dev
    mountPath: /host/dev
    readOnly: true

Platform Considerations

Linux Kernel Requirements

  • Minimum Version: Linux 3.5.0 (when /dev/kmsg was introduced)
  • Capabilities: Requires CAP_SYSLOG or root privileges
  • File Access: Read access to /dev/kmsg and /proc/stat

Container Considerations

  1. Device Access: Must mount host's /dev directory
  2. Privileges: Container needs appropriate capabilities or run as root
  3. Security: Consider using CAP_SYSLOG instead of full root access
  4. Namespace: Messages come from the host kernel and are not scoped to the container

Message Format

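The collector parses the standard /dev/kmsg record format. The priority field encodes facility and severity, sequence is the kernel sequence number, and timestamp is the time since boot in microseconds: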

<priority>,<sequence>,<timestamp>,<flags>;<message>

Example:

6,1234,5678901234,-;usb 1-1: new high-speed USB device number 2 using xhci_hcd
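
Parsing a record in this format can be sketched as below. kmsgRecord and parseKmsgRecord are hypothetical names and this is not the collector's parser; the sketch assumes the header may carry extra comma-separated fields and that any lines after the first belong to optional metadata, so it ignores both.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// kmsgRecord is a hypothetical holder for one parsed /dev/kmsg record.
type kmsgRecord struct {
	Facility      uint8
	Severity      uint8
	Sequence      uint64
	UsecSinceBoot int64
	Flags         string
	Message       string
}

// parseKmsgRecord parses "<priority>,<sequence>,<timestamp>,<flags>;<message>".
func parseKmsgRecord(raw string) (kmsgRecord, error) {
	header, body, ok := strings.Cut(raw, ";")
	if !ok {
		return kmsgRecord{}, fmt.Errorf("missing ';' between header and message")
	}
	// Require the four documented fields but tolerate additional ones.
	fields := strings.Split(header, ",")
	if len(fields) < 4 {
		return kmsgRecord{}, fmt.Errorf("want at least 4 header fields, got %d", len(fields))
	}
	prio, err := strconv.ParseUint(fields[0], 10, 32)
	if err != nil {
		return kmsgRecord{}, fmt.Errorf("priority: %w", err)
	}
	seq, err := strconv.ParseUint(fields[1], 10, 64)
	if err != nil {
		return kmsgRecord{}, fmt.Errorf("sequence: %w", err)
	}
	usec, err := strconv.ParseInt(fields[2], 10, 64)
	if err != nil {
		return kmsgRecord{}, fmt.Errorf("timestamp: %w", err)
	}
	// Keep only the first line of the body as the message text.
	msg, _, _ := strings.Cut(body, "\n")
	return kmsgRecord{
		Facility:      uint8(prio >> 3),
		Severity:      uint8(prio & 7),
		Sequence:      seq,
		UsecSinceBoot: usec,
		Flags:         fields[3],
		Message:       msg,
	}, nil
}

func main() {
	rec, err := parseKmsgRecord("6,1234,5678901234,-;usb 1-1: new high-speed USB device number 2 using xhci_hcd\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", rec)
}
```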

Common Issues

Permission Denied

Problem: Cannot read /dev/kmsg

failed to open /dev/kmsg: permission denied

Solution:

  • Ensure container has CAP_SYSLOG capability
  • Or run with appropriate privileges
  • Check SELinux/AppArmor policies
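
Before digging into capabilities or security policies, it can help to confirm which failure the process actually sees. The sketch below is a standalone check with hypothetical names (checkAccess and the status constants); it only tries to open the path and classifies the error.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// Status values returned by checkAccess (names are illustrative).
const (
	statusOK         = "ok"
	statusPermission = "permission denied"
	statusMissing    = "not found"
	statusOther      = "other error"
)

// checkAccess tries to open path read-only and reports why it failed, if it did.
func checkAccess(path string) (string, error) {
	f, err := os.Open(path)
	if err == nil {
		f.Close()
		return statusOK, nil
	}
	switch {
	case errors.Is(err, fs.ErrPermission):
		// Typically a missing CAP_SYSLOG, non-root user, or an LSM policy.
		return statusPermission, err
	case errors.Is(err, fs.ErrNotExist):
		// Typically a missing or incorrect /dev mount in the container.
		return statusMissing, err
	default:
		return statusOther, err
	}
}

func main() {
	// Adjust the path to match the mounted host /dev, e.g. /host/dev/kmsg.
	status, err := checkAccess("/dev/kmsg")
	fmt.Println(status, err)
}
```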

Missing Messages

Problem: Kernel ring buffer overrun

Kernel ring buffer overrun, some messages lost

Solution:

  • Increase kernel log buffer size (log_buf_len kernel parameter)
  • Collect messages more frequently
  • Filter less important messages at kernel level
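
Lost messages can also be estimated from the data itself: the kernel assigns consecutive sequence numbers, so a gap between two collected SequenceNum values suggests records were overwritten before they were read. The kernel's /dev/kmsg documentation additionally describes read() failing with EPIPE in that situation. countLost below is a hypothetical helper and assumes the input is in ascending order and unfiltered.

```go
package main

import "fmt"

// countLost sums the gaps between consecutive sequence numbers.
// It assumes seqs is sorted ascending and contains every message collected.
func countLost(seqs []uint64) uint64 {
	var lost uint64
	for i := 1; i < len(seqs); i++ {
		if seqs[i] > seqs[i-1]+1 {
			lost += seqs[i] - seqs[i-1] - 1
		}
	}
	return lost
}

func main() {
	fmt.Println(countLost([]uint64{10, 11, 15, 16}))
}
```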

No Messages Available

Problem: Empty results in containers

Solution:

  • Verify /dev/kmsg is properly mounted from host
  • Check if kernel logging is enabled
  • Rule out other readers as the cause: each open /dev/kmsg descriptor has its own read position, so concurrent readers do not consume each other's messages

Examples

Sample Output

{
  "Timestamp": "2024-01-15T10:30:45.123456Z",
  "Facility": 0,
  "Severity": 6,
  "SequenceNum": 12345,
  "Message": "usb 1-1: new high-speed USB device number 2 using xhci_hcd",
  "Subsystem": "usb",
  "Device": "1-1"
}

Filtering Examples

Error Messages Only

messages, _ := collector.Collect(ctx) // error ignored for brevity; check it in real code
severe := filterBySeverity(messages, 3) // severity <= 3: ERR, CRIT, ALERT, EMERG

Specific Subsystem

usbMessages := filterBySubsystem(messages, "usb")
networkMessages := filterBySubsystem(messages, "network")
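
The filterBySeverity and filterBySubsystem helpers used above are not defined on this page. One possible shape is sketched below, using a local kernelMessage stand-in instead of performance.KernelMessage so the example is self-contained. Both helpers are assumptions for illustration; if Collect returns an interface value, a type assertion to the message slice would be needed before calling them.

```go
package main

import "fmt"

// kernelMessage is a local stand-in with only the fields these helpers need.
type kernelMessage struct {
	Severity  uint8
	Subsystem string
	Message   string
}

// filterBySeverity keeps messages whose severity number is <= maxSeverity,
// that is, messages at least as severe as the given level.
func filterBySeverity(msgs []*kernelMessage, maxSeverity uint8) []*kernelMessage {
	var out []*kernelMessage
	for _, m := range msgs {
		if m != nil && m.Severity <= maxSeverity {
			out = append(out, m)
		}
	}
	return out
}

// filterBySubsystem keeps messages whose parsed subsystem matches exactly.
func filterBySubsystem(msgs []*kernelMessage, subsystem string) []*kernelMessage {
	var out []*kernelMessage
	for _, m := range msgs {
		if m != nil && m.Subsystem == subsystem {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	msgs := []*kernelMessage{
		{Severity: 6, Subsystem: "usb", Message: "new high-speed USB device"},
		{Severity: 3, Subsystem: "ext4", Message: "I/O error"},
		{Severity: 2, Subsystem: "usb", Message: "device descriptor read error"},
	}
	fmt.Println(len(filterBySeverity(msgs, 3)), len(filterBySubsystem(msgs, "usb")))
}
```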

Performance Impact

Resource Usage

  • CPU: Minimal - only active when reading messages
  • Memory:
    • Ring buffer: Up to messageLimit * average_message_size
    • Continuous mode: Additional channel buffer
  • I/O: Direct kernel interface, no disk I/O
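
To illustrate how a message limit bounds memory, the sketch below keeps only the most recent N entries in a fixed-size buffer. This is a generic technique shown under assumptions, not a description of how the collector stores messages; lastN and its methods are hypothetical.

```go
package main

import "fmt"

// lastN keeps the most recent len(buf) strings in insertion order.
type lastN struct {
	buf  []string
	next int  // index where the next item will be written
	full bool // true once the buffer has wrapped at least once
}

// newLastN creates a buffer holding at most n items (n is clamped to >= 1).
func newLastN(n int) *lastN {
	if n < 1 {
		n = 1
	}
	return &lastN{buf: make([]string, n)}
}

// add stores s, overwriting the oldest item when the buffer is full.
func (r *lastN) add(s string) {
	r.buf[r.next] = s
	r.next = (r.next + 1) % len(r.buf)
	if r.next == 0 {
		r.full = true
	}
}

// items returns a copy of the stored items, oldest first.
func (r *lastN) items() []string {
	if !r.full {
		return append([]string(nil), r.buf[:r.next]...)
	}
	out := make([]string, 0, len(r.buf))
	out = append(out, r.buf[r.next:]...)
	return append(out, r.buf[:r.next]...)
}

func main() {
	r := newLastN(3)
	for _, s := range []string{"a", "b", "c", "d"} {
		r.add(s)
	}
	fmt.Println(r.items())
}
```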

Optimization Tips

  1. Message Limit: Adjust based on monitoring needs
  2. Collection Frequency: Balance between real-time needs and overhead
  3. Filtering: Consider kernel-level filtering with printk levels
  4. Continuous Mode: Use for real-time monitoring, one-shot for periodic checks

Benchmarks

  • One-shot collection: ~1-5ms for 50 messages
  • Message parsing: ~10μs per message
  • Continuous mode overhead: <0.1% CPU
