Hardware Discovery
✅ COMPLETE: Hardware discovery is fully implemented and operational. The system collects comprehensive hardware information and builds a complete graph representation with all relationship types.
Implementation Status:
- ✅ Hardware graph builder
- ✅ Protobuf data models
- ✅ Resource store integration
- ✅ Performance collector integration
- ✅ Actual hardware data collection
- ✅ All relationship types (Contains, SharesSocket, NUMAAffinity, NUMADistance, ConnectedTo)
The Hardware Graph feature adds hardware configuration discovery and graph representation to the Antimetal Agent, enabling physical and virtual hardware resources to be represented as nodes and relationships in "The Graph" alongside Kubernetes and cloud resources.

```
┌─────────────────────────────────────────────────────────────┐
│                   Performance Collectors                     │
│         (CPUInfo, MemoryInfo, DiskInfo, NetworkInfo)         │
└──────────────────────┬──────────────────────────────────────┘
                       │ Collect hardware data
                       ▼
┌─────────────────────────────────────────────────────────────┐
│                      Hardware Manager                        │
│   - Periodic collection orchestration                        │
│   - Snapshot aggregation                                     │
└──────────────────────┬──────────────────────────────────────┘
                       │ Hardware snapshot
                       ▼
┌─────────────────────────────────────────────────────────────┐
│                   Hardware Graph Builder                     │
│   - Converts collector data to graph nodes                   │
│   - Creates RDF triplet relationships                        │
└──────────────────────┬──────────────────────────────────────┘
                       │ Resources & Relationships
                       ▼
┌─────────────────────────────────────────────────────────────┐
│                       Resource Store                         │
│         (BadgerDB - stores nodes and relationships)          │
└─────────────────────────────────────────────────────────────┘
```
- Performance collectors read from `/proc` and `/sys` filesystems
- Hardware Manager orchestrates periodic collection (default: 5 minutes)
- Graph Builder transforms raw data into graph nodes and relationships
- Resource Store persists the hardware graph using RDF triplets
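
The data flow above amounts to a periodic loop: collect, aggregate into a snapshot, build graph resources and relationships, persist. The Go sketch below illustrates that loop only; the interface, type, and method names (`Collector`, `GraphBuilder`, `Run`, and so on) are hypothetical placeholders, not the agent's actual API.

```go
// Illustrative sketch of the collection loop described above.
// All type, interface, and method names are hypothetical placeholders.
package hardware

import (
	"context"
	"time"
)

// Snapshot aggregates the output of the performance collectors.
type Snapshot struct{ /* CPU, memory, disk, network data */ }

// Resource and Relationship stand in for graph nodes and RDF-style triplets.
type Resource struct{}
type Relationship struct{}

type Collector interface {
	Collect(ctx context.Context, snap *Snapshot) error // CPUInfo, MemoryInfo, ...
}

type GraphBuilder interface {
	Build(snap Snapshot) ([]Resource, []Relationship, error)
}

type Store interface {
	Apply(res []Resource, rels []Relationship) error // persisted in BadgerDB
}

// Run performs one collection pass per tick (default interval: 5 minutes).
func Run(ctx context.Context, cs []Collector, b GraphBuilder, st Store, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		var snap Snapshot
		for _, c := range cs {
			if err := c.Collect(ctx, &snap); err != nil {
				return err
			}
		}
		res, rels, err := b.Build(snap)
		if err != nil {
			return err
		}
		if err := st.Apply(res, rels); err != nil {
			return err
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}
```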
```mermaid
graph TB
%% Node Style Definitions
classDef systemNode fill:#e1f5fe,stroke:#01579b,stroke-width:3px
classDef cpuNode fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
classDef memoryNode fill:#e8f5e8,stroke:#1b5e20,stroke-width:2px
classDef storageNode fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef networkNode fill:#fce4ec,stroke:#880e4f,stroke-width:2px
classDef numaNode fill:#f1f8e9,stroke:#33691e,stroke-width:2px
%% System Root
SYS["SystemNode<br/>📟 hostname<br/>🏗️ x86_64<br/>⏰ boot_time<br/>🐧 Linux 6.8"]
%% CPU Topology
PKG0["CPUPackageNode<br/>🔧 socket-0<br/>⚡ Intel Xeon<br/>💾 36MB Cache<br/>🧮 8C/16T"]
PKG1["CPUPackageNode<br/>🔧 socket-1<br/>⚡ Intel Xeon<br/>💾 36MB Cache<br/>🧮 8C/16T"]
CORE0["CPUCoreNode<br/>🎯 core-0<br/>📊 3.2GHz"]
CORE1["CPUCoreNode<br/>🎯 core-1<br/>📊 3.2GHz"]
CORE8["CPUCoreNode<br/>🎯 core-8<br/>📊 3.2GHz"]
CORE9["CPUCoreNode<br/>🎯 core-9<br/>📊 3.2GHz"]
%% Memory Topology
MEM["MemoryModuleNode<br/>💾 64GB Total<br/>🧠 NUMA Enabled<br/>⚖️ Balancing: Yes"]
NUMA0["NUMANode<br/>🏷️ node-0<br/>💾 32GB<br/>🧮 CPUs: 0-7<br/>📏 [10,20]"]
NUMA1["NUMANode<br/>🏷️ node-1<br/>💾 32GB<br/>🧮 CPUs: 8-15<br/>📏 [20,10]"]
%% Storage Topology
NVME["DiskDeviceNode<br/>💿 nvme0n1<br/>📦 1TB Samsung<br/>⚡ SSD (NVMe)<br/>🎯 4KB blocks"]
SATA["DiskDeviceNode<br/>💿 sda<br/>📦 4TB Seagate<br/>🔄 HDD (SATA)<br/>🎯 512B blocks"]
NVME_P1["DiskPartitionNode<br/>📁 nvme0n1p1<br/>📊 100GB<br/>🎯 sector 2048"]
NVME_P2["DiskPartitionNode<br/>📁 nvme0n1p2<br/>📊 900GB<br/>🎯 sector 204800"]
SATA_P1["DiskPartitionNode<br/>📁 sda1<br/>📊 4TB<br/>🎯 sector 2048"]
%% Network Topology
ETH0["NetworkInterfaceNode<br/>🌐 eth0<br/>🔗 10Gbps<br/>📶 Full Duplex<br/>🚀 ena driver"]
ETH1["NetworkInterfaceNode<br/>🌐 eth1<br/>🔗 10Gbps<br/>📶 Full Duplex<br/>🚀 ena driver"]
%% Containment Relationships (Contains)
SYS -->|"Contains<br/>(physical)"| PKG0
SYS -->|"Contains<br/>(physical)"| PKG1
SYS -->|"Contains<br/>(physical)"| MEM
SYS -->|"Contains<br/>(logical)"| NUMA0
SYS -->|"Contains<br/>(logical)"| NUMA1
SYS -->|"Contains<br/>(physical)"| NVME
SYS -->|"Contains<br/>(physical)"| SATA
SYS -->|"Contains<br/>(physical)"| ETH0
SYS -->|"Contains<br/>(physical)"| ETH1
PKG0 -->|"Contains<br/>(physical)"| CORE0
PKG0 -->|"Contains<br/>(physical)"| CORE1
PKG1 -->|"Contains<br/>(physical)"| CORE8
PKG1 -->|"Contains<br/>(physical)"| CORE9
NVME -->|"Contains<br/>(partition)"| NVME_P1
NVME -->|"Contains<br/>(partition)"| NVME_P2
SATA -->|"Contains<br/>(partition)"| SATA_P1
%% NUMA Affinity Relationships
MEM -.->|"NUMAAffinity<br/>node-0"| NUMA0
MEM -.->|"NUMAAffinity<br/>node-1"| NUMA1
%% Socket Sharing Relationships (CPU cores on same socket)
CORE0 <-.->|"SharesSocket<br/>socket-0"| CORE1
CORE8 <-.->|"SharesSocket<br/>socket-1"| CORE9
%% NUMA Distance Relationships (between NUMA nodes)
NUMA0 <-.->|"NUMADistance<br/>local: 10<br/>remote: 20"| NUMA1
%% Bus Connection Relationships
NVME -.->|"ConnectedTo<br/>NVMe bus"| SYS
SATA -.->|"ConnectedTo<br/>SATA bus"| SYS
ETH0 -.->|"ConnectedTo<br/>PCI bus"| SYS
ETH1 -.->|"ConnectedTo<br/>PCI bus"| SYS
%% Apply Styles
class SYS systemNode
class PKG0,PKG1,CORE0,CORE1,CORE8,CORE9 cpuNode
class MEM memoryNode
class NUMA0,NUMA1 numaNode
class NVME,SATA,NVME_P1,NVME_P2,SATA_P1 storageNode
class ETH0,ETH1 networkNode
```

| Relationship Type | Visual Style | Description | Examples |
|---|---|---|---|
| Contains | Solid arrow | Hierarchical containment relationships | System → CPU Package → CPU Core; Disk Device → Partition |
| NUMAAffinity | Dotted arrow | Memory/CPU affinity to NUMA nodes | Memory Module → NUMA Node |
| SharesSocket | Bidirectional dotted | CPU cores sharing physical sockets | Core-0 ↔ Core-1 (same socket) |
| NUMADistance | Bidirectional dotted | Distance metrics between NUMA nodes | NUMA-0 ↔ NUMA-1 (distance: 20) |
| ConnectedTo | Dotted arrow | Hardware bus connections | Disk → System (via NVMe bus); Network → System (via PCI bus) |
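
Each edge in the table is stored in the Resource Store as a subject-predicate-object triplet. The sketch below shows one plausible in-memory shape for such a triplet; the field names and reference format are illustrative assumptions, not the actual schema.

```go
// Hypothetical illustration of how a relationship ("RDF triplet") might look.
// Field names, predicate strings, and reference formats are assumptions.
type Relationship struct {
	Subject    string            // e.g. "system/node-01.example.com"
	Predicate  string            // "Contains", "SharesSocket", "NUMAAffinity", "NUMADistance", "ConnectedTo"
	Object     string            // e.g. "cpu-package/socket-0"
	Properties map[string]string // predicate-specific attributes
}

// Example: the system contains CPU package socket-0 (physical containment).
var example = Relationship{
	Subject:    "system/node-01.example.com",
	Predicate:  "Contains",
	Object:     "cpu-package/socket-0",
	Properties: map[string]string{"type": "physical"},
}
```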
SystemNode is the root node representing the physical or virtual machine.
Properties:
- `hostname`: System hostname
- `architecture`: CPU architecture (x86_64, arm64)
- `boot_time`: System boot timestamp
- `kernel_version`: Kernel version string
- `os_info`: Operating system information
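
Most of these values come straight from procfs or standard syscalls. The sketch below shows where they plausibly originate on Linux; it is a simplified illustration, not the agent's collector code, and the struct-free output is just for demonstration.

```go
// Sketch of gathering SystemNode-style properties from standard Linux sources.
package main

import (
	"fmt"
	"os"
	"runtime"
	"strings"
)

func procStr(path string) string {
	b, err := os.ReadFile(path)
	if err != nil {
		return ""
	}
	return strings.TrimSpace(string(b))
}

func main() {
	hostname, _ := os.Hostname()
	fmt.Println("hostname:      ", hostname)
	fmt.Println("architecture:  ", runtime.GOARCH) // or uname(2) for the kernel's own name, e.g. x86_64
	fmt.Println("kernel_version:", procStr("/proc/sys/kernel/osrelease"))
	fmt.Println("os_info:       ", procStr("/proc/version"))
	// boot_time can be derived from the "btime" line in /proc/stat.
}
```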
CPUPackageNode represents a physical CPU socket/package.
Properties:
- `socket_id`: Physical package ID
- `vendor_id`: CPU vendor (GenuineIntel, AuthenticAMD)
- `model_name`: Full CPU model name
- `cpu_family`: CPU family number
- `model`: Model number
- `stepping`: Stepping revision
- `microcode`: Microcode version
- `cache_size`: Cache size string
- `physical_cores`: Number of physical cores
- `logical_cores`: Number of logical cores (with hyperthreading)
CPUCoreNode represents an individual CPU core within a package.
Properties:
- `processor_id`: Logical CPU number
- `core_id`: Physical core ID
- `physical_id`: Parent package ID
- `frequency_mhz`: Current frequency
- `siblings`: Number of sibling threads
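
Both the package-level and core-level properties above map onto fields of `/proc/cpuinfo` (`physical id`, `core id`, `model name`, `cpu MHz`, `siblings`, `cpu cores`, and so on). The sketch below parses the file into per-CPU key/value blocks; it is illustrative only, not the agent's actual collector.

```go
// Minimal sketch of reading per-CPU fields from /proc/cpuinfo.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

type cpuEntry map[string]string // one "processor" block, key -> value

func readCPUInfo() ([]cpuEntry, error) {
	f, err := os.Open("/proc/cpuinfo")
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var entries []cpuEntry
	cur := cpuEntry{}
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := sc.Text()
		if line == "" { // a blank line separates logical CPUs
			if len(cur) > 0 {
				entries = append(entries, cur)
				cur = cpuEntry{}
			}
			continue
		}
		if k, v, ok := strings.Cut(line, ":"); ok {
			cur[strings.TrimSpace(k)] = strings.TrimSpace(v)
		}
	}
	if len(cur) > 0 {
		entries = append(entries, cur)
	}
	return entries, sc.Err()
}

func main() {
	cpus, err := readCPUInfo()
	if err != nil {
		panic(err)
	}
	for _, c := range cpus {
		// "physical id" -> socket_id, "core id" -> core_id, "processor" -> processor_id
		fmt.Printf("cpu %s: socket=%s core=%s model=%q freq=%s MHz\n",
			c["processor"], c["physical id"], c["core id"], c["model name"], c["cpu MHz"])
	}
}
```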
MemoryModuleNode describes the system memory configuration.
Properties:
- `total_bytes`: Total system memory
- `numa_enabled`: NUMA support status
- `numa_balancing_available`: NUMA balancing availability
- `numa_node_count`: Number of NUMA nodes
NUMANode represents a NUMA memory node on systems with non-uniform memory access.
Properties:
- `node_id`: NUMA node identifier
- `total_bytes`: Memory in this NUMA node
- `cpus`: CPU cores assigned to this node
- `distances`: Distance metrics to other nodes
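
NUMA topology is exposed under `/sys/devices/system/node/`: each `nodeN` directory provides `cpulist` (the CPUs assigned to the node) and `distance` (the distance vector to every node). The minimal reading sketch below is separate from, and simpler than, the agent's real collector.

```go
// Sketch of reading NUMA topology from sysfs. The paths are standard Linux
// sysfs attributes; everything else here is illustrative.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	nodes, _ := filepath.Glob("/sys/devices/system/node/node[0-9]*")
	for _, n := range nodes {
		cpulist, _ := os.ReadFile(filepath.Join(n, "cpulist"))   // e.g. "0-7"
		distance, _ := os.ReadFile(filepath.Join(n, "distance")) // e.g. "10 20"
		fmt.Printf("%s: cpus=%s distances=%s\n",
			filepath.Base(n),
			strings.TrimSpace(string(cpulist)),
			strings.TrimSpace(string(distance)))
	}
}
```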
DiskDeviceNode represents a physical storage device.
Properties:
- `device`: Device name (sda, nvme0n1)
- `model`: Model identifier
- `vendor`: Manufacturer
- `size_bytes`: Total capacity
- `rotational`: HDD (true) or SSD (false)
- `block_size`: Logical block size
- `physical_block_size`: Physical block size
- `scheduler`: I/O scheduler
- `queue_depth`: Queue depth
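
These attributes correspond to standard sysfs files under `/sys/block/<device>/`, such as `size` (reported in 512-byte sectors), `queue/rotational`, `queue/logical_block_size`, and `queue/scheduler`. The sketch below reads a few of them; how the agent actually maps them onto `DiskDeviceNode` fields is an assumption here, not taken from the code.

```go
// Sketch of reading block device attributes from sysfs.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

func readAttr(dev, attr string) string {
	b, err := os.ReadFile(filepath.Join("/sys/block", dev, attr))
	if err != nil {
		return ""
	}
	return strings.TrimSpace(string(b))
}

func main() {
	entries, _ := os.ReadDir("/sys/block")
	for _, e := range entries {
		dev := e.Name()
		sectors, _ := strconv.ParseUint(readAttr(dev, "size"), 10, 64) // 512-byte sectors
		fmt.Printf("%s: size=%d bytes rotational=%s logical_block=%s scheduler=%s\n",
			dev,
			sectors*512,
			readAttr(dev, "queue/rotational"),         // "1" = HDD, "0" = SSD
			readAttr(dev, "queue/logical_block_size"), // e.g. "512"
			readAttr(dev, "queue/scheduler"))          // e.g. "[mq-deadline] none"
	}
}
```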
DiskPartitionNode represents a disk partition on a storage device.
Properties:
- `name`: Partition name (sda1, nvme0n1p1)
- `parent_device`: Parent disk device
- `size_bytes`: Partition size
- `start_sector`: Starting sector
NetworkInterfaceNode represents a network adapter/interface.
Properties:
- `interface`: Interface name (eth0, wlan0)
- `mac_address`: Hardware MAC address
- `speed`: Link speed in Mbps
- `duplex`: Duplex mode (full/half)
- `mtu`: Maximum transmission unit
- `driver`: Driver name
- `type`: Interface type (ethernet, wireless, loopback)
- `oper_state`: Operational state
- `carrier`: Carrier detection status
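
The interface properties map onto `/sys/class/net/<iface>/` attributes (`address`, `speed`, `mtu`, `duplex`, `operstate`, `carrier`) plus the `device/driver` symlink. The minimal sketch below reads a subset; some attributes (speed, duplex, carrier) return errors when the link is down, so a real collector treats them as optional.

```go
// Sketch of reading interface attributes from /sys/class/net.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func netAttr(iface, attr string) string {
	b, err := os.ReadFile(filepath.Join("/sys/class/net", iface, attr))
	if err != nil {
		return "unknown" // e.g. speed/duplex/carrier when the link is down
	}
	return strings.TrimSpace(string(b))
}

func main() {
	ifaces, _ := os.ReadDir("/sys/class/net")
	for _, e := range ifaces {
		name := e.Name()
		driver := "unknown"
		if link, err := os.Readlink(filepath.Join("/sys/class/net", name, "device/driver")); err == nil {
			driver = filepath.Base(link)
		}
		fmt.Printf("%s: mac=%s speed=%sMbps mtu=%s state=%s driver=%s\n",
			name,
			netAttr(name, "address"),
			netAttr(name, "speed"),
			netAttr(name, "mtu"),
			netAttr(name, "operstate"),
			driver)
	}
}
```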
Contains is the hierarchical containment relationship.
Properties:
- `type`: Containment type (physical, logical, partition)
Usage:
- System → CPU Package (physical)
- CPU Package → CPU Core (physical)
- System → Memory Module (physical)
- System → Disk Device (physical)
- Disk Device → Partition (partition)
- System → Network Interface (physical)
NUMAAffinity captures NUMA node affinity relationships.
Properties:
- `node_id`: NUMA node identifier
- `distance`: Distance metric (optional)
Usage:
- Memory Module → NUMA Node
- CPU Core → NUMA Node
SharesSocket links CPU cores that share a physical socket.
Properties:
- `physical_id`: Physical package ID
- `socket_id`: Socket identifier
Usage:
- CPU Core ↔ CPU Core (same socket)
ConnectedTo models hardware bus connections (future use).
Properties:
- `bus_type`: Bus type (pci, usb, sata, nvme)
- `bus_address`: Bus address (optional)

```
SystemNode (node-01.example.com)
├── [Contains:physical] → CPUPackageNode (socket-0)
│ ├── [Contains:physical] → CPUCoreNode (core-0)
│ ├── [Contains:physical] → CPUCoreNode (core-1)
│ ├── [Contains:physical] → CPUCoreNode (core-2)
│ └── [Contains:physical] → CPUCoreNode (core-3)
├── [Contains:physical] → CPUPackageNode (socket-1)
│ ├── [Contains:physical] → CPUCoreNode (core-4)
│ ├── [Contains:physical] → CPUCoreNode (core-5)
│ ├── [Contains:physical] → CPUCoreNode (core-6)
│ └── [Contains:physical] → CPUCoreNode (core-7)
├── [Contains:physical] → MemoryModuleNode (64GB)
│ ├── [NUMAAffinity:node-0] → NUMANode (node-0, 32GB)
│ └── [NUMAAffinity:node-1] → NUMANode (node-1, 32GB)
├── [Contains:logical] → NUMANode (node-0)
├── [Contains:logical] → NUMANode (node-1)
├── [Contains:physical] → DiskDeviceNode (nvme0n1, 1TB)
│ ├── [Contains:partition] → DiskPartitionNode (nvme0n1p1, 100GB)
│ └── [Contains:partition] → DiskPartitionNode (nvme0n1p2, 900GB)
├── [Contains:physical] → DiskDeviceNode (sda, 4TB)
│ └── [Contains:partition] → DiskPartitionNode (sda1, 4TB)
├── [Contains:physical] → NetworkInterfaceNode (eth0, 10Gbps)
└── [Contains:physical] → NetworkInterfaceNode (eth1, 10Gbps)
```
- Hardware discovery reads from `/proc` and `/sys` filesystems
- Typical collection time: <100ms on modern systems
- Update interval configurable (default: 5 minutes)
- Each hardware node: ~200-500 bytes
- Typical system: 50-200 nodes total
- Total storage: <100KB per system
At scale, hardware configurations are highly repetitive. Analysis of large fleets shows:
- ~100 unique CPU models across millions of servers
- ~20 common memory configurations (16GB, 32GB, 64GB, 128GB, etc.)
- ~50 unique disk models
- Result: ~1,000 unique hardware profiles serve 99% of hosts
Instead of storing complete hardware graphs for each host, we use a profile catalog pattern:
- Hardware profiles are deduplicated: each unique hardware configuration is stored once
- Hosts reference profiles: each host points to its hardware profile ID
- Profile hashing: agents compute a hash of their hardware locally for fast deduplication (see the sketch below)
- Differential storage: only host-specific data (hostname, serial numbers) is stored per host
For 1 million hosts:
- Without deduplication: 100KB × 1M = 100GB
- With profile catalog:
  - Unique profiles: 1,000 × 100KB = 100MB
  - Host mappings: 1M × 100 bytes = 100MB
  - Total: 200MB (500x reduction)
The approach scales with the number of unique hardware configurations, not the number of hosts, making it ideal for large standardized fleets.
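
The profile-hashing step referenced above can be sketched as follows, assuming a canonical JSON serialization of host-independent fields hashed with SHA-256; the field set and hashing scheme are illustrative assumptions, not the agent's actual format.

```go
// One plausible way to compute a hardware profile hash locally: serialize the
// host-independent parts of the snapshot canonically and hash them.
// The field set and hashing scheme are assumptions for illustration.
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// profileKey holds only fields shared across identically configured hosts;
// host-specific data (hostname, serial numbers, MAC addresses) is excluded.
type profileKey struct {
	CPUModel     string   `json:"cpu_model"`
	Sockets      int      `json:"sockets"`
	CoresPerSock int      `json:"cores_per_socket"`
	MemoryBytes  uint64   `json:"memory_bytes"`
	NUMANodes    int      `json:"numa_nodes"`
	DiskModels   []string `json:"disk_models"` // sorted for a stable hash
	NICDrivers   []string `json:"nic_drivers"` // sorted for a stable hash
}

func profileHash(k profileKey) string {
	// json.Marshal emits struct fields in declaration order, so the encoding
	// is deterministic as long as slice contents are pre-sorted.
	b, _ := json.Marshal(k)
	return fmt.Sprintf("%x", sha256.Sum256(b))
}

func main() {
	k := profileKey{
		CPUModel:     "Intel(R) Xeon(R) Platinum",
		Sockets:      2,
		CoresPerSock: 8,
		MemoryBytes:  64 << 30,
		NUMANodes:    2,
		DiskModels:   []string{"Samsung SSD", "Seagate HDD"},
		NICDrivers:   []string{"ena", "ena"},
	}
	fmt.Println("profile:", profileHash(k)[:16]) // short ID for the catalog
}
```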
- Snapshot data held temporarily during collection
- Graph builder processes incrementally
- No persistent memory cache required
Link hardware nodes to runtime and Kubernetes nodes:
```
K8s Node → [RunsOn] → SystemNode
K8s Pod → [ScheduledOn] → CPUCoreNode
ContainerNode → [RunsOn] → CPUCoreNode (via cpuset.cpus)
ContainerNode → [AllocatedTo] → NUMANode (via cpuset.mems)
```
See Runtime Discovery for complete container and process topology integration.
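
The `cpuset.cpus` and `cpuset.mems` values referenced above are Linux range lists such as `0-3,8-11`. Expanding them into individual IDs is what lets a container be linked to specific `CPUCoreNode` and `NUMANode` entries; the parser below is a small self-contained sketch of that step only, not the agent's runtime-discovery code.

```go
// Sketch of expanding a cpuset list (the contents of cpuset.cpus or
// cpuset.mems, e.g. "0-3,8-11") into individual IDs.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

func parseCPUList(s string) ([]int, error) {
	var ids []int
	for _, part := range strings.Split(strings.TrimSpace(s), ",") {
		if part == "" {
			continue
		}
		lo, hi, isRange := strings.Cut(part, "-")
		start, err := strconv.Atoi(lo)
		if err != nil {
			return nil, err
		}
		end := start
		if isRange {
			if end, err = strconv.Atoi(hi); err != nil {
				return nil, err
			}
		}
		for i := start; i <= end; i++ {
			ids = append(ids, i)
		}
	}
	return ids, nil
}

func main() {
	cores, _ := parseCPUList("0-3,8-11")
	fmt.Println(cores) // [0 1 2 3 8 9 10 11]
}
```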
- GPU devices and topology
- InfiniBand/RDMA adapters
- Hardware accelerators (TPU, FPGA)
- Power management states
- Thermal sensors
- Attach real-time metrics to hardware nodes:
  - CPU utilization per core
  - Memory bandwidth per NUMA node
  - Disk I/O per device
  - Network throughput per interface
- PCIe bus topology
- Memory channel configuration
- CPU cache hierarchy
- Interrupt affinity
This document was migrated from the repository docs. Last updated: 2025-01-19