fabric_intro - OpenNebula/one-apps GitHub Wiki
🚀 Overview
The NVIDIA Fabric Manager Service VM appliance is a specialized OpenNebula tool designed to implement the NVIDIA NVSwitch Virtualization Model. This model is essential for virtualizing systems with multiple GPUs interconnected by NVSwitches (such as HGX or DGX platforms), allowing for the creation of hardware partitions for diverse workloads.
This appliance acts as the necessary Service VM on each compute node, taking control of the NVSwitch devices via PCI Passthrough and running the NVIDIA management software to partition the high-speed fabric interconnect.
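To make the idea of fabric partitioning concrete, here is a toy Python model. It is not the Fabric Manager SDK or the `nv-partitioner` interface. It assumes that a partition is a numbered group of GPU indices and that a GPU can belong to only one active partition at a time. The class name, method names, and the 8-GPU layout are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FabricModel:
    """Toy model of NVSwitch fabric partitions (hypothetical, not an NVIDIA API)."""

    # partition id -> GPU indices that the partition would interconnect
    partitions: dict[int, frozenset[int]]
    # ids of the partitions that are currently active
    active: set[int] = field(default_factory=set)

    def gpus_in_use(self) -> set[int]:
        """Return every GPU that belongs to a currently active partition."""
        used: set[int] = set()
        for pid in self.active:
            used |= self.partitions[pid]
        return used

    def activate(self, pid: int) -> None:
        """Activate a partition unless one of its GPUs is already in use."""
        if pid in self.active:
            return
        conflict = self.partitions[pid] & self.gpus_in_use()
        if conflict:
            raise ValueError(
                f"GPUs {sorted(conflict)} already belong to an active partition"
            )
        self.active.add(pid)

    def deactivate(self, pid: int) -> None:
        """Release a partition's GPUs so that other partitions can use them."""
        self.active.discard(pid)


# Assumed example layout: one 8-GPU partition and two 4-GPU halves.
model = FabricModel(
    partitions={
        0: frozenset(range(8)),
        1: frozenset(range(0, 4)),
        2: frozenset(range(4, 8)),
    }
)
model.activate(1)
model.activate(2)  # allowed: partitions 1 and 2 share no GPUs
```

In this sketch, activating partition 0 while 1 and 2 are active raises `ValueError`, which mirrors the idea that overlapping hardware partitions cannot serve different workloads at the same time.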
📦 Appliance Components
The appliance is pre-configured with all components required to deploy the NVSwitch virtualization model:
| Component | Description | 
|---|---|
| NVIDIA Drivers | Proprietary drivers for hardware detection and management. | 
| Fabric Manager Service | The core NVIDIA service for managing the NVSwitch fabric. | 
| Fabric Manager SDK & Dev | Libraries for custom tool development. | 
| `nv-partitioner` | A custom C++ tool built on the Fabric Manager SDK for logical NVSwitch partitioning. |
⬇️ Download and Requirements
Download
The appliance is available in the OpenNebula Marketplace.
Minimum Requirements
| Requirement | Description | 
|---|---|
| Physical Host | Server with NVIDIA GPUs and NVSwitches (e.g., NVIDIA HGX). | 
| VM Resources | 2 vCPUs, 4 GB RAM. | 
| PCI Assignment | CRITICAL: All server NVSwitch devices must be assigned to the VM using PCI Passthrough. | 
| Host Driver | The NVSwitches on the host must be bound to the vfio-pci driver before instantiation. | 
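The host driver requirement can be checked before instantiating the VM. The following is a minimal sketch that reads the Linux sysfs PCI tree and reports which driver each NVSwitch is bound to. It assumes that NVSwitches appear with NVIDIA's vendor ID (`0x10de`) and a PCI class beginning with `0x0680` (bridge, other); verify this against `lspci -nn` on your host. The function names are illustrative and not part of the appliance.

```python
from __future__ import annotations

import os
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"
# Assumption: NVSwitches report PCI class 0x0680xx ("Bridge: other").
NVSWITCH_CLASS_PREFIX = "0x0680"
REQUIRED_DRIVER = "vfio-pci"


def bound_driver(device_dir: Path) -> str | None:
    """Return the kernel driver bound to a PCI device, or None if unbound."""
    link = device_dir / "driver"
    if not link.is_symlink():
        return None
    # The "driver" entry is a symlink whose last component is the driver name.
    return os.path.basename(os.readlink(link))


def nvswitch_drivers(
    sysfs_root: Path = Path("/sys/bus/pci/devices"),
) -> dict[str, str | None]:
    """Map the PCI address of each NVSwitch-like device to its bound driver."""
    result: dict[str, str | None] = {}
    if not sysfs_root.is_dir():
        return result
    for dev in sorted(sysfs_root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip()
            pci_class = (dev / "class").read_text().strip()
        except OSError:
            continue  # skip entries without readable vendor/class attributes
        if vendor == NVIDIA_VENDOR_ID and pci_class.startswith(NVSWITCH_CLASS_PREFIX):
            result[dev.name] = bound_driver(dev)
    return result


def not_ready(drivers: dict[str, str | None]) -> list[str]:
    """Return the PCI addresses that are not bound to vfio-pci."""
    return sorted(addr for addr, drv in drivers.items() if drv != REQUIRED_DRIVER)
```

Calling `not_ready(nvswitch_drivers())` on the compute node returns an empty list when every detected NVSwitch is bound to `vfio-pci`. Any address it does return still needs to be rebound before the Service VM is instantiated.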
📝 Release Notes
The appliance is built on Ubuntu 22.04 LTS and ships the following component versions:
| Component | Version | 
|---|---|
| Base OS | Ubuntu 22.04 LTS (x86-64) | 
| NVIDIA Driver | 570 | 
| Fabric Manager | 570 | 
| `nv-partitioner` | 1.0.0 (Custom Partitioning Tool) |
Next: Quick Start