# fabric_feature
The main purpose of the NVIDIA Fabric Manager Service VM appliance is to expose the NVSwitch partition management functionality through the nv-partitioner tool. Once partitions are configured, Guest VMs can be deployed on the same host to utilize the defined GPU topologies.
The nv-partitioner tool is the key component for defining the virtual fabric that the NVSwitches present to the Guest VMs. Note: This tool operates on pre-existing partitions defined by a configuration file external to this utility. Its primary function is to list, activate, and deactivate these configured partitions.
All management is performed by SSHing into the Fabric Manager Service VM:
```
$ onevm ssh service_FabricManager_host1
```

## Key Management Commands (nv-partitioner)

The nv-partitioner utility can be run in interactive mode (running it without options) or via command-line flags, following this structure:
```
Usage: nv-partitioner [-i <IP>] -o <OP> [-p <ID>] [-f <FORMAT>]
```

| Flag | Full Name | Description | Example Value(s) |
|---|---|---|---|
| `-i` | `--ip <IP>` | IP address of the Fabric Manager. | Default: `127.0.0.1` |
| `-o` | `--operation <N>` | The operation to perform. Required. | `0` (List), `1` (Activate), `2` (Deactivate) |
| `-p` | `--partition <ID>` | Partition ID. Required for Activate (`1`) or Deactivate (`2`). | Integer ID of the partition |
| `-f` | `--format <FORMAT>` | Output format for the List operation (`0`). | `csv` or `table` (default: `table`) |
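A typical session follows the sketch below; partition ID `1` is only an illustrative value, and real IDs should be taken from the output of the list operation:

```
# List the partitions defined in the Fabric Manager configuration (operation 0)
nv-partitioner -o 0 -f table

# Activate partition 1 (operation 1), e.g. a 4-GPU group
nv-partitioner -o 1 -p 1

# Deactivate partition 1 (operation 2) when it is no longer needed
nv-partitioner -o 2 -p 1
```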
The Service VM is part of a three-step process for virtualizing NVSwitch systems:
- Fabric Setup (Service VM):
  - Deploy this Service VM appliance on the target host with PCI Passthrough of the NVSwitches.
  - Access the VM and use nv-partitioner to Activate the required GPU partitions (e.g., Partition ID 1 for a 4-GPU group).
- Host Reporting (Feedback to OpenNebula):
  - Once a partition is Active, the host will begin reporting the new hardware topology to OpenNebula (see the check sketched below).
  - Only the GPUs assigned to the Active partition will be visible and reported by the host, effectively virtualizing the NVSwitch fabric into usable, isolated blocks.
- Workload Deployment (Guest VM):
  - Instantiate the Guest VM (where the actual workload runs) on the same host.
  - Configure the Guest VM template with PCI Passthrough for the specific GPUs (e.g., GPU 0, 1, 4, 5) that belong to the desired Active partition (see the template sketch at the end of this page).
The NVSwitch fabric, managed by the Service VM, ensures that the Guest VM's assigned GPUs communicate with each other using the high-speed topology defined by the active partition.
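To verify the Host Reporting step, the host's monitoring information can be inspected; the host ID `1` below is only an example, and the exact output layout depends on the OpenNebula version:

```
# Show the host's reported PCI devices after activating a partition; only the
# GPUs belonging to the Active partition should appear in the PCI DEVICES section.
onehost show 1
```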
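As a rough illustration of the Workload Deployment step, the Guest VM template can request the partition's GPUs via PCI Passthrough. The fragment below is only a sketch: the SHORT_ADDRESS values are placeholders and must be replaced with the addresses actually reported by the host for the Active partition.

```
# Illustrative Guest VM template fragment (placeholder values)
CPU    = "8"
MEMORY = "65536"

# One PCI attribute per GPU of the Active partition
PCI = [ SHORT_ADDRESS = "07:00.0" ]  # example address for GPU 0
PCI = [ SHORT_ADDRESS = "08:00.0" ]  # example address for GPU 1
```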