GPU‐Enabled LXD Host with VLAN‐Aware External Container Interfaces - dsi-icl/wikis GitHub Wiki


Modelling the new CPG experimental setup. (For the Landmark project's temporary settings, jump to the "Landmark" section at the end of this page.)

1. Context and Architecture

This document details the setup of a high-performance LXD host capable of running isolated containers with direct access to physical NVIDIA GPUs and to specific VLAN networks. It covers a single host only; the specifics of LXD cluster setup are out of scope.

Architecture Goals:

  • Host OS: Ubuntu 24.04 (Noble Numbat).
  • Network: The container ("model-ptcns") is attached directly to VLAN 22 (via host bridge br22) using a public Floating IP. This bypasses host NAT for specific traffic.
  • Hardware: Direct passthrough of NVIDIA GPUs (Tesla/Server grade) to the container.
  • Workload: Supports nested Docker inside the container for AI/ML workflows.
  • Security: Enforces Anti-Spoofing (IP/MAC) at the LXD level and strict Firewalling (SSH access only) at the Host level.

2. Host Preparation & OS Upgrade

Goal: Ensure the host is running the latest stable OS and kernel before configuring proprietary drivers.

The system is updated and upgraded to the latest Ubuntu LTS release.

# 1. Update package lists and upgrade existing packages
sudo apt update
sudo apt full-upgrade

# 2. Upgrade the Distribution (e.g. to 24.04 LTS)
# Follow interactive prompts to keep or replace config files as needed.
sudo do-release-upgrade

# 3. Reboot into the new kernel/OS
sudo reboot


3. NVIDIA Driver Installation (Host Side)

Goal: Install the proprietary NVIDIA server drivers on the host. LXD needs these host drivers loaded to pass the device to the container.

We use the "headless" server drivers (version 580), optimized for compute tasks rather than graphics display.

# 1. Install utilities to detect recommended drivers
sudo apt install ubuntu-drivers-common

# 2. Install the specific 580 server driver (GPGPU optimized). Change version as required.
sudo ubuntu-drivers install --gpgpu nvidia:580-server

# 3. Install utils for verification (includes nvidia-smi). Make sure to match the version.
sudo apt install nvidia-utils-580-server

# 4. (Check) Verify the Host sees the GPUs
nvidia-smi

Note: If nvidia-smi fails here, do not proceed. A reboot is often required after driver installation.


4. LXD Initialization & Container Launch

Goal: Initialize LXD v6 and launch the target container.

# 1. Refresh LXD to the 6.0 Stable channel
sudo snap refresh lxd --channel=6.0/stable

# 2. Initialize LXD
# (Interactive: Choose ZFS/Btrfs for storage and configure a default management bridge)
sudo lxd init

# 3. Launch the Ubuntu 24.04 container
lxc launch ubuntu:24.04 model-ptcns


5. GPU Passthrough

Goal: Pass specific physical GPUs (ID 0 and ID 1, adapt as needed) into the container.

# Mount /dev/nvidia0 and /dev/nvidia1 inside the container
lxc config device add model-ptcns gpu0 gpu id=0
lxc config device add model-ptcns gpu1 gpu id=1


6. Network Configuration (Floating IP & Security)

Goal: Attach the container to VLAN 22 using a public static IP and enforce strict anti-spoofing.

Prerequisite: The host must have a bridge interface (e.g., br22) configured in Netplan corresponding to VLAN 22.
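
As a reference for that prerequisite, here is a minimal Netplan sketch of such a bridge. The uplink name (eno1) and the file path are assumptions; adapt them to your host.

```yaml
# Hypothetical /etc/netplan/60-vlan22.yaml -- adjust interface names to your host
network:
  version: 2
  ethernets:
    eno1: {}                 # physical uplink (assumed name)
  vlans:
    vlan22:
      id: 22
      link: eno1
  bridges:
    br22:
      interfaces: [vlan22]
      # No host address on br22 is required for the container attachment;
      # add addresses here only if the host itself needs an IP on VLAN 22.
```

Validate with sudo netplan try before applying permanently with sudo netplan apply.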

A. Device Attachment & Security

We attach eth0 to br22 and enable filtering, so the host drops any traffic from the container that does not match the assigned IP/MAC. br22 must already exist as a bridge on a VLAN-tagged interface.

lxc config device add model-ptcns eth0 nic \
    nictype=bridged \
    parent=br22 \
    ipv4.address=IP_ADDRESS \
    ipv6.address=none \
    security.ipv4_filtering=true \
    security.ipv6_filtering=true \
    security.mac_filtering=true

B. Internal Network Configuration (Cloud-Init)

Since br22 is an unmanaged bridge (no DHCP), we inject the static IP configuration into the container using cloud-init.

lxc config set model-ptcns user.network-config "
network:
  version: 2
  ethernets:
    eth0:
      dhcp6: false
      addresses:
      - IP_ADDRESS/CIDR
      nameservers:
        addresses: [DNS_01, DNS_02]
      routes:
      - to: default
        via: GATEWAY_IP
"
# A restart is required to apply cloud-init network changes
lxc restart model-ptcns
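
As a concrete illustration of the placeholders, with purely hypothetical values (203.0.113.10/24 as the floating IP, 203.0.113.1 as the VLAN 22 gateway, public resolvers as DNS), the rendered network-config would look like:

```yaml
network:
  version: 2
  ethernets:
    eth0:
      dhcp6: false
      addresses:
      - 203.0.113.10/24        # hypothetical floating IP with CIDR prefix
      nameservers:
        addresses: [1.1.1.1, 8.8.8.8]
      routes:
      - to: default
        via: 203.0.113.1       # hypothetical gateway on VLAN 22
```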


7. Enable Nested Docker & Runtime Support

Goal: Allow the container to run Docker inside, and allow that internal Docker to access the GPU.

# 1. Enable Nesting (Allows container to manage cgroups/namespaces)
lxc config set model-ptcns security.nesting=true
lxc config set model-ptcns nvidia.runtime=true
lxc config set model-ptcns nvidia.driver.capabilities="all"

# 2. Add NVIDIA Container Toolkit Repo (Host Side)
# (Helpful for host-side tools, though critical installation happens *inside* the container later)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && \
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# 3. Install Runtime
sudo apt update
sudo apt install -y nvidia-container-runtime


8. Host-Level Firewalling (iptables)

Goal: Secure the public IP. Since the container is bridged, traffic passes through the FORWARD chain. We must explicitly filter this.

A. Enable Bridged Traffic Inspection

By default, the kernel may forward bridged traffic without passing it through the iptables FORWARD chain; the br_netfilter module enables this inspection.

# 1. (Check) Verify module is loaded
lsmod | grep br_netfilter

# 2. (Check) Verify sysctl settings
cat /etc/sysctl.conf | grep bridge-nf-call-iptables

# 3. If the checks above come back empty, load the module and enable the setting:
sudo modprobe br_netfilter
echo "net.bridge.bridge-nf-call-iptables = 1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
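
Note that br_netfilter is not guaranteed to load at boot, in which case the sysctl silently has no effect after a reboot. A modules-load.d fragment (hypothetical filename) makes the module load persistent:

```
# Hypothetical /etc/modules-load.d/br_netfilter.conf
br_netfilter
```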

B. Apply Firewall Rules

We explicitly allow only established connections, SSH (22), and Ping (ICMP). All other traffic to this IP is dropped. Adapt as needed.

# 1. Allow return traffic (outgoing connections initiated by container)
sudo iptables -A FORWARD -d IP_ADDRESS -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# 2. Allow inbound SSH
sudo iptables -A FORWARD -d IP_ADDRESS -p tcp --dport 22 -j ACCEPT

# 3. Allow ping
sudo iptables -A FORWARD -d IP_ADDRESS -p icmp -j ACCEPT

# 4. Default deny for this specific IP
sudo iptables -A FORWARD -d IP_ADDRESS -j DROP

C. Persistence

Ensure these rules survive a reboot.

# 1. Inspect the current runtime rules (iptables-save prints them to stdout;
#    it does not persist anything by itself)
sudo iptables-save

# 2. Install persistent loader
sudo apt-get install -y iptables-persistent
# (Select "YES" when asked to save current IPv4 rules)

# 3. (Check) Inspect the saved rules file
cat /etc/iptables/rules.v4

# 4. Reload to verify syntax
sudo netfilter-persistent reload
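
For orientation, the saved rules.v4 should contain entries roughly like the following (IP_ADDRESS as configured above; surrounding chains, counters, and module ordering will differ on your host):

```
# Illustrative excerpt of /etc/iptables/rules.v4
*filter
:FORWARD ACCEPT [0:0]
-A FORWARD -d IP_ADDRESS/32 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -d IP_ADDRESS/32 -p tcp -m tcp --dport 22 -j ACCEPT
-A FORWARD -d IP_ADDRESS/32 -p icmp -j ACCEPT
-A FORWARD -d IP_ADDRESS/32 -j DROP
COMMIT
```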

9. Check and adjust routing

Goal: Understand and correct any routing issue preventing the host from forwarding packets to the container correctly.

A. Check the routing table

ip route get IP_ADDRESS 

Check that the interface the host uses to forward this traffic is the expected one (br22). If it is not, correct the routing by adding explicit routes or by adjusting the host's IP configuration.

Also verify that every interface in the chain accepts the packets: ensure no MAC filtering is applied along the path, and that promiscuous mode is set where required.
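
A sketch of that diagnosis and the possible fixes, using the hypothetical floating IP 203.0.113.10 in place of IP_ADDRESS:

```
# Which interface does the host use to reach the container's IP?
ip route get 203.0.113.10

# If the wrong interface is shown, pin an explicit host route to the bridge:
sudo ip route add 203.0.113.10/32 dev br22

# If frames for the container's MAC are being dropped along the chain,
# enabling promiscuous mode on the bridge can help:
sudo ip link set br22 promisc on
```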

10. Container Internal Setup (Nested Docker & GPU)

Goal: Configure the guest OS to run Docker and use the passed-through GPU. The drivers live on the host, but the NVIDIA Container Toolkit and the Docker daemon must be installed inside the guest. This step is optional; if you skip it, the security.nesting configuration above is also unnecessary.

A. Access and Update

First, we enter the container's shell to run administrative commands directly inside the guest environment.

# 1. Enter the container shell
lxc exec model-ptcns bash

# 2. (Inside Container) Update the guest OS
apt update && apt full-upgrade -y

B. Install Docker (Inside Container)

We install the standard Docker engine. The LXD security.nesting=true flag set earlier allows this daemon to run without privilege errors. The steps below are given as an example; refer to the most recent official Docker installation guide.

# 1. Install prerequisites
apt-get install -y ca-certificates curl gnupg

# 2. Add Docker's official GPG key
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg

# 3. Add the repository to Apt sources
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  tee /etc/apt/sources.list.d/docker.list > /dev/null

# 4. Install Docker Engine
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

C. Install NVIDIA Toolkit (Inside Container)

Critical Step: Even though the Host has the drivers, the internal Docker daemon needs the NVIDIA Container Toolkit to interface with the /dev/nvidia* devices LXD has mounted.

# 1. Add the NVIDIA Toolkit GPG Key and Repo
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && \
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# 2. Install the Toolkit
apt-get update
apt-get install -y nvidia-container-toolkit

D. Configure Nesting & Runtime

This is the most common point of failure. By default, the toolkit tries to manage cgroups, which conflicts with LXD's management. We must disable cgroup management in the toolkit and register it with Docker.

# 1. Disable cgroups in the toolkit (Fixes "permission denied" on cgroup mount)
nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place

# 2. Configure the Docker runtime to recognize "nvidia"
nvidia-ctk runtime configure --runtime=docker

# 3. Restart Docker to apply changes
systemctl restart docker
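
After step 2, /etc/docker/daemon.json should contain an nvidia runtime entry roughly like the following (generated by nvidia-ctk; shown here for reference only, the exact layout may vary by toolkit version):

```json
{
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
```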

E. Final Verification

Run a test container inside your LXD container. This proves the full chain: Host Hardware -> LXD Passthrough -> Container Toolkit -> Nested Docker.

# Run a temporary container requesting all GPUs
docker run --rm --gpus all ubuntu nvidia-smi

Success: You should see the standard NVIDIA SMI table outputting the GPU details (e.g., Tesla V100/A100) inside this nested Docker container.

Landmark

GitHub wiki update on the specifics. Add Eugene's SSH key to the container.