# Ubuntu 18.04 - One Computer Setup
### Note: Ubuntu 18.04 should not be used without discussion with Konstantin. The default Ubuntu version approved for AD-EYE is 16.04.
This guide provides the steps to establish a One Computer Setup, which consists of doing a GPU passthrough (or PCI passthrough) and setting up a Virtual Machine that runs on the isolated GPU.
Essentially, with the PCI passthrough, one of the GPUs is isolated from the NVIDIA driver and a dummy driver is loaded for it instead.
The VM then allows two operating systems to run on the same computer at the same time, with good graphics performance (which is not always the case with standard virtual machines without GPU passthrough).
WARNING: Before attempting anything further, we highly recommend reading this guide in its entirety, as well as the links in the references section.
This guide has been tested with the following machine:
- AMD Ryzen Threadripper 2950X 16-Core
- 64 GB of RAM
- 2x NVidia RTX 2080Ti
- Ubuntu 16.04
- Windows 10 for the virtual machine
Before doing anything, update the BIOS to the latest available version.
Then, in the BIOS:
- Disable all RAID configuration (in Advanced -> AMD PBS)
- Enable Enumerate all IOMMU in IVRS (in Advanced -> AMD PBS)
- Turn on VT-d / SVM Mode (in Advanced -> CPU Configuration)
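Once back in Ubuntu, a quick sanity check confirms that the virtualization extensions took effect (a sketch; `kvm-ok` comes from the `cpu-checker` package, which may need to be installed first):

```bash
# Count the CPU flags for hardware virtualization
# (vmx = Intel VT-x, svm = AMD-V). A result of 0 means the BIOS setting did not stick.
egrep -c '(vmx|svm)' /proc/cpuinfo

# Optional: kvm-ok gives a clearer verdict.
sudo apt-get install cpu-checker
kvm-ok
```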
First of all, attach one monitor to a GPU and another monitor to the other GPU.
Then open the NVIDIA settings by typing `sudo gksu nvidia-settings` (the `sudo` is important, as we'll change the configuration of the X server). If the above command does not work, install gksu with `sudo apt-get install gksu`.
Go to the X Server Display Configuration. Enable the screen connected to the second GPU and activate the Xinerama setting.
Then click on `Save to X Configuration File`, which opens a save dialog.
Save and reboot. Before continuing, Ubuntu should show the display on both monitors.
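As an extra check (assuming the NVIDIA proprietary driver is installed, which ships `nvidia-smi`), confirm that both GPUs are visible to the driver at this point:

```bash
# Both cards should be listed here before the isolation,
# and only one of them after it.
nvidia-smi -L
```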
First, enable IOMMU by modifying the GRUB config: `sudo nano /etc/default/grub` and edit it to match:

- if you run on an AMD CPU:

```
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt kvm_amd.npt=1"
```

- if you run on an Intel CPU:

```
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 pcie_acs_override=downstream"
```
Save (Ctrl+X -> Y -> Enter), then run `sudo update-grub` and reboot your system.
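Once rebooted, you can confirm that the kernel actually booted with the new parameters:

```bash
# The options added to GRUB_CMDLINE_LINUX_DEFAULT should appear in this line.
cat /proc/cmdline
```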
Afterwards, one can verify that IOMMU is enabled:

```
dmesg | grep AMD-Vi    # for an AMD CPU
dmesg | grep -i iommu  # for an Intel CPU
```
You should get this output for an AMD CPU:

```
adeye@adeye:~$ dmesg | grep AMD-Vi
[0.885677] AMD-Vi: IOMMU performance counters supported
[0.885727] AMD-Vi: IOMMU performance counters supported
[0.903346] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[0.903347] AMD-Vi: Extended features (0xf77ef22294ada):
[0.903352] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[0.903353] AMD-Vi: Extended features (0xf77ef22294ada):
[0.903356] AMD-Vi: Interrupt remapping enabled
[0.903357] AMD-Vi: virtual APIC enabled
[0.903695] AMD-Vi: Lazy IO/TLB flushing enabled
```
To list all the IOMMU groups and devices, run the following command:

```
find /sys/kernel/iommu_groups -type l
```
You should get an output of this type:

```
/sys/kernel/iommu_groups/7/devices/0000:00:15.1
/sys/kernel/iommu_groups/7/devices/0000:00:15.0
/sys/kernel/iommu_groups/15/devices/0000:03:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/13/devices/0000:01:00.2
/sys/kernel/iommu_groups/13/devices/0000:01:00.0
/sys/kernel/iommu_groups/13/devices/0000:01:00.3
/sys/kernel/iommu_groups/13/devices/0000:01:00.1
/sys/kernel/iommu_groups/3/devices/0000:00:08.0
/sys/kernel/iommu_groups/11/devices/0000:00:1c.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/8/devices/0000:00:16.0
/sys/kernel/iommu_groups/16/devices/0000:04:00.0
/sys/kernel/iommu_groups/6/devices/0000:00:14.3
/sys/kernel/iommu_groups/14/devices/0000:02:00.2
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.3
/sys/kernel/iommu_groups/14/devices/0000:02:00.1
/sys/kernel/iommu_groups/4/devices/0000:00:12.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.5
/sys/kernel/iommu_groups/12/devices/0000:00:1f.3
/sys/kernel/iommu_groups/12/devices/0000:00:1f.4
/sys/kernel/iommu_groups/2/devices/0000:00:01.1
/sys/kernel/iommu_groups/10/devices/0000:00:1b.0
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:17.0
```
Alternatively, run the following command to get information on the NVIDIA devices only:

```
(for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU Group %s ' "$n"; lspci -nns "${d##*/}"; done;) | grep NVIDIA
```
You should get an output of this type:

```
IOMMU Group 16 0a:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e04] (rev a1)
IOMMU Group 16 0a:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f7] (rev a1)
IOMMU Group 16 0a:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad6] (rev a1)
IOMMU Group 16 0a:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad7] (rev a1)
IOMMU Group 34 42:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e04] (rev a1)
IOMMU Group 34 42:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f7] (rev a1)
IOMMU Group 34 42:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad6] (rev a1)
IOMMU Group 34 42:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad7] (rev a1)
```
We recommend storing this output in a text file, so you won't have to run this command multiple times.
Every device in a group must be passed through together; a passthrough of only one of the devices will not work. Each GPU typically also has an audio device associated with it that must also be passed through. On the latest NVIDIA GPUs, like the RTX 2000 series, you will also see a USB controller and a serial bus controller; these come from the USB-C port on the GPU.
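To double-check that nothing unexpected shares the group, you can list every device in the GPU's IOMMU group (a sketch; group 16 is taken from the example output above, substitute your own group number):

```bash
# Show everything in IOMMU group 16, not just the NVIDIA entries.
# All of these devices must be passed through together.
for d in /sys/kernel/iommu_groups/16/devices/*; do
    lspci -nns "${d##*/}"
done
```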
If the IOMMU grouping is not successful, you need to apply the ACS patch; click here for instructions.
- Create a file `/etc/initramfs-tools/scripts/init-top/vfio.sh` with the following content (replace the `XX` placeholders with the actual PCI addresses of the devices to isolate):

```sh
#!/bin/sh
# Bind the devices of the GPU to isolate to the vfio-pci dummy driver
# before the NVIDIA driver can claim them.
for dev in 0000:XX:XX.X 0000:XX:XX.X
do
    echo "vfio-pci" > /sys/bus/pci/devices/$dev/driver_override
    echo "$dev" > /sys/bus/pci/drivers/vfio-pci/bind
done
```
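For example, with the IOMMU group 16 devices listed earlier in this guide, the loop line would become `for dev in 0000:0a:00.0 0000:0a:00.1 0000:0a:00.2 0000:0a:00.3`.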
- Make the `vfio.sh` file executable by running this command:

```
sudo chmod +x /etc/initramfs-tools/scripts/init-top/vfio.sh
```
- Create the file `/etc/modprobe.d/vfio.conf` with the following content:

```
options kvm_amd avic=1
```
- Create the file `/etc/modprobe.d/nvidia.conf` with the following content:

```
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm_nouveau off
softdep nvidia-* pre: vfio-pci
softdep nvidia_* pre: vfio-pci
softdep nvidia pre: vfio-pci
```
- Run `sudo update-initramfs -k all -u` to update your boot image, then reboot (an optional sanity check is shown below).
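To verify that the script made it into the new boot image before rebooting, you can inspect the initramfs (a sketch using the standard Ubuntu `lsinitramfs` tool):

```bash
# vfio.sh should be listed among the init-top scripts.
lsinitramfs /boot/initrd.img-$(uname -r) | grep vfio
```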
In case of success, only one of your GPUs will be capable of showing the login screen: the one that you have not overridden. Otherwise, both screens display information as before.
- Run `lspci -nk` to confirm the isolation. You should get an output like this one:
```
0a:00.0 0300: 10de:1e04 (rev a1)
	Subsystem: 1462:3711
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
0a:00.1 0403: 10de:10f7 (rev a1)
	Subsystem: 1462:3711
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
0a:00.2 0c03: 10de:1ad6 (rev a1)
	Subsystem: 1462:3711
	Kernel driver in use: vfio-pci
0a:00.3 0c80: 10de:1ad7 (rev a1)
	Subsystem: 1462:3711
	Kernel driver in use: vfio-pci
```
If the isolation is successful, the kernel driver in use for the isolated GPU is vfio-pci. A failure will show the NVIDIA/nouveau module in use, which means you have to debug what went wrong.
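To narrow the output down to the NVIDIA devices only, you can filter by NVIDIA's PCI vendor ID (`10de`):

```bash
# Show kernel driver information for NVIDIA devices only.
lspci -nnk -d 10de:
```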
Reboot until the command `lspci -nnk` gives you this result (pay attention to the Kernel driver lines):
```
0a:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1e04] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3711]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
0a:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f7] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3711]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
0a:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad6] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3711]
	Kernel driver in use: xhci_hcd
0a:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad7] (rev a1)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3711]
	Kernel driver in use: vfio-pci
```
Before starting, install the virtualization manager and related software via:

```
sudo apt-get install qemu-kvm libvirt-bin libvirt-daemon-system bridge-utils virt-manager ovmf hugepages
```
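You can then check that the libvirt daemon is running and usable (a sketch; the group is called `libvirt` in recent Ubuntu packaging, `libvirtd` in some older releases):

```bash
# libvirtd should report "active (running)".
sudo systemctl status libvirtd

# Add your user to the libvirt group so Virt-Manager can connect without root.
sudo adduser $USER libvirt

# An empty table here means libvirt answers correctly.
virsh list --all
```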
First, if you do not have an administrator who has a licence for Matlab and PreScan, you can just follow the Cloning-the-VM tutorial and skip the next steps.
Otherwise, download the Windows ISO file that will be used later on during the setup: ISO files.
Open Virt-Manager. We'll first create a temporary VM.
- Create a new VM
- Select Local install media, then `Forward`
- Browse the Windows ISO file, select `Choose Volume`, then `Forward`
- Enter Memory (RAM): `32768` MiB and CPUs: `8`, then `Forward`
- Select `Create a disk image` with 120 GiB, then `Forward`
- Name the VM as you want, enable `Customize configuration before install`, then `Finish`
- In Overview, keep `BIOS` instead of `UEFI`
- In CPUs, under Configuration, select `core2duo`, then `Apply`
- In Memory, edit `Current allocation` to `32136` MiB (which in our case is half the RAM of the computer), then `Apply`
- In IDE Disk 1, change the Disk bus to `VirtIO`
- In IDE CDROM 1, browse the Windows ISO file (downloaded from OneDrive), then click on `Choose Volume`
- Select Add Hardware and go to Storage:
  - Click on `Select or create custom storage`, then click on `Manage`, browse the stable VirtIO driver ISO file, then click on `Choose Volume`
  - Device type: select `CDROM device`, then click on `Finish`
Remark: leave all network settings as they are (no need to set up a bridge network in our case); choose `e1000` as the device model.
After finishing the previous steps, go to Boot Options and tick `VirtIO Disk 1` (named `IDE Disk 1` before the Disk bus change above), `IDE CDROM 1` and `IDE CDROM 2` to enable them. Then put them in the following order: `IDE CDROM 1` > `IDE Disk 1` > `IDE CDROM 2` and click on `Apply`.
Select Add Hardware, go to `PCI Host Device` and add the NVIDIA PCI devices one by one (you can find them with the IDs used during the isolation of the selected GPU).
Make sure to remove the internet connection (by deleting the NIC hardware) so that the VM cannot connect to the internet; this avoids the Windows installer asking for an online account login.
Click on `Begin Installation`, then follow the steps until Windows boots to the desktop screen:
- Click on `Load Driver`, then `Browse`. Open CD Drive (E:) virtio -> viostor -> w10.
- Click on `amd64`, then `OK` and then `Next`. From this moment onwards, Windows recognizes the partition/drive allocated in the settings.
- Click on `Next` and follow the steps for the Windows 10 Home installation. Select `No` for the tools offered during the installation and make sure to select the Basic installation.
Once you've booted into Windows, make sure to re-add the NIC hardware; this should bring the internet connection back.
Then, go to the Device Manager and install the missing drivers from the VirtIO CDROM.
You will get error code 43 for your GPU, but this is normal: the error occurs when the NVIDIA driver detects that it is running in a virtual environment. Shut down the Windows VM.
Execute the following command to copy the config file of the VM you just set up, replacing `temp_VM` with the name of the temporary VM and `New_VM` with the name you want for the new VM:

```
virsh dumpxml temp_VM > New_VM
```
Modify this new XML file with gedit to replace the first 3 lines with these, replacing NAME_OF_THE_NEW_VM with the name given to the new VM:

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
<name>NAME_OF_THE_NEW_VM</name>
<title>NAME_OF_THE_NEW_VM</title>
```

This way, the line defining the UUID of the VM is deleted (libvirt will generate a new one when the VM is defined).
Then copy the following lines between `</vcpu>` and `<os>`:

```xml
<qemu:commandline>
  <qemu:arg value='-cpu'/>
  <qemu:arg value='host,kvm=off,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_vendor_id=whatever'/>
</qemu:commandline>
```
To get better performance, we'll use hugepages. This feature is enabled by adding the following lines just after the previous `qemu:commandline` ones:

```xml
<memoryBacking>
  <hugepages/>
</memoryBacking>
```
So the beginning of the XML file should look like the sketch below.
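A minimal sketch of that beginning, assembled from the snippets above (the memory and vcpu values are illustrative and will reflect your own configuration; the `...` stands for the rest of the file generated during the installation):

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>NAME_OF_THE_NEW_VM</name>
  <title>NAME_OF_THE_NEW_VM</title>
  <memory unit='KiB'>32907264</memory>
  <currentMemory unit='KiB'>32907264</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,kvm=off,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_vendor_id=whatever'/>
  </qemu:commandline>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <os>
    ...
  </os>
```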
Important: if you see a path between `<nvram>` and `</nvram>`, then something went wrong during the installation (make sure you chose BIOS instead of UEFI as the firmware setting in Overview). If you do not find this tag, you can save your changes and carry on.
Execute `virsh define New_VM` (we define a new VM from the file we just modified). The output is:

```
adeye@adeye06u:~$ virsh define New_VM
Domain New_VM defined from New_VM
```
To ensure that the changes are taken into account, close Virt-Manager and run `sudo systemctl restart libvirtd`.
Restart Virt-Manager and launch your VM. It should boot on the secondary screen. Make sure Windows detects and uses the assigned GPU (the one you isolated). If that is the case, you are almost done with the One Computer Setup!
Once it has booted on the secondary screen, right-click on the Desktop and select Display Settings. Click Identify to see the number of each screen, and under Multiple Displays select Show only on [the number of the screen on the Ubuntu screen].
Shut down the VM and assign half the RAM to the VM. For the CPU, check Copy host CPU configuration and manually set the CPU topology. For both the RAM and the CPU, set the current allocation to the maximum allocation.
In Windows, make sure it uses the hardware you gave to the VM.
You are done !! Well played !! 🥇
If you have a blue screen at startup of the VM, execute the following command line (root access required):

```
echo 1 > /sys/module/kvm/parameters/ignore_msrs
```

Then create a .conf file in `/etc/modprobe.d/` (for example `kvm.conf`) that includes the line `options kvm ignore_msrs=1` to make the setting persistent, as shown below.
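For example, as a one-liner (using the `kvm.conf` example name from above):

```bash
# Persist the ignore_msrs setting across reboots.
echo "options kvm ignore_msrs=1" | sudo tee /etc/modprobe.d/kvm.conf
```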
Set the CPU configuration to `host-passthrough` (enter it by hand, as it doesn't exist in the list) in Virt-Manager.
Also in Virt-Manager, make sure that the display is configured as Display VNC (Type: VNC server, Address: Localhost only).
Once it has booted on the secondary screen, shut down the VM and assign half the RAM to the VM. For the CPU, check Copy host CPU configuration and manually set the CPU topology; in our case, with a 32-thread CPU, we gave 1 socket, 8 cores and 2 threads. For both the RAM and the CPU, set the current allocation to the maximum allocation (16 cores and 32 GB of RAM in our case). In Windows, make sure it uses the hardware you gave to the VM.
🎊 Congratulations, the setup is finished 🎊
### References
- http://mathiashueber.com/amd-ryzen-based-passthrough-setup-between-xubuntu-16-04-and-windows-10/ (maybe the most useful if the setup is done with a Ryzen CPU)
- http://mathiashueber.com/ryzen-based-virtual-machine-passthrough-setup-ubuntu-18-04/
- https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF
- https://bufferoverflow.io/gpu-passthrough/
- https://blog.zerosector.io/2018/07/28/kvm-qemu-windows-10-gpu-passthrough/
- https://heiko-sieger.info/running-windows-10-on-linux-using-kvm-with-vga-passthrough/#The_Need