nvmf setup experiments

Deploying nvmf between a pair of VMs

I use node1 to host a two-VM setup. /dev/sdb is the new 4 TB flash device, mounted at /mnt/sdb/; the VM images live in the atr folder inside it. I used the stosys VM image: I made a base image and boot each VM from a diff image on top of it.

I followed the instructions from here:

Setup host bridge

Follow: https://futurewei-cloud.github.io/ARM-Datacenter/qemu/network-aarch64-qemu-guests/

On the host Linux machine:

sudo ip link add br0 type bridge
sudo ip addr add 192.168.0.1/24 dev br0
sudo ip link set br0 up
# QEMU is not installed system-wide, so the allow rule goes into the bridge.conf of my local QEMU tree
echo 'allow br0' | sudo tee -a /home/atr/src/qemu-6.1.0/etc/qemu/bridge.conf 
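
The bridge steps above can be wrapped in one function, so the same bridge name is used for the `ip` commands and for bridge.conf. This is only a minimal sketch: `run`, `setup_bridge` and the `DRY_RUN` switch are hypothetical names introduced here, not part of QEMU or iproute2. With `DRY_RUN=1` the commands are printed instead of executed.

```shell
# run CMD...: execute the command, or just print it when DRY_RUN=1.
# (run, setup_bridge and DRY_RUN are illustrative names, not QEMU tooling.)
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_bridge NAME CIDR CONF: create the bridge, give it an address,
# bring it up, and allow it in the given qemu bridge.conf.
setup_bridge() {
    br="$1"; cidr="$2"; conf="$3"
    run sudo ip link add "$br" type bridge
    run sudo ip addr add "$cidr" dev "$br"
    run sudo ip link set "$br" up
    # Append the allow rule only if it is not already present.
    if ! grep -qx "allow $br" "$conf" 2>/dev/null; then
        run sudo sh -c "echo 'allow $br' >> '$conf'"
    fi
}
```

For example, `DRY_RUN=1 setup_bridge br0 192.168.0.1/24 /home/atr/src/qemu-6.1.0/etc/qemu/bridge.conf` prints the four commands without touching the system.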

Then add a virtio network device to the VM and attach it to the bridge:

-netdev bridge,id=hn1,br=br??? -device virtio-net,netdev=hn1,mac=e6:c8:ff:09:76:99

Starting the VM may fail with the following error, because QEMU is not installed system-wide and so the bridge helper and its config file are not in the default locations:

qemu-system-x86_64: bridge helper failed

Then you have to pass the path of the bridge helper binary explicitly, something like this:

-netdev bridge,id=hn0,br=br1,helper=/home/atr/src/qemu-6.1.0/build/qemu-bridge-helper

See https://lists.gnu.org/archive/html/qemu-discuss/2021-05/msg00069.html

Other references

Create a base image and do a diff boot

From a base image, make two images:

sudo qemu-img create -f qcow2 -F qcow2 -b ./RO-ubuntu-20.04-stosys-v5.12.qcow ./vm-initiator.qcow
sudo qemu-img create -f qcow2 -F qcow2 -b ./RO-ubuntu-20.04-stosys-v5.12.qcow ./vm-target.qcow

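Then boot the VMs from these diff images. The br= name must match the bridge that was created and allowed in bridge.conf (the commands below use br1):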

Target boot:

sudo /home/atr/src/qemu-6.1.0/build/qemu-system-x86_64 -name qemuzns -m 32G --enable-kvm -cpu host -smp 4 \
  -hda /mnt/sdb/atr/vm-target.qcow \
  -net user,hostfwd=tcp::7777-:22 -net nic \
  -drive file=/mnt/sdb/atr/nvmessd-4G.img,id=nvme-device,format=raw,if=none \
  -device nvme,drive=nvme-device,serial=nvme-dev,physical_block_size=512,logical_block_size=512 \
  -drive file=/mnt/sdb/atr/nvmessd2-4G.img,id=nvme-device2,format=raw,if=none \
  -device nvme,drive=nvme-device2,serial=nvme-dev2,physical_block_size=512,logical_block_size=512 \
  -netdev bridge,id=hn0,br=br1,helper=/home/atr/src/qemu-6.1.0/build/qemu-bridge-helper \
  -device virtio-net-pci,netdev=hn0,id=nic1,mac=e6:c8:ff:09:76:99 \
  --daemonize

Initiator boot:

/home/atr/src/qemu-6.1.0/build/qemu-system-x86_64 -name qemuzns2 -m 32G --enable-kvm -cpu host -smp 4 \
  -hda /mnt/sdb/atr/vm-initiator.qcow \
  -net user,hostfwd=tcp::8888-:22 -net nic \
  -netdev bridge,id=hn0,br=br1,helper=/home/atr/src/qemu-6.1.0/build/qemu-bridge-helper \
  -device virtio-net-pci,netdev=hn0,id=nic1,mac=e6:c8:ff:09:76:9c \
  -daemonize

Some issues:

  • Always specify the MAC address explicitly. QEMU gives every VM the same default MAC address, so with two VMs on one bridge the host bridge cannot tell where to forward packets.
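
One way to avoid picking MAC addresses by hand is to derive them from the VM name. This is a sketch: `vm_mac` is a hypothetical helper, and it assumes `md5sum` from coreutils is available. It keeps the 52:54:00 prefix that QEMU's own default MAC uses and fills the last three bytes from a hash of the name.

```shell
# vm_mac NAME: print a stable MAC address derived from the VM name.
# The 52:54:00 prefix is the one QEMU's default MAC uses; the last three
# bytes come from an MD5 of the name (a cheap hash here, not for security).
# vm_mac is an illustrative helper, not part of QEMU.
vm_mac() {
    printf '%s' "$1" | md5sum | cut -c1-6 |
        sed 's/\(..\)\(..\)\(..\)/52:54:00:\1:\2:\3/'
}
```

The result can be substituted into the device option, for example `-device virtio-net-pci,netdev=hn0,id=nic1,mac=$(vm_mac vm-target)`. Two different names can still collide on the three hashed bytes, although that is unlikely for a handful of VMs.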

Setup hostname

https://www.cyberciti.biz/faq/ubuntu-20-04-lts-change-hostname-permanently/

sudo hostnamectl set-hostname newNameHere
# update every reference to the old hostname in this file
sudo vim /etc/hosts
sudo reboot

Assigning static IPs

NIC=ens6
sudo ip addr add 190.160.10.8/24 dev $NIC
sudo ip link set $NIC up

Alternative way: https://bytefreaks.net/gnulinux/how-to-set-a-static-ip-address-from-the-command-line-in-gnulinux-using-ifconfig-and-route

sudo ifconfig eth0 192.168.1.2 netmask 255.255.255.0;
sudo route add default gw 192.168.1.1 eth0;

I have not looked into netplan yet: https://linuxconfig.org/how-to-configure-static-ip-address-on-ubuntu-18-10-cosmic-cuttlefish-linux

Other issues:

Setting up NVMe over Fabrics:

Following: https://futurewei-cloud.github.io/ARM-Datacenter/qemu/nvme-of-tcp-vms/

Target:

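# Load the NVMe target core and its TCP transport
sudo modprobe nvmet
sudo modprobe nvmet-tcp
# Create a subsystem; the directory name is the NQN the initiator connects to
cd /sys/kernel/config/nvmet/subsystems
sudo mkdir nvme-test-target
cd nvme-test-target/
# Accept connections from any host NQN
echo 1 | sudo tee -a attr_allow_any_host > /dev/null
# Back namespace 1 with the emulated NVMe device and enable it
sudo mkdir namespaces/1
cd namespaces/1
echo -n /dev/nvme0n1 | sudo tee -a device_path > /dev/null
echo 1 | sudo tee -a enable > /dev/null
# Create port 1, listening on the target's IP over TCP port 4420
sudo mkdir /sys/kernel/config/nvmet/ports/1
cd /sys/kernel/config/nvmet/ports/1
echo 192.168.0.16 | sudo tee -a addr_traddr > /dev/null
echo tcp | sudo tee -a addr_trtype > /dev/null
echo 4420 | sudo tee -a addr_trsvcid > /dev/null
echo ipv4 | sudo tee -a addr_adrfam > /dev/null
# Link the subsystem to the port to start exporting it
sudo ln -s /sys/kernel/config/nvmet/subsystems/nvme-test-target/ /sys/kernel/config/nvmet/ports/1/subsystems/nvme-test-target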
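
The target steps above can also be collected into one function that takes the subsystem name, backing device and listen address as arguments. This is a sketch under assumptions: `nvmet_export` and the `NVMET_ROOT` override are hypothetical names introduced here. `NVMET_ROOT` defaults to the real configfs path and exists only so the sequence can be tried against a scratch directory. On a real target it must run as root, with nvmet and nvmet-tcp already loaded.

```shell
# nvmet_export SUBSYS DEVICE ADDR [PORT_ID]
# Exports DEVICE as namespace 1 of SUBSYS over NVMe/TCP on ADDR:4420.
# Run as root with nvmet and nvmet-tcp loaded. NVMET_ROOT is a hypothetical
# override (default: the real configfs path) so the steps can be tried
# against a scratch directory.
nvmet_export() {
    root="${NVMET_ROOT:-/sys/kernel/config/nvmet}"
    subsys="$root/subsystems/$1"
    port="$root/ports/${4:-1}"

    # Subsystem with one namespace backed by DEVICE, open to any host.
    mkdir -p "$subsys/namespaces/1" || return 1
    echo 1 > "$subsys/attr_allow_any_host" || return 1
    printf '%s' "$2" > "$subsys/namespaces/1/device_path" || return 1
    echo 1 > "$subsys/namespaces/1/enable" || return 1

    # Port listening on ADDR, TCP, service id 4420, IPv4.
    mkdir -p "$port/subsystems" || return 1
    echo "$3" > "$port/addr_traddr" || return 1
    echo tcp > "$port/addr_trtype" || return 1
    echo 4420 > "$port/addr_trsvcid" || return 1
    echo ipv4 > "$port/addr_adrfam" || return 1

    # Linking the subsystem into the port starts exporting it.
    ln -s "$subsys" "$port/subsystems/$1"
}
```

With the values used above this would be called as `nvmet_export nvme-test-target /dev/nvme0n1 192.168.0.16` from a root shell.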

Initiator:

sudo modprobe nvme
sudo modprobe nvme-tcp
sudo nvme discover -t tcp -a 192.168.0.16 -s 4420 --hostnqn=nqn.2014-08.org.nvmexpress:uuid:1b4e28ba-2fa1-11d2-883f-0016d3ccabcd
sudo nvme connect -t tcp -n nvme-test-target -a 192.168.0.16 -s 4420 --hostnqn=nqn.2014-08.org.nvmexpress:uuid:1b4e28ba-2fa1-11d2-883f-0016d3ccabcd
sudo nvme list
# disconnect by subsystem NQN when done
sudo nvme disconnect -n nvme-test-target

See the files under https://github.com/animeshtrivedi/utilities/tree/master/qemu/ for these scripts.

NVMe-InfiniBand setup: https://www.linuxjournal.com/content/data-flash-part-ii-using-nvme-drives-and-creating-nvme-over-fabrics-network and TCP: https://www.linuxjournal.com/content/data-flash-part-iii-nvme-over-fabrics-using-tcp (another TCP example: https://blogs.oracle.com/linux/post/nvme-over-tcp)

Unresolved issue on 5.19 kernel with initiator:

atr@node6:~$ sudo nvme connect -t tcp -a node3 -s 4420 --hostnqn=nqn.2014-08.org.nvmexpress:uuid:1b4e28ba-2fa1-11d2-883f-0016d3ccabcd -n nvme-atr-target-1gbps 
Failed to write to /dev/nvme-fabrics: Invalid argument
no controller found: failed to write to nvme-fabrics device
atr@node6:~$ sudo tail -f /var/log/kern.log
Nov  8 13:19:27 node6 kernel: [2757932.499818] nvme2: Admin Cmd(0x6), I/O Error (sct 0x0 / sc 0x2) DNR 
Nov  8 13:19:27 node6 kernel: [2757932.500916] nvme3: Admin Cmd(0x6), I/O Error (sct 0x0 / sc 0x2) DNR 
Nov  8 13:19:27 node6 kernel: [2757932.518615] nvme nvme6: Invalid MNAN value 1024

Filebench issues

Filebench quick reading: https://www.usenix.org/system/files/login/articles/login_spring16_02_tarasov.pdf

Compiling filebench fails with:

gcc: error: proto3-lexer.c: No such file or directory
gcc: fatal error: no input files

# Then install the parser and lexer generators
sudo apt install bison
sudo apt install flex

# reconfigure project
autoreconf -i
./configure
make

Filebench issue

11.060: Unexpected Process termination Code 3, Errno 0 around line 10
12.060: Run took 1 seconds...
12.060: NO VALID RESULTS! Filebench run terminated prematurely around line 10

If a run fails like this, disable address space layout randomization (ASLR); see https://github.com/filebench/filebench/issues/112

root# echo 0 > /proc/sys/kernel/randomize_va_space
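
Writing 0 leaves ASLR disabled system-wide until it is changed back. A possible wrapper that restores the previous value after the run is sketched below. `with_aslr_off` and the `ASLR_KNOB` override are hypothetical names introduced here; `ASLR_KNOB` exists only so the function can be tried against an ordinary file. Writing the real file needs root.

```shell
# with_aslr_off CMD...: run CMD with ASLR disabled, then restore the old value.
# Needs root to write the real file. ASLR_KNOB is a hypothetical override
# used only to try the function against an ordinary file.
with_aslr_off() {
    knob="${ASLR_KNOB:-/proc/sys/kernel/randomize_va_space}"
    old=$(cat "$knob") || return 1
    echo 0 > "$knob" || return 1
    "$@"
    rc=$?
    # Put back whatever value was there before, then report CMD's status.
    echo "$old" > "$knob"
    return "$rc"
}
```

From a root shell this could be used as `with_aslr_off filebench -f workload.f`, where workload.f is a placeholder for the workload file.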

I could not run filebench reliably, so I wrote a small program for myself to understand.

Misc