SPDK bug report - pykello/pykello.github.com GitHub Wiki
VM hangs when flushing data to a vhost crypto vbdev
Summary
- Create a crypto vbdev on top of an aio bdev
- Write Ubuntu jammy bootable image to crypto vbdev using spdk_dd
- Start vhost target & bind the vbdev to vhost
- Boot a VM using the vhost device
- Inside the VM, running the following commands hangs when flushing data:
dd if=/dev/random of=1.txt bs=512 count=1000000
sync 1.txt
Notes:
- I used the v23.05 SPDK tag, with configure options: --with-crypto --with-vhost.
- Note that most reads & writes to the crypto vbdev succeed; otherwise the VM wouldn't have booted. The problem is that occasional unexpected failures make the device unusable.
- If I attach strace to the vhost target, I see io_submit call failures like the one below. SPDK seems to repeat the request without success:
io_submit(0x7f93de869000, 1,
[{aio_data=0x200013d6d220,
aio_lio_opcode=IOCB_CMD_PWRITEV,
aio_fildes=221,
aio_buf=[{iov_base="\325\221\343\275\243\200}\275x\7\r{&l\271\264\275\2212\f7?i\353*/z(\v\212e\34"...,
iov_len=65536}],
aio_offset=2029584384}]) = -1 EAGAIN (Resource temporarily unavailable)
- If I use the aio bdev directly (without the crypto vbdev), it works fine.
- If, instead of vhost, I connect to the crypto vbdev using nbd & run the dd and sync commands, it also works fine.
- I tried this on 2 different machines (a local laptop and a Hetzner server). It always reproduced.
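To put a number on the io_submit failures mentioned in the notes, the strace output can be saved to a file and the EAGAIN results counted. This is a minimal sketch: the helper name count_io_submit_eagain and the log file name vhost.strace are made up for illustration, and the capture command in the comment is an assumption about how the trace was taken.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count io_submit calls that returned EAGAIN in a
# saved strace log. Assumes the log was captured with something like:
#   sudo strace -f -e trace=io_submit -o vhost.strace -p <vhost pid>
count_io_submit_eagain() {
    local log="$1"
    # The pattern matches both complete lines ("io_submit(...) = -1 EAGAIN")
    # and resumed lines ("<... io_submit resumed>) = -1 EAGAIN") that
    # strace -f emits. grep -c prints the number of matching lines;
    # "|| true" hides its non-zero exit status when the count is 0.
    grep -c 'io_submit.*= -1 EAGAIN' "$log" || true
}
```

For example, count_io_submit_eagain vhost.strace prints the number of failed submissions; running it twice a few seconds apart shows whether the count is still growing while sync is stuck.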
Steps to reproduce using qemu
Requirements
sudo apt install mtools qemu-system-x86
0. SPDK config
Write the bdev config that we'll use in spdk_dd and the vhost target:
sudo HUGEMEM=5120 $SPDK/scripts/setup.sh
echo '{"subsystems":[{"subsystem":"accel","config":[{"method":"accel_crypto_key_create","params":{"name":"super_key","cipher":"AES_XTS","key":"70504b972ef8281e2bda917f70b8884b9e04c76528690d40d18d71c55292b15c","key2":"6acbfc7fe2683c20d6b2abddfcc53272b10b1fa6e2722327f7d68863c4ad38df"}}]},{"subsystem":"bdev","config":[{"method":"bdev_aio_create","params":{"name":"aio0","block_size":512,"filename":"spdk-encrypted-image","readonly":false}},{"method":"bdev_crypto_create","params":{"base_bdev_name":"aio0","name":"crypt0","key_name":"super_key"}}]}]}' \
> spdk_conf.json
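Since the config is pasted as one long quoted line, a quick well-formedness check can catch a truncated paste before SPDK tries to load it. The sketch below assumes python3 is installed; check_spdk_conf is a hypothetical helper name, not an SPDK tool. Note also that the aio bdev's filename is a relative path, so spdk_dd and vhost need to be started from the directory that contains spdk-encrypted-image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file exists and parses as JSON.
# It does not validate the SPDK-specific contents (methods, params).
check_spdk_conf() {
    local conf="${1:-spdk_conf.json}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    # python3 -m json.tool exits non-zero on malformed JSON.
    python3 -m json.tool "$conf" > /dev/null
}
```

Usage: check_spdk_conf spdk_conf.json && echo "config parses".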
1. Download & encrypt bootable image
Create cloudinit image for setting user/pass in VM
wget https://raw.githubusercontent.com/cloud-hypervisor/cloud-hypervisor/main/test_data/cloud-init/ubuntu/local/user-data
wget https://raw.githubusercontent.com/cloud-hypervisor/cloud-hypervisor/main/test_data/cloud-init/ubuntu/local/network-config
wget https://raw.githubusercontent.com/cloud-hypervisor/cloud-hypervisor/main/test_data/cloud-init/ubuntu/local/meta-data
rm -f /tmp/ubuntu-cloudinit.img
mkdosfs -n CIDATA -C /tmp/ubuntu-cloudinit.img 8192
mcopy -oi /tmp/ubuntu-cloudinit.img -s ./user-data ::
mcopy -oi /tmp/ubuntu-cloudinit.img -s ./network-config ::
mcopy -oi /tmp/ubuntu-cloudinit.img -s ./meta-data ::
Download & convert bootable image
wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
# convert image to raw format
qemu-img convert -p -f qcow2 -O raw jammy-server-cloudimg-amd64.img jammy.raw
truncate -s 5G jammy.raw
# Encrypt image using spdk_dd
rm -rf spdk-encrypted-image && touch spdk-encrypted-image
truncate -s 5G spdk-encrypted-image
sudo $SPDK/build/bin/spdk_dd --config spdk_conf.json --if jammy.raw --ob crypt0
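As a sanity check that spdk_dd really wrote through the crypto vbdev, the start of the backing file can be compared with the start of the raw image: if the two are identical, the data landed unencrypted. A minimal sketch, assuming GNU cmp (for the -n byte limit); prefix_differs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the first N bytes of two files differ.
# Uses GNU cmp's -n option to limit the comparison (default 1 MiB).
prefix_differs() {
    local a="$1" b="$2" nbytes="${3:-1048576}"
    # Fail with status 2 on unreadable input, so that a missing file is
    # not mistaken for "differs".
    [ -r "$a" ] && [ -r "$b" ] || return 2
    # cmp -s is silent and exits 0 when the compared bytes are identical,
    # so invert its status to report "differs" as success.
    ! cmp -s -n "$nbytes" "$a" "$b"
}
```

Usage: prefix_differs jammy.raw spdk-encrypted-image && echo "backing file differs from plaintext".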
2. Start vhost
sudo $SPDK/build/bin/vhost --config spdk_conf.json -S /var/tmp
In a different window, bind vhost socket
sudo $SPDK/scripts/rpc.py vhost_create_blk_controller vhost.1 crypt0
3. Run qemu
sudo qemu-system-x86_64 \
--enable-kvm -cpu host -smp 2 -m 4G \
-object memory-backend-file,id=mem0,size=4G,mem-path=/dev/hugepages,share=on -numa node,memdev=mem0 \
-chardev socket,id=spdk_vhost_blk0,path=/var/tmp/vhost.1,reconnect=1 \
-device vhost-user-blk-pci,chardev=spdk_vhost_blk0,num-queues=4,bootindex=0 \
-drive file=/tmp/ubuntu-cloudinit.img,format=raw \
-nographic
Login with:
- User: cloud
- Password: cloud123
Inside the VM run:
dd if=/dev/random of=1.txt bs=512 count=1000000
sync 1.txt
The sync call hangs.
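To confirm that the hang is an I/O wait rather than a busy loop, one option is to start sync in the background (sync 1.txt &) and then list the processes in uninterruptible sleep. A minimal sketch, assuming the procps ps shipped with Ubuntu; list_uninterruptible is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ps header plus every process whose state
# starts with "D" (uninterruptible sleep, usually waiting on I/O).
list_uninterruptible() {
    # wchan:32 widens the column showing the kernel function the
    # process is sleeping in.
    ps -eo pid,stat,wchan:32,args | awk 'NR == 1 || $2 ~ /^D/'
}

list_uninterruptible
```

If sync shows up in this list and stays there, it is blocked inside the kernel waiting for the vhost device to complete a request.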