Basic: CXL Test with CXL emulation in QEMU - moking/moking.github.io GitHub Wiki

CXL (Compute Express Link) is an open standard for high-speed, high-capacity central processing unit (CPU)-to-device and CPU-to-memory connections, designed for high-performance data center computers. CXL is built on the serial PCI Express (PCIe) physical and electrical interface and includes a PCIe-based block input/output protocol (CXL.io) and new cache-coherent protocols for accessing system memory (CXL.cache) and device memory (CXL.mem). -- From Wikipedia

CXL hardware design follows the open specification from the CXL Consortium (link). The specification is under active development and has evolved through versions 1.1, 2.0, and 3.0 to the current 3.1, released in August 2023. Since CXL hardware is not yet widely available on the market, software developers rely on emulation for debugging and testing CXL code, including the CXL Linux kernel drivers. As far as I know, QEMU is currently the only emulator that supports CXL hardware emulation.

QEMU is a generic, open-source machine emulator and virtualizer. It can simulate systems with different hardware configurations, including CPUs with different ISAs, memory configurations, and peripheral devices. Note that QEMU is designed to simulate system functionality, not timing, so it is not suitable for performance-related simulation and evaluation.

CXL emulation in QEMU

The mainstream QEMU source code can be found (here). CXL related code is located in the following locations in the QEMU source tree:

  • hw/cxl/
  • include/hw/cxl/
  • hw/mem/cxl_type3.c
  • qapi/cxl.json

QEMU can currently emulate the following CXL 2.0 compliant CXL system components (QEMU CXL doc):

  • CXL Host Bridge (CXL HB): equivalent to PCIe host bridge.
  • CXL Root Ports (CXL RP): serves the same purpose as a PCIe Root Port. There are a number of CXL-specific Designated Vendor Specific Extended Capabilities (DVSEC) in PCIe Configuration Space, with associated component registers accessed via PCI BARs.
  • CXL Switch: has a similar architecture to those in PCIe, with a single upstream port, internal PCI bus and multiple downstream ports.
  • CXL Type 3 memory devices for memory expansion: the device can act as system RAM or as a DAX device. Volatile and non-volatile memory emulation has been merged into mainstream QEMU. CXL 3.0 introduces a new kind of CXL memory device, the dynamic capacity device (DCD). Support for DCD emulation in QEMU has been posted to the mailing list and is expected to be merged soon.
  • CXL Fixed Memory Windows (CFMW): a CFMW is a range of Host Physical Address space routed to particular CXL Host Bridges. At generic software initialization time it has a particular interleaving configuration and an associated Quality of Service Throttling Group (QTG). This information is available to system software when deciding how to configure interleave across the available CXL memory devices. It is provided as CFMW Structures (CFMWS) in the CXL Early Discovery Table (CEDT), an ACPI table.
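As a sketch of how these components map onto QEMU's command line: the fragment below wires up one host bridge, one root port, one volatile Type 3 device, and one fixed memory window (device and option names are the ones used in the boot examples later on this page; sizes and IDs are illustrative):

```shell
# Minimal CXL topology: host bridge (pxb-cxl) -> root port (cxl-rp)
# -> Type 3 volatile memory device, plus one fixed memory window (cxl-fmw).
qemu-system-x86_64 -machine q35,cxl=on -m 4G -nographic \
  -object memory-backend-ram,id=vmem0,size=256M \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device cxl-rp,port=0,bus=cxl.1,id=rp0,chassis=0,slot=2 \
  -device cxl-type3,bus=rp0,volatile-memdev=vmem0,id=cxl-vmem0 \
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
```

This is only a topology sketch; a bootable invocation also needs a kernel, a root file system, and so on, as shown in the full examples below.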

Setup CXL Testing/Development environment with QEMU Emulation

To test a CXL device with QEMU emulation, we need to have the following prerequisites:

  • A QEMU emulator either compiled from source code or preinstalled with CXL emulation support;
  • A Kernel image with CXL support (compiled in or as modules);
  • A file system that serves as the root fs for booting the guest VM.

Install prerequisite packages

Building QEMU and the Linux kernel requires some preinstalled packages. Here we use the Debian distribution ("bookworm") as an example.

sudo apt-get install libglib2.0-dev libgcrypt20-dev zlib1g-dev \
    autoconf automake libtool bison flex libpixman-1-dev bc qemu-kvm \
    make ninja-build libncurses-dev libelf-dev libssl-dev debootstrap \
    libcap-ng-dev libattr1-dev libslirp-dev libslirp0

Build QEMU from source code

It is recommended to build QEMU from source code for two reasons:

  1. The pre-compiled binary may be old and lack the latest features supported by QEMU;
  2. Building QEMU from source code allows us to customize QEMU to our needs: debugging during development, applying patches to test unmerged features, or modifying QEMU to try out ideas or fixes.

Steps to build QEMU emulator:

We can download QEMU source code from different sources. Below we will use the DCD emulation setup as an example.

Step 1: download QEMU source code

git clone -b dcd-v6 https://github.com/moking/qemu.git

Step 2: configure QEMU

For example, configure QEMU with x86_64 CPU architecture and debug support:

cd $QEMU
./configure --target-list=x86_64-softmmu --enable-debug

Step 3: Compile QEMU

make -j 16

If the compilation succeeds, a new QEMU binary will be generated under the build/ directory:

fan@DT ~/c/qemu (dcd-v6)> ls build/qemu-system-x86_64 -lh
-rwxr-xr-x 1 fan fan 59M Mar 25 12:12 build/qemu-system-x86_64
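Before going further, it is worth a quick sanity check that the freshly built binary actually has the CXL devices compiled in (run from the QEMU source directory):

```shell
# List the CXL-related devices known to this QEMU binary; a CXL-capable
# build should show cxl-type3, cxl-rp, pxb-cxl, cxl-upstream, etc.
./build/qemu-system-x86_64 -device help | grep -i cxl
```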

Build Kernel with CXL support enabled

Note: here we build the CXL drivers as modules and load/unload them on demand.

Step 1: download Linux Kernel source code

Linux kernel source code can be downloaded from different sources. Below we will use the DCD kernel branch as an example:

git clone -b dcd-2024-03-24 https://github.com/weiny2/linux-kernel.git

Step 2: configure kernel

After downloading the source code, we need to configure the kernel features to be built in the following compile step.

make menuconfig

or,

make nconfig

After the kernel is configured, a .config file will be generated under the root directory of the Linux kernel source tree.

To enable CXL support, we need to enable the following options in the .config file:

fan@DT ~/c/r/k/linux-dcd (dcd-2024-03-24)> cat .config | egrep  "CXL|DAX|_ND_"
CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=y
CONFIG_CXL_BUS=m
CONFIG_CXL_PCI=m
CONFIG_CXL_MEM_RAW_COMMANDS=y
CONFIG_CXL_ACPI=m
CONFIG_CXL_PMEM=m
CONFIG_CXL_MEM=m
CONFIG_CXL_PORT=m
CONFIG_CXL_SUSPEND=y
CONFIG_CXL_REGION=y
CONFIG_CXL_REGION_INVALIDATION_TEST=y
CONFIG_CXL_PMU=m
CONFIG_ND_CLAIM=y
CONFIG_ND_BTT=m
CONFIG_ND_PFN=m
CONFIG_NVDIMM_DAX=y
CONFIG_DAX=m
CONFIG_DEV_DAX=m
CONFIG_DEV_DAX_PMEM=m
CONFIG_DEV_DAX_HMEM=m
CONFIG_DEV_DAX_CXL=m
CONFIG_DEV_DAX_HMEM_DEVICES=y
CONFIG_DEV_DAX_KMEM=m
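As a quick sketch, the presence of the most important options can be verified with a small shell helper (check_cxl_config is a hypothetical name; it simply greps the .config for =y/=m entries):

```shell
# check_cxl_config: verify that the core CXL options are enabled (=y or =m)
# in the kernel .config file passed as the first argument.
check_cxl_config() {
    config="$1"
    missing=0
    for opt in CONFIG_CXL_BUS CONFIG_CXL_PCI CONFIG_CXL_ACPI \
               CONFIG_CXL_MEM CONFIG_CXL_PORT CONFIG_DEV_DAX_CXL; do
        if ! grep -q "^${opt}=[ym]" "$config"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return $missing
}
```

Run it as `check_cxl_config .config` from the kernel source root; a non-zero exit status means at least one option is missing.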

Step 3: Compile Kernel

make -j 16

After a successful build, a new vmlinux file will be generated under the kernel root directory, along with a compressed kernel image:

fan@DT ~/c/r/k/linux-dcd (dcd-2024-03-24)> ls arch/x86/boot/bzImage -lh
-rw-r--r-- 1 fan fan 13M Mar 25 09:30 arch/x86/boot/bzImage

Step 4: Install kernel modules

sudo make modules_install

Creating Root File System for Guest VM

To create a disk image to serve as the root file system of the guest VM, we can leverage the qemu-img tool built alongside QEMU:

fan@DT ~/c/qemu (dcd-v6)> find . -name "qemu-img"
./build/qemu-bundle/usr/local/bin/qemu-img
./build/qemu-img
  1. Create a QEMU image with qemu-img (e.g., 16G).

qemu-img create $IMG 16G

  2. Create a file system on the image.

sudo mkfs.ext4 $IMG

  3. Mount the file system to a directory.

mkdir $DIR
sudo mount -o loop $IMG $DIR

  4. Install the Debian distribution into the file system.

sudo debootstrap --arch amd64 stable $DIR

  5. Set up host/guest directory sharing.

echo "#!/bin/bash
mount -t 9p -o trans=virtio homeshare /home/fan
mount -t 9p -o trans=virtio modshare /lib/modules
" > /tmp/rc.local
chmod a+x /tmp/rc.local
sudo cp /tmp/rc.local $DIR/etc/
sudo mkdir -p $DIR/home/fan
sudo mkdir -p $DIR/lib/modules/

  6. Set up the network for the guest VM.

Create a config.yaml file with the following content under $DIR/etc/netplan:

network:
    version: 2
    renderer: networkd
    ethernets:
        enp0s2:
            dhcp4: true

  7. Unmount the image.

sudo umount $DIR
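The image creation and Debian installation steps above can be collected into a single sketch script (run as root; the 9p and netplan configuration steps are omitted, and the $IMG/$DIR values are placeholders to adjust):

```shell
#!/bin/bash
# Build a Debian root-fs image for the guest VM (sketch; debootstrap
# downloads a full Debian system, so this takes a while).
set -e
IMG=/home/fan/cxl/images/qemu-image.img   # placeholder path
DIR=/mnt/rootfs                           # placeholder mount point

qemu-img create "$IMG" 16G                # raw disk image
mkfs.ext4 "$IMG"                          # create the file system
mkdir -p "$DIR"
mount -o loop "$IMG" "$DIR"               # loop-mount it
debootstrap --arch amd64 stable "$DIR"    # install Debian
umount "$DIR"                             # detach the image
```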

Bringing up the guest VM

Example 1: boot the VM with a 512 MiB CXL persistent memory device directly attached to the root port of a host bridge.

qemu-system-x86_64 -s -kernel /home/fan/cxl/repos/kdevops/linux-dcd/arch/x86/boot/bzImage \
-append "root=/dev/sda rw console=ttyS0,115200 ignore_loglevel nokaslr \
cxl_acpi.dyndbg=+fplm cxl_pci.dyndbg=+fplm cxl_core.dyndbg=+fplm cxl_mem.dyndbg=+fplm cxl_pmem.dyndbg=+fplm cxl_port.dyndbg=+fplm cxl_region.dyndbg=+fplm \
cxl_test.dyndbg=+fplm cxl_mock.dyndbg=+fplm cxl_mock_mem.dyndbg=+fplm dax.dyndbg=+fplm dax_cxl.dyndbg=+fplm device_dax.dyndbg=+fplm" \
-smp 1 -accel kvm -serial mon:stdio -nographic -qmp tcp:localhost:4444,server,wait=off -netdev user,id=network0,hostfwd=tcp::2024-:22 -device e1000,netdev=network0 \
-monitor telnet:127.0.0.1:12345,server,nowait -drive file=/home/fan/cxl/images/qemu-image.img,index=0,media=disk,format=raw \
-machine q35,cxl=on -m 8G,maxmem=32G,slots=8 -virtfs local,path=/lib/modules,mount_tag=modshare,security_model=mapped \
-virtfs local,path=/home/fan,mount_tag=homeshare,security_model=mapped \
-object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest.raw,size=512M \
-object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa.raw,size=512M \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
-device cxl-type3,bus=root_port13,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k

Example 2: boot the VM with a CXL DCD setup: the device is directly attached to the only root port of a host bridge and has two dynamic capacity regions, each 2 GiB in size.

qemu-system-x86_64 -s -kernel /home/fan/cxl/repos/kdevops/linux-dcd/arch/x86/boot/bzImage \
-append "root=/dev/sda rw console=ttyS0,115200 ignore_loglevel nokaslr \
cxl_acpi.dyndbg=+fplm cxl_pci.dyndbg=+fplm cxl_core.dyndbg=+fplm cxl_mem.dyndbg=+fplm cxl_pmem.dyndbg=+fplm cxl_port.dyndbg=+fplm cxl_region.dyndbg=+fplm \
cxl_test.dyndbg=+fplm cxl_mock.dyndbg=+fplm cxl_mock_mem.dyndbg=+fplm dax.dyndbg=+fplm dax_cxl.dyndbg=+fplm device_dax.dyndbg=+fplm" \
-smp 1 -accel kvm -serial mon:stdio -nographic -qmp tcp:localhost:4444,server,wait=off -netdev user,id=network0,hostfwd=tcp::2024-:22 -device e1000,netdev=network0 \
-monitor telnet:127.0.0.1:12345,server,nowait -drive file=/home/fan/cxl/images/qemu-image.img,index=0,media=disk,format=raw \
-machine q35,cxl=on -m 8G,maxmem=32G,slots=8 -virtfs local,path=/lib/modules,mount_tag=modshare,security_model=mapped \
-virtfs local,path=/home/fan,mount_tag=homeshare,security_model=mapped \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 -device cxl-rp,port=13,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
-object memory-backend-file,id=dhmem0,share=on,mem-path=/tmp/dhmem0.raw,size=4G \
-object memory-backend-file,id=lsa0,share=on,mem-path=/tmp/lsa0.raw,size=512M \
-device cxl-type3,bus=root_port13,volatile-dc-memdev=dhmem0,num-dc-regions=2,id=cxl-memdev0 \
-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8K
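Both examples expose a QMP socket on port 4444 and an HMP monitor on port 12345, so once the VM is up the emulated devices can be inspected from the host, for example:

```shell
# Interactive HMP monitor; type "info memory-devices" at the prompt
# to see the emulated memory devices.
telnet 127.0.0.1 12345

# Alternatively, use the qmp-shell helper shipped in the QEMU source tree.
./scripts/qmp/qmp-shell localhost:4444
```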

Access CXL memory device emulated with QEMU

After the guest VM has started, we can install the ndctl tool suite for managing the CXL device.

Note: the following steps happen inside the QEMU guest VM.

Install ndctl from source code

git clone https://github.com/pmem/ndctl.git
cd ndctl
meson setup build
meson compile -C build
meson install -C build

After a successful build, three tools — ndctl, cxl, and daxctl — will be generated under the build directory, e.g.:

root@DT:~/ndctl# ls build/daxctl/daxctl -lh
-rwxr-xr-x 1 root root 181K Nov 30 22:04 build/daxctl/daxctl
root@DT:~/ndctl# ls build/cxl/cxl -lh
-rwxr-xr-x 1 root root 318K Nov 30 22:04 build/cxl/cxl

Load cxl drivers and show CXL memory device

modprobe -a cxl_acpi cxl_core cxl_pci cxl_port cxl_mem cxl_pmem
root@DT:~# cxl list -u
{
  "memdev":"mem0",
  "pmem_size":"512.00 MiB (536.87 MB)",
  "serial":"0",
  "host":"0000:0d:00.0"
}

To use CXL memory as system RAM, a few extra steps are needed.

Create a cxl region:

cxl create-region -m -d decoder0.0 -w 1 mem0 -s 512M

{
  "region":"region0",
  "resource":"0xa90000000",
  "size":"512.00 MiB (536.87 MB)",
  "interleave_ways":1,
  "interleave_granularity":256,
  "decode_state":"commit",
  "mappings":[
    {
      "position":0,
      "memdev":"mem0",
      "decoder":"decoder2.0"
    }
  ]
}
cxl region: cmd_create_region: created 1 region

Create a namespace for the region:

ndctl create-namespace -m dax -r region0

{
  "dev":"namespace0.0",
  "mode":"devdax",
  "map":"dev",
  "size":257949696,
  "uuid":"8fb092ba-a4ef-4a9a-8d83-d022c518ddf7",
  "daxregion":{
    "id":0,
    "size":257949696,
    "align":2097152,
    "devices":[
      {
        "chardev":"dax0.0",
        "size":257949696,
        "target_node":1,
        "align":2097152,
        "mode":"devdax"
      }
    ]
  },
  "align":2097152
}

Convert the regular devdax-mode device to system-ram mode with daxctl:

daxctl reconfigure-device --mode=system-ram --no-online dax0.0

reconfigured 1 device
[
  {
    "chardev":"dax0.0",
    "size":257949696,
    "target_node":1,
    "align":2097152,
    "mode":"system-ram",
    "online_memblocks":0,
    "total_memblocks":1
  }
]

Show system memory:

lsmem

RANGE                                  SIZE   STATE REMOVABLE BLOCK
0x0000000000000000-0x000000007fffffff    2G  online       yes  0-15
0x0000000100000000-0x000000027fffffff    6G  online       yes 32-79
0x0000000a98000000-0x0000000a9fffffff  128M offline             339

Memory block size:       128M
Total online memory:       8G
Total offline memory:    128M

After that, a new 128M memory block shows up; it is still offline because --no-online was passed to daxctl.
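The offline block must be brought online before the guest can allocate from it; daxctl can do this directly:

```shell
# Online the memory blocks backing dax0.0 so they become usable system RAM;
# re-run lsmem afterwards to confirm the block's state changed to "online".
daxctl online-memory dax0.0
```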

References:

  1. Wikipedia: https://en.wikipedia.org/wiki/Compute_Express_Link
  2. Setting up QEMU emulation of CXL
  3. QEMU CXL Page: Compute Express Link (CXL)
  4. CXL mailing list: https://lore.kernel.org/linux-cxl/