NVMe operations - animeshtrivedi/notes GitHub Wiki

list all devices

$ sudo nvme list 
Node                  SN       Model          Namespace Usage                      Format           FW Rev  
--------------------- -------- -------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          YY         XX              1          49.60  GB / 960.20  GB    512   B +  0 B   0106    
/dev/nvme1n1          YY         XX              1         960.04  GB / 960.20  GB    512   B +  0 B   0106    
/dev/nvme2n1          YY         XX              1           3.84  TB /   3.84  TB      4 KiB +  0 B   0106
/dev/nvme3n1          YY         XX              1           3.84  TB /   3.84  TB      4 KiB +  0 B   0106
/dev/nvme4n1          YY         XX              1          54.07  GB /   3.84  TB    512   B +  0 B   0106

delete namespace

sudo nvme delete-ns /dev/nvme4n1 -n 1
delete-ns: Success, deleted nsid:1

check the device capacity

In the Identify – Identify Controller Data Structure:

  • Total NVM Capacity (TNVMCAP): This field indicates the total NVM capacity in the NVM subsystem. The value is in bytes. This field shall be supported if Namespace Management and Namespace Attachment commands are supported.

  • Unallocated NVM Capacity (UNVMCAP): This field indicates the unallocated NVM capacity in the NVM subsystem. The value is in bytes. This field shall be supported if Namespace Management and Namespace Attachment commands are supported.

$ sudo nvme id-ctrl /dev/nvme4 |grep tnvmcap
tnvmcap   : 3840755982336

$ sudo nvme id-ctrl /dev/nvme4 |grep unvmcap
unvmcap   : 3840755982336

create a namespace

Example of creating a 100 GiB namespace. What are NSZE and NCAP? From the NVMe specification (NVMe Namespaces):

  • The Namespace Size (NSZE) field defines the total size of the namespace in logical blocks (LBA 0 through n-1).
  • The Namespace Capacity (NCAP) field defines the maximum number of logical blocks that may be allocated at any point in time.
$ sudo nvme create-ns /dev/nvme4 -s $((1024*1024*1024*100)) -c $((1024*1024*1024*100)) -b 512 
NVMe status: NS_INSUFFICIENT_CAPACITY: Creating the namespace requires more free space than is currently available. The Command Specific Information field of the Error Information Log specifies the total amount of NVM capacity required to create the namespace in bytes(0x2115)
$ sudo nvme create-ns /dev/nvme4 -s $((1024*1024*2*100)) -c $((1024*1024*2*100)) -b 512 
create-ns: Success, created nsid:1

The size (-s) and capacity (-c) arguments are counts of logical blocks (here 512 B each), not bytes; the first attempt failed because a byte count was passed as a block count.
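As a sanity check, the block count used in the successful command can be derived directly (a worked example, not device output):

```shell
# 100 GiB expressed as a count of 512-byte logical blocks
gib=$((1024 * 1024 * 1024))
blocks=$((100 * gib / 512))
echo $blocks   # 209715200, the same value as $((1024*1024*2*100)) above
```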

Consuming remaining space

There seems to be some granularity at which space allocation happens

$ sudo nvme id-ctrl /dev/nvme4 |grep unvmcap | awk '{print $3}'
26297413632

$ sudo nvme create-ns /dev/nvme4 -s $((25*1024*1024*2))  -c $((25*1024*1024*2)) -b 512
NVMe status: NS_INSUFFICIENT_CAPACITY: Creating the namespace requires more free space than is currently available. The Command Specific Information field of the Error Information Log specifies the total amount of NVM capacity required to create the namespace in bytes(0x2115)

$ sudo nvme create-ns /dev/nvme4 -s $((24*1024*1024*2))  -c $((24*1024*1024*2)) -b 512
NVMe status: INVALID_FORMAT: The LBA Format specified is not supported. This may be due to various conditions(0x410a)

$ sudo nvme create-ns /dev/nvme4 -s $((20*1024*1024*2))  -c $((20*1024*1024*2)) -b 512
NVMe status: INVALID_FORMAT: The LBA Format specified is not supported. This may be due to various conditions(0x410a)

$ sudo nvme create-ns /dev/nvme4 -s $((10*1024*1024*2))  -c $((10*1024*1024*2)) -b 512
NVMe status: INVALID_FORMAT: The LBA Format specified is not supported. This may be due to various conditions(0x410a)

$ sudo nvme create-ns /dev/nvme4 -s $((1*1024*1024*2))  -c $((1*1024*1024*2)) -b 512
create-ns: Success, created nsid:8

$ sudo nvme create-ns /dev/nvme4 -s $((8*1024*1024*2))  -c $((8*1024*1024*2)) -b 512
create-ns: Success, created nsid:9
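The first failure above is at least consistent with the arithmetic: the 25 GiB request exceeds the reported unvmcap (the later INVALID_FORMAT errors for smaller sizes are harder to explain and may reflect a vendor-specific allocation granularity). A quick check of the numbers:

```shell
unvmcap=26297413632                  # free bytes reported by id-ctrl above
req=$((25 * 1024 * 1024 * 2 * 512))  # the 25 GiB request, converted back to bytes
echo $req                            # 26843545600, i.e. more than unvmcap
```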

list namespaces

$ sudo nvme list-ns /dev/nvme4 -a
[   0]:0x1

attach a namespace

  1. Identify the controller ID, which may or may not be the same as the X in the /dev/nvmeX device name.
$ sudo nvme id-ctrl /dev/nvme4 | grep -e nn -e cntlid 
cntlid    : 0x6
nn        : 32

See figure 312 in the specification 2.1

Controller ID (CNTLID): Contains the NVM subsystem unique controller identifier associated with the controller.

Number of Namespaces (NN): This field indicates the maximum value of a valid NSID for the NVM subsystem.

If the wrong controller ID is passed to attach-ns, you may get:

NVMe status: CONTROLLER_LIST_INVALID: The controller list provided is invalid(0x211c)
  2. Attach the namespace to the controller,
$ sudo nvme attach-ns /dev/nvme4 -n 1 -c 6
attach-ns: Success, nsid:1

What LBA sizes are supported by the NVMe device?

$ sudo nvme id-ns -H /dev/nvme4n1 | grep -e "LBA Format"
  [3:0] : 0	Current LBA Format Selected
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)
LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best 

Reformat a namespace with a different block size

$ sudo nvme format --lbaf=1 /dev/nvme4n1
You are about to format nvme4n1, namespace 0x1.
Namespace nvme4n1 has parent controller(s):nvme4

WARNING: Format may irrevocably delete this device's data.
You have 10 seconds to press Ctrl-C to cancel this operation.

Use the force [--force|-f] option to suppress this warning.
Sending format operation ... 
Success formatting namespace:1

$ sudo nvme id-ns -H /dev/nvme4n1 | grep -e "LBA Format"
  [3:0] : 0x1	Current LBA Format Selected
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0 Best 
LBA Format  1 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best (in use)

Identify MDTS

sudo nvme id-ctrl /dev/nvme4 -H | grep -ie MDTS
mdts      : 9

This field indicates the maximum data transfer size for a command that transfers data between host-accessible memory (refer to section 1.5.44) and the controller. The host should not submit a command that exceeds this maximum data transfer size. If a command is submitted that exceeds this transfer size, then the command is aborted with a status code of Invalid Field in Command. The value is in units of the minimum memory page size (CAP.MPSMIN) and is reported as a power of two (2^n). A value of 0h indicates that there is no maximum data transfer size.
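With mdts of 9, and assuming a minimum memory page size (CAP.MPSMIN) of 4 KiB — an assumption, verify against your controller's CAP register — the maximum transfer size works out as follows:

```shell
mdts=9          # power-of-two exponent reported by id-ctrl above
mpsmin=4096     # assumed CAP.MPSMIN page size in bytes (check your controller)
echo $(( (1 << mdts) * mpsmin ))   # 2097152 bytes = 2 MiB
```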

Identify NVMe version

sudo nvme id-ctrl /dev/nvme4 -H | grep -wi ver
ver       : 0x10400
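The version field packs major (bits 31:16), minor (bits 15:8), and tertiary (bits 7:0) numbers, so 0x10400 decodes to NVMe 1.4.0:

```shell
ver=0x10400
printf "NVMe %d.%d.%d\n" $((ver >> 16)) $(( (ver >> 8) & 0xff )) $((ver & 0xff))
# prints: NVMe 1.4.0
```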

Show NVMe controller capability field

This appears broken here; more likely the controller simply does not support the Primary Controller Capabilities data structure:

$ sudo nvme primary-ctrl-caps /dev/nvme4 
NVMe status: INVALID_FIELD: A reserved coded value or an unsupported value in a defined field(0x2)

Do a small read and write example
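A minimal sketch with nvme-cli's write/read subcommands, assuming a 512 B LBA format; the option names should be verified against your nvme-cli version.

```shell
# WARNING: destructive -- this overwrites LBA 0 of /dev/nvme4n1.
# Only run against a scratch namespace.
# Build a full 512 B block of data (conv=sync pads with zeros).
printf 'hello nvme' | dd of=/tmp/blk.bin bs=512 count=1 conv=sync status=none
sudo nvme write /dev/nvme4n1 --start-block=0 --block-count=0 --data-size=512 --data=/tmp/blk.bin
# block-count is 0-based: 0 means a single block
sudo nvme read /dev/nvme4n1 --start-block=0 --block-count=0 --data-size=512 --data=/tmp/out.bin
head -c 10 /tmp/out.bin
```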

Setting up NVM-fabrics example

From https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html-single/managing_storage_devices/index#configuring-nvme-over-fabrics-using-nvme-rdma_managing-storage-devices

Target side

sudo modprobe nvmet nvmet-rdma 
sudo mkdir /sys/kernel/config/nvmet/subsystems/nvme-subsystem-atr  
cd /sys/kernel/config/nvmet/subsystems/nvme-subsystem-atr
ls
allowed_hosts        attr_cntlid_max  attr_firmware  attr_model      attr_qid_max  attr_version  passthru
attr_allow_any_host  attr_cntlid_min  attr_ieee_oui  attr_pi_enable  attr_serial   namespaces
sudo bash -c "echo 1 > attr_allow_any_host"
sudo mkdir -p namespaces/10
cd namespaces/10
sudo bash -c "echo -n /dev/nvme4n1 > device_path"
sudo bash -c "echo 1 > enable"
sudo mkdir /sys/kernel/config/nvmet/ports/10
cd /sys/kernel/config/nvmet/ports/10
$ ls
addr_adrfam  addr_treq     addr_trtype  ana_groups              param_max_queue_size  referrals
addr_traddr  addr_trsvcid  addr_tsas    param_inline_data_size  param_pi_enable       subsystems
sudo bash -c "echo 10.100.0.20 > addr_traddr "
sudo bash -c "echo rdma > addr_trtype"
sudo bash -c "echo 4420 > addr_trsvcid"
sudo bash -c "echo ipv4 > addr_adrfam"
sudo ln -s /sys/kernel/config/nvmet/subsystems/nvme-subsystem-atr/  /sys/kernel/config/nvmet/ports/10/subsystems/nvme-subsystem-atr
sudo dmesg | grep "enabling port"
[3004788.447838] nvmet_rdma: enabling port 10 (10.100.0.20:4420)

Initiator side

$ sudo modprobe nvme-rdma

$ sudo nvme discover -t rdma -a 10.100.0.20 -s 4420

Discovery Log Number of Records 2, Generation counter 2
=====Discovery Log Entry 0======
trtype:  rdma
adrfam:  ipv4
subtype: unrecognized
treq:    not specified, sq flow control disable supported
portid:  10
trsvcid: 4420
subnqn:  nqn.2014-08.org.nvmexpress.discovery
traddr:  10.100.0.20
rdma_prtype: not specified
rdma_qptype: connected
rdma_cms:    rdma-cm
rdma_pkey: 0x0000
=====Discovery Log Entry 1======
trtype:  rdma
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  10
trsvcid: 4420
subnqn:  nvme-subsystem-atr
traddr:  10.100.0.20
rdma_prtype: not specified
rdma_qptype: connected
rdma_cms:    rdma-cm
rdma_pkey: 0x0000

$ sudo nvme connect -t rdma -n nvme-subsystem-atr -a 10.100.0.20 -s 4420
$ sudo nvme list 
Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev  
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme6n1          a326240c4e5eb9f44c88 Linux                                    10        137.44  GB / 137.44  GB    512   B +  0 B   6.9.0-at
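To tear the initiator side down again, disconnect by subsystem NQN (nvme disconnect also accepts a device with -d):

```shell
# detach all controllers for this subsystem NQN on the initiator
sudo nvme disconnect -n nvme-subsystem-atr
```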