Command examples

GPU utilization

https://askubuntu.com/questions/387594/how-to-measure-gpu-usage

pip install gpustat
gpustat -h                  # help
gpustat --id 0,1 -i 2 -cp   # watch GPUs 0 and 1, refresh every 2 s, show command (-c) and PID (-p)


Conda

conda create --name ssd
conda activate ssd 

fio commands

rwmixread=int Percentage of a mixed workload that should be reads. Default: 50.

rwmixwrite=int Percentage of a mixed workload that should be writes. If both rwmixread and rwmixwrite is given and the values do not add up to 100%, the latter of the two will be used to override the first. This may interfere with a given rate setting, if fio is asked to limit reads or writes to a certain rate. If that is the case, then the distribution may be skewed. Default: 50.
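
As an example, a minimal sketch of a 70/30 mixed random workload (the device /dev/nvme4n1 is an assumption; --rwmixread=70 implies a 30% write share):

fio --name="mixed" --bs=4K --iodepth=16 --numjobs=1 --ioengine=io_uring --direct=1 --time_based=1 --ramp_time=5s --runtime=30s --rw=randrw --rwmixread=70 --size=100% --norandommap=1 --group_reporting=1 --allow_file_create=0 --filename=/dev/nvme4n1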

All examples use the io_uring engine.

BW examples

  • Time-based, on a block device

fio --name="bandwidth" --bs=1M --iodepth=16 --numjobs=4 --cpus_allowed=0-31 --time_based=1 --ramp_time=5s --runtime=30s --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=write --allow_file_create=0 --filename=/dev/nvme4n1

  • Size-based (200%): 4 threads, writing the full device twice.

fio --name="bandwidth" --bs=1M --iodepth=16 --numjobs=4 --cpus_allowed=0-31 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=200% --norandommap=1 --group_reporting=1 --direct=1 --rw=write --allow_file_create=0 --filename=/dev/nvme4n1

  • Sequential writes (128KB blocks) to parallel files with all CPUs ($(nproc), time based). Each thread gets its own private file in the given --directory. See also fallocate=str.
sudo fio --name="bandwidth" --bs=128K --iodepth=16 --numjobs=$(nproc) --cpus_allowed=0-$(($(nproc)-1)) \
--ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=1g --norandommap=1 \
--group_reporting=1 --direct=1 --rw=write --allow_file_create=1 --directory=/home/atr/rocksdb-exp/ram/ \
--nrfiles=$(nproc) --time_based=1 --ramp_time=5s --runtime=30s --fallocate=posix 

Per-interval bandwidth logging (here averaged over 2-second windows; bwavgtime and log_avg_msec are in milliseconds):

fio --name="bandwidth" --bs=1M --iodepth=16 --numjobs=4 --cpus_allowed=0-31 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=10g --norandommap=1 --group_reporting=1 --direct=1 --rw=read --allow_file_create=0 --filename=/dev/nvme4n1 --bwavgtime=2000 --log_avg_msec=2000 --bandwidth-log

IOPS examples

single thread (randread):

sudo fio --name="iops" --bs=4K --iodepth=512 --numjobs=1 --cpus_allowed=0 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=randread --allow_file_create=0 --time_based=1 --ramp_time=5s --runtime=30s --filename=/dev/nvme4n1

Multiple devices can be given as a colon-separated list: --filename=/dev/nvme4n1:/dev/nvme4n2. For example (same job as above, assuming both namespaces exist):
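
sudo fio --name="iops" --bs=4K --iodepth=512 --numjobs=1 --cpus_allowed=0 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=randread --allow_file_create=0 --time_based=1 --ramp_time=5s --runtime=30s --filename=/dev/nvme4n1:/dev/nvme4n2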

Latency example

fio --name="latency" --bs=4K --iodepth=1 --numjobs=1 --cpus_allowed=0 --time_based=1 --ramp_time=5s --runtime=30s --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=randread --allow_file_create=0 --filename=/dev/nvme10n1 

RocksDB

db_bench --num=1000000 --compression_type=none --value_size=400 --key_size=20 --use_direct_io_for_flush_and_compaction --use_existing_db=true --use_direct_reads --max_bytes_for_level_multiplier=10 --max_background_jobs=48 --threads=48 --enable_pipelined_write=true --allow_concurrent_memtable_write=true --wal_size_limit_MB=0 --write_buffer_size=67108864 --max_write_buffer_number=48 --histogram --report_bg_io_stats=true --report_file=./readwhilewriting-per-second-file.csv --report_interval_seconds=1 --benchmarks=readwhilewriting --seed=42 --db=/home/atr/rocksdb-exp//ssd -wal_dir=/home/atr/rocksdb-exp//ram --read_cache_size=0 --blob_cache_size=0 --cache_size=0 --compressed_cache_size=0 --prepopulate_block_cache=0 --num_file_reads_for_auto_readahead=0 --statistics --file_opening_threads=48

Linux commands

Dynamic CPU affinity with taskset

atr: verified, works

  • taskset -p <pid> - get the CPU affinity mask of a running process (PID)
  • taskset -cp <comma-separated CPU list> <pid> - set the CPU affinity of a running process; a concrete sketch follows
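
A concrete sketch (PID 1234 is hypothetical):

taskset -p 1234          # print the current affinity mask of PID 1234
taskset -cp 0,2,4 1234   # pin PID 1234 to CPUs 0, 2, and 4
taskset -p 1234          # verify: the mask should now read 15 (hex, i.e. CPUs 0, 2, 4)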

ref

Linux block device information

There are two folders:

  • /sys/block/nvme0n1/queue/ -- global, queue-related parameters.
  • /sys/block/nvme0n1/mq/ -- multi-queue (blk-mq) information: which CPUs each queue is mapped to, tag information. A few probes are sketched below.
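
For example (the nvme0n1 paths are from above; queue numbers vary per device):

cat /sys/block/nvme0n1/queue/nr_requests   # software queue depth
cat /sys/block/nvme0n1/queue/scheduler     # active I/O scheduler
ls /sys/block/nvme0n1/mq/                  # one directory per hardware queue
cat /sys/block/nvme0n1/mq/0/cpu_list       # CPUs mapped to hardware queue 0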

journalctl

ref

  • journalctl -b - since last boot
  • journalctl --list-boots - show all boots, and then use their offsets with journalctl -b -1
  • time based
    • journalctl --since "1 hour ago"
    • journalctl --since "2 days ago"
    • journalctl --since "2015-06-26 23:15:00" --until "2015-06-26 23:20:00"
  • journalctl -u nginx.service -u mysql.service - specific service
  • journalctl -f - follow
  • journalctl -n 50 --since "1 hour ago" - the 50 most recent entries from the last hour

Make CPU online/offline

# as root: bring CPUs 1-11 online
for x in /sys/devices/system/cpu/cpu{1..11}/online; do echo 1 >"$x"; done
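
The offline direction is symmetric: write 0 instead of 1. Note that cpu0 usually has no online file and cannot be taken offline, and that sudo does not apply to the redirection, so run the loop in a root shell:

for x in /sys/devices/system/cpu/cpu{1..11}/online; do echo 0 >"$x"; done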

Deleting files or directories with special characters

rm -r \~

The backslash prevents tilde expansion, so this removes the directory literally named ~ instead of your home directory.

ref

Drop Linux buffer cache

# 3 = free page cache plus dentries and inodes (1 = page cache only, 2 = dentries and inodes only)
sudo sh -c "/usr/bin/echo 3 > /proc/sys/vm/drop_caches"

Current value of kernel module parameters

Check in /sys/module/[name]/parameters/

$ cat /sys/module/nvme/parameters/sgl_threshold
32768
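
To list all parameters of a module, with their descriptions (a sketch, assuming the nvme module):

ls /sys/module/nvme/parameters/
modinfo -p nvme   # parameter names and descriptions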

Find a file or directory by name

find . -type d -name "*nvme*" 
       -type c
              File is of type c:
              b      block (buffered) special
              c      character (unbuffered) special
              d      directory
              p      named pipe (FIFO)
              f      regular file
              l      symbolic link; this is never true if the -L option or the -follow option is in effect, unless the symbolic link is broken.  If you want to search for symbolic links when -L is in effect, use -xtype.
              s      socket
              D      door (Solaris)

Mount sshfs

sshfs atr@localhost:/home/atr/ ./vm-qemu7777/ -p 7777

Mount tmpfs

sudo mount -t tmpfs -o size=32G,noswap,uid=$USER,mpol=prefer:0,huge=never $USER ~/mnt/tmpfs/

https://man7.org/linux/man-pages/man5/tmpfs.5.html
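
To check that the mount took the intended options:

findmnt ~/mnt/tmpfs   # shows target, source, fstype, and active mount options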


Get the NUMA information

lstopo  

or for NVMe

cat /sys/block/nvme1n1/device/numa_node 
0

or generic

cat /sys/class/<class>/<device>/device/numa_node

PCIe topology

lspci -t -v 

Show Linux hardware details

https://www.baeldung.com/linux/list-network-cards

sudo lshw 
sudo lshw -C network
lshw -class disk -class storage

ebpf/bcc one-liners (references)

I/O size histogram

~/src/bcc/tools/bitehist.py

On f20 it fails because I disabled PYTHONPATH for conda:

    from bcc import BPF
ModuleNotFoundError: No module named 'bcc'

Solution: https://github.com/iovisor/bcc/blob/master/FAQ.txt

  • this picks up the old path from the installed distro packages, which I do not like, but here we go:
export PYTHONPATH=$(dirname `find /usr/lib -name bcc`):$PYTHONPATH

Then it works.

CPU stack profiler

https://github.com/iovisor/bcc/blob/master/tools/profile_example.txt

Show counts for all stacks active on the CPU (filters available)

$ sudo profile 

Count stacks matching a particular pattern or filter

https://github.com/iovisor/bcc/blob/master/tools/stackcount_example.txt
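
A minimal sketch, using the same bcc checkout as above (submit_bio and the tcp_send* wildcard are just example targets):

sudo ~/src/bcc/tools/stackcount.py submit_bio    # count kernel stacks that reach submit_bio
sudo ~/src/bcc/tools/stackcount.py 'tcp_send*'   # wildcard patterns also work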

Monitoring

dstat

-c CPU, -d disk, -i interrupts, -m memory, -n network, -p process, -r I/O stats, -s swap, -y system stats. Further: --aio, --fs, --ipc, --lock.

dstat -pcmrd


vmstat

$ vmstat 1 1000 
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st gu
 1  0 7370204 1611780   4576 13788080   62  128   215   443 2135   24 11  5 84  0  0  0
 0  0 7370204 1647792   4576 13788080   32    0    32   452 7935 14916  5  4 91  0  0  0

TODO