Command examples
GPU utilization
https://askubuntu.com/questions/387594/how-to-measure-gpu-usage
pip install gpustat
gpustat -h
gpustat --id 0,1 -i 2 -cp
Conda
conda create --name ssd
conda activate ssd
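To leave the environment or list existing ones (standard conda commands, added here for completeness):
conda deactivate
conda env list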
fio-commands
rwmixread=int
Percentage of a mixed workload that should be reads. Default: 50.
rwmixwrite=int
Percentage of a mixed workload that should be writes. If both rwmixread and rwmixwrite is given and the values do not add up to 100%, the latter of the two will be used to override the first. This may interfere with a given rate setting, if fio is asked to limit reads or writes to a certain rate. If that is the case, then the distribution may be skewed. Default: 50.
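For example, a minimal sketch of a 70/30 random read/write mix (reusing the io_uring options and /dev/nvme4n1 device from the examples below; block size and queue depth are just placeholders):
fio --name="mixed" --bs=4K --iodepth=32 --numjobs=1 --ioengine=io_uring --direct=1 --rw=randrw --rwmixread=70 --time_based=1 --runtime=30s --allow_file_create=0 --filename=/dev/nvme4n1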
All with io_uring
BW examples
- Time-based run on a block device
fio --name="bandwidth" --bs=1M --iodepth=16 --numjobs=4 --cpus_allowed=0-31 --time_based=1 --ramp_time=5s --runtime=30s --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=write --allow_file_create=0 --filename=/dev/nvme4n1
- Size-based (200%): 4 threads, writing the full device twice.
fio --name="bandwidth" --bs=1M --iodepth=16 --numjobs=4 --cpus_allowed=0-31 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=200% --norandommap=1 --group_reporting=1 --direct=1 --rw=write --allow_file_create=0 --filename=/dev/nvme4n1
- Parallel files, sequential writes (128KB size) with all CPUs ($nproc, time based). Each thread will have its own private file in a given --directory. See also fallocate=str.
sudo fio --name="bandwidth" --bs=128K --iodepth=16 --numjobs=$(nproc) --cpus_allowed=0-$(($(nproc)-1)) \
--ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=1g --norandommap=1 \
--group_reporting=1 --direct=1 --rw=write --allow_file_create=1 --directory=/home/atr/rocksdb-exp/ram/ \
--nrfiles=$(nproc) --time_based=1 --ramp_time=5s --runtime=30s --fallocate=posix
Per-second logging (bandwidth averaged over 2-second windows):
fio --name="bandwidth" --bs=1M --iodepth=16 --numjobs=4 --cpus_allowed=0-31 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=10g --norandommap=1 --group_reporting=1 --direct=1 --rw=read --allow_file_create=0 --filename=/dev/nvme4n1 --bwavgtime=2000 --log_avg_msec=2000 --bandwidth-log
IOPS examples
single thread (randread):
sudo fio --name="iops" --bs=4K --iodepth=512 --numjobs=1 --cpus_allowed=0 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=randread --allow_file_create=0 --time_based=1 --ramp_time=5s --runtime=30s --filename=/dev/nvme4n1
Multiple devices (colon-separated): --filename=/dev/nvme4n1:/dev/nvme4n2
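For example, a sketch of the single-thread randread command above extended to two devices (device names assumed):
sudo fio --name="iops" --bs=4K --iodepth=512 --numjobs=1 --cpus_allowed=0 --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=randread --allow_file_create=0 --time_based=1 --ramp_time=5s --runtime=30s --filename=/dev/nvme4n1:/dev/nvme4n2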
Latency example
fio --name="latency" --bs=4K --iodepth=1 --numjobs=1 --cpus_allowed=0 --time_based=1 --ramp_time=5s --runtime=30s --ioengine=io_uring --registerfiles=1 --fixedbufs=1 --ioscheduler=none --size=100% --norandommap=1 --group_reporting=1 --direct=1 --rw=randread --allow_file_create=0 --filename=/dev/nvme10n1
RocksDB
db_bench --num=1000000 --compression_type=none --value_size=400 --key_size=20 --use_direct_io_for_flush_and_compaction --use_existing_db=true --use_direct_reads --max_bytes_for_level_multiplier=10 --max_background_jobs=48 --threads=48 --enable_pipelined_write=true --allow_concurrent_memtable_write=true --wal_size_limit_MB=0 --write_buffer_size=67108864 --max_write_buffer_number=48 --histogram --report_bg_io_stats=true --report_file=./readwhilewriting-per-second-file.csv --report_interval_seconds=1 --benchmarks=readwhilewriting --seed=42 --db=/home/atr/rocksdb-exp//ssd -wal_dir=/home/atr/rocksdb-exp//ram --read_cache_size=0 --blob_cache_size=0 --cache_size=0 --compressed_cache_size=0 --prepopulate_block_cache=0 --num_file_reads_for_auto_readahead=0 --statistics --file_opening_threads=48
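Note: --use_existing_db=true assumes the database was populated beforehand, e.g. with a fill run such as this sketch (same paths and key/value sizes as above):
db_bench --num=1000000 --compression_type=none --value_size=400 --key_size=20 --benchmarks=fillseq --db=/home/atr/rocksdb-exp//ssd -wal_dir=/home/atr/rocksdb-exp//ram --seed=42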
Linux commands
taskset
Dynamic CPU affinity (verified by atr, works):
taskset -p <pid>
- get the CPU mask for a process (PID)
taskset -cp <desired cpu(s) comma separated list> <pid>
- set the CPU affinity for a running process (PID)
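For example (PID 1234 is hypothetical):
taskset -p 1234
taskset -cp 0,2,4 1234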
Linux block device information
There are two folders:
/sys/block/nvme0n1/queue/
-- here you have global queue-related parameters.
/sys/block/nvme0n1/mq/
-- here you have multi-queue related information: which CPUs these queues are mapped to, tag information.
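A few quick checks (nvme0n1 as above; these are the standard block-layer sysfs entries):
cat /sys/block/nvme0n1/queue/scheduler
cat /sys/block/nvme0n1/queue/nr_requests
grep . /sys/block/nvme0n1/mq/*/cpu_list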
journalctl
journalctl -b
- since last boot
journalctl --list-boots
- show all boots, and then use their offsets with journalctl -b -1
- time based:
journalctl --since "1 hour ago"
journalctl --since "2 days ago"
journalctl --since "2015-06-26 23:15:00" --until "2015-06-26 23:20:00"
journalctl -u nginx.service -u mysql.service
- specific service(s)
journalctl -f
- follow
journalctl -n 50 --since "1 hour ago"
- most recent 50 entries from the last hour
Make CPU online/offline
for x in /sys/devices/system/cpu/cpu{1..11}/online; do echo 1 >"$x"; done
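Write 0 to the same files to take the CPUs offline (cpu0 usually cannot be offlined):
for x in /sys/devices/system/cpu/cpu{1..11}/online; do echo 0 >"$x"; done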
Deleting files or directories with special characters in the name
rm -r \~
Drop Linux buffer cache
sudo sh -c "/usr/bin/echo 3 > /proc/sys/vm/drop_caches"
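Writing 1 drops only the page cache, 2 drops dentries and inodes, 3 drops both, e.g.:
sudo sh -c "/usr/bin/echo 1 > /proc/sys/vm/drop_caches"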
Current value of kernel module parameters
Check in /sys/module/[name]/parameters/
$ cat /sys/module/nvme/parameters/sgl_threshold
32768
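To dump all readable parameters of a module at once (nvme as above):
grep . /sys/module/nvme/parameters/* 2>/dev/null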
Find file or directory name
find . -type d -name "*nvme*"
-type c
File is of type c:
b block (buffered) special
c character (unbuffered) special
d directory
p named pipe (FIFO)
f regular file
l symbolic link; this is never true if the -L option or the -follow option is in effect, unless the symbolic link is broken. If you want to search for symbolic links when -L is in effect, use -xtype.
s socket
D door (Solaris)
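For example, to find regular files matching a pattern (the path is just an illustration):
find /etc -type f -name "*nvme*"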
Mount
sshfs:
sshfs atr@localhost:/home/atr/ ./vm-qemu7777/ -p 7777
tmpfs:
sudo mount -t tmpfs -o size=32G,noswap,uid=$USER,mpol=prefer:0,huge=never $USER ~/mnt/tmpfs/
https://man7.org/linux/man-pages/man5/tmpfs.5.html
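To verify or remove the mount afterwards:
findmnt ~/mnt/tmpfs
sudo umount ~/mnt/tmpfs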
Get the NUMA information
lstopo
or for NVMe
cat /sys/block/nvme1n1/device/numa_node
0
or, generically:
cat /sys/class/<class>/<dev_name>/device/numa_node
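For instance, for a NIC (the interface name eno1 is an assumption):
cat /sys/class/net/eno1/device/numa_node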
PCIe topology
lspci -t -v
How to list Linux hardware details
https://www.baeldung.com/linux/list-network-cards
sudo lshw
sudo lshw -C network
lshw -class disk -class storage
ebpf/bcc one-liners (references)
Block I/O size histogram:
~/src/bcc/tools/bitehist.py
On f20, it fails because I disabled PYTHONPATH for conda:
from bcc import BPF
ModuleNotFoundError: No module named 'bcc'
Solution: https://github.com/iovisor/bcc/blob/master/FAQ.txt
- it picks up the old path from the installed packages, which I do not like, but here we go:
export PYTHONPATH=$(dirname `find /usr/lib -name bcc`):$PYTHONPATH
Then it works.
CPU stack profiler
https://github.com/iovisor/bcc/blob/master/tools/profile_example.txt
Show the counts for all active stacks on the CPU (with filters):
$ sudo profile
Stack counts for a particular pattern or filter:
https://github.com/iovisor/bcc/blob/master/tools/stackcount_example.txt
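For example, counting stacks that lead to block I/O submission (tool path as checked out above; the traced function is just an example):
sudo ~/src/bcc/tools/stackcount.py submit_bio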
Monitoring
dstat
-c CPU, -d disk, -i interrupts, -m memory, -n network, -p process, -r io stats, -s swap, -y system stats. Further: --aio, --fs, --ipc, --lock.
dstat -pcmrd
vmstat
$ vmstat 1 1000
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
r b swpd free buff cache si so bi bo in cs us sy id wa st gu
1 0 7370204 1611780 4576 13788080 62 128 215 443 2135 24 11 5 84 0 0 0
0 0 7370204 1647792 4576 13788080 32 0 32 452 7935 14916 5 4 91 0 0 0