SSD - hpaluch/hpaluch.github.io GitHub Wiki

SSD notes

I have currently several SSD drives and I'm curious how long they will last.

The most important parameter in the datasheets is TBW (Total Bytes Written). According to the "Limited Warranty" documents included with the drives, the warranty appears to become void once this TBW threshold is exceeded.

Kingston SA400

I have 3 Kingston drives - 1st is 240GB model, and 2nd and 3rd are SA400 models. Datasheet can be found here: https://www.kingston.com/datasheets/SA400S37_us.pdf

WARNING!

Just recently (Dec 2023) I have noticed that Kingston SA400 read performance dramatically suffers for old files (data modified more than 3 months ago). On my old PC transfer rate decreases from 190 MB/s to around 20MB/s - which is worse than old IDE UDMA33 HDDs (they have typical transfer rate around 30MB/s)!

Therefore please avoid buying the Kingston SA400 if you can. See the end of this wiki page for more details.
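To see which files fall into the "older than 3 months" group, something like the following sketch could be used. The function name list_old_files is made up for this page, the 90-day default is only an approximation of "3 months", and GNU find is assumed.

```shell
# List regular files under a directory whose data was last modified
# more than N days ago (default 90, roughly 3 months).
# list_old_files is a hypothetical helper name, not part of any package.
list_old_files() {
    local dir="$1"
    local days="${2:-90}"
    # -mtime +N matches files modified more than N whole days ago;
    # -xdev keeps find on the same filesystem as the start directory
    find "$dir" -xdev -type f -mtime +"$days" -print
}
```

Example call: list_old_files /var/lib/vz/dump 90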

For my 240GB drive there is TBW = 80 TB
For my 480GB drive there is TBW = 160 TB
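A rough way to turn TBW into "how long will it last" is to divide it by the average daily write volume. The sketch below does only that arithmetic. The function name tbw_years and the 20 GB/day figure in the example are assumptions for illustration, not measured values.

```shell
# Estimate how many years it takes to reach the TBW threshold.
# $1 = TBW from the datasheet in TB (decimal)
# $2 = ASSUMED average host writes in GB per day
tbw_years() {
    awk -v tbw="$1" -v daily="$2" \
        'BEGIN { printf "%.1f\n", tbw * 1000 / daily / 365 }'
}
```

With an assumed 20 GB written per day, tbw_years 80 20 prints 11.0 and tbw_years 160 20 prints 21.9. Real drives may fail earlier or later; TBW is only the warranty threshold.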

Other interesting details:

up to 300MB/s write for 240GB drive (running openSUSE LEAP 15.5 on ZOTAC PC with Celeron CPU I get read rate around 290MB/s)
up to 450MB/s write for 480GB drive (see below)

Of course those write speeds are under specific ideal conditions.

Kingston does not disclose details of the flash technology or cache used.

Samsung QVO 870

I have one Samsung QVO 870, 1TB drive.

According to datasheet: https://download.semiconductor.samsung.com/resources/data-sheet/Samsung_SSD_870_QVO_Data_Sheet_Rev1.1_10129514072903.pdf

For my 1TB drive there is TBW = 360TB
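To compare the amount already written with the TBW value, the write counter from SMART (for example from smartctl -A /dev/sdX) can be converted to TB. The sketch below ASSUMES the drive reports a count of written LBAs of 512 bytes each; attribute names and units differ between vendors, so check your drive's output first. Both function names are hypothetical.

```shell
# Convert a count of written LBAs to decimal terabytes.
# $1 = LBA count, $2 = LBA size in bytes (ASSUMED 512 when omitted)
lbas_to_tb() {
    awk -v n="$1" -v s="${2:-512}" 'BEGIN { printf "%.2f\n", n * s / 1e12 }'
}

# Share of the TBW budget already used, in percent.
# $1 = TB written so far, $2 = TBW from the datasheet in TB
tbw_used_percent() {
    awk -v w="$1" -v tbw="$2" 'BEGIN { printf "%.1f\n", 100 * w / tbw }'
}
```

For a made-up counter value of 70312500000 LBAs, lbas_to_tb prints 36.00, and tbw_used_percent 36 360 prints 10.0.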

Other interesting details from datasheet:

1GB LPDDR4 DRAM Cache
up to 560MB/s writes (when optimized)
technology 4bit MLC V-NAND

Problem with Kingston SA400: read rate drops to 20MB/s for old files

I encountered the problem on a Kingston SA400 SSD under Proxmox VE, using LVM with ext4 (QCOW2 VMs only).

Some files are read very quickly (around 130MB/s which is fine) but some big files (typically Proxmox backups) are very slow - reads are barely 20MB/s(!)
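To compare fast and slow files in a repeatable way, the read rate of a single file can be measured with a small helper like this sketch. The name read_rate_mbs is made up; GNU coreutils (stat -c, date +%N, dd status=none) are assumed. It does not drop the page cache itself, because that needs root.

```shell
# Print the sequential read rate of one file in MB/s (decimal megabytes).
# Run "sync; echo 3 > /proc/sys/vm/drop_caches" as root beforehand,
# otherwise the page cache may serve the data and inflate the result.
read_rate_mbs() {
    local file="$1"
    local size start end
    size=$(stat -c %s "$file") || return 1
    start=$(date +%s.%N)
    dd if="$file" of=/dev/null bs=1M status=none || return 1
    end=$(date +%s.%N)
    awk -v b="$size" -v s="$start" -v e="$end" \
        'BEGIN { t = e - s; if (t <= 0) t = 0.000001; printf "%.1f\n", b / 1e6 / t }'
}
```

Example call: read_rate_mbs /path/to/some-old-backup.vma.zst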

Useful commands:

# from sysstat package
iostat -Nsxyz 1

# get number of extents (if enabled on ext4)

# filefrag /var/lib/vz/template/iso/ubuntu-22.04.3-live-server-amd64.iso

/var/lib/vz/template/iso/ubuntu-22.04.3-live-server-amd64.iso: 39 extents found

e4defrag -v file_to_defrag

Getting details on LVM:

Getting Physical Extent (PE) Size:

# pvdisplay | fgrep 'PE Size'

  PE Size               4.00 MiB

getting Physical Volumes including segments:

# pvs --segments
PV         VG     Fmt  Attr PSize    PFree  Start SSize
/dev/sda3  pvessd lvm2 a--  <149.00g <2.01g     0  1984
/dev/sda3  pvessd lvm2 a--  <149.00g <2.01g  1984 35645
/dev/sda3  pvessd lvm2 a--  <149.00g <2.01g 37629   514

Notice that Start and SSize (Segment size) are counted in units of PE Size = 4 MiB

and logical segments:

# lvs -o+seg_start_pe,seg_size_pe,segtype
LV   VG     Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Start SSize Type
root pvessd -wi-ao---- <139.24g                                                         0 35645 linear
swap pvessd -wi-ao----    7.75g                                                         0  1984 linear
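The extent counts can be cross-checked against the sizes reported by LVM by multiplying them by the PE size. A minimal sketch (pe_to_gib is a made-up helper name; PE size defaults to the 4 MiB shown by pvdisplay):

```shell
# Convert a number of LVM physical extents to GiB.
# $1 = extent count, $2 = PE size in MiB (default 4)
pe_to_gib() {
    awk -v n="$1" -v pe="${2:-4}" 'BEGIN { printf "%.2f\n", n * pe / 1024 }'
}
```

pe_to_gib 35645 prints 139.24 (the root LV), pe_to_gib 1984 prints 7.75 (swap) and pe_to_gib 514 prints 2.01 (the free segment), which matches the LSize and PFree columns.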

Currently I have 2 possible explanations:

some regions of the Kingston drive are very slow to read (dying cells?)
backups are slow because of heavy fragmentation

Here is output from defrag (but it took around 15 minutes to defrag 4.8GB file):

# e4defrag -v  vzdump-qemu-328-2023_05_30-10_39_53.vma.zst

e4defrag 1.47.0 (5-Feb-2023)
ext4 defragmentation for vzdump-qemu-328-2023_05_30-10_39_53.vma.zst
[1/1]vzdump-qemu-328-2023_05_30-10_39_53.vma.zst:	100%  extents: 349 -> 44	[ OK ]
 Success:			[1/1]

After defrag it looks fine:

# echo 3 > /proc/sys/vm/drop_caches
# sync
# sleep 3
# dd if=vzdump-qemu-328-2023_05_30-10_39_53.vma.zst bs=4096 of=/dev/null status=progress

4775202816 bytes (4.8 GB, 4.4 GiB) copied, 25 s, 191 MB/s
1187907+1 records in
1187907+1 records out
4865668213 bytes (4.9 GB, 4.5 GiB) copied, 25.47 s, 191 MB/s

Still not convinced that it is not an "area degradation problem" - here is a file with a lower fragment count:

# dd if=arch-rootfs-1686140961.tar.gz bs=4096 of=/dev/null status=progress

2939146240 bytes (2.9 GB, 2.7 GiB) copied, 123 s, 23.9 MB/s
... horrible ...

# time e4defrag -v arch-rootfs-1686140961.tar.gz

e4defrag 1.47.0 (5-Feb-2023)
ext4 defragmentation for arch-rootfs-1686140961.tar.gz
[1/1]arch-rootfs-1686140961.tar.gz:	100%  extents: 92 -> 25	[ OK ]
 Success:			[1/1]

real	0m41.387s
user	0m0.084s
sys	0m28.429s

# echo 3 > /proc/sys/vm/drop_caches
# dd if=arch-rootfs-1686140961.tar.gz bs=4096 of=/dev/null status=progress

2892439552 bytes (2.9 GB, 2.7 GiB) copied, 15 s, 193 MB/s

Confirmed that the problem is NOT caused by fragmentation:

# filefrag vzdump-qemu-102-2023_06_01-17_06_05.vma.zst

vzdump-qemu-102-2023_06_01-17_06_05.vma.zst: 7 extents found

# dd if=vzdump-qemu-102-2023_06_01-17_06_05.vma.zst of=/dev/null bs=1024k status=progress

1780482048 bytes (1.8 GB, 1.7 GiB) copied, 79 s, 22.5 MB/s

Extent details:

File size of vzdump-qemu-102-2023_06_01-17_06_05.vma.zst is 1786722907 (436212 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..    2047:    1951744..   1953791:   2048: 
   1:     2048..    4095:    1955840..   1957887:   2048:    1953792:
   2:     4096..    6143:    1906688..   1908735:   2048:    1957888:
   3:     6144..  268287:   21233664..  21495807: 262144:    1908736:
   4:   268288..  284671:   21217280..  21233663:  16384:   21495808:
   5:   284672..  307199:   21504032..  21526559:  22528:   21233664:
   6:   307200..  436211:   21528576..  21657587: 129012:   21526560: last,eof
vzdump-qemu-102-2023_06_01-17_06_05.vma.zst: 7 extents found
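The block numbers in this listing are in 4096-byte filesystem blocks. To see where each extent sits on the partition and how big it is, the "filefrag -v" output can be converted to MiB with a sketch like this (extent_map_mib is a made-up name; the field layout is assumed to be the one shown above):

```shell
# Read "filefrag -v" output on stdin and print, per extent:
#   extent number, physical start in MiB, length in MiB.
# $1 = filesystem block size in bytes (default 4096)
extent_map_mib() {
    awk -v bs="${1:-4096}" '
        /^ *[0-9]+:/ {
            gsub(/\.\.|:/, " ")   # turn the ".." and ":" separators into spaces
            # fields now: ext, logical start, logical end,
            #             physical start, physical end, length, ...
            printf "%d %.1f %.1f\n", $1, $4 * bs / 1048576, $6 * bs / 1048576
        }'
}
```

Usage: filefrag -v FILE | extent_map_mib 4096. For extent 3 above it prints "3 82944.0 1024.0", i.e. a 1 GiB extent starting about 81 GiB into the partition. A single extent can then be read on its own, for example dd if=FILE of=/dev/null bs=4096 skip=6144 count=262144 for extent 3, to check whether only some regions are slow.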

Hmm, it seems to be a known issue:

This one is exactly what I'm observing:

https://forums.unraid.net/topic/133682-ssds-that-dont-slow-down-reading-old-files/

Apparently it's a not uncommon issue with SSDs that slow down reading old files due to flash cell voltage for files that are over 3 months old. ...

While the 840's and the Corsair model dropped like a rock after ~3 months, WD model looks to appear to slow down gradually from 4 months to a year and maintain decent speeds.

Possible (but drastic) remedy:

backup partition with partclone
restore partition with partclone

However that partition must be inactive (not mounted) - using either Clonezilla shell or other installed Linux.

Workaround: rewrite used blocks using partclone

It is theoretically easy to refresh all SSD cells (using badblocks), but we should skip unused blocks so as not to use up the SSD's TBW too quickly. Here partclone comes to the rescue.

My example - where /dev/sda11 is ext4 partition with Arch Linux on Kingston SA400 SSD, and /backup_disk is exFAT partition on my backup disk (Seagate 1TB HDD).

WARNING!

Do NOT use partclone.btrfs (BTRFS support)! It will corrupt data!

Try checking with btrfsck --readonly PATH_TO_DEVICE before the clone and on the unpacked clone, but be sure to unpack the clone on an EMPTY (zeroed) disk - otherwise the old data will remain and the check will pass even when the clone is broken(!)

verified this problem for partclone version 0.3.23+repack-1

Use rather tar (with --numeric-owner !!!) or cpio
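A minimal sketch of the tar variant, assuming GNU tar and a filesystem that is mounted (read-only if possible) at some directory. The function names tar_backup and tar_restore are made up for this page, and compression is left out to keep the example short.

```shell
# Archive a mounted filesystem tree with numeric owner IDs, and restore it.
#   tar_backup  SRC_DIR ARCHIVE
#   tar_restore ARCHIVE DEST_DIR
tar_backup() {
    # --numeric-owner stores UID/GID numbers instead of user/group names,
    # --one-file-system does not descend into other mounted filesystems,
    # -p keeps permissions
    tar --numeric-owner --one-file-system -cpf "$2" -C "$1" .
}

tar_restore() {
    tar --numeric-owner -xpf "$1" -C "$2"
}
```

The --numeric-owner option matters when restoring from a rescue system whose /etc/passwd maps user names to different IDs than the backed-up OS. Keep the archive outside of the source directory.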

Here is my example:

test OS has ext4 filesystem with UUID 5cb3bf8e-f456-4a4e-86df-f5447d8d9d99
do not forget to create subdirectory: mkdir -p partclone (relative to script directory) where the backup will be stored

Backup script backup-partclone.sh

#!/bin/bash

set -eu
# EXT4 filesystem UUID to backup (you can see it with "lsblk -f" or "tune2fs -l /dev/PARTITION")
fsuuid=5cb3bf8e-f456-4a4e-86df-f5447d8d9d99
dev_link=/dev/disk/by-uuid/$fsuuid
[ -L "$dev_link" ] || {
	echo "ERROR: '$dev_link' is not symlink" >&2
	exit 1
}
dev=$(readlink -f "$dev_link")
[ -b "$dev" ] || {
	echo "ERROR: Resolved link '$dev_link' -> '$dev' is not block device" >&2
	exit 1
}

# verify that filesystem is not mounted
if mount | awk '{print $1 }' | grep -Fqx "$dev"; then
	echo "ERROR: Device '$dev' is mounted"'!' >&2
	exit 1
fi
set -o pipefail
cd "$(dirname "$0")"
t="partclone/pvessd-x2-oss2-50gb-$(date '+%Y%m%d-%H%M').img.zstd"
set -x
partclone.ext4 -c -d -N -s "$dev" | zstd -1 > "$t"
zstdcat "$t" | partclone.info -s - 2> "$t.info.txt"
exit 0

you can watch transfer rate while clone (-c parameter) is in progress - I often see something as ugly as 1.5GB/min which is 1500 / 60 = 25 MB/s

restore - double check partition name!!:

cd /backup_disk
# commands below are DESTRUCTIVE! Replace /dev/sdbX with your target partition!
blkdiscard -vf /dev/sdbX
zstdcat partclone/pvessd-x2-oss2-50gb-20231208-0715.img.zstd | partclone.ext4 -r -N -s - -o /dev/sdbX

in my case even restore (write) rate was around 5 GB/min which is 5000/60= 83 MB/s, much better than read rate before...

and repeat it every 3 months (eh, kind of sarcastic...)
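partclone reports its rate in GB/min, while dd and iostat report MB/s. The conversion used in the notes above is just a division by 60, sketched here with a made-up helper name:

```shell
# Convert a rate in GB/min (as shown by partclone) to MB/s.
# Decimal units are assumed: 1 GB = 1000 MB.
gb_per_min_to_mb_s() {
    awk -v g="$1" 'BEGIN { printf "%.0f\n", g * 1000 / 60 }'
}
```

gb_per_min_to_mb_s 1.5 prints 25 and gb_per_min_to_mb_s 5 prints 83, matching the read and restore rates quoted above.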