Voodoo checksums - EpicGazel/zpaqfranz GitHub Wiki

zpaqfranz lets you choose (almost everywhere) among different algorithms for hashing/checksumming. They differ in speed and reliability, and some can use HW acceleration (if the CPU supports it).

Why so many choices?
Because the best choice essentially depends on the read speed of the media and, possibly, on backward compatibility. With an NVMe drive capable of reading 2000MB/s, using SHA-256 will cap throughput at about 300MB/s. But with a magnetic disk that tops out at 200MB/s you will NOT reduce the time by using a faster (but less reliable) algorithm.
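
The reasoning above is simple arithmetic: in a single-threaded run the slower of the two stages (reading, hashing) sets the pace. Here is a minimal Python sketch using the figures quoted on this page. The function name is hypothetical, and the model deliberately ignores caching, multithreading and any overlap of I/O with hashing.

```python
def effective_rate_mb_s(media_mb_s: float, hash_mb_s: float) -> float:
    """Rough single-threaded estimate: the slower stage sets the pace."""
    return min(media_mb_s, hash_mb_s)


# Approximate figures taken from this page (MB/s).
NVME, HDD = 2000, 200
SHA256, XXHASH64 = 300, 5000

for media_name, media in (("NVMe", NVME), ("HDD", HDD)):
    for hash_name, hash_speed in (("SHA-256", SHA256), ("XXHASH64", XXHASH64)):
        rate = effective_rate_mb_s(media, hash_speed)
        print(f"{media_name} + {hash_name}: ~{rate} MB/s")
```

On the magnetic disk both algorithms end up at the disk's own limit, which is why a faster hash buys nothing there.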

Important note: when using the new zpaqfranz functions that can spawn one thread per folder to be processed (-all), the overall speed can be much higher than the maximum the media can sustain.
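
To illustrate the one-thread-per-folder idea, here is a hedged Python sketch. It is not zpaqfranz's implementation: the function names `hash_folder` and `hash_folders` are hypothetical, and SHA-256 from the standard library stands in for whichever algorithm is selected. Python's `hashlib` releases the interpreter lock while hashing large buffers, so the worker threads can actually run in parallel.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def hash_folder(folder: str, algorithm: str = "sha256") -> dict:
    """Return {relative_path: hex_digest} for every file under `folder`."""
    digests = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            h = hashlib.new(algorithm)
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large files need little memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, folder)] = h.hexdigest()
    return digests


def hash_folders(folders: list, algorithm: str = "sha256") -> dict:
    """Hash each folder in its own worker thread (one thread per folder)."""
    with ThreadPoolExecutor(max_workers=max(1, len(folders))) as pool:
        results = pool.map(lambda d: hash_folder(d, algorithm), folders)
        return dict(zip(folders, results))
```

Whether this helps in practice depends on the media: several concurrent readers tend to suit SSD/NVMe drives far better than a single magnetic disk.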

Short version (as of mid 2021): I suggest -xxhash for detecting unwanted corruption: fast and small. If you are "paranoid", go cryptographic-level: sha3, sha256, blake3, whirlpool.

Rough estimate: the speeds reported are only indicative of relative performance, measured on an AMD 5950X with Windows 64.

   WHIRLPOOL:   189.36 MB/s 
     SHA-256:   286.79 MB/s 
       SHA-3:   447.54 MB/s 
         MD5:   842.67 MB/s 
       SHA-1:   903.32 MB/s 
      BLAKE3:     3.60 GB/s (HW accelerated)
    XXHASH64:     5.40 GB/s 
        XXH3:     6.86 GB/s 
     CRC-32C:     7.24 GB/s 
      WYHASH:     8.64 GB/s (experimental, just for reference)
      CRC-32:     9.10 GB/s 
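
To get a comparable rough ranking on your own machine, a small in-memory benchmark is enough. The Python sketch below times only the algorithms available in the standard library (`hashlib` and `zlib`). It measures those implementations, not zpaqfranz's own code, so the absolute numbers will differ from the table above. XXHASH64, XXH3, BLAKE3, WHIRLPOOL and CRC-32C are left out because they are not guaranteed to be available without third-party packages. The helper names are hypothetical.

```python
import hashlib
import time
import zlib


def throughput_mb_s(update, data: bytes, rounds: int = 4) -> float:
    """Feed `data` to `update` `rounds` times and return MB/s."""
    start = time.perf_counter()
    for _ in range(rounds):
        update(data)
    elapsed = time.perf_counter() - start
    return (len(data) * rounds) / (1024 * 1024) / max(elapsed, 1e-9)


def benchmark(size: int = 4 * 1024 * 1024) -> dict:
    """Return {algorithm_name: MB/s} for an in-memory buffer of `size` bytes."""
    data = bytes(size)  # zero-filled buffer, no disk I/O involved
    results = {}
    for name in ("sha256", "sha3_256", "md5", "sha1"):
        results[name] = throughput_mb_s(hashlib.new(name).update, data)
    # zlib.crc32 is called without a running value; fine for timing purposes.
    results["crc32"] = throughput_mb_s(zlib.crc32, data)
    return results


if __name__ == "__main__":
    for name, speed in sorted(benchmark().items(), key=lambda kv: kv[1]):
        print(f"{name:>10}: {speed:10.2f} MB/s")
```

Because the buffer lives in RAM, the result is the ceiling imposed by the algorithm alone; the media speed still has to be taken into account separately.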

-sha1

Fair speed (~900MB/s), very reliable.
Collisions have been found, albeit in very special and limited cases.

-xxhash

XXHASH64 (64 bit), the default in zpaqfranz 52 (because its output is smaller than the 128-bit one).
Very fast (~5000MB/s) and thought to be reliable.

-xxh3

XXH3 (128 bit).
Very fast (~7000MB/s) and thought to be reliable.

-crc32

The ancient but ubiquitous CRC-32.
Very fast (~9000MB/s), reliable for detecting corruption, not so much for collisions.
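
As a small illustration of "reliable for detecting corruption", the Python sketch below flips a single bit in a buffer and compares the CRC-32 values. It uses `zlib.crc32` from the standard library, which is assumed here as a stand-in for zpaqfranz's own CRC-32 code.

```python
import zlib

original = b"zpaqfranz test payload" * 1000
corrupted = bytearray(original)
corrupted[1234] ^= 0x01  # flip one bit to simulate media corruption

crc_original = zlib.crc32(original)
crc_corrupted = zlib.crc32(bytes(corrupted))

print(f"original : {crc_original:08x}")
print(f"corrupted: {crc_corrupted:08x}")
print("corruption detected:", crc_original != crc_corrupted)
```

A CRC always catches a single flipped bit. The weak point is the 32-bit output: with only about 4 billion possible values, two different files can share the same CRC, and such a pair is easy to construct on purpose.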

-crc32c

The "Castagnoli" version, with HW acceleration.
Very fast (~7000MB/s), reliable for detecting corruption, not for collisions.

-blake3

CPU intensive (on Win 64 it runs with HW acceleration), but very reliable.
On Intel CPUs it can be faster than SHA-256.
Please note: the current implementation does NOT use multithreading. Maybe in the future...

-sha256

CPU intensive (~290MB/s), but perhaps the most reliable.
In Europe it constitutes legal proof.

-sha3 (256 bit)

The latest NIST standard, very different internally from SHA2-256.
Faster than SHA-256 in the measurements above (~450MB/s). Very, very strong.
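
"Very different internally" means that SHA-3 is built on the Keccak sponge construction, while SHA-256 follows the older Merkle-Damgård design. Seen from outside, both produce a 256-bit digest. A short Python sketch with the standard `hashlib` module (not zpaqfranz's code) shows the same input giving two unrelated digests of equal length:

```python
import hashlib

data = b"same input, two different constructions"

sha2 = hashlib.sha256(data)
sha3 = hashlib.sha3_256(data)

print("SHA-256 :", sha2.hexdigest())
print("SHA3-256:", sha3.hexdigest())
print("digest sizes (bytes):", sha2.digest_size, sha3.digest_size)
```

The two digests are not interchangeable: a value stored with -sha256 cannot be checked with -sha3, and vice versa.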

-whirlpool

Very CPU intensive (~190MB/s), but very, very, very reliable.
512-bit (64 byte) output. NOT made by NSA (if you do not like that :)

-md5

Today MD5 is broken as a cryptographic hash function, but it still works well as a checksum to detect unintentional corruption. It is very common and widespread (that's why it is here), at ~800MB/s.
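
For the "verify unintentional corruption" use case, a file is usually hashed in chunks and the result compared with a previously stored value. This is a minimal Python sketch using the standard library; the function name `file_digest_hex` is hypothetical, and this is not how zpaqfranz itself is implemented.

```python
import hashlib


def file_digest_hex(path: str, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files need little memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing `file_digest_hex(source)` with `file_digest_hex(copy)` tells you whether a copy arrived intact. To guard against deliberate tampering, pass a cryptographic algorithm such as `"sha256"` instead of the default.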