Features

Hundreds of Hash Functions

Standard Hash Functions

Jacksum supports 489 standard algorithms. For a detailed list, see https://github.com/jonelo/jacksum/wiki/Algorithms

Customized Hash Functions

  • Concatenate algorithms (e.g. ascon-hash+sha256+crc32c) to obtain a combined hash value

  • The keyed-hash message authentication code (HMAC) is supported, including truncated output of HMAC

  • The "Rocksoft (tm) Model CRC Algorithm" scheme is supported

    • Specify all parameters of that model (width, poly, init, refIn, refOut, xorOut) to get your own CRC (see the sketch below this list)
    • Use Jacksum's extended model to specify CRCs that also incorporate the length of the input data (incLen, xorLen)
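
To make the parameter model concrete, here is a minimal, illustrative Java sketch of a bit-serial CRC driven by the six Rocksoft parameters. It is not Jacksum's implementation (the class and method names are made up for this example), and the incLen/xorLen extension is not shown.

```java
import java.nio.charset.StandardCharsets;

// Generic CRC following the Rocksoft parameter model (width, poly, init, refIn, refOut, xorOut).
public class RocksoftCrcSketch {

    // Reflect the lowest 'bits' bits of 'value' (bit 0 becomes bit bits-1, and so on).
    static long reflect(long value, int bits) {
        long result = 0;
        for (int i = 0; i < bits; i++) {
            result = (result << 1) | ((value >>> i) & 1L);
        }
        return result;
    }

    // Bit-serial CRC; works for widths from 1 to 64.
    static long crc(byte[] data, int width, long poly, long init,
                    boolean refIn, boolean refOut, long xorOut) {
        long topBit = 1L << (width - 1);
        long mask = (width == 64) ? -1L : (1L << width) - 1;
        long crc = init & mask;
        for (byte b : data) {
            int cur = b & 0xFF;
            if (refIn) {
                cur = Integer.reverse(cur) >>> 24; // feed the byte LSB-first
            }
            for (int i = 7; i >= 0; i--) {
                boolean dataBit = ((cur >>> i) & 1) == 1;
                boolean topSet = (crc & topBit) != 0;
                crc = (crc << 1) & mask;
                if (dataBit ^ topSet) {
                    crc ^= poly;
                }
            }
        }
        if (refOut) {
            crc = reflect(crc, width);
        }
        return (crc ^ xorOut) & mask;
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Standard CRC-32 expressed in Rocksoft parameters.
        long crc32 = crc(check, 32, 0x04C11DB7L, 0xFFFFFFFFL, true, true, 0xFFFFFFFFL);
        System.out.printf("crc32(\"123456789\") = %08x%n", crc32);
    }
}
```

With the standard CRC-32 parameters shown in main, the program prints cbf43926, the well-known check value for the input "123456789".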

Broad Platform Support

Cross-platform support

  • Operating Systems

    • Microsoft Windows (e.g. Windows 10 and 11)
    • GNU/Linux (e.g. Ubuntu)
    • Unix (e.g. BSD-flavors, macOS, Solaris)
    • any other operating system or architecture with an OpenJDK compatible Java Runtime Environment (JRE) or Java Development Kit (JDK)
  • Supported hardware architectures depend on the JDK/JRE

    • x86 64 bit (x64)
    • x86 32 bit (x86)
    • ARM 64 bit (AArch64, e.g. Apple M1)
    • ARM 32 bit (AArch32)
    • PPC 64 bit (ppc64)
  • Written entirely in Java

    • no recompilation required

Multi-core system/multi-CPU support

  • Supports multi-threading on both multi-processor and multi-core computer systems

  • Multiple algorithms

    • Can calculate multiple hashes simultaneously, i.e. files are read only once and the calculation load is distributed across the available cores (see the sketch below this list)
    • The user can control the number of threads used when multiple algorithms run in parallel
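
The idea behind this feature can be sketched with plain JDK classes: read each chunk of the file once and let every MessageDigest instance consume it on its own thread. This is only a conceptual illustration, not Jacksum's actual code; the class name and the chosen algorithms are arbitrary.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Read the file once; every digest processes each chunk on its own thread.
public class MultiAlgorithmHash {
    public static void main(String[] args) throws Exception {
        Path file = Path.of(args[0]);
        List<MessageDigest> digests = List.of(
                MessageDigest.getInstance("SHA-256"),
                MessageDigest.getInstance("SHA-512"),
                MessageDigest.getInstance("MD5"));
        ExecutorService pool = Executors.newFixedThreadPool(digests.size());
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[1 << 20]; // 1 MiB chunks
            int n;
            while ((n = in.read(buffer)) > 0) {
                final int len = n;
                List<Callable<Void>> tasks = new ArrayList<>();
                for (MessageDigest md : digests) {
                    tasks.add(() -> { md.update(buffer, 0, len); return null; });
                }
                pool.invokeAll(tasks); // blocks until every digest has processed this chunk
            }
        }
        pool.shutdown();
        for (MessageDigest md : digests) {
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            System.out.println(md.getAlgorithm() + " " + hex);
        }
    }
}
```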

Solid-State Drive (SSD) support

  • Supports reading multiple files in parallel on fast SATA SSDs and NVMe M.2 SSDs.

  • Multiple files

    • Can read multiple files simultaneously, i.e. files can be read in parallel on SSDs (see the sketch below this list)
    • The user can control the number of threads used when multiple files are read in parallel
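
Conceptually, parallel file reading boils down to a bounded thread pool with one hashing task per file, as in the following JDK-only sketch. Again, this is an illustration rather than Jacksum's code; the thread count of 4 and the use of SHA-256 are arbitrary examples.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hash several files in parallel; useful when the storage can serve concurrent reads.
public class ParallelFileHash {
    public static void main(String[] args) throws Exception {
        int threads = 4; // tune to what your SSD and CPU can sustain
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<Path, Future<String>> results = new LinkedHashMap<>();
        for (String name : args) {
            Path file = Path.of(name);
            results.put(file, pool.submit(() -> {
                MessageDigest md = MessageDigest.getInstance("SHA-256");
                md.update(Files.readAllBytes(file)); // fine for a demo; stream large files instead
                StringBuilder hex = new StringBuilder();
                for (byte b : md.digest()) hex.append(String.format("%02x", b));
                return hex.toString();
            }));
        }
        for (Map.Entry<Path, Future<String>> e : results.entrySet()) {
            System.out.println(e.getValue().get() + "  " + e.getKey());
        }
        pool.shutdown();
    }
}
```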

Use Cases

Integrity verification features

  • Use predefined compatibility/style files to read and write popular 3rd party format styles (GNU/Linux, BSD, SFV, FCIV, openssl, etc.)
  • Perform integrity checks, and detect ok, failed, missing, and new files (a conceptual sketch follows this list)
  • Include not only the hash, but also the file size and/or file modification timestamp, for more reliable integrity checks
  • Create and use your own compatibility/style files
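
As a rough illustration of what an integrity check does, the following JDK-only sketch reads a GNU-coreutils-style checksum list and reports OK, FAILED and MISSING files. It is not Jacksum's verification engine; detecting new files would additionally require scanning the directory tree and comparing it against the list.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

// Verify a list of "hexdigest  filename" lines (e.g. produced by "sha256sum * > checksums.sha256").
public class VerifyList {
    public static void main(String[] args) throws Exception {
        Path listFile = Path.of(args[0]);
        for (String line : Files.readAllLines(listFile, StandardCharsets.UTF_8)) {
            if (line.isBlank()) continue;
            String expected = line.substring(0, line.indexOf(' ')).toLowerCase();
            String name = line.substring(line.indexOf(' ')).trim();
            if (name.startsWith("*")) name = name.substring(1); // GNU "binary" marker
            Path file = Path.of(name);
            if (!Files.exists(file)) {
                System.out.println("MISSING  " + name);
                continue;
            }
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            System.out.println((hex.toString().equals(expected) ? "OK       " : "FAILED   ") + name);
        }
    }
}
```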

Find files by hashes

  • Find all files that match a given hash value (find the duplicates of a file); see the sketch after this list
  • Find all files that match the hash values in a precalculated hash set
  • Find all files that don't match the hash values in a precalculated hash set
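
The following JDK-only sketch shows the underlying idea: walk a tree, hash every regular file, and test membership in a precalculated hash set (invert the filter to find non-matches). It is an illustration, not Jacksum's implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.Set;
import java.util.stream.Stream;

// Report every regular file below a root whose SHA-256 value is in a known hash set.
public class FindByHash {
    static String sha256Hex(Path file) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (Exception e) {
            return ""; // unreadable file: treat as "no match"
        }
    }

    public static void main(String[] args) throws IOException {
        Set<String> wanted = Set.of(args[0].toLowerCase()); // one or more known hashes
        Path root = Path.of(args.length > 1 ? args[1] : ".");
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile)
                 .filter(p -> wanted.contains(sha256Hex(p)))
                 .forEach(System.out::println);
        }
    }
}
```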

Find the algorithm to a hash

  • Find the algorithm that was used to calculate a checksum/CRC/hash (a conceptual sketch follows)
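
Conceptually this works by filtering candidate algorithms by digest width and then comparing computed values. The sketch below does this with only the message digests that the installed JCA providers offer; a tool like Jacksum searches a far larger catalog, and the known input used here is just an assumption for the demo.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.Security;

// Given a known input and an unidentified hex hash, print the JCA digests that reproduce it.
public class WhichAlgorithm {
    public static void main(String[] args) throws Exception {
        byte[] input = "known input".getBytes(StandardCharsets.UTF_8); // demo assumption
        String unknown = args[0].toLowerCase(); // the hex hash to identify
        for (String name : Security.getAlgorithms("MessageDigest")) {
            MessageDigest md = MessageDigest.getInstance(name);
            byte[] value = md.digest(input);
            if (value.length * 2 != unknown.length()) continue; // filter candidates by digest width
            StringBuilder hex = new StringBuilder();
            for (byte b : value) hex.append(String.format("%02x", b));
            if (hex.toString().equals(unknown)) System.out.println("candidate: " + name);
        }
    }
}
```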

Input and Output

Input related features

  • Recursive directory traversal

    • Processes directories recursively and allows you to limit the traversal depth
    • Detects file system cycles and avoids endless loops
    • Allows you to control how symbolic links to files and/or directories are handled on all operating systems (see the traversal sketch after this feature list)
  • Input from almost any source

    • Calculates hashes from files, stdin, file lists, and command line argument values on all operating systems: Windows, Linux, Unix (e.g. macOS, BSD)
    • Calculates hashes from disks and partitions on all operating systems: Windows, Linux, Unix (e.g. macOS, BSD)
    • Calculates hashes from block devices, character devices, named pipes, sockets, and sparse files on all Unix-like operating systems
    • Calculates hashes from NTFS Alternate Data Streams (ADS) on Microsoft Windows
    • Calculates hashes from doors on Solaris
  • Character sets, Unicode and BOM support

    • Full Unicode file name support for input files
    • Allows you to specify the character set for input files: GB18030, UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, etc.
    • Ignores an optional Byte-Order-Mark (BOM) from the input if a BOM is allowed, but not required by the selected charset
  • Correctness of file handling

    • Handles special characters in filenames correctly (e.g. if a filename on Linux ends with a space or if a filename contains backslashes or newline characters)
    • Handles the maximum allowed filename length properly (e.g. 255 characters for a filename on Microsoft Windows NTFS file systems)
    • Handles the maximum allowed path length properly (e.g. 32,767 characters for the entire path on Microsoft Windows NTFS file systems)
    • It is large-file aware and can process files up to 8 Exbibytes (2^63 bytes), provided that your operating system and file system are large-file aware, too.
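
The traversal features above (depth limit, cycle detection, symbolic link handling) can be illustrated with the JDK's file-walking API. The sketch below is not Jacksum's traversal code; the depth limit of 3 is an arbitrary example.

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;

// Walk a tree with a depth limit, follow symbolic links, and survive file system cycles.
public class WalkDemo {
    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        int maxDepth = 3; // depth limit, analogous to a depth option on the command line
        Files.walkFileTree(root, EnumSet.of(FileVisitOption.FOLLOW_LINKS), maxDepth,
                new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        if (attrs.isRegularFile()) System.out.println(file);
                        return FileVisitResult.CONTINUE;
                    }

                    @Override
                    public FileVisitResult visitFileFailed(Path file, IOException exc) {
                        if (exc instanceof FileSystemLoopException) {
                            System.err.println("cycle detected, skipping: " + file);
                        } else {
                            System.err.println("cannot read: " + file);
                        }
                        return FileVisitResult.CONTINUE; // keep going instead of aborting
                    }
                });
    }
}
```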

Output related features

  • Predefined standard formats

    • Output can be produced in predefined standard formats (BSD, GNU/Linux, openssl, and Solaris styles, SFV, and FCIV)
    • Supports GNU file name escaping
  • User defined formats

    • Use comprehensive format options to get the output you need
    • Create your own format and define a compatibility file to be able to read your own format again
    • Create ed2k links, magnet links, and output in Solaris' pkgmap format
    • To represent hash values, one of 16 encodings can be selected (a small encoding sketch follows this feature list):
      • binary
      • decimal
      • octal
      • hex (lower- and uppercase); bytes can also be grouped and separated for easier readability
      • Base16
      • Base32 (with and without padding)
      • Base32hex (with and without padding)
      • Base64 (with and without padding)
      • Base64url (with and without padding)
      • BubbleBabble
      • z-base-32
      • z85
    • Many different charsets are supported in order to write localized filenames properly (see also the options --charset-stdout, --charset-stderr, --charset-output-file, and --charset-error-file).
    • Paths can be customized. See also options -P, --no-path, --path-relative-to, and --path-absolute.
    • Timestamps can be customized. Predefined timestamp formats such as ISO 8601 or Unix time are available.
  • Character sets, Unicode, and BOM support

    • Full Unicode file name support for output files
    • Allows you to specify the character set for output files: GB18030, UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, etc.
    • Adds an optional Byte-Order-Mark (BOM) to the output if a BOM is allowed, but not required by the selected charset
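
A few of the listed encodings can be reproduced with the standard library alone, as in the following illustrative sketch: hex in both cases, grouped hex, and the Base64 family. Base32, Base32hex, BubbleBabble, z-base-32 and z85 would need extra code or a third-party library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

// Print one digest in several encodings.
public class EncodingDemo {
    public static void main(String[] args) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest("Hello, world!".getBytes(StandardCharsets.UTF_8));

        StringBuilder hexLower = new StringBuilder();
        for (byte b : digest) hexLower.append(String.format("%02x", b));
        System.out.println("hex            " + hexLower);
        System.out.println("HEX            " + hexLower.toString().toUpperCase());

        // hex grouped in 4-byte blocks (8 hex characters) for readability
        System.out.println("hex, grouped   " + hexLower.toString().replaceAll("(.{8})(?=.)", "$1 "));

        System.out.println("base64         " + Base64.getEncoder().encodeToString(digest));
        System.out.println("base64, nopad  " + Base64.getEncoder().withoutPadding().encodeToString(digest));
        System.out.println("base64url      " + Base64.getUrlEncoder().encodeToString(digest));
    }
}
```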

3rd party support

Interaction with other tools

  • Works with the Send To feature of many file browsers (e.g. macOS Finder, Microsoft Windows Explorer, GNOME Nautilus, KDE Konqueror, ROX Filer, etc.)
  • As it has a command line interface, Jacksum can be used in cron jobs and autostart environments
  • Jacksum returns an exit status that depends on the result of the calculation/verification process, so you can use Jacksum in scripts and batch files and control the flow of your own scripts
  • Use predefined compatibility files to read and write popular 3rd party format styles in order to interact with other tools (GNU/Linux, BSD, SFV, FCIV, openssl, etc.)

Developer support

  • The entire source code is open, hosted on GitHub, and accessible via git
  • The project is mavenized with a pom.xml, which makes it easy to work on it in your preferred IDE
  • Jacksum provides an API, so you can incorporate Jacksum in your own projects (see the sketch below)
  • Javadoc is available
  • Jacksum maintains compatibility with JDK 11, but takes advantage of the features of newer JDKs when they are available
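
A minimal usage sketch is shown below. The class and method names (net.jacksum.JacksumAPI.getChecksumInstance, AbstractChecksum.update and getValueFormatted) are written from memory and should be treated as assumptions; consult the published Javadoc for the authoritative API.

```java
// Sketch of using Jacksum as a library; verify the names against the Javadoc,
// they are assumptions and not guaranteed to match the current release.
import net.jacksum.JacksumAPI;
import net.jacksum.algorithms.AbstractChecksum;

import java.nio.charset.StandardCharsets;

public class ApiSketch {
    public static void main(String[] args) throws Exception {
        // Obtain a checksum instance by algorithm name (assumed factory method).
        AbstractChecksum checksum = JacksumAPI.getChecksumInstance("sha3-256");
        checksum.update("Hello, world!".getBytes(StandardCharsets.UTF_8));
        // Print the hash value in the instance's current output format (assumed method).
        System.out.println(checksum.getValueFormatted());
    }
}
```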

Algorithm selection

  • Select one, a few, many, or all algorithms for hash calculation, integrity verification or information gathering
  • Specify algorithms manually, or filter them by name or message digest width

Other

  • The program is mature and very stable
  • Specify your preferred level of verbosity
  • Obtain details for each algorithm, including comprehensive compatibility lists