Parfu Data Blocking - ncsa/parfu_archive_tool GitHub Wiki

(NOTE as of May 2017: This document describes the parfu data file up through parfu 0.4. For versions 0.5 and later, file blocking has to follow tar conventions, so this page is largely irrelevant. It will probably be removed soon.

There is ONE blocking parameter in parfu 0.5.1 and later: how much data each rank handles at once. This is currently set to 4 MB. It may be useful to tune this parameter to match the local system, network, or file system. Right now it can only be changed by editing the main header file and rebuilding; it may become settable on the command line at some future point.)

A parfu container file (by convention it has the file extension .pfu) consists of the header (the catalog of everything contained in the file, including directories, symlinks, and files) followed by the contents of each stored file, in the order listed in the catalog. The header format is human-readable ASCII and is described in another wiki page.

Data storage in the container file

The contents of each target file are stored contiguously in the container file in turn. Each target file's contents sit in one or more "blocks". ("Block" here is a software notion within parfu and has nothing to do with network block sizes or disk blocks.) Blocks are sized dynamically depending on the size of the file being stored. Parfu has a minimum block size (min) and a maximum block size (max) for any invocation. For historical reasons, the block size in bytes is always a power of 2. The block size used to store a given file is the smallest power of 2 large enough to hold the file, no smaller than the minimum block size and no larger than the maximum block size. If the file has more bytes than will fit in a single block of size max, then that file's contents are stored in multiple blocks, all of which are the maximum block size. The block size is often stored as the exponent E, which corresponds to a block size of 2^E bytes.

Target files in the archive file typically have void space between them, unless their sizes happen to be exact multiples of their block size. This is done in an effort to improve I/O and network efficiency. A future version of parfu may attempt to eliminate this waste.

Block sizing is most easily seen with a concrete example. Let's say the minimum block size is 4096 bytes, or E=12. Max block size is 1 MB, or E=20.

  • Storing a 1kB file would use block size E=12, or 4096 bytes, the lowest allowed block size. All data is in one block. This wastes 3 kB to store this file.
  • Storing a 6kB file uses a block size of E=13, or 8192 bytes, so wastes about 2kB. All data is in one block.
  • Storing a 20kB file uses block size of E=15, or 32768 bytes, which wastes about 13kB. All data is in one block.
  • Storing an 800 kB file uses block size of E=20, 1 MB. It wastes about 200kB. All data is in one block.
  • Storing a 2.5 MB file uses a block size of E=20, 1 MB (the maximum allowed). The data takes up 3 blocks. The first two blocks are full; the third wastes about 0.5 MB.
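
The sizing rule in the examples above can be sketched in a few lines of Python. This is only an illustration of the rule as described on this page, not parfu's actual code; the function names are made up for the example, and it assumes 1 kB = 1024 bytes, so the exact waste figures differ slightly from the rounded values in the list.

```python
# Illustrative sketch only: function names are hypothetical, not parfu's.
KB = 1024          # assumption: 1 kB = 1024 bytes
MB = 1024 * KB


def block_exponent(file_size, min_exp=12, max_exp=20):
    """Return E so that 2**E is the block size used for this file."""
    exp = min_exp
    # Grow the block until the file fits or the maximum is reached.
    while exp < max_exp and (1 << exp) < file_size:
        exp += 1
    return exp


def block_layout(file_size, min_exp=12, max_exp=20):
    """Return (E, number of blocks, wasted bytes) for one stored file."""
    exp = block_exponent(file_size, min_exp, max_exp)
    block_size = 1 << exp
    # Ceiling division. An empty file yields zero blocks in this sketch;
    # how parfu treats empty files is not covered on this page.
    n_blocks = -(-file_size // block_size)
    wasted = n_blocks * block_size - file_size
    return exp, n_blocks, wasted


if __name__ == "__main__":
    for size in (1 * KB, 6 * KB, 20 * KB, 800 * KB, 5 * MB // 2):
        exp, n_blocks, wasted = block_layout(size)
        print(f"{size:>8} bytes: E={exp}, blocks={n_blocks}, wasted={wasted} bytes")
```

Running it with the defaults (E=12 minimum, E=20 maximum) gives the same exponents and block counts as the five cases listed above.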

The max block size is usually static and is generally not meant to be adjusted on the command line. It will probably be set to be roughly equivalent to a network or parallel file system buffer size. On Blue Waters, the Lustre package data payload size was believed to be 1 MB, so the default max block size for parfu is set to 1 MB.

The minimum block size might make a difference on some systems. The smaller the minimum block size, the less wasted space exists in the archive file per file stored. On the other hand, the minimum block size also sets how many files one rank of parfu handles if the targeted collection contains many very small files. If min block size is smaller, then less space is wasted, but potentially this increases the bottleneck for writing small files to the archive file. If min block size is larger, then more space is potentially wasted in the archive file, but there will be less data writing bottleneck for very small files.

The minimum and maximum block sizes determine the arrangement and the spacing of the file data in the archive file.
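
To make the effect on spacing concrete, here is a rough Python sketch that computes where each file's data would start. It rests on assumptions not confirmed on this page: offsets are measured from the start of the data region (the header is ignored), each file occupies a whole number of blocks of its own block size, and no extra alignment padding is inserted between files. The function names are hypothetical.

```python
# Illustrative sketch only: names and layout details are assumptions.
def block_size_for(file_size, min_exp=12, max_exp=20):
    """Smallest power-of-2 block that holds the file, clamped to [min, max]."""
    exp = min_exp
    while exp < max_exp and (1 << exp) < file_size:
        exp += 1
    return 1 << exp


def data_offsets(file_sizes, min_exp=12, max_exp=20):
    """Return (start offset of each file, total size of the data region)."""
    offsets = []
    position = 0  # relative to the first byte after the header (assumed)
    for size in file_sizes:
        block_size = block_size_for(size, min_exp, max_exp)
        n_blocks = -(-size // block_size)  # ceiling division
        offsets.append(position)
        # Assumption: the next file starts right after this file's last block.
        position += n_blocks * block_size
    return offsets, position


if __name__ == "__main__":
    sizes = [1024, 6 * 1024, 5 * 1024 * 1024 // 2]  # 1 kB, 6 kB, 2.5 MB
    print(data_offsets(sizes, min_exp=12))
    print(data_offsets(sizes, min_exp=10))
```

Lowering the minimum exponent from 12 to 10 moves the second and third files closer to the start, because the 1 kB file then occupies a 1024-byte block instead of a 4096-byte one.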

Per-rank data movement

The data from a single target file is moved by one rank until the size of the file passes a certain threshold. The amount of data handled by one rank of parfu is called a "file fragment". The file fragment size is a multiple of the block size, parameterized by an integer B. A typical value would be B=40, which with a 1 MB block size gives a file fragment size of 40 MB. The file fragment size parameter is probably fairly important to parfu's efficiency. If it's too small, then data transfers between parfu and the disk will be dominated by latency; if it's too big, then network transfers will suffer because they have to be divided into smaller transfers within the network library. Also, if the file fragment size is too large and there are only a few files, then only a few ranks of parfu are moving data and the rest are idle.
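
As a rough illustration of how a large file breaks into file fragments, the Python sketch below splits a file into pieces of B blocks each. It assumes the fragment size is B times the maximum block size (1 MB, E=20) and that fragments cover only the file's own bytes; how parfu actually assigns fragments to ranks is not described here, so the sketch does not model it. The function name is hypothetical.

```python
# Illustrative sketch only: name and details are assumptions, not parfu's code.
MB = 1024 * 1024


def file_fragments(file_size, blocks_per_fragment=40, block_exp=20):
    """Split a file into (offset, length) fragments of at most B blocks each."""
    fragment_size = blocks_per_fragment << block_exp  # B * 2**E bytes
    fragments = []
    offset = 0
    while offset < file_size:
        length = min(fragment_size, file_size - offset)
        fragments.append((offset, length))
        offset += length
    return fragments


if __name__ == "__main__":
    # A file below the threshold stays in one fragment (one rank moves it).
    print(len(file_fragments(30 * MB)))
    # A larger file is split, so several ranks could each move one fragment.
    for offset, length in file_fragments(100 * MB):
        print(offset // MB, length // MB)
```

With B=40, a 30 MB file is a single fragment, while a 100 MB file becomes three fragments of 40 MB, 40 MB, and 20 MB.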

Generally, the file fragment size for a given system should be as small as possible without losing efficiency because per-transfer latency is no longer amortized over enough data.

Adjustable parameters

Early prototype versions of parfu had E (the exponent for the minimum block size) and B (the number of blocks in the maximum file fragment size) adjustable on the command line. The proof-of-concept v0.4.0-alpha00 version of parfu does NOT have these parameters adjustable. One of the very next things we'll work on is adding those flags to the command line for testing.

(Block size is generally hard-coded in the source. We don't believe that makes a difference, but we'll make sure it's easy to find in the main header file in the source.)