XEasy File Format - CONNJUR/CONNJUR_widgets GitHub Wiki
This description is based on the 1994 Manual for XEASY (also attached below). The format described there is typically referred to as "version 1". It is unclear whether there are other versions or permutations of this spectral file format. If other versions are discovered, this page should be edited to reflect those differences.
XEASY spectral data is stored in two files: an ASCII file with metadata and a binary file with the numeric data (either 8-bit or 16-bit floating point). By XEASY convention the two files carry the suffixes .3D.param for the metadata and .3D.16 for the 16-bit binary data. A sample parameter file for a 2D 15N/1H spectrum follows.
Version ....................... 1
Number of dimensions .......... 2
16 or 8 bit file type ......... 16
Spectrometer frequency in w1 .. 60.752201
Spectrometer frequency in w2 .. 599.553223
Spectral sweep width in w1 .... 36.212997
Spectral sweep width in w2 .... 7.097000
Maximum chemical shift in w1 .. 135.606003
Maximum chemical shift in w2 .. 11.869000
Size of spectrum in w1 ........ 256
Size of spectrum in w2 ........ 1024
Submatrix size in w1 .......... 64
Submatrix size in w2 .......... 64
Permutation for w1 ............ 2
Permutation for w2 ............ 1
Folding in w1 ................. NO
Folding in w2 ................. NO
Type of spectrum .............. ?
Identifier for dimension w1 ... 15N
Identifier for dimension w2 ... 1H
The top three lines report the version number (presumably always "1"), the number of dimensions, and the binary precision. Next, for each dimension/axis, the file reports the spectrometer frequency (Sf0: i.e. referenced to 0 ppm), the sweep width, the maximum chemical shift (reference of the leftmost/first point), the number of points, the number of points in the submatrix, the permutation (i.e. slow, fast, faster), and the folding. Folding apparently takes one of three values: "NO", "RSH" (States) or "TPPI". The folding value says nothing about the ordering of the data; XEASY uses it to plot the positions of folded peaks, so it is important for the user to get right but does not affect the translation of the binary data itself. Finally, there are the "Type of spectrum", which is not used by the program and can therefore be set to anything, and arbitrary identifiers for the peak axes.
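The layout described above can be read with a few lines of Python. This is a minimal sketch, not the XEASY or CONNJUR parser: it assumes every line has the shape "label, run of two or more dots, value" seen in the sample, and the function names `parse_param` and `axis_table` and the dictionary keys are made up for illustration. Values are kept as strings until an axis table is built, because the data type constraints are not documented.

```python
import re

# ASSUMPTION: "<label> <two or more dots> <value>", as in the sample file.
_LINE = re.compile(r"^(?P<key>.+?)\s+\.{2,}\s+(?P<value>.*\S)\s*$")


def parse_param(text):
    """Return {label: value string} for every line matching the pattern."""
    params = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            params[match.group("key")] = match.group("value")
    return params


def axis_table(params):
    """Group the per-axis entries into one dict per dimension (w1, w2, ...)."""
    ndim = int(params["Number of dimensions"])
    axes = []
    for i in range(1, ndim + 1):
        w = f"w{i}"
        axes.append({
            "frequency": float(params[f"Spectrometer frequency in {w}"]),
            "sweep_width": float(params[f"Spectral sweep width in {w}"]),
            "max_shift": float(params[f"Maximum chemical shift in {w}"]),
            "size": int(params[f"Size of spectrum in {w}"]),
            "submatrix": int(params[f"Submatrix size in {w}"]),
            "permutation": int(params[f"Permutation for {w}"]),
            "folding": params[f"Folding in {w}"],      # "NO", "RSH" or "TPPI"
            "identifier": params[f"Identifier for dimension {w}"],
        })
    return axes
```

Calling `axis_table(parse_param(text))` on the sample file above would give two dictionaries, one for w1 and one for w2. Lines that do not match the pattern are skipped silently, which may or may not be the right behavior for malformed files.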
Apart from the enumeration of folding, it is not clear from the documentation what the data type constraints for these data are. This will be determined (loosely) by trial and error. (Essentially observing what values cause XEASY to choke.)
From the documentation (also attached below), XEASY supports 8-bit and 16-bit binary data. These data types appear to be non-standard floating point representations. In any case, there is little need ever to write the 8-bit format, as the 16-bit format has greater precision. 8-bit files are rare, so support for reading them will be added only if it is very simple or as a matter of last resort. MRG will concentrate on reading and writing the 16-bit format only.
The 16-bit numbers are ordered sequentially along the fast axis, and then staggered by slow and slower. The ordering of these indexes is given by the "permutation" parameters in the ASCII file. One final complexity: the data are arranged in sub-matrices (similar to Bruker), with the sub-matrix size given in the ASCII file.
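To make the sub-matrix arrangement concrete, here is a hedged sketch of how a point's position in the stream could be computed. The text does not spell out the exact tile order, so the sketch assumes that sub-matrices are stored one after another, that the fastest axis varies fastest both inside a sub-matrix and from one sub-matrix to the next, and that each spectrum size is a multiple of its sub-matrix size. The function name `submatrix_offset` is hypothetical, and the caller is expected to have already ordered the axes slowest first using the "Permutation" entries.

```python
from math import prod


def submatrix_offset(index, sizes, sub_sizes):
    """Position (in data points) of one point in a tiled layout.

    All arguments are tuples ordered slowest axis first, fastest axis last.
    ASSUMPTION: tiles are stored consecutively and the fastest axis varies
    fastest both within a tile and across tiles.
    """
    if any(n % m for n, m in zip(sizes, sub_sizes)):
        raise ValueError("each size must be a multiple of its sub-matrix size")
    tile_pos = 0    # which tile, counted in storage order
    inner_pos = 0   # position inside that tile
    for i, n, m in zip(index, sizes, sub_sizes):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        tile_pos = tile_pos * (n // m) + i // m
        inner_pos = inner_pos * m + i % m
    return tile_pos * prod(sub_sizes) + inner_pos
```

For the 16-bit format the byte offset would be twice the returned value, since each point occupies two bytes. If real files turn out to use a different tile order, only the accumulation inside the loop needs to change.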
Image(Xeasy_ladder.jpg, right, title="Illustration of XEASY 16-bit float ladder")
It appears that the XEASY proprietary binary encoding is straightforward. For 8-bit "floats", the binary stream is actually a stream of 8-bit signed integers (int8_t in C, byte in Java) spanning 0 to 95.
These correspond essentially to a ladder of exponents to the base √2: i.e. √2^L^, where L is an integer exponent. The sign of the float also needs to be encoded, which is done by splitting the ladder into two: one ladder for positive numbers and another for negative numbers. While one might naively think that XEASY would therefore have a bandwidth of 128 steps on each ladder, it actually uses only 48 steps on each (codes 0 to 47 and 48 to 95).
The range of floating point numbers (s,,k,,) supported by 8-bit XEASY is: -√2^47^ ≤ s,,k,, ≤ √2^46^. The range of integer exponents (L) on the √2^L^ is from -1 to 46 for positive floats, 0 to 47 for negative floats. The exponents of positive floats are encoded as 0 ≤ e,,k,, ≤ 47; the exponents of negative floats are encoded as 48 ≤ e,,k,, ≤ 95. Therefore, the total range of the 8-bit signed integers in the stream is 0 ≤ e,,k,, ≤ 95.
As diagrammed with the red arrows, the numerical encoding procedure is to find the integer L for which √2^L^ is closest to the magnitude of a given float s,,k,, (L ≈ log,,√2,,|s,,k,,|). Next, the sign of the float (positive or negative) is encoded together with the exponent to generate e,,k,,.
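The 8-bit ladder can be sketched in a few lines of Python. The text gives the ranges of L and e,,k,, for each sign but not the exact offsets, so this sketch assumes the simplest mapping consistent with those ranges: e = L + 1 on the positive ladder and e = L + 48 on the negative ladder. "Closest" is taken in the logarithmic sense (rounding log,,√2,,|s|), out-of-range values are clamped to the end of the ladder, and zero is rejected because the text does not say how it is stored. The names `decode8` and `encode8` are illustrative only.

```python
import math

SQRT2 = math.sqrt(2.0)


def decode8(e):
    """Map an 8-bit code e (0..95) to sign * sqrt(2)**L."""
    if not 0 <= e <= 95:
        raise ValueError("code outside 0..95")
    if e <= 47:                       # positive ladder, ASSUMED L = e - 1
        return SQRT2 ** (e - 1)
    return -(SQRT2 ** (e - 48))       # negative ladder, ASSUMED L = e - 48


def encode8(s):
    """Map a float to the code of the nearest rung (nearest in log space)."""
    if s == 0:
        raise ValueError("zero has no rung on the ladder described in the text")
    L = round(2.0 * math.log2(abs(s)))    # log base sqrt(2) equals 2 * log2
    if s > 0:
        L = min(max(L, -1), 46)           # clamp to the positive range of L
        return L + 1
    L = min(max(L, 0), 47)                # clamp to the negative range of L
    return L + 48
```

Under these assumptions every code survives a round trip, i.e. `encode8(decode8(e)) == e` for all e from 0 to 95, which is a cheap sanity check to run against real data once the offsets have been confirmed.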
16-bit is a fairly simple extension to the above 8-bit encoding. A second, signed 8-bit integer precedes the "exponent." This second 8-bit integer can be translated into a mantissa a,,k,,, such that a,,k,, = (721|s,,k,,|/√2^L^) - 615. Conversely, s,,k,, can be determined from both a,,k,, and L (which is determined from e,,k,,), such that s,,k,, = (a,,k,,+615)√2^L^/721. The sign of s,,k,, is determined from e,,k,, as above; if e,,k,,>47, s,,k,, is negative. The mantissa is illustrated as the secondary blue ladder, within the original black ladder of L's.
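The two formulas above translate directly into code. The following sketch takes the two integers as already read from the stream (mantissa byte first, exponent byte second, per the text) and leaves open how the mantissa byte is interpreted when read. The offsets between e,,k,, and L are the same assumption as for the 8-bit case (L = e - 1 for e ≤ 47, L = e - 48 otherwise), and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def exponent_and_sign(e):
    """Split an exponent code into (L, sign). The offsets are an ASSUMPTION."""
    if not 0 <= e <= 95:
        raise ValueError("exponent code outside 0..95")
    return (e - 1, 1.0) if e <= 47 else (e - 48, -1.0)


def decode16(a, e):
    """s = sign * (a + 615) * sqrt(2)**L / 721, as given in the text."""
    L, sign = exponent_and_sign(e)
    return sign * (a + 615) * SQRT2 ** L / 721


def mantissa(s, L):
    """a = 721 * |s| / sqrt(2)**L - 615, before rounding to an integer."""
    return 721 * abs(s) / SQRT2 ** L - 615
```

As a consistency check, feeding the result of `decode16(a, e)` back through `mantissa` with the matching L should reproduce a after rounding. The sketch deliberately stops short of a full encoder: choosing L so that the mantissa fits in one byte is not pinned down by the formulas alone.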
The strategy for the XEASY translator will be to read the two 8-bit integers and mathematically convert them to the 32-bit floating point number required by CONNJUR-ST. This number will be stored in the common memory model for translation. For writing, the memory location will be read and then scaled appropriately to within the range of ±(√2)^47^ (approximately ±1.2 × 10^7^). After scaling, the number will be reconverted to the two 8-bit integers and written to file.
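The scaling step before writing could look like the sketch below. How the scale is chosen is not specified here, so the sketch assumes a single linear factor applied to the whole dataset such that the largest magnitude lands exactly on the (√2)^47^ bound; the name `scale_to_limit` and the choice to return the factor alongside the data are illustrative.

```python
import math

LIMIT = math.sqrt(2.0) ** 47    # about 1.19e7, the bound quoted in the text


def scale_to_limit(values, limit=LIMIT):
    """Return (scaled_values, factor) with the largest |value| equal to limit.

    ASSUMPTION: one linear factor for the whole dataset. An empty or all-zero
    input is returned unchanged with a factor of 1.0.
    """
    peak = max((abs(v) for v in values), default=0.0)
    if peak == 0.0:
        return list(values), 1.0
    factor = limit / peak
    return [v * factor for v in values], factor
```

Returning the factor lets a caller record it or undo the scaling later. Note that the positive ladder described earlier tops out one rung below the negative one, so a stricter version might need a smaller limit for positive peaks.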
__Unknown issue:__ Inspection of the data shows that some e,,k,,'s are zero in XEASY datasets. This is puzzling because (a) it implies L can be -1, in which case e,,k,, could be 96, which is not observed, and (b) if e,,k,, cannot be 96, then it is not clear whether e,,k,,=0 should be positive or negative (i.e., if it is positive, the positive ladder has one additional value). At the current time, my interpretation of the zero value for e,,k,, is that L=-1 is supported for positive values, as reflected in the accompanying illustration.