XDR floating point numbers - rhjdjong/xdrlib2 GitHub Wiki

Introduction

The XDR standard requires support for single-precision, double-precision, and quadruple-precision floating point representations, i.e. the interchange formats binary32, binary64, and binary128 as defined in IEEE 754. This poses some challenges for the xdrlib2 implementation. For one thing, most PCs do not support quadruple-precision floating points natively. But there are other (potential) issues: the handling of floating point numbers in Python depends on the underlying machine architecture and the C or Java implementation in which Python is written. This means that Python offers no guarantee as to which floating point precisions are supported. A hypothetical Python implementation that only supports floating point numbers in the binary16 format (see IEEE 754) would still comply with the Python language as defined in the Python Language Reference.

Given the above considerations, it is not feasible to simply treat XDR floating point numbers as if they were Python floats.

On the other hand, the XDR protocol is intended for data transfer, not for high-precision calculations. That means that, as far as the xdrlib2 library is concerned, only the creation, encoding, and decoding operations need to support the precision required by XDR floating point types. It is up to the user of the XDR protocol to provide a high-precision numerical library (e.g. gmpy) if they need to perform calculations with instances of the XDR float types (in particular the quadruple-precision type).

Most use cases for XDR data involve generating data, transferring data, and interpreting received data. Typical scenarios for interpreting received floating point data are: determining the sign, comparing to a pre-defined threshold, comparing with an earlier received data value, and so on.

For the above reasons I have decided to make the XDR floating point number types float subclasses, with additional attributes and methods that ensure that creating, comparing, encoding, and decoding these values works for all precisions.

Theoretical basis

The following is extracted from IEEE 754. For more information on IEEE 754 see e.g. the IEEE 754 page on Wikipedia. Since XDR is primarily about the representation of data, this section focuses on the interchange formats, not on the in-memory encoding of floating point values or the operations that can be performed on them.

A floating point format (i.e. a set of numbers and symbols) contains

  • finite numbers
  • plus and minus infinity
  • NaN values

The XDR standard uses a subset of the full IEEE 754 interchange formats, namely the binary32, binary64, and binary128 formats. Each interchange format uses a packed sequence of three non-negative integers signbit (s), exponent (e), and fraction (f) to represent numbers. An interchange format has a fixed number of bits, therefore the number of bits available for the representation of these integers is limited. The signbit always uses 1 bit: 0 indicates a positive value, 1 indicates a negative value. The number of bits used for e is indicated with E, and the number of bits used for f is indicated with F. Different representation formats have different values for E and F.

| format | size | E | F | bias | q_min | q_max | v_min | v_max |
|--------|------|---|---|------|-------|-------|-------|-------|
| binary32 | 32 | 8 | 23 | 127 | βˆ’126 | 127 | 2^βˆ’149 | (2 βˆ’ 2^βˆ’23) Γ— 2^127 |
| binary64 | 64 | 11 | 52 | 1023 | βˆ’1022 | 1023 | 2^βˆ’1074 | (2 βˆ’ 2^βˆ’52) Γ— 2^1023 |
| binary128 | 128 | 15 | 112 | 16383 | βˆ’16382 | 16383 | 2^βˆ’16494 | (2 βˆ’ 2^βˆ’112) Γ— 2^16383 |
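The derived columns follow from E and F alone. A quick consistency check in plain Python (the FORMATS dict is illustrative, not part of xdrlib2):

```python
from fractions import Fraction

# Interchange format parameters: name -> (total size in bits, E, F)
FORMATS = {"binary32": (32, 8, 23),
           "binary64": (64, 11, 52),
           "binary128": (128, 15, 112)}

for name, (size, E, F) in FORMATS.items():
    assert size == 1 + E + F                      # sign bit + exponent + fraction
    bias = 2 ** (E - 1) - 1                       # exponent bias b
    qmin = 1 - bias                               # smallest normal exponent
    qmax = bias                                   # largest normal exponent
    vmin = Fraction(1, 2 ** (bias - 1 + F))       # smallest subnormal: 2^(1-b-F)
    vmax = (2 - Fraction(1, 2 ** F)) * 2 ** qmax  # largest finite: (2-2^-F)*2^qmax
    print(name, bias, qmin, qmax)
```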

In the interchange representations, the minimum and maximum possible values of e (0 and 2^E βˆ’ 1) are used to indicate special numbers. All other values 1 ≀ e < 2^E βˆ’ 1 are used to represent normal numbers.

Any normal number v that is representable in a floating point format can be described by three values: a signbit (s) with value 0 (for positive values) or 1 (for negative values), a coefficient (c), and an integer exponent (q), such that v = (βˆ’1)^s Γ— c Γ— 2^q, and 1 ≀ c < 2.

To allow the representation of negative exponents, the value of e (which is always a non-negative number) is assumed to have a bias b so that: q = e βˆ’ b, with b = 2^(Eβˆ’1) βˆ’ 1, i.e. a leading 0-bit followed by Eβˆ’1 1-bits.
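The claim about the bias bit pattern is easy to verify in plain Python:

```python
# The bias for an E-bit exponent field is 2**(E-1) - 1: written as an
# E-bit binary number this is a single 0-bit followed by E-1 1-bits.
for E in (8, 11, 15):               # binary32, binary64, binary128
    bias = 2 ** (E - 1) - 1
    print(E, bias, format(bias, f"0{E}b"))
```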

To maximize the precision of the coefficient, the most significant bit of c (which is always 1) is not represented in the interchange format; only the bits that form the binary fraction are encoded in f. The integer value f therefore has an implied factor of 2^βˆ’F. This means that c = 1 + f Γ— 2^βˆ’F.

The smallest representable q value is q_min = 1 βˆ’ b. Numbers v with an absolute value less than 2^(1βˆ’b) cannot be represented in the way described above. Instead, such subnormal numbers are represented with e = 0, and their value is v = (βˆ’1)^s Γ— f Γ— 2^(1βˆ’bβˆ’F). The minimal absolute value greater than 0 that can be represented in this way is v_min = 2^(1βˆ’bβˆ’F). Any value with a smaller absolute value is represented with e and f equal to 0, which is equivalent to Β±0.0, depending on the value of the signbit.
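These limits can be observed directly with Python's own floats, which are binary64 on common platforms (a quick check, not xdrlib2 code):

```python
import sys

# binary64: b = 1023, F = 52
b, F = 1023, 52
# Smallest positive subnormal: 2**(1 - b - F) = 2**-1074
assert 2.0 ** (1 - b - F) > 0.0
# Halving it falls below v_min and rounds to 0.0
assert 2.0 ** (1 - b - F) / 2.0 == 0.0
# The smallest positive *normal* value is 2**(1 - b), reported by sys.float_info
assert sys.float_info.min == 2.0 ** (1 - b)
print(2.0 ** (1 - b - F))   # -> 5e-324
```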

Similarly, the largest representable value v_max has e = 2^E βˆ’ 2 and f = 2^F βˆ’ 1, corresponding with v_max = (2 βˆ’ 2^βˆ’F) Γ— 2^b. Numbers with an absolute value |v| > v_max cannot be represented as normal numbers. Such values are equivalent to Β±infinity, and are represented with e = 2^E βˆ’ 1 and f = 0.

Finally, each format has representations for Not-a-Number (NaN) values. Such representations have e = 2^E βˆ’ 1, and f β‰  0. Signaling NaNs have 0 < f < 2^(Fβˆ’1), meaning that they have a leading 0-bit in f. Quiet NaNs have 2^(Fβˆ’1) ≀ f < 2^F, meaning that they have a leading 1-bit in f. The other bits in f act as an opaque payload and have no significance for the floating point format.
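These encoding rules can be illustrated by unpacking a binary32 with Python's struct module and classifying the result (a standalone sketch, independent of xdrlib2; the function name is mine):

```python
import struct

def decode_binary32(data: bytes):
    """Unpack a big-endian binary32 into (signbit, e, f) and classify it."""
    (bits,) = struct.unpack(">I", data)   # 4 bytes -> unsigned 32-bit integer
    E, F = 8, 23
    s = bits >> (E + F)                   # 1 sign bit
    e = (bits >> F) & (2 ** E - 1)        # E exponent bits
    f = bits & (2 ** F - 1)               # F fraction bits
    if e == 2 ** E - 1:                   # maximum e: infinity or NaN
        kind = "infinity" if f == 0 else ("quiet NaN" if f >> (F - 1) else "signaling NaN")
    elif e == 0:                          # minimum e: zero or subnormal
        kind = "zero" if f == 0 else "subnormal"
    else:
        kind = "normal"
    return s, e, f, kind

# 1.0 encodes as s=0, e=bias=127, f=0
print(decode_binary32(struct.pack(">f", 1.0)))   # -> (0, 127, 0, 'normal')
```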

Constructing an XDR floating point value

In the xdrlib2 module, the XDR floating point types are subclasses of Python's float. All Python float instantiation methods must therefore also apply to XDR floating point types. That means they can be instantiated with either a number (integer or float) or a unicode or byte string as argument. The challenge is then to calculate the values of s, e, and f without using floating point calculations, because that would almost inevitably lead to rounding errors.

The key insight here is that the argument used to instantiate an XDR float type can always be represented as the ratio of two integers numerator (n) and denominator (d):

  • If the argument a is an integer, then nβ€―=β€―a and dβ€―=β€―1;
  • If the argument a is a float, then n and d are the result of a.as_integer_ratio();
  • If the argument a is a string, then a can be normalized by shifting the decimal point such that the exponent becomes 0 and the string takes the form <i>.<d>. Then n = 10^D Γ— <i> + <d>, and the denominator d = 10^D, with D equal to the number of digits in the decimal part <d>.
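The three cases above can be sketched as follows (the helper name is mine; exponent notation in strings and byte string arguments are omitted for brevity):

```python
def as_integer_ratio(a):
    """Represent the constructor argument exactly as a ratio n/d of integers.

    Sketch only: handles int, float, and simple decimal strings like '-12.34'.
    """
    if isinstance(a, int):
        return a, 1                       # n = a, d = 1
    if isinstance(a, float):
        return a.as_integer_ratio()       # exact: floats are binary fractions
    if isinstance(a, str):
        sign = -1 if a.startswith("-") else 1
        a = a.lstrip("+-")
        i, _, dec = a.partition(".")      # '<i>.<d>' with the point shifted out
        D = len(dec)                      # number of decimal digits
        n = 10 ** D * int(i or "0") + int(dec or "0")
        return sign * n, 10 ** D
    raise TypeError(f"unsupported argument type: {type(a).__name__}")

print(as_integer_ratio("0.1"))    # -> (1, 10)
```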

Assuming a positive value, so positive integers n and d, we must now find values for f and q such that (n/d) = (1 + f Γ— 2^βˆ’F) Γ— 2^q, with an integer value for q and 1 ≀ (1 + f Γ— 2^βˆ’F) < 2.

The above inequality can be written as 1 ≀ (n/d) Γ— 2^βˆ’q < 2. Taking the log2 of all parts we get 0 ≀ log2 n βˆ’ log2 d βˆ’ q < 1. Pretending for now that we have an unlimited value range for q, we can see that the integer value for q is given by q = ⌊log2 n βˆ’ log2 dβŒ‹.

Since we don't want to use floating point calculations, we approximate the log2 values by taking the bit_length() of the values. The bit_length() is the smallest integer strictly greater than the log2 value, and the log2 value is greater than or equal to bit_length() - 1 (because the value must begin with a 1-bit). In other words (writing bitsβ€―v for v.bit_length()), bitsβ€―v βˆ’ 1 β‰€ log2β€―v < bitsβ€―v.

Isolating q and applying the bit_length() bounds, we get:

  • q ≀ log2 n βˆ’ log2 d < bits n βˆ’ bits d + 1
  • q > log2 n βˆ’ log2 d βˆ’ 1 > bits n βˆ’ bits d βˆ’ 2

Writing β„“ for bits n βˆ’ bits d (avoiding the symbol b, which already denotes the exponent bias), the first inequality gives q ≀ β„“ and the second gives q β‰₯ β„“ βˆ’ 1, so either q = β„“ or q = β„“ βˆ’ 1. It is now simply a matter of programmatically checking for which of the two candidates the original condition is true.
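The two-candidate check can be sketched in pure integer arithmetic (the function name is illustrative, not the xdrlib2 API):

```python
def exponent(n: int, d: int) -> int:
    """Find the integer q with 1 <= (n/d) * 2**-q < 2, for positive n, d.

    bit_length() brackets log2, leaving two candidates for q; test the
    larger one against the lower bound 2**q <= n/d.
    """
    assert n > 0 and d > 0
    q = n.bit_length() - d.bit_length()   # candidate; the other is q - 1
    # Condition 2**q <= n/d, written without fractions: d * 2**q <= n.
    if q >= 0:
        ok = d << q <= n
    else:
        ok = d <= n << -q
    return q if ok else q - 1

print(exponent(1, 10))   # 0.1 -> q = -4, since 2**-4 <= 1/10 < 2**-3
```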

Having the value of q available allows us to calculate f. But because we only have F bits available for the integer f, there will in general not be an exact solution. Instead we calculate the real value g that is the exact solution of this equation, and then determine the integer f that best approximates g.

Starting from (n/d) = (1 + g Γ— 2^βˆ’F) Γ— 2^q, we can see that g = (n Γ— 2^(Fβˆ’q) βˆ’ d Γ— 2^F) / d. This is fine for the case where q ≀ F. But since we want to avoid floating point operations, for the case where q > F we write this as: g = (n βˆ’ d Γ— 2^q) / (d Γ— 2^(qβˆ’F)). In either case we can express g as the ratio of two integer values m and p, and the desired integer f is the rounded value of g, i.e. m // p if the remainder is less than half of p, and (m // p) + 1 otherwise. Rounding up can cause f to overflow to 2^F; in that case the significand has become exactly 2, i.e. 1 Γ— 2^(q+1), so we must set f to 0 and increase q by 1.
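The calculation of f, including the rounding and the overflow handling, might look as follows (a sketch; the name fraction_bits is mine, and it rounds half up, whereas IEEE 754's default is round-half-to-even):

```python
def fraction_bits(n: int, d: int, q: int, F: int):
    """Compute f (and a possibly adjusted q) so that
    (n/d) ~= (1 + f * 2**-F) * 2**q, using integer arithmetic only."""
    if q <= F:
        m, p = n * 2 ** (F - q) - d * 2 ** F, d
    else:
        m, p = n - d * 2 ** q, d * 2 ** (q - F)
    f, rem = divmod(m, p)
    if 2 * rem >= p:          # round half up (simplification; IEEE 754
        f += 1                # prescribes round-half-to-even by default)
    if f == 2 ** F:           # rounding overflowed the F fraction bits:
        f = 0                 # the significand became 2, i.e. 1 x 2**(q+1)
        q += 1
    return f, q

# 1/10 with q = -4 gives the binary32 fraction bits of 0.1
print(fraction_bits(1, 10, -4, 23))   # -> (5033165, -4)
```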

Now we must face the fact that q is not really unlimited in range.

  • If 1 βˆ’ b ≀ q ≀ b, with b = 2^(Eβˆ’1) βˆ’ 1, we have a normal situation. In that case we have f with the value calculated above, and e = q + b.

  • If q > b, i.e. q β‰₯ 2^(Eβˆ’1), we have an overflow situation. In that case we must use the representation for an infinite number, i.e. e = 2^E βˆ’ 1 and f = 0.

  • If q < 1 βˆ’ b, we have an underflow situation. In that case we must rewrite the number as (n/d) = g Γ— 2^(1βˆ’bβˆ’F), with b = 2^(Eβˆ’1) βˆ’ 1, or g = (n Γ— 2^(F+bβˆ’1)) / d. The exponent e is 0 in this case, and we calculate f from these two integer values as before. If f overflows due to rounding up, we must in this case set e to 1 and f to 0, because of the implied leading 1-bit for normal values.
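Putting the pieces together, the whole construction can be sketched as a pure-integer binary32 encoder, cross-checked against struct.pack (an illustrative sketch, not the actual xdrlib2 API; the function name and the round-half-up tie-breaking are my assumptions):

```python
import struct

E, F = 8, 23                      # binary32 parameters
BIAS = 2 ** (E - 1) - 1           # b = 127

def encode_binary32(n: int, d: int) -> bytes:
    """Encode the exact ratio n/d (d > 0) as a big-endian binary32,
    using integer arithmetic only."""
    s = 1 if n < 0 else 0
    n = abs(n)
    if n == 0:
        return struct.pack(">I", s << (E + F))        # signed zero
    # q such that 1 <= (n/d) * 2**-q < 2, via bit_length (see above)
    q = n.bit_length() - d.bit_length()
    if not (d << q <= n if q >= 0 else d <= n << -q):
        q -= 1
    if q > BIAS:                              # overflow -> infinity
        e, f = 2 ** E - 1, 0
    else:
        if q < 1 - BIAS:                      # underflow -> subnormal
            e = 0
            m, p = n * 2 ** (F + BIAS - 1), d # g = n * 2**(F+b-1) / d
        elif q <= F:
            e = q + BIAS
            m, p = n * 2 ** (F - q) - d * 2 ** F, d
        else:
            e = q + BIAS
            m, p = n - d * 2 ** q, d * 2 ** (q - F)
        f, rem = divmod(m, p)
        if 2 * rem >= p:                      # round half up
            f += 1
        if f == 2 ** F:                       # rounding overflowed f
            e, f = e + 1, 0
    return struct.pack(">I", (s << (E + F)) | (e << F) | f)

print(encode_binary32(1, 10).hex())    # matches struct.pack('>f', 0.1)
```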
