IEEE 754 floating point on Bela - BelaPlatform/Bela GitHub Wiki

This material has been superseded. Visit learn.bela.io for the maintained version.

On Bela, there are two floating-point units. The VFP handles scalar operations on both single- and double-precision floats, while the NEON handles SIMD instructions (up to four elements per vector) in single precision only. The VFP is fully IEEE 754-compliant, but on the Cortex-A8 CPU it is much slower than NEON, even for non-vector operations. The NEON unit, on the other hand, is not fully IEEE 754-compliant, in that denormals are flushed to zero (FTZ). The VFP has a similar non-IEEE 754-compliant mode that can be enabled in software and that somewhat improves its performance, though it will never be on par with NEON. It is ultimately up to the compiler to decide whether to generate VFP or NEON instructions.

FTZ is common practice in many DSP programs (e.g. PureData and Gen~ implement a software flush to zero, while SuperCollider tries to enable hardware support for it), and most of the time it does not affect the numerical result to a noticeable extent, while improving performance. We decided to enable FTZ on the VFP so that non-NEON code performs a bit better than it otherwise would. This is done in the background by calling enable_runfast() from the math_neon library, and it applies to all the threads created by the main Bela program, including the audio thread and all the auxiliary tasks.
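To make the idea of a software flush to zero concrete, here is a minimal sketch in standard C++. It is not taken from PureData, Gen~ or the Bela codebase; the helper name flushDenormalToZero is hypothetical and only illustrates the general technique of replacing a denormal value with zero.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper, NOT part of the Bela API or of math_neon.
// Returns 0 if x is a denormal (subnormal) float, otherwise returns x unchanged.
inline float flushDenormalToZero(float x)
{
    // FP_SUBNORMAL means: nonzero, but smaller in magnitude than the
    // smallest normal float (std::numeric_limits<float>::min()).
    return std::fpclassify(x) == FP_SUBNORMAL ? 0.0f : x;
}
```

A typical use would be to apply such a helper to the state variable of a recursive filter once per sample, so that a decaying tail settles at exactly zero instead of lingering in the denormal range. Note that this check only behaves as described when hardware FTZ is not already active for the code that runs it.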

If your code suffers from having FTZ enabled, you have a few options:

  • disable it globally by calling disable_runfast() from the setup() function.
  • disable it per-thread by calling disable_runfast() from an AuxiliaryTask.
  • use double-precision types in your code: double has a much greater exponent range, so any denormals that do occur are presumably so small that they would make no difference to audio, even within an IIR filter.
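The point about the exponent range of double can be checked with a small standard C++ sketch. The smallest normal float is about 1.18e-38, while the smallest normal double is about 2.23e-308, so a value that is denormal as a float is usually an ordinary normal number as a double. The helper names below are hypothetical and exist only for illustration.

```cpp
#include <cmath>

// Hypothetical helpers for illustration only, NOT part of the Bela API.

// True if v, once converted to single precision, is a denormal float.
inline bool isDenormalAsFloat(double v)
{
    return std::fpclassify(static_cast<float>(v)) == FP_SUBNORMAL;
}

// True if v is a denormal when kept in double precision.
inline bool isDenormalAsDouble(double v)
{
    return std::fpclassify(v) == FP_SUBNORMAL;
}
```

For example, 1e-40 is denormal as a float but normal as a double. A double only becomes denormal below roughly 2.23e-308, far beyond anything that matters for audio.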

Caveat: to make sure FTZ does not take place, you must call disable_runfast() AND use only the VFP for the critical part of the code because, as mentioned above, NEON always flushes to zero. To make sure VFP instructions are being used for your code, you can combine the following:

  • compile your FTZ-critical code with CPPFLAGS=-mfpu=vfp, thus disabling NEON.
  • use double-precision types. These will always generate VFP instructions.
  • check the generated assembly (the ultimate answer to everything) and make sure there are no NEON instructions in your FTZ-critical code.