Fast math functions - BelaPlatform/Bela GitHub Wiki
This material has been superseded. Visit learn.bela.io for the maintained version.
When programming in C++ on Bela, you have access to a library of fast, approximate math functions: `math_neon`.
This library aims to replace the accurate math functions implemented by `libmath` with faster, optimized, but less accurate ones.
To use it, add `#include <libraries/math_neon/math_neon.h>` at the top of your file. You can then call each function by appending `_neon` to the name of the corresponding `math.h` function. For example, `sinf(float)` becomes `sinf_neon(float)`.
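As a minimal sketch of the renaming rule above, the snippet below calls `sinf_neon` inside a sine oscillator. The helper `nextSineSample` and its parameters are hypothetical names chosen for illustration. The fallback definition of `sinf_neon` is also an assumption: it is not the real approximation and only lets the snippet compile on machines where the `math_neon` header is missing.

```cpp
#include <cmath>

// On Bela the header is found at the path given in the text. Elsewhere we
// fall back to the standard library so that the sketch still compiles.
#if defined(__has_include)
#  if __has_include(<libraries/math_neon/math_neon.h>)
#    include <libraries/math_neon/math_neon.h>
#    define HAVE_MATH_NEON 1
#  endif
#endif

#ifndef HAVE_MATH_NEON
// Stand-in (assumption): this is the accurate sine, NOT the fast approximation.
static inline float sinf_neon(float x) { return std::sin(x); }
#endif

// Hypothetical helper: returns one sample of a sine oscillator and advances
// the phase. The phase is kept within roughly [-pi, pi), which is the range
// covered for sinf_neon in the benchmark table.
float nextSineSample(float& phase, float frequency, float sampleRate)
{
    const float kPi = 3.14159265f;
    float out = sinf_neon(phase);                  // was: sinf(phase)
    phase += 2.0f * kPi * frequency / sampleRate;  // advance by one sample
    if (phase >= kPi)
        phase -= 2.0f * kPi;                       // wrap back into range
    return out;
}
```

Keeping the argument inside the benchmarked range is a design choice of this sketch: the error figures in the table were measured on that range only, so nothing here says how the function behaves outside it.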
Because these functions are approximations, they may not suit every purpose, and it is up to you to verify that they are accurate enough for your specific case. The table below lists the supported functions together with a summary benchmark.
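"Rate" is the speed-up of the "fast" version relative to the one in `libmath`: the `libmath` time divided by the fast version's time, so values above 1.00 mean the `_neon` function is faster.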
Function | Range | Number | ABS Max Err | REL Max Err(%) | RMS Err | Time | Rate |
---|---|---|---|---|---|---|---|
sinf | -3.14, 3.14 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 262041 | 1.00 |
sinf_neon | -3.14, 3.14 | 500000 | 0.00000083 | 0.85972977 | 0.00000041 | 66440 | 3.94 |
cosf | -3.14, 3.14 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 215722 | 1.00 |
cosf_neon | -3.14, 3.14 | 500000 | 0.00000083 | 0.67412275 | 0.00000042 | 77271 | 2.79 |
tanf | -0.79, 0.79 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 250169 | 1.00 |
tanf_neon | -0.79, 0.79 | 500000 | 0.00000191 | 0.00036218 | 0.00000067 | 97411 | 2.57 |
asinf | -1.00, 1.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 229272 | 1.00 |
asinf_neon | -1.00, 1.00 | 500000 | 0.00004661 | 0.00889725 | nan | 121673 | 1.88 |
acosf | -1.00, 1.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 277243 | 1.00 |
acosf_neon | -1.00, 1.00 | 500000 | 0.00004673 | 0.00634698 | nan | 123174 | 2.25 |
atanf | -1.00, 1.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 206506 | 1.00 |
atanf_neon | -1.00, 1.00 | 500000 | 0.00016665 | 0.02123393 | 0.00007395 | 89627 | 2.30 |
sinhf | -3.14, 3.14 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 313378 | 1.00 |
sinhf_neon | -3.14, 3.14 | 500000 | 0.00000191 | 0.15247960 | 0.00000019 | 75931 | 4.13 |
coshf | -3.14, 3.14 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 327628 | 1.00 |
coshf_neon | -3.14, 3.14 | 500000 | 0.00000191 | 0.00002146 | 0.00000017 | 75932 | 4.31 |
tanhf | -3.14, 3.14 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 322052 | 1.00 |
tanhf_neon | -3.14, 3.14 | 500000 | 0.00000024 | 0.24726489 | 0.00000005 | 74937 | 4.30 |
expf | 0.00, 10.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 288120 | 1.00 |
expf_neon | 0.00, 10.00 | 500000 | 0.00976562 | 0.00006549 | 0.00164461 | 51935 | 5.55 |
logf | 1.00, 1000.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 220097 | 1.00 |
logf_neon | 1.00, 1000.00 | 500000 | 0.00000763 | 0.01027698 | 0.00000107 | 52387 | 4.20 |
log10f | 1.00, 1000.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 273479 | 1.00 |
log10f_neon | 1.00, 1000.00 | 500000 | 0.00000334 | 0.00668141 | 0.00000048 | 52388 | 5.22 |
floorf | 1.00, 1000.00 | 5000000 | 0.00000000 | 0.00000000 | 0.00000000 | 566063 | 1.00 |
floorf_neon | 1.00, 1000.00 | 5000000 | 0.00000000 | 0.00000000 | 0.00000000 | 339748 | 1.67 |
ceilf | 1.00, 1000.00 | 5000000 | 0.00000000 | 0.00000000 | 0.00000000 | 617084 | 1.00 |
ceilf_neon | 1.00, 1000.00 | 5000000 | 0.00000000 | 0.00000000 | 0.00000000 | 339745 | 1.82 |
fabsf | 1.00, 1000.00 | 5000000 | 0.00000000 | 0.00000000 | 0.00000000 | 332227 | 1.00 |
fabsf_neon | 1.00, 1000.00 | 5000000 | 0.00000000 | 0.00000000 | 0.00000000 | 226497 | 1.47 |
sqrtf | 1.00, 1000.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 152492 | 1.00 |
sqrtf_neon | 1.00, 1000.00 | 500000 | 0.00000763 | 0.00002913 | 0.00000160 | 53386 | 2.86 |
invsqrtf | 1.00, 1000.00 | 500000 | 0.00000000 | 0.00000000 | 0.00000000 | 54986 | 1.00 |
invsqrtf_neon | 1.00, 1000.00 | 500000 | 0.00000012 | 0.00002124 | 0.00000000 | 38418 | 1.43 |
atan2f | 0.10, 10.00 | 10000 | 0.00000000 | 0.00000000 | 0.00000000 | 511718 | 1.00 |
atan2f_neon | 0.10, 10.00 | 10000 | 0.00016665 | 0.02124334 | 0.00000000 | 203125 | 2.52 |
powf | 1.00, 10.00 | 10000 | 0.00000000 | 0.00000000 | 0.00000000 | 1783204 | 1.00 |
powf_neon | 1.00, 10.00 | 10000 | 136192.00000000 | 0.00587880 | 0.00000000 | 201172 | 8.86 |
fmodf | 1.00, 10.00 | 10000 | 0.00000000 | 0.00000000 | 0.00000000 | 140625 | 1.00 |
fmodf_neon | 1.00, 10.00 | 10000 | 9.97472000 | 0.08058017 | 0.00000000 | 148437 | 0.95 |
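Since the text leaves it to you to check accuracy for your own use case, here is a rough sketch of how such a check could look. It sweeps an input range and reports three figures named after the table columns. The definitions used here are my reading of the column headers, not taken from the benchmark's source. `ErrorStats`, `measureError` and `taylorSin` are hypothetical names. `taylorSin` is only a stand-in approximation so that the snippet runs anywhere, and it is not how `math_neon` computes a sine.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

struct ErrorStats {
    double absMax;     // largest |approx - reference|
    double relMaxPct;  // largest relative error, in percent
    double rms;        // root-mean-square error
};

// Stand-in approximation (assumption): 7th-order Taylor polynomial of sin(x).
// On Bela you would pass sinf_neon (or another _neon function) instead.
inline float taylorSin(float x)
{
    const float x2 = x * x;
    return x * (1.0f + x2 * (-1.0f / 6.0f
              + x2 * (1.0f / 120.0f + x2 * (-1.0f / 5040.0f))));
}

// Evaluates both functions at `count` evenly spaced points in [lo, hi].
// Assumes count >= 2.
template <typename Approx, typename Ref>
ErrorStats measureError(Approx approx, Ref reference,
                        float lo, float hi, std::size_t count)
{
    ErrorStats s{0.0, 0.0, 0.0};
    double sumSq = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        const float x = lo + (hi - lo) * static_cast<float>(i)
                                       / static_cast<float>(count - 1);
        const double r = reference(x);
        const double e = std::fabs(static_cast<double>(approx(x)) - r);
        s.absMax = std::max(s.absMax, e);
        if (r != 0.0)  // relative error is undefined where the reference is 0
            s.relMaxPct = std::max(s.relMaxPct, 100.0 * e / std::fabs(r));
        sumSq += e * e;
    }
    s.rms = std::sqrt(sumSq / static_cast<double>(count));
    return s;
}
```

A call such as `measureError(taylorSin, [](float x) { return std::sin(x); }, -1.57f, 1.57f, 10001)` returns the three figures for the stand-in. Replacing the first argument with a `_neon` function and choosing the range your program actually uses gives numbers you can compare against your own tolerance.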