Benchmarks 2024 11 21 TFLM LLVM Os spike_rv32_min - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

  • Spike (riscv-isa-sim) (ISS, CPI=1)
    • Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
    • Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Toolchains

Models

Frameworks

  • MLonMCU: develop

  • TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130

Miscellaneous

  • All programs were compiled with the -Os flag.
  • Benchmarks were generated with the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -Os, Target: spike_rv32_min)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 39232458 (0.4x) | 198384 (0.854) | 35992 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 33270646 (0.5x) | 203752 (0.877) | 35992 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 31477974 (0.5x) | 203752 (0.877) | 35992 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 30581638 (0.5x) | 203752 (0.877) | 35992 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 30133470 (0.5x) | 203752 (0.877) | 35992 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 30136859 (0.5x) | 203752 (0.877) | 35992 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 30136859 (0.5x) | 203752 (0.877) | 35992 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15024007 (Base) | 232312 (Base) | 35992 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 14912371 (1.0x) | 231124 (0.995) | 35992 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 6082522 (2.5x) | 242064 (1.042) | 35992 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 5075860 (3.0x) | 242064 (1.042) | 35992 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 4572628 (3.3x) | 242064 (1.042) | 35992 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 4283313 (3.5x) | 242064 (1.042) | 35992 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 4283313 (3.5x) | 242064 (1.042) | 35992 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 4290091 (3.5x) | 242064 (1.042) | 35992 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 4098330 (3.7x) | 233004 (1.003) | 35992 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 2843352 (5.3x) | 233004 (1.003) | 35992 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 2166034 (6.9x) | 233004 (1.003) | 35992 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 2125046 (7.1x) | 233004 (1.003) | 35992 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 2128435 (7.1x) | 233004 (1.003) | 35992 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 2131824 (7.0x) | 233004 (1.003) | 35992 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 6839842 (2.2x) | 240704 (1.036) | 35992 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 5872428 (2.6x) | 240704 (1.036) | 35992 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 5388820 (2.8x) | 240704 (1.036) | 35992 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 5119237 (2.9x) | 240704 (1.036) | 35992 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 5119237 (2.9x) | 240704 (1.036) | 35992 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 5126015 (2.9x) | 240704 (1.036) | 35992 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
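
The parenthesized values in the table above appear to be ratios against the row marked Base (muRISCV-NN, Scalar, RV32IM): the speedup matches baseline cycles divided by the row's cycles, and the rel. value matches the row's bytes divided by the baseline bytes. The following is a minimal Python sketch of that reading, using figures copied from the aww table. The function names are illustrative and are not part of MLonMCU.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Speedup over the baseline: higher means fewer cycles than Base."""
    return round(base_cycles / cycles, 1)


def relative(base_bytes: int, value_bytes: int) -> float:
    """Memory footprint relative to the baseline: above 1.0 means larger than Base."""
    return round(value_bytes / base_bytes, 3)


# Baseline row of the aww table (muRISCV-NN, Scalar, RV32IM).
AWW_BASE_CYCLES = 15024007
AWW_BASE_ROM = 232312

# muRISCV-NN Vector kernels at VLEN=1024, and the TFLM reference on RV32IM.
vector_speedup = speedup(AWW_BASE_CYCLES, 2125046)
vector_rom = relative(AWW_BASE_ROM, 233004)
tflm_speedup = speedup(AWW_BASE_CYCLES, 39232458)
tflm_rom = relative(AWW_BASE_ROM, 198384)

print(f"Vector @1024: {vector_speedup}x, ROM {vector_rom}")
print(f"TFLM RV32IM:  {tflm_speedup}x, ROM {tflm_rom}")
```

Under this reading, a speedup below 1.0x means the configuration needs more cycles than the scalar muRISCV-NN baseline, and a rel. value below 1.0 means it occupies less memory.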

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 120983358 (0.5x) | 241288 (0.914) | 68760 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 56488584 (1.0x) | 246380 (0.934) | 68760 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 46731928 (1.2x) | 246380 (0.934) | 68760 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 44385024 (1.3x) | 246380 (0.934) | 68760 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 43951220 (1.3x) | 246380 (0.934) | 68760 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 43954609 (1.3x) | 246380 (0.934) | 68760 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 43954609 (1.3x) | 246380 (0.934) | 68760 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 56260946 (Base) | 263904 (Base) | 68760 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 72452688 (0.8x) | 263040 (0.997) | 68760 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 26482258 (2.1x) | 273964 (1.038) | 68760 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 18420860 (3.1x) | 273964 (1.038) | 68760 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 14642164 (3.8x) | 273964 (1.038) | 68760 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 12924141 (4.4x) | 273964 (1.038) | 68760 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 12084461 (4.7x) | 273964 (1.038) | 68760 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 11461479 (4.9x) | 273964 (1.038) | 68760 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15265177 (3.7x) | 265700 (1.007) | 68760 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 9682429 (5.8x) | 265700 (1.007) | 68760 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 7136815 (7.9x) | 265700 (1.007) | 68760 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 5894799 (9.5x) | 265700 (1.007) | 68760 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 4969980 (11.3x) | 265700 (1.007) | 68760 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 4724305 (11.9x) | 265700 (1.007) | 68760 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 20234944 (2.8x) | 272360 (1.032) | 68760 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 14139266 (4.0x) | 272360 (1.032) | 68760 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 11281990 (5.0x) | 272360 (1.032) | 68760 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 9983717 (5.6x) | 272360 (1.032) | 68760 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 9348837 (6.0x) | 272360 (1.032) | 68760 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 8879455 (6.3x) | 272360 (1.032) | 68760 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 2791918 (0.6x) | 380888 (0.979) | 19336 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 903686 (1.9x) | 382836 (0.984) | 19336 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 673414 (2.5x) | 382836 (0.984) | 19336 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 558278 (3.0x) | 382836 (0.984) | 19336 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 500710 (3.4x) | 382836 (0.984) | 19336 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 471926 (3.6x) | 382836 (0.984) | 19336 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 471731 (3.6x) | 382836 (0.984) | 19336 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1689241 (Base) | 389104 (Base) | 19336 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 3041919 (0.6x) | 389108 (1.0) | 19336 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 597099 (2.8x) | 392988 (1.01) | 19336 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 483179 (3.5x) | 392988 (1.01) | 19336 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 426219 (4.0x) | 392988 (1.01) | 19336 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 397739 (4.2x) | 392988 (1.01) | 19336 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 383499 (4.4x) | 392988 (1.01) | 19336 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 381715 (4.4x) | 392988 (1.01) | 19336 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1944832 (0.9x) | 390844 (1.004) | 19336 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1828152 (0.9x) | 390844 (1.004) | 19336 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1769812 (1.0x) | 390844 (1.004) | 19336 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1740966 (1.0x) | 390844 (1.004) | 19336 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1737335 (1.0x) | 390844 (1.004) | 19336 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1735479 (1.0x) | 390844 (1.004) | 19336 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 929404 (1.8x) | 392992 (1.01) | 19336 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 667452 (2.5x) | 392992 (1.01) | 19336 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 536476 (3.1x) | 392992 (1.01) | 19336 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 470988 (3.6x) | 392992 (1.01) | 19336 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 438244 (3.9x) | 392992 (1.01) | 19336 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 434156 (3.9x) | 392992 (1.01) | 19336 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 103371251 (0.4x) | 472024 (0.933) | 134296 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 71824044 (0.6x) | 477392 (0.944) | 134296 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 66662636 (0.7x) | 477392 (0.944) | 134296 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 64210956 (0.7x) | 477392 (0.944) | 134296 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 63178652 (0.7x) | 477392 (0.944) | 134296 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 62762657 (0.7x) | 477392 (0.944) | 134296 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 62730373 (0.7x) | 477392 (0.944) | 134296 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 44945967 (Base) | 505952 (Base) | 134296 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 44760610 (1.0x) | 504764 (0.998) | 134296 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 19334831 (2.3x) | 515704 (1.019) | 134296 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 16459215 (2.7x) | 515704 (1.019) | 134296 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15081423 (3.0x) | 515704 (1.019) | 134296 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 14580572 (3.1x) | 515704 (1.019) | 134296 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 14374052 (3.1x) | 515704 (1.019) | 134296 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 14363394 (3.1x) | 515704 (1.019) | 134296 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 13385338 (3.4x) | 506644 (1.001) | 134296 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 10064338 (4.5x) | 506644 (1.001) | 134296 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 8772910 (5.1x) | 506644 (1.001) | 134296 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 8268920 (5.4x) | 506644 (1.001) | 134296 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 8224540 (5.5x) | 506644 (1.001) | 134296 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 8227929 (5.5x) | 506644 (1.001) | 134296 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 22209979 (2.0x) | 514344 (1.017) | 134296 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 19426235 (2.3x) | 514344 (1.017) | 134296 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 18094379 (2.5x) | 514344 (1.017) | 134296 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 17593456 (2.6x) | 514344 (1.017) | 134296 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 17386900 (2.6x) | 514344 (1.017) | 134296 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 17376224 (2.6x) | 514344 (1.017) | 134296 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
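
One way to read the VLEN sweeps is to look at how much each doubling of VLEN reduces the cycle count. The following Python sketch computes that ratio for the vww rows with muRISCV-NN kernels in Vector mode, using the cycle counts from the table above. The helper name `doubling_gains` is hypothetical and only serves this illustration.

```python
# Cycle counts copied from the vww rows: Kernels "muRISCV-NN", Mode "Vector".
VWW_VECTOR_CYCLES = {
    128: 13385338,
    256: 10064338,
    512: 8772910,
    1024: 8268920,
    2048: 8224540,
    4096: 8227929,
}


def doubling_gains(cycles_by_vlen: dict[int, int]) -> dict[int, float]:
    """Map each VLEN to the cycle ratio of the next-smaller VLEN over this VLEN."""
    vlens = sorted(cycles_by_vlen)
    return {
        wide: round(cycles_by_vlen[narrow] / cycles_by_vlen[wide], 2)
        for narrow, wide in zip(vlens, vlens[1:])
    }


gains = doubling_gains(VWW_VECTOR_CYCLES)
for vlen, gain in gains.items():
    print(f"VLEN {vlen}: {gain}x fewer cycles than the previous VLEN")
```

A ratio close to 1.0 indicates that a wider vector register no longer reduces the cycle count for this model, which is what the rows at VLEN 2048 and 4096 show.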
