Benchmarks 2024 11 26 TFLM GCC Os spike_rv32_min - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

  • Spike (riscv-isa-sim) (ISS, CPI=1)
    • Spike: eb0a3e2b0a7c57522928be39de95cd9f8c6dc636
    • Spike PK: fix-gcc14-rvv

Toolchains

Models

Frameworks

  • MLonMCU: develop

  • TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130

Miscellaneous

  • Compiled with the -Os flag.
  • Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -Os, Target: spike_rv32_min)

Audio Wake Words (aww)

Cycles (Speedup) Total ROM (rel.) Total RAM (rel.) VLEN Kernels Mode Arch Unroll Auto-Vectorization
175181395.0 ( 0.1x ) 185964 ( 0.867 ) 36196 ( 1.0 ) 0 TFLM Reference RV32IM 0 -
175181396.0 ( 0.1x ) 186688 ( 0.871 ) 36196 ( 1.0 ) 128 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
175181396.0 ( 0.1x ) 186688 ( 0.871 ) 36196 ( 1.0 ) 256 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
175181396.0 ( 0.1x ) 186688 ( 0.871 ) 36196 ( 1.0 ) 512 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
175181396.0 ( 0.1x ) 186688 ( 0.871 ) 36196 ( 1.0 ) 1024 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
175181396.0 ( 0.1x ) 186688 ( 0.871 ) 36196 ( 1.0 ) 2048 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
175181396.0 ( 0.1x ) 186688 ( 0.871 ) 36196 ( 1.0 ) 4096 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
17322613.0 ( Base ) 214456 ( Base ) 36204 ( Base ) 0 muRISCV-NN Scalar RV32IM 0 -
17336271.0 ( 1.0x ) 214220 ( 0.999 ) 36204 ( 1.0 ) 0 muRISCV-NN Vector (Portable) RV32IM 0 -
17322148.0 ( 1.0x ) 215252 ( 1.004 ) 36204 ( 1.0 ) 128 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
17322148.0 ( 1.0x ) 215252 ( 1.004 ) 36204 ( 1.0 ) 256 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
17322148.0 ( 1.0x ) 215252 ( 1.004 ) 36204 ( 1.0 ) 512 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
17322148.0 ( 1.0x ) 215252 ( 1.004 ) 36204 ( 1.0 ) 1024 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
17322148.0 ( 1.0x ) 215252 ( 1.004 ) 36204 ( 1.0 ) 2048 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
17322148.0 ( 1.0x ) 215252 ( 1.004 ) 36204 ( 1.0 ) 4096 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
7127297.0 ( 2.4x ) 218812 ( 1.02 ) 36204 ( 1.0 ) 128 muRISCV-NN Vector RV32IM_ZVE64X 0 -
4876411.0 ( 3.6x ) 218812 ( 1.02 ) 36204 ( 1.0 ) 256 muRISCV-NN Vector RV32IM_ZVE64X 0 -
3641637.0 ( 4.8x ) 218812 ( 1.02 ) 36204 ( 1.0 ) 512 muRISCV-NN Vector RV32IM_ZVE64X 0 -
3599757.0 ( 4.8x ) 218812 ( 1.02 ) 36204 ( 1.0 ) 1024 muRISCV-NN Vector RV32IM_ZVE64X 0 -
3600178.0 ( 4.8x ) 218812 ( 1.02 ) 36204 ( 1.0 ) 2048 muRISCV-NN Vector RV32IM_ZVE64X 0 -
3600502.0 ( 4.8x ) 218812 ( 1.02 ) 36204 ( 1.0 ) 4096 muRISCV-NN Vector RV32IM_ZVE64X 0 -
17338498.0 ( 1.0x ) 215020 ( 1.003 ) 36204 ( 1.0 ) 128 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
17338498.0 ( 1.0x ) 215020 ( 1.003 ) 36204 ( 1.0 ) 256 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
17338498.0 ( 1.0x ) 215020 ( 1.003 ) 36204 ( 1.0 ) 512 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
17338498.0 ( 1.0x ) 215020 ( 1.003 ) 36204 ( 1.0 ) 1024 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
17338498.0 ( 1.0x ) 215020 ( 1.003 ) 36204 ( 1.0 ) 2048 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
17338498.0 ( 1.0x ) 215020 ( 1.003 ) 36204 ( 1.0 ) 4096 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
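
The parenthesised values in the table above appear to be ratios against the row marked "Base" (muRISCV-NN, Scalar, RV32IM). The page does not state the formulas, so the sketch below is an assumption inferred from the numbers: speedup as base cycles divided by row cycles, and the memory ratio as row bytes divided by base bytes. The function names are hypothetical.

```python
def speedup(base_cycles: float, cycles: float) -> float:
    """Assumed definition: baseline cycles / row cycles (above 1 means faster)."""
    return base_cycles / cycles


def relative(base_bytes: float, row_bytes: float) -> float:
    """Assumed definition: row bytes / baseline bytes (above 1 means larger)."""
    return row_bytes / base_bytes


# Baseline row of the aww table (muRISCV-NN, Scalar, RV32IM).
BASE_CYCLES, BASE_ROM = 17322613.0, 214456

# muRISCV-NN Vector row at VLEN=128 from the same table.
vec_cycles, vec_rom = 7127297.0, 218812

print(f"speedup: {speedup(BASE_CYCLES, vec_cycles):.1f}x")
print(f"ROM rel.: {round(relative(BASE_ROM, vec_rom), 3)}")
```

Under these assumptions the VLEN=128 vector row reproduces the 2.4x and 1.02 values listed in the table, and the TFLM reference row reproduces 0.1x and 0.867.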

Image Classification (resnet)

Cycles (Speedup) Total ROM (rel.) Total RAM (rel.) VLEN Kernels Mode Arch Unroll Auto-Vectorization
746703185.0 ( 0.1x ) 227652 ( 0.923 ) 68960 ( 1.0 ) 0 TFLM Reference RV32IM 0 -
746703198.0 ( 0.1x ) 228328 ( 0.926 ) 68960 ( 1.0 ) 128 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
746703204.0 ( 0.1x ) 228336 ( 0.926 ) 68964 ( 1.0 ) 256 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
746703204.0 ( 0.1x ) 228336 ( 0.926 ) 68964 ( 1.0 ) 512 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
746703204.0 ( 0.1x ) 228336 ( 0.926 ) 68964 ( 1.0 ) 1024 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
746703204.0 ( 0.1x ) 228336 ( 0.926 ) 68964 ( 1.0 ) 2048 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
746703204.0 ( 0.1x ) 228336 ( 0.926 ) 68964 ( 1.0 ) 4096 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
79932931.0 ( Base ) 246644 ( Base ) 68952 ( Base ) 0 muRISCV-NN Scalar RV32IM 0 -
79039507.0 ( 1.0x ) 246260 ( 0.998 ) 68952 ( 1.0 ) 0 muRISCV-NN Vector (Portable) RV32IM 0 -
79938165.0 ( 1.0x ) 247540 ( 1.004 ) 68952 ( 1.0 ) 128 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
79938163.0 ( 1.0x ) 247388 ( 1.003 ) 68956 ( 1.0 ) 256 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
79938163.0 ( 1.0x ) 247388 ( 1.003 ) 68956 ( 1.0 ) 512 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
79938163.0 ( 1.0x ) 247388 ( 1.003 ) 68956 ( 1.0 ) 1024 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
79938163.0 ( 1.0x ) 247388 ( 1.003 ) 68956 ( 1.0 ) 2048 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
79938163.0 ( 1.0x ) 247388 ( 1.003 ) 68956 ( 1.0 ) 4096 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
28309872.0 ( 2.8x ) 252060 ( 1.022 ) 68952 ( 1.0 ) 128 muRISCV-NN Vector RV32IM_ZVE64X 0 -
17541661.0 ( 4.6x ) 252060 ( 1.022 ) 68952 ( 1.0 ) 256 muRISCV-NN Vector RV32IM_ZVE64X 0 -
12632695.0 ( 6.3x ) 252060 ( 1.022 ) 68952 ( 1.0 ) 512 muRISCV-NN Vector RV32IM_ZVE64X 0 -
10237740.0 ( 7.8x ) 252060 ( 1.022 ) 68952 ( 1.0 ) 1024 muRISCV-NN Vector RV32IM_ZVE64X 0 -
8446804.0 ( 9.5x ) 252060 ( 1.022 ) 68952 ( 1.0 ) 2048 muRISCV-NN Vector RV32IM_ZVE64X 0 -
7967141.0 ( 10.0x ) 252060 ( 1.022 ) 68952 ( 1.0 ) 4096 muRISCV-NN Vector RV32IM_ZVE64X 0 -
78954930.0 ( 1.0x ) 247156 ( 1.002 ) 68952 ( 1.0 ) 128 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
78960489.0 ( 1.0x ) 247004 ( 1.001 ) 68956 ( 1.0 ) 256 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
78960237.0 ( 1.0x ) 247004 ( 1.001 ) 68956 ( 1.0 ) 512 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
78960111.0 ( 1.0x ) 247004 ( 1.001 ) 68956 ( 1.0 ) 1024 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
78960051.0 ( 1.0x ) 247004 ( 1.001 ) 68956 ( 1.0 ) 2048 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
78960021.0 ( 1.0x ) 247004 ( 1.001 ) 68956 ( 1.0 ) 4096 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
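
The muRISCV-NN Vector rows of the resnet table keep improving up to VLEN 4096, but each doubling of VLEN buys less. A minimal sketch of that calculation, using only the cycle counts listed above (the helper name `doubling_gains` is hypothetical):

```python
# Cycle counts of the muRISCV-NN Vector rows in the resnet table, keyed by VLEN.
RESNET_VECTOR_CYCLES = {
    128: 28309872.0,
    256: 17541661.0,
    512: 12632695.0,
    1024: 10237740.0,
    2048: 8446804.0,
    4096: 7967141.0,
}


def doubling_gains(cycles_by_vlen):
    """Map each VLEN (except the smallest) to cycles(previous VLEN) / cycles(VLEN)."""
    vlens = sorted(cycles_by_vlen)
    return {
        larger: cycles_by_vlen[smaller] / cycles_by_vlen[larger]
        for smaller, larger in zip(vlens, vlens[1:])
    }


for vlen, gain in doubling_gains(RESNET_VECTOR_CYCLES).items():
    print(f"VLEN {vlen:>4}: {gain:.2f}x over the previous VLEN")
```

The gain shrinks monotonically from roughly 1.6x (128 to 256) to about 1.06x (2048 to 4096).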

Anomaly Detection (toycar)

Cycles (Speedup) Total ROM (rel.) Total RAM (rel.) VLEN Kernels Mode Arch Unroll Auto-Vectorization
3124121.0 ( 0.6x ) 376304 ( 0.982 ) 19416 ( 1.0 ) 0 TFLM Reference RV32IM 0 -
3124191.0 ( 0.6x ) 376928 ( 0.983 ) 19416 ( 1.0 ) 128 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
3124191.0 ( 0.6x ) 376928 ( 0.983 ) 19416 ( 1.0 ) 256 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
3124191.0 ( 0.6x ) 376928 ( 0.983 ) 19416 ( 1.0 ) 512 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
3124191.0 ( 0.6x ) 376928 ( 0.983 ) 19416 ( 1.0 ) 1024 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
3124191.0 ( 0.6x ) 376928 ( 0.983 ) 19416 ( 1.0 ) 2048 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
3124191.0 ( 0.6x ) 376928 ( 0.983 ) 19416 ( 1.0 ) 4096 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
1854573.0 ( Base ) 383348 ( Base ) 19416 ( Base ) 0 muRISCV-NN Scalar RV32IM 0 -
3203894.0 ( 0.6x ) 383352 ( 1.0 ) 19416 ( 1.0 ) 0 muRISCV-NN Vector (Portable) RV32IM 0 -
1854731.0 ( 1.0x ) 384048 ( 1.002 ) 19416 ( 1.0 ) 128 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
1854731.0 ( 1.0x ) 384048 ( 1.002 ) 19416 ( 1.0 ) 256 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
1854731.0 ( 1.0x ) 384048 ( 1.002 ) 19416 ( 1.0 ) 512 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
1854731.0 ( 1.0x ) 384048 ( 1.002 ) 19416 ( 1.0 ) 1024 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
1854731.0 ( 1.0x ) 384048 ( 1.002 ) 19416 ( 1.0 ) 2048 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
1854731.0 ( 1.0x ) 384048 ( 1.002 ) 19416 ( 1.0 ) 4096 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
2234315.0 ( 0.8x ) 387572 ( 1.011 ) 19416 ( 1.0 ) 128 muRISCV-NN Vector RV32IM_ZVE64X 0 -
1993523.0 ( 0.9x ) 387572 ( 1.011 ) 19416 ( 1.0 ) 256 muRISCV-NN Vector RV32IM_ZVE64X 0 -
1873127.0 ( 1.0x ) 387572 ( 1.011 ) 19416 ( 1.0 ) 512 muRISCV-NN Vector RV32IM_ZVE64X 0 -
1813469.0 ( 1.0x ) 387572 ( 1.011 ) 19416 ( 1.0 ) 1024 muRISCV-NN Vector RV32IM_ZVE64X 0 -
1805984.0 ( 1.0x ) 387572 ( 1.011 ) 19416 ( 1.0 ) 2048 muRISCV-NN Vector RV32IM_ZVE64X 0 -
1802547.0 ( 1.0x ) 387572 ( 1.011 ) 19416 ( 1.0 ) 4096 muRISCV-NN Vector RV32IM_ZVE64X 0 -
3204062.0 ( 0.6x ) 384052 ( 1.002 ) 19416 ( 1.0 ) 128 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
3204062.0 ( 0.6x ) 384052 ( 1.002 ) 19416 ( 1.0 ) 256 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
3204062.0 ( 0.6x ) 384052 ( 1.002 ) 19416 ( 1.0 ) 512 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
3204062.0 ( 0.6x ) 384052 ( 1.002 ) 19416 ( 1.0 ) 1024 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
3204062.0 ( 0.6x ) 384052 ( 1.002 ) 19416 ( 1.0 ) 2048 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
3204062.0 ( 0.6x ) 384052 ( 1.002 ) 19416 ( 1.0 ) 4096 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
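
For toycar the one-decimal speedup column shows 1.0x for every muRISCV-NN Vector row from VLEN 512 upwards, which hides where the vector kernels actually overtake the scalar baseline. The sketch below compares the raw cycle counts from the table above; `break_even_vlen` is a hypothetical helper name.

```python
# Baseline row of the toycar table (muRISCV-NN, Scalar, RV32IM).
TOYCAR_BASE_CYCLES = 1854573.0

# Cycle counts of the muRISCV-NN Vector rows in the toycar table, keyed by VLEN.
TOYCAR_VECTOR_CYCLES = {
    128: 2234315.0,
    256: 1993523.0,
    512: 1873127.0,
    1024: 1813469.0,
    2048: 1805984.0,
    4096: 1802547.0,
}


def break_even_vlen(base_cycles, cycles_by_vlen):
    """Return the smallest VLEN that needs fewer cycles than the baseline, or None."""
    for vlen in sorted(cycles_by_vlen):
        if cycles_by_vlen[vlen] < base_cycles:
            return vlen
    return None


print(break_even_vlen(TOYCAR_BASE_CYCLES, TOYCAR_VECTOR_CYCLES))
```

With the numbers above, the vector kernels are slower than the scalar baseline up to VLEN 512 and only drop below it at VLEN 1024.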

Visual Wake Words (vww)

Cycles (Speedup) Total ROM (rel.) Total RAM (rel.) VLEN Kernels Mode Arch Unroll Auto-Vectorization
496738079.0 ( 0.1x ) 459604 ( 0.942 ) 134500 ( 1.0 ) 0 TFLM Reference RV32IM 0 -
496738276.0 ( 0.1x ) 460328 ( 0.943 ) 134500 ( 1.0 ) 128 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
496738276.0 ( 0.1x ) 460328 ( 0.943 ) 134500 ( 1.0 ) 256 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
496738276.0 ( 0.1x ) 460328 ( 0.943 ) 134500 ( 1.0 ) 512 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
496738276.0 ( 0.1x ) 460328 ( 0.943 ) 134500 ( 1.0 ) 1024 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
496738276.0 ( 0.1x ) 460328 ( 0.943 ) 134500 ( 1.0 ) 2048 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
496738276.0 ( 0.1x ) 460328 ( 0.943 ) 134500 ( 1.0 ) 4096 TFLM Reference RV32IM_ZVE64X 0 Loop+SLP
51821878.0 ( Base ) 488096 ( Base ) 134508 ( Base ) 0 muRISCV-NN Scalar RV32IM 0 -
52792427.0 ( 1.0x ) 487860 ( 1.0 ) 134508 ( 1.0 ) 0 muRISCV-NN Vector (Portable) RV32IM 0 -
51817996.0 ( 1.0x ) 488900 ( 1.002 ) 134508 ( 1.0 ) 128 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
51817996.0 ( 1.0x ) 488900 ( 1.002 ) 134508 ( 1.0 ) 256 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
51817996.0 ( 1.0x ) 488900 ( 1.002 ) 134508 ( 1.0 ) 512 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
51817996.0 ( 1.0x ) 488900 ( 1.002 ) 134508 ( 1.0 ) 1024 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
51817996.0 ( 1.0x ) 488900 ( 1.002 ) 134508 ( 1.0 ) 2048 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
51817996.0 ( 1.0x ) 488900 ( 1.002 ) 134508 ( 1.0 ) 4096 muRISCV-NN Scalar RV32IM_ZVE64X 0 Loop+SLP
22840303.0 ( 2.3x ) 492452 ( 1.009 ) 134508 ( 1.0 ) 128 muRISCV-NN Vector RV32IM_ZVE64X 0 -
16892851.0 ( 3.1x ) 492452 ( 1.009 ) 134508 ( 1.0 ) 256 muRISCV-NN Vector RV32IM_ZVE64X 0 -
14611171.0 ( 3.5x ) 492452 ( 1.009 ) 134508 ( 1.0 ) 512 muRISCV-NN Vector RV32IM_ZVE64X 0 -
13705413.0 ( 3.8x ) 492452 ( 1.009 ) 134508 ( 1.0 ) 1024 muRISCV-NN Vector RV32IM_ZVE64X 0 -
13628964.0 ( 3.8x ) 492452 ( 1.009 ) 134508 ( 1.0 ) 2048 muRISCV-NN Vector RV32IM_ZVE64X 0 -
13629288.0 ( 3.8x ) 492452 ( 1.009 ) 134508 ( 1.0 ) 4096 muRISCV-NN Vector RV32IM_ZVE64X 0 -
52614637.0 ( 1.0x ) 488668 ( 1.001 ) 134508 ( 1.0 ) 128 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
52614133.0 ( 1.0x ) 488668 ( 1.001 ) 134508 ( 1.0 ) 256 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
52613881.0 ( 1.0x ) 488668 ( 1.001 ) 134508 ( 1.0 ) 512 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
52613755.0 ( 1.0x ) 488668 ( 1.001 ) 134508 ( 1.0 ) 1024 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
52613695.0 ( 1.0x ) 488668 ( 1.001 ) 134508 ( 1.0 ) 2048 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
52613665.0 ( 1.0x ) 488668 ( 1.001 ) 134508 ( 1.0 ) 4096 muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP
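
Each result row on this page follows the nine-column header (Cycles, Total ROM, Total RAM, VLEN, Kernels, Mode, Arch, Unroll, Auto-Vectorization), with the Mode column sometimes containing a space, as in "Vector (Portable)". For post-processing the rows as plain text, a regular expression along the following lines can split them into fields. This is a sketch that assumes the exact spacing shown on this page; the pattern and field names are illustrative and not part of MLonMCU.

```python
import re

# One named group per column; speedup/rel. values are either "Base" or a number.
ROW_RE = re.compile(
    r"^(?P<cycles>\d+\.\d+) \( (?P<speedup>Base|[\d.]+x) \) "
    r"(?P<rom>\d+) \( (?P<rom_rel>Base|[\d.]+) \) "
    r"(?P<ram>\d+) \( (?P<ram_rel>Base|[\d.]+) \) "
    r"(?P<vlen>\d+) (?P<kernels>\S+) (?P<mode>.+?) "
    r"(?P<arch>RV32\w+) (?P<unroll>\d+) (?P<autovec>\S+)$"
)


def parse_row(line: str) -> dict:
    """Split one flattened table row into a dict of column name -> string value."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    return match.groupdict()


row = parse_row(
    "52613665.0 ( 1.0x ) 488668 ( 1.001 ) 134508 ( 1.0 ) 4096 "
    "muRISCV-NN Vector (Portable) RV32IM_ZVE64X 0 Loop+SLP"
)
print(row["kernels"], "|", row["mode"], "|", row["arch"], "|", row["vlen"])
```

All values are returned as strings, so cycle counts and byte sizes still need converting with `float()` or `int()` before any arithmetic.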

Original data

Click here to download the raw files for this benchmark.