Benchmarks 2024 11 21 TFLM GCC O3 spike_rv32_min - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

  • Spike (riscv-isa-sim) (ISS, CPI=1)
    • Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
    • Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Toolchains

Models

Frameworks

  • MLonMCU: develop
  • TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130

Miscellaneous

  • Used -O3 flag for compilation.
  • Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -O3, Target: spike_rv32_min)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 53179108 (0.3x) | 200684 (0.814) | 36200 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 53184241 (0.3x) | 200752 (0.815) | 36200 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 53184241 (0.3x) | 200752 (0.815) | 36200 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 53184241 (0.3x) | 200752 (0.815) | 36200 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 53184241 (0.3x) | 200752 (0.815) | 36200 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 53184241 (0.3x) | 200752 (0.815) | 36200 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 53184241 (0.3x) | 200752 (0.815) | 36200 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15413284 (Base) | 246396 (Base) | 36216 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 15231597 (1.0x) | 242140 (0.983) | 36216 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 15413288 (1.0x) | 246436 (1.0) | 36216 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15413288 (1.0x) | 246436 (1.0) | 36216 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15413288 (1.0x) | 246436 (1.0) | 36216 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15413288 (1.0x) | 246436 (1.0) | 36216 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15413288 (1.0x) | 246436 (1.0) | 36216 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15413288 (1.0x) | 246436 (1.0) | 36216 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 7302947 (2.1x) | 248196 (1.007) | 36216 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 5191489 (3.0x) | 248196 (1.007) | 36216 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 3756987 (4.1x) | 248196 (1.007) | 36216 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 3719395 (4.1x) | 248196 (1.007) | 36216 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 3722784 (4.1x) | 248196 (1.007) | 36216 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 3726173 (4.1x) | 248196 (1.007) | 36216 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 15231601 (1.0x) | 242180 (0.983) | 36216 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15231601 (1.0x) | 242180 (0.983) | 36216 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15231601 (1.0x) | 242180 (0.983) | 36216 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15231601 (1.0x) | 242180 (0.983) | 36216 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15231601 (1.0x) | 242180 (0.983) | 36216 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 15231601 (1.0x) | 242180 (0.983) | 36216 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
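
The parenthesised figures in these tables appear to be derived from the row marked "Base" (muRISCV-NN, Scalar, RV32IM). The following is a minimal sketch of that arithmetic, assuming speedup is baseline cycles divided by the row's cycles (rounded to one decimal) and the relative ROM/RAM figure is the row's value divided by the baseline value (rounded to three decimals). The function names `speedup` and `relative` are hypothetical and are not part of MLonMCU; the numbers are copied from the Audio Wake Words table.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Assumed definition: baseline cycles / cycles, one decimal place."""
    return round(base_cycles / cycles, 1)


def relative(value: int, base_value: int) -> float:
    """Assumed definition: value / baseline value, three decimal places."""
    return round(value / base_value, 3)


# Baseline row of the aww table: muRISCV-NN, Scalar, RV32IM.
BASE_CYCLES = 15413284
BASE_ROM = 246396

# (label, cycles, total ROM in bytes) for a few rows of the aww table.
rows = [
    ("TFLM Reference, RV32IM", 53179108, 200684),
    ("muRISCV-NN Vector, VLEN=128", 7302947, 248196),
    ("muRISCV-NN Vector, VLEN=512", 3756987, 248196),
]

for label, cycles, rom in rows:
    print(f"{label}: {speedup(BASE_CYCLES, cycles)}x, ROM rel. {relative(rom, BASE_ROM)}")
```

Under these assumptions a speedup below 1.0x means the configuration needs more cycles than the baseline, and a relative ROM value above 1.0 means a larger binary.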

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 169739643 (0.3x) | 251568 (0.896) | 68972 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 169739643 (0.3x) | 251528 (0.896) | 68972 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 169739643 (0.3x) | 251528 (0.896) | 68972 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 169739643 (0.3x) | 251528 (0.896) | 68972 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 169739643 (0.3x) | 251528 (0.896) | 68972 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 169739643 (0.3x) | 251528 (0.896) | 68972 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 169739643 (0.3x) | 251528 (0.896) | 68972 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 54527190 (Base) | 280652 (Base) | 68972 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 72570894 (0.8x) | 279704 (0.997) | 68972 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 54649796 (1.0x) | 280608 (1.0) | 68972 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 54649796 (1.0x) | 280608 (1.0) | 68972 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 54649796 (1.0x) | 280608 (1.0) | 68972 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 54649796 (1.0x) | 280608 (1.0) | 68972 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 54649796 (1.0x) | 280608 (1.0) | 68972 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 54649796 (1.0x) | 280608 (1.0) | 68972 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 27854431 (2.0x) | 289168 (1.03) | 68972 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 17587123 (3.1x) | 289168 (1.03) | 68972 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 12971613 (4.2x) | 289168 (1.03) | 68972 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 10719805 (5.1x) | 289168 (1.03) | 68972 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 8752490 (6.2x) | 289168 (1.03) | 68972 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 8218527 (6.6x) | 289168 (1.03) | 68972 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 72693689 (0.8x) | 279660 (0.996) | 68972 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 72693689 (0.8x) | 279660 (0.996) | 68972 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 72693689 (0.8x) | 279660 (0.996) | 68972 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 72693689 (0.8x) | 279660 (0.996) | 68972 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 72693689 (0.8x) | 279660 (0.996) | 68972 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 72693689 (0.8x) | 279660 (0.996) | 68972 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 2788715 (0.6x) | 377836 (0.973) | 19416 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 2796033 (0.6x) | 377900 (0.973) | 19416 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2796033 (0.6x) | 377900 (0.973) | 19416 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2796033 (0.6x) | 377900 (0.973) | 19416 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2796033 (0.6x) | 377900 (0.973) | 19416 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2796033 (0.6x) | 377900 (0.973) | 19416 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2796033 (0.6x) | 377900 (0.973) | 19416 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1652471 (Base) | 388372 (Base) | 19416 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 2735891 (0.6x) | 388376 (1.0) | 19416 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 1652471 (1.0x) | 388436 (1.0) | 19416 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1652471 (1.0x) | 388436 (1.0) | 19416 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1652471 (1.0x) | 388436 (1.0) | 19416 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1652471 (1.0x) | 388436 (1.0) | 19416 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1652471 (1.0x) | 388436 (1.0) | 19416 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1652471 (1.0x) | 388436 (1.0) | 19416 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 1946801 (0.8x) | 392968 (1.012) | 19416 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1719081 (1.0x) | 392968 (1.012) | 19416 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1605221 (1.0x) | 392968 (1.012) | 19416 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1548807 (1.1x) | 392968 (1.012) | 19416 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1545117 (1.1x) | 392968 (1.012) | 19416 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 1544902 (1.1x) | 392968 (1.012) | 19416 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 2735891 (0.6x) | 388440 (1.0) | 19416 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2735891 (0.6x) | 388440 (1.0) | 19416 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2735891 (0.6x) | 388440 (1.0) | 19416 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2735891 (0.6x) | 388440 (1.0) | 19416 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2735891 (0.6x) | 388440 (1.0) | 19416 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 2735891 (0.6x) | 388440 (1.0) | 19416 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 138781789 (0.3x) | 474324 (0.912) | 134504 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
| 138781151 (0.3x) | 474392 (0.912) | 134504 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 138781151 (0.3x) | 474392 (0.912) | 134504 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 138781151 (0.3x) | 474392 (0.912) | 134504 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 138781151 (0.3x) | 474392 (0.912) | 134504 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 138781151 (0.3x) | 474392 (0.912) | 134504 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 138781151 (0.3x) | 474392 (0.912) | 134504 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
| 46346274 (Base) | 520036 (Base) | 134520 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
| 45916524 (1.0x) | 515780 (0.992) | 134520 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
| 46346343 (1.0x) | 520076 (1.0) | 134520 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 46346343 (1.0x) | 520076 (1.0) | 134520 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 46346343 (1.0x) | 520076 (1.0) | 134520 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 46346343 (1.0x) | 520076 (1.0) | 134520 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 46346343 (1.0x) | 520076 (1.0) | 134520 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 46346343 (1.0x) | 520076 (1.0) | 134520 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
| 23269989 (2.0x) | 521836 (1.003) | 134520 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 17353609 (2.7x) | 521836 (1.003) | 134520 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 15109299 (3.1x) | 521836 (1.003) | 134520 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 14081318 (3.3x) | 521836 (1.003) | 134520 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 13998927 (3.3x) | 521836 (1.003) | 134520 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 14002316 (3.3x) | 521836 (1.003) | 134520 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
| 45916536 (1.0x) | 515820 (0.992) | 134520 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 45916536 (1.0x) | 515820 (0.992) | 134520 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 45916536 (1.0x) | 515820 (0.992) | 134520 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 45916536 (1.0x) | 515820 (0.992) | 134520 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 45916536 (1.0x) | 515820 (0.992) | 134520 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
| 45916536 (1.0x) | 515820 (0.992) | 134520 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |

Original data

Click here to download the raw files for this benchmark.