Benchmarks 2024-11-26: TFLM, GCC, -O3, spike_rv32 (tum-ei-eda/muriscv-nn GitHub Wiki)

Setup

Simulator

  • Spike (riscv-isa-sim) (ISS, CPI=1)
    • Spike: eb0a3e2b0a7c57522928be39de95cd9f8c6dc636
    • Spike PK: fix-gcc14-rvv

Toolchains

Models

Frameworks

  • MLonMCU: develop
  • TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130

Miscellaneous

  • Used the -O3 flag for compilation.
  • Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -O3, Target: spike_rv32)
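
The derived columns in the tables below appear to be simple ratios against the row marked Base (muRISCV-NN, Scalar, RV32GC): speedup is the baseline cycle count divided by the row's cycle count, and the relative ROM/RAM figures are the row's value divided by the baseline value. The sketch below reproduces a few printed values from the Audio Wake Words table under that assumption. The function names are hypothetical, and the rounding (one decimal for speedup, three for memory) is inferred from the printed numbers rather than documented.

```python
# Baseline row of the Audio Wake Words (aww) table:
# muRISCV-NN, Scalar, RV32GC (the row marked "Base").
BASE = {"cycles": 15060815, "rom": 180806, "ram": 36152}


def speedup(cycles, base_cycles=BASE["cycles"]):
    """Speedup over the baseline; values above 1.0 mean fewer cycles."""
    return base_cycles / cycles


def relative(value, base_value):
    """Memory footprint relative to the baseline; values below 1.0 are smaller."""
    return value / base_value


# muRISCV-NN, Vector, VLEN=1024 (aww), copied from the table.
row = {"cycles": 3897348, "rom": 183112, "ram": 36152}

print(f"{speedup(row['cycles']):.1f}x")             # 3.9x
print(round(relative(row["rom"], BASE["rom"]), 3))  # 1.013
print(round(relative(row["ram"], BASE["ram"]), 3))  # 1.0
```

Because the simulator is configured with CPI=1, the cycle counts equal the number of executed instructions, so the speedups compare instruction counts rather than timing on real hardware.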

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM [B] (rel.) | Total RAM [B] (rel.) | VLEN [bits] | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 54363848.0 (0.3x) | 148424 (0.821) | 36144 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 34476456.0 (0.4x) | 156946 (0.868) | 36200 (1.001) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31869936.0 (0.5x) | 157292 (0.87) | 36204 (1.001) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31219427.0 (0.5x) | 158376 (0.876) | 36204 (1.001) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30534285.0 (0.5x) | 159520 (0.882) | 36204 (1.001) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30144995.0 (0.5x) | 160716 (0.889) | 36200 (1.001) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 29803625.0 (0.5x) | 162722 (0.9) | 36176 (1.001) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 15060815.0 (Base) | 180806 (Base) | 36152 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 15088048.0 (1.0x) | 178130 (0.985) | 36152 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 8093709.0 (1.9x) | 203012 (1.123) | 36216 (1.002) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6685275.0 (2.3x) | 202958 (1.123) | 36220 (1.002) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6004861.0 (2.5x) | 209620 (1.159) | 36220 (1.002) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5682416.0 (2.7x) | 221040 (1.223) | 36220 (1.002) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5497499.0 (2.7x) | 247540 (1.369) | 36216 (1.002) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5494123.0 (2.7x) | 302598 (1.674) | 36192 (1.001) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 7989645.0 (1.9x) | 183112 (1.013) | 36152 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 5446551.0 (2.8x) | 183112 (1.013) | 36152 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3931921.0 (3.8x) | 183112 (1.013) | 36152 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3897348.0 (3.9x) | 183112 (1.013) | 36152 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3897348.0 (3.9x) | 183112 (1.013) | 36152 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3904130.0 (3.9x) | 183112 (1.013) | 36152 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 7134408.0 (2.1x) | 200698 (1.11) | 36216 (1.002) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 5293706.0 (2.8x) | 200606 (1.11) | 36220 (1.002) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 4261290.0 (3.5x) | 207268 (1.146) | 36220 (1.002) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 3910675.0 (3.9x) | 218688 (1.21) | 36220 (1.002) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 3687391.0 (4.1x) | 245188 (1.356) | 36216 (1.002) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 3704454.0 (4.1x) | 300258 (1.661) | 36192 (1.001) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Image Classification (resnet)

| Cycles (Speedup) | Total ROM [B] (rel.) | Total RAM [B] (rel.) | VLEN [bits] | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 172295593.0 (0.3x) | 196904 (0.905) | 68916 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 67905087.0 (0.8x) | 210674 (0.969) | 68980 (1.001) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 53031626.0 (1.0x) | 211398 (0.972) | 68984 (1.001) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 47125467.0 (1.2x) | 214156 (0.985) | 68984 (1.001) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 45091198.0 (1.2x) | 216022 (0.993) | 68984 (1.001) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44753023.0 (1.2x) | 218252 (1.004) | 68980 (1.001) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44036761.0 (1.2x) | 221856 (1.02) | 68956 (1.001) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 54559765.0 (Base) | 217464 (Base) | 68908 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 72393197.0 (0.8x) | 216724 (0.997) | 68908 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 18041661.0 (3.0x) | 249354 (1.147) | 68980 (1.001) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 11533301.0 (4.7x) | 242486 (1.115) | 68984 (1.001) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 8332992.0 (6.5x) | 245616 (1.129) | 68984 (1.001) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6860235.0 (8.0x) | 247026 (1.136) | 68984 (1.001) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6163500.0 (8.9x) | 248778 (1.144) | 68980 (1.001) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5850418.0 (9.3x) | 251434 (1.156) | 68956 (1.001) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 32167007.0 (1.7x) | 224760 (1.034) | 68908 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 19763011.0 (2.8x) | 224760 (1.034) | 68908 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14136501.0 (3.9x) | 224760 (1.034) | 68908 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 11394884.0 (4.8x) | 224760 (1.034) | 68908 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 9218540.0 (5.9x) | 224760 (1.034) | 68908 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8638398.0 (6.3x) | 224760 (1.034) | 68908 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 29171367.0 (1.9x) | 248602 (1.143) | 68980 (1.001) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 17067706.0 (3.2x) | 241734 (1.112) | 68984 (1.001) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 10956464.0 (5.0x) | 244864 (1.126) | 68984 (1.001) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 8093209.0 (6.7x) | 246274 (1.132) | 68984 (1.001) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 6789234.0 (8.0x) | 248026 (1.141) | 68980 (1.001) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 6151842.0 (8.9x) | 250682 (1.153) | 68956 (1.001) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM [B] (rel.) | Total RAM [B] (rel.) | VLEN [bits] | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 2784194.0 (0.6x) | 340620 (0.978) | 19424 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 1232013.0 (1.3x) | 343596 (0.987) | 19428 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 835955.0 (2.0x) | 343614 (0.987) | 19428 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 638570.0 (2.6x) | 344052 (0.988) | 19428 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 534694.0 (3.1x) | 344468 (0.989) | 19428 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 485380.0 (3.4x) | 344926 (0.991) | 19428 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 460708.0 (3.6x) | 345668 (0.993) | 19424 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 1648267.0 (Base) | 348140 (Base) | 19424 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 2728325.0 (0.6x) | 348142 (1.0) | 19424 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 726502.0 (2.3x) | 349172 (1.003) | 19428 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 544912.0 (3.0x) | 348766 (1.002) | 19428 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 454495.0 (3.6x) | 348998 (1.002) | 19428 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 409282.0 (4.0x) | 349202 (1.003) | 19428 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 398968.0 (4.1x) | 349428 (1.004) | 19428 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 380515.0 (4.3x) | 349810 (1.005) | 19424 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 1993404.0 (0.8x) | 351934 (1.011) | 19424 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1745390.0 (0.9x) | 351934 (1.011) | 19424 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1621383.0 (1.0x) | 351934 (1.011) | 19424 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1560047.0 (1.1x) | 351934 (1.011) | 19424 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1555722.0 (1.1x) | 351934 (1.011) | 19424 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1555172.0 (1.1x) | 351934 (1.011) | 19424 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1221483.0 (1.3x) | 349174 (1.003) | 19428 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 808554.0 (2.0x) | 348768 (1.002) | 19428 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 602899.0 (2.7x) | 349000 (1.002) | 19428 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 500104.0 (3.3x) | 349204 (1.003) | 19428 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 460999.0 (3.6x) | 349430 (1.004) | 19428 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 428147.0 (3.8x) | 349812 (1.005) | 19424 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM [B] (rel.) | Total RAM [B] (rel.) | VLEN [bits] | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 140036503.0 (0.3x) | 422066 (0.929) | 134448 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 77531620.0 (0.6x) | 430588 (0.947) | 134504 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 69387094.0 (0.7x) | 430934 (0.948) | 134508 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 66085799.0 (0.7x) | 432018 (0.951) | 134508 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 64123882.0 (0.7x) | 433162 (0.953) | 134508 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63218733.0 (0.7x) | 434358 (0.956) | 134504 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 62122887.0 (0.7x) | 436364 (0.96) | 134480 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 45259044.0 (Base) | 454448 (Base) | 134456 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 45599828.0 (1.0x) | 451772 (0.994) | 134456 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 25091654.0 (1.8x) | 476662 (1.049) | 134520 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 21026264.0 (2.2x) | 476608 (1.049) | 134524 (1.001) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 19035801.0 (2.4x) | 483270 (1.063) | 134524 (1.001) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 18108838.0 (2.5x) | 494674 (1.089) | 134524 (1.001) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 17730388.0 (2.6x) | 521174 (1.147) | 134520 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 17581231.0 (2.6x) | 576248 (1.268) | 134496 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 25430449.0 (1.8x) | 456754 (1.005) | 134456 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 18543026.0 (2.4x) | 456754 (1.005) | 134456 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 15914790.0 (2.8x) | 456754 (1.005) | 134456 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14813451.0 (3.1x) | 456754 (1.005) | 134456 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14723422.0 (3.1x) | 456754 (1.005) | 134456 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14730204.0 (3.1x) | 456754 (1.005) | 134456 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 22137934.0 (2.0x) | 474348 (1.044) | 134520 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 17072508.0 (2.7x) | 474256 (1.044) | 134524 (1.001) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 14168099.0 (3.2x) | 480918 (1.058) | 134524 (1.001) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 13159922.0 (3.4x) | 492322 (1.083) | 134524 (1.001) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 12786564.0 (3.5x) | 518822 (1.142) | 134520 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 12651382.0 (3.6x) | 573908 (1.263) | 134496 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Original data

Click here to download the raw files for this benchmark.