Benchmarks 2024 03 02 TFLM GCC O3 - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

Toolchains

Models

Package Versions

  • MLonMCU : main

  • TFLM : main

  • Spike : 0bc176b3fca43560b9e8586cdbc41cfde073e17a

  • Spike PK : 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • Used -O3 flag for compilation.
  • Benchmarks generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -O3)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 53184213 (0.3x) | 145991 (0.831) | 36208 (1.0) | 0 | TFLM | Reference | RV32GC | False | - |
| 53184213 (0.3x) | 146023 (0.831) | 36208 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 53184213 (0.3x) | 146023 (0.831) | 36208 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 53184213 (0.3x) | 146023 (0.831) | 36208 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 53184213 (0.3x) | 146023 (0.831) | 36208 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 53184213 (0.3x) | 146023 (0.831) | 36208 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 53184213 (0.3x) | 146023 (0.831) | 36208 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 42403197 (0.4x) | 155544 (0.885) | 36168 (0.998) | 0 | TFLM | Reference | RV32GCP | False | - |
| 15402965 (Base) | 175674 (Base) | 36224 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
| 15402969 (1.0x) | 175720 (1.0) | 36224 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15402969 (1.0x) | 175720 (1.0) | 36224 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15402969 (1.0x) | 175720 (1.0) | 36224 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15402969 (1.0x) | 175720 (1.0) | 36224 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15402969 (1.0x) | 175720 (1.0) | 36224 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15402969 (1.0x) | 175720 (1.0) | 36224 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 7404748 (2.1x) | 176146 (1.003) | 36224 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 5219690 (3.0x) | 176146 (1.003) | 36224 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 3745444 (4.1x) | 176146 (1.003) | 36224 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 3707852 (4.2x) | 176146 (1.003) | 36224 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 3711241 (4.2x) | 176146 (1.003) | 36224 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 3714630 (4.1x) | 176146 (1.003) | 36224 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 12042729 (1.3x) | 179818 (1.024) | 36176 (0.999) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 5820790 (2.6x) | 185090 (1.054) | 36176 (0.999) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |
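
The parenthesized values in the results tables appear to be ratios against the row marked Base (muRISCV-NN, Scalar, RV32GC). The following minimal Python sketch reproduces them for two aww rows. The function name `derived_columns` and the rounding (one decimal for speedup, three for memory) are assumptions inferred from the printed values, not taken from the MLonMCU tooling.

```python
# Base row for aww: muRISCV-NN, Scalar, RV32GC (values copied from the table).
BASE = {"cycles": 15402965, "rom": 175674, "ram": 36224}


def derived_columns(cycles, rom, ram, base=BASE):
    """Return (speedup, rel_rom, rel_ram) relative to the base row.

    Assumed convention: speedup = base cycles / cycles (higher is faster),
    relative memory = value / base value (lower is smaller).
    """
    speedup = round(base["cycles"] / cycles, 1)
    rel_rom = round(rom / base["rom"], 3)
    rel_ram = round(ram / base["ram"], 3)
    return speedup, rel_rom, rel_ram


# muRISCV-NN Vector kernels at VLEN 128, and the TFLM reference on RV32GC.
vector_128 = derived_columns(7404748, 176146, 36224)
tflm_ref = derived_columns(53184213, 145991, 36208)
print(vector_128, tflm_ref)
```

Under these assumptions, a speedup below 1.0x means the configuration is slower than the scalar muRISCV-NN baseline, which is the case for every TFLM reference row.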

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 169763780 (0.3x) | 193557 (0.916) | 68980 (1.0) | 0 | TFLM | Reference | RV32GC | False | - |
| 169764212 (0.3x) | 193409 (0.915) | 68980 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 169764212 (0.3x) | 193409 (0.915) | 68980 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 169764212 (0.3x) | 193409 (0.915) | 68980 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 169764212 (0.3x) | 193409 (0.915) | 68980 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 169764212 (0.3x) | 193409 (0.915) | 68980 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 169764212 (0.3x) | 193409 (0.915) | 68980 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 132830942 (0.4x) | 202610 (0.959) | 68940 (0.999) | 0 | TFLM | Reference | RV32GCP | False | - |
| 54516824 (Base) | 211290 (Base) | 68980 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
| 54639345 (1.0x) | 211198 (1.0) | 68980 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 54639345 (1.0x) | 211198 (1.0) | 68980 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 54639345 (1.0x) | 211198 (1.0) | 68980 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 54639345 (1.0x) | 211198 (1.0) | 68980 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 54639345 (1.0x) | 211198 (1.0) | 68980 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 54639345 (1.0x) | 211198 (1.0) | 68980 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 28542019 (1.9x) | 216958 (1.027) | 68980 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 17905943 (3.0x) | 216958 (1.027) | 68980 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 13122433 (4.2x) | 216958 (1.027) | 68980 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 10788705 (5.1x) | 216958 (1.027) | 68980 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8759950 (6.2x) | 216958 (1.027) | 68980 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8209603 (6.6x) | 216958 (1.027) | 68980 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 41647356 (1.3x) | 218732 (1.035) | 68932 (0.999) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 26976131 (2.0x) | 226386 (1.071) | 68932 (0.999) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |
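
To see how much each doubling of VLEN buys for the vector kernels, the cycle counts can be compared pairwise. The sketch below does this for the resnet rows with muRISCV-NN Vector kernels. The helper name `doubling_gains` is hypothetical, and the ratio it computes is an illustrative metric rather than one reported by the benchmark itself.

```python
# Cycle counts copied from the resnet rows with muRISCV-NN Vector kernels,
# keyed by VLEN in bits.
VECTOR_CYCLES = {
    128: 28542019,
    256: 17905943,
    512: 13122433,
    1024: 10788705,
    2048: 8759950,
    4096: 8209603,
}


def doubling_gains(cycles_by_vlen):
    """Map each (smaller VLEN, next VLEN) pair to the cycle ratio between them.

    A ratio of 2.0 would mean perfect scaling when VLEN doubles;
    a ratio near 1.0 means the wider vectors barely help.
    """
    vlens = sorted(cycles_by_vlen)
    return {
        (lo, hi): cycles_by_vlen[lo] / cycles_by_vlen[hi]
        for lo, hi in zip(vlens, vlens[1:])
    }


gains = doubling_gains(VECTOR_CYCLES)
for (lo, hi), ratio in gains.items():
    print(f"VLEN {lo} -> {hi}: {ratio:.2f}x")
```

For these rows every step stays below the ideal 2.0x, and the last step (2048 to 4096) gains the least, which matches the flattening speedup column in the table.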

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 2783542 (0.6x) | 338082 (0.988) | 19432 (1.0) | 0 | TFLM | Reference | RV32GC | False | - |
| 2783542 (0.6x) | 338124 (0.988) | 19432 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 2783542 (0.6x) | 338124 (0.988) | 19432 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 2783542 (0.6x) | 338124 (0.988) | 19432 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 2783542 (0.6x) | 338124 (0.988) | 19432 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 2783542 (0.6x) | 338124 (0.988) | 19432 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 2783542 (0.6x) | 338124 (0.988) | 19432 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 2527586 (0.6x) | 349666 (1.022) | 19380 (0.997) | 0 | TFLM | Reference | RV32GCP | False | - |
| 1641896 (Base) | 342150 (Base) | 19432 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
| 1641896 (1.0x) | 342194 (1.0) | 19432 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1641896 (1.0x) | 342194 (1.0) | 19432 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1641896 (1.0x) | 342194 (1.0) | 19432 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1641896 (1.0x) | 342194 (1.0) | 19432 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1641896 (1.0x) | 342194 (1.0) | 19432 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1641896 (1.0x) | 342194 (1.0) | 19432 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 967673 (1.7x) | 344810 (1.008) | 19432 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 688241 (2.4x) | 344810 (1.008) | 19432 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 548525 (3.0x) | 344810 (1.008) | 19432 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 479183 (3.4x) | 344810 (1.008) | 19432 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 473893 (3.5x) | 344810 (1.008) | 19432 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 472878 (3.5x) | 344810 (1.008) | 19432 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 1378946 (1.2x) | 353314 (1.033) | 19380 (0.997) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 893494 (1.8x) | 358470 (1.048) | 19380 (0.997) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 138800786 (0.3x) | 419493 (0.934) | 134524 (1.0) | 0 | TFLM | Reference | RV32GC | False | - |
| 138800831 (0.3x) | 419525 (0.934) | 134524 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 138800831 (0.3x) | 419525 (0.934) | 134524 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 138800831 (0.3x) | 419525 (0.934) | 134524 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 138800831 (0.3x) | 419525 (0.934) | 134524 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 138800831 (0.3x) | 419525 (0.934) | 134524 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 138800831 (0.3x) | 419525 (0.934) | 134524 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 107265137 (0.4x) | 429036 (0.955) | 134484 (1.0) | 0 | TFLM | Reference | RV32GCP | False | - |
| 46335895 (Base) | 449176 (Base) | 134540 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
| 46335891 (1.0x) | 449222 (1.0) | 134540 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 46335891 (1.0x) | 449222 (1.0) | 134540 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 46335891 (1.0x) | 449222 (1.0) | 134540 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 46335891 (1.0x) | 449222 (1.0) | 134540 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 46335891 (1.0x) | 449222 (1.0) | 134540 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 46335891 (1.0x) | 449222 (1.0) | 134540 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 23575911 (2.0x) | 449648 (1.001) | 134540 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 17456779 (2.7x) | 449648 (1.001) | 134540 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 15134133 (3.1x) | 449648 (1.001) | 134540 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 14074920 (3.3x) | 449648 (1.001) | 134540 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 13990481 (3.3x) | 449648 (1.001) | 134540 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 13993870 (3.3x) | 449648 (1.001) | 134540 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 36583035 (1.3x) | 453312 (1.009) | 134492 (1.0) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 19065996 (2.4x) | 458584 (1.021) | 134492 (1.0) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |

Original data

Click here to download the raw files for this benchmark.