Benchmarks 2024 06 29 TFLM LLVM Os - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

Toolchains

  • LLVM/Clang:
    • TODO: Version
    • Linker: lld (TODO)
    • RISC-V GCC for headers, libc, etc.

Models

Package Versions

  • MLonMCU : main

  • TFLM : main

  • Spike : 0bc176b3fca43560b9e8586cdbc41cfde073e17a

  • Spike PK : 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • Compiled with the -Os flag (optimize for size).
  • Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -Os)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 39299342 (0.4x) | 147084 (0.869) | 36124 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
| 33496385 (0.5x) | 153006 (0.904) | 36132 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31703713 (0.5x) | 153006 (0.904) | 36132 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30807377 (0.5x) | 153006 (0.904) | 36132 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30359209 (0.5x) | 153006 (0.904) | 36132 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30362598 (0.5x) | 153006 (0.904) | 36132 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30362598 (0.5x) | 153006 (0.904) | 36132 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 15090016 (Base) | 169224 (Base) | 36124 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 6137776 (2.5x) | 179646 (1.062) | 36132 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5135085 (2.9x) | 179646 (1.062) | 36132 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4633837 (3.3x) | 179646 (1.062) | 36132 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4346506 (3.5x) | 179646 (1.062) | 36132 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4346506 (3.5x) | 179646 (1.062) | 36132 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4349895 (3.5x) | 179646 (1.062) | 36132 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4078875 (3.7x) | 169580 (1.002) | 36124 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2823897 (5.3x) | 169580 (1.002) | 36124 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2146579 (7.0x) | 169580 (1.002) | 36124 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2105591 (7.2x) | 169580 (1.002) | 36124 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2108980 (7.2x) | 169580 (1.002) | 36124 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2112369 (7.1x) | 169580 (1.002) | 36124 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 121052920 (0.5x) | 188882 (0.929) | 68892 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
| 56888712 (1.0x) | 194748 (0.958) | 68900 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 47132056 (1.2x) | 194748 (0.958) | 68900 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44785152 (1.3x) | 194748 (0.958) | 68900 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44351348 (1.3x) | 194748 (0.958) | 68900 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44354737 (1.3x) | 194748 (0.958) | 68900 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44354737 (1.3x) | 194748 (0.958) | 68900 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 56368604 (Base) | 203240 (Base) | 68888 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 26152066 (2.2x) | 213890 (1.052) | 68896 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 18287279 (3.1x) | 213890 (1.052) | 68896 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14600743 (3.9x) | 213890 (1.052) | 68896 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 12924704 (4.4x) | 213890 (1.052) | 68896 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 12105504 (4.7x) | 213890 (1.052) | 68896 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 11494493 (4.9x) | 213890 (1.052) | 68896 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 15254317 (3.7x) | 204312 (1.005) | 68888 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 9671121 (5.8x) | 204312 (1.005) | 68888 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 7125283 (7.9x) | 204312 (1.005) | 68888 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 5883155 (9.6x) | 204312 (1.005) | 68888 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 4958280 (11.4x) | 204312 (1.005) | 68888 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 4712577 (12.0x) | 204312 (1.005) | 68888 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2804252 (0.6x) | 342440 (0.988) | 19392 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
| 898457 (1.9x) | 344372 (0.993) | 19392 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 668185 (2.5x) | 344372 (0.993) | 19392 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 553049 (3.1x) | 344372 (0.993) | 19392 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 495481 (3.4x) | 344372 (0.993) | 19392 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 466697 (3.6x) | 344372 (0.993) | 19392 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 466502 (3.6x) | 344372 (0.993) | 19392 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 1688600 (Base) | 346650 (Base) | 19392 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 598054 (2.8x) | 350118 (1.01) | 19392 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 484134 (3.5x) | 350118 (1.01) | 19392 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 427174 (4.0x) | 350118 (1.01) | 19392 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 398694 (4.2x) | 350118 (1.01) | 19392 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 384454 (4.4x) | 350118 (1.01) | 19392 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 382670 (4.4x) | 350118 (1.01) | 19392 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 595014 (2.8x) | 347312 (1.002) | 19392 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 478334 (3.5x) | 347312 (1.002) | 19392 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 419994 (4.0x) | 347312 (1.002) | 19392 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 391148 (4.3x) | 347312 (1.002) | 19392 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 387517 (4.4x) | 347312 (1.002) | 19392 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 385661 (4.4x) | 347312 (1.002) | 19392 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 103598379 (0.4x) | 420846 (0.95) | 134452 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
| 72380788 (0.6x) | 426716 (0.963) | 134460 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 67219380 (0.7x) | 426716 (0.963) | 134460 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 64767700 (0.7x) | 426716 (0.963) | 134460 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63735396 (0.7x) | 426716 (0.963) | 134460 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63319401 (0.7x) | 426716 (0.963) | 134460 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63287117 (0.7x) | 426716 (0.963) | 134460 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 45201566 (Base) | 442988 (Base) | 134452 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 19574269 (2.3x) | 453356 (1.023) | 134460 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 16707869 (2.7x) | 453356 (1.023) | 134460 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 15334685 (2.9x) | 453356 (1.023) | 134460 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14833834 (3.0x) | 453356 (1.023) | 134460 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14627314 (3.1x) | 453356 (1.023) | 134460 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14613267 (3.1x) | 453356 (1.023) | 134460 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 13374516 (3.4x) | 443346 (1.001) | 134452 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 10053516 (4.5x) | 443346 (1.001) | 134452 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8762088 (5.2x) | 443346 (1.001) | 134452 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8258098 (5.5x) | 443346 (1.001) | 134452 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8213718 (5.5x) | 443346 (1.001) | 134452 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8217107 (5.5x) | 443346 (1.001) | 134452 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
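
How the relative columns are derived

The reported values are consistent with the Speedup column being the Base row's cycle count divided by the row's cycle count (rounded to one decimal), and the ROM/RAM "rel." columns being the row's value divided by the Base value (rounded to three decimals). The page does not state this explicitly, so the Python sketch below is an assumption checked only against the numbers shown. The function names `speedup` and `relative` are hypothetical and are not part of MLonMCU.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Speedup relative to the Base row: values above 1 mean fewer cycles than Base."""
    return round(base_cycles / cycles, 1)


def relative(base_value: int, value: int) -> float:
    """Memory footprint relative to the Base row: values above 1 mean more bytes than Base."""
    return round(value / base_value, 3)


# Base row of the vww table (muRISCV-NN, Scalar, RV32GC).
BASE_CYCLES = 45201566
BASE_ROM = 442988

# muRISCV-NN Vector row at VLEN=128 from the same table.
print(speedup(BASE_CYCLES, 13374516))  # 3.4
print(relative(BASE_ROM, 443346))      # 1.001
```

Because the ratios are rounded, small differences between rows (for example a few thousand cycles at large VLEN) can collapse to the same reported speedup.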

Original data

Click here to download the raw files for this benchmark.