Benchmarks 2024-11-18: TFLM, LLVM, -Os, spike_rv32 (tum-ei-eda/muriscv-nn GitHub Wiki)

Setup

Simulator

  • Spike (riscv-isa-sim) (ISS, CPI=1)
    • Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
    • Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Toolchains

Models

Frameworks

  • MLonMCU: develop
  • TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130

Miscellaneous

  • Compiled with the -Os flag.
  • Benchmarks were generated with the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -Os, Target: spike_rv32)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 39294217 (0.4x) | 146784 (0.86) | 36060 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 33486320 (0.5x) | 152822 (0.895) | 36068 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31693648 (0.5x) | 152822 (0.895) | 36068 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30797312 (0.5x) | 152822 (0.895) | 36068 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30349144 (0.5x) | 152822 (0.895) | 36068 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30352533 (0.5x) | 152822 (0.895) | 36068 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30352533 (0.5x) | 152822 (0.895) | 36068 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 15086527 (Base) | 170734 (Base) | 36060 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 6142945 (2.5x) | 181802 (1.065) | 36068 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5140254 (2.9x) | 181802 (1.065) | 36068 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4639006 (3.3x) | 181802 (1.065) | 36068 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4351675 (3.5x) | 181802 (1.065) | 36068 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4351675 (3.5x) | 181802 (1.065) | 36068 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4355064 (3.5x) | 181802 (1.065) | 36068 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4077937 (3.7x) | 171946 (1.007) | 36060 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2822959 (5.3x) | 171946 (1.007) | 36060 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2145641 (7.0x) | 171946 (1.007) | 36060 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2104653 (7.2x) | 171946 (1.007) | 36060 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2108042 (7.2x) | 171946 (1.007) | 36060 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2111431 (7.1x) | 171946 (1.007) | 36060 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
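
The page does not define how the Speedup and rel. values are derived. The numbers are consistent with simple ratios against the row marked Base (muRISCV-NN, Scalar, RV32GC): speedup as base cycles divided by the row's cycles, and rel. as the row's ROM or RAM divided by the base value. The sketch below reproduces two aww rows under that assumption; the names `BASE` and `relative_metrics` are hypothetical and are not part of MLonMCU.

```python
# Base row of the aww table (muRISCV-NN, Scalar, RV32GC), copied from above.
BASE = {"cycles": 15086527, "rom": 170734, "ram": 36060}


def relative_metrics(cycles, rom, ram, base=BASE):
    """Return (speedup, rom_rel, ram_rel) relative to the base row.

    Assumption: speedup = base cycles / cycles, rel. = value / base value,
    rounded the way the tables appear to be (1 and 3 decimal places).
    """
    speedup = round(base["cycles"] / cycles, 1)
    rom_rel = round(rom / base["rom"], 3)
    ram_rel = round(ram / base["ram"], 3)
    return speedup, rom_rel, ram_rel


# muRISCV-NN Vector kernels at VLEN=512
print(relative_metrics(2145641, 171946, 36060))   # (7.0, 1.007, 1.0)
# TFLM reference kernels on RV32GC
print(relative_metrics(39294217, 146784, 36060))  # (0.4, 0.86, 1.0)
```

Because Spike is run with CPI=1, the cycle counts equal retired instruction counts, so the speedups compare instruction counts rather than timing on real hardware.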

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 121052840 (0.5x) | 188506 (0.921) | 68828 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 56885482 (1.0x) | 194494 (0.951) | 68836 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 47128826 (1.2x) | 194494 (0.951) | 68836 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44781922 (1.3x) | 194494 (0.951) | 68836 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44348118 (1.3x) | 194494 (0.951) | 68836 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44351507 (1.3x) | 194494 (0.951) | 68836 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44351507 (1.3x) | 194494 (0.951) | 68836 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 56361734 (Base) | 204622 (Base) | 68824 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 26153796 (2.2x) | 215920 (1.055) | 68832 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 18289009 (3.1x) | 215920 (1.055) | 68832 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14602473 (3.9x) | 215920 (1.055) | 68832 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 12926434 (4.4x) | 215920 (1.055) | 68832 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 12107234 (4.7x) | 215920 (1.055) | 68832 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 11496223 (4.9x) | 215920 (1.055) | 68832 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 15252556 (3.7x) | 206548 (1.009) | 68824 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 9669360 (5.8x) | 206548 (1.009) | 68824 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 7123522 (7.9x) | 206548 (1.009) | 68824 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 5881394 (9.6x) | 206548 (1.009) | 68824 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 4956519 (11.4x) | 206548 (1.009) | 68824 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 4710816 (12.0x) | 206548 (1.009) | 68824 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2797025 (0.6x) | 342072 (0.983) | 19364 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 898484 (1.9x) | 344098 (0.989) | 19364 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 668212 (2.5x) | 344098 (0.989) | 19364 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 553076 (3.0x) | 344098 (0.989) | 19364 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 495508 (3.4x) | 344098 (0.989) | 19364 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 466724 (3.6x) | 344098 (0.989) | 19364 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 466529 (3.6x) | 344098 (0.989) | 19364 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 1683764 (Base) | 347942 (Base) | 19364 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 594867 (2.8x) | 352042 (1.012) | 19364 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 480947 (3.5x) | 352042 (1.012) | 19364 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 423987 (4.0x) | 352042 (1.012) | 19364 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 395507 (4.3x) | 352042 (1.012) | 19364 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 381267 (4.4x) | 352042 (1.012) | 19364 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 379483 (4.4x) | 352042 (1.012) | 19364 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 1940438 (0.9x) | 349462 (1.004) | 19364 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1823758 (0.9x) | 349462 (1.004) | 19364 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1765418 (1.0x) | 349462 (1.004) | 19364 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1736572 (1.0x) | 349462 (1.004) | 19364 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1732941 (1.0x) | 349462 (1.004) | 19364 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1731085 (1.0x) | 349462 (1.004) | 19364 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 103593216 (0.4x) | 420426 (0.946) | 134364 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 72363231 (0.6x) | 426464 (0.96) | 134372 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 67201823 (0.7x) | 426464 (0.96) | 134372 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 64750143 (0.7x) | 426464 (0.96) | 134372 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63717839 (0.7x) | 426464 (0.96) | 134372 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63301844 (0.7x) | 426464 (0.96) | 134372 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63269560 (0.7x) | 426464 (0.96) | 134372 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 45186114 (Base) | 444376 (Base) | 134364 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 19572597 (2.3x) | 455444 (1.025) | 134372 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 16706197 (2.7x) | 455444 (1.025) | 134372 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 15333013 (2.9x) | 455444 (1.025) | 134372 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14832162 (3.0x) | 455444 (1.025) | 134372 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14625642 (3.1x) | 455444 (1.025) | 134372 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14611595 (3.1x) | 455444 (1.025) | 134372 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 13372108 (3.4x) | 445588 (1.003) | 134364 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 10051108 (4.5x) | 445588 (1.003) | 134364 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8759680 (5.2x) | 445588 (1.003) | 134364 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8255690 (5.5x) | 445588 (1.003) | 134364 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8211310 (5.5x) | 445588 (1.003) | 134364 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8214699 (5.5x) | 445588 (1.003) | 134364 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Original data

Click here to download the raw files for this benchmark.