Benchmarks 2024 03 02 TFLM LLVM Os - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

Toolchains

  • LLVM/Clang:
    • TODO: Version
    • Linker: lld (TODO)
    • RISC-V GCC (for headers, libc, ...)

Models

Package Versions

  • MLonMCU: main
  • TFLM: main
  • Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
  • Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • All binaries were compiled with the -Os flag (optimize for size).
  • Benchmarks were generated using the MLonMCU deployment tool with minimal manual effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -Os)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 43866579 (0.4x) | 146818 (0.865) | 36124 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 34218522 (0.5x) | 152592 (0.899) | 36132 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 32425850 (0.5x) | 152592 (0.899) | 36132 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31529514 (0.5x) | 152592 (0.899) | 36132 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31081346 (0.5x) | 152592 (0.899) | 36132 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31084735 (0.5x) | 152592 (0.899) | 36132 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31088124 (0.5x) | 152592 (0.899) | 36132 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 15578007 (Base) | 169664 (Base) | 36124 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 6156569 (2.5x) | 180672 (1.065) | 36132 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5137981 (3.0x) | 180672 (1.065) | 36132 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4628797 (3.4x) | 180672 (1.065) | 36132 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4333530 (3.6x) | 180672 (1.065) | 36132 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4336919 (3.6x) | 180672 (1.065) | 36132 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4340308 (3.6x) | 180672 (1.065) | 36132 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 4148823 (3.8x) | 169846 (1.001) | 36124 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2888101 (5.4x) | 169846 (1.001) | 36124 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2207783 (7.1x) | 169846 (1.001) | 36124 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2165295 (7.2x) | 169846 (1.001) | 36124 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2168684 (7.2x) | 169846 (1.001) | 36124 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 2172073 (7.2x) | 169846 (1.001) | 36124 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
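
The Speedup and rel. columns appear to be ratios against the row marked Base (muRISCV-NN Scalar on RV32GC): speedup as base cycles divided by measured cycles, and relative memory as measured bytes divided by base bytes. The following minimal sketch checks that reading against a few rows of the aww table above. The function names `speedup` and `relative` are illustrative only and are not part of MLonMCU.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Speedup relative to the baseline; values above 1 are faster than Base."""
    return base_cycles / cycles


def relative(base_bytes: int, size_bytes: int) -> float:
    """Footprint relative to the baseline; values above 1 use more memory."""
    return size_bytes / base_bytes


# Values copied from the Audio Wake Words (aww) table.
BASE_CYCLES = 15578007  # muRISCV-NN Scalar, RV32GC (the Base row)
BASE_ROM = 169664       # Total ROM of the Base row, in bytes

rows = [
    # (label, cycles, total ROM in bytes)
    ("TFLM Reference, RV32GC", 43866579, 146818),
    ("muRISCV-NN Scalar, VLEN 128", 6156569, 180672),
    ("muRISCV-NN Vector, VLEN 1024", 2165295, 169846),
]

for label, cycles, rom in rows:
    s = speedup(BASE_CYCLES, cycles)
    r = relative(BASE_ROM, rom)
    print(f"{label}: {s:.1f}x, ROM {r:.3f}")
```

Under this reading, a speedup below 1.0x (as for the TFLM Reference rows) means the configuration is slower than the muRISCV-NN scalar baseline.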

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 134767818 (0.4x) | 188450 (0.926) | 68892 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 58923571 (1.0x) | 194366 (0.955) | 68900 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 49166915 (1.2x) | 194366 (0.955) | 68900 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 46820011 (1.2x) | 194366 (0.955) | 68900 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 46386207 (1.3x) | 194366 (0.955) | 68900 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 46389596 (1.3x) | 194366 (0.955) | 68900 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 46392985 (1.3x) | 194366 (0.955) | 68900 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 58410132 (Base) | 203488 (Base) | 68888 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 28265873 (2.1x) | 214948 (1.056) | 68896 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 19614629 (3.0x) | 214948 (1.056) | 68896 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 15559453 (3.8x) | 214948 (1.056) | 68896 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 13715478 (4.3x) | 214948 (1.056) | 68896 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 12817747 (4.6x) | 214948 (1.056) | 68896 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 12145296 (4.8x) | 214948 (1.056) | 68896 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 15338238 (3.8x) | 204360 (1.004) | 68888 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 9753250 (6.0x) | 204360 (1.004) | 68888 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 7206516 (8.1x) | 204360 (1.004) | 68888 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 5963940 (9.8x) | 204360 (1.004) | 68888 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 5038841 (11.6x) | 204360 (1.004) | 68888 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 4793026 (12.2x) | 204360 (1.004) | 68888 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
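
One way to read the VLEN sweep is to ask how much each doubling of the vector length still buys. The sketch below computes that step-by-step ratio for the muRISCV-NN Vector rows of the resnet table above. The helper `doubling_gains` is a hypothetical name introduced for illustration, and the cycle counts are copied directly from the table.

```python
# Cycle counts copied from the muRISCV-NN Vector rows of the resnet table.
cycles_by_vlen = {
    128: 15338238,
    256: 9753250,
    512: 7206516,
    1024: 5963940,
    2048: 5038841,
    4096: 4793026,
}


def doubling_gains(cycles: dict[int, int]) -> dict[int, float]:
    """Map each VLEN to the cycle ratio gained over the next smaller VLEN."""
    vlens = sorted(cycles)
    return {
        wide: cycles[narrow] / cycles[wide]
        for narrow, wide in zip(vlens, vlens[1:])
    }


for vlen, gain in doubling_gains(cycles_by_vlen).items():
    print(f"VLEN {vlen // 2} -> {vlen}: {gain:.2f}x")
```

For these rows the ratio shrinks with every step, so the benefit of wider vectors diminishes as VLEN grows. The same helper can be applied to the other tables by swapping in their cycle counts.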

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 3076916 (0.6x) | 342268 (0.988) | 19376 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 906851 (1.9x) | 344048 (0.993) | 19376 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 676579 (2.5x) | 344048 (0.993) | 19376 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 561443 (3.0x) | 344048 (0.993) | 19376 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 503875 (3.4x) | 344048 (0.993) | 19376 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 475091 (3.6x) | 344048 (0.993) | 19376 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 471507 (3.6x) | 344048 (0.993) | 19376 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 1693264 (Base) | 346482 (Base) | 19376 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 594682 (2.8x) | 349764 (1.009) | 19376 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 480762 (3.5x) | 349764 (1.009) | 19376 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 423802 (4.0x) | 349764 (1.009) | 19376 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 395322 (4.3x) | 349764 (1.009) | 19376 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 381082 (4.4x) | 349764 (1.009) | 19376 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 379298 (4.5x) | 349764 (1.009) | 19376 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 594835 (2.8x) | 347090 (1.002) | 19376 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 478155 (3.5x) | 347090 (1.002) | 19376 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 419815 (4.0x) | 347090 (1.002) | 19376 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 390969 (4.3x) | 347090 (1.002) | 19376 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 387338 (4.4x) | 347090 (1.002) | 19376 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 385482 (4.4x) | 347090 (1.002) | 19376 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 114181496 (0.4x) | 420578 (0.948) | 134452 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 72046415 (0.6x) | 426304 (0.961) | 134460 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 66885007 (0.7x) | 426304 (0.961) | 134460 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 64433327 (0.7x) | 426304 (0.961) | 134460 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63401023 (0.7x) | 426304 (0.961) | 134460 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 62985028 (0.7x) | 426304 (0.961) | 134460 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 62956133 (0.7x) | 426304 (0.961) | 134460 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 46664886 (Base) | 443424 (Base) | 134452 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 19578144 (2.4x) | 454368 (1.025) | 134460 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 16674880 (2.8x) | 454368 (1.025) | 134460 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 15283264 (3.1x) | 454368 (1.025) | 134460 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14782413 (3.2x) | 454368 (1.025) | 134460 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14579282 (3.2x) | 454368 (1.025) | 134460 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 14565235 (3.2x) | 454368 (1.025) | 134460 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 13654131 (3.4x) | 443598 (1.0) | 134452 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 10321491 (4.5x) | 443598 (1.0) | 134452 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 9025107 (5.2x) | 443598 (1.0) | 134452 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8519591 (5.5x) | 443598 (1.0) | 134452 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8474590 (5.5x) | 443598 (1.0) | 134452 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8477979 (5.5x) | 443598 (1.0) | 134452 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |

Original data

Click here to download the raw files for this benchmark.