Benchmarks 2024 02 11 TFLM LLVM - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

Toolchains

  • LLVM/Clang:
    • TODO: Version
    • Linker: lld (TODO)
    • RISC-V GCC for headers, libc, ...

Models

Package Versions

  • MLonMCU : main

  • TFLM : a549448bb234cf3fed15ad5dabf83d06f82326ce

  • Spike : 0bc176b3fca43560b9e8586cdbc41cfde073e17a

  • Spike PK : 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • Compiled with the -Os flag.
  • Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: llvm)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 43849979 (0.4x) | 146710 (0.863) | 36124 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 34194781 (0.5x) | 152484 (0.897) | 36132 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 31057605 (0.5x) | 152484 (0.897) | 36132 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 15555116 (Base) | 169970 (Base) | 36124 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 6141153 (2.5x) | 181156 (1.066) | 36132 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 4318114 (3.6x) | 181156 (1.066) | 36132 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 6656543 (2.3x) | 169584 (0.998) | 36124 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 2530451 (6.1x) | 169584 (0.998) | 36124 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
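
The derived columns in the table above appear to follow directly from the raw values: the speedup is the base cycle count divided by the row's cycle count, and the relative ROM/RAM figures are the row's value divided by the base value. The following is a minimal Python sketch under that assumption. The helper names `speedup` and `relative` are hypothetical and not part of MLonMCU.

```python
# Base row of the aww table: muRISCV-NN, Scalar mode, RV32GC, no auto-vectorization.
BASE = {"cycles": 15555116, "rom": 169970, "ram": 36124}


def speedup(cycles, base_cycles=BASE["cycles"]):
    """Speedup relative to the base row (assumed: base cycles / row cycles)."""
    return base_cycles / cycles


def relative(value, base_value):
    """Relative memory usage (assumed: row value / base value)."""
    return value / base_value


# muRISCV-NN, Vector mode, VLEN=1024 row of the aww table.
row = {"cycles": 2530451, "rom": 169584, "ram": 36124}

print(f"{speedup(row['cycles']):.1f}x")              # 6.1x
print(f"{relative(row['rom'], BASE['rom']):.3f}")    # 0.998
print(f"{relative(row['ram'], BASE['ram']):.1f}")    # 1.0
```

Values above 1.0x in the speedup column mean fewer cycles than the base row, while values above 1.0 in the memory columns mean a larger footprint than the base row.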

Notes

  • LLVM's auto-vectorizer (Scalar mode) outperforms muRISCV-NN (Vector mode) for VLEN=128 (6141153 vs. 6656543 cycles).
  • TODO: Check other VLENs (256, 512, 2048).

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 134756807 (0.4x) | 188332 (0.924) | 68892 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 58895120 (1.0x) | 194258 (0.953) | 68900 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 46357756 (1.3x) | 194258 (0.953) | 68900 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 58391132 (Base) | 203794 (Base) | 68888 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 28233936 (2.1x) | 215430 (1.057) | 68896 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 13683415 (4.3x) | 215430 (1.057) | 68896 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 27964207 (2.1x) | 204102 (1.002) | 68888 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 8022385 (7.3x) | 204102 (1.002) | 68888 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 3052975 (0.6x) | 342160 (0.987) | 19376 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 895486 (2.1x) | 343940 (0.992) | 19376 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 492510 (3.7x) | 343940 (0.992) | 19376 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 1846118 (Base) | 346742 (Base) | 19376 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 620218 (3.0x) | 350290 (1.01) | 19376 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 387846 (4.8x) | 350290 (1.01) | 19376 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 591772 (3.1x) | 346750 (1.0) | 19376 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 418324 (4.4x) | 346750 (1.0) | 19376 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |

Notes

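  • LLVM's auto-vectorizer (Scalar mode) outperforms muRISCV-NN (Vector mode) for VLEN=1024 (387846 vs. 418324 cycles).
  • TODO: Check other VLENs (256, 512, 2048).
  • Auto-vectorization is very effective even for the TFLM reference kernels: 3052975 cycles without it versus 895486 (VLEN=128) and 492510 (VLEN=1024) with it.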

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
|---|---|---|---|---|---|---|---|
| 114164825 (0.4x) | 420470 (0.948) | 134452 (1.0) | 0 | TFLM | Reference | RV32GC | - |
| 72029724 (0.6x) | 426196 (0.96) | 134460 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
| 63384332 (0.7x) | 426196 (0.96) | 134460 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
| 46651565 (Base) | 443730 (Base) | 134452 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
| 19564767 (2.4x) | 454852 (1.025) | 134460 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 14769036 (3.2x) | 454852 (1.025) | 134460 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
| 21026824 (2.2x) | 443336 (0.999) | 134452 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
| 10339644 (4.5x) | 443336 (0.999) | 134452 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |

Notes

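  • Same observation as for aww: LLVM's auto-vectorizer (Scalar mode) outperforms muRISCV-NN (Vector mode) for VLEN=128 (19564767 vs. 21026824 cycles).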
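
The observations in the Notes sections can be cross-checked against the cycle counts of all four benchmarks. The sketch below lists every (benchmark, VLEN) pair where the auto-vectorized scalar muRISCV-NN kernels need fewer cycles than the hand-written vector kernels. The numbers are copied from the tables on this page. The dictionary layout and the function name `autovec_wins` are hypothetical choices for illustration only.

```python
# (benchmark, VLEN) -> (muRISCV-NN Scalar with Loop+SLP, muRISCV-NN Vector), in cycles.
CYCLES = {
    ("aww", 128): (6141153, 6656543),
    ("aww", 1024): (4318114, 2530451),
    ("resnet", 128): (28233936, 27964207),
    ("resnet", 1024): (13683415, 8022385),
    ("toycar", 128): (620218, 591772),
    ("toycar", 1024): (387846, 418324),
    ("vww", 128): (19564767, 21026824),
    ("vww", 1024): (14769036, 10339644),
}


def autovec_wins(cycles=CYCLES):
    """Return the (benchmark, VLEN) pairs where auto-vectorization is faster."""
    return sorted(
        key for key, (autovec, vector) in cycles.items() if autovec < vector
    )


for bench, vlen in autovec_wins():
    autovec, vector = CYCLES[(bench, vlen)]
    print(f"{bench} VLEN={vlen}: {autovec} vs. {vector} cycles")
```

With the data above, this reports aww at VLEN=128, toycar at VLEN=1024, and vww at VLEN=128. In the remaining cases, including both resnet configurations, the hand-written vector kernels are faster.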

Original data

Click here to download the raw files for this benchmark.