Benchmarks CUSTOM TFLM GCC Os - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

Toolchains

Models

Package Versions

  • MLonMCU: main
  • TFLM: main
  • Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
  • Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • All builds were compiled with the -Os flag (optimize for size).
  • Benchmarks were generated with the MLonMCU deployment tool with minimal manual effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -Os)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 174727877 (0.1x) | 132543 (0.884) | 36204 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727829 (0.1x) | 132549 (0.885) | 36204 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 157574076 (0.1x) | 144914 (0.967) | 36148 (0.998) | 128 | TFLM | Reference | RV32GCP | False | - |
| 16660013 (Base) | 149852 (Base) | 36212 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16660013 (1.0x) | 149854 (1.0) | 36212 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 4113193 (4.1x) | 151032 (1.008) | 36212 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2845137 (5.9x) | 151032 (1.008) | 36212 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2156449 (7.7x) | 151032 (1.008) | 36212 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2114737 (7.9x) | 151032 (1.008) | 36212 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2114737 (7.9x) | 151032 (1.008) | 36212 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2118126 (7.9x) | 151032 (1.008) | 36212 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 13526056 (1.2x) | 161456 (1.077) | 36156 (0.998) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 15955430 (1.0x) | 163734 (1.093) | 36156 (0.998) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
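
The Speedup and rel. values appear to be ratios against the row marked Base (muRISCV-NN scalar kernels on RV32GC): speedup as base cycles divided by the row's cycles, and rel. as the row's memory divided by the base memory. This is inferred from the numbers above rather than stated on the page, and the helper names in this sketch are hypothetical:

```python
# Baseline values copied from the Audio Wake Words (aww) row marked "Base"
# (muRISCV-NN, Scalar, RV32GC).
BASE = {"cycles": 16660013, "rom": 149852, "ram": 36212}


def speedup(cycles: int, base_cycles: int = BASE["cycles"]) -> float:
    """Assumed definition: base cycles / cycles; above 1.0 means faster."""
    return round(base_cycles / cycles, 1)


def relative(value: int, base_value: int) -> float:
    """Assumed definition: value / base value; below 1.0 means smaller."""
    return round(value / base_value, 3)


# muRISCV-NN Vector kernels on RV32GCV with VLEN = 128
print(f"{speedup(4113193)}x")         # 4.1x
print(relative(151032, BASE["rom"]))  # 1.008

# TFLM Reference kernels on RV32GC
print(relative(132543, BASE["rom"]))  # 0.884
```

Under this reading, a speedup of 0.1x for the TFLM reference kernels means they need roughly ten times as many cycles as the muRISCV-NN scalar baseline.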

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 745826315 (0.1x) | 173133 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826267 (0.1x) | 173149 (0.939) | 68968 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 697937407 (0.1x) | 185404 (1.006) | 68912 (0.999) | 128 | TFLM | Reference | RV32GCP | False | - |
| 81003295 (Base) | 184350 (Base) | 68960 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003565 (1.0x) | 184378 (1.0) | 68960 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15486943 (5.2x) | 186362 (1.011) | 68960 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 9799935 (8.3x) | 186362 (1.011) | 68960 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 7206287 (11.2x) | 186362 (1.011) | 68960 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 5940767 (13.6x) | 186362 (1.011) | 68960 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 4999076 (16.2x) | 186362 (1.011) | 68960 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 4748549 (17.1x) | 186362 (1.011) | 68960 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 62985030 (1.3x) | 196058 (1.064) | 68904 (0.999) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 68447304 (1.2x) | 198930 (1.079) | 68904 (0.999) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
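
One way to read the muRISCV-NN Vector rows is to ask how much each doubling of VLEN still buys. The sketch below uses the resnet cycle counts from the table above; the function name `doubling_gains` is hypothetical, and the ratio it computes is just one possible summary, not a metric defined by the benchmark.

```python
# Cycle counts of the muRISCV-NN Vector rows in the resnet table, keyed by VLEN.
VECTOR_CYCLES = {
    128: 15486943,
    256: 9799935,
    512: 7206287,
    1024: 5940767,
    2048: 4999076,
    4096: 4748549,
}


def doubling_gains(cycles_by_vlen: dict[int, int]) -> dict[int, float]:
    """Map each VLEN to the cycle ratio versus the next-smaller VLEN.

    A value of 2.0 would mean that doubling VLEN halved the cycle count;
    a value near 1.0 means the wider vectors brought almost no benefit.
    """
    vlens = sorted(cycles_by_vlen)
    return {
        wide: round(cycles_by_vlen[narrow] / cycles_by_vlen[wide], 2)
        for narrow, wide in zip(vlens, vlens[1:])
    }


for vlen, gain in doubling_gains(VECTOR_CYCLES).items():
    print(f"VLEN {vlen:>4}: {gain}x fewer cycles than VLEN {vlen // 2}")
```

For these numbers every ratio stays below 2.0 and shrinks as VLEN grows, which matches the flattening speedup column in the table.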

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 3106372 (0.6x) | 334042 (0.991) | 19432 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106372 (0.6x) | 334048 (0.991) | 19432 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3121245 (0.6x) | 346402 (1.027) | 19380 (0.997) | 128 | TFLM | Reference | RV32GCP | False | - |
| 1789745 (Base) | 337180 (Base) | 19432 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789745 (1.0x) | 337182 (1.0) | 19432 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 584271 (3.1x) | 338094 (1.003) | 19432 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 465903 (3.8x) | 338094 (1.003) | 19432 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 406719 (4.4x) | 338094 (1.003) | 19432 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 377463 (4.7x) | 338094 (1.003) | 19432 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 373779 (4.8x) | 338094 (1.003) | 19432 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 371895 (4.8x) | 338094 (1.003) | 19432 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 1631834 (1.1x) | 349340 (1.036) | 19380 (0.997) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 959447 (1.9x) | 350936 (1.041) | 19380 (0.997) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 495297621 (0.1x) | 406249 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495297621 (0.1x) | 406255 (0.959) | 134520 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 445917090 (0.1x) | 418618 (0.988) | 134464 (1.0) | 128 | TFLM | Reference | RV32GCP | False | - |
| 49691399 (Base) | 423556 (Base) | 134528 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691399 (1.0x) | 423558 (1.0) | 134528 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 13486989 (3.7x) | 424736 (1.003) | 134528 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 10158361 (4.9x) | 424736 (1.003) | 134528 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8869239 (5.6x) | 424736 (1.003) | 134528 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8364740 (5.9x) | 424736 (1.003) | 134528 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8316126 (6.0x) | 424736 (1.003) | 134528 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8319515 (6.0x) | 424736 (1.003) | 134528 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 40776459 (1.2x) | 435160 (1.027) | 134472 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 49187703 (1.0x) | 437438 (1.033) | 134472 (1.0) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Original data

Click here to download the raw files for this benchmark.