Benchmarks 2024 06 29 TFLM GCC Os - tum-ei-eda/muriscv-nn GitHub Wiki

Setup

Simulator

Toolchains

Models

Package Versions

  • MLonMCU : main

  • TFLM : main

  • Spike : 0bc176b3fca43560b9e8586cdbc41cfde073e17a

  • Spike PK : 7e9b671c0415dfd7b562ac934feb9380075d4aa2

Miscellaneous

  • Compiled with the -Os flag (optimize for size).
  • Benchmarks were generated with the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -Os)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 174727666 (0.1x) | 132619 (0.885) | 36204 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 157574046 (0.1x) | 145020 (0.967) | 36152 (0.998) | 128 | TFLM | Reference | RV32GCP | False | - |
| 16654909 (Base) | 149928 (Base) | 36212 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 4113222 (4.0x) | 151108 (1.008) | 36212 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2845166 (5.9x) | 151108 (1.008) | 36212 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2156478 (7.7x) | 151108 (1.008) | 36212 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2114766 (7.9x) | 151108 (1.008) | 36212 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2114766 (7.9x) | 151108 (1.008) | 36212 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 2118155 (7.9x) | 151108 (1.008) | 36212 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 13526014 (1.2x) | 161562 (1.078) | 36160 (0.999) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 15950339 (1.0x) | 163840 (1.093) | 36160 (0.999) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
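
The parenthesized values in the table above appear to be derived from the row marked "Base" (muRISCV-NN, Scalar, RV32GC). The speedup matches base cycles divided by the row's cycles, and the ROM/RAM figures match the row's size divided by the base size. The following is a minimal sketch of that arithmetic, not the MLonMCU implementation. The function name `relative_metrics` and the rounding choices are assumptions made for illustration.

```python
# Base row of the aww table (muRISCV-NN, Scalar, RV32GC), copied from above.
BASE = {"cycles": 16654909, "rom": 149928, "ram": 36212}


def relative_metrics(cycles, rom, ram, base=BASE):
    """Return (speedup, rel_rom, rel_ram) relative to the base row.

    Speedup is base/row (higher is better); the memory ratios are
    row/base (lower is better). Rounding mirrors the table's precision.
    """
    speedup = round(base["cycles"] / cycles, 1)
    rel_rom = round(rom / base["rom"], 3)
    rel_ram = round(ram / base["ram"], 3)
    return speedup, rel_rom, rel_ram


# muRISCV-NN Vector kernels at VLEN=512 (values from the aww table).
print(relative_metrics(2156478, 151108, 36212))
```

For the VLEN=512 vector row this reproduces the 7.7x speedup and the 1.008 relative ROM size listed in the table.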

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 745826110 (0.1x) | 173209 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 697937374 (0.1x) | 185510 (1.006) | 68916 (0.999) | 128 | TFLM | Reference | RV32GCP | False | - |
| 81003736 (Base) | 184426 (Base) | 68960 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 15486978 (5.2x) | 186438 (1.011) | 68960 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 9799970 (8.3x) | 186438 (1.011) | 68960 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 7206322 (11.2x) | 186438 (1.011) | 68960 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 5940802 (13.6x) | 186438 (1.011) | 68960 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 4999111 (16.2x) | 186438 (1.011) | 68960 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 4748584 (17.1x) | 186438 (1.011) | 68960 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 62985081 (1.3x) | 196164 (1.064) | 68908 (0.999) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 68452416 (1.2x) | 199036 (1.079) | 68908 (0.999) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 3106395 (0.6x) | 334118 (0.991) | 19448 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 3121206 (0.6x) | 346510 (1.027) | 19384 (0.997) | 128 | TFLM | Reference | RV32GCP | False | - |
| 1789685 (Base) | 337256 (Base) | 19448 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 584299 (3.1x) | 338170 (1.003) | 19448 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 465931 (3.8x) | 338170 (1.003) | 19448 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 406747 (4.4x) | 338170 (1.003) | 19448 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 377491 (4.7x) | 338170 (1.003) | 19448 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 373807 (4.8x) | 338170 (1.003) | 19448 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 371923 (4.8x) | 338170 (1.003) | 19448 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 1631783 (1.1x) | 349446 (1.036) | 19384 (0.997) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 959480 (1.9x) | 351042 (1.041) | 19384 (0.997) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 495292368 (0.1x) | 406325 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
| 495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
| 445917146 (0.1x) | 418724 (0.988) | 134468 (1.0) | 128 | TFLM | Reference | RV32GCP | False | - |
| 49691563 (Base) | 423632 (Base) | 134528 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
| 49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
| 13487052 (3.7x) | 424812 (1.003) | 134528 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
| 10158424 (4.9x) | 424812 (1.003) | 134528 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8869302 (5.6x) | 424812 (1.003) | 134528 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8364803 (5.9x) | 424812 (1.003) | 134528 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8316189 (6.0x) | 424812 (1.003) | 134528 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
| 8319578 (6.0x) | 424812 (1.003) | 134528 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
| 40776347 (1.2x) | 435266 (1.027) | 134476 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
| 49192857 (1.0x) | 437544 (1.033) | 134476 (1.0) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
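
The vector-mode rows in the tables above show how cycle counts change as VLEN grows. One way to read them is the gain from each VLEN doubling, meaning cycles at the previous VLEN divided by cycles at the current one. The sketch below computes this for the aww and resnet vector rows, with cycle counts copied from the tables. The helper name `doubling_gains` is hypothetical and not part of MLonMCU or muRISCV-NN.

```python
# Vector-mode cycle counts per VLEN, copied from the aww and resnet tables.
VECTOR_CYCLES = {
    "aww": {128: 4113222, 256: 2845166, 512: 2156478,
            1024: 2114766, 2048: 2114766, 4096: 2118155},
    "resnet": {128: 15486978, 256: 9799970, 512: 7206322,
               1024: 5940802, 2048: 4999111, 4096: 4748584},
}


def doubling_gains(cycles_by_vlen):
    """Map each VLEN (except the smallest) to cycles(previous VLEN) / cycles(VLEN).

    A value above 1.0 means the larger VLEN ran in fewer cycles.
    """
    vlens = sorted(cycles_by_vlen)
    return {
        cur: cycles_by_vlen[prev] / cycles_by_vlen[cur]
        for prev, cur in zip(vlens, vlens[1:])
    }


for name, cycles in VECTOR_CYCLES.items():
    gains = doubling_gains(cycles)
    print(name, {vlen: round(gain, 2) for vlen, gain in gains.items()})
```

On this data, aww stops improving beyond VLEN=1024 (the 2048 row has the same cycle count and the 4096 row is slightly slower), while resnet still gains at every doubling up to 4096.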

Original data

The raw files for this benchmark are available for download from the wiki page.