Benchmarks 2024 06 29 TFLM GCC Os - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- RISC-V GCC:
  - Scalar: TODO: version & url
  - Vector: TODO: version & url
  - Packed: self-compiled using the patches from https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TFLM: main
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- Used the -Os flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -Os)
Audio Wake Words (aww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
174727666 (0.1x) | 132619 (0.885) | 36204 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727618 (0.1x) | 132625 (0.885) | 36204 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
157574046 (0.1x) | 145020 (0.967) | 36152 (0.998) | 128 | TFLM | Reference | RV32GCP | False | - |
16654909 (Base) | 149928 (Base) | 36212 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16654909 (1.0x) | 149930 (1.0) | 36212 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
4113222 (4.0x) | 151108 (1.008) | 36212 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
2845166 (5.9x) | 151108 (1.008) | 36212 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
2156478 (7.7x) | 151108 (1.008) | 36212 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
2114766 (7.9x) | 151108 (1.008) | 36212 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
2114766 (7.9x) | 151108 (1.008) | 36212 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
2118155 (7.9x) | 151108 (1.008) | 36212 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
13526014 (1.2x) | 161562 (1.078) | 36160 (0.999) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
15950339 (1.0x) | 163840 (1.093) | 36160 (0.999) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
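
The parenthesised values in the Audio Wake Words table appear to be computed against the row marked Base (muRISCV-NN, Scalar, RV32GC): speedup as base cycles divided by row cycles, and the ROM/RAM figures as row value divided by base value. The sketch below reproduces two rows under that assumption; the `Row` class and `relative_to_base` helper are hypothetical names introduced for illustration and are not part of MLonMCU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One benchmark result: cycle count plus memory footprint in bytes."""
    cycles: int
    rom: int
    ram: int


def relative_to_base(row: Row, base: Row) -> tuple[float, float, float]:
    """Return (speedup, rom_rel, ram_rel) of `row` relative to `base`.

    Assumed definitions (inferred from the table, not stated by the source):
      speedup = base.cycles / row.cycles  (values above 1 are faster than base)
      rel     = row.value / base.value    (values above 1 use more memory)
    """
    speedup = base.cycles / row.cycles
    rom_rel = row.rom / base.rom
    ram_rel = row.ram / base.ram
    return round(speedup, 1), round(rom_rel, 3), round(ram_rel, 3)


# Values copied from the Audio Wake Words table.
base = Row(cycles=16654909, rom=149928, ram=36212)       # muRISCV-NN Scalar, RV32GC
vector_128 = Row(cycles=4113222, rom=151108, ram=36212)  # muRISCV-NN Vector, VLEN=128
tflm_ref = Row(cycles=174727666, rom=132619, ram=36204)  # TFLM Reference, RV32GC

print(relative_to_base(vector_128, base))  # (4.0, 1.008, 1.0)
print(relative_to_base(tflm_ref, base))    # (0.1, 0.885, 1.0)
```

Both results match the table's 4.0x / 1.008 / 1.0 and 0.1x / 0.885 / 1.0 entries, which supports the assumed definitions. Note that rounding to one decimal hides most of the gap between the TFLM reference kernels and the baseline: the unrounded speedup for that row is roughly 0.095.
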
Image Classification (resnet)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
745826110 (0.1x) | 173209 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826062 (0.1x) | 173225 (0.939) | 68968 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
697937374 (0.1x) | 185510 (1.006) | 68916 (0.999) | 128 | TFLM | Reference | RV32GCP | False | - |
81003736 (Base) | 184426 (Base) | 68960 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81003608 (1.0x) | 184454 (1.0) | 68960 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
15486978 (5.2x) | 186438 (1.011) | 68960 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
9799970 (8.3x) | 186438 (1.011) | 68960 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
7206322 (11.2x) | 186438 (1.011) | 68960 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
5940802 (13.6x) | 186438 (1.011) | 68960 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
4999111 (16.2x) | 186438 (1.011) | 68960 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
4748584 (17.1x) | 186438 (1.011) | 68960 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
62985081 (1.3x) | 196164 (1.064) | 68908 (0.999) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
68452416 (1.2x) | 199036 (1.079) | 68908 (0.999) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
Anomaly Detection (toycar)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
3106395 (0.6x) | 334118 (0.991) | 19448 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106395 (0.6x) | 334124 (0.991) | 19448 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3121206 (0.6x) | 346510 (1.027) | 19384 (0.997) | 128 | TFLM | Reference | RV32GCP | False | - |
1789685 (Base) | 337256 (Base) | 19448 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789685 (1.0x) | 337258 (1.0) | 19448 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
584299 (3.1x) | 338170 (1.003) | 19448 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
465931 (3.8x) | 338170 (1.003) | 19448 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
406747 (4.4x) | 338170 (1.003) | 19448 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
377491 (4.7x) | 338170 (1.003) | 19448 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
373807 (4.8x) | 338170 (1.003) | 19448 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
371923 (4.8x) | 338170 (1.003) | 19448 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
1631783 (1.1x) | 349446 (1.036) | 19384 (0.997) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
959480 (1.9x) | 351042 (1.041) | 19384 (0.997) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
Visual Wake Words (vww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
495292368 (0.1x) | 406325 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GC | False | - |
495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495292306 (0.1x) | 406331 (0.959) | 134520 (1.0) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
445917146 (0.1x) | 418724 (0.988) | 134468 (1.0) | 128 | TFLM | Reference | RV32GCP | False | - |
49691563 (Base) | 423632 (Base) | 134528 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | False | - |
49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49691563 (1.0x) | 423634 (1.0) | 134528 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
13487052 (3.7x) | 424812 (1.003) | 134528 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
10158424 (4.9x) | 424812 (1.003) | 134528 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
8869302 (5.6x) | 424812 (1.003) | 134528 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
8364803 (5.9x) | 424812 (1.003) | 134528 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
8316189 (6.0x) | 424812 (1.003) | 134528 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
8319578 (6.0x) | 424812 (1.003) | 134528 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
40776347 (1.2x) | 435266 (1.027) | 134476 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCP | False | - |
49192857 (1.0x) | 437544 (1.033) | 134476 (1.0) | 128 | muRISCV-NN | Packed | RV32GCP | False | - |
Original data
Click here to download the raw files for this benchmark.