Benchmarks 2024 02 22 TFLM GCC - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- RISC-V GCC:
- Scalar: TODO: version & url
- Vector: TODO: version & url
- Packed: self-compiled using the patches from https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TFLM: main
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- Compiled with the `-Os` flag.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: gcc)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
174698646 (0.1x) | 132407 (0.878) | 36204 (1.0) | 0 | TFLM | Reference | RV32GC | - |
174698646 (0.1x) | 132413 (0.878) | 36204 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
174698646 (0.1x) | 132413 (0.878) | 36204 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
174698646 (0.1x) | 132413 (0.878) | 36204 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
157549999 (0.1x) | 144774 (0.96) | 36148 (0.998) | 0 | TFLM | Reference | RV32GCP | - |
16643408 (Base) | 150730 (Base) | 36212 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
16643408 (1.0x) | 150736 (1.0) | 36212 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
16643408 (1.0x) | 150736 (1.0) | 36212 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
16643408 (1.0x) | 150736 (1.0) | 36212 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4519571 (3.7x) | 151794 (1.007) | 36212 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
2410011 (6.9x) | 151794 (1.007) | 36212 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
2413400 (6.9x) | 151794 (1.007) | 36212 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
13496949 (1.2x) | 162294 (1.077) | 36156 (0.998) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
15933233 (1.0x) | 164564 (1.092) | 36156 (0.998) | 0 | muRISCV-NN | Packed | RV32GCP | - |
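
The bracketed values in the table appear to be ratios against the row marked "Base" (muRISCV-NN, Scalar, RV32GC): speedup as baseline cycles divided by the row's cycles, and the ROM/RAM figures as the row's size divided by the baseline size. These definitions are inferred from the numbers, not stated on this page. A minimal Python sketch under that assumption, using a few rows copied from the Audio Wake Words table (the names `BASE`, `ROWS`, `speedup` and `relative` are illustrative only):

```python
# Values copied from the Audio Wake Words (aww) table; memory sizes in bytes.
# BASE is the row marked "Base": muRISCV-NN, Scalar, RV32GC.
BASE = {"cycles": 16643408, "rom": 150730, "ram": 36212}

ROWS = {
    "TFLM Reference RV32GC": {"cycles": 174698646, "rom": 132407, "ram": 36204},
    "muRISCV-NN Vector VLEN=128": {"cycles": 4519571, "rom": 151794, "ram": 36212},
    "muRISCV-NN Vector VLEN=1024": {"cycles": 2410011, "rom": 151794, "ram": 36212},
    "muRISCV-NN Packed RV32GCP": {"cycles": 15933233, "rom": 164564, "ram": 36156},
}


def speedup(row, base=BASE):
    """Assumed definition: baseline cycles divided by this row's cycles."""
    return base["cycles"] / row["cycles"]


def relative(row, key, base=BASE):
    """Assumed definition: this row's footprint divided by the baseline's."""
    return row[key] / base[key]


for name, row in ROWS.items():
    print(
        f"{name}: {speedup(row):.1f}x, "
        f"ROM {relative(row, 'rom'):.3f}, RAM {relative(row, 'ram'):.3f}"
    )
```

Because Spike is used as an instruction set simulator with CPI=1, the cycle counts are effectively instruction counts, so these ratios compare executed instructions, not wall-clock time on real hardware.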
Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
745801113 (0.1x) | 172997 (0.934) | 68968 (1.0) | 0 | TFLM | Reference | RV32GC | - |
745801113 (0.1x) | 173013 (0.934) | 68968 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
745801113 (0.1x) | 173013 (0.934) | 68968 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
745801113 (0.1x) | 173013 (0.934) | 68968 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
697912970 (0.1x) | 185266 (1.0) | 68912 (0.999) | 0 | TFLM | Reference | RV32GCP | - |
80992082 (Base) | 185230 (Base) | 68960 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
80992082 (1.0x) | 185262 (1.0) | 68960 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
80992082 (1.0x) | 185262 (1.0) | 68960 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
80992082 (1.0x) | 185262 (1.0) | 68960 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
16827198 (4.8x) | 187126 (1.01) | 68960 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
6662526 (12.2x) | 187126 (1.01) | 68960 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
5392484 (15.0x) | 187126 (1.01) | 68960 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
62967673 (1.3x) | 196896 (1.063) | 68904 (0.999) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
68428919 (1.2x) | 199762 (1.078) | 68904 (0.999) | 0 | muRISCV-NN | Packed | RV32GCP | - |
Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
3094956 (0.6x) | 333908 (0.989) | 19432 (1.0) | 0 | TFLM | Reference | RV32GC | - |
3094956 (0.6x) | 333914 (0.989) | 19432 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
3094956 (0.6x) | 333914 (0.989) | 19432 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
3094956 (0.6x) | 333914 (0.989) | 19432 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
3097895 (0.6x) | 346264 (1.025) | 19380 (0.997) | 0 | TFLM | Reference | RV32GCP | - |
1766145 (Base) | 337704 (Base) | 19432 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
1766145 (1.0x) | 337710 (1.0) | 19432 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
1766145 (1.0x) | 337710 (1.0) | 19432 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
1766145 (1.0x) | 337710 (1.0) | 19432 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
572955 (3.1x) | 338498 (1.002) | 19432 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
366147 (4.8x) | 338498 (1.002) | 19432 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
360579 (4.9x) | 338498 (1.002) | 19432 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
1602575 (1.1x) | 349840 (1.036) | 19380 (0.997) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
942627 (1.9x) | 351430 (1.041) | 19380 (0.997) | 0 | muRISCV-NN | Packed | RV32GCP | - |
Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
495266208 (0.1x) | 406111 (0.957) | 134520 (1.0) | 0 | TFLM | Reference | RV32GC | - |
495266207 (0.1x) | 406117 (0.957) | 134520 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
495266207 (0.1x) | 406117 (0.957) | 134520 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
495266207 (0.1x) | 406117 (0.957) | 134520 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
445892345 (0.1x) | 418478 (0.986) | 134464 (1.0) | 0 | TFLM | Reference | RV32GCP | - |
49673166 (Base) | 424434 (Base) | 134528 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
49673166 (1.0x) | 424440 (1.0) | 134528 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
49673166 (1.0x) | 424440 (1.0) | 134528 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
49673166 (1.0x) | 424440 (1.0) | 134528 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14864594 (3.3x) | 425498 (1.003) | 134528 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
9450505 (5.3x) | 425498 (1.003) | 134528 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
9403232 (5.3x) | 425498 (1.003) | 134528 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
40746574 (1.2x) | 435998 (1.027) | 134472 (1.0) | 0 | muRISCV-NN | Scalar | RV32GCP | - |
49175307 (1.0x) | 438268 (1.033) | 134472 (1.0) | 0 | muRISCV-NN | Packed | RV32GCP | - |
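
One way to read the muRISCV-NN Vector rows across all four models is to compare how much each step in VLEN reduces the cycle count. The Python sketch below does this using only cycle counts copied from the tables; the names `VECTOR_CYCLES` and `gain` are illustrative, and the ratio is a simple cycle quotient, not a metric defined on this page.

```python
# muRISCV-NN Vector-mode cycle counts from the four result tables, keyed by VLEN.
VECTOR_CYCLES = {
    "aww": {128: 4519571, 1024: 2410011, 4096: 2413400},
    "resnet": {128: 16827198, 1024: 6662526, 4096: 5392484},
    "toycar": {128: 572955, 1024: 366147, 4096: 360579},
    "vww": {128: 14864594, 1024: 9450505, 4096: 9403232},
}


def gain(cycles, narrow, wide):
    """Cycle ratio when moving from VLEN=narrow to VLEN=wide (>1 means fewer cycles)."""
    return cycles[narrow] / cycles[wide]


for model, cycles in VECTOR_CYCLES.items():
    print(
        f"{model}: 128->1024 {gain(cycles, 128, 1024):.2f}x, "
        f"1024->4096 {gain(cycles, 1024, 4096):.2f}x"
    )
```

In this data the step from 128 to 1024 reduces cycles for every model, while the step from 1024 to 4096 only helps noticeably for resnet and is marginally slower for aww.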
Original data
Click here to download the raw files for this benchmark.