Benchmarks 2024 02 26 TFLM LLVM - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- LLVM/Clang:
- TODO: Version
- Linker: lld (TODO)
- RISC-V GCC for headers, libc, ...
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TFLM: main
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- Used the -Os flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in Bytes
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
43849979 (0.4x) | 146710 (0.865) | 36124 (1.0) | 0 | TFLM | Reference | RV32GC | - |
34194781 (0.5x) | 152484 (0.899) | 36132 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
32402109 (0.5x) | 152484 (0.899) | 36132 (1.0) | 256 | TFLM | Reference | RV32GCV | Loop+SLP |
31505773 (0.5x) | 152484 (0.899) | 36132 (1.0) | 512 | TFLM | Reference | RV32GCV | Loop+SLP |
31057605 (0.5x) | 152484 (0.899) | 36132 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
31060994 (0.5x) | 152484 (0.899) | 36132 (1.0) | 2048 | TFLM | Reference | RV32GCV | Loop+SLP |
31064383 (0.5x) | 152484 (0.899) | 36132 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
15553948 (Base) | 169550 (Base) | 36124 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
6139969 (2.5x) | 180562 (1.065) | 36132 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
5121381 (3.0x) | 180562 (1.065) | 36132 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4612197 (3.4x) | 180562 (1.065) | 36132 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4316930 (3.6x) | 180562 (1.065) | 36132 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4320319 (3.6x) | 180562 (1.065) | 36132 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4323708 (3.6x) | 180562 (1.065) | 36132 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4124842 (3.8x) | 169736 (1.001) | 36124 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
2864120 (5.4x) | 169736 (1.001) | 36124 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | - |
2183802 (7.1x) | 169736 (1.001) | 36124 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | - |
2141314 (7.3x) | 169736 (1.001) | 36124 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
2144703 (7.3x) | 169736 (1.001) | 36124 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | - |
2148092 (7.2x) | 169736 (1.001) | 36124 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
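
The Speedup and (rel.) columns appear to be derived from the row marked Base (muRISCV-NN, Scalar, RV32GC). The following minimal sketch reproduces two rows of the aww table under that assumption. The function names are hypothetical, and the rounding (one decimal for speedup, three for relative size) is inferred from the printed values rather than taken from the MLonMCU sources.

```python
# Baseline row of the aww table: muRISCV-NN, Scalar mode, RV32GC (VLEN 0).
BASE_CYCLES = 15553948
BASE_ROM = 169550  # bytes


def speedup(cycles: int, base_cycles: int = BASE_CYCLES) -> float:
    """Speedup over the baseline; values above 1 mean fewer cycles."""
    return round(base_cycles / cycles, 1)


def relative(size: int, base_size: int = BASE_ROM) -> float:
    """Footprint relative to the baseline; values below 1 are smaller."""
    return round(size / base_size, 3)


# muRISCV-NN Vector kernels at VLEN=128 (row copied from the aww table).
print(f"{speedup(4124842)}x", relative(169736))   # 3.8x 1.001
# TFLM reference kernels on RV32GC (first row of the aww table).
print(f"{speedup(43849979)}x", relative(146710))  # 0.4x 0.865
```

Both results match the values printed in the table, which supports reading Speedup as base cycles divided by measured cycles.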
Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
134756807 (0.4x) | 188332 (0.926) | 68892 (1.0) | 0 | TFLM | Reference | RV32GC | - |
58895120 (1.0x) | 194258 (0.955) | 68900 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
49138464 (1.2x) | 194258 (0.955) | 68900 (1.0) | 256 | TFLM | Reference | RV32GCV | Loop+SLP |
46791560 (1.2x) | 194258 (0.955) | 68900 (1.0) | 512 | TFLM | Reference | RV32GCV | Loop+SLP |
46357756 (1.3x) | 194258 (0.955) | 68900 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
46361145 (1.3x) | 194258 (0.955) | 68900 (1.0) | 2048 | TFLM | Reference | RV32GCV | Loop+SLP |
46364534 (1.3x) | 194258 (0.955) | 68900 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
58393682 (Base) | 203380 (Base) | 68888 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
28246841 (2.1x) | 214840 (1.056) | 68896 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
19595597 (3.0x) | 214840 (1.056) | 68896 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
15540421 (3.8x) | 214840 (1.056) | 68896 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
13696446 (4.3x) | 214840 (1.056) | 68896 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
12798715 (4.6x) | 214840 (1.056) | 68896 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
12126264 (4.8x) | 214840 (1.056) | 68896 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
15321692 (3.8x) | 204252 (1.004) | 68888 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
9736704 (6.0x) | 204252 (1.004) | 68888 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | - |
7189970 (8.1x) | 204252 (1.004) | 68888 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | - |
5947394 (9.8x) | 204252 (1.004) | 68888 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
5022295 (11.6x) | 204252 (1.004) | 68888 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | - |
4776480 (12.2x) | 204252 (1.004) | 68888 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
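
The resnet table also allows a direct comparison between the hand-vectorized muRISCV-NN kernels (Mode: Vector) and the auto-vectorized scalar kernels (Mode: Scalar, Loop+SLP) at the same VLEN. The sketch below computes that ratio per VLEN from the cycle counts above. The variable and function names are hypothetical and the comparison itself is an illustration, not part of the original benchmark report.

```python
# Cycle counts copied from the resnet table, ordered by VLEN.
VLENS = [128, 256, 512, 1024, 2048, 4096]
SCALAR_AUTOVEC = [28246841, 19595597, 15540421, 13696446, 12798715, 12126264]
VECTOR_KERNELS = [15321692, 9736704, 7189970, 5947394, 5022295, 4776480]


def vector_advantage(autovec_cycles, vector_cycles):
    """Ratio of auto-vectorized cycles to hand-vectorized cycles per VLEN."""
    return [a / v for a, v in zip(autovec_cycles, vector_cycles)]


ratios = vector_advantage(SCALAR_AUTOVEC, VECTOR_KERNELS)
for vlen, ratio in zip(VLENS, ratios):
    # A ratio above 1 means the hand-vectorized kernels need fewer cycles.
    print(f"VLEN={vlen}: {ratio:.2f}x")
```

For resnet every ratio is above 1.8, so the hand-written vector kernels stay ahead of compiler auto-vectorization at each measured VLEN.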
Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
3052975 (0.5x) | 342160 (0.988) | 19376 (1.0) | 0 | TFLM | Reference | RV32GC | - |
895486 (1.9x) | 343940 (0.993) | 19376 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
665214 (2.5x) | 343940 (0.993) | 19376 (1.0) | 256 | TFLM | Reference | RV32GCV | Loop+SLP |
550078 (3.0x) | 343940 (0.993) | 19376 (1.0) | 512 | TFLM | Reference | RV32GCV | Loop+SLP |
492510 (3.4x) | 343940 (0.993) | 19376 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
463726 (3.6x) | 343940 (0.993) | 19376 (1.0) | 2048 | TFLM | Reference | RV32GCV | Loop+SLP |
460142 (3.6x) | 343940 (0.993) | 19376 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
1676718 (Base) | 346374 (Base) | 19376 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
578148 (2.9x) | 349656 (1.009) | 19376 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
464228 (3.6x) | 349656 (1.009) | 19376 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
407268 (4.1x) | 349656 (1.009) | 19376 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
378788 (4.4x) | 349656 (1.009) | 19376 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
364548 (4.6x) | 349656 (1.009) | 19376 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
362764 (4.6x) | 349656 (1.009) | 19376 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
571060 (2.9x) | 346982 (1.002) | 19376 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
454380 (3.7x) | 346982 (1.002) | 19376 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | - |
396040 (4.2x) | 346982 (1.002) | 19376 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | - |
367194 (4.6x) | 346982 (1.002) | 19376 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
363563 (4.6x) | 346982 (1.002) | 19376 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | - |
361707 (4.6x) | 346982 (1.002) | 19376 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
114164825 (0.4x) | 420470 (0.948) | 134452 (1.0) | 0 | TFLM | Reference | RV32GC | - |
72029724 (0.6x) | 426196 (0.961) | 134460 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
66868316 (0.7x) | 426196 (0.961) | 134460 (1.0) | 256 | TFLM | Reference | RV32GCV | Loop+SLP |
64416636 (0.7x) | 426196 (0.961) | 134460 (1.0) | 512 | TFLM | Reference | RV32GCV | Loop+SLP |
63384332 (0.7x) | 426196 (0.961) | 134460 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
62968337 (0.7x) | 426196 (0.961) | 134460 (1.0) | 2048 | TFLM | Reference | RV32GCV | Loop+SLP |
62939442 (0.7x) | 426196 (0.961) | 134460 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
46648145 (Base) | 443316 (Base) | 134452 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
19561421 (2.4x) | 454260 (1.025) | 134460 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
16658157 (2.8x) | 454260 (1.025) | 134460 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
15266541 (3.1x) | 454260 (1.025) | 134460 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14765690 (3.2x) | 454260 (1.025) | 134460 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14562559 (3.2x) | 454260 (1.025) | 134460 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14548512 (3.2x) | 454260 (1.025) | 134460 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
13637449 (3.4x) | 443490 (1.0) | 134452 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
10304809 (4.5x) | 443490 (1.0) | 134452 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | - |
9008425 (5.2x) | 443490 (1.0) | 134452 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | - |
8502909 (5.5x) | 443490 (1.0) | 134452 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
8457908 (5.5x) | 443490 (1.0) | 134452 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | - |
8461297 (5.5x) | 443490 (1.0) | 134452 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
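
Across all four models the cycle counts flatten out at larger VLEN, and in some cases rise slightly again. One way to summarize this is to find, per model, the smallest VLEN whose cycle count is within a small margin of the best measured value. The sketch below does this for the muRISCV-NN Vector rows. The 1% tolerance and the name `saturation_vlen` are assumptions chosen for illustration, not part of the benchmark methodology.

```python
# Cycle counts of the muRISCV-NN Vector rows, copied from the four tables above.
VLENS = [128, 256, 512, 1024, 2048, 4096]
VECTOR_CYCLES = {
    "aww": [4124842, 2864120, 2183802, 2141314, 2144703, 2148092],
    "resnet": [15321692, 9736704, 7189970, 5947394, 5022295, 4776480],
    "toycar": [571060, 454380, 396040, 367194, 363563, 361707],
    "vww": [13637449, 10304809, 9008425, 8502909, 8457908, 8461297],
}


def saturation_vlen(cycles, vlens=VLENS, tolerance=0.01):
    """Return the smallest VLEN whose cycle count is within `tolerance` of the best."""
    limit = min(cycles) * (1.0 + tolerance)
    # VLENS is sorted ascending, so the first match is the smallest VLEN.
    for vlen, count in zip(vlens, cycles):
        if count <= limit:
            return vlen
    raise ValueError("no cycle counts given")


for model, cycles in VECTOR_CYCLES.items():
    print(model, saturation_vlen(cycles))
# aww 1024
# resnet 4096
# toycar 2048
# vww 1024
```

Under this criterion only resnet still benefits measurably from the step to VLEN=4096, while aww and vww are already within 1% of their best cycle count at VLEN=1024.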
Original data
Click here to download the raw files for this benchmark.