Benchmarks 2024 03 02 TFLM LLVM O3 - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- LLVM/Clang:
  - TODO: Version
  - Linker: lld (TODO)
- RISC-V GCC for headers, libc, ...
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TFLM: main
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- Used the -O3 flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -O3)
Audio Wake Words (aww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
42613536 (0.4x) | 156982 (0.892) | 36148 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
30264099 (0.5x) | 173052 (0.984) | 36156 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
28765301 (0.5x) | 173052 (0.984) | 36156 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
28061037 (0.6x) | 173052 (0.984) | 36156 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
27712294 (0.6x) | 173052 (0.984) | 36156 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
45607770 (0.3x) | 173052 (0.984) | 36156 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
45617937 (0.3x) | 173052 (0.984) | 36156 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
15645936 (Base) | 175930 (Base) | 36148 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
5422098 (2.9x) | 194876 (1.108) | 36156 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4967759 (3.1x) | 194876 (1.108) | 36156 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4630487 (3.4x) | 194876 (1.108) | 36156 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
5411336 (2.9x) | 194876 (1.108) | 36156 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14394157 (1.1x) | 194876 (1.108) | 36156 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14400935 (1.1x) | 194876 (1.108) | 36156 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4139348 (3.8x) | 177552 (1.009) | 36148 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2878642 (5.4x) | 177552 (1.009) | 36148 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2198332 (7.1x) | 177552 (1.009) | 36148 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2155848 (7.3x) | 177552 (1.009) | 36148 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2159237 (7.2x) | 177552 (1.009) | 36148 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2162626 (7.2x) | 177552 (1.009) | 36148 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
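The parenthesised values in the Audio Wake Words table are consistent with simple ratios against the row marked Base (muRISCV-NN, Scalar, RV32GC): speedup as base cycles divided by measured cycles, and the memory figures as measured size divided by base size. The following is a minimal sketch under that assumption; the function names are illustrative and not part of MLonMCU.

```python
# Cycle counts and ROM sizes copied from the Audio Wake Words table.
BASE_CYCLES = 15_645_936  # muRISCV-NN, Scalar, RV32GC (the "Base" row)
BASE_ROM = 175_930        # bytes, same row


def speedup(cycles: int, base_cycles: int = BASE_CYCLES) -> float:
    """Speedup relative to the base row; values above 1 mean fewer cycles."""
    return base_cycles / cycles


def relative(value: int, base: int) -> float:
    """Size relative to the base row; values above 1 mean a larger footprint."""
    return value / base


print(f"{speedup(2_155_848):.1f}x")          # muRISCV-NN Vector, VLEN=1024
print(f"{speedup(42_613_536):.1f}x")         # TFLM Reference, RV32GC
print(f"{relative(177_552, BASE_ROM):.3f}")  # ROM of the Vector builds
```

Rounded to the precision used in the table, these reproduce the listed 7.3x, 0.4x and 1.009.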
Image Classification (resnet)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
132471988 (0.4x) | 197152 (0.942) | 68916 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
52716169 (1.1x) | 215044 (1.027) | 68924 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
45054716 (1.3x) | 215044 (1.027) | 68924 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
89137984 (0.7x) | 215044 (1.027) | 68924 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
117435471 (0.5x) | 215044 (1.027) | 68924 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
134757285 (0.4x) | 215044 (1.027) | 68924 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
134767452 (0.4x) | 215044 (1.027) | 68924 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
58314324 (Base) | 209358 (Base) | 68912 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
11909943 (4.9x) | 231122 (1.104) | 68920 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
11569873 (5.0x) | 231122 (1.104) | 68920 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14592770 (4.0x) | 231122 (1.104) | 68920 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
16161693 (3.6x) | 231122 (1.104) | 68920 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
16865394 (3.5x) | 231122 (1.104) | 68920 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
36759082 (1.6x) | 231122 (1.104) | 68920 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15298076 (3.8x) | 211228 (1.009) | 68912 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
9713088 (6.0x) | 211228 (1.009) | 68912 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
7166354 (8.1x) | 211228 (1.009) | 68912 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
5923778 (9.8x) | 211228 (1.009) | 68912 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
4998679 (11.7x) | 211228 (1.009) | 68912 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
4752864 (12.3x) | 211228 (1.009) | 68912 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
Anomaly Detection (toycar)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
3071225 (0.6x) | 342862 (0.987) | 19384 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
816342 (2.1x) | 346914 (0.999) | 19384 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
643478 (2.6x) | 346914 (0.999) | 19384 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
553014 (3.1x) | 346914 (0.999) | 19384 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
507782 (3.3x) | 346914 (0.999) | 19384 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
485166 (3.5x) | 346914 (0.999) | 19384 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
2433574 (0.7x) | 346914 (0.999) | 19384 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
1693239 (Base) | 347344 (Base) | 19384 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
542096 (3.1x) | 351986 (1.013) | 19384 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
455420 (3.7x) | 351986 (1.013) | 19384 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
410204 (4.1x) | 351986 (1.013) | 19384 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
387596 (4.4x) | 351986 (1.013) | 19384 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
376292 (4.5x) | 351986 (1.013) | 19384 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
1358600 (1.2x) | 351986 (1.013) | 19384 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
586965 (2.9x) | 347688 (1.001) | 19384 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
470285 (3.6x) | 347688 (1.001) | 19384 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
411945 (4.1x) | 347688 (1.001) | 19384 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
383099 (4.4x) | 347688 (1.001) | 19384 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
379468 (4.5x) | 347688 (1.001) | 19384 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
377612 (4.5x) | 347688 (1.001) | 19384 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
Visual Wake Words (vww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
112276178 (0.4x) | 430466 (0.958) | 134476 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
72688978 (0.6x) | 446742 (0.994) | 134484 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
70412718 (0.7x) | 446742 (0.994) | 134484 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
70827262 (0.7x) | 446742 (0.994) | 134484 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
77512163 (0.6x) | 446742 (0.994) | 134484 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
84910263 (0.6x) | 446742 (0.994) | 134484 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
113660504 (0.4x) | 446742 (0.994) | 134484 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
46914513 (Base) | 449414 (Base) | 134476 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
17690680 (2.7x) | 468566 (1.043) | 134484 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
17234514 (2.7x) | 468566 (1.043) | 134484 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
18486594 (2.5x) | 468566 (1.043) | 134484 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
21830631 (2.1x) | 468566 (1.043) | 134484 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
25570168 (1.8x) | 468566 (1.043) | 134484 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
40402396 (1.2x) | 468566 (1.043) | 134484 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
13541089 (3.5x) | 451036 (1.004) | 134476 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
10208473 (4.6x) | 451036 (1.004) | 134476 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8912101 (5.3x) | 451036 (1.004) | 134476 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8406591 (5.6x) | 451036 (1.004) | 134476 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8361593 (5.6x) | 451036 (1.004) | 134476 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8364982 (5.6x) | 451036 (1.004) | 134476 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
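For readers who want to post-process the result tables directly, a single table row can be split into typed fields as sketched below. This is an illustrative helper only: the `Row` type and `parse_row` function are hypothetical names, and the code assumes one complete row per line in the nine-column layout used by the tables above.

```python
import re
from typing import NamedTuple


class Row(NamedTuple):
    cycles: int
    rom_bytes: int
    ram_bytes: int
    vlen: int
    kernels: str
    mode: str
    arch: str


# Matches "<integer> ( <annotation> )", tolerating optional spaces.
_METRIC = re.compile(r"(\d+)\s*\(\s*[^)]*\)")


def parse_row(line: str) -> Row:
    """Parse one pipe-separated result row into a Row record."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    # The first three cells hold a raw value followed by a ratio or "Base".
    cycles, rom, ram = (int(_METRIC.match(c).group(1)) for c in cells[:3])
    return Row(cycles, rom, ram, int(cells[3]), cells[4], cells[5], cells[6])


row = parse_row(
    "8406591 ( 5.6x ) | 451036 ( 1.004 ) | 134476 ( 1.0 ) | "
    "1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |"
)
print(row.cycles, row.vlen, row.mode)
```

Only the raw values are kept; the ratios in parentheses can be recomputed from the base row if needed.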
Original data
Click here to download the raw files for this benchmark.