Benchmarks 2024 02 23 TFLM LLVM - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- LLVM/Clang:
  - TODO: Version
  - Linker: lld (TODO)
- RISC-V GCC for headers, libc, ...
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TFLM: main
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- The `-Os` flag was used for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
43849979 (0.4x) | 146710 (0.863) | 36124 (1.0) | 0 | TFLM | Reference | RV32GC | - |
34194781 (0.5x) | 152484 (0.897) | 36132 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
31057605 (0.5x) | 152484 (0.897) | 36132 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
31064383 (0.5x) | 152484 (0.897) | 36132 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
15553945 (Base) | 169998 (Base) | 36124 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
6139966 (2.5x) | 181260 (1.066) | 36132 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4316927 (3.6x) | 181260 (1.066) | 36132 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4323705 (3.6x) | 181260 (1.066) | 36132 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
4382758 (3.5x) | 170064 (1.0) | 36124 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
2399230 (6.5x) | 170064 (1.0) | 36124 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
2406008 (6.5x) | 170064 (1.0) | 36124 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
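
The parenthesised values in the table appear to be derived from the row marked Base (muRISCV-NN, Scalar, RV32GC): speedup is the baseline cycle count divided by the row's cycle count, and the relative memory figures are the row's value divided by the baseline value. The following is a minimal sketch of that arithmetic, using numbers copied from the Audio Wake Words table. The function names `speedup` and `relative` are chosen here for illustration and are not part of MLonMCU.

```python
# Baseline row of the aww table: muRISCV-NN, Scalar, RV32GC, VLEN 0.
BASE = {"cycles": 15553945, "rom": 169998, "ram": 36124}


def speedup(cycles: int, base_cycles: int = BASE["cycles"]) -> float:
    """Speedup over the baseline; fewer cycles gives a value above 1."""
    return base_cycles / cycles


def relative(value: int, base_value: int) -> float:
    """Memory footprint relative to the baseline; smaller gives a value below 1."""
    return value / base_value


# TFLM reference kernels on RV32GC (first table row).
tflm_speedup = round(speedup(43849979), 1)
tflm_rom = round(relative(146710, BASE["rom"]), 3)

# muRISCV-NN vector kernels at VLEN=1024.
vector_speedup = round(speedup(2399230), 1)

print(f"{tflm_speedup}x", tflm_rom, f"{vector_speedup}x")
```

Rounded to the table's precision, these reproduce the reported 0.4x, 0.863, and 6.5x entries, which supports this reading of the columns.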
Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
134756807 (0.4x) | 188332 (0.924) | 68892 (1.0) | 0 | TFLM | Reference | RV32GC | - |
58895120 (1.0x) | 194258 (0.953) | 68900 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
46357756 (1.3x) | 194258 (0.953) | 68900 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
46364534 (1.3x) | 194258 (0.953) | 68900 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
58388361 (Base) | 203824 (Base) | 68888 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
28231465 (2.1x) | 215534 (1.057) | 68896 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
13681070 (4.3x) | 215534 (1.057) | 68896 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
12110888 (4.8x) | 215534 (1.057) | 68896 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
15880816 (3.7x) | 204580 (1.004) | 68888 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
6506518 (9.0x) | 204580 (1.004) | 68888 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
5335604 (10.9x) | 204580 (1.004) | 68888 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
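
To see how the cycle count scales with the vector length, it can help to normalise each VLEN against the smallest one. The sketch below does this for the muRISCV-NN vector kernels, using the three cycle counts from the Image Classification table. The helper name `scaling_vs_smallest` is hypothetical and only serves this illustration.

```python
# Cycle counts for muRISCV-NN, Vector mode, copied from the resnet table.
VECTOR_CYCLES = {128: 15880816, 1024: 6506518, 4096: 5335604}


def scaling_vs_smallest(cycles_by_vlen: dict[int, int]) -> dict[int, float]:
    """Return the cycle reduction of each VLEN relative to the smallest VLEN."""
    reference = cycles_by_vlen[min(cycles_by_vlen)]
    return {
        vlen: round(reference / cycles, 2)
        for vlen, cycles in sorted(cycles_by_vlen.items())
    }


scaling = scaling_vs_smallest(VECTOR_CYCLES)
for vlen, factor in scaling.items():
    print(f"VLEN={vlen}: {factor}x fewer cycles than VLEN=128")
```

For this model, an 8x wider vector register (128 to 1024 bits) yields roughly a 2.4x cycle reduction, and going to 4096 bits adds comparatively little, so the gain is clearly sublinear in VLEN.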
Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
3052975 (0.5x) | 342160 (0.986) | 19376 (1.0) | 0 | TFLM | Reference | RV32GC | - |
895486 (1.9x) | 343940 (0.991) | 19376 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
492510 (3.4x) | 343940 (0.991) | 19376 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
460142 (3.6x) | 343940 (0.991) | 19376 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
1676686 (Base) | 346982 (Base) | 19376 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
578110 (2.9x) | 350516 (1.01) | 19376 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
378750 (4.4x) | 350516 (1.01) | 19376 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
362726 (4.6x) | 350516 (1.01) | 19376 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
571020 (2.9x) | 347474 (1.001) | 19376 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
367154 (4.6x) | 347474 (1.001) | 19376 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
361667 (4.6x) | 347474 (1.001) | 19376 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|
114164825 (0.4x) | 420470 (0.948) | 134452 (1.0) | 0 | TFLM | Reference | RV32GC | - |
72029724 (0.6x) | 426196 (0.96) | 134460 (1.0) | 128 | TFLM | Reference | RV32GCV | Loop+SLP |
63384332 (0.7x) | 426196 (0.96) | 134460 (1.0) | 1024 | TFLM | Reference | RV32GCV | Loop+SLP |
62939442 (0.7x) | 426196 (0.96) | 134460 (1.0) | 4096 | TFLM | Reference | RV32GCV | Loop+SLP |
46648295 (Base) | 443758 (Base) | 134452 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | - |
19561436 (2.4x) | 454956 (1.025) | 134460 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14765705 (3.2x) | 454956 (1.025) | 134460 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14548527 (3.2x) | 454956 (1.025) | 134460 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14562635 (3.2x) | 443816 (1.0) | 134452 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | - |
9428095 (4.9x) | 443816 (1.0) | 134452 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | - |
9386483 (5.0x) | 443816 (1.0) | 134452 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | - |
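
One comparison the tables invite is between the hand-vectorized muRISCV-NN kernels (Mode: Vector) and the scalar muRISCV-NN kernels compiled with LLVM auto-vectorization (Mode: Scalar, Loop+SLP) at the same VLEN. The sketch below computes that ratio at VLEN=1024 for all four models, using cycle counts copied from the tables. The dictionary layout and the name `vector_advantage` are assumptions made for this example.

```python
# (auto-vectorized scalar kernels, hand-vectorized kernels) at VLEN=1024,
# both muRISCV-NN on RV32GCV, copied from the four result tables.
CYCLES_VLEN_1024 = {
    "aww": (4316927, 2399230),
    "resnet": (13681070, 6506518),
    "toycar": (378750, 367154),
    "vww": (14765705, 9428095),
}


def vector_advantage(autovec_cycles: int, vector_cycles: int) -> float:
    """How many times fewer cycles the hand-vectorized kernels need."""
    return autovec_cycles / vector_cycles


advantage = {
    model: round(vector_advantage(autovec, vector), 2)
    for model, (autovec, vector) in CYCLES_VLEN_1024.items()
}

for model, factor in advantage.items():
    print(f"{model}: {factor}x")
```

On these numbers the hand-vectorized kernels are ahead for every model, by about 2.1x for resnet but only about 1.03x for toycar, where auto-vectorization already closes most of the gap.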
Original data
Click here to download the raw files for this benchmark.