Benchmarks 2024 11 26 TFLM LLVM Os spike_rv32 - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
  - Spike: eb0a3e2b0a7c57522928be39de95cd9f8c6dc636
  - Spike PK: fix-gcc14-rvv
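Because Spike is an instruction-set simulator run with CPI=1, each reported cycle count equals the number of executed instructions and does not model pipeline or memory stalls. The sketch below shows how such a count could be turned into a rough latency estimate. The 100 MHz clock is purely an assumption for illustration; it is not part of this benchmark setup.

```python
def instructions_from_cycles(cycles: float, cpi: float = 1.0) -> float:
    """Recover the instruction count from a cycle count at a fixed CPI."""
    return cycles / cpi


def latency_seconds(cycles: float, clock_hz: float) -> float:
    """Estimate wall-clock latency for a given clock frequency."""
    return cycles / clock_hz


# Hypothetical clock frequency, chosen only for this example.
ASSUMED_CLOCK_HZ = 100e6

# Cycle count of the aww Base row (muRISCV-NN, Scalar, RV32GC) from the results.
AWW_BASE_CYCLES = 15085053.0

print(instructions_from_cycles(AWW_BASE_CYCLES))  # 15085053.0 instructions
print(f"{latency_seconds(AWW_BASE_CYCLES, ASSUMED_CLOCK_HZ) * 1e3:.2f} ms")  # 150.85 ms
```

Real hardware will generally show a CPI above 1, so an estimate like this is a lower bound at best.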
Toolchains
- RISC-V GCC:
  - Scalar: riscv32-unknown-elf-gcc (g8b4bb54e6c4) 14.2.1 20241118
  - Vector: riscv32-unknown-elf-gcc (g8b4bb54e6c4) 14.2.1 20241118
  - Packed: Self compiled using patches found in https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
- LLVM/Clang: clang version 18.1.8 (https://github.com/llvm/llvm-project.git 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
  - Linker: lld (TODO)
Models
- MLPerfTiny Benchmark
- TODO: others!
Frameworks
- MLonMCU: develop
- TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130
Miscellaneous
- Used the -Os flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -Os, Target: spike_rv32)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
39294338.0 ( 0.4x ) | 149588 ( 0.862 ) | 36060 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | 0 | - |
33489638.0 ( 0.5x ) | 155628 ( 0.897 ) | 36068 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
31696974.0 ( 0.5x ) | 155648 ( 0.897 ) | 36068 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30800724.0 ( 0.5x ) | 155932 ( 0.899 ) | 36068 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30352565.0 ( 0.5x ) | 156266 ( 0.9 ) | 36068 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30350820.0 ( 0.5x ) | 156470 ( 0.902 ) | 36068 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30359365.0 ( 0.5x ) | 156870 ( 0.904 ) | 36068 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
15085053.0 ( Base ) | 173538 ( Base ) | 36060 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
14948926.0 ( 1.0x ) | 172684 ( 0.995 ) | 36060 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
6143104.0 ( 2.5x ) | 184606 ( 1.064 ) | 36068 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
5140439.0 ( 2.9x ) | 184652 ( 1.064 ) | 36068 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4644341.0 ( 3.2x ) | 184770 ( 1.065 ) | 36068 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4351870.0 ( 3.5x ) | 184978 ( 1.066 ) | 36068 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4351873.0 ( 3.5x ) | 185074 ( 1.066 ) | 36068 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4358658.0 ( 3.5x ) | 185366 ( 1.068 ) | 36068 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4083226.0 ( 3.7x ) | 174750 ( 1.007 ) | 36060 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2828248.0 ( 5.3x ) | 174672 ( 1.007 ) | 36060 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2150930.0 ( 7.0x ) | 174612 ( 1.006 ) | 36060 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2109942.0 ( 7.1x ) | 174646 ( 1.006 ) | 36060 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2113333.0 ( 7.1x ) | 174646 ( 1.006 ) | 36060 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2116724.0 ( 7.1x ) | 174646 ( 1.006 ) | 36060 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
6887271.0 ( 2.2x ) | 183430 ( 1.057 ) | 36068 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
5618070.0 ( 2.7x ) | 183436 ( 1.057 ) | 36068 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
5134467.0 ( 2.9x ) | 183554 ( 1.058 ) | 36068 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
4864889.0 ( 3.1x ) | 183762 ( 1.059 ) | 36068 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
4864892.0 ( 3.1x ) | 183858 ( 1.059 ) | 36068 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
4871677.0 ( 3.1x ) | 184150 ( 1.061 ) | 36068 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
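The page does not spell out how the parenthesised values are derived. The sketch below assumes that the speedup is the Base row's cycle count divided by the row's cycle count, and that the relative ROM/RAM values are the row's size divided by the Base row's size. Under that assumption it reproduces the values printed for the aww row with muRISCV-NN, Vector mode, VLEN=512. The function and variable names are illustrative only.

```python
def speedup(base_cycles: float, cycles: float) -> float:
    """Speedup over the Base row: above 1 means faster than Base."""
    return base_cycles / cycles


def relative(value: float, base_value: float) -> float:
    """Size relative to the Base row: above 1 means larger than Base."""
    return value / base_value


# aww Base row: muRISCV-NN, Scalar, RV32GC.
BASE = {"cycles": 15085053.0, "rom": 173538, "ram": 36060}
# aww row: muRISCV-NN, Vector, RV32GCV, VLEN=512.
ROW = {"cycles": 2150930.0, "rom": 174612, "ram": 36060}

print(f"{speedup(BASE['cycles'], ROW['cycles']):.1f}x")  # 7.0x
print(round(relative(ROW["rom"], BASE["rom"]), 3))       # 1.006
print(round(relative(ROW["ram"], BASE["ram"]), 3))       # 1.0
```

The same formulas give 0.4x and 0.862 for the TFLM Reference row on RV32GC, which matches the table.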
Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
121053207.0 ( 0.5x ) | 191310 ( 0.922 ) | 68828 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | 0 | - |
56884504.0 ( 1.0x ) | 197300 ( 0.951 ) | 68836 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
47127682.0 ( 1.2x ) | 197402 ( 0.952 ) | 68836 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44786238.0 ( 1.3x ) | 197914 ( 0.954 ) | 68836 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44352470.0 ( 1.3x ) | 198388 ( 0.956 ) | 68836 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44345607.0 ( 1.3x ) | 198714 ( 0.958 ) | 68836 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44349034.0 ( 1.3x ) | 199234 ( 0.961 ) | 68836 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
56367181.0 ( Base ) | 207426 ( Base ) | 68824 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
72412283.0 ( 0.8x ) | 206780 ( 0.997 ) | 68824 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
26152356.0 ( 2.2x ) | 218724 ( 1.054 ) | 68832 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
18287589.0 ( 3.1x ) | 218764 ( 1.055 ) | 68832 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14601103.0 ( 3.9x ) | 219150 ( 1.057 ) | 68832 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
12930241.0 ( 4.4x ) | 219526 ( 1.058 ) | 68832 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
12111071.0 ( 4.7x ) | 219766 ( 1.059 ) | 68832 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
11488048.0 ( 4.9x ) | 220206 ( 1.062 ) | 68832 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15252975.0 ( 3.7x ) | 209352 ( 1.009 ) | 68824 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
9669655.0 ( 5.8x ) | 209286 ( 1.009 ) | 68824 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
7123918.0 ( 7.9x ) | 209226 ( 1.009 ) | 68824 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
5881773.0 ( 9.6x ) | 209260 ( 1.009 ) | 68824 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
4956900.0 ( 11.4x ) | 209260 ( 1.009 ) | 68824 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
4711199.0 ( 12.0x ) | 209260 ( 1.009 ) | 68824 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
20452521.0 ( 2.8x ) | 217448 ( 1.048 ) | 68832 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
14356883.0 ( 3.9x ) | 217488 ( 1.049 ) | 68832 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
11499710.0 ( 4.9x ) | 217874 ( 1.05 ) | 68832 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
10206614.0 ( 5.5x ) | 218250 ( 1.052 ) | 68832 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
9566619.0 ( 5.9x ) | 218490 ( 1.053 ) | 68832 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
9092126.0 ( 6.2x ) | 218930 ( 1.055 ) | 68832 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
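One way to read the VLEN sweep is to look at how much each doubling of VLEN still buys. The sketch below does this for the resnet rows with muRISCV-NN kernels in Vector mode, using the cycle counts copied from the table above. The helper name `doubling_gains` is made up for this example.

```python
# Cycle counts of the resnet rows with Kernels=muRISCV-NN, Mode=Vector, keyed by VLEN.
RESNET_VECTOR_CYCLES = {
    128: 15252975.0,
    256: 9669655.0,
    512: 7123918.0,
    1024: 5881773.0,
    2048: 4956900.0,
    4096: 4711199.0,
}


def doubling_gains(cycles_by_vlen: dict) -> dict:
    """Return the cycle ratio between each pair of consecutive VLEN settings."""
    vlens = sorted(cycles_by_vlen)
    return {
        (small, large): cycles_by_vlen[small] / cycles_by_vlen[large]
        for small, large in zip(vlens, vlens[1:])
    }


for (small, large), gain in doubling_gains(RESNET_VECTOR_CYCLES).items():
    print(f"VLEN {small} -> {large}: {gain:.2f}x")
```

For this model the gain per doubling shrinks steadily, from roughly 1.58x for 128 to 256 down to about 1.05x for 2048 to 4096.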
Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
2797804.0 ( 0.6x ) | 344894 ( 0.983 ) | 19364 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | 0 | - |
899257.0 ( 1.9x ) | 346920 ( 0.989 ) | 19364 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
669015.0 ( 2.5x ) | 346874 ( 0.989 ) | 19364 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
559064.0 ( 3.0x ) | 346992 ( 0.989 ) | 19364 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
501526.0 ( 3.4x ) | 347186 ( 0.99 ) | 19364 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
472772.0 ( 3.6x ) | 347270 ( 0.99 ) | 19364 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
472609.0 ( 3.6x ) | 347354 ( 0.99 ) | 19364 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
1684522.0 ( Base ) | 350746 ( Base ) | 19364 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
3037192.0 ( 0.6x ) | 350748 ( 1.0 ) | 19364 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
594011.0 ( 2.8x ) | 354846 ( 1.012 ) | 19364 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
480091.0 ( 3.5x ) | 354770 ( 1.011 ) | 19364 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
423131.0 ( 4.0x ) | 354824 ( 1.012 ) | 19364 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
394651.0 ( 4.3x ) | 354976 ( 1.012 ) | 19364 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
375266.0 ( 4.5x ) | 355024 ( 1.012 ) | 19364 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
373482.0 ( 4.5x ) | 355072 ( 1.012 ) | 19364 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
1941171.0 ( 0.9x ) | 352266 ( 1.004 ) | 19364 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1824490.0 ( 0.9x ) | 352216 ( 1.004 ) | 19364 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1766151.0 ( 1.0x ) | 352178 ( 1.004 ) | 19364 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1737304.0 ( 1.0x ) | 352212 ( 1.004 ) | 19364 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1733673.0 ( 1.0x ) | 352212 ( 1.004 ) | 19364 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1731817.0 ( 1.0x ) | 352212 ( 1.004 ) | 19364 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
927993.0 ( 1.8x ) | 354848 ( 1.012 ) | 19364 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
666041.0 ( 2.5x ) | 354772 ( 1.011 ) | 19364 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
535065.0 ( 3.1x ) | 354826 ( 1.012 ) | 19364 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
464432.0 ( 3.6x ) | 354978 ( 1.012 ) | 19364 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
431688.0 ( 3.9x ) | 355026 ( 1.012 ) | 19364 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
427600.0 ( 3.9x ) | 355074 ( 1.012 ) | 19364 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
103599763.0 ( 0.4x ) | 423230 ( 0.946 ) | 134364 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | 0 | - |
72374386.0 ( 0.6x ) | 429270 ( 0.96 ) | 134372 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
67212897.0 ( 0.7x ) | 429290 ( 0.96 ) | 134372 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
64761379.0 ( 0.7x ) | 429574 ( 0.961 ) | 134372 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
63728987.0 ( 0.7x ) | 429908 ( 0.961 ) | 134372 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
63307692.0 ( 0.7x ) | 430112 ( 0.962 ) | 134372 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
63278997.0 ( 0.7x ) | 430512 ( 0.963 ) | 134372 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
45191831.0 ( Base ) | 447180 ( Base ) | 134364 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
45007628.0 ( 1.0x ) | 446326 ( 0.998 ) | 134364 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
19573301.0 ( 2.3x ) | 458248 ( 1.025 ) | 134372 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
16706988.0 ( 2.7x ) | 458294 ( 1.025 ) | 134372 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15338952.0 ( 2.9x ) | 458412 ( 1.025 ) | 134372 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14832871.0 ( 3.0x ) | 458620 ( 1.026 ) | 134372 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14626372.0 ( 3.1x ) | 458716 ( 1.026 ) | 134372 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14615697.0 ( 3.1x ) | 459008 ( 1.026 ) | 134372 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
13372761.0 ( 3.4x ) | 448392 ( 1.003 ) | 134364 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
10051761.0 ( 4.5x ) | 448314 ( 1.003 ) | 134364 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8760333.0 ( 5.2x ) | 448254 ( 1.002 ) | 134364 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8256343.0 ( 5.5x ) | 448288 ( 1.002 ) | 134364 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8211965.0 ( 5.5x ) | 448288 ( 1.002 ) | 134364 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8215356.0 ( 5.5x ) | 448288 ( 1.002 ) | 134364 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
22478255.0 ( 2.0x ) | 457072 ( 1.022 ) | 134372 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
18853462.0 ( 2.4x ) | 457078 ( 1.022 ) | 134372 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
17526745.0 ( 2.6x ) | 457196 ( 1.022 ) | 134372 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
17025843.0 ( 2.7x ) | 457404 ( 1.023 ) | 134372 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
16819274.0 ( 2.7x ) | 457500 ( 1.023 ) | 134372 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
16808593.0 ( 2.7x ) | 457792 ( 1.024 ) | 134372 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
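For further analysis it can be handy to load the result rows into structured records. The sketch below parses one pipe-separated row, whether its cells sit on a single line or are wrapped across several lines. The `Row` class, its field names, and `parse_row` are hypothetical helpers written for this page, not part of MLonMCU or muRISCV-NN.

```python
from dataclasses import dataclass


@dataclass
class Row:
    cycles: float
    rom: int
    ram: int
    vlen: int
    kernels: str
    mode: str
    arch: str
    unroll: int
    autovec: str


def _leading_number(cell: str) -> str:
    """Drop the parenthesised relative value: '149588 ( 0.862 )' -> '149588'."""
    return cell.split("(", 1)[0].strip()


def parse_row(text: str) -> Row:
    """Parse one result row with nine pipe-separated cells."""
    # Collapse line breaks so that wrapped rows are handled too.
    flat = " ".join(text.split())
    # The trailing pipe yields an empty last cell, which is dropped here.
    cells = [c.strip() for c in flat.split("|") if c.strip()]
    if len(cells) != 9:
        raise ValueError(f"expected 9 cells, got {len(cells)}")
    cycles, rom, ram, vlen, kernels, mode, arch, unroll, autovec = cells
    return Row(
        cycles=float(_leading_number(cycles)),
        rom=int(_leading_number(rom)),
        ram=int(_leading_number(ram)),
        vlen=int(vlen),
        kernels=kernels,
        mode=mode,
        arch=arch,
        unroll=int(unroll),
        autovec=autovec,
    )


sample = """39294338.0 ( 0.4x ) |
149588 ( 0.862 ) |
36060 ( 1.0 ) |
0 | TFLM | Reference | RV32GC | 0 | - |"""
print(parse_row(sample))
```

Only the first three cells carry a parenthesised value, so the Mode cell "Vector (Portable)" is kept intact.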
Original data
Click here to download the raw files for this benchmark.