Benchmarks 2024 11 21 TFLM LLVM Os spike_rv32_min - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
  - Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
  - Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Toolchains
- RISC-V GCC:
  - Scalar: riscv32-unknown-elf-gcc (gc891d8dc23e) 13.2.0
  - Vector: riscv32-unknown-elf-gcc (gc891d8dc23e) 13.2.0
  - Packed: Self compiled using patches found in https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
- LLVM/Clang: clang version 18.1.8 (https://github.com/llvm/llvm-project.git 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
  - Linker: lld (TODO)
Models
- MLPerfTiny Benchmark
- TODO: others!
Frameworks
- MLonMCU: develop
- TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130
Miscellaneous
- Used the -Os flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -Os, Target: spike_rv32_min)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
39232458 (0.4x) | 198384 (0.854) | 35992 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
33270646 (0.5x) | 203752 (0.877) | 35992 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
31477974 (0.5x) | 203752 (0.877) | 35992 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
30581638 (0.5x) | 203752 (0.877) | 35992 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
30133470 (0.5x) | 203752 (0.877) | 35992 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
30136859 (0.5x) | 203752 (0.877) | 35992 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
30136859 (0.5x) | 203752 (0.877) | 35992 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
15024007 (Base) | 232312 (Base) | 35992 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
14912371 (1.0x) | 231124 (0.995) | 35992 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
6082522 (2.5x) | 242064 (1.042) | 35992 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
5075860 (3.0x) | 242064 (1.042) | 35992 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4572628 (3.3x) | 242064 (1.042) | 35992 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4283313 (3.5x) | 242064 (1.042) | 35992 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4283313 (3.5x) | 242064 (1.042) | 35992 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4290091 (3.5x) | 242064 (1.042) | 35992 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4098330 (3.7x) | 233004 (1.003) | 35992 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2843352 (5.3x) | 233004 (1.003) | 35992 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2166034 (6.9x) | 233004 (1.003) | 35992 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2125046 (7.1x) | 233004 (1.003) | 35992 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2128435 (7.1x) | 233004 (1.003) | 35992 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2131824 (7.0x) | 233004 (1.003) | 35992 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
6839842 (2.2x) | 240704 (1.036) | 35992 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5872428 (2.6x) | 240704 (1.036) | 35992 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5388820 (2.8x) | 240704 (1.036) | 35992 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5119237 (2.9x) | 240704 (1.036) | 35992 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5119237 (2.9x) | 240704 (1.036) | 35992 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5126015 (2.9x) | 240704 (1.036) | 35992 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
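
The Speedup and rel. columns in the table above appear to be derived from the row marked Base (muRISCV-NN, Scalar, RV32IM). The following is a minimal sketch of that arithmetic, assuming speedup is base cycles divided by measured cycles (rounded to one decimal) and the relative memory figure is the measured value divided by the base value (rounded to three decimals). The function names are hypothetical and not part of MLonMCU; the numbers are taken from the Audio Wake Words table.

```python
# Base row of the Audio Wake Words table (muRISCV-NN, Scalar, RV32IM).
BASE_CYCLES = 15024007
BASE_ROM = 232312


def speedup(cycles: int, base_cycles: int = BASE_CYCLES) -> float:
    """Assumed definition: base cycles / measured cycles, one decimal place."""
    return round(base_cycles / cycles, 1)


def relative(value: int, base: int) -> float:
    """Assumed definition: measured value / base value, three decimal places."""
    return round(value / base, 3)


# (label, cycles, total ROM in bytes) copied from two rows of the table.
rows = [
    ("TFLM Reference, RV32IM", 39232458, 198384),
    ("muRISCV-NN Vector, VLEN=512", 2166034, 233004),
]

for label, cycles, rom in rows:
    print(f"{label}: {speedup(cycles)}x, ROM rel. {relative(rom, BASE_ROM)}")
```

Under these assumptions the sketch reproduces the tabulated values for both rows (0.4x and 0.854 for the TFLM reference kernels, 6.9x and 1.003 for the muRISCV-NN vector kernels at VLEN=512). A speedup below 1.0x therefore means the configuration needs more cycles than the scalar muRISCV-NN baseline.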
Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
120983358 (0.5x) | 241288 (0.914) | 68760 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
56488584 (1.0x) | 246380 (0.934) | 68760 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
46731928 (1.2x) | 246380 (0.934) | 68760 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
44385024 (1.3x) | 246380 (0.934) | 68760 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
43951220 (1.3x) | 246380 (0.934) | 68760 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
43954609 (1.3x) | 246380 (0.934) | 68760 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
43954609 (1.3x) | 246380 (0.934) | 68760 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
56260946 (Base) | 263904 (Base) | 68760 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
72452688 (0.8x) | 263040 (0.997) | 68760 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
26482258 (2.1x) | 273964 (1.038) | 68760 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
18420860 (3.1x) | 273964 (1.038) | 68760 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
14642164 (3.8x) | 273964 (1.038) | 68760 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
12924141 (4.4x) | 273964 (1.038) | 68760 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
12084461 (4.7x) | 273964 (1.038) | 68760 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
11461479 (4.9x) | 273964 (1.038) | 68760 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
15265177 (3.7x) | 265700 (1.007) | 68760 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
9682429 (5.8x) | 265700 (1.007) | 68760 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
7136815 (7.9x) | 265700 (1.007) | 68760 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
5894799 (9.5x) | 265700 (1.007) | 68760 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
4969980 (11.3x) | 265700 (1.007) | 68760 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
4724305 (11.9x) | 265700 (1.007) | 68760 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
20234944 (2.8x) | 272360 (1.032) | 68760 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
14139266 (4.0x) | 272360 (1.032) | 68760 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
11281990 (5.0x) | 272360 (1.032) | 68760 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
9983717 (5.6x) | 272360 (1.032) | 68760 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
9348837 (6.0x) | 272360 (1.032) | 68760 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
8879455 (6.3x) | 272360 (1.032) | 68760 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
2791918 (0.6x) | 380888 (0.979) | 19336 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
903686 (1.9x) | 382836 (0.984) | 19336 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
673414 (2.5x) | 382836 (0.984) | 19336 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
558278 (3.0x) | 382836 (0.984) | 19336 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
500710 (3.4x) | 382836 (0.984) | 19336 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
471926 (3.6x) | 382836 (0.984) | 19336 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
471731 (3.6x) | 382836 (0.984) | 19336 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
1689241 (Base) | 389104 (Base) | 19336 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
3041919 (0.6x) | 389108 (1.0) | 19336 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
597099 (2.8x) | 392988 (1.01) | 19336 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
483179 (3.5x) | 392988 (1.01) | 19336 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
426219 (4.0x) | 392988 (1.01) | 19336 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
397739 (4.2x) | 392988 (1.01) | 19336 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
383499 (4.4x) | 392988 (1.01) | 19336 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
381715 (4.4x) | 392988 (1.01) | 19336 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
1944832 (0.9x) | 390844 (1.004) | 19336 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1828152 (0.9x) | 390844 (1.004) | 19336 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1769812 (1.0x) | 390844 (1.004) | 19336 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1740966 (1.0x) | 390844 (1.004) | 19336 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1737335 (1.0x) | 390844 (1.004) | 19336 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1735479 (1.0x) | 390844 (1.004) | 19336 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
929404 (1.8x) | 392992 (1.01) | 19336 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
667452 (2.5x) | 392992 (1.01) | 19336 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
536476 (3.1x) | 392992 (1.01) | 19336 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
470988 (3.6x) | 392992 (1.01) | 19336 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
438244 (3.9x) | 392992 (1.01) | 19336 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
434156 (3.9x) | 392992 (1.01) | 19336 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
103371251 (0.4x) | 472024 (0.933) | 134296 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
71824044 (0.6x) | 477392 (0.944) | 134296 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
66662636 (0.7x) | 477392 (0.944) | 134296 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
64210956 (0.7x) | 477392 (0.944) | 134296 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
63178652 (0.7x) | 477392 (0.944) | 134296 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
62762657 (0.7x) | 477392 (0.944) | 134296 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
62730373 (0.7x) | 477392 (0.944) | 134296 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
44945967 (Base) | 505952 (Base) | 134296 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
44760610 (1.0x) | 504764 (0.998) | 134296 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
19334831 (2.3x) | 515704 (1.019) | 134296 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
16459215 (2.7x) | 515704 (1.019) | 134296 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
15081423 (3.0x) | 515704 (1.019) | 134296 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
14580572 (3.1x) | 515704 (1.019) | 134296 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
14374052 (3.1x) | 515704 (1.019) | 134296 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
14363394 (3.1x) | 515704 (1.019) | 134296 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
13385338 (3.4x) | 506644 (1.001) | 134296 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
10064338 (4.5x) | 506644 (1.001) | 134296 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8772910 (5.1x) | 506644 (1.001) | 134296 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8268920 (5.4x) | 506644 (1.001) | 134296 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8224540 (5.5x) | 506644 (1.001) | 134296 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8227929 (5.5x) | 506644 (1.001) | 134296 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
22209979 (2.0x) | 514344 (1.017) | 134296 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
19426235 (2.3x) | 514344 (1.017) | 134296 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
18094379 (2.5x) | 514344 (1.017) | 134296 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
17593456 (2.6x) | 514344 (1.017) | 134296 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
17386900 (2.6x) | 514344 (1.017) | 134296 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
17376224 (2.6x) | 514344 (1.017) | 134296 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Original data
Click here to download the raw files for this benchmark.