Benchmarks 2024 11 26 TFLM LLVM O3 spike_rv32_min - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
  - Spike: eb0a3e2b0a7c57522928be39de95cd9f8c6dc636
  - Spike PK: fix-gcc14-rvv
Toolchains
- RISC-V GCC:
  - Scalar: riscv32-unknown-elf-gcc (g8b4bb54e6c4) 14.2.1 20241118
  - Vector: riscv32-unknown-elf-gcc (g8b4bb54e6c4) 14.2.1 20241118
  - Packed: Self-compiled using patches found in https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
- LLVM/Clang: clang version 18.1.8 (https://github.com/llvm/llvm-project.git 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
  - Linker: lld (TODO)
Models
- MLPerfTiny Benchmark
- TODO: others!
Frameworks
- MLonMCU: develop
- TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130
Miscellaneous
- Used the -Os flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -O3, Target: spike_rv32_min)
Audio Wake Words (aww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
38350906.0 (0.4x) | 216580 (0.888) | 35992 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
28469325.0 (0.5x) | 231876 (0.951) | 35992 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
27066902.0 (0.6x) | 232124 (0.952) | 35992 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
26357172.0 (0.6x) | 232412 (0.953) | 35992 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
26005377.0 (0.6x) | 232692 (0.954) | 35992 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
41851929.0 (0.4x) | 232912 (0.955) | 35992 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
41857724.0 (0.4x) | 233204 (0.956) | 35992 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
15093474.0 (Base) | 243892 (Base) | 35992 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
14891253.0 (1.0x) | 243580 (0.999) | 35992 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
5361901.0 (2.8x) | 262116 (1.075) | 35992 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4937729.0 (3.1x) | 262328 (1.076) | 35992 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 262576 (1.077) | 35992 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
5354836.0 (2.8x) | 262724 (1.077) | 35992 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
13818054.0 (1.1x) | 262832 (1.078) | 35992 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
13818187.0 (1.1x) | 262944 (1.078) | 35992 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 246412 (1.01) | 35992 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2827806.0 (5.3x) | 246452 (1.01) | 35992 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2150496.0 (7.0x) | 246476 (1.011) | 35992 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2109594.0 (7.2x) | 246520 (1.011) | 35992 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2110015.0 (7.2x) | 246520 (1.011) | 35992 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2110340.0 (7.2x) | 246520 (1.011) | 35992 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
6411255.0 (2.4x) | 263036 (1.078) | 35992 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5445489.0 (2.8x) | 263124 (1.079) | 35992 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 263372 (1.08) | 35992 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
6183777.0 (2.4x) | 263520 (1.08) | 35992 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
14649707.0 (1.0x) | 263580 (1.081) | 35992 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
14649840.0 (1.0x) | 263692 (1.081) | 35992 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Image Classification (resnet)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
119733118.0 (0.5x) | 256904 (0.936) | 68760 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
- (?x) | 273960 (0.998) | 68760 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 274276 (0.999) | 68760 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 274744 (1.0) | 68760 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 275144 (1.002) | 68760 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
122971099.0 (0.5x) | 275484 (1.003) | 68760 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
122987184.0 (0.5x) | 276128 (1.006) | 68760 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
56130322.0 (Base) | 274612 (Base) | 68760 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
72325671.0 (0.8x) | 274100 (0.998) | 68760 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
- (?x) | 296812 (1.081) | 68760 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 297064 (1.082) | 68760 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 297552 (1.084) | 68760 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 297844 (1.085) | 68760 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 298096 (1.086) | 68760 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 298592 (1.087) | 68760 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
15167033.0 (3.7x) | 277708 (1.011) | 68760 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
9584265.0 (5.9x) | 277740 (1.011) | 68760 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
7038621.0 (8.0x) | 277764 (1.011) | 68760 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
- (?x) | 277808 (1.012) | 68760 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
- (?x) | 277808 (1.012) | 68760 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
- (?x) | 277808 (1.012) | 68760 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
- (?x) | 296012 (1.078) | 68760 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 296264 (1.079) | 68760 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 296752 (1.081) | 68760 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 297044 (1.082) | 68760 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 297296 (1.083) | 68760 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 297792 (1.084) | 68760 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Anomaly Detection (toycar)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
2809616.0 (0.6x) | 386588 (0.98) | 19336 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
- (?x) | 390488 (0.989) | 19336 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 390580 (0.99) | 19336 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 390712 (0.99) | 19336 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 390872 (0.99) | 19336 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 390972 (0.991) | 19336 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 391104 (0.991) | 19336 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
1694664.0 (Base) | 394668 (Base) | 19336 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
3043955.0 (0.6x) | 394672 (1.0) | 19336 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
- (?x) | 400332 (1.014) | 19336 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400396 (1.015) | 19336 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400492 (1.015) | 19336 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400608 (1.015) | 19336 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400668 (1.015) | 19336 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400744 (1.015) | 19336 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
1947290.0 (0.9x) | 396252 (1.004) | 19336 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1830610.0 (0.9x) | 396232 (1.004) | 19336 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1772270.0 (1.0x) | 396240 (1.004) | 19336 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1743424.0 (1.0x) | 396300 (1.004) | 19336 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1739793.0 (1.0x) | 396300 (1.004) | 19336 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1737937.0 (1.0x) | 396300 (1.004) | 19336 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
- (?x) | 400336 (1.014) | 19336 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400400 (1.015) | 19336 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400496 (1.015) | 19336 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400612 (1.015) | 19336 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400672 (1.015) | 19336 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 400748 (1.015) | 19336 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Visual Wake Words (vww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
101120758.0 (0.4x) | 490220 (0.947) | 134296 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
- (?x) | 505516 (0.977) | 134296 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 505764 (0.977) | 134296 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 506052 (0.978) | 134296 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 506332 (0.978) | 134296 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 506552 (0.979) | 134296 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 506844 (0.979) | 134296 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
45136117.0 (Base) | 517532 (Base) | 134296 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
44714827.0 (1.0x) | 517220 (0.999) | 134296 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
- (?x) | 535756 (1.035) | 134296 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 535968 (1.036) | 134296 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 536216 (1.036) | 134296 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 536364 (1.036) | 134296 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 536472 (1.037) | 134296 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 536584 (1.037) | 134296 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
13254981.0 (3.4x) | 520052 (1.005) | 134296 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
9933987.0 (4.5x) | 520092 (1.005) | 134296 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8642608.0 (5.2x) | 520116 (1.005) | 134296 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8138670.0 (5.5x) | 520160 (1.005) | 134296 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8091343.0 (5.6x) | 520160 (1.005) | 134296 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8091668.0 (5.6x) | 520160 (1.005) | 134296 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
- (?x) | 536676 (1.037) | 134296 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 536764 (1.037) | 134296 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 537012 (1.038) | 134296 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 537160 (1.038) | 134296 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 537220 (1.038) | 134296 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
- (?x) | 537332 (1.038) | 134296 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
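
The tables do not state how the derived columns are computed. The published figures are consistent with the speedup being the Base row's cycle count divided by the row's cycle count, and the "rel." values being the row's size divided by the Base row's size. The following Python sketch illustrates that reading with three rows from the Audio Wake Words table. The helper names (`speedup`, `relative`, `format_row`) are hypothetical and are not part of MLonMCU or muRISCV-NN.

```python
from typing import Optional

def speedup(base_cycles: float, cycles: Optional[float]) -> Optional[float]:
    """Assumed definition: base cycles divided by this run's cycles.

    Returns None for runs without a cycle count (shown as "- (?x)").
    """
    if cycles is None:
        return None
    return base_cycles / cycles

def relative(value: float, base_value: float) -> float:
    """Assumed definition of the "rel." columns: value divided by the Base value."""
    return value / base_value

# Base row of the Audio Wake Words table (muRISCV-NN, Scalar, RV32IM).
BASE = {"cycles": 15093474.0, "rom": 243892}

# Three rows copied from the same table; None marks a missing cycle count.
rows = [
    {"label": "TFLM Reference, RV32IM", "cycles": 38350906.0, "rom": 216580},
    {"label": "muRISCV-NN Vector, VLEN=512", "cycles": 2150496.0, "rom": 246476},
    {"label": "muRISCV-NN Scalar, VLEN=512", "cycles": None, "rom": 262576},
]

def format_row(row: dict) -> str:
    """Render one row with the speedup to one decimal and ROM rel. to three."""
    s = speedup(BASE["cycles"], row["cycles"])
    s_text = "?x" if s is None else f"{s:.1f}x"
    rom_rel = relative(row["rom"], BASE["rom"])
    return f"{row['label']}: {s_text}, ROM rel. {rom_rel:.3f}"

for row in rows:
    print(format_row(row))
```

Under these assumptions the sketch gives 0.4x and 7.0x for the first two rows, matching the table, and reports `?x` for the run without a cycle count. The tables appear to drop trailing zeros in the relative values (for example `1.01`), whereas this sketch always prints three decimals.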
Original data
Click here to download the raw files for this benchmark.