Benchmarks 2024 11 21 TFLM LLVM O3 spike_rv32_min - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
  - Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
  - Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Toolchains
- RISC-V GCC:
  - Scalar: riscv32-unknown-elf-gcc (gc891d8dc23e) 13.2.0
  - Vector: riscv32-unknown-elf-gcc (gc891d8dc23e) 13.2.0
  - Packed: Self-compiled using the patches from https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
- LLVM/Clang: clang version 18.1.8 (https://github.com/llvm/llvm-project.git 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
  - Linker: lld (TODO)
Models
- MLPerfTiny Benchmark
- TODO: others!
Frameworks
- MLonMCU: develop
- TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130
Miscellaneous
- Used -Os flag for compilation.
- Benchmarks generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in Bytes
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -O3, Target: spike_rv32_min )
Audio Wake Words (aww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
38345597 (0.4x) | 212548 (0.886) | 35992 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
28095021 (0.5x) | 227256 (0.947) | 35992 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
26689749 (0.6x) | 227256 (0.947) | 35992 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
25985485 (0.6x) | 227256 (0.947) | 35992 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
25636742 (0.6x) | 227256 (0.947) | 35992 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
41483486 (0.4x) | 227256 (0.947) | 35992 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
41490264 (0.4x) | 227256 (0.947) | 35992 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
15093301 (Base) | 239860 (Base) | 35992 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
14894000 (1.0x) | 239548 (0.999) | 35992 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
5356284 (2.8x) | 257488 (1.073) | 35992 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4930543 (3.1x) | 257488 (1.073) | 35992 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4612532 (3.3x) | 257488 (1.073) | 35992 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
5364165 (2.8x) | 257488 (1.073) | 35992 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
13833479 (1.1x) | 257488 (1.073) | 35992 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
13836868 (1.1x) | 257488 (1.073) | 35992 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
4072290 (3.7x) | 242028 (1.009) | 35992 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2817328 (5.4x) | 242028 (1.009) | 35992 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2140018 (7.1x) | 242028 (1.009) | 35992 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2099034 (7.2x) | 242028 (1.009) | 35992 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2102423 (7.2x) | 242028 (1.009) | 35992 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
2105812 (7.2x) | 242028 (1.009) | 35992 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
6409949 (2.4x) | 258360 (1.077) | 35992 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5740150 (2.6x) | 258360 (1.077) | 35992 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
5405131 (2.8x) | 258360 (1.077) | 35992 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
6489780 (2.3x) | 258360 (1.077) | 35992 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
14962838 (1.0x) | 258360 (1.077) | 35992 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
14969616 (1.0x) | 258360 (1.077) | 35992 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
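The page does not spell out how the parenthesized values are derived. The numbers in the Audio Wake Words table are consistent with the following reading, shown here as a minimal sketch: speedup is the base row's cycle count divided by the row's cycle count, and the relative memory value is the row's size divided by the base row's size. The function names and the rounding (one decimal for speedup, three for memory) are assumptions made for illustration; the input values are copied from the table.

```python
# Values copied from the "Base" row (muRISCV-NN, Scalar, RV32IM) of the
# Audio Wake Words table.
BASE_CYCLES = 15093301
BASE_ROM = 239860  # bytes


def speedup(cycles: int, base_cycles: int = BASE_CYCLES) -> float:
    """Assumed definition: base cycles / row cycles (above 1 means faster)."""
    return base_cycles / cycles


def relative_size(size: int, base_size: int = BASE_ROM) -> float:
    """Assumed definition: row size / base size (above 1 means larger)."""
    return size / base_size


# muRISCV-NN, Vector, VLEN=512 row
print(f"{speedup(2140018):.1f}x")      # 7.1x
print(f"{relative_size(242028):.3f}")  # 1.009

# TFLM reference kernels, RV32IM row
print(f"{speedup(38345597):.1f}x")     # 0.4x
print(f"{relative_size(212548):.3f}")  # 0.886
```

Under this reading, every table uses its own muRISCV-NN Scalar RV32IM row as the base, so speedups are comparable within a model but not across models.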
Image Classification (resnet)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
119733582 (0.5x) | 252872 (0.935) | 68760 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
52090761 (1.1x) | 269292 (0.995) | 68760 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
44429316 (1.3x) | 269292 (0.995) | 68760 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
82726472 (0.7x) | 269292 (0.995) | 68760 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
107642711 (0.5x) | 269292 (0.995) | 68760 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
122981451 (0.5x) | 269292 (0.995) | 68760 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
122988229 (0.5x) | 269292 (0.995) | 68760 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
56124814 (Base) | 270580 (Base) | 68760 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
72286157 (0.8x) | 270068 (0.998) | 68760 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
11827718 (4.7x) | 292172 (1.08) | 68760 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
11328993 (5.0x) | 292172 (1.08) | 68760 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
14188577 (4.0x) | 292172 (1.08) | 68760 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
15675764 (3.6x) | 292172 (1.08) | 68760 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
16353108 (3.4x) | 292172 (1.08) | 68760 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
35569606 (1.6x) | 292172 (1.08) | 68760 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
15161479 (3.7x) | 273312 (1.01) | 68760 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
9578731 (5.9x) | 273312 (1.01) | 68760 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
7033117 (8.0x) | 273312 (1.01) | 68760 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
5791101 (9.7x) | 273312 (1.01) | 68760 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
4866282 (11.5x) | 273312 (1.01) | 68760 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
4620607 (12.1x) | 273312 (1.01) | 68760 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
17488850 (3.2x) | 291356 (1.077) | 68760 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
16456569 (3.4x) | 291356 (1.077) | 68760 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
17217426 (3.3x) | 291356 (1.077) | 68760 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
16930877 (3.3x) | 291356 (1.077) | 68760 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
16466509 (3.4x) | 291356 (1.077) | 68760 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
43252927 (1.3x) | 291356 (1.077) | 68760 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Anomaly Detection (toycar)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
2808877 (0.6x) | 382532 (0.979) | 19336 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
804495 (2.1x) | 385964 (0.988) | 19336 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
630991 (2.7x) | 385964 (0.988) | 19336 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
540527 (3.1x) | 385964 (0.988) | 19336 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
495295 (3.4x) | 385964 (0.988) | 19336 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
472679 (3.6x) | 385964 (0.988) | 19336 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
2228087 (0.8x) | 385964 (0.988) | 19336 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
1693924 (Base) | 390612 (Base) | 19336 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
3043273 (0.6x) | 390616 (1.0) | 19336 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
557426 (3.0x) | 395828 (1.013) | 19336 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
470192 (3.6x) | 395828 (1.013) | 19336 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
424976 (4.0x) | 395828 (1.013) | 19336 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
402368 (4.2x) | 395828 (1.013) | 19336 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
391064 (4.3x) | 395828 (1.013) | 19336 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
1327726 (1.3x) | 395828 (1.013) | 19336 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
1941391 (0.9x) | 391768 (1.003) | 19336 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1824711 (0.9x) | 391768 (1.003) | 19336 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1766371 (1.0x) | 391768 (1.003) | 19336 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1737525 (1.0x) | 391768 (1.003) | 19336 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1733894 (1.0x) | 391768 (1.003) | 19336 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
1732038 (1.0x) | 391768 (1.003) | 19336 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
792381 (2.1x) | 395832 (1.013) | 19336 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
610555 (2.8x) | 395832 (1.013) | 19336 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
515995 (3.3x) | 395832 (1.013) | 19336 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
468715 (3.6x) | 395832 (1.013) | 19336 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
445075 (3.8x) | 395832 (1.013) | 19336 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
2356649 (0.7x) | 395832 (1.013) | 19336 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Visual Wake Words (vww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
101114236 (0.4x) | 486188 (0.947) | 134296 (1.0) | 0 | TFLM | Reference | RV32IM | 0 | - |
67886948 (0.7x) | 500896 (0.975) | 134296 (1.0) | 128 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
65567553 (0.7x) | 500896 (0.975) | 134296 (1.0) | 256 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
65687185 (0.7x) | 500896 (0.975) | 134296 (1.0) | 512 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
71487350 (0.6x) | 500896 (0.975) | 134296 (1.0) | 1024 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
78000714 (0.6x) | 500896 (0.975) | 134296 (1.0) | 2048 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
103503534 (0.4x) | 500896 (0.975) | 134296 (1.0) | 4096 | TFLM | Reference | RV32IM_ZVE64X | 0 | Loop+SLP |
45135510 (Base) | 513500 (Base) | 134296 (Base) | 0 | muRISCV-NN | Scalar | RV32IM | 0 | - |
44649265 (1.0x) | 513188 (0.999) | 134296 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32IM | 0 | - |
17467082 (2.6x) | 531128 (1.034) | 134296 (1.0) | 128 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
17018542 (2.7x) | 531128 (1.034) | 134296 (1.0) | 256 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
18203883 (2.5x) | 531128 (1.034) | 134296 (1.0) | 512 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
21350208 (2.1x) | 531128 (1.034) | 134296 (1.0) | 1024 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
24876654 (1.8x) | 531128 (1.034) | 134296 (1.0) | 2048 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
38835797 (1.2x) | 531128 (1.034) | 134296 (1.0) | 4096 | muRISCV-NN | Scalar | RV32IM_ZVE64X | 0 | Loop+SLP |
13244004 (3.4x) | 515668 (1.004) | 134296 (1.0) | 128 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
9923028 (4.5x) | 515668 (1.004) | 134296 (1.0) | 256 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8631612 (5.2x) | 515668 (1.004) | 134296 (1.0) | 512 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8127628 (5.6x) | 515668 (1.004) | 134296 (1.0) | 1024 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8083251 (5.6x) | 515668 (1.004) | 134296 (1.0) | 2048 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
8086640 (5.6x) | 515668 (1.004) | 134296 (1.0) | 4096 | muRISCV-NN | Vector | RV32IM_ZVE64X | 0 | - |
20235231 (2.2x) | 532000 (1.036) | 134296 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
19949796 (2.3x) | 532000 (1.036) | 134296 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
21464513 (2.1x) | 532000 (1.036) | 134296 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
24610790 (1.8x) | 532000 (1.036) | 134296 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
28137212 (1.6x) | 532000 (1.036) | 134296 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
42099732 (1.1x) | 532000 (1.036) | 134296 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32IM_ZVE64X | 0 | Loop+SLP |
Original data
Click here to download the raw files for this benchmark.