Benchmarks 2024 11 26 TFLM LLVM O3 spike_rv64 - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
  - Spike: eb0a3e2b0a7c57522928be39de95cd9f8c6dc636
  - Spike PK: fix-gcc14-rvv
Toolchains
- RISC-V GCC:
  - Scalar: riscv32-unknown-elf-gcc (g8b4bb54e6c4) 14.2.1 20241118
  - Vector: riscv32-unknown-elf-gcc (g8b4bb54e6c4) 14.2.1 20241118
  - Packed: Self-compiled using the patches found in https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
- LLVM/Clang: clang version 18.1.8 (https://github.com/llvm/llvm-project.git 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
  - Linker: lld (TODO)
Models
- MLPerfTiny Benchmark
- TODO: others!
Frameworks
- MLonMCU: develop
- TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130
Miscellaneous
- Used -Os flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -O3, Target: spike_rv64)
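
The parenthesised values in the result tables appear to be ratios against the row marked Base (muRISCV-NN, Scalar, RV64GC): the speedup is the baseline cycle count divided by the row's cycle count, and the relative ROM/RAM figure is the row's size divided by the baseline size. The sketch below reproduces a few Audio Wake Words entries under that assumption; the function names are illustrative and not part of MLonMCU.

```python
def speedup(base_cycles: float, cycles: float) -> float:
    """Speedup over the baseline; values above 1 mean fewer cycles than Base."""
    return base_cycles / cycles


def relative(base_value: float, value: float) -> float:
    """Memory footprint relative to the baseline; values below 1 are smaller."""
    return value / base_value


# Values copied from the Audio Wake Words table.
BASE_CYCLES = 14667379.0  # muRISCV-NN, Scalar, RV64GC (the "Base" row)
BASE_ROM = 173468         # bytes

tflm_ref_cycles = 38221089.0   # TFLM, Reference, RV64GC
tflm_ref_rom = 153586          # bytes
vector_512_cycles = 2135508.0  # muRISCV-NN, Vector, RV64GCV, VLEN=512

print(f"{speedup(BASE_CYCLES, tflm_ref_cycles):.1f}x")    # 0.4x
print(f"{relative(BASE_ROM, tflm_ref_rom):.3f}")          # 0.885
print(f"{speedup(BASE_CYCLES, vector_512_cycles):.1f}x")  # 6.9x
```

Because Spike is run with CPI=1, the cycle counts equal the retired instruction counts, so these ratios compare instruction counts rather than timing on real hardware.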
Audio Wake Words (aww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
38221089.0 ( 0.4x ) | 153586 ( 0.885 ) | 38320 ( 1.0 ) | 0 | TFLM | Reference | RV64GC | 0 | - |
29524133.0 ( 0.5x ) | 163874 ( 0.945 ) | 38328 ( 1.0 ) | 128 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
28115606.0 ( 0.5x ) | 163884 ( 0.945 ) | 38328 ( 1.0 ) | 256 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
27411342.0 ( 0.5x ) | 164298 ( 0.947 ) | 38328 ( 1.0 ) | 512 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
27064383.0 ( 0.5x ) | 164512 ( 0.948 ) | 38328 ( 1.0 ) | 1024 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
42848944.0 ( 0.3x ) | 164732 ( 0.95 ) | 38328 ( 1.0 ) | 2048 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
42843818.0 ( 0.3x ) | 173750 ( 1.002 ) | 38328 ( 1.0 ) | 4096 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
14667379.0 ( Base ) | 173468 ( Base ) | 38320 ( Base ) | 0 | muRISCV-NN | Scalar | RV64GC | 0 | - |
15017771.0 ( 1.0x ) | 173420 ( 1.0 ) | 38320 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV64GC | 0 | - |
5164728.0 ( 2.8x ) | 195170 ( 1.125 ) | 38328 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
4731403.0 ( 3.1x ) | 195196 ( 1.125 ) | 38328 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
4411868.0 ( 3.3x ) | 195512 ( 1.127 ) | 38328 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
5158073.0 ( 2.8x ) | 195680 ( 1.128 ) | 38328 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
13632551.0 ( 1.1x ) | 195790 ( 1.129 ) | 38328 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
13632551.0 ( 1.1x ) | 201066 ( 1.159 ) | 38328 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
4067788.0 ( 3.6x ) | 176206 ( 1.016 ) | 38320 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV64GCV | 0 | - |
2812820.0 ( 5.2x ) | 176008 ( 1.015 ) | 38320 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV64GCV | 0 | - |
2135508.0 ( 6.9x ) | 176028 ( 1.015 ) | 38320 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV64GCV | 0 | - |
2096389.0 ( 7.0x ) | 176070 ( 1.015 ) | 38320 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV64GCV | 0 | - |
2096389.0 ( 7.0x ) | 176070 ( 1.015 ) | 38320 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV64GCV | 0 | - |
2098254.0 ( 7.0x ) | 176070 ( 1.015 ) | 38320 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV64GCV | 0 | - |
6574453.0 ( 2.2x ) | 196036 ( 1.13 ) | 38328 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
5903090.0 ( 2.5x ) | 196062 ( 1.13 ) | 38328 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
5566547.0 ( 2.6x ) | 196388 ( 1.132 ) | 38328 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
6649672.0 ( 2.2x ) | 196556 ( 1.133 ) | 38328 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
15126938.0 ( 1.0x ) | 196618 ( 1.133 ) | 38328 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
15126938.0 ( 1.0x ) | 201894 ( 1.164 ) | 38328 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
Image Classification (resnet)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
118598202.0 ( 0.5x ) | 193416 ( 0.936 ) | 71088 ( 1.0 ) | 0 | TFLM | Reference | RV64GC | 0 | - |
52636826.0 ( 1.1x ) | 205394 ( 0.994 ) | 71096 ( 1.0 ) | 128 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
44975281.0 ( 1.2x ) | 205496 ( 0.994 ) | 71096 ( 1.0 ) | 256 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
82549432.0 ( 0.7x ) | 206138 ( 0.997 ) | 71096 ( 1.0 ) | 512 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
107251187.0 ( 0.5x ) | 206482 ( 0.999 ) | 71096 ( 1.0 ) | 1024 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
122522368.0 ( 0.5x ) | 206822 ( 1.001 ) | 71096 ( 1.0 ) | 2048 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
122529982.0 ( 0.5x ) | 216212 ( 1.046 ) | 71096 ( 1.0 ) | 4096 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
55604252.0 ( Base ) | 206666 ( Base ) | 71088 ( Base ) | 0 | muRISCV-NN | Scalar | RV64GC | 0 | - |
71387730.0 ( 0.8x ) | 206404 ( 0.999 ) | 71088 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV64GC | 0 | - |
13516438.0 ( 4.1x ) | 230610 ( 1.116 ) | 71096 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
11090390.0 ( 5.0x ) | 230686 ( 1.116 ) | 71096 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
13934402.0 ( 4.0x ) | 231258 ( 1.119 ) | 71096 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
15416357.0 ( 3.6x ) | 231564 ( 1.12 ) | 71096 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
16078717.0 ( 3.5x ) | 231818 ( 1.122 ) | 71096 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
35260512.0 ( 1.6x ) | 237460 ( 1.149 ) | 71096 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
15147568.0 ( 3.7x ) | 209650 ( 1.014 ) | 71088 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV64GCV | 0 | - |
9564397.0 ( 5.8x ) | 209444 ( 1.013 ) | 71088 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV64GCV | 0 | - |
7018571.0 ( 7.9x ) | 209464 ( 1.014 ) | 71088 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV64GCV | 0 | - |
5778289.0 ( 9.6x ) | 209504 ( 1.014 ) | 71088 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV64GCV | 0 | - |
4850025.0 ( 11.5x ) | 209504 ( 1.014 ) | 71088 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV64GCV | 0 | - |
4602798.0 ( 12.1x ) | 209504 ( 1.014 ) | 71088 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV64GCV | 0 | - |
16917311.0 ( 3.3x ) | 230140 ( 1.114 ) | 71096 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
15882343.0 ( 3.5x ) | 230216 ( 1.114 ) | 71096 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
16639796.0 ( 3.3x ) | 230792 ( 1.117 ) | 71096 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
16344947.0 ( 3.4x ) | 231098 ( 1.118 ) | 71096 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
15886134.0 ( 3.5x ) | 231352 ( 1.119 ) | 71096 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
42658277.0 ( 1.3x ) | 236994 ( 1.147 ) | 71096 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
Anomaly Detection (toycar)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
2781526.0 ( 0.6x ) | 341958 ( 0.984 ) | 21448 ( 1.0 ) | 0 | TFLM | Reference | RV64GC | 0 | - |
787324.0 ( 2.1x ) | 345748 ( 0.995 ) | 21448 ( 1.0 ) | 128 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
613312.0 ( 2.7x ) | 345692 ( 0.995 ) | 21448 ( 1.0 ) | 256 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
522839.0 ( 3.1x ) | 345796 ( 0.995 ) | 21448 ( 1.0 ) | 512 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
477625.0 ( 3.4x ) | 345912 ( 0.995 ) | 21448 ( 1.0 ) | 1024 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
454991.0 ( 3.6x ) | 346012 ( 0.996 ) | 21448 ( 1.0 ) | 2048 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
2209875.0 ( 0.7x ) | 349562 ( 1.006 ) | 21448 ( 1.0 ) | 4096 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
1630806.0 ( Base ) | 347530 ( Base ) | 21448 ( Base ) | 0 | muRISCV-NN | Scalar | RV64GC | 0 | - |
2979806.0 ( 0.5x ) | 347534 ( 1.0 ) | 21448 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV64GC | 0 | - |
555295.0 ( 2.9x ) | 352908 ( 1.015 ) | 21448 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
468225.0 ( 3.5x ) | 352832 ( 1.015 ) | 21448 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
423015.0 ( 3.9x ) | 352888 ( 1.015 ) | 21448 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
405527.0 ( 4.0x ) | 352978 ( 1.016 ) | 21448 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
389104.0 ( 4.2x ) | 353040 ( 1.016 ) | 21448 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
1327622.0 ( 1.2x ) | 355462 ( 1.023 ) | 21448 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
1944151.0 ( 0.8x ) | 348792 ( 1.004 ) | 21448 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV64GCV | 0 | - |
1825784.0 ( 0.9x ) | 348646 ( 1.003 ) | 21448 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV64GCV | 0 | - |
1766600.0 ( 0.9x ) | 348602 ( 1.003 ) | 21448 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV64GCV | 0 | - |
1737343.0 ( 0.9x ) | 348642 ( 1.003 ) | 21448 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV64GCV | 0 | - |
1733659.0 ( 0.9x ) | 348642 ( 1.003 ) | 21448 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV64GCV | 0 | - |
1731775.0 ( 0.9x ) | 348642 ( 1.003 ) | 21448 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV64GCV | 0 | - |
788210.0 ( 2.1x ) | 352912 ( 1.015 ) | 21448 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
606926.0 ( 2.7x ) | 352836 ( 1.015 ) | 21448 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
512364.0 ( 3.2x ) | 352892 ( 1.015 ) | 21448 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
470212.0 ( 3.5x ) | 352982 ( 1.016 ) | 21448 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
441445.0 ( 3.7x ) | 353044 ( 1.016 ) | 21448 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
2359516.0 ( 0.7x ) | 355466 ( 1.023 ) | 21448 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
Visual Wake Words (vww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
100751076.0 ( 0.4x ) | 427228 ( 0.956 ) | 136624 ( 1.0 ) | 0 | TFLM | Reference | RV64GC | 0 | - |
71098833.0 ( 0.6x ) | 437516 ( 0.979 ) | 136632 ( 1.0 ) | 128 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
68702222.0 ( 0.6x ) | 437526 ( 0.979 ) | 136632 ( 1.0 ) | 256 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
68785085.0 ( 0.6x ) | 437940 ( 0.979 ) | 136632 ( 1.0 ) | 512 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
74526501.0 ( 0.6x ) | 438154 ( 0.98 ) | 136632 ( 1.0 ) | 1024 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
81014165.0 ( 0.5x ) | 438374 ( 0.98 ) | 136632 ( 1.0 ) | 2048 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
106459614.0 ( 0.4x ) | 447160 ( 1.0 ) | 136632 ( 1.0 ) | 4096 | TFLM | Reference | RV64GCV | 0 | Loop+SLP |
43862607.0 ( Base ) | 447110 ( Base ) | 136624 ( Base ) | 0 | muRISCV-NN | Scalar | RV64GC | 0 | - |
44771384.0 ( 1.0x ) | 447062 ( 1.0 ) | 136624 ( 1.0 ) | 0 | muRISCV-NN | Vector (Portable) | RV64GC | 0 | - |
16851931.0 ( 2.6x ) | 468812 ( 1.049 ) | 136632 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
16426471.0 ( 2.7x ) | 468838 ( 1.049 ) | 136632 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
17610814.0 ( 2.5x ) | 469154 ( 1.049 ) | 136632 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
20762615.0 ( 2.1x ) | 469322 ( 1.05 ) | 136632 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
24289646.0 ( 1.8x ) | 469432 ( 1.05 ) | 136632 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
38247469.0 ( 1.1x ) | 474476 ( 1.061 ) | 136632 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
13271992.0 ( 3.3x ) | 449848 ( 1.006 ) | 136624 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV64GCV | 0 | - |
9951008.0 ( 4.4x ) | 449650 ( 1.006 ) | 136624 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV64GCV | 0 | - |
8659588.0 ( 5.1x ) | 449670 ( 1.006 ) | 136624 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV64GCV | 0 | - |
8157467.0 ( 5.4x ) | 449712 ( 1.006 ) | 136624 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV64GCV | 0 | - |
8109701.0 ( 5.4x ) | 449712 ( 1.006 ) | 136624 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV64GCV | 0 | - |
8111566.0 ( 5.4x ) | 449712 ( 1.006 ) | 136624 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV64GCV | 0 | - |
20497377.0 ( 2.1x ) | 469678 ( 1.05 ) | 136632 ( 1.0 ) | 128 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
20219728.0 ( 2.2x ) | 469704 ( 1.051 ) | 136632 ( 1.0 ) | 256 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
21742685.0 ( 2.0x ) | 470030 ( 1.051 ) | 136632 ( 1.0 ) | 512 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
24894446.0 ( 1.8x ) | 470198 ( 1.052 ) | 136632 ( 1.0 ) | 1024 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
28413164.0 ( 1.5x ) | 470260 ( 1.052 ) | 136632 ( 1.0 ) | 2048 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
42375971.0 ( 1.0x ) | 470304 ( 1.063 ) | 136632 ( 1.0 ) | 4096 | muRISCV-NN | Vector (Portable) | RV64GCV | 0 | Loop+SLP |
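
For quick comparisons, rows of the result tables can be parsed into records and ranked by cycle count. The following is a minimal sketch, assuming each table row has been joined onto a single pipe-separated line; the `Row` type and helper names are hypothetical and not part of MLonMCU, and the sample rows are copied from the Audio Wake Words table.

```python
import re
from typing import NamedTuple


class Row(NamedTuple):
    cycles: float
    rom_bytes: int
    ram_bytes: int
    vlen: int
    kernels: str
    mode: str
    arch: str


def leading_number(cell: str) -> float:
    """Return the number at the start of a cell such as '176028 ( 1.015 )'."""
    match = re.match(r"\d+(?:\.\d+)?", cell)
    if match is None:
        raise ValueError(f"no leading number in {cell!r}")
    return float(match.group())


def parse_row(line: str) -> Row:
    """Split one pipe-separated table row into typed fields."""
    cells = [c.strip() for c in line.strip().rstrip("|").split("|")]
    return Row(
        cycles=leading_number(cells[0]),
        rom_bytes=int(leading_number(cells[1])),
        ram_bytes=int(leading_number(cells[2])),
        vlen=int(cells[3]),
        kernels=cells[4],
        mode=cells[5],
        arch=cells[6],
    )


# Three sample rows from the Audio Wake Words results.
ROWS = [
    "4411868.0 ( 3.3x ) | 195512 ( 1.127 ) | 38328 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |",
    "4067788.0 ( 3.6x ) | 176206 ( 1.016 ) | 38320 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV64GCV | 0 | - |",
    "2135508.0 ( 6.9x ) | 176028 ( 1.015 ) | 38320 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV64GCV | 0 | - |",
]

parsed = [parse_row(r) for r in ROWS]
fastest = min(parsed, key=lambda r: r.cycles)
print(fastest.mode, fastest.vlen, fastest.cycles)  # Vector 512 2135508.0
```

The same records can be filtered by `kernels`, `mode`, or `vlen` to compare, for example, auto-vectorized scalar kernels against the hand-written vector kernels at a given VLEN.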
Original data
Click here to download the raw files for this benchmark.