Benchmarks 2024 11 21 TFLM GCC O3 spike_rv32 - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
  - Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
  - Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Toolchains
- RISC-V GCC:
  - Scalar: riscv32-unknown-elf-gcc (gc891d8dc23e) 13.2.0
  - Vector: riscv32-unknown-elf-gcc (gc891d8dc23e) 13.2.0
  - Packed: Self compiled using patches found in https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
Models
- MLPerfTiny Benchmark
- TODO: others!
Frameworks
- MLonMCU: develop
- TFLM: 8eb6b23de4470d6a8da3131650d6a67514dfa130
Miscellaneous
- Used -O3 flag for compilation.
- Benchmarks generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -O3, Target: spike_rv32)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
53179108 (0.3x) | 145608 (0.817) | 36144 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
53179108 (0.3x) | 145640 (0.817) | 36144 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
53179108 (0.3x) | 145640 (0.817) | 36144 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
53179108 (0.3x) | 145640 (0.817) | 36144 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
53179108 (0.3x) | 145640 (0.817) | 36144 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
53179108 (0.3x) | 145640 (0.817) | 36144 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
53179108 (0.3x) | 145640 (0.817) | 36144 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
15403031 (Base) | 178306 (Base) | 36160 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
15216243 (1.0x) | 175044 (0.982) | 36160 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
15403035 (1.0x) | 178354 (1.0) | 36160 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15403035 (1.0x) | 178354 (1.0) | 36160 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15403035 (1.0x) | 178354 (1.0) | 36160 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15403035 (1.0x) | 178354 (1.0) | 36160 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15403035 (1.0x) | 178354 (1.0) | 36160 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15403035 (1.0x) | 178354 (1.0) | 36160 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
7403111 (2.1x) | 179942 (1.009) | 36160 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
5218005 (3.0x) | 179942 (1.009) | 36160 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
3743759 (4.1x) | 179942 (1.009) | 36160 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
3706167 (4.2x) | 179942 (1.009) | 36160 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
3709556 (4.2x) | 179942 (1.009) | 36160 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
3712945 (4.1x) | 179942 (1.009) | 36160 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
15216247 (1.0x) | 175092 (0.982) | 36160 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
15216247 (1.0x) | 175092 (0.982) | 36160 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
15216247 (1.0x) | 175092 (0.982) | 36160 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
15216247 (1.0x) | 175092 (0.982) | 36160 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
15216247 (1.0x) | 175092 (0.982) | 36160 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
15216247 (1.0x) | 175092 (0.982) | 36160 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
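
The Speedup and rel. values appear to be ratios against the row marked Base (muRISCV-NN, Scalar, RV32GC, VLEN 0): baseline cycles divided by measured cycles, and measured bytes divided by baseline bytes. The Python sketch below reproduces a few Audio Wake Words entries under that assumption. The helper names `speedup` and `relative` are illustrative only and are not part of MLonMCU or muRISCV-NN.

```python
# Assumption: speedup = base_cycles / cycles (rounded to 1 decimal) and
# rel. = value_bytes / base_bytes (rounded to 3 decimals). This matches the
# published numbers but is not taken from the MLonMCU source.

def speedup(base_cycles: int, cycles: int) -> float:
    """Return the speedup over the baseline; values above 1.0 mean fewer cycles."""
    return round(base_cycles / cycles, 1)


def relative(base_bytes: int, value_bytes: int) -> float:
    """Return a memory footprint relative to the baseline footprint."""
    return round(value_bytes / base_bytes, 3)


# Baseline row of the Audio Wake Words table (muRISCV-NN, Scalar, RV32GC).
BASE_CYCLES = 15403031
BASE_ROM = 178306

# Cycle counts of the muRISCV-NN "Vector" rows, keyed by VLEN.
vector_cycles = {128: 7403111, 256: 5218005, 512: 3743759, 1024: 3706167}

for vlen, cycles in vector_cycles.items():
    print(f"VLEN={vlen}: {speedup(BASE_CYCLES, cycles)}x")

# TFLM reference kernels on RV32GC: slower than the baseline, smaller ROM.
print(f"TFLM reference: {speedup(BASE_CYCLES, 53179108)}x, "
      f"ROM {relative(BASE_ROM, 145608)}")
```

The same two ratios can be applied to the other tables by swapping in the Base row of the respective model.
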

Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
169741639 (0.3x) | 193140 (0.904) | 68916 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
169737130 (0.3x) | 192992 (0.903) | 68916 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
169737130 (0.3x) | 192992 (0.903) | 68916 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
169737130 (0.3x) | 192992 (0.903) | 68916 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
169737130 (0.3x) | 192992 (0.903) | 68916 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
169737130 (0.3x) | 192992 (0.903) | 68916 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
169737130 (0.3x) | 192992 (0.903) | 68916 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
54521935 (Base) | 213764 (Base) | 68916 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
72570040 (0.8x) | 212960 (0.996) | 68916 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
54639263 (1.0x) | 213674 (1.0) | 68916 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
54639263 (1.0x) | 213674 (1.0) | 68916 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
54639263 (1.0x) | 213674 (1.0) | 68916 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
54639263 (1.0x) | 213674 (1.0) | 68916 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
54639263 (1.0x) | 213674 (1.0) | 68916 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
54639263 (1.0x) | 213674 (1.0) | 68916 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
28539785 (1.9x) | 220596 (1.032) | 68916 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
17903709 (3.0x) | 220596 (1.032) | 68916 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
13120199 (4.2x) | 220596 (1.032) | 68916 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
10786471 (5.1x) | 220596 (1.032) | 68916 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8757716 (6.2x) | 220596 (1.032) | 68916 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8207369 (6.6x) | 220596 (1.032) | 68916 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
72692699 (0.8x) | 212870 (0.996) | 68916 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
72692699 (0.8x) | 212870 (0.996) | 68916 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
72692699 (0.8x) | 212870 (0.996) | 68916 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
72692699 (0.8x) | 212870 (0.996) | 68916 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
72692699 (0.8x) | 212870 (0.996) | 68916 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
72692699 (0.8x) | 212870 (0.996) | 68916 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
2783581 (0.6x) | 337804 (0.978) | 19424 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
2783581 (0.6x) | 337846 (0.978) | 19424 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
2783581 (0.6x) | 337846 (0.978) | 19424 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
2783581 (0.6x) | 337846 (0.978) | 19424 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
2783581 (0.6x) | 337846 (0.978) | 19424 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
2783581 (0.6x) | 337846 (0.978) | 19424 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
2783581 (0.6x) | 337846 (0.978) | 19424 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
1649315 (Base) | 345434 (Base) | 19424 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
2732748 (0.6x) | 345436 (1.0) | 19424 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
1649279 (1.0x) | 345478 (1.0) | 19424 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
1649279 (1.0x) | 345478 (1.0) | 19424 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
1649279 (1.0x) | 345478 (1.0) | 19424 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
1649279 (1.0x) | 345478 (1.0) | 19424 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
1649279 (1.0x) | 345478 (1.0) | 19424 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
1649279 (1.0x) | 345478 (1.0) | 19424 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
2049441 (0.8x) | 349262 (1.011) | 19424 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1768281 (0.9x) | 349262 (1.011) | 19424 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1627701 (1.0x) | 349262 (1.011) | 19424 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1558215 (1.1x) | 349262 (1.011) | 19424 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1552853 (1.1x) | 349262 (1.011) | 19424 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
1551766 (1.1x) | 349262 (1.011) | 19424 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2732712 (0.6x) | 345480 (1.0) | 19424 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
2732712 (0.6x) | 345480 (1.0) | 19424 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
2732712 (0.6x) | 345480 (1.0) | 19424 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
2732712 (0.6x) | 345480 (1.0) | 19424 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
2732712 (0.6x) | 345480 (1.0) | 19424 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
2732712 (0.6x) | 345480 (1.0) | 19424 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
138790135 (0.3x) | 419250 (0.928) | 134448 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
138790170 (0.3x) | 419282 (0.928) | 134448 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
138790170 (0.3x) | 419282 (0.928) | 134448 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
138790170 (0.3x) | 419282 (0.928) | 134448 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
138790170 (0.3x) | 419282 (0.928) | 134448 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
138790170 (0.3x) | 419282 (0.928) | 134448 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
138790170 (0.3x) | 419282 (0.928) | 134448 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
46343084 (Base) | 451948 (Base) | 134464 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
45906828 (1.0x) | 448686 (0.993) | 134464 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
46343070 (1.0x) | 451996 (1.0) | 134464 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
46343070 (1.0x) | 451996 (1.0) | 134464 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
46343070 (1.0x) | 451996 (1.0) | 134464 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
46343070 (1.0x) | 451996 (1.0) | 134464 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
46343070 (1.0x) | 451996 (1.0) | 134464 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
46343070 (1.0x) | 451996 (1.0) | 134464 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
23578581 (2.0x) | 453584 (1.004) | 134464 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
17459257 (2.7x) | 453584 (1.004) | 134464 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
15136515 (3.1x) | 453584 (1.004) | 134464 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
14077254 (3.3x) | 453584 (1.004) | 134464 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
13992815 (3.3x) | 453584 (1.004) | 134464 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
13996204 (3.3x) | 453584 (1.004) | 134464 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
45906827 (1.0x) | 448734 (0.993) | 134464 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
45906827 (1.0x) | 448734 (0.993) | 134464 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
45906827 (1.0x) | 448734 (0.993) | 134464 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
45906827 (1.0x) | 448734 (0.993) | 134464 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
45906827 (1.0x) | 448734 (0.993) | 134464 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
45906827 (1.0x) | 448734 (0.993) | 134464 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Original data
Click here to download the raw files for this benchmark.