Benchmarks CUSTOM TFLM LLVM Os - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- LLVM/Clang:
  - TODO: Version
  - Linker: lld (TODO)
- RISC-V GCC for headers, libc, ...
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TFLM: main
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- The -Os flag was used for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: llvm, Flags: -Os)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
39299313 (0.4x) | 147040 (0.869) | 36124 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
33496503 (0.5x) | 152732 (0.903) | 36132 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
31703831 (0.5x) | 152732 (0.903) | 36132 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30807495 (0.5x) | 152732 (0.903) | 36132 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30359327 (0.5x) | 152732 (0.903) | 36132 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30362716 (0.5x) | 152732 (0.903) | 36132 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
30362716 (0.5x) | 152732 (0.903) | 36132 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
15089987 (Base) | 169180 (Base) | 36124 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
6142871 (2.5x) | 179372 (1.06) | 36132 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
5140180 (2.9x) | 179372 (1.06) | 36132 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4638932 (3.3x) | 179372 (1.06) | 36132 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4351601 (3.5x) | 179372 (1.06) | 36132 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4351601 (3.5x) | 179372 (1.06) | 36132 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4358379 (3.5x) | 179372 (1.06) | 36132 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
4078846 (3.7x) | 169536 (1.002) | 36124 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2823868 (5.3x) | 169536 (1.002) | 36124 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2146550 (7.0x) | 169536 (1.002) | 36124 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2105562 (7.2x) | 169536 (1.002) | 36124 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2108951 (7.2x) | 169536 (1.002) | 36124 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
2112340 (7.1x) | 169536 (1.002) | 36124 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
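
The parenthesised values in the results tables can be reproduced from the raw numbers. The following is a minimal sketch, assuming that each "(Speedup)" is the cycle count of the row marked "Base" divided by the row's cycle count, and that each "(rel.)" is the row's memory footprint divided by the baseline footprint. The function and constant names are hypothetical and not part of MLonMCU; the numbers are taken from the Audio Wake Words results.

```python
# Sketch: how the "(Speedup)" and "(rel.)" annotations appear to be derived.
# Assumption: every row is compared against the row marked "Base"
# (muRISCV-NN, Scalar, RV32GC). All names below are hypothetical.

BASE_CYCLES = 15089987  # aww cycles of the row marked "Base"
BASE_ROM = 169180       # aww Total ROM in bytes of the row marked "Base"


def speedup(base_cycles: int, cycles: int) -> float:
    """Baseline cycles divided by measured cycles (higher means faster)."""
    return base_cycles / cycles


def relative(base_value: int, value: int) -> float:
    """Measured memory footprint divided by the baseline footprint."""
    return value / base_value


if __name__ == "__main__":
    # muRISCV-NN, Vector, RV32GCV, VLEN=512
    print(f"{speedup(BASE_CYCLES, 2146550):.1f}x")  # 7.0x
    print(f"{relative(BASE_ROM, 169536):.3f}")      # 1.002
```

A speedup below 1.0x, as in the TFLM reference rows, means the configuration needs more cycles than the baseline.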
Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
121053737 (0.5x) | 188840 (0.929) | 68892 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
56889926 (1.0x) | 194476 (0.957) | 68900 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
47133270 (1.2x) | 194476 (0.957) | 68900 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44786366 (1.3x) | 194476 (0.957) | 68900 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44352562 (1.3x) | 194476 (0.957) | 68900 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44355951 (1.3x) | 194476 (0.957) | 68900 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
44355951 (1.3x) | 194476 (0.957) | 68900 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
56368693 (Base) | 203198 (Base) | 68888 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
26157077 (2.2x) | 213618 (1.051) | 68896 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
18292290 (3.1x) | 213618 (1.051) | 68896 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14605754 (3.9x) | 213618 (1.051) | 68896 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
12929715 (4.4x) | 213618 (1.051) | 68896 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
12110515 (4.7x) | 213618 (1.051) | 68896 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
11502893 (4.9x) | 213618 (1.051) | 68896 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15254126 (3.7x) | 204270 (1.005) | 68888 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
9670930 (5.8x) | 204270 (1.005) | 68888 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
7125092 (7.9x) | 204270 (1.005) | 68888 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
5882964 (9.6x) | 204270 (1.005) | 68888 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
4958089 (11.4x) | 204270 (1.005) | 68888 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
4712386 (12.0x) | 204270 (1.005) | 68888 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
2804288 (0.6x) | 342396 (0.988) | 19376 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
898428 (1.9x) | 344096 (0.993) | 19376 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
668156 (2.5x) | 344096 (0.993) | 19376 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
553020 (3.1x) | 344096 (0.993) | 19376 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
495452 (3.4x) | 344096 (0.993) | 19376 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
466668 (3.6x) | 344096 (0.993) | 19376 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
463084 (3.7x) | 344096 (0.993) | 19376 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
1693692 (Base) | 346606 (Base) | 19376 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
598024 (2.8x) | 349842 (1.009) | 19376 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
484104 (3.5x) | 349842 (1.009) | 19376 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
427144 (4.0x) | 349842 (1.009) | 19376 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
398664 (4.2x) | 349842 (1.009) | 19376 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
384424 (4.4x) | 349842 (1.009) | 19376 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
382640 (4.4x) | 349842 (1.009) | 19376 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
595027 (2.8x) | 347268 (1.002) | 19376 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
478347 (3.5x) | 347268 (1.002) | 19376 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
420007 (4.0x) | 347268 (1.002) | 19376 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
391161 (4.3x) | 347268 (1.002) | 19376 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
387530 (4.4x) | 347268 (1.002) | 19376 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
385674 (4.4x) | 347268 (1.002) | 19376 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
103598496 (0.4x) | 420802 (0.95) | 134452 (1.0) | 128 | TFLM | Reference | RV32GC | 0 | - |
72373406 (0.6x) | 426442 (0.963) | 134460 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
67211998 (0.7x) | 426442 (0.963) | 134460 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
64760318 (0.7x) | 426442 (0.963) | 134460 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
63728014 (0.7x) | 426442 (0.963) | 134460 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
63312019 (0.7x) | 426442 (0.963) | 134460 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
63279735 (0.7x) | 426442 (0.963) | 134460 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
45201423 (Base) | 442944 (Base) | 134452 (Base) | 128 | muRISCV-NN | Scalar | RV32GC | 0 | - |
19579538 (2.3x) | 453082 (1.023) | 134460 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
16713138 (2.7x) | 453082 (1.023) | 134460 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
15339954 (2.9x) | 453082 (1.023) | 134460 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14839103 (3.0x) | 453082 (1.023) | 134460 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14632583 (3.1x) | 453082 (1.023) | 134460 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
14621925 (3.1x) | 453082 (1.023) | 134460 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
13374433 (3.4x) | 443298 (1.001) | 134452 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
10053433 (4.5x) | 443298 (1.001) | 134452 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8762005 (5.2x) | 443298 (1.001) | 134452 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8258015 (5.5x) | 443298 (1.001) | 134452 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8213635 (5.5x) | 443298 (1.001) | 134452 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
8217024 (5.5x) | 443298 (1.001) | 134452 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
Original data
Click here to download the raw files for this benchmark.