Benchmarks 2024 02 20 TVM LLVM - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- LLVM/Clang:
- TODO: Version
- Linker: lld (TODO)
- RISC-V GCC for headers, libc, ...
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TVM: Nightly Pre-Build
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- All programs were compiled with the `-Os` flag.
- Benchmarks were generated with the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
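The Speedup and rel. columns in the result tables are consistent with simple ratios against the row marked Base (muRISCV-NN, Scalar, RV32GC): speedup as base cycles divided by cycles, and rel. as bytes divided by base bytes. The sketch below reproduces one Audio Wake Words row under that assumption. The function names are illustrative and are not part of MLonMCU.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Speedup over the baseline: values above 1 mean fewer cycles than Base."""
    return base_cycles / cycles


def relative(base_bytes: int, value_bytes: int) -> float:
    """Memory footprint relative to the baseline: values above 1 mean more bytes."""
    return value_bytes / base_bytes


# Audio Wake Words: Base row (muRISCV-NN, Scalar, RV32GC) taken from the table.
base = {"cycles": 15615223, "rom": 90510, "ram": 19212}
# TVM, NCHW, Fallback, RV32GCV, VLEN=128 row taken from the same table.
row = {"cycles": 3384606, "rom": 105404, "ram": 59508}

print(f"speedup: {speedup(base['cycles'], row['cycles']):.1f}x")
print(f"ROM rel.: {relative(base['rom'], row['rom']):.3f}")
print(f"RAM rel.: {relative(base['ram'], row['ram']):.3f}")
```

Read this way, a speedup below 1.0x (for example the TVM Fallback rows on RV32GC) means the configuration needs more cycles than the scalar muRISCV-NN baseline.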
Results (Framework: tvm, Backend: tvmaot, Toolchain: llvm)
Audio Wake Words (aww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
33508602 (0.5x) | 109102 (1.205) | 59508 (3.097) | 0 | NCHW | TVM | Fallback | RV32GC | - |
27511073 (0.6x) | 102506 (1.133) | 59508 (3.097) | 0 | NHWC | TVM | Fallback | RV32GC | - |
13706216 (1.1x) | 102504 (1.133) | 51336 (2.672) | 0 | NCHW | TVM | Autotuned | RV32GC | - |
27505623 (0.6x) | 102618 (1.134) | 59508 (3.097) | 0 | NHWC | TVM | Autotuned | RV32GC | - |
3384606 (4.6x) | 105404 (1.165) | 59508 (3.097) | 128 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
3384606 (4.6x) | 105404 (1.165) | 59508 (3.097) | 1024 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
9566669 (1.6x) | 103606 (1.145) | 59508 (3.097) | 128 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
6682669 (2.3x) | 103606 (1.145) | 59508 (3.097) | 1024 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
5607715 (2.8x) | 106776 (1.18) | 51336 (2.672) | 128 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
3881984 (4.0x) | 106776 (1.18) | 51336 (2.672) | 1024 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
9565134 (1.6x) | 104544 (1.155) | 59508 (3.097) | 128 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
6683403 (2.3x) | 104544 (1.155) | 59508 (3.097) | 1024 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
15615223 (Base) | 90510 (Base) | 19212 (Base) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | - |
6838468 (2.3x) | 93734 (1.036) | 19212 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
5367941 (2.9x) | 93734 (1.036) | 19212 (1.0) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
7407276 (2.1x) | 90216 (0.997) | 23676 (1.232) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
3759264 (4.2x) | 90216 (0.997) | 23676 (1.232) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
Image Classification (resnet)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
144789099 (0.4x) | 218266 (1.582) | 108420 (1.953) | 0 | NCHW | TVM | Fallback | RV32GC | - |
112394400 (0.5x) | 209174 (1.516) | 108420 (1.953) | 0 | NHWC | TVM | Fallback | RV32GC | - |
53701816 (1.1x) | 212580 (1.541) | 92236 (1.661) | 0 | NCHW | TVM | Autotuned | RV32GC | - |
112389724 (0.5x) | 209256 (1.517) | 108420 (1.953) | 0 | NHWC | TVM | Autotuned | RV32GC | - |
12825992 (4.6x) | 213684 (1.549) | 108420 (1.953) | 128 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
12825991 (4.6x) | 213678 (1.549) | 108420 (1.953) | 1024 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
36071697 (1.6x) | 210202 (1.524) | 108420 (1.953) | 128 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
24311825 (2.4x) | 210202 (1.524) | 108420 (1.953) | 1024 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
19686220 (3.0x) | 226232 (1.64) | 92236 (1.661) | 128 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
13965591 (4.2x) | 226232 (1.64) | 92236 (1.661) | 1024 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
36069956 (1.6x) | 210938 (1.529) | 108420 (1.953) | 128 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
24309151 (2.4x) | 210930 (1.529) | 108420 (1.953) | 1024 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
58402822 (Base) | 137958 (Base) | 55516 (Base) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | - |
28255398 (2.1x) | 141694 (1.027) | 55516 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
13704877 (4.3x) | 141694 (1.027) | 55516 (1.0) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
27976928 (2.1x) | 138304 (1.003) | 55516 (1.0) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
8035106 (7.3x) | 138304 (1.003) | 55516 (1.0) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
Anomaly Detection (toycar)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
3404882 (0.6x) | 581370 (1.841) | 5572 (1.168) | 0 | NCHW | TVM | Fallback | RV32GC | - |
3404882 (0.6x) | 581370 (1.841) | 5572 (1.168) | 0 | NHWC | TVM | Fallback | RV32GC | - |
2245737 (0.8x) | 609080 (1.929) | 6884 (1.443) | 0 | NCHW | TVM | Autotuned | RV32GC | - |
2245737 (0.8x) | 609080 (1.929) | 6884 (1.443) | 0 | NHWC | TVM | Autotuned | RV32GC | - |
984695 (1.9x) | 581106 (1.84) | 5572 (1.168) | 128 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
984695 (1.9x) | 581106 (1.84) | 5572 (1.168) | 1024 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
984693 (1.9x) | 581098 (1.84) | 5572 (1.168) | 128 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
984695 (1.9x) | 581106 (1.84) | 5572 (1.168) | 1024 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
1280619 (1.5x) | 600432 (1.902) | 6884 (1.443) | 128 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
1148032 (1.6x) | 600432 (1.902) | 6884 (1.443) | 1024 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
1280619 (1.5x) | 600432 (1.902) | 6884 (1.443) | 128 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
1148032 (1.6x) | 600432 (1.902) | 6884 (1.443) | 1024 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
1893648 (Base) | 315742 (Base) | 4772 (Base) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | - |
662593 (2.9x) | 316394 (1.002) | 4772 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
430222 (4.4x) | 316396 (1.002) | 4772 (1.0) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
639278 (3.0x) | 315738 (1.0) | 4772 (1.0) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
465832 (4.1x) | 315740 (1.0) | 4772 (1.0) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
Visual Wake Words (vww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
96665971 (0.5x) | 545174 (1.685) | 181032 (2.113) | 0 | NCHW | TVM | Fallback | RV32GC | - |
79940190 (0.6x) | 521128 (1.611) | 181032 (2.113) | 0 | NHWC | TVM | Fallback | RV32GC | - |
42404609 (1.1x) | 525220 (1.623) | 181032 (2.113) | 0 | NCHW | TVM | Autotuned | RV32GC | - |
79940192 (0.6x) | 521130 (1.611) | 181032 (2.113) | 0 | NHWC | TVM | Autotuned | RV32GC | - |
11010120 (4.2x) | 532510 (1.646) | 181032 (2.113) | 128 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
11010119 (4.2x) | 532508 (1.646) | 181032 (2.113) | 1024 | NCHW | TVM | Fallback | RV32GCV | Loop+SLP |
30451801 (1.5x) | 523636 (1.619) | 181032 (2.113) | 128 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
22700930 (2.1x) | 523638 (1.619) | 181032 (2.113) | 1024 | NHWC | TVM | Fallback | RV32GCV | Loop+SLP |
24965512 (1.9x) | 550720 (1.702) | 181032 (2.113) | 128 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
19204878 (2.4x) | 550714 (1.702) | 181032 (2.113) | 1024 | NCHW | TVM | Autotuned | RV32GCV | Loop+SLP |
30450317 (1.5x) | 523748 (1.619) | 181032 (2.113) | 128 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
22698971 (2.1x) | 523750 (1.619) | 181032 (2.113) | 1024 | NHWC | TVM | Autotuned | RV32GCV | Loop+SLP |
46765905 (Base) | 323530 (Base) | 85664 (Base) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | - |
19684201 (2.4x) | 327716 (1.013) | 85664 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
14888471 (3.1x) | 327722 (1.013) | 85664 (1.0) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | Loop+SLP |
21141100 (2.2x) | 324314 (1.002) | 85664 (1.0) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
10453920 (4.5x) | 324316 (1.002) | 85664 (1.0) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | - |
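To compare configurations across the tables, for instance how much a longer VLEN helps a given kernel set, a result row can be parsed into named fields. The following is a minimal sketch that assumes a row has been joined into a single pipe-separated line with the nine columns listed in the table headers. `parse_row` and the field names are hypothetical helpers, not part of MLonMCU or the raw data files.

```python
import re

TEXT_FIELDS = ["layout", "kernels", "mode", "arch", "autovec"]


def leading_int(cell: str) -> int:
    """Return the integer at the start of a cell such as '8035106 (7.3x)'."""
    match = re.match(r"\d+", cell)
    if match is None:
        raise ValueError(f"no leading integer in cell: {cell!r}")
    return int(match.group())


def parse_row(line: str) -> dict:
    """Split one pipe-separated result row into a dict of typed fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) != 9:
        raise ValueError(f"expected 9 columns, got {len(cells)}")
    row = {
        "cycles": leading_int(cells[0]),
        "rom": leading_int(cells[1]),
        "ram": leading_int(cells[2]),
        "vlen": int(cells[3]),
    }
    row.update(zip(TEXT_FIELDS, cells[4:]))
    return row


# Image Classification (resnet), muRISCV-NN Vector kernels, VLEN 128 vs. 1024.
v128 = parse_row("27976928 (2.1x) | 138304 (1.003) | 55516 (1.0) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | - |")
v1024 = parse_row("8035106 (7.3x) | 138304 (1.003) | 55516 (1.0) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | - |")

scaling = v128["cycles"] / v1024["cycles"]
print(f"VLEN {v128['vlen']} -> {v1024['vlen']}: {scaling:.2f}x fewer cycles")
```

The same ratio for the TVM NCHW Fallback rows is essentially 1.0, since their cycle counts are nearly identical at VLEN 128 and 1024.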
Original data
Click here to download the raw files for this benchmark.