Benchmarks 2024 07 12 TVM LLVM Os spike_rv64 - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
Toolchains
- LLVM/Clang:
  - TODO: Version
  - Linker: lld (TODO)
- RISC-V GCC for headers, libc, ...
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TVM: Nightly Pre-Build
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- The -Os flag was used for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tvm, Backend: tvmaot, Toolchain: llvm, Flags: -Os)
Audio Wake Words (aww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
28357073 (0.6x) | 103786 (1.213) | 61424 (2.918) | 128 | NCHW | TVM | Fallback | RV64GC | 0 | - |
24010382 (0.7x) | 98840 (1.155) | 61344 (2.914) | 128 | NHWC | TVM | Fallback | RV64GC | 0 | - |
12717679 (1.2x) | 98560 (1.152) | 53216 (2.528) | 128 | NCHW | TVM | Autotuned | RV64GC | 0 | - |
24005350 (0.7x) | 98922 (1.156) | 61344 (2.914) | 128 | NHWC | TVM | Autotuned | RV64GC | 0 | - |
5896638 (2.7x) | 102836 (1.202) | 61352 (2.915) | 128 | NCHW | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
9587844 (1.6x) | 99950 (1.168) | 61344 (2.914) | 128 | NHWC | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
4906303 (3.2x) | 100862 (1.179) | 53184 (2.527) | 128 | NCHW | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
9585953 (1.6x) | 100850 (1.178) | 61344 (2.914) | 128 | NHWC | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
15710527 (Base) | 85584 (Base) | 21048 (Base) | 128 | NHWC | muRISCV-NN | Scalar | RV64GC | 0 | - |
7141242 (2.2x) | 88730 (1.037) | 21048 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
6022687 (2.6x) | 86064 (1.006) | 25512 (1.212) | 128 | NHWC | muRISCV-NN | Vector | RV64GCV | 0 | - |
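The page does not spell out how the parenthesized values are derived. The sketch below assumes that the speedup is the baseline cycle count (the muRISCV-NN Scalar RV64GC row marked "Base") divided by the row's cycle count, rounded to one decimal, and that the relative ROM/RAM values are the row's size divided by the baseline size, rounded to three decimals. Under that assumption it reproduces the first row of the Audio Wake Words table. The function names `speedup` and `relative` are hypothetical helpers, not part of MLonMCU.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Assumed definition: how many times faster than the baseline (higher is better)."""
    return round(base_cycles / cycles, 1)


def relative(value: int, base_value: int) -> float:
    """Assumed definition: memory footprint relative to the baseline (lower is better)."""
    return round(value / base_value, 3)


# Baseline row of the aww table: muRISCV-NN, Scalar, RV64GC.
BASE = {"cycles": 15710527, "rom": 85584, "ram": 21048}

# First row of the aww table: TVM, NCHW, Fallback, RV64GC.
ROW = {"cycles": 28357073, "rom": 103786, "ram": 61424}

row_speedup = speedup(BASE["cycles"], ROW["cycles"])
row_rom_rel = relative(ROW["rom"], BASE["rom"])
row_ram_rel = relative(ROW["ram"], BASE["ram"])

print(
    f"{ROW['cycles']} ({row_speedup}x) | "
    f"{ROW['rom']} ({row_rom_rel}) | "
    f"{ROW['ram']} ({row_ram_rel})"
)
# prints: 28357073 (0.6x) | 103786 (1.213) | 61424 (2.918)
```

Because Spike is used with CPI=1, the cycle counts equal executed instruction counts, so a speedup below 1.0x means the configuration executes more instructions than the baseline.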
Image Classification (resnet)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
123431323 (0.5x) | 211828 (1.592) | 110272 (1.922) | 128 | NCHW | TVM | Fallback | RV64GC | 0 | - |
100170024 (0.6x) | 205050 (1.541) | 110272 (1.922) | 128 | NHWC | TVM | Fallback | RV64GC | 0 | - |
50692903 (1.1x) | 208804 (1.569) | 94088 (1.64) | 128 | NCHW | TVM | Autotuned | RV64GC | 0 | - |
100165642 (0.6x) | 205100 (1.541) | 110272 (1.922) | 128 | NHWC | TVM | Autotuned | RV64GC | 0 | - |
24209704 (2.3x) | 211166 (1.587) | 110272 (1.922) | 128 | NCHW | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
35938784 (1.6x) | 205972 (1.548) | 110272 (1.922) | 128 | NHWC | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
15455900 (3.6x) | 216088 (1.624) | 94088 (1.64) | 128 | NCHW | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
35936640 (1.6x) | 206616 (1.553) | 110272 (1.922) | 128 | NHWC | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
55736887 (Base) | 133084 (Base) | 57368 (Base) | 128 | NHWC | muRISCV-NN | Scalar | RV64GC | 0 | - |
25746948 (2.2x) | 137006 (1.029) | 57368 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
15251111 (3.7x) | 134188 (1.008) | 57368 (1.0) | 128 | NHWC | muRISCV-NN | Vector | RV64GCV | 0 | - |
Anomaly Detection (toycar)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
3202167 (0.5x) | 577790 (1.853) | 7432 (1.121) | 128 | NCHW | TVM | Fallback | RV64GC | 0 | - |
3202167 (0.5x) | 577790 (1.853) | 7432 (1.121) | 128 | NHWC | TVM | Fallback | RV64GC | 0 | - |
2274159 (0.7x) | 604904 (1.94) | 8744 (1.318) | 128 | NCHW | TVM | Autotuned | RV64GC | 0 | - |
2274159 (0.7x) | 604904 (1.94) | 8744 (1.318) | 128 | NHWC | TVM | Autotuned | RV64GC | 0 | - |
1041434 (1.6x) | 577068 (1.851) | 7432 (1.121) | 128 | NCHW | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
1041434 (1.6x) | 577068 (1.851) | 7432 (1.121) | 128 | NHWC | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
1435833 (1.2x) | 588978 (1.889) | 8744 (1.318) | 128 | NCHW | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
1435833 (1.2x) | 588978 (1.889) | 8744 (1.318) | 128 | NHWC | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
1662657 (Base) | 311812 (Base) | 6632 (Base) | 128 | NHWC | muRISCV-NN | Scalar | RV64GC | 0 | - |
628312 (2.6x) | 312646 (1.003) | 6632 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
621471 (2.7x) | 312468 (1.002) | 6632 (1.0) | 128 | NHWC | muRISCV-NN | Vector | RV64GCV | 0 | - |
Visual Wake Words (vww)
Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
81861381 (0.5x) | 535834 (1.681) | 183048 (2.091) | 128 | NCHW | TVM | Fallback | RV64GC | 0 | - |
69686404 (0.6x) | 518196 (1.625) | 182888 (2.09) | 128 | NHWC | TVM | Fallback | RV64GC | 0 | - |
43981648 (1.0x) | 522216 (1.638) | 183056 (2.092) | 128 | NCHW | TVM | Autotuned | RV64GC | 0 | - |
69686407 (0.6x) | 518202 (1.625) | 182888 (2.09) | 128 | NHWC | TVM | Autotuned | RV64GC | 0 | - |
18260665 (2.4x) | 531230 (1.666) | 182912 (2.09) | 128 | NCHW | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
28859346 (1.5x) | 520966 (1.634) | 182888 (2.09) | 128 | NHWC | TVM | Fallback | RV64GCV | 0 | Loop+SLP |
31323822 (1.4x) | 538792 (1.69) | 182936 (2.09) | 128 | NCHW | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
28858085 (1.5x) | 521074 (1.634) | 182888 (2.09) | 128 | NHWC | TVM | Autotuned | RV64GCV | 0 | Loop+SLP |
44077978 (Base) | 318804 (Base) | 87520 (Base) | 128 | NHWC | muRISCV-NN | Scalar | RV64GC | 0 | - |
18469687 (2.4x) | 322812 (1.013) | 87520 (1.0) | 128 | NHWC | muRISCV-NN | Scalar | RV64GCV | 0 | Loop+SLP |
13478659 (3.3x) | 320234 (1.004) | 87520 (1.0) | 128 | NHWC | muRISCV-NN | Vector | RV64GCV | 0 | - |
Original data
Click here to download the raw files for this benchmark.