Benchmarks 2024 03 02 TVM GCC O3 - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
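Since Spike is an instruction set simulator and the setup assumes CPI=1, each reported cycle corresponds to one executed instruction. A minimal sketch of turning such a count into a rough latency estimate follows. The 100 MHz clock is a hypothetical value picked for illustration, not part of this benchmark setup, and `estimated_latency_ms` is an illustrative helper name.

```python
def estimated_latency_ms(instructions: int, cpi: float = 1.0,
                         clock_hz: float = 100e6) -> float:
    """Estimate latency in milliseconds from an instruction count.

    clock_hz is a hypothetical clock frequency used for illustration only.
    """
    cycles = instructions * cpi      # with CPI=1, cycles == instructions
    return cycles / clock_hz * 1e3   # seconds -> milliseconds


# Cycle count of the aww baseline row (muRISCV-NN, Scalar, RV32GC).
print(f"{estimated_latency_ms(16179752):.2f} ms at an assumed 100 MHz")
```

A real core will not reach CPI=1 on every instruction, so such an estimate is only a lower bound on latency for the assumed clock.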
Toolchains
- RISC-V GCC:
  - Scalar: TODO: version & url
  - Vector: TODO: version & url
  - Packed: Self-compiled using the patches from https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TVM: Nightly Pre-Build
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- Used the `-O3` flag for compilation.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
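The Speedup and rel. columns in the results appear to be ratios against the row marked Base in each table. A minimal sketch of that relationship, checked against two aww rows, is given here. The helper names `speedup` and `relative` are illustrative and not taken from MLonMCU.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Speedup over the baseline: values above 1 mean fewer cycles than Base."""
    return base_cycles / cycles


def relative(value: int, base_value: int) -> float:
    """Memory relative to the baseline: values above 1 mean more bytes than Base."""
    return value / base_value


# aww baseline (muRISCV-NN, Scalar, RV32GC) and the first TVM row
# (NCHW, Fallback, RV32GC), copied from the results.
base = {"cycles": 16179752, "rom": 98360, "ram": 19232}
row = {"cycles": 14159792, "rom": 130854, "ram": 59536}

print(f"{speedup(base['cycles'], row['cycles']):.1f}x")   # Cycles (Speedup)
print(round(relative(row["rom"], base["rom"]), 3))        # Total ROM (rel.)
print(round(relative(row["ram"], base["ram"]), 3))        # Total RAM (rel.)
```

Under this reading, a speedup below 1.0x means the configuration is slower than the scalar muRISCV-NN baseline.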
Results (Framework: tvm, Backend: tvmaot, Toolchain: gcc, Flags: -O3)
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
14159792 ( 1.1x ) | 130854 ( 1.33 ) | 59536 ( 3.096 ) | 0 | NCHW | TVM | Fallback | RV32GC | False | - |
21330518 ( 0.8x ) | 103938 ( 1.057 ) | 59536 ( 3.096 ) | 0 | NHWC | TVM | Fallback | RV32GC | False | - |
12089908 ( 1.3x ) | 109182 ( 1.11 ) | 51364 ( 2.671 ) | 0 | NCHW | TVM | Autotuned | RV32GC | False | - |
21330425 ( 0.8x ) | 104668 ( 1.064 ) | 59536 ( 3.096 ) | 0 | NHWC | TVM | Autotuned | RV32GC | False | - |
14171727 ( 1.1x ) | 126938 ( 1.291 ) | 59536 ( 3.096 ) | 128 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
14171727 ( 1.1x ) | 126938 ( 1.291 ) | 59536 ( 3.096 ) | 256 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
14171727 ( 1.1x ) | 126938 ( 1.291 ) | 59536 ( 3.096 ) | 512 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
14171727 ( 1.1x ) | 126938 ( 1.291 ) | 59536 ( 3.096 ) | 1024 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
14171727 ( 1.1x ) | 126938 ( 1.291 ) | 59536 ( 3.096 ) | 2048 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
14171727 ( 1.1x ) | 126938 ( 1.291 ) | 59536 ( 3.096 ) | 4096 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
21254917 ( 0.8x ) | 104078 ( 1.058 ) | 59536 ( 3.096 ) | 128 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
21254917 ( 0.8x ) | 104078 ( 1.058 ) | 59536 ( 3.096 ) | 256 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
21254917 ( 0.8x ) | 104078 ( 1.058 ) | 59536 ( 3.096 ) | 512 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
21254917 ( 0.8x ) | 104078 ( 1.058 ) | 59536 ( 3.096 ) | 1024 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
21254917 ( 0.8x ) | 104078 ( 1.058 ) | 59536 ( 3.096 ) | 2048 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
21254917 ( 0.8x ) | 104078 ( 1.058 ) | 59536 ( 3.096 ) | 4096 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
12077116 ( 1.3x ) | 108976 ( 1.108 ) | 51364 ( 2.671 ) | 128 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
12077116 ( 1.3x ) | 108976 ( 1.108 ) | 51364 ( 2.671 ) | 256 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
12077116 ( 1.3x ) | 108976 ( 1.108 ) | 51364 ( 2.671 ) | 512 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
12077116 ( 1.3x ) | 108976 ( 1.108 ) | 51364 ( 2.671 ) | 1024 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
12077116 ( 1.3x ) | 108976 ( 1.108 ) | 51364 ( 2.671 ) | 2048 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
12077116 ( 1.3x ) | 108976 ( 1.108 ) | 51364 ( 2.671 ) | 4096 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
21254824 ( 0.8x ) | 104808 ( 1.066 ) | 59536 ( 3.096 ) | 128 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
21254824 ( 0.8x ) | 104808 ( 1.066 ) | 59536 ( 3.096 ) | 256 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
21254824 ( 0.8x ) | 104808 ( 1.066 ) | 59536 ( 3.096 ) | 512 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
21254824 ( 0.8x ) | 104808 ( 1.066 ) | 59536 ( 3.096 ) | 1024 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
21254824 ( 0.8x ) | 104808 ( 1.066 ) | 59536 ( 3.096 ) | 2048 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
21254824 ( 0.8x ) | 104808 ( 1.066 ) | 59536 ( 3.096 ) | 4096 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
11149680 ( 1.5x ) | 135396 ( 1.377 ) | 59484 ( 3.093 ) | 0 | NCHW | TVM | Fallback | RV32GCP | False | - |
18735452 ( 0.9x ) | 116534 ( 1.185 ) | 59484 ( 3.093 ) | 0 | NHWC | TVM | Fallback | RV32GCP | False | - |
9203054 ( 1.8x ) | 120208 ( 1.222 ) | 51312 ( 2.668 ) | 0 | NCHW | TVM | Autotuned | RV32GCP | False | - |
18735265 ( 0.9x ) | 116930 ( 1.189 ) | 59484 ( 3.093 ) | 0 | NHWC | TVM | Autotuned | RV32GCP | False | - |
16179752 ( Base ) | 98360 ( Base ) | 19232 ( Base ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | False | - |
16179757 ( 1.0x ) | 98368 ( 1.0 ) | 19232 ( 1.0 ) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16179756 ( 1.0x ) | 98364 ( 1.0 ) | 19232 ( 1.0 ) | 256 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16179755 ( 1.0x ) | 98362 ( 1.0 ) | 19232 ( 1.0 ) | 512 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16179756 ( 1.0x ) | 98368 ( 1.0 ) | 19232 ( 1.0 ) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16179754 ( 1.0x ) | 98366 ( 1.0 ) | 19232 ( 1.0 ) | 2048 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16179757 ( 1.0x ) | 98370 ( 1.0 ) | 19232 ( 1.0 ) | 4096 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
8541878 ( 1.9x ) | 99328 ( 1.01 ) | 23696 ( 1.232 ) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
6582869 ( 2.5x ) | 99334 ( 1.01 ) | 23696 ( 1.232 ) | 256 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
5388368 ( 3.0x ) | 99332 ( 1.01 ) | 23696 ( 1.232 ) | 512 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
5350776 ( 3.0x ) | 99328 ( 1.01 ) | 23696 ( 1.232 ) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
5354165 ( 3.0x ) | 99334 ( 1.01 ) | 23696 ( 1.232 ) | 2048 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
5357556 ( 3.0x ) | 99330 ( 1.01 ) | 23696 ( 1.232 ) | 4096 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
12852236 ( 1.3x ) | 105552 ( 1.073 ) | 19180 ( 0.997 ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GCP | False | - |
6850935 ( 2.4x ) | 110864 ( 1.127 ) | 20332 ( 1.057 ) | 0 | NHWC | muRISCV-NN | Packed | RV32GCP | False | - |
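For readers who want to post-process these results without the raw files, the following is a minimal sketch of a parser for rows in the pipe-separated form used by the tables on this page. The column keys in `COLUMNS` and the function name `parse_rows` are illustrative choices, not part of MLonMCU. Cells are grouped ten at a time, so rows that were wrapped across several lines are handled as well.

```python
# Illustrative short names for the ten table columns, in order.
COLUMNS = ["cycles", "rom", "ram", "vlen", "layout",
           "kernels", "mode", "arch", "unroll", "autovec"]


def parse_rows(text: str) -> list:
    """Parse pipe-separated result rows into a list of dictionaries."""
    # Split on '|' across the whole text so line breaks inside a row do not matter.
    cells = [c.strip() for c in text.split("|") if c.strip()]
    rows = []
    for start in range(0, len(cells), len(COLUMNS)):
        chunk = cells[start:start + len(COLUMNS)]
        if len(chunk) < len(COLUMNS):
            break  # ignore a trailing partial row
        row = dict(zip(COLUMNS, chunk))
        for key in ("cycles", "rom", "ram"):
            # "14159792 ( 1.1x )" -> 14159792; the ratio can be recomputed later.
            row[key] = int(row[key].split()[0])
        row["vlen"] = int(row["vlen"])
        row["unroll"] = row["unroll"] == "True"
        rows.append(row)
    return rows


sample = """
14159792 ( 1.1x ) | 130854 ( 1.33 ) | 59536 ( 3.096 ) | 0 | NCHW | TVM | Fallback | RV32GC | False | - |
16179752 ( Base ) | 98360 ( Base ) | 19232 ( Base ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | False | - |
"""
for parsed in parse_rows(sample):
    print(parsed["kernels"], parsed["mode"], parsed["cycles"])
```

The header and separator lines are not data rows and should be removed before passing text to this function.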
Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
57356214 ( 1.0x ) | 230148 ( 1.614 ) | 108448 ( 1.953 ) | 0 | NCHW | TVM | Fallback | RV32GC | False | - |
82358098 ( 0.7x ) | 211682 ( 1.485 ) | 108448 ( 1.953 ) | 0 | NHWC | TVM | Fallback | RV32GC | False | - |
51156664 ( 1.1x ) | 220824 ( 1.549 ) | 92264 ( 1.661 ) | 0 | NCHW | TVM | Autotuned | RV32GC | False | - |
82357968 ( 0.7x ) | 212250 ( 1.489 ) | 108448 ( 1.953 ) | 0 | NHWC | TVM | Autotuned | RV32GC | False | - |
57342451 ( 1.0x ) | 228600 ( 1.603 ) | 108448 ( 1.953 ) | 128 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
57342451 ( 1.0x ) | 228600 ( 1.603 ) | 108448 ( 1.953 ) | 256 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
57342451 ( 1.0x ) | 228600 ( 1.603 ) | 108448 ( 1.953 ) | 512 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
57342451 ( 1.0x ) | 228600 ( 1.603 ) | 108448 ( 1.953 ) | 1024 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
57342451 ( 1.0x ) | 228600 ( 1.603 ) | 108448 ( 1.953 ) | 2048 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
57342451 ( 1.0x ) | 228600 ( 1.603 ) | 108448 ( 1.953 ) | 4096 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
82389436 ( 0.7x ) | 212386 ( 1.489 ) | 108448 ( 1.953 ) | 128 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
82389436 ( 0.7x ) | 212386 ( 1.489 ) | 108448 ( 1.953 ) | 256 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
82389436 ( 0.7x ) | 212386 ( 1.489 ) | 108448 ( 1.953 ) | 512 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
82389436 ( 0.7x ) | 212386 ( 1.489 ) | 108448 ( 1.953 ) | 1024 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
82389436 ( 0.7x ) | 212386 ( 1.489 ) | 108448 ( 1.953 ) | 2048 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
82389436 ( 0.7x ) | 212386 ( 1.489 ) | 108448 ( 1.953 ) | 4096 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
51192655 ( 1.1x ) | 221222 ( 1.551 ) | 92264 ( 1.661 ) | 128 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
51192655 ( 1.1x ) | 221222 ( 1.551 ) | 92264 ( 1.661 ) | 256 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
51192655 ( 1.1x ) | 221222 ( 1.551 ) | 92264 ( 1.661 ) | 512 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
51192655 ( 1.1x ) | 221222 ( 1.551 ) | 92264 ( 1.661 ) | 1024 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
51192655 ( 1.1x ) | 221222 ( 1.551 ) | 92264 ( 1.661 ) | 2048 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
51192655 ( 1.1x ) | 221222 ( 1.551 ) | 92264 ( 1.661 ) | 4096 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
82389306 ( 0.7x ) | 212954 ( 1.493 ) | 108448 ( 1.953 ) | 128 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
82389306 ( 0.7x ) | 212954 ( 1.493 ) | 108448 ( 1.953 ) | 256 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
82389306 ( 0.7x ) | 212952 ( 1.493 ) | 108448 ( 1.953 ) | 512 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
82389306 ( 0.7x ) | 212954 ( 1.493 ) | 108448 ( 1.953 ) | 1024 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
82389306 ( 0.7x ) | 212954 ( 1.493 ) | 108448 ( 1.953 ) | 2048 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
82389306 ( 0.7x ) | 212952 ( 1.493 ) | 108448 ( 1.953 ) | 4096 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
43550530 ( 1.3x ) | 237924 ( 1.669 ) | 108396 ( 1.952 ) | 0 | NCHW | TVM | Fallback | RV32GCP | False | - |
71935976 ( 0.8x ) | 223908 ( 1.57 ) | 108396 ( 1.952 ) | 0 | NHWC | TVM | Fallback | RV32GCP | False | - |
38324442 ( 1.4x ) | 231866 ( 1.626 ) | 92212 ( 1.66 ) | 0 | NCHW | TVM | Autotuned | RV32GCP | False | - |
71935773 ( 0.8x ) | 224212 ( 1.572 ) | 108396 ( 1.952 ) | 0 | NHWC | TVM | Autotuned | RV32GCP | False | - |
54531190 ( Base ) | 142592 ( Base ) | 55536 ( Base ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | False | - |
54649851 ( 1.0x ) | 142620 ( 1.0 ) | 55536 ( 1.0 ) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
54649849 ( 1.0x ) | 142616 ( 1.0 ) | 55536 ( 1.0 ) | 256 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
54649852 ( 1.0x ) | 142622 ( 1.0 ) | 55536 ( 1.0 ) | 512 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
54649852 ( 1.0x ) | 142624 ( 1.0 ) | 55536 ( 1.0 ) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
54649851 ( 1.0x ) | 142622 ( 1.0 ) | 55536 ( 1.0 ) | 2048 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
54649847 ( 1.0x ) | 142614 ( 1.0 ) | 55536 ( 1.0 ) | 4096 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
28551125 ( 1.9x ) | 147670 ( 1.036 ) | 55536 ( 1.0 ) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
17915049 ( 3.0x ) | 147670 ( 1.036 ) | 55536 ( 1.0 ) | 256 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
13131542 ( 4.2x ) | 147676 ( 1.036 ) | 55536 ( 1.0 ) | 512 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
10797812 ( 5.1x ) | 147670 ( 1.036 ) | 55536 ( 1.0 ) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
8769058 ( 6.2x ) | 147672 ( 1.036 ) | 55536 ( 1.0 ) | 2048 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
8218708 ( 6.6x ) | 147668 ( 1.036 ) | 55536 ( 1.0 ) | 4096 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
41653954 ( 1.3x ) | 151988 ( 1.066 ) | 55484 ( 0.999 ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GCP | False | - |
26980731 ( 2.0x ) | 159654 ( 1.12 ) | 55484 ( 0.999 ) | 0 | NHWC | muRISCV-NN | Packed | RV32GCP | False | - |
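The muRISCV-NN Vector rows for resnet show the gain from each VLEN doubling shrinking as VLEN grows. A minimal sketch that makes this visible by computing the cycle ratio between consecutive VLEN settings is shown here. The cycle counts are copied from the table, and `doubling_gains` is an illustrative helper name.

```python
# resnet, muRISCV-NN Vector kernels on RV32GCV: VLEN -> cycles (from the table).
vector_cycles = {
    128: 28551125,
    256: 17915049,
    512: 13131542,
    1024: 10797812,
    2048: 8769058,
    4096: 8218708,
}


def doubling_gains(cycles_by_vlen: dict) -> dict:
    """Return, for each VLEN, the speedup over the next smaller VLEN."""
    vlens = sorted(cycles_by_vlen)
    return {
        cur: cycles_by_vlen[prev] / cycles_by_vlen[cur]
        for prev, cur in zip(vlens, vlens[1:])
    }


for vlen, gain in doubling_gains(vector_cycles).items():
    print(f"VLEN {vlen}: {gain:.2f}x over the previous setting")
```

The same function can be applied to the Vector rows of the other models to compare where the gains level off.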
Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
1633090 ( 1.0x ) | 586984 ( 1.862 ) | 5580 ( 1.167 ) | 0 | NCHW | TVM | Fallback | RV32GC | False | - |
1633090 ( 1.0x ) | 586984 ( 1.862 ) | 5580 ( 1.167 ) | 0 | NHWC | TVM | Fallback | RV32GC | False | - |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 0 | NCHW | TVM | Autotuned | RV32GC | False | - |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 0 | NHWC | TVM | Autotuned | RV32GC | False | - |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 128 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 256 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 512 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 1024 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 2048 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 4096 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 128 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 256 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 512 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 1024 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 2048 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
1633092 ( 1.0x ) | 587056 ( 1.862 ) | 5580 ( 1.167 ) | 4096 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 128 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 256 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 512 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 1024 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 2048 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 4096 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 128 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 256 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 512 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 1024 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 2048 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
3274393 ( 0.5x ) | 637428 ( 2.022 ) | 6892 ( 1.442 ) | 4096 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
1374563 ( 1.2x ) | 598946 ( 1.899 ) | 5528 ( 1.156 ) | 0 | NCHW | TVM | Fallback | RV32GCP | False | - |
1374563 ( 1.2x ) | 598946 ( 1.899 ) | 5528 ( 1.156 ) | 0 | NHWC | TVM | Fallback | RV32GCP | False | - |
2347893 ( 0.7x ) | 634578 ( 2.012 ) | 6840 ( 1.431 ) | 0 | NCHW | TVM | Autotuned | RV32GCP | False | - |
2347893 ( 0.7x ) | 634578 ( 2.012 ) | 6840 ( 1.431 ) | 0 | NHWC | TVM | Autotuned | RV32GCP | False | - |
1680471 ( Base ) | 315320 ( Base ) | 4780 ( Base ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | False | - |
1679524 ( 1.0x ) | 315442 ( 1.0 ) | 4780 ( 1.0 ) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1679524 ( 1.0x ) | 315442 ( 1.0 ) | 4780 ( 1.0 ) | 256 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1679524 ( 1.0x ) | 315440 ( 1.0 ) | 4780 ( 1.0 ) | 512 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1679524 ( 1.0x ) | 315442 ( 1.0 ) | 4780 ( 1.0 ) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1679524 ( 1.0x ) | 315440 ( 1.0 ) | 4780 ( 1.0 ) | 2048 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1679524 ( 1.0x ) | 315442 ( 1.0 ) | 4780 ( 1.0 ) | 4096 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
995973 ( 1.7x ) | 316664 ( 1.004 ) | 4780 ( 1.0 ) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
716541 ( 2.3x ) | 316664 ( 1.004 ) | 4780 ( 1.0 ) | 256 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
576825 ( 2.9x ) | 316664 ( 1.004 ) | 4780 ( 1.0 ) | 512 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
507483 ( 3.3x ) | 316664 ( 1.004 ) | 4780 ( 1.0 ) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
502193 ( 3.3x ) | 316664 ( 1.004 ) | 4780 ( 1.0 ) | 2048 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
501178 ( 3.4x ) | 316664 ( 1.004 ) | 4780 ( 1.0 ) | 4096 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
1404579 ( 1.2x ) | 327536 ( 1.039 ) | 4728 ( 0.989 ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GCP | False | - |
919928 ( 1.8x ) | 327826 ( 1.04 ) | 4728 ( 0.989 ) | 0 | NHWC | muRISCV-NN | Packed | RV32GCP | False | - |
Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Layout | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|---|
41732018 ( 1.1x ) | 599252 ( 1.809 ) | 181056 ( 2.113 ) | 0 | NCHW | TVM | Fallback | RV32GC | False | - |
60680199 ( 0.8x ) | 528924 ( 1.597 ) | 181056 ( 2.113 ) | 0 | NHWC | TVM | Fallback | RV32GC | False | - |
38453049 ( 1.2x ) | 541178 ( 1.634 ) | 181056 ( 2.113 ) | 0 | NCHW | TVM | Autotuned | RV32GC | False | - |
60680201 ( 0.8x ) | 528930 ( 1.597 ) | 181056 ( 2.113 ) | 0 | NHWC | TVM | Autotuned | RV32GC | False | - |
41660476 ( 1.1x ) | 583078 ( 1.76 ) | 181056 ( 2.113 ) | 128 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
41660476 ( 1.1x ) | 583078 ( 1.76 ) | 181056 ( 2.113 ) | 256 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
41660476 ( 1.1x ) | 583078 ( 1.76 ) | 181056 ( 2.113 ) | 512 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
41660476 ( 1.1x ) | 583078 ( 1.76 ) | 181056 ( 2.113 ) | 1024 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
41660476 ( 1.1x ) | 583078 ( 1.76 ) | 181056 ( 2.113 ) | 2048 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
41660476 ( 1.1x ) | 583078 ( 1.76 ) | 181056 ( 2.113 ) | 4096 | NCHW | TVM | Fallback | RV32GCV | False | Loop+SLP |
60377273 ( 0.8x ) | 530488 ( 1.602 ) | 181056 ( 2.113 ) | 128 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
60377273 ( 0.8x ) | 530488 ( 1.602 ) | 181056 ( 2.113 ) | 256 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
60377273 ( 0.8x ) | 530488 ( 1.602 ) | 181056 ( 2.113 ) | 512 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
60377273 ( 0.8x ) | 530488 ( 1.602 ) | 181056 ( 2.113 ) | 1024 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
60377273 ( 0.8x ) | 530488 ( 1.602 ) | 181056 ( 2.113 ) | 2048 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
60377273 ( 0.8x ) | 530488 ( 1.602 ) | 181056 ( 2.113 ) | 4096 | NHWC | TVM | Fallback | RV32GCV | False | Loop+SLP |
38347136 ( 1.2x ) | 541590 ( 1.635 ) | 181056 ( 2.113 ) | 128 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
38347136 ( 1.2x ) | 541590 ( 1.635 ) | 181056 ( 2.113 ) | 256 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
38347136 ( 1.2x ) | 541590 ( 1.635 ) | 181056 ( 2.113 ) | 512 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
38347136 ( 1.2x ) | 541590 ( 1.635 ) | 181056 ( 2.113 ) | 1024 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
38347136 ( 1.2x ) | 541590 ( 1.635 ) | 181056 ( 2.113 ) | 2048 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
38347136 ( 1.2x ) | 541590 ( 1.635 ) | 181056 ( 2.113 ) | 4096 | NCHW | TVM | Autotuned | RV32GCV | False | Loop+SLP |
60377275 ( 0.8x ) | 530494 ( 1.602 ) | 181056 ( 2.113 ) | 128 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
60377275 ( 0.8x ) | 530494 ( 1.602 ) | 181056 ( 2.113 ) | 256 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
60377275 ( 0.8x ) | 530494 ( 1.602 ) | 181056 ( 2.113 ) | 512 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
60377275 ( 0.8x ) | 530494 ( 1.602 ) | 181056 ( 2.113 ) | 1024 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
60377275 ( 0.8x ) | 530494 ( 1.602 ) | 181056 ( 2.113 ) | 2048 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
60377275 ( 0.8x ) | 530494 ( 1.602 ) | 181056 ( 2.113 ) | 4096 | NHWC | TVM | Autotuned | RV32GCV | False | Loop+SLP |
32953509 ( 1.4x ) | 587074 ( 1.772 ) | 181004 ( 2.113 ) | 0 | NCHW | TVM | Fallback | RV32GCP | False | - |
53858843 ( 0.9x ) | 542028 ( 1.636 ) | 181004 ( 2.113 ) | 0 | NHWC | TVM | Fallback | RV32GCP | False | - |
31028974 ( 1.5x ) | 549582 ( 1.659 ) | 181004 ( 2.113 ) | 0 | NCHW | TVM | Autotuned | RV32GCP | False | - |
53858845 ( 0.9x ) | 542034 ( 1.636 ) | 181004 ( 2.113 ) | 0 | NHWC | TVM | Autotuned | RV32GCP | False | - |
46437125 ( Base ) | 331226 ( Base ) | 85680 ( Base ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GC | False | - |
46437128 ( 1.0x ) | 331238 ( 1.0 ) | 85680 ( 1.0 ) | 128 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
46437129 ( 1.0x ) | 331240 ( 1.0 ) | 85680 ( 1.0 ) | 256 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
46437129 ( 1.0x ) | 331248 ( 1.0 ) | 85680 ( 1.0 ) | 512 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
46437126 ( 1.0x ) | 331230 ( 1.0 ) | 85680 ( 1.0 ) | 1024 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
46437130 ( 1.0x ) | 331248 ( 1.0 ) | 85680 ( 1.0 ) | 2048 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
46437125 ( 1.0x ) | 331226 ( 1.0 ) | 85680 ( 1.0 ) | 4096 | NHWC | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
23682505 ( 2.0x ) | 332332 ( 1.003 ) | 85680 ( 1.0 ) | 128 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
17563370 ( 2.6x ) | 332334 ( 1.003 ) | 85680 ( 1.0 ) | 256 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
15240726 ( 3.0x ) | 332326 ( 1.003 ) | 85680 ( 1.0 ) | 512 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
14181514 ( 3.3x ) | 332340 ( 1.003 ) | 85680 ( 1.0 ) | 1024 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
14097076 ( 3.3x ) | 332344 ( 1.003 ) | 85680 ( 1.0 ) | 2048 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
14100464 ( 3.3x ) | 332336 ( 1.003 ) | 85680 ( 1.0 ) | 4096 | NHWC | muRISCV-NN | Vector | RV32GCV | False | - |
36688655 ( 1.3x ) | 338064 ( 1.021 ) | 85628 ( 0.999 ) | 0 | NHWC | muRISCV-NN | Scalar | RV32GCP | False | - |
19166446 ( 2.4x ) | 343506 ( 1.037 ) | 85628 ( 0.999 ) | 0 | NHWC | muRISCV-NN | Packed | RV32GCP | False | - |
Original data
Click here to download the raw files for this benchmark.