Benchmarks 2024 03 02 TFLM GCC Os - tum-ei-eda/muriscv-nn GitHub Wiki
Setup
Simulator
- Spike (riscv-isa-sim) (ISS, CPI=1)
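Because Spike is an instruction-set simulator, a CPI of 1 means each reported cycle count equals the number of executed instructions. The sketch below illustrates that relation and how a cycle count could be turned into a latency estimate. The 100 MHz clock is a purely hypothetical value chosen for illustration; no clock frequency is given on this page, and the function names are not part of any tool used here.

```python
def cycles_from_instructions(instructions: int, cpi: float = 1.0) -> float:
    """Cycle count of a simple ISS model: instructions times a fixed CPI."""
    return instructions * cpi


def latency_ms(cycles: float, clock_hz: float) -> float:
    """Convert a cycle count into milliseconds for a given clock frequency."""
    return cycles / clock_hz * 1e3


# Hypothetical clock frequency (assumption, not taken from the benchmark setup).
ASSUMED_CLOCK_HZ = 100_000_000

# Cycle count reported below for aww, muRISCV-NN Vector kernels, VLEN=512.
aww_vector_512_cycles = 2161571

# With CPI=1 the instruction count and the cycle count are identical.
cycles = cycles_from_instructions(aww_vector_512_cycles, cpi=1.0)
print(f"{latency_ms(cycles, ASSUMED_CLOCK_HZ):.2f} ms at the assumed 100 MHz")
```

Real hardware rarely sustains a CPI of 1, especially for long vector operations, so such an estimate is best read as a rough lower bound.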
Toolchains
- RISC-V GCC:
- Scalar: TODO: version & url
- Vector: TODO: version & url
- Packed: Self-compiled using the patches from https://github.com/riscv-collab/riscv-gcc/pull/258 and https://github.com/riscvarchive/riscv-binutils-gdb/pull/257
Models
- MLPerfTiny Benchmark
- TODO: others!
Package Versions
- MLonMCU: main
- TFLM: main
- Spike: 0bc176b3fca43560b9e8586cdbc41cfde073e17a
- Spike PK: 7e9b671c0415dfd7b562ac934feb9380075d4aa2
Miscellaneous
- Compiled with the -Os flag.
- Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
- Memory metrics are reported in bytes.
Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -Os)
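The Speedup and rel. columns in the results tables appear to be derived from the row marked Base (muRISCV-NN, Scalar, RV32GC): speedup is the base cycle count divided by the row's cycle count, and the relative ROM/RAM value is the row's size divided by the base size. This reading is inferred from the numbers rather than stated on the page; the following minimal sketch, with illustrative function names, reproduces a few Audio Wake Words entries under that assumption.

```python
def speedup(base_cycles: int, cycles: int) -> float:
    """Speedup over the baseline; values above 1 mean fewer cycles than Base."""
    return base_cycles / cycles


def relative(base_bytes: int, num_bytes: int) -> float:
    """Memory footprint relative to the baseline; values below 1 are smaller."""
    return num_bytes / base_bytes


# Base row of the Audio Wake Words (aww) table: muRISCV-NN, Scalar, RV32GC.
BASE_CYCLES = 16660002
BASE_ROM = 150416

# muRISCV-NN vector kernels at VLEN=128.
print(f"{speedup(BASE_CYCLES, 4118315):.1f}x")

# TFLM reference kernels on RV32GC (cycles, then ROM).
print(f"{speedup(BASE_CYCLES, 174727949):.1f}x")
print(f"{relative(BASE_ROM, 132511):.3f}")
```

Rounded to the precision used in the tables, these evaluate to 4.0x, 0.1x and 0.881, matching the corresponding cells.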
Audio Wake Words (aww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
174727949 ( 0.1x ) | 132511 ( 0.881 ) | 36204 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | False | - |
174727949 ( 0.1x ) | 132517 ( 0.881 ) | 36204 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727949 ( 0.1x ) | 132517 ( 0.881 ) | 36204 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727949 ( 0.1x ) | 132517 ( 0.881 ) | 36204 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727949 ( 0.1x ) | 132517 ( 0.881 ) | 36204 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727949 ( 0.1x ) | 132517 ( 0.881 ) | 36204 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
174727949 ( 0.1x ) | 132517 ( 0.881 ) | 36204 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
157574106 ( 0.1x ) | 144880 ( 0.963 ) | 36148 ( 0.998 ) | 0 | TFLM | Reference | RV32GCP | False | - |
16660002 ( Base ) | 150416 ( Base ) | 36212 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
16660002 ( 1.0x ) | 150418 ( 1.0 ) | 36212 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16660002 ( 1.0x ) | 150418 ( 1.0 ) | 36212 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16660002 ( 1.0x ) | 150418 ( 1.0 ) | 36212 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16660002 ( 1.0x ) | 150418 ( 1.0 ) | 36212 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16660002 ( 1.0x ) | 150418 ( 1.0 ) | 36212 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
16660002 ( 1.0x ) | 150418 ( 1.0 ) | 36212 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
4118315 ( 4.0x ) | 151600 ( 1.008 ) | 36212 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
2850259 ( 5.8x ) | 151600 ( 1.008 ) | 36212 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
2161571 ( 7.7x ) | 151600 ( 1.008 ) | 36212 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
2119859 ( 7.9x ) | 151600 ( 1.008 ) | 36212 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
2119859 ( 7.9x ) | 151600 ( 1.008 ) | 36212 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
2123248 ( 7.8x ) | 151600 ( 1.008 ) | 36212 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
13520505 ( 1.2x ) | 161982 ( 1.077 ) | 36156 ( 0.998 ) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
15963290 ( 1.0x ) | 164254 ( 1.092 ) | 36156 ( 0.998 ) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |

Image Classification (resnet)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
745826387 ( 0.1x ) | 173101 ( 0.936 ) | 68968 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | False | - |
745826339 ( 0.1x ) | 173117 ( 0.936 ) | 68968 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826339 ( 0.1x ) | 173117 ( 0.936 ) | 68968 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826339 ( 0.1x ) | 173117 ( 0.936 ) | 68968 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826339 ( 0.1x ) | 173117 ( 0.936 ) | 68968 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826339 ( 0.1x ) | 173117 ( 0.936 ) | 68968 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
745826339 ( 0.1x ) | 173117 ( 0.936 ) | 68968 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
697937425 ( 0.1x ) | 185370 ( 1.002 ) | 68912 ( 0.999 ) | 0 | TFLM | Reference | RV32GCP | False | - |
81008676 ( Base ) | 184918 ( Base ) | 68960 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
81008676 ( 1.0x ) | 184946 ( 1.0 ) | 68960 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81008676 ( 1.0x ) | 184946 ( 1.0 ) | 68960 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81008676 ( 1.0x ) | 184946 ( 1.0 ) | 68960 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81008676 ( 1.0x ) | 184946 ( 1.0 ) | 68960 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81008676 ( 1.0x ) | 184946 ( 1.0 ) | 68960 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
81008676 ( 1.0x ) | 184946 ( 1.0 ) | 68960 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
15481791 ( 5.2x ) | 186934 ( 1.011 ) | 68960 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
9794783 ( 8.3x ) | 186934 ( 1.011 ) | 68960 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
7201135 ( 11.2x ) | 186934 ( 1.011 ) | 68960 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
5935615 ( 13.6x ) | 186934 ( 1.011 ) | 68960 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
4993924 ( 16.2x ) | 186934 ( 1.011 ) | 68960 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
4743397 ( 17.1x ) | 186934 ( 1.011 ) | 68960 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
62984859 ( 1.3x ) | 196586 ( 1.063 ) | 68904 ( 0.999 ) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
68445967 ( 1.2x ) | 199452 ( 1.079 ) | 68904 ( 0.999 ) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |

Anomaly Detection (toycar)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
3106372 ( 0.6x ) | 334010 ( 0.991 ) | 19432 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | False | - |
3106372 ( 0.6x ) | 334016 ( 0.991 ) | 19432 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106372 ( 0.6x ) | 334016 ( 0.991 ) | 19432 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106372 ( 0.6x ) | 334016 ( 0.991 ) | 19432 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106372 ( 0.6x ) | 334016 ( 0.991 ) | 19432 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106372 ( 0.6x ) | 334016 ( 0.991 ) | 19432 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3106372 ( 0.6x ) | 334016 ( 0.991 ) | 19432 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
3121275 ( 0.6x ) | 346368 ( 1.027 ) | 19380 ( 0.997 ) | 0 | TFLM | Reference | RV32GCP | False | - |
1789729 ( Base ) | 337190 ( Base ) | 19432 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
1789729 ( 1.0x ) | 337192 ( 1.0 ) | 19432 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789729 ( 1.0x ) | 337192 ( 1.0 ) | 19432 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789729 ( 1.0x ) | 337192 ( 1.0 ) | 19432 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789729 ( 1.0x ) | 337192 ( 1.0 ) | 19432 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789729 ( 1.0x ) | 337192 ( 1.0 ) | 19432 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
1789729 ( 1.0x ) | 337192 ( 1.0 ) | 19432 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
584261 ( 3.1x ) | 338104 ( 1.003 ) | 19432 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
465893 ( 3.8x ) | 338104 ( 1.003 ) | 19432 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
406709 ( 4.4x ) | 338104 ( 1.003 ) | 19432 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
377453 ( 4.7x ) | 338104 ( 1.003 ) | 19432 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
373769 ( 4.8x ) | 338104 ( 1.003 ) | 19432 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
371885 ( 4.8x ) | 338104 ( 1.003 ) | 19432 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
1631830 ( 1.1x ) | 349334 ( 1.036 ) | 19380 ( 0.997 ) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
959437 ( 1.9x ) | 350924 ( 1.041 ) | 19380 ( 0.997 ) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |

Visual Wake Words (vww)

Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
---|---|---|---|---|---|---|---|---|
495297744 ( 0.1x ) | 406217 ( 0.958 ) | 134520 ( 1.0 ) | 0 | TFLM | Reference | RV32GC | False | - |
495297744 ( 0.1x ) | 406223 ( 0.958 ) | 134520 ( 1.0 ) | 128 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495297744 ( 0.1x ) | 406223 ( 0.958 ) | 134520 ( 1.0 ) | 256 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495297744 ( 0.1x ) | 406223 ( 0.958 ) | 134520 ( 1.0 ) | 512 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495297744 ( 0.1x ) | 406223 ( 0.958 ) | 134520 ( 1.0 ) | 1024 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495297744 ( 0.1x ) | 406223 ( 0.958 ) | 134520 ( 1.0 ) | 2048 | TFLM | Reference | RV32GCV | False | Loop+SLP |
495297744 ( 0.1x ) | 406223 ( 0.958 ) | 134520 ( 1.0 ) | 4096 | TFLM | Reference | RV32GCV | False | Loop+SLP |
445917096 ( 0.1x ) | 418584 ( 0.987 ) | 134464 ( 1.0 ) | 0 | TFLM | Reference | RV32GCP | False | - |
49689783 ( Base ) | 424120 ( Base ) | 134528 ( Base ) | 0 | muRISCV-NN | Scalar | RV32GC | False | - |
49689783 ( 1.0x ) | 424122 ( 1.0 ) | 134528 ( 1.0 ) | 128 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49689783 ( 1.0x ) | 424122 ( 1.0 ) | 134528 ( 1.0 ) | 256 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49689783 ( 1.0x ) | 424122 ( 1.0 ) | 134528 ( 1.0 ) | 512 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49689783 ( 1.0x ) | 424122 ( 1.0 ) | 134528 ( 1.0 ) | 1024 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49689783 ( 1.0x ) | 424122 ( 1.0 ) | 134528 ( 1.0 ) | 2048 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
49689783 ( 1.0x ) | 424122 ( 1.0 ) | 134528 ( 1.0 ) | 4096 | muRISCV-NN | Scalar | RV32GCV | False | Loop+SLP |
13486964 ( 3.7x ) | 425304 ( 1.003 ) | 134528 ( 1.0 ) | 128 | muRISCV-NN | Vector | RV32GCV | False | - |
10158336 ( 4.9x ) | 425304 ( 1.003 ) | 134528 ( 1.0 ) | 256 | muRISCV-NN | Vector | RV32GCV | False | - |
8869214 ( 5.6x ) | 425304 ( 1.003 ) | 134528 ( 1.0 ) | 512 | muRISCV-NN | Vector | RV32GCV | False | - |
8364715 ( 5.9x ) | 425304 ( 1.003 ) | 134528 ( 1.0 ) | 1024 | muRISCV-NN | Vector | RV32GCV | False | - |
8316101 ( 6.0x ) | 425304 ( 1.003 ) | 134528 ( 1.0 ) | 2048 | muRISCV-NN | Vector | RV32GCV | False | - |
8319490 ( 6.0x ) | 425304 ( 1.003 ) | 134528 ( 1.0 ) | 4096 | muRISCV-NN | Vector | RV32GCV | False | - |
40770787 ( 1.2x ) | 435688 ( 1.027 ) | 134472 ( 1.0 ) | 0 | muRISCV-NN | Scalar | RV32GCP | False | - |
49192784 ( 1.0x ) | 437960 ( 1.033 ) | 134472 ( 1.0 ) | 0 | muRISCV-NN | Packed | RV32GCP | False | - |

Original data
Click here to download the raw files for this benchmark.