QNNPACK - AshokBhat/ml GitHub Wiki
## About
- Mobile-optimized library for low-precision high-performance neural network inference.
- Quantized Neural Network PACKage by Facebook
| Area | Details |
|---|---|
| Perf | ~2x faster than the previous state of the art; see Performance section |
| Deployment | Deployed by Facebook on billions of phones |
| Arch support | Optimized for ARM; slow fallback path on x86 |
| Integration | Part of PyTorch |
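Since QNNPACK ships inside PyTorch as a quantized-op backend, it can be selected via `torch.backends.quantized.engine`. A minimal sketch, assuming a PyTorch build that lists `qnnpack` among its supported engines (the tiny `Linear` model here is illustrative only):

```python
import torch

# Select the QNNPACK backend for quantized kernels, if this build provides it.
if "qnnpack" in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = "qnnpack"

# Dynamically quantize a small linear layer to int8 and run inference.
model = torch.nn.Linear(4, 2)
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
out = qmodel(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])
```

On ARM phones the selected engine dispatches to QNNPACK's optimized kernels; on x86 the same code runs via the slower fallback path noted above.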
## Performance
- ~2x faster than the [[state-of-the-art]] on phones for benchmarks such as quantized [[MobileNetV2]]
## Development status
- The original standalone project (https://github.com/pytorch/QNNPACK) has been archived
- Now under active development inside the PyTorch repository - https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native/quantized/cpu/qnnpack
- End-to-End benchmarking - https://github.com/pytorch/QNNPACK#end-to-end-benchmarking
## Sources
- Launch blog - https://engineering.fb.com/ml-applications/qnnpack/
- GitHub archived repo - https://github.com/pytorch/QNNPACK (No longer active)
- GitHub current location - https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native/quantized/cpu/qnnpack