QNNPACK - AshokBhat/ml GitHub Wiki
## About
- Mobile-optimized library for low-precision high-performance neural network inference.
- Quantized Neural Network PACKage by Facebook
| Area | Details |
|---|---|
| Perf | ~2x faster than the previous state of the art; see Performance section |
| Deployment | Deployed by Facebook on billions of phones |
| Arch support | Optimized for ARM; slow fallback path on x86 |
| Integration | Part of PyTorch |
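Since QNNPACK ships inside PyTorch as a quantized-op backend, it can be selected via `torch.backends.quantized.engine`. A minimal sketch, assuming a PyTorch build that lists `qnnpack` among its supported engines (the tiny `Linear` model here is illustrative only):

```python
import torch

# Select the QNNPACK backend for quantized kernels, if this build provides it.
if "qnnpack" in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = "qnnpack"

# Dynamically quantize a small linear layer to int8 and run inference.
model = torch.nn.Linear(4, 2)
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
out = qmodel(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])
```

On ARM phones the selected engine dispatches to QNNPACK's optimized kernels; on x86 the same code runs via the slower fallback path noted above.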
## Performance
- ~2x faster than the [[state-of-the-art]] on phones for benchmarks such as quantized [[MobileNetV2]]
## Development status
- The original standalone project (https://github.com/pytorch/QNNPACK) has been archived
- Now under active development inside the PyTorch repository - https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native/quantized/cpu/qnnpack
- End-to-End benchmarking - https://github.com/pytorch/QNNPACK#end-to-end-benchmarking
## Sources
- Launch blog - https://engineering.fb.com/ml-applications/qnnpack/
- GitHub archived repo - https://github.com/pytorch/QNNPACK (No longer active)
- GitHub current location - https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native/quantized/cpu/qnnpack