QNNPACK - AshokBhat/ml GitHub Wiki

About

  • Quantized Neural Network PACKage, an open-source library by Facebook
  • Mobile-optimized for low-precision, high-performance neural network inference
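Low-precision here means 8-bit integer arithmetic in place of 32-bit floats. A minimal sketch of the affine quantization scheme such backends rely on (the scale and zero-point values below are illustrative, not taken from the QNNPACK source):

```python
def quantize(x, scale, zero_point):
    """Map a float to an unsigned 8-bit integer (affine quantization)."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))  # clamp to the uint8 range

def dequantize(q, scale, zero_point):
    """Approximately recover the original float."""
    return (q - zero_point) * scale

# Illustrative parameters: scale 0.05, zero point 128
q = quantize(1.0, 0.05, 128)        # -> 148
x = dequantize(q, 0.05, 128)        # -> 1.0 (exact here; in general, within one step)
```

The error introduced by this mapping is bounded by half the scale, which is why well-chosen per-tensor scale/zero-point pairs keep quantized inference accurate.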
Area | Details
---- | -------
Performance | ~2x faster than state-of-the-art; see Performance section for details
Deployment | Deployed by Facebook on billions of phones
Arch support | Optimized for ARM; slow fallback on x86
Integration | Part of PyTorch
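Because QNNPACK ships inside PyTorch, it can be selected as the quantized-inference engine via `torch.backends.quantized.engine`. A minimal sketch (which engines are available depends on the PyTorch build):

```python
import torch

# A PyTorch build may bundle several quantized engines (e.g. fbgemm, qnnpack)
print(torch.backends.quantized.supported_engines)

# Select QNNPACK when the build includes it (typical on ARM/mobile builds)
if 'qnnpack' in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = 'qnnpack'

# Quantized tensors use the same affine scheme (scale + zero point)
x = torch.randn(4)
qx = torch.quantize_per_tensor(x, scale=0.05, zero_point=128,
                               dtype=torch.quint8)
print(qx.dequantize())  # approximate reconstruction of x
```

On x86 builds the `qnnpack` engine may still be listed, but, as noted above, it runs as a slow fallback there.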

Performance

  • ~2x faster than the [[state-of-the-art]] on phones for benchmarks such as quantized [[MobileNetV2]]

Development status

Sources

  1. Launch blog - https://engineering.fb.com/ml-applications/qnnpack/
  2. GitHub archived repo - https://github.com/pytorch/QNNPACK (No longer active)
  3. GitHub current location - https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native/quantized/cpu/qnnpack

See also