Quantization - AshokBhat/ml GitHub Wiki
## Description
- Reduce the precision of the numbers used to represent a model's parameters (and sometimes activations) - from the default FP32 down to FP16 or INT8
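The core mapping can be sketched in plain Python. This is a minimal, illustrative symmetric INT8 scheme (the helper names `quantize`/`dequantize` are made up for this sketch, not any framework's API): floats are scaled onto the signed 8-bit integer grid, then scaled back with a small rounding error.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers in [-(2**(b-1)-1), 2**(b-1)-1]."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for INT8
    scale = max(abs(v) for v in values) / qmax
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integers and the scale."""
    return [v * scale for v in q]

weights = [0.91, -0.42, 0.07, -1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)
```

Each weight now occupies one byte instead of four; `restored` differs from `weights` only by rounding error bounded by half the scale.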
## Pros and Cons
- Pros - Smaller model size, faster computation, lower memory and power use
- Cons - Possible accuracy loss; conversion is not always trivial and may require calibration data or retraining
## Types
- Post-training FP16 quantization
- Post-training dynamic range quantization
- Post-training integer quantization
- Quantization-aware training
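The last type differs from the first three: instead of converting an already-trained model, quantization-aware training simulates quantization during training so the model learns to tolerate the error. A pure-Python sketch of the "fake quantization" idea (illustrative only; `fake_quant` is a hypothetical name, not any framework's actual op):

```python
def fake_quant(x, num_bits=8):
    """Quantize-then-dequantize: the forward pass sees the values the
    quantized model will actually compute with, so the training loss
    already accounts for the rounding error."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = (max(abs(v) for v in x) / qmax) or 1.0  # avoid /0 on all-zero input
    return [round(v / scale) * scale for v in x]

activations = [0.5, -0.31, 0.127]
simulated = fake_quant(activations)
```

In an actual training loop this op is inserted after weights and activations, and gradients are passed through it unchanged (the straight-through estimator).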
## FAQ
- What is quantization?
- What are the downsides?
- When is it used?
- What support do various frameworks have for quantization?
## See also