FP16 - AshokBhat/ml GitHub Wiki
About
- 16-bit floating-point format
- Format: 1-bit sign, 5-bit exponent and a 10-bit fraction
- IEEE 754 half-precision floating-point (binary16)
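The bit layout above can be illustrated with a minimal Python sketch that splits a 16-bit pattern into its sign, exponent and fraction fields and rebuilds the value. The function names `decode_fp16` and `reference_fp16` are hypothetical and used only for illustration. The exponent bias of 15 and the special cases (zero, subnormals, infinity, NaN) follow the IEEE 754 binary16 definition rather than anything stated on this page. The standard library's `struct` half-precision format code `'e'` serves as a cross-check.

```python
import struct


def decode_fp16(bits: int) -> float:
    """Decode a 16-bit pattern as IEEE 754 half precision, field by field."""
    sign = (bits >> 15) & 0x1        # 1-bit sign
    exponent = (bits >> 10) & 0x1F   # 5-bit exponent, bias 15
    fraction = bits & 0x3FF          # 10-bit fraction

    if exponent == 0x1F:             # all ones: infinity or NaN
        value = float("inf") if fraction == 0 else float("nan")
    elif exponent == 0:              # all zeros: zero or subnormal (no implicit 1)
        value = (fraction / 1024) * 2.0 ** -14
    else:                            # normal number with implicit leading 1
        value = (1 + fraction / 1024) * 2.0 ** (exponent - 15)
    return -value if sign else value


def reference_fp16(bits: int) -> float:
    """Decode the same bits using the standard library's 'e' (half) format."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


if __name__ == "__main__":
    # 0x3C00 is 1.0, 0xC000 is -2.0, 0x7BFF is the largest finite value (65504.0)
    for pattern in (0x3C00, 0xC000, 0x7BFF):
        print(hex(pattern), decode_fp16(pattern), reference_fp16(pattern))
```

With only 10 fraction bits, FP16 carries roughly three significant decimal digits, and the 5-bit exponent limits the largest finite value to 65504.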
Comparison
Hardware support
- Intel CPUs - Yes, first in Sapphire Rapids (AVX512-FP16)
- AMD GPUs - Yes (packed FP16 math since the Vega architecture)
- Arm CPUs - Yes
- NVIDIA GPUs - Yes
See also
- [[FP16]] | [[FP32]] | [[bfloat16]]