FP16 - AshokBhat/ml GitHub Wiki

About

  • 16-bit floating-point format
  • Format: 1-bit sign, 5-bit exponent and a 10-bit fraction
  • IEEE 754 half-precision floating-point (binary16)
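
The bit layout above can be illustrated with a minimal Python sketch. It relies on the standard library's `struct` module, whose `e` format code packs a float as IEEE 754 binary16. The helper names `fp16_fields` and `fp16_value` are hypothetical and used here only for illustration; the exponent bias of 15 comes from the binary16 definition.

```python
import struct


def fp16_fields(x: float) -> tuple[int, int, int]:
    """Round x to FP16 and return its (sign, exponent, fraction) bit fields."""
    # Pack as binary16, then reinterpret the same two bytes as an unsigned 16-bit int.
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    sign = bits >> 15                # 1 bit
    exponent = (bits >> 10) & 0x1F   # 5 bits, stored with a bias of 15
    fraction = bits & 0x3FF          # 10 bits
    return sign, exponent, fraction


def fp16_value(sign: int, exponent: int, fraction: int) -> float:
    """Reconstruct the numeric value from the three FP16 bit fields."""
    s = -1.0 if sign else 1.0
    if exponent == 0x1F:             # all ones: infinity or NaN
        return float("nan") if fraction else s * float("inf")
    if exponent == 0:                # all zeros: zero or subnormal
        return s * 2.0 ** -14 * (fraction / 1024)
    return s * 2.0 ** (exponent - 15) * (1 + fraction / 1024)


if __name__ == "__main__":
    for x in (1.0, -2.0, 0.5, 65504.0):
        fields = fp16_fields(x)
        print(x, fields, fp16_value(*fields))
```

Note that `struct.pack("<e", x)` raises `OverflowError` when `x` is too large to represent in FP16, so the sketch only handles values within the FP16 range (largest finite value 65504).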

Comparison

Hardware support

  • Intel CPUs - Native FP16 arithmetic (AVX512-FP16) first in Sapphire Rapids; earlier CPUs with F16C only convert between FP16 and FP32.
  • AMD GPUs - Yes (packed FP16 arithmetic since the Vega architecture)
  • Arm CPUs - Yes (optional FP16 arithmetic extension from Armv8.2-A)
  • NVIDIA GPUs - Yes

See also