Bfloat16 - AshokBhat/notes GitHub Wiki

Bfloat16

  • 16-bit floating-point format
  • Format: 1 sign bit, 8 exponent bits, 7 mantissa bits
  • Same 8-bit exponent as FP32, so it keeps FP32's dynamic range while giving up precision
  • Often better suited for deep learning than FP16, whose 5-bit exponent has a narrower range
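Because bfloat16 is just the top 16 bits of an FP32 value, conversion can be sketched as a bit shift. A minimal illustration (round-toward-zero truncation; the helper names are hypothetical, and real hardware typically uses round-to-nearest-even instead):

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Keep the top 16 bits of the float32 encoding (truncating conversion)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bfloat16_bits_to_float32(b: int) -> float:
    """Widen bfloat16 bits back to float32 by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x

# 1.0 is 0x3F800000 in float32, so its bfloat16 encoding is 0x3F80
print(hex(float32_to_bfloat16_bits(1.0)))   # 0x3f80
# Widening is exact: every bfloat16 value is also a float32 value
print(bfloat16_bits_to_float32(0x3F80))     # 1.0
```

Note how narrowing loses low mantissa bits (e.g. 3.14159 round-trips to 3.140625), while widening back to FP32 is always exact.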

Support

  • Arm: BF16 extension, introduced in Armv8.6-A
  • Intel: AVX-512 BF16 and AMX instructions
  • Google: TPUs (bfloat16 originated at Google Brain)
  • NVIDIA: Ampere and later GPUs

See Also
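
  • FP16
  • FP32

In practice, frameworks expose bfloat16 directly. A quick way to observe the precision loss, assuming NumPy is installed (it has had a usable `bfloat16`-like workflow only via truncation, so this sketch uses `float16` solely for contrast with the truncation helpers above is avoided; instead we truncate manually):

```python
import struct

def to_bf16(x: float) -> float:
    """Round-toward-zero to bfloat16, returned as a float for easy comparison."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", (bits >> 16) << 16))
    return y

# bfloat16 keeps only ~2-3 significant decimal digits (8-bit mantissa incl. hidden bit)
print(to_bf16(0.1))        # ~0.099609375
# ...but represents very large magnitudes that overflow FP16 (max ~65504)
print(to_bf16(1.0e38))     # finite, close to 1e38
```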