AWQ - AshokBhat/ml GitHub Wiki

About

  • Activation-aware Weight Quantization (AWQ)
  • A hardware-friendly approach to low-bit, weight-only quantization of LLMs
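
To make "activation-aware, weight-only" concrete, here is a minimal pure-Python sketch, not the reference implementation. It assumes one quantization group per output row, 4-bit asymmetric rounding, and a fixed exponent `alpha` for the per-input-channel scale. The names `quantize_group`, `dequantize_group`, `awq_style_quantize`, and `act_mag` are hypothetical and chosen only for illustration. The idea shown: weights on input channels with large activations are scaled up before rounding, so they lose less precision, and the scale is divided back out afterwards.

```python
import random

def quantize_group(ws, n_bits=4):
    """Asymmetric uniform quantization of one group of weights to integer codes."""
    qmax = 2 ** n_bits - 1
    lo, hi = min(ws), max(ws)
    step = (hi - lo) / qmax or 1.0  # fall back to 1.0 if the group is constant
    codes = [min(qmax, max(0, round((w - lo) / step))) for w in ws]
    return codes, step, lo

def dequantize_group(codes, step, lo):
    """Map integer codes back to approximate float weights."""
    return [c * step + lo for c in codes]

def awq_style_quantize(W, act_mag, alpha=0.5, n_bits=4):
    """W: rows of weights (out_features x in_features).
    act_mag: mean |activation| per input channel (assumed precomputed
    from calibration data). alpha is fixed here, an assumption of this sketch."""
    # Per-input-channel scale; the small floor avoids division by zero.
    scales = [max(m, 1e-8) ** alpha for m in act_mag]
    all_codes, W_hat = [], []
    for row in W:
        scaled = [w * s for w, s in zip(row, scales)]      # protect salient channels
        codes, step, lo = quantize_group(scaled, n_bits)   # weight-only: activations stay float
        deq = dequantize_group(codes, step, lo)
        W_hat.append([d / s for d, s in zip(deq, scales)]) # undo the scale
        all_codes.append(codes)
    return all_codes, W_hat, scales

random.seed(0)
W = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
act_mag = [0.1, 0.1, 5.0, 0.1, 0.1, 0.1, 0.1, 0.1]  # channel 2 has large activations
codes, W_hat, scales = awq_style_quantize(W, act_mag)
```

Here `codes` holds the 4-bit integers that would be stored, and `W_hat` is the float reconstruction used to check error. Setting `alpha=0` makes every scale 1.0, which reduces the sketch to plain round-to-nearest quantization.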

See also