GGML - AshokBhat/ml GitHub Wiki

About

  • GGML (Georgi Gerganov Machine Learning)
  • Tensor library for machine learning, written in C
  • Suited for handling large models on standard hardware (CPUs)

Quantization Support

  • Supports post-training quantization of model weights to lower precisions such as 4-bit or 8-bit integers, which reduces memory use
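
The idea behind block-wise weight quantization can be sketched as below. This is a simplified illustration, not GGML's actual kernels or on-disk layout: the block size of 32, the symmetric [-7, 7] mapping, and all function names are assumptions made for this example.

```python
import math

BLOCK_SIZE = 32  # assumed block size for this sketch


def quantize_block_4bit(values):
    """Quantize one block of floats to signed 4-bit integers plus one float scale."""
    amax = max(abs(v) for v in values)
    # Map [-amax, amax] onto the integer range [-7, 7]; avoid dividing by zero.
    scale = amax / 7.0 if amax > 0 else 1.0
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block_4bit(scale, quants):
    """Recover approximate floats from a quantized block."""
    return [scale * q for q in quants]


def quantize(values, block_size=BLOCK_SIZE):
    """Split a flat list of weights into blocks and quantize each one."""
    return [
        quantize_block_4bit(values[start:start + block_size])
        for start in range(0, len(values), block_size)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    out = []
    for scale, quants in blocks:
        out.extend(dequantize_block_4bit(scale, quants))
    return out


if __name__ == "__main__":
    weights = [math.sin(i) for i in range(64)]
    restored = dequantize(quantize(weights))
    print(max(abs(a - b) for a, b in zip(weights, restored)))
```

Each block stores one scale plus 4 bits per weight instead of 32 bits per weight, at the cost of a rounding error of at most half a scale step per value.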

Hardware Optimization

  • Uses AVX/AVX2 SIMD intrinsics on x86 CPUs

Model Deployment

  • Models are distributed in single-file formats like GGUF
  • One file holds both the weights and the model metadata, which removes the need for extra configuration files and simplifies deployment
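
As a rough illustration of the single-file idea, the sketch below reads the fixed-size header at the start of a GGUF file. It assumes the little-endian layout used by GGUF version 2 and later (4-byte magic, 32-bit version, 64-bit tensor count, 64-bit metadata key/value count); the function name and the returned dictionary keys are made up for this example, and the metadata and tensor data that follow the header are not parsed.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header (assumed v2+ little-endian layout)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data is too short to contain a GGUF header")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic bytes")
    # "<" = little-endian without padding, I = uint32, Q = uint64
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


if __name__ == "__main__":
    # Synthetic header built in memory so the example needs no model file.
    fake_header = GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5)
    print(read_gguf_header(fake_header))
```

With a real model, the same function could be applied to the first 24 bytes read from the file opened in binary mode.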

See also

  • llama.cpp
  • GGUF