GGML - AshokBhat/ml GitHub Wiki

About

GGML (Georgi Gerganov Machine Learning)
Tensor library for machine learning, written in C
Suited to running large models on commodity hardware (CPUs)

Quantization Support

Supports post-training quantization (converting weights to lower precisions such as 4-bit or 8-bit), similar in purpose to the separate bitsandbytes library
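As a rough illustration of what post-training quantization does, the sketch below quantizes a flat list of weights to 8-bit integers one block at a time, storing one floating-point scale per block. It is not GGML's actual implementation: the block length of 32, the symmetric scaling by the block's largest magnitude, and all function names are assumptions chosen for this example.

```python
# Illustrative block-wise 8-bit quantization; NOT GGML's actual code.
BLOCK_SIZE = 32  # assumed block length for this sketch


def quantize_block(values):
    """Map a block of floats to (scale, list of ints in [-127, 127])."""
    amax = max((abs(v) for v in values), default=0.0)
    scale = amax / 127.0
    if scale == 0.0:
        # All-zero (or empty) block: nothing to scale.
        return 0.0, [0] * len(values)
    return scale, [round(v / scale) for v in values]


def dequantize_block(scale, quants):
    """Recover approximate floats from one quantized block."""
    return [scale * q for q in quants]


def quantize(weights, block_size=BLOCK_SIZE):
    """Quantize a flat list of weights one block at a time."""
    return [
        quantize_block(weights[i:i + block_size])
        for i in range(0, len(weights), block_size)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    out = []
    for scale, quants in blocks:
        out.extend(dequantize_block(scale, quants))
    return out
```

Each weight is recovered to within half a quantization step (half of the block's scale), which is the accuracy given up in exchange for storing one small integer per weight plus one scale per block.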

Hardware Optimization

Utilizes AVX/AVX2 intrinsics on x86 architectures

Model Deployment

Models are distributed in single-file formats such as GGUF
A single file holds both the weights and the model metadata, which removes the need for extra configuration files and simplifies deployment
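To make the single-file idea concrete, the sketch below parses the fixed-size header at the start of a GGUF file. It assumes the little-endian layout used by GGUF version 2 and later (a 4-byte magic `GGUF`, a 32-bit version, then 64-bit tensor and metadata key-value counts); the function name `read_gguf_header` and the returned dictionary keys are hypothetical and not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Assumed layout (GGUF v2+): magic, uint32 version,
# uint64 tensor count, uint64 metadata key-value count.
HEADER_FORMAT = "<4sIQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def read_gguf_header(data):
    """Parse the fixed GGUF header from the first bytes of a file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data too short for a GGUF header")
    magic, version, tensor_count, kv_count = struct.unpack_from(
        HEADER_FORMAT, data, 0
    )
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic)")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

In practice the bytes would come from something like `open(path, "rb").read(24)`. The metadata key-value pairs and tensor descriptions follow the header in the same file, which is why no separate configuration file is needed.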

See also

[[llama.cpp]]
[[GGUF]]