# 🌟 Welcome to GGUF Model Builder

**The Complete Toolkit for Optimized LLM Deployment**
## 🚀 Overview
GGUFModelBuilder is an end-to-end solution for converting, quantizing, and deploying large language models in the efficient GGUF format. Our toolkit bridges the gap between raw Hugging Face models and production-ready inference systems.
```mermaid
graph TB
    A[Hugging Face Models] --> B(Download & Convert)
    B --> C[GGUF Format]
    C --> D{Quantization}
    D --> E[4-bit]
    D --> F[5-bit]
    D --> G[8-bit]
    D --> H[16-bit]
    E --> I[Redis Catalog]
    F --> I
    G --> I
    H --> I
    I --> J[Production Deployment]
```
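The sketch below shows this flow in Python, assuming hypothetical command-line flags for the repository's scripts (`--model` and `--quant` are illustrative; check each script's own help for the real interface):

```python
import subprocess

# Orchestration sketch of the pipeline above. The script names come from
# this repository, but the command-line flags are illustrative assumptions.
MODEL_ID = "mistralai/Mistral-7B-v0.1"              # example Hub model
QUANT_TYPES = ["Q4_K_M", "Q5_K_M", "Q8_0", "F16"]   # 4-, 5-, 8-, 16-bit

def run(cmd):
    """Run one pipeline stage, raising if it fails."""
    print("->", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Download the raw model and convert it to GGUF.
run(["python", "download_convert.py", "--model", MODEL_ID])

# 2. Quantize the converted model at each bit width.
for quant in QUANT_TYPES:
    run(["python", "make_files.py", "--model", MODEL_ID, "--quant", quant])

# 3. Upload the results to the Hugging Face Hub (in the real pipeline,
#    the Redis catalog is updated along the way).
run(["python", "upload-files.py", "--model", MODEL_ID])
```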
## 🧰 Core Components
### 1. Model Conversion Suite

- `download_convert.py`: Raw model → GGUF converter
- `make_files.py`: Quantization pipeline
- `upload-files.py`: HF Hub deployment
### 2. Automation Tools

- `model_converter.py`: End-to-end orchestration
- `auto_build_new_models.py`: Automatic model discovery (see the sketch below)
- `build_llama.py`: Patched llama.cpp builder
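As an illustration of what automatic discovery involves, the following sketch uses the `huggingface_hub` client to list popular text-generation models; the actual selection criteria in `auto_build_new_models.py` may differ.

```python
from huggingface_hub import HfApi

# Sketch of model discovery: list the 100 most-downloaded text-generation
# models on the Hub. The real script's filters and ranking may differ.
api = HfApi()
candidates = api.list_models(
    filter="text-generation",
    sort="downloads",
    direction=-1,
    limit=100,
)
for model in candidates:
    print(model.id)  # IDs that could be queued for conversion
```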
### 3. Catalog Management

- Redis-based model tracking (see the sketch below)
- Metadata version control
- Conversion status monitoring
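A minimal sketch of Redis-based tracking with the `redis` Python client, assuming a hypothetical per-model key schema (the catalog's real schema may differ):

```python
import os
import redis

# Connect using the same environment variables as the Quick Start below.
r = redis.Redis(
    host=os.environ["REDIS_SERVER"],
    port=int(os.environ.get("REDIS_PORT", 6379)),
    password=os.environ.get("REDIS_PASSWORD"),
    decode_responses=True,
)

# Hypothetical per-model hash: one entry per conversion job.
key = "model:mistralai/Mistral-7B-v0.1"
r.hset(key, mapping={
    "status": "quantizing",   # conversion status monitoring
    "quant_type": "Q4_K_M",
    "version": "1",           # metadata version control
})
print(r.hgetall(key))         # {'status': 'quantizing', ...}
```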
## 📋 Key Features
- **One-Click Conversions** from Hugging Face to GGUF
- **Smart Quantization** with configurable presets (sketched after this list)
- **Redis Catalog** for enterprise-scale management
- **Automatic Patching** of llama.cpp
- **CI/CD-Ready** pipelines
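As a rough illustration of what configurable quantization presets could look like, here is a minimal Python sketch. The quantization type names (Q4_K_M, Q5_K_M, Q8_0, F16) are standard llama.cpp identifiers, but the preset structure and names are assumptions, not the toolkit's actual format.

```python
# Hypothetical preset table mapping friendly names to llama.cpp
# quantization types (the type identifiers are real; the presets are not).
QUANT_PRESETS = {
    "small":    {"type": "Q4_K_M", "note": "smallest files, lowest quality"},
    "balanced": {"type": "Q5_K_M", "note": "good size/quality trade-off"},
    "quality":  {"type": "Q8_0",   "note": "near-lossless, larger files"},
    "full":     {"type": "F16",    "note": "unquantized reference"},
}

def resolve_preset(name: str) -> str:
    """Return the llama.cpp quantization type for a preset name."""
    return QUANT_PRESETS[name]["type"]

print(resolve_preset("balanced"))  # -> Q5_K_M
```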
## 🏁 Getting Started

### Quick Start
Set the environment variables for Hugging Face access and for the Redis server that stores model conversion progress, then clone the repository and run the converter:

```bash
# Hugging Face API token
export HF_API_TOKEN=xxxxx

# Redis server that stores model conversion progress
export REDIS_SERVER=yourserver.com
export REDIS_PORT=6379
export REDIS_PASSWORD=xxxx

git clone https://github.com/yourorg/GGUFModelBuilder
cd GGUFModelBuilder
python model_converter.py
```

This will attempt to download and convert the top 100 models on Hugging Face. **Warning:** this is very resource intensive.
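Before launching, it can help to verify that the required environment variables are set. A small sketch:

```python
import os
import sys

# Check the environment variables from the Quick Start before kicking
# off a long, resource-intensive run.
REQUIRED = ["HF_API_TOKEN", "REDIS_SERVER", "REDIS_PORT", "REDIS_PASSWORD"]

missing = [name for name in REQUIRED if not os.environ.get(name)]
if missing:
    sys.exit("Missing environment variables: " + ", ".join(missing))
print("Environment looks good; ready to run model_converter.py")
```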
### System Requirements
- Python 3.8+
- Redis Server
- NVIDIA CUDA (optional)
- 200GB+ disk space