
# 🌟 Welcome to GGUF Model Builder

**The Complete Toolkit for Optimized LLM Deployment**

## 🚀 Overview

GGUFModelBuilder is an end-to-end solution for converting, quantizing, and deploying large language models in the efficient GGUF format. Our toolkit bridges the gap between raw Hugging Face models and production-ready inference systems.

```mermaid
graph TB
    A[Hugging Face Models] --> B(Download & Convert)
    B --> C[GGUF Format]
    C --> D{Quantization}
    D --> E[4-bit]
    D --> F[5-bit]
    D --> G[8-bit]
    D --> H[16-bit]
    E --> I[Redis Catalog]
    F --> I
    G --> I
    H --> I
    I --> J[Production Deployment]
```
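
Each quantization branch corresponds to a llama.cpp preset (for example 4-bit → `Q4_K_M`, 5-bit → `Q5_K_M`, 8-bit → `Q8_0`, 16-bit → `F16`). As a rough sketch of what the pipeline produces per variant, here is stock llama.cpp `llama-quantize` usage on a converted F16 model; the presets this project actually emits are configured by the quantization pipeline, not fixed here:

```bash
# Illustrative quantization of a converted F16 GGUF with upstream llama.cpp:
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M   # 4-bit, medium quality
./llama-quantize model-f16.gguf model-Q5_K_M.gguf Q5_K_M   # 5-bit
./llama-quantize model-f16.gguf model-Q8_0.gguf Q8_0       # 8-bit
```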

## 🧰 Core Components

### 1. Model Conversion Suite

- `download_convert.py`: Raw model → GGUF converter
- `make_files.py`: Quantization pipeline
- `upload-files.py`: HF Hub deployment
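
A typical end-to-end pass chains the three scripts in order. The sketch below is illustrative only: the flag names (`--model-id`, `--outdir`, `--input`, `--folder`, `--repo`) and the example model are assumptions, not the scripts' documented interfaces — check each script's source or `--help` for the real arguments.

```bash
# Hypothetical single-model pass (flag names are assumptions):
python download_convert.py --model-id microsoft/phi-2 --outdir ./work   # HF download → GGUF
python make_files.py --input ./work/phi-2-f16.gguf                      # emit quantized variants
python upload-files.py --folder ./work --repo yourname/phi-2-GGUF       # push to the HF Hub
```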

### 2. Automation Tools

- `model_converter.py`: End-to-end orchestration
- `auto_build_new_models.py`: Automatic model discovery
- `build_llama.py`: Patched llama.cpp builder
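
For orientation, `build_llama.py` automates roughly the standard upstream build shown below and applies the project's own patches on top. This is a generic llama.cpp build for context, not the script's exact behavior:

```bash
# Plain upstream llama.cpp build (build_llama.py adds project-specific patches):
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON        # omit -DGGML_CUDA=ON for a CPU-only build
cmake --build build --config Release
```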

### 3. Catalog Management

- Redis-based model tracking
- Metadata version control
- Conversion status monitoring
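
You can also inspect the catalog directly with `redis-cli`. The key layout below (`model:<org>/<name>` hashes holding status metadata) is an assumption for illustration; the real schema is defined by the catalog code:

```bash
# Peek at conversion state straight from Redis (key and field names are assumptions):
redis-cli -h "$REDIS_SERVER" -p "$REDIS_PORT" -a "$REDIS_PASSWORD" KEYS 'model:*'
redis-cli -h "$REDIS_SERVER" -p "$REDIS_PORT" -a "$REDIS_PASSWORD" HGETALL 'model:microsoft/phi-2'
```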

## 📋 Key Features

- **One-Click Conversions** from Hugging Face to GGUF
- **Smart Quantization** with configurable presets
- **Redis Catalog** for enterprise-scale management
- **Automatic Patching** of llama.cpp
- **CI/CD Ready** pipelines

## 🏁 Getting Started

### Quick Start

Before running anything, set the required environment variables: `HF_API_TOKEN` for Hugging Face API access, plus `REDIS_SERVER`, `REDIS_PORT`, and `REDIS_PASSWORD` for the Redis server that stores model conversion progress.
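
In a POSIX shell that looks like this (the values are placeholders; substitute your own):

```bash
export HF_API_TOKEN=xxxxx           # Hugging Face API token
export REDIS_SERVER=yourserver.com  # Redis host tracking conversion progress
export REDIS_PORT=6379
export REDIS_PASSWORD=xxxx
```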

```bash
git clone https://github.com/yourorg/GGUFModelBuilder
cd GGUFModelBuilder
python model_converter.py
```

This will attempt to download and convert the top 100 models on Hugging Face. Warning: this is very resource-intensive.

### System Requirements

- Python 3.8+
- Redis Server
- NVIDIA CUDA (optional)
- 200GB+ disk space