Model Converter (convert) - Mungert69/GGUFModelBuilder GitHub Wiki

What is model_converter.py?

model_converter.py is the central automation script for managing the end-to-end process of:

  • Downloading models from Hugging Face
  • Converting them to GGUF format (for llama.cpp)
  • Quantizing and uploading them
  • Managing a Redis-based catalog of models
  • Handling disk space and cache
  • Detecting Mixture-of-Experts (MoE) models
  • Running as a batch daemon or for a single model

It can run either as a long-lived service (daemon) or as a one-shot converter for a single model.


Key Features

  • Disk Space Management: Checks and manages disk space before conversion, cleans up cache and old files as needed.
  • Redis Catalog Integration: Tracks all models, their conversion status, attempts, quantizations, errors, and metadata in a Redis database.
  • Hugging Face Integration: Authenticates, downloads, and uploads models using the Hugging Face Hub API.
  • MoE Model Detection: Detects if a model is a Mixture-of-Experts using config and README analysis.
  • Batch and Single Model Processing: Can process all unconverted models in the catalog or a single specified model.
  • Pipeline Automation: Runs the full pipeline: download, convert, quantize, upload, and update catalog.
  • Cache and Cleanup: Aggressively cleans up cache and working directories to save space.
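As a rough illustration of the disk-space management feature, a guard like the one below can decide whether a conversion should start. This is a hedged sketch, not the script's actual implementation: the 3x headroom factor and the 50 GiB floor are illustrative assumptions.

```python
import shutil

GIB = 1024 ** 3

def can_fit_model(estimated_model_bytes: int, work_dir: str = "/tmp",
                  headroom_factor: float = 3.0,
                  floor_bytes: int = 50 * GIB) -> bool:
    """Return True if work_dir has room for the full conversion.

    Conversion keeps the downloaded weights, the BF16 GGUF, and the
    quantized outputs on disk at the same time, hence the headroom
    multiplier over the raw model size.
    """
    free = shutil.disk_usage(work_dir).free
    needed = max(int(estimated_model_bytes * headroom_factor), floor_bytes)
    return free >= needed
```

If the check fails, the script falls back to cache cleanup (see `cleanup_hf_cache` and `aggressive_cache_cleanup` below) and re-checks before giving up.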

How Does It Work?

1. Initialization

  • Loads environment variables (for Redis, Hugging Face, etc.)
  • Connects to Redis and Hugging Face
  • Sets up disk space and conversion parameters
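The initialization step amounts to turning `.env` values into a typed configuration. The sketch below assumes variable names such as `REDIS_HOST` and `HF_API_TOKEN` for illustration; the real script may use different names.

```python
import os
from dataclasses import dataclass

@dataclass
class ConverterConfig:
    redis_host: str
    redis_port: int
    hf_token: str
    min_free_gb: int

def load_config() -> ConverterConfig:
    """Build converter settings from the environment (as loaded from .env)."""
    return ConverterConfig(
        redis_host=os.environ.get("REDIS_HOST", "localhost"),
        redis_port=int(os.environ.get("REDIS_PORT", "6379")),
        # Fail fast if the Hugging Face token is missing, since both
        # downloads and uploads depend on it.
        hf_token=os.environ["HF_API_TOKEN"],
        min_free_gb=int(os.environ.get("MIN_FREE_GB", "50")),
    )
```

With the config in hand, the script opens its Redis connection and authenticates against the Hugging Face Hub.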

2. Model Selection

  • Loads trending models from Hugging Face or uses the Redis catalog
  • Updates the catalog with new models if needed
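Conceptually, the catalog update only inserts models it has not seen before, so existing conversion state is never clobbered. In this sketch a plain dict stands in for the Redis hash, and the per-model field names mirror those described on this page (status, attempts, quantizations, errors).

```python
def update_catalog(catalog: dict, model_ids: list) -> int:
    """Insert unseen models with a fresh status record; return count added."""
    added = 0
    for model_id in model_ids:
        if model_id not in catalog:
            catalog[model_id] = {
                "converted": False,
                "attempts": 0,
                "quantizations": [],
                "error_log": [],
            }
            added += 1
    return added
```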

3. Conversion Pipeline

For each model:

  • Checks disk space and cleans up if needed
  • Downloads and converts the model to GGUF (BF16) using download_convert.py
  • Quantizes the model using make_files.py (with special handling for MoE if needed)
  • Uploads all quantized files using upload-files.py
  • Updates the catalog with conversion status, attempts, errors, and quantizations
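The per-model control flow above can be sketched as follows. In the real script each step shells out to a helper (download_convert.py for the BF16 GGUF, make_files.py for quantization, upload-files.py for upload); here the steps are injected callables so the flow stays visible, and the exact helper invocations are assumptions.

```python
def convert_model(model_id, steps, catalog):
    """Run the named pipeline steps in order, recording the outcome."""
    entry = catalog.setdefault(
        model_id, {"attempts": 0, "converted": False, "error_log": []})
    entry["attempts"] += 1
    name = "start"
    try:
        # e.g. steps = [("convert", ...), ("quantize", ...), ("upload", ...)]
        for name, step in steps:
            step(model_id)
        entry["converted"] = True
        return True
    except Exception as exc:
        # The failed step and error go to the model's error_log, mirroring
        # the catalog fields described above, so a later cycle can retry.
        entry["error_log"].append(f"{name}: {exc}")
        return False
```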

4. Modes of Operation

  • Daemon Mode: Runs in a loop, processing all unconverted models every 15 minutes
  • Single Model Mode: Processes a single model specified by the user
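The daemon loop amounts to "run a cycle, sleep 15 minutes, repeat". In this sketch the sleep function and `max_cycles` cap are testing conveniences, not features of the real script, which runs until stopped.

```python
import time

def start_daemon(run_cycle, sleep=time.sleep,
                 interval_s=15 * 60, max_cycles=None):
    """Run run_cycle() in a loop with a fixed sleep between iterations."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_cycle()
        cycles += 1
        sleep(interval_s)
    return cycles
```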

Usage

Command-Line Usage

```shell
python model_converter.py --daemon
```

Runs as a continuous service, processing all unconverted models in the catalog.

```shell
python model_converter.py --single company/model_name
```

Processes a single model (downloads, converts, quantizes, uploads, updates catalog).

Arguments

  • --daemon: Run as a continuous service (batch mode)
  • --single MODEL_NAME: Process a specific model (e.g., ibm-granite/granite-4.0-tiny-preview)
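A minimal argparse definition matching the interface above might look like this (the real script may define additional options):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="GGUF model converter")
    # The two modes are mutually exclusive: run forever, or do one model.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--daemon", action="store_true",
                       help="run continuously, processing unconverted models")
    group.add_argument("--single", metavar="MODEL_NAME",
                       help="process one model, e.g. company/model_name")
    return parser
```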

Main Methods

  • convert_model(model_id, is_moe): Runs the full pipeline for a single model
  • run_conversion_cycle(): Processes all unconverted models in the catalog
  • start_daemon(): Runs run_conversion_cycle() in a loop with a 15-minute sleep
  • update_catalog(models): Adds new models to the Redis catalog
  • can_fit_model(model_id): Checks if there is enough disk space for conversion
  • cleanup_hf_cache(model_id): Cleans up Hugging Face cache for a model or all models
  • aggressive_cache_cleanup(): Cleans all caches and temp files
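The config-based half of MoE detection can be sketched as a scan of config.json for expert-count keys that MoE architectures commonly expose. The key list below is illustrative, not exhaustive, and the real script additionally analyzes the README.

```python
# Keys various MoE architectures use to declare their expert count.
MOE_CONFIG_KEYS = ("num_local_experts", "num_experts",
                   "n_routed_experts", "moe_num_experts")

def looks_like_moe(config: dict) -> bool:
    """Return True if the parsed config.json declares more than one expert."""
    return any(int(config.get(key) or 0) > 1 for key in MOE_CONFIG_KEYS)
```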

Example Flow

```mermaid
flowchart TD
    A["Start (daemon or single)"] --> B["Load/Update Catalog"]
    B --> C{"Enough Disk Space?"}
    C -- "No" --> D["Cleanup Cache"]
    D --> C
    C -- "Yes" --> E["Download & Convert (BF16)"]
    E --> F["Quantize (Q4_K, Q6_K, etc.)"]
    F --> G["Upload to HF"]
    G --> H["Update Catalog"]
    H --> I{"More Models?"}
    I -- "Yes" --> C
    I -- "No" --> J["Sleep/Exit"]
```


Best Practices & Recommendations

  • Environment: Ensure .env contains valid Hugging Face and Redis credentials
  • Redis: The catalog is persistent and shared; use the web UI for manual edits if needed
  • Disk Space: The script is aggressive about cleaning up, but you should monitor disk usage if running many large models
  • Extensibility: You can add new quantization types or model handling by editing the helper scripts (make_files.py, etc.)
  • Error Handling: Errors are logged in the Redis catalog for each model; check the error_log field for troubleshooting

When to Use

  • Automated LLM conversion and quantization for llama.cpp
  • Batch processing of many models
  • Catalog management for a large set of models
  • Continuous integration for new/trending models

Summary

model_converter.py is the automation backbone for this codebase, orchestrating the full lifecycle of LLM conversion, quantization, cataloging, and uploading, with robust error handling and resource management. Use it as your main entry point for large-scale or automated model processing.