# Model Converter (convert) - Mungert69/GGUFModelBuilder GitHub Wiki

## What is `model_converter.py`?

`model_converter.py` is the central automation script for managing the end-to-end process of:
- Downloading models from Hugging Face
- Converting them to GGUF format (for llama.cpp)
- Quantizing and uploading them
- Managing a Redis-based catalog of models
- Handling disk space and cache
- Detecting Mixture-of-Experts (MoE) models
- Running as a batch daemon or for a single model

It is designed to be run as a service (daemon) or for single-model conversion.
## Key Features

- **Disk Space Management**: Checks and manages disk space before conversion, cleaning up cache and old files as needed.
- **Redis Catalog Integration**: Tracks all models, their conversion status, attempts, quantizations, errors, and metadata in a Redis database.
- **Hugging Face Integration**: Authenticates, downloads, and uploads models using the Hugging Face Hub API.
- **MoE Model Detection**: Detects whether a model is a Mixture-of-Experts using config and README analysis.
- **Batch and Single-Model Processing**: Can process all unconverted models in the catalog or a single specified model.
- **Pipeline Automation**: Runs the full pipeline: download, convert, quantize, upload, and update the catalog.
- **Cache and Cleanup**: Aggressively cleans up cache and working directories to save space.
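The MoE detection step can be sketched as a simple heuristic over `config.json` and the README text. The config keys checked below are common conventions in expert-routed models (e.g. Mixtral-style configs) and are assumptions for illustration, not necessarily the script's exact logic:

```python
import json

def is_moe_model(config_text, readme_text=""):
    """Heuristic MoE check. The config keys below are common conventions;
    the real script's detection logic may differ."""
    try:
        config = json.loads(config_text)
    except ValueError:
        config = {}
    moe_keys = ("num_experts", "num_local_experts", "num_experts_per_tok")
    if any(key in config for key in moe_keys):
        return True
    # Fall back to scanning the README for MoE wording
    lowered = readme_text.lower()
    return "mixture-of-experts" in lowered or "mixture of experts" in lowered
```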
## How Does It Work?

### 1. Initialization

- Loads environment variables (for Redis, Hugging Face, etc.)
- Connects to Redis and Hugging Face
- Sets up disk-space and conversion parameters
### 2. Model Selection

- Loads trending models from Hugging Face or uses the Redis catalog
- Updates the catalog with new models if needed
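The catalog-update step, sketched with a plain dict standing in for Redis. The entry fields mirror the metadata this wiki describes (status, attempts, quantizations, errors), but the exact schema is an assumption:

```python
def update_catalog(catalog, model_ids):
    """Add any models not yet in the catalog with a fresh, unconverted
    entry. Returns how many new entries were created."""
    added = 0
    for model_id in model_ids:
        if model_id not in catalog:
            catalog[model_id] = {
                "converted": False,
                "attempts": 0,
                "quantizations": [],
                "error_log": [],
            }
            added += 1
    return added
```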
### 3. Conversion Pipeline

For each model:
- Checks disk space and cleans up if needed
- Downloads and converts the model to GGUF (BF16) using `download_convert.py`
- Quantizes the model using `make_files.py` (with special handling for MoE if needed)
- Uploads all quantized files using `upload-files.py`
- Updates the catalog with conversion status, attempts, errors, and quantizations
### 4. Modes of Operation

- **Daemon Mode**: Runs in a loop, processing all unconverted models every 15 minutes
- **Single Model Mode**: Processes a single model specified by the user
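The daemon mode is essentially the conversion cycle plus a 15-minute sleep. A sketch (the `max_cycles` hook is added here purely so the loop can be exercised in tests; the real daemon runs indefinitely):

```python
import time

def start_daemon(run_cycle, interval_s=15 * 60, max_cycles=None):
    """Call run_cycle() repeatedly, sleeping interval_s between passes.
    max_cycles is a test hook, not part of the real script."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_cycle()
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(interval_s)
```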
## Usage

### Command-Line Usage

```shell
python model_converter.py --daemon
```

Runs as a continuous service, processing all unconverted models in the catalog.

```shell
python model_converter.py --single company/model_name
```

Processes a single model (downloads, converts, quantizes, uploads, updates the catalog).

### Arguments

- `--daemon`: Run as a continuous service (batch mode)
- `--single MODEL_NAME`: Process a specific model (e.g., `ibm-granite/granite-4.0-tiny-preview`)
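The two arguments map naturally onto a mutually exclusive `argparse` group. A sketch of how such a CLI might be declared (the help strings and function name are assumptions, not the script's actual code):

```python
import argparse

def build_parser():
    """Declare the two mutually exclusive modes described above."""
    parser = argparse.ArgumentParser(description="GGUF model conversion pipeline")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--daemon", action="store_true",
                      help="run as a continuous batch service")
    mode.add_argument("--single", metavar="MODEL_NAME",
                      help="process one model, e.g. ibm-granite/granite-4.0-tiny-preview")
    return parser
```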
## Main Methods

- `convert_model(model_id, is_moe)`: Runs the full pipeline for a single model
- `run_conversion_cycle()`: Processes all unconverted models in the catalog
- `start_daemon()`: Runs `run_conversion_cycle()` in a loop with a 15-minute sleep
- `update_catalog(models)`: Adds new models to the Redis catalog
- `can_fit_model(model_id)`: Checks whether there is enough disk space for conversion
- `cleanup_hf_cache(model_id)`: Cleans up the Hugging Face cache for one model or all models
- `aggressive_cache_cleanup()`: Cleans all caches and temp files
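Of these, the disk-space check can be sketched with the standard library alone. The 3x safety factor (BF16 conversion and quantized outputs sit alongside the download) is an illustrative assumption, and the real method takes a `model_id` rather than a byte count:

```python
import shutil

def can_fit_model(estimated_bytes, work_dir=".", safety_factor=3.0):
    """Rough disk check: conversion plus quantization needs several times
    the download size; the factor here is an assumption."""
    free = shutil.disk_usage(work_dir).free
    return free >= estimated_bytes * safety_factor
```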
## Example Flow

```mermaid
flowchart TD
    A["Start (daemon or single)"] --> B["Load/Update Catalog"]
    B --> C{"Enough Disk Space?"}
    C -- "No" --> D["Cleanup Cache"]
    D --> C
    C -- "Yes" --> E["Download & Convert (BF16)"]
    E --> F["Quantize (Q4_K, Q6_K, etc.)"]
    F --> G["Upload to HF"]
    G --> H["Update Catalog"]
    H --> I{"More Models?"}
    I -- "Yes" --> C
    I -- "No" --> J["Sleep/Exit"]
```
## Best Practices & Recommendations

- **Environment**: Ensure `.env` contains valid Hugging Face and Redis credentials
- **Redis**: The catalog is persistent and shared; use the web UI for manual edits if needed
- **Disk Space**: The script is aggressive about cleaning up, but you should monitor disk usage if running many large models
- **Extensibility**: You can add new quantization types or model handling by editing the helper scripts (`make_files.py`, etc.)
- **Error Handling**: Errors are logged in the Redis catalog for each model; check the `error_log` field for troubleshooting
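Per-model error handling can be sketched as appending to the `error_log` field and bumping the attempt counter. The field names follow this wiki's description; the Redis serialization details are omitted and a plain dict stands in for the catalog:

```python
def record_error(catalog, model_id, message):
    """Append an error message to a model's error_log and count the attempt."""
    entry = catalog.setdefault(model_id, {})
    entry.setdefault("error_log", []).append(message)
    entry["attempts"] = entry.get("attempts", 0) + 1
    return entry
```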
## When to Use

- Automated LLM conversion and quantization for llama.cpp
- Batch processing of many models
- Catalog management for a large set of models
- Continuous integration for new/trending models
## Summary

`model_converter.py` is the automation backbone for this codebase, orchestrating the full lifecycle of LLM conversion, quantization, cataloging, and uploading, with robust error handling and resource management. Use it as your main entry point for large-scale or automated model processing.