# Model Converter (convert) - Mungert69/GGUFModelBuilder GitHub Wiki

## What is `model_converter.py`?

`model_converter.py` is the central automation script for managing the end-to-end process of:
- Downloading models from Hugging Face
- Converting them to GGUF format (for llama.cpp)
- Quantizing and uploading them
- Managing a Redis-based catalog of models
- Handling disk space and cache
- Detecting Mixture-of-Experts (MoE) models
- Running as a batch daemon or for a single model

It is designed to be run as a service (daemon) or for single-model conversion.
## Key Features

- **Disk Space Management**: Checks and manages disk space before conversion, cleaning up cache and old files as needed.
- **Redis Catalog Integration**: Tracks all models, their conversion status, attempts, quantizations, errors, and metadata in a Redis database.
- **Hugging Face Integration**: Authenticates, downloads, and uploads models using the Hugging Face Hub API.
- **MoE Model Detection**: Detects whether a model is a Mixture-of-Experts using config and README analysis.
- **Batch and Single-Model Processing**: Can process all unconverted models in the catalog or a single specified model.
- **Pipeline Automation**: Runs the full pipeline: download, convert, quantize, upload, and update the catalog.
- **Cache and Cleanup**: Aggressively cleans up cache and working directories to save space.
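The MoE detection step can be sketched as a simple heuristic over `config.json` and the README text. The config keys checked below are common conventions in expert-routed models (e.g. Mixtral-style configs) and are assumptions for illustration, not necessarily the script's exact logic:

```python
import json

def is_moe_model(config_text, readme_text=""):
    """Heuristic MoE check. The config keys below are common conventions;
    the real script's detection logic may differ."""
    try:
        config = json.loads(config_text)
    except ValueError:
        config = {}
    moe_keys = ("num_experts", "num_local_experts", "num_experts_per_tok")
    if any(key in config for key in moe_keys):
        return True
    # Fall back to scanning the README for MoE wording
    lowered = readme_text.lower()
    return "mixture-of-experts" in lowered or "mixture of experts" in lowered
```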
## How Does It Work?

### 1. Initialization

- Loads environment variables (for Redis, Hugging Face, etc.)
- Connects to Redis and Hugging Face
- Sets up disk-space and conversion parameters
### 2. Model Selection

- Loads trending models from Hugging Face or uses the Redis catalog
- Updates the catalog with new models if needed
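The catalog-update step, sketched with a plain dict standing in for Redis. The entry fields mirror the metadata this wiki describes (status, attempts, quantizations, errors), but the exact schema is an assumption:

```python
def update_catalog(catalog, model_ids):
    """Add any models not yet in the catalog with a fresh, unconverted
    entry. Returns how many new entries were created."""
    added = 0
    for model_id in model_ids:
        if model_id not in catalog:
            catalog[model_id] = {
                "converted": False,
                "attempts": 0,
                "quantizations": [],
                "error_log": [],
            }
            added += 1
    return added
```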
### 3. Conversion Pipeline

For each model:
- Checks disk space and cleans up if needed
- Downloads and converts the model to GGUF (BF16) using `download_convert.py`
- Quantizes the model using `make_files.py` (with special handling for MoE if needed)
- Uploads all quantized files using `upload-files.py`
- Updates the catalog with conversion status, attempts, errors, and quantizations
### 4. Modes of Operation

- **Daemon Mode**: Runs in a loop, processing all unconverted models every 15 minutes
- **Single Model Mode**: Processes a single model specified by the user
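The daemon mode is essentially the conversion cycle plus a 15-minute sleep. A sketch (the `max_cycles` hook is added here purely so the loop can be exercised in tests; the real daemon runs indefinitely):

```python
import time

def start_daemon(run_cycle, interval_s=15 * 60, max_cycles=None):
    """Call run_cycle() repeatedly, sleeping interval_s between passes.
    max_cycles is a test hook, not part of the real script."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_cycle()
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(interval_s)
```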
## Usage

### Command-Line Usage

```shell
python model_converter.py --daemon
```

Runs as a continuous service, processing all unconverted models in the catalog.

```shell
python model_converter.py --single company/model_name
```

Processes a single model (downloads, converts, quantizes, uploads, updates the catalog).

### Arguments

- `--daemon`: Run as a continuous service (batch mode)
- `--single MODEL_NAME`: Process a specific model (e.g., `ibm-granite/granite-4.0-tiny-preview`)
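The two arguments map naturally onto a mutually exclusive `argparse` group. A sketch of how such a CLI might be declared (the help strings and function name are assumptions, not the script's actual code):

```python
import argparse

def build_parser():
    """Declare the two mutually exclusive modes described above."""
    parser = argparse.ArgumentParser(description="GGUF model conversion pipeline")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--daemon", action="store_true",
                      help="run as a continuous batch service")
    mode.add_argument("--single", metavar="MODEL_NAME",
                      help="process one model, e.g. ibm-granite/granite-4.0-tiny-preview")
    return parser
```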
## Main Methods

- `convert_model(model_id, is_moe)`: Runs the full pipeline for a single model
- `run_conversion_cycle()`: Processes all unconverted models in the catalog
- `start_daemon()`: Runs `run_conversion_cycle()` in a loop with a 15-minute sleep
- `update_catalog(models)`: Adds new models to the Redis catalog
- `can_fit_model(model_id)`: Checks whether there is enough disk space for conversion
- `cleanup_hf_cache(model_id)`: Cleans up the Hugging Face cache for one model or all models
- `aggressive_cache_cleanup()`: Cleans all caches and temp files
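Of these, the disk-space check can be sketched with the standard library alone. The 3x safety factor (BF16 conversion and quantized outputs sit alongside the download) is an illustrative assumption, and the real method takes a `model_id` rather than a byte count:

```python
import shutil

def can_fit_model(estimated_bytes, work_dir=".", safety_factor=3.0):
    """Rough disk check: conversion plus quantization needs several times
    the download size; the factor here is an assumption."""
    free = shutil.disk_usage(work_dir).free
    return free >= estimated_bytes * safety_factor
```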
## Example Flow

```mermaid
flowchart TD
    A["Start (daemon or single)"] --> B["Load/Update Catalog"]
    B --> C{"Enough Disk Space?"}
    C -- "No" --> D["Cleanup Cache"]
    D --> C
    C -- "Yes" --> E["Download & Convert (BF16)"]
    E --> F["Quantize (Q4_K, Q6_K, etc.)"]
    F --> G["Upload to HF"]
    G --> H["Update Catalog"]
    H --> I{"More Models?"}
    I -- "Yes" --> C
    I -- "No" --> J["Sleep/Exit"]
```
## Best Practices & Recommendations

- **Environment**: Ensure `.env` contains valid Hugging Face and Redis credentials
- **Redis**: The catalog is persistent and shared; use the web UI for manual edits if needed
- **Disk Space**: The script is aggressive about cleaning up, but you should monitor disk usage if running many large models
- **Extensibility**: You can add new quantization types or model handling by editing the helper scripts (`make_files.py`, etc.)
- **Error Handling**: Errors are logged in the Redis catalog for each model; check the `error_log` field for troubleshooting
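Per-model error handling can be sketched as appending to the `error_log` field and bumping the attempt counter. The field names follow this wiki's description; the Redis serialization details are omitted and a plain dict stands in for the catalog:

```python
def record_error(catalog, model_id, message):
    """Append an error message to a model's error_log and count the attempt."""
    entry = catalog.setdefault(model_id, {})
    entry.setdefault("error_log", []).append(message)
    entry["attempts"] = entry.get("attempts", 0) + 1
    return entry
```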
## When to Use

- Automated LLM conversion and quantization for llama.cpp
- Batch processing of many models
- Catalog management for a large set of models
- Continuous integration for new/trending models
## Summary

`model_converter.py` is the automation backbone for this codebase, orchestrating the full lifecycle of LLM conversion, quantization, cataloging, and uploading, with robust error handling and resource management. Use it as your main entry point for large-scale or automated model processing.