# Model Converter (llama.cpp patch) - Mungert69/GGUFModelBuilder GitHub Wiki
## What is `build_llama.py`?
`build_llama.py` is a utility script that automates the process of:
- Keeping your local `llama.cpp` repository up to date
- Applying custom patches (for quantization and imatrix support)
- Building the `llama.cpp` binaries with the correct CMake options
- Copying the resulting binaries to the main `llama.cpp` directory for use by the rest of the pipeline
This ensures that your model conversion and quantization tools always use the latest, properly patched version of `llama.cpp`.
## What Does It Do?
**Prepares the `llama.cpp` repo:**
- Resets any local changes (stashes and drops them if present)
- Cleans untracked files
- Pulls the latest changes from GitHub
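The prepare step above can be sketched as a short sequence of git commands run via `subprocess`. This is a minimal sketch, assuming conventional flags; the exact commands in `build_llama.py` may differ:

```python
import subprocess

def prepare_repo_commands():
    """Git commands that reset, clean, and update the repo (assumed flags)."""
    return [
        ["git", "reset", "--hard", "HEAD"],   # drop any local changes
        ["git", "clean", "-fd"],              # remove untracked files
        ["git", "pull", "origin", "master"],  # pull the latest upstream commits
    ]

def prepare_repo(repo_dir):
    """Run each prepare command inside the repo, stopping on the first failure."""
    for cmd in prepare_repo_commands():
        subprocess.run(cmd, cwd=repo_dir, check=True)
```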
**Applies custom patches:**
- Tries to apply two patch files (`my_quant_changes.patch` and `imatrix_word_boundary.patch`) using several methods: the `patch` command, `git apply`, and a 3-way merge if needed
- If patching fails, it provides hints for manual resolution
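The fallback chain can be sketched as follows; the `patch -p1` and `git apply` invocations are assumptions about typical usage, not copied from the script:

```python
import subprocess

def patch_attempts(patch_file):
    """Ordered (label, command) pairs to try when applying a patch."""
    return [
        ("patch", ["patch", "-p1", "-i", patch_file]),
        ("git apply", ["git", "apply", patch_file]),
        ("git apply --3way", ["git", "apply", "--3way", patch_file]),
    ]

def apply_patch(patch_file, repo_dir):
    """Try each method in order; return the label that succeeded, or None."""
    for label, cmd in patch_attempts(patch_file):
        if subprocess.run(cmd, cwd=repo_dir).returncode == 0:
            return label
    return None  # all methods failed; resolve manually
```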
**Configures and builds `llama.cpp`:**
- Runs CMake with OpenBLAS enabled and CURL disabled
- Builds the binaries in release mode
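A hedged sketch of the configure/build step; the CMake flag names (`GGML_BLAS`, `GGML_BLAS_VENDOR`, `LLAMA_CURL`) are assumptions based on common `llama.cpp` options, not copied from the script:

```python
def cmake_commands(src_dir, build_dir):
    """Return (configure, build) command lists for a release build with OpenBLAS."""
    configure = [
        "cmake", "-S", src_dir, "-B", build_dir,
        "-DCMAKE_BUILD_TYPE=Release",
        "-DGGML_BLAS=ON",                 # assumed flag enabling BLAS
        "-DGGML_BLAS_VENDOR=OpenBLAS",    # assumed flag selecting OpenBLAS
        "-DLLAMA_CURL=OFF",               # disable CURL
    ]
    build = ["cmake", "--build", build_dir, "--config", "Release", "-j"]
    return configure, build
```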
**Copies binaries:**
- Copies all built binaries from the build directory to the main `llama.cpp` directory
## Usage
### Command-Line
```bash
python build_llama.py
```
No arguments are needed. The script expects:
- Your `llama.cpp` repo at `~/code/models/llama.cpp`
- Patch files in the current directory: `my_quant_changes.patch` and `imatrix_word_boundary.patch`
### What Happens on Run?
- If the `llama.cpp` directory or patch files are missing, it will exit with an error
- If all steps succeed, you’ll see “Build successful!” and the binaries will be ready for use
## When Should You Use It?
- After pulling new changes from the upstream `llama.cpp` repo
- After updating your patch files
- Before running model conversions if you want to ensure you have the latest and correctly patched binaries
## Example Output
```text
Forcefully resetting repository...
Pulling latest changes...
Attempting system patch from src directory...
Configuring build...
Building...
Copying binaries...
Build successful!
```
## Integration
`model_converter.py` will call `build_and_copy()` (from this script) automatically in daemon mode after each conversion cycle, so you usually don’t need to run it manually unless you’re debugging or updating patches.
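In daemon mode the interaction looks roughly like the loop below; apart from `build_and_copy()`, the function names are hypothetical placeholders, not taken from `model_converter.py`:

```python
def daemon_loop(convert_next_model, build_and_copy, cycles=None):
    """Run conversion cycles, rebuilding llama.cpp binaries after each one.

    convert_next_model and build_and_copy are injected callables; cycles=None
    runs forever, while an integer limits the loop (useful for testing).
    """
    done = 0
    while cycles is None or done < cycles:
        convert_next_model()   # run one conversion cycle
        build_and_copy()       # refresh the patched llama.cpp binaries
        done += 1
```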
## Troubleshooting
- If patching fails, the script will print detailed errors and hints for regenerating the patch or resolving upstream changes
- If the build fails, check the output for CMake or compiler errors
## Summary Table
| Step | What it does |
|---|---|
| Prepare repo | Reset, clean, pull latest `llama.cpp` |
| Apply patches | Try `patch`, `git apply`, or 3-way merge |
| Build | CMake configure and build with OpenBLAS |
| Copy binaries | Copy built binaries to main `llama.cpp` directory |