# Model Converter (llama.cpp patch) - Mungert69/GGUFModelBuilder GitHub Wiki

## What is build_llama.py?

`build_llama.py` is a utility script that automates the process of:
- Keeping your local `llama.cpp` repository up to date
- Applying custom patches (for quantization and imatrix support)
- Building the `llama.cpp` binaries with the correct CMake options
- Copying the resulting binaries to the main `llama.cpp` directory for use by the rest of the pipeline

This ensures that your model conversion and quantization tools always use the latest, properly patched version of `llama.cpp`.
## What Does It Do?

1. **Prepares the `llama.cpp` repo:**
   - Resets any local changes (stashes and drops them if present)
   - Cleans untracked files
   - Pulls the latest changes from GitHub
2. **Applies custom patches:**
   - Tries to apply two patch files (`my_quant_changes.patch` and `imatrix_word_boundary.patch`) using several methods: the `patch` command, `git apply`, and a 3-way merge if needed
   - If patching fails, it provides hints for manual resolution
3. **Configures and builds `llama.cpp`:**
   - Runs CMake with OpenBLAS enabled and CURL disabled
   - Builds the binaries in release mode
4. **Copies binaries:**
   - Copies all built binaries from the build directory to the main `llama.cpp` directory
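The four steps above can be sketched as the command sequences the script runs at each stage. This is a minimal sketch based on this page's description only: the paths, patch names, helper function names, and the exact CMake flag names are assumptions, not `build_llama.py`'s actual code.

```python
# Sketch of the per-step command sequences described above.
# NOTE: paths, helper names, and CMake flag names are assumptions
# taken from this wiki page, not from build_llama.py itself.
import shutil
from pathlib import Path

LLAMA_DIR = Path("~/code/models/llama.cpp").expanduser()  # expected repo location
PATCHES = ["my_quant_changes.patch", "imatrix_word_boundary.patch"]

def repo_prep_cmds():
    """Step 1: reset local changes, clean untracked files, pull latest."""
    return [
        ["git", "stash"],          # park any local changes...
        ["git", "stash", "drop"],  # ...and discard them
        ["git", "clean", "-fd"],   # remove untracked files
        ["git", "pull"],           # fetch and merge upstream
    ]

def patch_attempts(patch_file):
    """Step 2: fallback chain for applying one patch file."""
    return [
        ["patch", "-p1", "-i", patch_file],      # plain patch(1) first
        ["git", "apply", patch_file],            # then git's applier
        ["git", "apply", "--3way", patch_file],  # 3-way merge as a last resort
    ]

def build_cmds():
    """Step 3: configure with OpenBLAS, CURL disabled; build in release mode."""
    return [
        ["cmake", "-B", "build",
         "-DGGML_BLAS=ON", "-DGGML_BLAS_VENDOR=OpenBLAS",  # assumed flag names
         "-DLLAMA_CURL=OFF"],
        ["cmake", "--build", "build", "--config", "Release"],
    ]

def copy_binaries(build_bin=LLAMA_DIR / "build" / "bin", dest=LLAMA_DIR):
    """Step 4: copy every built binary into the main llama.cpp directory."""
    for f in Path(build_bin).iterdir():
        if f.is_file():
            shutil.copy2(f, dest)
```

Running each command list through `subprocess.run(cmd, cwd=LLAMA_DIR, check=True)`, and stopping at the first patch attempt that succeeds, reproduces the flow described above.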
## Usage

### Command-Line

```sh
python build_llama.py
```

No arguments are needed. The script expects:

- Your `llama.cpp` repo at `~/code/models/llama.cpp`
- Patch files in the current directory: `my_quant_changes.patch` and `imatrix_word_boundary.patch`
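Since the script exits early when these expectations are not met, its preflight check can be sketched roughly as follows. The function name `preflight` and the exact messages are illustrative; only the paths come from this page.

```python
# Illustrative preflight check: verify the repo and patch files exist
# before doing any work. Paths are the ones this page documents;
# the function and messages are a sketch, not the script's real code.
from pathlib import Path

def preflight(repo_dir="~/code/models/llama.cpp",
              patch_files=("my_quant_changes.patch",
                           "imatrix_word_boundary.patch")):
    """Return a list of problems; an empty list means it is safe to proceed."""
    problems = []
    repo = Path(repo_dir).expanduser()
    if not (repo / ".git").is_dir():
        problems.append(f"llama.cpp repo not found at {repo}")
    for patch in patch_files:
        if not Path(patch).is_file():
            problems.append(f"missing patch file: {patch}")
    return problems

if __name__ == "__main__":
    issues = preflight()
    if issues:
        raise SystemExit("\n".join(issues))  # the real script exits with an error here
```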
## What Happens on Run?

- If the `llama.cpp` directory or patch files are missing, it will exit with an error
- If all steps succeed, you'll see "Build successful!" and the binaries will be ready for use
## When Should You Use It?

- After pulling new changes from the upstream `llama.cpp` repo
- After updating your patch files
- Before running model conversions, if you want to ensure you have the latest and correctly patched binaries
## Example Output

```text
Forcefully resetting repository...
Pulling latest changes...
Attempting system patch from src directory...
Configuring build...
Building...
Copying binaries...
Build successful!
```
## Integration

`model_converter.py` will call `build_and_copy()` (from this script) automatically in daemon mode after each conversion cycle, so you usually don't need to run it manually unless you're debugging or updating patches.
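In daemon mode this amounts to a loop along the following lines. Only the `build_and_copy()` name comes from this page; the loop structure, parameter names, and `convert_cycle` callback are illustrative assumptions.

```python
# Illustrative daemon cycle: after each conversion pass, rebuild the
# patched llama.cpp binaries so the next pass uses fresh tools.
# Only build_and_copy() is named by this wiki; the rest is a sketch.
import time

def daemon_loop(convert_cycle, build_and_copy, cycles=None, pause_s=0):
    """Run conversion cycles, rebuilding after each; cycles=None runs forever."""
    done = 0
    while cycles is None or done < cycles:
        convert_cycle()    # convert/quantize the next batch of models
        build_and_copy()   # refresh the patched llama.cpp binaries
        done += 1
        if pause_s:
            time.sleep(pause_s)  # optional breather between cycles
    return done
```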
## Troubleshooting

- If patching fails, the script will print detailed errors and hints for regenerating the patch or resolving upstream changes
- If the build fails, check the output for CMake or compiler errors
## Summary Table

| Step | What it does |
|---|---|
| Prepare repo | Reset, clean, pull latest `llama.cpp` |
| Apply patches | Try `patch`, `git apply`, or 3-way merge |
| Build | CMake configure and build with OpenBLAS |
| Copy binaries | Copy built binaries to main `llama.cpp` directory |