Model Converter (llama.cpp patch) - Mungert69/GGUFModelBuilder GitHub Wiki

What is build_llama.py?

build_llama.py is a utility script that automates the process of:

  • Keeping your local llama.cpp repository up to date
  • Applying custom patches (for quantization and imatrix support)
  • Building the llama.cpp binaries with the correct CMake options
  • Copying the resulting binaries to the main llama.cpp directory for use by the rest of the pipeline

This ensures that your model conversion and quantization tools always use the latest and properly patched version of llama.cpp.


What Does It Do?

Prepares the llama.cpp repo:

  • Resets any local changes (stashes and drops them if present)
  • Cleans untracked files
  • Pulls the latest changes from GitHub
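The repo-preparation step can be sketched with `subprocess`. The exact git invocations inside build_llama.py may differ; `prepare_repo_commands` and `prepare_repo` are illustrative names, not the script's real API:

```python
import subprocess

def prepare_repo_commands():
    """Git commands for a forceful reset and update (illustrative ordering)."""
    return [
        ["git", "stash"],            # park any local changes
        ["git", "stash", "drop"],    # then discard them (no-op safe below)
        ["git", "clean", "-fd"],     # remove untracked files and directories
        ["git", "pull"],             # fetch the latest upstream changes
    ]

def prepare_repo(repo_dir):
    """Run each command inside the llama.cpp checkout."""
    for cmd in prepare_repo_commands():
        # "git stash drop" fails harmlessly when nothing was stashed,
        # so only the other commands are treated as fatal on error.
        optional = cmd == ["git", "stash", "drop"]
        subprocess.run(cmd, cwd=repo_dir, check=not optional)
```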

Applies custom patches:

  • Tries to apply two patch files (my_quant_changes.patch and imatrix_word_boundary.patch) using several methods:
    • patch command
    • git apply
    • 3-way merge if needed
  • If patching fails, it provides hints for manual resolution
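The fallback chain can be sketched as a list of commands tried in order until one succeeds. The function names and exact flags here are assumptions for illustration, not the script's actual code:

```python
import subprocess

def patch_attempts(patch_file):
    """Patching strategies, tried in order (illustrative)."""
    return [
        ["patch", "-p1", "-i", patch_file],      # plain system patch
        ["git", "apply", patch_file],            # git's own patcher
        ["git", "apply", "--3way", patch_file],  # 3-way merge fallback
    ]

def apply_patch(repo_dir, patch_file):
    """Return the command that applied the patch, or None if all failed."""
    for cmd in patch_attempts(patch_file):
        result = subprocess.run(cmd, cwd=repo_dir, capture_output=True)
        if result.returncode == 0:
            return cmd
    return None  # caller then prints hints for manual resolution
```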

Configures and builds llama.cpp:

  • Runs CMake with OpenBLAS enabled and CURL support disabled
  • Builds the binaries in release mode
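A minimal sketch of the configure and build commands. The flag names assume the current llama.cpp CMake options (`GGML_BLAS`, `GGML_BLAS_VENDOR`, `LLAMA_CURL`); the actual script may pass a different set:

```python
def cmake_commands(repo_dir):
    """Return (configure, build) command lists for llama.cpp (illustrative)."""
    build_dir = f"{repo_dir}/build"
    configure = [
        "cmake", "-B", build_dir, "-S", repo_dir,
        "-DGGML_BLAS=ON",               # enable the BLAS backend
        "-DGGML_BLAS_VENDOR=OpenBLAS",  # use OpenBLAS specifically
        "-DLLAMA_CURL=OFF",             # disable CURL
    ]
    build = ["cmake", "--build", build_dir, "--config", "Release", "-j"]
    return configure, build
```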

Copies binaries:

  • Copies all built binaries from the build directory to the main llama.cpp directory
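The copy step amounts to picking the executable files out of the build output. A sketch with `shutil` (the helper name and the executable-bit filter are assumptions about how the script selects binaries):

```python
import os
import shutil

def copy_binaries(build_bin_dir, dest_dir):
    """Copy every executable file from the build output into dest_dir.

    Returns the names copied, in sorted order (illustrative helper).
    """
    copied = []
    for name in sorted(os.listdir(build_bin_dir)):
        src = os.path.join(build_bin_dir, name)
        if os.path.isfile(src) and os.access(src, os.X_OK):
            # copy2 preserves the executable mode and timestamps
            shutil.copy2(src, os.path.join(dest_dir, name))
            copied.append(name)
    return copied
```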

Usage

Command-Line

python build_llama.py

No arguments are needed. The script expects:

  • Your llama.cpp repo at ~/code/models/llama.cpp

  • Patch files in the current directory:

    • my_quant_changes.patch
    • imatrix_word_boundary.patch

What Happens on Run?

  • If the llama.cpp directory or patch files are missing, it will exit with an error
  • If all steps succeed, you’ll see “Build successful!” and the binaries will be ready for use

When Should You Use It?

  • After pulling new changes from the upstream llama.cpp repo
  • After updating your patch files
  • Before running model conversions if you want to ensure you have the latest and correctly patched binaries

Example Output

Forcefully resetting repository...
Pulling latest changes...
Attempting system patch from src directory...
Configuring build...
Building...
Copying binaries...

Build successful!

Integration

model_converter.py will call build_and_copy() (from this script) automatically in daemon mode after each conversion cycle, so you usually don’t need to run it manually unless you’re debugging or updating patches.
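The daemon-mode hookup reduces to "convert, then rebuild". A sketch of one iteration, where both callables stand in for the real functions in model_converter.py and build_llama.py:

```python
def daemon_cycle(convert_models, build_and_copy):
    """One daemon iteration: run a conversion cycle, then refresh binaries.

    convert_models and build_and_copy are injected stand-ins for the
    actual functions; the real daemon loop may differ.
    """
    convert_models()   # run one conversion cycle
    build_and_copy()   # rebuild and copy the patched llama.cpp binaries
```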


Troubleshooting

  • If patching fails, the script will print detailed errors and hints for regenerating the patch or resolving upstream changes
  • If the build fails, check the output for CMake or compiler errors

Summary Table

Step          | What it does
--------------|-----------------------------------------------------
Prepare repo  | Reset, clean, and pull the latest llama.cpp
Apply patches | Try patch, git apply, or a 3-way merge
Build         | Configure with CMake and build with OpenBLAS
Copy binaries | Copy built binaries to the main llama.cpp directory