Model Converter (llama.cpp patch) - Mungert69/GGUFModelBuilder GitHub Wiki

What is build_llama.py?

build_llama.py is a utility script that automates the process of:

  • Keeping your local llama.cpp repository up to date
  • Applying custom patches (for quantization and imatrix support)
  • Building the llama.cpp binaries with the correct CMake options
  • Copying the resulting binaries to the main llama.cpp directory for use by the rest of the pipeline

This ensures that your model conversion and quantization tools always use the latest and properly patched version of llama.cpp.


What Does It Do?

Prepares the llama.cpp repo:

  • Resets any local changes (stashes and drops them if present)
  • Cleans untracked files
  • Pulls the latest changes from GitHub
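The repo-preparation step can be sketched with `subprocess`. The exact git invocations inside build_llama.py may differ; `prepare_repo_commands` and `prepare_repo` are illustrative names, not the script's real API:

```python
import subprocess

def prepare_repo_commands():
    """Git commands for a forceful reset and update (illustrative ordering)."""
    return [
        ["git", "stash"],            # park any local changes
        ["git", "stash", "drop"],    # then discard them (no-op safe below)
        ["git", "clean", "-fd"],     # remove untracked files and directories
        ["git", "pull"],             # fetch the latest upstream changes
    ]

def prepare_repo(repo_dir):
    """Run each command inside the llama.cpp checkout."""
    for cmd in prepare_repo_commands():
        # "git stash drop" fails harmlessly when nothing was stashed,
        # so only the other commands are treated as fatal on error.
        optional = cmd == ["git", "stash", "drop"]
        subprocess.run(cmd, cwd=repo_dir, check=not optional)
```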

Applies custom patches:

  • Tries to apply two patch files (my_quant_changes.patch and imatrix_word_boundary.patch) using several methods:
    • patch command
    • git apply
    • 3-way merge if needed
  • If patching fails, it provides hints for manual resolution
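The fallback chain can be sketched as a list of commands tried in order until one succeeds. The function names and exact flags here are assumptions for illustration, not the script's actual code:

```python
import subprocess

def patch_attempts(patch_file):
    """Patching strategies, tried in order (illustrative)."""
    return [
        ["patch", "-p1", "-i", patch_file],      # plain system patch
        ["git", "apply", patch_file],            # git's own patcher
        ["git", "apply", "--3way", patch_file],  # 3-way merge fallback
    ]

def apply_patch(repo_dir, patch_file):
    """Return the command that applied the patch, or None if all failed."""
    for cmd in patch_attempts(patch_file):
        result = subprocess.run(cmd, cwd=repo_dir, capture_output=True)
        if result.returncode == 0:
            return cmd
    return None  # caller then prints hints for manual resolution
```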

Configures and builds llama.cpp:

  • Runs CMake with OpenBLAS enabled and CURL support disabled
  • Builds the binaries in release mode
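A minimal sketch of the configure and build commands. The flag names assume the current llama.cpp CMake options (`GGML_BLAS`, `GGML_BLAS_VENDOR`, `LLAMA_CURL`); the actual script may pass a different set:

```python
def cmake_commands(repo_dir):
    """Return (configure, build) command lists for llama.cpp (illustrative)."""
    build_dir = f"{repo_dir}/build"
    configure = [
        "cmake", "-B", build_dir, "-S", repo_dir,
        "-DGGML_BLAS=ON",               # enable the BLAS backend
        "-DGGML_BLAS_VENDOR=OpenBLAS",  # use OpenBLAS specifically
        "-DLLAMA_CURL=OFF",             # disable CURL
    ]
    build = ["cmake", "--build", build_dir, "--config", "Release", "-j"]
    return configure, build
```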

Copies binaries:

  • Copies all built binaries from the build directory to the main llama.cpp directory
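The copy step amounts to picking the executable files out of the build output. A sketch with `shutil` (the helper name and the executable-bit filter are assumptions about how the script selects binaries):

```python
import os
import shutil

def copy_binaries(build_bin_dir, dest_dir):
    """Copy every executable file from the build output into dest_dir.

    Returns the names copied, in sorted order (illustrative helper).
    """
    copied = []
    for name in sorted(os.listdir(build_bin_dir)):
        src = os.path.join(build_bin_dir, name)
        if os.path.isfile(src) and os.access(src, os.X_OK):
            # copy2 preserves the executable mode and timestamps
            shutil.copy2(src, os.path.join(dest_dir, name))
            copied.append(name)
    return copied
```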

Usage

Command-Line

python build_llama.py

No arguments are needed. The script expects:

  • Your llama.cpp repo at ~/code/models/llama.cpp

  • Patch files in the current directory:

    • my_quant_changes.patch
    • imatrix_word_boundary.patch

What Happens on Run?

  • If the llama.cpp directory or patch files are missing, it will exit with an error
  • If all steps succeed, you’ll see “Build successful!” and the binaries will be ready for use

When Should You Use It?

  • After pulling new changes from the upstream llama.cpp repo
  • After updating your patch files
  • Before running model conversions if you want to ensure you have the latest and correctly patched binaries

Example Output

Forcefully resetting repository...
Pulling latest changes...
Attempting system patch from src directory...
Configuring build...
Building...
Copying binaries...

Build successful!

Integration

model_converter.py will call build_and_copy() (from this script) automatically in daemon mode after each conversion cycle, so you usually don’t need to run it manually unless you’re debugging or updating patches.
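The daemon-mode hookup reduces to "convert, then rebuild". A sketch of one iteration, where both callables stand in for the real functions in model_converter.py and build_llama.py:

```python
def daemon_cycle(convert_models, build_and_copy):
    """One daemon iteration: run a conversion cycle, then refresh binaries.

    convert_models and build_and_copy are injected stand-ins for the
    actual functions; the real daemon loop may differ.
    """
    convert_models()   # run one conversion cycle
    build_and_copy()   # rebuild and copy the patched llama.cpp binaries
```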


Troubleshooting

  • If patching fails, the script will print detailed errors and hints for regenerating the patch or resolving upstream changes
  • If the build fails, check the output for CMake or compiler errors

Summary Table

Step          | What it does
--------------|-----------------------------------------------------
Prepare repo  | Reset, clean, and pull the latest llama.cpp
Apply patches | Try patch, git apply, or a 3-way merge
Build         | Configure with CMake and build with OpenBLAS
Copy binaries | Copy built binaries to the main llama.cpp directory