Creating my first test - dav1312/Wiki-test GitHub Wiki

Contributor's Guide

Welcome! This guide outlines the process for contributing code and improvements to Stockfish. Following these steps ensures your changes can be reviewed, tested, and integrated smoothly.

Guiding Principles

  • Be Respectful: Always interact politely and kindly. See the Open Source Etiquette Guidebook.
  • Keep Changes Small: Submit focused, atomic changes. Small patches are easier to review and understand.
  • One Idea Per Test: Each test should focus on a single idea. Avoid bundling multiple changes into one patch.
  • Passing Tests ≠ Automatic Merge: Your change still needs to be reviewed by maintainers, even if it passes all tests. Complex changes require significant benefits to be considered.
  • Join the Community: Participate in the Stockfish Discord server. It's the primary place to communicate with developers and maintainers.

Getting Started: One-Time Setup

You only need to do this once to prepare your environment.

Prerequisites

  • A recent C++ compiler
  • Git installed on your system
  • A GitHub account
  • A Git client (e.g., GitHub Desktop, GitKraken, or the command line)

Initial Setup Steps

  1. Fork the Repository: Create your personal copy (a "fork") of the official Stockfish repository.

  2. Clone Your Fork: Create a local copy of your forked repository on your computer. Use your Git client and the URL from your forked repository page.

Important

You must fork the repository first and then clone your fork. Simply cloning the official repository or creating a copy will not allow you to submit changes correctly.
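On the command line, the two setup steps can be sketched as below. This is a minimal sketch, not an official script: the helper name clone_fork is hypothetical, and the remote name official is chosen only to match the sync script used later in this guide.

```shell
#!/bin/sh
# Hypothetical helper (not part of Stockfish): clone your fork and register
# the official repository as a second remote named 'official'.
OFFICIAL_URL="https://github.com/official-stockfish/Stockfish.git"

clone_fork() {
    fork_url=$1      # URL of YOUR fork, not of the official repository
    target_dir=$2    # local directory to create
    git clone --quiet "$fork_url" "$target_dir" || return 1
    # 'origin' now points at the fork; add the official repo for later syncing
    git -C "$target_dir" remote add official "$OFFICIAL_URL"
}
```

A typical call would be clone_fork https://github.com/yourname/Stockfish.git Stockfish, where yourname stands for your GitHub user name.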

The Contribution Workflow

This is the cycle you will follow for every new change you want to make.

Step 1: Sync Your Fork

Before starting any new work, ensure your local master branch is up to date with the official Stockfish master. Starting from a current master reduces the risk of merge conflicts later.

You can use the provided script for convenience or perform the steps manually with Git.

Sync script and instructions
  1. Save the script below as sync-with-official.sh.
  2. Crucially, edit the cd ./chess/stockfish/src line to point to the src directory of your local Stockfish clone.
  3. Open a terminal, navigate to where you saved the script, and run it with sh sync-with-official.sh.

#!/bin/sh

# Change directory to the folder containing this script
cd "$(dirname "$0")" || exit 1

# !!! EDIT THIS LINE to point to your local Stockfish src directory !!!
# Example: cd /path/to/your/Stockfish/src
# Abort if the directory is missing, so the hard reset below never runs in the wrong repository
cd ./chess/stockfish/src || exit 1

echo
echo "Syncing local master branch with official-stockfish/master..."

# Add the official repo as a remote named 'official' if it doesn't exist
git remote add official https://github.com/official-stockfish/Stockfish.git 2>/dev/null
git remote set-url official https://github.com/official-stockfish/Stockfish.git

echo "--> Switching to local 'master' branch..."
git checkout master

echo "--> Fetching latest changes from 'official'..."
git fetch official

echo "--> Resetting local 'master' to match 'official/master'..."
# WARNING: This discards any local changes on your master branch!
git reset --hard official/master

echo "--> Pushing updated 'master' to your GitHub fork ('origin')..."
git push origin master --force

echo "--> Recompiling Stockfish..."
make clean
make build -j
make net

echo
echo "Sync complete."

Step 2: Create a New Branch and Make Changes

All work should be done on a dedicated branch, not on master. This keeps your changes isolated.

  1. Navigate to your local Stockfish directory in your terminal.
  2. Create and switch to a new branch:
    git checkout -b my-new-feature-branch
    (Choose a descriptive name for your branch.)
  3. Edit the source code to implement your idea.
  4. Compile your changes. You can find detailed instructions at Compiling from source.
  5. Commit your changes with a clear message:
    git commit -am "My clear and descriptive commit message"
  6. Push the branch to your fork on GitHub:
    git push origin my-new-feature-branch

How to prepare specific types of tests (NNUE, SPSA)

Testing a new NNUE net

  1. Name your net correctly. The format is nn-SHA.nnue, where SHA is the first 12 characters of the file's SHA-256 hash. You can get this with sha256sum nn.nnue | cut -c1-12.
  2. Upload the net to Fishtest (requires an account). By uploading, you license the network under CC0.
  3. Update the source code. In your new branch, change the default value of EvalFileDefaultName in evaluate.h to your new net's filename.
  4. Commit and push as usual. Do not add an EvalFile test option in Fishtest; the engine must use the default net.
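The naming rule can be wrapped in a small helper. This is a sketch that assumes GNU sha256sum is available; the function name net_name is hypothetical.

```shell
#!/bin/sh
# Prints the Fishtest-style file name for a net: nn-<first 12 hex chars of SHA-256>.nnue
net_name() {
    sha=$(sha256sum "$1" | cut -c1-12)
    printf 'nn-%s.nnue\n' "$sha"
}
```

For example, mv nn.nnue "$(net_name nn.nnue)" renames a freshly trained net before it is uploaded.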

Tuning with SPSA

SPSA (Simultaneous Perturbation Stochastic Approximation) is used to automatically tune engine parameters.

1. Prepare Your Code for SPSA

  1. On your new branch, move the definitions of the variables you want to tune to the global scope of the Stockfish namespace.
  2. Remove const qualifiers from these variables.
  3. Flag the variables with the TUNE macro. For example:
    // Original variables
    int myKing = 10;
    Score myBonus = S(5, 15);
    
    // Add this line after their definition
    TUNE(myKing, myBonus);
  4. Optionally, you can define custom tuning ranges or post-update functions.
    // Tune myKing in the range [-100, 100] and myQueen in [-20, 20]
    TUNE(SetRange(-100, 100), myKing, SetRange(-20, 20), myQueen);
  5. Compile the source code.
  6. Run ./stockfish from the src directory. It will print a comma-separated list. Copy this list.
  7. Commit and push your changes.

2. Understanding SPSA in Fishtest

Detailed explanation of the SPSA algorithm

The SPSA algorithm in Fishtest works in a loop:

  • Evaluation step: A mini-match is played using two versions of the engine with a parameter set to value - ck and value + ck.
  • Update step: The parameter's value is updated based on the result of the mini-match: value = value + (ck * rk) * (wins - losses).

The ck and rk values control the size of the perturbation and the update step, respectively. They decrease over the course of the test to allow for larger adjustments at the beginning and finer tuning toward the end. The Fishtest SPSA form requires a starting value, min/max clipping values, and final ck and rk values for each parameter.
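To make the arithmetic concrete, here is one iteration for a single parameter, computed with the update rule quoted above. It is an illustration only, not Fishtest's implementation: the function name spsa_step and all numbers are made up, and wins and losses are assumed to be counted from the side of the engine playing value + ck.

```shell
#!/bin/sh
# Illustrative only: one SPSA iteration for a single parameter.
# $1 = current value, $2 = ck, $3 = rk, $4 = wins, $5 = losses
spsa_step() {
    awk -v value="$1" -v ck="$2" -v rk="$3" -v wins="$4" -v losses="$5" 'BEGIN {
        # Evaluation step: the two perturbed values that play the mini-match
        printf "low=%.2f high=%.2f\n", value - ck, value + ck
        # Update step: value = value + (ck * rk) * (wins - losses)
        printf "new=%.2f\n", value + (ck * rk) * (wins - losses)
    }'
}

spsa_step 100 4 0.05 6 4
# prints:
# low=96.00 high=104.00
# new=100.40
```

With a net score of +2 for the higher value, the parameter moves up by 4 * 0.05 * 2 = 0.4. As ck and rk shrink during the test, the same result moves the parameter less.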

For more details, see Issue #535.

Step 3: Run Your Test on Fishtest

Once your branch is on GitHub, you can submit it to Fishtest for performance testing.

Warning

Do not run too many tests at once. Having many active tests can significantly reduce their individual testing throughput (ITP).

  1. Go to https://tests.stockfishchess.org/tests/run.
  2. Base repository: The URL of the official repo: https://github.com/official-stockfish/Stockfish.
  3. Base branch: master.
  4. Test repository: The URL of your forked repo (e.g., https://github.com/yourname/Stockfish).
  5. Test branch: The name of the branch you just pushed (e.g., my-new-feature-branch).
  6. Test signature: Get this by running ./stockfish bench in your src folder. It's the number from the Nodes searched line. (Fishtest can often auto-fill this if you include Bench: XXXXXXX in your commit message.)
  7. Choose a Test Type. For most changes, follow the Standard Testing Methodology.
  8. Info: Write a short but descriptive summary of your change.
  9. Click Submit test.
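The test signature from step 6 can be pulled out of the bench output with a short filter. This is a sketch: the function names bench_signature and bench_line are hypothetical, and the parsing assumes the Nodes searched line has the same layout that the commit hook later in this guide relies on.

```shell
#!/bin/sh
# Reads `stockfish bench` output on stdin and prints the bench signature,
# i.e. the number on the "Nodes searched" line.
bench_signature() {
    awk '/Nodes searched/ { print $4 }'
}

# Same input, but formatted as the last line of a commit message.
bench_line() {
    printf 'Bench: %s\n' "$(bench_signature)"
}
```

From the src folder, ./stockfish bench 2>&1 | bench_line prints a line that can be pasted at the end of the commit message. The 2>&1 redirection is there because the bench summary is read from the combined output, as in the hook script.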

How to submit an SPSA test
  1. Follow the steps above to set up a new test.
  2. In Test options, consider using nodestime for evaluation tuning to reduce hardware noise. A common setting is Hash=128 nodestime=600.
  3. Choose an appropriate time control (TC). 60+0.6 is good for final tunes.
  4. Paste the comma-separated list you copied from ./stockfish into the SPSA Parameters box.
  5. Submit the test.
  6. Monitor the parameters while the tune runs. A good tune should show significant changes in values without being purely random. If values barely change after thousands of games, the ck value may be too low, and the test should be stopped.

Note

You cannot change the number of games for SPSA tests after they have started, as the tuning parameters depend on the initial game count.

Submitting Your Pull Request

Once your change has passed the required tests and you are confident in it, it's time to create a pull request (PR) to merge it into the official repository.

Pull Request Checklist

  • Is your branch up-to-date? Sync your fork and merge the latest master into your feature branch to resolve any conflicts.
  • Is it a single commit? If your branch has multiple messy commits, "squash" them into a single, clean commit.
  • Is the code clean? Ensure your code matches the surrounding style, with no trailing whitespace.
  • Is the commit message high-quality? It should explain the "what" and the "why" of your change. This message becomes the PR description.
  • Did you include test results? Link to the passed STC and LTC tests on Fishtest.
  • Did you include the new bench signature? The last line of your commit message must be either No functional change or Bench: XXXXXXX.
  • Is your patch portable? Your PR will be automatically tested on various compilers via GitHub Actions. You can test this yourself beforehand by pushing your branch to a branch named github_ci on your fork.
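One possible way to produce the single commit asked for in the checklist is a soft reset to the point where the branch left master, followed by one new commit. This is a sketch, not the only approach (an interactive rebase works too). The function name squash_branch is hypothetical, and official/master assumes the remote name used by the sync script.

```shell
#!/bin/sh
# Collapses every commit made since the branch diverged from <base> into one commit.
# Run it on the feature branch with a clean working tree.
squash_branch() {
    base=$1       # e.g. official/master
    message=$2    # final commit message (should end with the bench line)
    fork_point=$(git merge-base "$base" HEAD) || return 1
    # Move the branch pointer back but keep all changes staged
    git reset --soft "$fork_point" || return 1
    git commit --quiet -m "$message"
}
```

After squash_branch official/master "My clear and descriptive commit message", the rewritten branch has to be pushed with git push --force-with-lease origin my-new-feature-branch, because its history no longer matches the copy on GitHub.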

Examples of high-quality commit messages

Example 1: Functional Change

Simplify away nnue scale pawn count multiplier

Removes 2x multipliers in nnue scale calculation along with the pawn count term that was recently reintroduced.

Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/64305bc720eb941419bdf72e
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 38008 W: 10234 L: 10021 D: 17753

Passed non-regression LTC:
https://tests.stockfishchess.org/tests/view/6430b76a028b029b01ac9bfd
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 91232 W: 24686 L: 24547 D: 41999

Bench: 4017320

Example 2: Non-Functional Change

Set the length of GIT_SHA to 8 characters

Previously, the length of git commit hashes could vary depending on the git environment. This change standardizes it.

No functional change

Reference: Testing Methodology

This section provides guidelines for choosing the correct test parameters on Fishtest. When in doubt, use the Standard procedure.

Testing Definitions
  • Simplification: A change that makes the code clearer, smaller, or more efficient without losing playing strength.
  • Bug Fix: A change that fixes a confirmed bug. These should be discussed on Discord first.
  • TC: Time Control (e.g., 10+0.1 means 10 seconds of base time plus a 0.1-second increment per move).
  • STC: Short Time Control (10+0.1).
  • LTC: Long Time Control (60+0.6).
  • SMP: Symmetric Multi-Processing (multi-threaded) tests.
  • SPRT(elo0, elo1): A Sequential Probability Ratio Test with Elo bounds [elo0, elo1]. The test stops as soon as the results favor one bound over the other with enough statistical confidence: it passes when the evidence favors elo1 and fails when it favors elo0.
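The (-2.94,2.94) pair shown on the LLR lines of the example commit messages is the stopping interval of the test. It follows from Wald's standard SPRT thresholds if both error rates are 5%. That 5% figure is an assumption here, since this guide does not state the values Fishtest uses. The function name sprt_bounds is hypothetical.

```shell
#!/bin/sh
# Wald's SPRT stopping thresholds for the log-likelihood ratio (LLR).
# $1 = alpha (false-positive rate), $2 = beta (false-negative rate)
sprt_bounds() {
    awk -v alpha="$1" -v beta="$2" 'BEGIN {
        # The test fails when the LLR drops below the first value
        # and passes when it rises above the second
        printf "%.2f %.2f\n", log(beta / (1 - alpha)), log((1 - beta) / alpha)
    }'
}

sprt_bounds 0.05 0.05
# prints: -2.94 2.94
```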

Standard Testing (for most functional changes)

This is the workhorse for ensuring only robust, Elo-positive patches are merged.

  1. Run a test at STC (10+0.1) with standard SPRT bounds [-0.75, 0.75].
  2. If the STC test passes, run a new test at LTC (60+0.6) with standard SPRT bounds [0.0, 1.0].
  3. If the LTC test passes, you are ready to create a pull request.

Simplifications and Bug Fixes

These changes aren't expected to gain Elo, so we test them to ensure they don't lose Elo.

  • Follow the standard STC -> LTC procedure, but use non-regression SPRT bounds, typically [-1.75, 0.25]. This requires the change to be statistically unlikely to lose more than 1.75 Elo.

Speedups

Changes that only improve speed (nodes per second) without altering search logic.

  1. Benchmark locally first. Use tools like perf on Linux or specialized scripts like pyshbench or FishBench to measure the speedup. A gain of at least 0.5% is typically required to be considered.
  2. If the change is complex, it must be tested on Fishtest like a standard functional change, as speedups can have unpredictable effects on search strength. A significant speedup should show up as an Elo gain on Fishtest.
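The 0.5% threshold is a relative change in nodes per second. A minimal sketch of the calculation follows. The function name speedup_percent and the sample figures are hypothetical, and real measurements should be averaged over several runs, which is what the benchmarking tools mentioned above are for.

```shell
#!/bin/sh
# Relative speedup in percent, from nodes-per-second figures of master and the patch.
# $1 = base nps (master), $2 = test nps (patch)
speedup_percent() {
    awk -v base="$1" -v patched="$2" 'BEGIN {
        printf "%.2f\n", 100 * (patched - base) / base
    }'
}

speedup_percent 1000000 1008000
# prints: 0.80
```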

Reference: Advanced Topics & Tools

Advanced Fishtest Options

These options are on the test creation page and should only be used if you know what you are doing.

  • Auto-purge: Toggles the automatic removal of statistically insignificant results. Useful to disable for time management tests.
  • Time odds: Uses different time controls for the base and test branches.
  • Custom book: Allows using a custom opening book for tests.
  • Disable adjudication: Prevents Fishtest from ending games early based on score.

Useful Resources

Optional: Git Commit Hook for Auto-Benchmarking

This script can automatically run the bench and insert the signature into your commit message.

  1. Save the following code as a file named commit-msg inside the .git/hooks/ directory of your local repository.
  2. Make it executable: chmod +x .git/hooks/commit-msg.
  3. When you write a commit message, include the word autobench on its own line. The script will replace it with Bench: XXXXXXX.

#!/bin/sh
set -e

if [ -z "$1" ]; then
    exit 1
fi

if ! git diff --exit-code --quiet; then
    echo "Working directory is not clean; cannot generate bench" >&2
    exit 0
fi

# Keyword to replace with bench message
ACTIVATION="autobench"

# Look for the keyword on a line of its own (the only form the sed command below replaces)
if ! grep -q -x -- "$ACTIVATION" "$1"; then
    exit 0
fi

# Build Stockfish
cd src
make -j build debug=yes
cd ..

# Obtain signature
signature=$(./src/stockfish bench 2>&1 | grep "Nodes searched  : " | awk '{print $4}')

# Replace keyword with bench message
sed -i "/^${ACTIVATION}\$/c\Bench: $signature" "$1"