Strength Evaluation v0.8.4
This page describes how to replicate the experiments presented in our paper Improving AlphaZero Using Monte-Carlo Graph Search.
A release with binaries for Linux and Windows using the TensorRT backend can be found in release 0.8.4.
- ClassicAra_084 was built using commit CrazyAra#500da21e0bd9152657adbbc6118f3ebbc660e449.
- CrazyAra_pre_pull_47 was built using commit CrazyAra#82e821e8721aa635e1415e718027b2cbe19356a0.
Warning: The ClassicAra binary is not crash-free. An overflow of the variable numberParentNodes and a wrong ply-counter check (#83) turned out to be the cause. A fix is applied in release 0.9.0.
The ClassicAra and CrazyAra 0.8.4 binaries expect the Model_Directory to be model/, while release 0.9.0 expects it to be model/chess and model/crazyhouse by default instead (#75).
There are three ways to fix this problem:
- Manually set the Model_Directory before calling isready:
  setoption name Model_Directory value model/chess
- Move the files from model/chess into model/.
- If you have already generated the trt-files, you should be able to edit the UCI-Option Model_Directory via the GUI directly.
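The first option amounts to sending the option before the engine is asked whether it is ready. The following minimal Python sketch only builds the command sequence; the helper name `model_directory_commands` is hypothetical and not part of CrazyAra, and the ordering `uci`, `setoption`, `isready` is the usual UCI handshake rather than something specific to this release.

```python
def model_directory_commands(model_dir: str = "model/chess") -> list[str]:
    """Build a UCI command sequence that sets Model_Directory before isready.

    Hypothetical helper for illustration; the option name and the example
    value are taken from the text above.
    """
    return [
        "uci",
        f"setoption name Model_Directory value {model_dir}",
        "isready",  # must come after the option has been set
    ]


if __name__ == "__main__":
    # Print the commands so they can be pasted into an engine console.
    print("\n".join(model_directory_commands()))
```

The printed lines can be typed into the engine's console or sent by a GUI that allows custom startup commands.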
Hardware Setup
Hardware / Software | Description |
---|---|
GPU | NVIDIA GeForce RTX2070 OC |
Backend | TensorRT-7.0.0.11, float16 precision |
GPU-Driver | CUDA 10.2, cuDNN 7.6.5 |
CPU | AMD® Ryzen 7 1700 eight-core processor × 16 |
Memory (RAM) | 31.4 GiB |
Operating System | Ubuntu 18.04.3 LTS, 64-bit |
Tournament Environment | Cutechess 1.1.0 |
CrazyAra | 500da21e0bd9152657adbbc6118f3ebbc660e449 |
Multi-Variant-Stockfish | 2020-06-13 |
Stockfish | 12-NNUE, nn-82215d0fd0df.nnue |
Opening Suites
The following opening suites were used when conducting the engine tournaments:
- Crazyhouse: crazyhouse_mix_cp_130.epd
- Chess: gaviota-starters.pgn.zip
UCI-Options
Multi-Variant-Stockfish (2020-06-13)
All default except:
- Threads: 8
Stockfish 12-NNUE
All default except:
- Threads: 2
CrazyAra 0.8.4
The configuration labeled as AlphaZero* in the paper corresponds to the following settings:
- Search_Type: MCTS
- Context: gpu
- Device_ID: 0
- Batch_Size: 16
- Threads: 2
- Centi_CPuct_Init: 250
- CPuct_Base: 19652
- Centi_Dirichlet_Epsilon: 0
- Centi_Dirichlet_Alpha: 20
- Centi_U_Init: 100
- Centi_U_Min: 100
- U_Base: 1965
- Centi_U_Init_Divisor: 100
- Centi_Q_Value_Weight: 0
- Centi_Q_Thresh_Init: 50
- Centi_Q_Thresh_Max: 90
- Q_Thresh_Base: 1965
- Max_Search_Depth: 99
- Centi_Temperature: 80
- Temperature_Moves: 0
- Centi_Temperature_Decay: 92
- Centi_Node_Temperature: 200
- Virtual_Loss: 1
- Nodes: 0
- Allow_Early_Stopping: True
- Use_Raw_Network: False
- Enhance_Checks: False
- Enhance_Captures: False
- Use_Transposition_Table: False
- Use_TensorRT: True
- Fixed_Movetime: 5000
- Model_Directory: model/
- Move_Overhead: 50
- Centi_Random_Move_Factor: 0
- Use_Random_Playout: False
- MCTS_Solver: False
The configuration labeled as MCGS-Combined in the paper corresponds to AlphaZero* with the following overrides.
Note: Use_Transposition_Table = True activates the MCGS in this case.
- Centi_Q_Value_Weight: 200 (Q-Values for Move)
- Enhance_Checks: True (Enhanced Checks)
- Use_Transposition_Table: True (MCGS)
- Use_Random_Playout: True (Epsilon-Greedy)
- MCTS_Solver: True (Terminal Solver)
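The relation between the two configurations can be sketched as a dictionary merge. The Python snippet below is only an illustration: it lists the five AlphaZero* values that MCGS-Combined changes (all other options stay as listed above), and the names `ALPHAZERO_STAR`, `MCGS_COMBINED_OVERRIDES`, `format_value`, and `setoption_commands` are hypothetical. Writing booleans as lowercase `true`/`false` follows the common UCI convention and is an assumption about what the engine accepts.

```python
# Subset of the AlphaZero* settings: only the keys that MCGS-Combined overrides.
ALPHAZERO_STAR = {
    "Centi_Q_Value_Weight": 0,
    "Enhance_Checks": False,
    "Use_Transposition_Table": False,
    "Use_Random_Playout": False,
    "MCTS_Solver": False,
}

# Overrides that turn AlphaZero* into MCGS-Combined.
MCGS_COMBINED_OVERRIDES = {
    "Centi_Q_Value_Weight": 200,      # Q-Values for Move
    "Enhance_Checks": True,           # Enhanced Checks
    "Use_Transposition_Table": True,  # MCGS
    "Use_Random_Playout": True,       # Epsilon-Greedy
    "MCTS_Solver": True,              # Terminal Solver
}


def format_value(value) -> str:
    """Render a Python value as a UCI option value (assumed lowercase booleans)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def setoption_commands(base: dict, overrides: dict) -> list[str]:
    """Merge base settings with overrides and emit one setoption line per key."""
    merged = {**base, **overrides}  # later keys win, so overrides replace base values
    return [
        f"setoption name {name} value {format_value(value)}"
        for name, value in merged.items()
    ]


if __name__ == "__main__":
    for line in setoption_commands(ALPHAZERO_STAR, MCGS_COMBINED_OVERRIDES):
        print(line)
```

Passing an empty dictionary as `overrides` yields the AlphaZero* values for the same five options, which makes it easy to switch between the two setups in a script.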
Nodes per Second (NPS)
- Multi-Variant-Stockfish: 7.9 Million NPS
- Stockfish 12-NNUE: 1.6 Million NPS
- CrazyAra 0.8.4: 17 K NPS
Experiments
The raw data for all experiments shown in Figures 3 to 7 is given below.
The experiments used a Fixed_Movetime between 100 and 5000 ms, or a fixed number of Simulations and Nodes, to avoid results being distorted by the time manager.
Figure 3
- Elo development relative to the number of neural network evaluations in crazyhouse.
Rank Name Elo +/- Games Score Draws
1 CrazyAra-0.8.4-ALL [3200sim] 371 27 1490 89.4% 4.0%
2 CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 [3200sim] 356 26 1488 88.6% 4.0%
3 CrazyAra-0.8.4-Random-Playout [3200sim] 346 26 1490 88.0% 3.6%
4 CrazyAra-0.8.4-Solver [3200sim] 342 26 1490 87.8% 3.7%
5 CrazyAra-0.8.4-CHECK-ENHANCE [3200sim] 341 25 1488 87.7% 5.0%
6 CrazyAra-0.8.4-Q-2.0 [3200sim] 339 25 1490 87.6% 4.3%
7 CrazyAra-0.8.4-DAG [3200sim] 338 25 1488 87.5% 4.4%
8 CrazyAra-0.8.4-ALL [1600sim] 204 20 1490 76.4% 4.6%
9 CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 [1600sim] 196 20 1488 75.5% 4.5%
10 CrazyAra-0.8.4-CHECK-ENHANCE [1600sim] 195 20 1488 75.5% 4.0%
11 CrazyAra-0.8.4-DAG [1600sim] 194 20 1488 75.4% 4.1%
12 CrazyAra-0.8.4-Random-Playout [1600sim] 192 20 1490 75.1% 3.8%
13 CrazyAra-0.8.4-Q-2.0 [1600sim] 180 20 1488 73.9% 4.4%
14 CrazyAra-0.8.4-Solver [1600sim] 175 19 1490 73.3% 4.8%
15 CrazyAra-0.8.4-DAG [800sim] 74 18 1488 60.6% 4.0%
16 CrazyAra-0.8.4-ALL [800sim] 72 18 1488 60.2% 4.8%
17 CrazyAra-0.8.4-CHECK-ENHANCE [800sim] 71 18 1488 60.1% 4.3%
18 CrazyAra-0.8.4-Q-2.0 [800sim] 71 18 1490 60.1% 4.3%
19 CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 [800sim] 63 18 1488 58.9% 4.4%
20 CrazyAra-0.8.4-Random-Playout [800sim] 61 18 1490 58.7% 4.1%
21 CrazyAra-0.8.4-Solver [800sim] 55 17 1490 57.9% 4.4%
22 CrazyAra-0.8.4-Q-2.0 [400sim] -44 17 1490 43.7% 4.6%
23 CrazyAra-0.8.4-ALL [400sim] -46 17 1488 43.4% 4.6%
24 CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 [400sim] -52 17 1488 42.6% 5.0%
25 CrazyAra-0.8.4-CHECK-ENHANCE [400sim] -52 17 1488 42.6% 4.9%
26 CrazyAra-0.8.4-Solver [400sim] -57 17 1490 41.9% 4.8%
27 CrazyAra-0.8.4-DAG [400sim] -61 17 1488 41.4% 4.9%
28 CrazyAra-0.8.4-Random-Playout [400sim] -61 18 1490 41.3% 4.3%
29 CrazyAra-0.8.4-Q-2.0 [200sim] -166 19 1490 27.8% 4.4%
30 CrazyAra-0.8.4-ALL [200sim] -183 19 1490 25.9% 5.1%
31 CrazyAra-0.8.4-Solver [200sim] -189 20 1490 25.2% 5.0%
32 CrazyAra-0.8.4-CHECK-ENHANCE [200sim] -201 20 1488 23.9% 4.4%
33 CrazyAra-0.8.4-Random-Playout [200sim] -204 20 1490 23.6% 4.7%
34 CrazyAra-0.8.4-DAG [200sim] -206 20 1488 23.4% 4.5%
35 CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 [200sim] -211 20 1488 22.9% 3.7%
36 CrazyAra-0.8.4-ALL [100sim] -334 26 1490 12.8% 2.9%
37 CrazyAra-0.8.4-Q-2.0 [100sim] -365 27 1488 10.9% 3.0%
38 CrazyAra-0.8.4-DAG [100sim] -377 28 1488 10.2% 2.9%
39 CrazyAra-0.8.4-Solver [100sim] -380 28 1490 10.1% 2.9%
40 CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 [100sim] -388 29 1488 9.7% 2.2%
41 CrazyAra-0.8.4-CHECK-ENHANCE [100sim] -390 29 1488 9.6% 2.8%
42 CrazyAra-0.8.4-Random-Playout [100sim] -392 29 1490 9.5% 2.6%
31269 games finished.
Notes
- Figure 3 only shows CrazyAra-0.8.4-ALL and CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 to avoid over-plotting.
- CrazyAra-0.8.4 735b33481bfab02754f002926c4895d9aabdb7a1 is described as AlphaZero* in the paper.
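The Elo and Score columns in the table above are linked by the standard logistic Elo model, in which an expected score s corresponds to a rating difference of -400 · log10(1/s - 1). The Python sketch below shows that conversion; it is an illustration of the general formula, not code taken from the tournament tooling, and the function name `elo_from_score` is hypothetical.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a score fraction (0 < score < 1) into an Elo difference.

    Uses the logistic Elo model: elo = -400 * log10(1 / score - 1).
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Scores taken from the first and last rows of the table above.
    for score in (0.894, 0.095):
        print(f"score {score:.1%} -> {elo_from_score(score):+.0f} Elo")
```

For the top row, a score of 89.4% maps to roughly 370 Elo, in line with the listed 371; the small gap presumably comes from the score being rounded to one decimal in the table.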
Figure 5
- Elo development relative to the number of neural network evaluations in chess.
Rank Name Elo +/- Games Score Draws
1 ClassicAra-0.8.6-ALL 1600-Evals 328 41 414 86.8% 15.7%
2 ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 1600-Evals 298 37 416 84.7% 19.5%
3 ClassicAra-0.8.6-ALL 800-Evals 159 32 416 71.4% 19.2%
4 ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 800-Evals 142 32 414 69.3% 17.4%
5 ClassicAra-0.8.6-ALL 400-Evals 17 30 416 52.4% 21.6%
6 ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 400-Evals 4 30 414 50.6% 19.6%
7 ClassicAra-0.8.6-ALL 200-Evals -140 33 416 30.9% 14.7%
8 ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 200-Evals -141 32 416 30.8% 17.3%
9 ClassicAra-0.8.6-ALL 100-Evals -349 47 414 11.8% 8.7%
10 ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 100-Evals -358 47 416 11.3% 8.7%
2077 of 45000 games finished.
Notes
- ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 is described as AlphaZero* in the paper.
Figure 4
- Elo development in crazyhouse over time for MCGS compared to MCTS, where MCTS uses a hash table as a transposition buffer to copy neural network evaluations.
Version | Time per Move [ms] | Elo | +/- |
---|---|---|---|
CrazyAra-0.8.4 (MCGS) | 5000 | 511 | 36 |
CrazyAra-0.8.4-pre-pull-47 (Transposition Table) | 5000 | 456 | 81 |
CrazyAra-0.8.4 (MCGS) | 2500 | 252 | 36 |
CrazyAra-0.8.4-pre-pull-47 (Transposition Table) | 2500 | 224 | 50 |
CrazyAra-0.8.4 (MCGS) | 1000 | 120 | 41 |
CrazyAra-0.8.4-pre-pull-47 (Transposition Table) | 1000 | 74 | 43 |
CrazyAra-0.8.4 (MCGS) | 500 | -33 | 42 |
CrazyAra-0.8.4-pre-pull-47 (Transposition Table) | 500 | -53 | 42 |
CrazyAra-0.8.4 (MCGS) | 250 | -188 | 22 |
CrazyAra-0.8.4-pre-pull-47 (Transposition Table) | 250 | -222 | 42 |
CrazyAra-0.8.4 (MCGS) | 100 | -554 | 21 |
CrazyAra-0.8.4-pre-pull-47 (Transposition Table) | 100 | -573 | 42 |
Notes
- CrazyAra-0.8.4 (MCGS) corresponds to the binary CrazyAra-0.8.4.
- CrazyAra-0.8.4-pre-pull-47 (Transposition Table) corresponds to the binary CrazyAra-0.8.4_pre_pull_47.
- First, all CrazyAra-0.8.4-pre-pull-47 (Transposition Table) versions were tested against each other at the different time controls.
- Then, for each time control, the CrazyAra-0.8.4-pre-pull-47 version was tested against the respective CrazyAra-0.8.4 (MCGS) version. This is why the error is lower for the CrazyAra-0.8.4 (MCGS) versions.
- The configuration labeled as "MCTS which uses a hash table as a transposition buffer to copy neural network evaluations" corresponds to AlphaZero* with the override Use_Transposition_Table = True.
- For CrazyAra-0.8.4-pre-pull-47, Use_Transposition_Table = True activates a transposition table that copies over neural network evaluations.
Figure 6
- Elo comparison of the proposed search modifications in crazyhouse using five seconds per move. On the hardware used, this resulted in 100 000 to 800 000 total nodes per move.
Rank | Name | Movetime | Elo | Uncertainty |
---|---|---|---|---|
1 | CrazyAra-0.8.4-ALL | 5000ms | 283 | 96 |
2 | CrazyAra-0.8.4-DAG | 5000ms | 81 | 62 |
6 | CrazyAra-0.8.4-Random-Playout | 5000ms | 63 | 84 |
7 | CrazyAra-0.8.4-CHECK-ENHANCE | 5000ms | 19 | 65 |
3 | CrazyAra-0.8.4-Q-2.0 | 5000ms | -18 | 65 |
4 | CrazyAra-0.8.4-Solver | 5000ms | -28 | 59 |
5 | CrazyAra-0.8.4-735b33481bfab02754f002926c4895d9aabdb7a1 | 5000ms | -36 | 62 |
Notes
- CrazyAra-0.8.4-735b33481bfab02754f002926c4895d9aabdb7a1 is described as AlphaZero* in the paper.
Figure 7
- Elo comparison of the proposed search modifications in chess using five seconds per move.
Rank | Name | Movetime | Elo | Uncertainty |
---|---|---|---|---|
1 | ClassicAra-0.8.6-ALL | 5000ms | 51 | 38.99 |
2 | ClassicAra-0.8.6-DAG | 5000ms | 51 | 36.83 |
3 | ClassicAra-0.8.6-Q-2.0 | 5000ms | 17 | 34 |
4 | ClassicAra-0.8.6-Random-Playout | 5000ms | 9 | 54 |
5 | ClassicAra-0.8.6-CHECK-ENHANCE | 5000ms | 0 | 43 |
6 | ClassicAra-0.8.6-Solver | 5000ms | -9 | 48 |
7 | ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 | 5000ms | -18 | 66 |
Notes
- ClassicAra-0.8.6-500da21e0bd9152657adbbc6118f3ebbc660e449 is described as AlphaZero* in the paper.