Output representation - QueensGambit/CrazyAra GitHub Wiki

Output representation

CrazyAra 0.1 used a policy encoding similar to AlphaZero's: semantic planes for each compass direction, special planes for knight moves, etc. This old policy description had two minor disadvantages:

The total flattened policy vector had 4992 entries, of which more than half correspond to illegal moves. This is because a piece located near the edge cannot move to squares outside the board, yet the plane encoding still reserves entries for those moves. In general, you should avoid training a neural net with classes that never occur in the dataset and never will.
AlphaZero's output description only represents underpromotions (to knight, bishop, or rook) explicitly and treats promotion to a queen as the default. This leads to an ambiguity when converting a move to UCI notation, because e7e8 and e7e8q are seen as two different moves.

CrazyAra 0.2 uses a more compact policy representation. It makes use of a constant vector that stores every UCI move that can occur in any Crazyhouse board state. Most of these entries are traditional chess moves. All possible dropping moves are appended at the end of the vector. This vector has only 2272 entries.
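
As a rough illustration of how such a constant vector could be put together, the following sketch enumerates queen-like and knight moves from every square, then promotion moves, and finally the dropping moves. The names `build_uci_labels`, `UCI_LABELS`, and `MOVE_INDEX` are hypothetical, and the ordering of the board moves is an assumption; the actual vector in CrazyAra may be constructed and ordered differently. Under these assumptions the enumeration yields 2272 entries.

```python
FILES = "abcdefgh"
RANKS = "12345678"

# Eight compass directions (queen-like moves) and the eight knight jumps.
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]
KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def square(file_idx, rank_idx):
    """Return the algebraic name of a square, e.g. (4, 6) -> 'e7'."""
    return FILES[file_idx] + RANKS[rank_idx]


def on_board(file_idx, rank_idx):
    return 0 <= file_idx < 8 and 0 <= rank_idx < 8


def build_uci_labels():
    """Hypothetical construction of a constant UCI move vector for Crazyhouse."""
    labels = []

    # 1) Queen-like and knight moves from every square (only on-board targets).
    for f in range(8):
        for r in range(8):
            for df, dr in DIRECTIONS:
                for step in range(1, 8):
                    tf, tr = f + df * step, r + dr * step
                    if on_board(tf, tr):
                        labels.append(square(f, r) + square(tf, tr))
            for df, dr in KNIGHT_JUMPS:
                tf, tr = f + df, r + dr
                if on_board(tf, tr):
                    labels.append(square(f, r) + square(tf, tr))

    # 2) Promotions for both colours, with an explicit piece suffix
    #    (queen included, so e7e8 and e7e8q are separate entries).
    for from_rank, to_rank in ((6, 7), (1, 0)):
        for f in range(8):
            for tf in (f - 1, f, f + 1):
                if on_board(tf, to_rank):
                    for piece in "qrbn":
                        labels.append(square(f, from_rank) + square(tf, to_rank) + piece)

    # 3) Dropping moves at the end; pawns cannot be dropped on the first or last rank.
    for piece in "PNBRQ":
        for f in range(8):
            for r in range(8):
                if piece == "P" and r in (0, 7):
                    continue
                labels.append(f"{piece}@{square(f, r)}")

    return labels


UCI_LABELS = build_uci_labels()
# Lookup table from UCI string to its index in the policy vector.
MOVE_INDEX = {uci: idx for idx, uci in enumerate(UCI_LABELS)}

print(len(UCI_LABELS))
print(MOVE_INDEX["e7e8"], MOVE_INDEX["e7e8q"], MOVE_INDEX["N@f3"])
```

Because every entry is a plain UCI string, e7e8 and e7e8q simply occupy two different indices, and converting between a policy index and a move is a single table lookup in either direction.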