title: Kindergarten Bitboards


Paul Klee - Group of Masks, 1939 [1]

Kindergarten bitboards [2] grew out of an interactive forum development [3] with a lot of meanders [4]. Two issues were involved: first, calculating the occupancy of any line from the occupied bitboard [5], and second, keeping the lookup tables compact and dense. As the upshot of that discussion, Gerd Isenberg came up with the name. The technique relies on fast 64-bit multiplication, but is otherwise quite resource friendly, a compromise between calculation and table size.

File-Attacks

Files need a tad more work. First shift the occupancy towards the A-file by the file index (leftwards on the board, but arithmetically a right shift) and mask the A-file. To get the inner six bits, a flip-multiplication by the c2-h7 diagonal is applied, followed by a shift right by 58. The lookup table contains the A-file attacks, which are shifted "back" to the original file.


U64 aFileAttacks [8][64];  // 4 KByte

U64 fileAttacks(U64 occ, enumSquare sq) {
   const U64 aFile   = C64(0x0101010101010101);
   const U64 diac2h7 = C64(0x0080402010080400);
   occ = aFile  & (occ >> (sq&7));
   occ = (diac2h7 *  occ ) >> 58;
   return aFileAttacks[sq>>3][occ] << (sq&7);
}

This is how it works:


masked A-file    *  c2-h7 Diagonal    =  occupancy
H . . . . . . .     . . . . . . . .     . .[G F E D C B]    . . . . . . . .
G . . . . . . .     . . . . . . . 1     . . F E D C B A     . . . . . . . .
F . . . . . . .     . . . . . . 1 .     . . E D C B A .     . . . . . . . .
E . . . . . . .     . . . . . 1 . .     . . D C B A . .  >> . . . . . . . .
D . . . . . . .  *  . . . . 1 . . .  =  . . C B A . . .  58 . . . . . . . .
C . . . . . . .     . . . 1 . . . .     . . B A . . . .     . . . . . . . .
B . . . . . . .     . . 1 . . . . .     . . A . . . . .     . . . . . . . .
A . . . . . . .     . . . . . . . .     . . . . . . . .    [G F E D C B]. .
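The lookup table itself is not listed here. What follows is a minimal sketch of how aFileAttacks could be filled, assuming the usual little-endian rank-file mapping (a1 = 0, h8 = 63) and a plain int in place of enumSquare. The helpers slowAFileAttacks and initAFileAttacks are hypothetical names, not taken from the original posts. Since the multiplication by the c2-h7 diagonal puts the seventh rank into the lowest index bit, index bit i is taken to stand for the zero-based rank 6-i.

```c
#include <stdint.h>

typedef uint64_t U64;

static U64 aFileAttacks[8][64];   /* [rank][reversed inner occupancy], 4 KByte */

/* Slow reference: attacks along the a-file for a rook on `rank`,
   where bit 8*r of occ is set when rank r of the a-file is occupied. */
static U64 slowAFileAttacks(U64 occ, int rank) {
    U64 attacks = 0;
    for (int r = rank + 1; r < 8; ++r) {      /* north */
        attacks |= 1ULL << (8 * r);
        if (occ & (1ULL << (8 * r))) break;   /* first blocker is still attacked */
    }
    for (int r = rank - 1; r >= 0; --r) {     /* south */
        attacks |= 1ULL << (8 * r);
        if (occ & (1ULL << (8 * r))) break;
    }
    return attacks;
}

/* Fill the table; index bit i stands for rank 6-i (reversed order). */
static void initAFileAttacks(void) {
    for (int rank = 0; rank < 8; ++rank)
        for (int idx = 0; idx < 64; ++idx) {
            U64 occ = 0;
            for (int i = 0; i < 6; ++i)
                if (idx & (1 << i)) occ |= 1ULL << (8 * (6 - i));
            aFileAttacks[rank][idx] = slowAFileAttacks(occ, rank);
        }
}

/* Same computation as the routine above, with int instead of enumSquare. */
static U64 fileAttacks(U64 occ, int sq) {
    const U64 aFile   = 0x0101010101010101ULL;
    const U64 diac2h7 = 0x0080402010080400ULL;
    occ = aFile & (occ >> (sq & 7));
    occ = (diac2h7 * occ) >> 58;
    return aFileAttacks[sq >> 3][occ] << (sq & 7);
}
```

After a single call to initAFileAttacks(), a rook on d4 with blockers on d2 and d6 should see exactly d2, d3, d5 and d6.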

Note that the six-bit inner occupancy comes out reversed, which is accounted for in the pre-calculated aFileAttacks array. The reversed lookup was originally justified by sharing first-rank attacks across all directions, with a dense lookup of 512 bytes. But the 4 KByte tables outperform the additional multiplications and shift of the dense version, and one may alternatively multiply with the flipped diagonal, the c7-h2 anti-diagonal:


masked A-file    *  c7-h2 AntiDiag   =  occupancy
H . . . . . . .     . . . . . . . .     . .[B C D E F G]    . . . . . . . .
G . . . . . . .     . . 1 . . . . .     . . A B C D E F     . . . . . . . .
F . . . . . . .     . . . 1 . . . .     . . . A B C D E     . . . . . . . .
E . . . . . . .     . . . . 1 . . .     . . . . A B C D  >> . . . . . . . .
D . . . . . . .  *  . . . . . 1 . .  =  . . . . . A B C  58 . . . . . . . .
C . . . . . . .     . . . . . . 1 .     . . . . . . A B     . . . . . . . .
B . . . . . . .     . . . . . . . 1     . . . . . . . A     . . . . . . . .
A . . . . . . .     . . . . . . . .     . . . . . . . .    [B C D E F G]. .
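The two factors can be compared directly. The sketch below assumes a1 = 0 and derives the c7-h2 constant 0x0004081020408000 from that square numbering, because the text does not spell it out. The function names are illustrative only.

```c
#include <stdint.h>

typedef uint64_t U64;

/* c2-h7 diagonal: index bit 0 is the seventh rank, bit 5 the second rank. */
static unsigned fileIndexReversed(U64 maskedAFile) {
    return (unsigned)((0x0080402010080400ULL * maskedAFile) >> 58);
}

/* c7-h2 anti-diagonal (assumed constant): bit 0 is the second rank,
   bit 5 the seventh rank. */
static unsigned fileIndexForward(U64 maskedAFile) {
    return (unsigned)((0x0004081020408000ULL * maskedAFile) >> 58);
}
```

With only a2 occupied the reversed variant yields 32 and the forward variant 1; with only a7 occupied it is the other way round. A1 and a8 never reach the index in either variant.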

Shared Rank Lookup

As so often, it is computation versus memory size. One may share a 512-byte lookup of the first rank among all lines, at the cost of some trailing computation: multiplying with the A-file (fill north) for ranks and diagonals, and multiplying with the diagonal for the file. The additional multiplication likely does not pay off.


const BYTE firstRankAttacks[8][64];

U64 fileAttacks(U64 occ, enumSquare sq) {
   const U64 aFile   = C64(0x0101010101010101);
   const U64 hFile   = C64(0x8080808080808080);
   const U64 diaa1h8 = C64(0x8040201008040201);
   const U64 diac2h7 = C64(0x0080402010080400);

   unsigned int f = sq & 7;
   occ =   aFile   & (occ   >>  f);
   occ = ( diac2h7 *  occ ) >> 58;
   occ =   diaa1h8 * firstRankAttacks[(sq^56)>>3][occ];
   return ( hFile  &  occ ) >> (f^7);
}

U64 diagonalAttacks(U64 occ, enumSquare sq) {
   const U64 aFile = C64(0x0101010101010101);
   const U64 bFile = C64(0x0202020202020202);

   unsigned int f = sq & 7;
   occ  =  diagonalMaskEx[sq] & occ;
   occ  = (bFile * occ ) >> 58;
   occ  =  aFile *  firstRankAttacks[f][occ];
   return  diagonalMaskEx[sq] & occ;
}
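The shared table is only declared above. A minimal sketch of one way to fill firstRankAttacks and drive the file routine with it follows, assuming a1 = 0, a plain int for the square, and index bit i standing for file i+1 (b to g) of the first rank. initFirstRankAttacks is a hypothetical helper name.

```c
#include <stdint.h>

typedef uint64_t U64;
typedef uint8_t  BYTE;

static BYTE firstRankAttacks[8][64];   /* [file][inner six-bit occupancy], 512 Byte */

/* Fill the first-rank table with a slow scan in both directions. */
static void initFirstRankAttacks(void) {
    for (int file = 0; file < 8; ++file)
        for (int idx = 0; idx < 64; ++idx) {
            unsigned occ = (unsigned)idx << 1;     /* bits 1..6 = files b..g */
            unsigned att = 0;
            for (int f = file + 1; f < 8; ++f) {   /* east */
                att |= 1u << f;
                if (occ & (1u << f)) break;
            }
            for (int f = file - 1; f >= 0; --f) {  /* west */
                att |= 1u << f;
                if (occ & (1u << f)) break;
            }
            firstRankAttacks[file][idx] = (BYTE)att;
        }
}

/* File attacks via the shared table, same steps as the routine above. */
static U64 fileAttacks(U64 occ, int sq) {
    const U64 aFile   = 0x0101010101010101ULL;
    const U64 hFile   = 0x8080808080808080ULL;
    const U64 diaa1h8 = 0x8040201008040201ULL;
    const U64 diac2h7 = 0x0080402010080400ULL;

    unsigned f = sq & 7;
    occ =   aFile   & (occ >> f);
    occ = ( diac2h7 *  occ ) >> 58;                         /* reversed index */
    occ =   diaa1h8 * firstRankAttacks[(sq ^ 56) >> 3][occ]; /* mirrored rank */
    return ( hFile  &  occ ) >> (f ^ 7);
}
```

The reversed occupancy index is paired with the mirrored rank (sq^56)>>3, and the multiplication by the a1-h8 diagonal flips the resulting byte back onto the h-file before the final shift.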

32-bit Versions

Another variation of the memory versus computation theme was encouraged by 32-bit mode. 64-bit multiplication is quite expensive in 32-bit mode: a call using three imuls. Thus, it is more efficient to use shift-or plus 32-bit multiplication, which might in fact be used in 64-bit mode as well. Piotr Cichy proposed a multiplication-free parallel prefix shift approach similar to Occupancy of any Line [6], which is a good alternative for processors with slow multiplication.

An efficient and tricky file approach was introduced by Zach Wegner [7], using a 32 KByte, rotated-like lookup table. In his words:

"It is quite strange, yes, but it is an out of order mapping. There are only 5 bits because each bit in the factor maps more than one bit. The trick here is the odd shift 29, so that the multiply does not overflow individual bits. I have since found that 25 and 27 will work with the same magic:"


occ
. . . . . . . .
a . . . . . . .
b . . . . . . .    occ | occ >> 29    * 0x01041041 with the index bracketed
c . . . . . . .    ...\               ...\               ...\
d . . . . . . .    d . . . . . . .    1 . . . . . . .    d a[f c e b d a]
e . . . . . . .    e . . a . . . .    . . 1 . . . . .    e b . a f c e b
f . . . . . . .    f . . b . . . .    . . . . 1 . . .    f c . b . . f c
. . . . . . . .    . . . c . . . .    1 . . . . . 1 .    . . . c . . . .

The interesting thing is that this works for any masked file. In fact if it was shifted to the a-file, you could get away with the 3-bit factor 0x00041040 (but using a shift of 23).


U64 arrFileAttacks[64][64]; // [sq][occ64] 32KByte

U64 fileAttacks(U64 occ, enumSquare sq) {
   occ &= fileMask[sq];
   U32 fold  = (U32)occ | (U32)(occ >> 29);
   U32 occ64 = fold * 0x01041041 >> 26;
   return arrFileAttacks[sq][occ64];
}
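Whether the fold-and-multiply step gives every occupancy its own index can be checked by brute force. The sketch below assumes a1 = 0 and a mask that keeps only ranks 2 to 7 of the file; both function names are hypothetical. It is only exercised here for files a to e. On files f to h, bits 29 to 31 of the low word also survive the `occ >> 29`, so those files deserve a separate look with the mask actually used.

```c
#include <stdint.h>

typedef uint64_t U64;
typedef uint32_t U32;

/* Six-bit index from an occupancy that is already masked to one file. */
static U32 foldedFileIndex(U64 occ) {
    U32 fold = (U32)occ | (U32)(occ >> 29);
    return (fold * 0x01041041u) >> 26;
}

/* Number of distinct indices over all 64 occupancies of ranks 2..7 on `file`. */
static int distinctFileIndices(int file) {
    unsigned char seen[64] = { 0 };
    int count = 0;
    for (int bits = 0; bits < 64; ++bits) {
        U64 occ = 0;
        for (int i = 0; i < 6; ++i)
            if (bits & (1 << i)) occ |= 1ULL << (8 * (i + 1) + file);
        U32 idx = foldedFileIndex(occ);
        if (!seen[idx]) { seen[idx] = 1; ++count; }
    }
    return count;
}
```

A result of 64 means the mapping is a bijection for that file, so arrFileAttacks can be filled without any entry being overwritten.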

Ranks and diagonals are trivial. This version accepts rotated-like memory size in exchange for less computation, and uses the same operations as the file attacks. One may therefore generalize the routine by a line-direction parameter:


U64 arrDiagonalAttacks[64][64]; // [sq][occ64] 32KByte

U64 diagonalAttacks(U64 occ, enumSquare sq) {
   occ &= diagonalMaskEx[sq];
   U32 fold  = (U32)occ | (U32)(occ >> 32);
   U32 occ64 = fold * 0x02020202 >> 26;
   return arrDiagonalAttacks[sq][occ64];
}
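For completeness, here is a sketch of how diagonalMaskEx and arrDiagonalAttacks might be initialized for the a1-h8 direction. It assumes a1 = 0 and a plain int for the square, and the helpers slowDiagonalAttacks and initDiagonalAttacks are hypothetical names. After the fold and the multiplication, index bit i holds the occupancy of the diagonal square on file i+1, so the table is filled accordingly.

```c
#include <stdint.h>

typedef uint64_t U64;
typedef uint32_t U32;

static U64 diagonalMaskEx[64];
static U64 arrDiagonalAttacks[64][64];   /* [sq][occ64], 32 KByte */

/* Slow reference for the a1-h8 direction, used only to fill the tables. */
static U64 slowDiagonalAttacks(U64 occ, int sq) {
    U64 att = 0;
    int r = sq >> 3, f = sq & 7;
    for (int i = 1; r + i < 8 && f + i < 8; ++i) {    /* north-east */
        U64 b = 1ULL << (8 * (r + i) + f + i);
        att |= b;
        if (occ & b) break;
    }
    for (int i = 1; r - i >= 0 && f - i >= 0; ++i) {  /* south-west */
        U64 b = 1ULL << (8 * (r - i) + f - i);
        att |= b;
        if (occ & b) break;
    }
    return att;
}

static void initDiagonalAttacks(void) {
    for (int sq = 0; sq < 64; ++sq) {
        int r = sq >> 3, f = sq & 7;
        diagonalMaskEx[sq] = slowDiagonalAttacks(0, sq);  /* excludes sq itself */
        for (int idx = 0; idx < 64; ++idx) {
            U64 occ = 0;
            for (int i = 0; i < 6; ++i) {
                int file = i + 1, rank = r + file - f;    /* diagonal square on that file */
                if ((idx & (1 << i)) && rank >= 0 && rank < 8)
                    occ |= 1ULL << (8 * rank + file);
            }
            arrDiagonalAttacks[sq][idx] = slowDiagonalAttacks(occ, sq);
        }
    }
}

/* Same lookup as the routine above, with int instead of enumSquare. */
static U64 diagonalAttacks(U64 occ, int sq) {
    occ &= diagonalMaskEx[sq];
    U32 fold  = (U32)occ | (U32)(occ >> 32);
    U32 occ64 = (fold * 0x02020202u) >> 26;
    return arrDiagonalAttacks[sq][occ64];
}
```

Entries whose index bits refer to files that are not on the diagonal are never addressed and simply stay unused, which is part of the memory cost of this variant.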

A similar approach was proposed by Andrew Fan in 2009, after it had been active in his own engine for a few years (earliest recorded file time: 2006) [8].

Magic Compression

So far, Kindergarten bitboards perform a perfect hashing of the up to six relevant, scattered occupied bits of any line to a six-bit index, which is a bijective mapping of the 64 different occupancies per line to 64 indices for the precalculated attack sets.

If we take a closer look at the attack sets, say of a rook on the a-file, we count far fewer distinct sets. A rook on a1 (a8) has seven different attack sets on that file, depending on the occupancy of a2-a7. On a2 (a7) there is one attack set fewer, on a3 (a6) there are 2 times 5, and on a4 (a5) 3 times 4 attack sets. Thus, there are {7, 6, 10, 12, 12, 10, 6, 7} distinct attack sets per square on the line, or 70 in total over all eight squares.

While Kindergarten bitboards apply a minimal perfect mapping of scattered bits to a six-bit index, the mapping from occupancies to attack sets is only surjective: the 64 occupancies of a line map to at most 12 distinct sets per square. That is because occupancies "behind" the first blocker are redundant and yield the same attack set.
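The per-square counts can be reproduced by brute force. The following sketch assumes a1 = 0 and enumerates all 64 occupancies of a2-a7 for a rook on each rank of the a-file. Both function names are hypothetical.

```c
#include <stdint.h>

typedef uint64_t U64;

/* Slow reference: a-file attacks for a rook on `rank` (0..7). */
static U64 slowAFileAttacks(U64 occ, int rank) {
    U64 attacks = 0;
    for (int r = rank + 1; r < 8; ++r) {      /* north */
        attacks |= 1ULL << (8 * r);
        if (occ & (1ULL << (8 * r))) break;
    }
    for (int r = rank - 1; r >= 0; --r) {     /* south */
        attacks |= 1ULL << (8 * r);
        if (occ & (1ULL << (8 * r))) break;
    }
    return attacks;
}

/* Number of distinct attack sets over all occupancies of a2..a7. */
static int countAttackSets(int rank) {
    U64 seen[64];
    int count = 0;
    for (int bits = 0; bits < 64; ++bits) {
        U64 occ = 0;
        for (int i = 0; i < 6; ++i)
            if (bits & (1 << i)) occ |= 1ULL << (8 * (i + 1));
        U64 att = slowAFileAttacks(occ, rank);
        int known = 0;
        for (int k = 0; k < count; ++k)
            if (seen[k] == att) { known = 1; break; }
        if (!known) seen[count++] = att;
    }
    return count;
}
```

Calling countAttackSets for ranks 0 to 7 gives 7, 6, 10, 12, 12, 10, 6 and 7, which sums to 70.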

Grant Osborne came up with the idea, derived from magic bitboards, of using different "magic" factors per square (rank), where the multiplication may produce carries and enough so-called constructive collisions to get by with five-bit or even four-bit indices and therefore denser tables. Since different squares may have different table sizes (16 or 32 entries), a Java-like array is used for the attacks, implemented in C as an array of pointers to the differently sized attack tables. The variable right shift by either 60 or 59 is encoded inside the otherwise redundant upper six bits of the magic factor, as mentioned in incorporating the shift of magic bitboards.

Grant's proposal, so far with {5,4,4,5,5,4,4,5} bit ranges for the lookups per square for vertical rook attacks, results in a 1.5 KByte array instead of the 4 KByte of the initial Kindergarten file attack getter [9]. Whether the memory saving justifies the rank-indexed magic factor plus the additional pointer indirection is another question, and should be tried inside a concrete chess program with its individual cache and memory footprint.


U64 aFileAttacks[4*32+4*16]; // 1.5KByte
U64 *aPtrFileAttacks[8]; // points to appropriate aFileAttacks
U64 fileMagic[8] = {
   0xEFFFA39DB01B23A3, // 5-bit
   0xF024691A3227FF42, // 4-bit
   0xF2808817CAD6FF0C, // 4-bit see below
   0xED6EDFBE467977D5, // 5-bit
   0xEC87CB0D961EC43A, // 5-bit
   0xF2FF594E14D8801C, // 4-bit
   0xF2FF5D69D4E3E7D6, // 4-bit
   0xEE404B349599FF88  // 5-bit
};

U64 fileAttacks(U64 occ, enumSquare sq) {
   unsigned int file = sq &  7;
   unsigned int rank = sq >> 3;

   occ =  0x0001010101010100 & (occ >> file);
   occ = (fileMagic[rank] * occ) >> (fileMagic[rank] >> 58);  // four&five bit index
   return *(aPtrFileAttacks[rank] + occ) << file;
}
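How the pointer array is wired up is not shown. One possible layout, sketched under the assumption that the eight sub-tables are simply placed back to back in rank order, derives each table size from the shift stored in the upper six bits of the magic. The helpers magicShift, tableSize and initFilePointers are hypothetical names.

```c
#include <stdint.h>

typedef uint64_t U64;

static const U64 fileMagic[8] = {
    0xEFFFA39DB01B23A3ULL, /* 5-bit */
    0xF024691A3227FF42ULL, /* 4-bit */
    0xF2808817CAD6FF0CULL, /* 4-bit */
    0xED6EDFBE467977D5ULL, /* 5-bit */
    0xEC87CB0D961EC43AULL, /* 5-bit */
    0xF2FF594E14D8801CULL, /* 4-bit */
    0xF2FF5D69D4E3E7D6ULL, /* 4-bit */
    0xEE404B349599FF88ULL  /* 5-bit */
};

static U64  aFileAttacks[4*32 + 4*16];   /* 1.5 KByte */
static U64 *aPtrFileAttacks[8];

/* Right shift encoded in the upper six bits: 59 or 60. */
static unsigned magicShift(int rank) {
    return (unsigned)(fileMagic[rank] >> 58);
}

/* Entries needed for that rank: 32 for shift 59, 16 for shift 60. */
static unsigned tableSize(int rank) {
    return 1u << (64 - magicShift(rank));
}

/* Assumed layout: sub-tables back to back. Returns the entries used. */
static unsigned initFilePointers(void) {
    unsigned offset = 0;
    for (int rank = 0; rank < 8; ++rank) {
        aPtrFileAttacks[rank] = aFileAttacks + offset;
        offset += tableSize(rank);
    }
    return offset;
}
```

With the eight factors listed above, the sizes add up to 192 entries, which matches the declared 4*32+4*16 array.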

The table demonstrates how it works for the file attacks of a rook on a3 with a four-bit range. There are only five relevant occupied bits, since a3 itself is a member of the inner six bits. The empirically determined factor is 0xF2808817CAD6FF0C; its upper six bits contain the right shift for the product, 60 for this square:

| occupancy (A-File) | product | index 0..15 | attack set |
| --- | --- | --- | --- |
| o - outer squares, don't care; x - empty or any piece; . - empty; b - blocker, any piece; R - rook | occupancy * 0xF2808817CAD6FF0C | upper nibble in product | 1 attacked, . not attacked |
| o x x x b R b o | 1. attack set | | . . . . 1 . 1 . |
| 0x0000010101000100 | 0x3A28F9D5E2FF0C00 | 3 | 0x0000000001000100 |
| 0x0001010101000100 | 0x3934F9D5E2FF0C00 | 3 | 0x0000000001000100 |
| 0x0000000101000100 | 0x6329EDD5E2FF0C00 | 6 | 0x0000000001000100 |
| 0x0000010001000100 | 0x6F51FAC9E2FF0C00 | 6 | 0x0000000001000100 |
| 0x0001000101000100 | 0x6235EDD5E2FF0C00 | 6 | 0x0000000001000100 |
| 0x0001010001000100 | 0x6E5DFAC9E2FF0C00 | 6 | 0x0000000001000100 |
| 0x0000000001000100 | 0x9852EEC9E2FF0C00 | 9 | 0x0000000001000100 |
| 0x0001000001000100 | 0x975EEEC9E2FF0C00 | 9 | 0x0000000001000100 |
| o x x x b R . o | 2. attack set | | . . . . 1 . 1 1 |
| 0x0000000001000000 | 0x17CAD6FF0C000000 | 1 | 0x0000000001000101 |
| 0x0001000001000000 | 0x16D6D6FF0C000000 | 1 | 0x0000000001000101 |
| 0x0000010101000000 | 0xB9A0E20B0C000000 | 11 | 0x0000000001000101 |
| 0x0001010101000000 | 0xB8ACE20B0C000000 | 11 | 0x0000000001000101 |
| 0x0000000101000000 | 0xE2A1D60B0C000000 | 14 | 0x0000000001000101 |
| 0x0000010001000000 | 0xEEC9E2FF0C000000 | 14 | 0x0000000001000101 |
| 0x0001000101000000 | 0xE1ADD60B0C000000 | 14 | 0x0000000001000101 |
| 0x0001010001000000 | 0xEDD5E2FF0C000000 | 14 | 0x0000000001000101 |
| o x x b . R b o | 3. attack set | | . . . 1 1 . 1 . |
| 0x0001010100000100 | 0x216A22D6D6FF0C00 | 2 | 0x0000000101000100 |
| 0x0000010100000100 | 0x225E22D6D6FF0C00 | 2 | 0x0000000101000100 |
| 0x0000000100000100 | 0x4B5F16D6D6FF0C00 | 4 | 0x0000000101000100 |
| 0x0001000100000100 | 0x4A6B16D6D6FF0C00 | 4 | 0x0000000101000100 |
| o x x b . R . o | 4. attack set | | . . . 1 1 . 1 1 |
| 0x0000010100000000 | 0xA1D60B0C00000000 | 10 | 0x0000000101000101 |
| 0x0001010100000000 | 0xA0E20B0C00000000 | 10 | 0x0000000101000101 |
| 0x0000000100000000 | 0xCAD6FF0C00000000 | 12 | 0x0000000101000101 |
| 0x0001000100000000 | 0xC9E2FF0C00000000 | 12 | 0x0000000101000101 |
| o x b . . R b o | 5. attack set | | . . 1 1 1 . 1 . |
| 0x0000010000000100 | 0x578723CAD6FF0C00 | 5 | 0x0000010101000100 |
| 0x0001010000000100 | 0x569323CAD6FF0C00 | 5 | 0x0000010101000100 |
| o x b . . R . o | 6. attack set | | . . 1 1 1 . 1 1 |
| 0x0000010000000000 | 0xD6FF0C0000000000 | 13 | 0x0000010101000101 |
| 0x0001010000000000 | 0xD60B0C0000000000 | 13 | 0x0000010101000101 |
| o b . . . R b o | 7. attack set | | . 1 1 1 1 . 1 . |
| 0x0001000000000100 | 0x7F9417CAD6FF0C00 | 7 | 0x0001010101000100 |
| o b . . . R . o | 8. attack set | | . 1 1 1 1 . 1 1 |
| 0x0001000000000000 | 0xFF0C000000000000 | 15 | 0x0001010101000101 |
| o . . . . R b o | 9. attack set | | 1 1 1 1 1 . 1 . |
| 0x0000000000000100 | 0x808817CAD6FF0C00 | 8 | 0x0101010101000100 |
| o . . . . R . o | 10. attack set, no blocker | | 1 1 1 1 1 . 1 1 |
| 0x0000000000000000 | 0x0000000000000000 | 0 | 0x0101010101000101 |
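The table can be regenerated mechanically. The sketch below assumes a1 = 0 and, as in the table, an occupancy that does not contain the rook's own square a3. It fills the 16-entry lookup for a3 and reports whether every collision was constructive. The names slowA3Attacks, a3Index and buildA3Table are hypothetical.

```c
#include <stdint.h>

typedef uint64_t U64;

#define A3_MAGIC 0xF2808817CAD6FF0CULL

/* Slow reference: a-file attacks for a rook on a3 (rank index 2). */
static U64 slowA3Attacks(U64 occ) {
    U64 att = 0;
    for (int r = 3; r < 8; ++r) {            /* north: a4..a8 */
        att |= 1ULL << (8 * r);
        if (occ & (1ULL << (8 * r))) break;
    }
    for (int r = 1; r >= 0; --r) {           /* south: a2, a1 */
        att |= 1ULL << (8 * r);
        if (occ & (1ULL << (8 * r))) break;
    }
    return att;
}

/* Four-bit index; the shift (60) is read from the magic's upper six bits. */
static unsigned a3Index(U64 occ) {
    return (unsigned)((A3_MAGIC * occ) >> (A3_MAGIC >> 58));
}

/* Fill the table; returns 1 if no two occupancies with different
   attack sets share an index, 0 otherwise. */
static int buildA3Table(U64 table[16]) {
    static const int relevant[5] = { 8, 24, 32, 40, 48 };   /* a2, a4..a7 */
    int filled[16] = { 0 };
    for (int bits = 0; bits < 32; ++bits) {
        U64 occ = 0;
        for (int i = 0; i < 5; ++i)
            if (bits & (1 << i)) occ |= 1ULL << relevant[i];
        unsigned idx = a3Index(occ);
        U64 att = slowA3Attacks(occ);
        if (filled[idx] && table[idx] != att) return 0;     /* destructive */
        table[idx] = att;
        filled[idx] = 1;
    }
    return 1;
}
```

For this factor all 32 relevant occupancies land on the 16 indices without a destructive collision, in line with the rows above: ten distinct attack sets spread over sixteen slots.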

References

  1. Paul Klee - Group of Masks, from the Israel Museum
  2. Magic Bitboards Explained! by Michael Sherwin and reply by Gerd Isenberg to call it Kindergarten Bitboards, Winboard Forum, December 4, 2006
  3. Compact Bitboard Attacks by Tom Likens, Winboard Forum, March 14, 2006
  4. rotated bitboards obsolete? by Gerd Isenberg, CCC, February 26, 2006
  5. Re: Some thoughts on Dann Corbit's rotated alternative by Steffan Westcott, CCC, March 03, 2006
  6. Kindergarten bitboards without multiplying by Piotr Cichy, CCC, August 07, 2009
  7. Zach's tricky 32-bit approach by Zach Wegner, Winboard Forum, August 22, 2006
  8. 32-bit Magic experiments by Andrew Fan, Winboard Forum, December 03, 2009
  9. Re: How to reduce the "bits" used in a magic number by Grant Osborne, CCC, July 04, 2008
