AI Agents - WolfgangKonen/GBG GitHub Wiki
Agents
Currently implemented agents
- Max-N agent
- Expectimax-N agent
- MC (Monte Carlo) agent
- MCTS (Monte Carlo Tree Search) agent
- MCTS Expectimax agent
- TD (Temporal Difference) agent
- TD n-tuple agent
- SARSA n-tuple agent
- Q-Learning n-tuple agent
- HumanAgent
- RandomAgent
Additionally, some games have specific agents, mainly for test and evaluation purposes:
- Edax, HeurPlayer, BenchPlayer (for game Othello only)
- BoutonAgent (for game Nim only)
- AlphaBetaAgent (for game Connect-4 only)
- BasicStrategyBlackJackAgent (for game BlackJack only)
- DAVI2Agent, DAVI3Agent (for game RubiksCube only)
Note that Edax in GBG is currently only available on Windows-based systems, since it involves a special EXE file.
Wrappers
Each agent can be wrapped during any of {train, play, eval, compete, inspect} with
- an n-ply look-ahead tree search (either deterministic: Max-N, or non-deterministic: Expectimax-N)
- an MCTS-Wrapper (deterministic) or MCTSE-Wrapper (non-deterministic), inspired by AlphaZero
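To illustrate what an n-ply look-ahead wrapper does, here is a minimal Python sketch for the deterministic (Max-N) case. It is not GBG's Java implementation: the toy Nim-like game and the names `moves`, `advance`, `max_n`, `wrapped_move` and `uninformed` are all hypothetical and chosen only for this example. The wrapped agent is represented by a value function that returns one value per player; the wrapper searches n plies ahead and falls back to that value function at the search horizon.

```python
from typing import Callable, List, Tuple

# Toy game (assumption for this sketch): one heap, a move removes 1-3 items,
# the player who removes the last item wins.
State = Tuple[int, int]          # (heap size, index of player to move)
Values = Tuple[float, float]     # one value per player

def moves(state: State) -> List[int]:
    heap, _ = state
    return [k for k in (1, 2, 3) if k <= heap]

def advance(state: State, k: int) -> State:
    heap, player = state
    return (heap - k, 1 - player)

def max_n(state: State, n_ply: int, value_fn: Callable[[State], Values]) -> Values:
    """Per-player values of `state` after looking n_ply plies ahead."""
    heap, player = state
    if heap == 0:
        # Terminal: the previous mover took the last item and wins.
        winner = 1 - player
        return tuple(1.0 if p == winner else -1.0 for p in (0, 1))
    if n_ply == 0:
        # Search horizon: ask the wrapped agent for its estimate.
        return value_fn(state)
    # Max-N rule: the player to move maximizes its own component.
    children = (max_n(advance(state, k), n_ply - 1, value_fn) for k in moves(state))
    return max(children, key=lambda v: v[player])

def wrapped_move(state: State, n_ply: int, value_fn: Callable[[State], Values]) -> int:
    """Best move for the player to move, using an n_ply look-ahead (n_ply >= 1)."""
    player = state[1]
    return max(moves(state),
               key=lambda k: max_n(advance(state, k), n_ply - 1, value_fn)[player])

def uninformed(state: State) -> Values:
    """Stand-in for an untrained agent: every non-terminal state looks equal."""
    return (0.0, 0.0)
```

With the uninformed value function, a 1-ply search from a heap of 6 cannot tell the moves apart, while `wrapped_move((6, 0), 3, uninformed)` looks far enough ahead to find the winning move of removing 2 items. The non-deterministic (Expectimax-N) case would additionally average over chance outcomes, which this sketch omits.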
Activate the Max-N-Wrapper by selecting Wrapper Mode MaxNWrapper in Tab Wrapper pars and by setting Wrapper nPly to a value larger than 0. Caution: larger values for Wrapper nPly may quickly lead to very long execution times, because the number of positions searched grows exponentially with nPly, at a rate set by the branching factor of your game.
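The cost caution above can be made concrete with a rough estimate. The following Python sketch assumes a constant branching factor `b` and no pruning or caching, which is a simplification; the function names are hypothetical and not part of GBG.

```python
def leaf_count(b: int, n_ply: int) -> int:
    """Positions at the search horizon of a full n_ply look-ahead."""
    return b ** n_ply

def node_count(b: int, n_ply: int) -> int:
    """All positions below the root, plies 1..n_ply."""
    return sum(b ** d for d in range(1, n_ply + 1))
```

For a game with up to 7 moves per position, such as Connect-4, `leaf_count(7, 4)` is 2401 while `leaf_count(7, 8)` is already 5764801, and each of these positions needs one evaluation by the wrapped agent for every single move decision.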
Activate the MCTS-Wrapper by selecting Wrapper Mode MCTSNWrapper in Tab Wrapper pars and by setting suitable values for Wrapper MCTS (# iterations), PUCT for WrapM and Depth for WrapM.
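As background for the PUCT setting, here is a minimal Python sketch of the PUCT selection rule as used in AlphaZero-style MCTS. It assumes that PUCT for WrapM plays the role of the exploration constant `c_puct`; that mapping, and the function name `puct_select`, are assumptions of this sketch and not taken from GBG's code.

```python
import math
from typing import Sequence

def puct_select(q: Sequence[float], prior: Sequence[float],
                visits: Sequence[int], c_puct: float) -> int:
    """Index of the action maximizing Q + c_puct * P * sqrt(sum N) / (1 + N)."""
    total = sum(visits)

    def score(a: int) -> float:
        # Exploitation term (mean value so far) plus exploration bonus,
        # which is large for actions with a high prior and few visits.
        return q[a] + c_puct * prior[a] * math.sqrt(total) / (1 + visits[a])

    return max(range(len(q)), key=score)
```

With `c_puct = 0` the rule is purely greedy in `q`; larger values shift the search toward rarely visited actions. For example, `puct_select([0.5, 0.4], [0.5, 0.5], [10, 0], 1.0)` picks the unvisited action 1, whereas the same call with `c_puct = 0.0` picks action 0.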