Problem Analysis - leungjunbob/AIP-project GitHub Wiki

Problem Analysis

We are developing agents for the game Splendor. Given the current game state, an agent returns the action selected by its algorithm, aiming to choose the action that wins the game fastest. We consider only the two-player game, which is won once a player has 15 or more points at the end of a round.

This is a multi-agent problem. The game needs at least two players and supports up to four.
Transitions are partly deterministic and partly stochastic. For example, when an agent buys a card in a given game state, the reward it receives is fixed, but the new card dealt to the board to replace the purchased one is random. The game therefore has a deterministic reward system and a stochastic transition system.
The game is partially observable. Each agent's score, cards and gems, including the opponent's, are always accessible in every game state, but the next card in the deck is not. The deck is shuffled at the beginning of the game, and the order of its cards cannot be accessed.
The environment is dynamic. Because there are at least two agents, the state an agent faces on its current turn differs from the one it saw on its previous turn, since the opponent's actions change the environment in between.
This is a single-goal problem. The ultimate goal is to reach 15 points, which are accumulated by buying cards and attracting nobles.
With two players the game is competitive, since there can be only one winner. With more than two players, cooperative behaviour may arise: players can cooperate to block the player who is closest to winning.
The game has hard constraints: only a limited set of actions is legal in any given game state.
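The split between a deterministic reward and a stochastic transition can be illustrated with a small sketch. It is not the project's implementation: the `(code, points)` card tuples, the `buy_card` function and its parameters are all hypothetical, and the hidden deck order is modelled as a uniform random draw from the remaining cards.

```python
import random


def buy_card(board, deck, index, rng):
    """Buy board[index]; return (points_gained, new_board, new_deck).

    Hypothetical model: every card is a (code, points) tuple.
    """
    code, points = board[index]  # deterministic reward: fixed by the card itself
    new_board = list(board)
    new_deck = list(deck)
    if new_deck:
        # The agent cannot see the deck order, so from its point of view the
        # replacement is a random draw from the cards still in the deck.
        new_board[index] = new_deck.pop(rng.randrange(len(new_deck)))
    else:
        # Nothing left to refill the slot.
        del new_board[index]
    return points, new_board, new_deck


if __name__ == "__main__":
    rng = random.Random()
    board = [("card_a", 1), ("card_b", 0)]
    deck = [("card_c", 2), ("card_d", 3)]
    print(buy_card(board, deck, 0, rng))
```

The points returned are the same on every run, while the card that fills the empty slot can differ, which is why a search-based agent has to treat the successor state as uncertain.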

Several types of action exist, but not all of them are available in every game state. For example, a player who does not have enough gems in the current game state cannot choose to buy a card. The action types are:

Action Type	Description
card	sub-actions under card
reserve	reserve a card to prevent the other player from purchasing it, and give 1 yellow gem to the player
buy	buy a particular card: add the card to the player's game state and deduct gems according to the cost of the card (a yellow gem can be used as any colour)
noble	accompanies a buy action: if the gems provided by the player's cards satisfy a noble's requirement, the player obtains the noble automatically

collect	sub-actions under collect
collected_gems	obtain gems in one of two ways (1. three gems of different colours; 2. two gems of the same colour, if and only if no gems of that colour have been taken from the bank yet)
returned_gems	accompanies collected_gems and any other action that gains gems: a player holding more than 10 gems must return the excess to the bank

pass	no action is possible in the current game state, so the player passes the current round
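The two collect rules and the 10-gem limit from the table can be sketched as follows. This is a minimal illustration, not the framework's code: `collect_options`, `gems_to_return`, `FULL_STACK` and `MAX_GEMS` are hypothetical names. It assumes that yellow gems are gained only by reserving, and it models "no gems of that colour taken yet" as the bank still holding 4 gems of that colour, the starting stack in a two-player game.

```python
from itertools import combinations

# Colours that can be collected directly (assumption: yellow comes only from reserving).
COLOURS = ['black', 'red', 'green', 'blue', 'white']
FULL_STACK = 4   # assumption: starting gems per colour in a two-player game
MAX_GEMS = 10    # holding limit from the table above


def collect_options(bank):
    """List the collect choices described in the table for a given bank."""
    options = []
    available = [c for c in COLOURS if bank.get(c, 0) > 0]
    # Rule 1: three gems of different colours.
    for trio in combinations(available, 3):
        options.append({c: 1 for c in trio})
    # Rule 2: two gems of one colour, only while that colour's stack is untouched.
    for c in COLOURS:
        if bank.get(c, 0) >= FULL_STACK:
            options.append({c: 2})
    return options


def gems_to_return(held, collected):
    """Number of gems that must go back to the bank after collecting."""
    total = sum(held.values()) + sum(collected.values())
    return max(0, total - MAX_GEMS)


if __name__ == "__main__":
    bank = {'black': 4, 'red': 4, 'green': 0, 'blue': 1, 'white': 0, 'yellow': 5}
    for option in collect_options(bank):
        print(option)
```

The sketch covers only the two cases listed in the table; how the framework handles a bank with fewer than three colours left is not modelled here.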

More rules and descriptions of the game are available at https://en.boardgamearena.com/doc/Tips_splendor

The following commands are useful for obtaining information during the game, such as the game state and player scores. Gems and cards are displayed in this colour order: colour = ['black', 'red', 'yellow', 'green', 'blue', 'white']

Command	Description
SplendorGameRule(num_of_agent=2)	set up the game with 2 players
SplendorGameRule.calScore(gameState, agent)	obtain the score of agent in this gameState
gameState.agents[agent].cards	return a dictionary of cards in the format {colour1 : ['card_code'], colour2 : [], ...}
gameState.agents[agent].gems	return a dictionary of gems in the format {colour1 : 1, colour2 : 0, ...}
gameState.board.nobles	return a dictionary of nobles in the format {noble1_code : cost1, ...}, where each cost is expressed in gems provided by cards
gameState.board.dealt_list()	return a list of the cards available on the board
SplendorGameRule.getLegalActions(gameState, agent)	return a list of the actions available to agent in the given game state
SplendorGameRule.generateSuccessor(gameState, action, agent)	return the game state that results when agent takes the selected action in the given game state
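As an illustration of how the commands above might be combined, here is a minimal sketch of a one-step greedy agent. It is not the project's agent: `select_greedy_action` is a hypothetical helper that takes a rule object exposing the `getLegalActions`, `generateSuccessor` and `calScore` methods listed in the table. Whether `generateSuccessor` modifies the state passed to it is not stated here, so the sketch assumes it may and works on deep copies.

```python
import copy


def select_greedy_action(game_rule, game_state, agent_id):
    """Return the legal action whose successor state scores highest for agent_id.

    Returns None if there are no legal actions.
    """
    best_action, best_score = None, float('-inf')
    for action in game_rule.getLegalActions(game_state, agent_id):
        # Assumption: generateSuccessor may modify the state it is given,
        # so each action is tried on a deep copy of the current state.
        successor = game_rule.generateSuccessor(
            copy.deepcopy(game_state), action, agent_id)
        score = game_rule.calScore(successor, agent_id)
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

With the framework available, a call would look something like `select_greedy_action(SplendorGameRule(num_of_agent=2), gameState, agent)`. Because the replacement card is stochastic and hidden, a one-step lookahead like this sees only one possible successor per action, so it is best read as a baseline rather than a strong strategy.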