Machine Learning (ML) - MsCornell/2425Repo GitHub Wiki

The Tic Tac Toe game includes a Machine Learning (ML) service designed to enhance the gameplay experience by suggesting optimal moves for the opponent. The service is hosted on Azure Functions and is called directly from the client application during gameplay.

Gameplay Integration with Machine Learning

During gameplay, when a player makes a move, the ML service is called to determine the best possible response for the opponent. This allows for a more challenging and dynamic opponent, improving the overall game experience. The ML model analyzes the current state of the game board and suggests the next optimal move based on its strategy and the selected difficulty level.

Workflow Overview

  1. Player Makes a Move: The player submits their move by interacting with the game board on the client application.
  2. Board State Sent to ML Service: The current state of the game board (positions of "X" and "O") is sent to the Azure Functions host for analysis.
  3. ML Service Processes the Request: The ML service, hosted on Azure Functions, processes the current board state and predicts the most strategic move for the opponent.
  4. Response Sent Back to Client: The predicted move is sent back to the client application.
  5. Opponent Move Displayed: The client updates the game board with the opponent’s move, and the game continues.
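The request/response handling on the client side can be sketched as two small helpers. This is an illustrative sketch, not the repository's actual client code; the board representation (a list of nine cells) and the helper names are assumptions.

```python
import json

# Hypothetical client-side helpers; the board is assumed to be a list of
# nine cells, each "X", "O", or None for empty.
def build_ml_request(board, difficulty):
    """Serialize the board into the request body the ML service expects."""
    board_state = "".join(cell if cell else "_" for cell in board)
    return json.dumps({"board_state": board_state,
                       "difficulty_level": difficulty})

def apply_ml_response(board, response_text, opponent_mark="O"):
    """Place the opponent's mark; next_move is 1-based, the list 0-based."""
    next_move = json.loads(response_text)["next_move"]
    board[next_move - 1] = opponent_mark
    return board
```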

Hosted ML Service using Azure Functions

The ML service is hosted using Azure Functions, independent of the game client. This separation ensures that the client remains lightweight and focused on managing the user interface, while the heavy computation required for decision-making is offloaded to the ML model.

  • Hosting: The ML model is hosted as an API service that listens for requests from the client and returns a suggested move based on the current board state and difficulty level.
  • Model: The ML model utilizes the minimax algorithm to determine the best move for any given board configuration.
  • Integration: Azure Functions is a serverless compute service from Microsoft, allowing for seamless integration into the game and quick responses to the client’s requests.
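As a rough sketch of the hosted function's core (a hypothetical reduction, not the repository's actual Azure Functions code), the handler parses the request body and returns a 1-based move. In the deployed service this logic sits inside an HTTP-triggered function, and the placeholder policy below is replaced by the minimax-backed model:

```python
import random

def suggest_move(body: dict) -> dict:
    """Hypothetical core of the HTTP-triggered function: read the board
    state and difficulty, return a JSON-ready response dict."""
    board_state = body["board_state"]          # 9-character string
    difficulty = body.get("difficulty_level", "hard")
    empty_cells = [i for i, c in enumerate(board_state) if c == "_"]
    # Placeholder policy: a random legal move stands in for the real
    # minimax-backed model; the response uses 1-based positions.
    move = random.choice(empty_cells)
    return {"next_move": move + 1}
```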

Client-ML Interaction Example

Here is a simple flow to demonstrate how the client interacts with the ML service:

```mermaid
sequenceDiagram
    Client->>+MLService: Send current board state
    MLService-->>-Client: Return next optimal move
    Client->>Client: Display opponent's move
```

Machine Learning Endpoint

Request body: { "board_state": <string, length: 9>, "difficulty_level": <string> }

"board_state" <string, length: 9>: for the small board, each character in the string is either X, O, or _ for empty; for the large board, each character is either X, O, _ for empty, or C for a tie.

"difficulty_level" <string>: "easy", "medium", or "hard".

Response body: { "next_move": <integer, 1-9> }
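A request validator matching this contract might look like the following sketch (illustrative names; the service's actual validation code is not shown on this page):

```python
VALID_DIFFICULTIES = {"easy", "medium", "hard"}
SMALL_BOARD_CHARS = set("XO_")
LARGE_BOARD_CHARS = set("XO_C")   # large board also allows C for a tie

def validate_request(body: dict, large_board: bool = False) -> bool:
    """Check a request body against the endpoint contract above."""
    state = body.get("board_state")
    difficulty = body.get("difficulty_level")
    if not isinstance(state, str) or len(state) != 9:
        return False
    allowed = LARGE_BOARD_CHARS if large_board else SMALL_BOARD_CHARS
    if not set(state) <= allowed:
        return False
    return difficulty in VALID_DIFFICULTIES
```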

Difficulty Level of the Game

There are three difficulty levels provided by the machine learning model: easy, medium, and hard.

  • Easy: The ML model returns a random move.
  • Medium: The ML model returns a random move (50% chance) or the optimal move suggested by the algorithm (50% chance).
  • Hard: The ML model returns the optimal move suggested by the algorithm.
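The three policies can be sketched as a single dispatch function; `optimal_move` is a stand-in callable for the algorithm's result, stubbed here for illustration:

```python
import random

def choose_move(board_state: str, difficulty: str, optimal_move) -> int:
    """Pick a 1-based move according to the difficulty policy.

    optimal_move: callable returning the algorithm's best 1-based move;
    stubbed here, backed by minimax in the real service.
    """
    legal = [i + 1 for i, c in enumerate(board_state) if c == "_"]
    if difficulty == "easy":
        return random.choice(legal)
    if difficulty == "medium":
        # 50% random move, 50% optimal move.
        if random.random() < 0.5:
            return random.choice(legal)
        return optimal_move(board_state)
    return optimal_move(board_state)
```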

Final Algorithm - Minimax

In Minimax, the two players are the maximizer and the minimizer. In our case, the maximizer is the AI player and the minimizer is the human player. The maximizer tries to obtain the highest score possible, while the minimizer tries to do the opposite and obtain the lowest score possible. Every board state has a value associated with it, and the algorithm returns the move that leads to the highest achievable value for the maximizer. Minimax is easy to implement and, for a game tree as small as Tic Tac Toe's, carries little computational overhead.
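A self-contained sketch of Minimax over the endpoint's 9-character board encoding, assuming the maximizer plays "O" and moves are numbered 1-9 (the repository's actual implementation may differ):

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(state):
    """Return "X" or "O" if a line is complete, else None."""
    for a, b, c in WIN_LINES:
        if state[a] != "_" and state[a] == state[b] == state[c]:
            return state[a]
    return None

def minimax(state: str, maximizing: bool) -> int:
    """Value of `state` for the maximizer "O": +1 win, -1 loss, 0 draw."""
    w = winner(state)
    if w == "O":
        return 1
    if w == "X":
        return -1
    if "_" not in state:
        return 0
    mark = "O" if maximizing else "X"
    values = []
    for i, c in enumerate(state):
        if c == "_":
            child = state[:i] + mark + state[i + 1:]
            values.append(minimax(child, not maximizing))
    return max(values) if maximizing else min(values)

def best_move(state: str) -> int:
    """Return the 1-based move with the highest minimax value for "O"."""
    best, best_value = None, -2
    for i, c in enumerate(state):
        if c == "_":
            value = minimax(state[:i] + "O" + state[i + 1:], False)
            if value > best_value:
                best, best_value = i + 1, value
    return best
```

Because only wins, losses, and draws are scored, the search naturally takes an immediate win when one exists and blocks the opponent's imminent win otherwise.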

Possible Models To Consider

Monte Carlo Tree Search (MCTS)

MCTS uses Monte Carlo simulation to accumulate value estimates that guide the search towards highly rewarding trajectories in the search tree. The algorithm consists of four steps: 1) selection, 2) expansion, 3) roll-out, and 4) value update. This algorithm effectively balances exploitation and exploration using the Upper Confidence Bound (UCB).

(MCTS illustration: image from Reinforcement Learning: An Introduction by Sutton & Barto.)

Advantages:

  1. Computationally efficient because it does not require a complete game tree.

Disadvantages:

  1. Performance can be inconsistent if not enough simulations are rolled out.
  2. May struggle in games with a strong adversarial component.
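As a simplified illustration of the roll-out step only (not a full MCTS with UCB-guided selection and a persistent tree), the sketch below scores each legal move by random playouts and picks the best average result; all names are illustrative:

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(state):
    for a, b, c in WIN_LINES:
        if state[a] != "_" and state[a] == state[b] == state[c]:
            return state[a]
    return None

def rollout(state, to_move):
    """Play uniformly random moves to the end; return the winner or None."""
    state = list(state)
    while winner(state) is None and "_" in state:
        i = random.choice([j for j, c in enumerate(state) if c == "_"])
        state[i] = to_move
        to_move = "X" if to_move == "O" else "O"
    return winner(state)

def monte_carlo_move(state, mark="O", playouts=200):
    """1-based move with the best average roll-out result for `mark`."""
    opponent = "X" if mark == "O" else "O"
    best, best_score = None, float("-inf")
    for i, c in enumerate(state):
        if c != "_":
            continue
        child = state[:i] + mark + state[i + 1:]
        score = 0
        for _ in range(playouts):
            w = winner(child) or rollout(child, opponent)
            score += 1 if w == mark else (-1 if w == opponent else 0)
        if score > best_score:
            best, best_score = i + 1, score
    return best
```

As the disadvantages above note, results fluctuate when too few playouts are run per move.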

Adversarial Search

Adversarial Search uses the minimax formulation. It evaluates the best move by assuming that the opponent plays optimally to minimize the player's gain.

A model is trained to approximate the value of each node in a low-depth tree. Alpha/Beta pruning can be employed to optimize the algorithm by eliminating branches that will not affect the final decision.

Advantages:

  1. Provides optimal solutions in games with deterministic outcomes.

Disadvantages:

  1. Computationally expensive.
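Alpha/Beta pruning extends plain minimax with a small amount of bookkeeping. The sketch below (illustrative, reusing the 9-character board encoding from the endpoint section, with "O" as the maximizer) cuts off branches that cannot affect the final decision:

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(state):
    for a, b, c in WIN_LINES:
        if state[a] != "_" and state[a] == state[b] == state[c]:
            return state[a]
    return None

def alphabeta(state, maximizing, alpha=-2, beta=2):
    """Minimax value for the maximizer "O", pruning branches that cannot
    change the decision (game values are only -1, 0, or +1)."""
    w = winner(state)
    if w == "O":
        return 1
    if w == "X":
        return -1
    if "_" not in state:
        return 0
    mark = "O" if maximizing else "X"
    value = -2 if maximizing else 2
    for i, c in enumerate(state):
        if c != "_":
            continue
        child = state[:i] + mark + state[i + 1:]
        v = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            value = max(value, v)
            alpha = max(alpha, value)
        else:
            value = min(value, v)
            beta = min(beta, value)
        if alpha >= beta:   # remaining siblings cannot change the outcome
            break
    return value
```

Because Tic Tac Toe values are only -1, 0, or +1, cutoffs trigger frequently and the full game tree is rarely explored.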