This page describes our three-person team's approach to Project 2 of COMP90054. In this project, we design Pacman agents to compete against other teams' agents in a Pacman capture game; the winner is the team whose agents eat more food within the time limit.
We implemented three techniques: Monte Carlo Tree Search (MCTS), Approximate Q-learning (AQ), and Value Iteration (VI). After comparing their performance, we selected VI as the most suitable approach (for our implementations specifically; this does not mean it is the best approach in every situation).
These techniques were chosen based on two criteria: expected performance and implementation difficulty. We believed VI would be efficient and not hard to implement, while MCTS and Approximate Q-learning were expected to perform strongly.[1][2]
After comparing the candidates (MCTS, AQ, VI, and MCTS combined with AQ), we decided to submit the VI approach with the second strategy as our final version.
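To make the selected approach concrete, the following is a minimal sketch of the value iteration update described in [3]. The MDP interface here (`get_states`, `get_actions`, `get_transitions`, `get_reward`) and the parameter defaults are illustrative assumptions, not our submitted agent code.

```python
# Minimal value iteration sketch; the `mdp` interface below is an
# illustrative assumption, not our actual agent code.

def value_iteration(mdp, gamma=0.9, iterations=100):
    """Compute V(s) by repeatedly applying the Bellman optimality update."""
    V = {s: 0.0 for s in mdp.get_states()}
    for _ in range(iterations):
        new_V = {}
        for s in mdp.get_states():
            actions = mdp.get_actions(s)
            if not actions:          # terminal state: value stays 0
                new_V[s] = 0.0
                continue
            # V(s) = max_a sum_{s'} P(s'|s,a) * (R(s,a,s') + gamma * V(s'))
            new_V[s] = max(
                sum(p * (mdp.get_reward(s, a, s2) + gamma * V[s2])
                    for s2, p in mdp.get_transitions(s, a))
                for a in actions
            )
        V = new_V
    return V
```

The agent can then act greedily with respect to V, picking in each state the action whose one-step lookahead value is highest.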
This version is not perfect: it cannot guarantee a win against all the staff teams every time, and it sometimes loses to other student teams.
There are still many ways to improve all of these approaches, such as designing better features and rewards, increasing the training time and the number of iterations, and combining different techniques. We also believe that adding dead-end (corner) detection or more rule-based selection (if-else branches) would improve their performance.
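As an example of where the features and rewards mentioned above enter, the core of Approximate Q-learning is a single weight-update rule. This is a minimal sketch under assumed inputs; the dictionaries and parameter names are hypothetical, not our submitted code.

```python
# Minimal Approximate Q-learning update sketch; `weights` and `features`
# are hypothetical dictionaries, not our actual feature extractor.

def update_weights(weights, features, reward, q_sa, max_q_next,
                   alpha=0.2, gamma=0.9):
    """Shift each weight toward the TD target.

    Q(s, a) is approximated as sum_i w_i * f_i(s, a), so the quality of
    the features and rewards directly bounds what the agent can learn.
    """
    td_error = (reward + gamma * max_q_next) - q_sa
    for name, value in features.items():
        weights[name] = weights.get(name, 0.0) + alpha * td_error * value
    return weights

# Example call with made-up feature values:
# update_weights({}, {"dist-to-food": -0.5}, reward=1.0, q_sa=0.0, max_q_next=2.0)
```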
All three of us contributed throughout the project. We did not work in isolation but communicated with each other frequently; every achievement is the result of our joint effort.
[1] G. Chaslot, S. Bakkes, I. Szita, and P. Spronck, "Monte-Carlo Tree Search: A New Framework for Game AI," Universiteit Maastricht / MICC, P.O. Box 616, NL-6200 MD Maastricht, The Netherlands.
[2] F. S. Melo, "Convergence of Q-learning: A Simple Proof."
[3] T. Miller, COMP90054 AI Planning for Autonomy, Lectures 8–9: Markov Decision Processes (MDPs), The University of Melbourne.
- Zichun Zhu - [email protected] - 784145
- Xinmiao Zhang - [email protected] - 990601
- Zhuorui Cai - [email protected] - 1003142