4. Analyzing The Results The Hard Way - AgileBitFlipper/triominos GitHub Wiki
Now that we have played the games and recorded all of the significant events during gameplay, it is time to digest this data and start showing some interesting results. Eventually, this data will be used to help guide gameplay. This is the AI or machine intelligence aspect of this exercise.
During gameplay, we recorded the significant events for each Game played. Each Game has a corresponding 'events-.ser' event file that holds each recorded event in POJO form. These Event files can be scanned for significant events that we can start to collect and correlate. For example, what if we wanted to see how many times each Player has won a game?
So, the first step is to construct a class that helps us find the Event files, read them into lists, and then scan those Events for the specific ones that we wish to examine.
Note: This is probably the least efficient way of doing this; a database would be much more efficient. However, the goal of this exercise is to show how inefficient manual processing of events can be when better tools exist for processing large amounts of data. But, we must learn to walk before we can fly.
Lucky for us, we already have just such a class: the EventManager class. We can extend this class to include the methods needed to process our serialized POJO files. The following code snippets show how we can use the EventManager to get back every Event for every Game we've played.
    public List<String> getAllEventDataFiles() {
        List<String> eventFiles = new ArrayList<String>();
        File index = new File(EVENT_OBJECT_PATH);
        // list() returns null if the path is missing or is not a directory
        String[] entries = index.list();
        if (entries == null)
            return eventFiles;
        for (String s : entries) {
            File currentFile = new File(index.getPath(), s);
            String filename = currentFile.getName();
            // todo: can we be smarter about this with a file system call that takes a mask?
            if (filename.startsWith(EVENT_OBJECT_FILE_PREFIX) &&
                filename.endsWith(EVENT_OBJECT_FILE_EXTENSION))
                eventFiles.add(currentFile.getAbsolutePath());
        }
        return eventFiles;
    }
This first method, getAllEventDataFiles(), is pretty straightforward. It builds a List of String values that represent the absolute path of every Event data file. The Event data folder, logs, is listed, and each entry is checked, one at a time, to see if the filename starts with "events" and ends with "ser". This makes sure we only try to process our serialized POJO files. Each match is added to the list we return.
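The todo in the method above asks whether a file system call can take a mask. One possible answer, sketched here under assumptions, is the glob overload of Files.newDirectoryStream from java.nio.file. The class name EventFileLister and the pattern "events*.ser" are illustrative guesses based on the prefix and extension described above; they are not taken from the project.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class EventFileLister {
    // Assumed pattern: file names starting with "events" and ending with ".ser"
    static final String EVENT_FILE_GLOB = "events*.ser";

    /** Returns the absolute paths of the files in dir whose names match the glob, sorted. */
    public static List<String> listEventFiles(Path dir) {
        List<String> eventFiles = new ArrayList<>();
        // A missing folder simply yields an empty list
        if (!Files.isDirectory(dir)) {
            return eventFiles;
        }
        // The glob is matched against each entry's file name by the directory stream
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, EVENT_FILE_GLOB)) {
            for (Path p : stream) {
                eventFiles.add(p.toAbsolutePath().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Directory iteration order is unspecified, so sort for repeatable output
        Collections.sort(eventFiles);
        return eventFiles;
    }

    public static void main(String[] args) {
        // "logs" is the event folder named in the text; pass another path to override it
        Path dir = Paths.get(args.length > 0 ? args[0] : "logs");
        for (String f : listEventFiles(dir)) {
            System.out.println(f);
        }
    }
}
```

This moves the prefix and extension checks into the pattern, so the loop body only collects paths.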
    public List<Event> getAllEventsForDataFile(String dataFile) {
        // The event list to return
        List<Event> evtList = new ArrayList<Event>();
        // try-with-resources closes both streams, even when an exception is thrown
        try (FileInputStream fis = new FileInputStream(dataFile);
             ObjectInputStream ois = new ObjectInputStream(fis)) {
            try {
                Event evt;
                // readObject() signals the end of the stream by throwing EOFException
                while ( ( evt = (Event) ois.readObject() ) != null ) {
                    evtList.add(evt);
                }
            } catch (EOFException eof) {
                // Expected: this is how a fully read event log ends
                Log.Error(String.format(" Reading of event log %s is complete.", dataFile));
            }
        } catch (FileNotFoundException e) {
            Log.Error(String.format(" Error event serialization file, %s, not found", dataFile));
        } catch (IOException e) {
            Log.Error(" Error initializing event stream:\n" + e.getMessage());
        } catch (ClassNotFoundException cnfe ) {
            Log.Error(" Class not found in event stream:\n" + cnfe.getMessage());
        }
        return evtList ;
    }
This second method takes one of those Event files and opens it to extract the serialized POJO Event records. The list of Events is returned to the caller. This is the list that we will process, looking for the Events that we deem important for our analysis. Note: Again, you can see how inefficient this process is, and how a database would be more appropriate for this type of data analysis. This is just one more reason to move the analysis to an appropriate framework going forward. But, we are still going to continue this way to illustrate our point.
Now that we have the complete list of Events per Game, we can start to sample these Event lists for key event types. In this case, we will use the WIN_A_GAME EventType to further our discussion. This EventType, WIN_A_GAME, actually contains the game number, the round, and the player information. This is key since we want to identify which player won each game, and aggregate that information across all of the games that have been played.
We start by going through each Event and acting only on those with an EventType of WIN_A_GAME. When such an Event is found, the Player is extracted from it and added to a HashMap<Player,Integer>, with the Player as the key and the number of times that Player has won a game as the value. For Player to work as a key in a HashMap, we have to override the equals() method and the hashCode() method. This is due to the complexity of the Player object: by default, the hash computed for a Player depends on more than just the name. We are only concerned with how many times 'Player A' has won a game, not with any single instance of 'Player A'. The following code snippets show how to override these methods and how they impact our results.
    /**
     * We need this for HashMap to work correctly for a Player.
     * The name is the only really relevant field for a player of a game.
     * @return hash of the Player's name
     */
    @Override
    public int hashCode() {
        return name.hashCode();
    }
    /**
     * We need this for HashMap to work when using a Player as a key
     * @param o - Object to compare this against
     * @return true if names are the same, false otherwise
     */
    @Override
    public boolean equals(Object o) {
        return (o instanceof Player) && ((Player) o).getName().equals(name);
    }
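To see why these overrides matter, here is a small self-contained sketch. It assumes that the Player attached to each deserialized Event is a separate object, which is what reading each record back from a file would normally produce. NamedPlayer and IdentityPlayer are hypothetical stand-ins, not the project's Player class.

```java
import java.util.HashMap;
import java.util.Map;

public class PlayerKeyDemo {

    /** Stand-in for Player with name-based equals() and hashCode(). */
    public static class NamedPlayer {
        private final String name;
        public NamedPlayer(String name) { this.name = name; }
        public String getName() { return name; }

        @Override
        public int hashCode() { return name.hashCode(); }

        @Override
        public boolean equals(Object o) {
            return (o instanceof NamedPlayer) && ((NamedPlayer) o).getName().equals(name);
        }
    }

    /** Same data, but relying on Object's identity-based equals() and hashCode(). */
    public static class IdentityPlayer {
        final String name;
        public IdentityPlayer(String name) { this.name = name; }
    }

    /** Records one win each for two separate instances that share a name. */
    public static int keysWithOverrides() {
        Map<NamedPlayer, Integer> wins = new HashMap<>();
        for (int i = 0; i < 2; i++) {
            NamedPlayer p = new NamedPlayer("Player A");
            wins.put(p, wins.containsKey(p) ? wins.get(p) + 1 : 1);
        }
        return wins.size();
    }

    /** The same tally without the overrides: each instance becomes its own key. */
    public static int keysWithoutOverrides() {
        Map<IdentityPlayer, Integer> wins = new HashMap<>();
        for (int i = 0; i < 2; i++) {
            IdentityPlayer p = new IdentityPlayer("Player A");
            wins.put(p, wins.containsKey(p) ? wins.get(p) + 1 : 1);
        }
        return wins.size();
    }

    public static void main(String[] args) {
        System.out.println("With overrides:    " + keysWithOverrides() + " key(s)");    // 1 key(s)
        System.out.println("Without overrides: " + keysWithoutOverrides() + " key(s)"); // 2 key(s)
    }
}
```

With the overrides, both wins land on a single 'Player A' entry. Without them, the map ends up with two entries that each hold a count of one.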
Now that we can be sure that a HashMap<Player,Integer> will work for the Player, we can set up the method that performs the aggregation. The following code demonstrates how we can filter for the EventType that we need and use each matching Event to record the proper data in the HashMap. You can see how the code looks at every Event from every POJO event file and stops only on EventType.WIN_A_GAME events. The HashMap is checked to see if the Player from this Event has already been added. If so, the win count is incremented by one. If there isn't an entry for this Player in the HashMap, a new one is created with an Integer value of one.
    public String gamesWonByPlayers() {
        HashMap<Player, Integer> playersThatWon = new HashMap<Player, Integer>();
        // Event files
        List<String> eventFiles = EventManager.getInstance().getAllEventDataFiles();
        for ( String eventFile : eventFiles ) {
            // Events in file
            List<Event> events = EventManager.getInstance().getAllEventsForDataFile(eventFile);
            for ( Event event : events ) {
                // If we have a WIN_A_GAME event
                if ( event.type == EventType.WIN_A_GAME ) {
                    Player player = event.player ;
                    if ( playersThatWon.containsKey(player) ) {
                        // Player already seen: add one to the running count
                        playersThatWon.put(player, playersThatWon.get(player) + 1 ) ;
                    } else {
                        // First win for this Player
                        playersThatWon.put(player, 1);
                    }
                }
            }
        }
        ...
    }
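As a side note, the containsKey()/put() branching can be condensed with Map.merge, available since Java 8. The sketch below is a simplified illustration rather than the project's code: the nested Event and EventType types are stand-ins, the OTHER constant is hypothetical, and player names are used as keys to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WinTally {

    /** Stand-in event types; only WIN_A_GAME comes from the text, OTHER is hypothetical. */
    public enum EventType { WIN_A_GAME, OTHER }

    /** Stand-in for the serialized Event: just a type and the player's name. */
    public static class Event {
        public final EventType type;
        public final String playerName;
        public Event(EventType type, String playerName) {
            this.type = type;
            this.playerName = playerName;
        }
    }

    /** Counts WIN_A_GAME events per player name and ignores every other event type. */
    public static Map<String, Integer> tallyWins(List<Event> events) {
        Map<String, Integer> wins = new HashMap<>();
        for (Event event : events) {
            if (event.type == EventType.WIN_A_GAME) {
                // Inserts 1 for a new key, otherwise adds 1 to the existing count
                wins.merge(event.playerName, 1, Integer::sum);
            }
        }
        return wins;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(EventType.WIN_A_GAME, "Player A"));
        events.add(new Event(EventType.OTHER, "Player A"));
        events.add(new Event(EventType.WIN_A_GAME, "Player B"));
        events.add(new Event(EventType.WIN_A_GAME, "Player A"));
        Map<String, Integer> wins = tallyWins(events);
        System.out.println("Player A: " + wins.get("Player A")); // Player A: 2
        System.out.println("Player B: " + wins.get("Player B")); // Player B: 1
    }
}
```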
Now that we have a HashMap with each Player and their total wins accumulated, we can construct the String that will be used to provide this information. In this case, we use a StringBuilder to help speed up construction of the return String. For each Player in the HashMap, using keySet() to return the Set of Players, the number of wins is retrieved, along with the name, to build the winning string.
Note: It would be more useful to sort the result set so that the Player with the most wins appears at the top, with the remaining winners in descending order of wins. This can be left as an exercise for the reader.
    ...
    StringBuilder strGamesWonByPlayer = new StringBuilder(100).append("Games won by players:\n");
    Set<Player> players = playersThatWon.keySet() ;
    for ( Player player : players ) {
        Integer wonGames = playersThatWon.get(player);
        strGamesWonByPlayer.append(String.format(" Player %s has won %d times.\n", player.getName(), wonGames));
    }
    return strGamesWonByPlayer.toString();
What we wind up with is a result like the following:
    Games won by players:
     Player Player B has won 62 times.
     Player Player A has won 55 times.
To determine both the efficacy and efficiency of our analysis, we have to establish some metrics. The first and obvious choice is how long it takes to get the result we desire. The second metric is how much trouble, code, and time is needed to set up the data in order to perform the analysis. In this case, doing things by hand as we've demonstrated above is both ineffective and inefficient to a large degree, and here is why.
The Event data from our Game is stored in static POJO Event data files. This means that I have to load each file, extract the Event objects, and scan through them each and every time I want to perform an analysis. Since I doubt that I will perform only one analysis, we have now at least doubled our time and effort. Since this data is static, it would be better served by a database or the like, where indexing makes searching much more efficient.
Each and every search that I want to perform has to be coded and built into the program and executed with a command line option. In order for me to add another search, or correlate that data in a different manner, I have to open up the project, add new code and redistribute the JAR. This is extremely inefficient, and just one more reason that we would be better served if our data were in a database where searches can be conducted in an easier fashion.
Each query that we run requires us to process each and every POJO Event data file. The time this takes grows each time a new file is generated. And since we have to process every event, regardless of which ones we are after, the cost of a search keeps climbing. In a proper database, the information can be indexed as it is added, so a search takes a small fraction of the time this one does. This nearly flattens out our search time instead of increasing it with every game run.
Here is a small list of queries that I've implemented programmatically in the triominos.jar application. You can access the results by using the "-a" option on the command line. For example, perform 100 games using "java -jar triominos.jar -g 100". Then, perform the analysis by using "java -jar triominos.jar -a" to see the results.
- How many times has each Player won a game?
- How many times has each Player won a round?
- How many times has a Tile been played?
- How many times has a Tile been used to start a Round?
- What is the average number of tiles played in a Round?
- What is the highest number of tiles played in a Round?
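As an illustration of the last two queries, the average and the highest value can be computed in one pass once the number of tiles played in each Round has been collected. The sketch below assumes that such a list of per-round counts already exists. How the counts are extracted from the Events is not shown, and the numbers are made up for the example.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

public class RoundStats {

    /**
     * Summarizes per-round tile counts in a single pass.
     * For an empty list, getAverage() is 0.0 and getMax() is Integer.MIN_VALUE.
     */
    public static IntSummaryStatistics summarize(List<Integer> tilesPerRound) {
        return tilesPerRound.stream()
                .mapToInt(Integer::intValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        // Made-up counts, one entry per Round
        List<Integer> tilesPerRound = Arrays.asList(12, 20, 16);
        IntSummaryStatistics stats = summarize(tilesPerRound);
        System.out.println("Average tiles per round: " + stats.getAverage()); // 16.0
        System.out.println("Highest tiles in a round: " + stats.getMax());    // 20
    }
}
```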
One thing we can't do with a HashMap is produce sorted output. A HashMap doesn't guarantee the order in which its entries are returned. That said, we can move the HashMap into a sorted TreeMap; we just need to do some magic to make it happen. A TreeMap orders its entries by key, so to order them by count we need a Comparator that compares two keys by the values they map to. Since we have more than one type of Event data to sort for our queries, we need a more generic way to handle our HashMaps. For example, we want to show data for Players as well as Tiles. Fortunately for us, Java lets us supply a Comparator when converting the HashMap data into a TreeMap. Here is an example of a generic comparator that we can use to move from a HashMap<Player,Integer> to a TreeMap<Player,Integer> as well as from a HashMap<Tile,Integer> to a TreeMap<Tile,Integer>.
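    // a comparator using generic type: orders keys by their mapped values, largest first
    class ValueComparator<K, V extends Comparable<V>> implements Comparator<K>{
        // private copy of the map whose values drive the ordering
        HashMap<K, V> map = new HashMap<K, V>();
        public ValueComparator(HashMap<K, V> map){
            this.map.putAll(map);
        }
        @Override
        public int compare(K s1, K s2) {
            // negate the natural order of the values to get a descending sort
            int order = -map.get(s1).compareTo(map.get(s2));
            // never return 0, or TreeMap would treat keys with equal values as one key
            if ( order == 0 ) order = -1 ;
            return order ;
        }
    }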
Now, to take advantage of this new class, we define the type specific Comparator by instantiating the Generic ValueComparator using the Types we desire. For example, see the following instantiation:
Comparator<Player> playerComparator = new ValueComparator<Player, Integer>(playersThatWonARound);
The playerComparator now allows us to compare the values associated with the HashMap elements as they are added to a TreeMap. This comparison defines the order in which the elements are stored. The sign of the comparison result determines whether the final order is ascending or descending; here the result is negated, so the largest values come first. Note: Please take note of the '( order == 0 )' line in the comparator code. A TreeMap treats two keys that compare as 0 as the same key, so without this line only one element per distinct value would be kept: the first one encountered. So, we change 0 to -1, which keeps every element and places elements with the same value next to each other in the descending order. The trade-off is that the comparator never reports two keys as equal, so key lookups on the resulting TreeMap, such as get() or containsKey(), will not find anything; the map is only good for iterating in sorted order. The complete method showing how the Comparator is used can be seen in the following code.
    public String roundsWonByAPlayer() {
        Comparator<Player> playerComparator = new ValueComparator<Player, Integer>(playersThatWonARound);
        TreeMap<Player, Integer> sortedPlayers = new TreeMap<Player, Integer>(playerComparator);
        sortedPlayers.putAll(playersThatWonARound);
        StringBuilder strRoundsWonByPlayer = new StringBuilder(100).append("Rounds won by players:\n");
        // Iterate the sorted TreeMap, not the original HashMap, so the output is ordered by wins
        for ( Map.Entry<Player, Integer> me : sortedPlayers.entrySet() ) {
            Player player = me.getKey();
            strRoundsWonByPlayer.append(String.format(" Player '%s' has won %d times.\n", player.getName(), me.getValue()));
        }
        return strRoundsWonByPlayer.toString();
    }
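An alternative that needs no custom comparator class is to copy the map's entries into a List and sort that list by value. This is a sketch of a different technique from the TreeMap approach above, not the project's implementation. It uses player names as keys for brevity, and the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortedByCount {

    /** Returns the map's entries ordered by count, largest first. */
    public static <K> List<Map.Entry<K, Integer>> sortedByCountDescending(Map<K, Integer> counts) {
        List<Map.Entry<K, Integer>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so entries with equal counts keep the map's iteration order
        entries.sort(Map.Entry.<K, Integer>comparingByValue(Comparator.reverseOrder()));
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> gamesWon = new HashMap<>();
        gamesWon.put("Player A", 55);
        gamesWon.put("Player B", 62);
        StringBuilder out = new StringBuilder("Games won by players:\n");
        for (Map.Entry<String, Integer> e : sortedByCountDescending(gamesWon)) {
            out.append(String.format(" %s has won %d times.%n", e.getKey(), e.getValue()));
        }
        // Lists Player B (62) before Player A (55)
        System.out.print(out);
    }
}
```

Because the original map is left untouched, lookups by key keep working, and the same helper serves a map keyed by Tile as well as one keyed by Player.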
So, we can now see how cumbersome it is to analyze the data produced by gameplay manually. For each query, a new block of code needs to be added, and the Events that need to be examined must be added to the switch-case and their values accounted for. It would be incredible if we could get this data into a better form, in a better framework, that allows us to query the data on demand. Adding a new query or changing the sort order of the data should be greatly simplified if we are to be more effective in guiding gameplay later.