5. Recording Events and Prepping For Data Ingest - AgileBitFlipper/triominos GitHub Wiki

Now that we have successfully designed and created the game of triominos, it's time to start recording key events so we can analyze our gameplay. By recording key events, we can ingest them into our favorite analysis tool and start doing the real work. Eventually, we can use this data to help train our AI to play a "smarter" game to achieve whatever goals we set. But, for now, let's just work on getting the key events recorded and primed for consumption. I think it will also be a lot of fun to look at statistics like:

  • How often a tile is chosen in setup.
  • The average number of plays in a round.
  • How often a location on a board is used (heat map)
  • How often a tile is drawn.
  • How often a tile is played.
  • How often the starting player wins.
  • Which tile is started with most often.

...and many, many more things we can think of! The bottom line: we need this data in order to get there.

EventManager

In order for us to handle the creation and collection of Events, we need to have a class that can manage them. So, we create an EventManager to record the events of each and every Game that we play. Since each Game could contain multiple Rounds, I think it best to record each Game into a separate data file.
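As a minimal sketch of what such a manager could look like (the class in the repository may differ), an EventManager collects the events of one Game and derives a unique data file name for that Game. The UUID-based name suffix here is an assumption about how uniqueness could be achieved:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Illustrative EventManager sketch: one instance per Game, one file per Game.
public class EventManager {
    private final List<Object> events = new ArrayList<>(); // stand-in for Event objects
    private final String fileName;

    public EventManager() {
        // One unique file per Game, following an "events_<hash>.ser" pattern
        this.fileName = "events_" + UUID.randomUUID().toString().substring(0, 8) + ".ser";
    }

    public void record(Object event) { events.add(event); }
    public int count()               { return events.size(); }
    public String getFileName()      { return fileName; }
}
```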

Object Serialization

Java has a really great way to serialize objects and prepare them to be written to a file. The key is to have the class implement Serializable. That permits the Event class to be written to a file and read back in as an Event.

Event Object Stream

When a file stream is opened for writing, and objects are written to the object stream, a header is written to the file that defines the class version that was used and where the stream starts. This permits us to write only a single stream to a file, rather than appending to it. So, the EventManager needs to be able to create a unique file for each Game. I'm using the file pattern "events_<hash>.ser", where "<hash>" is a filename hash that provides uniqueness to the file's name.
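The round trip can be sketched as follows, assuming a Serializable event class; the class and field names here are illustrative, not the repository's exact code:

```java
import java.io.*;

// Minimal Serializable demo class; the real Event class has many more fields.
class DemoEvent implements Serializable {
    private static final long serialVersionUID = 1L;
    final String type;
    DemoEvent(String type) { this.type = type; }
}

public class SerializationDemo {
    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("events_demo", ".ser");
        // The stream header is written when the ObjectOutputStream is opened,
        // which is why each Game gets its own file rather than appending.
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(new DemoEvent("START_A_GAME"));
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            DemoEvent e = (DemoEvent) in.readObject();
            System.out.println(e.type); // prints START_A_GAME
        }
        f.delete();
    }
}
```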

Event Queue

The final piece of the puzzle needs to be the Event queue. This queue holds all of the events per Game, and writes them out at the conclusion of the Game. That way, each file will contain the entire set of events recorded per Game.
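A sketch of that queue-then-flush behavior, with an assumed queue type and method names (the repository's implementation may differ):

```java
import java.io.*;
import java.util.ArrayDeque;
import java.util.Deque;

// Events accumulate in memory during the Game and are written out in a
// single pass, as one object stream, when the Game concludes.
public class EventQueueDemo {
    private final Deque<Serializable> queue = new ArrayDeque<>();

    public void enqueue(Serializable event) { queue.add(event); }

    // Called once at the conclusion of the Game: one stream, one file.
    public void flushTo(File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            for (Serializable event : queue) {
                out.writeObject(event);
            }
        }
        queue.clear();
    }
}
```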

EventType

Each and every event we record needs to have a specific type; we need to know the difference between the start of a game, the start of a round, and the placement of a tile. For now, we are going to set up a basic set of event types and expand them from there as needed. Adding event types in the future may affect the ability to read older serialized event files, but the goal here is to progress forward. So on we go! Here are the current event types I'm thinking of:

  • START_A_GAME - indicates the start of a game
  • START_A_ROUND - indicates the start of a round within a game
  • SETUP_PLAYERS - indicates when we setup the players for each game
  • GENERATE_TILES - indicates when we generated the tiles for each game
  • SHUFFLE_TILES - indicates when we shuffled the tiles for each game
  • SETUP_PLAYER_TRAY - indicates when we setup a player's tray
  • HIGHEST_TILE_START - indicates when and which 'highest' tile was played
  • TRIPLE_ZERO_BONUS - indicates when and which 'zero-triplet' tile was played
  • TRIPLE_PLAY_BONUS - indicates when and which 'triplet' tile was played
  • DRAW_A_TILE - indicates when a tile was drawn from the pool and by whom
  • PLACE_A_TILE - indicates when a tile was placed on the board
  • FAIL_CORNER_TEST - indicates when a chosen tile fails the corner test
  • CREATE_A_HEXAGON - indicates when a placed tile completes a hexagon
  • CREATE_A_BRIDGE - indicates when a placed tile completes a bridge
  • WIN_A_ROUND_BY_EMPTY_TRAY - indicates when a round is won by a player with an empty tray
  • WIN_A_ROUND_BY_FEWEST_TILES - indicates when a round is won by a player with the fewest tiles
  • WIN_A_GAME - indicates when a game is won by a player
  • END_A_ROUND - indicates when a round is completed
  • END_A_GAME - indicates when a game is completed
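The list above maps naturally onto a Java enum; this is one possible declaration, not necessarily the repository's exact one:

```java
// Candidate EventType enum covering the event types listed above.
public enum EventType {
    START_A_GAME, START_A_ROUND, SETUP_PLAYERS, GENERATE_TILES,
    SHUFFLE_TILES, SETUP_PLAYER_TRAY, HIGHEST_TILE_START,
    TRIPLE_ZERO_BONUS, TRIPLE_PLAY_BONUS, DRAW_A_TILE, PLACE_A_TILE,
    FAIL_CORNER_TEST, CREATE_A_HEXAGON, CREATE_A_BRIDGE,
    WIN_A_ROUND_BY_EMPTY_TRAY, WIN_A_ROUND_BY_FEWEST_TILES,
    WIN_A_GAME, END_A_ROUND, END_A_GAME
}
```

Enums serialize by constant name, which is convenient for keeping event files readable across versions.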

Event

The actual makeup of an Event needs to be as thorough as we can make it, to be sure that all event data is expressed. The Event class itself needs to implement the Serializable interface so that class instances can be written to the "events_<hash>.ser" file. We could be as detailed as we want, and could even go overboard and put in too much information. So, for now, let's minimize the data in the Event class. Here is the list:

  • Date eventDateTime - the date and time of the event
  • EventType type - the type of the event record
  • Player player - the player (context specific)
  • Tile tile - the tile (context specific)
  • int game - the game number (relevant when more than one game in a run)
  • int round - the round within the game
  • int row - the row (context specific)
  • int col - the column (context specific)
  • int score - the score (context specific)
  • int startBonus - the start bonus (if one - context specific)
  • boolean startingMove - is this record the starting move (context specific)
  • boolean completedAHexagon - does this move complete a hexagon (context specific)
  • boolean completedABridge - does this move complete a bridge (context specific)
  • boolean endOfRound - does this record represent the end of a round
  • boolean endOfGame - does this record represent the end of a game
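Put together, the field list above could look like this. Player, Tile, and EventType are stubbed here so the example compiles on its own; in the project they are the real game classes, and note that anything stored in a serialized Event must itself implement Serializable:

```java
import java.io.Serializable;
import java.util.Date;

class Player implements Serializable { }          // stub for the game's Player class
class Tile implements Serializable { }            // stub for the game's Tile class
enum EventType { START_A_GAME, PLACE_A_TILE /* ... */ }

// Sketch of the Event class described above.
public class Event implements Serializable {
    private static final long serialVersionUID = 1L;

    Date eventDateTime;        // the date and time of the event
    EventType type;            // the type of the event record
    Player player;             // the player (context specific)
    Tile tile;                 // the tile (context specific)
    int game;                  // the game number
    int round;                 // the round within the game
    int row;                   // the row (context specific)
    int col;                   // the column (context specific)
    int score;                 // the score (context specific)
    int startBonus;            // the start bonus, if one
    boolean startingMove;      // is this the starting move
    boolean completedAHexagon; // does this move complete a hexagon
    boolean completedABridge;  // does this move complete a bridge
    boolean endOfRound;        // does this record end a round
    boolean endOfGame;         // does this record end a game
}
```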

Any time-based analysis would need to know the exact moment of an event, so we added the eventDateTime field to the Event. The other fields need the type field to provide context; not all Event records will use all fields. For example, the tile field will only be populated when we log an Event that involves a Tile object. Eventually, the Event class should be able to handle context similar to the way the command line processor handles command line options. But for now, we'll handle the context manually.

One observation is that not all freely available analysis tools can read native Java object serialization. It would benefit us to use a more easily consumable format like XML or JSON. There are several small libraries out there to help us serialize data as JSON, but the one that seems to be used most often is Jackson, via its ObjectMapper class.
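Writing an event as JSON with Jackson could look like this (it requires the jackson-databind dependency on the classpath; the event shape here is a simplified illustration):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonEventDemo {
    // Public fields are picked up by Jackson's default visibility rules.
    public static class JsonEvent {
        public String type;
        public int game;
        public int round;
    }

    public static void main(String[] args) throws Exception {
        JsonEvent e = new JsonEvent();
        e.type = "START_A_GAME";
        e.game = 1;
        e.round = 1;
        ObjectMapper mapper = new ObjectMapper();
        // Produces JSON such as {"type":"START_A_GAME","game":1,"round":1}
        System.out.println(mapper.writeValueAsString(e));
    }
}
```

One JSON object per line is also a convenient layout for log shippers to consume.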

Analyzing the Data

We could decide to manually collect and analyze the data. Since we wrote the file records, we know how to read and process them. However, serialized files aren't really the best way to query large amounts of data. We could use a database, either SQL or NoSQL (Couchbase, for example). But I'm thinking we could use the ELK stack: Elasticsearch, Logstash, and Kibana. There are a lot of articles on how to get the ELK stack up and running on your platform of choice, but I'm leaning toward setting up the stack in a RHEL-based Docker container. Logstash has native support for reading JSON, so I'm leaning even more toward the Jackson ObjectMapper. Maybe that is a new branch I can set up in the repository.

ELK Stack

The easiest way to set up ELK is to let someone else take care of it for us. In this case, we can use a pre-built Docker image that has ELK already set up on it. The following link, ELK Stack on Docker, provides documentation on how to download and set up this image.
