Project Milestone 2 - JusungPark6/Chess-Database-Visualizer GitHub Wiki
Screenshot of ERD on chess games data:
Psuedocode for Database Build Plan
Pre-import Manipulation
-
The data within my data is mostly already normalized. For example, the results column are all normalized to be a set entity based on the results of the game with no variation.
-
Delete unecessary columns, such as White RD, Black RD, WhiteisComp, BlackisComp, White Clock, Black Clock, ECO, Result.
-
Rename Result-Winner column to result and White Elo and Black Elo to white rating and black rating.
-
Data transformation
-
Convert data format as required by the database schema
Import Processes
-
I will use the Table Data Import Wizard tool provided in MySQL Workbench to import my dataset csv file.
-
Forward engineer the EER diagram to create the relations and the connections between the relations Need to delete the foreign key constraint name in every occurrence in the automatically produced code to create the schemas.
Post-Import Manipulation
Data validation:
- Make sure that all the values are matching (i.e. player usernames are consistent) and the data entities are non-null.
- Make sure that data types align
- No redundancy exists
Data cleaning and modification:
- parse moves in Moves into an array rather than a long string
Organize information from the csv-produced tables into the EER-produced tables
Games
- put all games into Games table with a given unique game_id and make that the primary key
- put the white and black players in their respective columns with their primary key
- Put the recorded winner of the game {White, Black, Draw}
Player
- put all the players into Player name with a given unique player_id and make that the primary key
- Put the players' usernames
- put their most recent rating as of the latest game recorded
Game_Details
- put game_id referencing the Games table as primary key
- put the recorded winner referencing the Games table
- Put the result describing the conditions of the win (i.e. Black checkmates, White resigns, Draw by Repetition, etc.)
- Put the time control setting for the game played
- Put down the move count as move_count
- Put the list of moves played in the game in sequential order in an array