Project Milestone 3 - JusungPark6/Chess-Database-Visualizer GitHub Wiki

SQL Script for Schema Implementation

Creating the Schema

I used the tool "Forward Engineering" that made the creation of the schema much easier as I already had created the ERD model for my database. This tool generated the script to create the schema. However, I ran into an error "Error Code: 1822. Failed to add the foreign key constraint." This was fixed by making sure the attributes in my schema were referencing the correct attributes in other tables through the foreign keys. Here is the script that was created using the Forward Engineering Tool and what I used to create the database.

Data Implementation

I then used the Table Data Import Wizard Tool to import my .csv data file into the schema I had created using the Forward Engineering Tool. Since I cut out a lot of unnecessary columns and changed columns names for clarity and normalization purposes, I made the according normalization changes in the .csv file pre-import so I thought I wouldn't run into any importation problems when using the Table Data Import Wizard Tool.

However, I ran into a plethora of problems using the Table Data Import Wizard Tool.

Initially, I ran into the tool crashing every time I attempted to use the tool to import the .csv file. I fixed this my drastically reducing the amount of data points my .csv file had, which was very unfortunate, but the dataset had over 48,000 entities and I figured reducing the amount so that the tool would work would be a more effective trade-off than manually importing each one.

I then ran into this error: "Unhandled exception: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)" and using this (https://stackoverflow.com/questions/71992180/mysql-error-unhandled-exception-ascii-codec-cant-decode-byte-0xef-in-positi) post for advice, I downloaded CotEditor to convert my .csv file into ASCII format. I also changed the import setting such that "," was the delimiter instead of the default ";" and the line endings to LF which is the default line ending in MacOS/Linux as opposed to CRLF, which was the default setting for the import tool and the Windows default setting. I then used the Table Data Import Wizard Tool and managed to import all the data into MySQL.

After the import process, I still had to clean up some data points and columns. I had created a username attribute that contained the usernames of all the players in the database, which included all the black players and white players, with no duplicate players. Then, I created scripts to import all the data accordingly into the tables that I had created with the schema creation tool. I had to modify the column types for result and moves in the table Game_details as the original types I had created the columns as was not big enough to fit the data that was being imported into them. The queries used can be found here.

SQL Query Execution

A question I had asked in Milestone 1 was: In games between players rated above 2400, which opening after white played d4 did black have the greatest win rate with?

With the help of online resources, SQL documentation, and ChatGPT, I used this query to find the answer to that question.

Right now the query outputs the most winning response to white playing d4 as the first move, which is d5. It also outputs the total number of games played in the database with a 1. d4 d5 start, as well as how many of those games resulted in black winning, and the winning percentage of black if this opening is played.