Wiki Code Walkthrough - mjha07/Wikibreach GitHub Wiki
The PrivacyRights.py script ingests the Privacy Rights Clearinghouse breach data into the main WikiBreach database. We plan to build a dashboard that fully automates this data ingestion process. The ingestion process involves the following steps:
- Loads the CSV file into a Python DataFrame.
- Subsets the DataFrame to include only the columns that we ingest into our database.
- Splits the DataFrame into separate DataFrames, one for each table the data will reside in.
- Renames the headers to the corresponding database column names:
  - Date Made Public -> reported_date (incident_report table)
  - Source URL -> source_url (incident_report table)
  - Description of incident -> incident_summary (incident_report table)
  - Information Source -> reported_by (incident_report table)
  - Type of organization -> victim_type (victim table)
  - Company -> victim_name (victim table)
- Converts the date from a string into datetime format.
- Replaces null values with the string "Missing Values".
- Creates a connection to the MySQL server.
- Runs an insert statement to update the database with the data.
- Closes the connection to the server.
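
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in PrivacyRights.py: it uses Python's built-in `csv` module instead of a DataFrame and an in-memory `sqlite3` database as a stand-in for the MySQL server, so that it runs without extra dependencies. The function names (`transform`, `load`), the `MM/DD/YYYY` date format, the simplified table schemas, the `source_url` spelling of the URL column, and the sample rows are all assumptions made for this example.

```python
import csv
import io
import sqlite3
from datetime import datetime

MISSING = "Missing Values"

# CSV header -> (target table, database column), following the mapping above.
# "source_url" is an assumed spelling of the URL column.
COLUMN_MAP = {
    "Date Made Public": ("incident_report", "reported_date"),
    "Source URL": ("incident_report", "source_url"),
    "Description of incident": ("incident_report", "incident_summary"),
    "Information Source": ("incident_report", "reported_by"),
    "Type of organization": ("victim", "victim_type"),
    "Company": ("victim", "victim_name"),
}


def transform(csv_text, date_format="%m/%d/%Y"):
    """Subset, rename, split by table, parse dates, and fill missing values.

    The date format is an assumption; adjust it to match the real CSV.
    """
    incidents, victims = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        split = {"incident_report": {}, "victim": {}}
        # Keep only the mapped columns and rename them to database names.
        for header, (table, column) in COLUMN_MAP.items():
            value = (row.get(header) or "").strip()
            split[table][column] = value or MISSING
        # Convert the date string to an ISO date unless it is missing.
        raw_date = split["incident_report"]["reported_date"]
        if raw_date != MISSING:
            parsed = datetime.strptime(raw_date, date_format).date()
            split["incident_report"]["reported_date"] = parsed.isoformat()
        incidents.append(split["incident_report"])
        victims.append(split["victim"])
    return incidents, victims


def load(conn, incidents, victims):
    """Insert the rows; the schemas here are simplified placeholders."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS incident_report ("
        "reported_date TEXT, source_url TEXT, "
        "incident_summary TEXT, reported_by TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS victim (victim_type TEXT, victim_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO incident_report "
        "(reported_date, source_url, incident_summary, reported_by) "
        "VALUES (:reported_date, :source_url, :incident_summary, :reported_by)",
        incidents,
    )
    conn.executemany(
        "INSERT INTO victim (victim_type, victim_name) "
        "VALUES (:victim_type, :victim_name)",
        victims,
    )
    conn.commit()


# Invented sample data: one complete row and one row with every field empty.
SAMPLE = (
    "Date Made Public,Company,Type of organization,"
    "Description of incident,Information Source,Source URL,Extra\n"
    "01/20/2005,Example University,Education,Laptop stolen,Media,"
    "https://example.com/a,ignored\n"
    ",,,,,,\n"
)

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
try:
    load(conn, *transform(SAMPLE))
finally:
    conn.close()  # always close the connection, even if an insert fails
```

Parameterized insert statements, as used here, keep values from the CSV out of the SQL text itself. The same pattern should carry over to a MySQL driver, although the placeholder syntax differs between drivers.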