Wiki Code Walkthrough

The PrivacyRights.py script ingests the Privacy Rights Clearinghouse breach data into the main WikiBreach database. We plan to create a dashboard that will fully automate the data ingestion process. The ingestion process involves the following steps (the code sketches after the list illustrate them):

  1. Loads the CSV file into a pandas DataFrame.

  2. Subsets the DataFrame to include only the columns that will be ingested into the database.

  3. Splits the DataFrame into separate DataFrames based on the tables the data will reside in.

  4. Renames the headers to the corresponding database column names:

    1. Date Made Public -> reported_date (incident_report table)
    2. Source URL -> cource_url (incident_report table)
    3. Description of incident -> incident_summary (incident_report table)
    4. Information Source -> reported_by (incident_report table)
    5. Type of organization -> victim_type (victim table)
    6. Company -> victim_name (victim table)

  5. Converts the date from string to datetime format.

  6. Replaces null values with the string "Missing Values".

  7. Creates a connection to the MySQL server.

  8. Executes INSERT statements to load the data into the database.

  9. Closes the connection to the server.
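
The sketch below shows how steps 1 through 6 might look with pandas. It is a minimal illustration, not the actual contents of PrivacyRights.py: the CSV file name is a placeholder, and the set of ingested columns is assumed to be exactly the six listed in the mapping above.

```python
import pandas as pd

# Placeholder path -- substitute the actual Privacy Rights Clearinghouse export.
CSV_PATH = "PrivacyRights.csv"

# Mapping from CSV headers to database column names (per the table above).
INCIDENT_REPORT_COLUMNS = {
    "Date Made Public": "reported_date",
    "Source URL": "cource_url",
    "Description of incident": "incident_summary",
    "Information Source": "reported_by",
}
VICTIM_COLUMNS = {
    "Type of organization": "victim_type",
    "Company": "victim_name",
}

# Step 1: load the CSV into a DataFrame.
raw = pd.read_csv(CSV_PATH)

# Step 2: keep only the columns that will be ingested.
subset = raw[list(INCIDENT_REPORT_COLUMNS) + list(VICTIM_COLUMNS)]

# Steps 3-4: split by destination table and rename headers to the database column names.
incident_report = subset[list(INCIDENT_REPORT_COLUMNS)].rename(columns=INCIDENT_REPORT_COLUMNS)
victim = subset[list(VICTIM_COLUMNS)].rename(columns=VICTIM_COLUMNS)

# Step 5: convert the reported date from string to datetime (unparseable dates become NaT).
incident_report["reported_date"] = pd.to_datetime(
    incident_report["reported_date"], errors="coerce"
)

# Step 6: replace null values with the string "Missing Values".
# The frame is cast to object first so the string fill also covers missing dates (NaT).
incident_report = incident_report.astype(object).fillna("Missing Values")
victim = victim.fillna("Missing Values")
```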
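
Steps 7 through 9 might then look like the following. The wiki does not say which MySQL client library the script uses, so this sketch assumes mysql-connector-python; the connection parameters and the two example rows are placeholders, and the table and column names follow the mapping above.

```python
import datetime

import mysql.connector

# Placeholder rows in the shape produced by the preparation step above.
incident_rows = [
    (datetime.date(2020, 1, 15), "https://example.com/report",
     "Example breach summary", "Media"),
]
victim_rows = [
    ("Business", "Example Corp"),
]

# Step 7: create a connection to the MySQL server (placeholder credentials).
conn = mysql.connector.connect(
    host="localhost",
    user="wikibreach_user",
    password="change-me",
    database="wikibreach",
)
cursor = conn.cursor()

# Step 8: insert the rows with parameterized statements, one per table.
cursor.executemany(
    "INSERT INTO incident_report (reported_date, cource_url, incident_summary, reported_by) "
    "VALUES (%s, %s, %s, %s)",
    incident_rows,
)
cursor.executemany(
    "INSERT INTO victim (victim_type, victim_name) VALUES (%s, %s)",
    victim_rows,
)
conn.commit()

# Step 9: close the cursor and the connection.
cursor.close()
conn.close()
```

Using `executemany` with `%s` placeholders keeps the inserts parameterized, which avoids SQL injection and string-building per row.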