Reading 14 - corey-marchand/data-structures-and-algorithms GitHub Wiki
- Database Intro
Database normalization is a process used to organize a database into tables and columns. The idea is that each table should be about a specific topic, with only its supporting details included. For example, a spreadsheet containing information about salespeople and customers serves several purposes:
- Identify salespeople in your organization.
- List all customers your company calls upon to sell product.
- Identify which salespeople call on specific customers.
By limiting a table to one purpose you reduce the amount of duplicate data contained within your database. This eliminates some issues stemming from database modifications.
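As a sketch of what that separation looks like in practice, the spreadsheet described in the text can be split into two single-topic tables linked by a key. The schema, column names, and sample data below are illustrative assumptions (they are not given in the reading); the example uses Python's built-in `sqlite3` module.

```python
import sqlite3

# Assumed schema: office facts live in one table, salesperson facts
# in another, linked by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesOffice (
        OfficeId     INTEGER PRIMARY KEY,
        SalesOffice  TEXT NOT NULL,
        OfficeNumber TEXT NOT NULL
    );
    CREATE TABLE SalesPerson (
        SalesPersonId INTEGER PRIMARY KEY,
        Name          TEXT NOT NULL,
        OfficeId      INTEGER REFERENCES SalesOffice(OfficeId)
    );
""")
conn.execute("INSERT INTO SalesOffice VALUES (1, 'New York', '212-555-0100')")
conn.executemany("INSERT INTO SalesPerson VALUES (?, ?, ?)",
                 [(1, 'John Hunt', 1), (2, 'Mary Smith', 1)])

# The office details are stored exactly once; a join rebuilds the
# original spreadsheet-style view whenever it is needed.
rows = conn.execute("""
    SELECT p.Name, o.SalesOffice, o.OfficeNumber
    FROM SalesPerson AS p
    JOIN SalesOffice AS o USING (OfficeId)
    ORDER BY p.SalesPersonId
""").fetchall()
print(rows)
```

Changing the office number now means updating a single `SalesOffice` row, no matter how many salespeople work in that office.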
**Data Duplication and Modification Anomalies**

Notice that for each SalesPerson we have listed both the SalesOffice and OfficeNumber, so there is duplicate salesperson data. Duplicated information presents two problems:
- It increases storage and decreases performance.
- It becomes more difficult to maintain data changes.
**Update Anomaly**

*Table: Update Anomaly*
In this case we have the same information in several rows. For instance, if the office number changes, then there are multiple updates that need to be made. If we don't update all rows, inconsistencies appear.
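The update anomaly above can be sketched with a few lines of Python, assuming a denormalized table with the repeated office columns described earlier (table and column names are illustrative):

```python
import sqlite3

# Denormalized table: office details repeated on every salesperson row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesStaff (Name TEXT, SalesOffice TEXT, OfficeNumber TEXT)")
conn.executemany("INSERT INTO SalesStaff VALUES (?, ?, ?)", [
    ("John Hunt",  "New York", "212-555-0100"),
    ("Mary Smith", "New York", "212-555-0100"),
])

# The office number changes, but only one of the rows gets updated...
conn.execute("UPDATE SalesStaff SET OfficeNumber = '212-555-0199' "
             "WHERE Name = 'John Hunt'")

# ...so the table now reports two different numbers for one office.
numbers = [r[0] for r in conn.execute(
    "SELECT DISTINCT OfficeNumber FROM SalesStaff ORDER BY OfficeNumber")]
print(numbers)
```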
**Deletion Anomaly**

*Table: Row Deletion Anomaly*

Deletion of a row causes removal of more than one set of facts. For instance, if John Hunt retires, deleting that row causes us to lose information about the New York office as well.

**Search and Sort Issues**
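The deletion anomaly works the same way in a quick sketch: if John Hunt's row is the only place the New York office appears in the assumed denormalized table, deleting him deletes the office facts too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesStaff (Name TEXT, SalesOffice TEXT, OfficeNumber TEXT)")
conn.execute(
    "INSERT INTO SalesStaff VALUES ('John Hunt', 'New York', '212-555-0100')")

# John Hunt retires; his row was the only record of the New York
# office and its number, so that information disappears with him.
conn.execute("DELETE FROM SalesStaff WHERE Name = 'John Hunt'")

offices = conn.execute(
    "SELECT * FROM SalesStaff WHERE SalesOffice = 'New York'").fetchall()
print(offices)  # empty: the office facts were lost along with the row
```

With a separate `SalesOffice` table, the retirement would delete only the `SalesPerson` row and the office facts would survive.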