Sprint 2 - csc370-07a/blockbuster-video GitHub Wiki

Updated ERD

ERD-Sprint2

We made many changes to the Schema. An exhaustive list follows:

  • Rename all tables to pluralize
  • Update all ID attributes to int with Autoincrement
  • Update all attributes with real MariaDB datatypes
  • Update all attributes with PK / FK, Not Null, Unique constraints as necessary
  • Drop Transactions table (for now, entirely duplicated in Rentals)
  • Drop LateFees from Rentals
  • Add TransactionId to Rentals
  • Add ClerkId to Rentals
  • Drop Ratings from Users
  • Drop LeadActor from Movies
  • Add Year to Movies

Renaming Tables

USE MovieRentals;
DROP TABLE Transaction;
RENAME TABLE User to Users;
RENAME TABLE Movie to Movies;
RENAME TABLE Rental to Rentals;
RENAME TABLE Rating to Ratings;
RENAME TABLE Clerk to Clerks;

Alter Tables (including PK, FK, defaults, constraints)

ALTER TABLE Users
MODIFY Id INT;
ALTER TABLE Users
ADD PRIMARY KEY (Id);
ALTER TABLE Users
MODIFY Id INT AUTO_INCREMENT;
ALTER TABLE Users
DROP Rating;
ALTER TABLE Users
MODIFY Password VARCHAR(100) NOT NULL;
ALTER TABLE Users
ADD CONSTRAINT u_username UNIQUE (Username);
ALTER TABLE Users
ADD CONSTRAINT u_email UNIQUE (Email);

ALTER TABLE Movies
MODIFY Id INT;
ALTER TABLE Movies
ADD PRIMARY KEY (Id);
ALTER TABLE Movies
MODIFY Id INT AUTO_INCREMENT;
ALTER TABLE Movies
RENAME COLUMN LeadActor to Year;
ALTER TABLE Movies
MODIFY Year INT;
ALTER TABLE Movies
ADD CONSTRAINT chk_year CHECK (Year > 1888);
ALTER TABLE Movies
MODIFY Title VARCHAR(100);
ALTER TABLE Movies
MODIFY COLUMN Title VARCHAR(100) NOT NULL;

ALTER TABLE Clerks
MODIFY Id INT;
ALTER TABLE Clerks
ADD PRIMARY KEY (Id);
ALTER TABLE Clerks
MODIFY COLUMN Id INT AUTO_INCREMENT;
ALTER TABLE Clerks
MODIFY Username VARCHAR(25);
ALTER TABLE Clerks
MODIFY Username VARCHAR(25) NOT NULL;
ALTER TABLE Clerks
MODIFY Password VARCHAR(25) NOT NULL;
ALTER TABLE Clerks
ADD CONSTRAINT u_username UNIQUE (Username);

ALTER TABLE Ratings
MODIFY Id INT;
ALTER TABLE Ratings
ADD PRIMARY KEY (Id);
ALTER TABLE Ratings
MODIFY Id INT AUTO_INCREMENT;
ALTER TABLE Ratings
ADD CONSTRAINT chk_rating CHECK (RatingValue BETWEEN 1 AND 10);
ALTER TABLE Ratings
MODIFY UserId INT;
ALTER TABLE Ratings
MODIFY MovieId INT;
ALTER TABLE Ratings
ADD CONSTRAINT fk_users1 FOREIGN KEY (UserId) REFERENCES Users(Id);
ALTER TABLE Ratings
ADD CONSTRAINT fk_movies1 FOREIGN KEY (MovieId) REFERENCES Movies(Id);

ALTER TABLE Rentals
MODIFY Id INT;
ALTER TABLE Rentals
ADD PRIMARY KEY (Id);
ALTER TABLE Rentals
MODIFY Id INT AUTO_INCREMENT;
ALTER TABLE Rentals
DROP LateFees;
ALTER TABLE Rentals
MODIFY RentalDate DATE DEFAULT NOW();
ALTER TABLE Rentals
MODIFY DueDate DATE DEFAULT (NOW() + INTERVAL 14 DAY);
ALTER TABLE Rentals
ADD ClerkId INT;
ALTER TABLE Rentals
MODIFY UserId INT;
ALTER TABLE Rentals
MODIFY MovieId INT;
ALTER TABLE Rentals
ADD CONSTRAINT fk_clerks1 FOREIGN KEY (ClerkId) REFERENCES Clerks(Id);
ALTER TABLE Rentals
ADD CONSTRAINT fk_users2 FOREIGN KEY (UserId) REFERENCES Users(Id);
ALTER TABLE Rentals
ADD CONSTRAINT fk_movies2 FOREIGN KEY (MovieId) REFERENCES Movies(Id);

Load CSV scripts

SET GLOBAL local_infile=1;

LOAD DATA LOCAL INFILE 'blockbuster-video/clerks.csv'
INTO TABLE Clerks
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

SELECT * FROM Clerks LIMIT 5;

LOAD DATA LOCAL INFILE 'blockbuster-video/movies.csv'
INTO TABLE Movies
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

SELECT * FROM Movies LIMIT 5;

LOAD DATA LOCAL INFILE 'blockbuster-video/users.csv'
INTO TABLE Users
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

SELECT * FROM Users LIMIT 5;

LOAD DATA LOCAL INFILE 'blockbuster-video/rentals.csv'
INTO TABLE Rentals
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

SELECT * FROM Rentals LIMIT 5;

LOAD DATA LOCAL INFILE 'blockbuster-video/ratings.csv'
INTO TABLE Ratings
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

SELECT * FROM Ratings LIMIT 5;

SET GLOBAL local_infile=0;

Advanced SQL Queries

SELECT Title, Genre
FROM Movies
WHERE Id IN (SELECT MovieId FROM Rentals WHERE ClerkId = -1);

SELECT Genre, COUNT(*) as NumberOfMovies 
FROM Movies 
GROUP BY Genre 
HAVING COUNT(*) > 1;

Relational Algebra

Constraints using Relational Algebra

We developed several constraints for our database, which we implemented using the above SQL statements. We will use the three following examples to illustrate how these would be expressed with relational algebra:

ADD CONSTRAINT chk_year CHECK (Year > 1888);
ADD CONSTRAINT chk_rating CHECK (RatingValue BETWEEN 1 AND 10);
ADD CONSTRAINT fk_users1 FOREIGN KEY (UserId) REFERENCES Users(Id);

Become, respectively,
$\sigma_{Year < 1888} = \varnothing$
$\sigma_{RatingValue \ge 10}(\sigma_{RatingValue \le 1}(Ratings)) = \varnothing$, and
$\pi_{UserID}(Ratings) \subseteq \pi_{ID}(Users)$

SQL Queries using Relational Algebra

The following examples outline how different SQL Queries would be expressed with relational algebra.

SELECT * FROM Movie WHERE Genre = 'Sci-Fi';

Becomes $\sigma_{Genre = 'Sci-Fi'}(Movies)$, and

SELECT User.Username, Movie.Title 
FROM User
JOIN Rental ON User.Id = Rental.UserId
JOIN Movie ON Rental.MovieId = Movie.Id;

Becomes $\pi_{Username,Title}(User\bowtie(\rho_{ID,MovieID,Title}(\pi_{UserID,MovieID,Title}(Rental\bowtie\rho_{MovieID,Title}(\pi_{ID,Title}(Movie))))))$

BCNF Normalization

For the sake of brevity we will focus on the Users table. It has been designed in a way so that it conforms to BCNF. When considering the functional dependencies, we need to ensure that for any attribute in a table that can determine the value of another attribute, it must also be able to determine the value of the rest of the attributes in the table. As shown in the ERD, the primary key in each table can determine the value of the other attributes in that table. Also, some attributes like Username in the Users table can determine the other attributes' value. However, none of these attributes can only determine some of the other table attributes without knowing all of them.

Recapping Sprint 2 Goals

  • Cover an example of one of our tables conforming to BCNF
  • Express some attribute constraints in relational algebra
  • Express some SQL queries in relational algebra
  • Make use of advanced SQL such as ALTER TABLE, GROUP BY, subqueries

Sprint 3 Goals

  • Restrict/Grant data privileges (to admins/users) in the database using the GRANT and REVOKE statements
  • Identify and address scenarios in which ACID properties being violated would results in problems for the user
  • Apply a database log to restore a potentially inconsistent database