DEFS.4.2.2.Designing Keys, Indexes, and Constraints - sj50179/IBM-Data-Science-Professional-Certificate GitHub Wiki

Database Objects & Hierarchy (Including Schemas)

Database hierarchy

  • Instance
  • Database
  • Schema
  • Database objects
    • Tables
    • Constraints
    • Indexes

Instances

  • Is a logical boundary for databases, objects, and configuration
  • Provides unique database server environment
  • Allows isolation between databases

Relational databases

  • Stores, manages, and provides access to data
  • Contains objects like tables, views, indexes
  • Uses related tables to reduce data redundancy
  • Can be distributed across multiple systems

Schema

  • Organize database objects
  • Default schema is the user schema
  • Provide a naming context
  • System schemas contain database configuration information

Database partitions

  • Data is managed across multiple partitions
  • Split tables that contain very large quantities of data
  • Hold a subset of the data
  • Common in data warehousing

Database objects

  • Physical database design consists of defining database objects
  • Common database objects:
    • Tables
    • Constraints
    • Indexes
    • Views
    • Aliases
  • Create and manage using Graphical tools, scripting, APIs, SQL

Summary

  • An instance is a logical and configuration boundary
  • A relational database is a set of objects used to store, manage, and access data
  • A schema provides a way to logically group tables, views, nicknames, triggers, functions, packages, and other database objects
  • User schemas contain database objects like tables, views, functions
  • System schemas contain configuration information and metadata for the database
  • You can split very large tables across multiple partitions to improve performance
  • Database objects are the items that exist within the database

Primary Keys and Foreign Keys

Creating primary keys

Examples:

CREATE TABLE book(
  book_id INT NOT NULL,
  ...
  pub_id INT NULL,
  PRIMARY KEY(book_id));

ALTER TABLE book
  ADD PRIMARY KEY(book_id, ISBN);

Creating foreign keys

Examples:

CREATE TABLE copy(
  copy_id INT NOT NULL,
  book_id INT NULL,
  ...
  CONSTRAINT fk_copy_book FOREIGN KEY(book_id)
  REFERENCES book(book_id)
    );

  ...
  ON UPDATE NO ACTION);

  ...
  ON DELETE CASCADE);

Overview of Indexes

  • Indexes provide ordered pointers to rows in tables.

Creating an index

Examples:

CREATE TABLE book(
  ...
  PRIMARY KEY(book_id));

CREATE UNIQUE INDEX unique_book_id
  ON book(book_id);

Advantages and disadvantages

Advantages:

  • Improved performance of SELECT queries
  • Reduced need to sort data
  • Guaranteed uniqueness of rows

Disadvantages:

  • Use disk space
  • Decreased performance of INSERT, UPDATE, and DELETE queries

Normalization

What is normalization?

  • Data duplication lead to inconsistencies
  • Normalization reduces data duplication
  • Increases speed of transactions
  • Improves the integrity of data
  • Normalize each table
  • Most used:
    • First normal form
    • Second normal form
    • Third normal form

First normal form (1NF)

  • Each row must be unique
  • Each cell must contain only a single value

Second normal form (2NF)

  • Database must be in first normal form
  • Create separate tables for sets of values that apply to multiple rows

Third normal form (3NF)

  • Database must be in first and second normal form
  • Eliminate columns that do not depend on the key

Normalization in OLTP and OLAP

Online Transactional Processing (OLTP)

  • Data is read and written frequently
  • Data is normalized to 3NF or BCNF(Boyce Codd Normal form)

Online Analytical Processing (OLAP)

  • Data is mostly read only
  • Data is de-normalized to 2NF or 1NF

Relational Model Constraints - Advanced

Constraints

  • Entity Integrity Constraint - ensures primary key is a unique value that identifies each tuple (or row)
  • Referential Integrity Constraint - defines relationships between tables
  • Semantic Integrity Constraint - refers to the correctness of the meaning of the data
  • Domain Constraint - specifies the permissible values for a given attribute
  • Null Constraint - specifies that attribute values cannot be null
  • Check Constraint - limits the values that are accepted by an attribute

Hands-on Lab: Normalization, Keys and Constraints in Relational Database

Summary & Highlights

At this point in the course, you know:

The objects in a Relational Database Management System (RDBMS) object hierarchy include:

  • Instances. This is a logical boundary for a database or set of databases where you organize and isolate database objects and set configuration parameters.
  • Relational databases. This is a set of objects used to store, manage, and access data.
  • Schemas. A user or system schema is a logical grouping of tables, views, nicknames, triggers, functions, packages, and other database objects. Schemas provide naming contexts so that you can distinguish between objects with the same name.
  • Database partitions. You can split very large tables across multiple partitions to improve performance.
  • Database objects. Database objects are the items that exist within the database, such as tables, constraints, indexes, views, and aliases.

Primary key and Foreign Keys have several uses:

  • Primary keys enforce uniqueness of rows in a table, whereas Foreign keys are columns in a table that contain the same information as the primary key in another table.
  • You can use primary and foreign keys to create relationships between tables. Relationships between tables reduce redundant data and improve data integrity.
  • Indexes provide ordered pointers to rows in tables and can improve the performance of SELECT queries, but can decrease the performance of INSERT, UPDATE, and DELETE queries.

Normalization reduces redundancy and increases consistency of data. There are two forms of normalization:

  • First normal form (1NF). In this form, the table contains only single values and has no repeating groups.
  • Second normal form (2NF). This form splits data into multiple tables to reduce redundancy.

You can define six relational model constraints:

  • Entity integrity constraint. Ensures that the primary key is a unique value that identifies each tuple (or row.)
  • Referential integrity constraint. Defines relationships between tables.
  • Semantic integrity constraint. Refers to the correctness of the meaning of the data.
  • Domain constraint. Specifies the permissible values for a given attribute.
  • Null constraint. Specifies that attribute values cannot be null.
  • Check constraint. Limits the values that are accepted by an attribute.

Practice Quiz: Designing Keys, Indexes, and Constraints

TOTAL POINTS 5

Question 1

What can you use a primary key in a table for?

  • Define columns in the table
  • Define rows in the table
  • Enforce uniqueness of table rows
  • None of the above

Correct. Use a primary key to uniquely identify all table rows.

Question 2

What function does an index provide?

  • Stores metadata
  • Stores primary and foreign keys
  • Easily locate a specific row or set of rows
  • Slowly checks each row in turn

Correct. Create an index on a table to easily locate the specific row or set of rows you require.

Question 3

Which of the following are a constraint?

  • Manual constraint
  • Referential integrity constraint
  • Semantic integrity constraint
  • Null constraint

Correct. The six constraints do not define a manual constraint.

Question 4

Which of the following is one of the objects a relational database uses to store, manage, and access data?

  • Triggers
  • Functions
  • Tables
  • All of the above

Correct. The set of objects relational databases use includes tables, views, indexes, functions, triggers, and packages.

Question 5

How does normalization help speed up transactions?

  • Creates more tables
  • Increases data duplication
  • Enables you to perform updates only once on normalized databases
  • Improves data integrity

Correct. Normalization helps speed transactions as you only perform updates, additions, and deletes once on normalized databases.

Graded Quiz: Designing Keys, Indexes, and Constraints

LATEST SUBMISSION GRADE 100%

Question 1

What clause of the CREATE TABLE statement do you use to create a primary key?

  • ALTER TABLE
  • PRIMARY KEY
  • CONSTRAINT
  • REFERENCES

Correct. Use the PRIMARY KEY clause of the CREATE TABLE statement to create a primary key.

Question 2

What do you create that automatically creates an index?

  • Primary key
  • Table
  • Foreign key
  • None of the above

Correct. Creating a primary key on a table automatically creates an index on the key.

Question 3

Which of the following constraints identifies each tuple (or row) in a table?

  • Positive constraint
  • Domain constraint
  • Entity integrity constraint
  • Check constraint

Correct. This constraint ensures primary key is a unique value identifying each tuple or row.

Question 4

True or False: All relational databases use the concept of instances.

  • True
  • False

Correct. Some relational databases use a special database instead of instances.

Question 5

What are the requirements that a table must meet to be in first normal form (1NF)?

Each table must contain no more than three columns.

Each row must be unique, and each cell must contain only a single value.

The table must already be in third normal form (3NF) and second normal form (2NF).

The table must have a Primary Key.

Correct. The requirements for 1NF are that each row must be unique, and each cell must contain only one value, not a list.