Databases - kamialie/knowledge_corner GitHub Wiki

Basic commands:

  • Create/Insert data
  • Read/Select data
  • Update data
  • Delete data

Terms:

  • Database
  • Table (or Relation)
  • Row (or Tuple)
  • Column or Row (or Attributes)
  • Schema (metadata, first row of a table - will contains a lot of different constraints as well)

RDBMS (Relational Database Management System)

Contents

Key

Value used to identify a record in a table uniquely. A KEY could be a single column or combination of multiple columns.

A primary key is a single column value used to identify a database record uniquely.

  • cannot be NULL
  • value must be unique
  • should rarely be changed
  • must be given a value when a new record is inserted.

A composite key is a primary key composed of multiple columns used to identify a record uniquely.

Foreign key references the primary key of another table.

  • can have a different name from its primary key
  • ensures rows in one table have corresponding rows in another
  • do not have to be unique, most often they aren't
  • can be null even though primary keys can not

To join multiple tables we need an integer primary key for each row to efficiently add a reference to a row, thus, table creation looks like this:

CREATE TABLE Users (
	user_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
	name VARCHAR(128),
	email VARCHAR(128),
	PRIMARY KEY(user_id),
	INDEX(email)
);

PRIMARY KEY statement tell db that this field is going to be used a lot, while INDEX tells that WHERE clause is going to use that one a lot, so that db can adjust they way it stores this fields.

INDEX - good for individual row lookup, sorting/grouping result, works best with exact matches or prefix lookups, generally HASH or BTREE

If forgot to specify on table creation, can still add it later (you can specify what to use, but rather let database decide):

ALTER TABLE Users ADD INDEX (email) USING BTREE;

Normalization

Database design technique that reduces data redundancy and eliminates undesirable characteristics like Insertion, Update and Deletion Anomalies.

A transitive functional dependency is when changing a non-key column, might cause any of the other non-key columns to change.

Normal Form Rules
1 each table cell should contain a single value; each record needs to be unique
2 1NF, single column primary key
3 2NF, no transitive functional dependencies
Boyce-Codd (BCNF) 3NF, still could contain anomalies, if there is more than one Candidate Key; functional dependecies
4 no database table instance contains two or more, independent and multivalued data describing the relevant entity'; functional and multifunctional dependencies
5 4NF, cannot be decomposed into any number of smaller tables without loss of data
6 not standardized yet

Funcitonal dependency

Description: if the value of property A always corresponds to the same value of property B (A -> B), the latter should be stored only once.

Boyce-Codd Normal form - if a set of attributes A determine all other attributes, then set of attributes A is a key.

Types of FD:

  • trivial - if B is a subset of A
  • nontrivial - if B is not a subset of A, but have some common attributes
  • completely nontrivial - if B is not a subset of A, no common attributes at all

BCNF decomposition algorithm (input: relation R and FDs for R, output: decomposition of R into BCNF relations with lossless join)

  • Compute keys for R
  • Repeat until all relations are in BCNF:
    • Pick any R' with A -> B that violates BCNF (left side does not have keys)
    • Decompose R' into R1(A, B) and R2(A, rest)
    • Compute FDs for R1 and R2
    • Compute keys for R1 and R2

Multifunctional dependency

  • multifunctional dependency is when a direct dependency between field A and B, and between field A and C, while B and C have no relationship
    • 4th Normal Form - separate dependent fields into separate relations (tables)

Conventions

  • table name start with upper letter - Users
  • primary key should be the table name in lower case letters plus underscore id - users_id
  • field name is lower case
  • avoid using spaces in schema, table and field names

Data types

NULL is a special value that any attribute can have, which denotes value is not known or underfined. Rows with NULL value are not queried, even with conditions that suppose to return all rows is set. Arithmetic operations involving NULL always return NULL. Specify NOT NULL for table attribute, when creating table, to require that field.

NULL can also be used as keyword to perform boolean comparison. Used in conjunction with IS or NOT.

CLI

Oracle DB default port - 1521 MSSQL default port - 1433

MySQL

Default port - 3306

Log in with credentials: shell $ mysql -u root -p

Show databases (MySQL contains 3 internal databases mysql, information_schema and performance_schema that should not be deleted): sql mysql> show databases;

Switch to particular database: sql mysql> use <database_name>;

Get info about particular table: sql mysql> descrbibe <table_name>;

PostgreSQL

Default port - 5432

Debian-like client package - postgresql-client.

Log in with credentials (password will be prompted): shell $ psql -h host -U user

Enclose names in double quotes, if there are upper-case letters.

  • \conninfo - display information about current connection

  • \? - list all the commands

  • \l - list databases

  • \c <database_name> - connect to new database

  • \dt - list tables of the public schema

    • \dt <schema_name>.* - list tables of certain schema
    • \dt *.* - list tables of all schemas
  • drop schema <schema_name> cascade; - drop all tables in schema_name

  • \q - quit psql

  • List users

     SELECT usename, valuntil FROM pg_user;

Cheatsheet

Resources

⚠️ **GitHub.com Fallback** ⚠️