PostgreSQL - ttulka/technologies GitHub Wiki

PostgreSQL is a relational database management system (RDBMS).

Basic Commands

Installation path usually: /usr/local/pgsql

# create database
createdb mydb

# drop database
dropdb mydb

# access database
psql mydb
\h          # help on SQL commands
\q          # quit
\d <table>  # describe table
\dt+        # list tables
\dn+        # list schemas
\c <db>     # connect to a database
\conninfo   # info about the current connection
-- show version
SELECT version();

-- show current user
SELECT session_user, current_user;

-- random between 0 and 1
SELECT random();

Data Types

Numeric Types

 Name             | Storage Size | Description                     | Range
------------------+--------------+---------------------------------+----------------------------------------------
 smallint         | 2 bytes      | small-range integer             | -32768 to +32767
 integer          | 4 bytes      | typical choice for integer      | -2147483648 to +2147483647
 bigint           | 8 bytes      | large-range integer             | -9223372036854775808 to +9223372036854775807
 decimal          | variable     | user-specified precision, exact | up to 131072 digits before the decimal point; up to 16383 digits after it
 numeric          | variable     | user-specified precision, exact | up to 131072 digits before the decimal point; up to 16383 digits after it
 real             | 4 bytes      | variable-precision, inexact     | 6 decimal digits precision
 double precision | 8 bytes      | variable-precision, inexact     | 15 decimal digits precision
 smallserial      | 2 bytes      | small autoincrementing integer  | 1 to 32767
 serial           | 4 bytes      | autoincrementing integer        | 1 to 2147483647
 bigserial        | 8 bytes      | large autoincrementing integer  | 1 to 9223372036854775807

Serial Types

The data types smallserial, serial and bigserial are not true types, but merely a notational convenience for creating unique identifier columns:

CREATE TABLE tablename (
    colname SERIAL
);

-- is equivalent to specifying:

CREATE SEQUENCE tablename_colname_seq AS integer;
CREATE TABLE tablename (
    colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

Monetary Types

The money type stores a currency amount with a fixed fractional precision.
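
A quick sketch (output formatting depends on the lc_monetary locale setting):

SELECT '12.34'::money;          -- $12.34 with a typical en_US locale
SELECT 123.45::numeric::money;  -- numeric, int, and bigint values can be cast to money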

Character Types

There is no performance difference among the three character types listed below in PostgreSQL.

 Name                             | Description
----------------------------------+----------------------------
 character varying(n), varchar(n) | variable-length with limit
 character(n), char(n)            | fixed-length, blank padded
 text                             | variable unlimited length
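
A small sketch of the padding and truncation behavior (explicit casts truncate over-length values silently):

SELECT 'ok'::char(5);           -- 'ok   ' (blank padded to the declared length)
SELECT 'too long'::varchar(5);  -- 'too l' (explicit cast truncates without an error)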

Binary Data Types

The bytea data type allows storage of binary strings.

SELECT '\xDEADBEEF';

Escape Format

 Decimal Octet Value    | Description            | Escaped Input Representation | Example       | Hex Representation
------------------------+------------------------+------------------------------+---------------+--------------------
 0                      | zero octet             | '\000'                       | '\000'::bytea | \x00
 39                     | single quote           | '''' or '\047'               | ''''::bytea   | \x27
 92                     | backslash              | '\\' or '\134'               | '\\'::bytea   | \x5c
 0 to 31 and 127 to 255 | “non-printable” octets | '\xxx' (octal value)         | '\001'::bytea | \x01

Enumerated Types

Enumerated (enum) types are data types that comprise a static, ordered set of values.

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
CREATE TABLE person (
    name text,
    current_mood mood
);
INSERT INTO person VALUES ('Moe', 'happy');
SELECT * FROM person WHERE current_mood = 'happy';

Enum labels are case sensitive, so 'happy' is not the same as 'HAPPY'. White space in the labels is significant too.

Geometric Types

 Name    | Storage Size | Description                      | Representation
---------+--------------+----------------------------------+--------------------------------------
 point   | 16 bytes     | Point on a plane                 | (x,y)
 line    | 32 bytes     | Infinite line                    | {A,B,C}
 lseg    | 32 bytes     | Finite line segment              | ((x1,y1),(x2,y2))
 box     | 32 bytes     | Rectangular box                  | ((x1,y1),(x2,y2))
 path    | 16+16n bytes | Closed path (similar to polygon) | ((x1,y1),...)
 path    | 16+16n bytes | Open path                        | [(x1,y1),...]
 polygon | 40+16n bytes | Polygon (similar to closed path) | ((x1,y1),...)
 circle  | 24 bytes     | Circle                           | <(x,y),r> (center point and radius)
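
A minimal sketch of the literal syntax and a containment check:

SELECT point '(1,2)', box '((0,0),(2,2))', circle '<(0,0),2>';
SELECT point '(1,1)' <@ box '((0,0),(2,2))';  -- true, the point lies inside the box
SELECT area(box '((0,0),(2,2))');             -- 4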

Text Search Types

PostgreSQL provides two data types that are designed to support full text search.

  • tsvector
  • tsquery

A tsvector value is a sorted list of distinct lexemes, which are words that have been normalized to merge different variants of the same word.

SELECT 'The Fat Rats'::tsvector;
      tsvector      
--------------------
 'Fat' 'Rats' 'The'

A tsquery value stores lexemes that are to be searched for, and can combine them using the Boolean operators & (AND), | (OR), and ! (NOT), as well as the phrase search operator <-> (FOLLOWED BY).

SELECT 'fat & (rat | cat)'::tsquery;
          tsquery          
---------------------------
 'fat' & ( 'rat' | 'cat' )
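
In practice, documents and queries are usually normalized with to_tsvector and to_tsquery and matched with the @@ operator, for example:

SELECT to_tsvector('english', 'The Fat Rats') @@ to_tsquery('english', 'fat & rat');  -- true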

UUID Type

The data type uuid stores Universally Unique Identifiers (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.
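
A brief example (gen_random_uuid() is built in since PostgreSQL 13; older versions need the pgcrypto or uuid-ossp extension):

SELECT 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid;
SELECT gen_random_uuid();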

JSON Types

JSON data types are for storing JSON (JavaScript Object Notation) data, as specified in RFC 7159.

  • json - stores an exact copy of the input text
  • jsonb - stores data in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed

To implement efficient query mechanisms for these data types, PostgreSQL also provides the jsonpath data type.

-- Simple scalar/primitive value
-- Primitive values can be numbers, quoted strings, true, false, or null
SELECT '5'::json;

-- Array of zero or more elements (elements need not be of same type)
SELECT '[1, 2, "foo", null]'::json;

-- Object containing pairs of keys and values
-- Note that object keys must always be quoted strings
SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json;

-- Arrays and objects can be nested arbitrarily
SELECT '{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}'::json;
SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb;

         json          |          jsonb          
-----------------------+-------------------------
 {"reading": 1.230e-5} | {"reading": 0.00001230}

jsonb Containment and Existence

Testing containment is an important capability of jsonb.

SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb;  -- true
SELECT '["foo", "bar"]'::jsonb @> '"bar"'::jsonb;  -- true
SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"foo": {}}'::jsonb;  -- true
SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"bar": "baz"}'::jsonb;  -- false

jsonb has an existence operator: it tests whether a string appears as an object key or array element at the top level:

SELECT '["foo", "bar", "baz"]'::jsonb ? 'bar';  -- true
SELECT '{"foo": "bar"}'::jsonb ? 'foo';  -- true
SELECT '{"foo": "bar"}'::jsonb ? 'bar';  -- false
SELECT '"foo"'::jsonb ? 'foo';  -- true

jsonb Indexing

GIN indexes can be used to efficiently search for keys or key/value pairs occurring within a large number of documents.

CREATE INDEX idxgin ON api USING GIN (jdoc);

-- Find documents in which the key "company" has value "Magnafone"
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"company": "Magnafone"}';

-- Find documents in which the key "tags" contains key or array element "qui"
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> 'tags' ? 'qui';

A GIN index also supports the @@ and @? operators, which perform jsonpath matching:

-- Find documents in which the key "tags" contains array element "qui"
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';

SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @? '$.tags[*] ? (@ == "qui")';

Arrays

PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays.

CREATE TABLE sal_emp (
    name            text,
    pay_by_quarter  integer[],
    schedule        text[][]
);

CREATE TABLE tictactoe (
    squares   integer[3][3]
);

INSERT INTO sal_emp
    VALUES ('Bill',
    '{10000, 10000, 10000, 10000}',
    '{{"meeting", "lunch"}, {"training"}}');

SELECT name FROM sal_emp WHERE pay_by_quarter[1] <> pay_by_quarter[2];
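
Array elements can also be searched with ANY and retrieved as slices, e.g. against the tables above:

SELECT * FROM sal_emp WHERE 10000 = ANY (pay_by_quarter);    -- some quarter equals 10000
SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';  -- a two-dimensional slice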

Composite Types

A composite type represents the structure of a row or record; it is essentially just a list of field names and their data types.

CREATE TYPE inventory_item AS (
    name            text,
    supplier_id     integer,
    price           numeric
);

CREATE TABLE on_hand (
    item      inventory_item,
    count     integer
);

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);
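
Individual fields are accessed with a dot; the parentheses are needed so the parser does not mistake the column name for a table name:

SELECT (item).name, (item).price FROM on_hand WHERE (item).price > 0.99;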

Domain Types

A domain is a user-defined data type that is based on another underlying type.

  • can have constraints that restrict its valid values
CREATE DOMAIN posint AS integer CHECK (VALUE > 0);

CREATE TABLE mytable (id posint);

INSERT INTO mytable VALUES(1);   -- works
INSERT INTO mytable VALUES(-1);  -- fails

Queries

Views

You can create a view over the query, which gives a name to the query that you can refer to like an ordinary table:

CREATE VIEW myview AS
    SELECT city, temp_lo, temp_hi, prcp, date, location
        FROM weather, cities
        WHERE city = name;

SELECT * FROM myview;

Transactions

BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- etc etc
COMMIT;

It's possible to control the statements in a transaction in a more granular fashion through the use of savepoints:

BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';

SAVEPOINT my_savepoint;

UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
-- oops ... forget that and use Wally's account

ROLLBACK TO my_savepoint;

UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Wally';
COMMIT;

Window Functions

A window function performs a calculation across a set of table rows that are somehow related to the current row.

Window functions do not cause rows to become grouped into a single output row like non-window aggregate calls would. Instead, the rows retain their separate identities.

SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
  depname  | empno | salary |          avg          
-----------+-------+--------+-----------------------
 develop   |     8 |   6000 | 5020.0000000000000000
 develop   |    10 |   5200 | 5020.0000000000000000
 personnel |     5 |   3500 | 3700.0000000000000000
 sales     |     4 |   4800 | 4866.6666666666666667
(4 rows)

You can also control the order in which rows are processed by window functions using ORDER BY within OVER:

SELECT depname, empno, salary,
       rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;
  depname  | empno | salary | rank 
-----------+-------+--------+------
 develop   |     8 |   6000 |    1
 develop   |    10 |   5200 |    2
 develop   |    11 |   5200 |    2
 personnel |     9 |   4000 |    1
 personnel |     2 |   3900 |    2
 personnel |     5 |   3500 |    3
(6 rows)

For each row, there is a set of rows within its partition called its window frame:

  • in the example below, since there is no ORDER BY in the OVER clause, the window frame is the same as the partition, which for lack of PARTITION BY is the whole table:
SELECT salary, sum(salary) OVER () FROM empsalary;
 salary |  sum  
--------+-------
   5200 | 47100
   5000 | 47100
   4800 | 47100
   6000 | 47100
   5200 | 47100
(5 rows)

If we add an ORDER BY clause, we get very different results:

SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
 salary |  sum  
--------+-------
   4800 | 25700
   5000 | 30700
   5200 | 41100
   5200 | 41100
   6000 | 47100
(10 rows)

Inheritance

CREATE TABLE cities (
  name       text,
  population real,
  elevation  int     -- (in ft)
);

CREATE TABLE capitals (
  state      char(2) UNIQUE NOT NULL
) INHERITS (cities);
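
A query on cities also returns the rows of capitals; ONLY restricts it to the parent table:

SELECT name, elevation FROM cities WHERE elevation > 500;       -- includes capitals
SELECT name, elevation FROM ONLY cities WHERE elevation > 500;  -- parent table only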

Functions

CREATE FUNCTION concat_lower_or_upper(a text, b text, uppercase boolean DEFAULT false)
RETURNS text
AS
$$
 SELECT CASE
        WHEN $3 THEN UPPER($1 || ' ' || $2)
        ELSE LOWER($1 || ' ' || $2)
        END;
$$
LANGUAGE SQL IMMUTABLE STRICT;
SELECT concat_lower_or_upper('Hello', 'World');

 concat_lower_or_upper 
-----------------------
 hello world
(1 row)

Generated Columns

A generated column is a special column that is always computed from other columns.

CREATE TABLE people (
    ...,
    height_cm numeric,
    height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);

Cross Joins

For every possible combination of rows from T1 and T2 (i.e., a Cartesian product), the joined table will contain a row consisting of all columns in T1 followed by all columns in T2.

T1 CROSS JOIN T2

FROM T1 CROSS JOIN T2 is equivalent to FROM T1 INNER JOIN T2 ON TRUE. It is also equivalent to FROM T1, T2.
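
A tiny self-contained sketch using VALUES lists:

SELECT *
FROM (VALUES (1), (2)) AS t1(a)
CROSS JOIN (VALUES ('x'), ('y')) AS t2(b);
-- 4 rows: (1,x), (1,y), (2,x), (2,y)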

Combining Queries

query1 UNION [ALL] query2
query1 INTERSECT [ALL] query2
query1 EXCEPT [ALL] query2

Operations can also be nested and chained:

query1 UNION query2 UNION query3
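
The set operations remove duplicate rows unless ALL is specified, for example:

SELECT 1 AS n UNION SELECT 1;      -- one row: 1
SELECT 1 AS n UNION ALL SELECT 1;  -- two rows: 1, 1

SELECT n FROM (VALUES (1), (2), (3)) AS t(n)
INTERSECT
SELECT n FROM (VALUES (2), (3), (4)) AS t(n);  -- 2, 3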

Common Table Expressions

WITH provides a way to write auxiliary statements for use in a larger query.

WITH regional_sales AS (
    SELECT region, SUM(amount) AS total_sales
    FROM orders
    GROUP BY region
), top_regions AS (
    SELECT region
    FROM regional_sales
    WHERE total_sales > (SELECT SUM(total_sales)/10 FROM regional_sales)
)
SELECT region,
       product,
       SUM(quantity) AS product_units,
       SUM(amount) AS product_sales
FROM orders
WHERE region IN (SELECT region FROM top_regions)
GROUP BY region, product;

Using optional RECURSIVE, a query can refer to its own output:

WITH RECURSIVE t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n+1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;

Constraints

Check Constraints

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0)
);

You can also give the constraint a separate name:

CREATE TABLE products (
    ...,
    price numeric CONSTRAINT positive_price CHECK (price > 0)
);

A check constraint can also refer to several columns:

CREATE TABLE products (
    ...,
    price numeric CHECK (price > 0),
    discounted_price numeric CHECK (discounted_price > 0),
    CHECK (price > discounted_price)
);

Foreign Keys

CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    product_no integer REFERENCES products (product_no),
    quantity integer
);

CREATE TABLE t1 (
  a integer PRIMARY KEY,
  b integer,
  c integer,
  FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)
);

Indexes

CREATE INDEX col1_idx ON table1 (col1);

Covering Index

A covering index is an index specifically designed to include the columns needed by a particular type of query that you run frequently.

SELECT y FROM tab WHERE x = 'key';

Such queries can be handled as index-only scans, because y can be obtained from the index without visiting the heap:

CREATE INDEX tab_x_y ON tab(x) INCLUDE (y);

Because column y is not part of the index's search key, it does not have to be of a data type that the index can handle; it's merely stored in the index and is not interpreted by the index machinery.

Modifying Tables

Adding a Column

ALTER TABLE products ADD COLUMN description text;

You can also define constraints on the column at the same time:

ALTER TABLE products ADD COLUMN description text CHECK (description <> '');

Removing a Column

ALTER TABLE products DROP COLUMN description;

If the column is referenced by a foreign key constraint of another table, PostgreSQL will not silently drop that constraint. You can authorize dropping everything that depends on the column by adding CASCADE:

ALTER TABLE products DROP COLUMN description CASCADE;

Adding a Constraint

ALTER TABLE products ADD CHECK (name <> '');
ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no);
ALTER TABLE products ADD FOREIGN KEY (product_group_id) REFERENCES product_groups;

Adding a not-null constraint, which cannot be written as a table constraint:

ALTER TABLE products ALTER COLUMN product_no SET NOT NULL;

Removing a Constraint

ALTER TABLE products DROP CONSTRAINT some_name;

This works the same for all constraint types except not-null constraints (which do not have names):

ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;

Renaming a Column

ALTER TABLE products RENAME COLUMN product_no TO product_number;

Renaming a Table

ALTER TABLE products RENAME TO items;

Privileges

  • When an object is created, it is assigned an owner.
  • The owner is normally the role that executed the creation statement.
  • For most kinds of objects, the initial state is that only the owner (or a superuser) can do anything with the object.
  • To allow other roles to use it, privileges must be granted.

 Privilege  | Abbreviation | Applicable Object Types
------------+--------------+----------------------------------------------------------------------------------
 SELECT     | r (“read”)   | LARGE OBJECT, SEQUENCE, TABLE (and table-like objects), table column
 INSERT     | a (“append”) | TABLE, table column
 UPDATE     | w (“write”)  | LARGE OBJECT, SEQUENCE, TABLE, table column
 DELETE     | d            | TABLE
 TRUNCATE   | D            | TABLE
 REFERENCES | x            | TABLE, table column
 TRIGGER    | t            | TABLE
 CREATE     | C            | DATABASE, SCHEMA, TABLESPACE
 CONNECT    | c            | DATABASE
 TEMPORARY  | T            | DATABASE
 EXECUTE    | X            | FUNCTION, PROCEDURE
 USAGE      | U            | DOMAIN, FOREIGN DATA WRAPPER, FOREIGN SERVER, LANGUAGE, SCHEMA, SEQUENCE, TYPE

An object can be assigned to a new owner with an ALTER command:

ALTER TABLE table_name OWNER TO new_owner;

Superusers can always do this; ordinary roles can only do it if they are both the current owner of the object (or a member of the owning role) and a member of the new owning role.

To assign privileges, the GRANT command is used:

GRANT UPDATE ON accounts TO joe;

Writing ALL in place of a specific privilege grants all privileges that are relevant for the object type.

The special “role” name PUBLIC can be used to grant a privilege to every role on the system.

To revoke a previously-granted privilege, use the REVOKE command:

REVOKE ALL ON accounts FROM PUBLIC;

Row Security Policies

Tables can have row security policies that restrict, on a per-user basis, which rows can be returned, inserted, or modified.

  • Operations that apply to the whole table, such as TRUNCATE and REFERENCES, are not subject to row security.
CREATE TABLE accounts (manager text, company text, contact_email text);

ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

CREATE POLICY account_managers_policy ON accounts TO managers
    USING (manager = current_user);

CREATE POLICY account_managers_select_policy ON accounts
    FOR SELECT
    USING (true);

Database Roles

CREATE ROLE name;
createuser name   # shell wrapper for CREATE ROLE

DROP ROLE name;
dropuser name     # shell wrapper for DROP ROLE

-- group roles (membership)
GRANT group_role TO role1, ... ;
REVOKE group_role FROM role1, ... ;
-- e.g. GRANT admin TO joe;

Login Privilege

Only roles that have the LOGIN attribute can be used as the initial role name for a database connection:

CREATE ROLE name LOGIN;

A role with the LOGIN attribute can be considered the same as a “database user”:

CREATE USER name;

(CREATE USER is equivalent to CREATE ROLE except that CREATE USER includes LOGIN by default, while CREATE ROLE does not.)

Schemas

Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.

CREATE SCHEMA myschema;

CREATE TABLE myschema.mytable (
 ...
);

DROP SCHEMA myschema; -- if it's empty

DROP SCHEMA myschema CASCADE; -- including all contained objects

Create a schema owned by someone else:

CREATE SCHEMA schema_name AUTHORIZATION user_name;

The Public Schema

By default tables (and other objects) are automatically put into a schema named “public”:

CREATE TABLE products ( ... );
-- is equivalent to CREATE TABLE public.products ( ... );

The Schema Search Path

The system determines which table is meant by following a search path, which is a list of schemas to look in.

  • Default: "$user", public

Put our new schema in the path:

SET search_path TO myschema,public;

The System Catalog Schema

In addition to public and user-created schemas, each database contains a pg_catalog schema, which contains the system tables and all the built-in data types, functions, and operators.

  • pg_catalog is always effectively part of the search path.
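
For example, the system catalogs can be queried like ordinary tables:

SELECT schemaname, tablename FROM pg_catalog.pg_tables WHERE schemaname = 'public';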

Table Partitioning

Partitioning refers to splitting what is logically one large table into smaller physical pieces.

  • Query performance can be improved dramatically in certain situations
    • when most of the heavily accessed rows of the table are in a single partition or a small number of partitions
  • When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index and random access reads scattered across the whole table.
  • Bulk loads and deletes can be accomplished by adding or removing partitions.
  • Seldom-used data can be migrated to cheaper and slower storage media.

Built-in forms of partitioning:

  • Range Partitioning
    • The table is partitioned into “ranges” defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions.
  • List Partitioning
    • The table is partitioned by explicitly listing which key values appear in each partition.
  • Hash Partitioning
    • The table is partitioned by specifying a modulus and a remainder for each partition.
CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
    FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
    FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

...

To implement sub-partitioning, specify the PARTITION BY clause when creating an individual partition:

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
    FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
    PARTITION BY RANGE (peaktemp);
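
List and hash partitioning follow the same pattern; a sketch with hypothetical tables:

-- list partitioning (hypothetical table)
CREATE TABLE customers (
    id       int not null,
    country  text not null
) PARTITION BY LIST (country);

CREATE TABLE customers_dach PARTITION OF customers FOR VALUES IN ('DE', 'AT', 'CH');

-- hash partitioning (hypothetical table)
CREATE TABLE events (
    event_id bigint not null
) PARTITION BY HASH (event_id);

CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);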

Full Text Search

TBD

GIN and GiST Index Types

  • GIN (Generalized Inverted Index)-based index.
  • GiST (Generalized Search Tree)-based index.
CREATE INDEX name ON table USING GIN (column);
CREATE INDEX name ON table USING GIST (column [ { DEFAULT | tsvector_ops } (siglen = number) ] );

GIN indexes are the preferred text search index type. As inverted indexes, they contain an index entry for each word (lexeme), with a compressed list of matching locations.

  • GIN indexes store only the words (lexemes) of tsvector values, and not their weight labels.

A GiST index is lossy, meaning that the index might produce false matches, and it is necessary to check the actual table row to eliminate such false matches.

  • Each document is represented in the index by a fixed-length signature.

Concurrency Control

Internally, data consistency is maintained by using a multiversion model (Multiversion Concurrency Control, MVCC).

  • Each SQL statement sees a snapshot of data (a database version) as it was some time ago, regardless of the current state of the underlying data.
  • Prevents statements from viewing inconsistent data produced by concurrent transactions performing updates on the same data rows, providing transaction isolation for each database session.

MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading.

Some DDL commands, currently only TRUNCATE and the table-rewriting forms of ALTER TABLE, are not MVCC-safe.

Transaction Isolation

The phenomena:

  • dirty read - A transaction reads data written by a concurrent uncommitted transaction.
  • nonrepeatable read - A transaction re-reads data it has previously read and finds that data has been modified by another transaction (that committed since the initial read).
  • phantom read - A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.
  • serialization anomaly - The result of successfully committing a group of transactions is inconsistent with all possible orderings of running those transactions one at a time.

 Isolation Level          | Dirty Read             | Nonrepeatable Read | Phantom Read           | Serialization Anomaly
--------------------------+------------------------+--------------------+------------------------+-----------------------
 Read uncommitted         | Allowed, but not in PG | Possible           | Possible               | Possible
 Read committed (default) | Not possible           | Possible           | Possible               | Possible
 Repeatable read          | Not possible           | Not possible       | Allowed, but not in PG | Possible
 Serializable             | Not possible           | Not possible       | Not possible           | Not possible

In PostgreSQL, you can request any of the four standard transaction isolation levels, but internally only three distinct isolation levels are implemented, i.e., PostgreSQL's Read Uncommitted mode behaves like Read Committed.

Serializable transactions are just Repeatable Read transactions which add nonblocking monitoring for dangerous patterns of read/write conflicts.

To set the transaction isolation level of a transaction, use the command SET TRANSACTION.
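
For example:

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- ... statements run under repeatable read ...
COMMIT;

-- or set the level directly when starting the transaction:
BEGIN ISOLATION LEVEL SERIALIZABLE;
COMMIT;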

Performance

PostgreSQL devises a query plan for each query it receives.

The EXPLAIN command can be used to see what query plan the planner creates for any query.

The structure of a query plan is a tree of plan nodes.

  • Nodes at the bottom level of the tree are scan nodes: they return raw rows from a table.
    • There are different types of scan nodes for different table access methods:
      • sequential scans,
      • index scans, and
      • bitmap index scans.
  • If the query requires joining, aggregation, sorting, or other operations on the raw rows, then there will be additional nodes above the scan nodes to perform these operations.
  • The output of EXPLAIN has one line for each node in the plan tree, showing the basic node type plus the cost estimates that the planner made for the execution of that plan node.
EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244)

The numbers quoted in the cost estimate are (left to right):

  • estimated start-up cost
  • estimated total cost
  • estimated number of rows output
  • estimated average width of rows output (in bytes)
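
EXPLAIN ANALYZE additionally executes the query and shows the actual row counts and run times next to the estimates; a data-modifying statement can be wrapped in a transaction that is rolled back if only the plan is wanted (using the documentation's tenk1 example table):

EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100;

BEGIN;
EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
ROLLBACK;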
