postgres storage - ghdrako/doc_snipets GitHub Wiki

Physical storage

All the data for a PostgreSQL database cluster is stored in the cluster's data directory, whose location is given by the PGDATA environment variable. On RHEL-based systems the usual location is /var/lib/pgsql/15/data (for version 15).

db=# SHOW data_directory;
/var/lib/pgsql/15/data

Change location

Stop the server
Create the new directory

mkdir /pg_data ; chown postgres:postgres /pg_data

Move data

rsync -av /var/lib/pgsql/15/data/ /pg_data

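Modify the configuration (point data_directory in postgresql.conf, or PGDATA in the service definition, at the new path)
Start the server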

Relation - the PostgreSQL term for tables, indexes, and similar objects

In PostgreSQL, tables, indexes, and other such objects (for example sequences and materialized views) are all referred to by the generic term relation.

All information associated with a relation is stored in several different forks, each containing data of a particular type.

At first, each fork is represented by a single file. Its filename consists of a numeric ID (the relation's filenode number, which often but not always matches its OID), optionally followed by a suffix that corresponds to the fork's type.

The file grows over time, and when its size reaches 1 GB, another file of the same fork is created (such files are sometimes called segments). The sequence number of the segment is added to the end of its filename. The 1 GB limit can be changed when building PostgreSQL (./configure --with-segsize).
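The segment rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a default build with 8 kB pages and 1 GB segments, the main fork only, and the filenode 24581 borrowed from the author_list example further down this page. The function name locate_block is made up for this sketch.

```python
BLOCK_SIZE = 8192                 # assumed default page size, in bytes
SEGMENT_SIZE = 1024 ** 3          # assumed default segment size: 1 GB
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE  # 131072 pages per segment


def locate_block(filenode, block_no):
    """Return (file name, byte offset in that file) for a main-fork block."""
    segment_no, block_in_segment = divmod(block_no, BLOCKS_PER_SEGMENT)
    # The first segment has no numeric suffix; later ones get ".1", ".2", ...
    name = str(filenode) if segment_no == 0 else f"{filenode}.{segment_no}"
    return name, block_in_segment * BLOCK_SIZE


print(locate_block(24581, 0))        # first page of the first segment
print(locate_block(24581, 131072))   # first page of the second segment
```

With these defaults the first 131072 pages live in the file 24581, and page 131072 is the first page of 24581.1.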

There are several standard types of forks.

The main fork represents actual data: table rows or index rows. This fork is available for any relation (except for views, which contain no data). Files of the main fork are named by their numeric IDs, which are stored as relfilenode values in the pg_class table.
The initialization fork is available only for unlogged tables (created with the UNLOGGED clause) and their indexes. Such objects are the same as regular ones, except that actions performed on them are not written to the write-ahead log. The fork has the same name as the main fork, but with the _init suffix.
The free space map keeps track of available space within pages. The amount of free space changes all the time, growing after vacuuming and shrinking when new row versions appear. The free space map is used to quickly find a page that can accommodate new data being inserted. All files related to the free space map have the _fsm suffix. Initially, no such files are created; they appear only when necessary. The easiest way to get them is to vacuum a table.
The visibility map can quickly show whether a page needs to be vacuumed or frozen. For this purpose, it provides two bits for each table page. The first bit is set for pages that contain only up-to-date row versions. Vacuum skips such pages because there is nothing to clean up. Besides, when a transaction reads a row from such a page, there is no need to check its visibility, which is what makes an index-only scan possible. The second bit is set for pages that contain only frozen row versions; this part of the fork is sometimes called the freeze map. Visibility map files have the _vm suffix.
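The two-bits-per-page idea behind the visibility map can be pictured with a small in-memory model. This is a simplified sketch, not the on-disk layout: it ignores page headers and locking, and the class and method names (VisibilityMapModel, pages_to_vacuum) are invented for illustration.

```python
ALL_VISIBLE = 0b01   # page holds only row versions visible to everyone
ALL_FROZEN = 0b10    # page holds only frozen row versions
BITS_PER_PAGE = 2
PAGES_PER_BYTE = 8 // BITS_PER_PAGE   # flags for 4 table pages fit in a byte


class VisibilityMapModel:
    """Toy visibility map: two flag bits for each table page."""

    def __init__(self, n_pages):
        self.n_pages = n_pages
        self.bits = bytearray((n_pages + PAGES_PER_BYTE - 1) // PAGES_PER_BYTE)

    def _slot(self, page_no):
        byte_no, pos = divmod(page_no, PAGES_PER_BYTE)
        return byte_no, pos * BITS_PER_PAGE

    def set(self, page_no, flags):
        byte_no, shift = self._slot(page_no)
        self.bits[byte_no] |= flags << shift

    def clear(self, page_no):
        # Any modification of a page must reset both of its bits.
        byte_no, shift = self._slot(page_no)
        self.bits[byte_no] &= ~(0b11 << shift) & 0xFF

    def test(self, page_no, flag):
        byte_no, shift = self._slot(page_no)
        return bool((self.bits[byte_no] >> shift) & flag)

    def pages_to_vacuum(self):
        # Vacuum can skip every page whose all-visible bit is set.
        return [p for p in range(self.n_pages) if not self.test(p, ALL_VISIBLE)]


vm = VisibilityMapModel(4)
vm.set(0, ALL_VISIBLE)
vm.set(1, ALL_VISIBLE | ALL_FROZEN)
print(vm.pages_to_vacuum())   # [2, 3]
```

Because each page needs only two bits, the map stays tiny compared with the table itself, which is why checking it is much cheaper than reading the pages.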

TOAST (The Oversized-Attribute Storage Technique)

TOAST is PostgreSQL's solution for storing large column values when the page size is just 8 kB and individual rows cannot spill over into subsequent pages. It supports in-line compression of the values, as well as storing them out-of-line (in a separate, associated table).

The default page size of 8 kB applies to both table and index pages.
A row (tuple) cannot span multiple pages, yet a single field value can be far larger than a page (up to 1 GB); TOAST is what reconciles the two.
Index entries can be TOASTed as well, but this is generally best avoided.

The Oversized-Attribute Storage Technique (TOAST) mechanism ensures that a tuple does not exceed the page size by storing oversized attributes separately. The block size therefore remains a hard upper limit on the size of a row as it is stored in the main table.

\d+ book.author_list

In the output, the Storage column shows the storage strategy of each column, for example plain or extended.

select relname, relfilenode, reltoastrelid from pg_class where relname = 'author_list';
-- Output of the above query
   relname   | relfilenode | reltoastrelid
-------------+-------------+---------------
 author_list |       24581 |         24584

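24581 is the relfilenode of the regular table (here it matches the table's OID, which is why its TOAST table is named pg_toast_24581)
24584 is the OID of the TOAST table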

\d+ pg_toast.pg_toast_24581;
select chunk_id, chunk_seq, length(chunk_data) from pg_toast.pg_toast_24581;
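The rows returned by the query above (chunk_id, chunk_seq, chunk_data) can be pictured with a short model of how one out-of-line value is cut into chunks. This is a sketch under assumptions: the chunk size of 1996 bytes is what a default 8 kB build is assumed to use (roughly four chunks per page), compression is left out, and the chunk_id 24590 is a made-up value.

```python
TOAST_MAX_CHUNK_SIZE = 1996   # assumed chunk size for default 8 kB pages


def toast_chunks(chunk_id, value):
    """Yield (chunk_id, chunk_seq, chunk_data) rows for one stored value."""
    starts = range(0, len(value), TOAST_MAX_CHUNK_SIZE)
    for chunk_seq, start in enumerate(starts):
        yield chunk_id, chunk_seq, value[start:start + TOAST_MAX_CHUNK_SIZE]


# A hypothetical 5000-byte value stored under the made-up chunk_id 24590.
for chunk_id, chunk_seq, chunk_data in toast_chunks(24590, b"x" * 5000):
    print(chunk_id, chunk_seq, len(chunk_data))
# 24590 0 1996
# 24590 1 1996
# 24590 2 1008
```

All chunks of one value share the same chunk_id, and chunk_seq gives the order in which they are concatenated to rebuild the value.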

Heap-Only Tuples (HOT)

https://www.postgresql.org/docs/current/storage-hot.html

Fillfactor

https://www.cybertec-postgresql.com/en/postgresql-autovacuum-insert-only-tables/

Logs

-- Verify the log directory and current log file
SELECT pg_current_logfile();
-- Output of the above query
   pg_current_logfile
------------------------
 log/postgresql-Sat.csv

# Verify the log entries
 cat /pg_data/log/postgresql-Sat.csv
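Instead of reading the raw file with cat, the CSV log can be filtered with a few lines of Python. This is a minimal sketch, assuming the standard csvlog column order in which the timestamp, severity and message are the 1st, 12th and 14th columns; the helper names iter_log_entries and read_log are invented here, and the default path is simply the one used above.

```python
import csv

# 0-based positions of the columns used here in the csvlog format
# (assumed to follow the standard column order).
LOG_TIME, ERROR_SEVERITY, MESSAGE = 0, 11, 13


def iter_log_entries(lines, severities=("ERROR", "FATAL", "PANIC")):
    """Yield (log_time, severity, message) for rows with a matching severity."""
    for row in csv.reader(lines):
        if len(row) > MESSAGE and row[ERROR_SEVERITY] in severities:
            yield row[LOG_TIME], row[ERROR_SEVERITY], row[MESSAGE]


def read_log(path="/pg_data/log/postgresql-Sat.csv"):
    # newline="" lets the csv module handle multi-line quoted messages itself.
    with open(path, newline="") as f:
        return list(iter_log_entries(f))
```

Calling read_log() on the server returns only the error-level entries; passing a different severities tuple to iter_log_entries (for example ("LOG",)) selects other kinds of rows.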