postgres storage - ghdrako/doc_snipets GitHub Wiki
- https://postgresql.org/docs/14/storage-file-layout.html
- https://rachbelaid.com/introduction-to-postgres-physical-storage/
Physical storage
All the data needed for a PostgreSQL database cluster is stored within the cluster's data directory, whose location is commonly supplied through the PGDATA environment variable. On RHEL-based systems the usual location of the data directory is /var/lib/pgsql/15/data (for version 15).
db=# show data_directory;
/var/lib/pgsql/15/data
Change location
- Stop server
- Create New dir
mkdir /pg_data ; chown postgres:postgres /pg_data
- Move data
rsync -av /var/lib/pgsql/15/data/ /pg_data
- Modify config: set data_directory in postgresql.conf (or PGDATA in the service definition) to /pg_data
- Start server
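The "Modify config" step can be sketched as a small text rewrite of postgresql.conf. This is only an illustration under assumptions: the helper name set_data_directory is hypothetical, it edits the configuration text in memory, and it assumes the server locates its data directory through the data_directory parameter (on packaged installs the service's PGDATA setting may need to change instead or as well).

```python
import re

# Matches an active (uncommented) data_directory setting.
_ACTIVE = re.compile(r"^\s*data_directory\b")


def set_data_directory(conf_text: str, new_path: str) -> str:
    """Return conf_text with every active data_directory line set to new_path.

    Hypothetical helper: commented-out lines are left alone, and the setting
    is appended if no active line exists.
    """
    quoted = new_path.replace("'", "''")  # single quotes are doubled in values
    new_line = f"data_directory = '{quoted}'"
    lines = conf_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        if _ACTIVE.match(line):
            lines[i] = new_line
            found = True
    if not found:
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

The function only returns the new text; writing it back to the file and restarting the server are left to the remaining steps.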
Relation
- PostgreSQL term that covers tables, indexes, and similar objects
In PostgreSQL, all these objects are referred to by the generic term relation.
All information associated with a relation is stored in several different forks, each containing data of a particular type.
Initially, a fork is represented by a single file. Its filename consists of a numeric ID (oid), which can be extended by a suffix that corresponds to the fork's type.
The file grows over time, and when its size reaches 1 GB, another file of this fork is created (such files are sometimes called segments). The sequence number of the segment is added to the end of its filename.
You can change the 1 GB limit when building PostgreSQL (./configure --with-segsize).
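The segment naming rule can be illustrated with a short sketch. The function name segment_file is hypothetical, the default segment size is assumed to be the stock 1 GB, and the filenode 24581 is borrowed from the TOAST example further down this page.

```python
SEGMENT_SIZE = 1 << 30  # assumed default segment size: 1 GB


def segment_file(filenode: int, byte_offset: int, suffix: str = "",
                 segment_size: int = SEGMENT_SIZE) -> str:
    """Return the name of the file that holds byte_offset of a fork.

    suffix is the fork suffix (empty for the main fork).
    """
    if byte_offset < 0:
        raise ValueError("byte_offset must be non-negative")
    segment_no = byte_offset // segment_size
    name = f"{filenode}{suffix}"
    # The first segment has no sequence number; later ones get .1, .2, ...
    return name if segment_no == 0 else f"{name}.{segment_no}"


print(segment_file(24581, 0))                      # first segment
print(segment_file(24581, 3 * SEGMENT_SIZE + 42))  # fourth segment
```

A server built with a different --with-segsize value would need the matching segment_size argument.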
There are several standard types of forks.
- The main fork represents actual data: table rows or index rows. This fork is available for any relation (except for views, which contain no data). Files of the main fork are named by their numeric IDs, which are stored as relfilenode values in the pg_class table.
- The initialization fork is available only for unlogged tables (created with the UNLOGGED clause) and their indexes. Such objects are the same as regular ones, except that any actions performed on them are not written into the write-ahead log. It has the same name as the main fork, but with the _init suffix.
- The free space map keeps track of available space within pages. Its volume changes all the time, growing after vacuuming and getting smaller when new row versions appear. The free space map is used to quickly find a page that can accommodate new data being inserted. All files related to the free space map have the _fsm suffix. Initially, no such files are created; they appear only when necessary. The easiest way to get them is to vacuum a table.
- The visibility map can quickly show whether a page needs to be vacuumed or frozen. For this purpose, it provides two bits for each table page. The first bit is set for pages that contain only up-to-date row versions. Vacuum skips such pages because there is nothing to clean up. Besides, when a transaction tries to read a row from such a page, there is no point in checking its visibility, so an index-only scan can be used. The second bit is set for pages that contain only frozen row versions; this part of the fork is sometimes called the freeze map. Visibility map files have the _vm suffix.
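Because the visibility map keeps two bits per table page, its size can be estimated from the table size. The sketch below is a rough lower bound only: it assumes the default 8 kB page size and ignores the page headers and whole-page allocation of the real _vm file, so the file on disk will be somewhat larger. The function name visibility_map_bytes is hypothetical.

```python
PAGE_SIZE = 8192        # assumed default page size in bytes
BITS_PER_HEAP_PAGE = 2  # all-visible bit + all-frozen bit


def visibility_map_bytes(table_bytes: int) -> int:
    """Rough lower bound for the payload of the _vm fork, in bytes."""
    heap_pages = -(-table_bytes // PAGE_SIZE)  # ceiling division
    bits = heap_pages * BITS_PER_HEAP_PAGE
    return -(-bits // 8)                       # round up to whole bytes


# A 1 GB main fork has 131072 pages, so the map needs about 32 kB of bits.
print(visibility_map_bytes(1 << 30))
```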
TOAST
PostgreSQL stores table rows (tuples) and index entries in fixed-size pages, 8 kB by default. A row or tuple cannot extend across multiple pages, so a large field value cannot be stored directly in its row.
The Oversized-Attribute Storage Technique (TOAST) mechanism ensures that a tuple does not exceed the page size by compressing oversized attributes and/or storing them separately in a TOAST table. The block size therefore remains an absolute upper limit for the size of the row kept in the main table, while the values themselves can be much larger.
\d+ book.author_list
In the output, the Storage column shows values such as plain or extended.
select relname, relfilenode, reltoastrelid from pg_class where relname = 'author_list';
# Output of the above command
relname | relfilenode | reltoastrelid
-------------+-------------+---------------
author_list | 24581 | 24584
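- 24581 is the relfilenode of the regular table (here it matches the table's OID, which the TOAST table name pg_toast_24581 is built from)
- 24584 is the OID of the TOAST table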
\d+ pg_toast.pg_toast_24581;
select chunk_id, chunk_seq, length(chunk_data) from pg_toast.pg_toast_24581;
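The query above returns one row per chunk of each out-of-line value. As a rough model of what to expect, the sketch below splits a stored value into chunks. It assumes a maximum chunk size of 1996 bytes, which is the usual figure for 8 kB pages but should be treated as an assumption here. The length is that of the stored (possibly compressed) value, and toast_chunks is a hypothetical name.

```python
TOAST_CHUNK_SIZE = 1996  # assumed maximum chunk size with 8 kB pages


def toast_chunks(stored_length: int, chunk_size: int = TOAST_CHUNK_SIZE):
    """Return (chunk_seq, chunk_length) pairs for a value of stored_length bytes."""
    if stored_length < 0:
        raise ValueError("stored_length must be non-negative")
    chunks = []
    seq = 0
    remaining = stored_length
    while remaining > 0:
        size = min(chunk_size, remaining)  # the last chunk may be shorter
        chunks.append((seq, size))
        remaining -= size
        seq += 1
    return chunks


print(toast_chunks(5000))
```

Each pair corresponds to the chunk_seq and length(chunk_data) columns of one row; all rows of the same value share one chunk_id.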
Heap-Only Tuples (HOT)
Fillfactor
Logs
# Verify the log directory and current log file
SELECT pg_current_logfile();
# Execution output of the above SQL
pg_current_logfile
------------------------
log/postgresql-Sat.csv
# Verify the log entries
cat /pg_data/log/postgresql-Sat.csv
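Instead of reading the raw file with cat, the CSV log can be summarized with a few lines of Python. This is a minimal sketch: the column positions are assumed from the documented csvlog column order (log_time first, error_severity twelfth, message fourteenth), trailing columns differ between PostgreSQL versions, and count_by_severity is a hypothetical helper.

```python
import csv
from collections import Counter

# Zero-based positions of a few leading csvlog columns (assumed order).
LOG_TIME, ERROR_SEVERITY, MESSAGE = 0, 11, 13


def count_by_severity(lines):
    """Count log records per severity from an iterable of csvlog lines."""
    counts = Counter()
    for row in csv.reader(lines):
        if len(row) > ERROR_SEVERITY:  # skip short or malformed rows
            counts[row[ERROR_SEVERITY]] += 1
    return counts
```

The function accepts any iterable of lines, so an open file object works; opening the file with newline="" lets the csv module handle messages that span several lines.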