postgres system catalog - ghdrako/doc_snipets GitHub Wiki

select * from information_schema.tables where table_schema ('information_schema', 'pg_catalog')

Metadata of all cluster objects (such as tables, indexes, data types, or functions) is stored in tables that belong to the system catalog.

  • Each database has its own set of tables (and views) that describe the objects of this database.
  • Several system catalog tables are common to the whole cluster; they do not belong to any particular database (technically, a dummy database with a zero id is used), but can be accessed from all of them. 'system wide' tables

Names of all system catalog tables begin with pg_*` `, like in pg_database. Column names start with a **three-letter prefix** that usually corresponds to the table name, like in datname``. In all system catalog tables, the column declared as the primary key is called oid (object identifier); its type, which is also called oid, is a 32-bit integer.

  • pg_catalog is used for system catalog tables.
  • information_schema provides an alternative view for the system catalog as defined by the sql standard.
  • pg_toast is used for objects related to TOAST
  • pg_temp comprises temporary tables. Although different users create temporary tables in different schemas called pg_temp_N, everyone refers to their objects using the pg_temp alias.

search_path parameter is implicitly extended with pg_catalog and (if necessary) pg_temp schemas.

System Wide Metadata

pg_database

pg_stat_database

  • Transactions

Transaction information can be found in the columns xact_commit and xact_rollback, which contain the number of transactions the database has committed and rolled back respectively. This will help show just how active a database is, as well as spot possible failures with programs who may be erroring / rolling back at an alarming rate.

  • Disk and Memory access

Information on whether or not data is retrieved from disk or memory is stored in the blks_read and blks_hit columns. Blks_read shows the number of blocks this database read from disk, while blks_hit shows the number of blocks that were found in PostgreSQL’s buffer cache (represented by the shared_buffers parameter). Since RAM is much faster than disk, we would ideally see blks_hit consistently higher than blks_read, and if not, we can re-evaluate our available memory.

  • Tuples

The next few columns deal with tuples. Tup_returned is the number of rows returned in the database, which is the number of rows read by sequential scans if from a table, or the number of index entries returned when from an index”. Tup_fetched is the number of rows fetched in the database, meaning they were a result of bitmap scans, which is the number of table rows fetched by bitmap scans if from a table, or table rows fetched by simple index scans if using an index.

We also have tup_inserted, tup_updated, and tup_deleted, which represents the number of tuples inserted, updated, and deleted in this database respectively. This will help understand how data enters, changes, and leaves the database. Since updated and deleted tuples result in dead rows, high values in these columns would suggest autovacuum operations be tuned to meet the needs of the database activity.

  • Conflicts

If the database in question is a standby server, the column conflicts comes in handy as a way to track how many queries were canceled due to conflicts with the standby being in ‘recovery mode’. If not a standby cluster, this column can be ignored.

  • Temporary files / data

At times, queries will need to write to temporary files. This can happen when the amount of work_mem allocated to the connection has been used up, and needs to continue a sort operation on disk rather than in memory. The column temp_files tracks the number of these files that were created, and temp_bytes tracks the total size of all temporary files used. This data can help inform work_mem tuning, or even finding queries that could use re-writing if temp file size is too big.

  • Deadlocks

The deadlocks column tracks how many times a deadlock occurs. Since a deadlock can cause errors for queries that otherwise wouldn’t error, it’s good to track this and make sure applications aren’t stepping on each others feet.

  • Read and write times

The columns blk_read_time and blk_write_time tracks the total number of milliseconds that backends in the database spend reading and writing data, which can be helpful if trying to compare / improve disk read/write speed

  • Stats reset

This column, stats_reset, simply shows a timestamp (with time zone) of the last time the stats mentioned in this row has been reset. A null value means they haven’t been reset since inception, or even possibly a crash of the database what may have wiped out these stats.

pg_stat_bgwriter

info about Checkpoints and The Background Writer

  • Checkpoints

A checkpoint happens either on schedule (represented by the checkpoint_timeout parameter), or when the maximum amount of WAL files have been used since the last checkpoint, and needs to force a checkpoint. Either way, a checkpoint writes dirty buffers to disk, and there are four columns tracking it.

The columns checkpoints_timed and checkpoints_req show the number of scheduled checkpoints occur (timed) and the number of requested checkpoints (also referred to as forced). A high climbing value of checkpoint_req could suggest an insufficient max_wal_size value.

Columns checkpoint_write_time and checkpoint_sync_time record the total amount of time (in milliseconds) the checkpoint process has spent writing and syncing to disk.

Finally, buffers_checkpoint is the total number of buffers written to disk by checkpoints.

  • Background Writer

The background writer is a separate process that writes dirty buffers to disk, which ideally reduces the amount of work the checkpointer needs to do.

The column buffers_clean represents the number of buffers written to disk by the background process. When compared to buffers_checkpoint, it shows how much of the workload is handled by each process (with the added knowledge that background writer has the possibility of writing buffers multiple times if they change often, vs less frequently with a timed checkpoint).

Maxwritten_clean represents the number of times the background writer reached the maximum number of pages to flush each time it runs (controlled with the bgwriter_lru_maxpages parameter).

  • Buffers in General

The remaining columns show us the general buffer information, with buffers_backend being the number of buffers a backend had to write itself, instead of background writer or checkpointer, buffers_backend_fsync is a count of how many times a backend had to execute its own fsync call, and buffers_alloc shows the number of buffers that have been allocated in general.

pg_stat_activity

  • General information

The pg_stat_activity view shows a row for every connection to the database, and some basic information about it. The column datname represents the database the connection is actually connected to, pid is the Process ID of the connection on the database host itself, and usesysid and usename represent the database user connected.

For the source of the client, client_addr is the IP address of the host the connection came from, null means it’s a local unix socket connection.

  • Timestamps

There are four timestamp columns that show when certain things started: backend_start is when the connection was actually established, xact_start is when the current transaction started (null if the client has no open transaction), query_start is when the current or most recent query started, and state_change is the time when the state of the connection last changed.

  • Connection Status

The final bits of pg_stat_activity cover the actual status of the connection. If the query is waiting on another to release locks, wait_event_type contains some information on what kind of wait event it is, and the column wait_event will show the wait event name. It’s a long list, but more information found at the PostgreSQL documentation.

Finally, the column ‘state’ shows what state the current connection is in, such as active, idle, idle in transaction, and the query column will show the actual query being run, or most recently run.

pg_lock

The pg_locks table works hand in hand with pg_stat_activity if looking into query activity. Anytime a lock is made to a relation, that information is stored in pg_locks. Using the pid from pg_stat_activity, we can query pg_locks to see what relations a connection may have locks on, what kinds of locks those are, and whether or not the locks have been granted.

The most important columns are ‘pid’, which matches the pid from pg_stat_activity, ‘relation’ which matches the OID from pg_class, ‘mode’ showing the name of the lock mode held, and ‘granted’ which states whether or not the lock in question has been granted.

pg_stat_replication

contains a row for every WAL sender process, containing information about its state, the location of the WAL files it’s working on, and the connection information of the standby host that is receiving the WAL data for replication.

pg_stat_wal_receiver

If the cluster is a standby, this will contain a single row showing statistics about the receiver process form the host.

pg_stat_subscription

If sending WAL data to a standby node, each row here will represent that subscription, and contain information about the status of the subscriptions.

pg_replication_slots

Contains a list of all the replication slots that exist on the cluster, and their current state.

Database Specific Metadata

pg_stat_user_tables

contains statistics on user access to the table

  • Table Access Methods

When clients access data from the table, it does so either directly or through indexes. The column ‘seq_scan’ counts the number of sequential scans the table received, and ‘seq_tup_read’ counts the number of tuples read through that process. The column ‘idx_scan’ counts how many times an index on the table was used to fetch data.

  • Table Tuple Activity

We now have a handful of columns that count different activities on the table.

‘n_tup_ins’ tracks the number of tuples inserted

‘n_tup_upd’ tracks the number of tuples updated

‘n_tup_del’ tracks the number of tuples deleted

  • Table Tuple State

Due to updates and deletes, there could be dead tuples that are no longer active data, and the vacuum process will eventually free them up. The columns ‘n_tup_ins’ and ‘n_tup_ins’ tracks the number of tuples that are alive and dead, respectively. When dead tuples reach a certain point, an autovacuum will be launched, depending on the autovacuum settings.

  • Table Vacuum Activity

Table maintenance is done through either VACUUM or AUTOVACUUM, and statistics are gathered through ANALYZE or AUTOANALYZE. The next four columns contain the dates for when each of these operations were last run: ‘last_vacuum’, ‘last_autovacuum’, ‘last_analyze’, ‘last_autoanalyze’.

We also have four more convenient columns that simply count how many times the previous actions occur. Using these, we can see which tables get the most activity: ‘vacuum_count’, ‘autovacuum_count’, ‘analyze_count’, and ‘autoanalyze_count’.

pg_statio_user_tables

contains I/O statistics for each table

The I/O output is useful for helping understand how the data is being accessed under the covers. The column ‘heap_blks_read’ represents the number of disk blocks read for this table, and ‘heap_blks_hit’ represents the buffer blocks read from memory on this table. This is helpful to know if queries accessing the table constantly have to go to disk, or fetch the data from memory.

Index stats on the table show the same information with the columns ‘idx_blks_read’ and ‘idx_blks_hit’.

Lastly, if the table has any TOAST tables, the columns ‘toast_blks_hit’ and ‘toast_blks_read’ track toast tables, while ‘tdix_blks_read’ and ‘tdix_blks_read’ tracks the indexes on those toast tables.

pg_stat_user_indexes

shows how many times the index was scanned with the column ‘idx_scan’, how many tuples were read with ‘idx_tup_read’, and how many live rows were actually fetched with ‘idx_tup_fetch’.

pg_statio_user_indexes

‘idx_blks_read’, and ‘idx_blks_hit’, representing the number of blocks read from disk and from memory