postgres vacume - ghdrako/doc_snipets GitHub Wiki

Vacuuming detects dead tuples based on the database horizon.

Way Postgres implements changes to rows in a table:

  • An UPDATE operation in Postgres is equivalent to a DELETE of the previous tuple, plus an INSERT of the new one.
  • A DELETE operation in Postgres does not cause the data to be removed from physical storage. It only causes it to be marked as deleted. This is why Postgres has the autovacuum process.

Table bloat happens when a table’s physical footprint far exceeds the size of the data that it actually holds.

VACUUM VERBOSE table; 
ALTER TABLE testings SET (autovacuum_enabled = true);

Monitoring autovacume

SELECT * FROM pg_stat_progress_vacuum

Above view shows:

  • phase — the name of the current vacuum phase (I have described the main ones, but there are actually more of them1)
  • heap_blks_total — the total number of pages in a table
  • heap_blks_scanned — the number of scanned pages
  • heap_blks_vacuumed — the number of vacuumed pages
  • index_vacuum_count — the number of index scans 1
    The overall vacuuming progress is defined by the ratio of heap_blks_vacuumed to heap_blks_total, but you have to keep in mind that it changes in spurts because of index scans. In fact, it is more important to pay attention to the number of vacuum cycles: if this value is greater than one, it means that the allocated memory was not enough to complete vacuuming in one go.

Change memory

ALTER SYSTEM SET maintenance_work_mem = '1MB';
SELECT pg_reload_conf();

Set how often autovacume must execute

ALTER SYSTEM SET autovacuum_naptime = '1s';

pg_repack or pg_squeeze can compact tables without exclusive locks.

Tunning vacume

Parameters

Name Type Range Default Desc
autovacuum_vacuum_scale_factor float 0 ... 100 The default is 0.2. No
autovacuum_vacuum_threshold integer 0 ... 2147483647 The default is 50.  
autovacuum_max_workers integer 1 ... varies (see note) The default is 3.

autovacuum_vacuum_scale_factor denotes the portion of the table size that, when surpassed by "dead" rows, prompts a vacuum action.

ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.1; -- set 10% threshold dead tuples
SELECT pg_reload_conf();

autovacuum_vacuum_scale_insert_factor - fraction of the total number of tuples in a table that must be modified before autovacuum cleans the table.

select last_autovacuum,autovacuum_count,vacuum_count from pg_stat_user_tables;

Dla bardzo dużych tabel( miliard wierszy) współczynnik defaulowy 0.2 moze być za duży. Lepszy będzie 0.02 a dla tabel z duza ilością zmian nawet 0.002.

obraz obraz obraz

autovacuum_naptime parameter in Postgres specifies the minimum delay between autovacuum runs on any given database.Postgres wake up autovacume every autovacuum_naptime seconds. Generally, depending on just how high-churn your workloads are, it might be necessary to decrease this value, whereas a longer interval could be suited for environments that are not churning at such a high rate.

autovacuum_vacuum_cost_delay is the amount of time that the autovacuum process will halt processing when the autovacuum_vacuum_cost_limit is reached. In example a table reaches 10% bloat, meaning 10% of the tuples are dead. When the 10% threshold is reached, the autovacuum worker begins to work and starts accruing cost. When that cost reaches autovacuum_vacuum_cost_limit, it will pause for the duration specified by autovacuum_vacuum_cost_delay, and then continue working until it is complete.

obraz

SELECT relname, n_dead_tup FROM pg_stat_user_tables;  -- nie wyczyszczone rekordy
SELECT
       relname AS "table_name",
       pg_size_pretty(pg_table_size(C.oid)) AS "table_size"
FROM
       pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND nspname !~ '^pg_toast' AND relkind IN ('r')
ORDER BY pg_table_size(C.oid)
DESC LIMIT 1;
SELECT relname, last_vacuum, last_autovacuum FROM pg_stat_user_tables;
[pg_stat_progress_vacuum](https://www.postgresql.org/docs/10/static/progress-reporting.html)
# Install pgstattuple extension  
CREATE EXTENSION pgstattuple;  
# Identify dead tuples  
select * from pgstattuple('testings');

Troubleshooting

VACUUM (VERBOSE)

Jak zwroci xmin to

SELECT * FROM pg_stat_activity WHERE backend_xmin = '<xmin>';

There are three reasons why dead tuples cannot be removed:

  1. There is a long running transaction that has not been closed. You can find the bad boys with
    SELECT pid, datname, usename, state, backend_xmin
    FROM pg_stat_activity
    WHERE backend_xmin IS NOT NULL
    ORDER BY age(backend_xmin) DESC;
You can get rid of a transaction with [pg_cancel_backend() or pg_terminate_backend()](https://www.postgresql.org/docs/current/static/functions-admin.html#FUNCTIONS-ADMIN-SIGNAL).

VACUUM can only remove dead tuples which are long-dead, that is, dead to all possible uses. If you have long-lived transactions, they may prevent the recently-dead tuples from being removed.

Certainly a long running select or insert statement will do that. For isolation levels higher than read-committed, the whole transaction will retain the snapshot until it is down, so if some opens a repeatable read transaction and then goes on vacation without committing it, that would be a problem. Hung-up prepared transactions will as well (if you don't know what a prepared transaction is, then you probably aren't using them). Start logging the output of your VACUUM VERBOSE statements, so that you can find the information that covers the period during which the problem exists.

  1. There are prepared transactions which have not been commited. You can find them with
    SELECT gid, prepared, owner, database, transaction
    FROM pg_prepared_xacts
    ORDER BY age(transaction) DESC;
User [COMMIT PREPARED](https://www.postgresql.org/docs/current/static/sql-commit-prepared.html) or [ROLLBACK PREPARED](https://www.postgresql.org/docs/current/static/sql-rollback-prepared.html) to close them.

obraz

obraz 3. There are replication slots which are not used. Find them with

    SELECT slot_name, slot_type, database, xmin, catalog_xmin
    FROM pg_replication_slots
    ORDER BY age(xmin),age(catalog_xmin) DESC;
Use [pg_drop_replication_slot()](https://www.postgresql.org/docs/current/static/functions-admin.html#FUNCTIONS-REPLICATION) to delete an unused replication slot.

Hot standby obraz Replication slot obraz

  1. Long running backends - start transaction and forgot to end obraz
SELECT pid,datname,usename,state,backend_xmin FROM pg_stat_activity WHERE backend_xmin IS NOT NULL ORDER BY age(backend_xmin) DESC;

Transaction ID wraparound

obraz obraz obraz

⚠️ **GitHub.com Fallback** ⚠️