postgres Autovacuum Autoanalyze ANALYZE SKIP_LOCKED - ghdrako/doc_snipets GitHub Wiki

VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]

Autovacum

Autovacuum can be thought of as the “garbage collection” process for PostgreSQL.

Autovacuum is a background scheduler that runs VACUUM workers for each table. The scheduler process runs by default when PostgreSQL starts. Autovacuum thresholds are configurable and are considered conservative by default. Thresholds can be tuned using various parameters.. When Autovacuum runs a worker on a table, it uses a SHARE UPDATE EXCLUSIVE lock mode. This lock type is both an “exclusive” lock and a “shared” lock.Access for reads of the table is shared, while access for writes (like updates) is exclusive.

PostgreSQL database tables are auto-vacuumed by default when 20% of the rows plus 50 rows are inserted, updated, or deleted. Tables are auto-analyzed when a threshold is met for 10% of the rows plus 50 rows. For example, a table with 10000 rows is not auto-vacuumed until 2050 rows are inserted, updated, or deleted. That same table is auto-analyzed when 1050 rows are inserted, updated, or deleted.

Autovacuum does not recover the disk space taken up by dead tuples.When Autovacuum processes dead tuples, the space they used is marked for re-use and the visibility map (VM) is updated.The visibility map is a data structure that tracks information about heap (table) pages.

information about the last time the ebk schema tables were vacuumed or analyzed:

select relname,last_vacuum, last_autovacuum, last_analyze, vacuum_count, autovacuum_count,
  last_autoanalyze from pg_stat_user_tables where schemaname = 'ebk' order by relname ASC;

Autovacuum Database Parameters

Parameter Name Original Value Tuned Value
autovacuum_vacuum_scale_factor 0.2 0.01
autovacuum_vacuum_cost_limit 200 2000
autovacuum_vacuum_cost_delay 20ms 2ms

The parameters autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor are related to the amount of dead tuples as a fraction of the total tuples. These values express a threshold that when exceeded, acts as triggering logic for a vacuum worker to run. These values may be changed for all tables or on individual tables. Changing the value first for individual tables that are heavily updated is a good place to start. The default scale factor value is 0.2. This means that when 20 percent of the table size is made up of dead tuples, VACUUM is triggered for that table. A good place to start is by following a common recommendation to lower the autovacuum_scale_factor so that VACUUM runs when there are fewer dead tuples. A value of 0.01 (1 percent) or even less causes VACUUM to run earlier than it would have otherwise.This parameter can be changed as the owner user.

ALTER   TABLE  trips  SET  (autovacuum_vacuum_scale_factor = 0.01);

The parameter autovacuum_vacuum_cost_limit specifies a cost of work that Autovacuum completes before it stops running (and goes to sleep). Raising this value increases the cost before Autovacuum stops, which means the worker completes more work (higher cost) and runs longer before it sleeps. autovacuum_vacuum_cost_limit, which has a default value of 200. Each autovacuum thread is assigned a cost limit using this formula shown below:

individual thread’s cost_limit = autovacuum_vacuum_cost_limit / autovacuum_max_workers

Parametr has a default value of 200. This parameter is used when VACUUM is run manually from psql. Try running VACUUM VERBOSE trips; from psql. The VERBOSE option shows more information about what’s happening. Change the value of vacuum_cost_limit to 2000 (ten times increase). Vacuum now completes ten times the amount of work before stopping. This is a change to try if your Autovacuum jobs are not cleaning up dead tuples at a fast enough rate. This parameter can be changed without restarting PostgreSQL, so you can experiment with different values to see the effect. The corresponding autovacuum_vacuum_cost_limit for vacuum_cost_limit is for Autovacuum VACUUM jobs instead of manual ones. You may also want to make global parameter changes. Try changing autovacuum_vacuum_cost_limit in your config file, which is a global change, then reloading the config file using `pg_ctl reload``

The cost of work done by an autovacuum thread is calculated using three parameters:

parameters mean is this:

  • When a vacuum thread finds the data page that it’s supposed to clean in the shared buffer, the cost is 1.
  • If the data page is not in the shared buffer, but the OS cache, the cost will be 10.
  • If the page has to be marked dirty because the vacuum thread had to delete dead rows, the cost will be 20.

autovacuum_vacuum_cost_delay controls the length of time Autovacuum is paused. A lower value means Autovacuum is paused for shorter periods of time. In the table, the value is reduced from 20ms to 2ms. In PostgreSQL 15, the default value for this parameter became 2ms, so this parameter change isn’t necessary on version 15 or newer.

configure individual tables.

ALTER TABLE <table_name> SET (autovacuum_vacuum_cost_limit = <large_value>)
ALTER TABLE <table_name> SET (autovacuum_vacuum_cost_delay = <lower_cost_delay>)

PostgreSQL will start autovacuum on a table when:

pg_stat_user_tables.n_dead_tup > (pg_class.reltuples x autovacuum_vacuum_scale_factor)  + autovacuum_vacuum_threshold

NOTE:For larger installations (that is, thousands of workloads), we recommend that you customize the settings as follows on individual tables to ensure optimal performance for data access:

  • The scale factor should be set to zero for both vacuum and analyze auto-vacuum settings.
  • The threshold should be set to 1000 for both the vacuum and analyze threshold settings.
ALTER TABLE ebk.tab1 SET (autovacuum_vacuum_scale_factor = 0.0);
ALTER TABLE ebk.workload_cost SET (autovacuum_vacuum_threshold = 1000);
ALTER TABLE ebk.workload_cost SET (autovacuum_analyze_scale_factor = 0.0);
ALTER TABLE ebk.workload_cost SET (autovacuum_analyze_threshold = 1000);

Autovacum insert-only tables

Autoanalyze

autoanalyze also uses two parameters that decide when autovacuum will also trigger an autoanalyze:

  • autovacuum_analyze_threshold: this has a default value of 50
  • autovacuum_analyze_scale_factor: this has a default value of 0.1

Modifying the autovacuum_analyze_threshold setting for a table.

ALTER TABLE <table_name> 
SET (autovacuum_analyze_threshold = <threshold rows>)

Manual analyze

ANALYZE [ VERBOSE ] [ table_name [ ( column_name [, ...] ) ] ]
ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

where option can be one of:

    VERBOSE [ boolean ]
    SKIP_LOCKED [ boolean ]
    BUFFER_USAGE_LIMIT size

and table_and_columns is:

    table_name [ ( column_name [, ...] ) ]

Example

  • ANALYZE users; collects statistics for users table.
  • ANALYZE VERBOSE users; does exactly the same plus prints progress messages.
  • ANALYZE users (id, display_name); collects statistics for id and display_name columns of users table.
  • `ANALYZE``; collects statistics for all table in the current database.

Note that, in PostgreSQL, ANALYZE command doesn't read or update indexes. It deals only with table/column contents. It also doesn't block the table - other queries may read from the table while ANALYZE is running.

SKIP_LOCKED

ANALYZE SKIP_LOCKED user;

Turn off autovacume for table

CREATE TABLE vac( id integer, s char(100) ) 
WITH (autovacuum_enabled = off);

Python

def vacuum(self):
    old_isolation_level = self.conn.isolation_level
    self.conn.set_isolation_level(0)
    query = "VACUUM FULL"
    self._doQuery(query)
    self.conn.set_isolation_level(old_isolation_level)

Java

final Statement st2 = connection.createStatement();
st2.executeUpdate("VACUUM FULL ANALYZE VERBOSE");
st2.close();
⚠️ **GitHub.com Fallback** ⚠️