postgres parallel - ghdrako/doc_snipets GitHub Wiki

Parallel query

If the optimizer thinks that parallel processing would reduce the execution time, it will plan additional parallel worker processes. These processes are created by the query executor and live only for the duration of a single SQL statement. The parallel worker processes calculate intermediate results, which eventually are collected at the original backend process. This collection happens in the β€œGather” node of the execution plan: PostgreSQL executes all steps below the Gather node in parallel, while everything above Gather is single-threaded.

Limitations

While parallel query can speed up query execution, it incurs some overhead:

  • starting parallel worker processes is an expensive operation
  • exchanging data between worker processes and the backend requires inter-process communication via dynamic shared memory segments (DSM)
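The planner models this overhead with two cost parameters, which can be inspected (and tuned) per session. The values below are the PostgreSQL defaults:

-- Planner cost parameters that model parallel-query overhead
-- (same arbitrary cost units as seq_page_cost)
SHOW parallel_setup_cost;   -- cost of starting worker processes, default 1000
SHOW parallel_tuple_cost;   -- cost of passing one row from a worker to the leader, default 0.1

Raising these values makes the planner less willing to choose a parallel plan; lowering them has the opposite effect.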

The PostgreSQL optimizer takes this overhead into account by planning parallel query only for expensive statements that process large tables or indexes. There are some additional limits to prevent parallel query from consuming too many resources:

If the pool defined by max_parallel_workers is exhausted, the query executor cannot start all the parallel processes that the optimizer planned.
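The gap between planned and started workers is visible with EXPLAIN ANALYZE, which reports both numbers. A sketch of what this looks like when the pool is exhausted (output abbreviated):

-- "Workers Launched" can be lower than "Workers Planned"
-- when the max_parallel_workers pool is already in use
EXPLAIN (ANALYZE, COSTS OFF) SELECT count(*) FROM bar;
--  Finalize Aggregate
--    ->  Gather
--          Workers Planned: 2
--          Workers Launched: 0    <- pool was exhausted
--          ...

In that case the leader backend executes the parallel plan alone, which is usually slower than a plain serial plan would have been.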

EXPLAIN (COSTS OFF) SELECT count(*) FROM bar;
 
                 QUERY PLAN                 
════════════════════════════════════════════
 Finalize Aggregate
   ->  Gather
         Workers Planned: 2
         ->  Partial Aggregate
               ->  Parallel Seq Scan on bar
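To experiment with plans like the one above on small test tables, the cost-based limits can be relaxed for the current session. A minimal sketch (session-local settings, reverted on RESET or disconnect):

-- Encourage parallel plans even for small tables, for testing only
SET parallel_setup_cost = 0;            -- pretend starting workers is free
SET parallel_tuple_cost = 0;            -- pretend row transfer is free
SET min_parallel_table_scan_size = 0;   -- consider parallel scans on tiny tables

These settings are useful for studying execution plans, not for production tuning.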

To disable parallel query for the current session, so that each statement runs on a single core, use the following command:

SET max_parallel_workers_per_gather TO 0;
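To undo this, or to control parallelism per table instead of per session, the following sketch may help (the table name bar is taken from the example above):

-- Restore the session default (typically 2)
RESET max_parallel_workers_per_gather;

-- Override the planner's worker estimate for one table
ALTER TABLE bar SET (parallel_workers = 4);

The parallel_workers storage parameter fixes the number of workers the planner assumes for scans of that table, instead of deriving it from the table's size.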