performance - GradedJestRisk/db-training GitHub Wiki

Performance

Restrict resources

From OS

Use docker quota

  • RAM
  • CPU
  • I/O is not available (unfortunately, as database performance rely on I/O)

Total RAM

services
  database:
    image: bitnami/postgresql
    deploy:
       resources:
          limits:
             cpus: '1'
             memory: 512m    

Shared memory (/dev/shm, used for IPC)

What does actually do ? https://docs.docker.com/compose/compose-file/compose-file-v3/#shm_size

services
  database:
    image: bitnami/postgresql
    shm_size: 256m    

Check using inspect

docker inspect databse | grep -i shm

Memory usage

Check cache https://stackoverflow.com/questions/1216660/see-and-clear-postgres-caches-buffers

Clear OS cache

echo 3 | sudo tee /proc/sys/vm/drop_caches

Reduce cache

The next best thing seems to be: Actual max RAM = shared_buffers + (temp_buffers + work_mem) * max_connections

https://stackoverflow.com/questions/28844170/how-to-limit-the-memory-that-is-available-for-postgresql-server

Restrict connexion max_connections (minimum 2, 1 for Postgresql maintenance and 1 for actual queries).

Restrict memory usage for

  • cache, using shared_buffers
  • sorting, using work_mem
  • temporary tables, using temp_buffers
  • autovacuum, using maintenance_work_mem

You can do this in postgresql.conf

shared_buffers = 2GB

Which you expose this way

services
  db-ptm-integration:
    image: bitnami/postgresql
    volumes:
      - ./postgresql-configuration:/bitnami/postgresql/conf

Or using parameter (works with alpine, to test here...)

services
  db-ptm-integration:
    image: bitnami/postgresql
    command: postgres
                -c shared_buffers=256m

better: configuration file extra https://stackoverflow.com/questions/76149706/bitnami-postgresql-replaces-postgresql-conf-after-restarting-docker

Index

Default fillfactor is 90%

statistics

pgstatindex function

Column Type Description
version integer B-tree version number
tree_level integer Tree level of the root page
index_size bigint Total index size in bytes
root_block_no bigint Location of root page (zero if none)
internal_pages bigint Number of "internal" (upper-level) pages
leaf_pages bigint Number of leaf pages
empty_pages bigint Number of empty pages
deleted_pages bigint Number of deleted pages
avg_leaf_density float8 Average density of leaf pages
leaf_fragmentation float8 Leaf page fragmentation

Doc

SELECT * FROM pgstatindex('pg_cast_oid_index');
CREATE EXTENSION pgstattuple;

DROP TABLE test;

CREATE TABLE test (
   id        bigint  CONSTRAINT test_pkey PRIMARY KEY,
   unchanged integer,
   changed   integer
);
 
INSERT INTO test
   SELECT i, i, 0
   FROM generate_series(1, 100000) AS i;
 
CREATE INDEX test_unchanged_idx ON test (unchanged);
CREATE INDEX test_changed_idx ON test (changed);

ANALYZE VERBOSE test;

-- block size
SELECT pg_size_pretty(current_setting('block_size')::numeric);

-- table size
SELECT pg_size_pretty(pg_table_size('test'));

-- index size
SELECT pg_size_pretty(pg_table_size('test_pkey'));

SELECT 
    'index =>'
    ,ndx_stt.tree_level tree_depth
    ,ndx_stt.internal_pages node_block_count
    ,ndx_stt.leaf_pages leaves_block_count
    ,pg_size_pretty(index_size) size
    ,'changes=>' 
    ,ndx_stt.avg_leaf_density density
    ,ndx_stt.empty_pages empty_block_count
    ,ndx_stt.deleted_pages deleted_block_count
    ,ndx_stt.leaf_fragmentation fragmentation_rate
    , 'pgstatindex.*'
    ,ndx_stt.*
FROM pgstatindex('test_unchanged_idx') ndx_stt;

CREATE EXTENSION "tablefunc";
SELECT normal_rand(1, 10000, 10);
UPDATE testtab SET changed = changed + 1 WHERE id = :id;

bloat (dead entries)

Cyberted

In a B-tree index, there is an index entry for every row version (“tuple”) in the table that is not dead (invisible to everybody). When VACUUM removes dead tuples, it also has to delete the corresponding index entries. Just like with tables, that creates empty space in an index page. Such space can be reused, but if no new entries are added to the page, the space remains empty.

This “bloat” is unavoidable and normal to some extent, but if it gets to be too much, the index will become less efficient:

  • for an index range scan, more pages have to be scanned
  • index pages cached in RAM means that you cache the bloat, which is a waste of RAM
  • fewer index entries per page mean less “fan out”, so the index could have more levels than necessary

This is particularly likely to happen if you update the same row frequently. Until VACUUM can clean up old tuples, the table and the index will contain many versions of the same row. This is particularly unpleasant if an index page fills up: then PostgreSQL will “split” the index page in two. This is an expensive operation, and after VACUUM is done cleaning up, we end up with two bloated pages instead of a single one.

solutions

index deduplication

https://www.cybertec-postgresql.com/en/b-tree-index-improvements-in-postgresql-v12/

Several consecutive index leaf's entries can point to a heap tuples with same indexed value (the rows are still different) This is true even for unique indexes, not only for index range scan, because of MVCC.

Cybertec

B-tree indexes used to store an index entry for each referenced table row. This makes maintenance faster but can lead to duplicate index entries. Duplicate index entries can still occur in B-tree indexes, but PostgreSQL deduplicates B-tree entries only when it would otherwise have to split the index page. So it has to do the extra work only occasionally, and only when it would have to do extra work anyway. This extra work is balanced by the reduced need for page splits and the resulting smaller index size.

HOT tuples

Cybertec

The creation of HOT tuples is perhaps the strongest weapon PostgreSQL has to combat unnecessary churn in the index. With this feature, an UPDATE creates tuples that are not referenced from an index, but only from the previous version of the table row. That way, there is no need to write a new index entry at all, which is good for performance and completely avoids index bloat.

Bitmap index scan : fetch form heap in batches

Cybertec

To avoid redundancy and to keep index tuples small, the visibility information is not stored in the index. The status of an index tuple is determined by the heap tuple it points to, and both are removed by VACUUM at the same time. As a consequence, an index scan has to inspect the heap tuple to determine if it can “see” an entry. This is the case even if all the columns needed are in the index tuple itself. Even worse, this “heap access” will result in random I/O, which is not very efficient on spinning disks.

This makes index scans in PostgreSQL more expensive than in other database management systems that use a different architecture. To mitigate that, several features have been introduced over the years:

  • “bitmap index scan”. This scan method first creates a list of heap blocks to visit and then scans them sequentially. This not only reduces the random I/O, but also avoids that the same block is visited several times during an index scan.

Index-only scan, aka all rows visible within block

Check code

Cybertec

To avoid redundancy and to keep index tuples small, the visibility information is not stored in the index. The status of an index tuple is determined by the heap tuple it points to, and both are removed by VACUUM at the same time. As a consequence, an index scan has to inspect the heap tuple to determine if it can “see” an entry. This is the case even if all the columns needed are in the index tuple itself. Even worse, this “heap access” will result in random I/O, which is not very efficient on spinning disks.

This makes index scans in PostgreSQL more expensive than in other database management systems that use a different architecture. To mitigate that, several features have been introduced over the years: “index-only scan”, which avoids fetching the heap tuple. This requires that

  • all the required columns are in the index
  • the “visibility map” shows that all tuples in the table block are visible to everybody.

All tuples in the table block are visible to everybody means

  • each row/tuple/item of the block is visible to everybody : rely on blog post - I didn't find how rto query it
    • means no uncommitted transaction can see this entry => the transaction has been started after this row version has been updated (xmax)
    • it is stored in the row header
    • it is set by any read query
  • all the rows means :
    • each and every row, from each and any block, are visible to everybody
    • it is stored in the table visibility map (VM), all_visible bit
    • it is set by VACUUM (not full)

Visibility map statistics

SELECT
   relname         table_name
   ,relpages       block_count          -- Number of pages
   ,relallvisible  all_visible_block_count   -- Number of pages that are visible to all transactions
   ,TRUNC(relallvisible / relpages) * 100 || ' %' pct_visible
FROM pg_class
WHERE 1=1
    AND relname = 'mytable'
    --AND relpages <> 0

Killed index tuples, aka simple index deletion

Index leaf entries :

  • can be marked as dead (pointing to a dead item in heap);
  • by a heap fetch following an index scan;
  • so next index scan can skip it (no heap fetch).

Postgresql doc

Not all deletion operations that are performed within B-Tree indexes are bottom-up deletion operations. There is a distinct category of index tuple deletion: simple index tuple deletion. This is a deferred maintenance operation that deletes index tuples that are known to be safe to delete (those whose item identifier's LP_DEAD bit is already set). Like bottom-up index deletion, simple index deletion takes place at the point that a page split is anticipated as a way of avoiding the split.

Cybertec

To avoid redundancy and to keep index tuples small, the visibility information is not stored in the index. The status of an index tuple is determined by the heap tuple it points to, and both are removed by VACUUM at the same time. As a consequence, an index scan has to inspect the heap tuple to determine if it can “see” an entry. This is the case even if all the columns needed are in the index tuple itself. Even worse, this “heap access” will result in random I/O, which is not very efficient on spinning disks.

This makes index scans in PostgreSQL more expensive than in other database management systems that use a different architecture. To mitigate that, several features have been introduced over the years (..)

When an index scan encounters an entry that points to a dead tuple in the table, it will mark the index entry as “killed”. Subsequent index scans will skip such entries, even before VACUUM can remove them.

Next time you encounter mysterious query run-time differences that cannot be explained by caching effects, consider killed index tuples as the cause. This may be an indication to configure autovacuum to run more often on the affected table, since that will get rid of the dead tuples and speed up the first query execution

But beware that even vacuum wil not necessarily remove dead items in heap / and indexes right away, so

  • the block will into become all-visible
  • no index-only-scan

We only skip index vacuuming when 2% or less of the table's pages have one or more LP_DEAD items -- bypassing index vacuuming as an optimization must not noticeably impede setting bits in the visibility map.

Cybertec

You can bypass though

VACUUM (INDEX_CLEANUP ON) mytable

Bottom-up index tuple deletion

Pg doc

Cybertec

“Bottom-up index tuple deletion” goes farther than the previous approaches: it deletes index entries that point to dead tuples right before an index page split is about to occur. This can reduce the number of index entries and avoid the expensive page split, together with the bloat that will occur later, when VACUUM cleans up.

it promises to provide a solid improvement for many workloads, especially those with lots of updates.

another post

Fastware

⚠️ **GitHub.com Fallback** ⚠️