postgres buffer cache

The buffer cache is located in the server’s shared memory and is accessible to all processes. It takes up the major part of the shared memory and is one of the most important and complex data structures in PostgreSQL. It consists of an array of buffers. Each buffer reserves a memory chunk that can accommodate a single data page together with its header.

A header contains some information about the buffer and the page in it, such as:

physical location of the page (file ID, fork, and block number in the fork)
the attribute showing that the data in the page has been modified and sooner or later has to be written back to disk (such a page is called dirty)
buffer usage count
pin count (or reference count)
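The fields listed above can be pictured as a small record attached to each buffer. The following is only an illustrative sketch: the class name, field names, and sample values are assumptions made for this example, not PostgreSQL's actual header structure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (file ID, fork, block number): the physical location of the cached page
PageLocation = Tuple[int, str, int]


@dataclass
class BufferHeader:
    """Illustrative stand-in for a buffer header (hypothetical names)."""
    location: Optional[PageLocation] = None  # None means the buffer holds no page
    dirty: bool = False    # modified in the cache, not yet written back to disk
    usage_count: int = 0   # raised on access, lowered while searching for a victim
    pin_count: int = 0     # number of processes currently using the page


# Hypothetical sample values: block 0 of the main fork of file 16384.
header = BufferHeader(location=(16384, "main", 0))
header.pin_count += 1    # a process starts using the page
header.usage_count += 1  # each pin also raises the usage count
header.dirty = True      # the process modified the page in the cache
```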

To get access to a relation’s data page, a process requests it from the buffer manager and receives the ID of the buffer that contains this page. Then it reads the cached data and modifies it right in the cache if needed. While the page is in use, its buffer is pinned. Pins forbid eviction of the cached page and can be applied together with other locks. Each pin increments the usage count as well.

As long as the page is cached, its usage does not incur any file operations.

Dirty page: a page that has been modified but not yet written to disk.

Cache Hits

When the buffer manager has to read a page, it first checks the buffer cache. All buffer IDs are stored in a hash table, which is used to speed up their search.

The buffer cache uses an extendible hash table that resolves collisions by chaining.

A hash key consists of the ID of the relation file, the type of the fork, and the ID of the page within this fork’s file. Thus, knowing the page, PostgreSQL can quickly find the buffer containing this page or make sure that the page is not currently cached.
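A minimal sketch of such a lookup, assuming a Python dict as a stand-in for the hash table (a dict does not resolve collisions by chaining, so only the key structure and the hit/miss outcome are modeled). `PageKey`, `buffer_table`, `lookup`, and the sample IDs are hypothetical names and values chosen for illustration.

```python
from typing import Dict, NamedTuple, Optional


class PageKey(NamedTuple):
    """Hash key: relation file ID, fork type, and page number within the fork."""
    file_id: int
    fork: str
    block: int


# Maps a page key to the ID of the buffer that currently holds that page.
buffer_table: Dict[PageKey, int] = {}


def lookup(key: PageKey) -> Optional[int]:
    """Return the buffer ID for a cached page, or None if the page is not cached."""
    return buffer_table.get(key)


# Hypothetical entry: block 7 of the main fork of file 16384 lives in buffer 42.
buffer_table[PageKey(16384, "main", 7)] = 42

hit = lookup(PageKey(16384, "main", 7))   # same file, fork, and block: found
miss = lookup(PageKey(16384, "fsm", 7))   # different fork: not cached
```

Because the fork is part of the key, the same block number in two forks of one relation maps to two distinct entries.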

If the hash table contains the required buffer ID, the buffer manager pins this buffer and returns its ID to the process. Then this process can start using the cached page without incurring any I/O traffic. To pin a buffer, PostgreSQL has to increment the pin counter in its header; a buffer can be pinned by several processes at a time. While its pin counter is greater than zero, the buffer is assumed to be in use, and no radical changes in its contents are allowed.
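The cache-hit path described above can be sketched as follows. This is a simplified single-process model under stated assumptions: the names `Buffer`, `pin_if_cached`, `unpin`, and `in_use` are hypothetical, and the locking that a real shared-memory implementation needs is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

PageKey = Tuple[int, str, int]  # (file ID, fork, block number)


@dataclass
class Buffer:
    pin_count: int = 0  # how many processes are using the buffer right now


buffers: List[Buffer] = [Buffer() for _ in range(4)]
# Hypothetical state: one page is already cached in buffer 2.
buffer_table: Dict[PageKey, int] = {(16384, "main", 0): 2}


def pin_if_cached(key: PageKey) -> Optional[int]:
    """On a cache hit, pin the buffer and return its ID; on a miss, return None."""
    buffer_id = buffer_table.get(key)
    if buffer_id is None:
        return None  # cache miss: reading the page from disk is not modeled here
    buffers[buffer_id].pin_count += 1
    return buffer_id


def unpin(buffer_id: int) -> None:
    """Release one pin previously taken on the buffer."""
    if buffers[buffer_id].pin_count == 0:
        raise ValueError("buffer is not pinned")
    buffers[buffer_id].pin_count -= 1


def in_use(buffer_id: int) -> bool:
    """A buffer counts as in use while its pin counter is greater than zero."""
    return buffers[buffer_id].pin_count > 0


key = (16384, "main", 0)
first = pin_if_cached(key)    # first process pins the buffer
second = pin_if_cached(key)   # second process pins the same buffer
unpin(first)                  # one process is done; the other pin remains
still_in_use = in_use(2)
```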

The pg_statio_all_tables view contains the complete statistics on buffer cache usage by tables:

=> SELECT heap_blks_read, heap_blks_hit FROM pg_statio_all_tables;

PostgreSQL provides similar views for indexes and sequences. They can also display statistics on I/O operations, but only if track_io_timing is enabled (it is off by default).
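One common way to summarize the two counters returned by the query above is a hit ratio: the share of page requests that were served from the buffer cache. The helper below is a hypothetical illustration, not something PostgreSQL provides, and the sample numbers are made up.

```python
def hit_ratio(blks_read: int, blks_hit: int) -> float:
    """Fraction of page requests served from the buffer cache.

    blks_read: pages that had to be read into the cache (heap_blks_read)
    blks_hit:  pages that were found in the cache (heap_blks_hit)
    """
    total = blks_read + blks_hit
    if total == 0:
        return 0.0  # no requests recorded yet, so there is nothing to measure
    return blks_hit / total


# Made-up counters: 2 pages read in, 8 requests satisfied from the cache.
ratio = hit_ratio(blks_read=2, blks_hit=8)
```

A page counted in heap_blks_read may still have been served from the operating system's cache rather than from disk, so the ratio describes the PostgreSQL buffer cache only.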

The usage count is incremented each time the buffer is accessed (that is, pinned), and decremented when the buffer manager is searching for pages to evict. The search moves a clock hand around the buffer array, and the first unpinned buffer with a zero usage count that the hand reaches is cleared. As a result, the least recently used pages are evicted first, while those that have been accessed more often remain in the cache longer.

If all the buffers have a non-zero usage count, the clock hand has to complete more than one full circle before any of them finally reaches zero. To avoid running too many laps, PostgreSQL caps the usage count at 5.

Once the buffer to evict is found, the reference to the page that is still in this buffer must be removed from the hash table. But if this buffer is dirty, that is, it contains some modified data, the old page cannot simply be thrown away: the buffer manager has to write it to disk first.
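The eviction search described above can be sketched as a small simulation. It is a simplified model under stated assumptions: the class and method names are hypothetical, writing a dirty page to disk is represented by recording the buffer ID in a list, and the hash table update and all locking are left out.

```python
from dataclasses import dataclass
from typing import List

MAX_USAGE_COUNT = 5  # the cap on the usage count mentioned in the text


@dataclass
class Buffer:
    usage_count: int = 0
    pin_count: int = 0
    dirty: bool = False


class ClockSweep:
    """Toy model of the clock hand that searches for a buffer to evict."""

    def __init__(self, buffers: List[Buffer]) -> None:
        self.buffers = buffers
        self.hand = 0
        self.written: List[int] = []  # dirty buffers "written to disk" before reuse

    def access(self, buffer_id: int) -> None:
        """Record an access: raise the usage count, but never above the cap."""
        buf = self.buffers[buffer_id]
        buf.usage_count = min(buf.usage_count + 1, MAX_USAGE_COUNT)

    def find_victim(self) -> int:
        """Return the ID of the first unpinned buffer whose usage count is zero."""
        if all(buf.pin_count > 0 for buf in self.buffers):
            raise RuntimeError("no unpinned buffers available")
        while True:
            buffer_id = self.hand
            buf = self.buffers[buffer_id]
            self.hand = (self.hand + 1) % len(self.buffers)
            if buf.pin_count > 0:
                continue  # pinned buffers are skipped and left untouched
            if buf.usage_count > 0:
                buf.usage_count -= 1  # give the page one more lap
                continue
            if buf.dirty:
                self.written.append(buffer_id)  # stand-in for the disk write
                buf.dirty = False
            return buffer_id


buffers = [
    Buffer(usage_count=2),
    Buffer(usage_count=1, dirty=True),
    Buffer(usage_count=3, pin_count=1),
]
sweep = ClockSweep(buffers)
victim = sweep.find_victim()
```

In this run the hand needs a second lap: the first lap only lowers the counts of the two unpinned buffers, and the dirty buffer that reaches zero first is flushed before it is handed out.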