postgres declarative partitioning - ghdrako/doc_snipets GitHub Wiki
PostgreSQL does not support join between partition-leafs and non-partition tables before merging.
PostgreSQL supports three types of declarative partitioning:
- Range partitioning
- List partitioning
- Hash partitioning
Index
We can create global primary key constraint for partitioned table that is passed down to all partitions. The pk must include all partitioned columns.
Index created on partitioned table will be created automatically for all partitions. Such indexes are still physically separated indexes but cat't be removed for specyfic partition.
Unique index support global uniquess but must include partition key.
If you have a primary key for the table, the partition column should be a part of the primary key. If you choose the primary key not as a part of the partition column, create the primary key on the child table.
Parallel processing
By setting enable_partition_pruning=on
,enable_partitionwise_join=on
, and enable_partitionwise_aggregate=on
either at the session or system level, you can leverage the parallel processing capabilities of PostgreSQL.
Declarative partitioning
Create partition table
List partitioning
- Sanity check
DROP TABLE IF EXISTS part_tags cascade;
- create our parent table
CREATE TABLE part_tags (
pk INTEGER NOT NULL DEFAULT nextval('part_tags_pk_seq') ,
level INTEGER NOT NULL DEFAULT 0,
tag VARCHAR (255) NOT NULL,
primary key (pk,level)
)
PARTITION BY LIST (level);
Note is that the field used to partition the data must be part of the primary key.
- Define the child tables
CREATE TABLE part_tags_level_0 PARTITION OF part_tags FOR VALUES IN
(0);
CREATE TABLE part_tags_level_1 PARTITION OF part_tags FOR VALUES IN
(1);
CREATE TABLE part_tags_level_2 PARTITION OF part_tags FOR VALUES IN
(2);
CREATE TABLE part_tags_level_3 PARTITION OF part_tags FOR VALUES IN
(3);
CREATE INDEX part_tags_tag on part_tags using GIN (tag gin_trgm_ops);
\d part_tags;
\d part_tags_level_0;
select * from part_tags;
select * from part_tags_level_0;
select * from part_tags_level_1;
Range partitioning
DROP TABLE IF EXISTS part_tags cascade;
CREATE TABLE part_tags (
pk INTEGER NOT NULL DEFAULT nextval('part_tags_pk_seq'),
ins_date date not null default now()::date,
tag VARCHAR (255) NOT NULL,
level INTEGER NOT NULL DEFAULT 0,
primary key (pk,ins_date)
)
PARTITION BY RANGE (ins_date);
CREATE TABLE part_tags_date_01_2020 PARTITION OF part_tags FOR
VALUES FROM ('2020-01-01') TO ('2020-01-31');
CREATE TABLE part_tags_date_02_2020 PARTITION OF part_tags FOR
VALUES FROM ('2020-02-01') TO ('2020-02-28');
CREATE TABLE part_tags_date_03_2020 PARTITION OF part_tags FOR
VALUES FROM ('2020-03-01') TO ('2020-03-31');
CREATE TABLE part_tags_date_04_2020 PARTITION OF part_tags FOR
VALUES FROM ('2020-04-01') TO ('2020-04-30')
CREATE INDEX part_tags_tag on part_tags using GIN (tag gin_trgm_ops);
\d part_tags;
\d part_tags_date_01_2020;
select * from part_tags;
select * from part_tags_date_01_2020;
select * from part_tags_date_02_2020;
select * from part_tags_date_03_2020;
Example 2
- Create parent payments table:
CREATE TABLE partitioned_payments(
id serial not null,
created_at timestamp NOT NULL,
updated_at timestamp NOT NULL,
amount double precision NOT NULL,
total double precision NOT NULL,
description json DEFAULT NULL,
tax double precision NOT NULL,
customer_id integer NOT NULL,
shipping_address text DEFAULT NULL,
status varchar(255) DEFAULT NULL
) PARTITION BY RANGE (created_at);
Example 3
CREATE TABLE data (
payload integer
) PARTITION BY RANGE (payload);
CREATE TABLE negatives PARTITION
OF data FOR VALUES FROM (MINVALUE) TO (0);
CREATE TABLE positives PARTITION
OF data FOR VALUES FROM (0) TO (MAXVALUE);
INSERT INTO data VALUES (5);
SELECT * FROM positives;
--move between partition
UPDATE data
SET payload = -10
WHERE payload = 5
RETURNING *;
SELECT * FROM negatives;
DROP TABLE negatives
CREATE TABLE p_def PARTITION OF data DEFAULT;
- Create partitions. It makes sense to create a few more partitions in advance::
CREATE TABLE payments_2020_1_1_7 PARTITION OF partitioned_payments
FOR VALUES FROM ('2020-01-01') TO ('2020-01-07 23:59:59');
CREATE TABLE payments_2020_1_8_15 PARTITION OF partitioned_payments
FOR VALUES FROM ('2020-01-08') TO ('2020-01-15 23:59:59');
CREATE TABLE payments_2020_1_16_23 PARTITION OF partitioned_payments
FOR VALUES FROM ('2020-01-16') TO ('2020-01-23 23:59:59');
- Create indexes:
// for partition key columns
CREATE INDEX payments_2020_1_1_7_created_at_idx ON payments_2020_1_1_7(created_at);
CREATE INDEX payments_2020_1_8_15_created_at_idx ON payments_2020_1_8_15(created_at);
CREATE INDEX payments_2020_1_16_23_created_at_idx ON payments_2020_1_16_23(created_at);
// for id column
CREATE INDEX payments_2020_1_1_7_id_idx ON payments_2020_1_1_7(id);
CREATE INDEX payments_2020_1_8_15_id_idx ON payments_2020_1_8_15(id);
CREATE INDEX payments_2020_1_16_23_id_idx ON payments_2020_1_16_23(id);
- Schedule for partition creation. As our last partition will have rows till 23 of Jan, 2020 it means that on 23 of Jan we should create a new partitions for new rows having created_at greater then 23 of Jan, 2020
0 0 23 * * /project_dir/create_payments_partitions.sh
where create_payments_partitions.sh:
#!/bin/sh
year=$(date +%Y)
month=$(date +%m)
next_year=$(date -d "+1 months" +%Y)
next_month=$(date -d "+1 months" +%m)
month_last_day=$(date -d "`date +%Y%m01` +1 month -1 day" +%d)
sql=$(sed "s/|year|/$year/g" ./declarative_new_partitions_template.sql |
sed "s/|month|/$month/g" | sed "s/|next_month|/$next_month/g" |
sed "s/|next_year|/$next_year/g" |
sed "s/|month_last_day|/$month_last_day/g"
)
docker exec -it postgres-13.10 psql -U postgres partitioning_demo -c "$sql"
and declarative_new_partitions_template.sql :
CREATE TABLE payments_|year|_|month|_24_|month_last_day| PARTITION OF partitioned_payments
FOR VALUES FROM ('|year|-|month|-24') TO ('|year|-|month|-|month_last_day| 23:59:59');
CREATE TABLE payments_|next_year|_|next_month|_1_7 PARTITION OF partitioned_payments
FOR VALUES FROM ('|next_year|-|next_month|-01') TO ('|next_year|-|next_month|-07 23:59:59');
CREATE TABLE payments_|next_year|_|next_month|_8_15 PARTITION OF partitioned_payments
FOR VALUES FROM ('|next_year|-|next_month|-08') TO ('|next_year|-|next_month|-15 23:59:59');
CREATE TABLE payments_|next_year|_|next_month|_16_23 PARTITION OF partitioned_payments
FOR VALUES FROM ('|next_year|-|next_month|-16') TO ('|next_year|-|next_month|-23 23:59:59');
And you can test
INSERT INTO partitioned_payments (created_at, updated_at, amount, total, description, tax, customer_id, shipping_address, status)
SELECT
date('2020-01-' || random_between(1,7)),
NOW() - '1 year'::INTERVAL * ROUND(RANDOM() * 100),
ROUND(random() * 1000)::double precision ,
ROUND(random() * 1000)::double precision ,
'{}',
ROUND(random() * 1000)::double precision,
ROUND(RANDOM() * 1000)::INTEGER,
SUBSTRING(md5(random()::text), 0, 25),
SUBSTRING(md5(random()::text), 0, 100)
FROM generate_series(1, 20);
and check the explain for the select:
EXPLAIN (analyze,timing off) SELECT * FROM partitioned_payments WHERE created_at BETWEEN '2020-01-01' AND '2020-01-07' ORDER BY id asc LIMIT 10 OFFSET 100;
In the execution plan we can see:
Scan on partitioned_payments...
...
Index Scan using payments_2020_1_1_7_created_at_idx on payments_2020_1_1_7 partitioned_payments (cost=0.00..8.27 rows=1 width=628)
So only partitioned_payments and payments_2020_1_1_7 tables are being accessed to find the rows.
Example
CREATE TABLE log (
log_id integer NOT NULL GENERATED ALWAYS AS IDENTITY,
log_date date not null,
log_text char(10),
constraint pk_log primary key(log_id,log_date)
) PARTITION BY RANGE (log_date);
------- Creation of Parent Table
CREATE TABLE log2023_m1 PARTITION OF log
FOR VALUES FROM ('2023-01-01') TO ('2023-01-31'); --- Creation of Child table
CREATE TABLE log2023_m2 PARTITION OF log
FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
CREATE TABLE log_default PARTITION OF log DEFAULT ; --Default
INSERT INTO log(log_date,log_text) values('2023-01-01','log010123');
INSERT INTO log(log_date,log_text) values('2023-04-01','log010123');
ALTER TABLE log DETACH PARTITION log2023_m1; ---- Detach Partition
ALTER TABLE log ATTACH PARTITION log2023_m1 -----Attach Partition
FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
SELECT tableoid, tableoid::regclass, * FROM log;
tableoid | tableoid | log_id | log_date | log_text
----------+-------------+--------+------------+------------
16439 | log2023_m1 | 1 | 2023-01-01 | log010123
16449 | log_default | 2 | 2023-04-01 | log010123
(2 rows)
UPDATE log
set log_date='2023-08-01';
SELECT tableoid, tableoid::regclass, * FROM log;
UPDATE log set log_date='2023-07-01';
SELECT tableoid, tableoid::regclass, * FROM log;
Hash partitioning
- Create parent table:
CREATE TABLE partitioned_payments_hash(
id serial not null,
created_at timestamp NOT NULL,
updated_at timestamp NOT NULL,
amount double precision NOT NULL,
total double precision NOT NULL,
description json DEFAULT NULL,
tax double precision NOT NULL,
customer_id integer NOT NULL,
shipping_address text DEFAULT NULL,
status varchar(255) DEFAULT NULL
) PARTITION BY HASH (customer_id);
- Create partitions: The goal is to have 12 partitions, so we use the hash value and do a modulo 12. The remainder will identify the partition
CREATE TABLE payments_1 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 0);
CREATE TABLE payments_2 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 1);
CREATE TABLE payments_3 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 2);
CREATE TABLE payments_4 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 3);
CREATE TABLE payments_5 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 4);
CREATE TABLE payments_6 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 5);
CREATE TABLE payments_7 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 6);
CREATE TABLE payments_8 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 7);
CREATE TABLE payments_9 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 8);
CREATE TABLE payments_10 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 9);
CREATE TABLE payments_11 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 10);
CREATE TABLE payments_12 PARTITION OF partitioned_payments_hash
FOR VALUES WITH (modulus 12, remainder 11);
- Create indexes for partition key column in each partition:
CREATE INDEX payments_1_idx ON payments_1 (customer_id);
CREATE INDEX payments_2_idx ON payments_2 (customer_id);
CREATE INDEX payments_3_idx ON payments_3 (customer_id);
CREATE INDEX payments_4_idx ON payments_4 (customer_id);
CREATE INDEX payments_5_idx ON payments_5 (customer_id);
CREATE INDEX payments_6_idx ON payments_6 (customer_id);
CREATE INDEX payments_7_idx ON payments_7 (customer_id);
CREATE INDEX payments_8_idx ON payments_8 (customer_id);
CREATE INDEX payments_9_idx ON payments_9 (customer_id);
CREATE INDEX payments_10_idx ON payments_10 (customer_id);
CREATE INDEX payments_11_idx ON payments_11 (customer_id);
CREATE INDEX payments_12_idx ON payments_12 (customer_id);
Now you can insert rows:
INSERT INTO partitioned_payments_hash (created_at, updated_at, amount, total, description, tax, customer_id, shipping_address, status)
SELECT
date('2023-02-' || random_between(1,28)),
NOW() - '1 year'::INTERVAL * ROUND(RANDOM() * 100),
ROUND(random() * 1000)::double precision ,
ROUND(random() * 1000)::double precision ,
'{}',
ROUND(random() * 1000)::double precision,
ROUND(RANDOM() * 100000)::INTEGER,
SUBSTRING(md5(random()::text), 0, 25),
SUBSTRING(md5(random()::text), 0, 255)
FROM generate_series(1, 10000);
test the query performance:
EXPLAIN SELECT * FROM partitioned_payments_hash WHERE customer_id = 47209;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Bitmap Heap Scan on payments_5 partitioned_payments_hash (cost=4.29..10.12 rows=2 width=109)
Recheck Cond: (customer_id = 47209)
-> Bitmap Index Scan on payments_5_idx (cost=0.00..4.29 rows=2 width=0)
Index Cond: (customer_id = 47209)
(4 rows)
List with Range Subpartition
first, you create a table that is partitioned by a list; next, you create child tables that serve as partitions of the parent table; and finally, you create range subpartitions within those list-partitioned.
DROP TABLE if exists global;
CREATE TABLE global
(
countryid integer,
Region varchar,
country varchar,
date_cr date,
rank integer)
PARTITION BY list (country); -- Parent Table
--- Child Table with List values
CREATE TABLE country_list
PARTITION OF global
FOR VALUES IN ('China','Europe','USA')
PARTITION BY RANGE(date_cr);
----Child table based on list with range
CREATE TABLE country_jan_2021
PARTITION OF country_list
FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2021-01-31
23:59:59');
INSERT INTO global values(1,'East','USA','2021-01-01
00:00:00',100);
INSERT INTO global values(2,'East','China','2021-01-01
00:00:00',90);
Range with Hash Subpartition
Drop table if exists global1 cascade;
CREATE TABLE global1
(
countryid integer,
Region varchar,
country varchar,
date_cr date,
amount integer
)
PARTITION BY RANGE (date_cr); -- Parent table partitioned
on range
\echo Child table based on range and partitioned by hash
CREATE TABLE country_dates_2020
PARTITION OF global1
FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2020-12-31
23:59:59')
PARTITION BY HASH (amount);
CREATE TABLE hash_p0_2020 PARTITION OF country_dates_2020 --
Child table of hash
FOR VALUES WITH (modulus 3, remainder 1);
INSERT INTO global1 values(1113,'East','China','2020-01-01
00:00:00',3000);
INSERT INTO global1 values(11121,'East','japan','2020-01-01
00:00:00',30000);
INSERT INTO global1 values(1115,'East','USA','2020-02-01
00:00:00',3000);
SET enable_partition_pruning = on;
explain
SELECT * FROM global1
where date_cr='2020-01-01' AND country='China' and amount > 300;
SELECT relname as partition_table,
pg_get_expr(relpartbound, oid) as partition_range
FROM pg_class
WHERE relispartition
and relkind = 'r';
SELECT tableoid::regclass, * FROM global1;
SELECT * FROM pg_partition_tree('global1');
Range with List and Hash Partition
Drop table if exists global1 cascade;
CREATE TABLE global1
(
countryid integer,
Region varchar,
country varchar,
date_cr date,
amount integer
)
PARTITION BY range (date_cr);
CREATE TABLE country_dates_2020
PARTITION OF global1
FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2020-12-31
23:59:59');
CREATE TABLE hash_p0_2020 PARTITION OF country_dates_2020
FOR VALUES WITH (modulus 3, remainder 1);
CREATE TABLE country_list_2023
PARTITION OF global1
FOR VALUES FROM ('2019-01-01 00:00:00') TO ('2019-12-31 23:59:59')
PARTITION BY LIST(country);
CREATE INDEX idx_id_country1 on global1(countryid);
CREATE INDEX idx_country1 on global1(country);
INSERT INTO global1 values (1113,'East','China','2020-01-01
00:00:00',3000);
INSERT INTO global1 values(1112,'East','usa','2021-01-01
00:00:00',300);
INSERT INTO global1 values(11121,'East','japan','2020-01-01
00:00:00',30000);
SET enable_partition_pruning = on;
SELECT * FROM global1;
List Partition with Range and Hash Subpartition
DROP TABLE IF EXISTS global1 CASCADE;
-- Create the main table with RANGE partitioning on date_cr
CREATE TABLE global1
(
countryid integer,
region varchar,
country varchar,
date_cr date,
amount integer
)
PARTITION BY RANGE (date_cr);
CREATE TABLE country_dates_2020
PARTITION OF global1
FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
CREATE TABLE country_dates_2020_p0
PARTITION OF country_dates_2020
FOR VALUES WITH (MODULUS 3, REMAINDER 0)
PARTITION BY LIST (country);
CREATE TABLE country_dates_2020_p0_list1
PARTITION OF country_dates_2020_p0
FOR VALUES IN ('USA', 'JAPAN', 'CHINA');
CREATE TABLE country_dates_2020_p0_list2
PARTITION OF country_dates_2020_p0
FOR VALUES IN ('UK', 'FRANCE', 'GERMANY');
CREATE TABLE country_dates_2020_p0_list3
PARTITION OF country_dates_2020_p0
FOR VALUES IN ('INDIA', 'BRAZIL', 'AUSTRALIA');
INSERT INTO global1 (countryid, region, country, date_cr,
amount) VALUES
(0, 'North America', 'USA', '2020-05-15', 1000),
(3, 'Asia', 'JAPAN', '2020-06-10', 2000),
(6, 'Europe', 'UK', '2020-08-25', 1800);
SELECT * FROM global1;
Maintenance partition
List partition
use the pg_partition_tree
function to get the list of partition names in the table:
# Verify the partitioned tab
SELECT p.relname AS partition_name,
pg_get_expr(p.relpartbound, p.oid) AS partition_bound,
pg_get_partkeydef(p.oid) AS partition_key
FROM pg_partition_tree('sales') AS t
JOIN pg_class AS p ON p.oid = t.relid;
Attach a new partition
create the new regular table with the desired schema and partition constraints, and then use the ALTER TABLE command with the ATTACH PARTITION option to attach it to the partitioned table.
CREATE TABLE part_tags_date_05_2020 PARTITION OF part_tags FOR VALUES FROM ('2020-05-01') TO ('2020-05-30');
Detach an existing partition
ALTER TABLE part_tags DETACH PARTITION part_tags_date_05_2020 ;
ALTER TABLE t_data
DETACH PARTITION payments_3;
ALTER TABLE t_data
ATTACH PARTITION payments_3
FOR VALUES WITH (MODULUS 12, REMAINDER 3);
Attach an already existing table to the parent table
ALTER TABLE part_tags ATTACH PARTITION part_tags_already_exists FOR
VALUES FROM ('1970-01-01') TO ('2019-12-31');
Constraint exclusion
Set parameter constraint exclusion
to on
or partition
. If WHERE clause contain partition key optimizer using check constraint to remove partition from scaning
select * from pg_setting where name = 'constraint_exclusion';
Partition pruning
It's use only in declarative partitioning. Set parameter partition pruning
to enable
(default). If WHERE clause contain partition key optimizer using table partition bounds to remove partition from scaning. This bounds exist only in declarative partition.
select * from pg_setting where name = 'enable_partition_pruning';
enable_partition_pruning = off/on
Partition pruning can be performed not only during the planning of a given query, but also during its execution. This is useful as it can allow more partitions to be pruned when clauses contain expressions whose values are not known at query planning time; for example, parameters defined in a PREPARE statement, using a value obtained from a subquery or using a parameterized value on the inner side of a nested loop join.
Default partition
use a default partition where all the values that are not reflected in the mapping of the child tables will be inserted.
CREATE TABLE part_tags_default PARTITION OF part_tags default;
Indexing
Postgres 11+ If define the indexes for the parent table. These indexes will automatically be propagated to child tablesa.
Moving rów between partitios
PostgreSQL 11+ is able to move a row from one partition to the other. However, keep in mind that moving data between partitions may not be the best idea in general.
When you create an index on the parent table, indexes are created on partitioned tables
postgres=# CREATE INDEX idx_emp_sales on employee(sale_amount);
CREATE INDEX
postgres=# \di
List of relations
Schema | Name | Type
| Owner | Table
--------+------------------------------+-------------------+---
-------+--------------
public | emp_feb_2023_sale_amount_idx | index |
postgres | emp_feb_2023
public | emp_jan_2023_sale_amount_idx | index |
postgres | emp_jan_2023
public | emp_mar_2023_sale_amount_idx | index |
postgres | emp_mar_2023
(9 rows)
Default partition
CREATE TABLE p_def PARTITION OF data DEFAULT;
All the data that doesn’t fit anywhere will end up in this default partition, which ensures that creating the right partition can never be forgotten.
Table partition information
The function pg_partition_tree
displays the parent and child partition levels in the database.
postgres=# select * from pg_partition_tree('log');
relid | parentrelid | isleaf | level
-------------+-------------+--------+-------
log | | f | 0
log_q1_2024 | log | t | 1
log_q2_2024 | log | t | 1
The partition range details are obtained from the pg_get_expr
function.
SELECT relname as partition_table,
pg_get_expr(relpartbound, oid) as partition_range
FROM pg_class
WHERE relispartition
AND relkind = 'r';
partition_table | partition_range
-----------------+--------------------------------------------------
log_q2_2024 | FOR VALUES FROM ('2024-04-01') TO ('2024-06-30')
log_q1_2024 | FOR VALUES FROM ('2024-01-01') TO ('2024-03-31')
The partition table object id(oid) is converted to tablename by the
::regclass typecase operator.
SELECT tableoid::regclass, * FROM log;
tableoid | log_id | logtext | log_date
-------------+--------+--------------------------------+------------
log_q1_2024 | 200 | test | 2024-03-20
log_q2_2024 | 100 | test | 2024-06-21