postgres Materialized Path hierarchical tree - ghdrako/doc_snipets GitHub Wiki
- https://www.postgresql.org/docs/current/ltree.html
- https://leonardqmarcq.com/posts/modeling-hierarchical-tree-data
- https://leonardqmarcq.com/posts/dos-and-donts-of-modeling-hierarchical-trees-in-postgres
- https://minervadb.xyz/implementing-the-materialized-path-model-in-postgresql-a-step-by-step-guide/
- https://gist.github.com/johnclittle/31ee0b36c24fca568a5d
The Materialized Path model is a strategy for representing hierarchical data within a relational database system like PostgreSQL. It involves storing the entire path to a node within a tree as a single column value in each row. This method is particularly effective for representing data with a hierarchical structure, such as categories, organizational structures, or file systems, within a flat table. The key advantage of the Materialized Path model is its simplicity and efficiency in querying hierarchical data with a single query, especially for retrieving ancestors or descendants of a node.
ltree
is an implementation of a materialized path which describes the path from a root to a node on a tree. This technique reduces the complexity of tree traversal queries to simple wildcard sort of expressions. Queries involving ltree operations are called lquery.
The ltree also supports B-tree indexing - as well as GiST.
The main limitation of this approach is that the ltree assumes that there is only one path from the root to any given node. This is approach to bypass the restriction: https://busta.win/posts/dags-with-materialized-paths-using-postgres-ltree
A label path is a sequence of zero or more labels separated by dots, for example L1.L2.L3
, representing a path from the root of a hierarchical tree to a particular node. The length of a label path cannot exceed 65535 labels.
Example: Top.Countries.Europe.Russia
CREATE EXTENSION ltree;
CREATE TABLE tree (id bigint, path ltree, ...);
INSERT INTO tree (path) VALUES ('Food');
INSERT INTO tree (path) VALUES ('Food.Fruit');
INSERT INTO tree (path) VALUES ('Food.Fruit.Cherry');
INSERT INTO tree (path) VALUES ('Food.Fruit.Banana');
INSERT INTO tree (path) VALUES ('Food.Meat');
INSERT INTO tree (path) VALUES ('Food.Meat.Beaf');
INSERT INTO tree (path) VALUES ('Food.Meat.Pork');
-- All children (Food.Fruit.Banana, Food.Fruit.Cherry):
SELECT * FROM tree WHERE path ~ 'Food.Fruit.*{1,}';
-- All parents (Food, Food.Fruit):
SELECT * FROM tree WHERE path @> subpath('Food.Fruit.Banana', 0, -1);
Examples
Texas.*
- matches any path that starts with the label Texas*.Texas.*
- will match any path that contains Texas
CREATE DATABASE world_tree WITH ENCODING 'UTF8';
CREATE TYPE admin_level AS ENUM ('country', 'state', 'city');
CREATE TABLE world (
name TEXT PRIMARY KEY,
label TEXT CHECK (label ~* '^[A-Za-z0-9_]{1,255}$'),
population BIGINT,
location_type admin_level,
path ltree
);
CREATE INDEX path_idx ON world USING btree(path);
INSERT INTO world(name, label, population, location_type, path) VALUES
('United States', 'United_States', 329500000, 'country', 'United_States'),
('Australia', 'Australia', 25690000, 'country', 'Australia'),
('Spain', 'Spain', 47350000, 'country', 'Spain'),
('Texas', 'Texas', 29000000, 'state', 'United_States.Texas'),
('California', 'California', 39510000, 'state', 'United_States.California'),
('New South Wales', 'New_South_Wales', 8166000, 'state', 'Australia.New_South_Wales'),
('Victoria', 'Victoria', 6681000, 'state', 'Australia.Victoria'),
('Comunidad de Madrid', 'Comunidad_de_Madrid', 6642000, 'state', 'Spain.Comunidad_de_Madrid'),
('Dallas', 'Dallas', 1331000, 'city', 'United_States.Texas.Dallas'),
('Austin', 'Austin', 950807, 'city', 'United_States.Texas.Austin'),
('Houston', 'Houston', 2310000, 'city', 'United_States.Texas.Houston'),
('Los Angeles', 'Los_Angeles', 3967000, 'city', 'United_States.California.Los_Angeles'),
('San Francisco', 'San_Francisco', 874961, 'city', 'United_States.California.San_Francisco'),
('San Diego', 'San_Diego', 1410000, 'city', 'United_States.California.San_Diego'),
('Sydney', 'Sydney', 5312000, 'city', 'Australia.New_South_Wales.Sydney'),
('Newcastle', 'Newcastle', 322278, 'city', 'Australia.New_South_Wales.Newcastle'),
('Melbourne', 'Melbourne', 5078000, 'city', 'Australia.Victoria.Melbourne'),
('Geelong', 'Geelong', 253269, 'city', 'Australia.Victoria.Geelong'),
('Madrid', 'Madrid', 3223000, 'city', 'Spain.Comunidad_de_Madrid.Madrid'),
('Móstoles', 'Mostoles', 207095, 'city', 'Spain.Comunidad_de_Madrid.Mostoles');
Get ancestors
SELECT name, path
FROM world
WHERE path @> (
SELECT path
FROM world
WHERE label = 'Madrid' -- node whose ancestors to find
);
Get descendants
SELECT name, path
FROM world
WHERE path <@ (
SELECT path
FROM locations_with_ltree
WHERE label = 'Victoria'
);