postgres Materialized Path hierarchical tree - ghdrako/doc_snipets GitHub Wiki

The Materialized Path model is a strategy for representing hierarchical data within a relational database system like PostgreSQL. It involves storing the entire path to a node within a tree as a single column value in each row. This method is particularly effective for representing data with a hierarchical structure, such as categories, organizational structures, or file systems, within a flat table. The key advantage of the Materialized Path model is its simplicity and efficiency in querying hierarchical data with a single query, especially for retrieving ancestors or descendants of a node.

ltree is an implementation of a materialized path which describes the path from a root to a node on a tree. This technique reduces the complexity of tree traversal queries to simple wildcard sort of expressions. Queries involving ltree operations are called lquery.

The ltree also supports B-tree indexing - as well as GiST.

The main limitation of this approach is that the ltree assumes that there is only one path from the root to any given node. This is approach to bypass the restriction: https://busta.win/posts/dags-with-materialized-paths-using-postgres-ltree

A label path is a sequence of zero or more labels separated by dots, for example L1.L2.L3, representing a path from the root of a hierarchical tree to a particular node. The length of a label path cannot exceed 65535 labels.

Example: Top.Countries.Europe.Russia

CREATE EXTENSION ltree;
CREATE TABLE tree (id bigint, path ltree, ...);
INSERT INTO tree (path) VALUES ('Food');
INSERT INTO tree (path) VALUES ('Food.Fruit');
INSERT INTO tree (path) VALUES ('Food.Fruit.Cherry');
INSERT INTO tree (path) VALUES ('Food.Fruit.Banana');
INSERT INTO tree (path) VALUES ('Food.Meat');
INSERT INTO tree (path) VALUES ('Food.Meat.Beaf');
INSERT INTO tree (path) VALUES ('Food.Meat.Pork');

-- All children (Food.Fruit.Banana, Food.Fruit.Cherry):
SELECT * FROM tree WHERE path ~ 'Food.Fruit.*{1,}';

-- All parents (Food, Food.Fruit):
SELECT * FROM tree WHERE path @> subpath('Food.Fruit.Banana', 0, -1);

Examples

  • Texas.* - matches any path that starts with the label Texas
  • *.Texas.* - will match any path that contains Texas
CREATE DATABASE world_tree WITH ENCODING 'UTF8';

CREATE TYPE admin_level AS ENUM ('country', 'state', 'city');

CREATE TABLE world (
	name 	      TEXT PRIMARY KEY,
	label 		  TEXT CHECK (label ~* '^[A-Za-z0-9_]{1,255}$'),
    population    BIGINT,
    location_type admin_level,
    path          ltree
);

CREATE INDEX path_idx ON world USING btree(path);

INSERT INTO world(name, label, population, location_type, path) VALUES
    ('United States', 'United_States', 329500000, 'country', 'United_States'),
    ('Australia', 'Australia', 25690000, 'country', 'Australia'),
    ('Spain', 'Spain', 47350000, 'country', 'Spain'),
    ('Texas', 'Texas', 29000000, 'state', 'United_States.Texas'),
    ('California', 'California', 39510000, 'state', 'United_States.California'),
    ('New South Wales', 'New_South_Wales', 8166000, 'state', 'Australia.New_South_Wales'),
    ('Victoria', 'Victoria', 6681000, 'state', 'Australia.Victoria'),
    ('Comunidad de Madrid', 'Comunidad_de_Madrid', 6642000, 'state', 'Spain.Comunidad_de_Madrid'),
    ('Dallas', 'Dallas', 1331000, 'city', 'United_States.Texas.Dallas'),
    ('Austin', 'Austin', 950807, 'city', 'United_States.Texas.Austin'),
    ('Houston', 'Houston', 2310000, 'city', 'United_States.Texas.Houston'),
    ('Los Angeles', 'Los_Angeles', 3967000, 'city', 'United_States.California.Los_Angeles'),
    ('San Francisco', 'San_Francisco', 874961, 'city', 'United_States.California.San_Francisco'),
    ('San Diego', 'San_Diego', 1410000, 'city', 'United_States.California.San_Diego'),
    ('Sydney', 'Sydney', 5312000, 'city', 'Australia.New_South_Wales.Sydney'),
    ('Newcastle', 'Newcastle', 322278, 'city', 'Australia.New_South_Wales.Newcastle'),
    ('Melbourne', 'Melbourne', 5078000, 'city', 'Australia.Victoria.Melbourne'),
    ('Geelong', 'Geelong', 253269, 'city', 'Australia.Victoria.Geelong'),
    ('Madrid', 'Madrid', 3223000, 'city', 'Spain.Comunidad_de_Madrid.Madrid'),
    ('Móstoles', 'Mostoles', 207095, 'city', 'Spain.Comunidad_de_Madrid.Mostoles');

Get ancestors

SELECT name, path
FROM world
WHERE path @> (
    SELECT path
    FROM world
    WHERE label = 'Madrid' -- node whose ancestors to find
);

Get descendants

SELECT name, path
FROM world
WHERE path <@ (
    SELECT path
    FROM locations_with_ltree
    WHERE label = 'Victoria'
);