postgres upsert insert on conflict - ghdrako/doc_snipets GitHub Wiki

merge(Postgres 15+): https://github.com/ghdrako/doc_snipets/wiki/postgres-sql-merge

INSERT INTO tweet_statistics (
  tweet_id, fanout, likes_count
) VALUES (
  1475870220422107137, FLOOR(RANDOM() * 10), 1
) ON CONFLICT (tweet_id, fanout) DO UPDATE SET likes_count =
tweet_statistics.likes_count + excluded.likes_count;

INSERT .. ON CONFLICT (postgres 9.5+)

In PostgreSQL, the UPSERT statement does not exist as in other DBMSes. An UPSERT - UPdate inSET statement is used when we want to insert a new record on top of the existing record or update an existing record. To do this in PostgreSQL, we can use the ON CONFLICT keyword:

INSERT INTO table_name(column_list) VALUES(value_list)
ON CONFLICT target action;

Here, ON CONFLICT means that the target action is executed when the record already exists (mean- ing when a record with the same primary key exists). The target action could be this:

DO NOTHING

Alternatively, it could be the following:

DO UPDATE SET { column_name = { expression | DEFAULT } |
( column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT } [, ...]
) |
( column_name [, ...] ) = ( sub-SELECT )
} [, ...]
[ WHERE condition ]
insert into j_posts_tags (post_pk ,tag_pk) values (2,1) ON CONFLICT DO NOTHING;
insert into j_posts_tags (post_pk ,tag_pk) values (2,1) ON CONFLICT (tag_pk,post_pk) DO UPDATE set tag_pk=excluded.tag_pk+1;

The fields inside the ON CONFLICT condition must have a** unique or exclusion constraint**. The previous statement simply replaces the following statement:

INSERT INTO j_posts_tags (post_pk ,tag_pk) values (2,1)

It gets replaced with this statement:

UPDATE set tag_pk=tag_pk+1 where tag_pk=1 and post_pk=2

In an upsert statement in PostgreSQL, EXCLUDED is a special table that is used to reference the values that were proposed for insertion in the INSERT statement.

The EXCLUDED table has the same columns as the table being inserted into, and its values are the values that would have been inserted if the INSERT statement had not encountered a conflict.

The EXCLUDED table is not a physical table that is stored on disk but rather a virtual table that is created and used during the execution of an INSERT ... ON CONFLICT statement in PostgreSQL and exists only in memory during the execution of the INSERT ... ON CONFLICT statement.

PostgreSQL lets you either add or modify a record within a table depending on whether the record already exists. This is commonly known as an "upsert" operation

INSERT INTO my_table (column1, column2)
VALUES
    (value1, value2),
    (value3, value4),
    (value5, value6),
    (value7, value8)
ON CONFLICT <target> <action>;

Condition:

ON CONFLICT ON CONSTRAINT countries_pkey DO NOTHING;
ON CONFLICT (country) DO NOTHING;

The specified can be one of the following:

  • DO NOTHING: Tells PostgreSQL to leave the conflicting record as-is. In essence, this action makes no changes, but suppresses the error that would normally occur if you tried to insert a row that violates a condition.
  • DO UPDATE: This tells PostgreSQL that you want to update the row that is already in the table. The syntax for the update mirrors that of the normal UPDATE command.

When DO UPDATE is specified, a special virtual table called EXCLUDED is available for use within the UPDATE clause. The table contains the values suggested in the original INSERT command (that conflicted with the existing table values).

INSERT INTO director (id, name)
VALUES
    (2, 'robert'),
    (5, 'sheila'),
    (6, 'flora')
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name;

This time, we specify a modification to make to the existing row if it conflicts with one of our proposed insertions. We use the virtual EXCLUDED table, which contains the items we intended to insert, to update the name column to a new value on conflict.

INSERT INTO customers (name, email)
VALUES('Microsoft','[email protected]') 
ON CONFLICT (name) 
DO 
   UPDATE SET email = EXCLUDED.email || ';' || customers.email;
INSERT INTO tablename (id, username, password, level, email) 
     VALUES (1, 'John', 'qwerty', 5, '[email protected]')  
ON CONFLICT (id) DO 
     UPDATE SET   (username, password, level, email) = (EXCLUDED.username, EXCLUDED.password, EXCLUDED.level, EXCLUDED.email)

-- longer version
INSERT INTO tablename (id, username, password, level, email) 
                VALUES (1, 'John', 'qwerty', 5, '[email protected]') 
ON CONFLICT (id) DO UPDATE SET  username=EXCLUDED.username,
                                password=EXCLUDED.password, 
                                level=EXCLUDED.level,email=EXCLUDED.email
INSERT INTO users (id, level)
VALUES (1, 0)
ON CONFLICT (id) DO UPDATE
SET level = users.level + 1;

the downside of MERGE's handling of concurrency is that when you concurrently INSERT, so at the same time as you're executing the MERGE statement, there is another INSERT going on, then MERGE might not notice that. MERGE would go into its INSERT logic, and then it would get a unique violation.

In contrast with INSERT ON CONFLICT, the way it's designed with its speculative insertions, guarantees that you either get an INSERT or an UPDATE and that is true even if there are concurrent inserts. You might want to choose INSERT ON CONFLICT if you need the guarantee.

Monitoring

WITH new_employees AS (
    SELECT * FROM (VALUES
        ('George', 'Sales',    'Manager',   1000),
        ('Jane',   'R&D',      'Developer', 1200)
    ) AS t(
         name,      department, role,       salary
    )
)
INSERT INTO employees (name, department, role, salary)
SELECT name, department, role, salary
FROM new_employees
ON CONFLICT (name) DO UPDATE SET
    department = EXCLUDED.department,
    role = EXCLUDED.role,
    salary = EXCLUDED.salary
RETURNING *;

  name  │ department │   role    │ salary
────────┼────────────┼───────────┼────────
 George │ Sales      │ Manager   │   1000
 Jane   │ R&D        │ Developer │   1200
INSERT 0 2

WITH new_employees AS (
    SELECT * FROM (VALUES
        ('George', 'Sales',    'Manager',   1000),
        ('Jane',   'R&D',      'Developer', 1200)
    ) AS t(
         name,      department, role,       salary
    )
)
INSERT INTO employees (name, department, role, salary)
SELECT name, department, role, salary
FROM new_employees
ON CONFLICT (name) DO UPDATE SET
    department = EXCLUDED.department,
    role = EXCLUDED.role,
    salary = EXCLUDED.salary
RETURNING *, (xmax = 0) AS inserted;

  name  │ department │   role    │ salary │ inserted
────────┼────────────┼───────────┼────────┼──────────
 Jane   │ R&D        │ Developer │   1200 │ t
 George │ Sales      │ Manager   │   1000 │ f
INSERT 0 2

Te last statement includes the calculated field inserted that uses the special column xmax to determine how many rows were inserted. From the data returned by the command, you can spot that a new row was inserted for "Jane", but "George" was already in the table, so the row was updated.

The xmax column is a special system column:

The identity (transaction ID) of the deleting transaction, or zero for an undeleted row version.

In PostgreSQL, when a row is updated, the previous version is deleted, and xmax holds the ID of the deleting transaction. When the row is inserted, no previous row is deleted, so xmax is zero. This "trick" is cleverly using this behavior to distinguish between updated and inserted rows.

⚠️ **GitHub.com Fallback** ⚠️