SQL: ON CONFLICT clause for INSERT, UPDATE statements - tsafin/tarantool GitHub Wiki
In SQL we support non-standard clause called ON CONFLICT / OR. It is not a separate clause which can appear in INSERT/UPDATE statements or as a column option in CREATE TABLE statements. See examples below.
CREATE TABLE t1(a INT PRIMARY KEY, b NOT NULL ON CONFLICT IGNORE, c UNIQUE ON CONFLICT FAIL);
...
INSERT OR IGNORE INTO t VALUES (1, 1), (2, 2);
INSERT OR REPLACE INTO t VALUES (1, 1);
...
UPDATE OR REPLACE t SET ...
UPDATE OR IGNORE t SET ...
...
Here's brief description of possible actions
ABORT
When an applicable constraint violation occurs, the ABORT resolution algorithm aborts the current SQL statement with an SQLITE_CONSTRAINT error and backs out any changes made by the current SQL statement; but changes caused by prior SQL statements within the same transaction are preserved and the transaction remains active. This is the default behavior and the behavior specified by the SQL standard.
Example:
tarantool> CREATE TABLE t1(a int PRIMARY KEY, b INT UNIQUE);
tarantool> INSERT INTO t1 VALUES (1, 1);
tarantool> SELECT * FROM t1;
---
- - [1, 1]
tarantool> INSERT INTO t1 VALUES (2, 2), (3, 2);
---
- error: Duplicate key exists in unique index 'sqlite_autoindex_T1_2' in space 'T1'
...
tarantool> SELECT * FROM t1;
---
- - [1, 1]
...
As we can see, second INSERT caused UNIQUE constraint violation and all changes made by statement were discarded.
In Tarantool - because it is a default action, that ON CONFLICT option can be implemented by builtin-resources, no additional SQL bytecode is needed.
ROLLBACK
When an applicable constraint violation occurs, the ROLLBACK resolution algorithm aborts the current SQL statement with an SQLITE_CONSTRAINT error and rolls back the current transaction. If no transaction is active (other than the implied transaction that is created on every command) then the ROLLBACK resolution algorithm works the same as the ABORT algorithm.
Example:
CREATE TABLE t8(a UNIQUE PRIMARY KEY ON CONFLICT ROLLBACK);
INSERT INTO t8 VALUES(1);
-- UNIQUE constraint failed: T8.A
BEGIN;
COMMIT;
--
...
--
(It is unable to reproduce case from above in console, however there are some tap tests which ensure that, see sql-tap/index1.test.lua - test-cases 19.1, 19.4 and 19.5)
As we can see, not only in-statement, but explicitly opened transaction above was rolled back. And thats the difference between ROLLBACK and ABORT options. ABORT discards only in-statement changes, ROLLBACK discards everything inside single transaction.
In Tarantool - no opportunity to do described actions by Tarantool only, so additional bytecode will be generated for INSERT/UPDATE statement.
FAIL When an applicable constraint violation occurs, the FAIL resolution algorithm aborts the current SQL statement with an SQLITE_CONSTRAINT error. But the FAIL resolution does not back out prior changes of the SQL statement that failed nor does it end the transaction. Example below.
tarantool> CREATE TABLE t2(a INT PRIMARY KEY, b UNIQUE ON CONFLICT FAIL);
---
...
tarantool> INSERT INTO t2 VALUES (1, 1), (2, 1);
---
- error: 'UNIQUE constraint failed: T2.B'
...
tarantool> SELECT * FROM t2;
---
- - [1, 1]
Second row violated UNIQUE constraint, however first row was successfully inserted.
In Tarantool - no opportunity to do described actions in case of ON CONFLICT (require additional VDBE bytecode), however Tarantool can handle it by itself if user does INSERT OR FAIL statement (no additional bytecode generated).
IGNORE When an applicable constraint violation occurs, the IGNORE resolution algorithm skips the one row that contains the constraint violation and continues processing subsequent rows of the SQL statement as if nothing went wrong. Other rows before and after the row that contained the constraint violation are inserted or updated normally. No error is returned when the IGNORE conflict resolution algorithm is used.
Example:
tarantool> CREATE TABLE t3(a INT PRIMARY KEY, b INT UNIQUE ON CONFLICT IGNORE);
tarantool> INSERT INTO t3 VALUES (1, 1), (2, 1), (3, 3), (4, 4);
tarantool> SELECT * FROM t3;
---
- - [1, 1]
- [3, 3]
- [4, 4]
In Tarantool - no opportunity to do described actions in case of ON CONFLICT (require additional VDBE bytecode), however Tarantool can handle it by itself if user does INSERT OR IGNORE statement (no additional bytecode generated).
REPLACE When a UNIQUE or PRIMARY KEY constraint violation occurs, the REPLACE algorithm deletes pre-existing rows that are causing the constraint violation prior to inserting or updating the current row and the command continues executing normally. If a NOT NULL constraint violation occurs, the REPLACE conflict resolution replaces the NULL value with the default value for that column, or if the column has no default value, then the ABORT algorithm is used. If a CHECK constraint violation occurs, the REPLACE conflict resolution algorithm always works like ABORT.
When the REPLACE conflict resolution strategy deletes rows in order to satisfy a constraint, delete triggers fire if and only if recursive triggers are enabled.
The update hook is not invoked for rows that are deleted by the REPLACE conflict resolution strategy. Nor does REPLACE increment the change counter. The exceptional behaviors defined in this paragraph might change in a future release.
Optimizations
Before writing about cases which are available for checking optimization, it is necessary to mention what Tarantool what is able to do and what is not.
Tarantool:
- Supports unique indexes with multiple NULL values.
- Supports secondary indexes.
- Raises an appropriate error message if uniqueness was violated in INSERT/UPDATE statement.
- Only shows an error message without providing information in which concrete secondary index insertion had failed.
In original SQLite UNIQUE constraints checks were implemented by combinations of opcodes OP_NoConflict + OP_Halt for each constraint. Each OP_NoConflict opcode does a lookup in an appropriate unique index. However Tarantool also does lookups in unique indexes when does an UPDATE/INSERT into UNIQUE columns. This is not good for us, unnecessary lookups should be removed, UNIQUE constraint check should be optimized.
ON CONFLICT ABORT is a default action for all INSERT/UPDATE statements. If user didn't specify any ON CONCLICT options (or used only ON CONFLICT ABORT), all uniqueness is preserved by Tarantool only. In other cases it is unable to handle different errors actions by Tarantool only, so additional SQL bytecode (OP_NoConflict + OP_Halt for each constraint) for each constraint is needed.
However if user executes statements like INSERT OR IGNORE, UPDATE OR IGNORE, we are able to optimize bytecode in that situation. The reason is that error action after 'OR' keyword overrides every ON CONFLICT clause in every index. It means we have single determined error action for all the table indexes. In that case we are able to check uniqueness by Tarantool only and appropriately handle errors without NoConflict + Halt bytecode.