identifiers in Tarantool SQL - AnaNek/tarantool GitHub Wiki
For more details see:
- tests in folder
test/saql-tap
identifier_case.test.lua
identifier_characters.test.lua
- identifiers in Tarantool
- related issues: #2123, #2126, 2914
In short:
- identifiers are names of different staff (tables, columns...)
- non delimited (not surrounded with
"
) identifier is first converted to the UPPER-CASE and then acts as a delimited one - delimited identifier can contain any printable Unicode symbol
Identifiers in Tarantool SQL
The main principles of identifier mechanics were taken from ANSI ISO 2003 standard. For more details on implementation, see identifiers in Tarantool.
Identifiers in Tarantool SQL
List of identifiers:
- table/view names
- column names
- function names
- pragma parameters (like "UTF-8") (they also can work as strings)
- collation names
- trigger names
Not identifiers:
- strings, blobs (e.g. select 'cat', RAISE(ERROR, 'error text'))
- vriables (e.g.
select * from t where id = :my_variable
)
Rules that identifiers follow
Identifiers follow the common set of syntax rules:
- they are delimited with
"
, or not delimited at all - comparison with respect to case (see below)
Delimiters
Anything that delimited with "
(double quote) is considered to be identifier
Anything that not delimited could be an identifier or a keyword (full list is below).
Note 1: '
(quote) is used only for delimiting strings and blobs, so it is more or less obvious what this snippet does:
create table a("a" primary key)
insert into a values('a')
select * from a where "a" == 'a'
Note 2: creation of identifiers with "
in its name performed from SQL console like this "ab""c"
-> ab"c
(according to ANSI ISO 2003)
Comparition with respect to case
Comparition of identifiers also follow the ANSI ISO 2003 rules.
Each identifier does one of the following:
- if it is delimited, it is compared case sensitive as it is
- if it is not delimited, then it is cast to UPPER case on parse stage and then compared case sensitively
Important Notes
- only ASCII characters are converting to upper case by now
create table a (b primary key) -- ok create table A (b primary key) -- "A" already exists, not ok create table "A"(b primary key) -- "A" already exists, not ok create table "a"(b primary key) -- ok create table ю (b primary key) -- ok create table Ю (b primary key) -- ok, as Ю is not ASCII character
- system space names are stored in lower case and therefore should be delimited (see below)
selece id from _space -- 'no such table: _SPACE' select "id" from "_space" -- ok
- non-delimited identifiers in errors and result column names occur in UPPER case as a result of uppercasing on parsing stage (e.g.
selece id from _space -- 'no such table: _SPACE'
) - sql_create_function function can consume delimited and non-delimited names
box.internal.sql_create_function("func", finction () return 1 end) -- would create "FUNC" box.internal.sql_create_function("\"func\"", finction () return 2 end) -- would create "func" box.sql.execute("select func()") -- returns 1 box.sql.execute("select \"func\"()") -- returns 2
see identifier_case.test.lua
for more examples
Exceptions from the rule
Those examples work different from the general rule and can confuse one (known bugs):
-
expressions in result column names In such query
select a from b
you receive column "A" in result set BUT In such queryselect a+1 from b
you receive column "a+1" in result set, "a" is not cast to upper case because in general case it is a quite hard task and we (and all other DB) do not do this -
some parsing stage errors e.g:
create view a(a desc)
produces an errorsyntax error after column name "a"
, where"a"
is mentioned in the LOWER case because it is thrown in very early stage of parsing and not converted to the UPPER case -
binary
collation is "VIRTUAL" and exists only in SQL, and by now it accepts any case ("BinarY"
,binary
)
Reserved keywords
all, alter, analyze, and, any,
as, asc, asensitive, begin, between,
binary, by, call, case, char,
character, check, collate, column, commit,
condition, connect, constraint, create, cross,
current, current_date, current_time, current_timestamp, current_user,
cursor, date, decimal, declare, default,
delete, dense_rank, desc, describe, deterministic,
distinct, double, drop, each, else,
elseif, end, escape, except, exists,
explain, fetch, float, for, foreign,
from, function, get, grant, group,
having, if, immediate, in, index,
inner, inout, insensitive, insert, integer,
intersect, into, is, iterate, join,
leave, left, like, localtime, localtimestamp,
loop, match, natural, not, null,
of, on, or, order, out,
outer, over, partition, pragma, precision,
primary, procedure, range, rank, reads,
recursive, references, reindex, release, rename,
repeat, replace, resignal, return, revoke,
right, rollback, row, row_number, rows,
savepoint, select, sensitive, set, signal,
smallint, specific, sql, start, system,
table, then, to, transaction, trigger,
union, unique, update, user, using,
values, varchar, view, when, whenever,
where, while, with