identifiers in Tarantool SQL - AnaNek/tarantool GitHub Wiki

For more details see:

  1. tests in folder test/saql-tap
    • identifier_case.test.lua
    • identifier_characters.test.lua
  2. identifiers in Tarantool
  3. related issues: #2123, #2126, 2914

In short:

  • identifiers are names of different staff (tables, columns...)
  • non delimited (not surrounded with ") identifier is first converted to the UPPER-CASE and then acts as a delimited one
  • delimited identifier can contain any printable Unicode symbol

Identifiers in Tarantool SQL

The main principles of identifier mechanics were taken from ANSI ISO 2003 standard. For more details on implementation, see identifiers in Tarantool.

Identifiers in Tarantool SQL

List of identifiers:

  1. table/view names
  2. column names
  3. function names
  4. pragma parameters (like "UTF-8") (they also can work as strings)
  5. collation names
  6. trigger names

Not identifiers:

  1. strings, blobs (e.g. select 'cat', RAISE(ERROR, 'error text'))
  2. vriables (e.g. select * from t where id = :my_variable)

Rules that identifiers follow

Identifiers follow the common set of syntax rules:

  1. they are delimited with ", or not delimited at all
  2. comparison with respect to case (see below)

Delimiters

Anything that delimited with " (double quote) is considered to be identifier Anything that not delimited could be an identifier or a keyword (full list is below). Note 1: ' (quote) is used only for delimiting strings and blobs, so it is more or less obvious what this snippet does:

create table a("a" primary key)
insert into a values('a')
select * from a where "a" == 'a'

Note 2: creation of identifiers with " in its name performed from SQL console like this "ab""c" -> ab"c (according to ANSI ISO 2003)

Comparition with respect to case

Comparition of identifiers also follow the ANSI ISO 2003 rules.

Each identifier does one of the following:

  1. if it is delimited, it is compared case sensitive as it is
  2. if it is not delimited, then it is cast to UPPER case on parse stage and then compared case sensitively

Important Notes

  1. only ASCII characters are converting to upper case by now
    create table  a (b primary key) -- ok
    create table  A (b primary key) -- "A" already exists, not ok
    create table "A"(b primary key) -- "A" already exists, not ok
    create table "a"(b primary key) -- ok
    create table ю (b primary key) -- ok
    create table Ю (b primary key) -- ok, as Ю is not ASCII character
    
  2. system space names are stored in lower case and therefore should be delimited (see below)
    selece id from _space -- 'no such table: _SPACE'
    select "id" from "_space" -- ok
    
  3. non-delimited identifiers in errors and result column names occur in UPPER case as a result of uppercasing on parsing stage (e.g. selece id from _space -- 'no such table: _SPACE')
  4. sql_create_function function can consume delimited and non-delimited names
    box.internal.sql_create_function("func", finction () return 1 end) -- would create "FUNC"
    box.internal.sql_create_function("\"func\"", finction () return 2 end) -- would create "func"
    box.sql.execute("select func()") -- returns 1
    box.sql.execute("select \"func\"()") -- returns 2
    

see identifier_case.test.lua for more examples

Exceptions from the rule

Those examples work different from the general rule and can confuse one (known bugs):

  1. expressions in result column names In such query select a from b you receive column "A" in result set BUT In such query select a+1 from b you receive column "a+1" in result set, "a" is not cast to upper case because in general case it is a quite hard task and we (and all other DB) do not do this

  2. some parsing stage errors e.g: create view a(a desc) produces an error syntax error after column name "a", where "a" is mentioned in the LOWER case because it is thrown in very early stage of parsing and not converted to the UPPER case

  3. binary collation is "VIRTUAL" and exists only in SQL, and by now it accepts any case ("BinarY", binary)

Reserved keywords

all,         alter,          analyze,        and,                 any,
as,          asc,            asensitive,     begin,               between,
binary,      by,             call,           case,                char,
character,   check,          collate,        column,              commit,
condition,   connect,        constraint,     create,              cross,
current,     current_date,   current_time,   current_timestamp,   current_user,
cursor,      date,           decimal,        declare,             default,
delete,      dense_rank,     desc,           describe,            deterministic,
distinct,    double,         drop,           each,                else,
elseif,      end,            escape,         except,              exists,
explain,     fetch,          float,          for,                 foreign,
from,        function,       get,            grant,               group,
having,      if,             immediate,      in,                  index,
inner,       inout,          insensitive,    insert,              integer,
intersect,   into,           is,             iterate,             join,
leave,       left,           like,           localtime,           localtimestamp,
loop,        match,          natural,        not,                 null,
of,          on,             or,             order,               out,
outer,       over,           partition,      pragma,              precision,
primary,     procedure,      range,          rank,                reads,
recursive,   references,     reindex,        release,             rename,
repeat,      replace,        resignal,       return,              revoke,
right,       rollback,       row,            row_number,          rows,
savepoint,   select,         sensitive,      set,                 signal,
smallint,    specific,       sql,            start,               system,
table,       then,           to,             transaction,         trigger,
union,       unique,         update,         user,                using,
values,      varchar,        view,           when,                whenever,
where,       while,          with