SQL: VDBE assembler - tsafin/tarantool GitHub Wiki

Overview

It was suggested to implement VDBE assembler. For more detailed information, see: https://github.com/tarantool/tarantool/issues/3050

In a nutshell: execution of SQL query consists of several stages. Firstly, lexer and parser process string of a query in order to build AST or/and directly call functions, which in their turn emit sequence of opcodes implementing given query. As for selects, query planner comes into play and also handles some routine. After all, obtained VDBE program is executed via VDBE. It seems that some day an opportunity to launch VDBE programs without necessity of query compilation can be quite useful. For instance, one can notice that query planner seems to be a little bit complicated thing and achieving whatever execution plan (i.e. sequence of opcodes) is not easy. Hence, to point out which VDBE program performs better, it is quite handy to be capable of editing them.

Implementation details

For dumping purposes three pragmas have been added:

  • pragma vdbe_dump = 1/0 -- enable or disable dump of executed SQL statement.
  • pragma vdbe_dump_file = "filename" -- specify file for dump. By default it is "listing_dump.vdbe".
  • pragma vdbe_execute [= "filename"] -- execute content of filename as VDBE program. File should contain program previously dumped by vdbe_dump (or written using its pattern). "filename" is optional, without specifying name of file, default one is used.

In debug mode, struct sqlite3* contains name of file to dump VDBE listing. When flag for dump is set by first pragma, before VDBE execution all opcodes and VDBE context are dumped into given file. At the same time, operands containing space id are converted into space names, since space id may change after-while. Moreover, fourth operand is known to take different types (e.g. int, string, double etc.), so it is necessary to dump not only its values, but also its type. To create a brand new VDBE it is required to have notion concerning number of cursors, number of resulting columns and other attributes. Thus, alongside with opcodes, VDBE context is dumped as well. When it comes for execution of dumped program, file is read line by line. Using simple format scanf, opcode mnemonic and operands are fetched. Mnemonic is translated into opcode number, fourth operand turns into runtime value depending on its type. Finally, new VDBE is allocated and initialised with scanned context. At this moment nothing prevents from starting execution of new program.
It is worth mentioning that it is impossible to display result of VDBE execution as it ordinary appears (via lua facilities), owing to the fact that result of pragma is displayed this way. Therefore, execution result is displayed by simple printfs in a straightforward way.

Usage example

-- Let's create simple table and fill it with some data:
box.cfg{}
box.sql.execute("drop table if exists t1");
box.sql.execute("create table t1(id primary key, a)");
box.sql.execute("insert into t1 values(1, null)");
box.sql.execute("insert into t1 values(2, 123)");
box.sql.execute("insert into t1 values(3, -193)");
box.sql.execute("insert into t1 values(4, 'Hello, World!')");
box.sql.execute("insert into t1 values(5, 3872.123)");
box.sql.execute("insert into t1 values(6, 0)");
-- Set file and dump there select program:
box.sql.execute("pragma vdbe_dump_file = 'select.vdbe'");
-- Enable dumping:
box.sql.execute("pragma vdbe_dump = 1");
box.sql.execute("select * from t1");
-- Set another file and dump there delete program:
box.sql.execute("pragma vdbe_dump_file = 'delete.vdbe'");
box.sql.execute("delete from t1 where id = 4 or id = 3");
-- Again set another file and dump there insert program:
box.sql.execute("pragma vdbe_dump_file = 'insert.vdbe'");
box.sql.execute("insert into t1 values (3, 666), (4, 'Hello again!')");
-- Disable dumping:
box.sql.execute("pragma vdbe_dump = 0");
-- Execute select from file to be sure that it works:
box.sql.execute("pragma vdbe_execute = 'select.vdbe'");
-- Execute delete from file:
box.sql.execute("pragma vdbe_execute = 'delete.vdbe'");
box.sql.execute("pragma vdbe_execute = 'select.vdbe'");
-- Execute insert from file:
box.sql.execute("pragma vdbe_execute = 'insert.vdbe'");
box.sql.execute("pragma vdbe_execute = 'select.vdbe'");

Example of content(select.vdbe):
107404973
opcodes: 11
var: 0
mem: 4
cursor: 2
rescols: 2
read: 1
write: 1
Init 1 9 0 |type: none; | 0
OpenRead 1 T1 0 |type: key; nField: 1 xField: 1 | 0
Explain 0 0 0 |type: none; | 0
Rewind 1 8 1 |type: int; 0 | 0
Column 1 0 1 |type: none; | 0
Column 1 1 2 |type: none; | 0
ResultRow 1 2 0 |type: none; | 0
Next 1 4 0 |type: none; | 1
Halt 0 0 0 |type: none; | 0
Transaction 0 0 2 |type: int; 0 | 1
Goto 0 1 0 |type: none; | 0
125288175

Restrictions

Since current implementation is likely to be sort of proof-of-concept, it implies some restrictions.

  • For the simplicity's sake, assembler supports only 3 main types of P4 operand: int, string and key info. Support of other types will be easy to implement if it is ever needed.
  • Assembler doesn't include any lexer or parser, but only pure scanf with format string. Thus, be quite careful when editing VDBE listing.
  • Almost no static checks of program correctness.
  • It has been hardly tested, so it is available only in debug mode.