Mimir SQL - UBOdin/mimir GitHub Wiki
query :=
'SELECT' expression ['AS' alias][, expression ['AS' alias][, ...]]
'FROM' source
'WHERE' condition
'GROUP BY' column[, column[, ...]]
'ORDER BY' expression [asc | desc][, column [asc | desc][, ...]]
'LIMIT' number
expression :=
| '(' expression ')'
| value
| expression ('+'|'-'|'*'|'/') expression
| '-' expression
| function '(' [expression[, expression[, ...]] ')'
| 'CASE' 'WHEN' condition 'THEN' expression ['WHEN' condition 'THEN' expression ['WHEN'...]] 'ELSE' expression 'END'
| 'CASE' expression 'WHEN' expression 'THEN' expression ['WHEN' expression 'THEN' expression ['WHEN'...]] 'ELSE' expression 'END'
| 'CAST' '(' expression 'AS' type ')'
| 'EXTRACT' '(' field 'FROM' expression ')'
condition :=
| '(' condition ')'
| condition ('+'|'-'|'*'|'/'|'AND'|'OR'|'>'|'>='|'='|'<'|'<='|'<>') condition
| 'NOT' expression
| function '(' [expression[, expression[, ...]] ')'
source :=
| '(' query ')' ['AS'] alias
| [schema_name '.'] table_name [['AS'] alias]
Mimir generally supports most SQL92 features in SELECT
queries. Mimir's support for aggregate and non-aggregate functions is still a work in progress. See the Functions documentation for more details.
'ANALYZE' ['WITH ASSIGNMENTS'] [[column 'IN'] rowid 'OF'] query
Explain an uncertainty annotation for the specified relation, row, or cell.
- neither rowid or column: List all sources of uncertainty that might affect the table.
- rowid but not column: List all sources of uncertainty that affect the presence of the specified row in the table.
- rowid + column: List all sources of uncertainty that might affect the value of the specified cell.
To obtain the rowid
for a given row of a query, add the ROWID()
function to the query's outermost SELECT
. Alternatively, just write a select query that produces a single row and you'll get the same effect.
Add the WITH ASSIGNMENTS
option to get a list of prioritized cleaning tasks.
'FEEDBACK' variable_expression 'IS' value
Provide feedback, repairing a given source of uncertainty. To obtain the variable expression needed to repair a source of uncertainty, use ANALYZE
.
'LOAD' file_path_string [ AS name ]
Load the specified file into the database and prepare it for use. Depending on the format, one or more lenses and/or adaptive schemas will be created wrapping the data. If a table name is specified, this name will be used for the newly created table or adaptive schema. Otherwise the base name of the file will be used.
Presently, Mimir only supports loading of CSV
files. We expect to support loading JSON
files soon (tm).
CSV files are loaded using the Apache Commons CSV parser into a single table. Mimir makes every effort to recover from malformed CSV: Under-sized rows are padded, Over-sized rows are trimmed, Lines that break CommonsCSV are dropped. At present, these errors are logged to the console, but not persisted.
The behavior of the CSV loader differs depending on whether a table with the same name already exists. If there is such a table, rows of the CSV are loaded into this table according to its schema.
If there is no table with the same name, a new table is created with the _RAW
suffix, contents of the CSV file are loaded into it as string attributes, and a new Type Inference lens with the specified name is created.
'CREATE'
| 'TABLE' tableName '(' attrName type[, attrName type[, ...]][, tableOption] ')'
| 'VIEW' viewName 'AS' query
| 'LENS' lensName 'AS' query 'WITH' lensType '(' lensParameters ')'
| 'ADAPTIVE' 'SCHEMA' schemaName 'AS' query 'WITH' extractor '(' extractorParameters ')'
tableOption :=
| 'PRIMARY' 'KEY' '(' attrName[, attrName[, ...]] ')'
| 'INDEX' '(' attrName[, attrName[, ...]] ')'
Create one of several types of database objects. CREATE TABLE
and CREATE VIEW
function like they do in a classical database. CREATE LENS
and CREATE ADAPTIVE SCHEMA
are specific to Mimir. See Lens Documentation for more details.
'EXPLAIN' query
Document the Mimir rewriting engine's decisions in executing a specified query. Analogous to EXPLAIN in many database engines.
'PLOT' data_table_name [WITH globalOption][ globalOption][ USING lineSpecification][, lineSpecification];
globalOption:=
|'FORMAT' ('line'|'scatter'|'bar')
|'PLOTNAME' value
|'XLABEL' value
|'YLABEL' value
|'XMAX' numeric_value
|'YMAX' numeric_value
|'XMIN' numeric_value
|'YMIN' numeric_value
|'SAVEFORMAT' ('png'|'pdf'|'svg'|'eps')
|'SAVENAME' value
|'BARORIENT' ('vertical'|'horizontal')
|'LEGEND' ('show'|'hide')
|'LEGENDLOC' ('upper left'|'upper right'|'right'|'lower left'|'lower right'|'center'|
'center left'|'center right'|'lower center'|'upper center'|'best')
lineSpecification := '('x_column_name, y_column_name')' [lineValue][lineValue]
lineValue :=
|'COLOR' ('black'|'blue'|'red'|'yellow'|'green')
|'STYLE' ('^'|'s'|'o') | ('-'|'--'|':')
|'WEIGHT' numeric_value
|'FILTER' column_name'='value
|'LINENAME' value
Plot the given data table with any specified global or line settings. The first set of lineValue STYLE options are for scatter plots, while the second set are for line plots. The FILTER lineValue option is to allow the x and y values noted in the lineSpecification to be filtered to only contain the x,y values where the column denoted by column_name's values are equal to the specified value.