mix stdlib db - remixlabs/docs-public GitHub Wiki

Standard library - db

Overview

module db
  import q = query

  // Types
  type database = data
  type record = map(data)
  type recordID = data // actually a ref
  type query = <opaque>

  // Get databases (since PR 1160)
  def mainDB : database
  def appDB : string -> database
  def localInMemDB : string -> database
  def globalInMemDB : string -> database
  def remoteDB : string -> string -> database   // since PR1571

  // Primitive storing and retrieving
  def getOne : database -> recordID -> record
  def getOneWithDefault : record -> database -> recordID -> record
  def getArray : database -> array(recordID) -> array(record)
  def getStream : database -> array(recordID) -> stream(record)                         // since PR1612
  def getStreamWithDefault : record -> database -> array(recordID) -> stream(record)    // since PR1612
  def get : database -> array(recordID) -> stream(record)
  def getWithDefault : record -> database -> array(recordID) -> stream(record)
  def saveOne : appstate -> database -> record -> record
  def saveOneGetID : appstate -> database -> record -> recordID                         // since PR1612
  def saveArray : appstate -> database -> array(record) -> array(record)                // since PR1612
  def saveArrayGetIDs : appstate -> database -> array(record) -> array(recordID)        // since PR1612
  def saveStream : appstate -> database -> stream(record) -> stream(record)             // since PR1612
  def saveStreamGetIDs : appstate -> database -> stream(record) -> stream(recordID)     // since PR1612
  def save : appstate -> database -> stream(record) -> stream(record)
  def upsertOne : appstate -> database -> record -> record
  def upsertOneGetID : appstate -> database -> record -> recordID                       // since PR1612
  def upsertArray : appstate -> database -> array(record) -> array(record)              // since PR1612
  def upsertArrayGetIDs : appstate -> database -> array(record) -> array(recordID)      // since PR1612
  def upsertStream : appstate -> database -> stream(record) -> stream(record)           // since PR1612
  def upsertStreamGetIDs : appstate -> database -> stream(record) -> stream(recordID)   // since PR1612
  def upsert : appstate -> database -> stream(record) -> stream(record)
  def deleteOne : appstate -> database -> recordID -> record
  def deleteOneStm : appstate -> database -> recordID -> null                           // since PR1612
  def deleteArray : appstate -> database -> array(recordID) -> array(record)            // since PR1612
  def deleteArrayStm : appstate -> database -> array(recordID) -> null                  // since PR1612
  def deleteStream : appstate -> database -> stream(recordID) -> stream(record)         // since PR1612
  def deleteStreamStm : appstate -> database -> stream(recordID) -> null                // since PR1612
  def delete : appstate -> database -> stream(recordID) -> stream(record)
  def deref : database -> data -> data

  // deprecated
  def retrieve // replace with getOne
  def retrieveWithDefault // replace with getOneWithDefault

  // Sources
  def head : database -> query
  def all : database -> query
  def query : database -> string -> query
  def database : query -> database

  // Modifiers
  def includeDeleted : bool -> query -> query
  def includeSuperseded : bool -> query -> query
  def unrestricted : bool -> query -> query
  def label : string -> query -> query

  // Filters
  def filter : (record -> bool) -> query -> query
  def filterWithFallback : (record -> bool) -> query -> query

  // Mapping (projecting)
  def map : (record -> record) -> query -> query
  def mapWithFallback : (record -> record) -> query -> query

  // Sorting
  def sort : (record -> data) -> query -> query
  def dirSort : bool -> (record -> data) -> query -> query

  // Extracting results
  def length : query -> number
  def isEmpty : query -> bool
  def isNotEmpty : query -> bool
  def exists : (record -> bool) -> query -> bool
  def toArray : query -> array(record)
  def toStream : query -> record*

  // Special elements, slicing:
  def first : query -> record
  def firstWithDefault : record -> query -> record
  def last : query -> record
  def lastWithDefault : record -> query -> record
  def skip : query -> query
  def skipn : number -> query -> query
  def firstn : number -> query -> query
  def lastn : number -> query -> query

  // ast
  def eval : q.ast -> data

  // low-level execution
  def process : q.ast -> query -> query
  def processWithFallback : q.ast -> (stream(record) -> stream(record)) -> query -> query
  def processPipeline : array(q.ast) -> query -> query

  // query string parsing
  def parse : string -> q.ast
  def parseQueryString : database -> string -> q.ast     // since PR1651

  // "Magic link"
  def makeAnonymousLink : appstate -> database -> string -> result(string, string)

  // references
  private case ref(string)             // since PR#1813
  type reference = ref | null          // since PR#1813
  def makeRef: string -> recordID 
  def stringRef : recordID -> string   // since PR 1169
  def isRef: recordID -> bool

module end

Types

  type database = data
  type record = map(data)
  type recordID = data // really a ref
  type query = <opaque>

The type database designates the database to access; see next section for details.

The type record is a record coming from the database.

The type query is a query that will lead to the emission of a stream of records when executed.

Kinds of databases

Depending on the runtime environment, many varieties of database may be available. Typically, there is at least the "main" db associated with the current app, but there are generally others. These are summarized in the table below:

Kind On disk? Platform server? Browser runtimes? Helper function Notes
main db.mainDB/null
app DB db.appDB(name)
session DB db.localInMemDB(name)
in-mem app DB db.globalInMemDB(name) global in-mem DB, tied to particular global app DB, with same access rules
remote DB N/A db.remoteDB(url, name) connect to a remote DB over HTTP

Other than representing the main DB as null, the database type is represented internally as a map, which you can also construct directly:

  • {"type": "rmx", "db": <name>, "scope": ["local"|"global"], "inmem": [true|false] }: access the session-local in-memory database dbname (or the main database of the app if the name is "main")
  • {"type": "rmx_remote", "url": "baseURL", "db": <name> }: access the database db on the amp server with the given url. This is using the HTTP protocol. See below under db.remoteDB for more information.
  • {"type": <other>, ...}: not used currently

Save Options

A series of options have been added to save to enable recording of original metadata -

  • KeepOriginal makes an _rmx_original map and adds individual meta records from the original record there, like _rmx_original/_rmx_id
  • OriginalIdAlways always stores a _rmx_original_id and _rmx_original_rev from the original record
  • TryToUseId will check that the id's provided are "acceptable", and then try to use them
  • Upsert, turns out this is just an additional flag on save, so you can add this to any other option (or use by itself) to turn save into upsert

Save options are provided in the database type as a string array called "options".

Examples:

  • null: main on-disk DB of the app (same as db.mainDB)
  • {"type": "rmx", "scope": "main"}: same
  • {"type": "rmx", "scope": "session", "db": "temp_db"}: in-mem, session-scoped DB named "temp_db"
  • {"type": "rmx", "scope": "global", "db": "my_db"}: global, on-disk DB named "my_db"
  • {"type": "rmx", "scope": "global", "db": "my_db", "inmem": true}: global, in-mem DB named "my_db"
  • {"type": "rmx", "scope": "main", "options": [ "OriginalIdAlways" ]}: main on-disk DB with the additional save option
  • {"type": "rmx_remote", "url": "baseURL", "db": "my_db" }: remote DB available at "baseURL", eg, "https://remix.remixlabs.com/a". This is using the HTTP protocol. See below under db.remoteDB for more information.

Note that there is no additional access control for session-scoped DBs: if you can access the session, you can use these DBs. They also don't need to be explicitly created. Global DBs correspond to apps, and there is a one-to-one correspondence of "normal" on-disk DBs and in-mem DBs. They are created simultaneously and share access control rules. Access to global DBs within Mix is subject to the same access controls as over HTTP, and will fail as appropriate.

Regarding remoteDB: When accessing databases this way, the db operations are executed via the /documents HTTP endpoint. Keep in mind that you need appropriate privileges on the server. Also, there are a few differences to local db operations:

  • live queries (db subscriptions): not supported.

Environment: remoteDatabaseServer

If this environment variable is set, it is taken as the URL of the amp server, and any access to a local main database is redirected to this amp server using the remote protocol.

Primitive storing and retrieving

  def getOne : database -> recordID -> record
  def getOneWithDefault : record -> database -> recordID -> record
  def getArray : database -> array(recordID) -> array(record)
  def getStream : database -> array(recordID) -> stream(record)                         // since PR1612
  def getStreamWithDefault : record -> database -> array(recordID) -> stream(record)    // since PR1612
  def get : database -> array(recordID) -> stream(record)
  def getWithDefault : record -> database -> array(recordID) -> stream(record)
  def saveOne : appstate -> database -> record -> record
  def saveOneGetID : appstate -> database -> record -> recordID                         // since PR1612
  def saveArray : appstate -> database -> array(record) -> array(record)                // since PR1612
  def saveArrayGetIDs : appstate -> database -> array(record) -> array(recordID)        // since PR1612
  def saveStream : appstate -> database -> stream(record) -> stream(record)             // since PR1612
  def saveStreamGetIDs : appstate -> database -> stream(record) -> stream(recordID)     // since PR1612
  def save : appstate -> database -> stream(record) -> stream(record)
  def upsertOne : appstate -> database -> record -> record
  def upsertOneGetID : appstate -> database -> record -> recordID                       // since PR1612
  def upsertArray : appstate -> database -> array(record) -> array(record)              // since PR1612
  def upsertArrayGetIDs : appstate -> database -> array(record) -> array(recordID)      // since PR1612
  def upsertStream : appstate -> database -> stream(record) -> stream(record)           // since PR1612
  def upsertStreamGetIDs : appstate -> database -> stream(record) -> stream(recordID)   // since PR1612
  def upsert : appstate -> database -> stream(record) -> stream(record)
  def deleteOne : appstate -> database -> recordID -> record
  def deleteOneStm : appstate -> database -> recordID -> null                           // since PR1612
  def deleteArray : appstate -> database -> array(recordID) -> array(record)            // since PR1612
  def deleteArrayStm : appstate -> database -> array(recordID) -> null                  // since PR1612
  def deleteStream : appstate -> database -> stream(recordID) -> stream(record)         // since PR1612
  def deleteStreamStm : appstate -> database -> stream(recordID) -> null                // since PR1612
  def delete : appstate -> database -> stream(recordID) -> stream(record)
  def deref : database -> data -> data

The functions taking an appstate parameter modify the database (i.e. they modify the application state). This parameter can be passed down from function call to function call. In global scope, and inside action closures, the appstate parameter can also be obtained via the special variable _ (underscore), e.g.

def myAction = action(param) { db.saveOne(_, db, param) }

In contexts that do not permit the modification of the application state, it is not possible to obtain the appstate parameter this way. This is normally the case in defs and cells.

The meaning of the functions:

  • saveOne saves one record. If the record has an _rmx_id field, the existing record is updated. Otherwise a new record is created. The final record as saved is returned.
  • save saves a stream of records. The _rmx_id fields and other references in the records are interpreted relative to each other, and finally replaced by the IDs from the database.
  • upsert inserts/modifies the records found in the input stream. If the record does not exist yet, it is inserted. Otherwise, the present fields overwrite the already existing fields while the other fields remain unchanged.
  • deleteOne deletes one record with the given ID, and returns the updated record. If the ID did not exist in the db, nothing happens, and {} is returned instead (since PR#768 - before that PR undefined was returned).
  • delete deletes a stream of records with the given IDs, and returns the updated records. If an ID did not exist in the db, the ID is ignored, and the function returns {} instead of a record.
  • getOne loads the record with the given ID; get returns a stream for an array of IDs.
  • deref: if the parameter looks like an ID, it loads the record. Otherwise the parameter is returned unchanged.

NB: save, upsert, and delete return streams, but they are not evaluated lazily -- the records are saved/deleted whether or not you consume the returned stream.

Base queries

  def head : database -> query
  def all : database -> query
  def query : database -> string -> query
  def database : query -> database

The function head() is the query returning the latest versions of all records. The null database is the default database of the app, and head(null) or head() for short is the most common way to call this function.

The function all() is the query returning all versions of all records.

The function query parses the given string as native query of the database. The syntax is documented here.

The function database extracts the database value from the query.

Modifiers

  def includeDeleted : bool -> query -> query
  def includeSuperseded : bool -> query -> query
  def unrestricted : bool -> query -> query
  def label : string -> query -> query

The include... functions modify a query to include deleted records or old versions, respectively. They may be used independently or combined. Note that if you delete a record, it will exist as two versions -- the pre-deleted version, and a tombstone version marked deleted, and containing only metadata fields. The tombstone version typically will not match queries the prior-to-deletion version did, because it only contains metadata fields -- if you need "records that ever matched this query, plus possibly their deletion tombstones", you'll have to do the query on all revisions, collect the _rmx_ids, and then query with includeDeleted(true) on the IDs.

The unrestricted function sets the flag that controls whether the query is executed in unrestricted mode (by default this is off). In unrestricted mode certain checks are skipped.

The label function sets a string label with the query. The label is printed in some log messages.

Filters

  def filter : (record -> bool) -> query -> query
  def filterWithFallback : (record -> bool) -> query -> query

filter returns a new query that adds a filter to the input query. Only those records remain in the result stream where the function returns true. The filter function always compiles the query down to a fast query that is directly executed by the database engine. If such a compilation is not possible, the compiler will indicate an error.

filterWithFallback is like filter, but includes a fallback method of evaluating the filter should the compilation to a db query not be possible.

Note that only certain filter functions can be rewritten to db queries. In particular:

  • Equality predicates: (r -> r.field == value) and (r -> r.field != value) when the field name is a constant. The value can be any Mix expression. Note, however, that the Rmx engine only supports equality indexes for strings, bools, and null at the moment, and violating this will cause a runtime error.
    • Allowed since PR#811: deep field access like r.field1.field2
  • Existence predicate: (r -> r.field?) (is true when the field exists in the record)
  • Prefix predicate: (r -> string.startsWith(r.field, prefix)) (is true when the field is a string and starts with prefix)
    • the prefix can be an arbitrary Mix expression that evaluates to a string
  • like predicates: (r -> string.like(r.field, "pattern%"))
    • here, the pattern must be a literal string ending with a percent sign
    • other forms rejected by the compiler
  • contains predicates: (r -> string.contains("substring", r.field)) (is true when the string field contains the "substring")
  • contains predicates: (r -> array.contains(r.field, [x1,x2,...])) (is true when the field has one of the values x1, x2, etc.
    • works again since PR#811
  • featuresOne predicates: (r -> array.featuresOne(r.field, [x1,x2,...])) (is true when the field is an array of strings, and one of the strings occurs in the list [x1,x2,...] - in other words, the field and the list have elements in common)
  • featuresAll predicates: (r -> array.featuresAll(r.field, [x1,x2,...])) (is true when the field is an array of strings, and all of the strings occur in the list [x1,x2,...] - in other words, the field is a superset of the elements in the list)
  • Boolean expressions formed with &&, || and not
  • Boolean expressions true and false

Example:

db.head() |> db.filter(r -> r.field == "foo") |> db.toArray

The filter predicate may be written with abbreviated field access notation. For instance, the above example could also be written as

db.head() |> db.filter(.field == "foo") |> db.toArray

When any inner boolean expression does not reference the current row (e.g. r), the expression can be arbitrary Mix code evaluating to true or false. This feature may be used to switch sub filters on or off, e.g.

db.head() |> db.filter(r -> skip || r.field == "foo") |> db.toArray

Here, skip is such an expression. If skip is true, the actual check is skipped, and the record is retained in the result set. Otherwise, the check is executed.

The filter function allows it to include literal ASTs wherever a boolean expression is expected (note that filterWithFallback does not allow this because ASTs are a feature of the db engine). The AST must be marked with db.eval. For example:

db.head() |> db.filter(r -> r.field == "foo" && db.eval(db.astAfter(watermark))) |> db.toArray

Here, an after condition is added which is otherwise not possible with pure Mix-level syntax.

See below for how to construct ASTs.

Mapping

  def map : (record -> record) -> query -> query
  def mapWithFallback : (record -> record) -> query -> query

map returns a new query where the elements are mapped by calling the function. Like filter, calls of this function are translated to code of the db engine. Unlike filter, you can mix db-translatable and other code in the the projection function. This works as follows:

  • accesses of fields of the current row are extracted from the projection function
    • e.g. in db.map(r -> [ r.field1, r.field2 ]) the two field accesses r.field1 and r.field2 are taken out of the expression
  • these accesses are delegated to the db engine
  • the result values are finally inserted at the places where the field accesses were located

So, in short, field accesses are executed by the db engine, and the whole rest is executed by the Mix engine.

Also, there are these special features:

  • deep field accesses are possible, and it is possible to cross database references
    • e.g. r.field1.field2 works if field1 is a map but also if field1 is a dbref to a record in the db
    • this feature is limited to the current row
  • field accesses can finally indicate to follow references and to load the db record
    • e.g. r.field1.* loads the record from the db pointed to by the reference r.field1
    • however, there is no recursive expansion yet (r.field1.**)
  • projections can also include ASTs
    • e.g. db.eval(db.astInvoke("count",[]))

Using the function db.map is limited to cases where it follows a db.filter, db.head, db.all, or db.query in a query pipeline. In particular, it is not allowed after a db.filterWithFallback or another db.map (i.e. there must not have been any Mix-level postprocessing of the query results yet).

The function db.mapWithFallback can always be used. If possible, it passes the projection on to the db engine. In contrast to db.map there is a fallback should the db engine not be usable. The fallback supports:

  • deep field accesses (r.field1.field2): yes
  • deep fields crossing references: yes
  • follow final reference (r.field1.*): yes
  • ASTs: no

Sorting

  def sort : (record -> data) -> query -> query
  def dirSort : bool -> (record -> data) -> query -> query

sort(f) results in a query that sorts the records by the result of f (ascending).

dirSort(r, f): sorts in ascending (r=false) or descending (r=true) way.

The query rewriter does not currently handle the sort functions, although the db engine can do efficient indexed sorts on string fields. You can access the indexed version by generating the AST explicitly: db.process(query.invoke("order", [f, r])).

Extracting results

  def length : query -> number
  def isEmpty : query -> bool
  def isNotEmpty : query -> bool
  def exists : (record -> bool) -> query -> bool
  def toArray : query -> array(record)
  def toStream : query -> record*

The following functions all execute the query. Note that you can execute the same query several times.

length executes the query and returns the number of records it would have returned.

isEmpty executes the query and returns true when there are no records in the result set.

exists executes the query and checks whether there is a record in the query result where the function returns true.

toArray executes the query and returns the result as array of records.

toStream executes the query and returns the result as stream of records.

Special elements, slicing

  def first : query -> record
  def firstWithDefault : record -> query -> record
  def last : query -> record
  def lastWithDefault : record -> query -> record
  def skip : query -> query
  def skipn : number -> query -> query
  def firstn : number -> query -> query
  def lastn : number -> query -> query

first is the first element of the result stream (or a runtime error for the empty stream).

firstWithDefault returns the first element, or if the result is empty, the substitute.

last is the last element of the result stream (or a runtime error for the empty stream).

lastWithDefault returns the last element, or if the result is empty, the substitute.

skip creates a query where the first element of the input query is skipped.

skipn creates a query where the first n elements of the input query are skipped.

firstn creates a query whose result stream is the same as the input query limited to the first n records.

lastn creates a query whose result stream is the same as the last n records of the input query.

Abstract syntax trees

The AST structure and helper functions to create it are now in the query module, typically used via import q = query.

  def eval : query.ast -> data

You can use AST nodes in three ways:

  • Inside a db.filter you can use any of the AST nodes for filters. The AST node must be "evaluated" at the right place using db.eval (i.e. call db.eval with the AST node as parameter).
  • Inside a db.map you can use any of the AST nodes for projections. Again, db.eval must be called with the AST node.
  • Run them directly using db.process (see below)

low-level execution

  def process : ast -> query -> query
  def processWithFallback : ast -> (stream(record) -> stream(record)) -> query -> query
  def processPipeline : array(ast) -> query -> query

The function db.process adds the passed AST to the query executed by the db engine. It is required that there has not been any Mix postprocessing for the input query.

The function db.processWithFallback works like db.process if there was not yet any Mix postprocessing. Otherwise, the fallback function is executed on the Mix side.

The function db.processPipeline(asts) is like calling db.process for each AST in the list asts.

query string parsing

  def parse : string -> ast
  def parseQueryString : database -> string -> ast     // since PR1651

parse: Parses the query and returns the AST node. The query can only have one pipeline element (i.e. not several elements connected with |).

parseQueryString (preferred): in this version, the database is also specified, and the engine serving the db is used for parsing.

See https://github.com/remixlabs/mix-rs/wiki/MixQuery-%E2%80%90-writing-queries for an overview of string query syntax.

Live queries

Live queries are provided by the messaging module, and are documented here.

Anonymous links

  def makeAnonymousLink : appstate -> database -> string -> result(string, string)

makeAnonymousLink(as, null, "/documents/123.json") returns, if successful, an absolute URL that represents a shareable link to the passed endpoint, which must be inside the current app. Only GET, POST, and PUT methods will work with anonymous links. Accessing an anonymous link will invoke the underlying endpoint as if you are the creator of the link (with the same access restrictions, so you can't share something you can't access).

References

  private case ref(string)             // since PR#1813
  type reference = ref | null          // since PR#1813

  def makeRef: string -> recordID 
  def stringRef : recordID -> string   // since PR 1169
  def isRef: recordID -> bool

References have some history, but since PR#1813 the final representation of these values is available to Mix programmers. References used to be strings prefixed with the special {ref} prefix, and there was some hack active that we could tell whether a string like {ref}123 was meant as reference or as plain string. This is all gone now, and the following docs describe the new representation.

Now, a reference has the type db.reference which is either null or db.ref(ID) for a non-empty string ID (usually the _rmx_id of a db record). You can't directly create references by calling db.ref (because the case type is declared private). Instead, there is the db.makeRef function:

  • db.makeRef("") returns null
  • db.makeRef(ID) returns db.ref(ID) for any non-empty string ID

The database reacts on null references in a special way. For example, when the _rmx_id of a record is null, the db.saveOne function inserts a new record (instead of saving the record under its old ID). This piece of code takes the existing record found under the reference r and saves a copy:

let rec = db.getOne(null, r);
let copy = db.saveOne(appstate, null, rec with ._rmx_id = null);

The function db.stringRef can be used to extract the ID from the reference, or to get the empty string in case of a null. db.stringRef also accepts regular strings as input for backward compatibility.

The function db.isRef returns true if the input is null or a db.ref.

You may wonder why we don't use the db.reference type all over the place, and rely on db.recordID = data instead. This is mainly to keep the typing compatible with the old implementation, so users who always used makeRef and stringRef do not need to change any code. We are aware that this is not optimal, and perhaps there is another change coming.

In JSON, (non-empty) references are encoded as

{ "_rmx_type": "{tag}case:db.ref",
  "_rmx_value": "<value>"
}
⚠️ **GitHub.com Fallback** ⚠️