Write and Read Models - nsip/nias3-engine GitHub Wiki

Much of the functionality of nias3-engine has been superseded by n3-transport. This page documents the write and read models of nias3-engine, which are a dependency of the nias3 front end.

Read models

Nias3-engine is set up to pass its tuples to three read models:

  • a Hexastore, which is a key/value store (boltDB) where a tuple is stored with all permutations of its SPO as keys, and which allows retrieval on key prefixes.

    • The context of a tuple is prefixed to all keys, and the key is of format c:"..." s:"..." p:"..." o:"...", with the contents of each of c, s, p, and o quote-escaped to avoid ambiguity. (A key-construction sketch follows this list.)
    • The variant hexastore key orderings are used by nias3, e.g. to request tuples by XPath prefix alone, regardless of their subject.
    • The hexastore allows flexible low-level queries on tuples, but queries on each of S, P, and O are just as possible in Influx, so it is not a given that we will need to implement a hexastore in n3-transport.
    • The hexastore stores only the latest version of a tuple under its respective keys, and tombstones previous versions of the tuple under a different key. (A tuple is considered changed if it has the same C S P but different O. Empty O is used to indicate tuple deletion.)
  • Influx, which is a time-series database, with each version of each tuple stored sequentially. Queries to Influx therefore need, by default, to retrieve the most recent instance of a tuple with the given C S P.

  • Mongo, which stores each object (corresponding to a set of tuples with the same subject) as a separate document; the object is represented in Mongo as JSON, treating the predicates as JSON Paths.
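
To make the hexastore key scheme above concrete, here is a minimal Go sketch that builds the context-prefixed permutation keys for a tuple, using strconv.Quote for the quote-escaping. The Tuple type, the key/keys helpers, and the exact set of orderings are illustrative assumptions, not the actual nias3-engine code.

```go
package main

import (
	"fmt"
	"strconv"
)

// Tuple is a hypothetical representation of a context/subject/predicate/object tuple.
type Tuple struct {
	C, S, P, O string
}

// key builds a hexastore key for one ordering of the S, P, O components.
// The context is always prefixed, and every component is quote-escaped
// (strconv.Quote) so that delimiters inside values cannot cause ambiguity.
func key(order string, t Tuple) string {
	parts := map[byte]string{'s': t.S, 'p': t.P, 'o': t.O}
	k := "c:" + strconv.Quote(t.C)
	for i := 0; i < len(order); i++ {
		k += " " + string(order[i]) + ":" + strconv.Quote(parts[order[i]])
	}
	return k
}

// keys returns the permutation keys under which a single tuple would be stored,
// so that any prefix of c plus some ordering of s, p, o can be used for retrieval.
func keys(t Tuple) []string {
	var out []string
	for _, order := range []string{"spo", "sop", "pso", "pos", "osp", "ops"} {
		out = append(out, key(order, t))
	}
	return out
}

func main() {
	t := Tuple{C: "SIF", S: "student-123", P: "PersonInfo.Name.FamilyName", O: "Smith"}
	for _, k := range keys(t) {
		fmt.Println(k)
	}
	// A prefix scan on keys beginning with `c:"SIF" p:"PersonInfo.Name.FamilyName"`
	// would return that predicate across all subjects, which is the "variant key
	// ordering" use mentioned in the list above.
}
```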

nias3-engine applies versioning to all tuples it ingests, to resolve collisions between nodes trying to update the same tuple. It does so using a count-min sketch, which is a variant of a Bloom filter: when it sees a tuple for the first time (as identified by C S P), it assigns it version 1; if it has already seen a tuple with the same C S P, it assigns one version higher than the latest version it has seen. (See blockchain.go, AddNewBlock, reference to bc.cms.Estimate and bc.cms.Update.) Because all nodes see all tuples, the versions will reach internal consistency.

  • The count-min sketch implementation is appealing because it tracks versions against keys in memory, so it is fast; it can return false positives, but is guaranteed not to return false negatives. If the count-min sketch indicates a collision, the potential collision still needs to be looked up in a read model to confirm that it is real and not a false positive; that read model needs to be fast, and the hexastore is used here. So if there are too many false positives, version checking ends up hitting disk much of the time, and the count-min sketch is no longer faster than a database lookup. As documented in n3/count-min-sketch.go, the count-min sketch needs to grow in RAM to stay performant as the number of keys it tracks increases: the implementation limits its size to 20 MB, which starts introducing too many collisions at around 1 million keys (= tuples). (A sketch of the versioning flow follows.)
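
Below is a minimal sketch of the versioning flow described above, with a hand-rolled count-min sketch and a stubbed-out hexastore lookup; the names (cms, assignVersion, hexaLatestVersion) are hypothetical and do not reflect the actual structure of blockchain.go or n3/count-min-sketch.go.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// cms is a minimal count-min sketch: d rows of w counters.
// Estimate can over-count (false positives) but never under-counts.
type cms struct {
	w, d  uint32
	table [][]uint32
}

func newCMS(w, d uint32) *cms {
	t := make([][]uint32, d)
	for i := range t {
		t[i] = make([]uint32, w)
	}
	return &cms{w: w, d: d, table: t}
}

// index hashes the key with a per-row seed and maps it to a counter slot.
func (c *cms) index(row uint32, key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte{byte(row)})
	h.Write([]byte(key))
	return h.Sum32() % c.w
}

// Update increments the counter for the key in every row.
func (c *cms) Update(key string) {
	for row := uint32(0); row < c.d; row++ {
		c.table[row][c.index(row, key)]++
	}
}

// Estimate returns the minimum counter across rows: an upper bound on the
// number of times the key has been seen.
func (c *cms) Estimate(key string) uint32 {
	est := uint32(0)
	for row := uint32(0); row < c.d; row++ {
		v := c.table[row][c.index(row, key)]
		if row == 0 || v < est {
			est = v
		}
	}
	return est
}

// assignVersion mirrors the flow described above: if the sketch says the
// C S P key has been seen before, confirm against a read model (here a stub
// standing in for the hexastore) to rule out a false positive, then assign
// latest+1; otherwise assign version 1.
func assignVersion(sketch *cms, hexaLatestVersion func(csp string) (uint32, bool), csp string) uint32 {
	version := uint32(1)
	if sketch.Estimate(csp) > 0 {
		if latest, seen := hexaLatestVersion(csp); seen {
			version = latest + 1
		}
	}
	sketch.Update(csp)
	return version
}

func main() {
	sketch := newCMS(1<<20, 4)
	// Stub read model: pretend nothing has been stored yet.
	lookup := func(csp string) (uint32, bool) { return 0, false }
	fmt.Println(assignVersion(sketch, lookup, `c:"SIF" s:"student-123" p:"FamilyName"`))
}
```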

NIAS3-engine also implements direct queries on the Hexastore read model, to address the Digital Classroom project (nswdigclass_query.go). These queries will need to be realised in the code port, whether as Influx, Mongo, or Hexastore queries. They may need to be exposed in the short term via a direct API query, as they currently are in nias3-engine; but the ultimate intention is to expose them as business queries via the GraphQL layer.

Both read queries and write requests are exposed from nias3-engine to nias3 via a web server (n3/webserver.go). The write API is superseded in n3-transport by gRPC calls. The write API also does encryption and digital signing of received tuples (called "blockchain" in nias3-engine); this should all now be done by the NaCl transport, and can be ignored. Some read queries will still need to be exposed from n3-transport to a nias3 query front end, because nias3 will still support direct SIF CRUD queries.

Write models

The nias3-engine passes the incoming stream of tuples through a filtering stage and a batching stage, each with its own stream:

  • Filter > FilterFeed (n3/filterfeed.go: FilterFeed) removes from the stream any tuples that have already been seen (through accessing a combination of the count-min sketch and the hexastore); creates a put command for any tuple with a non-empty object and a delete command for any tuple with an empty object; adds a timestamp to each tuple (to deal with collisions), and sends them to the filterfeed stream.
  • FilterFeed > Read models (n3/hexastore.go, n3/influx.go, n3/mongo.go, ConnectToFeed: yes, it repeats much of the same code) processes the batches of tuple updates. It sorts them by CSP, then timestamp, and iterates through tuples in sorted batches of 1000 tuples. This is to deal with the slowness of random writes on the Hexastore key/value store, which (as with most key/value stores) is optimised for random reads but not random writes. For any set of tuples with the same CSP, only the first tuple chronologically is used for deletes, and only the last tuple chronologically is used for puts (see the sketch after this list). It attempts to speed up the operation by using asynchronous connections.
    • There remains a problem if tuples for the same object end up split between two batches: if there is an update operation on an object, tuples must be deleted before their replacements are added. There is a risk that a put will end up in the earlier-processed batch, and its corresponding delete will end up in the later-processed batch. n3-transport now tracks the number of tuples associated with each object posted to the write model; it would be more robust to use that tracking when building up batches, to ensure that all tuples associated with an object are in the same batch, and that the batch is not processed until all tuples associated with the object have been received.
    • nias3-engine batches for reasons of write performance. Batches are problematic because they may compromise the ordering of tuple updates (particularly delete-then-write updates), and if we can avoid having to use batches, we should.
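
The sketch below illustrates one reading of the per-batch ordering rules described in the list above: sort by CSP then timestamp, and for each CSP apply only the first delete and the last put. The update type and the applyDelete/applyPut hooks are hypothetical stand-ins for the ConnectToFeed implementations, not the actual nias3-engine code.

```go
package main

import (
	"fmt"
	"sort"
)

// update is a hypothetical tuple update as it arrives on the filterfeed stream.
type update struct {
	CSP       string // context + subject + predicate key
	Object    string // empty object signals a delete
	Timestamp int64
}

// processBatch applies one batch of updates to a read model. Updates are
// sorted by CSP and then timestamp; within each CSP run, only the
// chronologically first delete and the chronologically last put are applied.
func processBatch(batch []update, applyDelete, applyPut func(update)) {
	sort.Slice(batch, func(i, j int) bool {
		if batch[i].CSP != batch[j].CSP {
			return batch[i].CSP < batch[j].CSP
		}
		return batch[i].Timestamp < batch[j].Timestamp
	})
	for i := 0; i < len(batch); {
		// Find the run of updates sharing this CSP.
		j := i
		for j < len(batch) && batch[j].CSP == batch[i].CSP {
			j++
		}
		run := batch[i:j]
		for _, u := range run { // first delete in the run
			if u.Object == "" {
				applyDelete(u)
				break
			}
		}
		for k := len(run) - 1; k >= 0; k-- { // last put in the run
			if run[k].Object != "" {
				applyPut(run[k])
				break
			}
		}
		i = j
	}
}

func main() {
	batch := []update{
		{CSP: "c|s|p", Object: "", Timestamp: 2},
		{CSP: "c|s|p", Object: "old", Timestamp: 1},
		{CSP: "c|s|p", Object: "new", Timestamp: 3},
	}
	processBatch(batch,
		func(u update) { fmt.Println("delete", u.CSP, "at", u.Timestamp) },
		func(u update) { fmt.Println("put", u.CSP, "=", u.Object) },
	)
}
```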

The write model needs to interact with a read model that is nearly as fast as the write model when it checks whether tuples have already been received. (Recall that this happens in the count-min sketch implementation, to check for false positives.) In nias3-engine, that read model is the hexastore; if InfluxDB is as fast, it could be used instead.

The nias3-engine was constrained in performance because its Publish call is quite slow (it synchronously waits for an Ack), and its PublishAsync call does not maintain the ordering of tuples on the stream. This bottleneck may have been removed with the switch in n3-transport from STAN (NATS Streaming) to NATS, but that needs to be confirmed.
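
For reference, this is the shape of the two STAN publish calls the bottleneck refers to, using the stan.go client; the cluster ID, client ID, and subject are placeholder values, and the sketch is illustrative rather than the actual nias3-engine publishing code.

```go
package main

import (
	"log"

	stan "github.com/nats-io/stan.go"
)

func main() {
	// Placeholder cluster/client IDs and subject; not the nias3-engine values.
	sc, err := stan.Connect("test-cluster", "n3-demo-client")
	if err != nil {
		log.Fatal(err)
	}
	defer sc.Close()

	// Publish blocks until the server acknowledges the message,
	// which keeps the publisher in lock-step but makes each publish
	// round-trip bound.
	if err := sc.Publish("tuples", []byte("tuple-1")); err != nil {
		log.Fatal(err)
	}

	// PublishAsync returns immediately and delivers the ack to a callback,
	// so throughput is higher but the publisher no longer waits per message.
	_, err = sc.PublishAsync("tuples", []byte("tuple-2"), func(guid string, err error) {
		if err != nil {
			log.Printf("ack failed for %s: %v", guid, err)
		}
	})
	if err != nil {
		log.Fatal(err)
	}
}
```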