REST - wtsi-hgi/hgi-web GitHub Wiki
- Hypermedia
- Modelling the Graph
- hypr: A Hypermedia Type
- Procedural-Graph Interface
It is important to thoroughly establish the underlying model which needs to be represented.
Broadly, we have an information system on a connected graph. Each vertex holds arbitrary state, which can be manipulated in particular ways, per the semantics of HTTP[1,2], which is used as the communications medium.
(Note that there is no technical necessity for the graph to be connected. However, of course, any unconnected subgraphs would be, while technically routable, undiscoverable and thus redundant.)
The spectrum of operations one would want to perform on an information system are well represented by HTTP methods, the semantics (potency) of which are well defined. Of the eight methods delineated in its specification, four are particularly relevant:
Method | Nullipotent | Idempotent | Description |
---|---|---|---|
GET (and HEAD ) |
✓ | n/a | Retrieve the resource (or just the headers thereof) |
PUT |
✗ | ✓ | Create or update the resource |
POST |
✗ | ✗ | Create a subordinate resource |
DELETE |
✗ | ✓ | Delete the resource |
n.b., POST
is defined per [1], rather than with its updated,
overloaded form from [2].
In addition to these, the PATCH
[3], LINK
and UNLINK
[4] extensions
may also prove useful. PATCH
particularly would facilitate updates
by diff, reducing network usage against resources with larger states. It
is unclear how useful LINK
and UNLINK
would be in the context of a
client outside of schema definition (the realm of privileged clients
only).
Moreover, the semantics of HTTP response status codes are, again, both well defined and sufficiently variegated to provide useful feedback on the outcome of any transaction.
An OPTIONS
request on a resource is, if supported, meant to return a
response that includes an Allow
header, listing the supported HTTP
methods of that resource. The HTTP specification also affords the
possibility of a non-trivial entity body.
It is sometimes discussed that this body could be used to provide
additional information that is relevant to the resource's options (e.g.,
resource schema, such that POST
requests can be correctly formed).
Indeed, this appears to be the perfect channel to represent such
information.
However, there is a problem.
The HTTP specification expressly forbids caching of OPTIONS
responses.
This is a deal breaker as it implies that, using the method in the way
described here, would incur at least two HTTP requests for every
resource. Pipelining may mitigate the effect of that, providing it's
supported, but shouldn't be relied upon. Moreover, there is real
potential for the OPTIONS
body to be simply stripped out by any
interposing proxy servers.
While a frustrating state of affairs, the lack of caching is justified:
-
The values of the
Allow
header on a resource could easily change without warning. As these data are outside the entity body, cache validators can't apply and so a client cannot determine when a local copy is dirty (i.e., caching can't work). -
Including a body in the
OPTIONS
response muddies the water with respect to the representation of a resource: It is seen as a "side channel", against a resource's true representation, which should not be supported.
Indeed, the implication from the HTTP working group is that OPTIONS
only persists for backwards compatibility. The Allow
header still has
currency, but can be included from any type of request.
HTTP is a stateless protocol, thus requests on resources are transactional in nature. An important assurance that transactional information systems can guarantee is ACID: Atomicity, Consistency, Isolation and Durability.
-
Atomicity ensures only successfully completed transactions are committed. Unlike some traditional DBMSes, HTTP request-response cycles cannot be grouped into a larger commit at the protocol level and so are already atomic in that sense. It would be the server's job to ensure true atomicity with regards to its persistence layer.
-
Consistency ensures state transitions are valid. This is not expressible at the protocol level, but scope exists for such assertions to be available on the server to process and enforce.
-
Isolation ensures concurrent transactions are equivalent if they were serialised. Presuming the server only has one NIC and runs as one process, HTTP requests will necessarily be serialised. However (and otherwise), additional safety can be guaranteed by enforcing a locking mechanism to force serialisation by validating state (e.g., comparing ETags with the
If-Match
request header). -
Durability ensures committed transactions are resilient to failure. Again, there is no protocol level support but rather something that should be assured by the server. Typically, this will be an HTTP wrapper over some other information store and will simply inherit durability from that.
Clearly providing an ACID guarantee is within the remit of the model, albeit delegated largely to the server rather than at a protocol level. That said, HTTP's semantics do provide significant assistance in achieving this aim.
Note that HTTP's caching mechanism, while potentially improving performance by reducing network usage, should be used carefully to avoid any departure from data consistency. Temporal cache validators may not provide the necessary resolution on high-throughput resources; on the other hand, an ETag cache validator is more complex to implement and intensive to run.
Each vertex can be independently referenced via its IRI[5] over HTTP.
While IRIs don't necessarily impose a structure (e.g., graph:f00ba4
is
a valid IRI), when used over HTTP there is the implication of a
hierarchical tree. This impedance mismatch with a connected graph needs
to be resolved.
A tree structure suggests a relationship between a node and its children, which is exemplified in current usage. (As stated, the original HTTP specification imposes this subordinate relationship.) In the context of a web API, this provides the useful cognitive model of items and collections.
Collections should be homogeneous, in that all entities are of the same class (respective to their parent). When a resource has multiple collections, potentially of different types, then without loss of generality this can be represented as a collection of collections. For example:
Buildings Collection of collections of collections
/|\
/ ...
Apartment Collection of collections
/ \
Rooms People Collections
/|\ /|\
. . . . . .
This example also serves to illustrate the structural problem with
mapping a graph on to a tree: It would be reasonable for an information
system to normalise the People
node as a sibling of Buildings
, but
then each building's People
collection would duplicate data. Thus, a
"dummy vertex" would be needed that provides a mechanism to link back to
a collection elsewhere in the tree (e.g., either encoded in the
representation, or using HTTP redirection; cf., POSIX symlinks and hard
links, respectively):
Buildings People
/|\ /|\
/ ... . o .
Apartment :
/ \ :
Rooms People :
/|\ /|\ :
. . . . x . :
: :
" " " " "
This machinery restores the graphical nature of the structure without adding unnecessary complication.
Unlike vertices, there is no facility to store edge data independently. Thus, they must be encapsulated within each vertex's representation. Without loss of generality, the edges can be thought of as directed; if there's a bidirectional relationship between any two vertices, the edge can simply be expressed in each of them. Cycles are certainly permitted in the graph; including loops (links to self), which are useful when resources are taken outside the context of HTTP (e.g., stored on disk) and as a means for symbolic linking. What's important to infer is that the relationship an edge represents between two vertices must be encoded with its means of reference.
Each vertex can contain arbitrary state data. Whether this is static or dynamically generated is irrelevant. If a vertex is the parent of a collection, then that collection -- in terms of its metadata -- is considered to be part of said vertex's state. A vertex's edges are not considered to be part of its state, however it may be true that edges are derived from state (and vice versa). As such, stateless vertices are effectively redundant, so needn't necessarily be representable. (If a use case for a truly stateless vertex arises, it can always be represented with vacuous state.)
The arbitrariness of the state data affords it no transparency when it comes to semantics. As such, all data meant for consumption and manipulation should be typed in some meaningful way -- which implies a key-value store -- without allowing the range of which to spiral out of control:
-
Primitive Types (Scalar)
- Bottom
- Textual
- Numeric
- Temporal
- Logical (i.e., Boolean)
- Enumeration (n.b., Boolean is a special case)
- Raw (i.e., binary)
-
Homogeneous Collection Types
- Simplex (i.e., collection of scalars)
- Complex (i.e., collection of state, see below)
-
Heterogeneous Collection Types
- State (i.e., collection of any of the above)
Additional metadata can augment type definitions (e.g., such as mutability, default values, optionality, etc.) to provide a rich type checking system and maintain the integrity of each vertex's state. Indeed, this is the beginnings of a contractual interface.
The idea behind contract programming is that an interface comes with a set of assertions that both parties (in this instance: the client and server) must abide by during any transaction. This is enforced by several mechanisms ("contracts"):
-
Preconditions: Logical assertions applied against any input (either atomically, or holistically). This can be to a finer guarantee than that provided by simple type checking. For example, if providing postal address details, one could assert a pattern match against the postal code relevant to that country.
-
Postconditions: Logical assertions applied against the output. For example, if computing the arithmetic mean, ensure it is between (or equal to) the minimum and maximum input values.
-
Invariants: Logical assertions applied against state. For example, if adding an element into a queue, ensure it doesn't exceed some maximum length by either refusing to queue the new item, or dequeuing the head. (Note that this differs from pre/postconditions insofar as it protects against indirect manipulations.)
HTTP methods can be seen as functions against resources that map requests to responses:
Method[Resource] : Request -> Response
As the client has no control over the server's runtime, it is effectively subordinate in the relationship. As such, the postconditions (and, to some extent, invariants) aren't enforceable, but may still provide useful information. While it is important that the server always checks the preconditions, it would also be possible for this to be done on the client before a request is even submitted.
In the context of HTTP and the RESTful architecture, the Fielding constraint of "Code on Demand"[6] could be used to enhance simple type checking into a fully fledged programmatic preconditions contract. (As the invariants contract involves indirect state changes, which would be actioned by the server, there is no way that this can be enforced by the client. However, it can be hinted at via mutability.)
Security Advisory Clients should not execute code from untrusted sources. All trusted code ought to be pure (i.e., no side effects) or at least run in a sandboxed environment.
A resource's representation is encapsulated in the entity body of the
response to a GET
request. From the above discussion, this
representation will encode the resource's state and the state of (or at
least a reference to) any subordinate collection.
When it comes to state, a resource may comprise many data fields. Only few of these might be relevant to any particular enquiry, so it stands to reason that a means of limiting the response would be desirable. Likewise, for a subordinate collection -- which may be vast -- a mechanism for filtering these appropriately is necessary.
IRI's provide a query component, appended to the logical address, that
could be standardised to such ends. Specifically, query components are
conventionally specified as key-value pairs; presuming the resource
state is ultimately a key-value collection, query keys of select
(to
select state keys; cf., SQL's SELECT
field list) and q
(to query
state values; cf., SQL's WHERE
clause) can be defined, where
appropriate.
The state key query should be a comma delimited list of required keys. If no key query is specified, then all keys should be returned. If specified keys don't exist, they are simply ignored. For example:
http://path/to/resource?select=id,name,mail
When a resource has a subordinate collection, then the key query should also descend into that collection (presuming the subordinate collection is fully represented).
For resources without a subordinate collection, a value query should be ignored. Otherwise, the query should be specified according to the following nominal ABNF[7], inspired by LDAP search strings[8]:
expression = "(" ( clause / predicate ) ")"
clause = conjunction / disjunction / negation
conjunction = "and" expression expression
disjunction = "or" expression expression
negation = "not" expression
predicate = key comparator value
comparator = [ "<" / ">" / "~" ] "="
; lte to, gte to, similar to and equal to
The key
rule should be matched with the collection items' respective
state key; the exact semantics of value
-- type coercion, case
sensitivity, wildcards, escaping, etc. -- and the similarity comparator
are server dependant, but should be sensible. Value queries should only
apply to the immediate collection, rather than being recursive.
For example:
http://path/to/collection?q=(and(name~=foo)(not(id<=1)))
Furthermore, for collections (full or filtered), it is often useful to
paginate results rather than returning the full set. Again, it would be
appropriate to use a query component for this. Presuming the collection
ordering is preserved in the representation, this can be achieved using
array slicing and thus uses the query key of slice
with the following
grammar:
slice = [ start-index ] ":" [ end-index ]
start-index = <Collection index>
; >= 0, inclusive
end-index = <Collection index>
: >= start-index, exclusive
Note that collections are considered to be zero-indexed and omitted
terminal indices in the slice denote the respective extremities of the
collection. If the slice
query is omitted, then all results should be
returned (equivalent to slice=:
). If slices go out of the bounds of
the collection, then the intersection (if any) should be returned.
For example:
http://some/collection?slice=1:5
Clearly the selection, querying and slice queries are orthogonal; that is, they can be applied simultaneously -- where it makes sense -- where the querying query applies to the resource state holistically (regardless of it being potentially masked by the selection query):
.../people?select=name,mail&q=(or(dept=hr)(dept=finance))&slice=:10
If the query component contains multiple selection, querying or slice queries, then the last (furthest to the right) should be used.
Note that the IRI query component can only contain a certain range of
characters, which is further restricted by the conventions of
subcomponents (hence and
, rather than LDAP's &
). Characters outside
of this range must be percent encoded.
As a linguistic (or customisation) consideration, the tuple of query
component keys (select, q, slice)
may be changed to be sensitive to
the environment. For example, the French equivalent may [n.b., the
author doesn't speak French] be (selectionne, interroge, tranche)
.
This will be parameterisable; where the means of doing so will be
discussed in the following section.
Next > hypr: A Hypermedia Type
-
Fielding, R. et al (1999) Hypertext Transfer Protocol -- HTTP/1.1; IETF RFC2616
-
Fielding, R. et al (eds.) (2014) Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content; IETF RFC7231
-
Dusseault, L. et al (2010) PATCH Method for HTTP; IETF RFC5789
-
Snell, J. (2014) HTTP Link and Unlink Methods; IETF Internet-Draft
-
Duerst, M. et al (2005) Internationalized Resource Identifiers (IRIs); IETF RFC3987
-
Fielding, R. T. (2000) Architectural Styles and the Design of Network-Based Software Architectures; University of California, Irvine; PhD Thesis
-
Crocker, D. (ed.) et al (2008) Augmented BNF for Syntax Specifications: ABNF; IETF RFC5234
-
Smith, M. (ed.) et al (2006) Lightweight Directory Access Protocol (LDAP): String Representation of Search Filters; IETF RFC4515