Delta Container Specification - aecdeltas/aec-deltas-spec GitHub Wiki
In our specification, the smallest unit of change is an AEC object.
Such an object is simply a collection of key-value pairs describing an AEC-specific entity such as a steel beam, window, slab etc., with its respective properties such as `id`, `length`, `fire_rating`, etc. However, we do not subscribe to any one specific data representation or encoding. The definition of an object can therefore be based on an IFC Object, a BHoM Object, a 3D Repo Object, a Speckle Object or any other representation as required. Ultimately, an object is simply a binary blob that some, but not necessarily all, applications are able to decode, interpret and convert into their native representation of the respective properties.
Therefore, there is a need to maintain a set of converters that provide the necessary linkage between various representations of objects and their properties within the native applications. This will be achieved via a look-up table which defines the one-to-one mapping between the properties. For simplicity, we assume that each property only maps to one counterpart property in the receiving application and that not all properties necessarily need to be mapped across. After all, it is often the case that the receiving application is only interested in a subset of the object properties to achieve its specific function. Note that the BHoM library by BuroHappold Engineering already contains such converters known as toolkits, see https://github.com/BHoM for details.
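As an illustration of such a look-up table, the following minimal Python sketch maps properties one-to-one from a source representation onto the names used by a receiving application. The table `PROPERTY_MAP`, the helper `convert_properties` and all property names are hypothetical; they are not taken from BHoM or any other toolkit.

```python
# Hypothetical one-to-one look-up table:
# source property name -> property name in the receiving application.
PROPERTY_MAP = {
    "id": "ElementId",
    "length": "Length",
    "fire_rating": "FireRating",
}


def convert_properties(source_object: dict, property_map: dict) -> dict:
    """Translate the mapped properties and skip the ones without a counterpart."""
    return {
        property_map[name]: value
        for name, value in source_object.items()
        if name in property_map
    }


beam = {"id": "beam-01", "length": 6.0, "fire_rating": "R60", "colour": "grey"}
converted = convert_properties(beam, PROPERTY_MAP)
print(converted)
```

Properties without an entry in the table (here `colour`) are simply not carried across, which mirrors the assumption that a receiving application may only need a subset of the properties.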
- A single object is the smallest unit of change. Even if only the smallest property of an object has changed, the whole object needs to be flagged accordingly. For instance, a change to a single vertex of a mesh would require the whole mesh object to be updated.
- All objects are immutable. Any change on an object means the object has been modified, and thus a new object with a new ID/hash needs to be created to replace it and advance the revision further.
- Each object has a unique and a shared ID. Each object by definition requires a unique ID (UID) in order to uniquely identify it. In addition, it also requires a shared ID (SID) which is shared across various instances/revisions of the same object over time.
- Not all objects have to have a visual representation. Some objects simply have no visual representation, such as a renderable mesh, attached. For instance, a steel beam can be represented by its `load capacity` and `length` in order to perform structural analysis calculations, even though it would not be directly renderable as is.
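The rules above can be sketched in Python as follows. This is only one possible reading: the class `AecObject`, the helpers `make_object` and `modify`, and the choice of deriving the UID from a SHA-256 hash of the SID and the serialised properties are assumptions made for illustration, not part of the specification.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class AecObject:
    uid: str      # unique to this particular revision of the object
    sid: str      # shared by all revisions of the same object
    payload: str  # canonical JSON encoding of the key-value properties


def make_object(properties: dict, sid: Optional[str] = None) -> AecObject:
    """Create an object; a fresh SID is generated when none is supplied."""
    sid = sid or str(uuid.uuid4())
    payload = json.dumps(properties, sort_keys=True)
    # Assumption: the UID is a hash over the SID and the properties.
    uid = hashlib.sha256(f"{sid}:{payload}".encode("utf-8")).hexdigest()
    return AecObject(uid=uid, sid=sid, payload=payload)


def modify(obj: AecObject, **changes) -> AecObject:
    """Return a replacement object: same SID, new UID; the original is untouched."""
    properties = {**json.loads(obj.payload), **changes}
    return make_object(properties, sid=obj.sid)


# A beam without any visual representation, only analysis properties.
beam = make_object({"length": 6.0, "load_capacity": 120.0})
longer = modify(beam, length=7.5)
print(beam.sid == longer.sid, beam.uid == longer.uid)
```

Changing a single property yields a new object with a new UID, while the SID links the two revisions of the same beam.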
A Delta (∆) is a set of objects that have been identified as either `created`, `deleted` or `updated` between two applications. The trivial example is an empty set `{}` whereby no change has been detected. Differencing, or "diffing" for short, is then the process of establishing the delta set of objects between two revisions or states of host applications holding the AEC information in memory.
In mathematical notation, we would write:
∆ = { C, D, U },
where `C`, `D` and `U` are sets of created/added, deleted and updated/modified AEC objects respectively. The naming convention was selected to closely follow the CRUD (create, read, update and delete) principles of persistent storage manipulation.
However, it is important to note that differencing itself can be implemented at two levels:
- Level 1: Collection/object-level differencing. The inputs are two collections and the result is a list of changed objects. This happens at the AEC Deltas middleware layer and is integral to our definition;
- Level 2: Property-level differencing. Differencing of individual properties, line by line, only happens at the application layer and is left open to implementation as required by the receiving application. It is therefore not part of the AEC Deltas middleware but can be delivered on top.
Here, we only concentrate on the object-level differencing (Level 1). Objects are being freely exchanged but it is the application's responsibility to determine what happens with this information on the client side. For instance, the receiving application might decide to consume only some of the updates, ignore all of them or pass them on unmodified depending on the user requirements and input. Note that the source application is the one responsible for notifying what the deltas/changes are between itself and the receiving application.
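A minimal Python sketch of Level 1 differencing is shown below. It assumes that each revision is available as a mapping from SID to UID and that an object counts as updated when its SID persists but its UID differs; the specification does not prescribe this algorithm, and the function name `diff_revisions` is hypothetical.

```python
def diff_revisions(old: dict, new: dict) -> dict:
    """Compute the delta sets between two revisions.

    Both arguments map the SID of an object to the UID it has in that revision.
    The result holds the SIDs of created (C), deleted (D) and updated (U) objects.
    """
    created = {sid for sid in new if sid not in old}
    deleted = {sid for sid in old if sid not in new}
    updated = {sid for sid in new if sid in old and new[sid] != old[sid]}
    return {"C": created, "D": deleted, "U": updated}


revision_a = {"beam": "uid-1", "slab": "uid-2", "window": "uid-3"}
revision_b = {"beam": "uid-1", "slab": "uid-2b", "door": "uid-4"}

delta = diff_revisions(revision_a, revision_b)
print(delta)
```

Comparing a revision with itself yields three empty sets, which corresponds to the trivial case in which no change has been detected.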
Delta definition is currently a work in progress and might be updated in the future.
Each delta set should have the following base fields in order to properly identify the authorship and ownership of the information that is carried within.
- Stream ID - `uuid` unique ID in RFC 4122 v4 format (unique to each project)
- Author - `string`
- Timestamp - `time_t` timestamp in seconds from the Unix epoch
- Diff - `string` or `binary`
- Cryptographic signature - the delta itself signed with the private key of the author, e.g. using the RSA cryptosystem
In JavaScript notation, the encoding of a delta would look like this:
delta = {
  "stream_id": <uuid>, // Id of the Stream owning both the Revision that this Delta targets and the Revision that it will produce.
  "diff": <binary>, // Represents the differences between two Revisions.
  "revision_from": from_id, // Revision Id that this Delta targets.
  "revision_to": to_id, // Another Revision Id that this Delta targets, or the Revision Id that this Delta produces.
  // In an upload from client to server, this can be null (a Revision will be produced; its Revision Id is server defined).
  // Security and authentication
  "timestamp": <time_t>, // Time of the Delta creation (not of the upload start).
  "signature": <base64 string>, // Base64-encoded cryptographic signature of the author.
  // Additional info
  "sender": <string>, // Any descriptive string with the client used (and/or author name): BHoM, 3D Repo, Speckle, etc.
  "comment": <string> // Can be null. Description or comment.
}
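As a rough Python sketch of how a client might assemble such a container before upload, consider the following. The helper names `build_delta` and `missing_fields` and the tuple `EXPECTED_FIELDS` are hypothetical, and the signature is passed in as an opaque string because no actual signing is performed here.

```python
import time
import uuid

# Assumption: the fields a receiver would expect, following the listing above.
EXPECTED_FIELDS = (
    "stream_id", "diff", "revision_from", "revision_to",
    "timestamp", "signature", "sender",
)


def build_delta(stream_id, diff, revision_from, sender, signature,
                revision_to=None, comment=None):
    """Assemble a delta container as a plain dictionary."""
    return {
        "stream_id": stream_id,
        "diff": diff,
        "revision_from": revision_from,
        "revision_to": revision_to,     # None on upload; the server defines the id
        "timestamp": int(time.time()),  # seconds since the Unix epoch
        "signature": signature,         # produced elsewhere; opaque to this sketch
        "sender": sender,
        "comment": comment,             # optional, may stay None
    }


def missing_fields(delta: dict) -> list:
    """List the expected fields that are absent from a delta."""
    return [name for name in EXPECTED_FIELDS if name not in delta]


delta = build_delta(
    stream_id=str(uuid.uuid4()),  # RFC 4122 version 4 UUID
    diff={"toBeCreated": [], "toBeDeleted": [], "toBeUpdated": []},
    revision_from="rev-1",        # hypothetical revision id
    sender="BHoM",
    signature="c2lnbmF0dXJl",     # placeholder value, not a real signature
)
print(missing_fields(delta))
```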
Each delta stream carries a specific binary diff that constitutes the diffing change.
Thus, a single diff shall have the following components:
diff = {
  "toBeCreated": [
    [ uid_1, sid_1, mesh_1, material/colour_1, metadata_1, authoring_tool_1, ... ],
    [ uid_2, sid_2, mesh_2, material/colour_2, metadata_2, authoring_tool_2, ... ],
    [ uid_3, sid_3, mesh_3, material/colour_3, metadata_3, authoring_tool_3, ... ],
    ...
  ],
  "toBeDeleted": [ uid_4, uid_5, uid_6, ... ],
  "toBeUpdated": [
    [ uid_7, sid_7, mesh_7, material/colour_7, metadata_7, authoring_tool_7, ... ],
    [ uid_8, sid_8, mesh_8, material/colour_8, metadata_8, authoring_tool_8, ... ],
    [ uid_9, sid_9, mesh_9, material/colour_9, metadata_9, authoring_tool_9, ... ],
    ...
  ]
}
The three fields of the diffing dictionary may follow naming conventions other than the proposed one (e.g. new, old, modified, or others). The above representation expresses that the Delta container is a piece of information that originates from a source and is transmitted to a destination, and that it is the receiving end's responsibility to enact the changes described by the Delta; hence the "to be" form of the three field names.
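To illustrate what enacting a diff could look like on the receiving end, here is a minimal Python sketch. It assumes the receiver keeps its objects in a dictionary keyed by UID, that each row starts with `[uid, sid, ...]` as in the listing above, and that an updated row replaces the stored object sharing its SID. The function `apply_diff` is hypothetical, and a real receiver may well choose to consume only part of a diff.

```python
def apply_diff(store: dict, diff: dict) -> dict:
    """Return a new store (UID -> row) with the diff applied; the input is not modified."""
    result = dict(store)

    # Deletions are given as bare UIDs.
    for uid in diff.get("toBeDeleted", []):
        result.pop(uid, None)

    # An updated row carries a new UID but keeps the SID of the object it replaces.
    for row in diff.get("toBeUpdated", []):
        uid, sid = row[0], row[1]
        for old_uid in [u for u, r in result.items() if r[1] == sid]:
            del result[old_uid]
        result[uid] = row

    # Created rows are simply added.
    for row in diff.get("toBeCreated", []):
        result[row[0]] = row

    return result


store = {
    "u1": ["u1", "s1", "mesh_a"],
    "u2": ["u2", "s2", "mesh_b"],
    "u3": ["u3", "s3", "mesh_c"],
}
diff = {
    "toBeCreated": [["u4", "s4", "mesh_d"]],
    "toBeDeleted": ["u3"],
    "toBeUpdated": [["u2b", "s2", "mesh_b2"]],
}
updated_store = apply_diff(store, diff)
print(sorted(updated_store))
```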
The design of the AEC Deltas framework does not prescribe any specific diffing encoding, as the aim is to deliver the most flexible system possible.
Consequently, it cannot be guaranteed that the receiving application will be able to interpret all of the supplied data. For instance, there is no guarantee that there will always be a one-to-one mapping between the properties of a steel beam in Revit and those in Tekla.
For this reason, an abstraction of the underlying specification is currently achieved via the BHoM Object Model, which provides low-level object definitions for the object types most commonly exchanged in the AEC context. These currently cover mostly structural analysis data, with more being added; see https://github.com/BHoM/
Transfer of data always happens through a Delta payload.
However, the `diff` field of the Delta payload might:
- Have all three properties present (`toBeCreated`, `toBeUpdated`, `toBeDeleted`) - we call this a diff-based Delta payload.
- Have only the first property present (`toBeCreated`) - we call this a revision-based Delta payload.
The revision-based Delta payload consists of a full revision at a certain point in time; it could contain an entire model.
Thus, an example of a revision-based Delta payload is:
delta = {
  "stream_id": <uuid>,
  "diff": {
    "toBeCreated": [
      [ uid_1, sid_1, mesh_1, material/colour_1, metadata_1, authoring_tool_1, ... ],
      [ uid_2, sid_2, mesh_2, material/colour_2, metadata_2, authoring_tool_2, ... ]
    ]
  },
  "revision_from": from_id,
  "revision_to": null, // in upload from client, this is null
  "timestamp": <time_t>,
  "signature": <base64 string>,
  "sender": <string> // E.g. BHoM, 3D Repo, Speckle, etc.
}
The distinction between Revision-based and Diff-based Delta payloads is conceptually important because it determines the Server Endpoint behaviour and general workflow, as shown in Payloads types: client side VS server side diffing.
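A server endpoint could tell the two payload types apart along the lines of the following Python sketch. The function `payload_kind` is hypothetical, and treating any other combination of properties as an error is an assumption rather than something the specification states.

```python
DIFF_KEYS = {"toBeCreated", "toBeUpdated", "toBeDeleted"}


def payload_kind(delta: dict) -> str:
    """Classify a Delta payload by the properties present in its diff field."""
    present = DIFF_KEYS & set(delta["diff"])
    if present == DIFF_KEYS:
        return "diff-based"
    if present == {"toBeCreated"}:
        return "revision-based"
    # Assumption: anything else is rejected rather than guessed at.
    raise ValueError(f"unexpected diff properties: {sorted(present)}")


full_revision = {"diff": {"toBeCreated": [["uid_1", "sid_1"]]}}
incremental = {"diff": {"toBeCreated": [], "toBeUpdated": [], "toBeDeleted": ["uid_4"]}}

print(payload_kind(full_revision))
print(payload_kind(incremental))
```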