Metaspace Design Spec - evomimic/map-proto1 GitHub Wiki

Introduction

The intent of this document is to specify the design of the Metaspace component of the MAP architecture to sufficient detail to drive implementation.

A Metaspace consists of:

  • a set of branches for managing evolution of TypeDescriptors
  • an ontology for defining TypeDescriptors (i.e., a meta-ontology)
  • a set of agents, some of whom have a stewardship role with respect to the Metaspace
  • a set of services for managing and querying the TypeDescriptors in the Metaspace
  • a set of governance rules that define the protocols, values and norms for managing the Metaspace. Some of these rules may be parameterized by metaspace level settings.

MAP Descriptors Data Model

[Figure: MAP Descriptors data model diagram]

NOTES:

  • Holochain EntryTypes within the map-descriptors package are depicted in orange boxes.
  • Holochain LinkTypes are shown in blue text.
  • Boxes grayed out have not yet been implemented.

Metaspace Governance

Metaspaces provide the descriptions for the self-describing active holons that comprise the foundation of all MAP objects. It is useful to think of a metaspace as circumscribing an ecosystem of shared services and visualizers. TypeDescriptors contain the information DAHN Visualizers require in order to present and allow interaction with MAP Holons, and they provide a common vocabulary for defining the Information Access Agreements that form a part of service offers and agreements. Uncontrolled change to existing TypeDescriptor definitions could break existing visualizers and/or Information Access Agreements. Uncontrolled proliferation of TypeDescriptors could result in a glut of nearly identical TypeDescriptors that works against reuse of Visualizers or Agreement templates. For example, imagine if 28 different Book Holon Types were defined, or if there were 17 different bookTitle properties provided for a Book Holon Type. In effect, proliferation of nearly identical TypeDescriptors tends to fragment the MAP community into different silos.

Thus, the set of TypeDescriptors can be thought of as a shared resource that benefits from stewardship. For this reason, the MAP treats metaspaces as an epistemic commons. A commons is:

"A social system for the long-term stewardship of resources that preserves shared values and community identity." (excerpt from Think Like a Commoner by David Bollier)

Metaspace governance embodies the protocols, values and norms of the social system via a combination of careful design and both automated and manual governance actions. Specifically, it is intended to support the following goals:

  1. Allow concurrent updates to the metaspace by multiple metaspace stewards.
  2. Support the controlled evolution of the metaspace in a way that both fosters innovation and protects components that depend on TypeDescriptors (e.g., Information Access Agreements and DAHN Visualizers) against the disruptive effects of change.
  3. Ensure the health of the metaspace (e.g., avoid polluting the metaspace by proliferating nearly identical TypeDescriptors)

To address these goals, the design segregates updates from releases. Only released TypeDescriptor versions can be used by metaspace clients. Semantic versioning of TypeDescriptors allows breaking changes to be distinguished from warning changes and non-breaking changes. This helps clients of a TypeDescriptor decide when and how to adopt a newly released version of a TypeDescriptor. Examples of breaking changes include removing a TypeDescriptor, a property from a holon type, or a relationship, or strengthening a constraint (e.g., increasing the minimum length or decreasing the maximum length of a property). Weakening a constraint (e.g., reducing the minimum length or increasing the maximum length) is not considered a breaking change because any existing data that satisfies the prior constraint will also satisfy the weakened constraint. However, some code may have been written that assumes the prior constraint. For example, a visualizer may lay out a property visualizer under the assumption that its max length is 100 bytes. If the max length constraint is increased to 500 bytes, the layout may no longer work. For this reason, weakening a constraint is considered a "warning change". Existing data will continue to be valid, but some code may exhibit unexpected behavior.

Updates to metaspaces (e.g., adding new TypeDescriptors or adding new versions of existing TypeDescriptors) are likely to be relatively rare operations performed by a relatively small number of stewards working collaboratively. The design allows these updates to happen in parallel. This leads to the potential for two (or more) stewards to make updates to the same TypeDescriptors. There are different strategies for dealing with this possibility:

  1. Pessimistic Concurrency Control (e.g., via 2-Phase Commit or Paxos)
  2. Optimistic or Multi-Version Concurrency Control with rollbacks when a conflicting update is discovered
  3. Conflict-free Replicated Data Types (CRDTs) with automatic resolution of potential conflicts, e.g., via Last-Write-Wins (LWW)

The approach proposed here is a variant of CRDT/LWW. Potentially conflicting updates are automatically resolved via LWW, but non-accepted updates are retained so that stewards can manually merge them later. Each update a steward makes is treated as a proposal, and all proposals are provisionally accepted. Rather than treating the algorithmically picked winner as final, the stewards decide how to resolve potentially conflicting proposals.
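A minimal sketch of the "LWW with retained proposals" idea described above. All names (`Proposal`, `Resolution`, `resolve_lww`) and the use of a numeric timestamp for ordering are assumptions made for illustration; they are not taken from the MAP codebase.

```rust
/// Hypothetical proposal record; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Proposal {
    pub steward: String,
    /// Assumed ordering key for LWW (logical or wall-clock time).
    pub timestamp: u64,
    /// Stand-in for the proposed TypeDescriptor content.
    pub payload: String,
}

/// Outcome of provisionally resolving concurrent proposals.
#[derive(Debug, PartialEq)]
pub struct Resolution {
    /// The latest write, provisionally accepted.
    pub provisional_winner: Proposal,
    /// Every other proposal, kept so stewards can merge them later.
    pub retained: Vec<Proposal>,
}

/// Picks the latest proposal as the provisional winner and retains the rest.
/// Returns `None` when there are no proposals.
pub fn resolve_lww(mut proposals: Vec<Proposal>) -> Option<Resolution> {
    // Sort ascending by timestamp; ties are broken by steward name so the
    // result is deterministic.
    proposals.sort_by(|a, b| {
        a.timestamp
            .cmp(&b.timestamp)
            .then_with(|| a.steward.cmp(&b.steward))
    });
    let provisional_winner = proposals.pop()?;
    Some(Resolution {
        provisional_winner,
        retained: proposals,
    })
}
```

Nothing is discarded here: the non-winning proposals stay available in `retained` for later manual resolution by stewards.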

On the other hand, usage of TypeDescriptors (e.g., by interface adaptors, Information Access Agreements and/or DAHN Visualizers) is expected to be widespread. Adopting a new version of a TypeDescriptor can be expensive, especially if the new version includes a breaking change (BR). Therefore, TypeDescriptors are assumed to go through a lifecycle process that includes reviewing proposed updates before releasing them.

Private and Shared Metaspaces

MAP will eventually support private (to a person) and shared (across multiple people) metaspaces. This allows sets of changes to be tried out privately before committing them to the shared metaspace. This concept is similar to local and remote repositories in git. Each metaspace has its own DHT and associated agents. Agents have one of the following roles:

  • steward -- can read existing versions, create new versions, and resolve conflicts between versions of TypeDescriptors
  • reader -- can only read, but not create or update, metaspace information
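The two roles can be sketched as a small permission check. The `Role` and `Operation` enums and the `is_permitted` function are hypothetical names introduced here for illustration only.

```rust
/// Agent roles within a metaspace, as listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Steward,
    Reader,
}

/// Hypothetical set of operations an agent might attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operation {
    ReadVersion,
    CreateVersion,
    ResolveConflict,
}

/// Returns true when the role is allowed to perform the operation.
pub fn is_permitted(role: Role, op: Operation) -> bool {
    match (role, op) {
        // Stewards can read, create versions, and resolve conflicts.
        (Role::Steward, _) => true,
        // Readers can only read.
        (Role::Reader, Operation::ReadVersion) => true,
        (Role::Reader, _) => false,
    }
}
```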

CUD Operations

This section specifies the create, update and delete behaviors offered by the Metaspace. These behaviors are heavily influenced by the adoption and enforcement of the version-on-update principle. Under this principle, the external view of all TypeDescriptors is of an immutable object. TypeDescriptors are never updated in place. Instead, a new version of the TypeDescriptor is created. Following this principle in concert with the pull-rather-than-push principle simplifies the orderly evolution of TypeDescriptors without breaking existing code that depends upon a TypeDescriptor. Objects that depend upon a particular version of a TypeDescriptor continue to refer to the prior version and are free to decide the best time to upgrade to the new version (within certain constraints as described below).

Object Identity

Every TypeDescriptor is identified by an immutable type identifier (tid) that is assigned when the first version of the TypeDescriptor is created. In the Holochain implementation, the ActionHash of the first create action for a TypeDescriptor is used as its tid. Each version of a TypeDescriptor is identified by an immutable object identifier (oid). In the Holochain implementation, the ActionHash associated with a version is used as the oid for that version. Additionally, each version of a TypeDescriptor can be tagged with a unique semantic version number. This tag is assigned as part of the synchronization process (described below). Within the version history of a TypeDescriptor, the tid remains the same.
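The relationship between tid and oid can be sketched as follows. A plain `String` stands in for Holochain's ActionHash so that the example has no dependencies, and the names `VersionIds`, `first_version`, and `next_version` are hypothetical. That the first version's oid equals its tid is an inference from the text (both come from the first create action), not something the spec states explicitly.

```rust
/// Stand-in for an ActionHash; a real implementation would use the Holochain type.
pub type Hash = String;

/// Identifiers carried by one version of a TypeDescriptor.
#[derive(Debug, Clone, PartialEq)]
pub struct VersionIds {
    /// Type identifier: shared by every version in the history.
    pub tid: Hash,
    /// Object identifier: unique to this version.
    pub oid: Hash,
}

/// First version: the hash of the create action serves as the tid
/// and (by inference) as the oid of that first version.
pub fn first_version(create_action_hash: Hash) -> VersionIds {
    VersionIds {
        tid: create_action_hash.clone(),
        oid: create_action_hash,
    }
}

/// Later version: the tid is carried over unchanged, and the oid is the
/// hash of the action that recorded the new version.
pub fn next_version(prev: &VersionIds, action_hash: Hash) -> VersionIds {
    VersionIds {
        tid: prev.tid.clone(),
        oid: action_hash,
    }
}
```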

Builder Pattern

The Builder Pattern is leveraged to allow the components of a TypeDescriptor to be incrementally added and modified during the creation process using a separate TypeDescriptorBuilder type. The TypeDescriptorBuilder can be thought of as a staging object that allows incremental definition of the various aspects of a TypeDescriptor. Once the TypeDescriptorBuilder has been updated to reflect the desired state, it can be used to create a new version of a TypeDescriptor. This approach enables creation flexibility while also allowing the attributes of a TypeDescriptor to be specified as immutable to enforce the version-on-update principle.

The builder object also tracks the changes made to the object under construction and summarizes them into a breaking, warning, or non-breaking state variable (as described in the next section). At build time, this variable is used to determine the semantic version number to assign to the immutable built version.

Rules for Classifying Changes as Breaking, Warning, or NonBreaking

As updates are made to a TypeDescriptorBuilder, each change is marked as breaking, non-breaking or warning. The exact rules for doing so are specified below.

Breaking Changes

  • Deleting a TypeDescriptor (i.e., adding a TypeDescriptor to the DeletedTypeDescriptor list) (NOTE: Released TypeDescriptors must first be Deprecated before they may be Deleted)
  • Adding a property to identifying_properties
  • Removing a property from identifying_properties
  • Strengthening a Property constraint (e.g., decreasing max_length or max_items) because this may cause existing data to become invalid.

Warning Changes

  • Weakening a Property constraint (e.g., increasing max_length or max_items). Such changes shouldn't break existing data, but they may change assumptions on which existing code is based. For example, a Visualizer may make assumptions about the maximum length of a property value in the overall layout of a node. If that maximum length is increased, this doesn't change the length of any existing data. But it allows new values to be assigned to that property that violate the Visualizer's assumption about length.

Non-Breaking Changes

  • Adding a new TypeDescriptor
  • Adding a TypeDescriptor to the DeprecatedTypeDescriptor list.
  • Adding a new Property to a TypeDescriptor
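The rules listed above can be sketched as a single classification function. The `Change` enum is a hypothetical encoding of just the changes named in these lists, and treating an unchanged max_length as non-breaking is an assumption; the actual builder may represent changes differently.

```rust
/// Severity of a change, ordered from least to most disruptive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ChangeLevel {
    NonBreaking,
    Warning,
    Breaking,
}

/// Hypothetical encoding of the changes listed in this section.
#[derive(Debug, Clone, PartialEq)]
pub enum Change {
    AddTypeDescriptor,
    DeprecateTypeDescriptor,
    DeleteTypeDescriptor,
    AddProperty,
    AddIdentifyingProperty,
    RemoveIdentifyingProperty,
    SetMaxLength { old: u32, new: u32 },
}

/// Classifies a single change according to the rules above.
pub fn classify(change: &Change) -> ChangeLevel {
    match change {
        Change::DeleteTypeDescriptor
        | Change::AddIdentifyingProperty
        | Change::RemoveIdentifyingProperty => ChangeLevel::Breaking,
        // Strengthening: existing data may become invalid.
        Change::SetMaxLength { old, new } if new < old => ChangeLevel::Breaking,
        // Weakening: data stays valid, but code assumptions may not hold.
        Change::SetMaxLength { old, new } if new > old => ChangeLevel::Warning,
        // Unchanged limit: assumed to be non-breaking.
        Change::SetMaxLength { .. } => ChangeLevel::NonBreaking,
        Change::AddTypeDescriptor
        | Change::DeprecateTypeDescriptor
        | Change::AddProperty => ChangeLevel::NonBreaking,
    }
}
```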

Semantic Versioning

The Metaspace adapts the semantic versioning approach initially created to manage API evolution to support schema evolution (i.e., TypeDescriptor evolution). The format of a TypeDescriptor's semantic version is: MAJOR.WARNING.PATCH

Each TypeDescriptor is assigned a version that is initially set to 0.0.1.

As new versions of a TypeDescriptor are built:

  • A change to the MAJOR version indicates a breaking change was made since the previous version
  • A change to the WARNING version indicates a warning-level change was made since the previous version and no breaking changes were made
  • A change to the PATCH version indicates only non-breaking changes were made since the previous version
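A minimal sketch of deriving the next MAJOR.WARNING.PATCH version from a set of accumulated changes. The type and method names are hypothetical. The sketch increments only the component that matches the most severe change and leaves the other components untouched; whether lower-order components should instead reset to zero (as in conventional semantic versioning) is not stated in this spec, so that behavior is an assumption.

```rust
use std::fmt;

/// Severity of a change, ordered from least to most disruptive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ChangeLevel {
    NonBreaking,
    Warning,
    Breaking,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SemanticVersion {
    pub major: u32,
    pub warning: u32,
    pub patch: u32,
}

impl SemanticVersion {
    /// Initial version assigned to a new TypeDescriptor.
    pub const INITIAL: SemanticVersion = SemanticVersion { major: 0, warning: 0, patch: 1 };

    /// Summarizes the changes to their most severe level and bumps the
    /// matching component. With no changes, the version is unchanged.
    pub fn next(self, changes: &[ChangeLevel]) -> SemanticVersion {
        match changes.iter().copied().max() {
            None => self,
            Some(ChangeLevel::Breaking) => SemanticVersion { major: self.major + 1, ..self },
            Some(ChangeLevel::Warning) => SemanticVersion { warning: self.warning + 1, ..self },
            Some(ChangeLevel::NonBreaking) => SemanticVersion { patch: self.patch + 1, ..self },
        }
    }
}

impl fmt::Display for SemanticVersion {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}.{}", self.major, self.warning, self.patch)
    }
}
```

Deriving `Ord` on `ChangeLevel` lets the most severe change in a set be found with `max()`, which mirrors the builder summarizing its tracked changes into a single state variable.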

Builder APIs (trait definitions)

Traits are defined for each of the TypeDescriptor types. For definitions, refer to the Rust definitions in: crates/coordinator-zomes/apis/type_desc_coord_api/src/builder.rs

build -- applies the changes staged in a DescriptorBuilder (sub-)hierarchy. It is during the build process that semantic version numbers are derived based on the set of changes accumulated prior to the build. For this reason, builds are applied bottom-up.

cancel -- abandons the changes for a version.

deprecate -- a version of a TypeDescriptor can be marked as deprecated. This signals intent to no longer support this (and earlier) versions of the TypeDescriptor. A DeprecationPolicy

deactivate

archive

delete

Functions provided by HolonDescriptorBuilder trait

newHolonDescriptor(branch_id) -> TypeDescriptorBuilder -- creates a new (empty) HolonDescriptorBuilder on the specified branch. When committed, a new tid is generated for the new TypeDescriptor and its initial version number (0.0.1) is assigned.

cloneHolonDescriptor(from:HolonTypeDescriptor) -> TypeDescriptorBuilder -- creates a new HolonDescriptorBuilder on the specified branch within the specified metaspace from an existing version of a HolonDescriptor. The new HolonDescriptorBuilder will have its own tid and version history. It retains no relationship to the HolonDescriptor from which it was cloned. Note that cloning does a deep copy.

deriveHolonDescriptor(branch_id, descriptor:HolonTypeDescriptor) -- creates a new HolonDescriptorBuilder on the specified branch within the specified metaspace from an existing HolonDescriptor. The HolonDescriptorBuilder retains a reference to the version it was derived from (referred to as its predecessor). It retains the same tid as its predecessor but is assigned a new semantic version. Note that derive does not (immediately) create Builders for its children. So, for example, upon derive of a HolonDescriptor version, the HolonDescriptorBuilder will still reference the child properties CompositeTypeDescriptor and identifying_properties list.
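The difference between clone and derive can be sketched as follows. The struct layouts and function names are hypothetical simplifications, not the definitions in builder.rs: identifiers are plain strings, the branch parameter is omitted, and properties are copied by value in both cases, whereas the spec says derive keeps referencing the existing child descriptors.

```rust
/// Simplified stand-in for a built HolonDescriptor version.
#[derive(Debug, Clone, PartialEq)]
pub struct HolonDescriptor {
    pub tid: String,
    pub oid: String,
    pub properties: Vec<String>,
}

/// Simplified stand-in for the staging object.
#[derive(Debug, Clone, PartialEq)]
pub struct HolonDescriptorBuilder {
    /// `None` means a fresh tid will be generated on commit.
    pub tid: Option<String>,
    /// oid of the version this builder was derived from, if any.
    pub predecessor: Option<String>,
    pub properties: Vec<String>,
}

/// Clone: copies the content but starts a new, unrelated version history.
pub fn clone_holon_descriptor(from: &HolonDescriptor) -> HolonDescriptorBuilder {
    HolonDescriptorBuilder {
        tid: None,
        predecessor: None,
        properties: from.properties.clone(),
    }
}

/// Derive: keeps the tid and records the source version as predecessor.
pub fn derive_holon_descriptor(from: &HolonDescriptor) -> HolonDescriptorBuilder {
    HolonDescriptorBuilder {
        tid: Some(from.tid.clone()),
        predecessor: Some(from.oid.clone()),
        properties: from.properties.clone(),
    }
}
```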

addProperty -- adds a new property to the properties list for a HolonDescriptorBuilder -- first deriving the properties composite if this is the first property being added to this HolonDescriptorBuilder

removeProperty -- marks a property as removed

makeIdentifying -- adds an existing property to the list of identifying_properties -- returns Some/Error enum.

makeNotIdentifying -- removes a property from the list of identifying properties -- returns Some/Error enum.

Deriving Identifiers and Semantic Version Numbers

When a Metaspace steward submits a set of changes for a DescriptorBuilder, the tid, oid, and semantic version are calculated as shown in the following diagram.

Descriptor Builder State Machine

The remaining TypeHeader fields are populated from the HolonDescriptorBuilder object.

Within each Metaspace branch, versions are chained in a linear linked list and each component of the semantic version number is incremented monotonically. Because stewards are allowed to operate concurrently, more than one agent could attempt to contribute the next version. The method for handling these conflicts is described below.

Build Top Down

As specified in the Metaspace Ontology, some descriptors may contain other descriptors. For example, HolonDescriptors contain CompositeDescriptors which contain DependentTypeDescriptors. These relationships can be thought of as forming a hierarchy. Updates are applied from the root (HolonTypeDescriptor) down. To edit a HolonTypeDescriptor, first derive a new HolonTypeDescriptorBuilder. To update a leaf descriptor (e.g., changing the attributes of a PropertyDescriptor), a new version of that descriptor must be created. A new version of its parent must also be created that references the new leaf. This process is followed transitively to the root of the hierarchy. To avoid version proliferation, sets of changes can be grouped together into a CommitBundle. All of the changes within a single CommitBundle are applied atomically -- i.e., either all changes are committed or none are.
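The all-or-nothing behavior of a CommitBundle can be sketched by staging changes on a copy of the state and swapping it in only if every change succeeds. The `Store` layout, the `BundleChange` enum, and the duplicate-oid failure condition are all assumptions made for illustration; the spec does not define how a CommitBundle is represented or what causes a commit to fail.

```rust
use std::collections::HashMap;

/// Hypothetical store: maps each tid to its ordered list of version oids.
pub type Store = HashMap<String, Vec<String>>;

/// Hypothetical change carried by a bundle.
#[derive(Debug, Clone, PartialEq)]
pub enum BundleChange {
    AddVersion { tid: String, oid: String },
}

/// Applies every change in the bundle, or none of them.
pub fn apply_bundle(store: &mut Store, bundle: &[BundleChange]) -> Result<(), String> {
    // Work on a copy so a failure leaves the original untouched.
    let mut staged = store.clone();
    for change in bundle {
        match change {
            BundleChange::AddVersion { tid, oid } => {
                let versions = staged.entry(tid.clone()).or_default();
                // Assumed failure condition: the same oid recorded twice.
                if versions.contains(oid) {
                    return Err(format!("duplicate oid {} for tid {}", oid, tid));
                }
                versions.push(oid.clone());
            }
        }
    }
    // Every change succeeded, so commit the staged state.
    *store = staged;
    Ok(())
}
```

Copying the whole store is only practical for a small in-memory example; it is used here to make the atomicity guarantee easy to see.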

Conflict Resolution

Usage of the Persistence Tier

Persistence Tier Design

The prototype will use Holochain as its persistent storage mechanism. In the future, we may abstract the choice of storage technology away by leveraging the AD4M APIs.

TODO (the following will be removed once they have been reflected in github issues/stories)

  1. Define MetaspaceState struct (containing branches and policies)
  2. Define Metaspace Trait with operations for adding (and merging?) branches and editing Metaspace policies
  3. Define BranchState struct (containing branch metadata and TypeDescriptors)
  4. Define 'Branch' Trait (supporting create of new TypeDescriptors, adding versions of a TypeDescriptor, ...)
  5. Update TypeDescriptorHeader to set _id as the unique id for each specific version of a type and tid to refer to the shared type id.
  6. Update definition of 'HolonTypeDescriptor' so that all properties are defined in the single Composite. The identifying_properties attribute should just reference the subset of those properties that are considered identifying. Note the definition of ALL properties (whether identifying or not) are specified in the properties attribute.
  7. Add struct definitions for the SemanticVersion link
  8. Add struct definitions for the Conflict link
  9. Add struct definitions for the Predecessor link
  10. Add trait functions for relationships
  11. Define Builder structs for each of the base TypeDescriptor types defined in the ontology