
Orbital Project: Mercury

Orbital is a user-facing "app runner" that can run durable, persistent applications fetched over the wire. We see typical use cases around 'sense making': organizing and allowing interaction with information from across the Internet, or from computer vision applications that scan the user's local context and provide filtered, organized contextual information.

Project Mercury is the concluding project for the first year of R&D. It embodies our thinking and research as best we are able to capture it given time constraints. See https://github.com/orbitalweb/mercury

Core Concepts

Kernel

The kernel provides the ability to run multiple threads of execution simultaneously. We rely on Rust itself for this capability.

Services

A service in our nomenclature is simply a thread that also has a message channel bound to it. Our application architecture treats services as a core building block, with larger concepts such as “applications” built out of multiple services bound together by messaging. Some services are built-in. Others are dynamically created.
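
To make this concrete, here is a minimal sketch of the pattern using std::thread plus a crossbeam channel (the channel flavor that shows up later in Broker::listen). All names here are illustrative, not the actual Mercury API:

    use crossbeam::channel::{unbounded, Sender};
    use std::{thread, time::Duration};

    // Illustrative message type; the real engine defines its own structs.
    struct Message { path: String, payload: String }

    // A "service" in this sketch is just a thread plus the sending half of a
    // channel that other services use to reach it.
    fn spawn_service(name: &'static str) -> Sender<Message> {
        let (tx, rx) = unbounded::<Message>();
        thread::spawn(move || {
            // The service blocks on its channel and reacts to inbound messages.
            while let Ok(msg) = rx.recv() {
                println!("[{}] {} -> {}", name, msg.path, msg.payload);
            }
        });
        tx
    }

    fn main() {
        let log = spawn_service("log");
        log.send(Message {
            path: "localhost:/orbital/service/log".into(),
            payload: "hello".into(),
        }).unwrap();
        thread::sleep(Duration::from_millis(50)); // let the service thread run
    }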

Service Factory

A common pattern in software is a "factory" that produces an object on demand (from, say, a name string). We don't have this concept of a factory as an independent entity. In our architecture the core services have their own factories - basically just a method call on the service itself. Some services are singletons, such as those that represent limited device resources (such as a display or a front-facing camera). Some services can be multiply instanced, such as a script or WASM loader/runner. All dynamically created services are manufactured by one of the built-in services. There is no "abstract factory" that can produce a service on demand - rather, one must message a built-in service to ask it specifically to produce an instance of a dynamic service.
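
A hedged sketch of the factory-as-method idea, with the built-in service immortal and minting its own dynamic children (names are illustrative):

    use std::thread::{self, JoinHandle};

    // Sketch: a built-in service doubles as the factory for its own dynamic
    // children; there is no standalone abstract factory. Names are illustrative.
    struct ScriptService; // built-in, immortal

    impl ScriptService {
        // The "factory" is just a method on the built-in service: messaging it
        // with a spawn command would land here and start a dynamic instance.
        fn spawn_instance(&self, script_path: String) -> JoinHandle<()> {
            thread::spawn(move || {
                // ... fetch, sandbox, and run the script at script_path ...
                println!("running {}", script_path);
            })
        }
    }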

Messages

In our nomenclature messaging is the act of sending traffic from one service thread to another. Principles here:

  • All messages are packaged as structs (a sketch follows this list).
  • All traffic is between threads (services).
  • All messages can be sent over a TCP connection, or locally via shared memory and other higher-efficiency pipelines.
  • All built-in services communicate only via messages.
  • All dynamic services are sandboxed to prevent any interaction other than through messages.
  • No traffic, messages, or communication is possible outside of the messaging structure.
  • There is no late binding, dynamic loading, direct function calls, dynamically loadable libraries, or any other scheme for connecting two pieces of code together other than by messages.
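
Following these principles, and the path/command/args conventions detailed later in "Service Event Payloads in detail", a message struct might look roughly like this (field names are a sketch, not the actual definition):

    use std::collections::HashMap;

    // Sketch of a message struct mirroring the path/command/args convention
    // described later; the real engine's struct may differ.
    struct Message {
        path: String,                  // destination, e.g. "localhost:/orbital/service/view"
        command: String,               // what the receiver should do, e.g. "attach"
        args: HashMap<String, String>, // free-form arguments for the command
    }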

Wires

We use a voluntary mechanism where a party can message a service to ask it to send them traffic. This avoids the broker itself having to do the work of routing outbound traffic later. Building a wire consists of something like the following (a code sketch follows the list):

  1. Tell the broker to pass a message to a "Service Factory" that you want to start an instance of a service (such as load a WASM blob). Let's call this "Service A".

  2. Tell the broker that Service A exists and specify the namespace paths that Service A would like to listen to for commands.

  3. Tell the broker to pass a message to some service (let's call this Service B) that you want to wire Service A to. That message itself tells Service B how to send traffic to Service A.
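
Expressed against the broker calls documented below in "Broker Service in detail", the three steps might look roughly like this (payload shapes and paths are illustrative):

    // 1. Ask the WASM factory service to start an instance ("Service A").
    broker::event_with_path(
        "localhost:/orbital/service/wasm",
        r#"{ command:"spawn", args:{ wasm:"localhost:/home/root/apps/clock.wasm" } }"#,
    );

    // 2. Register Service A with the broker on the paths it listens to.
    broker::listen_with_sender(
        "localhost:/home/root/running/clock",
        service_a_sender, // the crossbeam sender channel owned by Service A
    );

    // 3. Tell Service B (here, the view) to send its traffic to Service A.
    broker::event_with_path(
        "localhost:/orbital/service/view",
        r#"{ command:"echo", args:{ echo:"localhost:/home/root/running/clock" } }"#,
    );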

Namespaces

For our purposes we define a namespace as a directed acyclic graph based on DNS. The namespace is partitioned such that there are built-in services bound to well-defined formal names (mounted at localhost:/orbital/*), there are userland services (mounted at localhost:/orbital/home/[username]/running/[name]), and there are remote services (mapping to arbitrary DNS paths and ports).

  • localhost: -> traffic to local machine
  • localhost:/home -> user accounts
  • localhost:/home/root -> root account
  • localhost:/home/root/apps -> some root account app manifests, that are not yet running
  • localhost:/home/root/scripts -> some root account scripts (not quite full-blown apps - no formal manifest), that are not yet running
  • localhost:/home/root/running -> root account apps that are running (may have been produced from apps but not necessarily so!)

Events

There is a shared system namespace that services can register to listen for events within, and publish to (see Namespaces). This follows a pubsub model. Events are always published to a path, and all listeners on that path get the event.

To send a message to a service you address it by name in a planetary, global namespace that uses the same notation as DNS. Here are some built-in services and their paths:

  • localhost:/orbital/service/broker -> local broker instance
  • localhost:/orbital/service/log
  • localhost:/orbital/service/scripting
  • localhost:/orbital/service/camera
  • localhost:/orbital/service/tensor
  • localhost:/orbital/service/timer
  • localhost:/orbital/service/wasm
  • localhost:/orbital/service/view
  • orbital.eth/service/broker -> orbital bridge gateway in the ENS namespace
  • pinata.eth/service/broker -> the Pinata IPFS gateway

Applications

An "application" is a collection of services wired together. Principles here:

  • Application components (the services themselves) can be distributed over the internet, and in fact they can move around dynamically; they are not necessarily always on device.
  • Persistence: Apps persist until they shut down or are stopped. This is different from traditional web browsers.

Application Manifests

Applications can be built by hand (by any running piece of code), but there is also a manifest scheme (see Flow Grammar) that allows an application to be defined as a collection of services and then instantiated into running memory.

Security

Security exists between services within a single app, not just around apps. Application manifests can also (conveniently) declare security policies.

[TBD - exactly how to define security - examples!]

Hypervisor

This piece is not written yet. Right now all threads are run locally. Principles [TBD]:

  • Topology. In the current architecture all services that are connected to each other are on device. In the future there may be a hypervisor that load balances services over multiple machines depending on various criteria (latency, dataset size, privacy).

  • Semantic Intent. At the moment all messages are sent directly between explicit parties. In the future we see a role for an abstract clearing house for "answering questions" where a high level message is sent such as "what is the weather in SF today" and where agents can opportunistically compete to resolve the message.

Flow Grammar

We have a declarative grammar that is effectively JSON. Application manifests are described in this grammar. So are fragments of the display graph, and so are messages.

[TBD examples]

Specific Services

Be aware that there are both "built-in" services and "dynamically loaded" services. All dynamically loaded services are produced by a pre-existing 'immortal' built-in service (acting as a factory). The only dynamic services permitted at the moment are WASM blobs and Javascript scripts.

Broker Service

There is a fundamentally special service called a 'broker'. It handles all messages for all other parties. The entire engine is built around it. It's a small piece of code but it is core glue.
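
A minimal sketch of that glue: a map from paths to listener channels, with listen and event as the two core operations (illustrative, not the actual implementation):

    use crossbeam::channel::Sender;
    use std::collections::HashMap;

    struct Message { path: String, payload: String }

    // Sketch of the broker's core: paths mapped to the channels of whoever
    // is listening on them. The real broker is richer than this.
    #[derive(Default)]
    struct Broker {
        listeners: HashMap<String, Vec<Sender<Message>>>,
    }

    impl Broker {
        // Register a listener channel on a path (Broker::listen).
        fn listen(&mut self, path: &str, sender: Sender<Message>) {
            self.listeners.entry(path.to_string()).or_insert_with(Vec::new).push(sender);
        }

        // Deliver an event to every listener on its path (Broker::event).
        fn event(&self, msg: Message) {
            if let Some(subs) = self.listeners.get(&msg.path) {
                for sub in subs {
                    let _ = sub.send(Message { path: msg.path.clone(), payload: msg.payload.clone() });
                }
            }
        }
    }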

Package Manager Service

Services that load blobs, such as the WASM loader, have a persistence manager and a revision manager, and also check signing and security.

Scripting Service

We found that developing all of the high-level logic in Rust was too slow. So we've moved to a model where we use the Chrome V8 Javascript engine to describe high-level concepts such as application manifests or display layouts, and do all the heavy lifting in Rust. Large parts of the user experience, and "slow" or non-critical parts of it, are simply defined in Javascript for development velocity, as fits the rapid prototyping nature of this early work.
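
For reference, here is a minimal V8 embed, assuming the rusty_v8 crate (the Rust bindings to V8, API roughly as of late 2021). It just evaluates one expression, standing in for running something like an application manifest script:

    use rusty_v8 as v8;

    fn main() {
        // One-time V8 process setup.
        let platform = v8::new_default_platform(0, false).make_shared();
        v8::V8::initialize_platform(platform);
        v8::V8::initialize();

        // Each scripting service instance could own an isolate like this one.
        let isolate = &mut v8::Isolate::new(Default::default());
        let scope = &mut v8::HandleScope::new(isolate);
        let context = v8::Context::new(scope);
        let scope = &mut v8::ContextScope::new(scope, context);

        // Evaluate a trivial script; the real system runs things like boot.js.
        let code = v8::String::new(scope, "'hello from ' + 'v8'").unwrap();
        let script = v8::Script::compile(scope, code, None).unwrap();
        let result = script.run(scope).unwrap();
        println!("{}", result.to_rust_string_lossy(scope));
    }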

Computer Vision Service

There are several computer vision services that parse/segment video feeds into a set of abstract features such as "floor", "person", or "wall". This state is piped to the View Service.
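
A sketch of the shape one such feature might take on the wire, serialized with serde/serde_json for illustration (field names are guesses, not a real schema):

    use serde::Serialize;

    // Illustrative shape for one segmented feature that a CV service might
    // pipe to the view service.
    #[derive(Serialize)]
    struct Feature {
        scenepath: String, // where this lands in the shared scene graph
        kind: String,      // e.g. "floor", "wall", "person"
        xyz: [f32; 3],     // position
        whd: [f32; 3],     // width / height / depth
    }

    fn main() {
        let wall = Feature {
            scenepath: "/room/wall-3".into(),
            kind: "wall".into(),
            xyz: [0.0, 0.0, 2.0],
            whd: [4.0, 3.0, 0.1],
        };
        // In practice this would be sent to localhost:/orbital/service/view.
        println!("{}", serde_json::to_string(&wall).unwrap());
    }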

View Service

The view service is one of the larger pieces of this project, and longer-term requirements will place higher burdens on this area down the road. The hope is to swap the current implementation out for https://makepad.dev .

Here are the requirements we have for a view service:

  1. Synthesized 3d Representation. We see a need for a "view model" - a single scene graph containing marked-up and segmented 3d objects that represent important concepts. This is a common practice; typical 3d applications often represent the "state of their world" as a directed acyclic graph of nodes, where each node can represent some kind of successive state transform applied to its children, and/or some kind of graphical representation of a concept. Typically scene graphs use transform hierarchies and grouping concepts to organize all of the nodes. [ See https://en.wikipedia.org/wiki/Scene_graph ] In our case we have a design concept that this scene-graph can be written to by multiple separate services, and can be browsed by multiple services. Effectively the scene-graph is a "database" of digested results, organized in such a way that it acts as a memory or resource for services. For example a computer vision algorithm can capture a point-cloud of an internal volume such as a house, and then segment that point-cloud into features such as floors, walls, windows, and pets. It can then publish an abstract high-level semantic segmentation model that can be used for a whole variety of purposes. Typical uses could include efficient rendering of a digital twin (at less cost than rendering the whole point cloud), allowing high-level expressions of semantic intent ("place the virtual television so that it occupies most of the largest wall facing the user at eye level"), or allowing decorators ("attach a friendship hat to the nearest person who is a friend").

  2. Singleton; single receiver with many writers. We want a single service that synthetically merges many different sources together. A view is typically rasterized to 2d from a "view point" and then painted on the human user's eyeballs, such as in a heads-up display or even an ordinary 2d display. We also see a case for an internal view representation model for a robot.

  3. Live State Model. The view service retains a constantly updated segmentation model of the scene as parsed out by computer vision algorithms.

  4. Retained Scene Graph. Callers pass display graph fragments to the view, which the view incorporates into a total scene graph. Effectively, outside services "throw fragments over the fence" to the view service, which incorporates or updates the pieces.

  5. Semantic Intent. There are cases where multiple apps can paint or decorate the same objects. Callers can specify how they would like to render their components not as an explicit XYZ location but as a relationship, such as "on the floor" or "on a person's head".

  6. Resource Contention. Callers may be limited in how much geometry and screen real estate they are allowed to occupy. Also, portions of the screen are reserved to prevent spoofing attacks.

Current implementation approach:

Fragment passing. Callers publish a message as a string-encoded blob to the view service. The string itself is a fragment of a scene-graph described in a text format. Scene graph nodes represent typical 3d elements such as one may see in VRML or in HTML. Each element is marked with a unique ID, and "replaces" any existing segment of the graph with that ID in the view. Our grammar (which we call 'flow grammar') is effectively javascript, and a typical fragment can look like this:

{
  id: 1,
  kind: "box",
  xyz: [100,100,0],
  hwd: [200,100,0],
  children: [{
    id: 2,
    kind: "text",
    text: "please login",
    xyz: [100,100,0],
  }]
}
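
Delivering such a fragment is just a message to the view service. In terms of the broker call documented below in "Broker Service in detail", a sketch:

    // Sketch: hand the fragment above to the view service as a string-encoded
    // blob, using the broker call documented in "Broker Service in detail".
    let fragment = r#"{
        id: 1,
        kind: "box",
        xyz: [100,100,0],
        hwd: [200,100,0],
        children: [{ id: 2, kind: "text", text: "please login", xyz: [100,100,0] }]
    }"#;
    broker::event_with_path("localhost:/orbital/service/view", fragment);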
  

Rendering Engine Choices. There are many available frameworks in Rust. However, Rust developers tend to encourage programmers to stay in Rust: they often try to create a pleasant Rust-side programmatic layer so that the developer drives the Rust-based system from Rust as well, and this leads to contorted interfaces over the underlying semantics. Rust developers also tend not to document at all. And there is a tendency toward rapid code and standards evolution, fragmented and incomplete tool ecosystems, and cryptic semantics built on trait modifiers and borrow-checker lifetime hacks, all of which make understanding engines, or building on them, extremely challenging. See comments such as https://github.com/gfx-rs/wgpu/issues/1453 and https://github.com/gfx-rs/wgpu-rs/issues/198 . Some of the existing engines include Bevy, Nannou, Three-d, Raylib, Dotrix, and Harmony. There's also a reasonable course on WGPU programming from scratch at https://github.com/sotrh/learn-wgpu . It's also worth noting there are some good collections of basic 3d maths, such as https://github.com/I3ck/rust-3d .

The core requirements are:

  1. Windows. Opening a window on a device.
  2. IO. Being able to capture user input events: keyboard, mouse, and controllers.
  3. Fonts. Being able to load TrueType fonts and expressively render text to the display.
  4. 2D API. Specifically fat lines, boxes, circles, and colors including alpha, at a minimum.
  5. Images. Being able to load and represent images, with rotate, scale, and scissor operations.
  6. Pixels. Being able to dynamically paint to an image (such as capturing a live video feed and painting it).
  7. Widgets. Buttons, input boxes, draggable regions, and event reporting back up to callers.
  8. 3d. Basic concepts such as point of view (camera), cubes, spheres, model loading (glTF), and lights.

Our current approach

In our current approach we take the wgpu tutorial, layer wgpu_glyph on top of it for fonts, and then write custom event handlers for buttons and basic widgets. This isn't especially elegant, but we expect makepad will provide better solutions for us in the future.
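
For reference, requirement 1 (opening a window), plus the event loop that requirement 2's IO arrives through, via the winit crate the wgpu tutorial builds on (API roughly as of winit 0.25, 2021):

    use winit::{
        event::{Event, WindowEvent},
        event_loop::{ControlFlow, EventLoop},
        window::WindowBuilder,
    };

    fn main() {
        let event_loop = EventLoop::new();
        let _window = WindowBuilder::new()
            .with_title("orbital view")
            .build(&event_loop)
            .unwrap();

        // Keyboard, mouse, and controller events all arrive through this loop.
        event_loop.run(move |event, _, control_flow| {
            *control_flow = ControlFlow::Wait;
            if let Event::WindowEvent { event: WindowEvent::CloseRequested, .. } = event {
                *control_flow = ControlFlow::Exit;
            }
        });
    }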


Broker Service in detail

  1. Broker::event(). Sending. From native rust or scripting you can send a message to a path. This is useful to drive a service (make something do something).

    broker::event(payload)

    broker::event_with_path(path,payload)

  2. Broker::listen(). From native rust or from a scripting layer you can ask the broker itself to add you as a listener on a path, so that all messages to that path go to you. This is useful if you are a "service" that does work based on inbound messages. Scripting languages don't get an actual dedicated "receiver" back; rather, the entire scripting module has a global event receiver, all messages arrive there, and you have to sort them out by hand.

    receiver = broker::listen(path)

    broker::listen_with_sender(path,crossbeam_sender_channel)

  3. Broker::unlisten(). TBD. I haven't bothered implementing this yet.

  4. Broker::route(). This is deprecated (for now). I used to have a capability where you could "wire" one path to another, basically connecting two services. This means listening to traffic on one path and forwarding it to another path (such that any listeners on that other path also get it). But I think it is better for one service to specifically ask another service if it can listen to traffic there.
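
Putting 1 and 2 together, a round trip from native Rust might look like this (a sketch against the signatures above; the payload is just a conventional string):

    // Sketch: register as a listener, then fire an event at the same path
    // and pull it back off the receiver. Signatures follow the calls above.
    fn demo() {
        let path = "localhost:/orbital/service/log";

        // Become a listener on the path; the broker hands back a receiver.
        let receiver = broker::listen(path);

        // Send an event to the path; every listener (including us) gets it.
        broker::event_with_path(path, r#"{ command:"print", args:{ text:"hi" } }"#);

        if let Ok(msg) = receiver.recv() {
            println!("got: {:?}", msg);
        }
    }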

Service Event Payloads in detail

What do you actually send as a message? This is largely a set of conventions.

  1. "Ordinary events" - Order a service to do some work:
    {
        path:"localhost:/orbital/services/view",    <- where the message is going to
        command:"attach",                           <- command is a convention for the receiver service
        args:{                                      <- args is an optional convention for arguments for the command
            scenepath:"/myscene/mycube",
            kind:"cube",
            whd:[1,1,1],
            ypr:[0,0,0],
            xyz:[0,0,0],
            color:"blue"
        }
    }
  1. "Factory events" - Ask a service to produce a child instance from a factory (most built-in services offer this power)

    {
        path:"localhost:/orbital/services/scripting",
        command:"spawn",
        args:{
            script:"localhost:/orbital/accounts/root/scripts/boot.js",   <- run a script found on local disk here
            listen:"localhost:/orbital/accounts/root/running/boot.js",   <- have that script listen to traffic here
            echo:"localhost:/orbital/accounts/root/running/other.js",    <- have that script send any published state to here
        }
    }

  2. "Rebroadcast events" - Ask a service to re-broadcast its own outbound events to you explicitly on a separate channel. This is probably not going to be needed much - a better pattern is to effectively ask a service to mint a new copy of itself and send traffic to you specifically. Also, another way to "wire" services together would be to directly ask the broker to do it - but I've decided to deprecate that.

	{
		path:"localhost:/orbital/services/view",
		command:"echo",
		args:{
			echo:"localhost:/orbital/accounts/root/running/boot.js",
		}
	}

Bootstrapping in project Mercury

From the shell:

  1. Broker is started by hand.
  2. Several core services are started by hand and registered with broker.
  3. Rust-side main() logic orders the scripting engine to run "boot.js".
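
In code, the shell-side sequence might look roughly like this, reusing the toy Broker, Message, and spawn_service sketches from earlier sections (illustrative, not the actual Mercury main()):

    fn main() {
        // 1. Start the broker by hand.
        let mut broker = Broker::default();

        // 2. Start core services by hand and register them with the broker.
        broker.listen("localhost:/orbital/service/log", spawn_service("log"));
        broker.listen("localhost:/orbital/service/scripting", spawn_service("scripting"));
        broker.listen("localhost:/orbital/service/view", spawn_service("view"));

        // 3. Order the scripting engine to run boot.js; from here the
        //    javascript side drives the app runner, login, and desktop.
        broker.event(Message {
            path: "localhost:/orbital/service/scripting".into(),
            payload: r#"{ command:"spawn", args:{ script:"boot.js" } }"#.into(),
        });
    }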

Once the app is running it leaps to javascript for some work:

  1. App Runner. The boot.js logic passes a boot application manifest to an application runner / manager.
  2. Boot Login. The boot manifest produces a user interface (on the display service) and lets the user log in to the desktop.
  3. Desktop. The desktop app presents a simple workspace that lets users fetch, start, stop, and manage other apps.

Demos

  1. Login. Login is an example of a simple user-facing app. The main() rust entry point kicks off this script, which loads the login manifest off disk, asks to bind to several services (such as the view and keyboard/mouse input), and then produces a login display by telling the view to paint a login button.

  2. Desktop. Desktop is another app, also loaded from a manifest, and also talking to the view, and receiving keyboard/mouse input. It presents a list of available apps that can be started or stopped. It also allows the user to enter a URL to fetch, load, and run new remote WASM apps. The desktop does not support the viewing of any documents, images or ordinary web pages. Later we may have a mime-type binding to bring up an image viewer or an html page viewer (such as Servo).

  3. Button. This is an example of a button as a minimal demo.

  4. Clock. This is an example of a clock that does incremental updates. It has a callback in the app manifest.

  5. App loaded over wire. This is a remote app that can be loaded as a test. It is a clock written in Rust that uses the messaging gateways correctly to be able to paint to the display after being loaded.

  6. CV Segmentation Demo. This is an app that shows a use case around talking to a CV service, and then painting results from that to the display.