20210317 DEV M1 R2 Threading - orbitalfoundation/wiki GitHub Wiki

Orbital WASM Runner

March 17th 2021

OVERVIEW

This is a narrower drill-down on the microkernel design, based on further research. This kernel is the core of the larger product offering.

Abstract Goals

  1. Produce a minimalist device-agnostic non-opinionated runtime service that is somewhere between an “app runner” and a “kernel”.
  2. Allow persistent dynamic downloadable applications on top of a module paradigm.
  3. Allow persistent dynamic downloadable shared libraries on top of a module paradigm.
  4. Allow persistent dynamic downloadable device drivers on top of a module paradigm.
  5. Allow inter module messaging.
  6. Strong security policy.
  7. Run in user land on top of Linux or any POSIX compliant OS.
  8. Abstract away hardware.
  9. Lend itself to adoption by new hardware developers - notably AR glasses.
  10. Be a statement about common shared foundations. Discourage larger companies from building their own silos.
  11. Rasterization is pretty important. Use the lowest-level approach for rasterization services. Any app or library should use the simplest primitives, not high-level capabilities, i.e. "draw a polygon" not "make a user clickable button". For example, a compositor process can let each app have a fragment of a raster buffer to render into rather than providing yet another DSL. It's better to provide, say, a vertex shader with clipping and so on than to provide something like '3js'. It may arguably make sense to let modules have their own fragment + depth buffer (compositing can then be done with the z-buffer and multiple render passes).
  12. Be careful about module dependencies -> don't have fluff... see https://www.youtube.com/watch?v=M3BM9TB-8yA
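
The compositing idea in goal 11 (each module renders into its own fragment + depth buffer, and a compositor merges them with a z-test) can be sketched roughly as follows. This is a minimal CPU-side illustration only; `Layer`, `put` and `composite` are hypothetical names, and a real implementation would presumably run on the GPU.

```rust
/// One module's private render target: a color buffer plus a depth buffer.
/// (Hypothetical type for illustration; not an existing Orbital API.)
struct Layer {
    width: usize,
    height: usize,
    color: Vec<u32>, // packed color value per pixel
    depth: Vec<f32>, // smaller is closer; INFINITY means "nothing drawn here"
}

impl Layer {
    fn new(width: usize, height: usize) -> Self {
        Layer {
            width,
            height,
            color: vec![0; width * height],
            depth: vec![f32::INFINITY; width * height],
        }
    }

    /// Write one fragment. A module only ever touches its own layer.
    fn put(&mut self, x: usize, y: usize, color: u32, z: f32) {
        let i = y * self.width + x;
        self.color[i] = color;
        self.depth[i] = z;
    }
}

/// Merge every module's layer into one frame using a per-pixel depth test.
fn composite(width: usize, height: usize, layers: &[Layer]) -> Layer {
    let mut out = Layer::new(width, height);
    for layer in layers {
        assert_eq!((layer.width, layer.height), (width, height), "layer size mismatch");
        for i in 0..width * height {
            // Keep whichever fragment is closest to the viewer.
            if layer.depth[i] < out.depth[i] {
                out.color[i] = layer.color[i];
                out.depth[i] = layer.depth[i];
            }
        }
    }
    out
}

fn main() {
    let (red, blue) = (0xFF0000FFu32, 0x0000FFFFu32);
    let mut a = Layer::new(3, 1);
    a.put(0, 0, red, 0.5);
    a.put(1, 0, red, 0.5);
    let mut b = Layer::new(3, 1);
    b.put(0, 0, blue, 0.2); // closer than module a's fragment at the same pixel
    let frame = composite(3, 1, &[a, b]);
    println!("{:?}", frame.color);
}
```

The point of the design is that modules never need a shared drawing DSL: they only produce fragments, and the compositor alone decides what reaches the screen.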

Application Modules

Philosophically everything is a module.

The intent is for applications and shared libraries to be dynamically downloaded, bind to each other, communicate with each other, and communicate with system resources and devices under security policies on a wide range of hardware. If a developer ports this service to their own hardware, then all the resulting capabilities, applications and services should “just run”. (In practice there may be some device drivers or low-level capabilities that are specialized and need to be built for each piece of hardware.)

The implementation approach is a native Rust application or “thread runner” that requires ALL applications, libraries, device-drivers, services and computation in general to be expressed as WASM modules that can then be dynamically loaded. A messaging bus is provided for inter module communication. WASI is used to expose basic POSIX services but in areas where WASI is incomplete (such as say WebGPU notably) we have to extend standards and roll our own for now (which should ultimately yield back to whatever the mainstream direction ends up being).

Dynamic Library Modules

The expectation (and one difference between this project and, say, a web browser) is that shared libraries such as TensorFlow or any kind of computer vision based image segmentation library will be able to be delivered to the device dynamically and then be available as a shared resource for other modules. The goal is durable, persistent, computational resources, managed with security policies, that serve as the building blocks for larger applications (that are also downloaded and bind at runtime). This is a different philosophy from platforms such as iOS, which provide a full suite of libraries and tools. In implementation terms, libraries are simply WASM modules and are identical to applications.

Device Driver Modules

Logic that talks to actual hardware may have to be written natively. But it should show up internally as "just another module" and otherwise follow the same policies (such as asking for permissions, being dynamically loaded and bound). It should fit into the module paradigm.

Historically, app-running services or full-blown kernels tend to keep their device drivers “behind the wall” as built-in capabilities running in a privileged space. Our concern isn’t specifically that we want a lightweight kernel; rather, we want to be philosophically “future oriented”, so that new devices can be loaded up and run as they emerge without having to upgrade the entire application.

Specialized capabilities (such as WebXR, WebGPU, Bluetooth, and other as of yet nascent future oriented capabilities) are ideally dynamically loaded modules rather than being “built in” to any kernel or foundation. This approach helps to allow this architecture to run on both limited resource embedded devices (such as say AR glasses) as well as larger devices (such as say a kind of app browser/runner running on a laptop).

Sandboxing & Process Namespaces

Safety and security are critical. We will use WASM/WASI and also build policy above that for security.

Each WASM module gets one full process. For sandboxing each WASM process has a “process namespace”. These are the capabilities that can be attached to each process namespace:

  1. Permission to access areas of the file-system (extended from WASI).
  2. Permission to access well known resources and pipes (keyboard, mouse, sockets, and so on - again extended from WASI).
  3. Permissions to access other modules (notably WebGPU and other device like modules that expose critical system features).
  4. Permissions around network sockets. As of yet there are no specific restrictions on certain IP connections but this makes sense to support.
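
As a rough illustration of the four capability kinds listed above, a process namespace could be modelled as a plain data structure that the host consults before granting access. This is only a sketch; `ProcessNamespace` and its method names are hypothetical and are not part of WASI or wasmtime.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical capability set attached to one module's process namespace.
#[derive(Debug, Default)]
struct ProcessNamespace {
    /// Directory prefixes the module may access.
    fs_roots: Vec<PathBuf>,
    /// Well-known resources and pipes, e.g. "keyboard" or "mouse".
    resources: HashSet<String>,
    /// Names of other modules this module may talk to.
    modules: HashSet<String>,
    /// Whether the module may open network sockets at all.
    network: bool,
}

impl ProcessNamespace {
    /// Component-wise prefix check. Note: this sketch does not normalize
    /// `..` segments or symlinks; a real policy would have to.
    fn can_access_path(&self, path: &Path) -> bool {
        self.fs_roots.iter().any(|root| path.starts_with(root))
    }

    fn can_use_resource(&self, name: &str) -> bool {
        self.resources.contains(name)
    }

    fn can_call_module(&self, name: &str) -> bool {
        self.modules.contains(name)
    }

    fn can_open_socket(&self) -> bool {
        self.network
    }
}

/// Example namespace for an imaginary camera application.
fn example_namespace() -> ProcessNamespace {
    ProcessNamespace {
        fs_roots: vec![PathBuf::from("/data/camera-app")],
        resources: ["keyboard".to_string()].into_iter().collect(),
        modules: ["webgpu".to_string()].into_iter().collect(),
        network: false,
    }
}

fn main() {
    let ns = example_namespace();
    println!("frame.png allowed: {}", ns.can_access_path(Path::new("/data/camera-app/frame.png")));
    println!("keyboard allowed: {}", ns.can_use_resource("keyboard"));
    println!("bluetooth module allowed: {}", ns.can_call_module("bluetooth"));
    println!("sockets allowed: {}", ns.can_open_socket());
}
```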

KERNEL

✅ Compiler

[ DO NOTHING ] We will not bother providing arbitrary higher level compilers for now. If wasmer or wasmtime or say Emscripten or whatever other tools we use end up doing work for us - that's fine but we won't worry about it for now beyond that. For early development the early developers can just compile to WASM themselves. Effectively any JIT compilation will later be bound to the Loader below. Note that wasmtime and the like do provide Cranelift, which takes the intermediate format and turns it into native machine code for the target platform. Later a "nice to have" is compilation of higher level grammars to WASM. I'm assuming we will be Rust biased but really we should be thinking about this in terms of any compiler ( such as https://github.com/emscripten-core/emscripten ) - although different compilers will introduce their own baggage ( for example to use Emscripten with Rust you need to use it as a Rust target https://www.hellorust.com/setup/emscripten/ and then target WebAssembly ).

✅ Loader

We will load and run dynamic modules on demand. WASMTIME will do for now. We might swap it out for some other loader if there is a good reason to. Wasmer and so on also seem to use the same sandbox manager for POSIX. This piece is easy on the surface - but there is complexity later around message passing. See https://docs.wasmtime.dev/lang-rust.html or https://docs.wasmtime.dev/examples-rust-hello-world.html for a trivial example of loading a WASM module from Rust and running it. See https://docs.wasmer.io/ecosystem/wasmer as well.

✅ Threading

The core of this product is a thread manager. More specifically we want a non-blocking thread manager so that message handling doesn't suspend state where not needed. It is simple to drive threads and inter-thread messaging in Rust. See https://www.koderhq.com/tutorial/rust/concurrency and https://crates.io/crates/threadpool.
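
A minimal sketch of the non-blocking idea, using only the standard library: each worker thread reports to the manager over a channel, and the manager drains its inbox with `try_recv` so it never suspends waiting for a message. The worker names and the `spawn_workers` / `poll` helpers are illustrative assumptions, not a proposed API.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;

/// Spawn one worker thread per (hypothetical) module name.
/// Each worker sends a single status message back to the manager.
fn spawn_workers(names: &[&str]) -> (Vec<thread::JoinHandle<()>>, Receiver<String>) {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for name in names {
        let tx = tx.clone();
        let name = name.to_string();
        handles.push(thread::spawn(move || {
            tx.send(format!("{} ready", name)).expect("manager hung up");
        }));
    }
    // The original `tx` is dropped here, so the channel disconnects
    // once every worker has finished and dropped its clone.
    (handles, rx)
}

/// Drain whatever is queued right now without blocking.
/// Returns the messages plus a flag saying whether all senders are gone.
fn poll(rx: &Receiver<String>) -> (Vec<String>, bool) {
    let mut out = Vec::new();
    loop {
        match rx.try_recv() {
            Ok(msg) => out.push(msg),
            Err(TryRecvError::Empty) => return (out, false),
            Err(TryRecvError::Disconnected) => return (out, true),
        }
    }
}

fn main() {
    let (handles, inbox) = spawn_workers(&["camera", "display"]);
    let mut done = false;
    while !done {
        let (messages, disconnected) = poll(&inbox);
        for m in messages {
            println!("manager got: {}", m);
        }
        done = disconnected;
        // The manager is free to do other work here instead of blocking.
        thread::yield_now();
    }
    for h in handles {
        h.join().unwrap();
    }
}
```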

Also it is worth noting that there are many WASM runners but few of them take a larger multi-threaded and inter-module communication approach. The Bytecode Alliance WASM Micro Runtime looks like it fills a similar space to this project. Notably they also have a message passing scheme and a thread running scheme: https://github.com/bytecodealliance/wasm-micro-runtime .

❓ How do I instantiate a WASM module without having to statically define the methods on that module? WASM dynamic module imports? We can avoid the problem by having a single catch-all standard function entry point, or we can scan the module for its exports. See: https://docs.wasmtime.dev/api/wasmtime/struct.Func.html

✅ Messaging

This is probably the most complex part of the project. We want strong concepts of inter module messaging and ideally a shared memory pattern so that large buffers can be moved around without copying.

Here are my thoughts so far on this:

  1. One underlying common practice is to use MPSC channels. We will use this as part of the solution - simply as a pattern for defining who owns what memory:

See https://blog.softwaremill.com/multithreading-in-rust-with-mpsc-multi-producer-single-consumer-channels-db0fc91ae3fa.

  2. One technique is to pass callback function handles - allowing the WASM module to "call out" to the outside world. But we need to formalize what that message channel looks like. The WASM module will obviously have to explicitly know that this capability exists.

  3. Another technique is to expose function handles through bound dependencies - similar to the way that wasmtime exposes POSIX to modules. This may however create static build dependencies. We do want a more "late binding" approach. It does seem like a messaging approach is best. See : https://github.com/WebAssembly/WASI/blob/main/design/application-abi.md . Also see https://github.com/WebAssembly/tool-conventions and https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md and https://docs.wasmer.io/ecosystem/wasmer .

  4. Shared memory. Currently the plan is that WASM modules will communicate with each other through a messaging architecture for small amounts of memory. Data may be copied. We want to explore being able to define larger shared memory pools, say under an MPSC regime.

  5. Makepad has a nice messaging formalization:

We definitely want a performant interface to displays. If we look at Makepad they basically pipe everything out of the WASM blob to specialized handlers - this message based approach is probably the right way to do it:

https://github.com/makepad/makepad/blob/master/render/src/cx_webgl.js

We have some flexibility here for prototyping. We can for example invent a high level notation. We could use a scene graph. We could pipe OpenGL. There are lots of options. It's less critical to get this perfect today - since there should be some flexibility in being able to swap out this entire module for some other better approach - but we do need something. It is also notable that this display module will likely need to define the overall app look and feel.

  6. Here is some more commentary on this kind of common problem:

https://stackoverflow.com/questions/61529205/what-is-the-best-practice-to-communicate-between-a-rust-program-and-an-embedded

Notably this above post cites FASTLY LUCET's approach: https://github.com/bytecodealliance/lucet/blob/master/lucet-wasi/src/runtime.rs . And interestingly the author of a quite nice summary of related messaging issues is also now at FASTLY (We don't have to go this far because we're only talking between well known senders and receivers - but it illuminates some of the challenges): https://hacks.mozilla.org/2019/08/webassembly-interface-types/
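
To make the MPSC and shared-memory ideas above a little more concrete, here is a host-side sketch using only the Rust standard library. Small messages are moved through the channel by value, while a large buffer travels as an `Arc`, so only a reference-counted pointer crosses the channel and the bytes are not copied. The `Msg` enum and `run_demo` are hypothetical, and this only covers threads inside the host process; sharing memory with a WASM guest's linear memory is a separate problem not addressed here.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Hypothetical message type for the inter-module bus.
enum Msg {
    /// Small payload: simply moved through the channel.
    Text(String),
    /// Large payload: only the Arc pointer is cloned, never the bytes.
    Frame(Arc<Vec<u8>>),
}

/// Two producers, one consumer. Returns the texts received, the length of
/// the received frame, and whether the frame is the very same allocation
/// the producer held (i.e. no copy was made).
fn run_demo(frame_len: usize) -> (Vec<String>, usize, bool) {
    let (tx, rx) = mpsc::channel::<Msg>();
    let frame = Arc::new(vec![0u8; frame_len]);

    let text_tx = tx.clone();
    let text_producer = thread::spawn(move || {
        text_tx.send(Msg::Text("camera online".to_string())).unwrap();
    });

    let frame_for_producer = Arc::clone(&frame);
    let frame_producer = thread::spawn(move || {
        tx.send(Msg::Frame(frame_for_producer)).unwrap();
    });

    text_producer.join().unwrap();
    frame_producer.join().unwrap();

    let mut texts = Vec::new();
    let mut received_len = 0;
    let mut zero_copy = false;
    // Both senders were moved into the (now finished) threads, so this
    // loop ends once the queued messages have been consumed.
    for msg in rx {
        match msg {
            Msg::Text(t) => texts.push(t),
            Msg::Frame(buf) => {
                received_len = buf.len();
                zero_copy = Arc::ptr_eq(&buf, &frame);
            }
        }
    }
    (texts, received_len, zero_copy)
}

fn main() {
    let (texts, len, zero_copy) = run_demo(1024 * 1024);
    println!("texts: {:?}, frame bytes: {}, same allocation: {}", texts, len, zero_copy);
}
```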

✅ Sandboxing

[ IGNORE FOR NOW ] We will need a richer security policy than WASI provides. But we don't need it today. We will scope our own sandboxing policy on top of WASI. There is an FDIR (Fault Detection, Isolation and Recovery) philosophy here - we want fault isolation from other threads. We will ignore this for now, however (since we are the only party writing apps, we know that they are trusted).

✅ Native Module Support

There will inevitably be some behaviors that are too hard to isolate in their own process. These will appear to be WASM modules but will just be native code. We need to make sure we can dynamically load native modules. We may build some capabilities as native for now and move them out of 'core' later.
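
One way to make native code "appear to be a WASM module" is to put both behind the same host-side interface with a single catch-all message entry point. The sketch below shows only the native side; `Module`, `Message`, `NativeEcho` and `Host` are hypothetical names, and a WASM-backed implementation of the same trait (wrapping a runtime instance) is assumed but not shown.

```rust
use std::collections::HashMap;

/// Hypothetical message passed between modules.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    topic: String,
    payload: Vec<u8>,
}

/// What the host sees for every module, native or WASM-backed.
trait Module {
    fn name(&self) -> &str;
    /// Single catch-all entry point: the host never needs to know a
    /// module's individual functions ahead of time.
    fn handle(&mut self, msg: &Message) -> Option<Message>;
}

/// A native module that simply echoes what it receives.
struct NativeEcho;

impl Module for NativeEcho {
    fn name(&self) -> &str {
        "echo"
    }

    fn handle(&mut self, msg: &Message) -> Option<Message> {
        Some(Message {
            topic: format!("{}.reply", msg.topic),
            payload: msg.payload.clone(),
        })
    }
}

/// Minimal host that owns loaded modules and routes messages by name.
#[derive(Default)]
struct Host {
    modules: HashMap<String, Box<dyn Module>>,
}

impl Host {
    fn load(&mut self, module: Box<dyn Module>) {
        let name = module.name().to_string();
        self.modules.insert(name, module);
    }

    /// Returns None if the target is not loaded or chose not to reply.
    fn send(&mut self, target: &str, msg: &Message) -> Option<Message> {
        self.modules.get_mut(target)?.handle(msg)
    }
}

fn main() {
    let mut host = Host::default();
    host.load(Box::new(NativeEcho));
    let ping = Message { topic: "ping".to_string(), payload: b"hi".to_vec() };
    println!("{:?}", host.send("echo", &ping));
}
```

Because callers only ever see `Box<dyn Module>`, a capability built natively today could later be swapped for a WASM build without changing the modules that message it.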

✅ Module Registry

We need a registry service that:

  1. Tells us which modules are available locally.
  2. Tells us what version they are at.
  3. Tells us what their dependencies are.
  4. Tells us what their exposed functions are.
  5. Tells us what triggers them.
  6. Stores and retrieves them on demand.

In a rich application model there may be apps that rest until a certain event occurs - and then they fire once, or start up a long-lived thread. NodeJS and Deno offer good lessons on the broader implications of getting this right or wrong: https://www.secondstate.io/articles/deno-webassembly-rust-wasi
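
The registry requirements above map fairly directly onto a small in-memory data structure. The following is a sketch under assumptions: the field names, the event strings such as `camera.frame`, and the `record` helper are all made up for illustration, and persistence to disk is left out.

```rust
use std::collections::HashMap;

/// Hypothetical registry entry covering the requirements listed above.
#[derive(Debug, Clone)]
struct ModuleRecord {
    name: String,
    version: (u32, u32, u32),
    dependencies: Vec<String>,
    exports: Vec<String>,
    triggers: Vec<String>,
    wasm: Vec<u8>, // the stored module bytes
}

#[derive(Default)]
struct Registry {
    modules: HashMap<String, ModuleRecord>,
}

impl Registry {
    fn store(&mut self, record: ModuleRecord) {
        self.modules.insert(record.name.clone(), record);
    }

    fn get(&self, name: &str) -> Option<&ModuleRecord> {
        self.modules.get(name)
    }

    /// Dependencies of `name` that are not available locally.
    fn missing_dependencies(&self, name: &str) -> Vec<String> {
        match self.modules.get(name) {
            Some(rec) => rec
                .dependencies
                .iter()
                .filter(|d| !self.modules.contains_key(*d))
                .cloned()
                .collect(),
            None => vec![name.to_string()],
        }
    }

    /// Names of modules that should wake up for `event`, sorted.
    fn triggered_by(&self, event: &str) -> Vec<String> {
        let mut names: Vec<String> = self
            .modules
            .values()
            .filter(|m| m.triggers.iter().any(|t| t == event))
            .map(|m| m.name.clone())
            .collect();
        names.sort();
        names
    }
}

/// Convenience constructor used only by this example.
fn record(name: &str, deps: &[&str], triggers: &[&str]) -> ModuleRecord {
    ModuleRecord {
        name: name.to_string(),
        version: (0, 1, 0),
        dependencies: deps.iter().map(|s| s.to_string()).collect(),
        exports: vec!["handle".to_string()],
        triggers: triggers.iter().map(|s| s.to_string()).collect(),
        wasm: Vec::new(),
    }
}

fn main() {
    let mut reg = Registry::default();
    reg.store(record("camera", &[], &["timer.tick"]));
    reg.store(record("face-detect", &["tensorflow", "camera"], &["camera.frame"]));
    println!("missing: {:?}", reg.missing_dependencies("face-detect"));
    println!("on camera.frame: {:?}", reg.triggered_by("camera.frame"));
    if let Some(m) = reg.get("camera") {
        println!("{} v{:?} exports {:?} ({} bytes)", m.name, m.version, m.exports, m.wasm.len());
    }
}
```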

✅ Internet Module Registry

We need a web service that publishes modules to the world and has an audit process for modules.

MODULES

These are the modules we will write that we will try to keep outside of the core build.

Video Capture Module

A good test of this whole system is an ability to capture input from the camera device and make it available to other modules. For now this can be (pretty much has to be) a native module. This is turning out to be a hassle. It's kind of surprising just how poor the Rust support, documentation and capabilities are for these kinds of - what should be trivial - native bindings. It looks like only specific hardware is supported: https://www.reddit.com/r/rust/comments/hk7v95/eye_cross_platform_camera_capture_control/

I wasn't able to build the opencv bindings for Apple M1 silicon [ https://crates.io/crates/opencv ]. So it's hard to just do a trivial embedded example such as https://riptutorial.com/opencv/example/21401/get-image-from-webcam . There may also be an Apple silicon issue with the Apple native library bindings for image-capture-core and AVFoundation ... https://github.com/brandonhamilton/image-capture-core-rs/blob/master/Cargo.toml . What's also going on here is that we start to see the native roots of any external or device access - and how that will vary by platform. It shows how any third party developer hoping to "bring up" a full orbital microkernel will have to hand build all the module dependencies that have native hooks.

What I've done instead is use a shell command that captures a single frame of video, saves it to disk, and then pipes that back into Rust - using a kind of shell exec for now ( https://rust-lang-nursery.github.io/rust-cookbook/os/external.html ) . This is obviously suboptimal.
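
The shell-exec workaround can be sketched with `std::process::Command`. The actual capture tool is not named here, so the example below takes the program and its arguments as parameters, and `main` substitutes a stand-in `sh` command that just writes a few bytes to the output file (this assumes a Unix-like shell is available). `capture_frame` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;

/// Run an external capture command that is expected to write one frame to
/// `out_path`, then read that file back into memory.
fn capture_frame(program: &str, args: &[&str], out_path: &Path) -> io::Result<Vec<u8>> {
    let status = Command::new(program).args(args).status()?;
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{} exited with {}", program, status),
        ));
    }
    fs::read(out_path)
}

fn main() {
    let out = std::env::temp_dir().join("orbital_frame.bin");
    // Stand-in for a real camera capture tool: writes 5 bytes to the file.
    let script = format!("printf FRAME > '{}'", out.display());
    match capture_frame("sh", &["-c", script.as_str()], &out) {
        Ok(bytes) => println!("captured {} bytes", bytes.len()),
        Err(e) => eprintln!("capture failed: {}", e),
    }
}
```

Going through the file system for every frame is slow and racy, which is one concrete reason this approach is only a stopgap until proper native bindings build.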

Tensorflow Module

I'm currently trying to build this as a "native" module to use as a test to help exercise the overall system... and there are some interesting problems. I would prefer if it was entirely a wasm blob - and I am surprised somewhat by the path I've been exploring here.

https://cetra3.github.io/blog/face-detection-with-tensorflow-rust/ -> this unfortunately doesn't build on my device right now. There's some kind of subtle issue with Bazel and the build versions required for a whole huge stack of incredibly subtle and complex dependencies. Specifically this module does not build: https://crates.io/crates/tensorflow-sys . Apparently I need a specific version of bazel : https://itssiva.medium.com/how-to-install-a-specific-version-of-haproxy-or-any-brew-package-a87561119e63. It's concerning to me that crates cannot manage versioned access to external capabilities! See : https://github.com/bazelbuild/bazel/releases?after=0.27.1 and see https://github.com/bazelbuild/homebrew-tap/issues/39 and see https://stackoverflow.com/questions/3987683/homebrew-install-specific-version-of-formula . Also see https://github.com/bazelbuild/bazel/issues/4812 . Bazel https://bazel.build/ itself is "yet another build system" -> it kind of again concerns me that a crate has these kinds of specific dependencies... it really shows how Rust itself is also "just another build system" - and just another way to package things. It all makes a credible argument for having the outer system just be LLVM or C / C++ rather than trying to migrate everything to a rust framework for no particular reason...

Rasterization and/or GPU Module

What should serve as the rasterization target handler? There are several game engines, and winit. These will have to do for testing.

HTTP Server Self Management Admin Panel Module

While not absolutely necessary, it is appealing to me to provide a module that presents a web interface to the internal system state. I want to be able to open an HTTP socket to it and have some communication with it. A typical NODEJS server for example can handle user HTTP requests and deliver an interface (such as a specific app or user experience). In this case I want to deliver a command-and-control administrative panel to give me some introspection on the currently running tasks, to be able to start and stop them and so on. This lets me run this application on embedded systems without having to carry along the baggage of a native VGA/WEBGL UX.

I'm going to use a server architecture similar to NODEJS or DENO or Apache. However, unlike Apache I want an event-driven approach (like NODEJS) that allows concurrent requests to be handled in a single main thread or event loop. The loaded WASM modules then effectively are like "handlers" that can respond to well phrased requests.
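
As a sketch of the "modules as handlers" idea, the dispatch step of such an admin panel might look like the following. It covers only routing of a request line to an action on a single thread; reading requests from a socket in an event loop is left out. The routes (`GET /tasks`, `POST /tasks/<name>/start`, `POST /tasks/<name>/stop`) and the `AdminPanel` type are assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory view of loaded modules: name -> is it running?
struct AdminPanel {
    tasks: BTreeMap<String, bool>,
}

impl AdminPanel {
    /// Handle one HTTP request line such as "GET /tasks HTTP/1.1".
    /// Returns (status code, response body).
    fn handle(&mut self, request_line: &str) -> (u16, String) {
        let mut parts = request_line.split_whitespace();
        let method = parts.next().unwrap_or("");
        let path = parts.next().unwrap_or("");
        let segments: Vec<&str> = path.trim_matches('/').split('/').collect();

        match (method, segments.as_slice()) {
            // List every known module and its state.
            ("GET", ["tasks"]) => {
                let body = self
                    .tasks
                    .iter()
                    .map(|(name, running)| {
                        format!("{} {}", name, if *running { "running" } else { "stopped" })
                    })
                    .collect::<Vec<_>>()
                    .join("\n");
                (200, body)
            }
            // Start or stop one module by name.
            ("POST", ["tasks", name, action]) if *action == "start" || *action == "stop" => {
                match self.tasks.get_mut(*name) {
                    Some(running) => {
                        *running = *action == "start";
                        (200, format!("{} {}", name, action))
                    }
                    None => (404, format!("no such module: {}", name)),
                }
            }
            _ => (404, "not found".to_string()),
        }
    }
}

fn main() {
    let mut panel = AdminPanel { tasks: BTreeMap::new() };
    panel.tasks.insert("camera".to_string(), false);
    for line in ["POST /tasks/camera/start HTTP/1.1", "GET /tasks HTTP/1.1"] {
        let (status, body) = panel.handle(line);
        println!("{} -> {} {}", line, status, body);
    }
}
```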

This may be a good starting point:

https://ianjk.com/devserver/ https://github.com/kettle11/devserver

User Interface Module

This is for later (it is not core). It is itself an optional userland module, but it has a special role:

  1. The plan is to have a simple UX with a URL bar, tabs and a windowing scheme similar to an ordinary web browser.
  2. The URL bar will “go to” an HTTP URL but there is no support for HTML, it can only download and run wasm modules into the sandbox.
  3. There will be an admin panel showing which modules are downloaded and how to manage them - as well as what is running.
  4. Apps may, but do not necessarily, have a display window open at any given time. For now they are persistent background processes unless stopped.

ISSUES / CONCERNS

  1. Rust ecosystem immaturity. The tools and services exposed to Rust - such as say Tensorflow, or access to the local webcam, or access to the GPU and suchlike are emaciated compared to other more mature ecosystems. Tools don't build, or they are incomplete, or broken. It's an extremely young ecosystem compared to many. This also shows up in a lack of documentation for the ecosystem.

  2. Large crate dependency stacks. The goal of this project is to build a small microkernel architecture - however I'm finding that most of the pieces used to build the system end up being quite large. It will be important to keep an eye on this if the goal is to run on IOT devices.

REFERENCES

Nice surveys

https://wiki.alopex.li/ActuallyUsingWasm <- excellent nice intro

https://sendilkumarn.com/blog/rustwasm-memory-model/

https://gumroad.com/sendilkumarn

https://hub.packtpub.com/multithreading-in-rust-using-crates-tutorial/ <- reasonable summary

https://arewegameyet.rs/

Threading questions ( some referring to in-browser issues )

https://stackoverflow.com/questions/59727133/can-multiple-wasm-modules-interact-with-each-other-and-share-memory-directly-via

https://rustwasm.github.io/2018/10/24/multithreading-rust-and-wasm.html

https://github.com/WebAssembly/wasi-sdk/issues/5 -> wasi threads?

https://doc.rust-lang.org/book/ch16-01-threads.html

https://blog.scottlogic.com/2019/07/15/multithreaded-webassembly.html -> threads in WebAssembly but in the browser for now

https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md -> web assembly threads

https://github.com/WebAssembly/WASI/issues/296#issuecomment-659223938

Microkernels in the wild

https://github.com/bytecodealliance/lucet <- Fastly's app runner

https://github.com/EOSIO/eos-vm <- https://github.com/appcypher/awesome-wasm-runtimes

https://alluto.io/

https://blackberry.qnx.com/en/aws

https://www.reddit.com/r/rust/comments/54ia4d/implementing_a_task_runner/

https://skipworth.io/posts/rust-wc-threads/ <- crates that help manage threads

Rendering stuff

https://github.com/rust-osdev/vga

https://github.com/WebAssembly/WASI/issues/171

https://www.libsdl.org/

https://www.glfw.org/

Packaging schemes and package managers

https://wapm.io/ <- hmmm, this feels a bit like npm or crates and it is a pattern I could use

https://medium.com/wasmer/wasmer-io-devices-announcement-6f2a6fe23081

https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-tutorial.md <- basic tutorial

https://doc.rust-lang.org/reference/linkage.html <- Rust late binding (probably not that useful)

https://github.com/WebAssembly/proposals/issues/8 <- upcoming plans for web interface bindings

People

https://en.wikipedia.org/wiki/Alan_Kay <- Alan Kay theorist around objects apps and runtime ecosystems