# Tutorial 1: Your first OpenXR API layer - mbucchia/OpenXR-Layer-Template GitHub Wiki
## Objectives
In this tutorial, we will learn the very basics of the OpenXR API layer template and create a simple API layer that modifies the Inter-Pupillary Distance (IPD) reported by the OpenXR runtime to apply a "World Scale" effect. We will also cover the basics of API layer loading, in order to understand how to ship our API layer to users.
The code for this tutorial can be found in the branch `examples/ipd-override`.
## Getting to know the template code
In order to work with the OpenXR API layer template, you first need to clone the repository. Be sure to also pull all the submodules (in `external\`). You can do that using `git` from the command line or using your favorite shell integration.

```
> git submodule update --init
```
These dependencies are needed to build your API layer; here is why:

- `external/OpenXR-SDK-Source`: This repository contains some critical definition files for generating and building your API layer.
- `external/OpenXR-SDK`: This repository contains the official Khronos SDK (pre-built), with the OpenXR header files needed for OpenXR development.
- `external/OpenXR-MixedReality`: This repository contains (among other things) a collection of utilities to conveniently write OpenXR code.
To build your API layer, you will need the following tools:
- Visual Studio 2019 or above;
- NuGet package manager (installed via Visual Studio Installer);
- Python 3 interpreter (installed via Visual Studio Installer or externally available in your PATH).
The API layer framework relies on a code generator that lets you specify the OpenXR API that you will hook into, and the API that you will consume. This framework lives under `openxr-api-layer\framework`. The generator is run on every build to ingest changes to the configuration stored in `openxr-api-layer\framework\layer_apis.py`.
Before we start working on the hooks for our API layer, you might want to rename it: you do not want your API layer to be called `XR_APILAYER_NOVENDOR_template`! Start by renaming the top-level solution file. Once you do that, you will also want to update the `description` in the JSON description files at `openxr-api-layer\openxr-api-layer.json` and `openxr-api-layer\openxr-api-layer-32.json`.
Tip: it is recommended to follow the naming convention established by Khronos. The prefix shall be `XR_APILAYER_`, followed by your vendor name, and finally the name of the API layer. You may register your own vendor name with Khronos by mimicking this pull request.
## Hooking into an OpenXR API
We first want to update the configuration of the code generator, in `openxr-api-layer\framework\layer_apis.py`. The definitions are as follows:

- `override_functions`: the list of OpenXR functions to intercept (hook) and to populate a "chain" for. To override the implementation of each function, simply implement a virtual method with the same name and prototype in the `OpenXrLayer` class in `openxr-api-layer\layer.cpp`. To call the real OpenXR implementation for the function, simply invoke the corresponding method in the `OpenXrApi` base class.
- `requested_functions`: the list of OpenXR functions that your API layer may use. This list is used to create wrappers to the real OpenXR implementation without overriding the function. To invoke a function, simply call the corresponding method in the `OpenXrApi` base class.
- `extensions`: if any of the functions declared in the lists above is not part of the OpenXR core spec, you must specify the corresponding extensions to search in (eg: `XR_KHR_D3D11_enable`).
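For illustration, a complete `layer_apis.py` configuration using all three definitions might look like the sketch below. Note that the `xrGetInstanceProperties` entry and the empty `extensions` list are illustrative choices for this example, not values taken from the template:

```python
# The list of OpenXR functions our layer will override.
override_functions = [
    "xrGetSystem",
    "xrCreateSession"
]

# The list of OpenXR functions our layer will use from the runtime
# without overriding them (wrappers are generated for these).
requested_functions = [
    "xrGetInstanceProperties"
]

# The list of extensions to import definitions from, needed only when a
# function above is not part of the OpenXR core spec.
extensions = []
```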
Let us now make our very first hook!
The template comes with a couple of functions that are implicitly hooked: `xrCreateInstance()` and `xrDestroyInstance()`. These are necessary to give you an entry point and an opportunity to clean up! You should prefer using them over the constructor and destructor of the `OpenXrLayer` class.
The base implementation of the `xrCreateInstance()` hook contains some useful logging code that you should keep for debugging purposes.
Note: the prototype of the `OpenXrLayer::xrCreateInstance()` method is slightly different from the `xrCreateInstance()` function in the OpenXR standard: it does not contain a pointer to an `XrInstance`. Unlike any other hook in your API layer, the `OpenXrLayer::xrCreateInstance()` method is invoked after the real call to `xrCreateInstance()` is made to the upstream layers and runtime.
Tip: in any method of your `OpenXrLayer`, you can use `GetXrInstance()` to retrieve the `XrInstance` handle that your API layer is bound to.
The template also comes with a couple of functions explicitly hooked through the default generator configuration file.
```python
# The list of OpenXR functions our layer will override.
override_functions = [
    "xrGetSystem",
    "xrCreateSession"
]
```
As shown above, the `xrGetSystem()` function is hooked. This hook contains some useful logging code that you should keep for debugging purposes. The `xrCreateSession()` function is also hooked, but we do not need it for this tutorial; instead, we will hook into `xrLocateViews()` and `xrEndFrame()`:
```python
# The list of OpenXR functions our layer will override.
override_functions = [
    "xrGetSystem",
    "xrLocateViews",
    "xrEndFrame"
]
```
Once we rebuild the project in Visual Studio, the generator is invoked, and it will now add the following two methods to the `OpenXrApi` base interface. Their prototypes and base implementations are auto-generated in `openxr-api-layer\framework\dispatch.gen.h`. The prototypes exactly match those of the corresponding functions in the OpenXR standard:
```cpp
class OpenXrApi
{
    [...]

    virtual XrResult xrEndFrame(XrSession session, const XrFrameEndInfo* frameEndInfo) {...}

    virtual XrResult xrLocateViews(XrSession session,
                                   const XrViewLocateInfo* viewLocateInfo,
                                   XrViewState* viewState,
                                   uint32_t viewCapacityInput,
                                   uint32_t* viewCountOutput,
                                   XrView* views) {...}
};
```
As we implement the `OpenXrLayer` class in `openxr-api-layer\layer.cpp`, we can now provide our own implementation of these two methods, as shown below:
```cpp
// This class implements our API layer.
class OpenXrLayer : public openxr_api_layer::OpenXrApi {
    [...]

    XrResult xrLocateViews(XrSession session,
                           const XrViewLocateInfo* viewLocateInfo,
                           XrViewState* viewState,
                           uint32_t viewCapacityInput,
                           uint32_t* viewCountOutput,
                           XrView* views) override {
        // Invoke the real implementation.
        return OpenXrApi::xrLocateViews(
            session, viewLocateInfo, viewState, viewCapacityInput, viewCountOutput, views);
    }

    XrResult xrEndFrame(XrSession session, const XrFrameEndInfo* frameEndInfo) override {
        // Invoke the real implementation.
        return OpenXrApi::xrEndFrame(session, frameEndInfo);
    }
};
```
We have hooked into two OpenXR APIs! Now let us do something useful with them.
## Inter-Pupillary Distance (IPD) or "World Scale" override
One of the very first features I implemented in OpenXR Toolkit was the "World Scale" override. It is a very popular feature because it helps players (especially in simulation games) to make the perceived size of the VR world match their expectation. And it is extremely simple to implement!
The way we perceive the size of objects in VR is directly linked to the Inter-Pupillary Distance (IPD), which corresponds to the distance between the left eye and the right eye cameras used for rendering (and is therefore sometimes called the Inter-Camera Distance (ICD)). Increase the ICD and the VR content will appear smaller. Decrease the ICD and the VR content will appear bigger. The rendering itself barely changes; only the perspective differs slightly, which is enough to alter our perception of size and depth.
A typical OpenXR application will set up rendering of a scene by placing the left eye and the right eye cameras as directed by the eye poses queried through the `xrLocateViews()` function. By altering the `pose` of each `XrView` populated by `xrLocateViews()` before they are returned to the application, we can override the ICD.
```cpp
// Contains very useful utilities for manipulating XrPosef, XrQuaternion, XrVector3f...
using namespace xr::math;

float overrideIPD(XrPosef& leftEye, XrPosef& rightEye, float IPD) const {
    const XrVector3f vec = rightEye.position - leftEye.position;
    const XrVector3f center = leftEye.position + (vec * 0.5f);
    const XrVector3f offset = Normalize(vec) * (IPD * 0.5f);
    leftEye.position = center - offset;
    rightEye.position = center + offset;
    return Length(vec);
}
```
Above, we implemented a helper function that takes two eye poses, changes the distance between them to the one we specified, and returns the previous distance. This is achieved through simple vector geometry:
- We find the `center` point between the two eyes;
- We create a vector `offset` that corresponds to how much to move the camera from the center toward either the left or the right. The length of the vector is half of the IPD we want to set;
- We place the left eye (respectively, the right eye) by starting from the center and subtracting (respectively, adding) the `offset`.
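The geometry above can be checked in isolation. Below is a standalone sketch of the same math using a minimal stand-in vector type; the names `Vec3` and `overrideIpdPositions` are made up for this example, while the real code operates on `XrPosef` with the `xr::math` helpers:

```cpp
#include <cassert>
#include <cmath>

// Minimal stand-in for XrVector3f.
struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }
static Vec3 normalize(Vec3 a) { return scale(a, 1.0f / length(a)); }

// Same logic as overrideIPD(), operating on bare eye positions.
float overrideIpdPositions(Vec3& leftPos, Vec3& rightPos, float ipd) {
    const Vec3 vec = sub(rightPos, leftPos);            // from the left eye to the right eye
    const Vec3 center = add(leftPos, scale(vec, 0.5f)); // midpoint between the eyes
    const Vec3 offset = scale(normalize(vec), ipd * 0.5f);
    leftPos = sub(center, offset);
    rightPos = add(center, offset);
    return length(vec);                                 // the previous IPD
}
```

For instance, eyes at x = -0.032 and x = +0.032 (a 64 mm IPD) overridden to 80 mm end up at x = -0.04 and x = +0.04, and the function returns the previous 0.064 m.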
Note: in the OpenXR specification, distances are expressed in meters. Therefore the value of `IPD` above must be expressed in meters, and the value returned by the function is expressed in meters as well.
Now let us modify the `xrLocateViews()` implementation to use the helper function above to modify the IPD:
```cpp
XrResult xrLocateViews(XrSession session,
                       const XrViewLocateInfo* viewLocateInfo,
                       XrViewState* viewState,
                       uint32_t viewCapacityInput,
                       uint32_t* viewCountOutput,
                       XrView* views) override {
    // Invoke the real implementation.
    const XrResult result = OpenXrApi::xrLocateViews(
        session, viewLocateInfo, viewState, viewCapacityInput, viewCountOutput, views);

    if (XR_SUCCEEDED(result) && viewCapacityInput) {
        // If this is a stereoscopic view, apply our IPD override.
        if (viewLocateInfo->viewConfigurationType == XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO) {
            m_lastSeenIPD = overrideIPD(views[xr::StereoView::Left].pose,
                                        views[xr::StereoView::Right].pose,
                                        IPDOverride);
        }
    }

    return result;
}
```
Here we are. We have just intercepted the call made by an application to the OpenXR runtime, chained the call to the OpenXR runtime, and altered the data before passing it to the application.
Note: your code is invoked by an application. Congratulations!... but with great power comes great responsibility. As shown in the code above, you must take great care to:

- Check whether the call to the real OpenXR runtime succeeded. This is important: if the call failed, then something must be wrong with what the application is doing, and we do not want to manipulate bad/uninitialized data.
- Not make assumptions about what the application is doing. As shown above, we only apply our IPD override when the application is querying a stereoscopic view. While it might be tempting to forego this check and assume that "all applications using OpenXR are doing stereo!", this would be incorrect and can lead to issues with certain applications.
So, we have our IPD override; are we done? Not exactly. Altering the output of `xrLocateViews()` is how we tell the application to set up its cameras differently. But if you read the OpenXR specification closely, you will notice that when an application submits its rendered frame with `xrEndFrame()`, it does so by passing an `XrCompositionLayerProjection` structure, which contains two `XrCompositionLayerProjectionView` structures, each of which specifies the eye pose used for rendering.
While it looks like this might not be a problem, we have to understand how reprojection works and how the OpenXR API comes into play with reprojection.
Disclaimer: reprojection is a very vast topic, and the explanation that follows is simplified as much as possible.
When your VR platform is performing reprojection, it uses the images rendered for each eye along with the position of the cameras "at rendering time". It computes the delta between the position of the cameras "at rendering time" and the position of the cameras "at predicted display time" (using the most recent tracking data available). It then uses this delta to slightly shift the rendered images to appear as if they were rendered "at predicted display time".
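To get a feel for the magnitude of the problem, here is a small back-of-the-envelope sketch; the numbers and the helper name are illustrative, not from the template. Since each eye sits half the IPD away from the center, the positional error per eye is half the difference between the overridden and the real IPD:

```cpp
#include <cassert>
#include <cmath>

// Per-eye positional error (in meters) introduced by an IPD override:
// each eye moves away from the center by half the IPD, so the error per
// eye is half of the difference between the two IPD values.
float perEyeErrorMeters(float realIpd, float overriddenIpd) {
    return std::fabs(overriddenIpd - realIpd) * 0.5f;
}
```

With a real IPD of 63 mm overridden to 80 mm, each submitted eye pose is off by 8.5 mm, an offset the reprojection would try to "correct" using tracking data based on the real IPD.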
When the application calls `xrEndFrame()`, it passes back those eye poses with the IPD overridden, and the reprojection will attempt to shift the rendered images using the latest tracking data; data which uses the original IPD (because the OpenXR runtime and the VR platform are unaware of our IPD override!). In order to avoid this undesired reprojection with the incorrect IPD, we will patch the original IPD back into the eye poses submitted via `xrEndFrame()`.
```cpp
XrResult xrEndFrame(XrSession session, const XrFrameEndInfo* frameEndInfo) override {
    // We will need to create copies of some structures, because they are passed const from the
    // application, so we cannot modify them in-place.
    XrFrameEndInfo chainFrameEndInfo = *frameEndInfo;
    std::vector<XrCompositionLayerProjection> projAllocator;
    projAllocator.reserve(frameEndInfo->layerCount);
    std::vector<std::array<XrCompositionLayerProjectionView, 2>> projViewsAllocator;
    projViewsAllocator.reserve(frameEndInfo->layerCount);
    std::vector<const XrCompositionLayerBaseHeader*> layersPtrAllocator;
    layersPtrAllocator.reserve(frameEndInfo->layerCount);

    for (uint32_t i = 0; i < frameEndInfo->layerCount; i++) {
        // Patch the IPD back for all projection layers with stereoscopic views.
        if (frameEndInfo->layers[i]->type == XR_TYPE_COMPOSITION_LAYER_PROJECTION) {
            const XrCompositionLayerProjection* proj =
                reinterpret_cast<const XrCompositionLayerProjection*>(frameEndInfo->layers[i]);
            if (proj->viewCount == xr::StereoView::Count) {
                // Create our copies of the structures we will modify.
                projAllocator.emplace_back(*proj);
                auto& patchedProj = projAllocator.back();
                projViewsAllocator.emplace_back(
                    std::array<XrCompositionLayerProjectionView, 2>{proj->views[0], proj->views[1]});
                auto& patchedProjViews = projViewsAllocator.back();

                // Restore the original IPD, otherwise the OpenXR runtime will reproject the
                // altered IPD into the real IPD.
                overrideIPD(patchedProjViews[xr::StereoView::Left].pose,
                            patchedProjViews[xr::StereoView::Right].pose,
                            m_lastSeenIPD);
                patchedProj.views = patchedProjViews.data();

                // Take our modified projection layer.
                layersPtrAllocator.push_back(
                    reinterpret_cast<XrCompositionLayerBaseHeader*>(&patchedProj));
            } else {
                // Take the unmodified projection layer.
                layersPtrAllocator.push_back(frameEndInfo->layers[i]);
            }
        } else {
            // Take the unmodified layer.
            layersPtrAllocator.push_back(frameEndInfo->layers[i]);
        }
    }

    // Use our newly formed list of layers.
    chainFrameEndInfo.layers = layersPtrAllocator.data();
    assert(chainFrameEndInfo.layerCount == (uint32_t)layersPtrAllocator.size());

    return OpenXrApi::xrEndFrame(session, &chainFrameEndInfo);
}
```
The snippet above shows how to iterate through all the composition layers submitted by the application, find the stereoscopic projections, and reverse the IPD override before submitting them to the OpenXR runtime.
The code above is probably more complex than you wished for, because it introduces a new challenge we had not seen before: structures are passed from the application as `const`, and it would not be safe to modify them in place. While it might be tempting to simply cast away the `const`-ness, and it would probably work OK in this specific case, doing so puts us at risk of nasty bugs: we have no guarantee that the application will not reuse those structures once the call to `xrEndFrame()` returns. So here we demonstrate how to create copies of all the structures we need to patch, by using `std::vector` as basic allocators and pushing back a new structure every time we need a copy.
Note: it is very important to `reserve()` the correct worst-case size in each `std::vector`! As we will be capturing pointers to some of these structures, the `reserve()` ensures that no re-allocation in the vector will invalidate our pointers.
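Why the `reserve()` matters can be demonstrated in isolation. The sketch below (not from the template; the function name is made up) captures a pointer to each element right after its `push_back()`, exactly like the code above captures `&patchedProj`; with a sufficient `reserve()`, `push_back()` never re-allocates, so every captured pointer stays valid:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if every pointer captured during the push_back() loop
// still refers to the right element at the end. With a worst-case
// reserve() up front, this always holds.
bool pointersStableWithReserve(std::size_t count) {
    std::vector<int> storage;
    storage.reserve(count); // worst-case size, like layerCount above

    std::vector<int*> captured;
    for (std::size_t i = 0; i < count; i++) {
        storage.push_back(static_cast<int>(i));
        captured.push_back(&storage.back()); // pointer into the vector
    }

    for (std::size_t i = 0; i < count; i++) {
        if (captured[i] != &storage[i]) {
            return false; // a re-allocation moved the elements
        }
    }
    return true;
}
```

Without the `reserve()` call, a growing vector may move its elements on any `push_back()`, leaving the earlier captured pointers dangling.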
## Loading the API layer
The OpenXR Loader specification has many details on the loading of API layers. Here we will only show the basics.
There are three main ways to load your API layer into an application:

1. Explicit loading. This requires the application to be modified to specify your API layer in the call to `xrCreateInstance()`. We will skip the details on this approach, since it is probably not applicable to API layers that we want to load into unmodified applications.
2. Implicit loading via environment variables. This technique uses two environment variables, `XR_ENABLE_API_LAYERS` and `XR_API_LAYER_PATH`. The former must be set to the name of your API layer JSON file without the `.json` extension (here `openxr-api-layer` for the 64-bit build and `openxr-api-layer-32` for the 32-bit build). The latter must point to the folder where the JSON file is located.
3. Implicit loading via registry key. This technique relies on a value created under `HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenXR\1\ApiLayers\Implicit` (or `HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Khronos\OpenXR\1\ApiLayers\Implicit` for 32-bit): a DWORD (32-bit) value whose name is the full path to the API layer JSON file, with data 0 for enabled or any non-zero value for disabled.
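As an illustration, the registry value for the 64-bit layer could be described by a `.reg` fragment like the one below; the install path shown is hypothetical, and in practice the install scripts mentioned next create this value for you:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenXR\1\ApiLayers\Implicit]
"C:\\Path\\To\\openxr-api-layer.json"=dword:00000000
```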
For API layers distributed to end users, solution 3) is the most practical. It is implemented by the developer install/uninstall scripts under `script\[Un]install-layer.ps1`.

You will be using these scripts to enable/disable the API layer from within your output folder (eg: run `bin\x64\Debug\Install-Layer.ps1` to activate the debug version of your layer during development).
Warning: keep track of the content of your registry under `HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenXR\1\ApiLayers\Implicit`. You do not want to have multiple copies of your API layer active at once!
Note: once you implicitly enable your API layer via the environment method or the registry method, it will be loaded by every OpenXR application. If your API layer targets one application specifically, set the `m_bypassApiLayer` value to `true` in your `xrCreateInstance()` hook. You may use the value of `createInfo->applicationInfo.applicationName` to identify the application.
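That check might be sketched as follows. The structures and the application name below are stand-ins invented for this example so it is self-contained; in the real hook you would compare `createInfo->applicationInfo.applicationName` and assign the result to `m_bypassApiLayer`:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for the relevant parts of XrInstanceCreateInfo.
struct ApplicationInfo { char applicationName[128]; };
struct InstanceCreateInfo { ApplicationInfo applicationInfo; };

// Returns true when the layer should stay dormant because the calling
// application is not the one we target ("MyTargetApp" is made up here).
bool shouldBypass(const InstanceCreateInfo* createInfo) {
    return std::strcmp(createInfo->applicationInfo.applicationName, "MyTargetApp") != 0;
}
```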
## Some best practices about deploying API layers
- Do not use the `Install-Layer.ps1` script as the installation method for end users. PowerShell scripts do not run by default on non-developer machines and are subject to security restrictions.
- Sign your code digitally in order to be compatible with anti-cheat software (like Easy Anti-Cheat).
- Offer a simple way for the user to disable your API layer, in case it causes instabilities with certain applications. A simple Windows program with a checkbox to enable/disable it is enough.
- Do not forget to remove the registry key from the end user's machine once they delete or uninstall your API layer!
## Capturing traces
Tracelogging will be an essential tool when it comes to debugging your API layer and investigating issues from your users.
You want to begin by customizing the name and GUID of your traces in both `scripts\Tracing.wprp` and `openxr-api-layer\framework\log.cpp`.
In the tracing profile (used to capture logs):

```xml
<EventProvider Name="CBF3ADCD-42B1-4C38-830C-91980AF201F8" Id="OpenXRTemplate" />
```
In the layer's code:

```cpp
// {cbf3adcd-42b1-4c38-830c-91980af201f8}
TRACELOGGING_DEFINE_PROVIDER(g_traceProvider,
                             "OpenXRTemplate",
                             (0xcbf3adcd, 0x42b1, 0x4c38, 0x83, 0x0c, 0x91, 0x98, 0x0a, 0xf2, 0x01, 0xf8));
```
To capture a trace for your API layer:

1. Open a command prompt or PowerShell in administrator mode, in a folder where you have write permissions;
2. Begin recording a trace with the command `wpr -start path\to\Tracing.wprp -filemode`;
3. Leave that command prompt open;
4. Reproduce the crash/issue;
5. Back in the command prompt, finish the recording with `wpr -stop output.etl`.

These files are highly compressible!
Use an application such as Tabnalysis to inspect the content of the trace file.
Tip: writing the calls to `TraceLoggingWrite()` in each function implemented by your layer may be tedious... You can look at the source code of the PimaxXR runtime, which implements the tracelogging calls for all functions of the core OpenXR specification plus a handful of extensions. You may copy/paste the `TraceLoggingWrite()` calls from it.
## Conclusion
You can now experiment with the code presented in this tutorial, by hooking into other OpenXR API functions and either logging them or altering their behavior.
The code snippets presented above are shortened for the sake of the tutorial. Review the full code change in the `examples/ipd-override` branch and take a look at some of the smaller details not presented in depth: additional error-checking code, tracing calls for debugging, etc.
If you have any feedback on this tutorial or the example code, please file an issue.
Thank you.
Next tutorial: Tutorial 2: Your first in-VR overlay.