RAUC integration notes - madisongh/tegra-test-distro GitHub Wiki
Notes on integrating RAUC into builds for Jetson platforms.
RAUC background
RAUC is a relatively lightweight update system for embedded Linux devices.
Features:
- Open source (LGPL-2.1 licensed)
- Written in C (makes heavy use of glib)
- Can be built as either a standalone program or a systemd service with a D-Bus interface
- Configurable for multiple update methods, partition layouts, etc.
- Single update package can support multiple variants of a platform
- Signed update packages with content verification (using dm-verity)
- Update packages can include "hooks" - scripts/programs that can be run pre-install or post-install
It supports full A/B updates (called "symmetric" in their documentation), single rootfs with an update/recovery image ("asymmetric"), or symmetric combined with a fallback recovery/rescue image that can be loaded should both A and B partitions fail to boot. RAUC calls their update packages "bundles".
RAUC does not currently support streamed updates. That is, with the default bundle format, there must be sufficient space available to download the bundle to the device before the update begins, rather than streaming the bundle content directly onto the partition to be updated. RAUC does have "experimental" support for using casync, which features both streamed update support and delta updates.
The RAUC project does not supply a back-end deployment server. They do supply (as separate programs) clients that interface
with Eclipse hawkBit, an open-source IoT device management and update framework. I tested the C language
rauc-hawkbit-updater client, and it worked as advertised.
Integration experiments
- Implemented in tegra-test-distro, a fork of the OE4T demo distro.
- Experimented with machines from all three current Jetson families (t210, t186, t194).
- Implemented "symmetric" (A/B) updates only.
I used custom flash layouts for each platform tested, similar to the layouts used for Mender integration in the demo distro.
Bootloader integration - t186/t194
RAUC comes with built-in support for commonly-used bootloaders, and also provides a documented "custom" interface for communicating boot slot status to bootloaders not already supported. The bootloader is expected to add boot slot information to the kernel command line when booting the system, so RAUC can know which boot slot is being used.
For the cboot platforms (t186/t194), I patched cboot to pass rauc.slot={a|b} in the kernel command line and
added a script to translate RAUC's slot update/status requests into tegra-boot-control commands.
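As a rough illustration of how the slot marker travels from the bootloader to userspace, the Python sketch below pulls rauc.slot out of a kernel command line string. RAUC does this parsing itself; the function name rauc_slot_from_cmdline and the sample command line are made up for this example.

```python
from typing import Optional


def rauc_slot_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the value of rauc.slot=... from a kernel command line, or None.

    Hypothetical helper for illustration only; it is not part of RAUC.
    """
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        # Only accept the argument when it has the form rauc.slot=<value>.
        if key == "rauc.slot" and sep:
            return value
    return None


# Sample command line (made up); on a device the text would come from /proc/cmdline.
sample = "console=ttyS0,115200 root=/dev/mmcblk0p1 rauc.slot=a quiet"
print(rauc_slot_from_cmdline(sample))
```

On a running system, passing the contents of /proc/cmdline to the function is a quick way to confirm that the patched bootloader is supplying the argument.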
Bootloader integration - t210
RAUC's U-Boot support uses u-boot-fw-utils tools to read and write U-Boot environment variables:
- BOOT_ORDER lists the slot names in order of preference.
- BOOT_<slot>_LEFT holds a count of the number of remaining boot attempts for the named slot, maximum 3.
RAUC supplies a contributed U-Boot script to demonstrate the logic for handling the boot slot selection sequence
based on the above-mentioned variables. For the t210 platforms, I patched
U-Boot to integrate that logic into the distro-boot (extlinux) boot sequence we use by default, since we don't
normally implement U-Boot scripting on our platforms. I also added U-Boot config fragments to enable environment
redundancy (splitting the default 32KiB environment into two 16KiB chunks) and enable the setexpr command
used for decrementing the BOOT_..._LEFT counter. The environment redundancy change also required updated
configuration files for libubootenv's utilities.
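The slot selection logic driven by those variables can be modeled in a few lines of Python. This is a simplified sketch, not the contributed U-Boot script or the patch described above: the environment is a plain dict, the function name select_boot_slot and the slot names A and B are assumptions for illustration, and where a real bootloader would save the environment and reset, the model just returns None.

```python
from typing import Dict, Optional, Sequence

MAX_ATTEMPTS = 3  # maximum value of each BOOT_<slot>_LEFT counter


def select_boot_slot(env: Dict[str, str],
                     slots: Sequence[str] = ("A", "B")) -> Optional[str]:
    """Pick the slot to boot and update the counters in env in place.

    Returns the chosen slot name, or None when every slot has run out of
    attempts (in which case all counters are reset to MAX_ATTEMPTS).
    """
    # Fill in defaults for variables that have never been set.
    env.setdefault("BOOT_ORDER", " ".join(slots))
    for slot in slots:
        env.setdefault(f"BOOT_{slot}_LEFT", str(MAX_ATTEMPTS))

    # Walk the slots in preference order; take the first with attempts left.
    for slot in env["BOOT_ORDER"].split():
        key = f"BOOT_{slot}_LEFT"
        left = int(env.get(key, "0"))
        if left > 0:
            env[key] = str(left - 1)  # the decrement that setexpr performs
            return slot

    # No bootable slot: reset the counters so the next boot can retry.
    for slot in slots:
        env[f"BOOT_{slot}_LEFT"] = str(MAX_ATTEMPTS)
    return None


env = {"BOOT_ORDER": "B A", "BOOT_A_LEFT": "3", "BOOT_B_LEFT": "1"}
print(select_boot_slot(env), env)
```

In this model, userspace is responsible for restoring a slot's counter once a boot has been confirmed good; if that never happens, repeated calls eventually fall through to the next slot in BOOT_ORDER.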
Handling bootloader updates
For all platforms, I added a post-install hook to run tegra-bootloader-update on the BUP bundled into the rootfs,
similar to the mechanism we currently implement for Mender on t186/t194 platforms.
Build integration
The RAUC project has a meta-rauc layer for OE/Yocto builds. This layer can
be added to the layer set without modifying your existing builds. Build modifications are triggered by adding rauc
to DISTRO_FEATURES, although by manual integration you could be more selective. The layer includes recipes for RAUC
itself, the hawkBit clients, and some bbappends for OE-Core recipes that need modification for RAUC.
To adapt meta-rauc to our builds, I added a dynamic layer under meta-tegra-support
to pull in the necessary recipe modifications not already covered by meta-rauc directly:
- Kernel config additions (plus one patch - see below)
- cboot and U-Boot modifications
- Modifications to the RAUC build itself for the on-device configuration file and installing the cboot interface script
The above provides the basics. In my distro config, I added rauc to DISTRO_FEATURES
and in my distro layer I added image recipes for building bundles from the demo
images we already have recipes for. I also generated signing keys which I simply
committed into the distro repository for testing purposes. meta-rauc provides a script for generating the keys easily, but
some work is needed to properly integrate the use of the private keys into a development/build/deployment workflow, and
I did not spend time on that.
Kernel issue requiring patch
The major issue I encountered with RAUC was installation failures on t210 devices due to verification failures of the bundle contents.
The root cause of those failures was an issue with the use of the hardware security engine on the t210s for SHA256 hashes, which is
enabled by default in our kernel. For some input, the engine produces different results than the kernel's built-in SHA256 algorithm.
I didn't spend enough time on the issue to determine whether it's a driver bug, a hardware bug, or possibly a bug in the dm-verity
code making incorrect assumptions about the underlying crypto engine, which just happen to work with the built-in SHA256
implementation. Instead, I simply patched out the SHA256 support in the tegra-se driver so that the software
implementation is used.
For any production use of RAUC on the t210 platforms, a proper fix would be needed. Note that the t186/t194 platforms have
a different hardware security engine that uses a different driver. I did not see this problem on those platforms.
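One possible way to look for inputs that trigger such a mismatch from userspace is to compare SHA256 digests from the kernel crypto API (via an AF_ALG socket) against a userspace implementation. The Python sketch below is only a starting point under several assumptions: the kernel must be built with CONFIG_CRYPTO_USER_API_HASH, the AF_ALG request must actually be served by the same hardware driver that dm-verity ends up using (not verified here), and the helper names are invented for this example.

```python
import hashlib
import socket
from typing import Callable, Iterable, List


def software_sha256(data: bytes) -> bytes:
    """Reference digest computed in userspace by hashlib."""
    return hashlib.sha256(data).digest()


def kernel_sha256(data: bytes) -> bytes:
    """Digest computed by the kernel crypto API through AF_ALG (Linux only).

    Intended for small, non-empty buffers that go out in a single send; larger
    inputs would need MSG_MORE handling, which is omitted here.
    """
    with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
        alg.bind(("hash", "sha256"))
        op, _ = alg.accept()
        with op:
            op.sendall(data)
            return op.recv(32)  # a SHA256 digest is 32 bytes


def find_mismatches(samples: Iterable[bytes],
                    candidate: Callable[[bytes], bytes] = kernel_sha256) -> List[int]:
    """Return the indexes of samples where candidate disagrees with hashlib."""
    return [i for i, sample in enumerate(samples)
            if candidate(sample) != software_sha256(sample)]
```

On a device, calling find_mismatches with a list of 4096-byte blocks read from a bundle image would be one way to narrow down problem inputs; the block size is a guess at what dm-verity hashes in this setup. An empty result does not rule out the problem if the AF_ALG path selects a different implementation.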
Performance
Due to the kernel issue mentioned above, which took a while to track down, I did not spend much time on performance. Overall, updates
weren't particularly fast using the ext4-formatted images, and it took a while to get used to the progress indicator that RAUC shows
quickly moving to the 90% mark, then staying at 90% for a long period of time while the image is installed. This is largely a UI
issue that could probably be fixed. Overall update time seemed comparable to what I've seen with Mender updates.
I did a few experiments with casync support, and found that updates weren't really that much faster, despite the fact that
it implements delta updates. This is likely due to the deltas being computed on the device during the update process. The
computation appears to be single-threaded, as one CPU would be pegged at 100% for an extended period of time during the update.
Concluding notes
Except for the kernel issue mentioned above, RAUC integration was relatively straightforward, and worked more or less as promised for simple manual updates. The documentation is thorough, and the interfaces are well-covered there.
The lack of built-in streaming updates is a concern for the OTA use case for Jetson platforms, since our rootfs images tend to be quite large. The casync support (which would get us streaming and delta updates) is experimental, and casync itself doesn't appear to be under active development, based on the number of long-standing issues and PRs on that repository.
With a bit more work, RAUC could also be used for recovery/rescue installations, either instead of or in addition to symmetric A/B updates.