Mostly when I talk about #CHERIoT, I'm talking about the security, because that's the biggest and most obvious feature.
But since I've ranted a bit recently about Free Software things that don't empower users, I wanted to take a minute to write about how CHERIoT RTOS is built on principles that I think can provide maintainable systems.
The CHERIoT software is designed around compartments, which are somewhat like MULTICS shared libraries. They are isolated things that can own some mutable state and expose entry points, which are functions. Code in one compartment can invoke code in another compartment by simply calling a function that the other compartment exposes. This is a security domain transition. The callee has access to objects that were passed by reference, but there is no implicit sharing (and the hardware lets you share read-only views of data structures, share something only for the duration of the call, and so on).
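To make that concrete, here's roughly what it looks like in source, with a deliberately minimal, made-up 'counter' compartment (the names and file layout are just for illustration). The annotation on the declaration is how an entry point gets tied to the compartment that exports it; the compiler, linker, and loader do the rest.

```c++
// counter.h -- shared interface header (hypothetical example).
// The attribute names the compartment that exports this entry point.
#include <compartment.h>

int __cheri_compartment("counter") counter_increment(void);

// counter.cc -- built into the 'counter' compartment.
#include "counter.h"

namespace
{
	// Mutable state owned by this compartment; no other compartment can
	// reach it unless we explicitly pass a pointer to it.
	int value;
}

int counter_increment(void)
{
	return ++value;
}

// consumer.cc -- built into some other compartment.  The call below is a
// security-domain transition, but in the source it's just a function call.
#include "counter.h"

void do_work(void)
{
	int next = counter_increment();
	(void)next; // use the returned value; nothing else is shared implicitly
}
```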
This model means that a lot of the things that you run on the platform are tiny. That includes the RTOS itself. There are four core bits of the RTOS:
- The switcher, which is the equivalent of the kernel, and is around 350 instructions. This is the most privileged part of the system and implements the fundamental bits of the programming model that are too complex to put in hardware. It is carefully co-designed with the hardware architecture though, and we expect very few users will ever modify it. We're working to formally verify it.
- The scheduler is trusted for availability, but not for confidentiality or integrity. It is just another compartment. We provide a simple fixed-priority scheduler with priority-inheriting futexes. We think it's a nice design for embedded systems, but if you want to build your own scheduler you really just need to implement two things: the function that the switcher calls on an interrupt ('here is the thread that was interrupted; go and talk to the interrupt controller and figure out which thread should run next') and the futex APIs (there's a sketch of the latter after this list).
- The heap allocator. This owns the region of memory used for the heap and works with the hardware to enforce spatial and temporal safety. We provide an implementation of this in C++; some folks at UBC are working on a reimplementation in Rust using Verus. If theirs is slower or bigger, we'll support both; if it's as good then we'll probably replace ours.
- The loader, which runs once at boot, with full privileges, to set up the capabilities that define every compartment's boundaries, and then never runs again.
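To give a feel for how little user code needs from the scheduler, here's a rough sketch of the kind of primitive that gets built on top of the futex calls. The two stand-in functions below model the wait and wake entry points (the real ones in futex.h also take a timeout, and the names here are invented for the sketch); all the scheduler ever sees is an address and an expected value, never the data being protected.

```c++
// Sketch: a one-shot event built on futex-style wait and wake.
#include <atomic>
#include <cstdint>

// Stand-ins for the scheduler's futex entry points.  The stub bodies just
// spin so that the sketch is self-contained and compiles anywhere.
static int futex_wait_stub(const std::atomic<uint32_t> *address, uint32_t expected)
{
	// Real version: sleep this thread until *address no longer holds
	// `expected`, or until another thread wakes this address.
	while (address->load(std::memory_order_acquire) == expected) {}
	return 0;
}

static int futex_wake_stub(std::atomic<uint32_t> *address, uint32_t count)
{
	// Real version: wake up to `count` threads blocked on this address.
	(void)address;
	(void)count;
	return 0;
}

// wait() blocks until set() has been called, from any thread.
struct Event
{
	std::atomic<uint32_t> word{0};

	void set()
	{
		word.store(1, std::memory_order_release);
		futex_wake_stub(&word, UINT32_MAX); // wake every waiter
	}

	void wait()
	{
		// The futex contract is only 'you may have been woken', so loop
		// and re-check the condition after every wake-up.
		while (word.load(std::memory_order_acquire) == 0)
		{
			futex_wait_stub(&word, 0);
		}
	}
};
```

That small surface (wait, wake, and the 'pick the next thread' hook) is the whole contract a replacement scheduler has to honour, which is what makes swapping it out plausible.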
The RTOS also contains a bunch of optional shared libraries and makes it easy to wrap other components. For example, we have a network stack that puts BearSSL, the FreeRTOS TCP/IP stack, and the FreeRTOS MQTT libraries in their own compartments, and adds a control-plane compartment (which manages authorisation for socket creation and firewall control), a firewall compartment, and a DNS resolver compartment.
Most of these are small. If you want to replace BearSSL with WolfSSL or something similar, that should be easy. If you don't want to use it at all, that's fine too: you can replace the entire network stack with something different if you want to.
The goal is to have a load of building blocks that device vendors and end users can easily reuse if they want to, but that they can easily replace with their own things if they don't like ours.
We built an auditing framework that makes it easy to add CI-time or code-signing-time checks to enforce rules like 'only the firewall can talk to the network interface device directly' or 'only the TLS compartment may call the TCP send and receive functions'. We did this to support the conflicting requirements of proposed right-to-repair legislation (which should give end users the right to modify the software on their device) and of safety regulations (which may restrict the ability to modify things like the code talking to software-defined radio, or the safety-critical parts of a medical device). This enables people who build on the platform to create devices that empower their end users, even in domains where allowing replacement of the firmware may be illegal.