Summary. I proposed the concept of "unikernels" -- single-purpose appliances that are compile-time specialised into standalone bootable kernels, and sealed against modification when deployed to a cloud platform. In return they offer significantly smaller image sizes, improved efficiency and security, and reduced operational costs. I also co-founded the MirageOS project, which is one of the first complete unikernel frameworks, and integrated these unikernels into the Docker for Desktop apps that are used by hundreds of millions of users daily.
While working on Personal Containers in late 2008, I had a need to run lots of distributed edge nodes holding personal data. The state of computer security is generally a disaster when it comes to leaving software unupgraded for even a few months, so building robust infrastructure that normal people could use was proving quite difficult. Meanwhile, my PhD research in building Functional Internet Services had constructed really viable prototypes of network protocols written in pure OCaml, and I'd previously used OCaml industrially in the Xen Hypervisor to write lots of system management code.
All of these ideas came crashing together in late 2009 and I decided to have a go at putting together a complete OCaml-based operating system. The adventure began with grabbing the Xen mini-os and the C lwIP stack to provide networking and sqlite for persistent storage, and hacking for a few months until everything booted and was reasonably stable. I then convinced Thomas Gazagnaire (then at Inria) to help me with storage integration with OCaml in Dynamics for ML using Meta-Programming and we had a remarkably good prototype that we presented in Turning Down the LAMP: Software Specialisation for the Cloud.
I wrote up my early thoughts in Multiscale not multicore: efficient heterogeneous cloud computing to describe this emerging idea of heterogeneous cloud and edge computing combined into a single programming model. After realising that the prototype worked well, I started steadily removing C bindings (like lwIP) and replacing them with pure OCaml code all the way down to the Xen VM interface (e.g. mirage-tcpip). These early heady days saw lots of prototypes and experimentation.
One of the earliest decisions I made in MirageOS was to self-host as soon as possible. I registered openmirage.org in late 2009, and (joined by @mort and @djs55) we had a Xen-based website running in short order in 2010 (now mirage-www). A big boost to the project was winning a grant from the Verisign Infrastructure Awards, which was the first external validation that this thing might be of interest to other people. As my OCaml Labs group grew in the University, more intrepid hackers joined the group and started making MirageOS work properly.
A year of intense work in 2012 turned the prototype into a fully fleshed-out paper, which got soundly rejected by the OSDI review committee as we hadn't identified what the core systems research contribution was (as opposed to the impressive programming work, which they acknowledged in the rejection). I'd just gone to visit Timothy Roscoe's group at ETH, where they had been working on the Barrelfish multikernel OS, and the answer came right to me while in the pub with Jon Crowcroft. What MirageOS represented was a revival of the concept of library operating systems, but with the additional twist that it specialised the compilation into single-user mode. Thus, I settled on the term "unikernels" to describe this idea, rewrote the paper, and duly published it as Unikernels: library operating systems for the cloud.
Publishing a major research paper in ASPLOS led to further momentum and interest.
MirageOS also gave us ideas for other top systems research, such as the filesystem verification ideas in SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems (which I still intend to use for a proper POSIX compatibility layer on top of Irmin at some point), and FLICK: Developing and Running Application-Specific Network Services (to build domain-specific data processing platforms, something that I'm now working on in 2021 in Trusted Carbon Credits).
By this point, MirageOS was also a thriving open source community with regular IRC meetings and the beginning of hack retreats. There were several organisations using it, and the overall OCaml community started using some of our protocol implementations independently of the unikernel ideas. For example, the cohttp library was something I rapidly hacked together for the ASPLOS deadline, but its Unix/Lwt/Async backends are now used in quite a few major systems (including within Jane Street, no less).
We had to deal with all this growth, as a university isn't the easiest place to have a very large group. In 2015, Balraj Singh (who had made huge contributions to the Mirage TCP/IP stack), Thomas Gazagnaire and I founded Unikernel Systems along with Jeremy Yallop, Thomas Leonard, Magnus Skjegstad, Mindy Preston, Justin Cormack, David Sheets, Amir Chaudhry, and Dave Scott. After a few months pitching to West Coast VCs (including fun chats with the likes of Jerry Yang), Peter Fenton from Benchmark convinced us to meet Solomon Hykes over at Docker. This conversation changed the course of our careers, as he shared his vision for the future of containerisation and how unikernels could fit in there.
A short set of negotiations later, Unikernel Systems was acquired by Docker in 2016. We spent a very fun couple of years commercialising the technology. Our work ended up shipping as Docker for Desktop, which remains one of the most popular developer tools in the world, and I describe its architecture in this talk.
Our startup aside, the core development of MirageOS continued to be nicely distributed across several spinouts.
The wider industry also saw a number of interesting spinouts, as many other communities latched onto the ideas of unikernels and began their own language-specific and domain-specific versions. I joined the advisory boards of IncludeOS (now sadly defunct) and Zededa (now thankfully going from strength to strength in edge computing) to help guide strategy and adoption outside of just MirageOS. Dr Pierre Olivier maintains a great list of unikernel papers where you can see the diversity and interest in unikernels. One of the most exciting implementations of a C-based unikernel can be found in Unikraft.
As for my interest in unikernels moving forward? My heart always remains in finding the intersection of safety and performance, which means I mostly pay attention to language-based approaches. MirageOS continues to thrive (particularly with the effect system being integrated into OCaml in 2022, which will really change the way we develop OCaml code for embedded systems). Since 2020, I've been investigating the application of decentralised information flow control (DIFC) to embedded infrastructure, for example via Snape: The Dark Art of Handling Heterogeneous Enclaves.
Real World OCaml: Functional Programming for the Masses
Banyan: Coordination-Free Distributed Transactions over Mergeable Types
MirageOS 4: the dawn of practical build systems for exotic targets
Programming Unikernels in the Large via Functor Driven Development
DaLi: Database as a Library
FLICK: Developing and Running Application-Specific Network Services
Declarative Foreign Function Binding Through Generic Programming
Personal Data: Thinking Inside the Box
SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems
Not-Quite-So-Broken TLS
Jitsu: Just-In-Time Summoning of Unikernels
Mergeable persistent data structures
Raft Refloated: Do We Have Consensus?
Irminsule: a branch-consistent distributed library database
Using Dust Clouds to Enhance Anonymous Communication
Trevi: watering down storage hotspots with cool fountain codes
Unikernels: Rise of the Virtual Library Operating System
Lost in the Edge: Finding Your Way with DNSSEC Signposts
Unikernels: library operating systems for the cloud
Evolving TCP: how hard can it be?
Exploring Compartmentalisation Hypotheses with SOAAP
Programming the Xen cloud using OCaml
Cost, Performance & Flexibility in OpenFlow: Pick three
Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting
The case for reconfigurable I/O channels
Dynamics for ML using Meta-Programming
Reconfigurable Data Processing for Clouds
CIEL: A universal execution engine for distributed data-flow computing
Unclouded vision
Turning Down the LAMP: Software Specialisation for the Cloud
Multiscale not multicore: efficient heterogeneous cloud computing