OCaml Labs

(2012 - 2021)

I founded a research group called OCaml Labs at the University of Cambridge, with the goal of pushing OCaml and functional programming forward as a platform, making it a more effective tool for all users (including large-scale industrial deployments), while at the same time growing the appeal of the language, broadening its applicability and popularity.

Background

In my PhD work on Functional Internet around 2003-2007, I developed high-performance and reliable protocol implementations in OCaml. Subsequently, from 2010, I worked on Personal Containers to build high-assurance private data processing platforms. This research led me to really appreciate functional programming as a powerful approach to building robust software, and I got involved in the Commercial Users of Functional Programming workshop, first as a speaker and then as an organiser (see “Commercial Users of Functional Programming 2011 Workshop Report”).

It was around this time in 2011 that my work on Unikernels and MirageOS was starting to materialise into a real project, but the OCaml language that we wrote everything in didn't have a unified open source community. Instead, there were islands of developers all over the world: the core maintainers concentrated at Inria in France, academics teaching it in various universities, and some industrial shops like Jane Street, or my own experiences described in “Using Functional Programming within an Industrial Product Group: Perspectives and Perceptions”. Yaron Minsky and I put our heads together in Tokyo at ICFP 2011 to see if we could try something a little unique for the time: establishing a centre of excellence in functional programming that would focus on the open-source and community-building aspects of functional programming as well as traditional academic research.

Early Days (2012-2014)

In 2012, we launched the centre from the Cambridge Computer Lab (see “Announcing OCaml Labs”). Things moved very quickly indeed, as the group grew to around 6 full-time postdocs and engineers, with lots of interns coming through our doors. Our general strategy at this point was to understand the basic problems we were going to tackle, and so we started with a few concrete projects to bootstrap the ecosystem:

  • publishing “Real World OCaml: functional programming for the masses (1st Ed)” with O'Reilly, which sold lots of copies in the early days and created plenty of buzz for OCaml. It was quite fun attending author signings around the world and having lines of people queuing up for a signature!
  • I worked closely with Thomas Gazagnaire (then CTO at OCamlPro) who led the development of the first version of the opam package manager. Both of us were also establishing the MirageOS project at the time, and so we ended up bootstrapping a big chunk of the opam-repository for use by it. We also took a (in hindsight excellent) decision to use the nascent GitHub platform as the primary mechanism for managing packages instead of hosting a database. After a few releases in 2012 and then the 1.1 beta (“OPAM 1.1 beta available, with pretty colours”), the package manager rapidly established itself as the de facto standard for the OCaml ecosystem. I've been the chief maintainer of the opam-repository ever since then (with many wonderful co-maintainers who do much of the heavy lifting, of course!). As of 2021, there are over 20000 packages in the repository.

We also began organising community events, both online and offline.

There was enough activity in the early days that I managed to capture it in annual blog posts.

After 2014 though, things had grown to the point where it was just too difficult for me to keep up with the flurry of movement. We then aggregated into a "middle age" research project around 2015 with the following projects that would take the next few years.

The OCaml Platform

One of the main thrusts in OCaml Labs was to construct the tools to enable effective development workflows for OCaml usage at an industrial scale, while remaining maintainable with a small community that needed to migrate from existing workflows. This effort was dubbed the "OCaml Platform" and really picked up steam after our release of the opam package manager, since it began the process of unifying the OCaml community around a common package collection.

While much of the work was led from OCaml Labs, it's also been highly collaborative with other organisations and individuals in the community. And of course, 100% of the work was released as open source software under a liberal license. I've been giving annual talks since 2013 or so about the steady progress we've been making towards building, testing, documentation and package management for OCaml.

  • “Real World OCaml: functional programming for the masses (1st Ed)” was the book published by O'Reilly that explained how to use OCaml with the Core library.
  • My 2013 talk on “The OCaml Platform v0.1” first introduced the OCaml Platform just after opam was first released.
  • My 2014 talk on “The OCaml Platform v1.0” continued the steady adoption of opam within the OCaml community, to start bringing a standard package database across the different users.
  • My 2015 Platform talk then introduced continuous integration for opam, as well as the start of the central documentation efforts (which were finally completed in 2021 after some herculean efforts!).
  • By my 2017 Platform talk in Oxford, we had most of the OCaml community using opam and released opam 2.0, started contributing to the new jbuilder build tool from Jane Street, and began the shift from camlp4 to ppx and the development of the new odoc tool.
  • In my 2018 Platform talk in Missouri, we had helped evolve jbuilder into the Dune build system (now the build tool of choice in OCaml), and started to combine packaging and build into a cohesive platform. The key challenge so far had been to fill in gaps in functionality, and now we could begin to weave together the components we'd built.
  • My 2019 Platform talk in Berlin focussed on how workflows using all these tools would work, such as for package managers or application developers or end users.
  • My 2020 Platform talk saw the unveiling of the VSCode OCaml Platform plugin, which provided a seamless integration with the IDE to let all the workflows and tools from earlier years "just work" out of the box.
  • In 2021, we embarked on a huge mission to rebuild the ocaml.org online presence with a central documentation site that built 20000 packages with cross-referenced HTML documentation.

As you can see, it's quite a journey to build community-driven development tools. A key to our approach was to "leave no OCaml project behind", and we spent considerable effort ensuring that every step of the tooling evolution had a migration path for older OCaml projects. As a result, it's often still possible to compile 20-year-old OCaml code using the modern tooling.

Multicore OCaml

The other big research project we drove from OCaml Labs was the effort to bring multicore parallelism to OCaml. While this might seem straightforward, we quickly realised that the challenge was in preserving existing sequential performance while also allowing new code to take advantage of multicore CPUs.

The first talk we gave was in 2014 on “Multicore OCaml”. Little did we know how much work it would take to get this production worthy! After several years of hacking, we finally had several breakthroughs:

  • Any multicore-capable language needs a well-defined memory model, and we realised that none of the existing ones (e.g. in C++ or Java) were particularly satisfactory. Our PLDI paper on “Bounding Data Races in Space and Time” defined a sensible and novel memory model for OCaml that was predictable for developers.
  • Our garbage collector and runtime design won a distinguished paper award at ICFP 2020 for its systematic approach to the design and evaluation of several minor heap collectors, in “Retrofitting parallelism onto OCaml”.
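
To give a flavour of what "new code taking advantage of multicore CPUs" can look like, here is a minimal sketch. It assumes the `Domain` module as found in the OCaml 5 standard library (`Domain.spawn` and `Domain.join`); the helper names `sum_range` and `parallel_sum` are made up for this illustration and are not taken from any of the papers above.

```ocaml
(* Sum the elements of [a] in the half-open index range [lo, hi). *)
let sum_range (a : int array) (lo : int) (hi : int) : int =
  let acc = ref 0 in
  for i = lo to hi - 1 do
    acc := !acc + a.(i)
  done;
  !acc

(* Hypothetical helper: split the array in two, sum the left half on a
   freshly spawned domain and the right half on the current one.
   Assumes OCaml 5's Stdlib.Domain. *)
let parallel_sum (a : int array) : int =
  let n = Array.length a in
  let mid = n / 2 in
  let left = Domain.spawn (fun () -> sum_range a 0 mid) in
  let right = sum_range a mid n in
  (* Domain.join blocks until the spawned domain returns its result. *)
  Domain.join left + right
```

The array is only read, never written, by both domains, so the sketch does not depend on any subtle memory model behaviour; existing sequential code such as `sum_range` is reused unchanged.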

Algebraic Effects

While working on parallelism in OCaml with Leo White and Stephen Dolan, KC Sivaramakrishnan joined our group after completing his PhD and started us down the path of using algebraic effects to express concurrency in OCaml code.

Around 2020, I started publishing multicore monthlies on the OCaml discussion forum. This was because we had begun the journey to upstream our feature into the mainline OCaml compiler. As of 2021, this is going well and we are planning to incorporate domains-parallelism and runtime fibres into OCaml 5.0 in early 2022. The amount of work that we put into multicore has been way more than I expected at the outset of the project, but the results are deeply satisfying. I'm finding coding with effects in a mainstream PL like OCaml to be really fun, and anticipate this giving a big boost to Unikernels in MirageOS, which are struggling somewhat under the weight of over-functorisation for portability.
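
For readers who haven't seen effects used for concurrency, the following is a minimal sketch of a cooperative round-robin scheduler. It assumes the `Effect` module as it appears in the OCaml 5 standard library, which may differ from the API on the multicore branch at the time of writing. The `Yield` and `Fork` effects and the `run` and `demo` functions are invented for this example.

```ocaml
open Effect.Deep

(* Hypothetical effects for this sketch; they are declared here and are
   not part of any library. *)
type _ Effect.t += Yield : unit Effect.t
type _ Effect.t += Fork : (unit -> unit) -> unit Effect.t

(* A round-robin scheduler: the handler decides what Yield and Fork mean. *)
let run (main : unit -> unit) : unit =
  let ready : (unit -> unit) Queue.t = Queue.create () in
  (* Resume the next suspended task, if there is one. *)
  let next () = if not (Queue.is_empty ready) then (Queue.pop ready) () in
  let rec spawn (f : unit -> unit) : unit =
    match_with f ()
      { retc = (fun () -> next ());
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield ->
            (* Park the current task and run whoever is next. *)
            Some (fun (k : (a, unit) continuation) ->
              Queue.push (fun () -> continue k ()) ready;
              next ())
          | Fork g ->
            (* Park the parent and start the child immediately. *)
            Some (fun (k : (a, unit) continuation) ->
              Queue.push (fun () -> continue k ()) ready;
              spawn g)
          | _ -> None) }
  in
  spawn main

(* Two tasks that each log, yield, and log again. *)
let demo () : string list =
  let log = ref [] in
  let say s = log := s :: !log in
  let task name () =
    say (name ^ "1");
    Effect.perform Yield;
    say (name ^ "2")
  in
  run (fun () ->
    Effect.perform (Fork (task "a"));
    Effect.perform (Fork (task "b")));
  List.rev !log
```

Under this particular scheduler, `demo ()` returns `["a1"; "b1"; "a2"; "b2"]`: the tasks are written in plain direct style, and the interleaving comes entirely from the handler rather than from monadic plumbing or functors.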

OCaml Labs (2021-present)

The OCaml Labs research project at the University of Cambridge finally came to a happy end in 2021, after almost ten years. After the first decade of fundamental research and early engineering, the maintainership and stewarding of the resulting code has only picked up pace as the OCaml userbase grows. There are now three commercial companies who have taken over the work from the University, all run by research staff originally in the Computer Lab group (Gemma Gordon, KC Sivaramakrishnan and Thomas Gazagnaire).

There's really exciting work happening there -- finishing up the upstreaming of the multicore OCaml feature into mainline OCaml (due in 2022), making unikernels and MirageOS ever more practical and robust to deploy, and shipping end-to-end Windows support in the OCaml toolchain.

Related publications

Commercial Users of Functional Programming 2011 Workshop Report
Anil Madhavapeddy, Yaron Minsky and Marius Eriksen.
Using Functional Programming within an Industrial Product Group: Perspectives and Perceptions
David Scott, Richard Sharp, Thomas Gazagnaire and Anil Madhavapeddy.
Real World OCaml: functional programming for the masses (1st Ed)
Yaron Minsky, Anil Madhavapeddy and Jason Hickey.
Book (510 pages), O'Reilly Media Nov 2013 (1st Edition).
Commercial Users of Functional Programming 2013 Scribe's Report
Marius Eriksen, Michael Sperber and Anil Madhavapeddy.
The OCaml Platform v0.1
Anil Madhavapeddy, Amir Chaudhry, Thomas Gazagnaire, David Sheets, Philippe Wang, Leo White and Jeremy Yallop.
Workshop paper in the 2nd ACM OCaml Users and Developers Workshop on Sep 2013 at Boston, USA.
The OCaml Platform v1.0
Anil Madhavapeddy, Amir Chaudhry, Jeremie Dimino, Thomas Gazagnaire, Louis Gesbert, Thomas Leonard, David Sheets, Mark Shinwell, Leo White and Jeremy Yallop.
Workshop paper in the 4th ACM OCaml Users and Developers Workshop on Sep 2014 at Gothenburg, Sweden.
Multicore OCaml
Stephen Dolan, Leo White and Anil Madhavapeddy.
Workshop paper in the 4th ACM OCaml Users and Developers Workshop on Sep 2014 at Gothenburg, Sweden.
Bounding Data Races in Space and Time
Stephen Dolan, KC Sivaramakrishnan and Anil Madhavapeddy.
Retrofitting parallelism onto OCaml
KC Sivaramakrishnan, Stephen Dolan, Leo White, Sadiq Jaffer, Tom Kelly, Anmol Sahoo, Sudha Parimala, Atul Dhiman and Anil Madhavapeddy.
Conference paper in the 25th ACM SIGPLAN International Conference on Functional Programming (ICFP20) on Aug 2020. Awarded distinguished paper.
Effectively Tackling the Awkward Squad
Stephen Dolan, Spiros Eliopoulos, Daniel Hillerström, Anil Madhavapeddy, KC Sivaramakrishnan and Leo White.
Workshop paper in the ACM ML Family Workshop 2017 on Jun 2017 at Oxford, United Kingdom.
Concurrent System Programming with Effect Handlers
Stephen Dolan, Spiros Eliopoulos, Daniel Hillerström, Anil Madhavapeddy, KC Sivaramakrishnan and Leo White.
Workshop paper in the 18th Symposium on Trends in Functional Programming on Jun 2017 at Canterbury, United Kingdom.
Retrofitting effect handlers onto OCaml
KC Sivaramakrishnan, Stephen Dolan, Leo White, Tom Kelly, Sadiq Jaffer and Anil Madhavapeddy.

Related projects

2003 - 2008 Functional Internet
2009 - 2015 Personal Containers
2010 - 2019 Unikernels