anil madhavapeddy // anil.recoil.org

Reviewing the second year of OCaml Labs in 2014

02 April 2015   |   Anil Madhavapeddy   |   tags: ocamllabs,ocaml   |   post syndicated from OCaml Labs

The OCaml Labs initiative within the Cambridge Computer Laboratory is now just over two years old, and it is time for an update about our activities since the previous updates at the end of 2013 and 2012.

The theme of our group was not to be pure research, but rather a hybrid group that takes on some of the load of day-to-day OCaml maintenance from INRIA, as well as helping to grow the wider community and meeting our own research agendas around topics such as unikernels. To this end, all of our projects have been highly collaborative, often involving colleagues from OCamlPro, INRIA, Jane Street, Lexifi and Citrix.

This post covers our progress in tooling, the compiler and language, community efforts, research projects and concludes with our priorities for 2015.


At the start of 2014, we had just helped to release OPAM 1.1.1 with our colleagues at OCamlPro, and serious OCaml users had just started moving over to using it.

Our overall goal at OCaml Labs is to deliver a modular set of development tools around OCaml that we dub the OCaml Platform. The remainder of 2014 was thus spent polishing this nascent OPAM release into a solid base (both as a command-line tool and as a library) that we could use as the basis for documentation, testing and build infrastructure, all the while making sure that bigger OCaml projects continued to migrate over to it. Things have been busy; here are the highlights of this effort.


The central OPAM repository that contains the package descriptions has grown tremendously in 2014, with over 280 contributors committing almost 10000 changesets across 3800 pull requests on GitHub. The front line of incoming testing has been continuous integration by the wonderful Travis CI, who also granted us access to their experimental MacOS X build pool. The OPAM package team also expanded to give David Sheets, Jeremy Yallop, Peter Zotov and Damien Doligez commit rights, and they have all been busily triaging new packages as they come in.

Several large projects such as Xapi, Ocsigen and our own MirageOS switched over to using OPAM for day-to-day development, as well as prolific individual developers such as Daniel Buenzli and Markus Mottl. Jane Street continued to send regular monthly updates of their Core/Async suite, and releases appeared from the Facebook open-source team as well (who develop Hack, Flow and Pfff in OCaml).

Number of unique contributors to the central OPAM package repository.

Total number of unique packages (including multiple versions of the same package).

Total packages with multiple versions coalesced so you can see new package growth.

We used feedback from the users to smooth away many of the rough edges, with:

These changes were all incorporated into OPAM 1.2, along with backwards compatibility shims to keep the old 1.1 metadata format working until the migration is complete. The 1.2.x series has been a solid and usable development manager, and last week’s release of OPAM 1.2.1 has further polished the core scripting engine.

Platform Blog

One of the more notable developments during 2014 was the adoption of OPAM further up the ecosystem by the Coq theorem prover. This broadening of the community prompted us to create an official OPAM blog to give us a central place for news and tips, and we’ve had posts about XenServer developments, the Merlin IDE tool and the modern UTop interactive REPL. If you are using OPAM in an interesting or production capacity, please do get in touch so that we can work with you to write about it for the wider community.

The goal of the blog is also to start bringing together the various components that form the OCaml Platform. These are designed to be modular tools (so that you can pick and choose which ones are necessary for your particular use of OCaml). There are more details available from the OCaml Workshop presentation at ICFP 2014 (abstract, slides, video).

Onboarding New Users

OPAM has also been adopted now by several big universities (including us at Cambridge!) for undergraduate and graduate Computer Science courses. The demands increased for an out-of-the-box solution that makes it as easy as possible for new users to get started with minimum hassle. We created a dedicated teaching list to aid collaboration and a list of teaching resources on ocaml.org, and supported several initiatives (in collaboration with Louis Gesbert at OCamlPro, as usual with OPAM development).

The easiest way to make things “just work” is via regular binary builds of the latest releases of OCaml and OPAM on Debian, Ubuntu, CentOS and Fedora, via Ubuntu PPAs and the OpenSUSE Build Service repositories. Our industrial collaborators from Citrix, Jon Ludlam and Dave Scott, began an upstreaming initiative to Fedora and sponsored the creation of a CentOS SIG to ensure that binary packages remain up-to-date. We also help the hardworking packagers on MacOS X, Debian, FreeBSD, NetBSD and OpenBSD where possible to ensure that binary builds are well rounded out. Richard Mortier also assembled Vagrant boxes that contain OCaml, for use with VirtualBox.

Louis cooks us dinner in Nice at our OPAM developers summit.

Within OPAM itself, we applied polish to the handling of external dependencies to automate checking that the system libraries required by OPAM are present. Two emerging tools that should help further in 2015 are the opam-user-setup and OPAM-in-a-box plugins that automate first-time configuration. These last two are primarily developed at OCamlPro, with design input and support from OCaml Labs.

We do have a lot of work left to do with making the new user experience really seamless, and help is very welcome from anyone who is interested. It often helps to get the perspective of a newcomer to find out where the stumbling blocks are, and we value any such advice. Just mail opam-devel@lists.ocaml.org with your thoughts, or create an issue on how we can improve. A particularly good example of such an initiative was started by Jordan Walke, who prototyped CommonML with a NodeJS-style development workflow, and wrote up his design document for the mailing list. (Your questions or ideas do not need to be as well developed as Jordan’s prototype!)

Testing Packages

The public Travis CI testing does come with some limitations, since it only checks that the latest package sets install, but not if any transitive dependencies fail due to interface changes. It also doesn’t test all the optional dependency combinations due to the 50 minute time limit.

We expanded the OPAM repository testing in several ways to get around this:

Language Evolution

This ability to do unattended builds of the package repository has also improved the decision making process within the core compiler team. Since we now have a large (3000+ package) corpus of OCaml code, it became a regular occurrence in the 4.02 development cycle to “ask OPAM” whether a particular feature or new syntax would break any existing code. This in turn provides an incentive for commercial users to provide representative samples of their code; for instance, the Jane Street Core releases in OPAM (with their very modular style) act as an open-source canary without needing access to any closed source code.

One good example in 2014 was the decoupling of the Camlp4 macro preprocessor from the main OCaml distribution. Since Camlp4 has been used for over a decade and there are some very commonly used syntax extensions such as type_conv, a simple removal would break a lot of packages. We used OPAM to perform a gradual movement that most users hopefully never noticed by the time OCaml 4.02 was released. First, we added a dummy package in OPAM for earlier versions of the compiler that had Camlp4 built-in, and then used the OPAM constraint engine to compile it as an external tool for the newer compiler revisions. Then we just had to triage the bulk build logs to find build failures from packages that were missing a Camlp4 dependency, and add them to the package metadata.

GitHub Integration

An interesting comment from Vincent Hanquez about OPAM is that “OCaml’s OPAM is a post-GitHub design”. This is very true, as much of the workflow for pinning git:// URLs emerged out of being early adopters of GitHub for hosting MirageOS. OCaml Labs supported two pieces of infrastructure integration around GitHub in 2014:

Codoc Documentation

Leo White, David Sheets, Amir Chaudhry and Thomas Gazagnaire led the charge to build a modern documentation generator for OCaml, and published an alpha version of codoc 0.2.0 after a lot of work throughout 2014. In the 2014 OCaml workshop presentation (abstract, slides, video), we mentioned the “module wall” for documentation and this attempts to fix it. To try it out, simply follow the directions in the README on that repository, or browse some samples of the current, default output of the tool. Please do bear in mind codoc and its constituent libraries are still under heavy development and are not feature complete, but we’re gathering feedback from early adopters.

codoc’s aim is to provide a widely useful set of tools for generating OCaml documentation. In particular, we are striving to:

  1. Cover all of OCaml’s language features
  2. Provide accurate name resolution and linking
  3. Support cross-linking between different packages
  4. Expose interfaces to the components we’ve used to build codoc
  5. Provide a magic-free command-line interface to the tool itself
  6. Reduce external dependencies and default integration with other tools

We haven’t yet achieved all of these at all levels of our tool stack but are getting close, and the patches are all under discussion for integration into the mainstream OCaml compiler. codoc 0.2.0 is usable today (if a little rough in some areas like default CSS), and there is a blog post that outlines the architecture of the new system to make it easier to understand the design decisions that went into it.

Community Governance

As the amount of infrastructure built around the ocaml.org domain grows (e.g. mailing lists, file hosting, bulk building), it is important to establish a governance framework to ensure that it is being used as best needed by the wider OCaml community.

Amir Chaudhry took a good look at how other language communities organise themselves, and began putting together a succinct governance framework to capture how the community around ocaml.org operates, and how to quickly resolve any conflicts that may arise in the future. He took care to ensure it had a well-defined scope, is simple and self-contained, and (crucially) documents the current reality. The result of this work is circulating privately through all the existing volunteers for a first round of feedback, and will go live in the next few months as a living document that explains how our community operates.


One consequence of OCaml’s age (close to twenty years old now) is that the tools built around the compiler have evolved fairly independently. While OPAM now handles the high-level package management, there is quite a complex ecosystem of other components that are difficult for new users to get to grips with: OASIS, ocamlfind, ocamlbuild, and Merlin to name a few. Each of these components (while individually stable) has its own metadata and namespace formats, further compounding the lack of cohesion of the tools.

Thomas Gazagnaire and Daniel Buenzli embarked on an effort to build an eDSL that unifies OCaml package descriptions, with the short-term aim of generating the support files required by the various support tools, and the long-term goal of being the integration point for the build, test and documentation generation lifecycle of an OCaml/OPAM package. This prototype, dubbed Assemblage, has gone through several iterations and design discussions over the summer of 2014. Daniel has since been splitting out portions of it into the Bos OS interaction library.

Assemblage is not released officially yet, but we are committed to resuming work on it this summer when Daniel visits again, with the intention of unifying much of our workflow through this tool. If you are interested in build and packaging systems, now is the time to make your opinion known!

Core Compiler

We also spent time in 2014 working on the core OCaml language and compiler, with our work primarily led by Jeremy Yallop and Leo White. These efforts were not looking to make any radical changes in the core language; instead, we generally opted for evolutionary changes that either polish rough edges in the language (such as open type and handler cases), or new features that fit into the ML style of building programs.

New Features in 4.02.0

The OCaml 4.02 series was primarily developed and released in 2014. The ChangeLog generated much user excitement, and we were also pleased to have contributed several language improvements.

Handler Cases and exceptional syntax

OCaml’s try and match constructs are good at dealing with exceptions and values respectively, but neither construct can handle both values and exceptions. Jeremy Yallop investigated how to handle success more elegantly, and a unified syntax emerged. A simple example is that of a stream iterator that uses exceptions for control flow:

let rec iter_stream f s =
  match (try Some (MyStream.get s) with End_of_stream -> None) with
  | None -> ()
  | Some (x, s') -> f x; iter_stream f s'

This code is not only verbose, but it also has to allocate an option value to ensure that the iter_stream call remains tail recursive. The new syntax in OCaml 4.02 allows the above to be rewritten succinctly:

let rec iter_stream f s =
  match MyStream.get s with
  | (x, s') -> f x; iter_stream f s'
  | exception End_of_stream -> ()

Read more about the background of this feature in Jeremy’s blog post, the associated discussion in the upstream Mantis bug, and the final manual page in the OCaml 4.02 release. For an example of its use in a real library, see the Jane Street usage in the s-expression handling library (which they use widely to reify arbitrary OCaml values and exceptions).
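To try the handler-case syntax without any external library, here is a minimal self-contained sketch. The MyStream module below is a hypothetical stand-in (a stream modelled as a plain list) and is not the module used in the original example; only the shape of iter_stream is taken from the text above.

```ocaml
(* Hypothetical stand-in for the MyStream module used in the text:
   a stream is just a list, and [get] raises at the end. *)
exception End_of_stream

module MyStream = struct
  let get = function
    | [] -> raise End_of_stream
    | x :: rest -> (x, rest)
end

(* The recursive call stays in tail position because the [exception]
   case belongs to the same [match] rather than to an enclosing [try]. *)
let rec iter_stream f s =
  match MyStream.get s with
  | (x, s') -> f x; iter_stream f s'
  | exception End_of_stream -> ()

(* Collect the visited elements so that the traversal order is visible. *)
let collected = ref []
let () = iter_stream (fun x -> collected := x :: !collected) [1; 2; 3]
let () = List.iter (Printf.printf "%d ") (List.rev !collected)
```

No option value is allocated per element here, which is the saving the new syntax is meant to provide.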

Open Extensible Types

A long-standing trick to build universal containers in OCaml has been to encode them using the exception exn type. There is a similar concept of a universal type in Standard ML, and they were described in the “Open Data Types and Open Functions” paper by Andres Löh and Ralf Hinze in 2006.

Leo White designed, implemented and upstreamed support for extensible variant types in OCaml 4.02. Extensible variant types are variant types that can be extended with new variant constructors. They can be defined as follows:

type attr = ..

type attr += Str of string

type attr +=
  | Int of int
  | Float of float

Pattern matching on an extensible variant type requires a default case to handle unknown variant constructors, just as is required for pattern matching on exceptions (extensible types use the exception memory representation at runtime).
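As a small illustration of why the default case is needed, the sketch below reuses the attr type from above and adds a constructor after the matching function has been written. The names to_string and Bool are made up for this example.

```ocaml
type attr = ..

type attr += Str of string

type attr +=
  | Int of int
  | Float of float

(* The wildcard case is mandatory in practice: new constructors may be
   added to [attr] anywhere, including after this function is defined. *)
let to_string = function
  | Str s -> s
  | Int i -> string_of_int i
  | Float f -> string_of_float f
  | _ -> "<unknown>"

(* A later extension that [to_string] knows nothing about. *)
type attr += Bool of bool

let () = print_endline (to_string (Int 3))
let () = print_endline (to_string (Bool true))
```

The second call falls through to the wildcard, since Bool did not exist when to_string was compiled.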

With this feature added, the OCaml exn type simply becomes a special case of open extensible types. Exception constructors can be declared using the type extension syntax:

    type exn += Exc of int

You can read more about the discussion behind open extensible types in the upstream Mantis bug. If you’d like to see another example of their use, they have been adopted by the latest releases of the Jane Street Core libraries in the Type_equal module.
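A quick way to check that the two declaration forms are interchangeable is to raise and catch a constructor declared with each. This is only a sketch; Other and describe are hypothetical names introduced for the example.

```ocaml
(* Declared with the type extension syntax... *)
type exn += Exc of int

(* ...and with the traditional exception syntax. *)
exception Other of string

(* Both constructors inhabit the same [exn] type. *)
let describe = function
  | Exc n -> "Exc " ^ string_of_int n
  | Other s -> "Other " ^ s
  | _ -> "unknown"

(* A constructor added via [type exn +=] can be raised like any exception. *)
let caught = try raise (Exc 42) with e -> describe e

let () = print_endline caught
```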

Modular Implicits

A common criticism of OCaml is its lack of support for ad-hoc polymorphism. The classic example of this is OCaml’s separate addition operators for integers (+) and floating-point numbers (+.). Another example is the need for type-specific printing functions (print_int, print_string, etc.) rather than a single print function which works across multiple types.

Taking inspiration from Scala’s implicits and Modular Type Classes by Dreyer et al., Leo White designed a system for ad-hoc polymorphism in OCaml based on using modules as type-directed implicit parameters. The design not only supports implicit modules, but also implicit functors (that is, modules parameterised by other module types) to permit the expression of generic modular implicits in exactly the same way that functors are used to build abstract data structures.
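For comparison, the following sketch shows the explicit style that is already possible in standard OCaml with first-class modules, where the module argument is written out by hand at every call. It does not use the modular implicits syntax itself; SHOW, Show_int, Show_list and show are hypothetical names chosen for illustration. Roughly speaking, the proposal aims to let the compiler select such module arguments from the types.

```ocaml
(* A signature describing "types that can be shown". *)
module type SHOW = sig
  type t
  val show : t -> string
end

module Show_int = struct
  type t = int
  let show = string_of_int
end

(* A functor plays the role of a generic instance: lists are showable
   whenever their elements are. *)
module Show_list (S : SHOW) = struct
  type t = S.t list
  let show xs = "[" ^ String.concat "; " (List.map S.show xs) ^ "]"
end

(* The module is an ordinary, explicit argument here. *)
let show (type a) (module S : SHOW with type t = a) (x : a) = S.show x

module Show_int_list = Show_list (Show_int)

let () = print_endline (show (module Show_int) 5)
let () = print_endline (show (module Show_int_list) [1; 2; 3])
```

Because every module is named at the call site, there is never any doubt about which show is being used; that is the property the feedback quoted later in this section is concerned with preserving.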

Frederic Bour joined us as a summer intern and dove straight into the implementation, resulting in an online demo and ML Workshop presentation (abstract, video and paper). Another innovation in how we’ve been trialling this feature is the use of Andy Ray’s IOCamlJS to publish an interactive, online notebook that is fully hosted in the browser. You can follow the examples of modular implicits online, or try them out on your own computer via an OPAM switch:

opam switch 4.02.0+modular-implicits
eval `opam config env`
opam install utop 

Some of the early feedback on modular implicits from industrial users was interesting. Jane Street commented that although this would be a big usability leap, it would be dangerous to lose control over exactly what goes into the implicit environment (i.e. the programmer should always know what (a + b) represents by locally reasoning about the code). The current design thus follows the ML discipline of maintaining explicit control over the namespace, with any ambiguities in resolving an implicit module type resulting in a type error.


In addition to ad-hoc polymorphism, support for parallel execution on multicore CPUs is undoubtedly the most common feature request for OCaml. This has been high on our list after improving tooling support, and Stephen Dolan and Leo White made solid progress in 2014 on the core runtime plumbing required.

Stephen initially added thread-local support to the OCaml compiler. This design avoided the need to make the entire OCaml runtime preemptive (and thus a huge patch) by allocating thread-local state per core.

We are now deep into the design and implementation of the programming abstractions built over these low-level primitives. One exciting aspect of our implementation is that much of the scheduling logic for multicore OCaml can be written in (single-threaded) OCaml, making the design very flexible with respect to heterogeneous hardware and variable IPC performance.

To get feedback on the overall design of multicore OCaml, we presented at OCaml 2014 (slides, video and abstract), and Stephen visited INRIA to consult with the development team and Arthur Chargueraud (the author of PASL). Towards the end of the year, KC Sivaramakrishnan finished his PhD studies at Purdue and joined our OCaml Labs group. He is the author of MultiMlton, and is now driving the completion of the OCaml multicore work along with Stephen Dolan, Leo White and Mark Shinwell. Stay tuned for updates from us when there is more to show later this year!

Ctypes: a Modular Foreign Function Interface

The Ctypes library started as an experiment with GADTs by Jeremy Yallop, and has since ballooned into a robust, comprehensive library for safely interacting with the OCaml foreign function interface. The first release came out in time to be included in Real World OCaml in lieu of the low-level FFI (which I was not particularly enamoured with having to explain in a tight page limit).

Throughout 2014, Jeremy expanded support for a number of features requested by users (both industrial and academic) who adopted the library in preference to manually writing C code to interface with the runtime, and issued several updated releases.

C Stub Generation

The first release of Ctypes required the use of libffi to dynamically load shared libraries and dynamically construct function call stack frames whenever a foreign function is called. While this works for simple libraries, it cannot cover all use cases, since interfacing with C demands an understanding of struct memory layout, C preprocessor macros, and other platform-dependent quirks which are more easily dealt with by invoking a C compiler. Finally, a libffi-based API will necessarily be slower than writing direct C stub code.

While many other language FFIs provide separate libraries for dynamic and static FFI libraries, we decided to have a go at building a modular version of Ctypes that could handle both cases from a single description of the foreign function interface. The result (dubbed “Cmeleon”) remained surprisingly succinct and usable, and now covers almost every use of the OCaml foreign function interface. We submitted a paper to ICFP 2015 dubbed “A modular foreign function interface” that describes it in detail. Here is a highlight of how simple a generic binding looks:

module Bindings(F : FOREIGN) = struct
  open F
  let gettimeofday = foreign "gettimeofday"
     (ptr timeval @-> ptr timezone @-> returning int)
end

The FOREIGN module type completely abstracts the details of whether dynamic or static binding is used, and handles C complexities such as computing the struct layout on the local machine architecture.
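The idea of writing one description and interpreting it in several ways can be sketched without Ctypes at all. The toy model below is not the Ctypes API: every name in it (Cint, Cstring, the simplified FOREIGN signature, Static, Dynamic) is an assumption made for illustration, and the "dynamic" side is simulated by a lookup table rather than libffi.

```ocaml
(* Toy model only: none of these names are the real Ctypes API. *)
type _ typ =
  | Cint : int typ
  | Cstring : string typ

type _ fn =
  | Returning : 'a typ -> 'a fn
  | Arrow : 'a typ * 'b fn -> ('a -> 'b) fn

let returning t = Returning t
let ( @-> ) a b = Arrow (a, b)

(* What a binding strategy has to provide. *)
module type FOREIGN = sig
  type 'a result
  val foreign : string -> 'a fn -> 'a result
end

(* The interface description, written once. *)
module Bindings (F : FOREIGN) = struct
  let strlen = F.foreign "strlen" (Cstring @-> returning Cint)
end

(* Interpretation 1: emit a C prototype as text. *)
module Static = struct
  type 'a result = string
  let c_name : type a. a typ -> string = function
    | Cint -> "int"
    | Cstring -> "char *"
  let rec signature : type a. a fn -> string list * string = function
    | Returning t -> ([], c_name t)
    | Arrow (t, rest) ->
      let (args, ret) = signature rest in
      (c_name t :: args, ret)
  let foreign name fn =
    let (args, ret) = signature fn in
    Printf.sprintf "%s %s(%s);" ret name (String.concat ", " args)
end

(* Interpretation 2: build a callable OCaml function at run time,
   dispatching through a table that stands in for a loaded library. *)
module Dynamic = struct
  type 'a result = 'a
  type value = I of int | S of string
  let table =
    [ ("strlen",
       (function [S s] -> I (String.length s) | _ -> failwith "bad args")) ]
  let inject : type a. a typ -> a -> value = fun t v ->
    match t with Cint -> I v | Cstring -> S v
  let project : type a. a typ -> value -> a = fun t v ->
    match t, v with
    | Cint, I i -> i
    | Cstring, S s -> s
    | _ -> failwith "type mismatch"
  let rec build : type a. (value list -> value) -> value list -> a fn -> a =
    fun impl acc -> function
      | Returning t -> project t (impl (List.rev acc))
      | Arrow (t, rest) -> fun x -> build impl (inject t x :: acc) rest
  let foreign name fn = build (List.assoc name table) [] fn
end

module C = Bindings (Static)
module D = Bindings (Dynamic)

let () = print_endline C.strlen
let () = Printf.printf "%d\n" (D.strlen "hello")
```

The Bindings functor never mentions which strategy is in use; choosing Static or Dynamic happens only at the point where the functor is applied.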

Inverse Stubs

The other nice result from functorising the foreign function interface emerged when we tried to invert the FFI and serve a C interface from OCaml code (for example, by compiling the OCaml code as a shared library). This would let us begin swapping out C libraries that we don’t trust with safer equivalents written in OCaml.

You can see an example of how inverted stubs work via a simple C XML parsing interface exposed from the Xmlm library. We can define a C struct by:

(* Define a struct of callbacks (C function pointers) *)
let handlers : [`handlers] structure typ = structure "handlers"
let (--) s f = field handlers s (funptr f)
let on_data      = "on_data"      -- (string @-> returning void)
let on_start_tag = "on_start_tag" -- (string @-> string @-> returning void)
let on_end_tag   = "on_end_tag"   -- (void @-> returning void)
let on_dtd       = "on_dtd"       -- (string @-> returning void)
let on_error     = "on_error"     -- (int @-> int @-> string @-> returning void)
let () = seal handlers

and then expose this via C functions:

module Stubs(I : Cstubs_inverted.INTERNAL) = struct
  (* Expose the type 'struct handlers' to C. *)
  let () = I.structure handlers

  (* We expose just a single function to C.  The first argument is a (pointer
     to a) struct of callbacks, and the second argument is a string
     representing a filename to parse. *)
  let () = I.internal "parse_xml"
     (ptr handlers @-> string @-> returning void) parse
end

You can find the full source code to these snippets on the ocaml-ctypes-inverted-stubs-example repository on GitHub.

We’ll be exploring this aspect of Ctypes further in 2015 for SSL/TLS with David Kaloper and Hannes Mehnert, and Microsoft Research has generously funded a PhD studentship to facilitate the work.

Community Contributions

Ctypes benefited enormously from several external contributions from the OCaml community. From a portability perspective, A. Hauptmann contributed Windows support, and Thomas Leonard added Xen support to allow Ctypes bindings to work with MirageOS unikernels (which opens up the intriguing possibility of accessing shared libraries across virtual machine boundaries in the future). C language support was fleshed out by Edwin Torok contributing typedef support, Ramkumar Ramachandra adding C99 bools and Peter Zotov integrating native strings.

The award for “most enthusiastic use of OCaml Labs code” goes to Thomas Braibant of Cryptosense, who used every feature of the Ctypes library (consider multi-threaded, inverted, staged and marshalled bindings) in their effort to hack the hackers. David Sheets comes a close second with his implementation of the FUSE binary protocol, parameterised by version quirks.

If you’re using Ctypes, we would love to hear about your particular use. A search on GitHub and OPAM reveals over 20 projects using it already, including industrial use at Cryptosense and Jane Street, and ports to Windows, *BSD, MacOS X and even iPhone and Android. There’s a getting started guide, and a mailing list available.

Community and Teaching Efforts

In addition to the online community building, we also participated in a number of conferences and face-to-face events to promote education about functional programming.

Conferences and Talks

There has been a huge growth in the number of quality conferences in recent years, making it tough to choose which ones to attend. ICFP is the academic meeting point that predates most of them, and we participated extensively in 2014 via talks, tutorials and a keynote at the Haskell Symposium.
I also served on the program committee and as industrial relations chair, and took over as the steering committee chair of CUFP. Jeremy Yallop, Thomas Gazagnaire and Leo White all served on program committees for workshops, with Jeremy also chairing this year’s ML Workshop.

Outside of academia, we participated in a number of conferences such as QCon, OSCON, CCC, New Directions in OS, FunctionalConf, FPX and FOSDEM. The vast majority of these talks were about MirageOS, and slides can be found at decks.openmirage.org.

The 2048 Browser Game

Yaron Minsky and I have run OCaml tutorials for ICFP for a few years, and we finally hung up our boots in favour of a new crowd.

Jeremy Yallop and Leo White stepped up to the mark with their ICFP/CUFP 2014 Introduction to OCaml tutorial, which had the additional twist of being taught entirely in a web browser by virtue of using js_of_ocaml and IOCamlJS. They decided that a good practical target was the popular 2048 game that has wasted many programmer hours here at OCaml Labs. They hacked on it over the summertime, assisted by our visitor Daniel Buenzli, who also released useful libraries such as Vg, React, Useri, and Gg.

The end result is satisfyingly playable online, with the source code available at ocamllabs/2048-tutorial.

Thomas Gazagnaire got invited to Bangalore for Functional Conf later in the year, and he extended the interactive tutorial notebook and also ran an OCaml tutorial to a packed room. We were very happy to support the first functional programming conference in India, and hope to see many more such events spring up! Amir Chaudhry then went to Belgium to FOSDEM 2015 where he showed off the 2048 game running as an ARM unikernel to a crowd of attendees at the Xen booth.

Graduate Teaching

Jeremy Yallop and Leo White (with assistance from Alan Mycroft and myself) also led the design of a new graduate course on Advanced Functional Programming at the Computer Laboratory. This ran in the Lent Term and was over-subscribed by three times the number who pre-registered (due to a number of PhD students and our collaborators from Citrix also attending).

The course materials are freely available online and cover the theory behind functional programming, and then move onto type inference, abstraction and parametricity, GADTs, rows, monads, and staging. We will be running this again in future years, and the lecture materials are already proving useful to answer mailing list questions.

Mentoring Beginners

We also had the pleasure of mentoring up-and-coming functional programmers via several outreach programs, both face-to-face and remote.

Cambridge Compiler Hacking

We started the Cambridge Compiler Hacking sessions in a small way towards the end of 2013 in order to provide a local, friendly place to assist people who wanted to dip their toes into the unnecessarily mysterious world of programming language hacking. The plan was simple: provide drinks, pizza, network and a bug list of varying difficulty for attendees to choose from and work on for the evening, with mentoring from the experienced OCaml contributors.

We continued this bi-monthly tradition in 2014, with a regular attendance of 15-30 people, and even cross-pollinated communities with our local F# and Haskell colleagues. We rotated locations from the Cambridge Computer Laboratory to Citrix, Makespace, and the new Cambridge Postdoc Centre. We posted some highlights from sessions towards the start of the year, and are very happy with how it’s going. There has even been uptake of the bug list across the water in France, thanks to Gabriel Scherer.

In 2015, we’d like to branch out further and host some sessions in London. If you have a suggestion for a venue or theme, please get in touch!

Summer Programs

There has been a laudable rise in summer programs designed to encourage diversity in our community, and we of course leap at the opportunity to participate in these when we find them.

Our own students also had the chance to participate in such workshops to get out of Cambridge in the summer! Heidi Howard liveblogged her experiences at the PLMW workshop in Mumbai. Meanwhile, David Sheets got to travel to the slightly less exotic London to liveblog OSIO, and Leonhard Markert covered ICFP 2014 as a student volunteer.

Blogging and Online Activities

Our blog roll maintains the ongoing stream of activity from the OCaml Labs crew, but there were some particular highlights throughout 2014.

It wasn’t all just blogging though, and Jeremy Yallop and Leo White in particular participated in some epic OCaml bug threads about new features, and explanations about OCaml semantics on the mailing list.

Amir Chaudhry also continued to curate and develop the content on the ocaml.org website with our external collaborators Ashish Agarwal, Christophe Troestler and Philippe Wang. Notably, it is now the recommended site for OCaml (with the INRIA site being infrequently updated), and also hosts the ACM OCaml Workshop pages. One addition that highlighted the userbase of OCaml in the teaching community came from building a map of all of the universities where the language is taught, and this was Yan Shvartzshnaider’s first contribution to the site.

Visitors and Interns

Finally, a really important part of any community is hanging out with each other to chat over ideas in a friendly environment. As usual, we had a very steady stream of visitors and interns throughout 2014 to facilitate this.

Frederic Bour, Benjamin Farinier and Matthieu Journault joined us as summer interns from their respective universities in France as part of their Masters programs. Frederic worked on modular implicits and gave a great talk at the OCaml Users group. Benjamin and Matthieu worked on Irmin data structures and complexity (and merge-queues and merge-ropes), and Benjamin had his paper on “Mergeable Persistent Data Structures” accepted to JFLA 2015, while Matthieu’s work on efficient algorithms for synchronising Irmin DAGs is being integrated into the upstream source code.

Daniel Buenzli repeated his visit from 2013 and spent a productive summer with us, commenting on almost every project we’re working on. In his own words (edited for brevity):

I started by implementing and releasing Uucp, a library to provide efficient access to a selection of the properties of the latest Unicode Character database (UCD). […] As a side effect of the previous point I took time to write an absolute minimal introduction to Unicode. […] Since I was in this Unicode business I took the opportunity to propose a 31 loc patch to the standard library for a type to represent Unicode scalar values (an Unicode character to be imprecise) to improve interoperability.

The usual yearly update to OpenGL was announced at the Siggraph conference. This prompted me to update the ctypes-based tgls library for supporting the latest entry point of OpenGL 4.5 and OpenGL ES 3.1. Since the bindings are automatically generated from the OpenGL XML registry the work is not too involved but there’s always the odd function signature you don’t/can’t handle automatically yet.

Spend quite a bit (too much) time on useri, a small multi-platform abstraction for setting up a drawing surface and gather user input (not usury) as React events. Useri started this winter as a layer on top of SDL to implement a CT scan app and it felt like this could be the basis for adding interactivity and animation to Vg/Vz visualizations – js viz libraries simply rely on the support provided by the browser or SVG support but Vg/Vz strives for backend independence and clear separations of concern (up to which limit remains an open question). Unfortunately I couldn’t bring it to a release and got a little bit lost in browser compatibility issues and trying to reconcile what browser and SDL give us in terms of functionality and way of operating, so that a maximum of client code can be shared among the supported platforms. But despite this non-release it still managed to be useful in some way, see the next point.

Helped Jeremy and Leo to implement the rendering and interaction for their ICFP tutorial 2048 js_of_ocaml implementation. This featured the use of Gg, Vg, Useri and React and I was quite pleased with the result (despite some performance problems in certain browsers, but hey composable rendering and animation without a single assignment in client code). It’s nice to see that all these pains at trying to design good APIs eventually fit together […]

A couple of visitors joined us from sunny Morocco, where Hannes Mehnert and David Kaloper had gone to work on a clean-slate TLS stack. They found the MirageOS effort online, and got in touch about visiting. After a very fun summer of hacking, their stack is now the standard TLS option in MirageOS and resulted in the Bitcoin Pinata challenge being issued! Hannes and David have since moved to Cambridge to work on this stack full-time in 2015, but the internships served as a great way for everyone to get to know each other.

We also had the pleasure of visits from several of our usually remote collaborators. Christophe Troestler, Yaron Minsky, Jeremie Diminio and Andy Ray all visited for the annual OCaml Labs review meeting in Christ’s College. There were also many academic talks from foreign visitors in our SRG seminar series, ranging from Uday Khedkar from IIT, to Oleg Kiselyov delivering multiple talks on staging and optimisation (as well as making a celebrity appearance at the compiler hacking session), to Yaron Minsky delivering an Emacs-driven departmental seminar on his experiences with Incremental computation.

Research Efforts

The OCaml Labs are of course based in the Cambridge Computer Laboratory, where our day job is to do academic research. Balancing the demands of open source coding, community efforts and top-tier research has been tricky, but the effort has been worthwhile.

Our research efforts are broadly unchanged from 2013 (it takes time to craft good ideas!), and this will not be an exhaustive recap. Instead, we’ll summarise them here and point to our papers that describe the work in detail.

Our long standing research into personal online privacy led to our next system target that uses unikernels: the Databox paper outlines the architecture, and was covered in the Guardian newspaper. Jon Crowcroft led the establishment of the Cambridge wing of the Microsoft Cloud Computing Research Center to consider the legal aspect of things, and so we have made forays outside of technology into considering the implications of region-specific clouds as well.

Some of the most exciting work done in the group as part of the REMS and NaaS projects came towards the end of 2014 and start of 2015, with multiple submissions going into top conferences. Unfortunately, due to most of them being double blind reviewed, we cannot link to the papers yet. Keep an eye on the blog and published paper set, or ask us directly about what’s been going on!

Priorities for 2015

As spring breaks and the weather (almost) becomes bearable again, we’re setting our work priorities for the remainder of the year.

I’d like to thank the entire team and wider community for a wonderfully enjoyable 2014 and start of 2015, and am very thankful for the funding and support from Jane Street, Citrix, British Telecom, RCUK, EPSRC, DARPA and the EU FP7 that made it all possible. As always, please feel free to contact any of us directly with questions, or reach out to me personally with any queries, concerns or bars of chocolate as encouragement.

ICFP 2015 - a call for sponsorship and how you can help

18 February 2015   |   Anil Madhavapeddy   |   tags: icfp   |   all posts

The call for papers for this year’s International Conference on Functional Programming is about to close in two weeks, and over a hundred cutting edge research papers will be submitted on the theory, application and experiences behind functional programming and type theory. In addition to the main conference, there are also over 10 big affiliated workshops that run throughout the week on topics ranging from specific languages (Erlang, Haskell, OCaml) to the broader commercial community, and even art and music.

The ICFP conference experience can be a remarkable one for students. Some great ideas have emerged from random corridor conversations between talks with the likes of Phil Wadler, or from rain-soaked discussions with Simon PJ at Mikkeller, or in my case, from being convinced to write a book while in a smoky Tokyo bar. This year, it will be held in the beautiful city of Vancouver in the fall.

We’re committed to growing the ICFP community, not just in numbers but also in diversity. The Programming Language Mentoring Workshop has been at capacity since it started and will run again. For the first time ever, I am really excited to announce that the Ada Initiative will also be running an Ally Skills workshop during the conference.

Sustaining these activities and responsible growth means that we need to reach ever wider to support the activities of the (not-for-profit) ICFP conference. So as this year’s industrial relations chair, I wish to invite any organization that wishes to support ICFP to get in touch with us (e-mail at avsm2@cl.cam.ac.uk) and sponsor us. I’ve put an abridged version of the e-mail solicitation below that describes the benefits. Sponsorship can start as low as $500 and is often tax deductible in many countries.

I’m writing to ask if you would be willing to provide corporate financial support for the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Vancouver, Canada, from August 30th through September 5th, 2015:


Corporate support funds are primarily used to subsidize students – the lifeblood of our community – and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.

Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2014 in Sweden. The Industrial Reception, open to all attendees, was by all accounts a roaring success. All 2014 sponsoring companies had the opportunity to interact with the gathered students, academics, and software professionals.

This year, let’s build on that success and continue to grow our community, and bring even more students to ICFP 2015 in Vancouver!

Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. There, they will meet luminaries in the field, as well as people who’ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them. For the first time, we will also host an Ally Skills workshop by the Ada Initiative, as well as continue the successful student mentoring workshop from previous years.

This year, we’re continuing a similar system of levels of financial support as last year. Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).

The support levels, and their associated benefits and pledge amounts, are as follows (costs in US dollars).

Bronze: $500: Logo on website, poster at industrial reception, listed in proceedings.

Silver: $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)

Gold: $5000: As above plus: named supporter of industrial reception with opportunity to speak to the audience, and opportunity to include branded merchandise in participants’ swag bag.

Platinum: $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).

Thank you for your time and especially for your generosity! I look forward to seeing you in Vancouver. If you are willing to be a sponsor, it would be helpful to hear back by March 9th to help us plan and budget.

If you are interested, please get in touch with me or any of the organizing committee. If you’re interested in helping out ICFP in a non-financial capacity (for example as a student volunteer), then there will also be plenty of opportunity to sign up later in the year.

Talks from OCaml Labs during ICFP 2014

31 August 2014   |   Anil Madhavapeddy   |   tags: ocaml,ocamllabs   |   all posts

It’s the ever-exciting week of the International Conference on Functional Programming again in Sweden, and this time OCaml Labs has a variety of talks, tutorials and keynotes to deliver throughout the week. This post summarises all of them so you can navigate your way to the right session. Remember that once you register for a particular day at ICFP, you can move between workshops and tutorials as you please.

Quick links to the below in date order:

Language and Compiler Improvements

The first round of talks are about improvements to the core OCaml language and runtime.

» Modular implicits

Leo White and Frederic Bour have been taking inspiration from Scala implicits and Modular Type Classes by Dreyer et al, and will describe the design and implementation of a system for ad-hoc polymorphism in OCaml based on passing implicit module parameters to functions based on their module type.

This provides a concise way to write functions to print or manipulate values generically, while maintaining the ML spirit of explicit modularity. You can actually get a taste of this new feature ahead of the talk, thanks to a new facility in OCaml: we can compile any OPAM switch directly into an interactive JavaScript notebook thanks to iocamljs by Andy Ray.

» Multicore OCaml

Currently, threading in OCaml is only supported by means of a global lock, allowing at most one thread to run OCaml code at any time. Stephen Dolan, Leo White and Anil Madhavapeddy have been building on the early design of a multicore OCaml runtime that they started in January, and now have an (early) prototype of a runtime design that is capable of shared memory parallelism.

» Type-level Module Aliases

Leo White has been working with Jacques Garrigue on adding support for module aliases into OCaml. This significantly improves the compilation speed and executable binary sizes when using large libraries such as Core/Async.

» Coeffects: A Calculus of Context-dependent Computation

Alan Mycroft has been working with Tomas Petricek and Dominic Orchard on defining a broader notion of context than just variables in scope. Tomas will be presenting a research paper on developing a generalized coeffect system with annotations indexed by a coeffect shape.

Mirage OS 2.0

We released Mirage OS 2.0 in July, and there will be several talks diving into some of the new features you may have read about on the blog.

» Unikernels Keynote at Haskell Symposium

Since MirageOS is a unikernel written entirely in OCaml, it makes perfect sense to describe it in detail to our friends over at the Haskell Symposium and reflect on some of the design implications between Haskell type-classes and OCaml functors and metaprogramming. Anil Madhavapeddy will be doing just that in a Friday morning keynote at the Haskell Symposium.

» Transport Layer Security in OCaml

Hannes Mehnert and David Kaloper have been working hard on integrating a pure OCaml Transport Layer Security stack into Mirage OS. They’ll talk about the design principles underlying the library, and reflect on the next steps to build a TLS stack that we can rely on not to be more insecure than telnet.

Hannes will also continue his travels and deliver a couple of talks the week after ICFP on the same topic in Denmark, so you can still see it if you happen to miss this week’s presentation:

» Irmin: a Branch-consistent Distributed Library Database

Irmin is an OCaml library to persist and synchronize distributed data structures both on-disk and in-memory. It enables a style of programming very similar to the Git workflow, where distributed nodes fork, fetch, merge and push data between each other. The general idea is that you want every active node to get a local (partial) copy of a global database and always be very explicit about how and when data is shared and migrated.

This has been a big collaborative effort led by Thomas Gazagnaire, and includes contributions from Amir Chaudhry, Anil Madhavapeddy, Richard Mortier, David Scott, David Sheets, Gregory Tsipenyuk and Jon Crowcroft. We’ll be demonstrating Irmin in action, so please come along if you’ve got any interesting applications you would like to talk to us about.

» Metaprogramming with ML modules in the MirageOS

Mirage OS lets the programmer build modular operating system components using a combination of OCaml functors and generative metaprogramming. This ensures portability across both Unix binaries and Xen unikernels, while preserving a usable developer workflow.

The core Mirage OS team of Anil Madhavapeddy, Thomas Gazagnaire, David Scott and Richard Mortier will be talking about the details of the functor combinators that make all this possible, and doing a live demonstration of it running on a tiny ARM board!

» CUFP OCaml Language Tutorial

Leo White and Jeremy Yallop (with much helpful assistance from Daniel Buenzli) will be giving a rather different OCaml tutorial from the usual fare: they are taking you on a journey of building a variant of the popular 2048 game in pure OCaml, and compiling it to JavaScript using the js_of_ocaml compiler. This is a very pragmatic introduction to using statically typed functional programming combined with efficient compilation to JavaScript.

In this tutorial, we will first introduce the basics of OCaml using an interactive environment running in a web browser, as well as a local install of OCaml using the OPAM package manager. We will also explore how to compile OCaml to JavaScript using the js_of_ocaml tool.

The tutorial is focused around writing the 2048 logic, which will then be compiled with js_of_ocaml and linked together with a frontend based on (a pre-release version of) Useri, React, Gg and Vg, thanks to Daniel Buenzli. There’ll also be appearances from OPAM, IOCaml, Qcheck and OUnit.

There will also be a limited supply of special edition OCaml-branded USB sticks for the first tutorial attendees, so get here early for your exclusive swag!

» The OCaml Platform

The group here has been working hard all summer to pull together an integrated demonstration of the new generation of OCaml tools being built around the increasingly popular OPAM package manager. Anil Madhavapeddy will demonstrate all of these pieces in the OCaml Workshop, with guest appearances of work from Amir Chaudhry, Daniel Buenzli, Jeremie Diminio, Thomas Gazagnaire, Louis Gesbert, Thomas Leonard, David Sheets, Mark Shinwell, Christophe Troestler, Leo White and Jeremy Yallop.

The OCaml Platform combines the OCaml compiler toolchain with a coherent set of tools for build, documentation, testing and IDE integration. The project is a collaborative effort across the OCaml community, tied together by the OCaml Labs group in Cambridge and with other major contributors.

» The 0install Binary Installation System

Thomas Leonard will also be delivering a separate talk about cross-platform binary installation via his 0install library, which works on a variety of platforms including Windows, Linux and MacOS X. He recently rewrote it in OCaml from Python, and will be sharing his experiences on how this went as a new OCaml user, as well as delivering an introduction to 0install.

» Service and Socialising

Heidi Howard and Leonhard Markert are acting as student volunteers at this year’s ICFP, and assisting with videoing various workshops such as CUFP Tutorials, Haskell Symposium, the Workshop on Functional High-Performance Computing and the ML Family Workshop. Follow their live blogging on the Systems Research Group SysBlog and leave comments about any sessions you’d like to know more about!

Anil Madhavapeddy is the ICFP industrial relations chair and will be hosting an Industrial Reception on Thursday 4th September in the Museum of World Culture starting from 1830. There will be wine, food and some inspirational talks from the ICFP sponsors that not only make the conference possible, but provide an avenue for the academic work to make its way out into industry (grad students that are job hunting: this is where you get to chat to folk hiring FP talent).

This list hasn’t been exhaustive, and only covers the activities of my group in OCaml Labs and the Systems Research Group at Cambridge. There are numerous other talks from the Cambridge Computer Lab during the week, but the artistic highlight will be on Saturday evening following the CUFP talks: Sam Aaron will be doing a live musical performance sometime after 8pm at 3vaningen. Sounds like a perfect way to wind down after what’s gearing up to be an intense ICFP 2014. I look forward to seeing old friends and making new ones in Gothenburg soon!

Grepping the source of every OCaml package in OPAM

08 April 2014   |   Anil Madhavapeddy   |   tags: opam,ocaml,ocamllabs   |   all posts

A regular question that comes up from OCaml developers is how to use OPAM as a hypothesis testing tool against the known corpus of OCaml source code. In other words: can we quickly and simply run grep over every source archive in OPAM? So that’s the topic of today’s 5 minute blog post:

git clone git://github.com/ocaml/opam-repository
cd opam-repository
opam-admin make
cd archives
for i in *.tar.gz; \
  do tar -zxOf "$i" | grep caml_stat_alloc_string; \
done

In this particular example we’re looking for instances of caml_stat_alloc_string, so just replace that with the regular expression of your choice. The opam-admin tool repacks upstream archives into a straightforward tarball, so you don’t need to worry about all the different archival formats that OPAM supports (such as git or Darcs). It just adds an archives directory to a normal opam-repository checkout, so you can reuse an existing checkout if you have one already.
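The loop above prints matching lines but not which package they came from. As a minimal sketch (the function name search_archives is my own invention for illustration, not part of OPAM or opam-admin), the same idea can be wrapped so that it prints only the names of the archives containing a match:

```shell
# Hypothetical helper, not shipped with OPAM: print the file name of every
# .tar.gz archive in a directory whose contents match a pattern.
# Usage: search_archives PATTERN DIRECTORY
search_archives() {
  pattern=$1
  dir=$2
  for archive in "$dir"/*.tar.gz; do
    # Skip the literal glob when the directory holds no archives.
    [ -e "$archive" ] || continue
    # Unpack to stdout and test for a match; grep reads the whole stream,
    # and its output is discarded because only the exit status matters.
    if tar -zxOf "$archive" 2>/dev/null | grep -e "$pattern" >/dev/null; then
      printf '%s\n' "${archive##*/}"
    fi
  done
}
```

Running search_archives caml_stat_alloc_string . from inside the archives directory should then give one archive name per line, which is easier to feed into further scripts than raw grep output.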

$ cd opam-repository/archives
$ du -h
669M	.
$ ls | wc -l

Codio, the insanely slick web way to build Mirage unikernels from a browser

26 March 2014   |   Anil Madhavapeddy   |   tags: mirage,ocaml,ocamllabs   |   all posts

I noticed an offhand tweet from Phil Tomson about Codio adding OPAM support, and naturally had to take a quick look. I was really impressed by the whole process, and ended up building the Mirage Xen website unikernel directly from my web browser in less than a minute, including registration!

I notice Codio supports OCaml and opam on the server side now.

— phil tomson (@philtor) March 26, 2014
$ parts install opam
$ opam init -a
$ eval `opam config env`
$ opam install mirage-www -y
$ make MODE=xen

Then have a cup of coffee while the box builds, and you have a mir-www.xen, all from your web browser! Codio has a number of deployment options available too, so you should be able to hook up a Git-based workflow using some combination of Travis or other CI service.

This is the first time I’ve ever been impressed by an online editor, and might consider moving away from my beloved vi…

Easily OPAM switching to any OCaml feature request

25 March 2014   |   Anil Madhavapeddy   |   tags: ocaml,ocamllabs   |   all posts

Gabriel Scherer announced an experiment to host OCaml compiler pull requests on GitHub for six months. There is a general feeling that GitHub would be a more modern hosting platform than the venerable but reliable Mantis setup that has been in place for over a decade, but the only way to find out for sure is by trying it out for a while.

One of the great benefits of using GitHub is their excellent API to easily automate workflows around issues and pull requests. After a suggestion from Jeremy Yallop and David Sheets over lunch, I decided to use this to make it easier to locally apply compiler patches. OPAM has a great compiler switch feature that lets you run simultaneous OCaml installations and swap between them easily. For instance, the default setting gives you access to:

$ opam switch
system  C system       System compiler (4.01.0)
--     -- 3.11.2       Official 3.11.2 release
--     -- 3.12.1       Official 3.12.1 release
--     -- 4.00.0       Official 4.00.0 release
--     -- 4.00.1       Official 4.00.1 release
--     -- 4.01.0       Official 4.01.0 release
--     -- 4.01.0beta1  Beta1 release of 4.01.0

I used my GitHub API bindings to knock up a script that converts every GitHub pull request into a custom compiler switch. You can see these by passing the --all option to opam switch, as follows:

$ opam switch --all
--     -- 4.02.0dev+pr10              Add String.{split,rsplit}
--     -- 4.02.0dev+pr13              Add String.{cut,rcut}.
--     -- 4.02.0dev+pr14              Add absolute directory names to bytecode format for ocamldebug to use
--     -- 4.02.0dev+pr15              replace String.blit by String.unsafe_blit
--     -- 4.02.0dev+pr17              Cmm arithmetic optimisations
--     -- 4.02.0dev+pr18              Patch for issue 5584
--     -- 4.02.0dev+pr2               Parse -.x**2. (unary -.) as -.(x**2.).  Fix PR#3414
--     -- 4.02.0dev+pr20              OCamlbuild: Fix the check of ocamlfind
--     -- 4.02.0dev+pr3               Extend record punning to allow destructuring.
--     -- 4.02.0dev+pr4               Fix for PR#4832 (Filling bigarrays may block out runtime)
--     -- 4.02.0dev+pr6               Warn user when a type variable in a type constraint has been instantiated.
--     -- 4.02.0dev+pr7               Extend ocamllex with actions before refilling
--     -- 4.02.0dev+pr8               Adds a .gitignore to ignore all generated files during `make world.opt'
--     -- 4.02.0dev+pr9               FreeBSD 10 uses clang by default, with gcc not available by default
--     -- 4.02.0dev+trunk             latest trunk snapshot

Testing the impact of a particular compiler switch is now pretty straightforward. If you want to play with Stephen Dolan’s optimized arithmetic operations, for instance, you just need to do:

$ opam switch 4.02.0dev+pr17
$ eval `opam config env`

And your local environment now points to the patched OCaml compiler. For the curious, the scripts to generate the OPAM pull requests are in my avsm/opam-sync-github-prs repository. It contains an example of how to query active pull requests, and also to create a new cross-repository pull request (using the git jar binary from my GitHub bindings). The scripts run daily for now, and delete switches once the corresponding pull request is closed. Just run opam update to retrieve the latest switch set from the upstream OPAM package repository.
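Since the pull-request switches all follow the 4.02.0dev+prN naming seen in the listing above, they are easy to pick out with standard tools. The following is only a sketch: the function name list_pr_switches is hypothetical, and it assumes the three-column layout of the opam switch --all output shown in this post, which may differ in other OPAM versions.

```shell
# Hypothetical helper: read an `opam switch --all` listing on stdin and print
# the names of the pull-request switches, ordered by pull request number.
# Assumes the switch name is the third whitespace-separated column.
list_pr_switches() {
  awk '$3 ~ /\+pr[0-9]+$/ { n = $3; sub(/.*\+pr/, "", n); print n, $3 }' |
    sort -n |
    cut -d ' ' -f 2
}
```

Piping opam switch --all into list_pr_switches should then give one switch name per line, ready to be passed to opam switch in a loop if you want to try several patches in turn.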

all posts