The call for papers for this year’s International Conference on Functional Programming has just closed, with around a hundred cutting-edge research papers submitted on the theory, application and experiences behind functional programming. This marks just the beginning of sorting out the program, as there are also over 10 big affiliated workshops that run throughout the week on topics ranging from specific languages (Erlang, Haskell, OCaml) to the broader commercial community, and even art and music.
The ICFP conference experience can be a remarkable one for students. Some great ideas have emerged from random corridor conversations between talks with the likes of Phil Wadler, or from rain-soaked discussions with Simon PJ at Mikkeller, or in my case, from being convinced to write a book while in a smoky Tokyo bar.
Functional programming has been growing ever more popular (and lucrative) worldwide in 2014. We’re committed to growing the ICFP community, not just in numbers but also in diversity. We had a record number of sponsors in 2013, and sustaining the growth means that we need to reach ever wider to support the activities of the (not-for-profit) conference.
So as this year’s industrial relations chair, I thought I’d throw the gates open and invite any organization that wishes to support FP to get in touch with us (e-mail at firstname.lastname@example.org) and sponsor us. I’ve put an abridged version of the e-mail solicitation below that describes the benefits. Sponsorship can start as low as $500 and is tax deductible in many countries.
I’m writing to ask if you would be willing to provide corporate financial support for the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Gothenburg, Sweden, from September 1st through 3rd, 2014:
Corporate support funds are primarily used to subsidize students – the lifeblood of our community – and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.
Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2013 in Boston. The Industrial Reception, open to all attendees, was by all accounts a roaring success. All 2013 sponsoring companies had the opportunity to speak to the gathered students, academics, and software professionals.
This year, let’s build on that success and continue to grow our community, and bring even more students to ICFP 2014 in Sweden!
Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. There, they will meet luminaries in the field, as well as people who’ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them.
This year, we’re continuing with a similar system of financial support levels to last year’s. Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).
The support levels, with their associated pledge amounts and benefits, are as follows (costs in US dollars).
Bronze: $500: Logo on website, poster at industrial reception, listed in proceedings.
Silver: $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)
Gold: $5000: As above plus: named supporter of industrial reception, opportunity to include branded merchandise in participants’ swag bag.
Platinum: $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).
If you are interested, please get in touch with me or any of the organizing committee. If you’re interested in helping out ICFP in a non-financial capacity (for example as a student volunteer), then there will also be plenty of opportunity to sign up later in the year.
The Communications of the ACM has just published an article that Dave Scott and I wrote providing a broader background on the concept of Unikernels that we’ve been working on since about 2003, when we started building Melange and the Xen toolstack. You can read either the print article (requires an ACM subscription) or the open access version on the ACM Queue. There’s been some interesting discussion about it already online:
On Reddit, a number of queries about how it fits into the space of containers, microkernels and other experimental operating systems.
Two of the most interesting bits of feedback for me personally came from Butler Lampson (via Jon Crowcroft) and Robert Harper, two computer scientists who have made key contributions to operating systems and programming languages and provided some broader perspective.
Butler Lampson points out (edited for the web):
I found the Mirage work quite interesting: a 21st century version of things that we did at Xerox in the 1970s. Of course the application domain is quite different, and so is the whole-program optimization. And we couldn’t afford garbage collection, so freeing storage was not type-safe. But there are lots of interesting parallels.
The “OS as libraries” idea was what made it possible to fit big applications into the Alto’s 128k bytes of memory:
Lampson and Sproull, An open operating system for a single-user machine, ACM Operating Systems Rev. 11, 5 (Dec. 1979), pp 98-105. ACM.
Lauer and Satterthwaite, The impact of Mesa on system design, Proc. 4th ICSE, Munich, Sep. 1979, pp 174-182.
Redell et al, Pilot: An Operating System for a Personal Computer, Comm. ACM 23, 2 (Feb 1980), pp 81-92 (from 7th SOSP, 1979). ACM
Robert Harper correctly points out some related work that was missing from our CACM article:
FoxNet is an implementation of the standard TCP/IP networking protocol stack using the Standard ML (SML) language. It was part of a wide-reaching project at CMU in the 1990s that made seminal contributions in proof-carrying code and typed intermediate languages, among many other things. The FoxNet stack was actually one of my big inspirations for wanting to build Mirage, since the elegance of using functors as a form of dependency injection into a system as complex as an OS and application stack is very desirable, and the reason we chose to build Mirage in ML instead of another, less modular, language.
Ensemble (website now offline but here’s a SOSP 1999 paper) is a group communication system written in OCaml, developed at Cornell and the Hebrew University. For an application builder, Ensemble provides a library of protocols that can be used for quickly building complex distributed applications. For a distributed systems researcher, Ensemble is a highly modular and reconfigurable toolkit: the high-level protocols provided to applications are really stacks of tiny protocol “layers”, each of which can be modified or rebuilt to experiment.
Both Ensemble and FoxNet made strong echoes throughout the design of Mirage (and its precursor software such as Melange in 2007). The Mirage command-line tool uses staged-computation to build a concrete application out of functors, and we are making this even more programmable via a new combinator-based functor types library that Thomas Gazagnaire built, and also experimenting with higher kinded polymorphic abstractions.
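To make the "functors as dependency injection" idea concrete, here is a minimal sketch in plain OCaml. It is not taken from FoxNet, Ensemble or Mirage: the NETIF signature, the Make_echo functor and the Loopback device are all hypothetical names invented for illustration, and a real network stack has far richer signatures.

```ocaml
(* A hypothetical, much-simplified device signature: the only dependency
   our "stack" needs is a name and a way to send a frame. *)
module type NETIF = sig
  val name : string
  val write : string -> unit
end

(* The protocol logic is a functor: it is written once against the
   signature and knows nothing about the concrete device behind it. *)
module Make_echo (N : NETIF) = struct
  let reply payload =
    let frame = Printf.sprintf "[%s] echo: %s" N.name payload in
    N.write frame;
    frame
end

(* One concrete implementation: an in-memory device that records frames,
   standing in for a real driver (useful for testing). *)
module Loopback = struct
  let name = "loopback"
  let sent : string list ref = ref []
  let write frame = sent := frame :: !sent
end

(* "Injecting" the dependency is just functor application. *)
module Echo = Make_echo (Loopback)

let () = print_endline (Echo.reply "hello")
```

Swapping Loopback for another module that satisfies NETIF changes the target without touching the protocol code, which is the property that makes this style attractive for assembling an OS and application stack.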
My thanks to Butler Lampson and Robert Harper for making me go and re-read their papers, and I’d like to leave you with Malte Schwarzkopf’s OS Reading Group papers for other essential reading in this space. Many more citations immediately relevant to Mirage can also be found in our ASPLOS 2013 paper.
This time last year in 2012, I had just announced the formation of a new group called OCaml Labs in the Cambridge Computer Lab that would combine research and community work towards the practical application of functional programming. An incredible year has absolutely flown by, and I’ve put together this post to summarise what’s gone on, and point to our future directions for 2014.
The theme of our group was not to be pure research, but rather a hybrid group that would take on some of the load of day-to-day OCaml maintenance from INRIA, as well as help grow the wider OCaml community. To this end, all of our projects have been highly collaborative, often involving colleagues from OCamlPro, INRIA, Jane Street, Lexifi and Citrix.
At the start of 2013, OCaml was in the interesting position of being a mature decades-old language with a small, loyal community of industrial users who built mission critical applications using it. We had the opportunity to sit down with many of them at the OCaml Consortium meeting and prioritise where we started work. The answer came back clearly: while the compiler itself is legendary for its stability, the tooling around it (such as package management) was a pressing problem.
Our solution to this tooling problem was centered around the OPAM package manager, which OCamlPro released into beta just at the end of 2012 and which had its first stable release in March 2013. OPAM differs from most system package managers by emphasising a flexible distributed workflow that uses version constraints to ensure incompatible libraries aren’t mixed up (important for the statically-typed OCaml that is very careful about dependencies). Working closely with OCamlPro we developed a git-based workflow to make it possible for users (whether individual or industrial) to easily build up their own package repositories and redistribute OCaml code, and started curating the package repository.
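As a rough illustration of what checking a version constraint involves, here is a small sketch in OCaml. It is not OPAM's code or its file format: the types and function names are hypothetical, and versions are modelled as plain integer lists, which is a simplification of the version ordering a real package manager uses.

```ocaml
(* Versions are modelled as lists of integers, e.g. 1.2.0 is [1; 2; 0].
   This is an assumption made for the sketch, not OPAM's representation. *)
type version = int list

type constr =
  | Any
  | At_least of version      (* >= v *)
  | Below of version         (* <  v *)
  | Both of constr * constr  (* both constraints must hold *)

(* Structural comparison on int lists is lexicographic, which is enough
   for this simplified model. *)
let rec satisfies (v : version) = function
  | Any -> true
  | At_least lo -> compare v lo >= 0
  | Below hi -> compare v hi < 0
  | Both (a, b) -> satisfies v a && satisfies v b

(* Pick the newest available version that meets a constraint, if any. *)
let newest_matching (available : version list) c =
  available
  |> List.filter (fun v -> satisfies v c)
  |> List.sort (fun a b -> compare b a)
  |> function [] -> None | v :: _ -> Some v
```

A library that needs "at least 1.0 but below 2.0" of a dependency would then be expressed as Both (At_least [1; 0], Below [2; 0]), and an installation is only considered when every such constraint can be met at once.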
The results have been satisfying: we started with an initial set of around 100 packages in OPAM (mostly imported by the 4 developers), and ended 2013 with 587 unique packages and 2000 individual versions, with contributions from 160 individuals. We now have a curated central package repository to which anyone can submit their OCaml code, and several third-party remotes are also maintained (e.g. the Xen Project and Ocsigen). We also regularly receive releases of the Core libraries from Jane Street, and updates from sources as varied as Facebook, Coherent PDF and the Frenetic SDN research.
Number of unique contributors to the central OPAM package repository.
Total number of unique packages (including multiple versions of the same package).
Total packages with multiple versions coalesced so you can see new package growth.
A notable contribution from OCamlPro during this time was to clarify the licensing on the package repository to be the liberal CC0, and also to pass ownership to the OCaml organization on GitHub, where it’s now jointly maintained by OCaml Labs, OCamlPro and anyone else that wishes to contribute.
A lens into global OCaml code
It’s been quite interesting just watching all the varied code fly into the repository, but stability quickly became a concern as the new packages piled up. OCaml compiles to native code on not just x86, but also PowerPC, Sparc and ARM CPUs. We kicked off various efforts into automated testing: firstly David Sheets built the OCamlot daemon that would schedule builds across all the exotic hardware. Later in the year, the Travis service launched support for testing from GitHub pull requests, and this became the front line of automated checking for all incoming new packages to OPAM.
A major headache with automated testing is usually setting up the right build environment with external library dependencies, and so we added Docker support to make it easier to bulk-build packages for local developer use, with the results of builds available publicly for anyone to help triage. Unfortunately fixing the bugs themselves is still a very manual process, so more volunteers are always welcome to help out!
We’re going to be really seeing the rewards from all this effort as OCaml 4.02 development proceeds, since we can now adopt a data-driven approach to changing language features instead of guessing how much third-party code will break. If your code is in OPAM, then it’ll be tested as new features such as module aliases, injectivity and extension points show up.
The venerable OCamlDoc tool has done an admirable job for the last decade, but is increasingly showing its age due to a lack of support for cross-referencing across packages. We started working on this problem in the summer when Vincent Botbol visited us on an internship, expecting it to be a quick job to come up with something as good as Haskell’s excellent Haddock online documentation.
Instead, we ran into the “module wall”: since OCaml makes it so easy to parameterise code over other modules, it makes it hard to generate static documentation without outputting hundreds of megabytes of HTML every time. After some hard work from Vincent and Leo, we’ve got a working prototype that lets you simply run opam install opam-doc && opam doc core async to generate package documentation. You can see the results for Mirage online, but expect to see this integrated into the main OCaml site for all OPAM packages as we work through polishing up the user interface.
Turning OPAM into libraries
The other behind-the-scenes effort for OPAM has been to keep the core command-line tool simple and stable, and to have it install OCaml libraries that can be interfaced with by other tools to do domain-specific tasks. Thomas Gazagnaire, Louis Gesbert and David Sheets have been steadily hacking away at this and we now have opamfu to run operations over all packages, and an easy-to-template opam2web that generates the live opam.ocaml.org website.
This makes OPAM easier to deploy within other organizations that want to integrate it into their workflow. For example, the software section of the OCaml Labs website is regularly generated from a search of all OPAM packages tagged ocamllabs. We also used it to rewrite the entire OPAM repository in one epic diff to add external library dependencies via a command-line shim.
All of this effort is geared towards making it easier to maintain reusable local OPAM installations. After several requests from big universities to help out their teaching needs, we’re putting together all the support needed to easily redistribute OPAM packages via an “OPAM-in-a-Box” command that uses Docker containers to let you clone and do lightweight modifications of OCaml installations.
This will also be useful for anyone who’d like to run tutorials or teach OCaml, without having to rely on flaky network connectivity at conference venues: a problem we’ve suffered from too!
Starting to work on a real compiler can often be a daunting prospect, and so one initiative we started this year is to host regular compiler hacking sessions where people could find a curated list of features to work on, with the regular developers at hand to help out when people get stuck, and free beer and pizza to oil the coding wheels. This has worked out well, with around 20 people showing up on average for the three we held, and several patches submitted upstream to OCaml. Gabriel Scherer and Damien Doligez have been helping this effort by tagging junior jobs in the OCaml Mantis bug tracker as they are filed.
Syntax transformations and extension points
Leo White started the year fresh out of completing his PhD with Alan Mycroft, and before he realized what he’d gotten himself into was working with Alain Frisch on the future of syntax transformations in OCaml. We started off our first wg-camlp4 working group on the new lists.ocaml.org host, and a spirited discussion started that went on and on for several months. It ended with a very satisfying design for a simpler extension points mechanism which Leo presented at the OCaml 2013 workshop at ICFP, and is now merged into OCaml 4.02-trunk.
Not all of the working groups were quite as successful in coming to a conclusion as the Camlp4 one. On the Platform mailing list, Gabriel Scherer started a discussion on the design for namespaces in OCaml. The resulting discussion was useful in separating multiple concerns that were intermingled in the initial proposal, and Leo wrote a comprehensive blog post on a proposed namespace design.
After further discussion at ICFP 2013 with Jacques Garrigue later in the year, it turns out adding support for module aliases would solve much of the cost associated with compiling large libraries such as Core, with no backwards compatibility issues. This solution has now been integrated into OCaml 4.02.0dev and is being tested with Core.
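For readers who haven't seen one, a module alias is simply a module definition whose right-hand side is another module's name. The sketch below uses hypothetical module names (Mylib_list, Mylib_string, Mylib) rather than anything from Core, and keeps everything in one file for brevity; the compilation savings discussed above only show up when the aliased modules are separate compilation units.

```ocaml
(* Two stand-ins for library submodules. In a large library these would
   live in their own files; the names here are hypothetical. *)
module Mylib_list = struct
  let sum = List.fold_left ( + ) 0
end

module Mylib_string = struct
  let shout s = String.uppercase_ascii s ^ "!"
end

(* A "packing" module made purely of aliases. Each line records that
   Mylib.List *is* Mylib_list, rather than building a copy of it. *)
module Mylib = struct
  module List = Mylib_list
  module String = Mylib_string
end

let () =
  Printf.printf "%d %s\n" (Mylib.List.sum [1; 2; 3]) (Mylib.String.shout "ok")
```

Users keep writing Mylib.List.sum as before, which is why the approach needs no change to existing code.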
Delving into the bug tracker
Jeremy Yallop joined us in April, and he and Leo also leapt into the core compiler and started triaging issues on the OCaml bug tracker. This seemed unglamorous in the beginning, but it rapidly turned out that there were many fascinating threads that shed light on OCaml’s design and implementation through seemingly harmless bugs. Here is a pick of some interesting threads through the year that we’ve been involved with:
- An unexpected interaction between variance and GADTs that led to Jacques Garrigue’s talk at OCaml 2013.
- Type unsoundness by pattern matching lazy mutable values, thus shedding light on the precise semantics of the order of pattern matching.
- Leo proposed an open types extension to allow abstract types to be declared open. You can try it via opam switch 4.00.1+open-types.
- Designing the popular, but controversial record disambiguation feature in OCaml 4.01.0, and debating the right warnings needed to prevent programmer surprise.
- Exposing a GADT representation for Bigarray.
This is just a sample of some of the issues solved in Mantis; if you want to learn more about OCaml, it’s well worth browsing through it to learn from over a decade of interesting discussions from all the developers.
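To give a flavour of the open types idea mentioned in the list above, here is a small sketch. It uses the syntax that extensible variant types have in released OCaml compilers; I'm assuming that is close to the prototype switch, which may differ in detail. The event type and its constructors are made up for the example.

```ocaml
(* An open (extensible) type: it starts with no constructors at all. *)
type event = ..

(* Constructors are added after the fact. *)
type event += Key_press of char

(* Another part of the program can extend the same type later, without
   touching the original definition. *)
type event += Mouse_click of int * int

(* A match on an open type can never be exhaustive, so a wildcard case
   is needed for constructors added elsewhere. *)
let describe = function
  | Key_press c -> Printf.sprintf "key %c" c
  | Mouse_click (x, y) -> Printf.sprintf "click at (%d, %d)" x y
  | _ -> "unknown event"
```

This is the same mechanism that OCaml's built-in exception type has always had, generalised to user-defined types.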
Thread-local storage runtime
While OCamlPro was working on their reentrant OCaml runtime, we took a different tack by adding thread-local storage to the runtime instead, courtesy of Stephen Dolan. This is an important choice to make at the outset of adding multicore, so both approaches are warranted. The reentrant runtime adds a lot of code churn (due to adding a context parameter to most function calls) and takes up a register, whereas the thread-local storage approach we tried doesn’t permit callbacks to different threads.
Much of this work isn’t interesting on its own, but forms the basis for a fully multicore runtime (with associated programming model) in 2014. Stay tuned!
One other complaint from the Consortium members was quite surprising: the difficulty of using the OCaml foreign function interface safely to interface with C code. Jeremy Yallop began working on the ctypes library that had the goal of eliminating the need to write any C code at all for the vast majority of foreign bindings.
Instead, Ctypes lets you describe any C function call as an OCaml value, and provides various linkage options to invoke that function into C. The first option he implemented was a dlopen interface, which immediately brought us the same level of functionality as the Python or Haskell Ctypes equivalents. This early code was in itself startlingly useful and more pleasant to use than the raw FFI, and various folk (such as David Sheets’ libsodium cryptography bindings) started adopting it.
At this point, I happened to be struggling to write the Foreign Function Interface chapter of Real World OCaml without blowing through our page budget with a comprehensive explanation of the existing system. I decided to take a risk and write about Ctypes instead, since it gives users new to the language a far more productive way to get started. Xavier Leroy pointed out some shortcomings of the library in his technical book review, most notably the lack of an interface with C macros. The design of Ctypes fully supports linking mechanisms other than just dlopen though, and Jeremy has added automatic C stub generation support as well. This means that if you use Ctypes to build an OCaml binding in 2014, you can choose several mechanisms for the same source code to link to the external system. Jeremy even demonstrated a forking model at OCaml 2013 that protects the OCaml runtime from the C binding via process separation.
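The core trick of describing a C function as an ordinary OCaml value can be sketched without the library. In ctypes itself a binding looks roughly like foreign "puts" (string @-> returning int); the code below does not use ctypes and is not its implementation. It is a toy model that borrows the @-> and returning combinator names, with hypothetical constructors (C_int, C_double, C_string) and a hypothetical prototype function that only prints the C signature a description stands for.

```ocaml
(* A tiny universe of C value types, indexed by the matching OCaml type. *)
type _ typ =
  | C_int : int typ
  | C_double : float typ
  | C_string : string typ

(* A function description: argument types chained onto a return type. *)
type _ fn =
  | Returning : 'a typ -> 'a fn
  | Arrow : 'a typ * 'b fn -> ('a -> 'b) fn

let returning t = Returning t
let ( @-> ) arg rest = Arrow (arg, rest)

let c_name : type a. a typ -> string = function
  | C_int -> "int"
  | C_double -> "double"
  | C_string -> "char *"

(* Walk the description to recover the C prototype it stands for. *)
let prototype name fn =
  let rec go : type a. string list -> a fn -> string =
    fun args -> function
      | Returning ret ->
        Printf.sprintf "%s %s(%s);" (c_name ret) name
          (String.concat ", " (List.rev args))
      | Arrow (arg, rest) -> go (c_name arg :: args) rest
  in
  go [] fn

(* This value has type (string -> int -> int) fn, so anything generated
   from it can be type-checked like a normal OCaml function. *)
let example = C_string @-> C_int @-> returning C_int

let () = print_endline (prototype "f" example)
```

Because the description is just data, the same value can be interpreted in several ways (a dynamic call, generated stubs, or a separate process), which is the flexibility described above.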
The effort is paying off: Daniel Bünzli ported SDL2 using ctypes, and gave us extensive feedback about any missing corner cases, and the resulting bindings don’t require any C code to be written. Jonathan Protzenko even used it to implement an OCaml controller for the Adafruit Raspberry Pi RGB LCD!
Our community efforts were largely online, but we also hosted visitors over the year and regular face-to-face tutorials.
Online at OCaml.org
While the rest of the crew were hacking on OPAM and OCaml, Amir Chaudhry and Philippe Wang teamed up with Ashish Agarwal and Christophe Troestler to redesign and relaunch the OCaml website. Historically, OCaml’s homepage has been the caml.inria.fr domain, and the ocaml.org effort was begun by Christophe and Ashish some years ago to modernize the web presence.
The webpages were already rather large with complex scripting (for example, the 99 Problems page runs the OCaml code to autogenerate the output). Philippe developed a template DSL that made it easier to unify a lot of the templates around the website, and also a Markdown parser that we could link to as a library from the rest of the infrastructure without shelling out to Pandoc.
Meanwhile, Amir designed a series of interactive wireframe sketches and gathered feedback on it from the community. A local design agency in Cambridge helped with visual look and feel, and finally at the end of the summer we began the migration to the new website, followed by a triumphant switchover in November to the design you see today.
The domain isn’t just limited to the website itself. Leo and I set up a SVN-to-Git mirror of the OCaml compiler Subversion repository on the GitHub OCaml organization, which is proving popular with developers. There is an ongoing effort to simplify the core compiler tree by splitting out some of the larger components, and so camlp4 is also now hosted on that organization, along with OASIS. We also administer several subdomains of ocaml.org, such as the mailing lists and the OPAM repository, and other services such as the OCaml Forge are currently migrating over. This was made significantly easier thanks to sponsorship from Rackspace Cloud (users of XenServer which is written in OCaml). They saw our struggles with managing physical machines and gave us developer accounts, and all of the ocaml.org infrastructure is now hosted on Rackspace. We’re very grateful for their ongoing help!
If you’d like to contribute to infrastructure help (for example, I’m experimenting with a GitLab mirror), then please join the email@example.com mailing list and share your thoughts. The website team also need help with adding content and international translations, so head over to the website issue tracker and start proposing improvements you’d like to see.
Next steps for ocaml.org
The floodgates requesting features opened up after the launch of the new look and feel. Pretty much everyone wanted deeper OPAM integration into the main website, for features such as:
- Starring and reviewing packages
- Integrating the opam-doc documentation with the metadata
- Displaying test results and a compatibility matrix for non-x86 and non-Linux architectures
- Linking to blog posts and tutorials about the package
Many of these features were part of the original wireframes but we’re being careful to take a long-term view of how they should be created and maintained. Rather than building all of this as a huge bloated opam2web extension, David Sheets (our resident reluctant-to-admit-it web expert) has designed an overlay directory scheme that permits the overlaying of different metadata onto the website. This lets one particular feature (such as blog post aggregation) be handled separately from the others via Atom aggregators.
Real World OCaml
A big effort that took up most of the year for me was finishing and publishing an O’Reilly book called Real World OCaml with Yaron Minsky and Jason Hickey. Yaron describes how it all started in his blog post, but I learnt a lot from developing a book using the open commenting scheme that we developed just for this.
In particular, the book ended up shining a bright light into dark language corners that we might otherwise not have explored in OCaml Labs. Two chapters of the book that I wasn’t satisfied with were the objects and classes chapters, largely since neither Yaron nor Jason nor I had ever really used their full power in our own code. Luckily, Leo White decided to pick up the baton and champion these oft-maligned (but very powerful) features of OCaml, and the result is the clearest explanation of them that I’ve read yet. Meanwhile, Jeremy Yallop helped out with extensive review of the Foreign Function Interface chapter that used his ctypes library. Finally, Jérémie Dimino at Jane Street worked hard on adding several features to his utop toplevel that made it compelling enough to become our default recommendation for newcomers.
All in all, we ended up closing over 2000 comments in the process of writing the book, and I’m very proud of the result (freely available online, but do buy a copy if you can to support it). Still, there’s more I’d like to do in 2014 to improve the ease of using OCaml further. In particular, I removed a chapter on packaging and build systems since I wasn’t happy with its quality, and both Thomas Gazagnaire and I intend to spend time in 2014 on improving this part of the ecosystem.
Tutorials and Talks
We had a lively presence at ICFP 2013 this year, with the third iteration of the OCaml workshop (OCaml 2013) held there, and Stephen Dolan presenting a paper in the main conference. I liveblogged OCaml 2013 and CUFP 2013 as they happened, and all the talks we gave are linked from the program. The most exciting part of the conference for a lot of us was the two talks by Facebook on their use of OCaml: first for program analysis using Pfff and then to migrate their massive PHP codebase using an OCaml compiler. I also had the opportunity to participate in a panel at the Haskell Workshop on whether Haskell is too big to fail yet; lots of interesting perspectives on scaling another formerly academic language into the real world.
Yaron Minsky and I have been giving tutorials on OCaml at ICFP for several years, but the release of Real World OCaml has made it significantly easier to give tutorials without the sort of labor intensity that it took in previous years (one memorable ICFP 2011 tutorial took almost 2 hours to get everyone set up with OCaml, whereas at ICFP 2013 it took us 15 minutes or so to get everyone started). Still, giving tutorials at ICFP is very much preaching to the choir, and so we’ve started speaking at more general-purpose events.
Julien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.
Marius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. Maybe.
A successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!
Visitors and Interns
Since OCaml Labs is a normal group within the Cambridge Computer Lab, we often host academic visitors and interns who pass through. This year was certainly diverse, and we welcomed a range of colleagues:
- Mathias Bourgoin has just finished his work on interfacing OCaml with GPUs, and gave us a seminar on how his SPOC tool works (also available in OPAM via a custom remote).
- Roberto Di Cosmo, who directs the IRILL organization on Free Software in Paris delivered a seminar on constraint solving for package systems that are as large-scale as Debian’s.
- Thomas Gazagnaire visited during the summer to help plot the Mirage 1.0 and OPAM 1.1 releases. He has also since joined OCaml Labs fulltime to work on Nymote.
- Louis Gesbert from OCamlPro visited for 2 weeks in December and kicked off the inaugural OPAM developers summit (which was, admittedly, just 5 developers in the Kingston Arms, but all good things start in a pub, right?)
- Jonathan Protzenko presented his PhD work on Mezzo (which is now merged into OPAM), and educated us on the vagaries of Windows support.
- Gabriel Scherer from the Gallium INRIA group visited to discuss the direction of OPAM and various language feature discussions (such as namespaces). He didn’t give a talk, but promises to do so next time!
- Benoît Vaugon gave a seminar on his OCamlCC OCaml-to-C compiler, talked about porting OCaml to 8-bit PICs, and using GADTs to implement Printf properly.
We were also visited several times by Wojciech Meyer from ARM, who was an OCaml developer who maintained (among other things) the ocamlbuild system and worked on DragonKit (an extensible LLVM-like compiler written in OCaml). Wojciech very sadly passed away on November 18th, and we all fondly remember his enthusiastic and intelligent contributions to our small Cambridge community.
We also hosted visitors to live in Cambridge and work with us over the summer. In addition to Vincent Botbol (who worked on OPAM-doc as described earlier) we had the pleasure of having Daniel Bünzli and Xavier Clerc work here. Here’s what they did in their own words.
Xavier Clerc: OCamlJava
Xavier Clerc took a break from his regular duties at INRIA to join us over the summer to work on OCaml-Java and adapt it to the latest JVM features. This is an incredibly important project to bridge OCaml with the huge Java community, and here’s his report:
After a four-month visit to the OCaml Labs dedicated to the OCaml-Java project, the time has come for an appraisal! The undertaken work can be split into two areas: improvements to code generation, and interaction between the OCaml & Java languages. Regarding code generation, several classical optimizations have been added to the compiler, for example loop unrolling, more aggressive unboxing, better handling of globals, or partial evaluation (at the bytecode level). A new tool, namely ocamljar, has been introduced allowing post-compilation optimizations. The underlying idea is that some optimizations cannot always be applied (e.g. depending whether multiple threads/programs will coexist), but enabling them through command-line flags would lead to recompilation and/or multiple installations of each library according to the set of chosen optimizations. It is thus far easier to first build an executable jar file, and then modify it according to these optimizations. Furthermore, this workflow allows the ocamljar tool to take advantage of whole-program information for some optimizations. All these improvements, combined, often lead to a gain of roughly 1/3 in terms of execution time.
Regarding language interoperability, there are actually two directions depending on whether you want to call OCaml code from Java, or want to call Java code from OCaml. For the first direction, a tool allows to generate Java source files from OCaml compiled interfaces, mapping the various constructs of the OCaml language to Java classes. It is then possible to call functions, and to manipulate instances of OCaml types in pure Java, still benefiting from the type safety provided by the OCaml language. In the other direction, an extension of the OCaml typer is provided allowing to create and manipulate Java instances directly from OCaml sources. This typer extension is indeed a thin layer upon the original OCaml typer, that is mainly responsible for encoding Java types into OCaml types. This encoding uses a number of advanced elements such as polymorphic variants, subtyping, variance annotations, phantom typing, and printf-hack, but the end-user does not have to be aware of this encoding. On the surface, the type of instances of the Java Object classes is java'lang'Object java_instance, and instances can be created by calling Java.make.
Daniel Bünzli: Typography and Visualisation
Daniel joined us from Switzerland, and spent some time at Citrix before joining us in OCaml Labs. All of his software is now on OPAM, and is seeing ever-increasing adoption from the community.
Released a first version of Vg … I’m especially happy about that as I wanted to use and work on these ideas since at least 2008. The project is a long term project and is certainly not finished yet but this is already a huge step.
Adjusted and released a first version of Gg. While the module was already mostly written before my arrival to Cambridge, the development of Vg and Vz prompted me to make some changes to the module.
… released Otfm, a module to decode OpenType fonts. This is a work in progress as not every OpenType table has built-in support for decoding yet. But since it is needed by Vg’s PDF renderer I had to cut a release. It can however already be used to implement certain simple things like font kerning with Vg; this can be seen in action in the vecho binary installed by Vg.
Started to work on Vz, a module for helping to map data to Vg images. This is really unfinished and is still considered to be at a design stage. There are a few things that are however well implemented like (human) perceptually meaningful color palettes and the small folding stat module (Vz.Stat). However it quickly became evident that I needed to have more in the box w.r.t. text rendering in Vg/Otfm. Things like d3js entirely rely on the SVG/CSS support for text which makes it easy to e.g. align things (like tick labels on such drawings). If you can’t rely on that you need ways of measuring rendered text. So I decided to suspend the work on Vz and put more energy in making a first good release of Vg. Vz still needs quite some design work, especially since it tries to be independent of Vg’s backend and from the mechanism for user input.
Spent some time figuring out a new “opam-friendly” release workflow in pkgopkg. One of my problems is that by designing in the small for programming in the large (what a slogan), the number of packages I’m publishing is growing (12 and still counting). This means that I need to scale horizontally maintenance-wise, unhelped by the sad state of build systems for OCaml. I need tools that make the release process flawless, painless and up to my quality standards. This led me to enhance and consolidate my old scattered distribution scripts in that repo, killing my dependencies on Oasis and ocamlfind along the way. (edited for brevity, see here)
Daniel also left his bicycle here for future visitors to use, and the “Bünzli-bike” is available for our next visitor! (Louis Gesbert even donated lights, giving it a semblance of safety).
Most of our regular funding bodies such as EPSRC or EU FP7 provide funding, but leave all the intellectual input to the academics. A compelling aspect of OCaml Labs has been how involved our industrial colleagues have been with the day-to-day problems that we solve. Both Jane Street and Citrix have senior staff regularly visiting our group and working alongside us as industrial fellows in the Computer Lab.
- Mark Shinwell from Jane Street Europe has been working on improving the state of native debugging in OCaml, by adding extended DWARF debugging information to the compiler output. Mark is also a useful source of feedback about the forthcoming design of multicore, since he has daily insight into a huge production codebase at Jane Street (and can tell us about it without us requiring access!).
- Dave Scott is the principal architect of XenServer at Citrix in Cambridge. This year has been transformative for that project, since Citrix open-sourced XenServer to GitHub and fully adopted OPAM into their workflow. Dave is the author of numerous libraries that have all been released to OPAM, and his colleagues Jon Ludlam and Euan Harris are also regular visitors who have also been contributors to the OPAM and Mirage ecosystems.
The other 100% of our time at the Labs is spent on research projects. When we started the group, I wanted to set up a feedback loop between local people using OCaml to build systems, with the folk developing OCaml itself. This has worked out particularly well with a couple of big research projects in the Lab.
Mirage is a library operating system written in OCaml that compiles source code into specialised Xen microkernels, developed at the Cambridge Computer Lab, Citrix and the Horizon Digital Economy institute at Nottingham. This year saw several years of effort culminate in the first release of Mirage 1.0 as a self-hosting entity. While Mirage started off as a quick experiment into building specialised virtual appliances, it rapidly became useful to make into a real system for use in bigger research projects. You can learn more about Mirage here, or read the Communications of the ACM article that Dave Scott and I wrote to close out the year.
This project is where the OCaml Labs “feedback loop” has been strongest. A typical Mirage application consists of around 50 libraries that are all installed via OPAM. These range from device drivers to protocol libraries for HTTP or DNS, to filesystems such as FAT32. Coordinating regular releases of all of these would be near impossible without using OPAM, and has also forced us to use our own tools daily, helping to sort out bugs more quickly. You can see the full list of libraries on the OCaml Labs software page.
Mirage is also starting to share code with big projects such as XenServer now, and we have been working with Citrix engineers to help them to move to the Core library that Jane Street has released (and that is covered in Real World OCaml). Moving production codebases this large can take years, but OCaml Labs is turning out to be a good place to start unifying some of the bigger users of OCaml into one place. We’re also now an official Xen Project incubator project, which helps us to validate functional programming to other Linux Foundation efforts.
Nymote and User Centric Networking
The release of Mirage 1.0 has put us on the road to simplifying embedded systems programming. The move to the centralized cloud has led to regular well-publicised privacy and security threats to the way we handle our digital infrastructure, and so Jon Crowcroft, Richard Mortier and I are leading an effort to build an alternative privacy-preserving infrastructure using embedded devices as part of the User Centric Networking project, in collaboration with a host of companies led by Technicolor Paris. This work also plays on the strong points of OCaml: it already has a fast ARM backend, and Mirage can easily be ported to the new Xen/ARM target as hardware becomes available.
One of the most difficult aspects of programming on the “wide area” Internet is dealing with the lack of a distributed identity service that’s fully secure. We published our thoughts on this at the USENIX Free and Open Communications on the Internet workshop, and David Sheets is working towards a full implementation using Mirage. If you’re interested in following this effort, Amir Chaudhry is blogging at the Nymote project website, where we’ll talk about the components as they are released.
Data Center Networking
At the other extreme from embedded programming is datacenter networking, and we started the Network-as-a-Service research project with Imperial College and Nottingham. With the rapid rise of Software Defined Networking this year, we are investigating how application-specific customisation of network resources can build faster, better, cheaper infrastructure. OCaml is in a good position here: several other groups have built OpenFlow controllers in OCaml (most notably, the Frenetic Project), and Mirage is specifically designed to assemble such bespoke infrastructure.
Another aspect we’ve been considering is how to solve the problem of optimal connectivity across nodes. TCP is increasingly considered harmful in high-throughput, high-density clusters, and George Parisis led the design of Trevi, which is a fountain-coding based alternative for storage networking. Meanwhile, Thomas Gazagnaire (who joined OCaml Labs in November) has been working on a branch-consistent data store called Irminsule which supports scalable data sharing and reconciliation using Mirage. Both of these systems will see implementations based on the research done this year.
Higher Kinded Programming
Jeremy Yallop and Leo White have been developing an approach that makes it possible to write programs with higher-kinded polymorphism (such as monadic functions that are polymorphic in the monad they use) without using functors. It’s early days yet, but there’s a library available on OPAM that implements the approach, and a draft paper that outlines the design.
Priorities for 2014
This year has been a wild ride to get us up to speed, but we now have a solid sense of what to work on for 2014. We’ve decided on a high-level set of priorities led by the senior members of the group:
- Multicore: Leo White will be leading efforts in putting an end-to-end multicore capable OCaml together.
- Metaprogramming: Jeremy Yallop will direct the metaprogramming efforts, continuing with Ctypes and into macros and extension points.
- Platform: Thomas Gazagnaire will continue to drive OPAM development towards becoming the first OCaml Platform.
- Online: Amir Chaudhry will develop the online and community efforts that started in 2013.
These are guidelines for choosing where to spend our time, and do not exclude other work or day-to-day bugfixing. Our focus on collaboration with Jane Street, Citrix, Lexifi, OCamlPro and our existing colleagues will continue, and we warmly welcome new community members that wish to work with us on any of the projects, either via internships, studentships or good old-fashioned open source hacking.
I appreciate the whole team’s feedback in editing this long post into shape, the amazing professorial support from Jon Crowcroft, Ian Leslie and Alan Mycroft throughout the year, and of course the funding and support from Jane Street, Citrix, RCUK, EPSRC, DARPA and the EU FP7 that made all this possible. Roll on 2014, and please do get in touch with me with any queries!
Now that OCaml 4.01 has been released, there is a frenzy of commit activity in the development trunk of OCaml as the new features for 4.02 are all integrated. These include some enhancements to the type system such as injectivity, module aliases and extension points as a simpler alternative to syntax extensions.
The best way to ensure that these all play well together is to test against the ever-growing OPAM package database as early as possible. While we’re working on more elaborate continuous building solutions, it’s far easier if a developer can quickly run a bulk build on their own system. The difficulty with doing this is that you also need to install all the external dependencies (e.g. libraries and header files for bindings) needed by the thousands of packages in OPAM.
Enter a hip new lightweight container system called Docker. While containers aren’t quite as secure as type-1 hypervisors such as Xen, they are brilliant for spawning lots of lightweight tasks such as installing (and reverting) package installations. Docker is still under heavy development, but it didn’t take me long to follow the documentation and put together a configuration file for creating an OCaml+OPAM image to let OCaml developers do these bulk builds.
A basic Docker and OPAM setup
I started by spinning up a fresh Ubuntu Saucy VM on the Rackspace Cloud, which has a recent enough kernel version to work out-of-the-box with Docker. The installation instructions worked without any problems.
Next, I created a Dockerfile to represent the set of commands needed to prepare the base Ubuntu image with an OPAM and OCaml environment. You can find the complete repository online at https://github.com/avsm/docker-opam. Let’s walk through the Dockerfile in chunks.
FROM ubuntu:latest
MAINTAINER Anil Madhavapeddy <firstname.lastname@example.org>
RUN apt-get -y install sudo pkg-config git build-essential m4 software-properties-common
RUN git config --global user.email "email@example.com"
RUN git config --global user.name "Docker CI"
RUN apt-get -y install python-software-properties
RUN echo "yes" | add-apt-repository ppa:avsm/ocaml41+opam11
RUN apt-get -y update -qq
RUN apt-get -y install -qq ocaml ocaml-native-compilers camlp4-extra opam
ADD opam-installext /usr/bin/opam-installext
This sets up a basic OCaml and OPAM environment using the same Ubuntu PPAs as the Travis instructions I posted a few months ago. The final command adds a helper script which uses the new depexts feature in OPAM 1.1 to also install operating system packages that are required by some libraries. I’ll explain in more detail in a later post, but for now all you need to know is that opam installext ctypes will not only install the ctypes OCaml library, but also invoke apt-get install libffi-dev to install the relevant development library first.
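As a rough illustration of the idea (not the actual opam-installext script from the repository), a wrapper of this kind could first look up the system packages a library needs and then hand over to OPAM. In the sketch below, depexts_for is a hypothetical stand-in with a hard-coded table holding only the ctypes mapping mentioned above, plan_install is likewise a made-up name, the use of sudo is an assumption, and the commands are printed rather than executed:

```shell
#!/bin/sh
# Sketch only: NOT the real opam-installext helper.
# depexts_for is a hypothetical lookup; a real helper would query the
# depexts metadata in the OPAM repository instead of a hard-coded table.
depexts_for() {
    case "$1" in
        ctypes) echo "libffi-dev" ;;   # the one mapping named in the post
        *)      echo "" ;;             # unknown package: no system packages
    esac
}

# Print (rather than run) the commands such a wrapper would issue.
plan_install() {
    pkg=$1
    deps=$(depexts_for "$pkg")
    if [ -n "$deps" ]; then
        # Assumption: system packages are installed via sudo apt-get.
        echo "sudo apt-get -y install $deps"
    fi
    echo "opam install $pkg"
}

plan_install ctypes
```

Printing the plan instead of running it keeps the sketch safe to try outside a container; swapping each echo for the command itself would turn the plan into an installation.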
RUN adduser --disabled-password --gecos "" opam
RUN passwd -l opam
ADD opamsudo /etc/sudoers.d/opam
USER opam
ENV HOME /home/opam
ENV OPAMVERBOSE 1
ENV OPAMYES 1
The next chunk of the Dockerfile configures the OPAM environment by installing a non-root user (several OPAM packages fail with an error if configured as root). We also set the OPAMVERBOSE and OPAMYES variables to ensure we get the full build logs and non-interactive use, respectively.
Running the bulk tests
We’re now set to build a Docker environment for the exact test that we want to run.
RUN opam init git://github.com/mirage/opam-repository#add-depexts-11
RUN opam install ocamlfind
ENTRYPOINT ["/usr/bin/opam-installext"]
This last addition to the Dockerfile initializes our OPAM package set. This is using my development branch which adds a massive diff to populate the OPAM metadata with external dependency information for Ubuntu and Debian.
Building an image from this is a single command:
docker build -t avsm/opam github.com/avsm/docker-opam
ENTRYPOINT tells Docker that our wrapper script is the “root command” to run for this container, so we can install a package in a container by doing this:
docker run avsm/opam ctypes
The complete output is logged to stdout and stderr, so we can capture that as easily as a normal shell command. With all these pieces in place, my local bulk build shell script is trivial:
pkg=`opam list -s -a`
RUN=5
mkdir -p /log/$RUN/raw /log/$RUN/err /log/$RUN/ok
for p in $pkg; do
  docker run avsm/opam $p > /log/$RUN/raw/$p 2>&1
  if [ $? != 0 ]; then
    ln -s /log/$RUN/raw/$p /log/$RUN/err/$p
  else
    ln -s /log/$RUN/raw/$p /log/$RUN/ok/$p
  fi
done
This iterates through a local package set and serially builds everything. Future enhancements I’m working on: parallelising these on a multicore box, and having a linked container that hosts a local package repository so that we don’t require a lot of external bandwidth. Stay tuned!
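Since the bulk build script sorts every package into an ok or err directory, a finished run can be tallied with a few lines of shell. This is only a sketch: summarise_run is a hypothetical helper that is not part of the docker-opam repository, and it assumes the raw/err/ok log layout created by the script.

```shell
#!/bin/sh
# Sketch only: tally a finished bulk-build run from its log directory.
# summarise_run is a hypothetical helper; it assumes the ok/ and err/
# subdirectories that the bulk build loop populates with symlinks.
summarise_run() {
    root=$1
    # Count directory entries; arithmetic expansion strips wc's padding.
    ok=$(( $(ls "$root/ok" | wc -l) ))
    err=$(( $(ls "$root/err" | wc -l) ))
    echo "ok=$ok err=$err total=$((ok + err))"
    # List the failures so their raw logs can be inspected next.
    ls "$root/err" | sed 's/^/FAILED: /'
}
```

For the run above this would be invoked as summarise_run /log/5, and each FAILED line names a file under /log/5/raw holding the full build output.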
I’ve just uploaded the camera-ready of our HotNets 2013 paper, titled “Trevi: Watering Down Storage Hotspots with Cool Fountain Codes”. This is my first foray into fountain coding, and it’s really exciting working with George Parisis (who has done lots of cool work in this field) to apply it as the backend storage to upcoming projects.
For instance, I’m working with Thomas Gazagnaire on a branch-consistent database called Irminsule, which aims to support very scalable git-style persistent programming. Another interesting k/v store is Arakoon which we ported to MirageOS, and Trevi provides a nice mechanism for retrieving large binary values.
Here’s the abstract:
Datacenter networking has brought high-performance storage systems research to the foreground once again. Many modern storage systems are built with commodity hardware and TCP/IP networking to save costs. We highlight a group of problems that are present in such storage systems and which are all related to the use of TCP. As an alternative, we explore Trevi: a fountain coding-based approach for distributing I/O requests that overcomes these problems while still efficiently scheduling resources across both networking and storage layers. We also discuss how receiver-driven flow and congestion control, in combination with fountain coding, can guide the design of Trevi and provide a viable alternative to TCP for datacenter storage.
The attraction of our fountain coding scheme is its simplicity when compared to dealing with TCP-based retrieval, particularly for larger k/v stores where multiple nodes can respond without any strong synchronization. Comments on the paper are most welcome ahead of HotNets!
Trevi: Watering Down Storage Hotspots with Cool Fountain Codes, George Parisis, Toby Moncaster, Anil Madhavapeddy and Jon Crowcroft in the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII), Nov 2013.
Yaron Minsky and I have been running OCaml tutorials for a few years at ICFP and CUFP, but haven’t really spread out into the wider conference circuit. Now that Real World OCaml is almost finished, the scene is set for doing much more. The first such tutorial is being held at FPDays 2013 on October 24th in the lovely Murray Edwards College in Cambridge. Check out the Lanyrd page for ticket information, and the OCaml session page for more information.
The basic layout of the tutorial is to get started with the guided tour of the book, and then work through building a distributed message broker. This gets you familiar with the Core standard library, the Async event-driven I/O library, and all the strongly-typed RPC plumbing that goes in between. We’re hoping to have physical preprints of the book available for free to attendees, so do sign up fast if you wish to attend.
As a bonus, the Cambridge FPDays session will feature Jeremy Yallop working through the book and conducting the tutorial: he has an incredible depth of knowledge about the innards of OCaml’s type system, and so advanced users will also find a good home in this tutorial to throw questions at him too! For those of you interested in other programming languages, there are also excellent-looking sessions on Erlang, F# and Scala, and Phil Wadler is giving a keynote speech. I’m most excited about Sam Aaron’s session on live coding and music though. You have to hear it to believe it…