I’m at the 2011 OCaml Users Group in Paris, reporting on some splendid talks this year. It looked like around 60-70 people in the room, and I had the pleasure of meeting users all the way from Russia to New York as well as all the Europeans!
First up was Pierre Chambart talking about the js_of_ocaml compiler. It compiles OCaml bytecode directly to Javascript, with few external dependencies. Since the bytecode format changes very rarely, it is simpler to maintain than alternatives (such as Jake Donham’s ocamljs) that require patching the compiler tool-chain. Javascript objects are mapped to dynamic OCaml objects via a light-weight `##` operator, so you can simply write code like:

    open Js  (* assumed: brings js_string, t, meth and prop into scope *)

    (* OCaml view of the browser's window object *)
    class type window = object
      method alert : js_string t -> unit meth
      method name : js_string t prop
    end

    let window : window t =
      Js.Unsafe.variable "window"

    let () =
      window##alert (window##name);        (* call a Javascript method *)
      window##name <- Js.string "name"     (* assign a Javascript property *)

Overloading is handled similarly to PyObjC, with each parameter combination being mapped into a uniquely named function. Raphael Proust then demonstrated a cool game he wrote using bindings to the Raphael Javascript vector graphics library. Performance of js_of_ocaml is good compared to hand-written Javascript, and they have quite a few benchmarks on their website.
Overall the project looks very usable: the main omissions are Bigarray, dynlink, Str (replaced by native regexps), recursive modules and weak references. None of these missing features seems very critical for the sorts of applications that `js_of_ocaml` is intended for.
Next up Philippe Wang presented something completely different: running OCaml on tiny 8-bit PIC microcontrollers! These PICs have 4-128KB of flash (to store the code), and from 256 bytes to 4 kilobytes of RAM. Not a lot of room to waste there. He demonstrated a game with 24 physical push buttons that beat humans at a conference (JFLA).
It works by translating OCaml bytecode through several stages: `ocamlclean` to eliminate dead code in the bytecode (which would be very useful for native code too!), a compression step that does run-length encoding, and then translation to PIC assembly. They have a replacement stop-and-copy GC (150 lines of assembly) and a full collection cycle runs in less than 1.5ms. Integers are 15 bits (with 1 bit reserved) and the block representation is the same as native OCaml. Very cool project!
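To give a feel for the compression stage, here is a minimal sketch of run-length encoding over a list of byte values. It only illustrates the general technique: the exact encoding used by the PIC toolchain was not described in the talk, and the names `rle_encode` and `rle_decode` are hypothetical.

```ocaml
(* Run-length encoding over a list of byte values (0-255).
   Hypothetical helper names; not the encoder used in the talk. *)

(* Collapse runs of equal values into (count, value) pairs. *)
let rle_encode (bytes : int list) : (int * int) list =
  let rec go acc = function
    | [] -> List.rev acc
    | b :: rest ->
      (match acc with
       | (n, b') :: acc' when b' = b -> go ((n + 1, b) :: acc') rest
       | _ -> go ((1, b) :: acc) rest)
  in
  go [] bytes

(* Expand (count, value) pairs back into the original sequence. *)
let rle_decode (runs : (int * int) list) : int list =
  List.concat (List.map (fun (n, b) -> List.init n (fun _ -> b)) runs)
```

Bytecode with long runs of identical bytes (padding, repeated constants) shrinks well under this scheme, which matters when the whole program has to fit in a few kilobytes of flash.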
We moved on to static analysis, and Julien Signoles presented Frama-C, a powerful static analysis tool for real-world C. It forks the CIL project from Berkeley and adds ocamlgraph and GUI support. He demonstrated a simple plugin that counts loops in C code, and the homepage has many interesting plugins maintained by the community.
I hadn’t realised that CIL was still maintained in the face of clang, so it’s nice to see it live on as part of Frama-C.
The ever-cheerful Vincent Balat updated us about the Ocsigen web framework, including unveiling their exciting new logo! This was written using an amazing collaborative editor that lets users edit in real time.
Ocsigen is based around services of type `service: parameters -> page`. Services are first-class values, and can be registered dynamically and associated with sessions. The collaborative editor took only about 100 lines of code.
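The idea of services as first-class values that are registered at run time can be sketched in plain OCaml. This is a toy model only and does not use the Ocsigen API; every name in it (`params`, `page`, `service`, `register`, `dispatch`) is hypothetical.

```ocaml
(* A toy model of first-class services; this is NOT the Ocsigen API.
   All names here are hypothetical. *)

type params = (string * string) list   (* decoded request parameters *)
type page = string                      (* rendered page body *)
type service = params -> page           (* a service is just a function value *)

(* Registry mapping URL paths to services; it can be updated at run time. *)
let registry : (string, service) Hashtbl.t = Hashtbl.create 16

let register (path : string) (s : service) : unit =
  Hashtbl.replace registry path s

let dispatch (path : string) (ps : params) : page option =
  match Hashtbl.find_opt registry path with
  | Some s -> Some (s ps)
  | None -> None

(* A service that greets the user named in the "name" parameter. *)
let hello : service = fun ps ->
  let name = try List.assoc "name" ps with Not_found -> "world" in
  "Hello, " ^ name ^ "!"

let () = register "/hello" hello
```

Because a service is an ordinary value, it can be created inside a request handler and registered for one session only, which is the sort of dynamic registration described above.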
There is a syntax extension to distinguish between client and server side code, and both can be written in the same service (invoking `js_of_ocaml` to compile the client code to Javascript). They have bindings to Google Closure in order to provide UI support. There is a really nice “bus” service to pass messages between the server and the client, with seamless integration of Lwt to hide the details of communication to the browser.
Ocsigen is looking like a very mature project at this point, and I’m very keen to integrate it with Mirage to specialise the web applications into micro-kernels. A task for the hacking day tomorrow morning I think!
I talked about Mirage, hurrah! There were good questions about why we need a block device (and not just use NFS), and I replied that everything is available as a library and the programmer can choose depending on their needs (the core goal of exokernels).
A highlight for me was lunch where I finally met Richard Jones, who is one of the other OCaml and cloud hackers out there. We had a wide-ranging conversation about the cool stuff going on in KVM and Red Hat in general. Richard also gave a short talk about how they use OCaml to generate hundreds of thousands of lines of code in libguestfs. There are bindings for pretty much every major language, and it is all generated from an executable specification. He notes that “normal” programmers love the OCaml type safety without explicit annotations, and that it is a really practical language for the working programmer. The Xen Cloud Platform also has a similar generator for XenAPI bindings, so I definitely agree with him about this!
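As a rough illustration of generating bindings from an executable specification, here is a minimal sketch in which API calls are described as OCaml data and C prototypes are printed from them. The types, the `demo_` prefix and the two example calls are all hypothetical and far simpler than a real generator.

```ocaml
(* A toy "executable specification": API calls described as OCaml data,
   from which C prototypes are generated. Everything here is hypothetical. *)

type arg = AString of string | AInt of string   (* argument kind and name *)
type ret = RErr | RString                        (* return convention *)
type call = { name : string; args : arg list; ret : ret }

let c_arg = function
  | AString n -> "const char *" ^ n
  | AInt n -> "int " ^ n

let c_ret = function
  | RErr -> "int"          (* an error code *)
  | RString -> "char *"

(* Render one call of the specification as a C prototype. *)
let c_prototype (c : call) : string =
  let args = match c.args with
    | [] -> "void"
    | l -> String.concat ", " (List.map c_arg l)
  in
  Printf.sprintf "%s demo_%s (%s);" (c_ret c.ret) c.name args

let spec = [
  { name = "mount"; args = [AString "device"; AString "mountpoint"]; ret = RErr };
  { name = "version"; args = []; ret = RString };
]

let header = String.concat "\n" (List.map c_prototype spec)
```

Adding a binding for another language then means writing one more printer over the same `spec` list, and the OCaml type checker flags any printer that forgets a constructor.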
Xavier “superstar” Leroy then gave an update of OCaml development. Major new features in 3.12.0 are first-class modules, polymorphic recursion, local module opens, and richer operations over module signatures. Version 3.12.1 is coming out soon, with bug fixes (in camlp4 and ocamlbuild mainly), and better performance on `x86_64`: it turns out that a change in the `mov` instruction used improves floating point performance there.
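For readers who have not tried the 3.12 features yet, here is a small sketch touching three of them: first-class modules, polymorphic recursion and local module opens. The module and function names are made up for illustration, and the snippet is written against a current compiler rather than checked on 3.12.0 itself.

```ocaml
(* First-class modules: a module packed as an ordinary value. *)
module type SHOW = sig
  type t
  val show : t -> string
end

module ShowInt = struct
  type t = int
  let show = string_of_int
end

let int_shower = (module ShowInt : SHOW with type t = int)

(* Unpack the module value locally and use it. *)
let show_with (type a) (m : (module SHOW with type t = a)) (x : a) : string =
  let module M = (val m : SHOW with type t = a) in
  M.show x

(* Polymorphic recursion: the recursive call is at type ('a list) nested,
   so the explicit 'a. annotation is required. *)
type 'a nested = Flat of 'a | Nest of 'a list nested

let rec depth : 'a. 'a nested -> int = function
  | Flat _ -> 0
  | Nest n -> 1 + depth n

(* Local module open: List's names are in scope only inside the parentheses. *)
let total = List.(fold_left (+) 0 (map succ [1; 2; 3]))
```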
OCaml 3.13 has no release date, but several exciting features are in the pipeline. Firstly, first-class modules become more lightweight by permitting some annotations to be inferred from the context, and patterns are introduced to match and bind first-class module values. Much more exciting is support for GADTs (Generalised Algebraic Data Types). This permits more type constraints to be enforced at compile time:

    type _ t =
      | IntLit : int -> int t
      | Pair : 'a t * 'b t -> ('a * 'b) t
      | App : ('a -> 'b) t * 'a t -> 'b t
      | Abs : ('a -> 'b) -> ('a -> 'b) t

    let rec eval : type s . s t -> s = function
      | IntLit x -> x                      (* s = int here *)
      | Pair (x,y) -> (eval x, eval y)     (* s = 'a * 'b here *)
      | App (f,a) -> (eval f) (eval a)
      | Abs f -> f

In this example of a typed interpreter, the `eval` function is annotated with the type `type s . s t -> s`, which lets each branch of the pattern match have a constrained type for `s` depending on the use. This reminded me of Edwin Brady’s partial evaluation work using dependent types, but a much more restricted version suitable for OCaml.
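To see the compile-time constraints at work, here is a slightly different sketch of a typed expression language with booleans and conditionals. It is not from the talk: the constructors are illustrative, and it assumes a compiler with GADT support (released OCaml versions from 4.00 onwards).

```ocaml
(* A small typed expression language; constructors are illustrative. *)
type _ expr =
  | Int  : int -> int expr
  | Bool : bool -> bool expr
  | Add  : int expr * int expr -> int expr
  | Eq   : int expr * int expr -> bool expr
  | If   : bool expr * 'a expr * 'a expr -> 'a expr

(* The return type follows the index of the expression, so no runtime
   tag checks are needed in the evaluator. *)
let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (x, y) -> eval x + eval y
  | Eq (x, y) -> eval x = eval y
  | If (c, t, e) -> if eval c then eval t else eval e

let example = If (Eq (Add (Int 1, Int 2), Int 3), Int 10, Int 20)

(* Ill-typed terms are rejected by the compiler, for example:
   let bad = Add (Int 1, Bool true) *)
```

Evaluating `example` yields a plain `int`, with no wrapper type to unpack, and the commented-out `bad` term does not type check because `Bool true` is a `bool expr` where an `int expr` is required.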
There are some really interesting uses for GADTs:
The challenges in the implementation are that principal type inference is now impossible (so some annotation is required), and pattern matching warnings are also trickier.
From the IDE perspective, the third bit of work is to have the OCaml compiler save the full abstract syntax tree annotated with source locations, scoping information, types (declared and inferred) and additional user-defined annotations. This generalises the `-annot` flag and can help projects like OCamlSpotter, OCamlWizard, OcaIDE, etc. It also helps code-generators driven by type-generators (such as our SQL ORM or ATDgen).
The OCaml consortium has new members: MLState and MyLife, with Esterel, OCamlPro and one unnamed new member also joining. The consortium goals are to sell permissive licensing (BSD) to members, and sound out new features with the serious users. Three companies are now doing commercial development (Gerd, OCamlCore, OCamlPro) which is growing the community nicely.
Luc Maranget (who looks like an archetypal mad professor!) gave a great rundown on JoCaml, a distributed programming extension to OCaml. It extends the compiler with join-definitions (a compiler patch) and a small bit of runtime support (using Thread), which together provide significant extensions for concurrent and distributed programming in a type-safe way.
It extends the syntax with three new keywords: `def`, `spawn` and `reply`, and new usage for `or` and `&` (you should be using `||` and `&&` anyway). Binary libraries remain compatible between matching versions of JoCaml and OCaml. An example of JoCaml code is:

    type t = {
      tick: unit Join.chan;
      wait: unit -> unit;
    }

    let create n =
      def st(rem) & tick() = st(rem-1)
      or st(0) & wait() = reply to wait in
      spawn st(n) ; { tick=tick; wait=wait; }

After `n` messages to `tick`, a pending call to the `wait` barrier function returns.

    let c = create 10  (* one tick per loop iteration below *)
    let () =
      for k = 0 to 9 do
        spawn begin printf "%i" k; c.tick () end
      done;
      c.wait ()

Here we asynchronously print the numbers from `0` to `9`, and then the `wait` call acts as a barrier until they have all finished. JoCaml is useful for distributed fork-join parallelism tasks such as raytracing, but with the type system support of OCaml. It is a bit like MapReduce, but without the data partitioning support of Hadoop (and is more light-weight). It would be quite interesting to combine some of the JoCaml extensions with the dynamic dataflow graphs in our own CIEL distributed execution engine.
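For readers without a JoCaml installation, the countdown barrier can be loosely modelled in plain sequential OCaml. This sketch has no threads or distribution, uses hypothetical names, and simplifies the join-pattern semantics (for instance, it releases every pending waiter once the count reaches zero), so treat it as an aid to intuition rather than a translation.

```ocaml
(* Sequential model of the countdown join-definition, in plain OCaml.
   No threads or distribution; names are hypothetical. *)
type barrier = {
  mutable rem : int;                      (* ticks still expected, like st(rem) *)
  mutable waiting : (unit -> unit) list;  (* continuations of pending waits *)
}

let create n = { rem = n; waiting = [] }

(* Loosely models the clause st(0) & wait(): once the count is zero,
   every pending waiter is resumed. *)
let release b =
  if b.rem = 0 then begin
    let ws = List.rev b.waiting in
    b.waiting <- [];
    List.iter (fun k -> k ()) ws
  end

(* Loosely models st(rem) & tick() = st(rem-1). *)
let tick b =
  if b.rem > 0 then b.rem <- b.rem - 1;
  release b

(* Registers a continuation instead of blocking the caller. *)
let wait b k =
  b.waiting <- k :: b.waiting;
  release b
```

The point of the join-calculus version is that the matching of `st`, `tick` and `wait` messages is done by the runtime, so none of this bookkeeping appears in user code.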
### Forgetful Memoisation in OCaml
Francois Bobot talked about the problem of memoising values so that they can be re-used (e.g. in a cache). Consider a standard memoiser:

    (* H is some hashtable module, f the function being memoised *)
    let memo_f =
      let cache = H.create () in
      fun k ->
        try H.find cache k
        with Not_found ->
          let v = f k in
          H.add cache k v;
          v

    let v1 = memo_f k1
    let v2 = memo_f k2 (* k2 = k1, so this is an O(1) cache hit *)

If a key is no longer reachable from anywhere other than the cache, we want to eliminate it from the cache too. The first solution is a normal hashtable, but this results in an obvious memory leak since a key held in the cache marks it as reachable. A better solution is using OCaml weak pointers that permit references to values without holding on to them (see Weaktbl by Zheng Li who is now an OCaml hacker at Citrix). The problem with Weaktbl is that if the value points to the key, this forms a cycle which will never be reclaimed.
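The leak with an ordinary hashtable is easy to observe. Below is a minimal runnable variant of the memoiser using the standard library's `Hashtbl`; the names `memoise`, `cached` and `slow_length` are hypothetical and only serve the demonstration.

```ocaml
(* A runnable variant of the standard memoiser, using the stdlib Hashtbl.
   The table holds strong references, so every key ever seen stays
   reachable: this is the leak discussed above. *)
let memoise (f : 'a -> 'b) : ('a -> 'b) * (unit -> int) =
  let cache : ('a, 'b) Hashtbl.t = Hashtbl.create 16 in
  let memo_f k =
    match Hashtbl.find_opt cache k with
    | Some v -> v                      (* hit: no recomputation *)
    | None ->
      let v = f k in
      Hashtbl.add cache k v;
      v
  in
  (* Also return a probe reporting how many bindings are retained. *)
  (memo_f, fun () -> Hashtbl.length cache)

(* Count how often the underlying function really runs. *)
let calls = ref 0
let slow_length (s : string) = incr calls; String.length s

let memo_length, cached = memoise slow_length
```

Calling `memo_length` twice on the same string runs `slow_length` once, but `cached ()` only ever grows, whether or not the program still holds the keys.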
Francois solves this by using Ephemerons from Smalltalk. They use the rule that the value can be reclaimed if the key or the ephemeron itself can be reclaimed by the GC, and have a signature like:

    module Ephemeron : sig
      type ('a,'b) t
      val create : 'a -> 'b -> ('a,'b) t
      val check : ('a,'b) t -> bool
      val get : ('a,'b) t -> 'b option
      val get_key : ('a,'b) t -> 'a option
    end

The implementation in OCaml patches the runtime to use a new tag for ephemerons, and the performance graphs in his slides look good. This is an interesting topic for me since we need efficient memoisation in Mirage I/O (see the effects on DNS performance in the Eurosys paper which used Weaktbl). When asked if the OCaml patch will be upstreamed, Damien Doligez said he did not like the worst-case complexity of long chains of ephemerons in the GC. There are several approaches under consideration to alleviate this without too many changes to the runtime, but Francois believes the current complexity is not too bad in practice.
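Later OCaml releases ship an `Ephemeron` module in the standard library, so the idea can be tried without patching anything. The sketch below uses that stdlib module (assuming OCaml 4.05 or later), not the signature or runtime patch from the talk, and the names `StringKey`, `Cache` and `memoise` are hypothetical.

```ocaml
(* Memoisation with a weak-keyed table from the stdlib Ephemeron module.
   This is the stdlib API, not the interface shown in the talk. *)
module StringKey = struct
  type t = string
  let equal = String.equal
  let hash = Hashtbl.hash
end

(* Bindings are held through ephemerons: a binding may be dropped by the
   GC once its key is no longer reachable from outside the table. *)
module Cache = Ephemeron.K1.Make (StringKey)

let memoise (f : string -> 'b) : string -> 'b =
  let cache = Cache.create 16 in
  fun k ->
    match Cache.find_opt cache k with
    | Some v -> v
    | None ->
      let v = f k in
      Cache.add cache k v;
      v

let calls = ref 0
let key = String.make 3 'a'          (* kept alive by this toplevel binding *)
let memo_upper = memoise (fun s -> incr calls; String.uppercase_ascii s)
```

While `key` stays reachable the cached result is reused; once the program drops its last reference to a key, the collector is free to discard the binding even if the cached value points back at the key.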
Sylvain came on stage later to give a demonstration of OASIS, an equivalent of Cabal for Haskell or CPAN for Perl. It works with a small `_oasis` file that describes the project, and then the OASIS tool auto-generates `ocamlbuild` files from it (this reminds me of Perl’s MakeMaker). Once the files are auto-generated, the project is self-contained and there is no further dependency on OASIS itself.
OASIS either works alongside an existing build system in a project, or can be integrated more closely with `ocamlbuild` by advanced users. Lots of projects are already using OASIS (from Cryptokit to Lwt to the huge Jane Street Core). He is also working on a distribution mechanism on a central website, which should make for convenient OCaml packaging when it is finished and gets more adoption from the community.
Finally, Ashish Agarwal led a discussion on how OCaml can improve its web presence for beginners. Lots of good ideas here (some of which we implemented when reworking the CUFP website last year). Looking forward to seeing what happens next year in this space! I really enjoyed the day; the quality of talks was very high, and there were many engaging discussions from all involved!
Of course, not all of the OCaml community action is in France. The ever-social Jake Donham organised the First Ever San Francisco User Group that I attended when I was over there a few weeks ago. Ok, admittedly it was mainly French people there too, but it was excellent to meet up with Mika, Martin, Julien, Henri and of course Jake when over there.
We should definitely have more of these fun local meetups, and a number of other OCaml hackers I mentioned it to want to attend next time in the Bay Area, if only to cry into their drinks about the state of multi-core... just kidding, OCamlPro is hard at work fixing that after all :-)