OxCaml Labs

We're exploring systems applications for Oxidised OCaml (or 'OxCaml' for short). The language extension itself is primarily developed by Jane Street, and our research group focusses on building high-performance systems in OxCaml that are competitive with other "systems" languages such as Rust, as well as exploring new trends such as the impact of AI-assisted development. We've been working on mainline OCaml for a long time, so we also help with stewardship of the wonderful OCaml community and maintain the tools and code we've released over the years.

This is all a highly collaborative effort across the group. The team is a merry band of postdoctoral associates and PhD students, including myself (Anil), Jon Ludlam, Mark Elvers, Patrick Ferris, Ryan Gibb, David Allsopp (departed to Jane Street at the start of 2026), Sadiq Jaffer and Michael Dales. Everything below is joint work not only within our group, but also across close collaborators such as Tarides, the FP Launchpad, Jane Streeters, and the wider community.

Here follows a review of roughly the first year of OxCaml Labs activity, from around February 2025 to March 2026.

1 OCaml Community Stewardship

The first topic is something of a continuation of OCaml Labs, and helps to motivate our interest in OxCaml. We maintain and improve the shared tooling and infrastructure that the upstream open-source OCaml ecosystem depends on.

1.1 Documentation, Tutorials and Lectures

A healthy programming language needs documentation, accessible teaching materials, and a welcoming community. We've been organising OCaml's participation in Outreachy since its inception. This scheme provides paid internships in open source for people underrepresented in tech. In the June 2025 round, we mentored two interns: one built Claudius, a fantasy-console graphics library, and another extended dune to discover system information. In the December 2025 round, three interns worked on Raven dashboards for ML experiments, better errors for YOCaml, and writing TIFF files in pure OCaml — the last of which feeds directly into our geospatial stack that we'll come to later!

1.1.1 odoc

odoc 3 is a major overhaul of OCaml's documentation generator, adding rendered source code navigation, type-based search via sherlodoc, cross-package linking into a web of interconnected docs, a global sidebar, and media support to move beyond just text.

We drove it into production use on large codebases (such as Jane Street's) with fixes for complex features such as module-type-of. After this, we rolled out package documentation on ocaml.org for all of opam in July 2025 with the ocaml-docs-ci scaling to 20k+ packages.

Once this was deployed, the rest of the year was spent getting it into full production use with odoc bug fixes, sherlodoc search integration, and support for OxCaml modes and layouts (more on that later!).

Searchable and cross-referenced documentation for every OCaml package

1.1.2 PPX syntax extensions

OCaml's PPX system allows developers to extend the language with custom syntax transformations and code generation (similar in spirit to Rust's procedural macros, working at the AST level). That's the good news; the bad news is that every single compiler release changes the AST, requiring updating almost all of the syntax extensions.

The community unified around ppxlib as the single framework for writing these extensions as a central place to perform these AST upgrades, and we help maintain it by bumping the ppxlib AST each time a new compiler version changes the parse tree, and then sending patches to downstream PPX authors.

1.1.3 Undergraduate teaching with notebooks

While I was on sabbatical in 2025-2026, Jon Ludlam covered the Cambridge 1A FoCS course and built interactive OCaml lecture slides using Slipshow combined with x-ocaml, a web component that compiles and runs OCaml code directly in the browser via js_of_ocaml. The lecturer can now modify and execute code snippets live during lectures, with type-on-hover and programmatic highlighting of the OCaml code (very useful for the notorious 'giving change' algorithm in the middle of the course).
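For readers who haven't met it, the 'giving change' problem asks for every way of making an amount from a till of coin denominations. The sketch below is one common formulation, written from memory rather than copied from the lecture notes, so treat the exact shape as an assumption. It shows why stepping through the recursion with highlighting helps: each call either spends the largest coin again or gives up on it.

```ocaml
(* All ways of making [amt] from the denominations in [till]
   (largest first; each coin may be used any number of times). *)
let rec change till amt =
  match till, amt with
  | _, 0 -> [ [] ]   (* exactly one way to make zero: hand over no coins *)
  | [], _ -> []      (* coins exhausted but amount remains: no way *)
  | c :: rest, _ ->
    if amt < c then change rest amt
    else
      (* either use coin [c] (and maybe use it again), or skip it entirely *)
      List.map (fun cs -> c :: cs) (change till (amt - c)) @ change rest amt
```

For example, `change [5; 2] 16` yields two solutions: two fives and three twos, or eight twos.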

The same technique is now one we're investigating to power interactive TESSERA notebooks that let users draw regions on a map, fetch embeddings, place training labels and run classifications all client-side in OCaml compiled to JavaScript. Separately, we built a transpiler from OCaml to Hazel (paper), bootstrapping a corpus of ill-typed OCaml programs for the Hazel live programming environment as well.

The 1A Foundations of Computer Science interactive notebooks

1.1.4 Reflections and experiments

We wrote an extensive Irmin retrospective to look back at a project that's been going for over a decade now. Monthly OCaml Roundups and quarterly summaries also helped to keep the community informed, alongside writeups with more detail on coding agents and fixes to the OCaml runtime.

Fun experiments also abound: we've been running multicore OCaml 5 on a Raspberry Pi Pico 2 microcontroller, building eInk display drivers on a Raspberry Pi, writing a complete ray tracer in OCaml using TSDL, OCaml 5 Domains, and Atomics (with performance work on reducing heap allocations), and getting into FPGA programming with HardCaml.

One issue we're facing now is the number of communications channels available, so blogging has been a good way to keep up with our individual progress while also keeping each other informed!

1.2 Package management

Every OCaml project depends on a toolchain of compilers, package management, build systems, and continuous integration to turn source code into useful software. We maintain significant parts of this infrastructure and are also researching how to make it better in the future.

1.2.1 Continuous Integration

OCaml-CI is the shared continuous integration service that automatically tests every package submitted to the opam repository across multiple OS and architecture combinations before it can be merged. We run and maintain the underlying cluster of build workers (spanning Linux x86/ARM64/PPC64/s390x/RISC-V, macOS, Windows and FreeBSD). This involved quite a bit of infrastructure shuffling last year as our Scaleway and ARM sponsorships both expired, so we spent time moving the cluster to Cambridge, issuing Docker base images for new compiler releases, and adding Windows builds, FreeBSD and ARM64 workers, and 32-bit backends. We also documented how to distribute OCaml via Homebrew for macOS users.

OCaml-CI is of course built in OCaml using our OCurrent pipeline framework, and each package build runs inside a Docker container constructed from our base images. So we have a deep dependency on containerisation, and a long history with it since Docker Desktop's own networking stack (VPNKit) has been written in OCaml since 2016. Our retrospective on "A Decade of Docker Containers" made the front page of the Communications of the ACM! This year we defunctorised VPNKit to port it to direct-style OCaml 5 with Eio, and reported on the migration at ICFP.

On the build side, we replaced the flaky Windows/containerd stack with OBuilder on Windows HCS and compared BuildKit against OCurrent for image building. I suspect that OCaml's use of Windows containers might be one of the largest such uses in the world, given the large number of bugs we've run into.

1.2.2 opam and dune

opam is OCaml's package manager, and dune is the build system that compiles OCaml projects. Together they form the basis of the OCaml Platform of default developer tooling. We help maintain the opam package repository and contribute to both tools.

The opam repository is a community-maintained collection of package metadata, via a large Git repository where every OCaml library has a file describing its dependencies, build commands, and compatibility constraints with other packages. When someone runs opam install, a constraint solver finds a compatible set of packages. We help review and merge submissions, and maintain tooling around the repository itself.
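To give a feel for what "finding a compatible set of packages" means, here is a toy backtracking resolver. It is a minimal sketch and not how opam's solver works: the record types, the package names and the integer versions are all invented for illustration, and real opam metadata has far richer constraints.

```ocaml
(* A dependency: a package name plus the versions of it that are acceptable. *)
type dep = { name : string; allowed : int list }

(* One entry of a (hypothetical) repository. *)
type pkg = { pname : string; version : int; deps : dep list }

let repo = [
  { pname = "app";  version = 1;
    deps = [ { name = "http"; allowed = [1; 2] }; { name = "json"; allowed = [2] } ] };
  { pname = "http"; version = 2; deps = [ { name = "json"; allowed = [1] } ] };
  { pname = "http"; version = 1; deps = [ { name = "json"; allowed = [1; 2] } ] };
  { pname = "json"; version = 2; deps = [] };
  { pname = "json"; version = 1; deps = [] };
]

(* Satisfy the pending dependencies given the versions [chosen] so far.
   Candidates are tried in repository order, backtracking on conflict. *)
let rec solve chosen = function
  | [] -> Some chosen
  | d :: rest ->
    match List.assoc_opt d.name chosen with
    | Some v ->
      (* already installed: the existing choice must satisfy this constraint *)
      if List.mem v d.allowed then solve chosen rest else None
    | None ->
      let candidates =
        List.filter (fun p -> p.pname = d.name && List.mem p.version d.allowed) repo
      in
      List.find_map
        (fun p -> solve ((p.pname, p.version) :: chosen) (p.deps @ rest))
        candidates
```

Asking for `app` first tries `http` version 2, discovers that it pins `json` to version 1 while `app` needs version 2, and backtracks to `http` version 1. Even this toy shows why the two-version conflict case makes resolution a search problem rather than a simple graph walk.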

Day-to-day maintenance included retiring legacy compiler versions from the CI matrix to keep the size of tests under control, migrating to opam 2.5, improving dependency solving and solver caching, and the perennial question of whether semantic versioning makes sense for OCaml (spoiler: no). The Day10 bulk testing tool can rebuild the entire package database in half an hour on a single multicore machine, and we use it to compare builds across compiler variants: for example, to catch packages that break when switching from OCaml 5.4 to 5.5, from mainline to OxCaml, or if we remove ocamldoc from the compiler distribution.

1.2.3 Researching new formalisms

On the research end, we're developing formal models for dependency resolution that give insights into how to evolve systems like opam and Docker. This started with using hypergraphs for resolutions and then "Package Managers à la Carte: A Formal Model of Dependency Resolution", formalising dependency resolution across ecosystems. We built the Pac tool, got opam-nix integration merged upstream, and presented three talks at FOSDEM 2026: package management formalised, Eilean, and Opam's Nix mechanism.

Shelter is a data provenance system/shell built with OCaml and Eio that uses eBPF tracing to track how data is flowing through an interactive system. It supports imports for modularity and sessions with full lineage tracking, and we've been working on running the Mapping LIFE on Earth biodiversity pipeline through it to get end-to-end data provenance.

1.3 Core compiler development

There's also the nitty gritty of working on the main OCaml compiler project.

1.3.1 Relocatable OCaml

The headline achievement last year was getting Relocatable OCaml merged into mainline in December 2025 after a multi-year effort from David Allsopp, cheered on by the rest of us while carefully leaving him alone to hack on Makefiles. In OCaml 5.5, this landed as a cluster of PRs: #14244 lets the runtime locate the standard library relative to the binary, #14243 makes paths be interpreted relative to the config file, and #14245 enables different runtime configurations to coexist via filename mangling.

Together these make it possible to ship self-contained OCaml installations without absolute paths. This might sound pedestrian, but it's critical for fast opam switch cloning, Docker images, and cross-compilation that all need custom toolchains in specific locations (and can now just move binaries around rather than spend minutes compiling from source).
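The core idea of locating things relative to the binary can be sketched in a few lines. This is only an illustration of the principle, not the runtime's actual logic from the PRs above: the `../lib/ocaml` layout and the function name are assumptions.

```ocaml
(* Derive a library directory from where the executable lives, instead of
   baking an absolute path in at build time. Assumes a conventional
   <prefix>/bin/<exe> and <prefix>/lib/ocaml layout (hypothetical). *)
let stdlib_dir_relative_to exe =
  let bin_dir = Filename.dirname exe in
  let prefix = Filename.dirname bin_dir in
  Filename.concat (Filename.concat prefix "lib") "ocaml"

let () =
  (* Moving the whole prefix elsewhere changes this answer automatically. *)
  print_endline (stdlib_dir_relative_to Sys.executable_name)
```

Because nothing here depends on the prefix chosen at configure time, copying the installation to a new directory (as an opam switch clone or a Docker layer would) needs no rebuild.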

We then backported all this to OCaml 4.13+ and cleaned up compiler packaging in opam. We also reviewed thread-safe POSIX functions, Unix.unsetenv, Domain.count and FlexDLL updates for 5.5.

1.3.2 OCaml 5.4 and 5.5 and beyond

Back in OCaml 5.4, we contributed string conversion functions for the C API, frame table linker fixes, frame pointer maintenance, cloexec fixes for Windows CRT descriptors, native symlinks on Windows, Windows target triplet validation, and BUILD_PATH_PREFIX_MAP support for .cmt files. We reviewed the runtime events timestamp addition, making Gc.control globals atomic, and a Unix.getgroups fix for musl. Many of these are aimed towards winning on Windows (an effort that started back in 2018!) while others improve the robustness and observability of multicore.

Beyond upstream, we explored using effects in the compiler itself and rethinking file descriptor abstractions for ocaml-uring, as well as how static linking works and OCaml 5.4 native ARM32 (out-of-tree, as 32-bit native output is being deprecated upstream).

2 OxCaml for High-Performance Systems

We're also trying to push on the frontiers of OCaml performance through the extensions available in OxCaml. We're doing this by building our research systems to exploit zero-allocation, unboxed types, modes, and SIMD, usually in networked infrastructure such as TESSERA and planetary dashboards.

2.1 Getting familiar with OxCaml

It was a little daunting getting started with OxCaml since it's got a high development velocity within the walls of Jane Street. While helping with the open source release last summer, we built OxCaml base images and an opam-repository for OxCaml, and documented the experience of trying OxCaml and discovering its internals.

After that, we helped to run an OxCaml tutorial at ICFP 2025 covering modes, locals, and performance engineering (also using Slipshow as the FoCS notebooks did). We also published a fun and direct OCaml vs OxCaml Pi Day comparison to demonstrate performance tradeoffs. Our ICFP reflections covered data race freedom and the multicore saga, and there were trip reports from multiple group members about the rest of ICFP.

2.2 High-performance OxCaml systems

After this, it was time to get our research infrastructure going. We built httpz, a zero-allocation HTTP/1.1 server using unboxed types and local allocations that now runs this website. We ported compression libraries (Brotli, Zstd, Snappy) to OxCaml in a monorepo to experiment with SIMD, built an ONNX inference engine using OxCaml SIMD intrinsics, and benchmarked GPU vs CPU inference. We're using all these pieces to help train and infer TESSERA embeddings. For fun, we also optimized an MP3 codec comparing OCaml and OxCaml performance.

2.2.1 Planetary Computing with TESSERA and LIFE

A major application of our OxCaml work is deploying TESSERA, a pixelwise geospatial foundation model, which involves manipulating petabytes of satellite data. We've been building the data infrastructure for earth observation, biodiversity monitoring, and geospatial analysis in a combination of Python, OxCaml and of course...shell scripts. Most of the ecosystem for geospatial analysis is currently in other languages, so this involves grassroots library work from many directions: either from scratch, or FFI bindings, or a combination.

We built an end-to-end TESSERA inference pipeline in OCaml using ONNX bindings, progressing through early Zarr v2 support and STAC tile serving. The OxCaml-Zarr transcoding pipeline went from prototype to browser-based streaming via wasm/WebGPU, and the full Zarr v3 conventions are now documented. On the application side, we demonstrated finding solar farms with a 42k-parameter model using TESSERA embeddings, and built OpenStreetMap protobuf and DuckDB bindings in OxCaml for combining vector data with pixel embeddings.

The Yirgacheffe declarative geospatial library was presented at PROPL and published. It's designed for processing rasters at planetary scale (the LIFE biodiversity pipeline works with 80-terapixel habitat maps) and recent work on area-per-pixel maps eliminated an entire pre-computed raster by calculating pixel area on demand from the map projection. We also worked on Parquet file optimization for geospatial data formats.
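To see why an area-per-pixel raster can be computed rather than stored, consider a regular latitude/longitude grid. On a sphere of radius $R$, a pixel spanning latitudes $\varphi_1$ to $\varphi_2$ and $\Delta\lambda$ radians of longitude has area $R^2\,\Delta\lambda\,|\sin\varphi_1 - \sin\varphi_2|$, which depends only on the row. The sketch below uses that spherical approximation; it is not Yirgacheffe's implementation (which may well use an ellipsoid), and the function names are made up.

```ocaml
let earth_radius_m = 6_371_000.0   (* mean radius; spherical approximation *)
let deg_to_rad d = d *. Float.pi /. 180.0

(* Area in square metres of one pixel whose northern edge sits at [lat_top]
   degrees, on a grid with [pixel_deg] degrees per pixel in both directions. *)
let pixel_area ~pixel_deg ~lat_top =
  let phi1 = deg_to_rad lat_top in
  let phi2 = deg_to_rad (lat_top -. pixel_deg) in
  let dlambda = deg_to_rad pixel_deg in
  earth_radius_m *. earth_radius_m *. dlambda *. Float.abs (sin phi1 -. sin phi2)

(* Area of any pixel in row [row] of a raster whose top edge is at [north]. *)
let row_area ~pixel_deg ~north row =
  pixel_area ~pixel_deg ~lat_top:(north -. float_of_int row *. pixel_deg)
```

Since every pixel in a row shares the same area, a full-resolution area raster collapses to a single value per row that can be generated as the computation streams through the data. Summing `row_area` over a whole global grid recovers the surface area of the sphere, which makes a handy sanity check.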

2.2.2 GeoCaml

In parallel, we're assembling geocaml, a suite of pure OCaml geospatial libraries that can eventually replace C dependencies like GDAL for our use cases. The centrepiece is ocaml-tiff, a pure OCaml TIFF reader and writer. Our Outreachy intern delivered LZW decompression speedups and added write support. Around it we maintain OCaml bindings to PROJ4 for coordinate reference system projections, a WKT codec for well-known text format, an R-Tree spatial index adopted from old friend Marius Eriksen, and ocaml-geojson. The goal is to reach a critical mass where these libraries reinforce each other, and we're getting there with early success porting geotessera to OCaml using ocaml-tiff for landmasks and Nx for array operations.

More broadly, we co-chaired the Programming for the Planet workshop at ICFP, gathering enthusiastic functional programmers interested in planet-positive action! PROPL will return in 2026 at PLDI...stay tuned.

3 AI-Assisted OCaml Development

Claude Code appeared just as we started OxCaml Labs, and has completely turned software development on its head. We've spent a good chunk of time figuring out how OCaml and LLMs can work well together: improving training data, building tooling, benchmarking, and developing practices for agentic coding.

We presented "Three steps for OCaml to crest the AI humps" at the OCaml Workshop, benchmarking OCaml's representation in foundation models. We curated an opam archive dataset for LLM training, evaluated 19 local LLMs on OCaml, and built GC debugging benchmarks for agents to make sure OCaml was represented in frontier model evaluations.

On the tooling side, we built an OCaml MCP server and integrated ocaml-lsp via MCP for AI coding. Cresting the OCaml AI humps surveyed the broader ecosystem, and we published a prebuilt devcontainer for OCaml/OxCaml with Claude Code and explored adversarial approaches to teaching OxCaml to agents. To keep up with the firehose of activity across OCaml's GitHub repositories, we also built Ruminant, a tool that syncs thousands of issues, PRs and discussions and uses Claude to generate condensed weekly digests, published regularly on thicket.dev so anyone can follow what's happening across the ecosystem without drowning in notifications.

3.1 The Advent of Agentic Humps

To really push the limits, we also did a December Advent of Agentic Humps in which we built 25 O(x)Caml libraries in 25 days with Claude Code to stress-test agentic coding for OCaml. At a high level, it all worked surprisingly well given the lack of training data for a relatively low-resource language like OCaml (and especially OxCaml). Several other patterns emerged:

3.1.1 Specifications as source code

Feeding agents formal specs like RFCs or WHATWG standards produced surprisingly good results, especially when combined with external test suites. We built HTTP cookie, punycode and public-suffix libraries from the RFC text alone, a pure OCaml Yaml 1.2 parser that passes 20% more tests than the C libyaml bindings (and is 20% faster), and TOML 1.1 codecs reaching 100% of the toml-test suite. The key was having a comprehensive expect test suite which could be used to guide the code generation.

3.1.2 Error messages guide agents

Well-designed libraries with precise error messages (particularly jsont and its typed codecs) let agents self-correct in a loop. This proved useful when debugging against live APIs and when reverse-engineering protocols from multiple language SDKs. The bidirectional codec pattern from jsont generalised really well across other formats such as Yaml, TOML, and INI formats as well.
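The bidirectional codec idea is easy to sketch independently of any library. The mini-API below is hypothetical and far simpler than jsont's real interface: a codec is just a decoder and an encoder for the same type, and combinators add context to errors so that whoever reads them (human or agent) knows exactly where decoding failed.

```ocaml
(* A codec pairs decoding and encoding for one type (illustrative only). *)
type 'a codec = {
  kind : string;                          (* name used in error messages *)
  dec : string -> ('a, string) result;
  enc : 'a -> string;
}

let int_c = {
  kind = "int";
  dec = (fun s ->
    match int_of_string_opt (String.trim s) with
    | Some i -> Ok i
    | None -> Error (Printf.sprintf "expected int but found %S" s));
  enc = string_of_int;
}

(* Combinator: a comma-separated list built from an element codec. *)
let list_c (c : 'a codec) : 'a list codec =
  let kind = c.kind ^ " list" in
  let dec s =
    let fields = if String.trim s = "" then [] else String.split_on_char ',' s in
    let rec go i acc = function
      | [] -> Ok (List.rev acc)
      | f :: rest ->
        (match c.dec f with
         | Ok v -> go (i + 1) (v :: acc) rest
         | Error e -> Error (Printf.sprintf "element %d of %s: %s" i kind e))
    in
    go 0 [] fields
  in
  { kind; dec; enc = (fun vs -> String.concat "," (List.map c.enc vs)) }
```

Decoding `"1,x,3"` with `list_c int_c` fails with a message naming the offending element and the expected type, which is the kind of signal an agent can act on in its next iteration. Because the encoder is defined alongside the decoder, round-trip tests come almost for free.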

3.1.3 Source code context across languages often beats documentation

Providing agents with working implementations to study (cookbooks, previous codebases, and even reference implementations in other languages) consistently outperformed relying on docs alone. For an HTTP client, we aggregated best practices from 50 implementations across 10 languages to synthesise a unified specification for the OCaml version. For Eio-based libraries, cloning the source repo and pointing the agent at the README was effective enough to produce fairly robust code.

In general, agentic coding improves when all relevant code is locally available. We built monopam for assembling monorepos with unified sources, and unpac for branch-based package management where each dependency becomes a git branch rather than a submodule. This enabled agents to spawn parallel worktrees for independent porting tasks, a feature now natively supported by Claude Code.

3.1.4 Self-healing and recursive testing

When external test suites don't exist, we found that building tools which test themselves works surprisingly well: for example, a Zulip bot that exercises its own API, and a feed aggregator that patches its own parsing code when it encounters format quirks. The Postel "be liberal in what you accept" principle can be implemented through automatic code evolution.

OCaml's type system is a natural guardrail here: its strong typing provided a feedback loop that improved agentic coding quality compared to our earlier Python experiments. Agents can also handle verbose combinator-based boilerplate well, removing the need for PPX-based code generation in many cases. But the code produced is "unoriginal"; good for parsing and utility libraries, but not a substitute for creative library design. We treat all agent output as "slop by default" until human-reviewed, and some of us are switching our agents to use separate Git identities to keep it all separate.

Outside the Advent, we tried agentic bug hunting, experimented with Claude completing format specs and agentic OCaml, and used Claude for dune odoc rules as a first foray into upstreaming with an agent.

On the community side, there's been a real rollercoaster of conflicting incentives with AI slop encroaching on our more traditional repositories. We wrote up Vibecoding Etiquette guidelines to help navigate this. Much more thought, empathy and discussion will be needed this year to figure out what to do about the incoming AI code flood.

3.2 Evolutionary Redecentralisation

Our passion is also in operating self-hosted infrastructure, federated services, and distributed systems using MirageOS and modern OCaml.

We built Eilean, a self-hosted digital islands platform using MirageOS unikernels, and Eon, an effects-based OCaml DNS nameserver using Eio and MirageOS's pure DNS library. Our Bifrost preprint introduces a programming model using bigraphs for spatial networking, enabling policies scoped by physical boundaries and location-based device naming. Tangled code hosting and adding support to dune for it made mlgpx the first Tangled-hosted opam package; one small step to reducing our reliance on GitHub.

If none of the above is mad enough for you, we put all of the above into a mixing pot in our Ecology of/for the Internet paper, which argues for applying ecological resilience principles to the Internet. We prototype digital immune systems, mutualistic communication, and self-mutating code, using coding models and types and sandboxing to see what the world might look like if every host (out of the trillion or so connected to the Internet by 2030) might all be a bit different from each other. OxCaml may surprisingly be an important part of making this sort of evolutionary coding robust, since the extra type/kind information helps to constrain the runtime characteristics of AI-generated outputs.

Activity

Evidence synthesis at the DEFRA science conference, TESSERA transcoding and building a new SPA, OpenStreetMap/DuckDB bindings in OxCaml, and early thoughts on vibecoding etiquette.
How we restructured TESSERA's geospatial embeddings from millions of individual numpy files into sharded Zarr v3 stores for efficient HTTP streaming, enabling everything from single-pixel mobile lookups to regional-scale analysis with just a couple of range requests.
Mark Elvers. For Pi Day, I have implemented the same algorithm in both OCaml and OxCaml and compared the generated assembly and runtime performance.
Mark Elvers. Following my previous CPU vs GPU post I started thinking about what the ONNX inference engine actually did and if it could be replicated in OxCaml with SIMD.
Mark Elvers. In a previous post, I compared the ONNX Runtime with PyTorch on the CPU and GPU. In this post, I take this to the extreme to see if a CPU can outpace the NVIDIA L4 GPU.
Patrick Ferris. Much like last month, this month has been busy with lots of work on ocaml-tiff thanks to Tambe Salome and ppxlib. Outreachy Tambe Salome, at the time of writing, has completed her internship! We will be hosting the biannual Demo Day celebration for this round's interns, so please do come along. Sinc…
Patrick Ferris. I am yet to jump head-first into using LLM-based tools like Claude Code or even ChatGPT to help with my programming in any serious way. After using some "free"-tier tools to try to better understand some tricky eBPF problems and to make sense of the semantics of POSIX shells, I was left un…
A little screencast of a fully browser based streaming interface to manipulate TESSERA embeddings. All the classification and UMAPs run directly in a browser, with no server required aside from static HTTP serving of the embeddings!
Mark Elvers. Following on from the Arm32 multicore backend, I have now ported the remaining two 32-bit architectures to OCaml 5 with multicore support: i386 and PowerPC 32-bit (PPC32).
Mark Elvers. Following from post last week about obuilder and Windows Host Compute Services, I am pleased to report that this is now running on OCaml-CI. In this early phase, I have enabled testing only on Windows 2025 with OCaml 5.4 and opam 2.5 using the MinGW toolchain.
Jon Ludlam. Let's make this really terse!
Anil Madhavapeddy, David J. Scott et al. — Communications of the ACM
Got TESSERA working in Zarr and the browser, and a preprint of package management a la carte pushed out
Our CACM cover article reflects on a decade of Docker, from the early days of hacking Docker for Mac on a French farm to today's AI-driven sandboxing, covering the technical origins, cross-platform challenges, and the vibrant open-source community that made it all possible.
Mark Elvers. Following from my containerd posts last year and my previous work on obuilder backends for macOS and QEMU, this post extends obuilder to use the Host Compute System (HCS) and containerd on Windows.
Michael Dales. This is a bit of a meandering blog post, which was meant to be about one thing, and then I had to pull in some other bits to give context, and now it feels a little incoherent. However, I do think somewhat that is part of the broader point I'll try to make at the end, about the challenge of solving …
Mark Elvers. ocurrent/obuilder is the workhorse of OCaml CI testing, but the current deployment causes packages to be built repeatedly because the opam switch is assembled from scratch for each package, leading to common dependencies being frequently recompiled. day10 uses an alternative model whereby switches a…
Growing the Ceph cluster for TESSERA embeddings, a Lego brainstorming session for the Evidence TAP, hosting Echo Labs from ARIA, and Shane's IUCN Red List seminar.
Mark Elvers. The Tessera pipeline is written in Python. What would it take to have an OCaml version?
Mark Elvers. After reading Anil’s post about his zero-allocation HTTP parser httpz, I decided to apply some OxCaml optimisation techniques to my pure OCaml MP3 encoder/decoder.
PhD viva for Maddy, presenting TESSERA at ARIA, Nature covers the conservation evidence conference, giving evidence to Parliamentary POST, and a CACM interview.
Patrick Ferris. A new year, another roundup in my open-source, OCaml activities. This month has been busy with lots of work on ocaml-tiff thanks to Tambe Salome and ppxlib. Outreachy Tambe Salome has been working to add write support to ocaml-tiff; and we are well on the way to having good support. Writing TIFF fil…
Ryan Gibb explains why package managers are legion. Every language and operating system has its own solution, each with subtly different semantics for dependency resolution. This fragmentation prevents multi-lingual projects expressing precise dependencies across languages.
Deploying an OxCaml zero-allocation webserver, OCaml CI maintenance and opam versioning, and OCaml Workshop and FOSDEM talks
Building httpz, a high-performance HTTP/1.1 parser with zero heap allocation using OxCaml's unboxed types, local allocations, and mutable local variables.
Michael Dales. I claimed last week that I'd have to put aside the fun I was having working through the Ray Tracer Challenge book in OCaml due to other commitments. And technically that was true, I didn't get to do any new features based on the book, I had a couple of longish train trips, and so I did go off piste …
Mark Elvers. With Claude Code, perhaps we are now at the point where the test suite is actually more valuable than the code itself.
David Allsopp. The spring-cleaning continues! When I originally prototyped Relocatable OCaml, it was during the OCaml 4.13 development cycle. The focus for the work originally was always about multiple versions of the compiler co-existing without interfering with each other, so even the early prototypes were done …
Mark Elvers. Early in the upgrade program for Ubuntu 24.04, there were permission issues when extracting tar files. The workaround was to update to the latest dev version of Docker. However, this didn’t resolve all the issues on ARM64, so only one machine was updated and excluded from the base image builder wo…
Sadiq Jaffer. Satellite imagery analysis doesn't always require massive data and compute. We show how to combine open data from OpenStreetMap and the UK Government's Renewable Energy Planning database with Tessera foundation model embeddings to map solar farms across the UK using a lightweight neural network.
Mark Elvers. opam 2.5.0 was released on 27th November, and this update needs to be propagated through the CI infrastructure. This post mirrors the steps taken for the release of opam 2.4.1.
David Allsopp. As we settle into 2026, I have been doing a little early spring-cleaning. A few years ago, we had a slightly chaotic time in opam-repository over what should have been a migration from gforge.inria.fr to a new GitLab instance. Unfortunately, some release archives effectively disappeared from officia…
A prebuilt Docker devcontainer for sandboxed OCaml and OxCaml development with Claude Code, including multiarch builds and network isolation.
Michael Dales. For the last few years I've spent the run up to the festive break working on something graphical, and this year whilst I was a little late to start, I decided to have a go at the Ray Tracer Challenge book by Jamis Buck. This book provides a language-neutral guide to building a classic old-school ray…
Ryan Gibb. Jan. 2026. Free and Open Source Software Developers’ European Meeting (FOSDEM). Our digital lives are increasingly fragmented across numerous centralised online services. This model concentrates power, leaving us with minimal technical control over our personal data and online identities. …
Ryan Gibb. Jan. 2026. Free and Open Source Software Developers’ European Meeting (FOSDEM). The OCaml language package manager, Opam, has support for interfacing with system package managers to provide dependencies external to the language. Supporting Nix required re-thinking the abstractions used to i…
Ryan Gibb. Jan. 2026. Free and Open Source Software Developers’ European Meeting (FOSDEM). Package managers are legion. Every language and operating system has its own solution, each with subtly different semantics for dependency resolution. This fragmentation prevents multi-lingual projects expressi…
Mark Elvers. Running OCaml 5 with multicore support on bare-metal Raspberry Pi Pico 2 W (RP2350, ARM Cortex-M33).
An exploration of agentic programming through building useful OCaml libraries daily using Claude Code while establishing groundrules for responsible development.
Tuatara is a feed aggregator that integrates Claude to evolve and patch its own code when encountering parsing errors, embodying the concept of self-healing software.
Introducing unpac, a tool that unifies git and package management into a single workflow where all code dependencies live in one repository as trackable branches.
Materialising opam metadata into git submodules and monorepos, enabling cross-cutting fixes and unified odoc3 documentation across dozens of OCaml libraries.
Building an OCaml Zulip bot framework with functional handlers, and pivoting from TOML to INI codecs for Python configparser compatibility
Building tomlt, a pure OCaml TOML 1.1 parser with bidirectional codecs following the jsont design patterns
Jon Ludlam. Back in March of this year we released , a major new version of the OCaml documentation generator. It had a whole load of , many of which came with new demands on the build system driving it. We decid...
David Allsopp. As I was very happy to announce on Discuss on 12 December, OCaml is Relocatable! Today, the final piece of the puzzle was merged, which is the necessary support to allow opam to take advantage of all this to be able to clone switches instead of recompiling them. Before this, you could rename a local…
Vibe coding an OCaml library for the Karakeep bookmarking service by giving an agent a live API key and letting it debug jsont codecs against the real service.
Agentically synthesising a batteries-included OCaml HTTP client by gathering recommendations from fifty open-source implementations across JavaScript, Python, Java, Rust, Swift, Haskell, Go, C++, PHP and shell.
Synthesizing three RFC-compliant libraries (punycode, public-suffix, and cookeio) directly from Internet RFC specifications, establishing a workflow for automating standards implementation with proper cross-referencing to spec sections.
Building yamlt to enable jsont codec definitions to work with both JSON and Yaml, providing data manipulation with location tracking and good error messages for both formats.
Implementing a pure OCaml Yaml 1.2 parser using bytesrw by synthesizing from the specification and existing C library behavior, passing thousands of test suite cases while being 20% faster than the C-based implementation.
Three steps for OCaml to crest the AI humps, OCaml 2025 by Sadiq Jaffer, Jonathan Ludlam, Ryan Gibb, Thomas Gazagnaire, and Anil Madhavapeddy. We discuss how OCaml could adapt to the fast-moving world of AI-assisted agentic coding. We first benchmark how well represented OCaml is in the large and diverse set of open weight models that can be run locally. We then consider what is unique about OCaml programming (in particular, modules and abstraction) that differentiates it in this space. We then consider the changes required in our ecosystem to work better with AI coding assistants. Presentation at the OCaml 2025 workshop, Oct 17, 2025, https://conf.researchr.org/home/icfp-splash-2025/ocaml-2025.
Docker is a developer tool used by millions of developers to build, share and run software stacks. The Docker Desktop clients for Mac and Windows have long used a novel combination of virtualisation and OCaml unikernels to seamlessly run Linux containers on these non-Linux hosts. We reflect on a decade of shipping this functional OCaml code into production across hundreds of millions of developer desktops, and discuss the lessons learnt from our experiences in integrating OCaml deeply into the container architecture that now drives much of the global cloud. We conclude by observing just how good a fit for systems programming the unikernel approach has been, particularly when combined with the OCaml module and type system.
A guided tour through Oxidized OCaml at ICFP/SPLASH 2025 by Gavin Gray, Anil Madhavapeddy, KC Sivaramakrishnan, Will Crichton, Shriram Krishnamurthi, Chris Casinghino, and Richard A. Eisenberg. OxCaml is a set of extensions to the OCaml programming language that form Jane Street’s production compiler for performance-oriented programming. OxCaml’s primary design goals are to provide safe, convenient, predictable control over performance-critical aspects of program behavior while preserving ML-style programming ergonomics. This tutorial will focus on key extensions in OxCaml, such as fearless concurrency (additions to the type system to statically rule out data races), data layouts (providing more control over how data is laid out in memory and native access to vector instructions), and allocation control (reducing GC pressure and improving cache efficiency and determinism).
Creating OCaml bindings for the Claude API using Eio and jsont codecs by reverse-engineering the JSON-RPC protocol from Python and Go SDKs, enabling Claude to write more Claude-powered OCaml code.
Building an XDG Base Directory Specification library with Eio capabilities and Cmdliner integration, providing sandboxed filesystem access patterns with full environment variable and CLI override support.
Tile Server. Dec 2025
Mark Elvers. My throw-away comment at the end of my earlier post shows my scepticism that the JSON file approach was really viable.
Building a Base32 Crockford encoding library in OCaml using Claude Code, establishing the development workflow with sandboxed Docker containers and local development environments.
Mark Elvers. I’ve been copying the TESSERA data to Cephfs, but what is actually in the files?
Mark Elvers. Recently, I have been using my Pi Zero (armv6), which has reminded me that OCaml 5 dropped native 32-bit support, and I wondered what it would take to reinstate it.
Patrick Ferris. A late update that stretches more than a week... @ How? A quick update on a new feature that was added recently, the @ how meta-command. One of the issues facing any exploratory programmer (or indeed anybody returning to a project) is the provenance of a given file. How that file came to be and who …
Patrick Ferris. Most of this week was spent fixing bugs in Shelter in an attempt to get it ready to run the LIFE pipeline. To do so, Shelter requires a mechanism by which to pull in files from the outside world. That could be configuration files, source trees, data etc. For now, this has taken the form of a crude @…
Mark Elvers. The weather outside is frightful, but the Raspberry Pi is so delightful; I have been cheering myself by connecting up all the various bits of hardware scattered on my desk. I often buy these components but never quite get around to using them.
Jon Ludlam. I recently completed lecturing the course to our newly arrived first-year computer scientists here at . This is the first time I've lectured this course, taking over from while he's on sabbatical. A...
Ryan Gibb. I attended the co-located International Conference on Functional Programming (ICFP) and International Conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH) as a co-author of a couple of papers, and was presenting ‘Spatial Programming for Environmental Monit…
Patrick Ferris. Welcome to a monthly roundup of OCaml-related open-source work I have been involved with. If you haven't already, a quick look at my ICFP roundup might be a good preface to what follows. Outreachy December 2025 The contribution period for this year's Outreachy round took place for most of the month …
ICFP 2025. Oct 2025
Patrick Ferris. Two weeks ago I was fortunate enough to attend the International Conference on Functional Programming in Singapore. My first time in Asia and my second time at the conference, what follows are some thoughts and presentations I enjoyed whilst I was there. I must thank my office mate (and friend!) Rya…
David Allsopp. I spent last week at ICFP 2025. A nice (if exhausting!) week, as ever. Amusingly, the most reflections were actually sparked by Yaron’s talk which was right at the end (you can see the talk itself on YouTube).
Five-part series overview covering workshops, tutorials, talks and keynotes from ICFP/SPLASH 2025 in Singapore.
Mark Elvers. We are increasingly hitting the Docker Hub rate limits when pushing the Docker base images. This issue was previously identified in issue #267. However, this is now becoming critical as many more jobs are failing.
Jane Street's production deployment of OCaml 5 and Docker's migration to direct-style programming with Eio presented at ICFP.
Mark Elvers. The FreeBSD CI worker rosemary needs to be updated to FreeBSD 14.3.
Tutorial at ICFP 2025 on OxCaml extensions for performance engineering with modes and locals.
Report on second Programming for the Planet workshop featuring papers on climate modeling, geospatial computation and planetary-scale collaborative systems.
Sadiq Jaffer, Jon Ludlam et al. — proceedings of the 2025 OCaml Workshop
Josh Millar, Ryan Gibb et al.
Patrick Ferris, Anil Madhavapeddy — proceedings of the 2025 Workshop on Type-Driven Development (TyDe)
Michael Winston Dales, Alison Eyres et al. — Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet
David Allsopp. There was a flurry of activity on ocaml-multicore/ocaml-uring this month leading to a release (ocaml/opam-repository#28604). ocaml-uring provides bindings to Linux’s io_uring, which allows batching various syscalls to the kernel for it to execute out-of-order, and in parallel. Its principal us…
David Allsopp. Continuing the previous theme of dabbling with matters agentic. Previously, I’d quite assiduously kept my fingers away from files. This time, I wanted to try something exploratory, switching to the agent for things I was actively stuck on.
David Allsopp. Over the summer, Lucas Ma has been investigating ideas surrounding using effects in the OCaml compiler itself. He’s blogged some of his discoveries and adventures. The technical core of this work leads towards being able to use the OCaml compiler as a library on-demand to create a longer-lived “…
Odoc bugs. Sep 2025
Jon Ludlam. This post is a brief write-up of a couple of bugs in odoc that I've been working on over the past 2 weeks. I was convinced at the start of this that I was actually fixing one bug, but although they bo...
Mark Elvers. Yesterday I wrote about the amazing performance of Apache Parquet files; today I reflect on how that translates into an actual application reading Parquet files using the OCaml wrapper of Apache’s C++ library.
Jon Ludlam. The system works by watching opam-repository for changes, and then when it notices a new package it performs an opam solve and builds the package, a prerequisite for building the documentation. In or...
Jon Ludlam. Here's a quick post on how to get the OCaml Language Server (ocaml-lsp-server) working with an MCP server.
Jon Ludlam. LLMs are proving themselves superbly capable of a variety of coding tasks, having been trained against the enormous amount of code, tutorials and manuals available online. However, with smaller langua...
Sadiq Jaffer. I stumbled onto terminal-bench a few weeks ago while researching datasets to evaluate agents. It contains around 120 tasks that need to be completed using a terminal.
Mark Elvers. I previously wrote about a mtelvers/package-tool which would generate Dockerfiles for each package in opam.
The Tangled git forge has recently gained support for CI and stacked pull requests, and the Dune build system can now easily generate Tangled metadata for OCaml packages hosted there.
Patrick Ferris. Irmin is an OCaml library for building branchable and mergeable data stores. The data is mergeable in the sense of mergeable replicated data types. I have been using Irmin for over five years to build different kinds of interesting data stores including: A simple markdown-based note-taking web appli…
Anil Madhavapeddy, David J. Scott et al. — Proceedings of ACM Programming Languages
Anil Madhavapeddy, Sam Reynolds et al. — Proceedings of the sixth decennial Aarhus conference: Computing X Crisis
Community efforts to improve agentic coding experience for OCaml including MCP libraries, opam embeddings, and tooling improvements.
Patrick Ferris. Thanks to Tarides sponsorship, I get to work on open-source OCaml. This quarterly is a companion to my weeklies, summarising the last three months of development, peppered with ideas and thoughts about OCaml, its community and its future. What I wanted to work on? There were two main things I hoped …
Sadiq Jaffer. A look at recent OCaml projects, from benchmarking AI code models and building new agentic tools to improvements in the garbage collector.
Jon Ludlam. As of today, Odoc 3 is now live on OCaml.org! This is a major update to odoc, and has brought a whole host of new features and improvements to the documentation pages.
Mark Elvers. Most of the time, you don’t think about how your file is linked. We’ve come to love dynamically linked files with their small file sizes and reduced memory requirements, but there are times when the convenience of a single binary download from a GitHub release page is really what you need.
Mark Elvers. This morning, Anil proposed that having an opam-repository that didn’t have old versions of the packages that require patches to work with OxCaml would be good.
Mark Elvers. The tricky part of using runhcs has been getting the layers correct. While I haven’t had any luck, I have managed to create Windows containers using ctr and containerd.
Mark Elvers. As @dra27 suggested, I first added support in ocurrent/ocaml-version. I went with the name flambda2, which matched the name in the opam package.
Patrick Ferris. This week included some time finishing opentrace and subsequently folding it into shelter. I have been writing up some more of the draft paper for shelter which I am excited to share in the near future. I revisited the upgrading vpnkit PR and pushed some more fixes. I have been thinking, again, abou…
Patrick Ferris. I missed a week of posting last week, mainly because I spent more time writing posts. Hazel of OCaml I mentioned previously that I was building a tool to transpile OCaml code to Hazel. This work is now in a good enough state that, along with one of my students, we have transpiled a good number of OC…
Try OxCaml. May 2025
Patrick Ferris. This week, I have been trying out Jane Street's Oxidised OCaml (see their ICFP paper). This adds a system of modes to OCaml for expressing things like locality. ...we’re introducing a system of modes, which track properties like the locality and uniqueness of OCaml values. Modes allow the compiler …
Sadiq Jaffer. How well can locally-runnable language models handle OCaml code generation? We evaluate 19 open-weight LLMs on first-year Computer Science exercises, exploring the balance between model size, architecture, and reasoning capabilities for less mainstream programming languages.
Patrick Ferris. Over the past few months, I have been piecing together a transpiler from Hazel to OCaml. This is, in part, to help one of my third-year undergraduate students who is working on type error debugging in Hazel. Typed Holes Hazel is a functional programming language with typed holes. Holes are pieces of…
Sadiq Jaffer. Introducing opam-archive-dataset: a Parquet dataset containing code from OCaml packages, designed to improve performance of language models for OCaml development through better training data.
Jon Ludlam. The release of Odoc 3 means that we need to update the project so that the documentation that appears on is using the latest, greatest Odoc. With this major release of Odoc, it's also time to give t...
Mark Elvers. As noted on Thursday, the various OCaml services will need to be moved away from Equinix. Below are my notes on moving OCaml-CI.
Ryan Gibb. On 22 Apr 2022, three years ago, I opened an issue in the OCaml package manager, opam, ‘depext does not support nixOS’. Last week, my pull request fixing this got merged! Let’s Encrypt Example Before, if we tried installing an OCaml package with a system dependency we would run into: $ opam --…
Eilean. Apr 2025
Ryan Gibb. Self-hosted digital islands
Eon. Apr 2025
Ryan Gibb. A programmable nameserver
Pac. Apr 2025
Ryan Gibb. A universal dependency solver
Jon Ludlam. is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projec...
Mark Elvers. Over the weekend, I decided to extend my Box tool to incorporate file upload. There is a straightforward POST API for this with a curl one-liner given in the Box documentation. Easy.
Mark Elvers. Previously, I discussed the installation order for a simple directed acyclic graph without any cycles. However, opam packages include post dependencies. Rather than package A depending upon B where B would be installed first, post dependencies require X to be installed after Y. The post dependencies…
Jon Ludlam. There are that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an that landed in May 2024.
OCaml users: share your needs regarding older versions to help determine support for OCaml 4.08 and earlier.
Claude Code auto-generates OCaml bindings, but lacks robust sandboxing.
Patrick Ferris. VPNKit is a core part of the Docker for Mac/Windows stack. It is tasked with translating network activity from the host to the Linux container. This short post discusses some of the recent changes to MirageOS and how they impact VPNKit. Dune Virtual Libraries Dune, for quite some time, has supported…
Patrick Ferris. Ppxlib is a library for building OCaml preprocessors. Users can generate OCaml code from OCaml code (derivers) or replace parts of OCaml code with other OCaml code (rewriters). At the core of ppxlib is the OCaml parsetree; a data structure in the compiler that represents OCaml source code. Ppxlib mak…
Learn FPGA programming with OCaml using HardCaml.
Publish custom OCaml Homebrew taps with a simple GitHub workflow.
Learn about my sixth generation oxidised website built with a bleeding-edge OCaml variant.
MirageOS v2.0 adds ARM support, Irmin storage and OCaml TLS stack.
OCaml Labs. Jan 2012