OxCaml Labs

We're exploring systems applications for Oxidised OCaml (or 'OxCaml'). The language extension is primarily developed by Jane Street, and our group builds high-performance applications using it that are competitive with other systems languages. We've been working on mainline OCaml for a long time, and so we also perform stewardship activities around the OCaml community.
Our group is organised around three pillars. We're supporting the OCaml platform that comprises the compiler, package manager, CI infrastructure, and documentation systems. Then we're building live programming environments that put OCaml into browser notebooks and maps, with the aim of teaching algorithms as well as supporting planetary-scale exploratory research. Finally, we're figuring out the impacts of AI-assisted development on OCaml, ranging from training data to vibecoding etiquette.

Here follows a review of roughly the first year of activity at OxCaml Labs (~Feb 2025 - Mar 2026):
- OCaml Stewardship >> Compiler (Relocatable OCaml, Runtime) · OxCaml extensions · odoc 3 · PPX · Package management & CI (CI, opam & dune, Research) · Outreachy & community
- Live Programming >> Browser-based OCaml · Hazel · Numerical computing · Planetary computing · GeoCaml · Data provenance
- AI-Assisted Development >> Benchmarks · Tooling · Advent of Agentic Humps · Redecentralisation
1 OCaml Stewardship
We maintain and extend the language, tools, and infrastructure that the open-source OCaml ecosystem depends on. This in turn benefits OxCaml, which is developed as a monorepo but with dependencies into OCaml itself.
1.1 Compiler development
Several of our group are core OCaml developers, and so we hack on the internals of the compiler itself. It's never been easier to contribute, even if you're a student just getting started, so just have a go at it!
1.1.1 Relocatable OCaml

Together these make it possible to ship self-contained OCaml installations without absolute paths. This might sound a bit pedestrian, but it's a critical feature for many modern packaging workflows. Once the compiler can be relocated without hardcoded paths, it means that binary builds can be downloaded and support features such as fast opam switch cloning, multistage Docker images, and cross-compilation that all need custom toolchains in specific locations. In general, an end user will experience much shorter bootstrap times for their OCaml toolchains.
We then backported all this to OCaml 4.13+ and cleaned up compiler packaging in opam, which should pave the way to using relocatable OCaml by default soon.
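To picture what "without absolute paths" means in practice, here is a minimal sketch that derives a standard library directory from the location of the running executable rather than from a path baked in at build time. The `<prefix>/bin` and `<prefix>/lib/ocaml` layout and the function name `stdlib_dir_of_exe` are assumptions made for illustration; this is not the mechanism used by the actual Relocatable OCaml patches.

```ocaml
(* Hypothetical layout: <prefix>/bin/<tool> and <prefix>/lib/ocaml.
   The library directory is computed relative to the executable, so no
   absolute path needs to be compiled into the binary. *)
let stdlib_dir_of_exe (exe : string) : string =
  let bin_dir = Filename.dirname exe in
  let prefix = Filename.dirname bin_dir in
  Filename.concat (Filename.concat prefix "lib") "ocaml"

let () =
  (* Moving the whole prefix moves the derived library path with it. *)
  print_endline (stdlib_dir_of_exe "/opt/ocaml/bin/ocamlopt");
  print_endline (stdlib_dir_of_exe "/home/me/switch/bin/ocamlopt")
```

In a real tool the argument would typically come from something like `Sys.executable_name`, which is why copying or cloning the installation directory keeps everything working.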
1.1.2 Runtime and backend work

We also changed the free list representation in the shared heap to use run-length encoding, so sweeping takes time proportional to the live portion of the heap rather than its total size; this is sometimes 3-4x faster on sparse heaps (such as our multi-terabyte-RAM Epyc machines). This has been backported to OxCaml as well.
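The effect of run-length encoding on sweep cost can be illustrated with a toy counting model. This is only a sketch under simplifying assumptions: the heap is a flat list of words that are either live or free, and a "step" is one visit by the sweeper. The real runtime works on blocks with headers, and none of the names below come from it.

```ocaml
(* A toy heap: a sequence of words, each either live or free. *)
type word = Live | Free

(* Run-length encode the heap: consecutive equal words collapse into one run. *)
let rle (heap : word list) : (word * int) list =
  List.fold_right
    (fun w acc ->
      match acc with
      | (w', n) :: rest when w' = w -> (w, n + 1) :: rest
      | _ -> (w, 1) :: acc)
    heap []

(* Sweeping the encoded heap costs one step per live word, but only one
   step per free run, however long that run is. *)
let sweep_steps (runs : (word * int) list) : int =
  List.fold_left
    (fun steps (w, n) ->
      match w with Live -> steps + n | Free -> steps + 1)
    0 runs

(* A sparse heap: two live words among six free ones. *)
let sparse = [ Live; Free; Free; Free; Free; Live; Free; Free ]

let () =
  Printf.printf "naive steps: %d, run-length steps: %d\n"
    (List.length sparse)
    (sweep_steps (rle sparse))
```

The sparser the heap, the longer the free runs and the bigger the gap between the two counts, which matches the intuition for why sparse heaps benefit most.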
We also experimented with using effects in the compiler itself (#) and with rethinking file descriptor abstractions for new IO interfaces like ocaml-uring. We also looked at how static linking works, and at native ARM32 support for OCaml 5.4 (out-of-tree, as 32-bit native output is being deprecated upstream).
1.2 OxCaml extensions and high-performance systems
It was a little daunting getting started with OxCaml, since it's got a high development velocity within the walls of Jane Street and we have lots of third-party libraries in our existing OCaml code that weren't quite compatible. The first thing we did was to generate automated summaries of the thousands of GitHub issues around the ecosystem, in a system called thicket.dev, which has its own dedicated OxCaml section. I'm always wary of publishing LLM-generated text, but I review the output and it's also a very constrained usage (to summarise and link to issues), and (most importantly) I can't think of a reasonable alternative given the volume of activity ongoing.

We had great fun helping to run an OxCaml tutorial at ICFP 2025 covering modes, locals, and performance engineering, and published a fun OCaml vs OxCaml Pi Day comparison. There were trip reports from multiple group members about the rest of ICFP, and our reflections covered new features like data race freedom and reflections on the multicore saga.
Once we were more comfortable with OxCaml, we started building research infrastructure for our own use. We built httpz, a zero-allocation HTTP/1.1 server using unboxed types and local allocations that now runs this website. We ported compression libraries (Brotli, Zstd, Snappy) to OxCaml in a monorepo to experiment with SIMD, built an ONNX inference engine using OxCaml SIMD intrinsics, and benchmarked GPU vs CPU inference.
We're using all these pieces to help train and infer TESSERA embeddings, which we talk about later in this post. The overall aim is an end-to-end native pipeline where satellite data arrives over HTTP, gets decompressed, inferenced, and served; the fast data transfers and transcodings are done using OxCaml. For fun, we also optimized an MP3 codec comparing OCaml and OxCaml performance.
1.3 Cross-linked documentation with odoc 3
odoc 3 is a major overhaul of OCaml's documentation generator. The earlier versions focused on feature coverage in order to replace the aging ocamldoc. odoc 3's theme is manuals instead of individual module documentation, via .mld files that let authors write comprehensive package documentation with cross-package linking.
odoc 2 could only link to dependencies, but odoc 3 can now link to any simultaneously installable package. This is a big deal for multi-package projects like MirageOS or Core where the top-level package needs to reference dozens of libraries it doesn't directly depend on. Users can now write a single mld file that's a single-page tutorial. odoc 3 also adds rendered source code navigation, type-based search via sherlodoc, a global sidebar, and media support so we can reference images directly in docs as well (or audio, although I'm not sure anyone's done that yet).
We deployed odoc 3 into production use on large codebases (including Jane Street's) with fixes for complex features such as module type of (these type-check fine but are surprisingly tricky to cross-link accurately in HTML). After this, we rolled it out on ocaml.org in July 2025, which required attempting to build all 17,000+ distinct package versions in the opam repository. The docs-ci runs on a dedicated 40-thread blade server, producing about 1 TiB of documentation over a couple of days. Common dependencies like dune are built thousands of times during this process — always producing identical binaries — so the new CI is far more efficient than the old pipeline, taking only a few hours to rebuild all docs when a new odoc version ships.
Once this went live (to positive reactions, thankfully) the rest of the year was spent on odoc bug fixes, sherlodoc search integration, and support for OxCaml modes and layouts. This is all essential groundwork for documenting the live programming libraries we're building in our research work!

1.4 PPX syntax extensions

The community unified around ppxlib as the single framework for managing these AST upgrades. We help maintain it by bumping the ppxlib AST each time a new compiler version ships, and then sending patches to downstream PPX authors.
This work matters more than ever because OxCaml currently depends on PPX for practical use. OxCaml's mode and kind systems don't yet support full polymorphism natively, so you can't write a single identity function that works for both heap-allocated (global) and stack-allocated (local) values, or for both boxed and unboxed values. Until that lands in the type system, ppx_template fills the gap by generating monomorphic copies for each instantiation (a bit like C++ templates). This means that virtually all OxCaml code (in particular the Base and Core libraries) flows through ppxlib.
We've been helping to upstream the ppxlib features that ppx_template needs, such as a new context-free rule type for attribute-based AST replacement, and fixing OOM errors that surfaced on larger codebases.
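The "one monomorphic copy per instantiation" idea can be written out by hand in mainline OCaml as an analogy. This sketch does not use ppx_template or any OxCaml mode or kind syntax, and the double-underscore names are hypothetical rather than the PPX's actual naming scheme. It only shows the shape of the output: the same body repeated once per instantiation.

```ocaml
(* Hand-written stand-ins for what a template expander might emit:
   one specialised copy of the same function per instantiation.
   The mangled names are made up for this example. *)
let sum__int (xs : int array) : int = Array.fold_left ( + ) 0 xs
let sum__float (xs : float array) : float = Array.fold_left ( +. ) 0.0 xs

let () =
  Printf.printf "%d %g\n" (sum__int [| 1; 2; 3 |]) (sum__float [| 1.5; 2.5 |])
```

Callers pick the copy that matches their argument, which is the same trade a C++ template makes: more generated code in exchange for no runtime dispatch.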
1.5 Package management and CI

1.5.1 Continuous integration
OCaml-CI is the shared CI service that automatically tests every package submitted to the opam repository across multiple OS and architecture combinations before it can be merged. We run and maintain the underlying cluster of build workers (spanning Linux x86/ARM64/PPC64/s390x/RISC-V, macOS, Windows and FreeBSD). Last year both our Scaleway and ARM sponsorships expired, so we spent time moving the cluster to Cambridge, issuing Docker base images for new compiler releases, adding Windows builds, FreeBSD, ARM64 workers, and 32-bit backends. We also documented how to distribute OCaml via Homebrew for macOS users.

On the build side, Windows container support progressed from an initial QEMU backend (which provided some testing but was unreproducible for end-user debugging) to proper OBuilder support on Windows using the Host Compute Service (HCS). We've also compared BuildKit against OCurrent for image building. Our use of Windows containers may be one of the largest such deployments in the world, given the number of bugs we've encountered.
1.5.2 opam and dune
opam is OCaml's package manager, and dune is the build system that compiles OCaml projects. Together they form the basis of the OCaml Platform of out-of-the-box developer tooling. We help maintain the opam package repository and contribute to both tools.
The opam repository is a community-maintained collection of package metadata via a large Git repository where every OCaml library has a file describing its dependencies, build commands, and compatibility constraints. When someone runs opam install, a constraint solver finds a compatible set of packages. We help review and merge submissions, and maintain tooling around the repository itself.
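To give a feel for what "finding a compatible set of packages" involves, here is a deliberately naive backtracking resolver over a made-up package universe. It is a sketch only: opam delegates to a proper solver with optimisation criteria, and the record fields, package names and versions below are invented for illustration.

```ocaml
(* Toy package universe: each (name, version) lists, for every dependency,
   the versions it accepts. All names and versions are made up. *)
type pkg = { name : string; version : int; deps : (string * int list) list }

let universe =
  [ { name = "app"; version = 1; deps = [ ("lib", [ 1; 2 ]); ("json", [ 2 ]) ] };
    { name = "lib"; version = 2; deps = [ ("json", [ 1 ]) ] };
    { name = "lib"; version = 1; deps = [ ("json", [ 1; 2 ]) ] };
    { name = "json"; version = 2; deps = [] };
    { name = "json"; version = 1; deps = [] } ]

(* [solve chosen goals] extends the partial assignment [chosen]
   (name -> version) until every goal (a name plus its allowed versions)
   is satisfied, backtracking whenever a choice leads to a conflict. *)
let rec solve chosen goals =
  match goals with
  | [] -> Some chosen
  | (name, allowed) :: rest -> (
      match List.assoc_opt name chosen with
      | Some v -> if List.mem v allowed then solve chosen rest else None
      | None ->
          let candidates =
            List.filter
              (fun p -> p.name = name && List.mem p.version allowed)
              universe
          in
          (* Try candidates in universe order; the first that works wins. *)
          List.find_map
            (fun p -> solve ((name, p.version) :: chosen) (p.deps @ rest))
            candidates)

let () =
  match solve [] [ ("app", [ 1 ]) ] with
  | Some sol ->
      List.iter (fun (n, v) -> Printf.printf "%s.%d\n" n v) (List.rev sol)
  | None -> print_endline "no solution"
```

Here the resolver first tries the newest `lib`, discovers that it forces a `json` version that `app` rejects, and backtracks to the older `lib`. Real repositories are far too large for this brute-force approach, which is why solver quality and caching matter.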
Day-to-day maintenance included retiring legacy compiler versions from the CI matrix, migrating to opam 2.5, improving dependency solving and solver caching, and the perennial question of whether semantic versioning makes sense for OCaml (spoiler: no).
The bulk testing tool we built, Day10, takes a different approach to the standard CI by encouraging the use of merge queues. Instead of rebuilding an opam switch from scratch for each package (which recompiles common dependencies repeatedly), it assembles switches from pre-built component packages that are merged in overlayfs using hardlinks. This gets the entire package database rebuilt in half an hour on a single multicore machine, which we then use to compare builds across compiler variants. The end result is that we can now catch packages that break when switching from OCaml 5.4 to 5.5, or from mainline to OxCaml, much more quickly than before.
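The saving from reusing pre-built components can be seen with a small counting model. This sketch does not model overlayfs, hardlinks or Day10's actual design. It only counts how many package builds happen when each target rebuilds its whole dependency closure, compared with building every package once and sharing the result. The dependency graph is made up.

```ocaml
(* Made-up dependency graph: package -> direct dependencies. *)
let deps = function
  | "fmt" -> [ "dune" ]
  | "logs" -> [ "dune"; "fmt" ]
  | "cohttp" -> [ "dune"; "fmt"; "logs" ]
  | _ -> []

(* The set of packages needed to build [pkg], including itself. *)
let rec closure pkg =
  List.sort_uniq compare (pkg :: List.concat_map closure (deps pkg))

(* From-scratch strategy: every target rebuilds its whole closure. *)
let builds_from_scratch targets =
  List.fold_left (fun n t -> n + List.length (closure t)) 0 targets

(* Shared strategy: each package is built once and then reused. *)
let builds_shared targets =
  List.length (List.sort_uniq compare (List.concat_map closure targets))

let () =
  let targets = [ "fmt"; "logs"; "cohttp" ] in
  Printf.printf "from scratch: %d builds, shared: %d builds\n"
    (builds_from_scratch targets) (builds_shared targets)
```

Even in this four-package example the from-scratch count is more than double the shared one, and the gap widens as more packages depend on the same few roots.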
1.5.3 Researching new formalisms
On the research end, we're developing formal models for dependency resolution that give insights into how to evolve systems like opam and Docker. The key finding is that dependency resolution across heterogeneous package managers can be unified via a single algebraic framework, which we're now using to prototype cross-ecosystem compatibility between opam, Nix and indeed any other language ecosystem.
This started with using hypergraphs for version resolutions and then "Package Managers à la Carte: A Formal Model of Dependency Resolution" that formalises dependency resolution across PL ecosystems. We built the Pac tool, got opam-nix integration merged upstream, and presented three talks at FOSDEM 2026: package management formalised, Eilean, and Opam's Nix mechanism.
1.6 Outreachy and community
A healthy programming language needs a welcoming community. Outreachy provides paid internships in open source for people underrepresented in tech, and OCaml's participation goes all the way back to 2015. We've been coordinating it for the past several rounds, funded by Jane Street and Tarides. Outreachy has been a wonderful source of new OCaml developers, though the programme itself has been facing funding challenges in recent years.
In the June 2025 round, we mentored two interns: one built Claudius, a fantasy-console graphics library, and another extended dune to discover system information. The December 2025 round was our largest yet with four interns:
- an OxCaml backend for Raven (now merged upstream with benchmarks approaching the C backend!), and an ML experiment dashboard, also for Raven. There's more about the Raven backend below, including a video.
- better errors for YOCaml
- writing TIFF files in pure OCaml which feeds directly into our geospatial stack.
We also try to issue monthly OCaml Roundups or quarterly summaries or more detailed writeups on coding agents and fixes to the OCaml runtime. We also wrote an Irmin retrospective looking back at a project that's been going for over a decade now, and is about to have direct style IO integrated.
Fun experiments also abound: we're running multicore OCaml 5 on a Raspberry Pi Pico 2 microcontroller, building eInk display drivers on a Raspberry Pi, writing a complete ray tracer in OCaml using TSDL, OCaml 5 Domains, and Atomics (with performance work on reducing heap allocations), building an OCaml static site generator for blogging via webplats, and getting into FPGA programming with HardCaml.
2 Live Programming
Our second pillar of work is about making OCaml executable in more contexts than just binaries, by publishing to the web for lecture slides, interactive notebooks, satellite maps, and even numerical computing. This connects our teaching mission to our research, since the same tools can let students explore algorithms and also let ecologists explore satellite embeddings on a live map. The common thread is bringing OCaml's traditional advantages like static type safety into interactive, explorable environments that are usually more dynamic (a euphemism for saying that we just don't want to program frontends in Javascript).
2.1 Browser-based literate OCaml
The foundation for all of this is js_of_ocaml, which compiles OCaml to JavaScript, and increasingly wasm_of_ocaml for WebAssembly. We've been building on top of these to create interactive programming tools.
Arthur Wendling wrote x-ocaml as a web component that compiles and runs OCaml code directly in the browser, with type-on-hover and programmatic code highlighting. After my ICFP OxCaml tutorial first used it, Jon Ludlam combined it with Slipshow for presentations to power our interactive lecture slides for the Cambridge 1A Foundations of Computer Science course. Jon also took over my course for the 2025-2026 year as I've been on sabbatical, and his lectures saw him modify and execute code snippets live; especially useful for the notorious "giving change" algorithm in the middle of the course that trips up many a first year undergrad!
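For readers who haven't met it, one common formulation of the "giving change" exercise is sketched below: list every way of making an amount from a till of coin values. This version is an assumption about the general shape of the exercise and may not match the exact code used in the lectures.

```ocaml
(* All ways to make [amt] from the coin values in [till]; each coin value
   may be used any number of times. Coin values are assumed positive. *)
let rec change till amt =
  match till with
  | _ when amt = 0 -> [ [] ] (* exactly one way to make zero: no coins *)
  | [] -> [] (* coins ran out before the amount did *)
  | c :: rest ->
      if amt < c then change rest amt
      else
        (* Either use coin [c] (and maybe use it again), or give up on it. *)
        List.map (fun cs -> c :: cs) (change till (amt - c)) @ change rest amt

let () =
  List.iter
    (fun cs -> print_endline (String.concat " " (List.map string_of_int cs)))
    (change [ 10; 5; 2 ] 16)
```

The subtle part is that the recursive call which uses a coin keeps that coin in the till, while the call which skips it drops the coin for good. Being able to edit that line live and re-run it is exactly where an in-browser toplevel helps.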
All this is building towards a broader OCaml-based literate programming infrastructure. This first introduces a new odoc plugin infrastructure, and example plugins for admonitions, Mermaid diagrams, and even Scrollycode tutorials. The end goal is a serverless deployment, using web workers to run OCaml code fully in the browser. This matters to us not just for robustness and longevity, but also when writing code for deployments in remote field stations when building biodiversity monitoring applications.

2.1.1 The Hazel connection
There's a connection to another exciting browser-based live programming environment in the form of Hazel. We built a transpiler from OCaml to Hazel (paper), bootstrapping a corpus of ill-typed OCaml programs for the Hazel live programming environment. Hazel provides richer interactive feedback than a simple toplevel since it can reason about incomplete and ill-typed programs via "typed holes".
This gives us a bootstrapped corpus of ill-typed code in Hazel with which to build type-level debuggers, and could eventually provide students with feedback on why their code is wrong, not just that it's broken (a common frustration with statically typed PLs).
Another connection to OCaml here is that the Hazel interpreter is itself written in OCaml and compiled to Javascript. We're interested in using OxCaml's performance features to speed this up, and in using Hazel's live programming features as a DSL for live computational wikis.
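The idea of running a program that still has holes in it can be shown with a tiny evaluator. This is a toy sketch, not Hazel's semantics: the holes here are untyped and carry no environment, and the expression type is invented for the example. It shows only that evaluation can proceed around a hole instead of stopping at it.

```ocaml
(* A toy expression language with an explicit hole for missing code. *)
type exp = Num of int | Add of exp * exp | Hole

(* Evaluate as far as possible: complete parts reduce to numbers,
   while anything that depends on a hole is left in place. *)
let rec eval = function
  | Num n -> Num n
  | Hole -> Hole
  | Add (a, b) -> (
      match (eval a, eval b) with
      | Num x, Num y -> Num (x + y)
      | a', b' -> Add (a', b'))

let rec show = function
  | Num n -> string_of_int n
  | Hole -> "?"
  | Add (a, b) -> "(" ^ show a ^ " + " ^ show b ^ ")"

let () = print_endline (show (eval (Add (Add (Num 1, Num 2), Hole))))
```

The result keeps the hole but has already simplified the part that was complete, which is the kind of partial feedback a conventional toplevel cannot give on an unfinished program.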
2.2 Numerical computing
OCaml is traditionally 'ok' at floating point numerical code, but hasn't quite kept up over the years with newer advances. OxCaml has a number of performance-oriented features such as small numbers and SIMD which improve this. We've been investigating how to use these for both traditional CPU-based algorithms and GPU-based machine learning training and inference.
2.2.1 Planetary computing with TESSERA and LIFE
A major application of our live programming work is deploying TESSERA, a pixelwise geospatial foundation model, which involves manipulating petabytes of satellite data. We've been building out the data infrastructure for earth observation, biodiversity monitoring, and geospatial analysis in a heady combination of Python, OxCaml and shell scripts.
We built an end-to-end TESSERA inference pipeline in OCaml using ONNX bindings, progressing through early Zarr v2 support and STAC tile serving. The OxCaml-Zarr transcoding pipeline went from prototype to browser-based streaming via wasm/WebGPU, and the full Zarr v3 conventions are now documented with a shared community convention. We demonstrated finding solar farms with a 42k-parameter model using these, and built OpenStreetMap protobuf and DuckDB bindings in OxCaml for combining vector data with pixel embeddings.
Our first prototypes were written in Python, and the OCaml TESSERA notebooks bring this back to our functional world. The notebooks let users draw regions on a map, fetch embeddings, place training labels and run classification, all entirely browser-side in OCaml compiled to JavaScript with no server required!
We also worked on Parquet file optimization for geospatial data formats. The Yirgacheffe declarative geospatial library, which we're also currently porting to OxCaml, and its role in our biodiversity pipeline is covered separately in the Mapping LIFE on Earth project.
2.3 Raven and nx-oxcaml
An Outreachy intern, Nirnay Roy, built nx-oxcaml, an OxCaml backend for the Raven numerical computing library. It uses unboxed primitive types (float#, int32#) for zero-allocation numeric operations, and early benchmarks show it approaching the C backend's performance!
This is a super exciting early validation of OxCaml's promise for us. It looks like we've got a high-level functional language that's approaching being competitive with C for numerical workloads, and reasonable enough to use to have been built during an internship in three months. There's still work to be done, but it's getting there!
2.3.1 GeoCaml
We're assembling geocaml, a suite of pure OCaml geospatial libraries that can eventually replace C dependencies like GDAL for our use cases.
The centrepiece is ocaml-tiff, a pure OCaml TIFF reader and writer that can handle GeoTIFFs as well. Our Outreachy intern Tambe Salome delivered LZW decompression speedups and added write support. Tambe also wrote up her own experiences about the internship.
We also maintain OCaml bindings to PROJ4 for coordinate reference system projections, a WKT codec for the well-known text format, an R-Tree spatial index adopted from old friend Marius Eriksen, and ocaml-geojson.
The goal here is to reach a critical mass where these libraries can reinforce each other so we can stay in pure OCaml for common geospatial tasks. We're getting there with early success porting geotessera to OCaml using ocaml-tiff for landmasks and Nx for array operations, and the interactive browser notebooks.
2.3.2 Shells for a modern, less civilized age
Shelter is a data provenance system built with OCaml and Eio that uses eBPF tracing to track how data flows through scientific pipelines. It supports imports for modularity and sessions with full lineage tracking, and we've been working on running the Mapping LIFE on Earth biodiversity pipeline through it to get end-to-end data provenance.
Since any reasonable Unix distribution needs a POSIX compatibility story, we've also been working on Merry, a complete POSIX shell in OCaml! While Merry is usable today, the pieces are currently being fitted together to build a complete time-travelling Linux shell!
What I am particularly interested in reasoning about, is the execution context in Merry. This value, alongside the file-system, constitutes a fairly deep understanding of the state that changes in each step of a shell's evaluation loop.
This was, in terms of Shelter, the missing piece for truly building some kind of MRDT across shell sessions. -- Merry, Patrick Ferris, 2026w12
Reproducible provenance of computational pipelines is fast becoming a requirement for evidence-based policy. Our work on global biodiversity frameworks calls for auditable chains of evidence from raw data all the way to reported indicators, or we risk wasting resources on low-impact interventions.
We also co-organised the second outing of the Programming for the Planet workshop at ICFP, gathering functional programmers interested in planet-positive action. PROPL 2026 returns at PLDI in Boulder with an action-oriented format later this year!
3 AI-Assisted OCaml Development
Claude Code appeared just as we started OxCaml Labs in March 2025, and has completely turned software development on its head. Our goal here is in some ways just keeping up, but also advancing our understanding of what AI-assisted development means for a strongly-typed functional language. This ranges from improving the training data quality to the social etiquette of working alongside agents.
3.1 Contributing OCaml benchmarks
A low-resource language that isn't represented in AI benchmarks and training data risks being left behind as coding agents improve. We've been working to make sure OCaml is in the mix.
We presented Three steps for OCaml to crest the AI humps at the 2025 OCaml Workshop, benchmarking OCaml's representation in foundation models. We curated an opam archive dataset for LLM training — extracting structured code from thousands of opam packages to give models more OCaml to learn from. We then evaluated 19 local LLMs on OCaml (finding that Qwen3-32B nearly matches Claude at a fraction of the cost, which is promising for self-hosted coding agents), and contributed an OCaml GC debugging task to the terminal-bench AI agent benchmark suite — a real runtime bug that tests whether agents can reason about garbage collector internals.
3.2 Tooling for agentic OCaml
On the tooling side, we built an OCaml MCP server and integrated ocaml-lsp via MCP for AI coding, surveyed the broader ecosystem, published a prebuilt devcontainer for OCaml/OxCaml with Claude Code, and explored adversarial approaches to teaching OxCaml to agents.
To keep up with the firehose of activity across OCaml's GitHub repositories, we built Ruminant, an LLM tool that syncs thousands of issues, PRs and discussions and uses Claude to generate condensed weekly digests. This is now published regularly on thicket.dev for the OCaml ecosystem.
3.3 The Advent of Agentic Humps
To really push the limits, I ran a December Advent of Agentic Humps sprint, building 25 O(x)Caml libraries in 25 days with Claude Code. At a high level, it worked surprisingly well for a relatively low-resource language like OCaml. Several patterns (not quite lessons just yet) emerged:
- Specifications as source code. Feeding agents formal specs like RFCs or WHATWG standards produced good results when combined with external test suites. We built HTTP cookie, punycode and public-suffix libraries from RFC text alone, a pure OCaml Yaml 1.2 parser that passes 20% more tests than C libyaml (and is 20% faster), and TOML 1.1 codecs reaching 100% of the toml-test suite.
- Error messages guide agents. Well-designed libraries with precise error messages (particularly jsont and its typed codecs) let agents self-correct in a loop. This proved useful when debugging against live APIs and when reverse-engineering protocols from multiple language SDKs. The bidirectional codec pattern generalised across Yaml, TOML, and INI.
- Source code context across languages beats documentation. Providing agents with working implementations to study consistently outperformed docs alone. For an HTTP client, we aggregated best practices from 50 implementations across 10 languages. For Eio-based libraries, cloning the source repo and pointing at the README was enough. We built monopam and unpac for assembling monorepos where agents have full local context, a pattern now natively supported by Claude Code.
- Self-healing and recursive testing. When external test suites don't exist, building tools that test themselves works well: a Zulip bot exercising its own API, and a feed aggregator that patches its own parsing code. OCaml's type system is a reasonable guardrail here and we think it improves agentic coding quality compared to Python. However, the code produced is "unoriginal" and often verbose; good for parsing and utility libraries, but not a substitute for creative design. We treat all agent output as "slop by default" until human-reviewed.
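The bidirectional codec pattern with precise errors mentioned in the list above can be sketched in a few lines. This is an illustrative toy and not jsont's API: the `json` type, the `codec` record and the `cookie` example are all invented here. It shows one value describing both directions, and a decoder that says exactly which member was wrong.

```ocaml
(* A tiny JSON-like value type standing in for a real parser's output. *)
type json = String of string | Number of float | Object of (string * json) list

(* A codec pairs an encoder with a decoder that reports precise errors. *)
type 'a codec = { encode : 'a -> json; decode : json -> ('a, string) result }

let string_c =
  { encode = (fun s -> String s);
    decode = (function String s -> Ok s | _ -> Error "expected a string") }

let number_c =
  { encode = (fun f -> Number f);
    decode = (function Number f -> Ok f | _ -> Error "expected a number") }

(* Look up an object member and decode it, prefixing any error with the
   member name so the message points at the exact location. *)
let member name c = function
  | Object fields -> (
      match List.assoc_opt name fields with
      | None -> Error (Printf.sprintf "missing member %S" name)
      | Some v ->
          Result.map_error
            (fun e -> Printf.sprintf "member %S: %s" name e)
            (c.decode v))
  | _ -> Error "expected an object"

(* A made-up record type with a codec built from the pieces above. *)
type cookie = { name : string; max_age : float }

let cookie_c =
  { encode =
      (fun c ->
        Object
          [ ("name", string_c.encode c.name);
            ("max_age", number_c.encode c.max_age) ]);
    decode =
      (fun j ->
        match (member "name" string_c j, member "max_age" number_c j) with
        | Ok name, Ok max_age -> Ok { name; max_age }
        | Error e, _ | _, Error e -> Error e) }

let () =
  match cookie_c.decode (Object [ ("name", String "sid"); ("max_age", String "x") ]) with
  | Ok _ -> print_endline "decoded"
  | Error e -> print_endline e
```

An error that names the offending member gives an agent something concrete to fix on its next iteration, which is the self-correcting loop described above.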
Outside the Advent, we tried agentic bug hunting, experimented with Claude completing IMAP protocol specs and controlling hosts with OCaml, and tried to use it to build dune odoc rules as a first foray into upstreaming code with an agent.

3.4 Evolutionary redecentralisation
It wouldn't be any fun to be in a university if we didn't try something really crazy. We have a long history of working on self-hosted infrastructure, federated services, and distributed systems using MirageOS and modern OCaml.
Our Ecology of/for the Internet paper (presented at the decennial Aarhus conference) argues that the Internet is dangerously ossifying into software monocultures. We're wondering if the fix actually comes from an unlikely source: use AI code models to deliberately introduce diversity!
The idea is to mutate end-host software stacks (often written in OCaml via MirageOS) so that each deployment is slightly different from its neighbours, much like how genetic diversity in a wild population prevents a single pathogen from wiping out the entire species. We dubbed this the "antibotty" approach due to the irresistible pun: use locally adapted software vigilantes (antibodies) that fight back against global botnets. This also turns the biggest downside of generative AI (that it's unpredictable) into a useful property when you actually want more local diversity. OxCaml may be surprisingly important here, since its extra type and kind information helps constrain the runtime characteristics of AI-generated code.
So that this isn't too pie in the sky, we're building the practical pieces to make this work without using any AI. Eilean is a self-hosted digital islands platform using Nix, and Eon is an effects-based OCaml DNS nameserver using Eio. Our Bifrost network introduces a programming model using bigraphs for spatial networking, enabling policies scoped by physical boundaries. Tangled code hosting, which now supports opam packages, is another small step towards reducing our reliance on GitHub.
All in all, we don't know where this brave new AI world is taking us, but we're going to inject some entropy into the works to help ensure that decentralisation and local agency remain an option!
