.plan-26-16: Chennai, Cambridge, Belfast: a week on the wing

A week of hops between Chennai, Cambridge and Belfast for the FP Launchpad takeoff at IIT Madras, a surprise Publication of the Year at the Cambridge Ring Hall of Fame, meeting the VC on the upcoming Rokos School of Governance, mirroring half a petabyte of TESSERA tiles and hacking on oi

I spent most of the week in the air with hops to Chennai, Cambridge and Belfast. The reason was the FP Launchpad 'takeoff' at IIT Madras, where I spent half the week on a campus teeming with banyan trees and monkeys! I've sketched out two project ideas inspired by the visit: an io_uring backend for Lean and an OCaml port of sPyTial, with more to come this week as I catch up.

1 Publication of the Year at the Cambridge Ring Hall of Fame

Back on home turf in Cambridge, the Cambridge Ring held its annual Hall of Fame Awards at Queens' College. In previous years they told you in advance if you had won, but this year they switched to an "Oscar style" nomination mechanism. After a tense set of announcements of the runners-up (well, cheering on Jon Sterling actually!) I was delighted that our ICFP 2025 paper "Functional Networking for Millions of Docker Desktops" won Publication of the Year! (see LinkedIn or Bluesky)

My huge thanks to my coauthors Dave Scott, Patrick Ferris, Ryan Gibb and Thomas Gazagnaire, and to the cast of hundreds at Docker, Cambridge, California and beyond who made the decade of work behind it possible. There's a companion CACM piece with Dave Scott and Justin Cormack that covers the broader arc, but it's the ICFP experience report that has the nitty-gritty OCaml networking details that I sank a lot of time into over the years.

2 Visiting the Old Combination Room

On getting back to Cambridge on a redeye flight, I got invited to the Old Schools Combination Room with the Vice-Chancellor Deborah Prentice to celebrate and to discuss the upcoming Rokos School of Governance. It was lovely to catch up with Marla Fuchs, who has done an enormous amount of work behind the scenes to make all of this happen, and also to hear her toil get properly acknowledged in the room full of very senior people!

The conversations about what a 21st-century governance school should look like were very animated. More on my thoughts as they form in the coming months, but this feels like a real opportunity in Cambridge.

3 TESSERA: tiles on AWS and a multi-registry client

Mark Elvers and I have been grinding through the migration of TESSERA tiles to AWS Open Data. Mark has written up the nuts and bolts in two posts:

  • GeoTessera STAC on exposing the embeddings via a STAC catalogue so clients can discover tiles using standard spatiotemporal queries. Zarr v3 doesn't have a standard indexing mechanism, which is an unusual (but I think deliberate) omission.
  • CephFS to S3 on the actual bulk transfer from our Cambridge Ceph cluster to AWS, including the tuning of parallelism and retry behaviour to make a half-a-petabyte transfer go faster. The first iteration without the tweaks would have taken two months of transfer time, but we got it down to a week...
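
As a rough back-of-envelope check on those two figures, here is a small OCaml sketch of the sustained throughput each schedule implies. It assumes that half a petabyte means 0.5 × 10^15 bytes and that "two months" is 60 days; both are assumptions made for illustration, not measurements from the actual transfer.

```ocaml
(* Back-of-envelope throughput. All inputs are illustrative assumptions,
   not measurements from the real transfer. *)
let bytes_total = 0.5e15        (* half a petabyte, in decimal units *)
let seconds_per_day = 86_400.

(* Sustained rate in Gbit/s needed to move [bytes] in [days]. *)
let gbit_per_s ~bytes ~days =
  bytes *. 8. /. (days *. seconds_per_day) /. 1e9

let () =
  Printf.printf "60 days: %.2f Gbit/s\n" (gbit_per_s ~bytes:bytes_total ~days:60.);
  Printf.printf " 7 days: %.2f Gbit/s\n" (gbit_per_s ~bytes:bytes_total ~days:7.)
```

Under those assumptions the week-long transfer needs roughly 6.6 Gbit/s sustained, against about 0.8 Gbit/s for the two-month schedule.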

With the precious TESSERA data now in two sites at last, I've been building multi-registry support into geotessera itself so the client can discover and fetch tiles transparently from Cambridge, AWS or other future mirrors. This builds on the Zarr v3 layout and geo-embeddings convention and gets us closer to a proper federated story for TESSERA data distribution. I'm also meeting up with Cyrus Omar and his group in London next week while they are visiting, since this work relates to the planetary wiki that we wrote up last year.
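
The "fetch transparently from any mirror" idea can be sketched as an ordered fallback over registries. This is a language-agnostic illustration written in OCaml, not geotessera's API (geotessera is a Python library); the `registry` type, `fetch_tile`, `stub` and the tile identifiers are all hypothetical names invented for the example.

```ocaml
(* Hypothetical type: a registry is a name plus a fetch function. *)
type registry = {
  name : string;
  fetch : string -> (string, string) result;  (* tile id -> data or error *)
}

(* Try registries in order. Return the first hit together with the mirror
   that served it, or every error seen if all mirrors fail. *)
let fetch_tile registries tile_id =
  let rec go errors = function
    | [] -> Error (List.rev errors)
    | r :: rest ->
      (match r.fetch tile_id with
       | Ok data -> Ok (r.name, data)
       | Error e -> go ((r.name, e) :: errors) rest)
  in
  go [] registries

(* In-memory stand-ins for real mirrors, for illustration only. *)
let stub name tiles =
  { name;
    fetch = (fun id ->
      match List.assoc_opt id tiles with
      | Some data -> Ok data
      | None -> Error "not found") }

let cambridge = stub "cambridge" [ ("tile-a", "AAAA") ]
let aws = stub "aws" [ ("tile-a", "AAAA"); ("tile-b", "BBBB") ]

let () =
  match fetch_tile [ cambridge; aws ] "tile-b" with
  | Ok (mirror, _) -> Printf.printf "tile-b served by %s\n" mirror
  | Error _ -> print_endline "tile-b unavailable on all mirrors"
```

Collecting the per-mirror errors rather than discarding them is a design choice in this sketch: when every mirror fails, the caller can report which ones were tried and why.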

4 oi: a uv-like distributor for OCaml binaries

Now that more people outside our immediate circle are using OCaml in production across the group, the issue of "how to run this OCaml CLI tool without CLI gymnastics" has started to bite. opam has always been great once set up, but it's a lot of machinery for someone who just wants to run the TESSERA CLI or some utility.

I've been spending a lot of time with uv in recent months while working on the Python machine learning end of TESSERA, and it has become my default choice to ship Python tooling. I spotted an opportunity to get this Python goodness over to my statically typed world and have been hacking on oi: a fast, stateless client that fetches and manages binary releases of OCaml tooling with a single invocation.

The idea is much older than my prototype code. Back in 2023 I sketched out an opam-repo roadmap around a merge-queue-driven overlay repository: rather than each user resolving the whole universe on their laptop, a central CI would continuously solve and build the overlay, and clients would just pull pre-resolved, pre-built artefacts. What I was missing was a clean way to actually execute the builds reproducibly.

Two members of my group came up with the answers. Firstly, David Allsopp got his relocatable OCaml compiler patches merged after a year of hard work. Then Mark Elvers came up with the day10 build tool that builds opam packages inside OCI containers with layer caching, and also an opam overlay CI that wires up the GitHub merge queue so regressions can be caught before a PR lands.

Some cool things you can do with oi today:

  • oi run utop gets you the toplevel quickly.
  • oi run --with=async utop gets you the toplevel with Async loaded.
  • oi run --with=https://tangled.org/patrick.sirref.org/merry msh gets you running with the msh binary from Merry
  • oi run https://www.cl.cam.ac.uk/~avsm2/foo.ml runs a remote script without a dune file being needed!

It does this with the simple trick of adding OCaml attributes at the top level of the script, much as Python inline script metadata does. oi then synthesises a dune file and uses heuristics to add ppx preprocessors. You can see my version of the package attributes in the snippet of OCaml below:

[@@@opam base stdio ppx_jane]

open Base
open Stdio

type t = { bar: float } [@@deriving sexp]

let rec read_and_accumulate accum =
  let line = In_channel.input_line In_channel.stdin in
  match line with
  | None -> accum
  | Some x -> read_and_accumulate (accum +. Float.of_string x)

let () =
  let t = { bar=read_and_accumulate 0. } in
  printf "Total: %s\n" (Sexp.to_string_hum (sexp_of_t t))

So to recap, with day10 supplying a reproducible build substrate and OCaml now being relocatable, that's all I needed to glue together this oi tool that apes uv! It's early and rough and only really intended for local use, but issues and opinions are very welcome, especially around signing and platform compatibility. I'll be blogging about the technical details more this week, and switching to using it day to day to make sure it's good enough before sharing more widely. Having said that, it already seems to have escaped into the wild.

5.1 RCTs as a survival skill

A great FT piece this week on "The trials that quietly changed our lives" (h/t Hetan Shah) on how randomised controlled trials and the steady accumulation of evidence have underpinned most of the quiet improvements in modern life:

Arming our children — and ourselves — with the ability to spot bunk and think critically about claims has become an essential survival skill. -- The trials that quietly changed our lives, 2026

This hits on what I've been working on with the Conservation Evidence and the evidence TAP teams in recent months. The world's awash with confidently incorrect LLM-generated assertions, and the ability to trace a claim back to an actual evidentiary test isn't just a niche skill for academics any more.

5.2 Hamed Haddadi visits me and Mort

There have been persistent rumours that Hamed Haddadi and I are the same person, and I hope that this evidence from Christ's College Cambridge at a delightful high table hosted by Richard Mortier will resolve this situation. Thank you for your attention to this matter.

Hamed? Or is it?
Anil? Or is it?

References

[1] Madhavapeddy et al. (2025). Functional Networking for Millions of Docker Desktops. 10.1145/3747525
[2] Madhavapeddy (2026). The FP Launchpad takes off at IIT Madras. 10.59350/4bsr3-h6735
[3] Jaffer et al. (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. 10.33774/coe-2025-rmsqf
[4] Madhavapeddy et al. (2026). A Decade of Docker Containers. 10.1145/3761803
[5] Madhavapeddy (2026). TESSERA now supports the Zarr geo-embeddings convention proposal. 10.59350/c3hrq-zsx02
[6] Madhavapeddy (2026). Streaming millions of TESSERA tiles over HTTP with Zarr v3. 10.59350/tk0er-ycs46
[7] Omar et al. (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. 10.1145/3759536.3763802
[8] Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. 10.59350/7hy6m-1rq76