After the travel marathon of the past fortnight I got to catch up with hacking this week! I did pop down to London once for a Royal Society policy meeting on AI in science with the European Commission, and discovered that the EU still has a (much-shrunken) delegation in London; a bit of post-Brexit infrastructure that I hadn't realised existed but was very glad to learn about.
While I was down there, I caught up with Cyrus Omar, who was over from Michigan, to chat about our Fairground planetary wiki work. We're both interested in how programmable wikis can become a serious substrate for sharing structured scientific data with provenance baked in, and PROPL 2026 is coming up at PLDI, where we'll take this work further.
1 Welcoming Akshay to Cambridge
I'm most delighted to welcome Akshay Oppiliappan to my group here in Cambridge! I've long been a fan of his work on Tangled, and indeed consider it to be the most useful app built over ATProto.
We've been using Tangled for a lot of our code hosting here in my group, and it's a really practical step towards the federated scientific infrastructure we want to build under the five principles of collective knowledge.

One interesting thing I learnt is that Tangled is working on a separable 'app view' (that is, a version of the https://tangled.org website that can be deployed elsewhere). I'd love to have a version that is restricted to just the immediate group members in order to help get a focussed view on a particular set of repositories, while still keeping the overall metadata open.
2 TESSERA: AWS sync done, Zarr bindings next
The big milestone on the TESSERA side is that the AWS Open Data sync has finally finished, so we now have the full half-petabyte mirrored alongside our Cambridge Ceph copy. With that done, I'm turning my attention to the OxCaml Zarr conversion by building on Mark's ocaml-zarr work, so that we can start consuming the cloud-native stores directly via HTTP.
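The ocaml-zarr bindings aren't shown here, but the reason Zarr suits cloud-native access is worth spelling out: an array is just a small JSON metadata document plus independently addressable chunk objects under well-known keys, so a reader over S3 or HTTP only fetches the chunks it needs. Here's a stdlib-only Python sketch of that key-value layout, loosely mimicking the Zarr v3 conventions (metadata fields are simplified from the real spec, and a local directory stands in for the object store):

```python
# Toy illustration of the Zarr-style key-value layout: JSON metadata at a
# fixed key, plus one object per chunk. A local directory stands in for an
# object store; over S3/HTTP each key would be an independent GET.
import json, struct, tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Metadata document describing shape, dtype and chunking (simplified).
(root / "zarr.json").write_text(json.dumps({
    "zarr_format": 3, "node_type": "array",
    "shape": [4], "data_type": "float64",
    "chunk_grid": {"name": "regular", "configuration": {"chunk_shape": [2]}},
}))

# Each chunk lives under its own key ("c/0", "c/1", ...), so readers can
# fetch exactly the ranges they need without touching the rest.
for i, chunk in enumerate([(1.0, 2.0), (3.0, 4.0)]):
    key = root / "c" / str(i)
    key.parent.mkdir(parents=True, exist_ok=True)
    key.write_bytes(struct.pack("<2d", *chunk))

# A minimal reader: parse the metadata, then fetch only chunk 1.
meta = json.loads((root / "zarr.json").read_text())
n = meta["chunk_grid"]["configuration"]["chunk_shape"][0]
values = struct.unpack(f"<{n}d", (root / "c" / "1").read_bytes())
print(values)  # (3.0, 4.0)
```

The same two-step pattern (fetch metadata once, then range-read chunks on demand) is what makes serving half a petabyte from plain object storage practical.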
There are also some exciting updates coming soon about a new version of the TESSERA model that pushes the embedding quality further. The nice property of our "embeddings as data" architecture is that no user-facing code needs to change when v1.1 lands: we just regenerate the map tiles under the existing geo-embeddings convention, and downstream tasks should pick up the improvements automatically. More on this once embedding generation progresses!
3 Recoil refresh to Linux 7.0
On the home infrastructure front, I spent some quality time upgrading several of the Recoil self-hosting machines to Ubuntu 26.04. I have not been able to recreate the pesky io_uring/ZFS wedge that had been plaguing me on 6.14 kernels recently. Fingers crossed that it really is fixed and not just hiding behind a race condition!
I've also been happily using Komodo as the lightweight web interface for Docker across three machines, and am busy migrating to Mythic Beasts since our former Equinix hosting is sunsetting next month. The only technical complexity is that Mastodon is tied to a single hostname, and I made the mistake of calling mine amok.recoil.org (the raw machine hostname) instead of something more abstract. Michael Dales did manage a similar migration last year though, which bodes well for when I try next week...
4 oi continues, and now deploys this very site
My sidequest on oi, a uv-like distributor for OCaml binaries, has been steadily gathering steam. It now supports OxCaml as well as multiple OCaml versions, which is tricky since OxCaml isn't relocatable yet. Still, some hacks later, I've got far enough that I'm quietly using it myself day-to-day to see if the tool holds up under real development workloads.
This very website is now deployed using:
oi run --toolchain=oxcaml @avsm/arod -- arod serve -v
This feels like a nice eating-my-own-dogfood moment! I'll write up technical details properly once I've stopped rewriting the implementation.
I've also been working with Thomas Gazagnaire to merge his significant changes from the last four months into the agentic libraries I built last year, so we can reconcile our diverging trees. He's been hacking on these in his monopampam tree and there's a lot of cleanup to bring across.
4.1 Cross-building OCaml Windows binaries
Apropos of the above, I've been poking at msys2-docker to see if I could compile OCaml Windows binaries directly from Linux without doing a full cross-compilation. It almost works, but layering MSYS2 inside Wine is unreliable because fork doesn't behave well there. Dave Scott then mentioned to me over a coffee that it's possible to do this more directly via Wine running cmd.exe, by extracting the necessary bits out of a nanoserver Docker image. That sounds far better, so I'll try that approach next week.
5 A new forest leakage preprint
A new preprint has gone up from our 4C trusted carbon credits work, led by the wonderful Francisco d'Albertas.
This one's about "Estimating the carbon impacts of leakage from forest restoration and the costs of reducing them". The abstract:
Ecosystem restoration is a key nature-based climate solution but risks displacing economic activities and triggering leakage – whereby forgone production drives habitat loss elsewhere, eroding benefits. Focusing on reforestation opportunities on Brazilian ranchland, we characterized leakage risk as the ratio of forgone beef production to carbon gained.
Assuming 100% of forgone production results in extensification we asked: what is the impact of unaddressed leakage; how much can leakage be reduced by prioritizing restoration in low-yielding, high-carbon areas; and can it be cost-effectively mitigated by targeted intensification?
Taking likely leakage into account but not tackling it increased median costs of restoration (over ignoring it entirely) by 43-100%, to median values of 33 and 24 USD tCO₂e⁻¹ in the Atlantic Forest and Amazon, respectively. Prioritizing low-leakage sites reduced these costs by 21–37%; combining this with targeted intensification cut net carbon costs further, to 67% of unmitigated levels. Our broad findings hold at 30% (cf 100%) extensification and in other sensitivity analyses, and reveal leakage can substantially increase carbon costs, but that careful siting and targeted intensification can provide extremely cost-effective mitigation. -- d'Albertas et al., 2026
This pushes our forest restoration analyses onto the all-important "leakage" question, which is something of an elephant in the room for almost any nature-based climate intervention (if we choose our interventions badly, then displacement of existing use of that land causes yet more deforestation). Congratulations to Chico and the rest of the team for getting this out!
