After the travel marathon of the past fortnight I got to catch up with hacking this week! I did pop down to London once for a Royal Society policy meeting on AI in science with the European Commission, and discovered that the EU still has a (much-shrunken) delegation in London; a bit of post-Brexit infrastructure that I hadn't realised existed but was very glad to learn about.
While I was down there, I caught up with Cyrus Omar, who was over from Michigan, to chat about our Fairground planetary wiki work. We're both interested in how programmable wikis can become a serious substrate for sharing structured scientific data with provenance baked in, and PROPL 2026 is coming up at PLDI, where we'll take this work further.
1 Welcoming Akshay to Cambridge
I'm most delighted to welcome Akshay Oppiliappan to my group here in Cambridge! I've long been a fan of his work on Tangled, and indeed consider it to be the most useful app built over ATProto.
We've been using Tangled for a lot of our code hosting here in my group, and it's a really practical step towards the federated scientific infrastructure we want to build under the five principles of collective knowledge.

One interesting thing I learnt is that Tangled is working on a separable 'app view' (that is, a version of the https://tangled.org website that can be deployed elsewhere). I'd love to have a version that is restricted to just the immediate group members in order to help get a focussed view on a particular set of repositories, while still keeping the overall metadata open.
2 TESSERA: AWS sync done, Zarr bindings next
The big milestone on the TESSERA side is that the AWS Open Data sync has finally finished, so we now have the full half-petabyte mirrored alongside our Cambridge Ceph copy. With that done, I'm turning my attention to the OxCaml Zarr conversion by building on Mark's ocaml-zarr work, so that we can start consuming the cloud-native stores directly via HTTP.
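The ocaml-zarr bindings aren't shown here, but the reason Zarr suits cloud-native access is worth spelling out: an array is just a small JSON metadata document plus independently addressable chunk objects under well-known keys, so a reader over S3 or HTTP only fetches the chunks it needs. Here's a stdlib-only Python sketch of that key-value layout, loosely mimicking the Zarr v3 conventions (metadata fields are simplified from the real spec, and a local directory stands in for the object store):

```python
# Toy illustration of the Zarr-style key-value layout: JSON metadata at a
# fixed key, plus one object per chunk. A local directory stands in for an
# object store; over S3/HTTP each key would be an independent GET.
import json, struct, tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Metadata document describing shape, dtype and chunking (simplified).
(root / "zarr.json").write_text(json.dumps({
    "zarr_format": 3, "node_type": "array",
    "shape": [4], "data_type": "float64",
    "chunk_grid": {"name": "regular", "configuration": {"chunk_shape": [2]}},
}))

# Each chunk lives under its own key ("c/0", "c/1", ...), so readers can
# fetch exactly the ranges they need without touching the rest.
for i, chunk in enumerate([(1.0, 2.0), (3.0, 4.0)]):
    key = root / "c" / str(i)
    key.parent.mkdir(parents=True, exist_ok=True)
    key.write_bytes(struct.pack("<2d", *chunk))

# A minimal reader: parse the metadata, then fetch only chunk 1.
meta = json.loads((root / "zarr.json").read_text())
n = meta["chunk_grid"]["configuration"]["chunk_shape"][0]
values = struct.unpack(f"<{n}d", (root / "c" / "1").read_bytes())
print(values)  # (3.0, 4.0)
```

The same two-step pattern (fetch metadata once, then range-read chunks on demand) is what makes serving half a petabyte from plain object storage practical.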
There are also some exciting updates coming soon about a new version of the TESSERA model that pushes the embedding quality further. The nice property of our "embeddings as data" architecture is that no user-facing code needs to change when v1.1 lands: we just regenerate the map tiles under the existing geo-embeddings convention, and downstream tasks should pick up the improvements automatically. More on this once embedding generation progresses!
3 Recoil refresh to Linux 7.0
On the home infrastructure front, I spent some quality time upgrading several of the Recoil self-hosting machines to Ubuntu 26.04. I have not been able to recreate the pesky io_uring/ZFS wedge that had been plaguing me on 6.14 kernels recently. Fingers crossed that it really is fixed and not just hiding behind a race condition!
I've also been happily using Komodo as the lightweight web interface for Docker across three machines, and am busy migrating to Mythic Beasts since our former Equinix hosting is sunsetting next month. The only technical complexity is that Mastodon is tied to a single hostname, and I made the mistake of calling mine amok.recoil.org (the raw machine hostname) instead of something more abstract. Michael Dales did manage a similar migration last year though, which bodes well for when I try next week...
4 oi continues, and now deploys this very site
My sidequest on oi, a uv-like distributor for OCaml binaries, has been steadily gathering steam. It now supports OxCaml as well as multiple OCaml versions, which is tricky since OxCaml isn't relocatable yet. Still, some hacks later, I've got far enough that I'm quietly using it myself day-to-day to see if the tool holds up under real development workloads.
This very website is now deployed using:
oi run --toolchain=oxcaml @avsm/arod -- arod serve -v
This feels like a nice eating-my-own-dogfood moment! I'll write up technical details properly once I've stopped rewriting the implementation.
I've also been working with Thomas Gazagnaire to merge his significant changes from the last four months into the agentic libraries I built last year, so we can reconcile our diverging trees. He's been hacking on these in his monopampam tree and there's a lot of cleanup to bring across.
4.1 Cross-building OCaml Windows binaries
Apropos of the above, I've been poking at msys2-docker to see if I could compile OCaml Windows binaries directly from Linux without doing a full cross-compilation. It almost works, but layering MSYS2 inside Wine is unreliable because fork doesn't behave well there. Dave Scott then mentioned to me over a coffee that it's possible to do this more directly via Wine running cmd.exe, by extracting the necessary bits out of a nanoserver Docker image. That sounds far better, so I'll try that approach next week.
5 A new forest leakage preprint
A new preprint has gone up from our 4C trusted carbon credits work, led by the wonderful Francisco d'Albertas.
This one's about "Estimating the carbon impacts of leakage from forest restoration and the costs of reducing them". The abstract:
Ecosystem restoration is a key nature-based climate solution but risks displacing economic activities and triggering leakage – whereby forgone production drives habitat loss elsewhere, eroding benefits. Focusing on reforestation opportunities on Brazilian ranchland, we characterized leakage risk as the ratio of forgone beef production to carbon gained.
Assuming 100% of forgone production results in extensification we asked: what is the impact of unaddressed leakage; how much can leakage be reduced by prioritizing restoration in low-yielding, high-carbon areas; and can it be cost-effectively mitigated by targeted intensification?
Taking likely leakage into account but not tackling it increased median costs of restoration (over ignoring it entirely) by 43-100%, to median values of 33 and 24 USD tCO₂e⁻¹ in the Atlantic Forest and Amazon, respectively. Prioritizing low-leakage sites reduced these costs by 21–37%; combining this with targeted intensification cut net carbon costs further, to 67% of unmitigated levels. Our broad findings hold at 30% (cf 100%) extensification and in other sensitivity analyses, and reveal leakage can substantially increase carbon costs, but that careful siting and targeted intensification can provide extremely cost-effective mitigation. -- d'Albertas et al., 2026
This pushes our forest restoration analyses onto the all-important "leakage" question, which is something of an elephant in the room for almost any nature-based climate intervention (if we choose our interventions badly, then displacement of existing use of that land causes yet more deforestation). Congratulations to Chico and the rest of the team for getting this out!
