1 TESSERA streaming into the browser and OxCaml hacking
I've completed a working cut at a streaming interface for TESSERA embeddings, and it's unexpectedly addictive to zoom around the world staring at false colours! The amazing thing about this interface is that it's entirely browser based, using JavaScript, WebGPU and WASM to perform all the analysis on the client side. The Zarr embeddings are chunked and served over HTTP, using range requests to retrieve the minimum amount of data.
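As a rough illustration of the range-request idea above, here is a minimal sketch of the arithmetic involved. It is not the code used in the demo: the function names, the 1024x1024 chunk shape, and the assumption that a byte offset and length for each chunk are known in advance (for example from some index) are all hypothetical.

```python
def chunk_index(row, col, chunk_shape):
    """Return the (chunk_row, chunk_col) grid position that holds pixel (row, col).

    chunk_shape is a hypothetical (rows, cols) chunk size, not TESSERA's real layout.
    """
    chunk_rows, chunk_cols = chunk_shape
    return row // chunk_rows, col // chunk_cols


def range_header(offset, length):
    """Build the value of an HTTP Range header for `length` bytes from `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"


# Example: pixel (1500, 300) with assumed 1024x1024 chunks, and an assumed
# 512-byte chunk stored 1024 bytes into the remote object.
position = chunk_index(1500, 300, (1024, 1024))
header = range_header(1024, 512)
```

A client would send this value as the `Range` request header, and a server that supports range requests answers with `206 Partial Content` containing only those bytes.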
The video shows classification workflows, but I've also got solar panel detection working using Sadiq's patch based embeddings. You can try it for yourself when I release this properly next week!
There are obvious limitations to how much we can do in the browser; for most serious work we will need a server running, but my goal here is to see if we can embed TESSERA into the living dashboard that Shane Weisz is working on. I've also just received a drop of areas-of-habitats from Michael Dales which I'll have a go at integrating next week.
1.1 It's TEE time with multiple browsers in development
What's blocking everyone from using Zarr and TESSERA? Well, we need to transcode a petabyte of embeddings from the old numpy format into Zarr, which is a difficult parallelisation problem. I'm making steady progress on an OxCaml pipeline for this with a from-scratch OxCaml-Zarr implementation that Mark Elvers helped me kick off. Mark also published his ocaml-tessera pipeline which I'm going to import to OxCaml next week as well, so that we can do both model training and tile inference in OCaml!
For more production-oriented use cases, Srinivasan Keshav has published a server with his excellent Tessera Embeddings Explorer. We've been building our implementations independently and swapping ideas for user interfaces and analyses, which has been a very productive way of experimenting for different users. His implementation is in use for several downstream projects and should be what people use, while mine is heading towards more dynamic mobile/browser workflows. You can grab that code at https://github.com/ucam-eo/tee, complete with a convenient Dockerfile.
1.2 Discussing TESSERA programming models at WG2.8

I began my talk by presenting the streaming browser demo above. Thanks to the perma-caching httpz OxCaml proxy I hacked up, the remote tiles are cached on my laptop, so the app works offline too. I used the opportunity to posit a high-level programming design problem I've encountered when coding with TESSERA...
There's a tension between the programming styles we conventionally use and the workflows that machine learning needs. In our FAIRground paper we describe a purely functional Python variant that represents conventional 'forward programming'. You can do lots of nice things when the language is pure, such as enabling incremental live computations. This forward programming style would be good for building a global computational wiki, for example.
However, when we program with observational embeddings like TESSERA, we're doing 'backwards programming'. The units we're dealing with are 128-dimensional self-supervised representations that have been learnt from primary satellite data, and the job of the program is to help cluster these higher dimensional structures into some semblance of useful meaning. We do this via downstream classifiers, segmenters and regression tasks. This is a very different programming style from forward programming even though it requires a similar amount of CPU.
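To make the 'backwards' direction concrete, here is a minimal sketch of one possible downstream classifier: a nearest-centroid classifier over embedding vectors using cosine similarity. This is a stand-in chosen for brevity, not the classifier used in the demo, and the labels and three-dimensional toy vectors are invented for illustration (real TESSERA embeddings are 128-dimensional).

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fit(labelled):
    """Turn {label: [embedding, ...]} into {label: centroid embedding}."""
    return {label: centroid(vectors) for label, vectors in labelled.items()}


def classify(centroids, embedding):
    """Return the label whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda label: cosine_similarity(centroids[label], embedding))


# Toy data: a handful of hand-labelled pixels per class (hypothetical values).
labelled = {
    "water": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "forest": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
}
centroids = fit(labelled)
prediction = classify(centroids, [0.8, 0.2, 0.0])
```

The program starts from learnt representations and a few labels and works back towards meaning, rather than computing forwards from raw inputs.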


- Amal Ahmed said this really looked like a multi-DSL problem (i.e. implement all three different styles of programming in OCaml, and then examine the DSL properties/data structures). She also pointed me to Multi-Language Probabilistic Programming which allows for differently specialised probabilistic programming languages.
- Sam Lindley also suggested this multi-DSL would be a good use of effects: could we write one OCaml program to represent all three styles, but then interpret them completely differently using effects? We could get a set of points as program traces via sampling using effects, do reproducible simulations via effects for stochastic choice, and for causal path tests use effects that regularly check program data structure invariants to build up causal hypotheses. I need to talk to KC Sivaramakrishnan and Patrick Ferris about this more, as with any effects-based idea that involves more than just a Suspend effect.
- Mary Sheeran observed that this is somewhat like hardware programming, whereby we put the minimal structural constraints in and then try to discover layouts.
- John Hughes noted that the statistical testing combined with data structures (the synthetic computation) is quite similar to QuickCheck: can we posit causal relationships and 'quickcheck' them efficiently?
- Gabriele Keller is working with spatial ecologists on saltmarshes using array programming to speed up their calculations, so we had a very productive conversation that I will follow up on! Sounds remarkably similar to the work in the Cairngorms that David Coomes is leading in the CLR.
- Manuel Chakravarty gave me a lot of tips on Mac/iOS approaches and also told me about Volt Europa and their push for liberal sovereignty.
- Nate Foster and Sam Lindley helped me simplify my thinking a lot: rather than worry about scale (millions of species), can we find the smallest possible example to work outwards from synthetically? I obviously thought of hedgehog mapping as a good one here. We also thought that viewing causality as a 'triangle' wasn't right: instead, we could use a combination of synthetic models + observational samples as bidirectional lenses, and then draw causal path diagrams across them to test the lenses (sort of like natural experiments). This is somewhat like boomerang lenses but for sample data instead of strings.
- Richard Eisenberg gave me practical OxCaml advice as it's a fast-moving target: layout polymorphism is a while away yet, so keep using ppx_template for now, but other features like float16 (useful for TESSERA) could be done fairly easily.
- Andreas Rossberg was impressed by the use of WASM for browser-based geospatial, and we discussed the difficulty of using WASM with the DOM for interactive interfaces. Machine learning workflows perform well because of the lack of DOM transitions, but hopefully Mozilla is working on improving this.
- Simon Peyton Jones looked bemused by it all and thought it was too high level a concept to latch onto. I need a worked example like the above to convince him when I'm back at Cambridge!
It was a short trip to Portugal in the end, but massively energising. I do love hanging with functional programmers!
2 Biodiversity action through technology
Two big perspective papers on global biodiversity are out in PNAS this week, which I wrote up separately in a detailed note. To follow up on these, Cyrus Omar is chairing this year's PROPL (to be held at PLDI this summer), and we've been discussing doing something different this year to tie into a more 'action-oriented' workshop that combines the learnings from the last couple of years with the call to biodiversity action above.
Meanwhile, to follow up on the first TESSERA hackathon over in India a few weeks ago, Aadi Seth and Srinivasan Keshav have put together a call for students to get involved. If you're interested then apply here and get going with TESSERA!

A programmable public infrastructure for environmental planning, combining TESSERA's satellite-derived representations with CoRE Stack data and compositional functional models in O(x)Caml to support auditable indicators and scenario analysis for India’s water and habitat systems. -- FP Launchpad Charter, 2026
So it's action stations at both the IITs and I'm looking forward to working with them from Cambridge! This is a nice followup to our Cambridge VC visiting India and kicking off a cricketing tour!
3 Docker buzz from the CACM article
Following the CACM Docker article, there's been lots of positive online discussion about it. Hacker News leads the way with typically split opinions. Some loved it, some hated it, some thought it should be replaced with a very small shell script, and others reminisced over our use of SLIRP. Overall though, a lovely discussion and vibe.
I also read two interesting papers while researching more background for the package calculus that Ryan Gibb has been working on:
- Docker Does Not Guarantee Reproducibility discusses some of the common pitfalls around building fully reproducible containers. While there's support at the lower levels for this in the Docker stack, I agree it's not kept up with modern needs. Luckily, Patrick Ferris is hacking on a new shell interface with provenance.
- Does Functional Package Management Enable Reproducible Builds at Scale? Yes.: This is a complementary paper, and argues for a Nix-like approach. It's good to see that Nix (despite its constrained versioning) does a good job of supporting retrospective builds.
4 Visitor from KTH

In this seminar, Professor Ban will discuss recent research at the intersection of EO and AI, with a focus on deep learning methods for monitoring environmental change at scale. She will present selected results from EO-AI4GlobalChange, a collaborative research project developing novel, globally-applicable deep learning approaches for analysing multi-sensor, multi-modal EO data. The talk will cover examples including 2D and 3D urban mapping, urban change detection, wildfire detection and near-real-time monitoring, flood mapping, and multi-hazard building damage detection.
The seminar will also briefly introduce PANGAEA, a global benchmark for Geospatial Foundation Models, and discuss insights from the systematic evaluation of widely used foundation models across multiple geospatial domains. Finally, Professor Ban will briefly outline the objectives of the recently established AI4EO Working Group within Group on Earth Observations (GEO), which aims to advance GEO’s vision of Earth Intelligence for All through AI-driven Earth observation research, innovation, and collaboration. -- Yifang Ban, EEG Seminar, March 2026
But most importantly, we had excellent fish and chips to celebrate her first visit to Cambridge!
5 Fun links
- Next OxCaml vibespiling target: "How we built the fastest regexp engine in F#" with code here.
- The calls to reform publishing are getting louder and louder. A new ATProto service called Chive looks interesting here.
- New podcast on sci-fi is a lot of fun, called Starship Alexandria with Emma Newman and Adrian Tchaikovsky. I've been reminded to pick up Adrian's latest series City of Last Chances which I'm enjoying so far. Great insect world building as always!
- Extremely sad news is the passing of Prof Alan Wilson, who I was showing my Botswana leopard pictures and getting flying tips from just a few months ago. He passed away in a light aircraft crash while heading into the sand dunes of Namibia. Very, very sad news.
- More bad news in a pretty bad week is that global warming has accelerated significantly.
- But the good news is that I learnt of Lazarus taxa that come back from extinction, such as this week's adorable marsupial thought extinct for 6000 years. Hurrah!

