.plan-26-10: Streaming TESSERA working, biodiversity action papers, and FPL takes off

Anil Madhavapeddy

doi:10.59350/re0zy-3rt26


#tessera #packages #opensource #docker #india #fplaunchpad #ocaml #oxcaml · 8 Mar 2026

TESSERA streaming in the browser, planetary programming at WG2.8, biodiversity action papers, FP Launchpad opens, and Docker CACM buzz

1 TESSERA streaming into the browser and OxCaml hacking

I've completed a working cut of a streaming interface for TESSERA embeddings, and it's unexpectedly addictive to zoom around the world staring at false colours! The amazing thing about this interface is that it's entirely browser-based, using JavaScript, WebGPU and WASM to perform all the analysis on the client side. The Zarr embeddings are chunked and served over HTTP, using range requests to retrieve the minimum amount of data.
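To make the range-request idea concrete, here is a minimal sketch of how a client might work out which bytes to ask for when it only needs the chunk containing one pixel. It is not the actual TESSERA client code: the chunk shape, the `EXAMPLE_INDEX` table of byte offsets and the function names are all assumptions made for illustration. The only fixed fact it relies on is that HTTP byte ranges are inclusive at both ends.

```python
from typing import Dict, Tuple

ChunkCoord = Tuple[int, int]

def chunk_for_pixel(row: int, col: int, chunk_shape: Tuple[int, int]) -> ChunkCoord:
    """Return the (chunk_row, chunk_col) grid cell that contains a pixel."""
    return (row // chunk_shape[0], col // chunk_shape[1])

def range_header(offset: int, nbytes: int) -> Dict[str, str]:
    """Build an HTTP Range header; byte ranges are inclusive at both ends."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    return {"Range": f"bytes={offset}-{offset + nbytes - 1}"}

# Hypothetical index: chunk coordinate -> (byte offset, byte length) in one file.
EXAMPLE_INDEX: Dict[ChunkCoord, Tuple[int, int]] = {
    (0, 0): (0, 4096),
    (0, 1): (4096, 3500),
    (1, 0): (7596, 4100),
}

def request_for_pixel(row, col, chunk_shape, index):
    """Return the chunk coordinate and the Range header needed to fetch it."""
    coord = chunk_for_pixel(row, col, chunk_shape)
    offset, nbytes = index[coord]
    return coord, range_header(offset, nbytes)
```

With a 256x256 chunk shape, `request_for_pixel(10, 300, (256, 256), EXAMPLE_INDEX)` selects chunk `(0, 1)` and asks for just that chunk's bytes rather than the whole file.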

The video shows classification workflows, but I've also got solar panel detection working using Sadiq's patch-based embeddings. You can try it for yourself when I release this properly next week!

There are obvious limitations to how much we can do in the browser; for most serious work we will need a server running, but my goal here is to see if we can embed TESSERA into the living dashboard that Shane Weisz is working on. I've also just received a drop of areas-of-habitats from Michael Dales which I'll have a go at integrating next week.

1.1 It's TEE time with multiple browsers in development

What's blocking everyone from using Zarr and TESSERA? Well, we need to transcode a petabyte of embeddings from the old numpy format into Zarr, which is a difficult parallelisation problem. I'm making steady progress on an OxCaml pipeline for this with a from-scratch OxCaml-Zarr implementation that Mark Elvers helped me kick off. Mark also published his ocaml-tessera pipeline which I'm going to import to OxCaml next week as well, so that we can do both model training and tile inference in OCaml!
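One common way to reason about this kind of transcoding job is to plan it so that every output shard is owned by exactly one worker, which removes the need to coordinate writes. The sketch below illustrates only that planning step and is not the OxCaml pipeline described above: the integer tile coordinates, the `tiles_per_shard` parameter and the placeholder `transcode_shard` worker are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def shard_key(tile, tiles_per_shard):
    """Map a (x, y) tile coordinate to the shard that will contain it."""
    x, y = tile
    return (x // tiles_per_shard, y // tiles_per_shard)

def plan_shards(tiles, tiles_per_shard):
    """Group input tiles by output shard, so shards can be built independently."""
    plan = defaultdict(list)
    for tile in tiles:
        plan[shard_key(tile, tiles_per_shard)].append(tile)
    return {key: sorted(members) for key, members in plan.items()}

def transcode_shard(item):
    """Placeholder worker: a real one would read each tile's numpy file
    and write a single shard. Here it only reports how many tiles it owns."""
    key, members = item
    return key, len(members)

def run(tiles, tiles_per_shard, workers=4):
    """Plan the shards, then process each shard on its own worker."""
    plan = plan_shards(tiles, tiles_per_shard)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(transcode_shard, plan.items()))
```

The hard parts at petabyte scale (I/O scheduling, retries, uneven shard sizes) are deliberately left out; the point is only that grouping by output shard makes the workers independent of each other.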

For more production-oriented use cases, Srinivasan Keshav has published a server with his excellent Tessera Embeddings Explorer. We've been building our implementations independently and swapping ideas for user interfaces and analyses, which has been a very productive way of experimenting for different users. His implementation is in use for several downstream task projects and should be what people use, while mine is heading towards more dynamic mobile/browser workflows. You can grab that code at https://github.com/ucam-eo/tee, complete with a convenient Dockerfile.

1.2 Discussing TESSERA programming models at WG2.8

I got invited to WG2.8 again, so Simon Peyton Jones and I trooped off there from Cambridge earlier in the week. I had to leave early due to some family matters, but I got to present 'planetary programming' to the assembled gods of functional programming!

I began my talk by presenting the streaming browser demo above. Thanks to the perma-caching httpz OxCaml proxy I hacked up, the remote tiles are cached on my laptop, so the app works offline too. I used the opportunity to posit a high-level programming design problem I've encountered when coding with TESSERA...

There's a contradictory tension in the styles of programming that we conventionally embark on with the workflows needed by machine learning. In our FAIRground paper we describe a purely functional Python variant that represents conventional 'forward programming'. You can do lots of nice things when the language is pure, such as enabling incremental live computations. This forward programming style would be good for building a global computational wiki for example.
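A small illustration of why purity helps with incremental computation: if a step is a pure function of its inputs, its result can be cached and reused whenever those inputs are unchanged. The sketch below uses Python's standard `functools.lru_cache` for this. The function names and the toy habitat data are invented for the example, and the `calls` counter is instrumentation only (it is the one deliberate impurity, there to show which steps actually re-ran). This is not the Python variant from the FAIRground paper.

```python
from functools import lru_cache

# Instrumentation only: counts how often each step is really recomputed.
calls = {"area": 0, "density": 0}

@lru_cache(maxsize=None)
def habitat_area(cells: tuple) -> int:
    """Pure step: total area from a tuple of per-cell areas."""
    calls["area"] += 1
    return sum(cells)

@lru_cache(maxsize=None)
def density(population: int, area: int) -> float:
    """Pure step: individuals per unit area."""
    calls["density"] += 1
    return population / area

def report(cells: tuple, population: int) -> float:
    """Forward pipeline: cells -> area -> density."""
    return density(population, habitat_area(cells))
```

Calling `report` twice with the same cells but a different population recomputes only `density`; the cached `habitat_area` result is reused because its input did not change.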

However, when we program with observational embeddings like TESSERA, we're doing 'backwards programming'. The units we're dealing with are 128-dimensional self-supervised representations that have been learnt from primary satellite data, and the job of the program is to help cluster these higher dimensional structures into some semblance of useful meaning. We do this via downstream classifiers, segmenters and regression tasks. This is a very different programming style from forward programming even though it requires a similar amount of CPU.
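As a toy example of this 'backwards' style, the sketch below fits about the simplest possible downstream classifier, a nearest-centroid model, to a handful of labelled embedding vectors and then labels a new one. The labels and vectors are made up, and real downstream tasks would use far more data and stronger models; the only detail taken from the text is that each embedding is a 128-dimensional vector.

```python
import math
from typing import Dict, List, Sequence

def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def fit(labelled: Dict[str, List[Sequence[float]]]) -> Dict[str, List[float]]:
    """Summarise each label by the centroid of its example embeddings."""
    return {label: centroid(vectors) for label, vectors in labelled.items()}

def predict(centroids: Dict[str, List[float]], embedding: Sequence[float]) -> str:
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], embedding))

# Made-up 128-dimensional examples standing in for real embeddings.
EXAMPLES = {
    "water": [[0.0] * 128, [0.1] * 128],
    "forest": [[1.0] * 128, [0.9] * 128],
}
```

Here `predict(fit(EXAMPLES), [0.2] * 128)` returns `"water"`: the program starts from learnt representations and works back towards a human-meaningful label, rather than computing forwards from a model.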

The ultimate goal of both of these scientific programming styles is to establish causal relationships; that is, we want to form (or reinforce or falsify) a theory of how the world works that can be tested by the scientific method. So I wondered: how do we combine these three styles (forward, backwards and causal) into a programming language? This is a very general question, but I figured there was no better place to ask than a room full of people who have designed dozens if not hundreds of languages between them.

Three sorts of relations we are trying to program

Amal Ahmed said this really looked like a multi-DSL problem (i.e. implement all three different styles of programming in OCaml, and then examine the DSL properties/data structures). She also pointed me to Multi-Language Probabilistic Programming which allows for differently specialised probabilistic programming languages.
Sam Lindley also suggested this multi-DSL would be a good use of effects: could we write one OCaml program to represent all three styles, but then interpret them completely differently using effects? We could use effects for sampling, to get a set of points as program traces; effects for stochastic choice, to get reproducible simulations; and, for causal path tests, effects that regularly check program data structure invariants to build up causal hypotheses. I need to talk to KC Sivaramakrishnan and Patrick Ferris about this more, as with any effects-based idea that involves more than just a Suspend effect.
Mary Sheeran observed that this is somewhat like hardware programming, whereby we put the minimal structural constraints in and then try to discover layouts.
John Hughes noted that the statistical testing combined with data structures (the synthetic computation) is quite similar to QuickCheck: can we posit causal relationships and 'quickcheck' them efficiently?
Gabriele Keller is working with spatial ecologists on saltmarshes using array programming to speed up their calculations, so we had a very productive conversation that I will follow up on! Sounds remarkably similar to the work in the Cairngorms that David Coomes is leading in the CLR.
Manuel Chakravarty gave me a lot of tips on Mac/iOS approaches and also told me about Volt Europa and their push for liberal sovereignty.
Nate Foster and Sam Lindley helped me simplify my thinking a lot: rather than worry about scale (millions of species), can we find the smallest possible example to work outwards from synthetically? I obviously thought of hedgehog mapping as a good one here. We also thought that viewing causality as a 'triangle' wasn't right: instead, we could use a combination of synthetic models + observational samples as bidirectional lenses, and then draw causal path diagrams across them to test the lenses (sort of like natural experiments). This is somewhat like Boomerang lenses but for sample data instead of strings.
Richard Eisenberg gave me practical OxCaml advice as it's a fast-moving target: layout polymorphism is a while away yet, so keep using ppx_template for now, but other features like float16 (useful for TESSERA) could be done fairly easily.
Andreas Rossberg was impressed by the use of WASM for browser-based geospatial, and we discussed the difficulty of using WASM with the DOM for interactive interfaces. Machine learning workflows perform well because of the lack of DOM transitions, but hopefully Mozilla is working on improving this.
Simon Peyton Jones looked bemused by it all and thought it was too high level a concept to latch onto. I need a worked example like the above to convince him when I'm back at Cambridge!
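The 'one program, several interpretations' idea from the effects discussion above can be sketched without OCaml at all. The example below uses Python generators as a stand-in for one-shot effect handlers: the program yields a `Choose` request wherever it needs a decision, and the handler supplies the answer. This is only an approximation of what OCaml 5 effects offer (a generator cannot be resumed more than once from the same point), and the `Choose` type, `habitat_model` and both handlers are invented for illustration.

```python
import random
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Choose:
    """Effect request: ask the handler to pick one of the options."""
    name: str
    options: Tuple[str, ...]

def habitat_model():
    """One program; what 'Choose' means is decided by whoever runs it."""
    rain = yield Choose("rain", ("dry", "wet"))
    cover = yield Choose("cover", ("grass", "scrub"))
    return "suitable" if (rain == "wet" and cover == "scrub") else "unsuitable"

def run(program, decide):
    """Drive the program, answering each request with decide(request).
    Returns the final result and the trace of (name, answer) pairs."""
    gen = program()
    trace = []
    try:
        request = next(gen)
        while True:
            answer = decide(request)
            trace.append((request.name, answer))
            request = gen.send(answer)
    except StopIteration as stop:
        return stop.value, trace

def sampling_handler(seed):
    """Interpret Choose as a random draw; a fixed seed makes runs reproducible."""
    rng = random.Random(seed)
    return lambda request: rng.choice(request.options)

def fixed_handler(assignment):
    """Interpret Choose as an intervention: every choice is pinned in advance."""
    return lambda request: assignment[request.name]
```

Running `habitat_model` under `sampling_handler(seed)` yields a sampled trace that is identical for identical seeds, while `fixed_handler` pins every choice, which is roughly the shape a causal 'what if' test would take.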

It was a short trip to Portugal in the end, but massively energising. I do love hanging with functional programmers!

2 Biodiversity action through technology

Two big perspective papers on global biodiversity are out in PNAS this week, which I wrote up separately in a detailed note. To follow up on these, Cyrus Omar is chairing this year's PROPL (to be held at PLDI this summer), and we've been discussing doing something different this year to tie into a more 'action-oriented' workshop that combines the learnings from the last couple of years with the call to biodiversity action above.

Meanwhile, to follow up on the first TESSERA hackathon over in India a few weeks ago, Aadi Seth and Srinivasan Keshav have put together a call for students to get involved. If you're interested then apply here and get going with TESSERA!

And not to be left behind by their Delhi counterparts, KC Sivaramakrishnan announced that applications are now open for the FP Launchpad in IIT-Madras. This should be of interest to computer scientists who want to get involved in environmental work; as I mentioned before, one of the illustrative projects to kick off the FPL is programming TESSERA embeddings more ergonomically:

A programmable public infrastructure for environmental planning, combining TESSERA's satellite-derived representations with CoRE Stack data and compositional functional models in O(x)Caml to support auditable indicators and scenario analysis for India’s water and habitat systems. -- FP Launchpad Charter, 2026

So it's action stations at both the IITs and I'm looking forward to working with them from Cambridge! This is a nice follow-up to our Cambridge VC visiting India and kicking off a cricketing tour!

3 Docker buzz from the CACM article

Following the CACM Docker article, there's been lots of positive online discussion about it. Hacker News leads the way with typically split opinions. Some loved it, some hated it, some thought it should be replaced with a very small shell script, and others reminisced over our use of SLIRP. Overall though a lovely discussion and vibe.

I also read two interesting papers while researching more background for the package calculus that Ryan Gibb has been working on:

Docker Does Not Guarantee Reproducibility discusses some of the common pitfalls around building fully reproducible containers. While there's support at the lower levels for this in the Docker stack, I agree it hasn't kept up with modern needs. Luckily, Patrick Ferris is hacking on a new shell interface with provenance.
Does Functional Package Management Enable Reproducible Builds at Scale? Yes.: This is a complementary paper, and argues for a Nix-like approach. It's good to see that Nix (despite its constrained versioning) does a good job of supporting retrospective builds.
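The core idea behind a functional, Nix-like approach can be shown in a few lines: a build is identified by a hash over all of its inputs, including the identities of its dependencies, so the same inputs always map to the same key and any change propagates upwards. The sketch below is a deliberately simplified illustration and not Nix's actual hashing scheme; the `store_key` function and the package data are hypothetical.

```python
import hashlib
import json

def store_key(name: str, version: str, build_script: str, deps: dict) -> str:
    """Hash every build input into one key.
    deps maps each dependency name to that dependency's own store key."""
    payload = json.dumps(
        {"name": name, "version": version, "build": build_script, "deps": deps},
        sort_keys=True,  # key order must not affect the hash
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    return f"{digest}-{name}-{version}"

# Hypothetical packages: the app's key depends on its dependencies' keys.
zlib = store_key("zlib", "1.3", "make", {})
libc_old = store_key("libc", "2.38", "make", {})
libc_new = store_key("libc", "2.39", "make", {})
app_old = store_key("app", "1.0", "make install", {"libc": libc_old, "zlib": zlib})
app_new = store_key("app", "1.0", "make install", {"libc": libc_new, "zlib": zlib})
```

Here `app_old` and `app_new` differ even though the app's own source and version are unchanged, because one of its inputs changed; that property is what makes it possible to ask, years later, for exactly the build that was used.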

4 Visitor from KTH

We had a delightful visit from Professor Yifang Ban from KTH, who delivered this week's EEG seminar on EO-AI4GlobalChange. We went to the pub after, and discussed a rather staggering number of downstream tasks that Prof Ban works on:

In this seminar, Professor Ban will discuss recent research at the intersection of EO and AI, with a focus on deep learning methods for monitoring environmental change at scale. She will present selected results from EO-AI4GlobalChange, a collaborative research project developing novel, globally-applicable deep learning approaches for analysing multi-sensor, multi-modal EO data. The talk will cover examples including 2D and 3D urban mapping, urban change detection, wildfire detection and near-real-time monitoring, flood mapping, and multi-hazard building damage detection.

The seminar will also briefly introduce PANGAEA, a global benchmark for Geospatial Foundation Models, and discuss insights from the systematic evaluation of widely used foundation models across multiple geospatial domains. Finally, Professor Ban will briefly outline the objectives of the recently established AI4EO Working Group within Group on Earth Observations (GEO), which aims to advance GEO’s vision of Earth Intelligence for All through AI-driven Earth observation research, innovation, and collaboration. -- Yifang Ban, EEG Seminar, March 2026

But most importantly, we had excellent fish and chips to celebrate her first visit to Cambridge!

5 Fun links

Next OxCaml vibespiling target: "How we built the fastest regexp engine in F#" with code here.
The calls to reform publishing are getting louder and louder. A new ATProto service called Chive looks interesting here.
New podcast on sci-fi is a lot of fun, called Starship Alexandria with Emma Newman and Adrian Tchaikovsky. I've been reminded to pick up Adrian's latest series City of Last Chances which I'm enjoying so far. Great insect world building as always!
Extremely sad news is the passing of Prof Alan Wilson, to whom I was showing my Botswana leopard pictures, and from whom I was getting flying tips, just a few months ago. He passed away in a light aircraft crash while heading into the sand dunes of Namibia. Very, very sad news.
More bad news in a pretty bad week is that global warming has accelerated significantly.
But the good news is that I learnt of Lazarus taxa that come back from extinction, such as this week's adorable marsupial thought extinct for 6000 years. Hurrah!

References

[1] Madhavapeddy (2026). Connecting the dots for biodiversity action from the NAS/Royal Society Forum. 10.59350/dy7d3-hdt43

[2] Madhavapeddy (2026). At the AI Impact Summit in Delhi: people, planet, progress. 10.59350/6vc5q-mbk23

[3] Feng et al. (2026). Applications of the TESSERA Geospatial Foundation Model to Diverse Environmental Mapping Tasks. SSRN. 10.2139/ssrn.6142416

[4] Madhavapeddy (2025). Royal Society's Future of Scientific Publishing meeting. 10.59350/nmcab-py710

[5] Madhavapeddy (2026). 1st TESSERA/CoRE hackathon at the Indian AI Summit. 10.59350/1na80-7ak85

[6] Omar et al. (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. 10.1145/3759536.3763802

[7] Gibb et al. (2026). Package Managers à la Carte: A Formal Model of Dependency Resolution. arXiv. 10.48550/arXiv.2602.18602

[8] Patterson et al. (2023). Semantic Encapsulation using Linking Types. ACM. 10.1145/3609027.3609405

[9] Bohannon et al. (2008). Boomerang: resourceful lenses for string data. 10.1145/1328897.1328487

[10] Rahmstorf et al. (2025). Global Warming has Accelerated Significantly. Research Square. 10.21203/rs.3.rs-6079807/v1

[11] Stites et al. (2025). Multi-Language Probabilistic Programming. arXiv. 10.48550/arXiv.2502.19538

[12] Malka et al. (2026). Docker Does Not Guarantee Reproducibility. arXiv. 10.48550/arXiv.2601.12811

[13] Malka et al. (2025). Does Functional Package Management Enable Reproducible Builds at Scale? Yes. arXiv. 10.48550/arXiv.2501.15919
