TESSERA, a pixel-wise geospatial foundation model

TESSERA is an open, pixel-wise foundation model for multi-modal (Sentinel-1/2) Earth observation time series that learns robust, label-efficient embeddings.

Our goal with TESSERA is to make manipulating global satellite intelligence as easy as conventional programming. To that end, we release global, annual, 10 m, pixel-wise embeddings together with open weights, code, and lightweight adaptation heads. We also develop practical tooling for retrieval and inference at planetary scale.

As with any good foundation model, there is a staggering array of downstream tasks that can benefit. TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency across diverse classification, segmentation, and regression tasks.

1 Storage, Zarr, and cloud-native distribution

Much of the work in early 2026 went into the plumbing needed to actually use TESSERA at scale. We restructured the store around a Zarr v3 layout and a shared geo-embeddings convention, iterated on the chunking after community feedback, and shipped it in geotessera 0.8 with multi-year support and a browser-based TZE explorer backed by HTTP range requests.
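To make the range-request idea concrete, here is a minimal sketch of how a reader might locate one pixel inside a sharded store. The chunk and shard edge lengths are hypothetical and are not TESSERA's actual layout. The sketch relies only on the general Zarr v3 sharding scheme, in which each shard carries an index of byte offsets and lengths for its inner chunks, so a reader can fetch that index and then the one chunk it needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    chunk: int  # pixels per inner-chunk edge (hypothetical value)
    shard: int  # pixels per shard edge (hypothetical, a multiple of chunk)


def locate(row, col, layout):
    """Return (shard index, inner-chunk index, offset inside the chunk)."""
    shard_rc = (row // layout.shard, col // layout.shard)
    in_shard = (row % layout.shard, col % layout.shard)
    chunk_rc = (in_shard[0] // layout.chunk, in_shard[1] // layout.chunk)
    offset = (in_shard[0] % layout.chunk, in_shard[1] % layout.chunk)
    return shard_rc, chunk_rc, offset


def range_header(offset, nbytes):
    """Build an HTTP Range header for nbytes starting at a byte offset."""
    # HTTP byte ranges are inclusive at both ends, hence the -1.
    return {"Range": f"bytes={offset}-{offset + nbytes - 1}"}


layout = Layout(chunk=256, shard=4096)
shard_rc, chunk_rc, offset = locate(5000, 300, layout)
```

With these made-up sizes, pixel (5000, 300) falls in shard (1, 0), inner chunk (3, 1). Once the shard index supplies that chunk's byte offset and length, one `Range` header is enough to retrieve it.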

On the storage side, we expanded the Cambridge Ceph cluster to 1.4PB just in time to mirror the full half-petabyte to AWS Open Data, with the sync finishing a week or so later. The geotessera client now discovers tiles from multiple registries so consumers can pull from whichever copy is closest. In parallel, Mark Elvers has been porting Brotli/Zstd/Snappy to OxCaml and building ocaml-zarr as the basis for native OCaml access to the cloud-native stores.
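The multi-registry lookup can be pictured with a small sketch. This is not the geotessera API: the `Registry` type, the mirror names, the tile identifiers, and the latency figures are all hypothetical, and "closest" is approximated here by a measured round-trip time.

```python
from typing import Iterable, NamedTuple, Optional


class Registry(NamedTuple):
    name: str          # hypothetical mirror label
    latency_ms: float  # round-trip time measured from this client
    tiles: frozenset   # tile identifiers the mirror advertises


def pick_source(tile: str, registries: Iterable[Registry]) -> Optional[Registry]:
    """Choose the lowest-latency registry that lists the tile, if any."""
    candidates = [r for r in registries if tile in r.tiles]
    return min(candidates, key=lambda r: r.latency_ms, default=None)


# Made-up mirrors and tile names, for illustration only.
mirrors = [
    Registry("cambridge", 12.0, frozenset({"tile-a", "tile-b"})),
    Registry("aws-open-data", 45.0, frozenset({"tile-a", "tile-b", "tile-c"})),
]
```

A tile held by both mirrors resolves to the nearer one, a tile held by only one mirror resolves to that mirror, and an unknown tile yields `None` so the caller can decide how to fail.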

2 Scaling up to v2

The v2 checkpoint paper was uploaded in July 2026, led by Frank Feng with Sadiq Jaffer, David Coomes, Srinivasan Keshav and others, using the UKRI AIRR allocation on Isambard 2 for a hero run to scale the model. Hundreds of ablation sweeps gave us a simple rule for allocating compute: as the training budget grows, the encoder and the satellite data should grow together while the projector stays fixed. The result is a 1-billion-parameter model, with a 2-billion-parameter one still training (as of July 2026).

Since inference over 1.8 million tiles has to stay cheap, we distil that model into a family of smaller student models. Pleasingly, TESSERA-v2-1B-M has fewer parameters than v1 and still outperforms it, along with every other GeoFM we could find. v2 also adds Matryoshka embeddings, so the first eighth of the 128 dimensions (16 of them) carries ~90% of the downstream performance. This matters for smaller devices, but leaves open the question of how to slice dimensions efficiently in Zarr.
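As a rough illustration of what using a Matryoshka prefix looks like on the consumer side, the sketch below keeps the leading 16 of 128 channels. The tile shape and the random values are placeholders, and rescaling each pixel to unit length afterwards is an assumption (a common choice when prefixes are compared by cosine similarity), not something the v2 release is known to require.

```python
import numpy as np

FULL_DIM = 128               # embedding width stated in the text
PREFIX_DIM = FULL_DIM // 8   # the first eighth: 16 dimensions


def matryoshka_prefix(embeddings, dim=PREFIX_DIM):
    """Keep the leading `dim` channels and rescale each pixel to unit length."""
    prefix = np.asarray(embeddings, dtype=np.float32)[..., :dim]
    norms = np.linalg.norm(prefix, axis=-1, keepdims=True)
    # Guard against division by zero for all-zero pixels.
    return prefix / np.maximum(norms, 1e-12)


# Hypothetical tile: 64 x 64 pixels, 128 channels of random values.
tile = np.random.default_rng(0).normal(size=(64, 64, FULL_DIM))
small = matryoshka_prefix(tile)
```

The slice itself is trivial in memory. The open storage question is that if a chunk holds all 128 channels for its pixels, a reader who only wants the prefix still transfers the whole chunk, whereas chunking along the channel axis would let it fetch just the leading block.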

3 Distribution and downstream results

We are moving the embeddings to Source Cooperative after the enthusiastic reaction at CNG London, which mostly means routing terabytes via Cambridge to dodge five-figure AWS egress charges, with Zarr v3 becoming the default access mechanism. Isaac Corley has been very helpful in getting us started.

On the downstream side, tree species mapping in Trentino showed foundation-model embeddings reaching near-asymptotic accuracy from as few as 5% of training parcels, provided a nonlinear classifier is used.
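A label-efficiency curve of this kind can be sketched as follows. Everything here is a stand-in: the embeddings are synthetic, the classifier is a small k-nearest-neighbour vote chosen only because it is nonlinear and needs nothing beyond NumPy, and labels are sampled per point rather than per parcel. It is not the method or data of the Trentino study.

```python
import numpy as np


def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote over the k nearest training points (binary labels)."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return (train_y[nearest].mean(axis=1) > 0.5).astype(int)


def label_efficiency_curve(x, y, test_x, test_y, fractions, seed=0):
    """Accuracy on a fixed test set when training on growing label subsets."""
    order = np.random.default_rng(seed).permutation(len(x))
    curve = []
    for frac in fractions:
        n = max(3, int(round(frac * len(x))))  # at least k labelled points
        idx = order[:n]
        pred = knn_predict(x[idx], y[idx], test_x)
        curve.append((frac, float((pred == test_y).mean())))
    return curve


def toy_embeddings(n, rng, dim=8):
    """Synthetic data: an XOR layout that no linear boundary separates."""
    signs = rng.choice([-1.0, 1.0], size=(n, 2))
    x = rng.normal(scale=0.5, size=(n, dim))
    x[:, :2] += 2.0 * signs
    y = (signs[:, 0] * signs[:, 1] > 0).astype(int)
    return x, y


rng = np.random.default_rng(42)
train_x, train_y = toy_embeddings(1000, rng)
test_x, test_y = toy_embeddings(500, rng)
curve = label_efficiency_curve(train_x, train_y, test_x, test_y, [0.05, 0.25, 1.0])
```

Plotting `curve` for real embeddings and real labels shows where accuracy flattens out. The XOR toy data is there only to show why the choice of a nonlinear classifier can matter.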

Pedro Sousa is applying the embeddings to probabilistic weather downscaling, where they supply the sub-grid structure that topography alone cannot. Independent validations have also started appearing, in which TESSERA is the best decametric option both for cocoa mapping in Côte d'Ivoire and on GeoLifeCLEF. On the outreach side, Sadiq Jaffer presented at the RAISE Summit in Paris with Vultr, and we ran a stall at a House of Lords reception on AI for science with NVIDIA, who have been helping optimise our kernels.

Ideas

Earth embeddings for probabilistic weather downscaling · 2026 · PhD

Ongoing with Pedro Sousa, with Sadiq Jaffer

Activity

.plan-26-29: Perfect weather, imperfectly measured, precisely predicted · Jul 2026

My first viva in Law, a TESSERA stall at the House of Lords, downscaling the weather with embeddings, moving terabytes onto Source Cooperative, the Pembroke garden party, and a from-scratch shell makes progress.

.plan-26-28: What fun papers piled up while I was out at sea · Jul 2026

Back from the Arctic into a heatwave, hacking on Eio for the TESSERA sync engine, the Conservation Evidence team demoing at Parliament, and TESSERA on stage at the RAISE Summit in Paris.

Geospatial foundation models enable data-efficient tree species mapping in temperate mountain forests · Jul 2026

James GC Ball, Jana Annika Wicklein et al. — Science of Remote Sensing

TESSERA v2: Scaling Pixel-wise Earth Foundation Models · Jul 2026

Zhengpeng Feng, Sadiq Jaffer et al.

.plan-26-26: Gelato, geospatial, and players of games · Jun 2026

Spoke at CHIA's annual conference on AI for a changing world, as well as the first Cloud-Native Geospatial Forum outside the US, and started moving TESSERA's embeddings onto Source Cooperative.

A scorching CNG London during Climate Action Week · Jun 2026

My notes from the first Cloud-Native Geospatial Forum gathering outside the US, up on the top floor of the Jellicoe, covering Source Cooperative's open data economics, Argentina's invisible settlements, and provenance and trust for geospatial decision-making.

.plan-26-25: Planetary scale plans, Windows file-descriptor scale problems · Jun 2026

Ten years of the CCI with Sir David Attenborough, Andrew's Royal Society Environment Medal lecture, and the third PROPL at PLDI, while wrapping a local DeepSeek agent in OCaml and a first stab at getting Eio fleshed out on Windows.

Sir David Attenborough joins us to celebrate the Cambridge Conservation Initiative · Jun 2026

Sir David Attenborough drops by for the Cambridge Conservation Initiative's tenth birthday in the DAB, and I spend the afternoon demoing TESSERA and the Dash of Life to a campus packed with visitors.

The Royal Society Environmental Prize lecture on feeding the world without costing the earth · Jun 2026

Notes from Andrew Balmford's Royal Society Environment Medal lecture, on why shifting diets and cutting food waste are necessary but not sufficient, and why sustainable high-yield farming tied to land sparing is the key to slowing the extinction crisis.

.plan-26-23: Earth Embeddings, Emails Everywhere, and ERRNOOOs · Jun 2026

TESSERA on the ESA homepage and at CVPR, GeoTessera 0.9 stabilising onto S3/Zarr, io-uring in OCaml, carbon credits in New Scientist and WSJ, and musings on internet malware again.

Tessera at CVPR 2026, and the front page of the European Space Agency! · Jun 2026

TESSERA gets its CVPR debut in Denver, the BBC hedgehog story trots on into national news, and the European Space Agency puts our model on their homepage!

.plan-26-22: From digital rewilding in Edinburgh to uring and Tessera hackery · May 2026

Rewilding the Web workshop in Edinburgh, an OCaml io_uring binding refresh, and GeoTessera 0.9 moves the embeddings to AWS alongside a fresh HuggingFace org.

Rewilding the Web: my workshop report from Edinburgh · May 2026

Notes from a wonderfully interdisciplinary Edinburgh workshop on 'Rewilding the Web', ranging from coopetition and biological variety through the philosophy of self-organisation, polycrisis governance, and protopian science fiction to moderation seen through the lens of artisanal cheese.

Sadiq at Pint of Science Cambridge · May 2026

Sadiq Jaffer speaks at Pint of Science at the Cambridge Station Tavern about TESSERA geospatial foundation modelling (slides).

BBC Cambridgeshire with Louise Hulland on Hedgehogs · May 2026

Louise Hulland from BBC Cambridgeshire interviews Anil Madhavapeddy about spotting hedgehogs from space using TESSERA. Mirror of <https://x.com/BBCCambs/status/2057760666266558867>

What happens when a hedgehog story prickles its way into the BBC · May 2026

Behind the scenes of a week of BBC/ITV news and radio appearances about hedgehogs and TESSERA, but also what to expect when a research story catches the news cycle.

.plan-26-20: Putting OxCaml in a box and OCaml in orbit (again) · May 2026

Consolidating my OCaml trees for easier OxCaml deployment, shipping native system packages for OxCaml which then got into space, and remembering Peter Neumann.

.plan-26-19: Ancient oaks, parliamentary evidence, and TESSERA in the City · May 2026

Celebrating David Attenborough's 100th birthday at a Conservation Research Institute retreat in Norwich, a Parliament POST briefing on Evidence for Nature Recovery lands, and a TESSERA talk at the Cambridge Ring alumni evening at Jane Street.

.plan-26-17: Unwedging kernels, dogfood deployments, and managing beef leakage · Apr 2026

Welcoming Akshay to Cambridge, TESSERA AWS sync done, oi now self-hosts this site, and a new 4C forest leakage preprint appears.

AI, science and the UK–EU relationship at the Royal Society · Apr 2026

Notes from a Royal Society policy meeting with the European Commission on responsible AI, interoperable data and UK–EU alignment in AI for science; covering AI-poisoned literature, federated TESSERA-scale infrastructure, disclosure standards and the practical value of sustained UK–EU dialogue.

.plan-26-16: Chennai, Cambridge, Belfast: a week on the wing · Apr 2026

A week of hops between Chennai, Cambridge and Belfast for the FP Launchpad takeoff at IIT Madras, a surprise Publication of the Year at the Cambridge Ring Hall of Fame, meeting the VC on the upcoming Rokos School of Governance, mirroring half a petabyte of TESSERA tiles, and hacking on oi.

The FP Launchpad takes off at IIT Madras · Apr 2026

A day at the launch of the FP Launchpad at IIT Madras, covering talks on hardware design, trusted execution on Shakti, verifiable Indian tax law, precise JIT analysis, AI-assisted Lean metatheory, constraint-based diagramming, and my own TESSERA talk.

.plan-26-15: Banyan trees, (anti)botnets and Bose-Einstein bases · Apr 2026

Travelling from Ireland to IIT Madras for the FP Launchpad launch, mirroring half a petabyte of TESSERA embeddings to AWS Open Data, antibotty discussions, and Tangled trust boundaries for AI code review.

.plan-26-14: Tracking AI screen time and escaping to pen and paper · Apr 2026

Mythos Preview and the urgent need for internet immune systems, cognitive DDoS and AI screen time for code, a proposal for voluntary disclosure in OCaml, desktop focus and printed papers, iOS misery, GeoTessera 0.8, Ceph at 1.4PB, OCaml CI migration, hardware perf counters for OxCaml, and the FP Launchpad launch at IIT Madras.

.plan-26-13: Oxidised, standardised, and syndicated · Mar 2026

Publishing the OxCaml Labs year-one review, POSSE and AI content disclosure for the web, adopting the geo-embeddings Zarr convention for TESSERA, action PROPL at PLDI, the death of the grant application, and NASA's new swathe lidar mission.

TESSERA now supports the Zarr geo-embeddings convention proposal · Mar 2026

Community feedback reshaped our Zarr store layout — years became a dimension, shards got bigger, and we retired the TESSERA-specific convention in favour of a shared geo-embeddings standard that also covers other models.

.plan-26-12: Zarr across space and TESSERA time · Mar 2026

Reworking the TESSERA Zarr store layout after community feedback, Springer's API woes for evidence synthesis, vibecoding introspection, and git remote helpers for ATProto.

.plan-26-11: Bins, bollards, bots and biodiversity boffins · Mar 2026

Evidence synthesis at the DEFRA science conference, TESSERA transcoding and building a new SPA, OpenStreetMap/DuckDB bindings in OxCaml, and early thoughts on vibecoding etiquette.

Streaming millions of TESSERA tiles over HTTP with Zarr v3 · Mar 2026

How we restructured TESSERA's geospatial embeddings from millions of individual numpy files into sharded Zarr v3 stores for efficient HTTP streaming, enabling everything from single-pixel mobile lookups to regional-scale analysis with just a couple of range requests.

Tessera Zarr streaming preview · Mar 2026

A little screencast of a fully browser based streaming interface to manipulate TESSERA embeddings. All the classification and UMAPs run directly in a browser, with no server required aside from static HTTP serving of the embeddings!

.plan-26-10: Streaming TESSERA working, biodiversity action papers, and FPL takes off · Mar 2026

TESSERA streaming in the browser, planetary programming at WG2.8, biodiversity action papers, FP Launchpad opens, and Docker CACM buzz.

Connecting the dots for biodiversity action from the NAS/Royal Society Forum · Mar 2026

Summary of the Nine Recommendations and Biodiversity Monitoring Standards Framework papers from the NAS/Royal Society US-UK Forum in summer 2025, and how they connect to my work on collective knowledge systems, TESSERA, and evidence synthesis.

Tessera Pipeline · Feb 2026

Mark Elvers. Mainly for my future reference here is a walk-through of the Tessera pipeline.

At the AI Impact Summit in Delhi: people, planet, progress · Feb 2026

Trip report from the Indian AI Impact Summit in New Delhi, covering the massive expo, a conversation with Yann LeCun, a hackathon/talk at IIT-Delhi, networking at the British High Commission, and reflections on the summit declaration's shift from safety to progress and equitable access.

1st TESSERA/CoRE hackathon at the Indian AI Summit · Feb 2026

First TESSERA hackathon held at the Indian AI Impact Summit in Delhi, exploring integration with IIT-Delhi's CoRE Stack for geospatial analysis and testing TESSERA labeling workflows.

.plan-26-07: Storage, Lego, Echo, and the IUCN · Feb 2026

Growing the Ceph cluster for TESSERA embeddings, a Lego brainstorming session for the Evidence TAP, hosting Echo Labs from ARIA, and Shane's IUCN Red List seminar.

Tessera pipeline in OCaml · Feb 2026

Mark Elvers. The Tessera pipeline is written in Python. What would it take to have an OCaml version?

Weekly Notes - 2026-02-15 · Feb 2026

Andres Zuñiga-Gonzalez. Introduction: This is quite a large update as it includes everything I’ve done for the past two weeks. I’ll talk about the LCZ classification and road mapping projects as well as my first actual experience with Claude Code and a cool toy example. LCZ Classification: It turns out that getting the r…

From data to decisions: Toward a Biodiversity Monitoring Standards Framework · Jan 2026

Andrew Gonzalez, Tom August et al. — Proceedings of the National Academy of Sciences

Applications of the TESSERA Geospatial Foundation Model to Diverse Environmental Mapping Tasks · Jan 2026

Zhengpeng Feng, Clement Atzberger et al.

Earth embeddings for probabilistic weather downscaling · Jan 2026

Ongoing · PhD

Enki, a Dashboard of Life on Earth · Jan 2026

Introduction to TESSERA at the Workshop on Foundational AI to forecast ecosystem resilience · Nov 2025

TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis · Nov 2025

Zhengpeng Feng, Clement Atzberger et al.

Des satellites pour protéger les hérissons, en voie d'extinction · Oct 2025

GeoTessera Python library released for geospatial embeddings · Aug 2025