# A scorching CNG London during Climate Action Week

*2026-06-24 — note*


It's [London Climate Action Week](https://londonclimateactionweek.org/) in the midst of a [searing heatwave](https://www.bbc.co.uk/weather/articles/c0myl4d3l0no), which was a good backdrop for the [Cloud-Native Geospatial Forum](https://cloudnativegeo.org/) meeting (the first outside the US!).
The venue was [the Jellicoe](https://www.foraspace.com/event-venues/london/the-jellicoe) where [ARIA is based](https://anil.recoil.org/notes/2026w6) up on the top floor with a panoramic view over the City. CNG is a [Radiant Earth](https://radiant.earth/) initiative that I joined [last year](https://anil.recoil.org/notes/coar-prc) when I heard about their drive to make geospatial data available as a public good. This gathering was a rather excellent collection of 50 practitioners who were geeking out on coordinate systems and Zarr access patterns. Here are my notes of the day...

## Unlocking the value of geospatial data in the commons

[Jed Sundwall](https://jed.co/) is the CEO of [Radiant Earth](https://radiant.earth), and is the person who previously [founded AWS Open Data](https://aws.amazon.com/blogs/publicsector/noaa-and-aws-expand-commitment-to-increase-access-to-environmental-data/) when on the social responsibility team at Amazon.  He opened proceedings by explaining that the CNG consortium aims to bring together data users with wildly different budgets (big corps, individual contributors, etc) but who all have the same geospatial problems.
His idea is that *"data has a lot of potential energy stored inside it, and the sweet spot is how we maximise the potential value of that data."*

CNG exists to turn this data into something useful by promoting modern cloud-native methods to drive down the cost of the public good overall.
If you've read my writing recently, you'll know that the biggest professional problem in my life is juggling TESSERA embeddings without running out of disk space and/or crashing the Cambridge University egress bandwidth from [eager downloaders](https://www.tunbury.org/2026/06/18/proof-of-work/)! So I'm enormously excited at the idea of having a public data commons to help share the load.

<figure class="image-center"><img src="/images/cng-london26-1.webp" alt="Jed Sundwall opens proceedings on the top floor of the Jellicoe" title="Jed Sundwall opens proceedings on the top floor of the Jellicoe" loading="lazy" srcset="/images/cng-london26-1.768.webp 768w, /images/cng-london26-1.640.webp 640w, /images/cng-london26-1.480.webp 480w, /images/cng-london26-1.3840.webp 3840w, /images/cng-london26-1.320.webp 320w, /images/cng-london26-1.2560.webp 2560w, /images/cng-london26-1.1920.webp 1920w, /images/cng-london26-1.1600.webp 1600w, /images/cng-london26-1.1440.webp 1440w, /images/cng-london26-1.1280.webp 1280w, /images/cng-london26-1.1024.webp 1024w"><figcaption>Jed Sundwall opens proceedings on the top floor of the Jellicoe</figcaption></figure>

Jed then showed us the actual unit economics of their operation. [Source Cooperative](https://source.coop/) is Radiant Earth's data-publishing utility built on cloud object storage. As of a few months ago, it has 6.16 PB stored, 739M objects, over a billion requests a month, at a blended cost of around \$20/terabyte/month. Most vendors guard their cost base for competitive reasons, but as a not-for-profit they share it so the community can reason about what it actually costs to share data at scale.
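
To get a feel for what those numbers imply, here is a back-of-envelope sketch. It assumes the blended rate applies uniformly to everything stored and that decimal units are meant (1 PB = 1000 TB); neither assumption was stated in the talk, and the function name is just for illustration.

```python
def monthly_storage_cost(petabytes, usd_per_tb_month, tb_per_pb=1000):
    """Monthly bill if every stored terabyte costs the same blended rate."""
    return petabytes * tb_per_pb * usd_per_tb_month


# Figures quoted in the talk; decimal units (1 PB = 1000 TB) are an assumption.
monthly = monthly_storage_cost(6.16, 20.0)
print(f"monthly: ${monthly:,.0f}  yearly: ${monthly * 12:,.0f}")
```

Under those assumptions the commons costs roughly \$123,000 a month to run, or about \$1.5M a year.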

An important technical point is that Source Coop uses [Cloudflare R2](https://www.cloudflare.com/developer-platform/products/r2/) as their CDN, so hot data is edge-cached around the world. This was a big feature request for making TESSERA easier to access at the [first Indian hackathon](https://anil.recoil.org/notes/first-tessera-hackathon) recently, as their international bandwidth was pretty bad.

<figure class="image-center"><img src="/images/cng-london26-2.webp" alt="Source Cooperative's publishers include NASA, CarbonPlan, Planet and Asterisk Labs." title="Source Cooperative's publishers include NASA, CarbonPlan, Planet and Asterisk Labs." loading="lazy" srcset="/images/cng-london26-2.768.webp 768w, /images/cng-london26-2.640.webp 640w, /images/cng-london26-2.480.webp 480w, /images/cng-london26-2.3840.webp 3840w, /images/cng-london26-2.320.webp 320w, /images/cng-london26-2.2560.webp 2560w, /images/cng-london26-2.1920.webp 1920w, /images/cng-london26-2.1600.webp 1600w, /images/cng-london26-2.1440.webp 1440w, /images/cng-london26-2.1280.webp 1280w, /images/cng-london26-2.1024.webp 1024w"><figcaption>Source Cooperative's publishers include NASA, CarbonPlan, Planet and Asterisk Labs.</figcaption></figure>

## From maps to decision systems

[Luca Budello](https://uk.linkedin.com/in/luca-budello-geospatial), the Geospatial Lead at [Innovate UK Business Connect](https://iuk-business-connect.org.uk/geospatial/) and [formerly of the CCI](https://www.fauna-flora.org/news/coral-symphony-a-new-record-for-cambodia/), gave his policy view learnt from running the [GeoAI Festival](https://iuk-business-connect.org.uk/perspectives/the-geoai-festival-outlook-report-and-programme-round-up/). He noted that decision-makers are interested in *"reliable answers to real problems, grounded in trustworthy data"*, so data provenance and trustworthiness matter enormously.

Luca traced how every wave of geospatial innovation has changed the way we interpret the world: first *maps* drew the world, then *dashboards* queried it, then *cloud APIs* let us program it, then *reporting and analytics* informed decisions, and now the next step is *decision systems* that automate them.

<figure class="image-center"><img src="/images/cng-london26-3.webp" alt="Luca's arc from first using data to draw the world through to automating decisions" title="Luca's arc from first using data to draw the world through to automating decisions" loading="lazy" srcset="/images/cng-london26-3.768.webp 768w, /images/cng-london26-3.640.webp 640w, /images/cng-london26-3.480.webp 480w, /images/cng-london26-3.3840.webp 3840w, /images/cng-london26-3.320.webp 320w, /images/cng-london26-3.2560.webp 2560w, /images/cng-london26-3.1920.webp 1920w, /images/cng-london26-3.1600.webp 1600w, /images/cng-london26-3.1440.webp 1440w, /images/cng-london26-3.1280.webp 1280w, /images/cng-london26-3.1024.webp 1024w"><figcaption>Luca's arc from first using data to draw the world through to automating decisions</figcaption></figure>

The worrying part here is the jump from analytical workflows we can predict (deterministic, linear, static data) to these future decision systems that are probabilistic, dynamic and run continuous loops that can act directly on the world. AI is being imposed on us fast, so the obvious question is what the checks and balances are on these future decision systems. His answer was to build trust as a federated property, engineering the federation protocols so that the resulting systems are explainable, accountable, auditable and trackable.

The analogy Luca used was that *"GeoAI needs its [open banking](https://en.wikipedia.org/wiki/Open_banking) moment"* to unlock the value of data the way open banking did for finance, and that the UK has a genuine advantage here with decades of authoritative, temporally rich, high-integrity data. The policy scaffolding is arriving ([AI growth zones](https://www.gov.uk/government/collections/ai-growth-zones), [BridgeAI](https://iuk-business-connect.org.uk/programme/bridgeai/) and a [sovereign AI fund](https://www.sovereign.tech/)) which roughly mirrors the EU's approach but with unfortunately [rather less](https://anil.recoil.org/notes/rs-eu-ai-science) funding. He noted that a recent major AI policy report (I missed which one exactly) mentioned geospatial exactly once (urban planning), which is a strange omission given how foundational land-use planning is for a government.

## Making Earth observation embeddings actionable

[Earth Genome](https://www.earthgenome.org/) is a mission-driven non-profit out of California, behind [ClimateTRACE](https://climatetrace.org/) and a dozen-plus geospatial products, funded by a hybrid of targeted projects and philanthropic donations for R\&D. Noelia Jiménez Martínez and Glen Low walked us through their [Earth Index](https://www.earthgenome.org/blog/mapping-the-planet-with-earth-index-is-now-open-to-everyone) work to make foundation-model embeddings usable by non-experts:

> In the short time that Earth Index has been available, we’ve been amazed by the impact our users have made. Just to highlight a few: they’ve exposed [narcotrafficking airstrips](https://news.mongabay.com/custom-story/2024/11/indigenous-leaders-killed-as-narco-airstrips-cut-into-their-amazon-territories/) in the Peruvian Amazon; [mapped illegal palm oil](https://infoamazonia.org/en/2025/06/18/brazilian-firm-behind-saf-plan-found-growing-oil-palm-on-deforested-amazon-land/) expansion in Brazil; [uncovered hazardous quarries](https://www.slobodnaevropa.org/a/zapadni-balkan-kamenolomi/33351974.html) in the Balkans; and even mapped how rose farming is [contributing to wetland loss](https://www.theafricareport.com/408151/ugly-valentine-how-fairtrade-roses-ravage-ugandas-wetlands/) in Uganda.
> <cite>\-- [Earth Genome](https://www.earthgenome.org/blog/mapping-the-planet-with-earth-index-is-now-open-to-everyone), 2026</cite>
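
One common way to make embeddings usable without ML expertise is search-by-example: pick a tile that shows the thing you care about, then rank every other tile by how similar its embedding is. The sketch below shows that general pattern with cosine similarity. It is an assumption about the approach rather than a description of Earth Index's internals, and the tile names and 4-dimensional vectors are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, tiles, top_k=3):
    """Rank tiles by similarity to the query embedding, best first."""
    scored = [(cosine(query, emb), tile_id) for tile_id, emb in tiles.items()]
    scored.sort(reverse=True)
    return [(tile_id, score) for score, tile_id in scored[:top_k]]


# Hypothetical toy embeddings; real ones have tens to hundreds of dimensions.
tiles = {
    "airstrip_a": [0.9, 0.1, 0.0, 0.2],
    "airstrip_b": [0.8, 0.2, 0.1, 0.1],
    "forest": [0.0, 0.9, 0.4, 0.0],
    "river": [0.1, 0.2, 0.9, 0.3],
}

for tile_id, score in most_similar(tiles["airstrip_a"], tiles, top_k=2):
    print(tile_id, round(score, 3))
```

The user only ever supplies an example location, and the embedding model does the rest.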

Their worked example was [quantifying Jamaican seagrass](https://www.allenphilanthropies.org/news-and-stories/stories/mapping-seagrasses-with-earth-genome-and-the-nature-conservancy) with [The Nature Conservancy](https://www.nature.org/). They used 70 datasets across four regions, blending field surveys, drone and high-resolution imagery with interpolation and modelling. Producing the outputs requires geolocated input data with a class (seagrass or not) and a density estimate.

I was of course delighted to see our own [Tessera](https://anil.recoil.org/projects/tessera) embeddings up on the slide alongside [AlphaEarth](https://deepmind.google/discover/blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/) (Google, 64-dim) and [OLMo Earth](https://allenai.org/) (Allen AI, vision transformer at 768-dim). Reassuringly, all three embedding models beat the no-embedding baseline comfortably on a benthic-environment case (using the [Allen Coral Atlas](https://allencoralatlas.org/) ground truth). Their Tessera tests were using v1.0, and afterwards I explained how [v1.1](https://anil.recoil.org/notes/tessera-v11-out) has [better coastal maps](https://github.com/ucam-eo/geotessera/issues/12) and so should see even better performance.

<figure class="image-center"><img src="/images/cng-london26-5.webp" alt="The embedding landscape: AlphaEarth OLMo Earth and Tessera (yay)" title="The embedding landscape: AlphaEarth OLMo Earth and Tessera (yay)" loading="lazy" srcset="/images/cng-london26-5.768.webp 768w, /images/cng-london26-5.640.webp 640w, /images/cng-london26-5.480.webp 480w, /images/cng-london26-5.3840.webp 3840w, /images/cng-london26-5.320.webp 320w, /images/cng-london26-5.2560.webp 2560w, /images/cng-london26-5.1920.webp 1920w, /images/cng-london26-5.1600.webp 1600w, /images/cng-london26-5.1440.webp 1440w, /images/cng-london26-5.1280.webp 1280w, /images/cng-london26-5.1024.webp 1024w"><figcaption>The embedding landscape: AlphaEarth OLMo Earth and Tessera (yay)</figcaption></figure>

<figure class="image-center"><img src="/images/cng-london26-6.webp" alt="Every embedding outperforms the no-embedding baseline on the benthic case." title="Every embedding outperforms the no-embedding baseline on the benthic case." loading="lazy" srcset="/images/cng-london26-6.768.webp 768w, /images/cng-london26-6.640.webp 640w, /images/cng-london26-6.480.webp 480w, /images/cng-london26-6.3840.webp 3840w, /images/cng-london26-6.320.webp 320w, /images/cng-london26-6.2560.webp 2560w, /images/cng-london26-6.1920.webp 1920w, /images/cng-london26-6.1600.webp 1600w, /images/cng-london26-6.1440.webp 1440w, /images/cng-london26-6.1280.webp 1280w, /images/cng-london26-6.1024.webp 1024w"><figcaption>Every embedding outperforms the no-embedding baseline on the benthic case.</figcaption></figure>

Glen's closing thoughts were that these systems should be globally comprehensive but also *locally useful*. This usually comes down to partnering and roles: choose where you add the most (multi-benefit) value, and ensure that data and AI have actual users with a human in the loop. The Jamaican seagrass work is an example of getting to a practical outcome rather than pursuing frontier AI for its own sake.

## Wrangling multidimensional data

[Sol Cotton](https://www.openclimatefix.org/author/sol-cotton) from [Open Climate Fix](https://www.openclimatefix.org/) used an excellent terminal/markdown presentation to guide us through the challenge that large multidimensional datasets pose for cloud-native workflows, and how careful [Zarr](https://zarr.dev/) chunking strategies have transformed access efficiency across multiple dimensions.

This topic aligned well with the [Icechunk discussions at PROPL](https://anil.recoil.org/notes/2026w25) last
week. It looks like the geospatial community is converging on chunked/sharded,
compressed, cloud-native ndarray storage.  The main remaining question I have
is how to find optimal chunking strategies for the queries and data appends,
and also to layer a query interface over it (but I believe Icechunk has this
capability).
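
To make the chunking trade-off concrete, here is a small sketch that counts how many chunks a rectangular read overlaps, since each chunk touched is roughly one object-store request. The array shape and the two chunk layouts are hypothetical, and the helper deliberately uses no Zarr API.

```python
from math import prod


def chunks_touched(query_shape, chunk_shape, offset=None):
    """Number of chunks a contiguous hyper-rectangular read overlaps."""
    offset = offset or (0,) * len(query_shape)
    counts = []
    for start, size, chunk in zip(offset, query_shape, chunk_shape):
        first = start // chunk               # index of first chunk on this axis
        last = (start + size - 1) // chunk   # index of last chunk on this axis
        counts.append(last - first + 1)
    return prod(counts)


# Hypothetical (time, y, x) cube with two candidate chunk layouts.
layouts = {
    "map-friendly (1, 512, 512)": (1, 512, 512),
    "series-friendly (365, 32, 32)": (365, 32, 32),
}
queries = {
    "one pixel, full year": (365, 1, 1),
    "one day, 1024x1024 map": (1, 1024, 1024),
}

for q_name, q_shape in queries.items():
    for l_name, l_shape in layouts.items():
        print(f"{q_name:24s} {l_name:30s} {chunks_touched(q_shape, l_shape)}")
```

Neither layout wins for both access patterns, which is why the choice of chunk shape has to follow the expected queries.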

I asked in the Q\&A how coordinate transforms are handled, since in Tessera we use a [MegaZarr](https://anil.recoil.org/notes/tessera-zarr-v3-layout) one-group-per-UTM-zone layout, which requires some client-side stitching for ROIs that span UTM zones. There's no satisfactory answer for this yet, except perhaps shifting to equal size projections in the future.
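
As a minimal sketch of why the stitching arises, the standard 6-degree UTM zone formula shows how easily a small region of interest straddles two zones. This ignores the special-case zones around Norway and Svalbard and bounding boxes that cross the antimeridian; the function names are illustrative and are not part of geotessera.

```python
import math


def utm_zone(lon_deg):
    """Standard 6-degree UTM zone number (1..60) for a longitude in degrees."""
    lon = ((lon_deg + 180.0) % 360.0) - 180.0  # wrap into [-180, 180)
    return int(math.floor((lon + 180.0) / 6.0)) + 1


def zones_for_bbox(west, east):
    """UTM zones overlapped by a bounding box that does not cross the antimeridian."""
    if west > east:
        raise ValueError("bounding box crosses the antimeridian")
    return list(range(utm_zone(west), utm_zone(east) + 1))


# A box straddling the Greenwich meridian needs data from two zones.
print(zones_for_bbox(-0.2, 0.2))
```

Any region that returns more than one zone needs reads from more than one group, plus a reprojection step on the client to merge them.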

<figure class="image-center"><img src="/images/cng-london26-7.webp" alt="Sol Cotton on taming multidimensional data with cloud-native Zarr chunking." title="Sol Cotton on taming multidimensional data with cloud-native Zarr chunking." loading="lazy" srcset="/images/cng-london26-7.768.webp 768w, /images/cng-london26-7.640.webp 640w, /images/cng-london26-7.480.webp 480w, /images/cng-london26-7.3840.webp 3840w, /images/cng-london26-7.320.webp 320w, /images/cng-london26-7.2560.webp 2560w, /images/cng-london26-7.1920.webp 1920w, /images/cng-london26-7.1600.webp 1600w, /images/cng-london26-7.1440.webp 1440w, /images/cng-london26-7.1280.webp 1280w, /images/cng-london26-7.1024.webp 1024w"><figcaption>Sol Cotton on taming multidimensional data with cloud-native Zarr chunking.</figcaption></figure>

I also gave a talk of my own in this session on [Tessera](https://anil.recoil.org/projects/tessera), covering our global 10m
pixel-wise embeddings with open weights, our move to Zarr v3, and a preview of
the 1.1 and (forthcoming) 2.0 models. It was a happy coincidence to follow
Earth Genome just after they had independently benchmarked us, as it made it much
easier for me to motivate some of our recent improvements on coastal regions\!

The reception to my talk from the audience was awesome; I spent most of the
rest of my attendance chatting with people about it all. Questions ranged
from whether we could help with weather (answer: yes, coming soon!), ice caps
(nope, but a similar approach might work), ocean (nope, but see Laure Zanna's work)
and ecological modelling (see below).

It's quite difficult to point people to our
[eeg.zulipchat.com](https://eeg.zulipchat.com) chat service when discussing in
person, so I'll look into printing some Tessera 'project cards' with QR codes
and links that we can hand to people. This seems more useful than personal
business cards (which I haven't used in years!).

<figure class="image-center"><img src="/images/cng-london26-13.webp" alt="A full house for the afternoon session." title="A full house for the afternoon session." loading="lazy" srcset="/images/cng-london26-13.768.webp 768w, /images/cng-london26-13.640.webp 640w, /images/cng-london26-13.480.webp 480w, /images/cng-london26-13.3840.webp 3840w, /images/cng-london26-13.320.webp 320w, /images/cng-london26-13.2560.webp 2560w, /images/cng-london26-13.1920.webp 1920w, /images/cng-london26-13.1600.webp 1600w, /images/cng-london26-13.1440.webp 1440w, /images/cng-london26-13.1280.webp 1280w, /images/cng-london26-13.1024.webp 1024w"><figcaption>A full house for the afternoon session.</figcaption></figure>

## Getting ground-level nature data into the cloud

[Echo Labs](https://echolabs.org/) (represented by the wonderful Molly Blank and Kaja Wasik) were up next and talking about how to turn ecological complexity into useful signal. As background, this is a [FRO](https://www.convergentresearch.org/about-fros) backed by ARIA and Convergent Research who visited the CCI [earlier in the year](https://anil.recoil.org/notes/2026w7). They've also just launched their shiny new [website](https://echolabs.org) this week\!

Echo want to take fragmented, multimodal ground-level ecological data and
transform it into representations of ecosystem condition ('ecosystem vectors'),
as a shared foundation for measuring change and evaluating impact of
interventions on the ground.

<figure class="image-center"><img src="/images/cng-london26-8.webp" alt="Echo Labs: turning ecological complexity into useful signal, ants and all." title="Echo Labs: turning ecological complexity into useful signal, ants and all." loading="lazy" srcset="/images/cng-london26-8.768.webp 768w, /images/cng-london26-8.640.webp 640w, /images/cng-london26-8.480.webp 480w, /images/cng-london26-8.3840.webp 3840w, /images/cng-london26-8.320.webp 320w, /images/cng-london26-8.2560.webp 2560w, /images/cng-london26-8.1920.webp 1920w, /images/cng-london26-8.1600.webp 1600w, /images/cng-london26-8.1440.webp 1440w, /images/cng-london26-8.1280.webp 1280w, /images/cng-london26-8.1024.webp 1024w"><figcaption>Echo Labs: turning ecological complexity into useful signal, ants and all.</figcaption></figure>

Their proposed primitive for the representation is an *ecosystem state vector*,
which is a compact representation that fuses different ground-level modalities
(camera traps, acoustics, sensors) into one object the client can compute over.

Tessera is an obvious source of input here, but many other modalities
of sensor data and ground truth info from the
[CLR](https://www.clr.conservation.cam.ac.uk/) would be useful to them as well.
[David Coomes](https://coomeslab.org) has been discussing this with them since their visit to Cambridge\!

<figure class="image-center"><img src="/images/cng-london26-9.webp" alt="A new data primitive from Echo is the 'ecosystem state vector'." title="A new data primitive from Echo is the 'ecosystem state vector'." loading="lazy" srcset="/images/cng-london26-9.768.webp 768w, /images/cng-london26-9.640.webp 640w, /images/cng-london26-9.480.webp 480w, /images/cng-london26-9.3840.webp 3840w, /images/cng-london26-9.320.webp 320w, /images/cng-london26-9.2560.webp 2560w, /images/cng-london26-9.1920.webp 1920w, /images/cng-london26-9.1600.webp 1600w, /images/cng-london26-9.1440.webp 1440w, /images/cng-london26-9.1280.webp 1280w, /images/cng-london26-9.1024.webp 1024w"><figcaption>A new data primitive from Echo is the 'ecosystem state vector'.</figcaption></figure>

Their roadmap seems sensibly staged to me, with a first sprint to prove that
multimodal ground signals carry useful information in the first place.  Then
they're working on mid-term pilot projects grounding that utility in an
ecological intervention context.

Longer-term, they want to release a shared
resource/benchmark of embedded multimodal sensor data for research, policy and
industry.  I'm extremely excited to see other people intending to work on
benchmarks in this space, as it's really difficult to evaluate techniques right
now.

<figure class="image-center"><img src="/images/cng-london26-10.webp" alt="Where Echo Labs is heading from proof-of-concept to a shared multimodal resource." title="Where Echo Labs is heading from proof-of-concept to a shared multimodal resource." loading="lazy" srcset="/images/cng-london26-10.768.webp 768w, /images/cng-london26-10.640.webp 640w, /images/cng-london26-10.480.webp 480w, /images/cng-london26-10.3840.webp 3840w, /images/cng-london26-10.320.webp 320w, /images/cng-london26-10.2560.webp 2560w, /images/cng-london26-10.1920.webp 1920w, /images/cng-london26-10.1600.webp 1600w, /images/cng-london26-10.1440.webp 1440w, /images/cng-london26-10.1280.webp 1280w, /images/cng-london26-10.1024.webp 1024w"><figcaption>Where Echo Labs is heading from proof-of-concept to a shared multimodal resource.</figcaption></figure>

In the Q\&A, I brought up a topic that [Mike Harfoot](https://www.vizzuality.com/team/mike-harfoot) and I have [been discussing](https://anil.recoil.org/notes/foundational-ecosystem-workshop). We're wondering whether synthetic data generation
(e.g. from a process model like [Madingley](https://madingley.github.io/))
could help to accelerate the training of their ecosystem model, since ground
truth data is quite sparse.  This isn't on their near-term roadmap, but it is one of
the things they're considering.

I had a quick chat with [Stefan Istrate](https://www.linkedin.com/in/stefanistrate/) (who has
been working with [Silviu Petrovan](https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/) on frog vision models) and am delighted to see
that he's recently joined Echo as their head of machine learning! They're
shaping up to have a very classy team indeed and I look forward to seeing how their ecosystem
vectors progress.

## The invisible settlements of Argentina

The talk I enjoyed the most was from [Nissim Lebovits](https://nlebovits.github.io/) (Radiant Earth), who explained the [Barrios Visibles](https://www.barriosvisibles.org/en) project (the [paper](https://doi.org/10.2139/ssrn.6588819) is worth reading as well). They used building-footprint data to surface a systematic population undercount in Argentina's informal settlements. And not just a small undercount: he reported they found some 3.4 million people missing from the [national record](https://www.argentina.gob.ar/obras-publicas/sisu/renabap), a significant fraction of the estimated 45 million inhabitants across the country\!

> RENABAP, Argentina's official registry of barrios populares, lists 1.24
> million families across 6,467 settlements. But satellite imagery reveals 1.97
> million buildings within those same boundaries—59% more structures than
> recorded families.
> 
> This isn't about the registry being outdated. RENABAP's own quality-control
> protocol requires that family counts match dwellings visible in satellite
> imagery. The gap documented here is a departure from that standard. Closing
> it requires methodological change, not just [updated data](https://www.argentina.gob.ar/sites/default/files/manual-para-la-conformacion-y-actualizacion-del-renabap.pdf).
> <cite>\-- [Barrios Visibles explainer](https://www.barriosvisibles.org/en), 2026</cite>

<figure class="image-center"><img src="/images/cng-london26-12.webp" alt="3.4 million people missing from Argentina's national record of informal settlements." title="3.4 million people missing from Argentina's national record of informal settlements." loading="lazy" srcset="/images/cng-london26-12.768.webp 768w, /images/cng-london26-12.640.webp 640w, /images/cng-london26-12.480.webp 480w, /images/cng-london26-12.3840.webp 3840w, /images/cng-london26-12.320.webp 320w, /images/cng-london26-12.2560.webp 2560w, /images/cng-london26-12.1920.webp 1920w, /images/cng-london26-12.1600.webp 1600w, /images/cng-london26-12.1440.webp 1440w, /images/cng-london26-12.1280.webp 1280w, /images/cng-london26-12.1024.webp 1024w"><figcaption>3.4 million people missing from Argentina's national record of informal settlements.</figcaption></figure>

The talk was (to me anyway) an incredible demonstration of cloud-native open
data doing politically consequential work. In just a few queries over the Parquet
files hosted on Source Coop, he ran a full spatial cross-referencing
of [Google+Microsoft+OpenStreetMap building footprints](https://source.coop/vida/google-microsoft-osm-open-buildings)
against the official registries from Argentina.

The point of doing this over the hosted
Parquet is that it's not necessary to download everything to run the query (hence the
importance of the cloud native approach).

As an example, in La Plata alone, roughly 72,000 building footprints intersect
polygons for which the registry lists only 34,000 families. This kind of gap
seems very important to account for when budgeting services and infrastructure
development in the country.
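
The cross-referencing itself boils down to a spatial join. The sketch below is a toy version that counts building centroids falling inside each settlement polygon and compares the count with registered families. Using centroids rather than full footprint intersections is a simplification, the data is invented, and this is not the project's actual query.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as (x, y) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray at height y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def buildings_per_family(settlements, building_centroids):
    """Map settlement name -> (buildings found, families registered, ratio)."""
    report = {}
    for name, (polygon, families) in settlements.items():
        count = sum(point_in_polygon(x, y, polygon) for x, y in building_centroids)
        report[name] = (count, families, count / families)
    return report


# Hypothetical toy data: one square settlement with 3 registered families.
settlements = {"toy_barrio": ([(0, 0), (10, 0), (10, 10), (0, 10)], 3)}
centroids = [(1, 1), (2, 5), (5, 5), (9, 9), (8, 2), (11, 5), (-1, 3)]

for name, (buildings, families, ratio) in buildings_per_family(settlements, centroids).items():
    print(f"{name}: {buildings} buildings vs {families} families ({ratio:.2f} per family)")
```

Applied to the La Plata figures, 72,000 footprints against 34,000 families is roughly 2.1 buildings per registered family.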

<figure class="image-center"><img src="/images/cng-london26-11.webp" alt="Improving on the census: 72,000 buildings detected against 34,000 families registered." title="Improving on the census: 72,000 buildings detected against 34,000 families registered." loading="lazy" srcset="/images/cng-london26-11.768.webp 768w, /images/cng-london26-11.640.webp 640w, /images/cng-london26-11.480.webp 480w, /images/cng-london26-11.3840.webp 3840w, /images/cng-london26-11.320.webp 320w, /images/cng-london26-11.2560.webp 2560w, /images/cng-london26-11.1920.webp 1920w, /images/cng-london26-11.1600.webp 1600w, /images/cng-london26-11.1440.webp 1440w, /images/cng-london26-11.1280.webp 1280w, /images/cng-london26-11.1024.webp 1024w"><figcaption>Improving on the census: 72,000 buildings detected against 34,000 families registered.</figcaption></figure>

Another point that he made (shown in the video below) is that
debugging/visualising this dataset is pretty easy, since the entire map is
zoomable. To validate a given region, the officials just directly navigate
there and find the polygons which are marked as settlements, and use normal
visual satellite imagery to verify that there are in fact settlements there.

<div class="video-center"><iframe title="" width="100%" height="315px" src="https://crank.recoil.org/videos/embed/0f6a181b-a0fb-4795-96aa-cbcb23f51349" frameborder="0" allowfullscreen sandbox="allow-same-origin allow-scripts allow-popups allow-forms"></iframe></div>

This left me wondering about the role of OpenStreetMap here: it was used as an input, but what's the mechanism to then propose updates to it so that the crowdsourced database remains accurate? I met another attendee [Petya Kangalova](https://www.hotosm.org/en/members/petya-kangalova/) who works for [Humanitarian OpenStreetMap](https://www.hotosm.org/en/), which is a community of mappers focussed on the disaster response utility of the database.

Petya explained to me that HumOSM has a bunch of [specialised tech products](https://www.hotosm.org/en/tools-resources/tech-product-suite/) for disaster response. Two cool ones are a [multiuser coordination](https://www.hotosm.org/en/tools-resources/tech-product-suite/tasking-manager/) layer for planning how to update an area, and [OpenAerialMap](https://www.hotosm.org/en/tools-resources/tech-product-suite/open-aerial-map/) to explore decently licensed imagery.

## Compressing the Earth

[Jacqueline Campbell](https://schmidtsciencefellows.org/fellow/jacqueline-campbell/) of [Asterisk Labs](https://www.aria.org.uk/scoping-our-planet-creators/) talked next.  She's a planetary scientist who came to Earth's oceans via looking for life in the Mars dust, and presented "[Earth Compress](https://asterisk.coop/project-pages/fed-collective.html)".

Their goal is to have open source, publicly owned, compression infrastructure for a variety of Earth data, built with the [National Oceanography Centre](https://noc.ac.uk/). The domain-aware compression stack will make petabyte-scale analysis accessible to everyone rather than only to those with the biggest egress budgets:

> The critical bottleneck limiting high-impact environmental research is how
> difficult it has become to process increasingly huge and complex datasets. To
> overcome this bottleneck we will build open source, AI-powered software
> infrastructure and data-as-AI models that are trustworthy and publicly-owned.
> We will not only build technology, but establish a multi-institutional
> cooperative, reducing current fragmentation and complexity to massively
> increase the number of organisations that can access and manipulate
> Earth-scale environmental data.
> 
> Our software infrastructure will simultaneously empower data producers (so
> they can easily create Earth Embedding models) and data users (so they can
> easily access and manipulate them). Therefore, we will enable a
> transformative shift, massively reducing the compute costs and complexity for
> all.
> <cite>\-- [Future of Environment Data Collective](https://asterisk.coop/project-pages/fed-collective.html)</cite>

This describes the problems we're having in Tessera pretty accurately. [Srinivasan Keshav](https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page) has also been leading an effort on our side to use [residual vector quantization](https://eeg.zulipchat.com/#narrow/channel/527258-Tessera/topic/Tessera.20with.20residual.20vector.20quantization.20.28RVQ.29/with/606474641) to dramatically shrink the size of the Tessera embeddings, so we've started a direct conversation with the Asterisk team to see how we can join forces\!

In theory, this will allow for hugely faster 'sketches' of global analyses without much loss in accuracy for many downstream tasks. And because the Asterisk team is also applying the same trick to other embeddings, it'll make fusion of multimodal data sources much easier as well\!
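
For readers new to the idea, here is a minimal pure-Python sketch of residual vector quantization in general: each stage fits a small codebook to whatever error the previous stages left behind, so a vector is stored as a handful of small integers. This is not the Tessera or Asterisk implementation; the sizes, the tiny k-means routine and all function names are assumptions for illustration.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], v))


def fit_codebook(vectors, k, iters, rng):
    """A few Lloyd (k-means) iterations, initialised from random samples."""
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(codebook, v)].append(v)
        for j, members in enumerate(groups):
            if members:  # keep the old codeword if a cluster goes empty
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def rvq_fit(vectors, stages=3, k=8, iters=5, seed=0):
    """Fit one codebook per stage, each on the residuals of the previous stage."""
    rng = random.Random(seed)
    residuals = [list(v) for v in vectors]
    codebooks = []
    for _ in range(stages):
        cb = fit_codebook(residuals, k, iters, rng)
        codebooks.append(cb)
        residuals = [[x - c for x, c in zip(r, cb[nearest(cb, r)])] for r in residuals]
    return codebooks


def rvq_encode(codebooks, v):
    """Encode v as one codeword index per stage."""
    residual, codes = list(v), []
    for cb in codebooks:
        idx = nearest(cb, residual)
        codes.append(idx)
        residual = [x - c for x, c in zip(residual, cb[idx])]
    return codes


def rvq_decode(codebooks, codes):
    """Reconstruct by summing the chosen codeword from each stage."""
    out = [0.0] * len(codebooks[0][0])
    for cb, idx in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out


def mean_error(codebooks, vectors):
    total = sum(sq_dist(v, rvq_decode(codebooks, rvq_encode(codebooks, v))) for v in vectors)
    return total / len(vectors)


# Synthetic 8-dimensional "embeddings" standing in for real ones.
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
codebooks = rvq_fit(data, stages=3, k=8)
for used in range(1, len(codebooks) + 1):
    print(used, "stage(s): mean squared error", round(mean_error(codebooks[:used], data), 3))
```

Decoding with only the first stage or two gives a coarser but cheaper reconstruction, which is what makes quick approximate analyses possible before fetching the full-fidelity data.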

<figure class="image-center"><img src="/images/cng-london26-14.webp" alt="Earth Compress is a publicly-owned, domain-aware compression stack for Earth data." title="Earth Compress is a publicly-owned, domain-aware compression stack for Earth data." loading="lazy" srcset="/images/cng-london26-14.768.webp 768w, /images/cng-london26-14.640.webp 640w, /images/cng-london26-14.480.webp 480w, /images/cng-london26-14.3840.webp 3840w, /images/cng-london26-14.320.webp 320w, /images/cng-london26-14.2560.webp 2560w, /images/cng-london26-14.1920.webp 1920w, /images/cng-london26-14.1600.webp 1600w, /images/cng-london26-14.1440.webp 1440w, /images/cng-london26-14.1280.webp 1280w, /images/cng-london26-14.1024.webp 1024w"><figcaption>Earth Compress is a publicly-owned, domain-aware compression stack for Earth data.</figcaption></figure>

Their architecture starts with a compression toolbox (classical
compression, AI-based data fields, and AI embedding models like Tessera) that
feeds into tailored data archives (file, columnar and vector databases) on the
server side, with a transmission protocol that streams dynamic data through a
manager out to decompressed data and embeddings on the client side.

I don't think there are many standards for what this custom VBR decoder
might be yet, so this seems a good opportunity to establish one, much like
the [Zarr conventions community](https://anil.recoil.org/notes/tessera-embeddings-convention) is doing.

<figure class="image-center"><img src="/images/cng-london26-15.webp" alt="The Earth Compress architecture, from compression toolbox to client-side transmission." title="The Earth Compress architecture, from compression toolbox to client-side transmission." loading="lazy" srcset="/images/cng-london26-15.768.webp 768w, /images/cng-london26-15.640.webp 640w, /images/cng-london26-15.480.webp 480w, /images/cng-london26-15.3840.webp 3840w, /images/cng-london26-15.320.webp 320w, /images/cng-london26-15.2560.webp 2560w, /images/cng-london26-15.1920.webp 1920w, /images/cng-london26-15.1600.webp 1600w, /images/cng-london26-15.1440.webp 1440w, /images/cng-london26-15.1280.webp 1280w, /images/cng-london26-15.1024.webp 1024w"><figcaption>The Earth Compress architecture, from compression toolbox to client-side transmission.</figcaption></figure>

## When the developer is an agent

My laptop started running out of juice (too many demos), so my remaining notes are a bit sketchy.

[Stefan Amberger](https://www.deeprec.ai/blog/earth-observed-accelerating-space-data-or-stefan-amberger), co-founder of [Tilebox](https://tilebox.com/), made the case that Tilebox is the "operating loop" for geospatial data workflows, and asked what changes when the developer is a semi-autonomous LLM agent.

Their answer is to establish a single workflow loop ("discover -\> define -\> run -\> observe -\> improve") that's shared across three kinds of callers: humans on a console, LLM agents over MCP, and conventional software via APIs. The work is orchestrated to wherever the data is, whether that's in the cloud or out on on-prem and edge devices.

<figure class="image-center"><img src="/images/cng-london26-16.webp" alt="Tilebox's one workflow loop for people, agents and software." title="Tilebox's one workflow loop for people, agents and software." loading="lazy" srcset="/images/cng-london26-16.768.webp 768w, /images/cng-london26-16.640.webp 640w, /images/cng-london26-16.480.webp 480w, /images/cng-london26-16.3840.webp 3840w, /images/cng-london26-16.320.webp 320w, /images/cng-london26-16.2560.webp 2560w, /images/cng-london26-16.1920.webp 1920w, /images/cng-london26-16.1600.webp 1600w, /images/cng-london26-16.1440.webp 1440w, /images/cng-london26-16.1280.webp 1280w, /images/cng-london26-16.1024.webp 1024w"><figcaption>Tilebox's one workflow loop for people, agents and software.</figcaption></figure>

As with the discussion at last week's [PROPL](https://anil.recoil.org/notes/2026w25), there's quite a wide
consensus that agents will join the coding loop whether we like it or not. So
the focus needs to shift to how we keep the data sources auditable and also
make the coding loop more verifiable.

I did a quick poll of the audience to find out which of the [geotessera](https://anil.recoil.org/notes/geotessera-python) users
did their coding by hand, and who used agents. I couldn't find a _single_ person who'd use my
lovely library by hand. Every single person used agents ranging from Claude to Codex. There were
no local agent users, and no Copilot users, so that's a sign of a rarefied crowd.

## Lightning talks

The afternoon lightning round was a tour of practical pipelines. [Jake Wilkins](https://epoch.blue/) (Epoch Blue) showed how they go from days to minutes with a just-in-time pipeline for plot-level supply-chain analytics, aimed at helping companies comply with the forthcoming [EUDR](https://epoch.blue/article/the-path-to-eudr-readiness-in-2026/index.html) deforestation-compliance deadline.

I had a chance to chat to Jake afterwards and show him our [FOOD provenance paper](https://anil.recoil.org/notes/food-and-risk-to-life) and the [interactive explorer](https://quantifyearth.github.io/food-globe/). What's really cool about Jake's work is that they're using global embeddings to calculate probabilities at the 10m2 level of a commodity being produced, whereas our (pre-Tessera) work depends on [FAO provenance](https://anil.recoil.org/ideas/food-provenance-fao) which is only at a national level.

<figure class="image-center"><img src="/images/cng-london26-17.webp" alt="Jake Wilkins on Epoch Blue's just-in-time supply-chain pipeline." title="Jake Wilkins on Epoch Blue's just-in-time supply-chain pipeline." loading="lazy" srcset="/images/cng-london26-17.768.webp 768w, /images/cng-london26-17.640.webp 640w, /images/cng-london26-17.480.webp 480w, /images/cng-london26-17.3840.webp 3840w, /images/cng-london26-17.320.webp 320w, /images/cng-london26-17.2560.webp 2560w, /images/cng-london26-17.1920.webp 1920w, /images/cng-london26-17.1600.webp 1600w, /images/cng-london26-17.1440.webp 1440w, /images/cng-london26-17.1280.webp 1280w, /images/cng-london26-17.1024.webp 1024w"><figcaption>Jake Wilkins on Epoch Blue's just-in-time supply-chain pipeline.</figcaption></figure>

The Epoch Blue process runs customer-supplied locations and addresses through a geocode-and-verify loop, calculates probabilistic supply sheds down to delineated commodity plots, and then merges this with environmental metrics (deforestation, emissions, biodiversity, water use). Jake also wrote a nice [piece on using AlphaEarth embeddings](https://medium.com/google-earth/seeding-the-search-alphaearth-foundations-satellite-embeddings-for-detecting-agricultural-43cf78e1cc5f) to detect palm-oil mill effluent lagoons. I really want to try this with Tessera as well...
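
To make the "probabilistic supply shed" idea concrete, here's a toy worked example. It assumes (my assumption, not something Jake described) that merging a supply shed with an environmental metric means taking a probability-weighted sum over the candidate plots. The plot names, probabilities and hectare figures are all made up for illustration.

```python
# Hypothetical candidate plots for one customer-supplied location:
# (plot id, probability the plot supplies that location, deforested hectares).
candidate_plots = [
    ("plot-a", 0.6, 2.0),
    ("plot-b", 0.3, 0.0),
    ("plot-c", 0.1, 5.0),
]


def expected_metric(plots):
    """Probability-weighted sum of a per-plot metric over a supply shed."""
    total_p = sum(p for _, p, _ in plots)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("supply-shed probabilities should sum to 1")
    return sum(p * metric for _, p, metric in plots)


exposure = expected_metric(candidate_plots)
```

With these made-up numbers the expected deforestation exposure works out at roughly 1.7 hectares (0.6 × 2.0 + 0.3 × 0.0 + 0.1 × 5.0), even though no single candidate plot has exactly that value.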

<figure class="image-center"><img src="/images/cng-london26-18.webp" alt="Epoch Blue's process ranges from addresses to environmental metrics." title="Epoch Blue's process ranges from addresses to environmental metrics." loading="lazy" srcset="/images/cng-london26-18.768.webp 768w, /images/cng-london26-18.640.webp 640w, /images/cng-london26-18.480.webp 480w, /images/cng-london26-18.3840.webp 3840w, /images/cng-london26-18.320.webp 320w, /images/cng-london26-18.2560.webp 2560w, /images/cng-london26-18.1920.webp 1920w, /images/cng-london26-18.1600.webp 1600w, /images/cng-london26-18.1440.webp 1440w, /images/cng-london26-18.1280.webp 1280w, /images/cng-london26-18.1024.webp 1024w"><figcaption>Epoch Blue's process ranges from addresses to environmental metrics.</figcaption></figure>

The other lightning talks were great too: Alper Dincer (Climingo) spoke on global
drought mapping with [H3](https://h3geo.org/),
[GeoParquet](https://geoparquet.org/) and [DuckDB](https://duckdb.org/); Ross
Slater (Leeds) on going cloud-native without the cloud for Antarctic ice
dynamics; and Petya Kangalova ([HOT](https://www.hotosm.org/)), who I mentioned
earlier, on cloud-native open imagery for disaster response. Ross has a really
interesting use case which could benefit from a Tessera-style Barlow Twins
approach, but using different satellite data (S1/S2 don't go that far south),
which I need to think about more.

## Panels and closing thoughts

The day closed with a panel featuring [David Eaves](https://www.ucl.ac.uk/bartlett/public-purpose/) (UCL), [Jack Kelly](https://dynamical.org/) (dynamical.org and Open Climate Fix), [Niall Robinson](https://www.nvidia.com/) (NVIDIA) and Kaja Wasik (Echo Labs). Frustratingly, the heatwave had thrown the trains into the usual chaos and I had to leg it to King's Cross to get back to Cambridge, so I missed it entirely.

I did have a chinwag with Niall though, as he's been helping us train Tessera v2 on the [Isambard-AI](https://isambard.ac.uk/) cluster (part of the UK's AI Research Resource). Jack's [dynamical.org](https://dynamical.org/) is also publishing weather data via [Icechunk](https://www.earthmover.io/blog/icechunk/), which I'm planning to use in some weather forecasting research we're doing with Tessera atm.

Jed, Luca, Niall and I all talked about how many of the day's talks came back to matters of provenance and trust. The encouraging thing is that I think we now have many of the pieces in place to do something concrete about it, especially as last week's [PROPL living document](https://docs.google.com/document/d/1ZPUsfBinY1bKiXg-nuKDoEcIEqpL9d2BdOI5FuBfusk/edit) also showed the number of PL researchers who want to dive into this problem alongside systems people.
An ATProto-native trust graph like Tangled's [evidence-backed vouching](https://blog.tangled.org/vouching/) (which I [wrote about a few weeks ago](https://anil.recoil.org/notes/2026w18)) could also anchor data provenance to an identity graph that's reusable across different services (see [Semble](https://semble.so/) for example), and support [evidence-driven practice](https://anil.recoil.org/projects/ce).

This was a great first experience of the cloud-native geospatial community in
London for me! Thanks to Jed and Radiant Earth for convening it; next time,
ideally, in slightly cooler weather, but the coffee in the Jellicoe was top
notch so that made up for the burns!

<figure class="image-center"><img src="/images/cng-london26-19.webp" alt="Granary Square was a full on fountain spraying experience for adults, kids and pets" title="Granary Square was a full on fountain spraying experience for adults, kids and pets" loading="lazy" srcset="/images/cng-london26-19.768.webp 768w, /images/cng-london26-19.640.webp 640w, /images/cng-london26-19.480.webp 480w, /images/cng-london26-19.3840.webp 3840w, /images/cng-london26-19.320.webp 320w, /images/cng-london26-19.2560.webp 2560w, /images/cng-london26-19.1920.webp 1920w, /images/cng-london26-19.1600.webp 1600w, /images/cng-london26-19.1440.webp 1440w, /images/cng-london26-19.1280.webp 1280w, /images/cng-london26-19.1024.webp 1024w"><figcaption>Granary Square was a full on fountain spraying experience for adults, kids and pets</figcaption></figure>
Synopsis: My notes from the first Cloud-Native Geospatial Forum gathering outside the US, up on the top floor of the Jellicoe; covering Source Cooperative's open data economics, Argentina's invisible settlements, and provenance and trust for geospatial decisionmaking.
Words: 3466

## Related

- [Nissim Lebovits on Barrios Visibles at CNG London 2026](https://anil.recoil.org/videos/0f6a181b-a0fb-4795-96aa-cbcb23f51349) (video, 2026-06-25)
- [.plan-26-25: Planetary scale plans, Windows file-descriptor scale problems](https://anil.recoil.org/notes/2026w25) (note, 2026-06-21)
- [Tessera v1.1 released, with smoother and temporally stable embeddings](https://anil.recoil.org/notes/tessera-v11-out) (note, 2026-06-12)
- [.plan-26-18: From tropical forest protection to oi swallowing its oxcaml tail](https://anil.recoil.org/notes/2026w18) (note, 2026-05-03)
- [AI, science and the UK–EU relationship at the Royal Society](https://anil.recoil.org/notes/rs-eu-ai-science) (note, 2026-04-21)
- [TESSERA now supports the Zarr geo-embeddings convention proposal](https://anil.recoil.org/notes/tessera-embeddings-convention) (note, 2026-03-27)
- [Streaming millions of TESSERA tiles over HTTP with Zarr v3](https://anil.recoil.org/notes/tessera-zarr-v3-layout) (note, 2026-03-14)
- [1st TESSERA/CoRE hackathon at the Indian AI Summit](https://anil.recoil.org/notes/first-tessera-hackathon) (note, 2026-02-19)
- [.plan-26-07: Storage, Lego, Echo, and the IUCN](https://anil.recoil.org/notes/2026w7) (note, 2026-02-15)
- [.plan-26-06: Vivas, ARIA and interviews](https://anil.recoil.org/notes/2026w6) (note, 2026-02-08)
- [Publish, Review, Curate to upend scholarly publishing](https://anil.recoil.org/notes/coar-prc) (note, 2025-12-08)
- [Foundational AI for Ecosystem Resilience workshop](https://anil.recoil.org/notes/foundational-ecosystem-workshop) (note, 2025-12-03)
- [Food and the long term risk to life](https://anil.recoil.org/notes/food-and-risk-to-life) (note, 2025-11-06)
- [GeoTessera Python library released for geospatial embeddings](https://anil.recoil.org/notes/geotessera-python) (note, 2025-08-31)
- [An access library for the world crop, food production and consumption datasets](https://anil.recoil.org/ideas/food-provenance-fao) (idea, 2025-04-01)
- [TESSERA, a pixelwise geospatial foundation model](https://anil.recoil.org/projects/tessera) (project, 2025-01-01)
- [Conservation Evidence Copilots](https://anil.recoil.org/projects/ce) (project, 2024-01-01)

---
Canonical: https://anil.recoil.org/notes/cng-london-2026
Type: note
Tags: tessera, biodiversity, conservation, climate, ai, nature, academia
