# Planetary Computing

*2022-01-01 — project*


Planetary computing is our research into the systems required to handle the
ingestion, transformation, analysis and publication of global data products for
furthering environmental science and enabling better informed policy-making. We
apply computer science to problem domains such as forest carbon and
biodiversity preservation (see [Trusted Carbon Credits](https://anil.recoil.org/projects/4c) and [Remote Sensing of Nature](https://anil.recoil.org/projects/rsn)), and design solutions that can
scalably process geospatial data and build trust in the results via
traceability and reproducibility. Key problems include how to handle
continuously changing datasets that are often collected across decades and
require careful access and version control.


"Planetary computing" originated as a term back in 2020 when a merry band of us
from Computer Science ([Srinivasan Keshav](https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page) and me, later joined by [Sadiq Jaffer](https://toao.com), [Patrick Ferris](https://patrick.sirref.org),
[Michael Dales](https://mynameismwd.org) and the bigger EEG group now) began working on [Trusted Carbon Credits](https://anil.recoil.org/projects/4c) and implementing
the large-scale computing infrastructure required for processing remote sensing
data. Our early thoughts on how computer science could help were captured in
"[How Computer Science Can Aid Forest Restoration](https://anil.recoil.org/papers/2021-arxiv-forestrycs)", which laid out the vision for bringing
computational techniques to bear on forest restoration.

By 2024, we'd developed enough of a research programme to write up our approach
in "[Planetary computing for data-driven environmental policy-making](https://anil.recoil.org/papers/2024-planetary-computing)", which describes the systems architecture
we've been building. The core insight is that environmental science needs the
same level of computational rigor that we've brought to other domains, but with
unique challenges around data provenance, reproducibility, and scale.

## The Programming for the Planet Community

Then in early 2024, [Dominic Orchard](https://dorchard.github.io) and I decided to find others interested in the
problem domain, and organised the first "[Programming for the Planet](https://propl.dev)" (PROPL) workshop in London, co-located with
POPL 2024. This turned out to be a fully subscribed event, with chairs having to be brought in at one point for some of the [more popular
talks](https://plas4sci.github.io/conference/2024/01/22/propl.html)! It convinced us that there's genuine momentum behind, and a need for, planetary
computing research as a distinct discipline.

<figure class="image-center"><img src="/images/propl24-poster.webp" alt="The PROPL 2024 invitation poster" title="The PROPL 2024 invitation poster" loading="lazy" srcset="/images/propl24-poster.768.webp 768w, /images/propl24-poster.640.webp 640w, /images/propl24-poster.480.webp 480w, /images/propl24-poster.320.webp 320w, /images/propl24-poster.1024.webp 1024w"><figcaption>The PROPL 2024 invitation poster</figcaption></figure>

The [second PROPL workshop](https://anil.recoil.org/notes/icfp25-propl) in October 2025 was co-located with
ICFP/SPLASH in Singapore, and we were thrilled to have enough quality
submissions to publish [proceedings](https://anil.recoil.org/papers/2025-propl) in the ACM
Digital Library for the first time! The workshop covered everything from
climate model verification and GPU-accelerated hydrology to our own work on
declarative geospatial programming with [Yirgacheffe: A Declarative Approach to Geospatial Data](https://anil.recoil.org/papers/2025-yirgacheffe) and the
vision for a FAIR computational commons in [A FAIR Case for a Live Computational Commons](https://anil.recoil.org/papers/2025-fairground).

The diversity of the community (spanning climate scientists, ecologists,
systems researchers, and programming language theorists) reinforces that we're
tackling problems that genuinely need this kind of cross-disciplinary
collaboration.

## Core Systems Research

I'm working on various systems involved with the ingestion, processing,
analysis and publication of global geospatial data products. To break them
down:

**Data Ingestion and Processing.** Ingesting satellite data is a surprisingly
tricky process, usually involving lots of manual curation and trying not to
crash nasa.gov or the ESA websites with too many parallel requests. We're
working on systems that can ingest data from multiple sources while keeping
track of provenance, including satellite imagery (see [Remote Sensing of Nature](https://anil.recoil.org/projects/rsn)), ground-based
sensors (see [Terracorder: Sense Long and Prosper](https://anil.recoil.org/papers/2024-terracorder)), and citizen science data gathering. This
involves a lot of data cleaning and transformation as well as parallel and
clustered code. The challenge is similar to what we're tackling with [Conservation Evidence Copilots](https://anil.recoil.org/projects/ce) for
literature scanning - how do you build trust in automatically processed data
pipelines at scale?
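
The two concerns in this paragraph (not overloading upstream archives, and recording where every byte came from) can be illustrated with a minimal sketch. It is not our ingestion system: `ingest`, `ProvenanceRecord` and the `fetch` callback are hypothetical names, and the example uses a stand-in fetcher rather than real network access.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable


@dataclass(frozen=True)
class ProvenanceRecord:
    source_url: str     # where the bytes came from
    sha256: str         # content hash, so later stages can verify their inputs
    size_bytes: int
    retrieved_at: str   # ISO 8601 UTC timestamp


def ingest(urls: Iterable[str],
           fetch: Callable[[str], bytes],
           max_parallel: int = 4) -> list[ProvenanceRecord]:
    """Fetch every URL with at most `max_parallel` requests in flight."""

    def fetch_one(url: str) -> ProvenanceRecord:
        payload = fetch(url)
        return ProvenanceRecord(
            source_url=url,
            sha256=hashlib.sha256(payload).hexdigest(),
            size_bytes=len(payload),
            retrieved_at=datetime.now(timezone.utc).isoformat(),
        )

    # The pool size is the politeness limit: no more than `max_parallel`
    # requests hit the upstream archive at once.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(fetch_one, urls))  # results keep input order


# Stand-in fetcher (an assumption for illustration; no network access).
fake_archive = {
    "https://example.org/a.tif": b"tile-a",
    "https://example.org/b.tif": b"tile-b",
}
records = ingest(list(fake_archive), fake_archive.__getitem__, max_parallel=2)
```

Each record ties a downloaded object to its source and content hash, which is the minimum a later pipeline stage needs in order to check that it is reading the data it thinks it is.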

Our recent work on geospatial foundation models (see [TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis](https://anil.recoil.org/papers/2025-tessera)) is opening up
new possibilities here, allowing us to work with rich embeddings rather than
raw satellite data. This "embedding-as-data" approach could democratise access
to advanced remote sensing analytics, though it creates new programming
challenges around how to work with these planetary-scale embeddings
effectively.

**Developer Workflow and Reproducibility.** Once data is available, we're
building a next-generation "Docker for geospatial" system that can package up
precisely versioned data, code and OS environment into a single container that
can be run anywhere. This is a key part of our reproducibility story, and is a
work-in-progress at [quantifyearth/shark](https://github.com/quantifyearth/shark).
The core idea,
described in our [Lineage first computing: towards a frugal userspace for Linux](https://anil.recoil.org/papers/2024-loco-shark) paper, is "lineage-first computing": we
put the workflow graph containing relationships between tools, provenance and
labelling at the core of the system. By tracking how data pipelines evolve from
experimental practice and what data has already been built, we can avoid
unnecessary re-execution both during development and after publication. This builds on
years of experience with unikernels and containers (see [Functional Networking for Millions of Docker Desktops](https://anil.recoil.org/papers/2025-docker-icfp))
but adapts the model specifically for the frugal, reproducible computing that
environmental science demands.
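
To make the "avoid re-execution" idea concrete, here is a toy sketch, assuming that a step's output is fully determined by its name, its code version and the content of its inputs. It is not how shark is implemented; `LineageStore` and its methods are hypothetical names used only for illustration.

```python
import hashlib
import json
from typing import Callable


class LineageStore:
    """Toy in-memory store mapping a step's lineage key to its output."""

    def __init__(self) -> None:
        self.outputs: dict[str, bytes] = {}
        self.executions = 0  # counts real (non-cached) runs

    def run(self, step: str, code_version: str, inputs: list[bytes],
            fn: Callable[[list[bytes]], bytes]) -> bytes:
        # The key covers everything assumed to determine the output: which
        # step, which version of its code, and the hashes of its inputs.
        lineage = {
            "step": step,
            "code": code_version,
            "inputs": [hashlib.sha256(i).hexdigest() for i in inputs],
        }
        key = hashlib.sha256(
            json.dumps(lineage, sort_keys=True).encode()
        ).hexdigest()
        if key not in self.outputs:
            self.outputs[key] = fn(inputs)
            self.executions += 1
        return self.outputs[key]


store = LineageStore()
tile = b"raw-tile-bytes"
upper = lambda xs: xs[0].upper()  # stand-in for an expensive transformation

first = store.run("normalise", "v1", [tile], upper)
again = store.run("normalise", "v1", [tile], upper)   # same lineage: reused
bumped = store.run("normalise", "v2", [tile], upper)  # new code version: re-run
```

Only two of the three calls execute the step. The second is answered from the store because nothing in its lineage changed.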

However, building trustworthy computational pipelines at planetary scale
introduces profound challenges around uncertainty propagation and
reproducibility. Our [Uncertainty at scale: how CS hinders climate research](https://anil.recoil.org/papers/2024-uncertainty-cs) work explores how computer science
assumptions can inadvertently hinder climate research - from non-determinism in
floating-point operations to subtle differences in library versions affecting
satellite data processing. These issues become critical when climate scientists
need to quantify uncertainty bounds on their models, yet standard CS tools
often obscure rather than illuminate sources of variability.
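
One source of this variability is easy to reproduce: floating-point addition is not associative, so a reduction whose order depends on scheduling can give different answers on the same data. The sketch below uses an explicit loop rather than the built-in `sum`, because recent Python versions apply compensated summation to floats and would hide the effect. The values are chosen purely for illustration.

```python
import math


def left_to_right_sum(xs):
    """Plain sequential accumulation, as a naive reduction would do."""
    total = 0.0
    for x in xs:
        total += x
    return total


values = [1e16, 1.0, -1e16]
reordered = [1e16, -1e16, 1.0]   # same numbers, different reduction order

naive_a = left_to_right_sum(values)      # the 1.0 is lost to rounding
naive_b = left_to_right_sum(reordered)   # the large terms cancel first

# math.fsum tracks the rounding error and is independent of ordering.
exact_a = math.fsum(values)
exact_b = math.fsum(reordered)
```

Here `naive_a` and `naive_b` disagree even though the inputs are identical as a set, while the two `math.fsum` results match.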

Frugality extends beyond reproducibility to carbon awareness. In
[Carbon-aware Name Resolution](https://anil.recoil.org/papers/2024-loco-carbonres), we explore how DNS name resolution could become
carbon-aware, treating emissions as a first-class metric for scheduling
decisions. By extending DNS with load balancing that considers carbon costs, we
can maintain compatibility with existing infrastructure while enabling
applications to minimize their environmental footprint - particularly important
for planetary-scale computations that may run repeatedly over decades.
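
As a rough illustration of treating emissions as a scheduling metric, the sketch below picks between replicas of a service using a carbon-intensity figure alongside latency. It is not the protocol from the paper: the `Replica` type, the `resolve` function, the selection policy and all the numbers are assumptions made for this example, and the addresses come from the reserved documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    address: str              # the address a resolver would hand back
    carbon_g_per_kwh: float   # assumed grid carbon intensity at the site
    latency_ms: float


def resolve(replicas: list[Replica], latency_budget_ms: float) -> Replica:
    """Pick the lowest-carbon replica that still meets the latency budget."""
    eligible = [r for r in replicas if r.latency_ms <= latency_budget_ms]
    # If nothing meets the budget, consider every replica rather than
    # failing the lookup.
    candidates = eligible or replicas
    return min(candidates, key=lambda r: (r.carbon_g_per_kwh, r.latency_ms))


replicas = [
    Replica("192.0.2.10", carbon_g_per_kwh=420.0, latency_ms=12.0),
    Replica("198.51.100.7", carbon_g_per_kwh=35.0, latency_ms=48.0),
    Replica("203.0.113.5", carbon_g_per_kwh=20.0, latency_ms=180.0),
]
interactive = resolve(replicas, latency_budget_ms=100.0)
batch = resolve(replicas, latency_budget_ms=1000.0)
```

An interactive client with a 100 ms budget gets the cleaner of the two nearby replicas, while a batch job that can tolerate more latency is steered to the lowest-carbon site overall.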

**Specification Languages and Pipelines.** We're also working on domain-specific languages for specifying geospatial data processing pipelines, which can be compiled down to efficient code that can run on our planetary computing infrastructure. Our [Yirgacheffe: A Declarative Approach to Geospatial Data](https://anil.recoil.org/papers/2025-yirgacheffe) library for Python, developed by [Michael Dales](https://mynameismwd.org), [Patrick Ferris](https://patrick.sirref.org) and colleagues, allows spatial algorithms to be implemented concisely while automatically handling resources (cores, memory, GPUs) and supporting parallel execution. This avoids common errors and makes it possible for ecologists to write robust pipelines without being systems programming experts.
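
To give a flavour of what "declarative" buys you, here is a toy sketch in plain Python. It is emphatically not Yirgacheffe's API: the `Layer` class and its methods are invented for this example. The point is only that the user writes `habitat * pixel_area` and the library decides how to evaluate it, here by streaming one row at a time so memory use does not grow with raster height.

```python
from typing import Callable, Iterator

Row = list[float]


class Layer:
    """A lazily evaluated raster: rows are produced on demand."""

    def __init__(self, rows: Callable[[], Iterator[Row]]) -> None:
        self._rows = rows

    @staticmethod
    def from_rows(data: list[Row]) -> "Layer":
        return Layer(lambda: iter(data))

    def __mul__(self, other: "Layer") -> "Layer":
        # Nothing is computed here; we only record how to compute each row.
        return Layer(lambda: (
            [a * b for a, b in zip(ra, rb)]
            for ra, rb in zip(self._rows(), other._rows())
        ))

    def sum(self) -> float:
        # Evaluation streams row by row, so peak memory is one row per
        # input layer regardless of how tall the raster is.
        return sum(sum(row) for row in self._rows())


habitat = Layer.from_rows([[1.0, 0.0], [1.0, 1.0]])      # 1 = suitable pixel
pixel_area = Layer.from_rows([[2.0, 2.0], [3.0, 3.0]])   # area per pixel
suitable_area = (habitat * pixel_area).sum()
```

A real library has far more to handle (projections, differing extents, parallelism), but the separation between what is computed and how it is scheduled is the same.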

Ideally, these languages would also capture elements of the _specification_ of the data at different levels of precision, so that we can swap out different data sources or processing steps without having to rewrite the entire pipeline or alter the intent of the domain expert who wrote the code. You can see an example of a manually written and extremely detailed pipeline in our [PACT Tropical Moist Forest Accreditation Methodology v2.1](https://anil.recoil.org/papers/2023-pact-tmf) whitepaper - converting this to readable, maintainable code is a pretty big challenge! The vision laid out in [A FAIR Case for a Live Computational Commons](https://anil.recoil.org/papers/2025-fairground) of notebooks that can reference each other as libraries in a planetary-scale computational commons is one direction we're exploring.

## Looking Forward

There's a lot more to say about ongoing projects, but the overall message is:
if you're interested in contributing to some part of the planetary computing
ecosystem, either as a collaborator or a student, get in touch! The community
we've built through PROPL and related work (see also [Nine changes needed to deliver a radical transformation in biodiversity measurement](https://anil.recoil.org/papers/2025-biodiversity-9recs)
for broader recommendations on transforming biodiversity measurement) shows
there's real momentum behind making computational environmental science more
rigorous, reproducible, and accessible.

## Related Reading

Cyrus Omar and his team working on the Hazel language have also been tackling a
similar problem domain, and we're looking forward to collaborating with them.
Read [A FAIR Case for a Live Computational Commons](https://anil.recoil.org/papers/2025-fairground) or watch their [PROPL 2024 talk](https://watch.eeg.cl.cam.ac.uk/w/3nGExywoVm6XFRBA2zYxSL).

I've also given several talks on planetary computing, including a [keynote at ICFP 2023](https://icfp23.sigplan.org/track/icfp-2023-icfp-keynotes?track=ICFP%20%20Keynotes#program) and one at LambdaDays. Both are embedded below; the LambdaDays talk is the more recent of the two.

<div class="video-center"><iframe title="" width="100%" height="315px" src="https://crank.recoil.org/videos/embed/981c00b5-32c0-4cac-a387-6c945dfa9934" frameborder="0" allowfullscreen sandbox="allow-same-origin allow-scripts allow-popups allow-forms"></iframe></div>

<div class="video-center"><iframe title="" width="100%" height="315px" src="https://crank.recoil.org/videos/embed/d592bf17-c835-435f-9469-f0f65e926975" frameborder="0" allowfullscreen sandbox="allow-same-origin allow-scripts allow-popups allow-forms"></iframe></div>

Period: 2022–present

## Related

- [The FP Launchpad takes off at IIT Madras](https://anil.recoil.org/notes/fpl-launch) (note, 2026-04-13)
- [.plan-26-15: Banyan trees, (anti)botnets and Bose-Einstein bases](https://anil.recoil.org/notes/2026w15) (note, 2026-04-12)
- [Streaming millions of TESSERA tiles over HTTP with Zarr v3](https://anil.recoil.org/notes/tessera-zarr-v3-layout) (note, 2026-03-14)
- [A FAIR Case for a Live Computational Commons](https://anil.recoil.org/videos/0e82977c-ba11-487f-bece-147fb1da104d) (video, 2026-03-08)
- [Connecting the dots for biodiversity action from the NAS/Royal Society Forum](https://anil.recoil.org/notes/nas-rs-biodiversity-papers) (note, 2026-03-07)
- [My (very) fast zero-allocation webserver using OxCaml](https://anil.recoil.org/notes/oxcaml-httpz) (note, 2026-02-01)
- [Yirgacheffe: a declarative approach to geospatial data](https://anil.recoil.org/videos/b72021da-29fb-44f2-9ec0-2ebb9fc8993f) (video, 2026-01-28)
- [Five ways to use the LIFE metric for conservation decision-making](https://anil.recoil.org/notes/life-uses-paper) (note, 2026-01-12)
- [Nine changes needed to deliver a radical transformation in biodiversity measurement](https://anil.recoil.org/papers/2025-biodiversity-9recs) (paper, 2026-01-01)
- [Four Ps for Building Massive Collective Knowledge Systems](https://anil.recoil.org/notes/principles-for-collective-knowledge) (note, 2025-11-23)
- [TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis](https://anil.recoil.org/papers/2025-tessera) (paper, 2025-11-01)
- [Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025](https://anil.recoil.org/notes/icfp25-ocaml5-js-docker) (note, 2025-10-07)
- [Programming for the Planet at ICFP/SPLASH 2025](https://anil.recoil.org/notes/icfp25-propl) (note, 2025-10-05)
- [A FAIR Case for a Live Computational Commons](https://anil.recoil.org/papers/2025-fairground) (paper, 2025-10-01)
- [Programming Opportunities for the Global Biodiversity Observation Network](https://anil.recoil.org/papers/2025-programming-gbon) (paper, 2025-10-01)
- [Yirgacheffe: A Declarative Approach to Geospatial Data](https://anil.recoil.org/papers/2025-yirgacheffe) (paper, 2025-10-01)
- [Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet](https://anil.recoil.org/papers/2025-propl) (paper, 2025-10-01)
- [Functional Networking for Millions of Docker Desktops](https://anil.recoil.org/papers/2025-docker-icfp) (paper, 2025-08-01)
- [ZFS replication strategies with encryption](https://anil.recoil.org/ideas/zfs-filesystem-perf) (idea, 2025-06-01)
- [Carbon-Aware Name Resolution](https://anil.recoil.org/videos/4cd6efdb-fd22-4a1c-a326-df49dfc1f398) (video, 2025-04-15)
- [Lineage first computing: towards a frugal userspace for Linux](https://anil.recoil.org/videos/cb2439c9-d160-4daa-8103-b952c5aa2c5f) (video, 2025-04-15)
- [LIFE becomes an Official Statistic of the UK government](https://anil.recoil.org/notes/life-official-statistic) (note, 2025-03-21)
- [Thoughts on the National Data Library and private research data](https://anil.recoil.org/notes/uk-national-data-lib) (note, 2025-02-17)
- [About](https://anil.recoil.org/notes/index) (note, 2025-02-15)
- [Programming FPGAs using OCaml](https://anil.recoil.org/notes/fpgas-hardcaml) (note, 2025-02-07)
- [OxCaml Labs](https://anil.recoil.org/projects/oxcaml) (project, 2025-01-01)
- [Carbon-aware Name Resolution](https://anil.recoil.org/papers/2024-loco-carbonres) (paper, 2024-12-01)
- [Lineage first computing: towards a frugal userspace for Linux](https://anil.recoil.org/papers/2024-loco-shark) (paper, 2024-12-01)
- [Mapping greener futures with planetary computing](https://anil.recoil.org/notes/a0280750-2ef0-4f5c-b138-68f7b11b4c29-1) (note, 2024-10-24)
- [Royal Society meeting on ecological/commercial risks](https://anil.recoil.org/notes/rs-ecorisk-day1) (note, 2024-10-04)
- [PACT Tropical Moist Forest Accreditation Methodology v2.1](https://anil.recoil.org/papers/2023-pact-tmf) (paper, 2024-08-01)
- [Terracorder: Sense Long and Prosper](https://anil.recoil.org/papers/2024-terracorder) (paper, 2024-08-01)
- [COMPASS 2024 report on the CoRE stack RIC meeting](https://anil.recoil.org/notes/compass2024-ric-tripreport) (note, 2024-07-08)
- [Programming for the Planet](https://anil.recoil.org/videos/d592bf17-c835-435f-9469-f0f65e926975) (video, 2024-05-27)
- [Planetary computing for data-driven environmental policy-making](https://anil.recoil.org/papers/2024-planetary-computing) (paper, 2024-03-01)
- [Uncertainty at scale: how CS hinders climate research](https://anil.recoil.org/papers/2024-uncertainty-cs) (paper, 2024-02-01)
- [Conservation Evidence Copilots](https://anil.recoil.org/projects/ce) (project, 2024-01-01)
- [BBC interview about new Cambridge supercomputer](https://anil.recoil.org/videos/48a7ab10-3f49-4978-a00f-c26b64c2cae7) (video, 2023-11-02)
- [Functional Programming for the Planet](https://anil.recoil.org/videos/981c00b5-32c0-4cac-a387-6c945dfa9934) (video, 2023-09-05)
- [Remote Sensing of Nature](https://anil.recoil.org/projects/rsn) (project, 2023-01-01)
- [How Computer Science Can Aid Forest Restoration](https://anil.recoil.org/papers/2021-arxiv-forestrycs) (paper, 2021-08-01)
- [Trusted Carbon Credits](https://anil.recoil.org/projects/4c) (project, 2021-01-01)
- [Information Flow for Trusted Execution](https://anil.recoil.org/projects/difc-tee) (project, 2020-01-01)

---
Canonical: https://anil.recoil.org/projects/plancomp
Type: project
Tags: systems, conservation, satellite, sensing
