Planetary Computing
Planetary computing is our research into the systems required to handle the
ingestion, transformation, analysis and publication of global data products for
furthering environmental science and enabling better-informed policy-making. We
apply computer science to problem domains such as forest carbon and
biodiversity preservation (see
"Planetary computing" originated as a term back in 2020 when a merry band of us
from Computer Science (
By 2024, we'd developed enough of a research programme to write up our approach
in "
The Programming for the Planet Community
Then in early 2024,

The
The diversity of the community (spanning climate scientists, ecologists, systems researchers, and programming language theorists) reinforces that we're tackling problems that genuinely need this kind of cross-disciplinary collaboration.
Core Systems Research
I'm working on various systems involved with the ingestion, processing, analysis and publication of global geospatial data products. To break them down:
Data Ingestion and Processing. Ingesting satellite data is a surprisingly
tricky process, usually involving lots of manual curation and trying not to
crash nasa.gov or the ESA websites with too many parallel requests. We're
working on systems that can ingest data from multiple sources while keeping
track of provenance, including satellite imagery (see
Our recent work on geospatial foundation models (see
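To make the ingestion concerns above concrete, here is a minimal Python sketch of the two ideas in that paragraph: capping the number of parallel requests so upstream servers are not overwhelmed, and recording provenance for every object retrieved. It is an illustration under stated assumptions, not a description of our actual ingestion systems. The names `ProvenanceRecord`, `ingest` and `fake_fetch` are hypothetical, and `fake_fetch` stands in for a real HTTP download so the example runs without network access.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record of where a blob came from and what it contained."""
    source_url: str    # where the bytes were requested from
    sha256: str        # content hash of exactly what was retrieved
    size_bytes: int    # payload length in bytes
    retrieved_at: str  # UTC timestamp in ISO 8601 form


def ingest(urls: Iterable[str],
           fetch: Callable[[str], bytes],
           max_parallel: int = 4) -> List[ProvenanceRecord]:
    """Fetch each URL with at most `max_parallel` requests in flight."""
    def fetch_one(url: str) -> ProvenanceRecord:
        payload = fetch(url)
        return ProvenanceRecord(
            source_url=url,
            sha256=hashlib.sha256(payload).hexdigest(),
            size_bytes=len(payload),
            retrieved_at=datetime.now(timezone.utc).isoformat(),
        )

    # The pool size acts as the politeness limit on concurrent requests.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() yields results in the same order as the input URLs.
        return list(pool.map(fetch_one, urls))


def fake_fetch(url: str) -> bytes:
    """Assumed stand-in for a real download; returns deterministic bytes."""
    return ("tile from " + url).encode("utf-8")
```

Calling `ingest(["https://example.org/a", "https://example.org/b"], fake_fetch, max_parallel=2)` returns one record per URL, in input order. A real system would also need retries, backoff and per-host limits, which this sketch leaves out.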
Developer Workflow and Reproducibility. To make use of the data once it is
available, we're building a next-generation "Docker for geospatial" system that
packages precisely versioned data, code and OS environment into a single
container that can be run anywhere. This is a key part of our reproducibility
story, and is a work in progress at quantifyearth/shark.
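One way to picture "precisely versioned data, code and OS environment" is a content-addressed manifest: hash every input, then hash the manifest itself to get a single identifier for the whole bundle. The Python sketch below illustrates that general technique only. It is not the design or API of quantifyearth/shark, and the function names and the `base_image` string are assumptions made for the example.

```python
import hashlib
import json
from typing import Dict


def digest(payload: bytes) -> str:
    """Return a prefixed SHA-256 digest of the given bytes."""
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def build_manifest(data: Dict[str, bytes],
                   code: Dict[str, bytes],
                   base_image: str) -> dict:
    """Pin every input by content hash (hypothetical manifest layout)."""
    return {
        "base_image": base_image,  # assumed identifier for the OS environment
        "code": {name: digest(blob) for name, blob in code.items()},
        "data": {name: digest(blob) for name, blob in data.items()},
    }


def manifest_id(manifest: dict) -> str:
    """Hash a canonical JSON encoding so equal manifests get equal IDs."""
    # sort_keys makes the encoding independent of dict insertion order.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return digest(canonical.encode("utf-8"))
```

With this scheme, changing a single byte of any data file or script changes the manifest identifier, so two runs that report the same identifier used the same inputs. Pinning the OS layer by a plain string, as here, is the weakest link; a real system would pin that by content too.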
The core idea,
described in our
However, building trustworthy computational pipelines at planetary scale
introduces profound challenges around uncertainty propagation and
reproducibility. Our
Frugality extends beyond reproducibility to carbon awareness. In
Specification Languages and Pipelines. We're also working on domain-specific languages for specifying geospatial data processing pipelines, which can be compiled down to efficient code that can run on our planetary computing infrastructure. Our
Ideally, these languages would also capture elements of the specification of the data at different levels of precision, so that we can swap out data sources or processing steps without rewriting the entire pipeline or obscuring the intent of the domain expert who wrote it. You can see an example of a manually written and extremely detailed pipeline in our
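The idea of swapping data sources without rewriting the pipeline can be illustrated with a small Python sketch in which the pipeline names its input only by a logical label, and the binding from label to concrete data happens at run time. This is a toy under stated assumptions rather than one of our specification languages: `Pipeline`, `threshold`, `crop` and the `"tree_cover"` label are all hypothetical, and grids are plain nested lists instead of real rasters.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Grid = List[List[float]]
Step = Callable[[Grid], Grid]


def threshold(cutoff: float) -> Step:
    """Build a step that maps each cell to 1.0 if >= cutoff, else 0.0."""
    def step(grid: Grid) -> Grid:
        return [[1.0 if v >= cutoff else 0.0 for v in row] for row in grid]
    return step


def crop(height: int, width: int) -> Step:
    """Build a step that keeps at most the top-left height x width window."""
    def step(grid: Grid) -> Grid:
        return [row[:width] for row in grid[:height]]
    return step


@dataclass(frozen=True)
class Pipeline:
    source: str              # logical input name, not a file or URL
    steps: Tuple[Step, ...]  # processing steps applied in order

    def run(self, sources: Dict[str, Callable[[], Grid]]) -> Grid:
        # Resolve the logical name to concrete data only at run time.
        grid = sources[self.source]()
        for step in self.steps:
            grid = step(grid)
        return grid


# The expert's intent is stated once, independent of any particular dataset.
forest_mask = Pipeline(source="tree_cover",
                       steps=(threshold(0.5), crop(2, 2)))
```

The same `forest_mask` value can then be run against a fine-grained or a coarse input just by passing a different `sources` dictionary to `run`, which is the property the paragraph above is after. A real specification language would additionally check that the substituted source matches the declared resolution, units and projection.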
Looking Forward
There's a lot more to say about ongoing projects, but the overall message is:
if you're interested in contributing to some part of the planetary computing
ecosystem, either as a collaborator or a student, get in touch! The community
we've built through PROPL and related work (see also
Related Reading
Cyrus Omar and his team, who work on the Hazel language, have also been tackling
a similar problem domain, and we're looking forward to collaborating with them.
Read
I've also given several talks on planetary computing, including keynotes at ICFP 2023 and at LambdaDays. Both are linked below; the latter is the more recent.
Activity
My (very) fast zero-allocation webserver using OxCaml – Research note (Feb 2026)
Five ways to use the LIFE metric for conservation decision-making – Research note (Jan 2026)
Four Ps for Building Massive Collective Knowledge Systems – Research note (Nov 2025)
Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025 – Note about Functional Networking for Millions of Docker Desktops (Oct 2025)
LIFE becomes an Official Statistic of the UK government – Research note (Mar 2025)
Thoughts on the National Data Library and private research data – Research note (Feb 2025)
About – Research note (Feb 2025)
Programming FPGAs using OCaml – Research note (Feb 2025)
Mapping greener futures with planetary computing – Note about Mapping greener futures with planetary computing (Oct 2024)
Royal Society meeting on ecological/commercial risks – Research note (Oct 2024)
COMPASS 2024 report on the CoRE stack RIC meeting – Research note (Jul 2024)
Remote Sensing of Nature – Project (2023–present)
Trusted Carbon Credits – Project (2021–present)
Information Flow for Trusted Execution – Project (2020–present)
ZFS replication strategies with encryption – Research idea (ongoing, Any level, Jan 2000)