Remote Sensing of Nature

Measuring the world's forest carbon and biodiversity is made possible by remote sensing instruments, ranging from satellites in space (Landsat, Sentinel, GEDI) to ground-based sensors (ecoacoustics, camera traps, moisture sensors). These instruments take regular samples, which are processed into time-series metrics and actionable insights for conservation and human development. However, the algorithms for processing this data are challenging to design, as the data is highly multimodal (multispectral, hyperspectral, synthetic aperture radar, or lidar), often sparsely sampled in space, and rarely forms a continuous time series. I work on the algorithms, software and hardware systems we are developing to improve the datasets we have about the surface of the earth, in close collaboration with the Planetary Computing and Trusted Carbon Credits projects.
1 Mapping nature on earth
Figuring out where things live on the planet's surface from satellites requires a lot of data processing, and tricks to work around the fact that we can't easily see through clouds (when using optical sensors) or handle very sloped surfaces (if using lidar) or peek through the top of a dense forest canopy (especially in tropical forests). Along with colleagues in the Cambridge Centre for Earth Observation, I've been working on a few projects that aim to improve the quality of the data we have about the surface of the earth.
The main research question we're tackling is how to improve our knowledge about where most wild species live on the planet, so that we can better protect their receding habitats. In particular, data on where rare plant species live is surprisingly scarce.
1.1 Geospatial Foundation Models with TESSERA
A breakthrough in our remote sensing work came in 2025 with TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis, a geospatial foundation model that processes Sentinel-1 and Sentinel-2 satellite data to generate global embedding maps. We developed TESSERA to compress a full year of satellite observations into a compact 128-dimensional embedding for every 10 m × 10 m pixel of the planet's surface.
Foundation models represent a paradigm shift for remote sensing: instead of training new models from scratch for every downstream task (crop classification, forest height estimation, biomass calculations), TESSERA's pre-trained embeddings can be used directly as features. This "embedding-as-data" approach democratises access to advanced remote sensing analytics, making it feasible for ecologists without deep ML expertise to build sophisticated classifiers. See "GeoTessera Python library released for geospatial embeddings" for details on the Python library we built to make these embeddings accessible.
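To make the "embedding-as-data" idea concrete, here is a minimal sketch of a classifier that treats 128-dimensional per-pixel embeddings purely as input features. It does not call GeoTessera or load real TESSERA data: the embeddings are synthetic random vectors, and the nearest-centroid classifier and the helper names (fit_centroids, predict, synthetic_pixels) are illustrative assumptions rather than anything from the library.

```python
import numpy as np

EMBED_DIM = 128  # TESSERA embeddings are 128-dimensional per pixel


def fit_centroids(features, labels):
    """Return (classes, centroids): one mean embedding per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(features, classes, centroids):
    """Assign each embedding to the class with the nearest centroid."""
    # Pairwise squared distances, shape (n_pixels, n_classes).
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]


def synthetic_pixels(rng, n_per_class=200, n_classes=3):
    """Stand-in data (an assumption): vectors clustered around random class means."""
    means = rng.normal(0.0, 1.0, size=(n_classes, EMBED_DIM))
    feats = np.concatenate(
        [m + 0.5 * rng.normal(size=(n_per_class, EMBED_DIM)) for m in means]
    )
    labs = np.repeat(np.arange(n_classes), n_per_class)
    return feats, labs


rng = np.random.default_rng(0)
X, y = synthetic_pixels(rng)

# Label only every tenth pixel, mimicking a small field-survey training set.
is_train = np.arange(len(y)) % 10 == 0
classes, centroids = fit_centroids(X[is_train], y[is_train])
accuracy = float((predict(X[~is_train], classes, centroids) == y[~is_train]).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is the shape of the workflow: no model is trained on imagery, and the only learned parameters are one centroid per class. With real embeddings, the same two functions would be applied to an array of shape (n_pixels, 128) paired with a small set of ground-truth labels.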
TESSERA is trained only on public ESA satellite data and we make the embeddings freely available, in line with our commitment to open science. The model has been evaluated on diverse tasks from wildfire detection to forest stock estimation, and we described the programming challenges around planetary-scale embeddings in the Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet.
1.2 Satellite and Drone Sensing of Tree Species
Old-growth tropical trees have the big advantage of being relatively easily visible from the air, and we've been developing a robust satellite and drone processing pipeline as part of the Planetary Computing project. James G. C. Ball and Sadiq Jaffer have been leading an effort to use this data to develop a new approach for mapping tropical tree species. They link a multi-temporal implementation of a CNN method, which segments tropical forest tree crowns from aerial photographs, to ML classifiers that identify species from hyperspectral data.

Read more about it in the "Harnessing temporal & spectral dimensionality to identify individual trees in tropical forests" preprint.
1.3 Biodiversity Metrics from Remote Sensing Data
The remote sensing data we collect feeds directly into biodiversity impact metrics. Our work on Area of Habitat (AoH) maps, Species Distribution Models, and applications like the LIFE metric that quantifies species extinction risks from land-cover changes, is covered in detail in the Mapping LIFE on Earth project.
1.4 Coordinating Global Biodiversity Observations
Our work on remote sensing feeds into broader efforts to coordinate biodiversity monitoring at global scale. Through collaboration with the Global Biodiversity Observation Network (GEO BON) and their "BON in a Box" platform (see Programming Opportunities for the Global Biodiversity Observation Network), we're helping to establish standardised pipelines for calculating essential biodiversity variables and tracking progress towards conservation targets. This work demonstrates how the planetary computing infrastructure we're building can serve the needs of international biodiversity monitoring networks.
The urgency of the Kunming-Montreal Global Biodiversity Framework demands harmonized monitoring approaches. Our contribution, "From data to decisions: Toward a Biodiversity Monitoring Standards Framework", proposes a comprehensive Biodiversity Monitoring Standards Framework (BMSF): a modular, tiered system guiding standardization from ethical data collection to model-based analysis and reporting. The framework integrates Essential Variables, FAIR and CARE data-management principles, and accredited analytical workflows through open-source platforms. This enables comparison and aggregation across scales while ensuring consistent data capture, quality assurance, and validated analytical pathways, all of which are critical for decisions ranging from prioritizing restoration areas to verifying corporate nature-related disclosures.
2 Ground-Based Sensing with the Terracorder

In 2024, I started collaborating with Josh Millar over at Imperial College on developing a low-cost sensor device designed for long-term deployment in remote nature areas as well as urban environments. Since in-situ sensing devices need to be deployed in remote environments for long periods of time, minimizing their power consumption is vital for maximising both their operational lifetime and coverage. We started from an ESP32 base (due to the lovely 16-bit ultra-low power mode) and have been prototyping the "Terracorder" as a versatile multi-sensor device. Read more about it in Terracorder: Sense Long and Prosper and Poster: Towards Low-Power Comprehensive Biodiversity Monitoring.
I've also been exploring spatial networking with Josh and Ryan Gibb (see Where on Earth is the Spatial Name System?), and we've been figuring out whether a combination of reinforcement learning and spatial networking knowledge might take this device to the next level of usability. We've been experimenting with an on-device reinforcement learning scheduler. When we evaluated our prototype against a number of fixed schedules, it captured more than 80% of events while using fewer than 50% of the activations of the best-performing fixed schedule. We're currently working on a collaborative scheduler that can maximise the useful operation of a network of these Terracorders, improving overall network power consumption and robustness.
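As a rough illustration of what a learned duty-cycle scheduler does, the sketch below uses an epsilon-greedy bandit, one of the simplest forms of reinforcement learning, to decide for each hour of the day whether to wake the sensor. This is not the Terracorder scheduler: the event model, the reward and energy-cost constants, and all function names are hypothetical, and the environment is entirely synthetic.

```python
import random

N_SLOTS = 24        # hypothetical: one wake/sleep decision per hour of the day
WAKE_COST = 0.3     # hypothetical energy penalty for each activation
EVENT_REWARD = 1.0  # hypothetical reward for capturing an event
SLEEP, WAKE = 0, 1


def event_probability(slot):
    """Synthetic environment: events cluster around dawn and dusk."""
    return 0.8 if slot in (5, 6, 7, 18, 19, 20) else 0.05


def train(days=2000, epsilon=0.1, alpha=0.05, seed=0):
    """Learn a value estimate q[slot][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_SLOTS)]
    for _ in range(days):
        for slot in range(N_SLOTS):
            # Explore occasionally, otherwise take the action that looks best.
            if rng.random() < epsilon:
                action = rng.choice((SLEEP, WAKE))
            else:
                action = WAKE if q[slot][WAKE] > q[slot][SLEEP] else SLEEP
            event = rng.random() < event_probability(slot)
            if action == WAKE:
                reward = (EVENT_REWARD if event else 0.0) - WAKE_COST
            else:
                reward = 0.0  # sleeping costs nothing but captures nothing
            # Move the estimate a small step towards the observed reward.
            q[slot][action] += alpha * (reward - q[slot][action])
    return q


def greedy_policy(q):
    """Final schedule: wake only in slots where waking has paid off."""
    return [WAKE if q[s][WAKE] > q[s][SLEEP] else SLEEP for s in range(N_SLOTS)]


def evaluate(policy, days=500, seed=1):
    """Return (fraction of events captured, fraction of slots spent awake)."""
    rng = random.Random(seed)
    events = captured = activations = 0
    for _ in range(days):
        for slot in range(N_SLOTS):
            event = rng.random() < event_probability(slot)
            events += event
            if policy[slot] == WAKE:
                activations += 1
                captured += event
    return captured / events, activations / (days * N_SLOTS)


policy = greedy_policy(train())
capture_rate, duty_cycle = evaluate(policy)
always_on_rate, always_on_duty = evaluate([WAKE] * N_SLOTS)
print(f"learned:   captured {capture_rate:.0%} of events, awake {duty_cycle:.0%} of slots")
print(f"always on: captured {always_on_rate:.0%} of events, awake {always_on_duty:.0%} of slots")
```

The figures this prints describe only the toy environment and say nothing about the real device. What carries over is the trade-off being learned: each activation has an energy cost, so the scheduler concentrates wake-ups in the periods where events have historically occurred.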
The spatial coordination challenge extends beyond power management to network architecture itself. Our work on "An Architecture for Spatial Networking" proposes using bigraphs as a unifying representation for spatial, social, and communication relationships in sensor networks. This enables enforcing spatial access policies and distributed reasoning that scopes computation to the smallest viable subspace, which is crucial for low-latency, privacy-preserving spatial networking in remote environmental deployments where Terracorders need to coordinate while respecting physical and organizational boundaries.
3 Applications to Human Health and Urban Nature
Ultimately, we would also like to understand the impact of more natural spaces on human health. After all, we not only need to protect unspoilt nature, but also need to make sure that highly urbanised areas are liveable. Andres Zuñiga-Gonzalez, Ronita Bardhan and I have been investigating the impact of green spaces in cities. These have been demonstrated to offer multiple benefits to their inhabitants, including cleaner air, shade in sunny periods, and a place that contributes to mental well-being. In addition, trees in cities are home to several species of animals and work as a nature-based solution that can sequester CO2 and regulate water storage in urban ecosystems.
We began this work by analyzing the 3-30-300 urban greening rule across major UK cities in "Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications", demonstrating how remote sensing imagery, census data, and machine learning could reveal patterns of green space inequality. In 2025, we completed the first national, building-level assessment of urban nature access in England using this framework (see "Airborne assessment uncovers socioeconomic stratification of urban nature in England"). Using high-resolution LiDAR, Sentinel-2 imagery, and geospatial data for over 28 million buildings, we integrated raster, vector, and socioeconomic data within a scalable computational framework. Our results revealed stark inequalities: while most urban areas meet the 3-tree proximity rule, fewer than 3% achieve 30% canopy cover. Crucially, we found that ambient greenness is concentrated in affluent areas, whereas proximity to parks is greatest in dense, often deprived urban centres. This exposes a multidimensional nature gap that has important implications for environmental justice.
This framework establishes a reproducible, open, and computationally efficient blueprint for evaluating urban nature equity at scale, supporting the integration of environmental justice metrics into national urban planning agendas. Read more about the broader project at The role of urban vegetation in human health.
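The 3-30-300 rule discussed above is straightforward to express per building once the three measurements are available. The sketch below assumes the common statement of the rule (at least 3 trees in view, at least 30% neighbourhood canopy cover, and a park within 300 m). The Building fields, the threshold constants and the sample values are hypothetical, and the published assessment may operationalise each component differently.

```python
from dataclasses import dataclass

MIN_VISIBLE_TREES = 3        # "3": trees in view of the building
MIN_CANOPY_FRACTION = 0.30   # "30": neighbourhood canopy cover
MAX_PARK_DISTANCE_M = 300.0  # "300": distance to the nearest park, in metres


@dataclass(frozen=True)
class Building:
    """Hypothetical per-building measurements derived from remote sensing."""
    visible_trees: int
    canopy_fraction: float
    park_distance_m: float


def rule_components(b):
    """Evaluate each part of the rule separately for one building."""
    return {
        "3_trees": b.visible_trees >= MIN_VISIBLE_TREES,
        "30_canopy": b.canopy_fraction >= MIN_CANOPY_FRACTION,
        "300_park": b.park_distance_m <= MAX_PARK_DISTANCE_M,
    }


def compliance_shares(buildings):
    """Share of buildings meeting each component, and all three together."""
    totals = {"3_trees": 0, "30_canopy": 0, "300_park": 0, "all": 0}
    for b in buildings:
        parts = rule_components(b)
        for name, ok in parts.items():
            totals[name] += ok
        totals["all"] += all(parts.values())
    return {name: count / len(buildings) for name, count in totals.items()}


# Made-up sample values, for illustration only.
sample = [
    Building(visible_trees=5, canopy_fraction=0.35, park_distance_m=120.0),
    Building(visible_trees=4, canopy_fraction=0.10, park_distance_m=250.0),
    Building(visible_trees=3, canopy_fraction=0.05, park_distance_m=900.0),
    Building(visible_trees=0, canopy_fraction=0.02, park_distance_m=80.0),
]
shares = compliance_shares(sample)
print(shares)
```

Reporting the three components separately, rather than only the combined pass rate, is what allows a pattern like "tree proximity mostly met, canopy cover rarely met" to show up in the results.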
4 Update: 2025-2026
4.1 TESSERA at scale
The TESSERA project has matured significantly. The paper "Applications of the TESSERA Geospatial Foundation Model to Diverse Environmental Mapping Tasks" demonstrated state-of-the-art accuracy across diverse downstream tasks from crop classification to canopy height estimation, confirming the "embedding-as-data" approach. We then applied TESSERA to tree species mapping in Trentino, showing data-efficient classification in temperate mountain forests. On the application side, we demonstrated finding solar farms with a 42k-parameter model, showing that tiny classifiers on top of TESSERA embeddings can match purpose-built models.
The GeoTessera 0.7 Python library switched to GeoParquet manifests for faster initialisation and added Zarr tensor storage. But the bigger shift came with our move to Zarr v3 sharded stores — restructuring millions of individual numpy files into a single cloud-native store per UTM zone. Community feedback quickly reshaped the layout (years as a dimension, NCHW ordering, 4096-pixel shards), and we retired our TESSERA-specific convention in favour of a shared geo-embeddings Zarr convention that also covers Clay and AEF foundation models.
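To illustrate why sharding suits range-request access, the sketch below maps a pixel or a window to the shard objects a client would need to fetch. Only the 4096-pixel shard edge and the year/channel/row/column dimension order come from the description above. The assumption that each shard covers a single year and all channels, and the helper names, are hypothetical. The "c/..." key pattern follows Zarr v3's default chunk key encoding, where each shard is stored under the indices of its position in the shard grid.

```python
SHARD_PIXELS = 4096  # shard edge length in pixels, as described in the text


def shard_key(year_index, row, col):
    """Key of the shard holding pixel (row, col) for one year.

    Assumes (hypothetically) an array laid out as (year, channel, row, col)
    where each shard spans one year and all channels, so the channel index
    of the shard grid is always 0.
    """
    return f"c/{year_index}/0/{row // SHARD_PIXELS}/{col // SHARD_PIXELS}"


def offset_in_shard(row, col):
    """Position of the pixel within its shard."""
    return row % SHARD_PIXELS, col % SHARD_PIXELS


def shards_for_window(year_index, row0, col0, height, width):
    """Keys of every shard that intersects a rectangular pixel window."""
    rows = range(row0 // SHARD_PIXELS, (row0 + height - 1) // SHARD_PIXELS + 1)
    cols = range(col0 // SHARD_PIXELS, (col0 + width - 1) // SHARD_PIXELS + 1)
    return [f"c/{year_index}/0/{r}/{c}" for r in rows for c in cols]


print(shard_key(0, 5000, 100))
print(offset_in_shard(5000, 100))
print(shards_for_window(2, 4000, 0, 200, 100))
```

A small window usually touches one shard, or a few when it straddles a boundary, so a client can fetch a handful of objects (or byte ranges within them) instead of listing and opening many small files.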
The TZE explorer now lets users browse embeddings from 2017-2025 in the browser, click individual 10 m pixels to see their 128-d vectors across years, and run interactive classification, all via HTTP range requests with no server. We're building an OxCaml inference pipeline using SIMD intrinsics for the native stack, and interactive OCaml notebooks for browser-based exploration.
The first TESSERA hackathon at the Indian AI Summit in Delhi explored integration with IIT-Delhi's CoRE Stack, and we discussed federated embedding mirrors — with India as a potential first node.
4.2 NAS/Royal Society biodiversity framework
Two companion papers from the US-UK Forum on Measuring Biodiversity were published in PNAS in early 2026. The nine recommendations for transforming biodiversity measurement call for capitalising on novel technology (including foundation models like TESSERA), agreeing standard methods, calibrating new technologies with existing data, and creating living databases of trusted information. The Biodiversity Monitoring Standards Framework provides a concrete architecture — a federated, auditable "chain of evidence" from ethical principles through data collection to reporting.
Our evidence TAP pipeline is cited as an example of the living databases that recommendation #5 calls for, and TESSERA fits into recommendation #1 on integrating novel technology with ground-truth data. The BMSF's federated design is compatible with our Four Ps for Building Massive Collective Knowledge Systems and the Shelter data provenance system we're developing.
4.3 LIFE as a UK Official Statistic
The Mapping LIFE on Earth metric, which quantifies species extinction risk from land-cover changes, was adopted as a UK government Official Statistic in 2025. The paper "Informing conservation problems and actions using an indicator of extinction risk: A detailed assessment of applying the LIFE metric" demonstrated five diverse applications: tropical deforestation monitoring, conservation effectiveness evaluation, corporate nature-related disclosures, national consumption tracking, and urban biodiversity assessment. The Yirgacheffe declarative geospatial library (Yirgacheffe: A Declarative Approach to Geospatial Data), which underpins the LIFE pipeline processing 80-terapixel habitat maps, saw area-per-pixel improvements that eliminated an entire pre-computed raster.
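One general way to avoid storing an area-per-pixel raster for a latitude/longitude grid is to compute each row's pixel area analytically, since in such a grid the area depends only on latitude. The sketch below does this under a spherical-Earth approximation. It is not Yirgacheffe's API or necessarily its method, and the function names and radius constant are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def pixel_area_m2(lat_top_deg, pixel_height_deg, pixel_width_deg):
    """Area of a lat/lon pixel whose northern edge sits at lat_top_deg.

    On a sphere: A = R^2 * delta_lon * (sin(lat_top) - sin(lat_bottom)).
    """
    lat_top = math.radians(lat_top_deg)
    lat_bottom = math.radians(lat_top_deg - pixel_height_deg)
    delta_lon = math.radians(pixel_width_deg)
    return EARTH_RADIUS_M ** 2 * delta_lon * (math.sin(lat_top) - math.sin(lat_bottom))


def row_areas(n_rows, pixel_deg, north_deg=90.0):
    """One area per raster row; every pixel in a row has the same area."""
    return [pixel_area_m2(north_deg - i * pixel_deg, pixel_deg, pixel_deg)
            for i in range(n_rows)]


# A global 1-degree grid: 180 rows of 360 pixels each.
areas = row_areas(n_rows=180, pixel_deg=1.0)
total_area = sum(areas) * 360
sphere_area = 4 * math.pi * EARTH_RADIUS_M ** 2
print(f"grid total / sphere area = {total_area / sphere_area:.6f}")
```

Because only one value per row is needed, a full-resolution area raster can be replaced by a short vector that is broadcast across columns at calculation time.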
We also published "Learning lessons from over-crediting to ensure additionality in forest carbon credits", connecting our remote sensing capabilities to the carbon finance accountability debate.
4.4 GeoCaml and native geospatial tooling
In parallel with the Python ecosystem, we're assembling geocaml, a suite of pure OCaml geospatial libraries. The centrepiece is ocaml-tiff (with Outreachy intern contributions on LZW decompression and write support), alongside PROJ bindings, a WKT codec, an R-Tree spatial index, and ocaml-geojson. The long-term goal is a native OCaml geospatial stack that doesn't depend on C bindings like GDAL, enabling tighter integration with our OxCaml high-performance systems and Zarr streaming.
4.5 Workshops and community building
We co-chaired the Programming for the Planet workshop at ICFP 2025, bringing together functional programmers and environmental scientists. PROPL 2026 returns at PLDI in Boulder with an action-oriented format focused on building a concrete planetary compute engine architecture.
The Foundational AI for Ecosystem Resilience workshop in December 2025 combined TESSERA with Concordia agent-based modelling to simulate ecosystem resilience. And the upcoming Rewilding the Web workshop in Edinburgh (May 2026) explores applying ecological insights to digital infrastructure resilience — the circular argument from our Steps towards an Ecology for the Internet paper that we need resilient infrastructure to monitor biodiversity, and ecological theory can teach us how to build it.