Anil Madhavapeddy

Conservation Evidence Copilots / Jan 2024

The Conservation Evidence team at the University of Cambridge has spent years screening 1.6m+ scientific papers on conservation, as well as manually summarising 8600+ studies relating to conservation actions. However, progress is limited by the specialised skills needed to screen and summarise relevant studies -- it took more than 75 person years to manually curate the current database and only a few 100 papers can be added each year! We are working on AI-driven techniques to accelerate addition of robust evidence to the CE database via automated literature scanning, LLM-based copilots and scanning of grey literature. We aim to provide co-pilots that augment human decision making to figure out how to categorise interventions much more quickly and accurately, and ultimately accelerate the positive impact of conservation actions.

[…918 words]

Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis
Currently ongoing with Radhika Agrawal and cosupervised with Alec Christie and Sadiq Jaffer
Evaluating LLMs for providing evidence-based information on conservation actions
Currently ongoing with Alex Wang and cosupervised with Alec Christie and Sadiq Jaffer
Accurate summarisation of threats for conservation evidence literature
Currently ongoing (MPhil) with Kittson Hamill and cosupervised with Sadiq Jaffer
Generating chunk-free embeddings for LLMs
Currently ongoing (MPhil) with Mark Jacobsen and cosupervised with Sadiq Jaffer
Evaluating RAG pipelines for conservation evidence
Completed by Radhika Iyer and cosupervised with Sadiq Jaffer in 2024
Crawling grey literature for conservation evidence
Completed by Shrey Biswas and Kacper Michalik and cosupervised with Sadiq Jaffer in 2024
Assessing mangrove literature for conservation evidence
Expired (Part II) and cosupervised with Sadiq Jaffer and Tom Worthington
Spatial and multi-modal extraction from conservation literature
Expired (MPhil) and cosupervised with Sadiq Jaffer, Alec Christie and Bill Sutherland

# 1st Jan 2024

projects ai biodiversity conservation evidence llms

Remote Sensing of Nature / Jan 2023

Measuring the world's forest carbon and biodiversity is made possible by remote sensing instruments, ranging from satellites in space (Landsat, Sentinel, GEDI) to ground-based sensors (ecoacoustics, camera traps, moisture sensors) that take regular samples and are processed into time-series metrics and actionable insights for conservation and human development. However, the algorithms for processing this data are challenging as the data is highly multimodal (multispectral, hyperspectral, synthetic aperture radar, or lidar), often sparsely sampled spatially, and not in a continuous time series. I work on various algorithms and software and hardware systems we are developing to improve the datasets we have about the surface of the earth.

[…904 words]

Mapping urban and rural British hedgehogs
Currently ongoing with Gabriel Mahler and cosupervised with Silviu Petrovan
Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas
Currently ongoing with Hannah McLoone and cosupervised with Charles Emogor and Rob Fletcher
Habitat mapping of the Cairngormes Connect restoration area
Currently ongoing with Isabel Mansley and cosupervised with David Coomes and Aland Chan
Battery-free wildlife monitoring with Riotee
Currently ongoing with Dominico Parish and cosupervised with Josh Millar
Affordable digitisation of insect collections using photogrammetry
Currently ongoing (MPhil) with Beatrice Spence, Arissa-Elena Rotunjanu and Anna Yiu and cosupervised with Tiffany Ki and Edgar Turner
Enhancing Navigation Algorithms with Semantic Embeddings
Currently ongoing (MPhil) with Gabriel Mahler
Foundation models for complex geospatial tasks
Currently ongoing (PhD) with Onkar Gulati and cosupervised with Sadiq Jaffer and David Coomes
Low-power sensing infrastructure for biodiversity
Currently ongoing (PhD) with Josh Millar and cosupervised with Hamed Haddadi
The role of urban vegetation in human health
Currently ongoing (PhD) with Andres Zuñiga-Gonzalez and cosupervised with Ronita Bardhan
Mapping hunting risks for wild meat in protected areas
Currently ongoing with Charles Emogor and cosupervised with Milind Tambe
Reverse emulating agent-based models for policy simulation
Completed (MPhil) by Pedro Sousa and cosupervised with Sadiq Jaffer in 2023
Species distribution modelling using CNNs
Completed (MPhil) by Emily Morris and cosupervised with David Coomes in 2023
Scalable agent-based models for optimized policy design
Completed (MPhil) by Sharan Agrawal and cosupervised with Srinivasan Keshav in 2022
Exploring Concurrency in Agent-Based Modelling with Multicore OCaml
Completed (Part II) by Martynas Sinkievič in 2021
Diffusion models for terrestrial predictions about land use change
Expired (MPhil) and cosupervised with Sadiq Jaffer

# 1st Jan 2023

projects biodiversity satellite sensing

Mapping LIFE on Earth / Jan 2023

Human-driven habitat loss is recognised as the greatest cause of biodiversity loss, but we lack robust, spatially explicit metrics quantifying the impacts of anthropogenic changes in habitat extent on species' extinctions. LIFE is our new metric that uses a persistence score approach that combines ecologies and land-cover data whilst considering the cumulative non-linear impact of past habitat loss on species' probability of extinction. We apply large-scale computing to map ~30k species of terrestrial vertebrates and provide quantitative estimates of the marginal changes in the expected number of extinctions caused by converting remaining natural vegetation to agriculture, and also by restoring farmland to natural habitat. We are also investigating many of the conservation opportunities opened up via its estimates of the impact on extinctions of diverse actions that change land cover, from individual dietary choices through to global protected area development.

[…653 words]

Using graph theory to define data-driven ecoregion and bioregion maps
Available and cosupervised with Daniele Baisero and Michael Dales
An access library for the world crop, food production and consumption datasets
Available and cosupervised with Alison Eyres and Thomas Ball
Using wasm to locally explore geospatial layers
Currently ongoing (Part II) with Sam Forbes and cosupervised with Michael Dales
Real-time mapping of changes in species extinction risks
Currently ongoing (PhD) with Emilio Luz-Ricca and cosupervised with Andrew Balmford

# 1st Jan 2023

projects biodiversity conservation hpc

Planetary Computing / Jan 2022

Planetary computing is our research into the systems required to handle the ingestion, transformation, analysis and publication of global data products for furthering environmental science and enabling better informed policy-making. We apply computer science to problem domains such as forest carbon and biodiversity preservation, and design solutions that can scalably processing geospatial data that build trust in the results via traceability and reproducibility. Key problems include how to handle continuously changing datasets that are often collected across decades and require careful access and version control.

[…618 words]

Autoscaling geospatial computation with Python and Yirgacheffe
Available and cosupervised with Michael Dales
Using computational SSDs for vector databases
Available (MPhil) and cosupervised with Sadiq Jaffer
ZFS replication strategies with encryption
Currently ongoing with Becky Terefe-Zenebe and cosupervised with Mark Elvers
Bidirectional Hazel to OCaml programming
Currently ongoing with Max Carroll and cosupervised with Patrick Ferris and Cyrus Omar
Gradually debugging type errors
Currently ongoing (Part II) with Max Carroll and cosupervised with Patrick Ferris
An imperative, pure and effective specification language
Currently ongoing (Part II) with Max Smith and cosupervised with Patrick Ferris
Effective geospatial code in OCaml
Currently ongoing (Part II) with George Pool and cosupervised with Michael Dales and Patrick Ferris
Privacy preserving emissions disclosure techniques
Currently ongoing (PhD) with Jessica Man and cosupervised with Martin Kleppmann
Computational Models for Scientific Exploration
Currently ongoing (PhD) with Patrick Ferris and cosupervised with Srinivasan Keshav
Assessing high-performance lightweight compression formats for geospatial computation
Completed (MPhil) by Omar Tanner and cosupervised with Sadiq Jaffer in 2023
Towards reproducible URLs with provenance
Expired (Part II) and cosupervised with Patrick Ferris
Composable diffing for heterogenous file formats
Expired (MPhil) and cosupervised with Patrick Ferris

# 1st Jan 2022

projects conservation satellite sensing systems

Trusted Carbon Credits / Jan 2021

The Cambridge Centre for Carbon Credits is an initiative I started with Andrew Balmford, David Coomes, Srinivasan Keshav and Thomas Swinfield, aimed at issuing trusted and verifiable carbon credits towards the prevention of nature destruction due to anthropogenic actions. We researched a combination of large-scale data processing (satellite and and sensor networks) and decentralised Tezos smart contracts to design a carbon marketplace with verifiable transactions that link back to trusted primary observations.

[…422 words]

Legal perspectives on integrity issues in forest carbon
Completed by Sophie Chapman and cosupervised with Eleanor Toye Scott in 2024
Meta Properties of Financial Smart Contracts
Completed (PhD) by Derek Sorensen and cosupervised with Srinivasan Keshav in 2023
Making GPS accurate in dense forests using sensor fusion
Completed by Keshav Sivakumar and cosupervised with Srinivasan Keshav and David Coomes in 2020

# 1st Jan 2021

projects carboncredits conservation systems

Information Flow for Trusted Execution / Jan 2020

There is now increased hardware support for improving the security and performance of privilege separation and compartmentalization techniques such as process-based sandboxes, trusted execution environments, and intra-address space compartments. We dub these "hetero-compartment environments" and observe that existing system stacks still assume single-compartment models (i.e. user space processes), leading to limitations in using, integrating, and monitoring heterogeneous compartments from a security and performance perspective. This project explores how we might deploy techniques such as fine-grained information flow control (DIFC) to allow developers to securely use and combine compartments, define security policies over shared system resources, and audit policy violations and perform digital forensics across hetero-compartments.

[…177 words]

Decentralised Capability-based Code Collaboration using Matrix
Completed (Part II) by Samuel Wedgwood in 2022
Secure Programming with Dispersed Compartments
Completed (PhD) by Zahra Tarkhani in 2022
Security analysis of brain-computing interfaces
Completed by Malachy O'Connor Brown and Oscar Hill and cosupervised with Zahra Tarkhani and Lorena Qendro in 2021
Void Processes: Minimising privilege by default
Completed (MPhil) by Jake Hillion in 2021

# 1st Jan 2020

projects security systems tee xen

Interspatial OS / Jan 2018

Digital infrastructure in modern urban environments is currently very Internet-centric, and involves transmitting data to physically remote environments. The cost for this is data insecurity, high response latency and unpredictable reliability of services. I am working on Osmose -- a new OS architecture that inverts the current model by building an operating system designed to securely connect physical spaces with extremely low latency, high bandwidth local-area computation capabilities and service discovery.

[…452 words]

Low power audio transcription with Whisper
Currently ongoing with Dan Kvit and cosupervised with Josh Millar
3D printing the planet (or bits of it)
Currently ongoing with Finley Stirk and cosupervised with Michael Dales
Low-latency wayland compositor in OCaml
Currently ongoing (Part II) with Tom Thorogood and cosupervised with Ryan Gibb
Building bigraphs of the real world
Currently ongoing (Part II) with Roy Ang and cosupervised with Ryan Gibb
Interspatial Networking with DNS
Currently ongoing (PhD) with Ryan Gibb and cosupervised with Jon Crowcroft
Scheduling for Reduced Tail Latencies in Highly Utilised Datacenters
Completed (PhD) by Smita Vijayakumar and cosupervised with Evangelia Kalyvianaki in 2023
Spatial Name System
Completed (MPhil) by Ryan Gibb and cosupervised with Jon Crowcroft in 2022
A DSL for decentralised identity in OCaml
Completed (Part II) by Michał Mgeładze-Arciuch and cosupervised with Patrick Ferris in 2022
Deep learning for decomposing sound into vector audio
Expired (MPhil) and cosupervised with Trevor Agus

# 1st Jan 2018

projects distributed networking spatial systems unikernels vr

OCaml Labs / Jan 2012

I founded a research group called OCaml Labs at the University of Cambridge, with the goal of pushing OCaml and functional programming forward as a platform, making it a more effective tool for all users (including large-scale industrial deployments), while at the same time growing the appeal of the language, broadening its applicability and popularity. Over a decade, we retrofitted multicore parallelism into the mainline OCaml manager, wrote a popular book on the language, and helped start and grow an OCaml package and tooling ecosystem that is thriving today.

[…1931 words]

A hardware description language using OCaml effects
Available (MPhil) and cosupervised with KC Sivaramakrishnan and Andy Ray
Runtimes à la carte: crossloading native and bytecode OCaml
Currently ongoing with Jeremy Chen and cosupervised with David Allsopp
Effects based scheduling for the OCaml compiler pipeline
Currently ongoing with Lucas Ma and cosupervised with David Allsopp
Parallel traversal effect handlers for OCaml
Completed (Part II) by Sky Batchelor and cosupervised with Patrick Ferris in 2024
Implementing a higher-order choreographic language
Completed (Part II) by Rokas Urbonas and cosupervised with Dmirtij Szamozvancev in 2024
Using effect handlers for efficient parallel scheduling
Completed (MPhil) by Bartosz Modelski in 2022
Probabilistic Programming in OCaml
Completed (Part II) by Hari Chandrasekaran and cosupervised with Tom Kelly and Liang Wang in 2018
Concurrent revisions for OCaml
Completed (Part II) by Dimitar Popov in 2013
Analysis of the Raft Consensus Protocol
Completed (Part II) by Heidi Howard in 2012
Macro- and Micro-benchmarking in OCaml
Completed (Part II) by Sebastian Funk in 2012
Functional Reactive Web Applications
Completed (Part II) by Henry Hughes in 2010

# 1st Jan 2012

projects fp multicore ocaml packaging systems

Unikernels / Jan 2010

I proposed the concept of "unikernels" -- single-purpose appliances that are compile-time specialised into standalone bootable kernels, and sealed against modification when deployed to a cloud platform. In return they offer significant reduction in image sizes, improved efficiency and security, and reduce operational costs. I also co-founded the MirageOS project which is one of the first complete unikernel frameworks, and also integrated them to create the Docker for Desktop apps that are used by hundreds of millions of users daily.

[…1565 words]

A strongly consistent index for email using git and MirageOS
Completed (Part II) by Oliver Hope and cosupervised with David Allsopp in 2019
Distributed Task Scheduling Framework over Irmin
Completed (Part II) by Mohammed Daudali in 2019
CausalRPC: a traceable distributed computation framework
Completed (Part II) by Craig Ferguson in 2018
Consolidating Trust for Client Groups that use TLS to Secure Connections
Completed (Part II) by Johann Beleites and cosupervised with David Sheets in 2014
Control flow analysis for privilege separation
Completed by Chris Harding and Ross McIlroy and cosupervised with Robert M Watson in 2011
Extending 64-bit MIPS support for LLVM
Completed by William Morland and cosupervised with Robert M Watson in 2011

# 1st Jan 2010

projects cloud embedded mirageos security systems tee unikernels xen

Personal Containers / Jan 2009

As cloud computing empowered the creation of vast data silos, I investigated how decentralised technologies might be deployed to allow individuals more vertical control over their own data. Personal containers was the prototype we built to learn how to stem the flow of our information out to the ad-driven social tarpits. We also deployed personal containers in an experimental data locker system at the University of Cambridge in order to incentivise lower-carbon travel schemes.

[…928 words]

Improving Resilience of ActivityPub Services
Completed (Part II) by Gediminas Lelešius in 2023
Simulating XMPP Group Communication
Completed (Part II) by Farhān Mannān in 2011

# 1st Jan 2009

projects cloud databox selfhosting

Ubiquitous Interaction Devices / Jan 2003

I investigated how to interface the new emerging class of smartphone devices (circa 2002) with concepts from ubiquitous computing such as location-aware interfaces or context-aware computing. I discovered the surprisingly positive benefits of piggybacking on simple communications medium such as audible sound and visual tags. Our implementations of some of these ended up with new audio ringtone and visual smart tags that worked on the widely deployed mobile phones of the era.

In 2003, the mobile phone market had grown tremendously and given the average consumer access to cheap, small, low-powered and constantly networked devices that they could reliably carry around. Similarly, laptop computers and PDAs became a common accessory for businesses to equip their employees with when on the move. The research question then, was how to effectively interface them with existing digital infrastructure and realise some of the concepts of ubiquitous computing such as location-aware interfaces or context-aware computing.

[…1178 words]

# 1st Jan 2003

projects audio hci systems ubicomp

Functional Internet Services / Jan 2003

My PhD dissertation work proposed an architecture for constructing new implementations of standard Internet protocols with integrated formal methods such as model checking and functional programming that were then not used in deployed servers. A more informal summary is "rewrite all the things in OCaml from C!", which lead to a merry adventure into implementing many networks protocols from scratch in a functional style, and learning lots about how to enforce specifications without using a full blown proof assistant.

[…1092 words]

Not-quite-so-broken TLS in OCaml
Completed by Hannes Mehnert and David Kaloper-Mersinjak and cosupervised with Peter Sewell in 2014
Functional ABNF parser generators
Completed (Part II) by Nicholas Skehin in 2011

# 1st Jan 2003

projects fp mirageos ocaml

Xen Hypervisor / Jan 2002

I was on the original team at Cambridge that built the Xen hypervisor in 2002 -- the first open-source "type-1" hypervisor that ushered in the age of cloud computing and virtual machines. Xen emerged from the Xenoservers project at the CL SRG, where I started my PhD and hacked on the emerging codebase and subsequently worked on the development of the commercial distribution of XenServer.

[…529 words]

# 1st Jan 2002

projects cloud distributed opensource security systems xen