Conservation Evidence Copilots

The Conservation Evidence team at the University of Cambridge has spent years screening more than 1.6 million scientific papers on conservation, and has manually summarised over 8,600 studies relating to conservation actions. Progress is limited by the specialised skills needed to screen and summarise relevant studies: curating the current database took more than 75 person-years of manual effort, and only a few hundred papers can be added each year! We are working on AI-driven techniques to accelerate the addition of robust evidence to the CE database via automated literature scanning, LLM-based copilots and the scanning of grey literature. We aim to provide copilots that augment human decision-making, helping experts categorise interventions much more quickly and accurately, and ultimately accelerating the positive impact of conservation actions.

The goal of the Conservation Evidence project is to transform conservation so that evidence is routinely embedded in decisions to improve outcomes for biodiversity and society. CE is becoming the authoritative, most comprehensive, freely available platform for evidence-led conservation and is starting to profoundly change the way in which conservationists access and use evidence for improving the state of the planet.

Team AICN in the CCI building, Feb 2024

The CE collation and synthesis work has significantly improved the availability of evidence for use in conservation practice. It remains the only resource of evidence synopses for biodiversity conservation, and the largest database of effectiveness reviews of actions outside the field of medicine. Carrying out reviews at an industrial scale means they cost a fraction of what reviews do in comparable fields, such as medicine. Using subject-wide evidence synthesis, CE systematically searches the literature and summarises the results of (and provides citations for) each study testing the effectiveness of an action. As of April 2024, CE has read 1.6 million paper titles in 17 languages (326 non-English journals) and reviewed evidence for more than 3,600 conservation actions, freely available on their website, with collaboration from over 380 international academics and practitioners.

1 Accelerating literature surveys with LLMs

We in the Computer Science department got involved in 2023 as part of the AI@CAM competition to harness the momentum behind machine learning to accelerate conservation actions. Our overall aim is to help CE dramatically accelerate their data searching and data extraction pipelines. Currently, the searching of literature and summarising of key data is undertaken by human experts. Although this method of working is time-consuming, it benefits from being thorough and replicable. The main difficulties lie in deciphering the subtleties of study designs and methodologies, and in judging whether controls are actually appropriate for testing the effectiveness of the specified action. Any LLM-based automation that we deploy must account for these as part of the validation pipeline.

Our evaluation of LLM performance against human experts on conservation intervention questions showed that properly configured LLMs with retrieval augmentation can achieve competitive performance on evidence synthesis tasks. However, out-of-the-box general LLMs performed poorly and risk misinforming decision-makers, reinforcing our commitment to careful validation and human-in-the-loop approaches.
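To make "retrieval augmentation" concrete, here is a minimal sketch of the idea: retrieve the evidence summaries most similar to a question, then ground the model's prompt in them. Everything here is illustrative, not the evaluation setup from the paper; the toy corpus, the bag-of-words similarity, and names like `retrieve` and `build_prompt` are invented for this example (a real system would use a neural encoder and a full evidence database).

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a neural encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank evidence summaries by similarity to the question.
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, corpus: list[str]) -> str:
    # Ground the model in retrieved evidence before asking the question.
    passages = retrieve(question, corpus)
    context = "\n".join(f"- {p}" for p in passages)
    return (f"Evidence:\n{context}\n\n"
            f"Question: {question}\nAnswer using only the evidence above.")

corpus = [
    "Installing nest boxes increased occupancy by cavity-nesting birds.",
    "Fencing reduced grazing pressure but had mixed effects on plant diversity.",
    "Artificial reefs attracted fish but effects on populations were unclear.",
]
print(build_prompt("Do nest boxes help cavity-nesting birds?", corpus))
```

The key property, echoed in the evaluation results above, is that the model answers from curated evidence rather than from its pretraining alone.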

The collaboration originally began in 2022 as part of the Computer Science 1B group projects, when Bill Sutherland, Sam Reynolds and Alec Christie from Zoology proposed a group project related to CE. A team of undergraduate students (including Jamie Cao) trained an ML model to facilitate searching for papers and indexing relevant articles by species and habitat. After the group project completed with encouraging results, Sadiq Jaffer and I joined the collaboration and, with help from the Cambridge Office for Scholarly Communication, built up a comprehensive (and legal!) corpus of millions of academic papers related to conservation evidence.

Through 2024, we evaluated ten different LLMs against human experts using the CE database, leading to our LLM evaluation paper showing promising but cautious results. We were joined in the summer of 2024 by three CST undergraduates, Radhika Iyer, Shrey Biswas and Kacper Michalik, who built out various elements of the system. Moving into 2025, our focus shifts to production deployment of hybrid retrieval systems while maintaining rigorous validation against the expanding challenges of AI contamination in scientific literature.

2 Living Evidence Databases

In October 2025, we published AI-assisted Living Evidence Databases for Conservation Science describing a complete, end-to-end pipeline for maintaining living evidence databases. Traditional systematic reviews become outdated quickly, but living evidence databases offer a dynamic alternative by continuously processing new evidence as it emerges. Our pipeline, designed to operate on local infrastructure using self-hosted models, ingests and normalizes documents from academic publishers, screens them for relevance using a multi-stage process, and extracts structured data according to predefined schemas.
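The staged structure described above can be sketched as a chain of progressively stricter (and costlier) filters, with schema-guided extraction only running on the survivors. This is a minimal illustration under invented assumptions, not the pipeline from the paper: the stage names, keyword list, and the stubbed relevance and extraction functions all stand in for real components (the relevance stage would call a self-hosted LLM, and extraction would follow the predefined schemas).

```python
from dataclasses import dataclass, field

@dataclass
class Paper:
    title: str
    abstract: str
    meta: dict = field(default_factory=dict)

def stage_keyword(p: Paper) -> bool:
    # Cheap first pass: keep anything mentioning a conservation action term.
    terms = ("restoration", "nest box", "fencing", "reintroduction")
    text = (p.title + " " + p.abstract).lower()
    return any(t in text for t in terms)

def stage_relevance(p: Paper) -> bool:
    # Stand-in for an LLM relevance call; here a stub that checks whether
    # the abstract reports a tested intervention.
    text = p.abstract.lower()
    return "effect" in text or "compared" in text

def extract(p: Paper) -> dict:
    # Stand-in for schema-guided extraction: emit fields from a predefined
    # schema (here just title and action, stubbed from metadata).
    return {"title": p.title, "action": p.meta.get("action", "unknown")}

def pipeline(papers: list[Paper]) -> list[dict]:
    # Cheap filters run first over everything; the expensive extraction
    # step only sees the small set of papers that pass every stage.
    kept = [p for p in papers if stage_keyword(p)]
    kept = [p for p in kept if stage_relevance(p)]
    return [extract(p) for p in kept]
```

Ordering stages by cost is what makes it feasible to run continuously over incoming literature on local infrastructure.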

The system features a hybrid retrieval model combining keyword search with semantic understanding, and integrates a human-AI collaborative process for refining inclusion criteria from complex protocols. We also incorporated an established, statistically-principled stopping rule to ensure efficiency. In baseline evaluation against a prior large-scale manual review, the fully automated pipeline achieved 97% recall and identified significant numbers of relevant studies not included in the original review, demonstrating its viability as a foundational tool for maintaining living evidence databases. This work was earlier presented at a workshop on AI for evidence synthesis in March 2025 that brought together policymakers, researchers, and practitioners to discuss the responsible integration of AI into evidence synthesis workflows.
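One common way to combine a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF); the paper describes a hybrid retrieval model, but the specific fusion method below is an assumption made for illustration, as are the document names and example rankings.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Reciprocal rank fusion: each ranking contributes 1/(k + rank) per
    # document, so items ranked highly by either retriever float to the top.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_rank = ["d2", "d1", "d3"]   # e.g. a BM25 keyword ordering
semantic_rank = ["d1", "d4", "d2"]  # e.g. a dense-embedding ordering
fused = rrf([keyword_rank, semantic_rank])
```

A virtue of rank-based fusion is that it needs no score normalisation across retrievers, which differ wildly in scale between lexical and embedding models.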

3 Challenges in the AI-Evidence Era

Our work on LLM-based evidence synthesis has gained urgency in 2025 as the body of scientific literature faces challenges from AI-contaminated papers. The publication of ever-larger numbers of problematic papers, including fake ones generated by AI, represents what our Nature paper "Will AI speed up literature reviews or derail them entirely?" argues is an existential crisis for the established way of doing evidence synthesis. Recent analysis suggests that up to 2% of submitted papers may be AI-generated, with some estimates potentially much higher. This contamination poses particular risks for evidence databases like CE, as fake papers with plausible-sounding conservation interventions could mislead decision-makers if incorporated without rigorous validation.

The paper argues that while AI-generated papers pose serious threats to literature reviews, a new approach using AI might also offer solutions. As AI generation becomes more sophisticated, our validation pipeline needs to evolve beyond simple detection to robust experimental design verification and cross-reference validation. The paradox is that AI is both the problem and potentially part of the solution; but only if we build systems with rigorous validation, traceability, and human oversight at their core. See also A FAIR Case for a Live Computational Commons for the importance of FAIR principles throughout science.

4 Technology for Conservation, Not Division

We also conducted a horizon scan in 2024, which sparked discussions about ensuring AI serves conservation equitably. In response to concerns raised by Katie Murray and colleagues about the potential for AI to divide conservation, we published a response in Trends in Ecology & Evolution. As outlined in both that response and our reflections on technology uniting conservation, we are committed to developing CE tools that:

  • Respect and amplify human expertise through "human-in-the-loop" methods, rather than replacing it
  • Follow participatory design principles with conservation practitioners
  • Maintain open source and open data approaches with thorough documentation to facilitate reproducible outputs
  • Address capacity building needs, particularly in the Global South with respect to AI capability
  • Keep conservation goals, not short-term technology trends, at the centre of our research

Our AI@CAM interview highlights this approach: we are building detailed models of the world that can be queried by policy makers to help make informed decisions. The technology serves the evidence, and the evidence serves conservation practices.
