AI-assisted Living Evidence Databases for Conservation Science
Sadiq Jaffer, William Morgan, Sam Reynolds, Alec Christie, Anil Madhavapeddy, and Bill Sutherland.
Working paper at Cambridge Open Engage.
Living evidence databases offer a robust and dynamic alternative to static systematic reviews but require a resilient technical infrastructure for continuous evidence processing.
This working paper describes the architecture and implementation of a complete, end-to-end pipeline for this purpose, developed initially for the conservation science domain. Designed to operate on local infrastructure using self-hosted models, the system ingests and normalizes documents from academic publishers, screens them for relevance using a multi-stage process, and extracts structured data according to a predefined schema.
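The three stages named above (ingest and normalize, screen, extract) can be pictured as a simple chain. The following is a minimal sketch under stated assumptions, not the authors' implementation: `Document`, `normalize`, `run_pipeline`, and the `screen` and `extract` callables are hypothetical names, and the self-hosted models are replaced by plain functions supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class Document:
    """Hypothetical record for one ingested paper."""
    doc_id: str
    text: str
    relevant: Optional[bool] = None
    extracted: dict = field(default_factory=dict)


def normalize(raw: dict) -> Document:
    # Collapse runs of whitespace so later stages see uniform text.
    return Document(doc_id=raw["id"], text=" ".join(raw["text"].split()))


def run_pipeline(
    raw_docs: Iterable[dict],
    screen: Callable[[Document], bool],
    extract: Callable[[Document], dict],
) -> list:
    """Normalize every document, screen it, and extract only from relevant ones."""
    results = []
    for raw in raw_docs:
        doc = normalize(raw)
        doc.relevant = screen(doc)
        if doc.relevant:
            doc.extracted = extract(doc)
        results.append(doc)
    return results
```

In this sketch the screening and extraction steps are injected as functions, so a keyword filter can stand in for a model during testing, for example `run_pipeline(raw_docs, lambda d: "nest" in d.text.lower(), lambda d: {"n_words": len(d.text.split())})`.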
Key features include a hybrid retrieval model, a human-AI collaborative process for refining inclusion criteria from complex protocols, and the integration of an established, statistically principled stopping rule to ensure efficiency. In a baseline evaluation against a prior large-scale manual review, the fully automated pipeline achieved 97% recall and identified a significant number of relevant studies not included in the original review, demonstrating its viability as a foundational tool for maintaining living evidence databases.
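Recall against a prior review is the fraction of the review's included studies that the pipeline also includes. The sketch below illustrates that calculation and the set of additional candidates the pipeline surfaces. It assumes studies are matched by a shared identifier, and the function name and the toy identifiers are hypothetical, not taken from the paper's evaluation.

```python
def recall_and_new_candidates(pipeline_included, review_included):
    """Return (recall, candidates the pipeline included but the review did not)."""
    pipeline_included = set(pipeline_included)
    review_included = set(review_included)
    if not review_included:
        raise ValueError("the reference review must include at least one study")
    # Studies from the manual review that the pipeline also recovered.
    recovered = pipeline_included & review_included
    recall = len(recovered) / len(review_included)
    # Studies outside the original review; these still need human
    # verification before they can be counted as relevant.
    new_candidates = pipeline_included - review_included
    return recall, new_candidates


# Toy identifiers for illustration only.
recall, new_candidates = recall_and_new_candidates(
    pipeline_included={"s1", "s2", "s3", "s9"},
    review_included={"s1", "s2", "s3", "s4"},
)
```

With these toy sets, three of the four reviewed studies are recovered, so recall is 0.75, and `s9` is the only new candidate.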