This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is currently being worked on by Kittson Hamill. It is supervised by Anil Madhavapeddy and Sadiq Jaffer as part of the Conservation Evidence Copilots project.
In the Conservation Evidence Copilots project, we are interested in constructing a taxonomy of threats to wildlife from the literature. This involves scanning the body of conservation literature and gathering and synthesising evidence for conservation interventions from a threats perspective. Once the relevant text has been retrieved, it needs to be summarised in a way that is accurate, concise and relevant, and the summary then needs to be verified by human experts. This is particularly important for conservation evidence, where the key findings need to be communicated clearly to inform policy and practice.
This project therefore investigates how LLMs and retrieval-augmented generation (RAG) pipelines can generate threats from the Conservation Evidence (CE) literature, and how the accuracy of those generated threats can be verified. Our goal is to develop a pipeline that can reliably go from extracting relevant information from text to producing a summary that a human can verify as correct.