Weeknote 2026/w7: Storage, Lego, Echo, and the IUCN

Growing the Ceph cluster for TESSERA embeddings, a Lego brainstorming session for the Evidence TAP, hosting Echo Labs from ARIA, and Shane's IUCN Red List seminar.

The week was dominated by having to sort out yet another 300TB of SSDs to grow our Ceph cluster, as the TESSERA embeddings being generated on Vultr are even more optimised and being spit out at a rate of 4TB per day. But I did get around to playing with Lego as well!
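
As a rough sanity check on those numbers, here is a back-of-envelope sketch of how long a 300TB expansion lasts at 4TB per day. The function name and the replication factor of 3 are assumptions for illustration only; the real figure depends on how the Ceph pools are actually configured (replication or erasure coding) and on how much headroom is left free.

```python
def days_until_full(capacity_tb: float, ingest_tb_per_day: float, replication: int = 1) -> float:
    """Days of ingest that a given raw capacity can absorb.

    `replication` is a hypothetical redundancy multiplier: every logical
    terabyte written is assumed to consume `replication` terabytes of raw disk.
    """
    if ingest_tb_per_day <= 0 or replication < 1:
        raise ValueError("ingest rate must be positive and replication at least 1")
    return capacity_tb / (ingest_tb_per_day * replication)


# Figures from this week: 300TB of new SSDs, 4TB of embeddings per day.
print(days_until_full(300, 4))     # no redundancy: 75.0 days
print(days_until_full(300, 4, 3))  # assumed 3x replication: 25.0 days
```

Either way the new disks buy weeks to months rather than years, which is why the consolidation work below matters.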

1 Storage and OxCaml

Mark Elvers and I are getting more comfortable with using Ceph to manage all the embeddings, and Malcolm Scott helped us to wire up a bunch of old machines donated by Jane Street that we can use to distribute them. We now have multiple large storage blobs that we're consolidating, ranging from TrueNAS mirrored storage (from the 4C days) over to a backup Ceph cluster at Scaleway (kindly sponsored by Tarides) and also the OCaml infrastructure and documentation generation that sucks up our maintenance time.

Meanwhile, use of oxmono for our OxCaml infrastructure is working extremely well, and I'm solely using that with worktrees and branches to deploy services now. The OxCaml compiler is rock solid and with a monorepo, managing packages is a breeze. Jon Ludlam is also using it to fix OxCaml doc generation, which will be extremely useful for agentic coding (since the agents can read the generated docs to determine if the interfaces are 'good' or contain lots of hidden modules and other bad practice).

The only real downside with the monorepo is that a dune exec takes about 3s to initialise on my laptop from scanning all the dune files. I'll investigate some ways to speed this up, as the Dune cache doesn't help much until all the dune files have been parsed.

It was also good to see other OCaml projects progressing: Mark is getting more familiar with OxCaml performance and the OCaml TIFF library in Outreachy is going well. I hope to use this soon in TESSERA-oxcaml...

2 Lego for the Evidence TAP

We're just finalising a grant that's been awarded to do the exciting next phase of the Conservation Evidence Copilots "Cambridge Evidence TAP" project by generalising it to other fields (such as education, public health and climate adaptation). While waiting for the grant paperwork to finish, my colleague from Education, Jenny Gibson, had the brainwave of inviting Gina Gomez de la Cuesta from Play Included to come along to facilitate a brainstorming session about what we'll work on when the project starts.

This involved getting our friends from across the university and CSaP together: me and Sadiq Jaffer from Computer Science, Lynn Dicks and Sam Reynolds from Conservation, Jenny Gibson from Education, Mélanie Gréaux from the WHO, Rob Doubleday and Nicola Buckley from the Judge, and Alex Marcoci from CSER. I've got to say that the use of Lego to break the ice among so many fields, and also to help us form thoughts about a very complex and nuanced area, was just brilliant.

And after a long week, it was just nice to kick back and play with Lego. I'm really looking forward to doing more work with our friends from other departments: working with conservation has gotten me out and about finding hedgehogs, so now learning about play in education, development and learning will hopefully help me understand how to be a better teacher! I also highly recommend contacting Gina at Play Included if you would like to try this yourself for one of your own projects.

3 Echo

After last week's talk at ARIA, I enjoyed hosting the founders of a new focussed research organisation called 'Echo Labs', funded by ARIA, who are doing very ambitious things with AI and biodiversity. More on what they're up to after they officially launch, but I was hugely impressed with their drive and focus to deliver near-term impact. I hope we will continue to build collaboration with them as part of our Centre for Landscape Regeneration.

While they were visiting the CCI, there was also a dramatic unveiling of the new CCI logo, so everyone poured into the coffee room. Well done on a successful launch; the new logo is very practical and inclusive and pretty and I'll be using it everywhere once I get my mittens on the high res versions!

4 IUCN Red List

Shane Weisz continued his storming first-year PhD by giving a fantastic EEG seminar on his PhD work to date on speeding up Red List assessment. He's off to the inaugural Conservation Technology Conference in Peru next week to speak about this work there as well, which should be most exciting (and hopefully filled with much birding).

On another storage topic, I've been working on syncing GBIF locally as Parquet files (about 9TB) so we don't need to hammer their API. I noticed that the AWS Open Data hosting hadn't been updated for six months, but a single Bluesky post was enough to get GBIF's attention and they fixed it within days. Props to them for being so responsive, but it's also interesting that almost no one else seems to locally copy the data regularly.
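
A staleness check like the one that caught this could be automated. The sketch below is only illustrative: it assumes the snapshots are published under date-stamped prefixes (the names shown are hypothetical, not GBIF's actual layout) and that a listing of those prefixes has already been fetched by whatever means. The function names and the 45-day threshold are likewise my own choices for the example.

```python
from datetime import date


def latest_snapshot(prefixes):
    """Return the newest ISO date found in any path component, or None.

    Components that are not valid ISO dates (e.g. 'occurrence') are ignored.
    """
    dates = []
    for prefix in prefixes:
        for part in prefix.strip("/").split("/"):
            try:
                dates.append(date.fromisoformat(part))
            except ValueError:
                continue  # not a date component, skip it
    return max(dates) if dates else None


def is_stale(prefixes, today, max_age_days=45):
    """True if there is no snapshot at all, or the newest is older than max_age_days."""
    newest = latest_snapshot(prefixes)
    return newest is None or (today - newest).days > max_age_days


# Hypothetical listing: the newest snapshot is roughly six months old.
listing = ["occurrence/2025-07-01/", "occurrence/2025-08-01/", "occurrence/readme"]
print(latest_snapshot(listing))
print(is_stale(listing, today=date(2026, 2, 10)))
```

Running something like this before each sync would flag a stalled mirror early, rather than relying on someone happening to notice.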

5 Reading

Simon Peyton Jones pointed me to his friend Julian Allwood's new book, Promise the Earth: A Safe Climate in Good Faith, which I picked up from the CUP bookstore. It's an intriguing book in that it's written by an engineer and a theologian, and makes the argument that restraint is the only way forward.

This brilliant book makes the case that rational self-interest alone will not bring about the radical changes to the world economy required to protect our children from the climate breakdown that is coming. The role of faith and compassion has historically played a major role in the way that people cooperate and plan for the future; this persuasive book makes the case that it is needed more than ever. -- Review by Professor Mark Miodownik MBE FREng, 2025

6 Next week

Jon Ludlam finished off undergraduate exam questions for Foundations of Computer Science and I am going to do a review on them -- the only teaching thing I am doing this sabbatical year.

On Monday, I'm jetting off to the India AI summit in Delhi and to see some relatives for the week. I'll be back in Cambridge on Friday to host Shriram Krishnamurthi, who is passing through on sabbatical. Very excited to see him again to continue our ICFP conversations about teaching!

References

[1] Madhavapeddy (2025). Foundations of Computer Science. 10.59350/qms3q-ymn65
[2] Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. 10.59350/w1jvt-8qc58
[3] Jaffer et al. (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. 10.33774/coe-2025-rmsqf
[4] Reynolds et al. (2025). Will AI speed up literature reviews or derail them entirely? Nature Publishing Group. 10.1038/d41586-025-02069-w