Dear ACM, you're doing AI wrong but you can still get it right / Dec 2025 / DOI
There's outrage in the computer science community over a new feature rolled out by the ACM Digital Library that generates often inaccurate AI summaries. To make things worse, this is hidden behind a 'premier' paywall, so authors without access (for example, having graduated from University) can't even see what is being said.
2025 Advent of Agentic Humps: Building a useful O(x)Caml library every day / Dec 2025
Agentic programming has been getting a hilariously bad rap in the OCaml community recently, but it's definitely here to stay despite the
Day 1: Crockford for Crockford Base32 encoding.
Day 2: Jsonfeed for an implementation of the JSONFeed 1.1 spec.
Day 3: XDGe for an XDG Directory specification with Eio capabilities.
Day 4: Claudeio for a Claude OCaml/Eio SDK so I can use Claude to write more Eio.
Day 5: Bytesrw-eio for a Bytesrw/Eio adapter, and automating opam metadata via a custom Claude skill.
Day 6: Yamlrw for a pure OCaml Yaml 1.2 library, to replace ocaml-yaml's C binding.
Day 7: Yamlt to allow jsont codecs to be serialised to Yaml as well as JSON.
Day 8: Sortal: a contacts management CLI using Yaml, Git and Cmdliner.
Day 9: Sortal-Bonsai: adding a Bonsai_term terminal UI to Sortal via Async.
Day 10: Sortal-Mosaic: adding a Mosaic terminal UI to Sortal via Eio.
Day 11: Cookeio, Public-suffix, Punycode: parsing Internet RFCs to build cookie libraries.
Day 12: Conpool: Eio TLS/TCP connection pooling and self-contained performance viz.
Day 13: Requests: Heckling an OCaml HTTP client from 50 other implementations.
Day 14: Karakeep: Live agentic API construction for the Karakeep app.
Day 15: Htmlrw: Vibespiling Rust/Python into a 100% compliant HTML5 manipulation library.
Day 16: Json-pointer: Vibesplaining specifications by generating OCaml Javascript notebooks.
Day 17: Jmap: Vibemailing little CLI agents to bring my JMAP messages under control.
AoAH Day 17: OCaml JMAP to plaster my painful email papercuts / Dec 2025
After building a
Luckily, I've been self-hosting my
AoAH Day 16: Vibesplaining JSON Pointers using OCaml/Javascript / Dec 2025
After the successful
I decided to build a JMAP email client implementation in
OCaml that I need for myself. Being a professor is not accurately measured by research or teaching outputs, but by how overloaded your INBOX is.
OCaml has superb tooling to help with this; it can not only compile to efficient native code but also to JavaScript and WASM that runs standalone in the browser. I turned to my colleagues
Today's work resulted in an ocaml-json-pointer (RFC6901) implementation along with an interactive notebook tutorial that bundles the entire OCaml compiler toolchain alongside it. There's even another one for Yaml just to illustrate how easy this is to replicate once we've built the first one.
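For readers who haven't met RFC 6901 before, here is a minimal sketch of what parsing a JSON Pointer involves. It is not the ocaml-json-pointer API: the names unescape_token and parse_pointer are made up for illustration, and unlike a strict implementation it passes invalid escapes such as "~2" through unchanged rather than rejecting them.

```ocaml
(* Illustrative sketch of RFC 6901 pointer parsing. The function names are
   hypothetical and are not taken from ocaml-json-pointer. *)

(* Unescape one reference token: "~1" becomes "/" and "~0" becomes "~".
   A single left-to-right pass means "~01" correctly yields "~1". *)
let unescape_token (tok : string) : string =
  let n = String.length tok in
  let buf = Buffer.create n in
  let rec go i =
    if i < n then
      if tok.[i] = '~' && i + 1 < n && tok.[i + 1] = '1' then begin
        Buffer.add_char buf '/';
        go (i + 2)
      end else if tok.[i] = '~' && i + 1 < n && tok.[i + 1] = '0' then begin
        Buffer.add_char buf '~';
        go (i + 2)
      end else begin
        (* Assumption: anything else (including a stray '~') is kept as-is. *)
        Buffer.add_char buf tok.[i];
        go (i + 1)
      end
  in
  go 0;
  Buffer.contents buf

(* Split a pointer into unescaped reference tokens. The empty string refers
   to the whole document; any other pointer must begin with '/'. *)
let parse_pointer (s : string) : (string list, string) result =
  if s = "" then Ok []
  else if s.[0] <> '/' then Error "a non-empty pointer must start with '/'"
  else
    let body = String.sub s 1 (String.length s - 1) in
    Ok (List.map unescape_token (String.split_on_char '/' body))
```

With this sketch, parse_pointer "/foo/0" gives the tokens "foo" and "0", which a caller would then use to walk into an object member and an array index in turn.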
AoAH Day 15: Porting a complete HTML5 parser and browser test suite / Dec 2025
After my success with
My question, though, is how difficult it is to go in the other direction and move towards a strongly typed interface like OCaml's. Could we ultimately distill down the extremely complex set of rules around parsing HTML all the way into a proof assistant like Lean, but hopping via OCaml and Haskell to provide convenient executable pitstops?
Today's task was to vibespile the Python into ocaml-html5rw, a pure OCaml HTML5 parser and serialiser that passes the browser test suite 100%.
AoAH Day 14: Debugging a Karakeep CLI against the live service / Dec 2025
With the
To start with, I use Karakeep across all my devices to bookmark things, and I'd like to be able to programmatically search through tags, for example by taking all outbound links from the blogs that I read and autosynching them with my remote service. Karakeep on the server side does some cool things like screenshot links and create local webarchives.
Unfortunately, Karakeep doesn't publish an OCaml interface. Fortunately, my new bestie Claude helped me build ocaml-karakeep without much input from me!
AoAH Day 13: Heckling an OCaml HTTP client from 50 implementations in 10 languages / Dec 2025
Now I had some
Luckily, there's an
I'm not sure what the collective noun is for a group of HTTP clients, so I dubbed this whole process a 'heckle' of HTTP coding!
AoAH Day 12: Eio Connection pooling and event tracing / Dec 2025
After yesterday's
For example, github.io has four A records:
> host github.io
github.io has address 185.199.110.153
github.io has address 185.199.109.153
github.io has address 185.199.108.153
github.io has address 185.199.111.153
With this new connection pooling library, my application should be able to
connect to the github.io name and keep track of all the outgoing connections
on the basis of it being called github.io and load balance the number of
outgoing connections accordingly.
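To make the idea concrete, here is a rough sketch of pooling keyed on the hostname rather than on individual IP addresses. It uses only the standard library and is not the Conpool API: the endpoint and pool types and the create, register, acquire, release and active_for functions are all hypothetical, and no real sockets are opened.

```ocaml
(* Hypothetical sketch: attribute connections to a hostname and spread them
   across the addresses it resolved to. Not the Conpool API. *)

type endpoint = { addr : string; mutable active : int }

type pool = (string, endpoint list) Hashtbl.t

let create () : pool = Hashtbl.create 16

(* Record the addresses a hostname resolved to (e.g. its DNS A records). *)
let register (p : pool) (host : string) (addrs : string list) : unit =
  Hashtbl.replace p host (List.map (fun addr -> { addr; active = 0 }) addrs)

(* Choose the address with the fewest active connections (ties go to the
   earliest registered address) and count one more connection against it. *)
let acquire (p : pool) (host : string) : string option =
  match Hashtbl.find_opt p host with
  | None | Some [] -> None
  | Some (first :: rest) ->
    let best =
      List.fold_left (fun b e -> if e.active < b.active then e else b) first rest
    in
    best.active <- best.active + 1;
    Some best.addr

(* Hand a connection back by decrementing the count for its address. *)
let release (p : pool) (host : string) (addr : string) : unit =
  match Hashtbl.find_opt p host with
  | None -> ()
  | Some eps ->
    List.iter
      (fun e -> if e.addr = addr && e.active > 0 then e.active <- e.active - 1)
      eps

(* Total outgoing connections currently attributed to a hostname. *)
let active_for (p : pool) (host : string) : int =
  match Hashtbl.find_opt p host with
  | None -> 0
  | Some eps -> List.fold_left (fun acc e -> acc + e.active) 0 eps
```

Registering github.io with the four addresses above and calling acquire four times would hand out each address once, while active_for reports the total against the single name.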
In the interests of exploring something new, I also decided to add in visualisation
support to figure out what the library is spending its time on.
I decided to generate self-contained visualisations,
inspired by
AoAH Day 11: HTTP Cookies and vibing RFCs for breakfast / Dec 2025
I'm switching focus for a few days to build a complete HTTP(S) client to use in my
So I thought I'd have a go at a different approach today using agentic coding: can we synthesise a complete HTTP Cookie implementation purely from the RFC 6265 prose itself, and then differentially compare this OCaml implementation against the others? In theory, running a single test suite across all three libraries might be a good way of discovering how to improve the existing implementations. In the long-term, http-cookie is probably the upstream library I want to use, but I don't want to generate a giant diff against it today due to my
AoAH Day 10: Building a TUI for Sortal using Mosaic / Dec 2025
After building a reasonably complete
I first noticed this library when Thibaut presented his OCaml coding with AI talk at FunOCaml. It's quite different from Bonsai in that Mosaic uses OCaml's effects to provide a more direct-style API, and so seems worth experimenting with. So today's task is to port Sortal to use Mosaic and see what this terminal UI looks like!
AoAH Day 9: Adding a Bonsai terminal UI to Sortal / Dec 2025
After building a reasonably complete
Publish, Review, Curate to upend scholarly publishing / Dec 2025 / DOI
I was not expecting to find a bunch of activist librarians at the lovely spires of King's College Chapel last week, but I was very glad that I did! I gave a talk to the Confederation of Open Access Repositories group that was having a meeting about "Turning scholarly publishing on its head". Luckily, I had my budding
AoAH Day 8: Building a contacts CLI manager with Sortal / Dec 2025
I've been accumulating a lot of contacts that I use to write cross references
on my website. This works by using
Cmarkit to parse my custom Markdown,
and spot entries like [@sadiqj] and convert those into a full reference like
Today, I want to build a full CLI application that stores all my contacts as Yaml files in my home directory using XDG conventions, and gives me a simple search interface so I can quickly autocomplete these posts from my editor. I call this little application "Sortal".
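As a rough illustration of the cross-referencing step, spotting [@handle] entries in text could look something like the sketch below. This is a plain string scanner and not how Cmarkit or Sortal actually do it: is_handle_char and find_mentions are hypothetical names, and the set of characters allowed in a handle is an assumption.

```ocaml
(* Hypothetical sketch: collect the handles of every "[@handle]" mention.
   The allowed handle characters are an assumption for illustration. *)
let is_handle_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' | '-' -> true
  | _ -> false

let find_mentions (s : string) : string list =
  let n = String.length s in
  (* Index of the first character at or after j that is not a handle char. *)
  let rec handle_end j =
    if j < n && is_handle_char s.[j] then handle_end (j + 1) else j
  in
  let rec scan i acc =
    if i + 1 >= n then List.rev acc
    else if s.[i] = '[' && s.[i + 1] = '@' then begin
      let stop = handle_end (i + 2) in
      (* Accept only a non-empty handle that is closed by ']'. *)
      if stop > i + 2 && stop < n && s.[stop] = ']' then
        scan (stop + 1) (String.sub s (i + 2) (stop - i - 2) :: acc)
      else scan (i + 1) acc
    end else scan (i + 1) acc
  in
  scan 0 []
```

Each handle returned could then be looked up in the contacts store to produce the full reference.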
AoAH Day 7: Converting between JSON and Yaml with yamlt / Dec 2025
After the excitement of building an entire
AoAH Day 6: Getting a Yaml 1.2 implementation in pure OCaml / Dec 2025
I did the palate cleanser of
Since Yaml is a monstrously convoluted spec, I opted back then to bind to the C libyaml using
And the worst thing is, I cannot find the motivation to figure out how Yaml really works. It's the world's worst serialisation format, with lots of corner cases and memory blowups inherent in how it works. So I decided to dive in and see if I could build a pure OCaml Yaml 1.2 implementation using bytesrw and the source spec.
TL;DR: it worked. It actually seems to have come up with a reasonable, pure OCaml implementation that I'm now using! It needs more validation and external code review, but this has been on my TODO list for years now.
AoAH Day 5: Bytesrw Eio adapters and automating opam metadata / Dec 2025
After the
AoAH Day 4: Going recursive with Claudeio for Claude / Dec 2025
By this point, I've got three useful libraries and my use of Claude is getting better. So naturally I want to automate my invocations of the claude CLI, but I hit a roadblock: there are no OCaml SDK bindings! However, there appear to be SDKs in Python, Go and many others. So today will involve having a stab at generating Claude OCaml bindings using Eio, so I can use Claude to write more OCaml!
Foundational AI for Ecosystem Resilience workshop / Dec 2025 / DOI
As part of the ARIA Engineering Ecosystem Resilience
program, we've been convening a series of workshops here at the Cambridge
Conservation Initiative to explore the
potential of combining two radically different approaches to modeling.
Ecology and ecosystems are inherently agent-based. In other words, patterns in biodiversity in both space and time emerge as a function of the local interaction of many types of individual organisms, both with each other and with their abiotic environment.
Generative agent-based models, such as Concordia enable the simulation of multiple interacting large language models. Given LLMs now possess significant ecological knowledge, it is possible that models such as Concordia will enable the meaningful simulation of ecological interactions.
The biotic and abiotic environment in which ecological agents interact in a given ecosystem is likely measurable via remotely monitored earth-observation data. Raw EO data, however, is unwieldy, containing large quantities of information that can be difficult to interpret. Earth-system models, such as
TESSERA or AlphaEarth are foundational AI models which compress large quantities of EO data into "embeddings", unambiguous and consistent digital representations of the structure of the Earth’s surface. -- Foundational AI to forecast ecosystem resilience, J. Millard, A. Pili, K. Berthon, R. Fletcher, L. Dicks
We held two separate workshops to explore this; one for a
deep-dive into the technical details, and another to invite conservation
practitioners to steer our modeling in a realistic and positive
direction. This was all led by
AoAH Day 3: XDG filesystem paths using Eio capabilities / Dec 2025
By Day 3 of the
AoAH Day 2: Building an OCaml JSONFeed library / Dec 2025
Day 2 of the
JSONFeed is a successor to Atom for website feeds that has a nice informal specification about how to parse it. However, it also has a growing number of extensions which need to be implemented somehow, as well as some informal rules to map RSS/Atom to JSONFeed.
There is no existing OCaml implementation that I could find, and I need it to integrate my website with Rogue Scholar more easily for
The AI French Connection to the Practice of Science / Dec 2025 / DOI
Our neighbours France and the UK announced a Franco-British AI collaboration a few months ago dubbed the Entente CordIAle. Last week we held a couple of days of workshops with our Oxford and French buddies deep diving into details of what a partnership might actually involve; a particular pleasure with France given my group's long
I sprinted

AoAH Day 1: Building a Base32 Crockford library in OCaml / Dec 2025
Let's start day 1 of the
With Claude, my setup first involved a custom devcontainer using Docker on a Linux host, and my local Mac laptop. I coordinate both of these via Git repositories hosted up at Tangled with a
Four Ps for Building Massive Collective Knowledge Systems / Nov 2025 / DOI
I've been building some big
I found the perfect place to codify this at the ARIA Workshop on Collective Flourishing that
Will building these collective knowledge systems be a transformative capability for human society? Hot on the heels of COP30 concluding indecisively, I've been getting excited by decision making towards biodiversity going down a more positive path in IPBES. We could empower decisionmakers at all scales (local, country, international) to be able to move five times faster on actions about global species extinctions, unsustainable wildlife trade and food security, while rapidly assimilating extraordinarily complex evidence chains. I'll talk about this more while explaining the principles...
GeoTessera 0.7 out with efficient sampling and Zarr support / Nov 2025 / DOI
I've just released geotessera 0.7 to pypi for our
TESSERA is a foundation model for Earth observation that processes Sentinel-1 and Sentinel-2 satellite data to generate representation (embedding) maps. It compresses a full year of Sentinel-1 and Sentinel-2 data and learns useful temporal-spectral features. -- Temporal Embeddings of Surface Spectra for Earth Representation and Analysis
With this new release, there's convenient documentation to show how you can freely access 150TB+ of CC-BY-licensed embeddings of the earth's surface. We've been getting a growing influx of requests for diverse regions of the world, and so our focus for the next few months is attaining complete coverage of our v1 model on the whole planet.
On the path to the UK/India AI Summit with OpenUK and the ATI / Nov 2025 / DOI
There's a buzz forming around the upcoming AI Impact
Summit next year in India, following up the