AoAH Day 2: Building an OCaml JSONFeed library / Dec 2025

Day 2 of the Advent of Agentic Humps dawns with building a slightly more complex library than before: an implementation of the JSONFeed specification, a more modern alternative to Atom.

JSONFeed is a successor to Atom for website feeds, with a nice informal specification describing how to parse it. However, it also has a growing number of extensions that need to be implemented somehow, as well as some informal rules for mapping RSS/Atom to JSONFeed.

There is no existing OCaml implementation that I could find, and I need it to integrate my website with Rogue Scholar more easily for permanent DOIs.

Approach

Unlike the Crockford implementation, parsing JSON involves selecting a third-party library dependency cone. By default the agent chose Yojson (presumably because it's the most popular in its training set). I would conventionally use my own ezjsonm library, which builds on the lower-level jsonm, but I noticed a deprecation notice on jsonm pointing to a newer library by Daniel Bünzli called jsont.

Jsont

After seeing the announcement of jsont about a year ago, I gave it a quick try:

Jsont is an OCaml library for declarative JSON data manipulation. It provides:

  • Combinators for describing JSON data using the OCaml values of your choice. The descriptions can be used by generic functions to decode, encode, query and update JSON data without having to construct a generic JSON representation.
  • A JSON codec with optional text location tracking and layout preservation. The codec is compatible with effect-based concurrency.

The descriptions are independent from the codec and can be used by third-party processors or codecs. -- jsont, Daniel Bünzli, 2025

The codec for a JSON type is expressed using combinators, and then separately serialised or deserialised. I found that it had fantastic error messages, since it could use the codec to come up with the reason why some input had been rejected. On the flip side, writing the codecs involved a lot of boilerplate, so I found it quite time-consuming.
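To give a flavour of the combinator style without depending on jsont itself, here is a toy sketch using only the OCaml standard library. The json type and the obj, mem, opt_mem and finish functions are hypothetical names invented for this illustration and are not jsont's API (jsont notably avoids forcing a generic JSON tree like this one). It shows a single description of an author object yielding both a decoder and an encoder, with error messages derived from the description:

```ocaml
(* Toy illustration only: every name below is hypothetical, NOT jsont's API. *)

(* A minimal generic JSON tree, used here just to keep the sketch small. *)
type json =
  | String of string
  | Object of (string * json) list

(* A codec pairs a decoder and an encoder for the same OCaml type. *)
type 'a codec = {
  decode : json -> ('a, string) result;
  encode : 'a -> json;
}

let string : string codec = {
  decode = (function String s -> Ok s | Object _ -> Error "Expected a string");
  encode = (fun s -> String s);
}

(* An object description under construction: 'o is the final OCaml type and
   'dec is what remains of the constructor after the members seen so far. *)
type ('o, 'dec) obj = {
  kind : string;
  dec : (string * json) list -> ('dec, string) result;
  enc : 'o -> (string * json) list;
}

let obj ~kind constructor =
  { kind; dec = (fun _ -> Ok constructor); enc = (fun _ -> []) }

(* Add a required member; [get] projects it out of the value for encoding. *)
let mem name c ~get m =
  let dec fields =
    match m.dec fields, List.assoc_opt name fields with
    | Error e, _ -> Error e
    | Ok _, None ->
        Error (Printf.sprintf "Missing member %s in %s object" name m.kind)
    | Ok f, Some j ->
        (match c.decode j with
         | Ok v -> Ok (f v)
         | Error e -> Error (Printf.sprintf "%s in member %s" e name))
  in
  { kind = m.kind; dec; enc = (fun o -> m.enc o @ [ (name, c.encode (get o)) ]) }

(* Add an optional member: absent decodes to None, and None is not encoded. *)
let opt_mem name c ~get m =
  let dec fields =
    match m.dec fields, List.assoc_opt name fields with
    | Error e, _ -> Error e
    | Ok f, None -> Ok (f None)
    | Ok f, Some j ->
        (match c.decode j with
         | Ok v -> Ok (f (Some v))
         | Error e -> Error (Printf.sprintf "%s in member %s" e name))
  in
  let enc o =
    m.enc o @ (match get o with None -> [] | Some v -> [ (name, c.encode v) ])
  in
  { kind = m.kind; dec; enc }

(* Close the description, turning it into a codec for whole objects. *)
let finish m = {
  decode = (function
    | Object fields -> m.dec fields
    | String _ -> Error (Printf.sprintf "Expected %s object" m.kind));
  encode = (fun o -> Object (m.enc o));
}

type author = { name : string; url : string option }

(* One description, used for both decoding and encoding. *)
let author : author codec =
  obj ~kind:"Author" (fun name url -> { name; url })
  |> mem "name" string ~get:(fun a -> a.name)
  |> opt_mem "url" string ~get:(fun a -> a.url)
  |> finish
```

With this sketch, author.decode (Object []) returns Error "Missing member name in Author object", because the description knows which member is required and what kind of object it belongs to. That is the property that makes the real library's error messages so useful, although jsont's own machinery is considerably more general than this.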

Daniel wrote a nice paper about the combinator magic behind jsont

But now, with Claude, I could use it to scan a spec and automate codec construction using jsont! So I decided to try a complex coding case where I fed the prose JSONFeed spec to Claude, and instructed it to build jsont codecs. As a separate phase, I then built test cases and serialisers.

Results

The very first run of Claude crashed and burned with jsont as it didn't have enough examples to figure out the interface from scratch, resulting in a lot of type errors and no working code.

So I took a different tack: I ran opam source jsont to get the source code into the current working directory, and then prompted a fresh Claude session to "ultrathink about the interface of jsont.0.2.0, paying particular attention to the cookbook". The cookbook is a section of the (excellent) documentation in Jsont that describes real-world usage, and the agent also picked up on the jsonrpc testcase in the jsont repository.

Once it had this bit of example-driven context, the agent proceeded to build a very credible set of codecs:

  • Author descriptions look about right, with a pretty printer thrown in as a bonus.
  • The extremely tedious set of CITO citation methods came out in one shot.
  • It figured out to use Ptime based RFC3339 date handling for the feeds, from a combination of the spec and by querying opam locally.
  • The overall Jsonfeed.mli interface weaves all these together and exposes the jsont codec value along with accessor and pretty-printer functions.
  • As a bonus, I fed it the extensions repository so it can now expose structured references in my own site's JSON feed.

The user-facing API is idiomatic OCaml:

type t
val jsont : t Jsont.t
val create :
  title:string -> ?home_page_url:string ->
  ?feed_url:string -> ?description:string ->
  ?user_comment:string -> ?next_url:string ->
  ?icon:string -> ?favicon:string ->
  ?authors:Author.t list -> ?language:string ->
  ?expired:bool -> ?hubs:Hub.t list ->
  items:Item.t list -> ?unknown:Unknown.t ->
  unit -> t

The full set of constructors and validators can be read in jsonfeed.mli.
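As a rough illustration of how such a signature reads at the call site, here is a cut-down stand-in written against the standard library only. The Feed module below is hypothetical: it keeps just a handful of the fields and simplifies items to plain strings rather than Item.t values. The point is that required fields are labelled arguments, optional ones can simply be omitted, and the trailing unit marks the end of the argument list so the omitted options default to None:

```ocaml
(* Hypothetical, simplified stand-in for the real Jsonfeed interface. *)
module Feed : sig
  type t

  val create :
    title:string -> ?home_page_url:string -> ?description:string ->
    ?expired:bool -> items:string list -> unit -> t

  val title : t -> string
  val home_page_url : t -> string option
  val description : t -> string option
  val expired : t -> bool option
  val items : t -> string list
end = struct
  type t = {
    title : string;
    home_page_url : string option;
    description : string option;
    expired : bool option;
    items : string list;
  }

  (* Optional arguments arrive as options; the final unit lets OCaml know
     that no further optional arguments are coming. *)
  let create ~title ?home_page_url ?description ?expired ~items () =
    { title; home_page_url; description; expired; items }

  let title t = t.title
  let home_page_url t = t.home_page_url
  let description t = t.description
  let expired t = t.expired
  let items t = t.items
end

(* Only the fields we care about are supplied; the rest stay unset. *)
let feed =
  Feed.create ~title:"Example feed" ~home_page_url:"https://example.org/"
    ~items:[ "first post"; "second post" ] ()

let () =
  Printf.printf "%s has %d items\n" (Feed.title feed)
    (List.length (Feed.items feed))
```

Keeping the type abstract behind create and accessor functions means fields can be added later without breaking callers, which suits a format that keeps growing extensions.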

Tests

For test cases, I first got Claude to synthesise a variety of JSONFeed corpora that exercised the spec, and manually inspected them to check that everything seemed OK. I then made use of Dune Cram tests to write CLI-based validators that run against them.

All of the boilerplate around writing the test cases just worked out of the box, with a little bit of manual prompting from me required to guide the agent towards some known edge cases (like extensions handling).

Here's an excerpt from the cram tests:

Missing Required Fields
------------------------

Test missing title field:
  $ ./test_location_errors.exe data/missing_title.json title
  {"status":"error","message":"Missing member title in JSON Feed object",
   "location":{"file":"data/missing_title.json","line":1,"column":1,
   "byte_start":0,"byte_end":65},"context":"$"}
  [1]

Test missing version field:
  $ ./test_location_errors.exe data/missing_version.json title
  {"status":"error","message":"Missing member version in JSON Feed object",
   "location":{"file":"data/missing_version.json","line":1,"column":1,
   "byte_start":0,"byte_end":51},"context":"$"}
  [1]

Reflections

The code is available on anil.recoil.org/ocaml-jsonfeed with online docs. My first user for this might be Michael Dales, for his own website.

The use of Claude has made using jsont the default choice for me now. As a first phase, I can ask the LLM to synthesise down a spec into a combinator-based codec, and then validate that separately. Crucially, the good error messages from jsont help the agent to root-cause why some tests fail, and give me the option of either clarifying the spec or fixing the test case if that was the error.

I did, however, still have to make the judgement call of which libraries to use at the start. The agent also happily spat out Yojson and Ezjsonm based implementations for me, but I simply prefer the jsont approach. If you had other priorities like pure performance, you might go for Yojson instead.

Onto Day 3, where we'll build our first Eio-based library!

