AoAH Day 3: XDG filesystem paths using Eio capabilities / Dec 2025

By Day 3 of the Advent of Agentic Humps, I now have the confidence to build a slightly more complex library that uses Eio to implement the XDG Base Directory Specification with a twist: let's use Eio capabilities to sandbox XDG paths by default.

Lots of other languages have cross-platform XDG implementations, so I wanted one in OCaml for Eio

Approach

The XDG Spec is comprehensive, but full of lots of informal rules about directories. Some of the rules are pretty easy to follow:

$XDG_DATA_DIRS defines the preference-ordered set of base directories to search for data files in addition to the $XDG_DATA_HOME base directory. The directories in $XDG_DATA_DIRS should be separated with the separator used for $PATH on the platform (typically this is a colon :).
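As a rough illustration of how mechanical this rule is, here is a minimal sketch using only the OCaml standard library. The names search_dirs and default_data_dirs are hypothetical and are not taken from xdge. The fallback list mirrors the default the spec gives for an unset or empty $XDG_DATA_DIRS, and relative entries are dropped because the spec asks implementations to ignore them.

```ocaml
(* Hypothetical helper, not part of xdge: parse a $PATH-style variable. *)

(* Fallback used when $XDG_DATA_DIRS is unset or empty. *)
let default_data_dirs = [ "/usr/local/share"; "/usr/share" ]

(* Split [value] on [sep], keeping only absolute paths and preserving
   the preference order. Empty entries count as relative and are dropped. *)
let search_dirs ?(sep = ':') ~default value =
  match value with
  | None | Some "" -> default
  | Some v ->
    String.split_on_char sep v
    |> List.filter (fun p -> not (Filename.is_relative p))

let () =
  Sys.getenv_opt "XDG_DATA_DIRS"
  |> search_dirs ~default:default_data_dirs
  |> List.iter print_endline
```

Note that if every entry is relative this sketch returns an empty list rather than falling back to the default; which behaviour is preferable is a design choice the spec leaves open.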

While others are harder to enforce in code:

The directory MUST be on a local file system and not shared with any other system. The directory MUST be fully-featured by the standards of the operating system. More specifically, on Unix-like operating systems AF_UNIX sockets, symbolic links, hard links, proper permissions, file locking, sparse files, memory mapping, file change notifications, a reliable hard link count must be supported, and no restrictions on the file name character set should be imposed. Files in this directory MAY be subjected to periodic clean-up. To ensure that your files are not removed, they should have their access time timestamp modified at least once every 6 hours of monotonic time or the 'sticky' bit should be set on the file.

We can do bits of that, but POSIX doesn't really expose everything we need to mechanically verify some aspects of filesystem support. Still, we have a big long spec, so let's see what happens if we throw an agent at it!

My general approach with Claude was to download a copy of the XDG spec and instruct the agent to digest it. I also supplied the previous two libraries as examples of "my OCaml style" that it could draw from. Having more code available to act as an oracle that guides the model towards something I find acceptable is a useful way to avoid lots of prompting.

Results

The first attempts completely crashed and burnt, as the agent failed to grok the admittedly very complicated Eio types for capabilities (which do involve subtyping and phantom polymorphic variants). I then cloned the Eio source code and specifically instructed the agent to read the Eio README, as it has extensive information about best practices for using the library. This is similar to my earlier trick with jsont, where I instructed it to read the cookbooks.

Things got much better after this; the agent had to iterate quite a few times, but did eventually converge on the right types.

val config_dir : t -> Eio.Fs.dir_ty Eio.Path.t
(** [config_dir t] returns the path to user-specific configuration files.

    {b Purpose:} Store user preferences, settings, and configuration files.
    Configuration files should be human-readable when possible.

    {b Environment Variables:}
    - [${APP_NAME}_CONFIG_DIR]: Application-specific override (highest priority)
    - [XDG_CONFIG_HOME]: XDG standard variable
    - Default: [$HOME/.config/{app_name}]

    @see <https://specifications.freedesktop.org/basedir-spec/latest/#variables>
      XDG_CONFIG_HOME specification *)
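To give a feel for what a caller does with the returned Eio.Path.t, here is a small sketch that needs the eio library. The helper load_opt is hypothetical and not part of xdge; it only relies on Eio.Path.(/) and Eio.Path.load, and treats a missing file as None. The file name is resolved relative to whatever directory capability is passed in.

```ocaml
(* Requires the eio library. [load_opt] is a hypothetical helper,
   not part of xdge. *)

(* Read [name] from beneath the directory capability [dir],
   returning [None] if the file does not exist. *)
let load_opt (dir : _ Eio.Path.t) (name : string) : string option =
  let path = Eio.Path.(dir / name) in
  try Some (Eio.Path.load path)
  with Eio.Io (Eio.Fs.E (Eio.Fs.Not_found _), _) -> None
```

Assuming the library module is called Xdge and xdg is a value of its type t, a call would look like load_opt (Xdge.config_dir xdg) "settings.conf".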

You can see one nice feature in the config_dir documentation above that would have taken a while to code by hand. The type t for the xdge library is typically constructed via Cmdliner terms that the library exposes, which lets other applications "plug in" XDG support by just specifying a single term.

This term takes care of applying command-line arguments, environment variables, and default values in the right order of precedence. The manual page for an example binary shows how this works from the CLI.

In this case, the library specifies XDG_EXAMPLE as the app name, but this is easily customised to your app

Tests

You can see the Cmdliner support in action with the test cases, where I adopted the same cram-based testing strategy as earlier with jsont.

This allows for a nice repository structure where I can simply add the XDG Cmdliner term to the test case binaries, and have all the gory details of config setup handled by the xdge library. For example, in the cram tests you can see how the source of an XDG path is tracked (i.e. did it come from the CLI, from an environment variable, or from the defaults?):

  $ export HOME=./test_home
  $ unset XDG_CONFIG_DIRS XDG_DATA_DIRS
  $ XDG_CONFIG_HOME=/tmp/xdge/xdg-config \
  > XDG_EXAMPLE_CONFIG_DIR=./app-config \
  > ./xdg_example.exe --config-dir ./cli-config
  === Cmdliner Config ===
  XDG config:
  config_dir: ./cli-config [cmdline]
  
  === XDG Directories ===
  XDG directories for 'xdg_example':
  User directories:
    config: <fs:./test_home/./cli-config> [cmdline]
    data: <fs:./test_home/./test_home/.local/share/xdg_example> [default]
    cache: <fs:./test_home/./test_home/.cache/xdg_example> [default]
    state: <fs:./test_home/./test_home/.local/state/xdg_example> [default]
    runtime: <none> [default]
  System directories:
    config_dirs: [<fs:/etc/xdg/xdg_example>]
    data_dirs: [<fs:/usr/local/share/xdg_example>; <fs:/usr/share/xdg_example>]

Command-line argument overrides both types of environment variables. Even
though both XDG_CONFIG_HOME and XDG_EXAMPLE_CONFIG_DIR are set, the
--config-dir flag takes precedence and shows [cmdline] source. Other
directories fall back to defaults since no other command-line args are
provided.
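
The precedence and source tracking shown in this session can be sketched in plain OCaml. This is not the xdge implementation: the names source and resolve_config_dir are hypothetical, the environment is passed in as a lookup function so the logic stays pure, and appending the app name to XDG_CONFIG_HOME is an assumption on my part.

```ocaml
(* Hypothetical sketch of precedence with source tracking; not xdge's code. *)

type source =
  | Cmdline          (* value came from a command-line flag *)
  | Env of string    (* value came from the named environment variable *)
  | Default          (* nothing was set, so the built-in default applies *)

(* Treat an empty string the same as an unset value. *)
let non_empty = function Some "" -> None | v -> v

(* Priority: CLI flag, then ${APP_NAME}_CONFIG_DIR, then XDG_CONFIG_HOME,
   then $HOME/.config/{app_name}. *)
let resolve_config_dir ?cli ~app_name ~getenv ~home () =
  let app_var = String.uppercase_ascii app_name ^ "_CONFIG_DIR" in
  match non_empty cli with
  | Some dir -> (dir, Cmdline)
  | None ->
    match non_empty (getenv app_var) with
    | Some dir -> (dir, Env app_var)
    | None ->
      match non_empty (getenv "XDG_CONFIG_HOME") with
      | Some base ->
        (* Assumption: the app name is appended to the XDG base directory. *)
        (Filename.concat base app_name, Env "XDG_CONFIG_HOME")
      | None ->
        (Filename.concat (Filename.concat home ".config") app_name, Default)
```

Calling resolve_config_dir ~cli:"./cli-config" ~app_name:"xdg_example" ~getenv:Sys.getenv_opt ~home:"./test_home" () with both variables set as above yields the CLI value tagged Cmdline, matching the [cmdline] marker in the output.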

I ran the cram tests quite a few times and read through them to make sure the shell sessions and explanations made sense, and then read through the xdge source code itself, which was pretty simple. Some OS-specific features (like checking the mount type) are not yet exposed by Eio, but they can come in a future iteration.

Reflections

Cmdliner is one of the gems in the open-source OCaml community due to how easy it makes it to build "Unix-style" applications with sensible manual pages and consistent argument parsing. However, even after using it for years I can never remember its term language without referring to the manual, and I always find myself cut-and-pasting from previous code and editing it.

Using the agent definitely helped me out here. A lot of the XDG logic is fairly boilerplate, but extremely useful to express in a typed way. I anticipate now using this library in almost every CLI tool I build in OCaml, as it has enough information exposed in the interface to let downstream CLI-coding agents pick the right base directories to use by default.

Onto Day 4 then, where we'll go recursive by wrapping Claude in OCaml using Claude!

