{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "Anil Madhavapeddy's feed",
  "home_page_url": "https://anil.recoil.org",
  "feed_url": "https://anil.recoil.org/feed.json",
  "icon": "https://anil.recoil.org/favicon.png",
  "authors": [
    {
      "name": "Anil Madhavapeddy",
      "url": "https://orcid.org/0000-0001-8954-2428",
      "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
    }
  ],
  "language": "en-US",
  "items": [
    {
      "id": "https://doi.org/10.59350/re0zy-3rt26",
      "content_html": "<h2 id=\"tessera-streaming-into-the-browser-and-oxcaml-hacking\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera-streaming-into-the-browser-and-oxcaml-hacking\"></a>TESSERA streaming into the browser and OxCaml hacking</h2>\n<p>I've completed a working cut at a streaming interface for TESSERA embeddings, and it's unexpectedly addictive to zoom around the world staring at false colours!  The amazing thing about this interface is that it's <em>entirely</em> browser based, using JavaScript, WebGPU and WASM to perform all the analysis on the client side. The Zarr embeddings are chunked and served over HTTP, using range requests to retrieve the minimum amount of data.</p>\n<p><div class=\"video-center\"><iframe title=\"Tessera Zarr streaming preview\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/08aafc87-9aea-48e3-8c41-a2fe1b94fea4\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>The video shows classification workflows, but I've also got <a href=\"https://toao.com/blog/can-we-really-see-brambles-from-space\">solar panel detection</a> working using Sadiq's patch based embeddings. You can try it for yourself when I release this properly next week!</p>\n<p>There are obvious limitations to how much we can do in the browser; for most serious work we will need a server running, but my goal here is to see if we can embed TESSERA into the <a href=\"/ideas/living-iucn-redlist\">living dashboard</a> that <a href=\"https://shaneweisz.com\">Shane Weisz</a> is working on. 
I've also just received a drop of <a href=\"https://digitalflapjack.com/weeknotes/fractional_life_progress/\">areas-of-habitats</a> from <a href=\"https://mynameismwd.org\">Michael Dales</a> which I'll have a go at integrating next week.</p>\n<h3 id=\"its-tee-time-with-multiple-browsers-in-development\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#its-tee-time-with-multiple-browsers-in-development\"></a>It's TEE time with multiple browsers in development</h3>\n<p>What's blocking everyone from using Zarr and TESSERA? Well, we need to transcode a petabyte of embeddings from the old numpy format into Zarr, which is a difficult parallelisation problem. I'm making steady progress on an <a href=\"https://oxcaml.org\">OxCaml</a> pipeline for this with a from-scratch OxCaml-Zarr implementation that <a href=\"https://www.tunbury.org/\">Mark Elvers</a> helped me kick off.  Mark also published his <a href=\"https://github.com/mtelvers/ocaml-tessera\">ocaml-tessera</a> pipeline which I'm going to import to OxCaml next week as well, so that we can do both model training and tile inference in OCaml!</p>\n<p>For more production-oriented use cases, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> has published a server with his excellent <a href=\"https://tee.cl.cam.ac.uk\">Tessera Embeddings Explorer</a>. We've been building our implementations independently and swapping ideas for user interfaces and analyses, which has been a very productive way of experimenting for different users. His implementation is in use for several downstream task projects and should be what people use, while mine is heading towards more dynamic mobile/browser workflows. 
You can grab that code at <a href=\"https://github.com/ucam-eo/tee\">https://github.com/ucam-eo/tee</a>, complete with a convenient Dockerfile.</p>\n<h3 id=\"discussing-tessera-programming-models-at-wg28\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#discussing-tessera-programming-models-at-wg28\"></a>Discussing TESSERA programming models at WG2.8</h3>\n<p><img src=\"/images/wg28-26-2.webp\" alt=\"%rc\" title=\"Viana do Castelo had the loveliest venue and hotel this week\" >\nI got invited to <a href=\"https://ifip-wg28.github.io/\">WG2.8</a> again, so <a href=\"https://simon.peytonjones.org/\">Simon Peyton Jones</a> and I trooped off there from Cambridge earlier in the week. I had to leave early due to some family matters, but I got to present 'planetary programming' to the assembled gods of functional programming!</p>\n<p>I began my <a href=\"https://www.cl.cam.ac.uk/~avsm2/slides/wg28-2026-tessera.pdf\">talk</a> by presenting the streaming browser demo above (thanks to the <a href=\"https://github.com/avsm/oxmono/blob/main/avsm/httpz-perma-proxy\">perma caching httpz oxcaml proxy</a> I hacked up, the remote tiles are cached on my laptop so the app works offline too). I used the opportunity to posit a high-level programming design problem I've encountered when coding with TESSERA...</p>\n<p>There's a contradictory tension in the styles of programming that we\nconventionally embark on with the workflows needed by machine learning. In our\n<a href=\"/papers/2025-fairground\">FAIRground</a> paper we describe a purely functional Python\nvariant that represents conventional 'forward programming'. You can do lots of\nnice things when the language is pure, such as enabling incremental live\ncomputations. This forward programming style would be good for building a global computational wiki for example.</p>\n<p>However, when we program with observational embeddings like TESSERA, we're\ndoing 'backwards programming'. 
The units we're dealing with are 128-dimensional\nself-supervised representations that have been learnt from primary satellite\ndata, and the job of the program is to help cluster these higher dimensional\nstructures into some semblance of useful meaning. We do this via downstream\nclassifiers, segmenters and regression tasks. This is a very different\nprogramming style from forward programming even though it requires a similar\namount of CPU.</p>\n<p><img src=\"/images/wg28-26-3.webp\" alt=\"%rc\" title=\"There was no danger of losing weight in Portugal\" >\nThe ultimate goal of <em>both</em> of these scientific programming styles is to establish <em>causal</em> relationships; that is, we want to form (or reinforce or falsify) a theory of how the world works that can be tested by the scientific method. So I wondered: how do we combine these three styles into a programming language? This is a very general question, but I figured there was no better place to ask than a room full of people who have designed dozens if not hundreds of languages between them.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/slides/wg28-2026-tessera.pdf\"> <img src=\"/images/fwd-back-causal-ss-1.webp\" alt=\"%c\" title=\"Three sorts of relations we are trying to program\" > </a></p>\n<ul>\n<li><a href=\"http://www.ccs.neu.edu/home/amal/\">Amal Ahmed</a> said this really looked like a <a href=\"https://doi.org/10.1145/3609027.3609405\">multi-DSL problem</a> (i.e. implement all three different styles of programming in OCaml, and then examine the DSL properties/data structures). 
She also pointed me to <a href=\"https://arxiv.org/abs/2502.19538\">Multi-Language Probabilistic Programming</a> which allows for differently specialised probabilistic programming languages.</li>\n<li><a href=\"https://homepages.inf.ed.ac.uk/slindley/\">Sam Lindley</a> also suggested this multi-DSL would be a good use of effects: could we write <em>one</em> OCaml program to represent all three styles, but then interpret them completely differently using effects? We could get a set of points as program traces via sampling using effects, and then we could do reproducible simulations via effects for stochastic choice, and then for causal path tests effects that check program data structure invariants regularly to build up causal hypotheses.  I need to talk to <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> about this more, as with any effects based idea that involves more than just a Suspend effect.</li>\n<li><a href=\"https://www.chalmers.se/en/persons/ms/\">Mary Sheeran</a> observed that this is somewhat like hardware programming, whereby we put the minimal structural constraints in and then try to discover layouts.</li>\n<li><a href=\"https://www.chalmers.se/en/persons/rjmh/\">John Hughes</a> noted that the statistical testing combined with datastructures (the synthetic computation) is quite similar to quickcheck: can we posit causal relationships and 'quickcheck' them efficiently?</li>\n<li><a href=\"https://www.uu.nl/staff/GKKeller\">Gabriele Keller</a> is working with spatial ecologists on saltmarshes using array programming to speed up their calculations, so we had a <em>very</em> productive conversation that I will follow up on! 
Sounds remarkably similar to the work in the Cairngorms that <a href=\"https://coomeslab.org\">David Coomes</a> is leading in the <a href=\"https://www.clr.conservation.cam.ac.uk/\">CLR</a>.</li>\n<li><a href=\"https://justtesting.org/\">Manuel Chakravarty</a> gave me a lot of tips on Mac/iOS approaches and also told me about <a href=\"https://volteuropa.org/\">Volt Europa</a> and their push for liberal sovereignty.</li>\n<li><a href=\"https://www.cs.cornell.edu/~jnfoster/\">Nate Foster</a> and <a href=\"https://homepages.inf.ed.ac.uk/slindley/\">Sam Lindley</a> helped me simplify my thinking a lot: rather than worry about scale (millions of species), can we find the smallest possible example to work outwards from synthetically? I obviously thought of <a href=\"/ideas/hedgehog-mapping\">hedgehog mapping</a> as a good one here. We also thought that viewing causality as a 'triangle' wasn't right: instead, we could use a combination of synthetic models + observational samples as bidirectional lenses, and then draw causal path diagrams across them to test the lenses (sort of like natural experiments). This is somewhat like <a href=\"https://doi.org/10.1145/1328897.1328487\">boomerang lenses</a> but for sample data instead of strings.</li>\n<li><a href=\"https://richarde.dev/\">Richard Eisenberg</a> gave me practical OxCaml advice as it's a fast-moving target: layout polymorphism is a while away yet, so keep using <code>ppx_template</code> for now, but other features like float16 (useful for TESSERA) could be done fairly easily.</li>\n<li><a href=\"https://people.mpi-sws.org/~rossberg/\">Andreas Rossberg</a> was impressed by the use of WASM for browser-based geospatial, and we discussed the difficulty of using wasm with the DOM for interactive interfaces. 
Machine learning workflows perform well because of the lack of DOM transitions, but hopefully <a href=\"https://hacks.mozilla.org/2026/02/making-webassembly-a-first-class-language-on-the-web/\">Mozilla is working on improving this</a>.</li>\n<li><a href=\"https://simon.peytonjones.org/\">Simon Peyton Jones</a> looked bemused by it all and thought it was too high level a concept to latch onto. I need a worked example like the above to convince him when I'm back at Cambridge!</li>\n</ul>\n<p>It was a short trip to Portugal in the end, but massively energising. I do love hanging with functional programmers!</p>\n<h2 id=\"biodiversity-action-through-technology\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#biodiversity-action-through-technology\"></a>Biodiversity action through technology</h2>\n<p>Two big perspective papers on global biodiversity are out in PNAS this week, which I <a href=\"/notes/nas-rs-biodiversity-papers\">wrote up separately</a> in a detailed note.\nTo follow up on these, <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> is chairing this year's <a href=\"https://propl.dev\">PROPL</a> (to be held at PLDI this summer), and we've been discussing doing something different this year to tie into a more 'action-oriented' workshop that combines the learnings from the last couple of years with the call to biodiversity action above.</p>\n<p>Meanwhile, to follow up on the <a href=\"/notes/first-tessera-hackathon\">first TESSERA hackathon</a> over in India a few weeks ago, <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> have put together a <a href=\"https://www.linkedin.com/posts/core-stack_first-round-of-innovation-challenge-advances-activity-7436361754617552897-tyrD\">call for students</a> to get involved. 
If you're interested then <a href=\"https://docs.google.com/document/d/1vcYj6D_ReWE5xG51A7-Gdt2g6pqpZfw8/edit\">apply here</a> and get going with TESSERA!</p>\n<p><img src=\"/images/iitm-campus-1.webp\" alt=\"%rc\" title=\"Life on the IIT-Madras campus is all about adorable dogs and deer roaming around\" >\nAnd not to be left behind by their Delhi counterparts, <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> announced that applications <a href=\"https://fplaunchpad.org/2026/03/06/applications-open-post-bacc-fellowship.html\">are now open</a> for the <a href=\"https://fplaunchpad.org\">FP Launchpad</a> in IIT-Madras.\nThis should be of interest to computer scientists who want to get involved in environmental work; as I <a href=\"/notes/india-ai-summit\">mentioned</a> before, one of the illustrative projects to kick off the FPL is programming TESSERA embeddings more ergonomically:</p>\n<blockquote>\n<p>A programmable public infrastructure for environmental planning, combining\nTESSERA's satellite-derived representations with CoRE Stack data and\ncompositional functional models in O(x)Caml to support auditable indicators\nand scenario analysis for India’s water and habitat systems.\n<cite>-- <a href=\"https://fplaunchpad.org/charter/\">FP Launchpad Charter, 2026</a> </cite></p>\n</blockquote>\n<p>So it's action stations at both the IITs and I'm looking forward to working with them from Cambridge! This is a nice followup to our <a href=\"https://www.cam.ac.uk/news/new-boost-for-historic-relationship-between-university-of-cambridge-and-india-announced\">Cambridge VC visiting India</a> and kicking off a cricketing tour!</p>\n<h2 id=\"docker-buzz-from-the-cacm-article\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#docker-buzz-from-the-cacm-article\"></a>Docker buzz from the CACM article</h2>\n<p>Following the <a href=\"/notes/cacm-docker-cover\">CACM Docker article</a>, there's been lots of positive online discussions about the article. 
<a href=\"https://news.ycombinator.com/item?id=47289311\">Hackernews</a> leads the way with typically split opinions. Some loved it, some hated it, some thought it should be replaced with a very small shell script, and others reminisced over our use of <a href=\"https://en.wikipedia.org/wiki/Slirp\">SLIRP</a>. Overall though a lovely discussion and vibe.</p>\n<p><a href=\"https://news.ycombinator.com/item?id=47289311\"> <img src=\"/images/hn-docker-ss-1.webp\" alt=\"%c\" title=\"Docker on top of HN again\" > </a></p>\n<p>I also read two interesting papers while <a href=\"/notes/2026w9\">researching</a> more background for the <a href=\"/papers/2026-package-calculus\">package calculus</a> that <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> has been working on:</p>\n<ul>\n<li><a href=\"https://arxiv.org/abs/2601.12811\">Docker Does Not Guarantee Reproducibility</a> discusses some of the common pitfalls around building fully reproducible containers. While there's support at the lower levels for this in the Docker stack, I agree it's not kept up with modern needs. Luckily, <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> is hacking on a new <a href=\"https://patrick.sirref.org/weekly-2025-w49/\">shell interface with provenance</a>.</li>\n<li><a href=\"https://arxiv.org/abs/2501.15919v1\">Does Functional Package Management Enable Reproducible Builds at Scale? Yes.</a>: This is a complementary paper, and argues for a Nix-like approach. 
It's good to see that Nix (despite its constrained versioning) does a good job of supporting retrospective builds.</li>\n</ul>\n<h2 id=\"visitor-from-kth\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#visitor-from-kth\"></a>Visitor from KTH</h2>\n<p><img src=\"/images/wg28-26-1.webp\" alt=\"%rc\" title=\"The Mill does the best fish and chips in Cambridge I reckon!\" >\nWe had a delightful visit from <a href=\"https://www.kth.se/profile/yifang\">Professor Yifang Ban</a> from KTH, who delivered this week's <a href=\"https://watch.eeg.cl.cam.ac.uk/w/dZNDoKiuH8sugfKLCWh8gS\">EEG seminar</a> on EO-AI4GlobalChange. We went to the pub after, and discussed a rather staggering number of <a href=\"/papers/2025-tessera-tasks\">downstream tasks</a> that Prof Ban works on:</p>\n<blockquote>\n<p>In this seminar, Professor Ban will discuss recent research at the\nintersection of EO and AI, with a focus on deep learning methods for\nmonitoring environmental change at scale. She will present selected results\nfrom EO-AI4GlobalChange, a collaborative research project developing novel,\nglobally-applicable deep learning approaches for analysing multi-sensor,\nmulti-modal EO data. The talk will cover examples including 2D and 3D urban\nmapping, urban change detection, wildfire detection and near-real-time\nmonitoring, flood mapping, and multi-hazard building damage detection.</p>\n<p>The seminar will also briefly introduce PANGAEA, a global benchmark for\nGeospatial Foundation Models, and discuss insights from the systematic\nevaluation of widely used foundation models across multiple geospatial\ndomains. 
Finally, Professor Ban will briefly outline the objectives of the\nrecently established AI4EO Working Group within Group on Earth Observations\n(GEO), which aims to advance GEO’s vision of Earth Intelligence for All\nthrough AI-driven Earth observation research, innovation, and collaboration.\n<cite>-- <a href=\"https://watch.eeg.cl.cam.ac.uk/w/dZNDoKiuH8sugfKLCWh8gS\">Yifang Ban, EEG Seminar</a>, March 2026</cite></p>\n</blockquote>\n<p>But most importantly, we had excellent fish and chips to celebrate her <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7434213135705694210/\">first visit</a> to Cambridge!</p>\n<h2 id=\"fun-links\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#fun-links\"></a>Fun links</h2>\n<ul>\n<li>Next OxCaml <a href=\"/notes/aoah-2025-13\">vibespiling</a> target: &quot;<a href=\"https://iev.ee/blog/resharp-how-we-built-the-fastest-regex-in-fsharp/\">How we built the fastest regexp engine in F#</a>&quot; with code <a href=\"https://github.com/ieviev/resharp-dotnet\">here</a>.</li>\n<li>The calls to <a href=\"/notes/rs-future-of-publishing\">reform publishing</a> are getting <a href=\"https://www.experimental-history.com/p/the-one-science-reform-we-can-all\">louder and louder</a>. A new ATProto service called <a href=\"https://chive.leaflet.pub/3mgb6k5pwsc2q\">Chive</a> looks interesting here.</li>\n<li>New podcast on sci-fi is a lot of fun, called <a href=\"https://starshipalexandria.com/\">Starship Alexandria</a> with Emma Newman and Adrian Tchaikovsky. I've been <a href=\"https://www.jonmsterling.com/2026-W10/\">reminded</a> to pick up Adrian's latest series <a href=\"https://www.goodreads.com/book/show/60147395-city-of-last-chances\">City of Last Chances</a> which I'm enjoying so far. 
Great insect world building as always!</li>\n<li>Extremely sad news is the passing of Prof <a href=\"https://royalsociety.org/people/alan-wilson-10879/\">Alan Wilson</a>, who I was showing my <a href=\"https://www.flickr.com/photos/avsm/albums/72177720328187549/with/54709177736\">Botswana leopard pictures</a> and getting flying tips from just a few months ago. He passed away in a light aircraft crash while heading into the sand dunes of Namibia. Very, very sad news.</li>\n<li>More bad news in (a pretty bad) week is that <a href=\"https://doi.org/10.21203/rs.3.rs-6079807/v1\">global warming has accelerated significantly</a>.</li>\n<li><strong>But the good news</strong> is that I learnt of <a href=\"https://en.wikipedia.org/wiki/Lazarus_taxon\">Lazarus taxa</a> that come back from extinction, such as this week's <a href=\"https://www.theguardian.com/environment/2026/mar/05/marsupials-discovered-new-guinea\">adorable marsupial thought extinct for 6000 years</a>. Hurrah!</li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2026). Connecting the dots for biodiversity action from the NAS/Royal Society Forum. <a href=\"https://doi.org/10.59350/dy7d3-hdt43\" target=\"_blank\"><i>10.59350/dy7d3-hdt43</i></a></li>\n<li>Madhavapeddy (2026). At the AI Impact Summit in Delhi: people, planet, progress. <a href=\"https://doi.org/10.59350/6vc5q-mbk23\" target=\"_blank\"><i>10.59350/6vc5q-mbk23</i></a></li>\n<li>Feng et al (2026). Applications of the TESSERA Geospatial Foundation Model to Diverse Environmental Mapping Tasks. SSRN. <a href=\"https://doi.org/10.2139/ssrn.6142416\" target=\"_blank\"><i>10.2139/ssrn.6142416</i></a></li>\n<li>Madhavapeddy (2025). Royal Society's Future of Scientific Publishing meeting. <a href=\"https://doi.org/10.59350/nmcab-py710\" target=\"_blank\"><i>10.59350/nmcab-py710</i></a></li>\n<li>Madhavapeddy (2026). 1st TESSERA/CoRE hackathon at the Indian AI Summit. 
<a href=\"https://doi.org/10.59350/1na80-7ak85\" target=\"_blank\"><i>10.59350/1na80-7ak85</i></a></li>\n<li>Omar et al (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763802\" target=\"_blank\"><i>10.1145/3759536.3763802</i></a></li>\n<li>Gibb et al (2026). Package Managers à la Carte: A Formal Model of Dependency Resolution. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2602.18602\" target=\"_blank\"><i>10.48550/arXiv.2602.18602</i></a></li>\n<li>Patterson et al (2023). Semantic Encapsulation using Linking Types. ACM. <a href=\"https://doi.org/10.1145/3609027.3609405\" target=\"_blank\"><i>10.1145/3609027.3609405</i></a></li>\n<li>10.1145/1328897.1328487<a href=\"https://doi.org/10.1145/1328897.1328487\" target=\"_blank\"><i>10.1145/1328897.1328487</i></a></li>\n<li>Rahmstorf et al (2025). Global Warming has Accelerated Significantly. Research Square. <a href=\"https://doi.org/10.21203/rs.3.rs-6079807/v1\" target=\"_blank\"><i>10.21203/rs.3.rs-6079807/v1</i></a></li>\n<li>Stites et al (2025). Multi-Language Probabilistic Programming. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2502.19538\" target=\"_blank\"><i>10.48550/arXiv.2502.19538</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w10",
      "title": ".plan-26-10: Streaming TESSERA working, biodiversity action papers, and FPL takes off",
      "summary": "TESSERA streaming in the browser, planetary programming at WG2.8, biodiversity action papers, FP Launchpad opens, and Docker CACM buzz",
      "date_published": "2026-03-08T00:00:00.000000Z",
      "date_modified": "2026-03-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "packages",
        "opensource",
        "docker",
        "india",
        "fplaunchpad"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/dy7d3-hdt43",
          "doi": "10.59350/dy7d3-hdt43",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/6vc5q-mbk23",
          "doi": "10.59350/6vc5q-mbk23",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.2139/ssrn.6142416",
          "doi": "10.2139/ssrn.6142416",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nmcab-py710",
          "doi": "10.59350/nmcab-py710",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/1na80-7ak85",
          "doi": "10.59350/1na80-7ak85",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763802",
          "doi": "10.1145/3759536.3763802",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2602.18602",
          "doi": "10.48550/arXiv.2602.18602",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3609027.3609405",
          "doi": "10.1145/3609027.3609405",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/1328897.1328487",
          "doi": "10.1145/1328897.1328487",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.21203/rs.3.rs-6079807/v1",
          "doi": "10.21203/rs.3.rs-6079807/v1",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2502.19538",
          "doi": "10.48550/arXiv.2502.19538",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/dy7d3-hdt43",
      "content_html": "<p>Last summer I spoke at the <a href=\"/notes/nas-rs-biodiversity\">US-UK Forum on Measuring Biodiversity</a> at the National Academy of Sciences, jointly organised with the Royal Society. Two companion papers from that forum have just been published in <a href=\"https://pnas.org\">PNAS</a> to turn the Washington meeting into an agenda for transforming how the world monitors global biodiversity.</p>\n<p><img src=\"/images/nas-rs-2.webp\" alt=\"%c\" title=\"Throwback to last summer at the National Academy of Sciences in DC!\" ></p>\n<p><a href=\"https://pnas.org/doi/10.1073/pnas.2519345123\"> <img src=\"/papers/2025-biodiversity-9recs\" alt=\"%rc\" title=\"Nine changes needed to deliver a radical transformation in biodiversity measurement\" > </a>\n<strong>&quot;<a href=\"/papers/2025-biodiversity-9recs\">Nine changes needed to deliver a radical transformation in biodiversity measurement</a>&quot;</strong> (<a href=\"/papers/2025-biodiversity-9recs.pdf\">pdf</a>) lead by <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> tackles the flood of new data from emerging technologies and makes concrete recommendations for ensuring biodiversity measurement can turn into effective action on the ground. Then <strong>&quot;<a href=\"/papers/2025-biodiversity-msf\">From data to decisions: Toward a Biodiversity Monitoring Standards Framework</a>&quot;</strong> (<a href=\"/papers/2025-biodiversity-msf.pdf\">pdf</a>) lead by <a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> introduces the BMSF as a structured, auditable &quot;chain of evidence&quot; that connects ethical principles through to standardised data collection, curation, analysis and reporting into a single federated framework. 
The first paper identifies what needs to change, and the BMSF paper provides a blueprint for <em>how</em> to implement those changes.</p>\n<p><a href=\"https://pnas.org/doi/10.1073/pnas.2519347123\"> <img src=\"/papers/2025-biodiversity-msf\" alt=\"%rc\" title=\"From data to decisions: Toward a Biodiversity Monitoring Standards Framework\" > </a>\nThe BMSF is particularly relevant to tying together my group's work on <a href=\"/projects/plancomp\">planetary computing</a>, and so I've been thinking hard about what to organise this year to bring more computer scientists into the fold. Data processing might sound boring at first glance, but the scales and diversity of information involved in biodiversity are <a href=\"https://doi.org/10.1146/annurev-environ-121522-045106\">mindbogglingly complex</a>. We are, after all, trying to describe all life on earth...</p>\n<h2 id=\"the-nine-recommendations-for-biodiversity\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-nine-recommendations-for-biodiversity\"></a>The nine recommendations for biodiversity</h2>\n<p>The key numbers in global biodiversity are pretty stark: roughly <a href=\"https://www.nationalgeographic.com/environment/article/ipbes-un-biodiversity-report-warns-one-million-species-at-risk\">1 million species threatened with extinction</a>, a <a href=\"https://www.worldwildlife.org/news/press-releases/catastrophic-73-decline-in-the-average-size-of-global-wildlife-populations-in-just-50-years-reveals-a-system-in-peril/\">70% decline in monitored wildlife populations since 1970</a>,\nand <a href=\"https://doi.org/10.1126/science.aax9931\">insect populations declining at around 10% per decade</a>.</p>\n<p>Against this backdrop, the paper makes nine recommendations towards accelerating action:</p>\n<ol>\n<li>\n<p><strong>Capitalise on novel technology to integrate data sources.</strong> This is partly where my work comes in! 
Geospatial foundation models like <a href=\"/projects/tessera\">TESSERA</a>, eDNA, <a href=\"/papers/2024-terracorder\">acoustic monitoring</a> and citizen science are generating data at scale. The challenge is combining these heterogeneous sources, which is where models like TESSERA fit by <a href=\"/papers/2025-tessera-tasks\">making downstream task combinations way more efficient</a>. What's missing is effective labeling infrastructure to combine the observational knowledge from space with human expert knowledge on the ground (more on that <a href=\"#tessera-and-labelling-frameworks\">later</a>).</p>\n</li>\n<li>\n<p><strong>Agree standard methods for data collection.</strong> Inconsistency in methods hinders all major uses of biodiversity data, but it's not practitioners that are at fault: there's just a lot of different levels of biodiversity that need to be monitored. Consider how different even a medium sized back garden is from one end to the other, and then scale this out to a tropical rainforest. The paper calls for a tiered and modular monitoring workflow, which is described in detail in the companion <a href=\"/papers/2025-biodiversity-msf\">BMSF paper</a>.</p>\n</li>\n<li>\n<p><strong>Ensure new technologies are calibrated with existing data.</strong> As we race to <a href=\"https://www.nightjar.tech/\">upgrade technology</a>, monitoring results from traditional field surveys (that might date back decades) need to be migrated to newer AI-driven workflows. But we <em>must</em> keep the primary observations distinct from the statistical derivations or else we <a href=\"/papers/2025-ai-poison\">poison our sources</a> and confound long-term trends.</p>\n</li>\n<li>\n<p><strong>Fill data gaps using emerging technologies, especially in the tropics.</strong> Biodiversity data is <a href=\"https://doi.org/10.1126/science.adh8874\">geographically biased toward charismatic species</a> in wealthy countries. 
Invertebrates, soil biota and tropical ecosystems are massively underrepresented; just check out <a href=\"https://shaneweisz.com\">Shane Weisz</a> and his <a href=\"/ideas/living-iucn-redlist\">living Red List dashboard</a> to see how poorly some taxa are represented in the overall picture.</p>\n</li>\n<li>\n<p><strong>Create living databases of trusted information to reduce AI poisoning.</strong> Our <a href=\"/projects/ce\">CE</a> <a href=\"/papers/2025-evidence-tap\">evidence TAP</a> pipeline of the scientific literature shows how we need institutionally federated networks of living evidence with humans-in-the-loop who can continuously gather, screen and index literature. The alternative of depending on <a href=\"/notes/red-pill-conservation\">blackbox and unaccountable LLMs</a> is rapidly descending on us, and so we are building out dynamic meta-analysis and active vetting of literature databases to spot <a href=\"/papers/2025-ai-poison\">AI poisoning</a>, not only to reject fabricated papers but also to invalidate downstream damage in the citation networks.</p>\n</li>\n<li>\n<p><strong>Ensure data generation is valued.</strong> Last summer, <a href=\"https://coomeslab.org\">David Coomes</a> and I spoke at an IUCN meeting about how <a href=\"https://bsky.app/profile/anil.recoil.org/post/3mc7p23y3gs2y\">vital field-based data collection</a> is for ground truth, especially as self-supervised AI advances. Incentives for data sharing need reform in the majority world where data is currently scarce, for example via governance for wholesale sharing, academic credit through coauthorship, recognition of data production as a research output, and sufficient resourcing for archival. 
My <a href=\"/notes/principles-for-collective-knowledge\">principles for collective knowledge</a> capture much of this from a technical perspective, but of course regulatory and financial reform is also necessary to push incentives in the right direction.</p>\n</li>\n<li>\n<p><strong>Ensure respectful incorporation of Indigenous Knowledge.</strong> When I spoke at the <a href=\"/notes/red-pill-conservation\">CE conference</a> recently, I carelessly used the term 'indigenous knowledge' to refer generically to grey literature, and <a href=\"https://www.biology.ox.ac.uk/people/ej-milner-gulland\">E. J. Milner-Gulland</a> quite rightly told me off afterwards! Indigenous knowledge systems are distinct in that they hold <em>generations</em> of environmental insight accumulated through direct interaction with ecosystems. I need to learn more about all this, as it also came up at the <a href=\"/notes/india-ai-summit\">Indian AI summit</a> in the conversations about sovereignty. The paper stresses <a href=\"https://en.wikipedia.org/wiki/Free,_prior_and_informed_consent\">Free, Prior and Informed Consent</a> and the <a href=\"https://www.gida-global.org/care\">CARE principles</a> for Indigenous data governance. After our paper was published, WarīNkwī Flores also <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7435043659147997184?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7435043659147997184%2C7435165914645405696%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287435165914645405696%2Curn%3Ali%3Aactivity%3A7435043659147997184%29\">pointed out</a> their work on <a href=\"https://doi.org/10.5281/zenodo.18675312\">sovereign data supply chains</a> which is very complementary.</p>\n</li>\n<li>\n<p><strong>Ensure measurements enable quantification of effectiveness of actions.</strong> Monitoring outcomes without a focus on evaluating the impact of interventions is like redecorating while the house burns. 
High spatial and temporal resolution data (for instance, dynamic <a href=\"https://about.conservationevidence.com/2026/01/16/geospatial-foundation-models/\">ground truth combined with geospatial models</a>) could estimate credible counterfactuals (what would have happened without the intervention). This in turn enables evidence-based policy at much larger scale than we currently do. Most ecological mitigation recommendations in the UK <a href=\"https://doi.org/10.1002/2688-8319.12089\">are not evidence based</a>, which means there's a lot of paperwork without much point!</p>\n</li>\n<li>\n<p><strong>Increase the resilience of global datasets to technical and societal change.</strong> This is a problem across society with the pace of AI, but it's particularly urgent for nature where business as usual results in <a href=\"/papers/2024-food-life\">mass extinctions if we take action in the wrong place</a>. There are short term measures such as federation that will help, but I'm fascinated by the potential of <a href=\"/papers/2025-internet-ecology\">our work on building an ecology for the internet</a> that draws on ecological theory to make digital infrastructure more resilient. This is a pretty far-out prospect right now, but coding models have advanced <em>massively</em> in just one year since I wrote that paper, and there's a growing connection between biodiversity science and computer science. 
There will be a <a href=\"https://avsm.leaflet.pub/3mgampzfq6k27\">workshop on &quot;Rewilding the Web&quot;</a> in Edinburgh (28-29 May 2026) exploring exactly these ideas of applying ecological insights to digital infrastructure resilience, so I hope to see you there if you are into this!</p>\n</li>\n</ol>\n<h2 id=\"the-biodiversity-monitoring-standards-framework\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-biodiversity-monitoring-standards-framework\"></a>The Biodiversity Monitoring Standards Framework</h2>\n<p>The <a href=\"/papers/2025-biodiversity-msf\">BMSF paper</a> then provides a more concrete implementation to follow the flow of data. This is probably of more relevance to computer scientists as it covers a lot of practical ground for which we already have many piecemeal solutions in the tech world. The paper describes an iterative cycle:</p>\n<p><img src=\"/images/msf-fig.webp\" alt=\"%c\" ></p>\n<ol>\n<li><strong>Ethics:</strong> Foundational principles including FAIR, CARE, FPIC ethical guidelines covering respect for life, indigenous data sovereignty, data security, transparency of purpose, the precautionary principle, and equitable benefit-sharing.</li>\n<li><strong>Sensing &amp; Knowing:</strong> Standardised data collection from multiple evidence streams including fixed sensing, scientific methods, indigenous knowledge systems, and citizen science.</li>\n<li><strong>Processing &amp; Curation:</strong>  Data validation, quality assurance, interoperability via standards like <a href=\"https://dwc.tdwg.org/\">Darwin Core</a> and multimodal integration where needed.</li>\n<li><strong>Trust</strong>: Provenance and licensing towards trusting inputs (see earlier comments about AI poisoning), metadata, data lineage tracking, clear licensing and attribution, security protocols, and data sovereignty.</li>\n<li><strong>Analysis:</strong> Standardised analytical approaches, uncertainty quantification based on groups of data and their sources, 
documented models and software that's reproducible (including training data for models, so bitwise reproducibility is not necessary)</li>\n<li><strong>Insight:</strong> Indicator calculation and interpretation relative to baselines to generate counterfactuals, including long-term trend interpretation.</li>\n<li><strong>Reporting:</strong> Consistent reporting formats, publishing for long-term accessibility, <a href=\"https://doi.org/10.1093/biosci/biaf189\">peer review mechanisms</a>, and communication.</li>\n</ol>\n<p>The framework is federated by design, in a manner quite compatible with <a href=\"/notes/cambridge-green-blue\">Ostrom's guidelines</a>, as locally generated data flows into regional and national hubs which apply consistent analytical workflows, producing GBF-compatible indicators for international reporting. Countries or organisations can start with simple methods and progressively adopt more sophisticated approaches as their capacity grows.</p>\n<blockquote>\n<p>For monitoring frameworks like REDD+ MRV, a centralized, top–down body like\nFAO makes sense given the direct link to the UNFCCC and IPCC and the coupling\nof climate observing systems to carbon stock and emissions models and\nassessments.</p>\n<p>However, the multifaceted nature of biodiversity and the existing landscape\nof key organizations make a federated governance model, with national\nownership at its heart, more appropriate and likely to succeed for the BMSF.\n<cite>-- Sec 2.4, <a href=\"https://www.pnas.org/doi/10.1073/pnas.2519347123\">From data to decisions: Toward a BMSF</a></cite></p>\n</blockquote>\n<p>The paper also provides a handy worked example for forest connectivity\nindicators under the maximum GBF Target 3, showing how each step adds to a\ncertified evidence chain that gives full provenance from <a href=\"/projects/rsn\">remote sensing</a> to the final reported indicator value.</p>\n<h2 id=\"connecting-to-collective-knowledge-systems\"><a class=\"anchor\" aria-hidden=\"true\" 
href=\"#connecting-to-collective-knowledge-systems\"></a>Connecting to collective knowledge systems</h2>\n<p>The BMSF maps on pretty well to the general <a href=\"/notes/principles-for-collective-knowledge\">principles for collective knowledge systems</a> that I recently sketched out:</p>\n<h3 id=\"permanence\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#permanence\"></a>Permanence</h3>\n<p><strong>&quot;Permanence&quot;</strong> maps to the BMSF's recommendation to use DOIs for datasets, version-controlled code pipelines, and long-term archiving. Rec 9 from the first paper makes the case that biodiversity data must survive <a href=\"https://www.npr.org/2025/08/08/nx-s1-5495338/climate-change-environment-websites-trump\">political shocks</a>, <a href=\"https://www.motherjones.com/politics/2025/02/us-fish-wildlife-service-conservation-funding-freeze-pause-endangered-species-animals/\">institutional closures</a> and <a href=\"https://doi.org/10.7717/peerj.2743\">format obsolescence</a>.</p>\n<p>The BMSF's requirement for persistent identifiers at every step of the chain fits in nicely with the &quot;DOIs for all&quot; approach that <a href=\"https://rogue-scholar.org\">Rogue Scholar</a> and <a href=\"https://zenodo.com\">Zenodo</a> have been pioneering for blogs and datasets (my own site uses both).</p>\n<h3 id=\"permission\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#permission\"></a>Permission</h3>\n<p><strong>&quot;Permission&quot;</strong> connects to the BMSF's Ethics step and its emphasis on FAIR and CARE principles, indigenous data sovereignty, and tiered access.\nAs I noted in my <a href=\"/notes/principles-for-collective-knowledge\">collective knowledge principles</a>, biodiversity data is a prime example of where everything <em>cannot</em> be open: location data for endangered species could be exploited by poachers, and indigenous communities have sovereign rights over their own traditional knowledge.</p>\n<p>The BMSF's federated model explicitly 
supports this as not all collected data needs to be shared openly, and the principle of &quot;guardianship&quot; enables tiered access levels. This is the kind of <a href=\"/papers/2025-bifrost\">spatial permissioning</a> that current Internet protocols handle poorly.</p>\n<h3 id=\"placement\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#placement\"></a>Placement</h3>\n<p><strong>&quot;Placement&quot;</strong> maps to the BMSF's federated, nationally- and regionally-owned architecture. The tiered design allows countries to operate at their own level of capacity while still contributing to global assessments.</p>\n<p>We're feeling this with TESSERA's multi-petabyte embeddings as well; they physically can't move casually between continents and networks, and so the processing must be distributed and federated rather than centralised. <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> has suggested a mirror in India as our first federated node!</p>\n<h3 id=\"provenance\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#provenance\"></a>Provenance</h3>\n<p><strong>&quot;Provenance&quot;</strong> is at the centre of the BMSF as it is designed as an auditable chain of evidence where every indicator can be traced back through analytical methods, data processing, licensing and raw observations.</p>\n<p>Our fifth recommendation calls out for hallucination-free AI systems for evidence synthesis, and the BMSF's Trust+Analysis phases provide the architecture to make provenance tracking systematic.\nThe same provenance protocols we <a href=\"/papers/2025-internet-ecology\">need for the broader Internet</a> (tracking where knowledge came from and whether to trust it) apply even more to biodiversity data that informs actions affecting the abundances of millions of species.</p>\n<p>There's also clear need for <a href=\"/notes/coar-prc\">revamping peer review</a> in this world, something that <a href=\"https://boninabox.geobon.org\">BON-in-a-Box</a> is <a 
href=\"https://boninabox.geobon.org/indicator?i=GeneticDiversity\">tackling</a>. Our <a href=\"/papers/2025-programming-gbon\">PROPL 2025 paper on programming the BON</a> lays out some projects in this space, and the talk below is an excellent overview of the space for those computer scientists interested in learning more.</p>\n<p><div class=\"video-center\"><iframe title=\"Programming Opportunities for the Global Biodiversity Observation Network\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/e6b55b76-7ef2-4851-bbd9-9200ff59d044\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<h2 id=\"how-this-maps-to-my-immediate-research-efforts\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-this-maps-to-my-immediate-research-efforts\"></a>How this maps to my immediate research efforts</h2>\n<p>Several strands of the research in the EEG connect directly to what these papers are calling for!</p>\n<h3 id=\"tessera-and-labelling-frameworks\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera-and-labelling-frameworks\"></a>TESSERA and labelling frameworks</h3>\n<p><a href=\"https://tee.cl.cam.ac.uk\"> <img src=\"/images/tee-ss-1.webp\" alt=\"%rc\" title=\"The TESSERA Embeddings Explorer, browsable at https://tee.cl.cam.ac.uk\" > </a>\n<a href=\"/projects/tessera\">TESSERA</a> provides an open and robust geospatial foundation model to satisfy the first recommendation and support diverse downstream tasks from crop classification to canopy height estimation.</p>\n<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and I are just starting work on a labelling framework that fits within the BMSF architecture, so that the ground-truth data used to finetune TESSERA for biodiversity applications can map to the standardised collection, curation and provenance requirements laid out in the framework. 
Since TESSERA itself is fully open and reproducible, this is a big step toward making AI-driven biodiversity assessments auditable end-to-end. There is also the intriguing possibility of using the <a href=\"/notes/atproto-for-fun-and-blogging\">ATProto</a> to anchor the federation while re-using identity infrastructure across social networks, but we need to carefully consider issues of privacy. More on this as it develops!</p>\n<h3 id=\"evidence-synthesis-and-living-databases\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#evidence-synthesis-and-living-databases\"></a>Evidence synthesis and living databases</h3>\n<p>Our <a href=\"/papers/2025-evidence-tap\">paper pipeline</a> for analysing the academic literature at scale for\n<a href=\"/projects/ce\">Conservation Evidence</a> is a working prototype of the &quot;living databases&quot;\nthat the fifth recommendation calls for.</p>\n<p>Dynamic meta-analysis, hallucination-free screening with humans in the loop,\nand continuous evidence ingestion are all on the table here, and the BMSF\nprovides the architecture within which these tools can interoperate. We're just\nbeginning a <a href=\"/notes/2026w7\">new project</a> on this that also includes Education as well\nas conservation, which I'm very excited about!</p>\n<h3 id=\"internet-resilience-meets-biodiversity\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#internet-resilience-meets-biodiversity\"></a>Internet resilience meets biodiversity</h3>\n<p>The most fun and somewhat far out thing is our <a href=\"/papers/2025-internet-ecology\">ecology of the internet</a> work, which tries to draw a connection\nbetween ecological resilience principles and the digital infrastructure that\nbiodiversity data depends on. 
This is a <a href=\"/notes/ecology-at-aarhus\">deliberately circular</a> argument to see\nwhere the thoughts go: we need resilient digital infrastructure to monitor\nbiodiversity, and ecological theory can teach us how to build that\ninfrastructure.</p>\n<p>The upcoming <a href=\"https://avsm.leaflet.pub/3mgampzfq6k27\">Rewilding the Web workshop</a> in Edinburgh will bring\ntogether ecologists, philosophers and technologists to push this further. I'm also working with\n<a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> to figure out how to connect all these into a <a href=\"/papers/2025-fairground\">global live wiki</a>, and <a href=\"https://jon.recoil.org\">Jon Ludlam</a> is <a href=\"https://jon.recoil.org/blog/2026/03/weeknotes-2026-09.html\">hacking on a prototype</a> already!</p>\n<p><div class=\"video-center\"><iframe title=\"A FAIR Case for a Live Computational Commons\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/0e82977c-ba11-487f-bece-147fb1da104d\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>These two perspective papers represent the published output of the conversations we had at the NAS last summer, and my own agenda is now a bit clearer. I want to connect models like TESSERA and our Cambridge TAP to federated, auditable monitoring systems that can transform the flood of biodiversity data into credible evidence that's urgently needed to halt and reverse nature loss. Stay tuned for some ideas I have about venues where we can do this together!</p><h1>References</h1><ul><li>Sutherland et al (2026). Nine changes needed to deliver a radical transformation in biodiversity measurement. <a href=\"https://doi.org/10.1073/pnas.2519345123\" target=\"_blank\"><i>10.1073/pnas.2519345123</i></a></li>\n<li>Madhavapeddy (2026). Discussing effective conservation with all the UK Chief Scientists. 
<a href=\"https://doi.org/10.59350/qjrmv-38130\" target=\"_blank\"><i>10.59350/qjrmv-38130</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Gonzalez et al (2026). From data to decisions: Toward a Biodiversity Monitoring Standards Framework. <a href=\"https://doi.org/10.1073/pnas.2519347123\" target=\"_blank\"><i>10.1073/pnas.2519347123</i></a></li>\n<li>Madhavapeddy (2026). At the AI Impact Summit in Delhi: people, planet, progress. <a href=\"https://doi.org/10.59350/6vc5q-mbk23\" target=\"_blank\"><i>10.59350/6vc5q-mbk23</i></a></li>\n<li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Feng et al (2026). Applications of the TESSERA Geospatial Foundation Model to Diverse Environmental Mapping Tasks. SSRN. <a href=\"https://doi.org/10.2139/ssrn.6142416\" target=\"_blank\"><i>10.2139/ssrn.6142416</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Madhavapeddy (2025). Publish, Review, Curate to upend scholarly publishing. <a href=\"https://doi.org/10.59350/fpc9w-ccj82\" target=\"_blank\"><i>10.59350/fpc9w-ccj82</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Millar et al (2025). An Architecture for Spatial Networking. arXiv. 
<a href=\"https://doi.org/10.48550/arXiv.2507.22687\" target=\"_blank\"><i>10.48550/arXiv.2507.22687</i></a></li>\n<li>Madhavapeddy (2025). Presenting our Ecology of the Internet ideas at Aarhus 2025. <a href=\"https://doi.org/10.59350/p45b8-kvt85\" target=\"_blank\"><i>10.59350/p45b8-kvt85</i></a></li>\n<li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li>\n<li>Omar et al (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763802\" target=\"_blank\"><i>10.1145/3759536.3763802</i></a></li>\n<li>Madhavapeddy (2025). The Cambridge \"Green Blue\" competition to reduce emissions. <a href=\"https://doi.org/10.59350/y1g67-aq825\" target=\"_blank\"><i>10.59350/y1g67-aq825</i></a></li>\n<li>Madhavapeddy (2025). Using AT Proto for more than just Bluesky posts. <a href=\"https://doi.org/10.59350/32rdt-zny05\" target=\"_blank\"><i>10.59350/32rdt-zny05</i></a></li>\n<li>Burgess et al (2024). Global Metrics for Terrestrial Biodiversity. <a href=\"https://doi.org/10.1146/annurev-environ-121522-045106\" target=\"_blank\"><i>10.1146/annurev-environ-121522-045106</i></a></li>\n<li>Klink et al (2020). Meta-analysis reveals declines in terrestrial but increases in freshwater insect abundances. <a href=\"https://doi.org/10.1126/science.aax9931\" target=\"_blank\"><i>10.1126/science.aax9931</i></a></li>\n<li>Chapman et al (2024). 
Biodiversity monitoring for a just planetary future. <a href=\"https://doi.org/10.1126/science.adh8874\" target=\"_blank\"><i>10.1126/science.adh8874</i></a></li>\n<li>Hub et al (2026). Sovereign Data Supply Chain – Functional and Operational Framework. <a href=\"https://doi.org/10.5281/zenodo.18675312\" target=\"_blank\"><i>10.5281/zenodo.18675312</i></a></li>\n<li>Hunter et al (2021). Evidence shortfalls in the recommendations and guidance underpinning ecological mitigation for infrastructure developments. <a href=\"https://doi.org/10.1002/2688-8319.12089\" target=\"_blank\"><i>10.1002/2688-8319.12089</i></a></li>\n<li>Griffith et al (2026). BON in a Box: An Open and Collaborative Platform for Biodiversity Monitoring, Indicator Calculation, and Reporting. <a href=\"https://doi.org/10.1093/biosci/biaf189\" target=\"_blank\"><i>10.1093/biosci/biaf189</i></a></li>\n<li>Escribano et al (2016). Biodiversity data obsolescence and land uses changes. PeerJ Inc.. <a href=\"https://doi.org/10.7717/peerj.2743\" target=\"_blank\"><i>10.7717/peerj.2743</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/nas-rs-biodiversity-papers",
      "title": "Connecting the dots for biodiversity action from the NAS/Royal Society Forum",
      "summary": "Summary of the Nine Recommendations and Biodiversity Monitoring Standards Framework papers from the NAS/Royal Society US-UK Forum in summer 2025, and how they connect to my work on collective knowledge systems, TESSERA, and evidence synthesis.",
      "date_published": "2026-03-07T00:00:00.000000Z",
      "date_modified": "2026-03-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "biodiversity",
        "conservation",
        "policy",
        "royalsociety",
        "usa",
        "ai",
        "llm"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-biodiversity-9recs.pdf",
          "mime_type": "application/pdf",
          "title": "Nine changes needed to deliver a radical transformation in biodiversity measurement"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1073/pnas.2519345123",
          "doi": "10.1073/pnas.2519345123",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/qjrmv-38130",
          "doi": "10.59350/qjrmv-38130",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/j6zkp-n7t82",
          "doi": "10.59350/j6zkp-n7t82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1073/pnas.2519347123",
          "doi": "10.1073/pnas.2519347123",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/6vc5q-mbk23",
          "doi": "10.59350/6vc5q-mbk23",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3744169.3744180",
          "doi": "10.1145/3744169.3744180",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.2139/ssrn.6142416",
          "doi": "10.2139/ssrn.6142416",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fpc9w-ccj82",
          "doi": "10.59350/fpc9w-ccj82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2507.22687",
          "doi": "10.48550/arXiv.2507.22687",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/p45b8-kvt85",
          "doi": "10.59350/p45b8-kvt85",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2408.02407",
          "doi": "10.48550/arXiv.2408.02407",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763802",
          "doi": "10.1145/3759536.3763802",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/y1g67-aq825",
          "doi": "10.59350/y1g67-aq825",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/32rdt-zny05",
          "doi": "10.59350/32rdt-zny05",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1146/annurev-environ-121522-045106",
          "doi": "10.1146/annurev-environ-121522-045106",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1126/science.aax9931",
          "doi": "10.1126/science.aax9931",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1126/science.adh8874",
          "doi": "10.1126/science.adh8874",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.5281/zenodo.18675312",
          "doi": "10.5281/zenodo.18675312",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1002/2688-8319.12089",
          "doi": "10.1002/2688-8319.12089",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1093/biosci/biaf189",
          "doi": "10.1093/biosci/biaf189",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.7717/peerj.2743",
          "doi": "10.7717/peerj.2743",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2026w9",
      "content_html": "<h2 id=\"decade-of-docker-containers\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#decade-of-docker-containers\"></a>Decade of Docker containers</h2>\n<p>A busy week on the socials as the paper <a href=\"https://dave.recoil.org\">Dave Scott</a>, <a href=\"https://github.com/justincormack\">Justin Cormack</a> and I wrote\nlooking back at the Docker adventure made <a href=\"/notes/cacm-docker-cover\">the cover of the Communications of the CACM</a>.  Lots of really positive coverage about it online,\nand the print issue should be with you all this coming week!</p>\n<p><a href=\"https://cacm.acm.org/research/a-decade-of-docker-containers/\"> <img src=\"/images/cacm-docker-cover-1.webp\" alt=\"%c\" title=\"Cover of a Decade of Docker Containers\" > </a></p>\n<h2 id=\"tessera-and-zarr\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera-and-zarr\"></a>TESSERA and Zarr</h2>\n<p>In TESSERA land I spent most of my week heads down on getting full Zarr streaming support. This has been extraordinarily successful; I now not only have a significant chunk of embeddings streaming over HTTP using Zarr, it also meant that I could build an entire classification and segmentation pipeline that runs entirely in my browser! I prototyped a user interface that can run <a href=\"https://toao.com/blog/earth-observation-budget-solar-farms-tiny-model\">Sadiq's solar farm CNN</a> entirely in my browser using wasm and WebGPU. There's a nice roundup for <a href=\"https://medium.com/@tobias.ramalho.ferreira/zarr-in-the-browser-fast-flexible-and-surprisingly-powerful-for-big-geo-data-eeb90ddf8a3d\">Zarr visualisation options</a> that greatly helped me put this together.</p>\n<p><img src=\"/images/tessera-zarr-stream-1.webp\" alt=\"%c\" title=\"A full browser based streaming interface for TESSERA using Zarr, to be released soon!\" ></p>\n<p>Here's a screenshot to whet your appetite; I'll write a full blog about this shortly and post a hosted URL.  
The reason it's taking a bit longer is that <a href=\"https://www.tunbury.org/\">Mark Elvers</a> and I have been busy rearranging storage in our cluster; exposing all the embeddings as Zarr means transcoding 100s of terabytes of data, which is killing our poor internal network.</p>\n<p>While working on this, I also found this <a href=\"https://rohitbandaru.github.io/blog/JEPA-Deep-Dive/\">overview of hierarchical JEPA</a> to be very good.</p>\n<p>The OxCaml TESSERA pipeline continues to <a href=\"https://www.tunbury.org/2026/02/25/teserra-pipeline/\">come online</a> and <a href=\"https://jon.recoil.org\">Jon Ludlam</a> also got <a href=\"https://jon.recoil.org/blog/2026/02/weeknotes-2026-08.html#oxcaml\">odoc OxCaml docs building</a> from my <a href=\"https://github.com/avsm/oxmono\">monorepo</a> which is a HUGE help to developing all the code.</p>\n<p>While preparing my <a href=\"https://ifip-wg28.github.io\">wg2.8</a> I also needed an HTTP caching proxy that would permanently persist Zarr tiles on my laptop for demos, so I wrote a <a href=\"https://github.com/avsm/oxmono/tree/main/avsm/httpz-perma-proxy\">HTTP perma proxy</a> over httpz/OxCaml as well which is working for me.\nAlso spent a bit of time reviewing <a href=\"https://github.com/ocaml/opam-repository/pull/29451\">OCaml relocatability in opam</a> as that would be extremely useful for rapid development setup of oxmono.</p>\n<p>I also researched <a href=\"https://github.com/huggingface/datasets/issues/4096\">Huggingface support for Zarr</a> which is still not quite there but it seems to be of interest to HF.</p>\n<h2 id=\"package-management-a-la-carte\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#package-management-a-la-carte\"></a>Package management a la carte</h2>\n<p><a href=\"https://ryan.freumh.org\">Ryan Gibb</a> followed up on his <a href=\"/notes/2026w7\">FOSDEM talks extravaganza</a> with a preprint on the <a href=\"/papers/2026-package-calculus\">package management calculus</a> we've been 
working on for a while. Lots of discussions and online interest in places like <a href=\"https://news.ycombinator.com/item?id=47136272\">HN</a> and <a href=\"https://lobste.rs/s/fm1eln/package_managers_la_carte_formal_model\">Lobsters</a>.  I thought one of the most telling things about how subtle this area is was from HN. Someone confidently pointed out that Rust has semver:</p>\n<blockquote>\n<p>The rust ecosystem standardised on semver. This means it is perfectly allowed to use 1.2 in place of 1.1. While you can specify upper bounds for the dependency ranges, that is extremely uncommon in practice. Instead the bounds are just &quot;1.1 or newer semver compatible&quot; etc.</p>\n</blockquote>\n<p>...but the <a href=\"https://news.ycombinator.com/item?id=47136272#47193369\">reality</a> that Ryan points out when you look at the crates registry is:</p>\n<blockquote>\n<p>In https://github.com/rust-lang/crates.io-index I count just under 7000 upper bounds on dependency ranges that aren't\njust semver in disguise (e.g. not &quot;&gt;=1.0.0, &lt;2.0.0&quot;):\n$ rg --no-filename -o '&quot;req&quot;:&quot;[^&quot;]<em>&lt;[^&quot;]</em>&quot;' . | grep -Ev '&lt; ?=? ?([0-9]+(.0){0,2}|0.[0-9]+(.0)?)&quot;' | wc -l\n6727\nSo it's definitely used. One person's non-breaking change is another's breaking change https://xkcd.com/1172/</p>\n</blockquote>\n<p>This repeats itself all over the place. 
Table 1 of the <a href=\"/papers/2026-package-calculus\">paper</a> covers a lot of the edge cases that we had to categorise when working through all the package ecosystems.</p>\n<p>There was also a <a href=\"https://lobste.rs/s/fm1eln/package_managers_la_carte_formal_model\">brief discussion</a> on Lobsters about <a href=\"https://www.oilshell.org/blog/2022/03/backlog-arch.html#what-is-a-narrow-waist\">narrow waist effect</a> in technology.</p>\n<h2 id=\"fun-links-around-the-web\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#fun-links-around-the-web\"></a>Fun links around the web</h2>\n<p>My paper of the week is <a href=\"https://www.usenix.org/system/files/osdi25-schuermann.pdf\">&quot;Building Bridges: Safe Interactions with Foreign Languages through Omniglot&quot;</a> from OSDI25, which builds out a generic FFI with zero-copy. Seems very useful for OxCaml as well; I'm going to see Mae Milano at next week's WG2.8 and hopefully learn more!</p>\n<p><a href=\"https://www.usenix.org/system/files/osdi25-schuermann.pdf\"> <img src=\"/images/omniglot-fig1.webp\" alt=\"%c\" title=\"Building Bridges: Safe Interactions with Foreign Languages through Omniglot, OSDI 25\" > </a></p>\n<ul>\n<li>New <a href=\"https://www.linkedin.com/pulse/talent-everywhere-opportunity-isnt-greater-cambridge-impact-gnvpe\">Greater Cambridge Impact</a> startup in Cambridge with a nice focus on closing social inequality around here. Nearly <a href=\"https://www.cambridge-news.co.uk/news/local-news/cambridgeshire-areas-deprived-kids-suffering-29293363\">1 in 3 children in parts of Cambridgeshire</a> live in poverty, a shocking statistic against the backdrop of University largesse. 
And related to the next item, <a href=\"https://www.theguardian.com/business/2023/nov/14/millions-of-uk-households-forced-to-unplug-fridge-to-cope-with-rising-bills\">a million households don't have a fridge</a> due to energy poverty.</li>\n<li>Brilliant episode of Amol Rajan's podcast on <a href=\"https://www.bbc.co.uk/sounds/play/curation/m001bm45/m002rrxd\">the addiction to ultra processed foods</a>, which shows the other side of the coin to our own work on <a href=\"/notes/food-and-risk-to-life\">food and the risk to wildlife</a> from the <a href=\"\">LIFE metric</a>. Treating UPF as an <a href=\"https://www.theguardian.com/food/2023/oct/12/its-like-trying-to-quit-smoking-why-are-1-in-7-of-us-addicted-to-ultra-processed-foods\">addictive substance</a> seems to be the only way out of this monocultural consumption mess we've got ourselves into.</li>\n<li>While convincing me to visit Brown (not much convincing needed to be honest), <a href=\"https://cs.brown.edu/~sk/\">Shriram Krishnamurthi</a> pointed out that <a href=\"https://en.wikipedia.org/wiki/Roger_Williams\">Roger Williams</a>, who founded Rhode Island in 1636, was educated at... Pembroke College Cambridge!</li>\n<li>Extremely cool paper upcoming at CVPR26 on <a href=\"https://wenjiawang0312.github.io/projects/embodmocap/\">EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents</a> that uses just two iPhones to do mocap reconstructions. 
Very relevant to our <a href=\"/ideas/digitisation-of-insects\">insect reconstruction</a> project.</li>\n<li>Interesting to note that Forester may also <a href=\"https://amok.recoil.org/@avsm/116119655840612727\">benefit from odoc plugins</a> for knowledge visualisation.</li>\n<li>I've started listening to back episodes of <a href=\"https://podcasts.castplus.fm/environment-variables\">Anne Currie's podcast called Environment Variables</a>.</li>\n<li>Been thinking about how <a href=\"https://nfraprado.net/post/vcard-rss-as-an-alternative-to-social-media.html\">Vcard and RSS fit together for my blog</a>.</li>\n<li><a href=\"https://loosemore.com/\">Tom Loosemore</a> makes a great case for how <a href=\"https://loosemore.com/2026/02/25/ai-agents-will-join-up-government-before-government-does/\">gov.uk should handle AI agents</a> by <em>&quot;restricting access to only those AI Agents who sign up to a GOV.UK kitemark, with legally-mandated conditions to manage the risks above&quot;</em>. Excellent analogy to how India <a href=\"https://economictimes.indiatimes.com/tech/technology/uidai-teams-up-with-google-for-display-of-authorised-aadhaar-centres-on-google-maps/articleshow/128815309.cms?from=mdr\">did this while Aadhaar was maturing</a> as a universal ID.</li>\n</ul>\n<p>Next week I'm in Portugal at WG2.8, so less hacking than usual!</p><h1>References</h1><ul><li>Gibb et al (2026). Package Managers à la Carte: A Formal Model of Dependency Resolution. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2602.18602\" target=\"_blank\"><i>10.48550/arXiv.2602.18602</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w9",
      "title": ".plan-26-09: Browser TESSERA, package management and Docker in the CACM",
      "summary": "Got TESSERA working in Zarr and the browser, and a preprint of package management a la carte pushed out",
      "date_published": "2026-03-01T00:00:00.000000Z",
      "date_modified": "2026-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "packages",
        "opensource",
        "docker"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2602.18602",
          "doi": "10.48550/arXiv.2602.18602",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/cacm-docker-cover",
      "content_html": "<p>I am <em>beyond</em> excited to be on the cover of the <a href=\"https://cacm.org\">CACM</a> March issue with &quot;<strong><a href=\"https://cacm.acm.org/research/a-decade-of-docker-containers/\">A Decade of Docker Containers</a></strong>&quot;, coauthored with <a href=\"https://dave.recoil.org\">Dave Scott</a> and <a href=\"https://github.com/justincormack\">Justin Cormack</a>:</p>\n<p><a href=\"https://cacm.acm.org/research/a-decade-of-docker-containers/\"> <img src=\"/images/cacm-docker-cover-1.webp\" alt=\"%rc\" title=\"Cover of the CACM March 2026!\" > </a></p>\n<blockquote>\n<p>For the past decade, Docker has provided a robust solution for building,\nshipping, and sharing applications. But behind its simple &quot;build and run&quot;\nworkflow lie many years of complex technical challenges.\n<cite>-- <a href=\"https://cacm.acm.org/research/a-decade-of-docker-containers/\">A Decade of Docker Containers</a>, Communications of the ACM, Mar 26</cite></p>\n</blockquote>\n<p>Docker was such a whirlwind ride that we never got to write any academic papers about some of the technical systems magic that went into it. Today's article, along with the <a href=\"/papers/2025-docker-icfp\">ICFP experience report</a> from last year form a companion pair to delve into the tricks required to scale the system to millions of daily users.</p>\n<p>We cover the technical origins in Linux, the library VMM layers needed to hide Linux on macOS and Windows. 
And then we discuss where Docker is going next, with the giant AI coding wave making it incredibly important to sandbox agents running pretty much everywhere now.</p>\n<div class=\"video-center\" style=\"padding:56.25% 0 0 0;position:relative;\"><iframe src=\"https://player.vimeo.com/video/1166690675?badge=0&amp;autopause=0&amp;player_id=0&amp;app_id=58479\" frameborder=\"0\" allow=\"autoplay; fullscreen; picture-in-picture; clipboard-write; encrypted-media; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" style=\"position:absolute;top:0;left:0;width:100%;height:100%;\" title=\"A Decade of Docker Containers\"></iframe></div><script src=\"https://player.vimeo.com/api/player.js\"></script>\n<p>The video accompanying the article was <a href=\"/notes/2026w6\">recorded</a> in my office by the wonderful <a href=\"https://www.rosiepowellfreelance.com/\">Rosie Powell</a>, with thanks to Pembroke College. And the pixel cover art of container ships that the CACM commissioned is fantastic!</p>\n<h2 id=\"getting-involved-in-docker\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#getting-involved-in-docker\"></a>Getting involved in Docker</h2>\n<p><img src=\"/images/cacm-docker-cover-5.webp\" alt=\"%rc\" title=\"The first time I booted up an embedded VM and got a console!\" >\nFirstly, a huge thank you to Solomon Hykes, the project founder and the person who invited us <a href=\"/notes/docker-buys-unikernel-systems\">to join forces</a> in the early days of Docker. We all holed up in a French farm and hacked like mad (<a href=\"https://x.com/solomonstre/status/1584963582235906049\">photo from Solomon</a>) and came up with the first iteration of Docker for Desktop in a few days!</p>\n<p>We didn't actually realise that's what we'd <a href=\"https://web.archive.org/web/20160504110338/https://blog.docker.com/2016/03/docker-for-mac-windows-beta/\">call it back then</a>.  
The project was originally codenamed Pinata and even had a <a href=\"https://forums.docker.com/t/pinata-missing-in-latest-mac-beta-1-11-2-beta15/15541\">CLI tool</a> of the same name for quite a while! To get a feel for whether it would be popular, we took a leaf out of Gmail's launch and sent out limited invite codes. We needn't have worried, as it took off fast (except the traditional <a href=\"https://news.ycombinator.com/item?id=11352389\">HN disdain</a>) with <a href=\"https://medium.com/@nzoschke/docker-for-mac-beta-review-b91692289eb5\">positive reviews</a>.</p>\n<blockquote>\n<p>Docker For Mac is a game changer. I’ve been able to cope with the previous tools but the experience has been rough to say the least.\n<cite>-- <a href=\"https://medium.com/@nzoschke/docker-for-mac-beta-review-b91692289eb5\">Docker For Mac Beta Review</a>, Noah Zoschke, Apr 2016</cite></p>\n</blockquote>\n<p>After the desktop beta came out in 2016, we also <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">open sourced quite a few components</a>, some of which are now features <a href=\"/notes/apple-containerisation\">implemented</a> in macOS and Windows. 
Some <a href=\"/papers/2025-docker-icfp\">tricks like VPNKit</a> are now <a href=\"https://github.com/containers/gvisor-tap-vsock\">adopted widely</a> in other ecosystems, which is nice to see.</p>\n<h2 id=\"docker-is-defined-by-its-incredible-community\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#docker-is-defined-by-its-incredible-community\"></a>Docker is defined by its incredible community</h2>\n<p><div class=\"video-vertical\"><iframe title=\"Solomon Hykes unveiling the Docker Desktop Pinãta\" src=\"https://crank.recoil.org/videos/embed/2c165f30-ea51-4b6e-893d-9273aba630be\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" style=\"aspect-ratio: 9/16; width: 100%; height: 100%; max-width: 325px;\"></iframe></div>\nWhile our article covers the technical aspects of Docker, we don't comment enough on how <em>fun</em> the community is! (See the massive <a href=\"https://cacm.acm.org/research/a-decade-of-docker-containers/#:~:text=from%20external%20networks.-,Acknowledgments,-We%20are%20grateful\">acknowledgements</a> section in the article for just a small sample of the key contributors).</p>\n<p>Container management and cloud computing are obviously worth vast amounts of money now, but the giant whale and plush toys and crazy antics at Dockercons are what I'll remember most fondly.  Throughout all the ups and downs, Docker's been (I strongly feel) a strong force for openness in preventing any single entity capturing the full workflow of how we manage software, and therefore contributing to building a vibrant and diverse ecosystem.</p>\n<p>Today, it's still entirely possible for a small player to quite simply spin up their own selfhosted infrastructure and interoperate with the behemoths. 
That's important; heck I use <a href=\"https://docs.docker.com/engine/swarm/\">swarm mode</a> on my own <a href=\"\">##selfhosting</a> to this day!</p>\n<p>We're seeing a big change in open source community building happening this year. The vibe coding onslaught is calling into question how we'll make open source friends in the future, and it looks like we're falling back to <a href=\"https://mitchellh.com/writing/my-ai-adoption-journey\">reputation networks for contributors</a>. I hope that we see more Docker-style communities spring up than boring corporate driven artificial ecosystems!</p>\n<h2 id=\"the-futures-bright-for-containerisation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-futures-bright-for-containerisation\"></a>The future's bright for containerisation</h2>\n<p><img src=\"/images/cacm-docker-cover-3.webp\" alt=\"%rc\" title=\"Stealing a camel and joining forces with a whale at OSCon\" >\nThe other great thing to see in recent years is the new generation of maintainers hacking on adjacent technologies among my colleagues and students. <a href=\"https://www.tunbury.org/\">Mark Elvers</a> is advancing <a href=\"https://www.tunbury.org/2026/02/19/obuilder-hcs/\">Windows support with HCS</a>, <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> just uploaded his latest work on <a href=\"/papers/2026-package-calculus\">formalising dependency management</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> has been hacking on <a href=\"https://patrick.sirref.org/merry/index.xml\">shell integrated provenance</a>. 
And &quot;old&quot; maintainers like <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> who I cofounded UnikS with are <a href=\"https://gazagnaire.org/blog/2026-02-23-asplos-unikernels.html\">taking Docker and OCaml into space</a>!</p>\n<p><img src=\"/images/cacm-docker-cover-2.webp\" alt=\"%rc\" title=\"At Pembroke College with Solomon\" >\nCombining these advances with <a href=\"/papers/2025-ocaml-ai\">agentic coding</a> results in radically different coding methodologies, but using the same lower level interfaces that Docker's built on today. Evolution is happening fast, and more accessible than ever thanks to Docker's open source roots.</p>\n<p>Here's to the coming century of containerisation; enjoy <a href=\"https://cacm.acm.org/research/a-decade-of-docker-containers/\">reading the article</a> and do let me know if you have any comments or queries!</p>\n<p><img src=\"/images/cacm-docker-cover-4.webp\" alt=\"%c\" title=\"Remembering Gordon the turtle, sadly passed on now\" ></p><h1>References</h1><ul><li>Madhavapeddy et al (2026). A Decade of Docker Containers. <a href=\"https://doi.org/10.1145/3761803\" target=\"_blank\"><i>10.1145/3761803</i></a></li>\n<li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Madhavapeddy (2025). Under the hood with Apple's new Containerization framework. <a href=\"https://doi.org/10.59350/70ynk-ves20\" target=\"_blank\"><i>10.59350/70ynk-ves20</i></a></li>\n<li>Gibb et al (2026). Package Managers à la Carte: A Formal Model of Dependency Resolution. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2602.18602\" target=\"_blank\"><i>10.48550/arXiv.2602.18602</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/cacm-docker-cover",
      "title": "A Decade of Docker Containers on the CACM cover!",
      "summary": "Our CACM cover article reflects on a decade of Docker, from the early days of hacking Docker for Mac on a French farm to today's AI-driven sandboxing, covering the technical origins, cross-platform challenges, and the vibrant open-source community that made it all possible.",
      "date_published": "2026-02-24T00:00:00.000000Z",
      "date_modified": "2026-02-24T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "docker",
        "ocaml",
        "unikernels"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3761803",
          "doi": "10.1145/3761803",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/70ynk-ves20",
          "doi": "10.59350/70ynk-ves20",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2602.18602",
          "doi": "10.48550/arXiv.2602.18602",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/wvx71-0na91",
      "content_html": "<p>Most of the week was taken up by hopping over to New Delhi to host a <a href=\"/notes/first-tessera-hackathon\">TESSERA hackathon</a> and also to <a href=\"/notes/india-ai-summit\">attend the AI Impact Summit</a>.  I redeyed back to host <a href=\"https://cs.brown.edu/~sk/\">Shriram Krishnamurthi</a> in Cambridge as he does his UK tour.</p>\n<h2 id=\"tessera\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera\"></a>TESSERA</h2>\n<p>The best news of the week was that the <a href=\"/papers/2025-tessera\">TESSERA paper</a> got\naccepted into <a href=\"https://cvpr.thecvf.com/\">CVPR 2026</a>, out of a whopping 16000+\n(!) submissions. This has been a giant amount of work for the whole team, but\nparticular props to lead author and PhD student <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> who has lead the whole\neffort with perseverance and a big smile the whole time!</p>\n<p>It looks like CVPR is in Denver right before PLDI in Boulder (where I have an\nOxCaml tutorial to help hold) so I guess a chunk of my June will be spent in\nColorado this year.</p>\n<p>I also spent some time porting <a href=\"https://www.tunbury.org/\">Mark Elvers</a> OCaml Zarr implementation over to\n<a href=\"https://github.com/avsm/oxmono\">OxCaml</a>, and also started <a href=\"https://github.com/ucam-eo/geotessera/pull/194\">adding Zarr zone support\nto geotessera</a> so we can start\nconverting the registry over.</p>\n<h2 id=\"literature-downloader\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#literature-downloader\"></a>Literature downloader</h2>\n<p><a href=\"https://www.lambdacambridge.com/robin-message\">Robin Message</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> have restarted the <a href=\"/projects/ce\">literature downloader</a>, and Robin has\nbeen manually classifying DOI prefixes into a two-level tree so we can easily\ndispatch download logic on a per-publisher basis (we have individual <a 
href=\"/papers/2025-evidence-tap\">agreements</a>\nvia the University library with many publishers).</p>\n<p>I'm surprised that DOIs are not a two-level tree to start with, as now with no\ncentral source of detailed DOI prefix metadata if a journal is sold to another\npublisher (as just happened with <a href=\"https://jfp.mpi-sws.org/\">JFP</a>), you either\nhave to forward a portion of your DOI space or continue to resolve old journal\narticle DOIs forever.</p>\n<p>I also started migrating a lot of datasets over to our new Ceph cluster,\nincluding full syncs of GBIF, OpenAlex, and Crossref. This should set us up\nnicely for <a href=\"/ideas/living-iucn-redlist\">Shane's dashboard</a> using locally hosted\ndatabase for fast queries.  On the queue once the storage settles is also\n<a href=\"https://github.com/inaturalist/inaturalist-open-data\">iNaturalist open data</a>, and\nto mirror the TESSERA embeddings to our Ceph so that local Cambridge users\nsuch as <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> can access them more easily to do global analyses directly without a full\nlocal copy.</p>\n<h3 id=\"figuring-out-what-a-uri-really-is\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#figuring-out-what-a-uri-really-is\"></a>Figuring out what a URI really is</h3>\n<p>I also had a really fun discussion with <a href=\"https://jonmsterling.com\">Jon Sterling</a> over High Table dinner\nat Pembroke about whether it was a good idea for me to get into Lean to start\nto specify the semantics of URI resolution.</p>\n<p>Jon published a design for <a href=\"https://www.forester-notes.org/JVIT\">canonical URLs in Forester</a> last year,\nand as I'm getting slightly obsessed with managing Atom, RSS and JSONFeeds at the moment (the <a href=\"/network\">/network</a>\nview above is powered by this) this seems relevant to both that and also the literature downloader.  
In return for Jon's help, I will happily\n<a href=\"https://amok.recoil.org/@jonmsterling@mathstodon.xyz/116105419508228113\">code up an OCaml monorepo script</a> for him!</p>\n<h2 id=\"shrirams-pl-opinions\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#shrirams-pl-opinions\"></a>Shriram's PL opinions</h2>\n<p><img src=\"/images/week26n8-3.webp\" alt=\"%rc\" title=\"Showing Shriram a thing or two about Cambridge\" >\nShriram passed through Cambridge on Friday on his UK lecture tour, so I leapt at the chance to host him after leaping off the redeye from India. I last <a href=\"/notes/icfp25-what-i-learnt\">chatted to Shriram at ICFP</a> over the summer, and this time we got to hear him speak about <a href=\"https://www.youtube.com/watch?v=wBRtEQ02-HI&amp;list=PL1a1q1zrmyEwpA2PvYcM1UqE18zekujW-&amp;index=1\">The Human Factors of Formal Methods</a> in the <a href=\"https://talks.cam.ac.uk/talk/index/244831\">Logic &amp; Semantics seminar</a> here in the CL.</p>\n<p>The talk was fantastic and I can't recommend watching it enough; I have so many\npapers to follow up on now:</p>\n<ul>\n<li><a href=\"https://doi.org/10.1037/h0048826\">Perceptual learning: Differentiation or enrichment?</a> (1955)</li>\n<li><a href=\"https://doi.org/10.1037/0278-7393.13.4.640\">Sexing day-old chicks: A case study and expert systems analysis of a difficult perceptual-learning task</a> (1987). 
The twist is that it involves WWII era tanks as well.</li>\n<li><a href=\"https://doi.org/10.1037/a0025140\">Practicing versus inventing with contrasting cases: The effects of telling first on learning and transfer</a> (2011)</li>\n</ul>\n<h3 id=\"can-llms-learn-the-stroop-effect\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#can-llms-learn-the-stroop-effect\"></a>Can LLMs learn the Stroop effect?</h3>\n<p>Shriram used the Stroop effect in his talk, which naturally led to <a href=\"https://www.cl.cam.ac.uk/~nk480/\">Neel Krishnaswami</a> and me wondering if <a href=\"https://www.crumplab.com/blog/771_GPT_Stroop/\">LLMs could learn the Stroop effect too</a>! I found <a href=\"https://doi.org/10.1136/bmj-2024-081948\">one paper</a> on this topic:</p>\n<blockquote>\n<p>Moreover, as in humans, age is a key determinant of cognitive decline: “older” chatbots, like older patients, tend to perform worse on the MoCA test. These findings challenge the assumption that artificial intelligence will soon replace human doctors, as the cognitive impairment evident in leading chatbots may affect their reliability in medical diagnostics and undermine patients’ confidence.\n<cite>-- <a href=\"https://www.bmj.com/content/387/bmj-2024-081948\">Age against the machine</a>, 2024</cite></p>\n</blockquote>\n<p>I find the analogy between human age and 'model age' a bit incongruous, since of course models don't age -- there are improved training regimes. 
So the basic takeaway is that the human-like cognitive impairment seen in chatbots is decreasing as frontier LLMs advance.</p>\n<h3 id=\"adversarial-experiments-to-teach-oxcaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#adversarial-experiments-to-teach-oxcaml\"></a>Adversarial experiments to teach OxCaml?</h3>\n<p>When chatting about how to teach our agents OxCaml better, Shriram pointed me to his 2017 paper on <a href=\"https://cs.brown.edu/~sk/Publications/Papers/Published/pkf-teach-pl-exp-adv-think/\">Teaching Programming Languages by Experimental and Adversarial Thinking</a>:</p>\n<blockquote>\n<p>Its essence is to view programming language learning as a natural science\nactivity, where students probe languages experimentally to understand both\nthe normal and extreme behaviors of their features. [...] The approach is\nmodular (with minimal dependencies), incremental (it can be introduced slowly\ninto existing classes), interoperable (it does not need to push out other,\nexisting methods), and complementary (since it introduces a new mode of\nthinking).\n<cite>-- <a href=\"https://cs.brown.edu/~sk/Publications/Papers/Published/pkf-teach-pl-exp-adv-think/\">J. Pombrio et al 2017</a></cite></p>\n</blockquote>\n<p>There are obvious parallels here to how the OCaml to OxCaml translation process\nworks, whereby we typically add in mode annotations once the OCaml version is\nworking. 
The only practical twist is that shifting to OxCaml also requires\nporting code to Base/Core as well, since the stdlib doesn't have mode\nannotations.</p>\n<h2 id=\"fun-reading\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#fun-reading\"></a>Fun Reading</h2>\n<p><img src=\"/images/week26n8-4.webp\" alt=\"%rc\" title=\"Ran into Neil Lawrence randomly at the AI Summit!\" ></p>\n<ul>\n<li>I discovered that <a href=\"https://github.com/saurabhs92/logic-and-functional-programming-iit-delhi?tab=readme-ov-file\">Saurabh Sharma taught OCaml</a> as the first year course in IIT-Delhi for quite some time!</li>\n<li>I enjoyed the <a href=\"https://music.youtube.com/podcast/iIJjl6I4GNU\">Full Disclosure episode with Rutger Bregman</a> as a followup to <a href=\"/notes/hny2026\">reading his book</a>.</li>\n<li>Nice episode of MCJ covering <a href=\"https://music.youtube.com/podcast/iIJjl6I4GNU\">Turning Wasted Renewable Power into AI Compute with Rune</a>. Lots of geeking about the physics of using all that power. 
<a href=\"https://dave.recoil.org\">Dave Scott</a> also pointed out to me that the reason we can't just build AI datacenters up north in Scotland, where the renewable power is cheap and plentiful, is that there's a <a href=\"https://www.theregister.com/2025/11/18/uk_ai_growth_zones/\">requirement for a constant national electricity price</a>.</li>\n<li>Welcome <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> back to the blogosphere with a banging post about <a href=\"https://gazagnaire.org/blog/2026-02-19-nasa-fprime.html\">porting NASA's reusable flight software framework to OCaml</a>.</li>\n<li><a href=\"https://news.mongabay.com/2026/02/scientists-cant-agree-on-where-the-worlds-forests-are/\">Scientists can't agree on where the world's forests are</a>: would be fun to cross-check the datasets mentioned here against TESSERA.</li>\n</ul>\n<p><img src=\"/images/week26n8-1.webp\" alt=\"%rc\" title=\"Most of my week was spent in Delhi traffic, here with Amanda Brock!\" >\nExtremely random feature: I added <a href=\"https://amok.recoil.org/@avsm/116081078944922670\">finger support</a> to my website, so you can just do <code>finger @anil.recoil.org</code> (it is installed by default on macOS) to see my latest weekly.</p>\n<h2 id=\"next-week\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#next-week\"></a>Next Week</h2>\n<p>I need to get TESSERA Zarr in shape. This will fix so many infrastructure\nissues with using the embeddings!  
I'm also going to vibe code up a cool\nwebsite for the project, using the feed aggregation logic from my own website\nand these <a href=\"https://github.com/CloudAI-X/threejs-skills\">Threejs Claude skills</a>\nI just stumbled across.</p>\n<p>I'm also off to <a href=\"https://ifip-wg28.github.io/\">WG2.8</a> the week after, so I need\nto figure out what functional programming goodness I will present there!</p>\n<p><img src=\"/images/week26n8-2.webp\" alt=\"%c\" title=\"Thanks Jon Sterling for letting me look around Clare College and see the restored buildings; the scaffolding just came down!\" ></p><h1>References</h1><ul><li>Madhavapeddy (2026). At the AI Impact Summit in Delhi: people, planet, progress. <a href=\"https://doi.org/10.59350/6vc5q-mbk23\" target=\"_blank\"><i>10.59350/6vc5q-mbk23</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Madhavapeddy (2026). Happy new year and my fave readings of the year. <a href=\"https://doi.org/10.59350/y9f0e-raa45\" target=\"_blank\"><i>10.59350/y9f0e-raa45</i></a></li>\n<li>Madhavapeddy (2026). 1st TESSERA/CoRE hackathon at the Indian AI Summit. <a href=\"https://doi.org/10.59350/1na80-7ak85\" target=\"_blank\"><i>10.59350/1na80-7ak85</i></a></li>\n<li>Gibson et al (1955). Perceptual learning: Differentiation or enrichment?. 
<a href=\"https://doi.org/10.1037/h0048826\" target=\"_blank\"><i>10.1037/h0048826</i></a></li>\n<li>Biederman et al (1987). Sexing day-old chicks: A case study and expert systems analysis of a difficult perceptual-learning task.. <a href=\"https://doi.org/10.1037/0278-7393.13.4.640\" target=\"_blank\"><i>10.1037/0278-7393.13.4.640</i></a></li>\n<li>Schwartz et al (2011). Practicing versus inventing with contrasting cases: The effects of telling first on learning and transfer.. <a href=\"https://doi.org/10.1037/a0025140\" target=\"_blank\"><i>10.1037/a0025140</i></a></li>\n<li>Dayan et al (2024). Age against the machine—susceptibility of large language models to cognitive impairment: cross sectional analysis. British Medical Journal Publishing Group. <a href=\"https://doi.org/10.1136/bmj-2024-081948\" target=\"_blank\"><i>10.1136/bmj-2024-081948</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w8",
      "title": ".plan-26-08: At AI summit, Shriram's PL opinions, Zarr hacking",
      "summary": "TESSERA paper accepted at CVPR 2026, went to the AI Impact Summit, OCaml Zarr hacking, Shriram's talk on human factors of formal methods, and discussions on teaching OxCaml to agents.",
      "date_published": "2026-02-22T00:00:00.000000Z",
      "date_modified": "2026-02-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "ai",
        "policy",
        "india",
        "zarr",
        "teaching"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/6vc5q-mbk23",
          "doi": "10.59350/6vc5q-mbk23",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/y9f0e-raa45",
          "doi": "10.59350/y9f0e-raa45",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/1na80-7ak85",
          "doi": "10.59350/1na80-7ak85",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1037/h0048826",
          "doi": "10.1037/h0048826",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1037/0278-7393.13.4.640",
          "doi": "10.1037/0278-7393.13.4.640",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1037/a0025140",
          "doi": "10.1037/a0025140",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1136/bmj-2024-081948",
          "doi": "10.1136/bmj-2024-081948",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/6vc5q-mbk23",
      "content_html": "<p>Very little sleep this week as I hopped over to India's <a href=\"https://impact.indiaai.gov.in/\">AI Impact Summit</a> for a slew of events to followup my <a href=\"/notes/path-to-uk-india-ai-summit\">earlier meetings at OpenUK and the Turing</a>.  The Indian government knocked it out of the park with the first summit held in the majority world: there were over 200,000 people registered and the keynote for <a href=\"https://sarvam.ai\">Sarvam AI</a>'s launch had more people attending than the entire French AI summit last year! The venue was the enormous <a href=\"https://en.wikipedia.org/wiki/Bharat_Mandapam\">Bharat Mandapam</a>, which was opened just a couple of years ago in the G20.</p>\n<p>I found the summit a fantastic networking event, although unlikely to result in any significant policy shifts aside from establishing India as a serious target region for growth. I was lucky enough to meet Yann LeCun and get a bunch of technical insights into <a href=\"/projects/tessera\">TESSERA</a> from him, which was my personal highlight!</p>\n<p><img src=\"/images/aisummit-gen-2.webp\" alt=\"%c\" title=\"The glorious view from the Bharat Mandapam roof\" ></p>\n<h2 id=\"the-main-ai-summit-expo\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-main-ai-summit-expo\"></a>The main AI Summit expo</h2>\n<p><img src=\"/images/aisummit-gen-3.webp\" alt=\"%rc\" title=\"The map of the global arena, just one of the halls\" >\n<img src=\"/images/aisummit-gen-5.webp\" alt=\"%rc\" title=\"The overall map of the whole summit with all the halls\" >\nThrough a series of misadventures not suitable for the public web, I ended up\nin the rooftop VIP section of the summit and got to take in the breathtaking\nviews of the whole crowd. 
The size of the conference centre is incredible, with a\nset of conference halls all connected by a lovely garden through to the gates\ninto Delhi proper.\nThe main expo I visited was the &quot;world hall&quot;, where every country had an\nexhibition (some projects were more <a href=\"https://www.bbc.co.uk/news/articles/c0q3g0ln274o\">credible</a> than\n<a href=\"https://www.bbc.co.uk/news/articles/cge8nd5ve00o\">others</a>).</p>\n<p><div class=\"video-vertical\"><iframe title=\"Wandering around the Indian Impact AI Summit\" src=\"https://crank.recoil.org/videos/embed/cba9ce1c-ff77-469c-bc8a-daa0308606c7\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" style=\"aspect-ratio: 9/16; width: 100%; height: 100%; max-width: 325px;\"></iframe></div>\nThere were corridor conversations everywhere; an Nvidia\ncontingent spotted my Docker hoodie and started talking about containers and\nGPUs; and some students saw my Jane Street t-shirt and thought I worked there\n(I did try not to disappoint them by explaining 'why OCaml' though). I pulled\nout my laptop a few times while lingering and started demonstrating TESSERA\n(which looks cool due to the maps and false colours), and I had small crowds\nwatching along. I wish Cambridge had a proper presence at the 'GREAT Britain'\nstall, but I only saw Dundee and (I think) Coventry represented there.</p>\n<p>I didn't see any really mindblowing demos in the stalls, but perhaps that's\nbecause my baselines have shifted in the last year. What was remarkable\nwas the breadth of solutions on offer: pretty much every aspect of Indian\nsociety seemed to be covered, from urban living to rural food security.</p>\n<p>There was little '<a href=\"https://en.wikipedia.org/wiki/Eating_your_own_dog_food\">eat our own dogfood</a>' on display,\nas the security lines <a href=\"https://www.bbc.co.uk/news/articles/ceqvjgrvpn3o\">were long</a> and manual. 
The payment\nsystem was actually really cool: <a href=\"https://en.wikipedia.org/wiki/Unified_Payments_Interface\">UPI</a> allows\na vendor to display a QRCode to receive payment with 0% overhead. The user scans the QRCode on their\nmobile, pays online, and then shows the proof to the vendor. The vendor doesn't need to have any electronic\nequipment at all.  This had nothing to do with AI, but reminded me strongly of our work on <a href=\"/projects/ubiqinteraction\">spotcodes</a> over two decades ago!</p>\n<p><img src=\"/images/aisummit-gen-1.webp\" alt=\"%rc\" title=\"Most talks at the summit were beyond standing room only\" >\n<img src=\"/images/ai-hackathon-2.webp\" alt=\"%rc\" >\nWith 200,000 people pouring through the expo, it was basically impossible to have any coordinated meetings as even getting from one end to the other took ages. However, this meant that a lot of attendees vibe coded up alternative (and more usable) interfaces, and were showing off QRCodes which I scanned in order to access their versions. One enterprising chap had a UPI payment link to an MCP endpoint that I could point my <a href=\"https://openclaw.ai/\">Claw</a> at to help figure out where to go next; I didn't buy this, but I appreciated the hustle!</p>\n<p><img src=\"/images/aisummit-gen-4.webp\" alt=\"%c\" title=\"The three pillars of the AI Impact Summit\" ></p>\n<h2 id=\"meeting-yann-lecun-and-discussing-satellite-commons\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#meeting-yann-lecun-and-discussing-satellite-commons\"></a>Meeting Yann LeCun and discussing satellite commons</h2>\n<p><img src=\"/images/aisummit-aia-1.webp\" alt=\"%rc\" title=\"Yann LeCun turns out to be a selfie king\" >\nOne of the evening events was hosted by the good folk at the <a href=\"https://thealliance.ai/\">AI Alliance</a>, a not-for-profit aiming to boost open models.\nThe guest of honour was Yann LeCun, who turns out to be the nicest person! 
He gave a layman's summary to the audience about why LLMs are an evolutionary dead end, and then started talking about how self-supervised methods such as <a href=\"https://arxiv.org/abs/2103.03230\">Barlow Twins</a> and <a href=\"https://arxiv.org/abs/2301.08243\">JEPA</a> are the future. Barlow Twins are, of course, the exact training mechanism we've been using to train <a href=\"/papers/2025-tessera\">TESSERA</a>, and so I bounded up to him to discuss it in more detail!</p>\n<p>Yann had made an argument on stage that the only fair way to train language models that are open is for every country to contribute a large corpus of country-specific data (not necessarily open), and to combine these at the global level via one level of federated learning into a 'united language model'.</p>\n<p>While this may work ok for LLMs, I thought it was <em>even more perfect</em> for satellite data! We already have a base set of public data, in the form of Landsat and Sentinel 1/2/3. Meanwhile, many countries have their own geostationary task satellites hovering over them, often with much higher resolutions and interesting instruments. I even heard rumours that India has commissioned a <a href=\"https://surveyofindia.gov.in/\">Lidar survey of the entire country</a>, but this may be a future project as all I could find concretely is one of <a href=\"https://www.linkedin.com/posts/surveyofindia_surveyofindia-lidar-digitalelevationmodel-activity-7282654478921601024-O62z/\">major river systems</a>.</p>\n<p>So while open frontier LLMs may be a lost cause in the short term, it strikes me as a real opportunity that TESSERA may be the perfect way to trial Yann's idea of a global training cooperative.  
The incentives are all there, and geospatial foundation modeling seems to be maturing rapidly.</p>\n<p><img src=\"/images/aisummit-aia-2.webp\" alt=\"%c\" title=\"With the leadership team of the AI Alliance\" ></p>\n<h2 id=\"holding-a-hackathon-and-visiting-iit-delhi\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#holding-a-hackathon-and-visiting-iit-delhi\"></a>Holding a hackathon and visiting IIT-Delhi</h2>\n<p><img src=\"/images/ai-hackathon-1.webp\" alt=\"%rc\" >\nI skipped the 'government day' of the summit (with all the glitzy tech CEO talks) in order to satisfy demand for <a href=\"/projects/tessera\">TESSERA</a> talks.  I <a href=\"/notes/first-tessera-hackathon\">advertised a hackathon</a> with OpenUK on the day before the summit, and we had a full house of signups by the end of the day! I posted a <a href=\"/notes/first-tessera-hackathon\">trip report</a> for this separately.</p>\n<p>There was so much discussion at the hackathon that I trotted along to the IIT-Delhi campus the morning after to give a detailed talk on the bigger picture of TESSERA (similar to the talk <a href=\"/notes/2026w6\">at ARIA</a> last week). The audience was highly engaged and I went well over time answering questions. A number of students were interested in followup opportunities to work in this area, and I pointed them to <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and his <a href=\"https://fplaunchpad.org/\">FP Launchpad</a> which is taking off in April. 
We've got <a href=\"https://www.tunbury.org/2026/02/15/ocaml-tessera/\">TESSERA and OCaml</a> playing well together now, so there's a really fun opportunity to combine functional programming with planetary computing!</p>\n<h2 id=\"partyinghhnetworking-at-the-british-high-commission\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#partyinghhnetworking-at-the-british-high-commission\"></a>Partying^H^Hnetworking at the British High Commission</h2>\n<p><div class=\"video-vertical\"><iframe title=\"AI Impact Summit at the British High Commission party\" src=\"https://crank.recoil.org/videos/embed/ab39a536-a89a-4756-b04d-9fd6d9cc9bd5\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\" style=\"aspect-ratio: 9/16; width: 100%; height: 100%; max-width: 325px;\"></iframe></div>\nI got a kind invitation from the British High Commissioner <a href=\"https://www.gov.uk/government/news/change-of-british-high-commissioner-to-india-lindy-cameron\">Lindy Cameron</a> to attend a party at her house, which turns out to have the largest private garden in New Delhi.  The good and the great of the Delhi political scene were there, along with a number of visitors to the summit. The main draw of the event was a conversation between Rishi Sunak and David Lammy, but first Kanishka Narayan (the minister for AI and Digital Safety) and Amanda Brock from OpenUK announced the launch of an 'open source and AI' video.</p>\n<p>Both speeches were charming, and it was good to see the emphasis on openness. Kanishka Narayan made a wry observation that Britain might not lead on raw engineering resource, but it does have 'the best technical taste', which I thought was quite an apt claim!</p>\n<p>Most of the conversations I had here were about landuse and datacentre growth. There seems to be massive investment within India for datacentre capacity, so questions of water usage and landuse are obvious barriers. 
I'm looking forward to working with the <a href=\"https://core-stack.org\">CoRE Stack</a> team to help map out some of these challenges throughout India.</p>\n<p>There was also a lot of interest in datacentres in space, so I took the opportunity to explain what we're doing with our startup <a href=\"https://parsimoni.co/\">Parsimoni</a> led by <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>. The idea of having a multi-tenant 'Docker in space' was received well by everyone I mentioned it to, and Thomas has been finding <a href=\"https://gazagnaire.org/blog/2026-02-19-nasa-fprime.html\">a lot of similarity to our earlier unikernel work</a> as he builds SpaceOS out in California.</p>\n<h2 id=\"summit-outcomes\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#summit-outcomes\"></a>Summit Outcomes</h2>\n<p>The <a href=\"https://www.mea.gov.in/bilateral-documents.htm?dtl%2F40809\">summit declaration</a> that came out today is remarkable in actually being signed by the US, UK and China <a href=\"https://www.bbc.co.uk/news/articles/c8edn0n58gwo\">unlike last time</a>.\nSome snippets of interest to me from the statement are:</p>\n<blockquote>\n<p>We take note of the voluntary and collaborative International Network of AI for Science Institutions as a platform to connect scientific communities and pool AI research capabilities across regions among participating institutions, in order to accelerate the impactful adoption of AI.\n[...]</p>\n<p>While encouraging international collaboration on meaningful skilling and reskilling AI initiatives, we take note of the voluntary guiding principles for reskilling in the age of AI and the playbook on AI workforce development, which would support participants in preparation for a future AI driven economy.\n[...]</p>\n<p>We take note of the Global AI Impact Commons as a voluntary initiative that provides a practical platform to encourage and enable the adoption, replication, and scale-up of successful AI use cases across 
regions.\n<cite>-- <a href=\"https://www.mea.gov.in/bilateral-documents.htm?dtl%2F40809\">AI Impact Summit Declaration</a>, Feb 2026</cite></p>\n</blockquote>\n<p>The word &quot;safety&quot; is notably missing; it's all about rapid progress and equitable access now. This year will not be about whether or not AI adoption should happen; it's now a race to defend ourselves against <a href=\"/papers/2025-ai-poison\">AI poisoning</a>, and a question of whether we take the <a href=\"/notes/red-pill-conservation\">red pill or the blue pill</a> and embrace an open future. It is for this reason I'm very grateful to <a href=\"https://openuk.uk\">OpenUK</a> for all their work on helping make sure the UK takes the red open-source pill. It'll be a harder road, but a worthwhile one.</p>\n<p>An excellent recap of the outcomes of the summit can be found in this hour-long segment on Indian TV with none other than my colleague <a href=\"https://inverseprobability.com/\">Neil Lawrence</a>!</p>\n<p><div class=\"video-center\"><iframe title=\"Neil Lawrence on Indian TV at the AI Impact Expo\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/46a77cfa-0db8-4010-abf4-357c14801897\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>I also contributed to the OpenUK AI Openness Summit Report, which is <a href=\"https://openuk.uk/wp-content/uploads/2026/02/AI-Impact-Summit-2026-AI-Openness-Report.pdf\">available here</a> and very comprehensive. I'm not entirely sure what happened to the ATI report I contributed to in November; I suspect it's been filed away Indiana Jones style in some vast document repository underground...</p>\n<p>Thanks for the hospitality New Delhi! It was an exhilarating whirlwind to be at the summit. 
Well done <a href=\"https://indiaai.gov.in/people/abhishek-singh\">Abhishek Singh</a> and other organisers.</p>\n<p><small class=\"credit\"><strong>(Updated 23rd Feb 2026 with a link to the OpenUK report. March 6th 2026 with a typo from Sam Reynolds.)</strong></small></p><h1>References</h1><ul><li>Madhavapeddy (2026). Discussing effective conservation with all the UK Chief Scientists. <a href=\"https://doi.org/10.59350/qjrmv-38130\" target=\"_blank\"><i>10.59350/qjrmv-38130</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). On the path to the UK/India AI Summit with OpenUK and the ATI. <a href=\"https://doi.org/10.59350/x6rea-1g262\" target=\"_blank\"><i>10.59350/x6rea-1g262</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2026). 1st TESSERA/CoRE hackathon at the Indian AI Summit. <a href=\"https://doi.org/10.59350/1na80-7ak85\" target=\"_blank\"><i>10.59350/1na80-7ak85</i></a></li>\n<li>Zbontar et al (2021). Barlow Twins: Self-Supervised Learning via Redundancy Reduction. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2103.03230\" target=\"_blank\"><i>10.48550/arXiv.2103.03230</i></a></li>\n<li>Assran et al (2023). Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2301.08243\" target=\"_blank\"><i>10.48550/arXiv.2301.08243</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/india-ai-summit",
      "title": "At the AI Impact Summit in Delhi: people, planet, progress",
      "summary": "Trip report from the Indian AI Impact Summit in New Delhi, covering the massive expo, a conversation with Yann LeCun, a hackathon/talk at IIT-Delhi, networking at the British High Commission, and reflections on the summit declaration's shift from safety to progress and equitable access.",
      "date_published": "2026-02-21T00:00:00.000000Z",
      "date_modified": "2026-02-21T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "ai",
        "policy",
        "india",
        "aisummit"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qjrmv-38130",
          "doi": "10.59350/qjrmv-38130",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/x6rea-1g262",
          "doi": "10.59350/x6rea-1g262",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/1na80-7ak85",
          "doi": "10.59350/1na80-7ak85",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2103.03230",
          "doi": "10.48550/arXiv.2103.03230",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2301.08243",
          "doi": "10.48550/arXiv.2301.08243",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/1na80-7ak85",
      "content_html": "<p>We held the first <a href=\"/projects/tessera\">TESSERA</a> hackathon over the <a href=\"/notes/india-ai-summit\">Indian AI Impact summit</a> in Delhi today, thanks to sponsorship\nfrom <a href=\"https://openuk.uk\">OpenUK</a>.  Despite only announcing it the weekend\nbefore, we had a full house of active participants (some of whom came up all\nthe way from Bangalore for the day to attend!).</p>\n<p>This is the first TESSERA hackathon I've taken part in outside the UK (where\nwe've had several really fun ones with the <a href=\"https://iucn.org\">IUCN</a> and\n<a href=\"https://unep-wcmc.org\">UNEP-WCMC</a>).  It was especially packed because of the\nmassive buzz in Delhi with the summit going on; every hotel in the city was\nbooked out and traffic was gridlocked as 200,000 people descended.  We held the\nhackathon in the lovely <a href=\"https://iicdelhi.in/\">India International Centre</a>\ncampus in the centre of Delhi which was a peaceful locale amidst the chaos\noutside.</p>\n<p><img src=\"/images/tessera-hackathon-3.webp\" alt=\"%c\" title=\"The hackathon was fueled by marmalade biscuits supplied by Amanda Brock from OpenUK; the best kind of sponsorship!\" ></p>\n<h2 id=\"learning-about-core-stack\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#learning-about-core-stack\"></a>Learning about CoRE Stack</h2>\n<p><img src=\"/images/tessera-hackathon-1.webp\" alt=\"%rc\" title=\"The CoRE stack in action\" >\nI organised the hackathon with <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> from IIT-Delhi, who leads the <a href=\"https://core-stack.org\">CoRE Stack</a>. 
We first learnt about how the CoRE stack works; my notes follow:</p>\n<ul>\n<li>The <a href=\"https://github.com/core-stack-org/\">GitHub core-stack-org</a> has the backend service, the <a href=\"https://github.com/core-stack-org/landscape-explorer\">landscape explorer</a> web app, and various <a href=\"https://github.com/core-stack-org/cc-android-offline\">mobile apps</a>.</li>\n<li>They use <a href=\"https://earthengine.google.com/faq/\">GEE</a> to perform the geospatial analyses and then export the rasters into their own pipeline, with a <a href=\"https://geoserver.org/\">geoserver</a> instance to serve the site. Currently runs only for ROIs for projects, but would like to scale pan-India but depends on compute availability.\nThere's a Django, Celery, GEE, Geoserver, Airflow cloud flow, but we discussed <a href=\"https://yirgacheffe.org/latest/\">Yirgacheffe</a> and a local machine <a href=\"https://digitalflapjack.com/blog/yirgacheffe/\">like we do for LIFE</a> to simplify deployment. Concerns include having to manage on-prem resources (but need to balance this vs cost of cloud resources).</li>\n<li>They have developed a <a href=\"https://www.spatialnode.net/articles/building-reproducible-geospatial-pipelines-a-stac-extension-with-dags25119d\">dataflow extension to STAC</a> that extends STAC with lineage tracking, algorithm versioning, and incremental recomputation. This is of great <a href=\"https://www.tunbury.org/2025/11/30/tessera-zarr/\">interest to us</a> as we deploy STAC for TESSERA too!  
See the <a href=\"https://watch.eeg.cl.cam.ac.uk/w/fEoa7jde33i35w1Xz816ft\">PROPL talk</a>, <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763803\">paper</a> and <a href=\"/notes/icfp25-propl\">my notes</a> on STACD.</li>\n<li>The <a href=\"https://www.explorer.core-stack.org/\">CoRE Stack Explorer</a> does more than just mapping; it also generates structured reports about items of concern such as hydrological flows, soil health and other indicators being tracked that are of relevance to the local people in the region. It's open access and browsable, and I enjoyed looking through projects in Andhra Pradesh!</li>\n</ul>\n<h2 id=\"showing-tessera\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#showing-tessera\"></a>Showing TESSERA</h2>\n<p><img src=\"/images/tessera-hackathon-4.webp\" alt=\"%rc\" title=\"I must admit the other event going on here was also of interest!\" >\nTo get people going with TESSERA, I first explained the basic ideas behind the model, along the same lines as <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a> did in our <a href=\"/notes/foundational-ecosystem-workshop\">Ecosystem Resilience Workshop</a>. The two main ways to get started right now are:</p>\n<ul>\n<li>Clone the <a href=\"https://github.com/ucam-eo/tessera-interactive-map\">tessera-interactive-map</a> repo and set up a pip or uv environment from the requirements.txt, and then open up <code>app.ipynb</code> in VSCode or Jupyter (as you prefer, but I tend to use VSCode as it works better with uv). Then pick an area of interest and try out the labeling workflow directly on your laptop.</li>\n<li>Clone Keshav's <a href=\"https://github.com/sk818/TEE\">TESSERA Embeddings Explorer</a> which is a much more feature-rich app but requires a bit more setup. 
I demonstrated this using the <a href=\"https://github.com/sk818/TEE/blob/main/docker-compose.yml\">docker-compose</a> file and several people with <a href=\"/papers/2025-docker-icfp\">Docker for Windows</a> got it up and running with a <code>docker compose up</code>.</li>\n</ul>\n<p>While this all worked, the wifi at the venue was pretty slow, so downloading embeddings all the way from Cambridge took a long time. I think there are a few avenues we need to explore quickly:</p>\n<ul>\n<li>I want to prioritise switching the embeddings to Zarr, and there's a <a href=\"https://eeg.zulipchat.com/#narrow/channel/527258-Tessera/topic/zarr.20file.20format/with/571006960\">comprehensive thread</a> on the EEG Zulip about how we can go about this. This is on next week's hacking queue, using the <a href=\"https://github.com/mtelvers/ocaml-zarr\">ocaml-zarr</a> bindings that <a href=\"https://www.tunbury.org/\">Mark Elvers</a> has put together!</li>\n<li>We discussed mirroring embeddings for India to IIT Delhi's servers, which <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> is going to investigate. I think this will also be easier once we have Zarr, since then a static webserver is all that's needed and even JavaScript clients could fetch the data they need.</li>\n<li>A mobile phone app for TESSERA would be extremely cool, as Sadiq has <a href=\"https://toao.com/blog/can-we-really-see-brambles-from-space\">observed before</a>.</li>\n</ul>\n<p>After this, we had a brief period to try out some labeling, but the conversations about exactly what we hack on together will continue on our EEG Zulip channel! There was a lot of discussion about how to store labels, as many of the groups there had ground truth information that they weren't quite sure how to share. 
I think it would be very valuable to have a general geospatial labeling service that could then export to various specific services like OpenStreetMap or for downstream training.</p>\n<p>Thanks OpenUK and the Impact Summit for facilitating this, and for IIT-Delhi and the other participants for the fascinating discussions and hacking!</p>\n<p><img src=\"/images/tessera-hackathon-2.webp\" alt=\"%c\" title=\"The hackers hacking!\" >\n<img src=\"/images/tessera-hackathon-5.webp\" alt=\"%c\" title=\"The lovely IIC venue\" ></p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Madhavapeddy (2026). At the AI Impact Summit in Delhi: people, planet, progress. <a href=\"https://doi.org/10.59350/6vc5q-mbk23\" target=\"_blank\"><i>10.59350/6vc5q-mbk23</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Madhavapeddy (2025). Foundational AI for Ecosystem Resilience workshop. <a href=\"https://doi.org/10.59350/26hy6-rry61\" target=\"_blank\"><i>10.59350/26hy6-rry61</i></a></li>\n<li>Laud et al (2025). STACD: STAC Extension with DAGs for Geospatial Data and Algorithm Management. <a href=\"https://doi.org/10.1145/3759536.3763803\" target=\"_blank\"><i>10.1145/3759536.3763803</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/first-tessera-hackathon",
      "title": "1st TESSERA/CoRE hackathon at the Indian AI Summit",
      "summary": "First TESSERA hackathon held at the Indian AI Impact Summit in Delhi, exploring integration with IIT-Delhi's CoRE Stack for geospatial analysis and testing TESSERA labeling workflows.",
      "date_published": "2026-02-19T00:00:00.000000Z",
      "date_modified": "2026-02-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "hackathon",
        "ai",
        "aisummit",
        "corestack"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/6vc5q-mbk23",
          "doi": "10.59350/6vc5q-mbk23",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/26hy6-rry61",
          "doi": "10.59350/26hy6-rry61",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763803",
          "doi": "10.1145/3759536.3763803",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2026w7",
      "content_html": "<p>The week was dominated by having to sort out yet another 300TB of SSDs to grow our Ceph cluster, as the <a href=\"/projects/tessera\">TESSERA</a> embeddings being generated on Vultr are even more optimised and being spit out at a rate of 4TB per day. But I did get around to playing with Lego as well!</p>\n<h2 id=\"tessera\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera\"></a>TESSERA</h2>\n<p>I released a minor release of <a href=\"https://github.com/ucam-eo/geotessera/releases/tag/v0.7.5\">geotessera 0.7.x</a> to speed up the startup time of the library, now that we have so many embeddings generated (<a href=\"https://doi.org/10.5281/zenodo.18649425\">doi</a>). <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> has been knocking up a beautiful <a href=\"https://github.com/sk818/TEE\">TESSERA Embeddings Explorer</a> which is now Dockerised and so is running on my laptop, perfect for demos on the fly! He noticed the TEE startup time was slow and reported it, and I undid a hack I put in to support Windows better by picking a simpler way to manage coordinate transforms.</p>\n<p>Also saw an excellent post about what makes <a href=\"https://terrabytes.substack.com/p/tessera-a-blueprint-for-earth-observation\">TESSERA a blueprint for other EO models</a> from <a href=\"https://substack.com/@rakshithsathish\">Rakshith Santish</a>. I'm glad he enjoyed the ablations!</p>\n<h2 id=\"storage-and-oxcaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#storage-and-oxcaml\"></a>Storage and OxCaml</h2>\n<p><a href=\"https://www.tunbury.org/\">Mark Elvers</a> and I are getting more comfortable with <a href=\"https://www.tunbury.org/2026/01/06/ceph-notes/\">using Ceph</a> to manage all the TESSERA embeddings, and Malcolm Scott helped us to wire up a bunch of old machines donated by Jane Street that we can use to distribute them.  
We now have multiple large storage blobs that we're consolidating, ranging from TrueNAS mirrored storage (from the <a href=\"/projects/4c\">4C</a> days) over to a <a href=\"https://www.tunbury.org/2025/12/09/ceph-placement-groups/\">backup Ceph cluster</a> at Scaleway (kindly sponsored by Tarides) and also the <a href=\"https://www.tunbury.org/2026/01/16/base-image-builder/\">OCaml infrastructure</a> and <a href=\"https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html#docs-ci\">documentation generation</a> that suck up our maintenance time.</p>\n<p>Meanwhile, use of <a href=\"https://github.com/avsm/oxmono\">oxmono</a> for our OxCaml infrastructure is working extremely well, and I'm solely using that with worktrees and branches to deploy services now. The OxCaml compiler is rock solid and with a monorepo, managing packages is a breeze. <a href=\"https://jon.recoil.org\">Jon Ludlam</a> is also using it to fix <a href=\"https://jon.recoil.org/blog/2026/02/weeknotes-2026-06.html#oxmono\">OxCaml doc generation</a> which will be extremely useful for agentic coding (since the agents can read the generated docs to determine if the interfaces are 'good' or contain lots of hidden modules and other bad practice).</p>\n<p><img src=\"/images/tap-lego-1.webp\" alt=\"%rc\" >\nThe only real downside with the monorepo is that a <code>dune exec</code> takes about 3s to initialise on my laptop from scanning all the dune files. I'll investigate some ways to speed this up, as the Dune cache doesn't help much until all the dune files have been parsed.</p>\n<p>It was also good to see other OCaml projects progressing: Mark is getting <a href=\"https://www.tunbury.org/2026/02/11/ocaml-mp3/\">more familiar with OxCaml performance</a> and the <a href=\"https://patrick.sirref.org/ocaml-roundup-january-2026\">OCaml TIFF library</a> in Outreachy is going well. 
I hope to use this soon in TESSERA-oxcaml...</p>\n<h2 id=\"lego-for-the-evidence-tap\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#lego-for-the-evidence-tap\"></a>Lego for the evidence TAP</h2>\n<p><img src=\"/images/tap-lego-2.webp\" alt=\"%rc\" >\nWe're just finalising a grant that's been awarded to do the exciting next phase of the <a href=\"/projects/ce\">CE</a> &quot;Cambridge Evidence TAP&quot; project by generalising it to other fields (such as education, public health and climate adaptation). While waiting for the grant paperwork to finish, my colleague from Education <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a> had the brainwave of inviting Gina Gomez de la Cuesta from <a href=\"https://playincluded.com/\">Play Included</a> to come along to facilitate a brainstorming session about what we'll work on when the project starts.</p>\n<p><img src=\"/images/tap-lego-3.webp\" alt=\"%rc\" >\nThis involved getting our friends from across the university and <a href=\"https://www.csap.cam.ac.uk\">CSaP</a> together. It was me and <a href=\"https://toao.com\">Sadiq Jaffer</a> from Computer Science, <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> and <a href=\"https://samreynolds.org\">Sam Reynolds</a> from Conservation, <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a> from Education, Mélanie Gréaux from the WHO, <a href=\"mailto:r.doubleday@jbs.cam.ac.uk\">Rob Doubleday</a> and <a href=\"mailto:njb1010@cam.ac.uk\">Nicola Buckley</a> from the Judge, and <a href=\"https://www.cser.ac.uk/team/alex-marcoci/\">Alex Marcoci</a> from <a href=\"https://www.cser.ac.uk\">CSER</a>. 
I've got to say that the use of Lego to break the ice among so many fields, and <em>also</em> to help us form thoughts about a very <a href=\"/papers/2025-evidence-tap\">complex</a> and <a href=\"/papers/2025-ai-poison\">nuanced</a> area was just brilliant.</p>\n<p><img src=\"/images/tap-lego-5.webp\" alt=\"%rc\" >\nAnd after a long week, it was just nice to kick back and play with LEGO. I'm <em>really</em> looking forward to doing more work with our friends from other departments: working with conservation has gotten me out and about <a href=\"/ideas/hedgehog-mapping\">finding hedgehogs</a>, so now learning about <a href=\"https://www.educ.cam.ac.uk/centres/pedal/\">play in education, development and learning</a> will hopefully help me understand how to be a better teacher!  I also highly recommend contacting Gina at <a href=\"https://playincluded.com/\">Play Included</a> if you would like to try this yourself for one of your own projects.</p>\n<h2 id=\"echo\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#echo\"></a>Echo</h2>\n<p>After <a href=\"/notes/2026w6\">last week's</a> talk at ARIA, I enjoyed hosting the founders of a new focussed research organisation called 'Echo Labs' funded by ARIA who are doing very ambitious things with AI and biodiversity. More on what they're up to after they officially launch, but I was hugely impressed with their drive and focus to deliver near term impact. I hope we will continue to build collaboration with them as part of our <a href=\"https://www.clr.conservation.cam.ac.uk/\">Centre for Landscape Regeneration</a>.</p>\n<p><img src=\"/images/cci-valentines.webp\" alt=\"%rc\" >\nWhile they were visiting the CCI, there was also a dramatic unveiling of the new CCI logo, so everyone poured into the coffee room. 
Well done on a successful launch; the new logo is very practical and inclusive and pretty and I'll be using it everywhere once I get my mittens on the high res versions!</p>\n<h2 id=\"iucn-red-list\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#iucn-red-list\"></a>IUCN Red List</h2>\n<p><a href=\"https://shaneweisz.com\">Shane Weisz</a> continued his storming first year PhD by giving a fantastic <a href=\"https://watch.eeg.cl.cam.ac.uk/w/ppe8LV3MAaYCAmeDMAfspi\">EEG seminar</a> on his PhD work to date on <a href=\"https://www.shaneweisz.com/blog/presenting-ai-for-the-red-list-to-iucn\">speeding up Red List assessment</a>. He's off to the inaugural <a href=\"https://wildlabs.net/article/save-date-first-international-conservation-technology-conference\">Conservation Technology Conference</a> in Peru next week to speak about this work there as well, which should be most exciting (and hopefully filled with much birding). We decided to codename the overall dashboard effort as &quot;<a href=\"/projects/enki\">Enki</a>&quot;.</p>\n<p><div class=\"video-center\"><iframe title=\"Can Agentic AI Accelerate IUCN Red List Assessments?\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/bd8063d9-0572-4d12-ba0e-8c2ac2614a1f\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>On the Enki storage topic, I've been working on syncing <a href=\"https://gbif.org\">GBIF</a> locally as Parquet files (about 9TB) so we don't need to hammer their API. I noticed that the AWS Open Data hosting hadn't been updated for six months, but a <a href=\"https://bsky.app/profile/gbif.org/post/3memdqsxjo22v\">single Bluesky post</a> was enough to get GBIF's attention and they fixed it within days. 
Props to them for being so responsive, but it's also interesting that almost no one else seems to locally copy the data regularly.</p>\n<h2 id=\"reading\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reading\"></a>Reading</h2>\n<p><a href=\"https://simon.peytonjones.org/\">Simon Peyton Jones</a> pointed me to his friend Julian Allwood's new book <a href=\"https://www.cambridge.org/core/books/promise-the-earth/31E27442471A864A6582BA751ECD239F\">Promise the Earth: A Safe Climate in Good Faith</a> which I picked up from the CUP bookstore. It's an intriguing book in that it's written by an engineer and a theologian, and makes the\nargument that restraint is the only way forward.</p>\n<blockquote>\n<p>This brilliant book makes the case that rational self-interest alone will not\nbring about the radical changes to the world economy required to protect our\nchildren from the climate breakdown that is coming. The role of faith and\ncompassion has historically played a major role in the way that people\ncooperate and plan for the future; this persuasive book makes the case that\nit is needed more than ever.\n<cite>-- <a href=\"https://www.cambridge.org/core/books/promise-the-earth/reviews/98D86D15A31E74CD4934B2F28F489609\">Review by Professor Mark Miodownik MBE FREng</a>, 2025</cite></p>\n</blockquote>\n<h2 id=\"next-week\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#next-week\"></a>Next week</h2>\n<p><a href=\"https://jon.recoil.org\">Jon Ludlam</a> finished off undergraduate exam questions for <a href=\"/notes/focs\">Foundations of Computer Science</a> and I am going to do a review on them -- the only teaching thing I am doing this <a href=\"https://mort.io/blog/all-change/\">sabbatical year</a>.</p>\n<p>On Monday, I'm jetting off to the <a href=\"https://impact.indiaai.gov.in/\">India AI summit</a> in\nDelhi and to see some relatives for the week.\nI'll be back in Cambridge on Friday to host <a href=\"https://cs.brown.edu/~sk/\">Shriram Krishnamurthi</a> who is 
passing through on sabbatical. Very excited to see him again to continue our <a href=\"/notes/icfp25-what-i-learnt\">ICFP conversations about teaching</a>!</p>\n<p>Also the third outing of <a href=\"https://propl.dev\">PROPL</a> has been accepted to PLDI 2026 this summer, so I'll be helping the new chair <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> prepare the PC invitations and round up the deadlines for submissions.</p>\n<p>Some fun links:</p>\n<ul>\n<li><a href=\"https://blog.janestreet.com/advent-of-fpga-challenge-2025-results/\">Results from the Advent of FPGA Challenges</a>: the diversity of results was hilarious, with my favourite being the ones who TAPED OUT <a href=\"https://atx.name/electronics/hardcaml-to-74-series-logic/\">a PCB</a> or <a href=\"https://gds-viewer.tinytapeout.com/?pdk=sky130A&amp;model=https%3A%2F%2Frobertsaabwoo.github.io%2Fadvent_of_fpga_2025_day_12_tiny_tapeout%2F%2Ftinytapeout.oas\">an IC</a>. Incredibly impressive and not what you might expect.</li>\n<li><a href=\"https://www.tunbury.org/2026/02/09/base-image-builder/\">Is OCaml the only user of Windows native Docker containers?</a> because it sure feels lonely out there. We deployed them years ago, but they're flaky and slow and everyone's using Linux anyway. I feel that once Windows cross compilation is solved we'll never look back at Windows native toolchains, although <a href=\"https://www.dra27.uk\">David Allsopp</a> may <a href=\"https://www.dra27.uk/blog/platform/2025/12/17/its-merged.html\">disagree</a> as he makes OCaml's toolchain ever easier to deploy. But even David can't fix Windows nanoserver...</li>\n<li>Big up to <code>chasewnorton</code>, zero-day vibing like mad with <a href=\"https://github.com/anthropics/claudes-c-compiler/pulls?q=is%3Apr+is%3Aclosed\">100 PRs on a codebase no living being understands</a>.</li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2025). Foundations of Computer Science. 
<a href=\"https://doi.org/10.59350/qms3q-ymn65\" target=\"_blank\"><i>10.59350/qms3q-ymn65</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy et al (2026). ucam-eo/geotessera: Reduce startup time, improved coordinate clamping and reduces the size of coverage data for the globe viewer. Zenodo. <a href=\"https://doi.org/10.5281/zenodo.18649425\" target=\"_blank\"><i>10.5281/zenodo.18649425</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w7",
      "title": ".plan-26-07: Storage, Lego, Echo, and the IUCN",
      "summary": "Growing the Ceph cluster for TESSERA embeddings, a Lego brainstorming session for the Evidence TAP, hosting Echo Labs from ARIA, and Shane's IUCN Red List seminar.",
      "date_published": "2026-02-15T00:00:00.000000Z",
      "date_modified": "2026-02-15T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "evidence",
        "biodiversity",
        "storage"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qms3q-ymn65",
          "doi": "10.59350/qms3q-ymn65",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.5281/zenodo.18649425",
          "doi": "10.5281/zenodo.18649425",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2026w6",
      "content_html": "<h2 id=\"vivaing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#vivaing\"></a>Vivaing</h2>\n<p>This week started off with my conducting the <a href=\"https://mort.io/blog/phd-viva/\">PhD viva</a> for <a href=\"https://mlisaius.github.io/\">Madeline Lisaius</a>, which was a very enjoyable 4.5 hours of discussion with her and external examiner <a href=\"https://le.ac.uk/people/kevin-tansey\">Kevin Tansey</a> who came up from Leicester for the day. I can't comment on the result until it is officially ratified by the examiners board here, but you can judge for yourself from the expressions below how it went! Maddy is off to <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7427756948470665216/\">Paris next</a> for a new role working on CLAY2, and we will miss her here in Cambridge!</p>\n<p><img src=\"/images/maddy-viva-1.webp\" alt=\"%c\" title=\"A happy Maddy and Kevin after a long viva. Photo credit: Simon Peyton Jones\" ></p>\n<h2 id=\"talking-at-aria\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#talking-at-aria\"></a>Talking at ARIA</h2>\n<p>After that I went to prepare for a big talk at <a href=\"https://aria.org.uk\">ARIA</a> after being invited by Ilan Gur from meeting him <a href=\"/notes/principles-for-collective-knowledge\">a few months ago</a>. 
I went down with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> and I gave the talk while Keshav demoed his increasingly brilliant <a href=\"https://github.com/sk818/TEE\">TESSERA Embeddings Explorer</a> and Sadiq showed off his <a href=\"https://toao.com/blog/earth-observation-budget-solar-farms-tiny-model\">solar panel CNNs</a> and predicting what would grow in the newest season of Clarkson's Farm (seriously, it works great).</p>\n<p><img src=\"/images/aria-visit-2.webp\" alt=\"%rc\" title=\"Sadiq poses at ARIA!\" >\nMany of the program directors were present, and I appreciated the chance to throw out some of the bleeding edge stuff we've been doing, and we got a lot of useful feedback and also excitement about use cases.  I gotta say I’m really digging the general sense of optimism and sense of momentum from all the ARIA staff I talk to. There’s a &quot;can do&quot; attitude that's sometimes been missing from the general discourse in the UK in recent years, and I greatly enjoyed all the discussions across programs.</p>\n<p><img src=\"/images/aria-visit-1.webp\" alt=\"%rc\" >\nIt was also delightful to see top functional programmer Kathleen Fisher there briefly (who is also their <a href=\"https://www.aria.org.uk/insights/introducing-our-next-ceo/\">new CEO</a>!).  
On the train back, I also took the opportunity to refresh <a href=\"https://en.wikipedia.org/wiki/Kathleen_Fisher\">Kathleen's wikipedia article</a>.</p>\n<h2 id=\"nature-covers-our-conservation-evidence-conference\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#nature-covers-our-conservation-evidence-conference\"></a>Nature covers our conservation evidence conference</h2>\n<p>Nature sent along a reporter to the <a href=\"/notes/red-pill-conservation\">Conservation evidence conference</a>, and she published an <a href=\"https://www.nature.com/articles/d41586-026-00309-1\">excellent piece in Nature</a> on how biodiversity has an evidence problem and approaches to fix it (the latter being most important!).</p>\n<blockquote>\n<p>William Sutherland, a conservation scientist at the University of Cambridge\nwho leads the Conservation Evidence project, says that his team is now\ndeploying artificial intelligence to improve the speed and the thoroughness\nof the project's process. The ambition is for users to be able to interrogate\nthe data set with a specially designed '<a href=\"/projects/ce\">conservation chatbot</a>', meaning\npractical questions would be answered with a narrative summary and links\nprovided to sources of evidence. The data set would constantly be updated to\ntake account of new studies and retractions, with humans overseeing the\nprocess. 
The concept is described in a <a href=\"/papers/2025-evidence-tap\">preprint</a> that was published last year.\n<cite>-- <a href=\"https://www.nature.com/articles/d41586-026-00309-1\">Biodiversity conservation has an evidence problem — it’s time to fix it</a>, 2025</cite></p>\n</blockquote>\n<h2 id=\"parliamentary-post-briefing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#parliamentary-post-briefing\"></a>Parliamentary POST briefing</h2>\n<p>As another followup to the conference, I also got invited by <a href=\"https://pml.ac.uk/profile/stephanie-day/\">Stephanie Day</a> to give evidence to the <a href=\"https://en.wikipedia.org/wiki/Parliamentary_Office_of_Science_and_Technology\">Parliamentary Office of Science and Technology</a> (POST). POST is an office of both Houses of Parliament, which provides independent and balanced analysis of research evidence related to public policy issues. We had a lively roundtable with a breadth of experts discussing the pros and cons of evidence-based approaches and data accessibility for both <a href=\"/notes/coar-prc\">open models</a> and also <a href=\"/notes/rs-future-of-publishing\">conventional publishing</a>, and I'm sure this will make for a good briefing report when completed.</p>\n<h2 id=\"interviewed-by-the-cacm\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#interviewed-by-the-cacm\"></a>Interviewed by the CACM</h2>\n<p>I've got an upcoming article (on the front cover!) of CACM, so <a href=\"https://dave.recoil.org\">Dave Scott</a> and I\nwere actually video interviewed in my office by a wonderful videographer <a href=\"https://www.rosiepowellfreelance.com/\">Rosie\nPowell</a> who did the trek up three floors\nof stairs with all her kit! More on the article itself when it's out in a few\nweeks.  It was a set of good news for papers all around, with another article\naccepted to Nature Communications as well after 5 (!) 
rounds of reviews and\nprobably over 100 pages of responses.</p>\n<p><img src=\"/images/cacm-interview-1.webp\" alt=\"%c\" title=\"Dave and I were interviewed separately in my office while staring at each other. Just like when we were at school.\" ></p>\n<h2 id=\"storage-storage-storage\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#storage-storage-storage\"></a>Storage storage storage</h2>\n<p>The week would also not be complete without doing more work on the TESSERA storage array. This time we were busy juggling machines to cope with the influx of embeddings from Vultr which are filling up storage at an alarming rate. There was a bit of drama last month when one of our <a href=\"https://www.tunbury.org/2026/01/13/pima-nvme/\">machines failed</a>; while we didn't lose anything, it took a <em>long</em> time to sync data back. Now, we're building a much more resilient <a href=\"https://www.tunbury.org/2026/01/06/ceph-notes/\">Ceph cluster</a> to avoid a dependency on any one host.</p>\n<p>I also managed to do a lot of hacking on <a href=\"https://github.com/avsm/oxmono\">oxmono</a>, including porting compression libraries like Brotli, Zstd and Snappy over to OxCaml in preparation for my pure OCaml Parquet implementation. 
OxCaml is pretty solid so far, although it's inscrutable enough due to being a moving target that Claude's a necessity to work through all the annotated interfaces.</p>\n<h2 id=\"next-week\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#next-week\"></a>Next week</h2>\n<p>I've got a battery of meetings and followups from ARIA, including a bunch of biodiversity related meetings and also a storage array to fix up!\tI'm in Cambridge, although the wet and windy weather isn't inspiring to go out and do my running routine...</p>\n<p>Fun links:</p>\n<ul>\n<li>I don't really understand the intricacies, but I'm very excited by <a href=\"https://jonmsterling.com\">Jon Sterling</a> working on <a href=\"https://www.jonmsterling.com/2026-W06/\">NewsDrawer</a> and seeing the mac UI come together.</li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2026). Discussing effective conservation with all the UK Chief Scientists. <a href=\"https://doi.org/10.59350/qjrmv-38130\" target=\"_blank\"><i>10.59350/qjrmv-38130</i></a></li>\n<li>Madhavapeddy (2025). Royal Society's Future of Scientific Publishing meeting. <a href=\"https://doi.org/10.59350/nmcab-py710\" target=\"_blank\"><i>10.59350/nmcab-py710</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Madhavapeddy (2025). Publish, Review, Curate to upend scholarly publishing. <a href=\"https://doi.org/10.59350/fpc9w-ccj82\" target=\"_blank\"><i>10.59350/fpc9w-ccj82</i></a></li>\n<li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li>\n<li>(2026). Biodiversity conservation has an evidence problem — it’s time to fix it. Nature Publishing Group. 
<a href=\"https://doi.org/10.1038/d41586-026-00309-1\" target=\"_blank\"><i>10.1038/d41586-026-00309-1</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w6",
      "title": ".plan-26-06: Vivas, ARIA and interviews",
      "summary": "PhD viva for Maddy, presenting TESSERA at ARIA, Nature covers the conservation evidence conference, giving evidence to Parliamentary POST, and a CACM interview.",
      "date_published": "2026-02-08T00:00:00.000000Z",
      "date_modified": "2026-02-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "conservation",
        "evidence",
        "biodiversity",
        "ocaml",
        "oxcaml"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qjrmv-38130",
          "doi": "10.59350/qjrmv-38130",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nmcab-py710",
          "doi": "10.59350/nmcab-py710",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fpc9w-ccj82",
          "doi": "10.59350/fpc9w-ccj82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-026-00309-1",
          "doi": "10.1038/d41586-026-00309-1",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/qjrmv-38130",
      "content_html": "<p>I helped <a href=\"/projects/ce\">CE</a> host a gathering of evidence champions for biodiversity at Pembroke in late January. The first day was a remarkable closed meeting to sit in on, with the science leads of <em>all five</em> UK nature conservation bodies and the <a href=\"https://jncc.gov.uk\">JNCC</a> present! <a href=\"https://toao.com\">Sadiq Jaffer</a> and I got to present <a href=\"/papers/2025-tessera\">TESSERA</a> to them and discuss more broadly how we could apply machine learning to accelerate nature recovery and preservation across the UK. Read CE's blog series (parts <a href=\"https://about.conservationevidence.com/2026/01/16/geospatial-foundation-models/\">1</a>, <a href=\"https://about.conservationevidence.com/2026/02/03/delivering-practitioner/\">2</a>, <a href=\"https://about.conservationevidence.com/2026/02/12/delivering-funders/\">3</a>) with more details from me next.</p>\n<p><img src=\"/images/ce-champions-1.webp\" alt=\"%c\" title=\"The UK Statutory Nature Conservation Bodies meet with academic colleagues and DEFRA's chief scientist to discuss how advances in AI can contribute to their work. Image credit: Conservation Evidence\" ></p>\n<h2 id=\"presenting-to-the-chief-scientists\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#presenting-to-the-chief-scientists\"></a>Presenting to the Chief Scientists</h2>\n<p>The first day's meeting of chief scientists was convened by <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> who recently became co-chair of the new <a href=\"https://jncc.gov.uk/about-jncc/who-we-are/joint-committee/professor-julia-jones/\">Chief Scientists Group</a>. 
Julia wrote a <a href=\"https://about.conservationevidence.com/2026/01/16/geospatial-foundation-models/\">fantastic roundup blog post</a> about the goings on:</p>\n<blockquote>\n<p>The presentation resulted in a fascinating discussion about how such\ninnovations may influence the work of the SNCBs in future. The next step is\nto train and validate downstream models combining existing ground-based data\nwith the embeddings. The hope is that this would allow us to interpolate\ndatasets between sampled locations and then ask larger-scale questions about\necosystem change and the impact of interventions for a wider range of\noutcomes than is currently possible. Such understanding is crucial for so\nmany applications.\n<cite>-- <a href=\"https://about.conservationevidence.com/2026/01/16/geospatial-foundation-models/\">Could geospatial foundation models help improve conservation effectiveness?</a></cite></p>\n</blockquote>\n<p><img src=\"/images/ce-champions-2.webp\" alt=\"%rc\" title=\"Julia and Sadiq prepping in the 'corridor track'\" ></p>\n<p>It was a real honour to be able to directly discuss our research with this\ngroup, with me personally finally getting to meet Dr Sara McGuckin, from the\nNorthern Ireland Environment Agency (NIEA); just round the corner from where I\nwent to school! I showed her <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>'s work on mapping <a href=\"https://patricoferris.github.io/ni-forests/\">NI\ntrees</a> as well and really want to\nget around to doing a TESSERA version soon.</p>\n<h2 id=\"conservation-evidence-conference-at-pembroke\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#conservation-evidence-conference-at-pembroke\"></a>Conservation Evidence Conference at Pembroke</h2>\n<p>The second day was held in the new Pembroke Auditorium, which was as always\na gorgeous venue. The district heating system failed on the day, and it was\nunfortunately close to freezing. 
Luckily the hardy conservation community didn't\ncomplain even once; this was just another day in the outdoors for them. Bill even\nrallied everyone with his famous bell!</p>\n<p><div class=\"video-center\"><iframe title=\"Bill Sutherland at the Conservation Evidence conference at Pembroke\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/dc3d8209-cec7-42b7-b6ca-f5f8f9182f60\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>The CE blog published a <a href=\"https://about.conservationevidence.com/2026/02/03/delivering-practitioner/\">comprehensive roundup</a> of all the talks with a <a href=\"https://about.conservationevidence.com/2026/02/12/delivering-funders/\">summary of takeaways for funders</a>. All of the talk recordings are available on <a href=\"https://www.youtube.com/playlist?list=PLAprj0cLPLDdfu7Lw76BaqgWHMIePQuNW\">CE's Youtube channel</a> and I also mirrored them ad-free on our <a href=\"https://watch.eeg.cl.cam.ac.uk/c/ce/videos\">Watch EEG channel</a> if you are more Fediverse oriented.</p>\n<p>I'm going to highlight three talks here, but only because they're relevant when viewed together -- if you're interested in this then all of the talks were of high quality and worth your time to watch!</p>\n<h3 id=\"bill-sutherland-evidence-based-conservation-progress-and-challenges\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#bill-sutherland-evidence-based-conservation-progress-and-challenges\"></a>Bill Sutherland: Evidence-based conservation: progress and challenges</h3>\n<p>Bill's talks are always a must-watch, and this one combined nuggets of evidentiary falsification (bird nest box designs turn out to be entirely suboptimal) vs a basic optimism that evidence based conservation is steadily becoming the norm rather than the exception.</p>\n<p><div class=\"video-center\"><iframe title=\"Evidence-based conservation: progress and challenges\" 
width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/d3f6cb6f-59c9-4691-bf0f-33c49a7efeff\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<h3 id=\"julia-pg-jones-biodiversity-conservations-causal-revolution\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#julia-pg-jones-biodiversity-conservations-causal-revolution\"></a>Julia PG Jones: Biodiversity conservation's causal revolution</h3>\n<p>Julia's theme was on choosing the right design for the vastly <a href=\"/papers/2025-biodiversity-9recs\">growing quantities of biodiversity data</a>. She focussed on remembering that &quot;assumptions are everything&quot; and the importance of upfront design combined with post-project evaluation to avoid a causal soup of data that's difficult to reconnect.</p>\n<p><div class=\"video-center\"><iframe title=\"Biodiversity conservation's causal revolution\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/2297b0d5-4d5b-4125-94b5-9db0d1d67889\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<h3 id=\"red-pill-or-blue-pill-for-conservation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#red-pill-or-blue-pill-for-conservation\"></a>Red Pill or Blue Pill for Conservation?</h3>\n<p>And last and least, I got to talk about some of our recent advances in using <a href=\"/notes/humans-save-nature-not-ai\">AI for conservation</a>, such as <a href=\"/papers/2025-tessera\">TESSERA</a> and the <a href=\"/papers/2025-evidence-tap\">Conservation Evidence TAP</a>. Before that though, I opened the talk with my thoughts on the basic choice facing conservation this year. AI is here to stay, for good or for evil, and the entire field has to cope really quickly. 
The easy option (blue pill) would be to simply use off-the-shelf chatbots for decision-making, but be unable to reproduce or trace the provenance of outputs from the black boxes. The harder option (red pill) is to engineer open and traceable processing pipelines which can filter out <a href=\"/papers/2025-ai-poison\">AI poison</a> and give sovereign decision making capability to governments. And of course, all the work we're doing is as open as we can make it. Watch the talk to learn more!</p>\n<p><div class=\"video-center\"><iframe title=\"The red pill or the blue pill for AI and conservation\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/5b58f41f-8d71-48ba-92d1-b34a49ee76ef\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<h2 id=\"fun-photos\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#fun-photos\"></a>Fun Photos</h2>\n<p>The conference was a very intense two days indeed, but it was brilliant to see the Auditorium so full of enthusiasm and desire for effective action.</p>\n<p><img src=\"/images/ce-champions-3.webp\" alt=\"%c\" title=\"The very full Pembroke auditorium! Credit: Sam Reynolds\" >\n<img src=\"/images/ce-champions-4.webp\" alt=\"%c\" title=\"My picture of Sam taking the previous picture\" >\n<img src=\"/images/ce-champions-5.webp\" alt=\"%c\" title=\"Did I mention how cold the second day was, but also my excitement at meeting the head of the Northern Ireland Environment Agency?\" ></p><h1>References</h1><ul><li>Madhavapeddy (2025). Humans are the ones that will save nature, helped by AI. <a href=\"https://doi.org/10.59350/32h4v-5kt36\" target=\"_blank\"><i>10.59350/32h4v-5kt36</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. 
<a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Sutherland et al (2026). Nine changes needed to deliver a radical transformation in biodiversity measurement. <a href=\"https://doi.org/10.1073/pnas.2519345123\" target=\"_blank\"><i>10.1073/pnas.2519345123</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely? Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/red-pill-conservation",
      "title": "Discussing effective conservation with all the UK Chief Scientists",
      "summary": "Hosting the UK chief scientists for nature conservation at Pembroke to discuss TESSERA and AI for biodiversity, followed by the Conservation Evidence conference where I talked about choosing the open red pill over black-box AI for conservation decision-making.",
      "date_published": "2026-02-03T00:00:00.000000Z",
      "date_modified": "2026-02-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "evidence",
        "ai",
        "biodiversity",
        "tessera"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/32h4v-5kt36",
          "doi": "10.59350/32h4v-5kt36",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1073/pnas.2519345123",
          "doi": "10.1073/pnas.2519345123",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/9c6bz-kb659",
      "content_html": "<p>Since helping with the <a href=\"/notes/icfp25-oxcaml\">OxCaml tutorial</a> last year at ICFP,\nI've been chomping at the bit to use it for real in our research infrastructure\nfor <a href=\"/projects/plancomp\">planetary computing</a> to manage the petabytes of <a href=\"/notes/geotessera-python\">TESSERA embeddings</a> we've been generating.</p>\n<p>The reason for my eagerness is that OxCaml has a number of language extensions\nthat give giant leaps in performance for systems-oriented programs, while\nretaining the familiar OCaml functional style of programming. And unlike Rust,\nthere's a garbage collector available for 'normal' code. I am also deeply sick\nand tired of maintaining large Python scripts recently, and crave the modularity and\ntype safety of OCaml.</p>\n<p>The traditional way I learn a new technology is by replacing my <a href=\"/notes/bushel-lives\">website infrastructure</a> with the latest hotness. I switched my live site\nover to building with OxCaml last year, but never got around to deeply\nintegrating the new extensions. 
Therefore, what I'll talk about next is a new\nwebserver I've been building called\n<strong><a href=\"https://github.com/avsm/oxmono/tree/e0b061c0f6621c80e3a990d02867e3302fd7ce16/avsm/httpz\">httpz</a></strong>\nwhich goes all in on performance in OCaml!</p>\n<p><em>(Many thanks to <a href=\"https://tyconmismatch.com/code.html\">Chris Casinghino</a>, <a href=\"https://thenumb.at/\">Max Slater</a>, <a href=\"https://richarde.dev/\">Richard Eisenberg</a>, <a href=\"https://github.com/yminsky\">Yaron Minsky</a>, <a href=\"https://github.com/mshinwell\">Mark Shinwell</a>, <a href=\"https://www.dra27.uk\">David Allsopp</a> and the rest of the Jane Street\ntools and compilers team for answering many questions while I got started on all this!)</em></p>\n<h2 id=\"why-zero-allocation-for-http11\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#why-zero-allocation-for-http11\"></a>Why Zero Allocation for HTTP/1.1?</h2>\n<p><a href=\"https://github.com/avsm/oxmono/tree/e0b061c0f6621c80e3a990d02867e3302fd7ce16/avsm/httpz\">httpz</a> is a high-performance HTTP/1.1 parser that aims to have no major heap allocation, and very minimal minor heap allocation, by using OxCaml's <a href=\"https://oxcaml.org/documentation/unboxed-types/01-intro/\">unboxed types</a> and <a href=\"https://oxcaml.org/documentation/stack-allocation/intro/\">local allocations</a>.</p>\n<p>Why is this useful?  It means that the entire lifetime of an HTTP connection\ncan be handled in the callstack alone, so freeing up a connection is just a\nmatter of returning from the function that handles it. In the steady state, a\nwebserver would have almost no garbage collector activity. When combined with\n<a href=\"/papers/2021-pldi-retroeff\">direct style effects</a>, it can also be written without\nlooking like callback soup!</p>\n<p>I decided to specialise this library for HTTP/1.1 for now, and so settled on\nthe input being a simple 32KB bytes value. 
This represents an HTTP request with\nthe header portion (HTTP body handling is relatively straightforward for POST\nrequests, and not covered in this post).</p>\n<p>Given an input buffer like this, what can we do with OxCaml <em>vs</em> vanilla OCaml\nto make this go fast?</p>\n<h3 id=\"unboxed-types-and-records\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#unboxed-types-and-records\"></a>Unboxed Types and Records</h3>\n<p>The first port of call is to figure out the core types we're going to use for our\nparser. If you need to get familiar with OCaml's upstream memory representation then\n<a href=\"https://dev.realworldocaml.org/runtime-memory-layout.html\">head over to Real World OCaml</a>.</p>\n<p>In my usual OCaml code, I use libraries like <a href=\"https://github.com/mirage/ocaml-cstruct\">cstruct</a>\nthat I <a href=\"/projects/unikernels\">originally</a> wrote back in 2012 to manage non-copying views into bytes buffers. Cstruct\ndefines a record that has four words (the box, and three words for the fields):</p>\n<pre><code class=\"language-ocaml\">type buffer = (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t\ntype Cstruct.t = private {\n  buffer: buffer;\n  off   : int;\n  len   : int;\n}\n</code></pre>\n<p>The idea is to use the record to get narrow views into a larger buffer, and that these\nsmall views can just live on the minor heap of the runtime which is fast to collect.\nOxCaml advances this by providing unboxed versions of <a href=\"https://oxcaml.org/documentation/miscellaneous-extensions/small-numbers/\">small\nnumbers</a>\nthat live in registers or on the stack, via a new syntax <code>int16#</code>.</p>\n<p>Instead of Bigarrays, we're now going to switch to use <code>bytes</code> instead, but the\nbasic idea is the same.  
Since httpz's buffer is a max of 32KB, 16-bit integers\nalso suffice for all positions and lengths!</p>\n<pre><code class=\"language-ocaml\">type Httpz.t = #{ off : int16# ; len : int16# }\n</code></pre>\n<p>There are actually two new features here: the first is that records can be unboxed with the <code>#{}</code>\nsyntax, and the contents themselves are of a smaller width.  Let's have a closer look\nat the difference between the Cstruct boxed version and this new OxCaml one:</p>\n<h4 id=\"inspect-unboxing-in-utop\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#inspect-unboxing-in-utop\"></a>Inspect unboxing in utop</h4>\n<p>My first port-of-call is usually to use utop interactively to poke around\nusing the <code>Obj</code> module.  This isn't quite so easy in OxCaml since the unboxed\nrecords use a special <a href=\"https://oxcaml.org/documentation/unboxed-types/01-intro/\">layout</a>:</p>\n<pre><code># type t = #{ off : int16# ; len : int16# };;\ntype t = #{ off : int16#; len : int16#; }\n\n# let x = #{ off=#1S; len=#2S };;\nval x : t = #{off = &lt;abstr&gt;; len = &lt;abstr&gt;}\n\n# Obj.repr x;;\nError: This expression has type t but an expression was expected of type\n         ('a : value)\n       The layout of t is bits16 &amp; bits16\n         because of the definition of t at line 1, characters 0-41.\n       But the layout of t must be a sublayout of value.\n\n</code></pre>\n<p>That failed, but it did reveal that we have this intriguing int16 pair layout\ninstead of the normal OCaml flat value representation!  Let's use the compiler\nto figure this out...</p>\n<h4 id=\"inspect-unboxing-in-lambda\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#inspect-unboxing-in-lambda\"></a>Inspect unboxing in lambda</h4>\n<p>I next built a small\ntest program and inspected the <a href=\"https://dev.realworldocaml.org/compiler-backend.html\">lambda intermediate language</a> from the compiler. 
To avoid dependencies, I just bound the raw compiler internals directly by checking out the oxcaml source code.</p>\n<pre><code class=\"language-ocaml\">external add_int16 : int16# -&gt; int16# -&gt; int16# = &quot;%int16#_add&quot;\nexternal int16_to_int : int16# -&gt; int = &quot;%int_of_int16#&quot;\n\ntype span = #{ off : int16#; len : int16# }\n\nlet[@inline never] add_spans (x : span) (y : span) : span =\n  #{ off = add_int16 x.#off y.#off; len = add_int16 x.#len y.#len }\n\nlet () =\n  let x = Sys.opaque_identity #{ off = #1S; len = #2S } in\n  let y = Sys.opaque_identity #{ off = #100S; len = #200S } in\n  let z = add_spans x y in\n  Printf.printf &quot;off=%d len=%d\\n&quot; (int16_to_int z.#off) (int16_to_int z.#len)\n</code></pre>\n<p>This introduces enough compiler optimisation barriers such that\nthe addition is not optimised away at compile time. We can compile this\nwith <code>ocaml -dlambda src.ml</code> and see the intermediate form after type checking:</p>\n<pre><code>(let\n  (add_spans/290 =\n     (function {nlocal = 0} x/292[#(int16, int16)] y/293[#(int16, int16)]\n       never_inline : #(int16, int16)\n       (funct-body add_spans ./x.ml(6)&lt;ghost&gt;:196-294\n         (before add_spans ./x.ml(7):229-294\n           (make_unboxed_product #(int16, int16)\n             (%int16#_add (unboxed_product_field 0 #(int16, int16) x/292)\n               (unboxed_product_field 0 #(int16, int16) y/293))\n             (%int16#_add (unboxed_product_field 1 #(int16, int16) x/292)\n               (unboxed_product_field 1 #(int16, int16) y/293)))))))\n</code></pre>\n<p>You can see the unboxing propagating nicely here through the intermediate code!</p>\n<h4 id=\"inspect-unboxing-in-native-code\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#inspect-unboxing-in-native-code\"></a>Inspect unboxing in native code</h4>\n<p>The next step is to verify what this looks like when compiled as optimised native\ncode. 
I used <code>ocamlopt -O3 -S</code> on my arm64 machine which emits the assembly code\nafter all the compiler passes, and found:</p>\n<pre><code>In the entry point:\n  orr   x0, xzr, #1      ; x.#off = 1\n  orr   x1, xzr, #2      ; x.#len = 2\n  movz  x2, #100, lsl #0 ; y.#off = 100\n  movz  x3, #200, lsl #0 ; y.#len = 200\n  bl    _camlX__add_spans_0_1_code\n\n_camlX__add_spans_0_1_code:\n  add   x1, x1, x3       ; len: x.#len + y.#len\n  sbfm  x1, x1, #0, #15  ; sign-extend to 16 bits (int16# semantics)\n  add   x0, x0, x2       ; off: x.#off + y.#off\n  sbfm  x0, x0, #0, #15  ; sign-extend to 16 bits\n  ret\n\n</code></pre>\n<p>We can see from the assembly that there's no boxing, and no heap allocations,\nand the <a href=\"https://finkmartin.com/aarch64-morello/sbfm.html\">sbfm instruction</a> maintains\nthe 16-bit semantics via sign extension.</p>\n<p>Let's double check that the normal boxed OCaml does do more work and that isn't\njust the flambda2 compiler doing its magic.  Here's a boxed version of the benchmark using\nplain OCaml:</p>\n<pre><code>type span = { off : int; len : int }\n\nlet[@inline never] add_spans (x : span) (y : span) : span =\n  { off = x.off + y.off; len = x.len + y.len }\n\nlet () =\n  let x = Sys.opaque_identity { off = 1; len = 2 } in\n  let y = Sys.opaque_identity { off = 100; len = 200 } in\n  let z = add_spans x y in\n  Printf.printf &quot;off=%d len=%d\\n&quot; z.off z.len\n</code></pre>\n<p>Compiling this boxed version with <code>ocamlopt -O3 -S</code> and looking at the assembly shows\nmuch more minor heap activity:</p>\n<pre><code>_camlY__add_spans_0_1_code:\n      sub   sp, sp, #16\n      str   x30, [sp, #8]\n      mov   x2, x0\n      ldr   x16, [x28, #0]        ; load young_limit\n      sub   x27, x27, #24         ; bump allocator: reserve 24 bytes (3 words)\n      cmp   x27, x16              ; check if GC needed\n      b.cc  L114                  ; branch to GC if out of space\n  L113:\n      add   x0, x27, #8           ; x0 
= pointer to new block\n      orr   x3, xzr, #2048        ; header word (tag 0, size 2)\n      str   x3, [x0, #-8]         ; write header\n      ldr   x3, [x1, #0]          ; load y.off from heap\n      ldr   x4, [x2, #0]          ; load x.off from heap\n      add   x3, x4, x3            ; add them\n      sub   x3, x3, #1            ; adjust for tagged int\n      str   x3, [x0, #0]          ; store result.off to heap\n      ldr   x1, [x1, #8]          ; load y.len from heap\n      ldr   x2, [x2, #8]          ; load x.len from heap\n      add   x1, x2, x1            ; add them\n      sub   x1, x1, #1            ; adjust for tagged int\n      str   x1, [x0, #8]          ; store result.len to heap\n      ...\n      ret\n  L114:\n      bl    _caml_call_gc         ; GC call if needed\n</code></pre>\n<p>The OCaml minor heap is really fast, but it's nowhere near as\nfast as just passing values around in registers and doing\ndirect operations, which the unboxed version lets us do!</p>\n<p>My benchmark above used direct external calls to compiler primitives,\nbut OxCaml exposes normal modules for all these special types so\nwe can just open them and gain access to the usual integer operations:</p>\n<pre><code>module I16 = Stdlib_stable.Int16_u\n\nlet[@inline always] i16 x = I16.of_int x\nlet[@inline always] to_int x = I16.to_int x\n\nlet pos : int16# = i16 0\nlet next : int16# = I16.add pos #1S\n</code></pre>\n<h3 id=\"unboxed-characters\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#unboxed-characters\"></a>Unboxed characters</h3>\n<p>There's more than just integer operations in OxCaml. 
Hot off the press in the\npast few weeks have been unboxed character operations as well, so we don't need\nto use an OCaml int (this is unboxed as well, but I presume the compiler can\noptimise and pack 8-bit operations much more effectively if it knows that we're\noperating on a char instead of a full word).</p>\n<p>The httpz parser tries to use these, but the support for untagged ints <a href=\"https://github.com/oxcaml/oxcaml/pull/4779\">isn't fully\ncomplete yet</a> (thanks <a href=\"https://thenumb.at/\">Max Slater</a> for\nthe <a href=\"https://bsky.app/profile/thenumb.at/post/3mdevcomw2k2d\">pointer</a>).</p>\n<p>HTTP <a href=\"https://github.com/avsm/oxmono/blob/e0b061c0f6621c80e3a990d02867e3302fd7ce16/avsm/httpz/core/date.ml#L107-L137\">date timestamps</a> use unboxed floats as well.</p>\n<h3 id=\"returning-unboxed-records-and-tuples\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#returning-unboxed-records-and-tuples\"></a>Returning unboxed records and tuples</h3>\n<p>Once we've declared these unboxed records, they're fully nestable within other unboxed records.\nFor example, <a href=\"https://github.com/avsm/oxmono/blob/e0b061c0f6621c80e3a990d02867e3302fd7ce16/avsm/httpz/core/req.ml#L12-L21\">HTTP requests with multiple fields</a> remain unboxed:</p>\n<pre><code class=\"language-ocaml\">type request =\n  #{ meth : method_\n   ; target : span           (* Nested unboxed record *)\n   ; version : version\n   ; body_off : int16#\n   ; content_length : int64#\n   ; is_chunked : bool\n   ; keep_alive : bool\n   ; expect_continue: bool\n   }\n</code></pre>\n<p>Functions can therefore naturally return multiple values without allocation by using unboxed tuples in the return value of a function:</p>\n<pre><code class=\"language-ocaml\">let take_while predicate buf ~(pos : int16#) ~(len : int16#)\n    : #(span * int16#) =\n  let start = pos in\n  let mutable p = pos in\n  while (* ... 
*) do p &lt;- I16.add p #1S done;\n  #(#{ off = start; len = I16.sub p start }, p)\n\nlet #(result_span, new_pos) = take_while is_token buf ~pos ~len\n</code></pre>\n<p>Vanilla OCaml did some unboxing of this use of tuples, but not with\nrecords (which would land up on the minor heap).  With this OxCaml code,\nit's all just passed directly on the stack through function call traces.</p>\n<h3 id=\"local-allocations-and-exclaves\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#local-allocations-and-exclaves\"></a>Local allocations and exclaves</h3>\n<p>We can then also mark parameters to demand that they won't escape a function, enabling stack\nallocation more explicitly:</p>\n<pre><code class=\"language-ocaml\">(* Buffer is borrowed, won't be stored anywhere *)\nlet[@inline] equal (local_ buf) (sp : span) (s : string) : bool =\n  let sp_len = I16.to_int sp.#len in\n  if sp_len &lt;&gt; String.length s then false\n  else Bigstring.memcmp_string buf ~pos:(I16.to_int sp.#off) s = 0\n</code></pre>\n<p>If a function needs to return a local value, then it uses a new <code>exclave_</code> keyword. 
For example, in the <a href=\"https://github.com/avsm/oxmono/blob/e0b061c0f6621c80e3a990d02867e3302fd7ce16/avsm/httpz/core/header.mli\">HTTP request parsing</a> we look up a stack allocated list of headers:</p>\n<pre><code class=\"language-ocaml\">val find : t list @ local -&gt; Name.t -&gt; t option @ local\n\nlet rec find_string (buf : bytes) (headers : t list @ local) name = exclave_\n  match headers with\n  | [] -&gt; None\n  | hdr :: rest -&gt;\n    let matches =\n      match hdr.name with\n      | Name.Other -&gt; Span.equal_caseless buf hdr.name_span name\n      | known -&gt;\n        let canonical = Name.lowercase known in\n        String.( = ) (String.lowercase name) canonical\n    in\n    if matches then Some hdr else find_string buf rest name\n;;\n</code></pre>\n<p>Notice that it's a recursive function as well, so this is a fairly natural way\nto write something that remains stack allocated.  You can learn more about this\nfrom <a href=\"https://gavinleroy.com/\">Gavin Gray</a>'s <a href=\"https://gavinleroy.com/oxcaml-tutorial-icfp25/\">OxCaml tutorial slides</a>.</p>\n<h2 id=\"mutable-local-variables-with-let-mutable\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#mutable-local-variables-with-let-mutable\"></a>Mutable Local Variables with &quot;let mutable&quot;</h2>\n<p>A nice quality of life improvement is that OxCaml allows stack-allocated\nmutable variables in loops, eliminating the need to allocate <code>ref</code> values. This\nallows parsing code to have local mutability:</p>\n<pre><code class=\"language-ocaml\">let parse_int64 (local_ buf) (sp : span) : int64# =\n  let mutable acc : int64# = #0L in\n  let mutable i = 0 in\n  let mutable valid = true in\n  while valid &amp;&amp; i &lt; I16.to_int sp.#len do\n    let c = Bytes.get buf (I16.to_int sp.#off + i) in\n    match c with\n    | '0' .. 
'9' -&gt;\n      acc &lt;- I64.add (I64.mul acc #10L) (I64.of_int (Char.code c - 48));\n      i &lt;- i + 1\n    | _ -&gt; valid &lt;- false\n  done;\n  acc\n</code></pre>\n<p>Whereas in conventional OCaml there might be a minor heap allocation for the\nreference:</p>\n<pre><code class=\"language-ocaml\">let parse_int64 buf sp =\n  let acc = ref 0L in           (* Heap-allocated ref *)\n  let i = ref 0 in              (* Heap-allocated ref *)\n  let valid = ref true in       (* Heap-allocated ref *)\n  while !valid &amp;&amp; !i &lt; sp.len do\n    let c = Bytes.get buf (sp.off + !i) in\n    match c with\n    | '0' .. '9' -&gt;\n      acc := Int64.add (Int64.mul !acc 10L) (Int64.of_int (Char.code c - 48));\n      i := !i + 1\n    | _ -&gt; valid := false\n  done;\n  !acc\n</code></pre>\n<h3 id=\"putting-the-parser-together\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#putting-the-parser-together\"></a>Putting the parser together</h3>\n<p>The toplevel <a href=\"https://github.com/avsm/oxmono/blob/e0b061c0f6621c80e3a990d02867e3302fd7ce16/avsm/httpz/core/httpz.mli#L182\">Httpz.parse function</a> has a pretty simple signature from a user's perspective:</p>\n<pre><code>val parse : bytes -&gt; len:int16# -&gt; limits:limits -&gt;\n  #(Buf_read.status * Req.t * Header.t list) @ local\n</code></pre>\n<p>This function receives a byte buffer and resource limits, and returns an unboxed local tuple of the connection status, the parsed (unboxed) request and a stack-local list of header spans that represent the offsets within the input buffer of what was passed.</p>\n<p>I should probably make the input buffer local too; one nice aspect of OxCaml is how easy it is to incrementally add type and kind annotations and lean on the compiler type inference to help guide where to fix up call sites.</p>\n<h3 id=\"caveats-and-limitations\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#caveats-and-limitations\"></a>Caveats and limitations</h3>\n<p>There are lots and lots of other new 
features in OxCaml which I've started integrating, but require careful planning of layouts.\nFor example, I wanted to use <a href=\"https://oxcaml.org/documentation/unboxed-types/02-or-null/\">or_null</a> to have a non-allocating\nversion of option, but you often end up with long compiler errors about value inference failures, so I ended up just allocating\na local type instead. Something to investigate more in the future as I get familiar with OxCaml.</p>\n<p>I also ran into issues using mutable fields in unboxed records and found this is <a href=\"https://oxcaml.org/documentation/unboxed-types/01-intro/\">documented</a>:</p>\n<blockquote>\n<p>We plan to allow mutating unboxed records within boxed records (the design\nwill differ from boxed record mutability, as unboxed types don’t have the\nsame notion of identity).</p>\n</blockquote>\n<p>It's also difficult right now to strip away the OxCaml extensions and go back\nto normal OCaml syntax. <a href=\"https://tyconmismatch.com/code.html\">Chris Casinghino</a> pointed me to the OxCaml ocamlformat fork which\nhas a <code>--erase-jane-syntax</code>, but it requires some build system work to\nintegrate and seems to lag a little behind the new features (like unboxed small\nliterals). For now, I've decided to just focus on using OxCaml exclusively and\nsee how it goes for a while.</p>\n<p>Finally, the tooling is still a fluid story. 
<a href=\"https://github.com/art-w\">Arthur Wendling</a> and <a href=\"https://jon.recoil.org\">Jon Ludlam</a> are making\nfast progress on getting <a href=\"https://github.com/ocaml/odoc/pull/1399\">odoc working</a> in the\nmainline tool, but it's not quite there today.</p>\n<h3 id=\"claude-skills-for-oxcaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#claude-skills-for-oxcaml\"></a>Claude skills for OxCaml</h3>\n<p>While I built small scale examples to test out the architecture, I leaned heavily\non Claude code to build out the majority of the parser so I could rapidly experiment.\nTo do this, I synthesised a set of <a href=\"https://github.com/avsm/ocaml-claude-marketplace/tree/main/plugins/ocaml-dev/skills/oxcaml\">OxCaml specific Claude skills</a>\nin my <a href=\"/notes/aoah-2025-25\">Claude OCaml marketplace</a> which you can add to your own projects as well. Browsing the skills is a pretty nice way of getting familiar with the different features.</p>\n<p>I generated those skills via a combination of summarising the OxCaml source trees and cribbing from the <a href=\"/notes/icfp25-oxcaml\">ICFP 2025 tutorial</a>, and then getting CC to verify that the example code actually compiled. All automated and very easy to refresh every time a new compiler drops from Jane Street.</p>\n<p><img src=\"/images/claude-oxlocal-1.webp\" alt=\"%c\" title=\"The OxCaml compiler errors are really descriptive in the latest drop, which greatly helps coding agents figure out the new types\" ></p>\n<h2 id=\"performance-results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#performance-results\"></a>Performance Results</h2>\n<p>Ultimately, none of this matters if the  runtime performance isn't there!\nLuckily, the HTTPz parser is incredible in a synthetic benchmark (just passing\nbuffers around) as opposed to a network benchmark, using Core_bench to measure\nperformance. 
What's impressive isn't the straightline throughput, but the\nmassive drop in heap activity which greatly improved the predictability and\ntail latency of the service. And with all the extra typing information, I\nexpect that straightline performance will only increase (and this is before\nI've looked at the <a href=\"https://oxcaml.org/documentation/simd/intro/\">SIMD\nsupport</a>).</p>\n<div role=\"region\"><table>\n<tr>\n<th>Metric</th>\n<th>httpz (OxCaml)</th>\n<th>Traditional Parser</th>\n</tr>\n<tr>\n<td>Small request (35B)</td>\n<td>154 ns</td>\n<td>300+ ns</td>\n</tr>\n<tr>\n<td>Medium request (439B)</td>\n<td>1,150 ns</td>\n<td>2,000+ ns</td>\n</tr>\n<tr>\n<td>Heap allocations</td>\n<td>0</td>\n<td>100-800 words</td>\n</tr>\n<tr>\n<td>Throughput</td>\n<td>6.5M req/sec</td>\n<td>3M req/sec</td>\n</tr>\n</table></div><h2 id=\"putting-my-new-site-live\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#putting-my-new-site-live\"></a>Putting my new site live</h2>\n<p>I then glued this together using Eio into a <a href=\"https://github.com/avsm/oxmono/blob/e0b061c0f6621c80e3a990d02867e3302fd7ce16/avsm/httpz/eio/httpz_eio.mli\">full\nwebserver</a>.\nIt works and serves traffic just fine; in fact, you are reading this web page via it right now!</p>\n<h3 id=\"what-next-caml_alloc_local-for-c-bindings\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-next-caml_alloc_local-for-c-bindings\"></a>What next: caml_alloc_local for C bindings</h3>\n<p>The Eio/OxCaml integration currently does a data copy since Eio uses Bigarray, but I had a catchup coffee\nwith <a href=\"https://roscidus.com\">Thomas Leonard</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> where I agreed to treesmash my local eio into\nswitching entirely to bytes from the io-uring layer up. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> informs me\nthat his compactor doesn't trigger automatically, so any bytes above a 4KB\nthreshold are allocated using mmap and so are fine to pass to the kernel for\nzero copy receive.</p>\n<p>The key OxCaml feature to make this <code>io_uring</code> integration awesome is a new FFI\nfunction that allocates an OCaml value directly into the caller's OxCaml stack\nrather than the heap. This means that we <em>should</em> be able to come up with a scheme\nby which io_uring requests are routed directly to an OCaml continuation that's woken\nup directly with a buffer available to it on the stack. True zero-copy to the kernel\nawaits, which should also help speed up <a href=\"/papers/2025-docker-icfp\">Docker's VPNKit</a> hugely.</p>\n<h3 id=\"making-it-easier-to-develop-in-oxcaml-in-the-open\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#making-it-easier-to-develop-in-oxcaml-in-the-open\"></a>Making it easier to develop in OxCaml in the open</h3>\n<p>Keen readers may note that my OxCaml repo links here go to a new <a href=\"https://github.com/avsm/oxmono\">monorepo</a> I've\nset up for the purpose of hacking on real code in production outside of Jane Street's\nwalls.</p>\n<p>I'll blog more about this next week, but for now I hope you've enjoyed a little\ntaste of what the OxCaml extensions offer in real world code.  Stay tuned also for\neven more performance improvements, and for native TLS with an OxCaml port of\n<a href=\"https://github.com/mirleft/ocaml-tls\">ocaml-tls</a> from <a href=\"https://github.com/hannesm\">Hannes Mehnert</a> soon!</p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. 
<a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Sivaramakrishnan et al (2021). Retrofitting effect handlers onto OCaml. ACM. <a href=\"https://doi.org/10.1145/3453483.3454039\" target=\"_blank\"><i>10.1145/3453483.3454039</i></a></li>\n<li>Madhavapeddy (2025). Arise Bushel, my sixth generation oxidised website. <a href=\"https://doi.org/10.59350/0r62w-c8g63\" target=\"_blank\"><i>10.59350/0r62w-c8g63</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. <a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/oxcaml-httpz",
      "title": "My (very) fast zero-allocation webserver using OxCaml",
      "summary": "Building httpz, a high-performance HTTP/1.1 parser with zero heap allocation using OxCaml's unboxed types, local allocations, and mutable local variables.",
      "date_published": "2026-02-01T00:00:00.000000Z",
      "date_modified": "2026-02-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "oxcaml",
        "ocaml",
        "embedded",
        "systems"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3453483.3454039",
          "doi": "10.1145/3453483.3454039",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0r62w-c8g63",
          "doi": "10.59350/0r62w-c8g63",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/7hy6m-1rq76",
          "doi": "10.59350/7hy6m-1rq76",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2026w5",
      "content_html": "<h2 id=\"deploying-a-zero-allocation-oxcaml-webserver\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#deploying-a-zero-allocation-oxcaml-webserver\"></a>Deploying a zero allocation OxCaml webserver</h2>\n<p>I decided to spend this week on as much focussed hacking as I could, and in particular finished up switching my website to <a href=\"/notes/oxcaml-httpz\">a new webserver in OxCaml</a>. This attracted a lot of attention, so I spent a surprising amount of time answering questions on the socials about it! If you're reading this site, then it also works...</p>\n<p>One funny thing that happened right after deploying it was that I noticed tens of thousands of concurrent connections open. It turns out that <a href=\"https://moltbook.com\">Moltbook</a> had a sub-molt used by agents that track Hacker News somehow, and a bunch of them had decided to mine my website for ... something. The switch to agents dominating the Internet is arriving rapidly!</p>\n<p>The actual development of httpz is happening in a new <a href=\"https://github.com/avsm/oxmono\">OxCaml monorepo</a> I've opened. I've not abandoned my older way of publishing to opam as well, but <a href=\"https://jon.recoil.org\">Jon Ludlam</a> and <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> are refining <a href=\"https://jon.recoil.org/blog/2026/01/weeknotes-2026-04-05.html\">that approach</a>. 
When coding in OxCaml, I need to fork almost every library, so a monorepo is the only way to go!</p>\n<p>On a more human note, it was delightful to see <a href=\"https://jonmsterling.com\">Jon Sterling</a> <a href=\"https://www.jonmsterling.com/2026-W05/\">discuss</a> his <a href=\"https://www.jonmsterling.com/01JR/\">Research Group Manual</a> which codifies many of the reasons why I started a blogging tradition here in my own group.</p>\n<h2 id=\"ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ocaml\"></a>OCaml</h2>\n<p>In the land of OCaml, I proposed <a href=\"https://discuss.ocaml.org/t/proposal-make-the-minimum-tested-opam-2-1-and-higher/17736\">dropping the minimum supported version of opam</a> to general support. The general drag of maintenance of the CI infrastructure in my group is becoming a problem; just <a href=\"https://www.tunbury.org/2026/01/12/opam-25/\">moving to opam 2.5</a> or <a href=\"https://www.tunbury.org/2026/01/16/arm64-workers/\">debugging arm64 issues</a> was a giant amount of work for Mark, so we have to keep on top of deprecating old things somehow. On the other hand, we're also extending some <a href=\"https://www.tunbury.org/2026/01/26/ocurrent-rpc/\">cool uses of Capnp capabilities</a> throughout the infrastructure, which makes CLI usage of all these services easier and easier.</p>\n<p>I also took the opportunity to upload <a href=\"https://watch.ocaml.org/c/ocaml2025/videos\">all of the OCaml workshop talks</a> to &lt;watch.ocaml.org&gt;, so they're available for your browsing pleasure.</p>\n<p>I caught up with <a href=\"https://github.com/https://roscidus.com\">Thomas Leonard</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> to discuss what to do about the number of Eio issues piling up. 
We're all generally happy with how <a href=\"https://patrick.sirref.org/fellowship-roundup/index.xml\">stable and usable</a> it is (I'm using it everywhere), but we'll get together after our <a href=\"https://roscidus.com/blog/blog/2025/11/16/libdrm-ocaml/\">current set</a> of projects to do a collective push to merge our branches together. I'm pretty happy with this model of development: get some experience using it, and then make a bunch of changes after &quot;learning by doing&quot;.</p>\n<h2 id=\"fosdem\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#fosdem\"></a>FOSDEM</h2>\n<p>Congratulations also to <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> for a tremendous showing at FOSDEM, delivering three talks to packed rooms! They're all <a href=\"https://watch.eeg.cl.cam.ac.uk/c/fosdem/videos\">online to watch</a> and I want to particularly highlight how much I enjoyed his package management calculus one. We're working on getting this submitted to a PL conference next!</p>\n<p><div class=\"video-center\"><iframe title=\"Package managers à la carte: A Formal Model of Dependency Resolution\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/905d3833-a890-4ece-8ba2-cf6dbf5e2dcb\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>The live streaming support from FOSDEM was fantastic this year and I got to see everything live while also sitting in a chilly picnic in Cambridge over the weekend.</p>\n<h2 id=\"next-week\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#next-week\"></a>Next week</h2>\n<p><img src=\"/images/running-into-anna.webp\" alt=\"%rc\" title=\"Anna Lapwood in the Mill!\" >\nIt was also delightful to run into Anna Lapwood in the Mill and catch up; first time I've seen her since she left Pembroke to travel the world!</p>\n<p>Some fun links:</p>\n<ul>\n<li><a 
href=\"https://nick.recoil.org/articles/blender-falling-leaves-simulation/\">Simulating falling autumn leaves in Blender</a> is a lovely post by <a href=\"https://nick.recoil.org\">Nick Ludlam</a> that I want to follow through.</li>\n<li><a href=\"https://digitalflapjack.com/weeknotes/performance_and_stl_files/\">Ray Tracer Performance improvements</a> by <a href=\"https://mynameismwd.org\">Michael Dales</a> continues his inevitable journey to OxCaml as he builds his OCaml raytracer.</li>\n<li><a href=\"https://x.com/avsm/status/2016425983843189071\">Tom Blomfield notes how funny it is that Cambridge handwritten exams are now the best way to assess</a>.</li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2026). My (very) fast zero-allocation webserver using OxCaml. <a href=\"https://doi.org/10.59350/9c6bz-kb659\" target=\"_blank\"><i>10.59350/9c6bz-kb659</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w5",
      "title": ".plan-26-05: An OxCaml hacking week",
      "summary": "Deploying an OxCaml zero-allocation webserver, OCaml CI maintenance and opam versioning, and OCaml Workshop and FOSDEM talks",
      "date_published": "2026-02-01T00:00:00.000000Z",
      "date_modified": "2026-02-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "ocaml",
        "oxcaml"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/9c6bz-kb659",
          "doi": "10.59350/9c6bz-kb659",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2026w4",
      "content_html": "<p>Just short weeknotes this week as I've been travelling back to Belfast for various matters and haven't had much computer time.</p>\n<p>One exciting development in the week was that <a href=\"https://shaneweisz.com\">Shane Weisz</a> continued the conversation with the <a href=\"https://www.shaneweisz.com/blog/presenting-ai-for-the-red-list-to-iucn\">IUCN Red list team</a> about his developing dashboard, which went extremely well. There's so much excitement on both sides about how all this is going!</p>\n<h2 id=\"tessera-developments\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera-developments\"></a>TESSERA developments</h2>\n<p>I also spent a chunk of time wrestling with understanding <a href=\"https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html\">Zarr</a> so I can port <a href=\"/notes/geotessera-python\">TESSERA</a> to use this instead of Numpy arrays. There's been <a href=\"https://eeg.zulipchat.com/#narrow/channel/527258-Tessera/topic/zarr.20file.20format/with/571006960\">a long and helpful thread</a> on our Zulip about this with a lot of people chiming in. 
<a href=\"https://mynameismwd.org\">Michael Dales</a> has also been a <a href=\"https://digitalflapjack.com/weeknotes/2025-12-15\">source of coordinate inspiration</a> from his experiences, so I'll put proper thoughts together on this soon.</p>\n<p>On the OxCaml front, <a href=\"https://www.tunbury.org/\">Mark Elvers</a> has knocked up some <a href=\"https://github.com/mtelvers/ocaml-zarr\">OCaml Zarr</a> so I'll be porting those to OxCaml soon and taking the TESSERA support in OCaml for a spin.</p>\n<h3 id=\"tessera-activity-around-the-web\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera-activity-around-the-web\"></a>TESSERA activity around the web</h3>\n<p>There was quite a lot of TESSERA things going on alongside this.</p>\n<p>First a preprint <a href=\"https://arxiv.org/abs/2601.13134\">Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access</a> that's a nice summary of how to use geoembeddings like TESSERA:</p>\n<blockquote>\n<p>Geospatial Foundation Models (GFMs) provide powerful representations, but\nhigh compute costs hinder their widespread use. Pre-computed embedding data\nproducts offer a practical &quot;frozen&quot; alternative, yet they currently exist in\na fragmented ecosystem of incompatible formats and resolutions. This lack of\nstandardization creates an engineering bottleneck that prevents meaningful\nmodel comparison and reproducibility. We formalize this landscape through a\nthree-layer taxonomy: Data, Tools, and Value. We survey existing products to\nidentify interoperability barriers. To bridge this gap, we extend TorchGeo\nwith a unified API that standardizes the loading and querying of diverse\nembedding products. 
By treating embeddings as first-class geospatial\ndatasets, we decouple downstream analysis from model-specific engineering,\nproviding a roadmap for more transparent and accessible Earth observation\nworkflows.</p>\n</blockquote>\n<p>Then <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> did a great podcast on <a href=\"https://www.satellite-image-deep-learning.com/p/tessera-a-temporal-foundation-model\">Satellite Image Deep learning</a> where they go through the journey of how we trained the model. I can't believe it's barely been a year!</p>\n<iframe width=\"560\" height=\"315\" src=\"https://www.youtube-nocookie.com/embed/10CBuGfrz6M?si=FAIWvnfIOPaEkGwn\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>\n<p>TESSERA support also got merged into <a href=\"https://github.com/torchgeo/torchgeo/pull/3243\">Torchgeo</a> for those looking to work on customising the model itself. 
Most users don't have to use this as they can just use our pregenerated embeddings.</p>\n<h2 id=\"a-fond-farewell-to-dra27-from-the-lab\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-fond-farewell-to-dra27-from-the-lab\"></a>A fond farewell to dra27 from the Lab</h2>\n<p>After almost a <a href=\"/papers/2017-oud-platform\">decade</a> of working with <a href=\"https://www.dra27.uk\">David Allsopp</a> in both the University and later <a href=\"/notes/founded-tarides\">Tarides</a>, he finally &quot;graduated&quot; and went off to <a href=\"https://www.dra27.uk/blog/platform/2026/01/19/plotting-a-new-course.html\">join Jane Street</a> where...we will continue to work together on OxCaml and OCaml.</p>\n<p>Good luck to David as he no doubt enjoys the ridiculously nice Jane Street office, where I would overdose on the fresh fruit juice machine and be on a perpetual sugar high!</p>\n<p>Some fun links:</p>\n<ul>\n<li>It was nice to see others getting excited about my <a href=\"https://bsky.app/profile/oppi.li/post/3mcjcygf3r227\">OCaml ATProto client support</a></li>\n<li>And also <a href=\"https://bsky.app/profile/apenwarr.ca/post/3mci727zgxk2s\">WebFinger seems more important</a> so I implemented that too.</li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. <a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li>\n<li>Fang et al (2026). Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2601.13134\" target=\"_blank\"><i>10.48550/arXiv.2601.13134</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w4",
      "title": ".plan-26-04: Travelling and tracking TESSERA activity",
      "summary": "Tracking TESSERA activity including a new preprint and podcast, wrestling with Zarr, and saying farewell to David Allsopp.",
      "date_published": "2026-01-25T00:00:00.000000Z",
      "date_modified": "2026-01-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "belfast",
        "tessera"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/7hy6m-1rq76",
          "doi": "10.59350/7hy6m-1rq76",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2601.13134",
          "doi": "10.48550/arXiv.2601.13134",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2026w3",
      "content_html": "<p>While my colleagues have been busy with start-of-term matters, I've been\nenjoying being able to get some research done while on sabbatical! I got back to\nCambridge from India and settled back to hacking on my research projects.</p>\n<h2 id=\"vultr-and-nvidia\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#vultr-and-nvidia\"></a>Vultr and Nvidia</h2>\n<p>I caught up with the AMD and Vultr team, as they have been sponsoring TESSERA\nwith giving us access to an AMD MI325X 6x node, which has been tremendous help\nin getting TESSERA embeddings going. Before Christmas, our <a href=\"/videos/48a7ab10-3f49-4978-a00f-c26b64c2cae7\">DAWN access</a> expired and we had literally no\nGPUs to continue working on TESSERA, so Vultr stepping in at extremely short\nnotice has been an incredible godsend for our plucky geospatial project.\nThe team from Vultr, especially Kasia Hilborne, have been delightfully nice to\nwork with as well!</p>\n<p>Following up with the extremely random <a href=\"/notes/jensen-huang-hawking\">Jensen Huang chat</a> last year, the nVidia team visited and we also\ndiscussed future collaborations around TESSERA.  
And then I went to the pub\nwith the AMD crew, so this was pretty much full GPU hustling coverage for the\nweek except for the absence of anyone from Intel in my immediate vicinity.</p>\n<h2 id=\"speaking-at-the-ai4nature-launch\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#speaking-at-the-ai4nature-launch\"></a>Speaking at the AI4Nature launch</h2>\n<p>I got invited by <a href=\"https://samreynolds.org\">Sam Reynolds</a> to go speak at the <a href=\"https://ai4nature.org/\">AI4Nature launch</a> down at the <a href=\"https://savoyplace.theiet.org/\">IET in London</a>.</p>\n<blockquote>\n<p>Ai4Nature is a cross-sectoral collaborative initiative to advance responsible\nAI applications in biodiversity data and nature recovery and enhancement,\nbuilding sustainable global influence through collaborative standards\ndevelopment, education, and knowledge sharing.\n<cite>-- <a href=\"https://ai4nature.org\">AI4Nature</a>, 2025</cite></p>\n</blockquote>\n<p><img src=\"/images/ai4nature-2.webp\" alt=\"%c\" title=\"Pretty full house in the IET library just before I spoke!\" ></p>\n<p>I talked about how modern AI initiatives might help, similarly to the <a href=\"/notes/2026w2\">red pill and blue pill talk last week</a> at the <a href=\"/notes/red-pill-conservation\">Conservation Evidence conference</a>. 
The talk was recorded, but the recording just focussed on me and didn't show my slides, so you'll have to <a href=\"https://www.cl.cam.ac.uk/~avsm/slides/ai4nature-jan26\">follow along with my slides</a> by hand if you watch it!</p>\n<p><div class=\"video-center\"><iframe title=\"Moonshot: AI, Evidence, and Human Judgement in Nature Recovery\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/41723993-6f58-4e78-ba01-260295b7d1f1\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>You can read <a href=\"https://ai4nature.org/the-alliance-journal/usot739onqmh3n43tgt3r8b9m6j3ye\">a full roundup on the AI4Nature journal</a> along with all the other talks. There was an interesting mix of urban developers, remote sensing experts and conservation ecologists there, all wondering how to strike the delicate balance between nature and human needs. I sent several people there our just-out <a href=\"/notes/life-uses-paper\">LIFE recipes</a> <a href=\"/papers/2025-life-uses\">paper</a> and others the <a href=\"/notes/exploring-food-impacts\">FOOD biodiversity explorer</a>.\nThanks to Sam for the invite to a great event!</p>\n<p><img src=\"/images/ai4nature-1.webp\" alt=\"%c\" ></p>\n<h2 id=\"picking-up-tomass-new-pl-book\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#picking-up-tomass-new-pl-book\"></a>Picking up Tomas's new PL book</h2>\n<p><a href=\"https://dorchard.github.io\">Dominic Orchard</a> and I went along to the CUP bookshop to pick up our PhD buddy Tomas Petricek's latest book on &quot;<a href=\"https://www.cambridge.org/core/books/cultures-of-programming/075A2D1DE611EE47807A683147B21691\">Cultures of Programming</a>&quot;. 
Although it was closed on first visit, I nipped along later and found the last copy, which I'm now reading with great enjoyment.</p>\n<p><img src=\"/images/cup-visit-1.webp\" alt=\"%c\" title=\"Dominic and I go shopping, to find a closed bookshop, woops\" ></p>\n<p>Fun links:</p>\n<ul>\n<li>Claude Cowork got released, and it uses a <a href=\"https://bsky.app/profile/anil.recoil.org/post/3mcc5kxbmf22t\">much worse VM than Docker for Desktop</a> so I sent them our <a href=\"/papers/2025-docker-icfp\">ICFP paper on Docker</a> in case they wanted to vibe up a better base image.</li>\n<li>I chatted a bit with <a href=\"https://mynameismwd.org\">Michael Dales</a> about his struggles with <a href=\"https://digitalflapjack.com/weeknotes/gdal-and-filestar-vs-macos/\">buffered file IO on macOS</a> and spelunked <a href=\"https://github.com/OSGeo/gdal/issues/13672#issuecomment-3731879519\">deep into macOS libc</a> to discover just how bad the situation is.</li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2026). Discussing effective conservation with all the UK Chief Scientists. <a href=\"https://doi.org/10.59350/qjrmv-38130\" target=\"_blank\"><i>10.59350/qjrmv-38130</i></a></li>\n<li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Eyres et al (2026). Informing conservation problems and actions using an indicator of extinction risk: A detailed assessment of applying the LIFE metric. <a href=\"https://doi.org/10.1016/j.biocon.2025.111663\" target=\"_blank\"><i>10.1016/j.biocon.2025.111663</i></a></li>\n<li>Madhavapeddy (2026). Five ways to use the LIFE metric for conservation decision-making. <a href=\"https://doi.org/10.59350/hjg1b-seq03\" target=\"_blank\"><i>10.59350/hjg1b-seq03</i></a></li>\n<li>Madhavapeddy (2025). Jensen Huang receives the Hawking Fellowship at Cambridge. 
<a href=\"https://doi.org/10.59350/c7zd2-6912\" target=\"_blank\"><i>10.59350/c7zd2-6912</i></a></li>\n<li>Madhavapeddy (2025). Exploring the biodiversity impacts of what we choose to eat. <a href=\"https://doi.org/10.59350/xj427-y3q48\" target=\"_blank\"><i>10.59350/xj427-y3q48</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w3",
      "title": ".plan-26-03: TESSERA scaling and speaking at AI4Nature's launch",
      "summary": "Scaling TESSERA embeddings on Vultr AMD GPUs, speaking at the AI4Nature launch in London, and picking up Tomas Petricek's new PL book.",
      "date_published": "2026-01-18T00:00:00.000000Z",
      "date_modified": "2026-01-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "evidence",
        "iucn",
        "london",
        "vultr"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qjrmv-38130",
          "doi": "10.59350/qjrmv-38130",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1016/j.biocon.2025.111663",
          "doi": "10.1016/j.biocon.2025.111663",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hjg1b-seq03",
          "doi": "10.59350/hjg1b-seq03",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/c7zd2-6912",
          "doi": "10.59350/c7zd2-6912",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/xj427-y3q48",
          "doi": "10.59350/xj427-y3q48",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/hjg1b-seq03",
      "content_html": "<p>I was at the launch of <a href=\"https://ai4nature.org\">AI4Nature</a> a few days ago, and met a lot of\npeople looking for <em>practical</em> advice on integrating remote sensing into their biodiversity\ndecision making. So it's good timing that our <a href=\"/papers/2025-life-uses\">latest paper</a> just came out,\nlead by the inestimable <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> to give some &quot;recipes&quot; on how to use our global <a href=\"/papers/2024-life\">LIFE metric</a>! <a href=\"https://anil.recoil.org/papers/2025-life-uses.pdf\">Read it here</a>.</p>\n<p>As a reminder, the LIFE metric is one we published <a href=\"/notes/exploring-food-impacts\">last year</a> to calculate\nthe extinction costs of different landuse actions (either conversion or restoration) and\nto allow these costs to be compared in a &quot;common currency&quot; anywhere in the world.</p>\n<p>To achieve this, the metric tries to be representative of geographic, taxonomic, and <a href=\"https://doi.org/10.1111/j.1523-1739.2010.01605.x\">habitat diversity</a>, allows disaggregation into scores for groups of species, and be interpretable on a ratio scale (i.e. a two-fold difference in LIFE scores corresponds to a two-fold difference in estimated extinction effects). Finally, in order to enable real-world impact, the LIFE maps must be <a href=\"https://doi.org/10.5281/zenodo.14945383\">accessible, actionable</a> and <em>usable</em>. The latter point is what our latest paper covers, by providing some useful recipes for the a decisionmaker to follow! 
Co-author <a href=\"https://mynameismwd.org\">Michael Dales</a> also jotted down <a href=\"https://digitalflapjack.com/weeknotes/aoh_2.0/\">his notes</a> on the paper as well.</p>\n<p><a href=\"https://anil.recoil.org/papers/2025-life-uses.pdf\"> <img src=\"/images/2025-life-uses-ss.webp\" alt=\"%c\" title=\"The LIFE metric paper showing five case studies\" > </a></p>\n<h2 id=\"the-five-case-studies\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-five-case-studies\"></a>The five case studies</h2>\n<p>The paper walks through how to apply LIFE across different\nconservation and development contexts, ranging from the local to the global. Here are the highlights from each:</p>\n<h3 id=\"near-real-time-biodiversity-harm-in-tropical-hotspots\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#near-real-time-biodiversity-harm-in-tropical-hotspots\"></a>Near real-time biodiversity harm in tropical hotspots</h3>\n<p>This case study integrates LIFE with the <a href=\"https://glad.earthengine.app/view/global-forest-change\">Global Forest Change</a> forest loss data to quantify biodiversity harms as they happen.</p>\n<blockquote>\n<p>[...] our analyses demonstrate that forest loss has a greater per km2 impact on extinction in some countries than others. The impact in terms of extinction risk arising in Peru, Indonesia and Papua New Guinea are disproportionately large compared to the extent of forest loss in these regions.</p>\n<p>[...] LIFE identifies areas of high conservation concern that would not necessarily be detected through species richness alone, where the loss of a widespread species is weighted equally to that of a narrowly endemic or threatened species.</p>\n</blockquote>\n<p>This enables monitoring of extinction risk impacts from deforestation as it happens, and helps focus on biodiversity hotspots separately from forest carbon. 
While other metrics like the <a href=\"https://doi.org/10.3390/su11071841\">countryside species-area relationship</a> could also do this, they're not as readily available as LIFE since we've done all the <a href=\"/papers/2025-yirgacheffe\">large-scale computing needed</a> and provided <a href=\"https://zenodo.org/records/14188450\">precomputed maps</a> for this use case.</p>\n<p><img src=\"/images/life-uses-1.webp\" alt=\"%c\" title=\"Monthly extinction impact shows the contribution to species extinction risk attributable specifically to deforestation occurring in that month. These impacts are ongoing unless the forest is subsequently restored.\" ></p>\n<h3 id=\"comparing-uk-apples-with-other-apples\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#comparing-uk-apples-with-other-apples\"></a>Comparing UK apples with... other apples</h3>\n<p>The biggest driver of land-use change is the <a href=\"/notes/food-and-risk-to-life\">global food system</a>, and so we used LIFE to assess the extinction risk impacts of apple consumption in the UK (2019-2023) and how these varied depending on <em>which country</em> the apples were sourced from. This builds on our <a href=\"/notes/exploring-food-impacts\">earlier work</a> showing food impacts can vary by <a href=\"/papers/2024-food-life\">three orders of magnitude</a> depending on the source.</p>\n<p><img src=\"/images/life-uses-2.webp\" alt=\"%c\" title=\"The total extinction cost of UK apple consumption (ΔE), shown by country. Domestic production in red, imported apples in blue.\" ></p>\n<p>The UK apple example shows how even domestically produced foods have hidden biodiversity footprints depending on sourcing. More generally, it could be used to evaluate the consequences of <a href=\"/notes/life-official-statistic\">policies</a> promoting lower-yield domestic agriculture, and to track how food-related biodiversity impacts change over time due to post-Brexit trade shifts or random trade tariffs. 
You can see more on this in our <a href=\"https://quantifyearth.github.io/food-globe/\">interactive food explorer</a>.</p>\n<h3 id=\"biodiversity-compensation-in-sumatra\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#biodiversity-compensation-in-sumatra\"></a>Biodiversity compensation in Sumatra</h3>\n<p>This hypothetical scenario covers a company that has converted a forest to agricultural land for coffee production. The company wants to use the LIFE metric to assess its impact and then select the most suitable <em>restoration</em> site to compensate for biodiversity loss by restoring biodiversity to a &quot;pre-impact&quot; baseline. This could then be used to calculate <a href=\"/notes/carbon-credits-vs-offsets\">financial contributions</a> towards a no-net-loss biodiversity scenario for that (hopefully essential) development.</p>\n<p>The pixel-matching mechanism that calculates our baselines draws on pixels that are currently agricultural land but were historically forested. 
This is similar to what we did for our <a href=\"/papers/2023-pact-tmf\">carbon credit tropical moist forest</a> methodology, except that we also account for restoration being highly uncertain by adding an ex-ante multiplying factor.</p>\n<p>The key differentiator for using LIFE here, <em>vs</em> other metrics, is that LIFE can be <em>disaggregated</em> to track the fate of individual species, making it well suited to capture local biodiversity values and transparently assess net losses and gains based on local knowledge.</p>\n<h3 id=\"prioritising-conservation-investments-in-honduras\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#prioritising-conservation-investments-in-honduras\"></a>Prioritising conservation investments in Honduras</h3>\n<p>This fourth scenario uses LIFE to inform prioritisation of site-based conservation actions within the <a href=\"https://www.worldlandtrust.org\">World Land Trust</a>'s portfolio of interventions:</p>\n<blockquote>\n<p>World Land Trust (WLT) is an international conservation charity that protects\nthe world’s most\nbiologically significant and threatened habitats.</p>\n<p>Working through a network of partner organisations around the world, WLT\nfunds the creation of reserves and provides permanent protection for habitats\nand wildlife. Partnerships are developed with established and highly\nrespected local organisations who engage support and commitment among the\nlocal community.</p>\n<p><cite>-- <a href=\"https://www.worldlandtrust.org/who-we-are-2/\">Who We Are</a>, World Land Trust, 2025</cite></p>\n</blockquote>\n<p>For each project, we estimated extinctions that could be averted\nunder an extreme counterfactual in which all habitat is assumed to be\nconverted to agriculture. We multiplied each pixel-level\nLIFE-convert value by its area and summed all pixels within the\nproject. 
We then explored the additional species-level insights by\ndisaggregating the toplevel LIFE metric.</p>\n<p><img src=\"/images/life-uses-3.webp\" alt=\"%c\" title=\"Potential total extinctions assuming 100 % conversion of natural habitats in existing WLT projects to arable land. Dashed lines show four potential projects in Honduras\" ></p>\n<p>LIFE doesn't aim to replace all other metrics here, but provides an excellent baseline that can be disaggregated:</p>\n<blockquote>\n<p>Despite inherent uncertainties, species-level information is valuable for\ncomparing sites. It enables practitioners, policy makers and funders to\nbetter understand why a metric identifies a site as important and provides\npersuasive evidence for conservation investment by highlighting key\nthreatened species.</p>\n</blockquote>\n<h3 id=\"evaluating-long-term-conservation-effectiveness-in-sierra-leone\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#evaluating-long-term-conservation-effectiveness-in-sierra-leone\"></a>Evaluating long-term conservation effectiveness in Sierra Leone</h3>\n<p>And last but not least, my favourite rainforest in Africa is the <a href=\"https://golarainforest.org\">Gola rainforest</a>.\nWe combined LIFE with <a href=\"/papers/2025-redd-evals\">counterfactual methods</a> to evaluate a real conservation project there that the RSPB has been <a href=\"https://www.rspb.org.uk/helping-nature/what-we-do/protecting-species-and-habitats/international/gola-rainforest\">running for decades</a>.\nWe try to answer the crucial question of whether the intervention actually had a net-positive biodiversity impact via an ex-post evaluation of what's happened in the past decade.</p>\n<p><img src=\"/images/life-uses-4.webp\" alt=\"%c\" title=\"The extinction impact/km2 converted from natural habitat to arable land, with impact estimated as the product of additionality and the LIFE score\" ></p>\n<p>The use of LIFE lets us go beyond broad species richness 
metrics:</p>\n<blockquote>\n<p>The Gola region hosts several narrow-ranged, highly threatened species (e.g.\nDiana Monkey and the Pygmy hippopotamus), which would not be highlighted by\nanalyses focused solely on species richness, including those based on PDF or\ncSAR (without rarity weightings)</p>\n</blockquote>\n<h2 id=\"caveats-and-responsible-use\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#caveats-and-responsible-use\"></a>Caveats and responsible use</h2>\n<p>We're hopefully clear about LIFE's limitations in the paper as well:</p>\n<blockquote>\n<p>Like all global metrics, LIFE's broad applicability relies on assumptions and\nsimplifications. It should be used cautiously, and alongside local knowledge\nand ground-truthing, especially for restoration, offsetting, or fine-scale\nanalysis, and in poorly studied areas.</p>\n</blockquote>\n<p>The computational infrastructure behind LIFE, led by <a href=\"https://mynameismwd.org\">Michael Dales</a>, is available via our\n<a href=\"https://github.com/quantifyearth/\">quantifyearth</a> GitHub organisation, with the\nheavy lifting done by the <a href=\"https://github.com/quantifyearth/yirgacheffe\">Yirgacheffe</a>\nlibrary for processing large-scale raster data. That's what Michael presented back\nat <a href=\"/notes/icfp25-propl\">PROPL earlier this year</a>.</p>\n<p>With LIFE now demonstrated across these five use cases, we're really excited to see how\nothers apply it to their own conservation challenges. The combination of LIFE\nwith <a href=\"/projects/plancomp\">planetary computing</a> infrastructure means we can provide\ndecisionmakers with extinction risk information at scales and\nspeeds that were not possible before. But of course, this requires turning metrics into\non-the-ground action, so please do reach out if we can help with that!</p><h1>References</h1><ul><li>Eyres et al (2026). 
Informing conservation problems and actions using an indicator of extinction risk: A detailed assessment of applying the LIFE metric. <a href=\"https://doi.org/10.1016/j.biocon.2025.111663\" target=\"_blank\"><i>10.1016/j.biocon.2025.111663</i></a></li>\n<li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li>\n<li>Dales et al (2025). Yirgacheffe: A Declarative Approach to Geospatial Data. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763806\" target=\"_blank\"><i>10.1145/3759536.3763806</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Madhavapeddy (2025). LIFE becomes an Official Statistic of the UK government. <a href=\"https://doi.org/10.59350/xb1fz-c5v35\" target=\"_blank\"><i>10.59350/xb1fz-c5v35</i></a></li>\n<li>Swinfield et al (2025). Learning lessons from over-crediting to ensure additionality in forest carbon credits. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-29fk2\" target=\"_blank\"><i>10.33774/coe-2025-29fk2</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Madhavapeddy (2025). Disentangling carbon credits and offsets with contributions. <a href=\"https://doi.org/10.59350/g4ch1-64343\" target=\"_blank\"><i>10.59350/g4ch1-64343</i></a></li>\n<li>Madhavapeddy (2025). 
Exploring the biodiversity impacts of what we choose to eat. <a href=\"https://doi.org/10.59350/xj427-y3q48\" target=\"_blank\"><i>10.59350/xj427-y3q48</i></a></li>\n<li>Jones et al (2011). The Why, What, and How of Global Biodiversity Indicators Beyond the 2010 Target. Conservation Biology. <a href=\"https://doi.org/10.1111/j.1523-1739.2010.01605.x\" target=\"_blank\"><i>10.1111/j.1523-1739.2010.01605.x</i></a></li>\n<li>Eyres et al (2024). LIFE: A metric for mapping the impact of land-cover change on global extinctions. Zenodo. <a href=\"https://doi.org/10.5281/zenodo.14945383\" target=\"_blank\"><i>10.5281/zenodo.14945383</i></a></li>\n<li>Maier et al (2019). Conceptual Framework for Biodiversity Assessments in Global Value Chains. Sustainability. <a href=\"https://doi.org/10.3390/su11071841\" target=\"_blank\"><i>10.3390/su11071841</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/life-uses-paper",
      "title": "Five ways to use the LIFE metric for conservation decision-making",
      "summary": "Our new paper in Biological Conservation demonstrates how the LIFE extinction risk metric can be applied across five diverse case studies, from real-time tropical deforestation monitoring to evaluating conservation project effectiveness.",
      "date_published": "2026-01-12T00:00:00.000000Z",
      "date_modified": "2026-01-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "life",
        "biodiversity",
        "conservation",
        "sensing",
        "food",
        "policy"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-life-uses.pdf",
          "mime_type": "application/pdf",
          "title": "Informing conservation problems and actions using an indicator of extinction risk: A detailed assessment of applying the LIFE metric"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1016/j.biocon.2025.111663",
          "doi": "10.1016/j.biocon.2025.111663",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763806",
          "doi": "10.1145/3759536.3763806",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/xb1fz-c5v35",
          "doi": "10.59350/xb1fz-c5v35",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-29fk2",
          "doi": "10.33774/coe-2025-29fk2",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/g4ch1-64343",
          "doi": "10.59350/g4ch1-64343",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/xj427-y3q48",
          "doi": "10.59350/xj427-y3q48",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1111/j.1523-1739.2010.01605.x",
          "doi": "10.1111/j.1523-1739.2010.01605.x",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.5281/zenodo.14945383",
          "doi": "10.5281/zenodo.14945383",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.3390/su11071841",
          "doi": "10.3390/su11071841",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2026w2",
      "content_html": "<p>I wrapped up the India family trip by successfully getting everyone back to Ireland in one piece, and then with just a few hours to spare was back in Cambridge to help host the <a href=\"/notes/red-pill-conservation\">big Conservation Evidence conference</a> at Pembroke!</p>\n<h2 id=\"conservation-evidence-conference\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#conservation-evidence-conference\"></a>Conservation Evidence conference</h2>\n<p>Luckily, I had pretty much no actual work to do beyond showing up and smiling, as the catering staff at Pembroke are outstanding, as are the Conservation Evidence team.</p>\n<p><img src=\"/images/pemb-aud-1.webp\" alt=\"%c\" title=\"The event was buzzing despite the chill\" ></p>\n<p>You can see my talk here or read the <a href=\"/notes/red-pill-conservation\">roundup</a>:</p>\n<p><div class=\"video-center\"><iframe title=\"The red pill or the blue pill for AI and conservation\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/5b58f41f-8d71-48ba-92d1-b34a49ee76ef\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<h2 id=\"five-uses-for-the-life-metric\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#five-uses-for-the-life-metric\"></a>Five uses for the LIFE metric</h2>\n<p>I also wrote up some notes on our new paper on <a href=\"/notes/life-uses-paper\">five ways to use the LIFE metric</a> for conservation decision-making, which just came out in Biological Conservation. 
Well done <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a>!</p>\n<h2 id=\"the-fp-launchpad-takes-off\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-fp-launchpad-takes-off\"></a>The FP Launchpad takes off</h2>\n<p>I've been working with <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> for a while on helping him craft the <a href=\"https://fplaunchpad.org\">FP Launchpad</a> and he announced it to the public!</p>\n<blockquote>\n<p>The Functional Programming (FP) Launchpad at IIT Madras aims to build\nresearch and educational capacity for crafting efficient, reliable and\ntrustworthy software with mathematical guardrails. As AI coding agents make\nsoftware increasingly accessible and enable its production at unprecedented\nscale, they also introduce far-reaching consequences for civic society. In\nthis context, the centre is founded on the belief that discovering future\nsoftware engineering best practices requires sustained feedback loops between\nresearch and real-world systems, as well as educational structures that\ncultivate a new generation of maintainers for foundational software.\n<cite>-- <a href=\"https://fplaunchpad.org\">FP Launchpad manifesto</a>, 2026</cite></p>\n</blockquote>\n<p>All three of the <a href=\"https://fplaunchpad.org/charter/#illustrative-projects\">illustrative projects</a> are of huge interest to me:</p>\n<blockquote>\n<ul>\n<li>An open-source, verifiable voting system for general elections, built on top of Shakti RISC-V processor, MirageOS unikernels and O(x)Caml.</li>\n<li>A programmable public infrastructure for environmental planning, combining\nTESSERA’s satellite-derived representations with CoRE Stack data and\ncompositional functional models in O(x)Caml to support auditable indicators\nand scenario analysis for India’s water and habitat systems.</li>\n<li>A formally verified runtime system for O(x)Caml in O(x)Caml guaranteeing memory\nsafety and functional correctness properties. 
The focus will also include\nformal verification of tools for debugging and observability, such as\neBPF-based tracing, continuous runtime contract checking, and deterministic\nrecord and replay.</li>\n</ul>\n</blockquote>\n<p>The official launch of the centre is in early April, so I'll report back in a few months when I head over to Chennai!</p>\n<p>Fun links:</p>\n<ul>\n<li>I had an email exchange with <a href=\"https://asbradbury.org\">Alex Bradbury</a> about his excellent analysis of <a href=\"https://muxup.com/2026q1/per-query-energy-consumption-of-llms\">Per-query energy consumption of LLMs</a> which is well worth a read.</li>\n<li>I got hilariously <a href=\"https://www.bbc.co.uk/news/articles/cd0ynenr1eno\">anonymously quoted</a> in the BBC as a result of my <a href=\"/notes/path-to-uk-india-ai-summit\">dinner conversation</a> with Zoe Kleinman: <em>&quot;A month later, I had dinner with a university professor who told me he had a GPU – a powerful computer processor used to drive AI – under his desk. And as it churned away, it was also keeping his office warm.&quot;</em></li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2026). Discussing effective conservation with all the UK Chief Scientists. <a href=\"https://doi.org/10.59350/qjrmv-38130\" target=\"_blank\"><i>10.59350/qjrmv-38130</i></a></li>\n<li>Madhavapeddy (2025). On the path to the UK/India AI Summit with OpenUK and the ATI. <a href=\"https://doi.org/10.59350/x6rea-1g262\" target=\"_blank\"><i>10.59350/x6rea-1g262</i></a></li>\n<li>Madhavapeddy (2026). Five ways to use the LIFE metric for conservation decision-making. <a href=\"https://doi.org/10.59350/hjg1b-seq03\" target=\"_blank\"><i>10.59350/hjg1b-seq03</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w2",
      "title": ".plan-26-02: Back from India and straight into the conservation conference",
      "summary": "Hosting the Conservation Evidence conference at Pembroke, recovering from the India trip, and keeping up with LLM developments.",
      "date_published": "2026-01-11T00:00:00.000000Z",
      "date_modified": "2026-01-11T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "india",
        "conservation",
        "llms"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qjrmv-38130",
          "doi": "10.59350/qjrmv-38130",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/x6rea-1g262",
          "doi": "10.59350/x6rea-1g262",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hjg1b-seq03",
          "doi": "10.59350/hjg1b-seq03",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/ocaml-claude-dev",
      "content_html": "<p>I got a few questions about the dev setup I used for my <a href=\"/notes/aoah-2025\">AoAH</a> sprint last month.\nI've cleaned this up and published\n<strong><a href=\"https://github.com/avsm/claude-ocaml-devcontainer\">claude-ocaml-devcontainer</a></strong>,\na <a href=\"https://devcontainers.io\">Devcontainer</a> of everything you need to do OCaml\nor OxCaml development in Claude Code in a sandboxed Docker container. This means you\ncan (reasonably safely) run it in unattended mode with permissions bypass enabled.</p>\n<h2 id=\"using-the-ocamlclaude-devcontainers\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#using-the-ocamlclaude-devcontainers\"></a>Using the OCaml/Claude devcontainers</h2>\n<p>A devcontainer can either be used in an editor that supports it like VSCode, or directly from the CLI (which is what I do for Claude Code). Adding it to your project is as simple as:</p>\n<pre><code>$ mkdir .devcontainer\n$ cd .devcontainer\n$ curl -OL https://raw.githubusercontent.com/avsm/claude-ocaml-devcontainer/refs/heads/main/.devcontainer/devcontainer.json\n</code></pre>\n<p>Edit the JSON file to add any other post-installation or extensions that you might need for that project. It's intended to be customisable.</p>\n<p>To spin the devcontainer up, I use the npx CLI:</p>\n<pre><code>$ npx @devcontainers/cli up   --workspace-folder .\n$ npx @devcontainers/cli exec --workspace-folder . bash -l\n</code></pre>\n<p>This will mount your current project into <code>/workspace</code> in the dev container. 
The set of network domains that can be accessed is limited to a <a href=\"https://github.com/avsm/claude-ocaml-devcontainer/blob/main/.devcontainer/init-firewall.sh#L68-L83\">select few</a>, but I'll parameterise this into the project metadata in the future.</p>\n<p>When you're in the workspace container, you have two preinstalled OCaml switches:</p>\n<pre><code class=\"language-bash\">$ opam switch\n#  switch    compiler                 description\n   5.2.0+ox  ocaml-variants.5.2.0+ox  5.2.0+ox\n→  default   ocaml.5.4.0              default\n</code></pre>\n<p>And <code>claude</code> maps its config from your home directory, so you can start up sessions once you've authenticated as normal, and everything runs in the container reasonably sandboxed.</p>\n<h2 id=\"customising-the-devcontainers\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#customising-the-devcontainers\"></a>Customising the devcontainers</h2>\n<p>Since compiling up O[x]Caml can take a while, it's a prebuilt image, but you can also clone the <a href=\"https://github.com/avsm/claude-ocaml-devcontainer\">repository</a> and customise the Dockerfile in <code>.devcontainer</code> to your heart's content.\nThe default disk space on a GitHub Actions runner wasn't sufficient for the build, but it turns out that the default <code>ubuntu-latest</code> image has a ton of pre-installed packages <a href=\"https://carlosbecker.com/posts/github-actions-disk-space/\">that you can just delete to double your disk space</a>.</p>\n<p>One pretty cool thing about the action is that it doesn't use qemu to build the\nmultiarch images. Instead, the\n<a href=\"https://github.com/avsm/claude-ocaml-devcontainer/blob/main/.github/workflows/multi-build.yaml\">multi-build.yaml</a>\nworkflow dispatches separate builds to the native arm64 and amd64 hosts, and then\ncombines them. 
This is faster and more reliable than the conventional\npath of going through CPU emulation for the non-native host.</p>\n<p>I also modify my own devcontainer to mount some limited SSH keys and a\n<code>.gitconfig</code> so I can commit from within the container, which allows for more\nunattended feedback loops.</p>",
      "url": "https://anil.recoil.org/notes/ocaml-claude-dev",
      "title": "Devcontainer for using O(x)Caml and Claude in your projects",
      "summary": "A prebuilt Docker devcontainer for sandboxed OCaml and OxCaml development with Claude Code, including multiarch builds and network isolation.",
      "date_published": "2026-01-08T00:00:00.000000Z",
      "date_modified": "2026-01-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "oxcaml",
        "docker"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2026w1",
      "content_html": "<p>After an amazingly fun trip to Udaipur and lots of family chitchat, we got over to Vijayawada and then Hyderabad, while visiting family temples and villages en route.</p>\n<p><img src=\"/images/anil-india-1.webp\" alt=\"%c\" title=\"The mosquitos got me good\" >\n<img src=\"/images/anil-india-2.webp\" alt=\"%c\" title=\"But it was worth it to find the Madhavapeddi family temple!\" ></p>\n<p>I wrote my reading list in my <a href=\"/notes/hny2026\">happy new year 2026 post</a>. And just to hit the ground running, <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> and I got the TESSERA embeddings for 2025 kicked off on the second day of the new year, since all the Sentinel-1/2 data was ready for us to generate the annualised embeddings.</p>\n<p>Looking forward to the new year. If 2025 is anything to go by, 2026 is going to be a wild ride with AI developments...</p><h1>References</h1><ul><li>Madhavapeddy (2026). Happy new year and my fave readings of the year. <a href=\"https://doi.org/10.59350/y9f0e-raa45\" target=\"_blank\"><i>10.59350/y9f0e-raa45</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2026w1",
      "title": ".plan-26-01: Easing into the new year with reading, temples and mosquitos",
      "summary": "Family travels through Udaipur, Vijayawada and Hyderabad, visiting temples and villages, plus kicking off the TESSERA 2025 embeddings.",
      "date_published": "2026-01-04T00:00:00.000000Z",
      "date_modified": "2026-01-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "india"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/y9f0e-raa45",
          "doi": "10.59350/y9f0e-raa45",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/y9f0e-raa45",
      "content_html": "<p>Happy new 2026 everyone! I've had a wonderful journey through Udaipur and the lake palaces, and\nam currently in Vijayawada exploring my <a href=\"https://www.onefivenine.com/india/villages/Guntur/Ponnur/Brahmanakoduru\">ancestral village</a>\nfor the next few days. I've caught up on some reading, so here are my book thoughts (and podcasts for the first time)\nfrom the past year!</p>\n<h2 id=\"inspirational-moral-ambition\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#inspirational-moral-ambition\"></a>Inspirational: Moral ambition</h2>\n<p>I thoroughly enjoyed listening to the <a href=\"https://www.bbc.co.uk/sounds/play/m002n7rf\">BBC Reith\nlectures</a> this year by <a href=\"http://rutgerbregman.com\">Rutger\nBregman</a>, after hearing about them from <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a>. I picked up a copy of Bregman's new book\n<a href=\"http://rutgerbregman.com/books/moral-ambition\">Moral Ambition</a> to read in my\ntravels; it was a light read and I got through it all on the flight to India.</p>\n<p><a href=\"http://rutgerbregman.com/books/moral-ambition\"> <img src=\"/images/2026-books-ma.webp\" alt=\"%lc\" > </a></p>\n<p>I enquired online if anyone else had read it, and <a href=\"https://www.lbj.org.uk\">Laura James</a> nailed it\nwith her assessment that it <em>&quot;<a href=\"https://amok.recoil.org/@Laura@social.coop/115781160880218125\">looks like a classic airport book for business folk having a career crisis</a>&quot;</em>. Well, working in the climate and biodiversity space feels like a <a href=\"/notes/nas-rs-biodiversity\">constant crisis</a>, so I dove straight into it and ended up bounding off the plane with a spring in my step!</p>\n<p>Bregman has written this book for a rather narrow audience. 
The key idea he pushes is that ambitious and idealistic young people shouldn't waste their lives on &quot;<a href=\"https://en.wikipedia.org/wiki/Bullshit_Jobs\">bullshit jobs</a>&quot; (a term coined by the late David Graeber, author of the fantastic <a href=\"https://en.wikipedia.org/wiki/Debt:_The_First_5,000_Years\">history of debt</a>). He argues that it's possible to be <em>both</em> ambitious and moral in our careers:</p>\n<blockquote>\n<p>Sometimes it seems 'ambition' has become a dirty word, incompatible with an\nidealistic lifestyle. Many people are more preoccupied with the kind of work\nthey do than the impact their work has [...] or 'think global, act local' as\nif achieving little is somehow a virtue.</p>\n<p><cite>-- <a href=\"http://rutgerbregman.com/books/moral-ambition\">Moral Ambition</a>, Ch2, Rutger Bregman</cite></p>\n</blockquote>\n<p>The book follows a style fairly typical of this class, whereby a number of anecdotes about morally ambitious characters from history (such as <a href=\"https://en.wikipedia.org/wiki/Ralph_Nader\">Ralph Nader</a> before he became a perennial presidential candidate) are recounted.  I particularly liked the breakdown of 'illusions' that cause inaction, such as the illusion of awareness that we've run across in our <a href=\"/notes/food-and-risk-to-life\">research on food consumption impacts</a> :</p>\n<blockquote>\n<p>Psychologists speak of the &quot;belief-behaviour&quot; gap. 
Take people who think it's\nawful how animals are treated but still eat meat; progressives who think\nplanes are too polluting but fly all the same; church-goers who scarcely give\nto charity, despite the scriptures' call to tithe.</p>\n<p><cite>-- <a href=\"http://rutgerbregman.com/books/moral-ambition\">Moral Ambition</a>, Ch4, Rutger Bregman</cite></p>\n</blockquote>\n<p>And then the illusion of good intentions that we've seen repeatedly in our work on <a href=\"/projects/ce\">conservation evidence</a> and in <a href=\"https://doi.org/10.1126/science.adl6547\">climate policy</a>. There is a huge lack of evidence driven decision making, possibly due to the extremely difficult job of robustly <a href=\"/papers/2025-redd-evals\">evaluating counterfactual</a> decision making:</p>\n<blockquote>\n<p>A global analysis of 1,500 climate policies put in place between 1998 and\n2022 found that only 63 of them – a mere 4 per cent – led to a significant\ndrop in emissions.</p>\n<p><cite>-- <a href=\"http://rutgerbregman.com/books/moral-ambition\">Moral Ambition</a>, Ch4, Rutger Bregman</cite></p>\n</blockquote>\n<p>And something I've run across all the time in the open source community is the 'illusion of purity', whereby <em>&quot;activists continued to write each other off for all manner of minor missteps and mistakes&quot;</em> (ch4).</p>\n<p>The most disappointing chapter of the book was where he covered the Effective\nAltruism movement in a strangely wavering tone.  He tries to\neven-handedly cover the <a href=\"https://www.theguardian.com/business/2024/mar/23/sam-bankman-fried-rise-and-fall-details\">very strange</a>\ncommunity that built up around EA, and also criticise the pooling of wealth, but didn't really stimulate any new insights or\ncritical thought in me. He does take a pragmatic stance on the status quo, which is understandable given the emphasis on action over awareness. 
I can't say that I disagree here either, given the number of absolutely amazing philanthropists I've been lucky to work with on my own research:</p>\n<blockquote>\n<p>The truth is that money and moral ambition need each other. Philanthropy\ndoesn't have to get stuck in vanity and paternalism. It can lead to real\nsystemic change, as long as you prioritise wisely and keep an eye out for\ndamaging side effects. What's more, private individuals are in a unique\nposition to support unpopular causes – when government and business steer\nclear.</p>\n<p><cite>-- <a href=\"http://rutgerbregman.com/books/moral-ambition\">Moral Ambition</a>, Ch8, Rutger Bregman</cite></p>\n</blockquote>\n<p>Pushing on beyond that, though, I greatly enjoyed <em>the author's own ambition</em> in setting up a <a href=\"https://www.moralambition.org\">School for Moral Ambition</a> and promoting the idea of creating <a href=\"https://www.moralambition.org/circles\">moral ambition circles</a>. This is pretty close to the sort of thing we <a href=\"/notes/cambridge-green-blue\">think about</a> day-to-day in our university environment anyway, so I hope to find <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> and set one up in Cambridge when I'm back next year (if there isn't one we can already join).</p>\n<h2 id=\"entertaining-non-fiction-this-way-up\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#entertaining-non-fiction-this-way-up\"></a>Entertaining non-fiction: This way up</h2>\n<p><a href=\"https://guardianbookshop.com/this-way-up-9780008710279/\"> <img src=\"/images/2026-books-twu.webp\" alt=\"%rc\" > </a></p>\n<p>For something less heavy, I weaved through &quot;<a href=\"https://guardianbookshop.com/this-way-up-9780008710279/\">This Way Up: When Maps Go Wrong</a>&quot; by the famed <a href=\"https://www.youtube.com/playlist?list=PLfxy4_sBQdxy3A2lvl-y3qWTeJEbC_QCp\">Map Men</a>. 
This is an irreverent walk through all the extremely weird ways humans have come up with to project a 3D topology onto a 2D surface.</p>\n<p>This included the fun\n<a href=\"https://en.wikipedia.org/wiki/Psychogeography\">psychogeography</a> movement from\nFrench surrealists who <a href=\"https://www.flickr.com/photos/maudnewton/202396012\">cut up a map of\nParis</a> and rearranged it\nwith parts of Paris that were &quot;stimulating&quot; and &quot;worthy of study&quot; with giant\narrows between them to represent the teleportation to the most exhilarating\nareas.  Perhaps we need to draw a similar 'a bicycling birders guide to\nCambridgeshire' in 2026!</p>\n<h2 id=\"best-scifi-when-there-are-wolves-again\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#best-scifi-when-there-are-wolves-again\"></a>Best scifi: When there are wolves again</h2>\n<p><a href=\"https://www.goodreads.com/book/show/217490523-when-there-are-wolves-again\"> <img src=\"/images/2026-books-ww.webp\" alt=\"%rc\" > </a></p>\n<p>I'm only halfway through &quot;<a href=\"https://www.goodreads.com/book/show/217490523-when-there-are-wolves-again\">When there are wolves again</a>&quot;\nby EJ Swift, but it's SO GOOD so far that I had to include it.\nIt combines some of my favourite topics about restoring depleted\necosystems and exploring space in one book. Something I've often discussed with\n<a href=\"https://coomeslab.org\">David Coomes</a> are the challenges around reintroducing wolves to Scotland; a real\nsource of tension between conservationists (who want to get rid of deer) and\nthe farmers (who quite reasonably don't want their livestock eaten by roving\nwolves).</p>\n<p>This also has a storytelling format of having two narrators chatting to each other,\nand maintains a really narrow and hopeful focus. 
You might notice I'm dosing up\non optimism to charge myself up for 2026, so thank you <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> for recommending\nthis one to me!</p>\n<h2 id=\"best-book-i-finally-finished-when-the-sparrow-falls\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#best-book-i-finally-finished-when-the-sparrow-falls\"></a>Best book I finally finished: When the sparrow falls</h2>\n<p><a href=\"https://thelilycafe.com/2021/06/22/book-review-when-the-sparrow-falls-by-neil-sharpson/\"> <img src=\"/images/2026-books-sf.webp\" alt=\"%rc\" > </a></p>\n<p>We can't be too optimistic and lose touch with reality, so a brilliant novel that I lost behind my sofa but found again\nin November is <a href=\"https://thelilycafe.com/2021/06/22/book-review-when-the-sparrow-falls-by-neil-sharpson/\">&quot;When the Sparrow Falls&quot;</a>,\nwhich has an entertaining (but getting awfully close to reality) take on AI agents.</p>\n<p>The rest of the world has been infected by AI governance, but one\nsmall nation, the fictional Caspian Republic, resists being overtaken and outlaws digital technology. Everyone has to be checked\nat the borders to see if they're human, and the book opens with the autopsy of an unfortunate\nhigh-ranking government official revealing that they're an AI.  The rest of the book is\ndark humour, bleak worldbuilding, and mystery at its finest. 
Not a cheerful book though...</p>\n<h2 id=\"best-thought-provoking-recommendation-prisoners-of-geography\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#best-thought-provoking-recommendation-prisoners-of-geography\"></a>Best thought-provoking recommendation: Prisoners of geography</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Prisoners_of_Geography\"> <img src=\"/images/2026-books-pg.webp\" alt=\"%rc\" > </a></p>\n<p>Thank you <a href=\"https://www.linkedin.com/in/isobelcohen/\">Isobel Cohen</a> for recommending <a href=\"https://en.wikipedia.org/wiki/Prisoners_of_Geography\">&quot;Prisoners of Geography&quot;</a> which is\na decade old now, but a fascinating view on geopolitics <em>just</em> before Trumpian\npolitics took over.  Tim Marshall argues that no matter how much technological\nprogress is made, a dominant force in politics will always be geography.</p>\n<p>I don't actually agree with it at all though.  This summer I <a href=\"/notes/owntracks-and-lifecycle\">went to Botswana</a> where I found a country that had a terrible\noutlook when it <a href=\"https://en.wikipedia.org/wiki/History_of_Botswana\">achieved independence</a>: no capital\nreserves, surrounded by hostile apartheid regimes, and with mineral wealth that\nusually leads to invasion or takeover. But today, Botswana has one of the highest life expectancies\nand most stable governments in Africa.</p>\n<iframe src=\"https://ourworldindata.org/grapher/life-expectancy?tab=map&mapSelect=BWA~ZAF~NAM~MOZ~ZWE~ZMB~AGO~TZA~COD~UGA~SOM~COG~CMR~MDG&globe=1&globeRotation=-10.98%2C29.81&globeZoom=2.44\" loading=\"lazy\" style=\"width: 100%; height: 600px; border: 1px;\" allow=\"web-share; clipboard-write\"></iframe>\n<p>The same is also true of Costa Rica, which has built up a solid ecotourism industry. 
But I did find the framework of <a href=\"https://en.wikipedia.org/wiki/Tim_Marshall_(journalist)\">Tim Marshall</a>'s argument to be well worth reading, and I've queued up his <a href=\"https://www.goodreads.com/book/show/62675564-the-future-of-geography\">2023 book on the politics of space</a> to read this year.</p>\n<h2 id=\"best-gift-chennai-the-history\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#best-gift-chennai-the-history\"></a>Best gift: Chennai the history</h2>\n<p><a href=\"https://www.goodreads.com/book/show/59917455-chennai\"> <img src=\"/images/2026-books-ch.webp\" alt=\"%rc\" > </a></p>\n<p>I used to live in Chennai way back in 1990 after the Gulf War chucked us out of\nKuwait, and <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> gave me a copy of <a href=\"https://www.goodreads.com/book/show/59917455-chennai\">Chennai: A\nBiography</a> when I visited\nhim earlier in the year.</p>\n<p>Madras is a city of so many &quot;firsts&quot;: the first Indian corporation, the first\narmy regiment. But I had no idea that the bookshop that I used to hang out in 36 years ago\nfinding all the maths books I could devour, called\n<a href=\"https://en.wikipedia.org/wiki/Higginbotham%27s\">Higginbotham's</a>, is actually the oldest\nbookshop in all of India!  A\nbrilliantly engaging piece of history from <a href=\"https://en.wikipedia.org/wiki/V._Sriram\">V.\nSriram</a> who apparently leads heritage\nwalks through the city that I must catch on my next visit.</p>\n<h2 id=\"best-non-tech-podcast\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#best-non-tech-podcast\"></a>Best non-tech podcast</h2>\n<p><a href=\"https://feeds.acast.com/public/shows/solving-for-climate\"> <img src=\"/images/2026-books-sc.webp\" alt=\"%rc\" > </a></p>\n<p>I greatly enjoyed tuning into &quot;<a href=\"https://feeds.acast.com/public/shows/solving-for-climate\">Solving for Climate</a>&quot; from my favourite climate optimists Hannah Ritchie and Rob Stewart!  
They've got a nicely balanced set of science and society oriented episodes, with some of my favourites being:</p>\n<ul>\n<li>&quot;Rahul Tongia: Is India on track to meet climate goals?&quot; (Dec 2025). I listened to this in India while watching an episode of my colleague <a href=\"https://en.wikipedia.org/wiki/Bhaskar_Vira\">Bhaskar Vira</a> appearing on <a href=\"https://www.cam.ac.uk/news/cambridge-pvc-prof-bhaskar-vira-in-new-documentary-on-indias-environment\">Indian TV</a>!</li>\n<li>&quot;Ian McKay: How do we get rid of contrails?&quot; (Sep 2025). The <a href=\"https://www.bbc.co.uk/news/articles/cz7wp777780o\">science behind contrails</a> is fascinating, and we had to account for this in our <a href=\"/notes/carbon-credits-vs-offsets\">carbon credits calculations</a> in <a href=\"/projects/4c\">4C</a> when calculating the damaging effects of flying: it's not just the CO2e.</li>\n<li>&quot;Mark Titley: How do we stop deforestation?&quot; (Jun 2025). <a href=\"https://trase.earth\">Trase</a> is an absolutely incredible organisation that's creating reliable supply chain maps, and we collaborated with them in our research on <a href=\"/papers/2024-food-life\">food biodiversity</a>.</li>\n</ul>\n<h2 id=\"best-tech-podcast-signals-and-threads\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#best-tech-podcast-signals-and-threads\"></a>Best tech podcast: Signals and Threads</h2>\n<p><a href=\"https://signalsandthreads.com\"> <img src=\"/images/2026-books-st.webp\" alt=\"%rc\" > </a></p>\n<p>I might be biased as I've <a href=\"https://signalsandthreads.com/what-is-an-operating-system/\">appeared on a previous season</a>, but <a href=\"https://signalsandthreads.com\">Signals and Threads</a> has really developed into a solid series that manages to make listening about tech not boring (a crime that about 99% of other podcasts I've attempted to listen to have committed). 
After working on it all day, do I really want to listen to more tech while out for a run?</p>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> has hosted some unexpectedly fun episodes this year:</p>\n<ul>\n<li><a href=\"https://signalsandthreads.com/the-thermodynamics-of-trading/\">The Thermodynamics of Trading</a> (July 2025). You'd think that an hour of talking about cooling data centres would be boring, but this was my standout episode of the year! It kicks into high gear when Daniel discusses what he optimistically dubs &quot;thermal events&quot; (boom goes the rack), and dips into physics, alerts, layouts and all kinds of fun things.</li>\n<li><a href=\"https://signalsandthreads.com/building-tools-for-traders/\">Building Tools for Traders</a> (May 2025). For the terminal obsessors among you (I am looking at <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>), this episode delves into just how mouse-averse Jane Street is: <em>&quot;everyone at Jane Street has roughly fighter pilot eyes and they say, I want this to be about six pixels high because then I can see more of it on my screen. You’re designing these tools with extremely high information density&quot;</em>. And indeed, I wasn't disappointed when I played with <a href=\"/notes/aoah-2025-9\">Bonsai Term</a> a few weeks ago!</li>\n<li><a href=\"https://signalsandthreads.com/from-the-lab-to-the-trading-floor/\">From the Lab to the Trading Floor: Designing for Expert Users</a> is actually from last year, but I only caught up with the season in 2025 so it counts! Erin Murphy formerly worked at JPL on UIs for space missions, and she covers both the technical <em>and</em> cultural aspects of developer experience. One really neat trick Jane Street does that I'd like to replicate here in Cambridge is <em>&quot;theres a channel that’s just alerting us when a person logs into the tool [...] we get their username, and we can easily find where they are in the office, and we can go up and talk to them&quot;</em>. 
What an excellent idea for encouraging human-to-human interaction in the <a href=\"/notes/aoah-2025\">agentic craziness</a> we will find ourselves in 2026!</li>\n</ul>\n<h2 id=\"first-book-to-read-in-2026-katabasis\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#first-book-to-read-in-2026-katabasis\"></a>First book to read in 2026: Katabasis</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Katabasis_(novel)\"> <img src=\"/images/2026-books-kb.webp\" alt=\"%lc\" > </a></p>\n<p>I've not actually read this yet, but thank you Sara Biswas for an effervescent review that got me really excited to pick this up on the way home! <a href=\"https://en.wikipedia.org/wiki/Katabasis_(novel)\">Katabasis</a> is a novel published this year by <a href=\"https://en.wikipedia.org/wiki/R._F._Kuang\">RF Kuang</a> (who studied at Cambridge) with a hilariously dark plot about two graduate students (and magicians) who must venture into hell to save their thesis advisor in order to get letters of recommendation.  I shall reserve judgement until after I have read it, and indeed after I have completed all the recommendation letters I need to get done despite allegedly being on vacation. But with a setting like this, I can feel it's going to be good!</p>\n<p>That's a wrap for my abridged reading recommendations for the year! I have a\nmonster pile of unread books to attempt to catch up on this year, so I'll try\nto blog about what I'm reading more regularly than annually.</p><h1>References</h1><ul><li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Madhavapeddy (2025). Tracking locations with OwnTracks, Life Cycle and Home Assistant. <a href=\"https://doi.org/10.59350/13ras-yd957\" target=\"_blank\"><i>10.59350/13ras-yd957</i></a></li>\n<li>Ball et al (2025). 
Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Swinfield et al (2025). Learning lessons from over-crediting to ensure additionality in forest carbon credits. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-29fk2\" target=\"_blank\"><i>10.33774/coe-2025-29fk2</i></a></li>\n<li>Madhavapeddy (2025). Disentangling carbon credits and offsets with contributions. <a href=\"https://doi.org/10.59350/g4ch1-64343\" target=\"_blank\"><i>10.59350/g4ch1-64343</i></a></li>\n<li>Madhavapeddy (2025). The Cambridge \"Green Blue\" competition to reduce emissions. <a href=\"https://doi.org/10.59350/y1g67-aq825\" target=\"_blank\"><i>10.59350/y1g67-aq825</i></a></li>\n<li>Stechemesser et al (2024). Climate policies that achieved major emission reductions: Global evidence from two decades. Science. <a href=\"https://doi.org/10.1126/science.adl6547\" target=\"_blank\"><i>10.1126/science.adl6547</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/hny2026",
      "title": "Happy new year and my fave readings of the year",
      "summary": "My favourite books, podcasts and recommendations from 2025, covering moral ambition, maps, wolves, AI dystopias, geopolitics, Chennai history, and the best tech podcasts.",
      "date_published": "2026-01-02T00:00:00.000000Z",
      "date_modified": "2026-01-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "books",
        "scifi",
        "fiction",
        "policy"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/j6zkp-n7t82",
          "doi": "10.59350/j6zkp-n7t82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/13ras-yd957",
          "doi": "10.59350/13ras-yd957",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-29fk2",
          "doi": "10.33774/coe-2025-29fk2",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/g4ch1-64343",
          "doi": "10.59350/g4ch1-64343",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/y1g67-aq825",
          "doi": "10.59350/y1g67-aq825",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1126/science.adl6547",
          "doi": "10.1126/science.adl6547",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025",
      "content_html": "<p>Agentic programming has been getting a <a href=\"https://devclass.com/2025/11/27/ocaml-maintainers-reject-massive-ai-generated-pull-request/\">hilariously</a> <a href=\"https://mastodon.social/@regehr/115606922116794760\">bad</a> <a href=\"https://news.ycombinator.com/item?id=46039274\">rap</a> in the OCaml community recently, but it's definitely here to stay despite the <a href=\"/notes/claude-copilot-sandbox\">security</a> and <a href=\"https://github.com/ocaml/ocaml/pull/14052#discussion_r2565290229\">legal</a> concerns. I realised that to form a useful opinion on all this, I needed to really get into <a href=\"/papers/2025-ocaml-ai\">using Claude with OCaml</a> for real outputs and not just toy code. So this holiday month, I'm going to release a new <em>useful</em> OCaml library per day until Christmas using Claude Code: the advent of agentic humps is here!</p>\n<ul>\n<li><strong><a href=\"/notes/aoah-2025-1\">Day 1: Crockford</a></strong> for <a href=\"https://www.crockford.com/base32.html\">Crockford Base32</a> encoding.</li>\n<li><strong><a href=\"/notes/aoah-2025-2\">Day 2: Jsonfeed</a></strong> for an implementation of the <a href=\"https://www.jsonfeed.org\">JSONFeed 1.1</a> spec.</li>\n<li><strong><a href=\"/notes/aoah-2025-3\">Day 3: XDGe</a></strong> for an <a href=\"https://specifications.freedesktop.org/basedir/latest/\">XDG Directory specification</a> with Eio capabilities.</li>\n<li><strong><a href=\"/notes/aoah-2025-4\">Day 4: Claudeio</a></strong> for a Claude OCaml/Eio SDK so I can use Claude to write more Eio.</li>\n<li><strong><a href=\"/notes/aoah-2025-5\">Day 5: Bytesrw-eio</a></strong> for a Bytesrw/Eio adapter and automating opam metadata via a custom Claude skill.</li>\n<li><strong><a href=\"/notes/aoah-2025-6\">Day 6: Yamlrw</a></strong> for a pure OCaml Yaml 1.2 library, to replace <a href=\"https://github.com/avsm/ocaml-yaml\">ocaml-yaml</a>'s C binding.</li>\n<li><strong><a href=\"/notes/aoah-2025-7\">Day 7: 
Yamlt</a></strong> to allow jsont codecs to be serialised to Yaml as well as JSON.</li>\n<li><strong><a href=\"/notes/aoah-2025-8\">Day 8: Sortal</a></strong>: a contacts management CLI using Yaml, Git and Cmdliner.</li>\n<li><strong><a href=\"/notes/aoah-2025-9\">Day 9: Sortal-Bonsai</a></strong>: adding a <code>Bonsai_term</code> terminal UI to Sortal via Async.</li>\n<li><strong><a href=\"/notes/aoah-2025-10\">Day 10: Sortal-Mosaic</a></strong>: adding a <code>Mosaic</code> terminal UI to Sortal via Eio.</li>\n<li><strong><a href=\"/notes/aoah-2025-11\">Day 11: Cookeio, Public-suffix, Punycode</a></strong>: parsing Internet RFCs to build cookie libraries.</li>\n<li><strong><a href=\"/notes/aoah-2025-12\">Day 12: Conpool</a></strong>: Eio TLS/TCP connection pooling and self-contained performance viz.</li>\n<li><strong><a href=\"/notes/aoah-2025-13\">Day 13: Requests</a></strong>: Heckling an OCaml HTTP client from 50 other implementations.</li>\n<li><strong><a href=\"/notes/aoah-2025-14\">Day 14: Karakeep</a></strong>: Live agentic API construction for the Karakeep app.</li>\n<li><strong><a href=\"/notes/aoah-2025-15\">Day 15: Htmlrw</a></strong>: Vibespiling Rust/Python into a 100% compliant HTML5 manipulation library.</li>\n<li><strong><a href=\"/notes/aoah-2025-16\">Day 16: Json-pointer</a></strong>: Vibesplaining specifications by generating OCaml Javascript notebooks.</li>\n<li><strong><a href=\"/notes/aoah-2025-17\">Day 17: Jmap</a></strong>: Vibemailing little CLI agents to bring my JMAP messages under control.</li>\n<li><strong><a href=\"/notes/aoah-2025-18\">Day 18: Tomlt</a></strong>: Elegant TOML 1.1 codecs inspired by the jsont data soup paper.</li>\n<li><strong><a href=\"/notes/aoah-2025-19\">Day 19: Zulip, INIt</a></strong>: Zulip bot framework and INI codecs compatible with Python configparser.</li>\n<li><strong><a href=\"/notes/aoah-2025-20\">Day 20: Langdetect</a></strong>: Statistical detection for human languages in OCaml, JavaScript and 
wasm.</li>\n<li><strong><a href=\"/notes/aoah-2025-21\">Day 21: Html5rw_check</a></strong>: Vibespiling the Nu HTML Validator from Java to typed OCaml checkers.</li>\n<li><strong><a href=\"/notes/aoah-2025-22\">Day 22: Monopam</a></strong>: Monorepo workflow with dune vendoring for cross-cutting fixes.</li>\n<li><strong><a href=\"/notes/aoah-2025-23\">Day 23: Unpac</a></strong>: Unifying git and opam package management with branch-based monorepos.</li>\n<li><strong><a href=\"/notes/aoah-2025-24\">Day 24: Tuatara</a></strong>: Tuatara, an evolving Atom aggregator that mutates its own code.</li>\n<li><strong><a href=\"/notes/aoah-2025-25\">Day 25: OCaml Claude Marketplace</a></strong>: Wrapping up my Claude skills into a reusable bundle.</li>\n</ul>\n<p><img src=\"/images/aoah-ss-2.webp\" alt=\"%c\" title=\"Claude is also very good at automating non-coding tasks like opam metadata\" ></p>\n<p>I'm working through a large backlog of ideas that I'll figure out as each day goes on. Ideas thrown on the pile by colleagues include a TCP connection reuse and pooling library with TLS support, HTTP cookie jar handling using Eio, a batteries-included HTTP(S) client library with redirect/cookies, digesting vast amounts of Git and summarising it (see a <a href=\"https://thicket.dev\">preview</a>), Zulip bindings using Requests and Eio, the Kitty graphics protocol to show graphics in your terminal, client bindings for the JMAP protocol, client bindings for the Immich self hosted photo service, client bindings for the Peertube video service, generating image srcsets in various resolutions for websites, DOI resolution of papers to structured metadata, and a Parquet library in pure OCaml. I'm also working on an <code>io_uring</code> <a href=\"https://oxcaml.org\">OxCaml</a> webserver if I can get the Linux kernel not crashing on me before Santa visits...</p>\n<p>My overall goal is to accelerate the heck out of how I manage the growing data in this website. 
I've been building it as <a href=\"/notes/bushel-lives\">homebrew infrastructure</a> for the past twenty five years, and now I want it to move from ad-hoc scripts to principled data management. I am also using the libraries to do data processing in the day job for the <a href=\"/projects/rsn\">remote sensing of nature</a> or <a href=\"/projects/ce\">evidence synthesis</a>. I'll edit the above list every day to link to what I actually did.</p>\n<p>I've picked these choices fairly carefully as they're not &quot;core&quot; libraries that\nare difficult to write and require functional ingenuity, but are instead\nproblems that involve a fair amount of boilerplate code that is typically quite\ntedious to write in OCaml. Hand writing code might be on the ropes, but not\nquite out of action just yet! But first, let's establish some groundrules for if\nthis is a good idea or not.</p>\n<h2 id=\"isnt-this-just-more-ai-slop-code\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#isnt-this-just-more-ai-slop-code\"></a>Isn't this just more AI slop code?</h2>\n<p>There's a definite gag reflex involved with releasing so much code: by prioritising quantity over quality, aren't I just contributing to the world of <a href=\"/notes/ai-poisoning\">AI slop</a>? However, the hypothesis I am exploring is that the software engineering process fundamentally changes when using agents towards specification driven development, which has always been the <a href=\"https://deepspec.org/main\">holy grail of functional programming</a>.</p>\n<p>There's been extensive discussion recently about the role of <a href=\"https://rfd.shared.oxide.computer/rfd/0576\">LLMs in open source</a> elsewhere that informed my thinking. 
I liked <a href=\"https://github.com/tmattio\">Thibaut Mattio</a> stating how he's <a href=\"https://discuss.ocaml.org/t/ann-mosaic-a-modern-terminal-user-interface-framework-for-ocaml-early-preview/17572/5\">approaching</a> his <a href=\"https://www.youtube.com/watch?v=BAvXqd0QeVM\">own</a> agentic software development:</p>\n<blockquote>\n<p>AI writes a significant amount of the initial code, and I review, revise, and iterate on a large portion of it. That’s how I work these days. But the architecture, design, and core logic are very much the result of deliberate iteration and manual refinement.\n<cite>-- <a href=\"https://discuss.ocaml.org/t/ann-mosaic-a-modern-terminal-user-interface-framework-for-ocaml-early-preview/17572/5\">Thibaut Mattio</a>, OCaml Discuss, 2025</cite></p>\n</blockquote>\n<p>Bryan Cantrill came up with a superb set of principles for <a href=\"https://rfd.shared.oxide.computer/rfd/0576\">LLM Usage at Oxide</a>. In particular, he separates out using LLMs for reading, writing and coding. I totally agreed with him that I hate people sending me LLM-generated writing for me to review; I would rather get the raw prompt and use my own LLM+context rather than read through other people's slop.</p>\n<blockquote>\n<p>LLM-generated prose undermines a social contract of sorts: absent LLMs, it is presumed that of the reader and the writer, it is the writer that has undertaken the greater intellectual exertion. (That is, it is more work to write than to read!) For the reader, this is important: should they struggle with an idea, they can reasonably assume that the writer themselves understands it — and it is the least a reader can do to labor to make sense of it.\n<cite>-- <a href=\"https://rfd.shared.oxide.computer/rfd/0576\">Using LLMs at Oxide</a>, RFD0576, Dec 2025</cite></p>\n</blockquote>\n<p>However, there is an undeniable (and growing) power in the ability to generate code at scale using LLMs. 
I've been doing a lot of this <a href=\"/notes/geotessera-python-0-7\">with Python</a> in recent months, but I find myself increasingly frustrated by the lack of typing guardrails involved with agentic coding there.</p>\n<p>I believe that a strongly typed, modular language like OCaml could become one of the <em>best</em> languages for agentic coding in the longer term, with <a href=\"https://arxiv.org/abs/2508.04865\">advances</a> happening rapidly to cure the data deficiency problem for relatively obscure languages with smaller corpuses. Also, with <a href=\"/notes/icfp25-oxcaml\">OxCaml on the horizon</a>, getting help with increasingly complex (but rewarding) code annotations such as modes and kinds seems essential.</p>\n<h2 id=\"groundrules-for-the-advent-of-agentic-humps\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#groundrules-for-the-advent-of-agentic-humps\"></a>Groundrules for the Advent of Agentic Humps</h2>\n<p>After reflecting on the recent discussions, I decided on these for my little\nDecember experiment:</p>\n<ul>\n<li><strong>No AI-driven contributions to other people's code.</strong> All my slop stays in my\nown lane unless the other person agrees. Luckily my own research group is\neasy to bribe with some festive beer so I hope to get them (or you, my dear reader) to voluntarily help me judge the success or failure.</li>\n<li><strong>Read every line of code that's tagged for release</strong>. Even if I haven't written it all, it's vital to look for howlers. However, intermediate pushes may have slop in them, so stick to the tagged releases.</li>\n<li><strong>The library has to be used somewhere</strong> in my production code stack, for example this website. 
Time to <a href=\"https://en.wikipedia.org/wiki/Eating_your_own_dog_food\">eat my own</a> agentic slop on my own knowledge bases!</li>\n<li><strong>Build on great human designed code.</strong> LLMs do not replace or compete with well designed foundation libraries in the OCaml ecosystem like <a href=\"https://github.com/ocaml-multicore/eio\">Eio</a>, <a href=\"https://dev.realworldocaml.org\">Core</a>, <a href=\"http://github.com/ocsigen/lwt/blob/master/CHANGES\">Lwt</a> or the <a href=\"https://erratique.ch/contact.en\">Bunzli-verse</a>. Each of these has a different design ethos, but if they didn't exist there would be no scaffolding over which to compose LLM-driven code outputs. So this is <em>not</em> a competition to beat them, but rather to use them more effectively.</li>\n</ul>\n<p>And overall, this process should not only help me learn more about agentic workflows\nbut also contribute to the wider discussion, so I'll capture what I learn in\nthis blog series at the end.</p>\n<p>Some non-rules:</p>\n<ul>\n<li>Keeping agentic code separate from my &quot;real code&quot; seems pointless nowadays,\nwith LLMs everywhere. I tried that earlier in the year, but I fear the\npoisoning will have to be dealt with by other means.</li>\n<li>I'm trying to keep this specific to my own OCaml workflow, and not\ngeneralising this for a hypothetical other user. But you should feel free to\nfork this stuff.</li>\n<li>I have no idea how I'm going to maintain all these libraries once released. A\nproblem for 2026. I'm not particularly attached to any of these libraries, so\nmaintenance/rewrite offers are all fine by me.</li>\n<li>There's a reasonable chance some of this has some bad bugs, since it's not\ngoing through peer review. I'll do my best to handle test coverage, but\nplease be tolerant. Bug reports are welcome.</li>\n<li>I've done my best to manually scan code and attribute copyright where possible,\nbut there remains a chance I have horribly screwed up. 
Any errors in attribution\nare my own, but I'm going to press on and take the risk.</li>\n</ul>\n<p>If anyone else wants to join in the Advent of Agentic Humps, ping me on\nwhatever communication medium you like. Just remember the groundrules: don't waste\nother maintainers' time without their permission first.</p><h1>References</h1><ul><li>Madhavapeddy (2025). Oh my Claude, we need agentic copilot sandboxing right now. <a href=\"https://doi.org/10.59350/aecmt-k3h39\" target=\"_blank\"><i>10.59350/aecmt-k3h39</i></a></li>\n<li>Madhavapeddy (2025). Is AI poisoning the scientific literature? Our comment in Nature. <a href=\"https://doi.org/10.59350/pbxew-d2j78\" target=\"_blank\"><i>10.59350/pbxew-d2j78</i></a></li>\n<li>Madhavapeddy (2025). Arise Bushel, my sixth generation oxidised website. <a href=\"https://doi.org/10.59350/0r62w-c8g63\" target=\"_blank\"><i>10.59350/0r62w-c8g63</i></a></li>\n<li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera 0.7 out with efficient sampling and Zarr support. <a href=\"https://doi.org/10.59350/nagwp-tnw89\" target=\"_blank\"><i>10.59350/nagwp-tnw89</i></a></li>\n<li>Boruch-Gruszecki et al (2025). Agnostics: Learning to Code in Any Programming Language via Reinforcement with a Universal Learning Environment. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2508.04865\" target=\"_blank\"><i>10.48550/arXiv.2508.04865</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025",
      "title": "2025 Advent of Agentic Humps: Building a useful O(x)Caml library every day",
      "summary": "An exploration of agentic programming through building useful OCaml libraries daily using Claude Code while establishing groundrules for responsible development.",
      "date_published": "2025-12-01T00:00:00.000000Z",
      "date_modified": "2025-12-26T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "oxcaml",
        "agents",
        "llms",
        "ai",
        "aoah"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/aecmt-k3h39",
          "doi": "10.59350/aecmt-k3h39",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/pbxew-d2j78",
          "doi": "10.59350/pbxew-d2j78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0r62w-c8g63",
          "doi": "10.59350/0r62w-c8g63",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nagwp-tnw89",
          "doi": "10.59350/nagwp-tnw89",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2508.04865",
          "doi": "10.48550/arXiv.2508.04865",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-25",
      "content_html": "<p>I'm somewhat frazzled after managing to get through <a href=\"/notes/aoah-2025\">25 days of agentic coding</a>.  It hasn't actually been that much physical work, but I\nunderestimated just how much my brain would be in full gear multitasking across\n<em>so many</em> terminal windows and ideas.  My outbound queue that I didn't manage to write\nup is enormous: I have sessions running with OxCaml experiments, io_uring webservers, implementations of ATProto, a pure OCaml Parquet, and some even stranger ideas I won't go into now!</p>\n<p>In the past, my computer systems brain was limited by the speed of coding, but now it feels like we're entering a different age. I'll reserve my longform thoughts on all of this for the new year as I need to head into Christmas festivities, but I wanted to leave you all with my <strong><a href=\"https://github.com/avsm/ocaml-claude-marketplace\">Claude Code OCaml marketplace</a></strong> in case you want to try this stuff for yourself!</p>\n<p><a href=\"https://github.com/avsm/ocaml-claude-marketplace\"> <img src=\"/images/aoah-plugin-ss-1.webp\" alt=\"%c\" > </a></p>\n<h2 id=\"claude-skills-and-marketplaces\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#claude-skills-and-marketplaces\"></a>Claude Skills and marketplaces</h2>\n<p>Claude has made it very simple to distribute <a href=\"https://code.claude.com/docs/en/plugin-marketplaces\">plugins</a>. I just created a GitHub <a href=\"https://github.com/avsm/ocaml-claude-marketplace\">avsm/ocaml-claude-marketplace</a> and kicked off Claude's own development plugin to develop plugins with. I then fed it all my skills and asked it to generalise them a bit.  
They'll likely need some work to generalise beyond me, but feel free to send in PRs!</p>\n<p><img src=\"/images/aoah-plugin-ss-4.webp\" alt=\"%c\" title=\"Just go to the marketplaces section under /plugins in Claude\" ></p>\n<p><img src=\"/images/aoah-plugin-ss-2.webp\" alt=\"%c\" title=\"It'll auto install it and you can find all the plugins for OCaml\" ></p>\n<p><img src=\"/images/aoah-plugin-ss-3.webp\" alt=\"%c\" title=\"And you can enable or disable it selectively\" ></p>\n<p>I'll be back in the new year with more thoughts, but in the meanwhile, I hope\nyou all have a very good holiday and a chance to recharge! It's going to be a\nmad 2026...</p>",
      "url": "https://anil.recoil.org/notes/aoah-2025-25",
      "title": "AoAH Day 25: Claude OCaml Marketplace for all your festive coding needs",
      "summary": "Wrapping up 25 days of agentic coding with a Claude Code OCaml plugin marketplace to share the skills and tools developed throughout the series.",
      "date_published": "2025-12-25T00:00:00.000000Z",
      "date_modified": "2025-12-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "ocaml",
        "oxcaml",
        "llms",
        "aoah"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-24",
      "content_html": "<p>My original purpose for starting this <a href=\"/notes/aoah-2025\">AoAH</a> series was to build a\nfeed aggregator for my group website, so I had to finish up with something to show!</p>\n<p>I'm not sure if taking the <a href=\"https://www.goodreads.com/quotes/6001-think-you-re-escaping-and-run-into-yourself-longest-way-round\">longest way around</a>\nwas wise here but I ended up building <strong><a href=\"https://tangled.org/anil.recoil.org/tuatara\">tuatara</a></strong>, an\naggregator to pull together all my colleagues' writing into one place.\nThey're a quirky bunch with many diverse homegrown feeds in various\nstates of brokenness, so it's difficult to build a one-size-fits-all tool.</p>\n<p>So given it's the end of the year and I'm sozzled on Christmas eve on\nmulled wine, I decided to make Tuatara <strong>mutate its own code</strong> by linking with my <a href=\"/notes/aoah-2025-4\">Claudeio</a> library to\nforce it to evolve and modify itself as it runs across feed errors. 
Every deployment of\nTuatara is meant to be slightly <a href=\"/papers/2025-internet-ecology\">different</a>.</p>\n<h2 id=\"evolving-code-like-its-2026\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#evolving-code-like-its-2026\"></a>Evolving code like it's 2026</h2>\n<p>The initial generation of the code was pretty straightforward, using Sqlite to\nstore a database with all the posts and importing metadata from my previously\ncreated <a href=\"/notes/aoah-2025-8\">Sortal</a> contacts manager.</p>\n<pre><code class=\"language-bash\">&gt; tuatara import-sortal\nSortal Import Results:\n\n  Total contacts scanned: 420\n  Contacts with feeds: 15\n  Feeds imported: 16\n  Feeds skipped (already exist): 0\n\nRun 'tuatara fetch' to download posts from the imported feeds.\n</code></pre>\n<p>But when we actually get the feeds, I rapidly realised that there are lots of\nparsing quirks needed:</p>\n<pre><code class=\"language-bash\">&gt; tuatara fetch\nFetching Anil Madhavapeddy...\n  340 posts (0 new)\nFetching David Allsopp...\n  Not modified\nFetching Jessica Man...\n  Not modified\nFetching Jon Ludlam...\n  28 posts (0 new)\nFetching Jon Sterling...\n  Not modified\nFetching Mark Elvers...\n  Not modified\nFetching Martin Kleppmann...\n  Error: Feed parse error: document MUST contains exactly one &lt;feed&gt; element at l.0 c.0\n  URL: http://feeds.feedburner.com/martinkl\nFetching Onkar Gulati...\n  Error: Not_found\n  URL: https://onkargulati.com/feed.xml\nFetching Patrick Ferris...\n  Error: Feed parse error: &lt;entry&gt; elements MUST contains at least an &lt;author&gt; element or &lt;feed&gt; element MUST contains one or more &lt;author&gt; elements at l.1460 c.7\n  URL: http://patrick.sirref.org/weeklies/atom.xml\nFetching Richard Mortier...\n  79 posts (79 new)\nFetching Ryan Gibb...\n  38 posts (38 new)\nFetching Sadiq Jaffer...\n  10 posts (10 new)\n\nTotal: 127 new posts (3 errors)\n</code></pre>\n<p>Either we skip content, or talk to the people involved to 
fix their feeds, but\nit's Christmas eve so that's unlikely. And anyway, we want to be <a href=\"https://en.wikipedia.org/wiki/Robustness_principle\">liberal in\nwhat we accept</a> so why\ncan't I fix my own software first?!</p>\n<p><img src=\"/images/aoah-tuatara-ss-1.webp\" alt=\"%c\" title=\"Like the amazing Tuatara, why don't we build evolution directly into our software? Time to take off the semantics seatbelt...\" ></p>\n<h2 id=\"medice-cura-te-ipsum\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#medice-cura-te-ipsum\"></a>Medice, cura te ipsum</h2>\n<p>The non-obvious and probably-terrible answer here is to use our fancy coding\nmodels to force the Tuatara source code to heal itself.  I added an <code>--evolve</code>\nflag to allow tuatara to invoke <a href=\"/notes/aoah-2025-4\">Claude/OCaml</a> upon errors and patch its own code.\nTurns out self help works, as Martin's blog feed was just fine!</p>\n<pre><code class=\"language-bash\">Fetching Martin Kleppmann...\n  Error: Feed parse error: document MUST contains exactly one &lt;feed&gt; element at l.0 c.0\n  URL: http://feeds.feedburner.com/martinkl\n\nInvoking Claude Code to fix parse error...\n\nThe feed from `http://feeds.feedburner.com/martinkl` is an **RSS 2.0 feed** (it\nstarts with `&lt;rss version=&quot;2.0&quot;&gt;`), but tuatara was incorrectly detecting it as\nan **Atom feed**. This caused the Syndic Atom parser to fail with &quot;document\nMUST contains exactly one \\&lt;feed\\&gt; element&quot;.\n\nThe root cause was that the `detect_feed_type` function prioritized the HTTP\n`Content-Type` header over the actual content. 
FeedBurner (and other feed\naggregators) often serve RSS feeds with an incorrect `application/atom+xml`\ncontent-type header.\n\nThis is a generic fix that will work for any feed aggregator or CDN that\nmis-labels RSS feeds as Atom (or vice versa), so no domain-specific quirk was\nneeded.\n</code></pre>\n<p>But the parsing drama continued, as <a href=\"https://mynameismwd.org\">Michael Dales</a> uses the wrong date format in his\nfeed (tsk tsk I'd send the RFC police out if it wasn't Christmas), but Tuatara\n<a href=\"https://tangled.org/anil.recoil.org/tuatara/commit/a56c4a877936a1ae49c0fab301fb6afbcc74b3bb\">evolves a quirk</a> that gets past it:</p>\n<blockquote>\n<p>The quirk module converts ISO 8601 dates (2025-10-22T12:24:00-00:00) to RFC\n822 format (Wed, 22 Oct 2025 12:24:00 GMT) which is what Syndic's RSS2 parser\nexpects.</p>\n</blockquote>\n<p>And <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> both have an empty author field which would\nordinarily give us a dreaded <code>Not_found</code> exception:</p>\n<blockquote>\n<p>Fetching Patrick Ferris...Error: Feed parse error: &lt;entry&gt; elements MUST contains at least an &lt;author&gt; element or &lt;feed&gt; element MUST contains one or more &lt;author&gt; elements at l.1460 c.7 URL: http://patrick.sirref.org/weeklies/atom.xml</p>\n</blockquote>\n<p>But never fear, the inexorable <code>--evolve</code> flag figures it out and patches its own code!</p>\n<p>There were some non-trivial quirks as well; <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> uses Quarto for his website which puts the entire HTML blob into the summary field, but the evolution managed to use <a href=\"/notes/aoah-2025-15\">html5rw</a> to parse <a href=\"https://tangled.org/anil.recoil.org/tuatara/commit/7f29b37e1c647f984589e42164a0fc2ec0cda5c4\">its way out of this</a>. 
This sort of fix is very hard to generalise, so it's actually quite useful for the tool to fix itself on demand for our small group.</p>\n<h2 id=\"using-the-claude-frontend-design\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#using-the-claude-frontend-design\"></a>Using the Claude frontend design</h2>\n<p>Then I needed a quick way to do a clean frontend output so I can visualise the\nJSONfeed. Claude has a <code>/plugin frontend-design</code> skill that is built in, and\nprompting it to give me a few designs let me integrate a <code>--html</code> output.</p>\n<p>And because it's Christmas, I added some snowflakes as well. Yay!</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/eeg-xmas\"> <img src=\"/images/aoah-tuatara-ss-4.webp\" alt=\"%c\" title=\"Ho ho ho merry xmas everyone from the EEG feed that isn't live yet but will be after the new year\" > </a></p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>The paper I enjoyed writing the most this year was <a href=\"/papers/2025-internet-ecology\">Steps towards an Ecology for the Internet</a> for\n<a href=\"/notes/ecology-at-aarhus\">Aarhus 2025</a>. In the back of my head since has been a desire\nto start figuring out what self-evolving software actually might be. It's a\nstrange, and probably impractical idea, but I'm delighted that I took a tiny\nstep towards it with this project.</p>\n<p>Back in March, I had the honour of being invited to a <a href=\"https://bellairs.net\">Bellairs</a> meeting to discuss a heady combination of semantics and computational science. <a href=\"https://jonmsterling.com\">Jon Sterling</a> demonstrated his wonderfully organised Forester website. And I... showed how my mishmash of semi-structured writings can kind of be connected together in a vaguely coherent way to build my website. 
Next year will have me thinking much harder about the implications of <a href=\"/papers/2025-internet-ecology\">self-evolving code</a>, of how radically <a href=\"/papers/2025-biodiversity-9recs\">transformative to global biodiversity</a> semi-structured agentic processing might be, and other heavy matters. But to close this year, I'm disproportionately pleased to have gotten my tiny website under control a little!</p>\n<p><img src=\"/images/aoah-tuatara-ss-2.webp\" alt=\"%c\" title=\"Sitting indoors in Barbados with a gigantic beach outside: a classic sign of semanticists in the wild\" ></p>\n<p>As I noted in my <a href=\"/notes/acm-ai-recs\">letter to the ACM</a>, it's important that we can use AI for things that boost the\nhuman condition; I really enjoy reading my colleagues' long form thoughts much\nmore than doomscrolling on the web, and so making it easier to gather their\nthoughts digestibly and easily is a nice end to my <a href=\"/notes/aoah-2025\">agentic humps</a>\neffort. Tomorrow on <a href=\"/notes/aoah-2025-25\">Christmas</a> I'll publish all the skills I used so others can try them out.</p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Sutherland et al (2026). Nine changes needed to deliver a radical transformation in biodiversity measurement. <a href=\"https://doi.org/10.1073/pnas.2519345123\" target=\"_blank\"><i>10.1073/pnas.2519345123</i></a></li>\n<li>Madhavapeddy (2025). Dear ACM, you're doing AI wrong but you can still get it right. <a href=\"https://doi.org/10.59350/c84g4-5zt58\" target=\"_blank\"><i>10.59350/c84g4-5zt58</i></a></li>\n<li>Madhavapeddy (2025). Presenting our Ecology of the Internet ideas at Aarhus 2025. <a href=\"https://doi.org/10.59350/p45b8-kvt85\" target=\"_blank\"><i>10.59350/p45b8-kvt85</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-24",
      "title": "AoAH Day 24: Tuatara, an evolving Atom aggregator that mutates",
      "summary": "Tuatara is a feed aggregator that integrates Claude to evolve and patch its own code when encountering parsing errors, embodying the concept of self-healing software.",
      "date_published": "2025-12-24T00:00:00.000000Z",
      "date_modified": "2025-12-24T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "ocaml",
        "oxcaml",
        "llms",
        "aoah",
        "networks"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3744169.3744180",
          "doi": "10.1145/3744169.3744180",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1073/pnas.2519345123",
          "doi": "10.1073/pnas.2519345123",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/c84g4-5zt58",
          "doi": "10.59350/c84g4-5zt58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/p45b8-kvt85",
          "doi": "10.59350/p45b8-kvt85",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-23",
      "content_html": "<p>Yesterday's <a href=\"/notes/aoah-2025-22\">monopam</a> workflow used git submodules to combine\nvendored packages, but was awkward to use for crosscutting changes involving\nlots of vendored git repositories. Today I asked what agentic code development\nwould look like if we could unify <em>all code</em> into a single git repository,\nwhere upstream packages become branches instead of submodules. I've\nopen-sourced the <strong><a href=\"https://tangled.org/anil.recoil.org/unpac\">unpac</a></strong> CLI to\nexplore this, and have begun using it myself.</p>\n<p>Coding agents work best when all relevant code is locally available so\nthey can grep and make <a href=\"/notes/aoah-2025-21\">cross-cutting changes</a>. I first noticed this when building <a href=\"/notes/aoah-2025-9\">Bonsai terminal UIs</a> and <a href=\"/notes/aoah-2025-10\">Mosaic</a>, where I had to manually assemble monorepos just to get the agent working.\nThings come to a crashing halt when package management\ngets involved; the tool calls for web search are far slower and unreliable.\nThis means that the agent doesn't really have a good view on what third-party\npackages might be useful to solve a problem, leading to the common complaint\nthat LLMs <a href=\"https://ryan.freumh.org/claude-code.html\">reinvent the wheel</a>.</p>\n<p>To fix this, unpac parses package metadata and materialises it into a git branch\nstructure <em>in a single repository</em> to make vendoring, patching, and updating\na native git workflow. Local changes can later be exported into git patches for\nsending upstream, but in the meanwhile our agents can work on a single\nrepository.</p>\n<p>The secret sauce to working on so many branches is to use <a href=\"https://git-scm.com/docs/git-worktree\">git worktrees</a>, which allow multiple\nbranches to be checked out simultaneously from one git repo! 
I'll explain how unpac works next, and you can\nbrowse a <a href=\"https://tangled.org/anil.recoil.org/unpac-work\">working unpac tree</a>. You do end up with a\nlot of git branches, which got me <a href=\"https://x.com/rhatr/status/1012001138110029824\">banned from GitHub</a> back when I announced <a href=\"https://www.theregister.com/2017/01/17/docker_adds_continuous_integration_to_datakit/\">Docker DataKit</a>. Luckily this time around I am hosting on <a href=\"/notes/disentangling-git-with-bluesky\">Tangled</a> where I host my own Git remotes and so don't have to worry about third-party SLAs!</p>\n<p><a href=\"https://tangled.org/anil.recoil.org/unpac-work\"> <img src=\"/images/aoah-unpac-ss-6.webp\" alt=\"%c\" title=\"All the dependent code is in separate branches in the git repo, managed by unpac\" > </a></p>\n<h2 id=\"the-unpac-branching-model\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-unpac-branching-model\"></a>The unpac branching model</h2>\n<p>unpac organises code and dependencies using a lot of unrelated git branches, with\ncareful merging across them.  
The <code>main</code> branch only holds the unpac metadata about which projects exist and which opam remotes to use.\nWhile this defaults to the <a href=\"https://github.com/ocaml/opam-repository\">upstream opam-repository</a>, I'm also using this with the <a href=\"https://github.com/oxcaml/opam-repository\">oxcaml/opam-repository</a> and my <a href=\"https://tangled.org/anil.recoil.org/aoah-opam-repo\">aoah-opam-repo</a> overlays (maintained via the <a href=\"/notes/aoah-2025-5\">opam metadata skill</a>) to help track community forks.</p>\n<pre><code>[opam]\ncompiler = &quot;5.4.0&quot;\n\n[[opam.repositories]]\nname = &quot;opam&quot;\npath = &quot;/workspace/opam/opam-repository&quot;\n\n[[opam.repositories]]\nname = &quot;aoah&quot;\npath = &quot;/workspace/opam/aoah-opam-repo&quot;\n</code></pre>\n<p>Each third-party opam package then has <em>three</em> branches in my unpac repo:</p>\n<ul>\n<li><code>upstream/opam/&lt;pkg&gt;</code> holds the unmodified upstream code and history.</li>\n<li><code>vendor/opam/&lt;pkg&gt;</code> is upstream history relocated to a <code>vendor/opam/&lt;pkg&gt;/</code> prefix using <a href=\"https://github.com/newren/git-filter-repo\">git-filter-repo</a>.</li>\n<li><code>patches/opam/&lt;pkg&gt;</code> is the vendor branch with any local changes applied.</li>\n</ul>\n<p>The projects you are working on each live in their own <code>project/&lt;name&gt;</code> orphan git\nbranch, independent of all the others.  Adding a dependency to a project is a mere matter of\n<em>merging</em> the vendor branch for a given dependency into the <code>project</code> branch.</p>\n<p>This merging will materialise the dependency code in the project branch\nconflict-free under <code>vendor/opam/</code>. This allows us to build a monorepo of OCaml\ncode that's maintained its git history across all the different developers,\nwhile also allowing local commits to be held and rebased. 
The agent has a ton\nof context available to it now without having to go to the outside world!</p>\n<h3 id=\"working-on-multiple-branches-simultaneously-with-worktrees\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#working-on-multiple-branches-simultaneously-with-worktrees\"></a>Working on multiple branches simultaneously with worktrees</h3>\n<p>Before showing off the unpac CLI, it's worth explaining git worktrees as I'd\nnever used them before today.  Normally, you can only have one git branch\nchecked out at a time, but worktrees free us of that restriction:</p>\n<blockquote>\n<p>A git repository can support multiple working trees, allowing you to check\nout more than one branch at a time. With <code>git worktree add</code> a new working\ntree is associated with the repository, along with additional metadata that\ndifferentiates that working tree from others in the same repository. The\nworking tree, along with this metadata, is called a &quot;worktree&quot;.</p>\n<p>This new worktree is called a &quot;linked worktree&quot; as opposed to the &quot;main\nworktree&quot; prepared by git-init or git-clone. A repository has one main\nworktree (if it's not a bare repository) and zero or more linked worktrees.\nWhen you are done with a linked worktree, remove it with <code>git worktree remove</code>.\n<cite>-- <a href=\"https://git-scm.com/docs/git-worktree\">git-worktree documentation</a>, 2023</cite></p>\n</blockquote>\n<p>Creating them is pretty straightforward using <code>git worktree add</code>.  This creates\na new checkout with the <code>.git</code> entry being a file containing an entry\nlike:</p>\n<pre><code>gitdir: /workspace/git/worktrees/tuatara\n</code></pre>\n<p>Without worktrees, agents fall over themselves switching branches due to the\nrequirement that all files be committed before switching. 
With worktrees, you\ncan have uncommitted stuff in multiple branches, meaning we can simultaneously\nview upstream code while making patches, compare vendor and patches branches\nside-by-side with diff, and work on multiple projects from the same repository.\nI used to have these affordances back when I used Mercurial and Perforce (with\ndifferent distribution models, admittedly), so it's great to have it back!</p>\n<h2 id=\"using-unpac-via-the-cli\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#using-unpac-via-the-cli\"></a>Using unpac via the CLI</h2>\n<p>This elaborate git schema is all very good, but not something I'd want to manage\nmanually. The unpac CLI takes care of all the gruntwork involved,\nincluding integrating an opam solver to create 100s of branches with one\ncommand. Let's take a look at an example project:</p>\n<pre><code class=\"language-bash\">$ unpac init\n\n# Vendors in packages from opam including dependencies\n$ unpac add opam eio --solve\n</code></pre>\n<p>We now have a bunch of vendor branches in our local repository, and need to\ncreate a project to use them:</p>\n<pre><code class=\"language-bash\"># Create a new project branch\n$ unpac project new myapp\n\n# Merge in the patch branches of these dependencies into project/myapp/vendor\n$ unpac opam merge eio --solve myapp\n\n# Hax0r like it's 2026 on your project\n$ cd project/myapp\n</code></pre>\n<p><img src=\"/images/aoah-unpac-ss-1.webp\" alt=\"%c\" title=\"The unpac CLI solves package constraints and merges individual branches into a project\" ></p>\n<h2 id=\"doing-agentic-monorepo-development\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#doing-agentic-monorepo-development\"></a>Doing agentic monorepo development</h2>\n<p>We can now do a simple <code>dune build</code> in the <code>project/myapp</code> directory since all our code is present in one working branch.\nA typical unpac project looks like:</p>\n<pre><code>.\n├── dune-project\n├── vendor/\n│   ├── eio/            # 
vendored eio source\n│   ├── lwt/            # vendored lwt source\n│   └── ...\n└── src/                # your project source\n</code></pre>\n<p>Dune <a href=\"/notes/aoah-2025-22\">automatically discovers and builds</a> the vendored packages. No special configuration is needed beyond standard dune files.</p>\n<p>However, some packages don't build with dune since the upstream projects don't use dune but have other build systems (quite reasonably! Choice is important). In the past, I have maintained over 50 <a href=\"https://github.com/dune-universe\">dune ports</a> via opam overlays, mostly by hand. However, I can now easily use my coding agent to do all the porting automatically.</p>\n<p><img src=\"/images/aoah-unpac-ss-4.webp\" alt=\"%c\" title=\"Claude can spawn parallel subagents using git worktrees to do the ports independently.\" ></p>\n<p>You can see some of the diffs in the patch branches in my working tree: <a href=\"https://tangled.org/anil.recoil.org/unpac-work/tree/opam%2Fpatches%2Flogs\">patches/logs</a> or <a href=\"https://tangled.org/anil.recoil.org/unpac-work/tree/opam%2Fpatches%2Fcmdliner\">patches/cmdliner</a> or <a href=\"https://tangled.org/anil.recoil.org/unpac-work/tree/opam%2Fpatches%2Fbos\">patches/bos</a> for example. 
Since the agent has a clean local interface to work with, it can keep its commits neatly organised.</p>\n<p>An <code>unpac vendor status</code> command neatly summarises the status of which packages have been patched, and which project they're merged into:</p>\n<pre><code class=\"language-bash\">$ unpac vendor status\nPackage                    Patches   Merged into\n----------------------------------------------------------------------\nangstrom                         0   -\nasn1-combinators                 0   myapp\nastring                          0   -\nbase64                           0   myapp\nbigstringaf                      0   -\nbos                              1   -\nbytesrw                          0   -\nbytesrw-eio                      0   -\nca-certs                         0   -\ncheckseum                        0   -\ncmdliner                         1   tuatara\nconpool                          0   -\ncookeio                          0   -\ncsexp                            0   myapp\ncstruct                          0   -\ndecompress                       0   -\ndigestif                         0   myapp, tuatara\ndomain-local-await               0   -\ndomain-name                      0   myapp\n</code></pre>\n<p>You can see here that <code>cmdliner</code> has been patched and merged into the tuatara\nproject, but <code>bos</code> has been patched and is unmerged. This is all calculated\ninternally via git commands, so there's no separate metadata store to get out\nof sync.</p>\n<p>I haven't completed porting all the third-party packages to use dune just yet,\nbut I've left it running overnight. When that's done, the big feature we gain\nis that a dune build can seamlessly cross-compile binaries since all the OCaml\ncode and C bindings are in one place. This is what <a href=\"https://mirage.io/docs/mirage-4\">MirageOS\n4</a> does, and we can reap the benefits now for\nconventional binaries too. 
Windows builds should also be a lot easier as long\nas the dune rules don't have too many Unixisms.</p>\n<h3 id=\"importing-existing-projects\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#importing-existing-projects\"></a>Importing existing projects</h3>\n<p>The unpac CLI was self-explaining enough that another agent session could import\nan OCaml project by analysing that project and running the sequence of unpac\ncommands.</p>\n<p><img src=\"/images/aoah-unpac-ss-2.webp\" alt=\"%c\" title=\"Importing and vendoring all the code needed for an existing project using Claude\" ></p>\n<p>In order to reduce the load on external git clones, unpac also supports having a local &quot;git branch cache&quot; which pulls remotes just once, and then all unpac invocations pull from that local store. As an experiment over the holidays, I've left a session doing a slow clone of <em>all</em> opam git remotes, to see how well git scales to a few thousand branches.</p>\n<h3 id=\"pushing-the-results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#pushing-the-results\"></a>Pushing the results</h3>\n<p>We do end up with 100s of local branches, and so an <code>unpac push</code> command checks which ones need pushing and takes care of it for you.</p>\n<p><img src=\"/images/aoah-unpac-ss-3.webp\" alt=\"%c\" ></p>\n<p>You can browse one of my working unpac repositories on\n<a href=\"https://tangled.org/anil.recoil.org/unpac-work\">tangled/anil.recoil.org/unpac-work</a>\nto get a sense of the structure.</p>\n<p>I'm still working on the pulling/rebasing functionality, but the basic idea is\nall the same: pull from the outside world into a pristine branch, relocate the\ndirectory, and then have local patch branches.</p>\n<h2 id=\"integrating-ocaml-with-other-languages-in-one-repo\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#integrating-ocaml-with-other-languages-in-one-repo\"></a>Integrating OCaml with other languages in one repo</h2>\n<p>The current unpac focuses on opam, but <a 
href=\"https://ryan.freumh.org\">Ryan Gibb</a> has been leading the research on a <a href=\"/papers/2025-hyperres\">generalised packaging language</a> that can describe package management <em>across</em> ecosystems. Imagine something like this in a future unpac:</p>\n<pre><code class=\"language-toml\"># Works with opam, npm, cargo, pip...\ndependencies = [\n  { source = &quot;opam&quot;, name = &quot;eio&quot;, version = &quot;&gt;=1.0&quot; },\n  { source = &quot;npm&quot;, name = &quot;d3&quot;, version = &quot;^7.0&quot; },\n  { source = &quot;cargo&quot;, name = &quot;tokio&quot;, version = &quot;1.0&quot; },\n]\n</code></pre>\n<p>I've already had need for this last week when I <a href=\"/notes/aoah-2025-13\">vibespiled 50 HTTP libraries</a> across 10 languages into an OCaml implementation. I really want to be able to more easily draw from other language ecosystems, and unpac's git branch model works regardless of the package manager (hence the <code>opam/</code> suffix for <code>vendor/</code> branches).</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p><a href=\"https://x.com/rhatr/status/1012001138110029824\"> <img src=\"/images/aoah-unpac-ss-5.webp\" alt=\"%rc\" title=\"I did not get banned from anything while writing this post\" > </a></p>\n<p>Unpac's branching model actually doesn't work hugely well with GitHub due to the storage limits on an account being hit pretty fast, but it's peachy when used with self-hosted Git services. I'm sure we could do something with Git object alternates as well to improve on this in the future.</p>\n<p>There's quite a lot of work required to make unpac production grade, but I'm astounded by how quickly I could put this prototype together in a day. 
Sketching out CLI tools and cram tests is extraordinarily fun as well, as I could specify my desired user interface and then engage in a Socratic dialogue with the agent to refine the specification.</p>\n<p>I'm also having subversive thoughts now about issue management. I've been a fan for many years of <a href=\"https://github.com/janestreet/iron\">Jane Street's Iron</a> code review system. However, despite having talked to Stephen Weeks and <a href=\"https://github.com/yminsky\">Yaron Minsky</a> extensively about it over the years, I've never found the bandwidth to build an equivalent for open source. But with coding agents being able to interpret natural language alongside code, it seems like a really obvious extension to also store issues within branches as well as code, and to unify our agent context horizon. Something for the 2026 queue!</p>\n<p>I'd love to hear any feedback on unpac's model from other projects. I wouldn't use the tool I've released just yet as it's only about 18 hours old, but I'll work on it more in the new year as well and do a proper release once it's self hosting. 
Many thanks to <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> (who came up with the name) for several design discussions that led to this post.</p>\n<h2 id=\"feedback-25th-dec-2025\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#feedback-25th-dec-2025\"></a>Feedback (25th Dec 2025)</h2>\n<p>Some useful comments about this post:</p>\n<p>First, <a href=\"https://github.com/edwintorok\">Török Edwin</a> from the <a href=\"/projects/xen\">XenServer</a> team <a href=\"https://discuss.systems/@edwintorok/115777291897330349\">reports</a> that:</p>\n<blockquote>\n<p>XAPI has 2 invalid commits in its commit history (invalid email address\ncontaining a duplicate of the author name and email), so it is impossible to\npush its history into a brand new GitHub repo, you can only do that by\nforking the original repo through the API.  Although I see you are not using\nGitHub, so you might be fine. Still might be a good idea to run <code>git fsck</code> on\nthe final repo, XAPI may not be the only project with some invalid commits in\nits history, and that could create problems later (e.g. if <code>tangled</code> starts\nrunning <code>git fsck</code>).</p>\n</blockquote>\n<p>Something to investigate for the new year: antagonistic git remotes! Then, <a href=\"https://mynameismwd.org\">Michael Dales</a> <a href=\"https://toot.mynameismwd.org/@michael/statuses/01KDAB3RCJB7FVZQK17WZTJ767\">notes that</a>:</p>\n<blockquote>\n<p>For any production system I’ve had to deliver, I’d always vendor in third\npart code using git submodules (and those taken from a local mirror of each\nrepo). 
You just can’t use external dependencies if you need to know you can\nship updates to customers at the drop of a hat, and you never know when an\nopen source project might go away.</p>\n</blockquote>\n<p>And then <a href=\"https://dave.recoil.org\">Dave Scott</a> reminded me how much this proposed branch scheme makes it easier for shipping\nproducts to comply with open-source licenses. In <a href=\"/papers/2025-docker-icfp\">Docker for Desktop</a> we have an elaborate <a href=\"https://github.com/moby/vpnkit/tree/master/scripts\">license gathering\nscript</a> for the 'about' box\nto credit contributors.</p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Gibb et al (2025). Solving Package Management via Hypergraph Dependency Resolution. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.10803\" target=\"_blank\"><i>10.48550/arXiv.2506.10803</i></a></li>\n<li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-23",
      "title": "AoAH Day 23: Unpac unifies git branching with package management",
      "summary": "Introducing unpac, a tool that unifies git and package management into a single workflow where all code dependencies live in one repository as trackable branches.",
      "date_published": "2025-12-23T00:00:00.000000Z",
      "date_modified": "2025-12-23T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "ocaml",
        "oxcaml",
        "llms",
        "aoah"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.10803",
          "doi": "10.48550/arXiv.2506.10803",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/c84g4-5zt58",
      "content_html": "<p>There's outrage in the computer science community over a new feature rolled out\nby the ACM Digital Library that generates <a href=\"https://amok.recoil.org/@lindsey@recurse.social/115737104219637066\">often inaccurate</a> AI summaries. To make things worse, this is hidden behind a 'premier' paywall, so authors without access (for\nexample, having graduated from University) can't even see what is being said.</p>\n<h2 id=\"why-are-these-paper-ai-summaries-harmful\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#why-are-these-paper-ai-summaries-harmful\"></a>Why are these paper AI summaries harmful?</h2>\n<p>The summaries themselves are deeply average. Looking at one of my\n<a href=\"/papers/2023-raid-deluminator\">recent papers</a>, it somehow expands a carefully crafted\ntwo paragraph summary into a six paragraph thing that says roughly the same\nthing. This seems like <em>exactly</em> the wrong place to apply LLM technology to as\nit's replacing a carefully peer-reviewed paragraph with a longer slopful version.</p>\n<p><img src=\"/images/acm-slop-1.webp\" alt=\"%c\" title=\"The AI generated summary regresses us to the mean by turning two paragraphs into six.\" ></p>\n<p>The ACM <a href=\"https://www.acm.org/about-acm/mission-vision-values-goals\">stands</a> for the dissemination of <em>knowledge</em> accessibly. I could imagine cases where summarising abstracts would be useful: for example, into <a href=\"https://bhashini.gov.in\">foreign</a> languages for which no such abstract exists, or really nice audio transcriptions for assistive usage. 
However, putting it behind a paywall and distracting from peer-reviewed human-created content is really, really bad.</p>\n<h3 id=\"is-the-acm-trying-to-make-money-from-ai\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#is-the-acm-trying-to-make-money-from-ai\"></a>Is the ACM trying to make money from AI?</h3>\n<p>I dug in a bit deeper to find out more, and discovered this <a href=\"https://dl.acm.org/generative-ai/summarizations\">statement</a>:</p>\n<blockquote>\n<p>Currently, we offer written summaries of individual articles as well as podcast-style audio summaries of conference sessions. We will soon add chat-style interactivity to our content and search functionality. All summaries on the Digital Library are clearly labeled as AI-generated. When citing ACM content, authors should always cite the original article, not an AI-generated summary.</p>\n<p>AI can make mistakes, and it cannot replace the experience of reading an article in full. But we do believe it will help you find, understand and use ACM content both more quickly and more deeply.</p>\n<p>These tools were designed in consultation with a diverse group of Digital Library stakeholders and will continue to evolve as Artificial Intelligence advances. We are continuously tuning our Foundational Model to optimize readability and we conduct regular audits for hallucinations and other errors. We are very interested in your thoughts and suggestions- please leave them by clicking the &quot;Feedback&quot; button on the far right of this page. 
If you find a problem with a specific AI-generated summary, please return to that summary and click the Feedback there.\n<cite><a href=\"https://dl.acm.org/generative-ai/summarizations\">Artificial Intelligence Tools in the ACM Digital Library</a>, undated.</cite></p>\n</blockquote>\n<p>I have many questions here: who is this diverse group of stakeholders, what foundation model is being used, what tuning happened, what audits, and what is happening with the corrections from authors. Are we suddenly using the world's scholars to create a fine-tuning label database without their permission? There's a definite lack of transparency here.</p>\n<p>Luckily, <a href=\"https://www.cs.cmu.edu/~aldrich/\">Jonathan Aldrich</a> is on the ACM Publications Board, which must be a thankless job. He acknowledged this very graciously yesterday during the outrage:</p>\n<blockquote>\n<p>I also owe the community an apology; I was told about this feature (though I'm not sure I was told it was going to be the default view). I should have recognized the potential issues and been loudly critical earlier, before it went live.  But I will do my best to get it fixed now.\n<cite>Jonathan Aldrich, <a href=\"https://social.sigsoft.org/@JonathanAldrich/115732290649035065\">Mastodon sigsocial</a>, 17th Dec 2025</cite></p>\n</blockquote>\n<p>This got me thinking about what the ACM <em>should</em> be doing instead of this. 
Putting these abstracts up represents not only a step in the wrong direction, but also a high opportunity cost of not unlocking some other positive activity that can leverage AI for social good.</p>\n<h2 id=\"how-the-acm-could-do-ai-right\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-the-acm-could-do-ai-right\"></a>How the ACM could do AI right</h2>\n<p>We're at a real crossroads with <a href=\"/notes/uk-national-data-lib\">scientific communication</a> and <a href=\"/notes/rs-future-of-publishing\">scholarly publishing</a>, but I firmly believe that the ACM can correct itself and make a real difference.</p>\n<h3 id=\"less-algorithmically-driven-communication\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#less-algorithmically-driven-communication\"></a>Less algorithmically driven communication</h3>\n<p>Looking through the ACM digital library footer, I see news channels using <a href=\"https://x.com/acmdl\">X</a>, <a href=\"https://www.linkedin.com/company/association-for-computing-machinery/\">LinkedIn</a> and <a href=\"https://www.facebook.com/AssociationForComputingMachinery/\">Facebook</a>. The only open protocol listed is <a href=\"mailto:dl-team@hq.acm.org\">email</a>, although I did discover an (unlisted on the ACM website) <a href=\"https://bsky.app/profile/acm.org\">Bluesky</a> account.</p>\n<p>None of these platforms are conducive to longform, thoughtful community conversations. 
Let's look at ACM's mission statement:</p>\n<blockquote>\n<p>ACM is a global scientific and educational organization dedicated to advancing the art, science, engineering, and application of computing, serving both professional and public interests by fostering the open exchange of information and by promoting the highest professional and ethical standards.\n<cite>-- <a href=\"https://www.acm.org/about-acm/mission-vision-values-goals\">ACM's Mission, Vision, Core Values and Goals</a>, 2025</cite></p>\n</blockquote>\n<p>The platforms the ACM has chosen for communicating with the scholarly community are <a href=\"https://doi.org/10.1145/1226736.1226740\">algorithmic engagement factories</a>. There are countless papers on the ACM Digital Library itself recording the <a href=\"https://dl.acm.org/doi/abs/10.1145/3600211.3604673\">harms</a> and <a href=\"https://dl.acm.org/doi/10.1145/3543507.3583857\">spread</a> of <a href=\"https://dl.acm.org/doi/10.1145/3686967\">misinformation</a>.</p>\n<blockquote>\n<p>While early ads were found to be effective in creating brand awareness and positive attitudes, recent Internet advertising has been described as nonsensical, uninformative, forgettable, ineffective, and intrusive.\n<cite>-- <a href=\"https://cacm.acm.org/practice/the-effects-of-online-advertising/\">The Effects of Online Advertising</a>, Communications of the ACM, 2007</cite></p>\n</blockquote>\n<p>Instead, the ACM should focus on using models that encourage scholarly discourse, such as standards-based mechanisms like an Atom/RSS feed for their news (which can be consumed widely and accessibly), and consider increasing engagement on non-advertising-driven platforms such as Bluesky. A <a href=\"https://www.nature.com/articles/d41586-025-00177-1\">poll at the start of the year</a> from Nature revealed that 70% of respondents use that platform. 
I suspect it's <a href=\"https://arxiv.org/abs/2507.18840\">less</a> for computer science, but the ACM setting a direction would go a long way towards guiding the community.</p>\n<p><em>(Update 22nd Dec 2025: <a href=\"https://mastodon.acm.org/@stefanwagner\">Stefan Wagner</a> and <a href=\"https://amok.recoil.org/@vzaliva@mastodon.acm.org\">Vadim Zaliva</a> both <a href=\"https://amok.recoil.org/@vzaliva@mastodon.acm.org/115747577961996010\">point</a> <a href=\"https://amok.recoil.org/@stefanwagner@mastodon.acm.org/115750600759054994\">out</a> that the ACM already runs a Mastodon instance, which is easily <a href=\"https://fed.brid.gy/\">bridged</a> to Bluesky. A fine alternative!)</em></p>\n<p><a href=\"https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fdl.acm.org\"> <img src=\"/images/acm-slop-3.webp\" alt=\"%c\" title=\"The W3C Atom Feed Validator doesn't get very far with the ACM Digital Library\" > </a></p>\n<h3 id=\"make-papers-easier-to-download\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#make-papers-easier-to-download\"></a>Make papers easier to download</h3>\n<p>I've been working on <a href=\"/notes/principles-for-collective-knowledge\">collective knowledge principles</a> to boost the <a href=\"/projects/ce\">conservation evidence</a> project. As part of this process, I've downloaded tens of millions of fulltext papers to help figure out where living things are on the planet. By far the most difficult task here was <em>getting access to even the open papers</em>. At the recent <a href=\"/notes/coar-prc\">COAR meetup</a>, half the conversations were around the <a href=\"/notes/uk-national-data-lib\">difficulty of obtaining knowledge</a> even before curation.</p>\n<p>Incredibly, while just browsing around the ACM DL in order to research this article, I got blocked from the entire library. 
This was after opening about 10 browser tabs: not an unusual amount of human traffic!</p>\n<p><img src=\"/images/acm-slop-2.webp\" alt=\"%c\" title=\"I'm still blocked an hour later, so I guess I won't be doing any computer science research for the rest of the day. Pub, anyone?\" ></p>\n<p>Contrast this with the <a href=\"https://plos.org\">Public Library of Science</a> (PLOS), which has an <a href=\"https://github.com/PLOS/allofplos\">allofplos</a> repository that allows me to download the <em>entire</em> fulltext paper repository by running a single Python command: <code>pipenv run python -m allofplos.update</code>. The script not only downloads papers, but does a bunch of important bookkeeping:</p>\n<blockquote>\n<p>The script:</p>\n<ul>\n<li>checks for and then downloads to a temporary folder individual new articles that have been published</li>\n<li>of those new articles, checks whether they are corrections (and whether the linked corrected article has been updated)</li>\n<li>checks whether there are VORs (Versions of Record) for uncorrected proofs in the main articles directory and downloads those</li>\n<li>checks whether the newly downloaded articles are uncorrected proofs or not after all of these checks, it moves the new articles into the main articles folder.\n<cite>-- <a href=\"https://github.com/PLOS/allofplos\">AllOfPLOS</a> README, 2016</cite></li>\n</ul>\n</blockquote>\n<p>PLOS has been doing this for years, so why hasn't the ACM done this yet for its open access papers? 
I applaud the ACM's <a href=\"https://www.acm.org/publications/openaccess\">recent shift to open access by default</a> but this is pointless if not accompanied by an investment in the <em>dissemination</em> of knowledge.</p>\n<h3 id=\"build-a-provenance-defence-against-fake-papers\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#build-a-provenance-defence-against-fake-papers\"></a>Build a provenance defence against fake papers</h3>\n<p>One of the most exciting things about Bluesky is that it allows for the <a href=\"/notes/disentangling-git-with-bluesky\">reuse of the identity layer</a> to build other services. Right now we are seeing that <a href=\"/notes/ai-poisoning\">AI poisoning of the literature</a> is upending all kinds of evidence-driven social norms, and is a huge threat to rational policymaking for all of society.</p>\n<p>The ACM is uniquely positioned in computer science as the body that could build a reasonable reputation network that not only identifies academics, but also <a href=\"/notes/principles-for-collective-knowledge\">enforces provenance tracking</a> of whether papers and artefacts did in fact follow a rigorous methodology. LLMs are now amazingly good at constructing <a href=\"/notes/ai-contamination-of-papers\">fake papers</a>, and so capturing the peer review process and building up a defence against &quot;knowledge from thin air&quot; will be one of the great challenges for the remainder of this decade.</p>\n<p><a href=\"https://rdcu.be/evkfj\"> <img src=\"/images/davidparkins-ai-poison.webp\" alt=\"%c\" title=\"AI poisoning the literature in a legendary cartoon. 
Credit: David Parkins, Nature\" > </a></p>\n<h3 id=\"agentic-ai-is-here-to-stay-so-deal-with-it-on-our-terms\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#agentic-ai-is-here-to-stay-so-deal-with-it-on-our-terms\"></a>Agentic AI is here to stay, so deal with it on our terms</h3>\n<p>My <a href=\"/notes/aoah-2025\">December adventures in agentic programming</a> have been eye-opening in showing just how quickly I can build hyper-personalised views on a large body of knowledge.  While many computer science scholars tend to view LLMs skeptically, there <em>is</em> a good use for agentic summaries of papers: allowing readers to summarise papers directly for themselves while supplying the LLM with other information about <a href=\"/notes/aoah-2025-16\">what they already know</a>.</p>\n<p>Bryan Cantrill explained this best in his principles for <a href=\"https://rfd.shared.oxide.computer/rfd/0576\">LLM Usage at Oxide</a>. He separates out using LLMs for reading, writing and coding. I totally agree with him in detesting people sending me LLM-generated writing, but he teased out a good explanation as to why:</p>\n<blockquote>\n<p>LLM-generated prose undermines a social contract of sorts: absent LLMs, it is presumed that of the reader and the writer, it is the writer that has undertaken the greater intellectual exertion. (That is, it is more work to write than to read!) For the reader, this is important: should they struggle with an idea, they can reasonably assume that the writer themselves understands it — and it is the least a reader can do to labor to make sense of it.<br>\n<cite>-- <a href=\"https://rfd.shared.oxide.computer/rfd/0576\">Using LLMs at Oxide</a>, RFD0576, Dec 2025</cite></p>\n</blockquote>\n<p>And that, dear reader, is why the ACM redistributing AI summaries is a bad idea. It breaks the social contract with the reader that the ACM Digital Library is a source of truths which the scholars who contributed to it do understand. 
We might not agree with everything on the library, but it's <em>massively</em> dilutive to have to sift through AI-generated writing to get to the original bits.</p>\n<p>If the ACM itself deliberately introduces errors into its own library, that's a massive self-own.\nInstead, if the ACM Library exposed a <a href=\"https://llmstxt.org\">simple text-based</a> interface that allowed my agents to do the paper summaries <em>just for me</em>, then that personalisation would make it useful. I find <a href=\"https://github.com/nickscamara/open-deep-research\">deep research agents</a> surprisingly useful when exploring a new field, but primarily because I can guide their explorations with my own personal research intuition, not someone else's.</p>\n<h2 id=\"my-appeal-to-the-acm-dont-regress-to-the-mean\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#my-appeal-to-the-acm-dont-regress-to-the-mean\"></a>My appeal to the ACM: don't regress to the mean</h2>\n<p>My appeal to the ACM is to not try to build differentiated paid services using AI. Let the rest of the profit-driven world do that and peddle their slop; at least they are earning money while doing so (hopefully)! The ACM needs to be a force for <a href=\"https://www.cl.cam.ac.uk/archive/rja14/unauthorised.html\">creative disruption</a> and discovery, and help defend and nurture the joys inherent in computer science research.</p>\n<p>This means taking a critical view of how AI is impacting all aspects of our society, but not just rolling out bland services: instead, deploy AI technologies that enhance the human condition and allow us to be even more inquisitive with our time on earth. 
The recent <a href=\"/notes/ai-for-science-2024\">Royal Society meeting</a> on <a href=\"https://royalsociety.org/-/media/policy/projects/science-in-the-age-of-ai/science-in-the-age-of-ai-report.pdf\">Science in the Age of AI</a> put it very well:</p>\n<blockquote>\n<p>A growing body of irreproducible studies are raising concerns regarding the robustness of AI-based discoveries. The black-box and non-transparent nature of AI systems creates challenges for verification and external scrutiny. Furthermore, its widespread but inequitable adoption raises ethical questions regarding its environmental and societal impact. Yet, ongoing advancements in making AI systems more transparent and ethically aligned hold the promise of overcoming these challenges.\n<cite><a href=\"https://royalsociety.org/-/media/policy/projects/science-in-the-age-of-ai/science-in-the-age-of-ai-report.pdf\">Science in the Age of AI</a>, Royal Society, 2024</cite></p>\n</blockquote>\n<p>This is a well balanced view, I feel. There are huge challenges ahead of us, but also huge opportunities for new discoveries!</p>\n<p>In the meanwhile, I remain blocked from the ACM Digital Library for unknown\nreasons, so I guess it's time to start the Pembroke Christmas feasting a few\nhours early! Anyone want to head down to the pub now?</p><h1>References</h1><ul><li>Madhavapeddy (2025). Fake papers abound in the literature. <a href=\"https://doi.org/10.59350/qmsqz-ark89\" target=\"_blank\"><i>10.59350/qmsqz-ark89</i></a></li>\n<li>Madhavapeddy (2025). Royal Society's Future of Scientific Publishing meeting. <a href=\"https://doi.org/10.59350/nmcab-py710\" target=\"_blank\"><i>10.59350/nmcab-py710</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Madhavapeddy (2025). Publish, Review, Curate to upend scholarly publishing. 
<a href=\"https://doi.org/10.59350/fpc9w-ccj82\" target=\"_blank\"><i>10.59350/fpc9w-ccj82</i></a></li>\n<li>Tarkhani et al (2023). Information Flow Tracking for Heterogeneous Compartmentalized Software. ACM. <a href=\"https://doi.org/10.1145/3607199.3607235\" target=\"_blank\"><i>10.1145/3607199.3607235</i></a></li>\n<li>Madhavapeddy (2025). Is AI poisoning the scientific literature? Our comment in Nature. <a href=\"https://doi.org/10.59350/pbxew-d2j78\" target=\"_blank\"><i>10.59350/pbxew-d2j78</i></a></li>\n<li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li>\n<li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li>\n<li>Madhavapeddy (2024). Royal Society and DeepMind host AI for Science Forum. <a href=\"https://doi.org/10.59350/0znpc-fw825\" target=\"_blank\"><i>10.59350/0znpc-fw825</i></a></li>\n<li>McCoy et al (2007). The effects of online advertising. Commun. ACM. <a href=\"https://doi.org/10.1145/1226736.1226740\" target=\"_blank\"><i>10.1145/1226736.1226740</i></a></li>\n<li>Drolsbach et al (2023). Believability and Harmfulness Shape the Virality of Misleading Social Media Posts. Proceedings of the ACM Web Conference 2023. <a href=\"https://doi.org/10.1145/3543507.3583857\" target=\"_blank\"><i>10.1145/3543507.3583857</i></a></li>\n<li>Chuai et al (2024). Did the Roll-Out of Community Notes Reduce Engagement With Misinformation on X/Twitter?. Proc. ACM Hum.-Comput. Interact.. <a href=\"https://doi.org/10.1145/3686967\" target=\"_blank\"><i>10.1145/3686967</i></a></li>\n<li>Biever (2025). Bluesky’s science takeover: 70% of Nature poll respondents use platform. Nature. 
<a href=\"https://doi.org/10.1038/d41586-025-00177-1\" target=\"_blank\"><i>10.1038/d41586-025-00177-1</i></a></li>\n<li>Zheng et al (2025). How is science discussed on Bluesky?. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2507.18840\" target=\"_blank\"><i>10.48550/arXiv.2507.18840</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/acm-ai-recs",
      "title": "Dear ACM, you're doing AI wrong but you can still get it right",
      "summary": "Critiquing ACM's paywalled AI paper summaries and proposing better alternatives like open feeds, easier downloads, provenance tracking, and personalised agentic interfaces.",
      "date_published": "2025-12-18T00:00:00.000000Z",
      "date_modified": "2025-12-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "policy",
        "publishing"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qmsqz-ark89",
          "doi": "10.59350/qmsqz-ark89",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nmcab-py710",
          "doi": "10.59350/nmcab-py710",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fpc9w-ccj82",
          "doi": "10.59350/fpc9w-ccj82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3607199.3607235",
          "doi": "10.1145/3607199.3607235",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/pbxew-d2j78",
          "doi": "10.59350/pbxew-d2j78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0znpc-fw825",
          "doi": "10.59350/0znpc-fw825",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/1226736.1226740",
          "doi": "10.1145/1226736.1226740",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3543507.3583857",
          "doi": "10.1145/3543507.3583857",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3686967",
          "doi": "10.1145/3686967",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-00177-1",
          "doi": "10.1038/d41586-025-00177-1",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2507.18840",
          "doi": "10.48550/arXiv.2507.18840",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-22",
      "content_html": "<p>Over the past three weeks, I've accumulated dozens of OCaml repositories as\npart of this <a href=\"/notes/aoah-2025\">series</a>. Keeping them coordinated has become a real challenge;\nwhen I fix something in one library, dependent packages need updating, and\nagents working on one repo have no visibility into related code.  Ideally,\nI could have all my code in one place and see what agents can do with a lot of local context.</p>\n<p>Today I'm switching tacks to address this with a monorepo workflow built around dune's\n<a href=\"https://www.dra27.uk/blog/platform/2018/08/15/dune-vendoring.html\">excellent vendoring support</a>.\nI last visited this when building <a href=\"/papers/rwo\">RWOv2</a> and its <a href=\"https://github.com/realworldocaml/book\">monorepo</a>, for which I built a <a href=\"https://github.com/tarides/opam-monorepo\">duniverse</a> tool that turned into the <a href=\"https://github.com/tarides/opam-monorepo\">opam-monorepo</a> plugin that <a href=\"https://mirage.io\">MirageOS</a>\nnow uses. Let's see what happens in today's agentic world instead!</p>\n<p>I also wanted to explore the small group dynamic around vibecoding tools. For today's tool, I first asked <a href=\"https://www.tunbury.org/\">Mark Elvers</a> to spend a few hours sketching out the sort of tool he might want, while <a href=\"https://jon.recoil.org\">Jon Ludlam</a> has been <a href=\"https://jon.recoil.org/blog/2025/12/claude-and-dune.html\">using Claude</a> to build up <a href=\"https://github.com/ocaml/dune/pull/12995\">complex odocv3 rules</a>. The way we work together with agentic code is quite different from when we've handcrafted a project, with the code itself now being more throwaway as we pass the baton among each other. I'm lightheartedly calling this 'vibrating' amongst each other to reflect the new speed of agentic iterations, and to differentiate from the more thoughtful process of pair programming. 
Today's tool <strong><a href=\"https://tangled.org/anil.recoil.org/repo-tool\">monopam</a></strong> helps to manage OCaml monorepos for cross-cutting code and documentation.</p>\n<h2 id=\"the-git-repo-coordination-problem\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-git-repo-coordination-problem\"></a>The Git repo coordination problem</h2>\n<p>The OCaml libraries I've built are designed to be standalone, but obviously have interdependencies among each other. <a href=\"/notes/aoah-2025-13\">Requests</a> depends on <a href=\"/notes/aoah-2025-12\">conpool</a> for connection management and HTTP cookie logic from <a href=\"/notes/aoah-2025-11\">cookeio</a>. The codec libraries like <a href=\"/notes/aoah-2025-7\">yamlt</a>, <a href=\"/notes/aoah-2025-18\">tomlt</a>, and <a href=\"/notes/aoah-2025-19\">init</a> all have optional dependencies on bytesrw for serialisation. Meanwhile, <a href=\"/notes/aoah-2025-15\">html5rw</a> depends on <a href=\"/notes/aoah-2025-20\">langdetect</a> and has optional dependencies on the wasm and JavaScript compiler stack.</p>\n<p>Here's the full inventory of what I've built in the last few weeks:</p>\n<div role=\"region\"><table>\n<tr>\n<th>Day</th>\n<th>Library</th>\n<th>Description</th>\n</tr>\n<tr>\n<td>1</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-crockford\">ocaml-crockford</a></td>\n<td>Crockford Base32 encoding</td>\n</tr>\n<tr>\n<td>2</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed\">ocaml-jsonfeed</a></td>\n<td>JSONFeed 1.1 implementation</td>\n</tr>\n<tr>\n<td>3</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/xdge\">xdge</a></td>\n<td>XDG directories with Eio capabilities</td>\n</tr>\n<tr>\n<td>4</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/claudeio\">claudeio</a></td>\n<td>Claude OCaml/Eio SDK</td>\n</tr>\n<tr>\n<td>5</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-bytesrw-eio\">ocaml-bytesrw-eio</a></td>\n<td>Bytesrw/Eio 
adapter</td>\n</tr>\n<tr>\n<td>6</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlrw\">ocaml-yamlrw</a></td>\n<td>Pure OCaml Yaml 1.2 parser</td>\n</tr>\n<tr>\n<td>7</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlt\">ocaml-yamlt</a></td>\n<td>jsont codecs for Yaml</td>\n</tr>\n<tr>\n<td>8</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/sortal\">sortal</a></td>\n<td>Contacts management CLI</td>\n</tr>\n<tr>\n<td>11</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-punycode\">ocaml-punycode</a></td>\n<td>Punycode RFC3492 implementation</td>\n</tr>\n<tr>\n<td>11</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-publicsuffix\">ocaml-publicsuffix</a></td>\n<td>Public suffix list for cookies</td>\n</tr>\n<tr>\n<td>11</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-cookeio\">ocaml-cookeio</a></td>\n<td>HTTP cookie handling</td>\n</tr>\n<tr>\n<td>12</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-conpool\">ocaml-conpool</a></td>\n<td>TCP/TLS connection pooling</td>\n</tr>\n<tr>\n<td>13</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-requests\">ocaml-requests</a></td>\n<td>HTTP client library</td>\n</tr>\n<tr>\n<td>14</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-karakeep\">ocaml-karakeep</a></td>\n<td>Karakeep bookmark API</td>\n</tr>\n<tr>\n<td>15</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw\">ocaml-html5rw</a></td>\n<td>HTML5 parser and validator</td>\n</tr>\n<tr>\n<td>16</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-json-pointer\">ocaml-json-pointer</a></td>\n<td>JSON Pointer RFC6901</td>\n</tr>\n<tr>\n<td>16</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/odoc-xo\">odoc-xo</a></td>\n<td>odoc extras for notebooks</td>\n</tr>\n<tr>\n<td>17</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap\">ocaml-jmap</a></td>\n<td>JMAP email client</td>\n</tr>\n<tr>\n<td>18</td>\n<td><a 
href=\"https://tangled.org/anil.recoil.org/ocaml-tomlt\">ocaml-tomlt</a></td>\n<td>TOML 1.1 codecs</td>\n</tr>\n<tr>\n<td>19</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-zulip\">ocaml-zulip</a></td>\n<td>Zulip bot framework</td>\n</tr>\n<tr>\n<td>19</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-init\">ocaml-init</a></td>\n<td>INI file codecs</td>\n</tr>\n<tr>\n<td>20</td>\n<td><a href=\"https://tangled.org/anil.recoil.org/ocaml-langdetect\">ocaml-langdetect</a></td>\n<td>Language detection</td>\n</tr>\n</table></div><p>And the Claude skills I've developed along the way:</p>\n<div role=\"region\"><table>\n<tr>\n<th>Skill</th>\n<th>Purpose</th>\n</tr>\n<tr>\n<td><a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-metadata\">claude-ocaml-metadata</a></td>\n<td>Automate opam package setup</td>\n</tr>\n<tr>\n<td><a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-internet-rfc\">claude-ocaml-internet-rfc</a></td>\n<td>Fetch and integrate IETF RFCs</td>\n</tr>\n<tr>\n<td><a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-tidy-code\">claude-ocaml-tidy-code</a></td>\n<td>Refactor generated OCaml</td>\n</tr>\n<tr>\n<td><a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-to-npm\">claude-ocaml-to-npm</a></td>\n<td>Publish js_of_ocaml to NPM</td>\n</tr>\n</table></div><p>So far, I've been publishing each of these as individual Git repositories, but maintaining an <a href=\"https://tangled.org/anil.recoil.org/aoah-opam-repo\">overlay opam repo</a> that a user can add to gain access to consistent metadata that makes the dev packages installable. 
Unfortunately, incorrect interdependencies are already creeping in; <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> asked me today why my yamlt library depends on webassembly, and I'm sure it shouldn't -- I've clearly got a stray or missing dependency somewhere in my metadata.</p>\n<p>When an agent works on just one repository, it has no visibility into\nhow changes might benefit (or break) dependent code. It also can't make fixes\nfor documentation <em>across</em> repositories. I noticed quite often in the past month that I was\ncloning source packages temporarily into my workspace for the agent to access, and then\ndeleting them.  All this motivates me to investigate alternatives to having lots of small git repos for my day-to-day agentic development.</p>\n<h2 id=\"dunes-vendoring-is-amazing-for-monorepos\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#dunes-vendoring-is-amazing-for-monorepos\"></a>Dune's vendoring is amazing for monorepos</h2>\n<p>Dune has a fantastic but underappreciated feature: it automatically discovers and builds any OCaml code in subdirectories. As <a href=\"https://www.dra27.uk\">David Allsopp</a> explained <a href=\"https://www.dra27.uk/blog/platform/2018/08/15/dune-vendoring.html\">back in 2018</a>, you can simply clone dependencies into your project tree and dune will build them together.</p>\n<p>Note that this only works if all the packages contain dune files. Since OCaml is all about choice, there's no hard mandate to use one build tool: it's perfectly fine to use <a href=\"https://github.com/ocaml/ocamlbuild\">ocamlbuild</a> or Makefiles, as long as your libraries install a <a href=\"https://dune.readthedocs.io/en/stable/reference/findlib.html\">findlib META file</a>. 
Dune will also gain support for <a href=\"https://dune.readthedocs.io/en/latest/explanation/package-management.html\">opam package installation</a> next year to help make this even easier.</p>\n<p>Years ago I built a tool called <a href=\"https://github.com/ocamllabs/duniverse\">duniverse</a> to automate this vendoring workflow. It worked, but required a lot of manual repository management. With agents now doing the heavy lifting, though, I thought it might be easier and so decided to revisit it.</p>\n<p>Today's work ended up extending <a href=\"https://www.tunbury.org/\">Mark Elvers</a>'s initial foray into monorepos to release <strong><a href=\"https://tangled.org/anil.recoil.org/repo-tool\">monopam</a></strong>: a little CLI tool that reads opam metadata from a local repository (like <a href=\"https://tangled.org/anil.recoil.org/aoah-opam-repo\">aoah-opam-repo</a>), resolves the dependency graph, materialises the sources as git submodules, and produces a single dune workspace that builds everything together. For now, it depends on an opam local switch to work, but if someone wants to try it with dune package management I'd love to hear how it goes.</p>\n<h2 id=\"materialising-aoah-opam-repo\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#materialising-aoah-opam-repo\"></a>Materialising aoah-opam-repo</h2>\n<p>The <a href=\"https://tangled.org/anil.recoil.org/aoah-opam-repo\">aoah-opam-repo</a> contains all the packages I've built during this series, maintained using the <a href=\"/notes/aoah-2025-5\">opam metadata skill</a>. 
Let's turn it into a unified source tree using monopam:</p>\n<pre><code class=\"language-bash\">$ monopam --opam-overlay aoah-opam-repo -o aoah-vendor --submodules\nScanning opam overlay at aoah-opam-repo\nFound 21 repositories to process\nInitialized empty Git repository in aoah-vendor/.git/\nUsing git submodules for vendor dependencies\nCloning into ...\nremote: Enumerating objects: 14, done.\nremote: Counting objects: 100% (14/14), done.\nremote: Compressing objects: 100% (11/11), done.\nremote: Total 14 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)\n# ...etc\nOutput written to aoah-vendor\n  opam-repository/ - opam package definitions\n  vendor/          - source code\n  setup.sh         - run to pin packages and install deps\n</code></pre>\n<p>This solves the opam constraints, finds a cut of dependencies, and then git\nsubmodule adds the lot of them into my target repository. At this point, we run\n<code>setup.sh</code> which creates an opam local switch and then <code>dune build</code> just works\nusing all the locally cloned repos.</p>\n<pre><code>.\n├── _opam\n├── dune\n├── dune-project\n├── opam-repository\n│   ├── packages\n│   └── repo\n└── vendor\n    ├── dune\n    ├── ocaml-bytesrw-eio\n    ├── ocaml-claudeio\n    ├── ocaml-conpool\n    ├── ocaml-cookeio\n    ├── ocaml-crockford\n    ├── ocaml-html5rw\n    ├── ocaml-init\n    ├── ocaml-json-pointer\n    ├── ocaml-karakeep\n    ├── ocaml-langdetect\n    ├── ocaml-publicsuffix\n    ├── ocaml-punycode\n    ├── ocaml-requests\n    ├── ocaml-tomlt\n    ├── ocaml-yamlrw\n    ├── ocaml-yamlt\n    ├── odoc-xo\n    └── xdge\n</code></pre>\n<p>The directory structure is straightforward: we have our opam repository, a local switch\nand the source code all in one place now, and buildable in a single dune\ninvocation.</p>\n<h2 id=\"cross-cutting-fixes-with-agents\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#cross-cutting-fixes-with-agents\"></a>Cross-cutting fixes with agents</h2>\n<p>With all the 
code in one place, agents can now spot opportunities that span\nmultiple packages. The first thing I did was to build a full documentation set\nacross all my packages.</p>\n<p><img src=\"/images/aoah-monopam-ss-1.webp\" alt=\"%c\" ></p>\n<h3 id=\"building-unified-documentation-with-odoc3\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#building-unified-documentation-with-odoc3\"></a>Building unified documentation with odoc3</h3>\n<p><a href=\"https://jon.recoil.org\">Jon Ludlam</a> has been doing <a href=\"https://jon.recoil.org/blog/2025/12/claude-and-dune.html\">excellent work</a> on odoc3, the modern documentation generator for OCaml. odoc is a composable documentation generator that has a number of <a href=\"https://ocaml.github.io/odoc/odoc/driver.html\">mini-commands</a> that can be called in sequence to build fragments of HTML. Jon's been adding support into dune build rules to build a fully cross-referenced documentation site across an entire dune workspace.</p>\n<p>This is where the monorepo approach obviously shines, since we could generate a single site for all my code with types linking directly to their definitions across opam packages. The <a href=\"/notes/aoah-2025-16\">interactive notebooks</a> I built earlier could reference any type across the whole codebase.</p>\n<p>I first pinned Jon's odoc branch and then built the unified docs with the right rules.</p>\n<pre><code class=\"language-bash\">$ opam pin add dune https://github.com/jonludlam/dune.git#odoc-v3-rules\n$ dune build @doc\n$ open _build/default/_doc/_html/index.html\n</code></pre>\n<p>This generated a working doc page, that also included cross-referenced links\n<em>across</em> packages. But even more cool is that if a package doesn't exist in the\nlocal monorepo, it also does a best-effort link straight to the central doc\nrepository on OCaml.org.</p>\n<p>There were a few integration issues that may be bugs in the dune rules. 
For instance:</p>\n<pre><code>&gt; dune build @doc\nFile &quot;/Users/avsm/src/git/knot/aoah-vendor3/_opam/lib/angstrom/META&quot;, line 1, characters 0-0:\nError: Library &quot;angstrom-unix&quot; not found.\n-&gt; required by library &quot;angstrom.unix&quot; in\n   /Users/avsm/src/git/knot/aoah-vendor3/_opam/lib/angstrom\n-&gt; required by alias vendor/ocaml-karakeep/doc\n</code></pre>\n<p>This is a package that's present in the local tree, but not installed in opam.\nAfter I opam installed it, the doc generation worked. This probably shouldn't\nbreak a local docs build, so I commented on the GitHub PR.</p>\n<p>After this, there were genuine bugs in my own documentation, as evidenced by\nwarnings emitted by dune. The agents fixed problems and added cross-references across\nthe documentation, and I could do a single <code>git status</code> to see all the affected\npackages.</p>\n<pre><code>Changes not staged for commit:\n  (use &quot;git add &lt;file&gt;...&quot; to update what will be committed)\n  (use &quot;git restore &lt;file&gt;...&quot; to discard changes in working directory)\n  (commit or discard the untracked or modified content in submodules)\n        modified:   vendor/ocaml-claudeio (modified content)\n        modified:   vendor/ocaml-init (modified content)\n        modified:   vendor/ocaml-json-pointer (modified content)\n        modified:   vendor/ocaml-requests (modified content)\n        modified:   vendor/ocaml-tomlt (modified content)\n        modified:   vendor/ocaml-yamlrw (modified content)\n        modified:   vendor/ocaml-yamlt (modified content)\n        modified:   vendor/xdge (modified content)\n</code></pre>\n<h3 id=\"code-fixing-across-packages\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#code-fixing-across-packages\"></a>Code fixing across packages</h3>\n<p>I then prompted the agents to find opportunities for optimisation <em>across</em>\nall the packages. 
Running this in a fixpoint ended up allowing for backwards\nand forwards cross-references: for example, it could add &quot;related libraries&quot;\nsections, and also normalise error handling and logging interfaces where\nthere were inconsistencies.</p>\n<p><img src=\"/images/aoah-monopam-ss-2.webp\" alt=\"%c\" title=\"Docs fixes from the agent across repositories\" >\n<img src=\"/images/aoah-monopam-ss-3.webp\" alt=\"%c\" title=\"And similarly, interface fixes work just as well\" ></p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>Once I had a consistent monorepo, I could commit the changes and distribute\na batch easily. For example, I uploaded my <a href=\"https://tangled.org/anil.recoil.org/monopam-odocv3-dune-test\">test odocv3 monorepo</a>\nand commented on <a href=\"https://github.com/ocaml/dune/pull/12995\">ocaml/dune#12995</a>.</p>\n<p>On the other hand, monopam's git submodule workflow is awkward to use due to how separate\nsubmodules are from the main git repository. I had to individually commit and push each of the\nchanges, and I couldn't get a unified git diff or make commits <em>across</em> the vendored\nrepositories.  I have a scheme in mind to improve this, which is a topic for tomorrow's post!</p>\n<p>Socially speaking, I'm reasonably convinced a monorepo workflow of <em>some</em> sort is the\nfuture for agentic coding. Agents just work so much better with local tool calls that can\nrapidly scan a lot of data instead of making remote calls (which are awkward from a permissions\nperspective as well). We'll still need to figure out how 'vibrating' patches\nacross each other will work; it's early days for the dynamics of agentic pair programming.</p><h1>References</h1><ul><li>Madhavapeddy et al (2022). Real World OCaml: Functional Programming for the Masses. Cambridge University Press. 
<a href=\"https://doi.org/10.1017/9781009129220\" target=\"_blank\"><i>10.1017/9781009129220</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-22",
      "title": "AoAH Day 22: Assembling monorepos for agentic OCaml development",
      "summary": "Materialising opam metadata into git submodules and monorepos, enabling cross-cutting fixes and unified odoc3 documentation across dozens of OCaml libraries.",
      "date_published": "2025-12-22T00:00:00.000000Z",
      "date_modified": "2025-12-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "ocaml",
        "oxcaml",
        "llms",
        "aoah"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1017/9781009129220",
          "doi": "10.1017/9781009129220",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-21",
      "content_html": "<p>With <a href=\"/notes/aoah-2025-20\">language detection</a> now working in OCaml, I completed vibespiling the <a href=\"https://validator.github.io/validator/\">Nu HTML Validator</a> from Java to OCaml. This is the official <a href=\"https://validator.w3.org\">W3C validator</a> used to check HTML5 conformance, and it's a substantial codebase with thousands of validation rules.  I set out to see what a few <em>days</em> of agentic processing would do to transform the complex Java codebase into a more functionally structured pure OCaml codebase.</p>\n<p>The result is a pure OCaml HTML5 conformance checker that integrates with the <a href=\"/notes/aoah-2025-15\">parser</a> I built last week, all published as <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw\">ocaml-html5rw</a></strong>. Having the logic in pure OCaml meant that I could <em>also</em> <a href=\"/notes/aoah-2025-20\">compile it</a> into standalone JavaScript and WASM.  Dynamic conformance checking works even better than server-side filtering since live JavaScript executing on the page (and modifying the DOM) can <em>also</em> be checked. 
I <a href=\"https://www.npmjs.com/package/html5rw-jsoo\">published</a> this to NPM using a <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-to-npm/blob/main/SKILL.md\">new Claude skill</a>, and coded a live panel overlay to debug HTML5 issues that I use on my own website now.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/html5rw-validate/\"> <img src=\"/images/aoah-html5v-ss-1.webp\" alt=\"%c\" title=\"My conformance checker now runs the OCaml straight in the browser on my dev website and highlights errors along with explanations.\" > </a></p>\n<h2 id=\"full-html5-validation-in-ocamljavascriptwasm\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#full-html5-validation-in-ocamljavascriptwasm\"></a>Full HTML5 validation in OCaml/JavaScript/WASM</h2>\n<p>I'm going to talk about my results in reverse today, since I thought the outcomes were so unexpectedly useful.  I took yesterday's <a href=\"/notes/aoah-2025-20\">session</a> and asked Claude to build an <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-to-npm/blob/main/SKILL.md\">ocaml-to-npm</a> skill, which I used to publish the html5rw JavaScript and wasm <a href=\"https://www.npmjs.com/package/html5rw-jsoo\">to npm</a>.</p>\n<p>This runs the HTML5 validation OCaml code by serialising the live DOM tree and then collecting the various validation errors <em>along with the source</em>. This is sufficient to populate an overlay panel that can not only list the errors, but also highlight the offending DOM node with a red box. This spotted lots of dynamic errors on my website!</p>\n<p>Publishing on NPM is quite convenient as there are several CDNs that serve the JavaScript directly. 
I integrate this into the development version of my blog as simply as:</p>\n<pre><code>&lt;script src=&quot;https://cdn.jsdelivr.net/npm/html5rw-jsoo@1.0.0/htmlrw.js&quot;&gt;&lt;/script&gt;\n&lt;script&gt;\nfunction validateWithPanel() {\n  const result = html5rw.validateAndShowPanel(document.documentElement, {\n    // Annotation options\n    annotation: {\n      addDataAttrs: true,\n      addClasses: true,\n      showTooltips: true,\n      tooltipPosition: 'auto',\n      highlightOnHover: true\n    },\n    // Panel options\n    panel: {\n      initialPosition: 'topRight',\n      draggable: true,\n      collapsible: true,\n      groupBySeverity: true,\n      clickToHighlight: true,\n      showSelectorPath: true,\n      theme: 'auto'\n    }\n  });\n}\n&lt;/script&gt;\n</code></pre>\n<p>I'll probably wrap this in a <a href=\"https://webcomponents.org\">webcomponent</a> in the\nfuture as <a href=\"https://github.com/art-w\">Arthur Wendling</a> did with <a href=\"https://github.com/art-w/x-ocaml\">x-ocaml</a> but for\nnow this is already useful on my own website. If anyone has any pointers for\nwhat the right CSS patterns are for adding these debug overlay panels to\nwebsites with minimal intrusion, I'd be grateful. I'm extremely unfamiliar with how\nmodern frontend programming works...</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/html5rw-validate/\"> <img src=\"/images/aoah-html5v-ss-2.webp\" alt=\"%c\" title=\"Did you know that you're not really supposed to have more than one h1 tag? Neither did I...\" > </a></p>\n<p>And of course, if you do prefer to stick to the server-side, then you get fast native code OCaml via a command-line binary provided by the package:</p>\n<pre><code class=\"language-bash\">$ dune exec -- html5check test.html\ntest.html:126.73: error [no-p-element-in-scope]: No “p” element in scope but a\n“p” end tag seen.\ntest.html:113.72: error [missing-alt]: An “img” element must have an “alt”\nattribute, except under certain conditions. 
For details, consult guidance on\nproviding text alternatives for images.\ntest.html:120.27: error [duplicate-id]: Duplicate ID “duplicate-id”.\ntest.html:123.36: error [disallowed-child]: Element “div” not allowed as child\nof element “span” in this context. (Suppressing further errors from this\nsubtree.)\ntest.html:152.8: info [multiple-h1]: Consider using only one “h1” element per\ndocument (or, if using “h1” elements multiple times is required, consider using\nthe “headingoffset” attribute to indicate that these “h1” elements are not all\ntop-level headings).\n</code></pre>\n<h2 id=\"a-few-days-of-vibespiling\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-few-days-of-vibespiling\"></a>A few days of vibespiling</h2>\n<p>The reason this took a few days of background vibespiling comes down to the sheer size of the problem. The <a href=\"https://github.com/validator/validator\">Nu Validator</a> is a mature Java application that is built around Java's <a href=\"https://docs.oracle.com/javase/tutorial/jaxp/sax/parsing.html\">SAX event model</a>, which I last used in 2000 when I worked on <a href=\"https://en.wikipedia.org/wiki/Chello\">Chello's website</a>. Looking through the validator Java code brought back &quot;fond&quot; memories of building <a href=\"https://getyarn.io/yarn-clip/796493b5-d8f6-42fa-9252-3d3803379653\">factories of factory makers</a>.  
In the Nu validator, there are lots of rule checkers that iterate through an HTML5 parse tree and extend a base <code>Checker</code> class:</p>\n<pre><code>public final class TableChecker extends Checker {\nprivate Table current;\nprivate final LinkedList&lt;Table&gt; stack = new LinkedList&lt;&gt;();\n\n@Override\npublic void startElement(String uri, String localName,\n                         String qName, Attributes atts)\n  throws SAXException {\n  if (&quot;http://www.w3.org/1999/xhtml&quot;.equals(uri)) {\n    if (&quot;table&quot;.equals(localName)) { push(); } else\n    if (current != null) {\n      if (&quot;td&quot;.equals(localName) || &quot;th&quot;.equals(localName)) {\n        current.cell(atts, localName);\n      }\n      // ... more element handling\n    }\n  } } }\n</code></pre>\n<p>Since the number of rules was massive, a single run of the agent wasn't enough. Instead, I knocked up a <a href=\"/notes/aoah-2025-4\">Claudeio</a> wrapper that ran the agent iteratively, exhorting it to continually sample the rules and iterate on a good architecture. This is only possible since the validator has a <a href=\"https://github.com/validator/validator/blob/main/tests/messages.json\">massive test suite</a> with expected outputs. The goal of the agent was therefore to try OCaml code architectures that maximised the number of passing rules, and then to combine them towards a 100% pass rate.</p>\n<p>After a few days, this converged and hit a 100% pass rate with a bit of human prompt massaging from me.  
The OCaml version replaces inheritance with <a href=\"https://dev.realworldocaml.org/first-class-modules.html\">first-class modules</a> instead, and each checker implements the <a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw/blob/main/lib/check/checker.mli\">Checker.S</a> signature:</p>\n<pre><code>module type S = sig\n  type state\n  val create : unit -&gt; state\n  val reset : state -&gt; unit\n  val start_element : state -&gt; element:Element.t -&gt; Message_collector.t -&gt; unit\n  val end_element : state -&gt; tag:Tag.element_tag -&gt; Message_collector.t -&gt; unit\n  val characters : state -&gt; string -&gt; Message_collector.t -&gt; unit\n  val end_document : state -&gt; Message_collector.t -&gt; unit\nend\n</code></pre>\n<p>This gives us the same flexibility to compose checkers, but with abstract state\ntypes rather than hidden mutable fields scattered across a class hierarchy.</p>\n<h3 id=\"browsing-the-giant-html5-test-suite\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#browsing-the-giant-html5-test-suite\"></a>Browsing the giant HTML5 test suite</h3>\n<p>I extended the visual HTML test suite generator I built a few days ago to include the thousands of validation tests, and the library now outputs one <a href=\"https://www.cl.cam.ac.uk/~avsm2/html5rw-check/\">epic HTML file</a> that lists each of the thousands of tests.</p>\n<p>Having these tests was essential when doing refactoring, as a small change in one checker affected others. 
Without the comprehensive test oracle, the agents would quickly diverge out of control.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/html5rw-check/\"> <img src=\"/images/aoah-html5v-ss-4.webp\" alt=\"%c\" title=\"It's quite fun just browsing around the expect tests to see what's going on with HTML5\" > </a></p>\n<h3 id=\"a-case-study-on-the-table-checker\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-case-study-on-the-table-checker\"></a>A case study on the table checker</h3>\n<p>The table checker is one of the more complex validators, tracking cell spans,\ndetecting overlaps, and validating that <code>headers</code> attributes reference valid\n<code>th</code> elements. The Java version spreads this across multiple files, but the\nOCaml version consolidates everything into a <a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw/blob/main/lib/check/specialized/table_checker.ml\">single module</a> with explicit types.</p>\n<pre><code class=\"language-ocaml\">type cell = {\n  mutable left : int;\n  mutable right : int;\n  mutable bottom : int;\n  headers : string list;\n  element_name : string;\n}\n\ntype row_group = {\n  mutable current_row : int;\n  mutable insertion_point : int;\n  cells_in_effect : ((int * int), cell) Hashtbl.t;\n  mutable cells_on_current_row : cell array;\n  row_group_type : string option;\n}\n\ntype table = {\n  mutable state : table_state;\n  mutable column_count : int;\n  header_ids : (string, unit) Hashtbl.t;\n  cells_with_headers : cell list ref;\n  mutable current_row_group : row_group option;\n  (* ... 
*)\n}\n</code></pre>\n<p>However, the agent didn't fundamentally change the algorithmic structure; we still have the same basic state machine but with much more succinct variant types.</p>\n<h2 id=\"typed-error-codes\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#typed-error-codes\"></a>Typed error codes</h2>\n<p>One significant quality of life improvement came from refactoring how the error messages are collected for rendering. The Java code uses string formatting throughout to directly output messages, but the OCaml <a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw/blob/main/lib/check/error_code.mli\">error_code.mli</a> module defines a polymorphic variant hierarchy that's exhaustively checkable:</p>\n<pre><code class=\"language-ocaml\">type table_error = [\n  | `Cell_overlap\n  | `Cell_spans_rowgroup\n  | `Row_no_cells of [`Row of int]\n  | `Column_no_cells of [`Column of int] * [`Elem of string]\n]\n\ntype attr_error = [\n  | `Not_allowed of [`Attr of string] * [`Elem of string]\n  | `Missing of [`Elem of string] * [`Attr of string]\n  | `Bad_value of [`Elem of string] * [`Attr of string] *\n                  [`Value of string] * [`Reason of string]\n  | `Duplicate_id of [`Id of string]\n  (* ... *)\n]\n</code></pre>\n<p>This allows clients to pattern match on specific classes of errors easily:</p>\n<pre><code class=\"language-ocaml\">match err with\n| `Attr (`Duplicate_id (`Id id)) -&gt; handle_duplicate id\n| `Img `Missing_alt -&gt; suggest_alt_text ()\n| `Table `Cell_overlap -&gt; report_overlap ()\n| _ -&gt; default_handler err\n</code></pre>\n<p>This also means we can add new error categories without changing existing code,\nand the compiler tells us if we miss any cases. 
This is a pretty classic\nuse case for OCaml that both <a href=\"https://github.com/yminsky\">Yaron Minsky</a> and I talked about (in <a href=\"https://www.youtube.com/watch?v=hKcOkWzj0_s\">Caml\nTrading</a> and\n<a href=\"/papers/2010-icfp-xen\">XenServer</a>), but sometimes it's good to remember why I love\nOCaml so much!</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>While the OCaml code generated is by no means sparkling clean, it is useful and\noperational already.  The typed error hierarchy is perhaps the biggest win, as\nit lifts up the abstraction level to be more idiomatic to OCaml style and\nmay eventually make it easier to jump over to Haskell or Lean for even\nmore purity and formal specification work.  This is by far the biggest agentic\ncoding translation I've attempted so far, to the point where it used up all my\nClaude 20x Max credits in a matter of days. I have two accounts now!</p>\n<p>What also surprised me was how little the agent struggled with the architectural\ntransformation <em>across programming languages</em>. Given examples of OCaml first\nclass modules (from Real World OCaml and the Jane Street OCaml code), it\nproduced well-structured code.</p>\n<p><img src=\"/images/aoah-html5v-ss-3.webp\" alt=\"%c\" title=\"Claude can introspect its session context to update a skill to match what it's learnt\" ></p>\n<p>I still have no idea how I'm going to maintain this code in the long term, but\ndo let me know if the HTML5 checker is useful to you. One little Claude trick\nthat remains handy is that after a prompting session, I prompt the agent to fix\nits own skill based on the feedback it's received this session. That helps to\ngeneralise the skills as more projects start using them.</p><h1>References</h1><ul><li>Scott et al (2010). Using functional programming within an industrial product group: perspectives and perceptions. ACM. 
<a href=\"https://doi.org/10.1145/1863543.1863557\" target=\"_blank\"><i>10.1145/1863543.1863557</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-21",
      "title": "AoAH Day 21: Complete dynamic HTML5 validation in OCaml and the browser",
      "summary": "Porting the W3C's Nu HTML Validator from Java to OCaml and running in the browser dynamically",
      "date_published": "2025-12-21T00:00:00.000000Z",
      "date_modified": "2025-12-21T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "web"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/1863543.1863557",
          "doi": "10.1145/1863543.1863557",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-20",
      "content_html": "<p>I took a break from yesterday's <a href=\"/notes/aoah-2025-19\">bot hacking</a> to continue the <a href=\"/notes/aoah-2025-15\">HTML5 parsing</a> in OCaml adventure. Vibespiling seems to have taken off, with <a href=\"https://simonwillison.net/\">Simon Willison</a> <a href=\"https://simonwillison.net/2025/Dec/18/swift-justhtml/\">reporting</a> that there's a <a href=\"https://github.com/kylehowells/swift-justhtml\">Swift version</a> now as well. I got curious about how far I could push the vibespiling support: could we go beyond &quot;just&quot; parsing to also do <em>complete HTML5 validation</em>? The <a href=\"https://validator.github.io/validator/\">Nu HTML Validator</a> is where I went next, which is a bunch of Java code used by the W3C to apply some seriously complex rules for HTML5 validation.</p>\n<p>I decided to split this work into two days, and started with a simple problem: HTML5 validation includes the need for automated language detection to validate that the <code>lang</code> attribute on HTML elements matches the actual content. This is important for accessibility, as screen readers use language hints to select the correct pronunciation.</p>\n<p>The W3C validator uses the <a href=\"https://github.com/shuyo/language-detection\">Cybozu langdetect</a> algorithm, so I vibespiled this into pure OCaml code as <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-langdetect\">ocaml-langdetect</a></strong>. However, I decided to push harder by compiling this to <em>three</em> different backends: native code OCaml, JavaScript via <a href=\"https://github.com/ocsigen/js_of_ocaml\">js_of_ocaml</a> and then into <a href=\"https://tarides.com/blog/2025-02-19-the-first-wasm-of-ocaml-release-is-out/\">modern WebAssembly using wasm_of_ocaml</a>. 
As a fun twist, I got the regression tests running as interactive &quot;<a href=\"/notes/aoah-2025-17\">vibesplained</a>&quot; online notebooks that can do language detection in the browser.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/langdetect-js/langdetect.html\"> <img src=\"/images/aoah-langdetect-ss-1.webp\" alt=\"%c\" title=\"The JavaScript version is interactive so you can test it out directly as you read this post.\" > </a></p>\n<h2 id=\"the-n-gram-frequency-algorithm\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-n-gram-frequency-algorithm\"></a>The n-gram frequency algorithm</h2>\n<p>Language detection via <a href=\"https://en.wikipedia.org/wiki/N-gram\">n-gram analysis</a> is surprisingly simple. The algorithm first trains profiles for each language, by analysing a corpus of text and counting the frequency of sequences of 1-3 characters. This creates a statistical fingerprint of the language.\nThen, when given unknown text it extracts its n-grams and compares against all trained profiles using Bayesian probability. The language whose profile best matches the text wins.</p>\n<p>It turns out that n-gram frequencies are <a href=\"https://web.stanford.edu/~jurafsky/slp3/3.pdf\">remarkably\nstable</a> across different texts\nin the same language. To pick an obvious example, the word &quot;the&quot; appears\nfrequently in English texts, giving bigrams &quot;th&quot; and &quot;he&quot; high frequencies.\nSimilarly, &quot;qu&quot; is common in French, &quot;sch&quot; in German, etc etc.  The algorithm\nuses multiple trials with randomized sampling to avoid overfitting to any\nparticular part of the text. 
Each trial adjusts the smoothing parameter\nslightly using a Gaussian distribution, all of which should be straightforward\nto implement in OCaml.</p>\n<h2 id=\"implementing-langdetect-in-ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#implementing-langdetect-in-ocaml\"></a>Implementing langdetect in OCaml</h2>\n<p>I grabbed the <a href=\"https://github.com/validator/validator/tree/main/langdetect\">validator/validator/langdetect</a> directory and vibespiled it from Java to OCaml, which is straightforward now with all the earlier Claude skills I've developed this month. The major hurdle to leap that's different from the other libraries is where to stash the precomputed ngram statistics for all the different languages. I wrote <a href=\"https://github.com/mirage/ocaml-crunch/blob/main/CHANGES.md\">ocaml-crunch</a> for Mirage back in 2011 which just generates OCaml modules, but it's still <a href=\"https://github.com/tarides/hackocaml/issues/21\">surprisingly difficult</a> to be more efficient and store precomputed data. <a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a> <a href=\"https://discuss.ocaml.org/t/generating-a-map-at-compile-time/16217/7?u=avsm\">noted</a> back in March that his <a href=\"https://www.cl.cam.ac.uk/~jdy22/projects/modular-macros/\">modular macros</a> project should support this sort of usecase but it's not quite ready yet. Similarly, using OCaml Marshal requires stashing the marshalled datastructure somewhere, which is hard to do portably.</p>\n<p>Without a clear optimisation strategy, I prompted the agent to just precompute the profiles directly into OCaml\ncode.  The initial port worked immediately thanks to the clear structure of the\nJava code it was being vibespiled from. The static library was 115MB, but I\ndidn't really notice as the regression tests all passed. 
The language profiles\ncontain 172,000 unique n-grams across 47 languages, and the naive approach of\ngenerating one OCaml module per language with string literals duplicated\nn-grams across profiles.</p>\n<p>The native code library provides a straightforward interface to query the ngrams via a cmdliner binary:</p>\n<pre><code class=\"language-sh\">$ dune exec langdetect\nHello Thomas Gazagnaire, I'm finally learning French! Just kidding, I don't know anything about it.\nen 1.0000\n\n$ dune exec langdetect\nBonjour Thomas Gazagnaire, j'apprends enfin le français ! Je plaisante, je n'y connais rien.\nfr 1.0000\n\n$ dune exec langdetect\nHello Thomas Gazagnaire, I'm finally learning French! Just kidding, I don't know anything about it.\nBonjour Thomas Gazagnaire, j'apprends enfin le français ! Je plaisante, je n'y connais rien.\nen 0.5714\nfr 0.4286\n</code></pre>\n<h2 id=\"the-115mb-problem-for-javascript\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-115mb-problem-for-javascript\"></a>The 115MB problem for JavaScript</h2>\n<p>But then, when I compiled it to JavaScript using the <a href=\"https://dune.readthedocs.io/en/latest/reference/dune/executable.html#jsoo-field\">dune\nstanzas</a>\nthe massive size was a little too big, with very long compilation times.  The\n<a href=\"https://tangled.org/anil.recoil.org/ocaml-langdetect/commit/69e99a9c342957eee8db079137c803b0895e63fa\">fix</a> was simply to pack everything into a shared data structure across <em>all</em>\nlanguages, looking something like this:</p>\n<pre><code class=\"language-ocaml\">(* Shared string table for all 172K unique n-grams *)\nlet ngram_table = [| &quot;the&quot;; &quot;th&quot;; &quot;he&quot;; ... |]\n\n(* Flat int array: (ngram_index, frequency) pairs for all languages *)\nlet profile_data = [| 0; 15234; 1; 8921; ... 
|]\n\n(* Offsets: (lang_code, start_index, num_pairs) *)\nlet profile_offsets = [|\n  (&quot;en&quot;, 0, 4521);\n  (&quot;fr&quot;, 9042, 3892);\n  ...\n|]\n</code></pre>\n<p>This reduced the binary from 115MB to around 28MB, which is a reasonable\nreduction without having to resort to compression.  Many n-grams appear in\nmultiple languages (consider Latin alphabet characters) so deduplicating into a\nshared string table eliminated quite a bit of redundancy.</p>\n<p>At this point, I prompted the agent to build me a full JavaScript-based\nregression test that took the native code tests and gave me a browser-based\nversion instead.</p>\n<p>One minor hiccup was that the regression tests failed due to JavaScript\nintegers overflowing <em>vs</em> native integers, but the fix was simple and the\nregression tests in the browser made debugging them easy for the agentic loop. Without them, there would have been a lot of human cut-and-pasting which is quite tedious!</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/langdetect-js/langdetect.html\"> <img src=\"/images/aoah-langdetect-ss-2.webp\" alt=\"%c\" title=\"The JavaScript version also executes regression tests in the browser environment derived from the native tests.\" > </a></p>\n<h2 id=\"the-wasm-array-limit\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-wasm-array-limit\"></a>The WASM array limit</h2>\n<p>With the JavaScript size under control, I turned to WASM compilation via <code>wasm_of_ocaml</code>. The first attempt failed with a cryptic error about exceeding operands and a parse error. 
It turns out WASM's <code>array_new_fixed</code> instruction has a limit of 10000 operands, and our profile data array had 662,000 elements.</p>\n<p>The <a href=\"https://tangled.org/anil.recoil.org/ocaml-langdetect/commit/6f25190ebfb2edc4697b3a2ab05e6b33ae1cba3b\">solution</a> was to chunk the arrays and concatenate at runtime, which incurs runtime overhead but <a href=\"https://github.com/konsoletyper/teavm/issues/971\">is a common enough solution</a>.  The generated code now includes 74 chunks for the profile data and 20 chunks for the n-gram string table, but clocks in at around 20MB and could probably be reduced further with some browser compression.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/langdetect-js/langdetect.html\"> <img src=\"/images/aoah-langdetect-ss-3.webp\" alt=\"%c\" title=\"Wasm mode is effectively the same as JavaScript, but more modern.\" > </a></p>\n<p>Now, the browser tests include the ability to switch between Wasm and\nJavaScript in the same test HTML. There was no real performance difference here, but the dataset is small. The most observable difference is that the wasm needs to be served via a web server and not local filesystem, as otherwise browsers reject it. 
The web server also must serve <code>.wasm</code> files with the MIME type <code>application/wasm</code> or the browser promptly rejects them.</p>\n<h2 id=\"browser-demo\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#browser-demo\"></a>Browser demo</h2>\n<p>The OCaml <code>langdetect-js</code> package provides a browser-ready API using Brr callbacks from HTML to register them with the JavaScript:</p>\n<pre><code>// Detect language\nconst lang = langdetect.detect(&quot;Hello, world!&quot;);  // &quot;en&quot;\n\n// Get probability scores\nconst result = langdetect.detectWithProb(&quot;Bonjour le monde&quot;);\n// { lang: &quot;fr&quot;, prob: 0.9987 }\n\n// Get all candidates\nconst all = langdetect.detectAll(&quot;这是中文&quot;);\n// [{ lang: &quot;zh-cn&quot;, prob: 0.85 }, { lang: &quot;zh-tw&quot;, prob: 0.12 }, ...]\n</code></pre>\n<p>I also <a href=\"https://www.npmjs.com/package/langdetect-jsoo\">published this to npm</a> so that the JavaScript is conveniently available via a CDN like <a href=\"https://cdn.jsdelivr.net/npm/langdetect-jsoo@1.0.0/langdetect.js\">jsDelivr</a>.  Publishing to npm required putting an <a href=\"https://tangled.org/anil.recoil.org/ocaml-langdetect/tree/npm\">npm branch</a> in the repo with the npm <code>package.json</code>. I followed the convenient guide by <a href=\"https://simonwillison.net/\">Simon Willison</a> to <a href=\"https://til.simonwillison.net/npm/npm-publish-github-actions\">get a minimal package.json</a> for the project.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>This was a good intermediate port to work on since it let me exercise WebAssembly a bit more, and understand the tradeoffs in OCaml compilation to these other backends.  
The process of getting the agent to systematically port first to native code (from Java), and then compile to JavaScript and debug platform-specific issues like the integer overflows, and then go to wasm was quite good.</p>\n<p>The agent was particularly helpful for the tedious work of generating the\nchunked array code and debugging the Unicode normalization edge cases.</p>\n<p>For future hacking, there are several language optimisations coming up in <a href=\"https://oxcaml.org\">OxCaml</a> that should make this even more efficient: support for compile-time metaprogramming (so I could, for example, compute a perfect hash statically for all the ngrams), and also for smaller integer sizes so I don't need to use a full 31-bit range for the ngram values. However, I couldn't quite get the wasm_of_ocaml constraints on the oxcaml branch working so I ran out of time today to get this going. Package management takes me out of the flow zone yet again!</p>\n<p>Now that langdetect works, we'll go on to the full HTML5 validator in <a href=\"/notes/aoah-2025-21\">Day 21</a>!</p>",
      "url": "https://anil.recoil.org/notes/aoah-2025-20",
      "title": "AoAH Day 20: Human language detection in native code, JS and wasm",
      "summary": "Porting the Nu HTML Validator's language detection to OCaml, then optimizing from 115MB to 28MB and fixing WASM array limits for browser deployment.",
      "date_published": "2025-12-20T00:00:00.000000Z",
      "date_modified": "2025-12-20T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "web",
        "wasm"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-19",
      "content_html": "<p>After building <a href=\"/notes/aoah-2025-18\">tomlt</a> yesterday for TOML 1.1 parsing, I proceeded to integrate it with my group's <a href=\"https://zulip.com\">Zulip</a> chat <a href=\"https://eeg.zulipchat.com\">server</a>. I then discovered that Zulip actually uses Python's <a href=\"https://docs.python.org/3/library/configparser.html\">configparser</a> INI format for its <code>.zuliprc</code> files rather than TOML, woops! But this gave me the perfect opportunity to attempt to quickly replicate the tomlt experience with a <em>third</em> config format codec library for Windows-style INI files as well.</p>\n<p>So today I released both <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-zulip\">ocaml-zulip</a></strong> for Zulip API integration and <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-init\">ocaml-init</a></strong> for INI file codecs that are compatible with Pythonic features such as variable interpolation. Along the way, I developed a new regression test mechanism by writing a Zulip bot that tests the Zulip API using OCaml Zulip!</p>\n<h2 id=\"zulip-organized-chat-for-distributed-teams\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#zulip-organized-chat-for-distributed-teams\"></a>Zulip: Organized chat for distributed teams</h2>\n<p>Zulip is an open-source &quot;async first&quot; messaging app that strikes a nice balance between immediate and thoughtful conversations:</p>\n<blockquote>\n<ul>\n<li>In Zulip, channels determine who gets a message. 
Each conversation within a channel is labeled with a topic, which keeps everything organized.</li>\n<li>You can read Zulip one conversation at a time, seeing each message in context, no matter how many other conversations are going on.</li>\n<li>If anything is out of place, it's easy to move messages, rename and split topics, or even move a topic to a different channel\n<cite>-- <a href=\"https://zulip.com/why-zulip/\">Why Zulip?</a>, 2024</cite></li>\n</ul>\n</blockquote>\n<p>Zulip itself is fully open source and has a pretty straightforward REST API to communicate with the server, and so I deployed my <a href=\"/notes/aoah-2025-13\">requests library</a> as well as the various API codecs to interface with it. I used the <a href=\"https://github.com/zulip/python-zulip-api\">Zulip Python SDK</a> and the <a href=\"https://github.com/zulip/zulip-js\">Zulip JavaScript</a> library to give me two API specifications. Unlike previous libraries I've vibecoded, there's no language-agnostic test suite so I needed to get a bit more creative to verify correctness by building a live bot to test itself.</p>\n<h3 id=\"the-zulip-ini-config-format\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-zulip-ini-config-format\"></a>The Zulip INI config format</h3>\n<p>Zulip's <code>.zuliprc</code> file looks like this:</p>\n<pre><code>[api]\nemail = bot@example.com\nkey = your-api-key-here\nsite = https://your-domain.zulipchat.com\n</code></pre>\n<p>This is classic INI format as used by Python's configparser module. 
It's\nsimpler than TOML but isn't fully compatible as it has quirks like case-insensitive\nkeys, multiline value support via continuation lines, and basic variable\ninterpolation with a <code>%(name)s</code> syntax.</p>\n<p>I couldn't find a feature-complete implementation of Python's module, so I quickly\nreused the <a href=\"/notes/aoah-2025-18\">tomlt</a> approach to build <a href=\"https://tangled.org/anil.recoil.org/ocaml-init\">ocaml-init</a> with bidirectional\ncodecs. Unsurprisingly, the resulting API for manipulating this file format is extremely similar:</p>\n<pre><code class=\"language-ocaml\">type server_config = { host : string; port : int; debug : bool }\n\nlet server_codec = Init.Section.(\n  obj (fun host port debug -&gt; { host; port; debug })\n  |&gt; mem &quot;host&quot; Init.string ~enc:(fun c -&gt; c.host)\n  |&gt; mem &quot;port&quot; Init.int ~enc:(fun c -&gt; c.port)\n  |&gt; mem &quot;debug&quot; Init.bool ~dec_absent:false ~enc:(fun c -&gt; c.debug)\n  |&gt; finish\n)\n</code></pre>\n<p>There's a <code>bool</code> codec in this library that follows Python's configparser\nsemantics exactly, accepting yes/no/true/false/on/off/1/0 as boolean values.\nThis was important for compatibility with existing Zulip configuration files.</p>\n<h2 id=\"the-zulip-bot-framework\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-zulip-bot-framework\"></a>The Zulip bot framework</h2>\n<p>With configuration parsing sorted, I turned to building the actual bot\nframework. Our research group at <a href=\"https://eeg.zulipchat.com\">eeg.zulipchat.com</a>\nhas been wanting an Atom feed bot to post updates from our blogs' Atom/RSS sources,\nso this seemed like a good excuse to knock up a bot.</p>\n<p>I prompted the agent to follow the basic <a href=\"https://zulip.com/api/deploying-bots\">Python\nbotserver</a> considerations but adapted to\na more Eio and OCaml idiomatic style. 
This resulted in a nice design where a\nbot handler is just a function:</p>\n<pre><code class=\"language-ocaml\">type handler =\n  storage:Storage.t -&gt; identity:identity -&gt;\n  Message.t -&gt; Response.t\n</code></pre>\n<p>The Zulip library provides modules for <code>storage</code> for persisting state (on the Zulip server side), <code>identity</code> containing functions to access the bot's email and user ID, and the incoming <code>Message.t</code>. The handler returns a <code>Response.t</code> which can be a reply in the same context (DM or channel/topic), or a post to a specific channel, or a direct message, or an indication that the bot's ignoring the event.</p>\n<p>There's an echo bot handler that's executable that shows the API quite simply:</p>\n<pre><code class=\"language-ocaml\">let echo_handler ~storage ~identity msg =\n  let bot_email = identity.Bot.email in\n  let sender_email = Message.sender_email msg in\n\n  (* Ignore our own messages *)\n  if sender_email = bot_email then Response.silent\n  else\n    (* Remove bot mention and echo back *)\n    let cleaned_msg = Message.strip_mention msg ~user_email:bot_email in\n    if cleaned_msg = &quot;&quot; then\n      Response.reply (Printf.sprintf &quot;Hello %s!&quot; (Message.sender_full_name msg))\n    else\n      Response.reply (Printf.sprintf &quot;Echo: %s&quot; cleaned_msg)\n</code></pre>\n<p>After this running the bot in an Eio environment is a single function call:</p>\n<pre><code class=\"language-ocaml\">let () =\n  Eio_main.run @@ fun env -&gt;\n  Eio.Switch.run @@ fun sw -&gt;\n  let config = Zulip_bot.Config.load ~fs:(Eio.Stdenv.fs env) &quot;echo-bot&quot; in\n  Zulip_bot.Bot.run ~sw ~env ~config ~handler:echo_handler\n</code></pre>\n<h2 id=\"tying-it-all-together-with-requests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tying-it-all-together-with-requests\"></a>Tying it all together with Requests</h2>\n<p>One nice payoff from this <a href=\"/notes/aoah-2025\">advent</a> series is 
seeing how\nthe libraries begin to compose. The Zulip OCaml package depends on\n<a href=\"/notes/aoah-2025-13\">Requests</a> for HTTPS communication back to the server:</p>\n<pre><code class=\"language-ocaml\">let create ~sw env auth =\n  let session =\n    Requests.create ~sw\n      ~default_headers:(Requests.Headers.of_list [\n        (&quot;Authorization&quot;, Auth.to_basic_auth_header auth);\n        (&quot;User-Agent&quot;, &quot;OCaml-Zulip/1.0&quot;);\n      ])\n      ~follow_redirects:true\n      ~verify_tls:true\n      env\n  in\n  { auth; session }\n</code></pre>\n<p>This shows how the session abstraction in Requests can persist the common auth\ntokens required, making subsequent API calls very syntactically succinct.  The\nbot framework also uses <a href=\"/notes/aoah-2025-3\">XDGe</a> for configuration directory\nresolution, <a href=\"/notes/aoah-2025-2\">jsont</a> for JSON parsing of API responses, and of\ncourse Eio throughout for async operations. The dependency graph is starting to\nlook like actual infrastructure!</p>\n<h2 id=\"testing-with-a-regression-bot\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#testing-with-a-regression-bot\"></a>Testing with a regression bot</h2>\n<p>Remember that problem I mentioned earlier about Zulip lacking a\nlanguage-agnostic test suite? 
My cunning solution was recursive; let's just\nbuild a Zulip bot that can test itself!\nI built a <a href=\"https://tangled.org/anil.recoil.org/ocaml-zulip/blob/main/examples/regression_test.ml\">regression test bot</a> that exercises as\nmuch of the Zulip API as possible when triggered via a direct message:</p>\n<pre><code class=\"language-ocaml\">let make_handler ~env ~channel =\n  fun ~storage ~identity:_ msg -&gt;\n    let content = String.lowercase_ascii (Message.content msg) in\n    let sender_email = Message.sender_email msg in\n    (* Only respond to DMs containing &quot;regress&quot; *)\n    if Message.is_private msg &amp;&amp; String.starts_with ~prefix:&quot;regress&quot; content\n    then (\n      let client = Storage.client storage in\n      let summary = run_tests ~env ~client ~channel ~trigger_user:sender_email in\n      Response.reply summary)\n    else Response.silent\n</code></pre>\n<p>When someone sends the bot a DM with &quot;regress&quot;, it runs through a comprehensive\ntest suite covering user operations, channel management, message\nsending/editing, reactions, message flags, typing indicators, presence updates,\nand alert words.  I duly started the harness and DMed Vicuna, and the bot\nimmediately spat out a number of failures resulting from errors in the codecs.\nHowever, as the logs show, the errors also included useful information about\nwhere the protocol decoding had gone wrong.</p>\n<p><img src=\"/images/aoah-zulip-regress-1.webp\" alt=\"%c\" title=\"The first run of my Zulip regression bot\" ></p>\n<p>The bot posts a summary to a test channel showing which tests passed or failed,\ncomplete with timing information. This turned out to be more useful than\ntraditional unit tests since it exercises the actual API against a real Zulip\nserver. 
After one round of fixes, the bot successfully posted its results to\na Zulip channel recording success!</p>\n<p><img src=\"/images/aoah-zulip-regress-2.webp\" alt=\"%c\" title=\"The bot posts the results of its Zulip regression test to Zulip!\" ></p>\n<h2 id=\"composable-command-line-terms\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#composable-command-line-terms\"></a>Composable command-line terms</h2>\n<p>One pattern I've been developing across these libraries is to expose <a href=\"https://github.com/dbuenzli/cmdliner\">cmdliner</a>\nterms that compose together to make it easy to build CLI tools that expose\nthe configuration needed by a library along with a manual page.</p>\n<p>The <code>Zulip_bot.Cmd</code> module provides a <code>config_term</code> that bundles common bot configuration parameters:</p>\n<pre><code class=\"language-ocaml\">let config_term default_name env =\n  let fs = env#fs in\n  Term.(const (fun name config_file verbosity verbose_http -&gt;\n        setup_logging ~verbose_http:verbose_http.value verbosity.value;\n        load_config ~fs ~name ~config_file)\n    $ name_term default_name\n    $ config_file_term\n    $ verbosity_term\n    $ verbose_http_term default_name)\n</code></pre>\n<p>While this looks complex, all it's doing is to combine various declarations\nof command-line parameters.  
This combines individual terms for the bot name, config file path, verbosity level,\nand HTTP-level debugging into a single composable unit.</p>\n<p>The <code>verbose_http_term</code> controls logging sources in the <a href=\"/notes/aoah-2025-13\">Requests</a>\nlibrary, letting you toggle detailed HTTP traces without that being the default verbose output.</p>\n<pre><code class=\"language-ocaml\">let verbose_http_term app_name =\n  let env_name = String.uppercase_ascii app_name ^ &quot;_VERBOSE_HTTP&quot; in\n  let env_info = Cmdliner.Cmd.Env.info env_name in\n  Arg.(value &amp; flag &amp; info [ &quot;verbose-http&quot; ] ~env:env_info ~doc)\n</code></pre>\n<p>Each bot can then define its command with minimal boilerplate:</p>\n<pre><code class=\"language-ocaml\">let bot_cmd eio_env =\n  let info = Cmd.info &quot;echo_bot&quot; ~version:&quot;2.0.0&quot; ~doc ~man in\n  let config_term = Zulip_bot.Cmd.config_term &quot;echo-bot&quot; eio_env in\n  Cmd.v info Term.(const (fun config -&gt; run_echo_bot config eio_env) $ config_term)\n</code></pre>\n<p>The source tracking also helps with debugging by showing where each value\noriginated from -- whether from the command line, environment variables,\nXDG config files, or defaults. This makes it much easier to understand\nwhy a bot is behaving a certain way when deployed.</p>\n<p><img src=\"/images/aoah-zulip-regress-3.webp\" alt=\"%c\" title=\"The manual pages for a bot are pretty good by default thanks to the cmdliner terms.\" ></p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>It's nice to get to the Zulip bot framework at last, since this is one of the\nthings I wanted to fix at the start of the month. It uses a number of things\nI've built this month, including the Requests library to handle HTTP, the INI\ncodec for Python configuration, XDG to handle path resolution, and so on. 
Each\npiece is small and focused, and generatively replicated from other\n(human-written) exemplar libraries from the OCaml ecosystem.</p>\n<p>The only &quot;agentic trick&quot; I learnt today was the value of live debugging, as I\nfound with both <a href=\"/notes/aoah-2025-17\">JMAP email</a> and <a href=\"/notes/aoah-2025-14\">Karakeep</a>. Building\nservices amenable to this kind of live mocking is something I'll keep in mind\nfor the future. It's also extremely useful to have good terminal manual pages,\nsince those can also be interrogated by coding agents as well as be used by humans.</p>",
      "url": "https://anil.recoil.org/notes/aoah-2025-19",
      "title": "AoAH Day 19: Zulip bot framework to bring Vicuna the friendly camel back",
      "summary": "Building an OCaml Zulip bot framework with functional handlers, and pivoting from TOML to INI codecs for Python configparser compatibility",
      "image": "https://anil.recoil.org/images/aoah-zulip-regress-2.1280.webp",
      "date_published": "2025-12-19T00:00:00.000000Z",
      "date_modified": "2025-12-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "functional"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-18",
      "content_html": "<p>After getting my <a href=\"/notes/aoah-2025-17\">email</a> interfaces automated yesterday, I turned my attention to <a href=\"https://eeg.zulipchat.com\">Zulip</a> integration. But first, I took a segue into another format that it required known as <a href=\"https://toml.io\">TOML</a>. I noticed <a href=\"https://lobste.rs/s/h50lml/toml_1_1_0_released\">TOML 1.1.0 was released</a> today and so I built <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-tomlt\">ocaml-tomlt</a></strong>.</p>\n<p>What I wanted to explore with this library is whether I could use a coding agent to build a complex functional abstraction from scratch. After building <a href=\"/notes/aoah-2025-6\">yamlrw</a> and <a href=\"/notes/aoah-2025-7\">yamlt</a>, I settled on the technique <a href=\"https://erratique.ch\">Daniel Bünzli</a> developed with <a href=\"https://github.com/dbuenzli/jsont\">jsont</a> in his <a href=\"https://github.com/dbuenzli/jsont/blob/main/paper/soup.pdf\">paper</a>.</p>\n<p><a href=\"https://github.com/dbuenzli/jsont/blob/main/paper/soup.pdf\"> <img src=\"/images/jsont-paper.webp\" alt=\"%c\" title=\"Daniel wrote a nice paper about the combinator magic behind jsont\" > </a></p>\n<h2 id=\"why-toml-instead-of-yaml-or-json\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#why-toml-instead-of-yaml-or-json\"></a>Why TOML instead of Yaml or JSON?</h2>\n<p>TOML has become a popular configuration format for other language ecosystems like Rust and Python. Unlike <a href=\"/notes/aoah-2025-6\">Yaml 1.2</a>, TOML is actually a <a href=\"https://toml.io/en/v1.1.0\">reasonable human-editable format</a> without the terrifying corner cases and denial of service traps hidden in Yaml.</p>\n<p>Since Toml 1.1 was just released today, there are no existing OCaml libraries that fully support it. 
In addition, I need one that is pure OCaml with no C dependencies (like <a href=\"/notes/aoah-2025-6\">yamlrw</a>) and that uses <a href=\"https://github.com/dbuenzli/bytesrw\">Bytesrw</a> for streaming I/O so that it composes well with my other libraries from this month's coding.</p>\n<h3 id=\"the-data-soup-paper\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-data-soup-paper\"></a>The data soup paper</h3>\n<p>The implementation of tomlt was prompted from <a href=\"https://raw.githubusercontent.com/dbuenzli/jsont/refs/heads/main/paper/soup.tex\">&quot;An Alphabet for Your Data Soups&quot;</a> which accompanies Daniel's jsont library. Working with untyped data formats like TOML in strongly-typed languages like OCaml requires a lot of tedious dynamic marshalling, and I'd like to switch to conventional OCaml <a href=\"https://dev.realworldocaml.org/records.html\">records</a> or other static types as soon as possible.</p>\n<p>Daniel's solution is to define a <a href=\"https://dev.realworldocaml.org/gadts.html\">generalised algebraic datatype</a> whose values represent\nbidirectional mappings between subsets of the wire format and my chosen OCaml\ntypes. Waaay back in 2010 when <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> and I worked on <a href=\"/papers/2010-dyntype-wgt\">camlp4-based serialisation</a>, we converted into a generic intermediate\nrepresentation for OCaml types and values. More recently <a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a> has been\nworking on <a href=\"https://dl.acm.org/doi/10.1145/3607851\">MacoCaml</a> which performs\nthis transformation at compile time via hygienic macros.</p>\n<p>Unlike any of these approaches, the functional pearl Daniel came up with allows the programmer to\ndefine direct functional transformations that work in both directions. It's a\nbit more work at runtime and so a bit slower, but in return you get excellent\nerror messages for malformed messages. 
The core Toml type therefore becomes:</p>\n<pre><code class=\"language-ocaml\">(* A codec encapsulates both decoding and encoding *)\ntype 'a t = {\n  kind : string;\n  doc : string;\n  dec : Toml.t -&gt; ('a, codec_error) result;\n  enc : 'a -&gt; Toml.t;\n}\n</code></pre>\n<p>This means you write your schema once and get both directions for free, and\nuser functions can be placed at every coding step to allow the programmer to\n<em>interpose</em> custom functionality such as transformation or validation.</p>\n<h2 id=\"using-tomlt-in-practise\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#using-tomlt-in-practise\"></a>Using tomlt in practise</h2>\n<p>A Toml config file might look something like this:</p>\n<pre><code>[server]                                                                                                                                                    \n  host = &quot;localhost&quot;                                                                                                                                          \n  port = 8080                                                                                                                                                 \n                                                                                                                                                              \n[database]                                                                                                                                                  \n  connection_max = 5000\n</code></pre>\n<p>Here's what using tomlt to parse this looks like in practice:</p>\n<pre><code class=\"language-ocaml\">type config = { host : string; port : int; debug : bool }\n\nlet config_codec =\n  Tomlt.(Table.(\n    obj (fun host port debug -&gt; { host; port; debug })\n    |&gt; mem &quot;host&quot; string ~enc:(fun c -&gt; c.host)\n    |&gt; mem &quot;port&quot; int ~enc:(fun c -&gt; c.port)\n    |&gt; mem &quot;debug&quot; 
bool ~enc:(fun c -&gt; c.debug) ~dec_absent:false\n    |&gt; finish\n  ))\n\nlet () =\n  match Tomlt.decode_string config_codec {|\n    host = &quot;localhost&quot;\n    port = 8080\n  |} with\n  | Ok config -&gt; Printf.printf &quot;Host: %s\\n&quot; config.host\n  | Error e -&gt; prerr_endline (Tomlt.Toml.Error.to_string e)\n</code></pre>\n<p>The functional pattern is almost identical to the <a href=\"/notes/aoah-2025-7\">yamlt</a> or <a href=\"/notes/aoah-2025-2\">jsont</a> codecs I've been building.\nYou don't <em>have</em> to define a codec, as tomlt also provides <a href=\"https://ocaml.org/manual/5.4/indexops.html\">custom index operators</a> to navigate tables directly:</p>\n<pre><code class=\"language-ocaml\">let config = Toml.of_string {|\n  [server]\n  host = &quot;localhost&quot;\n  port = 8080\n\n  [database]\n  connection_max = 5000\n|} in\n(* Navigate nested tables with .%{} *)\nlet host = Toml.(config.%{[&quot;server&quot;; &quot;host&quot;]} |&gt; to_string) in\nlet port = Toml.(config.%{[&quot;server&quot;; &quot;port&quot;]} |&gt; to_int) in\nPrintf.printf &quot;Server: %s:%Ld\\n&quot; host port;\n\n(* Update values *)\nlet config' = Toml.(config.%{[&quot;database&quot;; &quot;enabled&quot;]} &lt;- bool true) in\nprint_endline (Toml.to_string config')\n</code></pre>\n<p>The syntax is a little verbose due to the module opening, but it's still a\npretty nice way to poke around TOML files interactively!</p>\n<h3 id=\"datetime-handling\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#datetime-handling\"></a>Datetime handling</h3>\n<p>Another area where TOML differs from other formats is that it has four distinct\ndatetime formats: offset datetimes, local datetimes, local dates, and local\ntimes. tomlt tries to unify this a little via a single codec that normalises everything to <a href=\"https://github.com/dbuenzli/ptime\">Ptime.t</a>, but allows the codec to supply sensible defaults (e.g. 
for a missing timezone, or a missing date).</p>\n<pre><code class=\"language-ocaml\">(* All of these decode to Ptime.t with sensible defaults *)\n(* when = 2024-01-15T10:30:00Z       -&gt; offset datetime *)\n(* when = 2024-01-15T10:30:00        -&gt; local datetime *)\n(* when = 2024-01-15                 -&gt; date at midnight *)\n(* when = 10:30:00                   -&gt; time on today's date *)\n\nlet event_codec = Tomlt.(Table.(\n  obj (fun name when_ -&gt; { name; when_ })\n  |&gt; mem &quot;name&quot; string ~enc:(fun e -&gt; e.name)\n  |&gt; mem &quot;when&quot; (ptime ()) ~enc:(fun e -&gt; e.when_)\n  |&gt; finish\n))\n</code></pre>\n<p>For applications that need to preserve the exact format, there's also a\n<code>ptime_full</code> function which returns a polymorphic variant indicating precisely\nwhat was present in the source config file.</p>\n<h2 id=\"testing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#testing\"></a>Testing</h2>\n<p>The secret to vibing seems to be having a specification oracle to guide the\nagent, and TOML has a <a href=\"https://github.com/toml-lang/toml-test\">toml-test</a> suite\nthat's perfect for this purpose:</p>\n<blockquote>\n<p>toml-test is a language-agnostic test suite to verify the correctness of TOML parsers and writers.</p>\n<p>Tests are divided into two groups: &quot;invalid&quot; and &quot;valid&quot;. Decoders or\nencoders that reject &quot;invalid&quot; tests pass the tests, and decoders that accept\n&quot;valid&quot; tests and output precisely what is expected pass the tests. 
The\noutput format is JSON, described below.\n<cite>-- <a href=\"https://github.com/toml-lang/toml-test\">Toml-test GitHub</a>, 2021</cite></p>\n</blockquote>\n<p><img src=\"/images/aoah-toml-ss-1.webp\" alt=\"%c\" title=\"The Claude coding agent iterated overnight on getting to 100% test on the third party tests\" ></p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>After building <a href=\"/notes/aoah-2025-6\">yamlrw</a>, <a href=\"/notes/aoah-2025-7\">yamlt</a>, and now tomlt,\nI'm convinced that the bidirectional codec pattern is a good approach for\n<em>agentic</em> OCaml programming. It's a little verbose to express by hand, which\nleads down the ppx route for most. But with agentic generation and oracle\nspecification testing, the coding agent was particularly helpful with both\nfiguring out the TOML grammar and exposing all the variations of codecs\nrequired for parsing all those datetime variants.</p>\n<p>Having the <a href=\"https://toml.io/en/v1.1.0\">TOML 1.1 specification</a> as context and\nmy earlier <a href=\"/notes/aoah-2025-11\">Claude OCaml RFC skill</a> helped a lot as well, to\nallow the ocamldoc to be cross referenced.  And of course, the key design\ninsights at the heart of the library came from <a href=\"https://erratique.ch\">Daniel Bünzli</a> publishing jsont and\nalso uploading his paper. This Tomlt library is a generative clone of his\nideas, but a useful one to my personal workflows this advent!</p>\n<p>Tomorrow in <a href=\"/notes/aoah-2025-19\">Day 19</a>, I'll continue with my original goal of getting\na Zulip bot working!</p><h1>References</h1><ul><li>Xie et al (2023). MacoCaml: Staging Composable and Compilable Macros. MacoCaml: Staging Composable and Compilable Macros (Artifact). <a href=\"https://doi.org/10.1145/3607851\" target=\"_blank\"><i>10.1145/3607851</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-18",
      "title": "AoAH Day 18: TOML 1.1 codecs directly from the spec and paper",
      "summary": "Building tomlt, a pure OCaml TOML 1.1 parser with bidirectional codecs following the jsont design patterns",
      "date_published": "2025-12-18T00:00:00.000000Z",
      "date_modified": "2025-12-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "functional"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3607851",
          "doi": "10.1145/3607851",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-17",
      "content_html": "<p>After building a <a href=\"/notes/aoah-2025-16\">JSON Pointer library</a> yesterday, I proceeded\nto complete my <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap\">OCaml JMAP</a>\nlibrary today so that I could wrestle my overflowing email inbox under control.\nEmail is <a href=\"https://www.ncsc.gov.uk/collection/phishing-scams\">central</a> to our digital lives and yet we have mostly <a href=\"https://www.theverge.com/24113616/gmail-email-20-years-old-internet\">ceded control</a> to third-party services for something that unlocks access to almost any service we use.</p>\n<p>Luckily, I've been self-hosting my <a href=\"/notes/decentralised-stack\">own email</a> for some time, so I do have full local access to about three decades worth of messages. However, I've been hampered by existing email clients which are mostly geared towards a temporal view and not towards easy programmability. So today's exercise has been to build an <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap\">ocaml-jmap</a></strong> that lets me write little agentic programs to help me manage my ever overflowing inbox!</p>\n<p><a href=\"https://x.com/ShriramKMurthi/status/2001361749539459166\"> <img src=\"/images/shriram-vibe-1.webp\" alt=\"%c\" title=\"Shriram vibes a fair criticism of my current email strategy\" > </a></p>\n<h2 id=\"why-jmap-for-email\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#why-jmap-for-email\"></a>Why JMAP for email?</h2>\n<p>Historically, the protocol of choice for accessing email has been <a href=\"https://datatracker.ietf.org/doc/html/rfc3501\">IMAP</a>, and for years I used the <a href=\"https://github.com/nojb/ocaml-imap\">Lwt ocaml-imap</a> library that <a href=\"https://github.com/nojb\">Nicolás Ojeda Bär</a> wrote back when he was here in Cambridge. 
However, IMAP is a pretty convoluted protocol that hasn't evolved much over time, and requires a number of accreted <a href=\"https://www.ietf.org/archive/id/draft-rjbs-mailmaint-imap-extensions-suggestions-00.html\">extensions</a> that are patchily implemented.</p>\n<p>The <a href=\"https://en.wikipedia.org/wiki/JSON_Meta_Application_Protocol\">JSON Meta Application Protocol</a> (or JMAP) was developed in response to this. The developers wrote:</p>\n<blockquote>\n<p>JMAP is the result of efforts to address shortcomings [in existing protocols], providing a modern, efficient, easy-to-use API, built on many years of experience and field testing.\n<cite>-- <a href=\"https://www.ietf.org/blog/jmap/\">JMAP, a modern open email protocol</a>, Gondwana and Jenkins, May 2019</cite></p>\n</blockquote>\n<p>JMAP's exciting because it allows for the reuse of a number of other protocol implementations I've already built these past few weeks. I can use <a href=\"/notes/aoah-2025-13\">ocaml-requests</a> and <a href=\"/notes/aoah-2025-12\">conpool</a> for HTTPS connections, and my ever-growing dependence on <a href=\"https://github.com/dbuenzli/jsont\">jsont</a> means that I can get reasonable error messages while debugging the implementation.</p>\n<p>Servers that support JMAP are still thin on the ground, but all my email endpoints do now support it. At work in the Cambridge Computer Lab, the enlightened sysadmins gave us <a href=\"https://app.fastmail.com\">Fastmail accounts</a> as an alternative to having Exchange inflicted on us by the central University.  Over at Recoil, my personal mail is switching to <a href=\"https://github.com/stalwartlabs/stalwart\">Stalwart JMAP</a>. While there's still no good JMAP <em>client</em> that replaces my day-to-day email, all these services also expose IMAP for that purpose. 
This clears the way for me to vibecode up a JMAP library in OCaml for the programmatic fragments that I so crave!</p>\n<h2 id=\"buildinghhhvibing-ocaml-jmap\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#buildinghhhvibing-ocaml-jmap\"></a>Building^H^H^Hvibing OCaml JMAP</h2>\n<p>Interestingly, I had also tried to build an OCaml JMAP waaay back when <a href=\"/notes/claude-copilot-sandbox\">Claude Code was first released</a>, and the difference in nine months of model development is massive; browse my <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap/tree/old1\">first attempt here</a> and see what a mess it was.</p>\n<p>But now that I've got a bunch of examples under my belt, the latest round of vibing of <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap\">ocaml-jmap</a> was extremely fast. The agent had examples of jsont use from <a href=\"/notes/aoah-2025-2\">jsonfeed</a> and the <a href=\"/notes/aoah-2025-14\">Karakeep REST client</a>, and <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests\">ocaml-requests</a> provides direct-style HTTP requests with bearer authentication support. I <a href=\"/notes/aoah-2025-16\">vibesplained</a> a Javascript tutorial to help me view messages.</p>\n<h3 id=\"building-a-browser-based-jmap-client-to-test\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#building-a-browser-based-jmap-client-to-test\"></a>Building a browser-based JMAP client to test</h3>\n<p>However, I wanted to go a little further than <em>just</em> notebook style tutorials. In addition to compiling OCaml code to JavaScript, there's also really good support for programming browser APIs directly through libraries like <a href=\"https://github.com/dbuenzli/brr\">Brr</a>. 
These expose an OCaml interface that allows us to build up browser-specific logic to better integrate within it.</p>\n<p><img src=\"/images/aoah-jmap-ss-1.webp\" alt=\"%c\" title=\"Connecting to fastmail from a browser OCaml JMAP compiled to JavaScript\" ></p>\n<p>So I took the notebook idea one step further and prompted up an entire JMAP client that runs in the browser (albeit a very simple one!). This <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap/blob/main/lib/js/jmap_brr.mli\">Jmap_brr</a> library looks just like normal OCaml, except that it uses libraries designed to run in the browser as its dependencies. The interface shows some differences:</p>\n<pre><code class=\"language-ocaml\">type connection\nval api_url : connection -&gt; Jstr.t\n(** [api_url conn] returns the API URL for requests. *)\n\n(** {1 Session Establishment} *)\n\nval get_session :\n  url:Jstr.t -&gt;\n  token:Jstr.t -&gt;\n  (connection, Jv.Error.t) result Fut.t\n(** [get_session ~url ~token] establishes a JMAP session.\n</code></pre>\n<p>Instead of <code>string</code>, this uses <a href=\"https://erratique.ch/software/brr/doc/Jstr/index.html\">Jstr.t</a> for JavaScript strings. Instead of Eio, this also uses <a href=\"https://erratique.ch/software/brr/doc/Fut/index.html\">Fut.t</a> to encode future promises.\nBut aside from those things, it's just plain OCaml! 
The <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap/blob/main/lib/js/jmap_brr.ml\">implementation</a> makes Fetch requests to connect to a remote JMAP server in the browser.</p>\n<pre><code class=\"language-ocaml\">let fetch_json ~url ~meth ~headers ?body () =\n  Console.(log [str &quot;&gt;&gt;&gt; Request:&quot;; str (Jstr.to_string meth); str (Jstr.to_string url)]);\n  (match body with\n   | Some b -&gt; Console.(log [str &quot;&gt;&gt;&gt; Body:&quot;; b])\n   | None -&gt; Console.(log [str &quot;&gt;&gt;&gt; No body&quot;]));\n  let init = Brr_io.Fetch.Request.init\n    ~method':meth\n    ~headers\n    ?body\n    ()\n  in\n  let req = Brr_io.Fetch.Request.v ~init url in\n  let* response = Brr_io.Fetch.request req in\n</code></pre>\n<p>To use <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap/tree/main/web\">jmap.html</a>, I obtained a (read only!!) API key from my live email server, and then managed to connect to my live email <em>and</em> get a protocol debugger, all from my browser.</p>\n<p><img src=\"/images/aoah-jmap-ss-2.webp\" alt=\"%c\" title=\"The client also dumps the JMAP protocol messages so I can learn more about how it works.\" ></p>\n<p>One important performance point about how this works is that some of the base libraries I'm using take advantage of OCaml's modularity in order to expose <em>browser specific backends</em>. For example, <a href=\"https://github.com/dbuenzli/jsont/blob/main/src/brr/jsont_brr.ml\">Jsont_brr</a> uses the native browser JSON parser while still working with Jsont codecs. 
This sort of casual modularity is an extremely nice and undersung feature of OCaml, and was also key in <a href=\"/papers/2019-mirage-functors\">MirageOS being portable</a> to so many <a href=\"/videos/287364fa-b59c-4b9f-812d-d81cc0c992a5\">embedded devices</a>.</p>\n<h2 id=\"fixing-my-notifications-with-programmable-email\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#fixing-my-notifications-with-programmable-email\"></a>Fixing my notifications with programmable email</h2>\n<p>So now we have a working library, I wanted to go beyond the conventional email clients. With agentic coding, I should be able to construct <em>hundreds</em> of domain-specific bits of code that help me manipulate my email <strong>the way I want it done</strong>.  This was the dream of <a href=\"\">##selfhosting</a> and open source in the first place, but it was too much work to write all that glue code... until now.</p>\n<p><img src=\"/images/aoah-jmap-ss-3.webp\" alt=\"%c\" title=\"The sad state of my 10000s of GitHub notifications, which I mostly ignore these days\" ></p>\n<p>One papercut that bugs both me and <a href=\"https://www.tunbury.org/\">Mark Elvers</a> is the flood of email notifications we get that are rapidly out of date. I get thousands a day from GitHub, and ideally they would be marked as read and filed away automatically when I finish with them on the remote service, and not stay in my Inbox.  This is now easily solvable using OCaml JMAP, so I built a proof of concept to filter away my <a href=\"https://eeg.zulipchat.com\">Zulip notifications</a> nicely.</p>\n<h3 id=\"jmap-keywords-and-cli-fragments\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#jmap-keywords-and-cli-fragments\"></a>JMAP keywords and CLI fragments</h3>\n<p>I built a domain-specific CLI called <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap/blob/main/bin/jmapq.ml\">jmapq</a> which exists solely to implement specialist workflows. 
I'll break this out into a private repo for my own use later, this is just a prototype!</p>\n<p>Zulip sends a nicely structured email with a subject like &quot;#Blogs &gt; My 2025 Advent of Agentic Humps: a new library daily [Cambridge Energy &amp; Environment Group]&quot; every so often which I instructed the agent to <a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap/blob/main/bin/jmapq.ml#L22-46\">parse into an OCaml data structure</a>:</p>\n<pre><code class=\"language-ocaml\">(** Parsed information from a Zulip notification email subject.\n    Subject format: &quot;#Channel &gt; topic [Server Name]&quot; *)\nmodule Zulip_message = struct\n  type t = {\n    id : string;\n    date : Ptime.t;\n    thread_id : string;\n    channel : string;\n    topic : string;\n    server : string;\n    is_read : bool;\n    labels : string list;\n  }\n\n  (** Parse a Zulip subject line of the form &quot;#Channel &gt; topic [Server Name]&quot; *)\n  let parse_subject subject =\n    (* Pattern: #&lt;channel&gt; &gt; &lt;topic&gt; [&lt;server&gt;] *)\n    let channel_re = Re.Pcre.regexp {|^#(.+?)\\s*&gt;\\s*(.+?)\\s*\\[(.+?)\\]$|} in\n    match Re.exec_opt channel_re subject with\n    | Some groups -&gt;\n        let channel = Re.Group.get groups 1 in\n        let topic = Re.Group.get groups 2 in\n        let server = Re.Group.get groups 3 in\n        Some (channel, topic, server)\n    | None -&gt; None\n</code></pre>\n<p>Once we have this, I then added another CLI command to list all the Zulip notifications in my email, which it outputs using nicely structured JSON output that is parseable by <a href=\"https://github.com/jqlang/jq\">jq</a>.</p>\n<p><img src=\"/images/aoah-jmap-ss-4.webp\" alt=\"%c\" title=\"Given a read only API key, this parses out just what I need from Zulip\" ></p>\n<p>Then I instructed the CLI to give me more grouped views, so I can see notifications in my email by topic and by server (I now also get notifications from other Zulips I'm on like the <a 
href=\"https://ocaml.zulipchat.com\">OCaml</a> or <a href=\"https://zulip.com/case-studies/lean/\">Lean</a>).</p>\n<p><img src=\"/images/aoah-jmap-ss-5.webp\" alt=\"%c\" title=\"The vibe coded CLI can show really specific queries like my Zulip channels\" ></p>\n<p>And then the really sketchy bit. I swapped API keys to have a read/write one (it could delete all my email in theory), and then -- while only mildly sweating -- ran the timeout command. I did add a dry-run mode so I could see the query first before I let it rip.</p>\n<p><img src=\"/images/aoah-jmap-ss-6.webp\" alt=\"%c\" title=\"Running AI against my live email with a vibe coded client is not how I imagined 2025 to go.\" ></p>\n<p>And et voila, it added the right keywords and marked things as read, and my live email view in Fastmail now allows for <a href=\"https://www.fastmail.help/hc/en-us/articles/360060591213-Searching-your-mail\">keyword searches</a> that show me <em>exactly</em> what I want.</p>\n<p><img src=\"/images/aoah-jmap-ss-7.webp\" alt=\"%c\" title=\"I'm not sure what the limit on keywords is, but Fastmail's search interface is great\" ></p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>Today's been great; I managed to find a papercut that's been bugging me for\nyears, and code up something quickly that genuinely makes my personal workflows\na little bit better. I used to do this a <a href=\"/projects/perscon\">lot</a> before the pandemic, but somehow\nthe shift to online services has really damaged my ability to program my own\ndigital life.</p>\n<p>I feel back in control again; although\n<a href=\"https://tangled.org/anil.recoil.org/ocaml-jmap\">ocaml-jmap</a> should be used by\nanyone else with extreme caution, I'm rigging up some ZFS snapshots of my own\nemail so I can code up a few hundred custom agents over things and see how it\ngoes.  
But mostly, I'm happy to have found a glimmer of the <a href=\"https://resonantcomputing.org\">Resonant Computing manifesto</a>\nthrough the medium of strong OCaml types, self hosted services and agentic coding glue:</p>\n<blockquote>\n<p>Dedicated: Software should work exclusively for you, ensuring contextual\nintegrity where data use aligns with your expectations. You must be able to\ntrust there are no hidden agendas or conflicting interests.\n<cite>-- <a href=\"https://resonantcomputing.org\">Resonant Computing Manifesto</a>, 2025</cite></p>\n</blockquote>\n<p>I'm going to work on wrestling GitHub and Netdata under control next. And what about the other side of the Zulip notifications? Well, I'll knock up a Zulip bot in OCaml tomorrow, so stay tuned for that!</p><h1>References</h1><ul><li>Radanne et al (2019). Programming Unikernels in the Large via Functor Driven Development. arXiv. <a href=\"https://doi.org/10.48550/arXiv.1905.02529\" target=\"_blank\"><i>10.48550/arXiv.1905.02529</i></a></li>\n<li>Madhavapeddy (2025). Oh my Claude, we need agentic copilot sandboxing right now. <a href=\"https://doi.org/10.59350/aecmt-k3h39\" target=\"_blank\"><i>10.59350/aecmt-k3h39</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-17",
      "title": "AoAH Day 17: OCaml JMAP to plaster my painful email papercuts",
      "summary": "Building an OCaml JMAP client that runs in browsers and CLI, then using it to build personalised email workflows for taming notification overload.",
      "date_published": "2025-12-17T00:00:00.000000Z",
      "date_modified": "2025-12-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "email"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.1905.02529",
          "doi": "10.48550/arXiv.1905.02529",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/aecmt-k3h39",
          "doi": "10.59350/aecmt-k3h39",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-16",
      "content_html": "<p>After the successful <a href=\"/notes/aoah-2025-15\">HTML5 translation</a> yesterday, I realised\nthat I know next to nothing about HTML5 parsing and had leant extremely heavily\non agentic coding. This approach has also been useful to help me explore\n<a href=\"/notes/aoah-2025-13\">diverse codebases</a> in a combination of languages. So today I\nset my sights on understanding the pedagogical impacts of agentic coding a bit\nmore. Can we use coding agents to help us iteratively explore complex\nprotocols?</p>\n<p>I decided to build a <a href=\"https://jmap.io\">JMAP email</a> client implementation in\nOCaml that I need for myself<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup>  but with the added twist of seeing how I\ncould engineer agents to &quot;vibesplain&quot; a protocol to me that I'm unfamiliar\nwith.</p>\n<p>OCaml has superb tooling to help with this; it can not only compile to efficient native code but also to <a href=\"https://github.com/ocsigen/js_of_ocaml\">JavaScript and WASM</a> that runs standalone in the browser. 
I turned to my colleagues <a href=\"https://jon.recoil.org\">Jon Ludlam</a>, <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://github.com/art-w\">Arthur Wendling</a> for help with the tooling, since they've been leading the way on <a href=\"https://patrick.sirref.org/var/propl2025.pdf\">scientific programming</a>, <a href=\"https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html\">visualisations</a> and <a href=\"https://art-w.github.io/x-ocaml/\">webcomponents</a> in OCaml.</p>\n<p>Today's work resulted in an <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-json-pointer\">ocaml-json-pointer</a></strong> (<a href=\"https://datatracker.ietf.org/doc/html/rfc6901\">RFC6901</a>) implementation along with an <a href=\"https://www.cl.cam.ac.uk/~avsm2/json-pointer/\">interactive notebook tutorial</a> that bundles the entire OCaml compiler toolchain alongside it. There's even another one for <a href=\"https://www.cl.cam.ac.uk/~avsm2/yamlrw-doc\">Yaml</a> just to illustrate how easy this is to replicate once we've built the first one.</p>\n<h2 id=\"why-do-we-need-to-vibesplain-specifications\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#why-do-we-need-to-vibesplain-specifications\"></a>Why do we need to vibesplain specifications?</h2>\n<p>I first began by informally scanning the <a href=\"https://www.rfc-editor.org/rfc/rfc8620\">JMAP RFCs</a> with a cup of tea, and noticing that it needed support for something called <a href=\"https://datatracker.ietf.org/doc/html/rfc6901\">JSON Pointer</a>. I know nothing about this, and so I used my <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-internet-rfc\">Claude Internet RFC skill</a> to sketch out an <a href=\"https://tangled.org/anil.recoil.org/ocaml-json-pointer\">OCaml implementation</a> based on the specification.</p>\n<p>This is straightforward, but I didn't know how to evaluate the generated interface for fitness since I'm not familiar with the topic. 
Ideally, I could use the coding agent to explain JSON Pointers to me <em>by using the generated OCaml code illustratively</em> so I could do a code review of its output at the same time as educating myself. For this to work, we would have to machine check the generated documentation and code to make sure it all compiles.\nThis seems like a really good use to put coding agents to!</p>\n<p>I dub <strong>vibesplaining</strong> as using an agent to generate executable code <em>and an executable tutorial</em> that can be interacted with by the prompter in order to assist with both understanding the code and iterating on the implementation.</p>\n<p>When writing <a href=\"https://dev.realworldocaml.org\">Real World OCaml</a> about a decade ago, we built it using a tool called <a href=\"https://github.com/realworldocaml/mdx\">mdx</a> that allowed for embedding OCaml sections within a Markdown document. Mdx can compile these markdown documents by executing the embedded OCaml, and promote the outputs directly into the document itself. You can see an example in the <a href=\"https://github.com/realworldocaml/book/tree/master/book/guided-tour\">Guided Tour chapter</a> where every single OCaml phrase has been machine generated. 
<a href=\"https://github.com/yminsky\">Yaron Minsky</a> has talked extensively about how successful this &quot;expect test&quot; approach is within Jane Street: they use it for <a href=\"https://blog.janestreet.com/using-ascii-waveforms-to-test-hardware-designs/\">hardware waveforms</a>, <a href=\"https://blog.janestreet.com/repeatable-exploratory-programming/\">exploratory programming</a> and <a href=\"https://blog.janestreet.com/the-joy-of-expect-tests/\">interactive specification</a>.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/json-pointer/\"> <img src=\"/images/aoah-json-ss-2.webp\" alt=\"%c\" title=\"What we'll explore in the rest of this post is how having an interactive JS notebook tutorial of OCaml code is now straightforward!\" > </a></p>\n<h2 id=\"executable-ocaml-documentation-within-ocaml-comments\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#executable-ocaml-documentation-within-ocaml-comments\"></a>Executable OCaml documentation within OCaml comments</h2>\n<p>Upon consulting <a href=\"https://jon.recoil.org\">Jon Ludlam</a>, he told me that <a href=\"https://jon.recoil.org/blog/2025/04/this-site.html\">odoc v3 supports executable code fragments</a> already, so that we don't even need a separate Markdown file anymore. The <em>comments</em> in OCaml code can themselves contain OCaml code that can be compiled into JavaScript and run in a browser!</p>\n<p>I had never even looked at the tooling for this aspect of odoc, but <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>\nhas been building a <a href=\"https://github.com/patricoferris/ocaml-bibtex\">BibTeX OCaml</a> that is extensively using\nthis workflow. 
His <a href=\"https://github.com/patricoferris/ocaml-bibtex/blob/main/src/dune\">dune file</a> just\nadds a new stanza:</p>\n<pre><code>(mdx\n  (files index.mld bib.mli)\n  (libraries bib bytesrw astring))\n</code></pre>\n<p>This stanza handles the build logic for odoc and mdx to interface and extract\nexecutable code out of <a href=\"https://github.com/patricoferris/ocaml-bibtex/blob/main/src/bib.mli\">interface\nfiles</a>\nlike:</p>\n<pre><code class=\"language-ocaml\">(** ... \n  Use {! decode} to read your Bibtex data and {! encode} to write\n  it. For example, here is a little program to format a Bibtex file\n  from a string.\n\n  {@ocaml[\n    open Bytesrw\n\n    let format s =\n      let reader = Bytes.Reader.of_string s in\n      let writer = Bytes.Writer.of_out_channel Out_channel.stdout in\n      Bib.encode (Bib.decode reader) writer\n  ]}\n\n  You can then use it with something like:\n\n  {@ocaml[\n    # format &quot;@string{  hello=world }&quot;;;\n    @string{\n      hello=world\n    }\n    - : unit = ()\n  ]}\n</code></pre>\n<p>OCaml inception right here! 
The <code>#</code> at the start of the fragment indicates that it's an <a href=\"https://ocaml.org/docs/toplevel-introduction\">OCaml toplevel phrase</a>, otherwise it's library code.</p>\n<h3 id=\"vibesplaining-a-json-pointer-tutorial\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#vibesplaining-a-json-pointer-tutorial\"></a>Vibesplaining a JSON pointer tutorial</h3>\n<p>I used this expert knowledge to then vibesplain up a\n<a href=\"https://tangled.org/anil.recoil.org/ocaml-json-pointer/blob/main/doc/tutorial.mld\">tutorial.mld</a>,\nwhich is a standalone ocamldoc fragment that explains JSON Pointers to me.\nHere's an illustrative fragment:</p>\n<pre><code class=\"language-sh\">From RFC 6901, Section 1:\n\n{i JSON Pointer defines a string syntax for identifying a specific value\nwithin a JavaScript Object Notation (JSON) document.}\n\nIn other words, JSON Pointer is an addressing scheme for locating values\ninside a JSON structure. Think of it like a filesystem path, but for JSON\ndocuments instead of files.\n\nFor example, given this JSON document:\n\n{x@ocaml[\n# let users_json = parse_json {|{\n    &quot;users&quot;: [\n      {&quot;name&quot;: &quot;Alice&quot;, &quot;age&quot;: 30},\n      {&quot;name&quot;: &quot;Bob&quot;, &quot;age&quot;: 25}\n    ]\n  }|};;\nval users_json : Jsont.json =\n  {&quot;users&quot;:[{&quot;name&quot;:&quot;Alice&quot;,&quot;age&quot;:30},{&quot;name&quot;:&quot;Bob&quot;,&quot;age&quot;:25}]}\n]x}\n\nThe JSON Pointer [/users/0/name] refers to the string [&quot;Alice&quot;]:\n\n{@ocaml[\n# let ptr = of_string_nav &quot;/users/0/name&quot;;;\nval ptr : nav t = [Mem &quot;users&quot;; Nth 0; Mem &quot;name&quot;]\n# get ptr users_json;;\n- : Jsont.json = &quot;Alice&quot;\n]}\n</code></pre>\n<p>This tutorial is not only available in the code repo, but it's also present in the\ngenerated HTML library documentation for the JSON Pointer library as well.  
It will,\nfor example, appear on the central <a href=\"https://ocaml.org\">ocaml.org</a> if this package\nis ever submitted to the official package repository.</p>\n<h3 id=\"interactive-vibesplaining-with-javascript\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#interactive-vibesplaining-with-javascript\"></a>Interactive vibesplaining with JavaScript</h3>\n<p>However, the tutorial would be even more useful to me if it was fully interactive so\nI can mess around with it while learning the codebase.\n<a href=\"https://github.com/art-w\">Arthur Wendling</a> recently released a <a href=\"https://www.webcomponents.org\">webcomponent</a> for OCaml called\n<a href=\"https://github.com/art-w/x-ocaml\">x-ocaml</a> that allows a full OCaml compiler to be embedded\ninto a webpage. This is how I built the recent\n<a href=\"/notes/icfp25-oxcaml\">ICFP 2025 OxCaml tutorial</a>, but I figured it would also be an easy\nway of redistributing a fully standalone version of the JSON Pointer library.</p>\n<p>Being a webcomponent, <a href=\"https://art-w.github.io/x-ocaml/\">using x-ocaml</a> is as simple as including a script header and adding\nan <code>&lt;x-ocaml&gt;</code> tag with the OCaml code.  The only extra thing I needed was to create a Dockerfile to install my local library\ninto a separate opam switch and then run the <code>x-ocaml</code> shell script to generate the libraries blob.</p>\n<p>We still need to compile our odoc source tutorial into HTML with the right <code>x-ocaml</code> tags. <a href=\"https://jon.recoil.org\">Jon Ludlam</a> explained to me that odoc mld files can be compiled straight to HTML using a sequence of commands:</p>\n<pre><code class=\"language-sh\">$ odoc compile &lt;file.mld&gt; \n$ odoc html-generate page-file.odoc -o . 
\n$ odoc support-files -o .\n</code></pre>\n<p>I then used my <a href=\"/notes/aoah-2025-15\">html5rw library</a> from yesterday to write a utility to transform the odoc HTML output by rewriting to introduce the <code>&lt;x-ocaml&gt;</code> tags. I've published this as an <a href=\"https://tangled.org/anil.recoil.org/odoc-xo\">odoc-xo</a> binary.  This then sits inside some dune rules in my JSON Pointer repository to automate the conversion from mld to interactive Javascript. You can see the <a href=\"https://tangled.org/anil.recoil.org/ocaml-json-pointer/blob/main/doc/dune\">promotion rules here</a>; I'll wrap them into a Claude skill later this week.</p>\n<p>You can browse the <a href=\"https://www.cl.cam.ac.uk/~avsm2/json-pointer/\">interactive tutorial\nhere</a> for JSON Pointer, which I\nfound very useful for exploring the specification interactively. I did have a\nquick go at compiling my earlier <a href=\"/notes/aoah-2025-4\">Claude OCaml bindings</a> in order\nto be able to invoke Claude during my explorations, but I got interrupted by a\nvery pleasant mince pie celebration so I'll leave that as an exercise to the\nreader.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>There are still a few tooling rough edges to smooth out here, but nothing that can't be automated with a Claude skill and some more binaries like <a href=\"https://tangled.org/anil.recoil.org/odoc-xo\">odoc-xo</a>. In <a href=\"/notes/aoah-2025-17\">Day 17</a> I'll finish up the JMAP implementation as well.</p>\n<p>More generally, what I find fascinating about agentic skills is that once you apply a template somewhere, it can be reused very quickly. 
It took me a few hours and a <em>lot</em> of input from <a href=\"https://jon.recoil.org\">Jon Ludlam</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> to come up with the first version today, but then it took less than 10 minutes to generate an equivalent one for other repositories. For example, here's the <a href=\"/notes/aoah-2025-6\">Yaml</a> library <a href=\"https://www.cl.cam.ac.uk/~avsm2/yamlrw-doc\">explainer tutorial</a> if you really want to know more.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/yamlrw-doc\"> <img src=\"/images/aoah-yamlrw-doc-1.webp\" alt=\"%c\" title=\"All you ever wanted to interactively be vibesplained to about... Yaml\" > </a></p>\n<p>These tutorials aren't high quality writing, especially compared to ones that I've really thought through and spent time on for specific teaching outcomes. However, agent generated &quot;vibesplained&quot; tutorials can also be very personalised even if not redistributed. For example, I want to know about recent advances in TCP, and so I am feeding the agent my <a href=\"https://github.com/mirage/mirage-tcpip\">mirage-tcpip</a> stack and the latest RFCs, and requesting a tutorial about new concepts that I might have missed in recent years. Being able to provide context and generate <em>executable</em> tutorial notebooks is a new capability I've discovered today.</p>\n<p>This does, of course, depend on having a high quality baseline of documentation in my dependent libraries, and of exemplar human-written rules and documentation. Thank you to <a href=\"https://jon.recoil.org\">Jon Ludlam</a>, <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, <a href=\"https://github.com/art-w\">Arthur Wendling</a> and <a href=\"https://erratique.ch\">Daniel Bünzli</a> for going to the (long) effort of giving me the base material to work from here. 
I am very appreciative of your contributions to OCaml open source.</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>Being a professor is not accurately measured by research or teaching outputs, but by how overloaded your INBOX is.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-16",
      "title": "AoAH Day 16: Vibesplaining JSON Pointers using OCaml/Javascript",
      "summary": "Building interactive OCaml tutorials that compile to JavaScript, using agents to generate executable documentation that teaches protocols like JSON Pointer while you code review.",
      "date_published": "2025-12-16T00:00:00.000000Z",
      "date_modified": "2025-12-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "email"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-15",
      "content_html": "<p>After my success with <a href=\"/notes/aoah-2025-6\">Yaml 1.2</a> in pure OCaml, I found <a href=\"https://github.com/EmilStenstrom/justhtml\">JustHTML</a>, a new Python library for parsing HTML5 by <a href=\"https://friendlybit.com\">Emil Stenström</a> <em>(via <a href=\"https://simonwillison.net/\">Simon Willison</a> <a href=\"https://simonwillison.net/2025/Dec/14/justhtml/\">posting</a> about it)</em>.\nEmil wrote JustHTML <a href=\"https://friendlybit.com/python/writing-justhtml-with-coding-agents/\">using coding agents</a> as well, and then Simon <a href=\"https://simonwillison.net/2025/Dec/15/porting-justhtml/\">ported it to JavaScript in a few hours</a>.</p>\n<p>My question, though, is how difficult is it to go in the <em>other</em> direction and move towards a strongly typed interface like OCaml's. Could we ultimately distill down the extremely complex set of rules around parsing HTML all the way into a proof assistant like Lean, but hopping via OCaml and Haskell to provide convenient executable pitstops?</p>\n<p>Today's task was to vibespile the Python into <strong><a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw\">ocaml-html5rw</a></strong>, a pure OCaml HTML5 parser and serialiser that passes the <a href=\"https://github.com/html5lib/html5lib-tests\">browser test suite</a> 100%.</p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>I took a very similar approach to my earlier <a href=\"/notes/aoah-2025-6\">Yaml 1.2</a> port,\ndepending purely on <a href=\"https://github.com/dbuenzli/bytesrw\">Bytesrw</a> as the\nstring decoding/encoding codec. I instructed the agent not to use any other\nexternal libraries, and to build up a test suite that could decode the\nhtml5lib-tests to act as an external oracle. 
I also used my <a href=\"/notes/aoah-2025-11\">earlier</a> Claude skills to tidy up OCaml code that was generated.</p>\n<p>I also used an earlier trick to get my freshly generated OCaml test suite (which is itself parsing the\nthird-party html5lib expect tests) to output a standalone HTML report. This was\nextremely useful to see progress, but also for me to understand what sort of thing HTML5 parsing involves. You can <a href=\"https://www.cl.cam.ac.uk/~avsm2/html5rw/\">browse a snapshot</a> to judge for yourself.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/html5rw/\"> <img src=\"/images/aoah-html5-ss-tests.webp\" alt=\"%c\" title=\"The HTML5 test suite is quite scary, but this is what browser developers have to deal with.\" > </a></p>\n<p>I'd never realised before doing this that, unlike many parsers, HTML5\nparsing never actually fails. The <a href=\"https://html.spec.whatwg.org/multipage/\">WHATWG\nspecification</a> is a living standard\nthat defines error recovery rules for almost every possible malformed input,\nensuring all HTML documents produce a valid DOM tree just like browsers do.\nSo, the biggest danger here is that we parse a minor syntax error into a nonsensical\nDOM that is &quot;far away&quot; from the author's intention.</p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>The HTML5 port took a few hours, and the resulting library seems to pass the\nHTML5 tests without much drama. One footgun is that the test runner itself was\nquite complex, and hidden in there was some skipping of test cases. So my\nreview of this library actually happens in reverse: I read the test runners\nfirst to figure out what's going on, only working backwards to the library\nitself. 
A very strange workflow...</p>\n<p><img src=\"/images/aoah-html5-ss-1.webp\" alt=\"%c\" title=\"The planning using subagents churned through the tests fairly quickly\" ></p>\n<p>There's nothing too surprising so far; after all <a href=\"https://simonwillison.net/\">Simon Willison</a> ported it to\nJavaScript in no time at all. So what makes this interesting to do in OCaml\n<em>vs</em> Python or JavaScript? The obvious thing is modules and ease of\nrefactoring, so I spent some time critically going through the resulting\nmodule structure of the library itself.</p>\n<h3 id=\"avoid-reinventing-the-wheel\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#avoid-reinventing-the-wheel\"></a>Avoid reinventing the wheel</h3>\n<p>A lot of the code inside the library looked suspiciously like it was reinventing the Unicode wheel, with lots of character-encoding specific manipulation. So I cloned some key libraries from the OCaml ecosystem that are both fairly standalone and well engineered: <a href=\"https://github.com/dbuenzli/astring\">astring</a>, <a href=\"https://github.com/dbuenzli/uutf\">uutf</a>, and <a href=\"https://github.com/dbuenzli/uunf\">uunf</a> by <a href=\"https://erratique.ch\">Daniel Bünzli</a>.</p>\n<p><img src=\"/images/aoah-html5-ss-3.webp\" alt=\"%c\" title=\"The planning process uses parallel agents to explore each library, to minimise context usage.\" >\n<img src=\"/images/aoah-html5-ss-4.webp\" alt=\"%c\" title=\"The agent then returns with results of the search including recommendations.\" ></p>\n<p>I cloned a few candidate libraries, and the agentic analysis determined that only some of them were relevant to the problem at hand, so I discarded the rest and focussed on the top set.</p>\n<p>The <a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw/commit/958671e9df25f94f38795ada0d291aa5355f024a\">resulting diff</a> crunched the size of the library down considerably. 
Since we had extensive test coverage, the 100% pass rate at the end built up some confidence that semantics weren't changed too badly. The existence of OCaml interface files also meant I could inspect those separately in the diff and be satisfied that only implementations had changed.</p>\n<h3 id=\"types-making-browsing-the-specification-fun\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#types-making-browsing-the-specification-fun\"></a>Types making browsing the specification fun</h3>\n<p><a href=\"https://dev.realworldocaml.org/files-modules-and-programs.html\">Modules</a> and types are defining features of OCaml, so I pointed the agent at the WHATWG standard and asked it to introduce explanations directly into the interface files themselves. This is quite different from an informal specification; the guidelines can be browsed directly and navigated around via the <a href=\"https://www.cl.cam.ac.uk/~avsm2/htmlrw-doc/html5rw/Html5rw/index.html\">odoc HTML output</a>.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/htmlrw-doc/html5rw/Html5rw/index.html\"> <img src=\"/images/aoah-html5-ss-odoc.webp\" alt=\"%c\" title=\"Learn all sorts of random HTML5 facts like context sensitive fragment parsing by browsing the OCaml docs!\" > </a></p>\n<p>When I was browsing the types, I realised that there were too many strings involved in parsing errors, and so the agent helped synthesise them into an <a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw/commit/c9a783be220f20922d1bf5106852f8ff7711cc19\">extensible OCaml variant</a> that describes things much more precisely.  
This has gotten me thinking about 'doing verification in reverse'; just like <a href=\"https://www.quantamagazine.org/mathematical-beauty-truth-and-proof-in-the-age-of-ai-20250430/\">AI for maths</a> is shifting what it means to do math, this sort of thing is shifting what it means to do formal specification.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>I'll end by asking the same <a href=\"https://simonwillison.net/2025/Dec/15/porting-justhtml/\">questions</a> that Simon did:</p>\n<blockquote>\n<p>I’ll end with some open questions:</p>\n<ul>\n<li>Does this library represent a legal violation of copyright of either the Rust library or the Python one?</li>\n<li>Even if this is legal, is it ethical to build a library in this way?</li>\n<li>Does this format of development hurt the open source ecosystem?</li>\n<li>Can I even assert copyright over this, given how much of the work was produced by the LLM?</li>\n<li>Is it responsible to publish software libraries built in this way?</li>\n<li>How much better would this library be if an expert team hand crafted it over the course of several months?\n<cite>-- <a href=\"https://simonwillison.net/2025/Dec/15/porting-justhtml/\">I just ported JustHTML from Python to JavaScript</a>, Simon Willison, 2025</cite></li>\n</ul>\n</blockquote>\n<p>I feel the last question is answered most easily: an expert team that has access to these tools <em>and</em> the domain knowledge about HTML5 should be able to do a good job. While I have no experimental evidence in this domain about that fact, we did find earlier this year that <a href=\"/papers/2024-ce-llm\">expert-level retrieval of conservation evidence could be significantly boosted</a> via access to <a href=\"/papers/2025-evidence-tap\">living evidence databases</a>. 
I feel agentic search sits alongside the same 'needle-in-a-haystack' productivity boost; scanning through thousands of HTML5 test cases to find the problem is something that agents are better at than humans who do not have domain knowledge (like me, in this case).</p>\n<p>The question of copyright and licensing is difficult. I definitely did <em>some</em>\nediting by hand, and a fair bit of prompting that resulted in targeted code\nedits, but the vast majority of the architectural logic came from JustHTML. So I\nopted to make the <a href=\"https://tangled.org/anil.recoil.org/ocaml-html5rw/blob/main/LICENSE.md\">LICENSE a joint one</a>\nwith <a href=\"https://friendlybit.com\">Emil Stenström</a>. I did not follow the transitive dependency through to the Rust\none, which I probably should have.</p>\n<p>I'm also extremely uncertain about ever releasing this library to the central\nopam repository, especially as there are <a href=\"https://github.com/aantron/lambdasoup\">excellent HTML5\nparsers</a> already available. I haven't\nchecked if those pass the HTML5 test suite, because this is wandering into the\nagents <em>vs</em> humans territory that I ruled out in my <a href=\"/notes/aoah-2025\">ground rules</a>.\nWhether or not this agentic code is better is a moot point if releasing\nit drives away the human maintainers who are the source of creativity in the code!</p>\n<p>I note that throughout my entire AoAH adventure so far, most of the code I've\ngenerated has been spectacularly unoriginal. 
I've received tons of help and\nexamples of how to use tools from colleagues as <em>input</em>, but the aggregate set\nof OCaml code <em>output</em> that I would class as refreshingly interesting from me\nis pretty minimal.\nSo in the long term, I don't think this process is &quot;helping the core&quot; of the\ncommunity by coming up with beautiful functional pearls.</p>\n<p>However, the libraries <em>are</em> satisfying an important need for utility: some\nthings like Yaml and HTML5 are fundamentally such ugly formats that I find it\nhard to argue for elegant solutions therein, and yet we need to manipulate them\nto live in the real world. So I'm ending up with a utilitarian plea, with some\nangst that this might be smothering the functional flame that makes OCaml so\nmuch fun in the first place. I must ask <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a> for her views on where\nplay fits into programming the next time I see her!</p>\n<p>Tomorrow in <a href=\"/notes/aoah-2025-16\">Day 16</a> we'll use this new library to help with\n&quot;vibesplaining&quot; code via executable notebooks.</p><h1>References</h1><ul><li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-15",
      "title": "AoAH Day 15: Porting a complete HTML5 parser and browser test suite",
      "summary": "Vibespiling JustHTML from Python to pure OCaml, achieving 100% pass rate on the browser html5lib test suite using agentic workflows.",
      "date_published": "2025-12-15T00:00:00.000000Z",
      "date_modified": "2025-12-15T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "web"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-14",
      "content_html": "<p>With the <a href=\"/notes/aoah-2025-13\">Requests</a> library under my belt, I finally got to what\nI actually need for myself: vibe coding OCaml library interfaces to my\nself-hosted services that contain most of my data.</p>\n<p>To start with, I use\n<a href=\"https://karakeep.app\">Karakeep</a> across all my devices to bookmark things, and\nI'd like to be able to programmatically search through tags, for example by\ntaking all outbound links from the blogs that I read and autosynching them with\nmy remote service. Karakeep on the server side does some cool things like\nscreenshot links and create local webarchives.</p>\n<p>Unfortunately, Karakeep doesn't publish an OCaml interface.  Fortunately, my\nnew bestie Claude helped me build\n<a href=\"https://tangled.org/anil.recoil.org/ocaml-karakeep\">ocaml-karakeep</a> without\nmuch input from me!</p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p><a href=\"https://karakeep.app\"> <img src=\"/images/karakeep-ss.webp\" alt=\"%c\" title=\"I keep a ton of bookmarks in Karakeep, and its Reader mode is convenient to quickly scan papers and such.\" > </a></p>\n<p>With a ton of pre-Christmas organising to do this weekend, I didn't have a lot of\nspare time. So I thought I'd use everything I've learnt so far and try to\none-shot this as bravely/foolishly as possible. I set up a repo with:</p>\n<ul>\n<li>Access to all the previous libraries, including the <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed\">jsonfeed</a> to act as an 'exemplar' for using jsont, as well as <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests\">requests</a>.</li>\n<li>Cloned the <a href=\"https://github.com/karakeep-app/karakeep\">karakeep source code</a> which has the details of the API somewhere in the Node source. 
It uses something called <a href=\"https://trpc.io\">tRPC</a> that I'm not familiar with.</li>\n<li>I took a ZFS snapshot of my karakeep deployment, and gave my agent a <strong>live API key</strong> to access my node. Yes, this is the 'take the seatbelt off' moment.</li>\n</ul>\n<p>And then I prompted the agent to build jsont specifications derived from the\nKarakeep source, and then hook this up to Requests, and then iterate using my\nlive key to debug failures until the API spec was right.</p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>This almost worked first time! While the testcases generated worked, I <em>also</em> got the agent to generate a karakeep CLI command. I'd made sure that, as with <a href=\"/notes/aoah-2025-3\">xdge</a>, the Requests library also exposes Cmdliner terms that permit easy integration of HTTP requests into any CLI (to configure logging, proxies, that sort of thing).</p>\n<p><img src=\"/images/karakeep-ss-3.webp\" alt=\"%c\" title=\"I could just invoke the new karakeep CLI and try out different API functionality directly.\" ></p>\n<p>Since I didn't have time to do a full coverage test of the API, I asked the agent to exercise the functionality and do live debugging. This is where the jsont codec was incredibly useful, since it provided all the right hints to the agent to take action. 
Good error messages aren't just for humans any more!</p>\n<p>A typical error, for example was:</p>\n<pre><code>&gt; main.exe bookmarks summarize wqa1sc1z42xcxfwwrsxeggbl\nmain.exe: [ERROR] Karakeep error: JSON error: Missing members in bookmark object:\n content\n createdAt\n id\nFile &quot;-&quot;, line 1, characters 0-615: \n</code></pre>\n<p>The agent had enough data from that error message to figure things out and make the jsont codec more permissive with option types.</p>\n<p><img src=\"/images/karakeep-ss-2.webp\" alt=\"%c\" title=\"The agent live debugs the missing JSON fields, looks up the original Karakeep source, and fixes the jsont codecs\" ></p>\n<p>The library also made good use of the stateful HTTP interface from Requests, for example by modifying all the default requests to use the API key:</p>\n<pre><code class=\"language-ocaml\">let create ~sw ~env ~base_url ~api_key =                                                                                                                      \n  let session = Requests.create ~sw env in                                                                                                                    \n  let session =                                                                                                                                               \n    Requests.set_auth session (Requests.Auth.bearer ~token:api_key)                                                                                           \n  in                                                                                                                                                          \n  { session; base_url }\n</code></pre>\n<p>The <code>karakeep.proto</code> subpackage has all the dedicated jsont decoders, which take care of conversion to-and-from OCaml types. 
I found these quite readable:</p>\n<pre><code class=\"language-ocaml\">(** Type of content a bookmark can have *)                                                                                                                    \ntype bookmark_content_type =                                                                                                                                  \n  | Link  (** A URL to a webpage *)                                                                                                                           \n  | Text  (** Plain text content *)                                                                                                                           \n  | Asset  (** An attached asset (image, PDF, etc.) *)                                                                                                        \n  | Unknown  (** Unknown content type *)                                                                                                                      \n                                                                                                                                                              \nval bookmark_content_type_jsont : bookmark_content_type Jsont.t\n</code></pre>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>The new agent trick I learnt today is the power of debugging against live services. The jsont descriptions derived from the Karakeep source code provided just enough information in error messages that the agent could fix them by trying to run the command against my live service. 
Luckily, nothing <a href=\"https://www.reddit.com/r/google_antigravity/comments/1p82or6/google_antigravity_just_deleted_the_contents_of/\">got deleted</a> this time, but in the future projects like <a href=\"https://patrick.sirref.org/weekly-2025-w49/index.xml\">Shelter</a> will be ever more important.</p>\n<p><a href=\"https://mynameismwd.org\">Michael Dales</a> <a href=\"https://eeg.zulipchat.com/#narrow/channel/522690-Blogs/topic/My.202025.20Advent.20of.20Agentic.20Humps.3A.20a.20new.20library.20daily/near/563008635\">asked</a> an interesting question: <em>&quot;Why OCaml over an even more strict language?&quot;</em>.  My instinct is that OCaml is pretty much the best choice here; I think it's pretty hard to find a usable language with a stronger typing discipline. The more exotic the type systems get, the harder it is to do proof discharge. However, I believe that with some reinforcement learning magic, <a href=\"/notes/icfp25-oxcaml\">OxCaml's lifetimes</a> might be gamechanging when combined with agents that can also reason about performance and memory in addition to conventional type safety.</p>\n<p>Anyway, I now have a <a href=\"https://tangled.org/anil.recoil.org/ocaml-karakeep\">karakeep binary and library</a> that's working great for my search needs. Back to bookmarking Christmas presents I needed to have obtained yesterday!</p><h1>References</h1><ul><li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-14",
      "title": "AoAH Day 14: Debugging a Karakeep CLI against the live service",
      "summary": "Vibe coding an OCaml library for the Karakeep bookmarking service by giving an agent a live API key and letting it debug jsont codecs against the real service.",
      "date_published": "2025-12-14T00:00:00.000000Z",
      "date_modified": "2025-12-14T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-13",
      "content_html": "<p>Now that I had some <a href=\"/notes/aoah-2025-11\">prerequisite</a> <a href=\"/notes/aoah-2025-12\">libraries</a>, I turned my attention to having a batteries-included OCaml HTTP tool with features like request throttling and redirect loop detection. I've hacked on <a href=\"https://github.com/mirage/ocaml-cohttp\">OCaml HTTP protocol</a> libraries since 2011, but these higher level features weren't necessary in things like <a href=\"/papers/2025-docker-icfp\">Docker's VPNKit</a>. The problem with building one now is that there are <em>loads</em> of random quirks needed in real-world HTTP, which would take ages to figure out if I started from scratch.</p>\n<p>Luckily, there's an <a href=\"/papers/2025-internet-ecology\">entire ecology</a> of HTTP clients built in <em>other</em> languages that I could use for inspiration as well! Today, I <strong>gathered <em>fifty</em> open-source HTTP clients from a variety of other language ecosystems, and agentically synthesised a specification <em>across</em> all of them into one OCaml client</strong> using Eio.</p>\n<p>I'm not sure what the collective noun is for a group of HTTP clients, so I dubbed this whole process a 'heckle' of HTTP coding!</p>\n<h2 id=\"condensing-an-http-implementation-from-fifty-other-implementations\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#condensing-an-http-implementation-from-fifty-other-implementations\"></a>Condensing an HTTP implementation from fifty other implementations</h2>\n<p>The first step was to find the heckle of other implementations, which I grabbed from <a href=\"https://raw.githubusercontent.com/easybase/awesome-http/refs/heads/main/resources.json\">this awesome-http</a> repository. 
I vibed up a <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests/blob/claude-test/tools/clone_repos.ml\">git cloner</a> that fetched the sources to my local dev repo, so the agent could access everything.</p>\n<p><a href=\"https://github.com/easybase/awesome-http\"> <img src=\"/images/awesome-http.webp\" alt=\"%c\" title=\"To paraphrase The Fifth Element: don't want one library, want ALL the libraries!\" > </a></p>\n<p>Then came the big task of differentially coming up with a spec. I used my earlier <a href=\"/notes/aoah-2025-4\">Claudeio</a> OCaml library to build a <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests/blob/claude-test/tools/summarise_recommendations.ml\">custom agent</a> that iterates over each of the fifty repos and uses the <a href=\"https://tangled.org/anil.recoil.org/ocaml-claudeio/blob/main/lib/structured_output.mli\">structured JSON output</a> to emit the recommendations using a simple schema. I left this churning for an hour while I went for my morning run, and came back to records like this:</p>\n<pre><code class=\"language-json\">&quot;priority_features&quot;: [ {\n  &quot;priority_rank&quot;: 1,\n  &quot;category&quot;: &quot;Security &amp; Spec Compliance&quot;,\n  &quot;title&quot;: &quot;Strip sensitive headers on cross-origin redirects&quot;,\n  &quot;description&quot;: &quot;When redirecting to a different origin (host/port/scheme),\n    automatically strip Authorization, Cookie, Proxy-Authorization, and\n    WWW-Authenticate headers to prevent credential leakage to\n    unintended domains. 
Also strip headers on HTTPS-&gt;HTTP protocol downgrade.&quot;,\n  &quot;rfc_references&quot;: [ &quot;RFC 9110 Section 15.4 (Redirection)&quot; ],\n  &quot;source_libraries&quot;: [\n    &quot;reqwest&quot;, &quot;got&quot;, &quot;node-fetch&quot;, &quot;okhttp&quot;, &quot;axios&quot;, &quot;superagent&quot;,\n    &quot;http-client&quot;, &quot;needle&quot; ],\n  &quot;affected_files&quot;: [ &quot;lib/requests.ml&quot;, &quot;lib/one.ml&quot;, &quot;lib/http_client.ml&quot; ],\n  &quot;implementation_notes&quot;: &quot;Implement same_origin check comparing scheme, host,\n   and port. Create a list of sensitive headers to strip. Call\n   strip_sensitive_headers before following any redirect where origin changes.&quot;,\n  &quot;cross_language_consensus&quot;: 8 },\n</code></pre>\n<p>There were <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests/commit/dc03b221b36a1c53d853e628cc3746bbf370e3d4\">hundreds of recommendations</a>, ranked by their class (security, new or missing features, etc), but also how many other language ecosystems viewed\nthis as important along with a list of specific libraries. Each recommendation also included\nimplementation notes specific to OCaml, since the agent had been instructed to differentially\ncompare with the strawman OCaml implementation.</p>\n<p>At this point, I did a manual scan and dropped a few language ecosystems that weren't immediately useful.\nFor example, I'd added in a few <a href=\"https://github.com/algebraic-dev/http\">Lean HTTP clients</a> after\nreading <a href=\"https://martin.kleppmann.com\">Martin Kleppmann</a> predict that <a href=\"https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html\">AI will make formal verification go mainstream</a>. 
However, the Lean clients weren't quite feature complete enough yet, and also don't seem to include any actual verification (it's using Lean as a conventional programming language) so just reverting to Haskell as the reference there seemed fine.</p>\n<p>The agent came back with a nice summary when asked to focus on just Python, Rust and Lean and helped confirm my hypothesis:</p>\n<blockquote>\n<p>Key gaps compared to Python requests and Rust reqwest:</p>\n<div role=\"region\"><table>\n<tr>\n<th>Gap</th>\n<th>Impact</th>\n</tr>\n<tr>\n<td>No proxy support</td>\n<td>Blocks enterprise/corporate adoption</td>\n</tr>\n<tr>\n<td>No compression</td>\n<td>Larger downloads, slower performance</td>\n</tr>\n<tr>\n<td>No Response.json()</td>\n<td>Extra boilerplate for common use case</td>\n</tr>\n<tr>\n<td>No auth stripping on redirects</td>\n<td>Potential credential leakage</td>\n</tr>\n<tr>\n<td>No custom CA certs</td>\n<td>Can't use with self-signed/internal PK</td>\n</tr>\n<tr>\n<td>No client certs (mTLS)</td>\n<td>Missing enterprise auth scenario</td>\n</tr>\n</table></div><p>The Lean HTTP library is interesting for its strong typing but is\nserver-focused and lacks many client features. 
It's not a good model for your\n&quot;batteries included&quot; goal.</p>\n</blockquote>\n<p>The full list of projects I ended up using was pretty epic: JavaScript (<a href=\"https://github.com/axios/axios\">Axios</a>, <a href=\"https://github.com/node-fetch/node-fetch\">node-fetch</a>, <a href=\"https://github.com/sindresorhus/got\">Got</a>, <a href=\"https://github.com/visionmedia/superagent\">superagent</a>, <a href=\"https://github.com/tomas/needle\">Needle</a>),\nPython (<a href=\"https://github.com/psf/requests\">Requests</a>, <a href=\"https://github.com/urllib3/urllib3\">urllib3</a>, <a href=\"https://github.com/httplib2/httplib2\">httplib2</a>, <a href=\"https://github.com/spyoungtech/grequests\">GRequests</a>, <a href=\"https://github.com/prkumar/uplink\">Uplink</a>), Java (<a href=\"https://github.com/eclipse/jetty.project\">Eclipse Jetty</a>, <a href=\"https://github.com/square/okhttp\">OkHttp</a>, <a href=\"https://github.com/internetarchive/heritrix3\">Heritrix</a>, <a href=\"https://github.com/apache/httpcomponents-client\">Apache HttpClient</a>, <a href=\"https://github.com/googleapis/google-http-java-client\">Google HTTP Client</a>, <a href=\"https://github.com/kevinsawicki/http-request\">Http Request</a>), Rust (<a href=\"https://github.com/seanmonstar/reqwest\">reqwest</a>, <a href=\"https://github.com/hyperium/hyper\">hyper</a>, <a href=\"https://github.com/sagebind/isahc\">Isahc</a>, <a href=\"https://github.com/http-rs/surf\">Surf</a>, <a href=\"https://github.com/alexcrichton/curl-rust\">curl-rust</a>), Swift (<a href=\"https://github.com/Alamofire/Alamofire\">Alamofire</a>, <a href=\"https://github.com/daltoniam/SwiftHTTP\">SwiftHTTP</a>, <a href=\"https://github.com/nghialv/Net\">Net</a>, <a href=\"https://github.com/Moya/Moya\">Moya</a>, <a href=\"https://github.com/dduan/Just\">Just</a>, <a href=\"https://github.com/onevcat/Kingfisher\">Kingfisher</a>), Haskell (<a href=\"https://github.com/mrkkrp/req\">Req</a>, <a 
href=\"https://github.com/snoyberg/http-client\">http-client</a>, <a href=\"https://github.com/haskell-servant/servant\">servant-client</a>, <a href=\"https://github.com/aesiniath/http-streams\">http-streams</a>), Go (<a href=\"https://github.com/imroc/req\">Req</a>, <a href=\"https://github.com/go-resty/resty\">Resty</a>, <a href=\"https://github.com/dghubble/sling\">Sling</a>, <a href=\"https://github.com/asmcos/requests\">requests</a>), C++ (<a href=\"https://github.com/apache/serf\">Apache Serf</a>, <a href=\"https://github.com/libcpr/cpr\">cpr</a>, <a href=\"https://github.com/cpp-netlib/cpp-netlib\">cpp-netlib</a>, <a href=\"https://github.com/sprinfall/webcc\">Webcc</a>, <a href=\"https://github.com/facebook/proxygen\">Proxygen</a>, <a href=\"https://github.com/yhirose/cpp-httplib\">cpp-httplib</a>, <a href=\"https://github.com/spotify/NFHTTP\">NFHTTP</a>, <a href=\"https://github.com/sony/easyhttpcpp\">EasyHttp</a>), PHP (<a href=\"https://github.com/guzzle/guzzle\">Guzzle</a>, <a href=\"https://github.com/php-http/httplug\">HTTPlug</a>, <a href=\"https://github.com/amphp/http-client\">HTTP Client</a>, <a href=\"https://github.com/sendgrid/php-http-client\">SendGrid HTTP Client</a>, <a href=\"https://github.com/kriswallsmith/Buzz\">Buzz</a>), Shell/C (<a href=\"https://github.com/httpie/httpie\">HTTPie</a>, <a href=\"https://github.com/curl/curl\">curl</a>, <a href=\"https://github.com/aria2/aria2\">aria2</a>, <a href=\"https://github.com/httpie/http-prompt\">HTTP Prompt</a>, <a href=\"https://github.com/micha/resty\">Resty</a>, <a href=\"https://github.com/jonaslu/ain\">Ain</a>). 
Thank you to all the authors for publishing your respective open-source code!</p>\n<h3 id=\"crunching-together-thousands-of-recommendations-into-a-spec\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#crunching-together-thousands-of-recommendations-into-a-spec\"></a>Crunching together thousands of recommendations into a spec</h3>\n<p>The second phase was to build a <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests/blob/claude-test/tools/summarise_recommendations.ml#L213\">summarisation agent</a> that took the <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests/commit/dc03b221b36a1c53d853e628cc3746bbf370e3d4\">hundreds of recommendations</a> and crunched them up into a unified priority list. Rather than do this by hand, it's very convenient to let the agent's tool use take care of scanning the JSON files, backed up by some forceful prompting to make sure it covers them all (there is no guarantee it will, but hey, this is AI). I also used my <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-internet-rfc\">Claude Internet RFC skill</a> to fetch the relevant specifications so they could also be cross-linked from the recommendations.</p>\n<p>The summariser crunched through them in about a minute using the latest Claude Opus 4.5 model. The <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests/commit/46235dac9d4c575e2546751757e5a779870a61ce\">resulting recommendations</a> were a very clean list of features in priority order. Here's an example of another security feature I'd never considered before:</p>\n<blockquote>\n<p>Implement Header Injection Prevention (Newline Validation)</p>\n<p>Validate that user-provided header names and values do not contain newlines\n(CR/LF) which could enable HTTP request smuggling attacks. 
Reject headers\ncontaining these characters with a clear error message.</p>\n<ul>\n<li><strong>RFC References:</strong>\n<ul>\n<li>RFC 9110 Section 5.5 (Field Syntax)</li>\n<li>RFC 9112 Section 2.2 (Message Format)</li>\n</ul>\n</li>\n<li><strong>Cross-Language Consensus:</strong> 5 libraries</li>\n<li><strong>Source Libraries:</strong> haskell/http-client, rust/hyper, php/guzzle, java/okhttp, go/req</li>\n<li><strong>Affected Files:</strong> <code>lib/headers.ml</code> <code>lib/http_client.ml</code> <code>lib/error.ml</code></li>\n<li><strong>Implementation Notes:</strong> Add validation in Headers.set/Headers.add that\nrejects header names/values containing <code>\\r</code> or <code>\\n</code> characters. Add InvalidHeader\nerror variant with the offending header name for debugging.</li>\n</ul>\n</blockquote>\n<p>After this, the rest was 'normal' agentic coding, whereby I put the agent in a loop\nbuilding the recommendations in order, regularly using my <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-tidy-code\">ocaml-tidy-code</a>\nto clean up the generated mess (somewhat), and afterwards running it through a module refactoring.</p>\n<p><img src=\"/images/aoah-heckle-ss-1.webp\" alt=\"%c\" title=\"The TODO lists for the agent coding OCaml came out of fifty other libraries' features\" ></p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>The resulting <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests\">ocaml-requests library</a> does work, but I feel this is the first time I've lost the thread on the exact architecture of the library, so this will take some careful code review.</p>\n<p>The basic library fulfills its original mandate fairly well; there is a simple high level interface that's direct style:</p>\n<pre><code class=\"language-ocaml\">Eio_main.run @@ fun env -&gt;\nSwitch.run @@ fun sw -&gt;\nlet req = Requests.create ~sw env in\nRequests.set_auth req (Requests.Auth.bearer 
&quot;your-token&quot;);\nlet user, repos = Eio.Fiber.both\n  (fun () -&gt; Requests.get req &quot;https://api.github.com/user&quot;)\n  (fun () -&gt; Requests.get req &quot;https://api.github.com/user/repos&quot;) in\nlet user_data = Response.body user |&gt; Eio.Flow.read_all in\nlet repos_data = Response.body repos |&gt; Eio.Flow.read_all in\n...\n</code></pre>\n<p>This is straightforward, and there is also a stateless <code>Requests.One</code> module that contains the one-shot equivalents. There's also an <code>ocurl</code> binary that exercises it all via the CLI. For example, here's ocurl downloading a <a href=\"/papers/2024-food-life\">recent paper</a> and following redirect chains:</p>\n<pre><code>&gt; dune exec -- bin/ocurl.exe https://doi.org/10.1038/s43016-025-01224-w -Iv\nocurl: Creating new connection pool (max_per_endpoint=10, max_idle=60.0s, max_lifetime=300.0s)\nocurl: Created Requests session with connection pools (max_per_host=10, TLS=true)\nGET https://doi.org/10.1038/s43016-025-01224-w\nocurl: &gt; GET https://doi.org/10.1038/s43016-025-01224-w HTTP/1.1\nocurl: &gt; Request Headers:\nocurl: &gt;   Accept-Encoding: gzip, deflate\nocurl: &gt;   User-Agent: ocaml-requests/0.1.0 (OCaml 5.4.0)\nocurl: Creating endpoint pool for doi.org:443 (max_connections=10)\nocurl: TLS connection established to doi.org:443\nocurl: &lt; HTTP/1.1 302\nocurl: &lt; Response Headers: &lt;trimmed&gt;\nocurl: &lt;   date: Sun, 14 Dec 2025 14:22:18 GMT\nocurl: &lt;   expires: Sun, 14 Dec 2025 14:48:37 GMT\nocurl: &lt;   location: https://www.nature.com/articles/s43016-025-01224-w\nocurl: \nocurl: Following redirect (10 remaining)\nocurl: \nocurl: Request to https://www.nature.com/articles/s43016-025-01224-w ===\nocurl: &gt; GET https://www.nature.com/articles/s43016-025-01224-w HTTP/1.1\nocurl: &gt; Request Headers:\nocurl: &gt;   Accept-Encoding: gzip, deflate\nocurl: &gt;   User-Agent: ocaml-requests/0.1.0 (OCaml 5.4.0)\nocurl: &gt; &lt;trimmed&gt;\nocurl: Following 
redirect (9 remaining)\nocurl: \nocurl: &gt; GET https://idp.nature.com/authorize?response_type=cookie&amp;client_id=grover&amp;redirect_uri=https://www.nature.com/articles/s43016-025-01224-w HTTP/1.1\nocurl: &gt; Request Headers:\nocurl: &gt;   Accept-Encoding: gzip, deflate\nocurl: &gt;   User-Agent: ocaml-requests/0.1.0 (OCaml 5.4.0)\nocurl: \nocurl: Following redirect (8 remaining)\nocurl: &gt; GET https://idp.nature.com/transit?redirect_uri=https://www.nature.com/articles/s43016-025-01224-w&amp;code=99850362-a71b-426f-a325-887d4ebc2346 HTTP/1.1\nocurl: &gt; Request Headers:\nocurl: &gt;   Accept-Encoding: gzip, deflate\nocurl: &gt;   Cookie: idp_session=sVERSION_1fdd8b701-8a14-4c8d-b31f-01af7f3653a2; idp_session_http=hVERSION_149b210b2-c08c-40b1-949f-4125872fff82; idp_marker=4792551b-f6f8-4b89-a02b-9032ea177821\nocurl: &gt;   User-Agent: ocaml-requests/0.1.0 (OCaml 5.4.0)\nocurl: \nocurl: Following redirect (7 remaining)\nocurl: \nocurl: Request completed in 0.972 seconds\nocurl: Request completed with status 200\n</code></pre>\n<p>Quite verbose, but you can see there's a lot going on with a seemingly simple HTTP request beyond just one request!</p>\n<h3 id=\"better-cram-tests-with-httpbin\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#better-cram-tests-with-httpbin\"></a>Better cram tests with httpbin</h3>\n<p>Where it gets a little messier is the internals, which were built as a result\nof iterative agentic work. I don't entirely have confidence (beyond peering at\nthe code) that all those edge cases identified from 50 other libraries are\ntested in my implementation.</p>\n<p>To mitigate this, I put together a <a href=\"https://httpbin.org\">httpbin</a>-backed set of <a href=\"https://tangled.org/anil.recoil.org/ocaml-requests/blob/claude-test/test/httpbin.t\">ocaml-requests cram tests</a> which use an ocurl binary (which models upstream curl, but using this OCaml code) in order to run\na battery of command-line tests. 
These do things like:</p>\n<pre><code class=\"language-sh\">Test cookie setting endpoint:\n\n  $ ocurl --verbosity=error &quot;$HTTPBIN_URL/cookies/set?session=abc123&quot; | \\\n  &gt;   grep -o '&quot;session&quot;: &quot;abc123&quot;'\n  &quot;session&quot;: &quot;abc123&quot;\n\nTest setting multiple cookies:\n\n  $ ocurl --verbosity=error &quot;$HTTPBIN_URL/cookies/set?session=abc123&amp;user=testuser&quot; | \\\n  &gt;   grep '&quot;cookies&quot;' -A 4\n    &quot;cookies&quot;: {\n      &quot;session&quot;: &quot;abc123&quot;,\n      &quot;user&quot;: &quot;testuser&quot;\n    }\n  }\n</code></pre>\n<p>While httpbin is extremely convenient for providing a local endpoint that understands HTTP, this still doesn't cover the full battery of tests. Ideally in the future, we should be able to <em>use the 50 test suites</em> from other languages' libraries somehow. I haven't quite thought this through, but it would be extremely powerful if we could bridge OCaml over to a wider set of test suites.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>I've accomplished what I set out to do at the start of the day, but I feel like\nI've gone further off the 'conventional coding' path than ever before. On one\nhand, it's incredible to have churned through <em>fifty</em> other libraries in a\nmatter of hours. On the other hand, I have almost no intuition about the\ndetailed structure of my resulting library without going through it\nline-by-line.</p>\n<p>There are many important details like the intermingling of connection handling,\nexceptions and resource cleanups that are likely shaky. A lot of common issues\nwill be taken care of by the fact that <a href=\"https://ocaml-multicore.github.io/eio/eio/Eio/Switch/index.html\">Eio Switches</a>\nhandle resource cleanup very well, but the coverage is unlikely to be 100%. 
This definitely requires\nhuman attention, which I'll give it over the next few months.</p>\n<p>However, I've also got very little <em>emotional</em> attachment to the structure of the\nnew Requests library, unlike other libraries I've written by hand -- it's so\neasy to refactor with a bit more agentic coding! <a href=\"https://asbradbury.org\">Alex Bradbury</a>\n<a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7404566293862887425?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7404566293862887425%2C7404578370950307840%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287404578370950307840%2Curn%3Ali%3Aactivity%3A7404566293862887425%29\">asked</a> me if I'm enjoying this whole process, and today I realised that I <em>am</em>. If I\nviewed this as conventional coding, then I would hate it. But agentic coding is\nextraordinarily different from conventional coding, much more akin to top-down\nspecification and <a href=\"https://doi.org/10.1007/11691372_34\">counterexample driven abstract refinement</a>\nusing natural language!</p>\n<p>I'm doing a lot of <a href=\"\">machine learning research</a> elsewhere, and this agentic\ncoding adventure reminds me of a wonderful quote from <a href=\"https://en.wikiquote.org/wiki/Emma_Goldman\">Emma Goldman</a>\nthat I heard while listening to the <a href=\"https://www.bbc.co.uk/sounds/play/m002n7rf\">3rd Reith Lecture</a> today:</p>\n<blockquote>\n<p>&quot;If I can't dance, it's not my revolution!&quot;\n<cite>-- <a href=\"https://en.wikiquote.org/wiki/Emma_Goldman\">Emma Goldman</a>, 1931</cite></p>\n</blockquote>\n<p>It's important to find joy in whatever we're working on, and I'm glad I am enjoying myself\nhere, even though I remain highly uncertain about how to integrate this with the open source\npractices I've experienced in the past three decades.\nI'm going to forge ahead with using Requests for a while on some real-world\nAPIs that I need access to, and see how it goes. 
But this is definitely not\nready for wider use just yet!</p>\n<p>In <a href=\"/notes/aoah-2025-14\">Day 14</a> we'll take a look at using Requests to build OCaml\ninterfaces to some of the <a href=\"\">##selfhosted</a> services I use.</p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Gulavani et al (2006). Counterexample Driven Refinement for Abstract Interpretation. Tools and Algorithms for the Construction and Analysis of Systems. <a href=\"https://doi.org/10.1007/11691372_34\" target=\"_blank\"><i>10.1007/11691372_34</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-13",
      "title": "AoAH Day 13: Heckling an OCaml HTTP client from 50 implementations in 10 languages",
      "summary": "Agentically synthesising a batteries-included OCaml HTTP client by gathering recommendations from fifty open-source implementations across JavaScript, Python, Java, Rust, Swift, Haskell, Go, C++, PHP and shell.",
      "date_published": "2025-12-13T00:00:00.000000Z",
      "date_modified": "2025-12-13T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "ecology"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3744169.3744180",
          "doi": "10.1145/3744169.3744180",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1007/11691372_34",
          "doi": "10.1007/11691372_34",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-12",
      "content_html": "<p>After yesterday's <a href=\"/notes/aoah-2025-11\">library bonanza</a> for HTTP cookie handling, I implemented a TCP/TLS <a href=\"https://tangled.org/anil.recoil.org/ocaml-conpool\">connection pooling library</a>. This is useful for an HTTP client as it provides the network-level mechanisms for keeping track of outgoing network connections <em>by their DNS name</em>.  This allows for more flexible outgoing connection management without worrying about overloading remote endpoints.</p>\n<p>For example, <code>github.io</code> has four A records:</p>\n<pre><code>&gt; host github.io\ngithub.io has address 185.199.110.153\ngithub.io has address 185.199.109.153\ngithub.io has address 185.199.108.153\ngithub.io has address 185.199.111.153\n</code></pre>\n<p>With this new connection pooling library, my application should be able to\nconnect to the <code>github.io</code> name and keep track of all the outgoing connections\non the basis of it being called <code>github.io</code> and load balance the number of\noutgoing connections accordingly.</p>\n<p>In the interests of exploring something new, I also decided to add in visualisation\nsupport to figure out what the library is spending its time on.\nI decided to generate <a href=\"https://jon.recoil.org/blog/2025/12/an-svg-is-all-you-need.html\">self-contained visualisations</a>,\ninspired by <a href=\"https://jon.recoil.org\">Jon Ludlam</a> rediscovering the joy of SVGs yesterday!</p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>The core library itself is pretty straightforward, as it's reasonably similar\nto the <a href=\"https://github.com/mirage/ocaml-conduit\">ocaml-conduit</a> mechanism of\nproviding a name-based resolver. 
The interface should be a matter of creating a\nconnection pool to keep track of state and then requesting a connection from it\nto a hostname:</p>\n<pre><code>module Endpoint : sig\n  type t\n  val make : host:string -&gt; port:int -&gt; t\nend\n\ntype connection_ty = [Eio.Resource.close_ty | Eio.Flow.two_way_ty]\ntype connection = connection_ty Eio.Resource.t\nval connection : sw:Eio.Switch.t -&gt; t -&gt; Endpoint.t -&gt; connection\n</code></pre>\n<p>This uses Eio's <a href=\"https://github.com/ocaml-multicore/eio?tab=readme-ov-file#provider-interfaces\">resource mechanism</a> to allow this connection to be used like any other.</p>\n<h3 id=\"stacking-io-errors\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#stacking-io-errors\"></a>Stacking IO errors</h3>\n<p>One of the coolest things about Eio's error handling is the ability to <a href=\"https://github.com/ocaml-multicore/eio?tab=readme-ov-file#provider-interfaces\">stack errors</a> by re-raising exceptions and adding more context to them.  I prompted the agent to also create connection-specific errors for Eio, but to integrate them into the <a href=\"https://ocaml.org/p/eio/1.3/doc/eio/Eio/index.html#exception-Io\">Eio.Io extensible type</a>. 
This allows errors from failures to look like this:</p>\n<pre><code>[WARNING] Connection attempt 3 to localhost:8088 failed:\n  Eio.Io Net Connection_failure Refused Unix_error\n  (Connection refused, &quot;connect-in-progress&quot;, &quot;&quot;),\n   connecting to tcp:127.0.0.1:8088,\n   connecting to localhost:8088,\n   after 3 retry attempts\n</code></pre>\n<p>The conpool library can keep a lightweight reporting stack of what it's been\ndoing while propagating the error, which makes a big difference to the quality\nof the end logging.</p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<h3 id=\"self-contained-visualisations-work-well\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#self-contained-visualisations-work-well\"></a>Self-contained visualisations work well</h3>\n<p>I activated <a href=\"https://github.com/anthropics/claude-code/tree/main/plugins/frontend-design\">Claude Marketplace's front end module</a>\nand instructed it to generate me a self-contained HTML file with the results of\nthe stress test.</p>\n<p>This...just worked. Here's an <a href=\"https://www.cl.cam.ac.uk/~avsm2/conpool-stress.html\">HTML\nsnapshot</a> of the results\nof a conpool run, with all the various configuration tests pushed together\ninto a visualisation. 
I've also seen <a href=\"https://toao.com\">Sadiq Jaffer</a> do the same thing when iterating\non <a href=\"/papers/2025-tessera\">TESSERA</a> CNN visualisations for his intermediate runs.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/conpool-stress.html\"> <img src=\"/images/aoah-vis-ss-1.webp\" alt=\"%c\" title=\"Standalone visualisation of the connection pool stress tests on localhost\" > </a></p>\n<h3 id=\"a-negative-result-with-event-tracing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-negative-result-with-event-tracing\"></a>A negative result with event tracing</h3>\n<p>I figured I'd have a go at integrating <a href=\"https://ocaml.org/manual/5.2/api/Runtime_events.html\">event tracing</a>, but the various\npackaging problems around the tools defeated my quick attempt. There's\n<a href=\"https://github.com/ocaml-multicore/meio\">meio</a> which I really want to use, but\nit requires various upstream PRs to be merged, and I'll also need to figure out what\nto do if the ring buffer overflows. 
I'll have to come back to this later; for\nnow the self-contained visualisations are fine.</p>\n<h3 id=\"another-claude-skill-for-tidying-ocaml-code\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#another-claude-skill-for-tidying-ocaml-code\"></a>Another Claude skill for tidying OCaml code</h3>\n<p>I found myself doing the same prompting repeatedly to tidy up generated code to\nmake it more idiomatic, so I had Claude go back over my history to compact my\ninstructions into a reusable skill, which I uploaded to\n<a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-tidy-code\">ocaml-tidy-code</a>.\nRunning this over all the code is good to do <em>after</em> test cases have been\ngenerated, as then it's fairly easy to verify that things are working well.</p>\n<p><img src=\"/images/aoah-cleanup-ss-1.webp\" alt=\"%c\" title=\"The cleanup agent does a reasonable job of pulling out reusable functions after several generation passes\" ></p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>I find the Eio types to be quite complex after we stripped out objects, and so\nhaving an agent code up the idiomatic boilerplate required for registering Eio\nexceptions, pretty printers and resources was quite nice. It's also good to\nhave another Claude skill to further refine my workflow for cleaning up code.</p>\n<p>The self-contained visualisations are also extremely useful; in addition to the\nstandalone HTML one, I also vibed up a full event logging system using the\n<a href=\"https://ocaml.org/manual/5.2/api/Runtime_events.html\">Runtime_events</a> that\n<a href=\"https://toao.com\">Sadiq Jaffer</a> implemented. 
I decided not to go with that due to some difficulties\ndealing with full ring buffers, but I'll come back to it in the future.\nThere's clearly a need for a library that can register library-level events and\ndispatch them to the OCaml custom events buffer, to remote logging endpoints,\nand to clients like <a href=\"/notes/aoah-2025-9\">terminal UIs</a> or web monitors.</p>\n<p>Tomorrow, we'll pull this all together into a <a href=\"/notes/aoah-2025-13\">Day 13 HTTP client</a>!</p><h1>References</h1><ul><li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-12",
      "title": "AoAH Day 12: Eio Connection pooling and event tracing",
      "summary": "Building a TCP/TLS connection pooling library for Eio with DNS-based load balancing, stacked error handling, and self-contained HTML visualisations for stress test results.",
      "date_published": "2025-12-12T00:00:00.000000Z",
      "date_modified": "2025-12-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-11",
      "content_html": "<p>I'm switching focus for a few days to build a complete HTTP(S) client to use in my <a href=\"/projects/ce\">literature downloader</a>. This requires building a few support libraries before we build the full client, so I figured I'd dive into them over the next few days.  First up is <a href=\"https://en.wikipedia.org/wiki/HTTP_cookie\">RFC6265 HTTP Cookie</a> support. There are some excellent existing cookie libraries already on opam, notably <a href=\"https://github.com/lemaetech/http-cookie\">http-cookie</a> and <a href=\"https://github.com/ulrikstrid/ocaml-cookie\">ocaml-cookie</a>, but I wasn't sure what their coverage of the protocol is, and there's no Eio serialisation support.</p>\n<p>So I thought I'd have a go at a different approach today using agentic coding: can we synthesise a <em>complete</em> HTTP Cookie implementation purely from the <a href=\"https://www.rfc-editor.org/rfc/rfc6265\">RFC 6265 prose</a> itself, and then differentially compare this OCaml implementation against the others? In theory, running a single test suite across all three libraries might be a good way of discovering how to improve the existing implementations. In the long-term, http-cookie is probably the upstream library I want to use, but I don't want to generate a giant diff against it today due to my <a href=\"/notes/aoah-2025\">ground rules of not disturbing other maintainers</a>.</p>\n<h2 id=\"starting-with-rfc-6265-http-state-management\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#starting-with-rfc-6265-http-state-management\"></a>Starting with RFC 6265: HTTP State Management</h2>\n<p>I downloaded <a href=\"https://www.rfc-editor.org/rfc/rfc6265\">RFC 6265</a> and gave the agent all my previously generated code in the same directory to act as examples. 
My first run went verrrrry strangely as it generated a plan for a <a href=\"https://www.rfc-editor.org/rfc/rfc6264\">carrier-grade NAT</a> implementation, until I realised that I'd typoed the RFC number and picked 6264 by mistake.  Ah, such human frailty...</p>\n<p>Once I got the right RFC number, I stuck Claude into <a href=\"https://code.claude.com/docs/en/common-workflows#use-plan-mode-for-safe-code-analysis\">planning mode</a> with the RFC text and also instructed it to search through opam to find relevant packages. This opam search went well, as it identified some missing dependent logic that a full implementation of cookies would require. So I went down a sub-rabbithole of implementing those dependent packages first.</p>\n<p>First up is <a href=\"https://en.wikipedia.org/wiki/Punycode\">Punycode</a>, as <a href=\"https://datatracker.ietf.org/doc/html/rfc6265#section-5.1.2\">RFC6265 Section 5.1.2</a> requests that:</p>\n<blockquote>\n<p>Convert each label that is not a Non-Reserved LDH (NR-LDH) label,\nto an A-label (see Section 2.3.2.1 of [RFC5890] for the former\nand latter), or to a &quot;punycode label&quot; (a label resulting from the\n&quot;ToASCII&quot; conversion in Section 4 of [RFC3490]), as appropriate\n(see Section 6.3 of this specification).</p>\n</blockquote>\n<h2 id=\"diving-into-rfc-3490-punycode\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#diving-into-rfc-3490-punycode\"></a>Diving into RFC 3490: Punycode</h2>\n<p>I'd never heard of this Punycode before, but\n<a href=\"https://datatracker.ietf.org/doc/html/rfc3490\">RFC3490</a> introduces it as follows:</p>\n<blockquote>\n<p>Until now, there has been no standard method for domain names to use\ncharacters outside the ASCII repertoire.  This document defines\ninternationalized domain names (IDNs) and a mechanism called\nInternationalizing Domain Names in Applications (IDNA) for handling them in a\nstandard fashion.  
IDNs use characters drawn from a large repertoire\n(Unicode), but IDNA allows the non-ASCII characters to be represented using\nonly the ASCII characters already allowed in so-called host names today.\nThis backward-compatible representation is required in existing protocols\nlike DNS, so that IDNs can be introduced with no changes to the existing\ninfrastructure.  IDNA is only meant for processing domain names, not free\ntext.\n<cite>-- <a href=\"https://datatracker.ietf.org/doc/html/rfc3490\">RFC3490</a>, 2003</cite></p>\n</blockquote>\n<p>Ok, ok so this is some DNS thing needed to reliably compare cookie domains that\nmight be in different languages. The RFC is a little unusual in that it <em>embeds</em>\nC code and also test vectors, so this seems ideal for an agentic session.</p>\n<h3 id=\"result-an-ocaml-punycode-library\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#result-an-ocaml-punycode-library\"></a>Result: an OCaml-punycode library</h3>\n<p>I pushed <a href=\"https://tangled.org/anil.recoil.org/ocaml-punycode/\">ocaml-punycode</a> to <a href=\"/notes/disentangling-git-with-bluesky\">Tangled</a>. It's fairly straightforward, with one module for the core Unicode (using <a href=\"https://erratique.ch/software/uunf\">uunf</a> for Unicode normalisation and the <a href=\"https://github.com/hannesm/domain-name\">domain-name</a> library by <a href=\"https://github.com/hannesm\">Hannes Mehnert</a> for RFC1035 Internet Domain Name handling).\nAlthough we didn't use cram tests for this, the <a href=\"https://tangled.org/anil.recoil.org/ocaml-punycode/blob/main/test/test_punycode.ml\">Punycode alcotests</a> seem to encode the test vectors from the RFC faithfully, and also do roundtrip testing.</p>\n<p>One trick that helped get more idiomatic OCaml code here was to do the generation in two passes. First I asked it to transcribe the RFC into OCaml along with the test cases, and then ran a second pass to refactor the code using higher-order functions and Stdlib combinators. 
You can see the results of the <a href=\"https://tangled.org/anil.recoil.org/ocaml-punycode/commit/cb3b948db5e4331c200ef196b41ea35be325cf60\">second pass here</a>, which does look like more normal OCaml that I might write.</p>\n<h3 id=\"linking-code-to-rfc-sections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#linking-code-to-rfc-sections\"></a>Linking code to RFC sections</h3>\n<p>Coding with the RFC as the source spec actually resurrected an issue I've been thinking about since 2015. Back then, <a href=\"https://github.com/lcdunstan\">Luke Dunstan</a> wanted to add <a href=\"https://github.com/mirage/ocaml-dns/issues/28#issuecomment-107215853\">multicast DNS support to ocaml-dns</a>, and as part of that he proposed using extension attributes in the code to link the relevant section of the RFC spec to the associated OCaml function.</p>\n<p>Coding agents go a long way to automating this. In the punycode ocamldoc, I instructed it to use the online RFC repository as an anchor base, and it successfully links almost every type to where it got the specification constraints from.</p>\n<pre><code class=\"language-ocaml\">type error =\n| Overflow of position\n   (** Arithmetic overflow during encode/decode. This can occur with\n       very long strings or extreme Unicode code point values.\n       See {{:https://datatracker.ietf.org/doc/html/rfc3492#section-6.4}\n       RFC 3492 Section 6.4} for overflow handling requirements. *)\n| Invalid_character of position * Uchar.t\n   (** A non-basic code point appeared where only basic code points\n       (ASCII &lt; 128) are allowed. Per\n       {{:https://datatracker.ietf.org/doc/html/rfc3492#section-3.1}\n       RFC 3492 Section 3.1}, basic code points must be segregated\n       at the beginning of the encoded string. 
*)\n</code></pre>\n<p>I've therefore mechanised this insight into a\n<a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-internet-rfc\">claude-ocaml-internet-rfc</a>\nskill which formalises my future approach to RFCs and allows this part of the\nworkflow to be reused in future OCaml code.</p>\n<p><img src=\"/images/http-cookie-1.webp\" alt=\"%c\" title=\"This post was fueled by delicious Swedish mulled wine and 'cookies'\" ></p>\n<h2 id=\"segwaying-into-public-suffix-lists\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#segwaying-into-public-suffix-lists\"></a>Segueing into Public Suffix lists</h2>\n<p>But then the exploration of the OCaml RFCs told me that for real compliance, we\nneeded to support <a href=\"https://publicsuffix.org/list/public_suffix_list.dat\">Public Suffix lists</a>, something else\nI'd never heard about!</p>\n<blockquote>\n<p>A &quot;public suffix&quot; is one under which Internet users can (or historically\ncould) directly register names. Some examples of public suffixes are com,\nco.uk and pvt.k12.ma.us. The Public Suffix List is a list of all known public\nsuffixes.</p>\n<p>The Public Suffix List is an initiative of Mozilla, but is\nmaintained as a community resource. It is available for use in any software,\nbut was originally created to meet the needs of browser manufacturers. 
It\nallows browsers to, for example:</p>\n<ul>\n<li>Avoid privacy-damaging &quot;supercookies&quot; being set for high-level domain name suffixes</li>\n<li>Highlight the most important part of a domain name in the user interface</li>\n<li>Accurately sort history entries by site\n<cite>-- <a href=\"https://publicsuffix.org/learn/\">Public Suffix Lists</a>, 2025</cite></li>\n</ul>\n</blockquote>\n<p>This one's different from the previous library in that there's a large <a href=\"https://publicsuffix.org/list/public_suffix_list.dat\">dataset</a> that needs to be embedded inside the library, and we need reasonably efficient data structures to traverse it and do the domain comparison (using the Punycode logic from earlier for domain name normalisation).</p>\n<p>So my approach for this was to download the dataset, clone the <a href=\"https://github.com/publicsuffix/list/wiki/\">public suffix list wiki</a>, and prompt the agent to come up with a generation architecture that would first parse the list into an OCaml data structure and then provide a <code>Public_suffix</code> OCaml interface that we could use without having to download the data as a library user.</p>\n<h3 id=\"results-a-public-suffix-opam-package\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results-a-public-suffix-opam-package\"></a>Results: a public-suffix opam package</h3>\n<p>The agent managed all the dune build rules fine for the generated package, so I pushed an <a href=\"https://tangled.org/anil.recoil.org/ocaml-publicsuffix/\">ocaml-publicsuffix</a> Git repository using the <a href=\"/notes/aoah-2025-5\">opam metadata skill</a>. 
I <em>also</em> then used the earlier <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-internet-rfc\">ocaml-internet-rfc skill</a> to add references to the various RFCs we used.</p>\n<p>The interface is very simple, since the dataset is embedded as OCaml code:</p>\n<pre><code class=\"language-ocaml\">let psl = Publicsuffix.create () in\n\nPublicsuffix.public_suffix psl &quot;www.example.com&quot;\n(* Returns: Ok &quot;com&quot; *)\n\nPublicsuffix.public_suffix psl &quot;www.example.co.uk&quot;\n(* Returns: Ok &quot;co.uk&quot; *)\n\nPublicsuffix.registrable_domain psl &quot;www.example.com&quot;\n(* Returns: Ok &quot;example.com&quot; *)\n</code></pre>\n<h2 id=\"and-so-finally-onto-cookeio\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#and-so-finally-onto-cookeio\"></a>And so finally onto Cookeio!</h2>\n<p>After these segues, we finally have enough for an Eio-based Cookie library that's derived from the main cookie RFC!</p>\n<p>This was pretty plain sailing given the previous libraries. The published <a href=\"https://tangled.org/anil.recoil.org/ocaml-cookeio\">ocaml-cookeio library</a> has subpackages for the <a href=\"https://tangled.org/anil.recoil.org/ocaml-cookeio/blob/main/lib/jar/cookeio_jar.mli\">Cookie jar</a> interface that provide a pretty simple persistent cookie implementation that we can use in the next few days for our HTTP client.</p>\n<pre><code>val create : unit -&gt; t\n(** Create an empty cookie jar. *)\n\nval load : clock:_ Eio.Time.clock -&gt; Eio.Fs.dir_ty Eio.Path.t -&gt; t\n(** Load cookies from Mozilla format file.\n\n    Loads cookies from a file in Mozilla format, using the provided clock to set\n    creation and last access times. Returns an empty jar if the file doesn't\n    exist or cannot be loaded. *)\n\nval save : Eio.Fs.dir_ty Eio.Path.t -&gt; t -&gt; unit\n(** Save cookies to Mozilla format file. 
*)\n\n(** {1 Cookie Jar Management} *)\n\nval add_cookie : t -&gt; Cookeio.t -&gt; unit\n</code></pre>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>I didn't expect to generate three libraries in one day, but I was massively sped up by creating this <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-internet-rfc\">Claude skill for Internet RFC handling</a>.  The three libraries today, <a href=\"https://tangled.org/anil.recoil.org/ocaml-punycode\">punycode</a>, <a href=\"https://tangled.org/anil.recoil.org/ocaml-publicsuffix/\">publicsuffix</a> and <a href=\"https://tangled.org/anil.recoil.org/ocaml-cookeio/\">cookeio</a> do a lot of fairly ad-hoc checking that's specified throughout a variety of RFCs, but do benefit from public test sets to check their behaviour.</p>\n<p>Now that these are out of the way, I'm going to continue on with another essential library on <a href=\"/notes/aoah-2025-12\">Day 12</a> tomorrow: Eio TCP/TLS connection pooling.</p>\n<p><strong>Update 13th Dec 2025:</strong> <a href=\"https://github.com/dinosaure\">Romain Calascibetta</a> and <a href=\"https://github.com/hannesm\">Hannes Mehnert</a> both kindly alerted me to the fact that there are existing implementations of punycode and publicsuffix. The first one by <a href=\"https://github.com/cfcs/ocaml-punycode\">cfcs/punycode</a> wasn't found by my agent since it's not in the opam repo, and <a href=\"https://opam.ocaml.org/packages/public-suffix/public-suffix.0.0.1/\">public-suffix</a> was missed since my checkout was just a week out of date! But to re-emphasise my <a href=\"/notes/aoah-2025\">groundrules</a>, this is <em>not</em> a competition of agents <em>vs</em> humans. I will very happily use these &quot;proper&quot; libraries in the long term, but in the short-term I cannot contribute a giant chunk of AI-generated code onto those upstream projects. 
Figuring out what to do about this is one of the points of this month-long adventure...</p><h1>References</h1><ul><li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-11",
      "title": "AoAH Day 11: HTTP Cookies and vibing RFCs for breakfast",
      "summary": "Synthesizing three RFC-compliant libraries (punycode, public-suffix, and cookeio) directly from Internet RFC specifications, establishing a workflow for automating standards implementation with proper cross-referencing to spec sections.",
      "date_published": "2025-12-10T00:00:00.000000Z",
      "date_modified": "2025-12-10T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "rfcs"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-10",
      "content_html": "<p>After building a reasonably complete <a href=\"/notes/aoah-2025-8\">Sortal contacts manager</a> and\ntrying out <a href=\"/notes/aoah-2025-9\">OxCaml's Bonsai_term</a>, I thought I'd have a second go\nat a terminal UI using a newly announced <a href=\"https://discuss.ocaml.org/t/ann-mosaic-a-modern-terminal-user-interface-framework-for-ocaml-early-preview/17572\">Mosaic library</a> by <a href=\"https://github.com/tmattio\">Thibaut Mattio</a>.</p>\n<p>I first noticed this library when Thibaut presented his <a href=\"https://watch.ocaml.org/w/oTv8j7T7eGrtHxpzaRe1LZ\">OCaml coding with AI</a> talk at FunOCaml. It's quite different from <a href=\"https://github.com/janestreet/bonsai_term\">Bonsai</a> in that Mosaic uses OCaml's effects to provide a more direct-style API, and so seems worth experimenting with. So today's task is to port Sortal to use Mosaic and see what this terminal UI looks like!</p>\n<p><a href=\"https://github.com/tmattio/mosaic\"> <img src=\"/images/mosaic-cli-1.webp\" alt=\"%c\" title=\"Thibaut has big plans for an ML training TUI using Mosaic\" > </a></p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>The environmental setup for Mosaic was similarly complicated to <a href=\"/notes/aoah-2025-9\">yesterday's with Bonsai</a>, but for different reasons. Mosaic requires OCaml 5.4.0 or higher, so I had to relax some constraints in dependent packages. But it also requires <a href=\"https://github.com/tmattio/mosaic/issues/7\">pinning vendored packages</a>, which I did manually (specifically, the vendored <code>tree-sitter</code> package). I later automated this vendoring with <a href=\"/notes/aoah-2025-23\">unpac</a>.</p>\n<p>Once this manual package messing was done, the rest was straightforward. 
I left my remote Linux system vibing with access to all the relevant source code (important since this is a bleeding edge package so the parametric memory of the agent will know nothing about it).  Luckily, the interaction between Eio and Mosaic is far easier, so we ended up with a single-process OCaml binary for the TUI; a relief after yesterday's JSON-RPC gymnastics.</p>\n<p>Once again, the ability to <a href=\"/notes/aoah-2025-9\">paste in images</a> is key to debugging terminal UIs. I wish I had the time to automate this via a skill, but I'll come back to this later in the month perhaps.</p>\n<p><img src=\"/images/sortal-mosaic-ss-1.webp\" alt=\"%c\" title=\"I don't think Mosaic has themes like Bonsai term, but the agent picked really ugly colours by default.\" ></p>\n<p><img src=\"/images/sortal-mosaic-ss-2.webp\" alt=\"%c\" title=\"However pasting in an image allowed the agent to pick a better greyscale baseline quickly.\" ></p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>The working <a href=\"https://tangled.org/anil.recoil.org/sortal-term\">terminal application</a> was far simpler than the earlier Bonsai version by virtue of being a single binary. In fact, the code is pretty <a href=\"https://tangled.org/anil.recoil.org/sortal-term/blob/main/lib/sortal_mosaic.ml\">readable</a> and lives in a single file.  
Like with Bonsai, what really helped was specifying that the agent should use the <a href=\"https://github.com/tmattio/mosaic/tree/main/mosaic/examples\">Tea example components</a> as inspiration, and it used those both to fix the theming and to introduce Markdown rendering to make the default UI look really smart.</p>\n<p><img src=\"/images/sortal-mosaic-ss-3.webp\" alt=\"%c\" title=\"The UI before running through the example components with just the agent guessing\" ></p>\n<p><img src=\"/images/sortal-mosaic-ss-4.webp\" alt=\"%c\" title=\"The UI after deriving knowledge from the Mosaic examples, with nicer rendering!\" ></p>\n<p>There was also an amusing keybinding bug that both Bonsai and Mosaic suffered from in the first cut of the agent code. When I searched for &quot;/sadiq&quot; it would crash consistently on the last letter of his name. Why?!! About 15 minutes of debugging code ensued, but at one point I realised that &quot;q&quot; was also the &quot;quit&quot; keybinding, and the app was just exiting cleanly. The agent didn't spot this at all, so some human intuition was useful in building the UI logic!</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>This Mosaic terminal is also very usable, and some things like terminal mouse clicking work too. It's less plug-and-play than Bonsai in the longer term as I'll have to build more components like text editors. However, the easier integration with the wider OCaml ecosystem really helps here since there is only one problem to solve (&quot;build me a TUI against an existing data model&quot;) rather than two (&quot;port all your dependencies to OxCaml&quot; as well).</p>\n<p>Both of these TUIs will require more focussed attention from me to learn their data models and rendering logic; neither Bonsai nor Mosaic particularly benefited from asking the agent to explain their core architectures to me. 
Only excellent human-written documentation will achieve that.</p>\n<p>I am very tempted to build a <a href=\"/notes/aoah-2025-4\">Claudeio</a> TUI using Mosaic as one of the future advent day projects, though!  But tomorrow, I'm going to switch gears in <a href=\"/notes/aoah-2025-11\">Day 11</a> and work on HTTP cookies towards a HTTP downloader tool.</p>",
      "url": "https://anil.recoil.org/notes/aoah-2025-10",
      "title": "AoAH Day 10: Building a TUI for Sortal using Mosaic",
      "summary": "Building a simpler single-process terminal UI for Sortal using Mosaic's effects-based direct-style API, with Eio integration and discovering multimodal image debugging for terminal layouts.",
      "date_published": "2025-12-10T00:00:00.000000Z",
      "date_modified": "2025-12-10T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-9",
      "content_html": "<p>After building a reasonably complete <a href=\"/notes/aoah-2025-8\">Sortal contacts manager</a>,\nI decided to try to do a proper job of a terminal user interface. The first\noption for a modern UI is something that <a href=\"https://github.com/yminsky\">Yaron Minsky</a> announced last week:\n<a href=\"https://github.com/janestreet/bonsai_term\">bonsai_term</a>, which also gives me\na chance to dip into the <a href=\"/notes/icfp25-oxcaml\">OxCaml</a> ecosystem with my agentic hacking!</p>\n<h2 id=\"approach-to-using-oxcaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach-to-using-oxcaml\"></a>Approach to using OxCaml</h2>\n<p><a href=\"https://x.com/avsm/status/1994022009978925518\"> <img src=\"/images/x-bonsai.webp\" alt=\"%rc\" title=\"I did dip my toes into this a couple of weeks ago very quickly\" > </a></p>\n<p>Switching to the OxCaml compiler is a relatively straightforward process by following the <a href=\"https://oxcaml.org/get-oxcaml/\">installation instructions</a>. There is a custom <a href=\"https://github.com/oxcaml/opam-repository\">oxcaml/opam-repository</a> remote that contains all the bleeding edge packages required.</p>\n<p>However, stepping outside of the Jane Street OxCaml package universe and interfacing with bleeding edge upstream OCaml packages does require a minor act of god to succeed. Luckily, <a href=\"https://www.dra27.uk\">David Allsopp</a> is one such OCaml god that's been <a href=\"https://github.com/oxcaml/opam-repository/pull/27\">helping me</a> work through a <a href=\"https://github.com/oxcaml/opam-repository/pull/23\">bunch of constraints</a> to get a consistent package universe. 
This is a topic for a future note, but I just wanted to warn the casual reader that this aspect of OxCaml isn't quite ready for casual use yet: if you want to play with it, then stick with the Jane Street package set for now and you'll make fine progress.</p>\n<h3 id=\"making-eio-and-async-play-nicely\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#making-eio-and-async-play-nicely\"></a>Making Eio and Async play nicely</h3>\n<p>I, however, thought I'd go in at the deep end and make Bonsai work with Sortal, including the Romeo-and-Juliet-esque blending of Bonsai's use of <a href=\"https://github.com/janestreet/async\">Async</a> with Sortal's <a href=\"https://github.com/ocaml-multicore/eio\">Eio</a>. This is complex because they each have their own run queues that don't compose cleanly -- one of the motivating problems we wrote about in our <a href=\"/papers/2017-tfp-effecthandlers\">early OCaml effects</a> work back in 2017.</p>\n<p>There have been attempts since then; <a href=\"https://roscidus.com\">Thomas Leonard</a> built an <a href=\"https://github.com/talex5/async_eio\">async_eio bridge</a> and even a <a href=\"https://github.com/talex5/async-eio-lwt-chimera\">lwt/async/eio chimera</a> that could run all three in one process!</p>\n<p>I gave this a quick shot, but there has been a lot of bitrot and code movement in the intervening years since these prototypes, especially as OxCaml has zoomed off with all of <a href=\"/notes/icfp25-oxcaml\">its extensions</a>. So after my morning espresso I decided to instruct the coding agent to instead use JSON-RPC to communicate between a Sortal server (that it spawns) and a Bonsai terminal client. 
After all, we've got really good <a href=\"/notes/aoah-2025-2\">jsont codecs</a> so JSON-RPC shouldn't be too difficult.</p>\n<h3 id=\"setting-up-the-agentic-environment\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#setting-up-the-agentic-environment\"></a>Setting up the agentic environment</h3>\n<p>The only environment that really worked here was giving the agent a monorepo with all the relevant Jane Street packages checked out with the OxCaml switch configured, and my <a href=\"https://github.com/oxcaml/opam-repository/pull/23\">eio OxCaml patched packages</a>. I later automated this monorepo assembly with <a href=\"/notes/aoah-2025-23\">unpac</a>.</p>\n<p>I then had to build the application in two phases:</p>\n<ul>\n<li>first get a client/server JSON-RPC working with test cases to check that there is a useful information flow</li>\n<li>then, with reference to the <a href=\"https://github.com/janestreet/bonsai_term_examples\">bonsai_term_examples</a>, build the user interface.</li>\n</ul>\n<p><img src=\"/images/sortal-claude-ss-planning.webp\" alt=\"%c\" title=\"The agent went through a fairly complex TODO list to do the client/server split\" ></p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>There was some drama, but I did end up with a working terminal UI here, although perhaps one that requires a lot more iteration before I consider it sound. The most obvious result is that I have no idea how to code review this, as I need to spend more time learning the basics of Bonsai and <em>then</em> get to terminal coding and <em>then</em> apply it to my specific Sortal use case. 
But, as the only user of this thing, the artefact that came out after an hour's agentic coding isn't completely bad.</p>\n<p>I was only confident enough <a href=\"https://tangled.org/anil.recoil.org/sortal-term/tree/bonsai\">to push this to a branch and not main</a> until I do more experimentation, but I did learn a few fun things.</p>\n<h3 id=\"how-does-a-coding-agent-debug-a-ui\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-does-a-coding-agent-debug-a-ui\"></a>How does a coding agent debug a UI?</h3>\n<p>One big drawback with coding agents is that they don't have a spatial sense of how UI elements can be laid out. There are workarounds for browsers, such as using headless Chrome, but often the results of a bad web layout just need repeated iteration.</p>\n<p>However, <a href=\"https://toao.com\">Sadiq Jaffer</a> pointed out to me that Claude Code also supports multimodal <em>images</em> as part of its prompts, which could be fed into it to debug the UI! It's as simple as dragging and dropping images into the Claude Code prompt.</p>\n<p>For example, look at the two screenshots below that represent before and after triggering a bug:</p>\n<p><img src=\"/images/sortal-claude-ss-before.webp\" alt=\"%c\" title=\"Before I triggered a UI corruption bug\" ></p>\n<p><img src=\"/images/sortal-claude-ss-after.webp\" alt=\"%c\" title=\"After the bug, the contents are all over the place\" ></p>\n<p>Simply dragging in these two images allowed the agent to figure out (visually) that there was a layout problem, and then apply a reasonable-looking code fix that refactored the component layout logic.</p>\n<p><img src=\"/images/sortal-claude-ss-result.webp\" alt=\"%c\" title=\"The result of the before/after image prompting\" ></p>\n<p>It seems really worthwhile to build a proper Claude Skill that wraps this process up. 
I could imagine taking <a href=\"https://github.com/orangekame3/awesome-terminal-recorder\">your favourite terminal recording library</a> and hooking it up to an interactive Pty to automate this prompting process of UI layout.</p>\n<h3 id=\"bonsai-itself-seems-to-have-a-lot-of-power\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#bonsai-itself-seems-to-have-a-lot-of-power\"></a>Bonsai itself seems to have a lot of power</h3>\n<p>The other fun thing was experimenting with <a href=\"https://github.com/janestreet/bonsai_term_components\">Bonsai terminal components</a>, since there are some incredibly nice ones present. These range from <a href=\"https://github.com/janestreet/bonsai_term_components/tree/with-extensions/bar_chart/src\">TUI bar charts</a> to full-blown <a href=\"https://github.com/janestreet/bonsai_term_components/tree/with-extensions/text_editor/src\">text editor components</a> (with vi bindings!) to <a href=\"https://github.com/janestreet/bonsai_term_components/tree/with-extensions/tree_view/src\">tree views</a>.</p>\n<p>Check out some of the example gifs <a href=\"https://github.com/janestreet/bonsai_term\">in the main repo</a> like the weighted tree one below.</p>\n<figure class=\"image-center\"><img src=\"https://www.cl.cam.ac.uk/~avsm2/weighted-tree.gif\"></figure>\n<p>So I think working through the constraint problems is well worth it in order to use so many cool components. The text editor especially is extremely usable out of the box as a replacement for ed or nvi, which is incredible! There are also pager replacement components which remove the need for less/more, but with the same bindings.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>I call today a &quot;negative but worthwhile result&quot;. The Bonsai-term architecture is too complex with the client-server split, although agentic coding made it possible to try it out really quickly. 
I think in the future it's worth investing in Eio and Async interoperability properly so that everything can run in one process, which dramatically simplifies a TUI. <a href=\"https://roscidus.com\">Thomas Leonard</a> has already done the groundwork here, so it's a matter of putting time into the integration and keeping it working.</p>\n<p>Bonsai itself shows how powerful TUIs can be, and on <a href=\"/notes/aoah-2025-10\">Day 10</a> I'll look at yet another new TUI library just announced a few days ago!</p><h1>References</h1><ul><li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Dolan et al (2018). Concurrent System Programming with Effect Handlers. Springer International Publishing. <a href=\"https://doi.org/10.1007/978-3-319-89719-6_6\" target=\"_blank\"><i>10.1007/978-3-319-89719-6_6</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-9",
      "title": "AoAH Day 9: Adding a Bonsai terminal UI to Sortal",
      "summary": "Experimenting with OxCaml's bonsai_term framework for Sortal's terminal UI, navigating Eio-Async interoperability challenges through JSON-RPC while discovering image-based debugging techniques for terminal applications.",
      "date_published": "2025-12-09T00:00:00.000000Z",
      "date_modified": "2025-12-09T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "oxcaml"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1007/978-3-319-89719-6_6",
          "doi": "10.1007/978-3-319-89719-6_6",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/fpc9w-ccj82",
      "content_html": "<p>I was not expecting to find a bunch of activist librarians at the lovely spires of King's College Chapel last week, but I was very glad that I did! I gave a <a href=\"https://www.cl.cam.ac.uk/~avsm2/slides/coar-nov25/\">talk</a> to the <a href=\"https://coar-repositories.org\">Confederation of Open Access Repositories</a> group that was having a meeting about &quot;<a href=\"https://coar-repositories.org/news-updates/publish-review-curate-turning-scholarly-publishing-on-its-head/\">Turning scholarly publishing on its head</a>&quot;. Luckily, I had my budding <a href=\"/notes/principles-for-collective-knowledge\">Four Ps for Collective Intelligence</a> fresh on my brain, so I discussed it with the assembled librarians. The crowd was a really interesting mix of the <a href=\"https://www.openresearch.cam.ac.uk/about-us\">open research</a> team at Cambridge, their French equivalents in <a href=\"https://www.ccsd.cnrs.fr/\">CNRS</a>, academic researchers like myself and <a href=\"https://albert.rierol.net/\">Albert Cardona</a> interested in non-traditional outputs, and of course digital librarians from all over the world.</p>\n<h2 id=\"what-is-coar-about\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-is-coar-about\"></a>What is COAR about?</h2>\n<p>The conference was held in the lovely King's College Cambridge, and was described as such:</p>\n<blockquote>\n<p>Tired of long delays, expensive fees, and a lack of transparency about editorial decisions when publishing your articles? 
Join us at King’s College, Cambridge on December 3rd for an in-depth discussion about a better future for scholarly publishing using the Publish, Review, Curate model.\n<cite>-- <a href=\"https://coar-repositories.org/news-updates/publish-review-curate-turning-scholarly-publishing-on-its-head/\">Confederation of Open Access Repositories</a>, 2025</cite></p>\n</blockquote>\n<p><a href=\"https://coar-repositories.org/news-updates/publish-review-curate-turning-scholarly-publishing-on-its-head/\"> <img src=\"/images/coar-25-3.webp\" alt=\"%c\" title=\"Open access to the papers, but got to get through the King's porters first!\" > </a></p>\n<p>The first thing I learnt was that the Cambridge <a href=\"https://www.openresearch.cam.ac.uk\">Open Research</a> team is distinct from the <a href=\"https://lib.cam.ac.uk\">University library</a>. The OR team is responsible for <a href=\"https://www.research.cam.ac.uk\">Apollo</a>, which is a permanent repository of scholarly publications here (aka the place where I constantly forget to deposit my papers to be REF-eligible).</p>\n<p>This is <em>also</em> distinct from <a href=\"https://www.cambridge.org/engage/coe/public-dashboard\">Cambridge Open Engage</a> (where I often drop preprints that aren't suitable for arXiv), which is run by Cambridge University Press (another independent entity that's a big contributor to the <a href=\"https://www.cambridge.org/about-us/annual-report\">solvency</a> of the University).</p>\n<p>I'm still a bit muddled up about the exact incentives of all these diverse groups, but they all seem to be working together well, so I don't think I need to know too much more than this!</p>\n<p><img src=\"/images/coar-25-4.webp\" alt=\"%c\" title=\"Discussions in the Beves rooms at King's College\" ></p>\n<h2 id=\"diamond-community-lead-publishing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#diamond-community-lead-publishing\"></a>Diamond community-led publishing</h2>\n<p>Peter-Sutton Long from the OR team gave 
an excellent overview of the activities of the OR team, and one thing that caught my eye was that there has been an initiative called &quot;<a href=\"https://www.cambridge.org/engage/coe/public-dashboard\">Diamond Open Access Journals at Cambridge</a>&quot; which hosts various journals like the <a href=\"https://diamond-oa.lib.cam.ac.uk/communities/b560ba0c-b26d-4b8d-bf24-b415dc75ba4b\">Cambridge Journal for Climate Research</a> or the <a href=\"https://diamond-oa.lib.cam.ac.uk/communities/b560ba0c-b26d-4b8d-bf24-b415dc75ba4b\">Cambridge Journal for Visual Culture</a>. These are &quot;hyper-local&quot; journals, but obviously very important to nourish, to keep alive local knowledge that might not otherwise make it into the hallowed halls of major publishing houses like Springer.</p>\n<p><img src=\"/images/coar-25-5.webp\" alt=\"%c\" title=\"Apollo vs Diamond open access and community-led publishing\" ></p>\n<p>It looks like Diamond/Cam is a relatively new initiative within the University, so I asked the wisest<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> person I know -- <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> -- about the relevance of this to his own work on <a href=\"/projects/ce\">conservation evidence</a>.  Bill has run a diamond open access publication called the &quot;<a href=\"https://conservationevidencejournal.com\">Conservation Evidence Journal</a>&quot; since 2004, and has been managing this ever since!</p>\n<p>Back when they set it up, the CEJ was one of <em>very</em> few open journals, but nowadays initiatives like the Cambridge OR one should make it significantly easier to manage. 
Bill also mentioned that it's getting quite expensive to manage the thousands of articles in CEJ, so I doubly hope to connect the Cambridge OR team with the Conservation Evidence team to see if there could be some helpful sharing of digital platforms there.</p>\n<p>The CEJ has been a critical source of <a href=\"/notes/foundational-ecosystem-workshop\">causal ground truth</a> for our collaboration with CE on <a href=\"/papers/2025-evidence-tap\">evidence pipelines</a>, so it's really really important that the knowledge about what actions on the ground make a difference is widely available and as open as possible.</p>\n<h2 id=\"betting-your-career-future-on-a-journal-editor\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#betting-your-career-future-on-a-journal-editor\"></a>Betting your career future on a journal editor</h2>\n<p>My talk was preceded by the brilliant <a href=\"https://www.joh.cam.ac.uk/research/academics/fellows/dr-helena-gellersen\">Helena Gellersen</a> from the <a href=\"https://www.memlab.psychol.cam.ac.uk\">Memory Laboratory</a> talking about her own experiences with preprints and open publishing. She had gone through quite an adventure with conventional publishing to get some of her research out, and noted that the length of time it took for an early career researcher was about the same as a PhD takes here in Cambridge!</p>\n<p>Something clearly has to give here, as we also discussed at the <a href=\"/notes/rs-future-of-publishing\">Royal Society</a> earlier this year. <a href=\"https://uniweb.uottawa.ca/view/profile/members/2846\">Stefanie Haustein</a>, who was at that meeting, subsequently published this brilliant <a href=\"https://arxiv.org/abs/2511.04820\">critique</a> of the dominance of commercial publishers in academia:</p>\n<blockquote>\n<p>The domination of scientific publishing in the Global North by major commercial publishers is harmful to science. 
We need the most powerful members of the research community, funders, governments and Universities, to lead the drive to re-communalise publishing to serve science not the market.\n<cite>-- <a href=\"https://arxiv.org/abs/2511.04820\">The Drain of Scientific Publishing</a>, Nov 2025</cite></p>\n</blockquote>\n<p>The arguments in this paper really vibed with what the crowd at the COAR conference was saying: <em>&quot;The drain is four-fold, depriving the research system of Money, Time, Trust and Control&quot;</em>. And for early career researchers who have a limited runway in which to get their insights out into the world, there is a critical time gap introduced by editorial decision-making that gatekeeps peer review.</p>\n<p>There are some interesting alternative pathways forming. Just this week, the <a href=\"https://2026.ijcai.org/\">IJCAI</a> conference is offering a <a href=\"https://2026.ijcai.org/primary-paper-initiative/\">Primary Paper Initiative</a> whereby they charge $100 to submit a paper, but waive the fee for papers in which no author appears on any other paper. All proceeds raised from this will go towards compensating peer reviewers! This is a model that might actually work, especially given the waiver for submitting just one paper.</p>\n<p><img src=\"/images/coar-25-6.webp\" alt=\"%c\" title=\"Helena showing the tumultuous path of editorial decisions\" ></p>\n<h2 id=\"presenting-the-four-principles\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#presenting-the-four-principles\"></a>Presenting the four principles</h2>\n<p>I followed up with <a href=\"https://www.cl.cam.ac.uk/~avsm2/slides/coar-nov25/\">my own talk about the 4 principles for collective knowledge networks</a>, which is the first outing of these since I only came up with them a couple of weeks ago. These were some of my takeaways from the discussions that followed:</p>\n<p>Firstly, it was <em>amazing</em> to be in a room full of people that care about metadata. 
Several people commented on how interesting the <a href=\"https://rogue-scholar.org/\">Rogue Scholar</a> initiative is, so I hope to find a way to get Martin Fenner discussing some of his own experiences with topics like <a href=\"https://doi.org/10.53731/4pr0j-7pq24\">subject classification</a> in these COAR forums.</p>\n<p>There were some questions about my interest in <a href=\"https://atproto.com/\">ATProto</a> instead of <a href=\"https://activitypub.rocks/\">ActivityPub</a>. My answer to this is that they will almost certainly <a href=\"https://github.com/bluesky-social/atproto/discussions/1716\">converge</a> or be <a href=\"https://github.com/snarfed/bridgy-fed\">bridged</a>, but the difference is where they start from. ATProto bootstrapped itself centrally via Bluesky and quickly reached around <a href=\"https://bsky-users.theo.io/\">40m users</a> as a result. It is now increasingly supporting alternative implementations and <a href=\"/notes/disentangling-git-with-bluesky\">services</a>. This makes it (in my mind) much easier to experiment with protocol evolution without having to update every single endpoint, which is often the case with ActivityPub. That's not to say one is better than the other!</p>\n<h3 id=\"towards-non-traditional-scholarly-data\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#towards-non-traditional-scholarly-data\"></a>Towards non-traditional scholarly data</h3>\n<p>However, I increasingly think that the green fields for open access infrastructure should not be solely focussed on the crowded area of paper publishing, but move to non-traditional scholarly outputs such as datasets, <a href=\"/ideas/grey-lit-crawl\">hyperlocal grey literature</a> and code management. 
These are all areas where my group is spending vast amounts of effort, and which are underserved:</p>\n<ul>\n<li>For datasets, publishing the <a href=\"/notes/geotessera-python-0-7\">Tessera embeddings</a> involves setting up lots and lots of infrastructure that really ought to be <a href=\"/papers/2025-fairground\">federated</a>. Managing and versioning these openly is much easier if we spread the load; for example the <a href=\"https://www.swissdatacube.org/index.php/2025/12/03/geoembeddings-for-switzerland/\">Swiss datacube</a> just republished Swiss embeddings on their own infrastructure, and the <a href=\"https://radiant.earth/\">Radiant Earth</a> community is also involved. This is geospatial data, but no less valuable as we deploy it around the world, for example with our friends in India working on the <a href=\"https://core-stack.org/core-stack-innovation-challenge-1st-edition/\">CoRE stack</a> for rural resilience.</li>\n<li>For grey literature, the <a href=\"/projects/ce\">Conservation Evidence Copilots</a> has shown that a huge amount of knowledge isn't &quot;just&quot; available in published form from major publishers, but is also hugely informed by <a href=\"https://about.conservationevidence.com/2022/03/14/new-non-english-language-studies-database-increasing-the-availability-of-conservation-evidence/\">non-English sources</a>. The next step of our CE literature crawl will almost certainly be to the more obscure journals that aren't in the &quot;DOI mainstream&quot;.</li>\n<li>For computational pipelines, I found going back to this 2013 paper on <a href=\"https://doi.org/10.1126/science.1231535\">Troubling Trends in Scientific Software Use</a> (<a href=\"https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Troubling20Trends20in20Scientific20Software20Use20_Science_2013.pdf\">OA version</a>) to be slightly depressingly unchanged today. 
We still don't really have a good reputation system for code, and given the giant amount of <a href=\"/papers/2025-ai-poison\">AI slop</a> being rolled out into codebases worldwide, it's only going to get worse.</li>\n</ul>\n<p>So my message to the COAR community after attending their first (and most energising meeting) is to open up to new forms of data mediums, and to craft infrastructure that unlocks learnings from them. I feel that solely going up against the conventional scholarly publishing incumbency is an unnecessary waste of precious energy, as it looks increasingly brittle from a commercial perspective.</p>\n<p><img src=\"/images/coar-25-7.webp\" alt=\"%c\" title=\"In gratitude for Bill's advice, I feel obliged to plug his Conservation Concepts channel on YouTube as @Bill_Sutherland!\" ></p>\n<h3 id=\"update-1-metaror-2025-12-08\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#update-1-metaror-2025-12-08\"></a>Update 1: MetaROR (2025-12-08)</h3>\n<p>Minutes after publishing this post, Martin Fenner wrote me with some interesting experiments they've been doing with Rogue Scholar and peer review:</p>\n<blockquote>\n<p>[...] we ran an experiment trying PRC with a blog post together with the <a href=\"https://metaror.org/\">MetaROR</a> platform, the result is summarized by Chris Marcum in <a href=\"https://doi.org/10.54900/bymaz-4fw37\">https://doi.org/10.54900/bymaz-4fw37</a>. And just week we had a demo from CottageLabs regarding their upcoming COAR Notify integration coming to Zenodo and InvenioRDM, and thus Rogue Scholar.</p>\n</blockquote>\n<p>So <a href=\"https://metaror.org/\">MetaROR</a> is another way of doing peer review across non-traditional open publishing. So much to look at in 2026!</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>Usefully, also the first person I saw at Espresso Lane this morning.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). 
Royal Society's Future of Scientific Publishing meeting. <a href=\"https://doi.org/10.59350/nmcab-py710\" target=\"_blank\"><i>10.59350/nmcab-py710</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li>\n<li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera 0.7 out with efficient sampling and Zarr support. <a href=\"https://doi.org/10.59350/nagwp-tnw89\" target=\"_blank\"><i>10.59350/nagwp-tnw89</i></a></li>\n<li>Omar et al (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763802\" target=\"_blank\"><i>10.1145/3759536.3763802</i></a></li>\n<li>Madhavapeddy (2025). Foundational AI for Ecosystem Resilience workshop. <a href=\"https://doi.org/10.59350/26hy6-rry61\" target=\"_blank\"><i>10.59350/26hy6-rry61</i></a></li>\n<li>Fenner (2025). Rogue Scholar is improving subject classification (Version 2). Front Matter. <a href=\"https://doi.org/10.53731/4pr0j-7pq24\" target=\"_blank\"><i>10.53731/4pr0j-7pq24</i></a></li>\n<li>Joppa et al (2013). Troubling Trends in Scientific Software Use. Science. <a href=\"https://doi.org/10.1126/science.1231535\" target=\"_blank\"><i>10.1126/science.1231535</i></a></li>\n<li>Marcum (2025). 
Peer-Review for a Blog Post? My Experience with MetaROR. Upstream. <a href=\"https://doi.org/10.54900/bymaz-4fw37\" target=\"_blank\"><i>10.54900/bymaz-4fw37</i></a></li>\n<li>Beigel et al (2025). The Drain of Scientific Publishing. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2511.04820\" target=\"_blank\"><i>10.48550/arXiv.2511.04820</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/coar-prc",
      "title": "Publish, Review, Curate to upend scholarly publishing",
      "summary": "Report from a COAR conference on transforming scholarly publishing through the Publish, Review, Curate model, discussing diamond open access, early career challenges, and expanding open infrastructure to datasets and code.",
      "date_published": "2025-12-08T00:00:00.000000Z",
      "date_modified": "2025-12-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "opensource",
        "publishing",
        "ai",
        "networks",
        "atproto"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/nmcab-py710",
          "doi": "10.59350/nmcab-py710",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nagwp-tnw89",
          "doi": "10.59350/nagwp-tnw89",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763802",
          "doi": "10.1145/3759536.3763802",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/26hy6-rry61",
          "doi": "10.59350/26hy6-rry61",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.53731/4pr0j-7pq24",
          "doi": "10.53731/4pr0j-7pq24",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1126/science.1231535",
          "doi": "10.1126/science.1231535",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.54900/bymaz-4fw37",
          "doi": "10.54900/bymaz-4fw37",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2511.04820",
          "doi": "10.48550/arXiv.2511.04820",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-8",
      "content_html": "<p>I've been accumulating a <em>lot</em> of contacts that I use to write cross references\non my website. This works by using\n<a href=\"https://erratique.ch/software/cmarkit\">Cmarkit</a> to parse my custom Markdown,\nand spot entries like <code>[@sadiqj]</code> and convert those into a full reference like\n<a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Today, I want to build a full CLI application that stores all my\ncontacts as Yaml files in my home directory using XDG conventions, and give me\na simple search interface so I can quickly autocomplete these posts from my\neditor.  I call this little application &quot;<a href=\"https://tangled.org/anil.recoil.org/sortal\">Sortal</a>&quot;.</p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>The data model I want behind sortal is that I can have a flat set of Yaml files somewhere\nin my XDG path, and that the CLI and library will build those up into data structures for\nquerying and printing. I have around 2500 contacts or so, so this is very manageable without\na fancy database.</p>\n<p>Sortal uses many of the previous libraries I've been building up so far. I prompted\nthe agent to generate a standalone OCaml project, and to analyse my existing (extremely hacked together)\nwebsite code to determine a reasonable semantic for Sortal's schema, and then to design\na CLI that uses <a href=\"/notes/aoah-2025-7\">yamlt</a> and <a href=\"/notes/aoah-2025-3\">xdge</a> along with jsont/cmdliner\nto plan a user interface with subcommands. I also blended in <a href=\"https://github.com/dbuenzli/fmt\">Fmt</a>\nand <a href=\"https://github.com/dbuenzli/logs\">Logs</a> to get nice colours in my terminal. 
I'm using a lot of libraries from <a href=\"https://erratique.ch\">Daniel Bünzli</a>, which is no coincidence as the model needs to have far less context if it only has to use a few, well-designed and modular dependencies.</p>\n<p>Architecturally, I used Claude's <a href=\"https://code.claude.com/docs/en/common-workflows\">planning mode</a> for this\nwith the best Opus 4.5 model, along with instructions to maintain a separation between the core\njsont schemas, the library logic, and the cmdliner terms. A useful tip is to ensure that you prompt\nClaude to &quot;ask for any clarifications&quot; after working, and it'll drop you into a custom terminal user\ninterface that structures followup questions. This is a very convenient way of batching answers to the\ncode model.</p>\n<p><img src=\"/images/aoah-ss-1.webp\" alt=\"%c\" ></p>\n<p>I did have to do some prompting to refine how the agent designed the xdge and cmdliner integration, specifically by selecting <em>only</em> the XDG dirs that Sortal actually uses.  It does work beautifully with the CLI once fixed, as the man page shows:</p>\n<p><img src=\"/images/sortal-ss-1.webp\" alt=\"%c\" ></p>\n<p>I also prompted the model to use the man pages to guide a better CLI design, and it came up with a reasonable set of subcommands:</p>\n<p><img src=\"/images/sortal-ss-2.webp\" alt=\"%c\" ></p>\n<h2 id=\"tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tests\"></a>Tests</h2>\n<p>I'm actually opting not to do any fancy testing for this library, since it's basically a bunch of data munging at this stage and I'm going to be using it myself for my own data.</p>\n<p>The one thing I did do was to prompt the agent to have an automated import script from my existing Yaml formats, so I could run both in parallel. 
But aside from that, just keeping an eye on the model's inferred success criteria was helpful:</p>\n<pre><code>- dune build @check succeeds\n- All tests pass (dune runtest)\n- Documentation builds without warnings (dune build @doc)\n- No __ references in generated HTML docs\n- CLI executable works correctly\n- sortal.schema builds without eio/xdge dependencies\n- Existing API unchanged (backward compatible)\n</code></pre>\n<p>This is all the sort of thing I would have done myself by hand, but the model now has enough previous libraries and <a href=\"https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool\">memory</a> and the <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-metadata\">ocaml-metadata skill</a> to figure out my preferred style of libraries.</p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>The proof is in the pudding, so after knocking up a quick import script for my existing contacts, here it is showing <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p><img src=\"/images/sortal-ss-3.webp\" alt=\"%c\" ></p>\n<p>The associated Yaml is very straightforward, out of my <code>~/.local/share/sortal</code> directory:</p>\n<pre><code class=\"language-yaml\">version: 1\nkind: person\nhandle: sadiqj\nnames:\n  - Sadiq Jaffer\nemails:\n  - address: sj514@cam.ac.uk\n    type: personal\n  - address: sadiq@toao.com\norganizations:\nurls:\n  - url: https://toao.com\nservices:\n  - url: https://github.com/sadiqj\n    kind: github\n    handle: sadiqj\n    primary: false\norcid: 0009-0006-4120-3244\nfeeds:\n  - type: atom\n    url: https://toao.com/feeds/posts.atom.xml\n</code></pre>\n<p>As a giant quality of life improvement, I also coded up a <code>Git_store</code> module that automated the commit and push of the XDG directory to a private repo on every CLI change. 
This gives me a super quick way of synching my contacts.</p>\n<p>I also left the version field in there to allow for future schema evolution. I guess this would need a jsont codec that <em>only</em> reads the version field and preserves <a href=\"https://erratique.ch/software/jsont/doc/cookbook.html#unknown_members\">unknown object members</a> and passes them to a versioning module. Something for the future!</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>Well, it works! I'm using this now for this website, and I'm going to add more quality-of-life things as I need them, such as support for thumbnails.</p>\n<p>It's very encouraging to see an emerging workflow in OCaml: this language is brilliant at sketching out a big complicated service, and then refactoring to break it up into composable libraries. Having said that, I don't think coding agents exhibit particularly good taste in library design, and they are nowhere near capable of building libraries of the quality of <a href=\"https://erratique.ch\">Daniel Bünzli</a>'s, but they are very good at <em>using</em> those libraries.</p>\n<p>Also of note is that I have zero PPX in this CLI, which I would have used if I were building by hand. Instead, building up boilerplate combinator functions (like jsont codecs) is done pretty well by the coding agent, and results in semantically richer code.  I very much appreciate <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> maintaining ppxlib, but I do not miss having a dependency on it!</p>\n<p>Tomorrow in <a href=\"/notes/aoah-2025-9\">Day 9</a> we'll try Bonsai_term for a terminal UI, and then <a href=\"/notes/aoah-2025-10\">Mosaic</a> to see how alternative approaches might work.</p>",
      "url": "https://anil.recoil.org/notes/aoah-2025-8",
      "title": "AoAH Day 8: Building a contacts CLI manager with Sortal",
      "summary": "Creating Sortal, a CLI contacts management application using Yaml storage, XDG directories, Git-based synchronization, and integrating all previously built libraries into a cohesive CLI tool.",
      "date_published": "2025-12-08T00:00:00.000000Z",
      "date_modified": "2025-12-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-7",
      "content_html": "<p>After the excitement of building an entire <a href=\"/notes/aoah-2025-6\">Yaml 1.2 parser yesterday</a>, I began to put it to use. Since I've been steadily converting all my JSON parsers <a href=\"/notes/aoah-2025-2\">to use jsont</a> codecs, it would be convenient if a single JSONt codec definition could <em>also</em> convert that schema to Yaml. In theory, Yaml is a superset of JSON, except <a href=\"https://metacpan.org/pod/JSON::XS#JSON-and-YAML\">it isn't actually</a>. But it's close enough that we <em>should</em> be able to build a <a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlt\">yamlt library</a> that can accept a <a href=\"https://github.com/dbuenzli/jsont\">jsont codec</a> and spit out Yaml (or the reverse).</p>\n<p><a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlt\"> <img src=\"/images/yamlt-tangled.webp\" alt=\"%c\" > </a></p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>I used all the previous tricks learnt so far, with all the dependent libraries available in the source tree for the agent to browse.  The source code to jsont was absolutely essential, since the agent needed to learn how to traverse the jsont codecs to build a different (yaml) translator.</p>\n<p>One thing that was essential was guidance on how to do certain conversions where the results are ambiguous. For example, null handling in Yaml is much more permissive than in JSON, so I opted for any dictionary or array types to resolve nulls in the Yaml to the empty dictionary. It would also be ok to just raise an error in that case, but it seems safer to try to recover from what a human might do.</p>\n<h2 id=\"tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tests\"></a>Tests</h2>\n<p>As <a href=\"/notes/aoah-2025-2\">before</a>, cram tests are a very effective way of exercising codepaths. 
I got the agent to generate a <a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlt/tree/main/tests/cram\">comprehensive suite</a> of cram tests that do things like:</p>\n<pre><code>Encode arrays to JSON and YAML formats\n\n  $ test_arrays encode\n  JSON: {&quot;numbers&quot;:[1,2,3,4,5],&quot;strings&quot;:[&quot;hello&quot;,&quot;world&quot;]}\n  YAML Block:\n  numbers:\n    - 1\n    - 2\n    - 3\n    - 4\n    - 5\n  strings:\n    - hello\n    - world\n  YAML Flow: {numbers: [1, 2, 3, 4, 5], strings: [hello, world]}\n</code></pre>\n<p>And also negative results:</p>\n<pre><code>Attempting to decode an object file with an array codec should fail\n\n  $ test_arrays int ../data/objects/simple.yml\n  JSON: int_array: ERROR: Missing member values in Numbers object\n  File &quot;-&quot;, line 1, characters 0-28:\n  YAML: int_array: ERROR: Missing member values in Numbers object\n  File &quot;-&quot;:\n</code></pre>\n<p>Which is a convenient way of checking that location tracking is working.</p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>This all worked fairly straightforwardly. The Yamlt interface even includes support for multidocument Yaml files:</p>\n<pre><code class=\"language-ocaml\">val decode_all :\n  ?layout:bool -&gt;\n  ?locs:bool -&gt;\n  ?file:Jsont.Textloc.fpath -&gt;\n  ?max_depth:int -&gt;\n  ?max_nodes:int -&gt;\n  'a Jsont.t -&gt;\n  Bytes.Reader.t -&gt;\n  ('a, string) result Seq.t\n(** [decode_all t r] decodes all documents from a multi-document YAML stream.\n    Returns a sequence where each element is a result of decoding one document.\n    Parameters are as in {!val-decode}. Use this for YAML streams containing\n    multiple documents separated by [---]. *)\n</code></pre>\n<p>There's a lot going on in this function, but it's easy to use. 
The optional parameters control details like Yaml layout and location tracking and defence against billion laughs attacks, but ultimately accept the jsont codec and a Yaml bytesrw source and give the result back as a lazy Seq.t.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>Using this library is very easy. The example code illustrates this:</p>\n<pre><code class=\"language-ocaml\">module Config = struct\n  type t = { name : string; port : int }\n\n  let make name port = { name; port }\n\n  let jsont =\n    Jsont.Object.map ~kind:&quot;Config&quot; make\n    |&gt; Jsont.Object.mem &quot;name&quot; Jsont.string ~enc:(fun c -&gt; c.name)\n    |&gt; Jsont.Object.mem &quot;port&quot; Jsont.int ~enc:(fun c -&gt; c.port)\n    |&gt; Jsont.Object.finish\nend\n\n(* Use the same codec for both JSON and YAML *)\nlet from_json = Jsont_bytesrw.decode_string Config.jsont json_str\nlet from_yaml = Yamlt.decode_string Config.jsont yaml_str\n</code></pre>\n<p>I've started using yamlt in my website code without issue. It's very convenient\nbeing able to describe a format as a jsont codec (which take care of conversion\nto OCaml types and a concrete wire format), but then also exposing this as Yaml\nfor easier human editing. The line number tracking means that it's much easier\nfor me to trace back errors, thanks to jsont being so careful about this in its\nimplementation and interface.</p>\n<p>I also notice that <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> has been <a href=\"https://patrick.sirref.org/graft-and-bib-update/index.xml\">hacking</a> on a <a href=\"https://github.com/patricoferris/ocaml-bibtex\">BibTeX codec</a> which is based on the same principles as jsont and also uses Bytesrw. I hope to also integrate this into my website code soon too!  
But first, in <a href=\"/notes/aoah-2025-8\">Day 8</a> we will build a complete CLI application that uses some of these libraries we've built so far.</p>",
      "url": "https://anil.recoil.org/notes/aoah-2025-7",
      "title": "AoAH Day 7: Converting between JSON and Yaml with yamlt",
      "summary": "Building yamlt to enable jsont codec definitions to work with both JSON and Yaml, providing data manipulation with location tracking and good error messages for both formats.",
      "date_published": "2025-12-07T00:00:00.000000Z",
      "date_modified": "2025-12-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-6",
      "content_html": "<p>I did the palate cleanser of <a href=\"/notes/aoah-2025-5\">Bytesrw-eio</a> yesterday for a good reason. Back in 2017, I wrote the <a href=\"https://github.com/avsm/ocaml-yaml/blob/master/CHANGES.md\">OCaml Yaml</a> bindings that a lot of projects use in the OCaml ecosystem, and I'm having trouble maintaining it.</p>\n<p>Since Yaml is an monstrously convoluted spec, I opted back then to bind to the <a href=\"https://github.com/yaml/libyaml\">C libyaml</a> using <a href=\"/papers/2018-socp-modular-ffi\">ocaml-ctypes</a>.  This was a good decision a decade ago, but maintaining this has been a nightmare due to the complexity of vendoring the C library, dealing with security issues there, and exposing a reasonable OCaml interface. The ocaml-yaml implementation also doesn't pass the full Yaml test suite.</p>\n<p>And the worst thing is, I <em>cannot</em> find the motivation to figure out how Yaml really works. It's the world's <a href=\"https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-from-hell\">worst serialisation format</a>, with lots of corner cases and <a href=\"https://en.wikipedia.org/wiki/Billion_laughs_attack\">memory blowups</a> inherent in how it works. So I decided to dive in and see if I could build a <em>pure OCaml Yaml 1.2</em> implementation using <a href=\"https://github.com/dbuenzli/bytesrw\">bytesrw</a> and the <a href=\"https://yaml.org/spec/1.2.2/\">source spec</a>.</p>\n<p>TL;DR: it worked. It actually seems to have come up with a reasonable, <a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlrw\">pure OCaml implementation</a> that I'm now using! 
It needs more validation and external code review, but this has been on my TODO list for <em>years</em> now.</p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>As with previous projects, I carefully set up the source directory using all the previous libraries to act as style guides, along with the source code to the key dependency of bytesrw and the associated <a href=\"/notes/aoah-2025-5\">bytesrw-eio</a> libraries.\nSince the yaml spec is so complex, I <em>also</em> added in the source code to my original <a href=\"https://github.com/avsm/ocaml-yaml\">ocaml-yaml</a> library, which includes the vendored version of the C libyaml.</p>\n<p>In a slight twist, I also instructed the agent to look at the Git history for ocaml-yaml, since there have been a decade of bug reports about bad ways of interpreting yaml <a href=\"https://github.com/avsm/ocaml-yaml/issues?q=is%3Aissue\">reported</a>, including <a href=\"https://github.com/avsm/ocaml-yaml/issues/82\">one from Martin Jambon</a> that I haven't gotten around to looking at yet for the main library. There have also been frequent requests for a pure OCaml version to make <a href=\"https://github.com/avsm/ocaml-yaml/issues/81\">cross compilation to iOS</a> easier, as well as compilation on <a href=\"https://github.com/avsm/ocaml-yaml/issues/78\">OpenBSD</a> (failing due to the vendoring of the C library), and of course <a href=\"https://github.com/avsm/ocaml-yaml/issues/73\">C memory leaks</a> and <a href=\"https://github.com/avsm/ocaml-yaml/issues/72\">spec violations</a>.  
<strong>All of these</strong> would disappear with a spec-compliant implementation, so I fed the agent these to use as regression examples from user bug reports.</p>\n<p><a href=\"https://www.cl.cam.ac.uk/~avsm2/yaml-test-results.html\"> <img src=\"/images/aoah-yamlrw-tests.webp\" alt=\"%c\" title=\"It's easy to generate an HTML visualisation of the test suite\" > </a></p>\n<h2 id=\"tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tests\"></a>Tests</h2>\n<p>The key to the development loop succeeding was adding integration to an external test suite. Yaml has the <a href=\"https://github.com/yaml/yaml-test-suite\">yaml-test-suite</a> which has <a href=\"https://github.com/yaml/yaml-test-suite/tree/data\">thousands</a> of little examples of good and bad specs, along with the expected outputs in both JSON and Yaml.  For example, <a href=\"https://github.com/yaml/yaml-test-suite/tree/data/36F6\">test 36F6</a> takes this input yaml:</p>\n<pre><code>plain: a\n b\n\n c\n</code></pre>\n<p>and expects this error &quot;Multiline plain scalar with empty line&quot; and the following parser events in a custom DSL exposed by the test suite:</p>\n<pre><code>+MAP\n=VAL :plain\n=VAL :a b\\nc\n-MAP\n-DOC\n-STR\n</code></pre>\n<p>So I instructed the agent to also build up a <a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlrw/blob/main/tests/test_suite_lib/tree_format.ml\">test suite DSL</a> that could output in a format compatible with the test suite. 
A simple <a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlrw/blob/main/tests/test_suite_lib/test_suite_loader_generic.ml\">custom loader</a> and <a href=\"https://tangled.org/anil.recoil.org/ocaml-yamlrw/blob/main/tests/test_suite_lib/json_format.ml\">JSON converter</a> then output in the exact format required by the checked in test suite files.</p>\n<p>I also added in some of the pathological tests from my original OCaml yaml, including an implementation of the <a href=\"https://github.com/avsm/ocaml-yaml/blob/master/tests/yaml/bomb.yml\">Yaml bomb</a>:</p>\n<pre><code>a: &amp;a [&quot;lol&quot;,&quot;lol&quot;,&quot;lol&quot;,&quot;lol&quot;,&quot;lol&quot;,&quot;lol&quot;,&quot;lol&quot;,&quot;lol&quot;,&quot;lol&quot;]\nb: &amp;b [*a,*a,*a,*a,*a,*a,*a,*a,*a]\nc: &amp;c [*b,*b,*b,*b,*b,*b,*b,*b,*b]\nd: &amp;d [*c,*c,*c,*c,*c,*c,*c,*c,*c]\ne: &amp;e [*d,*d,*d,*d,*d,*d,*d,*d,*d]\nf: &amp;f [*e,*e,*e,*e,*e,*e,*e,*e,*e]\ng: &amp;g [*f,*f,*f,*f,*f,*f,*f,*f,*f]\nh: &amp;h [*g,*g,*g,*g,*g,*g,*g,*g,*g]\ni: &amp;i [*h,*h,*h,*h,*h,*h,*h,*h,*h]\n</code></pre>\n<p>This simple Yaml file exponentially allocates lols into a <a href=\"https://en.wikipedia.org/wiki/Billion_laughs_attack\">billion\nlaughs</a>, so I prompted the\nagent to also add in depth tracking to terminate parsing after a configurable\nnumber of nodes or depths have been crossed.</p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>With the test structure set up, it was plain sailing. 
I prompted the agent to maintain a strict separation between a pure Yaml parser (with a single dependency on bytesrw), and then have Unix and Eio converters using the Bytesrw unix one, or the <a href=\"https://tangled.org/anil.recoil.org/ocaml-bytesrw-eio\">bytesrw-eio</a> one I coded up <a href=\"/notes/aoah-2025-5\">yesterday</a>.</p>\n<p>As a useful aid to debugging, I also prompted the test suite to output nice HTML, which you can <a href=\"https://www.cl.cam.ac.uk/~avsm2/yaml-test-results.html\">browse here</a>. It's convenient to have a rendered version of the entire test suite!</p>\n<p>I also coded up a quick <a href=\"https://github.com/janestreet/core_bench\">core-bench</a>\nlibrary to differentially test both the original Yaml library and this new one\non the full yaml test suite, and the pure OCaml one seems around 20% faster.\nI'm going to do a bit more memory benchmarking in addition to performance\nbefore I'm confident in these results, but the vibe coded smoke test was\nreassuring to see that I wasn't dramatically slower. Memory usage remains a\nrisk of being high, though; something to look at for a future day.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>This was the first day of the advent adventure where I really felt like I'd hit\na breakthrough! Yamlrw is a dropin replacement for <em>all</em> of my uses of\nocaml-yaml now, and I can't find any regressions. After a bit more code review,\nI'm going to post on the OCaml forums to request existing users to test this\none and see if they can find any regressions.</p>\n<p>It's also nice having a streaming Eio Yaml parser, which will be convenient\nfor some projects in the remaining days, like my contacts manager. But first,\nin <a href=\"/notes/aoah-2025-7\">Day 7</a> we'll build a nice yamlt codec library...</p><h1>References</h1><ul><li>Yallop et al (2018). A modular foreign function interface. 
<a href=\"https://doi.org/10.1016/j.scico.2017.04.002\" target=\"_blank\"><i>10.1016/j.scico.2017.04.002</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-6",
      "title": "AoAH Day 6: Getting a Yaml 1.2 implementation in pure OCaml",
      "summary": "Implementing a pure OCaml Yaml 1.2 parser using bytesrw by synthesizing from the specification and existing C library behavior, passing thousands of test suite cases while being 20% faster than the C-based implementation.",
      "date_published": "2025-12-06T00:00:00.000000Z",
      "date_modified": "2025-12-06T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "opam"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1016/j.scico.2017.04.002",
          "doi": "10.1016/j.scico.2017.04.002",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-5",
      "content_html": "<p>After the <a href=\"/notes/aoah-2025-4\">Claude exertions</a> of yesterday, I needed something easier to cool my laptop down. I wanted to learn how to use another new library from <a href=\"https://erratique.ch\">Daniel Bünzli</a> called <a href=\"https://github.com/dbuenzli/bytesrw\">Bytesrw</a>, which provides composable byte stream readers and writers. It supplies ways to serialise Bytesrw to Unix file descriptors, so I figured I'd add in an <a href=\"https://github.com/ocaml-multicore/eio\">Eio</a> library for this. Along the way though, I was generating a growing number of opam packages, so I also learnt how to use <a href=\"https://www.claude.com/blog/skills\">Claude Skills</a> to automate my opam metadata on <a href=\"/notes/disentangling-git-with-bluesky\">Tangled</a> as well.</p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>The <a href=\"https://tangled.org/anil.recoil.org/ocaml-bytesrw-eio\">Bytesrw-eio</a> library is exceedingly simple and exposes just two functions, so it was pretty easy for the agent to code up given the source repositories already had Unix equivalents.</p>\n<pre><code>val bytes_reader_of_flow :\n  ?slice_length:int -&gt; _ Eio.Flow.source-&gt;\n  Bytesrw.Bytes.Reader.t\n\nval bytes_writer_of_flow :\n  ?slice_length:int -&gt; _ Eio.Flow.sink -&gt;\n  Bytesrw.Bytes.Writer.t\n</code></pre>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>While coding this simple library, I realised my bottleneck lay in managing the\ngrowing number of opam packages that I have lying around! 
Can I also use agents\nto manage these libraries?</p>\n<p><img src=\"/images/aoah-ss-2.webp\" alt=\"%c\" title=\"The completed list of opam metadata actions, all done with using a custom Claude skill\" ></p>\n<h3 id=\"my-opam-structure\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#my-opam-structure\"></a>My opam structure</h3>\n<p>I'm maintaining all the libraries at a custom <a href=\"https://tangled.org/anil.recoil.org/aoah-opam-repo\">aoah-opam-repository overlay</a> on <a href=\"/notes/disentangling-git-with-bluesky\">Tangled</a>. Each individual repository I publish also uses a <a href=\"https://tangled.org/anil.recoil.org/aoah-opam-repo/blob/main/.tangled/workflows/build.yml\">Spindle CI action</a> to build that library on every push. This overlay later became the input to <a href=\"/notes/aoah-2025-22\">monopam</a> and <a href=\"/notes/aoah-2025-23\">unpac</a> for assembling monorepos.</p>\n<p>For every package I publish, I need to:</p>\n<ul>\n<li>get the metadata right in the <a href=\"https://dune.readthedocs.io/en/latest/howto/opam-file-generation.html\">dune-project file</a> so that the opam files are generated right</li>\n<li>add the right <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/.ocamlformat\">.ocamlformat version</a> to the repo</li>\n<li>then translate the per-repo opam metadata into an entry that lives in the <a href=\"https://tangled.org/anil.recoil.org/aoah-opam-repo\">aoah-opam-repo</a> so packages can depend on each other even though they're unreleased.</li>\n<li>do various other health checks like copyright headers and <a href=\"https://opam.ocaml.org/packages/opam-dune-lint/\">opam-dune-lint</a>.</li>\n</ul>\n<p>The problem with scripting these up is that there's subtle parameterisation which changes slightly across each library. Some might have an extra external dependency, another might have a custom build step, and others might just have tests that require specific actions. 
This seems ideal for a coding agent that can take a template of actions and then propose metadata changes automatically.</p>\n<h3 id=\"using-claude-skills-to-manage-opam-metadata\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#using-claude-skills-to-manage-opam-metadata\"></a>Using Claude skills to manage opam metadata</h3>\n<p>The idea behind Claude skills is <a href=\"https://www.claude.com/blog/skills\">incredibly simple</a>: you just create a <code>~/.claude/skills/my-skill</code> directory that contains a <code>SKILL.md</code> and the associated scripts that need to run. Nothing else; no state, no MCP server, just a simple file! The header looks as follows:</p>\n<pre><code>---\nname: ocaml-metadata\ndescription: Standards for OCaml project metadata files. Use when initialising a new OCaml library/module, preparing for opam release, setting up testing infrastructure, or searching the OCaml ecosystem for dependencies. Not for normal code edits.\nlicense: ISC\n---\n\n# OCaml Project Metadata Standards\n\n## When to Use This Skill\n\nInvoke this skill when:\n\n1. **Initializing a new OCaml project** - Setting up dune-project, LICENSE, README, CI, etc.\n2. **Preparing for opam release** - Ensuring all metadata is correct for publication\n3. **Setting up testing infrastructure** - Especially for Eio-based libraries that need mock testing\n4. **Searching the OCaml ecosystem** - Finding and fetching dependency sources for reference\n5. **Adding third-party source references** - Using `opam source` to study library implementations\n\n**Do not use for:**\n- Regular code edits or bug fixes\n- Simple function additions\n- Refactoring existing code\n</code></pre>\n<p>The Yaml frontmatter is the only thing loaded by Claude at startup (thus saving context space). Then it uses that information to load the rest on demand. 
For example, for the <a href=\"/notes/aoah-2025-4\">claudeio</a> repository it spins it up on demand.</p>\n<p><img src=\"/images/aoah-ss-4.webp\" alt=\"%c\" title=\"The OCaml metadata skill being loaded\" ></p>\n<p>If you browse the <a href=\"https://tangled.org/anil.recoil.org/claude-ocaml-metadata\">full skill repository</a> you will also find even more detailed (and personalised) instructions about how to structure tests and retrieve sources to my laptop. My approach is that if Claude gets something wrong workflow-wise, I now prompt it to <em>also fix the skill</em> after it's done. The agent then revises the workflow information in this repository and (hopefully) doesn't make the same mistake twice.</p>\n<p><img src=\"/images/aoah-ss-3.webp\" alt=\"%c\" title=\"The skill working through bytesrw to fix up the metadata\" ></p>\n<h2 id=\"tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tests\"></a>Tests</h2>\n<p>Getting back to the problem at hand, testing the bytesrw-eio adapter was a simple matter of prompting the agent to\ncome up with some <a href=\"https://tangled.org/anil.recoil.org/ocaml-bytesrw-eio/blob/main/test/test_bytesrw_eio.ml\">read and write tests</a>.\nI then deliberately introduced an error to sanity check that they failed\nwhen expected, and refined the tests to include those failures as well.</p>\n<p>One frustration I had here is a long-standing problem in OCaml where we don't\nhave a single agreed IO representation. Eio uses\n<a href=\"https://github.com/mirage/ocaml-cstruct\">Cstruct</a> which is backed by Bigarray\n(which lives out-of-heap), and then others use <code>bytes</code>/<code>string</code> (which live\nin-heap). The reason is usually that out-of-heap values can be non-moving\n(allowing for zero-copy IO), but in return it results in <a href=\"/notes/icfp25-ocaml5-js-docker\">GC pacing issues</a> that are hard to fix.  
This is relevant to\nthis Bytesrw-eio library because we introduce a copy between Bigarray (from\nEio) to Bytesrw (which obviously uses <code>bytes</code>). At some point in the future, I\nreckon we need to get a non-moving <code>bytes</code> mechanism into OCaml 5.x so that we\ncan all just use one IO type and avoid these dratted copies.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>The bytesrw-eio package itself was simple enough, and it added to my long-term\ntodo list to <a href=\"/notes/icfp25-post-posix\">revisit OCaml IO abstractions</a> again in the\nfuture. The big breakthrough today though, was figuring out Claude skills. I'm\nnot using MCP at all any more, and will be creating more domain-specific skills\nfor OCaml actions as I proceed through the month.</p>\n<p>Next, onto <a href=\"/notes/aoah-2025-6\">Day 6</a> where we'll build a pure Yaml codec using this library!</p><h1>References</h1><ul><li>Madhavapeddy (2025). It's time to go post-POSIX at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/mch1m-8a030\" target=\"_blank\"><i>10.59350/mch1m-8a030</i></a></li>\n<li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li>\n<li>Madhavapeddy (2025). Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/3jkaq-d3398\" target=\"_blank\"><i>10.59350/3jkaq-d3398</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-5",
      "title": "AoAH Day 5: Bytesrw Eio adapters and automating opam metadata",
      "summary": "Building Bytesrw-Eio adapters for composable byte stream I/O while discovering Claude Skills as a powerful way to automate opam package metadata management through reusable workflow templates.",
      "date_published": "2025-12-05T00:00:00.000000Z",
      "date_modified": "2025-12-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai",
        "opam"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/mch1m-8a030",
          "doi": "10.59350/mch1m-8a030",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/3jkaq-d3398",
          "doi": "10.59350/3jkaq-d3398",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-4",
      "content_html": "<p>By this point, I've got three useful libraries and my use of Claude is getting better. So naturally I want to automate my invocations of the <code>claude</code> CLI, but I hit a roadblock: there are no OCaml SDK bindings! However, there appear to be SDKs in <a href=\"https://github.com/anthropics/claude-agent-sdk-python\">Python</a>, <a href=\"https://github.com/anthropics/anthropic-sdk-go\">Go</a> and <a href=\"https://github.com/anthropics\">many others</a>. So today will involve having a stab at generating <a href=\"https://tangled.org/anil.recoil.org/claudeio\">Claude OCaml bindings</a> using Eio, so I can use Claude to write more OCaml!</p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>I prodded around the <a href=\"https://github.com/anthropics/claude-agent-sdk-python\">Python</a> and noted that the communications protocol between the SDK and the CLI is JSON-RPC. I'd noticed <a href=\"/notes/aoah-2025-2\">when hacking with jsont</a> that it includes a <a href=\"https://github.com/dbuenzli/jsont/blob/main/test/json_rpc.ml\">json-rpc codec</a>, so adopting the same approach as I did with <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed\">ocaml-jsonfeed</a> seems reasonable: code up the core protocol using JSONt codecs, and then handle serialisation and process coordination using Eio.</p>\n<p>For context to the agent, I gave it the Python and Go Claude SDKs to digest what the actual Claude protocol involves, and then all my previous OCaml libraries and the sources to jsont and Eio (i.e. the lessons learnt from the previous couple of days with xdge and jsonfeed).</p>\n<p>One important prompt was to instruct it to <em>first</em> generate a <code>claude.proto</code> subpackage that <em>only</em> has jsont codecs and OCaml types, and then to use that package in the coordination layer with Eio. 
This avoids mixing up concerns in one giant module, as an unprompted Claude is prone to do.</p>\n<h2 id=\"tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tests\"></a>Tests</h2>\n<p>Using jsont at the codec layer made all the difference, since I could get the model to debug the wire-level messages independently of the transport layer.  In fact, I left the agent running in a loop where it looked at the error outputs from its own regression tests (against a live Claude instance) and then proceeded to fix them. This was only possible because of the excellent error instructions from jsont. For example, with the structured output test I got:</p>\n<pre><code>structured_output_demo.exe: [ERROR] Failed to decode incoming message: Missing member tool_name in Rule object\nFile &quot;-&quot;, line 1, characters 451-515:\nFile &quot;-&quot;, line 1, characters 451-515: at index 0 of\nFile &quot;-&quot;, line 1, characters 450-515: array&lt;Rule object&gt;\nFile &quot;-&quot;: in member rules of\nFile &quot;-&quot;, line 1, characters 423-515: Update object\nFile &quot;-&quot;, line 1, characters 423-515: at index 0 of\nFile &quot;-&quot;, line 1, characters 422-515: array&lt;Update object&gt;\nFile &quot;-&quot;: in member permission_suggestions of\nFile &quot;-&quot;, line 1, characters 88-515: Permission object\nFile &quot;-&quot;: in member request of\nFile &quot;-&quot;, line 1, characters 0-515: ControlRequest object\nLine: {&quot;type&quot;:&quot;control_request&quot;,&quot;request_id&quot;:&quot;055cb59c-2f9f-457d-8c98-0a2c5a48c577&quot;,&quot;request&quot;:{&quot;subtype&quot;:&quot;can_use_tool&quot;,&quot;tool_name&quot;:&quot;Bash&quot;,&quot;input&quot;:{&quot;command&quot;:&quot;find /Users/avsm/src/git/knot -type f -name \\&quot;*.ml\\&quot; -o -name \\&quot;*.mli\\&quot; -o -name \\&quot;*.md\\&quot; -o -name \\&quot;*.html\\&quot; -o -name \\&quot;*.go\\&quot; -o -name \\&quot;dune\\&quot; -o -name \\&quot;dune-project\\&quot; | head 
-100&quot;,&quot;description&quot;:&quot;List all relevant files in the repository&quot;},&quot;permission_suggestions&quot;:[{&quot;type&quot;:&quot;addRules&quot;,&quot;rules&quot;:[{&quot;toolName&quot;:&quot;Read&quot;,&quot;ruleContent&quot;:&quot;//Users/avsm/src/git/knot/**&quot;}],&quot;behavior&quot;:&quot;allow&quot;,&quot;destination&quot;:&quot;session&quot;}],&quot;tool_use_id&quot;:&quot;toolu_011w6XYAbALBytxMaLxtnBGd&quot;,&quot;agent_id&quot;:&quot;05bf9384-6c4b-4edd-898f-962d945ff724&quot;}}\n</code></pre>\n<p>This was enough information for Claude to pick up the problem and address it in the codec:</p>\n<pre><code>Claude: I found the issue! The Rule decoder in proto/permissions.ml is expecting\nsnake_case field names (tool_name, rule_content) but the Claude CLI is sending\ncamelCase field names (toolName, ruleContent). The rest of the permission\nsystem already uses camelCase consistently.\n</code></pre>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>I did find a <em>lot</em> of breakage when using different versions of the upstream\nClaude SDKs. For example, permission handling is just...broken... in some\nversions, but they seem to quite quickly push changes. I suspect they might be\nusing a bit too much bleeding edge Claude in developing Claude!</p>\n<p><img src=\"/images/claude-ss-perm-1.webp\" alt=\"%c\" title=\"I managed to get interactive OCaml callbacks to Claude working after some upstream fixes\" ></p>\n<p>However, their breakage did exercise my agent quite nicely into switching\nbetween the Python and Go SDKs to come up with a good answer, and also\nhighlighted why agentic coding is so different from one-shot coding LLMs. 
It's\npretty crazy seeing an agent dynamically introspect itself to come up with the\narchitecture I specified, against a live service!</p>\n<h2 id=\"reflection\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflection\"></a>Reflection</h2>\n<p>It's now quite convenient to have a <a href=\"https://tangled.org/anil.recoil.org/claudeio\">Claude OCaml wrapper</a>, but I stopped short of making a really nice Eio interface as the upstream project is moving so quickly.</p>\n<p>I'd like to eventually use this as a basis for a distributed Claude to unify my local and remote Docker development, and also integrate with our local initiatives like <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and his work on <a href=\"/papers/2025-hyperres\">package management</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>'s cool new <a href=\"https://patrick.sirref.org/weekly-2025-w49/index.xml\">Shelter</a>.  For now, I'm holding the fort on just doing simple OCaml invocations of Claude and not trying anything too exotic until the CLI itself settles down and stabilises.</p>\n<p>Onto <a href=\"/notes/aoah-2025-5\">Day 5</a> next, where we use Claude skills for the first time!</p><h1>References</h1><ul><li>Gibb et al (2025). Solving Package Management via Hypergraph Dependency Resolution. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.10803\" target=\"_blank\"><i>10.48550/arXiv.2506.10803</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-4",
      "title": "AoAH Day 4: Going recursive with Claudeio for Claude",
      "summary": "Creating OCaml bindings for the Claude API using Eio and jsont codecs by reverse-engineering the JSON-RPC protocol from Python and Go SDKs, enabling Claude to write more Claude-powered OCaml code.",
      "date_published": "2025-12-04T00:00:00.000000Z",
      "date_modified": "2025-12-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2506.10803",
          "doi": "10.48550/arXiv.2506.10803",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/26hy6-rry61",
      "content_html": "<p>As part of the ARIA <a href=\"https://www.aria.org.uk/opportunity-spaces/engineering-ecosystem-resilience\">Engineering Ecosystem Resilience</a>\nprogram, we've been convening a series of workshops here at the <a href=\"https://www.conservation.cam.ac.uk\">Cambridge\nConservation Initiative</a> to explore the\npotential of combining two very radically different approaches to modeling. <a href=\"https://joemillard.github.io/\">Joe Millard</a> wrote this to frame the discussion:</p>\n<blockquote>\n<p>Ecology and ecosystems are inherently agent-based. In other words, patterns\nin biodiversity in both space and time emerge as a function of the local\ninteraction of many types of individual organisms, both with each other and\nwith their abiotic environment.</p>\n<p>Generative agent-based models, such as <a href=\"https://arxiv.org/abs/2312.03664\">Concordia</a> enable the simulation of\nmultiple interacting large language models. Given LLMs now possess\n<a href=\"https://www.biorxiv.org/content/10.1101/2025.02.10.637097v1\">significant ecological knowledge</a>,\nit is possible that models such as Concordia will enable the meaningful simulation of ecological interactions.</p>\n<p>The biotic and abiotic environment in which ecological agents interact in a given\necosystem is likely measurable via remotely monitored earth-observation data. Raw EO\ndata, however, is unwieldy, containing large quantities of information that can be\ndifficult to interpret. Earth-system models, such as <a href=\"/papers/2025-tessera\">TESSERA</a> or <a href=\"https://arxiv.org/abs/2507.22291\">AlphaEarth</a>\nare foundational AI models which compress large quantities of EO data into &quot;embeddings&quot;, unambiguous\nand consistent digital representations of the structure of the Earth’s surface.\n<cite>-- Foundational AI to forecast ecosystem resilience, J. Millard, A. Pili, K. Berthon, R. Fletcher, L. 
Dicks</cite></p>\n</blockquote>\n<p>We held two separate workshops to explore this; one for a\ndeep-dive into the technical details, and another to invite conservation\npractitioners to steer our modeling in a realistic and positive\ndirection.  This was all led by <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> and the stellar organisation of\n<a href=\"https://joemillard.github.io/\">Joe Millard</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/katherine-berthon\">Katherine Berthon</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-arman-pili\">Arman Pili</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>, with input from me, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a>.\nI'll go into each talk next, or you can <a href=\"https://watch.eeg.cl.cam.ac.uk/w/p/iWr9BhchX6NT6jBfMZmR2P\">watch the playlist</a> yourself.</p>\n<h2 id=\"the-technical-workshop\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-technical-workshop\"></a>The technical workshop</h2>\n<p>For the technical day, we held it in the lovely Ferguson Nazareth<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> room at <a href=\"https://www.pem.cam.ac.uk\">Pembroke College</a>, contrasting ancient rows of books with group brainstorms about machine learning for improbably complex simulations!</p>\n<h3 id=\"tessera-from-the-ground-up-to-looking-down\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tessera-from-the-ground-up-to-looking-down\"></a>TESSERA from the ground up to looking down</h3>\n<p>We began with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a> going through <a href=\"/notes/geotessera-python-0-7\">TESSERA</a> in some detail, with a particular focus on its use for 
ecological downstream tasks. <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> <a href=\"https://www.linkedin.com/posts/julia-p-g-jones-85294215_i-have-already-posted-about-the-fascinating-activity-7401648875461165056-MoJx\">observed</a> that this was the clearest explanation of the utility of TESSERA to a non-machine-learning person that she had heard, and I agree! Do give this a watch if you'd like a gentle intro to geospatial foundation models.</p>\n<p><div class=\"video-center\"><iframe title=\"Introduction to TESSERA at the Workshop on Foundational AI to forecast ecosystem resilience\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/82e14165-298e-48e1-9cc4-b9035f9c5930\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<h3 id=\"concordia-agent-based-modelling\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#concordia-agent-based-modelling\"></a>Concordia: Agent based modelling</h3>\n<p>Then we switched tacks to something entirely different, with <a href=\"https://www.vezhnick.com/\">Sasha Vezhnevets</a> and <a href=\"https://locross93.github.io/\">Logan Cross</a> from <a href=\"https://deepmind.com\">DeepMind</a> telling us about their approach to generative social simulation:</p>\n<blockquote>\n<p>Concordia is a library to facilitate construction and use of generative agent-based models to simulate interactions of agents in grounded physical, social, or digital space. 
It makes it easy and flexible to define environments using an interaction pattern borrowed from tabletop role-playing games in which a special agent called the Game Master (GM) is responsible for simulating the environment where player agents interact (like a narrator in an interactive story).\n<cite>-- <a href=\"https://github.com/google-deepmind/concordia\">Concordia on GitHub</a>, 2025</cite></p>\n</blockquote>\n<p>What was fascinating about this is how LLMs just change the game for defining simulations. Instead of having to specify every last thing, the LLMs understand a certain amount of human culture and context by virtue of pre-training, and so can take more nuanced decisions than a discrete simulator. So the &quot;defaults&quot; are intriguingly much more useful for ecological simulations, which are deeply connected and social in nature and extremely difficult to model computationally via brute force methods.</p>\n<p><div class=\"video-center\"><iframe title=\"Concordia: Generative Agent Based Modeling\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/e2bd3af2-84e9-4066-9963-7175572ffb31\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>The Concordia <a href=\"https://github.com/google-deepmind/concordia\">source code</a> is all open, so I've started messing around with some local LLM simulations. It's really as easy as specifying some prompts and parallelisation; quite different from conventional agent-based modeling! 
There was a <a href=\"https://www.cooperativeai.com/post/google-deepmind-releases-concordia-library-v2-0\">v2 release this summer</a> and a <a href=\"https://www.youtube.com/watch?v=2FO5g65mu2I\">video tutorial</a> as well.</p>\n<h3 id=\"causal-modeling-for-ecosystems\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#causal-modeling-for-ecosystems\"></a>Causal modeling for ecosystems</h3>\n<p>Then <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a>, the host of one of my favourite podcasts &quot;<a href=\"https://www.youtube.com/playlist?list=PLpRUERxtS22DYG_YhoR25ck4ll3HpHIOY\">Tuesdays with Team Counterfactual</a>&quot; gave us a dive into why <em>causal modeling</em> is so important for real nature simulations. I've been steadily learning from Julia and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> over the past five years in <a href=\"/projects/4c\">4C</a> about the importance of <a href=\"https://en.wikipedia.org/wiki/The_Book_of_Why\">causal inference</a> to try and find those pesky hidden confounders over lots of observational data.</p>\n<p><div class=\"video-center\"><iframe title=\"Forecasting ecosystem resilience: what this means and why we need it\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/ef29c6b1-800d-4272-9b9f-fb4626c65905\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>Julia highlighted just how much more complex the observational and causal variables are in biodiversity <em>vs</em> something like climate, and observed that climate modeling is <em>not</em> analogous to modeling biodiversity and ecosystem services. This is because biodiversity is highly multi-scale, and local changes will have an impact nearby (unlike the global climate). 
But it's also possible to construct (quasi-)experimental and natural experiments to test various aspects, which is difficult to do with climate. The discussion after her talk went on for quite some time, so I hope the second half of the video is also useful.</p>\n<h3 id=\"the-virtual-ecologist\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-virtual-ecologist\"></a>The virtual ecologist</h3>\n<p><a href=\"https://www.southampton.ac.uk/people/5x8vlw/doctor-becks-spake\">Rebecca Spake</a> then connected the dots with ecology and the industry that has driven billions of dollars of investment into machine learning: advertising. This began via her amazing pet turkey (watch the talk!) and then dove deep into how to use causal information to gain predictive power over the outcomes of interventions. It's worth noting that almost all natural experiments in ecology seem to be centered around counterfactuals due to the difficulty of exposing any single organism outside of a habitat, so this puts a <em>lot</em> of importance on robust causal modeling.</p>\n<p><div class=\"video-center\"><iframe title=\"Dr Becks (Rebecca) Spake: Virtual Ecology\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/8360bf94-521f-4f6d-a337-5a232bdd6691\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>Becks went into detail about how conventional <a href=\"https://arxiv.org/abs/2502.00713v1\">individual treatment effects</a> are difficult to test with real-world data, and so a &quot;virtual ecologist&quot; simulation approach is needed to simulate ecosystem responses under both treatment and control. 
If we could use some of the techniques discussed above like TESSERA and Concordia to make these simulations as high fidelity as possible, then we can dramatically improve the evidence available to motivate expensive on-the-ground interventions to protect some aspect of biodiversity.</p>\n<h3 id=\"using-ai-to-understand-nature\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#using-ai-to-understand-nature\"></a>Using AI to understand nature</h3>\n<p>There were fewer talks in the second practitioners workshop, but no discussion of AI and nature would be complete without <a href=\"https://samreynolds.org\">Sam Reynolds</a> giving an entertaining<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup> talk covering our recent <a href=\"/papers/2024-ai-conhorizon\">horizon scan</a> and also our work on <a href=\"/papers/2025-evidence-tap\">large scale literature analysis</a> and using <a href=\"/projects/ce\">conservation evidence</a> to design <a href=\"/notes/ai-for-evidence-synthesis-workshop\">policymaking frameworks</a> that can rapidly replicate positive interventions usefully around the world.</p>\n<p><div class=\"video-center\"><iframe title=\"Sam Reynolds: Conservation and AI\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/9a74f82a-33cd-403a-94f0-76b797376431\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>An interesting term I hadn't heard before is <a href=\"https://en.wikipedia.org/wiki/Human_bycatch\">human bycatch</a> resulting from using automated nature monitoring. 
Issues of privacy and human rights (often for indigenous populations who might not have agency over such monitoring technology) were discussed at length.</p>\n<h3 id=\"discussing-the-opportunity-space-with-conservation-practitioners\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#discussing-the-opportunity-space-with-conservation-practitioners\"></a>Discussing the opportunity space with conservation practitioners</h3>\n<p>The workshop was also attended by the ARIA programme manager <a href=\"https://www.qmul.ac.uk/sbbs/news/items/queen-mary-professor-yannick-wurm-joins-aria-as-programme-director.html%0A\">Yannick Wurm</a>, who took the time to explain ARIA's structure and how his program is evolving rapidly as they get feedback from workshops like ours. I am finding interacting with ARIA a breath of fresh air: the workshops I've been to have been well organised and useful beyond just looking for funding by <a href=\"/notes/principles-for-collective-knowledge\">sparking new thoughts</a>. I was initially not going to be taking part in this program at all due to concerns about licensing and IP ownership, but a <a href=\"https://www.linkedin.com/posts/anilmadhavapeddy_the-advanced-research-invention-agency-activity-7370729885948071937-ynnj\">quick post on social media</a> was picked up and my concern was addressed within days. ARIA has an exceptionally free IPR policy that makes it so much easier for us to maintain our open-source code while being funded from a variety of sources, and I'm appreciative of this.</p>\n<p>On the other hand, it also means that we need to keep up with the changes happening to their opportunity spaces in order to stay abreast of the resulting shifts in thinking! I am hosting one last workshop for this opportunity space on the 12th December with <a href=\"http://oisin.info\">Oisin Mac Aodha</a> and <a href=\"https://www.vizzuality.com/team/mike-harfoot\">Mike Harfoot</a> to go into range mapping. 
Both workshops will have a proper writeup about our actual discussions and recommendations (which I haven't covered here!), so stay tuned for those ahead of the Christmas break.</p>\n<p><img src=\"/images/aria-er-4.webp\" alt=\"%c\" title=\"Yannick Wurm discussing the program with the second workshop and the conservation practitioners\" ></p>\n<p><img src=\"/images/aria-er-1.webp\" alt=\"%c\" title=\"Shane showing off his latest ecosystem agentic hacking to a rapt audience\" ></p>\n<p><img src=\"/images/aria-er-3.webp\" alt=\"%c\" title=\"The assembled team from the first workshop in Pembroke\" ></p>\n<p>Random fun facts from the workshop:</p>\n<ul>\n<li><a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> told us that <a href=\"https://www.linkedin.com/posts/julia-p-g-jones-85294215_togetherfornature-ugcPost-7400203881076457472-EuPX\">25% of all penguins are 'British'</a>!</li>\n<li><a href=\"https://www.shaneweisz.com\">Shane Weisz</a> showed us his experiments towards <a href=\"https://www.notion.so/Yes-AI-can-pass-the-Red-List-Assessor-Exam-What-s-next-3-Dec-2025-2bde7c511467802098b0eef9f806665b\">AI passing the Red List Assessor's exam</a>, which makes me feel better about LLMs now acing our <a href=\"https://toao.com/blog/ocaml-local-code-models\">first year computer science course</a>.</li>\n<li>What's the weirdest and most interesting animal we could simulate? Well, <a href=\"https://en.wikipedia.org/wiki/Naked_mole-rat\">naked mole rats</a> turn out to be one of the few <a href=\"https://en.wikipedia.org/wiki/Eusociality\">eusocial species</a> that have a complex social structure including a reproductive division of labor, separation of reproductive and non-reproductive castes, and cooperative care of young! Also they look really weird. 
Thanks <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> and <a href=\"https://www.southampton.ac.uk/people/5x8vlw/doctor-becks-spake\">Rebecca Spake</a> for this introduction to the strangest species I've ever learnt about.</li>\n</ul>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>That is indeed the same Annette Nazareth who is the chair of the Integrity Council for Voluntary Carbon Markets and defender of tropical forests!</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>Assuming you find Will Smith videos entertaining, but it was a cold day and we needed warming up.</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). A fully AI-generated paper just passed peer review; notes from our evidence synthesis workshop. <a href=\"https://doi.org/10.59350/k540h-6h993\" target=\"_blank\"><i>10.59350/k540h-6h993</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera 0.7 out with efficient sampling and Zarr support. <a href=\"https://doi.org/10.59350/nagwp-tnw89\" target=\"_blank\"><i>10.59350/nagwp-tnw89</i></a></li>\n<li>Reynolds et al (2024). The potential for AI to revolutionize conservation: a horizon scan. 
<a href=\"https://doi.org/10.1016/j.tree.2024.11.013\" target=\"_blank\"><i>10.1016/j.tree.2024.11.013</i></a></li>\n<li>Vezhnevets et al (2023). Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2312.03664\" target=\"_blank\"><i>10.48550/arXiv.2312.03664</i></a></li>\n<li>Brown et al (2025). AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data. <a href=\"https://doi.org/10.48550/arXiv.2507.22291\" target=\"_blank\"><i>10.48550/arXiv.2507.22291</i></a></li>\n<li>Sechidis et al (2025). Using Individualized Treatment Effects to Assess Treatment Effect Heterogeneity. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2502.00713\" target=\"_blank\"><i>10.48550/arXiv.2502.00713</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/foundational-ecosystem-workshop",
      "title": "Foundational AI for Ecosystem Resilience workshop",
      "summary": "Workshop report combining TESSERA geospatial foundation models with Concordia agent-based modeling to simulate ecosystem resilience, covering causal modeling for ecology and AI applications in nature conservation.",
      "image": "https://anil.recoil.org/images/aria-er-3.1280.webp",
      "date_published": "2025-12-03T00:00:00.000000Z",
      "date_modified": "2025-12-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "ai",
        "sensing",
        "nature",
        "ecology",
        "aria"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/k540h-6h993",
          "doi": "10.59350/k540h-6h993",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nagwp-tnw89",
          "doi": "10.59350/nagwp-tnw89",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1016/j.tree.2024.11.013",
          "doi": "10.1016/j.tree.2024.11.013",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2312.03664",
          "doi": "10.48550/arXiv.2312.03664",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2507.22291",
          "doi": "10.48550/arXiv.2507.22291",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2502.00713",
          "doi": "10.48550/arXiv.2502.00713",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-3",
      "content_html": "<p>By Day 3 of the <a href=\"/notes/aoah-2025\">Advent of Agentic Humps</a>, I now have the\nconfidence to build a slightly more complex library that uses\n<a href=\"https://github.com/ocaml-multicore/eio\">Eio</a> to implement the <a href=\"https://specifications.freedesktop.org/basedir/latest/\">XDG Base\nDirectory Specification</a> with a twist: let's use <a href=\"/papers/2023-ocaml-eio\">Eio capabilities</a> to sandbox XDG paths by default.</p>\n<p><img src=\"/images/cross-xdg.webp\" alt=\"%c\" title=\"Lots of other languages have cross-platform XDG implementations, so I wanted one in OCaml for Eio\" ></p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>The <a href=\"https://specifications.freedesktop.org/basedir/latest/\">XDG Spec</a> is\ncomprehensive, but full of lots of informal rules about directories. Some of\nthe rules are pretty easy to follow:</p>\n<blockquote>\n<p><code>$XDG_DATA_DIRS</code> defines the preference-ordered set of base directories to\nsearch for data files in addition to the <code>$XDG_DATA_HOME</code> base directory. The\ndirectories in <code>$XDG_DATA_DIRS</code> should be separated with the separator used\nfor <code>$PATH</code> on the platform (typically this is a colon <code>:</code>).</p>\n</blockquote>\n<p>While others are harder to enforce in code:</p>\n<blockquote>\n<p>The directory MUST be on a local file system and not shared with any other\nsystem. The directory MUST be fully-featured by the standards of the\noperating system. More specifically, on Unix-like operating systems <code>AF_UNIX</code>\nsockets, symbolic links, hard links, proper permissions, file locking, sparse\nfiles, memory mapping, file change notifications, a reliable hard link count\nmust be supported, and no restrictions on the file name character set should\nbe imposed. Files in this directory MAY be subjected to periodic clean-up. 
To\nensure that your files are not removed, they should have their access time\ntimestamp modified at least once every 6 hours of monotonic time or the\n'sticky' bit should be set on the file.</p>\n</blockquote>\n<p>We can do bits of that, but POSIX doesn't really expose everything we need to\nmechanically verify some aspects of filesystem support. Still, we have a big\nlong spec, so let's see what happens if we throw an agent at it!</p>\n<p>My general approach with Claude was to download a copy of the XDG spec and instruct\nthe agent to digest it. I also supplied the <a href=\"/notes/aoah-2025-1\">previous</a>\n<a href=\"/notes/aoah-2025-2\">two</a> libraries as examples of &quot;my OCaml style&quot; that it could draw\nfrom. Having more code available to act as an oracle to guide the model towards\nsomething I find acceptable is a useful way to avoid lots of prompting.</p>\n<h2 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h2>\n<p>The first attempts completely crashed and burnt as the agent failed\nto grok the admittedly very complicated Eio types for capabilities (which do involve\nsubtyping and phantom polymorphic variants). I then cloned the Eio source code\nand specifically instructed the agent to read the <a href=\"https://github.com/ocaml-multicore/eio/blob/main/README.md\">Eio README</a>,\nas it has extensive information about best practices for using the library. 
This is\nsimilar to my earlier trick with <a href=\"/notes/aoah-2025-2\">jsont</a> to instruct it to read\nthe cookbooks.</p>\n<p>Things got much better after this; the agent had to iterate quite a few times,\nbut did eventually converge on the right types.</p>\n<pre><code class=\"language-ocaml\">val config_dir : t -&gt; Eio.Fs.dir_ty Eio.Path.t\n(** [config_dir t] returns the path to user-specific configuration files.\n\n    {b Purpose:} Store user preferences, settings, and configuration files.\n    Configuration files should be human-readable when possible.\n\n    {b Environment Variables:}\n    - [${APP_NAME}_CONFIG_DIR]: Application-specific override (highest priority)\n    - [XDG_CONFIG_HOME]: XDG standard variable\n    - Default: [$HOME/.config/{app_name}]\n\n    @see &lt;https://specifications.freedesktop.org/basedir-spec/latest/#variables&gt;\n      XDG_CONFIG_HOME specification *)\n</code></pre>\n<p>You can see one nice feature here that would have taken a while to code by\nhand.  The <code>type t</code> for the xdge library is typically constructed by exposing <a href=\"https://github.com/dbuenzli/cmdliner\">Cmdliner</a> terms to allow\nother applications to &quot;plug in&quot; XDG support by just specifying a <a href=\"https://tangled.org/anil.recoil.org/xdge/blob/v1.0.0/lib/xdge.mli#L347\">single term</a>.</p>\n<p>This term takes care of applying environment variables, command-line arguments,\nand default values in the right order. 
The manual page for an example binary shows how this works from the CLI.</p>\n<p><img src=\"/images/xdge-man-page.webp\" alt=\"%c\" title=\"In this case, the library specifies XDG_EXAMPLE as the app name, but this is easily customised to your app\" ></p>\n<h2 id=\"tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tests\"></a>Tests</h2>\n<p>You can see the Cmdliner support in action with the test cases, where I adopted\nthe same cram-based testing strategy as earlier with <a href=\"/notes/aoah-2025-2\">jsont</a>.</p>\n<p>This allows for a nice repository structure where I can simply add in the XDG\nCmdliner term to the test case binaries, and have all the gory details of config setup\nhandled by the xdge library. For example, in the <a href=\"https://tangled.org/anil.recoil.org/xdge/blob/main/test/xdg.t\">cram tests</a> you can\nsee how the <em>source</em> of a XDG path is tracked (i.e. did it come from the CLI, from an environment variable or the defaults?):</p>\n<pre><code class=\"language-bash\"> $ export HOME=./test_home\n  $ unset XDG_CONFIG_DIRS XDG_DATA_DIRS\n  $ XDG_CONFIG_HOME=/tmp/xdge/xdg-config \\\n  &gt; XDG_EXAMPLE_CONFIG_DIR=./app-config \\\n  &gt; ./xdg_example.exe --config-dir ./cli-config\n  === Cmdliner Config ===\n  XDG config:\n  config_dir: ./cli-config [cmdline]\n  \n  === XDG Directories ===\n  XDG directories for 'xdg_example':\n  User directories:\n    config: &lt;fs:./test_home/./cli-config&gt; [cmdline]\n    data: &lt;fs:./test_home/./test_home/.local/share/xdg_example&gt; [default]\n    cache: &lt;fs:./test_home/./test_home/.cache/xdg_example&gt; [default]\n    state: &lt;fs:./test_home/./test_home/.local/state/xdg_example&gt; [default]\n    runtime: &lt;none&gt; [default]\n  System directories:\n    config_dirs: [&lt;fs:/etc/xdg/xdg_example&gt;]\n    data_dirs: [&lt;fs:/usr/local/share/xdg_example&gt;; &lt;fs:/usr/share/xdg_example&gt;]\n\nCommand-line argument overrides both types of environment variables. 
Even\nthough both XDG_CONFIG_HOME and XDG_EXAMPLE_CONFIG_DIR are set, the\n--config-dir flag takes precedence and shows [cmdline] source. Other\ndirectories fall back to defaults since no other command-line args are\nprovided.\n</code></pre>\n<p>I ran the cram tests quite a few times and read through them to make sure\nthe shell sessions and explanations made sense, and then read through\nthe xdge source code itself which was pretty simple. There are some\nfeatures which Eio doesn't expose functionality for yet that are OS-specific\n(like checking the mount type), but they can come in a future iteration.</p>\n<h2 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h2>\n<p>Cmdliner is one of the gems in the open-source OCaml community due to how\neasy it makes it to build &quot;Unix-style&quot; applications with sensible manual\npages and consistent argument parsing. However, even after using it for\nyears I can never remember its term language without referring to the manual,\nand I always find myself cut-and-pasting from previous code and editing it.</p>\n<p>Using the agent definitely helped me out here. A lot of the XDG logic\nis fairly boilerplate, but extremely useful to express in a typed way.\nI anticipate now using this library in almost every CLI tool I build\nin OCaml, as it has enough information exposed in the interface to let\ndownstream CLI-coding agents pick the right base directories to use by\ndefault.</p>\n<p>Onto <a href=\"/notes/aoah-2025-4\">Day 4</a> then, where we'll go recursive by wrapping Claude in OCaml using Claude!</p>",
      "url": "https://anil.recoil.org/notes/aoah-2025-3",
      "title": "AoAH Day 3: XDG filesystem paths using Eio capabilities",
      "summary": "Building an XDG Base Directory Specification library with Eio capabilities and Cmdliner integration, providing sandboxed filesystem access patterns with full environment variable and CLI override support.",
      "date_published": "2025-12-03T00:00:00.000000Z",
      "date_modified": "2025-12-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-2",
      "content_html": "<p>Day 2 of the <a href=\"/notes/aoah-2025\">Advent of Agentic Humps</a> dawns with building a slightly more complex library than before, via the <a href=\"https://www.jsonfeed.org\">JSONFeed</a> specification that is a more modern version of Atom.</p>\n<p>JSONfeed is a successor to Atom for website feeds, that has a nice <a href=\"https://www.jsonfeed.org/version/1.1/\">informal specification</a> about how to parse it. However, it also has a growing number of <a href=\"https://github.com/egonw/JSONFeed-extensions/tree/main\">extensions</a> which also need to be implemented somehow, as well as some <a href=\"https://www.jsonfeed.org/mappingrssandatom/\">informal rules to map RSS/Atom to JSONFeed</a>.</p>\n<p>There is no existing OCaml implementation that I could find, and I need it to integrate my website with <a href=\"https://rogue-scholar.org\">Rogue Scholar</a> more easily for <a href=\"/notes/principles-for-collective-knowledge\">permanent DOIs</a>.</p>\n<p><a href=\"https://jsonfeed.org\"> <img src=\"/images/jsonfeed.webp\" alt=\"%c\" > </a></p>\n<h2 id=\"approach\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#approach\"></a>Approach</h2>\n<p>Unlike the <a href=\"/notes/aoah-2025-1\">Crockford</a> implementation, parsing JSON involves selecting a third-party library dependency cone. By default the agent chose <a href=\"https://github.com/ocaml-community/yojson\">Yojson</a> (presumably because its the most popular in its training set). 
I would conventionally use my own <a href=\"https://github.com/mirage/ezjsonm\">ezjsonm</a> library that builds over the lower level <a href=\"https://github.com/dbuenzli/jsonm\">jsonm</a>, but I noticed a deprecation notice on jsonm towards a newer library by <a href=\"https://erratique.ch\">Daniel Bünzli</a> called <a href=\"https://github.com/dbuenzli/jsont\">jsont</a>.</p>\n<h3 id=\"jsont\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#jsont\"></a>Jsont</h3>\n<p>After seeing the <a href=\"https://discuss.ocaml.org/t/ann-jsont-0-1-0-declarative-json-data-manipulation-for-ocaml/15702\">announcement of jsont</a> about a year ago, I gave it a quick try:</p>\n<blockquote>\n<p>Jsont is an OCaml library for declarative JSON data manipulation. It provides:</p>\n<ul>\n<li>Combinators for describing JSON data using the OCaml values of your choice. The descriptions can be used by generic functions to decode, encode, query and update JSON data without having to construct a generic JSON representation.</li>\n<li>A JSON codec with optional text location tracking and layout preservation. The codec is compatible with effect-based concurrency.</li>\n</ul>\n<p>The descriptions are independent from the codec and can be used by third-party processors or codecs.\n<cite>-- <a href=\"https://github.com/dbuenzli/jsont\">jsont</a>, Daniel Bünzli, 2025</cite></p>\n</blockquote>\n<p>The codec for a JSON type is expressed using combinators, and then separately serialised or deserialised. I found that it had <em>fantastic</em> error messages since it could use the codec to come up with the reason why some input has been rejected. 
But on the flipside, writing the codecs involved a lot of boilerplate, so I found it quite time consuming.</p>\n<p><a href=\"https://github.com/dbuenzli/jsont/blob/main/paper/soup.pdf\"> <img src=\"/images/jsont-paper.webp\" alt=\"%c\" title=\"Daniel wrote a nice paper about the combinator magic behind jsont\" > </a></p>\n<p>But now, with Claude, I could use it to scan a spec and automate codec construction using jsont! So I decided to try a complex coding case where I fed the prose JSONFeed spec to Claude, and instructed it to build jsont codecs. As a separate phase, I then built test cases and serialisers.</p>\n<h3 id=\"results\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#results\"></a>Results</h3>\n<p>The very first run of Claude crashed and burned with jsont as it didn't have enough examples to figure out the interface from scratch, resulting in a lot of type errors and no working code.</p>\n<p>So I took a different tack: I ran <code>opam source jsont</code> to get the source code into the current working directory, and then prompted a fresh Claude session to <em>&quot;ultrathink about the interface of jsont.0.2.0, paying particular attention to the cookbook&quot;</em>. 
The <a href=\"https://erratique.ch/software/jsont/doc/cookbook.html\">cookbook</a> is a section of the (excellent) documentation in Jsont that describes real-world usage, and the agent also picked up on the <a href=\"https://github.com/dbuenzli/jsont/blob/main/test/json_rpc.ml\">jsonrpc</a> testcase in the jsont repository.</p>\n<p>Once it had this bit of example-driven context, the agent proceeded to build a very credible set of codecs:</p>\n<ul>\n<li><a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/lib/author.ml\">Author descriptions</a> look about right, with a pretty printer thrown in as a bonus.</li>\n<li>The extremely tedious set of <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/lib/cito.ml\">CITO citation methods</a> came out in one shot.</li>\n<li>It figured out to use <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/lib/rfc3339.ml\">Ptime based RFC3339</a> date handling for the feeds, from a combination of the spec and by querying opam locally.</li>\n<li>The overall <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/lib/jsonfeed.mli\">Jsonfeed.mli</a> interface weaves all these together, exposes the jsont codec value <em>and</em> accessor and pretty printer functions.</li>\n<li>As a bonus, I fed it the extensions repository so it can now expose <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/lib/item.ml#L160\">structured references</a> in my own <a href=\"https://anil.recoil.org/perma.json\">site's JSON feed</a>.</li>\n</ul>\n<p>The user-exposed API is idiomatic OCaml:</p>\n<pre><code class=\"language-ocaml\">type t\nval jsont : t Jsont.t\nval create :\n  title:string -&gt; ?home_page_url:string -&gt;\n  ?feed_url:string -&gt; ?description:string -&gt;\n  ?user_comment:string -&gt; ?next_url:string -&gt;\n  ?icon:string -&gt; ?favicon:string -&gt;\n  ?authors:Author.t list -&gt; ?language:string -&gt;\n  ?expired:bool -&gt; ?hubs:Hub.t list 
-&gt;\n  items:Item.t list -&gt; ?unknown:Unknown.t -&gt;\n  unit -&gt; t\n</code></pre>\n<p>The full set of constructors and validators can be read in <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/lib/jsonfeed.mli\">jsonfeed.mli</a>.</p>\n<h3 id=\"tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tests\"></a>Tests</h3>\n<p>For test cases, I first got Claude to synthesise a <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/tree/main/test/data\">variety of JSONFeed corpuses</a> that exercised the spec, and manually inspected it to check it all seemed ok. I then made use of <a href=\"https://dune.readthedocs.io/en/latest/reference/cram.html\">Dune Cram tests</a> to write CLI-based validators that run.</p>\n<p>All of the boilerplate around writing the test cases just worked out of the box, with a little bit of manual prompting from me required to guide the agent towards some known edge cases (like extensions handling).</p>\n<p>Here's an excerpt from the <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/test/test_locations.t\">cram tests</a>:</p>\n<pre><code>Missing Required Fields\n------------------------\n\nTest missing title field:\n  $ ./test_location_errors.exe data/missing_title.json title\n  {&quot;status&quot;:&quot;error&quot;,&quot;message&quot;:&quot;Missing member title in JSON Feed object&quot;,\n   &quot;location&quot;:{&quot;file&quot;:&quot;data/missing_title.json&quot;,&quot;line&quot;:1,&quot;column&quot;:1,\n   &quot;byte_start&quot;:0,&quot;byte_end&quot;:65},&quot;context&quot;:&quot;$&quot;}\n  [1]\n\nTest missing version field:\n  $ ./test_location_errors.exe data/missing_version.json title\n  {&quot;status&quot;:&quot;error&quot;,&quot;message&quot;:&quot;Missing member version in JSON Feed object&quot;,\n   &quot;location&quot;:{&quot;file&quot;:&quot;data/missing_version.json&quot;,&quot;line&quot;:1,&quot;column&quot;:1,\n   
&quot;byte_start&quot;:0,&quot;byte_end&quot;:51},&quot;context&quot;:&quot;$&quot;}\n  [1]\n</code></pre>\n<h3 id=\"reflections\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections\"></a>Reflections</h3>\n<p>The code is available on <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed\">anil.recoil.org/ocaml-jsonfeed</a> with <a href=\"https://ocaml.org/p/jsonfeed/latest\">online docs</a>.  My first user for this might be <a href=\"https://mynameismwd.org\">Michael Dales</a> in his <a href=\"https://digitalflapjack.com/blog/the-partially-dynamic-web/\">own website</a>.</p>\n<p>The use of Claude has made using jsont the default choice for me now. As a first phase, I can ask the LLM to synthesise down a spec into a combinator-based codec, and then validate that separately. Crucially, the good error messages from jsont help the agent to root-cause why some tests fail, and give me the option of either clarifying the spec or fixing the test case if that was the error.</p>\n<p>I did, however, still have to make the judgement call of which libraries to use at the start. The agent also happily spat out Yojson and Ezjsonm based implementations for me, but I simply prefer the jsont approach. If you had other priorities like pure performance, you might go for Yojson instead.</p>\n<p>Onto <a href=\"/notes/aoah-2025-3\">Day 3</a>, where we'll then build our first Eio based library!</p><h1>References</h1><ul><li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-2",
      "title": "AoAH Day 2: Building an OCaml JSONFeed library",
      "summary": "Implementing a JSONFeed specification library using jsont codecs, discovering how Claude can automate the construction of complex combinators from prose specifications with excellent error messages.",
      "date_published": "2025-12-02T00:00:00.000000Z",
      "date_modified": "2025-12-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/askdv-e9z43",
      "content_html": "<p>Our neighbours France and the UK announced a <a href=\"https://www.cam.ac.uk/news/british-french-research-partnership-on-ai\">Franco-British AI collaboration</a> a few months ago dubbed the <a href=\"https://www.ip-paris.fr/en/news/entente-cordiale-paris-saclay-oxford-cambridge-ai-initiative-franco-british-partnership-field-ai\">Entente CordIAle</a>. Last week we held a couple of days of workshops with our Oxford and French buddies deep diving into details of what a partnership might actually involve; a particular pleasure with France given my group's long <a href=\"/projects/ocamllabs\">history</a> of working with <a href=\"/notes/acm-sigplan-award\">Inria on OCaml</a> and other open source projects.</p>\n<p>I sprinted <a href=\"/notes/principles-for-collective-knowledge\">back from Birmingham</a> to speak about our research on connecting the dots on <a href=\"/projects/life\">terrestrial life</a> via <a href=\"/papers/2025-tessera\">TESSERA</a>, <a href=\"/papers/2024-life\">LIFE</a>, <a href=\"/papers/2024-food-life\">food</a> and the <a href=\"/papers/2025-evidence-tap\">scholarly literature</a>. The other talks were vastly ambitious, best summarised in the <a href=\"https://www.scriberia.com/\">Scriberia</a> visual:</p>\n<p><img src=\"/images/aicam-french-6.webp\" alt=\"%c\" title=\"The scriberia summary of the Entente Cordiale Workshop (credit: AI@CAM). My talk is on the bottom left!\" ></p>\n<p>There's a great <a href=\"https://www.ai.cam.ac.uk/blog/how-ai-is-changing-the-practice-of-science\">blog post from AI@CAM</a> (who hosted the workshops on our Cambridge end) on how AI is changing our <a href=\"/notes/ai-for-science-2024\">practise of science</a>:</p>\n<blockquote>\n<p>On 20 November, as part of an emerging UK-France collaboration in AI, ai@cam brought together researchers working at the intersection of AI and science. 
The conversation moved across different problems: how do you forecast weather without traditional numerical weather prediction? How do you model how a drug affects tissue structure? How do you predict molecular structures from mass spectrometry data? How do you quantify the extinction impact of agricultural land use? How do you extract reliable insights from the neuroscience literature? How do you diagnose dementia earlier? These aren't separate questions. They're connected by something deeper: a rethinking of how the scientific pipeline works, from data to actionable insight.\n<cite>-- <a href=\"https://www.ai.cam.ac.uk/blog/how-ai-is-changing-the-practice-of-science\">How AI is Changing the Practice of Science</a>, AI@CAM, Nov 2025</cite></p>\n</blockquote>\n<p><img src=\"/images/aicam-french-2.webp\" alt=\"%rc\" ></p>\n<p>One of my favourite talks was related to open-source; I hadn't realised that the <a href=\"https://scikit-learn.org/stable/\">scikit-learn</a> project that's used widely in machine learning actually began as a Google Summer of Code project and was subsequently incubated by <a href=\"https://en.wikipedia.org/wiki/Scikit-learn\">Inria Saclay</a>. Gaël Varoquaux gave a fantastic overview of his interest in <a href=\"https://arxiv.org/abs/2502.05564\">machine learning for tabular data</a>, which struck me as a <a href=\"/notes/icfp25-what-i-learnt\">similar argument</a> to <a href=\"https://cs.brown.edu/~sk/\">Shriram Krishnamurthi</a> making the <a href=\"https://dl.acm.org/doi/10.1145/3408056\">case for data-centricity</a> in teaching.</p>\n<p><img src=\"/images/aicam-french-3.webp\" alt=\"%rc\" ></p>\n<p>After this, we had a formal reception at the French Embassy in London hosted by Her Excellency <a href=\"https://uk.diplomatie.gouv.fr/en/helene-treheux-duchene-french-ambassador-united-kingdom\">Hélène Tréheux-Duchêne</a>, Ambassador to the United Kingdom. 
It was as posh as you <a href=\"https://www.youtube.com/watch?v=hMlP_Moo0bE\">may expect</a>, and <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> represented Cambridge with a splendid speech about the importance of sovereignty and openness in this age of frontier AI.</p>\n<p>I also enjoyed meeting my counterparts from Oxford to learn more about their own work. In particular, <a href=\"https://www.cs.ox.ac.uk/people/sandra.kiefer/\">Sandra Kiefer</a> gave me some interesting <a href=\"https://proceedings.mlr.press/v162/morris22a\">reading</a> on her recent work in complexity theory and algorithms, which I need to get up to speed on!</p>\n<p><img src=\"/images/aicam-french-7.webp\" alt=\"%c\" title=\"I managed to sneak into the group embassy picture by a literal nose\" ></p>\n<h2 id=\"followup-funding-for-conservation-copilots\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#followup-funding-for-conservation-copilots\"></a>Followup funding for Conservation Copilots</h2>\n<p>Subsequently, I'm also grateful to AI@CAM for giving our <a href=\"/projects/ce\">Conservation Evidence Copilots</a> project follow-on funding to help develop the &quot;<a href=\"https://www.ai.cam.ac.uk/news/from-diagnosing-endometriosis-to-protecting-global-food-security-seven-new-ai-deas-projects-advance-ai-for-science-citizens-and-society\">Conservation Co-pilot: Making Environmental Evidence Accessible</a>&quot;, led by <a href=\"https://samreynolds.org\">Sam Reynolds</a>:</p>\n<blockquote>\n<p>With over 1 million users already accessing the Conservation Evidence database, this project will develop the first ‘Conservation Co-pilot’ – an AI-powered chat interface that retrieves, summarises, and presents conservation evidence to answer user questions. The challenge is ensuring that AI faithfully represents scientific evidence without misrepresentation. 
Building on rigorous evaluation research comparing frontier AI models with human experts, the team will create an agentic system that draws together evidence while maintaining faithfulness to source material. The tool will be game-changing for conservation decision-makers seeking evidence to guide action.\n<cite>-- <a href=\"https://www.ai.cam.ac.uk/news/from-diagnosing-endometriosis-to-protecting-global-food-security-seven-new-ai-deas-projects-advance-ai-for-science-citizens-and-society\">Seven New AI-deas Projects Advance AI for Science, Citizens, and Society</a>, 2025</cite></p>\n</blockquote>\n<p><img src=\"/images/aicam-french-4.webp\" alt=\"%rc\" ></p>\n<p>Read more about this from the <a href=\"https://www.linkedin.com/posts/lucastourny_artificialintelligence-ententecordiale-teamfrance-ugcPost-7397952272338681856-1BYo\">French contingent</a> and the <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7398464926412939264/\">LinkedIn post</a> from <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, who also evangelised our <a href=\"/projects/ce\">Conservation Evidence Copilots</a> project extremely well across the event! While I've enjoyed the recent <a href=\"/notes/path-to-uk-india-ai-summit\">diplomatic visits from India</a> as well, I am also glad to be able to sit down and do some hacking this week when things are a bit quieter.</p>\n<p><a href=\"http://carlhenrik.com/\">Carl Henrik Ek</a> and I are also very grateful to <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> for taking the time to sketch out the logic in his recent paper on <a href=\"https://arxiv.org/abs/2511.06795\">The Inaccessible Game</a> to our undergraduates at Pembroke after this event as well. 
It was a heady mix of thermodynamics, information theory and <a href=\"https://en.wikipedia.org/wiki/GENERIC_formalism\">GENERIC</a> in the best possible way!</p>\n<p><img src=\"/images/aicam-french-5.webp\" alt=\"%c\" title=\"Back in familiar territory, Neil Lawrence explains the Inaccessible Game to Pembroke CST undergrads\" ></p><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Madhavapeddy (2025). On the path to the UK/India AI Summit with OpenUK and the ATI. <a href=\"https://doi.org/10.59350/x6rea-1g262\" target=\"_blank\"><i>10.59350/x6rea-1g262</i></a></li>\n<li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li>\n<li>Madhavapeddy (2024). Royal Society and DeepMind host AI for Science Forum. 
<a href=\"https://doi.org/10.59350/0znpc-fw825\" target=\"_blank\"><i>10.59350/0znpc-fw825</i></a></li>\n<li>Qu et al (2025). TabICL: A Tabular Foundation Model for In-Context Learning on Large Data. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2502.05564\" target=\"_blank\"><i>10.48550/arXiv.2502.05564</i></a></li>\n<li>Krishnamurthi et al (2020). Data-centricity: a challenge and opportunity for computing education. Commun. ACM. <a href=\"https://doi.org/10.1145/3408056\" target=\"_blank\"><i>10.1145/3408056</i></a></li>\n<li>Lawrence (2025). The Inaccessible Game. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2511.06795\" target=\"_blank\"><i>10.48550/arXiv.2511.06795</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/entente-cordiale",
      "title": "The AI French Connection to the Practice of Science",
      "summary": "Reflections on the Franco-British AI collaboration workshops exploring how AI is transforming scientific practice, plus follow-up funding for the Conservation Copilot project.",
      "date_published": "2025-12-01T00:00:00.000000Z",
      "date_modified": "2025-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "evidence",
        "llms",
        "evidence"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/x6rea-1g262",
          "doi": "10.59350/x6rea-1g262",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0znpc-fw825",
          "doi": "10.59350/0znpc-fw825",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2502.05564",
          "doi": "10.48550/arXiv.2502.05564",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3408056",
          "doi": "10.1145/3408056",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2511.06795",
          "doi": "10.48550/arXiv.2511.06795",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aoah-2025-1",
      "content_html": "<p>Let's start day 1 of the <a href=\"/notes/aoah-2025\">Advent of Agentic Humps</a> with a gentle introduction to agentic coding. Firstly, I've chosen to exclusively use <a href=\"https://www.claude.com/product/claude-code\">Claude Code</a> for this since it's CLI driven. I tried some of the other Copilot and Cursor IDEs, but I just couldn't adjust to how busy the displays were.</p>\n<p>With Claude, my setup first involved a custom devcontainer using Docker on a Linux host, and my local Mac laptop. I coordinate both of these via Git repositories hosted up at <a href=\"https://tangled.org\">Tangled</a> with a <a href=\"/notes/disentangling-git-with-bluesky\">self-hosted knot</a>.</p>\n<h2 id=\"remote-sandboxed-docker-devcontainer\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#remote-sandboxed-docker-devcontainer\"></a>Remote sandboxed Docker devcontainer</h2>\n<p>I setup a remote Docker container on a Linux host where I can leave the agent running unattended in a sandbox. I coded up a <a href=\"https://tangled.org/anil.recoil.org/slop/tree/main/.devcontainer\">custom Claude devcontainer</a> for this which takes care of firewalling the agent at the DNS level and installing <a href=\"https://tangled.org/anil.recoil.org/slop/blob/main/.devcontainer/setup-ocaml.sh\">relevant OCaml packages</a>.</p>\n<p>With this setup, I can run <code>claude --dangerously-skip-permissions</code> and have it only access my repositories on <a href=\"/notes/disentangling-git-with-bluesky\">Tangled</a> and GitHub.</p>\n<p>As a bit of fun, the wrapper <a href=\"https://tangled.org/anil.recoil.org/slop/blob/main/slopper\">slopper script</a> to start the session passes the session shell output back to Claude afterwards, to give it an opportunity to rewrite the Dockerfile with additional libraries I installed. 
It's shaky, but it occasionally does something useful.</p>\n<p><img src=\"/images/aoah-ss-5.webp\" alt=\"%c\" title=\"The Claude devcontainer works great with tmux and Docker sandboxing\" ></p>\n<h2 id=\"local-development\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#local-development\"></a>Local development</h2>\n<p>On my Mac and Linux desktops, I just use Claude Code directly with my usual vi-based OCaml coding setup. This is just a straightforward addition to my usual workflow with Claude running in a terminal.</p>\n<p>The only major change is that I assume that all code generated is slop by default, and remains so until &quot;promoted&quot; via review and manual editing to the next level. I therefore stage all my work-in-progress slop code in a dedicated <a href=\"https://tangled.org/anil.recoil.org/slop\">slop repository</a> which I'll reset and delete regularly. This is a convenient way of doing coordination, nothing more.</p>\n<h2 id=\"getting-started-with-the-first-library-crockford-base32\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#getting-started-with-the-first-library-crockford-base32\"></a>Getting started with the first library: Crockford base32</h2>\n<p><strong>Problem.</strong> I've integrated my blog <a href=\"/notes/principles-for-collective-knowledge\">with Rogue Scholar</a> but the mechanism for doing so required generating <a href=\"https://www.crockford.com/base32.html\">Base32 encoded random DOIs</a> for the Rogue Scholar DOI assigner to pick up from my blog feed. There is no existing Base32 library in OCaml, so let's make one.</p>\n<p><strong>Approach.</strong>\nI looked over at the <a href=\"https://www.crockford.com/base32.html\">spec</a>, and also (with permission from Martin Fenner) at his reference <a href=\"https://pkg.go.dev/github.com/front-matter/commonmeta/crockford\">Go implementation</a> on GitHub. 
I then instructed Claude to synthesise a spec from the online spec <em>and</em> the Go implementation, and used Sonnet 4.5 to come up with a plan. I then instructed it to write the OCaml implementation as a single module with no dependencies other than the stdlib. I then <em>separately</em> had a session to take just the OCaml interface file and synthesise alcotest-based checks from the spec, to act as test cases.</p>\n<p><strong>Results.</strong> The <a href=\"https://tangled.org/anil.recoil.org/ocaml-crockford\">crockford code</a> is up with <a href=\"https://ocaml.org/p/crockford/latest\">online docs</a>. I assigned co-copyright to Front Matter due to cross-generating the code with their implementation as a reference. Unsure if this is the right thing to do, but it seems better to be generous about copyright assignment in this case.</p>\n<p><strong>Reflections.</strong> This is a pretty simple day 1 task, as it's a single module of OCaml and relatively easy to test. Still, I had to be careful to not just generate the tests in the same session as the code to ensure that the agent didn't just specialise the tests to the code it had written.</p>\n<p>Onto <a href=\"/notes/aoah-2025-2\">Day 2</a> then, where we'll build a more substantive JSONFeed library!</p><h1>References</h1><ul><li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li>\n<li>Madhavapeddy (2025). Four Ps for Building Massive Collective Knowledge Systems. <a href=\"https://doi.org/10.59350/418q4-gng78\" target=\"_blank\"><i>10.59350/418q4-gng78</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/aoah-2025-1",
      "title": "AoAH Day 1: Building a Base32 Crockford library in OCaml",
      "summary": "Building a Base32 Crockford encoding library in OCaml using Claude Code, establishing the development workflow with sandboxed Docker containers and local development environments.",
      "date_published": "2025-12-01T00:00:00.000000Z",
      "date_modified": "2025-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "aoah",
        "ocaml",
        "agents",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/418q4-gng78",
          "doi": "10.59350/418q4-gng78",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/418q4-gng78",
      "content_html": "<p>I've been building some big <a href=\"/notes/rs-future-of-publishing\">collective knowledge</a> systems recently, both for <a href=\"/papers/2025-evidence-tap\">scholarly literature</a> or to power large-scale <a href=\"/papers/2025-tessera\">observational foundation models</a>.  While the modalities of\nknowledge in these systems are very different, they share a common set of design principles I've noticed while building\n<a href=\"/notes/geotessera-python\">individual</a> <a href=\"/notes/uk-national-data-lib\">pieces</a>. A good computer architecture is one that can be re-used, and I've been mulling over what this exactly is for some time.</p>\n<p>I found the perfect place to codify this at the <a href=\"https://www.aria.org.uk/opportunity-spaces/collective-flourishing\">ARIA Workshop on Collective Flourishing</a> that <a href=\"https://toao.com\">Sadiq Jaffer</a> and I attended in Birmingham last week. I posit there are <em>&quot;4 P's&quot;</em> needed for any collective knowledge system to be robust and accurate: <strong>permanence, provenance, permission and placement</strong>. If these properties exist throughout our knowledge graph, we can make robust networks for rapid <a href=\"/notes/ai-for-evidence-synthesis-workshop\">evidence-based decision making</a>. They also form a dam against the wave of <a href=\"/notes/claude-copilot-sandbox\">agentic AI</a> that is going to dominate the Internet next year in a big way.</p>\n<p><em>Will building these collective knowledge systems be a transformative capability for human society?</em> Hot on the heels of COP30 <a href=\"https://www.bbc.co.uk/news/articles/cp84m16mdm1o\">concluding indecisively</a>, I've been getting excited by decision making towards <em>biodiversity</em> going down a more positive path in <a href=\"https://www.cbd.int/sbstta/ipbes.shtml\">IPBES</a>. 
We could empower decision-makers at all scales (local, country, international) to move <a href=\"https://fivetimesfaster.org/\">five times faster</a> on actions about global species extinctions, unsustainable wildlife trade and food security, while rapidly assimilating extraordinarily complex evidence chains. I'll talk about this more while explaining the principles...</p>\n<p>This post is split up into a few parts. First, let's <a href=\"#collective-flourishing-at-aria\">introduce ARIA's convening role</a> in this. Then I'll <a href=\"#the-4ps-for-collective-knowledge-systems\">introduce the principles</a> of <a href=\"#p1-permanence-aka-dois-for-all-with-the-rogue-scholar\">permanence</a>, <a href=\"#p2-provenance-is-it-ai-poison-or-rare-literature\">provenance</a>, <a href=\"#p3-permission-not-everything-needs-to-go-into-the-borg\">permission</a>, and <a href=\"#p4-placement-data-has-weight-and-geopolitics-matters\">placement</a>. The post does assume some knowledge of Internet protocols; I'll write a more accessible one for a general audience at a future date when the ideas are more baked!</p>\n<h2 id=\"collective-flourishing-at-aria\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#collective-flourishing-at-aria\"></a>Collective Flourishing at ARIA</h2>\n<p>The ARIA workshop was held in lovely Birmingham, hosted by programme manager <a href=\"https://scholar.google.com/citations?user=7MTl9CAAAAAJ&amp;hl=en\">Nicole Wheeler</a>. 
It explored four core beliefs that ARIA had published about this opportunity space in collective flourishing:</p>\n<blockquote>\n<ol>\n<li>Navigating towards a better future requires clarity on direction and path &gt; <strong>we need the capability to make systemic complexity legible so we can envision and deliberate over radically different futures.</strong></li>\n<li>Simply defining our intent for the future is not enough → <strong>we need a means of negotiating our fragmented values into shared, actionable plans for collective progress.</strong></li>\n<li>Our current cognitive, emotional, and social characteristics are not immutable constants → <strong>human capacity can and will change over time, and we need tools to figure out together how we navigate this change.</strong></li>\n<li>Capabilities that augment our vision, action, and capacity are powerful and can have unintended consequences → <strong>we must balance the pressing need for these tools with the immense responsibility they entail.</strong>\n<cite>-- <a href=\"https://www.aria.org.uk/opportunity-spaces/collective-flourishing\">ARIA Collective Flourishing Opportunity Space</a>, 2025</cite></li>\n</ol>\n</blockquote>\n<p>I agree with these values, and translating these into concrete\nsystems concepts seems a useful exercise.  The workshop was under <a href=\"https://www.chathamhouse.org/about-us/chatham-house-rule\">Chatham\nHouse rules</a>, which\nhamstrings my ability to credit individuals, but the gathering was a\nuseful and eclectic mix of social scientists and technologists.  
There was also\na real sense of collective <em>purpose</em>: a desire to <a href=\"https://ukfoundations.co/\">reignite UK growth</a> and <a href=\"https://www.bensouthwood.co.uk/p/regional-inequality-also-a-housing\">decrease inequality</a>.</p>\n<p><img src=\"/images/aria-ci-1.webp\" alt=\"%c\" title=\"While I cannot confirm nor deny any individual's presence at the workshop, there was much banter late into the evening at the pub afterwards!\" ></p>\n<h2 id=\"the-4ps-for-collective-knowledge-systems\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-4ps-for-collective-knowledge-systems\"></a>The 4P's for Collective Knowledge Systems</h2>\n<p>First, why come up with these system design principles at all? I believe strongly both in building systems from the ground up, and also in\n<a href=\"https://en.wikipedia.org/wiki/Eating_your_own_dog_food\">eating my own dogfood</a>\nand using whatever I build. I also define knowledge broadly: not just academic\npapers, but also geospatial datasets, blogs and other more conventionally\n&quot;informal&quot; knowledge sources that are increasingly complementing scholarly\npublishing as a source of timely knowledge.<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup></p>\n<p>Towards this, several of my colleagues such as <a href=\"https://jonmsterling.com\">Jon Sterling</a> have been building systems like <a href=\"https://www.forester-notes.org/tfmt-000V/index.xml\">Forester</a>, and I've got my own homebrew <a href=\"/notes/bushel-lives\">Bushel</a>, and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> has <a href=\"https://patrick.sirref.org/weekly-2025-w45/index.xml\">Graft</a>, and <a href=\"https://mynameismwd.org\">Michael Dales</a> has decades of <a href=\"https://digitalflapjack.com/blog/\">Atom/RSS</a> on his sites. 
These sites are very loosely coupled -- they're built by different people over many years -- but there are already the beginnings of a rich mesh of hyperlinks across them.</p>\n<p>To get to the next level of collective meshing (not just for these personal sites but for <a href=\"/papers/2025-biodiversity-9recs\">biodiversity data</a> as well), I posit we need to explicitly engineer in support for permanence, provenance, permission and placement right into the way we access data across the Internet. If we build these mechanisms via Internet <em>protocols</em>, collective knowledge can be meshed without any one entity requiring central control. I view this as being vital for the Internet as a whole to <a href=\"/papers/2025-internet-ecology\">evolve and adapt</a> into the coming decades and combat <a href=\"https://en.wikipedia.org/wiki/Enshittification\">enshittification</a>.</p>\n<p>I'll now dive into the four principles: (i) <a href=\"#p1-permanence-aka-dois-for-all-with-the-rogue-scholar\">permanence</a>; (ii) <a href=\"#p2-provenance-is-it-ai-poison-or-rare-literature\">provenance</a>; (iii) <a href=\"#p3-permission-not-everything-needs-to-go-into-the-borg\">permission</a>; and (iv) <a href=\"#p4-placement-data-has-weight-and-geopolitics-matters\">placement</a>.</p>\n<h2 id=\"p1-permanence-aka-dois-for-all-with-the-rogue-scholar\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#p1-permanence-aka-dois-for-all-with-the-rogue-scholar\"></a>P1: Permanence (aka DOIs for all with the Rogue Scholar!)</h2>\n<p>Firstly, knowledge that is spread around the world needs a way to be retrieved\nreliably. Scholarly publications, especially open-access ones, are distributed\nboth digitally and physically and often <a href=\"https://www.researchgate.net\">replicated</a>. While papers are &quot;big&quot; enough\npieces of work to warrant this effort, what about all the other outputs we have\nsuch as (micro)blogs, social media posts, and datasets?  
A reliable addressing\nsystem is essential to retrieve these too, and we can do this via standard\nInternet protocols such as HTTP and DNS.</p>\n<p>The eagle-eyed among you might notice that my site now has a unique DOI for most posts.\nA <a href=\"https://doi.org\">Digital Object Identifier</a> is something you more conventionally\nsee associated with academic papers, but thanks to the hard work of <a href=\"https://front-matter.de/team\">Martin Fenner</a>\nwe can now have them for other forms of content!  Martin set up the <a href=\"https://rogue-scholar.org/\">Rogue Scholar</a>\nwhich permits <em>any</em> standards-compliant site with an Atom or JSONFeed to be assigned a DOI automatically.</p>\n<p>This post, for example, has the\n<a href=\"https://doi.org/10.59350/418q4-gng78\">10.59350/418q4-gng78</a> DOI assigned to\nit, which forms its unique identifier.  This can be resolved into the real\nlocation by retrieving the DOI URL, which issues an HTTP redirect to this post:</p>\n<pre><code>$ curl -I https://doi.org/10.59350/418q4-gng78\nHTTP/2 302\ndate: Wed, 26 Nov 2025 12:48:43 GMT\nlocation: https://anil.recoil.org/notes/fourps-for-collective-knowledge\n</code></pre>\n<p>Crucially, this DOI URL is not the <em>only</em> identifier for this post, as you can still\nalso use my original homepage URL. However, it's an identifier that can\nbe redirected to a new location if the content moves, and <em>also</em> has extra metadata associated\nwith it that help with keeping track of networks of knowledge.</p>\n<p><a href=\"https://rogue-scholar.org\"> <img src=\"/images/aria-ci-4.webp\" alt=\"%c\" title=\"You too can sign up to the Rogue Scholar with your blog!\" > </a></p>\n<p>Let's look at some more details of the extra useful metadata, by peering at one of my <a href=\"/notes/icfp25-what-i-learnt\">recent posts</a> to see how Rogue Scholar <a href=\"https://rogue-scholar.org/records/kqwjw-cjb76\">augments the metadata for it</a>. 
They (i) <a href=\"#tracking-authorship-metadata\">track author identities</a>; (ii) build a <a href=\"#forming-a-reference-mesh\">reference mesh</a> across items; and (iii) <a href=\"#archiving-and-versioning-posts\">archive clean versions</a> to replicate content with an open license.</p>\n<h3 id=\"tracking-authorship-metadata\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tracking-authorship-metadata\"></a>Tracking authorship metadata</h3>\n<p>Firstly, the authorship information helps to identify me concretely across name variations. My own <a href=\"https://orcid.org/0000-0001-8954-2428\">ORCID</a> forms a unique identifier for my own scholarly publishing, and this is now <a href=\"https://blog.front-matter.de/posts/rogue-scholar-links-records-via-orcid-and-doi/\">tied</a> to my blog post.\nYou can then <a href=\"https://rogue-scholar.org/search?q=orcid%3A0000-0001-8954-2428&amp;l=list&amp;p=1&amp;s=10&amp;sort=newest\">search for my ORCID</a> and find my posts, but <em>also</em> find it in other indexing systems such as <a href=\"https://search.crossref.org/?from_ui=&amp;q=0000-0001-8954-2428\">CrossRef</a> which index scholarly metadata.  <a href=\"https://blog.openalex.org/were-rebuilding-openalex-while-its-running-heres-whats-changing/\">OpenAlex has just rewritten</a> their codebase and released it a few weeks ago, with 10s of millions of new types of works indexed.</p>\n<p>Curating databases that are this large across decades clearly leads to some inconsistencies as people move around jobs and change their circumstances. Identifying &quot;who&quot; has done something is therefore a surprisingly tricky metadata problem. This is one of the areas where <a href=\"/notes/disentangling-git-with-bluesky\">ATProto, Bluesky and Tangled</a> have a lot to offer, by allowing the social graph to be shared among multiple differentiated services (e.g. 
microblogging or code hosting).</p>\n<p><a href=\"https://arxiv.org/pdf/2402.03239\"> <img src=\"/images/aria-ci-5.webp\" alt=\"%rc\" title=\"Bluesky is a highly reusable social authentication mechanism\" > </a></p>\n<h3 id=\"forming-a-reference-mesh\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#forming-a-reference-mesh\"></a>Forming a reference mesh</h3>\n<p>Secondly, references from links within this post are extracted out and linked\nto <em>other</em> DOIs. I do this by generating a <a href=\"/perma.json\">structured JSONFeed</a>\nwhich breaks out metadata for each post by scanning the links within my source\nMarkdown. For example, here is an excerpt for one of the <code>&quot;references&quot;</code> fields\nin my post:</p>\n<pre><code class=\"language-json\">{ &quot;url&quot;: &quot;https://doi.org/10.33774/coe-2025-rmsqf&quot;,\n  &quot;doi&quot;: &quot;10.33774/coe-2025-rmsqf&quot;,\n  &quot;cito&quot;: [ &quot;citesAsSourceDocument&quot; ] }, {\n  &quot;url&quot;: &quot;https://doi.org/10.59350/hasmq-vj807&quot;,\n  &quot;doi&quot;: &quot;10.59350/hasmq-vj807&quot;,\n  &quot;cito&quot;: [ &quot;citesAsRelated&quot; ] },\n</code></pre>\n<p>This structured list of references also includes <a href=\"https://purl.archive.org/spar/cito\">CITO</a> conventions to also list <em>how</em> the citation should be interpreted, which may be useful input to LLMs that are interpreting a document. 
I've <a href=\"https://ocaml.org/p/jsonfeed/latest\">published</a> an <a href=\"https://tangled.org/anil.recoil.org/ocaml-jsonfeed/blob/main/lib/cito.mli\">OCaml-JSONFeed</a> library that conveniently lists all the citation structures possible.\nThis reference metadata is hoovered up by databases such as <a href=\"https://search.crossref.org/search/works?q=10.59350%2Fhasmq-vj807&amp;from_ui=yes\">CrossRef</a> which use them to maintain their giant graph databases that associate posts, papers and anything else with a DOI with each other.</p>\n<p>To make this as easy as possible to do with any blog content online, Rogue Scholar has augmented <a href=\"https://blog.front-matter.de/posts/rogue-scholar-references-learn-new-tricks/\">how it scans posts</a> so that just adding a &quot;References&quot; header to your content is enough to make this just work. We now have an interconnected mesh of links between diverse blogs and papers and datasets, all using simple URLs!</p>\n<h3 id=\"archiving-and-versioning-posts\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#archiving-and-versioning-posts\"></a>Archiving and versioning posts</h3>\n<p>Thirdly, the metadata and Atom feeds are used to archive the contents of the post via the <a href=\"https://blog.front-matter.de/posts/rogue-scholar-blog-posts-archived-by-internet-archive/\">Internet Archive Archive-It service</a>. This is also not as straightforward as you might expect; the problem with archiving HTML straight from the source is that the web pages you read are usually quite a mess of JavaScript and display logic, whereas the <em>essence</em> of the page is hidden.</p>\n<p>For example, look at the <a href=\"https://web.archive.org/web/20250909194331/https://anil.recoil.org/notes/owntracks-and-lifecycle\">archive.org version</a> of one of my posts vs the <a href=\"https://rogue-scholar.org/records/wm49x-a9q51\">Rogue Scholar version</a> of the same post.  
The latter is significantly cleaner, since the &quot;archival&quot; version actually uses my <a href=\"https://anil.recoil.org/perma.json\">blog feed</a> instead of the original HTML.\nThe feed reader version strips out all the unnecessary display gunk so that it can be read by clients like <a href=\"https://netnewswire.com/\">NetNewsWire</a> or Thunderbird. There is some work that needs to happen on the Atom feed generation side to really make this clean; for example, I learnt about how to <a href=\"https://simonwillison.net/2024/Aug/1/footnotes-that-work-in-rss-readers/\">lay out footnotes to be feed-reader friendly</a>.</p>\n<p><a href=\"https://web.archive.org/web/20250909194331/https://anil.recoil.org/notes/owntracks-and-lifecycle\"> <img src=\"/images/aria-ci-2.webp\" alt=\"%c\" title=\"The raw archive version of the post, which keeps all the display HTML\" > </a></p>\n<p><a href=\"https://rogue-scholar.org/records/wm49x-a9q51\"> <img src=\"/images/aria-ci-3.webp\" alt=\"%c\" title=\"The feed reader version of the post, with a clean reader view that is easier to mechanically interpret\" > </a></p>\n<p>To wrap up the first P of Permanence, we've seen that it's a bit more involved than &quot;simply archiving it&quot;. Some metadata curation and formatting flexibility really helps to clean up the connections. If you have your own blog, you should sign up to <a href=\"https://rogue-scholar.org\">Rogue Scholar</a>. 
Martin has just incorporated it as a <a href=\"https://doi.org/10.53731/rftfk-qv692\">German non-profit organisation</a>, showing he's thinking about the long-term sustainability of such ventures as well.</p>\n<h2 id=\"p2-provenance-is-it-ai-poison-or-rare-literature\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#p2-provenance-is-it-ai-poison-or-rare-literature\"></a>P2: Provenance (is it AI poison or rare literature?)</h2>\n<p>The enormous problem we're facing with collective intelligence right now is that the Internet is getting flooded by AI generated slop. While there are obvious dangers to our collective sanity and attention spans, there's also the pragmatic problem that <a href=\"https://www.nature.com/articles/s41586-024-07566-y\">recursive training causes model collapse</a>. If we just feed our models the output of other language models, we greatly dilute the quality of the resulting LLMs and the overall quality of collective knowledge.</p>\n<p>We observed the societal implications in our <a href=\"/papers/2025-ai-poison\">recent Nature comment</a>:</p>\n<blockquote>\n<p>The publication of ever-larger numbers of problematic papers, including fake ones generated by artificial intelligence, represents an existential crisis for the established way of doing evidence synthesis. But with a new approach, AI might also save the day.\n<cite>-- <a href=\"https://rdcu.be/evkfj\">Will AI speed up literature reviews or derail them entirely?</a>, 2025</cite></p>\n</blockquote>\n<p>We urgently need to build accurate provenance information into our collective knowledge networks to distinguish <em>where</em> some piece of knowledge came from. Efforts like Rogue Scholar and <a href=\"https://blog.kagi.com/small-web\">Kagi Small Web</a> do this by human judgement: a community keeps an eye on the feeds and filters out the obviously bad actors. 
<a href=\"https://www.shaneweisz.com\">Shane Weisz</a> also pointed out to me that crowdsourcing communities often self-organise like this. For example, <a href=\"https://www.inaturalist.org/journal/gcwarbler/115290-cv-and-geomodel-predictions-visibility-identifiability-seasonality-apples-oranges\">iNaturalist volunteers</a> painstakingly critique AI output <em>vs</em> human experts for species detection. Provenance in these systems would help them scale their efforts without burning out.</p>\n<p>Luckily though, we do have some partial solutions already for keeping track of\nprovenance:</p>\n<ul>\n<li>Code can be versioned through Git, now widely adopted, but also <a href=\"/notes/tangled-and-ci\">federated via Tangled</a>.</li>\n<li>Data can be traced through services like <a href=\"https://zenodo.org\">Zenodo</a> and even <a href=\"https://help.zenodo.org/docs/deposit/describe-records/reserve-doi/\">given DOIs</a> just like Rogue Scholar has been doing. This is not perfect yet since it's difficult to <a href=\"/papers/2025-yirgacheffe\">continuously update large datasets</a>, but technology is steadily advancing here.</li>\n<li>Code <em>and</em> data can be versioned through dataflow systems, of which there are many, including several we discussed at <a href=\"/notes/icfp25-propl\">PROPL 2025</a>, such as <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a>'s <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763803\">dynamic STACs</a> or our own <a href=\"/papers/2022-oud-ocurrent\">OCurrent</a>, or <a href=\"https://codeocean.com/resources/nature-partnership\">Nature+CodeOcean</a> for scientific computation.</li>\n<li>Rogue Scholar <a href=\"https://doi.org/10.53731/nxp08-a9947\">supports DOI versioning</a> of posts to allow intentional edits of the same content.</li>\n</ul>\n<p>What's missing is a provenance <em>protocol</em> by which each of these &quot;islands of provenance&quot; can interoperate across each other's boundaries. 
Almost every project runs its own CI systems that never share the details of how they got their data and code. Security organisations are now recommending <a href=\"https://www.ncsc.gov.uk/blog-post/sboms-and-the-importance-of-inventory\">Software Bill of Materials</a> be generated for all software, and <a href=\"https://www.docker.com/products/hardened-images/\">Docker Hardened Images</a> are acting as an anchor for wider efforts in this space. The IETF <a href=\"https://blog.aayushg.com/standards/\">is moving to advance standards of provenance</a> but perhaps too slowly and conservatively given the <a href=\"/notes/ai-ietf-aiprefs\">rapid rise of AI crawlers</a>.</p>\n<p>An area I'm going to investigate in the future is how HTTP-based provenance headers\nmight help glue these together, so that a collective knowledge crawler doesn't\nneed to build a global provenance graph (which would be overwhelmingly massive)\nto filter out non-trusted primary content.</p>\n<p><a href=\"https://rdcu.be/evkfj\"> <img src=\"/images/davidparkins-ai-poison.webp\" alt=\"%c\" title=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\" > </a></p>\n<h2 id=\"p3-permission-not-everything-needs-to-go-into-the-borg\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#p3-permission-not-everything-needs-to-go-into-the-borg\"></a>P3: Permission (not everything needs to go into the Borg)</h2>\n<p>The Internet is pretty good about building giant public databases, and it's\nalso pretty good at supporting storing secret data. However, it's <em>terrible</em>\nat supporting semi-private access to remote sites.</p>\n<p>Consider a really obvious collective knowledge case: I want to expose my draft\npapers that I'm working on with a diverse group of people. I collaborate with\ndozens of people all over the world, and so want to selectively grant them\naccess to my works-in-progress.  
Why is this so difficult to do?</p>\n<p>It's currently easy to use <em>individual</em> services to grant access; for example,\nI might share my <a href=\"https://overleaf.com\">Overleaf</a> or my Google Drive for a\nproject, but propagating those access rights across services is near impossible\nas soon as you cross a project or API boundary.  There are a few directions we\ncould go to break this problem down into easier-to-solve chunks:</p>\n<ul>\n<li>If we make it easier to self-host services, for example via initiatives like <a href=\"https://doi.org/10.59350/s621r-eg143\">Eilean</a>, then having access to the databases directly makes it much easier to take nuanced decisions about which bits of the data to grant access to. I run, for example, three separate video hosting sites: one for <a href=\"https://watch.ocaml.org\">OCaml</a>, one for the <a href=\"https://watch.eeg.cl.cam.ac.uk\">EEG</a> and another <a href=\"https://crank.recoil.org\">personally</a>. Each of these federates across each other via <a href=\"https://activitypub.rocks\">ActivityPub</a>, but still supports private videos.</li>\n<li>There was research into distributed permission protocols like <a href=\"https://research.google/pubs/macaroons-cookies-with-contextual-caveats-for-decentralized-authorization-in-the-cloud/\">Macaroons</a> at the height of the cloud boom a decade ago, but they've all been swallowed up into the bottomless pit of pain that is <a href=\"https://oauth.net/2/\">OAuth</a>. It's high time we resurrected some of the more nuanced work on fine-grained authentication that doesn't give access to absolutely everything and/or SMS you at 2am requesting a verification code.</li>\n<li>Rather than 'yes/no' decisions, we could also share different <em>views</em> of the data depending on who's asking. This used to be difficult due to the combinatorics involved, but you could imagine nowadays applying a local LLM to figure out the rich context. 
The <a href=\"https://github.com/google-deepmind/concordia\">DeepMind Concordia</a> project takes this idea even further with social simulations based on the same principles.</li>\n</ul>\n<p>When we look at the <a href=\"/notes/rs-future-of-publishing\">current state of the publishing industry</a>, it becomes clearer why a few publishers are hoovering up smaller journals. Maintaining the infrastructure for open/closed access is just a lot easier in a centralised setup than it is when distributed. However, it's crucial for expanding the ceiling on our collective knowledge that we support such federated access to semi-private data. Let's consider another sort of data to see why...</p>\n<h3 id=\"biodiversity-needs-spatial-permissioning\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#biodiversity-needs-spatial-permissioning\"></a>Biodiversity needs spatial permissioning</h3>\n<p>Zooming out to a global usecase, biodiversity data is a prime example of where\neverything can't be open. Economically motivated rational actors (i.e.\npoachers) are highly incentivised to use all available data to figure out where\nto snarf a rare species, and so some of this presence data is vital to keep\ntight control over. But the pendulum swings both ways, and without robust\npermissions mechanisms to share the data with well intentioned actors, we\ncannot make evidence-driven planning decisions for global topics such as <a href=\"/notes/exploring-food-impacts\">food consumption</a>.</p>\n<p>I asked <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a>, the Chief Scientist of <a href=\"https://www.unep-wcmc.org/en\">UNEP-WCMC</a>, and <a href=\"https://scholar.google.com/citations?user=fK2N8doAAAAJ&amp;hl=en\">Violeta Muñoz-Fuentes</a> about their views on how biodiversity data might make an impact if connected together. 
They gave me a remarkable list of databases they maintain (an excerpt reproduced with permission here):</p>\n<blockquote>\n<ul>\n<li><a href=\"https://www.protectedplanet.net\">Protected Planet</a>, 10s of thousands of users. The world's most trusted, up-to-date, and complete source of information on protected areas and other effective area-based conservation measures (OECMs). Includes effectiveness of protected and conserved area management; updated monthly with submissions from governments, NGOs, landowners, and communities.</li>\n<li><a href=\"https://unbiodiversitylab.org/en/\">UN Biodiversity Lab</a>, 1000s of governments, NGO users.  A geospatial platform with 400+ of the world’s best data layers on nature, climate change, and sustainable development. Supports country-led efforts for planning, monitoring, and reporting; linked to the Convention on Biological Diversity's global nature agreements.</li>\n<li><a href=\"https://tradeview.cites.org/\">CITES Wildlife Trade View</a>, 1000s of government and NGO users. Visualizes legal wildlife trade globally, by species and by country.</li>\n<li><a href=\"https://trade.cites.org/\">CITES Wildlife Trade Database</a>, 10000s of government, NGO, research users.  Contains records of all legal wildlife trade under CITES (&gt;40,900 species globally).</li>\n<li><a href=\"https://www.ibat-alliance.org/\">IBAT</a>, 1000s of businesses.  Spatial tool for businesses to calculate potential impacts on nature. IBAT is an alliance of BirdLife International, Conservation International, IUCN, and UNEP-WCMC.\n<cite>-- An excerpt of UNEP-WCMC tools and systems (N. Burgess, personal communication, 2025)</cite></li>\n</ul>\n</blockquote>\n<p>This is just a short excerpt from the list, and many of these involve illegal activities (tracking them, not doing them!). 
The value in connecting them together and making them safely accessible by both humans and AI agents would be transformative to the global effort to save species from extinction, for example by carefully picking and choosing what trade agreements are signed between countries. A real-time version could change the course of human history for pivotal global biodiversity conferences where negotiations decide the future of many.</p>\n<p>So I make a case that we <em>must</em> engineer robust permission protocols into the heart of how we share data, and not just for copyright and legal reasons. Some data must stay private for security, economic or geopolitical reasons, but that act of hiding knowledge currently makes it very difficult to take part in a collective knowledge network with our current training architectures. Perhaps <a href=\"https://en.wikipedia.org/wiki/Federated_learning\">federated learning</a> will be one breakthrough, but I'm betting on <a href=\"/papers/2024-hope-bastion\">agentic permissions</a> being where this goes instead.</p>\n<p><a href=\"https://unbiodiversitylab.org/en\"> <img src=\"/images/aria-ci-6.webp\" alt=\"%c\" title=\"The UN Biodiversity Lab has spatial biodiversity information for thousands of species across the world, collected by many, many people over the years. (source: Violeta Muñoz-Fuentes, UNEP-WCMC)\" > </a></p>\n<h2 id=\"p4-placement-data-has-weight-and-geopolitics-matters\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#p4-placement-data-has-weight-and-geopolitics-matters\"></a>P4: Placement (data has weight, and geopolitics matters)</h2>\n<p>The final P is one that we thought we wouldn't need to worry about thanks to\n<a href=\"/papers/xen02\">the cloud</a> back in the day: placement. A lot of the digital data\ninvolved in our lives is spatial in nature (e.g. our movement data), but also\nmust be <em>accessed</em> only from some locations. 
If we don't engineer in location as\na first-class element of how we treat collective knowledge, it'll never be a truly\nuseful knowledge companion to humans.</p>\n<h3 id=\"physical-location-matters-a-lot-for-knowledge-queries\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#physical-location-matters-a-lot-for-knowledge-queries\"></a>Physical location matters a lot for knowledge queries</h3>\n<p>We explained some spatial ideas in our recent <a href=\"/papers/2025-bifrost\">Bifrost</a> paper:</p>\n<blockquote>\n<p>Physical containment creates a natural network hierarchy, yet we do not currently take advantage of this. Even local interactions between devices often require traversal over a wide-area network (WAN), with consequences for privacy, robustness, and latency.</p>\n<p>Instead, devices in the same room should communicate directly, while physical barriers should require explicit networking gateways. We call this spatial networking: instead of overlaying virtual addresses over physical network connections, we use physical spaces to constrain virtual network addresses.</p>\n<p>This lets users point at two devices and address them by their physical relationship; devices are named by their location, policies are scoped\nby physical boundaries, and spaces naturally compose while maintaining local autonomy.\n<cite>-- <a href=\"/papers/2025-bifrost\">An Architecture for Spatial Networking</a>, Millar et al, 2025</cite></p>\n</blockquote>\n<p>Think about all the times in your life that you've wanted to pull up some specific knowledge about the region you're in, and how bad our digital systems currently are at dealing with fine-grained location. 
I go to the gym every Sunday morning like clockwork with <a href=\"https://dave.recoil.org\">Dave Scott</a>, and yet my workout app treats it like a whole new experience every single time I open it.</p>\n<p>Similarly, if you have a group of people in a meeting room, they should be able to use their physical proximity to take advantage of that inherent trust! For example, photos weren't allowed at the Collective Intelligence workshop due to the Chatham House rules, but it would have been <em>really</em> useful to be able to get a copy of other people's photos for me to have a personal record of all the amazing whiteboarding brainstorming that was going on.</p>\n<p>Protocol support for placement, combined with permissioning above, would allow us to build a personal knowledge network that actually fits into our lives based on where we physically are.</p>\n<h3 id=\"where-code-and-data-is-hosted-also-matters\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#where-code-and-data-is-hosted-also-matters\"></a>Where code and data is hosted also matters</h3>\n<p>When I started working on <a href=\"/projects/plancomp\">planetary computing</a>, the most obvious\nchange was just how vast the datasets involved are. We've just installed a\n<a href=\"https://www.tunbury.org/2025/10/18/quick-look-at-ceph/\">multi petabyte cluster</a> in the\nComputer Lab just to deal with the embeddings from <a href=\"/papers/2025-tessera\">TESSERA</a>,\nand syncing those takes weeks even on a gigabit link. This data can't just\ncasually move, which means that in turn we can't use many cloud-based services\nlike GitHub for our hosting. 
And this, in turn, disconnects us from all the\ncollective knowledge training happening there for their code foundation models.</p>\n<p>An alternative is to decouple the names of the code and data from where it's hosted.\nThis is a feature explicitly supported by the <a href=\"https://arxiv.org/abs/2402.03239\">AT Protocol</a> that underpins Bluesky.\nThere are <a href=\"/notes/atproto-for-fun-and-blogging\">dozens of alternative services</a> that are springing up\nthat can reuse the authentication infrastructure, but allow users to choose <em>where</em> their data is hosted.</p>\n<p>The one we are using most here in my group is <a href=\"https://tangled.org\">Tangled</a>, which is a code hosting service\nthat I've <a href=\"/notes/disentangling-git-with-bluesky\">described before</a> and <a href=\"https://tangled.org/anil.recoil.org\">use regularly</a>.\nWhat makes this service different from other code hosting is that while I can remain social and share, the\nactual code is stored on a &quot;knot&quot; that I host. I run several: one on my personal <a href=\"https://recoil.org\">recoil.org</a> domain,\nand another for my colleagues in the Cambridge Computer Lab. Code that sits there can also be run using <a href=\"/notes/tangled-and-ci\">local CI</a>\nwhich can access private data stored in our local network, by virtue of the fact that we run our own infrastructure.</p>\n<p>To wrap up the principle of placement, I've made a case for why explicit control over locations (of people, of code, of data, of predictions) matters a lot for collective intelligence, and should be factored into any system architecture for this. 
If you don't believe me, try asking your nearest LLM for what the best pub is near you, and watch it hallucinate and burn!</p>\n<p><img src=\"/images/okavango-2.webp\" alt=\"%c\" title=\"Offline access also still matters, for latency and battery life, or you're simply in the middle of nowhere\" ></p>\n<h2 id=\"next-directions\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#next-directions\"></a>Next Directions</h2>\n<p>I jotted down these four principles to help organise my thoughts, and they're by no means set in stone. I am reasonably convinced that the momentum building around <a href=\"https://atprotocol.dev/atmosphereconf/\">ATProto usage</a> worldwide makes it a compelling place to focus prototyping and research efforts on, and they are working on plugging gaps such as <a href=\"https://atproto.wiki/en/working-groups/private-data\">permission support</a> already. If you'd like to work on this or have pointers for me, please do let me know! I'll update this post as they come in.</p>\n<p><small class=\"credits\">(I'd like to thank many people for giving me input and ideas into this post, many of whom are cited above. In particular, Sadiq Jaffer, Shane Weisz, Michael Dales, Patrick Ferris, Cyrus Omar, Aadi Seth, Michael Coblenz, Jon Sterling, Nate Foster, Aurojit Panda, Ian Brown, Srinivasan Keshav, Jon Crowcroft, Ryan Gibb, Josh Millar, Hamed Haddadi, Sam Reynolds, Alec Christie, Bill Sutherland, Violeta Muñoz-Fuentes and Neil Burgess have all poured in relevant ideas, along with the wider ATProto and ActivityPub communities)</small></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>As I write this, the UK's budget announcement was released <a href=\"https://www.bbc.co.uk/news/live/cy8vz032qgpt\">an hour early</a> by the OBR, throwing markets into real-time uncertainty.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. 
Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Madhavapeddy (2025). Royal Society's Future of Scientific Publishing meeting. <a href=\"https://doi.org/10.59350/nmcab-py710\" target=\"_blank\"><i>10.59350/nmcab-py710</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). A fully AI-generated paper just passed peer review; notes from our evidence synthesis workshop. <a href=\"https://doi.org/10.59350/k540h-6h993\" target=\"_blank\"><i>10.59350/k540h-6h993</i></a></li>\n<li>Madhavapeddy (2025). Oh my Claude, we need agentic copilot sandboxing right now. <a href=\"https://doi.org/10.59350/aecmt-k3h39\" target=\"_blank\"><i>10.59350/aecmt-k3h39</i></a></li>\n<li>Sutherland et al (2026). Nine changes needed to deliver a radical transformation in biodiversity measurement. <a href=\"https://doi.org/10.1073/pnas.2519345123\" target=\"_blank\"><i>10.1073/pnas.2519345123</i></a></li>\n<li>Dales et al (2025). Yirgacheffe: A Declarative Approach to Geospatial Data. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763806\" target=\"_blank\"><i>10.1145/3759536.3763806</i></a></li>\n<li>Barham et al (2003). Xen 2002. <a href=\"https://doi.org/10.48456/tr-553\" target=\"_blank\"><i>10.48456/tr-553</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Jaffer et al (2025). 
AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Millar et al (2025). An Architecture for Spatial Networking. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2507.22687\" target=\"_blank\"><i>10.48550/arXiv.2507.22687</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Madhavapeddy (2025). The AIETF arrives, and not a moment too soon. <a href=\"https://doi.org/10.59350/agfta-8wk09\" target=\"_blank\"><i>10.59350/agfta-8wk09</i></a></li>\n<li>Madhavapeddy (2025). Arise Bushel, my sixth generation oxidised website. <a href=\"https://doi.org/10.59350/0r62w-c8g63\" target=\"_blank\"><i>10.59350/0r62w-c8g63</i></a></li>\n<li>Madhavapeddy (2025). Exploring the biodiversity impacts of what we choose to eat. <a href=\"https://doi.org/10.59350/xj427-y3q48\" target=\"_blank\"><i>10.59350/xj427-y3q48</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. <a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li>\n<li>Madhavapeddy (2025). mlgpx is the first Tangled-hosted package available on opam. <a href=\"https://doi.org/10.59350/7267y-nj702\" target=\"_blank\"><i>10.59350/7267y-nj702</i></a></li>\n<li>Madhavapeddy (2025). Using AT Proto for more than just Bluesky posts. 
<a href=\"https://doi.org/10.59350/32rdt-zny05\" target=\"_blank\"><i>10.59350/32rdt-zny05</i></a></li>\n<li>Fenner (2025). Rogue Scholar is becoming a German Non-Profit Organization. Front Matter. <a href=\"https://doi.org/10.53731/rftfk-qv692\" target=\"_blank\"><i>10.53731/rftfk-qv692</i></a></li>\n<li>Fenner (2025). Rogue Scholar starts supporting versioning. Front Matter. <a href=\"https://doi.org/10.53731/nxp08-a9947\" target=\"_blank\"><i>10.53731/nxp08-a9947</i></a></li>\n<li>Gibb (2025). Eilean. Front Matter. <a href=\"https://doi.org/10.59350/s621r-eg143\" target=\"_blank\"><i>10.59350/s621r-eg143</i></a></li>\n<li>Shumailov et al (2024). AI models collapse when trained on recursively generated data. Nature. <a href=\"https://doi.org/10.1038/s41586-024-07566-y\" target=\"_blank\"><i>10.1038/s41586-024-07566-y</i></a></li>\n<li>Laud et al (2025). STACD: STAC Extension with DAGs for Geospatial Data and Algorithm Management. <a href=\"https://doi.org/10.1145/3759536.3763803\" target=\"_blank\"><i>10.1145/3759536.3763803</i></a></li>\n<li>Kleppmann et al (2024). Bluesky and the AT Protocol: Usable Decentralized Social Media. Proceedings of the ACM Conext-2024 Workshop on the Decentralization of the Internet. <a href=\"https://doi.org/10.1145/3694809.3700740\" target=\"_blank\"><i>10.1145/3694809.3700740</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/principles-for-collective-knowledge",
      "title": "Four Ps for Building Massive Collective Knowledge Systems",
      "summary": "Design principles for collective knowledge systems—permanence, provenance, permission, and placement—that enable robust networks for evidence-based decision making.",
      "date_published": "2025-11-23T00:00:00.000000Z",
      "date_modified": "2025-11-23T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "policy",
        "spatial",
        "networking",
        "biodiversity"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3744169.3744180",
          "doi": "10.1145/3744169.3744180",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nmcab-py710",
          "doi": "10.59350/nmcab-py710",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/k540h-6h993",
          "doi": "10.59350/k540h-6h993",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/aecmt-k3h39",
          "doi": "10.59350/aecmt-k3h39",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1073/pnas.2519345123",
          "doi": "10.1073/pnas.2519345123",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763806",
          "doi": "10.1145/3759536.3763806",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48456/tr-553",
          "doi": "10.48456/tr-553",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2507.22687",
          "doi": "10.48550/arXiv.2507.22687",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/agfta-8wk09",
          "doi": "10.59350/agfta-8wk09",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0r62w-c8g63",
          "doi": "10.59350/0r62w-c8g63",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/xj427-y3q48",
          "doi": "10.59350/xj427-y3q48",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/7hy6m-1rq76",
          "doi": "10.59350/7hy6m-1rq76",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/7267y-nj702",
          "doi": "10.59350/7267y-nj702",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/32rdt-zny05",
          "doi": "10.59350/32rdt-zny05",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.53731/rftfk-qv692",
          "doi": "10.53731/rftfk-qv692",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.53731/nxp08-a9947",
          "doi": "10.53731/nxp08-a9947",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.59350/s621r-eg143",
          "doi": "10.59350/s621r-eg143",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41586-024-07566-y",
          "doi": "10.1038/s41586-024-07566-y",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763803",
          "doi": "10.1145/3759536.3763803",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3694809.3700740",
          "doi": "10.1145/3694809.3700740",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/nagwp-tnw89",
      "content_html": "<p>I've just released <a href=\"https://github.com/ucam-eo/geotessera/releases\">geotessera 0.7</a> to <a href=\"https://pypi.org/project/geotessera\">pypi</a> for our <a href=\"/papers/2025-tessera\">TESSERA</a>\ngeospatial foundation model, following on from the <a href=\"/notes/geotessera-python\">first release</a> earlier this year. To recap:</p>\n<blockquote>\n<p>TESSERA is a foundation model for Earth observation that processes Sentinel-1\nand Sentinel-2 satellite data to generate representation (embedding) maps. It\ncompresses a full year of Sentinel-1 and Sentinel-2 data and learns useful\ntemporal-spectral features.\n<cite>-- <a href=\"https://github.com/ucam-eo/tessera\">Temporal Embeddings of Surface Spectra for Earth Representation and Analysis</a></cite></p>\n</blockquote>\n<p>With this new release, there's convenient <a href=\"http://geotessera.readthedocs.io\">documentation</a> to show how you can freely access 150TB+\nof CC-BY-licensed embeddings of the earth's surface. We've been getting a growing <a href=\"https://github.com/ucam-eo/geotessera/issues?q=is%3Aissue%20label%3Aembedding-request\">influx of requests</a> for diverse regions of the world, and so our focus for the next few\nmonths is attaining complete coverage of our v1 model on the whole planet.</p>\n<h2 id=\"generating-and-storing-the-worlds-embeddings\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#generating-and-storing-the-worlds-embeddings\"></a>Generating and storing the world's embeddings</h2>\n<p>A new <a href=\"https://github.com/ucam-eo/tessera-coverage-map/blob/main/tessera_coverage.png\">coverage map visualiser</a> tracks our progress towards complete global embeddings;\nto complete the set, we have a giant inferencing task ahead of us. 
AMD kindly provided us access to an <a href=\"https://www.amd.com/en/corporate/university-program/ai-hpc-cluster.html\">MI300X cluster</a> for us to train the TESSERA models on, and now we are using our in-house <a href=\"/videos/48a7ab10-3f49-4978-a00f-c26b64c2cae7\">Dawn cluster</a> along with machines donated by <a href=\"https://tarides.com\">Tarides</a> and <a href=\"https://www.tunbury.org/2025/03/27/dell-poweredge-r640/\">Jane Street</a> to run every tile on earth through our pipeline. The\noutput of this is a very large set of 128 dimensional vectors in <a href=\"https://www.nature.com/articles/s41586-020-2649-2\">numpy</a> format,\nwhich we then spatially sort and make available for download. You can <a href=\"https://dl2.geotessera.org\">browse the tiles</a> on our HTTP server, where you can see <a href=\"https://dl2.geotessera.org/v1/global_0.1_degree_representation/2024/grid_-0.05_12.75/\">one such tile</a> and associated<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> checksums.</p>\n<p><a href=\"https://ucam-eo.github.io/tessera-coverage-map/\"> <img src=\"/images/geotessera-coverage-ui.webp\" alt=\"%c\" title=\"The hourly refresh of the progress of our global embeddings for 2017-2024\" > </a></p>\n<p>All pretty simple stuff -- it's just some floating point numbers! -- except for the sheer number of them. Each tile of vectors is around 160MB in size, and so we need around 250TB of storage per year, and we want them from 2017 through to 2024. And by the time we're done inferring all these, 2026 will be upon us and we'll need to generate <em>another</em> 250TBs in short order for the 2025 seasons. 
Ultimately, to keep up with this we need to build a scalable, open and <a href=\"/papers/2025-fairground\">federated</a> pipeline to handle this, which the 0.7 geotessera release lays the groundwork for.</p>\n<p>Earlier releases of geotessera used the <a href=\"https://pypi.org/project/pooch/\">Pooch</a> library as a registry for all the embeddings. Pooch is a very convenient Python library that handles the details of fetching, checksumming and caching, but is really designed for grabbing a few large datasets, for example from Zenodo. The <a href=\"https://github.com/ucam-eo/tessera-manifests\">Tessera manifests repository for Pooch</a> now has over a million files and takes minutes to initialise the data structures, which in turn makes using <a href=\"https://github.com/ucam-eo/tessera-interactive-map\">GeoTessera interactively</a> slow.</p>\n<p>One lesson <a href=\"https://toao.com\">Sadiq Jaffer</a> and I learnt from building our <a href=\"/papers/2025-evidence-tap\">giant literature database</a> is how good the <a href=\"https://parquet.apache.org/\">Parquet</a> file format is. So in this 0.7 release, we switch from a text format to track all the tiles to a <a href=\"https://dl2.geotessera.org/v1/\">couple of GeoParquet databases</a>. To help visualize this, I've created an <a href=\"https://huggingface.co/datasets/avsm/geotessera/viewer/registry?views%5B%5D=registry\">avsm/geotessera HuggingFace</a> dataset where you can browse the format, as well as an <a href=\"https://ucam-eo.github.io/tessera-coverage-map/\">interactive coverage map</a>.</p>\n<p>Now when you initialise GeoTessera 0.7+, it will download the two parquet\nfiles instead of using the old git registry. The landmasks parquet rarely changes (it just provides the mapping\nbetween ocean/land tiles and the relevant UTM projection). 
The registry\ncontains the tiles for all years, and weighs in at around 100MB compressed so far.\nIn future releases, I'll probably split them out to be per-year, which will\nmake them smaller again, but they're all in one place for convenience right now\nas the size of the registry download is about the same as a single embedding\ntile, so it's not hugely significant to optimise at this stage.</p>\n<h3 id=\"towards-zarr-format-instead-of-numpy\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#towards-zarr-format-instead-of-numpy\"></a>Towards Zarr format instead of Numpy?</h3>\n<p>Hot on the heels of the 0.7.0 release, <a href=\"https://www.syke.fi/fi/asiantuntijat/janne-mayra\">Janne Mäyrä</a> submitted a PR to <a href=\"https://github.com/ucam-eo/geotessera/pull/94\">add support</a> for a\nfile format called <a href=\"https://zarr.dev/\">Zarr</a>, which is a really optimised way to store tensors.</p>\n<blockquote>\n<p>Zarr is a community project to develop specifications and software for\nstorage of large N-dimensional typed arrays, also commonly known as tensors.\nA particular focus of Zarr is to provide support for storage using\ndistributed systems like cloud object stores, and to enable efficient I/O for\nparallel computing applications.</p>\n<p>Zarr is motivated by the need for a simple, transparent, open, and\ncommunity-driven format that supports high-throughput distributed I/O on\ndifferent storage systems. 
Zarr data can be stored in any storage system that\ncan be represented as a key-value store, including most commonly POSIX file\nsystems and cloud object storage but also zip files as well as relational and\ndocument databases.</p>\n<p><cite>-- <a href=\"https://zarr.dev/\">Zarr Homepage</a>, 2025</cite></p>\n</blockquote>\n<p>This is exactly what we need given the giant storage requirements above, so\nI've fixed up and merged this feature and included it into a <a href=\"https://github.com/ucam-eo/geotessera/releases/tag/v0.7.1\">0.7.1 point\nrelease</a>. I've not had much chance to actually play with the format yet, but\ngetting it into a release is the best place to start. I got one positive\nmessage from <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> that he's been using Zarr already in his <a href=\"https://ancazugo.github.io/posts/2025-09-14-weekly-notes.html\">Local Climate Zone</a>\nexperiments.</p>\n<h2 id=\"more-convenient-apis-for-sampling\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#more-convenient-apis-for-sampling\"></a>More convenient APIs for sampling</h2>\n<p>The other feature that's gone into the new library is a set of higher-level APIs to use\nthe embeddings. A very common use case when building downstream tasks is having\nto sample embeddings for a set of labels, which are then used to train\nclassifiers. It's quite cumbersome to select these out by hand, particularly with\nlarger regions of interest.</p>\n<p>There is now a new <code>sample_embeddings_at_points</code> library call\nthat extracts embedding values at arbitrary lon/lat coordinates and\ngroups points by tile for efficient batch processing. 
You can see this in action\nin the new <a href=\"https://github.com/ucam-eo/geotessera-examples\">geotessera-examples</a>\nrepository where we're starting to put sample code for downstream tasks that\nuse geotessera.</p>\n<p>The first one here is the code to detect solar panels worldwide, as demoed by\n<a href=\"https://toao.com\">Sadiq Jaffer</a> in his recent <a href=\"/notes/icfp25-propl\">PROPL 25 talk</a>.  Browse through the\n<a href=\"https://github.com/ucam-eo/geotessera-examples/tree/main/solarpanel\">source code</a> and\ngive it a spin. The results can be visualised using QGIS, and I'm working on a notebook\ninterface for this later.</p>\n<p><a href=\"/images/tessera-f1.webp\"> <img src=\"/images/tessera-f1.webp\" alt=\"%c\" title=\"Parametric UMAP false colour visualisation of TESSERA embeddings for Cambridgeshire\" > </a></p>\n<p>The second one is the <a href=\"https://github.com/ucam-eo/geotessera-examples/tree/main/pumap-viz\">parametric UMAP script</a> to do false colour visualizations of\nany ROI. You can see a high-res &quot;arty version&quot; that has been contour traced using a pomap algorithm which <a href=\"http://www.cl.cam.ac.uk/~avsm2/cb2-pumap.svg\">renders as a high-res SVG</a> as well.</p>\n<h2 id=\"keep-the-embeddings-requests-coming\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#keep-the-embeddings-requests-coming\"></a>Keep the embeddings requests coming!</h2>\n<p>That's it for now with GeoTessera updates. Enjoy the new releases, and if you have any requests for embeddings please get them in <a href=\"https://github.com/ucam-eo/geotessera/issues\">our queue</a>.</p>\n<p><a href=\"https://www.tunbury.org/\">Mark Elvers</a> is currently syncing our embeddings to <a href=\"https://scaleway.com\">Scaleway</a> via a <a href=\"https://www.tunbury.org/2025/11/03/cepfs-partition-setup/\">Ceph cluster we set up</a> for this purpose, and I'll be announcing even more federation options for TESSERA over the coming weeks. 
We're <em>really</em> grateful to the outpouring of offers of help with our computational needs from our users and cloudy friends! And we also know that TESSERA doesn't have a proper homepage yet; we'll work on this right after the immediate embeddings bottleneck is handled for our current users (contributions from interested web designers are very welcome here).</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>There are two numpy files in there because we store the quantized embeddings to save space. To dequantize them, the scales are multiplied to each of the bands in the main file. Geotessera takes <a href=\"https://geotessera.readthedocs.io/en/latest/architecture.html#quantization-system\">care of this</a> for you.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Omar et al (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763802\" target=\"_blank\"><i>10.1145/3759536.3763802</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. <a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li>\n<li>Harris et al (2020). Array programming with NumPy. Nature. 
<a href=\"https://doi.org/10.1038/s41586-020-2649-2\" target=\"_blank\"><i>10.1038/s41586-020-2649-2</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/geotessera-python-0-7",
      "title": "GeoTessera 0.7 out with efficient sampling and Zarr support",
      "summary": "GeoTessera 0.7 switches to GeoParquet manifests for faster initialisation, adds Zarr tensor storage support, and provides new sampling APIs for building downstream tasks like solar panel detection.",
      "date_published": "2025-11-17T00:00:00.000000Z",
      "date_modified": "2025-11-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "spatial",
        "ai",
        "satellite"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763802",
          "doi": "10.1145/3759536.3763802",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/7hy6m-1rq76",
          "doi": "10.59350/7hy6m-1rq76",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41586-020-2649-2",
          "doi": "10.1038/s41586-020-2649-2",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/x6rea-1g262",
      "content_html": "<p>There's a buzz forming around the upcoming <a href=\"https://impact.indiaai.gov.in/\">AI Impact\nSummit</a> next year in India, following up the\n<a href=\"/notes/uk-national-data-lib\">AI Safety Summit</a> here and the <a href=\"https://en.wikipedia.org/wiki/AI_Action_Summit\">France Action\nSummit</a> earlier this year. I\nheaded down to a couple of events in London this week to help set the agenda,\nparticularly around the importance of <a href=\"/papers/2025-fairground\">FAIR</a> and ethical <a href=\"/papers/2024-ai-conhorizon\">AI for sustainability</a> being on the political agenda.</p>\n<h2 id=\"meeting-the-alan-turing-institute-foreign-office-and-the-high-commission-of-india\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#meeting-the-alan-turing-institute-foreign-office-and-the-high-commission-of-india\"></a>Meeting the Alan Turing Institute, Foreign Office and the High Commission of India</h2>\n<p>The <a href=\"https://www.turing.ac.uk/\">ATI</a> <a href=\"https://cetas.turing.ac.uk/\">CETAS</a>, the <a href=\"https://www.gov.uk/government/organisations/foreign-commonwealth-development-office\">FCDO</a>, and the <a href=\"https://www.hcilondon.gov.in/\">High Commission of India</a> hosted a day long session for the\nIndian and UK contingents to get to know each other and to find opportunities for\ncollaboration. 
While the details of the day are under <a href=\"https://www.chathamhouse.org/about-us/chatham-house-rule\">Chatham House rules</a>, I learnt a\nlot from the discussions that is already public.\nThe opening came from <a href=\"https://indiaai.gov.in/people/abhishek-singh\">Abhishek Singh</a>, the CEO of the <a href=\"https://indiaai.gov.in/\">India AI Mission</a> that got over a <a href=\"https://www.pib.gov.in/PressReleasePage.aspx?PRID=2012375\">billion dollars</a> in funding last year to drive equitable AI deployment there:</p>\n<blockquote>\n<p>The IndiaAI Mission aims to build a comprehensive ecosystem that fosters AI innovation by democratizing computing access, enhancing data quality, developing indigenous AI capabilities, attracting top AI talent, enabling industry collaboration, providing startup risk capital, ensuring socially impactful AI projects, and promoting ethical AI. This mission drives responsible and inclusive growth of India's AI ecosystem [...]\n<cite>-- <a href=\"https://indiaai.gov.in/\">India AI Mission</a>, 2024</cite></p>\n</blockquote>\n<p>Abhishek made a compelling case about why India is so differentiated in this AI race.\nFirstly, it has a huge diversity of population, languages and climates. It's quite common to have voice-only interfaces, such as those by <a href=\"https://gramvaani.org/voice-technology-innovations-by-gram-vaani/\">Gram Vaani</a> that I learnt about at <a href=\"/notes/compass2024-ric-tripreport\">COMPASS 2024</a> in multiple Indian languages. India has been investing heavily in making sure that local languages are supported via the <a href=\"https://bhashini.gov.in\">Bhashini</a> initiative, which is an example of the central government funding efforts for the public good. 
<a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> pointed me to the &quot;<a href=\"https://bhashinimigrationns.sosnm1.shakticloud.ai:9024/bhashinistaticassets/bhashini-assets/website/Field%20Guide%20-%201st%20Edition%20%283%29.pdf\">Field Guide for Inclusive Language</a>&quot; in India, which has a profile on how Gram Vaani and <a href=\"https://doi.org/10.1145/3700794.3700816\">Gram Sabhas</a> are being used in rural India to bridge the digital divides.</p>\n<p><img src=\"/images/ukindia-dialogue-2.webp\" alt=\"%c\" title=\"The high-stakes world of international diplomacy. I had a name card and everything!\" ></p>\n<p>Their recent <a href=\"https://static.pib.gov.in/WriteReadData/specificdocs/documents/2025/nov/doc2025115685601.pdf\">AI Governance Guidelines</a> also seemed reasonable to me. They identified risks in <a href=\"/notes/ai-poisoning\">malicious use</a>, bias and discrimination, transparency failures, systemic risks, loss of control, and national security challenges, but didn't overspecify solutions to these. The report was chaired by IIT-Madras, where I've been collaborating with <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> ever since he left Cambridge, so there's a nice connection there too!</p>\n<p>After the round table, we broke out into detailed planning sessions. I joined the climate one, where <a href=\"https://eviatarbach.com\">Eviatar Bach</a> gave me a splendid ad-hoc tutorial on the difficulty of subseasonal weather forecasting and pointed me to their notes on <a href=\"https://arxiv.org/abs/2410.10523\">inverse problems</a>. 
I delivered an explanation of <a href=\"/papers/2025-tessera\">TESSERA</a> and how terrestrial forecasting is important for areas such as farming, and we also learnt from <a href=\"https://sheffield.ac.uk/cs/people/academic/po-yang\">Po Yang</a> and others present about the latest in agricultural robotics.</p>\n<p>Of particular note is the large amount of work being done by <a href=\"https://www.wadhwaniai.org/publications/\">Wadhwani AI</a> towards &quot;real world&quot; AI deployment in India; for example, doing <a href=\"https://arxiv.org/abs/2402.00015\">uncertainty aware inferencing</a> to build trust with thousands of cotton farmers across India. I'm very much looking forward to building up more concrete collaborations here, in addition to the work we're starting with the <a href=\"https://core-stack.org/\">CoRE Stack</a> following <a href=\"/notes/icfp25-propl\">PROPL25</a>.</p>\n<h2 id=\"the-openuk-ai-impact-dinner\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-openuk-ai-impact-dinner\"></a>The OpenUK AI Impact Dinner</h2>\n<p>Without pause, the next day I went back to London to have a roundtable dinner\norganised by the wonderful <a href=\"https://amandabrock.com/\">Amanda Brock</a> from <a href=\"https://openuk.uk/\">OpenUK</a> for\nfurther discussions.  This time, the focus was on the critical role of open\nsource as a medium for exchange between the UK and India.\nThere were some wonderful discussions here again, ranging from open agentic infrastructure, to building open communities in the vibe coding world, to new communities around crowdsourcing nature data!</p>\n<p><img src=\"/images/ukindia-dialogue-8.webp\" alt=\"%c\" title=\"The unruly crowd of open sourcers, journalists and policymakers were FUN! 
(photo credit: OpenUK)\" ></p>\n<h3 id=\"going-the-full-zfs-circle-with-luke-marsden\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#going-the-full-zfs-circle-with-luke-marsden\"></a>Going the full ZFS circle with Luke Marsden</h3>\n<p>I saw Luke Marsden again after a decade <a href=\"/videos/725dda70-b12b-4b1a-a8ae-fa9c22683ff2\">following DockerCon 2016</a>,\nand learnt he is now working on a cool local agentic inference stack\ncalled <a href=\"https://helix.ml/\">helix.ml</a>, which sounded like exactly what we\nneed with all the complex agentic coding happening for building our <a href=\"/projects/ce\">collective knowledge database</a>.</p>\n<p><img src=\"/images/ukindia-dialogue-5.webp\" alt=\"%rc\" ></p>\n<p>Ironically, Luke used to work on a brilliant <a href=\"https://www.infoq.com/interviews/Luke-Marsden-ZFS-Docker/\">ZFS appliance for Docker</a> with his old\n<a href=\"https://github.com/clusterhq\">ClusterHQ</a> startup when I saw him last. Nowadays, I find my\nown group <a href=\"/notes/syncoid-sanoid-zfs\">hacking on</a> the same sort of thing. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>\nhas been using ZFS in his <a href=\"https://patrick.sirref.org/shelter/index.xml\">Shelter</a> shell,\nand <a href=\"https://www.tunbury.org/\">Mark Elvers</a> has been building a giant petascale <a href=\"https://www.tunbury.org/2025/11/03/cepfs-partition-setup/\">storage cluster</a>\nfor our <a href=\"/notes/geotessera-python\">Tessera embeddings</a>. 
I love how open source goes full circle all the time!</p>\n<h3 id=\"building-open-community-across-countries-and-agents\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#building-open-community-across-countries-and-agents\"></a>Building open community across countries and agents</h3>\n<p><a href=\"https://en.wikipedia.org/wiki/Don_Syme\">Don Syme</a> not only leapt to the defence of higher order languages<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup>, but more importantly\ngave a thoughtful note on the difficulty of building community when\neveryone is AI coding.\nCommunity in open-source isn't just code; it's about\n&quot;soaking in&quot; the whole lifecycle of crafting something together. We're in\nreal danger of losing the incentive to communicate and socialise if we're all vibe coding at\nhigh speed, and that in turn will signal a real problem in how we mentor and build distributed\ncommunities across geographies.</p>\n<p>Coincidentally, there's been a lot of online discussion about this same topic in the past few\ndays. <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> reflected on <a href=\"https://kcsrk.info/ocaml/2025/11/10/hacking/\">foundations for hacking on OCaml</a>, and\n<a href=\"https://patrick.sirref.org\">Patrick Ferris</a> on his <a href=\"https://patrick.sirref.org/thoughts-on-foundations-for-hacking-on-ocaml/index.xml\">experiences picking up systems research</a>.\nMeanwhile, a keen first-time contributor to OCaml decided that <a href=\"https://joel.id/artisanal-coding-is-dead-long-live-artisanal-coding/\">artisanal coding is the future</a> and caused a <a href=\"https://github.com/ocaml/ocaml/pull/14353#issuecomment-3523181904\">ruckus</a> among the maintainers. It's so easy to contribute PRs now -- which is great! 
-- but there's also no empathy for the amount of work it'll cause maintainers who have to deal with this <a href=\"https://discuss.ocaml.org/t/artisanal-coding-is-dead-long-live-artisanal-coding/17487\">flood</a> of long-term technical debt.\nMy gut feeling is that more in-person events (such as attending the AI summit and running hackathons) are the only way forward in the short-term, to recruit contributors who care about the long-term as well.</p>\n<h3 id=\"no-event-is-complete-without-the-life-of-hedgehogs\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#no-event-is-complete-without-the-life-of-hedgehogs\"></a>No event is complete without the LIFE of hedgehogs</h3>\n<p>I spoke to the assembled group about our work on <a href=\"/papers/2024-life\">LIFE</a> and <a href=\"/papers/2025-tessera\">TESSERA</a>, and\nwas bowled over by the desire to engage on the topic from many of those\npresent. I particularly enjoyed sitting beside BBC tech journalist <a href=\"https://zoekleinman.com/\">Zoe Kleinman</a>\nwho I ended up explaining my <a href=\"/ideas/hedgehog-mapping\">obsession with tracking hedgehogs</a> to.</p>\n<p>It does feel to me like this hedgehog tracking project has really resonated with\npeople. <a href=\"/projects/rsn\">Satellite mapping</a> can be a little abstract as a concept, but <a href=\"https://toao.com/blog/can-we-really-see-brambles-from-space\">spotting hedgehogs from space</a> is an\nimmediate draw: people are interested in immediate use cases where AI can help figure out the state of nature in their own backyards.</p>\n<p><img src=\"/images/ukindia-dialogue-9.webp\" alt=\"%c\" title=\"I think I was trying to demonstrate just how chonky hedgehogs are. 
(photo credit: OpenUK)\" ></p>\n<p>Aside from this, some random enjoyable tidbits of the evening:</p>\n<ul>\n<li><a href=\"https://en.wikipedia.org/wiki/Chi_Onwurah\">Dame Chi Onwurah</a> is an MP who went to Imperial a few years before me and is a proper hacker with an engineering degree! Just like our own <a href=\"https://www.jesus.cam.ac.uk/people/julian-huppert\">Julian Huppert</a>, it's encouraging to see senior politicians who are deeply technical engaging on policymaking at the top echelons of government. She's also written great pieces about how the <a href=\"https://ezp.lib.cam.ac.uk/login?url=https://www.proquest.com/scholarly-journals/taking-control-future-innovation-skills-green/docview/2562271309/se-2?\">industrial past and a green future are not incompatible</a> and supported <a href=\"https://doi.org/10.1093/astrogeo/aty144\">minorities in STEM</a> for years.</li>\n<li>I met <a href=\"https://www.linkedin.com/in/joe-fay/\">Joe Fay</a> for the first time, and discovered that he's the <a href=\"https://www.siliconvalleywatcher.com/joe-fay-takes-control-of-the-register-as-the-uk-it-news-site-seeks-to-increase-us-readership/\">former editor of the Register</a> who covered lots of our <a href=\"https://www.theregister.com/2016/01/13/docker_job_figures/\">early</a> and <a href=\"https://www.theregister.com/2025/10/30/docker_compose_desktop_flaws/\">current</a> days of <a href=\"/papers/2025-docker-icfp\">Docker</a>. A thoroughly lovely chap with lots of insights into the long arc of technology.</li>\n<li><a href=\"https://www.bristol.ac.uk/people/person/Dimitra-Simeonidou-07b8a105-17ca-485c-b146-66b681606a3d/\">Dimitra Simeonidou</a> from Bristol told me about their <a href=\"https://joiner.org.uk/\">JOINER</a> dark fiber network across the UK, which might actually help us shift some of these petabytes of planetary embeddings around more easily across universities. 
Very cool to see cutting edge telecoms research dropping around the UK; I don't know much about the Cambridge end of this as it's run from Engineering, but I did find the <a href=\"https://ieeexplore.ieee.org/abstract/document/11115941\">JOINER-NSF paper</a> to be helpful background.</li>\n<li><a href=\"https://openuk.uk/profiles/dr-jennifer-barth/\">Jennifer Barth</a>, the Chief Research Officer for OpenUK, turned out to have done her <a href=\"https://ora.ox.ac.uk/objects/uuid%3Aa6ab3dee-619b-450d-9942-f4aa39a988af\">DPhil on the ethnographics of coffee</a>. So I had an unexpected chance to discuss open-source approaches to <a href=\"/notes/exploring-food-impacts\">exploring food commodity impacts</a> on the planet with an expert on the topic!</li>\n</ul>\n<p><img src=\"/images/ukindia-dialogue-6.webp\" alt=\"%c\" title=\"The OpenUK events have decidedly better socials than the average event!\" ></p>\n<p><img src=\"/images/ukindia-dialogue-3.webp\" alt=\"%c\" title=\"There is a giant Rich Turner explaining weather forecasting at the front entrance to the ATI. Yay Rich!\" ></p>\n<p>Follow the discussion about this on <a href=\"https://www.linkedin.com/posts/openuktechnology_indiaaiimpactsummit2025-opensource-opensourceai-activity-7394365694076923905-2qDl\">LinkedIn</a> as well. I look forward to more followups with the ATI, OpenUK and our colleagues in India!</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>A story to tell another day, since this meeting was also under Chatham House rules.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. 
<a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Madhavapeddy (2025). Semi distributed filesystems with ZFS and Sanoid. <a href=\"https://doi.org/10.59350/zy5bb-3ze20\" target=\"_blank\"><i>10.59350/zy5bb-3ze20</i></a></li>\n<li>Madhavapeddy (2025). Is AI poisoning the scientific literature? Our comment in Nature. <a href=\"https://doi.org/10.59350/pbxew-d2j78\" target=\"_blank\"><i>10.59350/pbxew-d2j78</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Omar et al (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763802\" target=\"_blank\"><i>10.1145/3759536.3763802</i></a></li>\n<li>Reynolds et al (2024). The potential for AI to revolutionize conservation: a horizon scan. <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\" target=\"_blank\"><i>10.1016/j.tree.2024.11.013</i></a></li>\n<li>Madhavapeddy (2025). Exploring the biodiversity impacts of what we choose to eat. <a href=\"https://doi.org/10.59350/xj427-y3q48\" target=\"_blank\"><i>10.59350/xj427-y3q48</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. <a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li>\n<li>Madhavapeddy (2024). COMPASS 2024 report on the CoRE stack RIC meeting. 
<a href=\"https://doi.org/10.59350/p7kck-5bt81\" target=\"_blank\"><i>10.59350/p7kck-5bt81</i></a></li>\n<li>Mehta et al (2025). Initial Observations from Field Testing of a Digital Participatory Tool to Improve Water Security in Rural India. Proceedings of the 13th International Conference on Information \\& Communication Technologies and Development. <a href=\"https://doi.org/10.1145/3700794.3700816\" target=\"_blank\"><i>10.1145/3700794.3700816</i></a></li>\n<li>Bowler (2018). NAM 2018: a large and lively meeting. Astronomy & Geophysics. <a href=\"https://doi.org/10.1093/astrogeo/aty144\" target=\"_blank\"><i>10.1093/astrogeo/aty144</i></a></li>\n<li>Bach et al (2025). Machine Learning for Inverse Problems and Data Assimilation. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2410.10523\" target=\"_blank\"><i>10.48550/arXiv.2410.10523</i></a></li>\n<li>Agrawal et al (2024). Maintaining User Trust Through Multistage Uncertainty Aware Inference. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2402.00015\" target=\"_blank\"><i>10.48550/arXiv.2402.00015</i></a></li>\n<li>Saunders et al (2025). JOINER-NSF: UK’s national facility for spectrum access innovation. 2025 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN). <a href=\"https://doi.org/10.1109/DySPAN64764.2025.11115941\" target=\"_blank\"><i>10.1109/DySPAN64764.2025.11115941</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/path-to-uk-india-ai-summit",
      "title": "On the path to the UK/India AI Summit with OpenUK and the ATI",
      "summary": "Reflections on UK-India AI collaboration from meetings at the Alan Turing Institute and OpenUK, discussing ethical AI deployment, open source infrastructure, and the challenges of building community in the age of AI-assisted coding.",
      "image": "https://anil.recoil.org/images/ukindia-dialogue-8.1280.webp",
      "date_published": "2025-11-11T00:00:00.000000Z",
      "date_modified": "2025-11-11T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "uk",
        "india",
        "opensource",
        "policy",
        "zfs"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/zy5bb-3ze20",
          "doi": "10.59350/zy5bb-3ze20",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/pbxew-d2j78",
          "doi": "10.59350/pbxew-d2j78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763802",
          "doi": "10.1145/3759536.3763802",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1016/j.tree.2024.11.013",
          "doi": "10.1016/j.tree.2024.11.013",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/xj427-y3q48",
          "doi": "10.59350/xj427-y3q48",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/7hy6m-1rq76",
          "doi": "10.59350/7hy6m-1rq76",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/p7kck-5bt81",
          "doi": "10.59350/p7kck-5bt81",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3700794.3700816",
          "doi": "10.1145/3700794.3700816",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1093/astrogeo/aty144",
          "doi": "10.1093/astrogeo/aty144",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2410.10523",
          "doi": "10.48550/arXiv.2410.10523",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2402.00015",
          "doi": "10.48550/arXiv.2402.00015",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1109/DySPAN64764.2025.11115941",
          "doi": "10.1109/DySPAN64764.2025.11115941",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/food-and-risk-to-life",
      "content_html": "<p>Jacqueline Wilson wrote a <a href=\"https://www.cam.ac.uk/stories/food-LIFE-and-global-species-extinction-risk\">brilliant article</a> about our recent <a href=\"/papers/2024-food-life\">paper</a> on the <a href=\"/notes/exploring-food-impacts\">biodiversity impacts of food consumption</a> up on the main Cambridge website, complete with interviews with <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> and <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a>.</p>\n<blockquote>\n<p>How does your dinner affect the risk of 30,875 species of land-dwelling animal going extinct?</p>\n<p>Dr Thomas Ball can tell you. Depending on what you’re eating he can calculate\nthe likelihood of the global demise of every mammal, bird, amphibian and\nreptile over the next 100 years. He’ll tell you that not all dinners are\nequal.\n<cite>-- <a href=\"https://www.cam.ac.uk/stories/food-LIFE-and-global-species-extinction-risk\">cam.ac.uk</a>, 2025</cite></p>\n</blockquote>\n<p>If you'd like to explore this dataset in more detail <a href=\"https://www.cam.ac.uk/stories/food-LIFE-and-global-species-extinction-risk\">after reading the article</a>, check out the interactive explorer below:</p>\n<p><a href=\"https://quantifyearth.github.io/food-globe/\"> <img src=\"/images/food-life-globe-1.webp\" alt=\"%c\" title=\"Explore food trade impacts on every country interactively\" > </a></p><h1>References</h1><ul><li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Madhavapeddy (2025). Exploring the biodiversity impacts of what we choose to eat. <a href=\"https://doi.org/10.59350/xj427-y3q48\" target=\"_blank\"><i>10.59350/xj427-y3q48</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/food-and-risk-to-life",
      "external_url": "https://www.cam.ac.uk/stories/food-LIFE-and-global-species-extinction-risk",
      "title": "Food and the long term risk to life",
      "summary": "A Cambridge article explores our research on how food consumption affects the extinction risk of 30,875 land-dwelling animal species, with an interactive tool to examine biodiversity impacts across different countries and diets.",
      "image": "https://anil.recoil.org/images/food-life-globe-1.1280.webp",
      "date_published": "2025-11-06T00:00:00.000000Z",
      "date_modified": "2025-11-06T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "food",
        "biodiversity",
        "life",
        "spatial"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/xj427-y3q48",
          "doi": "10.59350/xj427-y3q48",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/c7zd2-6912",
      "content_html": "<p>I got to shake Jensen Huang's hand as he <a href=\"https://www.linkedin.com/posts/nvidia_nvidia-ceo-jensen-huang-has-been-awarded-activity-7391572997649838086-r5Dx\">received</a> the <a href=\"https://cus.org/hawkingfellowship\">2025 Hawking Fellowship</a> this evening at the Cambridge Union! He's a fitting winner for this award; he's not only tech's longest running CEO (33 years!), but also a <a href=\"https://www.cbsnews.com/news/meet-nvida-ceo-jensen-huang-company-powering-ai-today-60-minutes-transcript/\">founding engineer</a> who deeply understands the technology stack. He also bucks the trend among bigtech and famously doesn't believe in firing people, preferring to &quot;<a href=\"https://www.forbes.com/sites/julianhayesii/2024/09/06/nvidia-ceo-jensen-huang-gives-contrary-advice-to-ceos-on-firing-people/\">torture them into greatness</a>&quot;<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> instead.</p>\n<p>At the awards event this evening, Jensen made some remarks that stuck with me. He noted that he doesn't really like competing: <em>&quot;Nvidia is a company that succeeds when other companies succeed&quot;</em>.  By building platforms, Nvidia can focus on creating a better future that others aren't, and not on beating someone else. 
This mindset has had real <a href=\"https://siliconangle.com/2018/11/15/nvidia-shares-plunge-misses-earnings-forecasts/\">adverse costs</a>, but Nvidia has stoically taken hardware bets<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup> that it believes in across <em>six</em> waves of computing generations.</p>\n<p>It was quite delightful to hear this attitude of focussing on the future, and\nretaining a basic optimism and curious mindset even as his company becomes <a href=\"https://www.forbes.com/sites/tylerroush/2025/10/29/nvidia-becomes-first-company-worth-5-trillion/\">larger than most economies</a>.</p>\n<h2 id=\"did-jensen-huang-just-advise-cambridge-to-abolish-exams\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#did-jensen-huang-just-advise-cambridge-to-abolish-exams\"></a>Did Jensen Huang just advise Cambridge to abolish exams?</h2>\n<p>My question to him was simple. My 816 year old institution has had exams as\nits staple assessment mechanism for centuries. Is AI finally the breaking point\nby which competitive stack ranking of brilliant students no longer makes sense? What would his advice to Cambridge University be now?</p>\n<p><img src=\"/images/jensen-hawking-4.webp\" alt=\"%c\" title=\"Jensen Huang at the Cambridge Union\" ></p>\n<p>His answer was comprehensive and nuanced.  At Nvidia, he's discounted <a href=\"https://fortune.com/2024/06/12/nvidia-ceo-jensen-huang-meeting-rule/\">360 peer reviews</a>\nor <a href=\"https://www.goodreads.com/book/show/3181380\">firing the bottom bracket of employees</a> as ineffective. The\nreason for this is that those employees\nmight be underperforming temporarily because they had the courage to take\nrisks, and in the next cycle are a source of creativity to come up with fresh\nideas and avoid ruts. If you weed out all the risktakers regularly, you've just\nweeded out a bunch of original thinkers!  
So instead, they focus on a culture\nof helping peers to improve, and trust individuals to avoid <a href=\"https://www.businessinsider.com/nvidia-employees-rich-happy-problem-insiders-say-2023-12\">resting and vesting</a>.</p>\n<p>So how does this philosophy apply to Cambridge, a residential institution with a huge emphasis on <a href=\"https://www.undergraduate.study.cam.ac.uk/supervisions-and-assessment\">high student contact time</a> and full of mavericks? Jensen noted that the premise of testing students via exams is based around hiding some information (the exam) and\nmaking students find the answers (from their memory). When Google rose in popularity\nin the early 2000s, there was a lot of soul searching then about whether <a href=\"https://www.bbc.co.uk/news/magazine-12340505\">libraries were now obsolete</a>.\nToday, AI makes finding <a href=\"/papers/2025-evidence-tap\">expert information</a> about <a href=\"/papers/2024-ce-llm\">specialist topics</a> easier than ever, and makes it proportionately harder to use &quot;information hiding&quot; as the mechanism of assessment.<sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote\">[3]</a></sup></p>\n<p><img src=\"/images/jensen-hawking-2.webp\" alt=\"%c\" title=\"Every seat in the house was taken, with students hanging off the rafters!\" ></p>\n<p>So his view was that information hiding will become an ineffective form of\nscarcity, much like how digitising books made learning library skills far less\nof a useful discriminator. 
And instead, we could focus our pedagogical efforts on the\nqualities that <em>will</em> remain useful to our students throughout their careers and lives: the courage to be creative and take risks,\nthe resilience to fail in a public setting, and the intellectual honesty to\nrecognise when a direction needs to be altered.</p>\n<p>I've been having long conversations with <a href=\"http://carlhenrik.com/\">Carl Henrik Ek</a> and <a href=\"https://proroklab.org/\">Amanda Prorok</a> about the\nvalues we really believe in for Computer Science at Pembroke, which are aligned\nwith Jensen's remarks above. I don't think we'll be abolishing exams any time\nsoon, but I wouldn't be surprised if their format dramatically changed by the\nend of this decade.</p>\n<p>Nvidia is a remarkably agile company despite its huge size thanks to the culture\nthat Jensen Huang established. Other large organisations with similar\nagility that I've worked with include Jane Street<sup id=\"fnref:4\"><a href=\"#fn:4\" class=\"footnote\">[4]</a></sup> and their <a href=\"https://www.janestreet.com/who-we-are/\">famous lack of\nhierarchy</a>. Cambridge <em>also</em> has a unique\nstructure with the independent Colleges and University coalition that means we\nresist centralisation, but exam culture is so deeply entrenched in our Tripos that\nfinding a path away from it might take some time -- but it is possible!</p>\n<p><img src=\"/images/jensen-hawking-3.webp\" alt=\"%c\" ></p>\n<h2 id=\"he-deeply-believes-in-ai-for-scientific-discovery\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#he-deeply-believes-in-ai-for-scientific-discovery\"></a>He deeply believes in AI for scientific discovery</h2>\n<p>Another very encouraging aspect in Jensen's talk was his deep belief in the\nscientific method, and the importance of intellectual honesty in our collective\nactions. 
He talked about his excitement about <a href=\"https://www.cnbc.com/2025/10/28/eli-lilly-nvidia-supercomputer-ai-factory-drug-discovery.html\">AI for drug discovery</a>,\nbut also how weird it is that we have to &quot;discover&quot; drugs in the first place.\nThere are clearly physical laws governing the interactions between biological\nentities, so why isn't biology more of an &quot;engineering practice&quot;?</p>\n<p>His approach, of course, is the application of machine learning to build\nbiological models. And with the advent of multimodal models, he'd like to be able\nto <em>talk</em> to proteins and interrogate them Socratically <em>(&quot;how would you bind to this other molecule?&quot;)</em>.\nThis chimes very well with the research we've been doing in my group for the past few years...</p>\n<p>I've just been involved in two bold ARIA discovery grants that encourage us to ask exactly these sorts of big questions around understanding <a href=\"/notes/nas-rs-biodiversity\">complex ecological systems</a> using <a href=\"/papers/2024-ai-conhorizon\">machine learning</a>. The <a href=\"https://www.linkedin.com/posts/joe-millard_engineering-ecosystem-resilience-activity-7384950880065753088-6SCw\">first</a> is with <a href=\"https://joemillard.github.io/\">Joe Millard</a> on exploring the potential of combining distinct forms of foundational AI to simulate ecosystem resilience in digital space. The second is with <a href=\"http://oisin.info\">Oisin Mac Aodha</a> up in Edinburgh to think about <a href=\"/papers/2024-sdm-sa\">species distribution prediction</a> using a combination of machine learning, causal literature and remote observations.</p>\n<p>Key to both of these succeeding are the foundational machine learning models we've been developing. The <a href=\"/papers/2025-tessera\">TESSERA</a> geospatial model unlocks the modality of <a href=\"/notes/geotessera-python\">ground observation</a> alongside conventional LLMs. 
The <a href=\"/papers/2025-evidence-tap\">Cambridge TAP</a> literature pipeline gives us deep causal connections from millions of experiments.</p>\n<p>We haven't even begun to plumb the potential of combining some of these models in the future since we're deeply limited by our computational resources in the University at the moment. Luckily, both Nvidia and AMD<sup id=\"fnref:5\"><a href=\"#fn:5\" class=\"footnote\">[5]</a></sup> have been enthusiastic about supporting open research through GPU access, and we also have <a href=\"/videos/48a7ab10-3f49-4978-a00f-c26b64c2cae7\">Dawn</a> and soon Isambard. I'm <em>really</em> enjoying research life right now, and particularly so after seeing that someone as successful as Jensen Huang still retains a childlike curiosity about the world. Congratulations to him on a well-deserved Hawking Fellowship!</p>\n<p><img src=\"/images/jensen-hawking-1.webp\" alt=\"%c\" ></p>\n<p>Some entertaining misc factoids from his talk:</p>\n<ul>\n<li>Nvidia almost had a huge UK presence, except that their <a href=\"https://www.theguardian.com/business/2022/feb/08/nvidia-takeover-arm-collapses-softbank\">purchase of ARM was blocked</a> by regulators. Jensen made an offhand joke that &quot;perhaps it wasn't too late to buy them&quot;!</li>\n<li>He cited a great<sup id=\"fnref:6\"><a href=\"#fn:6\" class=\"footnote\">[6]</a></sup> marketing <a href=\"https://en.wikipedia.org/wiki/Crossing_the_Chasm\">book</a> I was given in 2001 on my first day at NetApp.</li>\n<li>He was the first generation of engineers that could use <a href=\"https://en.wikipedia.org/wiki/Computer-aided_design\">CAD</a> to design a computer inside another computer. Today, every chip lives as a digital twin for ages before the real hardware comes back from a fab. 
This is pleasingly recursive.</li>\n<li>I finally met Lucy Hawking in person, who I've been wanting to meet for ages since I now own her childhood house (her dad is <a href=\"https://en.wikipedia.org/wiki/Stephen_Hawking\">Stephen Hawking</a>).</li>\n</ul>\n<p>A big thank you to <a href=\"https://www.linkedin.com/in/andy-grant-7684431/\">Andy Grant</a> for the invitation to tonight's event! It was a lot of fun.</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>Remarkably similar to a Cambridge University degree...</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>The GeForce I bought back in 2001 came with expensive <a href=\"https://www.pcgamer.com/from-voodoo-to-geforce-the-awesome-history-of-3d-graphics/\">shader hardware</a> that nothing supported. That feature led to GPGPU programming that led to modern machine learning becoming computationally feasible.</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:3\"><p><p>This is distinct from the problem of LLMs having a <a href=\"https://warwick.ac.uk/fac/cross_fac/eduport/edufund/projects/yang/projects/ai-meets-the-classroom-when-do-large-language-models-harm-learning/\">negative effect on learning</a> in schools and university. 
I'm focussing on <em>exams</em> as the primary means of assessment.</p>\n <a href=\"#fnref:3\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:4\"><p><p>The <a href=\"https://signalsandthreads.com/from-the-lab-to-the-trading-floor/\">Signals and Threads episode</a> describing how Jane Street regularly swaps employee desks is a wild ride.</p>\n <a href=\"#fnref:4\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:5\"><p><p>AMD donated us access to enough MI300x machines to make the training of the first version of TESSERA possible earlier this summer.</p>\n <a href=\"#fnref:5\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:6\"><p><p>Possibly the <em>only</em> good marketing book I've read.</p>\n <a href=\"#fnref:6\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Jaffer et al (2025). AI-assisted Living Evidence Databases for Conservation Science. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2025-rmsqf\" target=\"_blank\"><i>10.33774/coe-2025-rmsqf</i></a></li>\n<li>Reynolds et al (2024). The potential for AI to revolutionize conservation: a horizon scan. <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\" target=\"_blank\"><i>10.1016/j.tree.2024.11.013</i></a></li>\n<li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. 
<a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li>\n<li>Madhavapeddy (2025). GeoTessera Python library released for geospatial embeddings. <a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/jensen-huang-hawking",
      "title": "Jensen Huang receives the Hawking Fellowship at Cambridge",
      "summary": "Reflections on meeting Jensen Huang as he received the 2025 Hawking Fellowship, discussing his views on education, assessment, risk-taking culture at Nvidia, and the future of AI in scientific discovery and biological research.",
      "date_published": "2025-11-04T00:00:00.000000Z",
      "date_modified": "2025-11-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "nvidia",
        "cambridge",
        "fellowship",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/j6zkp-n7t82",
          "doi": "10.59350/j6zkp-n7t82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.33774/coe-2025-rmsqf",
          "doi": "10.33774/coe-2025-rmsqf",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1016/j.tree.2024.11.013",
          "doi": "10.1016/j.tree.2024.11.013",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/7hy6m-1rq76",
          "doi": "10.59350/7hy6m-1rq76",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/4jf5k-01n91",
      "content_html": "<p>I had an amazingly fun week at\n<a href=\"https://conf.researchr.org/home/icfp-splash-2025\">ICFP/SPLASH</a> in Singapore;\nit was the first time that these two major programming languages conferences\nwere held simultaneously. My submissions turned into a bit of a <a href=\"https://conf.researchr.org/profile/icfp-splash-2025/anilmadhavapeddy\">success\ndisaster</a>;\nI ended up chairing a workshop, giving several talks and a keynote,\norganising a tutorial, and helping out a bunch of colleagues and students.  And\nif this wasn't enough to fill up the week, collaborators I hadn't seen in a few\nyears were also presenting a tonne of interesting work, so there was no time to\nbreathe!</p>\n<p>So much went on that I've split up the post into a five parter:</p>\n<ul>\n<li><em>Part 1:</em> <a href=\"/notes/icfp25-propl\">Chairing the 2nd Programming for the Planet Workshop</a></li>\n<li><em>Part 2:</em> <a href=\"/notes/icfp25-oxcaml\">Holding a tutorial on OxCaml</a></li>\n<li><em>Part 3:</em> <a href=\"/notes/icfp25-ocaml5-js-docker\">Migrations to OCaml 5 with Jane Street and Docker</a></li>\n<li><em>Part 4:</em> <a href=\"/notes/icfp25-post-posix\">My case for post-POSIX IO being important for runtime designers</a></li>\n<li><em>Part 5:</em> <a href=\"/notes/icfp25-what-i-learnt\">What I learnt from other people's talks and chats</a></li>\n</ul>\n<p><img src=\"/images/icfp-20.webp\" alt=\"%c\" title=\"The conference was actually larger than this very large dosa.\" ></p>\n<p>Other colleagues who wrote up their experiences at ICFP 2025 include:</p>\n<ul>\n<li><a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a> wrote up his <a href=\"https://maxcarroll0.github.io/blog/I&apos;m%20Attending%20ICFP%20&amp;%20SPLASH%202025!/\">five days at ICFP</a>.</li>\n<li><a href=\"https://patrick.sirref.org\">Patrick Ferris</a> got lots of hacking done in <a href=\"https://patrick.sirref.org/icfp-2025/index.xml\">his recap of the 
week</a>!</li>\n<li><a href=\"https://www.dra27.uk\">David Allsopp</a> gives us his <a href=\"https://www.dra27.uk/blog/platform/2025/10/18/icfp-2025.html\">reflections on ICFP25</a> as well.</li>\n<li><a href=\"https://toao.com\">Sadiq Jaffer</a> recapped his <a href=\"https://toao.com/blog/ai-existential-ocaml\">ICFP 2025 talk on OCaml and AI</a>.</li>\n<li>Chris Armstrong described his <a href=\"https://www.chrisarmstrong.dev/posts/icfp-wrapup-2025-10-18\">first ICFP experience</a> and his talk.</li>\n</ul><h1>References</h1><ul><li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Madhavapeddy (2025). It's time to go post-POSIX at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/mch1m-8a030\" target=\"_blank\"><i>10.59350/mch1m-8a030</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Madhavapeddy (2025). Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/3jkaq-d3398\" target=\"_blank\"><i>10.59350/3jkaq-d3398</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/icfp25",
      "title": "A Roundup of ICFP/SPLASH 2025 happenings",
      "summary": "Five-part series overview covering workshops, tutorials, talks and keynotes from ICFP/SPLASH 2025 in Singapore.",
      "date_published": "2025-10-10T00:00:00.000000Z",
      "date_modified": "2025-10-10T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conference",
        "icfp",
        "splash",
        "programming"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/mch1m-8a030",
          "doi": "10.59350/mch1m-8a030",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/3jkaq-d3398",
          "doi": "10.59350/3jkaq-d3398",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/w1jvt-8qc58",
      "content_html": "<p>This is part 5 of a <a href=\"/notes/icfp25\">series</a> of posts<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> about ICFP 2025.</p>\n<p>In addition to giving a bunch of talks about\n<a href=\"/notes/icfp25-ocaml5-js-docker\">Docker</a>, <a href=\"/notes/icfp25-post-posix\">post-POSIX</a> and\n<a href=\"/notes/icfp25-propl\">planetary computing</a>, the greatest fun at a huge conference\nlike ICFP and SPLASH is seeing talks given by my students (they grow up so\nfast!) and collaborators, and generally floating around random talks trying to\ndecipher ancient Greek lambdas floating on a projector.</p>\n<h2 id=\"hazel-live-programming-and-type-level-debugging\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#hazel-live-programming-and-type-level-debugging\"></a>Hazel live programming and type level debugging</h2>\n<p>I've been wanting to try to do something with <a href=\"https://hazel.org\">Hazel</a> ever\nsince <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> showed it to me at <a href=\"https://watch.eeg.cl.cam.ac.uk/w/3nGExywoVm6XFRBA2zYxSL\">last year's PROPL</a>.\n<a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a> picked up the idea of doing <a href=\"/ideas/gradual-type-error-debugging\">gradual type-level debugging</a> for his Part II undergraduate project\nat the Computer Lab. 
He not only aced his project, but wrote up a\n<a href=\"https://maxcarroll0.github.io/assets/papers/Carroll-Decomposable_Type_Highlighting.pdf\">paper</a>\nfor the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/hatra-2025\">HATRA</a>\nworkshop:</p>\n<blockquote>\n<p>We explore how to provide programmers with an interactive interface for\nexplaining the process by which static types and dynamic casts are derived,\nwith the goal of improving the debugging of static and dynamic type errors.</p>\n<p>To this end, we define mathematical foundations for a decomposable\nhighlighting system within a bidirectional system, and show how these can be\npropagated through dynamic types in a cast system. Our prototype\nimplementation in the gradually typed Hazel language includes a web-based\nuser interface, through which we highlight the importance of type level\ndebugging.\n<cite>-- <a href=\"https://maxcarroll0.github.io/assets/papers/Carroll-Decomposable_Type_Highlighting.pdf\">Decomposable Type Highlighting for Bidirectional Type and Cast Systems</a>, Carroll 2025</cite></p>\n</blockquote>\n<p><a href=\"https://youtu.be/P-x1msRL7XU?t=3994\"> <img src=\"/images/icfp-18.webp\" alt=\"%c\" title=\"Max Carroll presenting his work on gradual type-level debugging in Hazel\" > </a></p>\n<p><a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a> delivered a fantastic first conference talk, complete with a live\ndemo demonstrating type-level debugging in action; give it a watch if you're\ninterested in live programming!</p>\n<p>One issue we had during this project was finding a decent corpus of functional\ncode <em>with errors</em> to use to test out Max's debugger. Hazel's a pretty young\nlanguage, and finding a large codebase is difficult, let alone a bunch of code\nwith errors. 
<a href=\"https://patrick.sirref.org\">Patrick Ferris</a> decided to accelerate this process by building the\n<a href=\"https://github.com/patricoferris/hazel_of_ocaml\">hazel_of_ocaml</a> transpiler and\npresenting this work at the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/tyde-2025#program\">TyDE workshop</a>.</p>\n<p><a href=\"https://www.youtube.com/watch?v=VJM5-IVQ8lw&amp;t=21045s\"> <img src=\"/images/icfp-pf341-tyde.webp\" alt=\"%c\" title=\"Patrick Ferris presents Hazel-of-OCaml at TyDE 2025\" > </a></p>\n<p>With Patrick's transpiler, we grabbed <a href=\"https://eric.seidel.io/\">Eric Seidel</a>'s\ncorpus of <a href=\"https://zenodo.org/records/806814\">ill-typed OCaml</a> that he built\nfor his research on <a href=\"https://dl.acm.org/doi/10.1145/2951913.2951915\">dynamic witnesses for static type\nerrors</a>. Max successfully used\nthis translated corpus to build his type-level debugger, and is planning to\ncontinue to work on this in his Part III project this year.</p>\n<h2 id=\"three-steps-for-ocaml-to-crest-the-ai-humps\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#three-steps-for-ocaml-to-crest-the-ai-humps\"></a>Three Steps for OCaml to Crest the AI Humps</h2>\n<p>I've been <a href=\"/notes/claude-copilot-sandbox\">spending a lot of time with my friend Claude</a> recently, and so have <a href=\"https://toao.com\">Sadiq Jaffer</a> and\n<a href=\"https://jon.recoil.org\">Jon Ludlam</a>. 
We <a href=\"/papers/2025-ocaml-ai\">wrote up</a> our experiences with interfacing\nOCaml with coding agents, and Sadiq <a href=\"https://youtu.be/Xh5PNe0SxDY?t=24042\">presented it</a> to an interactive crowd at the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/ocaml-2025\">OCaml Workshop</a>.</p>\n<p><a href=\"https://youtu.be/Xh5PNe0SxDY?t=24042\"> <img src=\"/images/icfp-26.webp\" alt=\"%c\" title=\"Sadiq couldn't resist a good pun for his OCaml Workshop talk\" > </a></p>\n<p>Aside from the very sensible\n<a href=\"https://jon.recoil.org/blog/2025/08/ocaml-lsp-mcp.html\">guidance</a> on MCP and\ntools, I discovered a couple of things from this work:</p>\n<ul>\n<li><a href=\"https://toao.com\">Sadiq Jaffer</a> found <a href=\"https://www.tbench.ai/\">terminal-bench</a> and added an <a href=\"https://toao.com/blog/gc-debug-terminal-bench\">OCaml GC debugging task</a>. This has the effect of getting the frontier AI labs to point their mega training tasks at OCaml-related problems, thus making a rising tide for everyone! And looking at <a href=\"https://github.com/laude-institute/terminal-bench/commits/main/tasks/fix-ocaml-gc\">the history of the task</a>, other labs are raising the timeout on Sadiq's task, meaning that fixing bugs in the OCaml GC is right at the top end of difficulty. Let's get more problems into terminal-bench!</li>\n<li>It also surprised me just how good the <a href=\"https://qwen.ai/home\">Qwen coder</a> models are <a href=\"https://toao.com/blog/ocaml-local-code-models\">on simple OCaml tasks</a>. Local models are fairly far behind Claude's, but the gap is closing as the innovation moves to agentic context management. 
I'm excited to see <a href=\"https://github.com/tmattio\">Thibaut Mattio</a>'s work on <a href=\"https://getspice.dev/\">Spice</a> (see his <a href=\"https://youtu.be/e8Dkj47nxbg?t=99\">FunOCaml talk</a>) as that combines these local models with OCaml-specific context management.</li>\n</ul>\n<h2 id=\"formally-verified-garbage-collector-for-ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#formally-verified-garbage-collector-for-ocaml\"></a>Formally verified garbage collector for OCaml</h2>\n<p>Sheera Shamsu gave a fantastic <a href=\"https://youtu.be/Xh5PNe0SxDY?t=4364\">talk</a> on building a formally\nspecified garbage collector for OCaml to a very crowded room! This was rather\ntopical given our <a href=\"/notes/icfp25-ocaml5-js-docker\">musings on multiple runtimes</a> in\nthe shift from OCaml 4 to 5.</p>\n<p><a href=\"https://youtu.be/Xh5PNe0SxDY?t=4364\"> <img src=\"/images/icfp-21.webp\" alt=\"%c\" title=\"Sheera Shamsu on a mechanically verified GC for OCaml\" > </a></p>\n<blockquote>\n<p>[...] we propose a strategy for crafting a correct, proof-oriented GC from\nscratch, designed to evolve over time with additional language features. Our\napproach neatly separates abstract GC correctness from OCaml-specific GC\ncorrectness, offering the ability to integrate further GC optimizations,\nwhile preserving core abstract GC correctness. As an initial step to\ndemonstrate the viability of our approach, we have developed a verified\nstop-the-world mark-and- sweep GC for OCaml. The approach is mechanized in Fstar\nand its low-level subset Lowstar.\n<cite>-- <a href=\"https://link.springer.com/article/10.1007/s10817-025-09721-0\">A Mechanically Verified Garbage Collector for OCaml</a>, Shamsu et al 2025</cite></p>\n</blockquote>\n<p>Chatting to <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> afterwards, it seems that there's interest in shifting to\nLean from Fstar to investigate if the ergonomics of the proofs are better. 
But\nas baselines go, the mechanically verified collector always beat the\nconservative Boehm GC, which means it's no worse than the current more\nconservative choice. That's good work!</p>\n<p><img src=\"/images/icfp-22.webp\" alt=\"%c\" title=\"A full room for verified GCs!\" ></p>\n<h2 id=\"haskell-and-ocaml-the-twain-shall-meet\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#haskell-and-ocaml-the-twain-shall-meet\"></a>Haskell and OCaml, the Twain Shall Meet?</h2>\n<p>Most &quot;wonderful ICFP experiences&quot; usually include crossing the lines\nto go hang out with <em>other</em> language communities.</p>\n<p>Back in 2014, I stayed up all night before my <a href=\"/videos/ed84b2eb-1b93-4dc3-b746-63a4af13d4ea\">keynote</a> to the Haskell Symposium\ntrying to encode OCaml functors as Haskell typeclasses and even got help on\nstage from friendly Haskellers.  This year, <a href=\"https://richarde.dev/\">Richard Eisenberg</a> was my absolute\nhighlight with a <a href=\"https://youtu.be/IlQQElKaFvM?t=13184\">superb session</a> on what\nhe's learnt from being the rare breed of someone steeped deeply <em>both</em> in\nHaskell and OCaml.  The room was so packed for this talk that they had to\ncreate an overflow room streaming it in the corridors!</p>\n<p><a href=\"https://youtu.be/IlQQElKaFvM?t=13184\"> <img src=\"/images/icfp-23.webp\" alt=\"%c\" title=\"Richard Eisenberg setting up in a crowded room for his keynote\" > </a></p>\n<p>Richard talked about his experiences with being <em>both</em> an OCamler and a Haskeller,\nand went through a series of examples illustrating the differences between the\ntwo. He didn't get very far before the audience got involved, with both\nHaskellers and OCamlers putting their 2c in! 
For that reason, the stream\nrecording might not work so well.</p>\n<p><a href=\"https://youtu.be/IlQQElKaFvM?t=13184\"> <img src=\"/images/icfp-25.webp\" alt=\"%c\" > </a></p>\n<p>It's worth watching the talk rather than me going through each of his examples,\nbut I did have a long morning coffee with <a href=\"https://simon.peytonjones.org/\">Simon Peyton Jones</a> when I got back to Cambridge about what\nthe essential difference is between OCaml and Haskell. Laziness seems like a\ndetail, but purity is absolutely key; it percolates through every other design\ndecision (like ordering of variables, or module generativity, and so on) since\nside-effects lurk everywhere in OCaml.</p>\n<p><img src=\"/images/icfp-19.webp\" alt=\"%c\" title=\"Jane Street had a fun 'corridor track' where they contrasted Haskell and OCaml to passerbys as well, including an unfortunate wedding party that happened to be on the same floor as us.\" ></p>\n<p>I think it's really important to have these cross-community in-person moments. One call to action in Richard's talk was for us to consider having a unified &quot;Haskell/ML Symposium&quot; where long-form research papers could be shared, with shorter language-specific workshops. One audience member asked why this couldn't just be the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/mlsymposium-2025#event-overview\">ML Workshop</a>, and Richard promptly pointed out that it has &quot;Higher-order, Typed, Inferred, <strong>Strict</strong>&quot; in the title! Just excising one word might unify two communities long split for decades...</p>\n<p>Aside from language matters, I think it would be a good idea to bring more of the functional programming community together more often outside of the &quot;main ICFP track&quot; (which is high pressure and quite squeezed for time with little discussion outside the corridor tracks). 
I really miss <a href=\"https://cufp.org\">CUFP</a>, since for a decade this was where the functional hackers would all meet up towards the tail end of the main conference. This year, however, the workshops were run in parallel with the main ICFP and OOPSLA, which I think sadly diluted the community bonding a bit.</p>\n<p><img src=\"/images/icfp-24.webp\" alt=\"%c\" title=\"KC is the other person who's done both OCaml and Haskell hacking, so it was kind of adorable to see him sitting beside SPJ during the talk!\" ></p>\n<h2 id=\"i-got-shriramed-about-our-cambridge-teaching\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#i-got-shriramed-about-our-cambridge-teaching\"></a>I got Shriram'ed about our Cambridge teaching</h2>\n<p>Speaking of teaching, no one in the world can school me better than <a href=\"https://cs.brown.edu/~sk/\">Shriram Krishnamurthi</a> when\nit comes to matters of computer science pedagogy. I grabbed him at the lunch\nbreak and asked him for advice on our upcoming reform of the Cambridge Computer\nScience Tripos (I teach the <a href=\"/notes/focs\">first course</a>). His opinions were legion,\nand he kindly gave me a quick spin around what they are working on at Brown.</p>\n<p>The SMoL (Standard Model of Languages) has a nice <a href=\"https://blog.brownplt.org/2024/04/12/behavior-misconceptions.html\">web interface</a> and quiz, just like the one he helped on for our <a href=\"/notes/icfp25-oxcaml\">OxCaml tutorial</a>. SMoL is deliberately language agnostic:</p>\n<blockquote>\n<ul>\n<li>If students master SMoL, they have a good handle on the core of several of these languages.</li>\n<li>Students may find it easier to port their knowledge between languages: instead of being lost in a sea of different syntax, they can find familiar signposts in the common semantic features. 
This may also make it easier to learn new languages.</li>\n<li>The differences between the languages are thrown into sharper contrast.</li>\n<li>Students can see that, by going beyond syntax, there are several big semantic ideas that underlie all these languages, many of which we consider “best practices” in programming language design.\n<cite>-- <a href=\"https://blog.brownplt.org/2024/04/12/behavior-misconceptions.html\">Fixing Standard Misconceptions about Program Behaviour</a>, 2024</cite></li>\n</ul>\n</blockquote>\n<p>Much like <a href=\"https://richarde.dev/\">Richard Eisenberg</a>'s talk on Haskell/OCaml, the SMoL tutor shows multiple\nlanguages for the same problem, rotating across Python, Scala, JavaScript and\nso on. I like this idea <em>a lot</em> for our Foundations of CS course, as I've been\nconsidering rotating in <a href=\"https://hazel.org\">Hazel</a> into the mix to ease the\nsyntactic shock of using OCaml. SMoL takes this concept much further, and is\nbacked by serious <a href=\"https://cs.brown.edu/~sk/Publications/Papers/Published/\">user studies</a> on students.</p>\n<p>I also really liked the <a href=\"https://pyret.org/\">Pyret</a> approach of starting to\nteach using tables as a core datastructure, and not lists or arrays. 
However,\nI'll need to think hard about how this teaching model would work under Cambridge's\nquirky <a href=\"https://www.undergraduate.study.cam.ac.uk/supervisions-and-assessment\">supervision model</a>.</p>\n<p>This is on my queue to work on over the winter, while <a href=\"https://jon.recoil.org\">Jon Ludlam</a> kindly <a href=\"https://jon.recoil.org/blog/2025/09/giving-hub-cl-an-upgrade.html\">covers</a> my undergraduate <a href=\"https://www.cl.cam.ac.uk/teaching/2526/FoundsCS/\">lectures</a> for this year while I'm on sabbatical!\nOn my reading list from chatting to him:</p>\n<ul>\n<li><a href=\"https://cacm.acm.org/opinion/data-centricity/\">Data-Centricity: A Challenge and Opportunity for Computing Education</a>, CACM 2025.</li>\n<li><a href=\"https://iase-pub.org/ojs/SERJ/article/view/190/95\">Modeling as a core component of structuring data</a>, Konold 2017.</li>\n</ul>\n<p><img src=\"/images/icfp-27.webp\" alt=\"%c\" title=\"Not a bracket out of place when Shriram is demoing Pyret!\" ></p>\n<h2 id=\"deterministic-wasm\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#deterministic-wasm\"></a>Deterministic WASM</h2>\n<p>WebAssembly has also come a long way since I <a href=\"/notes/wasm-on-exotic-targets\">last looked</a> into it.  
I had a long chat with <a href=\"https://www.doc.ic.ac.uk/~pg/\">Philippa Gardner</a> on the\n<a href=\"https://bsky.app/profile/ningkeli.bsky.social/post/3m2y4ncoiae2n\">nature hike</a>\nto learn about her work on <a href=\"https://dl.acm.org/doi/10.1145/3656440\">SpecTec</a>,\nwhich is a single source-of-truth DSL that describes both the\n<a href=\"https://webassembly.org\">Wasm</a> specification <em>and</em> the artefacts like the\ninterpreter.</p>\n<p>After that, <a href=\"https://s3d.cmu.edu/people/core-faculty/titzer-ben.html\">Ben Titzer</a> told me about <a href=\"https://arxiv.org/abs/2312.03858\">WALI</a>, which is an alternative approach to <a href=\"https://wasi.dev/\">WASI</a> that simply exposes Linux kernel interfaces straight to the wasm runtime.  I'm rather amenable to this given my <a href=\"/notes/icfp25-post-posix\">case for shared memory IO</a> earlier in the week at VMIL, so this is now on my list of things to investigate! <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a>, <a href=\"https://tyconmismatch.com/code.html\">Chris Casinghino</a> and I discussed what an <a href=\"/notes/icfp25-oxcaml\">OxCaml</a> wasm unikernel might look like (a lot of buzzwords, I know), and we are pretty close to OxCaml making it possible to write runtimes using itself -- it just needs support for &quot;external memory&quot;, which is a topic the Jane Street <a href=\"https://blog.janestreet.com/wrought-2025/#ref-counted-objects-in-shared-memory\">interns worked on</a> over their summer projects.</p>\n<h2 id=\"wrapup-thoughts-on-singapore\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#wrapup-thoughts-on-singapore\"></a>Wrapup thoughts on Singapore</h2>\n<p>Overall, I had a brilliant -- if exhausting! -- week at ICFP in Singapore. I\nloved the city, I loved the vibes around the conference, and it was totally\nworth the trip. 
Huge thanks to Ilya Sergey and the organising team for making\nthis happen!</p>\n<p><img src=\"/images/icfp-20.webp\" alt=\"%c\" title=\"The vegetarian food was amazing and my diet is in tatters.\" ></p>\n<p><img src=\"/images/icfp-5.webp\" alt=\"%c\" title=\"The coffee was 'ok'; I wonder what Satnam Singh thought about it!\" ></p>\n<p><img src=\"/images/icfp-17.webp\" alt=\"%c\" title=\"The views were spectacular. Singaporean architecture is ridiculous.\" ></p>\n<p><small class=\"credits\"> <em>10th Oct 2025: Typo fixes spotted by Shriram.</em> </small></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>See also in the <a href=\"/notes/icfp25\">ICFP25</a> series: <a href=\"/notes/icfp25-propl\">chairing PROPL25</a>, the <a href=\"/notes/icfp25-oxcaml\">OxCaml tutorial</a>, <a href=\"/notes/icfp25-ocaml5-js-docker\">multicore at Jane Street and Docker</a>, <a href=\"/notes/icfp25-post-posix\">post-POSIX IO</a> and <a href=\"/notes/icfp25-what-i-learnt\">what I learnt</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). Oh my Claude, we need agentic copilot sandboxing right now. <a href=\"https://doi.org/10.59350/aecmt-k3h39\" target=\"_blank\"><i>10.59350/aecmt-k3h39</i></a></li>\n<li>Madhavapeddy (2025). Foundations of Computer Science. <a href=\"https://doi.org/10.59350/qms3q-ymn65\" target=\"_blank\"><i>10.59350/qms3q-ymn65</i></a></li>\n<li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Madhavapeddy (2025). It's time to go post-POSIX at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/mch1m-8a030\" target=\"_blank\"><i>10.59350/mch1m-8a030</i></a></li>\n<li>Madhavapeddy (2025). A Roundup of ICFP/SPLASH 2025 happenings. 
<a href=\"https://doi.org/10.59350/4jf5k-01n91\" target=\"_blank\"><i>10.59350/4jf5k-01n91</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Madhavapeddy (2025). Webassembly on exotic architectures (a 2025 roundup). <a href=\"https://doi.org/10.59350/ycqj1-b3996\" target=\"_blank\"><i>10.59350/ycqj1-b3996</i></a></li>\n<li>Madhavapeddy (2025). Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/3jkaq-d3398\" target=\"_blank\"><i>10.59350/3jkaq-d3398</i></a></li>\n<li>Seidel et al (2016). Dynamic witnesses for static type errors (or, ill-typed programs usually go wrong). <a href=\"https://doi.org/10.1145/2951913.2951915\" target=\"_blank\"><i>10.1145/2951913.2951915</i></a></li>\n<li>Shamsu et al (2025). A Mechanically Verified Garbage Collector for OCaml. Journal of Automated Reasoning. <a href=\"https://doi.org/10.1007/s10817-025-09721-0\" target=\"_blank\"><i>10.1007/s10817-025-09721-0</i></a></li>\n<li>Youn et al (2024). Bringing the WebAssembly Standard up to Speed with SpecTec. Artifact for \"Bringing the WebAssembly Standard up to Speed with SpecTec\". <a href=\"https://doi.org/10.1145/3656440\" target=\"_blank\"><i>10.1145/3656440</i></a></li>\n<li>Ramesh et al (2025). Empowering WebAssembly with Thin Kernel Interfaces. Proceedings of the Twentieth European Conference on Computer Systems. <a href=\"https://doi.org/10.1145/3689031.3717470\" target=\"_blank\"><i>10.1145/3689031.3717470</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/icfp25-what-i-learnt",
      "title": "What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP",
      "summary": "Highlights from ICFP/SPLASH 2025 including Hazel live programming, OCaml AI tooling, formally verified GC, and cross-community discussions between Haskell and OCaml.",
      "date_published": "2025-10-09T00:00:00.000000Z",
      "date_modified": "2025-10-09T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "oxcaml",
        "ocaml",
        "programming",
        "docker",
        "multicore",
        "functional",
        "icfp"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/aecmt-k3h39",
          "doi": "10.59350/aecmt-k3h39",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/qms3q-ymn65",
          "doi": "10.59350/qms3q-ymn65",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/mch1m-8a030",
          "doi": "10.59350/mch1m-8a030",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/4jf5k-01n91",
          "doi": "10.59350/4jf5k-01n91",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/ycqj1-b3996",
          "doi": "10.59350/ycqj1-b3996",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/3jkaq-d3398",
          "doi": "10.59350/3jkaq-d3398",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/2951913.2951915",
          "doi": "10.1145/2951913.2951915",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1007/s10817-025-09721-0",
          "doi": "10.1007/s10817-025-09721-0",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3656440",
          "doi": "10.1145/3656440",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3689031.3717470",
          "doi": "10.1145/3689031.3717470",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/mch1m-8a030",
      "content_html": "<p>This is part 4 of 5 of a <a href=\"/notes/icfp25\">series</a> of posts<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> about ICFP 2025.</p>\n<p>After the excitement of presenting my <a href=\"/notes/icfp25-ocaml5-js-docker\">Docker experience report</a>, I went straight into giving a <a href=\"https://youtu.be/tOMF69dP2-I?t=14187\">keynote talk</a> at <a href=\"https://conf.researchr.org/home/icfp-splash-2025/vmil-2025\">VMIL\n2025</a>.  This talk\nbubbled up intrusive thoughts I've had over the past 25 years: every\nsystem I've worked on, ranging from <a href=\"/papers/2010-icfp-xen\">Xen</a> to\n<a href=\"/papers/2025-docker-icfp\">Docker</a>, seems to boil down to &quot;make shared memory go\nfast&quot;.</p>\n<p>I'd started to believe it was time for a change in the way we approach IO about 12 years ago when I talked about <a href=\"https://www.youtube.com/watch?v=Ss4pUbq09Lw\">weird IO\nbehaviour</a> to a packed audience at\nFOSDEM, and now I believe it's even more true in 2025.</p>\n<p>So I made one key <a href=\"/slides/vmil25-keynote.pdf\">argument</a> to the audience: it's time to accept that standards\nsuch as POSIX are now holding back the development of good language runtimes,\nand we need to embrace the diversity of highly concurrent, shared-memory\ninterfaces. And unfortunately, there's no portable subset amongst these, and so\nthis may require a rethink of our frontend language interfaces as well.</p>\n<p><a href=\"https://youtu.be/tOMF69dP2-I?t=14187\"> <img src=\"/images/icfp-38.webp\" alt=\"%c\" title=\"The leaning tower of operating system layers\" > </a></p>\n<p>After explaining <a href=\"https://github.com/ocaml-multicore/ocaml-uring\">io_uring</a> on\nLinux, and then the Windows and macOS variants, I showed how we're trying to\nsupport these in our OCaml 5 <a href=\"https://github.com/ocaml-multicore/eio\">Eio library</a>. 
While Eio has plenty of\nreally <a href=\"https://tarides.com/blog/2024-03-20-eio-1-0-release-introducing-a-new-effects-based-i-o-library-for-ocaml/\">cool features</a>,\nthe defining one for me is that it abstracts away IO operations (like\n<a href=\"https://ocaml.org/p/eio/1.3/doc/eio/Eio/Flow/index.html#val-copy\">Flow.copy</a>)\nsufficiently that the backend can do highly parallel dispatch to the kernel\nover shared memory interfaces like uring.</p>\n<p>Crucially, we treat the shared memory interface in Eio as the first-class citizen,\nwith the POSIX(ish) backends relegated to a compatibility role.  This way, future\ninterfaces to the programmer can be &quot;parallel first&quot;.</p>\n<h2 id=\"posix-shouldnt-be-relegated-just-yet\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#posix-shouldnt-be-relegated-just-yet\"></a>POSIX shouldn't be relegated just yet</h2>\n<p>The discussion with the audience after the talk was just fantastic. <a href=\"https://www.humprog.org/~stephen/\">Stephen Kell</a>,\nwho has\n<a href=\"https://www.humprog.org/~stephen/blog/devel/native-debugging-part-2.html\">thought</a>\n<a href=\"https://www.humprog.org/~stephen/blog/research/seven-type-sins.html\">deeply</a>\nabout the interaction between the kernel and userspace asked whether I was\nbeing too harsh on POSIX, which has served us faithfully for many years. And\nindeed, I agree with Stephen! POSIX gives us a fine boot layer, and a fine\ninteraction layer (for terminals anyway), and a great single threaded\ninterface. 
Where it falls over is the highly concurrent and parallel world of\nhigh performance computing, where we must align our data paths and not have any\ninterference from third party code.</p>\n<p>So perhaps we need to restructure runtimes to explicitly have a &quot;boot phase&quot;\n(POSIX) where they are establishing their resources, and then switch into a\n&quot;steady phase&quot; (uring and friends) where they are blasting data at high speeds.\nThese are really quite distinct modes of operation, and both are useful. Note\nthat &quot;high speed&quot; here applies to embedded systems as well; if I do less work\non those systems in the CPU, then I'll get better energy usage, so low-overhead\nmechanisms like uring are useful there too.</p>\n<p>One other thought I had from Stephen's excellent question was the important\nrole of POSIX in portability for many years. Moving forward into the next\ndecade, what standards body will help developers write portable software for\nall these different IO interfaces? It seems inevitable that this will fragment\ninto per-language specifications instead of the operating system (or C-level)\ninterfaces we've had for the past 50 years.</p>\n<h2 id=\"building-more-uring-based-low-level-examples\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#building-more-uring-based-low-level-examples\"></a>Building more uring-based low-level examples</h2>\n<p>I was also slightly surprised by how few people had used <code>io_uring</code>, in a room\nfull of language VM developers. 
One of the most useful things I did when\ndeveloping the OCaml <a href=\"https://github.com/ocaml-multicore/ocaml-uring\">uring\nbindings</a> was to build a\n<a href=\"https://github.com/ocaml-multicore/ocaml-uring/blob/main/tests/urcp_lib.ml\">parallel file\ncp</a>,\nbut I never got around to any real networking code.</p>\n<p>So I've started to build a &quot;raw&quot; OCaml 5 + uring HTTP server that is completely\nnon-portable, but serves to show how it works at the lowest level. This should\nalso give us a nice benchmark against which to test higher level interfaces.  My plan\nis to make this work with OCaml 5 first, and subsequently add <a href=\"/notes/icfp25-oxcaml\">OxCaml support</a>. <a href=\"https://toao.com\">Sadiq Jaffer</a> pointed\nme to the magic <a href=\"https://oxcaml.org/documentation/stack-allocation/intro/\">caml_alloc_local</a> FFI function\nadded in OxCaml that allows allocation directly into the OCaml stack from C, which should be all I need to\nmake the shared memory interface never allocate into the heap.</p>\n<p><a href=\"https://patrick.sirref.org\">Patrick Ferris</a> also spent <a href=\"https://patrick.sirref.org/icfp-2025/index.xml\">some ICFP time</a> hacking on integrating OxCaml into the OCaml uring bindings:</p>\n<blockquote>\n<p>The idea being that a completion queue entry (a notification that some operation has completed) could be fully represented using 64 bits (two 32-bit, unboxed values).\n<cite>-- <a href=\"https://patrick.sirref.org/icfp-2025/index.xml\">Patrick at ICFP 2025</a></cite></p>\n</blockquote>\n<p>Over my regular morning coffee back in Cambridge, <a href=\"https://simon.peytonjones.org/\">Simon Peyton Jones</a> pointed out\nthat Haskell <em>also</em> has had a uring <a href=\"https://gitlab.haskell.org/ghc/ghc/-/issues/18390\">backend\nPR</a> lingering for years, and\nso perhaps we should do the same exercise there too, to understand it all! 
I\ncan't say no to Simon, but if a Haskell expert is interested please do get in\ntouch so I don't have to inflict my OCaml-style Haskell on the world...</p>\n<p><small class=\"credits\"> <em>3rd Nov 2025: Add link to Patrick's uring bindings.</em> </small></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>See also in the <a href=\"/notes/icfp25\">ICFP25</a> series: <a href=\"/notes/icfp25-propl\">chairing PROPL25</a>, the <a href=\"/notes/icfp25-oxcaml\">OxCaml tutorial</a>, <a href=\"/notes/icfp25-ocaml5-js-docker\">multicore at Jane Street and Docker</a>, <a href=\"/notes/icfp25-post-posix\">post-POSIX IO</a> and <a href=\"/notes/icfp25-what-i-learnt\">what I learnt</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Scott et al (2010). Using functional programming within an industrial product group: perspectives and perceptions. ACM. <a href=\"https://doi.org/10.1145/1863543.1863557\" target=\"_blank\"><i>10.1145/1863543.1863557</i></a></li>\n<li>Madhavapeddy (2025). A Roundup of ICFP/SPLASH 2025 happenings. <a href=\"https://doi.org/10.59350/4jf5k-01n91\" target=\"_blank\"><i>10.59350/4jf5k-01n91</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Madhavapeddy (2025). 
Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/3jkaq-d3398\" target=\"_blank\"><i>10.59350/3jkaq-d3398</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/icfp25-post-posix",
      "title": "It's time to go post-POSIX at ICFP/SPLASH 2025",
      "summary": "VMIL keynote arguing for post-POSIX shared memory interfaces like io_uring in language runtimes for high-performance concurrent computing.",
      "date_published": "2025-10-08T00:00:00.000000Z",
      "date_modified": "2025-10-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "iouring",
        "ocaml",
        "tutorial",
        "programming",
        "functional",
        "icfp"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/1863543.1863557",
          "doi": "10.1145/1863543.1863557",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/4jf5k-01n91",
          "doi": "10.59350/4jf5k-01n91",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/3jkaq-d3398",
          "doi": "10.59350/3jkaq-d3398",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/3jkaq-d3398",
      "content_html": "<p>This is part 3 of 5 of a <a href=\"/notes/icfp25\">series</a> of posts<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> about ICFP 2025.</p>\n<p>It's been about six years since we wrote the papers on <a href=\"/papers/2020-icfp-retropar\">parallelism</a> and <a href=\"/papers/2021-pldi-retroeff\">effects</a>,\nand four years since we helped to <a href=\"/notes/recapping-ocaml-22\">release</a>\nupstream OCaml 5.0 with multicore support, a <a href=\"https://tarides.com/blog/2023-03-02-the-journey-to-ocaml-multicore-bringing-big-ideas-to-life/\">mammoth\neffort</a>\nthat took up years of work for my <a href=\"/projects/ocamllabs\">OCaml Labs</a> and\n<a href=\"https://tarides.com\">Tarides</a> crew. After the release came out, I focussed on\nbuilding applications using OCaml 5 for my own work on <a href=\"/projects/plancomp\">planetary computing</a>, for example on <em>using</em> the new features with the\nfledgling <a href=\"/papers/2023-ocaml-eio\">Eio library</a> to get some experience with\ndirect-style OCaml programming.</p>\n<p>Meanwhile, big OCaml users have also been adapting their codebases to shift\nfrom OCaml 4 to 5. Jane Street have expanded their tools and compiler team and\ndriven through their <a href=\"#the-path-to-ocaml-5-in-production-at-jane-street\">production switch</a> to the multicore runtime, and Docker for\nDesktop is progressing with <a href=\"#functional-networking-at-docker\">their switch</a> to direct-style code via Eio for\nhundreds of millions of users! Read on to learn more...</p>\n<h2 id=\"the-path-to-ocaml-5-in-production-at-jane-street\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-path-to-ocaml-5-in-production-at-jane-street\"></a>The Path to OCaml 5 in Production at Jane Street</h2>\n<p>Although it was the last talk of the entire conference week, I'm discussing\nthis first as it was the most exciting thing I learnt at ICFP!  
At the\n<a href=\"https://conf.researchr.org/home/icfp-splash-2025/rebase-2025\">REBASE</a> <sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup> workshop,\n<a href=\"https://github.com/yminsky\">Yaron Minsky</a> <a href=\"https://www.youtube.com/live/UI1wApT2t1w?t=20700s\">announced</a> that Jane Street's production servers are now running\non the OCaml 5 runtime!</p>\n<p><a href=\"https://www.youtube.com/live/UI1wApT2t1w?t=20700s\"> <img src=\"/images/icfp-29.webp\" alt=\"%c\" title=\"Yaron Minsky showing the timeline of Jane Street's recent OCaml usage\" > </a></p>\n<p>That was the good news (our runtime is trading trillions of dollars, wow). The\nbad news is that there was a bumpy road internally for Jane Street to go from\nthe version that was first released (OCaml 5.0 in <a href=\"https://ocaml.org/releases\">Dec\n2022</a>) to their current tree.  Ron gave a really\ngood roundup of some of the effort that went into release engineering their\ninternal rollout. Since the entire OCaml runtime was\n<a href=\"https://github.com/ocaml/ocaml/pull/10831\">rewritten</a> as part of the multicore\nruntime upgrade, Jane Street encountered <a href=\"https://github.com/ocaml/ocaml/pulls?q=is%3Apr+pacing+label%3APerformance+is%3Aclosed\">GC pacing\nissues</a>\nand other unexpected changes in resource usage as a result of the new runtime\nbehaviours.  This took significant design and engineering effort to fix, which Ron\ncovers in detail in <a href=\"https://www.youtube.com/live/UI1wApT2t1w?t=20700s\">his talk</a>.</p>\n<h3 id=\"diagnosing-the-ocaml-5-performance-bumps\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#diagnosing-the-ocaml-5-performance-bumps\"></a>Diagnosing the OCaml 5 performance bumps</h3>\n<p>So what was the root cause of this bumpiness? 
I think a lot of it was just\nnormal release engineering; we clearly\n<a href=\"https://discuss.ocaml.org/t/ocaml-5-0-first-normal-alpha-release/10216\">signalled</a>\nwhen releasing OCaml 5.0 that it was not yet feature complete, and that the\n4.x runtime would continue to be supported for some years.</p>\n<blockquote>\n<p>The developer team released OCaml 5.0.0 in December 2022. OCaml 5.x features\na full rewrite of its runtime system for shared-memory parallel programming\nusing domains and native support for concurrent programming using effect\nhandlers.</p>\n<p>Owing to the large number of changes, especially to the garbage collector,\nOCaml 4.14 (the final release in the OCaml 4.x series, originally released in\nMarch 2022) remains supported for the time being. Maintainers of existing\ncodebases are strongly encouraged to evaluate OCaml 5.x and to report any\nperformance degradations on our issue tracker.\n<cite>-- <a href=\"https://github.com/ocaml/ocaml/blob/85cd5fd3dc0c1763926378a571ef215ce9512908/README.adoc\">ocaml/ocaml README</a></cite></p>\n</blockquote>\n<p>The latest OCaml 5.4.0 release that came out just a <a href=\"https://discuss.ocaml.org/t/ocaml-5-4-0-released/17365\">couple of\nweeks</a> ago is the first\nrelease with full feature parity with the OCaml 4.x LTS branch. Features such\nas <a href=\"https://tarides.com/blog/2025-03-06-feature-parity-series-statmemprof-returns/\">statmemprof</a>,\nthe <a href=\"https://tarides.com/blog/2025-01-15-using-clang-cl-with-ocaml-5/\">MSVC port</a>, <a href=\"https://tarides.com/blog/2024-09-11-feature-parity-series-compaction-is-back/\">GC compaction</a>, <a href=\"https://tarides.com/blog/2024-08-21-how-tsan-makes-ocaml-better-data-races-caught-and-fixed/\">thread sanitizer</a>, <a href=\"https://github.com/ocaml/ocaml/pull/11418\">RISC-V</a> and\n<a href=\"https://github.com/ocaml/ocaml/pull/11712\">S390X</a> architecture support all had to be engineered back in. 
OCaml development\nin recent years has been very developer-intensive because of the need to not only\nreintroduce these features, but <em>also</em> keep up with the torrent of new features being introduced to OCaml 5.x.\nAll in all, I think it's been a very successful few years for the language to have kept steadily improving!</p>\n<h3 id=\"should-we-maintain-multiple-language-runtimes-in-ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#should-we-maintain-multiple-language-runtimes-in-ocaml\"></a>Should we maintain multiple language runtimes in OCaml?</h3>\n<p>However, Ron's talk was an excellent chance for us to reflect on what we might\nhave done differently with the benefit of hindsight. <a href=\"https://www.dra27.uk\">David Allsopp</a> posits that we should\nhave maintained the OCaml 4 and 5 runtimes simultaneously:</p>\n<blockquote>\n<p>One of the early ideas was to merge just the runtime changes as a separate\nruntime, leaving all the language changes to a subsequent update.  The main\nthing here would have been to upstream the immense changes to the allocator\nand garbage collector along with the domains and fibers machinery, while not\nyet exposing it.</p>\n<p>I remember the concern being that having essentially a runtime variant (not\nunlike the debug runtime) might lead to very slow uptake at actually testing\nit and possibly a maintenance burden.  i.e. we were concerned at maintaining\ntwo runtimes. This would probably have resulted in something like OCaml\n4.15.0, with an experimental official multicore-aware runtime.\n<cite>-- <a href=\"https://www.dra27.uk/blog/platform/2025/10/18/icfp-2025.html\">Reflections on ICFP 2025</a>, David Allsopp</cite></p>\n</blockquote>\n<p>I agree with this. Although we weren't sure in 2021 that it would be possible\nto have two simultaneous runtimes<sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote\">[3]</a></sup>, it was clear by the time we were\nengineering the 5.0 PR that this would be possible. 
Still, a few years to stabilise\nmulticore performance <em>vs</em> the decade-old 4.x runtime isn't bad going at all, so I don't have\ndeep regrets about the approach we did take!</p>\n<h3 id=\"we-need-continuous-continuous-performance-engineering\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#we-need-continuous-continuous-performance-engineering\"></a>We need continuous continuous performance engineering</h3>\n<p>The other area where all of us agreed more effort is necessary is continuous performance engineering.\nIn the runup to 2022, <a href=\"https://github.com/ctk21\">Tom Kelly</a>, <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and I set up <a href=\"https://github.com/ocaml-bench/sandmark\">Sandmark</a>, which\nwas a body of microbenchmarks and some macrobenchmarks. Tarides set up a <a href=\"https://discuss.ocaml.org/t/ann-sandmark-nightly-benchmarking-as-a-service/10174\">continuous benchmarking service</a> around this, and I hosted a bunch of <a href=\"https://github.com/ocaml-bench/ocaml_bench_scripts#notes-on-hardware-and-os-settings-for-linux-benchmarking\">carefully tuned machines</a> with specific BIOS settings in the Cambridge Computer Lab.</p>\n<p>Roll on six years, and the maintenance cost of this service becomes clear.\nIt's quite a bit of effort to maintain the old machines (one is running Ubuntu\n16.04! I'm not telling you which one) to keep consistency with previous results.\nNew machines all come with a proportional configuration effort, and their results\nhave to be interpreted. Operating systems have to be upgraded, and tuned afresh.\nContinuous benchmarking is itself a continuous process of engineering, and should be treated as such!</p>\n<p>Our discussions at ICFP centred around the idea that we should not only maintain this\nbenchmarking infrastructure, but add an incentive to <em>macro</em> projects to submit\nrepresentative tests of their performance. 
Rocq, Why3,\n<a href=\"https://semgrep.dev/blog/2025/upgrading-semgrep-from-ocaml-4-to-ocaml-5/\">Semgrep</a>,\nor Frama-C, for example, should all have test cases run within this\ninfrastructure that move beyond microbenchmarks to realistic performance\npatterns that will show up issues with GC pacing in a way that microbenchmarks\ndo not.</p>\n<p>The challenge with doing this so far has been the difficulty of getting many of these\nbig projects to compile on a random OCaml trunk snapshot (essential to test them against\na pre-release OCaml compiler). Solving this will take some thought (particularly around\nppx usage), but the effort seems worthwhile as we move into a new phase of OCaml 5\nengineering now that feature parity has been achieved. Stay tuned for more on this from\nTarides!</p>\n<h2 id=\"functional-networking-at-docker\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#functional-networking-at-docker\"></a>Functional Networking at Docker</h2>\n<p>I also had the opportunity to share both a retrospective on the work <a href=\"https://dave.recoil.org\">Dave Scott</a>\nand I have been doing on <a href=\"https://www.docker.com/products/docker-desktop/\">Docker Desktop</a> for some years, and\n<em>also</em> the efforts from new OCamlers like <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> on helping us\nto port some aging OCaml code over to OCaml 5.</p>\n<p>We got a paper accepted to ICFP on the topic, and so I had a lot of fun\n<a href=\"https://www.youtube.com/watch?v=j84ocjlj1JA&amp;t=12880s\">presenting</a>\n&quot;<a href=\"/papers/2025-docker-icfp\">Functional Networking for Millions of Docker Desktops</a>&quot; to the mainline ICFP audience!  
I first discussed the\npast; how we <a href=\"/notes/docker-buys-unikernel-systems\">joined Docker</a> and came up with <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">HyperKit and\nVPNKit</a> to solve\nscaling problems that Docker was facing early in its growth.</p>\n<p><a href=\"https://www.youtube.com/watch?v=j84ocjlj1JA&amp;t=12880s\"> <img src=\"/images/icfp-37.webp\" alt=\"%c\" title=\"Me on stage at ICFP in a verrrry cold venue\" > </a></p>\n<p>Our experience report makes the broad case for library operating systems and\nfunctional programming being a good fit, especially with strict languages like\nOCaml which offer thin interfaces to the OS.</p>\n<blockquote>\n<p>Our use of library-oriented programming to deliver Docker for Desktop is [...] a very\nuseful way to build the &quot;invisible systems glue&quot; code that is pervasively needed in many systems\nprogramming tasks.</p>\n<p>There are an ever-growing number of hardware and software interfaces to\naccess the outside world, most obviously with GPUs for machine learning workloads but also\nFPGAs and new storage and persistent memory devices. These usually require significant\nretrofitting to work with existing codebases, and so building translation adapters like VPNkit and\nusing library VMMs like Hyperkit will become more common in the future.\n<cite>-- <a href=\"https://doi.org/10.1145/3747525\">Functional Networking for Millions of Docker Desktops</a>, 2025</cite></p>\n</blockquote>\n<p>And looking into recent specifics, <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>'s <a href=\"https://github.com/moby/vpnkit/pull/646\">contributions</a> to VPNKit also allow\nthis codebase to move to OCaml 5, and take advantage of direct-style IO! 
In a nutshell, it lets\nold code like this:</p>\n<pre><code class=\"language-ocaml\">module Make_packet_proxy\n(I: Mirage_flow.S) (O: Mirage_flow.S) = struct\n let run incoming outgoing =\n  let rec loop () =\n   I.read incoming &gt;&gt;= function\n   | Error err -&gt; fail &quot;%a&quot; I.pp_error err\n   | Ok `Eof -&gt; Lwt.return_unit\n   | Ok (`Data buf) -&gt; begin\n      O.write outgoing buf &gt;&gt;= function\n      | Ok () -&gt; loop ()\n      | Error err -&gt; fail &quot;%a&quot; O.pp_error err\n     end\n  in loop ()\nend\n</code></pre>\n<p>...migrate to direct-style Eio code like this:</p>\n<pre><code class=\"language-ocaml\">module Proxy = struct\n let run incoming outgoing =\n  try\n    while true do\n      Eio.Flow.copy incoming outgoing\n    done\n  with\n  | End_of_file -&gt; ()\n  | Write_error err -&gt;\n      fail &quot;%a&quot; pp_write_error err\n  | Read_error err -&gt;\n      fail &quot;%a&quot; pp_read_error err\nend\n</code></pre>\n<p>The old code<sup id=\"fnref:4\"><a href=\"#fn:4\" class=\"footnote\">[4]</a></sup> had monadic concurrency, functors for parameterising the IO\ndrivers, and error handling duplicated across OCaml exceptions and the\nconcurrency monad. The new code uses direct control flow constructs like\n<code>while</code>, and just one form of error handling.  This is all still a\nwork-in-progress, but it is looking like a solid approach with no blockers except hacking\ntime to get it merged.</p>\n<p>Read the <a href=\"https://doi.org/10.1145/3747525\">ICFP paper</a> to learn more about\nthis, or <a href=\"/slides/icfp-docker-25.pdf\">browse my slides</a>!  It's exciting to see\nproduction code here get simpler <em>and</em> faster as we move to OCaml 5.  The reasons\nwhy this happens are explored further in my <a href=\"/notes/icfp25-post-posix\">VMIL keynote talk</a>\nthat I gave the next day, where I make a case for runtimes focussing on\npost-POSIX IO! 
And beyond that, we have <a href=\"/notes/icfp25-oxcaml\">OxCaml waiting in the wings</a>\nfor even more performance gains. OCaml is living in exciting times.</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>See also in the <a href=\"/notes/icfp25\">ICFP25</a> series: <a href=\"/notes/icfp25-propl\">chairing PROPL25</a>, the <a href=\"/notes/icfp25-oxcaml\">OxCaml tutorial</a>, <a href=\"/notes/icfp25-ocaml5-js-docker\">multicore at Jane Street and Docker</a>, <a href=\"/notes/icfp25-post-posix\">post-POSIX IO</a> and <a href=\"/notes/icfp25-what-i-learnt\">what I learnt</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>As far as I can tell, REBASE is SPLASH's equivalent of the\nvenerable <a href=\"http://cufp.org/\">CUFP</a> series that I helped run earlier in the\ncentury.</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:3\"><p><p>Our 2020 <a href=\"/papers/2020-icfp-retropar\">parallelism paper</a> proposed <em>two</em> minor GC\nstrategies, one of which broke the C FFI and so didn't make the cut in the end\ndue to the amount of ecosystem churn it would cause.</p>\n <a href=\"#fnref:3\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:4\"><p><p>A fun historical note is that I gave one of the <a href=\"/videos/dbd7546a-95d8-40af-b286-3cf930767682\">first talks</a> about VPNKit in Jane Street London about a decade ago! Back then we had a deep discussion about whether to use Lwt or Async, and it looks like we'll now meet again via <a href=\"/notes/icfp25-oxcaml\">OxCaml</a>.</p>\n <a href=\"#fnref:4\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy et al (2025). Functional Networking for Millions of Docker Desktops. <a href=\"https://doi.org/10.1145/3747525\" target=\"_blank\"><i>10.1145/3747525</i></a></li>\n<li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. 
<a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Sivaramakrishnan et al (2021). Retrofitting effect handlers onto OCaml. ACM. <a href=\"https://doi.org/10.1145/3453483.3454039\" target=\"_blank\"><i>10.1145/3453483.3454039</i></a></li>\n<li>Madhavapeddy (2025). It's time to go post-POSIX at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/mch1m-8a030\" target=\"_blank\"><i>10.59350/mch1m-8a030</i></a></li>\n<li>Madhavapeddy (2025). A Roundup of ICFP/SPLASH 2025 happenings. <a href=\"https://doi.org/10.59350/4jf5k-01n91\" target=\"_blank\"><i>10.59350/4jf5k-01n91</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Sivaramakrishnan et al (2020). Retrofitting parallelism onto OCaml. <a href=\"https://doi.org/10.1145/3408995\" target=\"_blank\"><i>10.1145/3408995</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/icfp25-ocaml5-js-docker",
      "title": "Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025",
      "summary": "Jane Street's production deployment of OCaml 5 and Docker's migration to direct-style programming with Eio presented at ICFP.",
      "date_published": "2025-10-07T00:00:00.000000Z",
      "date_modified": "2025-10-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "oxcaml",
        "ocaml",
        "programming",
        "docker",
        "multicore",
        "icfp"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-docker-icfp.pdf",
          "mime_type": "application/pdf",
          "title": "Functional Networking for Millions of Docker Desktops"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3747525",
          "doi": "10.1145/3747525",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3453483.3454039",
          "doi": "10.1145/3453483.3454039",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/mch1m-8a030",
          "doi": "10.59350/mch1m-8a030",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/4jf5k-01n91",
          "doi": "10.59350/4jf5k-01n91",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3408995",
          "doi": "10.1145/3408995",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/55bc5-x4p75",
      "content_html": "<p>This is part 2 of 5 of a <a href=\"/notes/icfp25\">series</a> of posts<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> about ICFP 2025.</p>\n<p>Several extensions to &quot;oxidize&quot; OCaml (Rust performancew with ML ergonomics!) have been developing rapidly\nin a <a href=\"https://github.com/oxcaml/oxcaml\">fork</a> called\n<a href=\"https://oxcaml.org\">OxCaml</a>. I helped an intrepid crew from Jane Street,\nIIT-M, Tarides, Brown and Cambridge pull together a really fun tutorial in ICFP\n2025 that you can try out too!  <strong>TL;DR:</strong> Work through the online\n<a href=\"https://gavinleroy.com/oxcaml-tutorial-icfp25/\">slides</a>, try the\n<a href=\"https://github.com/oxcaml/tutorial-icfp25\">activities</a>, and take the\n<a href=\"https://gavinleroy.com/oxcaml-icfp-activity/\">quiz</a> to give us feedback.</p>\n<p><a href=\"https://github.com/oxcaml/tutorial-icfp25\"> <img src=\"/images/oxcaml-codespace.webp\" alt=\"%c\" title=\"Just click on the tutorial repo to get an online environment\" > </a></p>\n<h2 id=\"where-oxcaml-came-from\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#where-oxcaml-came-from\"></a>Where OxCaml came from</h2>\n<p>If you watch <a href=\"https://github.com/yminsky\">Yaron Minsky</a>'s talk about <a href=\"/notes/icfp25-ocaml5-js-docker\">moving to OCaml 5</a>, towards the end he discusses the language\nextensions Jane Street needs to eke out every ounce of performance while\nwriting code in a safe, functional OCaml style. These extensions currently materialise in\na forked version of the mainline OCaml compiler.  
Earlier in the year, a bunch\nof us from Cambridge and Tarides <a href=\"https://tarides.com/blog/2025-07-09-introducing-jane-street-s-oxcaml-branch/\">helped</a>\nJane Street get the first public release out the door, and to set up opam\nrepositories and tutorials for this fork to be developed <a href=\"https://github.com/oxcaml/oxcaml\">in the open</a>.</p>\n<p><a href=\"https://blog.janestreet.com/introducing-oxcaml/\"> <img src=\"/images/oxcaml-release-summer25.webp\" alt=\"%c\" title=\"An unruly crowd shove an oxidising compiler out the door on a hot summer day\" > </a></p>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> <a href=\"https://bsky.app/profile/yminsky.bsky.social/post/3lrimpjimjs2w\">summarised</a> the OxCaml release thusly:</p>\n<blockquote>\n<p>OxCaml's extensions make OCaml a better language for performance engineering.\nIt also supports data race free parallel programming, and a bunch of other\ngoodies.</p>\n<p>OxCaml is in an interesting spot. It's an experimental language, whose\nextensions will change quickly and mercilessly.</p>\n<p>But it's also a production quality compiler. Indeed it's the compiler we use\nin production everyday.</p>\n<p>So why are we doing this? Our primary hope is to use this to build awareness\nabout our work, and to help pave the way for getting these extensions\nupstreamed to mainline OCaml.</p>\n<p>But we're also interested in building adoption among enthusiasts and\nresearchers who don't mind working with a language that is changing quickly\nunder their feet. 
We think there's a ton to learn from the collaboration.\n<cite>-- <a href=\"https://bsky.app/profile/yminsky.bsky.social/post/3lrimpjimjs2w\">Yaron Minsky on Bluesky</a>, June 2025</cite></p>\n</blockquote>\n<p>So the status in the summer of 2025 was that there existed an open source code\ndrop of OxCaml, but it had a lot of rough edges and it would take some effort\nto make it usable outside of the walls of Jane Street.</p>\n<h2 id=\"creating-an-oxcaml-tutorial-from-outside-jane-street\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#creating-an-oxcaml-tutorial-from-outside-jane-street\"></a>Creating an OxCaml tutorial from outside Jane Street</h2>\n<p>The language extensions in OxCaml fall into three broad categories of features that:</p>\n<ul>\n<li>are upstreamable to OCaml (like <a href=\"https://discuss.ocaml.org/t/ocaml-5-4-0-released/17365#p-73063-labelled-tuples-1\">labelled tuples</a> and <a href=\"https://tarides.com/blog/2025-10-10-ocaml-5-4-release-new-features-fixes-and-more/\">immutable arrays</a>)</li>\n<li>are still moving targets but candidates for upstreaming later (like <a href=\"https://www.dra27.uk/blog/platform/2025/10/18/icfp-2025.html\">local modes</a>)</li>\n<li>are Jane Street-specific extensions that are unlikely to ever make it upstream (like <a href=\"https://github.com/oxcaml/oxcaml/pull/4826\">block indices for direct record access</a>)</li>\n</ul>\n<p>Right now, all of these are developed in the OxCaml <a href=\"https://github.com/oxcaml/oxcaml\">monorepo</a> by a\ngrowing number of developers within Jane Street.  
Without access to the\ninternal Jane Street developer resources, <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and I started to work with\n<a href=\"https://richarde.dev/\">Richard Eisenberg</a> and <a href=\"https://tyconmismatch.com/code.html\">Chris Casinghino</a> to submit a tutorial proposal to ICFP; there's no\nbetter way to learn something than a deadline looming over our heads!</p>\n<p>Writing the tutorial proved much more difficult than I'd expected, as OxCaml\nhas a very <a href=\"https://oxcaml.org/documentation/\">large number</a> of extensions\n(such as modes and kinds) that are not only evolving fast, but also sometimes\ninteract differently in combination. It not only requires learning the compiler extensions,\nbut also keeping up with the <a href=\"https://github.com/oxcaml/opam-repository\">OxBase libraries</a> that\ncompose them into usable interfaces. <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and I struggled to pack in all the\ndifferent concepts; I spent a whole weekend just attempting to get a parallel quicksort\ncompiling!</p>\n<p>Luckily, <a href=\"https://richarde.dev/\">Richard Eisenberg</a> had the idea to bring in experts in programming language\npedagogy in the form of <a href=\"https://cs.brown.edu/~sk/\">Shriram Krishnamurthi</a>, <a href=\"https://willcrichton.net/\">Will Crichton</a> and <a href=\"https://gavinleroy.com/\">Gavin Gray</a> to save the day.\nGavin completely redesigned our slides to radically simplify the examples\n(beginning with a <code>gensym</code> rather than a sorting algorithm), and <em>also</em>\ndesigned a quiz to test user knowledge before and after. 
<a href=\"https://thenumb.at/\">Max Slater</a>, <a href=\"https://github.com/mgndv\">Megan Del Vecchio</a> and\n<a href=\"https://www.linkedin.com/in/nadia-razek\">Nadia Razek</a> from Jane Street also leapt in to give us the inside line on new\ndevelopments.</p>\n<p>This collaboration made for a nice split in efforts; <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a>, <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and I\nfocussed on the longer hands-on activity <a href=\"https://github.com/oxcaml/tutorial-icfp25/tree/main/handson_activity\">examples</a>, while Gavin finished the <a href=\"https://slipshow.readthedocs.io/en/stable/\">Slipshow</a>-based <a href=\"https://gavinleroy.com/oxcaml-tutorial-icfp25/\">slides</a> to deliver the presentation itself.  I got an <a href=\"https://github.com/oxcaml/tutorial-icfp25\">OxCaml GitHub DevContainer</a> working that allowed participants to spin up a full OxCaml environment in just a few minutes, so that as many people as possible could participate during the conference.</p>\n<h2 id=\"of-course-we-had-to-release-a-new-compiler-the-night-before-right\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#of-course-we-had-to-release-a-new-compiler-the-night-before-right\"></a>Of course we had to release a new compiler the night before... right?</h2>\n<p>Some excitement ensued when we realised that there hadn't been a public release\nof the OxCaml packages since our first public release earlier in the year! Meanwhile,\nhundreds of improvements had accumulated upstream, including a number of significant\ninterface and type system changes. 
It seemed a little regressive\nto present an out-of-date version of the tutorial to a demanding ICFP audience.</p>\n<p>So, in a late-night call after we arrived in Singapore, with <a href=\"https://icfp25.sigplan.org/profile/dianakalinichenko\">Diana Kalinichenko</a> working\ntirelessly on compilation fixes from New York, we refreshed all the tutorial\nexamples and <a href=\"https://github.com/oxcaml/opam-repository/pull/18\">released the latest minus19</a> version of the\ncompiler with four months of development! We fixed the compilation\nproblems that resulted in our slides, and the new Devcontainer images finally rebuilt\naround 10 seconds before the tutorial started. No sweat.</p>\n<p><img src=\"/images/icfp-4.webp\" alt=\"%c\" title=\"Gavin, KC and me look to the heavens for inspiration to get the tutorial working\" ></p>\n<h2 id=\"the-tutorial-day-arrives\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-tutorial-day-arrives\"></a>The tutorial day arrives</h2>\n<p>The tutorial itself, held at NUS, went fantastically! Both sessions were completely full,\nwith participants online as well (Edwin Torok in particular gave all of it a thorough\nspin on Discord, thanks!).</p>\n<p>The broad sweep of audience feedback centred around stuff like this:</p>\n<ul>\n<li>The interaction between portable and non-portable functions, which stemmed from a confusion around the fact that Base functions are annotated with modes, but the OCaml stdlib is not. The answer right now is to always use Base with OxCaml.</li>\n<li>Whether exclaves can be used to allocate in the caller's caller's region or not. Exclaves are transitive, which lets you build this.</li>\n<li>What stops the Capsule API from leaking the access keys to outside the interface? The answer is that other modes (like local) work together to give safety to this aspect of the interface. 
It's difficult to use only one of the mode axes in isolation in a real interface.</li>\n<li>Are OxCaml annotations erasable so that the programs are runnable using OCaml? The answer is &quot;mostly erasable&quot;.</li>\n<li>Are <code>local</code> and <code>stack_</code> inferred? The compiler does this analysis by default and will locally allocate when possible, but it wasn't clear to the tutorial attendees that this is the case.</li>\n</ul>\n<p><img src=\"/images/icfp-1.webp\" alt=\"%c\" title=\"Full tutorial room at the NUS computing department with an engaged audience!\" ></p>\n<h2 id=\"should-you-use-oxcaml-right-now\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#should-you-use-oxcaml-right-now\"></a>Should you use OxCaml right now?</h2>\n<p>The answer depends on whether you want to get into compiler and language design or\nnot!  The number of mode axes in the OxCaml compiler is growing rapidly as\nmore use cases are covered, so we'll clearly need to develop this tutorial\nfurther, and the &quot;final form&quot; of OxCaml is by no means stable yet. Jane Street\niterates quickly on language changes since they control all the code in their monorepo\nthat uses it, and have extensive production engineering infrastructure. From the outside,\nit'll be hard to keep a large codebase in sync... for now.</p>\n<p>My high-level takeaway from discussions with the developers is that there's\nanother 12-24 months of active language evolution left, so it'll continue to be\na moving target for a while. But some features like locals have been around for\nlonger than other features and are more stable. Knowing this\nreally helps to plan out how I'm going to use OxCaml &quot;from the outside&quot;.</p>\n<p>It's also reassuring to see that Jane Street is serious about investing in\neducation resources for the language as well. 
<a href=\"https://github.com/yminsky\">Yaron Minsky</a> <a href=\"https://bsky.app/profile/yminsky.bsky.social/post/3m3tmp56yzc2k\">posts</a>:</p>\n<blockquote>\n<p>We've had an exciting couple of weeks full of opportunities to teach people about the exciting (and mildly bewildering) features of OxCaml.\nAnd...we're looking to hire an experienced educator to help us in this work. Please share this with anyone you think might be a good fit!\n<cite>-- <a href=\"https://bsky.app/profile/yminsky.bsky.social/post/3m3tmp56yzc2k\">Yaron Minsky on Bluesky</a>, Oct 2025</cite></p>\n</blockquote>\n<p><img src=\"/images/icfp-3.webp\" alt=\"%c\" title=\"There was a healthy contingent of Jane Street OxCaml developers to answer questions as well.\" ></p>\n<p>If you want to have a go at the tutorial and quiz yourself, then it's all still\nopen for participation! Follow the\n<a href=\"https://gavinleroy.com/oxcaml-tutorial-icfp25/\">slides</a> and then take the\n<a href=\"https://gavinleroy.com/oxcaml-icfp-activity/\">quiz</a>. And most importantly,\nshare your improbable stunts online so we can see what's going on. I'm hacking on an\n<a href=\"/notes/icfp25-post-posix\">io_uring oxhttpserver</a> myself, and I heard rumours that <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a>\nhas been peering into the eBPF sources...</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>See also in the <a href=\"/notes/icfp25\">ICFP25</a> series: <a href=\"/notes/icfp25-propl\">chairing PROPL25</a>, the <a href=\"/notes/icfp25-oxcaml\">OxCaml tutorial</a>, <a href=\"/notes/icfp25-ocaml5-js-docker\">multicore at Jane Street and Docker</a>, <a href=\"/notes/icfp25-post-posix\">post-POSIX IO</a> and <a href=\"/notes/icfp25-what-i-learnt\">what I learnt</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. 
<a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Madhavapeddy (2025). It's time to go post-POSIX at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/mch1m-8a030\" target=\"_blank\"><i>10.59350/mch1m-8a030</i></a></li>\n<li>Madhavapeddy (2025). A Roundup of ICFP/SPLASH 2025 happenings. <a href=\"https://doi.org/10.59350/4jf5k-01n91\" target=\"_blank\"><i>10.59350/4jf5k-01n91</i></a></li>\n<li>Madhavapeddy (2025). Programming for the Planet at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/hasmq-vj807\" target=\"_blank\"><i>10.59350/hasmq-vj807</i></a></li>\n<li>Madhavapeddy (2025). Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/3jkaq-d3398\" target=\"_blank\"><i>10.59350/3jkaq-d3398</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/icfp25-oxcaml",
      "title": "Holding an OxCaml tutorial at ICFP/SPLASH 2025",
      "summary": "Tutorial at ICFP 2025 on OxCaml extensions for performance engineering with modes and locals.",
      "date_published": "2025-10-06T00:00:00.000000Z",
      "date_modified": "2025-10-06T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "oxcaml",
        "ocaml",
        "tutorial",
        "programming",
        "icfp"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/mch1m-8a030",
          "doi": "10.59350/mch1m-8a030",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/4jf5k-01n91",
          "doi": "10.59350/4jf5k-01n91",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/hasmq-vj807",
          "doi": "10.59350/hasmq-vj807",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/3jkaq-d3398",
          "doi": "10.59350/3jkaq-d3398",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/hasmq-vj807",
      "content_html": "<p>This is part 1 of 5 of a <a href=\"/notes/icfp25\">series</a> of posts<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> about ICFP 2025.</p>\n<p>The <a href=\"https://popl24.sigplan.org/home/propl-2024\">first outing</a> of PROPL was\nlast year in London, and this time around <a href=\"https://dorchard.github.io\">Dominic Orchard</a> and I invited <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> to be\nthe PC chair and <a href=\"/notes/propl-at-splash\">held it at ICFP/SPLASH</a>. The uptake was\nencouraging, and we got enough submissions to have a proper <a href=\"https://dl.acm.org/doi/proceedings/10.1145/3759536\">published\nproceedings</a> in the ACM\nDigital Library for the first time! Our <a href=\"/papers/2025-propl\">proceedings summary</a>\nis a quick read to give you an idea of the breadth of the papers and talks this year.</p>\n<p><a href=\"https://dl.acm.org/action/showFmPdf?doi=10.1145%2F3759536\"> <img src=\"/images/propl25-header.webp\" alt=\"%c\" title=\"A summary of the 6 papers, 9 talks and provocations that appeared.\" > </a></p>\n<p>The workshop itself had slightly less in-person attendance than last year, but\nthis was due to the heavily multi-track structure of ICFP. We had less <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#program\">total\ntime</a> than\nlast year as the morning slot was taken up by the ICFP keynote (the\nawesome <a href=\"https://raintown.org\">Satnam Singh</a>), and with six parallel sessions people were filtering in and\nout of all the events. The online attendance made up for it, with quite a few\npeople tracking the <a href=\"https://www.youtube.com/watch?v=IIRJeleXeuU\">SIGPLAN live\nstream</a>.</p>\n<p>The <a href=\"https://dl.acm.org/doi/proceedings/10.1145/3759536\">papers</a> were exactly\nwhat I'd dreamed would happen -- a variety of practitioners describing their\ncomputational challenges mixed together with solutions. 
I'll summarise the day's\nproceedings next!</p>\n<p><a href=\"/slides/propl25-intro.pdf\"> <img src=\"/images/icfp-6.webp\" alt=\"%c\" title=\"Dominic Orchard opening the PROPL25 workshop\" > </a></p>\n<h2 id=\"computational-challenges\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#computational-challenges\"></a>Computational challenges</h2>\n<p>The first batch of talks I'll cover were about how to specify some core\ncomputational models related to climate and biodiversity science.</p>\n<p>Firstly, climate modelling was up with Chinmayi Prabhu\n<a href=\"https://www.youtube.com/watch?v=IIRJeleXeuU&amp;t=18092s\">presenting</a> a paper on\n<a href=\"https://dl.acm.org/doi/10.1145/3759536.3763801\">climate model coupler verification</a> that discussed\nthe difficulty of folding multiple global climate models into combined ones,\nsomething that is normally done via (underspecified and somewhat black magic)\n<a href=\"https://www.metoffice.gov.uk/research/climate/understanding-climate/coupled-modelling\">coupler</a>\ncomponents. Chinmayi had found some bugs in production coupler code, and\ndescribed a hybrid verification strategy that used both static and runtime\ntechniques to improve the state of affairs.</p>\n<blockquote>\n<p>The continuous exchange of data through couplers creates the risk of subtle\nerrors propagating across components, potentially distorting scientific\nconclusions. 
In this paper, we argue for lightweight formal verification\ntechniques applied at the coupler interface to improve both coupler and model\ncorrectness.\n<cite>-- <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763801\">Towards Modelling and Verification of Coupler Behaviour in Climate Models</a></cite></p>\n</blockquote>\n<p><a href=\"https://www.youtube.com/watch?v=IIRJeleXeuU&t=18092s\"> <img src=\"/images/icfp-11.webp\" alt=\"%c\" title=\"Chinmayi Prabhu Baramashetru presenting her work on climate coupler verification\" > </a></p>\n<p>Then we <a href=\"https://youtu.be/IIRJeleXeuU?t=12711\">heard</a> about <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763805\">hydrology modeling using GPUs</a> over in India, where\nalgorithms to trace the path of surface water flows (e.g. flow accumulation,\nwatershed delineation or runoff simulation) are hard to execute for large areas\nat reasonably fine spatial and temporal resolutions.</p>\n<blockquote>\n<p>Libraries like GDAL that use multi-threaded CPU-based implementations running\non a single host may be slow, and distributed infrastructures like Google\nEarth Engine may not support the kind of computational primitives required by\nthese algorithms.</p>\n<p>We have developed a GPU-accelerated framework that\nre-engineers these four algorithms and is able to process areas as large as\nriver basins of 250,000 km2 on commodity GPU workstations.\n<cite>-- <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763805\">GPU-Accelerated Hydrology Algorithms for On-Prem Computation</a></cite></p>\n</blockquote>\n<p><a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> explained not only the basics of flow algorithms, but why it's\nessential to GPU accelerate them to get a reasonable spatial scale and\nresolution. 
They ended up with a set of small parallelizable primitives that are either\npixel independent, or a sequence of 'long pixel' operations that need more\ncoordination.</p>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=12711\"> <img src=\"/images/icfp-13.webp\" alt=\"%c\" title=\"Aadi Seth presenting his work on hydrology modeling in India\" > </a></p>\n<p>Continuing on in the previous theme of novel computation models, <a href=\"https://mynameismwd.org\">Michael Dales</a> presented &quot;<a href=\"/papers/2025-yirgacheffe\">Yirgacheffe: A Declarative Approach to Geospatial Data</a>&quot;,\na new <a href=\"https://yirgacheffe.org\">Python library</a> that\nallows spatial algorithms to be implemented concisely and supports parallel\nexecution and resource management. <em>(<a href=\"https://digitalflapjack.com/weeknotes/2025-10-13/\">read more...</a>)</em></p>\n<p>Michael used our work on the <a href=\"/papers/2024-life\">LIFE biodiversity metric</a> as the\nmotivating usecase <a href=\"https://digitalflapjack.com/weeknotes/2025-10-13/\">for his demo</a>, as we have petabytes\nof rasters processed using Yirgacheffe.  
He also set up a nice <a href=\"https://yirgacheffe.org\">new\nhomepage</a> for the project and used PROPL to launch it,\nincluding <a href=\"https://digitalflapjack.com/blog/marimo/\">adopting</a> a shiny new\nexecutable notebook called <a href=\"https://marimo.io\">Marimo</a> which I liked the look\nof!</p>\n<p><a href=\"https://www.youtube.com/live/IIRJeleXeuU?t=23927s\"> <img src=\"/images/icfp-36.webp\" alt=\"%c\" title=\"Michael Dales showing the LIFE biodiversity map calculated using Yirgacheffe\" > </a></p>\n<h2 id=\"geospatial-data-management\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#geospatial-data-management\"></a>Geospatial data management</h2>\n<p>Switching tack from computation to managing large-scale datasets, we had a\nnumber of papers discussing how to orchestrate these both for full execution\nbut also in a developer friendly way for local use.</p>\n<p>The first extremely ambitious <a href=\"https://youtu.be/IIRJeleXeuU?t=4536\">talk</a> was from <a href=\"https://orcid.org/0009-0007-3826-1125\">Jean-Michel Lord</a> who\npresented <a href=\"/papers/2025-programming-gbon\">&quot;Programming Opportunities for the Global Biodiversity Observation Network&quot;</a>. GEOBON is a <a href=\"https://earthobservations.org/groups/geo-biodiversity-observation-network\">global network</a> of researchers dedicated to improving the acquisition, coordination and delivery of biodiversity information at a global scale. 
This PROPL collaboration came up after I met <a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> at the\n<a href=\"/notes/nas-rs-biodiversity\">NAS</a> earlier in the year and learnt about <a href=\"https://boninabox.geobon.org/\">BON-in-a-Box</a>, which uses <a href=\"https://www.tunbury.org/2025/07/02/bon-in-a-box/\">Docker under the hood</a>!</p>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=4536\"> <img src=\"/images/icfp-8.webp\" alt=\"%c\" title=\"Jean-Michel Lord illustrates the technology stack behind BON-in-a-Box\" > </a></p>\n<p>Jean-Michel described how off-the-shelf software could (almost) be enough to integrate the world's biodiversity dataset pipelines, but needed some help from their maintainers. Most notably, BON-in-a-Box facilitates peer review of <em>computation pipelines</em> (as opposed to the science underpinning them), which is the first time I've seen peer review applied to scientific code. This &quot;connecting the dots&quot; across diverse biodiversity datasets is vital to building a comprehensive model of life on this planet, and computer science is a crucial piece of the puzzle to make sense of all the data. <a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> also just announced that all the major players are getting together later this month at the <a href=\"https://www.livingdata2025.com/\">Living Data 2025</a> conference in Colombia, further underlining how timely getting <a href=\"https://www.linkedin.com/posts/andrew-gonzalez-23589146_livingdata2025-boninabox-biodiversitymonitoring-activity-7386243770066817024-SV57\">involved</a> with BON-in-a-Box is.</p>\n<p><a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> had a second paper on the challenge of building <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763803\">spatio-temporal catalogue dataflow\ngraphs</a>. 
These\n<a href=\"https://stacspec.org/en/\">catalogues</a> are really important in environmental\nscience to <a href=\"/notes/nas-rs-biodiversity\">connect the dots</a> in the often &quot;gappy&quot; field\ndatasets. The STACD extension proposes dataflow pipelines to help compose\nthese together via YAML specifications that describe dependencies between the\n(large) datasets involved.</p>\n<blockquote>\n<p>We propose STACD (STAC extension with DAGs), an extension to STAC\nspecifications that incorporates Directed Acyclic Graph (DAG) representations\nalong with defining algorithms and version changes in the workflows. We also\nprovide a reference implementation on Apache Airflow to demonstrate STACD\ncapabilities such as selective recomputation when some datasets or algorithms\nin a DAG are updated, complete lineage construction for a dataset, and\nopportunities for improved collaboration and distributed processing that\narise with this standard.\n<cite>-- <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763803\">STAC Extension with DAGs</a></cite></p>\n</blockquote>\n<p>This one really reminded me of the work I did ages ago with <a href=\"https://research.google/people/derekmurray/?&amp;type=google\">Derek G Murray</a> and <a href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a>\non the <a href=\"/papers/2011-nsdi-ciel\">CIEL execution engine</a>. Dynamic dataflow graphs seem\nto come up again and again as the beating heart of coordination languages, and\nwe've made surprisingly little progress in making them more ergonomic. 
Some\ncombination of Docker and <a href=\"https://simon.peytonjones.org/assets/pdfs/build-systems-jfp.pdf\">Build Systems à la Carte</a> would\ngo a long way here!</p>\n<p>The third <a href=\"https://www.youtube.com/watch?v=IIRJeleXeuU&amp;t=6053s\">talk</a> was from the <a href=\"https://catalysts.org\">Catalysts Foundation</a>\nfrom India, and highlighted the importance of data science for promoting health\nand wellbeing in some of the most vulnerable rural communities, who face\nrisks of increasing intensity due to climate change.</p>\n<blockquote>\n<p>Climate change presents multifaceted public health challenges, from\nheat-related mortality and vector-borne disease expansion to water\ncontamination and respiratory ailments. The <a href=\"https://doi.org/10.1016/S0140-6736%2822%2901540-9\">2022 Lancet Countdown Report</a>\ndemonstrates a host of health effects of climate change ranging from\nheat-related illness and mortality to the spread of vector-borne and\nwater-borne pathogens, to rising food insecurity as cropping patterns change.\nCurrent public health systems lack integrated, real-time data capabilities to\nidentify vulnerable populations and coordinate timely responses to these\nclimate-induced health threats, particularly in resource-constrained\nsettings.\n<cite>-- <a href=\"https://conf.researchr.org/details/icfp-splash-2025/propl-2025-papers/9/Precision-Action-Towards-Climate-and-Health-PATCH-\">Precision Action Towards Climate and Health (PATCH)</a></cite></p>\n</blockquote>\n<p><a href=\"https://www.youtube.com/watch?v=IIRJeleXeuU&t=6053s\"> <img src=\"/images/icfp-9.webp\" alt=\"%c\" title=\"Prerak Shah shows the increasing risks faced by rural Indian communities\" > </a></p>\n<p>Prerak talked about the difficulty of combining geospatial data with machine\nlearning inference, and keeping track of the resulting outputs in a systematic\nway. 
What I found particularly interesting about their &quot;PATCH&quot; system is that it not\nonly has core computing facilities (a health reporting platform, a spatial counterfactual\nmap for interventions and a communications channel for different stakeholders), but they\nalso extensively partner with local state governments in India (like the <a href=\"https://en.wikipedia.org/wiki/India_Meteorological_Department\">India Meteorological Department</a> and the <a href=\"https://www.aiims.edu/index.php/en\">All India Institutes of Medical Sciences</a>).</p>\n<p><a href=\"https://patrick.sirref.org\">Patrick Ferris</a> formalised many of the above problems with his thoughtful <a href=\"https://youtu.be/IIRJeleXeuU?t=15374\">seminar</a> on &quot;what we talk about when we talk about scientific programming&quot;. Patrick came up with three axes along which the scientific method of hypothesis falsification needs to operate when dealing with computational data: dynamism, scale and access controls. If you'd like to read more about Patrick's thoughts on this topic, his paper last year on &quot;<a href=\"/papers/2024-uncertainty-cs\">Uncertainty at scale: how CS hinders climate research</a>&quot; is also well worth a read.</p>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=15374\"> <img src=\"/images/icfp-31.webp\" alt=\"%c\" title=\"Patrick Ferris discusses difficult usecases for geospatial scientific computing\" > </a></p>\n<p>All these talks highlighted the difficulty of managing large and often very messy datasets in practice. So how do we move towards more principled platforms to fix the situation? 
That's what the next group of talks covered!</p>\n<h2 id=\"towards-a-giant-planetary-wiki-of-code\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#towards-a-giant-planetary-wiki-of-code\"></a>Towards a giant planetary wiki of code</h2>\n<p>By far my favourite aspect of PROPL was the sheer ambition on display when it\ncomes to leveraging the network effects around computer technology to\naccelerate the pace of environmental action. The next batch of papers is all\nabout evolving notebooks to be global scale!</p>\n<p><a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> led the charge with his <a href=\"https://youtu.be/IIRJeleXeuU?t=16533\">call</a> to action for &quot;<a href=\"/papers/2025-fairground\">A FAIR Case for a Live Computational Commons</a>&quot;, which\nhe described as a live planetary wiki! What is needed to turn something like\nWikipedia into a live, updated computation graph with incremental, real-time\nvisualisation? This would effectively be one giant program running and being\nupdated by tens of thousands of contributors in real time.</p>\n<blockquote>\n<p>This paper proposes Fairground, a computational commons designed as a\ncollaborative notebook system where thousands of scientific artifacts are\nauthored, collected, and maintained together in executable form in a manner\nthat is <a href=\"https://en.wikipedia.org/wiki/FAIR_data\">FAIR</a>, reproducible, and\nlive by default. 
Unlike existing platforms, Fairground notebooks can\nreference each other as libraries, forming a single planetary-scale live\nprogram executed by a distributed scheduler.\n<cite>-- <a href=\"https://anil.recoil.org/papers/2025-fairground.pdf\">A FAIR Case for a Live Computational Commons</a>, Omar et al 2025</cite></p>\n</blockquote>\n<p>Many of the answers to how to do this lay in the talks at this week's\nICFP/SPLASH: programming languages with clean semantics for incremental\ncompilation, purely functional with effect tracking, and mergeable semantics\nfor managing large scale data structures. Cyrus' own <a href=\"https://hazel.org\">Hazel</a>\nlanguage is a perfect example of an ergonomic interactive language that has\nclean, functional semantics while retaining usability.</p>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=16533\"> <img src=\"/images/icfp-32.webp\" alt=\"%c\" title=\"Cyrus Omar lays out his vision for a FAIR planetary wiki\" > </a></p>\n<p><a href=\"https://dynamicaspects.org/research/\">Roly Perera</a> then switched tack and\n<a href=\"https://youtu.be/IIRJeleXeuU?t=13998\">discussed</a> how we might achieve more\ntransparent climate reporting, via a new class of &quot;transparent programming\nlanguages&quot; like his own <a href=\"https://f.luid.org/\">Fluid</a> project.</p>\n<blockquote>\n<p>With traditional print media, the figures, text and other content are\ndisconnected from the underlying data, making them hard to understand,\nevaluate and trust. 
Digital media, such as online papers and articles,\npresent an opportunity to make visual artifacts which are connected to data\nand able to reveal those fine-grained relationships to an interested user.\nThis would enable research outputs, news articles and other data-driven\nartifacts to be more transparent, self-explanatory and explorable.\n<cite>-- <a href=\"https://f.luid.org/\">fluid, explorable, self-explanatory research outputs</a></cite></p>\n</blockquote>\n<p>Roly showed in his talk how the latest advances in Fluid helped to automate\nproviding &quot;drill down&quot; explanations for topics in the energy transition and\ndecarbonisation, adaptation to climate change, or risk mitigation strategies\nthat required policy changes justified by data.</p>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=13998\"> <img src=\"/images/icfp-30.webp\" alt=\"%c\" title=\"Roly Perera presents his Fluid language\" > </a></p>\n<p>The <a href=\"https://f.luid.org/\">Fluid website</a> has loads of interactive examples\nfor you to explore, so continue on there if interested in this topic.</p>\n<p><a href=\"https://www.gla.ac.uk/schools/computing/staff/cristianurlea/\">Cristian Urlea</a> then discussed the crucial problem of <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763804\">Bridging Disciplinary Gaps in\nClimate Research</a>, since the\npoint of all these planetary computing systems is to make them accessible to\nnon-computer-science-expert users (the &quot;<a href=\"https://dl.acm.org/doi/pdf/10.1145/3480947\">vernacular programmers</a>&quot;).</p>\n<blockquote>\n<p>Current scientific computing practices pose major barriers to entry,\nparticularly for interdisciplinary researchers and those in low and\nmiddle-income countries (LMICs). Challenges include steep learning curves,\nlimited access to expert support, and difficulties with legacy or\nunder-documented software. 
Drawing on real-world experiences, we identify\nrecurring obstacles in the usability, accessibility, and sustainability of\nscientific software.\n<cite>-- <a href=\"https://dl.acm.org/doi/10.1145/3759536.3763804\">Bridging Disciplinary Gaps in Climate Research</a>, 2025</cite></p>\n</blockquote>\n<p>I greatly enjoyed the framing of this paper of &quot;reimagining scientific\nprogramming as a shared public good&quot;, a point also made by Roberto Di Cosmo\nrecently in his <a href=\"https://www.nature.com/articles/d41586-025-03196-0\">Nature comment</a> that we must stop\ntreating code like an afterthought, and instead record, share and value it.</p>\n<h2 id=\"onto-programming-the-planet\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#onto-programming-the-planet\"></a>Onto Programming the Planet!</h2>\n<p>Once we have all these planetary scale notebooks, what sorts of new programs\nmight we run on them? The last group of talks covered some radically different\nideas here, and I was involved with all of them!</p>\n<p>First, <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://www.cst.cam.ac.uk/people/ray25\">Robin Young</a> <a href=\"https://youtu.be/IIRJeleXeuU?t=21133\">presented</a> our own <a href=\"https://github.com/ucam-eo/geotessera\">TESSERA</a> project, which is a new geospatial foundation model for programming with observations of the planet from space. This is pretty scifi stuff!</p>\n<blockquote>\n<p>Remote sensing observations from satellites are critical for scientists to\nunderstand how our world is changing in the face of climate change,\nbiodiversity loss, and desertification. However, working directly with this\ndata is difficult. For any given satellite constellation, there are a\nmultitude of processed products, data volume is considerable, and for optical\nimagery, users must contend with data sparsity due to cloud cover. 
This\ncomplexity creates a significant barrier for domain experts who are not\nspecialists.</p>\n<p>Pre-trained, self-supervised foundation models such as <a href=\"https://arxiv.org/abs/2506.20380\">TESSERA</a> aim to solve this by offering pre-computed\nglobal embeddings. These rich embeddings can be used in-place of raw remote\nsensing data in a powerful “embedding-as-data” approach. For example, a\nsingle 128-dimensional TESSERA embedding for a 10-meter point on Earth can\nsubstitute for an entire year of optical and radar imagery, representing its\ntemporal and spectral characteristics. While this could democratise access to\nadvanced remote sensing-derived analytics, it also creates a new programming\nchallenge: a lack of tools designed for this new approach.</p>\n<p><cite>-- <a href=\"https://conf.researchr.org/details/icfp-splash-2025/propl-2025-papers/14/Challenges-in-Practice-Building-a-Usable-Library-for-Planetary-Scale-Embeddings\">Building a Usable Library for Planetary-Scale Embeddings</a>, Jaffer 2025</cite></p>\n</blockquote>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=21133\"> <img src=\"/images/icfp-12.webp\" alt=\"%c\" title=\"Sadiq Jaffer shows how to detect all the solar farms on the planet using TESSERA\" > </a></p>\n<p>Sadiq did a live demo by demonstrating the\n<a href=\"https://github.com/ucam-eo/geotessera\">Geotessera</a> Python library I've been\n<a href=\"/notes/geotessera-python\">hacking on</a>. 
He went one step further and coded up a\nsimple classifier that found most of the solar farms on the planet, with a demo\nthat ran in minutes on his laptop!</p>\n<p>After this, <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> <a href=\"https://youtu.be/IIRJeleXeuU?t=25009\">showed</a> how he's been using airborne data to uncover socioeconomic stratification of urban nature in England, all via a pipeline that combines all sorts of geospatial algorithms to distill metrics about modern urban life.</p>\n<blockquote>\n<p>Using high-resolution LiDAR (Vegetation Object Model), Sentinel 2 imagery,\nand open geospatial datasets for over 28 million buildings across England, we\nintegrate raster, vector, and socioeconomic data within a scalable\ncomputational framework. Tree segmentation was performed using adaptive\nlocal-maximum filtering, canopy cover estimated at 1 m resolution, and park\naccessibility derived from network-based walking distances.</p>\n<p>Inequality in access to nature was quantified via Gini coefficients and\nmodelled with spatial error regressions against socioeconomic deprivation.\nOur results reveal that while most urban areas meet the 3-tree proximity\nrule, fewer than 3% achieve 30% canopy cover, and only a minority satisfy all\nthree components simultaneously.\n<cite>-- <a href=\"/papers/2025-england-nature\">Airborne assessment uncovers socioeconomic stratification of urban nature in England</a></cite></p>\n</blockquote>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=25009\"> <img src=\"/images/icfp-14.webp\" alt=\"%c\" title=\"Andres sketches out the 3-30-300 rule of urban nature access\" > </a></p>\n<p>You can read more about this in his <a href=\"/papers/2025-england-nature\">preprint</a> that\nexplains in detail the programming pipeline (including route finding across\ntree canopies across the whole of England), and the bigger picture behind\n<a href=\"/ideas/urban-vegetation\">3-30-300 mapping</a>.</p>\n<p>And last but 
definitely not least, <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> <a href=\"https://youtu.be/IIRJeleXeuU?t=22918\">presented</a> possibly the most topical\ntalk for a programming language venue: repurposing the classic Robin Milner\n<a href=\"https://en.wikipedia.org/wiki/Bigraph\">bigraphs</a> for environmental goodness!\nRyan (and <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a>) observe that existing distributed computing models poorly\ncapture spatial structure, hindering dynamic collaboration and access control.\nThey argue that space must be treated as a first-class concept in\nprogramming models, and use this to better coordinate the complex sensor\nsystems we need for environmental science.</p>\n<p><a href=\"https://youtu.be/IIRJeleXeuU?t=22918\"> <img src=\"/images/icfp-35.webp\" alt=\"%c\" title=\"Ryan Gibb goes full bigraph using Roy Ang's OSM translation!\" > </a></p>\n<p>It was also wonderful to see <a href=\"mailto:ra652@cam.ac.uk\">Roy Ang</a> attend the workshop, as it was his <a href=\"/ideas/bigraphs-real-world\">Part II project</a> which imported\n<a href=\"https://openstreetmap.org\">OpenStreetMap</a> into <a href=\"https://uog-bigraph.bitbucket.io/\">BigraphER</a> that sparked off the idea in the first place.\nYou may also be interested in our preprint &quot;<a href=\"/papers/2025-bifrost\">An Architecture for Spatial Networking</a>&quot; that explores this concept of spatial networking in more detail.</p>\n<h1 id=\"reflections-on-the-2nd-propl\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reflections-on-the-2nd-propl\"></a>Reflections on the 2nd PROPL</h1>\n<p>As always, the corridor track of discussions after the conference was the most valuable part of attending PROPL. 
We had the opportunity to put up some posters during the main banquet session, and it was busy!</p>\n<p><div class=\"video-center\"><iframe title=\"PROPL poster session at ICFP/SPLASH banquet\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/83e457b1-76d0-4e7e-b65f-be35544750c7\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>One thing that leapt out at me from the discussions was the need for a <em>hosted</em> service with the ergonomics of Docker, the interactive flexibility of Jupyter, the peer community of Wikipedia, and the semantic cleanliness of Hazel. This is overwhelmingly difficult to do in a top-down manner, but we do now have a growing community of practitioners and computer scientists who share a vision of making this happen. I had long conversations with most of the attendees of PROPL about how we might make this a reality, and my plan is to spend a good chunk of my sabbatical time this year hacking on this.</p>\n<p>The other really energizing thing was seeing all the side hacking going on. I\nspotted <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> porting all the Python Geotessera code over to OCaml in a\nskunkworks effort of which I heartily approve. I looked over the Hazel UI with\n<a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> to figure out what a planetary view might look like. <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> figured\nout spatial datastructures for the CoRE stack with <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> while we\nstood around in the blazing Singapore heat. 
<a href=\"https://cseweb.ucsd.edu/~mcoblenz/\">Michael Coblenz</a> led a <a href=\"https://conf.researchr.org/details/icfp-splash-2025/hatra-2025-papers/8/Discussion\">session at\nHATRA</a>\non user-centric approaches to types and reasoning assistants.</p>\n<p>All of these strands could weave together powerfully into ground-up systems that\nsolve the problems we've been defining at PROPL! I'm feeling\n<a href=\"https://youtu.be/IIRJeleXeuU?t=26691\">energized</a> and tired, and look forward\nto continuing the discussions started in London (2024) and Singapore (2025)\ninto real systems in 2026!</p>\n<p><img src=\"/images/icfp-33.webp\" alt=\"%c\" title=\"Ilya Sergey had the genius idea of having a fridge magnet booth\nat the reception, which the organising committee for PROPL took full advantage of. See my fridge for more.\" ></p>\n<p><img src=\"/images/icfp-34.webp\" alt=\"%c\" title=\"As a metaphor about the race we are running to mitigate the worst effects of climate change using computer science, I left Sadiq in the dust at the Skyline Luge. We were both outclassed soundly by a bunch of ten year olds who had just finished school, however.\" ></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>See also in the <a href=\"/notes/icfp25\">ICFP25</a> series: <a href=\"/notes/icfp25-propl\">chairing PROPL25</a>, the <a href=\"/notes/icfp25-oxcaml\">OxCaml tutorial</a>, <a href=\"/notes/icfp25-ocaml5-js-docker\">multicore at Jane Street and Docker</a>, <a href=\"/notes/icfp25-post-posix\">post-POSIX IO</a> and <a href=\"/notes/icfp25-what-i-learnt\">what I learnt</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Madhavapeddy (2025). 
What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Dales et al (2025). Yirgacheffe: A Declarative Approach to Geospatial Data. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763806\" target=\"_blank\"><i>10.1145/3759536.3763806</i></a></li>\n<li>Madhavapeddy (2025). Holding an OxCaml tutorial at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/55bc5-x4p75\" target=\"_blank\"><i>10.59350/55bc5-x4p75</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at ICFP/SPLASH 2025 about OCaml, Hazel and FP. <a href=\"https://doi.org/10.59350/w1jvt-8qc58\" target=\"_blank\"><i>10.59350/w1jvt-8qc58</i></a></li>\n<li>Millar et al (2025). An Architecture for Spatial Networking. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2507.22687\" target=\"_blank\"><i>10.48550/arXiv.2507.22687</i></a></li>\n<li>Madhavapeddy et al (2025). Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet. <a href=\"https://doi.org/10.1145/3759536\" target=\"_blank\"><i>10.1145/3759536</i></a></li>\n<li>Madhavapeddy (2025). It's time to go post-POSIX at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/mch1m-8a030\" target=\"_blank\"><i>10.59350/mch1m-8a030</i></a></li>\n<li>Madhavapeddy (2025). A Roundup of ICFP/SPLASH 2025 happenings. <a href=\"https://doi.org/10.59350/4jf5k-01n91\" target=\"_blank\"><i>10.59350/4jf5k-01n91</i></a></li>\n<li>Omar et al (2025). A FAIR Case for a Live Computational Commons. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3759536.3763802\" target=\"_blank\"><i>10.1145/3759536.3763802</i></a></li>\n<li>Madhavapeddy (2025). 2nd Programming for the Planet workshop CFP out. <a href=\"https://doi.org/10.59350/728q9-5ct54\" target=\"_blank\"><i>10.59350/728q9-5ct54</i></a></li>\n<li>Madhavapeddy (2025). 
GeoTessera Python library released for geospatial embeddings. <a href=\"https://doi.org/10.59350/7hy6m-1rq76\" target=\"_blank\"><i>10.59350/7hy6m-1rq76</i></a></li>\n<li>Madhavapeddy (2025). Jane Street and Docker on moving to OCaml 5 at ICFP/SPLASH 2025. <a href=\"https://doi.org/10.59350/3jkaq-d3398\" target=\"_blank\"><i>10.59350/3jkaq-d3398</i></a></li>\n<li>Zuniga-Gonzalez et al (2025). Airborne assessment uncovers socioeconomic stratification of urban nature in England. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2510.13861\" target=\"_blank\"><i>10.48550/arXiv.2510.13861</i></a></li>\n<li>Romanello et al (2022). The 2022 report of the Lancet Countdown on health and climate change: health at the mercy of fossil fuels. The Lancet. <a href=\"https://doi.org/10.1016/S0140-6736(22)01540-9\" target=\"_blank\"><i>10.1016/S0140-6736(22)01540-9</i></a></li>\n<li>Baramashetru et al (2025). Towards Modelling and Verification of Coupler Behaviour in Climate Models. <a href=\"https://doi.org/10.1145/3759536.3763801\" target=\"_blank\"><i>10.1145/3759536.3763801</i></a></li>\n<li>Kumar et al (2025). GPU-Accelerated Hydrology Algorithms for On-Prem Computation: Flow Accumulation, Drainage Lines, Watershed Delineation, Runoff Simulation. <a href=\"https://doi.org/10.1145/3759536.3763805\" target=\"_blank\"><i>10.1145/3759536.3763805</i></a></li>\n<li>Laud et al (2025). STACD: STAC Extension with DAGs for Geospatial Data and Algorithm Management. <a href=\"https://doi.org/10.1145/3759536.3763803\" target=\"_blank\"><i>10.1145/3759536.3763803</i></a></li>\n<li>Urlea et al (2025). Bridging Disciplinary Gaps in Climate Research through Programming Accessibility and Interdisciplinary Collaboration. <a href=\"https://doi.org/10.1145/3759536.3763804\" target=\"_blank\"><i>10.1145/3759536.3763804</i></a></li>\n<li>Shaw (2020). Myths and mythconceptions: what does it mean to be a programming language, anyhow?. Proceedings of the ACM on Programming Languages. 
<a href=\"https://doi.org/10.1145/3480947\" target=\"_blank\"><i>10.1145/3480947</i></a></li>\n<li>Cosmo et al (2025). Stop treating code like an afterthought: record, share and value it. Nature. <a href=\"https://doi.org/10.1038/d41586-025-03196-0\" target=\"_blank\"><i>10.1038/d41586-025-03196-0</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/icfp25-propl",
      "title": "Programming for the Planet at ICFP/SPLASH 2025",
      "summary": "Report on second Programming for the Planet workshop featuring papers on climate modeling, geospatial computation and planetary-scale collaborative systems.",
      "date_published": "2025-10-05T00:00:00.000000Z",
      "date_modified": "2025-10-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "spatial",
        "functional",
        "programming",
        "icfp"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/j6zkp-n7t82",
          "doi": "10.59350/j6zkp-n7t82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763806",
          "doi": "10.1145/3759536.3763806",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/55bc5-x4p75",
          "doi": "10.59350/55bc5-x4p75",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/w1jvt-8qc58",
          "doi": "10.59350/w1jvt-8qc58",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2507.22687",
          "doi": "10.48550/arXiv.2507.22687",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536",
          "doi": "10.1145/3759536",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/mch1m-8a030",
          "doi": "10.59350/mch1m-8a030",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/4jf5k-01n91",
          "doi": "10.59350/4jf5k-01n91",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763802",
          "doi": "10.1145/3759536.3763802",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/728q9-5ct54",
          "doi": "10.59350/728q9-5ct54",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/7hy6m-1rq76",
          "doi": "10.59350/7hy6m-1rq76",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/3jkaq-d3398",
          "doi": "10.59350/3jkaq-d3398",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2510.13861",
          "doi": "10.48550/arXiv.2510.13861",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1016/S0140-6736(22)01540-9",
          "doi": "10.1016/S0140-6736(22)01540-9",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763801",
          "doi": "10.1145/3759536.3763801",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763805",
          "doi": "10.1145/3759536.3763805",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763803",
          "doi": "10.1145/3759536.3763803",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3759536.3763804",
          "doi": "10.1145/3759536.3763804",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3480947",
          "doi": "10.1145/3480947",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-03196-0",
          "doi": "10.1038/d41586-025-03196-0",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/vka8v-s6g76",
      "content_html": "<p>A world without nature feels rather impermanent, doesn't it? It's difficult to\nimagine a healthy future without clean air, fresh water and diverse wildlife.\nYet important policy is <a href=\"https://sciencebasedtargets.org/about-us/advisory-groups\">being</a> decided at\nthe moment that will sideline &quot;nature-based solutions&quot; for net-zero carbon\ntargets.  While it is true that anything involving nature is fundamentally less\npredictable than human edifice, it is <em>not</em> true that it can't be\n<a href=\"/papers/2023-ncc-permanence\">quantified</a> through science-based methods! Advances in\nremote sensing mean we have <a href=\"/papers/2025-tessera\">better resolution</a> views into nature than ever\nbefore in human history, and we can leverage those towards protecting what's\nleft. The wrong economic incentives are pushing us into a <a href=\"/papers/2023-naturecredits\">dangerous crossroads</a> where several policy paths effectively abandon\nnature.</p>\n<p>Back in January 2024, I hosted a <a href=\"https://www.cambridge.org/engage/coe/article-details/66c38d1ff3f4b052905d4317\">workshop on permanence and\ndurability</a>\nat Pembroke College attended by sixty experts in this topic, with their\nrecommendation reflected in the <a href=\"https://icvcm.org/wp-content/uploads/2025/05/CIWP-Permanence-Report.pdf\">continuous improvement report on\npermanence</a>\nfrom the ICVCM in May this year.  
I now join 40 other colleagues today in\nsigning an open letter to the UN Article 6.4 supervisory body strongly calling\nout the <strong><a href=\"https://docs.google.com/document/d/17Thti_gG-rqiuoybj1EUpzR59ZjRCsWDRuni3aHmN4o/edit?tab=t.0#heading=h.ytwmlzsgf59a\">scientific imperative to incentivise natural climate solutions on the path\nto net zero</a></strong>.</p>\n<p>A number of proponents of climate action policy are advocating for <em>only</em>\nconsidering 'entirely durable' carbon sequestration as the kind we should\naccount for. That usually means carbon sequestration technologies, and rules\nout nature-based solutions due to arbitrary thresholds. However, temporary\nremovals can deliver <em>immediate</em> climate benefits which are urgently needed,\nand we've proposed <a href=\"/papers/2023-pact-tmf\">several</a> <a href=\"/papers/2024-nbs-risk\">schemes</a> of\n<a href=\"/papers/2023-ncc-permanence\">equivalent permanence accounting</a> along with\n<a href=\"https://carbonplan.org/research/ton-year-explainer\">others</a> that propose\ncredible paths forward.</p>\n<p><img src=\"/images/iiasa-2.webp\" alt=\"%c\" title=\"Sir Andy Haines shows how interconnected the earth system is to global health at the IIASA/Royal Society meeting on Sep 9th 2025\" ></p>\n<p>I argue instead for a principle of parallel action: we need to invest in <em>both</em>\nfully durable carbon sequestration technologies <em>as well as</em> nature-based\nsolutions. Humanity can no longer afford to take single bets; warming and\nbiodiversity loss have advanced to the point where concerted parallel action is required\nto <a href=\"https://overshootconference.org/\">overshoot</a> sufficiently to bring balance back. Setting aside nature at this\ncritical juncture is unbelievably short-sighted, and the incentives we\nestablish for nations this decade will echo through the remainder of the\ncentury.  
The future of <a href=\"/notes/exploring-food-impacts\">food security</a>, arresting\n<a href=\"https://www.nature.com/articles/s41586-022-04788-w\">pandemic spread</a> and <a href=\"https://www.nature.com/articles/s41559-025-02730-7\">soil\nhealth</a> all depend on\nprotecting nature.</p>\n<p>If you agree, please read the <a href=\"https://docs.google.com/document/d/17Thti_gG-rqiuoybj1EUpzR59ZjRCsWDRuni3aHmN4o/edit?tab=t.0#heading=h.ytwmlzsgf59a\">open letter</a> and <a href=\"https://docs.google.com/forms/d/e/1FAIpQLSdh8M5IPSz9-53fpP34gJEYOWBexWIzJHtPYnyq_IOyJMjGmg/viewform\">sign up</a>. See also this great <a href=\"https://carbon-pulse.com/433281/\">explainer</a> about nature-based durability.</p>\n<p><img src=\"/images/iiasa-1.webp\" alt=\"%c\" title=\"John Schellnhuber makes the case for climate overshoot at the IIASA/Royal Society meeting on Sep 9th 2025\" ></p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li>\n<li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li>\n<li>Rau et al (2024). Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release. <a href=\"https://doi.org/10.1080/17583004.2024.2390854\" target=\"_blank\"><i>10.1080/17583004.2024.2390854</i></a></li>\n<li>Madhavapeddy (2025). Exploring the biodiversity impacts of what we choose to eat. 
<a href=\"https://doi.org/10.59350/xj427-y3q48\" target=\"_blank\"><i>10.59350/xj427-y3q48</i></a></li>\n<li>Swinfield et al (2024). Nature-based credit markets at a crossroads. Springer Science and Business Media LLC. <a href=\"https://doi.org/10.1038/s41893-024-01403-w\" target=\"_blank\"><i>10.1038/s41893-024-01403-w</i></a></li>\n<li>Carlson et al (2022). Climate change increases cross-species viral transmission risk. Nature. <a href=\"https://doi.org/10.1038/s41586-022-04788-w\" target=\"_blank\"><i>10.1038/s41586-022-04788-w</i></a></li>\n<li>Thakur (2025). Warming threatens soil health. Nature Ecology & Evolution. <a href=\"https://doi.org/10.1038/s41559-025-02730-7\" target=\"_blank\"><i>10.1038/s41559-025-02730-7</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/do-not-rule-out-nature",
      "title": "Do not rule out nature from climate action; an open letter",
      "summary": "Open letter to UN Article 6.4 supervisory body advocating for nature-based climate solutions alongside durable carbon sequestration technologies.",
      "date_published": "2025-09-10T00:00:00.000000Z",
      "date_modified": "2025-09-10T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "policy",
        "nature",
        "biodiversity",
        "carbon"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1080/17583004.2024.2390854",
          "doi": "10.1080/17583004.2024.2390854",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/xj427-y3q48",
          "doi": "10.59350/xj427-y3q48",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41893-024-01403-w",
          "doi": "10.1038/s41893-024-01403-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41586-022-04788-w",
          "doi": "10.1038/s41586-022-04788-w",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41559-025-02730-7",
          "doi": "10.1038/s41559-025-02730-7",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/xj427-y3q48",
      "content_html": "<p>Choosing where we source the food that we eat makes a difference to the\nenvironment, but by how much? After churning through around 100 petabytes of\ndata, beginning with our <a href=\"/papers/2024-life\">LIFE</a> metric and moving on to food\nprovenance maps and import/export data for the world, we now know the answer\ncan vary by <em><a href=\"https://www.nature.com/articles/s43016-025-01224-w\">three orders of magnitude</a></em> for species\nextinction risks.</p>\n<p>Our <a href=\"https://www.nature.com/articles/s43016-025-01224-w\">paper in Nature Food</a> came out today with\nall the tasty details and implications for food policies worldwide. In order to make the data easier\nto explore, I knocked up an <a href=\"https://quantifyearth.github.io/food-globe/\">interactive global explorer</a>\nusing the data that the team (led by <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>) generated.</p>\n<p><a href=\"https://quantifyearth.github.io/food-globe/\"> <img src=\"/images/food-life-globe-1.webp\" alt=\"%c\" title=\"Explore food trade impacts on every country interactively\" > </a></p>\n<p>The methodology is described in the <a href=\"/papers/2024-food-life\">paper</a>:</p>\n<blockquote>\n<p>The marginal impact of current food consumption on biodiversity can be viewed\nas the forgone opportunity to restore biodiversity arising through ongoing\nagricultural land use. 
By linking LIFE with spatial crop and pasture\ndistributions, consumption, production and trade data from the <a href=\"https://www.fao.org/home/en/\">Food and\nAgriculture Organisation</a> of the United Nations, [...]\nwe quantify the opportunity cost to biodiversity of producing or consuming 1kg of each\nFAO-aligned food commodity in 174 countries, taking the feed and grazing\nrequirements for animal products into account.\n-- <cite><a href=\"https://www.nature.com/articles/s43016-025-01224-w\">Food impacts on species extinction risks can vary by three orders of magnitude</a></cite></p>\n</blockquote>\n<p>The results themselves are absolutely fascinating. Firstly, ruminant meat has\nmedian consequences 100 times higher than those of legumes and pulses, even when\naccounting for the high protein content. That burger, while delicious, has an\nenormous biodiversity cost somewhere in the world!  Bear in mind that more than\n3/4 of human-appropriated land surface on the planet is <a href=\"http://ourworldindata.org/land-use\">dedicated to the\nproduction of animal products</a> while\nproviding only 17% of global calories.</p>\n<p>But if you really <em>want</em> a burger, then choosing <em>where</em> you source the meat\nfrom makes a huge difference as well. There is a great deal of spatial variance\nabout the selection of each commodity.</p>\n<p><a href=\"https://www.nature.com/articles/s43016-025-01224-w/figures/1\"> <img src=\"/images/food-life-globe-2.webp\" alt=\"%c\" > </a></p>\n<p>The vertical lengths show the spatial variance. It gets more interesting when you break it down by country and examine the imports and exports of each one:</p>\n<p><a href=\"https://www.nature.com/articles/s43016-025-01224-w/figures/3\"> <img src=\"/images/food-life-globe-3.webp\" alt=\"%c\" > </a></p>\n<p>The <a href=\"https://www.science.org/doi/10.1126/science.adv8264\">biodiversity leak</a> becomes clear now! 
Some countries do consume enormous amounts of meat, but do so from domestic production (i.e. a local political decision). Others, like the UK or Japan, import vast amounts of damaging food from <em>other countries</em> which are highly biodiverse, resulting in the demand that is one of the primary drivers of deforestation.</p>\n<p>The data is out there now, under a <a href=\"https://creativecommons.org/licenses/by-sa/4.0/\">CC-BY-SA</a> license, and the <a href=\"https://quantifyearth.github.io/food-globe/\">interactive FOOD explorer</a> allows you to break down impacts per country and look through 150 different foodstuffs.</p>\n<p>I also mentioned the vast amount of data processing involved in coming up with these numbers; <a href=\"https://mynameismwd.org\">Michael Dales</a> has uploaded the calculators over to our <a href=\"https://github.com/quantifyearth/\">quantifyearth/</a> code organisation, and will be presenting the magic library <a href=\"https://github.com/quantifyearth/yirgacheffe\">Yirgacheffe</a> that drives much of this processing at <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#program\">PROPL25</a> in October. Tune in if you'd like to hear more about the computer science side of this sort of analysis!</p><h1>References</h1><ul><li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>(0). Fig. 1: Global variation across and within commodities in the expected extinction impact of producing 1 kg of agricultural commodity or commodity group. | Nature Food. 
<a href=\"https://doi.org/https://www.nature.com/articles/s43016-025-01224-w/figures/1\" target=\"_blank\"><i>https://www.nature.com/articles/s43016-025-01224-w/figures/1</i></a></li>\n<li>(0). Fig. 3: The percentage of consumption-driven extinctions arising from imported and domestically produced food commodities, estimated for the United States, Japan, the United Kingdom, Brazil, Uganda and India. | Nature Food. <a href=\"https://doi.org/https://www.nature.com/articles/s43016-025-01224-w/figures/3\" target=\"_blank\"><i>https://www.nature.com/articles/s43016-025-01224-w/figures/3</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/exploring-food-impacts",
      "title": "Exploring the biodiversity impacts of what we choose to eat",
      "summary": "Nature Food paper revealing food choice biodiversity impacts vary by three orders of magnitude with interactive global explorer tool.",
      "date_published": "2025-09-09T00:00:00.000000Z",
      "date_modified": "2025-09-09T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "food",
        "biodiversity",
        "spatial"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-food-life.pdf",
          "mime_type": "application/pdf",
          "title": "Food impacts on species extinction risks can vary by three orders of magnitude"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/https://www.nature.com/articles/s43016-025-01224-w/figures/1",
          "doi": "https://www.nature.com/articles/s43016-025-01224-w/figures/1",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/https://www.nature.com/articles/s43016-025-01224-w/figures/3",
          "doi": "https://www.nature.com/articles/s43016-025-01224-w/figures/3",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/7hy6m-1rq76",
      "content_html": "<p>We've been having great fun at the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a>\nrecently releasing embeddings of\nour new <a href=\"https://github.com/ucam-eo/tessera\">TESSERA</a> geospatial foundation\nmodel.</p>\n<blockquote>\n<p>TESSERA is a foundation model for Earth observation that processes Sentinel-1\nand Sentinel-2 satellite data to generate representation (embedding) maps. It\ncompresses a full year of Sentinel-1 and Sentinel-2 data and learns useful\ntemporal-spectral features.\n<cite>-- <a href=\"https://github.com/ucam-eo/tessera\">Temporal Embeddings of Surface Spectra for Earth Representation and Analysis</a></cite></p>\n</blockquote>\n<p>A <a href=\"https://en.wikipedia.org/wiki/Foundation_model\">foundation model</a> is\ndesigned to be used for downstream tasks without having to retrain a full model\nfor every individual task. Our <a href=\"/papers/2025-tessera\">preprint paper</a>\ndescribes what sorts of geospatial tasks you can solve more quickly, ranging from crop\ntype classification, forest canopy height estimation, above-ground biomass\ncalculations, wildfire detection, forest stocks, and many more.</p>\n<p><a href=\"/images/tessera-f1.webp\"> <img src=\"/images/tessera-f1.webp\" alt=\"%c\" title=\"Parametric UMAP false colour visualisation of TESSERA embeddings for Cambridgeshire\" > </a></p>\n<p>TESSERA is an open model that is trained only on public satellite data (thanks\n<a href=\"https://www.esa.int/Applications/Observing_the_Earth/Copernicus/The_Sentinel_missions\">ESA</a>!),\nand we make tiled embeddings available for download that have 128-dimensional\nvectors precomputed for every 10m<sup>2</sup> surface of the planet. 
This makes using\nTESSERA really simple in existing GIS workflows.</p>\n<h2 id=\"taking-the-geotessera-cli-for-a-spin\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#taking-the-geotessera-cli-for-a-spin\"></a>Taking the GeoTessera CLI for a spin</h2>\n<p>To make it even simpler, I've just <a href=\"https://pypi.org/project/geotessera/0.5.1/\">published</a> a new Python library called\n<a href=\"https://github.com/ucam-eo/geotessera\">geotessera</a> which provides a\nprogrammatic and CLI interface for accessing these.</p>\n<figure class=\"image-center\"><img src=\"https://raw.githubusercontent.com/ucam-eo/tessera-coverage-map/refs/heads/main/geotessera.gif\"></figure>\n<p>The full set of TESSERA embeddings is petabytes in size when generated<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup>, so it's important that you download just the ones\nyou need for a given region of interest. We chunked up the embeddings into image tiles where each pixel represents a 10m<sup>2</sup> area, and are hosting these at the Computer Lab on <a href=\"https://dl.geotessera.org\">dl.geotessera.org</a>.<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup> Geotessera uses <a href=\"https://www.fatiando.org/pooch/latest/index.html\">Pooch</a> to build up a registry of <a href=\"https://github.com/ucam-eo/tessera-manifests\">manifests</a> in Git, and provides helper functions to calculate which tiles you need.</p>\n<p>You can take this for a spin very quickly if you have <a href=\"https://docs.astral.sh/uv/\">uv</a> installed. First, let's check global coverage of what's available:</p>\n<pre><code>uvx geotessera coverage\n</code></pre>\n<p>This will drop a figure like the one below into <code>tessera_coverage.png</code>. 
You can also refine the map to an area of interest by passing in a GeoJSON, shapefile or manual bounding box to the command-line arguments.</p>\n<figure class=\"image-center\"><img src=\"https://raw.githubusercontent.com/ucam-eo/tessera-coverage-map/refs/heads/main/map.png\"></figure>\n<p>We're still churning through the inference (and prioritising areas of interest for our early adopters), so the green spots represent full coverage from 2017-2024, the blue represents just 2024, and orange for in-between.\nNow, we want to download the embeddings themselves. Let's do Cambridgeshire:</p>\n<pre><code>uvx geotessera download \\\n  --output cb \\\n  --region-file https://raw.githubusercontent.com/ucam-eo/geotessera/refs/heads/main/example/CB.geojson\n</code></pre>\n<p>This will drop a bunch of GeoTIFFs into the <code>cb/</code> directory which you can inspect using GDAL in the normal way. Note that the local UTM coordinates are preserved (this varies by latitude) and that there are 128 bands per TIFF.</p>\n<pre><code>% gdalinfo -stats cb/tessera_2024_lat51.95_lon0.05.tif\n&lt;...&gt;\nPixel Size = (10.000000000000000,-10.000000000000000)\nMetadata:\n  GEOTESSERA_VERSION=0.5.1\n  TESSERA_DATASET_VERSION=v1\n  TESSERA_DESCRIPTION=GeoTessera satellite embedding tile\n  TESSERA_TILE_LAT=51.95\n  TESSERA_TILE_LON=0.05\n  TESSERA_YEAR=2024\n  AREA_OR_POINT=Area\nImage Structure Metadata:\n  COMPRESSION=LZW\n  INTERLEAVE=PIXEL\nCorner Coordinates:\nUpper Left  (  293612.221, 5765288.255) (  0d 0'24.03&quot;W, 51d59'59.39&quot;N)\nLower Left  (  293612.221, 5753888.255) (  0d 0' 0.61&quot;E, 51d53'50.90&quot;N)\nUpper Right (  300932.221, 5765288.255) (  0d 5'59.33&quot;E, 52d 0' 9.01&quot;N)\nLower Right (  300932.221, 5753888.255) (  0d 6'23.09&quot;E, 51d54' 0.49&quot;N)\nCenter      (  297272.221, 5759588.255) (  0d 2'59.76&quot;E, 51d56'59.99&quot;N)\nBand 1 Block=256x256 Type=Float32, ColorInterp=Gray\n  Description = Tessera_Band_0\n  Minimum=-4.017, Maximum=11.446, 
Mean=3.649, StdDev=2.089\n  Metadata:\n    STATISTICS_MINIMUM=-4.0171508789062\n    STATISTICS_MAXIMUM=11.445509910583\n    STATISTICS_MEAN=3.6492421642927\n    STATISTICS_STDDEV=2.088705775134\n    STATISTICS_VALID_PERCENT=100\n</code></pre>\n<p>Once you have the GeoTIFFs locally, you can drop them into your normal GIS workflows. But\nyou can also continue to use the CLI to do false colour visualisations, for\nexample using PCA, to help visualise what's going on.</p>\n<pre><code>uvx geotessera visualize cb cb.tiff\nuvx geotessera serve cb.tiff --open\n</code></pre>\n<p>These two commands will first output an RGB mosaic of the tiles (false colour, like the one at the start of this post), and then tile them using <a href=\"https://leafletjs.com/\">LeafletJS</a> so you can explore them with OpenStreetMap in the background.</p>\n<h2 id=\"a-geospatial-classification-workflow\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-geospatial-classification-workflow\"></a>A geospatial classification workflow</h2>\n<p>At this point you are probably itching to do some actual machine learning. You can try out the Tessera interactive Jupyter notebook next!</p>\n<pre><code>git clone https://github.com/ucam-eo/tessera-interactive-map\ncd tessera-interactive-map\nuv venv\nsource .venv/bin/activate\nuv pip install -r requirements.txt\ncode app.ipynb\n</code></pre>\n<p>This will spin up the environment in VS Code as a notebook, where if you run the cells you get an interactive bounding box that you can use to do manual classification by simply marking labels. 
Here's a video that demonstrates this, courtesy of <a href=\"https://www.cst.cam.ac.uk/people/ray25\">Robin Young</a>:</p>\n<p><div class=\"video-center\"><iframe title=\"TESSERA interactive notebook demo\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/a736234b-98bd-4923-a01e-87cff597f8b2\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>Have fun with this! All of this is really bleeding edge stuff, so if you run into issues (likely) then please do <a href=\"https://github.com/ucam-eo/geotessera\">file an issue</a> and let us know. In fact, let us know if you build something where you didn't find any bugs, too!</p>\n<h2 id=\"what-next\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-next\"></a>What next?</h2>\n<p>Coding in Python again after a few years has been a fun experience for me, but\nI'm yearning to return to OCaml again. Accordingly, I've been building out an\nimplementation of GeoTessera in native OCaml, using\n<a href=\"https://github.com/ocaml-multicore/eio\">eio</a>. This is also a perfect usecase\nfor <a href=\"https://oxcaml.org\">oxcaml</a> extensions to speed up floating point\nprocessing, and <a href=\"https://github.com/tmattio\">Thibaut Mattio</a> has just published\n<a href=\"https://github.com/raven-ml/raven\">Raven</a> for handling numpy format arrays in\nOCaml.   Stay <a href=\"/notes/cresting-the-ocaml-ai-hump\">tuned</a> for more on that...</p>\n<p>There's also plenty to be done on improvements to GeoTessera; I'll be adding in\nmodules to help with machine learning workflows as soon as the external uses of them\nhave stabilised a bit more.  I'm also enjoying getting re-familiar with modern\nPython tooling. <code>uv</code> is a remarkable piece of work, but I'm still figuring out how to\n(e.g.) 
run notebooks directly using it without running into package issues.</p>\n<p>And finally, storage management remains a real headache as we are <a href=\"/notes/syncoid-sanoid-zfs\">striping and syncing</a> hundreds of terabytes of storage and keeping it\n<a href=\"https://www.tunbury.org/2025/08/27/fsperf/\">performant</a>.  As we go back to generate\nembeddings for earlier years, we'll be hitting petabytes easily. While the normal\nanswer is to store this on a cloud, the problem is the egress bandwidth is hugely\nexpensive, and it's important we have a local storage cluster for this. Any tips\non how to build out a cheap such cluster are welcome!</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>We ran the inference on a combination of <a href=\"https://www.amd.com/en/products/accelerators/instinct/mi300/mi300x.html\">AMD MI300X</a> and the <a href=\"https://www.hpc.cam.ac.uk/d-w-n\">Dawn</a> cluster you may have seen me <a href=\"https://crank.recoil.org/w/9YmWNZrmGD2794Fk33djUp\">talking about on the BBC</a> a while back.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>The humungous number of SSDs are jury-rigged onto machines kindly donated to the Lab by <a href=\"https://janestreet.com\">Jane Street</a>.</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Madhavapeddy (2025). Cresting the OCaml AI humps. <a href=\"https://doi.org/10.59350/nn1d6-xgt62\" target=\"_blank\"><i>10.59350/nn1d6-xgt62</i></a></li>\n<li>Madhavapeddy (2025). Semi distributed filesystems with ZFS and Sanoid. <a href=\"https://doi.org/10.59350/zy5bb-3ze20\" target=\"_blank\"><i>10.59350/zy5bb-3ze20</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/geotessera-python",
      "title": "GeoTessera Python library released for geospatial embeddings",
      "summary": "Release of GeoTessera Python library and CLI for accessing TESSERA geospatial foundation model embeddings with interactive visualization tools.",
      "date_published": "2025-08-31T00:00:00.000000Z",
      "date_modified": "2025-08-31T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "tessera",
        "spatial",
        "ai",
        "satellite"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/nn1d6-xgt62",
          "doi": "10.59350/nn1d6-xgt62",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/zy5bb-3ze20",
          "doi": "10.59350/zy5bb-3ze20",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/p45b8-kvt85",
      "content_html": "<p>That's a wrap for the next decade with <a href=\"https://aarhus2025.org/\">Aarhus 2025</a>,\nwhere I presented our paper on &quot;<a href=\"/papers/2025-internet-ecology\">Steps towards an Ecology for the Internet</a>&quot;.\nI was a little unsure about how to approach the presentation, largely because\nthe ideas seem a little crazy if they'd been proposed even a year ago! Luckily\nmy co-authors strengthened my spine with encouragement and gin,\nand the event was tremendous fun packed with useful insights.</p>\n<p>Our key observation is that the Internet is dangerously <a href=\"https://en.wikipedia.org/wiki/Protocol_ossification\">ossifying</a> into monocultures\nat multiple levels. Inspired by wild ecosystems, we're proposing mixing in more <a href=\"https://www.google.com/search?client=safari&amp;rls=en&amp;q=natural+selection&amp;ie=UTF-8&amp;oe=UTF-8\">natural\nselection</a>\ninto edge deployments by using AI code models to mutate end-hosts and tailor\nthem to their environment. Generative AI is notoriously\n<a href=\"https://www.taylorfrancis.com/books/mono/10.1201/9781003440260/ai-roman-yampolskiy\">unpredictable</a>,\nwhich turns out to be a useful property if you actually want more local\nsoftware diversity! 
For example, this lets us cook up &quot;antibotty&quot; networks that\nfight back against global viruses via locally adapted vigilantes (antibodies).</p>\n<p><a href=\"/slides/2025-internet-ecology.html\"> <img src=\"/images/aarhus-1.webp\" alt=\"%c\" > </a></p>\n<p>Beyond just making things more resilient, injecting more software diversity gives\nus the hooks to make computers in our environment do <em>what we actually want\nthem to</em>.\nTo quote Mark Weiser, &quot;<a href=\"https://dl.acm.org/doi/pdf/10.1145/174800.174801\">the world is not a desktop</a>&quot;, and computers\nshould be so ubiquitously blended into our day-to-day lives that they are\n<a href=\"https://dl.acm.org/doi/10.1145/192426.192428\">invisible</a> to the user.  We're\ngetting further away from that dream every day with the monotony of the FAANG\nsoftware monocultures, and a <a href=\"https://en.wikipedia.org/wiki/Botnet\">wild west of botnets</a>\nsweeping through billions of devices.</p>\n<p>Since this conference only happens once a decade, I put myself in the right mindframe by reading through my old ideas and seeing how they had aged.\nI was struck by how much <a href=\"/papers/2015-aarhus-databox\">Databox</a> (2015), <a href=\"/papers/2011-icdcn-droplets\">Droplets</a> (2011) and\n<a href=\"/notes/yurts-for-digital-nomads\">Digital Yurts</a> (2009) all stood up surprisingly well<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> in 2025.\nThis time around, we have a fresh edge with the rise of coding models and a relative glut of edge computation. 
The question is how to harness these new technologies for the health of the Internet and not yet more central lock-in.</p>\n<p><img src=\"/images/aarhus-4.webp\" alt=\"%c\" title=\"The crowd was centred around the SIGCHI community and very engaged\" ></p>\n<h2 id=\"discussions-with-the-audience\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#discussions-with-the-audience\"></a>Discussions with the audience</h2>\n<p>The talks were arranged in a panel with three other great speakers, who discussed crises through the lenses of <a href=\"https://dl.acm.org/doi/10.1145/3744169.3744176\">urban gardening</a>, <a href=\"https://dl.acm.org/doi/10.1145/3744169.3744179\">computing supply chains</a> and <a href=\"https://dl.acm.org/doi/10.1145/3744169.3744178\">two-loops models of change</a>. There were some thought-provoking questions from the audience!</p>\n<p>Firstly, is framing these questions as a &quot;crisis&quot; just <a href=\"https://www.mdpi.com/2225-1154/13/4/69\">saturating</a> us with a constant bombardment of problems we need to react to? Should we be building more (emotionally and systemically) sustainable platforms for engendering change? I certainly agreed with this, but I don't yet have a clear sense of what this means, beyond finding <a href=\"https://ourworldindata.org/\">Our World In Data</a> an inspiration of how fun data exploration can be.</p>\n<p>A couple of things I need to follow up on reading:</p>\n<ul>\n<li><a href=\"https://www.diva-portal.org/smash/get/diva2:1773078/FULLTEXT01.pdf\">Computing as Ecocide</a> was a paper at the <a href=\"https://computingwithinlimits.org/2025/\">LIMITS</a> workshop, which I hadn't heard of before. 
This seems like a complement to <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">PROPL</a> coming up in October.</li>\n<li><a href=\"https://en.wikipedia.org/wiki/Robin_Sloan\">Robin Sloan</a> is an author who combines old and new tech in fun ways.</li>\n<li>Approaches to directly <a href=\"https://ieeexplore.ieee.org/document/9830112\">deal with &quot;eco-anxiety&quot;</a> while teaching sustainability.</li>\n</ul>\n<p><img src=\"/images/aarhus-5.webp\" alt=\"%c\" title=\"Waving to the conference audience from stage!\" ></p>\n<h3 id=\"inversely-proportional-voting\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#inversely-proportional-voting\"></a>Inversely proportional voting</h3>\n<p>The stage discussion then veered into the role of lifecycles in these cultural\nsystems. An audience member asked about the role of biological mutualism and\ncooperation in any future digital framework, and I pointed out\n<a href=\"https://blogs.cornell.edu/info2040/2014/09/20/impalas-parasites-and-the-prisoners-dilemma/\">some</a>\n<a href=\"https://doi.org/10.1038/s41586-025-08614-x\">examples</a> from our paper\nabout how cooperative ensembles in nature can be very stable, but also do not\nhave to last forever. When applying this &quot;nihil aeternum est&quot; principle to\nhuman systems, how sacrosanct are concepts like &quot;democracy&quot; as we move forward?\nIf our views on these institutions remain unchanging, then they will also\nbecome brittle and collapse as the context in which they operate changes into a hot\ncrowded world mired in <a href=\"https://www.cambridge.org/core/journals/global-sustainability/announcements/call-for-papers/polycrisis-in-the-anthropocene\">polycrises</a>.</p>\n<p>This let me bring out an idea I've been <a href=\"/notes/cambridge-green-blue\">ruminating</a> on for a while. 
While the principles of <a href=\"https://en.wikipedia.org/wiki/Suffrage\">equal suffrage</a> are vital, one dimension where we could relax it is for <em>intergenerational</em> representation. One issue with taking political decisions for the long-term is that wealth pools with the old, who have far less to gain from long-term thinking than the young<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup> voters. So why are the young so underrepresented? Imagine a completely made up voting system that looked like this:</p>\n<blockquote>\n<p>A voting system assigns each individual a number of votes inversely\nproportional to their age. An 18-year-old will have the maximum number of\nvotes, and they gradually degrade until anyone aged 70 or higher gets just one\nvote. The number of votes decays rapidly with age, so that every year the new\n18-year-old cohort will control (say) 40% of the total vote. This sort of\nsystem ensures that the older generations (where wealth pools) must educate\nthe newly minted voters every year, or risk losing control of their agendas.\n<cite>-- Anil's entirely made up voting system</cite></p>\n</blockquote>\n<p>I've found some variations of the theme, like <a href=\"https://en.wikipedia.org/wiki/Demeny_voting\">Demeny voting</a> that gives parents a proxy\nvote for their children or even cases for <a href=\"https://profs-polisci.mcgill.ca/muniz/intergen/Van%20Parijs%20-%20Disenfranchisement%20of%20the%20elderly.pdf\">disenfranchisement of the elderly</a>.\nBut neither of these quite captures what I have in mind, which is to build\n<em>intergenerational education</em> firmly into how our society operates. If 40% of\nthe entire voting bloc appears newly every year, then education on civic\nmatters needs to happen like clockwork and be incorporated into our curriculums.</p>\n<p>But of course, there are huge barriers to trying out these experiments in civic\nsociety. 
But there are no such barriers to running these experiments in\nmicrocosms of <a href=\"https://www.cambridge.org/core/books/governing-the-commons/7AB7AE11BADA84409C34815CC288CD79\">common pool\nresources</a>\nor even on digital systems. So after these fascinating conversations\nat the conference, I'm going to think about the <a href=\"/notes/cambridge-green-blue\">Cambridge Green Blue</a> and apply it there.</p>\n<p><img src=\"/images/aarhus-3.webp\" alt=\"%c\" title=\"The students at Aarhus are definitely a welcoming bunch!\" ></p>\n<h3 id=\"self-hosting-is-really-far-behind\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#self-hosting-is-really-far-behind\"></a>Self hosting is really far behind</h3>\n<p>I was also struck by how far behind self-hosting is, even among an audience\nthat should be heavily in favour of it. I think my talk was one of only a few\nthat mentioned BlueSky and the Fediverse, and alternative communication\nmechanisms. I also demoed Claude in a corner to show how it could help <a href=\"https://www.tunbury.org/2025/07/25/build-analysis/\">manage\ninfrastructure</a> that would\nordinarily take a sysadmin, but could now be reasonably handled by a non-expert\n(with care!).</p>\n<p>One of the attendees commented to me afterwards that they remembered the\n<a href=\"/papers/2015-aarhus-databox\">Databox</a> talk from a decade ago, and wondered why it\nhadn't taken off. 
Maybe now is the time for <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>'s work on <a href=\"https://ryan.freumh.org/eilean.html\">digital\nislands</a> to hit the mainstream!\nThey've certainly never been needed more than now; I am deeply glad to\nsee my colleagues like <a href=\"https://jonmsterling.com\">Jon Sterling</a> also <a href=\"https://www.jonmsterling.com/2025-W35/index.xml\">working on solutions</a>\nin this space.</p>\n<h3 id=\"presenting-in-slipshow\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#presenting-in-slipshow\"></a>Presenting in Slipshow</h3>\n<p>In the spirit of self-hosting, I also used the great new\n<a href=\"https://github.com/panglesd/slipshow\">Slipshow</a> tool (that uses\n<a href=\"https://github.com/ocsigen/js_of_ocaml\"><code>js_of_ocaml</code></a>) to write the\npresentation. Slipshow lets the presentation be <a href=\"https://tangled.sh/@anil.recoil.org/aarhus25-ecology-talk/blob/main/aarhus.md?code=true\">written in\nMarkdown</a>,\nand I used Claude Code to handle all the styling for me. The whole presentation\ntook about an hour to put together, and can be <a href=\"/slides/2025-internet-ecology.html\">viewed\nstandalone</a> as a single web page as it\ninlines all the assets.</p>\n<p>Using HTML/JS/CSS for talks is really convenient, so I'm sold on using Slipshow\nfor my upcoming presentations this year! It's also excellent to be using\n<code>js_of_ocaml</code>. I think the only thing on my &quot;wishlist&quot; is to be able to run a\nheadless browser and output PDF snapshots of each of the slips. I'm also not\nyet sure how my 100MB videos will encode, but I'll figure that out ahead of my\nnext talk (in the Royal Society at the start of September to the Austrian\ngovernment). 
The author of the software, <a href=\"https://github.com/panglesd\">Paul-Elliot</a> also\nkindly reached out to get feedback, and I was really pleased to see his work\nwas supported by the <a href=\"https://nlnet.nl/commonsfund/\">NGI Commons Fund</a>!</p>\n<p><img src=\"/images/aarhus-8.webp\" alt=\"%c\" title=\"Slipshow looked great on the giant presentation screen\" ></p>\n<h2 id=\"the-city-of-aarhus\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-city-of-aarhus\"></a>The city of Aarhus</h2>\n<p>Aarhus is also a spiritual twin city to Cambridge. It was a gorgeously sunny\nweek, with bicycles available everywhere and a lovely Latin quarter to hang out\nin.</p>\n<p><img src=\"/images/aarhus-2.webp\" alt=\"%c\" title=\"Hanging at the Salling rooftop garden with Bas Spitters learning about infinite category theory\" ></p>\n<p>The venue itself at Aarhus University was really nice to explore and see what\nthe students are up to. Lots of music and creative arts in the same area. When\nI was here last year to present at\n<a href=\"https://direc.dk/cybersecurity-how-do-we-maintain-trust-in-our-digital-society/\">Matchpoints</a>\nwe were at the Modern Art museum, and the amount of new building work in the\ncity was remarkable to see. 
There was also a lovely forest park I went for an\nearly morning jog in to get some nature!</p>\n<p><img src=\"/images/aarhus-7.webp\" alt=\"%c\" title=\"Excellent coffee at Stillers in the Latin quarter\" ></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>Not coincidentally, most of these ideas were cooked up with <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> somewhere in the picture at <a href=\"/news/2025-internet-ecology-1\">a nearby pub</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>We used the ideas of discount factors to adjust for impermanence in our <a href=\"/papers/2023-ncc-permanence\">forest carbon</a> work in Nature Climate Change a few years ago too.</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Chaudhry et al (2015). Personal Data: Thinking Inside the Box. <a href=\"https://doi.org/10.7146/aahcc.v1i1.21312\" target=\"_blank\"><i>10.7146/aahcc.v1i1.21312</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li>\n<li>Madhavapeddy (2010). Yurts for Digital Nomads. <a href=\"https://doi.org/10.59350/4b3zw-zen8\" target=\"_blank\"><i>10.59350/4b3zw-zen8</i></a></li>\n<li>Madhavapeddy (2025). The Cambridge \"Green Blue\" competition to reduce emissions. <a href=\"https://doi.org/10.59350/y1g67-aq825\" target=\"_blank\"><i>10.59350/y1g67-aq825</i></a></li>\n<li>Galvez et al (2025). A travelling-wave strategy for plant–fungal trade. Nature. 
<a href=\"https://doi.org/10.1038/s41586-025-08614-x\" target=\"_blank\"><i>10.1038/s41586-025-08614-x</i></a></li>\n<li>Weiser (1994). The world is not a desktop. Interactions. <a href=\"https://doi.org/10.1145/174800.174801\" target=\"_blank\"><i>10.1145/174800.174801</i></a></li>\n<li>Weiser (1994). Creating the invisible interface: (invited talk). <a href=\"https://doi.org/10.1145/192426.192428\" target=\"_blank\"><i>10.1145/192426.192428</i></a></li>\n<li>Jack (2025). In the Dirt: Place-Based Environmental Action and Technology-Mediated Work in New York City. <a href=\"https://doi.org/10.1145/3744169.3744176\" target=\"_blank\"><i>10.1145/3744169.3744176</i></a></li>\n<li>Lin et al (2025). Whose, Which, and What Crisis? A Critical Analysis of Crisis in Computing Supply Chains. <a href=\"https://doi.org/10.1145/3744169.3744179\" target=\"_blank\"><i>10.1145/3744169.3744179</i></a></li>\n<li>Thorslund et al (2025). Meta-crisis computing and you: Finding agency through the Two Loops model of change. <a href=\"https://doi.org/10.1145/3744169.3744178\" target=\"_blank\"><i>10.1145/3744169.3744178</i></a></li>\n<li>Eriksson et al (2022). Addressing Students’ Eco-anxiety when Teaching Sustainability in Higher Education. 2022 International Conference on ICT for Sustainability (ICT4S). <a href=\"https://doi.org/10.1109/ICT4S55073.2022.00020\" target=\"_blank\"><i>10.1109/ICT4S55073.2022.00020</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ecology-at-aarhus",
      "title": "Presenting our Ecology of the Internet ideas at Aarhus 2025",
      "summary": "Presentation at Aarhus 2025 on Internet ecology, proposing AI-driven software diversity to fight protocol ossification and create more resilient networks.",
      "date_published": "2025-08-22T00:00:00.000000Z",
      "date_modified": "2025-08-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "denmark",
        "ecology",
        "internet",
        "llms",
        "ai",
        "selfhosting"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-internet-ecology.pdf",
          "mime_type": "application/pdf",
          "title": "Steps towards an Ecology for the Internet"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3744169.3744180",
          "doi": "10.1145/3744169.3744180",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.7146/aahcc.v1i1.21312",
          "doi": "10.7146/aahcc.v1i1.21312",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/4b3zw-zen8",
          "doi": "10.59350/4b3zw-zen8",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/y1g67-aq825",
          "doi": "10.59350/y1g67-aq825",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41586-025-08614-x",
          "doi": "10.1038/s41586-025-08614-x",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/174800.174801",
          "doi": "10.1145/174800.174801",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/192426.192428",
          "doi": "10.1145/192426.192428",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3744169.3744176",
          "doi": "10.1145/3744169.3744176",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3744169.3744179",
          "doi": "10.1145/3744169.3744179",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3744169.3744178",
          "doi": "10.1145/3744169.3744178",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1109/ICT4S55073.2022.00020",
          "doi": "10.1109/ICT4S55073.2022.00020",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/7267y-nj702",
      "content_html": "<p>Since I wrote about the new <a href=\"/notes/disentangling-git-with-bluesky\">ATProto-powered Tangled Git forge</a> a few months ago, it's come along by leaps and bounds!</p>\n<p>First, and most excitingly, they've added <a href=\"https://blog.tangled.sh/ci\">continuous integration via Spindles</a> which are built in a nice ATProto style:</p>\n<blockquote>\n<p>When you push code or open a pull request, the knot hosting your repository\nemits a pipeline event (sh.tangled.pipeline). Running as a dedicated service,\nspindle subscribes to these events via websocket connections to your knot.</p>\n</blockquote>\n<p>The pipelines are Nix-only right now, so I braved using it<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> for a new <a href=\"https://tangled.sh/@anil.recoil.org/ocaml-gpx\">GPS Exchange Format library in OCaml</a> that I wrote. The <a href=\"https://tangled.sh/@anil.recoil.org/ocaml-gpx/pipelines\">pipelines</a> should look familiar, and the <a href=\"https://tangled.sh/@anil.recoil.org/ocaml-gpx/blob/main/.tangled/workflows/build.yml\">description format</a> very straightforward.</p>\n<p>Secondly, the service has <a href=\"https://blog.tangled.sh/stacking\">added</a> support for <a href=\"https://github.com/jj-vcs/jj\">JJ</a> stacked pull requests, which are the closest I've seen to the <a href=\"https://blog.janestreet.com/ironing-out-your-release-process/\">Jane Street Iron diff workflow</a> which I've been wanting to try in open source for ages.  
You can see the interdiff review process on a recent PR by <a href=\"https://tangled.sh/@winter.bsky.social\">Winter</a> who added support for <a href=\"https://tangled.sh/@tangled.sh/core/pulls/423\">engine-agnostic Spindle workflows</a>, which should pave the path for a Docker or BuildKit engine alongside the existing Nixery-based one.</p>\n<p>And thirdly, the general quality-of-life of the web frontend has improved dramatically, with a nice <a href=\"https://tangled.sh/\">timeline</a>, <a href=\"https://tangled.sh/@anil.recoil.org?tab=repos\">repo list</a>, and <a href=\"https://tangled.sh/@anil.recoil.org\">profile pages</a>. I'm running two knots right now (one on Recoil, and one in the Cambridge Computer Lab), and both have been very pain-free. I wrote one of the earliest <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\">Dockerfiles</a> for it, but there's now a <a href=\"https://tangled.sh/@tangled.sh/knot-docker\">community-maintained Knot Docker</a> setup which I've switched to. Doesn't take very long at all; give it a try!</p>\n<p>Because I've been using Tangled so much, I <a href=\"https://github.com/ocaml/dune/pull/12197\">added support for Tangled metadata</a> to Dune to make OCaml package maintenance easier. This will appear in Dune 3.21 in a few months, but in the meanwhile enjoy the first <a href=\"https://ocaml.org/p/mlgpx/latest\">Tangled.sh package on opam</a>. It's a simple GPX library I used in my <a href=\"/notes/owntracks-and-lifecycle\">recent trip</a> to Botswana. All you need in your <code>dune-project</code> will be:</p>\n<pre><code>(lang dune 3.21)\n(name mlgpx)\n(generate_opam_files true)\n(source (tangled @anil.recoil.org/ocaml-gpx))\n</code></pre>\n<p>The only major thing I'm missing from Tangled is support for private repositories now, but I'm very content using it for public content today. 
Beware as usual that it's still in alpha, so don't trust super-ultra-mega-important stuff to it unless you've git mirrored elsewhere.</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>...with the help of my trusty local Nixer <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>. No one should ever Nix by themselves.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). Tracking locations with OwnTracks, Life Cycle and Home Assistant. <a href=\"https://doi.org/10.59350/13ras-yd957\" target=\"_blank\"><i>10.59350/13ras-yd957</i></a></li>\n<li>Madhavapeddy (2025). Socially self-hosting source code with Tangled on Bluesky. <a href=\"https://doi.org/10.59350/r80vb-7b441\" target=\"_blank\"><i>10.59350/r80vb-7b441</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/tangled-and-ci",
      "title": "mlgpx is the first Tangled-hosted package available on opam",
      "summary": "The Tangled git forge has recently gained support for CI, stacked pull requests and also the Dune build system can generate Tangled metadata easily now for OCaml packages hosted there.",
      "date_published": "2025-08-17T00:00:00.000000Z",
      "date_modified": "2025-08-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "bluesky",
        "tangled",
        "ocaml",
        "git"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/13ras-yd957",
          "doi": "10.59350/13ras-yd957",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r80vb-7b441",
          "doi": "10.59350/r80vb-7b441",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/13ras-yd957",
      "content_html": "<p>I'm emerging reenergised from an <a href=\"https://www.flickr.com/photos/avsm/albums/72177720328187549\">epic trip</a> to the\nOkavango Delta in Botswana, where we <a href=\"https://totalkatastrophe.blogspot.com/2025/07/adventures-in-botswana-into-delta.html\">spent\nweeks</a>\nin the wilderness gathering ground truth for <a href=\"/papers/2025-tessera\">TESSERA</a>\n(and enjoying the wildlife!). Piecing together our locations was quite\nimportant, and so I took a cue from <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and deployed\n<a href=\"https://owntracks.org/\">OwnTracks</a> and <a href=\"https://www.home-assistant.io/integrations/device_tracker/\">HomeAssistant Device\nTracker</a> before I\nheaded out there. There were four interesting tech pieces that resulted from\nthis: a local iPhone app to determine my GPS accuracy while entirely remote;\nthen merging my Home Assistant location database hile in the field,\nthen reverse engineering 8 years worth of location data out of an old iOS app to backfill data,\nand finally deploying my own self-hosted OwnTracks on Recoil for the longer term.</p>\n<p><img src=\"/images/okavango-1.webp\" alt=\"%c\" ></p>\n<h2 id=\"how-accurate-are-photo-locations\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-accurate-are-photo-locations\"></a>How accurate are photo locations?</h2>\n<p>Device power is at a premium while in the remote wilderness, and a lot of the\nusual methods that iOS uses to figure out precise locations (such as Wifi SSIDs)\nsimply don't exist there. So I need some way to figure out if our devices actually\nhad a location lock, and if so what the error was (so I could wiggle my phone in\nthe air until it got a lock).</p>\n<p>Since we had utterly minimal connectivity, I managed to text <a href=\"https://nick.recoil.org\">Nick Ludlam</a> back\nat London home base to seek his help. 
Nick\n<a href=\"https://bsky.app/profile/nick.recoil.org/post/3lvbcqb5f2c2k\">vibed</a> an iPhone\nPWA that uses <a href=\"https://github.com/exif-heic-js/exif-heic-js\">heic.js</a> to dump\nout all the Exif tags for a given photograph locally on my device. This solved\nthe immediate problem while working entirely offline! You can grab its\n<a href=\"https://github.com/nickludlam/svelte-exifdump\">source</a> for yourself or <a href=\"https://exif.recoil.org\">try it\ndirectly</a>.</p>\n<h2 id=\"figuring-out-my-home-assistant-location-while-remote\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#figuring-out-my-home-assistant-location-while-remote\"></a>Figuring out my Home Assistant location while remote</h2>\n<p>I've used <a href=\"https://home-assistant.io\">Home Assistant</a> for many years to manage\nmy household devices. They supply a handy <a href=\"https://www.home-assistant.io/integrations/ios/\">iOS\napp</a> which uses Location\nServices to keep track of where you are, in order to trigger events (&quot;Turn on\nthe heating when I arrive home&quot;). When I was out in Botswana, we had a bunch of\ncameras with us such as Canon R5 MKII and an Olympus E-M1 MKII. 
These cameras\ndon't track GPS reliably, which makes them a pain to realign with iPhone and\nother footage after the fact.</p>\n<p><img src=\"/images/okavango-2.webp\" alt=\"%c\" title=\"Idyllic and peaceful, and definitely no signal\" ></p>\n<p>Luckily, I realised that my Home Assistant iOS app was actually recording its\nlocation locally to my device, and syncing it back home on\nthe rare occasions when we had signal (we discovered that by standing on top\nof an old termite mound and holding your phone high in the air you could\noccasionally get one bar of connectivity for important texts).</p>\n<p>I retrieved my Home Assistant database from Cambridge (via Tailscale), and\nused <a href=\"https://github.com/rubiojr/hass2geo\">hass2geo</a> to dump out <a href=\"https://en.wikipedia.org/wiki/GPS_Exchange_Format\">GPX format</a>\ntraces of our locations. Then, I vibe coded a Python script to convert the\nwaypoint format into &quot;trackfiles&quot;, so that they could be loaded into the\n<a href=\"https://helpx.adobe.com/lightroom-classic/help/maps-module.html\">Lightroom map view</a>.\nAfter this it was relatively easy going, as Lightroom matches up the GPS\ntraces to the camera picture times, and stamps them accordingly. 
It's not\nperfect as it doesn't interpolate between track traces, but good enough for\nthe sort of accuracy I wanted (&quot;in the NG32 reserve in Botswana&quot;).</p>\n<p>In the long term though, this isn't a great solution: Home Assistant only\nkeeps a few days of location history, and getting fine-grained GPS traces\nvia the API requires manually dumping out the internal database.</p>\n<p><img src=\"/images/okavango-3.webp\" alt=\"%c\" title=\"It got cold at night on the salt plains\" ></p>\n<h2 id=\"reverse-engineering-life-cycle-on-my-phone\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reverse-engineering-life-cycle-on-my-phone\"></a>Reverse engineering Life Cycle on my phone</h2>\n<p>I then looked more closely at which apps were using location on my phone, and\ndiscovered the <a href=\"https://northcube.com/lifecycle/\">Life Cycle</a> app that had been\nthere since 2018!  This was an app I installed way back to categorise where I\nwas spending my time, and then promptly forgot about it when the pandemic\nstarted.  The app developers are generally privacy friendly, and it also had an\noption to output a CSV file of all the location history.</p>\n<p>Unfortunately, this CSV file is incomplete, as it includes locations as\nnames, but not their exact GPS location (which the app has internally).\nI opened a support request with the app developer, but they responded\nthat they don't plan to add any new features to the app. That's fair\nenough, but I really wanted to get the data out!</p>\n<p><a href=\"https://www.flickr.com/photos/avsm/54709381223/in/dateposted-public/\"> <img src=\"/images/okavango-4.webp\" alt=\"%c\" title=\"This is how grumpy I was when I realised my GPS data was trapped in my phone\" > </a></p>\n<p>The obvious solution is to extract the app database from my iPhone and\nreverse engineer it. 
While this <a href=\"/notes/uiprototype\">used to be easy</a>, the\n<a href=\"https://developer.apple.com/documentation/security/app-sandbox\">app sandboxes</a> mean\nthat modern iPhones are very locked down. I had to manually initiate a tethered\nand unencrypted backup to my Mac (as opposed to an iCloud backup), and then\nextract the Life Cycle app from my phone using <a href=\"https://imazing.com/\">iMazing</a>.</p>\n<pre><code>$ ls \n   11440 Aug 11 15:43 .lock\n     192 Aug 11 15:43 Container/\n    1555 Aug 11 15:43 iTunesMetadata.plist\n 6661627 Aug 11 15:43 Life Cycle.imazingapp\n      96 Aug 11 15:43 Payload/\n$ cd Container/Library/ &amp;&amp; file life.db\nlife.db: SQLite 3.x database, user version 20, last written using SQLite\nversion 3043002, file counter 1, database pages 3443, cookie 0x65, schema 4,\nUTF-8, version-valid-for 1\n</code></pre>\n<p>We've found the <code>life.db</code>, and it's now in an unencrypted sqlite3 format!\nBut how is the data inside actually structured? I didn't have time to figure it out,\nbut I pointed Claude Code at the database in an attempt to reverse engineer\nthe format into something I could export. When I consulted <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>, he\nsuggested using <a href=\"https://owntracks.org/\">OwnTracks</a> as a self-hosted location tracker.</p>\n<p>Rather than get Claude to directly interpret the data, I directed it\nto examine the sqlite3 database and to emit a script that would convert\nit to a format that OwnTracks could use. It did this pretty accurately:</p>\n<pre><code># Discovered schema with 24 tables tracking location, motion, activities\nsqlite3 life.db &quot;.tables&quot;\nsqlite3 life.db &quot;.schema&quot;\n\n# Created JSON export script for primary observational data\n- LocationEvent: Raw GPS coordinates (31,743 records)\n- Motion: Movement classification (77,385 records)\n- Activity: Behavior categorization (20,273 records)\n- Visit: Stationary location periods (9,903 records)\n\n2. 
Initial Upload Implementation\n\n# Core transformation mapping\nLifeCycle → OwnTracks:\n- timestamp → tst (Unix epoch)\n- latitude/longitude → lat/lon\n- hAccuracy → acc\n- altitude → alt (with validation)\n- speed → vel (m/s to km/h conversion)\n- WiFi context → SSID/BSSID\n\nFeatures implemented:\n- Duplicate prevention via state tracking\n- Resume capability for interrupted uploads\n- Quality filtering (&gt;500m accuracy rejected)\n- Progress monitoring\n\n3. Enhanced Data Integration\n\n# Added motion activities from 77K motion records\n&quot;motionactivities&quot;: [\n  {&quot;type&quot;: &quot;automotive&quot;, &quot;confidence&quot;: 2},\n  {&quot;type&quot;: &quot;walking&quot;, &quot;confidence&quot;: 0}\n ]\n</code></pre>\n<p>This implementation went well beyond what I had in mind: in addition to just\ntranslating the GPS, it also looked at the contextual data (such as the iOS\nmotion tracking information), and also wrote a comprehensive script (available\n<a href=\"https://tangled.sh/strings/anil.recoil.org/3lwjtix2ban22\">here</a>) that collected it all together and uploaded it to my server.</p>\n<p>Overall, the fact that I could reverse engineer a decade old app, autodetect\nits data structures and convert with high fidelity into a self-hosted infrastructure\n<em>in about 15 minutes of agent-driven coding</em> is a capability I find absolutely incredible.\nThis sort of coding isn't suitable for everything, of course, but it's great for\nthe tedious glue code which makes self-hosting often so painful.</p>\n<p><a href=\"https://www.flickr.com/photos/avsm/54709383128/in/dateposted-public/\"> <img src=\"/images/okavango-5.webp\" alt=\"%c\" title=\"On some occasions, we definitely did not want to be found\" > </a></p>\n<h1 id=\"deploying-owntracks-on-recoil\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#deploying-owntracks-on-recoil\"></a>Deploying OwnTracks on Recoil</h1>\n<p>I now had my past eight years of locations in one place, so it seemed a\ngood 
time to switch away completely to a self-hosted location system\ndedicated to that purpose.</p>\n<p>Deploying the aforementioned OwnTracks requires two things: a secure MQTT\nendpoint to receive events from the iOS app, and then a <a href=\"https://github.com/owntracks/docker-recorder\">recorder</a> that\nsubscribes to it and compacts the results into location records. The recorder\nexposes an HTTP API that frontends can use to render map like interfaces,\nor it works headlessly as well for other API uses of your location.</p>\n<p>I used this docker-compose.yml, as I couldn't find an off-the-shelf\nsetup:</p>\n<pre><code>services:\n  frontend:\n    image: caddy:2-alpine\n    depends_on:\n      - mqtt\n    ports:\n      - &quot;443&quot;\n      - &quot;443/udp&quot;\n      - &quot;80&quot;\n    volumes:\n      - ./caddy/data:/data\n      - ./caddy/htdocs:/htdocs\n      - ./caddy/conf:/etc/caddy\n    restart: always\n  owntracks-frontend:\n    image: owntracks/frontend\n    volumes:\n      - ./frontend/config.js:/usr/share/nginx/html/config/config.js\n    environment:\n      - SERVER_HOST=otrecorder\n      - SERVER_PORT=8083\n    restart: unless-stopped\n  otrecorder:\n    image: owntracks/recorder\n    restart: unless-stopped\n    environment:\n      - VIRTUAL_HOST=&lt;host&gt;\n      - TZ=Europe/London\n      - OTR_USER=&lt;user&gt;\n      - OTR_PASS=&lt;pass&gt;\n      - OTR_HTTPHOST=0.0.0.0\n      - OTR_HTTPPORT=8083\n      - OTR_HOST=mqtt\n      - OTR_PORT=1883\n    volumes:\n      - ./owntracks/config:/config\n      - ./owntracks/store:/store\n  mqtt:\n    container_name: mqtt\n    build: .\n    environment:\n      - TZ=Europe/London\n    ports:\n      - &quot;185.33.27.72:8883:8883&quot;\n    volumes:\n      - ./mosquitto/data:/mosquitto/data\n      - ./mosquitto/logs:/mosquitto/logs\n      - ./mosquitto/conf:/mosquitto/config\n      - ./mosquitto/conf/passwd:/etc/mosquitto/passwd:ro\n      - 
./caddy/data/caddy/certificates/acme-v02.api.letsencrypt.org-directory/track.recoil.org:/etc/mosquitto/crt:ro\n    restart: unless-stopped\n</code></pre>\n<p>You can replace the direct environment variables with an env_file\n(recommended!). This compose file uses Caddy to set up Let's Encrypt certificates\nfor the MQTT server automatically, and then you just need to drop in a\n<code>caddy/conf/Caddyfile</code> with the HTTP auth:</p>\n<pre><code>domainname.org {\n  encode gzip\n  reverse_proxy http://owntracks-frontend\n  basicauth * {\n   user hashed-password\n  }\n}\n</code></pre>\n<p>This uses normal browser auth to protect your location, and you can also run\nthe whole thing behind Tailscale or similar to avoid exposing an\nInternet-visible port at all.</p>\n<p><a href=\"https://www.flickr.com/photos/avsm/54709380903/in/dateposted-public/\"> <img src=\"/images/okavango-6.webp\" alt=\"%c\" title=\"This harrier's piercing gaze would find us anywhere\" > </a></p>\n<h2 id=\"what-next-for-locations\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-next-for-locations\"></a>What next for locations?</h2>\n<p>I've been interested in &quot;spatial programming&quot; for ages, and while I was away\n<a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> uploaded a preprint we've been working on, describing a system for\n<a href=\"/papers/2025-bifrost\">programming with physical locations called &quot;Bifrost&quot;</a>. We're\nparticularly interested in the intersection between physical devices, locations\nand how to manage their interactions. This is a continuation of the work on\n<a href=\"/projects/osmose\">Osmose</a> and <a href=\"/papers/2023-hotnets-sns\">spatial naming</a>. Stay tuned!</p><h1>References</h1><ul><li>Feng et al (2025). TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis. arXiv. 
<a href=\"https://doi.org/10.48550/arXiv.2506.20380\" target=\"_blank\"><i>10.48550/arXiv.2506.20380</i></a></li>\n<li>Gibb et al (2023). Where on Earth is the Spatial Name System? ACM. <a href=\"https://doi.org/10.1145/3626111.3628210\" target=\"_blank\"><i>10.1145/3626111.3628210</i></a></li>\n<li>Millar et al (2025). An Architecture for Spatial Networking. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2507.22687\" target=\"_blank\"><i>10.48550/arXiv.2507.22687</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/owntracks-and-lifecycle",
      "title": "Tracking locations with OwnTracks, Life Cycle and Home Assistant",
      "summary": "Setting up self-hosted location tracking using OwnTracks and reverse engineering Life Cycle app data with Claude Code for field work in Botswana.",
      "date_published": "2025-08-14T00:00:00.000000Z",
      "date_modified": "2025-08-14T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "gps",
        "spatial",
        "selfhosting",
        "claude",
        "llms"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2506.20380",
          "doi": "10.48550/arXiv.2506.20380",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3626111.3628210",
          "doi": "10.1145/3626111.3628210",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2507.22687",
          "doi": "10.48550/arXiv.2507.22687",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/nn1d6-xgt62",
      "content_html": "<p>I've been hacking with <a href=\"https://toao.com\">Sadiq Jaffer</a> (<a href=\"https://toao.com/blog/ocaml-0725\">^</a>),\n<a href=\"https://jon.recoil.org\">Jon Ludlam</a> (<a href=\"https://jon.recoil.org/blog/2025/07/week28.html\">^</a>) and\n<a href=\"https://ryan.freumh.org\">Ryan Gibb</a> (<a href=\"https://ryan.freumh.org/enki.html\">^</a>) on various approaches to\nimproving the <a href=\"/notes/claude-copilot-sandbox\">agentic coding experience</a> for OCaml.</p>\n<p>We jotted down our notes in a <a href=\"https://www.cl.cam.ac.uk/~avsm2/2025-ocaml-ai-draft1.pdf\">draft paper</a> to keep track of everything going on, including <a href=\"https://toao.com/blog/ocaml-local-code-models\">summarising</a> previous experiments with Qwen3 for <a href=\"https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/\">FoCS</a>. Since then, there's been a flurry of extra activity from others which we need to integrate!</p>\n<ul>\n<li><a href=\"https://academic.mseri.me/\">Marcello Seri</a> started pushing to my vibe coded <a href=\"https://tangled.sh/@anil.recoil.org/ocaml-mcp\">OCaml MCP library</a>, making him user number 2 of that!</li>\n<li>Then <a href=\"https://github.com/tmattio\">Thibaut Mattio</a> announced a bunch of software, starting with <a href=\"https://discuss.ocaml.org/t/announcing-raven-scientific-computing-for-ocaml-alpha-release/16913\">a collection of libraries and tools for numerical computing and machine learning</a> and also another <a href=\"https://discuss.ocaml.org/t/building-ocaml-mcp-what-features-would-you-want/16914\">MCP server</a>. I haven't had a chance to try the MCP server yet, but I hope I can retire mine...</li>\n<li><a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> started hacking on an agent-friendly <a href=\"https://github.com/samoht/merlint\">merlint</a> tool that spots common problems in style and choices and gives CLI feedback in a style easily consumed by claude. 
I've <a href=\"https://github.com/samoht/merlint/issues\">started using it</a> despite its pre-alpha status.</li>\n<li><a href=\"https://jon.recoil.org\">Jon Ludlam</a>'s been <a href=\"https://jon.recoil.org/blog/2025/07/week28.html\">getting</a> the <a href=\"https://toao.com/blog/opam-archive-dataset\">opam embeddings</a> into shape to be suitable as an MCP server that can search the entire opam ecosystem. odoc v3 has also <a href=\"https://discuss.ocaml.org/t/new-odoc-3-generated-package-documentation-is-live-on-ocaml-org/16967\">gone live</a> after lots of work, and <a href=\"https://github.com/davesnx\">David Sancho</a>'s support for <a href=\"https://github.com/ocaml/odoc/pull/1341\">Markdown odoc output</a> makes this process easier.</li>\n</ul>\n<p>This is all fairly straightforward MCP work that improves the short-term experience. We'll get to the <a href=\"https://arxiv.org/abs/2505.24760\">RL-VR</a> ideas later...\nIf anyone else is hacking on something agent-related, do <a href=\"https://discuss.ocaml.org\">post on OCaml Discuss</a> and let us know! I'm hoping to update the paper later in August to round up the efforts above.</p><h1>References</h1><ul><li>Madhavapeddy (2025). Oh my Claude, we need agentic copilot sandboxing right now. <a href=\"https://doi.org/10.59350/aecmt-k3h39\" target=\"_blank\"><i>10.59350/aecmt-k3h39</i></a></li>\n<li>Stojanovski et al (2025). REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2505.24760\" target=\"_blank\"><i>10.48550/arXiv.2505.24760</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/cresting-the-ocaml-ai-hump",
      "title": "Cresting the OCaml AI humps",
      "summary": "Community efforts to improve agentic coding experience for OCaml including MCP libraries, opam embeddings, and tooling improvements.",
      "date_published": "2025-07-18T00:00:00.000000Z",
      "date_modified": "2025-07-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "ai",
        "llms"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/aecmt-k3h39",
          "doi": "10.59350/aecmt-k3h39",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2505.24760",
          "doi": "10.48550/arXiv.2505.24760",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/25bk3-kr866",
      "content_html": "<p>I've just taken <a href=\"https://unmute.sh/\">Kyutai's speech-to-text model</a> for a spin on my Mac laptop, and it's stunningly good. As background, this is what the prolific <a href=\"https://github.com/laurentmazare\">Laurent Mazare</a> has been hacking on; he has made a ton of contributions to the OCaml community as well, such as <a href=\"https://github.com/LaurentMazare/ocaml-torch\">ocaml-torch</a>, and starred in a very fun <a href=\"https://signalsandthreads.com/python-ocaml-and-machine-learning/\">Signals and Threads episode</a> on machine learning at Jane Street back in 2020.</p>\n<p>You can get microphone speech-to-text running on your Mac in a few commands, assuming you have <a href=\"https://github.com/astral-sh/uv\">uv</a> installed (which you should!).</p>\n<pre><code>git clone https://github.com/kyutai-labs/delayed-streams-modeling\ncd delayed-streams-modeling\nuvx --with moshi-mlx python scripts/stt_from_mic_mlx.py\n</code></pre>\n<p>It understands my accent near perfectly; if that isn't a machine learning miracle, I don't know what is! I'm looking forward to trying this out more with our <a href=\"/ideas/embedded-whisper\">Low power audio transcription with Whisper</a> project over the summer with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>.</p>",
      "url": "https://anil.recoil.org/notes/kyutai-streaming-voice-mlx",
      "title": "Using Kyutai's low latency audio models on macOS in one command",
      "summary": "Quick setup guide for running Kyutai's high-quality speech-to-text model locally on Mac using their MLX implementation.",
      "date_published": "2025-07-16T00:00:00.000000Z",
      "date_modified": "2025-07-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "llm",
        "ai",
        "audio"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/nmcab-py710",
      "content_html": "<p>I was a bit sleepy getting into the Royal Society <a href=\"https://royalsociety.org/science-events-and-lectures/2025/07/future-of-scientific-publishing/\">Future of Scientific\nPublishing</a>\nconference early this morning, but was quickly woken up by the dramatic passion\non show as publishers, librarians, academics and funders all got together for a\n&quot;frank exchange of views&quot; at a meeting that didn't pull any punches!</p>\n<p>These are my hot-off-the-press livenotes, only lightly edited; a more cleaned up version will be available\nfrom the RS in due course.</p>\n<p><img src=\"/images/rspub-1.webp\" alt=\"%c\" title=\"Sir Mark Walport FRS opens up the conference\" ></p>\n<h2 id=\"mark-walport-sets-the-scene\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#mark-walport-sets-the-scene\"></a>Mark Walport sets the scene</h2>\n<p>Sir Mark Walport was a delightful emcee for the proceedings of the day, and\nopened with how important the moment is for the future of how we conduct science.\nAcademic publishing faces a perfect storm: peer review is buckling under\nenormous volume, funding models are broken and replete with perverse\nincentives, and the entire system groans with inefficiency.</p>\n<p>The Royal Society is the publisher of the world's oldest continuously published\nscientific journal <a href=\"https://royalsocietypublishing.org/journal/rstb\">Philosophical Transactions</a>\n(since 1665) and has convened this conference for academies worldwide. The\noverall question is: what <em>is</em> a scientific journal in 2025 and beyond?\nWalport traced the economic evolution of publishing: for centuries, readers\npaid through subscriptions (I hadn't realised that the <a href=\"https://royalsociety.org/blog/2015/03/philosophical-transactions-the-early-years/\">early editions of the RS</a>\nused to be sent for free to libraries worldwide until the current commercial\nmodel arrived about 80 years ago). 
Now, the pendulum has swung to open access,\nwhich creates perverse incentives that prioritize volume over quality. He called\nit a &quot;smoke and mirrors&quot; era where diamond open access models obscure who\n<em>actually</em> pays for the infrastructure of knowledge dissemination: is it the\npublishers, the governments, the academics, the libraries, or some combination\nof the above?  The profit margins of the commercial publishers answer that\nquestion for me...</p>\n<p>He then identified the transformative forces that are forcing change:</p>\n<ul>\n<li>LLMs have <a href=\"/papers/2025-ai-poison\">entered</a> the publishing ecosystem</li>\n<li>The proliferation of journals has created an attention economy rather than a knowledge economy</li>\n<li><a href=\"https://openreview.net/\">Preprint</a> archives are reshaping how research is shared quickly</li>\n</ul>\n<p>The challenges ahead while dealing with these are maintaining metadata\nintegrity, preserving the scholarly archive into the long term, and ensuring\nsystematic access for meta-analyses that advance human knowledge.</p>\n<h2 id=\"historical-perspectives-350-years-of-evolution\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#historical-perspectives-350-years-of-evolution\"></a>Historical Perspectives: 350 Years of Evolution</h2>\n<p>The opening pair of speakers were unexpected: they brought a historical and\nlinguistic perspective to the problem. I found these two talks the\nhighlights of the day!  Firstly <a href=\"https://www.st-andrews.ac.uk/history/people/akf\">Professor Aileen\nFyfe</a> drew upon her research\nfrom 350 years of the Royal Society archives. Back in the day, there was no\nreal fixed entity called a &quot;scientific journal&quot;. 
Over the centuries, everything\nfrom editorial practices to publication methods to means of dissemination\nhas transformed repeatedly, so we shouldn't view the status quo as set in stone.</p>\n<p><img src=\"/images/rspub-2.webp\" alt=\"%c\" title=\"Professor Aileen Fyfe talks publishing history\" ></p>\n<p>While the early days of science were essentially people writing letters to each\nother, the post-WWII era of journals marked the shift to &quot;scale&quot;. The tools for\ndistance communication (i.e. publishing collected issues) and universities\nswitching from being teaching focused over to today's research-centric\npublishing ecosystem were both key factors. University scientists\nproduced 30% of published articles in 1900; by 2020, that figure exceeded 80%.\nThis parallels the globalization of science itself in the past century;\nresearch has expanded well beyond its European origins to encompass almost all\ninstitutions and countries worldwide.</p>\n<p>Amusingly, Prof Fyfe pointed out that a 1960 Nature editorial asked <em>&quot;<a href=\"https://www.nature.com/articles/186018a0\">How many more new\njournals?</a>&quot;</em> even back then! The 1950s\ndid bring some standardization efforts (nomenclature, units, symbols),\nalthough citation formats seem to robustly resist uniformity. English was also\nexplicitly selected as the &quot;<a href=\"https://en.wikipedia.org/wiki/Languages_of_science\">default language for\nscience</a>&quot;, and peer review\nwas also formalised via papers like <em>&quot;<a href=\"https://journals.sagepub.com/doi/10.1177/000456327901600179\">Uniform requirements for manuscripts submitted to biomedical journals</a>&quot;</em> (in 1979). 
<a href=\"https://nsf-gov-resources.nsf.gov/pubs/1977/nsb77468/nsb77468.pdf\">US Congressional hearings</a>\nwith the NSF began distinguishing peer review from other evaluation methods.</p>\n<p><img src=\"/images/rspub-3.webp\" alt=\"%c\" title=\"Professor Aileen Fyfe shows the globalisation of research over the years\" ></p>\n<p>All of this scale was then &quot;solved&quot; by financialisation after WWII. At the turn of the\n20th century, almost no journals generated any profit (the Royal Society\ndistributed its publications freely). By 1955, financial pressures and the growing scale of submissions forced a\n<a href=\"https://journals.sagepub.com/doi/10.1177/0073275321999901\">reckoning</a>, leading\nto more self-supporting models by the 1960s. An era of mergers and acquisitions\namong journals followed, reshaping the <a href=\"https://serials.uksg.org/articles/259/files/submission/proof/259-1-259-1-10-20150210.pdf\">scientific information system</a>.</p>\n<p><a href=\"https://www.universiteitleiden.nl/en/staffmembers/vincent-lariviere#tab-1\">Professor Vincent Larivière</a> then took the stage to dispel some myths of English monolingualism in scientific publishing. While <a href=\"https://garfield.library.upenn.edu/essays/V1p019y1962-73.pdf\">English offers some practical benefits</a>, the reality at non-Anglophone institutions (like his own Université de Montréal) is that researchers spend significantly more time reading, writing, and processing papers as non-native language speakers, and often face higher rejection rates as a result of this.\nThis wasn't always the case though; Einstein published primarily in German, not English!</p>\n<p>He went on to note that today's landscape for paper language choices is more\ndiverse than is commonly assumed. English represents only 67% of publications,\na figure which itself has been inflated by non-English papers that are commonly\npublished with English abstracts. 
Initiatives like the <a href=\"https://pkp.sfu.ca/2025/03/05/ojs-workshops-indonesia/\">Public Knowledge\nProject</a> have enabled\ngrowth in Indonesia and Latin America, for example.  Chinese journals now\npublish twice the volume of English-language publishers, but are difficult to\nindex, which makes Larivière's numbers even more interesting: a growing majority\nof the world is no longer publishing in English! I also heard this on my trip\nin 2023 to China with the Royal Society; the scholars we met had a sequence of\nChinese language journals they submitted to, often before &quot;translating&quot; the\noutputs to English journals.</p>\n<p><img src=\"/images/rspub-4.webp\" alt=\"%c\" title=\"Professor Lariviere uses OpenAlex to show non-English linguistic breakdowns\" ></p>\n<p>All this suggests that the major publishers' market share is smaller than commonly believed, which gives us reason to hope for change! Open access adoption worldwide currently varies fairly dramatically by per-capita <a href=\"https://ourworldindata.org/grapher/scientific-publications-per-million\">wealth and geography</a>, but reveals substantive greenspace for publishing beyond the major commercial publishers. Crucially, Larivière argued that research &quot;prestige&quot; is a socially constructed phenomenon, and not intrinsic to quality.</p>\n<p>In the Q&amp;A, Magdalena Skipper (Nature's Editor-in-Chief) noted that the private sector is reentering academic publishing (especially <a href=\"https://www.science.org/content/article/china-tops-world-artificial-intelligence-publications-database-analysis-reveals\">in AI topics</a>). Fyfe noted the challenge of tracking private sector activities; e.g. varying corporate policies on patenting and disclosure mean they are hard to index. 
A plug from <a href=\"https://coherentdigital.net/\">Coherent Digital</a> noted they have catalogued 20 million reports from non-academic research; this is an exciting direction (we've got <a href=\"/ideas/grey-lit-crawl\">30TB of grey literature</a> on our servers, still waiting to be categorised).</p>\n<p><img src=\"/images/rspub-5.webp\" alt=\"%c\" title=\"Professor Lariviere shows how uneven citations are across languages and geographies\" ></p>\n<h2 id=\"what-researchers-actually-need-from-stem-publishing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-researchers-actually-need-from-stem-publishing\"></a>What researchers actually need from STEM publishing</h2>\n<p>Our very own <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> opened with a sobering demonstration of &quot;AI\npoisoning&quot; in the literature, referencing <a href=\"/static/papers/2025-ai-poison.pdf\">our recent Nature\ncomment</a>. He did the risky-but-catchy\ngeneration of a plausible-sounding but entirely fabricated conservation study\nusing an LLM and noted how economically motivated rational actors might quite\nreasonably use these tools to advance their agendas via the scientific record.\nAnd recovering from this will be very difficult indeed once it gets mixed up with\nreal science.</p>\n<p><img src=\"/images/rspub-6.webp\" alt=\"%c\" title=\"Bill talks about our recent AI poisoning piece\" ></p>\n<p>Bill then outlined our <a href=\"/projects/ce\">emerging approach to subject-wide synthesis</a> via:</p>\n<ul>\n<li><strong>Systematic reviews</strong>: Slow, steady, comprehensive</li>\n<li><strong>Rapid reviews</strong>: Sprint-based approaches for urgent needs</li>\n<li><strong>Subject-wide evidence synthesis</strong>: Focused sectoral analyses</li>\n<li><strong>Ultrafast bespoke reviews</strong>: AI-accelerated with human-in-the-loop</li>\n</ul>\n<p>Going back to what journals are <em>for</em> in 2025, Bill then discussed how they were\noriginally 
vehicles for exchanging information through letters, but now serve\nprimarily as stamps of authority and quality assurance. In an &quot;AI slop world,&quot;\nthis quality assurance function becomes existentially important, but shouldn't\nnecessarily be implemented within the current system of incentives. So then, how do\nwe maintain trust when the vast majority of submissions may soon be\nAI-generated? <em>(Bill and I scribbled down a plan on the back of a napkin for\nthis; more on that soon!)</em></p>\n<p><img src=\"/images/rspub-7.webp\" alt=\"%c\" title=\"Bill also does a cheeky advert for his Conservation Concepts channel!\" ></p>\n<h3 id=\"early-career-researcher-perspectives\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#early-career-researcher-perspectives\"></a>Early Career Researcher perspectives</h3>\n<p><a href=\"https://www.york.ac.uk/psychology/staff/postdocs/meekings,-sophie/\">Dr. Sophie Meekings</a> then took the stage to discuss the many barriers facing early career researchers (ECRs). They're on short-term contracts, are dependent on other people's grant funding, and yet are the ones conducting the frontline research that drives scientific progress. And this is <em>after</em> years spent on poorly paid PhD stipends!</p>\n<p>ECRs require:</p>\n<ul>\n<li>clear, accessible guidelines spelling out each publishing stage without requiring implicit knowledge of the &quot;system&quot;</li>\n<li>constructive, blinded peer review that educates rather than gatekeeps</li>\n<li>consistent authorship conventions like <a href=\"https://www.elsevier.com/researcher/author/policies-and-guidelines/credit-author-statement\">CRediT</a> (Contributor Roles Taxonomy)</li>\n</ul>\n<p>Dr. Meekings then noted how the precarious nature of most ECR positions creates cascading complications for individuals. When job-hopping between short-term contracts, who funds the publication of work from previous positions? 
How do ECRs balance completing past research with new employers' priorities? <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> also had this issue when joining my group a few years ago, as it took a significant portion of her time in the first year to finish up her previous publication from her last research contract.</p>\n<p>If we're going to fix the system itself, then ECRs need better incentives for PIs to publish null results and exploratory work, the councils need to improve support for interdisciplinary research that doesn't fit traditional journal boundaries (as these are the frontiers between &quot;conventional&quot; science where many ECRs will work), and there needs to be recognition that ECRs often lack the networks for navigating journal politics where editors rule supreme.</p>\n<p>Dr. Meekings summarized ECR needs with an excellent new acronym (SCARF) that drew a round of applause!</p>\n<ul>\n<li><strong>S</strong>peed in publication processes</li>\n<li><strong>C</strong>larity in requirements and decisions</li>\n<li><strong>A</strong>ffordability of publication fees</li>\n<li><strong>R</strong>ecognition of contributions</li>\n<li><strong>F</strong>airness in review and credit</li>\n</ul>\n<p><img src=\"/images/rspub-8.webp\" alt=\"%c\" title=\"Dr Sophie Meekings' SCARF principles for ECRs\" ></p>\n<p>The audience Q&amp;A was quite robust at this point. The first question was about how we might extend the evidence synthesis approach more widely.\n<a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> noted that we are currently extending this to education working with <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a>. Interconnected datasets <em>across</em> subjects are an obvious future path for evidence datasets, with common technology for handling (e.g.) retracted datasets that can be applied consistently. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> are supervising <a href=\"/notes/eeg-interns-2025\">projects on evidence synthesis</a> this summer on just this topic here in Cambridge.</p>\n<p>Another question was why ECRs feel that double blind review is important. Dr. Meekings noted that reviewers may not take ECR peer reviews as seriously, but this could be fixed by opening up peer review and assigning credit <em>after</em> the process is completed and not during. Interestingly, the panel all like double-blind, which is the norm in computer science but not in other science journals. Someone from the BMJ noted there exists a lot of research into blinding; they summarised it that blinding doesn't work on the whole (people know who it is anyway) and open review doesn't cause any of the problems that people think it causes.</p>\n<p>A really interesting comment from Mark Walport was that a grand scale community project could work for the future of evidence collation, but this critically depends on breaking down the current silos since it doesn't work unless everyone makes their literature available. There was much nodding from the audience in support of this line of thinking.</p>\n<h2 id=\"charting-the-future-for-scientific-publishing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#charting-the-future-for-scientific-publishing\"></a>Charting the future for scientific publishing</h2>\n<p>The next panel brought together folks from across the scientific\npublishing ecosystem, moderated by Clive Cookson of the Financial Times. 
This\nwas a particularly frank and pointed panel, with lots of quite direct messages\nbeing sent between the representatives of libraries, publishers and funders!</p>\n<p><img src=\"/images/rspub-9.webp\" alt=\"%c\" title=\"Amy Brand from MIT Press opens the panel\" ></p>\n<p>Amy Brand (MIT Press) started by delivering a warning about conflating &quot;open to\nread&quot; with &quot;open to train on&quot;. She pointed out that when MIT Press did a survey\nacross their authors, many of them raised concerns about the reinforcement of\nbias through AI training on scientific literature. While many of the authors\nacknowledged a moral imperative to make science available for LLM training,\nthey also wanted the <em>choice</em> of making their own work used for this. She urged\nthe community to pause and ask fundamental questions like &quot;AI training, at what\ncost?&quot; and &quot;to whose benefit?&quot;. I did think she made a good point by drawing\nparallels with the early internet, where Brand pointed out that lack of\nregulation accelerated the decline of non-advertising-driven models. Her\nclosing question asked if search engines merely lead to AI-generated summaries,\nwhy serve the original content at all? This is something we discuss in our\n<a href=\"/papers/2025-internet-ecology\">upcoming Aarhus paper on an Internet ecology</a>.</p>\n<p><a href=\"https://experts.deakin.edu.au/66981-danny-kingsley\">Danny Kingsley</a> from Deakin University Library then delivered a biting perspective as a representative of libraries. She said that libraries are &quot;the ones that sign the cheques that keeps the system running&quot;, which the rest of the panel all disagreed with in the subsequent discussion (they all claimed to be responsible, from the government to the foundations).  
Her survey of librarians was interesting; they all asked for:</p>\n<ul>\n<li>Transparent peer review processes</li>\n<li>Unified expectations around AI declarations and disclosures</li>\n<li>Licensing as open as possible, resisting the &quot;salami slicing&quot; of specific use. We also ran across this problem of overly precise restrictions on use while <a href=\"/papers/2025-ai-poison\">building our paper corpus</a> for <a href=\"/projects/ce\">CE</a>.</li>\n</ul>\n<p>Kingsley had a great line that &quot;publishers are monetizing the funding mandate&quot;,\nwhich <a href=\"https://www.stats.ox.ac.uk/~deane/\">Charlotte Deane</a> later also said was the most succinct way she had heard\nto describe the annoyance we all have with the vast profit margins of\ncommercial publishers.  Kingsley highlighted this via the troubling practices\nin the IEEE and the American Chemical Society by charging to place repositories\nunder green open access. Her blunt assessment was that publishers are not\nnegotiating in good faith. Her talk drew the biggest applause of the day by\nfar.</p>\n<p>After this, <a href=\"https://wellcome.org/about-us/our-people/staff/john-arne-rottingen\">John-Arne\nRøttingen</a>\n(CEO of the Wellcome Trust) emphasised that funders depend on scientific\ndiscourse as a continuous process of refutations and discussions. He expressed\nconcern about overly depending on brand value as a proxy for quality, calling\nit eventually misleading even if it works sometimes in the short term. Key\npriorities for the WT are ensuring that reviewers have easy access to all\nliterature, supporting evidence synthesis initiatives to translate research\ninto impact, and controlling the open body of research outputs through digital\ninfrastructure to manage the new scale.  
However, his challenge lies in\nmaintaining sustainable financing models for all this research data; he noted\nexplicitly that the Wellcome would not cover open access costs for commercial\npublishers.</p>\n<p>Røttingen further highlighted the concerns of the Global Biodata Coalition (which he was a\nmember of) about US data resilience and framed research infrastructure\nas &quot;a global public good&quot; requiring collective investment and fair financing\nacross nations. Interestingly, he explicitly called out UNESCO as a weak force\nin global governance for this from the UN; I hadn't even realised that UNESCO\nwas responsible for this stuff!</p>\n<p>Finally, <a href=\"https://www.stats.ox.ac.uk/~deane/\">Prof Charlotte Deane</a> from the EPSRC also discussed what a scientific\njournal is for these days. It's not for proofreading or typesetting anymore and\n(as <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> also noted earlier), the stamp of quality is key. Deane\nargued that &quot;research completion&quot; doesn't happen until someone else can read it\nand reasonably verify the methods are sound; not something that can happen\nwithout more open access.  Deane also warned of the existential threat of <a href=\"/notes/ai-poisoning\">AI poisoning</a> since &quot;AI can make fake papers at a rate humans can't\nimagine. It won't be long before most of the content on the Internet will be AI\ngenerated&quot;.</p>\n<p>The audience Q&amp;A was <em>very</em> blunt here.  <a href=\"https://uniweb.uottawa.ca/view/profile/members/2846\">Stefanie Haustein</a> pointed out that we\nare pumping billions of dollars into the publishing industry, many of which\nare shareholder companies, and so we are losing a significant percentage of\neach dollar spent. 
There is enough money in the system, but it's very\ninefficiently deployed right now!</p>\n<p><a href=\"https://www.linkedin.com/in/richardsever\">Richard Sever</a> from openRxiv asked\nhow we pay for this when major funders like the NIH have issued a series of\n<em>unfunded</em> open data mandates over recent years. John-Arne Røttingen noted that\nUNESCO is a very weak global body and not influential here, but that we need\ncoalitions of the willing to build such open data approaches from the bottom\nup. Challenging the publisher hegemony can only be done as a pack, which led\nnicely onto the next session after lunch where the founder of\n<a href=\"https://openalex.org/\">OpenAlex</a> would be present!</p>\n<h2 id=\"who-are-the-stewards-of-knowledge-\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#who-are-the-stewards-of-knowledge-\"></a>Who are the stewards of knowledge?</h2>\n<p>After lunch (where sadly, the vegetarian options were terrible but\nluckily I had my trusty Huel bar!), we reconvened with a panel debating\nwho the stewards of the scientific record should be. This brought together\nperspectives from commercial publishers (Elsevier), open infrastructure advocates (OpenAlex),\nfunders (MRC), and university leadership (pro-VC of Birmingham).</p>\n<p><a href=\"https://www.elsevier.com/people/victoria-eva\">Victoria Eva</a> (<a href=\"https://researcheracademy.elsevier.com/publication-process/open-science/open-access-end-user-licenses\">SVP from\nElsevier</a>)\nopened by describing the &quot;perfect storm&quot; facing their academic publishing\nbusiness as they had 600k more submissions this year than the previous year.\nThere was a high level view on how their digital pipeline &quot;aims to insert\nsafeguards&quot; throughout the publication process to maintain integrity. 
She\nargued in general terms to view GenAI through separate lenses of trust and\ndiscoverability and argued that Elsevier's substantial technological investments\nposition them to manage both challenges well. I was\n<a href=\"https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science\">predisposed</a>\nto dislike excuses from staggeringly profitable commercial publishers, but I\ndid find her answers to providing bulk access to their corpus unsatisfying.\nWhile she highlighted their growing open access base of papers, she also noted\nthat the transition to open access cannot happen overnight (my personal\ntranslation is that this means slow-walking). She mentioned special cases in\nplace for\n<a href=\"https://www.elsevier.com/en-gb/about/open-science/research-data/text-and-data-mining\">TDM</a>\nin the Global South and healthcare access (presumably at the commercial\ndiscretion of Elsevier).</p>\n<p><a href=\"https://jasonpriem.org/\">Jason Priem</a> from <a href=\"https://openalex.org/\">OpenAlex</a>\n(part of <a href=\"https://ourresearch.org/\">OurResearch</a>) then offered a radically\ndifferent perspective. I'm a huge fan of OpenAlex, as we use it extensively in\nthe <a href=\"/projects/ce\">CE</a> infrastructure. He disagreed with the conference framing of\npublishers as &quot;custodians&quot; or &quot;stewards,&quot; noting that these evoke someone\nmaintaining a static, old lovely house. Science <em>isn't</em> a static edifice but a\ngrowing ecosystem, with more scientists alive today than at any point in\nhistory. He instead proposed a &quot;gardener&quot; as a better metaphor; the science\necosystem needs to nourish growth rather than merely preserving what exists.\nExtending the metaphor, Priem contrasted French and English garden styles:\nFrench gardens constrain nature into platonic geometric forms, while English\ngardens embrace a more rambling style that better represents nature's inherent\ndiversity. 
He argued that science needs to adopt the &quot;English garden&quot; approach\nand that we don't have an information overload problem but rather &quot;<a href=\"https://www.cnet.com/culture/shirky-problem-is-filter-failure-not-info-overload/\">bad\nfilters</a>&quot;\n(to quote Clay Shirky).</p>\n<p><img src=\"/images/rspub-11.webp\" alt=\"%c\" title=\"Jason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel\" ></p>\n<p>Priem advocated <em>strongly</em> for open infrastructures since communities don't just produce papers, but also software, datasets, abstracts, and things we don't envision yet. If we provide them with the &quot;digital soil&quot; (open infrastructure) then they will prosper. OpenAlex and <a href=\"https://zenodo.org/\">Zenodo</a> are great examples of how such open infrastructure holds up here. I use both all the time; I'm a huge fan of Jason's work and talk.</p>\n<p><a href=\"https://www.ukri.org/people/patrick-chinnery/\">Patrick Chinnery</a> from the Medical Research Council brought the funder perspective with some numbers: publishing consumes 1 to 2% of total research turnover funds (roughly £24 million for UKRI). He noted that during the pandemic, decision-makers were reviewing preprint data in real-time to determine which treatments should proceed to clinical trials, and decisions had to be reversed after peer review revealed flaws. He emphasised the need for more real time quality assurance in rapid decision-making contexts.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Adam_Tickell\">Adam Tickell</a> from the University of Birmingham declared the current model &quot;broken&quot;, and noted that each attempt at reform fails to solve the <em>basic problem of literature access</em> (something I've faced myself). He noted that David Willetts (former UK Minister for Science) couldn't access paywalled material while minister of science in government (!) 
which significantly influenced <a href=\"https://www.gov.uk/government/news/government-to-open-up-publicly-funded-research\">subsequent government policy</a> towards open access.\nTickell was scathing about the oligopolies of Elsevier and Springer, arguing their <a href=\"https://www.researchprofessionalnews.com/rr-news-world-2025-2-elsevier-parent-company-reports-10-rise-in-profit-to-3-2bn/\">profit margins</a> are out of proportion with the public funding for science. He noted that early open access attempts from the <a href=\"https://ioppublishing.org/news/spotlight-on-the-finch-report/\">Finch Report</a> were well-intentioned but ultimately insufficient to break the hegemony. Perhaps an opportunity for a future UK <a href=\"/notes/uk-national-data-lib\">National Data Library</a>...\nTickell closed his talk with an observation about the current crisis of confidence in science. This did make me think of a <a href=\"https://bsky.app/profile/hetanshah.bsky.social/post/3lttyexntps2y\">recent report on British confidence in science</a>, which shows the British public still retains belief in scientific institutions. So at least we're doing better than the US in this regard for now!</p>\n<p><a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7350547427319275520?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7350547427319275520%2C7350886618490130433%29&replyUrn=urn%3Ali%3Acomment%3A%28activity%3A7350547427319275520%2C7350908587134644225%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287350886618490130433%2Curn%3Ali%3Aactivity%3A7350547427319275520%29&dashReplyUrn=urn%3Ali%3Afsd_comment%3A%287350908587134644225%2Curn%3Ali%3Aactivity%3A7350547427319275520%29\"> <img src=\"/images/rspub-ss-1.webp\" alt=\"%c\" title=\"Stefanie Haustein points out ChatGPT-related content in response to Elsevier's comments on stage.\" > </a></p>\n<p>The Q&amp;A session opened with Mark Walport asking how Elsevier manages to publish so many articles. 
Victoria Eva from Elsevier responded that they receive 3.5m articles annually with ~750k published. Eva mentioned something about &quot;digital screening throughout the publication process&quot; but acknowledged that this was a challenge due to the surge from paper mills. A suggestion of paying peer reviewers was raised from the audience but not substantively addressed. <a href=\"https://www.scholcommlab.ca/stefanie-haustein/\">Stefanie Haustein</a> once again made a great point from the audience about how Elsevier could let through <a href=\"https://www.vice.com/en/article/scientific-journal-frontiers-publishes-ai-generated-rat-with-gigantic-penis-in-worrying-incident/\">AI generated rats with giant penises</a> with all this protection in place; clearly, some papers have been published by them with no humans ever reading them. This generated a laugh from the audience, and an acknowledgement from the Elsevier rep that they needed to invest more and improve.</p>\n<h2 id=\"how-to-make-open-infrastructure-sustainable\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-to-make-open-infrastructure-sustainable\"></a>How to make open infrastructure sustainable</h2>\n<p>My laptop power ran out at this point, but the next panel was an absolute treat as it had both <a href=\"https://kaythaney.com/\">Kaitlin Thaney</a> and <a href=\"https://en.wikipedia.org/wiki/Jimmy_Wales\">Jimmy Wales</a> of Wikipedia fame on it!</p>\n<p><img src=\"/images/rspub-12.webp\" alt=\"%c\" title=\"Hylke Koers, Kaitlin Thaney, Jimmy Wales and Ian Mulvany\" ></p>\n<p>Jimmy Wales noted that a key one of his &quot;seven rules of trust&quot; is to be personal with human-to-human contact and not run too quickly to technological solutions. Rather than, for example, asking what percentage of academic papers showed evidence of language from ChatGPT, it's more fruitful to ask whether the science contained within the paper is good instead of how it's written. 
There are many reasons why someone might have used ChatGPT (non-native speakers etc) but also many unrelated reasons why the science might be bad.</p>\n<p>Kaitlin Thaney pointed out the importance of openness given <a href=\"https://www.motherjones.com/politics/2025/07/trump-war-assault-national-science-foundation-american-innovation-greatness-education/\">the US assault on\nscience</a>,\nwhich means that the open data repositories can be replicated reasonably as well.</p>\n<p>Ian Mulvany pointed out that Nature claims to have invested $240m in research\ninfrastructure, and this is a struggle for a medium sized publisher (like his\nown <a href=\"https://www.bmj.com/\">BMJ</a>). Open infrastructure allows sharing and\ncreation of value to make it possible to let these smaller organisations\nsurvive.</p>\n<p>When it comes to policy recommendations, what did the panel have to say about a more trustworthy literature?</p>\n<ul>\n<li>The <a href=\"https://www.ccsd.cnrs.fr/en/posi-principles/\">POSI principles</a> came up as important levels.</li>\n<li>Kaitlin mentioned the <a href=\"https://www.nextgenlibpub.org/forest-framework\">FOREST framework</a> funded by Arcadia and how they need to manifest in concrete infrastructure. There's an implicit reliance on infrastructure that you only notice when it's taken away! Affordability of open is a key consideration as well.</li>\n<li>Jimmy talked about open source software, and what generally works is not one-size-fits-all. Some are run by companies (their main product and they sell services), and others by individuals.  If we bring this back to policy, we need to look at preserving what's already working sustainably and supporting it. Don't try to find a general solution but adopt targeted, well thought through interventions instead.</li>\n</ul>\n<p><em>I'm updating this as I go along but running out of laptop battery too!</em></p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. 
Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). Is AI poisoning the scientific literature? Our comment in Nature. <a href=\"https://doi.org/10.59350/pbxew-d2j78\" target=\"_blank\"><i>10.59350/pbxew-d2j78</i></a></li>\n<li>Madhavapeddy (2025). EEG internships for the summer of 2025. <a href=\"https://doi.org/10.59350/tf22g-p1822\" target=\"_blank\"><i>10.59350/tf22g-p1822</i></a></li>\n<li>Richter (1960). How Many More New Journals?. Nature. <a href=\"https://doi.org/10.1038/186018a0\" target=\"_blank\"><i>10.1038/186018a0</i></a></li>\n<li>Editors (1979). Uniform Requirements for Manuscripts Submitted to Biomedical Journals. Annals of Clinical Biochemistry. <a href=\"https://doi.org/10.1177/000456327901600179\" target=\"_blank\"><i>10.1177/000456327901600179</i></a></li>\n<li>Fyfe (2022). Self-help for learned journals: Scientific societies and the commerce of publishing in the 1950s. History of Science. <a href=\"https://doi.org/10.1177/0073275321999901\" target=\"_blank\"><i>10.1177/0073275321999901</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/rs-future-of-publishing",
      "title": "Royal Society's Future of Scientific Publishing meeting",
      "summary": "Live notes from Royal Society conference on scientific publishing challenges including peer review crisis, AI poisoning threats and open access economics.",
      "date_published": "2025-07-14T00:00:00.000000Z",
      "date_modified": "2025-07-14T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "royalsociety",
        "evidence",
        "publishing",
        "ai",
        "livenotes"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3744169.3744180",
          "doi": "10.1145/3744169.3744180",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/pbxew-d2j78",
          "doi": "10.59350/pbxew-d2j78",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/tf22g-p1822",
          "doi": "10.59350/tf22g-p1822",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/186018a0",
          "doi": "10.1038/186018a0",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1177/000456327901600179",
          "doi": "10.1177/000456327901600179",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1177/0073275321999901",
          "doi": "10.1177/0073275321999901",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/pbxew-d2j78",
      "content_html": "<p>For the past few years, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I been working with our colleagues in\n<a href=\"/projects/ce\">Conservation Evidence</a> to do <a href=\"/papers/2024-ce-llm\">analysis at scale</a> on the\nacademic literature. Getting local access to millions of fulltext papers has not\nbeen without drama, but made possible thanks to huge amounts of help from our\n<a href=\"https://www.lib.cam.ac.uk/\">University Library</a> who helped us navigate our\nrelationships with scientific publishers. We have just <strong><a href=\"https://rdcu.be/evkfj\">published a comment\nin Nature</a></strong> about the next phase\nof our research, where are looking into the impact of AI advances on evidence synthesis.</p>\n<p><a href=\"https://rdcu.be/evkfj\"> <img src=\"/images/davidparkins-ai-poison.webp\" alt=\"%c\" title=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\" > </a></p>\n<p>Our work on literature reviews led us into assessing methods for <a href=\"https://royalsociety.org/news-resources/projects/evidence-synthesis/\">evidence\nsynthesis</a>\n(which is crucial to rational policymaking!) and specifically about how recent advances in AI may\nimpact it.  The current methods for <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">rigorous systematic literature review</a> are expensive and slow, and authors are already struggling to keep up with the <a href=\"https://ourworldindata.org/grapher/scientific-and-technical-journal-articles?time=latest\">rapidly expanding</a>\nnumber of legitimate papers. 
Adding to this, <a href=\"https://retractionwatch.com/2025/\">paper retractions</a> are increasing near\n<a href=\"https://doi.org/10.1038/d41586-023-03974-8\">exponentially</a> and already\nsystematic reviews <a href=\"https://retractionwatch.com/the-retraction-watch-leaderboard/top-10-most-highly-cited-retracted-papers/\">unknowingly cite</a>\nretracted papers, with most remaining uncorrected even a year after notification!</p>\n<p>This is all made much more complex as LLMs are flooding the landscape with\nconvincing, fake manuscripts and doctored data, potentially overwhelming our\ncurrent ability to distinguish fact from fiction.  Just this March, the <a href=\"https://sakana.ai/ai-scientist/\">AI\nScientist</a> formulated hypotheses, designed and\nran experiments, analysed the results, generated the figures and produced a\nmanuscript that <a href=\"https://sakana.ai/ai-scientist-first-publication/\">passed human peer\nreview</a> for an ICLR\nworkshop! Distinguishing genuine papers from those produced by LLMs isn't just\na problem for review authors; it's a threat to the very foundation of\nscientific knowledge. And meanwhile, Google is taking a different tack with a\ncollaborative <a href=\"https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/\">AI co-scientist</a> who acts as a multi-agent assistant.</p>\n<p>So the landscape is moving <em>really</em> quickly! Our proposal for the future of\nliterature reviews builds on our desire to move towards a more regional,\nfederated network approach. 
Instead of having giant repositories of knowledge\nthat <a href=\"https://en.wikipedia.org/wiki/2025_United_States_government_online_resource_removals\">may be erased unilaterally</a>,\nwe're aiming for a more bilateral network of &quot;living evidence databases&quot;.\nEvery government, especially those in the Global South, should have the ability to build their\nown &quot;<a href=\"/notes/uk-national-data-lib\">national data libraries</a>&quot; which represent the body\nof digital data that affects their own regional needs.</p>\n<p>This system of living evidence databases can be incremental and dynamically\nupdated, and AI assistance can be used as long as humans remain in-the-loop.\nSuch a system can continuously gather, screen, and index literature,\nautomatically removing compromised studies and recalculating results.  We're\nworking on this on multiple fronts this year; ranging from the computer science\nto figure out the distributed-nitty-gritty <sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup>, over to working with the\n<a href=\"/notes/nas-rs-biodiversity\">GEOBON folk</a> on global biodiversity <a href=\"https://www.tunbury.org/2025/07/02/bon-in-a-box/\">data\nmanagement</a>, and continuing\nto drive the core LED design at Conservation Evidence. It feels like a</p>\n<p>Read our <a href=\"https://www.nature.com/articles/d41586-025-02069-w\">Nature Comment piece</a> (<a href=\"https://www.linkedin.com/posts/anilmadhavapeddy_will-ai-speed-up-literature-reviews-or-derail-activity-7348317711002705920-Y5UT?rcm=ACoAAAB0Kb0BNo1v6ylsGU2NtPa95mj-w1VcaJA\">comment on LI</a>) to learn more about how we think we can safeguard evidence synthesis against the rising tide of &quot;AI-poisoned literature&quot; and ensure the continued integrity of scientific discovery. 
As a random bit of trivia, the incredibly cool artwork in the piece was drawn by the legendary <a href=\"https://www.davidparkins.com/\">David Parkins</a>, who also drew <a href=\"https://www.beano.com/\">Beano</a> and <a href=\"https://en.wikipedia.org/wiki/Dennis_the_Menace_and_Gnasher\">Dennis the Menace</a>!</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>My instinct is that we'll end up with something <a href=\"https://arxiv.org/abs/2402.03239\">ATProto based</a> as it's so convenient for <a href=\"https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/\">distributed system authentication</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Reynolds et al (2025). Will AI speed up literature reviews or derail them entirely?. Nature Publishing Group. <a href=\"https://doi.org/10.1038/d41586-025-02069-w\" target=\"_blank\"><i>10.1038/d41586-025-02069-w</i></a></li>\n<li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li>\n<li>Noorden (2023). More than 10,000 research papers were retracted in 2023 — a new record. Nature. <a href=\"https://doi.org/10.1038/d41586-023-03974-8\" target=\"_blank\"><i>10.1038/d41586-023-03974-8</i></a></li>\n<li>Kleppmann et al (2024). Bluesky and the AT Protocol: Usable Decentralized Social Media. 
Proceedings of the ACM Conext-2024 Workshop on the Decentralization of the Internet. <a href=\"https://doi.org/10.1145/3694809.3700740\" target=\"_blank\"><i>10.1145/3694809.3700740</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ai-poisoning",
      "title": "Is AI poisoning the scientific literature? Our comment in Nature",
      "summary": "Nature comment on AI-generated paper threats to evidence synthesis proposing federated living evidence databases with human-in-loop review.",
      "date_published": "2025-07-08T00:00:00.000000Z",
      "date_modified": "2025-07-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "evidence",
        "llms",
        "ai",
        "federation",
        "networks"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-ai-poison.pdf",
          "mime_type": "application/pdf",
          "title": "Will AI speed up literature reviews or derail them entirely?"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1038/d41586-025-02069-w",
          "doi": "10.1038/d41586-025-02069-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/j6zkp-n7t82",
          "doi": "10.59350/j6zkp-n7t82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-023-03974-8",
          "doi": "10.1038/d41586-023-03974-8",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3694809.3700740",
          "doi": "10.1145/3694809.3700740",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/tf22g-p1822",
      "content_html": "<p>The exam marking is over, and a glorious Cambridge summer awaits! This year, we\nhave a sizeable cohort of undergraduate and graduate interns joining us from\nnext week.</p>\n<p>This note serves as a point of coordination to keep track of what's\ngoing on, and I'll update it as we get ourselves organised.\nIf you're an intern, then I highly recommend you take the time to carefully\nread through all of this, starting with <a href=\"#who-we-all-are-this-summer\">who we are</a>,\nsome <a href=\"#ground-rules\">ground rules</a>, <a href=\"#where-we-will-work\">where we will work</a>,\n<a href=\"#registering-on-chat-channels\">how we chat</a>, <a href=\"#how-you-will-get-paid\">how to get paid</a>, and of course <a href=\"#summer-social-activities\">social activities</a> to make sure we have some fun!</p>\n<h2 id=\"who-we-all-are-this-summer\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#who-we-all-are-this-summer\"></a>Who we all are this summer</h2>\n<p>We're working on quite the diversity of projects this summer, ranging from classic\ncomputer systems and programming problems all the way through to environmental\nscience. 
Here's a recap of what's going on.</p>\n<p>First, we're working against the <a href=\"/projects/ce\">evidence database</a> we've been building for the past couple of years:</p>\n<ul>\n<li><em>&quot;<a href=\"/ideas/ai-assisted-inclusion-criteria\">Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis</a>&quot;</em> with <a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a>, supervised by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a></li>\n<li><em>&quot;<a href=\"/ideas/accurate-summarisation-for-ce\">Accurate summarisation of threats for conservation evidence literature</a>&quot;</em> with <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a>, supervised by <a href=\"https://toao.com\">Sadiq Jaffer</a>, following up on her successful MPhil submission.</li>\n</ul>\n<p>We're then heading into <a href=\"/projects/rsn\">remote sensing</a> and working on some mapping projects:</p>\n<ul>\n<li><em>&quot;<a href=\"/ideas/cairngorms-connect-habitats\">Habitat mapping of the Cairngorms Connect restoration area</a>&quot;</em> with <a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a>, supervised by <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://eo.conservation.cam.ac.uk/people/aland-chan/\">Aland Chan</a></li>\n<li><em>&quot;<a href=\"/ideas/hedgehog-mapping\">Mapping urban and rural British hedgehogs</a>&quot;</em> with <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>, supervised by <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a>, as well as writing up his MPhil dissertation on <em>&quot;<a href=\"/ideas/walkability-for-osm\">Enhancing Navigation Algorithms with Semantic Embeddings</a>&quot;</em></li>\n<li><em>&quot;<a href=\"/ideas/validating-anti-poaching-predictions\">Validating predictions with ranger insights to enhance 
anti-poaching patrol strategies in protected areas</a>&quot;</em> with <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a>, supervised by <a href=\"https://charlesemogor.com\">Charles Emogor</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a></li>\n</ul>\n<p>Dropping down towards <a href=\"/projects/osmose\">embedded systems</a> and fun &quot;real-world&quot; projects, we have:</p>\n<ul>\n<li><em>&quot;<a href=\"/ideas/digitisation-of-insects\">Affordable digitisation of insect collections using photogrammetry</a>&quot;</em> with <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a>, <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a> and <a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, supervised by <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki%0A\">Tiffany Ki</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-edgar-turner\">Edgar Turner</a></li>\n<li><em>&quot;<a href=\"/ideas/3d-print-world\">3D printing the planet (or bits of it)</a>&quot;</em> with <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, supervised by <a href=\"https://mynameismwd.org\">Michael Dales</a></li>\n<li><em>&quot;<a href=\"/ideas/embedded-whisper\">Low power audio transcription with Whisper</a>&quot;</em> with <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a> and <em>&quot;<a href=\"/ideas/battery-free-riotee\">Battery-free wildlife monitoring with Riotee</a>&quot;</em> with <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a>, both supervised by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a></li>\n</ul>\n<p>Going back to classic computer science, we have a few programming language and systems projects:</p>\n<ul>\n<li><em>&quot;<a href=\"/ideas/hazel-to-ocaml-to-hazel\">Bidirectional Hazel to OCaml programming</a>&quot;</em> with <a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a>, supervised by <a 
href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a></li>\n<li><em>&quot;<a href=\"/ideas/effects-scheduling-ocaml-compiler\">Effects based scheduling for the OCaml compiler pipeline</a>&quot;</em> with <a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <em>&quot;<a href=\"/ideas/ocaml-bytecode-native-ffi\">Runtimes à la carte: crossloading native and bytecode OCaml</a>&quot;</em> with <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a>, both supervised by <a href=\"https://www.dra27.uk\">David Allsopp</a></li>\n<li><em>&quot;<a href=\"/ideas/zfs-filesystem-perf\">ZFS replication strategies with encryption</a>&quot;</em> with <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a>, supervised by <a href=\"https://www.tunbury.org/\">Mark Elvers</a></li>\n</ul>\n<h2 id=\"ground-rules\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ground-rules\"></a>Ground rules</h2>\n<p>Since there are so many of us this summer, it's imperative that you're all\n<strong>proactive about communicating</strong> any problems or clarifications you need. If something\nhere doesn't make sense, or you have a better idea, then just reach out to any\nof the supervisors or me directly!</p>\n<p>Do also take time to <strong>learn from each other</strong>. Read up not just on your own project in the\nlist above, but take some time to read the rest so that you have a sense of what everyone\nis working on. When you see each other, it'll be much easier to chat about what's going\non and find common ground.</p>\n<p>The projects above have been carefully selected to <strong>not be on the critical path</strong> for any\ndeadlines. If it's not going well from your perspective, then it's OK to take a step back\nand figure out why! 
We're here to learn and discover things, so take the time to do so.</p>\n<h2 id=\"where-we-will-work\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#where-we-will-work\"></a>Where we will work</h2>\n<p>This will be different for everyone, since it depends on which home department will house the project.\nSome of us will be in the David Attenborough Building, on the third floor where the <a href=\"https://www.conservation.cam.ac.uk\">CRI</a> is:</p>\n<ul>\n<li><a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a> and <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a> will be with the <a href=\"/projects/ce\">CE</a> crew near <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s office</li>\n<li><a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a> and <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a> will hang out with <a href=\"https://coomeslab.org\">David Coomes</a>'s group</li>\n<li><a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> can work near <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>'s office where <a href=\"https://charlesemogor.com\">Charles Emogor</a> works</li>\n</ul>\n<p>Those working in the Zoology Museum itself (<a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a> and <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a>) will have a health and safety induction on Monday with <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki%0A\">Tiffany Ki</a> and find offices there.</p>\n<p>The rest of us will be in the Computer Lab over in West Cambridge:</p>\n<ul>\n<li><a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a> will work out of FW15 with <a href=\"https://www.dra27.uk\">David Allsopp</a> and <a href=\"https://jon.recoil.org\">Jon Ludlam</a></li>\n<li><a 
href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>, <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a> and <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a> will be in FW15/14.  We may need to clear out one desk in FW15 to make room here (just put the stuff in my office in FW16). <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> will work out of my office (FW16) for the summer, and <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> is away for an internship in the USA.</li>\n<li>We'll find somewhere for <a href=\"https://maxcarroll0.github.io/blog/\">Max Carroll</a> either in West Cambridge or in Pembroke soon, depending on preferences and heat!</li>\n</ul>\n<p>It'll probably take a week to let this all shake out, so please do shout if you find yourself stuck in your room and without an office! You should of course arrange to meet your immediate supervisors regularly according to whatever schedule and location work for you.</p>\n<h2 id=\"how-you-will-get-paid\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-you-will-get-paid\"></a>How you will get paid</h2>\n<p>You get paid weekly via the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">Cambridge Casual Worker</a> system. This has a few important steps that you <strong>must</strong> pay attention to, or you will not get paid!</p>\n<ul>\n<li><strong>Before starting work</strong> you must go find <a href=\"https://www.cst.cam.ac.uk/people/ac733\">Alicja Zavros</a> in the Computer Lab with your passport or other proof of your right to work in the UK.  I've told Alicja that many of you will show up on the morning of Monday 30th June. It won't take more than a few minutes, as she'll take a photocopy of your ID. 
You should also have registered on the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">CCWS</a> and gotten a login.</li>\n<li><strong>Every Friday</strong> that you do some work, fill in a timesheet on the CCWS. Round this off to a full day (8 hours) and don't do fine-grained timekeeping; just the number of days you've worked is fine. If you don't fill in a timesheet promptly, you won't get paid.</li>\n<li><strong>You must keep a research log with weeknotes</strong> that record what you've been up to. The exact style of weeknotes is entirely up to you, but it's vital that you get in the habit of keeping a log. If you have your own homepage, then send me an <a href=\"https://en.wikipedia.org/wiki/Atom_(web_standard)\">Atom feed</a>. If you don't, then we have a <a href=\"https://github.com/ucam-eo/interns-2025\">github/ucam-eo/interns-2025</a> repository which I can give you write access to.  It's typical to store your weeknotes in Markdown format, and just a simple subdirectory with a date-based convention is fine. The primary use of weeknotes is to highlight things you've accomplished, areas where you are blocked, and interesting things you have run across. Try to make it a record for your future self, and also a way to let those around you know what's going on. While missing the occasional weeknote is just fine, missing them all will be a problem, so plan your time accordingly.  Weeknotes are also <em>not</em> a mechanism to assess anything to do with your progress, but a simple form of communication.</li>\n</ul>\n<h2 id=\"registering-on-chat-channels\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#registering-on-chat-channels\"></a>Registering on chat channels</h2>\n<p>Since we're all going to be spread around Cambridge physically, it's important to have a chat channel. 
<a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> is setting up a WhatsApp group for social things (see below), but we also use <a href=\"https://matrix.org\">Matrix</a> as our &quot;hackers choice&quot; for day-to-day messaging.</p>\n<p>We host a Computer Lab <a href=\"https://matrix.org\">Matrix</a> server on which anyone with a valid Raven account can create an account. Since Matrix is a decentralised chat system, it is also possible to use other accounts from third-party servers, and also to join channels elsewhere.</p>\n<p>To create an account:</p>\n<ul>\n<li>In your Matrix client (we most commonly use <a href=\"https://element.io\">Element</a>), select <code>eeg.cl.cam.ac.uk</code> as your homeserver.</li>\n<li>Login with SSO (Single Sign On)</li>\n<li>You should see a Cambridge authentication screen for your CRSID.</li>\n</ul>\n<p>Once you create your account, you will be in the &quot;EEG&quot; Matrix space.  A <a href=\"https://matrix.org/blog/2021/05/17/the-matrix-space-beta/\">Matrix space</a> is a collection of channels, and you should join &quot;EEGeneral&quot; as the overall channel for the group. We'll create a separate room just for intern chats. We also have a bot in the room that posts our blogs to the channel, so you can keep up with what the group members are all chattering about. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> runs the CL matrix server, and there are occasional quirks, so just let us know if you run into any problems.  I am <code>@avsm:recoil.org</code> on there, not <code>avsm2</code> as I use my personal Matrix for a bunch of stuff.</p>\n<h2 id=\"summer-social-activities\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#summer-social-activities\"></a>Summer social activities</h2>\n<p>It's important to get some downtime this summer and recharge. 
<a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> has been setting up a social group for the interns to hang out together, and we'll organise a punting excursion at some point to get us out to the river.  Of course, many of us will be travelling this summer (I'm heading off to Botswana in late July for instance), so please do also make suggestions.</p>",
      "url": "https://anil.recoil.org/notes/eeg-interns-2025",
      "title": "EEG internships for the summer of 2025",
      "summary": "Coordination note for summer 2025 undergraduate and graduate internships covering projects from evidence databases to remote sensing and embedded systems.",
      "date_published": "2025-06-28T00:00:00.000000Z",
      "date_modified": "2025-06-28T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "urop"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/53zjq-ft509",
      "content_html": "<p>The <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass\">BIOMASS</a> forest mission satellite was <a href=\"https://www.bbc.co.uk/newsround/articles/c0jzy3g0zx2o\">successfully</a> boosted into space a couple of days ago, after decades of development from just down the road in <a href=\"https://www.gov.uk/government/news/british-built-satellite-to-map-earths-forests-in-3d-for-the-first-time\">Stevenage</a>. I'm excited by this because it's the first global-scale <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">P-band SAR</a> instrument that can penetrate forest canopys to look underneath. This, when combined with <a href=\"/papers/2024-hyper-tropical-mapping\">hyperspectral mapping</a> will give us a lot more <a href=\"/projects/rsn\">insight</a> into global tree health.</p>\n<p>Weirdly, the whole thing almost never happened because permission to use the <a href=\"https://ieeexplore.ieee.org/document/9048581\">P-band</a> was blocked because it might <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">interfere with US nuclear missile warning radars</a> back in 2013.</p>\n<blockquote>\n<p>Meeting in Graz, Austria, to select the the 7th Earth Explorer mission to be flown by the 20-nation European Space Agency (ESA), backers of the Biomass mission were pelted with questions about how badly the U.S. 
network of missile warning and space-tracking radars in North America, Greenland and Europe would undermine Biomass’ global carbon-monitoring objectives.</p>\n<p>Europe's Earth observation satellite system may be the world's most dynamic, but as it pushes its operating envelope into new areas, it is learning a lesson long ago taught to satellite telecommunications operators: Radio frequency is scarce, and once users have a piece of it they hold fast.\n<cite>-- <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">Spacenews</a> (2013)</cite></p>\n</blockquote>\n<p>Luckily, all this got sorted by international frequency negotiators, and after\n<a href=\"https://www.thecomet.net/news/25125302.satellite-built-stevenage-airbus-launches-space/\">being built by Airbus in Stevenage</a>\n(and Germany and France, as it's a complex instrument!) it took off without a hitch. Looking forward to getting my hands on the first results later in the year over at the <a href=\"https://eo.conservation.cam.ac.uk\">Centre for Earth Observation</a>.</p>\n<p>Check out this cool <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">ESA video</a> about the instrument to learn more, and congratulations to the team at ESA. 
Looking forward to the next <a href=\"/notes/biospace-25\">BIOSPACE</a> where there will no doubt be initial buzz about this.</p>\n<p><div class=\"video-center\"><iframe title=\"BIOMASS p-band mirror\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/c3981e1f-3f2d-439a-924d-6d29de33cfe4\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p><em>Update 28th June 2025:</em> See also this <a href=\"https://www.bbc.co.uk/news/resources/idt-d7353b50-0fea-46ba-8495-ae9e25192cfe\">beautiful BBC article</a> about the satellite, via <a href=\"https://coomeslab.org\">David Coomes</a>.</p><h1>References</h1><ul><li>Madhavapeddy (2025). ESA's first BioSpace conference seems a huge success. <a href=\"https://doi.org/10.59350/vd6af-4bc83\" target=\"_blank\"><i>10.59350/vd6af-4bc83</i></a></li>\n<li>Ball et al (2024). Harnessing temporal & spectral dimensionality to identify individual trees in tropical forests. bioRxiv. <a href=\"https://doi.org/10.1101/2024.06.24.600405\" target=\"_blank\"><i>10.1101/2024.06.24.600405</i></a></li>\n<li>Li et al (2019). The P-band SAR Satellite: Opportunities and Challenges. 2019 6th Asia-Pacific Conference on Synthetic Aperture Radar (APSAR). <a href=\"https://doi.org/10.1109/APSAR46974.2019.9048581\" target=\"_blank\"><i>10.1109/APSAR46974.2019.9048581</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/biomass-launches",
      "title": "BIOMASS launches to measure forest carbon flux from space",
      "summary": "ESA's BIOMASS satellite successfully launches, featuring first global P-band SAR instrument capable of penetrating forest canopy to measure tree health and carbon flux.",
      "date_published": "2025-05-01T00:00:00.000000Z",
      "date_modified": "2025-06-28T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "sensing",
        "space",
        "satellite",
        "forests",
        "biodiversity",
        "carbon"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/vd6af-4bc83",
          "doi": "10.59350/vd6af-4bc83",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1101/2024.06.24.600405",
          "doi": "10.1101/2024.06.24.600405",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1109/APSAR46974.2019.9048581",
          "doi": "10.1109/APSAR46974.2019.9048581",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2025-internet-ecology-1",
      "content_html": "<p>Every ten years, the city of <a href=\"https://www.visitdenmark.com/denmark/destinations/jutland/aarhus\">Aarhus</a> throws a giant conference to discuss new agendas for critical action and theory in computing. Back in 2016, <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> and I posited the idea of <a href=\"/papers/2015-aarhus-databox\">personal data stores</a>, a topic that is just now becoming hot due to agentic AI. Well, time flies, and I'm pleased to report that our <em>second</em> dicennial thought experiment on <strong>&quot;<a href=\"/papers/2025-internet-ecology\">Steps towards an Ecology for the Internet</a>&quot;</strong> will appear at the 2025 edition of Aarhus this August!</p>\n<p>This time around, we projected our imaginations forward a decade to imagine an optimistic future for the Internet, when it has <a href=\"https://archive.org/details/trillionsthrivin0000luca\">exceeded a trillion nodes</a>.  After deciding in the <a href=\"https://www.themillpubcambridge.com/\">pub</a> that this many nodes was too many for us to handle, we turned to our newfound buddies in <a href=\"\">##conservation</a> to get inspiration from nature. 
We asked <a href=\"https://samreynolds.org\">Sam Reynolds</a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> first-year undergraduate questions about how natural ecosystems operate across <em>all</em> levels of scale: from DNA through to cells through to whole populations.\nWe spent hours discussing the strange correspondences between the seeming chaos in the low-level interactions between cells and the extraordinary emergent discipline through which biological development typically takes place.</p>\n<p>Then, going back to the computer scientists in our group and more widely (like <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> who I ran into at <a href=\"https://www.mcgill.ca/bellairs/\">Bellairs</a>), it turns out that this fosters some really wild ideas for how the Internet itself could evolve in the future. We could adopt biological process models within the heart of the <a href=\"https://en.wikipedia.org/wiki/End-to-end_principle\">end-to-end principle</a> that has driven the Internet architecture for decades!</p>\n<p><a href=\"https://anil.recoil.org/papers/2025-internet-ecology.pdf\"> <img src=\"/images/ecology-ss-1.webp\" alt=\"%c\" title=\"Correspondences between biological scales and Internet concepts\" > </a></p>\n<p>The ideas we sketch out in the paper were indeed quite crazy a year ago. We float the idea of &quot;antibotty&quot; networks that adopt the same command-and-control software as malicious botnets, except to patch our local community's networks and proactively protect them from roving viruses. 
We discuss how we might automatically modify the source code of our open source software stacks to introduce diversity into operating system kernels that have become homogeneous due to the <a href=\"https://en.wikipedia.org/wiki/Usage_share_of_operating_systems\">runaway success of Linux</a>. We wonder if mutualism is the only eventual steady state of Internet communities, with the existing <a href=\"https://en.wikipedia.org/wiki/Surveillance_capitalism\">extractive surveillance economy</a> just a blip that is collapsing under the weight of its own unfitness.</p>\n<p>While these were fantastical ideas, we are already seeing the techniques to make them real (and more). <a href=\"/notes/claude-copilot-sandbox\">AI-driven agentic programming</a> is now a regular part of my toolbox, and sophisticated code sandboxing like <a href=\"https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/\">CHERI</a>, <a href=\"https://kcsrk.info/papers/fides_jul25.pdf\">FIDES</a> and <a href=\"/papers/2023-raid-deluminator\">Deluminator</a> can enforce fine-grained containment throughout the software stack, with techniques like <a href=\"/projects/unikernels\">unikernels</a> driving whole-stack changes. And when it comes to imagining a future of Internet <em>community</em>, others like Maria Farrell and Robin Berjon have written wonderfully inspiring pieces on the need to rewild the Internet that are starting a chorus about the need for change and self-hosting.</p>\n<p>Nature is unstoppable in its ferocity and desire to eat the unfit, and the combination of artificial and natural life may be just what we need to bring the Internet back to mutualistic stability over the coming decade. Your own thoughts and reactions <a href=\"/papers/2025-internet-ecology\">on our paper</a> are extremely welcome ahead of Aarhus in August!  I am sweating at the thought of presenting this work in a coherent way as the paper reads as wildly as you might imagine. 
I am, however, excited and grateful to have the chance to think about the problem in a more interdisciplinary and longer-term way, and confident that this is just the start of a line of thinking that may mutate into something special.</p>\n<p><img src=\"/images/ecology-2016.webp\" alt=\"%c\" title=\"Two of the authors in 2016 at the Mirage retreat in Morocco\" >\n<img src=\"/images/ecology-2026.webp\" alt=\"%c\" title=\"Two of the same authors in 2026. Jon Crowcroft is not included for comparison as he has annoyingly aged in reverse.\" ></p><h1>References</h1><ul><li>Madhavapeddy et al (2025). Steps towards an Ecology for the Internet. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3744169.3744180\" target=\"_blank\"><i>10.1145/3744169.3744180</i></a></li>\n<li>Madhavapeddy (2025). Oh my Claude, we need agentic copilot sandboxing right now. <a href=\"https://doi.org/10.59350/aecmt-k3h39\" target=\"_blank\"><i>10.59350/aecmt-k3h39</i></a></li>\n<li>Chaudhry et al (2015). Personal Data: Thinking Inside the Box. <a href=\"https://doi.org/10.7146/aahcc.v1i1.21312\" target=\"_blank\"><i>10.7146/aahcc.v1i1.21312</i></a></li>\n<li>Tarkhani et al (2023). Information Flow Tracking for Heterogeneous Compartmentalized Software. ACM. <a href=\"https://doi.org/10.1145/3607199.3607235\" target=\"_blank\"><i>10.1145/3607199.3607235</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2025-internet-ecology-1",
      "title": "Steps towards an ecology of the Internet",
      "summary": "Paper exploring biological ecosystem models as inspiration for Internet architecture evolution towards trillion-node scale at Aarhus 2025.",
      "date_published": "2025-06-25T00:00:00.000000Z",
      "date_modified": "2025-06-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ecology",
        "internet",
        "biodiversity",
        "opensource",
        "community"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-internet-ecology.pdf",
          "mime_type": "application/pdf",
          "title": "Steps towards an Ecology for the Internet"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3744169.3744180",
          "doi": "10.1145/3744169.3744180",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/aecmt-k3h39",
          "doi": "10.59350/aecmt-k3h39",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.7146/aahcc.v1i1.21312",
          "doi": "10.7146/aahcc.v1i1.21312",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3607199.3607235",
          "doi": "10.1145/3607199.3607235",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/70ynk-ves20",
      "content_html": "<p>Apple made a notable <a href=\"https://developer.apple.com/videos/play/wwdc2025/346/\">announcement</a> in <a href=\"https://developer.apple.com/wwdc25/\">WWDC 2025</a> that they've got a new containerisation framework in the new Tahoe beta. This took me right back to the early <a href=\"https://docs.docker.com/desktop/setup/install/mac-install/\">Docker for Mac</a> days in 2016 when we <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">announced</a> the first mainstream use of the <a href=\"https://developer.apple.com/documentation/hypervisor\">hypervisor framework</a>, so I couldn't resist taking a quick peek under the hood.</p>\n<p>There were two separate things announced: a <a href=\"https://github.com/apple/containerization\">Containerization framework</a> and also a <a href=\"https://github.com/apple/container\">container</a> CLI tool that aims to be an <a href=\"https://opencontainers.org/\">OCI</a> compliant tool to manipulate and execute container images. The former is a general-purpose framework that could be used by Docker, but it wasn't clear to me where the new CLI tool fits in among the existing layers of <a href=\"https://github.com/opencontainers/runc\">runc</a>, <a href=\"https://containerd.io/\">containerd</a> and of course Docker itself. The only way to find out is to take the new release for a spin, since Apple open-sourced everything (well done!).</p>\n<h2 id=\"getting-up-and-running\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#getting-up-and-running\"></a>Getting up and running</h2>\n<p>To get the full experience, I chose to install the <a href=\"https://www.apple.com/uk/newsroom/2025/06/macos-tahoe-26-makes-the-mac-more-capable-productive-and-intelligent-than-ever/\">macOS Tahoe beta</a>, as there have been improvements to the networking frameworks<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> that are only present in the new beta. 
It's essential you only use the <a href=\"https://developer.apple.com/news/releases/?id=06092025g\">Xcode 26 beta</a> as otherwise you'll get Swift link errors against vmnet. I had to force my installation to use the right toolchain via:</p>\n<pre><code>sudo xcode-select --switch /Applications/Xcode-beta.app/Contents/Developer\n</code></pre>\n<p>Once that was done, it was simple to clone and install the <a href=\"https://github.com/apple/container\">container\nrepo</a> with a <code>make install</code>. The first\nthing I noticed is that everything is written in Swift with no Go in sight.\nThey still use Protobuf for communication among the daemons, as most of the\nwider Docker ecosystem does.</p>\n<p><img src=\"/images/macos-ss-1.webp\" alt=\"%c\" title=\"I have mixed feelings about the new glass UI in macOS Tahoe. The tabs in the terminal are so low contrast they're impossible to distinguish!\" ></p>\n<h2 id=\"starting-our-first-apple-container\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#starting-our-first-apple-container\"></a>Starting our first Apple container</h2>\n<p>Let's start our daemon up and take the <code>container</code> CLI for a spin.</p>\n<pre><code class=\"language-sh\">$ container system start\nVerifying apiserver is running...\nInstalling base container filesystem...\nNo default kernel configured.\nInstall the recommended default kernel from [https://github.com/kata-containers/kata-containers/releases/download/3.17.0/kata-static-3.17.0-arm64.tar.xz]? [Y/n]: y\nInstalling kernel... \n⠙ [1/2] Downloading kernel 33% (93.4/277.1 MB, 14.2 MB/s) [5s]\n</code></pre>\n<p>The first thing we notice is it downloading a full Linux kernel from the <a href=\"https://github.com/kata-containers/kata-containers\">Kata Containers</a> project. This system spins up a VM per container in order to provide more isolation. 
Although I haven't tracked Kata closely since its <a href=\"https://techcrunch.com/2017/12/05/intel-and-hyper-partner-with-the-openstack-foundation-to-launch-the-kata-containers-project/\">launch</a> in 2017, I did notice it being used to containerise <a href=\"https://confidentialcomputing.io/\">confidential computing enclaves</a> while <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> and I were working on <a href=\"/projects/difc-tee\">TEE programming models</a> a few years ago.</p>\n<p>The use of Kata tells us that <code>container</code> spins up a new kernel using the\nmacOS <a href=\"https://developer.apple.com/documentation/virtualization\">Virtualization framework</a> every time a new container is started. This\nis ok for production use (where extra isolation may be appropriate in a\nmultitenant cloud environment) but very memory inefficient for development\n(where it's usual to spin up 4-5 VMs for a development environment with a\ndatabase etc). In contrast, Docker for Mac <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">uses</a> a single Linux kernel and runs\nthe containers within that instead.</p>\n<p>It's not quite clear to me why Apple chose the extra overheads of a\nVM-per-container, but I suspect this might be something to do with running code securely\ninside the <a href=\"https://support.apple.com/en-gb/guide/security/sec59b0b31ff/web\">many hardware enclaves</a>\npresent in modern Apple hardware, a usecase that is on the rise with <a href=\"https://www.apple.com/uk/apple-intelligence/\">Apple\nIntelligence</a>.</p>\n<h2 id=\"peeking-under-the-hood-of-the-swift-code\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#peeking-under-the-hood-of-the-swift-code\"></a>Peeking under the hood of the Swift code</h2>\n<p>Once the container daemon is running, we can spin up our first container using Alpine, which uses the familiar Docker-style <code>run</code>:</p>\n<pre><code class=\"language-sh\">$ time 
container run alpine uname -a \nLinux 3c555c19-b235-4956-bed8-27bcede642a6 6.12.28 #1 SMP\nTue May 20 15:19:05 UTC 2025 aarch64 Linux\n0.04s user 0.01s system 6% cpu 0.733 total\n</code></pre>\n<p>The container spinup time is noticeable, but still less than a second and pretty acceptable for day-to-day use. This is possible thanks to a custom userspace they implement via a Swift init process that's run by the Linux kernel as the <em>sole</em> binary in the filesystem, and that provides an RPC interface to manage other services. The <a href=\"https://github.com/apple/containerization/tree/main/vminitd/Sources/vminitd\">vminitd</a> is built using the Swift static Linux SDK, which links <a href=\"https://musl.libc.org/\">musl libc</a> under the hood (the same one used by <a href=\"https://www.alpinelinux.org/\">Alpine Linux</a>).</p>\n<p>We can see the processes running by using <a href=\"https://man7.org/linux/man-pages/man1/pstree.1.html\">pstree</a>:</p>\n<pre><code>|- 29203 avsm /System/Library/Frameworks/Virtualization.framework/\n   Versions/A/XPCServices/com.apple.Virtualization.VirtualMachine.xpc/\n   Contents/MacOS/com.apple.Virtualization.VirtualMachine\n|- 29202 avsm &lt;..&gt;/plugins/container-runtime-linux/\n   bin/container-runtime-linux\n   --root &lt;..&gt;/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n   --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm &lt;..&gt;/bin/container-network-vmnet\n   start --id default\n   --service-identifier &lt;..&gt;network.container-network-vmnet.default\n|- 28899 avsm &lt;..&gt;/bin/container-core-images start\n|- 29202 avsm &lt;..&gt;/bin/container-runtime-linux\n   --root &lt;..&gt;/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n   --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm &lt;..&gt;/container-network-vmnet start --id default\n   --service-identifier &lt;..&gt;network.container-network-vmnet.default\n</code></pre>\n<p>You can start to see the overheads of a VM-per-container now, as each 
container\nneeds the host process infrastructure to not only run the computation, but also to\nfeed it with networking and storage IO (which have to be translated from the\nhost).  Still, it's a drop in the ocean for macOS these days, as I'm running 850\nprocesses in the background on my Macbook Air from an otherwise fresh\ninstallation! This isn't the lean, fast MacOS X Cheetah I used on my G4 Powerbook anymore,\nsadly.</p>\n<h3 id=\"finding-the-userspace-ext4-in-swift\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#finding-the-userspace-ext4-in-swift\"></a>Finding the userspace ext4 in Swift</h3>\n<p>I then tried to run a more interesting container for my local dev environment:\nthe <a href=\"https://hub.docker.com/r/ocaml/opam\">ocaml/opam</a> Docker images that we use\nin OCaml development.  This showed up an interesting new twist in the Apple\nrewrite: they have an entire <a href=\"https://en.wikipedia.org/wiki/Ext4\">ext4</a> filesystem <a href=\"https://github.com/apple/containerization/tree/main/Sources/ContainerizationEXT4\">implementation written in\nSwift</a>!\nThis is used to extract the OCI images from the Docker registry and then\nconstruct a new filesystem.</p>\n<pre><code class=\"language-sh\">$ container run ocaml/opam opam list\n⠦ [2/6] Unpacking image for platform linux/arm64 (112,924 entries, 415.9 MB, Zero KB/s) [9m 22s] \n⠹ [2/6] Unpacking image for platform linux/arm64 (112,972 entries, 415.9 MB, Zero KB/s) [9m 23s] \n⠇ [2/6] Unpacking image for platform linux/arm64 (113,012 entries, 415.9 MB, Zero KB/s) [9m 23s] \n⠼ [2/6] Unpacking image for platform linux/arm64 (113,059 entries, 415.9 MB, Zero KB/s) [9m 23s] \n⠋ [2/6] Unpacking image for platform linux/arm64 (113,104 entries, 415.9 MB, Zero KB/s) [9m 24s] \n# Packages matching: installed                                                                      \n# Name                # Installed # Synopsis\nbase-bigarray         base\nbase-domains          base\nbase-effects          
base\nbase-threads          base\nbase-unix             base\nocaml                 5.3.0       The OCaml compiler (virtual package)\nocaml-base-compiler   5.3.0       pinned to version 5.3.0\nocaml-compiler        5.3.0       Official release of OCaml 5.3.0\nocaml-config          3           OCaml Switch Configuration\nopam-depext           1.2.3       Install OS distribution packages\n</code></pre>\n<p>The only hitch here is how slow this process is. The OCaml images do have a lot of individual\nfiles within the layers (not unusual for a package manager), but I was surprised that this took\n10 minutes on my modern M4 Macbook Air, versus a few seconds on Docker for Mac.  I <a href=\"https://github.com/apple/container/issues/136\">filed a bug</a> upstream to investigate further since (as with any new implementation) there are many <a href=\"/papers/2015-sosp-sibylfs\">edge cases</a> when handling filesystems in userspace, and the Apple code seems to have <a href=\"https://github.com/apple/container/issues/134\">other limitations</a> as well.  I'm sure this will all shake out as the framework gets more users, but it's worth bearing in mind if you're thinking of using it in the near term in a product.</p>\n<h2 id=\"whats-conspicuously-missing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#whats-conspicuously-missing\"></a>What's conspicuously missing?</h2>\n<p>I was super excited when this announcement first happened, since I thought it might be the beginning of a few features I've needed for years and years. But they're missing...</p>\n<h3 id=\"running-macos-containers-nope\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#running-macos-containers-nope\"></a>Running macOS containers: nope</h3>\n<p>In OCaml-land, we have gone to ridiculous lengths to be able to run macOS CI on our own infrastructure. 
<a href=\"https://patrick.sirref.org\">Patrick Ferris</a> first wrote a <a href=\"https://tarides.com/blog/2023-08-02-obuilder-on-macos/\">custom snapshotting builder</a> using undocumented interfaces like userlevel sandboxing, subsequently taken over and maintained by <a href=\"https://www.tunbury.org/\">Mark Elvers</a>. This is a tremendous amount of work to maintain, but the alternative is to depend on very expensive hosted services to spin up individual macOS VMs which are slow and energy hungry.</p>\n<p>What we <em>really</em> need are macOS containers! We have dozens of mechanisms to run Linux ones already, and only a few <a href=\"https://github.com/dockur/macos\">heavyweight alternatives</a> to run macOS itself within macOS. However, the VM-per-container mechanism chosen by Apple might be the gateway to supporting macOS itself in the future. I will be first in line to test this if it happens!</p>\n<h3 id=\"running-ios-containers-nope\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#running-ios-containers-nope\"></a>Running iOS containers: nope</h3>\n<p>Waaaay back when we were <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">first writing</a> Docker for Mac, there were no mainstream users of the Apple Hypervisor framework at all (that's why we built and released <a href=\"https://github.com/moby/hyperkit\">HyperKit</a>). The main benefit we hoped to derive from using Apple-blessed frameworks is that they would make our app App-Store friendly for distribution via those channels.</p>\n<p>But while there do exist <a href=\"https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.security.hypervisor\">entitlements</a> to support virtualisation on macOS, there is <em>no</em> support for iOS or iPadOS to this day! 
All of the trouble to sign binaries and deal with entitlements and opaque Apple tooling only gets an app onto the Mac App Store, which is a little bit of a graveyard compared to the iOS ecosystem.\nThis thus remains on my wishlist for Apple: the hardware on modern iPad devices <em>easily</em> supports virtualisation, but Apple is choosing to cripple the development experience on these devices by not allowing the hypervisor, virtualisation and container frameworks to run on them.</p>\n<h3 id=\"running-linux-containers-yeah-but-no-gpu\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#running-linux-containers-yeah-but-no-gpu\"></a>Running Linux containers: yeah but no GPU</h3>\n<p>One reason to run Linux containers on macOS is to handle machine learning workloads. Actually getting this to be performant is tricky, since macOS has its own custom <a href=\"https://github.com/ml-explore/mlx\">MLX-based</a> approach to handling tensor computations. Meanwhile, the rest of the world mostly uses NVIDIA or AMD interfaces for those GPUs, which is reflected in the container images that are distributed.</p>\n<p>There is some chatter on the <a href=\"https://github.com/apple/container/discussions/62#discussioncomment-13414483\">apple/container GitHub</a> about getting GPU passthrough working, but I'm still unclear on how to get a more portable GPU ABI. The reason Linux containers work so well is that the Linux kernel provides a very stable ABI, but this breaks down badly with GPUs.</p>\n<h1 id=\"does-this-threaten-dockers-dominance\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#does-this-threaten-dockers-dominance\"></a>Does this threaten Docker's dominance?</h1>\n<p>I have mixed feelings about the Containerization framework release. On one hand, it's always fun to see more systems code in a new language like Swift, and this is an elegant and clean reimplementation of classic containerisation techniques in macOS. 
But the release <strong>fails to unlock any real new end-user capabilities</strong>, such as running a decent development environment on my iPad without using cloud services. Come on Apple, you can make that happen; you're getting ever closer every release!</p>\n<p>I don't believe that Docker or Orbstack are too threatened by this release at this stage either, despite some reports that <a href=\"https://appleinsider.com/articles/25/06/09/sorry-docker-macos-26-adds-native-support-for-linux-containers\">they're being Sherlocked</a>. The Apple container CLI is quite low-level, and there's a ton of quality-of-life features in the full Docker for Mac app that'll keep me using it, and there seems to be no real blocker to Docker adopting the Containerization framework as one of its optional backends. I prefer having a single VM for my devcontainers to keep my laptop battery life going, so I think Docker's current approach is better for that usecase.</p>\n<p>Apple has been a very good egg here by open sourcing all their code, so I believe this will overall help the Linux container ecosystem by adding choice to how we deploy software containers. Well done to <a href=\"https://github.com/crosbymichael\">Michael Crosby</a>, <a href=\"https://github.com/mavenugo\">Madhu Venugopal</a> and many of my other former colleagues who are all merrily hacking away on this!  As an aside, I'm also just revising a couple of papers about the history of using OCaml in several Docker components, and a retrospective look back at the hypervisor architecture backing Docker for Desktop, which will appear in print in the next couple of months (I'll update this post when they appear). 
But for now, back to my day job of marking undergraduate exam scripts...</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>vmnet is a networking framework for VMs/containers that I had to <a href=\"https://github.com/mirage/ocaml-vmnet\">reverse engineer</a> back in 2014 to use with OCaml/MirageOS.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Ridge et al (2015). SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems. ACM. <a href=\"https://doi.org/10.1145/2815400.2815411\" target=\"_blank\"><i>10.1145/2815400.2815411</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/apple-containerisation",
      "title": "Under the hood with Apple's new Containerization framework",
      "summary": "Technical deep dive into Apple's new macOS Tahoe containerization framework using Kata Containers and Swift-based implementation.",
      "date_published": "2025-06-11T00:00:00.000000Z",
      "date_modified": "2025-06-11T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "docker",
        "containers",
        "systems",
        "networking",
        "macos"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2815400.2815411",
          "doi": "10.1145/2815400.2815411",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/7cpwj-d4161",
      "content_html": "<p>I stayed on for a few days extra in Washington DC after the <a href=\"/notes/nas-rs-biodiversity\">biodiversity extravaganza</a> to attend a workshop at legendary <a href=\"https://www.nationalgeographic.org/society/visit-base-camp/\">National Geographic Basecamp</a>. While I've been to several NatGeo <a href=\"https://www.nationalgeographic.org/society/national-geographic-explorers/\">Explorers</a> meetups in California, I've never had the chance to visit their HQ. The purpose of this was to attend a workshop organised by <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz</a> from St Andrews about the &quot;Urban Exploration Project&quot;:</p>\n<blockquote>\n<p>[The UEP is a...] global-scale, community-driven initiative will collaboratively track animals across gradients of urbanization worldwide, to produce a holistic understanding of animal behaviour in human-modified landscapes that can, in turn, be used to develop evidence-based approaches to achieving sustainable human-wildlife coexistence.\n<cite>-- <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz's homepage</a></cite></p>\n</blockquote>\n<p>This immediately grabbed my interest, since it's a very different angle of biodiversity measurements to my usual. I've so far been mainly involved in efforts that use <a href=\"/projects/rsn\">remote sensing</a> or expert <a href=\"/projects/life\">range maps</a>, but the UEP program is more concerned with the dynamic <em>movements</em> of species. Wildlife movements are extremely relevant to conservation efforts since there is a large tension between human/wildlife coexistence in areas where both communities are under spatial pressure. 
<a href=\"https://ratsakatika.com/\">Tom Ratsakatika</a> for example did his <a href=\"https://ai4er-cdt.esc.cam.ac.uk/\">AI4ER</a> <a href=\"https://github.com/ratsakatika/camera-traps\">project</a> on the tensions in the <a href=\"https://www.endangeredlandscapes.org/news/advancing-human-wildlife-coexistence-in-the-carpathian-mountains/\">Romanian Carpathian mountains</a>, and <a href=\"https://www.ifaw.org/journal/human-elephant-conflict-major-threat\">elephant/human conflicts</a> and <a href=\"https://www.bbc.co.uk/news/articles/cx2j43e2j5ro\">tiger/human conflicts</a> are also well known.</p>\n<p>The core challenge posed at the workshop was how to build momentum for the UEP's vision of fostering human–wildlife coexistence in the world's <em>unprotected</em> areas (often, this is areas near urban expansion zones like cities).  The UEP idea sprang from Christian's earlier efforts after the pandemic on the <a href=\"https://bio-logging.net/wg/covid19-biologging/\">COVID-19 Bio-Logging</a> that built up a database of over 1 billion satellite fixes for ~13,000 tagged animals across ~200 species. 
The lead student on that <a href=\"https://www.nature.com/articles/s41559-023-02125-6\">work</a>, <a href=\"https://diegoellissoto.org/\">Diego Ellis Soto</a> has since graduated and was also at the UEP workshop sitting beside me!</p>\n<p><img src=\"/images/ngs-2.webp\" alt=\"%c\" title=\"NatGeo Chief Scientist Ian Miller kicks off proceedings\" ></p>\n<p>The workshop itself wasn't fully public (not because it's secret, but just because the details are still being iterated on), so here are some high-level takeaways from my conversations there...</p>\n<h2 id=\"movebank-for-gps-tracking\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#movebank-for-gps-tracking\"></a>Movebank for GPS tracking</h2>\n<p>I've used <a href=\"https://inaturalist.org\">iNaturalist</a> and <a href=\"https://www.openstreetmap.org/\">OpenStreetMap</a> extensively for wildlife occurrence and urban data, but I'm less familiar with how animal movement data is recorded. <a href=\"https://www.ab.mpg.de/person/98226\">Martin Wikelski</a> was at the workshop and explained the <a href=\"https://www.humboldt-foundation.de/en/entdecken/magazin-humboldt-kosmos/humboldt-today-the-secret-of-an-eternal-idol/the-high-flyer\">ICARUS</a> project to me, which collected data fitted to animals via GPS transmitters. This is then fed into the <a href=\"https://www.movebank.org/cms/movebank-main\">MoveBank</a> service that is custom-designed for movement data.</p>\n<p>Unlike most other biodiversity data services though, MoveBank data is not immediately made public (due to the sensitivity of animal movements), but is licensed to the user that made it. For that reason, it's less of a &quot;social&quot; service than iNaturalist, but still has a staggering <a href=\"https://www.movebank.org/cms/movebank-content/february-2024-newsletter\">11 million records added every day</a>.  
This data is then <a href=\"https://www.movebank.org/cms/movebank-content/archiving-animal-movements-as-biodiversity-2023-01-04\">fed into GBIF</a>, although it is downsampled to a single record per day. Martin also indicated to me that they're considering federating Movebank to other countries, which is important as <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&amp;list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">biodiversity data resilience</a> was a hot topic in our <a href=\"/notes/nas-rs-biodiversity\">meeting</a> a few days before.</p>\n<p><img src=\"/images/ngs-3.webp\" alt=\"%c\" title=\"The workshop was highly interactive through the 1.5 days. No laptops needed!\" ></p>\n<h2 id=\"storytelling-about-conservation-actions\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#storytelling-about-conservation-actions\"></a>Storytelling about conservation actions</h2>\n<p>I was really struck by how deeply the National Geographic staff were thinking about and co-designing solutions along with the academics involved. I got chatting to <a href=\"https://www.nationalgeographic.org/society/our-leadership/\">Ian Miller</a>, the chief scientist at NatGeo, about his scientific background (he's worked on all seven continents!) and how our <a href=\"/projects/ce\">conservation evidence database</a> might be of use to help the Society figure out the long-term impacts of their projects. I also met the person with the coolest job title there: <a href=\"https://www.linkedin.com/in/alextait/\">Alex Tait</a>, who is <a href=\"https://education.nationalgeographic.org/resource/mapping-change-roof-world/\">The Geographer</a> at the NGS. 
Alex, along with <a href=\"https://theorg.com/org/national-geographic-society/org-chart/lindsay-anderson\">Lindsay Anderson</a> and other NGS staff who participated, all had infectious enthusiasm about exploration combined with an encyclopedic knowledge of specific projects that they support involving explorers across the world.</p>\n<p>These projects ranged from the <a href=\"https://www.nationalgeographic.com/into-the-amazon/pink-dolphins-tricksters-and-thieves/\">Amazon River Dolphins</a> (to understand <a href=\"https://www.nationalgeographic.com/impact/article/fernando-trujillo-explorer-story\">aquatic health</a>) over to <a href=\"https://www.nationalgeographic.com/impact/article/alex-schnell-explorer-story\">cephalopod empathy</a> and <a href=\"https://www.nationalgeographic.com/impact/article\">many more</a>. These gave me a new perspective on the importance of <em>storytelling</em> as a key mechanism to help connect the dots from conservation actions to people; something that I've been learning from <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s <a href=\"/notes/junior-rangers\">video series</a> as well!</p>\n<p><a href=\"https://www.nationalgeographic.com/impact\"> <img src=\"/images/ngs-5.webp\" alt=\"%c\" title=\"I spent the whole return trip reading the impact stories. So very, very, very inspiring.\" > </a></p>\n<p>It's also worth noting that the NGS support goes beyond &quot;just&quot; filmmaking. 
Our own <a href=\"https://charlesemogor.com\">Charles Emogor</a> is also an <a href=\"https://explorers.nationalgeographic.org/directory/charles-agbor-emogor\">Explorer</a>, and recently received support from their <a href=\"https://www.nationalgeographic.org/society/our-programs/lab/\">Exploration Technology Lab</a> to get a bunch of <a href=\"https://www.wildlifeacoustics.com/products/song-meter-mini-2-aa\">biologgers</a> to support his research on <a href=\"/ideas/mapping-hunting-risks-for-wild-meat\">mapping hunting pressures</a>. Rather than placing a few big bets, the Society seems to focus on investing widely in a diverse range of people and geographies.</p>\n<h2 id=\"the-importance-of-hedgehogs\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-importance-of-hedgehogs\"></a>The importance of hedgehogs</h2>\n<p>A lot of the discussion at the workshop naturally focussed on charismatic mammals such as the amazing work done by the <a href=\"https://www.zambiacarnivores.org/\">Zambian Carnivore programme</a>. However, I also had in mind the importance of addressing issues closer to home in the UK as well so that we didn't ignore Europe.</p>\n<p>Luckily, before the workshop, I had grabbed a coffee with <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a> from the CCI, who has been bringing me up to speed on the <a href=\"https://www.mammalweb.org/en/nhmp\">National Hedgehog Monitoring programme</a> (did you know that British hedgehogs are now <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">vulnerable to extinction</a>?). 
This particular effort seems to tick a lot of boxes; it's a local and beloved species in the UK, it requires <a href=\"https://www.conservationevidence.com/individual-study/1018\">evidence-based interventions</a> to avoid making the problems worse, and also requires combining data sources (from camera traps to species distribution models to urban planning to the GPS Movebank data) to build up a really accurate high-res picture of what's going on.</p>\n<p>I brought up UK hedgehog conservation at the NatGeo workshop, and then while down at <a href=\"https://earthfest.world/\">Earthfest</a> at Google a few days later I learnt from <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a> that they've developed an extremely high-res map of <a href=\"https://eoscience-external.projects.earthengine.app/view/farmscapes\">woodland and hedgerows</a> in the UK.  I've therefore created a new student project on <a href=\"/ideas/hedgehog-mapping\">hedgehog mapping</a> and hope to recruit a summer intern for this. It would be extremely cool to put the pieces together with a very concrete project such as this as a first small step for the UEP.</p>\n<p><img src=\"/images/ngs-1.webp\" alt=\"%c\" title=\"NatGeo Basecamp is under construction, but still epic\" ></p>\n<p>I found the whole experience of visiting National Geographic inspirational, and not just because of the projects discussed. The walls of their HQ are full of incredible photographs of explorers all over the world, and a seemingly unbounded enthusiasm for exploring the unknown. 
I kind of thought I'd aged out of applying to become an explorer, but <a href=\"https://totalkatastrophe.blogspot.com/\">Kathy Ho</a> has been encouraging me to apply, and the same was echoed by the lovely conversations with NatGeo staffers.</p>\n<p>I'm therefore putting my thinking hat on for what my Explorers project proposal should be, as I am on academic sabbatical next year and have more freedom to travel; suggestions are welcome if you see me at the pub!</p>\n<p><img src=\"/images/ngs-4.webp\" alt=\"%c\" title=\"I might have deliberately gone the wrong way a few times while exploring the HQ\" ></p><h1>References</h1><ul><li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li>\n<li>Madhavapeddy (2025). We become Junior Rangers at Shenandoah. <a href=\"https://doi.org/10.59350/d27v1-5tk68\" target=\"_blank\"><i>10.59350/d27v1-5tk68</i></a></li>\n<li>Ellis-Soto et al (2023). A vision for incorporating human mobility in the study of human–wildlife interactions. Nature Ecology & Evolution. <a href=\"https://doi.org/10.1038/s41559-023-02125-6\" target=\"_blank\"><i>10.1038/s41559-023-02125-6</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/natgeo-urban-wildlife",
      "title": "Visiting National Geographic HQ and the Urban Exploration Project",
      "summary": "Visit to National Geographic HQ for workshop on global urban wildlife tracking initiative and human-wildlife coexistence research.",
      "date_published": "2025-06-07T00:00:00.000000Z",
      "date_modified": "2025-06-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "natgeo",
        "usa",
        "biodiversity",
        "urban"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/j6zkp-n7t82",
          "doi": "10.59350/j6zkp-n7t82",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/d27v1-5tk68",
          "doi": "10.59350/d27v1-5tk68",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41559-023-02125-6",
          "doi": "10.1038/s41559-023-02125-6",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/j6zkp-n7t82",
      "content_html": "<p>I spent a couple of days at the <a href=\"https://www.nationalacademies.org/home\">National Academy of Sciences</a> in the USA at the invitation of the <a href=\"https://royalsociety.org\">Royal Society</a>, who held a forum on &quot;<a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">Measuring Biodiversity for Addressing the Global Crisis</a>&quot;. It was a <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">packed program</a> for those working in evidence-driven conservation:</p>\n<blockquote>\n<p>Assessing biodiversity is fundamental to understanding the distribution of biodiversity, the changes that are occurring and, crucially, the effectiveness of actions to address the ongoing biodiversity crisis. Such assessments face multiple challenges, not least the great complexity of natural systems, but also a lack of standardized approaches to measurement, a plethora of measurement technologies with their own strengths and weaknesses, and different data needs depending on the purpose\nfor which the information is being gathered.</p>\n<p>Other sectors have faced similar challenges, and the forum will look to learn from these precedents with a view to building momentum toward standardized methods for using environmental monitoring technologies, including new technologies, for particular purposes.\n<cite>-- NAS/Royal Society <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">US-UK Scientific Forum on Measuring Biodiversity</a></cite></p>\n</blockquote>\n<p>I was honoured to talk about our work on using AI to &quot;connect the dots&quot; between disparate data like the academic literature and remote observations at scale. 
But before that, here's some of the bigger picture stuff I learnt...</p>\n<p><a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\"> <img src=\"/images/nas-rs-cover.webp\" alt=\"%c\" title=\"Identifying the bird is an exercise for the reader!\" > </a></p>\n<h2 id=\"shifting-conservation-to-a-winning-stance\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#shifting-conservation-to-a-winning-stance\"></a>Shifting conservation to a winning stance</h2>\n<p>The need for urgent, additional action came across loud and clear from all the top actors in biodiversity. On the bright side, we have made stellar progress in measuring more dimensions of biodiversity accurately than ever before in human history. But the field of biodiversity does not have a single &quot;simple question&quot; that needs answering, unlike many other science challenges in physics or chemistry. The ecosystem of nature measurements needs to span scales ranging from the micro (from fungi and soil health) to the macro (species richness and diversity), with geographical coverage across the planet but also hyperlocal accuracy for ecosystem services.</p>\n<p>One key question asked at the forum was how we can get to interoperable, pragmatic tools that enable all the actors involved in conservation actions (from the governments that set policy, to the private sector that controls the supply chains, to the people who have to live in and depend on natural services) to work together more effectively on gathering all the data needed.</p>\n<p>This interoperability has to emerge during a rapid shift towards digital methods, which are vulnerable to being <a href=\"https://www.bbc.com/future/article/20250422-usa-scientists-race-to-save-climate-data-before-its-deleted-by-the-trump-administration\">deleted and edited at scale</a> with decades of painstaking observations at risk at the moment.  
And in the middle of all this, machine learning is swooping in to perform data interpolation at scale, but also risks <a href=\"/notes/ai-should-unite-conservation\">dividing</a> and polluting observations with inaccurate projections.</p>\n<p><img src=\"/images/nas-rs-2.webp\" alt=\"%c\" ></p>\n<h2 id=\"what-is-an-optimistic-future-for-conservation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-is-an-optimistic-future-for-conservation\"></a>What is an optimistic future for conservation?</h2>\n<p>This is all quite the challenge even for a gung-ho computer scientist like me, and I was struggling with the enormity of it all! But things really clicked into place after the inspirational <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> pointed me at a <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">fantastic big-picture paper</a>:</p>\n<blockquote>\n<p>Drawing reasonable inferences from current patterns, we can predict that 100 years from now, the Earth could be inhabited by between 6-8 billion people, with very few remaining in extreme poverty, most living in towns and cities, and nearly all participating in a technologically driven, interconnected market economy.</p>\n<p>[...] we articulate a theory of social–environmental change that describes the simultaneous and interacting effects of urban lifestyles on fertility, poverty alleviation, and ideation.</p>\n<p><cite><a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation</a></cite></p>\n</blockquote>\n<p>They observe that the field of conservation has often &quot;succumbed to jeremiad, bickering, and despair&quot;. 
Much of this angst springs from the (failed) bets made by <a href=\"https://en.wikipedia.org/wiki/Paul_R._Ehrlich\">Paul Ehrlich</a>, who thinks <a href=\"https://www.nature.com/articles/d41586-024-03592-y\">humans are going to be wiped out</a> because of unbounded expansion. In response, conservation has become &quot;the art of slowing declines&quot; rather than achieving long-term wins. But instead of being moribund, the paper paints an optimistic, practical endgame for conservation:</p>\n<blockquote>\n<p>We suggest that lasting conservation success can best be realized when:</p>\n<ul>\n<li>the human population stabilizes and begins to decrease</li>\n<li>extreme poverty is alleviated</li>\n<li>the majority of the world's people and institutions act on a shared belief that it is in their best interest to care for rather than destroy the natural bases of life on Earth.</li>\n</ul>\n</blockquote>\n<p>It turns out that most of these conditions can be reasonably projected to happen in the next fifty years or so. Population is projected to <a href=\"https://en.wikipedia.org/wiki/Human_population_projections\">peak by the turn of the century</a>, <a href=\"https://openknowledge.worldbank.org/entities/publication/9d0fb27a-3afe-5999-8d8e-baf90b4331c0/full\">extreme poverty might reasonably be eradicated by 2050</a>, and <a href=\"https://iopscience.iop.org/article/10.1088/1748-9326/8/1/014025\">urban landuse will stabilise at 6% of terrestrial land</a> by 2030-ish.</p>\n<p><a href=\"https://academic.oup.com/view-large/figure/118140827/biy039fig4.jpeg\"> <img src=\"/images/nas-rs-6.webp\" alt=\"%c\" title=\"Connecting demographic and economic trends in the 21st century to the environment\" > </a></p>\n<p>Given this projection, the paper then points out that conservation doesn't need to save nature &quot;forever&quot;. 
Instead, we have to save enough nature now to &quot;breakthrough&quot; from the <a href=\"https://en.wikipedia.org/wiki/Great_Acceleration\">great acceleration</a> of WWII until we stabilise landuse.</p>\n<blockquote>\n<p>The profound danger is that by the time the foundations of recovery are in place, little of wildlife and wild places will be left. If society focuses only on economic development and technological innovation as a mechanism to pass through the bottleneck as fast as possible, then what remains of nature could well be sacrificed.\nIf society were to focus only on limiting economic growth to protect nature, then terrible poverty and population growth could overwhelm what remains.</p>\n<p>Either extreme risks narrowing the bottleneck to such an extent that our world passes through without its tigers, elephants, rainforests, coral reefs, or a life-sustaining climate. Therefore, the only sensible path for conservation is to continue its efforts to protect biodiversity while engaging in cities to build the foundations for a lasting recovery of nature.\n<cite>-- <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough</a></cite></p>\n</blockquote>\n<p>This puts what we need to achieve today in a far, far more pragmatic light:</p>\n<blockquote>\n<p>[...] it means that conservation faces another 30–50 years of extreme difficulty, when more losses can be expected. However, if we can sustain enough nature through the bottleneck—despite climate change, growth in the population and economy, and urban expansion—then we can see the future of nature in a dramatically more positive light.</p>\n</blockquote>\n<p>Conservation is all about solving difficult opportunity-cost decisions in society.\nScience can help calculate <a href=\"/papers/2023-pact-tmf\">credible counterfactuals</a> that allow policymakers to balance\nlimited resources to minimise nature harm while maximising benefit to humans. 
We can also figure out new <a href=\"/papers/2023-ncc-permanence\">economic methods</a> to figure out the value of future actions. When combined, this can help conservation break through the bottleneck of the next fifty years of nature loss... and computer science can make a serious <a href=\"https://fivetimesfaster.org/\">accelerative</a> impact here (yay!).</p>\n<p><img src=\"/images/nas-rs-5.webp\" alt=\"%c\" title=\"What does one call a group of ecology legends? A committee!\" ></p>\n<h2 id=\"topics-relevant-to-our-planetary-computing-research\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#topics-relevant-to-our-planetary-computing-research\"></a>Topics relevant to our planetary computing research</h2>\n<p>Having got my existential big-picture crisis under control, here are some more concrete thoughts about some of the joint ideas that emerged from the NAS meeting.</p>\n<h3 id=\"resilience-in-biodiversity-data\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#resilience-in-biodiversity-data\"></a>Resilience in biodiversity data</h3>\n<p>We've been doing a <a href=\"https://digitalflapjack.com/blog/yirgacheffe/\">lot</a> of <a href=\"https://digitalflapjack.com/weeknotes/2025-04-22/\">work</a> on mechanisms to <a href=\"/papers/2024-planetary-computing\">process and ingest</a> remote sensing data. All of our techniques also apply to biodiversity, except that the pipelines are even more complex due to the multi-modal nature of the data being stored. 
This can be clearly seen in this <a href=\"https://www.science.org/doi/10.1126/science.adq2110\">review on the decline of insect biodiversity</a> that speaker Nick Isaac and my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> published last month.</p>\n<p><a href=\"https://www.science.org/doi/10.1126/science.adq2110\"> <img src=\"/images/nas-rs-1.webp\" alt=\"%c\" title=\"(source: Science, 10.1126/science.adq2110)\" > </a></p>\n<p>The data itself isn't just from one source; instead, we need a pipeline of spatial (at different resolution) measurements, of different types (visual, acoustic, occurrence), of different provenance (experts, crowdsourced, museum), and from different hypotheses tests (evidence bases).</p>\n<p>Once the ingestion pipeline is in place, there's a full range of validation and combination and extrapolation involved, often involving AI methods these days.  The output from all of this is then tested to determine which <a href=\"/projects/ce\">conservation actions</a> to take.</p>\n<p><img src=\"/images/nas-rs-3.webp\" alt=\"%c\" title=\"Nick Isaac explains how different lines of biodiversity evidence are necessary\" ></p>\n<p><a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> also talked about the ambitious <a href=\"https://www.nature.com/articles/s41559-023-02171-0\">global biodiversity observing system</a> that he's been assembling a coalition for in recent years.  
They are using Docker as part of this via their <a href=\"https://boninabox.geobon.org/\">Bon in a Box</a> product but hitting scaling issues (a common problem due to the size of geospatial tiles).</p>\n<p><a href=\"https://www.nature.com/articles/s41559-023-02171-0\"> <img src=\"/images/nas-rs-7.webp\" alt=\"%c\" title=\"Andrew Gonzalez explains the GBioS concept\" > </a></p>\n<p>There's a good tie in for collaboration with us here via the next-generation <a href=\"https://patrick.sirref.org/weekly-2025-05-12/index.xml\">time-travelling shell</a> that <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> is developing that can handle this via <a href=\"https://www.tunbury.org/zfs-system-concept/\">ZFS snapshots</a>.  <a href=\"https://mynameismwd.org\">Michael Dales</a> has been applying this to scaling the <a href=\"/papers/2024-life\">LIFE</a> and <a href=\"/papers/2024-food-life\">FOOD</a> pipelines recently with <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>. And meanwhile <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a> have been researching <a href=\"/papers/2024-terracorder\">embedded biodiversity sensors</a>. 
The overall theme is that we need to make the hardware and software stack involved far easier to <a href=\"/papers/2024-planetary-computing\">use for non-expert programmers</a>.</p>\n<p><img src=\"/images/nas-rs-8.webp\" alt=\"%c\" title=\"A key part of the GBioS vision is to have a federated system\" ></p>\n<h3 id=\"observing-the-earth-through-geospatial-foundation-models\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#observing-the-earth-through-geospatial-foundation-models\"></a>Observing the earth through geospatial foundation models</h3>\n<p>Another problem that several speakers discussed was how complex biodiversity observations are to manage since they span multiple scales. In my talk, I described the new <a href=\"https://github.com/FrankFeng-23/btfm_project\">TESSERA</a> geospatial foundation model that <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> have been leading in Cambridge. As this is a pre-trained foundation model, it needs to be finetuned to specific downstream tasks. A number of people came up after my talk with suggestions for collaborations here!</p>\n<p>Firstly, <a href=\"https://earthshotprize.org/winners-finalists/naturemetrics/\">Kat Bruce</a> (fresh from <a href=\"https://www.bbc.com/news/articles/cre8xxd7xl8o\">spraying pondwater</a> with Prince William) explained how <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a> are gathering <a href=\"https://en.wikipedia.org/wiki/Environmental_DNA\">eDNA</a> from many diverse sources. 
The data is of varying licenses depending on which customer paid for the acquisition, but overall there is a lot of information about species presence that's very orthogonal to the kind of data gathered from satellite observations.</p>\n<p><img src=\"/images/nas-rs-4.webp\" alt=\"%c\" title=\"Kat Bruce showing how much information is packed into eDNA measurements\" ></p>\n<p>Secondly, <a href=\"https://darulab.org/\">Barnabas Daru</a> from Stanford described his efforts to map plant traits to species distribution models. This complements some work <a href=\"https://coomeslab.org\">David Coomes</a> has been leading recently in our group with <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a> and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> on mapping rare plants globally. The basic problem here is that plant occurrence data is <em>extremely</em> data deficient and spatially biased for 100k+ species, and so we'll need cunning interpolation techniques to fill in the data gaps.</p>\n<p><img src=\"/images/nas-rs-12.webp\" alt=\"%c\" title=\"Barnabas Daru shows his maps on gathering plant samples from all over the world\" ></p>\n<p>When back in Cambridge, I'm going to arrange for all of us to chat to see if we can somehow combine eDNA, fungal biodiversity, plant traits and satellite foundation models into a comprehensive global plant species map!</p>\n<h3 id=\"evidence-synthesis-from-the-literature\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#evidence-synthesis-from-the-literature\"></a>Evidence synthesis from the literature</h3>\n<p>There was also huge enthusiasm for another of our projects on <a href=\"/projects/ce\">analysing the academic literature</a> at scale. 
While we've been using it initially to accelerate the efficacy and accuracy of <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">systematic reviews</a> for <a href=\"https://conservationevidence.com\">Conservation Evidence</a>, there are a huge number of follow-up benefits to having a comprehensive data corpus.</p>\n<p>Firstly, <a href=\"http://elphick.lab.uconn.edu/\">Chris Elphick</a> pointed out a metasynthesis where they manually integrate recent <a href=\"https://academic.oup.com/bioscience/advance-article-abstract/doi/10.1093/biosci/biaf034/8115312\">hypotheses about insect stressors and responses</a> into a network (3385 edges / 108 nodes). It found that the network is highly interconnected, with agricultural intensification often identified as a root cause for insect decline. Much like the CE manually labeled dataset, it should be possible to do hypothesis searches in our LLM pipeline to expand this search and make it more dynamic.</p>\n<p>Secondly, <a href=\"http://oisin.info\">Oisin Mac Aodha</a>, fresh from a <a href=\"https://watch.eeg.cl.cam.ac.uk/w/7aqBd2Nn9E6QpMvnoBPxuQ\">recent talk</a> in Cambridge, discussed his <a href=\"https://arxiv.org/abs/2502.14977\">recent work</a> on few-shot species range estimation and also <a href=\"https://arxiv.org/abs/2412.14428\">WildSAT text/image encoding</a>. His example showed how you could not only spot a species from images, but also use text prompts to refine the search. 
An obvious extension for us to have a go at here is to combine our large corpus of academic papers with these models to see how good the search/range estimation could get with a much larger corpus of data.</p>\n<p><img src=\"/images/nas-rs-13.webp\" alt=\"%c\" title=\"I am proud to have pronounced Oisin's name correctly while introducing his recent CCI seminar\" ></p>\n<p>And thirdly, I finally met my coauthor <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\">David Williams</a> in the flesh for the first time! We've worked together recently on the <a href=\"/papers/2024-food-life\">biodiversity impact of food</a>, and we had a long discussion over dinner about whether we could glean more behavioural data about how people react from the wider literature. This would require us to expand our literature corpus into <a href=\"/ideas/grey-lit-crawl\">grey literature</a> and policy documents, but this is something that <a href=\"https://toao.com\">Sadiq Jaffer</a> and I want to do soon anyway.</p>\n<p>The connective tissue across these seemingly disparate projects is that there is a strong connection between what you can observe from space (the canopies of trees) and the traits expressed via knowledge of plant physiology and their DNA. If we could figure out how to connect the dots from the observed species to the physiological traits to the bioclimatic range variables, we could work out where the (many) data-deficient plant species in the world are! 
I'll be hosting a meeting in Cambridge soon on this since we're already <a href=\"/notes/ukri-grant-terra\">working on it</a>.</p>\n<h3 id=\"visualisations-in-biodiversity\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#visualisations-in-biodiversity\"></a>Visualisations in biodiversity</h3>\n<p>The most unexpectedly cool talk was <a href=\"https://www.weizmann.ac.il/plants/Milo/home\">Ron Milo</a> showing us visualisations of the <a href=\"https://www.pnas.org/doi/10.1073/pnas.1711842115\">mass distribution of all life on earth</a>. His work really puts our overall challenge into context, as it shows just how utterly dominated wildlife is by domesticated animals.</p>\n<p><img src=\"/images/nas-rs-11.webp\" alt=\"%c\" title=\"The dominant mammal biomass on the planet is domesticated animals\" ></p>\n<p>It struck me just how important these sorts of high-level visualisations are in putting detailed numbers into context. For example, his breakdown of global biomass showed that plants are by far the &quot;heaviest&quot; living thing on earth, and that ocean organisms do still dominate animal biomass.</p>\n<p><img src=\"/images/nas-rs-9.webp\" alt=\"%c\" ></p>\n<p><img src=\"/images/nas-rs-10.webp\" alt=\"%c\" ></p>\n<p>My favourite new animation library on the block is <a href=\"https://animejs.com/\">AnimeJS</a>, and so I plan to try some nice animations for <a href=\"/papers/2024-life\">LIFE</a> and <a href=\"/papers/2024-food-life\">FOOD</a> along these lines after the academic term finishes.</p>\n<p>And that's a wrap on my notes for now! 
I'm still hanging out in the US for a bunch more meetings (including one at <a href=\"https://www.nationalgeographic.com/\">National Geographic HQ</a>), so I'll update this note when the official RS/NAS videos and writeup come out.</p>\n<p><em>(Update 5th June: the <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&amp;list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">full talk video series</a> is now online at the National Academy of Sciences channel. Enjoy!)</em></p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li>\n<li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li>\n<li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li>\n<li>Ferris et al (2024). Planetary computing for data-driven environmental policy-making. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2303.04501\" target=\"_blank\"><i>10.48550/arXiv.2303.04501</i></a></li>\n<li>Madhavapeddy (2025). Technology needs to unite conservation, not divide it. <a href=\"https://doi.org/10.59350/vwrvd-3sg08\" target=\"_blank\"><i>10.59350/vwrvd-3sg08</i></a></li>\n<li>Sanderson et al (2018). 
From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation. BioScience. <a href=\"https://doi.org/10.1093/biosci/biy039\" target=\"_blank\"><i>10.1093/biosci/biy039</i></a></li>\n<li>Jones (2024). The scale of the biodiversity crisis laid bare. Nature. <a href=\"https://doi.org/10.1038/d41586-024-03592-y\" target=\"_blank\"><i>10.1038/d41586-024-03592-y</i></a></li>\n<li>Gonzalez et al (2023). A global biodiversity observing system to unite monitoring and guide action. Nature Ecology & Evolution. <a href=\"https://doi.org/10.1038/s41559-023-02171-0\" target=\"_blank\"><i>10.1038/s41559-023-02171-0</i></a></li>\n<li>Halsch et al (2025). Meta-synthesis reveals interconnections among apparent drivers of insect biodiversity loss. BioScience. <a href=\"https://doi.org/10.1093/biosci/biaf034\" target=\"_blank\"><i>10.1093/biosci/biaf034</i></a></li>\n<li>Lange et al (2025). Feedforward Few-shot Species Range Estimation. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2502.14977\" target=\"_blank\"><i>10.48550/arXiv.2502.14977</i></a></li>\n<li>Daroya et al (2025). WildSAT: Learning Satellite Image Representations from Wildlife Observations. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2412.14428\" target=\"_blank\"><i>10.48550/arXiv.2412.14428</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/nas-rs-biodiversity",
      "title": "What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity",
      "summary": "Report from NAS/Royal Society forum on standardized biodiversity measurement technologies covering foundation models, eDNA and evidence synthesis.",
      "date_published": "2025-05-24T00:00:00.000000Z",
      "date_modified": "2025-06-06T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "biodiversity",
        "conservation",
        "policy",
        "royalsociety",
        "usa"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2408.02407",
          "doi": "10.48550/arXiv.2408.02407",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2303.04501",
          "doi": "10.48550/arXiv.2303.04501",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/vwrvd-3sg08",
          "doi": "10.59350/vwrvd-3sg08",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1093/biosci/biy039",
          "doi": "10.1093/biosci/biy039",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-024-03592-y",
          "doi": "10.1038/d41586-024-03592-y",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41559-023-02171-0",
          "doi": "10.1038/s41559-023-02171-0",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1093/biosci/biaf034",
          "doi": "10.1093/biosci/biaf034",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2502.14977",
          "doi": "10.48550/arXiv.2502.14977",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2412.14428",
          "doi": "10.48550/arXiv.2412.14428",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/d27v1-5tk68",
      "content_html": "<p>What might a Dame of the Realm, a Fellow of the Royal Society, the latest member of the UK Joint Nature Conservation Committee, and me all covet? That's right: a <a href=\"https://www.nps.gov/kids/become-a-junior-ranger.htm\">Junior Ranger</a> badge from <a href=\"https://www.nps.gov/shen/index.htm\">Shenandoah National Park</a>!  After an <a href=\"/notes/nas-rs-biodiversity\">intense</a> few days, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a>, <a href=\"https://en.wikipedia.org/wiki/E._J._Milner-Gulland\">EJ Milner-Gulland</a> and I headed into nature to experience the spectacular landscapes of the Blue Ridge Mountains in Virginia and do some birding.</p>\n<p>The National Park Service in the US runs a wonderful program for anyone aged 8+ (which we just about qualified for) to introduce people to nature, and Shenandoah <a href=\"https://www.goshenandoah.com/activities-events/national-park-service-programs\">is no exception</a>. We visited the local ranger lodge in the park, and picked up a program booklet. They're full of activities for kids to do, but of course adults also pick up a lot of random knowledge (such as the <a href=\"https://en.wikipedia.org/wiki/Shenandoah_salamander\">endemic salamander species</a> in the region).</p>\n<p><img src=\"/images/shen-1.webp\" alt=\"%c\" title=\"EJ and Julia hard at work on their junior ranger books\" ></p>\n<p>The activities are rigorous: in addition to crosswords and drawings about the park, we had to compose poems and haiku, and also experience nature. First, we had to dance like butterflies... 
(I was a Monarch, so I occasionally had to respawn while migrating).</p>\n<p><div class=\"video-center\"><iframe title=\"Dancing like butterflies in Shenandoah\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/42b85cbd-a5e3-4211-9e3c-1e0970e08e1f\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>...and then hug a tree, of which there were many around, and they told us their stories...</p>\n<p><img src=\"/images/shen-2.webp\" alt=\"%c\" ></p>\n<p>...and try not to fall off sweeping overlooks as we clambered around...</p>\n<p><img src=\"/images/shen-3.webp\" alt=\"%c\" ></p>\n<p>And once that was done, we returned to the ranger lodge to swear our ranger oaths, with a delightful Ranger who knew lots about the local area, the forest/meadow management, and where we might spot more of the salamanders (which are apparently under some threat from an invasive lookalike that is not endemic but is outcompeting the local salamander).</p>\n<p><div class=\"video-center\"><iframe title=\"Junior Ranger oath at Shenandoah National Park\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/cb5199b9-1a05-49e9-8bdd-9f2e7f96834b\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<h2 id=\"bills-conservation-concepts\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#bills-conservation-concepts\"></a>Bill's Conservation Concepts</h2>\n<p>During our hikes to the various lovely sites, we also took some time to record the 101st video for <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s wonderful <a href=\"https://www.youtube.com/@Bill_Sutherland\">Conservation Concepts</a> video series, about the work <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been helping out the <a href=\"/projects/ce\">Conservation Evidence Copilots</a> team 
on.</p>\n<iframe width=\"560\" height=\"315\" src=\"https://www.youtube-nocookie.com/embed/anM8GVxa3Lo?si=B3h0bN6L_bfmlF46\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>\n<p>I highly recommend watching <a href=\"https://www.youtube.com/@Bill_Sutherland\">Bill's whole channel series</a> as it's full of endlessly fascinating facts, and congratulations to Bill on reaching the century mark on the number he's done!</p>\n<p><img src=\"/images/shen-4.webp\" alt=\"%c\" title=\"I'm never taking my badge off, best day ever\" ></p><h1>References</h1><ul><li>Madhavapeddy (2025). What I learnt at the National Academy of Sciences US-UK Forum on Biodiversity. <a href=\"https://doi.org/10.59350/j6zkp-n7t82\" target=\"_blank\"><i>10.59350/j6zkp-n7t82</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/junior-rangers",
      "title": "We become Junior Rangers at Shenandoah",
      "summary": "Earning Junior Ranger badges at Shenandoah National Park and recording Conservation Concepts video series episode.",
      "date_published": "2025-05-27T00:00:00.000000Z",
      "date_modified": "2025-05-27T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "usa"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/j6zkp-n7t82",
          "doi": "10.59350/j6zkp-n7t82",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2025-dl-rcn-1",
      "content_html": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> has just released the latest survey paper he lead on energy-aware approaches to optimise deep-learning training and inference on embedded devices, such as those benchmarked in &quot;<a href=\"/papers/2025-npu-bench\">Benchmarking Ultra-Low-Power μNPUs</a>&quot; recently. This comprehensive survey is particularly relevant for IoT and mobile devices that face substantial energy constraints to prolong battery life or operate intermittently via energy harvesting. Josh synthesizes the evolving landscape of energy-aware deep learning approaches, examining their methodologies, implications for energy consumption and system-level efficiency, and their limitations across different network types, hardware platforms, and application scenarios. The work is especially timely given the push to run more AI workloads on edge devices for privacy, latency, and cost reasons.</p>\n<blockquote>\n<p>We present an overview of such approaches, outlining their methodologies, implications for energy consumption and system-level efficiency, and their limitations in terms of supported network types, hardware platforms, and application scenarios. We hope our review offers a clear synthesis of the evolving energy-aware DL landscape and serves as a foundation for future research in energy-constrained computing.</p>\n</blockquote>\n<p>Any comments, please do let any of us know!</p><h1>References</h1><ul><li>Millar et al (2025). Energy-Aware Deep Learning on Resource-Constrained Hardware. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2505.12523\" target=\"_blank\"><i>10.48550/arXiv.2505.12523</i></a></li>\n<li>Millar et al (2025). Benchmarking Ultra-Low-Power μNPUs. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3680207.3765264\" target=\"_blank\"><i>10.1145/3680207.3765264</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2025-dl-rcn-1",
      "title": "New preprint survey on energy-aware deep learning on embedded hardware",
      "summary": "Survey paper on energy-aware approaches for optimizing deep learning training and inference on embedded devices.",
      "date_published": "2025-05-20T00:00:00.000000Z",
      "date_modified": "2025-05-20T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "esp32",
        "embedded",
        "sensing",
        "ai",
        "llms"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-dl-rcn.pdf",
          "mime_type": "application/pdf",
          "title": "Energy-Aware Deep Learning on Resource-Constrained Hardware"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2505.12523",
          "doi": "10.48550/arXiv.2505.12523",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3680207.3765264",
          "doi": "10.1145/3680207.3765264",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-ce-llm-3",
      "content_html": "<p>Our paper on <a href=\"/papers/2024-ce-llm\">how the careful design of LLMs is crucial for expert-level evidence retrieval</a> has been published today in PLOS One and is available fully open access!</p>\n<blockquote>\n<p>Our findings suggest that, with careful domain-specific design, LLMs could potentially be powerful tools for enabling expert-level use of evidence syntheses and databases. However, general LLMs used &quot;out-of-the-box&quot; are likely to perform poorly and misinform decision-makers. By establishing that LLMs exhibit comparable performance with human synthesis experts on providing restricted responses to queries of evidence syntheses and databases, future work can build on our approach to quantify LLM performance in providing open-ended responses.</p>\n</blockquote>\n<p>In a nutshell, we tested 10 LLMs with six different retrieval strategies on their ability to answer questions related to conservation, benchmarked against the <a href=\"/projects/ce\">Conservation Evidence</a> database that has been hand-assembled by experts over the last two decades. In some of the retrieval scenarios, models were <em>only</em> allowed to use their pretrained knowledge, whereas in others they had access to the relevant parts of the hand-curated database.</p>\n<p>We found that language models had very varying results when relying only on their pretrained data, and were particularly bad at answering questions about reptile conservation.\nHowever, given some extra training with the CE database, their performance improved dramatically.\nWhen we put these models head to head with human experts (from the conservation evidence team), with a set of questions and with RAG access to the database, we found that the models were just as good as our experts, but answered the questions much much much faster (near instant).</p>\n<p>Essentially, LLMs without extra training are likely to perform poorly and misinform decision-makers. 
This is crucial when considering how to build AI infrastructure for <a href=\"/notes/ai-should-unite-conservation\">public policymaking</a>.</p>\n<p>Particular props to <a href=\"mailto:ri301@cam.ac.uk\">Radhika Iyer</a> who did much of this work in her summer break last year as part of the <a href=\"https://teaching.eng.cam.ac.uk/content/urop-available-projects\">UROP</a> program.\nSee also the fantastic <a href=\"https://watch.eeg.cl.cam.ac.uk/w/ijC1E36q7fn2qwxs7opSJq\">EEG seminar video</a> below of the talk that the student group who worked on this over the summer gave towards the end of 2024! And as an opportunistic advert, we are recruiting (with <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, <a href=\"https://samreynolds.org\">Sam Reynolds</a>, <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> and the rest of the CE team) for more undergrads for the coming summer.</p>\n<p><div class=\"video-center\"><iframe title=\"Conservation Evidence\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/8c44f016-07c0-4484-833f-b554679f175c\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p><small class=\"credits\"><em>This summary is adapted from a social media <a href=\"https://www.linkedin.com/posts/samandrewreynolds_out-the-box-llms-are-not-ready-to-answer-activity-7329135222241849344-1AXx\">summary</a> by <a href=\"https://samreynolds.org\">Sam Reynolds</a>. Join the conversation there!</em></small></p><h1>References</h1><ul><li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li>\n<li>Madhavapeddy (2025). 
Technology needs to unite conservation, not divide it. <a href=\"https://doi.org/10.59350/vwrvd-3sg08\" target=\"_blank\"><i>10.59350/vwrvd-3sg08</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-ce-llm-3",
      "title": "Out-of-the-box LLMs are not ready for conservation decision making",
      "summary": "PLOS One publication showing pretrained LLMs perform poorly on conservation questions but improve dramatically with Conservation Evidence database training.",
      "date_published": "2025-05-16T00:00:00.000000Z",
      "date_modified": "2025-05-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":ce",
        "evidence",
        "llms",
        "ai",
        "conservation"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-ce-llm.pdf",
          "mime_type": "application/pdf",
          "title": "Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/vwrvd-3sg08",
          "doi": "10.59350/vwrvd-3sg08",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/ymx6b-5th97",
      "content_html": "<p>I joined Cambridge's loftily named <a href=\"https://www.governance.cam.ac.uk/committees/essc/Pages/default.aspx\">Environment Sustainability Strategy Committee</a> this academic year, and have attended a couple of meetings with the latest one being held today. While a lot of what goes on is intricately tied into the University's rather <a href=\"https://www.governance.cam.ac.uk/Pages/default.aspx\">special</a> governance structure and the complexity of the College system, there has been significant progress on making all of this more visible more widely.</p>\n<p><a href=\"mailto:Sally.Pidgeon@admin.cam.ac.uk\">Sally Pidgeon</a>, our wonderful head of <a href=\"https://www.environment.admin.cam.ac.uk/\">Enviromental Sustainaibility</a>, has been redeveloping the public website and has put a lot of interesting data online.\nThere is now a new <a href=\"https://www.environment.admin.cam.ac.uk/\">Environmental Sustainability website</a> that tracks the University <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach\">committment</a> structure more closely, with the areas broken up into <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/carbon-and-energy\">Carbon &amp; Energy</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/travel-and-transport\">Travel &amp; Transport</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/waste-and-circular-economy\">Waste &amp; Circular Economy</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/biodiversity\">Biodiversity</a>, and <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/water\">Water</a> usage.</p>\n<p>These pages makes it far clearer what our University aims are for operational environmental sustainability, and how we're getting there. 
There's also a dedicated area to <a href=\"https://www.environment.admin.cam.ac.uk/our-progress\">track our actual progress</a> along with a bunch of <a href=\"https://www.environment.admin.cam.ac.uk/news\">case studies</a> such as our own <a href=\"https://www.environment.admin.cam.ac.uk/news/david-attenborough-building-outstanding-environmental-management\">David Attenborough Building at the CCI</a>!</p>\n<p>Some highlights from the progress as I read through them:</p>\n<ul>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/carbon-and-energy-progress\">Carbon &amp; Energy progress</a> reports on two different ways of measuring our energy usage: market <em>or</em> location-based. The location-based emissions reporting is quite straightforward as it involves calculating the kWh of electricity used multiplied by the local grid emissions, therefore representing the mean emission resulting from energy generation within the local Cambridge area.<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> <br>The <a href=\"https://ghgprotocol.org/sites/default/files/2022-12/Scope2_ExecSum_Final.pdf\">market-based approach</a> calculates the emissions resulting from the energy supplier that we contract which spreads out the emissions calculation based on the contracts the energy supplier has. The market-based approach has many of the complexities that we've grappled with in <a href=\"https://4c.cst.cam.ac.uk\">4C</a> for avoided emissions, but is useful for <a href=\"https://ghgprotocol.org/sites/default/files/2022-12/Scope2_ExecSum_Final.pdf\">net-zero reporting</a> of GHG emissions.  
While this is best summarised as being &quot;bloody complicated&quot;, it's good to see the University reporting <em>both</em> calculations and letting the readers decide which (or both) calculations to use.\nAnd finally, the use of the term &quot;natural gas&quot; turns out to be a surprisingly <a href=\"https://climatecommunication.yale.edu/publications/should-it-be-called-natural-gas-or-methane/\">bad idea</a>.  Names <a href=\"https://news.gallup.com/opinion/polling-matters/169541/name-affordable-care-act-obamacare.aspx\">do matter</a> when it comes to public communication.\n<a href=\"https://www.environment.admin.cam.ac.uk/our-progress/carbon-and-energy-progress\"> <img src=\"/images/essc-1.webp\" alt=\"%c\" > </a></li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/travel-and-transport-progress\">Transport &amp; Travel progress</a> is fantastic to go through, as I worked on this with Ian Leslie absolutely ages ago with a <a href=\"/papers/2015-aarhus-databox\">Databox</a>-based <a href=\"/papers/2012-mpm-caware\">commuting calculator</a>!  However, it's a little disappointing to see that there hasn't been much of a systematic change in the modes of transport used, and also that &quot;work-from-home&quot; is excluded from the figures here as that's an obvious way to reduce the emissions associated with travelling. 
It's also interesting to see that business flying has bounced back hard since the pandemic despite strict <a href=\"https://www.environment.admin.cam.ac.uk/files/guidelines_for_sustainable_business_travel_approved.pdf\">business travel guidelines</a> that require us to use trains when possible.\n<a href=\"https://www.environment.admin.cam.ac.uk/our-progress/travel-and-transport-progress\"> <img src=\"/images/essc-2.webp\" alt=\"%c\" > </a></li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/waste-and-circular-economy-progress\">Waste and Circular economy progress</a> appears to be largely flatlined in the last couple of years with not much substantive progress but this is also tied to the <a href=\"https://www.em.admin.cam.ac.uk/reshaping-our-estate\">amount of building work</a> going on in the University and isn't a relative metric (i.e. more building projects will result in more waste, but the University does need to do this building for its operations).</li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/files/uoc_bap.pdf\"> <img src=\"/images/bap-1.webp\" alt=\"%r\" > </a> <a href=\"https://www.environment.admin.cam.ac.uk/our-progress/biodiversity-progress\">Biodiversity progress</a> is closest to my heart, but also the hardest to assess despite the comprehensive <a href=\"https://www.environment.admin.cam.ac.uk/files/uoc_bap.pdf\">Biodiversity Action Plan</a> from last year (not because anyone's doing a bad job, but biodiversity is just a <em>really</em> complicated <a href=\"/papers/2024-life\">metric</a>!). 
There's a University-wide biodiversity manager now and a really well described set of action points here.\n<br> My suggestion during the meeting (and one I'll turn into a project idea soon) is that we should put spatial polygons of the progress described up as a layer over the <a href=\"https://map.cam.ac.uk\">University map</a> so people can overlay these data points and get a sense of what's going on (and where we don't have data). <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> has been <a href=\"https://ancazugo.github.io/research/outreach/2025/04/27/weekly-notes.html\">steadily working</a> with the Estates department on a side project regarding this as well!</li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/water-progress\">Water progress</a> shows some of the difficulty of long-term reporting in this space, as a quick glance seems to reveal that we're getting worse in terms of water consumption. However, our monitoring mechanisms were improved in recent years with smart meters, and so we're just getting more accurate. However, the rise in <a href=\"/notes/ai-for-science-2024\">AI for research</a> has meant that the demand for GPUs is causing our cooling needs to spike as well, with a corresponding increase in water usage.</li>\n</ul>\n<p>So, lots to digest in here, and something I'm still piecing together in the context of the <a href=\"/notes/cambridge-green-blue\">Cambridge Green Blue</a> idea! The overall message seems clear that we need to continue to push harder for progress towards our net-zero goals to be far higher up the University's strategic plan than it currently is. 
That doesn't necessarily just involve spending more money, but bringing the juggernaut of <a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\">research innovation</a> around here to bear, as well as shifting <a href=\"https://csaenvironmental.co.uk/projects/lord-bridges-solar-farm/\">landuse for renewable energy</a> while preserving biodiversity and water according to the biodiversity action plan.</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p><a href=\"https://patrick.sirref.org\">Patrick Ferris</a> has developed a <a href=\"https://github.com/geocaml/carbon-intensity\">carbon-intensity</a> based on this reporting style which <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> then used in a <a href=\"/papers/2024-loco-carbonres\">carbon-aware DNS server</a> recently. This is an example of location-based emissions data being used.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Chaudhry et al (2015). Personal Data: Thinking Inside the Box. <a href=\"https://doi.org/10.7146/aahcc.v1i1.21312\" target=\"_blank\"><i>10.7146/aahcc.v1i1.21312</i></a></li>\n<li>Elsmore et al (2012). Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/2181196.2181201\" target=\"_blank\"><i>10.1145/2181196.2181201</i></a></li>\n<li>Madhavapeddy (2024). Royal Society and DeepMind host AI for Science Forum. <a href=\"https://doi.org/10.59350/0znpc-fw825\" target=\"_blank\"><i>10.59350/0znpc-fw825</i></a></li>\n<li>Madhavapeddy (2025). The Cambridge \"Green Blue\" competition to reduce emissions. 
<a href=\"https://doi.org/10.59350/y1g67-aq825\" target=\"_blank\"><i>10.59350/y1g67-aq825</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/cambridge-essc-progress",
      "title": "Learnings from the Cambridge Environmental Sustainability Committee",
      "summary": "Insights from Cambridge University's Environmental Sustainability Strategy Committee on carbon reduction, biodiversity, and operational sustainability progress.",
      "date_published": "2025-05-13T00:00:00.000000Z",
      "date_modified": "2025-05-13T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "biodiversity",
        "policy",
        "cambridge",
        "urban"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.7146/aahcc.v1i1.21312",
          "doi": "10.7146/aahcc.v1i1.21312",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/2181196.2181201",
          "doi": "10.1145/2181196.2181201",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0znpc-fw825",
          "doi": "10.59350/0znpc-fw825",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/y1g67-aq825",
          "doi": "10.59350/y1g67-aq825",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/32h4v-5kt36",
      "content_html": "<p>In my earlier note about how <a href=\"/notes/ai-should-unite-conservation\">AI should unite conservation</a>, I talked about the robust debate\nongoing within Cambridge about whether or not we're too &quot;AI obsessed&quot; and are losing track of our goals in the rush to adopt learning algorithms. <a href=\"https://www.communications.cam.ac.uk/our-team\">Jacqueline Garget</a> has written a <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">brilliant roundup</a> about how colleages like <a href=\"https://samreynolds.org\">Sam Reynolds</a>, <a href=\"https://www.gci.cam.ac.uk/people/members/dr-chris-sandbrook\">Chris Sandbrook</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> in the\n<a href=\"https://www.conservation.cam.ac.uk\">CCI</a> are leading conversations to make sure we advance with eyes wide open.</p>\n<p><a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\"> <img src=\"/images/camacuk-ainature.webp\" alt=\"%c\" > </a></p>\n<p>The <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">article</a> covers many areas of concern to us right now: the takeover by big tech companies of data, our own <a href=\"/projects/ce\">conservation copilot</a> project, and ultimately how people and equity must remain at the centre of this process if we are to avoid causing harm to humans.</p>\n<blockquote>\n<p>Have you ever persisted in following your SatNav even when you knew you were\ngoing in the wrong direction?</p>\n<p>If so, you’ll know that placing all your trust in a machine powered by AI, without also engaging your own intelligence, does not always get you where you want to go.</p>\n<p>This is the message that a group of conservation scientists at Cambridge is pushing hard.\nEfforts to protect the natural world need all the help they can get - but before embracing AI as the solution, we need discussions about its risks and wider implications.\n<cite>-- <a 
href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">To save nature, AI needs our help</a> - cam.ac.uk (2025)</cite></p>\n</blockquote>\n<p>Last week, we held a brilliant half-day <a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\">AI for Climate and Nature Day</a><sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> with <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> that had many of the CCI community present, and this topic was at the forefront of the group discussions at the end.</p>\n<p><a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\"> <img src=\"/images/aicamday-1.webp\" alt=\"%c\" title=\"An annotated guide to the AI@Cam day\" > </a></p>\n<p>I thought <a href=\"https://www.gci.cam.ac.uk/people/members/dr-chris-sandbrook\">Chris Sandbrook</a>'s point about societal change was key:</p>\n<blockquote>\n<p>If we give all our attention to inventing new AI tools to fix specific conservation problems - important as these are - we’re missing a trick.&quot;</p>\n<p>AI’s biggest impact on biodiversity is probably going to be through the ways it changes wider society.\n<cite>-- <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution#section-FkJRUuRF4m\">Chris Sandbrook</a></cite></p>\n</blockquote>\n<p>I've been thinking recently that this principle applies at a <a href=\"/notes/cambridge-green-blue\">local level</a> as well, and not just with respect to AI. 
We generally need to figure out how to change incentives towards more positive <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">collective action</a>, with <a href=\"/notes/carbon-credits-vs-offsets\">lightweight ways of keeping score</a> that do not give perverse incentives to cheat.</p>\n<p>One really interesting path (pun intended) in this direction is <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>'s project on <a href=\"/ideas/walkability-for-osm\">generating urban walkability maps</a> that I've been supervising this year for the CompSci MPhil. Gabriel combines <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a>'s <a href=\"https://ancazugo.github.io/research/outreach/2025/04/27/weekly-notes.html\">urban tree maps</a> with OSM labels in order to help people to really enjoy walking around cities. Imagine you want to bias your experience of walking to work along different dimensions such as the chance of seeing a particular bird you like, or need to go shopping at a local coop, or need to find a safe running route late at night. AI should be a tool that helps you to do all of this, and improve the general experience of a human wanting to get the most out of nature, and generally help humans value their wild neighbours.</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>I only had time to do a <a href=\"https://bsky.app/profile/anil.recoil.org/post/3lo43thrhvs2p\">Bluesky post storm</a> and <a href=\"https://jon.recoil.org\">Jon Ludlam</a> did a <a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\">roundup</a> as well.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). Disentangling carbon credits and offsets with contributions. <a href=\"https://doi.org/10.59350/g4ch1-64343\" target=\"_blank\"><i>10.59350/g4ch1-64343</i></a></li>\n<li>Madhavapeddy (2025). 
Technology needs to unite conservation, not divide it. <a href=\"https://doi.org/10.59350/vwrvd-3sg08\" target=\"_blank\"><i>10.59350/vwrvd-3sg08</i></a></li>\n<li>Madhavapeddy (2025). The Cambridge \"Green Blue\" competition to reduce emissions. <a href=\"https://doi.org/10.59350/y1g67-aq825\" target=\"_blank\"><i>10.59350/y1g67-aq825</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/humans-save-nature-not-ai",
      "title": "Humans are the ones that will save nature, helped by AI",
      "summary": "Discussion of responsible AI adoption in conservation, emphasizing human agency and equity over technological solutions.",
      "date_published": "2025-05-07T00:00:00.000000Z",
      "date_modified": "2025-05-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "biodiversity",
        "policy"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/g4ch1-64343",
          "doi": "10.59350/g4ch1-64343",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/vwrvd-3sg08",
          "doi": "10.59350/vwrvd-3sg08",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/y1g67-aq825",
          "doi": "10.59350/y1g67-aq825",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/y1g67-aq825",
      "content_html": "<p><a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> recently gave a great <a href=\"https://watch.eeg.cl.cam.ac.uk/w/qEsMt2Ayk37SaKgxrfwoBt\">talk</a> in our group about his thoughts on <a href=\"https://mlg.eng.cam.ac.uk/carl/words/mechanisms.pdf\">mechanisms against climate change</a>. He persuasively argued that the <a href=\"https://unfccc.int/process-and-meetings/the-paris-agreement\">Paris Agreement</a> was doing more harm than good by giving the <em>illusion</em> of being a concrete agreement, but is in reality a huge distraction. Our actual <a href=\"https://ourworldindata.org/co2-emissions\">emissions</a> have increased since the Paris agreement was signed!</p>\n<p>Carl <a href=\"https://www.youtube.com/watch?v=naFaQsFxs1g\">argues</a> that a climate system ultimately only responds to collective actions, and without a global cooperative incentive each nation will spring back to their own isolated short-term incentives that lead to an increase in fossil fuel burning.  He has just published the &quot;<a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\">Themis Mechanism</a>&quot; as a simple alternative for equitable global emission reduction (<a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis.pdf\">long form</a>). <em>(6th May 2025: See a new <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">article</a> on Themis as well)</em></p>\n<p>This got me brainstorming with Carl about how to test his theories out and we came up with an idea that is either terrible or awesome; please read on and judge appropriately. I think we should take advantage of Cambridge's unique structure to trial the Themis mechanism via a new <strong>competitive decarbonisation sporting league among Colleges that I dub the &quot;Cambridge Green Blue&quot;</strong>. 
Given the Chancellor's recent unveiling of an <a href=\"https://www.theguardian.com/business/2025/jan/28/reeves-plans-to-create-silicon-valley-between-oxford-and-cambridge\">innovation corridor</a> between Oxford and Cambridge, the timing could not be better for an initiative like this. <em>(TL;DR sign up at the bottom of this post if you'd like to participate)</em></p>\n<h2 id=\"the-basics-of-the-themis-mechanism\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-basics-of-the-themis-mechanism\"></a>The basics of the Themis mechanism</h2>\n<p>First, let's understand what Carl is <a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis.pdf\">proposing</a>, which is built on three foundations:</p>\n<blockquote>\n<ul>\n<li>Our atmosphere is a shared resource, a commons. Fossil fuel users benefit fully from fuel\nconsumption, while the CO2 cost is spread globally. This dilution effect makes continued\nuse rational for individuals but collectively disastrous. [...] To prevent this,\nwe must cooperate to guarantee positive climate results.</li>\n<li>The root cause of climate change is the failure to account for the true cost of emissions.\nBy treating the atmosphere as a free resource, we encourage overexploitation. Themis\ncorrects this unpriced externality by pricing greenhouse gas emissions.</li>\n<li>Effective cooperation requires a fair guiding principle. Themis upholds equity: that our\natmospheric resources should be shared equally between all humans.\n<cite> -- <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a>, <a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\">The Themis Mechanism</a> </cite></li>\n</ul>\n</blockquote>\n<p>As I <a href=\"/notes/carbon-credits-vs-offsets\">noted last week</a>, most tech companies regularly <a href=\"https://www.theverge.com/2022/8/1/23287351/amazon-climate-change-carbon-emissions-worse-2021\">break</a> future carbon pledges due to competitive pressure. 
So it's good to see that Themis requires only immediate commitments rather than <a href=\"https://climate.ec.europa.eu/eu-action/climate-strategies-targets/2050-long-term-strategy_en\">long-term pledges</a>, which are impossible to police.  Instead of forcing <a href=\"https://climateactiontracker.org/publications/the-climate-crisis-worsens-the-warming-outlook-stagnates/\">unwilling</a> participants to join, Themis is a coalition in which partners check on each other, learn by doing, and build up mutual trust.</p>\n<p><a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\"> <img src=\"/images/themis-ss-1.webp\" alt=\"%c\" > </a></p>\n<p>The core scheme itself is based on a value <em>P<sub>y</sub></em> which is the price of emitting a single ton of CO2 into the atmosphere in year <em>y</em>. Here's how it works:</p>\n<ol>\n<li>Every year <em>y</em> there is a price <em>P<sub>y</sub></em> that all nations agree to.</li>\n<li>At year end, each member pays <em>P<sub>y</sub></em> times their emissions into a common pool.</li>\n<li>The pool is immediately redistributed to members in proportion to their population.</li>\n<li>Each member votes on <em>P<sub>y+1</sub></em> and the median result decides next year's price.</li>\n</ol>\n<p>This mechanism only depends on per-capita emissions for one year, and not on\nany <a href=\"https://www.carbonbrief.org/analysis-95-of-countries-miss-un-deadline-to-submit-2035-climate-pledges/\">future pledges</a> or <a href=\"http://pdf.wri.org/navigating_numbers_chapter6.pdf\">historic emissions</a>.  If a country has above average per capita emissions, then\nthey pay into the common pool. If they are below average per capita, then the country\nbenefits from payments from the pool.  
The system permits co-existence with any other\ncarbon reduction efforts, and works with a non-exhaustive pool of nations participating.</p>\n<h2 id=\"will-themis-be-more-effective-than-paris\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#will-themis-be-more-effective-than-paris\"></a>Will Themis be more effective than Paris?</h2>\n<p>The main reason Themis might fail is that participating in the league <a href=\"https://www.ft.com/content/921381a8-48a4-4bb9-9196-b1d49f871bb7\">disadvantages</a> the participants vs those just continuing with business-as-usual. The economic theory behind Themis is similar to a <a href=\"https://en.wikipedia.org/wiki/Pigouvian_tax\">Pigouvian tax</a>,\nwhich dates back a century to when the Cambridge economist <a href=\"https://en.wikipedia.org/wiki/Arthur_Cecil_Pigou\">Arthur Pigou</a> suggested in 1920 that a tax equal to the external cost of pollution could align private costs with social costs. This idea also works for <a href=\"https://www.ecb.europa.eu/pub/pdf/scpwps/ecb.wp2812~81379c0224.en.pdf\">discounting</a> <a href=\"https://www.nature.com/articles/s41558-023-01680-x\">future</a> actions, and is the basis for some of our own work on <a href=\"/papers/2023-ncc-permanence\">pricing impermanent but delayed emissions</a>.</p>\n<p>From an economic theory perspective, Pigou and the other prominent Cambridge economist at the time <a href=\"https://en.wikipedia.org/wiki/John_Maynard_Keynes\">JM Keynes</a><sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> had deep <a href=\"https://www.tandfonline.com/doi/pdf/10.1080/10370196.1994.11733148\">disagreements</a>. Keynes argued for lower interest rates to boost aggregate growth, while Pigou wanted to give people an increase in real wealth relative to prices. 
Both of their approaches ultimately <a href=\"https://en.wikipedia.org/wiki/Post-war_displacement_of_Keynesianism\">lost out</a> by the 1980s as free market economics reigned supreme instead, leading to the current <em>&quot;grow, emit and die&quot;</em> competitive spiral of doom we find ourselves in. However, Pigou's theories are clearly ones we should <a href=\"https://link.springer.com/article/10.1007/s10797-020-09653-y\">revisit today</a> in light of Themis: by raising the cost of emitting via taxes (or Themis contributions) we can incentivise countries to reduce pollution or decarbonise instead of treating the atmosphere as a free sink to dump into.</p>\n<p>A modern counterpoint to the &quot;lack of competitiveness&quot; argument against participating in an emissions reduction competition is the increasing evidence of <a href=\"https://www.nhm.ac.uk/discover/news/2025/january/ocean-temperature-rise-accelerating-greenhouse-gas-levels-rising.html\">runaway</a> <a href=\"/notes/rs-ecorisk-day1\">tipping points</a> that might suddenly need everyone to decarbonise really quickly. <a href=\"https://www.katharinehayhoe.com/\">Katharine Hayhoe</a>, the chief scientist at TNC, observes that we <a href=\"https://www.motherjones.com/environment/2022/06/climate-scientist-katharine-hayhoe-crisis-adaptation-global-warming-impact/\">can't adapt our way out of this climate crisis</a> due to the sheer magnitude of change that will occur if we continue to emit.</p>\n<blockquote>\n<p>Our infrastructure, worth trillions of dollars, built over decades, was built for a planet that no longer exists [...]\n<cite> - Katharine Hayhoe, <a href=\"https://www.theguardian.com/environment/2022/jun/01/we-cannot-adapt-our-way-out-of-climate-crisis-warns-leading-scientist\">The Guardian</a> 2022</cite></p>\n</blockquote>\n<p>This is a pragmatic point in favour of countries joining Themis, since participation strengthens their economic infrastructure towards decarbonisation. 
By joining, countries can trade off some short term losses in their economy against being well hedged for a &quot;sudden&quot; black swan <a href=\"https://en.wikipedia.org/wiki/Tipping_points_in_the_climate_system\">climate tipping point</a> that requires rapid change in their societal infrastructure, while also gaining a long-term advantage heading into the inevitable <a href=\"https://cleantechnica.com/2024/09/12/virtual-power-plants-may-hold-the-key-to-an-all-electric-future/\">electric future</a>. So perhaps the fact that things are now much worse since Paris could force the emergence of cooperative groups who wish to <a href=\"https://www.e3g.org/wp-content/uploads/E3G-Report-Living-on-the-Edge-How-Climate-Tipping-Points-will-Reshape-Geopolitics.pdf\">prepare</a> for <a href=\"https://www.aria.org.uk/media/wxrnowvq/aria-forecasting-climate-tipping-points-programme-thesis.pdf\">sudden</a> change.</p>\n<p>As <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> also notes in his Themis proposal, there is consensus among climate scientists that we must <a href=\"https://www.pnas.org/doi/10.1073/pnas.2301531121\">cooperate in the planetary commons</a> if we are to succeed.\nBut his proposal seems overwhelmingly difficult to evaluate in a <a href=\"https://www.theguardian.com/us-news/2024/oct/01/trump-visits-georgia-denies-climate-crisis-after-hurricane-helene\">political climate</a> that is moving <a href=\"https://www.bbc.co.uk/news/articles/cx253xjnxrmo\">away</a> from global cooperation. 
There must be a way to try some of these ideas out at a smaller scale, and especially locally in our sleepy University town!</p>\n<h2 id=\"cooperation-through-sport-and-games\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#cooperation-through-sport-and-games\"></a>Cooperation through sport and games</h2>\n<p>One area where nations have remained cooperative through no clear immediate financial gain is that of <a href=\"https://www.bloomsbury.com/uk/sport-in-ancient-times-9780275987398/\">competitive sport</a>. We just had the <a href=\"https://www.olympics.com/en/olympic-games/paris-2024\">Paris Olympics</a> with almost every nation in the world competing for no good reason other than a desire to win. And they're not seeking to win money as in most other areas of competition; instead it's just virtual credit in the form of <a href=\"https://www.eurosport.com/olympics/olympic-games-paris-2024/2024/gold-medal-table-per-capita-population_sto20028430/story.shtml\">medal tables</a> that are celebrated from the largest to the <a href=\"https://www.olympics.com/en/news/paris-2024-olympics-nations-won-first-ever-medal-at-the-games\">smallest</a> countries!</p>\n<p>Sporting events such as the Olympics are highly structured events with clear rules dictating almost every aspect. An interesting consequence of decoupling the rules of the games from direct financial incentives is that many sports are not <a href=\"https://en.wikipedia.org/wiki/Zero-sum_game\">zero-sum games</a>. In <a href=\"https://en.wikipedia.org/wiki/Laws_of_rugby_union\">rugby union</a> or <a href=\"https://www.thefa.com/football-rules-governance/lawsandrules\">football</a> for example, the <a href=\"https://pmc.ncbi.nlm.nih.gov/articles/PMC6315358\">winner gains more than the loser loses</a>. 
While this structure can encourage <a href=\"https://www.responsiblegambling.eu/wp-content/uploads/2016/06/Match-Fixing%E2%80%94The-Biggest-Threat-to-Sport-in-the-21st-Century.pdf\">match-fixing</a> due to the asymmetry, participants also build trust amongst themselves over the years, for example via <a href=\"https://link.springer.com/article/10.1007/s12197-009-9120-4\">promotion through divisions</a>.\n<a href=\"https://en.wikipedia.org/wiki/Game_theory\">Game theorists</a> often note how stable cooperation emerges in <a href=\"https://academics.hamilton.edu/economics/cgeorges/game-theory-files/repeated.pdf\">infinitely repeated</a> games. Sports seasons are simply repeated competitions; over time, codes of conduct evolve and become self-policing agreements for mutual benefit (avoiding injuries, preserving dignity in loss, etc).  There are clear lessons for the Themis mechanism here, as it also needs to establish long-term cooperation deep into the next century until <a href=\"https://www.nature.com/articles/s41558-018-0091-3\">total CO2 levels decline</a>.</p>\n<p><img src=\"/images/board-game-pd-1.webp\" alt=\"%c\" title=\"If the Olympics aren't for you, perhaps boardgames are\" ></p>\n<p>Away from physical sports, we also see similar scoring dynamics in <a href=\"https://boardgamegeek.com/\">boardgames</a>! There is a whole genre of semi-competitive boardgames such as <a href=\"https://drakesflames.blogspot.com/2012/11/board-game-review-archipelago.html\">Archipelago</a> which are <em>&quot;competitive games that everyone can lose&quot;</em>. This sounds a lot like Themis; we want to be able to stave off emissions disaster, but otherwise be the top dog in our league for every other aspect of our societies!  The game rules must be structured so that even selfish players find it in their interest to cooperate to <a href=\"https://boardgamegeek.com/geeklist/71983/competitive-games-where-everyone-can-lose\">avoid losing</a>. 
In Archipelago, the rule is simple: if instability within the game hits a certain point, all players lose, which forces even the leader to sometimes help the laggard to save themselves.<sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup></p>\n<h2 id=\"enter-the-cambridge-green-blue\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#enter-the-cambridge-green-blue\"></a>Enter the Cambridge Green Blue</h2>\n<p>So how is this relevant to evaluating the global Themis mechanism from earlier? Everything global must start locally, so I propose a new semi-competitive league here in Cambridge, with willing Colleges as participants, and with virtual points instead of using real currency.  And just like the <a href=\"https://en.wikipedia.org/wiki/Blue_(university_sport)\">two-century-old</a> tradition, we should make this sufficiently competitive to gain a coveted <a href=\"https://www.hawksclub.co.uk/about/history/the-cambridge-blue/\">sporting blue</a>! To give you some context, being really good at <a href=\"https://www.christs.cam.ac.uk/news/70-years-tiddlywinks\">tiddlywinks</a> can gain you a <a href=\"https://www.varsity.co.uk/sport/9629\">quarter blue</a>.</p>\n<p>In the rest of this post, I've written up the structure of this league with <a href=\"https://en.wikipedia.org/wiki/Elinor_Ostrom%23%2522Design_principles_illustrated_by_long-enduring_CPR_%28Common_Pool_Resource%29_institutions%2522\">Ostrom's principles</a> in mind, by treating the CO2 management problem as a <a href=\"https://en.wikipedia.org/wiki/Common-pool_resource\">common pool resource</a>.\nCambridge Colleges have been around for centuries and so naturally appreciate the long perspective required; Pembroke was <a href=\"https://www.pem.cam.ac.uk/college\">founded</a> in 1347. Our collective collegiate goal is to urgently reduce the CO2e emissions that accumulate in the atmosphere and contribute to climate change for hundreds of years. 
This requires cooperation and learning from each other, but also a certain drive to do better than each other to get to the goal as quickly as we can.</p>\n<h3 id=\"what-do-we-measure-in-this-league\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-do-we-measure-in-this-league\"></a>What do we measure in this league?</h3>\n<p>The three key sources of carbon emissions this league would track would initially come from food, heating and travel, noting again that we are only measuring <em>this year's</em> reductions and emissions, not historic or future pledges. We need to design specific mechanisms for each of these, but I'll just sketch out what makes measuring each of these &quot;interesting&quot;.</p>\n<h4 id=\"food-consumption-and-waste\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#food-consumption-and-waste\"></a>Food consumption and waste</h4>\n<p>Students, Fellows, visitors and staff all eat a <em>lot</em> of food in the Colleges from <a href=\"https://ourworldindata.org/food-choice-vs-eating-local\">all over</a> the world.  Communal dining is so central to the Cambridge College experience that it is mentioned in many College statutes as part of our charitable purpose.</p>\n<blockquote>\n<p>In furtherance of the College’s purposes, Fellows shall be entitled to dine daily free of charge at common table.\n<cite> -- <a href=\"https://www.pem.cam.ac.uk/sites/default/files/downloads/inlinearstatutesordsregs12july2022.pdf\">Pembroke College Statutes</a> presented to Her Majesty in 2009</cite></p>\n</blockquote>\n<p>Since thousands of meals go through a typical College every day, identifying pragmatic sources of emissions reductions is very important.  
In a recent committee meeting at Pembroke College, I was incredibly pleased to hear that we've reduced <a href=\"https://lordslibrary.parliament.uk/food-waste-in-the-uk/\">food waste</a> from the kitchens down to just one or two meals a day (which, considering the vast number of meals served is hugely impressive).\nAnd similarly, Darwin College reported on the recent <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">plant based May Ball</a> which was a rather fine party, and the world did not end due to black tie attendees being unable to find a sausage roll.\nHow can we communicate the lessons learnt from the catering teams here to other Colleges? The CGB allows us to rank and categorise these initiatives!</p>\n<p>Research, with much of it conducted here in Cambridge, shows us that key gains in food impacts come from reducing <a href=\"https://www.britishecologicalsociety.org/wp-content/uploads/Ripple-et-al-2014-ruminants.pdf\">ruminant meat consumption</a> and the corresponding damage to <a href=\"https://www.worldwildlife.org/magazine/issues/summer-2018/articles/what-are-the-biggest-drivers-of-tropical-deforestation\">tropical forests</a> full of <a href=\"/papers/2024-life\">biodiversity</a>.\nImportantly, we're not trying to force every College member to suddenly become vegan, but instead provide sustainable and <a href=\"https://www.bbc.com/future/article/20241011-what-explains-increasing-anxiety-about-ultra-processed-plant-based-foods\">healthy</a> options.\n<a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> and <a href=\"https://en.wikipedia.org/wiki/Theresa_Marteau\">Theresa Marteau</a> have both shown that <a href=\"https://doi.org/10.1038/d41586-019-01662-0\">nudging consumers</a> such as Cambridge students and staff towards less damaging choices by default is entirely practical, without alienating those that insist on their 
meat'n'twoveg:</p>\n<blockquote>\n<p>A study of over 94000 cafeteria meal choices has found that doubling the vegetarian options from 1-in-4 to 2-in-4 increased the proportion of plant-based purchases by between 40-80% without affecting overall food sales.\n<cite>-- <a href=\"https://www.cam.ac.uk/stories/veg-nudge\">Veg nudge</a>. Impact of increasing vegetarian availability on meals (<a href=\"https://doi.org/10.1073/pnas.1907207116\">paper</a> / <a href=\"https://www.nature.com/articles/s43016-020-0132-8\">followup</a>)</cite></p>\n</blockquote>\n<p>The league does need some way to turn these initiatives into a points based system. This is where my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>'s recent <a href=\"/papers/2024-food-life\">research</a> is instructive. He's been working on quantifying the <a href=\"/papers/2024-life\">biodiversity cost</a> of <a href=\"/papers/2024-food-life\">food imports</a>, broken up by the food type. The CGB food points game could correlate consumption choices with where the food comes from and how much it is wasted, and so we could steadily work across Colleges on reducing our impact year-on-year.</p>\n<p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\"> <img src=\"/images/tball-food-paper-ss-1.webp\" alt=\"%c\" title=\"An excerpt from the paper 'Quantifying the impact of the food we eat on species extinctions' (Tom Ball et al, under review)\" > </a></p>\n<h4 id=\"heating-without-fossil-fuels\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#heating-without-fossil-fuels\"></a>Heating without fossil fuels</h4>\n<p>Turning off the natural gas flows in Colleges is a major challenge. We have some of\nthe oldest buildings in the world around here, and much of the infrastructure is\ncorrespondingly aged.  
Pembroke has just spent a ton of cash on a <a href=\"https://www.cibsejournal.com/uncategorized/fuel-for-thought-cambridge-college-plans-for-heat-pump-transition/\">communal heat pump</a> for our new development in Mill Lane, which got me thinking about how this aspect of the CGB league could be based around this. The rules and regulations for heat pump installation in the UK are incredibly baroque, as <a href=\"https://ramcq.net/\">Robert McQueen</a> pointed out recently:</p>\n<blockquote>\n<p>I have a neighbour who embarked on a planning application for a heat pump for his terraced house. There is a difference in ridiculous paperwork necessary simply to install &lt;1m from the boundary compared to the presumed consent in permitted development. Of course now they are waiving that requirement but he's stuck half way through the process. I can't even imagine adding listed requirements into that</p>\n<p>[...] <a href=\"https://mhclgmedia.blog.gov.uk/2024/11/21/warm-homes-plan-and-heat-pumps/\">due to be waived</a> for permitted development - whether that tracks through to the full regulations is anyone's guess. They are already bafflingly inconsistent.\n<cite>-- <a href=\"https://bsky.app/profile/ramcq.net/post/3lhcdlycth22n\">Robert McQueen, Bluesky</a>, Feb 2025</cite></p>\n</blockquote>\n<p>However, the Cambridge City Council isn't sitting still and has been working with the University on this. 
Ian Leslie pointed me to city-wide explorations into <a href=\"https://www.cambridge.gov.uk/city-centre-heat-network\">district heating</a> networks for Cambridge, including a <a href=\"https://www.cambridge.gov.uk/media/pkjcwy1m/city-centre-heat-network-connection-guidance.pdf\">phase 1 report</a>\nthat plots out what it might look like by using different Colleges as sinks and sources!</p>\n<p><a href=\"https://www.cambridge.gov.uk/media/pkjcwy1m/city-centre-heat-network-connection-guidance.pdf\"> <img src=\"/images/cambridge-district-heat-ss-1.webp\" alt=\"%c\" > </a></p>\n<p>Darwin College also reports in their <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">2023 sustainability report</a> the progress they've made on establishing heat pumps in the River Cam.</p>\n<blockquote>\n<p>In 2022, in a collaboration with six other riverside Colleges, Mott MacDonald were commissioned to monitor\nwater flow, depth and temperature at four locations on the river and to produce a detailed hydrology study.\nThe report, delivered in 2023, confirms the considerable potential of the river to supply heat for space\nand hot water heating for the adjacent Colleges.\n<cite> -- <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">Darwin sustainability report</a>, 2023</cite></p>\n</blockquote>\n<p>And most recently and famously, <a href=\"https://www.kings.cam.ac.uk\">King's College</a> installed <a href=\"https://www.kings.cam.ac.uk/news/2023/kings-unveils-new-solar-panels-restored-chapel-roof\">400 solar panels</a> on their world-famous chapel, despite <a href=\"https://www.kings.cam.ac.uk/news/2023/kings-unveils-new-solar-panels-restored-chapel-roof\">opposition</a> from Historic England. 
This sets a huge precedent for the rest of Cambridge to take similar action, and they deserve recognition for this from the CGB!</p>\n<p><img src=\"/images/kings-solar-panels.webp\" alt=\"%c\" title=\"The roof of King's College chapel. Source: BBC News.\" ></p>\n<p>So this aspect of the CGB league could focus on building spatial connections across Colleges. Perhaps the College that brings the most benefit to its neighbours by contributing the most towards a district heating mechanism could win this round.</p>\n<h4 id=\"reducing-impact-of-international-travel\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#reducing-impact-of-international-travel\"></a>Reducing impact of international travel</h4>\n<p>Finally, lots of the Colleges do facilitate international travel, for a variety of reasons ranging from <a href=\"https://www.pem.cam.ac.uk/international-programmes\">pedagogical</a> to <a href=\"https://www.pem.cam.ac.uk/alumni-development/connect-pembroke/alumni-events\">developmental</a>. The most obvious one is when conducting in-person interviews, when candidates fly in from all over the world. Since the pandemic, there has been <a href=\"https://oxbridgeapplications.com/blog/cambridge-interviews-online-or-in-person/\">split opinion</a> among Colleges about returning to in-person interviews or not, with Pembroke opting for in-person this year. While there are lots of good reasons to encourage in-person interactions, the carbon cost has been so low down in the discussion points in the meetings I've attended that it might as well not even be a factor. A CGB league might encourage us to tally up the scores across Colleges more systematically to factor these costs into the overall decision-making.</p>\n<p>At the opposite end of the spectrum is international air travel for conferences, which is thankfully quite rare as most of our business is conducted locally. 
We do host events here such as the <a href=\"https://www.sccs-cam.org/\">SCCS</a> student conservation conference that flies in young scholars from all over the world, but this is quite rightly justified as being essential as it brings together underrepresented young students from all over the world who find tremendous value from meeting each other. I've made more extensive notes on the topic of travel mitigation elsewhere in my note on <a href=\"/notes/carbon-credits-vs-offsets\">carbon contributions</a>.</p>\n<h3 id=\"implementing-the-cambridge-green-blue\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#implementing-the-cambridge-green-blue\"></a>Implementing the Cambridge Green Blue</h3>\n<p>I've hopefully convinced you that there are quite a few interesting dimensions around which we could design our semi-competitive Cambridge Green Blue (CGB) league. I've avoided over-specifying the rules at this early juncture, since I want to bring in more people's thoughts and ideas first.  However, here's a strawman attempt.</p>\n<blockquote>\n<p>We treat the emission of CO2 into the atmosphere as a shared common pool resource (CPR); i.e. we can collectively only emit a limited amount if we are to avoid the worst effects of climate change. Cooperation on a global CPR should ideally happen on a global basis; however, the current approach is inadequate. Therefore, we must locally initiate mechanisms which will build up into a global framework from the ground up. Cambridge Colleges are institutions for young people who will be greatly affected by climate change, and Colleges make decisions with long time horizons, and a body of scholars should represent intellectual leadership in a time of crisis. 
Therefore Cambridge Colleges should be an ideal proving ground for exploring cooperative frameworks in practice!</p>\n</blockquote>\n<p>The CGB would select its initial College membership and define baseline rules about how to measure emissions collectively, based around the first interest areas of travel, food and heating described above. Members will then write a rule book that follows the Themis mechanism to establish a virtual price for each tonne of emissions, and we will self-report progress monthly with points assigned to those who are beating their baselines of emissions reduction interventions. The league is used to collectively learn from those who are winning, and equalise the playing field in future seasons for the others to catch up.</p>\n<p>Following Ostrom's principles, the league looks like this:</p>\n<ol>\n<li><em>Define group boundaries and the contents of the CPR.</em> The common pool resource we measure is CO2 emissions from the Cambridge Colleges. The goal is to reduce emissions year-on-year, and so &quot;0&quot; is defined as the previous year’s emissions, with any additional emissions reductions resulting in points awarded. The league therefore measures the CPR as &quot;CO2e tonnes avoided&quot; without getting into any historic or future plans, only what is happening this year.</li>\n<li><em>Appropriation and provision of common resources.</em> The Colleges all have initiatives to reduce their CO2e, and have agreed to cooperate towards this common goal. Membership of the league is voluntary, and we make the membership public. We reserve the right to laugh derisively at those Colleges who elect not to participate.</li>\n<li><em>Collective-choice arrangements for resource appropriators to make decisions.</em> The league will maintain a points database tracking emissions across heating, travel and food-related emissions reduction activities. 
The league will not be directly involved in individual College decision making, but we hope to recruit persons from the Colleges who may be involved in those activities in addition to their participation in the league.</li>\n<li><em>Effective monitoring by monitors who are accountable to the appropriators.</em> League members will self-report their emissions reductions monthly, and there will be a collective consensus formed on the CO2e measurements across the emissions reductions. The reporters are all part of the Cambridge Colleges, and so have access to internal channels to verify their own claims.</li>\n<li><em>A scale of graduated sanctions for resource appropriators who violate community rules.</em> As a voluntary league, we do not anticipate any incentive to cheat. Sanctions will first be redaction of those points from the table, followed by ejection from the league.</li>\n<li><em>Mechanisms of conflict resolution that are cheap and of easy access.</em> The league has monthly checkpoints where participants collectively score their emissions reductions. Disagreements about methodologies will be resolved at these meetings, which also aim to collectively educate each other about the diverse emissions reduction methods available.</li>\n<li><em>Self-determination of the community recognised by higher-level authorities.</em> Cambridge Colleges have committed to various net-zero targets. Therefore, the emissions reductions tracked by this league will eventually be incorporated into some broader net-zero reporting that applies at a national and international level. But for now, we just want to reduce the real amount year-on-year.</li>\n<li><em>Organisation in the form of multiple layers of nested enterprises, with small local CPRs at the base level.</em> Our hope is that the Cambridge Green Blue is the first league of many, with other organisations also following our template. 
To that end, we will make our rules templates available freely as an open-source rulesheet after the first round concludes successfully. When there are multiple organisations running their own leagues (come on Oxford!), we will build up a bigger collective framework for Themis participants, akin to a sporting governing body.</li>\n</ol>\n<p>One very important aspect of this is to adopt a respectful &quot;<a href=\"https://en.wikipedia.org/wiki/Sportsmanship\">sportsmanship</a>&quot; rule to the relative ranking of Colleges, and not engage in <a href=\"https://www.varsity.co.uk/news/28426\">shaming</a> wars. There is a wide wealth <a href=\"https://www.varsity.co.uk/news/14626\">disparity</a> among the Cambridge Colleges, and we could adjust for this using the per-capita rules from the Themis mechanism. Ultimately, it's also about celebrating and learning from every participant and using the competition to spur us on, build each other up, and have fun doing so. We're all in this together.</p>\n<h2 id=\"err-are-you-serious-about-this-anil\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#err-are-you-serious-about-this-anil\"></a>Err, are you serious about this Anil?</h2>\n<p>Yeah, I think this is worth a try! I have recently joined the University's <a href=\"https://www.governance.cam.ac.uk/committees/essc/Pages/default.aspx\">Environmental Sustainability Strategy</a> committee, and I've found it extremely difficult to educate myself about the local initiatives going on (not because of any individual's fault, but because there are 31 separate constituent Colleges and University and townspeople sharing a fairly small area). If nothing else, this initiative will let us collectively bring together a wiki of all the positive actions happening across Cambridge. 
If it succeeds though, I'd like to spread the next iteration of the league to other Universities to run their own (I'm looking at you, Oxford), and see if we can turn this into a distributed game.</p>\n<p>I was reading <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>'s book <a href=\"https://press.uchicago.edu/ucp/books/book/chicago/W/bo13823467.html\">Wild Hope</a> over the weekend, and his conclusion at the end was that we must not lose hope in our quest for a biodiverse, equitable world.  And given the chaotic start to 2025, I can't think of a better place to start something new than within Cambridge, with our collegiate structure already providing a ready-made framework.</p>\n<p>So what next? If you're interested in helping <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> and me organise this, get in touch with either of us! I'm on <a href=\"https://www.hr.admin.cam.ac.uk/policies-procedures/flexible-working-policy/supporting-guidance/sabbatical-leave\">academic sabbatical</a> for a year from the summer, so I'll have loads of time. I'll edit this post with a list of first Colleges that have been in touch. 
We'll likely organise a pub get-together in early March (exact date to follow) to brainstorm about this with anyone interested.</p>\n<p><small class=\"credits\"> <em>This post is the result of many conversations around Cambridgeshire over the past year, ranging from a balmy summer dinner in Ely with <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> and <a href=\"https://en.wikipedia.org/wiki/Theresa_Marteau\">Theresa Marteau</a>, chilly autumn cups of tea in my Pembroke office with <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> and <a href=\"http://carlhenrik.com/\">Carl Henrik Ek</a>, to misty morning coffees at <a href=\"https://www.visitcambridge.org/place/pages-cambridge/\">Pages</a> with <a href=\"https://en.wikipedia.org/wiki/Melissa_Leach\">Melissa Leach</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a> or at <a href=\"https://www.espressolane.co.uk/\">Espresso Lane</a> with <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, to cosy pubs with Ian Leslie, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, to College dinners with <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, and <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a>/<a href=\"https://www.zoo.cam.ac.uk/research/groups/conservation-science\">CSG</a> discussions with <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a>, <a href=\"https://biomin.esc.cam.ac.uk/people/2023-Orlando-Timmerman/\">Orlando Timmerman</a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>, <a 
href=\"https://www.cl.cam.ac.uk/~arb33/\">Alastair Beresford</a>, <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> and <a href=\"https://github.com/mor1\">Richard Mortier</a>. Many thanks to them for corrections and feedback, and any remaining errors are my own. Changelog: 12th Feb added note on sportsmanship and Carl's NeurIPS@Cam talk. 6th May 2025: added <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a>'s published <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">article</a> on Themis.</em> </small></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>I promise I'm not a JMK shill, despite being a <a href=\"https://www.cshss.cam.ac.uk/research-info/j-m-keynes-fellowship-fund/j-m-keynes-fellows\">JMK Fellow</a>.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>The keen boardgame player will probably observe that there's always one player who decides to cause trouble just for fun, making everyone lose. This can be dealt with by social means.</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Madhavapeddy (2024). Royal Society meeting on ecological/commercial risks. <a href=\"https://doi.org/10.59350/0qmn2-rwh65\" target=\"_blank\"><i>10.59350/0qmn2-rwh65</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. 
<a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li>\n<li>Madhavapeddy (2025). Disentangling carbon credits and offsets with contributions. <a href=\"https://doi.org/10.59350/g4ch1-64343\" target=\"_blank\"><i>10.59350/g4ch1-64343</i></a></li>\n<li>Fisher et al (2019). Use nudges to change behaviour towards conservation. Nature. <a href=\"https://doi.org/10.1038/d41586-019-01662-0\" target=\"_blank\"><i>10.1038/d41586-019-01662-0</i></a></li>\n<li>Garnett et al (2019). Impact of increasing vegetarian availability on meal selection and sales in cafeterias. Proceedings of the National Academy of Sciences. <a href=\"https://doi.org/10.1073/pnas.1907207116\" target=\"_blank\"><i>10.1073/pnas.1907207116</i></a></li>\n<li>Tol (2023). Social cost of carbon estimates have increased over time. Nature Climate Change. <a href=\"https://doi.org/10.1038/s41558-023-01680-x\" target=\"_blank\"><i>10.1038/s41558-023-01680-x</i></a></li>\n<li>Brady (1994). Keynes, Pigou and the Supply Side of the General Theory. History of Economics Review. <a href=\"https://doi.org/10.1080/10370196.1994.11733148\" target=\"_blank\"><i>10.1080/10370196.1994.11733148</i></a></li>\n<li>Edenhofer et al (2021). Pigou in the 21st Century: a tribute on the occasion of the 100th anniversary of the publication of The Economics of Welfare. International Tax and Public Finance. <a href=\"https://doi.org/10.1007/s10797-020-09653-y\" target=\"_blank\"><i>10.1007/s10797-020-09653-y</i></a></li>\n<li>Jasina et al (2012). A model of promotion and relegation in league sports. Journal of Economics and Finance. <a href=\"https://doi.org/10.1007/s12197-009-9120-4\" target=\"_blank\"><i>10.1007/s12197-009-9120-4</i></a></li>\n<li>Rogelj et al (2018). Scenarios towards limiting global mean temperature increase below 1.5 °C. Nature Climate Change. 
<a href=\"https://doi.org/10.1038/s41558-018-0091-3\" target=\"_blank\"><i>10.1038/s41558-018-0091-3</i></a></li>\n<li>Garnett et al (2020). Order of meals at the counter and distance between options affect student cafeteria vegetarian sales. Nature Food. <a href=\"https://doi.org/10.1038/s43016-020-0132-8\" target=\"_blank\"><i>10.1038/s43016-020-0132-8</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/cambridge-green-blue",
      "title": "The Cambridge \"Green Blue\" competition to reduce emissions",
      "summary": "Thinking about a Cambridge \"Green Blue\" competition to reduce emissions among Colleges, promoting cooperation through a semi-competitive league",
      "date_published": "2025-02-10T00:00:00.000000Z",
      "date_modified": "2025-05-06T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carbon",
        "climate",
        "economics"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0qmn2-rwh65",
          "doi": "10.59350/0qmn2-rwh65",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/g4ch1-64343",
          "doi": "10.59350/g4ch1-64343",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-019-01662-0",
          "doi": "10.1038/d41586-019-01662-0",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1073/pnas.1907207116",
          "doi": "10.1073/pnas.1907207116",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-023-01680-x",
          "doi": "10.1038/s41558-023-01680-x",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1080/10370196.1994.11733148",
          "doi": "10.1080/10370196.1994.11733148",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1007/s10797-020-09653-y",
          "doi": "10.1007/s10797-020-09653-y",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1007/s12197-009-9120-4",
          "doi": "10.1007/s12197-009-9120-4",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-018-0091-3",
          "doi": "10.1038/s41558-018-0091-3",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-020-0132-8",
          "doi": "10.1038/s43016-020-0132-8",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/4hq78-m8c88",
      "content_html": "<p>With the <a href=\"https://www.tunbury.org/equinix-moves/\">sunsetting of Equinix Metal</a>\nI've also been migrating the Recoil machines over to new hosts in <a href=\"https://www.mythic-beasts.com/\">Mythic\nBeasts</a>. This time around, rather than manually\nsetting up services, I've turned to a nice new tool called\n<a href=\"https://github.com/moghtech/komodo\">Komodo</a> which helps with deploying Docker\ncontainers across multiple servers. Unlike many <a href=\"https://kubernetes.io/\">other</a>\ncontainer management solutions, Komodo is refreshingly simple. It has a mode\nwhere it can take <em>existing</em> <a href=\"https://docs.docker.com/compose/\">Docker compose</a> files on a\ngiven host, and run them, and provide a web-based monitor to keep an eye on a\nfew machines.</p>\n<h2 id=\"the-komodo-interface\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-komodo-interface\"></a>The Komodo interface</h2>\n<p>There's an online <a href=\"https://demo.komo.do/\">demo</a> of Komodo available (user/pass\nis demo/demo). The basic idea is that you first register servers (see below for\n&quot;Periphery&quot;), and then add in &quot;Stacks&quot; which represent a service each.</p>\n<p><img src=\"/images/komodo-1.webp\" alt=\"%c\" title=\"The list of Stacks running on Recoil\" ></p>\n<p>Every stack is configured to run a <code>docker-compose.yml</code> service that is already\npresent on the host, and the web UI has a convenient way of pulling, deploying\nand polling the Docker Hub to check for updates.</p>\n<p><img src=\"/images/komodo-2.webp\" alt=\"%c\" title=\"The stack view for a Tangled.sh knot running on Recoil\" ></p>\n<p>The autoupdate functionality is quite cool (if a touch risky), as it polls for the\nimages on the Docker Hub and updates to those automagically. 
While I've activated\nthis for services I'm happy autoupdating, it's also accompanied by a healthy\ndose of <a href=\"/notes/syncoid-sanoid-zfs\">ZFS snapshotting</a> so I can roll back if anything\nuntoward happens.</p>\n<p><img src=\"/images/komodo-3.webp\" alt=\"%c\" title=\"The alert view of autoupdates from polling the Hub\" ></p>\n<p>Most important to me is that I can always switch away from Komodo at any time\nand directly interact with the services on the host using the normal <code>docker</code> CLI.\nKomodo is just coordinating the compose invocations in the lightest way possible,\nand not wrapping them in such a way that I lose access.</p>\n<h2 id=\"setting-up-periphery-with-a-wireguard-mesh-and-dsnet\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#setting-up-periphery-with-a-wireguard-mesh-and-dsnet\"></a>Setting up Periphery with a Wireguard mesh and dsnet</h2>\n<p>Komodo operates across multiple hosts by using something called a <a href=\"https://komo.do/docs/connect-servers\">periphery agent</a>\nwhich the main host issues RPCs to in order to do something. This is obviously quite a privileged operation, and so rather than\nexpose it to the Internet I set up a Wireguard tunnel mesh across the Recoil hosts for these operations to go over.</p>\n<p>The easiest way to do this was via <a href=\"https://github.com/naggie/dsnet\">dsnet</a>, which generates the configurations and keys\nsuitable for a <a href=\"https://www.man7.org/linux/man-pages/man8/wg-quick.8.html\">wg-quick</a> service to run on each host and connect\nto their peers. Following the instructions let me set up this mesh in minutes; this is a much simpler solution than\n<a href=\"https://tailscale.com\">Tailscale</a> due to the lack of flexibility, but all I want here is a few hosts connected by static interfaces\nand with no need for <a href=\"https://tailscale.com/blog/how-nat-traversal-works\">complex NAT punching</a>.  
Once the dsnet configuration is\nset up, all that's needed is to activate the <code>wg-quick</code> service on each of the hosts, and they spin up a virtual interface.</p>\n<p>After this, the Periphery setup was straightforward but with one twist.  I configured the agent to bind to the wireguard IP, e.g.:</p>\n<pre><code>/etc/komodo/periphery.config.toml\n################################\n# 🦎 KOMODO PERIPHERY CONFIG 🦎 #\n################################\n\nport = 8120\nbind_ip = &quot;10.100.0.2&quot;\n</code></pre>\n<p>But then on reboot the periphery agent would fail to start up due to the wireguard service being too low a priority in the boot order. This was fixed by a systemd tweak (which took me longer to figure out than the rest of the entire setup altogether, since I find systemd utterly inscrutable).</p>\n<pre><code>/etc/systemd/system/periphery.service\n[Unit]\nDescription=Agent to connect with Komodo Core\nAfter=wg-quick@wg0.service\n</code></pre>\n<p>This little tweak to the script, followed by umpteen <code>daemon-reload</code> prods and\nreboots to get systemd happy, did the trick.</p>\n<p>I'm pretty happy with Komodo, thank you to the devs! It's a system that's simple enough that I can try\nit out progressively, and can bypass easily if required, and provides a very\nuseful part of the <a href=\"\">##selfhosting</a> jigsaw puzzle.</p><h1>References</h1><ul><li>Madhavapeddy (2025). Semi distributed filesystems with ZFS and Sanoid. <a href=\"https://doi.org/10.59350/zy5bb-3ze20\" target=\"_blank\"><i>10.59350/zy5bb-3ze20</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/komodo-docker-compose",
      "title": "Using Komodo to manage Docker compose on a small cluster",
      "summary": "Guide to deploying Komodo container management tool with Wireguard mesh networking for coordinating Docker services across multiple hosts.",
      "date_published": "2025-05-05T00:00:00.000000Z",
      "date_modified": "2025-05-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting",
        "docker"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/zy5bb-3ze20",
          "doi": "10.59350/zy5bb-3ze20",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/vwrvd-3sg08",
      "content_html": "<p>I had a tremendous time participating in last year's <a href=\"/papers/2024-ai-conhorizon\">horizon scan of AI and Conservation</a>, which laid out the opportunities that technological progress from AI (a catchall phrase here) could bring to hard-working conservation practitioners. Since then, there's been a lot of corridor conversations about future projects (and even <a href=\"/notes/uk-national-data-lib\">dinner with the Wildlife Trusts</a>). However, there has also been discussion about the potential <em>harms</em> of our work, most notably in a <a href=\"https://www.sciencedirect.com/science/article/pii/S0169534725000588\">response letter</a> to our paper written by <a href=\"https://experts.exeter.ac.uk/42389-katie-murray/about\">Katie Murray</a> and colleagues.</p>\n<p>Murray et al make two really important points:</p>\n<blockquote>\n<ul>\n<li>[...] importance of ecological expertise must be recognised as much more than just the expert annotation of training data</li>\n<li>[...] 
effort should be made to build capacity for AI development in the Global South, so that the rewards of successful research can be shared\n<cite>-- <a href=\"https://www.sciencedirect.com/science/article/pii/S0169534725000588\">The potential for AI to divide conservation</a></cite></li>\n</ul>\n</blockquote>\n<p>My co-authors on the original horizon scan and I could not agree more with these points, and <a href=\"https://samreynolds.org\">Sam Reynolds</a> led us to publish a response-to-the-response <sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> dubbed &quot;<a href=\"/papers/2025-conservation-div\">Conservation changed but not divided</a>&quot;.</p>\n<p><a href=\"https://authors.elsevier.com/a/1k%7ESZcZ3X3uxK\"> <img src=\"/images/cam-nature-3.webp\" alt=\"%c\" > </a></p>\n<p>In our response, we note that:</p>\n<blockquote>\n<p>We agree wholeheartedly with these points and recognise that the task of equitable integration of AI into conservation is beyond the scope of any single group and requires collective action.</p>\n<p>[...] Developers of AI tools have a foundational role to play in delivering an equitable AI landscape. Technologies disconnected from pragmatic ecological, cultural, and socioeconomic factors are unlikely to advance the field [...]</p>\n<p>[...] Developers should adopt participatory design and development principles, identifying conservation actors to guide the process, designing [...] protocols that respect cultural sensitivities and Indigenous and local knowledge [...]</p>\n<p>[...] All tools should be open source and thoroughly documented, so that they can be easily adapted for local contexts.\n<cite>-- <a href=\"/papers/2025-conservation-div\">Conservation changed but not divided</a></cite></p>\n</blockquote>\n<p>Many thanks to Katie Murray and colleagues for taking the trouble to call out the issues in our original paper! 
Both of the letters will be side-by-side in the next issue of <a href=\"https://www.cell.com/trends/ecology-evolution/home\">Trends in Ecology and Evolution</a> and, of course, we welcome any more perspectives about either of these.</p>\n<h1 id=\"cambridge-goes-full-ai\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#cambridge-goes-full-ai\"></a>Cambridge goes full AI</h1>\n<p>This whole discourse also happened at the same time as Cambridge <a href=\"https://www.cam.ac.uk/topics/artificial-intelligence\">dove</a> in with a major piece about <a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\">turbocharging the race to protect nature and climate with AI</a>. The piece itself (led by the brilliant <a href=\"https://uk.linkedin.com/in/jacqueline-garget-b24804214\">Jacqueline Garget</a> and <a href=\"mailto:louise.walsh@admin.cam.ac.uk\">Louise Walsh</a>) covers a number of the projects in our <a href=\"https://ai.conservation.cam.ac.uk/\">AICN</a> project we started <a href=\"/notes/aicn-in-aicam\">last year</a>.</p>\n<p><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\"> <img src=\"/images/cam-nature-1.webp\" alt=\"%c\" > </a></p>\n<p>The online story itself is a rather gorgeous layout, with pieces on:</p>\n<ul>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Land-use-planning-T1WPpYngXA\">landuse planning</a> from me about our <a href=\"/notes/ukri-grant-terra\">UKRI-funded</a> &quot;Terra&quot; project to map global plants and the impact on <a href=\"/papers/2024-food-life\">food supply chains</a>.</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Biodiversity-conservation-50N1jQTVIa\">biodiversity conservation</a> with <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> who leads <a href=\"/projects/ce\">Conservation Evidence Copilots</a> and whose <a href=\"https://www.youtube.com/@Bill_Sutherland\">Conservation Concepts channel</a> is a 
must-watch, and <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a> who have the <a href=\"https://ai.conservation.cam.ac.uk/2024/06/05/planetary-computing-fellows-michael-dales-and-sadiq-jaffer-putting-systems-to-work-to-accelerate-ecological-interventions/\">coolest job titles</a> in Cambridge.</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Climate-modelling-NdQYHh3cRP\">climate modelling</a> with Joe and Jack from the ICCS talking about <a href=\"https://github.com/Cambridge-ICCS/FTorch\">differentiable fortran</a> (I'm coorganising the <a href=\"/notes/propl-at-splash\">next PROPL</a> with ICCS lead <a href=\"https://dorchard.github.io\">Dominic Orchard</a> as well).</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Energy-efficient-homes-0AUJzMfjnS\">energy efficient homes</a> with <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\">Ronita Bardhan</a> (who I'm having a blast working with alongside <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> on <a href=\"/ideas/urban-vegetation\">urban vegetation</a>).</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Forest-monitoring-VudaoOH7Rd\">forest monitoring</a> about Emily Lines and Harry Owens work on forest structure reconstruction and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and Frank Feng's work on <a href=\"https://github.com/MingyueX/GreenLens\">GreenLens</a>.</li>\n</ul>\n<p>While each of these is a fascinating research project, the bit that made me stop and really think was the last <a href=\"https://www.cam.ac.uk/stories/Anil-Madhavapeddy-AI-climate-nature\">interview with me</a> about how AI could help heal the planet. 
In it, I talk about conservation from a technological lens:</p>\n<blockquote>\n<p>We need to act fast to mitigate the impacts of climate change, and to protect and restore biodiversity. There’s incredible potential in using AI to augment our work. It enables us to do things much more quickly – it’s like giving humans global knowledge superpowers!</p>\n</blockquote>\n<p>But, after more corridor conversations with colleagues in the <a href=\"https://www.conservation.cam.ac.uk\">CCI</a><sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup> more important angles to this story emerged. It's really easy for us to lose sight of the fact that AI is just a piece in the puzzle; a means to an end. We must keep the focus on the giant crisis in biodiversity unfolding in front of our eyes like a slow motion steamroller. Other pieces on the Cambridge website that cover this include <a href=\"https://www.cam.ac.uk/stories/pollinatorsriskindex\">the pollinator risk index</a> (with <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a>), <a href=\"https://www.cam.ac.uk/research/news/pledge-to-phase-out-toxic-lead-ammunition-in-uk-hunting-by-2025-has-failed\">lead poisoning of grouse</a>, <a href=\"https://www.cam.ac.uk/research/news/uk-peatland-fires-are-supercharging-carbon-emissions-as-climate-change-causes-hotter-drier-summers\">carbon emissions from peatland fires</a>, or the risks of <a href=\"https://www.cam.ac.uk/research/news/restoring-wildlife-habitats-in-wealthy-nations-could-drive-extinctions-in-species-rich-regions\">biodiversity leakage</a> causing extinctions (by <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>).\nIt's not all bad news of course! 
Cambridge has also covered <a href=\"https://www.cam.ac.uk/research/news/thriving-antarctic-ecosystems-found-following-iceberg-calving\">thriving Antarctic ecosystems</a> and <a href=\"https://www.cam.ac.uk/stories/conservation-success-stories\">success stories in species restoration</a>.</p>\n<p>My overall concerns with the current central University (and the world's) focus on AI stem from:</p>\n<ul>\n<li>the &quot;distraction effect&quot; caused by AI. If every conversation begins with 'artificial intelligence', then we lose track of the goal, which is to protect what remains of the natural world while making access to it as equitable as possible to every human who lives on this planet. In the past few months, I've had four separate meetings with groups in the CCI where I've been explaining things like <a href=\"https://modelcontextprotocol.io/introduction\">MCP</a> to a bunch of conservation practitioners who should, frankly, not have to keep up with this incredibly fast moving field in order to submit funding proposals in their own areas of core expertise on the natural world.</li>\n<li>the &quot;leakage effect&quot; to funding caused by AI. While almost anything AI has a better chance of getting dosh right now, this means conventional conservation work is being undermined as a result. But this in turn chokes out the lifeblood of AI -- the data that trains the models we build!  I noticed also that <a href=\"https://rich-turner-group.github.io/\">Richard Turner</a> made the same point about his recent <a href=\"https://www.cam.ac.uk/research/news/fully-ai-driven-weather-prediction-system-could-start-revolution-in-forecasting\">revolutionary climate model</a>, where he observes that <em>&quot;Aardvark would not have been possible without decades of physical-model development by the community, and we are particularly indebted to ECMWF for their ERA5 dataset which is essential for training Aardvark&quot;</em>. 
The same is true for conservation.</li>\n<li>the &quot;credit effect&quot;, which ascribes all advances to AI rather than the hard work from a global community. I noticed this in Demis Hassabis' recent <a href=\"https://www.cam.ac.uk/stories/demis-hassabis-AI-Cambridge\">talk on his Nobel prize</a>, where Alphafold was mainly possible due to a <a href=\"https://www.statnews.com/2025/01/07/casp-protein-structure-prediction-competition-after-alphafold/\">decades-long competition</a> organised by computational and experimental chemists. Whole cohorts of scientists held their latest results back for a year in order to allow the models to have benchmarks.</li>\n<li>the &quot;fashion effect&quot;, whereby conservation interventions that might last decades (not <a href=\"https://golarainforest.org/grnp-history\">uncommon</a> in nature restoration projects) are forced to lurch between the latest topic of the week. <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> noted that plastic pollution was another example of how precious political attention was diverted suddenly; there was a BBC documentary with heart-breaking images of <a href=\"https://www.youtube.com/watch?v=EjIUp6A7GRU\">plastic pollution killing dolphins</a> and suddenly all attention was on <a href=\"https://www.economist.com/international/2018/03/03/the-known-unknowns-of-plastic-pollution\">eliminating them</a> at <a href=\"https://www.gov.uk/government/news/gove-takes-action-to-ban-plastic-straws-stirrers-and-cotton-buds\">all cost</a>. This isn't to say that banning plastic straws was bad (quite the opposite!), but that we must also consider biodiversity impacts holistically and continue to fund <a href=\"https://pubmed.ncbi.nlm.nih.gov/33213887/\">broad picture work</a> as well as the 'charismatic topic of the week'. 
Alec, for example, held a fantastic workshop last year about the use of <a href=\"https://pubmed.ncbi.nlm.nih.gov/35979694/\">OSINT</a> for establishing the bigger picture in ecosystem management.</li>\n</ul>\n<p><a href=\"https://www.cam.ac.uk/stories/Anil-Madhavapeddy-AI-climate-nature\"> <img src=\"/images/cam-nature-2.webp\" alt=\"%c\" title=\"Would you trust this man with your garden? What's that? Yes? Yes you would?\" > </a></p>\n<h2 id=\"telling-the-story-from-the-conservation-perspective\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#telling-the-story-from-the-conservation-perspective\"></a>Telling the story from the conservation perspective</h2>\n<p>I only really started thinking about this properly <em>after</em> talking to Jacqueline and Sam, so I'm grateful to them for sparking the chain of thoughts. I've started reading how other organisations (such as MacArthur's <a href=\"https://www.macfound.org/programs/field-support/technology-public-interest/\">Technology for the Public Good</a>) discuss the role of technology in societal domains, and would be grateful for any pointers to similar initiatives in conservation.</p>\n<p>I would also dearly love to see a roundup of all the Cambridge <a href=\"https://www.cam.ac.uk/news/environment\">environmental coverage</a> in one place, perhaps on the <a href=\"https://www.conservation.cam.ac.uk/\">Conservation Research Institute</a> pages, told as a cohesive story from the perspective of the nature research and not the technology that enables just a part of it. If you're an undergraduate looking for something to do this summer, especially from the social sciences or journalism, do get in touch and I'd be delighted to work with you on this for an internship! 
Or maybe this is something for the first edition of the <a href=\"/notes/cambridge-green-blue\">Cambridge Green Blue</a> competition to assemble next year...</p>\n<p><small class=\"credits\">Thanks to <a href=\"https://samreynolds.org\">Sam Reynolds</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-william-morgan\">William Morgan</a>, <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/ashley-simkins\">Ash Simkins</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> for corrections and suggestions to this post! <em>7th May 2025:</em> See also <a href=\"/notes/humans-save-nature-not-ai\">a followup article</a> on this by <a href=\"https://www.communications.cam.ac.uk/our-team\">Jacqueline Garget</a>.</small></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>As an aside, I love this long-form, carefully considered mechanism for scholarly discussion, as espoused by the letter back-and-forth in a journal. I wish we had more of this in computer science rather than social media arguments that disappear like tears in the rain just a few scrolls later.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>Amusingly, they were triggered by an accidental reply-all from me to the whole building rather than a private reply. I hold that this is the best way to start a real conversation!</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Reynolds et al (2024). The potential for AI to revolutionize conservation: a horizon scan. <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\" target=\"_blank\"><i>10.1016/j.tree.2024.11.013</i></a></li>\n<li>Madhavapeddy (2025). Humans are the ones that will save nature, helped by AI. <a href=\"https://doi.org/10.59350/32h4v-5kt36\" target=\"_blank\"><i>10.59350/32h4v-5kt36</i></a></li>\n<li>Madhavapeddy (2025). 
Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Reynolds et al (2025). Conservation changed but not divided. <a href=\"https://doi.org/10.1016/j.tree.2025.04.002\" target=\"_blank\"><i>10.1016/j.tree.2025.04.002</i></a></li>\n<li>Madhavapeddy (2025). The Cambridge \"Green Blue\" competition to reduce emissions. <a href=\"https://doi.org/10.59350/y1g67-aq825\" target=\"_blank\"><i>10.59350/y1g67-aq825</i></a></li>\n<li>Madhavapeddy (2025). 2nd Programming for the Planet workshop CFP out. <a href=\"https://doi.org/10.59350/728q9-5ct54\" target=\"_blank\"><i>10.59350/728q9-5ct54</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ai-should-unite-conservation",
      "title": "Technology needs to unite conservation, not divide it",
      "summary": "Response to critique of AI in conservation emphasizing participatory design, open source tools and equitable capacity building in Global South.",
      "date_published": "2025-04-25T00:00:00.000000Z",
      "date_modified": "2025-04-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "biodiversity",
        "policy",
        "ai"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-ai-conhorizon.pdf",
          "mime_type": "application/pdf",
          "title": "The potential for AI to revolutionize conservation: a horizon scan"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1016/j.tree.2024.11.013",
          "doi": "10.1016/j.tree.2024.11.013",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/32h4v-5kt36",
          "doi": "10.59350/32h4v-5kt36",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1016/j.tree.2025.04.002",
          "doi": "10.1016/j.tree.2025.04.002",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/y1g67-aq825",
          "doi": "10.59350/y1g67-aq825",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/728q9-5ct54",
          "doi": "10.59350/728q9-5ct54",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/ckjd5-55d47",
      "content_html": "<p>Like many <a href=\"/notes/ai-ietf-aiprefs\">others</a>, my website is under a constant barrage of crawling from bots. I need to figure out which one is hosing me, but I am also resisting having third-party trackers of any form. I took a look at hosting a <a href=\"https://plausible.io/\">Plausible</a> instance as <a href=\"https://plausible.ci.dev/ocaml.org\">OCaml does</a>, but it's yet another service to run and maintain. Then <a href=\"https://nick.recoil.org\">Nick Ludlam</a> pointed me to an old-fashioned server-side log analyser with builtin privacy called <a href=\"https://goaccess.io\">Goaccess</a> he's using on his <a href=\"https://nick.recoil.org\">site</a>, which is also perfect for my needs!</p>\n<p>Setting up Goaccess is extremely simple. It's a single binary with no dependencies outside of ncurses, and just needs some server side logs.  I currently use <a href=\"https://caddyserver.com\">Caddy</a> to front the HTTP2/3 for my custom OCaml webserver, so I just had to configure it to output JSON-format logs.</p>\n<pre><code>anil.recoil.org {\n        reverse_proxy http://localhost:8080\n        encode zstd gzip\n        log {\n                format json\n                output file /var/log/caddy/anil.recoil.org.log {\n                        roll_size 1gb\n                        roll_keep 100\n                }\n        }\n}\n</code></pre>\n<p>The above causes Caddy to log lines in a JSON format like this:</p>\n<pre><code class=\"language-json\">{ &quot;level&quot;: &quot;info&quot;, &quot;ts&quot;: 1745414562.426229,\n  &quot;logger&quot;: &quot;http.log.access.log0&quot;,\n  &quot;msg&quot;: &quot;handled request&quot;,\n  &quot;request&quot;: {\n    &quot;remote_ip&quot;: &quot;&lt;snip&gt;&quot;, &quot;remote_port&quot;: &quot;56839&quot;,\n    &quot;client_ip&quot;: &quot;&lt;snip&gt;&quot;, &quot;proto&quot;: &quot;HTTP/3.0&quot;,\n    &quot;method&quot;: &quot;GET&quot;, &quot;host&quot;: 
&quot;anil.recoil.org&quot;,\n    &quot;uri&quot;: &quot;/assets/home.svg&quot;,\n    &quot;headers&quot;: {\n      &quot;Sec-Fetch-Dest&quot;: [ &quot;image&quot; ],\n      &quot;Sec-Fetch-Site&quot;: [ &quot;same-origin&quot; ],\n      &quot;Sec-Fetch-Mode&quot;: [ &quot;no-cors&quot; ],\n      &quot;Priority&quot;: [ &quot;u=5, i&quot; ],\n      &quot;Accept-Encoding&quot;: [ &quot;gzip, deflate, br&quot; ],\n      &quot;Accept&quot;: [\n        &quot;image/webp,image/avif,image/jxl,image/heic,image/heic-sequence,video/*;q=0.8,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5&quot; ],\n      &quot;User-Agent&quot;: [\n        &quot;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3.1 Safari/605.1.15&quot;\n      ],\n      &quot;Referer&quot;: [ &quot;https://anil.recoil.org/&quot; ],\n      &quot;Accept-Language&quot;: [ &quot;en-GB,en;q=0.9&quot; ]\n    },\n    &quot;tls&quot;: {\n      &quot;resumed&quot;: false, &quot;version&quot;: 772, &quot;cipher_suite&quot;: 4865,\n      &quot;proto&quot;: &quot;h3&quot;, &quot;server_name&quot;: &quot;anil.recoil.org&quot;\n    }\n  }, &lt;...etc&gt;\n}\n</code></pre>\n<p>While this is a verbose logging format, it compresses very well and has lots of information that can be analysed without the need for any JavaScript. 
Once the logging is set up, just running <code>goaccess &lt;logfile&gt;</code> spins up a curses configuration from which I can select the Caddy log format.</p>\n<p><img src=\"/images/goaccess-ss-1.webp\" alt=\"%c\" ></p>\n<p>After that, there is a simple interactive terminal dashboard that not only shows the usual analytics, but also fun things like operating system and time-of-access frequency patterns.</p>\n<p><img src=\"/images/goaccess-ss-2.webp\" alt=\"%c\" ></p>\n<p>The tool can also blank out IP addresses in order to preserve privacy in the output analytics, and also spit out an <a href=\"https://theorangeone.net/posts/goaccess-analytics/\">HTML report</a>, although I'm not using this particular functionality.  While Plausible looks like the answer for bigger sites, this simple tool is good enough for me. The very first iteration of this site in 1998 used <a href=\"https://en.wikipedia.org/wiki/Analog_(program)\">Analog</a> (written by my former Xen/Docker colleague Stephen Turner), so it's nice to come full circle to this sort of tool again!</p><h1>References</h1><ul><li>Madhavapeddy (2025). The AIETF arrives, and not a moment too soon. <a href=\"https://doi.org/10.59350/agfta-8wk09\" target=\"_blank\"><i>10.59350/agfta-8wk09</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/goaccess-for-logs",
      "title": "Viewing web logs the old fashioned way with Goaccess",
      "summary": "Guide to using Goaccess privacy-preserving server-side log analyzer with Caddy JSON logs for web analytics.",
      "date_published": "2025-04-23T00:00:00.000000Z",
      "date_modified": "2025-04-23T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/agfta-8wk09",
          "doi": "10.59350/agfta-8wk09",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/5xx2q-b5875",
      "content_html": "<p>The sister conference to <a href=\"/notes/propl-at-splash\">PROPL</a> was held late last year in Scotland with a bumper attendance from Cambridge. All of the talks from it are now available online at <a href=\"https://www.youtube.com/@loco-workshop\">YouTube</a>, or on our ad-free <a href=\"https://watch.eeg.cl.cam.ac.uk/c/loco/videos\">EEG video site</a>.\nThe keynote from <a href=\"https://www.annecurrie.com\">Anne Currie</a> was fantastic and wide-ranging (she is the author of the eerily predictive <a href=\"https://www.annecurrie.com/chapter-1-utopia-five\">Panopticon series</a>):</p>\n<p><div class=\"video-center\"><iframe title=\"LOCO 2024 Keynote discussion with Anne Currie, Strategically Green, UK\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/8d092fb4-e49a-4d6c-9d37-2169330b4480\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>I was also involved with four fun talks up there given by collaborators:</p>\n<p><div class=\"video-center\"><iframe title=\"Carbon-Aware Name Resolution\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/4cd6efdb-fd22-4a1c-a326-df49dfc1f398\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div>\n<div class=\"video-center\"><iframe title=\"Lineage first computing: towards a frugal userspace for Linux\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/cb2439c9-d160-4daa-8103-b952c5aa2c5f\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div>\n<div class=\"video-center\"><iframe title=\"Emission Impossible: privacy-preserving carbon emissions claims\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/4324ab18-f3b2-4fdd-883f-a4188dee5816\" frameborder=\"0\" 
allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div>\n<div class=\"video-center\"><iframe title=\"Cooperative Sensor Networks for Long-Term Biodiversity Monitoring\" width=\"100%\" height=\"315px\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/be89625e-c671-4e2c-8261-a98b1361a077\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>More broadly, I'm quite pleased with the <a href=\"\">##selfhosting</a> of videos. I'm now operating three sites (with <a href=\"https://www.tunbury.org/\">Mark Elvers</a> and <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>):</p>\n<ul>\n<li>The <a href=\"https://watch.cl.cam.ac.uk\">EEG video site</a> with <a href=\"https://watch.eeg.cl.cam.ac.uk/about/instance#statistics\">100+ videos</a> of EEG talks and workshops like <a href=\"https://watch.eeg.cl.cam.ac.uk/c/propl24/videos\">PROPL</a> and LOCO (above).</li>\n<li>The <a href=\"https://watch.ocaml.org\">Watch OCaml</a> site with <a href=\"https://watch.ocaml.org/about/instance#statistics\">almost 200 videos across 20 years</a> of talks related to the OCaml programming language.</li>\n<li>My personal <a href=\"https://crank.recoil.org\">Recoil video mirror</a> with <a href=\"https://crank.recoil.org/about/instance/home\">~65 videos</a> of my own stuff.</li>\n</ul>\n<p>Many of the talks in the instances above have been sunset from their respective video hosting sites, so there's a strong element of robustness over time to hosting things ourselves. Each of the instances above also follows the others to <a href=\"https://docs.joinpeertube.org/admin/following-instances\">provide p2p redundancy</a> if a video hits the socials and goes viral.</p><h1>References</h1><ul><li>Madhavapeddy (2025). 2nd Programming for the Planet workshop CFP out. <a href=\"https://doi.org/10.59350/728q9-5ct54\" target=\"_blank\"><i>10.59350/728q9-5ct54</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/loco24-talks-online",
      "title": "Talks from LOCO24 are now available online",
      "summary": "LOCO24 conference talks now available on self-hosted video platforms with peer-to-peer redundancy.",
      "date_published": "2025-04-17T00:00:00.000000Z",
      "date_modified": "2025-04-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carbon",
        "policy",
        "systems",
        "selfhosting",
        "networks"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/728q9-5ct54",
          "doi": "10.59350/728q9-5ct54",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/ycqj1-b3996",
      "content_html": "<p>It's about the time of the academic year to come up with project <a href=\"/ideas\">ideas</a>! <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a>, <a href=\"https://github.com/andrewray\">Andy Ray</a> and I have been looking into <a href=\"/notes/fpgas-hardcaml\">FPGA/OCaml matters</a> recently so I thought I'd review the latest in the land of <a href=\"https://webassembly.org\">Webassembly</a> for non-traditional hardware targets.  It turns out that there are very fun systems projects going on to turn wasm into a &quot;real&quot; target architecture on several fronts: a native port of Linux to run in wasm, a port of wasm to run in kernel space, a POSIX mapping of wasm, and fledgling wasm-CPUs-on-FPGAs.</p>\n<h2 id=\"native-port-of-linux-to-wasm\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#native-port-of-linux-to-wasm\"></a>Native port of Linux to wasm</h2>\n<p>The first one is a <a href=\"https://github.com/tombl/linux\"><em>native</em> port</a> of the Linux kernel to run in webassembly (<a href=\"https://linux.tombl.dev\">try it in your browser</a>). This isn't an emulation; instead, the various kernel subsystems have been ported to have wasm interfaces, so the C kernel code runs directly as webassembly, with virtual device layers.</p>\n<p>The inspiration for this seems to have come from a famous comment eight years ago on the LKML:</p>\n<blockquote>\n<p>One more general comment: I think this may well be the last new CPU architecture we ever add to the kernel. Both nds32 and c-sky are made by companies that also work on risc-v, and generally speaking risc-v seems to be killing off any of the minor licensable instruction set projects, just like ARM has mostly killed off the custom vendor-specific instruction sets already. 
If we add another architecture in the future, it may instead be something like the LLVM bitcode or WebAssembly, who knows?\n<cite>-- <a href=\"https://lore.kernel.org/all/CAK8P3a2-wyXxctVtJxniUoeShASMhF-6Z1vyvfBnr6wKJuioAQ@mail.gmail.com/\">Arnd Bergmann, LKML, 2018</a></cite></p>\n</blockquote>\n<p>And this port brings us much closer to that!  I need to spelunk more into the diffs to the mainline kernel to see how it all works, but some quick notes:</p>\n<ul>\n<li>the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts\">tools/wasm</a> directory shows how some of the glue code works, such as the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts\">worker.ts</a> which uses <a href=\"https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers\">WebWorkers</a> to implement multicore, and the venerable <a href=\"https://wiki.libvirt.org/Virtio.html\">virtio</a> to implement <a href=\"https://github.com/tombl/linux/blob/wasm/tools/wasm/src/virtio.ts#L204\">virtual block devices</a>.</li>\n<li>the <a href=\"https://github.com/tombl/linux/tree/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm\">arch/wasm</a> contains the glue code, and <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/irq.c#L17\">mm.c</a> shows how atomic builtins in wasm are sufficient to implement low-level memory management. 
The <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/fork.c#L12C2-L12C24\">clone</a> implementation leads us to <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/include/asm/wasm_imports.h\">wasm_imports.h</a> which shows all the FFI stubs needed from the runtime in <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/wasm.ts#L21\">worker.ts</a>.  Notably, it looks like the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts#L103\">process switcher</a> doesn't use the <a href=\"https://github.com/WebAssembly/stack-switching\">wasm stack switching</a> extension (possibly for compatibility?).</li>\n<li>the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/syscall.c#L19\">arch/wasm/kernel/syscall.c</a> (and that whole directory) could form the basis for a nice OS teaching course. Implementing the core of an OS on a virtual hypervisor is always <a href=\"/projects/unikernels\">an educational experience</a>, and this port is based on &quot;real&quot; Linux!</li>\n</ul>\n<h2 id=\"running-wasm-in-linux-kernel-mode\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#running-wasm-in-linux-kernel-mode\"></a>Running wasm in Linux kernel mode</h2>\n<p>On the opposite end of the architecture spectrum, we have a <a href=\"https://github.com/wasmerio/kernel-wasm\">Linux in-kernel WASM runtime</a>. This one allows running userspace code within the kernel space, as motivated by:</p>\n<blockquote>\n<p>Since WASM is a virtual ISA protected by a virtual machine, we don't need to rely on external hardware and software checks to ensure safety. Running WASM in the kernel avoids most of the overhead introduced by those checks, e.g. 
system call (context switching) and <code>copy_{from,to}_user</code>, therefore improving performance.\nAlso, having low-level control means that we can implement a lot of features that were heavy or impossible in userspace, like virtual memory tricks and handling of intensive kernel events (like network packet filtering).\n<cite>-- <a href=\"https://github.com/wasmerio/kernel-wasm?tab=readme-ov-file#why-run-webassembly-in-the-kernel\">Why run Wasm in the kernel</a></cite></p>\n</blockquote>\n<p>There are some interesting <a href=\"https://github.com/wasmerio/wasmer/tree/main/examples#examples\">example applications</a> available that they accelerate. They report on the speedup for an echo and http server that can run in kernel space:</p>\n<blockquote>\n<p>When compiled with the singlepass backend (unoptimized direct x86-64 code generation) and benchmarked using tcpkali/wrk, echo-server is ~10% faster (25210 Mbps / 22820 Mbps) than its native equivalent, and http-server is ~6% faster (53293 Rps / 50083 Rps). Even higher performance is expected when the other two Wasmer backends with optimizations (Cranelift/LLVM) are updated to support generating code for the kernel.\n<cite>-- <a href=\"https://github.com/wasmerio/kernel-wasm?tab=readme-ov-file#examples-and-benchmark\">kernel wasm benchmarks</a></cite></p>\n</blockquote>\n<h2 id=\"running-posix-applications-in-the-browser\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#running-posix-applications-in-the-browser\"></a>Running POSIX applications in the browser</h2>\n<p>The kernel-wasm port led me to look more closely at the wasmer runtime, which in turn also extends the <a href=\"https://wasi.dev\">wasi</a> server-side interface of WASM to support full POSIX compatibility. 
You can also view this in the <a href=\"https://wasmer.sh\">browser as a shell</a>, where a variety of applications can be compiled to wasm and run as if you had a shell in the browser!</p>\n<p>There is impressive support for POSIX here, as well as a <a href=\"https://wasmer.io/posts/introducing-the-wasmer-js-sdk\">wasmer/wasix SDK</a> to port existing applications like ffmpeg to run in the browser or <a href=\"https://wasmer.io/posts/wasmer-js-sdk-now-supports-node-and-bun\">on a server JS runtime</a>.</p>\n<p>So what's stopping OCaml -- via the <a href=\"https://tarides.com/blog/2023-11-01-webassembly-support-for-ocaml-introducing-wasm-of-ocaml/\">new wasm-of-ocaml compiler</a> -- from running in the browser? Just the fact that our target runtime depends on the <a href=\"https://github.com/WebAssembly/stack-switching\">wasm stack switching</a> extension, and <a href=\"https://github.com/ocaml-wasm/wasm_of_ocaml/issues/101#issuecomment-2464706078\">wasmer doesn't yet support that extension</a>. Since then, wasmer 2.3 has <a href=\"https://wasmer.io/posts/wasmer-2_3\">improved stack switching</a> performance but the extension isn't quite there yet. So if anyone's looking for some experience with language runtime hacking, this might be a good project. I couldn't find any information on whether wasmer is planning on adding support for this extension yet though.</p>\n<h2 id=\"running-wasm-on-an-fpga\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#running-wasm-on-an-fpga\"></a>Running wasm on an FPGA</h2>\n<p>And last but not least, given all of the above, what would it take to run wasm on an FPGA directly? 
The existence of the Linux native wasm port is encouraging, since it implies that if you were to get wasm instructions to run directly on an FPGA (just like you might with a <a href=\"https://discuss.ocaml.org/t/hardcaml-mips-cpu-learning-project-and-blog/8088\">MIPS FPGA CPU</a> or a <a href=\"https://github.com/ujamjar/hardcaml-riscv\">RISC-V one</a>), then you could hook up the rest of the OS ecosystem to this as custom drivers.</p>\n<p>I found a few projects around this space that I need to look into more:</p>\n<ul>\n<li>wasmachine is an implementation of the WebAssembly specification in a FPGA. It follows a sequential 6-steps design. <a href=\"https://github.com/piranna/wasmachine\">https://github.com/piranna/wasmachine</a> (see <a href=\"https://github.com/WebAssembly/design/issues/1050\">wasm design discussion</a>)</li>\n<li>a <a href=\"https://github.com/denisvasilik/wasm-fpga-engine\">wasm-fpga-engine</a> that executes a subset of instructions</li>\n<li>an <a href=\"https://www.mdpi.com/2079-9292/13/20/3979\">FPGA accelerator for WASM instructions</a>. This one came before the stack switching extension though, which might make the implementation in hardware significantly easier.</li>\n</ul>\n<h2 id=\"and-more\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#and-more\"></a>And more...</h2>\n<p>After first posting this, here are incoming updates. <a href=\"https://bsky.app/profile/jonaskruckenberg.de/post/3lmygmvbidc2i\">Jonas Kruckenberg</a> tells me that he's got an experimental OS called <a href=\"https://github.com/JonasKruckenberg/k23\">k23</a>. 
This is a microkernel that is built around the idea of using Wasm as the primary execution environment:</p>\n<blockquote>\n<p>This allows for a number of benefits:</p>\n<ul>\n<li>Security: WebAssembly is designed to run in a sandboxed environment, making it much harder to exploit.</li>\n<li>Modularity: WebAssembly modules can depend on each other, importing and exporting functionality and data, forming a modular system where dependency management is a first class citizen.</li>\n<li>Portability: WebAssembly is designed to be very portable. Forget questions like &quot;is this binary compiled for amd64 or arm?&quot;. k23 programs just run wherever.</li>\n<li>Static Analysis: WebAssembly is famous for being very easy to analyze. This means we can check for bad programs without even running them.\n<cite>-- <a href=\"https://jonaskruckenberg.github.io/k23/\">The k23 manual</a></cite></li>\n</ul>\n</blockquote><h1>References</h1><ul><li>Madhavapeddy (2025). Programming FPGAs using OCaml. <a href=\"https://doi.org/10.59350/bxnyj-v6f40\" target=\"_blank\"><i>10.59350/bxnyj-v6f40</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/wasm-on-exotic-targets",
      "title": "Webassembly on exotic architectures (a 2025 roundup)",
      "summary": "Survey of WebAssembly implementations on non-traditional targets including native Linux port, kernel-mode runtime, POSIX browser support and FPGA ports.",
      "date_published": "2025-04-16T00:00:00.000000Z",
      "date_modified": "2025-04-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "wasm",
        "systems",
        "fpga",
        "ocaml"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/bxnyj-v6f40",
          "doi": "10.59350/bxnyj-v6f40",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/vd6af-4bc83",
      "content_html": "<p>The <a href=\"https://www.esa.int/\">European Space Agency</a> organised the first conference on <a href=\"https://biospace25.esa.int/\">Biodiversity Insights from Space</a> (BioSpace) in February this year, and it seems like it was a huge success. The conference itself sold out within days, and the program <a href=\"https://biospace25.esa.int/agenda/\">was so packed</a> that the organisers had to split it into multiple chunks during the week to cope with everyone.  I've only just gotten around to fully browsing the <a href=\"https://biospace25.esa.int/agenda/\">schedule</a>, and it's incredible to see so much variety of work happening in biodiversity and remote sensing. Here's hoping that <a href=\"https://www.esa.int/\">ESA</a> makes this an annual event in Italy!</p>\n<p><a href=\"https://coomeslab.org\">David Coomes</a>, who was on the scientific selection committee, told us about it so we hastily submitted a few abstracts which got selected for presentation! 
David himself <a href=\"https://biospace25.esa.int/iframe-agenda/files/ID498_Coomes.pdf\">talked about forest disturbance</a>.</p>\n<p><a href=\"https://www.youtube.com/live/e-eQ8XhRrsE?t=14326s\"> <img src=\"/images/biospace-ss-1.webp\" alt=\"%c\" > </a></p>\n<h2 id=\"from-ground-to-canopy-integrating-ground-based-sensors-with-remote-sensing-to-improve-urban-tree-management\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#from-ground-to-canopy-integrating-ground-based-sensors-with-remote-sensing-to-improve-urban-tree-management\"></a>From Ground to Canopy: Integrating Ground-based Sensors with Remote Sensing to Improve Urban Tree Management</h2>\n<p><a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> presented the work we've been <a href=\"/papers/2024-terracorder\">exploring</a> at Cambridge and Imperial around using <a href=\"/papers/2025-npu-bench\">ultra low power sensors</a> for biodiversity monitoring and <a href=\"/ideas/urban-vegetation\">urban health</a>:</p>\n<blockquote>\n<p>Urban trees are essential for supporting biodiversity, as they provide\nhabitats for various species and help regulate water storage and temperature,\nand sequester CO₂ in urban ecosystems. Urban forests have been proposed as a\nnature-based solution to fight climate change and provide ecosystem services\nto citizens. Mapping and monitoring urban trees is vital as it facilitates\nconservation strategies for both flora and fauna, early diagnosis of plant\npathogens, and zoning and urban development.</p>\n<p>However, mapping trees has\nproved difficult for urban planners since they rely on in situ surveys or\ncommunity-led projects that may not cover all areas; one such case is London,\nwhere the official survey only accounts for ~10% of the estimated 8 million\ntrees in the city. 
Moreover, the geographic coordinates of trees are\nsurprisingly unreliable due to a lack of precision of measuring devices (e.g.\nphones or commercial GPS).</p>\n<p>We propose a method for calibrating urban tree\nlocations using physical ground sensors as &quot;anchors&quot;. These sensors help\nreconcile spatial mismatches across various spatial datasets, including\nhigh-resolution satellite and aerial imagery and tree surveys collected by\ncity councils or in open-data projects like OSM. These low-power sensors can\nalso collect microclimate and other biodiversity-related data, such as\npassive acoustic animal activity monitoring, providing a richer picture of\ntree and urban ecosystem health and enabling high resolution maps not\npreviously possible. Our ultimate goal is to combine remote sensing\ninformation with ground-based measurements to support reliable data that can\nbe used in geographic-based foundation models to help better urban planning\nstrategies around trees that maximise their benefit to humans and nature.</p>\n</blockquote>\n<p><img src=\"/images/biospace-ss-2.webp\" alt=\"%c\" title=\"The Biospace poster was so big it was half-way to space already\" ></p>\n<p>You can read <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a>'s own <a href=\"https://ancazugo.github.io/research/outreach/2025/02/14/biospace25-blog.html\">writeup on his blog</a> and watch the <a href=\"https://www.youtube.com/live/e-eQ8XhRrsE?t=14326s\">recording</a>! 
<a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> would have made it to a poster presentation, but forgot to register in time and missed out due to how packed the conference was!</p>\n<h2 id=\"establishing-causal-links-which-facilitate-remote-sensing-of-biodiversity-metric\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#establishing-causal-links-which-facilitate-remote-sensing-of-biodiversity-metric\"></a>Establishing causal links which facilitate remote sensing of biodiversity metric</h2>\n<p><a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> also prepared a poster for his <a href=\"/ideas/ssl-for-geospatial-tasks\">PhD work</a> on the topic of causality measurement. His <a href=\"https://www.onkargulati.com/2025/02/28/biospace.html\">notes from the conference</a> about the use of SDGs are great:</p>\n<blockquote>\n<p>My big takeaway from the opening speeches was that this is the first year that the ESA is spending more on building out its data science capabilities than it is on putting satellites into space. 
To me, this is indicative of the fact that the marginal benefit from putting effort into effectively wrangling huge amounts of data is now greater than that from collecting huge amounts of data at a faster pace.</p>\n</blockquote>\n<p>Given the growing amount of <a href=\"https://www.sdo.esoc.esa.int/environment_report/Space_Environment_Report_latest.pdf\">space junk</a> out there, getting more leverage over already gathered data seems very sensible indeed.</p>\n<p>Another important point Onkar makes that I've been noticing in my own thoughts about <a href=\"/notes/uk-national-data-lib\">national data libraries</a> is:</p>\n<blockquote>\n<p>A key point multiple speakers made note of (there were a dozen or so speakers\ntalking for perhaps ~10 minutes each) was that introducing frameworks and\nmethodologies to give countries national ownership of their data and the\nability to independently generate compatible statistics was the priority, not\nintroducing new data products. If we can move towards all countries using the\nsame standards, we can enable the aggregation of statistics up in a reliable\nmanner.</p>\n</blockquote>\n<p>Since the February date of this BIOSPACE conference there has, of course, been a huge amount of\ngeopolitical flux in the world. Countries gaining national ownership of <em>their\nown</em> data seems more important than ever.\nOnkar's <a href=\"https://www.onkargulati.com/2025/02/28/biospace.html\">full writeup</a> is full of\ninsights derived from the conference, so I encourage you to have a direct read!</p><h1>References</h1><ul><li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Millar et al (2025). Benchmarking Ultra-Low-Power μNPUs. Association for Computing Machinery. 
<a href=\"https://doi.org/10.1145/3680207.3765264\" target=\"_blank\"><i>10.1145/3680207.3765264</i></a></li>\n<li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/biospace-25",
      "title": "ESA's first BioSpace conference seems a huge success",
      "summary": "Report on ESA's first Biodiversity Insights from Space conference featuring presentations on urban tree management and remote sensing biodiversity metrics.",
      "date_published": "2025-04-16T00:00:00.000000Z",
      "date_modified": "2025-04-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forests",
        "biodiversity"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3680207.3765264",
          "doi": "10.1145/3680207.3765264",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2408.02407",
          "doi": "10.48550/arXiv.2408.02407",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/hxjfa-bx293",
      "content_html": "<p>I was gobsmacked to get a note from the SIGARCH <a href=\"https://www.asplos-conference.org\">ASPLOS</a> steering committee that our 2013 paper &quot;<a href=\"/papers/2013-asplos-mirage\">Unikernels: library operating systems for the cloud</a>&quot; won the <a href=\"https://www.sigarch.org/benefit/awards/acm-sigarch-sigplan-sigops-asplos-influential-paper-award/\">most influential paper</a> award at the conference last week!  I couldn't make it to Rotterdam myself due to the <a href=\"https://www.businesstraveller.com/forums/topic/reminder-no-direct-eurostar-amsterdam-rotterdam-london-for-six-months/\">travel time</a>, but <a href=\"https://github.com/mor1\">Richard Mortier</a> was <a href=\"https://mort.io/blog/tdis-accepted/\">already there</a> and so accepted the award on the whole team's behalf!</p>\n<p>My officemate <a href=\"https://simon.peytonjones.org/\">Simon Peyton Jones</a> pointed out to me that these 'test of time' awards are his favourite, as they indicate that a piece of research was actually useful over a number of years to other people in the field:</p>\n<blockquote>\n<p>The ASPLOS Influential Paper Award recognizes historical ASPLOS papers that have had major influence on the field. 
The Program Committee nominates highly influential papers from any ASPLOS conference that occurred ten or more conferences ago, with the final selections being made by the ASPLOS Steering Committee.\n-- <a href=\"https://www.sigarch.org/benefit/awards/acm-sigarch-sigplan-sigops-asplos-influential-paper-award/\">SIGARCH awards</a></p>\n</blockquote>\n<p><img src=\"/images/asplos25-award-1.webp\" alt=\"%r\" title=\"Mort rocking the award with customary peak-geek EDSAC t-shirt\" ></p>\n<p>My long-time colleague <a href=\"https://dave.recoil.org\">Dave Scott</a> wrote up a great overview of why <a href=\"https://dave.recoil.org/unikernels/\">he likes unikernels</a>, especially in areas like <a href=\"\">Docker</a> where he is a senior maintainer these days. Dave uses a nice jigsaw puzzle analogy to show the value of a library operating system approach when building complex systems glue; they're good for high assurance applications, for rapid experimentation and iteration, and for deep systems customisation.</p>\n<h2 id=\"i-almost-rage-quit-academia\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#i-almost-rage-quit-academia\"></a>I almost rage-quit academia</h2>\n<p><a href=\"https://github.com/mor1\">Richard Mortier</a>'s <a href=\"https://mort.io/blog/happy-day/\">note</a> brought up some memories about how this particular work made me almost rage-quit academia entirely. 
Back in 2009 after I returned to academia with a <a href=\"https://horizon.ac.uk\">Horizon</a> fellowship, I spent a year of my life working on lots of libraries for the very first iteration of MirageOS, and published a USENIX <a href=\"/papers/2010-hotcloud-lamp\">workshop paper</a> on it.</p>\n<p>After that, it was time to do the full conference paper, so I spent another year (joined by more colleagues like <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> and <a href=\"https://dave.recoil.org\">Dave Scott</a> in the <a href=\"/projects/unikernels\">early</a> days) bringing yet more OCaml libraries to life to make the thing actually useful. <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\">Charalampos Rotsos</a> and <a href=\"https://github.com/mor1\">Richard Mortier</a> spent forever on an OpenFlow implementation in pure OCaml; <a href=\"https://github.com/balrajsingh\">Balraj Singh</a> sorted out the mess I made of the TCP stack congestion algorithms; <a href=\"https://github.com/sos22\">Steven Smith</a> hacked on Xen with me to add immutable pfn support. There was a lot of intense hacking going on.</p>\n<p>We then triumphantly wrote up our work as a submission to OSDI 2012 after staying up all night for several days in a row to get the evaluation done, and... it got rejected. But the paper didn't <em>just</em> get rejected, it got rejected so hard that I couldn't bear to look at another OSDI proceedings for years.  So hard that I can still feel the weight of the punch of that email as I opened it eagerly, a decade on. Some of our colleagues also had rejected OSDI papers that year; the <a href=\"https://www.microsoft.com/en-us/research/project/naiad/?from=https://research.microsoft.com/en-us/projects/naiad/&amp;type=exact\">Naiad</a> paper got six reviews but bounced (later to win best paper at <a href=\"https://sigops.org/s/conferences/sosp/2013/\">SOSP 2013</a>), and Andrew Warfield had another that got nine (!) 
reviews before being shown the door. But ours got...three reviews indicating it bounced in the very first round, and one review scored us at '1/5', the lowest possible value. It made all that intense work seem like a total waste of our lives.</p>\n<p>But then... one of the reviews shone out like a beacon. It was the longest review, and was <em>full</em> of directly actionable feedback. It began very constructively:</p>\n<blockquote>\n<p>The approach described in this paper is a very reasonable design point,\na natural intersection of the exokernel and libOS and the type-safe OS.\nIt would help to refocus the abstract and intro on the precise benefits that\nUnikernel can provide, and to attribute each benefit to its origin. Let me\ntake a stab at this, based on my read of the paper:</p>\n</blockquote>\n<p>The reviewer then reinterpreted our submission to add more clarity:</p>\n<blockquote>\n<ul>\n<li>&quot;eliminate several classes of security threats via type-safety&quot; -- clearly due to the top-to-bottom use of type-safe OCaml.</li>\n<li>&quot;eliminate several classes of security threats via ... an address-space which can be made immutable&quot; -- is there any reason an analogous technique could not apply to a libOS version of libc?</li>\n<li>&quot;progressive specialization&quot; -- I didn't find this contribution very exciting. It's nice that I can execute the same OCaml app in a Linux context to debug it; perhaps this is especially important since we don't yet have good symbolic debuggers for OCaml apps standing alone in a VM.  But it really doesn't seem like a central benefit.</li>\n<li>&quot;developers no longer need to become sysadmins&quot; -- This claim is specious.  
If a third party packages an app together with a Linux guest-OS stack to become an appliance, there's no reason that appliance would require any sysadmin-ish behavior more than a Unikernel appliance.</li>\n<li>&quot;The resulting unikernels are also highly compact&quot; -- Could this property not also be readily achieved with a tuned libc-based (that is, not type-safe) libOS? How precious is this property? Is the <em>working set</em> actually much smaller? And finally, how much of the reduction is because the rewritten application is much less functional than the industry standard it replaces?\n[...the review continues on for several pages]</li>\n</ul>\n</blockquote>\n<p>Now, I didn't <em>agree</em> with all the points in the review, but they were restructured in such a way that made it clear that the reviewer had really thought about it, and had tried to pull out their own insights from the system construction. We ate up this feedback, and resubmitted it in a matter of weeks to ASPLOS adopting much of the structure suggested by this OSDI reviewer, and the paper got in with accepts across the board.</p>\n<p>The best bit of all this? The OSDI reviewer voluntarily <em>unblinded</em> their review:</p>\n<blockquote>\n<p>Review by Jon Howell <a href=\"mailto:howell@microsoft.com\">howell@microsoft.com</a>, intentionally unblinded.</p>\n</blockquote>\n<p>Jon's obviously an expert in the field (his own 2011 paper on <a href=\"https://dl.acm.org/doi/10.1145/1961296.1950399\">Drawbridge</a> won the ASPLOS influential paper award last year), but it's how much time he took in helping out a sibling system that stuck with me. 
His kind, constructive and direct review kept me in academia, and although I still haven't met him in person (life got really busy right afterwards with <a href=\"/notes/docker-buys-unikernel-systems\">Unikernel Systems</a>), I definitely still owe him a pint!</p>\n<h2 id=\"systems-research-a-decade-or-more-on\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#systems-research-a-decade-or-more-on\"></a>Systems research a decade or more on</h2>\n<p>Mort also <a href=\"https://mort.io/blog/happy-day/\">made me think</a> about what we learnt from all this work that current students might learn from.</p>\n<p>You never know which papers will sink or swim in the fullness of time and most never get that popular outside a small niche, so don't worry about that right now when doing the work!  Focus instead on honing your systems intuition for <em>why</em> you're building something, and bring out the <a href=\"https://blog.regehr.org/archives/6\">delta of your contributions</a> clearly in the paper. When you're building a complex system, there's a lot of boilerplate and scaffolding necessary, but the core of the &quot;thing that hasn't been done before&quot; is what you know best, and it can take some time and <a href=\"https://en.wikipedia.org/wiki/Rubber_duck_debugging\">conversations</a> to figure out what that is.</p>\n<p><img src=\"/images/jc-flyingpig-1.webp\" alt=\"%c\" title=\"In the much missed Flying Pig...with Jon\" ></p>\n<p>Back in the day, most of our discussions about systems research happened after a day of deep hacking down at the pub.\nSince the pandemic, we seem to have lost a big chunk of that social discussion around our work collectively.  While I still see people regularly for a swift half, it somehow seems more difficult to gather people in general. 
Part of that is that I don't go into the department much anymore due to the noise and cold (something <a href=\"https://jonmsterling.com\">Jon Sterling</a> also <a href=\"https://amok.recoil.org/@jonmsterling@mathstodon.xyz/114318437109811024\">observed recently</a>), <em>vs</em> my cosy Pembroke office.</p>\n<p>I'm not sure if it's just me or everyone else also feeling this, but I'm so zoned out after even a few Zoom calls that I'm just not very social afterwards. So one thing I'm aiming to do more consciously after Easter is to try to gather people down at <a href=\"https://www.themillpubcambridge.com/\">The Mill</a> more for a swift half (where they serve excellent <a href=\"https://www.guinness.com/en-gb/beers/guinness-zero\">Guinness Zero</a>), and really cut down on remote interactions that aren't necessary.</p>\n<p><img src=\"/images/jc-kingston-1.webp\" alt=\"%c\" title=\"In the Kingston Arms...with Jon\" ></p>\n<p>And the last tip is from Barry Schwartz, who noted that <a href=\"https://en.wikipedia.org/wiki/The_Paradox_of_Choice\">the secret to happiness is low expectations</a>. No matter how much work goes into a system, don't bank too much on the big papers making a splash. Instead, enjoy every step of the journey -- from building things, scrapping them, debugging odd failures, throwing ideas around, releasing the software, seeing adoption, scrapping it all and starting again, the whole journey! There will always be a <a href=\"https://link.springer.com/article/10.1007/s40037-021-00670-z\">reviewer 2</a> waiting to ruin your day if you let them, so don't let them in.</p>\n<p>I also wonder how long paper publishing in its current form will survive; with the sheer number of publications coming out these days and the amount of <a href=\"/notes/ai-contamination-of-papers\">AI generated output</a>, it's difficult to see something published today having the same ramp as the work we did in the past few decades. 
Instead, adoption and rapid iteration seem to be the way to go. Thankfully, our University intellectual property rights <a href=\"https://www.enterprise.cam.ac.uk/wp-content/uploads/2015/04/IP-Policy-in-Practice-Guidance-Note-25May10-FINAL-CLEAN-Updated-links-August-2015.pdf\">remain liberal</a> (patents aside, but who cares about those these days for software), so there's nothing stopping us!</p>\n<p><img src=\"/images/jc-mill-1.webp\" alt=\"%c\" title=\"In the Mill...with Jon. Remembering our friend Ross!\" ></p><h1>References</h1><ul><li>Madhavapeddy (2025). Fake papers abound in the literature. <a href=\"https://doi.org/10.59350/qmsqz-ark89\" target=\"_blank\"><i>10.59350/qmsqz-ark89</i></a></li>\n<li>Madhavapeddy et al (2013). Unikernels: library operating systems for the cloud. ACM. <a href=\"https://doi.org/10.1145/2451116.2451167\" target=\"_blank\"><i>10.1145/2451116.2451167</i></a></li>\n<li>Porter et al (2011). Rethinking the library OS from the top down. SIGPLAN Not.. <a href=\"https://doi.org/10.1145/1961296.1950399\" target=\"_blank\"><i>10.1145/1961296.1950399</i></a></li>\n<li>Watling et al (2021). Don’t be reviewer 2! Reflections on writing effective peer review comments. Perspectives on Medical Education. <a href=\"https://doi.org/10.1007/s40037-021-00670-z\" target=\"_blank\"><i>10.1007/s40037-021-00670-z</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/unikernels-test-of-time",
      "title": "Unikernels wins the ASPLOS most influential paper award",
      "summary": "2013 MirageOS unikernels paper wins ASPLOS influential paper award with reflections on the journey from rejection to recognition.",
      "date_published": "2025-04-12T00:00:00.000000Z",
      "date_modified": "2025-04-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "awards",
        "systems",
        "ocaml"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qmsqz-ark89",
          "doi": "10.59350/qmsqz-ark89",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/2451116.2451167",
          "doi": "10.1145/2451116.2451167",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/1961296.1950399",
          "doi": "10.1145/1961296.1950399",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1007/s40037-021-00670-z",
          "doi": "10.1007/s40037-021-00670-z",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/zy5bb-3ze20",
      "content_html": "<p>Over in my <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a> group, we have a <em>lot</em> of primary and secondary datasets lying around: 100s of terabytes of <a href=\"/projects/rsn\">satellite imagery</a>, <a href=\"/projects/life\">biodiversity data</a>, <a href=\"/projects/ce\">academic literature</a>, and the intermediate computations that go along with them. Our trusty central shared storage server running <a href=\"https://www.truenas.com\">TrueNAS</a> stores data in <a href=\"https://en.wikipedia.org/wiki/ZFS\">ZFS</a> and serves it over <a href=\"https://en.wikipedia.org/wiki/Network_File_System\">NFSv4</a> to a bunch of hosts. This is rapidly becoming a bottleneck as our group and datasets grow, and <a href=\"https://www.tunbury.org/\">Mark Elvers</a> has been steadily adding <a href=\"https://www.tunbury.org/kingston-drives/\">lots more raw capacity</a>.  The question now is how to configure this raw SSD capacity into a more nimble storage setup.  If anyone's seen any systems similar to the one sketched out below, I'd love to hear from you.</p>\n<h2 id=\"why-get-rid-of-nfs\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#why-get-rid-of-nfs\"></a>Why get rid of NFS?</h2>\n<p>The first design constraint is to get rid of centralised network storage. This is both slow when compared to a modern NVMEs, and also hard to extend beyond the <a href=\"/papers/2015-sosp-sibylfs\">POSIX-ish</a> API to take advantage of filesystem-specific features like snapshots or <a href=\"https://docs.rs/reflink/latest/reflink/\">reflink clones</a>. We also don't take much advantage of the simultaneous use of the network storage. Instead, we'd like to just make every host materialise the portion of storage it needs locally by cloning it from the remote server.</p>\n<p>The alternative I'm considering here is to use ZFS filesystems on the nodes themselves rather than NFS. 
This has the upside of having the cloned data be directly available on the local disk of the host that's using it, meaning that there's no performance impact as with networked storage.  ZFS also scales to fairly enormous sizes, and so it seems likely that we won't run into an upper bound due to this choice of filesystem in the medium term.</p>\n<p>ZFS operates through the creation of a <a href=\"https://wiki.ubuntu.com/ZFS/ZPool\">zpool</a> across a block of disks, over which <a href=\"https://blog.victormendonca.com/2020/11/03/zfs-for-dummies/\">datasets</a> can be created in a tree. One of our typical research servers looks like this:</p>\n<pre><code>$ zfs list\nNAME               USED  AVAIL  REFER  MOUNTPOINT\neeg               20.4T  7.37T  7.84T  /eeg\neeg/gbif          8.55T  7.37T  8.55T  /eeg/gbif\neeg/logs/fetcher  11.3G  7.37T  8.41G  /eeg/logs/fetcher\neeg/logs/zotero   12.2G  7.37T  8.29G  /eeg/logs/zotero\neeg/papers/doi    3.11T  7.37T  3.11T  /eeg/papers/doi\neeg/papers/pmc     843G  7.37T   843G  /eeg/papers/pmc\neeg/papers/tei    87.4G  7.37T  85.8G  /eeg/papers/tei\neeg/repology      5.92G  7.37T  5.92G  /eeg/repolo\n</code></pre>\n<p>Inside the <code>eeg</code> zpool, each of the sub-datasets can themselves be arranged in a hierarchy. Each of them can also have key-value labels and separate properties attached to them, and inherit their parent dataset's properties. 
There are a <a href=\"https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops.7.html\">vast number of ZFS properties</a> that can be tuned.</p>\n<h2 id=\"snapshots-and-replication-with-sanoid\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#snapshots-and-replication-with-sanoid\"></a>Snapshots and replication with Sanoid</h2>\n<p>Once a single host has some important data in a local ZFS dataset, I've started using <a href=\"https://github.com/jimsalterjrs/sanoid\">Sanoid</a> for the snapshot management:</p>\n<blockquote>\n<p>Sanoid is a policy-driven snapshot management tool for ZFS filesystems [...] you can use it to make your systems functionally immortal via automated snapshot management and over-the-air replication.</p>\n</blockquote>\n<p>The first use of Sanoid is to regularly take ZFS snapshots of important filesystems. These snapshots will be rotated regularly at different intervals, and subsequently replicated off-host.  Here's an example <code>/etc/sanoid/sanoid.conf</code> from the machine above.</p>\n<pre><code>[eeg/papers/doi]\n        use_template = production\n[eeg/papers/pmc]\n        use_template = production\n[eeg/papers/tei]\n        use_template = production\n\n[template_production]\n        frequently = 0\n        hourly = 24\n        daily = 30\n        monthly = 3\n        yearly = 0\n        autosnap = yes\n        autoprune = yes\n</code></pre>\n<p>This <code>sanoid.conf</code> establishes a &quot;production&quot; template that keeps 24 hourly snapshots, 30 daily ones and 3 monthly ones. 
After some time passes, I can verify this by checking the local filesystem snapshots.</p>\n<pre><code>$ zfs list -t snapshot\nNAME                                            USED  AVAIL  REFER  MOUNTPOINT\neeg/p/doi@2024122101                            134M      -  1.19T  -\neeg/p/doi@2024122102                            45.3M     -  1.19T  -\neeg/p/doi@autosnap_2025-02-01_00:00:02_monthly  224M      -  2.84T  -\neeg/p/doi@autosnap_2025-03-01_00:00:02_monthly  173M      -  3.11T  -\neeg/p/doi@autosnap_2025-03-08_00:00:03_daily    173M      -  3.11T  -\neeg/p/doi@autosnap_2025-03-09_00:00:01_daily    0B        -  3.11T  -\neeg/p/doi@autosnap_2025-03-10_00:00:02_daily    0B        -  3.11T  -\neeg/p/doi@autosnap_2025-03-11_00:00:02_daily    0B        -  3.11T  -\n&lt;...etc&gt;\n</code></pre>\n<p>These snapshots are incremental, and each subsequent one uses only the differential space taken up by the earlier ones. See this <a href=\"https://zedfs.com/all-you-have-to-know-about-reading-zfs-disk-usage/\">handy guide</a> for the meaning of the different space accounting terms above.</p>\n<p>Once happy with the production template, I then automate it within cron:</p>\n<pre><code>* * * * * TZ=Europe/London /usr/bin/sanoid --cron\n</code></pre>\n<p>This currently runs every minute (a little wasteful), in order to quickly check if any snapshots are required. Once happy with the hourly cadence working as expected, I'll drop this back to an <code>@hourly</code> job.</p>\n<h2 id=\"replicating-the-zfs-snapshots\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#replicating-the-zfs-snapshots\"></a>Replicating the ZFS snapshots</h2>\n<p>Once Sanoid is merrily making snapshots on the active host, it's also necessary to replicate this off the host for robustness. 
Since we're not using a networked store, I'd like to replicate the snapshots onto two other hosts, one of which is offsite.\nCrucially, these backup hosts can have their <em>own</em> sanoid configuration with a longer-term horizon of backups (e.g. keeping some yearly snapshots). To make this work, we first use a sister tool <a href=\"https://github.com/jimsalterjrs/sanoid?tab=readme-ov-file#syncoid\"><code>syncoid</code></a> that is included in the Sanoid distribution.</p>\n<p>I run <code>syncoid</code> in 'pull mode', which means that the backup server is configured to be able to SSH into the production server(s) in order to fetch the datasets.  Under the hood, syncoid uses a combination of ZFS <a href=\"https://forums.truenas.com/t/zfs-bookmarks-and-why-you-dont-use-them-but-should/5578\">bookmarks</a> and <a href=\"https://xai.sh/2018/08/27/zfs-incremental-backups.html\">send/recv</a> to incrementally and efficiently transmit the snapshots over the network and reconstruct the filesystems locally.</p>\n<p>Once the SSH host keys are configured in the usual way, a series of crontab entries like this is sufficient to fetch all the remote snapshots to the local host. The backup host that's doing the pulling just needs to run <code>syncoid</code> regularly:</p>\n<pre><code>@daily /usr/sbin/syncoid backup@marpe:eeg/papers/doi eeg/papers/doi\n@daily /usr/sbin/syncoid backup@marpe:eeg/papers/tei eeg/papers/tei\n</code></pre>\n<p>At this point, the backup host now has all the snapshots from the live host (including hourly ones), and can then run Sanoid again in order to decide which ones it wants to keep locally. 
I haven't put too much effort into optimising these yet, but you can see they're different from the ones above.</p>\n<pre><code>[eeg/papers]\n        use_template = backup\n        recursive = yes\n\n[template_backup]\n        autoprune = yes\n        frequently = 0\n        hourly = 30\n        daily = 90\n        monthly = 12\n        yearly = 0\n        autosnap = no\n</code></pre>\n<p>These will keep a few more hourly snapshots, and three times the number of daily snapshots available on the backup servers, in case a rollback is needed. Since the backup server typically has a lot more raw capacity than the live server, it's practical to do this there rather than on the production hosts.</p>\n<p>Finally, we can also hook this up to our monitoring scripts with a handy <a href=\"https://www.nagios.org/\">Nagios</a>-compatible interface.</p>\n<pre><code># sanoid --monitor-health\nOK ZPOOL eeg : ONLINE {Size:30.6T Free:2.44T Cap:92%}\n</code></pre>\n<h2 id=\"should-we-use-zfs-root-volumes\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#should-we-use-zfs-root-volumes\"></a>Should we use ZFS root volumes?</h2>\n<p>It's a bit trickier to figure out if the root volume of the hosts should be ZFS. This requires that the boot initrd always has a working ZFS kernel module, which sometimes goes wrong on updates if the DKMS shim falters for some reason. In terms of specific distributions:</p>\n<ul>\n<li>Ubuntu in theory supports ZFS with all its kernels, but <a href=\"https://discourse.ubuntu.com/t/future-of-zfs-on-ubuntu-desktop/33001/19?u=d0od\">the long term future</a> of ZFS root on Ubuntu is in <a href=\"https://www.omgubuntu.co.uk/2023/01/ubuntu-zfs-support-status\">question</a>. 
<a href=\"https://www.tunbury.org/\">Mark Elvers</a> has got <a href=\"https://www.tunbury.org/ubuntu-with-zfs-root/\">detailed instructions</a> on how to automate this with an <a href=\"https://gist.github.com/mtelvers/2cbeb5e35f43f5e461aa0c14c4a0a6b8\">Ansible playbook</a>.</li>\n<li>Debian only packages <a href=\"https://wiki.debian.org/ZFS\">ZFS via DKMS</a> due to the CDDL <a href=\"https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/\">licensing concerns</a>.</li>\n<li>Alpine also has good <a href=\"https://wiki.alpinelinux.org/wiki/ZFS\">ZFS support</a>, including <a href=\"https://wiki.alpinelinux.org/wiki/Root_on_ZFS_with_native_encryption\">encrypted root</a>.</li>\n</ul>\n<p>For my own personal servers, I've been using a normal ext4 root volume, and creating a ZFS for the remainder of the disk, without using LVM underneath it. It's a bit less flexible, but strikes a balance between performance and flexibility.</p>\n<h2 id=\"next-steps\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#next-steps\"></a>Next steps</h2>\n<p>This basic setup is sufficient to have pull-based ZFS snapshot and replication across multiple hosts, and makes it easy to quickly materialise a ZFS dataset onto a given host for use in data processing.  It still needs a bunch of development to turn this into a properly robust system though:</p>\n<ul>\n<li><strong>Dynamic dataset construction:</strong> One downside is that you must create the right ZFS dataset structure ahead of time, since you can't send/receive arbitrary filesystem subtrees. I'm not sure if it's easy to <code>zfs create</code> into an existing subdirectory of a dataset and have it copy the files within there semi-automatically into the new sub-dataset.</li>\n<li><strong>Backing up into encrypted volumes:</strong> One of the coolest features of ZFS is that it can maintain <em>unmounted</em> and <em>encrypted at rest</em> datasets. 
It's therefore possible to have unencrypted data on the production servers (so no performance hit), with more secure-at-rest encryption on the backup servers. However, <a href=\"https://mtlynch.io/zfs-encrypted-backups/\">it requires some messing around</a> to figure out the right runes.</li>\n<li><strong>Discovery of datasets in a cluster:</strong> We also need a way of knowing which datasets have been backed up to which hosts. This way, if a host in a cluster needs a particular dataset, it can request it from the other host. Given we probably have 1000s of datasets (as opposed to potentially millions of snapshots), this doesn't seem like too difficult a problem. We may even be able to use an <a href=\"https://irmin.org\">Irmin</a> database or a DNS-based broadcast mechanism to do this easily within a cluster.</li>\n<li><strong>Switching from ZFS to XFS locally:</strong> While ZFS seems like the ideal replication filesystem, it still lacks some of the cooler local features like <a href=\"https://github.com/openzfs/zfs/issues/405#issuecomment-1880208374\">XFS reflinks</a>. It would be nice to find an efficient way to materialise an XFS filesystem from a ZFS base, but without copying absolutely everything. This is either impossibly difficult or really easy via some cunning use of <a href=\"https://en.wikipedia.org/wiki/OverlayFS\">overlayfs</a>. Probably impossible though, given how much block-level information is needed to do deduplication.</li>\n<li><strong>ZFS labels for policy:</strong> Most ZFS tools use custom key/value labels on datasets to implement policies. For example, a <code>syncoid:sync</code> label can be used to tell syncoid to include a particular recursive dataset in its replication. There are some scalability limits in just how many labels you can add before slowing a machine down to a crawl (though not as bad as how many live mounts). 
<a href=\"https://patrick.sirref.org\">Patrick Ferris</a> started some WIP <a href=\"https://github.com/quantifyearth/ocaml-zfs\">ocaml-zfs</a> bindings to <a href=\"https://github.com/openzfs/zfs/blob/master/include/libzfs.h\">libzfs</a> to help explore this question.</li>\n</ul>\n<p>So lots of work left to do here, but quite good fun as systems hacking goes! When <a href=\"https://www.tunbury.org/\">Mark Elvers</a> is <a href=\"https://www.tunbury.org/kingston-drives/\">done installing</a> our new drives, we'll have a few petabytes of raw capacity to implement this system over...</p><h1>References</h1><ul><li>Ridge et al (2015). SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems. ACM. <a href=\"https://doi.org/10.1145/2815400.2815411\" target=\"_blank\"><i>10.1145/2815400.2815411</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/syncoid-sanoid-zfs",
      "title": "Semi distributed filesystems with ZFS and Sanoid",
      "summary": "Exploring ZFS and Sanoid for distributed filesystem management with automated snapshots and replication to replace centralized NFS storage.",
      "date_published": "2025-04-05T00:00:00.000000Z",
      "date_modified": "2025-04-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "storage",
        "systems",
        "opensource",
        "enki"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2815400.2815411",
          "doi": "10.1145/2815400.2815411",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/728q9-5ct54",
      "content_html": "<p><a href=\"https://dorchard.github.io\">Dominic Orchard</a> and I had a blast <a href=\"https://plas4sci.github.io/conference/2024/01/22/propl.html\">running</a> the first <a href=\"https://propl.dev\">PROPL</a> workshop a couple of years ago, with a full room and engaged audience in POPL in London. Last year, our sister conference <a href=\"https://sicsa.ac.uk/loco/loco2024/\">LOCO</a> took over, and it's our turn again this year!  PROPL will return for a <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">second outing</a> in October, co-located with <a href=\"https://icfp25.sigplan.org/\">ICFP</a>/SPLASH in Singapore in October. Read the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">call for papers</a> here (deadline 3rd July 2025).</p>\n<p><img src=\"/images/propl-1.webp\" alt=\"%r\" title=\"Dominic prepping for the first PROPL in the rather delightful venue\" ></p>\n<p>We'd love to get wider participation in computer science interacting with matters of climate and biodiversity:</p>\n<blockquote>\n<p>There are simultaneous interlinked crises across the planet due to human actions: climate change, biodiversity loss, and desertification. Addressing these challenges requires, amongst other things, a global understanding of the present state of affairs and the effectiveness of our adaptations and mitigations, leveraging both data and computation.</p>\n<p>However, programming the computer systems required to effectively ingest, clean, collate, process, explore, archive, and derive policy decisions from the planetary data we are collecting is difficult and leads to artefacts presently not usable by non-CS-experts, not reliable enough for scientific and political decision making, and not widely and openly available to all interested parties. 
Concurrently, domains where computational techniques are already central (e.g., climate modelling) are facing diminishing returns from current hardware trends and software techniques.</p>\n<p>PROPL explores how to close the gap between state-of-the-art programming methods being developed in academia and the use of programming in climate analysis, modelling, forecasting, policy, and diplomacy. The aim is to build bridges to the current practices used in the scientific community.\n<cite> -- <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">About PROPL</a></cite></p>\n</blockquote>\n<h2 id=\"how-to-take-part\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-to-take-part\"></a>How to take part</h2>\n<p><img src=\"/images/propl-2.webp\" alt=\"%r\" title=\"The first PROPL had keynotes from Drew Purves and Lisa Rennels\" ></p>\n<p>In order to get a wide set of participants, we've got three different ways you can contribute to the program this year, all of which are listed in the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">call for papers</a>:</p>\n<ul>\n<li>Firstly, we want to hear short &quot;provocations&quot; from practitioners in the field, which outline a problem, application area, challenge, or capacity gap, that might be addressable by computer scientists. Since said practitioners are busy people, we've put together a <a href=\"https://forms.gle/DV2rA1iUgNwxfjiW6\">simple online form</a> in which you can submit your thoughts, rants, and ideas.</li>\n<li>Secondly, we're now going to publish a post-proceedings in the ACM digital library. These can be short papers (up to 5 pages, excluding bibliography/appendices) addressing a topic within the scope of the workshop. 
<a href=\"https://sicsa.ac.uk/loco/loco2024/\">LOCO</a> did a great job encouraging thoughtful submissions last year, and we'd love to see similar enthusiasm this year too.</li>\n<li>Thirdly, consider submitting a talk or discussion idea aligned with the topics of the workshop. This could include reporting on existing work, a demo, open problems, work in progress, or new ideas and speculation. We may combine multiple talk proposals into panel discussions, depending on the submitted topics.</li>\n</ul>\n<p>The papers and talks can be submitted using the <a href=\"https://propl25.hotcrp.com\">PROPL HotCRP</a>, and the provocations via an <a href=\"https://forms.gle/DV2rA1iUgNwxfjiW6\">online form</a>. The deadline is the 3rd July 2025 (anywhere on earth), so we hope to see you take part!</p>\n<h2 id=\"see-last-years-talks\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#see-last-years-talks\"></a>See last year's talks</h2>\n<p>For those curious about the first PROPL outing, all of the talk videos are online <a href=\"https://www.youtube.com/watch?v=yZeS4oN_XeI&amp;list=PLyrlk8Xaylp7j9K6CETKpQSpCIOcJ9iO9\">on YouTube</a> or our <a href=\"https://watch.eeg.cl.cam.ac.uk/c/propl24/videos\">EEG video mirror</a> (ad-free).</p>\n<p><img src=\"/images/propl-3.webp\" alt=\"%c\" title=\"We're looking forward to seeing you in Singapore for the second outing!\" ></p>\n<p><small class=\"credits\">(Thanks to Lena Yang for spotting typos.)</small></p>",
      "url": "https://anil.recoil.org/notes/propl-at-splash",
      "title": "2nd Programming for the Planet workshop CFP out",
      "summary": "Call for papers for PROPL 2025 workshop on programming methods for climate and biodiversity research at ICFP/SPLASH Singapore.",
      "date_published": "2025-04-03T00:00:00.000000Z",
      "date_modified": "2025-04-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "service",
        "functional",
        "conference",
        "climate",
        "biodiversity",
        "conservation"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2025-npu-bench-1",
      "content_html": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> just released our latest preprint on how to make sense of the growing number of dedicated, ultra-low-power 'neural network accelerators' that are found in many modern embedded chipsets. My interest in this derives from wanting to decouple from the cloud when it comes to <a href=\"/projects/osmose\">low-latency local environments</a>, and this needs fast tensor operations in hardware. Josh found a huge number of interesting NPUs in modern low-cost chips, ranging from <a href=\"https://www.espressif.com/en/products/socs/esp32\">ESP32</a>-based boards over to <a href=\"https://arm.com\">ARM</a> ones. All of these have quite a variety of tradeoffs, from the operations supported (which affects which models can be run on them) to the amount of memory and CPU power. This is the first comparative evaluation and independent benchmarking of several commercially-available micro-NPUs. We developed an open-source model compilation framework to enable consistent benchmarking across diverse hardware, measuring end-to-end performance including latency, power consumption, and memory overhead. The analysis uncovered surprising disparities between hardware specifications and actual performance, including unexpected scaling behaviors with increasing model complexity.</p><h1>References</h1><ul><li>Millar et al (2025). Benchmarking Ultra-Low-Power μNPUs. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3680207.3765264\" target=\"_blank\"><i>10.1145/3680207.3765264</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2025-npu-bench-1",
      "title": "New preprint on benchmarking ultra-low power neural accelerators",
      "summary": "Preprint analyzing dedicated ultra-low-power neural network accelerators in modern embedded chipsets for edge computing applications.",
      "date_published": "2025-03-28T00:00:00.000000Z",
      "date_modified": "2025-03-28T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "biodiversity",
        "conservation",
        "esp32",
        "embedded",
        "sensing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2025-npu-bench.pdf",
          "mime_type": "application/pdf",
          "title": "Benchmarking Ultra-Low-Power μNPUs"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3680207.3765264",
          "doi": "10.1145/3680207.3765264",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/xb1fz-c5v35",
      "content_html": "<p>Our <a href=\"/papers/2024-life\">recently published</a> <a href=\"/projects/life\">LIFE</a> biodiversity metric has just been integrated into a newly recognised <a href=\"https://defraenvironment.blog.gov.uk/2025/01/20/newly-recognised-official-statistic-tracks-the-environmental-impact-of-our-consumption/\">Official Statistic from the UK government</a>! This integrates the core LIFE biodiversity metric with <a href=\"/papers/2024-food-life\">food provenance data</a> to track the environmental impacts of our consumption habits.</p>\n<p>I must admit that I'd not heard of &quot;Official Statistics&quot; before this, so I did a bit of research. The UK <a href=\"https://osr.statisticsauthority.gov.uk/\">Office for Statistics Regulation</a> says that:</p>\n<blockquote>\n<p>Official statistics are statistics produced by Crown bodies and other organisations listed within an Official Statistics Order, on behalf of the UK government or devolved administrations.\nThey provide a factual basis for assessment and decisions on economic, social and environmental issues at all levels of society.\n<cite>-- <a href=\"https://osr.statisticsauthority.gov.uk/policies/official-statistics-policies/\">OSR Policies</a> </cite></p>\n</blockquote>\n<p>The good folk at the <a href=\"https://jncc.gov.uk/\">Joint Nature Conservation Committee</a> are responsible for this particular statistic. The JNCC launched their <a href=\"https://data.jncc.gov.uk/data/ccb9f624-7121-4c32-aefa-e0579d7eaaa1/together-for-nature.pdf\">Together for Nature</a> strategy in 2023, and have the remit of turning scientific outcomes into robust evidence-based action for protecting nature worldwide. 
They've been developing a <a href=\"https://commodityfootprints.earth/\">Global Environmental Impacts of Consumption</a> indicator that provides information to policymakers about the tradeoffs of various consumption actions vs the corresponding global environmental impact.</p>\n<blockquote>\n<p>When products arrive in our shops and on our doorsteps, they can look very different to the raw ingredients that were used to make them. Most products are made up of many parts, and these can move thousands of miles through many countries before reaching their final destination.</p>\n<p>The average consumer is so far removed from the production process, both physically and conceptually, that it is hard to imagine where their products are from, let alone the environmental impacts resulting from their production. This is also true for the academics and governments working to monitor and reduce the environmental impact of our consumption.\n<cite>-- <a href=\"https://defraenvironment.blog.gov.uk/2025/01/20/newly-recognised-official-statistic-tracks-the-environmental-impact-of-our-consumption/\">DEFRA Blog</a></cite></p>\n</blockquote>\n<p>Their <a href=\"https://commodityfootprints.earth/\">GEIC tool</a>, developed jointly with the <a href=\"https://www.sei.org/\">Stockholm Environment Institute</a> and our LIFE collaborator <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a>, provides data on spatial biodiversity, water use, and forest cover changes associated with a country's consumption.</p>\n<a href=\"https://commodityfootprints.earth/?footprint_type=consuming&footprint_opposite=producing&focal_country=United+Kingdom+of+Great+Britain+and+Northern+Ireland&measure=LIFE_score_embedded_in_consumption__change_in_prob_of_extinct_n&filter_year=2022&domestic_flows=true&lang=en#dashboard\">\n<p><img src=\"/images/life-statistic-1.webp\" alt=\"%c\" title=\"The consumption impacts of the UK on global species extinctions\" ></p>\n</a>\n<p>This metric is 
updated annually, and this year <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> supplied LIFE+FOOD to additionally map where species are more likely to go extinct as a result of land-use change.</p>\n<p>This year's GEIC update is the first to be recognised by the OSR as being of sufficient stability and quality to &quot;graduate&quot; into Official Statistic status. A pretty cool feeling, and it's all openly downloadable of course; you can <a href=\"https://commodityfootprints.earth/?footprint_type=consuming&amp;footprint_opposite=producing&amp;focal_country=United+Kingdom+of+Great+Britain+and+Northern+Ireland&amp;measure=LIFE_score_embedded_in_consumption__change_in_prob_of_extinct_n&amp;filter_year=2022&amp;domestic_flows=true&amp;lang=en\">navigate</a> over to the Commodity Footprints LIFE section to explore the metrics for yourself.</p>\n<p>From a <a href=\"/projects/plancomp\">planetary computing</a> perspective, what I also found interesting is how the flow of observations and evidence works in practice. 
The computational processing for LIFE involves <a href=\"/ideas/effective-geospatial-code\">crunching</a> petabytes of raster maps from a <a href=\"https://github.com/quantifyearth/aoh-calculator\">species habitat pipeline</a> into a global map, which is then published as an aggregate map on <a href=\"https://zenodo.org/records/14945383\">Zenodo</a> by <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p><img src=\"/images/life-statistic-2.webp\" alt=\"%c\" title=\"The LIFE map in false colour around the equatorial region (credit: Tom Swinfield/Michael Dales)\" ></p>\n<p><a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> and <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a> then worked directly with the policy team at the JNCC to further customise the metric for GEIC needs, and the aggregate result of that is what's actually used in the dashboard.</p>\n<p>There's quite a long gap between the original observations and the resulting policy use, with many humans in the loop in between. Computational systems need to capture all this nuance rather than viewing these metrics as &quot;just&quot; dataflow pipelines. However, it's equally important to capture the policy customisations in some sort of code, so that we can reliably issue annual updates. Figuring this pipeline out is part of what we're working on in the <a href=\"/projects/ce\">Conservation Evidence Copilots</a> project at present. 
See below for a <a href=\"/videos/d592bf17-c835-435f-9469-f0f65e926975\">recent talk</a> I gave on the functional programming aspects of this problem at LambdaDays.</p>\n<p><div class=\"video-center\"><iframe title=\"Programming for the Planet\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/d592bf17-c835-435f-9469-f0f65e926975\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/life-official-statistic",
      "title": "LIFE becomes an Official Statistic of the UK government",
      "summary": "LIFE biodiversity metric becomes UK government Official Statistic to track consumption's environmental impact.",
      "date_published": "2025-03-21T00:00:00.000000Z",
      "date_modified": "2025-03-21T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "life",
        "biodiversity",
        "conservation",
        "sensing",
        "policy",
        "evidence"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/k540h-6h993",
      "content_html": "<p>Access to reliable and timely scientific evidence is utterly vital for the practise of responsible policymaking, especially with all the turmoil in the world these days. At the same time, the evidence base on which use to make these decisions is rapidly morphing under our feet; the <a href=\"https://sakana.ai/ai-scientist-first-publication/\">first entirely AI-generated paper passed peer review</a> at an ICLR workshop today.  We held a workshop on this topic of AI and evidence synthesis at <a href=\"https://pem.cam.ac.uk\">Pembroke College</a> last week, to understand both the opportunities for the use of AI here, the <a href=\"/papers/2024-ce-llm\">strengths and limitations</a> of current tools, areas of progress and also just to chat with policymakers from <a href=\"https://www.gov.uk/government/organisations/department-for-science-innovation-and-technology\">DSIT</a> and thinktanks about how to approach this rapidly moving area.</p>\n<p><em>(The following notes are adapted from jottings from <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>,\n<a href=\"https://samreynolds.org\">Sam Reynolds</a>, <a href=\"https://ai.cam.ac.uk/people/annabelle-scott\">Annabelle Scott</a> and myself. 
They are not at all complete, but hopefully useful!)</em></p>\n<p>We invited a range of participants to the workshop and held it at Pembroke College (the choice of the centuries-old location felt appropriate).\n<a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a> and <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> expertly emceed the day, with <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>, <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://samreynolds.org\">Sam Reynolds</a> also presenting provocations to get the conversation going.</p>\n<p><img src=\"/images/evidence-synth-2.webp\" alt=\"%c\" title=\"Lots of excellent discussions over Pembroke sarnies!\" ></p>\n<h2 id=\"evidence-synthesis-at-scale\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#evidence-synthesis-at-scale\"></a>Evidence synthesis at scale</h2>\n<p><a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a> described the purpose of the workshop as follows:</p>\n<blockquote>\n<p>Evidence synthesis is a vital tool to connect scientific knowledge to areas\nof demand for actionable insights. It helps build supply chains of ideas,\nthat connect research to practice in ways that can deliver meaningful\nimprovements in policy development and implementation.  Its value can be seen\nacross sectors: aviation safety benefitted from systematic incident analysis;\nmedical care has advanced through clinical trials and systematic reviews;\nengineering is enhanced through evidence-based design standards. When done\nwell, evidence synthesis can transform how fields operate. 
However, for every\nfield where evidence synthesis is embedded in standard operating practices,\nthere are others relying on untested assumptions or outdated guidance.\n<cite>-- <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>, AI@Cam</cite></p>\n</blockquote>\n<p>One such field that benefits from evidence is <a href=\"/projects/ce\">conservation</a>, which is what <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> and his <a href=\"https://conservationevidence.com\">team</a> have been working away on for years.  Bill went on to discuss the fresh challenges that AI brings to this field, because it introduces a new element of scale which could augment relatively slow human efforts.</p>\n<blockquote>\n<p>Scale poses a fundamental challenge to traditional approaches to evidence\nsynthesis.  Comprehensive reviews take substantial resources and time. By the\ntime they are complete – or reach a policy audience – the window for action\nmay have closed.  The Conservation Evidence project at the University of\nCambridge offers an example of how researchers can tackle this challenge. The\nConservation Evidence team has analysed over 1.3M journals from 17 languages\nand built a website enabling access to this evidence base.  To support users\nto interrogate this evidence base, the team has compiled a metadataset that\nallows users to explore this literature based on a question of interest, for\nexample looking at what conservation actions have been effective in managing\na particular invasive species in a specified geographic area.\n<cite>-- <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>, AI@Cam</cite></p>\n</blockquote>\n<p>The AI for evidence synthesis landscape is changing very rapidly, with a variety of specialised tools now\nbeing promoted in this space. 
These range from commercial tools such as <a href=\"https://gemini.google/overview/deep-research/?hl=en\">Gemini Deep Research</a> and <a href=\"https://openai.com/index/introducing-deep-research/\">OpenAI's deep research</a>, to\nresearch-focused systems such as <a href=\"https://elicit.com\">Elicit</a>, <a href=\"https://www.distillersr.com/products/distillersr-systematic-review-software\">DistillerSR</a>, and <a href=\"https://www.robotreviewer.net\">RobotReviewer</a>. These tools vary in their approach, capabilities, and target users, raising questions about which will best serve different user needs.  RobotReviewer, for example, notes that:</p>\n<blockquote>\n<p>[...] the machine learning works well, but is not a substitute for human systematic reviewers. We recommend the use of our demo as an assistant to human reviewers, who can validate the machine learning suggestions, and correct them as needed. Machine learning used this way is often described as semi-automation.\n<cite>-- <a href=\"https://www.robotreviewer.net/about\">About RobotReviewer</a></cite></p>\n</blockquote>\n<p>The problem, of course, is that these guidelines will often be ignored by\nreviewers who are under time pressure, and so the well-established protocols\nfor systematic reviewers are under some threat.</p>\n<p><img src=\"/images/evidence-synth-4.webp\" alt=\"%c\" title=\"Sadiq Jaffer and Sam Reynolds discuss emerging AI systems\" ></p>\n<h2 id=\"how-do-we-get-more-systematic-ai-driven-systematic-reviews\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-do-we-get-more-systematic-ai-driven-systematic-reviews\"></a>How do we get more systematic AI-driven systematic reviews?</h2>\n<p><a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://samreynolds.org\">Sam Reynolds</a> then talked about some of the computing approaches required to achieve a more reliable evidence review base.\nThey identified three key principles for responsible AI integration into evidence 
synthesis:</p>\n<ul>\n<li>Traceability: Users should see which information sources informed the evidence review system and why any specific evidence was included or excluded.</li>\n<li>Transparency: Open-source computation code, the use of open-weights models, <a href=\"https://www.ibm.com/impact/ai-ethics\">ethically sourced</a> training data, and clear documentation of methods mean users can scrutinise how the system is working.</li>\n<li>Dynamism: The evidence outputs should be continuously updated to refine the evidence base, by adding new evidence and flagging <a href=\"/notes/ai-contamination-of-papers\">retracted papers</a>.</li>\n</ul>\n<p><a href=\"https://www.cser.ac.uk/team/alex-marcoci/\">Alex Marcoci</a> pointed out his recent work on <a href=\"https://osf.io/sz2g8/\">AI replication games</a>, which I found fascinating. The idea here is that:</p>\n<blockquote>\n<p>Researchers will be randomly assigned to one of three teams: Machine, Cyborg\nor Human. Machine and Cyborg teams will have access to (commercially\navailable) LLM models to conduct their work; Human teams of course rely only\non unaugmented human skills. Each team consists of 3 members with similar\nresearch interests and varying skill levels. Teams will be asked to check for\ncoding errors and conduct a robustness reproduction, which is the ability to\nduplicate the results of a prior study using the same data but different\nprocedures as were used by the original investigator.\n<cite>-- <a href=\"https://www.sheffield.ac.uk/machine-intelligence/events/i4rs-ai-replication-games\">Institute for Replication</a></cite></p>\n</blockquote>\n<p>These replication games are happening on the outputs of evidence, but the\n<em>inputs</em> are also rapidly changing with today's announcement of a <a href=\"https://sakana.ai/ai-scientist-first-publication/\">fully AI-generated paper passing peer\nreview</a>. 
It's hopefully now clear\nthat AI is a huge disruptive factor in evidence synthesis.</p>\n<p><img src=\"/images/evidence-synth-3.webp\" alt=\"%c\" ></p>\n<h2 id=\"the-opportunity-ahead-of-us-for-public-policy\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-opportunity-ahead-of-us-for-public-policy\"></a>The opportunity ahead of us for public policy</h2>\n<p>We first discussed how AI could help in enhancing systematic reviews.\nAI-enabled analysis can accelerate literature screening and data extraction,\ntherefore helping make the reviews more timely and comprehensive.  The\nopportunity ahead of us is to democratise access to knowledge synthesis by\nmaking it available to those without specialised training or institutional\nresources, and therefore getting wider deployment in countries and\norganisations without the resources to commission traditional reviews.</p>\n<p>However, there are big challenges remaining in <a href=\"/notes/uk-national-data-lib\">gaining access</a> to published research papers and datasets.\nThe publishers have deep concerns over AI-generated evidence synthesis, and more generally about the use of generative AI involving their source material. But individual publishers are <a href=\"https://theconversation.com/an-academic-publisher-has-struck-an-ai-data-deal-with-microsoft-without-their-authors-knowledge-235203\">already selling</a> their content to the highest bidder as part of the <a href=\"/notes/ai-ietf-aiprefs\">data hoarding wars</a> and so the spread of the work into pretrained models is not currently happening equitably or predictably.\n<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> called this &quot;competitive exclusion&quot;, and it is limiting communication and knowledge diversity.</p>\n<p>The brilliant <a href=\"https://www.aru.ac.uk/people/jennifer-schooling\">Jennifer Schooling</a> then led a panel discussion about the responsible\nuse of AI in the public sector.  
The panel observed that different countries\nare taking different approaches to the applications of AI in policy research.\nHowever, every country has deep regional variances in the <em>application</em> of\npolicy and priorities, which means that global pretrained AI models always need\nsome localized retuning. The &quot;one-size-fits-all&quot; approach works particularly\nbadly for policy, where local context is crucial to a good community outcome\nthat minimises harm.</p>\n<p>Policymakers therefore need realistic expectations about what AI can and cannot do in evidence synthesis.\n<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> and <a href=\"https://www.aru.ac.uk/people/jennifer-schooling\">Jennifer Schooling</a> came up with the notion that &quot;anticipate, test, and learn&quot; methods must guide AI deployment in policy research; this is an extension of the &quot;<a href=\"https://public.digital/pd-insights/blog/2024/12/just-what-is-test-and-learn\">test and learn</a>&quot; culture being pushed by Pat McFadden as part of the Labour plan to <a href=\"https://www.gov.uk/government/speeches/reform-of-the-state-has-to-deliver-for-the-people\">reform the public sector</a> this year.  With AI systems, <a href=\"https://www.cser.ac.uk/team/alex-marcoci/\">Alex Marcoci</a> noted that we need to be working with the end users of the tools to scope what government departments need and want. These conversations need to happen <em>before</em> we build the tools, letting us anticipate problems before we deploy and test them in a real policy environment. 
<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> noted that policy doesn't have a simple &quot;sandbox&quot; environment to test AI outcomes in, unlike many other fields where simulation is practical ahead of deployment.</p>\n<p><a href=\"https://www.jbs.cam.ac.uk/people/lucia-reisch/\">Lucia Reisch</a> noted that users must maintain critical judgement when using these\nnew AI tools; the machine interfaces must empower users towards enhancing their\ncritical thinking and encouraging reflection on what outputs are being created\n(and what is being left out!).  Lucia also mentioned that her group helps run\nthe &quot;<a href=\"https://whatworksclimate.solutions/about/\">What Works</a>&quot; summit, which\nI've never been to but plan on attending next time it rolls around.</p>\n<p>The energy requirements for training and running these large scale AI models\nare significant as well, of course, raising questions about the long-term\nmaintenance costs of these tools and their environmental footprint.  There was\nwide consensus that the UK should develop its own AI models to ensure\nresilience and sovereignty, but also to make sure that the regional finetuning\nto maximise positive outcomes is under clear local control and not outsourced\ngeopolitically. 
By providing a single model that combines <a href=\"/notes/uk-national-data-lib\">UK national data</a>, we would also not waste energy with lots of\nsmaller training efforts around the four nations.</p>\n<p><img src=\"/images/evidence-synth-1.webp\" alt=\"%c\" title=\"Sadiq Jaffer in front of a very old, very fancy and not AI-designed door\" ></p>\n<p>Thanks <a href=\"https://ai.cam.ac.uk/people/annabelle-scott\">Annabelle Scott</a> for such a stellar organisation job and to Pembroke for hosting and all for\nattending, and please do continue the discussion about this <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7303431795587309569/\">on LinkedIn</a>\nif you are so inclined.</p><h1>References</h1><ul><li>Madhavapeddy (2025). Fake papers abound in the literature. <a href=\"https://doi.org/10.59350/qmsqz-ark89\" target=\"_blank\"><i>10.59350/qmsqz-ark89</i></a></li>\n<li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Madhavapeddy (2025). The AIETF arrives, and not a moment too soon. <a href=\"https://doi.org/10.59350/agfta-8wk09\" target=\"_blank\"><i>10.59350/agfta-8wk09</i></a></li>\n<li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ai-for-evidence-synthesis-workshop",
      "title": "A fully AI-generated paper just passed peer review; notes from our evidence synthesis workshop",
      "summary": "\"AI-generated paper passes peer review, sparking discussion on evidence synthesis and AI's role in policymaking.\"",
      "date_published": "2025-03-12T00:00:00.000000Z",
      "date_modified": "2025-03-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ce",
        "conservation",
        "ai",
        "llms",
        "evidence"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qmsqz-ark89",
          "doi": "10.59350/qmsqz-ark89",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/agfta-8wk09",
          "doi": "10.59350/agfta-8wk09",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/r80vb-7b441",
      "content_html": "<p>I've been an avid user of <a href=\"https://github.com\">GitHub</a> since its launch, and it really has revolutionised how communities come together to work on open source. In recent years though, I find myself utterly overwhelmed by its notifications and want to experiment with <a href=\"https://www.offlineimap.org/github/2016/03/08/github-pr-suck.html\">alternative workflows</a>. This experimentation also has a more serious undertone due to the increasing need for <a href=\"https://www.boell.de/en/2025/01/24/trump-and-big-tech-europes-sovereignty-stake\">data sovereignty</a> and so I'm starting to move my source code to self-hosted solutions that are less reliant on centralised services.</p>\n<p>This has also come up persistently over the years in the <a href=\"https://ocaml.org\">OCaml</a> community, with questions over why participation in packaging <a href=\"https://discuss.ocaml.org/t/publishing-without-github/3232\">requires a GitHub account</a> ever since the <a href=\"/notes/opam-1-1-beta\">early days</a> of opam. I've never found a good answer... until now, with the launch of an exciting <a href=\"https://tangled.sh\">new service</a> that's built over the same protocol that <a href=\"https://bsky.app\">Bluesky</a> uses.\nAs I <a href=\"/notes/atproto-for-fun-and-blogging\">noted</a> a few weeks ago, the <a href=\"https://atproto.com/\">ATProto</a> can be used for more than just microblogging. It can also be an <em>identity</em> layer, across which other applications can be built which reuse the social fabric from Bluesky accounts.</p>\n<p>&quot;<a href=\"https://tangled.sh\">Tangled</a>&quot; is a new service launched (just yesterday!) by <a href=\"https://tangled.sh/@oppili.bsky.social\">opilli</a> and <a href=\"https://tangled.sh/@icyphox.sh\">icyphox</a> to manage Git repositories. I'm having a lot of fun trying it out, even in its early alpha stages!  
The coolest thing about Tangled is that you can self-host your own <a href=\"https://blog.tangled.sh/intro\">knots</a>, which control where the source code repositories are actually stored.</p>\n<h2 id=\"self-hosting-my-own-tangled-knot\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#self-hosting-my-own-tangled-knot\"></a>Self-hosting my own Tangled knot</h2>\n<p>I set up one of the first knots on the network on <code>git.recoil.org</code>, and can now directly share my source code online without depending on GitHub!  For example, this is the <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\">knot-docker</a> container config which you can use to deploy your own version of this.</p>\n<p><a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\"> <img src=\"/images/tangled-ss-1.webp\" alt=\"%c\" > </a></p>\n<p>It looks pretty similar to GitHub, doesn't it? The first key difference is the login on the top-right, which is the same as my <a href=\"https://bsky.app/profile/anil.recoil.org\">anil.recoil.org</a> account.  Once you're logged in, the other difference shows up when creating a new Git repository.</p>\n<p><img src=\"/images/tangled-ss-2.webp\" alt=\"%c\" ></p>\n<p>As you can see, you can not only select the name of the repository, but also <em>where</em> it's going to be stored. I can either put it on the central Tangled knot, or stick it on my own Recoil one.  After this, the user experience of cloning is as simple as:</p>\n<pre><code>git clone https://tangled.sh/@anil.recoil.org/knot-docker\ngit clone git@git.recoil.org:anil.recoil.org/knot-docker\n</code></pre>\n<p>In the first case, the central Tangled web server proxies the Git contents over HTTP, and for SSH I can just connect directly to my own server.  
Inside my Knot container, we can see where the Git repositories are stored:</p>\n<pre><code>/home/git # ls -1\ndid:plc:nhyitepp3u4u6fcfboegzcjw\nknotserver.db\nknotserver.db-shm\nknotserver.db-wal\nlog\n</code></pre>\n<p>The <code>did:</code> directory is actually my 'decentralised identifier' from ATProto, which we can verify by looking up the <a href=\"https://bsky.social/about/blog/4-28-2023-domain-handle-tutorial\">DNS atproto TXT</a> record for my domain:</p>\n<pre><code>$ dig txt _atproto.anil.recoil.org\n;; ANSWER SECTION:\n_atproto.anil.recoil.org. 10799 IN      TXT     &quot;did=did:plc:nhyitepp3u4u6fcfboegzcjw&quot;\n</code></pre>\n<p>And then if we navigate into that directory, we can see there are just normal bare git repositories stored on my server.</p>\n<pre><code>/home/git/did:plc:nhyitepp3u4u6fcfboegzcjw/knot-docker # ls -la\ntotal 24\ndrwxr-sr-x    4 git      git           4096 Mar  8 19:02 .\ndrwxr-sr-x    4 git      git           4096 Mar  8 18:23 ..\n-rw-r--r--    1 git      git             21 Mar  8 18:01 HEAD\n-rw-r--r--    1 git      git             36 Mar  8 18:01 config\ndrwxr-sr-x   17 git      git           4096 Mar  8 19:02 objects\ndrwxr-sr-x    4 git      git           4096 Mar  8 18:01 refs\n</code></pre>\n<p>This makes the core of Tangled very safe to use, even if the service disappears: I maintain the actual git repositories myself, so I can (e.g.) mirror them to GitHub via a simple cron script.</p>\n<h2 id=\"collaboration-is-as-simple-as-bluesky\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#collaboration-is-as-simple-as-bluesky\"></a>Collaboration is as simple as Bluesky</h2>\n<p>Tangled has only been out for about a day, so I co-opted fellow Recoiler <a href=\"https://nick.recoil.org\">Nick Ludlam</a> to create an account. 
I added his <a href=\"https://bsky.app/profile/nick.recoil.org\">handle</a> to the Recoil knot, and that's all it took for him to be able to create repositories on our server.</p>\n<p><img src=\"/images/tangled-ss-5.webp\" alt=\"%c\" ></p>\n<p>I can also just add people directly to a particular repository, as you can see from the one below on his profile.</p>\n<p><a href=\"https://tangled.sh/@nick.recoil.org\"> <img src=\"/images/tangled-ss-3.webp\" alt=\"%c\" > </a></p>\n<h2 id=\"the-issue-metadata-is-also-distributed\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-issue-metadata-is-also-distributed\"></a>The issue metadata is also distributed</h2>\n<p>The real lock-in to code repository management, though, is the metadata around the repository; things like issues, comments and so on. Tangled makes it possible to decentralise where this is stored <a href=\"https://www.chiark.greenend.org.uk/~sgtatham/quasiblog/git-no-forge/\">without needing a central Forge</a>, by relaying it all via the ATProto.\nLet's take a look at how this works.</p>\n<p>I <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker/issues/1\">created an issue</a> on knot-docker, and it looks very similar to a GitHub issue. Zicklag on <code>#tangled</code> pointed me to the <a href=\"https://pdsls.dev/\">PDSLS</a> public ATProto browser with which you can browse the actual ATProto records. I can start from my <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw\">DID record</a> and look for the <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue\">sh.tangled.repo.issue</a> collection, and find the <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue/3ljvbt4zni322\">issue URL from earlier</a>.  
I then prodded <a href=\"https://nick.recoil.org\">Nick Ludlam</a> to leave a comment on the issue, and you can see his <a href=\"https://pdsls.dev/at://did:plc:dr3wsy7hlzgyanewhbw7fj5g/sh.tangled.repo.issue.comment/3ljvdsrlckj22\">sh.tangled.repo.issue.comment</a> in the relay as well.</p>\n<p><a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue/3ljvbt4zni322\"> <img src=\"/images/tangled-ss-4.webp\" alt=\"%c\" > </a></p>\n<p>Even the <a href=\"https://bsky.app/profile/tangled.sh/post/3ljv6wpioxc2q\">repository stars</a> are on the relay; see for example <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.feed.star/3ljvbtbrhew22\">this</a> entry for <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo/3ljv45bhfql22\">knot-docker</a> that I did. The Tangled developers just added support for stars <a href=\"https://tangled.sh/@tangled.sh/core/commit/662bd012caec9c2bd2a15e1dcfe184d5b2c49ff9#file-lexicons%2fstar.json\">a few hours ago</a>, and that changeset is a nice way to see how to add a new lexicon entry.</p>\n<p><a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> then reminded me of his project a few years ago to run <a href=\"/ideas/version-control-matrix\">git pull requests over Matrix chat</a>. 
It would indeed be very cool if the pull request model on Tangled evolved into something more message-oriented like <a href=\"https://git-scm.com/docs/git-send-email\">git-send-email</a>, in order to let us try out more personalised workflows than GitHub PRs.</p>\n<h2 id=\"why-this-fits-in-so-well-with-the-rest-of-bluesky\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#why-this-fits-in-so-well-with-the-rest-of-bluesky\"></a>Why this fits in so well with the rest of Bluesky</h2>\n<p>The ATProto developers also released their <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring\">roadmap for early 2025</a> today, and it aligns really well with some of the production features I would need to completely shift over to a service like Tangled.</p>\n<p>The first, and most vital one, is <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring#auth-scopes\">auth scopes</a> to control the permissions of an app password to only certain operations. Once this is in the protocol, then a client to manage Tangled repositories could use a differently privileged password from the main social client.</p>\n<p>Secondly, <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring#privately-shared-data-and-e2ee-dms\">privately shared data</a> and <a href=\"https://www.ietf.org/blog/mls-secure-and-usable-end-to-end-encryption/\">encrypted DMs using MLS</a> point to how private code repositories could work. 
<a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and I were discussing the difficulty of access-controlled replication over the Internet just yesterday, and I'm starting to believe that ATProto has the right balance of ergonomics and good design to make solving this problem much, much easier.</p>\n<p>If you'd like to try this out, then the <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker/\">Knot Docker</a> repository welcomes your issues!</p>\n<small class=\"credits\">\n<p>Many thanks to Zicklag and icyphox on <a href=\"https://web.libera.chat/#tangled\">tangled IRC</a> for helping me out with debugging the Knot setup and <a href=\"https://tangled.sh/@tangled.sh/core/commit/477da124ad0bdeeab5b621b81999683256ab7a4b\">fixing bugs in real-time</a>. 12th Mar 2025: updated with <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> comments.</p>\n</small><h1>References</h1><ul><li>Madhavapeddy (2025). Using AT Proto for more than just Bluesky posts. <a href=\"https://doi.org/10.59350/32rdt-zny05\" target=\"_blank\"><i>10.59350/32rdt-zny05</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/disentangling-git-with-bluesky",
      "title": "Socially self-hosting source code with Tangled on Bluesky",
      "summary": "Self-host source code with Tangled on Bluesky for decentralized Git repositories.",
      "date_published": "2025-03-08T00:00:00.000000Z",
      "date_modified": "2025-03-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting",
        "identity",
        "distributed",
        "security",
        "docker",
        "bluesky",
        "ocaml"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/32rdt-zny05",
          "doi": "10.59350/32rdt-zny05",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/qxz6t-xre87",
      "content_html": "<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> organised this week's <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a> group <a href=\"https://www.cst.cam.ac.uk/seminars/list/229027\">discussion</a> on what AI tools we use for our daily work.  I was immediately struck by how <em>few</em> tools there are that are actually making us more productive, so I jotted down notes as the discussion was going on.</p>\n<ul>\n<li>Personally, the only tool I've found that's (only just recently) making me more productive is agentic coding, which I <a href=\"/notes/claude-copilot-sandbox\">wrote about a few days ago</a>.  Since then, I've been mildly obsessively forking off ideas I've wanted to try for years (like converting RFCs to OCaml code) and greatly enjoying myself. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and I have been looking for how to do this more ethically, and the best I ran across was the <a href=\"https://www.ibm.com/impact/ai-ethics\">IBM AI ethics</a> guidance and their <a href=\"https://github.com/ibm-granite/granite-code-models\">granite models</a>, but not much else. Any pointers to other models that don't violate open source licensing norms would be gratefully accepted; I'm using Claude 3.7 here, but don't feel great doing so!</li>\n<li><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> described his use of <a href=\"https://fathom.video/\">Fathom</a> for note-taking, and (having been on the receiving end) can confirm it does a very good transcription job.</li>\n<li><a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> has a local <a href=\"https://stabledifffusion.com/\">Stable Diffusion</a> image generator to help create local content for presentations/etc, but the setup broke when going from macOS 13 to 15 (Sequoia). 
Apple seem to have changed something in <a href=\"https://developer.apple.com/metal/\">Metal</a> so the existing HuggingFace installation (mostly <a href=\"https://developer.apple.com/metal/pytorch/\">pyTorch-metal</a> and the <a href=\"https://developer.apple.com/metal/tensorflow-plugin/\">Tensorflow MPS</a> backend) was out of date with the system Metal libraries. Package management for these tightly integrated hardware/software inference systems is pretty bad right now (<a href=\"https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html\">nvidia-container-toolkit</a> is another bag of hacks for containerised applications).</li>\n</ul>\n<p>Then there's a long list of things that people <em>aren't</em> using because they suck. LLM-driven searches are pretty inaccurate, as many people noted; I use <a href=\"https://kagi.com\">Kagi</a> but only because I love their AI-filtered search results, not because of their assistant! I've turned off Apple Intelligence on all my devices, not because of privacy concerns, but because it's just utter crap -- the summaries are actually <a href=\"https://www.bbc.co.uk/news/articles/cq5ggew08eyo\">incorrect</a> half the time. I find the autocorrect features similarly distracting and wrong most of the time, and normal spellcheckers do a better job in practice.</p>\n<h2 id=\"wheres-this-going\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#wheres-this-going\"></a>Where's this going?</h2>\n<p>Our discussion then moved into developing news of emerging tools and techniques, since the field overall is just moving incredibly fast. Three things I've been reading this week are:</p>\n<ul>\n<li>With <a href=\"https://awards.acm.org/about/2024-turing\">RL winning the Turing award</a> this week, some folks investigated whether lightweight open-weight models could reach the performance of advanced heavy frontier models in terms of deductive reasoning. 
They applied RL to train an LLM for the <a href=\"https://openpipe.ai/blog/using-grpo-to-beat-o1-o3-mini-and-r1-on-temporal-clue\">game of temporal clue</a>, and their post describes many neat tricks (including the use of <a href=\"https://developers.google.com/optimization/cp/cp_solver\">CP-SAT</a> to generate difficult-but-solvable game scenarios). They applied <a href=\"https://arxiv.org/abs/2402.03300\">GRPO</a> (as made famous by <a href=\"/notes/deepseek-r1-advances\">DeepSeek</a>) to do the RL loop of solving puzzles via model responses, grading groups of responses, and fine-tuning the model using clipped policy gradients derived from these group estimates. Their results <a href=\"https://openpipe.ai/blog/using-grpo-to-beat-o1-o3-mini-and-r1-on-temporal-clue\">were impressive</a> and reached frontier-model performance using Qwen 14B!</li>\n<li>And for something completely different, another team released their <a href=\"https://google-research.github.io/self-organising-systems/difflogic-ca/\">Differentiable Logic Cellular Automata</a> paper which describes how to go from the <a href=\"https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life\">Game of Life</a> to full pattern generation using learned recurrent circuits. This one should really be read in its entirety to appreciate how incredible it might become in the future, as it would allow us to generate distributed systems that can build a very complex end-goal pattern by following a set of simple rules. 
<a href=\"https://coomeslab.org\">David Coomes</a> pointed out to me recently that the question of <a href=\"https://www.wired.com/story/mystery-solved-how-plant-cells-know-when-to-stop-growing/\">why cells stop growing</a> has only very recently been understood in traditional biology, and yet here we are applying ML to the case.</li>\n<li><a href=\"https://mistral.ai/fr/news/mistral-ocr\">Mistral OCR</a> came out today and seems to be the state of the art in multi-modally breaking down documents into a consistent linear structure. Their results show that they can break down complex PDFs in multiple languages into seemingly clean HTML with semantic structure (such as tables, equations, figures and so on). I've only just finished running <a href=\"/projects/ce\">millions of papers</a> through <a href=\"https://grobid.readthedocs.io/en/latest/\">Grobid</a>, so this is next on the queue to try out...</li>\n</ul>\n<p>So, I guess the TL;DR of our discussion was that current AI tools are the first generation, but we're heading rather rapidly into new frontiers of discovery, so there's only going to be more of them coming up...</p><h1>References</h1><ul><li>Madhavapeddy (2025). Oh my Claude, we need agentic copilot sandboxing right now. <a href=\"https://doi.org/10.59350/aecmt-k3h39\" target=\"_blank\"><i>10.59350/aecmt-k3h39</i></a></li>\n<li>Madhavapeddy (2025). Deepdive into Deepseek advances. <a href=\"https://doi.org/10.59350/r06z7-0ht06\" target=\"_blank\"><i>10.59350/r06z7-0ht06</i></a></li>\n<li>Shao et al (2024). DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2402.03300\" target=\"_blank\"><i>10.48550/arXiv.2402.03300</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/the-state-of-ai-tools",
      "title": "Our EEG group discussion on 'useful' AI tools",
      "summary": "EEG group discusses useful AI tools and emerging techniques in productivity and research.",
      "date_published": "2025-03-07T00:00:00.000000Z",
      "date_modified": "2025-03-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "llms",
        "ai",
        "eeg",
        "computerlab"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/aecmt-k3h39",
          "doi": "10.59350/aecmt-k3h39",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r06z7-0ht06",
          "doi": "10.59350/r06z7-0ht06",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2402.03300",
          "doi": "10.48550/arXiv.2402.03300",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/1jxq1-7e147",
      "content_html": "<p>I started pushing OCaml Docker images over to the <a href=\"https://hub.docker.com/r/ocaml/opam\">Docker Hub</a> around 2017, to support the burgeoning automated build infrastructure around the use of the language. Back then, OCaml 4.06 was the latest release, and so I wrote an <a href=\"https://github.com/ocurrent/ocaml-version/blob/master/CHANGES.md\">ocaml-version</a> library to track the release metadata. It has been a bit of a success disaster, as that library now <a href=\"https://github.com/ocurrent/ocaml-version/blob/master/CHANGES.md\">tracks</a> every release of OCaml in the modern era, and also backs the <a href=\"https://github.com/ocurrent/docker-base-images\">automatic building</a> of a huge array of compiler versions and variants across <a href=\"https://images.ci.ocaml.org/?distro=debian-12&amp;\">Linux</a> and <a href=\"https://images.ci.ocaml.org/?distro=windows-msvc&amp;\">Windows</a>.</p>\n<p>The problem is...we're now building the full set of images from OCaml 4.02 onwards through to the latest OCaml 5.3.0 release, which is unsustainable for obvious reasons; despite the hosting being kindly <a href=\"https://www.docker.com/community/open-source/application/\">sponsored by Docker</a>, we must also consider the <a href=\"https://ocaml.org/policies/carbon-footprint\">carbon footprint</a> of our infrastructure.\nSo the question for the OCaml community: <strong>are there any remaining users who still need images earlier than OCaml 4.08 or can we stop pushing those now?</strong></p>\n<p><a href=\"https://github.com/hannesm\">Hannes Mehnert</a> led an effort to deprecate compilers earlier than 4.08 <a href=\"https://discuss.ocaml.org/t/opam-repository-archival-phase-2-ocaml-4-08-is-the-lower-bound/15965\">in the opam-repo</a>, and now <a href=\"https://www.tunbury.org/\">Mark Elvers</a> is asking the same question <a 
href=\"https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229\">on the OCaml discussion forum</a> about the Docker image infrastructure. The latter lags the opam repository since there still may be operational use cases among industrial users who depend on older compilers, even if they don't use the latest package repository.  So if you <em>are</em> using a really old OCaml and depend on our infrastructure, we'd appreciate you chiming in on the <a href=\"https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229\">forum thread</a> or just contact <a href=\"https://www.tunbury.org/\">Mark Elvers</a> or myself directly to let us know.</p>\n<p>On another note, it's also quite difficult on the central <a href=\"https://hub.docker.com/\">Docker Hub</a> to get statistics per-tag as to how many people are using each image. Does anyone have any recommendations on whether we should deploy our own &quot;proxy registry&quot; before pushing through to the central Docker Hub, or alternative open source registries to run our own?</p>",
      "url": "https://anil.recoil.org/notes/deprecating-ocaml-408",
      "external_url": "https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229",
      "title": "Are you still using OCaml 4.08 or earlier? If so, we need to know",
      "summary": "OCaml users: share your needs regarding older versions to help determine support for OCaml 4.08 and earlier.",
      "date_published": "2025-03-05T00:00:00.000000Z",
      "date_modified": "2025-03-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/aecmt-k3h39",
      "content_html": "<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> nerdsniped me last week into getting OCaml to drive the 80s-retro <a href=\"https://www.adafruit.com/product/2345\">RGB Matrix</a> displays. I grabbed one from the local Pi Store and soldered it together with help from <a href=\"https://mynameismwd.org\">Michael Dales</a>. But instead of writing OCaml bindings by hand, we thought we'd try out the latest agentic CLI called <a href=\"https://github.com/kodu-ai/claude-code\">Claude Code</a> released <a href=\"https://ai-claude.net/\">last week</a> to see if we could entirely autogenerate the bindings.</p>\n<p><div class=\"video-center\"><iframe title=\"Using Claude Code to get OCaml to drive a Matrix HAT\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/59cb8699-9cb0-46d0-a9d9-a82338fd7452\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p><em>TL;DR:</em> Claude Code generated working OCaml code almost from scratch, ranging from C bindings to high-level OCaml interface files and even Cmdliner terms, but needs a more sophisticated sandboxing model before something goes horribly wrong. So much potential and so much danger awaits us. Coincidentally <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and I <a href=\"/papers/2024-hope-bastion\">wrote</a> about this a few months ago. Read on...</p>\n<h2 id=\"wiring-up-the-display-to-my-raspberry-pi\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#wiring-up-the-display-to-my-raspberry-pi\"></a>Wiring up the display to my Raspberry Pi</h2>\n<p>The RGB Matrix display has a very nice C++ <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix\">rpi-rgb-led-matrix</a> library, so I fired up my Raspberry Pi 4 to get an OCaml development environment going by using that. 
The included <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix/tree/master/examples-api-use\">demo</a> immediately gave me a disappointingly noisy display, but my larger-than-usual 64x64 display turned out to just need a jumper soldered.</p>\n<p><img src=\"/images/rgb-matrix-hat-ocaml-2.webp\" alt=\"%c\" title=\"Deploying my local friendly agentic soldering machine otherwise known as Michael Dales\" ></p>\n<p>As soon as that was soldered, the examples worked great out of the box, so I could get on with some agentic OCaml coding. Thanks <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://web.makespace.org/\">CamMakespace</a>!</p>\n<h2 id=\"building-ocaml-bindings-using-claude-coder\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#building-ocaml-bindings-using-claude-coder\"></a>Building OCaml bindings using Claude Code</h2>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> and I first played around with using <a href=\"https://dev.realworldocaml.org/foreign-function-interface.html\">ocaml-ctypes</a> to build the bindings by hand, but quickly switched over to trying out Claude Sonnet 3.7, first in VSCode and then directly on the Pi CLI via <a href=\"https://github.com/anthropics/claude-code\">Claude Code</a>. The latter fires up an interactive session where you not only input prompts, but it can also <em>run shell commands</em> including builds.</p>\n<p>The very first hurdle was sorting out the build rules. This is the one place where Claude failed badly; it couldn't figure out <a href=\"https://dune.readthedocs.io/en/latest/quick-start.html\">dune files</a> at all, nor the intricate linking flags required to find and link to the C++ library. I made those changes quickly by hand, leaving just a stub <code>librgbmatrix_stubs.c</code> that linked successfully with the main C++ library, but didn't do much beyond that.  
I also added near-empty <code>rgb_matrix.ml</code> and <code>rgb_matrix.mli</code> interface files to have a place for the OCaml side of the interface.</p>\n<p><img src=\"/images/claude-coder-ss-1.webp\" alt=\"%c\" title=\"The Claude Code CLI runs fine on the Raspberry Pi 4, since most of the heavy computation is done on their end.\" ></p>\n<p>After that, it was just a matter of &quot;asking the Claude Code CLI&quot; via a series of prompts to get it to fill in the code blanks I'd left. The VSCode Copilot editing mode has to be told which files to look at within the project for its context, but I didn't have to do that with the Claude Code CLI.</p>\n<p>Instead, I just prompted it to generate C stubs from the <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix/blob/master/include/led-matrix-c.h\">led-matrix-c.h</a> C interface (so it didn't get distracted attempting to bind C++ to OCaml, which isn't a winning proposition). It duly generated reasonable low-level bindings, along with the right OCaml interface files by suggesting edits to the files I'd created earlier.  At this point, I got a very basic &quot;hello world&quot; circle going (with the test binary also built by Claude).</p>\n<p><img src=\"/images/rgb-matrix-hat-ocaml-3.webp\" alt=\"%c\" title=\"The OCaml bindings and concentric circles were all auto-generated by Claude Sonnet 3.7\" ></p>\n<p>Although the generated bindings built fine, they segfaulted when I first ran the test binary!  Claude 3.7 bound some C/OCaml functions with more than 5 arguments, which are a special case in OCaml due to <a href=\"https://ocaml.org/manual/5.3/intfc.html#ss:c-prim-impl\">differing bytecode and native code ABIs</a>.  Although Claude <em>almost</em> got it right, it subtly mixed up the order of the stub names in the <code>external</code> binding on the OCaml side. 
The correct version is:</p>\n<pre><code>external set_pixels_native :\n  t -&gt; int -&gt; int -&gt; int -&gt; int -&gt; Color.t array -&gt; unit =\n  &quot;caml_led_canvas_set_pixels_bytecode&quot; &quot;caml_led_canvas_set_pixels&quot;\n</code></pre>\n<p>The bytecode C stub comes first, and the native code second, but Claude swapped them which led to memory corruption. This mixup would ordinarily be rather hard to spot, but the <a href=\"https://valgrind.org/\">valgrind</a> backtrace led me to the problem very quickly (but only because I'm very familiar with the OCaml FFI!).  I couldn't convince Claude to fix this with prompting as it kept making the same mistake, so I swapped the arguments manually and committed the results by hand.</p>\n<h2 id=\"generating-higher-level-ocaml-interfaces-and-docstrings\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#generating-higher-level-ocaml-interfaces-and-docstrings\"></a>Generating higher level OCaml interfaces and docstrings</h2>\n<p>Once the basics were in place, I then asked it to refine the OCaml interface to be higher-level; for example instead of a <code>string</code> for the hardware mode, could it scan the C header file, find the appropriate <code>#defines</code>, and generate corresponding OCaml <a href=\"https://dev.realworldocaml.org/variants.html\">variant types</a>? Incredibly, it not only did this, but <em>also</em> generated appropriate OCamldoc annotations for those types from the C header files.</p>\n<p><img src=\"/images/claude-coder-ss-2.webp\" alt=\"%c\" title=\"These OCamldoc entries are generated automatically from the C header files\" ></p>\n<p>The Claude Code CLI then helpfully summarises all the changes, and also offers to execute dune to check the result works! 
This is starting to get a bit mad...</p>\n<p><img src=\"/images/claude-coder-ss-3.webp\" alt=\"%c\" title=\"Claude offers to do the dune build after making code changes\" ></p>\n<p><img src=\"/images/claude-coder-ss-4.webp\" alt=\"%c\" title=\"It can also navigate the output of commands to see if the desired outcome is successful\" ></p>\n<p><img src=\"/images/claude-coder-ss-5.webp\" alt=\"%c\" title=\"The patches to the interface and implementation added in more abstract types as requested\" ></p>\n<p>The OCaml interfaces generated here required a little iteration to get right, with some manual tweaks. Claude, for some reason, generated duplicate entries for some type definitions, which OCaml doesn't permit. I fixed those manually very quickly, and then asked Claude Code to commit the changes to git for me. It generated a <a href=\"https://github.com/yminsky/rpi-rgb-led-matrix/pull/3/commits/70c7739696ca207245dfdbc80c5d6d08fe2fce79\">good summary commit message</a>. The interfaces were all documented with docs from the C header file, such as:</p>\n<pre><code>type multiplexing =\n  | DirectMultiplexing (* 0: Direct multiplexing *)\n  | Stripe             (* 1: Stripe multiplexing *)\n  | Checker            (* 2: Checker multiplexing (typical for 1:8) *)\n  | Spiral             (* 3: Spiral multiplexing *)\n  | ZStripe            (* 4: Z-Stripe multiplexing *)\n  | ZnMirrorZStripe    (* 5: ZnMirrorZStripe multiplexing *)\n  | Coreman            (* 6: Coreman multiplexing *)\n  | Kaler2Scan         (* 7: Kaler2Scan multiplexing *)\n  | ZStripeUneven      (* 8: ZStripeUneven multiplexing *)\n  | P10MapperZ         (* 9: P10MapperZ multiplexing *)\n  | QiangLiQ8          (* 10: QiangLiQ8 multiplexing *)\n  | InversedZStripe    (* 11: InversedZStripe multiplexing *)\n  | P10Outdoor1R1G1_1  (* 12: P10Outdoor1R1G1_1 multiplexing *)\n  | P10Outdoor1R1G1_2  (* 13: P10Outdoor1R1G1_2 multiplexing *)\n                       (* ...etc &lt;snipped&gt; *)\n  | Custom of int      
(* Custom multiplexing as an integer *)\n</code></pre>\n<p>Pretty good! After that, I couldn't resist pushing it a bit further. I asked the CLI to generate me a good command-line interface using <a href=\"https://github.com/dbuenzli/cmdliner\">Cmdliner</a>, which is normally a fairly intricate process that involves remembering the <a href=\"https://erratique.ch/software/cmdliner/doc/Cmdliner/Term/index.html\">Term/Arg DSL</a>. But Claude aced this; it generated a huge series of CLI converter functions like this:</p>\n<pre><code>(* scan_mode conversion *)\n  let scan_mode_conv =\n    let parse s =\n      match String.lowercase_ascii s with\n      | &quot;progressive&quot; -&gt; Ok Progressive\n      | &quot;interlaced&quot; -&gt; Ok Interlaced\n      | _ -&gt; Error (`Msg &quot;scan_mode must be 'progressive' or 'interlaced'&quot;)\n    in\n    let print fmt m =\n      Format.fprintf fmt &quot;%s&quot;\n        (match m with\n         | Progressive -&gt; &quot;progressive&quot;\n         | Interlaced -&gt; &quot;interlaced&quot;)\n    in\n    Arg.conv (parse, print)\n</code></pre>\n<p>These are not entirely what I'd write, as <a href=\"https://erratique.ch/software/cmdliner/doc/Cmdliner/Arg/index.html#val-enum\">Cmdliner.Arg.enum</a> would suffice, but they're fine as-is and could be refactored later. 
I even got it to complete the job and generate a combined options parsing function for the (dozens) of command-line arguments, which would have been <em>very</em> tedious to do by hand:</p>\n<pre><code>(* Apply options from command line to Options.t *)\nlet apply_options options\n    ~rows ~cols ~chain_length ~parallel ~hardware_mapping ~brightness \n    ~pwm_bits ~pwm_lsb_nanoseconds ~pwm_dither_bits ~scan_mode ~row_address_type \n    ~multiplexing ~disable_hardware_pulsing ~show_refresh_rate ~inverse_colors\n    ~led_rgb_sequence ~pixel_mapper_config ~panel_type ~limit_refresh_rate_hz \n    ~disable_busy_waiting =\n  Options.set_rows options rows;\n  Options.set_cols options cols;\n  Options.set_chain_length options chain_length;\n  Options.set_parallel options parallel;\n  Options.set_hardware_mapping options hardware_mapping;\n  Options.set_brightness options brightness;\n  Options.set_pwm_bits options pwm_bits;\n  Options.set_pwm_dither_bits options pwm_dither_bits;\n  Options.set_scan_mode options scan_mode;\n  Options.set_pixel_mapper_config options pixel_mapper_config;\n  Options.set_panel_type options panel_type;\n  Options.set_limit_refresh_rate_hz options limit_refresh_rate_hz;\n  Options.set_disable_busy_waiting options disable_busy_waiting;\n  (* ...etc &lt;snipped&gt; *)\n  options\n</code></pre>\n<p>Once this compiled, I asked for a rotating 3D cube demo, and it duly used the bindings to give me a full command-line enabled generator which you can see below. I just ran:</p>\n<pre><code>rotating_block_generator.exe --disable-hardware-pulsing -c 64 -r 64 --hardware-mapping=adafruit-hat  --gpio-slowdown=2\n</code></pre>\n<p>and I had a spinning cube on my display! 
The code model had no problem with the matrix transformations required to render the cool spinning effect.</p>\n<p><div class=\"video-center\"><iframe title=\"Using Claude Code to get OCaml to drive a Matrix HAT\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/59cb8699-9cb0-46d0-a9d9-a82338fd7452\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>Of course, I had to pay the piper for the truckload of GPUs that drove this code model. At one point, the Claude Code agent got into a loop that I had to manually interrupt as it kept oscillating on a code fix without ever finding the right solution. This turned out to have sucked up quite a lot of money from my Claude API account!</p>\n<p><img src=\"/images/claude-coder-ss-6.webp\" alt=\"%c\" title=\"This post cost me a cup of coffee and a boatload of energy\" ></p>\n<p>Overall, I'm impressed. There's clearly some <a href=\"https://arxiv.org/abs/2502.18449\">RL or SFT</a> required to teach the code model the specifics of OCaml and its tooling, but the basics are already incredible. <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://jon.recoil.org\">Jon Ludlam</a> and I are having a go at this in the coming months.</p>\n<h2 id=\"claude-code-is-powerful-but-it-can-doanythingto-your-machine\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#claude-code-is-powerful-but-it-can-doanythingto-your-machine\"></a>Claude Code is powerful, but it can do...anything...to your machine</h2>\n<p>The obvious downside of this whirlwind binding exercise is that while the NPM-based Claude Code asks nicely before it runs shell commands, <em>it doesn't have to ask</em>. I happened to run it inside a well-sandboxed <a href=\"https://docker.com\">Docker</a> container on my rPi, but most people probably won't. 
And in general, we need a more sophisticated security model; running the agent within a coarse sandbox that limits access to the file system, the network, and other sensitive resources is too restrictive, as we want to provide access to these resources for certain agentic tasks!</p>\n<p>So in a happy coincidence, this leads to a line of research that <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> started last year with something we <a href=\"/news/2024-hope-bastion-1\">presented at HOPE 2024</a>. We explored how to express more precise constraints on what an AI can do by the use of the scary-sounding <a href=\"https://anil.recoil.org/papers/2024-hope-bastion.pdf\">Dijkstra monad</a>.  It's far easier to understand by perusing the <a href=\"https://anil.recoil.org/slides/2024-hope-bastion-slides.pdf\">slides</a> of the talk, or by watching <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a>'s great <a href=\"https://www.youtube.com/watch?v=U9H9xU-8-qc&amp;list=PLyrlk8Xaylp7OQNLeCGS0j2fjEnvIWL9u\">video presentation</a>.</p>\n<p>We're mainly concerned with situations where the AI models are running over sensitive codebases or datasets. Consider three scenarios we want to handle, which are logical extensions of the agentic coding one above:</p>\n<ol>\n<li>Modify or ignore sensor data to minimize the extent of habitat loss in a <a href=\"/papers/2024-terracorder\">biodiversity monitoring</a> setup. <em>But we may want to be able to delete duplicate sensor data in some phases of the analysis.</em></li>\n<li>Leak location sightings of vulnerable species to poachers. <em>But we still want to be able to work with this data to design effective interventions; we want a sandbox that limits information flows, in a statistical sense (differential privacy).</em></li>\n<li>Enact an intervention that may not satisfy legal constraints. 
<em>We want a sandbox that requires that a sound causal argument has been formulated</em></li>\n</ol>\n<p>For each of these, we could use a <a href=\"https://en.wikipedia.org/wiki/Capability-based_security\">capability security</a> model where access to sensitive data and effects can occur only via unforgeable capabilities granted explicitly. And the generation of that specification could also be done via code LLMs, but needs to target a verification friendly language like <a href=\"https://fstar-lang.com\">Fstar</a>. The prototype <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> built looks like this:</p>\n<pre><code>module type CapDataAccess (readonly : list dir, writable : list dir)\n  (* abstract monad *)\n  type Cmd a\n  val return : a -&gt; Cmd a\n  val bind : Cmd a -&gt; ( a -&gt; Cmd b ) -&gt; Cmd b\n  (* only allows access to given directories *)\n  val readfile : path -&gt; Cmd string\n  (* only allows writes to writable dirs *)\n  val writefile : path -&gt; string -&gt; Cmd ()\n</code></pre>\n<p>And then you can use this rich specification to add constraints, for example see this <a href=\"https://github.com/patricoferris/hope-2024/tree/main/simple-json\">JSON parsing example</a> from the Fstar prototype:</p>\n<pre><code>(* Following IUCN's Globally Endangered (GE) scoring *)\nlet datamap = [\n&quot;iberian-lynx.geojson&quot;, O [ &quot;rarity&quot;, Int 2 ];\n&quot;bornean-elephant.geojson&quot;, O [ &quot;rarity&quot;, Int 3 ]\n]\n\n(* We add some additional predicates on the files allowed to be used *)\n@|-1,9 +1,10 ==========================================\n| (ensures (fun _ -&gt; True))\n| (requires (fun _ _ local_trace -&gt;\n| dont_delete_any_file local_trace /\\\n+| all_paths_are_not_endangered readonly /\\\n| only_open_some_files local_trace readonly))\n|}\n</code></pre>\n<p>Once you have this specification, then it's a matter of implementing fine-grained OS-level sandboxing policies to interpret and enforce them. 
Spoiler: we're working on such a system, so I'll write about that just as soon as it's more self-hosting; this area is moving incredibly fast.</p>\n<small class=\"credits\">\n<p>Thanks to <a href=\"https://mynameismwd.org\">Michael Dales</a> for help soldering. For the curious, here's the <a href=\"https://github.com/yminsky/rpi-rgb-led-matrix/pull/3\">PR with the code</a>, but it shouldn't go anywhere near any real use until we've had a chance to review the bindings carefully. There needs to be a new, even more buyer-beware no-warranty license for AI generated code!</p>\n</small><h1>References</h1><ul><li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li>\n<li>Wei et al (2025). SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2502.18449\" target=\"_blank\"><i>10.48550/arXiv.2502.18449</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/claude-copilot-sandbox",
      "title": "Oh my Claude, we need agentic copilot sandboxing right now",
      "summary": "Claude Code auto-generates OCaml bindings, but lacks robust sandboxing.",
      "date_published": "2025-03-02T00:00:00.000000Z",
      "date_modified": "2025-03-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "hardware",
        "ocaml",
        "llm",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2408.02407",
          "doi": "10.48550/arXiv.2408.02407",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2502.18449",
          "doi": "10.48550/arXiv.2502.18449",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/agfta-8wk09",
      "content_html": "<p>The <a href=\"https://ietf.org\">IETF</a> <a href=\"https://bsky.app/profile/ietf.org/post/3lj6w5fpjx22u\">announced</a> their new <a href=\"https://www.ietf.org/blog/aipref-wg/\">AI Preferences Working Group</a> (AIPREF), which will <em>&quot;work on standardizing building blocks that allow for the expression of preferences about how content is collected and processed for Artificial Intelligence models&quot;</em>. This is quite well timed; the IETF tries not to standardise too early before there is <a href=\"https://www.ietf.org/runningcode/\">running code</a> but also needs to move before it's too late and a bad defacto standard is <a href=\"https://datatracker.ietf.org/doc/html/rfc7282\">chosen</a>.  The AI world seems to be at that nexus point right about now, with <a href=\"https://openai.com/index/introducing-gpt-4-5/\">GPT 4.5</a> seemingly hitting a <a href=\"https://www.newscientist.com/article/2470327-is-openai-hitting-a-wall-with-huge-and-expensive-gpt-4-5-model/\">scaling wall</a> and possibly triggering the start of a renewed data scraping frenzy.</p>\n<h2 id=\"how-do-websites-interact-with-ai-crawlers-right-now\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#how-do-websites-interact-with-ai-crawlers-right-now\"></a>How do websites interact with AI crawlers right now?</h2>\n<p>I've found when developing my own website there are a number of approaches to interacting with automated data crawlers. For the record, over 90% of the traffic to this site is from automated sources, so it's a material concern for <a href=\"\">##selfhosting</a> infrastructure.</p>\n<ol>\n<li><strong>Ban all bots; humans only plz:</strong> I don't want to do this, as I'd like to opt into my writing training next generation foundation models, but would like some agency over how much I need to pay for them to get their data (I am covering the bandwidth costs here, after all), so I just need them to cooperate more to avoid flooding my site. 
If I do want to ban them, the excellent <a href=\"https://github.com/ai-robots-txt/ai.robots.txt/blob/main/table-of-bot-metrics.md\">ai-robots</a> crew maintain a useful list of bad bots.</li>\n<li><strong>Ban some bots with a robots.txt:</strong> <a href=\"https://www.rfc-editor.org/rfc/rfc9309.html\">RFC9309</a> allows for the discrimination of web-crawlers via a <a href=\"/robots.txt\">robots.txt</a>. We nowadays have not just a few big crawlers mirroring the Internet (like <a href=\"https://developers.google.com/search/docs/crawling-indexing/googlebot\">Googlebot</a> and <a href=\"https://en.wikipedia.org/wiki/Bingbot\">Bingbot</a>), but seemingly thousands of variants competing for the data gold rush (or in my case, for <a href=\"/projects/ce\">conservation research</a>!)  The <code>robots.txt</code> doesn't give us enough control to usefully rate-limit across all of these, unfortunately. You need to regenerate the file every time there are new URLs on the site that don't fit a longest-prefix match. This, combined with having a mega <a href=\"https://sitemaps.org\">sitemaps</a> file, is a lot of non-cacheable metadata that's just adding to my serving load.</li>\n<li><strong>Add server-side throttling for specific bots:</strong> On the assumption that there are a bunch of bad bots that mimic good bots, what I really need is to start rate-throttling them all! This is where I am today, and ended up hacking together a bunch of OCaml code for <a href=\"/notes/bushel-lives\">this</a> website to track all the robots request rates and slow down over-eager ones. 
The rest of the Internet are mostly just asking Cloudflare to take care of this for them, which results in a <a href=\"/notes/uk-national-data-lib\">world of pain</a> for anyone outside of their world view.</li>\n<li><strong>Just give the bots what they want, which is Markdown:</strong> Since I can't really win the throttling wars in the long term, can I just give the bots what they want, which is the core text without all the HTML around it? The first thing these crawlers do is to tokenize the HTML anyway! There is <a href=\"https://llmstxt.org/\">llms.txt</a> emerging for this. I author my website in Markdown in the first place, and then transform it into the HTML you see here. But it looks like the <a href=\"https://llmstxt.org/domains.html\">llms.txt guidelines</a> insist on just one page at the root of the site, and not one Markdown per page. This is probably better for reducing crawling traffic, but it would be a large page even for my humble homepage.</li>\n<li><strong>Can I just give you a tarball with my stuff so you leave me alone?:</strong> I rebuild my site regularly, so I could just provide the AI bots with a convenient tar/zip of my entire website content, but put it in a common place so I don't have to pay for the download bandwidth. This could include my images, videos, and source markdown which could be used not only for training, but for <a href=\"https://archive.org/\">archival</a> as well. 
We don't seem to have a common protocol to map URLs to static archives right now, although there are a <a href=\"https://en.wikipedia.org/wiki/Web_archive_file\">few web archive</a> formats flying around.</li>\n</ol>\n<h2 id=\"the-role-of-the-ietf-is-to-create-protocols-not-mandate-implementations\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-role-of-the-ietf-is-to-create-protocols-not-mandate-implementations\"></a>The role of the IETF is to create protocols, not mandate implementations</h2>\n<p>The IETF has a valuable role to play here to establish a consensus around what a sensible, usable <em>protocol</em> for exchanging data on our websites might look like, rather than mandating any specific backend technology or storage format.\nThere is a lot of nuance around sharing content over HTTP: it supports <a href=\"https://http.dev/authentication\">authentication</a>, <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control\">caching</a>, <a href=\"http://web.archive.org/web/20190904190534/https://www.dirv.me/blog/2011/07/18/understanding-403-forbidden/index.html\">access control</a>, <a href=\"https://matt-jackson.com/seo-glossary/http-429/\">rate limiting</a>, and many other features hidden behind a seemingly simple <a href=\"https://datatracker.ietf.org/doc/html/rfc2616\">request-response specification</a>.</p>\n<p>I'm hoping that the AIPREF process will end up with something that gives me something closer to 5) above than 1). I need an HTTP-based mechanism by which I can express my preferences for AI crawling, and cooperate with the crawlers so that I can ensure maximum collective benefit to both people and bots visiting my site, rather than withdrawing behind a gated community of humans only. 
However, I think that this requires the establishment of a protocol to help sequence the HTTP requests together and not just a single static file like <code>llms.txt</code> or <code>sitemap.xml</code>.</p>\n<p>Back in the 90s, I <a href=\"/papers/netapp-tr-3071\">worked</a> <a href=\"/papers/netapp-tr-3152\">on</a> NetApp/<a href=\"https://en.wikipedia.org/wiki/NetCache\">NetCache</a> with <a href=\"https://www.netskope.com/press-releases/netskope-john-martin-chief-product-officer\">John Martin</a>. Bandwidth used to be expensive and so we deployed edge caches that could <em>modify</em> website content with local modifications to common global content. Consider, for example, a local news website that might want to show mostly cached global news, but also modify the HTML to include local news content. You can do that today via JavaScript, but back then the only way was to have a protocol to modify the static HTML. The <a href=\"https://datatracker.ietf.org/doc/rfc3507/\">Internet Content Adaptation Protocol</a> was the IETF's answer to creating a structured HTTP-like protocol to allow edge modifications from proxy servers:</p>\n<blockquote>\n<p>ICAP is, in essence, a lightweight protocol for executing a &quot;remote procedure call&quot; on HTTP messages.  It allows ICAP clients to pass HTTP messages to ICAP servers for some sort of transformation or other processing (&quot;adaptation&quot;).  The server executes its transformation service on messages and sends back responses to the client, usually with modified messages.  Typically, the adapted messages are either HTTP requests or HTTP responses.\n<cite>-- <a href=\"https://datatracker.ietf.org/doc/rfc3507/\">RFC3507</a>, IETF</cite></p>\n</blockquote>\n<p>One of the coolest features of ICAP is that it didn't mandate the transformation mechanism, just the protocol. The proxies deployed at the edge networks would get a vector into transforming the data stream. 
NetCache shipped an implementation of ICAP, and Squid <a href=\"https://www.egirna.com/blog/news-2/configure-squid-v6-2-on-ubuntu-server-22-and-use-it-with-icap-18\">still supports</a> it. What would a similar approach look like for allowing crawlers into your site's content, but leaving lots of freedom for the details of this to be delegated to the crawlers and servers?</p>\n<h2 id=\"challenges-in-an-open-data-hoovering-protocol\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#challenges-in-an-open-data-hoovering-protocol\"></a>Challenges in an open data hoovering protocol</h2>\n<p><a href=\"https://bsky.app/profile/aftnet.bsky.social\">Antoine Fressancourt</a> <a href=\"https://bsky.app/profile/aftnet.bsky.social/post/3ljcw2uawe22c\">identifies</a> the main problem facing AIPref:</p>\n<blockquote>\n<p>Given the reports that some current LLM models have been trained on data corpus obtained illegally, I have some doubts that AIPref will be respected.</p>\n</blockquote>\n<p>This is <a href=\"https://www.tomshardware.com/tech-industry/artificial-intelligence/meta-staff-torrented-nearly-82tb-of-pirated-books-for-ai-training-court-records-reveal-copyright-violations\">true</a> for the current generation of data crawlers, but it is also where the <em>opportunity</em> lies for AIPref. Without a systematic way to support <a href=\"/notes/uk-national-data-lib\">replication of non-public data</a>, the situation will get even worse as custom apertures are created into data silos without any integrity underlying them.</p>\n<p>The main reason for having a protocol-based solution is that we could support the strong authentication and identification of bots. If (for example) the GoogleBot supplied a token with every HTTP request to fetch my content, I could track its use and perhaps even get compensation for the bandwidth costs. 
The current methods of <a href=\"https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot\">bot verification</a> all seem quite weak; for example, they are just IP-based checks.</p>\n<p>This would in turn open a path to the disciplined negotiation for access controlled data bilaterally between crawlers and hosters. More and more content publishers are signing various <a href=\"https://www.monda.ai/blog/ultimate-list-of-data-licensing-deals-for-ai\">exclusive deals</a> with AI training companies. Irrespective of your opinion on such deals, a protocol to make it easier to authenticate bots strongly would make the establishment (and ongoing negotiation) of those mechanisms far easier to handle.</p>\n<p>We are also seeing rapid adoption of the <a href=\"https://github.com/modelcontextprotocol\">Model Context Protocol</a> released a few months ago. This establishes a <a href=\"https://github.com/modelcontextprotocol/specification\">JSON-RPC specification</a> for LLM clients and data providers to talk to each other locally. It seems odd to me that we'd have a rich &quot;local&quot; specification for data exchange like this for RAG-like systems, but not have one in the wide area across the Internet.  As the chair of the AIPREF group <a href=\"https://mnot.net\">Mark Nottingham</a> notes, <a href=\"https://www.mnot.net/blog/2024/11/29/platforms\">platform advantages</a> are not just network effects, so there may be deep repercussions into the economics of AI here:</p>\n<blockquote>\n<p>In short: there are less-recognised structural forces that push key Internet services into centralized, real-time advertising-supported platforms. 
Along with factors like network effects and access to data, they explain some of why the Internet landscape looks like it does.\n<cite>-- <a href=\"https://www.mnot.net/blog/2024/11/29/platforms\">Mark Nottingham</a></cite></p>\n</blockquote>\n<p>Just substitute &quot;advertising-supported&quot; with &quot;AI&quot; above and the trend becomes clear. The protocol designs we choose today will form structural forces that decide the future of what the post-advertising driven Internet culture and content architecture look like. It would be a nice outcome to establish open protocols that are somewhere in between the <a href=\"https://github.com/punkpeye/awesome-mcp-clients\">MCP clients</a> and <a href=\"https://en.wikipedia.org/wiki/Web_server\">HTTP servers</a> to facilitate a more equitable outcome rather than pooling all the data to a few big players.</p>\n<p>The other consideration here is that such an open protocol could have utility far beyond &quot;just&quot; managing AI training bots and address the general problem we have that <a href=\"/notes/uk-national-data-lib\">replicating datasets with access control is difficult</a>. This would help the good folk at <a href=\"https://archive.org/\">Archive.org</a> to manage <a href=\"https://help.archive.org/help/how-to-download-files/\">restricted access</a> data sets that might want to become eventually open. There are also geospatial datasets such as <a href=\"https://www.gbif.org/\">biodiversity data</a> that need help managing how they are mirrored, but with access restrictions for <a href=\"https://india.mongabay.com/2025/02/commentary-how-data-deficiency-is-hindering-hydro-diplomacy-between-china-and-india/\">geopolitical reasons</a>.</p>\n<p>Luckily, the IETF do a lot of things over email, so I've signed up to the <a href=\"https://mailman3.ietf.org/mailman3/lists/ai-control.ietf.org/\">AIPREF mailing list</a> to learn more as it develops and hopefully participate!</p>\n<small class=\"credits\">\n<p>Changelog. 
Mar 1st 2025: Thanks to <a href=\"https://mynameismwd.org\">Michael Dales</a> for spotting typos, and <a href=\"https://bsky.app/profile/aftnet.bsky.social\">Antoine Fressancourt</a> for helpful clarifying questions on Bluesky.</p>\n</small><h1>References</h1><ul><li>Madhavapeddy (2025). Thoughts on the National Data Library and private research data. <a href=\"https://doi.org/10.59350/fk6vy-5q841\" target=\"_blank\"><i>10.59350/fk6vy-5q841</i></a></li>\n<li>Madhavapeddy (2025). Arise Bushel, my sixth generation oxidised website. <a href=\"https://doi.org/10.59350/0r62w-c8g63\" target=\"_blank\"><i>10.59350/0r62w-c8g63</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ai-ietf-aiprefs",
      "title": "The AIETF arrives, and not a moment too soon",
      "summary": "IETF's new AI Preferences Working Group aims to standardize protocols for expressing preferences on AI content collection and processing.",
      "date_published": "2025-02-28T00:00:00.000000Z",
      "date_modified": "2025-02-28T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ietf",
        "protocols",
        "ai",
        "llms"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/fk6vy-5q841",
          "doi": "10.59350/fk6vy-5q841",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0r62w-c8g63",
          "doi": "10.59350/0r62w-c8g63",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/t7ekw-e7y39",
      "content_html": "<p>This week I've been reading three really nice pieces of work by my\ncolleagues, in the form of a <a href=\"https://www.nature.com/articles/s44358-025-00022-3\">review paper</a> on biodiversity and AI,\na <a href=\"https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14503\">benchmark</a> for 3D forest reconstruction using laser scanners and a <a href=\"https://github.com/MingyueX/GreenLens\">mobile app</a> for measuring the width of tree trunks. A real bonanza for forest lovers!</p>\n<h2 id=\"review-paper-on-mapping-opportunities-for-ai-in-biodiversity\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#review-paper-on-mapping-opportunities-for-ai-in-biodiversity\"></a>Review paper on mapping opportunities for AI in biodiversity</h2>\n<p>A paper on '<a href=\"https://www.nature.com/articles/s44358-025-00022-3\">Harnessing AI to fill global shortfalls in biodiversity knowledge</a>' just came out in Nature Biodiversity today (via <a href=\"http://oisin.info\">Oisin Mac Aodha</a>).  They start with the baseline present uses of AI (camera traps, acoustic monitoring and improved data analysis) which are pretty well known to anyone in the field, but then introduce <a href=\"https://www.nature.com/articles/s44358-025-00022-3/figures/1\">a lovely diagram of future uses</a> of AI for biodiversity which includes:</p>\n<ul>\n<li>Rapid retrieval of existing information means both looking into existing literature, but also the digitisation of existing museum specimens. Coincidentally, I have just posted a new student project on the area of <a href=\"/ideas/digitisation-of-insects\">insect digitisation at the Zoology museum</a> from <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki%0A\">Tiffany Ki</a> on the latter topic, which I'd be very happy to hear from interested students about. 
I have also been working on <a href=\"/papers/2024-ce-llm\">LLM driven evidence retrieval</a> recently, so I'm all in favour of lots more projects in this space.</li>\n<li>Once the data is retrieved, they discuss how this could be used for richer hypothesis generation via detection of new patterns for humans to review, ranking high-value areas that need more observations, and generally doing more unsupervised learning over the vast space. This is a good zooming in from many of the general areas covered in the <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Royal Society Science in the Age of AI</a> report as well, and very good to see given the sheer urgency of more action in the field of biodiversity conservation.</li>\n<li>Finally, there's also the fascinating topic of ecological modelling where we move from individual species to whole communities, as well as knowledge-guided machine learning towards this. I'm planning on experimenting more with differentiable models (beyond ABMs, where both <a href=\"/ideas/differentiable-abm\">differentiable</a> and <a href=\"/ideas/rev-abm\">reversible</a> have worked very well). The recent paper on <a href=\"https://www.nature.com/articles/s41586-024-07744-y\">NeuralGCM</a> from the Google team underlined the huge potential of combining purely data-driven and purely-computational models into a combined system with much better predictive power than either by itself.</li>\n</ul>\n<p>Those interested in this may also want to look at our recent <a href=\"/papers/2024-ai-conhorizon\">horizon scan on AI and conservation</a> from a few months ago. 
The field is moving so quickly that I wouldn't be surprised if both of these were obsolete a year from now!</p>\n<p><a href=\"/ideas/digitisation-of-insects\"> <img src=\"/images/umzc-4.webp\" alt=\"%c\" title=\"If you like biodiversity, consider working with me on this project!\" > </a></p>\n<h2 id=\"benchmark-dataset-for-tree-species-identifications\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#benchmark-dataset-for-tree-species-identifications\"></a>Benchmark dataset for tree species identifications</h2>\n<p>And then out in MEE is a comprehensive paper from a collection of forestry researchers on a <a href=\"https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14503\">benchmark for tree species classification from proximal laser scanners</a>. Their <a href=\"https://zenodo.org/records/13255198\">FOR-species20k</a> dataset is on Zenodo, and has tons of tree point clouds taken using a variety of laser scanning techniques (<a href=\"https://www.earthscope.org/what-is/tls/\">TLS</a>, <a href=\"https://www.sciencedirect.com/science/article/pii/S1618866723003710\">MLS</a> and <a href=\"https://www.gispro.pl/en/products/unmanned-laser-scanning-uls/\">ULS</a>).</p>\n<p>As <a href=\"https://www.geog.cam.ac.uk/people/lines/\">Emily Lines</a> notes:</p>\n<blockquote>\n<p>Most importantly, we demonstrate that community efforts and open science are the only way to make significant progress in this important task. 
With more researchers publishing and sharing high quality 3D forest datasets, I hope we see an end of single-site studies and that proper and broad benchmarking of all new 3D forest deep learning methods becomes the standard.\n<cite>-- <a href=\"https://www.linkedin.com/posts/emily-lines-2b271a80_openscience-ai-deeplearning-activity-7292116486519676928-XfwF\">Emily Lines on LinkedIn</a></cite></p>\n</blockquote>\n<p>I was learning more about <a href=\"/papers/2024-hyper-tropical-mapping\">tree species identification for tropical species</a> last year, so I'm looking forward to delving more into laser scanning techniques soon from this work.</p>\n<h2 id=\"a-mobile-app-for-measuring-tree-trunks\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-mobile-app-for-measuring-tree-trunks\"></a>A mobile app for measuring tree trunks</h2>\n<p>And last but not least, I was delighted to see that my colleagues <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> released the extremely cool mobile phone app they've been working on for some time along with a <a href=\"https://www.sciencedirect.com/science/article/pii/S1574954124003169?via%3Dihub#s0125\">paper in Ecological Informatics</a>. It is a simple and elegant app that can measure the diameter of a tree trunk (more specifically, the <a href=\"https://en.wikipedia.org/wiki/Diameter_at_breast_height\">DBH</a>) just using standard cameraphone hardware on most modern-ish Android phones.</p>\n<p>I was lucky enough to beta test this and try it out on my <a href=\"/notes/compass2024-ric-tripreport\">recent trip to India</a>, and <a href=\"https://github.com/MingyueX/GreenLens\">GreenLens</a> is now open source as well.</p>\n<blockquote>\n<p>Other apps for measuring forest plots are available [...] 
but those for Android phones tend not to perform as well as ours, while those designed for the iPhone require the purchase of a high-end phone that is not affordable for researchers in the Global South.</p>\n<p>We believe ours is the only app to sit in the 'sweet spot' of offering high quality for low cost.\n-- <cite><a href=\"https://www.cst.cam.ac.uk/using-ai-see-wood-trees\">Frank and Keshav on cam.ac.uk</a></cite></p>\n</blockquote>\n<p><img src=\"/images/pups-india-1.webp\" alt=\"%c\" title=\"I actually got quite distracted while trying to beta test GreenLens in India as I ran across these adorable stray street puppies, which seems important to post\" ></p><h1>References</h1><ul><li>Ball et al (2024). Harnessing temporal & spectral dimensionality to identify individual trees in tropical forests. bioRxiv. <a href=\"https://doi.org/10.1101/2024.06.24.600405\" target=\"_blank\"><i>10.1101/2024.06.24.600405</i></a></li>\n<li>Reynolds et al (2024). The potential for AI to revolutionize conservation: a horizon scan. <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\" target=\"_blank\"><i>10.1016/j.tree.2024.11.013</i></a></li>\n<li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li>\n<li>Madhavapeddy (2024). COMPASS 2024 report on the CoRE stack RIC meeting. <a href=\"https://doi.org/10.59350/p7kck-5bt81\" target=\"_blank\"><i>10.59350/p7kck-5bt81</i></a></li>\n<li>Pollock et al (2025). Harnessing artificial intelligence to fill global shortfalls in biodiversity knowledge. Nature Reviews Biodiversity. <a href=\"https://doi.org/10.1038/s44358-025-00022-3\" target=\"_blank\"><i>10.1038/s44358-025-00022-3</i></a></li>\n<li>(0). Fig. 
1: Potential roles of artificial intelligence in filling biodiversity knowledge gaps and downstream applications. | Nature Reviews Biodiversity. <a href=\"https://doi.org/https://www.nature.com/articles/s44358-025-00022-3/figures/1\" target=\"_blank\"><i>https://www.nature.com/articles/s44358-025-00022-3/figures/1</i></a></li>\n<li>Kochkov et al (2024). Neural general circulation models for weather and climate. Nature. <a href=\"https://doi.org/10.1038/s41586-024-07744-y\" target=\"_blank\"><i>10.1038/s41586-024-07744-y</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/forest-apps-and-benchmarks",
      "title": "A trio of papers I read on biodiversity and forests this week",
      "summary": "Exploring biodiversity and forests through 3 papers on AI, 3D reconstruction, and a tree-measuring mobile app.",
      "date_published": "2025-02-20T00:00:00.000000Z",
      "date_modified": "2025-02-20T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forests",
        "biodiversity",
        "conservation",
        "sensing",
        "ai",
        "llms"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1101/2024.06.24.600405",
          "doi": "10.1101/2024.06.24.600405",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1016/j.tree.2024.11.013",
          "doi": "10.1016/j.tree.2024.11.013",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/p7kck-5bt81",
          "doi": "10.59350/p7kck-5bt81",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s44358-025-00022-3",
          "doi": "10.1038/s44358-025-00022-3",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/https://www.nature.com/articles/s44358-025-00022-3/figures/1",
          "doi": "https://www.nature.com/articles/s44358-025-00022-3/figures/1",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41586-024-07744-y",
          "doi": "10.1038/s41586-024-07744-y",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/fk6vy-5q841",
      "content_html": "<p>Over the past year, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been getting an object lesson in how the modern Internet handles researcher access to data, as we've been downloading tens of millions of research papers towards our <a href=\"/projects/ce\">Conservation Evidence</a> project. This is legally possible via our <a href=\"https://www.lib.cam.ac.uk/stories/student-guide-libraries\">institutional subscriptions</a> that give us license to fulltexts, and the incredibly helpful <a href=\"https://uk.linkedin.com/in/james-caudwell-60681766\">head of electronic services</a> at the University Library who wields encyclopedic knowledge of each of our agreements with the hundreds of publishers out there. My thoughts on this then segwayed into recent conversations I've been having about the emerging <a href=\"https://takes.jamesomalley.co.uk/p/wtf-is-the-national-data-library\">National Data Library</a> and also with the UK <a href=\"https://www.wildlifetrusts.org/\">Wildlife Trusts</a>...</p>\n<h2 id=\"the-difficulty-of-access-controlled-bulk-data-downloads\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-difficulty-of-access-controlled-bulk-data-downloads\"></a>The difficulty of access controlled bulk data downloads</h2>\n<p>In late 2023, once we got past the legal aspects of downloading closed access papers<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> it was still remarkably difficult to <em>actually</em> gain access to the actual paper datasets themselves.  For instance, a select few hurdles include:</p>\n<ul>\n<li><a href=\"https://www.cloudflare.com/\">Cloudflare</a> got in the way <em>all</em> the time, preventing batch downloading by throwing <a href=\"https://en.wikipedia.org/wiki/CAPTCHA\">CAPTCHAs</a> down the wire. Each publisher has to individually allowlist our one hardworking IP, and it can take months for them to do this and it's never quite clear when we have been allowed. 
So I hacked up <a href=\"https://www.zenrows.com/blog/undetected-chromedriver-vs-selenium-stealth\">dodgy stealth downloaders</a> even though we're meant to have access via the publisher.</li>\n<li>Many official <a href=\"https://www.springernature.com/gp/researchers/text-and-data-mining\">text mining</a> APIs for publishers such as Elsevier and Springer do not provide PDF access, and only give an <a href=\"https://www.elsevier.com/en-gb/researcher/author/policies-and-guidelines/elsevier-xml-dtds-and-transport-schemas\">XML equivalent</a> which is both inconsistent in its schemas and misses diagrams. Luckily there are great projects like <a href=\"https://grobid.readthedocs.io/en/latest/\">Grobid</a> to normalise some of these with very <a href=\"https://github.com/kermitt2/Pub2TEI/pull/18\">responsive</a> maintainers.</li>\n<li>There are existing archival indices for the PDFs that <a href=\"https://docs.openalex.org/api-entities/works/work-object/location-object\">point to preprints</a> around the web, but <a href=\"https://commoncrawl.org/blog/january-2025-crawl-archive-now-available\">CommonCrawl</a> truncates <a href=\"/ideas/grey-lit-crawl\">downloads</a> to their first megabyte, and the <a href=\"https://archive.org/details/UNPAYWALL-PDF-CRAWL-2019-04\">archive.org unpaywall</a> crawls are restricted access for licensing reasons. So I built a crawler to get these ourselves (I'm glad I wrote the first <a href=\"https://github.com/mirage/ocaml-cohttp\">cohttp</a> now!)</li>\n<li>Bulk download still involves individual HTTP queries with various rate throttling mechanisms that all vary slightly, making me an expert in different <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429\">HTTP 429</a> response headers. 
There's not much sign of <a href=\"https://graphql.org/\">batch query</a> interfaces anywhere, probably because of the difficulty of access checking for each individual result.</li>\n<li>The <a href=\"https://pmc.ncbi.nlm.nih.gov/tools/ftp/#pdf\">NIH PMC</a> only have one hard-working rate-throttled FTP server for PDFs, which I've been slowly mirroring using a hand-crafted OCaml FTP client since Nov 2024 (almost done!)</li>\n<li>Meanwhile, because this is happening through allowlisting of specific IPs, I then got my Pembroke office kicked off the Internet due to automated abuse notifications going to the <a href=\"https://www.uis.cam.ac.uk/\">UIS</a> who turn netblocks off before checking (fair enough, it could be malware). But it would have been easier to run these downloads through <a href=\"/papers/2010-iswp-dustclouds\">dust clouds</a> than try to do it properly by registering the addresses involved, eh?</li>\n</ul>\n<p>The situation is better for open access downloads, where projects such as <a href=\"https://core.ac.uk/\">Core</a> offer easier bulk access and large metadata databases like <a href=\"https://openalex.org\">OpenAlex</a> use '<a href=\"https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html\">downloader pays</a>' S3 buckets. And in other domains like satellite data, there is still a lot of complexity in obtaining the data, but <a href=\"https://github.com/sentinel-hub/sentinelhub-py\">programming wrappers</a> make implementing the (often terabyte-level) downloads much more palatable. For our recent <a href=\"/papers/2024-life\">LIFE</a> biodiversity maps, we also make them available on services like <a href=\"https://zenodo.org/records/14188450\">Zenodo</a> as they are open.</p>\n<p>The lesson I took away from this is that it's really difficult to deal with large sensitive datasets where selective <em>access control</em> is required, and also that sort of data is rarely mirrored on the open web for obvious reasons. 
But in the <a href=\"https://www.theatlantic.com/health/archive/2025/02/trump-science-data-gender-dei/681698/\">current climate</a>, it's utterly vital that we move to protect human health or <a href=\"https://www.nature.com/articles/s41559-023-02226-2\">biodiversity data</a> gathered over decades that is irreplaceable once lost. And beyond data loss, if the data is present but not accessible, then what's the point in gathering it in the first place? It's also really important not to blame the existing publishers of these datasets, who are getting overwhelmed by <a href=\"https://perishablepress.com/ultimate-ai-block-list/\">AI bots</a> making huge numbers of requests to their infrastructure. So I'm getting energised by the idea of a cooperative solution among all the stakeholders involved.</p>\n<h2 id=\"enter-the-national-data-library\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#enter-the-national-data-library\"></a>Enter the National Data Library</h2>\n<p>You can imagine my excitement late last year when I got a call from the Royal Society to show up bright and early for a mysterious speech by Rishi Sunak. He duly <a href=\"https://www.gov.uk/government/speeches/prime-ministers-speech-on-ai-26-october-2023\">announced</a> the government's AI summit that mostly focussed on <a href=\"https://www.gov.uk/government/topical-events/ai-safety-summit-2023\">safety</a>, but a report by <a href=\"https://sciencesuperpower.substack.com/i/144202375/investing-in-public-goods\">Onward</a> caught my eye by recommending that <em>&quot;the Government should establish a British Library for Data – a centralised, secure platform to collate high-quality data for scientists and start-ups&quot;</em>. 
I wasn't down for the &quot;centralised&quot; part of this, but I generally liked the library analogy and the curation it implied.</p>\n<p><img src=\"/images/rishi-sunak-rs-ai-1.webp\" alt=\"%c\" title=\"Seeing Rishi Sunak and, more importantly, the back of my PhD supervisor Andy Hopper's head.\" ></p>\n<p>Then in 2025, with Sunak dispatched back to <a href=\"https://en.wikipedia.org/wiki/Richmond_and_Northallerton_(UK_Parliament_constituency)\">Richmond</a>, Labour took up the reins with their <a href=\"https://www.gov.uk/government/publications/ai-opportunities-action-plan/ai-opportunities-action-plan\">AI Action Plan</a>. While this report started predictably with the usual need for acres of GPU-filled datacenters, it continued onto something much more intriguing via the creation of a &quot;National Data Library&quot;:</p>\n<blockquote>\n<ul>\n<li>Rapidly identify at least 5 high-impact public datasets it will seek to make available [...] Prioritisation should consider the potential economic and social value of the data, as well as public trust, national security, privacy, ethics, and data protection considerations.</li>\n<li>Build public sector data collection infrastructure and finance the creation of new high-value datasets that meet public sector, academia and startup needs.</li>\n<li>Actively incentivise and reward researchers and industry to curate and unlock private datasets.\n<cite>-- <a href=\"https://www.gov.uk/government/publications/ai-opportunities-action-plan/ai-opportunities-action-plan\">AI Opportunities Action Plan</a>, Jan 2025</cite></li>\n</ul>\n</blockquote>\n<p>This takes into account many more of the nuances of getting access to public data. It identifies the need for data curation, and also the costs of curating such private datasets and ensuring correct use.  
The announcement spurred on a number of excellent thoughts from around the UK web about the implications, particularly from <a href=\"https://gavinfreeguard.com/\">Gavin Freeguard</a> who wrote about <a href=\"https://gavin-freeguard.medium.com/how-should-we-think-about-a-national-data-library-dd2d47edee8b\">how we should think about an NDL</a>. Gavin identified one particularly difficult element of exposing private data:</p>\n<blockquote>\n<p>[...] analogy with the National Data Library suggests that there might be some materials available to everyone, and some restricted to specialist researchers. There may be different access models for more sensitive material. There may be better and worse options — bringing together all the data in one place for accredited researchers to access [...] would be a logistical and security nightmare [...] may be possible to keep the data where it already is, but provide researchers with the ability to access different systems.\n<cite>-- <a href=\"https://gavin-freeguard.medium.com/how-should-we-think-about-a-national-data-library-dd2d47edee8b\">Gavin Freeguard</a></cite></p>\n</blockquote>\n<p>Others also <a href=\"https://theodi.org/news-and-events/blog/how-to-build-a-national-data-library/\">identified</a> that the centralised library analogy only goes so far, and that we should focus on <a href=\"https://peterkwells.com/2024/12/18/the-national-data-library-should-help-people-deliver-trustworthy-data-services/\">building trustworthy data services instead</a> and on <a href=\"https://www.adruk.org/news-publications/news-blogs/the-new-uk-government-wants-a-national-data-library-a-brilliant-aspiration-if-built-on-solid-foundations/\">the real lifecycle of the data</a> usage:</p>\n<blockquote>\n<p>[...] this means that the latest data is already there in the &quot;library&quot; [...] researchers don't first need to work with the data owners to create it [...] 
bodies of knowledge around using these complex datasets can be built up over time.</p>\n<p>Researchers can share code and derived data concepts, so the researchers that come after can iterate, refine, and build on what has gone before. None of this was possible with the previous &quot;create and destroy&quot; model of accessing these types of datasets, which was hugely inefficient\n<cite>-- <a href=\"https://www.adruk.org/news-publications/news-blogs/the-new-uk-government-wants-a-national-data-library-a-brilliant-aspiration-if-built-on-solid-foundations/\">Administrative Data Research</a> UK</cite></p>\n</blockquote>\n<p>Gosh, this network effect sounds an awful lot like what I experienced as a <a href=\"\">Docker</a> maintainer, which had its incredible <a href=\"https://www.docker.com/blog/docker-index-dramatic-growth-in-docker-usage-affirms-the-continued-rising-power-of-developers/\">popularity</a> fuelled by tapping into users building <em>and sharing</em> their own software packaging rather than depending on third parties to do it for them.  If we could unlock the power of crowds here but go one step further and enforce privacy constraints on the underlying data and code, then the technical solution could be both usable and secure. I'm still not quite sure what that balance of UI would look like, but we're <a href=\"/projects/plancomp\">working on it</a> spearheaded by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>'s research areas.</p>\n<p>The Wellcome and ESRC have also put together a <a href=\"https://zenodo.org/communities/wellcome/records?q=&amp;f=subject%3AData%20Library&amp;l=list&amp;p=1&amp;s=10&amp;sort=newest\">series of whitepapers</a> about the challenges and potential approaches behind the NDL (via <a href=\"https://en.wikipedia.org/wiki/Nick_McKeown\">Nick McKeown</a>). 
I'm still going through them in detail, but the <a href=\"https://zenodo.org/records/14671714\">modular approach</a> paper makes sensible observations about not trying to build one enormous national database and to not outsource it all to one organisation to build. Instead, they espouse a <a href=\"https://zenodo.org/records/14672004\">federated architectural</a> approach.</p>\n<p><a href=\"https://zenodo.org/records/14672004\"> <img src=\"/images/federated-ndl-ss-1.webp\" alt=\"%c\" title=\"Sourced from https://zenodo.org/records/14672004\" > </a></p>\n<p>Since their primary (but not only) usecase focuses on <a href=\"https://ukhealthdata.org/\">health data</a>, there is an emphasis on moving the computation and data around rather than pooling it:</p>\n<blockquote>\n<p>The project's overlay mesh network dynamically and securely connects all the required resources. The\nmesh network creates a transient, project-specific, secure network boundary such that all the project’s\ncomponents are within one overarching safe setting\n<cite>-- <a href=\"https://zenodo.org/records/14672004\">A federated architecture for a National Data Library</a></cite></p>\n</blockquote>\n<p>This isn't a million miles away from how we set up <a href=\"https://docs.docker.com/engine/network/tutorials/overlay/\">overlay networks</a> on cloud infrastructure, but with the added twist of putting in more policy enforcement upfront.</p>\n<ul>\n<li>On the programming languages side, we're seeing exciting progress on <a href=\"https://github.com/MLanguage/mlang\">formalising legal systems</a> which encourages <a href=\"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4291177\">pair programming with lawyers</a> to capture the nuances of policy accurately (and <a href=\"https://news.law.northwestern.edu/sarah-lawsky-worked-on-a-tax-law-code-that-the-french-government-deemed-officially-awesome/\">pronounced 'awesome'</a> by the French government).</li>\n<li>At a systems level, <a 
href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a> recently published <a href=\"https://cs.brown.edu/people/malte/pub/papers/2024-sosp-sesame.pdf\">Sesame</a> which provides end-to-end privacy sandboxing guarantees, and there is classic work on <a href=\"https://www.usenix.org/conference/nsdi-08/securing-distributed-systems-information-flow-control\">DIFC</a> that we've been using <a href=\"/papers/2023-raid-deluminator\">more recently</a> in secure enclave programming.</li>\n<li>From a machine learning perspective, my colleague <a href=\"https://mlsys.cst.cam.ac.uk/\">Nic Lane</a>'s work on <a href=\"https://www.cam.ac.uk/research/news/can-federated-learning-save-the-world\">federated learning</a> via <a href=\"https://flower.ai/\">Flower</a> seems to be everywhere right now with its own <a href=\"https://flower.ai/events/flower-ai-summit-2025/\">summit</a> coming up.</li>\n</ul>\n<p>However, it's not all plain sailing, as there is also mega-controversy ongoing with the UK government's <a href=\"https://takes.jamesomalley.co.uk/p/ask-the-computer-people-first#footnote-anchor-3-156712689\">surprising</a> demands for an <a href=\"https://www.bbc.co.uk/news/articles/c20g288yldko\">encryption backdoor</a> into iCloud, leading to even more of a <a href=\"https://www.theregister.com/2025/02/13/us_demand_uk_apple_backdoor_close/\">geopolitical tangle</a> with the US. Irrespective of what happens with this particular case, it's clear that any end-to-end encryption in these federated systems will need to deal with the reality that jurisdictions will have different lawful decryption needs, so <a href=\"https://statusq.org/archives/2025/02/16/13063/\">end-to-end encryption may be at an end</a> for initiatives like the NDL. 
Add onto this the flagrant <a href=\"https://shujisado.org/2025/01/27/significant-risks-in-using-ai-models-governed-by-the-llama-license/\">disregard for licensing</a> in current pretrained language models but also the movement <a href=\"https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence\">to revise copyright laws</a> to legislate around this, and it's clear that technology will need to be fluid in adapting to matters of provenance tracking as well.</p>\n<p>There's definitely a rich set of academic literature in this space, combined with interesting constraints, and so I'll pull this together into an annotated bibtex soon!</p>\n<h2 id=\"who-are-some-users-of-such-a-service\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#who-are-some-users-of-such-a-service\"></a>Who are some users of such a service?</h2>\n<p>To get some more inspiration on a technical solution, I've been looking to users of such an infrastructure to understand what easy-to-use interfaces might look like.</p>\n<p>My colleague <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> over at <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> co-led a recent report into <a href=\"https://ai.cam.ac.uk/reports/access-to-data-case-studies\">case studies for the NDL</a> which is very much worth a read. From a conservation perspective, <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> both <a href=\"https://ai.cam.ac.uk/blog/conserving-with-code-how-data-is-helping-to-save-our-planet\">gave input</a> about the importance of having such infrastructure for <a href=\"/projects/ce\">evidence-driven landuse</a>.</p>\n<blockquote>\n<p>What would be helpful, according to Dr Jaffer, is more\nstandardisation between publishers for Open Access material\nunder permissive licences.\n[...] 
having a coherent archive for OA materials that are licensed\nin such a way that they can be used for data mining without\nany technical hurdles would be the ideal scenario for this kind\nof research, as well as for a National Data Library,\n<cite>-- <a href=\"https://ai.cam.ac.uk/projects/access-to-data-case-studies\">Access to Data for Research</a>, AI@CAM</cite></p>\n</blockquote>\n<p><a href=\"https://ai.cam.ac.uk/projects/access-to-data-case-studies\"> <img src=\"/images/ai-cam-data-library.webp\" alt=\"%c\" title=\"The extremely cool doodle on the workshop from AI@Cam\" > </a></p>\n<p>Another very different group I talked to back in 2023 via Rosalind Goodfellow as part of her <a href=\"https://www.csap.cam.ac.uk/network/rosalind-goodfellow/\">CSaP</a> fellowship was the <a href=\"https://www.gov.uk/government/organisations/geospatial-commission\">Geospatial Commission</a> who began work on a <a href=\"https://www.gov.uk/guidance/national-underground-asset-register-nuar\">National Underground Asset Register</a>. The NUAR was initially restricted to &quot;<a href=\"https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1148100/NUAR_FAQs__.pdf\">safe dig</a>&quot; usecases and not exposed more widely for security and other concerns. In 2024, they subsequently <a href=\"https://gdsgeospatial.blog.gov.uk/2024/01/11/discovering-potential-opportunities-for-the-national-underground-asset-register/\">reported</a> great interest in expanded usecases and are doing a discovery project on how to expose this information via APIs. 
This seems like an ideal usecase for some of the access control needs discussed above, as it's not only a lot of data (being geospatial) but also updated quite frequently and not necessarily something to make entirely public (although <a href=\"https://x2n.com/blog/how-utility-companies-are-using-satellite-technology/\">satellite pipeline monitoring</a> is perhaps obsoleting this need).</p>\n<p>And a month ago after reading our <a href=\"/papers/2024-ai-conhorizon\">horizon scan for AI and conservation</a> paper, <a href=\"https://samreynolds.org\">Sam Reynolds</a>, <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> and I got invited by <a href=\"https://uk.linkedin.com/in/craig-bennett3\">Craig Bennett</a> to a remarkable dinner with the assembled leaders of all 46 of the UK's <a href=\"https://www.wildlifetrusts.org/\">wildlife trusts</a>. They are a collective of independent charities who together maintain wildlife areas across the UK, with most people living near one of their 2300+ parks (more than there are UK McDonald's branches!). Over the course of dinner, we heard from every single one of them, with the following gist:</p>\n<ul>\n<li>The 46 nature charities work independently and by consensus, but recently have been building more central coordination around their use of systematic biodiversity data gathering across the nation. 
They are building a data pool across all of them, which is important as the sensing they do is very biased both spatially and across species (we know lots about <a href=\"https://www.rspb.org.uk/whats-happening/big-garden-birdwatch\">birds</a>, less about <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">hedgehogs</a>).</li>\n<li>The charities recognise that they need to take more risks as the pressures on UK nature are currently <a href=\"https://www.wildlifetrusts.org/news/new-report-reveals-drought-now-considered-biggest-risk-uk-nature-reserves\">immense</a>, which means harnessing their data and AI responsibly to both accelerate action and also to recruit more participation from a broader cross-section of the UK population for citizen science input but also just to experience it.</li>\n<li><a href=\"https://www.conservationevidence.com\">Conservation evidence</a> is important to them, and sharing data from one area to replicate that action elsewhere in the UK is essential but difficult to engineer from scratch. There's a real cost to generating this data, and some confusion about appropriate licensing strategies. I gave a somewhat mixed message here reflecting my own uncertainty about the right way forward: on one hand, restricted licensing might prevent their data being hoovered up by the big tech companies who give peanuts back in return, but then again the bad actors in this space would simply <a href=\"https://www.vox.com/technology/2023/7/27/23808499/ai-openai-google-meta-data-privacy-nope\">ignore</a> the licensing and the good actors probably <a href=\"https://www.weforum.org/stories/2023/01/davos23-ai-divide-global-north-global-south/\">can't afford</a> it.</li>\n</ul>\n<p>The trusts are operating on a fairly shoestring budget already, so they're a great candidate to benefit from a collective, federated National Data Library. 
In particular, if the NDL can nail down a <a href=\"https://www.gov.uk/working-with-trade-unions/collective-bargaining\">collective bargaining</a> model for data access to big tech companies, this could finance the collection costs among smaller organisations throughout the four nations. The same holds true for thousands of small organisations around the UK that could benefit from this infrastructure and kickstart more <a href=\"https://lookingforgrowth.uk/\">sustainable growth</a>.</p>\n<p><img src=\"/images/wildlife-trusts-homerton.webp\" alt=\"%c\" title=\"The assembled CEOs of the Wildlife Trusts taught me an awful lot about hedgehogs that evening\" ></p>\n<p>I'm organising a get-together on the topic of <a href=\"/projects/plancomp\">planetary computing</a> next month with <a href=\"https://www.cs.cornell.edu/~jnfoster/\">Nate Foster</a> and a number of colleagues from around the world, so stay tuned for more updates in this space in the coming months! Your thoughts, as always, are most welcome.</p>\n<small class=\"credits\">\n<p><em>(Thanks <a href=\"https://samreynolds.org\">Sam Reynolds</a> for the notes on what we discussed with the Wildlife Trusts)</em></p>\n</small>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>This largely involved talking to individual publishers and agreeing not to directly train generative AI models and to keep them private to our own research use. Fairly reasonable stuff.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Tarkhani et al (2023). Information Flow Tracking for Heterogeneous Compartmentalized Software. ACM. <a href=\"https://doi.org/10.1145/3607199.3607235\" target=\"_blank\"><i>10.1145/3607199.3607235</i></a></li>\n<li>Mortier et al (2010). 
Using Dust Clouds to Enhance Anonymous Communication. Springer. <a href=\"https://doi.org/10.1007/978-3-662-45921-8_10\" target=\"_blank\"><i>10.1007/978-3-662-45921-8_10</i></a></li>\n<li>Reynolds et al (2024). The potential for AI to revolutionize conservation: a horizon scan. <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\" target=\"_blank\"><i>10.1016/j.tree.2024.11.013</i></a></li>\n<li>Buschke et al (2023). Make global biodiversity information useful to national decision-makers. Nature Ecology & Evolution. <a href=\"https://doi.org/10.1038/s41559-023-02226-2\" target=\"_blank\"><i>10.1038/s41559-023-02226-2</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/uk-national-data-lib",
      "title": "Thoughts on the National Data Library and private research data",
      "summary": "Exploring the National Data Library and its potential to improve access to private research data while balancing security and privacy concerns.",
      "date_published": "2025-02-17T00:00:00.000000Z",
      "date_modified": "2025-02-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "network",
        "storage",
        "distributed",
        "opensource"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3607199.3607235",
          "doi": "10.1145/3607199.3607235",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1007/978-3-662-45921-8_10",
          "doi": "10.1007/978-3-662-45921-8_10",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1016/j.tree.2024.11.013",
          "doi": "10.1016/j.tree.2024.11.013",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41559-023-02226-2",
          "doi": "10.1038/s41559-023-02226-2",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-food-life-2",
      "content_html": "<p>We've uploaded a revised preprint on our ongoing work on quantifying the <a href=\"/papers/2024-food-life\">biodiversity cost of global food consumption</a>, lead by <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>.  This is based on the <a href=\"/notes/2024-life-3\">recently published</a> <a href=\"/projects/life\">LIFE</a> metric, combined with supply chain data and provenance modeling.</p>\n<p>Of particular interest may be the new analysis in the <a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">supplementary material</a> which analyses the biodiversity cost of common food types against their size as a &quot;functional unit&quot; (that is, how much is typically consumed). The results don't come too differently at a macro scale, but the breakdown is interesting to see.</p>\n<p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\"> <img src=\"/images/food-functional-service-units-preprint-v2.webp\" alt=\"%c\" > </a></p>\n<p><a href=\"https://github.com/mor1\">Richard Mortier</a> also pointed out some related work that he did on this topic back in 2015 at <a href=\"https://horizon.ac.uk\">Horizon</a>, where they did an ethanographic study on &quot;<a href=\"https://link.springer.com/content/pdf/10.1007/s00779-015-0871-y.pdf\">Understanding food consumption lifecycles using wearable camera</a>&quot;. It would be an extremely cool followup project to:</p>\n<blockquote>\n<p>build a tool (app, whatever) that incentivised collection and enabled [privacy-preserving] sharing of such data. &quot;A Social Food Network&quot;.\n<cite>-- <a href=\"https://github.com/mor1\">Richard Mortier</a>, personal communication, 2025</cite></p>\n</blockquote>\n<p>I'll write that up in my <a href=\"/ideas\">ideas</a> page when I get a chance. Any other comments or questions on the preprint are, as always, very welcome! 
In the meanwhile, please find the preprint details below.</p><h1>References</h1><ul><li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-food-life-2",
      "title": "Updated preprint on quantifying biodiversity cost of food consumption",
      "summary": "Revised preprint on quantifying biodiversity cost of global food consumption using LIFE metric combined with supply chain data and provenance modeling.",
      "date_published": "2025-02-12T00:00:00.000000Z",
      "date_modified": "2025-02-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "food",
        ":life",
        "biodiversity",
        "supplychains",
        "conservation",
        "climate"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-food-life.pdf",
          "mime_type": "application/pdf",
          "title": "Food impacts on species extinction risks can vary by three orders of magnitude"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/g4ch1-64343",
      "content_html": "<p>The terms <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">carbon credits and carbon offsets</a> are often used interchangeably,\nbut are in fact two distinct concepts.  I've spent a nice Sunday morning\nreading up on some <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">recent articles</a> that <a href=\"https://en.wikipedia.org/wiki/Bhaskar_Vira\">Bhaskar Vira</a> sent me which introduce a\n<em>third</em> term, known as <em>&quot;carbon contributions&quot;</em>. Rather than this adding confusion, I\nfound it helped me clarify my own thoughts on the matter, which I\nnote down here in draft form. <em>(Update 7th Feb: I've revised this several times after many discussions this week, especially with <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, with full list of credits in the end)</em></p>\n<h2 id=\"what-are-carbon-credits-and-offsets\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#what-are-carbon-credits-and-offsets\"></a>What are carbon credits and offsets?</h2>\n<p>A <em>carbon credit</em> aims to quantify the net climate benefit resulting an\nintervention that alters some CO2 emissions that would otherwise have gone into\nthe atmosphere in a business-as-usual counterfactual scenario. While there are many\ndifferent categories of carbon credits, I'll focus on <a href=\"https://iucn.org/our-work/nature-based-solutions\">nature-based solutions</a>. 
For example,\nwe could fund an intervention which provides an <a href=\"https://www.rspb.org.uk/whats-happening/news/the-power-of-forest-friendly-chocolate\">alternative livelihood</a> to cutting down tropical rainforests,\nand then calculate the area of rainforest saved (and therefore, the amount of avoided carbon emitted into the atmosphere) as a result\nof this action.</p>\n<p>The carbon credit therefore measures the <em>additional</em> amount of CO2 avoided as a result of the specific intervention,\nadjusted for <a href=\"https://www.lse.ac.uk/granthaminstitute/publication/avoiding-leakage-from-nature-based-offsets-by-design/\">negative externalities</a> and the <a href=\"/papers/2023-ncc-permanence\">impermanence</a> of\nthe action into the future if it's at risk of being reversed. We can monitor the measurements using spaceborne sensing to\nestablish <a href=\"/notes/credible-credit-principles\">global baselines</a> against which to calculate the counterfactual impacts of positive actions.<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> Carbon credits are nowadays their own asset class, both <a href=\"/papers/2024-cclr-carbon\">legally</a> and <a href=\"https://www.gov.uk/government/publications/revenue-and-customs-brief-7-2024-vat-treatment-of-voluntary-carbon-credits/revenue-and-customs-brief-vat-treatment-of-voluntary-carbon-credits\">fiscally</a>.</p>\n<p>A <em>carbon offset</em> <sup id=\"fnref:2\"><a href=\"#fn:2\" class=\"footnote\">[2]</a></sup> is then a way to account for the net climate benefits that one entity brings to another. The &quot;benefits&quot; are the amount of CO2e avoided or removed via the carbon credit, and the &quot;costs&quot; are the amounts of CO2e being emitted by the other party. 
The origin of this accounting can be traced back to the UN's <a href=\"https://en.wikipedia.org/wiki/Net-zero_emissions\">net-zero</a> goals:</p>\n<blockquote>\n<p>Net-zero means cutting carbon emissions to a small amount of residual emissions that can be absorbed and durably stored by nature and other carbon dioxide removal measures, leaving zero in the atmosphere.\n<cite>-- UN <a href=\"https://www.un.org/en/climatechange/net-zero-coalition\">Net Zero coalition</a></cite></p>\n</blockquote>\n<p>The theory behind offsetting is that we can never get to a complete net zero state due to the <a href=\"https://www.nature.com/articles/s41558-022-01592-2\">residual CO2 emissions</a> that will remain in even the most optimal decarbonised societies. For these residual emissions, we need to offset them with corresponding climate benefits in order to balance the books on how much carbon is in the atmosphere and how much is being <a href=\"https://www.nature.com/articles/s41586-024-07602-x\">absorbed</a> by the planet's biosphere.  And one of the main sources of CO2 absorption that we must protect in the biosphere are rainforests:</p>\n<blockquote>\n<p>Carbon sinks have increased in temperate and tropical regrowth forests owing to increases in forest area, but they decreased in boreal and tropical intact forests, as a result of intensified disturbances and losses in intact forest area, respectively. The global forest sink is equivalent to almost half of fossil-fuel emissions. 
However, two-thirds of the benefit from the sink has been negated by tropical deforestation.\n<cite>-- <a href=\"https://www.nature.com/articles/s41586-024-07602-x\">The enduring world forest carbon sink</a>, Nature 2024</cite></p>\n</blockquote>\n<p>Since tropical rainforests are so crucial for both <a href=\"https://www.unesco.org/en/articles/yangambi-biosphere-reserve-congo-basin-become-knowledge-hub-climate-and-biodiversity\">CO2 absorption</a> and biodiversity, my own recent <a href=\"https://4c.cst.cam.ac.uk/publications\">research</a> has largely focussed on reliable <a href=\"/papers/2023-pact-tmf\">accounting</a> for quantifying carbon credits accurately for <a href=\"https://unfccc.int/topics/land-use/workstreams/redd/what-is-redd\">avoided deforestation</a> projects in these regions. This work has been <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">progressing</a> steadily, and we're increasingly confident in the quantification methods used behind measuring the carbon sequestration impact of nature-based credits.</p>\n<p>However, what has been dragging down carbon credits is how they are used <em>after</em> they are verified and purchased, which is predominately via carbon offsetting. 
Let's first examine the problems with carbon <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">offsetting</a>, and then examine how an emerging concept of &quot;carbon <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">contributions</a>&quot; might provide a better way forward for carbon credits.</p>\n<h2 id=\"is-carbon-offsetting-a-license-to-pollute\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#is-carbon-offsetting-a-license-to-pollute\"></a>Is carbon offsetting a license to pollute?</h2>\n<p>Carbon offsets are currently mostly <a href=\"https://icvcm.org/voluntary-carbon-market-explained/\">voluntary</a>, where private actors can purchase carbon credits towards reducing their emissions targets. The obvious problem with offsetting is that it can give <a href=\"https://www.ft.com/content/93938a1b-dc36-4ea6-9308-170189be0cb0\">bad actors</a> a license to spend money to <a href=\"https://www.theguardian.com/environment/2023/jan/19/shell-to-spend-450m-on-carbon-offsetting-fears-grow-credits-worthless-aoe\">continue to pollute</a>, while <a href=\"https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-emissions-for-google-and-microsoft-a-major-contributor-to-climate-change\">breaking their emissions pledges</a>. 
And the harsh reality is that if we don't engage in immediate and real emissions reductions, we're <a href=\"https://www.newscientist.com/article/2344159-world-is-on-track-for-2-5c-of-global-warming-by-end-of-the-century/\">screwed</a> in the coming decades.</p>\n<p>Unfortunately, we need to balance this with the short-term reality that many of these businesses have to emit to <a href=\"https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-emissions-for-google-and-microsoft-a-major-contributor-to-climate-change\">remain competitive</a>, for example in the AI sector (<a href=\"/notes/deepseek-r1-advances\">Deepseek</a> notwithstanding!).\nAmazon highlighted the difficulty of forecasting their emissions in their annual sustainability report in 2023:</p>\n<blockquote>\n<p>[...] our progress toward a net-zero carbon business will not be linear, and each year as our various businesses grow and evolve, we will produce different results [...] These results will be influenced by significant changes to our business, investments in growth, and meeting the needs of our customers.\n<cite>-- <a href=\"https://sustainability.aboutamazon.com/2023-amazon-sustainability-report.pdf\">Amazon Sustainability Report 2023</a></cite></p>\n</blockquote>\n<p>As did Google, who gave up on 'real time net zero' last year, preferring instead to aim for the comfortably distant 2030:</p>\n<blockquote>\n<p>[...] starting in 2023, we're no longer maintaining operational carbon neutrality. 
We're instead focusing on accelerating an array of carbon solutions and partnerships that will help us work toward our net-zero goal [...]\n<cite>-- <a href=\"https://www.gstatic.com/gumdrop/sustainability/google-2024-environmental-report.pdf\">Google Environment Report 2024</a></cite></p>\n</blockquote>\n<p>Your heart may not be bleeding for these tech companies finding it difficult to forecast how they'll make their next <a href=\"https://en.wikipedia.org/wiki/List_of_public_corporations_by_market_capitalization#Trillion-dollar_companies\">trillion dollars</a>, but there is the undeniable reality that they need to break emissions pledges in response to global competitive pressure on their core businesses. But given this, is there still any point in all the precise accounting frameworks for net-zero carbon <em>offsetting</em>?</p>\n<p>A December <a href=\"https://www.ft.com/content/969b487f-9534-44b6-a47d-ce7519667884\">article</a> in the FT argues that there needs to be a fundamental shift in our approach to carbon credits for this reason. They observed that the use of carbon offsets for emissions trading in the EU will probably only apply to removal projects that <a href=\"https://en.wikipedia.org/wiki/Direct_air_capture\">suck carbon from the air</a> and not to the nature-based deforestation avoidance schemes I described above.</p>\n<blockquote>\n<p>Corporate funding for nature conservation has a useful role to play — but as a contribution to the public good, not for use in tonne-for-tonne emissions offsetting calculations.\n<cite>-- <a href=\"https://www.ft.com/content/969b487f-9534-44b6-a47d-ce7519667884\">Simon Mundy</a>, &quot;It's time for a shift in approach to carbon credits&quot;, FT</cite></p>\n</blockquote>\n<p>And <em>there</em> is the critical distinction between carbon &quot;credits&quot; and &quot;offsets&quot; I was looking for! 
Simon acknowledges the crucial importance of generating forest carbon credits to advance the extremely urgent problem of tackling tropical deforestation, but notes that corporations should not be giving to this pot as part of a complex accounting scheme tied to the vagaries of their ever-shifting business strategies. Forests are too important to our continued existence to be left to the mercies of a <a href=\"https://www.theguardian.com/environment/article/2024/may/31/market-value-of-carbon-offsets-drops-61-aoe\">volatile stock market</a>.</p>\n<p>Instead, we need to come up with a scheme for spending carbon credits whose incentives are aligned towards keeping the focus on emissions reductions and behavioural change. So, let's next firmly decouple carbon credits from carbon offsets, and examine how organisations that wish to <em>do</em> the right thing can...contribute...instead.</p>\n<h2 id=\"carbon-contributions-as-an-alternative-to-offsetting\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#carbon-contributions-as-an-alternative-to-offsetting\"></a>Carbon contributions as an alternative to offsetting</h2>\n<p>An <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets#\">article last year</a> by a former Cambridge Gates Scholar <a href=\"https://www.libbyblanchard.com/\">Libby Blanchard</a> and colleagues made a very clear case how and why we might replace carbon offsetting with &quot;carbon contributions&quot;, and especially so for forest protection. 
She observed that the <a href=\"https://www.ft.com/content/6eb8981e-4117-4aeb-a1b3-40f08ae85f53\">integrity crisis</a> in the offsets market has quite rightly led to the exposure of many poor quality schemes, but is also drying up crucial funding for the <a href=\"https://www.fscindigenousfoundation.org/global-south-voices-in-support-of-redd/\">good actors</a> who are working hard under very adverse conditions to launch forest protection schemes in the global <a href=\"https://www.wildlifeworks.com/post/listen-to-global-south-voices-the-carbon-market-s-key-role-in-financing-sustainable-development-and\">south</a> and <a href=\"https://www.reuters.com/sustainability/land-use-biodiversity/how-carbon-finance-is-seeding-new-hope-northern-forests-2024-12-20/\">north</a>.</p>\n<blockquote>\n<p>One way to channel forest finance away from bad offsets toward more productive outcomes is, simply, to stop claiming that forests offset fossil fuel emissions. Companies could, instead, make &quot;contributions&quot; to global climate mitigation through investments in forests.</p>\n<p>This change in terminology may seem small, but it represents a fundamentally different approach. For one thing, not allowing companies to subtract carbon credits from their direct emissions into a single net number, as offsetting does, refocuses priorities on direct emissions reductions. 
Companies would no longer be able to hide inaction behind offset purchases.\n<cite>-- <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets#\">Libby Blanchard, Bill Anderegg and Barbara Haya</a>, Instead of Carbon Offsets, We Need 'Contributions' to Forests, Jan 2024</cite></p>\n</blockquote>\n<p>This approach is radically more accessible for a good actor who has been scared away from offsets and is entangled in complex <a href=\"https://sciencebasedtargets.org\">SBTI</a>-style accounting frameworks!</p>\n<p>Firstly and most importantly, it removes the incentive to purchase the cheapest credits on the market at the lowest price possible. Since the organisations are no longer racing to hit a net-zero target, they can afford to find the highest quality and highest impact carbon projects available, and put their money towards those instead.</p>\n<p>Secondly, a contributions model focussed on quality means that more organisations can safely participate. In the current voluntary market, there is a <a href=\"https://en.wikipedia.org/wiki/The_Market_for_Lemons\">market for lemons</a> situation where it is very difficult to distinguish <a href=\"https://www.theguardian.com/environment/article/2024/may/30/corporate-carbon-offsets-credits\">junk credits</a> from <a href=\"https://community.rspb.org.uk/ourwork/b/actionfornature/posts/protecting-gola-10-years-of-the-redd-conservation-project-in-sierra-leone-s-gola-rainforest\">worthwhile credits</a>, since the market price is not a reliable indicator of quality. 
This means that the vast majority of organisations <a href=\"https://www.statista.com/statistics/501730/voluntary-carbon-offset-market-transaction-volume-worldwide/\">withdraw</a> from participating in the (voluntary) market due to the <a href=\"https://infiniteglobal.com/insights/a-net-zero-fairytale-the-reputational-risks-of-carbon-offsetting/\">reputational risks</a>, leaving only two sorts of participants: very good actors who <em>really</em> want to do the right thing, and very bad actors who are blatantly <a href=\"https://en.wikipedia.org/wiki/Greenwashing\">greenwashing</a>. It's a very odd party if the only two sorts of people left are the sinners and the saints!</p>\n<p>Let's look more closely at each of these points, as I think it fundamentally changes the dynamics of the use of carbon credits.</p>\n<h2 id=\"selecting-the-highest-quality-carbon-credits-instead-of-the-cheapest\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#selecting-the-highest-quality-carbon-credits-instead-of-the-cheapest\"></a>Selecting the highest quality carbon credits instead of the cheapest</h2>\n<p>There is a <a href=\"https://www.carbon-direct.com/insights/how-do-carbon-credits-actually-work-removal-reduction-and-avoidance-credits-explained\">vast array</a> of carbon avoidance, reduction and removal schemes; how do we choose between them? The current carbon markets focus on <a href=\"https://carbonmarketwatch.org/2024/08/14/faq-understanding-the-financial-workings-of-the-voluntary-carbon-market/\">secondary trading</a> as a price proxy, but this is a poor indicator of the underlying reliability and human and biodiversity cobenefits of any given intervention. 
In 2021, the University of <a href=\"https://www.environment.admin.cam.ac.uk/ESSC/carbon-offsetting-working-group-terms-reference\">Cambridge Offset Working Group</a> commissioned a <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">comprehensive report</a> on how we might compare project quality and cobenefits first, and then figure out a suitable price for each. This methodology (dubbed &quot;<a href=\"/papers/2023-ncc-permanence\">PACT</a>&quot;) allows us to compare diverse credit types such as direct-air-capture and nature-based solution projects as apples-to-apples.  Here's an excerpt from that <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">report</a>:</p>\n<p><img src=\"/images/pact-table.webp\" alt=\"%c\" title=\"Table of relative costs of carbon credits across project types from the COWG report\" ></p>\n<p>The important column is the £<sub>PACT</sub> one, which shows the adjusted costs per tonne of carbon of purchasing those credits. The <a href=\"https://climeworks.com/subscriptions-co2-removal\">Climeworks</a> direct-air-capture comes in at £900/tonne <sup id=\"fnref:3\"><a href=\"#fn:3\" class=\"footnote\">[3]</a></sup> whereas a tropical rainforest project in Sierra Leone ranks in at £73/tonne, <em>even after impermanence is adjusted for</em>! That's an absolutely mind-blowing price difference for a market that's allegedly more <a href=\"https://en.wikipedia.org/wiki/Efficient-market_hypothesis\">efficient</a> due to the existence of secondary trading. 
Yet there is an order-of-magnitude price difference between tropical forest protection and direct air capture, and that's <em>before</em> taking into account the obvious co-benefits of forest protection such as <a href=\"/projects/life\">biodiversity</a> and livelihood improvements.</p>\n<p>Blanchard's earlier article identifies the key benefits of a contributions model here:</p>\n<blockquote>\n<p>Freeing companies from the pressure of &quot;offsetting&quot; by switching to a &quot;contributions&quot; frame lessens the incentive to minimize costs at the expense of quality, allowing them to focus on contributing to higher-quality projects.\n<cite>-- <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets#\">Libby Blanchard, Bill Anderegg and Barbara Haya</a></cite></p>\n</blockquote>\n<p>Since the University is <em>not</em> planning on spending these carbon credits on accounting towards a net-zero goal, it is free to search the market for the highest quality impact -- in this case, tropical rainforest avoidance credits that are hugely undervalued -- and also filtering based on important co-benefits such as biodiversity and livelihood impacts.  And by sharing our knowledge about high quality carbon credit projects, we could hopefully find many other organisations that want to similarly contribute, and drive up the price of rainforest credits to their <a href=\"https://www.nature.com/articles/s41893-018-0175-0\">true value</a>.<sup id=\"fnref:4\"><a href=\"#fn:4\" class=\"footnote\">[4]</a></sup></p>\n<p>With a contributions model, we no longer care what the absolute price we're paying for the credits is: our contributions only reflect a fraction of our total climate damage anyway, and we want the carbon credits that we do purchase to reflect the highest available impact out of the spectrum of compensation efforts that we could engage in.  
There's still one important consideration we'll talk about next though: how should an organisation account for these contributions, if not as part of a net-zero mechanism?</p>\n<h2 id=\"applying-carbon-contributions-to-sustainability-policies\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#applying-carbon-contributions-to-sustainability-policies\"></a>Applying carbon contributions to sustainability policies</h2>\n<p>The primary sustainability focus of any organisation must be on <a href=\"https://en.wikipedia.org/wiki/Climate_change_mitigation\">decarbonisation</a> via direct emissions reduction. With carbon contributions, we can focus on this without the distractions of race-to-the-bottom carbon offset accounting.</p>\n<p>For example, consider the area of <a href=\"https://www.bbc.co.uk/news/articles/cz7wp777780o\">international air travel</a>. There are <em>plenty</em> of things to do to reduce emissions here as a matter of urgent policy change. My University's <a href=\"https://www.environment.admin.cam.ac.uk/travel/sustainable-business-travel\">sustainable travel policy</a> is sensible and dictates that it must be a trip of last resort to fly; we must use trains or other land travel where available, such as for European trips. There is also plenty of science to invest in to reduce the impact of aviation, ranging from <a href=\"https://www.bbc.co.uk/news/av/technology-60985913\">electrified planes</a> to <a href=\"https://www.bbc.co.uk/news/articles/cz7wp777780o\">contrails</a> and <a href=\"https://www.sciencedirect.com/science/article/pii/S0191261524001899\">optimised routing</a>. But, while all this is going on, sometimes there is only one practical way to get somewhere internationally, such as for an annual conference. 
We need all the emissions reductions strategies to be deployed first, and while these are taking effect we <em>also</em> need to augment them with voluntary contribution towards the last-resort travel that's happening while they are being rolled out or researched. Or indeed, also compensate for past travel emissions, as CO2e affects the climate for <a href=\"https://www.nature.com/articles/climate.2008.122\">longer than Stonehenge</a> has existed!</p>\n<p>Another similarly <a href=\"https://ourworldindata.org/food-choice-vs-eating-local\">topical</a> emissions reduction area is on how to reduce our <a href=\"https://www.britishecologicalsociety.org/wp-content/uploads/Ripple-et-al-2014-ruminants.pdf\">ruminant meat consumption</a>. More and more research is showing how damaging this is for <a href=\"https://www.worldwildlife.org/magazine/issues/summer-2018/articles/what-are-the-biggest-drivers-of-tropical-deforestation\">tropical forest destruction</a> but also from a <a href=\"/papers/2024-food-life\">biodiversity angle</a>. But it turns out that <a href=\"https://doi.org/10.1038/d41586-019-01662-0\">nudging consumers</a> such as Cambridge students and staff towards less damaging choices by default is entirely practical:<sup id=\"fnref:5\"><a href=\"#fn:5\" class=\"footnote\">[5]</a></sup></p>\n<blockquote>\n<p>A study of over 94000 cafeteria meal choices has found that doubling the vegetarian options – from 1-in-4 to 2-in-4 – increased the proportion of plant-based purchases by between 40-80% without affecting overall food sales.\n<cite>-- <a href=\"https://www.cam.ac.uk/stories/veg-nudge\">Veg nudge</a>. 
Impact of increasing vegetarian availability on meals (<a href=\"https://doi.org/10.1073/pnas.1907207116\">paper</a> / <a href=\"https://www.nature.com/articles/s43016-020-0132-8\">followup</a>)</cite></p>\n</blockquote>\n<p>For both of these emissions reductions initiatives, we could tag on a voluntary contribution whenever some damaging action (long-haul flying, importing ruminant meat, etc) is taken.  This is a contribution of <em>last resort</em> (&quot;I am a grad student presenting a paper and have to go abroad for this conference&quot;). In <a href=\"https://www.environment.admin.cam.ac.uk/Annual-Report\">annual sustainability reports</a>, the primary focus of reporting would remain firmly on the emissions reductions initiatives themselves. But the contributions gathered from these schemes could be pooled, and treated as a collective (but voluntary) <a href=\"https://en.wikipedia.org/wiki/Carbon_tax\">carbon tax</a> on the damages to nature and the atmosphere.</p>\n<p>And how do we spend this carbon tax? On the highest quality carbon projects we can find in the big wide world, as I described earlier! Each individual reductions scheme doesn't worry about what the compensation mechanisms are; groups similar to the <a href=\"https://www.environment.admin.cam.ac.uk/ESSC/carbon-offsetting-working-group-terms-reference\">COWG</a> could regularly assess projects worldwide. 
By publicly sharing their results to allow other organisations to participate in supporting them, they would also help reinforce the emerging <a href=\"https://icvcm.org/core-carbon-principles/\">core carbon principles</a> championed by the <a href=\"https://icvcm.org/\">IC-VCM</a>.</p>\n<h2 id=\"im-pretty-sold-on-carbon-contributions-vs-offsets\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#im-pretty-sold-on-carbon-contributions-vs-offsets\"></a>I'm pretty sold on carbon contributions vs offsets</h2>\n<p>This contributions model places the emphasis back where it should be -- on behavioural and systemic reductions of our environmental impacts -- rather than on being a &quot;license to pollute&quot;, as carbon offsets have often been used.  It allows us to pragmatically identify high-impact areas where we have policies in place to reduce emissions, purchase carbon credits from those projects, and then account for their expenditure via our emissions reductions activities.</p>\n<p>An explicit non-goal is to use credits towards a big net-zero target of claiming carbon neutrality; they just reflect our collective contribution towards mitigating environmental damage that we've judged that we had to do.\n<a href=\"https://www.landecon.cam.ac.uk/person/dr-ellen-quigley\">Ellen Quigley</a> succinctly summarises this: <em>&quot;a contribution is an acknowledgement of harm rather than its <a href=\"https://dictionary.cambridge.org/dictionary/english/expiation\">expiation</a>&quot;</em>.</p>\n<p><a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. 
Jones</a> also applies this approach to <a href=\"/papers/2023-naturecredits\">biodiversity credits</a> in a recent <a href=\"https://royalsocietypublishing.org/doi/10.1098/rspb.2024.2353\">piece</a>:</p>\n<blockquote>\n<p>Using biodiversity credits to quantify contributions toward nature recovery, rather than to directly offset specific negative impacts, is a key way to reduce some of the risks we highlight. This is referred to in the forest carbon world as a &quot;contribution&quot; model. Instead of buyers of forest carbon credits claiming that the credits can offset emissions to achieve &quot;net zero&quot;, they instead make a &quot;contribution&quot; to global climate mitigation through investments in forests.</p>\n<p>While this may seem like a small change in terminology, it represents an important difference. If carbon credits cannot be subtracted from a company's emissions to produce a single net number, they cannot be used as a license to continue emitting. This also lessens the incentive for buyers to focus on quantity rather than quality in purchased credits. Some biodiversity credit operators are already promoting this approach [...]\n<cite>-- <a href=\"https://royalsocietypublishing.org/doi/10.1098/rspb.2024.2353\">Hannah Wauchope et al</a>, What is a unit of nature? Measurement challenges in the emerging biodiversity credit market, Royal Society 2024</cite></p>\n</blockquote>\n<p>I couldn't agree more! Julia also highlights eloquently the urgency of the situation in her <a href=\"https://www.nature.com/articles/s41559-024-02442-4\">commentary</a> in Nature in response to a recent <a href=\"https://www.bbc.co.uk/programmes/m001zd68\">Panorama</a> program on the BBC:</p>\n<blockquote>\n<p>However, dramatically more finance is urgently needed to stop the ongoing loss of forests and the vital services that they provide. REDD+ credits that cover the true cost of reducing deforestation in an effective and equitable way can help to provide that finance. 
If they are only used to offset residual emissions after substantial reductions, they could also contribute to the transition to net zero. The bottom line is that failure to conserve our carbon-rich forests and the life they support would be a dramatic and catastrophic failure for humanity.\n<cite>- <a href=\"https://www.nature.com/articles/s41559-024-02442-4\">Julia P.G. Jones</a>, Scandal in the voluntary carbon market must not impede tropical forest conservation, Nature</cite></p>\n</blockquote>\n<h2 id=\"draft-principles-to-operationalise-carbon-contributions\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#draft-principles-to-operationalise-carbon-contributions\"></a>Draft principles to operationalise carbon contributions</h2>\n<p>While we're still in early days of working through the details, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and I have been framing a three-step checklist that organisations could apply towards the implementation of a carbon contributions model:</p>\n<ol>\n<li>The organisation acknowledges harm from recent and historic emissions. Decarbonisation remains the first priority, whilst minimising residual emissions.</li>\n<li>Contributions are intended to mitigate harm from residual emissions and not to claim carbon neutrality</li>\n<li>The organisation is transparent about decreases or increases in emissions and beneficiaries of its contributions</li>\n</ol>\n<p>With these principles, it should be possible for an organisation to contribute to carbon credit financing without adverse incentives. 
While there is some concern that this contributions mechanism has no built-in incentive to force organisations to contribute, I believe that it could bring a lot more people into the fold than voluntary offsetting has (which, as I noted earlier, now retains mainly the best and the worst participants, with the majority of people stepping back from it due to all the controversies). However, we still need to see if this is a strong enough incentive to get more organisations to participate voluntarily; this concern has been raised by several colleagues in response to this article and I will think on it further.</p>\n<p>The stakes <a href=\"https://news.mongabay.com/2024/12/the-year-in-tropical-rainforests-2024/\">cannot be higher</a> right now for tropical rainforests, and we do not have the collective luxury of time to remain locked in the <a href=\"https://www.ecosystemmarketplace.com/articles/commentaryhow-i-learned-to-stop-worrying-and-love-or-tolerate-carbon-offsets/\">offset-or-not</a> debate without an immediate alternative. The carbon contributions model could be just what we need to push forward! My hope is that this model makes it easier and safer for many organisations that have decided against offsetting to still contribute towards nature protection and restoration.</p>\n<p>Other universities also grappling with this topic include <a href=\"https://www.ecosystemmarketplace.com/articles/commentaryhow-i-learned-to-stop-worrying-and-love-or-tolerate-carbon-offsets/\">Brown</a> and <a href=\"https://www.cis.upenn.edu/~bcpierce/papers/carbon-offsets.pdf\">UPenn</a>, so I plan to circulate this article to them to gather wider opinions.  
The good folks at <a href=\"https://native.eco\">Native</a> also published a <a href=\"https://www.linkedin.com/pulse/why-businesses-must-shift-from-compensation-contribution-gkwee/?trackingId=ebXd8K96TidbACLeGURK%2Fw%3D%3D\">piece</a> about this shift from a compensation mindset to a contributions one.</p>\n<p>As noted at the beginning, I am updating this article regularly and would greatly welcome any other thoughts from you, the reader! I am grateful to <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\">Robin Daniels</a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a>, <a href=\"https://www.geog.cam.ac.uk/people/garrett/\">Rachael Garrett</a>, <a href=\"https://www.linkedin.com/in/isobelcohen/\">Isobel Cohen</a>, <a href=\"https://en.wikipedia.org/wiki/Simon_Zadek\">Simon Zadek</a>, <a href=\"https://en.wikipedia.org/wiki/Bhaskar_Vira\">Bhaskar Vira</a>, <a href=\"https://www.cam.ac.uk/stories/changemakers-melissa-leach\">Melissa Leach</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>, <a href=\"https://mynameismwd.org\">Michael Dales</a>, <a href=\"https://www.linkedin.com/in/harriet-hunnable-uk/\">Harriet Hunnable</a>, <a href=\"https://www.eden-plus.org/team-members/elliot-kinsey\">Elliot Kinsey</a>, <a href=\"https://www.landecon.cam.ac.uk/person/dr-ellen-quigley\">Ellen Quigley</a>, <a href=\"https://www.linkedin.com/in/jonpierre1/\">Jon Pierre</a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> and many others for their thoughts. 
This article includes their input but is not endorsed by them and any mistakes are mine alone.</p>\n<p><a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and I decided it might be instructive to run a <a href=\"https://notebooklm.google\">NotebookLM</a> summary of some of our discussions, which you can find in (AI-generated) podcast format below.</p>\n<p><div class=\"video-center\"><iframe title=\"NotebookLM summary of Carbon Credits, Offsets, and Contributions\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/8ba0f444-5846-4fb3-a54d-4dcdb3bb73be\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p><small class=\"credits\"> Changelog: 2nd Feb 2025 was original article.  5th Feb 2025 refined draft principles. 12th Feb 2025 added note about Native.eco article via <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\">Robin Daniels</a>, note on incentives via <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a>. 
20th Feb 2025 fixed typo in Ellen Quigley quote, via <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</small></p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> has an excellent <a href=\"https://4c.cst.cam.ac.uk/about/additionality-leakage-and-permanence\">video explainer</a> series of the work <a href=\"/projects/4c\">4C</a> has been doing towards this.</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:2\"><p><p>From the <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">Wikipedia article</a> to carbon credits and offsets.</p>\n <a href=\"#fnref:2\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:3\"><p><p>The Climeworks price seems to have gone up since 2022, and the <a href=\"https://climeworks.com/subscriptions-co2-removal\">subscription</a> site now shows £1100/tonne.</p>\n <a href=\"#fnref:3\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:4\"><p><p>There's a nice <a href=\"https://www.vice.com/en/article/the-amazon-is-worth-more-money-left-standing-study-shows/\">article from Vice</a> that explains the <a href=\"https://www.nature.com/articles/s41893-018-0175-0\">paper</a> more accessibly.</p>\n <a href=\"#fnref:4\" class=\"reversefootnote\">&#8617;</a></p></li>\n<li id=\"fn:5\"><p><p>As an aside, I've been purchasing <a href=\"https://shopping.rspb.org.uk/gifts-home/home-and-kitchen/food-drink/food/gola-chocolate.html\">sustainable Gola rainforest chocolate</a> from the RSPB. <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> gave me some of their truffles for Christmas and they were consumed rapidly by my family.</p>\n <a href=\"#fnref:5\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Balmford et al (2024). 
PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li>\n<li>Madhavapeddy (2025). Position paper on scientifically credible carbon credits. <a href=\"https://doi.org/10.59350/69k1e-cts10\" target=\"_blank\"><i>10.59350/69k1e-cts10</i></a></li>\n<li>Chapman et al (2024). A Legal Perspective on Supply-side Integrity Issues in the Forest Carbon Market. <a href=\"https://doi.org/10.21552/cclr/2024/3/5\" target=\"_blank\"><i>10.21552/cclr/2024/3/5</i></a></li>\n<li>Madhavapeddy (2025). Deepdive into Deepseek advances. <a href=\"https://doi.org/10.59350/r06z7-0ht06\" target=\"_blank\"><i>10.59350/r06z7-0ht06</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li>\n<li>Swinfield et al (2024). Nature-based credit markets at a crossroads. Springer Science and Business Media LLC. <a href=\"https://doi.org/10.1038/s41893-024-01403-w\" target=\"_blank\"><i>10.1038/s41893-024-01403-w</i></a></li>\n<li>Fisher et al (2019). Use nudges to change behaviour towards conservation. Nature. <a href=\"https://doi.org/10.1038/d41586-019-01662-0\" target=\"_blank\"><i>10.1038/d41586-019-01662-0</i></a></li>\n<li>Garnett et al (2019). Impact of increasing vegetarian availability on meal selection and sales in cafeterias. Proceedings of the National Academy of Sciences. <a href=\"https://doi.org/10.1073/pnas.1907207116\" target=\"_blank\"><i>10.1073/pnas.1907207116</i></a></li>\n<li>Buck et al (2023). Why residual emissions matter right now. Nature Climate Change. 
<a href=\"https://doi.org/10.1038/s41558-022-01592-2\" target=\"_blank\"><i>10.1038/s41558-022-01592-2</i></a></li>\n<li>Pan et al (2024). The enduring world forest carbon sink. Nature. <a href=\"https://doi.org/10.1038/s41586-024-07602-x\" target=\"_blank\"><i>10.1038/s41586-024-07602-x</i></a></li>\n<li>Strand et al (2018). Spatially explicit valuation of the Brazilian Amazon Forest’s Ecosystem Services. Nature Sustainability. <a href=\"https://doi.org/10.1038/s41893-018-0175-0\" target=\"_blank\"><i>10.1038/s41893-018-0175-0</i></a></li>\n<li>Inman (2008). Carbon is forever. Nature Climate Change. <a href=\"https://doi.org/10.1038/climate.2008.122\" target=\"_blank\"><i>10.1038/climate.2008.122</i></a></li>\n<li>Garnett et al (2020). Order of meals at the counter and distance between options affect student cafeteria vegetarian sales. Nature Food. <a href=\"https://doi.org/10.1038/s43016-020-0132-8\" target=\"_blank\"><i>10.1038/s43016-020-0132-8</i></a></li>\n<li>Jones (2024). Scandal in the voluntary carbon market must not impede tropical forest conservation. Nature Ecology & Evolution. <a href=\"https://doi.org/10.1038/s41559-024-02442-4\" target=\"_blank\"><i>10.1038/s41559-024-02442-4</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/carbon-credits-vs-offsets",
      "title": "Disentangling carbon credits and offsets with contributions",
      "summary": "Clarifying carbon credits, offsets, and contributions to understand effective climate action strategies",
      "date_published": "2025-02-02T00:00:00.000000Z",
      "date_modified": "2025-02-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carboncredits",
        "nbs",
        "forests",
        "conservation",
        "economics",
        "policy",
        ":4c"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/69k1e-cts10",
          "doi": "10.59350/69k1e-cts10",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.21552/cclr/2024/3/5",
          "doi": "10.21552/cclr/2024/3/5",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.59350/r06z7-0ht06",
          "doi": "10.59350/r06z7-0ht06",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41893-024-01403-w",
          "doi": "10.1038/s41893-024-01403-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/d41586-019-01662-0",
          "doi": "10.1038/d41586-019-01662-0",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1073/pnas.1907207116",
          "doi": "10.1073/pnas.1907207116",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-022-01592-2",
          "doi": "10.1038/s41558-022-01592-2",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41586-024-07602-x",
          "doi": "10.1038/s41586-024-07602-x",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41893-018-0175-0",
          "doi": "10.1038/s41893-018-0175-0",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/climate.2008.122",
          "doi": "10.1038/climate.2008.122",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-020-0132-8",
          "doi": "10.1038/s43016-020-0132-8",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41559-024-02442-4",
          "doi": "10.1038/s41559-024-02442-4",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/32rdt-zny05",
      "content_html": "<p>While <a href=\"https://bsky.app\">Bluesky</a> is taking off like a rocket, a number of us <a href=\"/notes/enter-the-matrix-hookshot\">moving</a> towards <a href=\"\">self sovereign</a> digital infrastructure have been looking at how to use the Bluesky network for other uses than just short-form notes. This is possible because of my colleague <a href=\"https://martin.kleppmann.com\">Martin Kleppmann</a>'s hard work on the &quot;<a href=\"https://atproto.com/\">AT Protocol</a>&quot; that underpins the Bluesky network. Martin recently gave us a <a href=\"https://talks.cam.ac.uk/talk/index/224767\">deep-dive into the AT proto</a> in the Cambridge <a href=\"https://www.cl.cam.ac.uk/research/security/\">security group</a>, which made me look into other uses of it more closely. As background, you may wish to read <a href=\"https://arxiv.org/abs/2402.03239\">his paper</a> on the subject which explains the technical architecture extremely clearly.</p>\n<p><a href=\"https://arxiv.org/pdf/2402.03239\"> <img src=\"/images/atproto-paper-ss-1.webp\" alt=\"%c\" > </a></p>\n<p>One of the key problems this solves is one I'm having with using my <a href=\"https://en.wikipedia.org/wiki/ActivityPub\">ActivityPub</a>-based services at the moment. Each of these services (like my <a href=\"https://crank.recoil.org\">video</a> or <a href=\"https://amok.recoil.org\">microblog</a> sites) do not share a common authentication system, and so each account is different. <a href=\"https://nick.recoil.org\">Nick Ludlam</a> and I are also thinking of renaming all of our services to go under the cleaner <code>recoil.org</code> domain rather than a subdomain, but this involves a fairly error-prone <a href=\"https://digitalflapjack.com/blog/hosting24/\">migration</a> that lacks <a href=\"/ideas/activitypub-resilience\">resilience</a> to domain change changes since they are baked into the ActivityPub protocol messages.  
The AT Protocol underpinning Bluesky deals with all of these by decoupling the underlying authentication and identity system from the content that's flowing over the network.</p>\n<p>This strikes a nice balance between pure self-hosting and longevity while bootstrapping the network; I chuckled, for instance, reading <a href=\"https://statusq.org/archives/2012/09/29/4524/\">Q's post from 2012</a> about a new social network called &quot;<a href=\"https://web.archive.org/web/20121011065707/https://join.app.net/\">app.net</a>&quot;, which described itself as &quot;your real-time feed, a home for meaningful conversation, where you control your data&quot;, but is now a long-expired, domain-squatted adfest.  The AT Proto model seems more pragmatic in that it builds up a centralised bootstrap, but the underlying protocol itself admits innovation for other apps, and so permits evolution.</p>\n<p>So let's look at some of the alternative apps that are already cropping up:</p>\n<ul>\n<li><a href=\"https://whtwnd.com/about\">Whitewind</a> is a blogging platform (<a href=\"https://github.com/whtwnd/whitewind-blog\">source code</a>) that lets you write longform posts in Markdown format and post them to the Internet. The data itself is stored on a local <a href=\"https://github.com/bluesky-social/pds\">PDS</a> and you can republish the blog posts using a <a href=\"https://github.com/hugeblank/whitebreeze\">simple site generator</a>.</li>\n<li><a href=\"https://github.com/muni-town/roomy\">Roomy</a> (via <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>) is a peer-to-peer messaging app built over AT Proto. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> points out the nice idea of &quot;digital gardening&quot; in their <a href=\"https://github.com/commune-sh/commune-server/discussions/28\">discussions</a> which I absolutely <em>love</em> and have been building into my own <a href=\"/notes/bushel-lives\">Bushel</a> notes platform which powers this site.  
I've wanted the ability to go from short-form thoughts to long-form consolidation for years and years now.</li>\n<li><a href=\"https://bsky.app/profile/tom.frontpage.team\">Tom Sherman</a> maintains a Bluesky list of <a href=\"https://bsky.app/profile/tom.frontpage.team/lists/3l3qcs6lizq2o\">alternative ATProto apps</a>, from which I discovered <a href=\"https://bsky.app/profile/stream.place\">Streamplace</a>, a mechanism to share live video on AT Proto.</li>\n</ul>\n<p>Then there are a bunch of &quot;alternative clients&quot; that do specific forms of media, such as photos or videos. This is less about using the underlying protocol than about building a new client, but it's still pretty neat that it's so accessible:</p>\n<ul>\n<li><a href=\"https://bsky.app/profile/did:plc:24kqkpfy6z7avtgu3qg57vvl\">Flashes</a> is a photo sharing app that recently launched in <a href=\"https://techcrunch.com/2025/02/06/flashes-a-photo-sharing-app-for-bluesky-opens-beta/\">beta</a> and is currently only available via TestFlight. Like Insta, it allows multiple photos per post, and you can then share comments with the mainline Bluesky.</li>\n<li><a href=\"https://www.bluecast.app/\">Bluecast</a> is a real-time audio streaming service for any Bluesky user (anyone remember the fun we had for about two weeks in the pandemic lockdown with <a href=\"https://www.clubhouse.com/\">Clubhouse</a>?)</li>\n<li><a href=\"https://bsky.app/profile/did:plc:kx626d5pdvqbn3kmoxtjjcbd\">Bluemotion</a> has been built by the <a href=\"https://fediversereport.com/video-audio-and-blogging-japanese-bluesky-is-building-in-the-atmosphere/\">Japanese Bluesky community</a> for quick and easy video sharing. 
<a href=\"https://liquidx.net\">Alastair Tse</a> says that development on this has <a href=\"https://bsky.app/profile/liquidx.net/post/3lhsoperh2s2f\">slowed down</a>, possibly as Bluesky supports videos natively now.</li>\n<li><a href=\"https://apps.apple.com/us/app/bluescreen-for-bluesky/id6741334901\">Bluescreen</a> is a <a href=\"https://lifehacker.com/tech/bluesky-now-has-its-own-tiktok\">TikTok alternative</a> for video posts, as is <a href=\"https://bsky.app/profile/mmccue.bsky.social/post/3lg6ezjpawc2c\">SkyTok</a>. SkyTok seems to be built on something called <a href=\"https://surf.social/\">surf.social</a> that gives more control over a Bluesky/Mastodon/RSS feedset as well, but it's still in closed beta.</li>\n</ul>\n<p>This is just the tip of the iceberg for the open web, of course. Excitingly, there are experiments ongoing to <a href=\"https://berjon.com/ap-at/?ref=cosmico.org\">run ActivityPub over AT Proto</a> which describes how complementary these ecosystems are.\nAnd most excitingly from my personal perspective, <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> is successfully <a href=\"https://bsky.app/profile/patrick.sirref.org/post/3lh24rrjngw24\">posting</a> from an up-and-coming <a href=\"https://github.com/patricoferris/ocaml-atproto-lexicon\">OCaml ATProto</a> implementation. I'm looking forward to hacking in this ecosystem in 2025!</p>\n<p><em>(Thanks David Gageot for spotting typos!)</em></p><h1>References</h1><ul><li>Madhavapeddy (2025). Entering the Matrix with Hookshot. <a href=\"https://doi.org/10.59350/f2mg2-v2134\" target=\"_blank\"><i>10.59350/f2mg2-v2134</i></a></li>\n<li>Madhavapeddy (2025). Arise Bushel, my sixth generation oxidised website. <a href=\"https://doi.org/10.59350/0r62w-c8g63\" target=\"_blank\"><i>10.59350/0r62w-c8g63</i></a></li>\n<li>Kleppmann et al (2024). Bluesky and the AT Protocol: Usable Decentralized Social Media. Proceedings of the ACM Conext-2024 Workshop on the Decentralization of the Internet. 
<a href=\"https://doi.org/10.1145/3694809.3700740\" target=\"_blank\"><i>10.1145/3694809.3700740</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/atproto-for-fun-and-blogging",
      "title": "Using AT Proto for more than just Bluesky posts",
      "summary": "Explore alternative uses for AT Proto beyond Bluesky posts, enabling self-sovereign digital infrastructure and innovative apps.",
      "date_published": "2025-02-09T00:00:00.000000Z",
      "date_modified": "2025-02-11T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "bluesky",
        "distributed",
        "selfhosting",
        "perscon",
        "ocaml",
        "fediverse"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/f2mg2-v2134",
          "doi": "10.59350/f2mg2-v2134",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.59350/0r62w-c8g63",
          "doi": "10.59350/0r62w-c8g63",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3694809.3700740",
          "doi": "10.1145/3694809.3700740",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/f2mg2-v2134",
      "content_html": "<p>We've been happy users of <a href=\"https://matrix.org\">Matrix</a> for our group communications in the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a>. Today we've been bringing in more members of the wider group to using it instead of Slack. As part of that, I've set up a cool bot called <a href=\"https://github.com/matrix-org/matrix-hookshot\">Hookshot</a> which allows Matrix to be connected to external services such as GitHub and Atom/RSS feeds. This is a test post to demonstrate to the members of the EEG how Matrix and Atom work!</p>\n<p>The basic idea behind Hookshot is to provide a bridging service to communications rooms hosted on Matrix, in such a way that it can exert administrative control over a room to intercept requests for services (such as adding an Atom feed).</p>\n<p>The setup for Hookshot can be a little involved as there are lots of encryption keys flying around. In a nutshell, I have a Docker container running it with a Yaml config of this nature:</p>\n<pre><code>bridge:\n  domain: recoil.org\n  url: http://synapse:8008\n  port: 9993\n  bindAddress: 0.0.0.0\npassFile:\n  /data/hookshot.pem\nexperimentalEncryption:\n  storagePath: /state\nexperimental_features:\n  msc3202_device_masquerading: true\n  msc3202_transaction_extensions: true\n  msc2409_to_device_messages_enabled: true\nlogging:\n  level: debug\n  colorize: true\n  json: false\n  timestampFormat: HH:mm:ss:SSS\n</code></pre>\n<p>This gives me a healthy amount of debug logging, and uses my <a href=\"/notes/decentralised-stack\">personal</a> Matrix server at <a href=\"https://recoil.org\">recoil.org</a> as the &quot;home&quot; for the bot. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> set up our EEG Matrix server completely separately over in a VM in the Computer Lab, at <code>eeg.cl.cam.ac.uk</code>. 
It's pretty cool that Matrix allows for this sort of decentralised communication pretty seamlessly!</p>\n<p>After this worked and was tested, I now have an active bot user on the Matrix (in this case, it's <code>llama:recoil.org</code>). I then configured GitHub on Hookshot so that the bot can monitor GitHub via its API.</p>\n<pre><code>github:\n  auth:\n    id: 861482\n    privateKeyFile: /data/github-key.pem\n  webhook:\n    secret: &lt;secret&gt;\n  oauth:\n    client_id: &lt;client-id&gt;\n    client_secret: &lt;client-secret&gt;\n    redirect_uri: https://recoil.org/hookshot/oauth/\n  defaultOptions:\n    showIssueRoomLink: false\n    hotlinkIssues:\n      prefix: &quot;#&quot;\n</code></pre>\n<p>This bit of YAML takes some configuring via the GitHub OAuth API to get the client-id and secrets. Once that's done, the bot can then be instructed to monitor certain repositories just by issuing some commands from within Matrix!</p>\n<p><img src=\"/images/hookshot-ss-1.webp\" alt=\"%c\" title=\"The Hookshot bot monitoring Quantify Earth\" ></p>\n<p>After this, the bot can be configured for a variety of other services. For instance, it can monitor Atom feeds to keep track of what the whole group is writing. For this, it's as simple as:</p>\n<ul>\n<li>Invite the bot to the room and give it admin privileges</li>\n<li>Write a message <code>!hookshot feed https://anil.recoil.org/news.xml</code> (as an example for my feed)</li>\n<li>The bot will start monitoring it and post every five minutes by default</li>\n</ul>\n<p><img src=\"/images/hookshot-ss-2.webp\" alt=\"%c\" title=\"It did admittedly take some messing around to get it to work\" ></p>\n<p><img src=\"/images/hookshot-ss-3.webp\" alt=\"%c\" title=\"But it picked up my post! 
First!\" ></p>\n<p>Hookshot supports a <a href=\"https://matrix-org.github.io/matrix-hookshot/latest/index.html\">variety of other</a> services to bridge to as well, including <a href=\"https://matrix-org.github.io/matrix-hookshot/latest/setup/webhooks.html\">webhooks</a> for arbitrary services.  One of the most fun student projects I've supervised recently is &quot;<a href=\"/ideas/version-control-matrix\">Decentralised Capability-based Code Collaboration using Matrix</a>&quot; in which <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> built Git-patches-over-Matrix. If anyone wants to pick up on that and build a &quot;real&quot; version, perhaps we could use this for peer-to-peer coding! It might work really well with coding copilots, as they have a chat based interface anyway...</p>",
      "url": "https://anil.recoil.org/notes/enter-the-matrix-hookshot",
      "title": "Entering the Matrix with Hookshot",
      "summary": "Hookshot integrates Matrix with external services like GitHub and Atom feeds for enhanced group communications.",
      "date_published": "2025-02-07T00:00:00.000000Z",
      "date_modified": "2025-02-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting",
        "matrix",
        "networking",
        "computerlab"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/bxnyj-v6f40",
      "content_html": "<p>With the vast amount of data we have these days for our <a href=\"/projects/plancomp\">planetary computing</a> processing, it's naturally tempting to use more hardware offload. The obvious choice, GPGPUs, are not a great fit for the problem due to the difficulty of unlocking high data parallelism for geospatial data. So it's back to an old technology I worked on <a href=\"/papers/2011-fccm-cloudfpga\">twelve years ago</a> in the form of <a href=\"https://en.wikipedia.org/wiki/Field-programmable_gate_array\">FPGAs</a>!</p>\n<p>FPGAs are a very flexible way to execute boolean combinatorial logic, but are notoriously difficult to program. We have two possible angles to explore to address this. One is to design more declarative DSLs for data processing that compile to the FPGAs, such as <a href=\"https://mynameismwd.org\">Michael Dales</a> work on <a href=\"https://github.com/quantifyearth/yirgacheffe\">Yirgacheffe</a> or <a href=\"https://github.com/omarathon\">Omar Tanner</a>'s work on in-memory <a href=\"/ideas/compressive-geospatial\">compressive computation</a>.  The other angle is to work on the low-level API to programming the FPGAs, to get away from <a href=\"https://danluu.com/why-hardware-development-is-hard/\">Verilog</a> and program in our favourite high-level language...OCaml!  <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and I have started making a list of resources for programming FPGAs in OCaml for our own education.</p>\n<p>HardCaml was originally a side project by <a href=\"https://www.ujamjar.com\">Andy Ray</a>. He gave a great presentation about it at <a href=\"https://www.ujamjar.com/presentations/orconf2015.html\">ORConf 2015</a>. Later on in the project's lifecycle, he moved it to being maintained by <a href=\"https://janestreet.com\">Jane Street</a>, where is used in production and is <a href=\"https://github.com/janestreet/hardcaml\">open source</a>.  
The first two resources to learn about HardCaml are to listen to the <a href=\"https://www.youtube.com/watch?v=GJX5VbKvh90\">Signals and Threads episode with Andy</a>, and then to <a href=\"https://arxiv.org/pdf/2312.15035\">read the 2023 paper</a>:</p>\n<blockquote>\n<p>Unlike high level synthesis (HLS), Hardcaml allows for low level control of the underlying hardware for maximum productivity, while abstracting away many of the tedious aspects of traditional hardware definition languages (HDLs) such as Verilog or VHDL. The richness of OCaml’s type system combined with Hardcaml’s fast circuit elaboration checks reduces the chance of user-introduced bugs and erroneous connections with features like custom type defining, type-safe parameterized modules and elaboration-time bit-width inference and validation.</p>\n<p>Hardcaml tooling emphasizes fast feedback through simulation, testing, and verification. It includes both a native OCaml cycle-accurate and an event-driven simulator. Unit tests can live in the source code and include digital ASCII waveforms representing the simulator’s output. Hardcaml also provides tools for SAT proving and formal verification. Hardcaml is industrially proven, and has been used at Jane Street internally for many large FPGA designs.</p>\n</blockquote>\n<p>Let's look at the <a href=\"https://github.com/janestreet/hardcaml\">source code repository</a> next to see some more code.\nHardCaml is easily installable via <a href=\"https://opam.ocaml.org\">opam</a>, so there appear to be few barriers to getting the software up and running. For the development lifecycle, there are a few other packages to ease the interfacing with the FPGA hardware:</p>\n<ul>\n<li><a href=\"https://github.com/janestreet/hardcaml_waveterm\">Hardcaml_waveterm</a> is a terminal-based digital waveform viewer. These are usable in <a href=\"https://dev.realworldocaml.org/testing.html\">expect tests</a> or from an interactive terminal application. 
I love a good terminal user interface, particularly now that I've shifted to <a href=\"https://ghostty.org/\">Ghostty</a> with extremely good UTF-8 and colour support, so this is a very good sign.</li>\n<li><a href=\"https://github.com/janestreet/hardcaml_c\">Hardcaml_c</a> then converts a Hardcaml design over to C, where it can be compiled into a cycle-accurate simulation model and <a href=\"https://github.com/janestreet/hardcaml_verilator\">Hardcaml_verilator</a> does the same except for the open-source <a href=\"https://www.veripool.org/verilator/\">verilator</a> Verilog emulator.</li>\n</ul>\n<p>Let's look at some examples. There is a <a href=\"https://github.com/janestreet/hardcaml_circuits\">hardcaml_circuits</a> repository with some interesting designs in HardCaml. Picking some at random:</p>\n<ul>\n<li>There's a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.mli\">sorting network</a> that arranges a fixed configuration of compare-and-swaps to sort data. The network's structure is static (so it can be implemented easily in hardware), but the library abstracts its implementation to allow plugging in different compare-and-swap and data structures.  Looking at the OCaml interface, it's an <a href=\"https://dev.realworldocaml.org/functors.html\">OCaml functor</a> over the compare-and-swap function, and has implementations in the module for a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.ml#L140\">merge sort</a> and a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.ml#L65\">bitonic merge</a>. 
This is already quite instructive to compare against a software implementation, such as the ones in my <a href=\"/notes/focs\">Foundations of CS</a> course, where I teach <a href=\"https://www.cl.cam.ac.uk/teaching/2324/FoundsCS/slides/FoCS-202324-5.pdf\">merge strategies</a> quite early on.</li>\n<li>For floating point calculations, we generally use <a href=\"https://www.allaboutcircuits.com/technical-articles/an-introduction-to-the-cordic-algorithm/\">CORDIC</a> algorithms, which perform vector rotations iteratively to solve trig functions.  The <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/cordic_reference.mli\">cordic.mli</a> interface here is very readable, with nice use of OCaml features such as <a href=\"https://dev.realworldocaml.org/variants.html#variants\">algebraic data types</a> to express the equations themselves. The implementation of <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/cordic_reference.ml#L97-L101\">arctan</a> shows how elegantly the OCaml implementation expresses the CORDIC equation as a higher level function.</li>\n</ul>\n<h2 id=\"is-hardcaml-worth-learning\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#is-hardcaml-worth-learning\"></a>Is HardCaml worth learning?</h2>\n<p>I was curious to see what HardCaml's been used for recently. Most notably, it took home awards at the <a href=\"https://zprize.hardcaml.com/\">ZPrize</a> in 2022, winning the multi-scalar multiplication track. 
So this thing is right up there with other HDLs in terms of producing high-performing circuits!</p>\n<p>There are two good blog posts, one about each of the implementations:</p>\n<ul>\n<li>The <a href=\"https://zprize.hardcaml.com/msm-overview.html\">multi-scalar multiplication post</a> looks to multiply 2<sup>26</sup> points on the <a href=\"https://neuromancer.sk/std/bls/BLS12-377\">BLS12-377</a> <a href=\"https://en.wikipedia.org/wiki/Elliptic_curve\">elliptic curve</a> by scalars from the associated 253-bit scalar field and add them all as fast as possible.  This is difficult as the full set of transforms can't fit within a single FPGA's RAM, and so needs to call out to the host DRAM.  There's a <a href=\"https://dl.acm.org/doi/10.1145/3626202.3637577\">paper</a> with all the details on the evaluation, which was done on an <a href=\"https://fpga-development-on-ec2.workshop.aws/en/4-f1-application-development-flow/introduction-to-f1-development-environment.html\">Amazon F1</a> FPGA instance.</li>\n<li>The <a href=\"https://zprize.hardcaml.com/ntt-overview.html\">number-theoretic transform post</a> describes what's going on there as something similar to Fourier transforms but working over a <a href=\"https://en.wikipedia.org/wiki/Finite_field\">Galois field</a>. An extremely cool <a href=\"https://zprize.hardcaml.com/apps/ntt/ntt-core-with-rams-app\">web-based interactive visualisation</a> allows you to step through the NTT implementation.\nThey used an <a href=\"https://www.amd.com/en/products/accelerators/alveo.html\">AMD Alveo</a> for this; I think that team are formerly Xilinx and based locally here in Cambridge!</li>\n</ul>\n<p><img src=\"/images/hardcaml-webterm-1.webp\" alt=\"%c\" title=\"The web-based waveform view for the NTT transformer\" ></p>\n<p>More relevantly to my interest in geospatial processing, there is a <a href=\"https://github.com/hardcamls/video-coding/tree/main/jpeg\">JPEG decoder in HardCaml</a> which looks rather exciting. 
It implements the <a href=\"https://stackoverflow.com/questions/26523504/what-is-the-baseline-architecture-of-jpeg\">JPEG baseline profile</a> with arbitrary Huffman tables for encoding, along with a more work-in-progress decoder. A <a href=\"https://github.com/geocaml/ocaml-tiff\">GeoTIFF</a> implementation would be a fun starter project to port to HardCaml!</p>\n<h2 id=\"some-ideas-for-student-projects\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#some-ideas-for-student-projects\"></a>Some ideas for student projects</h2>\n<p>Moving on from prizes, there is also a <a href=\"https://github.com/askvortsov1/hardcaml-mips\">MIPS processor in HardCaml</a> designed by a couple of students at <a href=\"https://www.psu.edu/\">Penn State</a>. They've also written a series of great <a href=\"https://ceramichacker.com/blog/34-1412-hardcaml-mips-and-io\">blog posts</a> about their adventures in learning HardCaml as students.</p>\n<p><a href=\"https://toao.com\">Sadiq Jaffer</a> and I have also been discussing the possibility of using <a href=\"/ideas/computational-storage-for-vector-dbs\">computational SSDs to accelerate vector databases</a>, which would be a game-changer for the <a href=\"/projects/rsn\">huge datasets</a> we're throwing around at the moment.</p>\n<p>I'm going to continue to explore this further, and will update this note with any more resources I find. Please do send me any ideas you might have! 
<em>(Update 2025/02/07):</em> Immediately after <a href=\"https://amok.recoil.org/@avsm/113962272067656593\">posting</a> this, two interesting responses came up:</p>\n<ul>\n<li><a href=\"https://github.com/edwintorok\">Török Edwin</a> from the <a href=\"/projects/xen\">Xen</a> team <a href=\"https://amok.recoil.org/@edwintorok@discuss.systems/113962395735439060\">reports</a> that he experimented with <a href=\"https://tinytapeout.com/runs/ttihp0p2/tt_um_edwintorok\">TinyTapeout</a> in HardCaml to implement a raytracer:</li>\n</ul>\n<blockquote>\n<p>The VGA controller is <a href=\"https://github.com/edwintorok/roundingerror-ihp/blob/main/src/generator/vga.ml\">here</a> and the hardcaml output works nicely with yosys and open lane tooling and verilator. So far it seems to work in simulation and on an FPGA (output <a href=\"https://www.youtube.com/watch?v=K9mu3getxhU&amp;t=42s\">recording video</a>, see bottom of <a href=\"https://tinytapeout.com/competitions/demoscene-tt08-entries/\">this</a> on how it got recorded).</p>\n<p>Yet to find out whether it'll work in a physical chip (they say the tape out will be done in April). I particularly like the waveforms in source code for unit test (see the above VGA example).</p>\n</blockquote>\n<ul>\n<li>My colleague <a href=\"https://albert.rierol.net/\">Albert Cardona</a> works on analysing the <a href=\"https://www.science.org/doi/full/10.1126/science.add9330\">connectomes of insect brains</a> (among other brains), which involves a lot of image processing over vast datasets as well.  I <a href=\"https://amok.recoil.org/@avsm/113962390567495016\">pointed</a> him at an <a href=\"https://hackaday.io/project/27550-the-hobbyists-guide-to-fpgas\">FPGA overview</a>; are there any other good beginner &quot;FPGA for programmers&quot; guides I could also use?</li>\n</ul>\n<p>Thanks also to <a href=\"https://ujamjar.com\">Andy Ray</a> and Andrew W. Moore for feedback and corrections to this post.</p><h1>References</h1><ul><li>Madhavapeddy (2025). 
Foundations of Computer Science. <a href=\"https://doi.org/10.59350/qms3q-ymn65\" target=\"_blank\"><i>10.59350/qms3q-ymn65</i></a></li>\n<li>Madhavapeddy et al (2011). Reconfigurable Data Processing for Clouds. IEEE. <a href=\"https://doi.org/10.1109/FCCM.2011.35\" target=\"_blank\"><i>10.1109/FCCM.2011.35</i></a></li>\n<li>Ray et al (2024). Hardcaml MSM: A High-Performance Split CPU-FPGA Multi-Scalar Multiplication Engine. <a href=\"https://doi.org/10.1145/3626202.3637577\" target=\"_blank\"><i>10.1145/3626202.3637577</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/fpgas-hardcaml",
      "title": "Programming FPGAs using OCaml",
      "summary": "Learn FPGA programming with OCaml using HardCaml.",
      "date_published": "2025-02-07T00:00:00.000000Z",
      "date_modified": "2025-02-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "fpga",
        "hardware",
        "embedded",
        "networking"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/qms3q-ymn65",
          "doi": "10.59350/qms3q-ymn65",
          "cito": [
            "citesAsRelated"
          ]
        },
        {
          "url": "https://doi.org/10.1109/FCCM.2011.35",
          "doi": "10.1109/FCCM.2011.35",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3626202.3637577",
          "doi": "10.1145/3626202.3637577",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/qmsqz-ark89",
      "content_html": "<p><a href=\"https://toao.com\">Sadiq Jaffer</a> sent along this <a href=\"https://theconversation.com/fake-papers-are-contaminating-the-worlds-scientific-literature-fueling-a-corrupt-industry-and-slowing-legitimate-lifesaving-medical-research-246224\">piece in The Conversation</a> last week about the remarkable number of academic papers that are now AI generated. The numbers of these papers are probably underestimated:</p>\n<blockquote>\n<p>These papers are absorbed into the worldwide library of research faster than they can be weeded out. About 119,000 scholarly journal articles and conference papers are published globally every week, or more than 6 million a year. Publishers estimate that, at most journals, about 2% of the papers submitted – but not necessarily published – are likely fake, although this number can be much higher at some publications.\n<cite>-- Frederik Joelving et al, <a href=\"https://theconversation.com/fake-papers-are-contaminating-the-worlds-scientific-literature-fueling-a-corrupt-industry-and-slowing-legitimate-lifesaving-medical-research-246224\">The Conversation</a></cite></p>\n</blockquote>\n<p>What caught my eye in this article is their development of the <a href=\"https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24495\">Problematic Paper Screener</a>, which the good folks at <a href=\"https://en.wikipedia.org/wiki/Retraction_Watch\">Retraction Watch</a> developed. It works with high precision to detect papers issued by grammar-based generators. They noted in <a href=\"https://theconversation.com/problematic-paper-screener-trawling-for-fraud-in-the-scientific-literature-246317\">another article</a> that over 764,000 articles cited papers that could be unreliable, further illustrating the creeping unreliability. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> and I are planning to run this over our <a href=\"/projects/ce\">growing paper corpus</a>, but I can't find the source code to their system, just <a href=\"https://dbrech.irit.fr/pls/apex/f?p=9999:1::::::\">the hosted version</a>.</p>\n<p>Meanwhile, datasets are also under similar threat of causing <a href=\"https://www.nature.com/articles/s41586-024-07566-y\">recursive model collapse</a>. The <a href=\"https://github.com/rspeer/wordfreq\">Wordfreq</a> team announced in September 2024 that they would <a href=\"https://github.com/rspeer/wordfreq/blob/master/SUNSET.md\">discontinue</a> updating their corpus because generative AI has polluted the data and information that used to be free has become expensive.  <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> also noted the related problem of dataset versioning becoming unreliable across science in &quot;<a href=\"/papers/2024-uncertainty-cs\">Uncertainty at scale: how CS hinders climate research</a>&quot;, but for different reasons -- large datasets are inherently difficult to version and reproduce (it's quite hard to share a terabyte of data over the Internet easily, even in this day and age).</p>\n<p><img src=\"/images/conversation-fakeai-1.webp\" alt=\"%c\" title=\"Wayne State scientists Frank Cackowski and Steven Zielske carried out experiments based on a paper they later found to contain false data. Credit: Amy Sacka\" ></p>\n<p>Another big development this week was the release of <a href=\"https://openai.com/index/introducing-deep-research/\">OpenAI's Deep Research</a> feature, which goes off and really mines a literature corpus for information. I've grudgingly upgraded to their expensive <a href=\"https://openai.com/index/introducing-chatgpt-pro/\">Pro</a> to try this out and will report my findings in a future post. 
The ability to generate papers has moved well beyond just the grammar generators that the Problematic Paper Screener can filter out, so this arms race is unlikely to end well if we're pinning our hopes on detecting AI-generated papers. The current publish-or-perish model has already died; at least our Cambridge <a href=\"https://www.acp.hr.admin.cam.ac.uk/acp-overview/acp-key-principles\">promotion process</a> is more enlightened than &quot;just&quot; looking at paper counts!</p><h1>References</h1><ul><li>Shumailov et al (2024). AI models collapse when trained on recursively generated data. Nature. <a href=\"https://doi.org/10.1038/s41586-024-07566-y\" target=\"_blank\"><i>10.1038/s41586-024-07566-y</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ai-contamination-of-papers",
      "external_url": "https://theconversation.com/fake-papers-are-contaminating-the-worlds-scientific-literature-fueling-a-corrupt-industry-and-slowing-legitimate-lifesaving-medical-research-246224",
      "title": "Fake papers abound in the literature",
      "summary": "AI-generated papers are contaminating scientific literature at a rapid pace",
      "date_published": "2025-02-04T00:00:00.000000Z",
      "date_modified": "2025-02-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "evidence",
        "llms",
        "science"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s41586-024-07566-y",
          "doi": "10.1038/s41586-024-07566-y",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/ncdnp-ka236",
      "content_html": "<p>There's a <a href=\"https://www.science.org/doi/10.1126/science.adt6811\">letter in Science</a> today from a bunch of well known remote sensing researchers that make the unusual point that modern satellite resolution is getting <em>too</em> good to be accurate for forest carbon estimation.</p>\n<blockquote>\n<p>Many new satellites can resolve fine features on the landscape, and even some individual trees outside forests, but this resolution (0.3-5m) is too high for mapping forest carbon. Forest carbon has a natural resolution constraint: the size of an individual tree. To create these maps, tree data from the ground are required because there is no direct measure of tree carbon nor any way to accurately divide trees into smaller components from space.\n[...]\nBecause most carbon in a forest is stored in large trees, map resolutions should at minimum exceed the crown diameter of a typical large tree, which ranges from about 10m for temperate forests to about 20m for tropical forests\n<cite>--- <a href=\"https://www.science.org/doi/10.1126/science.adt6811\">Laura Duncanson et al</a>, Spatial resolution for forest carbon maps, Science</cite></p>\n</blockquote>\n<p>The lead author <a href=\"https://geog.umd.edu/facultyprofile/duncanson/laura\">Laura Duncanson</a> is a remote sensing scientist at Maryland who works on the incredible <a href=\"https://en.wikipedia.org/wiki/Global_Ecosystem_Dynamics_Investigation\">GEDI</a> instrument on the International Space Station.  
In her recent <a href=\"https://watch.eeg.cl.cam.ac.uk/w/uoH2Gie4WiiAocQJYLi9im\">EEG seminar talk</a>, she noted that their instrument is so sensitively calibrated that they can detect when astronauts on the space station are flushing the loo!</p>\n<div class=\"video-center\"><iframe title=\"Forest biomass mapping and monitoring with NASA Lidars\" width=\"100%\" height=\"315\" src=\"https://watch.eeg.cl.cam.ac.uk/videos/embed/e5eb87d3-c4c8-49d5-b074-94c6a38ba8f6\" frameborder=\"0\" allowfullscreen=\"\" sandbox=\"allow-same-origin allow-scripts allow-popups\"></iframe></div>\n<p><a href=\"https://coomeslab.org\">David Coomes</a> further notes that we shouldn't think of either field data or GEDI footprints as sole ground truths, but rather factor in the combined uncertainties in both ground and remote sensing data.  This <a href=\"https://tforces.net/upload/publication-store/2018/Jucker_et_al_2018_Borneo_carbon_Biogeosciences-15-3811-2018.pdf\">2018 Biogeosciences paper</a> goes through the details of how this error propagation works in Borneo rainforests:</p>\n<blockquote>\n<p>By combining ALS imagery with data from 173 permanent forest plots spanning the lowland rainforests of Sabah on the island of Borneo, we develop a simple yet general model for estimating forest carbon stocks using ALS-derived canopy height and canopy cover as input metrics. An advanced feature of this new model is the propagation of uncertainty in both ALS- and ground-based data, allowing uncertainty in hectare-scale estimates of carbon stocks to be quantified robustly.</p>\n<p>[...] 
Since the 1970s Borneo has lost more than 60% of its old-growth forests, the majority of which have been replaced by large-scale industrial palm oil plantations.</p>\n<p>With the view of halting the further deforestation of carbon-dense old-growth forests and generating the necessary knowledge to better manage its forests into the future, in 2016 the Sabah state government commissioned CAO to deliver a high-resolution ALS-based carbon map of the entire state. The regional carbon model we develop here underpins this initiative [...]\n<cite>-- <a href=\"https://tforces.net/upload/publication-store/2018/Jucker_et_al_2018_Borneo_carbon_Biogeosciences-15-3811-2018.pdf\">Tommaso Jucker, David Coomes et al</a>, Estimating aboveground carbon density and its uncertainty in Borneo’s structurally complex tropical forests using airborne laser scanning</cite></p>\n</blockquote>\n<p><a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> are just starting to refresh our <a href=\"/papers/2023-pact-tmf\">PACT methodology spec</a>, so this is yet another timely warning not to race ahead with the <a href=\"/projects/rsn\">latest satellite data</a> without careful consideration of what it is we are actually measuring (in our case, forest carbon for carbon credits).</p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/forests-spatial-resolution",
      "title": "Satellites are getting too good for forest carbon?",
      "summary": "Satellites are getting too accurate for forest carbon estimation due to high resolution.",
      "date_published": "2025-02-03T00:00:00.000000Z",
      "date_modified": "2025-02-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forests",
        "sensing",
        "carbon",
        "science"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/r06z7-0ht06",
      "content_html": "<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> posted a link to this excellent deepdive by <a href=\"https://www.linkedin.com/in/prasadraje/\">Prasad Raje</a> of Udemy into the advances that\n<a href=\"https://deepseek.com\">DeepSeek</a> R1 has made from a perspective of the core\ntechnology.</p>\n<blockquote>\n<ul>\n<li>Multi-headed Latent Attention (MLA). In the famous Google &quot;<a href=\"https://arxiv.org/abs/1706.03762\">Attention is all you need</a>&quot; paper, the attention block is responsible for a lot of the magic of LLMs but is also compute heavy [...] Deepseek has innovated here with Multi-headed latent attention - which essentially reduces the size of matrix multiplication applied to generate the K,V vectors that are inputs into the attention block. Combined with KV Caching, this reduces the memory needs [...]</li>\n<li>Mixture of Experts (MoE). The key idea here is that instead of feeding each token through one massive <a href=\"https://en.wikipedia.org/wiki/Feedforward_neural_network\">FFN</a>, break down the single FFN into a number of smaller FFNs and route each token through a subset of these FFNs. [...] each of these smaller FFNs will learn during training something specific about how to transform each token, hence becoming an &quot;expert&quot;.  Deepseek took MoE to this 670B parameter scale that no one had done before [...] and created 256 FFNs and routes each token through only 8 of these.</li>\n<li>Multi-token prediction (MTP): [...] you compute more than 1 token and send the aggregate error to back propagate. The intuition is that you get more changes made to the model weights in each training step, thus reducing the total training steps needed [...] 
Deepseek took this idea further, added innovations of their own (Sequential vs parallel MTP) and used this to reduce training time.\n<cite> -- <a href=\"https://www.linkedin.com/pulse/deepdive-deepseek-prasad-raje-jakqc\">Prasad Raje</a></cite></li>\n</ul>\n</blockquote><h1>References</h1><ul><li>Vaswani et al (2023). Attention Is All You Need. arXiv. <a href=\"https://doi.org/10.48550/arXiv.1706.03762\" target=\"_blank\"><i>10.48550/arXiv.1706.03762</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/deepseek-r1-advances",
      "external_url": "https://www.linkedin.com/pulse/deepdive-deepseek-prasad-raje-jakqc/",
      "title": "Deepdive into Deepseek advances",
      "summary": "Explore DeepSeek's advances in AI technology, including Multi-headed Latent Attention and Mixture of Experts.",
      "date_published": "2025-02-01T00:00:00.000000Z",
      "date_modified": "2025-02-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "llms",
        "ai",
        "deepseek",
        "til"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.1706.03762",
          "doi": "10.48550/arXiv.1706.03762",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/sf0ze-pbf15",
      "content_html": "<p>Now that I've <a href=\"/notes/bushel-lives\">switched</a> to a new website, I'm working on open-sourcing its components. I've got a lot of small OCaml scripts that are all work-in-progress, and so not quite suitable to be published to the <a href=\"https://github.com/ocaml/opam-repository\">central opam-repository</a> but I still need be able to run them conveniently on my own <a href=\"\">self-hosted</a> infrastructure.</p>\n<p>I mainly use a variety of macOS and Linux hosts<sup id=\"fnref:1\"><a href=\"#fn:1\" class=\"footnote\">[1]</a></sup> and I want a workflow as simple as &quot;<code>brew install avsm/ocaml/srcsetter</code>&quot; and have it install a working binary version of my CLI utility. In this case, it's <a href=\"https://github.com/avsm/srcsetter\">srcsetter</a>, a simple tool I knocked up to generate the <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTML/Responsive_images\">responsive images</a> on this website. Luckily, Homebrew has made this <em>really</em> easy for us! They have a <a href=\"https://docs.brew.sh/BrewTestBot\">BrewTestBot</a> that integrates with GitHub Actions to automate the compilation of binary packages for us, all from a convenient PR-like workflow.</p>\n<p>First, we need to set up a GitHub Homebrew &quot;tap&quot; repository. Mine is <a href=\"https://github.com/avsm/homebrew-ocaml\">avsm/homebrew-ocaml</a> which allows for the tap to be referred to as <code>avsm/ocaml</code> (Homebrew special-cases these to expand to the full GitHub repository). 
We then add in a couple of GitHub Actions to activate the testbot:</p>\n<ul>\n<li><a href=\"https://github.com/avsm/homebrew-ocaml/blob/main/.github/workflows/tests.yml\">.github/workflows/tests.yml</a> runs in response to pull requests to that repository and does a full Brew build of the package.</li>\n<li><a href=\"https://github.com/avsm/homebrew-ocaml/blob/main/.github/workflows/publish.yml\">.github/workflows/publish.yml</a> allows us to simply add a <code>pr-pull</code> label to a successful PR and have it be merged automatically by the bot.</li>\n</ul>\n<p>Secondly, we need to create a Homebrew package for the opam package. For this, I just added a very simple script to the srcsetter repository called <a href=\"https://github.com/avsm/srcsetter/blob/main/.opambuild.sh\">.opambuild.sh</a> which builds a local binary using a temporary opam installation. In the future, we should be able to use <a href=\"https://preview.dune.build\">dune package management</a> to remove the need for this script, but I'm blocked on some <a href=\"https://github.com/ocaml/dune/issues/11405\">teething issues</a> there in the short-term.</p>\n<pre><code>export OPAMROOT=`pwd`/_opamroot\nexport OPAMYES=1\nexport OPAMCONFIRMLEVEL=unsafe-yes\nopam init -ny --disable-sandboxing\nopam switch create . 
\nopam exec -- dune build --profile=release\n</code></pre>\n<p>Once this is present in the repository we're building, I just need to <a href=\"https://github.com/avsm/homebrew-ocaml/pull/2\">open a pull request</a> with the Homebrew <a href=\"https://docs.brew.sh/Formula-Cookbook\">formula</a> for my CLI tool.</p>\n<pre><code>class Srcsetter &lt; Formula\n  desc &quot;Webp image generator for responsive HTML sites&quot;\n  homepage &quot;https://github.com/avsm/srcsetter/&quot;\n  url &quot;https://github.com/avsm/srcsetter.git&quot;, branch: &quot;main&quot;\n  version &quot;0.0.1&quot;\n  license &quot;ISC&quot;\n\n  depends_on &quot;gpatch&quot;\n  depends_on &quot;opam&quot;\n\n  def install\n    system &quot;bash&quot;, &quot;./.opambuild.sh&quot;\n    bin.install &quot;_opam/bin/srcsetter&quot;\n  end\nend\n</code></pre>\n<p>The formula is fairly self-explanatory: I just point Homebrew at the source repository, give it some descriptive metadata, and tell it to invoke the binary build script and make the sole resulting binary available as the contents of the package.  At this point, the BrewBot will run against the PR and report any build failures on both macOS and Ubuntu. Most of these were swiftly fixed by running <code>brew style</code> (as instructed in the build failures) to take care of fairly minor issues.</p>\n<p><img src=\"/images/gh-brewbot-screen.webp\" alt=\"%c\" ></p>\n<p>When the PR went green, all I then had to do was to add the <code>pr-pull</code> label, and the bot takes care of uploading the binary artefacts to my <a href=\"https://github.com/avsm/homebrew-ocaml/releases/tag/srcsetter-0.0.1\">homebrew tap repo</a> and merging the PR. 
It also takes care of adding checksums to the merged Formula, so what actually got merged is:</p>\n<pre><code>class Srcsetter &lt; Formula\n  desc &quot;Webp image generator for responsive HTML sites&quot;\n  homepage &quot;https://github.com/avsm/srcsetter/&quot;\n  url &quot;https://github.com/avsm/srcsetter.git&quot;, branch: &quot;main&quot;\n  version &quot;0.0.1&quot;\n  license &quot;ISC&quot;\n\n  bottle do\n    root_url &quot;https://github.com/avsm/homebrew-ocaml/releases/download/srcsetter-0.0.1&quot;\n    sha256 cellar: :any_skip_relocation, arm64_sequoia: &quot;b3e1289965d8bcf086db06b18e6c2865f9949a9e1202b8fafa640f3e363b6bd4&quot;\n    sha256 cellar: :any_skip_relocation, ventura:       &quot;9b61e8e4be5f777e3ef98672f275909a80c3cc3f82d6886ca1a90b66ea7bb9f8&quot;\n    sha256 cellar: :any_skip_relocation, x86_64_linux:  &quot;d8279f11f30edf865368a3c6f63d811d31c1a9ca019ef86e93afeb6624232850&quot;\n  end\n\n  depends_on &quot;gpatch&quot;\n  depends_on &quot;opam&quot;\n\n  def install\n    system &quot;bash&quot;, &quot;./.opambuild.sh&quot;\n    bin.install &quot;_opam/bin/srcsetter&quot;\n  end\nend\n</code></pre>\n<p>The end result is that <code>brew install avsm/ocaml/srcsetter</code> now works, without me having to cut a release of the tool more centrally. I'd love to incorporate some aspects of this workflow into the OCaml opam-repository, as users are currently responsible for generating the checksums themselves via <a href=\"https://discuss.ocaml.org/t/dune-release-version-1-4-0-released/6103\">dune-release</a> or <a href=\"https://opam.ocaml.org/doc/Packaging.html\">opam-publish</a>. It's an interesting twist to automate this part of the process and let the humans focus on the core package metadata instead. 
Thanks for all the help, Brewbot!</p>\n<div class=\"footnotes\"><ol><li id=\"fn:1\"><p><p>Let's leave <a href=\"\">OpenBSD</a> support to another day!</p>\n <a href=\"#fnref:1\" class=\"reversefootnote\">&#8617;</a></p></li></ol></div><h1>References</h1><ul><li>Madhavapeddy (2025). Arise Bushel, my sixth generation oxidised website. <a href=\"https://doi.org/10.59350/0r62w-c8g63\" target=\"_blank\"><i>10.59350/0r62w-c8g63</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/custom-homebrew-taps",
      "title": "How to publish custom Homebrew taps for OCaml",
      "summary": "Publish custom OCaml Homebrew taps with a simple GitHub workflow.",
      "date_published": "2025-01-31T00:00:00.000000Z",
      "date_modified": "2025-01-31T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "homebrew",
        "packaging",
        "testing",
        "bushel",
        "til"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/0r62w-c8g63",
          "doi": "10.59350/0r62w-c8g63",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/69k1e-cts10",
      "content_html": "<p>My colleagues <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> and <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> lead the publication of a comprehensive\n<a href=\"https://www.cambridge.org/engage/coe/article-details/679385946dde43c9082f7009\">report</a> of the steps the voluntary carbon market needs to take\nto restore its scientific credibility, with input from many of us in <a href=\"/projects/4c\">4C</a> and beyond.</p>\n<blockquote>\n<ul>\n<li>establishing common standards for carbon quantification and accounting, to cover additionality, leakage and permanence.</li>\n<li>avoiding perverse incentives and align the motivations of all stakeholders with high-integrity outcomes. [...]</li>\n<li>issuing all carbon credits based on trusted primary observations.</li>\n<li>making all the data needed to reproduce carbon calculations available in standard file formats.</li>\n<li>[...] reporting social and biodiversity dimensions of projects separately from carbon calculations.</li>\n<li>integrating DMRV methods into carbon and biodiversity accounting standards to reduce the financial and administrative burdens on nature-based projects and the local communities participating in or affected by them.</li>\n</ul>\n</blockquote>\n<p>This paper represents three years of hard work from the team on trying to blend remote sensing with carbon quantification. For more reading on the topic, you may also wish to browse the full <a href=\"https://4c.cst.cam.ac.uk/publications\">4C publication list</a> for the firehose of activity from the centre.</p>",
      "url": "https://anil.recoil.org/notes/credible-credit-principles",
      "external_url": "https://www.cambridge.org/engage/coe/article-details/679385946dde43c9082f7009",
      "title": "Position paper on scientifically credible carbon credits",
      "summary": "Experts outline steps to restore scientific credibility in the voluntary carbon market.",
      "date_published": "2025-01-30T00:00:00.000000Z",
      "date_modified": "2025-01-30T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carboncredits",
        "sensing",
        "economics",
        "conservation",
        "nbs"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/0r62w-c8g63",
      "content_html": "<p>This website has been through quite a few iterations over the years. The first version in 1998 was written in Perl and hosted on <a href=\"\">OpenBSD</a>; the second was rewritten in 2000 when I <a href=\"/notes/commit-access-to-php\">got commit access to PHP</a>; the third rewrite became a hybrid OCaml/PHP/Perl special in 2004 in <a href=\"https://en.wikipedia.org/wiki/Blosxom\">Blosxom</a>; then the forth rewrite around 2013 got turned into a <a href=\"/projects/unikernels\">unikernel</a> in MirageOS; then the <a href=\"https://web.archive.org/web/20220118200046/https://anil.recoil.org/\">fifth</a> in 2019 then transitioned to an OCaml static site generator hosted on a prerelease <a href=\"https://github.com/avsm/eeww\">multicore OCaml webserver</a>. So the sixth generation now needs something to continue the grand <a href=\"https://en.wikipedia.org/wiki/Rube_Goldberg_machine\">Rube Goldberg</a> tradition of helping me learn the latest and greatest in systems technology.</p>\n<p>And so here it is! The site is now written in a bleeding-edge unreleased variant of OCaml with extensions based around <a href=\"https://blog.janestreet.com/icfp-2024-index/\">Rust-like type system features</a> activated, including rather exciting <a href=\"https://popl25.sigplan.org/details/POPL-2025-popl-research-papers/23/Data-Race-Freedom-la-Mode\">data-race freedom</a> work that just won a best paper award at POPL 2025.  
It's normally difficult to work on continuously moving compilers, but Diana Kalinichenko put a tremendous amount of work into making it usable with opam out of the box, and this post documents the journey to getting this website live.</p>\n<h2 id=\"getting-the-oxidised-compiler\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#getting-the-oxidised-compiler\"></a>Getting the oxidised compiler</h2>\n<p>Firstly, we did some groundwork a few months ago by adding support into the opam-repository for <a href=\"https://github.com/ocaml/opam-repository/pull/26471\">bootstrap versions</a> of dune, menhir and ocamlfind. These are used to build the Jane Street version of the OCaml compiler, which is published as an <a href=\"https://github.com/janestreet/opam-repository/tree/with-extensions\">opam-repository#with-extensions</a>.</p>\n<p>The extensions there are straightforward for those familiar with opam. On a clean system you can run:</p>\n<pre><code class=\"language-bash\">$ opam init\n$ opam switch create 5.2.0+flambda2 \\\n --repos with-extensions=git+https://github.com/janestreet/opam-repository.git#with-extensions,default\n$ eval $(opam env)\n</code></pre>\n<p>This creates a new opam switch known as <code>5.2.0+flambda2</code>, and we can then verify it's running the variant compiler.</p>\n<pre><code>$ ocaml\nOCaml version 5.2.0+jst\nEnter #help;; for help.\n# let () =\n    let local_message : string @@ local = &quot;Hello, World&quot; in\n    print_endline local_message\n  ;;\nError: This value escapes its region.\n</code></pre>\n<p>That last bit is the new region magic which I'm keen to start experimenting with for this website! 
But before that, we need to get the rest of the ecosystem packages needed for the website working under this compiler.</p>\n<h2 id=\"installing-ecosystem-packages\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#installing-ecosystem-packages\"></a>Installing ecosystem packages</h2>\n<p>I decided to build the new site based on a content manager I've been designing\n(and scrapping) for a few years, codenamed Bushel.  The basic idea behind\nBushel is to extend Markdown sufficiently with rich contextual data (such as\ncontacts, papers, projects, ideas and so on), and allow for cross-referencing\nto <em>other</em> sites that also follow the Bushel protocol. I'll talk about that in\nmore detail in future posts, but for now that means that I need a more dynamic\nwebsite than the static one I used for the past few years.</p>\n<p>Since the Jane Street compiler doesn't yet support the effect system from OCaml 5, I couldn't use my own Eio-based webserver. So after some discussion with <a href=\"https://mynameismwd.org\">Michael Dales</a> who is <a href=\"https://digitalflapjack.com/blog/the-partially-dynamic-web/\">also porting his site to OCaml</a>, I took the opportunity to learn the excellent <a href=\"https://aantron.github.io/dream/\">Dream</a> server, which is based on Lwt.  I also used Daniel Bunzli's <a href=\"https://discuss.ocaml.org/t/ann-cmarkit-0-3-0-commonmark-parser-and-renderer-for-ocaml/13622\">cmarkit</a> library for Markdown parsing, and my own <a href=\"https://github.com/avsm/jekyll_format\">Jekyll_format</a> and <a href=\"https://github.com/avsm/ocaml-yaml\">yaml</a> libraries.</p>\n<p>Amazingly, all of these libraries worked out of the box on the Jane Street\ncompiler, except for one snag: the parsetree internals have changed in their\nbranch. This means that <a href=\"https://ocaml.org/docs/metaprogramming\">PPX</a>\nextensions will not work out-of-the-box. 
Thankfully, there is an abstraction\nlibrary called <a href=\"https://discuss.ocaml.org/t/ann-ppxlib-034-0/15952\">ppxlib</a> which\nhas been ported to the variant compiler, and the differences in the parse tree\nwere easy to fix up (thanks Nathan Reb and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> for your recent ppxlib work!)</p>\n<p>After forking and fixing just two libraries that were using ppx (and not part of the\nJane Street core libraries that were already ported), all I had to do was to pin them\nand add them to my development environment.</p>\n<pre><code>opam pin add ppxlib 0.33.0+jst\nopam pin add dream-httpaf 1.0.0~alpha4\nopam pin add hpack https://github.com/avsm/ocaml-h2.git#js-extensions-fixes\nopam pin add lwt_ppx https://github.com/avsm/lwt.git#js-extensions-fixes\n</code></pre>\n<p>And this then installs the overridden version of packages that I needed,\nwith the pins making sure that the right dependencies were also present.\nAfter that, it was plain sailing! I've now compiled up a native code version\nof my webserver code, deployed it into a <a href=\"\">Docker</a> container, and\ndeployed it on Linux.</p>\n<p>In the future, I hope to use <a href=\"https://preview.dune.build\">dune package management</a> to ease the deployment\nof the site, but it didn't work in its current preview form due to a <a href=\"https://github.com/ocaml/dune/issues/11405\">problem\nwith depopts</a>. Just teething\nproblems with a preview, so I'll post more about that when I get it working!\nI also have a half-finished port of the variant compiler to OpenBSD, so that\nI can shift my website back to its familiar home rather than running on Linux.</p>\n<p>I haven't yet actually taken advantage of any of the new extensions in the\nJane Street variant, since I wanted to get this site up and running first.\nI'll tidy up the code, open source it in the coming weeks, and then we can\ndive into some region extensions and see how far I get!</p>",
      "url": "https://anil.recoil.org/notes/bushel-lives",
      "title": "Arise Bushel, my sixth generation oxidised website",
      "summary": "Learn about my sixth generation oxidised website built with a bleeding-edge OCaml variant.",
      "date_published": "2025-01-29T00:00:00.000000Z",
      "date_modified": "2025-01-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "recoil",
        "selfhosting",
        "bushel"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-ce-llm-2",
      "content_html": "<p>We have just updated our <a href=\"/papers/2024-ce-llm\">preprint</a> on using LLMs for evidence decision support with more evaluation results and corrections from peer review.</p>\n<blockquote>\n<p>Our findings suggest that, with careful domain-specific design, LLMs could potentially be powerful tools for enabling expert-level use of evidence syntheses and databases. However, general LLMs used &quot;out-of-the-box&quot; are likely to perform poorly and misinform decision-makers. By establishing that LLMs exhibit comparable performance with human synthesis experts on providing restricted responses to queries of evidence syntheses and databases, future work can build on our approach to quantify LLM performance in providing open-ended responses.</p>\n</blockquote>\n<p>See also the fantastic <a href=\"https://watch.eeg.cl.cam.ac.uk/w/ijC1E36q7fn2qwxs7opSJq\">EEG seminar talk</a> that the student group who worked on this over the summer gave towards the end of last year.</p><h1>References</h1><ul><li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-ce-llm-2",
      "title": "Updated preprint on LLMs for evidence-based decision support",
      "summary": "Updated preprint with additional evaluation results on using LLMs for expert-level evidence synthesis queries in conservation.",
      "date_published": "2025-01-23T00:00:00.000000Z",
      "date_modified": "2025-01-23T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":ce",
        "evidence",
        "llms",
        "ai",
        "conservation"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-ce-llm.pdf",
          "mime_type": "application/pdf",
          "title": "Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-life-3",
      "content_html": "<p>After some years of hard work, our <a href=\"/projects/life\">Mapping LIFE on Earth</a> biodiversity metric was published today in a <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">special issue</a> of the Royal Society Philosophical Transactions B!  The idea behind LIFE is that although human-driven habitat loss is known to be  the greatest cause of the <a href=\"https://www.unep.org/facts-about-nature-crisis\">biodiversity crisis</a>, we do not yet have robust spatially explicit metrics that <em>quantify</em> the relative impacts of human actions on species extinctions.  And that's what LIFE provides: a way to compare the relative impacts of some landuse anywhere in the world, in a manner that is globally applicable.</p>\n<p>There are lots of limitations in this first version: it does not yet cover plants (but we're <a href=\"/notes/ukri-grant-terra\">working on it</a>), and some other taxa such as freshwater (working on that too!), but it's a very solid start.  We've uploaded the <a href=\"https://zenodo.org/records/14188450\">LIFE datasets</a> to Zenodo, and the full pipeline <a href=\"https://github.com/quantifyearth/LIFE\">source code</a> is available too.</p>\n<p>The release of LIFE is also discussed further in an article on <a href=\"https://news.mongabay.com/2025/01/life-scores-map-out-where-habitat-loss-for-crops-drives-extinction/\">Mongabay</a>.</p>\n<blockquote>\n<p>Noodling around with the team’s maps can reveal noticeable variations in the\nrisk of extinction to various species in different regions, with implications\nfor sourcing the goods we use every day. For example, clearing a hectare of\nforest in the Congo Basin will nudge far more species toward extinction than\ndoing the same in northern Europe — the former is much more biodiverse than\nthe latter. Companies could also use these maps to boost their sustainability\nby sourcing goods from places where extinctions are less likely. 
Or,\nindividual consumers might use them to understand how their consumption\nchoices affect species’ habitats.\n<cite> -- <a href=\"https://news.mongabay.com/2025/01/life-scores-map-out-where-habitat-loss-for-crops-drives-extinction/\">John Cannon</a></cite></p>\n</blockquote>\n<p>The <a href=\"https://www.sei.org/publications/life-metric-mapping-global-extinctions/\">Stockholm Environment Institute</a> also covered it:</p>\n<blockquote>\n<p>In a major step forward, the authors of this paper have expanded the potential applications of the &quot;persistence score&quot; approach to create LIFE: they have introduced high-performance computing and brought in data for over 30 000 vertebrate species. The tool can now create global maps of extinction risk probability for 30 875 species of terrestrial vertebrates at 1 arc-minute resolution (3.4 km2 at the equator).</p>\n<p>The technical leap means that, for the first time, map users can access powerful quantitative data regarding the expected number of extinctions (whether that is an increase or decrease) caused by the conversion of natural vegetation to agriculture, or restoring farmland to natural habitat.</p>\n</blockquote>\n<p>If you'd like to see a talk about LIFE, I discussed it in my recent LambdaDays keynote:\n<div class=\"video-center\"><iframe title=\"Programming for the Planet\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/d592bf17-c835-435f-9469-f0f65e926975\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-life-3",
      "title": "LIFE metric published in Royal Society Phil Trans B",
      "summary": "Publication of LIFE biodiversity metric quantifying relative impacts of human actions on species extinctions through spatially explicit habitat loss analysis.",
      "date_published": "2025-01-09T00:00:00.000000Z",
      "date_modified": "2025-01-09T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "biodiversity",
        "spatial",
        "economics",
        "conservation",
        "sdms",
        "aoh"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-life.pdf",
          "mime_type": "application/pdf",
          "title": "LIFE: A metric for mapping the impact of land-cover change on global extinctions"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/qms3q-ymn65",
      "content_html": "<p>Here are the various repos used to create the interactive <a href=\"/notes/teaching\">teaching</a> environment\nwe use for 1A Foundations of Computer Science in Cambridge. It may be useful to\nother professors who are using OCaml in their courses.</p>\n<ul>\n<li><a href=\"https://github.com/avsm/teaching-fcs\">https://github.com/avsm/teaching-fcs</a> is a private repo, but ping me if\nare teaching and I'll give you access (it has coursework answers in it).\nWe use a Jupyter notebook, with the course written in Markdown using the\n<a href=\"https://github.com/realworldocaml/mdx\">mdx</a> OCaml parser which evaluates\ntoplevel phrases through the compiler and promotes the output directly\ninto the markdown.</li>\n<li>We then convert the Markdown into Jupyter format using a\n<a href=\"https://github.com/realworldocaml/mdx/pull/124\">fork of mdx</a>, and then\nnbconvert it into LaTeX for the printed notes.</li>\n<li>A <a href=\"https://jupyter.org/install.html\">JupyterLab</a> installation with a\n<a href=\"https://github.com/akabe/ocaml-jupyter\">custom OCaml kernel</a> suffices\nfor the live setup. Every student gets their own container on the server\nand one server is sufficient for a full class of ~125 students.</li>\n</ul>\n<p>Ping me if you want to know more, and other people who have worked\non this with me are <a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a>, <a href=\"https://www.dra27.uk\">David Allsopp</a> and <a href=\"https://jon.recoil.org\">Jon Ludlam</a>, with Jon\nbeing the currently active additional lecturer on the course as of 2024/2025.</p>",
      "url": "https://anil.recoil.org/notes/focs",
      "title": "Foundations of Computer Science",
      "summary": "Resources for teaching Foundations of Computer Science with OCaml and Jupyter notebooks.",
      "date_published": "2018-09-02T00:00:00.000000Z",
      "date_modified": "2025-01-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "teaching",
        "cambridge",
        "computerlab",
        "pembroke",
        "compsci",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-ai-conhorizon-1",
      "content_html": "<p>Back in July 2024, a large group of conservation and computer scientists got together in the <a href=\"https://conservation.cam.ac.uk\">CCI</a> to prioritise the storm of AI-related projects that have been kicking off around the world. Our key goal was to harness AI to accelerate the positive impact of conservation efforts, while minimising harm caused through either the direct or indirect use of AI technologies.</p>\n<p>The first horizon scan resulting from this has just been published in Trends in Ecology and Evolution. If you're looking for a gentle introduction to some of the terms in AI from a non-experts perspective, the first section does a good job of defining a glossary as well. The panel identified 21 key ideas ranging from species recognition for uncovering 'dark diversity' to multimodal models for improving biodiversity loss predictions, monitoring wildlife trade, and addressing human-wildlife conflict. Importantly, we also considered the potential negative impacts of AI adoption, including AI colonialism and the loss of essential conservation skills. The scan provides guidance on how the conservation field can adapt to harness AI's benefits while mitigating its risks.</p>\n<p><img src=\"/images/2024-ai-horizon-scan-group.webp\" alt=\"%c\" title=\"The horizon scanners assemble after a successful workshop!\" ></p><h1>References</h1><ul><li>Reynolds et al (2024). The potential for AI to revolutionize conservation: a horizon scan. <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\" target=\"_blank\"><i>10.1016/j.tree.2024.11.013</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-ai-conhorizon-1",
      "title": "Horizon scan on AI and conservation published",
      "summary": "Horizon scan published in Trends in Ecology and Evolution on prioritizing AI projects to accelerate conservation efforts while minimizing harm.",
      "date_published": "2024-12-05T00:00:00.000000Z",
      "date_modified": "2024-12-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "cci"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-ai-conhorizon.pdf",
          "mime_type": "application/pdf",
          "title": "The potential for AI to revolutionize conservation: a horizon scan"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1016/j.tree.2024.11.013",
          "doi": "10.1016/j.tree.2024.11.013",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-loco-emissions-1",
      "content_html": "<p>Customers of online services may want to take carbon emissions into account\nwhen deciding which service to use, but it's currently difficult to do so due\nto the lack of reliable emissions data that is comparable across online\nservices. There's a lot of muddled data out there, and calculating accurate\ncarbon emissions across a computing pipeline involves a number of stakeholders,\nnone of whom are incentivised to accurately report their emissions for\ncompetitive reasons!</p>\n<p>In this <a href=\"https://locos.codeberg.page/loco2024/\">LOCO</a> paper, <a href=\"https://www.cst.cam.ac.uk/people/psjm3\">Jessica Man</a> lead our\nexploration of mechanisms to support verifiable <em>and</em> privacy-preserving\nemissions reporting across a chain of energy suppliers, cloud data centres,\nvirtual machine hosting services providers and cloud services providers. The\nidea is that all of this can ultimately be exposed to APIs that can be consumed\nby client devices in order to let consumers make direct choices about their\ndecisions based on relative environmental impacts.</p><h1>References</h1><ul><li>Man et al (2025). Emission Impossible: privacy-preserving carbon emissions claims. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2506.16347\" target=\"_blank\"><i>10.48550/arXiv.2506.16347</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-loco-emissions-1",
      "title": "Towards verifiable, privacy-preserving carbon emissions claims",
      "summary": "Paper on mechanisms for verifiable and privacy-preserving carbon emissions reporting across cloud service supply chains at LOCO workshop.",
      "date_published": "2024-12-01T00:00:00.000000Z",
      "date_modified": "2024-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carbon",
        "emissions",
        "crypto",
        "zkp",
        "loco"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-loco-emissions.pdf",
          "mime_type": "application/pdf",
          "title": "Emission Impossible: privacy-preserving carbon emissions claims"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2506.16347",
          "doi": "10.48550/arXiv.2506.16347",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-loco-shark-1",
      "content_html": "<p>All the work we've been doing on biodiversity (such as <a href=\"/projects/life\">LIFE</a>) comes at\na fairly large computation and storage cost due to the amount of data that we\nchurn through. This gets worse when you consider the exploratory nature of\nscience -- we sometimes just need to mess around with the large dataset to test\nhypotheses which are often shown to be wrong.  So then, when the\n<a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO</a> conference came around, we wrote\nup our thoughts on what a <em>frugal</em> Linux userspace might look like.</p>\n<p>The key insight is that the Linux kernel already exposes a number of namespace\nmechanisms (that we use in Docker, for example), and so we explore a new OS\narchitecture which defaults to deterministic, reusable computation with the\ncareful recording of side-effects. This in turn allows Linux to guide complex\ncomputations towards previously acquired intermediate results, but still\nallowing for recomputation when required by the user. We're putting this\ntogether into a new shell known as &quot;Shark&quot;, and this first abstract describes\nour early results.</p>",
      "url": "https://anil.recoil.org/notes/2024-loco-shark-1",
      "title": "Towards a frugal userspace for Linux",
      "summary": "LOCO paper presenting Shark shell for deterministic, reusable Linux computation using kernel namespaces to reduce computational costs.",
      "date_published": "2024-12-01T00:00:00.000000Z",
      "date_modified": "2024-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "shark",
        "life",
        "carbon",
        "linux",
        "docker",
        "loco"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-loco-shark.pdf",
          "mime_type": "application/pdf",
          "title": "Lineage first computing: towards a frugal userspace for Linux"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-loco-carbonres-1",
      "content_html": "<p><a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and I have been thinking about how the current Internet architecture fails to treat the carbon emissions\nassociated with networked services as a first-class metric. So when the <a href=\"https://locos.codeberg.page/loco2024/\">LOCO</a> conference came up, we tried extending the DNS with load balancing techniques to consider the carbon cost of scheduling decisions. A next step was then to build a custom <a href=\"https://github.com/RyanGibb/eon\">DNS server written in OCaml</a> to actively wake machines running networked services as a side effect of the name\nresolution.</p>\n<p>Extending DNS means that we maintain compatibility with existing Internet\ninfrastructure, unlocking the ability for existing applications to be\ncarbon-aware. This is very much a spiritual follow on to the\n<a href=\"/papers/2013-foci-signposts\">Signposts</a> project that I worked on back in 2013, and\nhave always wanted to return to!</p>",
      "url": "https://anil.recoil.org/notes/2024-loco-carbonres-1",
      "title": "Prototyping carbon-aware domain name resolution",
      "summary": "Extending DNS with carbon-aware load balancing and OCaml-based DNS server for sustainable Internet infrastructure presented at LOCO.",
      "date_published": "2024-12-01T00:00:00.000000Z",
      "date_modified": "2024-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting",
        "dns",
        "distributed",
        "carbon",
        "signpost",
        "loco"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-loco-carbonres.pdf",
          "mime_type": "application/pdf",
          "title": "Carbon-aware Name Resolution"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-loco-terracorder-1",
      "content_html": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and I have been having great fun designing embedded systems for\ncooperative biodiversity monitoring. Josh presented our work over at <a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO\n2024</a> with an abstract on the\nTerracorder project. Read more if you enjoy a combination of machine learning\nand ESP32 hacking.</p>",
      "url": "https://anil.recoil.org/notes/2024-loco-terracorder-1",
      "title": "Cooperative Sensor Networks for Long-Term Biodiversity Monitoring",
      "summary": "LOCO 2024 presentation on Terracorder project combining machine learning and ESP32 for cooperative biodiversity monitoring networks.",
      "date_published": "2024-12-01T00:00:00.000000Z",
      "date_modified": "2024-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "terracorder",
        "biodiversity",
        "sensing",
        "embedded",
        "ai",
        "qlearning",
        "loco"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-loco-terracorder.pdf",
          "mime_type": "application/pdf",
          "title": "Cooperative Sensor Networks for Long-Term Biodiversity Monitoring"
        }
      ],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/0znpc-fw825",
      "content_html": "<p>I got invited to join the Royal Society and DeepMind to a summit on\nhow AI is revolutionising scientific discovery and trotted along with\n<a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>. This event is hot on the heels of the\nexcellent RS report on <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Science in the Age of AI</a>\nand, of course, the Nobel prize for Demis Hassabis which was the <a href=\"https://www.cst.cam.ac.uk/news/nobel-prize-our-alumnus-sir-demis-hassabis\">first ever\nfor my department</a>!\nThe event was held at the BAFTA today, and what follows are my quick livenotes\nas there was just so much to absorb. The RS and Deepmind will have the\nfull sessions online sometime soon, so I'll update this with those more polished\noutputs when they're out! <em>Update: Proper notes now available from <a href=\"https://blog.google/technology/ai/ai-science-forum-2024/\">Google</a> and the <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Royal Society</a>.</em></p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-4.webp\" alt=\"%c\" title=\"Hannah Fry doing a great job emceeing\" ></p>\n<p>The summit was a day-long exploration of how artificial intelligence is\ntransforming science and society, and the overall theme (including four Nobel\nlaureates!) was on how we are in a golden age of scientific discovery,\nespecially in the biological sciences. The emcee for the event was Hannah Fry,\nwho simply dazzled with her rapid summarisation of complex discussions\ninterspersed with very dry humour about the proceedings!</p>\n<p>The consistent message was how ongoing synergy between science, technology, and\nsociety is essential to setting the stage for an exploration of transformative\ndiscoveries powered by AI that <em>would benefit everyone in society</em>. 
Missing that\nsynergy would lead to unequal benefit or dangerous crossings of boundaries.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-8.webp\" alt=\"%c\" title=\"Busy schedule for the day at BAFTA HQ!\" ></p>\n<h2 id=\"lessons-from-crispr\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#lessons-from-crispr\"></a>Lessons from CRISPR</h2>\n<p>The first session had James Manyika interviewing <a href=\"https://en.wikipedia.org/wiki/Jennifer_Doudna\">Jennifer Doudna</a>, Nobel Laureate and co-inventor of CRISPR, who shared how gene editing has moved from science fiction to an essential tool for societal improvement. Some key points:</p>\n<ul>\n<li>AI's integration with CRISPR allows scientists to better predict and control\ngenome editing outcomes, advancing efficiency and reducing hypothesis-testing\ntime. Many experiments could potentially be skipped thanks to simulations\npredicting outcomes without the need for wetlab work.</li>\n<li>Jennifer discussed projects like <a href=\"https://www.ucdavis.edu/food/news/making-cattle-more-sustainable\">methane-free cows</a>,\nwhere altering cattle genomes could eliminate methane production entirely.\nThese efforts require multidisciplinary collaboration between computer\nscientists, agricultural experts, and biologists.</li>\n<li>The success of CRISPR emphasises the need for public engagement, policy\nframeworks, and open databases for international collaboration. Doudna called\nfor knowledge accessibility, including teaching high school educators about\ngenome editing's impact, as a key part of public outreach about how this\ntechnology might affect society in the future.</li>\n<li>CRISPR moved really fast: it was published in 2012, and by 2014 scientists\nwere already editing monkey embryos. 
This led to a realisation that it\nwasn't enough to be head down in the lab; a whole team was needed that works on\npublic impact and policy (including RS and national academies) to bootstrap\ninternational meetings on human genome editing. The most recent was held in\nLondon in March of last year which led to open, transparent conversations and\nbuilding a worldwide database of work involving genome editing, especially that\nwhich impacts human genome or environmental editing which could escape.</li>\n</ul>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-1.webp\" alt=\"%r\" title=\"James and Jennifer in discussion.\" ></p>\n<h2 id=\"science-in-the-age-of-ai\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#science-in-the-age-of-ai\"></a>Science in the Age of AI</h2>\n<p>The panel next was <a href=\"https://en.wikipedia.org/wiki/Eric_Topol\">Eric Topol</a> chairing a discussion with <a href=\"https://en.wikipedia.org/wiki/Pushmeet_Kohli\">Pushmeet Kohli</a>, <a href=\"https://en.wikipedia.org/wiki/Fiona_Marshall_(pharmacologist)\">Fiona H. Marshall</a> and <a href=\"https://en.wikipedia.org/wiki/Alison_Noble\">Alison Noble</a>. The theme was on how a huge number of foundation\nmodels are coming out for LLLMs (large language life models) at a dizzying\npace.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-2.webp\" alt=\"%r\" title=\"Eric, Pushmeet, Fiona and Alison on stage.\" ></p>\n<p>Pushmeet Kohli first explained how deciphering the genome accelerates\nbiological discoveries by orders of magnitude. AI tools like AlphaFold (which\njust got the Nobel prize) exemplify breakthroughs that transform biology into a\npredictive science from a wetlab driven discipline.  
On other fronts,\nDeepMind's GraphCast model enables near-term weather forecasting in minutes,\nwhich previously required days of supercomputer time to do an equivalent\nforecast (and now\n<a href=\"https://www.nature.com/articles/s41586-024-07744-y\">NeuralGCM</a> is doing even\nbetter with mechanistic models combined with data).  Pushmeet then noted how\nGNNs for materials science identified over 400k (or 200k? couldn't catch it) new\nstable inorganic crystals, with potential applications like high-temperature\nsuperconductors which were just scifi before.</p>\n<p>Then Fiona H. Marshall from Novartis emphasized how AI identifies new drug\ntargets using genomic and population-level data.  In drug development,\npredictive safety is absolutely crucial. Pharmaceutical companies have decades’\nworth of histological data, such as rodent testing, stored on hundreds of\nthousands of glass slides that are now being digitized. Once this data is\nprocessed, we can use it to make various predictions. Sharing this data across\nthe pharmaceutical industry would benefit everyone. One of their partner\ncompanies is developing scanning algorithms, and once these are operational\nthey will be made open, so the entire industry will gain from the resulting\ndataset.  Generative chemistry (like AlphaFold) now designs drugs faster while\npredictive toxicology ensures safer medicines.\nInterestingly, the data scientists are in the prime author\nposition as they are dictating the experimental procedures vs the wetlab\nscientists a few decades ago. This change in incentives drives change within a\nlarge org towards more data driven methods.</p>\n<p>Another valuable source of data is population-level information, such as <a href=\"https://ourfuturehealth.org.uk\">Our\nFuture Health</a> (a UK-based NHS initiative).\nProper management of nomenclature will ensure that this project generates a\nmassive, usable dataset vs what we have anywhere else.  
Eric noted that they\nrely heavily on the UK Biobank, which, with its 500,000 participants, is one of\nthe largest in the world and the Our Future Health program is a huge leap\nforward with 5m participants. The NIH in the United States is hesitant to\nengage in public-private partnerships, and so the UK is leading the way in this\ndomain (<em>Anil notes: with privacy concerns about the data sharing</em>).</p>\n<p>Fiona also noted that AI is transforming clinical trials, not\njust the discovery process itself. Typically, it takes around 10 years for a\ncandidate molecule to move to the clinical trial phase. One major bottleneck is\npatient enrollment. By leveraging vast demographic databases, which include\ninformation on global populations, their diseases, medications, and hospital\naffiliations, we can drastically improve recruitment efficiency.  For example,\na clinical trial protocol targeting &quot;women aged 30-50 who are not taking drug X\nor estrogen modifiers&quot; can utilize these databases to identify and enroll\npatients quickly. This approach can reduce recruitment time from three years to\njust six months, significantly accelerating the process of getting drugs to\nmarket.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-12.webp\" alt=\"%r\" title=\"Sneaking in some sightseeing at BAFTA\" ></p>\n<p>Alison Noble discussed how deep learning has revolutionized ultrasound imaging.\nAI now guides users on probe placement, reducing training requirements for\nmedical professionals. However, we're now moving so fast that we need to be\ncareful; even the notion of what a scientist is is changing. In the RS report\non <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Science in the Age of AI</a> a number of scientists around the UK were\nconsulted and this concern of reproducibility and data access came up loud and\nclear. 
When we publish results, we need to make sure they are sound and that\npeer review is possible. Openness is a deliberate choice however and not always\nappropriate when sensitive data is involved (e.g. healthcare) but requiring a\nrigor in evaluation is essential for good science. Scientists need to rethink\nin the age of AI how we present our work, and how we train new scientists in\nthis environment. So we have some wonderful early examples like AlphaFold, but\nwe need to understand the societal/incentive impacts on our new generation of\nscientists.</p>\n<p>Eric also noted that one of the greatest challenges in genomics is\nunderstanding variance, and\n<a href=\"https://www.science.org/doi/10.1126/science.adg7492\">AlphaMissense</a> has\ntackled this head-on. However, there is a significant data shortage. Without\nHelen Berman and the creation of the <a href=\"https://en.wikipedia.org/wiki/Worldwide_Protein_Data_Bank\">protein data\nbank</a>, AlphaFold\nwouldn’t have been possible. This raises the critical question: where do we\nsource the necessary inputs?  Pushmeet responded that intelligence doesn’t\nemerge in isolation; it relies on experiential datasets. Models can derive this\nexperience from real-world input or interactions within simulations.\nHigh-fidelity simulations, in particular, provide models with valuable\nintuition about future outcomes. Experimental data is also critical, as it\nintegrates with simulations to complete the picture.  When dealing with\nunlabeled data, prediction labels generated by the model itself can be used for\nfurther training. However, it's essential to discard incorrect labels to ensure\naccuracy, which makes this technique effective for bridging data gaps.\nCrucially, a pure and uncontaminated test set is vital to ensure the\nreliability of the system. 
For example, AlphaMissense was trained in an\nunsupervised manner and successfully identified cancer mutations.</p>\n<p>The discussion was quite wide ranging, but overall the two key challenges were:</p>\n<ul>\n<li>Reproducibility in science is a growing concern as AI accelerates discovery.\nAlison Noble emphasized the need for rigorous validation of results.</li>\n<li>Pushmeet noted the importance of testing the &quot;prediction hull&quot; of AI systems\nto understand their uncertainty and limitations, which was how AlphaFold\nbuilt up user confidence (by not having false positives).</li>\n</ul>\n<p>As AI transforms science, public-private partnerships and interdisciplinary\ncollaboration are essential (like the Our Future Health program).  Transparency\nand openness are deliberate choices in science, but regulation must keep up\nwith the pace of innovation. Alison Noble also noted there is a culture change\ngoing on for these public/private partnerships within academia. While\ncompetition drives innovation, we need to make sure that the academic reward\nsystem keeps up; if there are 50 people on a paper then how is this attractive\nfor young scientists to enter a field and make their own name?</p>\n<h2 id=\"a-view-from-the-frontier-of-cell-biology\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-view-from-the-frontier-of-cell-biology\"></a>A view from the frontier (of cell biology)</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Siddhartha_Mukherjee\">Siddhartha Mukherjee</a>, a cancer physician and Pulitzer Prize winner (and\npersonally speaking, author of one of my <a href=\"https://en.wikipedia.org/wiki/The_Emperor_of_All_Maladies\">favourite medical\nbooks</a> ever), began\nthe discussion with a warning about potential breaches of data privacy and the\ndangers they pose. He emphasized the risk of AI being weaponized for\nmisinformation, calling it a frontier challenge. 
These issues highlight the\nurgent need to develop mitigation strategies while continuing to advance the\ncapabilities of AI in their respective fields.\nSiddhartha emphasized that data is the critical missing link in advancing AI.\nIssues of access, quality, integration, equity, and privacy must be addressed.\nThe complexity of data ownership in AI raises ethical and commercial concerns,\nas data aggregators often benefit disproportionately.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-9.webp\" alt=\"%r\" title=\"Siddhartha on stage with Anne, Janet and Anna.\" ></p>\n<p><a href=\"https://www.ebi.ac.uk/people/person/janet-thornton/\">Janet Thornton</a>, from the European Molecular\nBiology Lab, shared her perspective on protein structures. She highlighted how\nAI has transformed the field—from modeling just 20 protein structures in the\nearly days to now over 214 million. Structural biologists worldwide are using\nthis data to validate their findings, explore ligand binding, and venture into\nuncharted territories of protein functionality.  Anna delved into her work as a\ncell biologist studying membrane proteins and their organization within the\nbody. She shared a case from Cyprus, where mysterious kidney disease affected\ncertain families for decades. AI-driven image recognition was used to identify\na genetic mutation, leading to a therapeutic solution. The issue stemmed from a\nmisshapen protein caused by a nodal molecule that traps faulty proteins,\nultimately causing cell death. This discovery is now being applied to other\nconditions, such as Alzheimer’s and blindness, offering hope for broader\ntherapeutic applications.</p>\n<p>Janet also spoke about her time as the director of the European\nBioinformatics Institute, which manages data repositories like the <a href=\"https://www.wwpdb.org\">Worldwide\nProtein Data Bank</a>. 
She described the cultural shift required to encourage data\nsharing, noting it took 20 years for crystallographers to agree to mandatory\ndata deposition before publication. She stressed that medical data,\nparticularly in clinical contexts, must undergo a similar transformation.\nPublicly funded data must eventually reach the commons, especially when such\nsharing has the potential to save lives. The UK’s NHS, with its secure data\nenvironments, provides a strong model for this approach. However, the health\nsector needs to move faster than the crystallography community did, requiring\nbuy-in from both patients and medical professionals, as emphasized in Cathie\nSudlow’s recent report on the UK’s health data landscape.</p>\n<p><a href=\"https://www.broadinstitute.org/bios/anna-greka\">Anna Greka</a>, a pathologist and head of a new institute focusing on women’s\ncancer at the Broad Institute, discussed her work on building AI tools to identify and facilitate the\ndevelopment of disease mechanisms.  Anna Greka added that millions of human\ngenomes have been sequenced and aggregated into databases usable by scientists\nworldwide. If pathology labs globally pooled their data, AI training models\nwould benefit immensely. She suggested anonymizing the data while preserving\nmetadata and federating results across countries to protect privacy and enhance\nglobal collaboration.</p>\n<p>Anne Vincent-Salomon, head of diagnostic and theranostic medicine at the\nInstitut Curie, stressed the importance of multidisciplinary science and\neffective communication. She emphasized the need to educate the public,\nreducing fear and fostering understanding of scientific advancements.</p>\n<p>Anna concluded by underscoring the importance of understanding the &quot;unit of\nlife&quot; to progress in biology. She argued for the creation of a high-quality\nperturbation dataset for cells, akin to the Protein Data Bank’s role in\nAlphaFold. 
Skeptical of synthetic data, she emphasized the need for\nexperimentally derived data as a foundation for future breakthroughs. She\ncalled this effort a moonshot for the next five years -- a grand challenge to\ndeepen our understanding of life itself! (<em>Anil: see also this great <a href=\"https://www.ted.com/talks/anna_greka_the_world_s_rarest_diseases_and_how_they_impact_everyone?subtitle=en\">TED talk</a> from Anna last year</em>)</p>\n<h2 id=\"the-polycene-exploring-the-opportunity-of-the-moment\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-polycene-exploring-the-opportunity-of-the-moment\"></a>The Polycene: Exploring the Opportunity of the Moment</h2>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-5.webp\" alt=\"%r\" title=\"Thomas Friedman talking about the polycene.\" ></p>\n<p>The (epic) next speaker was Thomas L. Friedman, who explored the interplay of three critical &quot;scaling laws&quot; in the modern era:</p>\n<ol>\n<li><strong>Software</strong>: The rapid improvement of AI capabilities post-2017 (with transformers and GPUs).</li>\n<li><strong>Carbon</strong>: Rising CO2e and methane emissions driving &quot;<a href=\"https://www.newstatesman.com/science-tech/2021/04/why-we-need-talk-about-global-weirding\">global weirding</a>&quot; (extreme and destructive climate changes).</li>\n<li><strong>Disorder</strong>: Societal and institutional fragmentation in addressing these crises.</li>\n</ol>\n<p>Between 2017, with the introduction of transformer algorithms, and 2020, when\nadvancements in microchips and GPUs took off, artificial intelligence has\nexpanded dramatically. This period reflects a &quot;scaling law&quot; in action.\nPolymathic AI—AI capable of addressing a broad range of problems—now seems\nwithin reach. In just three years, AI-driven science has evolved from science\nfiction to reality and become accessible to many (albeit often with some\nlimitations on free access). 
To address the challenges AI presents, we need\nhigher-dimensional thinking for higher-dimensional problems.</p>\n<p>At the same time, we're seeing a scaling law in carbon emissions. Levels of CO₂\nand methane in the atmosphere, including methane from livestock, are causing\ndestructive climate change. The seven warmest years on record occurred between\n2015 and 2021, resulting in what’s called &quot;global weirding&quot;—where hot regions\ngrow hotter, wet regions grow wetter, and the effects become increasingly\ndestructive.</p>\n<p>In parallel with these scaling points in carbon and silicon (AI hardware),\nwe’re facing a scaling point in disorder—the erosion of civic structures.\nGovernments worldwide have over-promised on the benefits of industrial\ndemocracies, such as healthcare, retirement plans, and infrastructure, yet are\nstruggling to deliver. Societies are aging, educational attainment has\nstagnated, and productivity growth has slowed.</p>\n<p>We're also witnessing growing national security challenges and the dissolution\nof the great power balance that defined the post-Cold War era. Electric\ntransportation, healthcare, and employment systems are under strain, leading to\nincreased migration. Today, there are 56 active conflicts globally—the highest\nnumber since World War II—and more displaced people than at any point in\nhistory.</p>\n<p>We need a game-changer.</p>\n<p>To solve these interconnected crises, we must adopt higher-dimensional\napproaches that blend solutions across disciplines and scales. The future\nstability of our planet—and the well-being of the next generation—depends on\npresenting holistic, interconnected solutions. Friedman calls this the &quot;Polycene&quot; era.</p>\n<p>Never before has politics needed science more. 
Science must enable leaps\nforward in education and provide buffers against disruptive climate change.\nSimilarly, politics must create the frameworks and systems to allow science to\nthrive and deliver meaningful solutions.</p>\n<p>In <a href=\"https://en.wikipedia.org/wiki/That_Used_to_Be_Us\">That Used to Be Us</a>,\nFriedman argued that &quot;average is over&quot;; the rapid acceleration of technology\nmeans the American Dream -- once achievable for many -- is no longer guaranteed.</p>\n<p>However, technology can flatten access to resources. For instance, an Indian\nfarmer can now access advanced insights about crop planting, watering\nschedules, and fertilizers directly on a smartphone. For the first time, those\nwithout access to &quot;average&quot; resources now have access to them through AI.\nThanks to AI, &quot;average&quot; as a benchmark is over—and that gives Friedman optimism.</p>\n<p>However, there’s a caveat: science and politics must work together. The\nalignment problem between these fields is real and will become more pressing as\nwe approach AGI. As a species, we’ve become more godlike than ever before,\ncreating systems and intelligence that resemble a larger, more powerful brain.\nHow we use this power will determine our future.</p>\n<h2 id=\"building-the-infrastructure-for-success\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#building-the-infrastructure-for-success\"></a>Building the Infrastructure for Success</h2>\n<p>The speakers here were Paul Hofheinz, <a href=\"https://en.wikipedia.org/wiki/Asmeret_Asefaw_Berhe\">Asmeret Asefaw Berhe</a>, Fabian J. Theis and Bosun Tijani.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-6.webp\" alt=\"%r\" title=\"Paul Hofheinz, Asmeret, Fabian and Bosun.\" ></p>\n<p>Berhe began by noting that we are at an inflection point -- how do we avoid\nrepeating mistakes from the past, ensuring we don’t leave segments of human\nsociety behind or widen the digital divide further? 
In previous programs such\nas exascale computing, they insisted as part of the program that they must\nhave explicit sustainability goals. While these goals may have seemed\nunrealistic initially and were criticised, in retrospect they have shown they\ncan be achieved. An example of the next thing they're working on is the\nHigh-Performance Data Facility, recognizing that the Office of Science produces\nmore scientific data than any other entity in the world (e.g. particle physics,\ngenomic labs). We need to rethink how we handle such huge amounts of data,\naddressing concerns around privacy and integrity.</p>\n<p>Minister Tijani then talked about how in Nigeria, there is a direct correlation\nbetween science and economic growth, yet in the Global South, we often fail to\napply science effectively to societal issues. The answers we think we have\noften need to shift with context; for instance, policies from the UK don’t transplant\ncleanly to Nigeria where the population is growing far faster.</p>\n<p>Key points included:</p>\n<ul>\n<li><strong>Dataset Inclusion</strong>: Like Indian farmers accessing AI-driven agricultural\ninsights, we need datasets that represent Nigeria, Rwanda, and other African\ncountries. Nigeria’s average age is 16.9, with 5 million new people entering\nthe population each year. The workforce of the future will come from Africa.</li>\n<li><strong>Infrastructure</strong>: Establishing certainty in data infrastructure is\ncritical. Countries will need to ensure their data infrastructures allow for\nglobal participation rather than stagnation.</li>\n<li><strong>Digitization</strong>: Much of Nigeria’s existing knowledge isn't encoded in a\nform digestible by AI. Smart digitization efforts are necessary to create\ninputs for widely used models.</li>\n</ul>\n<p>To address these challenges, the Nigerian government did a few things.  
Over\nthe past 30 years, publications and a name library were correlated to identify\nAI-focused Nigerian scientists. This effort brought 100 of them together to\ndevelop a national AI strategy for Nigeria.  An AI Trust was created with a\ncommunity of trustees to support this strategy. And then a Talent Attraction\nProgram was launched, supported by Google and the government, alongside local\nprivate investment. This is one of the largest talent accelerators globally.\nNigeria aims to become a net exporter of talent, much like India’s success in\nthis area.</p>\n<p>Fabian then talked about how many scientists are driven by natural curiosity,\nand it's vital to nurture that environment. The &quot;holy trinity&quot; of AI consists\nof data, compute and algorithms, which need to come together.  Ten years ago,\ncomputer vision flourished due to ImageNet’s availability and now we’re\nentering a new era with foundation models for cell biology. These models\nrequire algorithmic innovations to merge datasets and techniques like\nmultiscale modeling mixed with self-supervised learning to succeed.</p>\n<p>We're at a tipping point where we can build generalizable models that can be\nadapted for specific tasks around the world (a reference to the Nigerian\nuse cases earlier).</p>\n<p>Some key points discussed were:</p>\n<ul>\n<li><em>Equitable compute access</em>: Academic/industrial partnerships are essential to make GPUs accessible for foundational research.</li>\n<li><em>Cell Atlas</em>: Foundation models help academics plan experiments (&quot;lab in the loop&quot;) and overlay disease data for deeper insights. The Deep Cell Project for example aims to not only create steady-state models but also add perturbation behaviors, enabling predictions about how cells respond to interventions. Unlike physics, where laws of motion guide us, cell biology lacks such universal equations. 
Deep Cell integrates image-based observations, tissue contact data, computational morphology, and clinical data to create a comprehensive model that maps healthy states and potential interventions.</li>\n<li><em>Benchmarks</em>: Gene data is consistent internationally, but we need\nstandardized benchmarks to equalize global talent and foster competition.\nBenchmarks are powerful tools for collaboration and innovation as they are accessible for anyone (see Kaggle for example).</li>\n<li><em>Bias</em>: While we have powerful computational systems, the data they rely on is highly biased and incomplete. These flaws risk being perpetuated in future frontier models. To address this we need to invest in rebalancing datasets to prevent historical biases from carrying over. Cooperative investments are essential to develop homegrown talent in the global south.</li>\n</ul>\n<p>Bosun Tijani also noted that the Global South isn't a charity case when it comes\nto AI. By the end of this century, Nigeria’s population will be half a billion,\nmaking it crucial in a highly connected world. There's a strong business case\nfor not missing out. Nigeria is investing $2 billion to deploy dark fiber\ninfrastructure nationwide. This connectivity will empower people to contribute\nmeaningfully to the global digital economy.  
Governments in the Global South\nmust expand their capacity to think strategically about AI and its potential.\nUnlike academic institutions, which often drive innovation, governments in\nthese regions need to strengthen their ability to lead and can't rely on a large\nuniversity pool like Europe does.</p>\n<h2 id=\"collaborating-for-impact\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#collaborating-for-impact\"></a>Collaborating for Impact</h2>\n<p>The speakers here were Lila Ibrahim, Ilan Gur, Dame Angela McLean and Sir Paul Nurse.</p>\n<p>The first question was about how the speakers have experienced changes in science and how it has evolved.</p>\n<p>Paul Nurse noted that we live in an increasingly complex scientific world. The\nexpansion of science has led to greater complexity, which, in turn, has created\nmore silos across disciplines. To address this, we need more interaction -- not\nnecessarily immediate collaboration -- but increasing permeability between\nfields. There are also important social science aspects to consider, such as\nhow to structure training and foster interdisciplinary work effectively.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-7.webp\" alt=\"%c\" title=\"Lila, Ilan, Angela and Paul on stage.\" ></p>\n<p>Angela McLean: we’ve transitioned from an era of &quot;big science&quot; projects --\nlarge, centrally organized efforts with clear, command-and-control structures\n-- towards distributed collectives. These collectives pool together the\nnecessary data to tackle significant questions, such as those addressed by\nAlphaFold. 
Unlike a single centralized project, AlphaFold’s success came from a\nclear, well-defined question that encouraged competition and fostered winning\nstrategies.</p>\n<p>Today, we need the big research projects to define what success looks like and\nthen innovate towards new ways for people to contribute collectively without a\nbig central structure.</p>\n<p>Disciplines can generally be divided into two categories. Those with &quot;countable\nquestions&quot;; for example, solving the Riemann hypothesis might win you a Nobel\nPrize!  Then we have unstructured disciplines: fields like biology, where there\nisn’t a single list of questions to solve. As Paul Nurse put it, &quot;biology is a\nbunch of anarchist scientists!&quot;. He continued that we need more unfocused\nresearch organizations that keep track of unstructured problems and help refine\nthem into focused scientific questions. This kind of work is essential for\nachieving progress in disciplines that don't have clear or countable goals.</p>\n<p>Ilan Gur then introduced ARIA, the Advanced Research and Invention Agency, that\nhas a mission to build on the UK’s rich scientific ecosystem by pursuing\nbreakthroughs that may not yet have immediate or obvious consequences. ARIA\nfocuses on identifying the right people, their environments, their incentives, and\nhow their work can ultimately benefit society.\nARIA’s method begins by gathering program manager scientists to develop a thesis about\nwhere to focus efforts. This doesn’t involve just one research project but\nrather a constellation of efforts that can cross technology readiness\nlevels and organizational boundaries to achieve a focused target.\nExamples of ARIA initiatives include scaling AI via compute inspired by nature, and\nanother project observing that formal mathematics should be better integrated\ninto AI research to create more robust models. 
By guaranteeing bounds on\ninputs, we could use AI in critical applications with confidence in its\noutcomes.</p>\n<p>Angela McLean then talked about how the UK government (under Labour) has outlined missions for addressing key societal challenges, such as\ngrowth, safer streets, opportunities for all, clean energy, and better health.\nThese missions are a great starting point for research and problem-solving.\nHowever, the structure of Whitehall (government departments) tends to remain\nsiloed. To succeed, we need to draw expertise from across departments to\naddress these challenges.</p>\n<p>Paul Nurse noted that science operates on a spectrum between discovery and\napplication.  Discovery is anarchic and unpredictable but applications are more\ndirected and goal-oriented.  We need end-to-end funding that supports the\nentire spectrum, from discovery to application, while embracing diversity in\napproaches. A one-size-fits-all method won’t work. At the Francis Crick\nInstitute, departments were abolished, allowing disciplines to mix; turnover\nwas encouraged after a limit of tenure to keep staffing dynamic (including at\nsenior levels) and a high level of technical expertise was made available to\neveryone.  Mixing people from different disciplines and using social scientists\nto understand the cultural structures within organizations is key to fostering\ninnovation.</p>\n<p>At the Crick Institute, the space encourages serendipitous conversations. This\nincludes inviting external guests and creating opportunities for unexpected\ninteractions. 
Tools like Slack’s &quot;lunch roulette&quot; feature could similarly\nencourage serendipity and collaboration.\n(<em>Anil note</em>: Cambridge Colleges do a great job here, but our departments are\nmore siloed, although events like <a href=\"https://www.christs.cam.ac.uk/news/celebrating-50-years-rubiks-cube\">Rubik's 50</a> are a great example of how different disciplines come together)</p>\n<p>Angela McLean also noted that we need to find ways to communicate the\nimportance of breakthroughs like AlphaFold outside of the scientific community.\nFor example, when AlphaFold was introduced, the Ministry of Defence (MoD)\ndidn’t grasp why the science mattered, as they lacked the necessary context. Even\namong highly educated people in the UK, there's a gap in understanding just how\nmuch AI is transforming society. By better connecting experts and amplifying\ntheir insights, we can and must help bridge this gap.</p>\n<p>Paul Nurse also noted that the public must be informed about science advances;\ne.g. the fiasco around GM crops happened because no one trying to introduce GM\ncrops had bothered to inform the public, explain what the issues are and\nget feedback. The main answer from the public sample about &quot;why not GM crops&quot; was that they\ndidn't want to eat food with genes in it, and that's what bothered them. So when\nwe're thinking about AI and its implications, let's go out and talk to the\npublic and find out what worries them and then think about how to communicate.</p>\n<h3 id=\"accelerating-scientific-discovery-using-ai\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#accelerating-scientific-discovery-using-ai\"></a>Accelerating Scientific Discovery Using AI</h3>\n<p>Demis Hassabis then reflected on AlphaFold and the future of AI-driven science.\nAlphaFold has already been cited over 28,000 times, and because it was open-sourced (including AlphaFold 3 with open weights for non-commercial use), its impact has been profound. 
Some standout applications include:</p>\n<ul>\n<li>Determining the structure of the nuclear pore complex, which regulates nutrients entering and exiting the cell nucleus.</li>\n<li>Developing a molecular syringe for delivering drugs to hard-to-reach parts of the body.</li>\n<li>Discovering plastic-eating enzymes to address environmental challenges.</li>\n</ul>\n<p>AlphaFold is described as a &quot;root node&quot; problem within DeepMind: once solved, it unlocks entirely new branches of scientific discovery. Its success in determining protein structures has validated this potential.  What's next?</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-10.webp\" alt=\"%c\" title=\"Hannah Fry and Demis Hassabis on stage\" ></p>\n<p>Material design (<a href=\"https://deepmind.google/discover/blog/millions-of-new-materials-discovered-with-deep-learning/\">GNoME project</a>) is the next frontier, and shares characteristics with protein folding:</p>\n<ul>\n<li>A massive combinatorial search space.</li>\n<li>The need for models that integrate physics and chemistry to optimize solutions.</li>\n<li>Potential breakthroughs include developing new batteries or room-temperature superconductors.\nAlready, 200,000 new crystals have been published, all previously unknown to science, marking significant progress toward a usable GNoME system.</li>\n</ul>\n<p>Applying AI to mathematics also offers exciting possibilities, including solving major conjectures that have eluded mathematicians.</p>\n<p>Inspired by mentorship from Paul Nurse, the aim is to simulate an entire\nvirtual cell, a &quot;Mount Everest&quot; of biology. AlphaFold 2 solves static protein\nstructures, while AlphaFold 3 models interactions, taking us closer to\nsimulating an entire cell (e.g., a yeast cell as the model organism).  
Within\n5–10 years, this ambitious goal may well be achievable.</p>\n<p>Quantum computing is accelerating and offers exciting intersections with AI, such as simulating quantum systems to generate synthetic data or addressing challenges like error-correcting codes.  However, classical Turing machines have proven more capable than initially thought:</p>\n<ul>\n<li>Projects like AlphaGo and AlphaFold show how new algorithms outperform brute force by precomputing models before tasks like making a move in Go or folding a protein.</li>\n<li>Classical systems, when used effectively, can model even quantum systems.</li>\n</ul>\n<p>David Deutsch called this approach &quot;crazy, but the right kind of crazy&quot; when Demis talked to him about it. Demis thinks that every natural phenomenon has inherent structure, which machine learning can model to efficiently search for optimal solutions. So quantum may not be necessary for this, and classical computing used with machine learning may be sufficient to solve the hugely complex underlying problem.</p>\n<p>Meanwhile they also launched Isomorphic Labs to rethink the drug discovery\nprocess from the ground up, leveraging AI for one of the most impactful use\ncases: curing diseases. AlphaFold is a powerful tool for fundamental research,\nand Isomorphic works on adjacent use cases needed for practical drug discovery\n(helping design chemical compounds, test for toxicity, and minimize side\neffects, etc).  Isomorphic aims to cure diseases with AI, and generate revenue\nto reinvest in fundamental research, so striking a balance between societal\nimpact and profitability.</p>\n<p>Demis then commented that we stand on the brink of a golden era of scientific\ndiscovery, driven by interdisciplinary collaboration with domain experts and\nlimitless possibilities for applying AI to new fields and improving AI itself\n(approaching exponential improvement).  
The scientific method is humanity's\ngreatest invention and remains the foundation of modern civilization. In an era\nof transformative AI, it's useful to go beyond simple A/B testing and treat AI\ndevelopment as a scientific method test.  We need to understand the emergent\nproperties of AI systems and improve interpretability. Techniques from\nneuroscience (e.g. fMRI for studying brains) could inspire ways to study neural\nnetworks and make them explainable rather than just being black boxes.  The\napproach is to build the artifact of interest first, then decompose it through\ntargeted experiments to understand it once it has proven useful. Artificial\nsystems like neural networks can be as complex as natural systems, requiring\nsimilar rigor to understand.</p>\n<p>Science is increasingly expensive and complex, leading to slower progress in\ncertain areas. However, interdisciplinary work will drive significant advances\nin the next decade. DeepMind, originally founded at the intersection of\nneuroscience and computer science, exemplifies how crossing disciplines\naccelerates innovation.</p>\n<p>To support these efforts, Google.org just announced a <a href=\"https://blog.google/outreach-initiatives/google-org/google-org-science-ai-funding/\">$20 million fund for\ninterdisciplinary\nresearch</a>,\nfurther enabling breakthroughs at the intersection of fields. 
(<em>Anil's note: let's hope that sustainability is on the list here!</em>)</p>\n<h3 id=\"ask-the-nobel-laureates\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ask-the-nobel-laureates\"></a>Ask the Nobel Laureates</h3>\n<p>The last panel had all four Laureates on stage to answer questions, moderated\nby Hannah Fry: Jennifer Doudna, Sir Demis Hassabis, John Jumper and Sir Paul\nNurse.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-11.webp\" alt=\"%c\" title=\"What's a group of Nobel laureates called?\" ></p>\n<p>The discussion opened by asking the panelists how they first felt when they\nmade their prize winning discoveries.</p>\n<p>John Jumper: when you release groundbreaking work, it’s fascinating to see the\nimmediate responses. I remember refreshing Twitter and seeing graduate students\nexclaiming, “How did they get my structure? It hasn’t even been published!”\nThere was a special issue of Science related to the nuclear pore complex, and\nthree out of four studies had heavily used AlphaFold without me even knowing\nit. It was amazing to see how our tools are empowering researchers.</p>\n<p>Jennifer Doudna: In the fall of 2011, while working on CRISPR (a bacterial\nimmune system), we realized it was an RNA-guided system that targets DNA for\ncleaving. It was one of those &quot;aha&quot; moments—bacteria had figured out how to do\nthis, and now we could understand and manipulate DNA using the same principle.\nA year later, when we published our findings, we could feel the momentum\nbuilding in the scientific community.</p>\n<p>Paul Nurse: In 1985 (much older than the others!), I was working on yeast and\nmy lab had identified the genes responsible for the cell cycle—how one cell\nreproduces into two. We wondered whether these findings could apply to humans,\neven though this was well before human genome mapping. 
Using the first human\ncDNA library ever made, we introduced human genes into defective yeast cells.\nIf a human gene could replace the defective yeast gene and restore function, it\nmeant the discovery was transferable. Remarkably, 1.5 billion years of\nevolutionary divergence didn’t stop this experiment from working.</p>\n<h2 id=\"qa\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#qa\"></a>Q&amp;A</h2>\n<p>Q: What would you say to your 18-year-old self?</p>\n<p>Demis Hassabis: I actually had this plan when I was 18! The amazing thing is that it worked out, but I’d tell myself to enjoy the journey a bit more.</p>\n<p>John Jumper: My career has been more of a random walk, driven by doing good science in the moment and being open to new opportunities. My advice is to focus on doing good science now and let the interesting paths unfold naturally. It’s almost the opposite of Demis’s advice.</p>\n<p>Jennifer Doudna: Follow your passion, never give up, and don’t listen to naysayers.</p>\n<p>Paul Nurse: Coming from a non-academic background, I couldn’t believe that I could be paid to follow my curiosity. Even now, it still feels like a privilege.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-3.webp\" alt=\"%r\" title=\"Hideo Kojima has the coolest portraits at the BAFTA\" ></p>\n<p>Q: AI gives answers but struggles with mechanistic insights. How big a barrier is this to public trust, and when can we expect true mechanistic insights?</p>\n<p>Demis Hassabis: AI is an engineering science. First, we need to build systems that are worthy of study. Once built, we can break them down and understand them mechanistically over time. Early systems weren’t worth this effort, but now we’re developing tools that are, and they’re improving themselves. Unlike physics, biology can’t always be explained by universal laws, but simulations that can be tested and probed are better suited. 
Neuroscience techniques, like those used to study real brains, can also help us understand artificial neural networks.</p>\n<p>Q: Is attention still all we need?</p>\n<p>John Jumper: AlphaFold isn’t just an off-the-shelf transformer. While attention is an important component, many other innovations were added to change the structure of the network significantly. Fundamental research continues to unlock insights into both new data and previously unexamined data. AlphaFold has revealed new knowledge about data that had been available for years.</p>\n<p>Demis Hassabis: The transformer architecture has been incredible but isn’t sufficient on its own. We’ll need several more breakthroughs of that scale to reach full AGI.</p>\n<p>Q: What are the current challenges in biology data?</p>\n<p>Jennifer Doudna: Biology faces issues with both the quality and quantity of data for training AI models. We need to educate researchers on how to collect data both sparsely and smartly. Sparse but broad data is critical to creating robust platforms for training. This ultimately comes down to asking the right questions.</p>\n<p>Q: What about people who are skeptical of these breakthroughs? Could society reject them?</p>\n<p>Paul Nurse: Keeping the public on board is critical. This isn’t the first time new technology has faced resistance, and every time it happens, there’s concern. Special interest groups often hijack these conversations, so we need to find better ways to engage with the public and explain the science behind the breakthroughs.</p>\n<p>Q: Africa will have the largest population of young adults by 2050. How can Africans be included in this global scientific revolution?</p>\n<p>Jennifer Doudna: The Innovative Genomics Institute has an ongoing effort in Kenya to work with scientists and help them understand CRISPR. 
This initiative has fostered a network of researchers, and I’d like to see more of that happening.</p>\n<p>Demis Hassabis: DeepMind has been actively working in Africa, with events like the Deep Learning Indaba conference serving as key convening points for African talent. There’s still a lot more to be done, but it’s a hugely important area of focus.</p>\n<p>Q: How do we encourage the next generation of scientists?</p>\n<p>Paul Nurse: In today’s world, journals are dominated by big data studies. While there’s value in this, we must ensure that creativity doesn’t get lost. There’s enormous potential in big data if approached with creativity, and we need to foster this mindset in our colleagues and students.</p>\n<p>Demis Hassabis: Encouraging the next generation is crucial. One of my heroes is Richard Feynman. Every schoolchild should read <em>Surely You’re Joking, Mr. Feynman!</em> It shows how exhilarating it is to work at the frontier of knowledge. Science is incredible and fun, and we need to expose people to that joy.</p>\n<p><img src=\"/images/ai-for-science/ai-for-science-2024-13.webp\" alt=\"%c\" title=\"Ray Dolby is a Pembroke alumnus too\" >\n<img src=\"/images/ai-for-science/ai-for-science-2024-15.webp\" alt=\"%c\" title=\"Interactive exhibits inside the room\" >\n<img src=\"/images/ai-for-science/ai-for-science-2024-14.webp\" alt=\"%c\" title=\"Glitzy entrance to the BAFTA\" ></p>\n<p>This concludes my live notes! Beyond the notes here, the corridor conversations were incredibly\nuseful for me: I have lots of connections to make next.  Any errors in these\nnotes are all mine, of course; I mainly took them for myself, but I hope it's\nuseful for you that I have put them online as well.</p><h1>References</h1><ul><li>Kochkov et al (2024). Neural general circulation models for weather and climate. Nature. <a href=\"https://doi.org/10.1038/s41586-024-07744-y\" target=\"_blank\"><i>10.1038/s41586-024-07744-y</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ai-for-science-2024",
      "external_url": "https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/",
      "title": "Royal Society and DeepMind host AI for Science Forum",
      "summary": "The Royal Society and DeepMind hosted an AI for Science Forum, exploring AI's role in revolutionizing scientific discovery and its potential to benefit society.",
      "date_published": "2024-11-18T00:00:00.000000Z",
      "date_modified": "2024-11-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":rsn",
        ":life",
        "livenotes",
        "royalsociety",
        "science"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s41586-024-07744-y",
          "doi": "10.1038/s41586-024-07744-y",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-cclr-carbon-1",
      "content_html": "<p><a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a> lead an <a href=\"/ideas/legal-aspects-of-credits\">effort</a> to explore a novel legal framework for forest carbon credits\nthat separates carbon tenure (i.e. title and associated property rights to the\nland and trees which store the carbon) from the carbon rights (i.e. title and\nassociated rights to monetise and manage the credits which\nsymbolically represent the carbon stored in the trees), while also specifying\nthe relationship between the carbon tenure and the carbon rights.</p>\n<p>The resulting <a href=\"/papers/2024-cclr-carbon\">paper</a> has just been published in the Climate\nand Carbon Law Review journal, and is available as open access for your perusal.</p><h1>References</h1><ul><li>Chapman et al (2024). A Legal Perspective on Supply-side Integrity Issues in the Forest Carbon Market. <a href=\"https://doi.org/10.21552/cclr/2024/3/5\" target=\"_blank\"><i>10.21552/cclr/2024/3/5</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-cclr-carbon-1",
      "title": "Published a legal perspective on high integrity forest carbon credits",
      "summary": "Paper exploring legal framework for forest carbon credits that separates carbon tenure from carbon rights, published in Climate and Carbon Law Review.",
      "date_published": "2024-11-01T00:00:00.000000Z",
      "date_modified": "2024-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forest",
        "carboncredits",
        "legal",
        "conservation"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-cclr-carbon.pdf",
          "mime_type": "application/pdf",
          "title": "A Legal Perspective on Supply-side Integrity Issues in the Forest Carbon Market"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.21552/cclr/2024/3/5",
          "doi": "10.21552/cclr/2024/3/5",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-sensys-terracorder-1",
      "content_html": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> presented our work on biodiversity sensing over at <a href=\"http://sensys.acm.org/2024/\">ACM Sensys 2024</a> in China. The <a href=\"http://sensys.acm.org/2024/demos/\">full set</a> of papers and demos has a range of impressive work on sensor networks, and some that stood out to me follow.</p>\n<ul>\n<li>&quot;<a href=\"https://dl.acm.org/doi/pdf/10.1145/3666025.3699332\">An Open-source Hardware and Software Platform for the Battery-free Internet of Things</a>&quot;, which is &quot;an open-source and commercially available battery-free platform that includes multiple boards, extensive software, and comprehensive documentation&quot;.</li>\n<li>&quot;<a href=\"https://dl.acm.org/doi/pdf/10.1145/3666025.3699323\">Towards Sustainable Live Sonar Analytics in Wild Ecosystems</a>&quot;, which &quot;enables real-time processing of acoustic sonar data with spatial and temporal adaptations, and features energy-efficient operation through a robust energy management module&quot;.</li>\n<li>&quot;<a href=\"https://dl.acm.org/doi/10.1145/1644038.1644049\">Canopy closure estimates with GreenOrbs</a>&quot;, whose software and hardware is &quot;tailored for sensing in wild environments without human supervision, including a firm weatherproof enclosure of sensor motes and a light-weight mechanism for node state monitoring and data collection&quot;.</li>\n</ul><h1>References</h1><ul><li>Millar et al (2024). Poster: Towards Low-Power Comprehensive Biodiversity Monitoring. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3666025.3699400\" target=\"_blank\"><i>10.1145/3666025.3699400</i></a></li>\n<li>Geissdoerfer et al (2024). Riotee: An Open-source Hardware and Software Platform for the Battery-free Internet of Things. <a href=\"https://doi.org/10.1145/3666025.3699332\" target=\"_blank\"><i>10.1145/3666025.3699332</i></a></li>\n<li>Xu et al (2024). 
SALINA: Towards Sustainable Live Sonar Analytics in Wild Ecosystems. <a href=\"https://doi.org/10.1145/3666025.3699323\" target=\"_blank\"><i>10.1145/3666025.3699323</i></a></li>\n<li>Mo et al (2009). Canopy closure estimates with GreenOrbs: sustainable sensing in the forest. <a href=\"https://doi.org/10.1145/1644038.1644049\" target=\"_blank\"><i>10.1145/1644038.1644049</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-sensys-terracorder-1",
      "title": "Presented poster at Sensys on low-power biodiversity monitoring",
      "summary": "Poster presentation at ACM Sensys 2024 on biodiversity sensing work with Terracorder platform.",
      "date_published": "2024-11-01T00:00:00.000000Z",
      "date_modified": "2024-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "terracorder",
        "biodiversity",
        "sensys"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-sensys-terracorder.pdf",
          "mime_type": "application/pdf",
          "title": "Poster: Towards Low-Power Comprehensive Biodiversity Monitoring"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3666025.3699400",
          "doi": "10.1145/3666025.3699400",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3666025.3699332",
          "doi": "10.1145/3666025.3699332",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/3666025.3699323",
          "doi": "10.1145/3666025.3699323",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1145/1644038.1644049",
          "doi": "10.1145/1644038.1644049",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-ce-llm-1",
      "content_html": "<p>We have just uploaded a preprint on using LLMs for conservation evidence, based on our <a href=\"/projects/ce\">work</a> on large-scale crawling of the academic literature. Well done in particular to <a href=\"mailto:ri301@cam.ac.uk\">Radhika Iyer</a> for having done the bulk of the evaluation on this as part of a very productive summer internship with us! This work evaluates whether LLMs can facilitate evidence-based decision support for conservation by testing ten different LLMs against human experts on multiple choice questions about conservation interventions. We found that with careful design, open-book LLM performance was competitive with human experts on filtered questions, both in correctly answering them and retrieving the source documents. However, general LLMs used &quot;out-of-the-box&quot; performed poorly, highlighting the importance of domain-specific design to avoid misinforming decision-makers.</p><h1>References</h1><ul><li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-ce-llm-1",
      "title": "Preprint on using LLMs to for evidence-based decision support",
      "summary": "Preprint on using LLMs for conservation evidence based on large-scale academic literature crawling.",
      "date_published": "2024-11-01T00:00:00.000000Z",
      "date_modified": "2024-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":ce",
        "evidence",
        "llms",
        "ai",
        "conservation"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-ce-llm.pdf",
          "mime_type": "application/pdf",
          "title": "Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-socc-murmuration-1",
      "content_html": "<p><a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a> went along to Seattle to <a href=\"https://acmsocc.org/2024/\">SOCC 2024</a> to present her PhD research on Murmuration. This is a new scheduler for Kubernetes that allows for 15%--25% faster job completion times than the default scheduler for different job arrival characteristics in datacenters that are very busy. The key insight is that existing schedulers impose large wait times on tail tasks in highly utilized clusters, leading to long job completion times. Murmuration employs multiple communicating schedulers to schedule tasks such that their start times are as close together as possible, ensuring small tail task completion time. Our evaluation shows it scales to workloads with millions of tasks, and with queue re-ordering enhancements, achieves up to 100x better median job completion time than current schedulers on industry workloads.</p>\n<p>Unfortunately, the videos from SOCC don't seem to be online yet (I could only find <a href=\"https://www.youtube.com/@officialacmsocc2204\">SOCC 2020</a>), but I'll update this if they do show up.</p><h1>References</h1><ul><li>Vijayakumar et al (2024). Scheduling for Reduced Tail Task Latencies in Highly Utilized Datacenters. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3698038.3698522\" target=\"_blank\"><i>10.1145/3698038.3698522</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-socc-murmuration-1",
      "title": "Paper on scheduling for reduced tail task latencies",
      "summary": "SOCC 2024 presentation on Murmuration scheduler achieving 15-25% faster Kubernetes job completion times in busy datacenters.",
      "date_published": "2024-11-01T00:00:00.000000Z",
      "date_modified": "2024-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "distributed",
        "scheduling",
        "cloud",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-socc-murmuration.pdf",
          "mime_type": "application/pdf",
          "title": "Scheduling for Reduced Tail Task Latencies in Highly Utilized Datacenters"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3698038.3698522",
          "doi": "10.1145/3698038.3698522",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/a0280750-2ef0-4f5c-b138-68f7b11b4c29-1",
      "content_html": "<p>I got invited by <a href=\"https://profiles.ucl.ac.uk/78591-serta%C3%A7-sehlikoglu\">Sertaç Sehlikoglu</a> to deliver a lecture to the Masters students down at the <a href=\"https://www.ucl.ac.uk/bartlett/igp/\">UCL Institute for Global Prosperity</a>. I talked about the recent work on <a href=\"/projects/plancomp\">planetary computing</a>, with an overview of the <a href=\"/projects/life\">LIFE</a> and <a href=\"/papers/2024-food-life\">FOOD</a> papers.</p><h1>References</h1><ul><li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/a0280750-2ef0-4f5c-b138-68f7b11b4c29-1",
      "title": "Mapping greener futures with planetary computing",
      "summary": "Lecture to UCL Institute for Global Prosperity Masters students on planetary computing, covering LIFE and FOOD papers.",
      "date_published": "2024-10-24T00:00:00.000000Z",
      "date_modified": "2024-10-24T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "london",
        "biodiversity",
        "sensing",
        "conservation"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/2j4qx-r0f87",
      "content_html": "<p>After some time away from cloud computing (due to my new focus on <a href=\"/projects/life\">conservation research</a>), I served on the <a href=\"https://acmsocc.org/2024/\">ACM SOCC 2024</a> program committee. It was quite interesting seeing the massive shift away from &quot;traditional&quot; cloud research (such as consensus protocols) towards many submissions aimed at accelerating machine learning workloads.</p>\n<p>I also had a paper accepted there on <a href=\"/papers/2024-socc-murmuration\">decentralised scheduling</a>, thanks to my former PhD student <a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a> and her hard work on Murmuration!</p><h1>References</h1><ul><li>Vijayakumar et al (2024). Scheduling for Reduced Tail Task Latencies in Highly Utilized Datacenters. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/3698038.3698522\" target=\"_blank\"><i>10.1145/3698038.3698522</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/socc-pc",
      "external_url": "https://acmsocc.org/2024/",
      "title": "On the SOCC 2024 PC",
      "summary": "Serving on the ACM SOCC 2024 program committee revealed a shift towards machine learning research.",
      "date_published": "2024-10-08T00:00:00.000000Z",
      "date_modified": "2024-10-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "cloud",
        "service",
        "scheduling",
        "distributed"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3698038.3698522",
          "doi": "10.1145/3698038.3698522",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/0qmn2-rwh65",
      "content_html": "<p>I'm at the Royal Society this morning for the 2 day programme on <a href=\"https://royalsociety.org/science-events-and-lectures/2024/10/ecological-and-commercial-risk/\">&quot;How does ecological risk related to commercial risk?&quot;</a>, and am reporting on the <a href=\"https://royalsociety.org/-/media/events/2024/10/ecological-risk/programme-booklet.pdf\">morning session</a>.  The full program is being <a href=\"https://www.youtube.com/watch?v=gVuxzand8RE\">livestreamed</a> so please do dial in if the below notes seem interesting to you. I put this note up almost live, so any errors below are my own.\n<em>(Update: partial <a href=\"#daytwo\">day 2 notes</a> now available below)</em></p>\n<h2 id=\"opening-keynote-by-sir-partha-dasgupta\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#opening-keynote-by-sir-partha-dasgupta\"></a>Opening Keynote by Sir Partha Dasgupta</h2>\n<p>The summit kicked off with a keynote by economist <a href=\"https://en.wikipedia.org/wiki/Partha_Dasgupta\">Sir Partha Dasgupta</a>. The focus was on the intersection of nature and economics, covering how markets fail to account for the ecosystems that sustain them. His <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">landmark report</a> covered ecosystem services, freshwater, tipping points, and physical risk, bringing to light the urgent need to reframe economic activities around the services provided by nature.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-1.webp\" alt=\"%c\" title=\"Sir Partha Dasgupta opening the morning session.\" ></p>\n<p>He began by distinguishing between two types of market activities:</p>\n<ul>\n<li><strong>Provisioning goods</strong>. 
Commodities like food, water, timber, fibres and so forth are the primary products which, when aggregated and valued at market prices, form the GDP (the visible outputs) of human endeavour.</li>\n<li><strong>Processes</strong>. The background work of maintaining and regulating the services which produce these goods is problematic from a commercial perspective. The services aren't extractive in the same way as provisioning goods, but they express themselves via the goods.</li>\n</ul>\n<p>In the language of economics, there is a missing link here for the markets of\nprocesses, which makes for inefficiency, but more alarmingly a crisis in the\nmanagement of natural resources.  The action is on the activities we undertake\nwhich affect our landscape -- and the huge number of species affected therein.\nOur knowledge is extremely limited, but keeping them intact is in our interest.</p>\n<p>There is a good deal of work on option values which economists have tried to\nuncover, but research has somewhat stalled in recent years. The risks that\ncompanies face as a consequence of this inefficiency are correlated, which is\nextremely dangerous for market stability.</p>\n<p>He tried to address this in the review of the <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">Economics of Biodiversity\nreport</a>,\nbut it needs a lot more thinking.  A feature of the human condition is that\nfragmented ecosystems lose their productivity. There is a big difference between\nthe productivity of a whole ecosystem and the sum of the productivity of its individual parts. The\n<a href=\"https://www.theguardian.com/environment/2021/dec/27/thomas-lovejoy-conservation-biologist-dies-80\">late</a>\nTom Lovejoy did a lot of work on this in the context of the Amazon rainforest\necosystem.  
Similarly, we are looking at a fragmented nature ecosystem today in\na global context as more and more habitats get truncated.</p>\n<p>The background extinction rates of organisms are growing hugely -- the option\nvalue of organisms suggests we are losing enormous amounts of value in the form\nof unknown lifeforms. The correlation risk here is something that whole\ngovernments need to take on board -- it's too vast for any single organisation\nto tackle!</p>\n<h2 id=\"ecosystem-services-and-physical-risk\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ecosystem-services-and-physical-risk\"></a>Ecosystem Services and Physical Risk</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Jane_Lubchenco\">Jane Lubchenco</a> chaired the panel session. She is the former administrator of NOAA, &quot;on loan&quot; to the White House to work on nature and technology policy.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-2.webp\" alt=\"%c\" title=\"Jane Lubchenco chairing the morning session.\" ></p>\n<h3 id=\"dr-tony-juniper-cbe\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#dr-tony-juniper-cbe\"></a>Dr Tony Juniper CBE</h3>\n<p>We have not always been looking at nature through a financial perspective. The\njourney of Natural England in the 1940s began more from an ethical and moral\nperspective, and from the scientific and beauty value of landscapes. It's only\nrecently that the practical impacts of nature and its benefits to humanity are\nbeing considered.</p>\n<p>The <a href=\"https://en.wikipedia.org/wiki/Millennium_Ecosystem_Assessment\">Millennium Ecosystem\nAssessment</a>\ncommissioned in 2001 by Kofi Annan was a stocktake of the earth's natural\ncapital assets, and remarkable for how it made clear that if we didn't\nreverse the decline of nature then we would be unable to meet humanity's needs\nsuch as ending poverty. 
A few years later in 2007 the G8 commissioned\n<a href=\"https://en.wikipedia.org/wiki/The_Economics_of_Ecosystems_and_Biodiversity\">&quot;Economics of Ecosystems and\nBiodiversity&quot;</a>\nthat worked until 2011 and helped reset the understanding of the human world\nand is fundamental to how we conduct our ecosystem. Later on the\n&quot;Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem\nServices&quot; (<a href=\"https://www.ipbes.net\">IPBES</a>) was commissioned by the UN General\nAssembly to consider the contribution of nature to people.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-3.webp\" alt=\"%c\" title=\"Tony Juniper speaking.\" ></p>\n<p>So as nature rises in the political eye, there are multiple national programs\nall trying to assess the state of natural assets.  In 2020 there was a Treasury\norganisation to bring natural capital into national accounting in the UK.\nDasgupta's landmark report established that nature and the economy are\nintertwined, which was a huge change in thinking. We no longer need to assume\nthat degradation of nature is the price of progress, and the new reality is\nthat the range of ecosystem services is critical to how we need to conduct our\necosystem services into the future. Pollination, carbon capture, biomimicry,\nnutrient capture, when all added up are worth more than the UK's GDP, but we\nstill struggle with how to measure this as part of our conventional economies!</p>\n<p>After 20 years of expert studies and carefully constructed datasets, we\nstill struggle to do the right thing, but why? No one company is quite able to\nfully embrace the scope of the problem. There are 1000s of medium to large\ncompanies that depend on agriculture, but they are all dependent on 25 billion\ntons of water moving around intercontinental distances for the global water\ncycle. 
Any one company might make a small difference by reducing their\ndeforestation footprint but without collective action they will be unable to\nmake a global change. There must be a collective force to bring things\ntogether, and also better understanding of the connectivity across these\nfactors.</p>\n<p>However, there still isn't a prominent intellectual place for the connections\nbetween ecology and economy, which still aren't being taught in undergraduate\ncourses and so are not making it through to high political decision making. There\nis still a lot of emphasis on financial growth, but not much discussion of\nnature: we should be factoring in that nature is the mechanism by which we can\nachieve financial growth. It's quite hard to measure biodiversity vs &quot;tons of\ncarbon&quot;. Environmental regulation must move beyond just stopping harm and also plot\npathways to nature recovery.  So the link between ecological and commercial\nrisk is very real and must be measured and actioned.</p>\n<h3 id=\"freshwater-professor-louise-heathwaite-cbe-frs\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#freshwater-professor-louise-heathwaite-cbe-frs\"></a>Freshwater (Professor Louise Heathwaite CBE FRS)</h3>\n<p>Her first ever &quot;proper job&quot; was working at the Nature Conservancy as their first hydrologist. Freshwater is at the centre of the triple planetary crisis today. As a result of these challenges, what we see is an intensification of the water cycle:</p>\n<ul>\n<li>increasing temperatures increase atmospheric water holding capacity</li>\n<li>precipitation increases, evaporation increases, so the cycle intensifies</li>\n<li>evaporative demand increases and so extreme drought events increase</li>\n</ul>\n<p>Our lack of investment in proper waste recycling infrastructures is now coming\nback to haunt us. They impact hugely on freshwater availability and quality. 
It\ndisrupts commercial supply chains, and the quality of the supply for both water\nand ecosystem services on which we heavily depend.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-4.webp\" alt=\"%c\" title=\"Louise Heathwaite shows the GRACE water map.\" ></p>\n<p>Two satellites (GRACE) orbited the Earth from 2002-2017 and could detect\ncentimetre-scale changes in water mass. The image shows how\nconnected the world is, and the scale of the changes at the poles and Greenland\nis incredible. Some research just finished up in Greenland, and the rate of\nwarming in the Arctic is roughly 2x the global warming rate. Greenland is\nlosing almost 260 gigatons of ice per year, which is a million Olympic-sized\nswimming pools a year -- an awful lot of water to move around into the ocean\nsystems. The ocean cycles aren't on the slide, but they will impact.</p>\n<p>Only 1% of the total water supply is freshwater, and its appropriation among human\nactivities is quite unbalanced. We have green water to blue water to white water\nto grey water to black water. We have messed up the blue and black water\nmanagement. We are producing 360 billion cubic metres of waste water a year\nand only 3% is recycled. A significant percentage of the world population (2\nbillion people) are dependent on black water, and a lot only have access to\ngrey or black water.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-5.webp\" alt=\"%c\" title=\"Water flows in the ecosystem.\" ></p>\n<p>We need to treat management of surface and groundwater systems holistically.\nThe freshwater living planet index is dropping precipitously: some causes are flow\nreduction, introduction of invasive species and nitrogen/acidification peaks in the\n1970s/80s/90s which all caused major changes in freshwater biodiversity. We\nalso introduced laws on long range transport regulation about acidification. 
In\n1991 we had the EU urban wastewater directive and in 2000 the EU water\nframework directive, but none of this seems to have made much of a difference (see graph above).</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-6.webp\" alt=\"%c\" title=\"Freshwater confidence index not measurably changed by legislation.\" ></p>\n<p>Are we looking at a tipping point in terms of waste water systems? We don't have\nclear metrics about freshwater biodiversity, and without the metrics we don't\nhave a way of valuing them as part of commercial systems. We have new emerging\npollutants (microplastics etc) which need multi-sectoral\nchanges to deal with, and these are lacking despite the legislation in\nplace.\nHowever, the growth in urbanisation (56% of UK population lives in cities, increasing to\n70% by 2050) makes a huge difference to where waste comes from.  When we\nmove food around the world, there is a water cost which is passed on to other\ncountries. <em>(Note: see also our preprint on <a href=\"/papers/2024-food-life\">Food impacts on species extinction risks can vary by three orders of magnitude</a>)</em></p>\n<p>When we move to decarbonisation, the last 20% of action (the really difficult bit) is\nrelated to land use in particular. But if we join up the practices around good\nwater use and decarbonisation then we have a bunch of innovative interventions,\nfor example natural flood management.  There are also cobenefits around\ninfrastructure and there is a real challenge with our UK underinvestment. In the\nlast report in 2022 there are some excellent propositions for how to deal with\nwater and atmospheric pollution and biodiversity. We need to act on this\nreport, but progress is slow: e.g. 
Thames Water is a long way away from\nactioning this.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-7.webp\" alt=\"%c\" title=\"Water flows between countries worldwide.\" ></p>\n<p>In the 1980s, <a href=\"https://adas.co.uk\">ADAS</a> was the wing of the farming and food ministry that worked on how to\nrecycle waste better. Metals were the big problem back then, but now there are\nloads of extra contaminants such as plastics and forever chemicals. We must\nmove towards turning the waste we produce into reusable end products for farming\nand dramatically drop our water use.</p>\n<h3 id=\"tipping-points-and-biosphere-stewardship-professor-carl-folke\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tipping-points-and-biosphere-stewardship-professor-carl-folke\"></a>Tipping points and biosphere stewardship (Professor Carl Folke)</h3>\n<p>We have polycrises with climate change, biosphere pressure, intertwined\nsystems, interacting shocks, and many tipping points.  Tipping points can come from a\nshock that moves a system from a local maximum into a new stable state, but also\nthe gradual loss of resilience can cause a system to shift suddenly. There are a\nnumber of drivers that cause these tipping points, and they are not just a\nrandom aside: they should be a key part of any investment strategy.</p>\n<p>A lot of people live in areas that are at risk of ecological tipping points,\nand the map seems very correlated to the earlier water change map shown by\nLouise! 56% of human CO2e emissions have been taken up by the biosphere so far (1430\nGtCO2 from 1850-2019) which has been happening for &quot;free&quot; thanks to nature, and\nthis might come to an abrupt end soon which is a big source of worry in the\nscientific community.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-8.webp\" alt=\"%c\" title=\"Prof Folke discussing tipping points.\" ></p>\n<p>The connectivity of various tipping points is becoming clearer. 
There are just\na few major actors that shape some of these (such as the Amazon rainforest) and\nthey are often quite far away geographically by virtue of their influence over\nmarkets.</p>\n<p>The current idea is that social tipping (&quot;social norms as solutions&quot;) is the\nway to alter our value systems. See &quot;<a href=\"https://www.cambridge.org/core/journals/global-sustainability/article/operationalising-positive-tipping-points-towards-global-sustainability/8E318C85A8E462AEC26913EC43FE60B1\">Operationalising positive tipping points\ntowards global\nsustainability</a>&quot;\nby Tim Lenton et al. <em>(Note: Simon Sharpe also authored this, see our CSaP\n<a href=\"https://www.csap.cam.ac.uk/news/article-reading-group-five-times-faster-4-rethinking-unive/?preview=1\">reading\ngroup</a>\non his excellent <a href=\"https://fivetimesfaster.org\">&quot;Five Times Faster&quot;</a> book.)</em></p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-9.webp\" alt=\"%c\" title=\"Exposure of people to tipping points.\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-10.webp\" alt=\"%c\" title=\"Interconnected climate tipping points worldwide.\" ></p>\n<p>So we need to prepare for transformation, navigate the sudden transition and\nthen build resilience in the new norm after the tipping point. But our window\nof opportunity is happening right now so action must happen urgently or we miss\nthe window and enter the tipping point unprepared.  So &quot;corporate biosphere\nstewardship&quot; is a new business logic with the purpose of shepherding and\nsafeguarding the resilience of the biosphere for human well-being, and\nfostering the sustainability of a rapidly changing planet.  Rather than viewing\nnature as a compliance question, we should view it as humanity's greatest\nbusiness opportunity! For example, Seafood Business for Ocean Stewardship\n(<a href=\"https://seabos.org\">seabos.org</a>) is codifying this\napproach for marine foodstocks. 
There is an increasing focus on how to report\nthis stuff. There are three good books to read more on this:</p>\n<ul>\n<li><a href=\"https://openknowledge.worldbank.org/entities/publication/855c2e15-c88b-4c04-a2e5-2d98c25b8eca\">&quot;Nature's Frontiers&quot;</a>, 2023 from the World Bank Group.</li>\n<li><a href=\"https://www.stockholmresilience.org/publications/publications/2022-09-29-economy-and-finance-for-a-just-future-on-a-thriving-planet.html\">&quot;Economy and Finance for a Just Future on a Thriving Planet&quot;</a>, 2022 from the SRC.</li>\n<li><a href=\"https://www.ngfs.net/en/the-green-scorpion-macro-criticality-nature-for-finance\">&quot;The Green Scorpion&quot;</a>, the macro-criticality of nature for finance.</li>\n</ul>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-11.webp\" alt=\"%c\" title=\"The window to tackle tipping points.\" ></p>\n<h3 id=\"ecosystem-services-and-physical-risk-paul-polman-kbe\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ecosystem-services-and-physical-risk-paul-polman-kbe\"></a>Ecosystem services and physical risk (Paul Polman KBE)</h3>\n<p><a href=\"https://en.wikipedia.org/wiki/Paul_Polman\">Paul Polman</a> is the former CEO of\nUnilever. He started by noting that we have to make sure that we don't just\nspend our time talking to each other, but we have to get the message out in the\nform of action to the wider world!  When they developed the SDGs, Ban Ki-moon\nrequested Paul to represent the private sector. He was described as <em>&quot;the\nproblem walking into the room&quot;</em> when the politicians first met him, but luckily\nthey ended up in a happier place! We must find a balance for a sustainable\neconomy into the future.  Most businesses, although they don't behave entirely\nsustainably now, do understand the need for a planet. 
When he ran\nUnilever for a decade, they encountered hundreds of millions of dollars worth of\nnature-related interruptions to their business.</p>\n<p>4 billion people in the world depend on natural medicine, and nature governs\nhuge amounts of the human ecosystem. Nature supports peace and national\nsecurity - many of the world's geopolitical events trace back to inequality in\naccess to natural resources and this is the foundation for many economies. The\nWEF report recently calculated that $44tn of the world economy depends on\nnature -- but this is a huge understatement given that our entire life depends\non it!  Changes in one ecosystem affect others, and so our conversation must\naddress the interrelationship of all this that makes it so complex.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-12.webp\" alt=\"%c\" title=\"Paul Polman discussing global business and nature.\" ></p>\n<p>We must not see ourselves at the top of the pyramid -- biomimicry and\ngeoengineering is an arrogant approach as our survival relies on cooperation\nwith the biosphere -- when we destroy nature we destroy ourselves. (paraphrased quote) <em>&quot;Man is the\nmost insane species - he worships an invisible god and destroys the visible\nnature, not realising that the invisible god he worships is the visible nature\nhe destroys&quot;</em> (original: <a href=\"https://www.goodreads.com/quotes/1171374-man-is-the-most-insane-species-he-worships-an-invisible\">Hubert Reeves</a>). Extinctions are 10-100x the average of the previous\ncenturies.</p>\n<p>Food and land use is about 30% of our global emissions, and yet we have the\naudacity to keep our farmers in poverty. Every $1 invested in changing nature\nand regenerative farming approaches gives us a $16 return. 
The food companies are\nreally exposed if they don't act, much as we criticise fossil fuel companies\nright now.</p>\n<p>We are making withdrawals much faster than we are depositing in the bank of\nplanetary boundaries, and there are millions of people losing their lives and\nbillions being displaced because of these choices. The Caribbean has seen\nits tourism drop by a significant percentage due to beach erosion. The Amazon this\nyear has had unprecedented wildfires (around the size of Italy), as an example.\nWe name our tropical storms (we name them as if they are our friends); would it\nbe different if we named them Exxon, Chevron etc? We must absolutely link the\ncommercial sector to nature as companies like Unilever must become\nborder-positive and nature positive as they are hugely global.</p>\n<p>Leadership is increasingly centred in Europe with many business regulations\nthat are nature positive. There is a bunch of business interaction happening,\nbut the estimation is that loss of biodiversity has cost us over $10tn. In\nagriculture we are destroying $12tn of value, but if we turn that around it's a\n$4tn opportunity. It just doesn't make business sense not to act. Covid\nshowed us that infinite growth on a finite planet is unsustainable.</p>\n<p>What can business do next then?</p>\n<ul>\n<li><em>Invest in nature.</em> It drives innovation and nature has probably been the best R&amp;D lab (1/3rd of medicines come directly from nature)</li>\n<li><em>Water utility improvement.</em> If the 500 largest cities looked into restoring local forests and water tables, it would make a huge difference in quality of life.</li>\n<li><em>A mindset change for business.</em> See his <a href=\"https://netpositive.world/book/\">book on the topic</a>. 
The only way to be successful in business now is to think regeneratively and restoratively -- every action needs to contribute to restoration and not cause us to fail slightly more slowly.</li>\n</ul>\n<p>Priorities for business action:</p>\n<ul>\n<li>Repair and restore nature, e.g. 30x30 in the Global Biodiversity Framework.</li>\n<li>Account for the value of nature, the point of this event. Less than 5% of companies today account for nature, and this needs to vastly increase.</li>\n<li>Form partnerships for advocacy and change. Most of the big issues cannot be tackled by one company; even at Unilever Paul could only solve about 20% of the problem with one of the biggest companies in the world. Global change initiatives (like &quot;planetary guardians&quot;, launched at the UN last week) are another example of partnership.</li>\n<li>Align financial flows using the political moves around GBF and the Paris agreement. We need to put money behind this.</li>\n</ul>\n<p>Business needs nature, and nature needs business, and the time to act is now!</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-13.webp\" alt=\"%c\" title=\"Breaching planetary boundaries.\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-14.webp\" alt=\"%c\" title=\"Key priorities next for nature and business.\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-15.webp\" alt=\"%c\" title=\"The full panel discussing these topics.\" ></p>\n<h2 id=\"discussion\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#discussion\"></a>Discussion</h2>\n<p>Partial notes from the audience Q&amp;A follow.</p>\n<p>Q: To what extent have we made progress in the transition towards a nature positive future -- is the awakening fast enough?</p>\n<p>Tony: what we have discussed here is not very well known among the majority of the\nworld's population. Most people get in their car and go to work, and their food\nis processed and disconnected from nature. 
So the insight that most people have\ninto their dependencies on nature is limited. David Attenborough has\nmade it clear that if we take steps towards repairing the natural world then\npeople will need to be exposed to nature so that they understand what they're\ntrying to save. This was disconnected in the beginning of the industrial\nrevolution, and we need to urgently reconnect it, or people will simply not\nappreciate what we're about to lose.</p>\n<p>Louise: is quite optimistic because the younger generation is very engaged in\nthis, and we need to ensure they have the levers for action. It's the older\ngeneration's challenge to bring the right levers to them.</p>\n<p>Paul: the science is still evolving at a fast pace, and will continue to, and\nif we don't convert it to fast practical steps we'll be in trouble. Everything\nstarts and ends with education, but we are also out of time. The urgency isn't\nwell understood -- a survey of 4000 people about this found that about 30% of\npeople had left a company because it wasn't aligned with their values.\n60% were considering leaving but didn't. We must get businesses to move out of\nthe current gridlock where multilateral institutions cannot change easily, and\nwe must cooperate across boundaries. The EU has a nature restoration law which\nwouldn't have happened without businesses getting involved to push it over the\nline for example -- without that it wouldn't have moved. And tipping points mean\nthat only 4-5% change is required to get us to a new stable state, so this is\nboth an opportunity and a risk.  The biggest risk right now is the US election,\nas it's a global vote but decided by a tiny minority.</p>\n<p>Q: What examples have you seen about nature flows in the real world?</p>\n<p>Paul: the economic forces must be made to work; people aren't skeptical but the\neconomic realities must work. 
But there are pressures; the average tenure of a\nCEO has dropped to 4-5 years and so their actions are aligned with reelection\ncycles just like politics! So we have to make capital and money flow work much\nmore transparently.  With a relatively small investment, we can get to 50%\nregenerative agriculture; this is happening in the USA (thanks to the IRA).  Many\nof the major food companies (from Pepsi to Unilever) are committed to $15bn\ntowards soil health and regenerative agriculture. And there is nothing better\nthan healthy soil towards preserving yields in the face of climate change. The\naverage age of farmers is ~50 now and so they only have 10 seasons left before\nthey retire, so they need to be paid to shift proactively for ecosystem\nservices. It needs proper farmer rewards to do the shift quickly and hedge\nthe risk. Luckily, the speed of nature restoration is faster than we predicted.</p>\n<p>Carl: it has become a strategic issue in companies, as opposed to a\ngreenwashing protection. They run an executive course for companies and they\nare factoring in risks proactively now in a way they didn't before. They are\ndemanding satellite data and other risk management data products to quantify\nthe uncertainties in nature. Earth System Indicators is their system to combine\nmultiple planetary boundaries into one actionable indicator. It is a bit like\nan avalanche now where the reaction is awakening, but we are probably in a\ntipping point right now so we must act in the right direction.</p>\n<p>Jane: social systems are highly non linear and also characterised by social\nsystem tipping points. Water seems to be an obvious one that underpins the\noperation of many companies.</p>\n<p>Louise: Universities and research must turn into businesses (of innovation).\nOne failing of ours is that we find the problems but don't find the solutions,\nand research must come up with solutions. &quot;Ecologists turn over stones looking\nfor problems&quot;. 
We must take a position on coming up with solutions and\ncommunicating them clearly, and the only way to do this is to work together\nacross disciplines. We are looking to social scientists to combine civic change\nwith environmental scientists to propose interventions.</p>\n<p>Jane: there has been a huge transformation in the US towards finding solutions,\nand that means partnering with government agencies and NGOs in a way we haven't\nseen before. <em>(Note: this describes the <a href=\"https://www.cambridgeconservation.org\">Cambridge Conservation Initiative</a> model perfectly!).</em></p>\n<p>Tony: even when the solutions exist, a number of entities want to keep the\nstatus quo: e.g. Exxon and Shell spent more than two decades pretending it\nwasn't a problem to protect their interests, and they're not alone. And so we\nhave to establish a social tipping point (just 3-4%); this was the idea of\nExtinction Rebellion, to overload the court system and get to the point where\nwe could shift the discussion. But it didn't work, so what will work? We need to\ntry, try and try again on this. Otherwise we're rather just talking to\nourselves.</p>\n<p>Q: Water related financial risks: TCFD galvanised focus on climate,\nspecifically carbon, and one of the barriers is this focus on regulations that\nfocus on the cause (carbon) but not the impacts. We mainly feel these impacts\nthrough the water cycle, so how can we shift the focus on the physical risks of\nclimate change onto the impacts and not the cause?</p>\n<p>Louise: we need a systems approach to this; tipping points sort of get there,\nbut how do we get the evidence to get over the top?</p>\n<p>Q: Why is it that the business and finance community isn't moving faster? The\nsuspicion is that not all businesses are part of the solution, and do we need\nto be clearer on that? 
If we combine some of the social aspects and linking of\npeople to nature, does that supply chain need to be shorter?</p>\n<p>Paul: we are slightly bending the curve linearly, but the gap is getting bigger\nexponentially. If we don't solve the problems for the world, we can be great as\na company but it doesn't save the planet for the next generation. Business alone\ncannot do it, but what we can do is to make it more transparent. The more\ntransparent we are the more we change behaviours and social norms. Companies\nusing AI across supply chains are doing surprisingly well at bringing this\ntransparency around. In every transition in the history of mankind, there have\nbeen people resisting change, but the fact that the noise globally is now becoming louder is a\nbig deal and is a sign that the process is unstoppable and has begun. The noise\nwill only increase in the next five years. In the food chain there are\nherbicide and pesticide producers that are doing really well on food speculation,\nbut at the expense of many lives. We need a leader to change them, and if they\ndon't change they will obsolete their business models.</p>\n<p>Q: What is the single most important action businesses can take today to address the crisis?\nQ: Why not spin this around to look at this at a very local level (e.g. flooding for business crisis)?\nQ: Unless we can localise analysis with improved precision, can we improve decision making?</p>\n<p>Tony: Localisation of issues is critical. The founding slogan of Friends of the Earth was to &quot;think globally act locally&quot; which was a very good slogan. Natural England is working with local authorities on spatial planning exercises on where the good remaining nature exists and where it might usefully be repaired (cleaning up peatlands etc). The targeting of the resources like this might help us bend the curve. 
This must be linked with our need to build 100k new houses and balance nature and business.</p>\n<p>Paul: COP30 is likely to involve businesses heavily (COP29 is a writeoff due to location). But businesses are getting behind it all as countries can't implement their policies without business help. A large percentage of indigenous populations are planetary guardians and so we must support their efforts to protect existing natural capital against tipping points. Science without impact is useless when we have a crisis at this level.</p>\n<h2 id=\"day-1-reflections-by-sir-partha-dasgupta\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#day-1-reflections-by-sir-partha-dasgupta\"></a>Day 1 Reflections by Sir Partha Dasgupta</h2>\n<p>Ecosystems are capital assets; we now use the term natural capital to include\nthe biosphere in the full stock. This raises the question of asset management,\nand the questions discussed today represent shortcomings in asset management.</p>\n<p>With natural capital, property rights are hard to enforce -- nature is always\non the move and &quot;mobile&quot; and the commodity changes when it moves. This leads to\nan underpricing of many forms of natural capital (not all, but many) which\nleads to the overuse of it, which in turn leads to deterioration of the asset,\nwhich implies a runaway tipping point of decline. This all leads to a circular\ndecline in the value of natural capital; if there is a heightened risk of an\nasset collapsing, the accounting value is naturally reduced. So this is a\nvicious cycle at work that leads to the inevitable decline.</p>\n<p>Global GDP has increased hugely as opposed to natural capital, which increases\nthe pressure to overspend on natural capital. What are the arbitrage conditions\n(e.g. risk adjusted rate of return) on the portfolio assets that comprise\nnatural assets? Because of imperfect pricing, there is a big misalignment of\nany portfolios. 
For example, the Ganges is the most polluted river in the world\nand under the Ganges action plan to remedy this there was a <a href=\"https://archive.org/details/cleaningupganges0000mark\">social cost benefit\nanalysis</a> -- there was a 15% rate of return calculated. The rate of return on\ngovernment bonds was roughly 5%. There was a gap of 10% between these two assets,\nand so if the structure was efficiently organised then the Ganges value as a stock\nshould be decreasing by 10%. But this is perverse since the reverse should hold\nsince the river quality has been decreasing not increasing!</p>\n<p>So from the firm's point of view we are looking at their balance sheets (from\nthe natural capital POV). But GDP was not constructed for this purpose -- instead\nit was constructed in the post war period to calculate progress towards getting\nout of economic depression resulting from lack of economic activity. But somehow\nafter WWII it became a long-term goal but it has no social benefit justification.\nGDP is a flow and so not a future predictor like stocks. You can have GDP growth\nand the national package of accounts, but it just won't take natural capital\ninto account. There is a gap between portfolio management and GDP therefore.</p>\n<p>But countries are moving towards wealth accounting bit by bit, including for natural\ncapital aspects of wealth. Most of the attention has been towards human capital,\nincluding the attention of investors. This now needs to shift quickly towards\nnatural capital. There have been studies attempting to assess the market demand\nfor natural capital. Think of the biosphere as a massive fishery -- the natural\nanalogy is how much we take out of it, and what the regeneration rate is. 
The question\nis what the overreach is and there are <a href=\"https://naturalcapitalproject.stanford.edu/about\">projects working on this</a>.\nWe need to take the outputs of these projects and treat them as underestimates due\nto the huge extinction pressure on organisms.</p>\n<p>Even if firms are competing in the market, they still need to cooperate on the\nunderlying natural capital accounting to avoid a &quot;market crash&quot;. In academia,\nthere is both cooperation and competition. There is a race for paper publication,\nbut also a huge amount of sharing towards global scientific co-creation.  There is a\ngood deal that companies could learn from this to cooperate and communicate towards\nnatural capital accounting.  More on day 2 tomorrow about how we can get there!</p>\n<p><span id=\"daytwo\"></span></p>\n<h2 id=\"day-2-metrics-and-actions\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#day-2-metrics-and-actions\"></a>Day 2: Metrics and Actions</h2>\n<p>Kat Bruce (founder of <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a>) opened the session by noting just how quickly\nthe biodiversity space is moving, and how encouraging it is to see so many businesses\nengaging with environmental scientists.</p>\n<h3 id=\"metrics-for-business-use-prof-neil-burgess\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#metrics-for-business-use-prof-neil-burgess\"></a>Metrics for business use (Prof Neil Burgess)</h3>\n<p>Is going to cover terrestrial metrics only. There are several sorts of metrics we could measure:</p>\n<ul>\n<li>pressures on biodiversity</li>\n<li>steady state metrics for biodiversity</li>\n<li>measuring benefits from biodiversity</li>\n<li>response metrics on the effectiveness of biodiversity interventions</li>\n</ul>\n<p>We do need to understand biodiversity risk in regional detail so that\nbusinesses can use this to influence their actions. 
There is also the need for\nclear target setting to know how much opportunity cost to &quot;spend&quot; on\nbiodiversity vs other actions.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-16.webp\" alt=\"%r\" title=\"Neil describes how to classify biodiversity metrics\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-17.webp\" alt=\"%r\" title=\"Categorising 573 (!) biodiversity metrics\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-18.webp\" alt=\"%r\" title=\"Metrics usable for businesses currently\" ></p>\n<p>UNEP-WCMC have a huge database about all the various metrics (~573!) that\ncover biodiversity. If you add other things like marine, the number\nbreaks 600 (see picture). There are around 23 useful ones for business use;\nthere are few in this list that check benefits to people or discuss genetic\nchanges, but there are lots on diversity, habitat area and so on.\n(Neil noted that our <a href=\"/papers/2024-life\">LIFE: A metric for mapping the impact of land-cover change on global extinctions</a> paper isn't listed as it's not published\nuntil the end of the year, but it will be!)</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-19.webp\" alt=\"%r\" title=\"What next for biodiversity metrics?\" ></p>\n<p>There is quite a lot of work needed to make the metrics usable, and to build a pipeline.\nHe also noted the importance of incremental pipelines and mentioned the\nones that <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> are working on for these.\nAnd this is something I'd like to advance with <a href=\"/projects/plancomp\">Planetary Computing</a>...</p>\n<p>The main tool that pulls this together is <a href=\"https://www.ibat-alliance.org\">IBAT</a>\nwhich is a paid-for model to support its continued development. There is also\nthe free <a href=\"https://encorenature.org/en\">ENCORE nature</a> platform. 
For supply chains,\nthere are also tools to help with country-to-country checks. There is a big emphasis\non the need for open sharing between platforms as well due to the sheer\ncomplexity of biodiversity worldwide, which is a key thing to keep the platforms\nsustainable both financially and equitably.</p>\n<h3 id=\"data-availability-and-use-prof-andy-purvis\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#data-availability-and-use-prof-andy-purvis\"></a>Data availability and use (Prof Andy Purvis)</h3>\n<p>Andy's career as a scientist started 35 years ago, and he is now at the Natural\nHistory Museum as a senior researcher. The mission there is not just to track\nthe decline of the planet, but also to advocate towards a healthier nature\npositive society as well.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-20.webp\" alt=\"%r\" title=\"Why care about biodiversity?\" ></p>\n<p>We have a million or so species of animals and plants threatened by\nextinction. Our job as scientists is to come up with defensible statistics,\nbut they also don't convince anyone. So stories about individual species\n(like the white rhino) are vital to build public awareness about the real\nimpact.</p>\n<p>Is there one index to rule them all? Andy is against that idea of a single\nmetric to describe the world's biodiversity.  Any indicator combining some of\nthe metrics has an &quot;exchange rate&quot; between extinctions and human wellbeing. In\nother words, it creates an &quot;extinction market&quot;.</p>\n<p>Decision grade data are derived very carefully from very biased raw data. And\nthe data collection requires a huge amount of expertise, and is painstaking\nto conduct. This has to be funded, and not even the collection of the raw data\nis very well funded right now. 
To illustrate the data bias, there is a huge\nnumber of birds...and mallards...in there, which is clearly not representative\nof global species spread.</p>\n<p>We can plot ecosystem function (resource capture, biomass production, decomposition\nand nutrient recycling) on the axis of biological diversity (variation in genes\nand species and functional traits). Across this, there is a huge spectrum\nof ecosystem services that this supports, of which a few are really useful to humans.\nAnd the <a href=\"https://www.nhm.ac.uk/our-science/services/data/biodiversity-intactness-index.html\">BII index</a>\nis a statistical model relating nature to human pressures that produces both\nhigh resolution temporal and spatial models.</p>\n<p>What should we do in terms of ecosystem health?</p>\n<ul>\n<li>Deintensify activities in unhealthy systems where people depend on local ecosystem services</li>\n<li>Divest from businesses that are poor stewards of ecosystem health</li>\n<li>Invest in actions that are &quot;nature positive&quot;, meaning actions that improve the expected global status of biodiversity relative to counterfactuals.</li>\n</ul>\n<p>It requires a model, has to be global, and measure both species persistence and\necosystem health. It has to be vs counterfactuals otherwise the cost is too\nhigh for any individual organisation <em>vs</em> society taking action collectively.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-21.webp\" alt=\"%r\" title=\"Why not have one biodiversity index?\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-22.webp\" alt=\"%r\" title=\"Ecosystem function vs health\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-23.webp\" alt=\"%r\" title=\"Defining nature positivity as a counterfactual\" ></p>\n<p>We need to combine models with monitoring to give us a &quot;sat nav&quot; for nature.\nThere is a need to monitor drivers as well as biodiversity, which\nrelies on data being available to improve the models. 
And if the platforms are\nopen then this is possible. There are platforms in the form of Geo BON, IBAT, BII (which has a data license with\nBloomberg), and some TNFD tools.</p>\n<p>Take home messages:</p>\n<ul>\n<li>Use data whose methodologies are transparent</li>\n<li>Remember the pitfall of hybrid indicators or indices</li>\n<li>Reduce extinctions and mitigate existing activities in important areas and don't do new human activities there</li>\n<li>Divest from poor stewards and invest in nature positive actions</li>\n<li>Monitor closely to verify gains and contribute to data repositories</li>\n<li>Accept that decision grade data does cost money and needs funding!</li>\n</ul>\n<h3 id=\"qa\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#qa\"></a>Q&amp;A</h3>\n<ul>\n<li>Is the restoration of an ecosystem the reverse of the destruction curve?</li>\n<li>Is there consensus about BII outside of the NHM team?</li>\n<li>Are we moving towards biodiversity standardisation?</li>\n<li>How can we make these metrics more understandable to businesses?</li>\n</ul>\n<p>Andy: we can go back to a system with the original biodiversity, but the actions to get it there don't have an immediate effect (there is a timelag).  There's no standardisation effort yet across many metrics, but we are moving towards ensembles of models with alternative sets of inputs for the environmental rasters in order to normalise the uncertainty in both the environmental and geographic space.</p>\n<p>Neil: UNEP-WCMC uses BII a lot! In terms of metrics, a species and ecosystem metric and something on genes would cover the three dimensions of biodiversity, but others also need to be factored in. There is an effort to use AI/geospatial to get higher resolution landuse data (is this a forest?) 
and also probabilistic SDMs, but no standardisation.</p>\n<p>Kat: being able to reverse degradation is a secret weapon for nature as it has an incredible ability to bounce back if pressures are reduced due to its innate resilience. The aspect of the data we often forget about is its ability to tell these stories and drive uptake, and this is powerful.</p>\n<ul>\n<li>What are the greenwashing risks of nature-positive counterfactuals?</li>\n<li>When we boil down the metrics that are ready for both country and business use, what are the qualities that make a metric useful? There were only 16 or so out of ~600!</li>\n</ul>\n<p>Neil: There are criteria (peer reviewed, published, etc) with lots of feedback from the co-authors of the paper (to appear later this year). There are strongly held opinions in the biodiversity metric space. But it's actually not a huge list of metrics, and is manageable.</p>\n<p>Andy: trying to avoid false claims of nature positive requires verification. We're going to get better at this by estimating the net gain and how certain it is to be positive. The models will improve when there is a connecting pipeline to monitoring data and needs verification. So any payment for nature-positive claims needs to be staged and ex-ante.</p>\n<h2 id=\"other-talks\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#other-talks\"></a>Other talks</h2>\n<p>I failed to capture notes for the middle sessions as I was engrossed in conversations, but here's a gallery of some of the fantastic speakers! 
They're on the livestream video if you want to catch the details.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-24.webp\" alt=\"%c\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-25.webp\" alt=\"%c\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-26.webp\" alt=\"%c\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-28.webp\" alt=\"%c\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-29.webp\" alt=\"%c\" >\n<img src=\"/images/rs-ecorisk24/rs-ecorisk-30.webp\" alt=\"%c\" ></p>\n<h1 id=\"parthas-day-2-roundup\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#parthas-day-2-roundup\"></a>Partha's Day 2 roundup</h1>\n<p>See the equation below. The right hand side is <code>G</code> (nature's regeneration rates) and is a function of the state of the biosphere (the <code>S</code>).\nThe left hand side is the human demand (human activity) which is a per-capita adjusted population and the efficiency of provisioning goods (recall what those are from the start of day 1 notes).\nMost of the discussion in these two days has been on the right hand side. A huge amount of the literature is on the alpha, and how technological progress can raise alpha (e.g. via cleaner energy) and help rebalance the inequality. How fast can alpha change? The faster it grows, the faster GDP and natural capital grow.</p>\n<p><img src=\"/images/rs-ecorisk24/rs-ecorisk-31.webp\" alt=\"%c\" title=\"Sir Partha's equation to sum up the two days!\" ></p>\n<p>How might we advocate for change? The total we pay ourselves is 2-3% of global GDP for subsidising our assault on nature.  Removing those subsidies is the equivalent of raising alpha. Adam Smith's classic book in the 18th century was the &quot;Wealth of Nations&quot; and not the &quot;GDP of Nations&quot;. For us, wealth includes natural capital. Even as early as 60 years ago, human capital didn't appear in the economic literature. 
Most national accountants include statements about the increase in human capital, and for our purposes we must include natural capital in the notion of wealth. We must shift accounting away from GDP into calculations of stock and inequalities, as shown in the equation.</p>\n<p>What the discussions missed was invasive species and the &quot;transfer of natural capital&quot; through the fact that goods and services are traded. This is represented in the capital <code>N</code> in the equation. So that's future work!</p>\n<h2 id=\"follow-more\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#follow-more\"></a>Follow more</h2>\n<p>This concludes my rapid note taking. Do join the <a href=\"https://royalsociety.org/science-events-and-lectures/2024/10/ecological-and-commercial-risk/\">livestream</a> and follow for the remaining day and a half if you have a spare moment!</p>\n<p><em>Edit: upon being prodded by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, I've uploaded a <a href=\"https://notebooklm.google\">NotebookLM</a> generated podcast summary of the morning. It's surprisingly entertaining, but I'm sure I'm going to regret this for some reason...</em></p>\n<p><div class=\"video-center\"><iframe title=\"How does ecological risk related to commercial risk (NotebookLM AI summary)\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/09d626ed-d090-4016-a812-ca90e382b441\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. 
<a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/rs-ecorisk-day1",
      "title": "Royal Society meeting on ecological/commercial risks",
      "summary": "The discussion centered around the relationship between ecological and commercial risk, highlighting the need for a more comprehensive understanding of natural capital and its role in the economy. The equation presented by Sir Partha Dasgupta summarizes the two days of discussion, emphasizing the importance of balancing human demand with nature's regeneration rates. The talks touched on various topics, including greenwashing risks, standardization of biodiversity metrics, and the need for verification of nature-positive claims. The conversation also stressed the importance of shifting accounting away from GDP and towards calculations of stock and inequalities, incorporating natural capital into the notion of wealth. Overall, the event aimed to raise awareness about the interconnectedness of ecological and commercial risk and the need for a more sustainable approach to economic development.",
      "date_published": "2024-10-03T00:00:00.000000Z",
      "date_modified": "2024-10-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":rsn",
        ":life",
        "royalsociety",
        "livenotes",
        "conservation",
        "ecology",
        "economics"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/hotnets-pc-2024",
      "content_html": "<p>I was on the program committee for <a href=\"https://conferences.sigcomm.org/hotnets/2024/\">HotNets\n2024</a> this year, which was a\nthoroughly enjoyable experience. The <a href=\"https://conferences.sigcomm.org/hotnets/2024/accepted.html\">list of accepted\npapers</a> is now out,\nand it's a diverse program -- with my personal favourites being the ones on\nspace communications networks using low earth orbit satellites.</p>\n<p>Well done to <a href=\"https://www.microsoft.com/en-us/research/people/bearzani/\">Behnaz\nArzani</a> and <a href=\"https://www.cs.cornell.edu/~jnfoster/\">Nate\nFoster</a> for really excellent general\nchairing and ensuring the PC maintained a constructive, positive tone while\ndoing the difficult job of selecting papers from a crowded set of submissions.\nThe structure of of the program committee was also somewhat novel, and one\nI'd like to replicate in other conferences I organise in the future.</p>\n<p><img src=\"/images/hotnetspc-view-2024.webp\" alt=\"%c\" title=\"The spectacular view from Jane Street's 18th floor!\" ></p>\n<ul>\n<li><strong>Two Review Rounds.</strong> There were two rounds of reviewing, with any clear decisions from the first\nset of reviewers resulting in an early rejection decision. Remaining papers\nwent through to round 2, where they got a further set of reviews.</li>\n<li><strong>HotCRP Discussions.</strong> The PC strove to discuss the papers on <a href=\"https://hotcrp.com\">HotCRP</a> before\nthe in-person PC meeting, coming to consensus on a number of them. Only a\nsmall subset of the full papers had to be discussed in the live meeting.  HotCRP has\nsuperb support to facilitate this sort of interaction, in a way that alternatives\nlike EasyChair simply don't. 
I'm <em>much</em> more likely to agree to future program\ncommittees if they use HotCRP.</li>\n<li><strong>Hybrid Meeting with Pods.</strong> For the live meeting, the chairs organised &quot;pods&quot; at Microsoft in Seattle (with Behnaz)\nand at Jane Street in New York (with Nate). I was hoping to host a pod in\nCambridge as well, but I ended up having to travel to New York for some\nmeetings on biodiversity and so went along to the Jane Street pod.\nThis was wonderful -- we got to minimise travel, and yet have good synchronous\ndiscussions, with excellent A/V links between the pods.  Other PC members\ngot to Zoom in as usual if they couldn't make it to a pod, but there was\nenough critical mass to make it a more social occasion for those who did attend\none.</li>\n<li><strong>Post PC Workshop.</strong> There was an excellent workshop of talks held afterwards, where I spoke on\nplanetary computing, and I got to hear the legendary <a href=\"https://www.linkedin.com/in/brian-nigito-a366052/\">Brian Nigito</a>\ntalk about their low latency <a href=\"https://x.com/yminsky/status/1837650874409136339\">TCP/IP stack called NetKit</a>\nthat's written in OCaml.  Now, I've <a href=\"https://github.com/mirage/mirage-tcpip\">written an OCaml TCP/IP stack</a>\nor two in my time, but what makes theirs really exciting is that it takes advantage\nof the experimental modal types in their <a href=\"https://blog.janestreet.com/author/mslater/\">&quot;oxidised&quot; OCaml</a>\nbranch to be as performant as a non-garbage-collected stack. I sadly had to run\nfor my flight back home half-way through the workshop, but it was lovely to\nreconnect with the networking community again after being deep into environmental\nscience for the past few years.</li>\n</ul>\n<p><img src=\"/images/hotnetspc-anil-2024.webp\" alt=\"%c\" title=\"Me giving a talk! 
(photo courtesy Nate Foster)\" ></p>\n<p>I'm noting down the HotNets structure as a potentially really good way to run the next\n<a href=\"https://propl.dev\">Programming for the Planet</a>, which is due in 2025. More\nnews on that soon!  In the meanwhile, get your papers into <a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO\n2024</a> which is due in a couple of\ndays...</p>",
      "url": "https://anil.recoil.org/notes/hotnets-pc-2024",
      "title": "Being on the HotNets 2024 program committee",
      "summary": "Serving on HotNets 2024 program committee was a great experience with a novel structure.",
      "date_published": "2024-09-22T00:00:00.000000Z",
      "date_modified": "2024-09-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networks",
        "functional",
        "hotnets",
        "janestreet",
        "service"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/mitigating-nbs-risk-paper",
      "content_html": "<p>Many of the questions around our recent <a href=\"/papers/2023-naturecredits\">Nature Sustainability commentary on NbS credits</a> revolve around\n<em>how</em> to finance new projects if credible credits need to be ex-post. Our\nlatest paper published in Carbon Management on <em>&quot;<a href=\"/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a>&quot;</em> tries to address this.</p>\n<p>The problem with selling ex-ante (future) carbon credits for (e.g.) a\ndeforestation avoidance scheme is that project reversals can happen in the\nfuture (&quot;deforestation has increased&quot;) thus rendering any credits issued\npreviously useless. On the flip side though, an overly conservative view of the\nfuture (&quot;the entire forest will disappear overnight!&quot;) is clearly so\nconservative that it doesn't serve the best interests of the project developer.\nSo ideally, a project would make realistic but conservative ex-ante predictions\nthat is safe for both project developer (who gets more funds upfront) and a\ncarbon credit purchasers (who needs to account for impermanence of nature\ncredits).</p>\n<p>Our paper shows how to do this by calculating a &quot;release schedule&quot; to predict\nfuture drawdowns, and then issuing extra credits when the release at some\nfuture date is less than predicted by the release schedule. We use verified\nex-post observations to construct these release schedules, and design them to\nbound the risk of the project becoming negative overall (that is, net drawdown\nis negative) and thus failing.</p>\n<p>The paper evaluates this process with both theoretical and real projects to\nassess how well it balances the tradeoff between generating permanent nature\ncredits and bounding the risk of project failure in the future. 
As a nice side\neffect, our method removes the need for buffer pools entirely, which do not\ncurrently base the sizing on an empirical assessment of reversal risks, and are\nusually cancelled at project end (wasting potential credits). Read the full\nopen access paper, led expertly by <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\">E.-Ping Rau</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a>,\nthat just came out in Carbon Management for details: <a href=\"/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a></p>\n<p>There's still plenty of future work to be done -- we focus on avoided\ndeforestation projects in this paper, but afforestation projects could also be\nmodelled on similar principles. Do get in touch if you'd like to help assess\nour methods!</p>\n<iframe src=\"https://www.linkedin.com/embed/feed/update/urn:li:share:7238538742104281091\" height=\"1321\" width=\"504\" frameborder=\"0\" allowfullscreen=\"\" title=\"Embedded post\"></iframe><h1>References</h1><ul><li>Rau et al (2024). Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release. <a href=\"https://doi.org/10.1080/17583004.2024.2390854\" target=\"_blank\"><i>10.1080/17583004.2024.2390854</i></a></li>\n<li>Swinfield et al (2024). Nature-based credit markets at a crossroads. Springer Science and Business Media LLC. <a href=\"https://doi.org/10.1038/s41893-024-01403-w\" target=\"_blank\"><i>10.1038/s41893-024-01403-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/mitigating-nbs-risk-paper",
      "title": "Mitigating credit reversal risks in nature-based solutions",
      "summary": "Mitigating credit reversal risks in nature-based solutions with predictive release schedules.",
      "date_published": "2024-09-08T00:00:00.000000Z",
      "date_modified": "2024-09-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "4c",
        ":2024-nbs-risk",
        "conservation",
        "economics",
        "forests",
        "nbs",
        "carboncredits"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1080/17583004.2024.2390854",
          "doi": "10.1080/17583004.2024.2390854",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41893-024-01403-w",
          "doi": "10.1038/s41893-024-01403-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-hope-bastion-1",
      "content_html": "<p>A very fun talk at <a href=\"https://icfp24.sigplan.org/home/hope-2024\">ACM HOPE 2024</a>\non some new work with <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> on how we can formally specify\nsystems to be robust to code generation by AI agents. For instance, if you were\nto ask GitHub Copilot to generate you code to filter endangered animals out of\na folder of images, it might interpret that as to delete the image, or to move\nit to another folder (which might be public), or just remove it from the index.\nAny of those options are potentially valid, so what do we do? Our idea is to\nuse F* to specify a rich set of allowable behaviours which can then be\ndynamically enforced in less expressive languages, and thus offer layers of\nprotection against over-eager (or rogue) AI agents.</p>\n<p>We'll increasingly need these sort of protections in our systems as natural\nlanguage interfaces get adopted for programming, and the traditional 'yes or\nnot' access control policies we use right now will be insufficient.</p>\n<p>Read more in:</p>\n<ul>\n<li>The <a href=\"https://anil.recoil.org/slides/2024-hope-bastion-slides.pdf\">talk slides</a> given by <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a></li>\n<li>The <a href=\"https://www.youtube.com/watch?v=U9H9xU-8-qc&amp;list=PLyrlk8Xaylp7OQNLeCGS0j2fjEnvIWL9u\">talk video</a> from the conference.</li>\n</ul>",
      "url": "https://anil.recoil.org/notes/2024-hope-bastion-1",
      "title": "Towards security specifications for agentic AIs",
      "summary": "ACM HOPE talk on using F* specifications to protect systems against over-eager or rogue AI agents.",
      "date_published": "2024-09-04T00:00:00.000000Z",
      "date_modified": "2024-09-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":plancomp",
        "systems",
        "specification",
        "ai",
        "icfp",
        "security"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-hope-bastion.pdf",
          "mime_type": "application/pdf",
          "title": "Modularizing Reasoning about AI Capabilities via Abstract Dijkstra Monads"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/new-teaching-page",
      "content_html": "<p>There's a new <a href=\"/notes/teaching\">teaching</a> page with my past and present courses, and links\nto the associated teaching materials. One of the nice things about most Cambridge\ncourses is that all the teaching materials are public, except for video recordings of\nthe lectures themselves.</p>",
      "url": "https://anil.recoil.org/notes/new-teaching-page",
      "external_url": "https://anil.recoil.org/teaching",
      "title": "New teaching page with my Computer Science courses",
      "summary": "Explore my new teaching page featuring Computer Science courses and materials.",
      "date_published": "2024-09-03T00:00:00.000000Z",
      "date_modified": "2024-09-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "teaching",
        "cambridge",
        "computerlab"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/ukri-grant-terra",
      "content_html": "<p>I don't normally announce funded grants (preferring to focus on outcomes), but I'm really excited by this one and couldn't resist!  Myself and my colleagues <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> (from computer science), <a href=\"https://coomeslab.org\">David Coomes</a> (from Plant Sciences), <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> (from Zoology) and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> (the Head of Science at <a href=\"https://www.unep-wcmc.org/en/the-team\">UNEP-WCMC</a>) have just received a £1.2m grant from the UKRI to work on <a href=\"https://www.cst.cam.ac.uk/news/meet-terra-ai-aiming-map-terrestrial-life-planet\">building foundation models for planetary intelligence</a>.</p>\n<p>Now, normally a grant isn't news, but I wanted to highlight the scheme that it came under. UKRI announced an <a href=\"https://www.ukri.org/news/first-projects-from-ukris-new-interdisciplinary-scheme-announced/\">interdisciplinary program</a> specifically for projects that don't normally get funded by just one research council. In our case, this work usually falls between the cracks of EPSRC <em>(&quot;too much nature&quot;)</em> or NERC <em>(&quot;too much engineering&quot;)</em> or STFC <em>(&quot;not enough satellites&quot;)</em>. But this interdisciplinary program expressly assembled a panel across all these areas, and collectively gave us a shot. I really hope this scheme continues to gather steam within the UKRI.</p>\n<p>As to what we're doing? 
There'll be the evolution of the work described in <a href=\"/projects/rsn\">Remote Sensing of Nature</a> and <a href=\"/projects/life\">Mapping LIFE on Earth</a>, with lots of domain knowledge that we're pulling together with our partners at UNEP-WCMC (especially <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> and <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a>) on plant and animal species distributions across the globe.</p>\n<p><img src=\"/images/2024-clr-scotland.webp\" alt=\"%c\" title=\"Us freezing in a Scottish August counting heather growth. There's got to be a more scalable way of doing this, right?\" ></p>\n<h2 id=\"learn-more\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#learn-more\"></a>Learn more</h2>\n<p>You can read more both in the <a href=\"https://www.ukri.org/news/first-projects-from-ukris-new-interdisciplinary-scheme-announced/\">UKRI announcement today</a> and in the <a href=\"https://www.cst.cam.ac.uk/news/meet-terra-ai-aiming-map-terrestrial-life-planet\">Cambridge Computer Science coverage</a> about what we're up to. Some exciting preprints about our work in this space so far:</p>\n<ul>\n<li><a href=\"/papers/2024-life\">LIFE: A metric for mapping the impact of land-cover change on global extinctions</a> is our new metric for calculating biodiversity impacts worldwide in a comparable way. We intend to extend it to cover plant species.</li>\n<li><a href=\"/papers/2024-food-life\">Food impacts on species extinction risks can vary by three orders of magnitude</a> connects up the biodiversity metric to supply chains to figure out the environmental impact of human food consumption on the planet. 
We intend to increase its resolution significantly with the new foundation models derived from remote sensing data.</li>\n<li><a href=\"/papers/2024-terracorder\">Terracorder: Sense Long and Prosper</a> is a battery-efficient sensing platform I'm working on with our Imperial buddies. We need more data about our planet!</li>\n</ul><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li>\n<li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li>\n<li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ukri-grant-terra",
      "title": "Building species models of the planet",
      "summary": "Researchers receive £1.2m grant to build species models for planetary intelligence.",
      "date_published": "2024-09-02T00:00:00.000000Z",
      "date_modified": "2024-09-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":rsn",
        ":life",
        "conservation",
        "biodiversity",
        "funding",
        "sensing",
        "sdms"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.2408.02407",
          "doi": "10.48550/arXiv.2408.02407",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-nbs-risk-2",
      "content_html": "<p>Our paper on ex-ante projection for nature-based solutions has been published in the <a href=\"https://www.tandfonline.com/journals/tcmt20\">Journal of Carbon Management</a>. I also wrote up some <a href=\"/notes/mitigating-nbs-risk-paper\">long-form thoughts</a> on it here. The publication represents the culmination of work on how to properly account for credit reversal risk in nature-based climate solutions through optimal carbon release anticipation. E-Ping did excellent work leading the methodology development and analysis. The approach we developed incentivizes project performance while resolving the fundamental trade-off between a credit's permanence rating and risk reduction, providing a pragmatic solution to one of the key challenges facing the effectiveness of nature-based climate solutions.</p><h1>References</h1><ul><li>Rau et al (2024). Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release. <a href=\"https://doi.org/10.1080/17583004.2024.2390854\" target=\"_blank\"><i>10.1080/17583004.2024.2390854</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-nbs-risk-2",
      "title": "Paper published on ex-ante forecasts of nature-based solutions",
      "summary": "Journal of Carbon Management publication on ex-ante projection methodologies for evaluating nature-based carbon solutions.",
      "date_published": "2024-08-31T00:00:00.000000Z",
      "date_modified": "2024-08-31T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "economics",
        "nbs",
        "forests",
        "carboncredits"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-nbs-risk.pdf",
          "mime_type": "application/pdf",
          "title": "Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1080/17583004.2024.2390854",
          "doi": "10.1080/17583004.2024.2390854",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2023-pact-tmf-3",
      "content_html": "<p>We have just released the Tropical Moist Forest v2.1 specification, to follow up the now-expired <a href=\"/notes/2023-pact-tmf-2\">v2.0</a> from six months ago. The key updates are a new <a href=\"https://tinyurl.com/PACTTMFexplainer\">high-level explainer</a>, as well as clarifications for buffer zones and base tiles. The high-level explainer is particularly useful as it makes the methodology more accessible to a broader audience beyond just technical specialists. We've also refined the technical details around how to handle edge cases with buffer zones and base tiles in the remote sensing data, which came from feedback from teams trying to implement the methodology. These incremental improvements are important for ensuring the specification can be reliably applied in practice.</p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-pact-tmf-3",
      "title": "PACT Tropical Moist Forest Accreditation Methodology",
      "summary": "Release of Tropical Moist Forest v2.1 specification with new explainer and clarifications for buffer zones and base tiles.",
      "date_published": "2024-08-29T00:00:00.000000Z",
      "date_modified": "2024-08-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forests",
        "carboncredits",
        "pact",
        "satellite",
        ":rsn",
        ":4c"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-pact-tmf.pdf",
          "mime_type": "application/pdf",
          "title": "PACT Tropical Moist Forest Accreditation Methodology v2.1"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/nature-crossroads",
      "content_html": "<p>Our <a href=\"/papers/2023-naturecredits\">commentary on nature-based credits</a> has been published in <a href=\"https://www.nature.com/articles/s41893-024-01403-w\">Nature\nSustainability</a>,\nlead expertly by my colleagues <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> and <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a>.</p>\n<p>In our view the carbon credits markets are vitally important for forest\nconservation, but the key is to only transact these credits <em>after they have\nbeen proven to be demonstrably additional using robust statistical techniques</em>,\nso that we know before a sale that each credit represents real gains that would\nnot otherwise have occurred without the carbon finance.</p>\n<p>A more scientific approach that supports transparent, third-party validation\ncould absolutely transform these markets. And given the rapid rate of tropical\nforest loss, such upscaling of credibility is vitally necessary to raise\ninvestor confidence in protecting nature, since we can now be confident that\nevery &quot;credit&quot; sold is resulting in real climate benefit.  There are real\nquestions remaining about this reform, of course.</p>\n<p><img src=\"/images/naturecrossroads-method.webp\" alt=\"%c\" ></p>\n<h3 id=\"where-does-early-project-finance-come-from\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#where-does-early-project-finance-come-from\"></a>Where does early project finance come from?</h3>\n<p>Since projects can no longer\nsell ex-ante credits (i.e. future credits which may not be real), then we\nneed to come up with financing models that embrace the upfront risk. 
This\nalready happens in other areas such as oil and gas; as <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\">Siddarth Shrikanth</a> notes:</p>\n<blockquote>\n<p>&lt;..&gt;speculative efforts like mining or oil exploration, we’ve still managed to build large industries out of uncertain (but potentially very valuable) payoffs. The challenge here will be to figure out which archetype different projects fall into, and create enough trust that the output will be real and valuable enough to someone to justify the up front investments&lt;..&gt;\n<cite>-- <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\">Siddarth Shrikanth</a> via <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7226538933961007104?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7226538933961007104%2C7226597328550273025%29&amp;replyUrn=urn%3Ali%3Acomment%3A%28activity%3A7226538933961007104%2C7226840222288789504%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287226597328550273025%2Curn%3Ali%3Aactivity%3A7226538933961007104%29&amp;dashReplyUrn=urn%3Ali%3Afsd_comment%3A%287226840222288789504%2Curn%3Ali%3Aactivity%3A7226538933961007104%29\">LinkedIn</a></cite></p>\n</blockquote>\n<p>Lead author <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> comments as well that:</p>\n<blockquote>\n<p>Society has made huge policy commitments to upscale carbon &amp; biodiversity offsetting.\nBut, carbon credit markets have suffered serious hits to their credibility &amp; nascent biodiversity markets risk inheriting shortcomings. 
Impact evaluations have shown that these markets have systematically underdelivered additionality.\n<cite>-- <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> via <a href=\"https://www.linkedin.com/posts/sophus-zu-ermgassen-12915ba6_nature-based-carbon-markets-have-experienced-activity-7226538933961007104-mM-u?utm_source=share&amp;utm_medium=member_desktop\">LinkedIn</a></cite></p>\n</blockquote>\n<p>We've been working on this aspect in <a href=\"/projects/4c\">4C</a>, since ex-ante predictions of outcomes are necessary for project developers to be able to forecast financing. See the paper &quot;<a href=\"/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a>&quot; for our latest work on that, led by <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\">E.-Ping Rau</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<div class=\"video-center\">\n<iframe width=\"560\" height=\"315\" src=\"https://www.youtube-nocookie.com/embed/69bKFhuvmeM?si=5VRnoSaX3mDIyj78\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>\n</div>\n<p><a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> also gave a fantastic talk at the CCI on this topic a few months ago that is a must-watch for anyone working on carbon or biodiversity markets.</p>\n<h3 id=\"questions-of-equity-and-justice\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#questions-of-equity-and-justice\"></a>Questions of equity and justice</h3>\n<p>It's also not enough to &quot;just&quot; show that a given project is additional from a satellite perspective; we must also show that it does not result in 
justice and equity concerns for the local populations. Current reporting practices often require only superficial descriptions of how projects approach justice and equity issues, which are challenging to verify and lack consistency and transparency. So our group has also been working on <a href=\"https://4c.cst.cam.ac.uk/news/introducing-new-framework-assessing-justice-and-equity-impacts-nature-based-solutions-projects\">a framework for assessing justice and equity impacts</a>, started by <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\">Miranda Lam</a>. I've also been working with <a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a> and <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> on the <a href=\"/ideas/legal-aspects-of-credits\">Legal perspectives on integrity issues in forest carbon</a>. Please do get in touch if you have thoughts on this aspect of project development.</p><h1>References</h1><ul><li>Rau et al (2024). Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release. <a href=\"https://doi.org/10.1080/17583004.2024.2390854\" target=\"_blank\"><i>10.1080/17583004.2024.2390854</i></a></li>\n<li>Swinfield et al (2024). Nature-based credit markets at a crossroads. Springer Science and Business Media LLC. <a href=\"https://doi.org/10.1038/s41893-024-01403-w\" target=\"_blank\"><i>10.1038/s41893-024-01403-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/nature-crossroads",
      "title": "Nature Sustainability commentary on carbon and biodiversity credits",
      "summary": "Experts discuss reforming carbon and biodiversity credits markets for effective forest conservation and climate benefit.",
      "date_published": "2024-08-15T00:00:00.000000Z",
      "date_modified": "2024-08-15T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":4c",
        ":rsn",
        ":2023-naturecredits",
        "conservation",
        "economics",
        "carboncredits",
        "nbs",
        "biodiversity"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1080/17583004.2024.2390854",
          "doi": "10.1080/17583004.2024.2390854",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41893-024-01403-w",
          "doi": "10.1038/s41893-024-01403-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-terracorder-1",
      "content_html": "<p>Our preprint on the Terracorder ground sensing platform I've been working with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> at Imperial on is now available on arXiv. It's a heady combination of ESP32 very low power hardware, combined with Q-learning to build cooperative networks of them that can run for long periods of time without wasting energy on redundant operations. The Terracorder is a versatile multi-sensor device designed for biodiversity monitoring in remote environments. Josh's clever on-device reinforcement learning scheduler captures more than 80% of events at less than 50% of the number of activations of the best-performing fixed schedule. We also explore how a collaborative scheduler can maximize useful operation across a network of devices, improving overall power consumption and robustness - crucial for long-term deployment in places like tropical forests.</p><h1>References</h1><ul><li>Millar et al (2024). Terracorder: Sense Long and Prosper. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2408.02407\" target=\"_blank\"><i>10.48550/arXiv.2408.02407</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-terracorder-1",
      "title": "Preprint on Terracorder sensing now available",
      "summary": "Preprint on ultra-low-power ESP32-based biodiversity sensor using Q-learning to optimize cooperative network operations.",
      "date_published": "2024-08-01T00:00:00.000000Z",
      "date_modified": "2024-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "terracorder",
        "sensing",
        "biodiversity",
        "esp32"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-terracorder.pdf",
          "mime_type": "application/pdf",
          "title": "Terracorder: Sense Long and Prosper"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2408.02407",
          "doi": "10.48550/arXiv.2408.02407",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2023-naturecredits-1",
      "content_html": "<p>Our commentary on nature-based credits has been published in <a href=\"https://www.nature.com/natsustain/\">Nature Sustainability</a>. I wrote some <a href=\"/notes/nature-crossroads\">thoughts</a> about it here as well. This piece argues that nature-based credit markets are at a critical crossroads - recent impact evaluations showing disappointing results have threatened investor confidence. We make the case that these markets need fundamental reform to adopt the latest scientific understanding on additionality, leakage, and permanence. The key proposal is releasing credits ex-post only after proven demonstrable additionality relative to statistically-derived counterfactuals, and making credit estimation methods robust to rather than resistant to scientific improvements. Without these reforms, we risk losing one of our most promising tools for drawing private investment into conservation.</p><h1>References</h1><ul><li>Swinfield et al (2024). Nature-based credit markets at a crossroads. Springer Science and Business Media LLC. <a href=\"https://doi.org/10.1038/s41893-024-01403-w\" target=\"_blank\"><i>10.1038/s41893-024-01403-w</i></a></li>\n<li>(2025). Nature Sustainability. <a href=\"https://doi.org/https://www.nature.com/natsustain/\" target=\"_blank\"><i>https://www.nature.com/natsustain/</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-naturecredits-1",
      "title": "Nature Sustainability article on carbon/biodiversity credits",
      "summary": "Nature Sustainability commentary on nature-based carbon and biodiversity credit markets.",
      "date_published": "2024-08-01T00:00:00.000000Z",
      "date_modified": "2024-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "biodiversity",
        "economics",
        "nature",
        "carboncredits",
        ":4c"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-naturecredits.pdf",
          "mime_type": "application/pdf",
          "title": "Nature-based credit markets at a crossroads"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s41893-024-01403-w",
          "doi": "10.1038/s41893-024-01403-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/https://www.nature.com/natsustain/",
          "doi": "https://www.nature.com/natsustain/",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/student-ideas",
      "content_html": "<p>I've refreshed the set of project <a href=\"/ideas\">ideas</a> for incoming <a href=\"https://www.cst.cam.ac.uk/teaching/part-ii\">CST Part II</a> and <a href=\"https://www.cst.cam.ac.uk/teaching/masters\">MPhil</a> and PhD student projects for 2024-2025.</p>\n<p>These are not an exhaustive list, but intended to kickstart conversations for things we could work on together. Do get in touch if you're an incoming student and see something that grabs your interest.</p>",
      "url": "https://anil.recoil.org/notes/student-ideas",
      "external_url": "/ideas",
      "title": "New set of ideas for incoming students",
      "summary": "Explore new project ideas for CST Part II, MPhil, and PhD students starting in 2024-2025.",
      "date_published": "2024-07-15T00:00:00.000000Z",
      "date_modified": "2024-07-15T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "cambridge",
        "research"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/p7kck-5bt81",
      "content_html": "<p>This is a trip report of <a href=\"https://compass.acm.org\">ACM COMPASS 2024</a> held in New Delhi, which had a novel track of <a href=\"https://compass.acm.org/research-impact-collaboratives/\">&quot;Research to Impact Collaboratives&quot;</a> that drew me in. The general chair, <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> wrote a fantastic book on &quot;<a href=\"https://www.cse.iitd.ac.in/~aseth/act.html\">Technology and Disempowerment</a>&quot; a few years ago, and he organised one RIC session on the CoRE Stack -- a climate adaptation stack for rural communities. This was a must-visit for me as it is closely related to the work we've been doing on <a href=\"/projects/rsn\">Remote Sensing of Nature</a> and <a href=\"/projects/plancomp\">Planetary Computing</a>. The following notes are somewhat raw as they have only been lightly edited, but please refer to the more polished documents on the <a href=\"https://docs.google.com/document/d/1MJ-Nw_P3z6gI9rvh4OcjJmdZRE83D_OXedgEeDZDnm8/edit\">agenda for ACM COMPASS RIC</a> and the overall <a href=\"https://core-stack.org\">CoRE Stack</a> initiative on commoning technologies for resilience and equality</p>\n<p>The conference itself was held at <a href=\"http://iiitd.ac.in/\">IIIT-D</a> in New Delhi, right at the cusp of the monsoon season and after record-breaking temperatures. Luckily, as always, the hospitality and welcoming nature of New Delhi overrode all the climate discomfort!</p>\n<p><img src=\"/images/compass24/compass24-17.webp\" alt=\"%c\" title=\"Arriving at the IIIT-D campus\" ></p>\n<p>The main focus of this report is the one-day RIC held on the 8th July 2024. 
The RIC had around <a href=\"https://docs.google.com/spreadsheets/d/1IF7bOT-868ky138ysKXZE-BBN0z6KjI7D7ZjfKufFQQ/edit?gid=0#gid=0\">60 attendees</a> in person and 40 online, and was a mix of presentations and discussions on the CoRE stack and how it could be used to address climate adaptation in rural communities. The day was divided into two sessions, with the first being a series of scene-setting presentations by practitioners and researchers, and the second being a series of breakout discussions on how the CoRE stack could be used in different contexts.</p>\n<h2 id=\"intro-the-ric-core-stack-aadi-seth\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#intro-the-ric-core-stack-aadi-seth\"></a>Intro: The RIC CoRE stack (Aadi Seth)</h2>\n<p>Data-driven methods enable new approaches to socio-ecological system health, but need to be grounded in community-based practice, and the scope is too vast for any one group to handle.  The CoRE stack (Commoning for Resilience and Equality) is being architected as a digital public infrastructure consisting of datasets, pre-computed analytics, and tools that can be used by rural communities and other stakeholders to improve the sustainability and resilience of their local landscapes. It will enable innovators to build upon and contribute their own datasets, use APIs for third-party apps, and track and monitor socio-ecological sustainability through a systems approach. 
The CoRE stack broadly consists of four layers.</p>\n<p><img src=\"/images/compass24/compass24-19.webp\" alt=\"%c\" title=\"Getting a signed copy of Aadi's book!\" ></p>\n<p>The broad approach is bottom-up usecase discovery, picking a digital public infrastructure approach to work with civic services, and doing distributed problem solving across stakeholders in academia, government and business.\nAadi noted the need to balance between standards and design and end-to-end servicing, and the overheads of collaboration across so many people; see the notes on <a href=\"https://docs.google.com/document/d/1akzDkbCxbXQe49uaArNLw-2z_AYtF5jjZxR2UGJ66o0/edit\">RIC collaboration across people</a>.</p>\n<p>Aadi then described the CoRE stack as a logical layered architecture:</p>\n<ul>\n<li>Layer 1 is the inclusion of new datasets: what are the standards and processes\nbehind this? There are a lot of geospatial data products around, including\ncommunity data that has been gathered in an ad-hoc way.</li>\n<li>Layer 2 is the generation of indicators, APIs and reports which give us\nlandscape-level socio-ecological indicators. It includes alert services,\ncomputation infrastructure and support.</li>\n<li>Layer 3 is the tools and platforms for implementation partners and\ncommunities. There are planning tools that are community based and\nparticipatory processes. Once we &quot;know our landscape&quot; we can develop fund\nallocation guidelines. An example of such a tool is Jaltol, for landscape and\nsite-level analysis. And ultimately we want to support new innovations such as\ndMRV for credits or socio-ecological indices.</li>\n<li>Layer 4 is about integrating into government and market programmes, such as\nwater security, forestry and biodiversity credits, natural farming, flood\nhazard adaptation and so on.</li>\n</ul>\n<p>To enable this, Aadi motivated the need to work together with networked co-creation and a\ndigital commons and build on top of it with open licenses. 
We need to overcome\nfault lines not only in terms of new climate change problems but also\nsocio-ecological barriers. And ultimately we need to go to scale and work with\ngovernment departments to make urgent improvements.</p>\n<p>An example of this is water security, via <a href=\"https://welllabs.org/jaltol/\">WellLabs Jaltol</a> which allows for\nlandscape characterisation for action pathways and site validation decision\nsupport tools, but also builds community based capacity for social accountability.\nE.g. given a drainage network, if you were to construct a new water body at this\npoint, what would the effect be on downstream water bodies and the communities that depend on them?</p>\n<p><img src=\"/images/compass24/compass24-2.webp\" alt=\"%c\" title=\"The general chair, Aadi Seth, opening the conference\" ></p>\n<p>Aadi stated the goals for this RIC:</p>\n<ul>\n<li>Find new usecases, what challenges exist, and what principles we adopt for collaboration.</li>\n<li>Look at problems through different lenses: issues of equity, data validity, unnecessary digital labour, aligned with knowledge commons, scaling challenges, productisation challenges.</li>\n<li>Consider the data and algorithm standards necessary to enable networked co-creation but not hinder progress.</li>\n<li>Think in productised terms for end-to-end usecases to solve real problems in rural communities.</li>\n</ul>\n<h2 id=\"discussion-session-1\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#discussion-session-1\"></a>Discussion Session 1</h2>\n<h3 id=\"sustainability-action-at-scale-abhijeet-parmar-isb\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#sustainability-action-at-scale-abhijeet-parmar-isb\"></a>Sustainability Action at Scale. 
Abhijeet Parmar (ISB)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1wZhXjRCStvkFIHh9Lo4UwIGFSezRdUKX/edit#slide=id.p1\">Slides</a>\n<img src=\"/images/compass24/compass24-3.webp\" alt=\"%r\" title=\"Abhijeet Parmar presenting\" ></p>\n<p>The speaker highlighted the importance of scalability in approaches, particularly in the context of technological applications. Applications must remain simple, grounded in community needs, and usable by the general public. A key problem discussed was the extraction of Above-Ground Biomass (AGB) using smartphone cameras while traversing forested areas. Traditional Lidar-based systems, though effective in providing detailed depth information, are deemed impractical due to the specialised equipment required.</p>\n<p>The proposed solution involves creating a Self-Supervised Learning (SSL) model that utilises mobile phones to conduct real-time segmentation of individual trees as one walks through a forest. This approach leverages a pre-trained segmentation model alongside advanced modelling and tagging processes.</p>\n<p>The development involves three distinct pipelines, which could be integrated into a single application in the future. Consideration must be given to the UI design to ensure accessibility and effectiveness by rural populations. Advancements in data collection, benchmarking, and pipeline development suggest that such technology could support large-scale forest management initiatives, particularly in public policy contexts. The initial testing phase of this model is being conducted under controlled conditions, including specific lighting and seasonal factors, with plans to extend its applicability.</p>\n<p>During the discussion, a question was raised regarding the allocation of funds for tree planting initiatives and identifying a starting point. Answer: it was suggested that bamboo, a valuable resource for biofuel production, could be a focal point. 
The Indian landscape has sufficient bamboo to meet current biofuel demand, and directing Corporate Social Responsibility (CSR) funds towards this effort could significantly expedite progress.</p>\n<p><em>During a break later I showed <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>'s GreenLens mobile app for estimating DBH from a mobile phone image (see <a href=\"https://drive.google.com/drive/folders/17-Yu3KXcgJiFapGc2AjJ2dHNC30YUbup?usp=sharing\">app download</a>).</em></p>\n<h3 id=\"plantation-monitoring-for-drone-images-snehasis-mukherjee-snu\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#plantation-monitoring-for-drone-images-snehasis-mukherjee-snu\"></a>Plantation monitoring for drone images, Snehasis Mukherjee (SNU)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1yyqx1Z8aVwtHnbkycGiSV_L7WllaI8JI/edit#slide=id.p3\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-4.webp\" alt=\"%r\" title=\"Snehasis Mukherjee presenting\" >\nThe presentation by Snehasis Mukherjee focused on plantation monitoring using drone imagery, addressing the limitations of satellite images, esp. their inaccessibility to farmers. The workflow involves using drones at lower altitudes to capture detailed field imagery. The process begins with printing satellite images of a village onto paper, collaboratively marking land use with the locals, and proposing interventions. These are then imported into QGIS by a technical team, followed by field trips to gather further data using GeoODK, which is also integrated into QGIS. This iterative process is intended to inform local policy decisions at the Gram Sabha level.</p>\n<p>For drone imagery, the low-cost DJI Mini 2 with an RGB camera was chosen. Heights of 50-100m proved effective for capturing plantation images with sufficient resolution. The use cases include crop area estimation, classification, and monitoring plantation health. 
The first field trip occurred in Aug 2023 in Vittalpur village near Hyderabad, resulting in 253 usable images at ~50m (mainly of plantations).</p>\n<p>Image annotation was labor-intensive, with 100 images annotated by the team and 150 outsourced for ₹1000, resulting in approximately 9000 annotations. The Xception and ResNet50 models showed promising results with reduced overfitting, and 2000 acres have been mapped now with multiple tree varieties. The remaining challenge is how to supplement limited drone imagery with lower-resolution satellite images, since flying drones is expensive.</p>\n<h3 id=\"forestry-agroforestry-and-restoration-toolkit-using-technology-and-community-participation---ashish-kumar\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#forestry-agroforestry-and-restoration-toolkit-using-technology-and-community-participation---ashish-kumar\"></a>Forestry Agroforestry and Restoration Toolkit using Technology and Community Participation - Ashish Kumar</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1hJ0NwdiRq5hAvxSDsopznuZD-B8Ik-OX/edit#slide=id.p1\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-5.webp\" alt=\"%r\" title=\"Ashish Kumar presenting\" >\nAshish is building a community participation model to scale agroforestry, aiming to create a feedback/knowledge loop with locals. The goal is to promote tree planting outside traditional forestry areas and restore degraded common lands. The approach involves identifying degraded areas and building a toolkit to recommend suitable tree species.</p>\n<p>The project includes several modules: Species Distribution Modelling (SDM), water management, carbon sequestration, and economic analysis. Water management is particularly critical and is informed by <a href=\"https://www.sciencedirect.com/science/article/pii/S2214581820302068\">research from the Kolar district</a>, which has experienced declining groundwater levels since the 1990s, exacerbated by increasing demand. 
Remote sensing data shows significant variation in water usage depending on plant type and location (e.g., mango vs eucalyptus).</p>\n<p>Their work utilised the <a href=\"https://earlywarning.usgs.gov/docs/SSEBopETreadme.pdf\">SSEBop evapotranspiration</a> product, accessed via Google Earth Engine (GEE), to analyse water use and its implications for agroforestry efforts.</p>\n<h3 id=\"riverbed-sand-mining-activity-detection-based-on-satellite-imagery---siddharth-agarwal\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#riverbed-sand-mining-activity-detection-based-on-satellite-imagery---siddharth-agarwal\"></a>Riverbed sand mining activity detection based on satellite imagery - Siddharth Agarwal</h3>\n<p><a href=\"https://drive.google.com/file/d/1iXaGuY0Ihb1luCn3aifkYvIhX3aI4pzT/view\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-6.webp\" alt=\"%r\" title=\"Siddharth Agarwal presenting\" ></p>\n<p>This talk focussed on detecting riverbed sand mining activities using satellite imagery, particularly in areas where on-site visits are impractical. It turns out that sand is the second most extracted material globally after water, and its mining is a significant environmental concern, especially for river communities. The project aims to develop a machine learning model to detect such mining activities using S1/S2 (didn't catch which, or both) satellite data.</p>\n<p>India Sand Watch, an open data platform developed with <a href=\"https://www.ooloilabs.in\">Ooloi Labs</a>, aims to collect, annotate and archive data related to sand mining in India. This emerged due to the high costs associated with using and processing detailed satellite imagery, and the need to understand sand mining comprehensively. 
The project covers the entire sand mining process, from discovery and land auctions to clearances and mining, and includes a 'sites of violence' framework that identifies intervention points.</p>\n<p>A significant challenge identified was the readability of documents associated with planning, which can be difficult even for humans let alone LLMs, making digitisation and structuring of data crucial. The transition from discovery to the actual mining site often involves navigating poorly accessible documents, highlighting the need for better evidence pipelines. <em>Note to self: just like our <a href=\"/projects/ce\">Conservation Evidence Copilots</a> project!</em></p>\n<p>They are collaborating with Berkeley to develop a machine learning model that predicts mining activity using low-resolution imagery (thus saving costs), covering vast areas (10000+ km²) with Sentinel-1/2 as base maps. Their goal is to combine this data to create large-scale evidence that can then be used to drive large-scale action. This approach has been validated in court, where the data was accepted as evidence by the <a href=\"https://greentribunal.gov.in\">National Green Tribunal</a> (NGT).</p>\n<p>Q: is the community getting involved? A: The initiative began with community action, reflecting concerns over sand mining's impact on ecosystems, as sand is the second most extracted material globally after water.</p>\n<h2 id=\"session-2\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#session-2\"></a>Session 2</h2>\n<h3 id=\"proposal-for-a-new-interactive-electronic-publication-standard---r-nandakumar\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#proposal-for-a-new-interactive-electronic-publication-standard---r-nandakumar\"></a>Proposal for a new interactive electronic publication standard - R. 
Nandakumar</h3>\n<p><a href=\"https://docs.google.com/presentation/d/142YSXa8IUUmKSKUhH1TvIN-PaD1Cuirv/edit#slide=id.p1\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-7.webp\" alt=\"%r\" title=\"R. Nandakumar presenting\" >\nR. Nandakumar (recently retired from ISRO but still working in this space) proposed a new interactive electronic publication standard aimed at improving the quality of information products in communicating research results more interactively. He seeks to integrate code with data, ensuring version control while addressing security and privacy concerns. The current business model, which relies on distracting advertisements, exacerbates the digital divide, especially with rural communities, and hampers effective communication.</p>\n<p>He highlighted several issues with existing formats: inadequate representation of images, maps, infographics, and spreadsheets, and the absence of interactive features like running commentaries during visualisation animations. There is also a lack of fine printing and zoom capabilities and of flexible authorisation mechanisms.</p>\n<p>His proposal suggests evolving existing standards (like PDFs) into more interactive and self-contained formats that include code. The first phase would extend 2D image maps to support animations and metadata while embedding free and open-source software within the PDF. The second phase could expand this to include 3D models.</p>\n<p>The end goal is to standardise interactions across various formats (image maps, spreadsheets, infographics, animations, and audiovisual content) using the ISO/IEC 25010 SQuaRE standard, which provides a comprehensive framework for functionality, performance, compatibility, usability, reliability, security, maintainability, and portability. (see slides for more details on each of these)</p>\n<p><em>My mad idea:</em> might we build a WASM interpreter in JavaScript so that it can run inside the existing PDF JS interpreter and work with existing docs? 
WASI for PDF! I've got a project idea relevant to this that can perhaps be extended or forked; see <a href=\"/ideas/life-explorer-wasm\">Using wasm to locally explore geospatial layers</a>.</p>\n<h3 id=\"geospatial-data-standards-to-enable-co-creation-of-data-products-craig-dsouza\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#geospatial-data-standards-to-enable-co-creation-of-data-products-craig-dsouza\"></a>Geospatial data standards to enable co-creation of data products (Craig Dsouza)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1n1CN66Yh9wKKcquMHInbQPSRCkPY9vmhae-_ogJmIcg/edit#slide=id.g2eaa42613c0_0_73\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-8.webp\" alt=\"%r\" title=\"Craig Dsouza presenting\" ></p>\n<p>There is an overload of data and algorithms in all directions, so we want to accelerate development of <em>better</em> data and algorithms rather than simply more of them. How do we increase trust and reduce friction in the source data and eventual results with rural communities?\nDomain-specific standards either don't exist or aren't widely adopted (see previous talk), especially for natural resource management, where data can be of different modalities and resolutions: some commonality exists, but sector-specific extensions to current standards are required to deal with local variability.</p>\n<p>So they are surveying data standards and algorithm standards. To consider data standards first, the most successful is OpenStreetMap. For algorithm standards, there are rapidly adopted services like HuggingFace. 
But what is the <em>combination</em> of both so that they can be coupled to real outcomes?</p>\n<p>How do we compare the performance of data standards and build guiding principles of which ones to pick?</p>\n<ul>\n<li><em>to reduce friction:</em>\n<ul>\n<li>consider the time taken for dataset and model integration with existing open source tools</li>\n<li>or the time taken for the end user to create a new dummy datapoint.</li>\n<li>time taken for end user to run the model and make the first minor fix.</li>\n</ul>\n</li>\n<li><em>to accelerate development:</em>\n<ul>\n<li>number of collaborators over time</li>\n<li>number of additions by 3rd parties over time</li>\n<li>increase in model performance over time</li>\n</ul>\n</li>\n</ul>\n<p>An existing example is how to share a LULC dataset using existing open geospatial standards (<a href=\"https://stacspec.org/en\">STAC</a>). The data standard creates a simple JSON file which has metadata for that module. The data user can then access the latest version of the data via either an API or the STAC browser.</p>\n<p><em>TODO for myself:</em> Look at mapping these metrics onto our TMF pipeline (in <a href=\"/projects/4c\">Trusted Carbon Credits</a>) and investigate a possible user study with some CCI data. Also, is STAC relevant to the TMF/LIFE/FOOD publishing pipeline in <a href=\"/projects/life\">Mapping LIFE on Earth</a>, as we need to publish the various layers there soon?</p>\n<h3 id=\"geospatial-data-flow-management---anil-madhavapeddy\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#geospatial-data-flow-management---anil-madhavapeddy\"></a>Geospatial data flow management - Anil Madhavapeddy</h3>\n<p>My talk, I was speaking, so no notes! 
I'll upload the slides later and edit this section.</p>\n<p>Good question from the audience about healthcare management and its relevance to planetary computing -- it seems to share a lot of the problems involving data sensitivity and the need for spatially explicit data sharing.</p>\n<h3 id=\"opportunities-in-agricultural-sensing---anupam-sobti\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#opportunities-in-agricultural-sensing---anupam-sobti\"></a>Opportunities in agricultural sensing - Anupam Sobti</h3>\n<p><a href=\"https://docs.google.com/presentation/d/11XAuKb78TpIpMkZGYWn58I3iQnlBvRmQ/edit#slide=id.p1\">Slides</a></p>\n<p>Anupam introduced the main questions across the rural farming cycle including:</p>\n<ul>\n<li><em>Sowing:</em> &quot;Is this the right crop?&quot; &quot;Will I have enough resources (water, heat, seeds)?&quot; &quot;Are these the right seeds?&quot;</li>\n<li><em>Harvesting:</em> &quot;Is this the right time to harvest?&quot; &quot;How do I plan post-harvest logistics?&quot; &quot;How do I manage residue?&quot;</li>\n<li><em>Selling:</em> &quot;Is this the right time to sell?&quot; &quot;Who do I trust to sell to?&quot; &quot;Do I sell now or wait?&quot;</li>\n</ul>\n<p>So onto the notion of &quot;Agricultural Computing&quot;, which:</p>\n<ul>\n<li>involves multiple decision layers: farmer-centric, government-centric, and finance-centric.</li>\n<li>features recent innovations such as advancements in remote sensing and game theory applications to navigate complex agricultural decisions.</li>\n</ul>\n<p>Urban heat islands are a significant problem detectable with geospatial data. He referenced the paper by\nMohajerani, Abbas, Jason Bakaric, and Tristan Jeffrey-Bailey. 
&quot;The urban heat island effect, its causes, and mitigation, with reference to the thermal properties of asphalt concrete.&quot; <em>Journal of Environmental Management</em> 197 (2017): 522-538.</p>\n<p><em>Note to self: Send to <a href=\"https://ancazugo.github.io/\">Andres Zuñiga-Gonzalez</a> re <a href=\"/papers/2024-green-urban-eq\">Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications</a>.</em></p>\n<p><em>Q:</em> For marginalised communities, should there be standards for interactions to obtain feedback iteratively, reducing the shock of policy changes? <strong>A:</strong> There is a need for significant groundwork engineering right now to provide immediate feedback, helping communities adapt more smoothly to changes.</p>\n<h3 id=\"understanding-soil-moisture-regime-for-crop-diversification---prachi-d-patil\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#understanding-soil-moisture-regime-for-crop-diversification---prachi-d-patil\"></a>Understanding Soil Moisture Regime for Crop Diversification - Prachi D. Patil</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1ZZMqF-8hCIupNm5VUH8wu61v9eTuI1e-/edit#slide=id.p1\">Slides</a>\n<img src=\"/images/compass24/compass24-9.webp\" alt=\"%r\" title=\"Prachi D. Patil presenting\" ></p>\n<p>Prachi gave a perspective from the farmer's fields, with a study aiming to group relatively homogenous regions based on soil, climate, and physiography, focusing on moisture availability periods for soil and the length of the growing season. Their approach uses simple moisture sensors at various depths to measure soil resistivity, providing farmers with real-time information on whether to irrigate. 
This system can map dry spells and their duration, offering actionable insights for crop management.</p>\n<p>The <a href=\"https://www.wassan.org/wp-content/uploads/2022/03/WASSANPublication_BhagyalakshmiUthappaSudhakarUday_03032022.pdf\">Navadhanya system</a> is a traditional cropping method with specific design and crop geometry, which can be analysed for soil moisture as a multidimensional system—both spatially and temporally. Different crops have varying maturity and root depth cycles, making soil moisture critical for establishing and protecting these crops. A fallow period during a critical stage can lead to crop loss and so highlights the importance of consistent moisture.</p>\n<p>Navadhanya bridges traditional crop mixing knowledge with modern scientific sensor methods as described in the talk. Navadhanya offers nutritional security through crop variety though farmers typically sell a reliable monocrop in the market. Their analysis suggests a need to consider soil use regimes both in the short and long term, challenging the practice of forcing farmers to switch crops (e.g., from rice to bajra) based on short-term  profitability.</p>\n<p><strong>Q:</strong> How can this tool assist with monsoon management? 
<strong>A:</strong> The tool can map soil moisture and integrate it with traditional knowledge, enabling the development of combined solutions for managing monsoon impacts.</p>\n<h3 id=\"ranking-and-financing-based-on-climate-smart-agriculture---atanu-garai-socialwell\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ranking-and-financing-based-on-climate-smart-agriculture---atanu-garai-socialwell\"></a>Ranking and financing based on climate smart agriculture - Atanu Garai (SocialWell)</h3>\n<p><a href=\"https://docs.google.com/document/d/1MJ-Nw_P3z6gI9rvh4OcjJmdZRE83D_OXedgEeDZDnm8/edit\">Slides</a>\n<img src=\"/images/compass24/compass24-10.webp\" alt=\"%r\" title=\"Atanu Garai presenting\" >\n<img src=\"/images/compass24/compass24-11.webp\" alt=\"%r\" title=\"The machine learning approaches to climate models\" ></p>\n<p>Atanu switched tack to the business side of things, focused on switching Farmer Producer Organisations (FPOs), of which there are 10000+ in India, to adopt climate-smart practices. 
The incentive based approach includes:</p>\n<ol>\n<li><strong>Business Plan:</strong> Farmers, FPOs, and market data collaboratively generate a business plan, which is then used by FPOs to secure loans.</li>\n<li><strong>Land Parcels and FPO Rating:</strong> Farm inputs, soil, and weather data are tracked to classify and rate each land parcel.</li>\n<li><strong>Climate Smart Financing:</strong> Execute the plan based on the gathered data.</li>\n</ol>\n<p>The key requirements for obtaining an FPO Land Parcel Rating with their method are:</p>\n<ol>\n<li><strong>Farm Inputs:</strong> Data on seeds, fertilizers, and pesticides provided by the FPO and sourced by the farmer, recorded by the FPO.</li>\n<li><strong>Soil Data:</strong> Rating of soil using a combination of mobile and sensor technologies.</li>\n<li><strong>Climate Data:</strong> Sourced from public datasets, focusing on classifying rainfall and extreme weather events.</li>\n<li><strong>Farm Practices:</strong> Documentation through photos of sowing, irrigation, and data on the methods used.</li>\n</ol>\n<p>For climate data, their approach involves using neural network-based chaos forecasting to provide weather predictions in a format useful to farmers. 
<em>The second half of the presentation went into great detail into their ensemble methods to predict weather patterns, which I didn't note in detail, but see <a href=\"/ideas/diffusion-model-satellites\">Diffusion models for terrestrial predictions about land use change</a>.</em></p>\n<h2 id=\"session-3\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#session-3\"></a>Session 3</h2>\n<h3 id=\"groundwater-monitoring-tool-challenges-to-apply-ecological-health-monitoring-at-scale---himani-sharmachiranjit-guha\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#groundwater-monitoring-tool-challenges-to-apply-ecological-health-monitoring-at-scale---himani-sharmachiranjit-guha\"></a>Groundwater monitoring tool, challenges to apply ecological health monitoring at scale - Himani Sharma/Chiranjit Guha</h3>\n<p><a href=\"https://docs.google.com/presentation/d/14zesuTt8R9UGOvaSXsvOPwARO-c4xyg6/edit?usp=sharing&amp;ouid=116413035808485050246&amp;rtpof=true&amp;sd=true\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-12.webp\" alt=\"%r\" title=\"Himani Sharma presenting\" >\nGroundwater monitoring in India faces significant data scarcity, with only 4886 wells having long-term data in the whole country, averaging just 7 wells per district. To address this 150+ organisations collaborated a few years ago to create an Android app for crowdsourcing groundwater data. Starting with 5000 villages, the project has now expanded to 11000+ villages and is used both pre- and post-monsoon and is revealing substantial fluctuations in water levels.</p>\n<p>The app enables users to generate village-level groundwater maps, correlating water level data with geological information to create comprehensive groundwater flow maps, even within individual villages. 
The process involves measuring water depth from three wells per village, using GPS and mobile devices, and rendering the data on an online platform.</p>\n<p><img src=\"/images/compass24/compass24-sm-ss.webp\" alt=\"%c\" title=\"Soil moisture measurements\" >\nThe crowdsourcing presents challenges in data quality, requiring post-processing and filtering. Despite this, the analysis has been highly effective, and the Jaldoot scheme now covers 450000+ villages as of 2023, following extensive lobbying of the Indian government, which is now supporting it directly.</p>\n<p>In addition to groundwater monitoring, efforts are also focused on community-based ecological health monitoring, including biodiversity, biomass assessment, and pollinator/insect tracking. Four sample watersheds with detailed socio-ecological-economic indicators and over 150 annual monitoring sites are used to track changes in vegetation and species over time. These assessments reveal valuable insights (e.g., the increased presence of a rare frog in specific watersheds) but are resource-intensive and challenging to scale. Potential solutions include GIS-based platforms, remote sensing, and tools for tracking changes in standing biomass, carbon stock, and biodiversity.</p>\n<p><em>Note to self:</em> Possible connection with the iRecord team in the UK to explore applicability of biodiversity data collected?</p>\n<p>The project also maps areas highly infested by invasive species, such as <a href=\"https://india.mongabay.com/2020/08/lantana-invasion-threatens-40-percent-of-indias-tiger-habitat-reports-study/\">Lantana camara</a>, to focus restoration efforts, and is drawing on data from 150+ sites.</p>\n<p>Q: what are the next steps? A: The Android app will be withdrawn in the next few years, as the government is taking over after creating a similar app. Declaring the project a success! Q: But will the data remain open for the communities once the government takes over? 
A: The dataset collection is widening (e.g. to biodiversity) to cover things not yet considered, such as ecosystem services. The future of the government-run data provenance is not clear.</p>\n<h3 id=\"land-suitability-assessment----athithiyan-mr\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#land-suitability-assessment----athithiyan-mr\"></a>Land Suitability Assessment -- Athithiyan MR</h3>\n<p><a href=\"https://docs.google.com/presentation/d/19rXpXNoizFA-Pc8UKXC0G1qbfzSm3iZ-/edit#slide=id.p1\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-13.webp\" alt=\"%r\" title=\"Athithiyan presenting\" >\nTheir &quot;LifeLands&quot; system is designed to unlock the productive potential of degraded lands, aiming to mitigate climate impacts through better land use. The digital planning tool they built utilises satellite imagery, public databases, and AI modelling to assess land suitability for regenerative purposes such as solar energy, sustainable water management, or ecological restoration.</p>\n<p>The system integrates geospatial and socioeconomic data layers, along with public datasets, to produce an interactive map and report, determining whether land is unused and suitable for intervention. 
Data collection is facilitated through a mobile app that traces land boundaries using GPS, captures four site photos and a video, and gathers information on land ownership and existing vegetation (shrubs and trees).</p>\n<h3 id=\"designing-for-context---aila-dutt\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#designing-for-context---aila-dutt\"></a>Designing for Context - Aila Dutt</h3>\n<p><a href=\"https://docs.google.com/presentation/d/19lThkR3LfHhQvDibQiHs_vtNeCr4XOFj/edit#slide=id.p1\">Slides</a></p>\n<p><img src=\"/images/compass24/compass24-14.webp\" alt=\"%r\" title=\"Aila Dutt presenting\" >\nCitizens and community stewards need to be able to understand, analyse and apply various concepts and data around climate change to understand the intricacies of socio-economic changes. So how might we simplify complex systems and data to encourage data driven decision making through these interventions? To be successful this needs participatory decision making and a reclamation of agency by each of the stakeholders within the system.\nIt is essential for citizens and community stewards to comprehend, analyse, and apply complex concepts and data. 
The goal is to simplify these systems and data, fostering participatory decision-making and empowering stakeholders to reclaim their agency within the system.</p>\n<p>Broad research approach:</p>\n<ol>\n<li><strong>Discover:</strong> Conduct field research, interviews, observations, secondary research, and expert consultations.</li>\n<li><strong>Define:</strong> Engage in systems mapping, curriculum design, and persona mapping using analogous examples.</li>\n<li><strong>Ideate:</strong> Perform field testing, map problems to solutions, and explore sacrificial concepts.</li>\n<li><strong>Prototype:</strong> Conduct usability testing, create sketches and wireframes, and integrate data analytics.</li>\n</ol>\n<p>To enhance understanding, environmental education and curriculum design can incorporate semi-fictional &quot;case studies&quot; that place users in relatable contexts. This approach increases adoption by breaking the system into modules and using gamification to test concepts. For example, users can explore the concept of 'climate change' as it pertains to their own land and prosperity.</p>\n<p>In the analysis phase, it’s crucial to not only graph data but also describe it in ways that participants can relate to their own landscapes. The decision-making process must integrate data-driven insights with existing frameworks. Generative images and brainstorming sessions are used to develop innovative ways to visualise complex data, such as precipitation and climatic variables, in a simple and understandable form.</p>\n<p><strong>Example Activity:</strong> &quot;Set a 15-minute timer and brainstorm all possible ways to present data simply.&quot; Consider descriptors like terrain, slopes, plains, rainfall, surface water, MNREGA projects, and agriculture to see how users can better utilise this information.</p>\n<p><strong>Q:</strong> Is 'making data actionable' a priority, and how do we address the tragedy of the commons? 
<strong>A:</strong> Yes, systems thinking and collaboration are essential to prevent resource depletion and ensure shared benefits.\n<strong>Q:</strong> Can this approach scale from smaller to larger communities? <strong>A:</strong> Yes, by developing microwatershed data and village-level datasets, even large communities can work at much smaller, more precise resolutions.</p>\n<p><img src=\"/images/compass24/compass24-group1.webp\" alt=\"%c\" title=\"The attendees of the RIC\" ></p>\n<h2 id=\"group-sessions\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#group-sessions\"></a>Group Sessions</h2>\n<p>After this, we split into groups to discuss roughly the following topics:</p>\n<ul>\n<li>What do we need to do to take this to scale? e.g. remote sensing: works at some scale, but validation also needs to scale.</li>\n<li>Then we saw new usecases. E.g. soil moisture. Now we need to think this through and come up with a succinct problem statement.</li>\n<li>Start working through some datasets and algorithms as examples and turn them into a spec. What is the specification process and ultimate metadata standards?</li>\n<li>One group then will work on methods to facilitate community engagement with data.</li>\n<li>And then what are principles and processes for effective collaboration and co-creation. What are barriers?</li>\n</ul>\n<p>I'll follow up with more analysis about the outcomes soon, as I'm in touch with Aadi and hopefully we will be working on a project together in the future. 
But for now, I'll conclude this trip report with great appreciation for Aadi and the hard-working volunteers at COMPASS 2024 who made attendance such a pleasure!</p>\n<p><img src=\"/images/compass24/compass24-18.webp\" alt=\"\" title=\"Glorious Delhi sunset to finish the conference\" >\n<img src=\"/images/compass24/compass24-21.webp\" alt=\"\" title=\"Spotted some electric charging stations!\" >\n<img src=\"/images/compass24/compass24-22.webp\" alt=\"\" title=\"Made it back to London in time to catch some tennis\" ></p><h1>References</h1><ul><li>Zuñiga-Gonzalez et al (2024). Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications. <a href=\"https://doi.org/10.5194/egusphere-egu24-20833\" target=\"_blank\"><i>10.5194/egusphere-egu24-20833</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/compass2024-ric-tripreport",
      "title": "COMPASS 2024 report on the CoRE stack RIC meeting",
      "summary": "Report from COMPASS 2024 on the CoRE stack RIC meeting on climate adaptation for rural communities using digital public infrastructure and commoning technologies",
      "date_published": "2024-07-08T00:00:00.000000Z",
      "date_modified": "2024-07-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        ":rsn",
        ":life",
        "conservation",
        "livenotes",
        "india",
        "biodiversity",
        "sensing",
        ":plancomp"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.5194/egusphere-egu24-20833",
          "doi": "10.5194/egusphere-egu24-20833",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-life-2",
      "content_html": "<p>We have made an update to the <a href=\"/projects/life\">LIFE</a> biodiversity metric based on reviewer feedback, and are very pleased that it has been accepted for publication early next year as part of a special issue from the Royal Society. Any comments would be most welcome before we submit the final proofs in a few months. The revisions incorporated feedback from peer reviewers and represent an improved version of the methodology. Getting accepted for the Royal Society special issue is particularly exciting as it means the LIFE metric will be published alongside other important contributions to biodiversity science. This metric has already proven useful in our work on food systems and will hopefully become a widely-used tool for quantifying biodiversity impacts across diverse applications.</p><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-life-2",
      "title": "Second preprint of the LIFE biodiversity metric available",
      "summary": "Updated LIFE biodiversity metric preprint based on reviewer feedback, accepted for publication in Royal Society special issue.",
      "date_published": "2024-07-01T00:00:00.000000Z",
      "date_modified": "2024-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "biodiversity",
        "spatial",
        "economics",
        "conservation",
        "sdms",
        "aoh"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-life.pdf",
          "mime_type": "application/pdf",
          "title": "LIFE: A metric for mapping the impact of land-cover change on global extinctions"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/80795e06-ac75-4015-b178-3cfcbb233685-1",
      "content_html": "<p><a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> organised a workshop at the CCI on how to bring about an <a href=\"https://about.conservationevidence.com/2024/07/12/the-next-steps-for-transforming-conservation-ideas-from-the-effectiveness-revolution-workshop/\">Effectiveness Revolution</a> for transforming conservation into an evidence-driven discipline.</p>\n<blockquote>\n<p>The aim was to discuss the &quot;Evidence Emergency&quot; (The Wildlife Trusts' term), the urgent need to embed evidence into decision-making and to create additional evidence to fill the considerable gaps in the evidence base, to improve conservation practice.</p>\n</blockquote>\n<p>I gave a talk about our early results with the <a href=\"/papers/2024-ce-llm\">conservation copilots</a> work.</p><h1>References</h1><ul><li>Iyer et al (2025). Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases. <a href=\"https://doi.org/10.1371/journal.pone.0323563\" target=\"_blank\"><i>10.1371/journal.pone.0323563</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/80795e06-ac75-4015-b178-3cfcbb233685-1",
      "title": "Speaking at CCI workshop on conservation evidence",
      "summary": "Talk at CCI Effectiveness Revolution workshop on conservation copilots and embedding evidence into conservation decision-making.",
      "date_published": "2024-06-25T00:00:00.000000Z",
      "date_modified": "2024-06-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "evidence",
        "llms",
        "ai"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1371/journal.pone.0323563",
          "doi": "10.1371/journal.pone.0323563",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aicam-interview-ce",
      "content_html": "<p>I talked to the <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> team to discuss our <a href=\"/notes/aicn-in-aicam\">AICN</a>\nproject and what we're planning to do in the <a href=\"/projects/ce\">Conservation Evidence Copilots</a> team.</p>\n<blockquote>\n<p>Over the last two decades, the University of Cambridge-based project Conservation Evidence has screened more than 1.6 million scientific papers on conservation, as well as manually summarising 8,600+ studies relating to conservation actions. However, the current project’s work is limited by the specialised skills needed to screen and summarise relevant studies. It took more than 75 person years to manually curate the current database and only a few 100 papers can be added each year. By accelerating these efforts, AI has the potential to transform the impact this database has on biodiversity conservation.</p>\n<p>What we’re aiming to do through the ai@cam project – bringing together an interdisciplinary team from across the fields of computer science, ecology, climate and conservation – is to build up models of the world that are really detailed and that can be queried by policy makers to help make informed decisions.</p>\n<p><cite>-- <a href=\"https://ai.cam.ac.uk/blog/harnessing-the-power-of-ai-to-help-save-our-planet\">AI@Cam</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/aicam-interview-ce",
      "external_url": "https://ai.cam.ac.uk/blog/harnessing-the-power-of-ai-to-help-save-our-planet",
      "title": "Interview with AI@CAM about conservation",
      "summary": "Discussing conservation with AI@CAM and using AI to transform biodiversity conservation efforts.",
      "date_published": "2024-06-09T00:00:00.000000Z",
      "date_modified": "2024-06-09T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "cambridge",
        "computerlab",
        "biodiversity",
        "evidence",
        "interview"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-hyper-tropical-mapping-1",
      "content_html": "<p>A preprint on using <a href=\"https://en.wikipedia.org/wiki/Hyperspectral_imaging\">hyperspectral sensors</a> to perform tree species identification across the tropics is now available on bioarxiv.</p>\n<blockquote>\n<p>This study introduces a new approach for mapping tree species linking a multi-temporal implementation of the CNN method detectree2 to segment tree-crowns from aerial photographs to machine learning classifiers to identify species from hyperspectral data.</p>\n</blockquote><h1>References</h1><ul><li>Ball et al (2024). Harnessing temporal & spectral dimensionality to identify individual trees in tropical forests. bioRxiv. <a href=\"https://doi.org/10.1101/2024.06.24.600405\" target=\"_blank\"><i>10.1101/2024.06.24.600405</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-hyper-tropical-mapping-1",
      "title": "Hyperspectrally identifying trees in tropical forests",
      "summary": "Preprint on using hyperspectral sensors and CNN tree-crown segmentation for tropical tree species identification.",
      "date_published": "2024-06-01T00:00:00.000000Z",
      "date_modified": "2024-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "sensing",
        "hyperspectral",
        "ai",
        "satellite",
        "forests"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-hyper-tropical-mapping.pdf",
          "mime_type": "application/pdf",
          "title": "Harnessing temporal & spectral dimensionality to identify individual trees in tropical forests"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1101/2024.06.24.600405",
          "doi": "10.1101/2024.06.24.600405",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/d592bf17-c835-435f-9469-f0f65e926975-1",
      "content_html": "<p>I was invited by Mary Sheeran to deliver a keynoted at <a href=\"https://www.lambdadays.org/\">Lambda Days</a>, and I decided to go along to talk about my work on  <a href=\"/videos/981c00b5-32c0-4cac-a387-6c945dfa9934\">Programming for the Planet</a>. The conference was a really vibrant crowd and I would definitely go along in future years. It's best summarised via an <a href=\"https://www.youtube.com/watch?v=Kao-LguvYDU&amp;list=PLvL2NEhYV4ZtX2TurK0BIlKD_cHct0rSs\">interview video</a> they took of all the speakers.</p>",
      "url": "https://anil.recoil.org/notes/d592bf17-c835-435f-9469-f0f65e926975-1",
      "title": "Programming for the Planet",
      "summary": "Planetary computing keynote at LambdaDays conference featuring interview video on biodiversity and satellite sensing work.",
      "date_published": "2024-05-27T00:00:00.000000Z",
      "date_modified": "2024-05-27T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        ":plancomp",
        "interview",
        "biodiversity",
        "sensing",
        "satellites",
        "fp",
        "sweden"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-food-life-1",
      "content_html": "<p>Submitted preprint on quantifying the biodiversity cost of global food consumption for peer review. This work links the LIFE biodiversity metric with food consumption and production data to quantify how different types of food and their production locations impact species extinctions. We discovered that the impact varies widely both across and within foods - in many cases by more than an order of magnitude. Using an opportunity-cost framing, we can estimate the marginal changes in expected extinctions from converting natural vegetation to agriculture or restoring farmland to natural habitat. Despite marked differences in per-capita impacts across countries, there are consistent patterns that could inform everything from national policies to individual dietary choices.</p><h1>References</h1><ul><li>Ball et al (2025). Food impacts on species extinction risks can vary by three orders of magnitude. <a href=\"https://doi.org/10.1038/s43016-025-01224-w\" target=\"_blank\"><i>10.1038/s43016-025-01224-w</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-food-life-1",
      "title": "Quantifying the impact of the food we eat on species extinctions",
      "summary": "Preprint on quantifying biodiversity cost of global food consumption submitted for peer review.",
      "date_published": "2024-05-01T00:00:00.000000Z",
      "date_modified": "2024-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "biodiversity",
        "food",
        "climate"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-food-life.pdf",
          "mime_type": "application/pdf",
          "title": "Food impacts on species extinction risks can vary by three orders of magnitude"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s43016-025-01224-w",
          "doi": "10.1038/s43016-025-01224-w",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-sdm-sa-1",
      "content_html": "<p><a href=\"https://github.com/emorris7\">Emily Morris</a> did some great MPhil work here in her Masters on using <a href=\"/ideas/sdms-with-cnns\">CNNs with satellite data</a> to do species predictions in South Africa better. She presented it at the <a href=\"https://www.climatechange.ai/events/iclr2024\">ICLR CCAI</a> workshop in Vienna, and is now off to do a PhD at Oxford!</p>\n<blockquote>\n<p>Species distribution models are crucial tools that predict species locations by interpolating observed field data with environmental information. We develop an improved, scalable method for species distribution modelling by proposing a dataset pipeline that incorporates global remote sensing imagery, land use classification data, environmental variables, and observation data, and utilising this with CNN models to predict species presence at higher spatial and temporal resolutions than well-established species distribution modelling methods.</p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/2024-sdm-sa-1",
      "title": "Predicting species using machine learning at CCAI",
      "summary": "MPhil research on using CNNs with satellite data for species distribution modeling presented at ICLR CCAI workshop in Vienna.",
      "date_published": "2024-05-01T00:00:00.000000Z",
      "date_modified": "2024-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "machine-learning",
        "conservation",
        "climate",
        "cnn",
        "remote-sensing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-sdm-sa.pdf",
          "mime_type": "application/pdf",
          "title": "Towards Scalable Deep Species Distribution Modelling using Global Remote Sensing"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/swseng",
      "content_html": "<p>We're still reeling from the shocking and unexpected <a href=\"https://www.cst.cam.ac.uk/news/ross-anderson\">passing of Ross Anderson</a>\nlast week. You can read a lovely <a href=\"https://raintown.org/ross_anderson/\">tribute</a> to him by <a href=\"https://raintown.org\">Satnam Singh</a>, and I still getting my thoughts together on all the guidance, advice and prods in the right direction that Ross has given me over the years.</p>\n<p>In pragmatic news, I'll be emergency lecturing part of Ross' 1A <a href=\"https://www.cl.cam.ac.uk/teaching/2324/SWSecEng/\">Software and Security Engineering</a> course here at Cambridge, along with my colleagues Martin, Alastair, Mort and Rob who have all stepped up at short notice. I have absolutely no idea how we'll live up to Ross' standard, but we'll do our very best in his memory!</p>",
      "url": "https://anil.recoil.org/notes/swseng",
      "title": "Teaching 1A Security and Software Engineering",
      "summary": "Lecturers step in to teach Software and Security Engineering course after Ross Anderson's passing.",
      "date_published": "2024-04-04T00:00:00.000000Z",
      "date_modified": "2024-04-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "teaching",
        "cambridge",
        "computerlab",
        "security"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-cc-blockchain-1",
      "content_html": "<p>Paper on smart contracts for carbon credits at <a href=\"http://icbc2024.ieee-icbc.org\">ICBC 2024</a> in Dublin. This work proposes the PACT stablecoin, which addresses concerns about credibility, scalability, and liquidity in voluntary carbon markets. We combine remote sensing data, modern econometric techniques, and blockchain-based certification and trading to create digital carbon assets against which offsetting claims can be transparently verified. The key innovation is creating a reproducible computational pipeline that not only quantifies CO2 emissions but also allows credits to be pooled based on co-benefits like biodiversity and jurisdictional attributes, increasing liquidity through fungibility. We implemented it on the Tezos blockchain, which is designed for low-cost transactions with minimal environmental impact.</p><h1>References</h1><ul><li>Jaffer et al (2024). Global, robust and comparable digital carbon assets. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2403.14581\" target=\"_blank\"><i>10.48550/arXiv.2403.14581</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-cc-blockchain-1",
      "title": "Global, robust and comparable digital carbon assets",
      "summary": "Paper on PACT stablecoin implementation using blockchain smart contracts for transparent and scalable carbon credit transactions at ICBC 2024.",
      "date_published": "2024-04-04T00:00:00.000000Z",
      "date_modified": "2024-04-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "blockchain",
        "carbon-credits",
        "economics",
        "climate",
        "distributed"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-cc-blockchain.pdf",
          "mime_type": "application/pdf",
          "title": "Global, robust and comparable digital carbon assets"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2403.14581",
          "doi": "10.48550/arXiv.2403.14581",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/bushel-step1",
      "content_html": "<p>I've done a redesign of my site after about 20 years since the last one <a href=\"/notes/opening-anil-recoil-org\">back in 2003</a>.\nThe site design is based on my upcoming Bushel content manager, which I'll post about more once I get the data model in place and try it out properly using this site as a guinea pig.</p>\n<p><a href=\"https://nick.recoil.org\">Nick Ludlam</a> also refreshed <a href=\"https://nick.recoil.org\">his website</a> since we were chatting about how outdated our web presences were, and he also put up a main <a href=\"https://recoil.org\">recoil.org</a> page for the main server.</p>",
      "url": "https://anil.recoil.org/notes/bushel-step1",
      "external_url": "/",
      "title": "Rolling out a new site design",
      "summary": "New site design launched after 20 year redesign featuring Bushel content manager preview.",
      "date_published": "2024-04-01T00:00:00.000000Z",
      "date_modified": "2024-04-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting",
        "website",
        "ui",
        "recoil"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-nbs-risk-1",
      "content_html": "<p>A new preprint is available on our work on ex-ante pricing models for nature-based solutions. It is currently under review, so any feedback is most welcome! This paper tackles the challenge of variability in the performance of nature-based climate solutions by developing optimal release schedules that balance generating credits with higher permanence ratings against limiting the risk of negative additionality. We use Monte Carlo simulations on both theoretical and real-life projects to show how conservatively anticipating carbon release and issuing additional credits when reality is less pessimistic than projections can provide pragmatic insurance against the inherent uncertainty in these systems.</p><h1>References</h1><ul><li>Rau et al (2024). Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release. <a href=\"https://doi.org/10.1080/17583004.2024.2390854\" target=\"_blank\"><i>10.1080/17583004.2024.2390854</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-nbs-risk-1",
      "title": "Preprint available on insuring against variability of NbS",
      "summary": "Preprint on ex-ante pricing models for mitigating credit reversal risk in nature-based climate solutions through optimal carbon release anticipation.",
      "date_published": "2024-03-01T00:00:00.000000Z",
      "date_modified": "2024-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "economics",
        "nbs",
        "forests",
        "carboncredits"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-nbs-risk.pdf",
          "mime_type": "application/pdf",
          "title": "Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1080/17583004.2024.2390854",
          "doi": "10.1080/17583004.2024.2390854",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-green-urban-eq-1",
      "content_html": "<p>Abstract on urban biodiversity and human health at <a href=\"https://meetingorganizer.copernicus.org/EGU24/EGU24-20833.html\">EGU 24</a>. This work analyzes the 3-30-300 rule for urban green spaces across major UK cities - the rule states that every resident should be close to at least three trees, every neighborhood should have 30% canopy cover, and every citizen should have access to a public green space within 300m. We employed remote sensing imagery, census data, and machine learning to assess implementation of this rule and found significant disparities, particularly in impoverished areas. The findings emphasize the need for strategic urban planning that fosters both social equity and environmental sustainability in our cities.</p><h1>References</h1><ul><li>Zuñiga-Gonzalez et al (2024). Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications. <a href=\"https://doi.org/10.5194/egusphere-egu24-20833\" target=\"_blank\"><i>10.5194/egusphere-egu24-20833</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-green-urban-eq-1",
      "title": "Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications",
      "summary": "Abstract on analyzing urban green space access and equity across UK cities using remote sensing and machine learning at EGU 2024.",
      "date_published": "2024-03-01T00:00:00.000000Z",
      "date_modified": "2024-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "urban",
        "biodiversity",
        "spatial",
        "conservation",
        "health"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.5194/egusphere-egu24-20833",
          "doi": "10.5194/egusphere-egu24-20833",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-planetary-computing-2",
      "content_html": "<p>Revision of planetary computing preprint. We updated the paper with additional insights and refinements based on feedback from the research community. This revision strengthens the case for planetary-scale infrastructure that can handle global environmental data with proper traceability and reproducibility guarantees. The computing challenges we face in environmental science are fundamentally different from traditional big data problems - we need to handle continuously evolving datasets with complex provenance requirements and access controls. This revision better articulates these unique requirements and the opportunities for the systems research community to contribute to addressing the climate and biodiversity crises.</p><h1>References</h1><ul><li>Ferris et al (2024). Planetary computing for data-driven environmental policy-making. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2303.04501\" target=\"_blank\"><i>10.48550/arXiv.2303.04501</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-planetary-computing-2",
      "title": "A Case for Planetary Computing",
      "summary": "Revised preprint on planetary-scale infrastructure for ingesting and analyzing global environmental data with traceability and reproducibility.",
      "date_published": "2024-03-01T00:00:00.000000Z",
      "date_modified": "2024-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "systems",
        "distributed-systems",
        "sustainability"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2303.04501",
          "doi": "10.48550/arXiv.2303.04501",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/aicn-in-aicam",
      "content_html": "<p>We won the <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch\">AI@CAM challenge</a> that was sent out\nUniversity wide to find research projects that use AI to tackle society's biggest challenges.\nOur project on using <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch#section-9RKgEyI2LZ\">AI for climate and nature</a>\nis one of the five selected.</p>\n<blockquote>\n<p>The twin climate and biodiversity crises are two of the world’s most complex challenges to tackle. This project aims to develop AI approaches for bringing together a wide range of datasets and accelerating the collation of information.</p>\n<p>This work will provide up to date, relevant and robust information for researchers and decision-makers working on climate and biodiversity conservation – opening up the possibility for more targeted and effective solutions to some of our world’s most pressing climate and biodiversity challenges.</p>\n<p><a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a>, AI-deas challenge co-lead, said: 'Mitigating the impacts of climate change while maintaining and restoring biodiversity demands urgent, evidence-based action. 
We're excited to bring together an interdisciplinary team across computer science, ecology, climate and conservation to use AI to empower decision-makers to equitably tackle the biggest challenge of our generation.'</p>\n<p><cite>-- <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch#section-9RKgEyI2LZ\">AI@CAM</a></cite></p>\n</blockquote>\n<p>This project is a collaboration between lots of friendly people at Cambridge Zero, the Cambridge Conservation Initiative, Conservation Evidence, the Institute for Computing for Climate Science, Conservation Research Institute, Centre for Landscape Regeneration, <a href=\"/projects/4c\">Cambridge Centre for Carbon Credits</a> and Cambridge Centre for Earth Observation.</p>\n<p><img src=\"/images/aicn-team-feb24.webp\" alt=\"%c\" title=\"Team AICN in the CCI building, Feb 2024\" ></p>",
      "url": "https://anil.recoil.org/notes/aicn-in-aicam",
      "external_url": "https://www.cam.ac.uk/stories/AI-deas-launch",
      "title": "Selected in the AI@CAM challenge for conservation research",
      "summary": "Won the AI@CAM challenge towards climate and nature conservation research using remote sensing and LLMs",
      "date_published": "2024-02-05T00:00:00.000000Z",
      "date_modified": "2024-02-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "conservation",
        "funding",
        "cambridge",
        "computerlab"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-uncertainty-cs-1",
      "content_html": "<p>Paper on uncertainty in climate science in <a href=\"https://undonecs.sciencesconf.org\">Undone CS</a>. This workshop paper examines how computer science approaches to uncertainty can both help and hinder climate research practices. Patrick led this critical reflection on our own work, exploring the tensions between the way computer scientists typically handle uncertainty - often through abstraction and simplification - and the complex, multifaceted nature of uncertainty in climate and environmental science. It's part of the &quot;Undone Computer Science&quot; workshop series that encourages reflection on the limitations and impacts of computing approaches, which is particularly important when working at the intersection of CS and climate science.</p>",
      "url": "https://anil.recoil.org/notes/2024-uncertainty-cs-1",
      "title": "Uncertainty at scale: how CS hinders climate research",
      "summary": "Paper examining how computer science approaches to uncertainty impact climate research practices at Undone CS workshop.",
      "date_published": "2024-02-01T00:00:00.000000Z",
      "date_modified": "2024-02-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "climate",
        "uncertainty",
        "methodology",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-uncertainty-cs.pdf",
          "mime_type": "application/pdf",
          "title": "Uncertainty at scale: how CS hinders climate research"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/royal-society-newton",
      "content_html": "<p>I joined the <a href=\"https://royalsociety.org\">Royal Society</a> <a href=\"https://royalsociety.org/grants/newton-international/\">Newton International Fellowships</a>\n<a href=\"https://royalsociety.org/people/anil-madhavapeddy-36582/\">committee</a> to help with selecting bright new scientists from abroad who wish to conduct research in the UK.</p>\n<blockquote>\n<p>The Newton International Fellowship (NIF) programme provides support for outstanding early career researchers to make a first step towards developing an independent research career through gaining experience across international borders. The fellowships enable researchers to access expertise, gain new perspectives and build long-lasting collaborative relationships.\n<cite> -- <a href=\"https://royalsociety.org/grants/newton-international/\">The Royal Society</a></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/royal-society-newton",
      "external_url": "https://royalsociety.org/people/anil-madhavapeddy-36582/",
      "title": "Joined the Royal Society fellowships committee",
      "summary": "I joined the Royal Society's Newton International Fellowships committee to help select international researchers for UK-based projects.",
      "image": "https://anil.recoil.org/images/rs-car-1.768.webp",
      "date_published": "2023-12-02T00:00:00.000000Z",
      "date_modified": "2023-12-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "service",
        "royalsociety",
        "funding"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2023-pact-tmf-2",
      "content_html": "<p>We have just released the Tropical Moist Forest v2.0 specification, to update the <a href=\"/notes/2023-pact-tmf-1\">v1.1</a> released earlier in the year. There are significant updates to the methodology to better match the scheme described in <a href=\"/papers/2023-ncc-permanence\">Realizing the social value of impermanent carbon credits</a>. This revision integrates the permanence valuation framework from our Nature Climate Change paper into the operational methodology. It represents an important evolution as we refined our approach based on both peer review feedback and early implementation experiences. The updates ensure that the practical specification aligns with the theoretical foundation we published, creating a more coherent and scientifically robust system for evaluating forest carbon projects.</p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li>\n<li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-pact-tmf-2",
      "title": "PACT Tropical Moist Forest Accreditation Methodology",
      "summary": "Release of Tropical Moist Forest v2.0 specification with significant methodology updates.",
      "date_published": "2023-12-01T00:00:00.000000Z",
      "date_modified": "2023-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forests",
        "carboncredits",
        "pact",
        "satellite",
        ":4c"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-pact-tmf.pdf",
          "mime_type": "application/pdf",
          "title": "PACT Tropical Moist Forest Accreditation Methodology v2.1"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/48a7ab10-3f49-4978-a00f-c26b64c2cae7-1",
      "content_html": "<p>On the BBC briefly about the Dawn supercomputer. This was an unexpected media appearance when the new Cambridge supercomputer was announced at the 2023 AI Summit. Dawn represents a significant investment in computational infrastructure for AI research in the UK. While I didn't expect to end up on the BBC discussing supercomputing, it was a good opportunity to highlight the importance of computational resources for advancing both AI research and applications like the environmental and conservation work we're doing. These kinds of facilities are crucial for training large models and processing the massive datasets we work with in planetary computing.</p>",
      "url": "https://anil.recoil.org/notes/48a7ab10-3f49-4978-a00f-c26b64c2cae7-1",
      "title": "BBC report on the new Cambridge supercomputer (\"Dawn\") announced at the 2023 AI Summit",
      "summary": "BBC interview about the Dawn supercomputer announced at 2023 AI Summit.",
      "date_published": "2023-11-02T00:00:00.000000Z",
      "date_modified": "2023-11-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ai",
        "supercomputing",
        "interview",
        "hpc"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2023-hotnets-sns-1",
      "content_html": "<p>Paper on spatial networks on DNS at <a href=\"https://dl.acm.org/doi/10.1145/3626111.3628210\">HotNets 2023</a>. Ryan Gibb led this work proposing the Spatial Name System (SNS) that extends DNS to support location-based names and resolution mechanisms. The key insight is that the existing Internet architecture lacks proper support for naming physical locations and resolving them to the diverse addressing mechanisms we use beyond IP addresses. By building on DNS infrastructure, SNS enables integration of spatial names into existing applications and opens up exciting possibilities for sensor networks and augmented reality applications. It's a neat piece of work bridging the gap between cyberspace and physical space.</p><h1>References</h1><ul><li>Gibb et al (2023). Where on Earth is the Spatial Name System?. ACM. <a href=\"https://doi.org/10.1145/3626111.3628210\" target=\"_blank\"><i>10.1145/3626111.3628210</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-hotnets-sns-1",
      "title": "Where on Earth is the Spatial Name System?",
      "summary": "Paper on spatial name system architecture extending DNS with geographic routing capabilities at HotNets 2023.",
      "date_published": "2023-11-01T00:00:00.000000Z",
      "date_modified": "2023-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "dns",
        "networking",
        "spatial",
        "routing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-hotnets-sns.pdf",
          "mime_type": "application/pdf",
          "title": "Where on Earth is the Spatial Name System?"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3626111.3628210",
          "doi": "10.1145/3626111.3628210",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2023-ncc-permanence-2",
      "content_html": "<p>Our paper on valuing impermanent carbon credits has been published at <a href=\"https://www.nature.com/articles/s41558-023-01815-0\">Nature Climate Change</a>. It has received a bunch of press coverage, including <a href=\"https://phys.org/news/2023-10-offset-approach-tropical-forests-faith.html\">phys.org</a>, <a href=\"https://www.cam.ac.uk/research/news/offset-markets-new-approach-could-help-save-tropical-forests-by-restoring-faith-in-carbon-credits\">cam.ac.uk</a>, and <a href=\"https://www.miragenews.com/new-method-may-boost-trust-in-carbon-credits-1113599/\">Mirage</a>. The publication follows our July preprint and represents a significant milestone in bringing scientific rigor to carbon credit markets. The press coverage has been really encouraging, with science journalists picking up on the key insight that our framework allows for like-for-like comparisons of diverse carbon projects while generating incentives for safeguarding already-credited carbon. The methodology we developed integrates three substantial advances that together provide a path forward for credible nature-based climate solutions.</p><h1>References</h1><ul><li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-ncc-permanence-2",
      "title": "Nature Climate Change paper on impermanent carbon credits",
      "summary": "Publication of research on valuing impermanent carbon credits in Nature Climate Change with extensive press coverage.",
      "date_published": "2023-11-01T00:00:00.000000Z",
      "date_modified": "2023-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "4c",
        "economics",
        "forests",
        "carbon",
        "sensing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-ncc-permanence.pdf",
          "mime_type": "application/pdf",
          "title": "Realizing the social value of impermanent carbon credits"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2024-life-1",
      "content_html": "<p>The first preprint on our new <a href=\"/projects/life\">LIFE</a> metric for global biodiversity is now available. It is under review, so feedback would be very welcome. LIFE stands for Land-cover change Impacts on Future Extinctions and represents a significant computational achievement - we generated global maps at 1 arc-minute resolution covering 29,153 terrestrial vertebrate species. The metric quantifies the marginal changes in expected extinctions from either converting natural vegetation to agriculture or restoring farmland to natural habitat. By coupling the persistence score approach with high-performance computing, we've created a tool that integrates information on species richness, endemism, and past habitat loss, offering unprecedented opportunities to estimate extinction impacts from scales ranging from individual dietary choices to global protected area development.</p><h1>References</h1><ul><li>Eyres et al (2025). LIFE: A metric for mapping the impact of land-cover change on global extinctions. <a href=\"https://doi.org/10.1098/rstb.2023.0327\" target=\"_blank\"><i>10.1098/rstb.2023.0327</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-life-1",
      "title": "First preprint of LIFE biodiversity metric available",
      "summary": "First preprint on new LIFE metric for global biodiversity now available for review.",
      "date_published": "2023-11-01T00:00:00.000000Z",
      "date_modified": "2023-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "biodiversity",
        "spatial",
        "economics",
        "conservation",
        "sdms",
        "aoh"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2024-life.pdf",
          "mime_type": "application/pdf",
          "title": "LIFE: A metric for mapping the impact of land-cover change on global extinctions"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1098/rstb.2023.0327",
          "doi": "10.1098/rstb.2023.0327",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/mission-possible",
      "content_html": "<p>I was on stage in New York for <a href=\"https://www.cam.ac.uk/news/cambridge-zero-highlights-university-efforts-at-climate-week-nyc\">Mission Possible</a>\nduring <a href=\"https://www.climateweeknyc.org\">NYC Climate Week</a>.  I was there with <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> and we met with a lot of Cambridge alumni who\nare all engaged with climate change related activities -- either directly in their careers, or through a side interest.</p>\n<p>The major highlights on the discussions with alumni centred around agency: a lot of them were wondering how to combine the evidence coming\nout Cambridge research and combine it with real policy action. A number of the alumni are obviously highly successful in their individual\ncareers, and so the University helping to glue this together would potentially result in valuable actions that might not otherwise come together.</p>\n<p>This reminded me strongly of the discussions we had in Pembroke a little while back when <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> chaired my talk about &quot;Who's in Charge?&quot; for the <a href=\"https://www.pem.cam.ac.uk/college/corporate-partnership/corporate-partnership-events/william-pitt-seminars/17th-william-pitt\">William Pitt Seminar</a> where we had very similar discussions at dinner afterwards.</p>\n<p><div class=\"video-center\"><iframe title=\"17th William Pitt Seminar - Who's in Charge?\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/a26475b5-c169-478e-b88e-be5cd1f2aff8\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p><em>(See also <a href=\"https://www.zero.cam.ac.uk/who-we-are/blog/news/cambridge-zero-takes-centre-stage-climate-week-nyc\">Cambridge Zero</a> notes on the event, and thanks to <a 
href=\"https://www.cisl.cam.ac.uk/\">CISL</a>.)</em></p>",
      "url": "https://anil.recoil.org/notes/mission-possible",
      "external_url": "https://www.cam.ac.uk/news/cambridge-zero-highlights-university-efforts-at-climate-week-nyc",
      "title": "Cambridge Zero highlights University efforts at Climate Week NYC",
      "summary": "Cambridge University highlights climate efforts at NYC Climate Week.",
      "date_published": "2023-10-18T00:00:00.000000Z",
      "date_modified": "2023-10-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "computerlab",
        "cambridge",
        "climate",
        "outreach"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2023-raid-deluminator-1",
      "content_html": "<p>Paper on DIFC Deluminator interface at <a href=\"https://dl.acm.org/doi/10.1145/3607199.3607235\">RAID 2023</a>. Zahra led this work on information flow tracking for heterogeneous compartmentalized software environments. The key contribution is recognizing that modern systems increasingly use diverse compartmentalization mechanisms - processes, SGX enclaves, TrustZone Trusted Apps, and intra-address space compartments - but existing abstractions assume single-compartment models. Deluminator provides OS abstractions and a userspace framework to enable extensible, fine-grained information flow tracking across these heterogeneous compartments. We implemented it on Linux for both ARM and x86-64, with evaluation showing reasonable overhead (7-29% on average) that makes it practical for real-world use.</p><h1>References</h1><ul><li>Tarkhani et al (2023). Information Flow Tracking for Heterogeneous Compartmentalized Software. ACM. <a href=\"https://doi.org/10.1145/3607199.3607235\" target=\"_blank\"><i>10.1145/3607199.3607235</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-raid-deluminator-1",
      "title": "Information Flow Tracking for Heterogeneous Compartmentalized Software",
      "summary": "RAID 2023 paper on Deluminator interface for decentralized information flow control in compartmentalized software.",
      "date_published": "2023-10-01T00:00:00.000000Z",
      "date_modified": "2023-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "information-flow",
        "compartmentalization",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-raid-deluminator.pdf",
          "mime_type": "application/pdf",
          "title": "Information Flow Tracking for Heterogeneous Compartmentalized Software"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3607199.3607235",
          "doi": "10.1145/3607199.3607235",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2023-acns-microguards-1",
      "content_html": "<p>Paper on MicroGuards memory API at ACNSW with Zahra Tarkhani. MicroGuards provides lightweight kernel modifications and APIs for fine-grained in-process memory protection and privilege separation in multithreaded applications. Taking advantage of tagged memory support in modern CPUs, MicroGuards enables compartmentalization even on resource-constrained mobile devices with minimal overhead (less than 3.5%) - addressing the challenge of securing applications without requiring heavyweight isolation mechanisms.</p><h1>References</h1><ul><li>Tarkhani et al (2023). Enabling Lightweight Privilege Separation in Applications with MicroGuards. Springer Nature Switzerland. <a href=\"https://doi.org/10.1007/978-3-031-41181-6_31\" target=\"_blank\"><i>10.1007/978-3-031-41181-6_31</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-acns-microguards-1",
      "title": "Enabling Lightweight Privilege Separation in Applications with MicroGuards",
      "summary": "Paper on MicroGuards memory API for lightweight privilege separation presented at ACNSW.",
      "date_published": "2023-10-01T00:00:00.000000Z",
      "date_modified": "2023-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "privilege-separation",
        "memory-safety",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-acns-microguards.pdf",
          "mime_type": "application/pdf",
          "title": "Enabling Lightweight Privilege Separation in Applications with MicroGuards"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/978-3-031-41181-6_31",
          "doi": "10.1007/978-3-031-41181-6_31",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/981c00b5-32c0-4cac-a387-6c945dfa9934-1",
      "content_html": "<p>Keynoted at ICFP 2023 on Functional Programming for the Planet, giving the opening keynote in Seattle. I discussed how functional programmers could contribute to addressing the climate and biodiversity crises through planetary computing. The talk covered our work on satellite sensing, forest monitoring, and how the principles of functional programming - immutability, composability, type safety - could help build reliable systems for environmental monitoring and conservation at scale.</p>",
      "url": "https://anil.recoil.org/notes/981c00b5-32c0-4cac-a387-6c945dfa9934-1",
      "title": "Functional Programming for the Planet",
      "summary": "Keynote presentation at ICFP 2023 on functional programming for the planet.",
      "date_published": "2023-09-05T00:00:00.000000Z",
      "date_modified": "2023-09-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "fp",
        "conservation",
        "biodiversity",
        "icfp",
        "climate"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2023-ocaml-platform-1",
      "content_html": "<p>We deliver the annual presentation about the OCaml Platform in the OCaml Workshop at ICFP 2023.</p>\n<blockquote>\n<p>This paper reflects on a decade of progress and developments within the OCaml Platform, from its inception in 2013 with the release of opam 1.0, to today where it stands as a robust toolchain for OCaml developers. We review the last three years in detail, emphasizing the advancements and innovations that have shaped the OCaml development landscape and highlighting key milestones such as the migration to Dune as the primary build system, and the development of a Language Server Protocol (LSP) server for OCaml.</p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/2023-ocaml-platform-1",
      "title": "State of the OCaml Platform 2023",
      "summary": "Annual OCaml Workshop presentation reviewing a decade of progress from opam 1.0 to modern toolchain including Dune and LSP server.",
      "date_published": "2023-09-01T00:00:00.000000Z",
      "date_modified": "2023-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "devtools",
        "testing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-ocaml-platform.pdf",
          "mime_type": "application/pdf",
          "title": "State of the OCaml Platform 2023"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2023-ocaml-eio-1",
      "content_html": "<p>An update on the OCaml EIO library at the OCaml Workshop 2023. Tom Leonard led this presentation on the release of Eio 1.0, which brings effects-based IO to OCaml 5. This is a big deal for the OCaml ecosystem as it provides a modern approach to concurrent programming using algebraic effects rather than monads or callbacks. The library has been under development for a while and reaching 1.0 is a significant milestone. It's particularly exciting because it takes advantage of OCaml 5's new effect handlers to provide a clean, composable interface for IO operations that feels natural in the language.</p>",
      "url": "https://anil.recoil.org/notes/2023-ocaml-eio-1",
      "title": "Eio 1.0 – Effects-based IO for OCaml 5",
      "summary": "OCaml Workshop 2023 update on the Eio effects-based IO library for OCaml 5.",
      "date_published": "2023-09-01T00:00:00.000000Z",
      "date_modified": "2023-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "eio",
        "effects",
        "fp",
        "concurrency"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-ocaml-eio.pdf",
          "mime_type": "application/pdf",
          "title": "Eio 1.0 – Effects-based IO for OCaml 5"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2023-ncc-permanence-1",
      "content_html": "<p>We have uploaded a preprint of our <a href=\"/projects/4c\">4C</a> paper on valuing impermanent carbon credits, by using the <a href=\"https://en.wikipedia.org/wiki/Social_cost_of_carbon\">Social Cost of Carbon</a> as a basis for a discount function into the future. Comments and feedback are most welcome. This work tackles one of the biggest challenges in nature-based climate solutions: how to value carbon sequestration that might not be permanent. We developed a novel framework that conceptualizes permanence as additionality over time relative to a counterfactual baseline, uses risk-averse estimation of future carbon release, and deploys post-credit monitoring to correct for overly pessimistic forecasts. Our preliminary comparisons suggest that even after fully adjusting for impermanence, nature-based interventions may offer less costly ways of reducing climate damages than more technological solutions.</p><h1>References</h1><ul><li>Balmford et al (2023). Realizing the social value of impermanent carbon credits. <a href=\"https://doi.org/10.1038/s41558-023-01815-0\" target=\"_blank\"><i>10.1038/s41558-023-01815-0</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-ncc-permanence-1",
      "title": "Preprint on the social value of impermanent carbon credits",
      "summary": "Preprint using Social Cost of Carbon to create a discount function for valuing impermanent carbon credits.",
      "date_published": "2023-07-01T00:00:00.000000Z",
      "date_modified": "2023-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carboncredits",
        "economics",
        "forests",
        "carbon",
        "sensing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-ncc-permanence.pdf",
          "mime_type": "application/pdf",
          "title": "Realizing the social value of impermanent carbon credits"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s41558-023-01815-0",
          "doi": "10.1038/s41558-023-01815-0",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/acm-sigplan-award",
      "content_html": "<p>I was honoured to be included in the OCaml team that won the <a href=\"https://www.cst.cam.ac.uk/news/acm-programming-languages-software-award-goes-ocaml-researchers\">ACM Programming Languages Software Award for 2023</a>.</p>\n<blockquote>\n<p>The Association for Computing Machinery (ACM), the world's largest association of computing professionals, today gave the 2023 SIGPLAN Award to a group of developers for their work on the functional programming language OCaml.</p>\n<p>The award was presented at the annual SIGPLAN Programming Language Design and Implementation Conference to a group of researchers and developers including our colleague Anil Madhavapeddy, Professor of Planetary Computing here.</p>\n<p>The prestigious Programming Languages Software Award is given annually &quot;to an institution or individual(s) to recognise the development of a software system that has had a significant impact on programming language research, implementations, and tools,&quot; ACM says.</p>\n<p><cite>-- <a href=\"https://www.cst.cam.ac.uk/news/acm-programming-languages-software-award-goes-ocaml-researchers\">Computer Laboratory</a></cite></p>\n</blockquote>\n<p>See also the main <a href=\"https://www.sigplan.org/Awards/Software/\">ACM Award Page</a> citation:</p>\n<blockquote>\n<p>The OCaml Compiler Distribution is the reference implementation of the OCaml language, a dialect of ML that aims to be pragmatic, both in language features and implementation, encouraging a simple programming style that yields good performance and usability. 
It has a large user base in industry, research, and education throughout the world, and was used to implement a number of other impactful systems, notably in verification: Coq proof assistant, CompCert verified compiler, Why3 verified programming environment, Frama-C, Astrée and Gillian static analyzers, Infer, Hack and Flow projects at Meta, SLAM/SDV and F* at Microsoft, etc.\n<cite>-- <a href=\"https://www.sigplan.org/Awards/Software/\">ACM SIGPLAN</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/acm-sigplan-award",
      "external_url": "https://www.cst.cam.ac.uk/news/acm-programming-languages-software-award-goes-ocaml-researchers",
      "title": "OCaml wins the ACM Programming Language Software award",
      "summary": "OCaml wins 2023 ACM Programming Languages Software Award for its impact on research and tools.",
      "date_published": "2023-06-19T00:00:00.000000Z",
      "date_modified": "2023-06-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "award",
        "acm",
        "icfp"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2023-pact-tmf-1",
      "content_html": "<p>We have just published the Tropical Moist Forest v1.0 specification, which is a detailed description of the methodology we are using for counterfactual dynamic baselines to calculate the additionality, leakage and permanence behind REDD+ projects. I explained some of the background behind this in a seminar last year. This specification operationalizes the theoretical framework we developed for the Cambridge Center for Carbon Credits (4C), translating research into a practical methodology that can be applied to real-world forest conservation projects. It's a comprehensive document that lays out how to use satellite data and econometric techniques to establish what would have happened to a forest in the absence of a conservation project, which is crucial for determining whether carbon credits genuinely represent additional climate benefits.</p>\n<p><div class=\"video-center\"><iframe title=\"A Credible Approach towards Halting Tropical Deforestation\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/bc9da6fc-9419-4f18-9db9-c13b1a4a859f\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p><h1>References</h1><ul><li>Balmford et al (2024). PACT Tropical Moist Forest Accreditation Methodology v2.1. Cambridge Open Engage. <a href=\"https://doi.org/10.33774/coe-2024-gvslq\" target=\"_blank\"><i>10.33774/coe-2024-gvslq</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-pact-tmf-1",
      "title": "PACT Tropical Moist Forest Accreditation Methodology",
      "summary": "Publication of Tropical Moist Forest v1.0 specification using counterfactual dynamic baselines for calculating REDD+ project additionality, leakage and permanence.",
      "date_published": "2023-06-01T00:00:00.000000Z",
      "date_modified": "2023-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forests",
        "carboncredits",
        "pact",
        "satellite",
        ":rsn"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-pact-tmf.pdf",
          "mime_type": "application/pdf",
          "title": "PACT Tropical Moist Forest Accreditation Methodology v2.1"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.33774/coe-2024-gvslq",
          "doi": "10.33774/coe-2024-gvslq",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2023-carbon-credibility-1",
      "content_html": "<p>Our perspective in <a href=\"https://science.org\">Science</a> magazine appeared this week on the credibility of carbon credits and their importance for tropical forest protection. This was a collaborative effort with colleagues from across conservation science, economics, and computer science to address one of the most pressing issues in climate finance. We argue that improving the quantification methods behind carbon credits is not just a technical issue but an existential one for forest protection. If credits don't accurately represent their environmental benefits, the entire voluntary carbon market risks collapse, which would be catastrophic for tropical forests that depend on this financing. The piece calls for robust, scientifically-grounded methodologies to ensure that carbon credits can genuinely contribute to climate mitigation rather than providing false assurance.</p>\n<blockquote>\n<p>Addressing global warming requires increased investment in conserving and restoring carbon-dense natural habitats.  Some companies that emit carbon have turned to certified carbon credits to offset their environmental impact. However, the effectiveness of carbon credits depends on the methods used to quantify them. If carbon credits do not accurately represent their environmental benefits, relying on them could exacerbate climate change.  To ensure that carbon credits are robust, the methods used to calculate them must be improved.</p>\n</blockquote><h1>References</h1><ul><li>Balmford et al (2023). Credit credibility threatens forests. <a href=\"https://doi.org/10.1126/science.adh3426\" target=\"_blank\"><i>10.1126/science.adh3426</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2023-carbon-credibility-1",
      "title": "Credit credibility threatens forests",
      "summary": "Perspective in Science magazine on the credibility of carbon credits and their importance for tropical forest protection.",
      "date_published": "2023-05-01T00:00:00.000000Z",
      "date_modified": "2023-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carboncredits",
        "economics",
        "forests",
        "carbon",
        "sensing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2023-carbon-credibility.pdf",
          "mime_type": "application/pdf",
          "title": "Credit credibility threatens forests"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1126/science.adh3426",
          "doi": "10.1126/science.adh3426",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/ce64a918-ff52-4116-b1ee-256f08e6e7f1-1",
      "content_html": "<p>Discussion with Mantle Labs about carbon credits at the Forestry and Agriculture Summit in London. On stage with Jon Pierre, we discussed how scientific innovation and AI could help scale carbon markets. The conversation explored satellite-driven approaches to calculating additionality and permanence in forest carbon projects, addressing the credibility challenges facing the voluntary carbon market. This work aims to use technology to create more transparent and verifiable carbon credits.</p>",
      "url": "https://anil.recoil.org/notes/ce64a918-ff52-4116-b1ee-256f08e6e7f1-1",
      "title": "Leveraging Scientific Innovation and AI to Scale Carbon Markets",
      "summary": "Discussion with Mantle Labs exploring applications of AI to carbon credits markets.",
      "date_published": "2023-03-07T00:00:00.000000Z",
      "date_modified": "2023-03-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carbon-credits",
        "ai",
        "conservation",
        "climate"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2024-planetary-computing-1",
      "content_html": "<p>Preprint of planetary computing paper. Patrick led this work making the case for infrastructure to handle the ingestion, transformation, analysis, and publication of global data products for environmental science and policy-making. Drawing on our experiences working with environmental scientists on forest carbon and biodiversity preservation, we classify existing solutions by their flexibility in processing geospatial data and their support for building trust through traceability and reproducibility. The paper identifies research gaps around handling continuously changing datasets collected across decades that require careful access control. It's a call to action for the computing community to build better infrastructure for planetary-scale environmental data.</p><h1>References</h1><ul><li>Ferris et al (2024). Planetary computing for data-driven environmental policy-making. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2303.04501\" target=\"_blank\"><i>10.48550/arXiv.2303.04501</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2024-planetary-computing-1",
      "title": "A Case for Planetary Computing",
      "summary": "Preprint presenting the case for planetary-scale computing infrastructure and applications.",
      "date_published": "2023-03-01T00:00:00.000000Z",
      "date_modified": "2023-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "planetary",
        "conservation",
        "biodiversity",
        "systems",
        "climate"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2303.04501",
          "doi": "10.48550/arXiv.2303.04501",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/recapping-ocaml-22",
      "content_html": "<p>I recap the OCaml community progress in 2022, which covers a number of bases ranging from\nthe release of OCaml 5.0, the launch of a new website with integrated documentation for 20000+ packages, prototyping new developer workflows that are better integrated into editors, and the launch of ActivityPub based services such as <a href=\"https://watch.ocaml.org\">https://watch.ocaml.org</a>.</p>",
      "url": "https://anil.recoil.org/notes/recapping-ocaml-22",
      "external_url": "https://discuss.ocaml.org/t/ocaml-org-recapping-2022-and-queries-on-the-fediverse/11099/1",
      "title": "OCaml.org: recapping 2022 and queries on the Fediverse",
      "summary": "Recapping OCaml's 2022 progress and exploring Fediverse integration.",
      "date_published": "2023-01-02T00:00:00.000000Z",
      "date_modified": "2023-01-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "opensource",
        "selfhosting",
        "fediverse"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/bc9da6fc-9419-4f18-9db9-c13b1a4a859f-1",
      "content_html": "<p>Wednesday seminar on financing forests using carbon credits at the Cambridge Computer Lab. I presented our work at 4C on applying computer science and satellite sensing to tropical deforestation. The talk explored how technology could help create credible carbon credit markets by using satellite data to measure additionality and permanence, potentially providing financial mechanisms to help avert both the biodiversity and carbon crises.</p>",
      "url": "https://anil.recoil.org/notes/bc9da6fc-9419-4f18-9db9-c13b1a4a859f-1",
      "title": "Financing Forests: A Credible Approach towards Halting Tropical Deforestation",
      "summary": "Wednesday seminar presentation on using carbon credits to finance tropical forest conservation.",
      "date_published": "2022-11-16T00:00:00.000000Z",
      "date_modified": "2022-11-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carbon-credits",
        "conservation",
        "forests",
        "climate"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/a26475b5-c169-478e-b88e-be5cd1f2aff8-1",
      "content_html": "<p>I opened the 17th William Pitt Seminar at Pembroke College on climate change with a brief talk about the status of the world's biodiversity, and how we have more agency than ever before to take matters into our own hands.</p>",
      "url": "https://anil.recoil.org/notes/a26475b5-c169-478e-b88e-be5cd1f2aff8-1",
      "title": "17th William Pitt Seminar - Who's in Charge?",
      "summary": "Opening talk at Pembroke College's climate change seminar on global biodiversity status and individual agency for conservation action.",
      "date_published": "2022-11-01T00:00:00.000000Z",
      "date_modified": "2022-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "conservation",
        "biodiversity",
        "climate",
        "seminar",
        "agency"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/rwo-2",
      "content_html": "<p>I'm delighted to report that the second edition of <a href=\"https://realworldocaml.org\">Real World OCaml</a> is now available from Cambridge University Press! It's also freely available <a href=\"https://realworldocaml.org\">online</a>, and CUP kindly agreed that the PDF version could be freely distributed too, thanks to sponsorship from <a href=\"https://tarides.com\">Tarides</a>.</p>\n<p><img src=\"/images/rwo-window-1.webp\" alt=\"%c\" title=\"The book is on display in the main CUP shop in the middle of Cambridge!\" ></p>\n<p><img src=\"/images/rwo-fans-1.webp\" alt=\"%c\" title=\"I spot some random fans named Dave and Eleanor who happened to be buying the book\" ></p>\n<p><img src=\"/images/rwo-fans-2.webp\" alt=\"%c\" title=\"I follow them just to make sure they do actually buy the book\" ></p>\n<p>As always, if you have any feedback about the book, please post it on <a href=\"https://github.com/realworldocaml/book\">https://github.com/realworldocaml/book</a>.</p><h1>References</h1><ul><li>Madhavapeddy et al (2022). Real World OCaml: Functional Programming for the Masses. Cambridge University Press. <a href=\"https://doi.org/10.1017/9781009129220\" target=\"_blank\"><i>10.1017/9781009129220</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/rwo-2",
      "title": "The 2nd ed of Real World OCaml is available in shops",
      "summary": "Second edition of Real World OCaml now available from Cambridge University Press, with free online and PDF versions.",
      "date_published": "2022-10-01T00:00:00.000000Z",
      "date_modified": "2022-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "book",
        "cambridge"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1017/9781009129220",
          "doi": "10.1017/9781009129220",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2022-oud-ocurrent-1",
      "content_html": "<p>Paper on our incremental computation DSL OCurrent presented at OCaml Workshop 2022 with Tim McGilchrist, David Allsopp, Patrick Ferris, and others. OCurrent provides a declarative way to express CI/CD pipelines as incremental computations that automatically track dependencies and cache results. Combined with OBuilder for reproducible builds, it enables homogeneous build infrastructure across different platforms - the foundation powering OCaml-CI and other OCaml ecosystem infrastructure.</p>",
      "url": "https://anil.recoil.org/notes/2022-oud-ocurrent-1",
      "title": "Homogeneous Builds with OBuilder and OCaml",
      "summary": "Paper on OCurrent incremental computation DSL presented at OCaml Workshop 2022.",
      "date_published": "2022-09-01T00:00:00.000000Z",
      "date_modified": "2022-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "build-systems",
        "ci-cd",
        "tooling"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/4c-1",
      "content_html": "<p>With the recent controversies over low-integrity carbon credits, I spoke to Vox magazine\nabout my skepticism about Adam Neumann's new startup.</p>\n<blockquote>\n<p>&quot;The problem with the current markets is nothing to do with how we can trade these more effectively,&quot; said Anil Madhavapeddy, who is an associate professor of computer science and technology at Cambridge University and the director of the Cambridge Center for Carbon Credits. &quot;We just do not have enough supply.&quot;\n<cite>-- <a href=\"https://www.vox.com/recode/23142106/adam-neumann-crypto-carbon-credit-offset-flowcarbon\">Vox</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/4c-1",
      "title": "Trusted Carbon Credits",
      "summary": "Interview in Vox magazine expressing skepticism about Adam Neumann's carbon credit startup, emphasizing supply shortage over trading efficiency.",
      "date_published": "2022-05-06T00:00:00.000000Z",
      "date_modified": "2022-05-06T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "carbon-credits",
        "conservation",
        "interview",
        "climate"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/multicore-monthly-mar22",
      "content_html": "<p>We're getting closer to a stable release of OCaml 5.0, including reenabling support for the BSDs and introducing ARM64 multicore support.</p>",
      "url": "https://anil.recoil.org/notes/multicore-monthly-mar22",
      "external_url": "https://discuss.ocaml.org/t/multicore-ocaml-march-2022/9692",
      "title": "OCaml Multicore Monthly: heading towards OCaml 5.0",
      "summary": "OCaml 5.0 approaches with BSD support and ARM64 multicore.",
      "date_published": "2022-04-19T00:00:00.000000Z",
      "date_modified": "2022-04-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "multicore"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/multicore-monthly-jan22",
      "content_html": "<p>After we got the massive OCaml 5.0 pull request merged, we've taken some time to consolidate the trunk branch of OCaml and start down the release path towards getting OCaml 5.0 out of the door.</p>",
      "url": "https://anil.recoil.org/notes/multicore-monthly-jan22",
      "external_url": "https://discuss.ocaml.org/t/multicore-ocaml-january-2022-and-post-merge-activity/9294",
      "title": "OCaml Multicore Monthly: post merge activities",
      "summary": "OCaml 5.0 merge complete, focusing on release preparation.",
      "date_published": "2022-02-09T00:00:00.000000Z",
      "date_modified": "2022-02-09T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "multicore",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2022-enhancing-brain-security-1",
      "content_html": "<p>Preprint on security vulnerabilities in brain-computer interfaces with Zahra Tarkhani, Lorena Qendro, and colleagues from Cambridge. This fascinating work analyzed security threats to wearable BCI devices from both an OS and adversarial ML perspective, discovering over 300 vulnerabilities across six attack vectors in real devices like Muse, NeuroSky, and OpenBCI. We introduced Argus, an information flow control system that mitigates these attacks with acceptable overhead - critical for protecting users' brainwave data and preventing remote attackers from compromising BCI-assisted devices.</p><h1>References</h1><ul><li>Tarkhani et al (2022). Enhancing the Security & Privacy of Wearable Brain-Computer Interfaces. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2201.07711\" target=\"_blank\"><i>10.48550/arXiv.2201.07711</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2022-enhancing-brain-security-1",
      "title": "Enhancing the Security & Privacy of Wearable Brain-Computer Interfaces",
      "summary": "Preprint introducing Argus information flow control system addressing security vulnerabilities in wearable BCI devices.",
      "date_published": "2022-01-01T00:00:00.000000Z",
      "date_modified": "2022-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "privacy",
        "bci",
        "wearables",
        "ifc"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2022-enhancing-brain-security.pdf",
          "mime_type": "application/pdf",
          "title": "Enhancing the Security & Privacy of Wearable Brain-Computer Interfaces"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2201.07711",
          "doi": "10.48550/arXiv.2201.07711",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/multicore-monthly-dec21",
      "content_html": "<p>We've been working hard on OCaml multicore support, and went over to Paris to sit down with some core developers from Inria and work through code review of our proposed patches.</p>",
      "url": "https://anil.recoil.org/notes/multicore-monthly-dec21",
      "external_url": "https://discuss.ocaml.org/t/multicore-ocaml-november-2021-with-results-of-code-review/8934",
      "title": "OCaml Multicore Monthly: code review complete with Inria",
      "summary": "OCaml multicore support code review completed with Inria developers.",
      "date_published": "2021-12-21T00:00:00.000000Z",
      "date_modified": "2021-12-21T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "multicore",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/4c-launch",
      "content_html": "<p>I launched <a href=\"/projects/4c\">4C</a> recently, and Pembroke College covered the launch in an interview with me.</p>\n<blockquote>\n<p>The world is facing a large-scale environmental crisis. Two parallel and related strands of this are, first, the crisis in biodiversity and the rapid extinction of many species, recently addressed at the COP15 UN Biodiversity Conference in October, and second, the threat of climate change, the topic of last month’s COP26 summit in Glasgow. Pressure is growing on governments to execute nature-based solutions which will offset some of the most damaging impacts of these crises. While COP26 built some momentum, there is still a long way to go to turn promises into lasting change. More engagement with the private sector is urgently needed.</p>\n<p>The solution to the crisis is two-pronged: we must engage in behaviour change to reduce unnecessary harmful emissions, and also invest in nature-based solutions at global scales to not only reduce, but ultimately reverse the effects of climate change and biodiversity loss.\n<cite>-- <a href=\"https://www.pem.cam.ac.uk/college/corporate-partnership/25th-anniversary-corporate-partnership-programme/25th-anniversary-11\">Pembroke College</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/4c-launch",
      "external_url": "https://www.pem.cam.ac.uk/college/corporate-partnership/25th-anniversary-corporate-partnership-programme/25th-anniversary-11",
      "title": "Launching the Cambridge Centre for Carbon Credits",
      "summary": "Introducing the Cambridge Centre for Carbon Credits to combat climate change and biodiversity loss.",
      "date_published": "2021-11-04T00:00:00.000000Z",
      "date_modified": "2021-11-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "pembroke",
        ":4c",
        "carboncredits",
        "interview"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/signals-and-threads",
      "content_html": "<p>I am the latest person to feature on the first season of the <a href=\"https://signalsandthreads.com/what-is-an-operating-system/\">Signals and\nThreads</a> podcast\nhosted by <a href=\"https://github.com/yminsky\">Yaron Minsky</a> (you may recognise him as my co-author on <a href=\"/papers/rwo\">Real World OCaml</a>).</p>\n<blockquote>\n<p>Anil Madhavapeddy is an academic, author, engineer, entrepreneur, and OCaml aficionado. In this episode, Anil and Ron consider the evolving role of operating systems, security on the internet, and the pending arrival (at last!) of OCaml 5.0. They also discuss using Raspberry Pis to fight climate change; the programming inspiration found in British pubs and on Moroccan beaches; and the time Anil went to a party, got drunk, and woke up with a job working on the Mars Polar Lander.\n<cite>-- <a href=\"https://signalsandthreads.com/what-is-an-operating-system/\">Signals and Threads</a></cite></p>\n</blockquote>\n<p>I think I might be the first non-Jane Street person to be on their podcast! Quite the honour.</p><h1>References</h1><ul><li>Madhavapeddy et al (2022). Real World OCaml: Functional Programming for the Masses. Cambridge University Press. <a href=\"https://doi.org/10.1017/9781009129220\" target=\"_blank\"><i>10.1017/9781009129220</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/signals-and-threads",
      "external_url": "https://signalsandthreads.com/what-is-an-operating-system/",
      "title": "What is an Operating System?",
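      "summary": "Appearance on the Signals and Threads podcast with Yaron Minsky, discussing the evolving role of operating systems, internet security and the arrival of OCaml 5.0.",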
      "date_published": "2021-11-03T00:00:00.000000Z",
      "date_modified": "2021-11-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "nasa",
        "space",
        "podcast",
        "multicore",
        "opensource",
        "janestreet"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1017/9781009129220",
          "doi": "10.1017/9781009129220",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/multicore-monthly-sep21",
      "content_html": "<p>We're making steady progress on getting multicore support merged into OCaml, including some great developer meetings where we achieved consensus with the core team to include support for effect handlers in the 5.0 release.</p>",
      "url": "https://anil.recoil.org/notes/multicore-monthly-sep21",
      "external_url": "https://discuss.ocaml.org/t/multicore-ocaml-september-2021-effect-handlers-will-be-in-ocaml-5-0/8554",
      "title": "OCaml Multicore Monthly: effect handling confirmed for 5.0",
      "summary": "OCaml 5.0 to include effect handling and multicore support.",
      "date_published": "2021-10-01T00:00:00.000000Z",
      "date_modified": "2021-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "opensource",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/forests",
      "content_html": "<p>I track external notes and media articles here on forest preservation and\nrestoration as part of my work on <a href=\"/projects/4c\">Trusted Carbon Credits</a>. Not complete, just a reading list.</p>\n<ul>\n<li><a href=\"https://www.youtube.com/watch?v=yiw6_JakZFc\">Can YOU Fix Climate Change?</a> (great short summary of the overall issues)</li>\n</ul>\n<h2 id=\"rewilding\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#rewilding\"></a>Rewilding</h2>\n<ul>\n<li><a href=\"https://www.theguardian.com/environment/2021/sep/24/vast-area-of-scottish-highlands-to-be-rewilded-in-ambitious-30-year-project-aoe\">Affric Highlands initiative</a> to rewild Scotland over 30 years</li>\n<li><a href=\"https://www.bloomberg.com/news/articles/2021-09-14/gabon-s-climate-law-brings-it-closer-to-carbon-trade-ambition\">Gabon's Climate Law</a></li>\n<li><a href=\"https://www.soilassociation.org/blogs/2021/august/3/pairing-agroforestry-with-livestock-the-major-benefits/\">Pairing agroforestry with livestock: the major benefits</a></li>\n<li><a href=\"https://www.nationalparks.uk/2021/10/06/press-release-major-global-companies-to-fund-vital-nature-restoration-projects-in-the-uks-national-parks-through-innovative-new-financing-facility/\">Major global companies to fund nature restoration projects in UK's national parks</a> (via <a href=\"https://www.thepalladiumgroup.com\">Palladium group</a>)</li>\n</ul>\n<h2 id=\"remote-sensing\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#remote-sensing\"></a>Remote sensing</h2>\n<ul>\n<li><a href=\"https://www.kiss.caltech.edu/papers/biodiversity/papers/2020_Book_RemoteSensingOfPlantBiodiversi.pdf\">Remote sensing of plant biodiversity</a>\n<ul>\n<li><a href=\"https://geobon.org\">Geobon</a> - global researcher network working on above.</li>\n</ul>\n</li>\n<li><a href=\"https://earthi.space/\">Earth-i</a> - sub-1m sensing satellite constellation</li>\n<li><a href=\"https://www.mantle-labs.com\">Mantle Labs</a> - earth 
observation + machine learning for farmers</li>\n<li><a href=\"https://www.cgi.com/uk/en-gb/news/climate/cgi-announces-strategic-partnership-project-seagrass-reduce-co2\">Seagrass from space</a></li>\n<li>Keshav's <a href=\"http://blizzard.cs.uwaterloo.ca/iss4e/wp-content/uploads/2017/10/Communication-technologies-for-energy-informatics.pdf\">comms technologies for energy informatics</a></li>\n</ul>\n<h2 id=\"carbon-credits\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#carbon-credits\"></a>Carbon Credits</h2>\n<ul>\n<li><a href=\"https://www.cis.upenn.edu/~bcpierce/papers/carbon-offsets.pdf\">Notes on Carbon offsets for scientific societies</a></li>\n<li><a href=\"https://vcmintegrity.org/\">Voluntary Carbon Markets integrity initiative</a></li>\n<li><a href=\"https://www.ecosystemmarketplace.com/articles/press-release-voluntary-carbon-markets-rocket-in-2021-on-track-to-break-1b-for-first-time/\">Voluntary Carbon Markets Rocket in 2021, On Track to Break $1B for First Time</a></li>\n</ul>\n<h2 id=\"biodiversity\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#biodiversity\"></a>Biodiversity</h2>\n<ul>\n<li><a href=\"https://kiss.caltech.edu/lectures/2019_biodiversity.html\">Biodiversity: Perspectives of a Techie</a> - Dave Thau - Data and Technology Global Lead Scientist, WWF</li>\n</ul>\n<h2 id=\"valuing-climate-change\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#valuing-climate-change\"></a>Valuing climate change</h2>\n<ul>\n<li><a href=\"https://www.sciencedirect.com/science/article/pii/S001671851930051X\">Cryptocarbon: The promises and pitfalls of forest protection on a blockchain</a></li>\n<li><a href=\"https://www.nature.com/articles/s41558-018-0285-8\">Valuing climate damages at the country level</a> - nature climate change, 2018</li>\n<li><a href=\"https://www.nature.com/articles/s41558-018-0282-y\">Country-level social cost of carbon</a>, nature climate change 2018</li>\n</ul><h1>References</h1><ul><li>Moore (2018). 
Valuing climate damages at the country level. Nature Climate Change. <a href=\"https://doi.org/10.1038/s41558-018-0285-8\" target=\"_blank\"><i>10.1038/s41558-018-0285-8</i></a></li>\n<li>Ricke et al (2018). Country-level social cost of carbon. Nature Climate Change. <a href=\"https://doi.org/10.1038/s41558-018-0282-y\" target=\"_blank\"><i>10.1038/s41558-018-0282-y</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/forests",
      "title": "Forest preservation and restoration",
      "summary": "Forest preservation and restoration efforts and resources compiled for reference.",
      "date_published": "2021-09-25T00:00:00.000000Z",
      "date_modified": "2021-09-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "4c",
        "conservation",
        "forests"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1038/s41558-018-0285-8",
          "doi": "10.1038/s41558-018-0285-8",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.1038/s41558-018-0282-y",
          "doi": "10.1038/s41558-018-0282-y",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/decentralised-stack",
      "content_html": "<p><a href=\"https://nick.recoil.org\">Nick Ludlam</a> and I have self-hosted recoil.org since around 1996, typically for\nemail and web.  These days, there are a number of interesting software stacks\naround decentralised communication that we deploy. This note keeps track of\nthem.</p>\n<ul>\n<li><strong>Email</strong>  (active)\n<ul>\n<li>Currently Postfix and DKIM/SPIF relays</li>\n<li>Till 2019, was OpenSMTPD and would like to return to it but waiting on\nfilter support.</li>\n<li>Till around 2016, was qmail but finally gave up due to difficulty of\nspam filtering.</li>\n<li>Next step will be to try out the MirageOS email stack that dinosaure\nhas been leading the development of.</li>\n</ul>\n</li>\n<li><strong>Web</strong> (active)\n<ul>\n<li>This website is an OCaml webserver running a custom multicore OCaml <a href=\"https://github.com/avsm/eeww\">webserver</a></li>\n<li>Next step will be to go solar powered with a custom DNS server.</li>\n</ul>\n</li>\n<li><strong>DNS</strong> (inactive)\n<ul>\n<li>MirageOS DNS server.</li>\n<li>Currently offline due to a hosting issue so fell back to Gandi.</li>\n<li>Hopefully can secondary with @hannesm and his MirageOS infrastructure.</li>\n</ul>\n</li>\n<li><strong>Videos</strong> (active)\n<ul>\n<li>Running a PeerTube instance on <a href=\"https://crank.recoil.org\">https://crank.recoil.org</a></li>\n<li>Also deployed this for the OCaml community as &lt;watch.ocaml.org&gt;, so my\npersonal recoil instance is &quot;following&quot; the OCaml one as well as having\nmy own videos.</li>\n</ul>\n</li>\n<li><strong>Chat</strong> (active)\n<ul>\n<li>Running a Matrix Element. server with a HTTP srv for recoil.org</li>\n<li>Using element.io clients to connect to it.</li>\n<li>Lots of federation to other services happening from this via\nrepublished rooms, so its a fairly busy server.</li>\n<li>Next step is to deploy some of the OCaml Matrix clients to control\nthe notifications. 
Element doesn't have very good push support.</li>\n<li>Decided not to bridge this to WhatsApp/Signal/etc as the maintenance\ncost is quite high and it requires unencrypted passwords.</li>\n<li>Need to regularly sweep the Element database to keep the size down, as detailed in this <a href=\"https://levans.fr/shrink-synapse-database.html\">handy blog post</a>.</li>\n</ul>\n</li>\n<li><strong>Activity</strong> (active)\n<ul>\n<li>Deployed a Mastodon instance for distributed tweeting via\nActivityPub, on https://amok.recoil.org/</li>\n</ul>\n</li>\n<li><strong>Images</strong> (inactive)\n<ul>\n<li>Tristan Henderson pointed me to pixelfed which seems worth a try for\nimage sharing over ActivityPub. Not had a chance to use it yet.</li>\n</ul>\n</li>\n<li><strong>Spam</strong> (inactive)\n<ul>\n<li>Problem with the chat service is that I'm getting quite a lot of spam\nrequests on Matrix. Am experimenting with a Tezos node to act as a\nDID introduction proxy with gas costs. Hopefully there's a way to\nbe introduced due to some common service (or some evidence of PoW for the\ncommunication such as having read and quoted one of my papers or something)\nand have micropayment as a last-resort.</li>\n<li>Also deployed SpamAssassin recoil-wide and custom bayes filters.</li>\n</ul>\n</li>\n</ul>\n<p>In general, our operating system of choice is OpenBSD (since 1998 or so) with\nAlpine Linux for the more recent things that run on a cloud or haven't been\nported yet.</p>",
      "url": "https://anil.recoil.org/notes/decentralised-stack",
      "title": "Decentralised tech on Recoil",
      "summary": "Recoil's decentralized tech stack includes email, web, and chat services.",
      "date_published": "2021-09-19T00:00:00.000000Z",
      "date_modified": "2021-09-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "recoil",
        "selfhosting",
        "openbsd",
        "opensource",
        "security"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/roadmap-ocamlorg-v3",
      "content_html": "<p>After a decade of good service, it's time to overhaul OCaml's online presence\nto more modern technologies. This post lays out the roadmap for the third\nedition of the OCaml.org website.</p>",
      "url": "https://anil.recoil.org/notes/roadmap-ocamlorg-v3",
      "external_url": "https://discuss.ocaml.org/t/v3-ocaml-org-a-roadmap-for-ocamls-online-presence/8368",
      "title": "Roadmap for OCaml's online presence",
      "summary": "Overhauling OCaml's online presence with a modern new website.",
      "date_published": "2021-08-27T00:00:00.000000Z",
      "date_modified": "2021-08-27T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "selfhosting"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2021-arxiv-forestrycs-1",
      "content_html": "<p>Preprint about our working notes on how CS might contribute to forest preservation with Gemma Gordon, Amelia Holcomb, Tom Kelly, Srinivasan Keshav and Jon Ludlam. This was a departure from our usual systems work, exploring how computational techniques could aid forest restoration efforts - tackling the interlinked crises of climate change and biodiversity loss. The paper outlined opportunities for computer science to contribute to reforestation, from drone-based sensing to data management for restoration projects.</p><h1>References</h1><ul><li>Gordon et al (2021). How Computer Science Can Aid Forest Restoration. arXiv. <a href=\"https://doi.org/10.48550/arXiv.2109.07898\" target=\"_blank\"><i>10.48550/arXiv.2109.07898</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2021-arxiv-forestrycs-1",
      "title": "How Computer Science Can Aid Forest Restoration",
      "summary": "Preprint exploring how computer science techniques could contribute to forest preservation and restoration efforts.",
      "date_published": "2021-08-01T00:00:00.000000Z",
      "date_modified": "2021-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "forests",
        "conservation",
        "restoration",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2021-arxiv-forestrycs.pdf",
          "mime_type": "application/pdf",
          "title": "How Computer Science Can Aid Forest Restoration"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.2109.07898",
          "doi": "10.48550/arXiv.2109.07898",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2021-oud-effects-1",
      "content_html": "<p>Paper on programming with effects in OCaml at the OCaml Workshop with Thomas Leonard, Craig Ferguson, Patrick Ferris, Sadiq Jaffer, Tom Kelly and KC Sivaramakrishnan. This shared practical experiences from actually using effect handlers in real systems, now that multicore OCaml with effects was becoming available. The lessons learned helped refine the design and informed best practices for effect-based programming in production code.</p>",
      "url": "https://anil.recoil.org/notes/2021-oud-effects-1",
      "title": "Experiences with Effects",
      "summary": "Paper on programming with effects in OCaml presented at OCaml Workshop.",
      "date_published": "2021-08-01T00:00:00.000000Z",
      "date_modified": "2021-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "effect-handlers",
        "fp",
        "programming-languages"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2021-oud-effects.pdf",
          "mime_type": "application/pdf",
          "title": "Experiences with Effects"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2021-pldi-retroeff-1",
      "content_html": "<p>Paper on retrofitting effects in OCaml presented at PLDI 2021 with KC Sivaramakrishnan, Stephen Dolan, Leo White, Tom Kelly and Sadiq Jaffer. This work complemented our retrofitting parallelism paper, showing how algebraic effect handlers could be added to OCaml's runtime while maintaining backwards compatibility. Effect handlers are the key mechanism enabling lightweight concurrency in multicore OCaml, and getting the design and implementation right was critical for the language's evolution.</p><h1>References</h1><ul><li>Sivaramakrishnan et al (2021). Retrofitting effect handlers onto OCaml. ACM. <a href=\"https://doi.org/10.1145/3453483.3454039\" target=\"_blank\"><i>10.1145/3453483.3454039</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2021-pldi-retroeff-1",
      "title": "Retrofitting effect handlers onto OCaml",
      "summary": "Paper on retrofitting effect handlers into OCaml runtime presented at PLDI 2021.",
      "date_published": "2021-06-01T00:00:00.000000Z",
      "date_modified": "2021-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "effect-handlers",
        "fp",
        "programming-languages"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2021-pldi-retroeff.pdf",
          "mime_type": "application/pdf",
          "title": "Retrofitting effect handlers onto OCaml"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3453483.3454039",
          "doi": "10.1145/3453483.3454039",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2020-asplas-banyan-1",
      "content_html": "<p>Paper on Banyan for coordination-free distributed transactions at ASPLAS 2020 with Shashank Shekhar Dubey, KC Sivaramakrishnan and Thomas Gazagnaire. Banyan addressed a key limitation of existing geo-distributed and IoT systems: they couldn't afford coordination latency but still needed transactions. The paper showed how mergeable replicated data types built on Irmin could support distributed transactions without coordination, while remaining composable - something previous CRDT approaches couldn't achieve.</p><h1>References</h1><ul><li>Dubey et al (2020). Banyan: Coordination-Free Distributed Transactions over Mergeable Types. Springer International Publishing. <a href=\"https://doi.org/10.1007/978-3-030-64437-6_12\" target=\"_blank\"><i>10.1007/978-3-030-64437-6_12</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2020-asplas-banyan-1",
      "title": "Banyan: Coordination-Free Distributed Transactions over Mergeable Types",
      "summary": "Paper on coordination-free distributed transactions using mergeable replicated data types at ASPLAS 2020.",
      "date_published": "2020-11-01T00:00:00.000000Z",
      "date_modified": "2020-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "distributed",
        "databases",
        "crdts",
        "ocaml",
        "transactions"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2020-asplas-banyan.pdf",
          "mime_type": "application/pdf",
          "title": "Banyan: Coordination-Free Distributed Transactions over Mergeable Types"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/978-3-030-64437-6_12",
          "doi": "10.1007/978-3-030-64437-6_12",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2020-oud-platform-1",
      "content_html": "<p>Annual update on the OCaml Platform at the OCaml Workshop, the 2020 edition of our ongoing series. This year's update covered the continued evolution of the toolchain, including improved testing infrastructure, better cross-platform support, and preparations for the coming multicore OCaml changes. These yearly checkpoints help ensure the platform evolves in sync with the language itself.</p>",
      "url": "https://anil.recoil.org/notes/2020-oud-platform-1",
      "title": "The OCaml Platform: 2020",
      "summary": "Annual OCaml Workshop update on the OCaml Platform development toolchain.",
      "date_published": "2020-09-01T00:00:00.000000Z",
      "date_modified": "2020-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "devtools",
        "platform"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/b11188ba-0f97-4ec4-b372-fa3cea0821ab-1",
      "content_html": "<p>Talk on the state of the OCaml Platform in 2020, outlining the next steps for OCaml development tools and infrastructure. Delivered online during the pandemic, the talk covered improvements to the OCaml toolchain, IDE support, documentation, and package management. I discussed our ongoing work at OCaml Labs to make OCaml more accessible to developers and strengthen the ecosystem with better developer experience.</p>",
      "url": "https://anil.recoil.org/notes/b11188ba-0f97-4ec4-b372-fa3cea0821ab-1",
      "title": "State of the OCaml Platform 2020",
      "summary": "Talk on next steps for the OCaml Platform development tools, delivered online during the pandemic.",
      "date_published": "2020-08-28T00:00:00.000000Z",
      "date_modified": "2020-08-28T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling",
        "devtools",
        "community"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/multicore-monthly-sep20",
      "content_html": "<p>The big advance in the multicore OCaml branch is that we restored compatibility\nwith the traditional OCaml systhreads. This in turn means that many existing\nsoftware packages just work out of the box on the new runtime.</p>\n<blockquote>\n<p>Big news this month is that the systhreads compatibility support PR has been\nmerged, which means that Dune (and other users of the Thread module) can\ncompile out of the box. You can now compile the multicore OCaml fork\nconveniently using the new opam compiler plugin (see announcement).\n<cite>-- <a href=\"https://discuss.ocaml.org/t/multicore-ocaml-september-2020/6565\">me, on the discussion forum</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/multicore-monthly-sep20",
      "external_url": "https://discuss.ocaml.org/t/multicore-ocaml-september-2020/6565",
      "title": "OCaml Multicore Monthly: systhreads compatibility merged",
      "summary": "OCaml Multicore now supports systhreads compatibility for seamless integration.",
      "date_published": "2020-08-20T00:00:00.000000Z",
      "date_modified": "2020-08-20T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "opensource",
        "multicore"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2020-icfp-retropar-1",
      "content_html": "<p>Won best paper award at ICFP 2020 for our paper on retrofitting parallelism onto OCaml! This was the culmination of years of work with KC Sivaramakrishnan, Stephen Dolan, Leo White and the multicore team. The paper presented a mostly-concurrent garbage collector that maintained backwards compatibility for sequential code while enabling true parallelism. The achievement was maintaining both feature compatibility and performance for existing single-threaded programs while scaling admirably on multicore processors - a balancing act that required novel GC techniques and careful engineering.</p><h1>References</h1><ul><li>Sivaramakrishnan et al (2020). Retrofitting parallelism onto OCaml. <a href=\"https://doi.org/10.1145/3408995\" target=\"_blank\"><i>10.1145/3408995</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2020-icfp-retropar-1",
      "title": "Retrofitting parallelism onto OCaml",
      "summary": "Best paper award at ICFP 2020 for multicore garbage collector design maintaining backwards compatibility for sequential OCaml code.",
      "date_published": "2020-08-01T00:00:00.000000Z",
      "date_modified": "2020-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "multicore",
        "gc",
        "fp",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2020-icfp-retropar.pdf",
          "mime_type": "application/pdf",
          "title": "Retrofitting parallelism onto OCaml"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3408995",
          "doi": "10.1145/3408995",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2020-oud-parallelising-1",
      "content_html": "<p>Paper on how to parallelise OCaml code at the OCaml Workshop with Sadiq Jaffer, Sudha Parimala, KC Sivaramakrishnan and Tom Kelly. As multicore OCaml became available, we needed practical guidance for developers on how to actually use the new parallel features. This paper provided concrete techniques and patterns for parallelizing existing OCaml codebases, helping the community prepare for the multicore transition.</p>",
      "url": "https://anil.recoil.org/notes/2020-oud-parallelising-1",
      "title": "Parallelising your OCaml Code with Multicore OCaml",
      "summary": "OCaml Workshop paper on techniques for parallelizing OCaml code using multicore features.",
      "date_published": "2020-08-01T00:00:00.000000Z",
      "date_modified": "2020-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "multicore",
        "parallel",
        "fp",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2020-oud-parallelising.pdf",
          "mime_type": "application/pdf",
          "title": "Parallelising your OCaml Code with Multicore OCaml"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2020-oud-ci-1",
      "content_html": "<p>Presented the new OCaml DSL for continuous integration at the OCaml Workshop with Thomas Leonard, Craig Ferguson, Kate Deplaix and Magnus Skjegstad. OCaml-CI automatically extracts metadata from opam and dune files to determine what to build, uses aggressive caching for speed, and tests across multiple OCaml versions and platforms. Deployed on around 50 GitHub projects, it delivered response times an order of magnitude faster than less integrated CI solutions - demonstrating the benefits of deep integration with the OCaml ecosystem.</p>",
      "url": "https://anil.recoil.org/notes/2020-oud-ci-1",
      "title": "OCaml-CI: A Zero-Configuration CI",
      "summary": "Presentation of new OCaml DSL for continuous integration at the OCaml Workshop.",
      "date_published": "2020-08-01T00:00:00.000000Z",
      "date_modified": "2020-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "ci",
        "devops",
        "dsl",
        "testing"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/multicore-monthly-apr20",
      "content_html": "<p>In the April OCaml multicore monthly, we have a preprint available of our ICFP submission about the OCaml 5 multicore runtime.\n<em>(Update: This paper actually won the ICFP best paper award later in the year! Read it at &quot;<a href=\"/papers/2020-icfp-retropar\">Retrofitting parallelism onto OCaml</a>&quot;).</em></p><h1>References</h1><ul><li>Sivaramakrishnan et al (2020). Retrofitting parallelism onto OCaml. <a href=\"https://doi.org/10.1145/3408995\" target=\"_blank\"><i>10.1145/3408995</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/multicore-monthly-apr20",
      "external_url": "https://discuss.ocaml.org/t/multicore-update-april-2020-with-a-preprint-paper/5630/1",
      "title": "OCaml Multicore Monthly: preprint paper available",
      "summary": "Preprint of OCaml 5 multicore runtime paper available now",
      "date_published": "2020-04-27T00:00:00.000000Z",
      "date_modified": "2020-04-27T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "papers",
        "multicore"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3408995",
          "doi": "10.1145/3408995",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/c09ed36f-6ad5-4254-a0ce-3ca3398f38a3-1",
      "content_html": "<p>Part 2 of my distinguished lecture series at St Andrews, discussing how unikernels had reached real-world deployment at scale. This talk explored actual deployments reaching billions of users, demonstrating that unikernel technology had moved beyond academic research into production systems. I covered lessons learned from deploying MirageOS in embedded systems and cloud infrastructure, and the future potential for unikernels in mainstream computing.</p>",
      "url": "https://anil.recoil.org/notes/c09ed36f-6ad5-4254-a0ce-3ca3398f38a3-1",
      "title": "The First Billion Real Deployments of Unikernels",
      "summary": "Discussion of unikernel deployments at scale reaching billions of users.",
      "date_published": "2020-02-26T00:00:00.000000Z",
      "date_modified": "2020-02-26T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "deployment",
        "systems"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/d456e4bc-bce6-45ad-9d2e-102f834ec400-1",
      "content_html": "<p>Delivered the distinguished seminar series at St Andrews on rebuilding Operating Systems with functional principles. Part 1 of the lecture series explored how functional programming concepts could be applied to systems design, using MirageOS as a case study. I discussed how immutability, type safety, and composability from functional languages could help build more secure and reliable operating systems, moving away from the complexity and vulnerabilities of traditional OS architectures.</p>",
      "url": "https://anil.recoil.org/notes/d456e4bc-bce6-45ad-9d2e-102f834ec400-1",
      "title": "Rebuilding Operating Systems with Functional Principles",
      "summary": "Distinguished lecture series at University of St Andrews on functional programming principles applied to operating systems design.",
      "date_published": "2020-02-26T00:00:00.000000Z",
      "date_modified": "2020-02-26T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "fp",
        "systems",
        "unikernels",
        "mirageos",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/287364fa-b59c-4b9f-812d-d81cc0c992a5-1",
      "content_html": "<p>Part 3 of my distinguished lecture series at St Andrews on functional foundations for operating systems. This talk explores the programming challenges we'll face with the coming wave of embedded devices - likely trillions of them over the next decade. The series at St Andrews was a great opportunity to present a comprehensive vision for how functional programming approaches, particularly OCaml and unikernels, can address the unique constraints of embedded systems. These devices need to be secure, resource-efficient, and easy to deploy at scale, which makes traditional OS approaches challenging.</p>",
      "url": "https://anil.recoil.org/notes/287364fa-b59c-4b9f-812d-d81cc0c992a5-1",
      "title": "Programming the Next Trillion Embedded Devices",
      "summary": "Talk on programming approaches for the coming wave of embedded devices.",
      "date_published": "2020-02-26T00:00:00.000000Z",
      "date_modified": "2020-02-26T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "embedded",
        "systems",
        "iot",
        "programming",
        "fp"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/multicore-monthly-jan20",
      "content_html": "<p>We started the process of upstreaming our <a href=\"/papers/2014-oud-multicore\">multicore OCaml</a> branch to mainline OCaml, and so I started posting regular updates to the community forum.</p>\n<blockquote>\n<p>The most common question we get is how to contribute to the overall multicore effort. As I noted last year, we are now in the process of steadily upstreaming our efforts to mainline OCaml. Therefore, the best way by far to contribute is to test for regressions or opportunities for improvements in the patches that are outstanding in the main OCaml repository.\n<cite>-- <a href=\"https://discuss.ocaml.org/t/multicore-ocaml-january-2020-update/5090\">me, on the discussion forum</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/multicore-monthly-jan20",
      "external_url": "https://discuss.ocaml.org/t/multicore-ocaml-january-2020-update/5090",
      "title": "OCaml Multicore Monthly: starting upstream to OCaml",
      "summary": "Upstreaming multicore OCaml to mainline OCaml has begun, with opportunities for community contribution through testing and feedback.",
      "date_published": "2020-01-29T00:00:00.000000Z",
      "date_modified": "2020-01-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "multicore",
        "ocaml",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/openbsd-hosting",
      "content_html": "<p>I <a href=\"https://twitter.com/avsm/status/1167012354556669952\">asked on Twitter</a> about hosting options for OpenBSD on cloud providers, so that we could have some alternative options for Recoil.  We have a strong preference for bare-metal and not VMs when it comes to OpenBSD.  Options that came back were:</p>\n<ul>\n<li>OpenBSDAMS\n<ul>\n<li>Dedicated/VMs for openbsd hosting (see <a href=\"https://twitter.com/OpenBSDAms\">here</a>)</li>\n</ul>\n</li>\n<li>Mythic Beasts\n<ul>\n<li>I have provisioned a bare metal server there and they kindly stuck a USB stick in with an OpenBSD installer.</li>\n</ul>\n</li>\n<li>DataCentreLite\n<ul>\n<li>Not tried this yet but <a href=\"https://twitter.com/NicoSchottelius/status/1167163133024264192\">possible followup</a>.</li>\n</ul>\n</li>\n<li>LiquidWeb\n<ul>\n<li>Good <a href=\"https://twitter.com/vphantom/status/1167020959771049984\">recommendation</a> from Stephane</li>\n</ul>\n</li>\n</ul>",
      "url": "https://anil.recoil.org/notes/openbsd-hosting",
      "title": "OpenBSD cloud hosting options",
      "summary": "Explore OpenBSD cloud hosting options for bare-metal and dedicated servers.",
      "date_published": "2019-08-29T00:00:00.000000Z",
      "date_modified": "2019-08-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "perscon",
        "openbsd",
        "cloud",
        "selfhosting"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2019-ocaml-platform-1",
      "content_html": "<p>Annual update on the OCaml Platform in 2019 with Gemma Gordon at the OCaml Workshop. By this point, the platform had evolved considerably with major improvements to tooling, better IDE support through LSP integration, and growing adoption of modern best practices. These regular updates help track progress and set priorities for the community-driven tooling effort.</p>",
      "url": "https://anil.recoil.org/notes/2019-ocaml-platform-1",
      "title": "The OCaml Platform in 2019",
      "summary": "Annual update on OCaml Platform development and ecosystem progress.",
      "date_published": "2019-08-01T00:00:00.000000Z",
      "date_modified": "2019-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2019-mirage-build-1",
      "content_html": "<p>Paper on the MirageOS 4 build system at OCaml Workshop with Lucas Pluvinage, Romain Calascibetta and Rudi Grinberg. MirageOS 4 introduced practical build systems for exotic targets like embedded devices and specialized hypervisor backends. The work integrated with dune, OCaml's modern build system, making it much easier to configure and build unikernels for different deployment targets - a major step toward making unikernel development accessible to more developers.</p>",
      "url": "https://anil.recoil.org/notes/2019-mirage-build-1",
      "title": "MirageOS 4: the dawn of practical build systems for exotic targets",
      "summary": "Paper on the MirageOS 4 build system presented at OCaml Workshop.",
      "date_published": "2019-08-01T00:00:00.000000Z",
      "date_modified": "2019-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "ocaml",
        "build-systems",
        "unikernels"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2019-mirage-build.pdf",
          "mime_type": "application/pdf",
          "title": "MirageOS 4: the dawn of practical build systems for exotic targets"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2019-mirage-functors-1",
      "content_html": "<p>Preprint on programming unikernels with ML modules, exploring functor-driven development with Gabriel Radanne, Thomas Gazagnaire, Jeremy Yallop, Richard Mortier and others. The paper tackled the configuration matrix problem - realistic unikernel applications depend on hundreds of libraries, each with different requirements across heterogeneous platforms. We showed how OCaml's module system, particularly functors, could cleanly separate configuration from application logic, enabling modular composition and leveraging link-time optimization for efficiency.</p><h1>References</h1><ul><li>Radanne et al (2019). Programming Unikernels in the Large via Functor Driven Development. arXiv. <a href=\"https://doi.org/10.48550/arXiv.1905.02529\" target=\"_blank\"><i>10.48550/arXiv.1905.02529</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2019-mirage-functors-1",
      "title": "Programming Unikernels in the Large via Functor Driven Development",
      "summary": "Preprint on programming unikernels using ML modules and functors.",
      "date_published": "2019-05-01T00:00:00.000000Z",
      "date_modified": "2019-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "ocaml",
        "functors",
        "fp"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2019-mirage-functors.pdf",
          "mime_type": "application/pdf",
          "title": "Programming Unikernels in the Large via Functor Driven Development"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.1905.02529",
          "doi": "10.48550/arXiv.1905.02529",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2019-edgesys-snape-1",
      "content_html": "<p>Paper on a framework to rearchitect applications for better TEE support at EdgeSys 2019 with Zahra Tarkhani and Richard Mortier. Snape (named after the Harry Potter character who mastered the &quot;dark arts&quot;) provided tools for handling heterogeneous trusted execution environments. The framework helped developers partition applications across different TEE technologies like Intel SGX, ARM TrustZone, and others, addressing the practical challenge that edge devices often have diverse security capabilities.</p><h1>References</h1><ul><li>Tarkhani et al (2019). Snape: The Dark Art of Handling Heterogeneous Enclaves. ACM. <a href=\"https://doi.org/10.1145/3301418.3313945\" target=\"_blank\"><i>10.1145/3301418.3313945</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2019-edgesys-snape-1",
      "title": "Snape: The Dark Art of Handling Heterogeneous Enclaves",
      "summary": "EdgeSys 2019 paper on framework for rearchitecting applications to better support trusted execution environments.",
      "date_published": "2019-03-01T00:00:00.000000Z",
      "date_modified": "2019-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "tee",
        "systems",
        "enclaves"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2019-edgesys-snape.pdf",
          "mime_type": "application/pdf",
          "title": "Snape: The Dark Art of Handling Heterogeneous Enclaves"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3301418.3313945",
          "doi": "10.1145/3301418.3313945",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/ocaml-opam-new-layout",
      "content_html": "<p>Managing package manager constraints is getting difficult, particularly given the growth in the number of packages in the <a href=\"https://github.com/ocaml/opam-repository\">opam repository</a>. I'm therefore laying out a new mechanism for OCaml contributors to submit large package sets, such as those from <a href=\"https://janestreet.com\">Jane Street</a>.</p>",
      "url": "https://anil.recoil.org/notes/ocaml-opam-new-layout",
      "external_url": "https://discuss.ocaml.org/t/experimenting-with-a-new-opam-repository-release-strategy-for-large-libraries/2918",
      "title": "New opam repository layout for large libraries",
      "summary": "Introducing a new opam repository layout for managing large libraries and packages.",
      "date_published": "2018-11-19T00:00:00.000000Z",
      "date_modified": "2018-11-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2018-socp-modular-ffi-1",
      "content_html": "<p>Journal paper on building modular foreign function interfaces in Science of Computer Programming with Jeremy Yallop and David Sheets. This expanded our FLOPS work into a comprehensive treatment of how to structure FFI systems modularly. The approach separated the what (function specifications) from the how (binding mechanisms), enabling the same specification to be used with different backends - a pattern that has since influenced FFI design in other languages.</p><h1>References</h1><ul><li>Yallop et al (2018). A modular foreign function interface. <a href=\"https://doi.org/10.1016/j.scico.2017.04.002\" target=\"_blank\"><i>10.1016/j.scico.2017.04.002</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2018-socp-modular-ffi-1",
      "title": "A modular foreign function interface",
      "summary": "Journal paper on building modular foreign function interfaces.",
      "date_published": "2018-10-01T00:00:00.000000Z",
      "date_modified": "2018-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "ffi",
        "fp",
        "systems",
        "programming"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2018-socp-modular-ffi.pdf",
          "mime_type": "application/pdf",
          "title": "A modular foreign function interface"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1016/j.scico.2017.04.002",
          "doi": "10.1016/j.scico.2017.04.002",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2018-pldi-memorymodel-1",
      "content_html": "<p>Paper on the OCaml memory model and underlying theory at PLDI 2018 with Stephen Dolan and KC Sivaramakrishnan. This work was crucial for multicore OCaml, defining the formal memory model that specifies what behaviors are legal for concurrent programs. The paper presented novel techniques for bounding data races in both space and time, providing the theoretical foundation needed to ensure multicore OCaml programs behave predictably while still achieving good performance on modern hardware.</p><h1>References</h1><ul><li>Dolan et al (2018). Bounding data races in space and time. ACM. <a href=\"https://doi.org/10.1145/3192366.3192421\" target=\"_blank\"><i>10.1145/3192366.3192421</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2018-pldi-memorymodel-1",
      "title": "Bounding data races in space and time",
      "summary": "Paper on the OCaml memory model and underlying theory presented at PLDI 2018.",
      "date_published": "2018-06-01T00:00:00.000000Z",
      "date_modified": "2018-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "memory-model",
        "concurrency",
        "fp",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2018-pldi-memorymodel.pdf",
          "mime_type": "application/pdf",
          "title": "Bounding data races in space and time"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/3192366.3192421",
          "doi": "10.1145/3192366.3192421",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2017-tfp-effecthandlers-1",
      "content_html": "<p>Paper on concurrent systems programming with effect handlers at TFP 2017, expanding on our ML Workshop work. This paper with Stephen Dolan, Spiros Eliopoulos, Daniel Hillerstrom, KC Sivaramakrishnan and Leo White demonstrated that effect handlers in Multicore OCaml could elegantly express difficult concurrent systems programs without performance compromises. We showed that highly concurrent web servers built with effect handlers performed on par with heavily optimized monadic concurrency libraries, while maintaining the simplicity of direct-style code - a significant practical achievement.</p><h1>References</h1><ul><li>Dolan et al (2018). Concurrent System Programming with Effect Handlers. Springer International Publishing. <a href=\"https://doi.org/10.1007/978-3-319-89719-6_6\" target=\"_blank\"><i>10.1007/978-3-319-89719-6_6</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2017-tfp-effecthandlers-1",
      "title": "Concurrent System Programming with Effect Handlers",
      "summary": "Paper on concurrent systems programming using effect handlers presented at TFP 2017.",
      "date_published": "2018-04-01T00:00:00.000000Z",
      "date_modified": "2018-04-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "effect-handlers",
        "concurrency",
        "systems",
        "fp"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2017-tfp-effecthandlers.pdf",
          "mime_type": "application/pdf",
          "title": "Concurrent System Programming with Effect Handlers"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/978-3-319-89719-6_6",
          "doi": "10.1007/978-3-319-89719-6_6",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2018-hotpost-osmose-1",
      "content_html": "<p>Paper on the interspatial networking architecture at HotPOST 2018 with KC Sivaramakrishnan, Gemma Gordon and Thomas Gazagnaire. OSMOSE inverted the typical cloud-centric model by designing an OS for extremely low-latency local computation in physical spaces. The architecture used unikernels and Irmin to provide secure, high-bandwidth connectivity between physical spaces rather than relying on remote datacenters. This addressed the data security, latency and reliability issues inherent in shipping everything to the cloud.</p><h1>References</h1><ul><li>Madhavapeddy et al (2018). An architecture for interspatial communication. IEEE. <a href=\"https://doi.org/10.1109/INFCOMW.2018.8406931\" target=\"_blank\"><i>10.1109/INFCOMW.2018.8406931</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2018-hotpost-osmose-1",
      "title": "An architecture for interspatial communication",
      "summary": "Paper on interspatial networking architecture presented at HotPOST 2018.",
      "date_published": "2018-04-01T00:00:00.000000Z",
      "date_modified": "2018-04-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "systems",
        "spatial",
        "architecture"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2018-hotpost-osmose.pdf",
          "mime_type": "application/pdf",
          "title": "An architecture for interspatial communication"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/INFCOMW.2018.8406931",
          "doi": "10.1109/INFCOMW.2018.8406931",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/founded-tarides",
      "content_html": "<p>I'm delighted to report that I'm helping my long-time collaborator <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> to return to his OCaml roots from Docker. He has just founded Tarides, a startup in Paris with the goal of advancing the open source OCaml ecosystem.</p>\n<blockquote>\n<p>Founded in Paris in early 2018, Tarides helps developers and companies build secure, performant and resource-efficient network and storage services. We are using MirageOS to run applications without the overhead of a traditional operating system and Irmin to create scalable distributed applications. Tarides offers commercial support and commercial development services for companies interested to run MirageOS or Irmin as part of their technology stack.\n<cite> -- <a href=\"https://discuss.ocaml.org/t/tarides-is-looking-for-software-engineers-to-work-on-mirageos-and-irmin/1690\">Thomas Gazagnaire</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/founded-tarides",
      "external_url": "https://discuss.ocaml.org/t/tarides-is-looking-for-software-engineers-to-work-on-mirageos-and-irmin/1690",
      "title": "Founded Tarides and looking to hire OCaml hackers",
      "summary": "Tarides, a Paris-based startup, seeks OCaml hackers to build secure network services with MirageOS and Irmin.",
      "date_published": "2018-03-02T00:00:00.000000Z",
      "date_modified": "2018-03-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "startups",
        "france",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2f824dde-e112-4f4f-890d-1825572ea1c4-1",
      "content_html": "<p>Talk on the state of the OCaml Platform at the 2017 OCaml Workshop. This was my annual update to the community on the progress of the OCaml Platform - the collection of tools and libraries that make up the OCaml development experience. By 2017, we were seeing real momentum with opam, the package manager, becoming widely adopted, and early work on better editor integration and build tools. These annual talks at the OCaml Workshop serve as both progress reports and opportunities to gather feedback from the community on what tooling improvements matter most to practitioners.</p>",
      "url": "https://anil.recoil.org/notes/2f824dde-e112-4f4f-890d-1825572ea1c4-1",
      "title": "State of the OCaml Platform",
      "summary": "Talk presenting the current state of the OCaml Platform toolchain and development ecosystem.",
      "date_published": "2017-09-08T00:00:00.000000Z",
      "date_modified": "2017-09-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling",
        "devtools",
        "community"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2017-oud-platform-1",
      "content_html": "<p>Annual update on the OCaml Platform at ICFP, continuing the tradition of yearly status reports on the OCaml development toolchain. By 2017, the platform had matured significantly with improved editor integration, better documentation tools, and more robust CI infrastructure. These yearly talks help coordinate the community's efforts and ensure the tools evolve together coherently.</p>",
      "url": "https://anil.recoil.org/notes/2017-oud-platform-1",
      "title": "The State of the OCaml Platform: Sep 2017",
      "summary": "Annual update on OCaml Platform development presented at ICFP.",
      "date_published": "2017-09-01T00:00:00.000000Z",
      "date_modified": "2017-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2017-oud-platform.pdf",
          "mime_type": "application/pdf",
          "title": "The State of the OCaml Platform: Sep 2017"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2017-ml-effects-1",
      "content_html": "<p>Paper on how to tackle awkward IO patterns with effect handlers at the ML Workshop with Stephen Dolan, Spiros Eliopoulos, Daniel Hillerstrom, KC Sivaramakrishnan and Leo White. The work addressed what Simon Peyton Jones famously called the &quot;Awkward Squad&quot; - the messy reality of I/O, concurrency, and exceptions in functional programming. We showed how algebraic effects and handlers could elegantly express these programs, introducing the concept of asynchronous effects to solve the interaction between user-level threads and operating system services without compromising performance.</p>",
      "url": "https://anil.recoil.org/notes/2017-ml-effects-1",
      "title": "Effectively tackling the awkward squad",
      "summary": "Paper on tackling awkward IO patterns using effect handlers.",
      "date_published": "2017-09-01T00:00:00.000000Z",
      "date_modified": "2017-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "effects",
        "fp",
        "io",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2017-ml-effects.pdf",
          "mime_type": "application/pdf",
          "title": "Effectively tackling the awkward squad"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/opening-discuss-ocaml",
      "content_html": "<p>I opened up a <a href=\"https://discourse.org\">Discourse</a> forum for the OCaml community to use, which is running successfully at <a href=\"https://discuss.ocaml.org\">discuss.ocaml.org</a>. This forum thread collates the feedback and discussions about it.</p>",
      "url": "https://anil.recoil.org/notes/opening-discuss-ocaml",
      "external_url": "https://discuss.ocaml.org/t/discussion-site-status-and-timeline/23",
      "title": "Opening discuss.ocaml.org for the community",
      "summary": "Introducing discuss.ocaml.org, a new community forum for OCaml discussion and feedback.",
      "date_published": "2017-05-13T00:00:00.000000Z",
      "date_modified": "2017-05-13T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "selfhosting",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2017-snapl-dali-1",
      "content_html": "<p>Position paper on building databases-as-a-library at SNAPL 2017, the Summit on Advances in Programming Languages. Working with Gowtham Kaki, KC Sivaramakrishnan, Thomas Gazagnaire and Suresh Jagannathan, we argued for a radical rethinking of database architecture. Instead of monolithic database servers, DaLi proposed encapsulating data management as transparent libraries in the same language as the application. This enables the type system and verification tools to enforce application-level invariants across the data layer - essentially extending the unikernel philosophy from computation to data.</p>",
      "url": "https://anil.recoil.org/notes/2017-snapl-dali-1",
      "title": "DaLi: Database as a Library",
      "summary": "Position paper on building databases-as-a-library presented at SNAPL 2017.",
      "date_published": "2017-05-01T00:00:00.000000Z",
      "date_modified": "2017-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "databases",
        "fp",
        "ocaml",
        "systems",
        "library"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2017-snapl-dali.pdf",
          "mime_type": "application/pdf",
          "title": "DaLi: Database as a Library"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/725dda70-b12b-4b1a-a8ae-fa9c22683ff2-1",
      "content_html": "<p>DockerCon talk on unikernels and MirageOS, explaining the integration work we had done to deliver Docker for Desktop using library hypervisor technology. I spoke about how unikernels could work alongside containers, with MirageOS providing the underlying virtualization infrastructure. This talk came after our acquisition by Docker and showed how functional programming and systems research could have real-world impact on tools used by millions of developers.</p>",
      "url": "https://anil.recoil.org/notes/725dda70-b12b-4b1a-a8ae-fa9c22683ff2-1",
      "title": "Unikernels: the rise of the library hypervisor in MirageOS",
      "summary": "DockerCon talk on integrating MirageOS unikernels with Docker using library hypervisor approach for Docker Desktop.",
      "date_published": "2016-10-14T00:00:00.000000Z",
      "date_modified": "2016-10-14T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "unikernels",
        "docker",
        "hypervisor",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/mirageos-hack-retreat-2016",
      "content_html": "",
      "url": "https://anil.recoil.org/notes/mirageos-hack-retreat-2016",
      "external_url": "https://mirageos.org/blog/2016-summer-hackathon",
      "title": "MirageOS Summer 2016 hack retreat",
      "summary": "MirageOS holds summer 2016 hack retreat to advance its unikernel technology.",
      "date_published": "2016-06-29T00:00:00.000000Z",
      "date_modified": "2016-06-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/dbd7546a-95d8-40af-b286-3cf930767682-1",
      "content_html": "<p>I gave a talk at the <a href=\"https://functional.works-hub.com\">Functional Works</a> meetup, held at <a href=\"https://janestreet.com\">Jane Street London</a>, about how Docker for Mac and Windows use OCaml and unikernels <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">under the hood</a>.</p>",
      "url": "https://anil.recoil.org/notes/dbd7546a-95d8-40af-b286-3cf930767682-1",
      "title": "The functional innards of Docker for Mac and Windows",
      "summary": "Talk at Functional Works meetup discussing how Docker for Mac and Windows use OCaml and unikernels internally.",
      "date_published": "2016-06-24T00:00:00.000000Z",
      "date_modified": "2016-06-24T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "docker",
        "janestreet",
        "ocaml",
        "unikernels"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/8c92d6cf-3e05-429f-8c8e-094f77be61c6-1",
      "content_html": "<p>Interviewed by The New Stack at OSCON in Austin, Texas alongside Ian Eyberg and Joshua Bernstein. We discussed unikernels and Docker with Alex Williams, exploring how unikernel technology was maturing and the different approaches being taken by the community. The interview covered the acquisition of Unikernel Systems by Docker and what it meant for bringing unikernels to mainstream development workflows.</p>",
      "url": "https://anil.recoil.org/notes/8c92d6cf-3e05-429f-8c8e-094f77be61c6-1",
      "title": "Ian Eyberg, Joshua Bernstein, Anil Madhavapeddy at OSCON in Austin",
      "summary": "Interview with The New Stack about unikernels and Docker at OSCON in Austin, Texas.",
      "date_published": "2016-06-06T00:00:00.000000Z",
      "date_modified": "2016-06-06T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "docker",
        "interview",
        "mirageos",
        "containers"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2016-usenix-flick-1",
      "content_html": "<p>Paper on application-specific network services at USENIX ATC 2016, a collaboration across multiple universities including Cambridge, Imperial, UCL and Nottingham. FLICK provided a framework for developing network services like custom load balancers and middleboxes using high-level abstractions while achieving good performance. The system automatically translated FLICK programs to efficient parallel task graphs with bounded resource usage, enabling safe concurrent execution of multiple services. We demonstrated it with practical applications including an HTTP load balancer, Memcached router, and Hadoop data aggregator.</p>",
      "url": "https://anil.recoil.org/notes/2016-usenix-flick-1",
      "title": "FLICK: Developing and Running Application-Specific Network Services",
      "summary": "Paper on framework for developing and deploying application-specific network services at USENIX ATC 2016.",
      "date_published": "2016-06-01T00:00:00.000000Z",
      "date_modified": "2016-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "nfv",
        "middleboxes",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2016-usenix-flick.pdf",
          "mime_type": "application/pdf",
          "title": "FLICK: Developing and Running Application-Specific Network Services"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/vpnkit-hyperkit",
      "content_html": "<p>I announce the release of three big components that form the basis for <a href=\"https://docker.com\">Docker for Desktop</a>: a hypervisor framework called HyperKit, a networking framework for host translation called VPNKit, and a versioned data management store called DataKit.</p>",
      "url": "https://anil.recoil.org/notes/vpnkit-hyperkit",
      "external_url": "https://www.docker.com/blog/docker-unikernels-open-source/",
      "title": "Improving Docker with Unikernels",
      "summary": "Introducing HyperKit, VPNKit, and DataKit to enhance Docker performance.",
      "date_published": "2016-05-18T00:00:00.000000Z",
      "date_modified": "2016-05-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "ocamllabs",
        "ocaml",
        "docker",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2016-flops-cmeleon-1",
      "content_html": "<p>Paper on declarative approaches to foreign function bindings at FLOPS 2016 with Jeremy Yallop and David Sheets. The work addressed a fundamental problem with FFI systems: they typically tie the specification of foreign functions to a specific binding mechanism, making it hard to switch between dynamic binding and static code generation. Our approach using generic programming allowed developers to write FFI specifications once and then flexibly choose the binding strategy, demonstrating the power of abstraction in systems programming.</p><h1>References</h1><ul><li>Yallop et al (2016). Declarative Foreign Function Binding Through Generic Programming. Springer International Publishing. <a href=\"https://doi.org/10.1007/978-3-319-29604-3_13\" target=\"_blank\"><i>10.1007/978-3-319-29604-3_13</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2016-flops-cmeleon-1",
      "title": "Declarative Foreign Function Binding Through Generic Programming",
      "summary": "Paper on flexible declarative approaches to foreign function interfaces through generic programming at FLOPS 2016.",
      "date_published": "2016-02-01T00:00:00.000000Z",
      "date_modified": "2016-02-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ffi",
        "fp",
        "ocaml",
        "generic-programming",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2016-flops-cmeleon.pdf",
          "mime_type": "application/pdf",
          "title": "Declarative Foreign Function Binding Through Generic Programming"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/978-3-319-29604-3_13",
          "doi": "10.1007/978-3-319-29604-3_13",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/a612e810-d56c-48af-b43e-2893a96b9120-1",
      "content_html": "<p>Announced that Unikernel Systems is now part of Docker, marking a significant milestone for bringing unikernel technology to mainstream developers. This acquisition meant that our research on MirageOS and library operating systems would directly influence tools used by millions of developers worldwide. The team would work on integrating unikernels into Docker for Mac and Windows, demonstrating how academic systems research could have real commercial impact.</p>",
      "url": "https://anil.recoil.org/notes/a612e810-d56c-48af-b43e-2893a96b9120-1",
      "title": "Unikernel Systems is now part of Docker",
      "summary": "Announcement that Unikernel Systems has been acquired by Docker.",
      "date_published": "2016-01-21T00:00:00.000000Z",
      "date_modified": "2016-01-21T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "docker",
        "mirageos",
        "acquisition",
        "startup"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/docker-buys-unikernel-systems",
      "content_html": "<p>My startup <a href=\"/projects/unikernels\">Unikernel Systems</a> was acquired by <a href=\"https:/docker.com\">Docker</a>, and I'll\nbe joining and setting up a UK branch of Docker along with the rest of my team.</p>\n<blockquote>\n<p>'Just like we did with containers, we are interested is democratizing that technology, making it available and useful to the millions of developers and IT pros out there, said <a href=\"https://www.linkedin.com/in/solomonhykes\">Solomon Hykes</a>, founder and chief technology officer for Docker. 'Unikernels allow you to basically get rid of the operating system, and instead compile into the application the small bits of the operating system it really needs.'\n<cite>-- <a href=\"https://thenewstack.io/docker-buys-unikernel-systems-plans-bring-unikernels-data-center/\">The New Stack</a></cite></p>\n</blockquote>\n<p>You can also see an announcement from me explaining the background story:</p>\n<p><div class=\"video-center\"><iframe title=\"Unikernel Systems is now part of Docker\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/a612e810-d56c-48af-b43e-2893a96b9120\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>",
      "url": "https://anil.recoil.org/notes/docker-buys-unikernel-systems",
      "external_url": "https://thenewstack.io/docker-buys-unikernel-systems-plans-bring-unikernels-data-center/",
      "title": "Unikernel Systems acquired by Docker",
      "summary": "Docker acquires Unikernel Systems to bring unikernel tech to developers and IT pros.",
      "date_published": "2016-01-21T00:00:00.000000Z",
      "date_modified": "2016-01-21T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "docker",
        "startups",
        "unikernels",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2015-sosp-sibylfs-1",
      "content_html": "<p>Paper on formal specification and testing of filesystems at SOSP 2015, the premier operating systems conference. SibylFS with Tom Ridge, David Sheets, Peter Sewell and colleagues created the first comprehensive formal specification of POSIX filesystem behavior, then used it as a test oracle to discover numerous inconsistencies and bugs in real-world filesystems. The work demonstrated how executable specifications could bridge theory and practice, finding genuine bugs in production Linux and BSD filesystem implementations that had existed for years.</p><h1>References</h1><ul><li>Ridge et al (2015). SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems. ACM. <a href=\"https://doi.org/10.1145/2815400.2815411\" target=\"_blank\"><i>10.1145/2815400.2815411</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2015-sosp-sibylfs-1",
      "title": "SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems",
      "summary": "Paper on formal specification and testing of filesystems presented at SOSP 2015.",
      "date_published": "2015-10-01T00:00:00.000000Z",
      "date_modified": "2015-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "formal-methods",
        "filesystems",
        "testing",
        "verification"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2015-sosp-sibylfs.pdf",
          "mime_type": "application/pdf",
          "title": "SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2815400.2815411",
          "doi": "10.1145/2815400.2815411",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2015-aarhus-databox-1",
      "content_html": "<p>Paper on personal databoxes at the one-in-a-decade Aarhus conference on participatory design. This work with Hamed Haddadi, Heidi Howard, Richard Mortier and colleagues proposed that individuals should have direct control over their personal data through a personal networked service - the Databox. The paper argued for a fundamental shift away from centralized data silos toward personal data management infrastructure, presaging many of the concerns that would drive GDPR and the broader privacy movement.</p><h1>References</h1><ul><li>Chaudhry et al (2015). Personal Data: Thinking Inside the Box. <a href=\"https://doi.org/10.7146/aahcc.v1i1.21312\" target=\"_blank\"><i>10.7146/aahcc.v1i1.21312</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2015-aarhus-databox-1",
      "title": "Personal Data: Thinking Inside the Box",
      "summary": "Paper on personal databoxes presented at the decennial Aarhus conference.",
      "date_published": "2015-10-01T00:00:00.000000Z",
      "date_modified": "2015-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "privacy",
        "databox",
        "personal-data",
        "hci"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2015-aarhus-databox.pdf",
          "mime_type": "application/pdf",
          "title": "Personal Data: Thinking Inside the Box"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.7146/aahcc.v1i1.21312",
          "doi": "10.7146/aahcc.v1i1.21312",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/d5fbd6a4-bef2-4fbc-9d02-cb9935e50d8e-1",
      "content_html": "<p>Invited talk at NetPL on Immutable Distributed Infrastructure with Unikernels at NetPL 2015. The talk explored how combining unikernels with immutable infrastructure concepts through Irmin's distributed storage could revolutionize how we build cloud systems. I discussed how MirageOS and Irmin together provided version control and branch consistency for entire distributed systems, enabling new approaches to deployment, rollback, and system management.</p>",
      "url": "https://anil.recoil.org/notes/d5fbd6a4-bef2-4fbc-9d02-cb9935e50d8e-1",
      "title": "Immutable Distributed Infrastructure with Unikernels",
      "summary": "Invited talk at NetPL on immutable distributed infrastructure using unikernels.",
      "date_published": "2015-09-29T00:00:00.000000Z",
      "date_modified": "2015-09-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "distributed-systems",
        "infrastructure"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/35e1a70d-0fb4-49b1-86ce-dd6266b812de-1",
      "content_html": "<p>Update on the state of the OCaml Platform at the 2015 OCaml Workshop. This talk covered the evolution of the OCaml development toolchain as we continued building out the platform vision. By 2015, opam had become the standard package manager and we were working on improving the overall developer experience with better documentation tools, testing frameworks, and build systems. It's interesting to look back at these talks and see the progression - many of the tools we were prototyping in 2015 became production-ready in subsequent years and are now fundamental parts of how people write OCaml.</p>",
      "url": "https://anil.recoil.org/notes/35e1a70d-0fb4-49b1-86ce-dd6266b812de-1",
      "title": "The State of the OCaml Platform",
      "summary": "Update on the state of the OCaml Platform development environment and toolchain.",
      "date_published": "2015-09-04T00:00:00.000000Z",
      "date_modified": "2015-09-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling",
        "devtools",
        "community"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2015-usenixsec-nqsb-1",
      "content_html": "<p>Paper on rebuilding TLS securely but practically at USENIX Security 2015, presented with David Kaloper-Mersinjak, Hannes Mehnert and Peter Sewell. Not-Quite-So-Broken TLS was a from-scratch implementation in OCaml that served dual roles: both an executable specification for testing other TLS implementations and a production-ready library. By using a memory-safe language and modular design, we excluded entire classes of security vulnerabilities by construction. The implementation achieved reasonable performance (73-84% of OpenSSL throughput) despite the safety guarantees, and could be compiled into tiny Xen unikernels with a 4% TCB compared to Linux/OpenSSL stacks.</p>",
      "url": "https://anil.recoil.org/notes/2015-usenixsec-nqsb-1",
      "title": "Not-Quite-So-Broken TLS",
      "summary": "Paper on rebuilding TLS securely but practically presented at USENIX Security 2015.",
      "date_published": "2015-08-01T00:00:00.000000Z",
      "date_modified": "2015-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "tls",
        "cryptography",
        "ocaml"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2015-usenixsec-nqsb.pdf",
          "mime_type": "application/pdf",
          "title": "Not-Quite-So-Broken TLS"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/55852136-843d-4043-98e7-6b46c6d39b01-1",
      "content_html": "<p>Talk at Esper on functional programming with unikernels to a packed room at their OCaml meetup in California. I covered how MirageOS enables building functional infrastructure by compiling OCaml applications directly into specialized unikernels. The talk demonstrated the integration possibilities with Docker and explored how Irmin's distributed storage capabilities could revolutionize how we build cloud services.</p>",
      "url": "https://anil.recoil.org/notes/55852136-843d-4043-98e7-6b46c6d39b01-1",
      "title": "Unikernels: Functional Infrastructure with Mirage OS",
      "summary": "Talk at Esper on functional programming with unikernels using MirageOS.",
      "date_published": "2015-05-12T00:00:00.000000Z",
      "date_modified": "2015-05-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "unikernels",
        "fp",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/ad4658f5-ca4f-42f3-b61a-58f13dcdeb1a-1",
      "content_html": "<p>NSDI 2015 talk on Jitsu at USENIX NSDI in Oakland, presenting our system for just-in-time summoning of unikernels. Jitsu addressed network latency by rapidly instantiating local services near users on resource-constrained ARM devices. The system used DNS-triggered launches to boot unikernels in milliseconds, masking boot latency through clever shared memory channels. We demonstrated that lightweight, memory-safe unikernels could provide secure multi-tenant isolation on embedded devices while being highly responsive and power-efficient.</p>",
      "url": "https://anil.recoil.org/notes/ad4658f5-ca4f-42f3-b61a-58f13dcdeb1a-1",
      "title": "Jitsu: Just-In-Time Summoning of Unikernels",
      "summary": "NSDI 2015 talk on Jitsu system for on-demand unikernel instantiation.",
      "date_published": "2015-05-04T00:00:00.000000Z",
      "date_modified": "2015-05-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "systems",
        "xen"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2015-diynet-kadupul-1",
      "content_html": "<p>Workshop paper on DIY networking using timelock puzzles, presented at the DIY Networking workshop with Magnus Skjegstad and Jon Crowcroft. Kadupul explored using virtual currencies like Bitcoin to incentivize heterogeneous multihop mesh networking. The key insight was using time-locked puzzles where forwarding nodes compete to solve cryptographic challenges, with the fastest delivery claiming a reward. This created a natural surge pricing model during congestion and could enable low-latency services like video streaming and AR on mesh networks.</p><h1>References</h1><ul><li>Skjegstad et al (2015). Kadupul: Livin' on the Edge with Virtual Currencies and Time-Locked Puzzles. ACM. <a href=\"https://doi.org/10.1145/2753488.2753492\" target=\"_blank\"><i>10.1145/2753488.2753492</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2015-diynet-kadupul-1",
      "title": "Kadupul: Livin' on the Edge with Virtual Currencies and Time-Locked Puzzles",
      "summary": "Workshop paper on using virtual currencies and time-locked puzzles to incentivize heterogeneous multihop mesh networking.",
      "date_published": "2015-05-01T00:00:00.000000Z",
      "date_modified": "2015-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "mesh",
        "crypto",
        "puzzles",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2015-diynet-kadupul.pdf",
          "mime_type": "application/pdf",
          "title": "Kadupul: Livin' on the Edge with Virtual Currencies and Time-Locked Puzzles"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2753488.2753492",
          "doi": "10.1145/2753488.2753492",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2015-nsdi-jitsu-1",
      "content_html": "<p>Paper on spinning up low-latency unikernels per-connection at NSDI 2015, one of the top systems conferences. Jitsu demonstrated that MirageOS unikernels could boot fast enough (milliseconds) to be instantiated on-demand in response to network traffic, masking boot latency through DNS tricks. Working with Thomas Leonard, Magnus Skjegstad and a large team, we showed this was practical even on resource-constrained ARM devices, enabling secure multi-tenant edge computing years before it became mainstream. The paper won acclaim for demonstrating compelling performance while maintaining type-1 hypervisor isolation guarantees.</p>",
      "url": "https://anil.recoil.org/notes/2015-nsdi-jitsu-1",
      "title": "Jitsu: Just-In-Time Summoning of Unikernels",
      "summary": "Paper on low-latency unikernel spawning per-connection presented at NSDI 2015.",
      "date_published": "2015-05-01T00:00:00.000000Z",
      "date_modified": "2015-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "networking",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2015-nsdi-jitsu.pdf",
          "mime_type": "application/pdf",
          "title": "Jitsu: Just-In-Time Summoning of Unikernels"
        }
      ],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/hg4a4-m0a36",
      "content_html": "<p>The <a href=\"/projects/ocamllabs\">OCaml Labs</a> initiative within the <a href=\"http://www.cl.cam.ac.uk\">Cambridge\nComputer Laboratory</a> is now just over two years\nold, and it is time for an update about our activities since the last\nupdate at the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html#Dec%202013\">end of\n2013</a>\nand\n<a href=\"/2012/10/19/announcing-ocaml-labs.html\">2012</a>.</p>\n<p>The theme of our group was not to be pure research, but rather a hybrid\ngroup that takes on some of the load of day-to-day OCaml maintenance\nfrom <a href=\"http://caml.inria.fr/\">INRIA</a>, as well as help grow the wider\ncommunity and meet our own research agendas around topics such as\n<a href=\"https://queue.acm.org/detail.cfm?id=2566628\">unikernels</a>. To this end,\nall of our projects have been highly collaborative, often involving\ncolleagues from <a href=\"http://ocamlpro.com\">OCamlPro</a>,\n<a href=\"http://caml.inria.fr/\">INRIA</a>, <a href=\"http://janestreet.com\">Jane Street</a>,\n<a href=\"http://lexifi.com\">Lexifi</a> and <a href=\"http://citrix.com\">Citrix</a>.</p>\n<p>This post covers our progress in tooling, the compiler and language,\ncommunity efforts, research projects and concludes with our priorities\nfor 2015.</p>\n<h2 id=\"r-tooling\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#r-tooling\"></a><img src=\"/images/toru-cucl-window.webp\" alt=\"%r\" title=\"OCaml: it's a dog's life. In this case, Toru the dog.\" >\nTooling</h2>\n<p>At the start of 2014, we had just helped to release <a href=\"http://opam.ocaml.org/blog/opam-1-1-1-released/\">OPAM\n1.1.1</a> with our\ncolleagues at <a href=\"http://ocamlpro.com\">OCamlPro</a>, and serious OCaml users\nhad just started moving over to using it.</p>\n<p>Our overall goal at OCaml Labs is to deliver a modular set of of\ndevelopment tools around OCaml that we dub the <em>OCaml Platform</em>. 
The\nremainder of 2014 was thus spent polishing this nascent OPAM release\ninto a solid base (both as a command-line tool and as a library) that we\ncould use as the basis for documentation, testing and build\ninfrastructure, all the while making sure that bigger OCaml projects\ncontinued to migrate over to it. Things have been busy; here are the\nhighlights of this effort.</p>\n<h3 id=\"opam\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#opam\"></a>OPAM</h3>\n<p>The central <a href=\"https://github.com/ocaml/opam-repository\">OPAM repository</a>\nthat contains the package descriptions has grown tremendously in 2014,\nwith over 280 contributors committing almost 10000 changesets across\n3800 <a href=\"https://github.com/ocaml/opam-repository/pulls\">pull requests</a> on\nGitHub. The front line of incoming testing has been continuous\nintegration by the wonderful <a href=\"http://travis-ci.org/ocaml/opam-repository\">Travis\nCI</a>, who also granted us\naccess to their experimental <a href=\"http://docs.travis-ci.com/user/osx-ci-environment/\">MacOS\nX</a> build pool. The\nOPAM package team also expanded to give David Sheets, Jeremy Yallop,\nPeter Zotov and Damien Doligez commit rights, and they have all been\nbusily triaging new packages as they come in.</p>\n<p>Several large projects such as <a href=\"http://xapi-project.github.io/\">Xapi</a>,\n<a href=\"http://ocsigen.org\">Ocsigen</a> and our own\n<a href=\"http://openmirage.org\">MirageOS</a> switched over to using OPAM for\nday-to-day development, as well as prolific individual developers such\nas <a href=\"http://erratique.ch\">Daniel Buenzli</a> and <a href=\"http://ocaml.info/\">Markus\nMottl</a>. 
<a href=\"https://blogs.janestreet.com/category/ocaml/\">Jane\nStreet</a> continued to send\nregular <a href=\"https://github.com/ocaml/opam-repository/pulls?utf8=%E2%9C%93&amp;q=is%3Apr+author%3Adiml+\">monthly\nupdates</a>\nof their Core/Async suite, and releases appeared from the\n<a href=\"https://github.com/ocaml/opam-repository/pull/3570\">Facebook</a>\nopen-source team as well (who develop\n<a href=\"https://code.facebook.com/posts/264544830379293/hack-a-new-programming-language-for-hhvm/\">Hack</a>,\n<a href=\"https://github.com/facebook/flow\">Flow</a> and\n<a href=\"https://github.com/facebook/pfff\">Pfff</a> in OCaml).</p>\n<ul>\n<li>Gallery\n<img src=\"/images/opam12-contributors-mar14.webp\" alt=\"%r\" title=\"Number of unique contributors to the central OPAM package repository\" >\n<img src=\"/images/opam12-packages-mar14.webp\" alt=\"%r\" title=\"Total number of unique packages (including multiple versions of the same package)\" >\n<img src=\"/images/opam12-unique-packages-mar14.webp\" alt=\"%r\" title=\"Total packages with multiple versions coalesced so you can see new package growth\" ></li>\n</ul>\n<p>We used feedback from the users to smooth away many of the rough edges,\nwith:</p>\n<ul>\n<li>a redesigned <a href=\"http://opam.ocaml.org/blog/opam-1-2-pin/\">development workflow</a> that lets developers quickly grab a development version of a library, recompile all dependent packages automatically, and quickly publish results to GitHub.</li>\n<li>binary distributions for common OS distributions via their <a href=\"https://github.com/ocaml/opam/wiki/Distributions\">native packaging</a>, as well as <a href=\"http://opam.ocaml.org/blog/0install-intro/\">0install</a> and <a href=\"https://github.com/mirage/mirage-vagrant-vms\">Vagrant boxes</a>.</li>\n<li>a unified way of cloning the source of any package via <code>opam source</code>. 
This handles any supported OPAM archive, including Git, Mercurial or Darcs remotes.</li>\n<li>richer package metadata, including source code, development archives and bug report URLs.</li>\n</ul>\n<p>These changes were all incorporated into <a href=\"http://opam.ocaml.org/blog/opam-1-2-0-release/\">OPAM 1.2</a>, along with backwards compatibility shims to keep the old 1.1 metadata format working until the migration is complete. The 1.2.x series has been a solid and usable development manager, and last week’s release of <a href=\"http://opam.ocaml.org/blog/opam-1-2-1-release/\">OPAM 1.2.1</a> has further polished the core scripting engine.</p>\n<h4 id=\"platform-blog\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#platform-blog\"></a>Platform Blog</h4>\n<p>One of the more notable developments during 2014 was the <a href=\"http://coq-blog.clarus.me/use-opam-for-coq.html\">adoption of\nOPAM</a> further up the\necosystem by the <a href=\"https://coq.inria.fr/\">Coq</a> theorem prover. This\nbroadening of the community prompted us to create an <a href=\"http://opam.ocaml.org\">official OPAM\nblog</a> to give us a central place for news and\ntips, and we’ve had posts about\n<a href=\"http://opam.ocaml.org/blog/opam-in-xenserver/\">XenServer</a> developments,\nthe <a href=\"http://opam.ocaml.org/blog/turn-your-editor-into-an-ocaml-ide/\">Merlin IDE\ntool</a>\nand the modern <a href=\"http://opam.ocaml.org/blog/about-utop/\">UTop</a>\ninteractive REPL. If you are using OPAM in an interesting or production\ncapacity, please do <a href=\"https://github.com/ocaml/platform-blog/issues\">get in\ntouch</a> so that we can\nwork with you to write about it for the wider community.</p>\n<p>The goal of the blog is also to start bringing together the various\ncomponents that form the OCaml Platform. These are designed to be\nmodular tools (so that you can pick and choose which ones are necessary\nfor your particular use of OCaml). 
There are more details available from\nthe OCaml Workshop presentation at ICFP 2014\n(<a href=\"https://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">abstract</a>,\n<a href=\"https://ocaml.org/meetings/ocaml/2014/ocl-platform-2014-slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=jxhtpQ5nJHg&amp;list=UUP9g4dLR7xt6KzCYntNqYcw\">video</a>).</p>\n<h4 id=\"onboarding-new-users\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#onboarding-new-users\"></a>Onboarding New Users</h4>\n<p>OPAM has also been adopted now by <a href=\"http://harvard.edu\">several</a>\n<a href=\"http://cornell.edu\">big</a> <a href=\"http://princeton.edu\">universities</a>\n(including <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/\">us at\nCambridge</a>!) for\nundergraduate and graduate Computer Science courses. The demands\nincreased for an out-of-the-box solution that makes it as easy as possible\nfor new users to get started with minimum hassle. We created a\n<a href=\"http://lists.ocaml.org/listinfo/teaching\">dedicated teaching list</a> to\naid collaboration, and a list of <a href=\"http://ocaml.org/learn/teaching-ocaml.html\">teaching resources on\nocaml.org</a> and supported\nseveral initiatives (in collaboration with <a href=\"https://github.com/AltGr\">Louis\nGesbert</a> at OCamlPro, as usual with OPAM\ndevelopment).</p>\n<p>The easiest way to make things &quot;just work&quot; is via regular binary builds\nof the latest releases of OCaml and OPAM on Debian, Ubuntu, CentOS and\nFedora, via <a href=\"http://launchpad.net/~avsm\">Ubuntu PPAs</a> and the <a href=\"https://build.opensuse.org/package/show/home:ocaml/opam\">OpenSUSE\nBuild Service</a>\nrepositories. 
Our industrial collaborators from Citrix, <a href=\"http://jon.recoil.org\">Jon\nLudlam</a> and <a href=\"http://dave.recoil.org\">Dave Scott</a>\nbegan an <a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-January/000910.html\">upstreaming\ninitiative</a>\nto Fedora and sponsored the creation of a <a href=\"http://lists.centos.org/pipermail/centos-devel/2014-November/012375.html\">CentOS\nSIG</a>\nto ensure that binary packages remain up-to-date. We also contribute to\nthe hardworking packagers on MacOS X, Debian, FreeBSD, NetBSD and\nOpenBSD where possible as well to ensure that binary builds are well\nrounded out. Richard Mortier also assembled <a href=\"https://github.com/mirage/mirage-vagrant-vms\">Vagrant\nboxes</a> that contain OCaml,\nfor use with VirtualBox.</p>\n<ul>\n<li>Gallery il\n<img src=\"/images/opam-in-nice.webp\" alt=\"%r\" title=\"Louis cooks us dinner in Nice at our OPAM developer summit\" ></li>\n</ul>\n<p>Within OPAM itself, we applied polish to the handling of <a href=\"https://github.com/ocaml/opam-depext\">external\ndependencies</a> to automate checking\nthat the system libraries required by OPAM are present. Two emerging\ntools that should help further in 2015 are the\n<a href=\"https://github.com/OCamlPro/opam-user-setup\">opam-user-setup</a> and\n<a href=\"https://github.com/ocaml/opam/issues/1035\">OPAM-in-a-box</a> plugins that\nautomate first-time configuration. These last two are primarily\ndeveloped at OCamlPro, with design input and support from OCaml Labs.</p>\n<p>We do have a lot of work left to do with making the new user experience\nreally seamless, and help is <em>very</em> welcome from anyone who is\ninterested. It often helps to get the perspective of a newcomer to find\nout where the stumbling blocks are, and we value any such advice. 
Just\nmail <a href=\"mailto:opam-devel@lists.ocaml.org\">opam-devel@lists.ocaml.org</a>\nwith your thoughts, or <a href=\"https://github.com/ocaml/opam/issues\">create an\nissue</a> on how we can improve. A\nparticularly good example of such an initiative was started by Jordan\nWalke, who prototyped <a href=\"https://github.com/jordwalke/CommonML\">CommonML</a>\nwith a NodeJS-style development workflow, and <a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-February/000975.html\">wrote\nup</a>\nhis design document for the mailing list. (Your questions or ideas do\nnot need to be as well developed as Jordan’s prototype!)</p>\n<h3 id=\"testing-packages\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#testing-packages\"></a>Testing Packages</h3>\n<p>The public Travis CI testing does come with some limitations, since it\nonly checks that the latest package sets install, but not if any\ntransitive dependencies fail due to interface changes. It also doesn’t\ntest all the optional dependency combinations due to the 50 minute time\nlimit.</p>\n<p><img src=\"/images/travis-mascot-200px.webp\" alt=\"%r\" ></p>\n<p>We expanded the OPAM repository testing in several ways to get around\nthis:</p>\n<ul>\n<li>\n<p><strong>Individual Repositories:</strong> Thomas Gazagnaire built <a href=\"http://opam.ocaml.org/blog/opam-1-2-travisci/\">centralised\nTravis scripts</a> that\ncan be used on any OCaml GitHub repository to easily test code\nbefore it is released into OPAM. These scripts are sourced from a\ncentral\n<a href=\"https://github.com/ocaml/ocaml-travisci-skeleton\">repository</a> and\nsupport external, optional and reverse dependency checking across\nmultiple revisions of the compiler. 
For instance, it just needs <a href=\"https://github.com/mirage/ocaml-cohttp/blob/master/.travis.yml\">one\nfile</a>\nto test all the supported permutations of the\n<a href=\"https://github.com/mirage/ocaml-cohttp\">CoHTTP</a> library.</p>\n</li>\n<li>\n<p><strong>Bulk Builds</strong>: Damien Doligez and I independently started doing\nlarge-scale bulk builds of the repository to ensure that a single\nsnapshot of the package repository can automatically build as many\npackages as possible. My implementation used the\n<a href=\"http://docker.com\">Docker</a> container manager to spawn off 1000s of\npackage builds in parallel and commit the results into a filesystem.\nThis required building a <a href=\"http://avsm.github.io/ocaml-dockerfile\">Dockerfile\neDSL</a>, and the results are\nnow online at\n<a href=\"https://opam.ocaml.org/builds\">https://opam.ocaml.org/builds</a>.</p>\n</li>\n<li>\n<p><strong>OCamlot</strong>: An ongoing piece of infrastructure work is to take the\nbulk build logs (which are around 7GB per daily run), and to store\nand render them using our <a href=\"http://irmin.io\">Irmin</a> Git store. Expect\nto see more around this soon; it has the awesome feature of letting\nany developer clone the build logs for their project locally, to\nmake triage of foreign operating systems as simple as possible.</p>\n</li>\n</ul>\n<h4 id=\"language-evolution\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#language-evolution\"></a>Language Evolution</h4>\n<p>This ability to do unattended builds of the package repository has also\nimproved the decision-making process within the core compiler team.\nSince we now have a large (3000+ package) corpus of OCaml code, it\nbecame a regular occurrence in the 4.02 development cycle to “<a href=\"/2014/04/08/grepping-every-known-ocaml-package-source.html\">ask\nOPAM</a>”\nwhether a particular feature or new syntax would break any existing\ncode. 
This in turn provides an incentive for commercial users to provide\nrepresentative samples of their code; for instance, the Jane Street Core\nreleases in OPAM (with their very modular style) act as an open-source\ncanary without needing access to any closed source code.</p>\n<p>One good example in 2014 was the decoupling of the\n<a href=\"http://en.wikipedia.org/wiki/Camlp4\">Camlp4</a> macro preprocessor from\nthe main OCaml distribution. Since Camlp4 has been used for over a\ndecade and there are some very commonly used syntax extensions such as\n<a href=\"https://github.com/janestreet/type_conv\">type_conv</a>, a simple removal\nwould break a lot of packages. We used OPAM to perform a gradual\nmovement that most users hopefully never noticed by the time OCaml 4.02\nwas released. First, we added a <a href=\"https://github.com/ocaml/opam-repository/pull/2558\">dummy\npackage</a> in OPAM for\nearlier versions of the compiler that had Camlp4 built-in, and then used\nthe OPAM constraint engine to compile it as an external tool for the\nnewer compiler revisions. Then we just had to triage the bulk build logs\nto find build failures from packages that were missing a Camlp4\ndependency, and <a href=\"https://github.com/ocaml/opam-repository/pulls?utf8=%E2%9C%93&amp;q=camlp4+requires+is%3Apr+\">add\nthem</a>\nto the package metadata.</p>\n<h4 id=\"github-integration\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#github-integration\"></a>GitHub Integration</h4>\n<p>An interesting\n<a href=\"https://twitter.com/vincenthz/status/563108158907097089\">comment</a> from\nVincent Hanquez about OPAM is that &quot;OCaml's OPAM is a post-GitHub\ndesign&quot;. This is very true, as much of the workflow for pinning <code>git://</code>\nURLs emerged out of being early adopters of GitHub for hosting the\nMirageOS. 
OCaml Labs supported two pieces of infrastructure integration\naround GitHub in 2014:</p>\n<ul>\n<li>\n<p>OPAM has a compiler switch feature that lets you run simultaneous\nOCaml installations and swap between them easily. I used my <a href=\"https://github.com/avsm/ocaml-github\">GitHub\nAPI bindings</a> to regularly\nconvert every GitHub pull request into a custom compiler\nswitch (see <a href=\"/notes/ocaml-github-and-opam\">Easily OPAM switching to any OCaml feature request</a>).\nThis lets users reporting bugs try out a patched compiler almost\nimmediately upon a fix becoming available.</p>\n</li>\n<li>\n<p>The motivation behind this feature was our collaborator Gabriel\nScherer’s\n<a href=\"http://gallium.inria.fr/blog/patch-review-on-github/\">experiment</a>\nto enable patch review of OCaml on GitHub, alongside the venerable\n<a href=\"http://caml.inria.fr/mantis/view_all_bug_page.php\">Mantis bug\ntracker</a>. We\nsupported this via adding Travis CI support to the main compiler,\nand also helped to migrate a number of support libraries to GitHub,\nsuch as <a href=\"https://github.com/ocaml/camlp4\">camlp4</a>. These can all be\nfound on the <a href=\"https://github.com/ocaml\">ocaml</a> organisation on\nGitHub.</p>\n</li>\n</ul>\n<h3 id=\"codoc-documentation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#codoc-documentation\"></a>Codoc Documentation</h3>\n<p>Leo White, David Sheets, Amir Chaudhry and Thomas Gazagnaire led the\ncharge to build a modern documentation generator for OCaml, and\n<a href=\"http://lists.ocaml.org/pipermail/platform/2015-February/000539.html\">published</a>\nan <em>alpha</em> version of <a href=\"https://github.com/dsheets/codoc\">codoc 0.2.0</a>\nafter a lot of work throughout 2014. 
In the 2014 OCaml workshop\npresentation\n(<a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">abstract</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2014/ocl-platform-2014-slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=jxhtpQ5nJHg&amp;list=UUP9g4dLR7xt6KzCYntNqYcw\">video</a>),\nwe mentioned the “module wall” for documentation and this attempts to\nfix it. To try it out, simply follow the directions in the README on\nthat repository, or <a href=\"http://dsheets.github.io/codoc\">browse some\nsamples</a> of the current, default output\nof the tool. Please do bear in mind codoc and its constituent libraries\nare still under heavy development and are <em>not</em> feature complete, but\nwe’re gathering <a href=\"https://github.com/dsheets/codoc/issues\">feedback</a> from\nearly adopters.</p>\n<p>codoc's aim is to provide a widely useful set of tools for generating\nOCaml documentation. In particular, we are striving to:</p>\n<ol>\n<li>Cover all of OCaml’s language features</li>\n<li>Provide accurate name resolution and linking</li>\n<li>Support cross-linking between different packages</li>\n<li>Expose interfaces to the components we’ve used to build <code>codoc</code></li>\n<li>Provide a magic-free command-line interface to the tool itself</li>\n<li>Reduce external dependencies and default integration with other\ntools</li>\n</ol>\n<p>We haven’t yet achieved all of these at all levels of our tool stack but\nare getting close, and the patches are all under discussion for\nintegration into the mainstream OCaml compiler. 
<code>codoc</code> 0.2.0 is usable\ntoday (if a little rough in some areas like default CSS), and there is a\n<a href=\"http://opam.ocaml.org/blog/codoc-0-2-0-released/\">blog post</a> that\noutlines the architecture of the new system to make it easier to\nunderstand the design decisions that went into it.</p>\n<h3 id=\"community-governance\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#community-governance\"></a>Community Governance</h3>\n<p>As the amount of infrastructure built around the\n<a href=\"http://ocaml.org\">ocaml.org</a> domain grows (e.g. mailing lists, file\nhosting, bulk building), it is important to establish a governance\nframework to ensure that it is used in the way that best serves the wider\nOCaml community.</p>\n<p>Amir Chaudhry took a good look at how other language communities\norganise themselves, and began putting together a succinct <a href=\"http://amirchaudhry.com/towards-governance-framework-for-ocamlorg/\">governance\nframework</a>\nto capture how the community around <code>ocaml.org</code> operates, and how to\nquickly resolve any conflicts that may arise in the future. He took care\nto ensure it has a well-defined scope, is simple and self-contained, and\n(crucially) documents the current reality. 
The result of this work is\ncirculating privately through all the existing volunteers for a first\nround of feedback, and will go live in the next few months as a living\ndocument that explains how our community operates.</p>\n<h3 id=\"assemblage\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#assemblage\"></a>Assemblage</h3>\n<p>One consequence of OCaml’s age (close to twenty years old now) is that\nthe tools built around the compiler have evolved fairly independently.\nWhile OPAM now handles the high-level package management, there is quite\na complex ecosystem of other components that are difficult for new users\nto get to grips with: <a href=\"http://github.com/ocaml/oasis\">OASIS</a>,\n<a href=\"http://projects.camlcity.org/projects/findlib.html\">ocamlfind</a>,\n<a href=\"https://ocaml.org/learn/tutorials/ocamlbuild/\">ocamlbuild</a>, and\n<a href=\"https://github.com/the-lambda-church/merlin\">Merlin</a> to name a few.\nEach of these components (while individually stable) has its own\nmetadata and namespace formats, further compounding the lack of cohesion\nof the tools.</p>\n<p>Thomas Gazagnaire and Daniel Buenzli embarked on an effort to build an\neDSL that unifies OCaml package descriptions, with the short-term aim of\ngenerating the support files required by the various support tools, and\nthe long-term goal of being the integration point for the build, test\nand documentation generation lifecycle of an OCaml/OPAM package. This\nprototype, dubbed <a href=\"https://github.com/samoht/assemblage\">Assemblage</a>, has\ngone through several iterations and <a href=\"https://github.com/samoht/assemblage/labels/design\">design\ndiscussions</a> over\nthe summer of 2014. 
Daniel has since been splitting out portions of it\ninto the <a href=\"http://erratique.ch/software/bos\">Bos</a> OS interaction library.</p>\n<p>Assemblage is not released officially yet, but we are committed to\nresuming work on it this summer when Daniel visits again, with the\nintention of unifying much of our workflow through this tool. If you are\ninterested in build and packaging systems, now is the time to <a href=\"https://github.com/samoht/assemblage\">make your\nopinion known</a>!</p>\n<h2 id=\"core-compiler\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#core-compiler\"></a>Core Compiler</h2>\n<p>We also spent time in 2014 working on the core OCaml language and\ncompiler, with our work primarily led by Jeremy Yallop and Leo White.\nThese efforts were not looking to make any radical changes in the core\nlanguage; instead, we generally opted for evolutionary changes that\neither polish rough edges in the language (such as open type and handler\ncases), or add new features that fit into the ML style of building programs.</p>\n<h3 id=\"new-features-in-4020\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#new-features-in-4020\"></a>New Features in 4.02.0</h3>\n<p>The OCaml 4.02 series was primarily developed and\n<a href=\"https://ocaml.org/releases/4.02.html\">released</a> in 2014. The\n<a href=\"http://caml.inria.fr/pub/distrib/ocaml-4.02/notes/Changes\">ChangeLog</a>\ngenerated much <a href=\"https://blogs.janestreet.com/ocaml-4-02-everything-else/\">user\nexcitement</a>,\nand we were also pleased to have contributed several language\nimprovements.</p>\n<h4 id=\"handler-cases-and-exceptional-syntax\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#handler-cases-and-exceptional-syntax\"></a>Handler Cases and exceptional syntax</h4>\n<p>OCaml’s <code>try</code> and <code>match</code> constructs are good at dealing with exceptions\nand values respectively, but neither construct can handle both values\nand exceptions. 
Jeremy Yallop investigated <a href=\"http://ocamllabs.github.io/compiler-hacking/2014/02/04/handler-case.html#match-exception\">how to handle\nsuccess</a>\nmore elegantly, and an elegant unified syntax emerged. A simple example\nis that of a stream iterator that uses exceptions for control flow:</p>\n<pre><code>let rec iter_stream f s =\n  match (try Some (MyStream.get s) with End_of_stream -&gt; None) with\n  | None -&gt; ()\n  | Some (x, s') -&gt; f x; iter_stream f s'\n</code></pre>\n<p>This code is not only verbose, but it also has to allocate an <code>option</code>\nvalue to ensure that the <code>iter_stream</code> call remains tail recursive. The\nnew syntax in OCaml 4.02 allows the above to be rewritten succinctly:</p>\n<pre><code>let rec iter_stream f s =\n  match MyStream.get s with\n  | (x, s') -&gt; f x; iter_stream f s'\n  | exception End_of_stream -&gt; ()\n</code></pre>\n<p>Read more about the background of this feature in Jeremy’s <a href=\"http://ocamllabs.github.io/compiler-hacking/2014/02/04/handler-case.html#match-exception\">blog\npost</a>,\nthe associated discussion in the <a href=\"http://caml.inria.fr/mantis/view.php?id=6318\">upstream Mantis\nbug</a>, and the final\n<a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#sec245\">manual\npage</a> in\nthe OCaml 4.02 release. For an example of its use in a real library, see\nthe Jane Street\n<a href=\"https://github.com/janestreet/sexplib/blob/1bd69553/lib/conv.ml#L213-L215\">usage</a>\nin the <a href=\"https://github.com/janestreet/sexplib\">s-expression</a> handling\nlibrary (which they use widely to reify arbitrary OCaml values and\nexceptions).</p>\n<h4 id=\"open-extensible-types\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#open-extensible-types\"></a>Open Extensible Types</h4>\n<p>A long-standing trick to build <a href=\"https://blogs.janestreet.com/rethinking-univ/\">universal\ncontainers</a> in OCaml has\nbeen to encode them using the exception <code>exn</code> type. 
There is a similar\nconcept of a <a href=\"http://mlton.org/UniversalType\">universal type</a> in\nStandard ML, and they were described in the “<a href=\"http://www.andres-loeh.de/OpenDatatypes.pdf\">Open Data Types and Open\nFunctions</a>” paper by Andres\nLöh and Ralf Hinze in 2006.</p>\n<p>Leo White designed, implemented and upstreamed support for <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#sec246\">extensible\nvariant\ntypes</a> in\nOCaml 4.02. Extensible variant types are variant types that can be\nextended with new variant constructors. They can be defined as follows:</p>\n<pre><code>type attr = ..\n\ntype attr += Str of string\n\ntype attr +=\n  | Int of int\n  | Float of float\n</code></pre>\n<p>Pattern matching on an extensible variant type requires a default case\nto handle unknown variant constructors, just as is required for pattern\nmatching on exceptions (extensible types use the exception memory\nrepresentation at runtime).</p>\n<p>With this feature added, the OCaml <code>exn</code> type simply becomes a special\ncase of open extensible types. Exception constructors can be declared\nusing the type extension syntax:</p>\n<pre><code>    type exn += Exc of int\n</code></pre>\n<p>You can read more about the discussion behind open extensible types in\nthe upstream <a href=\"http://caml.inria.fr/mantis/view.php?id=5584\">Mantis bug</a>.\nIf you’d like to see another example of their use, they have been\nadopted by the latest releases of the Jane Street Core libraries in the\n<a href=\"https://github.com/janestreet/core_kernel/blob/43ee3eef/lib/type_equal.ml#L64\">Type_equal</a>\nmodule.</p>\n<h3 id=\"modular-implicits\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#modular-implicits\"></a>Modular Implicits</h3>\n<p>A common criticism of OCaml is its lack of support for ad-hoc\npolymorphism. The classic example of this is OCaml’s separate addition\noperators for integers (<code>+</code>) and floating-point numbers (<code>+.</code>). 
Another\nexample is the need for type-specific printing functions (<code>print_int</code>,\n<code>print_string</code>, etc.) rather than a single <code>print</code> function which works\nacross multiple types.</p>\n<p>Taking inspiration from Scala’s\n<a href=\"http://docs.scala-lang.org/tutorials/tour/implicit-parameters.html\">implicits</a>\nand <a href=\"http://www.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf\">Modular Type\nClasses</a> by\nDreyer <em>et al.</em>, Leo White designed a system for ad-hoc polymorphism in\nOCaml based on using modules as type-directed implicit parameters. The\ndesign not only supports implicit modules, but also implicit functors\n(that is, modules parameterised by other module types) to permit the\nexpression of generic modular implicits in exactly the same way that\nfunctors are used to build abstract data structures.</p>\n<p>Frederic Bour joined us as a summer intern and dove straight into the\nimplementation, resulting in an <a href=\"http://andrewray.github.io/iocamljs/modimp_show.html\">online\ndemo</a> and ML\nWorkshop presentation\n(<a href=\"https://sites.google.com/site/mlworkshoppe/modular-implicits.pdf?attredirects=0\">abstract</a>,\n<a href=\"https://www.youtube.com/watch?v=3wVUXTd4WNc\">video</a> and\n<a href=\"http://www.lpw25.net/ml2014.pdf\">paper</a>). Another innovation in how\nwe’ve been trialling this feature is the use of Andy Ray’s\n<a href=\"https://andrewray.github.io/iocamljs/\">IOCamlJS</a> to publish an\ninteractive, online notebook that is fully hosted in the browser. You\ncan follow the examples of modular implicits\n<a href=\"https://andrewray.github.io/iocamljs/modimp_show.html\">online</a>, or try\nthem out on your own computer via an OPAM switch:</p>\n<pre><code>opam switch 4.02.0+modular-implicits\neval `opam config env`\nopam install utop \nutop\n</code></pre>\n<p>Some of the early feedback on modular implicits from industrial users\nwas interesting. 
Jane Street commented that although this would be a big\nusability leap, it would be dangerous to lose control over exactly what\ngoes into the implicit environment (i.e. the programmer should always\nknow what <code>(a + b)</code> represents by locally reasoning about the code). The\ncurrent design thus follows the ML discipline of maintaining explicit\ncontrol over the namespace, with any ambiguities in resolving an\nimplicit module type resulting in a type error.</p>\n<h3 id=\"multicore\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#multicore\"></a>Multicore</h3>\n<p>In addition to ad-hoc polymorphism, support for parallel execution on\nmulticore CPUs is undoubtedly the most common feature request for OCaml.\nThis has been high on our list after improving tooling support, and\nStephen Dolan and Leo White made solid progress in 2014 on the core\nruntime plumbing required.</p>\n<p>Stephen initially added <a href=\"https://github.com/stedolan/ocaml\">thread-local\nsupport</a> to the OCaml compiler. This\ndesign avoided the need to make the entire OCaml runtime preemptive (and\nthus a huge patch) by allocating thread-local state per core.</p>\n<p>We are now deep into the design and implementation of the programming\nabstractions built over these low-level primitives. 
One exciting aspect\nof our implementation is that much of the scheduling logic for multicore\nOCaml can be written in (single-threaded) OCaml, making the design very\nflexible with respect to <a href=\"http://kcsrk.info/papers/mmscc_marc12.pdf\">heterogeneous\nhardware</a> and <a href=\"http://fable.io\">variable IPC\nperformance</a>.</p>\n<p>To get feedback on the overall design of multicore OCaml, we presented\nat OCaml 2014\n(<a href=\"http://www.cl.cam.ac.uk/~sd601/papers/multicore_slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=FzmQTC_X5R4\">video</a> and\n<a href=\"https://ocaml.org/meetings/ocaml/2014/ocaml2014_1.pdf\">abstract</a>), and\nStephen visited INRIA to consult with the development team and Arthur\nChargueraud (the author of\n<a href=\"http://www.chargueraud.org/softs/pasl/\">PASL</a>). Towards the end of the\nyear, <a href=\"http://kcsrk.info/\">KC Sivaramakrishnan</a> finished his PhD studies\nat Purdue and joined our OCaml Labs group. He is the author of\n<a href=\"http://multimlton.cs.purdue.edu/mML/Welcome.html\">MultiMLton</a>, and is\nnow driving the completion of the OCaml multicore work along with\nStephen Dolan, Leo White and Mark Shinwell. Stay tuned for updates from\nus when there is more to show later this year!</p>\n<h3 id=\"ctypes-a-modular-foreign-function-interface\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ctypes-a-modular-foreign-function-interface\"></a>Ctypes: a Modular Foreign Function Interface</h3>\n<p>The <a href=\"https://github.com/ocamllabs/ocaml-ctypes\">Ctypes</a> library started\nas an experiment with GADTs by Jeremy Yallop, and has since ballooned into\na robust, comprehensive library for safely interacting with the OCaml\nforeign function interface. 
The first release came out in time to be\nincluded in <a href=\"https://realworldocaml.org/v1/en/html/foreign-function-interface.html\">Real World\nOCaml</a>\nin lieu of the low-level FFI (which I was not particularly enamoured\nwith having to explain in a tight page limit).</p>\n<p>Throughout 2014, Jeremy expanded support for a number of features\nrequested by users (both industrial and academic) who adopted the\nlibrary in preference to manually writing C code to interface with the\nruntime, and issued several updated\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes/releases\">releases</a>.</p>\n<h4 id=\"c-stub-generation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#c-stub-generation\"></a>C Stub Generation</h4>\n<p>The first release of Ctypes required the use of\n<a href=\"https://sourceware.org/libffi/\">libffi</a> to dynamically load shared\nlibraries and dynamically construct function call stack frames whenever\na foreign function is called. While this works for simple libraries, it\ncannot cover <em>all</em> use cases, since interfacing with C demands an\nunderstanding of <code>struct</code> memory layout, C preprocessor macros, and\nother platform-dependent quirks which are more easily dealt with by\ninvoking a C compiler. Finally, a <code>libffi</code>-based API\nwill necessarily be slower than writing direct C stub code.</p>\n<p>While many other language FFIs provide separate libraries for dynamic\nand static binding, we decided to have a go at building a\n<em>modular</em> version of Ctypes that could handle both cases from a single\ndescription of the foreign function interface. The result (dubbed\n“Cmeleon”) remained surprisingly succinct and usable, and now covers\nalmost every use of the OCaml foreign function interface. 
We submitted a\npaper to <a href=\"http://icfpconference.org/2015\">ICFP 2015</a> dubbed “<a href=\"/papers/drafts/2015-cmeleon-icfp-draft1.pdf\">A modular\nforeign function\ninterface</a>”\nthat describes it in detail. Here is a highlight of how simple a generic\nbinding looks:</p>\n<pre><code>module Bindings(F : FOREIGN) = struct\n  open F\n  let gettimeofday = foreign &quot;gettimeofday&quot;\n     (ptr timeval @-&gt; ptr timezone @-&gt; returning int)\nend\n</code></pre>\n<p>The <code>FOREIGN</code> module type completely abstracts the details of whether\ndynamic or static binding is used, and handles C complexities such\nas computing the struct layout on the local machine architecture.</p>\n<h4 id=\"inverse-stubs\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#inverse-stubs\"></a>Inverse Stubs</h4>\n<p>The other nice result from functorising the foreign function interface\nemerged when we tried to <em>invert</em> the FFI and serve a C interface from\nOCaml code (for example, by compiling the OCaml code as a <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/intfc.html\">shared\nlibrary</a>). This\nwould let us begin swapping out C libraries that we <a href=\"http://openssl.org\">don’t\ntrust</a> with <a href=\"https://github.com/mirage/ocaml-tls\">safer\nequivalents</a> written in OCaml.</p>\n<p>You can see an\n<a href=\"https://github.com/yallop/ocaml-ctypes-inverted-stubs-example\">example</a>\nof how inverted stubs work via a simple C XML parsing interface exposed from the\n<a href=\"http://erratique.ch/software/xmlm\">Xmlm</a> library. 
We can define a C\n<code>struct</code> by:</p>\n<pre><code>(* Define a struct of callbacks (C function pointers) *)\nlet handlers : [`handlers] structure typ = structure &quot;handlers&quot;\nlet (--) s f = field handlers s (funptr f)\n let on_data      = &quot;on_data&quot;      -- (string @-&gt; returning void)\n let on_start_tag = &quot;on_start_tag&quot; -- (string @-&gt; string @-&gt; returning void)\n let on_end_tag   = &quot;on_end_tag&quot;   -- (void @-&gt; returning void)\n let on_dtd       = &quot;on_dtd&quot;       -- (string @-&gt; returning void) \n let on_error     = &quot;on_error&quot;     -- (int @-&gt; int @-&gt; string @-&gt; returning void)\nlet () = seal handlers\n</code></pre>\n<p>and then expose this via C functions:</p>\n<pre><code>module Stubs(I : Cstubs_inverted.INTERNAL) = struct\n  (* Expose the type 'struct handlers' to C. *)\n  let () = I.structure handlers\n\n  (* We expose just a single function to C.  The first argument is a (pointer\n     to a) struct of callbacks, and the second argument is a string\n     representing a filename to parse. *)\n  let () = I.internal &quot;parse_xml&quot; \n     (ptr handlers @-&gt; string @-&gt; returning void) parse\nend\n</code></pre>\n<p>You can find the full source code to these snippets on the\n<a href=\"https://github.com/yallop/ocaml-ctypes-inverted-stubs-example\">ocaml-ctypes-inverted-stubs-example</a>\nrepository on GitHub.</p>\n<p>We’ll be exploring this aspect of Ctypes further in 2015 for SSL/TLS\nwith David Kaloper and Hannes Mehnert, and Microsoft Research has\ngenerously funded a <a href=\"http://research.microsoft.com/en-us/collaboration/global/phd_projects2015.aspx\">PhD\nstudentship</a>\nto facilitate the work.</p>\n<h4 id=\"community-contributions\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#community-contributions\"></a>Community Contributions</h4>\n<p>Ctypes benefited enormously from several external contributions from the\nOCaml community. 
From a portability perspective, A. Hauptmann\ncontributed <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/190\">Windows\nsupport</a>, and Thomas\nLeonard added <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/231\">Xen\nsupport</a> to allow\nCtypes bindings to work with <a href=\"http://openmirage.org\">MirageOS\nunikernels</a> (which opens up the intriguing\npossibility of accessing shared libraries across virtual machine\nboundaries in the future). C language support was fleshed out by Edwin\nTorok contributing <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/238\">typedef\nsupport</a>, Ramkumar\nRamachandra adding <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/220\">C99\nbools</a> and Peter\nZotov integrating <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/143\">native\nstrings</a>.</p>\n<p>The winner of “most enthusiastic use of OCaml Labs code” goes to <a href=\"https://github.com/braibant\">Thomas\nBraibant</a> of\n<a href=\"http://cryptosense.com/the-team/\">Cryptosense</a>, who used <em>every</em>\nfeature of the Ctypes library (consider multi-threaded, inverted, staged\nand marshalled bindings) in their effort to <a href=\"http://www.economist.com/news/science-and-technology/21647269-automating-search-loopholes-software-hacking-hackers\">hack the\nhackers</a>.\nDavid Sheets comes a close second with his implementation of the <a href=\"https://github.com/dsheets/profuse\">FUSE\nbinary protocol</a>, parameterised by\nversion quirks.</p>\n<p>If you’re using Ctypes, we would love to hear about your particular use.\nA search on GitHub and OPAM reveals over 20 projects using it already,\nincluding industrial use at <a href=\"http://cryptosense.com\">Cryptosense</a> and\n<a href=\"http://ocaml.janestreet.com\">Jane Street</a>, and ports to Windows, *BSD,\nMacOS X and even iPhone and Android. 
There’s a <a href=\"https://github.com/ocamllabs/ocaml-ctypes/wiki\">getting\nstarted</a> guide, and a\n<a href=\"http://lists.ocaml.org/listinfo/ctypes\">mailing list</a> available.</p>\n<h2 id=\"community-and-teaching-efforts\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#community-and-teaching-efforts\"></a>Community and Teaching Efforts</h2>\n<p>In addition to the online community building, we also participated in a\nnumber of conferences and face-to-face events to promote education about\nfunctional programming.</p>\n<h3 id=\"conferences-and-talks\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#conferences-and-talks\"></a>Conferences and Talks</h3>\n<ul>\n<li>Gallery ir\n<img src=\"/images/qcon-unikernel-talk.webp\" alt=\"%r\" title=\"Anil speaking at QCon on unikernels\" ></li>\n</ul>\n<p>There has been a huge growth in the number of quality conferences in\nrecent years, making it tough to choose which ones to attend.\n<a href=\"http://icfpconference.org\">ICFP</a> is the academic meeting point that\npredates most of them, and we <a href=\"/2014/08/31/ocaml-labs-at-icfp-2014.html\">participated\nextensively</a>\nin 2014 via talks, tutorials and a\n<a href=\"https://www.youtube.com/watch?v=UEIHfXLMtwA\">keynote</a> at the Haskell\nSymposium.<br>\nI also served on the <a href=\"http://icfpconference.org/icfp2014/\">program\ncommittee</a> and <a href=\"/2015/02/18/icfp15-call-for-sponsorships.html\">industrial\nrelations\nchair</a>\nand took over as the steering committee chair of\n<a href=\"http://cufp.org\">CUFP</a>. 
Jeremy Yallop, Thomas Gazagnaire and Leo White\nall served on program committees for workshops, with Jeremy also chairing\nthis year’s ML Workshop.</p>\n<p>Outside of academic conferences, we also participated in a number of\nother conferences such as <a href=\"https://qconsf.com/\">QCon</a>,\n<a href=\"http://oscon.com\">OSCON</a>, <a href=\"http://ccc.de\">CCC</a>, <a href=\"https://operatingsystems.io/\">New Directions in\nOS</a>,\n<a href=\"http://functionalconf.com\">FunctionalConf</a>,\n<a href=\"https://skillsmatter.com/conferences/1819-functional-programming-exchange\">FPX</a>\nand <a href=\"https://fosdem.org/2014/\">FOSDEM</a>. The vast majority of these talks\nwere about MirageOS, and slides can be found at\n<a href=\"http://decks.openmirage.org\">decks.openmirage.org</a>.</p>\n<h4 id=\"the-2048-browser-game\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-2048-browser-game\"></a>The 2048 Browser Game</h4>\n<p>Yaron Minsky and I have run OCaml tutorials for ICFP for\n<a href=\"http://cufp.org/2011/t3-building-functional-os.html\">a</a>\n<a href=\"http://cufp.org/2013/t2-yaron-minsky-anil-madhavapeddy-ocaml-tutorial.html\">few</a>\n<a href=\"http://cufp.org/2012/t1-real-world-ocaml-anil-madhavapeddy-university-c.html\">years</a>,\nand we finally hung up our boots in favour of a new crowd.</p>\n<p>Jeremy Yallop and Leo White stepped up to the mark with their ICFP/CUFP\n2014 <a href=\"http://cufp.org/2014/t7-leo-white-introduction-to-ocaml.html\">Introduction to\nOCaml</a>\ntutorial, which had the additional twist of being taught entirely in a\nweb browser by virtue of using\n<a href=\"http://ocsigen.org/js_of_ocaml\">js_of_ocaml</a> and\n<a href=\"http://andrewray.github.io/iocamljs/\">IOCamlJS</a>. They decided that a\ngood practical target was the popular\n<a href=\"http://gabrielecirulli.github.io/2048/\">2048</a> game that has wasted many\nprogrammer hours here at OCaml Labs. 
They <a href=\"https://github.com/ocamllabs/2048-tutorial\">hacked on\nit</a> over the summertime,\nassisted by our visitor Daniel Buenzli who also released useful\nlibraries such as <a href=\"http://erratique.ch/software/vg\">Vg</a>,\n<a href=\"http://erratique.ch/software/react\">React</a>,\n<a href=\"http://erratique.ch/software/useri\">Useri</a>, and\n<a href=\"http://erratique.ch/software/gg\">Gg</a>.</p>\n<p>The end result is satisfyingly <a href=\"http://ocamllabs.github.io/2048-tutorial/\">playable\nonline</a>, with the source code\navailable at\n<a href=\"https://github.com/ocamllabs/2048-tutorial\">ocamllabs/2048-tutorial</a>.</p>\n<p>Thomas Gazagnaire got invited to Bangalore for <a href=\"http://functionalconf.com/\">Functional\nConf</a> later in the year, and he extended the\n<a href=\"http://gazagnaire.org/fuconf14/\">interactive tutorial notebook</a> and\nalso ran an OCaml tutorial to a packed room. We were very happy to\nsupport the first functional programming conference in India, and hope\nto see many more such events spring up! 
Amir Chaudhry then went to\nBelgium for <a href=\"https://fosdem.org/2015/\">FOSDEM 2015</a>, where he showed off\n<a href=\"http://amirchaudhry.com/unikernel-arm-demo-fosdem/\">the 2048 game running as an ARM\nunikernel</a> to a\ncrowd of attendees at the Xen booth.</p>\n<ul>\n<li>Gallery\n<img src=\"/images/l23.webp\" alt=\"%r\" title=\"Jeremy Yallop giving the L23 course at Cambridge\" >\n<img src=\"/images/compiler-hacking-dsyme.webp\" alt=\"%r\" title=\"Compiler hacking with Don Syme\" >\n<img src=\"/images/jeremy-rwo.webp\" alt=\"%r\" title=\"Finding a copy of Real World OCaml in Foyles!\" ></li>\n</ul>\n<h3 id=\"graduate-teaching\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#graduate-teaching\"></a>Graduate Teaching</h3>\n<p><a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a> and <a href=\"https://github.com/lpw25\">Leo White</a> (with assistance from <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and\nmyself) also led the design of a new graduate course on <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/\">Advanced\nFunctional Programming</a> at\nthe Computer Laboratory. This ran in the <a href=\"http://en.wikipedia.org/wiki/Lent_term\">Lent\nTerm</a> and was over-subscribed by\nthree times the number who pre-registered (due to a number of PhD\nstudents and our collaborators from <a href=\"http://citrix.com\">Citrix</a> also\nattending).</p>\n<p>The course materials are <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/materials.html\">freely available\nonline</a> and\ncover the theory behind functional programming, and then move on to type\ninference, abstraction and parametricity, GADTs, rows, monads, and\nstaging. 
We will be running this again in future years, and the lecture\nmaterials are already proving useful to <a href=\"https://sympa.inria.fr/sympa/arc/caml-list/2015-04/msg00001.html\">answer mailing list\nquestions</a>.</p>\n<h3 id=\"mentoring-beginners\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#mentoring-beginners\"></a>Mentoring Beginners</h3>\n<p>We also had the pleasure of mentoring up-and-coming functional\nprogrammers via several outreach programs, both face-to-face and remote.</p>\n<h4 id=\"cambridge-compiler-hacking\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#cambridge-compiler-hacking\"></a>Cambridge Compiler Hacking</h4>\n<p>We started the <a href=\"http://ocamllabs.github.io/compiler-hacking/\">Cambridge Compiler\nHacking</a> sessions in a\nsmall way towards the end of 2013 in order to provide a local, friendly\nplace to assist people who wanted to dip their toes into the\nunnecessarily mysterious world of programming language hacking. The plan\nwas simple: provide drinks, pizza, network and a <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki\">bug list of varying\ndifficulty</a> for\nattendees to choose from and work on for the evening, with mentoring\nfrom the experienced OCaml contributors.</p>\n<p>We continued this bi-monthly tradition in 2014, with a regular\nattendance of 15-30 people, and even cross-pollinated communities with\nour local F# and Haskell colleagues. We rotated locations from the\nCambridge Computer Laboratory to Citrix, Makespace, and the new\nCambridge Postdoc Centre. We posted some\n<a href=\"http://ocamllabs.github.io/compiler-hacking/2014/06/24/highlights-from-recent-sessions.html\">highlights</a>\nfrom sessions towards the start of the year, and are very happy with how\nit’s going. There has even been uptake of the bug list across the water\nin France, thanks to Gabriel Scherer.</p>\n<p>In 2015, we’d like to branch out further and host some sessions in\nLondon. 
If you have a suggestion for a venue or theme, please <a href=\"http://lists.ocaml.org/listinfo/cam-compiler-hacking\">get in\ntouch</a>!</p>\n<h4 id=\"summer-programs\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#summer-programs\"></a>Summer Programs</h4>\n<p>There has been a laudable rise in summer programs designed to encourage\ndiversity in our community, and we of course leap at the opportunity to\nparticipate in these when we find them.</p>\n<ul>\n<li>The <a href=\"https://gnome.org/opw/\">GNOME Outreach Program</a> (now also known\nas <a href=\"https://www.gnome.org/outreachy/\">Outreachy</a>) had one funded\nplace for Xen and MirageOS. <a href=\"http://www.somerandomidiot.com/\">Mindy\nPreston</a> did a spectacular <a href=\"http://www.somerandomidiot.com/blog/categories/ocaml/\">blog\nseries</a> about\nher experiences and motivations behind learning OCaml.</li>\n<li>The <a href=\"https://www.google-melange.com/\">Google Summer of Code 2014</a>\nalso had us\n<a href=\"http://openmirage.org/blog/applying-for-gsoc2014\">participating</a>\nvia MirageOS, and <a href=\"https://github.com/moonlightdrive\">Jyotsna\nPrakash</a> took on the challenging\njob of building OCaml bindings for Amazon EC2, also detailed on <a href=\"https://1000hippos.wordpress.com/\">her\nblog</a>.</li>\n<li>Amir Chaudhry began the <a href=\"https://github.com/mirage/mirage-www/wiki/Pioneer-Projects\">Mirage Pioneer\nProjects</a>\ninitiative to give beginners an easier onramp, and this has taken\noff very effectively as a way to advertise interesting projects for\nbeginners at varying levels of difficulty.</li>\n</ul>\n<p>Our own students also had the chance to participate in such workshops to\nget out of Cambridge in the summer! <a href=\"http://hh360.user.srcf.net/blog/\">Heidi\nHoward</a> liveblogged her experiences at\nthe\n<a href=\"http://www.syslog.cl.cam.ac.uk/2015/01/14/programming-languages-mentoring-workshop-plmw/\">PLMW</a>\nworkshop in Mumbai. 
Meanwhile, <a href=\"https://github.com/dsheets\">David\nSheets</a> got to travel to the slightly less\nexotic London to <a href=\"http://www.syslog.cl.cam.ac.uk/2014/11/25/new-directions-in-operating-systems/\">liveblog\nOSIO</a>,\nand Leonhard Markert covered <a href=\"http://www.syslog.cl.cam.ac.uk/2014/09/05/ocaml-2014/\">ICFP\n2014</a> as a\nstudent volunteer.</p>\n<h3 id=\"blogging-and-online-activities\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#blogging-and-online-activities\"></a>Blogging and Online Activities</h3>\n<p>Our <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/blogs/\">blog roll</a>\nmaintains the ongoing stream of activity from the OCaml Labs crew, but\nthere were some particular highlights throughout 2014.</p>\n<ul>\n<li><a href=\"http://roscidus.com/blog/\">Thomas Leonard</a> began writing about his\nexperiences with switching his <a href=\"http://0install.net\">0install</a>\ninstallation system from <a href=\"http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-retrospective/\">Python to\nOCaml</a>\nand <a href=\"http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain/\">what you gain with\nOCaml</a>.\nThis series led to a bunch of interesting feedback on social\nnetworking sites, and Thomas joined the group full-time to work on\nour research into\n<a href=\"http://roscidus.com/blog/blog/2015/01/21/securing-the-unikernel/\">unikernels</a>.</li>\n<li><a href=\"http://www.skjegstad.com/\">Magnus Skjegstad</a> returned from Norway\nto Cambridge to work on MirageOS, and came up with some <a href=\"http://www.skjegstad.com/blog/2015/03/25/mirageos-vm-per-url-experiment/\">crazy\nexperiments</a>,\nas well as helping to build <a href=\"http://www.skjegstad.com/blog/2015/01/19/mirageos-xen-virtualbox/\">Vagrant\nimages</a>\nof the OCaml development environment.</li>\n<li><a href=\"http://amirchaudhry.com\">Amir Chaudhry</a> began his quest to <a href=\"http://amirchaudhry.com/writing-planet-in-pure-ocaml/\">port\nhis 
website</a>\nto a <a href=\"http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Jekyll\nunikernel</a>.</li>\n<li>The <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">Mirage 2.0\nrelease</a> in\nthe summer of 2014 saw a slew of blog posts about the\n<a href=\"http://openmirage.org/blog/2014-in-review\">surge</a> in MirageOS\nactivity.</li>\n</ul>\n<p>It wasn’t all just blogging though, and Jeremy Yallop and Leo White in\nparticular participated in some epic OCaml <a href=\"http://caml.inria.fr/mantis/view.php?id=5528\">bug\nthreads</a> about new\nfeatures, and\n<a href=\"https://sympa.inria.fr/sympa/arc/caml-list/2015-02/msg00150.html\">explanations</a>\nabout OCaml semantics on the mailing list.</p>\n<p>Amir Chaudhry also continued to curate and develop the content on the\n<a href=\"http://ocaml.org\">ocaml.org</a> website with our external collaborators\nAshish Agarwal, Christophe Troestler and Philippe Wang.\nNotably, it is now the recommended site for OCaml (with the <a href=\"http://caml.inria.fr\">INRIA\nsite</a> being infrequently updated), and also hosts\nthe <a href=\"https://ocaml.org/meetings/\">ACM OCaml Workshop</a> pages. One\naddition that highlighted the userbase of OCaml in the teaching\ncommunity came from building a <a href=\"https://ocaml.org/learn/teaching-ocaml.html\">map of all of the\nuniversities</a> where the\nlanguage is taught, and this was Yan Shvartzshnaider’s <a href=\"http://yansnotes.blogspot.co.uk/2014/11/good-news-everyone-ocamlorg-teaching.html\">first\ncontribution</a>\nto the site.</p>\n<h3 id=\"visitors-and-interns\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#visitors-and-interns\"></a>Visitors and Interns</h3>\n<ul>\n<li>Gallery ir\n<img src=\"/images/ocl-pub.webp\" alt=\"%r\" title=\"Down at the pub with the gang!\" ></li>\n</ul>\n<p>Finally, a really important part of any community is hanging out with\neach other to chat over ideas in a friendly environment. 
As usual, we\nhad a very steady stream of visitors and interns throughout 2014 to\nfacilitate this.</p>\n<p>Frederic Bour, Benjamin Farinier and Matthieu Journault joined us as\nsummer interns from their respective universities in France as part of\ntheir Masters programs. Frederic worked on modular implicits and <a href=\"https://www.irill.org/videos/oups-december-2014/Modular_implicits\">gave a\ngreat\ntalk</a>\nat the OCaml Users group. Benjamin and Matthieu worked on Irmin data\nstructures and complexity (and\n<a href=\"https://github.com/mirage/merge-queues\">merge-queues</a> and\n<a href=\"https://github.com/mirage/merge-ropes\">merge-ropes</a>), and Benjamin had\nhis paper on “<a href=\"/papers/2015-jfla-irmin.pdf\">Mergeable Persistent Data\nStructures</a>” accepted\nto <a href=\"http://jfla.inria.fr/2015/\">JFLA 2015</a>, while Matthieu’s work on\nefficient algorithms for synchronising Irmin DAGs is being integrated\ninto the upstream source code.</p>\n<p>Daniel Buenzli repeated his visit from 2013 and spent a productive\nsummer with us, commenting on almost every project we’re working on. In\nhis own words (edited for brevity):</p>\n<blockquote>\n<p>I started by implementing and releasing\n<a href=\"http://erratique.ch/software/uucp\">Uucp</a>, a library to provide\nefficient access to a selection of the properties of the latest\nUnicode Character database (UCD). […] As a side effect of the previous\npoint I took time to write an absolute <a href=\"http://erratique.ch/software/uucp/doc/Uucp.html#uminimal\">minimal introduction to\nUnicode</a>.\n[…] Since I was in this Unicode business I took the opportunity to\npropose a <a href=\"https://github.com/ocaml/ocaml/pull/80\">31 loc patch to the standard\nlibrary</a> for a type to\nrepresent Unicode scalar values (an Unicode character to be imprecise)\nto improve interoperability.</p>\n<p>The usual yearly update to OpenGL was announced at the Siggraph\nconference. 
This prompted me to update the ctypes-based <a href=\"http://erratique.ch/software/tgls\">tgls\nlibrary</a> for supporting the latest\nentry point of OpenGL 4.5 and OpenGL ES 3.1. Since the bindings are\nautomatically generated from the OpenGL XML registry the work is not\ntoo involved but there’s always the odd function signature you\ndon’t/can’t handle automatically yet.</p>\n<p>Spend quite a bit (too much) time on\n<a href=\"http://erratique.ch/software/useri\">useri</a>, a small multi-platform\nabstraction for setting up a drawing surface and gather user input\n(<em>not</em> usury) as <a href=\"http://erratique.ch/software/react\">React</a> events.\nUseri started this winter as a layer on top of SDL to implement a <a href=\"http://erratique.ch/log/2014-05-18\">CT\nscan app</a> and it felt like this\ncould be the basis for adding interactivity and animation to Vg/Vz\nvisualizations – js viz libraries simply rely on the support provided\nby the browser or SVG support but Vg/Vz strives for backend\nindependence and clear separations of concern (up to which limit\nremains an open question). Unfortunately I couldn’t bring it to a\nrelease and got a little bit lost in browser compatibility issues and\ntrying to reconcile what browser and SDL give us in terms of\nfunctionality and way of operating, so that a maximum of client code\ncan be shared among the supported platforms. But despite this\nnon-release it still managed to be useful in some way, see the next\npoint.</p>\n<p>Helped Jeremy and Leo to implement the rendering and interaction for\ntheir ICFP tutorial <a href=\"https://github.com/ocamllabs/2048-tutorial\">2048 js_of_ocaml\nimplementation</a>. This\nfeatured the use of Gg, Vg, Useri and React and I was quite pleased\nwith the result (despite some performance problems in certain\nbrowsers, but hey composable rendering and animation without a single\nassignement in client code). 
It’s nice to see that all these pains at\ntrying to design good APIs eventually fit together […]</p>\n</blockquote>\n<p>A couple of visitors joined us from sunny\n<a href=\"http://github.com/mirleft\">Morocco</a>, where Hannes Mehnert and David\nKaloper had gone to work on a clean-slate TLS stack. They found the\n<a href=\"http://openmirage.org\">MirageOS</a> effort online, and got in touch about\nvisiting. After a very fun summer of hacking, their stack is now the\nstandard TLS option in MirageOS and resulted in the <a href=\"http://amirchaudhry.com/bitcoin-pinata/\">Bitcoin Pinata\nchallenge</a> being issued! Hannes\nand David have since moved to Cambridge to work on this stack full-time\nin 2015, but the internships served as a great way for everyone to get\nto know each other.</p>\n<p>We also had the pleasure of visits from several of our usually remote\ncollaborators. <a href=\"https://github.com/Chris00\">Christophe Troestler</a>,\n<a href=\"http://ocaml.janestreet.com\">Yaron Minsky</a>, <a href=\"http://github.com/diml\">Jeremie\nDimino</a> and <a href=\"https://github.com/andrewray\">Andy\nRay</a> all visited for the annual OCaml Labs\n<a href=\"https://gist.github.com/avsm/18450004ae19c2facf7a\">review meeting</a> in\nChrist’s College. 
There were also many academic talks from foreign\nvisitors in our <a href=\"http://talks.cam.ac.uk/show/archive/8316\">SRG seminar\nseries</a>, ranging from <a href=\"http://www.cse.iitb.ac.in/~uday/\">Uday\nKhedkar</a> from IIT to <a href=\"http://okmij.org/ftp/\">Oleg\nKiselyov</a> delivering multiple talks on staging and\noptimisation (as well as making a celebrity appearance at the compiler\nhacking session), and <a href=\"http://ocaml.janestreet.com\">Yaron Minsky</a>\ndelivering an Emacs-driven departmental seminar on his experiences with\n<a href=\"http://talks.cam.ac.uk/talk/index/51144\">Incremental</a> computation.</p>\n<h2 id=\"research-efforts\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#research-efforts\"></a>Research Efforts</h2>\n<p>The OCaml Labs are of course based in the Cambridge Computer Laboratory,\nwhere our day job is to do academic research. Balancing the demands of\nopen source coding, community efforts and top-tier research has been\ntricky, but the effort has been worthwhile.</p>\n<ul>\n<li>Gallery\n<img src=\"/images/christs-dinner.webp\" alt=\"%r\" title=\"Dinner at Christ's College\" >\n<img src=\"/images/nsdi-deadline.webp\" alt=\"%r\" title=\"Hacking to the clock for the NSDI deadline\" >\n<img src=\"/images/scotty.webp\" alt=\"%r\" title=\"Dave enters the glass filled future\" ></li>\n</ul>\n<p>Our research efforts are broadly unchanged <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html#Dec%202013\">from\n2013</a>\n(it takes time to craft good ideas!), and this will not be an exhaustive\nrecap. 
Instead, we’ll summarise them here and point to our\n<a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/papers/index.html\">papers</a>\nthat describe the work in detail.</p>\n<ul>\n<li>\n<p><a href=\"http://openmirage.org\">MirageOS</a> really found its feet in\n2014, with a <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">summer 2.0\nrelease</a>\nand an extensive <a href=\"http://openmirage.org/blog/2014-in-review\">end-of-year\nrecap</a>. The most notable\nthing has been how well the MirageOS research work has melded with\nthe core OCaml Labs efforts, since much of it has been constructing\ngood quality OCaml libraries to plug holes in the ecosystem. It also\nserved to make us use OPAM on a day-to-day basis for our own work,\nthus creating an effective feedback loop between open-source and\nresearch.</p>\n</li>\n<li>\n<p>In the <a href=\"http://trilogy2.it.uc3m.es/\">Trilogy2</a> and\n<a href=\"http://usercentricnetworking.eu/\">UCN</a> EU projects, we built out\nMirageOS features such as the\n<a href=\"/papers/2015-nsdi-jitsu.pdf\">Jitsu</a> toolstack\nfor the “just-in-time” summoning of unikernels in response to DNS\nrequests. This paper will be presented next month at USENIX\n<a href=\"https://www.usenix.org/conference/nsdi15/\">NSDI</a>. 
It also drove the\ndevelopment of the <a href=\"http://openmirage.org/blog/introducing-xen-minios-arm\">ARMv7\nport</a>, an\narchitecture for which OCaml has an excellent native code generator,\nas well as more experimental forays into <a href=\"http://arxiv.org/abs/1412.4638\">Bitcoin incentive\nschemes</a> for distributed systems.</p>\n</li>\n<li>\n<p>The <a href=\"http://irmin.io\">Irmin</a> Git-like branchable store created by\nThomas Gazagnaire matured, with Dave Scott\n<a href=\"https://www.youtube.com/watch?v=DSzvFwIVm5s\">prototyping</a> a complex\nport of the <a href=\"http://wiki.xen.org/wiki/XenStore\">XenStore</a> database\nto Irmin, thus letting us show off <a href=\"http://decks.openmirage.org/xendevsummit14#/\">debugging systems with\nGit</a>. We had a paper\non some early data structures accepted at\n<a href=\"/papers/2015-jfla-irmin.pdf\">JFLA</a>, and\nThomas Leonard is building the JavaScript backend for running\nin-browser, while Yan Shvartzshnaider is experimenting with <a href=\"http://yansnotes.blogspot.co.uk/2015/01/work-summary-ocaml-labs.html\">graph\nprocessing</a>\nover the DAG representation for privacy-friendly queries. 
KC is\ninvestigating how to adapt his PLDI 2015 paper on\n<a href=\"http://kcsrk.info/papers/quelea_pldi15.pdf\">Quelea</a> into using\nIrmin as a backend as well.</p>\n</li>\n<li>\n<p>The <a href=\"https://github.com/ocamllabs/higher\">Higher</a> kinded\npolymorphism library written by Jeremy Yallop and Leo White was\npublished in <a href=\"http://www.lpw25.net/flops2014.pdf\">FLOPS 2014</a>,\nforming a basis for building more complex use-cases that need the\nflexibility of higher kinded types without requiring functorising\ncode.</p>\n</li>\n</ul>\n<p>Our long standing research into <a href=\"http://nymote.org\">personal online\nprivacy</a> led to our next system target that uses\nunikernels: the <a href=\"http://arxiv.org/abs/1501.04737\">Databox</a> paper\noutlines the architecture, and was covered in the\n<a href=\"http://www.theguardian.com/technology/2015/feb/01/control-personal-data-databox-end-user-agreement\">Guardian</a>\nnewspaper. Jon Crowcroft led the establishment of the Cambridge wing of\nthe <a href=\"http://www.mccrc.eu/about-us\">Microsoft Cloud Computing Research\nCenter</a> to consider the legal aspect of\nthings, and so we have made forays outside of technology into\nconsidering the implications of <a href=\"http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-863.pdf\">region-specific\nclouds</a> as well.</p>\n<p>Some of the most exciting work done in the group as part of the\n<a href=\"http://rems.io\">REMS</a> and <a href=\"http://www.naas-project.org/\">NaaS</a> projects\ncame towards the end of 2014 and start of 2015, with multiple\nsubmissions going into top conferences. Unfortunately, due to most of\nthem being double blind reviewed, we cannot link to the papers yet. 
Keep\nan eye on the blog and <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/papers/index.html\">published paper\nset</a>, or\nask us directly about what’s been going on!</p>\n<h2 id=\"priorities-for-2015\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#priorities-for-2015\"></a>Priorities for 2015</h2>\n<p>As spring breaks and the weather (almost) becomes bearable again, we’re\nsetting our work priorities for the remainder of the year.</p>\n<ul>\n<li>\n<p><strong>Tooling Cohesion</strong>: The entire core team is focussed on fusing\ntogether the individual tools that were created last year into\na cohesive OCaml Platform release that covers the lifecycle of\ndocumentation, testing and build. This is being managed by Amir\nChaudhry. OPAM remains at the heart of this strategy, and Louis\nGesbert and Thomas Gazagnaire have settled on the <a href=\"https://github.com/ocaml/opam/wiki/1.3-Roadmap\">OPAM 1.3\nroadmap</a>\n(<a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-February/000940.html\">summary</a>).</p>\n</li>\n<li>\n<p><strong>Multicore</strong>: <a href=\"http://kcsrk.info\">KC Sivaramakrishnan</a> has joined the core\nOCaml Labs team full-time to drive the multicore work into a publicly\ntestable form. 
Leo White recently departed after many productive\nyears in Cambridge to head into a career in industry (but still\nremains very much involved with OCaml development!).</p>\n</li>\n<li>\n<p><strong>Language Evolution</strong>: Jeremy Yallop continues to drive our efforts\non staged programming, modular implicits, and a macro system for\nOCaml, all of which are key features that make building complex,\nreliable systems more tractable than ever.</p>\n</li>\n</ul>\n<p>I’d like to thank the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/people/index.html\">entire\nteam</a> and\nwider community for a wonderfully enjoyable 2014 and start of 2015, and\nam very thankful for the funding and support from Jane Street, Citrix,\nBritish Telecom, RCUK, EPSRC, DARPA and the EU FP7 that made it all\npossible. As always, please feel free to contact any of us directly with\nquestions, or reach out to me <a href=\"mailto:avsm2@cl.cam.ac.uk\">personally</a>\nwith any queries, concerns or bars of chocolate as encouragement.</p><h1>References</h1><ul><li>Skjegstad et al (2014). Kadupul: Livin' on the Edge with Virtual Currencies and Time-Locked Puzzles. arXiv. <a href=\"https://doi.org/10.48550/arXiv.1412.4638\" target=\"_blank\"><i>10.48550/arXiv.1412.4638</i></a></li>\n<li>Haddadi et al (2015). Personal Data: Thinking Inside the Box. arXiv. <a href=\"https://doi.org/10.48550/arXiv.1501.04737\" target=\"_blank\"><i>10.48550/arXiv.1501.04737</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ocamllabs-2014-review",
      "external_url": "https://web.archive.org/web/20160310100554/http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html",
      "title": "Reviewing the second year of OCaml Labs in 2014",
      "summary": "Reviewing OCaml Labs' progress in 2014, covering tooling, compiler, community efforts, and research projects.",
      "date_published": "2015-04-02T00:00:00.000000Z",
      "date_modified": "2015-04-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "unikernels",
        "ocaml",
        "computerlab",
        "opensource"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.48550/arXiv.1412.4638",
          "doi": "10.48550/arXiv.1412.4638",
          "cito": [
            "cites"
          ]
        },
        {
          "url": "https://doi.org/10.48550/arXiv.1501.04737",
          "doi": "10.48550/arXiv.1501.04737",
          "cito": [
            "cites"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/icfp15-call-for-sponsorships",
      "content_html": "<p>The call for papers for this year’s <a href=\"http://icfpconference.org/icfp2015/\">International Conference on Functional\nProgramming</a> is about to close in two\nweeks, and over a hundred cutting-edge research papers will be submitted on the\ntheory, application, and experiences behind functional programming and type\ntheory. In addition to the main conference, there are also over 10 big\n<a href=\"http://icfpconference.org/icfp2015/affiliated.html\">affiliated workshops</a> that\nrun throughout the week on topics ranging from specific languages\n(<a href=\"http://www.erlang.org/workshop/2014/\">Erlang</a>,\n<a href=\"http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop\">Haskell</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2014/\">OCaml</a>), the broader <a href=\"http://cufp.org/\">commercial\ncommunity</a>, and even <a href=\"http://functional-art.org/\">art and\nmusic</a>.</p>\n<p>The ICFP conference experience can be a remarkable one for students. Some great\nideas have emerged from random corridor conversations between talks with the\nlikes of <a href=\"http://homepages.inf.ed.ac.uk/wadler/\">Phil Wadler</a>, or from\nrain-soaked discussions with <a href=\"http://research.microsoft.com/en-us/people/simonpj/\">Simon PJ</a> at\n<a href=\"http://mikkeller.dk/\">Mikeller</a>, or in my case, from being convinced to <a href=\"https://blogs.janestreet.com/the-making-of-real-world-ocaml/\">write a book</a> while in\na smoky Tokyo bar. This year, it will be held in the beautiful city of\nVancouver in the fall.</p>\n<p>We’re committed to growing the ICFP community, not just in numbers but also in\ndiversity. The <a href=\"http://plmw15.iisc-seal.net/\">Programming Language Mentoring\nWorkshop</a> has been at capacity since it started\nand will run again. 
For the first time ever, I am really excited to announce\nthat the <a href=\"https://adainitiative.org/\">Ada Initiative</a> will also be running an\n<a href=\"https://adainitiative.org/what-we-do/workshops-and-training/\">Ally Skills</a>\nworkshop during the conference.</p>\n<p>Sustaining these activities and responsible growth means that we need to reach\never wider to support the activities of the (not-for-profit) ICFP conference.\nSo as this year’s industrial relations chair, I wish to <strong>invite any\norganization that wishes to support ICFP to get in touch with us</strong> (e-mail at\n<code>avsm2@cl.cam.ac.uk</code>) and sponsor us. I’ve put an abridged version of the\ne-mail solicitation below that describes the benefits. Sponsorship can start as\nlow as $500 and is often tax-deductible in many countries.</p>\n<blockquote>\n<p>I’m writing to ask if you would be willing to provide corporate financial support for the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Vancouver, Canada, from August 30th through September 5th, 2015:</p>\n<pre><code>http://icfpconference.org/icfp2015/\n</code></pre>\n<p>Corporate support funds are primarily used to subsidize students – the lifeblood of our community – and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.</p>\n<p>Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2014 in Sweden. The Industrial Reception, open to all attendees, was by all accounts a roaring success. 
All 2014 sponsoring companies had the opportunity to interact with the gathered students, academics, and software professionals.</p>\n<p>This year, let’s build on that success and continue to grow our community, and bring even more students to ICFP 2015 in Vancouver!</p>\n<p>Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. There, they will meet luminaries in the field, as well as people who’ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them. For the first time, we will also host an Ally Skills workshop by the Ada Initiative, as well as continue the successful student mentoring workshop from previous years.</p>\n<p>This year, we’re continuing a similar system of levels of financial support as last year. Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).</p>\n<p>The support levels, and their associated benefits and pledge amounts, are as follows (costs in US dollars).</p>\n<p>Bronze: $500: Logo on website, poster at industrial reception, listed in proceedings.</p>\n<p>Silver: $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)</p>\n<p>Gold: $5000: As above plus: named supporter of industrial reception with opportunity to speak to the audience, and opportunity to include branded merchandise in participants’ swag bag.</p>\n<p>Platinum: $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).</p>\n<p>Thank you for your time and especially for your generosity! 
I look forward to seeing you in Vancouver. If you are willing to be a sponsor, it would be helpful to hear back by March 9th to help us plan and budget.</p>\n</blockquote>\n<p>If you are interested, please get in touch with <a href=\"mailto:anil@recoil.org\">me</a> or any of the <a href=\"http://icfpconference.org/icfp2015/index.html\">organizing committee</a>. If you’re interested in helping out ICFP in a non-financial capacity (for example, as a student volunteer), then there will also be plenty of opportunities to sign up later in the year.</p>",
      "url": "https://anil.recoil.org/notes/icfp15-call-for-sponsorships",
      "title": "ICFP 2015 - a call for sponsorship and how you can help",
      "summary": "ICFP 2015 seeks sponsors to support students and growth of the functional programming community.",
      "date_published": "2015-02-18T00:00:00.000000Z",
      "date_modified": "2015-02-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "icfp",
        "service"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/0bc235e0-b154-4cbf-a84a-61240f16d60a-1",
      "content_html": "<p>I hopped over to Berlin to give the keynote at <a href=\"https://bobkonf.de/2015/en/\">BOB 2015</a> on functional operating systems. The talk focused on how we can apply functional programming principles to build unikernels with MirageOS, compiling OCaml applications directly into specialized virtual machine images that run on the Xen hypervisor. This represents a radical rethinking of the traditional OS stack, eliminating much of the bloat and attack surface of conventional operating systems. If you're in the region, I <em>highly</em> recommend attending BOB as a superbly organised conference with a diverse and interesting crowd of functional programmers.</p>",
      "url": "https://anil.recoil.org/notes/0bc235e0-b154-4cbf-a84a-61240f16d60a-1",
      "title": "Delivered keynote at BOB 2015 on MirageOS",
      "summary": "Opening keynote at BOB 2015 conference in Berlin on functional operating systems and MirageOS unikernels.",
      "date_published": "2015-01-23T00:00:00.000000Z",
      "date_modified": "2015-01-23T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "fp",
        "unikernels",
        "ocaml",
        "systems"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2014-sigops-raft-1",
      "content_html": "<p>Paper on reproducing the Raft consensus protocol, published in ACM SIGOPS Operating Systems Review with Heidi Howard, Malte Schwarzkopf and Jon Crowcroft. We developed a clean-slate OCaml implementation of Raft and built an event-driven simulation framework to test it. The paper proposed several optimizations to the protocol and empirically validated Raft's correctness invariants, while also evaluating whether it lived up to its claims of being more understandable than Paxos.</p><h1>References</h1><ul><li>Howard et al (2015). Raft Refloated: Do We Have Consensus?. <a href=\"https://doi.org/10.1145/2723872.2723876\" target=\"_blank\"><i>10.1145/2723872.2723876</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2014-sigops-raft-1",
      "title": "Raft Refloated: Do We Have Consensus?",
      "summary": "Paper reproducing Raft consensus protocol with optimizations and empirical validation using clean-slate OCaml implementation.",
      "date_published": "2015-01-01T00:00:00.000000Z",
      "date_modified": "2015-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "distributed-systems",
        "consensus",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2014-sigops-raft.pdf",
          "mime_type": "application/pdf",
          "title": "Raft Refloated: Do We Have Consensus?"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2723872.2723876",
          "doi": "10.1145/2723872.2723876",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2015-jfla-irmin-1",
      "content_html": "<p>Paper on mergeable data structures using Irmin (nee Irminsule) at JFLA 2015, the French-language functional programming conference. Working with Benjamin Farinier and Thomas Gazagnaire, we explored the theoretical foundations of mergeable persistent data structures. This formalized the ideas from Irminsule into a more general framework for building eventually-consistent distributed applications with type-safe merge operations.</p>",
      "url": "https://anil.recoil.org/notes/2015-jfla-irmin-1",
      "title": "Mergeable persistent data structures",
      "summary": "Paper on mergeable data structures using Irmin presented at JFLA 2015.",
      "date_published": "2015-01-01T00:00:00.000000Z",
      "date_modified": "2015-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "irmin",
        "ocaml",
        "database",
        "data-structures"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2015-jfla-irmin.pdf",
          "mime_type": "application/pdf",
          "title": "Mergeable persistent data structures"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2013-cufp-scribe-1",
      "content_html": "<p>Published the scribe's report for CUFP 2013 in JFP. Continuing our role as scribes for the Commercial Users of Functional Programming workshop series, we documented the 2013 edition held in conjunction with ICFP. This report in the Journal of Functional Programming captures another year of industrial FP adoption, showing how the community continues to grow and functional techniques become more mainstream in production software development. The cumulative record of these workshop reports provides valuable insight into the evolution of practical FP.</p><h1>References</h1><ul><li>Eriksen et al (2015). CUFP'13 scribe's report. <a href=\"https://doi.org/10.1017/S0956796815000052\" target=\"_blank\"><i>10.1017/S0956796815000052</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2013-cufp-scribe-1",
      "title": "CUFP'13 scribe's report",
      "summary": "Scribe's report from CUFP 2013 workshop on commercial uses of functional programming published in JFP.",
      "date_published": "2015-01-01T00:00:00.000000Z",
      "date_modified": "2015-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "fp",
        "ocaml",
        "cufp"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1017/S0956796815000052",
          "doi": "10.1017/S0956796815000052",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/644914a5-a40b-4ef7-bb17-cea43c95dd09-1",
      "content_html": "<p>Gave Codemesh 2014 talk on Nymote, exploring personal data management with unikernels and self-hosting. The talk focused on how individuals could &quot;git their own cloud&quot; by running secure, lightweight services for managing personal data using MirageOS and Irmin. This was part of our broader vision for giving people control over their own data through easy-to-deploy unikernel applications, moving away from dependence on centralized cloud providers.</p>",
      "url": "https://anil.recoil.org/notes/644914a5-a40b-4ef7-bb17-cea43c95dd09-1",
      "title": "Codemesh 2014: Nymote: Git Your Own Cloud Here",
      "summary": "Talk on Nymote presented at Codemesh 2014.",
      "date_published": "2014-12-17T00:00:00.000000Z",
      "date_modified": "2014-12-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "nymote",
        "cloud",
        "mirageos",
        "personal-data",
        "privacy"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/46968fa0-e5bd-4df8-98e1-3cf88d9b31e5-1",
      "content_html": "<p>New Directions in Operating Systems talk on Jitsu at a workshop in London. Jitsu demonstrated just-in-time summoning of unikernels - the ability to boot MirageOS unikernels in milliseconds in response to network requests. This was a breakthrough for unikernel practicality, showing that you could have the security and efficiency benefits of unikernels without sacrificing responsiveness. The system worked by keeping unikernels in a suspended state and waking them up on demand, dramatically reducing the overhead compared to traditional VMs. It was a fun piece of work that combined DNS tricks with hypervisor optimizations to create something that felt almost magical in its speed.</p>",
      "url": "https://anil.recoil.org/notes/46968fa0-e5bd-4df8-98e1-3cf88d9b31e5-1",
      "title": "Jitsu: Just-in-Time Summoning of Unikernels (new directions in operating systems)",
      "summary": "Talk on Jitsu at New Directions in Operating Systems conference.",
      "date_published": "2014-11-25T00:00:00.000000Z",
      "date_modified": "2014-11-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "systems",
        "cloud",
        "jitsu"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2014-regional-clouds-1",
      "content_html": "<p>Report on regional cloud computing law available as Cambridge tech report TR-863. This interdisciplinary work with Jatinder Singh, Jean Bacon, Jon Crowcroft and legal scholars examined the technical considerations needed to comply with regional data protection laws. The report explored how cloud infrastructure could be designed to respect jurisdictional boundaries - prescient work given the subsequent explosion of GDPR and data sovereignty concerns.</p><h1>References</h1><ul><li>Singh et al (2014). Regional clouds: technical considerations. <a href=\"https://doi.org/10.48456/tr-863\" target=\"_blank\"><i>10.48456/tr-863</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2014-regional-clouds-1",
      "title": "Regional clouds: technical considerations",
      "summary": "Report examining technical considerations for regional cloud computing and legal compliance.",
      "date_published": "2014-11-01T00:00:00.000000Z",
      "date_modified": "2014-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "cloud",
        "law",
        "privacy",
        "distributed"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2014-regional-clouds.pdf",
          "mime_type": "application/pdf",
          "title": "Regional clouds: technical considerations"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48456/tr-863",
          "doi": "10.48456/tr-863",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/c9273fa0-802f-4d2b-8f0d-db383943564e-1",
      "content_html": "<p>At the Xen Summit speaking about branch consistency for Xen Stub Domains, explaining how MirageOS 2.0 could leverage Irmin for distributed systems. The talk demonstrated how combining Irmin's branch-consistent storage with MirageOS could build reliable stub domains for Xen hosts. This work showed how functional data structures and version control concepts could be applied to building more reliable virtualization infrastructure.</p>",
      "url": "https://anil.recoil.org/notes/c9273fa0-802f-4d2b-8f0d-db383943564e-1",
      "title": "MirageOS 2.0: branch consistency for Xen Stub Domains",
      "summary": "Xen Summit talk on implementing branch consistency for MirageOS 2.0 Xen Stub Domains.",
      "date_published": "2014-10-17T00:00:00.000000Z",
      "date_modified": "2014-10-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "xen",
        "unikernels",
        "distributed",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/4390c1d0-ed4f-4c01-9e10-dab2a3faed7a-1",
      "content_html": "<p>Talk on the OCaml Platform reaching v1.0 at the 2014 OCaml Workshop. This was a significant milestone - declaring version 1.0 of the OCaml Platform meant we had a stable, coherent set of tools that developers could rely on. This included opam for package management, core libraries, and documentation infrastructure. Reaching v1.0 was important for signaling to the community that the platform was production-ready and we were committed to maintaining backwards compatibility. It marked the transition from experimental tooling to something that could support serious industrial and academic use of OCaml.</p>",
      "url": "https://anil.recoil.org/notes/4390c1d0-ed4f-4c01-9e10-dab2a3faed7a-1",
      "title": "OCaml 2014: The OCaml Platform v1.0",
      "summary": "Talk on the OCaml Platform reaching v1.0.",
      "date_published": "2014-09-05T00:00:00.000000Z",
      "date_modified": "2014-09-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/ed84b2eb-1b93-4dc3-b746-63a4af13d4ea-1",
      "content_html": "<p>Gave Haskell Symposium 2014 Keynote on functional OS design, somewhat nervously talking to a room full of Haskellers about OCaml modules. The keynote explored how functional programming principles could be applied to operating systems design, using MirageOS as a concrete example. I discussed the benefits of OCaml's module system for building composable OS components and how type-driven development could improve systems security and reliability.</p>",
      "url": "https://anil.recoil.org/notes/ed84b2eb-1b93-4dc3-b746-63a4af13d4ea-1",
      "title": "Haskell Symposium 2014 Keynote on functional OS design",
      "summary": "Keynote presentation at Haskell Symposium 2014 on functional operating system design.",
      "date_published": "2014-09-05T00:00:00.000000Z",
      "date_modified": "2014-09-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "haskell",
        "fp",
        "systems",
        "unikernels",
        "mirageos"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2014-oud-platform-1",
      "content_html": "<p>Paper on the OCaml Platform status, marking the release of version 1.0. This was a major milestone showing how the community-driven toolchain had matured over the year since v0.1. Working with Jeremie Dimino, Louis Gesbert, Mark Shinwell and others, we consolidated the core development tools that OCaml programmers rely on daily, including improvements to opam, the package manager.</p>",
      "url": "https://anil.recoil.org/notes/2014-oud-platform-1",
      "title": "The OCaml Platform v1.0",
      "summary": "Paper on OCaml Platform version 1.0 status and features.",
      "date_published": "2014-09-01T00:00:00.000000Z",
      "date_modified": "2014-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling",
        "devtools",
        "community"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2014-oud-platform.pdf",
          "mime_type": "application/pdf",
          "title": "The OCaml Platform v1.0"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2014-oud-multicore-1",
      "content_html": "<p>First paper on multicore OCaml's design at the OCaml Workshop, presented with Stephen Dolan and Leo White. This initial work laid out the vision for adding parallelism to OCaml while maintaining backwards compatibility - a challenging goal that would take several more years of research and development to achieve. The paper introduced the core concepts that would eventually lead to OCaml 5.0's multicore support.</p>",
      "url": "https://anil.recoil.org/notes/2014-oud-multicore-1",
      "title": "Multicore OCaml",
      "summary": "First paper on multicore OCaml's design presented at the OCaml Workshop.",
      "date_published": "2014-09-01T00:00:00.000000Z",
      "date_modified": "2014-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "multicore",
        "concurrency",
        "fp",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2014-oud-multicore.pdf",
          "mime_type": "application/pdf",
          "title": "Multicore OCaml"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2014-oud-irminsule-1",
      "content_html": "<p>Paper at the OCaml Workshop introducing Irmin (then called Irminsule), a branch-consistent distributed library database that brings Git-like operations to application data. This work with Thomas Gazagnaire and collaborators introduced mergeable persistent data structures that could be used to build eventually-consistent distributed systems. Irmin has since become a foundational component of MirageOS and other distributed OCaml applications.</p>",
      "url": "https://anil.recoil.org/notes/2014-oud-irminsule-1",
      "title": "Irminsule: a branch-consistent distributed library database",
      "summary": "Paper on Irmin branch-consistent distributed database library supporting Git-like operations at OCaml Workshop 2014.",
      "date_published": "2014-09-01T00:00:00.000000Z",
      "date_modified": "2014-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "irmin",
        "ocaml",
        "distributed",
        "databases",
        "mirageos"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2014-oud-irminsule.pdf",
          "mime_type": "application/pdf",
          "title": "Irminsule: a branch-consistent distributed library database"
        }
      ],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/p1v4c-7h249",
      "content_html": "<p>It's the ever-exciting week of the <a href=\"https://icfpconference.org/\">International Conference on\nFunctional Programming</a> again in Sweden,\nand this time <a href=\"http://ocaml.io\">OCaml Labs</a> has a variety of talks,\ntutorials and keynotes to deliver throughout the week. This post\nsummarises all them so you can navigate your way to the right session.\nRemember that once you register for a particular day at ICFP, you can\nmove between workshops and tutorials as you please.</p>\n<p><img src=\"/images/gothenburg.webp\" alt=\"%r\" title=\"Gothenburg, the location of this year's ICFP conference.\" >\nQuick links to the below in date order:</p>\n<ul>\n<li>Talk on <a href=\"#coeffects\">Coeffects, a Calculus of Context-dependent\nComputation</a>, Monday 1st September, 16:30-17:20, ICFP\nDay 1.</li>\n<li>Talk on <a href=\"#implicits\">Modular Implicits</a>, Thu 4th September,\n14:25-14:50, ML Workshop.</li>\n<li>Talk on <a href=\"#modulealiases\">Module Aliases</a>, Thu 4th September,\n09:35-10:00, ML Workshop.</li>\n<li>Talk on <a href=\"#metamirage\">Metaprogramming in the Mirage OS</a>, Thu 4th\nSeptember, 14:50-15:10, ML Workshop.</li>\n<li>Keynote talk on <a href=\"#unikernels\">Unikernels</a>, Fri 5th September,\n09:00-10:00, Haskell Symposium.</li>\n<li>Talk on <a href=\"#multicore\">Multicore OCaml</a>, Fri 5th September,\n09:10-10:00, OCaml Workshop.</li>\n<li>Tutorial on <a href=\"#cufptutorial\">OCaml and JavaScript Programming</a>, Fri\n5th September, 09:00-12:00, CUFP Tutorial Day 2.</li>\n<li>Talk on <a href=\"#zeroinstall\">0install binary distribution</a>, Fri 5th\nSeptember, 10:25-10:50, OCaml Workshop.</li>\n<li>Talk on <a href=\"#tls\">Transport Layer Security in OCaml</a>, Fri 5th\nSeptember, 10:50-11:20, OCaml Workshop.</li>\n<li>Talk/Demo on the <a href=\"#platform\">OCaml Platform</a>, Fri 5th September,\n12:00-12:30, OCaml Workshop.</li>\n<li>Poster and Demo of the <a href=\"#irmin\">Irmin branch-consistent 
store</a>, Fri\n5th September, 15:10-16:30, OCaml/ML Workshop.</li>\n<li><a href=\"#social\">Social Events</a></li>\n</ul>\n<h2 id=\"language-and-compiler-improvements\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#language-and-compiler-improvements\"></a>Language and Compiler Improvements</h2>\n<p>The first round of talks are about improvements to the core OCaml\nlanguage and runtime.</p>\n<h3 id=\"modular-implicits\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#modular-implicits\"></a>» Modular implicits</h3>\n<p>Leo White and Frederic Bour have been taking inspiration from Scala\nimplicits and <a href=\"https://www.mpi-sws.org/~dreyer/papers/mtc/main-short.pdf\">Modular Type\nClasses</a> by\nDreyer <em>et al</em>, and will describe the design and implementation of a\nsystem for ad-hoc polymorphism in OCaml based on passing implicit module\nparameters to functions based on their module type.</p>\n<p>This provides a concise way to write functions to print or manipulate\nvalues generically, while maintaining the ML spirit of explicit\nmodularity. You can actually get a taste of this new feature ahead\nof the talk, thanks to a new facility in OCaml: we can compile any OPAM\nswitch directly into an interactive JavaScript notebook thanks to\n<a href=\"https://github.com/andrewray/iocamljs\">iocamljs</a> by <a href=\"http://ujamjar.github.io/\">Andy\nRay</a>.</p>\n<ul>\n<li><a href=\"http://www.lpw25.net/ml2014.pdf\">Abstract</a></li>\n<li><a href=\"http://andrewray.github.io/iocamljs/modimp_show.html\">Interactive\nCompiler</a></li>\n</ul>\n<h3 id=\"multicore-ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#multicore-ocaml\"></a>Multicore OCaml</h3>\n<p>Currently, threading in OCaml is only supported by means of a global\nlock, allowing at most one thread to run OCaml code at any time. 
Stephen\nDolan, Leo White and Anil Madhavapeddy have been building on the <a href=\"http://www.cl.cam.ac.uk/~sd601/multicore.md\">early\ndesign</a> of a multicore\nOCaml runtime that they started in January, and now have an (early)\nprototype of a runtime design that is capable of shared memory\nparallelism.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_1.pdf\">Abstract</a></li>\n<li>Date: 09:10-10:00, OCaml Workshop, Fri Sept 5th</li>\n</ul>\n<h3 id=\"type-level-module-aliases\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#type-level-module-aliases\"></a>Type-level Module Aliases</h3>\n<p>Leo White has been working with <a href=\"http://www.math.nagoya-u.ac.jp/~garrigue/\">Jacques\nGarrigue</a> on adding support\nfor module aliases into OCaml. This significantly improves the\ncompilation speed and executable binary sizes when using large libraries\nsuch as\n<a href=\"https://realworldocaml.org/v1/en/html/concurrent-programming-with-async.html\">Core/Async</a>.</p>\n<ul>\n<li><a href=\"https://sites.google.com/site/mlworkshoppe/modalias.pdf?attredirects=0\">Abstract</a></li>\n<li><a href=\"https://blogs.janestreet.com/better-namespaces-through-module-aliases\">Better Namespaces through Module\nAliases</a></li>\n<li>Date: 09:35-10:00, ML Workshop, Thu Sep 4th.</li>\n</ul>\n<h3 id=\"coeffects-a-calculus-of-context-dependent-computation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#coeffects-a-calculus-of-context-dependent-computation\"></a>Coeffects: A Calculus of Context-dependent Computation</h3>\n<p>Alan Mycroft has been working with Tomas Petricek and Dominic Orchard on\ndefining a broader notion of context than just variables in scope. 
Tomas\nwill be presenting a research paper on developing a generalized coeffect\nsystem with annotations indexed by a correct shape.</p>\n<ul>\n<li><a href=\"http://www.cl.cam.ac.uk/~dao29/publ/coeffects-icfp14.pdf\">Paper</a></li>\n<li>Date: 16:30-17:20, ICFP Day 1, Mon Sep 1st.</li>\n</ul>\n<h2 id=\"mirage-os-20\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#mirage-os-20\"></a>Mirage OS 2.0</h2>\n<p>We <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">released Mirage OS\n2.0</a> in July,\nand there will be several talks diving into some of the new features you\nmay have read about on the blog.</p>\n<h3 id=\"unikernels-keynote-at-haskell-symposium\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#unikernels-keynote-at-haskell-symposium\"></a>Unikernels Keynote at Haskell Symposium</h3>\n<p>Since MirageOS is a\n<a href=\"/papers/2013-asplos-mirage.pdf\">unikernel</a>\nwritten entirely in OCaml, it makes perfect sense to describe it in\ndetail to our friends over at the <a href=\"http://www.haskell.org/haskell-symposium/\">Haskell\nSymposium</a> and reflect on\nsome of the design implications between Haskell type-classes and OCaml\nfunctors and metaprogramming. Anil Madhavapeddy will be doing just that\nin a Friday morning keynote at the Haskell Symposium.</p>\n<ul>\n<li>Haskell Symposium\n<a href=\"http://www.haskell.org/haskell-symposium/2014/index.html\">Program</a></li>\n<li>Date: 09:00-10:00, Haskell Symposium, Fri Sep 5th.</li>\n</ul>\n<h3 id=\"transport-layer-security-in-ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#transport-layer-security-in-ocaml\"></a>Transport Layer Security in OCaml</h3>\n<p>Hannes Menhert and David Kaloper have been <a href=\"http://openmirage.org/blog/introducing-ocaml-tls\">working\nhard</a> on integrating a\npure OCaml Transport Layer Security stack into Mirage OS. 
They’ll talk\nabout the design principles underlying the library, and reflect on the\nnext steps to build a TLS stack that we can rely on not to be more\ninsecure than telnet.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_4.pdf\">Abstract</a></li>\n<li>Date: 10:25-11:20, OCaml Workshop, Fri Sep 5th.</li>\n</ul>\n<p>Hannes will also continue his travels and deliver a couple of talks the\nweek after ICFP on the same topic in Denmark, so you can still see it if\nyou happen to miss this week’s presentation:</p>\n<ul>\n<li>9th Sep at 15:00, IT University of Copenhagen (2A08),\n<a href=\"http://list.ku.dk/pipermail/sci-diku-prog-lang/2014-August/000244.html\">details</a></li>\n<li>11th Sep Aarhus University, same talk (time and room TBA)</li>\n</ul>\n<h3 id=\"irmin-a-branch-consistent-distributed-library-database\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#irmin-a-branch-consistent-distributed-library-database\"></a>Irmin: a Branch-consistent Distributed Library Database</h3>\n<p>Irmin is an <a href=\"https://github.com/mirage/irmin\">OCaml library</a> to persist\nand synchronize distributed data structures both on-disk and in-memory.\nIt enables a style of programming very similar to the Git workflow,\nwhere distributed nodes fork, fetch, merge and push data between each\nother. 
The general idea is that you want every active node to get a\nlocal (partial) copy of a global database and always be very explicit\nabout how and when data is shared and migrated.</p>\n<p>This has been a big collaborative effort led by Thomas Gazagnaire, and\nincludes contributions from Amir Chaudhry, Anil Madhavapeddy, Richard\nMortier, David Scott, David Sheets, Gregory Tsipenyuk, Jon Crowcroft.\nWe’ll be demonstrating Irmin <a href=\"https://www.youtube.com/watch?v=DSzvFwIVm5s\">in\naction</a>, so please come\nalong if you’ve got any interesting applications you would like to talk\nto us about.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_11.pdf\">Abstract</a></li>\n<li><a href=\"http://openmirage.org/blog/introducing-irmin\">Blog Post</a></li>\n<li>Date: 15:10-16:30, Joint Poster Session for OCaml/ML Workshop, Fri\nSep 5th 2014.</li>\n</ul>\n<h3 id=\"metaprogramming-with-ml-modules-in-the-mirageos\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#metaprogramming-with-ml-modules-in-the-mirageos\"></a>Metaprogramming with ML modules in the MirageOS</h3>\n<p>Mirage OS lets the programmer build modular operating system components\nusing a combination of OCaml functors and generative metaprogramming.\nThis ensures portability across both Unix binaries and Xen unikernels,\nwhile preserving a usable developer workflow.</p>\n<p>The core Mirage OS team of Anil Madhavapeddy, Thomas Gazagnaire, David\nScott and Richard Mortier will be talking about the details of the\nfunctor combinators that make all this possible, and doing a live\ndemonstration of it running on a tiny <a href=\"http://openmirage.org/blog/introducing-xen-minios-arm\">ARM\nboard</a>!</p>\n<ul>\n<li><a href=\"https://sites.google.com/site/mlworkshoppe/Gazagnaire-abstract.pdf?attredirects=0\">Abstract</a></li>\n<li>Date: 14:50-15:10, ML Workshop, Thu Sep 4th 2014.</li>\n</ul>\n<h3 id=\"cufp-ocaml-language-tutorial\"><a class=\"anchor\" aria-hidden=\"true\" 
href=\"#cufp-ocaml-language-tutorial\"></a>CUFP OCaml Language Tutorial</h3>\n<p>Leo White and Jeremy Yallop (with much helpful assistance from Daniel\nBuenzli) will be giving a rather different OCaml tutorial from the usual\nfare: they are taking you on a journey of building a variant of the\npopular <a href=\"http://gabrielecirulli.github.io/2048/\">2048</a> game in pure\nOCaml, and compiling it to JavaScript using the\n<a href=\"http://ocsigen.org/js_of_ocaml/\">js_of_ocaml</a> compiler. This is a\nvery pragmatic introduction to using statically typed functional\nprogramming combined with efficient compilation to JavaScript.</p>\n<blockquote>\n<p>In this tutorial, we will first introduce the basics of OCaml using an\ninteractive environment running in a web browser, as well as a local\ninstall of OCaml using the OPAM package manager. We will also explore\nhow to compile OCaml to JavaScript using the js_of_ocaml tool.</p>\n</blockquote>\n<p>The tutorial is focused around writing the 2048 logic, which will then\nbe compiled with js_of_ocaml and linked together with a frontend based\non (a pre-release version of) Useri, React, Gg and Vg, thanks to Daniel\nBuenzli. 
There’ll also be appearances from OPAM, IOCaml, Qcheck and\nOUnit.</p>\n<ul>\n<li><a href=\"https://github.com/ocamllabs/cufp-tutorial/\">Tutorial Code</a></li>\n<li><a href=\"https://github.com/ocamllabs/cufp-tutorial/blob/master/task.md\">Task\nSheet</a></li>\n<li>Date: 09:00-12:00, CUFP Tutorial Day 2, Fri Sep 5th 2014.</li>\n</ul>\n<p>There will also be a limited supply of special edition OCaml-branded USB\nsticks for the first tutorial attendees, so get here early for your\nexclusive swag!</p>\n<h2 id=\"the-ocaml-platform\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-ocaml-platform\"></a>The OCaml Platform</h2>\n<p>The group here has been working hard all summer to pull together an\nintegrated demonstration of the new generation of OCaml tools being\nbuilt around the increasingly popular <a href=\"https://opam.ocaml.org\">OPAM</a>\npackage manager. Anil Madhavapeddy will demonstrate all of these pieces\nin the OCaml Workshop, with guest appearances of work from Amir\nChaudhry, Daniel Buenzli, Jeremie Dimino, Thomas Gazagnaire, Louis\nGesbert, Thomas Leonard, David Sheets, Mark Shinwell, Christophe\nTroestler, Leo White and Jeremy Yallop.</p>\n<blockquote>\n<p>The OCaml Platform combines the OCaml compiler toolchain with a\ncoherent set of tools for build, documentation, testing and IDE\nintegration. 
The project is a collaborative effort across the OCaml\ncommunity, tied together by the OCaml Labs group in Cambridge and with\nother major contributors.</p>\n</blockquote>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">Abstract</a></li>\n<li><a href=\"https://opam.ocaml.org/blog\">Platform Blog</a></li>\n<li>Date: 12:00-12:30, OCaml Workshop, Fri Sep 5th 2014.</li>\n</ul>\n<h3 id=\"the-0install-binary-installation-system\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-0install-binary-installation-system\"></a>The 0install Binary Installation System</h3>\n<p>Thomas Leonard will also be delivering a separate talk about\ncross-platform binary installation via his\n<a href=\"http://zero-install.sourceforge.net/\">0install</a> library, which works on\na variety of platforms including Windows, Linux and Mac OS X. He\nrecently rewrote it in <a href=\"http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-retrospective/\">OCaml from\nPython</a>,\nand will be sharing his experiences on how this went as a new OCaml\nuser, as well as delivering an introduction to 0install.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_3.pdf\">Abstract</a></li>\n<li>Date: 10:25-10:50, OCaml Workshop, Fri Sep 5th 2014.</li>\n</ul>\n<h2 id=\"service-and-socialising\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#service-and-socialising\"></a>Service and Socialising</h2>\n<p>Heidi Howard and Leonhard Markert are acting as student volunteers at\nthis year’s ICFP, and assisting with videoing various workshops such as\nCUFP Tutorials, Haskell Symposium, the Workshop on Functional\nHigh-Performance Computing and the ML Family Workshop. 
Follow their live\nblogging on the <a href=\"http://www.syslog.cl.cam.ac.uk/\">Systems Research Group\nSysBlog</a> and leave comments about any\nsessions you’d like to know more about!</p>\n<p>Anil Madhavapeddy is the ICFP industrial relations chair and will be\nhosting an Industrial Reception on Thursday 4th September in the <a href=\"http://www.varldskulturmuseerna.se/varldskulturmuseet/\">Museum\nof World\nCulture</a>\nstarting from 7pm. There will be wine, food and some inspirational talks from the ICFP\nsponsors that not only make the conference possible, but provide an\navenue for the academic work to make its way out into industry (grad\nstudents that are job hunting: this is where you get to chat to folk\nhiring FP talent).</p>\n<p>This list hasn’t been exhaustive, and only covers the activities of my\ngroup in <a href=\"http://ocaml.io\">OCaml Labs</a> and the <a href=\"http://www.cl.cam.ac.uk/research/srg/\">Systems Research\nGroup</a> at Cambridge. There are\nnumerous other talks from the Cambridge Computer Lab during the week,\nbut the artistic highlight will be on Saturday evening following the\n<a href=\"http://cufp.org/2014/\">CUFP talks</a>: <a href=\"http://sam.aaron.name/\">Sam Aaron</a>\nwill be doing a <a href=\"https://twitter.com/samaaron/status/505081137660981248\">live musical\nperformance</a>\nsometime after 8pm at <a href=\"http://www.3vaningen.se/\">3vaningen</a>. Sounds like\na perfect way to wind down after what’s gearing up to be an intense\nICFP 2014. I look forward to seeing old friends and making new ones in\nGothenburg soon!</p>",
      "url": "https://anil.recoil.org/notes/ocaml-labs-at-icfp-2014",
      "title": "Talks from OCaml Labs during ICFP 2014",
      "summary": "OCaml Labs talks at ICFP 2014, covering language improvements & MirageOS",
      "date_published": "2014-08-31T00:00:00.000000Z",
      "date_modified": "2014-08-31T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "icfp",
        "livenotes"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/5cdf2eef-9053-428e-b8b3-ab5ae274c129-1",
      "content_html": "<p>Appeared on FLOSS Weekly 302 about Open Mirage, where Randal Schwartz and Simon Phipps interviewed me about the MirageOS project. We discussed how unikernels bring functional programming principles to systems programming, the benefits of using OCaml for building secure network services, and the open source community growing around the project. It was a great opportunity to introduce MirageOS to a broader open source audience.</p>",
      "url": "https://anil.recoil.org/notes/5cdf2eef-9053-428e-b8b3-ab5ae274c129-1",
      "title": "FLOSS Weekly 302: Open Mirage",
      "summary": "Interview with Randal Schwartz and Simon Phipps about MirageOS on FLOSS Weekly podcast episode 302.",
      "date_published": "2014-07-23T00:00:00.000000Z",
      "date_modified": "2014-07-23T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "unikernels",
        "ocaml",
        "podcast"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/announcing-mirageos-2",
      "content_html": "<p>This is a big release for us; after the first version came out earlier in the year, we added in support for ARM devices, a new storage subsystem called <a href=\"https://irmin.org\">Irmin</a> and even a pure OCaml TLS stack.</p>",
      "url": "https://anil.recoil.org/notes/announcing-mirageos-2",
      "external_url": "https://mirageos.org/blog/announcing-mirage-20-release",
      "title": "MirageOS v2.0: a recap of new features",
      "summary": "MirageOS v2.0 adds ARM support, Irmin storage and OCaml TLS stack.",
      "date_published": "2014-07-22T00:00:00.000000Z",
      "date_modified": "2014-07-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/announcing-mirageos-1-2",
      "content_html": "<p>I announce a point release of MirageOS 1.x, and the exciting run up to the major MirageOS 2.0 release which has lots of new features. The number of Mirage users is growing steadily!</p>",
      "url": "https://anil.recoil.org/notes/announcing-mirageos-1-2",
      "external_url": "https://mirageos.org/blog/mirage-1.2-released",
      "title": "MirageOS v1.2 released and the runup to 2.0",
      "summary": "MirageOS v1.2 released, paving way for MirageOS 2.0 with new features.",
      "date_published": "2014-07-08T00:00:00.000000Z",
      "date_modified": "2014-07-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/d5411e25-7845-41e8-b3ec-ab3c33ce13c8-1",
      "content_html": "<p>Appeared on SE Radio Episode 204 about Mirage and OCaml, talking with Robert Blumen on the Software Engineering Radio podcast. We discussed the Mirage cloud operating system architecture, how OCaml's type system and functional features made it ideal for systems programming, and the practical benefits of unikernels for cloud deployment. The conversation covered both the technical foundations and real-world applications of our work.</p>",
      "url": "https://anil.recoil.org/notes/d5411e25-7845-41e8-b3ec-ab3c33ce13c8-1",
      "title": "SE Radio Episode 204: Anil Madhavapeddy on the Mirage Cloud Operating System and the OCaml Language",
      "summary": "Software Engineering Radio podcast episode discussing Mirage cloud operating system and OCaml language.",
      "date_published": "2014-05-01T00:00:00.000000Z",
      "date_modified": "2014-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "ocaml",
        "unikernels",
        "podcast",
        "interview"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/codio-now-has-opam-support",
      "content_html": "<p>I noticed an offhand tweet from Phil Tomson about <a href=\"http://codio.com/\">Codio</a> adding OPAM support, and naturally had to take a quick look. I was <em>really</em> impressed by the whole process, and ended up building the <a href=\"https://web.archive.org/web/20170914182531/http://www.openmirage.org/wiki/mirage-www\">Mirage Xen website</a> unikernel directly from my web browser in less than a minute, including registration!</p>\n<ul>\n<li>I signed up to Codio for free (since it’s <a href=\"https://web.archive.org/web/20170914182531/https://codio.com/avsm/Mirage-WWW/\">a public project</a>) using GitHub oAuth (only public identity access required at first, no repository access).</li>\n<li>Selected a <code>git</code> project and pointed it at the <a href=\"https://web.archive.org/web/20170914182531/https://github.com/mirage/mirage-www\">mirage-www</a> repository.</li>\n<li>At this point, you get the usual file explorer and code editor view in your browser. The magic begins when you go to “Tools/Terminal”, and an interactive Ubuntu shell pops up. Since Codio added <a href=\"https://web.archive.org/web/20170914182531/https://codio.com/s/blog/2014/03/new-parts/\">opam support</a>, setting up the Mirage environment is a breeze:</li>\n</ul>\n<blockquote>\n<p>I notice Codio supports OCaml and opam on the server side now.\n— phil tomson (@philtor)\n<a href=\"https://web.archive.org/web/20170914182531/https://twitter.com/philtor/statuses/448884571950444545\">March 26, 2014</a></p>\n</blockquote>\n<pre><code class=\"language-bash\">$ parts install opam\n$ opam init -a\n$ eval `opam config env`\n$ opam install mirage-www -y\n$ make MODE=xen\n</code></pre>\n<p>Then have a cup of coffee while the box builds, and you have a <code>mir-www.xen</code>, all from your web browser! 
Codio has a number of deployment options available too, so you should be able to hook up a <a href=\"https://web.archive.org/web/20170914182531/http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Git-based workflow</a> using some combination of Travis or other CI service.</p>\n<p>This is the first time I’ve ever been impressed by an online editor, and might consider moving away from my beloved vi...</p>",
      "url": "https://anil.recoil.org/notes/codio-now-has-opam-support",
      "title": "Codio: build Mirage unikernels from a browser",
      "summary": "Build Mirage unikernels from a browser with Codio's OPAM support and interactive Ubuntu shell.",
      "date_published": "2014-03-26T00:00:00.000000Z",
      "date_modified": "2014-03-26T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "web"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/ocaml-github-and-opam",
      "content_html": "<p>Gabriel Scherer <a href=\"http://gallium.inria.fr/blog/patch-review-on-github/\">announced an\nexperiment</a> to\nhost OCaml compiler pull requests on\n<a href=\"https://github.com/ocaml/ocaml/pulls\">GitHub</a> for six months. There is\na general feeling that GitHub would be a more modern hosting platform\nthan the venerable but reliable\n<a href=\"http://caml.inria.fr/mantis/changelog_page.php\">Mantis</a> setup that has\nin place for over a decade, but the only way to find out for sure is by\ntrying it out for a while.</p>\n<p>One of the great benefits of using GitHub is their excellent\n<a href=\"http://developer.github.com/v3/\">API</a> to easily automate workflows\naround issues and pull requests. After a suggestion from Jeremy Yallop\nand David Sheets over lunch, I decided to use this to make it easier to\nlocally apply compiler patches. OPAM has a great <a href=\"https://opam.ocaml.org/doc/Advanced_Usage.html#h2-Usingadifferentcompiler\">compiler\nswitch</a>\nfeature that lets you run simultaneous OCaml installations and swap\nbetween them easily.</p>\n<p>For instance, the default setting gives you access\nto:</p>\n<pre><code>$ opam switch\nsystem  C system       System compiler (4.01.0)\n--     -- 3.11.2       Official 3.11.2 release\n--     -- 3.12.1       Official 3.12.1 release\n--     -- 4.00.0       Official 4.00.0 release\n--     -- 4.00.1       Official 4.00.1 release\n--     -- 4.01.0       Official 4.01.0 release\n--     -- 4.01.0beta1  Beta1 release of 4.01.0\n</code></pre>\n<p>I used my <a href=\"https://github.com/avsm/ocaml-github\">GitHub API bindings</a> to\nknock up a script that converts every GitHub pull request into a custom\ncompiler switch. 
You can see these by passing the <code>--all</code> option to\n<code>opam switch</code>, as follows:</p>\n<pre><code>$ opam switch --all\n--     -- 4.02.0dev+pr10              Add String.{split,rsplit}\n--     -- 4.02.0dev+pr13              Add String.{cut,rcut}.\n--     -- 4.02.0dev+pr14              Add absolute directory names to bytecode format for ocamldebug to use\n--     -- 4.02.0dev+pr15              replace String.blit by String.unsafe_blit\n--     -- 4.02.0dev+pr17              Cmm arithmetic optimisations\n--     -- 4.02.0dev+pr18              Patch for issue 5584\n--     -- 4.02.0dev+pr2               Parse -.x**2. (unary -.) as -.(x**2.).  Fix PR#3414\n--     -- 4.02.0dev+pr20              OCamlbuild: Fix the check of ocamlfind\n--     -- 4.02.0dev+pr3               Extend record punning to allow destructuring.\n--     -- 4.02.0dev+pr4               Fix for PR#4832 (Filling bigarrays may block out runtime)\n--     -- 4.02.0dev+pr6               Warn user when a type variable in a type constraint has been instantiated.\n--     -- 4.02.0dev+pr7               Extend ocamllex with actions before refilling\n--     -- 4.02.0dev+pr8               Adds a .gitignore to ignore all generated files during `make world.opt'\n--     -- 4.02.0dev+pr9               FreeBSD 10 uses clang by default, with gcc not available by default\n--     -- 4.02.0dev+trunk             latest trunk snapshot\n</code></pre>\n<p>Testing the impact of a particular compiler switch is now pretty\nstraightforward. If you want to play with Stephen Dolan’s <a href=\"https://github.com/ocaml/ocaml/pull/17\">optimized\narithmetic operations</a>, for\ninstance, you just need to do:</p>\n<pre><code>$ opam switch 4.02.0dev+pr17\n$ eval `opam config env`\n</code></pre>\n<p>And your local environment now points to the patched OCaml compiler. 
For\nthe curious, the scripts to generate the OPAM pull requests are in my\n<a href=\"https://github.com/avsm/opam-sync-github-prs\">avsm/opam-sync-github-prs</a>\nrepository. It contains an example of how to query active pull requests,\nand also to create a new cross-repository pull request (using the <a href=\"https://github.com/avsm/ocaml-github\">git\njar</a> binary from my GitHub\nbindings). The scripts run daily for now, and delete switches once the\ncorresponding pull request is closed. Just run <code>opam update</code> to retrieve\nthe latest switch set from the upstream <a href=\"https://github.com/ocaml/opam-repository\">OPAM package\nrepository</a>.</p>",
      "url": "https://anil.recoil.org/notes/ocaml-github-and-opam",
      "title": "Easily OPAM switching to any OCaml feature request",
      "summary": "Switch between OCaml compilers easily with OPAM and GitHub pull requests.",
      "date_published": "2014-03-25T00:00:00.000000Z",
      "date_modified": "2014-03-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/icfp-call-for-sponsorships",
      "content_html": "<p>The call for papers for this year’s <a href=\"http://icfpconference.org/icfp2014/\">International Conference on Functional Programming</a> has just closed, with around a hundred cutting-edge research papers submitted on the theory, application, and experiences behind functional programming. This marks just the beginning of sorting out the program, as there are also over 10 big <a href=\"http://icfpconference.org/icfp2014/affiliated.html\">affiliated workshops</a> that run throughout the week on topics ranging from specific languages (<a href=\"http://www.erlang.org/workshop/2014/\">Erlang</a>, <a href=\"http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop\">Haskell</a>, <a href=\"http://ocaml.org/meetings/ocaml/2014/\">OCaml</a>), the broader <a href=\"http://cufp.org/\">commercial community</a>, and even <a href=\"http://functional-art.org/\">art and music</a>.</p>\n<p>The ICFP conference experience can be a remarkable one for students. Some great ideas have emerged from random corridor conversations between talks with the likes of <a href=\"http://homepages.inf.ed.ac.uk/wadler/\">Phil Wadler</a>, or from rain-soaked discussions with <a href=\"http://research.microsoft.com/en-us/people/simonpj/\">Simon PJ</a> at <a href=\"http://mikkeller.dk/\">Mikeller</a>, or in my case, from being convinced to <a href=\"https://blogs.janestreet.com/the-making-of-real-world-ocaml/\">write a book</a> while in a smoky Tokyo bar.</p>\n<p>Functional programming worldwide has been growing ever more popular in 2014 (and <a href=\"http://whatsapp.com/\">lucrative</a>). We’re committed to growing the ICFP community, not just in numbers but also in diversity. 
We had a record number of sponsors in 2013, and sustaining the growth means that we need to reach ever wider to support the activities of the (not-for-profit) conference.</p>\n<p>So as this year’s industrial relations chair, I thought I’d throw the gates open and <strong>invite any organization that wishes to support FP to get in touch with us</strong> (e-mail at <code>avsm2@cl.cam.ac.uk</code>) and sponsor us. I’ve put an abridged version of the e-mail solicitation below that describes the benefits. Sponsorship can start as low as $500 and is often tax deductible in many countries.</p>\n<blockquote>\n<p>I’m writing to ask if you would be willing to provide corporate financial support for the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Gothenburg, Sweden, from September 1st through 3rd, 2014:</p>\n<p><a href=\"http://icfpconference.org/icfp2014/\">http://icfpconference.org/icfp2014/</a></p>\n<p>Corporate support funds are primarily used to subsidize students – the lifeblood of our community – and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.</p>\n<p>Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2013 in Boston. The Industrial Reception, open to all attendees, was by all accounts a roaring success. All 2013 sponsoring companies had the opportunity to speak to the gathered students, academics, and software professionals.</p>\n<p>This year, let’s build on that success and continue to grow our community, and bring even more students to ICFP 2014 in Sweden!</p>\n<p>Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. 
There, they will meet luminaries in the field, as well as people who’ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them.</p>\n<p>This year, we’re continuing a similar system of levels of financial support as last year. Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).</p>\n<p>The support levels, with their associated pledge amounts and benefits, are as follows (costs in US dollars).</p>\n<p><strong>Bronze:</strong> $500: Logo on website, poster at industrial reception, listed in proceedings.</p>\n<p><strong>Silver:</strong> $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters)</p>\n<p><strong>Gold:</strong> $5000: As above plus: named supporter of industrial reception, opportunity to include branded merchandise in participants’ swag bag.</p>\n<p><strong>Platinum:</strong> $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).</p>\n</blockquote>\n<p>If you are interested, please get in touch with <a href=\"mailto:anil@recoil.org\">me</a> or any of the <a href=\"http://icfpconference.org/icfp2014/index.html\">organizing committee</a>.\nIf you’re interested in helping out ICFP in a non-financial capacity (for example as a student volunteer), then there will also be plenty of opportunity to sign up later in the year.</p>",
      "url": "https://anil.recoil.org/notes/icfp-call-for-sponsorships",
      "title": "ICFP 2014 - a call for sponsorship and how you can help",
      "summary": "Support functional programming community growth by sponsoring ICFP 2014, starting at $500.",
      "date_published": "2014-03-03T00:00:00.000000Z",
      "date_modified": "2014-03-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2010-iswp-dustclouds-1",
      "content_html": "<p>Paper on building dust clouds for anonymous communication. This work with <a href=\"https://github.com/mor1\">Richard Mortier</a>, Theodore Hong, and others explored how cloud computing platforms like Amazon EC2 could facilitate anonymous communications. By dynamically leasing diverse collections of virtual machines spread across the world, we could enhance systems like Tor with the flexibility and geographic diversity of cloud infrastructure. The paper discusses both the opportunities this creates for privacy-preserving communications and the unique challenges that arise from using commercial cloud platforms for anonymity.</p><h1>References</h1><ul><li>Mortier et al (2010). Using Dust Clouds to Enhance Anonymous Communication. Springer. <a href=\"https://doi.org/10.1007/978-3-662-45921-8_10\" target=\"_blank\"><i>10.1007/978-3-662-45921-8_10</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2010-iswp-dustclouds-1",
      "title": "Using Dust Clouds to Enhance Anonymous Communication",
      "summary": "Paper on building dust clouds for anonymous communication systems.",
      "date_published": "2014-03-01T00:00:00.000000Z",
      "date_modified": "2014-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "privacy",
        "networking",
        "anonymity"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2010-iswp-dustclouds.pdf",
          "mime_type": "application/pdf",
          "title": "Using Dust Clouds to Enhance Anonymous Communication"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/978-3-662-45921-8_10",
          "doi": "10.1007/978-3-662-45921-8_10",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/unikernels-in-cacm",
      "content_html": "<p>The Communications of the ACM have just published an article that <a href=\"https://dave.recoil.org\">Dave Scott</a> and I wrote providing a broader background on the concept of <a href=\"http://anil.recoil.org/papers/2013-asplos-mirage.pdf\">Unikernels</a> that we’ve been working on since about 2003, when we started building <a href=\"http://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Melange</a> and the <a href=\"http://anil.recoil.org/papers/2010-icfp-xen.pdf\">Xen toolstack</a>. You can read either the <a href=\"http://cacm.acm.org/magazines/2014/1/170866-unikernels\">print article</a> (requires an ACM subscription) or the <a href=\"http://queue.acm.org/detail.cfm?id=2566628\">open access version</a> on the ACM Queue.</p>\n<p><img src=\"/images/acm-queue-unikernels-ss.webp\" alt=\"%r\" >\nThere's been some interesting discussion about it already online:</p>\n<ul>\n<li>On <a href=\"http://www.reddit.com/r/programming/comments/1upy41/mirage_os_10_released_last_december/\">Reddit</a>, a number of queries about how it fits into the space of containers, microkernels, and other experimental operating systems.</li>\n<li>Coverage from <a href=\"http://www.eweek.com/cloud/xen-project-builds-its-own-cloud-os-mirage.html\">eWeek</a>, <a href=\"http://www.infoworld.com/t/operating-systems/xen-mirage-the-less-more-cloud-os-233823\">InfoWorld</a>, and <a href=\"http://www.linux.com/news/enterprise/cloud-computing/751156-are-cloud-operating-systems-the-next-big-thing\">Linux.com</a>, and a couple of interviews on InfoQ covering <a href=\"http://www.infoq.com/news/2013/12/mirageos\">Mirage</a> and my <a href=\"http://www.infoq.com/articles/real-world-ocaml-interview\">book on OCaml</a> that give more background on the project.</li>\n</ul>\n<p>Two of the most interesting bits of feedback for me personally came from <a href=\"http://en.wikipedia.org/wiki/Butler_Lampson\">Butler Lampson</a> (via Jon Crowcroft) and <a 
href=\"http://www.cs.cmu.edu/~rwh/\">Robert Harper</a>, two computer scientists who have made key contributions to operating systems and programming languages and provided some broader perspective.</p>\n<p>Butler Lampson points out (edited for the web):</p>\n<blockquote>\n<p>I found the Mirage work quite interesting: a 21st-century version of things that we did at Xerox in the 1970s. Of course, the application domain is quite different, and so is the whole-program optimization. And we couldn’t afford garbage collection, so freeing storage was not type-safe. But there are lots of interesting parallels.</p>\n<p>The “OS as libraries” idea was what made it possible to fit big applications into the Alto’s 128k bytes of memory:</p>\n<p><em>Lampson and Sproull</em>, <a href=\"http://research.microsoft.com/pubs/68223/acrobat.pdf\">An open operating system for a single-user machine</a>, ACM Operating Systems Rev. 11, 5 (Dec. 1979), pp 98-105. <a href=\"http://dl.acm.org/citation.cfm?id=800215.806575\">ACM</a>.</p>\n<p>The use of strong type-checking and interfaces for an OS was pioneered in <a href=\"http://en.wikipedia.org/wiki/Mesa_(programming_language%29\">Mesa</a> and <a href=\"http://en.wikipedia.org/wiki/Pilot_(operating_system%29\">Pilot</a>:</p>\n<p><em>Lauer and Satterthwaite</em>, <a href=\"http://dl.acm.org/citation.cfm?id=802937\">The impact of Mesa on system design</a>, Proc. 4th ICSE, Munich, Sep. 1979, pp 174-182.</p>\n<p><em>Redell et al</em>, <a href=\"http://web.cs.wpi.edu/~cs502/s06/Papers/Redell,%20Pilot%20Operating%20System.pdf\">Pilot: An Operating System for a Personal Computer</a>, Comm. ACM 23, 2 (Feb 1980), pp 81-92 (from 7th SOSP, 1979). 
<a href=\"http://dl.acm.org/citation.cfm?id=358818.358822&amp;coll=DL&amp;dl=ACM&amp;CFID=396678249&amp;CFTOKEN=51329799\">ACM</a>.</p>\n</blockquote>\n<p>Robert Harper correctly points out some related work that was missing from our CACM article:</p>\n<ul>\n<li><a href=\"http://www.cs.cmu.edu/~fox/foxnet.html\">FoxNet</a> is an implementation of the standard TCP/IP networking protocol stack using the <a href=\"http://en.wikipedia.org/wiki/Standard_ML\">Standard ML</a> (SML) language. It was part of a wide-reaching project at CMU in the 1990s that made seminal contributions in <a href=\"http://www.cs.cmu.edu/~fox/pcc.html\">proof-carrying code</a> and <a href=\"http://www.cs.cmu.edu/~fox/til.html\">typed intermediate languages</a>, among <a href=\"http://www.cs.cmu.edu/~fox/publications.html\">many other things</a>. The FoxNet stack was actually one of my big inspirations for wanting to build Mirage: the elegance of using functors as a form of dependency injection into a system as complex as an OS and application stack is very desirable, and is the reason we chose to build Mirage in ML instead of another, less modular, language.</li>\n<li>Ensemble (website now offline but here’s a <a href=\"http://www.cs.uni-potsdam.de/ti/kreitz/PDF/99sosp-fastpath.pdf\">SOSP 1999 paper</a>) is a group communication system written in OCaml, developed at Cornell and the Hebrew University. For an application builder, Ensemble provides a library of protocols that can be used for quickly building complex distributed applications. For a distributed systems researcher, Ensemble is a highly modular and reconfigurable toolkit: the high-level protocols provided to applications are really stacks of tiny protocol “layers,” each of which can be modified or rebuilt to experiment.</li>\n</ul>\n<p>Both Ensemble and FoxNet echo strongly throughout the design of Mirage (and its precursor software such as <a href=\"http://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Melange</a> in 2007). 
The <a href=\"http://openmirage.org/wiki/hello-world\">Mirage command-line tool</a> uses staged computation to build a concrete application out of functors, and we are making this even more programmable via a new <a href=\"https://github.com/mirage/mirage/pull/178\">combinator-based functor types</a> library that <a href=\"http://gazagnaire.org/\">Thomas Gazagnaire</a> built, and also experimenting with <a href=\"https://github.com/ocamllabs/higher\">higher-kinded polymorphic</a> abstractions.</p>\n<p>My thanks to Butler Lampson and Robert Harper for making me go and re-read their papers, and I’d like to leave you with Malte Schwarzkopf’s <a href=\"http://www.cl.cam.ac.uk/~ms705/netos/os-reading-group.html\">OS Reading Group</a> papers for other essential reading in this space. Many more citations immediately relevant to Mirage can also be found in our <a href=\"http://anil.recoil.org/papers/2013-asplos-mirage.pdf\">ASPLOS 2013</a> paper.</p>",
      "url": "https://anil.recoil.org/notes/unikernels-in-cacm",
      "title": "Unikernels, and the Rise of the Virtual Library Operating System",
      "summary": "Unikernels and virtual library operating systems are on the rise, changing the face of cloud computing.",
      "date_published": "2014-01-13T00:00:00.000000Z",
      "date_modified": "2014-01-13T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "cacm",
        "interview"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/dxt23-hxf18",
      "content_html": "<p>This time last year in 2012, I had just\n<a href=\"/2012/10/19/announcing-ocaml-labs.html\">announced</a>\nthe formation of a new group called <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/\">OCaml\nLabs</a> in the <a href=\"http://www.cl.cam.ac.uk\">Cambridge\nComputer Lab</a> that would combine research and\ncommunity work towards the practical application of functional\nprogramming. An incredible year has absolutely flown by, and I’ve put\ntogether this post to summarise what’s gone on, and point to our future\ndirections for 2014.</p>\n<p>The theme of our group was not to be pure research, but rather a hybrid\ngroup that would take on some of the load of day-to-day OCaml\nmaintenance from <a href=\"http://caml.inria.fr\">INRIA</a>, as well as help grow the\nwider OCaml community. To this end, all of our projects have been highly\ncollaborative, often involving colleagues from\n<a href=\"http://ocamlpro.com\">OCamlPro</a>, <a href=\"http://gallium.inria.fr/\">INRIA</a>,\n<a href=\"http://janestreet.com\">Jane Street</a>, <a href=\"http://www.lexifi.com/\">Lexifi</a>\nand <a href=\"http://citrix.com\">Citrix</a>.</p>\n<p>This post covers progress in <a href=\"#tooling\">tooling</a>, the <a href=\"#core_compiler\">compiler and\nlanguage</a>, <a href=\"#community_efforts\">community efforts</a>,\n<a href=\"#research_projects\">research projects</a> and concludes with our\n<a href=\"#priorities_for_2014\">priorities for 2014</a>.</p>\n<h2 id=\"tooling\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tooling\"></a>Tooling</h2>\n<p>At the start of 2013, OCaml was in the interesting position of being a\nmature decades-old language with a small, loyal community of industrial\nusers who built mission critical applications using it. We had the\nopportunity to sit down with many of them at the <a href=\"http://caml.inria.fr/consortium/\">OCaml\nConsortium</a> meeting and prioritise\nwhere we started work. 
The answer came back clearly: while the compiler\nitself is legendary for its stability, the tooling around it (such as\npackage management) was a pressing problem.</p>\n<h3 id=\"opam\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#opam\"></a>OPAM</h3>\n<p>Our solution to this tooling problem was centered around the\n<a href=\"http://opam.ocaml.org\">OPAM</a> package manager, which\n<a href=\"http://ocamlpro.com\">OCamlPro</a> released into beta just at the end of\n2012, and which had its first stable release in March 2013. OPAM differs from\nmost system package managers by emphasising a flexible distributed\nworkflow that uses version constraints to ensure incompatible libraries\naren’t mixed up (important for the statically-typed OCaml that is very\ncareful about dependencies). Working closely with\n<a href=\"http://ocamlpro.com\">OCamlPro</a> we developed a git-based workflow to\nmake it possible for users (both individual and industrial) to easily\nbuild up their own package repositories and redistribute OCaml code, and\nstarted curating the <a href=\"https://github.com/ocaml/opam-repository\">package\nrepository</a>.</p>\n<p>The results have been satisfying: we started with an initial set of\naround 100 packages in OPAM (mostly imported by the 4 developers), and\nended 2013 with 587 unique packages and 2000 individual versions, with\ncontributions from 160 individuals. We now have a curated <a href=\"https://github.com/ocaml/opam-repository\">central\npackage repository</a> for anyone\nto submit their OCaml code, and several third-party remotes are maintained\n(e.g. the <a href=\"https://github.com/xapi-project/opam-repo-dev\">Xen Project</a>\nand <a href=\"https://github.com/ocsigen/opam-ocsigen\">Ocsigen</a>). 
We also\nregularly receive releases of the <a href=\"http://ocaml.janestreet.com\">Core</a>\nlibraries from Jane Street, and updates from sources as varied as\n<a href=\"https://github.com/ocaml/opam-repository/pull/1300\">Facebook</a>,\n<a href=\"/2013/09/16/camlpdf-the-end-of-sucky-pdf-tools.html\">Coherent\nPDF</a>,\nto the <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/guha.pdf\">Frenetic\nSDN</a> research.</p>\n<p><img src=\"/images/opam11-contributors-dec13.webp\" alt=\"\" title=\"Number of unique contributors to the central OPAM package repository\" >\n<img src=\"/images/opam11-packages-dec13.webp\" alt=\"\" title=\"Total number of unique packages (including multiple versions of the same package)\" >\n<img src=\"/images/opam11-unique-packages-dec13.webp\" alt=\"\" title=\"Total packages with multiple versions coalesced so you can see new package growth\" ></p>\n<p>A notable contribution from OCamlPro during this time was to\n<a href=\"https://github.com/ocaml/opam-repository/issues/955\">clarify</a> the\nlicensing on the package repository to be the liberal\n<a href=\"http://creativecommons.org/choose/zero/\">CC0</a>, and also to pass\nownership to the <a href=\"http://github.com/ocaml\">OCaml</a> organization on\nGitHub, where it’s now jointly maintained by OCaml Labs, OCamlPro and\nanyone else that wishes to contribute.</p>\n<h3 id=\"a-lens-into-global-ocaml-code\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-lens-into-global-ocaml-code\"></a>A lens into global OCaml code</h3>\n<p>It’s been quite interesting just watching all the varied code fly into\nthe repository, but stability quickly became a concern as the new\npackages piled up. OCaml compiles to native code on not just x86, but\nalso PowerPC, Sparc and\n<a href=\"/2012/02/25/dreamplug-debian-and-ocaml.html\">ARM</a>\nCPUs. 
We kicked off various efforts in automated testing: firstly\n<a href=\"https://github.com/dsheets\">David Sheets</a> built the\n<a href=\"https://github.com/ocaml/v2.ocaml.org/blob/master/site/meetings/ocaml/2013/proposals/ocamlot.pdf\">OCamlot</a>\ndaemon that would schedule builds across all the exotic hardware. Later\nin the year, the <a href=\"http://travis-ci.org\">Travis</a> service launched support\nfor testing from GitHub pull requests, and this became the front line of\n<a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">automated\nchecking</a> for\nall incoming new packages to OPAM.</p>\n<p>A major headache with automated testing is usually setting up the right\nbuild environment with external library dependencies, and so we <a href=\"/2013/11/15/docker-and-opam.html\">added\nDocker support</a>\nto make it easier to bulk-build packages for local developer use, with\nthe results of builds available\n<a href=\"https://github.com/avsm/opam-bulk-logs\">publicly</a> for anyone to help\ntriage. Unfortunately fixing the bugs themselves is still a <a href=\"https://github.com/ocaml/opam-repository/issues/1304\">very manual\nprocess</a>, so more\nvolunteers are always welcome to help out!</p>\n<p><img src=\"/images/travis-mascot-200px.webp\" alt=\"%r\" >\nWe’re really going to see the rewards from all this effort as\nOCaml 4.02 development proceeds, since we can now adopt a data-driven\napproach to changing language features instead of guessing how much\nthird-party code will break. 
If your code is in OPAM, then it’ll be\ntested as new features such as <a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module\naliases</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">injectivity</a>\nand <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">extension\npoints</a> show up.</p>\n<h3 id=\"better-documentation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#better-documentation\"></a>Better documentation</h3>\n<p>The venerable\n<a href=\"http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual029.html\">OCamlDoc</a>\ntool has done an admirable job for the last decade, but is increasingly\nshowing its age due to a lack of support for cross-referencing across\npackages. We started working on this problem in the summer when <a href=\"https://github.com/vincent-botbol\">Vincent\nBotbol</a> visited us on an internship,\nexpecting it to be a quick job to come up with something as good as\nHaskell’s excellent <a href=\"http://www.haskell.org/haddock/\">Haddock</a> online\ndocumentation.</p>\n<p>Instead, we ran into the &quot;module wall&quot;: since OCaml makes it so easy to\nparameterise code over other modules, it makes it hard to generate\nstatic documentation without outputting hundreds of megabytes of HTML\nevery time. After some hard work from Vincent and Leo, we’ve got a\nworking prototype that lets you simply run\n<code>opam install opam-doc &amp;&amp; opam doc core async</code> to generate package\ndocumentation. 
You can see the results for\n<a href=\"http://mirage.github.io/\">Mirage</a> online, but expect to see this\nintegrated into the main OCaml site for all OPAM packages as we work\nthrough polishing up the user interface.</p>\n<h3 id=\"turning-opam-into-libraries\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#turning-opam-into-libraries\"></a>Turning OPAM into libraries</h3>\n<p>The other behind-the-scenes effort for OPAM has been to keep the core\ncommand-line tool simple and stable, and to have it install OCaml\nlibraries that can be interfaced with by other tools to do\ndomain-specific tasks. <a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a>,\n<a href=\"http://louis.gesbert.fr/cv.en.html\">Louis Gesbert</a> and <a href=\"https://github.com/dsheets\">David\nSheets</a> have been steadily hacking away at\nthis and we now have <a href=\"https://github.com/ocamllabs/opamfu\">opamfu</a> to\nrun operations over all packages, and an easy-to-template\n<a href=\"https://github.com/ocaml/opam2web\">opam2web</a> that generates the live\n<a href=\"http://opam.ocaml.org\">opam.ocaml.org</a> website.</p>\n<p>This makes OPAM easier to deploy within other organizations that want to\nintegrate it into their workflow. For example, the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/\">software\nsection</a> of the OCaml\nLabs website is regularly generated from a search of all OPAM packages\ntagged <code>ocamllabs</code>. We also used it to rewrite the entire OPAM\nrepository <a href=\"https://github.com/ocaml/opam-repository/pull/1240\">in one epic\ndiff</a> to add\nexternal library dependencies via a <a href=\"https://github.com/ocaml/opam/pull/886/files\">command-line\nshim</a>.</p>\n<h3 id=\"opam-in-a-box\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#opam-in-a-box\"></a>OPAM-in-a-Box</h3>\n<p>All of this effort is geared towards making it easier to maintain\nreusable local OPAM installations. 
After several requests from big\nuniversities to help out with their teaching needs, we’re putting together\nall the support needed to easily redistribute OPAM packages via an\n“<a href=\"https://github.com/ocaml/opam/issues/1035\">OPAM-in-a-Box</a>” command\nthat uses <a href=\"http://docker.io\">Docker</a> containers to let you clone and do\nlightweight modifications of OCaml installations.</p>\n<p>This will also be useful for anyone who’d like to run tutorials or teach\nOCaml, without having to rely on flaky network connectivity at\nconference venues: a problem we’ve <a href=\"http://amirchaudhry.com/fpdays-review\">suffered\nfrom</a> too!</p>\n<h2 id=\"core-compiler\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#core-compiler\"></a>Core Compiler</h2>\n<p><img src=\"/images/compiler-hacking.webp\" alt=\"%r\" title=\"Compiler hacking at the Cambridge Makespace\" >\nStarting to work on a real compiler can often be a daunting prospect,\nand so one initiative we started this year was to host regular <a href=\"http://ocamllabs.github.io/compiler-hacking/2013/10/30/third-compiler-hacking-session.html\">compiler\nhacking\nsessions</a>\nwhere people can find a <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki\">curated list of\nfeatures</a> to work\non, with the regular developers at hand to help out when people get\nstuck, and free beer and pizza to oil the coding wheels. This has worked\nout well, with around 20 people showing up on average for the three we\nheld, and <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki/Things-previously-worked-on\">several\npatches</a>\nsubmitted upstream to OCaml. 
<a href=\"http://gallium.inria.fr/~scherer/\">Gabriel\nScherer</a> and <a href=\"http://cristal.inria.fr/~doligez/\">Damien\nDoligez</a> have been helping this\neffort by tagging <a href=\"http://caml.inria.fr/mantis/search.php?project_id=1&amp;sticky_issues=1&amp;sortby=last_updated&amp;dir=DESC&amp;highlight_changed=24&amp;hide_status_id=90&amp;tag_string=junior_job\">junior\njobs</a>\nin the OCaml Mantis bug tracker as they are filed.</p>\n<h3 id=\"syntax-transformations-and-extension-points\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#syntax-transformations-and-extension-points\"></a>Syntax transformations and extension points</h3>\n<p><a href=\"http://www.lpw25.net\">Leo White</a> started the year fresh out of\ncompleting his PhD with <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a>,\nand before he realized what he’d gotten himself into was working with\n<a href=\"http://alain.frisch.fr/\">Alain Frisch</a> on the future of syntax\ntransformations in OCaml. We started off our first\n<a href=\"http://lists.ocaml.org/listinfo/wg-camlp4\">wg-camlp4</a> working group on\nthe new <a href=\"http://lists.ocaml.org\">lists.ocaml.org</a> host, and a spirited\ndiscussion\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-January/thread.html\">started</a>\nthat went\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-February/thread.html\">on</a>\nand\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-March/thread.html\">on</a>\nfor several months. It ended with a very satisfying design for a simpler\n<em>extension points</em> mechanism which Leo\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">presented</a> at\nthe OCaml 2013 workshop at ICFP, and is now merged into OCaml\n4.02-trunk.</p>\n<h3 id=\"namespaces\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#namespaces\"></a>Namespaces</h3>\n<p>Not all of the working groups were quite as successful in coming to a\nconclusion as the Camlp4 one. 
On the Platform mailing list, Gabriel\nScherer started a discussion on the design for\n<a href=\"http://lists.ocaml.org/pipermail/platform/2013-February/000050.html\">namespaces</a>\nin OCaml. The resulting discussion was useful in separating multiple\nconcerns that were intermingled in the initial proposal, and Leo wrote a\n<a href=\"http://www.lpw25.net/2013/03/10/ocaml-namespaces.html\">comprehensive blog\npost</a> on a\nproposed namespace design.</p>\n<p>After further discussion at <a href=\"http://icfpconference.org/icfp2013/\">ICFP\n2013</a> with Jacques Garrigue later\nin the year, it turns out adding support for <a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module\naliases</a> would solve much\nof the cost associated with compiling large libraries such as\n<a href=\"http://ocaml.janestreet.com\">Core</a>, with no backwards compatibility\nissues. This solution has now been integrated into OCaml 4.02.0dev and\nis being tested with Core.</p>\n<h3 id=\"delving-into-the-bug-tracker\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#delving-into-the-bug-tracker\"></a>Delving into the bug tracker</h3>\n<p>Jeremy Yallop joined us in April, and he and Leo also leapt into the\ncore compiler and started triaging issues on the OCaml <a href=\"http://caml.inria.fr/mantis\">bug\ntracker</a>. This seems unglamorous in the\nbeginning, but there rapidly turned out to be many fascinating threads\nthat shed light on OCaml’s design and implementation through seemingly\nharmless bugs. 
Here is a pick of some interesting threads through the\nyear that we’ve been involved with:</p>\n<ul>\n<li>An <a href=\"http://caml.inria.fr/mantis/view.php?id=5985&amp;nbn=49#bugnotes\">unexpected interaction between variance and GADTs</a>\nthat led to Jacques Garrigue’s\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">talk</a> at\nOCaml 2013.</li>\n<li>Type unsoundness by <a href=\"http://caml.inria.fr/mantis/view.php?id=5992\">pattern matching lazy mutable\nvalues</a>, thus shedding\nlight on the precise semantics of the order of pattern matching.</li>\n<li>Leo proposed an <a href=\"http://caml.inria.fr/mantis/view.php?id=5584\">open types</a> extension to\nallow abstract types to be declared open. You can try it via\n<code>opam switch 4.00.1+open-types</code>.</li>\n<li>Designing the popular, but controversial <a href=\"http://caml.inria.fr/mantis/view.php?id=5759\">record disambiguation feature</a> in OCaml\n4.01.0, and debating <a href=\"http://caml.inria.fr/mantis/view.php?id=6000\">the right warnings</a> needed to\nprevent programmer surprise.</li>\n<li>Exposing a <a href=\"http://caml.inria.fr/mantis/view.php?id=6064\">GADT representation for Bigarray</a>.</li>\n</ul>\n<p>This is just a sample of some of the issues solved in Mantis; if you\nwant to learn more about OCaml, it’s well worth browsing through it to\nlearn from over a decade of interesting discussions from all the\ndevelopers.</p>\n<h3 id=\"thread-local-storage-runtime\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#thread-local-storage-runtime\"></a>Thread-local storage runtime</h3>\n<p>While OCamlPro was working on their <a href=\"https://github.com/lucasaiu/ocaml\">reentrant OCaml\nruntime</a>, we took a different tack by\nadding <a href=\"https://github.com/ocamllabs/ocaml/tree/multicore\">thread-local\nstorage</a> to the\nruntime instead, courtesy of <a href=\"http://mu.netsoc.ie/\">Stephen Dolan</a>. 
This\nis an important choice to make at the outset of adding multicore, so\nboth approaches are warranted. The reentrant runtime adds a lot of code\nchurn (due to adding a context parameter to most function calls) and\ntakes up a register, whereas the thread-local storage approach we tried\ndoesn’t permit callbacks to different threads.</p>\n<p>Much of this work isn’t interesting on its own, but forms the basis for\na fully multicore runtime (with associated programming model) in 2014.\nStay tuned!</p>\n<h3 id=\"ctypes\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ctypes\"></a>Ctypes</h3>\n<p><img src=\"/images/c.webp\" alt=\"%r\" >\nOne other complaint from the Consortium members was quite surprising:\nthe difficulty of using the OCaml foreign function interface safely to\ninterface with C code. Jeremy Yallop began working on the\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes\">ctypes</a> library that had the\ngoal of eliminating the need to write any C code at all for the vast\nmajority of foreign bindings.</p>\n<p>Instead, Ctypes lets you describe any C function call as an OCaml value,\nand provides various linkage options to invoke that function in C. The\nfirst option he implemented was a <code>dlopen</code> interface, which immediately\nbrought us the same level of functionality as the\n<a href=\"http://docs.python.org/2/library/ctypes.html\">Python</a> or\n<a href=\"http://www.haskell.org/haskellwiki/Library/libffi\">Haskell</a> Ctypes\nequivalents. 
This early code was in itself startlingly useful and more\npleasant to use than the raw FFI, and various projects (such as David\nSheets’ <a href=\"https://github.com/dsheets/ocaml-sodium\">libsodium</a>\ncryptography bindings) started adopting it.</p>\n<p>At this point, I happened to be struggling to write the Foreign Function\nInterface chapter of <a href=\"https://realworldocaml.org\">Real World OCaml</a>\nwithout blowing through our page budget with a comprehensive explanation\nof the existing system. I decided to take a risk and write about Ctypes\ninstead, since it gives new users of the language a <em>far</em> more\nproductive way to get started. Xavier Leroy pointed out <a href=\"https://github.com/realworldocaml/book/issues/1701\">some\nshortcomings</a> of the\nlibrary in his technical book review, most notably the lack of an\ninterface with C macros. The design of Ctypes fully supports\nlinking mechanisms other than just <code>dlopen</code> though, and Jeremy has added\nautomatic C stub generation support as well. This means that if you use\nCtypes to build an OCaml binding in 2014, you can choose between several\nmechanisms for the same source code to link to the external system.\nJeremy even demonstrated a forking model at OCaml 2013 that protects the\nOCaml runtime from the C binding via process separation.</p>\n<p>The effort is paying off: Daniel Bünzli <a href=\"http://alan.petitepomme.net/cwn/2013.12.17.html#9\">ported\nSDL2</a> using ctypes,\nand gave us extensive\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes/issues\">feedback</a> about any\nmissing corner cases, and the resulting bindings don’t require any C\ncode to be written. 
<a href=\"http://xulforum.org\">Jonathan Protzenko</a> even used\nit to implement an OCaml controller for the <a href=\"http://gallium.inria.fr/blog/raspi-lcd/\">Adafruit Raspberry Pi RGB\nLCD</a>!</p>\n<h2 id=\"community-efforts\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#community-efforts\"></a>Community Efforts</h2>\n<p>Our community efforts were largely online, but we also hosted visitors\nover the year and regular face-to-face tutorials.</p>\n<h3 id=\"online-at-ocamlorg\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#online-at-ocamlorg\"></a>Online at OCaml.org</h3>\n<p>While the rest of the crew were hacking on OPAM and OCaml, <a href=\"http://amirchaudhry.com/\">Amir\nChaudhry</a> and <a href=\"http://philippewang.info/CL/\">Philippe\nWang</a> teamed up with Ashish Agarwal and\nChristophe Troestler to redesign and relaunch the <a href=\"http://ocaml.org\">OCaml\nwebsite</a>. Historically, OCaml’s homepage has been the\n<a href=\"http://caml.inria.fr\">caml.inria.fr</a> domain, and the\n<a href=\"http://ocaml.org\">ocaml.org</a> effort was begun by Christophe and Ashish\n<a href=\"https://www.mail-archive.com/caml-list@inria.fr/msg00169.html\">some years\nago</a> to\nmodernize the web presence.</p>\n<p>The webpages were already rather large with complex scripting (for\nexample, the <a href=\"http://ocaml.org/learn/tutorials/99problems.html\">99\nProblems</a> page runs\nthe OCaml code to autogenerate the output). 
Philippe developed a\n<a href=\"https://github.com/pw374/MPP-language-blender\">template DSL</a> that made\nit easier to unify a lot of the templates around the website, and also a\n<a href=\"https://github.com/pw374/omd\">Markdown parser</a> that we could link to as\na library from the rest of the infrastructure without shelling out to\nPandoc.</p>\n<p>Meanwhile, Amir designed a series of <a href=\"http://amirchaudhry.com/wireframe-demos-for-ocamlorg/\">interactive wireframe\nsketches</a> and\n<a href=\"http://amirchaudhry.com/ocamlorg-request-for-feedback/\">gathered feedback</a> on it\nfrom the community. A local <a href=\"http://onespacemedia.com\">design agency</a> in\nCambridge helped with visual look and feel, and finally at the end of\nthe summer we began the\n<a href=\"http://amirchaudhry.com/migration-plan-ocaml-org/\">migration</a> to the\nnew website, followed by a triumphant\n<a href=\"http://amirchaudhry.com/announcing-new-ocamlorg/\">switchover</a> in\nNovember to the design you see today.</p>\n<p>The domain isn’t just limited to the website itself. Leo and I set up a\n<a href=\"https://github.com/ocaml/ocaml.org-scripts\">SVN-to-Git mirror</a> of the\nOCaml compiler <a href=\"http://caml.inria.fr/ocaml/anonsvn.en.html\">Subversion\nrepository</a> on the GitHub\n<a href=\"https://github.com/ocaml/ocaml\">OCaml organization</a>, which is proving\npopular with developers. There is an ongoing effort to simplify the core\ncompiler tree by splitting out some of the larger components, and so\n<a href=\"http://github.com/ocaml/camlp4\">camlp4</a> is also now hosted on that\norganization, along with <a href=\"https://github.com/ocaml/oasis\">OASIS</a>. 
We\nalso administer several subdomains of <a href=\"http://ocaml.org\">ocaml.org</a>,\nsuch as the <a href=\"http://lists.ocaml.org\">mailing lists</a> and the <a href=\"http://opam.ocaml.org\">OPAM\nrepository</a>, and other services such as the\n<a href=\"http://forge.ocamlcore.org\">OCaml Forge</a> are currently migrating over.\nThis was made significantly easier thanks to sponsorship from <a href=\"http://rackspace.com\">Rackspace\nCloud</a> (users of <a href=\"http://xenserver.org\">XenServer</a>,\nwhich is written in OCaml). They saw our struggles with managing\nphysical machines and gave us developer accounts, and all of the\nocaml.org infrastructure is now hosted on Rackspace. We’re very grateful\nfor their ongoing help!</p>\n<p><img src=\"/images/rackspace.webp\" alt=\"%r\" >\nIf you’d like to help out with infrastructure (for example, I’m\nexperimenting with a <a href=\"http://git.ocaml.org/public/\">GitLab</a> mirror),\nthen please join the\n<a href=\"http://lists.ocaml.org/listinfo/infrastructure\">infrastructure@lists.ocaml.org</a>\nmailing list and share your thoughts. The website team also need help\nwith adding content and <a href=\"https://github.com/ocaml/ocaml.org/issues/376\">international\ntranslations</a>, so head\nover to the <a href=\"http://github.com/ocaml/ocaml.org/issues\">website issue\ntracker</a> and start proposing\nimprovements you’d like to see.</p>\n<h3 id=\"next-steps-for-ocamlorg\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#next-steps-for-ocamlorg\"></a>Next steps for ocaml.org</h3>\n<p>The floodgates of feature requests opened up after the launch of the new\nlook and feel. 
Pretty much everyone wanted deeper OPAM integration into\nthe main website, for features such as:</p>\n<ul>\n<li>Starring and reviewing packages</li>\n<li>Integrating the <a href=\"https://github.com/ocamllabs/opam-doc\">opam-doc</a>\ndocumentation with the metadata</li>\n<li>Displaying test results and a compatibility matrix for non-x86 and\nnon-Linux architectures</li>\n<li>Linking to blog posts and tutorials about the package</li>\n</ul>\n<p>Many of these features were part of the <a href=\"http://amirchaudhry.com/wireframe-demos-for-ocamlorg/\">original\nwireframes</a> but\nwe’re being careful to take a long-term view of how they should be\ncreated and maintained. Rather than building all of this as a huge\nbloated <a href=\"https://github.com/ocaml/opam2web\">opam2web</a> extension, David\nSheets (our resident reluctant-to-admit-it web expert) has designed an\noverlay directory scheme that permits the overlaying of different\nmetadata onto the website. This lets one particular feature (such as\nblog post aggregation) be handled separately from the others via Atom\naggregators.</p>\n<h3 id=\"real-world-ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#real-world-ocaml\"></a>Real World OCaml</h3>\n<p><img src=\"/papers/rwo\" alt=\"%r\" >\nA big effort that took up most of the year for me was finishing and\npublishing an O’Reilly book called <a href=\"https://realworldocaml.org\">Real World\nOCaml</a> with <a href=\"https://ocaml.janestreet.com/?q=blog/5\">Yaron\nMinsky</a> and Jason Hickey. 
Yaron\ndescribes how it all started in <a href=\"https://ocaml.janestreet.com/?q=node/117\">his blog\npost</a>, but I learnt a lot from\ndeveloping a book using the <a href=\"https://web.archive.org/web/20160324164610/https://anil.recoil.org/2013/08/06/real-world-ocaml-beta2.html\">open commenting\nscheme</a>\nthat we developed just for this.</p>\n<p>In particular, the book ended up shining a bright light into dark\nlanguage corners that we might otherwise not have explored in OCaml\nLabs. Two chapters of the book that I wasn’t satisfied with were the\n<a href=\"https://realworldocaml.org/v1/en/html/objects.html\">objects</a> and\n<a href=\"https://realworldocaml.org/v1/en/html/classes.html\">classes</a> chapters,\nlargely since neither Yaron nor Jason nor I had ever really used their\nfull power in our own code. Luckily, Leo White decided to pick up the\nbaton and champion these oft-maligned (but very powerful) features of\nOCaml, and the result is the clearest explanation of them that I’ve read\nyet. Meanwhile, Jeremy Yallop helped out with extensive review of the\n<a href=\"https://realworldocaml.org/v1/en/html/foreign-function-interface.html\">Foreign Function\nInterface</a>\nchapter that used his\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes\">ctypes</a> library. 
Finally,\n<a href=\"https://plus.google.com/100586365409172579442/posts\">Jeremie Dimino</a>\nat Jane Street worked hard on adding several features to his\n<a href=\"https://github.com/diml/utop\">utop</a> toplevel that made it compelling\nenough to become our default recommendation for newcomers.</p>\n<p>All in all, we ended up closing over <a href=\"https://web.archive.org/web/20160101000000*/https://anil.recoil.org/2013/08/06/real-world-ocaml-beta2.html\">2000\ncomments</a>\nin the process of writing the book, and I’m very proud of the result\n(freely available <a href=\"https://realworldocaml.org\">online</a>, but do <a href=\"http://www.amazon.com/Real-World-OCaml-Functional-programming/dp/144932391X/\">buy a\ncopy</a>\nif you can to support it). Still, there’s more I’d like to do in 2014 to\nimprove the ease of using OCaml further. In particular, I removed a\nchapter on packaging and build systems since I wasn’t happy with its\nquality, and both <a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> and I\nintend to spend time in 2014 on improving this part of the ecosystem.</p>\n<h3 id=\"tutorials-and-talks\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#tutorials-and-talks\"></a>Tutorials and Talks</h3>\n<p><img src=\"/images/pfff.webp\" alt=\"%r\" title=\"Julien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.\" >\nWe had a lively presence at <a href=\"http://icfpconference.org\">ICFP 2013</a> this\nyear, with the third iteration of the OCaml workshop (<a href=\"http://ocaml.org/meetings/ocaml/2013/program.html\">OCaml\n2013</a>) held there, and\nStephen Dolan presenting a paper in the main conference. 
I <a href=\"http://www.syslog.cl.cam.ac.uk/2013/09/24/liveblogging-ocaml-workshop-2013/\">liveblogged\nOCaml\n2013</a>\nand <a href=\"http://www.syslog.cl.cam.ac.uk/2013/09/22/liveblogging-cufp-2013/\">CUFP\n2013</a>\nas they happened, and all the\n<a href=\"http://ocaml.org/meetings/ocaml/2013/program.html\">talks</a> we gave are\nlinked from the program. The most exciting part of the conference for a\nlot of us was the two talks by Facebook on their use of OCaml: first\nfor <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/padioleau.pdf\">program analysis using\nPfff</a> and\nthen to migrate their massive PHP codebase <a href=\"http://www.youtube.com/watch?feature=player_detailpage&amp;v=gKWNjFagR9k#t=1150\">using an OCaml\ncompiler</a>.\nI also had the opportunity to participate in a panel at the Haskell\nWorkshop on whether <a href=\"http://ezyang.tumblr.com/post/62157468762/haskell-haskell-and-ghc-too-big-to-fail-panel\">Haskell is too big to fail\nyet</a>;\nlots of interesting perspectives on scaling another formerly academic\nlanguage into the real world.</p>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> and I have been\ngiving tutorials on OCaml at ICFP for several years, but the release of\nReal World OCaml has made it significantly easier to give tutorials\nwithout the sort of labor that it took in previous years (one\nmemorable ICFP 2011 tutorial that we did took almost 2 hours to get\nOCaml installed for everyone; at ICFP 2013, it took us 15 minutes or so\nto get everyone started). Still, giving tutorials at ICFP is very much\npreaching to the choir, and so we’ve started speaking at more\ngeneral-purpose events.</p>\n<p><img src=\"/images/marius-yaron-icfp.webp\" alt=\"%r\" title=\"Marius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. 
Maybe.\" >\nOur first local effort was <a href=\"http://fpdays.net/2013/\">FPDays</a> in\nCambridge, where Jeremy Yallop and Amir Chaudhry ran the tutorial with\nhelp from Philippe Wang, Leo White and David Sheets. The OCaml session\nthere ended up being the biggest one in the entire two days, and Amir\n<a href=\"http://amirchaudhry.com/fpdays-review/\">wrote up</a> their experiences.\nOne interesting change from our ICFP tutorial is that Jeremy used\n<a href=\"https://github.com/ocsigen/js_of_ocaml\">js_of_ocaml</a> to teach OCaml\nvia JavaScript by building a fun <a href=\"https://github.com/ocamllabs/fpdays-skeleton\">Monty\nHall</a> game.</p>\n<h3 id=\"visitors-and-interns\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#visitors-and-interns\"></a>Visitors and Interns</h3>\n<p><img src=\"/images/thomas-nycoug-2013.webp\" alt=\"%r\" title=\"Thomas Gazagnaire presents at Jane Street\" >\nSince OCaml Labs is a normal group within the <a href=\"http://www.cl.cam.ac.uk\">Cambridge Computer\nLab</a>, we often host academic visitors and\ninterns who pass through. 
This year was certainly diverse, and we\nwelcomed a range of colleagues:</p>\n<ul>\n<li><a href=\"http://www.lip6.fr/actualite/personnes-fiche.php?ident=D1161&amp;LANG=en\">Mathias\nBourgoin</a>\nhas just finished his work on interfacing OCaml with GPUs, and gave\nus a seminar on how his\n<a href=\"http://www.algo-prog.info/spoc/web/index.php?id=spoc\">SPOC</a> tool\nworks (also available in OPAM via a <a href=\"http://www.algo-prog.info/spoc/distribution/opam/\">custom\nremote</a>).</li>\n<li><a href=\"http://www.benjamin.canou.fr/\">Benjamin Canou</a> (now at OCamlPro)\npractised his <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/canou.pdf\">OCaml 2013\ntalk</a> on\nbuilding high-level interfaces to JavaScript with OCaml by giving a\ndepartmental seminar.</li>\n<li><a href=\"http://www.dicosmo.org/\">Roberto Di Cosmo</a>, who directs the\n<a href=\"http://www.irill.org/\">IRILL</a> organization on Free Software in\nParis delivered a seminar on constraint solving for <a href=\"http://mancoosi.org\">package\nsystems</a> that are as large-scale as Debian’s.</li>\n<li><a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> visited during the summer\nto help plot the <a href=\"http://openmirage.org/blog/mirage-1.0.3-released\">Mirage\n1.0</a> and <a href=\"/2013/09/20/opam-1-1-beta.html\">OPAM\n1.1</a> releases.\nHe has also since joined OCaml Labs full-time to work on\n<a href=\"http://nymote.org\">Nymote</a>.</li>\n<li><a href=\"http://louis.gesbert.fr/cv.en.html\">Louis Gesbert</a> from OCamlPro\nvisited for 2 weeks in December and kicked off the inaugural OPAM\ndevelopers summit (which was, admittedly, just 5 developers in the\n<a href=\"http://www.kingston-arms.co.uk/\">Kingston Arms</a>, but all good\nthings start in a pub, right?)</li>\n<li><a href=\"http://www.xulforum.org/\">Jonathan Protzenko</a> presented his PhD\nwork on <a href=\"http://protz.github.io/mezzo/\">Mezzo</a> (which is now <a href=\"http://gallium.inria.fr/blog/mezzo-on-opam/\">merged\ninto OPAM</a>), and\neducated us on the vagaries of <a href=\"http://protz.github.io/ocaml-installer/\">Windows\nsupport</a>.</li>\n<li><a href=\"http://gallium.inria.fr/~scherer/\">Gabriel Scherer</a> from the\nGallium INRIA group visited to discuss the direction of OPAM and\nvarious language feature discussions (such as namespaces). He didn’t\ngive a talk, but promises to do so next time!</li>\n<li><a href=\"https://github.com/bvaugon\">Benoît Vaugon</a> gave a seminar on his\n<a href=\"http://oud.ocaml.org/2012/slides/oud2012-paper10-slides.pdf\">OCamlCC</a>\nOCaml-to-C compiler, talked about porting OCaml to <a href=\"http://www.algo-prog.info/ocaml_for_pic/web/index.php?id=ocapic\">8-bit\nPICs</a>,\nand using GADTs to <a href=\"http://caml.inria.fr/mantis/view.php?id=6017\">implement\nPrintf</a> properly.</li>\n</ul>\n<p>We were also visited several times by <a href=\"http://danmey.org/\">Wojciech\nMeyer</a> from ARM, who was an OCaml developer who\nmaintained (among other things) the\n<a href=\"http://brion.inria.fr/gallium/index.php/Ocamlbuild\">ocamlbuild</a> system\nand worked on <a href=\"http://www.youtube.com/watch?v=d9Hg5L76FG8\">DragonKit</a>\n(an extensible LLVM-like compiler written in OCaml). Wojciech very sadly\npassed away on November 18th, and we all fondly remember his\nenthusiastic and intelligent contributions to our small Cambridge\ncommunity.</p>\n<p>We also hosted visitors to live in Cambridge and work with us over the\nsummer. In addition to Vincent Botbol (who worked on OPAM-doc as\ndescribed earlier) we had the pleasure of having <a href=\"http://erratique.ch/\">Daniel\nBünzli</a> and <a href=\"http://www.x9c.fr/\">Xavier Clerc</a>\nwork here. 
Here’s what they did in their own words.</p>\n<h4 id=\"xavier-clerc-ocamljava\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#xavier-clerc-ocamljava\"></a>Xavier Clerc: OCamlJava</h4>\n<p>Xavier Clerc took a break from his regular duties at INRIA to join us\nover the summer to work on\n<a href=\"http://ocamljava.x9c.fr/preview/\">OCaml-Java</a> and adapt it to the\nlatest JVM features. This is an incredibly important project to bridge\nOCaml with the huge Java community, and here’s his report:</p>\n<blockquote>\n<p>After a four-month visit to the OCaml Labs dedicated to the\n<a href=\"http://ocamljava.x9c.fr/preview/\">OCaml-Java</a> project, the time has\ncome for an appraisal! The undertaken work can be split into two\nareas: improvements to code generation, and interaction between the\nOCaml &amp; Java languages. Regarding code generation, several classical\noptimizations have been added to the compiler, for example loop\nunrolling, more aggressive unboxing, better handling of globals, or\npartial evaluation (at the bytecode level). A new tool, namely\nocamljar, has been introduced allowing post-compilation optimizations.\nThe underlying idea is that some optimizations cannot always be\napplied (e.g. depending whether multiple threads/programs will\ncoexist), but enabling them through command-line flags would lead to\nrecompilation and/or multiple installations of each library according\nto the set of chosen optimizations. It is thus far more easier to\nfirst build an executable jar file, and then modify it according to\nthese optimizations. Furthermore, this workflow allows the ocamljar\ntool to take advantage of whole-program information for some\noptimizations. All these improvements, combined, often lead to a gain\nof roughly 1/3 in terms of execution time.</p>\n<p>Regarding language interoperability, there are actually two directions\ndepending on whether you want to call OCaml code from Java, or want to\ncall Java code from OCaml. 
For the first direction, a tool allows to\ngenerate Java source files from OCaml compiled interfaces, mapping the\nvarious constructs of the OCaml language to Java classes. It is then\npossible to call functions, and to manipulate instances of OCaml types\nin pure Java, still benefiting from the type safety provided by the\nOCaml language. In the other direction, an extension of the OCaml\ntyper is provided allowing to create and manipulate Java instances\ndirectly from OCaml sources. This typer extension is indeed a thin\nlayer upon the original OCaml typer, that is mainly responsible for\nencoding Java types into OCaml types. This encoding uses a number of\nadvanced elements such as polymorphic variants, subtyping, variance\nannotations, phantom typing, and printf-hack, but the end-user does\nnot have to be aware of this encoding. On the surface, the type of\ninstances of the Java Object classes is\n<code>java'lang'Object java_instance</code>, and instances can be created by\ncalling Java.make <code>Object()</code>.</p>\n<p>While still under heavy development, a working prototype <a href=\"http://ocamljava.x9c.fr/preview/\">is\navailable</a>, and bugs <a href=\"http://bugs.x9c.fr/\">can be\nreported</a>. Finally, I would like to thank the\nOCaml Labs for providing a great working environment.</p>\n</blockquote>\n<h4 id=\"daniel-bünzli-typography-and-visualisation\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#daniel-bünzli-typography-and-visualisation\"></a>Daniel Bünzli: Typography and Visualisation</h4>\n<p>Daniel joined us from Switzerland, and spent some time at Citrix before\njoining us in OCaml Labs. All of his\n<a href=\"http://erratique.ch/software\">software</a> is now on OPAM, and is seeing\never-increasing adoption from the community.</p>\n<blockquote>\n<p>Released a first version of <a href=\"http://erratique.ch/software/vg\">Vg</a> […]\nI’m especially happy about that as I wanted to use and work on these\nideas since at least 2008. 
The project is a long term project and is\ncertainly not finished yet but this is already a huge step.</p>\n<p>Adjusted and released a first version of\n<a href=\"http://erratique.ch/software/gg\">Gg</a>. While the module was already\nmostly written before my arrival to Cambridge, the development of Vg\nand Vz prompted me to make some changes to the module.</p>\n<p>[…] released <a href=\"http://erratique.ch/software/otfm\">Otfm</a>, a module to\ndecode OpenType fonts. This is a work in progress as not every\nOpenType table has built-in support for decoding yet. But since it is\nneeded by Vg’s PDF renderer I had to cut a release. It can however\nalready be used to implement certain simple things like font kerning\nwith Vg, this can be seen in action in the <code>vecho</code> binary installed by\nVg.</p>\n<p>Started to work on <a href=\"http://erratique.ch/software/vz/doc/Vz.html\">Vz</a>,\na module for helping to map data to Vg images. This is really\nunfinished and is still considered to be at a design stage. There are\na few things that are however well implemented like (human)\nperceptually meaningful <a href=\"http://erratique.ch/software/vz/demos/color_schemes.html\">color\npalettes</a>\nand the small folding stat module (<code>Vz.Stat</code>). However it quickly\nbecame evident that I needed to have more in the box w.r.t. text\nrendering in Vg/Otfm. Things like d3js entirely rely on the SVG/CSS\nsupport for text which makes it easy to e.g. align things (like tick\nlabels on <a href=\"http://erratique.ch/software/vz/demos/iris.html\">such\ndrawings</a>). If you\ncan’t rely on that you need ways of measuring rendered text. So I\ndecided to suspend the work on Vz and put more energy in making a\nfirst good release of Vg. Vz still needs quite some design work,\nespecially since it tries to be independent of Vg’s backend and from\nthe mechanism for user input.</p>\n<p>Spent some time figuring out a new “opam-friendly” release workflow in\npkgopkg. 
One of my problem is that by designing in the small for\nprogramming in the large — what a slogan — the number of packages I’m\npublishing is growing (12 and still counting). This means that I need\nto scale horizontally maintenance-wise unhelped by the sad state of\nbuild systems for OCaml. I need tools that make the release process\nflawless, painless and up to my quality standards. This lead me to\nenhance and consolidate my old scattered distribution scripts in that\nrepo, killing my dependencies on Oasis and ocamlfind along the way.\n<em>(edited for brevity, see\n<a href=\"https://github.com/dbuenzli/pkgopkg\">here</a>)</em></p>\n</blockquote>\n<p><img src=\"/images/daniel-presentation-vg.webp\" alt=\"%r\" >\nDaniel also left his bicycle here for future visitors to use, and the\n“Bünzli-bike” is available for our next visitor! (Louis Gesbert even\ndonated lights, giving it a semblance of safety).</p>\n<h3 id=\"industrial-fellows\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#industrial-fellows\"></a>Industrial Fellows</h3>\n<p><img src=\"/images/xenserver.webp\" alt=\"%r\" >\nMost of our regular funding bodies such as <a href=\"http://epsrc.ac.uk\">EPSRC</a>\nor <a href=\"http://cordis.europa.eu/fp7/home_en.html\">EU FP7</a> provide funding,\nbut leave all the intellectual input to the academics. A compelling\naspect of OCaml Labs has been how involved our industrial colleagues\nhave been with the day-to-day problems that we solve. 
Both Jane Street\nand Citrix have senior staff regularly visiting our group and working\nalongside us as industrial fellows in the Computer Lab.</p>\n<p><img src=\"/images/js.webp\" alt=\"%r\" >\n<a href=\"http://www.three-tuns.net/mark/\">Mark Shinwell</a> from Jane Street\nEurope has been working on improving the <a href=\"http://www.youtube.com/watch?v=NF2WpWnB-nk\">state of native\ndebugging</a> in OCaml, by\nadding extended DWARF debugging information to the compiler output.\nMark is also a useful source of feedback about the forthcoming\ndesign of multicore, since he has daily insight into a huge\nproduction codebase at Jane Street (and can tell us about it without\nus requiring access!).</p>\n<p><a href=\"http://dave.recoil.org\">Dave Scott</a> is the principal architect of\n<a href=\"http://xenserver.org\">XenServer</a> at Citrix in Cambridge. This year\nhas been transformative for that project, since Citrix <a href=\"http://blogs.citrix.com/2013/06/26/open-source-what-does-it-mean-for-xenserver/\">open-sourced\nXenServer</a>\nto GitHub and fully adopted OPAM into their workflow. Dave is the\nauthor of numerous libraries that have all been released to OPAM,\nand his colleagues <a href=\"http://jon.recoil.org\">Jon Ludlam</a> and <a href=\"http://www.xenserver.org/blog/blogger/listings/euanh.html\">Euan\nHarris</a>\nare also regular visitors who have also been contributors to the\nOPAM and Mirage ecosystems.</p>\n<h2 id=\"research-projects\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#research-projects\"></a>Research Projects</h2>\n<p>The other 100% of our time at the Labs is spent on research projects.\nWhen we started the group, I wanted to set up a feedback loop between\nlocal people <em>using</em> OCaml to build systems, with the folk <em>developing</em>\nOCaml itself. 
This has worked out particularly well with a couple of big\nresearch projects in the Lab.</p>\n<h3 id=\"mirage\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#mirage\"></a>Mirage</h3>\n<p>Mirage is a <a href=\"/papers/2013-asplos-mirage.pdf\">library operating\nsystem</a> written in\nOCaml that compiles source code into specialised Xen microkernels,\ndeveloped at the Cambridge Computer Lab, Citrix and the <a href=\"http://horizon.ac.uk\">Horizon Digital\nEconomy</a> institute at Nottingham. This year saw\nseveral years of effort culminate in the first release of <a href=\"http://openmirage.org\">Mirage\n1.0</a> as a self-hosting entity. While Mirage\nstarted off as a <a href=\"/papers/2010-hotcloud-lamp.pdf\">quick\nexperiment</a> into\nbuilding specialised virtual appliances, it rapidly became useful to\nmake into a real system for use in bigger research projects. You can\nlearn more about Mirage <a href=\"http://openmirage.org/docs\">here</a>, or read the\n<a href=\"http://cacm.acm.org/magazines/2014/1/170866-unikernels/abstract\">Communications of the\nACM</a>\narticle that <a href=\"http://dave.recoil.org\">Dave Scott</a> and I wrote to close\nout the year.</p>\n<p>This project is where the OCaml Labs “feedback loop” has been strongest.\nA typical <a href=\"http://www.openmirage.org/wiki/hello-world\">Mirage\napplication</a> consists of\naround 50 libraries that are all installed via OPAM. These range from\n<a href=\"https://github.com/mirage/mirage-block-xen\">device drivers</a> to protocol\nlibraries for <a href=\"https://github.com/avsm/ocaml-cohttp\">HTTP</a> or\n<a href=\"https://github.com/mirage/ocaml-dns\">DNS</a>, to filesystems such as\n<a href=\"https://github.com/mirage/ocaml-fat\">FAT32</a>. 
Coordinating <a href=\"http://openmirage.org/blog/mirage-1.0.3-released\">regular\nreleases</a> of all of\nthese would be near impossible without using OPAM, and has also forced\nus to use our own tools daily, helping to sort out bugs more quickly.\nYou can see the full list of libraries on the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/\">OCaml Labs software\npage</a>.</p>\n<p>Mirage is also starting to share code with big projects such as\n<a href=\"http://xenserver.org\">XenServer</a> now, and we have been working with\nCitrix engineers to help them to move to the\n<a href=\"http://ocaml.janestreet.com\">Core</a> library that Jane Street has\nreleased (and that is covered in <a href=\"https://realworldocaml.org\">Real World\nOCaml</a>). Moving production codebases this\nlarge can take years, but OCaml Labs is turning out to be a good place\nto start unifying some of the bigger users of OCaml into one place.\nWe’re also now an official <a href=\"http://www.xenproject.org/developers/teams/mirage-os.html\">Xen Project incubator\nproject</a>,\nwhich helps us to validate functional programming to other Linux\nFoundation efforts.</p>\n<h3 id=\"nymote-and-user-centric-networking\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#nymote-and-user-centric-networking\"></a>Nymote and User Centric Networking</h3>\n<p><img src=\"/images/nymote.webp\" alt=\"%r\" >\nThe release of Mirage 1.0 has put us on the road to simplifying embedded\nsystems programming. 
The move to the centralized cloud has led to\nregular well-publicised privacy and security threats to the way <a href=\"http://de2013.org/wp-content/uploads/2013/09/de2013_submission_25-1.pdf\">we\nhandle</a>\nour digital infrastructure, and so <a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon\nCrowcroft</a>, <a href=\"http://www.cs.nott.ac.uk/~rmm/\">Richard\nMortier</a> and I are leading an effort to\nbuild an alternative privacy-preserving infrastructure using embedded\ndevices as part of the <a href=\"http://usercentricnetworking.eu/\">User Centric\nNetworking</a> project, in collaboration\nwith a host of companies led by <a href=\"http://www.thlab.net/\">Technicolor</a>\nParis. This work also plays on the strong points of OCaml: it already\nhas a <a href=\"/2012/02/25/dreamplug-debian-and-ocaml.html\">fast ARM\nbackend</a>,\nand Mirage can easily be ported to the new Xen/ARM target as hardware\nbecomes available.</p>\n<p>One of the most difficult aspects of programming on the “wide area”\nInternet is dealing with the lack of a distributed identity service\nthat’s fully secure. We published <a href=\"/papers/2013-foci-signposts.pdf\">our\nthoughts</a> on this\nat the USENIX Free and Open Communications on the Internet workshop, and\nDavid Sheets is working towards a full implementation using Mirage. 
If\nyou’re interested in following this effort, Amir Chaudhry is blogging at\nthe <a href=\"http://nymote.org/\">Nymote</a> project website, where we’ll talk about\nthe components as they are released.</p>\n<h3 id=\"data-center-networking\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#data-center-networking\"></a>Data Center Networking</h3>\n<p>At the other extreme from embedded programming is datacenter networking,\nand we started the\n<a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K034723/1\">Network-as-a-Service</a>\nresearch project with <a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K032968/1\">Imperial\nCollege</a>\nand\n<a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K031724/1\">Nottingham</a>.\nWith the rapid rise of <a href=\"http://en.wikipedia.org/wiki/Software-defined_networking\">Software Defined\nNetworking</a>\nthis year, we are investigating how application-specific customisation\nof network resources can build faster, better, cheaper infrastructure.\nOCaml is in a good position here: several other groups have built\nOpenFlow controllers in OCaml (most notably, the <a href=\"https://github.com/frenetic-lang\">Frenetic\nProject</a>), and Mirage is specifically\ndesigned to assemble such bespoke infrastructure.</p>\n<p>Another aspect we’ve been considering is how to solve the problem of\noptimal connectivity across nodes. TCP is increasingly considered\nharmful in high-throughput, high-density clusters, and <a href=\"http://www.sussex.ac.uk/informatics/people/peoplelists/person/334868\">George\nParisis</a>\nled the design of\n<a href=\"/papers/2013-hotnets-trevi.pdf\">Trevi</a>, which is\na fountain-coding based alternative for storage networking. 
Meanwhile,\n<a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> (who joined OCaml Labs in\nNovember), has been working on a branch-consistent data store called\n<a href=\"https://github.com/samoht/irminsule\">Irminsule</a> which supports scalable\ndata sharing and reconciliation using Mirage. Both of these systems will\nsee implementations based on the research done this year.</p>\n<h3 id=\"higher-kinded-programming\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#higher-kinded-programming\"></a>Higher Kinded Programming</h3>\n<p>Jeremy Yallop and Leo White have been developing an approach that makes\nit possible to write programs with higher-kinded polymorphism (such as\nmonadic functions that are polymorphic in the monad they use) without\nusing functors. It’s early days yet, but there’s a\n<a href=\"https://github.com/ocamllabs/higher\">library</a> available on\n<a href=\"http://opam.ocaml.org/pkg/higher/higher.0.1\">OPAM</a> that implements the\napproach, and a <a href=\"https://github.com/ocamllabs/higher/raw/paper/higher.pdf\">draft\npaper</a> that\noutlines the design.</p>\n<h2 id=\"priorities-for-2014\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#priorities-for-2014\"></a>Priorities for 2014</h2>\n<p><img src=\"/images/camel.webp\" alt=\"%r\" >\nThis year has been a wild ride to get us up to speed, but we now have a\nsolid sense of what to work on for 2014. 
We’ve decided on a high-level\nset of priorities led by the senior members of the group:</p>\n<ul>\n<li><strong>Multicore</strong>: Leo White will be leading efforts in putting an\nend-to-end multicore capable OCaml together.</li>\n<li><strong>Metaprogramming</strong>: Jeremy Yallop will direct the metaprogramming\nefforts, continuing with Ctypes and into macros and extension\npoints.</li>\n<li><strong>Platform</strong>: Thomas Gazagnaire will continue to drive OPAM\ndevelopment towards becoming the first <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/madhavapeddy.pdf\">OCaml\nPlatform</a>.</li>\n<li><strong>Online</strong>: Amir Chaudhry will develop the online and community\nefforts that started in 2013.</li>\n</ul>\n<p>These are guidelines to choosing where to spend our time, but not\nexcluding other work or day-to-day bugfixing. Our focus on collaboration\nwith Jane Street, Citrix, Lexifi, OCamlPro and our existing colleagues\nwill continue, as well as warmly welcoming new community members that\nwish to work with us on any of the projects, either via internships,\nstudentships or good old-fashioned open source hacking.</p>\n<p>I appreciate the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/people/\">whole\nteam's</a> feedback in\nediting this long post into shape, the amazing professorial support from\n<a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon Crowcroft</a>, <a href=\"https://www.cl.cam.ac.uk/~iml1/\">Ian\nLeslie</a> and <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan\nMycroft</a> throughout the year, and of\ncourse the funding and support from Jane Street, Citrix, RCUK, EPSRC,\nDARPA and the EU FP7 that made all this possible. Roll on 2014, and\nplease do <a href=\"mailto:avsm2@cl.cam.ac.uk\">get in touch</a> with me with any\nqueries!</p>\n<p><img src=\"/images/fpdays2013-04.webp\" alt=\"%c\" title=\"A successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!\" ></p>",
      "url": "https://anil.recoil.org/notes/the-year-in-ocamllabs",
      "external_url": "https://web.archive.org/web/20160310100554/http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html#Dec%202013",
      "title": "Reviewing the first year of OCaml Labs in 2013",
      "summary": "Reviewing OCaml Labs' first year, including progress on OPAM, the compiler, and community efforts.",
      "date_published": "2013-12-29T00:00:00.000000Z",
      "date_modified": "2013-12-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "opensource",
        "ocaml",
        "cambridge",
        "computerlab"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/docker-and-opam",
      "content_html": "<p>Now that OCaml 4.01 has been released, there is a frenzy of commit\nactivity in the <a href=\"https://github.com/ocaml/ocaml\">development trunk</a> of\nOCaml as the new features for 4.02 are all integrated. These include\nsome enhancements to the type system such as\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">injectivity</a>,\n<a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module aliases</a> and\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">extension\npoints</a> as a\nsimpler alternative to syntax extensions.</p>\n<p>The best way to ensure that these all play well together is to test\nagainst the ever-growing OPAM package database as early as possible.\nWhile we’re working on more elaborate <a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">continuous\nbuilding</a>\nsolutions, it’s far easier if a developer can quickly run a bulk build\non their own system. The difficulty with doing this is that you also\nneed to install all the external dependencies (e.g. libraries and header\nfiles for bindings) needed by the thousands of packages in OPAM.</p>\n<p>Enter a hip new lightweight container system called\n<a href=\"http://docker.io\">Docker</a>. While containers aren’t quite as secure as\n<a href=\"http://en.wikipedia.org/wiki/Hypervisor\">type-1 hypervisors</a> such as\n<a href=\"http://xenproject.org\">Xen</a>, they are brilliant for spawning lots of\nlightweight tasks such as installing (and reverting) package\ninstallations. 
Docker is still under heavy development, but it didn’t\ntake me long to follow the documentation and put together a\nconfiguration file for creating an OCaml+OPAM image to let OCaml\ndevelopers do these bulk builds.</p>\n<h2 id=\"a-basic-docker-and-opam-setup\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#a-basic-docker-and-opam-setup\"></a>A basic Docker and OPAM setup</h2>\n<p>I started by spinning up a fresh Ubuntu Saucy VM on the <a href=\"https://rackspace.com\">Rackspace\nCloud</a>, which has a recent enough kernel version\nto work out-of-the-box with Docker. The <a href=\"http://docs.docker.io/en/latest/installation/ubuntulinux/#ubuntu-raring\">installation\ninstructions</a>\nworked without any problems.</p>\n<p>Next, I created a\n<a href=\"http://docs.docker.io/en/latest/use/builder/#dockerfiles-for-images\">Dockerfile</a>\nto represent the set of commands needed to prepare the base Ubuntu image\nwith an OPAM and OCaml environment. You can find the complete repository\nonline at\n<strong><a href=\"https://github.com/avsm/docker-opam\">https://github.com/avsm/docker-opam</a></strong>.\nLet’s walk through the <code>Dockerfile</code> in chunks.</p>\n<pre><code>FROM ubuntu:latest\nMAINTAINER Anil Madhavapeddy &lt;anil@recoil.org&gt;\nRUN apt-get -y install sudo pkg-config git build-essential m4 software-properties-common\nRUN git config --global user.email &quot;docker@example.com&quot;\nRUN git config --global user.name &quot;Docker CI&quot;\nRUN apt-get -y install python-software-properties\nRUN echo &quot;yes&quot; | add-apt-repository ppa:avsm/ocaml41+opam11\nRUN apt-get -y update -qq\nRUN apt-get -y install -qq ocaml ocaml-native-compilers camlp4-extra opam\nADD opam-installext /usr/bin/opam-installext\n</code></pre>\n<p>This sets up a basic OCaml and OPAM environment using the same Ubuntu\nPPAs as the <a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">Travis\ninstructions</a> I\nposted a few 
months ago. The final command adds a helper script which\nuses the new <code>depexts</code> feature in OPAM 1.1 to also install operating\nsystem packages that are required by some libraries. I’ll explain in\nmore detail in a later post, but for now all you need to know is that\n<code>opam installext ctypes</code> will not only install the <code>ctypes</code> OCaml\nlibrary, but also invoke <code>apt-get install libffi-dev</code> to install the\nrelevant development library first.</p>\n<pre><code>RUN adduser --disabled-password --gecos &quot;&quot; opam\nRUN passwd -l opam\nADD opamsudo /etc/sudoers.d/opam\nUSER opam\nENV HOME /home/opam\nENV OPAMVERBOSE 1\nENV OPAMYES 1\n</code></pre>\n<p>The next chunk of the Dockerfile configures the OPAM environment by\ninstalling a non-root user (several OPAM packages fail with an error if\nconfigured as root). We also set the <code>OPAMVERBOSE</code> and <code>OPAMYES</code>\nvariables to ensure we get the full build logs and non-interactive use,\nrespectively.</p>\n<h2 id=\"running-the-bulk-tests\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#running-the-bulk-tests\"></a>Running the bulk tests</h2>\n<p>We’re now set to build a Docker environment for the exact test that we\nwant to run.</p>\n<pre><code>RUN opam init git://github.com/mirage/opam-repository#add-depexts-11\nRUN opam install ocamlfind\nENTRYPOINT [&quot;/usr/bin/opam-installext&quot;]\n</code></pre>\n<p>This last addition to the <code>Dockerfile</code> initializes our OPAM package set.\nThis is using my development branch which adds a <a href=\"https://github.com/ocaml/opam-repository/pull/1240\">massive\ndiff</a> to populate\nthe OPAM metadata with external dependency information for Ubuntu and\nDebian.</p>\n<p>Building an image from this is a single command:</p>\n<pre><code class=\"language-bash\">$ docker build -t avsm/opam github.com/avsm/docker-opam\n</code></pre>\n<p>The <code>ENTRYPOINT</code> tells Docker that our wrapper script is the “root\ncommand” 
to run for this container, so we can install a package in a\ncontainer by doing this:</p>\n<pre><code class=\"language-bash\">$ docker run avsm/opam ctypes\n</code></pre>\n<p>The complete output is logged to stdout and stderr, so we can capture\nthat as easily as a normal shell command. With all these pieces in\nplace, my local bulk build shell script is trivial:</p>\n<pre><code class=\"language-bash\">pkg=`opam list -s -a`\nRUN=5\nmkdir -p /log/$RUN/raw /log/$RUN/err /log/$RUN/ok\nfor p in $pkg; do\n  docker run avsm/opam $p &gt; /log/$RUN/raw/$p 2&gt;&amp;1\n  if [ $? != 0 ]; then\n    ln -s /log/$RUN/raw/$p /log/$RUN/err/$p\n  else\n    ln -s /log/$RUN/raw/$p /log/$RUN/ok/$p\n  fi\ndone  \n</code></pre>\n<p>This iterates through a local package set and serially builds\neverything. Future enhancements I’m working on: parallelising these on a\nmulticore box, and having a <a href=\"http://blog.docker.io/2013/10/docker-0-6-5-links-container-naming-advanced-port-redirects-host-integration/\">linked\ncontainer</a>\nthat hosts a local package repository so that we don’t require a lot of\nexternal bandwidth. Stay tuned!</p>",
      "url": "https://anil.recoil.org/notes/docker-and-opam",
      "title": "Using Docker to bulk-build OPAM packages on Linux",
      "summary": "Build OPAM packages in bulk on Linux using Docker containers.",
      "date_published": "2013-11-15T00:00:00.000000Z",
      "date_modified": "2013-11-15T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "unikernels",
        "docker",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/cf9fcf6b-de5d-4a23-a00d-cceadea5b668-1",
      "content_html": "<p>MirageOS and XAPI project update at XenSummit, announcing the MirageOS 1.0 release. This was a major milestone for the project, showing how MirageOS had matured into a usable system for building unikernels. I discussed the integration work with Xen's XAPI toolstack and how MirageOS could be used to build secure, specialized virtual machines for cloud infrastructure.</p>",
      "url": "https://anil.recoil.org/notes/cf9fcf6b-de5d-4a23-a00d-cceadea5b668-1",
      "title": "MirageOS and XAPI project update at XenSummit",
      "summary": "Update presentation on MirageOS and XAPI project development at XenSummit.",
      "date_published": "2013-11-13T00:00:00.000000Z",
      "date_modified": "2013-11-13T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "xen",
        "virtualization",
        "xapi"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/liveblog-plos-2013",
      "content_html": "<p>I co-chaired the Programming Languages and Operating Systems workshop at SOSP 2013, and made livenotes about the (many) papers presented there.</p>",
      "url": "https://anil.recoil.org/notes/liveblog-plos-2013",
      "external_url": "https://web.archive.org/web/20140813215919/http://www.syslog.cl.cam.ac.uk/2013/11/03/liveblog-from-programming-languages-and-operating-systems-2013/",
      "title": "Notes from PL and OS 2013 workshop",
      "summary": "Notes from the 2013 Programming Languages and Operating Systems workshop at SOSP.",
      "date_published": "2013-11-03T00:00:00.000000Z",
      "date_modified": "2013-11-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "systems",
        "fp",
        "livenotes"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/rise-of-libos-1",
      "content_html": "<p>Article on the Communications of the ACM on unikernels is published, co-authored with Dave Scott. This was a significant milestone in bringing unikernels to a broader computer science audience through ACM Queue. The article explained how library operating systems could revolutionize cloud computing by replacing traditional OS virtualization with lightweight, specialized virtual machines. We discussed how unikernels could address security, performance, and resource efficiency challenges in multi-tenant cloud environments.</p><h1>References</h1><ul><li>Madhavapeddy et al (2013). Unikernels: Rise of the Virtual Library Operating System. <a href=\"https://doi.org/10.1145/2557963.2566628\" target=\"_blank\"><i>10.1145/2557963.2566628</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/rise-of-libos-1",
      "title": "Unikernels: Rise of the Virtual Library Operating System",
      "summary": "Article on unikernels published in Communications of the ACM.",
      "date_published": "2013-11-01T00:00:00.000000Z",
      "date_modified": "2013-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "systems",
        "virtualization"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2557963.2566628",
          "doi": "10.1145/2557963.2566628",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2013-hotnets-trevi-1",
      "content_html": "<p>Paper on fountain coding for datacenter networking at HotNets 2013. The Trevi work explored using fountain codes to address storage hotspots in datacenter networks. By applying erasure coding techniques from reliable multicast to the datacenter context, we investigated how to &quot;water down&quot; hotspots and improve load distribution across storage systems. The paper demonstrates how techniques from one domain (networking) can be creatively applied to solve problems in another (distributed storage), providing new approaches to datacenter resource management.</p><h1>References</h1><ul><li>Parisis et al (2013). Trevi: watering down storage hotspots with cool fountain codes. ACM. <a href=\"https://doi.org/10.1145/2535771.2535781\" target=\"_blank\"><i>10.1145/2535771.2535781</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2013-hotnets-trevi-1",
      "title": "Trevi: watering down storage hotspots with cool fountain codes",
      "summary": "Paper on fountain coding for datacenter networking presented at HotNets 2013.",
      "date_published": "2013-11-01T00:00:00.000000Z",
      "date_modified": "2013-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "datacenter",
        "storage",
        "coding",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2013-hotnets-trevi.pdf",
          "mime_type": "application/pdf",
          "title": "Trevi: watering down storage hotspots with cool fountain codes"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2535771.2535781",
          "doi": "10.1145/2535771.2535781",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/rwo-1",
      "content_html": "<p>The 1st Edition of Real World OCaml by O'Reilly associates has been released! There have been flurry of signing events, including an upcoming one at OSCON in Austin. This 513-page book, co-authored with Yaron Minsky and Jason Hickey, aimed to make OCaml accessible to working programmers by focusing on practical applications and real-world systems. The book covered everything from functional programming fundamentals to building concurrent systems, with examples drawn from industrial use at Jane Street and academic research in systems programming.</p><h1>References</h1><ul><li>Madhavapeddy et al (2022). Real World OCaml: Functional Programming for the Masses. Cambridge University Press. <a href=\"https://doi.org/10.1017/9781009129220\" target=\"_blank\"><i>10.1017/9781009129220</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/rwo-1",
      "title": "First edition of Real World OCaml published",
      "summary": "Release announcement for first edition of Real World OCaml book published by O'Reilly with upcoming signing events.",
      "date_published": "2013-11-01T00:00:00.000000Z",
      "date_modified": "2013-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "writing"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1017/9781009129220",
          "doi": "10.1017/9781009129220",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2012-cufp-scribe-1",
      "content_html": "<p>Published the scribe's report for CUFP 2012. Following our work documenting CUFP 2011, we continued serving as scribes for the Commercial Users of Functional Programming workshop. This report, published in the Journal of Functional Programming, captures the talks and experiences shared by industrial practitioners using functional programming. The workshop series provides an invaluable record of how FP is being applied in production systems across different industries and problem domains.</p><h1>References</h1><ul><li>Sperber et al (2013). Commercial users of functional programming workshop report. <a href=\"https://doi.org/10.1017/S0956796813000257\" target=\"_blank\"><i>10.1017/S0956796813000257</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2012-cufp-scribe-1",
      "title": "Commercial users of functional programming workshop report",
      "summary": "Published the scribe's report for CUFP 2012.",
      "date_published": "2013-11-01T00:00:00.000000Z",
      "date_modified": "2013-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "fp",
        "cufp",
        "workshop",
        "community"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1017/S0956796813000257",
          "doi": "10.1017/S0956796813000257",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/hdi-workshop-2013-liveblog",
      "content_html": "<p>We held the first <a href=\"https://hdi-network.org\">Human Data Interaction</a> workshop over in Cambridge, with lots of discussion about social networks and the state of play with decentralising them.</p>",
      "url": "https://anil.recoil.org/notes/hdi-workshop-2013-liveblog",
      "external_url": "https://web.archive.org/web/20140624105911/http://www.syslog.cl.cam.ac.uk/2013/10/02/liveblogging-the-first-human-data-interaction-workshop/",
      "title": "Notes on the first Human Data Interaction workshop",
      "summary": "Notes on the first Human Data Interaction workshop discussing social networks and decentralization.",
      "date_published": "2013-10-02T00:00:00.000000Z",
      "date_modified": "2013-10-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "livenotes",
        "computerlab",
        "ubicomp"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/ocaml-2013-liveblog",
      "content_html": "<p>I attended the OCaml 2013 workshop and took live notes of the event. There was a lot going on here, which you can learn more about in the &quot;<a href=\"/notes/the-year-in-ocamllabs\">Reviewing the first year of OCaml Labs in 2013</a>&quot; roundup as well that I published later in the year.</p><h1>References</h1><ul><li>Madhavapeddy (2013). Reviewing the first year of OCaml Labs in 2013. <a href=\"https://doi.org/10.59350/dxt23-hxf18\" target=\"_blank\"><i>10.59350/dxt23-hxf18</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/ocaml-2013-liveblog",
      "external_url": "https://web.archive.org/web/20140726215424/http://www.syslog.cl.cam.ac.uk/2013/09/24/liveblogging-ocaml-workshop-2013/",
      "title": "OCaml 2013 workshop liveblog",
      "summary": "Live notes from the OCaml 2013 workshop covering key events and discussions.",
      "date_published": "2013-09-24T00:00:00.000000Z",
      "date_modified": "2013-09-24T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "livenotes",
        "icfp",
        "ocaml"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.59350/dxt23-hxf18",
          "doi": "10.59350/dxt23-hxf18",
          "cito": [
            "citesAsRelated"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/cufp-2013-liveblog",
      "content_html": "<p>The <a href=\"https://cufp.org\">Commercial Uses of Functional Programming</a> workshop is one of the best industry/academia crossover workshops to attend, and these are my livenotes from the 2013 edition.</p>",
      "url": "https://anil.recoil.org/notes/cufp-2013-liveblog",
      "external_url": "https://web.archive.org/web/20140328072614/https://www.syslog.cl.cam.ac.uk/2013/09/22/liveblogging-cufp-2013/",
      "title": "Liveblogging CUFP 2013",
      "summary": "Live notes from Commercial Uses of Functional Programming 2013 workshop",
      "date_published": "2013-09-22T00:00:00.000000Z",
      "date_modified": "2013-09-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "livenotes",
        "cufp",
        "icfp"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/opam-1-1-beta",
      "content_html": "<p><a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> just announced the availability of the\n<a href=\"http://opam.ocamlpro.com\">OPAM</a> beta release. This has been a huge\namount of work for him and <a href=\"http://louis.gesbert.fr/\">Louis</a>, so I’m\nexcited to see this release!</p>\n<p>Aside from general stability, the main\nhighlights for me are:</p>\n<ul>\n<li>\n<p>A switch to the\n<a href=\"http://creativecommons.org/publicdomain/zero/1.0/\">CC0</a>\npublic-domain-like license for the repository, and LGPL2+linking\nexception for OPAM itself. The <a href=\"https://github.com/OCamlPro/opam-repository/issues/955\">cutover to the new\nlicense</a> was\nthe first non-gratuitous use of GitHub’s fancy issue lists I’ve\nseen, too! As part of this, we’re also beginning a transition over\nto hosting it at <code>opam.ocaml.org</code>, to underline our committment to\nmaintaining it as an OCaml community resource.</p>\n</li>\n<li>\n<p>Much-improved support for package pinning and updates. This is the\nfeature that makes OPAM work well with\n<a href=\"http://openmirage.org\">MirageOS</a>, since we often need to do\ndevelopment work on a low-level library (such as a <a href=\"https://github.com/mirage/ocaml-xen-block-driver\">device\ndriver</a> and\nrecompile all the reverse dependencies.</p>\n</li>\n<li>\n<p>Support for post-installation messages (e.g. to display <a href=\"https://github.com/OCamlPro/opam-repository/pull/1100\">licensing\ninformation</a>\nor configuration hints) and better support for the external library\nmanagement issues I explained in an earlier post about <a href=\"/2013/09/09/ocamlot-autotriaging.html\">OCamlot\ntesting</a>.</p>\n</li>\n<li>\n<p>Better library structuring to let tools like\n<a href=\"http://github.com/OCamlPro/opam2web\">Opam2web</a> work with the\npackage metadata. 
For instance, my group’s <a href=\"http://ocaml.io\">OCaml\nLabs</a> has a comprehensive list of <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/index.html\">the software\npackages that we work\non</a>\ngenerated directly from an OPAM remote.</p>\n</li>\n<li>\n<p>A growing set of administration tools (via the <code>opam-admin</code> binary)\nthat run health checks and compute statistics over package\nrepositories. For example, here’s the result of running\n<code>opam-admin stats</code> over the latest package repository to show\nvarious growth curves.</p>\n</li>\n</ul>",
      "url": "https://anil.recoil.org/notes/opam-1-1-beta",
      "title": "OPAM 1.1 beta available, with pretty colours",
      "summary": "OPAM 1.1 beta is available with improved stability and new features.",
      "date_published": "2013-09-20T00:00:00.000000Z",
      "date_modified": "2013-09-20T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "opensource",
        "packaging"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2013-oud-platform-1",
      "content_html": "<p>Paper on the OCaml Platform at the OCaml Workshop 2013, presenting the first version of integrated development tools for the OCaml ecosystem. This was a collaborative effort with Amir Chaudhry, Thomas Gazagnaire, David Sheets, Philippe Wang, Leo White, and Jeremy Yallop to define what would become the standard toolchain for OCaml development. The platform concept has evolved significantly since then, but this v0.1 established the foundational principles.</p>",
      "url": "https://anil.recoil.org/notes/2013-oud-platform-1",
      "title": "The OCaml Platform v0.1",
      "summary": "Paper presenting first version of OCaml Platform development tools and infrastructure at OCaml Workshop 2013.",
      "date_published": "2013-09-01T00:00:00.000000Z",
      "date_modified": "2013-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "tooling"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2013-oud-platform.pdf",
          "mime_type": "application/pdf",
          "title": "The OCaml Platform v0.1"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2013-ocamlot-1",
      "content_html": "<p>Presented an OCaml ecosystem testing system at the OCaml Users and Developers Workshop. This was an early CI/CD system designed specifically for the OCaml package ecosystem, long before GitHub Actions and similar services became commonplace. The work with David Sheets, Amir Chaudhry and Thomas Gazagnaire laid important groundwork for what would eventually become the modern OCaml-CI infrastructure we use today across hundreds of projects.</p>",
      "url": "https://anil.recoil.org/notes/2013-ocamlot-1",
      "title": "Ocamlot: Online OCaml Testing",
      "summary": "Presentation of online testing system for the OCaml ecosystem at OCaml Users and Developers Workshop.",
      "date_published": "2013-09-01T00:00:00.000000Z",
      "date_modified": "2013-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "testing",
        "ci",
        "devops",
        "tooling"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2013-ocamlot.pdf",
          "mime_type": "application/pdf",
          "title": "Ocamlot: Online OCaml Testing"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2013-foci-signposts-1",
      "content_html": "<p>Paper on DNSSEC-based Signpost servers for better p2p communications at USENIX FOCI. This paper expands on our SIGCOMM demo, presenting the full technical details of how Signposts uses DNSSEC to provide secure peer-to-peer communications and NAT traversal. The system enables users to establish direct connections between their personal devices even when separated by complex network middleboxes. By building on DNS infrastructure and adding cryptographic authentication, Signposts provides a practical solution to the end-to-end connectivity problems plaguing modern networks.</p>",
      "url": "https://anil.recoil.org/notes/2013-foci-signposts-1",
      "title": "Lost in the Edge: Finding Your Way with DNSSEC Signposts",
      "summary": "Paper on DNSSEC-based Signposts system for improved peer-to-peer communications and NAT traversal at USENIX FOCI.",
      "date_published": "2013-08-01T00:00:00.000000Z",
      "date_modified": "2013-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "dns",
        "signpost",
        "networking",
        "p2p",
        "nat"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2013-foci-signposts.pdf",
          "mime_type": "application/pdf",
          "title": "Lost in the Edge: Finding Your Way with DNSSEC Signposts"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/762795c5-9f3b-499b-a054-b2af37d1ddd2-1",
      "content_html": "<p>Mirage Developer Preview 1 screencast, showing developers how to get started with MirageOS. The screencast walked through initializing and building both Unix and Xen kernels from the same OCaml codebase. This was an important milestone in making MirageOS accessible to developers, demonstrating the workflow for creating unikernels and the power of portable functional code that could target multiple backends.</p>",
      "url": "https://anil.recoil.org/notes/762795c5-9f3b-499b-a054-b2af37d1ddd2-1",
      "title": "Mirage Developer Preview 1 screencast",
      "summary": "Screencast demonstration of Mirage Developer Preview 1 release.",
      "date_published": "2013-07-26T00:00:00.000000Z",
      "date_modified": "2013-07-26T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "ocaml",
        "unikernels",
        "demo"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/grepping-every-known-ocaml-package-source",
      "content_html": "<p>A regular question that comes up from OCaml developers is how to use\n<a href=\"http://opam.ocaml.org\">OPAM</a> as a hypothesis testing tool against the\nknown corpus of OCaml source code. In other words: can we quickly and\nsimply run <code>grep</code> over every source archive in OPAM? So that’s the topic\nof today’s 5 minute blog post:</p>\n<pre><code class=\"language-bash\">git clone git://github.com/ocaml/opam-repository\ncd opam-repository\nopam-admin make\ncd archives\nfor i in *.tar.gz; \\\n  do tar -zxOf $i | grep caml_stat_alloc_string; \\\ndone\n</code></pre>\n<p>In this particular example we’re looking for instances of\n<code>caml_stat_alloc_string</code>, so just replace that with the regular\nexpression of your choice. The <code>opam-admin</code> tool repacks upstream\narchives into a straightforward tarball, so you don’t need to worry\nabout all the different <a href=\"http://opam.ocaml.org/doc/Packaging.html#h1-CreatingOPAMpackages#Notes\">archival\nformats</a>\nthat OPAM supports (such as git or Darcs). It just adds an <code>archive</code>\ndirectory to a normal\n<a href=\"https://github.com/ocaml/opam-repository\">opam-repository</a> checkout, so\nyou can reuse an existing checkout if you have one already.</p>\n<pre><code class=\"language-bash\">$ cd opam-repository/archives\n$ du -h\n669M    .\n$ ls | wc -l\n2092\n</code></pre>",
      "url": "https://anil.recoil.org/notes/grepping-every-known-ocaml-package-source",
      "title": "Grepping the source of every OCaml package in OPAM",
      "summary": "Run grep on every OCaml package in OPAM using a simple script.",
      "date_published": "2013-04-08T00:00:00.000000Z",
      "date_modified": "2013-04-08T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "ocaml",
        "opam",
        "opensource",
        "packaging"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2013-asplos-mirage-1",
      "content_html": "<p>The first paper on unikernels is published at ASPLOS 2013. This landmark paper introduced unikernels as a new approach to cloud computing, where applications are compile-time specialized into standalone kernels sealed against modification. Working with <a href=\"https://github.com/mor1\">Richard Mortier</a>, <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\">Charalampos Rotsos</a>, <a href=\"https://dave.recoil.org\">Dave Scott</a>, <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>, and others, we showed how Mirage achieves an order of magnitude reduction in code size while maintaining performance. The paper demonstrated that combining static type-safety with single address-space layouts creates secure, efficient cloud services, establishing unikernels as a viable alternative to traditional OS stacks.</p><h1>References</h1><ul><li>Madhavapeddy et al (2013). Unikernels: library operating systems for the cloud. ACM. <a href=\"https://doi.org/10.1145/2451116.2451167\" target=\"_blank\"><i>10.1145/2451116.2451167</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2013-asplos-mirage-1",
      "title": "Unikernels: library operating systems for the cloud",
      "summary": "First paper on unikernels published at ASPLOS 2013.",
      "date_published": "2013-03-01T00:00:00.000000Z",
      "date_modified": "2013-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "mirageos",
        "cloud",
        "systems",
        "ocaml"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2013-asplos-mirage.pdf",
          "mime_type": "application/pdf",
          "title": "Unikernels: library operating systems for the cloud"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2451116.2451167",
          "doi": "10.1145/2451116.2451167",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/de13-dataware-1",
      "content_html": "<p>Paper on dataware computing in the digital economy, examining perceived risks of personal data sharing. This interdisciplinary work with researchers from psychology, HCI, and computer science explored how people perceive and manage risks when sharing personal data. The paper contributed to understanding the human factors in personal data management, informing the design of systems like Databox that aimed to give individuals more control over their information.</p>",
      "url": "https://anil.recoil.org/notes/de13-dataware-1",
      "title": "Perceived risks of personal data sharing",
      "summary": "Paper on dataware computing in the digital economy.",
      "date_published": "2013-02-01T00:00:00.000000Z",
      "date_modified": "2013-02-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "privacy",
        "databox",
        "personal-data",
        "security"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/de13-dataware.pdf",
          "mime_type": "application/pdf",
          "title": "Perceived risks of personal data sharing"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2012-conext-pvtcp-1",
      "content_html": "<p>Paper on extending TCP in a backwards compatible way at CoNeXT 2012. This work explored the practical challenges of evolving TCP to add new features while maintaining compatibility with the existing internet infrastructure. We investigated how to introduce protocol extensions in a way that degrades gracefully when communicating with legacy systems, addressing the fundamental tension between innovation and backwards compatibility in critical internet protocols. The paper provides insights into the engineering challenges of updating foundational network protocols.</p><h1>References</h1><ul><li>Nabi et al (2012). Evolving TCP: how hard can it be?. ACM. <a href=\"https://doi.org/10.1145/2413247.2413270\" target=\"_blank\"><i>10.1145/2413247.2413270</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2012-conext-pvtcp-1",
      "title": "Evolving TCP: how hard can it be?",
      "summary": "Paper on extending TCP in a backwards compatible way presented at CoNeXT 2012.",
      "date_published": "2012-12-01T00:00:00.000000Z",
      "date_modified": "2012-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "tcp",
        "protocols",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2012-conext-pvtcp.pdf",
          "mime_type": "application/pdf",
          "title": "Evolving TCP: how hard can it be?"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2413247.2413270",
          "doi": "10.1145/2413247.2413270",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/announcing-ocaml-labs",
      "content_html": "<p>I’m very excited to announce <a href=\"/projects/ocamllabs\">OCaml Labs</a>, the latest project\nto hit the Cambridge Computer Lab. As anyone that hangs out near me\nprobably realises, I very much enjoy functional programming. My weapon\nof choice tends to be <a href=\"http://www.ocaml-lang.org\">OCaml</a>, as it\ncondenses <a href=\"http://events.inf.ed.ac.uk/Milner2012/X_Leroy-html5-mp4.html\">decades of\nresearch</a>\ninto a pragmatic blend of functional, imperative and object-oriented\nprogramming styles. What’s perhaps less well known are the steady\n<a href=\"http://www.ocaml-lang.org/companies.html\">inroads</a> that OCaml has been\nmaking into mission-critical areas of industry. At <a href=\"http://ocaml.janestreet.com\">Jane\nStreet</a>, billions of dollars of\ntransactions are routed through a huge ML code-base that is designed to\ncatch bugs <a href=\"http://vimeo.com/14313378\">at compile-time</a>. At\n<a href=\"http://github.com/xen-org/xen-api\">Citrix</a>, the Xen management\ntoolstack that powers\n<a href=\"http://blogs.citrix.com/2012/10/09/one-in-a-million/\">millions</a> of\nhosts in the cloud is <a href=\"/papers/2010-icfp-xen.pdf\">largely written in\nOCaml</a>. Facebook does\nsophisticated <a href=\"https://github.com/facebook/pfff/wiki/Main\">static\nanalysis</a> using OCaml over\ntheir vast PHP codebase to close security holes.</p>\n<p>The OCaml community is small but dedicated, but there is always more to\ndo to improve the language and ecosystem. 
So, thanks to a generous\nplatform grant from <a href=\"http://ocaml.janestreet.com\">Jane Street</a>, we are\nlaunching a program to help with the open-source development of OCaml\nfrom Cambridge.</p>\n<p>The <em><a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/\">OCaml Labs</a></em> are\nbased in the <a href=\"http://www.cl.cam.ac.uk\">Cambridge Computer Lab</a> and led\nby myself, <a href=\"http://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and <a href=\"http://www.cl.cam.ac.uk/~iml1/\">Ian\nLeslie</a>. We’re closely affiliated with\nother\n<a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/collaboration.html\">groups</a>,\nand will be:</p>\n<ul>\n<li>\n<p>developing the OCaml Platform, which will bundle the official OCaml\ncompiler from INRIA with a tested set of community libraries that\nare refreshed every six months.</p>\n</li>\n<li>\n<p>working with the core OCaml team at INRIA’s\n<a href=\"http://gallium.inria.fr/\">Gallium</a> group on the compiler, and with\ncommercial partners like <a href=\"http://ocamlpro.com\">OCamlPro</a> on tool\ndevelopment. OCamlPro are making some very impressive progress\nalready with the <a href=\"http://opam.ocamlpro.com\">OPAM</a> package manager and\n<a href=\"http://www.typerex.org\">TypeRex</a> IDE helper.</p>\n</li>\n<li>\n<p>supporting the online presence with more teaching material and\ncontent. 
Yaron, Jason and I are working hard on a <a href=\"http://realworldocaml.org\">new\nbook</a> that will be published next year,\nand the OCaml Web team (led by <a href=\"http://ashishagarwal.org\">Ashish</a>\nand\n<a href=\"https://plus.google.com/109604597514379193052/posts\">Christophe</a>)\nhave made great progress on a <a href=\"http://www.ocaml-lang.org\">brand new\nwebsite</a> that we will move to the\n<code>ocaml.org</code> domain soon.</p>\n</li>\n</ul>\n<h3 id=\"research-efforts\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#research-efforts\"></a>Research efforts</h3>\n<p>Of course, it is difficult to hack on a language in a void, and we also\n<em>use</em> OCaml heavily in our own research. The other half of OCaml Lab’s\ngoals are more disruptive (and riskier!):</p>\n<ul>\n<li>The upcoming first beta release of <a href=\"http://openmirage.org\">Mirage</a>,\nwhich is an operating system designed for cloud and embedded\nenvironments, and is written almost entirely from the ground up in\nOCaml. The outputs of Mirage include a <a href=\"http://www.openmirage.org/blog/breaking-up-is-easy-with-opam\">large number of\nlibraries</a>\nwhich are usable separately, such as pure implementations of TCP/IP,\nDNS, SSH, DHCP and HTTP. The Xen hackers, led by <a href=\"http://dave.recoil.org\">David Scott</a>, are out in force to integrate Mirage\ninto their <a href=\"http://www.xen.org/xensummit/xs12na_talks/T2.html\">next-generation</a>\nplatform. 
Meanwhile, Raphael Proust is busy eliminating the <a href=\"/papers/drafts/2012-places-limel-draft1.pdf\">garbage\ncollector</a>\nwith his cut-down “LinearML” variant.</li>\n<li>Working with our collaborators at the <a href=\"http://horizon.ac.uk\">Horizon\nInstitute</a> on privacy-preserving technologies\nsuch as\n<a href=\"/papers/2012-sigcomm-signposts-demo.pdf\">Signposts</a>\nwhich let you build and maintain your own personal clouds that\noperate <a href=\"/papers/2011-icdcn-droplets.pdf\">autonomously</a>\nfrom the central cloud. You can read more about our <a href=\"http://www.cam.ac.uk/research/features/privacy-by-design/\">privacy-by-design</a> philosophy too.</li>\n<li>Extending OCaml to run on secure hardware platforms that don’t\ncompromise on performance, using the MIPS64-based <a href=\"http://www.cl.cam.ac.uk/research/security/ctsrd/cheri.html\">capability\nprocessor</a>\nthat is being developed at the Lab.</li>\n<li>The <a href=\"http://www.trilogy-project.org\">Trilogy</a> project was a hugely\nsuccessful EU-funded effort on the future evolution of the Internet, and\nresulted in <a href=\"http://trilogy-project.org/publications/standards-contributions.html\">numerous\nRFCs</a>\non subjects such as multipath-TCP. We’re participating in the\nfollow-up (imaginatively dubbed “Trilogy2”), and look forward to\nworking on more structured abstractions for programming large-scale\nnetworks.</li>\n</ul>\n<h3 id=\"getting-involved\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#getting-involved\"></a>Getting involved</h3>\n<p>So, how can you get involved? We are initially advertising three\npositions for full-time developers and researchers\n(<a href=\"http://www.jobs.cam.ac.uk/job/-21662/\">junior</a> and\n<a href=\"http://www.jobs.cam.ac.uk/job/-21942/\">senior</a>) to help us get started\nwith the OCaml Platform and compiler development. 
These aren’t\nconventional pure research jobs, and a successful candidate should enjoy\nthe open-source development cycle (you retain your own copyright for\nyour own projects). The Computer Lab offers a pretty unique environment:\na friendly, non-hierarchical group in a beautiful city, and some of the\nbest faculty and students you could hope to hang out with.</p>\n<p>And finally, there is a longer lead time on <a href=\"http://www.cl.cam.ac.uk/admissions/phd/\">applying for\nPhDs</a>, but this is a great time\nto get involved. When I started at the Lab in 2002, a little project\ncalled <a href=\"http://xen.org\">Xen</a> was just kicking off, and many of us had a\nwild (and oft great) time riding that wave. Get in touch with myself,\n<a href=\"http://www.cl.cam.ac.uk/~am21/\">Alan</a>,\n<a href=\"http://www.cl.cam.ac.uk/~iml1/\">Ian</a> or\n<a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon</a> soon if you are interested in\napplying! There’s some more information available on the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/collaboration.html\">OCaml Labs\npages</a>\nabout options.</p>",
      "url": "https://anil.recoil.org/notes/announcing-ocaml-labs",
      "title": "Announcing OCaml Labs",
      "summary": "Introducing OCaml Labs, a new project at Cambridge Computer Lab to develop and improve the OCaml programming language.",
      "date_published": "2012-10-19T00:00:00.000000Z",
      "date_modified": "2012-10-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "janestreet",
        "cambridge",
        "funding"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/breaking-up-mirageos",
      "content_html": "<p>Once the main advantages of having hypervisors is that you can have strongly isolated services within a single machine. But it's really hard to actually build these specialised services; that is, until MirageOS came along.  This post discusses how to build so-called &quot;stub domains&quot; for Xen using MirageOS.</p>",
      "url": "https://anil.recoil.org/notes/breaking-up-mirageos",
      "external_url": "https://mirageos.org/blog/xenstore-stub-domain",
      "title": "Breaking up is easy (with OPAM)",
      "summary": "Build isolated services with ease using MirageOS and OPAM on Xen hypervisors.",
      "date_published": "2012-10-17T00:00:00.000000Z",
      "date_modified": "2012-10-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "xen",
        "mirageos",
        "ocaml",
        "packaging"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/dd8b1f58-c43c-4422-9963-d3a980529e57-1",
      "content_html": "<p>Recording of the OCaml Labs announcement at ICFP 2012, where I announced the formation of OCaml Labs together with Yaron Minsky from Jane Street. We outlined our vision for the OCaml Platform - a comprehensive set of development tools, libraries, and infrastructure to make OCaml more accessible and productive. This marked the beginning of a major effort to strengthen the OCaml ecosystem with funding and dedicated engineering resources.</p>",
      "url": "https://anil.recoil.org/notes/dd8b1f58-c43c-4422-9963-d3a980529e57-1",
      "title": "OUD 2012: Towards an OCaml Platform and Introducing OCaml Labs",
      "summary": "Recording of OCaml Labs announcement and OCaml Platform vision at OCaml Users and Developers workshop.",
      "date_published": "2012-09-17T00:00:00.000000Z",
      "date_modified": "2012-09-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "platform",
        "ocaml-labs",
        "community",
        "tooling"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/xenstore-stub-domain",
      "content_html": "",
      "url": "https://anil.recoil.org/notes/xenstore-stub-domain",
      "external_url": "https://mirageos.org/blog/xenstore-stub-domain",
      "title": "Building a Xenstore stub domain with MirageOS",
      "summary": "Learn how to build a Xenstore stub domain using MirageOS for improved security and performance.",
      "date_published": "2012-09-12T00:00:00.000000Z",
      "date_modified": "2012-09-12T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2012-sigcomm-signposts-1",
      "content_html": "<p>Demoed the Signposts DNSSEC system at SIGCOMM. Working with Amir Chaudhry, <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\">Charalampos Rotsos</a>, <a href=\"https://github.com/mor1\">Richard Mortier</a>, and others, we presented a system providing secure, simple communication channels between personal devices in a world dominated by NATs and middleboxes. Signposts uses DNSSEC-based naming to establish secure end-points while clients dynamically discover routes to overcome network obstacles. The demo showed devices on different networks behind various middleboxes seamlessly interconnecting to share data, with the system automatically adapting as network configurations changed.</p><h1>References</h1><ul><li>Chaudhry et al (2012). Signposts: end-to-end networking in a world of middleboxes. <a href=\"https://doi.org/10.1145/2377677.2377692\" target=\"_blank\"><i>10.1145/2377677.2377692</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2012-sigcomm-signposts-1",
      "title": "Signposts: end-to-end networking in a world of middleboxes",
      "summary": "Demo of Signposts DNSSEC system for end-to-end networking presented at SIGCOMM.",
      "date_published": "2012-09-01T00:00:00.000000Z",
      "date_modified": "2012-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "dnssec",
        "middleboxes",
        "security"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2377677.2377692",
          "doi": "10.1145/2377677.2377692",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2012-oud-xen-1",
      "content_html": "<p>Paper on programming the Xen cloud using OCaml at the OCaml Workshop. This work presented how we use OCaml throughout the Xen cloud stack, from the control plane tooling to the unikernel applications running as guests. We demonstrated how OCaml's strong type system and functional programming features enable building reliable cloud infrastructure, and discussed the practical experiences of deploying OCaml in production cloud environments. The paper helped establish OCaml as a viable language for systems programming in the cloud era.</p>",
      "url": "https://anil.recoil.org/notes/2012-oud-xen-1",
      "title": "Programming the Xen cloud using OCaml",
      "summary": "Paper on programming Xen cloud infrastructure using OCaml at the OCaml Workshop.",
      "date_published": "2012-09-01T00:00:00.000000Z",
      "date_modified": "2012-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "xen",
        "ocaml",
        "cloud",
        "virtualization"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2012-oud-xen.pdf",
          "mime_type": "application/pdf",
          "title": "Programming the Xen cloud using OCaml"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2012-ahans-soapp-1",
      "content_html": "<p>Paper on control flow analysis to break up applications into compartments. This work led by <a href=\"https://www.khilan.com/\">Khilan Gudka</a> introduced SOAAP (Security-Oriented Analysis of Application Programs), a tool for exploring compartmentalization hypotheses in existing C software. The system uses static and dynamic analysis driven by source code annotations to help programmers evaluate different strategies for decomposing applications into sandboxed components. This semi-automated approach addresses the difficulty of adapting legacy software for security compartmentalization while maintaining correctness and performance.</p><h1>References</h1><ul><li>Gudka et al (2012). Exploring Compartmentalisation Hypotheses with SOAAP. IEEE. <a href=\"https://doi.org/10.1109/SASOW.2012.14\" target=\"_blank\"><i>10.1109/SASOW.2012.14</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2012-ahans-soapp-1",
      "title": "Exploring Compartmentalisation Hypotheses with SOAAP",
      "summary": "Paper on control flow analysis techniques for compartmentalizing applications into isolated components.",
      "date_published": "2012-09-01T00:00:00.000000Z",
      "date_modified": "2012-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "compartmentalization",
        "analysis",
        "systems",
        "isolation"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2012-ahans-soapp.pdf",
          "mime_type": "application/pdf",
          "title": "Exploring Compartmentalisation Hypotheses with SOAAP"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/SASOW.2012.14",
          "doi": "10.1109/SASOW.2012.14",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2012-iccsdn-mirageflow-1",
      "content_html": "<p>Paper on using MirageOS for better SDN infrastructure with OpenFlow. Working with <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\">Charalampos Rotsos</a>, <a href=\"https://github.com/mor1\">Richard Mortier</a>, and others, we demonstrated how applications could express their own forwarding logic using OpenFlow to achieve optimal performance in cloud environments. We built modular OpenFlow controllers and switches as Mirage libraries that link directly into applications, providing a safe yet extensible framework for programming network control. The work showed how the unikernel approach could address the lack of traffic isolation in virtualized datacenters while giving applications direct control over their networking.</p><h1>References</h1><ul><li>Rotsos et al (2012). Cost, Performance & Flexibility in OpenFlow: Pick three. IEEE. <a href=\"https://doi.org/10.1109/ICC.2012.6364690\" target=\"_blank\"><i>10.1109/ICC.2012.6364690</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2012-iccsdn-mirageflow-1",
      "title": "Cost, Performance & Flexibility in OpenFlow: Pick three",
      "summary": "Paper on using MirageOS to improve software-defined networking infrastructure with OpenFlow.",
      "date_published": "2012-06-01T00:00:00.000000Z",
      "date_modified": "2012-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "networking",
        "sdn",
        "unikernels"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2012-iccsdn-mirageflow.pdf",
          "mime_type": "application/pdf",
          "title": "Cost, Performance & Flexibility in OpenFlow: Pick three"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/ICC.2012.6364690",
          "doi": "10.1109/ICC.2012.6364690",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2012-mpm-caware-1",
      "content_html": "<p>Paper on our use of data lockers within Cambridge to incentivise more green commuting patterns. This work with Chris Elsmore, Ian Leslie, and Amir Chaudhry explored building a privacy-sensitive architecture for measuring employee travel-to-work carbon footprints. Rather than centralizing location data, we built a distributed system where individuals record fine-grained location information in personal data containers they control, trading portions of data to the organization in exchange for benefits. We piloted this on Cambridge's cloud service, demonstrating how to transform private information into public good with minimal privacy loss.</p><h1>References</h1><ul><li>Elsmore et al (2012). Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting. Association for Computing Machinery. <a href=\"https://doi.org/10.1145/2181196.2181201\" target=\"_blank\"><i>10.1145/2181196.2181201</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2012-mpm-caware-1",
      "title": "Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting",
      "summary": "Paper on privacy-sensitive data locker architecture to incentivize greener commuting patterns in Cambridge.",
      "date_published": "2012-04-01T00:00:00.000000Z",
      "date_modified": "2012-04-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "privacy",
        "carbon",
        "personal-data",
        "ubicomp",
        "climate"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2012-mpm-caware.pdf",
          "mime_type": "application/pdf",
          "title": "Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/2181196.2181201",
          "doi": "10.1145/2181196.2181201",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2012-resolve-fable-1",
      "content_html": "<p>Paper on a new design for reconfigurable IO that copes with heterogeneous software/hardware. Presented at the RESoLVE workshop at ASPLOS, this work proposed a reconfigurable I/O channel architecture to address the challenges of diverse computing environments combining different software stacks and hardware accelerators. We explored how to build flexible I/O abstractions that can adapt to heterogeneous systems, from CPUs to FPGAs to specialized accelerators, without sacrificing performance or composability.</p>",
      "url": "https://anil.recoil.org/notes/2012-resolve-fable-1",
      "title": "The case for reconfigurable I/O channels",
      "summary": "Workshop paper proposing reconfigurable I/O channel architecture for heterogeneous software and hardware environments at RESoLVE ASPLOS.",
      "date_published": "2012-03-01T00:00:00.000000Z",
      "date_modified": "2012-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "systems",
        "io",
        "architecture"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2012-resolve-fable.pdf",
          "mime_type": "application/pdf",
          "title": "The case for reconfigurable I/O channels"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/dreamplug-debian-and-ocaml",
      "content_html": "<p>I’ve been meaning to play with <a href=\"http://www.plugcomputer.org/\">Plug\nComputers</a> for some time now, as I need a\nlow-power embedded system around the house. I recently bought a <a href=\"http://soekris.com/products/net6501.html\">Soekris\nNet6501</a> (a pretty powerful\nIntel CPU, that even has VT support), but had annoying\n<a href=\"http://marc.info/?l=soekris-tech&amp;m=132915532912206&amp;w=2\">issues</a> getting\nit working reliably. I ordered an ARM-based\n<a href=\"http://www.newit.co.uk/shop/products.php?cat=21\">Dreamplug</a> as an\nalternative (and as a bonus, the Dreamplug is 6x cheaper than the\nSoekris!). Here are my notes on getting it to work.</p>\n<p><a href=\"http://www.flickr.com/photos/tlamer/5693063642/\" title=\"dreamplug by tlamer, on Flickr\"><img src=\"http://farm6.staticflickr.com/5230/5693063642_47aa7c4c99.jpg\" alt=\"dreamplug\" ></a></p>\n<p>Requirements:</p>\n<ul>\n<li>Aside from the Dreamplug itself, make sure you order the optional\nJTAG module. This provides a serial console that is essential to\ngetting any development done with it.</li>\n<li>I also grabbed the extra 16GB Class 10 SLC SD Card, to act as my\nhome directory.</li>\n<li>You will also need another functional system running Debian (or a VM\non your Mac; whatever is easiest). The JTAG drivers for the USB\nserial are easiest to get running on Linux.</li>\n</ul>\n<p>The Dreamplug arrived with a working installation, but running the\nabsolutely ancient Debian Lenny. A dist-upgrade through to Wheezy led to\nbricking it almost immediately, and so I did a fresh installation from\nscratch.</p>\n<p>For a fresh installation, place a USB stick of suitable size (greater\nthan 2GB is best) into your functional Debian installation. Then:</p>\n<ul>\n<li>\n<p>The Marvell bootloader boots from a VFAT partition, so you will need\ntwo partitions. 
The first should be a small <code>fat16</code> (I picked 150MB)\nand the remainder an <code>ext3</code> partition for Linux itself. There are\ngood instructions available on the\n<a href=\"https://trac.torproject.org/projects/tor/wiki/doc/DebianDreamPlug\">Tor/Dreamplug</a>\nwiki which show you how to do this.</p>\n</li>\n<li>\n<p>I grabbed the latest kernel (at this time, 3.2.7) from\n<a href=\"http://sheeva.with-linux.com/sheeva/3/3.2/3.2.7/\">with-linux</a>, and\ninstalled it with the following commands (assuming your USB stick is\n<code>/dev/sdb</code>).</p>\n<pre><code>$ sudo mount /dev/sdb1 /mnt\n$ sudo cp uImage /mnt\n$ sudo umount /mnt\n</code></pre>\n</li>\n<li>\n<p>You now need to use <code>debootstrap</code> to install a fresh root image.\nBecause it is ARM and your main PC is probably an x86, you will need\nto set up the QEMU CPU emulator. An extremely cool feature of QEMU is\nthat it can do <a href=\"http://wiki.debian.org/QemuUserEmulation\">transparent\nemulation</a> of foreign\nbinaries, so you can chroot directly into the ARM filesystem and run\ncommands as if they were x86. 
The <code>qemu-debootstrap</code> command will\ntake care of this for you, if you perform the steps below (again,\nassuming your USB stick is <code>/dev/sdb</code>).</p>\n<pre><code>$ sudo apt-get install qemu-user-static debootstrap\n$ sudo mount /dev/sdb2 /mnt\n$ sudo mkdir -p /mnt/usr/bin\n$ sudo cp /usr/bin/qemu-arm-static /mnt/usr/bin/\n$ sudo qemu-debootstrap --arch=armel wheezy /mnt http://ftp.uk.debian.org/debian/\n</code></pre>\n</li>\n<li>\n<p>Now grab the kernel modules from the same place as your uImage (for\n3.2.7, from\n<a href=\"http://sheeva.with-linux.com/sheeva/3/3.2/3.2.7/sheeva-3.2.7-Modules.tar.gz\">here</a>).\nThen, chroot into your fresh installation and untar them.</p>\n<pre><code>$ cd /mnt\n$ sudo tar -zxvf ~/sheeva-3.2.7-Modules.tar.gz\n$ sudo chroot /mnt\n$ depmod -a\n# edit /etc/network/interfaces\n# edit /etc/resolv.conf\n</code></pre>\n</li>\n<li>\n<p>The wireless setup involves some extremely crap firmware which\nrelentlessly kernel panicked for me, so I just disabled it by adding\nthe following to <code>/etc/modprobe.d/dpwifiap.conf</code>, as I only want\nwired access:</p>\n<pre><code>blacklist libertas\nblacklist libertas_sdio\n</code></pre>\n</li>\n<li>\n<p>From there on, put the USB stick into the Dreamplug, and follow the\nrest of the boot instructions from the <a href=\"https://trac.torproject.org/projects/tor/wiki/doc/DebianDreamPlug\">Tor\nwiki</a>\nto interact with the Marvell BIOS and boot from the USB stick. I\ncopied the contents of the USB stick onto the internal MicroSD, and\nit all boots standalone now.</p>\n</li>\n</ul>\n<h2 id=\"ocaml-on-arm\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ocaml-on-arm\"></a>OCaml on ARM</h2>\n<p>One of the reasons I wanted an ARM-based setup is to experiment with the\nOCaml native code generation. 
<a href=\"http://www.home.unix-ag.org/bmeurer/index.html\">Benedikt\nMeurer</a> has been doing\nsome excellent work on <a href=\"http://old.nabble.com/New-ARM-backend-merged-into-trunk-td33262083.html\">improving code\ngeneration</a>\nfor embedded systems, including support for 16-bit Thumb code, exception\nbacktraces, and dynamic linking and profiling.</p>\n<p>Once Linux was up and running, compiling up the latest ocaml-trunk was\nstraightforward.</p>\n<pre><code>    $ sudo apt-get install build-essential git\n    $ git clone http://github.com/OCamlPro/ocp-ocaml svn-trunk\n    $ cd ocp-ocaml\n    $ ./configure &amp;&amp; make world opt opt.opt install\n</code></pre>\n<p>This compiles the bytecode and native code compilers, and then compiles\nthem again using the native code generator. This takes a while to do on\nthe poor little ARM CPU. Once that finished, I compiled up a few simple\nmodules and they worked great! Since the trunk of OCaml is a development\nbranch, you may run into a few packaging issues (use the very latest\nOASIS to regenerate any <code>setup.ml</code>, and you will need a small patch\nuntil <a href=\"http://caml.inria.fr/mantis/view.php?id=5503\">PR 5503</a> is\napplied).</p>\n<p>Incidentally, if anyone is interested in working on a\n<a href=\"http://openmirage.org\">Mirage</a> port to ARM as an internship in the\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/\">Cambridge Computer Lab</a>,\ndo get in touch with me...</p>",
      "url": "https://anil.recoil.org/notes/dreamplug-debian-and-ocaml",
      "title": "Dreaming of an ARM OCaml",
      "summary": "Getting OCaml running on an ARM-based Dreamplug device with Debian and native code generation.",
      "date_published": "2012-02-25T00:00:00.000000Z",
      "date_modified": "2012-02-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "unikernels",
        "mirageos",
        "ocaml"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2011-cufp-scribe-1",
      "content_html": "<p>Published the scribe's report for CUFP 2011. This workshop report, co-authored with <a href=\"https://github.com/yminsky\">Yaron Minsky</a> and <a href=\"https://monkey.org/~marius/\">Marius Eriksen</a> and published in the Journal of Functional Programming, summarizes the talks delivered at CUFP 2011 in Tokyo. The Commercial Users of Functional Programming workshop brings together software developers who use FP in real-world settings, and our scribe report captures the diverse applications and experiences shared by practitioners. Videos and slides from all the talks are available online for the community.</p><h1>References</h1><ul><li>Madhavapeddy et al (2012). CUFP 2011 Workshop Report. <a href=\"https://doi.org/10.1017/S0956796812000020\" target=\"_blank\"><i>10.1017/S0956796812000020</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2011-cufp-scribe-1",
      "title": "CUFP 2011 Workshop Report",
      "summary": "Published the scribe's report for CUFP 2011.",
      "date_published": "2012-01-01T00:00:00.000000Z",
      "date_modified": "2012-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "fp",
        "cufp",
        "workshop",
        "community"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1017/S0956796812000020",
          "doi": "10.1017/S0956796812000020",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/7d949597-b864-4ada-ab1a-81ff8c0463e2-1",
      "content_html": "<p>At the OCaml Meeting 2011 speaking about MirageOS in France, presenting the project to the OCaml community. This was an early opportunity to share our vision for using OCaml to build entire operating systems and get feedback from the language developers and enthusiasts. The talk covered how we were extending OCaml with new runtime capabilities to generate Xen kernels, and the potential for functional programming in systems development.</p>",
      "url": "https://anil.recoil.org/notes/7d949597-b864-4ada-ab1a-81ff8c0463e2-1",
      "title": "OCaml Meeting 2011 - MirageOS",
      "summary": "Presentation on MirageOS at the OCaml Meeting 2011 conference.",
      "date_published": "2011-10-19T00:00:00.000000Z",
      "date_modified": "2011-10-19T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "ocaml",
        "unikernels",
        "systems"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/cufp-2011-mirage",
      "content_html": "<p>We signed up to do a MirageOS tutorial at ICFP, which is a bit daunting: we had to get all the embedded ARM hardware and laptop support in shape, as well as make it work for a bunch of discerning hackers.</p>\n<p><img src=\"/images/cufp11-1.webp\" alt=\"%c\" ></p>\n<p><img src=\"/images/cufp11-2.webp\" alt=\"%c\" ></p>",
      "url": "https://anil.recoil.org/notes/cufp-2011-mirage",
      "external_url": "https://mirageos.org/blog/an-outing-to-cufp",
      "title": "An outing to CUFP 2011 for Mirage",
      "summary": "Attending CUFP 2011 with MirageOS tutorial at ICFP.",
      "date_published": "2011-09-29T00:00:00.000000Z",
      "date_modified": "2011-09-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "cufp",
        "icfp"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2011-dynamics-ml-1",
      "content_html": "<p>Published dyntype at the Workshop on Generative Technologies. This paper presents the dyntype library for ML meta-programming, which provides statically typed value persistence through code generation. The work explores how to use OCaml's type system and meta-programming capabilities to automatically derive serialization and deserialization code, enabling type-safe storage and retrieval of complex data structures. This became a foundational library for many OCaml systems that needed reliable data persistence.</p><h1>References</h1><ul><li>Gazagnaire et al (2011). Dynamics for ML using Meta-Programming. <a href=\"https://doi.org/10.1016/j.entcs.2011.06.002\" target=\"_blank\"><i>10.1016/j.entcs.2011.06.002</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2011-dynamics-ml-1",
      "title": "Dynamics for ML using Meta-Programming",
      "summary": "Workshop on Generative Technologies paper on dyntype library for ML meta-programming.",
      "date_published": "2011-07-01T00:00:00.000000Z",
      "date_modified": "2011-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "fp",
        "meta-programming"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2011-dynamics-ml.pdf",
          "mime_type": "application/pdf",
          "title": "Dynamics for ML using Meta-Programming"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1016/j.entcs.2011.06.002",
          "doi": "10.1016/j.entcs.2011.06.002",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2010-dyntype-wgt-1",
      "content_html": "<p>Paper on statically typed value persistence for OCaml in ENTCS 2011. This work on the dyntype library explored using meta-programming to provide type-safe serialization and persistence in OCaml. The approach uses OCaml's type system to automatically generate code for converting values to and from various representations, enabling developers to persist complex data structures without sacrificing type safety. The library became an important building block for systems that needed to store and retrieve OCaml values reliably.</p>",
      "url": "https://anil.recoil.org/notes/2010-dyntype-wgt-1",
      "title": "Dynamics for ML using Meta-Programming",
      "summary": "Paper on statically typed value persistence for OCaml published in ENTCS 2011.",
      "date_published": "2011-07-01T00:00:00.000000Z",
      "date_modified": "2011-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "metaprogramming",
        "fp",
        "types",
        "persistence"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2010-dyntype-wgt.pdf",
          "mime_type": "application/pdf",
          "title": "Statically-typed value persistence for ML"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/delimited-cont-vs-lwt",
      "content_html": "",
      "url": "https://anil.recoil.org/notes/delimited-cont-vs-lwt",
      "external_url": "https://mirageos.org/blog/delimcc-vs-lwt",
      "title": "Delimited continuations vs Lwt for threads",
      "summary": "Delimited continuations and Lwt compared for threading purposes.",
      "date_published": "2011-06-18T00:00:00.000000Z",
      "date_modified": "2011-06-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "ocaml",
        "multicore",
        "mirageos"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/k1mwb-s6k33",
      "content_html": "<p>Distributed programming frameworks like\n<a href=\"http://wiki.apache.org/hadoop\">Hadoop</a> and\n<a href=\"http://research.microsoft.com/en-us/projects/dryad/\">Dryad</a> are popular\nfor performing computation over large amounts of data. The reason is\nprogrammer convenience: they accept a query expressed in a simple form\nsuch as <a href=\"http://wiki.apache.org/hadoop/HadoopMapReduce\">MapReduce</a>, and\nautomatically take care of distributing computation to multiple hosts,\nensuring the data is available at all nodes that need it, and dealing\nwith host failures and stragglers.</p>\n<p>A major limitation of Hadoop and Dryad is that they are not well-suited\nto expressing <a href=\"http://en.wikipedia.org/wiki/Iterative_method\">iterative\nalgorithms</a> or <a href=\"http://en.wikipedia.org/wiki/Dynamic_programming\">dynamic\nprogramming</a> problems.\nThese are very commonly found patterns in many algorithms, such as\n<a href=\"http://en.wikipedia.org/wiki/K-means_clustering\">k-means clustering</a>,\n<a href=\"http://en.wikipedia.org/wiki/Binomial_options_pricing_model\">binomial options\npricing</a> or\n<a href=\"http://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm\">Smith Waterman</a>\nfor sequence alignment.</p>\n<p>Over in the SRG in Cambridge,\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/who-we-are/\">we</a>\ndeveloped a Turing-powerful distributed execution engine called\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/\">CIEL</a> that addresses\nthis. The <a href=\"/papers/2011-nsdi-ciel\">CIEL: A universal execution engine for distributed data-flow computing</a>\npaper describes the system in detail, but here’s a shorter introduction.</p>\n<h2 id=\"the-ciel-execution-engine\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-ciel-execution-engine\"></a>The CIEL Execution Engine</h2>\n<p>CIEL consists of a master coordination server and workers installed on\nevery host. 
The engine is job-oriented: a job consists of a graph of\ntasks which results in a deterministic output. CIEL tasks can run in any\nlanguage and are started by the worker processes as needed. Data flows\naround the cluster in the form of <em>references</em> that are fed to tasks as\ndependencies. Tasks can publish their outputs either as <em>concrete</em>\nreferences if they can finish the work immediately or as a <em>future</em>\nreference. Additionally, tasks can dynamically spawn more tasks and\ndelegate references to them, which makes the system Turing-powerful and\nsuitable for iterative and dynamic programming problems where the task\ngraph cannot be computed statically.</p>\n<p>The first iteration of CIEL used a domain-specific language called\n<a href=\"/papers/2011-nsdi-ciel.pdf\">Skywriting</a> to\ncoordinate how tasks should run across a cluster. Skywriting is an\ninterpreted language that is “native” to CIEL, and when it needs to\nblock it stores its entire execution state inside CIEL as a\ncontinuation. <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek Murray</a> has\nwritten a blog post <a href=\"http://www.syslog.cl.cam.ac.uk/2011/04/06/ciel/\">explaining this in more\ndetail</a>.</p>\n<p>More recently, we have been working on eliminating the need for\nSkywriting entirely, by adding direct support for CIEL into languages\nsuch as <a href=\"http://www.stackless.com/\">Python</a>, Java,\n<a href=\"http://www.scala-lang.org/\">Scala</a>, and the main subject of this post –\n<a href=\"http://caml.inria.fr\">OCaml</a>. It works via libraries that communicate\nwith CIEL to spawn tasks, publish references, or suspend itself into the\ncluster to be woken up when a future reference is completed.</p>\n<h2 id=\"datacaml-api\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#datacaml-api\"></a>DataCaml API</h2>\n<p>Rather than go into too much detail about the innards of CIEL, this post\ndescribes the OCaml API and gives some examples of how to use it. 
The\nsimplest interface to start with is:</p>\n<pre><code class=\"language-ocaml\">type 'a ref\nval deref : 'a ref -&gt; 'a\n</code></pre>\n<p>The type <code>'a ref</code> represents a CIEL reference. This data might not be\nimmediately present on the current node, and so must be dereferenced\nusing the <code>deref</code> function.</p>\n<p>If the reference has been completed, then the OCaml value is\nunmarshalled and returned. If it is not present, then the program needs\nto wait until the computation involving the reference has completed\nelsewhere. The future reference might contain a large data structure and\nbe on another host entirely, and so we should serialise the program\nstate and spawn a task that is dependent on the future’s completion.\nThis way, CIEL can resume execution on whatever node finished that\ncomputation, avoiding the need to move data across the network.</p>\n<p>Luckily, we do not need to serialise the entire heap to suspend the\nprogram. DataCaml uses the\n<a href=\"http://okmij.org/ftp/continuations/implementations.html\">delimcc</a>\ndelimited continuations library to walk the stack and save only the\nsubset required to restart this particular task. Delimcc abstracts this\nin the form of a “restartable exception” that supplies a closure which can\nbe called later to resume the execution, as if the exception had never\nhappened. Delimcc supports serialising this closure to an output\nchannel, which you can read about in Oleg’s\n<a href=\"http://okmij.org/ftp/continuations/caml-shift.pdf\">paper</a>.</p>\n<p>So how do we construct references? Let’s fill in more of the interface:</p>\n<pre><code class=\"language-ocaml\">module Ciel : sig\n  type 'a ref\n  val deref : 'a ref -&gt; 'a\n  val spawn : ('a -&gt; 'b) -&gt; 'a -&gt; 'b ref\n  val run : (string list -&gt; 'a) -&gt; ('a -&gt; string) -&gt; unit\nend\n</code></pre>\n<p>The <code>spawn</code> function accepts a closure and an argument, and returns a\nfuture of the result as a reference. 
The <code>run</code> function begins the\nexecution of a job, with the first parameter taking some\n<code>string arguments</code> and returning an <code>'a</code> value. We also supply a\npretty-printer second argument to convert the <code>'a</code> into a string for\nreturning as the result of the job (this can actually be any JSON value\nin CIEL, and is just simplified here).</p>\n<pre><code class=\"language-ocaml\">let r1 = spawn (fun x -&gt; x + 5) arg1 in\nlet r2 = spawn (fun x -&gt; deref r1 + 5) arg1 in\nderef r2\n</code></pre>\n<p>We first spawn a function <code>r1</code> which simply adds 5 to the job argument.\nA job in CIEL is <em>lazily scheduled</em>, so this marshals the function to\nCIEL, creates a future, and returns immediately. Next, the <code>r2</code> function\nspawns a task which also adds 5, but to the dereferenced value of <code>r1</code>.\nAgain, it is not scheduled yet as the return reference has not been\ndereferenced.</p>\n<p>Finally, we attempt to dereference <code>r2</code>, which causes it to be scheduled on\na worker. While executing, it will try to dereference <code>r1</code>, which will\nschedule it, and all the tasks will run to completion.</p>\n<p>Programming language boffins will recognise that this interface is very\nsimilar to <a href=\"http://www.ps.uni-saarland.de/alice/\">AliceML</a>’s concept of\n<a href=\"http://www.ps.uni-saarland.de/alice/manual/futures.html\">lazy futures</a>.\nThe main difference is that it is implemented as a pure OCaml library,\nand uses a general-purpose distributed engine that can also work with\nother languages.</p>\n<h2 id=\"streaming-references\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#streaming-references\"></a>Streaming References</h2>\n<p>The references described so far only have two states: they are either\nconcrete or futures. However, there are times when a task can\nprogressively accept input and make forward progress. 
For these\nsituations, references can also be typed as <em>opaque</em> references that are\naccessed via <code>in_channel</code> and <code>out_channel</code>, as networks are:</p>\n<pre><code class=\"language-ocaml\">type opaque_ref\n\nval spawn_ref : (unit -&gt; opaque_ref) -&gt; opaque_ref\nval output : ?stream:bool -&gt; ?pipe:bool -&gt; (out_channel -&gt; unit) -&gt; opaque_ref\nval input : (in_channel -&gt; 'a) -&gt; opaque_ref -&gt; 'a\n</code></pre>\n<p>This interface is a lower-level version of the previous one:</p>\n<ul>\n<li><code>spawn_ref</code> creates a lazy future as before, but the type of\nreferences here is completely opaque to the program.</li>\n<li>Inside a spawned function, <code>output</code> is called with a closure that\naccepts an <code>out_channel</code>. The <code>stream</code> argument informs CIEL that a\ndependent task can consume the output before it is completed, and\n<code>pipe</code> forms an even more closely coupled shared-memory connection\n(requiring the tasks to be scheduled on the same host). Piping is\nmore efficient, but will require more work to recover from a fault,\nand so using it is left to the programmer to decide.</li>\n<li>The <code>input</code> function is used by the receiving task to parse the\ninput as a standard <code>in_channel</code>.</li>\n</ul>\n<p>The CIEL engine actually supports multiple concurrent input and output\nstreams to a task, but I’ve just bound it as a single version for now\nwhile the bindings find their feet. 
Here’s an example of how streaming\nreferences can be used:</p>\n<pre><code class=\"language-ocaml\">let x_ref = spawn_ref (fun () -&gt;\n    output ~stream:true (fun oc -&gt;\n      for i = 0 to 5 do\n        Unix.sleep 1;\n        fprintf oc &quot;%d\\n%!&quot; i;\n      done\n    )\n  ) in\n  let y_ref = spawn_ref (fun () -&gt;\n    input (fun ic -&gt;\n      output ~stream:true (fun oc -&gt;\n        for i = 0 to 5 do\n          let line = input_line ic in\n          fprintf oc &quot;LINE=%s\\n%!&quot; line\n        done\n      )\n    ) x_ref\n  ) in\n</code></pre>\n<p>We first spawn an <code>x_ref</code> which pretends to do 6 seconds of work by\nsleeping and outputting a number. This would of course be heavy number\ncrunching in a real program. The <code>y_ref</code> then inputs this stream, and\noutputs its own result by prepending a string to each line.</p>\n<h2 id=\"try-it-out\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#try-it-out\"></a>Try it out</h2>\n<p>If you are interested in a more realistic example, then read through the\n<a href=\"https://github.com/avsm/ciel/blob/master/src/ocaml/binomial.ml\">binomial\noptions</a>\ncalculator that uses streaming references to parallelise a dynamic\nprogramming problem (this would be difficult to express in MapReduce).\nOn my Mac, I can run this by:</p>\n<ul>\n<li>check out CIEL from Derek’s <a href=\"http://github.com/mrry/ciel\">Git\nrepository</a>.</li>\n<li>install all the Python libraries required (see the <code>INSTALL</code> file)\nand OCaml libraries\n(<a href=\"http://okmij.org/ftp/continuations/implementations.html\">delimcc</a>\nand <a href=\"http://martin.jambon.free.fr/yojson.html\">Yojson</a>).</li>\n<li>add <code>&lt;repo&gt;/src/python</code> to your <code>PYTHONPATH</code></li>\n<li>in one terminal: <code>./scripts/run_master.sh</code></li>\n<li>in another terminal: <code>./scripts/run_worker.sh -n 5</code> (this allocates\n5 execution slots)</li>\n<li>build the OCaml libraries: <code>cd 
src/ocaml &amp;&amp; make</code></li>\n<li>start the binomial options job:\n<code>./scripts/sw-start-job -m http://localhost:8000 ./src/package/ocaml_binopt.pack</code></li>\n<li>there will be a URL printed which shows the execution progress in\nreal-time</li>\n<li>you should see log activity on the worker(s), and a result reference\nwith the answer (<code>10.x</code>)</li>\n<li>let us know the happy news if it worked or sad news if something\nbroke</li>\n</ul>\n<h2 id=\"discussion\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#discussion\"></a>Discussion</h2>\n<p>The DataCaml bindings outlined here provide an easy way to write\ndistributed, fault-tolerant and cluster-scheduled jobs in OCaml. The\ncurrent implementation of the engine is aimed at cluster computation,\nbut <a href=\"http://www.cl.cam.ac.uk/~ms705\">Malte</a> has been working on\n<a href=\"http://www.cl.cam.ac.uk/~ms705/pub/papers/2011-ciel-sfma.pdf\">condensing CIEL onto multicore\nhardware</a>.\nThus, this could be one approach to ‘solving the OCaml multicore\nproblem’ for problems that fit nicely into the dataflow paradigm.</p>\n<p>The biggest limitation for using these bindings is that delimited\ncontinuation serialisation only works in bytecode. Native code delimcc\nsupports <code>shift/reset</code> in the same program, but serialising is\nproblematic since native code continuations contain a C stack, which may\nhave unboxed integers. One way to work around this is by switching to\na monadic approach to dereferencing, but I find delimcc programming more\nnatural (also see <a href=\"http://www.openmirage.org/wiki/delimcc-vs-lwt\">this\ndiscussion</a>).</p>\n<p>Another important point is that tasks are lazy and purely functional\n(remind you of Haskell?). This is essential for reliable fault-tolerance\nand reproducibility, while allowing individual tasks to run fast, strict\nand mutable OCaml code. 
The tasks must remain referentially transparent\nand idempotent, as CIEL may choose to schedule them multiple times (in\nthe case of faults or straggler correction). Derek has been working on\n<a href=\"http://www.cl.cam.ac.uk/~dgm36/publications/2011-murray2011nondet.pdf\">integrating non-determinism into\nCIEL</a>,\nso this restriction may be relaxed soon.</p>\n<p>Finally, these ideas are not limited to OCaml at all, but also apply to\nScala, Java, and Python. We have submitted a draft paper dubbed <em>‘<a href=\"http://www.cl.cam.ac.uk/~ms705/pub/papers/2011-ciel-socc-draft.pdf\">A\nPolyglot Approach to Cloud\nProgramming</a>’</em>\nwith more details and the ubiquitous evaluation versus Hadoop. There is\na really interesting line to explore between low-level\n<a href=\"http://en.wikipedia.org/wiki/Message_Passing_Interface\">MPI</a> coding and\nhigh-level MapReduce, and we think CIEL is a useful spot in that design\nspace.</p>\n<p>Incidentally, I was recently hosted by <a href=\"http://research.nokia.com/\">Nokia\nResearch</a> in Palo Alto by my friend\n<a href=\"http://www.linkedin.com/pub/prashanth-mundkur/6/b44/27\">Prashanth\nMundkur</a>, where\nthey work on the Python/Erlang/OCaml <a href=\"http://discoproject.org/\">Disco</a>\nMapReduce engine. I’m looking forward to seeing more critical\ncomparisons and discussions of alternatives to Hadoop, from them and\nothers.</p>\n<p><em>Thanks are due to <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek</a>,\n<a href=\"https://twitter.com/#!/chrissmowton\">Chris</a> and\n<a href=\"http://www.cl.cam.ac.uk/~ms705\">Malte</a> for answering my incessant CIEL\nquestions while writing this post! Remember that DataCaml is a work in\nprogress and a research prototype, and feedback is most welcome.</em></p>",
      "url": "https://anil.recoil.org/notes/datacaml-with-ciel",
      "title": "DataCaml: distributed dataflow programming in OCaml",
      "summary": "DataCaml brings distributed dataflow programming to OCaml using the CIEL engine.",
      "date_published": "2011-06-11T00:00:00.000000Z",
      "date_modified": "2011-06-11T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "distributed",
        "ocaml",
        "cloud",
        "fp"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2011-fccm-cloudfpga-1",
      "content_html": "<p>Paper on what a Xen+FPGA cloud would look like at FCCM. This work with <a href=\"https://raintown.org\">Satnam Singh</a> explored reconfigurable data processing for clouds, addressing how FPGAs could help solve practical problems in scaling out data-centers where computation is limited by energy consumption or latency. We identified the prerequisites for making reconfigurable computing practical in cloud environments and described scenarios enabled by combining virtualization with hardware acceleration. The paper laid groundwork for thinking about heterogeneous computing in the cloud era.</p><h1>References</h1><ul><li>Madhavapeddy et al (2011). Reconfigurable Data Processing for Clouds. IEEE. <a href=\"https://doi.org/10.1109/FCCM.2011.35\" target=\"_blank\"><i>10.1109/FCCM.2011.35</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2011-fccm-cloudfpga-1",
      "title": "Reconfigurable Data Processing for Clouds",
      "summary": "Paper exploring Xen and FPGA integration for cloud computing presented at FCCM.",
      "date_published": "2011-05-01T00:00:00.000000Z",
      "date_modified": "2011-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "fpga",
        "cloud",
        "xen",
        "virtualization"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2011-fccm-cloudfpga.pdf",
          "mime_type": "application/pdf",
          "title": "Reconfigurable Data Processing for Clouds"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/FCCM.2011.35",
          "doi": "10.1109/FCCM.2011.35",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/srg-fp",
      "content_html": "<p>We've been doing loads of OCaml programming in the Systems Research Group, and this blog post lays out some of the things going on. It ranges from OCaml hacking, over to the CIEL distributed execution engine, and even some Haskell hacking ongoing for distributed execution.</p>",
      "url": "https://anil.recoil.org/notes/srg-fp",
      "external_url": "https://web.archive.org/web/20151028011702/http://www.syslog.cl.cam.ac.uk/2011/04/18/functional-programming-gone-wild-in-the-srg/",
      "title": "Functional programming gone wild in the SRG",
      "summary": "Exploring functional programming projects in the SRG, including OCaml and Haskell.",
      "date_published": "2011-04-18T00:00:00.000000Z",
      "date_modified": "2011-04-18T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs",
        "computerlab",
        "ocaml",
        "fp"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/zb6ye-yfk38",
      "content_html": "<p>I'm at the <a href=\"https://forge.ocamlcore.org/plugins/mediawiki/wiki/ocaml-meeting/index.php/OCamlMeeting2011\">2011 OCaml Users Group</a> in Paris, reporting on some splendid talks this year. It looked like around 60-70 people in the room, and I had the pleasure of meeting users all the way from <a href=\"http://ru.linkedin.com/pub/dmitry-bely/4/955/717\">Russia</a> to <a href=\"http://ashishagarwal.org/about/\">New York</a> as well as all the Europeans!</p>\n<h3 id=\"js_of_ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#js_of_ocaml\"></a>Js_of_ocaml</h3>\n<p>First up was <a href=\"http://www.lsv.ens-cachan.fr/~chambart/\">Pierre Chambart</a> talking about the <a href=\"http://ocsigen.org/js_of_ocaml/\">js_of_ocaml</a> compiler. It compiles OCaml bytecode directly to Javascript, with few external dependencies. Since the bytecode format changes very rarely, it is simpler to maintain than alternatives (such as Jake Donham’s <a href=\"https://github.com/jaked/ocamljs\">ocamljs</a>) that require patching the compiler tool-chain. Javascript objects are mapped to dynamic OCaml objects via a light-weight <code>##</code> operator, so you can simply write code like:</p>\n<pre><code>class type window = object\n  method alert : js_string t -&gt; unit meth\n  method name : js_string t prop\nend\n\nlet window : window t =\n  Js.Unsafe.variable &quot;window&quot;\n\nlet () =\n  window##alert (window##name);\n  window##name &lt;- Js.string &quot;name&quot;\n</code></pre>\n<p>Overloading is handled similarly to <a href=\"http://pyobjc.sourceforge.net/\">PyObjC</a>, with each parameter combination being mapped into a uniquely named function. <a href=\"https://github.com/raphael-proust\">Raphael Proust</a> then demonstrated a cool game he wrote using <a href=\"https://github.com/raphael-proust/raphael\">bindings</a> to the <a href=\"http://raphaeljs.com/\">Raphael</a> Javascript vector graphics library. 
Performance of <code>js_of_ocaml</code> is good compared to writing it by hand, and they have quite a few <a href=\"http://ocsigen.org/js_of_ocaml/doc/1.0.2/manual/performances\">benchmarks</a> on their website.</p>\n<p>Overall the project looks very usable: the main omissions are Bigarray, dynlink, Str (replaced by native regexps), recursive modules and weak references. None of these missing features seem very critical for the sorts of applications that <code>js_of_ocaml</code> is intended for.</p>\n<h3 id=\"ocaml-on-a-pic-ocapic\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ocaml-on-a-pic-ocapic\"></a>OCaml on a PIC (OCAPIC)</h3>\n<p>Next up Philippe Wang presented something completely different: <a href=\"http://www.algo-prog.info/ocaml_for_pic/web/index.php\">running OCaml on tiny 8-bit PIC microcontrollers</a>! These PICs have 4-128Kb of flash (to store the code), and from 256 <em>bytes</em> to 4 kilobytes of RAM. Not a lot of room to waste there. He demonstrated an example with a game with 24 physical push buttons that beat humans at a conference (JFLA).</p>\n<p>It works by translating OCaml bytecode through several stages: <code>ocamlclean</code> to eliminate dead code in the bytecode (which would be very useful for native code too!), a compression step that does run-length encoding, and then translation to PIC assembly. They have a replacement stop-and-copy GC (150 lines of assembly) and a full collection cycle runs in less than 1.5ms. Integers are 15-bits (with 1 bit reserved) and the block representation is the same as native OCaml. Very cool project!</p>\n<h3 id=\"frama-c\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#frama-c\"></a>Frama-C</h3>\n<p>We went on to static analysis and <a href=\"http://www.linkedin.com/pub/julien-signoles/24/5a9/4b4\">Julien Signoles</a> presented <a href=\"http://frama-c.com/\">Frama-C</a>, a powerful static analysis tool for real-world C. 
It forks the <a href=\"http://www.eecs.berkeley.edu/~necula/cil/\">CIL</a> project from Berkeley and adds <a href=\"http://ocamlgraph.lri.fr/\">ocamlgraph</a> and GUI support. He demonstrated a simple plugin that counts loops in C code, and the homepage has many interesting <a href=\"http://frama-c.com/plugins.html\">plugins</a> maintained by the community.</p>\n<p>I hadn’t realised that CIL was still maintained in the face of <a href=\"http://clang.llvm.org/\">clang</a>, so it’s nice to see it live on as part of Frama-C.</p>\n<h3 id=\"ocsigen\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ocsigen\"></a>Ocsigen</h3>\n<p>The ever-cheerful <a href=\"http://www.pps.jussieu.fr/~balat/\">Vincent Balat</a> updated us about the <a href=\"http://ocsigen.org\">Ocsigen</a> web framework, including unveiling their exciting new logo! This was written using an amazing <a href=\"http://ocsigen.org/tutorial/tutorial1\">collaborative editor</a> that lets users edit in real time.</p>\n<p>Ocsigen is based around <em>services</em> of type <code>service: parameters -&gt; page</code>. Services are first-class values, and can be registered dynamically and associated with sessions. The collaborative editor was about 100 lines of code.</p>\n<p>There is a syntax extension to distinguish between client and server side code, and both can be written in the same service (invoking <code>js_of_ocaml</code> to compile the client code to Javascript). They have bindings to <a href=\"http://code.google.com/closure/\">Google Closure</a> in order to provide UI support. There is a really nice “bus” service to pass messages between the server and the client, with seamless integration of <a href=\"http://ocsigen.org/lwt\">Lwt</a> to hide the details of communication with the browser.</p>\n<p>Ocsigen is looking like a very mature project at this point, and I’m very keen to integrate it with <a href=\"http://www.openmirage.org\">Mirage</a> to specialise it into micro-kernels. 
A task for the hacking day tomorrow morning I think!</p>\n<h3 id=\"mirage\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#mirage\"></a>Mirage</h3>\n<p>I talked about <a href=\"http://www.openmirage.org\">Mirage</a>, hurrah! Good questions about why we need a block device (and not just use NFS), and I replied that everything is available as a library and the programmer can choose depending on their needs (the core goal of <a href=\"http://en.wikipedia.org/wiki/Exokernel\">exokernels</a>).</p>\n<p>A highlight for me was lunch where I finally met <a href=\"http://people.redhat.com/~rjones/\">Richard Jones</a>, who is one of the other OCaml and cloud hackers out there. We had a wide-ranging conversation about the cool stuff going on in <a href=\"http://www.linux-kvm.org/page/Main_Page\">KVM</a> and Red Hat in general. Richard also gave a short talk about how they use OCaml to generate hundreds of thousands of lines of code in <a href=\"http://libguestfs.org/\">libguestfs</a>. There are bindings for pretty much every major language, and it is all generated from an executable specification. He notes that “normal” programmers love the OCaml type safety without explicit annotations, and that it is a really practical language for the working programmer. The <a href=\"http://xen.org\">Xen Cloud Platform</a> also has a similar <a href=\"https://github.com/xen-org/xen-api/blob/master/ocaml/idl/datamodel.ml\">generator</a> for XenAPI bindings, so I definitely agree with him about this!</p>\n<h3 id=\"ocaml-future\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#ocaml-future\"></a>OCaml Future</h3>\n<p><a href=\"http://pauillac.inria.fr/~xleroy/\">Xavier “superstar” Leroy</a> then gave an update on OCaml development. Major new features in 3.12.0 are first-class modules, polymorphic recursion, local module opens, and richer operations over module signatures. 
Version 3.12.1 is coming out soon, with bug fixes (in camlp4 and ocamlbuild mainly), and better performance on x86_64: turns out a new <code>mov</code> instruction change improves floating point performance on <code>x86_64</code>.</p>\n<p>OCaml 3.13 has no release date, but several exciting features are in the pipeline. Firstly, more lightweight first-class modules by permitting some annotations to be inferred by the context, and it introduces patterns to match and bind first-class module values. Much more exciting is support for GADTs (Generalised Algebraic Data Types). This permits more type constraints to be enforced at compile time:</p>\n<pre><code>  type _ t =\n      | IntLit : int -&gt; int t\n      | Pair : 'a t * 'b t -&gt; ('a * 'b) t\n      | App : ('a -&gt; 'b) t * 'a t -&gt; 'b t\n      | Abs : ('a -&gt; 'b) -&gt; ('a -&gt; 'b) t\n     \n    let rec eval : type s . s t -&gt; s = function\n      | IntLit x -&gt; x (* s = int here *)\n      | Pair (x,y) -&gt; (eval x, eval y) (* s = 'a * 'b here *)\n      | App (f,a) -&gt; (eval f) (eval a)\n      | Abs f -&gt; f\n</code></pre>\n<p>In this example of a typed interpreter, the <code>eval</code> function is annotated with a <code>type s . s t -&gt; s</code> type that lets each branch of the pattern match have a constrained type for <code>s</code> depending on the use. This reminded me of Edwin Brady’s <a href=\"http://www.cs.st-andrews.ac.uk/~eb/writings/icfp10.pdf\">partial evaluation</a> work using dependent types, but a much more restricted version suitable for OCaml.</p>\n<p>There are some really interesting uses for GADTs:</p>\n<ul>\n<li>Enforcing invariants in data structures, as with the typed interpreter example above.</li>\n<li>Reflecting types into values means that libraries such as our own <a href=\"http://github.com/mirage/dyntype\">dyntype</a> can be expressed in the core language without lots of camlp4 hacks. 
Finally, this should make typed I/O generators for XML, JSON and other network formats much simpler.</li>\n</ul>\n<p>The challenges in the implementation are that principal type inference is now impossible (so some annotation is required), and pattern matching warnings are also trickier.</p>\n<p>From the IDE perspective, the third bit of work is to have the OCaml compiler save the full abstract syntax tree annotation with source locations, scoping information, types (declared and inferred) and additional user-defined annotations. This generalises the <code>-annot</code> flag and can help projects like <a href=\"http://jun.furuse.info/hacks/ocamlspotter\">OCamlSpotter</a>, <a href=\"http://ocamlwizard.lri.fr/\">OCamlWizard</a>, <a href=\"http://www.algo-prog.info/ocaide/\">OcaIDE</a>, etc. It also helps code-generators driven by type-generators (such as our <a href=\"http://github.com/mirage/orm\">SQL ORM</a> or <a href=\"http://oss.wink.com/atdgen/\">ATDgen</a>).</p>\n<p>The OCaml consortium has new members: <a href=\"http://mlstate.com\">MLState</a> and <a href=\"http://mylife.com\">MyLife</a>, and <a href=\"http://www.esterel-technologies.com/\">Esterel</a>, <a href=\"http://www.ocamlpro.com\">OCamlPro</a> and one unnamed new member are joining. The consortium goals are to sell permissive licensing (BSD) to members, and sound out new features with the serious users. Three companies are now doing commercial development (Gerd, OCamlCore, OCamlPro) which is growing the community nicely.</p>\n<h3 id=\"jocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#jocaml\"></a>JoCaml</h3>\n<p><a href=\"http://pauillac.inria.fr/~maranget/\">Luc Maranget</a> (who looks like an archetypal mad professor!) gave a great rundown on <a href=\"http://jocaml.inria.fr/\">JoCaml</a>, a distributed programming extension to OCaml. 
This extends the compiler with join-definitions (a compiler patch), and a small bit of runtime support (using Thread), and significant extensions for concurrent and distributed programming in a type-safe way.</p>\n<p>It extends the syntax with three new keywords: <code>def</code>, <code>spawn</code> and <code>reply</code>, and new usage for <code>or</code> and <code>&amp;</code> (you should be using <code>||</code> and <code>&amp;&amp;</code> anyway). Binary libraries remain compatible between matching versions of JoCaml and OCaml. An example of JoCaml code is:</p>\n<pre><code>  let create n =\n      def st(rem) &amp; tick() = st(rem-1)\n      or st(0) &amp; wait() = reply to wait in\n      spawn st(n) ; { tick=tick; wait=wait; }\n    \n    type t = {\n      tick: unit Join.chan;\n      wait: unit -&gt; unit;\n    }\n</code></pre>\n<p>After <code>n</code> messages to <code>tick</code>, the <code>wait</code> barrier function will be called.</p>\n<pre><code>  let c = create n\n    let () =\n      for k = 0 to 9 do\n       spawn begin printf &quot;%i&quot; k; c.tick () end\n      done;\n      c.wait ()\n</code></pre>\n<p>Here we asynchronously print the numbers <code>0</code> to <code>9</code>, and then the <code>wait</code> call acts as a barrier until they have all finished. JoCaml is useful for distributed fork-join parallelism tasks such as raytracing, but with the type system support of OCaml. It is a bit like MapReduce, but without the data partitioning support of Hadoop (and is more light-weight). 
It would be quite interesting to combine some of the JoCaml extensions with the dynamic dataflow graphs in our own <a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/\">CIEL</a> distributed execution engine.</p>\n<h3 id=\"forgetful-memoisation-in-ocaml\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#forgetful-memoisation-in-ocaml\"></a>Forgetful Memoisation in OCaml</h3>\n<p><a href=\"http://www.lri.fr/~bobot/\">Francois Bobot</a> talked about the problem of memoising values so that they can be re-used (e.g. in a cache). Consider a standard memoiser:</p>\n<pre><code>  let memo_f =\n      let cache = H.create () in\n      fun k -&gt;\n        try H.find cache k\n        with Not_found -&gt;\n          let v = f k in\n          H.add cache k v;\n          v\n    \n    let v1 = memo_f k1\n    let v2 = memo_f k2 (* k2 = k1 in O(1) *)\n</code></pre>\n<p>If a key is not reachable from anywhere other than the cache, we want to eliminate it from the cache also. The first solution is a normal hashtable, but this results in an obvious memory leak since a key held in the cache marks it as reachable. A better solution is using OCaml <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/libref/Weak.html\">weak pointers</a> that permit references to values without holding on to them (see <a href=\"http://www.pps.jussieu.fr/~li/software/weaktbl/doc/html/Weaktbl.html\">Weaktbl</a> by <a href=\"http://www.pps.jussieu.fr/~li/\">Zheng Li</a> who is now an OCaml hacker at Citrix). The problem with Weaktbl is that if the value points to the key, this forms a cycle which will never be reclaimed.</p>\n<p>Francois solves this by using <a href=\"http://en.wikipedia.org/wiki/Ephemeron\">Ephemerons</a> from Smalltalk.  
They use the rule that the value can be reclaimed if the key or the ephemeron itself can be reclaimed by the GC, and have a signature like:</p>\n<pre><code>  module Ephemeron : sig type ('a,'b) t\n      val create : 'a -&gt; 'b -&gt; ('a,'b) t\n      val check : ('a,'b) t -&gt; bool\n      val get : ('a,'b) t -&gt; 'b option\n      val get_key : ('a,'b) t -&gt; 'a option\n    end\n</code></pre>\n<p>The implementation in OCaml patches the runtime to use a new tag for ephemerons, and the performance graphs in his <a href=\"https://forge.ocamlcore.org/docman/view.php/77/134/memoization2011.pdf\">slides</a> look good. This is an interesting topic for me since we need efficient memoisation in Mirage I/O (see the effects on DNS performance in the <a href=\"/papers/2007-eurosys-melange.pdf\">Eurosys paper</a> which used Weaktbl). When asked if the OCaml patch will be upstreamed, <a href=\"http://gallium.inria.fr/~doligez/\">Damien Doligez</a> did not like the worst-case complexity of long chains of ephemerons in the GC, and there are several approaches under consideration to alleviate this without too many changes to the runtime, but Francois believes the current complexity is not too bad in practice.</p>\n<h3 id=\"oasis-and-website\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#oasis-and-website\"></a>Oasis and website</h3>\n<p><a href=\"http://sylvain.le-gall.net/\">Sylvain</a> came on stage later to give a demonstration of <a href=\"http://oasis.forge.ocamlcore.org/oasis-db.html\">OASIS</a>, an equivalent of <a href=\"http://www.haskell.org/cabal/\">Cabal</a> for Haskell or <a href=\"http://www.cpan.org/\">CPAN</a> for Perl. It works with a small <code>_oasis</code> file that describes the project, and then the OASIS tool auto-generates <code>ocamlbuild</code> files from it (this reminds me of Perl’s <a href=\"http://perldoc.perl.org/ExtUtils/MakeMaker.html\">MakeMaker</a>). 
Once the files are auto-generated, the project is self-contained and there is no further dependency on OASIS itself.</p>\n<ul>\n<li>Gallery\n<img src=\"/images/ocaml-users-1.webp\" alt=\"%r\" title=\"How many OCaml hackers does it take to change a lightbulb?\" >\n<img src=\"/images/ocaml-users-3.webp\" alt=\"%r\" title=\"Wearing bibs at French Teppanyaki\" >\n<img src=\"/images/ocaml-users-2.webp\" alt=\"%r\" title=\"Team Mirage cheeses it up\" ></li>\n</ul>\n<p>OASIS either works with an existing build system in a project, or can be integrated more closely with <code>ocamlbuild</code> by advanced users. Lots of projects are already using OASIS (from Cryptokit to Lwt to the huge <a href=\"http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=641\">Jane Street Core</a>). He is also working on a distribution mechanism on a central website, which should make for convenient OCaml packaging when it is finished and gets more adoption from the community.</p>\n<p>Finally, <a href=\"http://ashishagarwal.org/\">Ashish Agarwal</a> led a discussion on how OCaml can improve its web presence for beginners. Lots of good ideas here (some of which we implemented when reworking the <a href=\"http://cufp.org\">CUFP</a> website last year). Looking forward to seeing what happens next year in this space! I really enjoyed the day; the quality of talks was very high, and there were many engaging discussions from all involved!</p>\n<p><img src=\"/images/sf-ocaml.webp\" alt=\"%c\" ></p>\n<p>Of course, not all of the OCaml community action is in France. The ever-social <a href=\"http://www.twitter.com/jakedonham\">Jake Donham</a> organised the First Ever San Francisco User Group that I attended when I was over there a few weeks ago. 
Ok, admittedly it was mainly French people there too, but it was excellent to meet up with <a href=\"http://www.linkedin.com/pub/mika-illouz/0/a02/7b4\">Mika</a>, <a href=\"http://martin.jambon.free.fr/\">Martin</a>, <a href=\"http://www.linkedin.com/pub/julien-verlaguet/20/10a/b57\">Julien</a>, <a href=\"http://fr.linkedin.com/in/henribinsztok\">Henri</a> and of course Jake when over there.</p>\n<p>We should definitely have more of these fun local meetups, and a number of other OCaml hackers I mentioned it to want to attend next time in the Bay Area, if only to cry into their drinks about the state of multi-core... <em>just kidding</em>, <a href=\"http://www.ocamlpro.com\">OCamlPro</a> is hard at work fixing that after all :-)</p>",
      "url": "https://anil.recoil.org/notes/ocaml-users-group",
      "title": "Camel Spotting in Paris",
      "summary": "Report from the 2011 OCaml Users Meeting in Paris, covering talks on js_of_ocaml, OCaml on PIC, and more.",
      "date_published": "2011-04-15T00:00:00.000000Z",
      "date_modified": "2011-04-15T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocamllabs"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2011-nsdi-ciel-1",
      "content_html": "<p>Paper on CIEL, a distributed dataflow engine, at USENIX NSDI 2011. This work led by <a href=\"https://research.google/people/derekmurray/?&amp;type=google\">Derek G Murray</a> introduced a universal execution engine that, unlike previous systems, enables data-dependent control-flow decisions for computing iterative and recursive algorithms. We also developed Skywriting, a Turing-complete scripting language that runs directly on CIEL with transparent fault tolerance and distribution. The system was deployed on cloud platforms and demonstrated scalable performance for both iterative and non-iterative algorithms, addressing a key limitation of earlier MapReduce-style systems.</p>",
      "url": "https://anil.recoil.org/notes/2011-nsdi-ciel-1",
      "title": "CIEL: A universal execution engine for distributed data-flow computing",
      "summary": "Paper on CIEL universal execution engine for distributed data-flow computing using dynamic task graphs at USENIX NSDI 2011.",
      "date_published": "2011-03-01T00:00:00.000000Z",
      "date_modified": "2011-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "distributed",
        "dataflow",
        "cloud",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2011-nsdi-ciel.pdf",
          "mime_type": "application/pdf",
          "title": "CIEL: A universal execution engine for distributed data-flow computing"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2011-icdcn-droplets-1",
      "content_html": "<p>Paper on a vision for a semi-federated cloud for personal data at ICDCN. Working with <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>, <a href=\"https://github.com/mor1\">Richard Mortier</a>, <a href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a>, and Theodore Hong, we proposed the Droplets architecture as a middle ground between fully centralized cloud services and completely decentralized systems. The work contrasts these two extremes and shows how Droplets enables controlled trade-offs between the costs and benefits of each approach. We demonstrated three sample applications that substantially benefit from this flexibility, providing a more nuanced vision of cloud computing than the prevailing centralized model.</p>",
      "url": "https://anil.recoil.org/notes/2011-icdcn-droplets-1",
      "title": "Unclouded vision",
      "summary": "Paper on Droplets architecture enabling controlled trade-offs between centralized cloud and fully decentralized personal data approaches at ICDCN.",
      "date_published": "2011-01-01T00:00:00.000000Z",
      "date_modified": "2011-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "cloud",
        "personal-data",
        "privacy",
        "distributed",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2011-icdcn-droplets.pdf",
          "mime_type": "application/pdf",
          "title": "Unclouded vision"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/43ab3ae0-9ffc-474f-aa02-3cc1139f54d1-1",
      "content_html": "<p>Talk on building the Xen toolstack using OCaml at ICFP 2010. This was one of the early talks that helped establish OCaml as a viable language for systems programming. We shared our experiences rewriting parts of the Xen hypervisor toolstack in OCaml, demonstrating that functional programming could deliver the performance, reliability, and safety needed for critical infrastructure software. The work laid the groundwork for MirageOS and showed that you could build high-performance systems software with strong type safety. It was exciting to show the functional programming community that OCaml wasn't just for compilers and theorem provers - it could handle real systems work.</p>",
      "url": "https://anil.recoil.org/notes/43ab3ae0-9ffc-474f-aa02-3cc1139f54d1-1",
      "title": "Building the Xen toolstack using OCaml",
      "summary": "Talk on building the Xen toolstack using OCaml.",
      "date_published": "2010-11-05T00:00:00.000000Z",
      "date_modified": "2010-11-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "xen",
        "virtualization",
        "systems"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/be2f049b-174a-4e5b-b30e-0319793487c7-1",
      "content_html": "<p>At LinkedIn giving tech talk about Mirage, presenting our new multi-scale operating system approach for clouds. This early talk in California introduced the concept of compiling functional code into specialized kernels for cloud deployment. I discussed how MirageOS could address the bloat and security issues of traditional OS stacks by building minimal, purpose-built systems from libraries written in a high-level language.</p>",
      "url": "https://anil.recoil.org/notes/be2f049b-174a-4e5b-b30e-0319793487c7-1",
      "title": "Mirage: A New Multi-Scale Operating System for Clouds and Crowds (2014)",
      "summary": "Tech talk at LinkedIn presenting Mirage operating system for cloud computing.",
      "date_published": "2010-10-25T00:00:00.000000Z",
      "date_modified": "2010-10-25T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "ocaml",
        "cloud",
        "unikernels",
        "systems"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/mirage-self-hosting",
      "content_html": "<p>I managed to get early <a href=\"https://mirageos.org\">MirageOS</a> suitably feature-complete enough that we could run the Mirage website using Mirage. This was all very satisfying after hacking on the <a href=\"https://github.com/mirage/mirage-tcpip\">TCP/IP</a> stack for ages.</p>",
      "url": "https://anil.recoil.org/notes/mirage-self-hosting",
      "external_url": "https://mirageos.org/blog/self-hosting-mirage-website",
      "title": "Self-hosting MirageOS website",
      "summary": "Running the MirageOS website on its own self-hosted infrastructure using the TCP/IP stack.",
      "date_published": "2010-10-11T00:00:00.000000Z",
      "date_modified": "2010-10-11T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "unikernels",
        "ocaml",
        "mirageos",
        "selfhosting"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/de10-perscon-1",
      "content_html": "<p>Paper on personal containers for data management at the UK Digital Economy meeting, exploring how individuals could better manage their personal data. Co-authored with Richard Mortier, Jon Crowcroft, and others, this work introduced the concept of personal containers - secure environments for storing and processing personal information. This research laid the groundwork for later projects like Dataware and Databox, addressing growing concerns about privacy and data ownership in ubiquitous computing.</p>",
      "url": "https://anil.recoil.org/notes/de10-perscon-1",
      "title": "The personal container, or your life in bits",
      "summary": "Paper on personal containers for managing personal data presented at UK Digital Economy meeting.",
      "date_published": "2010-10-01T00:00:00.000000Z",
      "date_modified": "2010-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "privacy",
        "personal-data",
        "data-management",
        "ubicomp"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/de10-perscon.pdf",
          "mime_type": "application/pdf",
          "title": "The personal container, or your life in bits"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2010-icfp-xen-1",
      "content_html": "<p>Paper on our experiences with writing the Xen control stack in OCaml at ICFP 2010. Working with <a href=\"https://dave.recoil.org\">Dave Scott</a>, <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, and <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>, we presented the perspectives and perceptions from using functional programming within an industrial product group at Citrix. The paper discusses the practical realities of deploying OCaml in a commercial virtualization product, covering both the technical benefits of the functional approach and the organizational challenges of introducing FP into an established engineering team. This became an influential case study for FP adoption in industry.</p><h1>References</h1><ul><li>Scott et al (2010). Using functional programming within an industrial product group: perspectives and perceptions. ACM. <a href=\"https://doi.org/10.1145/1863543.1863557\" target=\"_blank\"><i>10.1145/1863543.1863557</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2010-icfp-xen-1",
      "title": "Using functional programming within an industrial product group: perspectives and perceptions",
      "summary": "Paper on experiences writing the Xen control stack in OCaml presented at ICFP 2010.",
      "date_published": "2010-09-01T00:00:00.000000Z",
      "date_modified": "2010-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "xen",
        "fp",
        "systems",
        "icfp"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2010-icfp-xen.pdf",
          "mime_type": "application/pdf",
          "title": "Using functional programming within an industrial product group: perspectives and perceptions"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/1863543.1863557",
          "doi": "10.1145/1863543.1863557",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/4957325f-d7f5-4a29-95b6-a1e1f61ea5cf-1",
      "content_html": "<p>At HotCloud for the first talk about MirageOS, presenting our vision for software specialization in the cloud. This was the first public outing of the MirageOS concept at USENIX HotCloud 2010 in Boston. The paper proposed running applications directly on cloud hardware without traditional OS layers, using OCaml to generate specialized binaries that execute as Xen guest operating systems. Our early prototype showed significant performance improvements for I/O and memory handling compared to the same code running under Linux/Xen.</p>",
      "url": "https://anil.recoil.org/notes/4957325f-d7f5-4a29-95b6-a1e1f61ea5cf-1",
      "title": "Turning Down the LAMP: Software Specialisation for the Cloud",
      "summary": "First HotCloud talk presenting the initial MirageOS concept for cloud software specialization.",
      "date_published": "2010-06-22T00:00:00.000000Z",
      "date_modified": "2010-06-22T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "cloud",
        "unikernels",
        "systems",
        "specialization"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2010-hotcloud-lamp-1",
      "content_html": "<p>Workshop paper on the early MirageOS architecture and evaluation at HotCloud 2010. This paper presented our vision of &quot;turning down the LAMP&quot; stack by running applications directly on cloud infrastructure as specialized unikernels. Working with <a href=\"https://github.com/mor1\">Richard Mortier</a>, <a href=\"mailto:ripduman.sohan@gmail.com\">Ripduman Sohan</a>, <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>, and others, we showed how Mirage could compile OCaml applications into Xen guest operating systems, achieving significant performance improvements for I/O and memory handling compared to traditional LAMP stacks. The work demonstrated that cloud computing offered an unprecedented opportunity to rethink application architecture from the ground up.</p>",
      "url": "https://anil.recoil.org/notes/2010-hotcloud-lamp-1",
      "title": "Turning Down the LAMP: Software Specialisation for the Cloud",
      "summary": "Workshop paper presenting early MirageOS architecture for running OCaml applications directly on cloud platforms as unikernels.",
      "date_published": "2010-06-01T00:00:00.000000Z",
      "date_modified": "2010-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mirageos",
        "unikernels",
        "cloud",
        "ocaml",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2010-hotcloud-lamp.pdf",
          "mime_type": "application/pdf",
          "title": "Turning Down the LAMP: Software Specialisation for the Cloud"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/anil-phd-thesis-2",
      "content_html": "<p>My PhD thesis is now also published as a print book, making the work on type-safe network application development more accessible. The thesis demonstrated how functional programming languages with strong type systems could be used to build network protocols and applications with correctness guarantees. These ideas about using types for security and verification continue to influence my work on systems programming with OCaml.</p>",
      "url": "https://anil.recoil.org/notes/anil-phd-thesis-2",
      "title": "Creating high-performance, statically type-safe network applications",
      "summary": "PhD thesis on type-safe network application development now published as print book.",
      "date_published": "2010-05-01T00:00:00.000000Z",
      "date_modified": "2010-05-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "fp",
        "networking",
        "security",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/anil-phd-thesis.pdf",
          "mime_type": "application/pdf",
          "title": "Creating High-Performance, Statically Type-Safe Network Applications"
        }
      ],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/4b3zw-zen8",
      "content_html": "<p>The App Engine data collector for Personal Containers is coming on nicely, and is on track for an alpha preview release <a href=\"http://github.com/avsm/perscon/blob/master/README.md\">fairly soon</a>. Working with AppEngine has been interesting; it’s got excellent availability and you can’t beat the price (free), but coding robust Python that doesn’t trip over the tight resource limits for individual requests, asynchronous tasks and queries is tricky. While it is good for small records such as my <a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhone/\">iPhone</a> or Find My iPhone <a href=\"http://github.com/avsm/perscon/blob/master/appengine/perscon/drivers/fmi.py\">GPS traces</a> traces, it doesn’t work so well with my gigabytes of photographs or decades of e-mail.</p>\n<p>This confirmed our earlier intuition that there is no one perfect solution for personal data handling; instead, we need to <em>embrace diversity</em> and construct an infrastructure that can cope with change over the coming decades. Mobile programming has changed beyond recognition in just a few years, and cloud providers are specialising in different ways (e.g. <a href=\"http://www.picloud.com/\">PiCloud</a> for simple compute, or <a href=\"http://aws.amazon.com\">EC2</a> for fancy services like elastic <a href=\"http://aws.amazon.com/elasticloadbalancing/\">load balancing</a>).</p>\n<p>So to recognise this, we are building components that all interoperate with your personal data, keep it secure, and ensure it persists for more than a few years. <a href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a> came up with the term &quot;digital <a href=\"http://en.wikipedia.org/wiki/Yurt\">yurts</a>&quot;, and it's stuck. 
We’ve written a <a href=\"http://perscon.net/papers/digital-yurts-draft1.pdf\">draft paper</a> about it, and would love to hear your comments and feedback on the approach.</p>\n<p><img src=\"/images/nomads-diagram.webp\" alt=\"\" ></p>\n<p>There are some interesting recent trends that make doing this\nparticularly important:</p>\n<ul>\n<li>The New York Times wrote about the <a href=\"http://www.nytimes.com/2010/05/02/magazine/02self-measurement-t.html\">data-driven\nlife</a>\nincreasingly influencing our decision making. Current sensor data\nsuch as GPS traces are just harbingers of the privacy disaster\nthat would be information such as heart rates or your consumption\nhabits getting into the public domain. <em>(link via <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek\nMurray</a>)</em>.</li>\n<li>Facebook has announced a brand new API platform to get access to\nyour information. The <a href=\"http://eff.org\">EFF</a> has a fantastic timeline\nof <a href=\"http://www.eff.org/deeplinks/2010/04/facebook-timeline\">Facebook’s Eroding\nPrivacy</a>\nover the last five years, to demonstrate how unsafe it is to trust\nyour data to any third party. We’ve started developing an\ninformation dump plugin for Facebook, but the API just changed\nmid-way and so it has to be started again (volunteers welcome!).</li>\n<li>In the UK, the <a href=\"http://en.wikipedia.org/wiki/Digital_Economy_Act_2010\">Digital Economy\nAct</a> is an\nextremely controversial act that makes anonymity and privacy all the\nmore important. We’re assembling an open-source <a href=\"http://www.scribd.com/doc/28393106/Using-Dust-Clouds-to-Enhance-Anonymous-Communication\">dust\ncloud</a>\nthat integrates Tor into personal containers to automatically grant\nyou anonymity as you communicate with your friends.</li>\n</ul>\n<p>If you’re interested, join our <a href=\"http://perscon.net/contact.html\">group</a>\nor contact <a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a> directly. 
At this stage, you\nneed desire and the ability to hack code, but things are settling down\nover the next few months...</p>",
      "url": "https://anil.recoil.org/notes/yurts-for-digital-nomads",
      "external_url": "https://web.archive.org/web/20110315011341/http://perscon.net/2010/04/29/yurts-for-digital-nomads.html",
      "title": "Yurts for Digital Nomads",
      "summary": "Digital nomads can secure their personal data with digital yurts, a system for diverse data handling and storage.",
      "date_published": "2010-04-29T00:00:00.000000Z",
      "date_modified": "2010-04-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "perscon",
        "ocaml",
        "cloud",
        "selfhosting"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/uiprototype",
      "content_html": "<p>We’ve been <a href=\"http://github.com/avsm/perscon\">hacking</a> away on fleshing out the <a href=\"http://code.google.com/appengine\">App Engine</a> node for personal containers. We’re building this node first because, crucially, deploying an App Engine VM is free to anyone with a Google account.  The service itself is limited since you can only respond to HTTP or XMPP requests and do HTTP fetches, and so its primary use is as an always-on data collection service with a webmail-style UI written using <a href=\"http://www.extjs.com/\">extjs</a>.</p>\n<p>Personal containers gather data from a wide variety of sources, and normalise them into a format which understands people (address book entries, with a set of services such as e-mail, phone, IM and online IDs), places (GPS, WOEID), media (photos, movies) and messages (Tweets, emails, Facebook messages). I’ll post more about the data model behind personal containers in a follow-up as the format settles.</p>\n<p><img src=\"/images/perscon-extjs.webp\" alt=\"%c\" ></p>\n<p>The App Engine node has a number of plugins to gather data and aggregate them into a single view (see screenshot). Plugins include:</p>\n<ul>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhoto/\">iPhoto</a> extracts location (via EXIF), people present (associated via <a href=\"http://gizmodo.com/5141741/what-to-know-about-iphoto-09-face-detection-and-recognition\">faces</a>), and of course, the actual photograph.</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/Adium/\">Adium</a> logs all IMs into a threaded chat view.</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhone/\">iPhone</a> uses the backup files on a Mac to extract SMS messages, phone call records (and it could also get photographs and browsing history, although it currently doesn’t). 
An AppEngine tracker can also use <a href=\"http://www.apple.com/mobileme/features/find-my-iphone.html\">FindMyIPhone</a> to poll your iPhone regularly to keep track of your location without publishing it to Google or Yahoo (and hopefully in iPhone 4.0, we can operate as a background service at last!).</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/appengine/twitter.py\">Twitter</a> runs directly on AppEngine (authenticated via OAuth) and synchronizes with a Twitter feed.</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/MacOS-SyncServices/\">SyncServices</a> hooks into the MacOS X <a href=\"http://developer.apple.com/macosx/syncservices.html\">sync framework</a> and initially subscribes to Address Book updates. This seems to be the first open-source sync alternative to the expensive Mobile Me, as far as I can tell. I’m planning to expand this to also subscribe to the full set of sync information (e.g. calendars).</li>\n</ul>\n<p>I'm switching tacks briefly; we received an <a href=\"http://aws.amazon.com/education/aws-in-education-research-grants/\">Amazon Research Grant</a> recently and I’m building a node that runs as a Linux server to act as a longer-term archival and search server. This is being written in OCaml and uses <a href=\"http://1978th.net/tokyocabinet/\">Tokyo Cabinet</a> (with Jake Donham’s excellent <a href=\"http://github.com/jaked/otoky\">bindings</a>) and so should be speedy and a useful alternative implementation of the HTTP REST interface. 
The plan is to automatically synchronize meta-data across all the nodes of a personal container, but store large and historical data away from expensive cloud storage such as App Engine.</p>\n<p>There are lots more plugins in development, such as <a href=\"http://foursquare.com\">Foursquare</a> and <a href=\"http://gowalla.com\">Gowalla</a> OAuth collectors, an <a href=\"http://github.com/avsm/perscon/tree/master/android\">Android</a> mobile application to upload location and contacts information, and Google GData synchronization. If you’re interested in one of these or something else, please do <a href=\"http://perscon.net/contact.html\">get in touch</a> or just fork the <a href=\"http://github.com/avsm/perscon\">project</a> and start hacking!</p>",
      "url": "https://anil.recoil.org/notes/uiprototype",
      "external_url": "https://web.archive.org/web/20110313101153/http://perscon.net/2010/04/15/uiprototype.html",
      "title": "Pulling together a user interface",
      "summary": "Building a user interface for personal containers on App Engine with extjs.",
      "date_published": "2010-04-15T00:00:00.000000Z",
      "date_modified": "2010-04-15T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting",
        "cloud",
        "ui",
        "perscon"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2010-bcs-visions-1",
      "content_html": "<p>Paper on our vision for multiscale programming at the BCS Visions 2010 conference. Working with <a href=\"https://github.com/mor1\">Richard Mortier</a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>, and <a href=\"https://research.google/people/steven-hand/\">Steven Hand</a>, we presented a clean-slate approach to heterogeneous cloud computing that sweeps away decades of system software cruft. The paper argues that the move to virtual clouds creates an opportunity to build a unified platform spanning from cloud servers through desktops to mobile smartphones - this became the foundation for the Mirage framework. We demonstrated how this multiscale approach could deliver significant benefits in security, reliability and efficiency across the entire computing spectrum.</p>",
      "url": "https://anil.recoil.org/notes/2010-bcs-visions-1",
      "title": "Multiscale not multicore: efficient heterogeneous cloud computing",
      "summary": "Paper on vision for multiscale programming presented at BCS Visions 2010 conference.",
      "date_published": "2010-04-01T00:00:00.000000Z",
      "date_modified": "2010-04-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "cloud",
        "heterogeneous",
        "systems",
        "programming"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2010-bcs-visions.pdf",
          "mime_type": "application/pdf",
          "title": "Multiscale not multicore: efficient heterogeneous cloud computing"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/opening-a-website",
      "content_html": "<p>We've been working away at building a new type of database to help individuals\nkeep a rein on their ever-increasing personal digital information. The first\nprototypes run freely on <a href=\"https://web.archive.org/web/20110509135538/http://code.google.com/appengine\">Google App Engine</a> to gather your data\nbehind-the-scenes, and we are working on more advanced versions that run on\nembedded devices and the cloud.</p>\n<p>If you’re interested in keeping track of your personal data, you can start off\nwith the <a href=\"https://web.archive.org/web/20110509135538/http://perscon.net/install.html\">installation</a> instructions to clone your own version. After that, read\nup on the <a href=\"https://web.archive.org/web/20110509135538/http://perscon.net/design.html\">design</a> of the system (which is still changing as we research new\nideas around it). When you find something you want to fix, or add a new plugin\ndata source, just clone the <a href=\"https://github.com/avsm/perscon\">code</a> and send us back fixes!</p>",
      "url": "https://anil.recoil.org/notes/opening-a-website",
      "external_url": "https://web.archive.org/web/20110509135538/http://perscon.net/2010/03/29/intro.html",
      "title": "Opening a website",
      "summary": "Learn about opening a personal data tracking website with a customizable database system.",
      "date_published": "2010-03-29T00:00:00.000000Z",
      "date_modified": "2010-03-29T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "perscon",
        "ui",
        "selfhosting"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2010-smarte-privacybutler-1",
      "content_html": "<p>Paper on privacy butler services for more private data management. This work with Ryan Wishart, Domenico Corapi, and Morris Sloman proposed an automated Privacy Butler service that monitors a person's online presence and makes corrections based on their privacy policies. The system addresses the challenge of third-parties modifying online content - such as tagging photos or posting comments - without the consent of the person being referenced. We demonstrated how policy-driven automation could help people maintain control over their digital footprint across social networks and online communities.</p><h1>References</h1><ul><li>Wishart et al (2010). Privacy Butler: A Personal Privacy Rights Manager for Online Presence. IEEE. <a href=\"https://doi.org/10.1109/PERCOMW.2010.5470519\" target=\"_blank\"><i>10.1109/PERCOMW.2010.5470519</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2010-smarte-privacybutler-1",
      "title": "Privacy Butler: A Personal Privacy Rights Manager for Online Presence",
      "summary": "Paper on privacy butler services for more private data management.",
      "date_published": "2010-03-01T00:00:00.000000Z",
      "date_modified": "2010-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "privacy",
        "security",
        "ubicomp",
        "data-management"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2010-smarte-privacybutler.pdf",
          "mime_type": "application/pdf",
          "title": "Privacy Butler: A Personal Privacy Rights Manager for Online Presence"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/PERCOMW.2010.5470519",
          "doi": "10.1109/PERCOMW.2010.5470519",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/anil-phd-thesis-1",
      "content_html": "<p>PhD thesis now available as a technical report, documenting my work on creating high-performance, statically type-safe network applications. The thesis explored how ML-style type systems and model checking could be combined to build secure network services. This foundational work on using functional programming for systems development would later influence the design of MirageOS and our approach to building secure, verified network stacks.</p>",
      "url": "https://anil.recoil.org/notes/anil-phd-thesis-1",
      "title": "Creating high-performance, statically type-safe network applications",
      "summary": "PhD thesis published as technical report on building secure network applications using ML type systems and model checking.",
      "date_published": "2010-03-01T00:00:00.000000Z",
      "date_modified": "2010-03-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ocaml",
        "fp",
        "networking",
        "security",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/anil-phd-thesis.pdf",
          "mime_type": "application/pdf",
          "title": "Creating High-Performance, Statically Type-Safe Network Applications"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2009-icfem-spl-1",
      "content_html": "<p>Paper on a DSL for specifying temporal protocol automata at ICFEM 2009. The Statecall Policy Language (SPL) provides a practical middle ground between ad-hoc coding and full formal verification for complex protocol implementations. SPL lets programmers embed automata directly in their code which can be both statically model-checked using SPIN and dynamically enforced at runtime with minimal performance overhead. I demonstrated the approach with an SSH server written entirely in OCaml/SPL, showing how the automata provide higher-level debugging capabilities while maintaining the benefits of formal verification.</p><h1>References</h1><ul><li>Madhavapeddy (2009). Combining Static Model Checking with Dynamic Enforcement Using the Statecall Policy Language. Springer. <a href=\"https://doi.org/10.1007/978-3-642-10373-5_23\" target=\"_blank\"><i>10.1007/978-3-642-10373-5_23</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2009-icfem-spl-1",
      "title": "Combining Static Model Checking with Dynamic Enforcement Using the Statecall Policy Language",
      "summary": "Paper on DSL for specifying temporal protocol automata presented at ICFEM 2009.",
      "date_published": "2009-11-01T00:00:00.000000Z",
      "date_modified": "2009-11-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "formal-methods",
        "model-checking",
        "dsl",
        "protocols"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2009-icfem-spl.pdf",
          "mime_type": "application/pdf",
          "title": "Combining Static Model Checking with Dynamic Enforcement Using the Statecall Policy Language"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/978-3-642-10373-5_23",
          "doi": "10.1007/978-3-642-10373-5_23",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://doi.org/10.59350/r9b4r-1fs97",
      "content_html": "<p>Well, the big launch of <a href=\"http://www.xenserver5.com/\">XenServer 5</a> has gone smoothly, and with it have arrived a flood of questions about how exactly the new <a href=\"https://web.archive.org/web/20081121042533/https://xenserver5.com/ha.php\">High Availability</a> functionality works.  I’ll use this post to explain the overall architecture of HA in XenServer 5, and also how some of the fault detection and failure planning works.</p>\n<p>Fundamentally, HA is about making sure important VMs are always running on a resource pool. There are two aspects to this: reliably <strong>detecting host failure</strong>, and computing a <strong>failure plan</strong> to deal with swift recovery.</p>\n<p>Detecting host failure reliably is difficult since you need to remotely distinguish between a host disappearing for a while versus exploding in a ball of flames.  If we mistakenly decide that a master host has broken down and elect a new master in its place, there may be unpredictable results if the original host were to make a comeback!   Similarly, if there is a network issue and a resource pool splits into two equal halves, we need to ensure that only one half accesses the shared storage and not both simultaneously.</p>\n<h2 id=\"heartbeating-for-availability\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#heartbeating-for-availability\"></a>Heartbeating for availability</h2>\n<p><img src=\"/images/ha-wizard-3b.webp\" alt=\"%c\" ></p>\n<p>We solve all these problems in XenServer by having two mechanisms: a <strong>storage heartbeat</strong> and a <strong>network heartbeat</strong>. When you enable HA in a pool, you must nominate an iSCSI or FC storage repository to be the heartbeat SR. XenServer automatically creates a couple of small virtual disks in this SR. The first disk is used by every physical host in the resource pool as a <strong>shared quorum disk</strong>. 
Each host allocates itself a unique block in the shared disk and regularly writes to the block to indicate that it is alive.</p>\n<p>I asked <a href=\"https://dave.recoil.org\">Dave Scott</a>, the principal engineer behind HA about the startup process:</p>\n<blockquote>\n<p>When HA starts up, all hosts exchange data over both network and\nstorage channels, indicating which hosts <em>they</em> can see over both\nchannels; i.e. which I/O paths are working and which are not.  This\nliveness information is exchanged until a fixed point is reached and\nall of the hosts are satisfied that they are in agreement about what\nthey can see.  When this happens, the HA functionality is ‘armed’ and\nthe pool is protected.</p>\n</blockquote>\n<blockquote>\n<p>This HA arming process can take a few minutes to settle for larger\npools, but is only required when HA is first enabled.</p>\n</blockquote>\n<blockquote>\n<p>Once HA is active, each host regularly writes storage updates to the\nheartbeat virtual disk, and network packets over the management\ninterface.  It is vital to ensure that network adapters are\n<a href=\"http://docs.xensource.com/XenServer/5.0.0/1.0/en_gb/reference.html#networking-standalone_host_config-bonds\">bonded</a>\nfor resilience, and that storage interfaces are using <a href=\"http://docs.xensource.com/XenServer/5.0.0/1.0/en_gb/reference.html#id2557754\">dynamic\nmultipathing</a>\nwhere supported.  This will ensure that any single adapter or wiring\nfailures do not result in any availability issues.</p>\n</blockquote>\n<p><img src=\"/images/ha-wizard-5.webp\" alt=\"%c\" >\n<img src=\"/images/ha-wizard-5-1.webp\" alt=\"%c\" ></p>\n<p>The worst-case scenario for HA is the situation where a host is thought to be off-line but is actually still writing to the shared storage, since this can result in corruption of persistent data.  
In order to prevent this situation without requiring active power strip control, we implemented <strong>hypervisor-level fencing</strong>.  This is a Xen modification which will hard-power the host off at a very low level if it doesn’t hear regularly from a watchdog process running in the control domain.  Since it is implemented at a very low level, this also covers the case where the control domain becomes unresponsive for any reason.</p>\n<p>Hosts will self-fence (i.e. power off and restart) in the event of any heartbeat failure unless either of the following holds true:</p>\n<ul>\n<li>The storage heartbeat is present for all hosts but the network has\npartitioned (so that there are now two groups of hosts).  In this\ncase, all of the hosts which are members of the larger network\npartition stay running, and the hosts in the smaller network\npartition self-fence.  The assumption here is that the network\noutage has isolated the VMs, and they ought to be restarted on a\nhost with working networking.  If the network partitions are exactly\nthe same size, then only one of them will self-fence according to a\nstable selection function.</li>\n<li>If the storage heartbeat goes away but the network heartbeat\nremains, then the hosts check to see if they can see all other hosts\nover the network.  If this condition holds true, then the hosts\nremain running on the assumption that the storage heartbeat server\nhas gone away.  
This doesn’t compromise VM safety, but any network\nglitches will result in fencing since that would mean both\nheartbeats have disappeared.</li>\n</ul>\n<h2 id=\"planning-for-failure\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#planning-for-failure\"></a>Planning for failure</h2>\n<p>The heartbeat system gives us reliable notification of host failure, and so we move onto the second step of HA: capacity planning for failure.</p>\n<p>A resource pool consists of several physical hosts (say, 16), each with potentially different amounts of host memory and a different number of running VMs.  In order to ensure that no single host failure will result in the VMs on that host being unrestartable (e.g. due to insufficient memory on any other host), the XenServer pool dynamically computes a <strong>failure plan</strong> which calculates the actions that would be taken on any host failure.</p>\n<p>But there’s one more complexity... a single host failure plan does not cover more advanced cases such as network partitions which take out entire groups of hosts.  It would be very useful to be able to create a plan that could tolerate more than a single host failure, so that administrators could ignore the first host failure and be safe in the knowledge that (for example) three more hosts could fail before the pool runs out of spare capacity.</p>\n<p>That’s exactly what we do in XenServer... the resource pool <em>dynamically</em> computes a failure plan which considers the “number of host failures to tolerate” (or <em>nhtol</em>).  This represents the number of disposable servers in a pool for a given set of protected VMs.</p>\n<p>The planning algorithms are pretty complex, since doing a brute force search of all possible failures across all hosts across all VMs is an exponential problem.  We apply heuristics to ensure we can compute a plan in a reasonably small time:</p>\n<ul>\n<li>for up to 3 host failures, we do a comprehensive search which tries\nalmost all permutations.  
This covers corner cases such as having\nhosts or VMs with very different amounts of memory (e.g. 4GB vs\n128GB).  Rather than calculate memory slots or otherwise approximate\nresults, we just deal with them individually and give very accurate\nplans.</li>\n<li>for greater than 3 host failures, we make conservative decisions by\napproximating every VM to be as large as the largest, and\nconsidering each host to be the same as the most densely packed\nhost.  We do not approximate the host memory, and so having pools\nwith uneven amounts of host memory will be fine.  However, in\napproximate planning mode having a single very large VM will result\nin a low <em>nhtol</em> value.  If this is a problem, then try to reduce\nthe <em>nhtol</em> or try to have a more even spread of VM memory sizes.</li>\n</ul>\n<p>Since planning algorithms are designed for unexpected host failures, we only consider absolutely essential resource reservations which would prevent the VM from starting on the alternative host (e.g. storage is visible, and enough memory is present).  We do not perform CPU reservation on the basis that it can be optimised at a later stage via live relocation once the VM is back up and running.</p>\n<h3 id=\"overcommit-protection\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#overcommit-protection\"></a>Overcommit protection</h3>\n<p>We now have HA armed and a failover plan for our VMs.  But what if you want to make changes to your configuration after HA is enabled?  This is dealt with via <strong>overcommit protection</strong>.</p>\n<p>The XenServer pool dynamically calculates a new failover plan in response to every XenAPI call which would affect it (e.g. starting a new VM).  
If a new plan cannot be calculated due to insufficient resources across the pool, XenServer will return an <strong>overcommitment</strong> error message to the client which blocks the operation.</p>\n<h4 id=\"the-what-if-machine\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#the-what-if-machine\"></a>The “What if?” Machine</h4>\n<p><img src=\"/images/ha-wizard-4b.webp\" alt=\"%c\" ></p>\n<p>This overcommit protection would be quite irritating if you had to keep trying things and seeing if a plan exists or not, and so we built a &quot;<a href=\"http://www.gotfuturama.com/Information/Encyc-55-What_If_Machine/\">What If?</a>&quot; machine into XenServer to facilitate counter-factual reasoning.</p>\n<p>When reconfiguring HA via XenCenter, you can supply a hypothetical series of VM priorities, and XenServer will return the number of host failures which would be tolerated under this scheme.  This lets you try various combinations of VM protections depending on your business needs, and see if the number of host failures is appropriate to the level of paranoia you desire.</p>\n<p>This can even be done via the CLI, using the snappily named &quot;<strong>xe pool-ha-compute-max-host-failures-to-tolerate</strong>&quot; when HA is enabled.</p>\n<p>The nice thing about XenServer HA is that it is done at the XenAPI level, and so any of the standard clients (such as the xe CLI or XenCenter) or any third-party clients which use the XenAPI will all interoperate just fine.  The XenServer pool dynamically recalculates plans in response to the client requests, and so no special “oracle” is required outside of the pool to figure out HA plans.</p>\n<p>Finally, HA makes master election completely invisible.  Any host in a pool can be a master host, and the pool database is constantly replicated across all nodes and also backed up to shared storage on the heartbeat SR for additional safety.  
Any XenAPI client can connect to any host, and a redirect is issued to the current master host.</p>\n<h2 id=\"protection-levels\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#protection-levels\"></a>Protection Levels</h2>\n<p>Each VM in an HA pool can be either <strong>fully protected</strong>, <strong>best-effort</strong> or <strong>unprotected</strong>. VMs which are protected are all included in the failover planning, and if no plan exists for which they can all be reliably restarted then the pool is considered to be overcommitted. Hugh Warrington (who implemented the XenCenter HA support) explained what use protection levels are:</p>\n<blockquote>\n<p>Best-effort VMs are not considered when calculating a failover plan,\nbut the pool will still try to start them as a one-off if a host that\nis running them fails.  This restart is attempted after all protected\nVMs are restarted, and if the attempt to start them fails then it will\nnot be retried.  This is a useful setting for test/dev VMs which\naren’t critical to keep running, but would be nice to do so in a pool\nwhich also has some important VMs which absolutely must run.</p>\n</blockquote>\n<p><img src=\"/images/ha-wizard-5.webp\" alt=\"%c\" ></p>\n<p>There are some advanced features which are only available via the CLI.   Each protected VM in an HA pool can be assigned a numeric <code>ha-restart-priority</code>.  If a pool is well-resourced with a high <em>nhtol</em>, then these restart priorities are not relevant: the VMs are all guaranteed to be started.</p>\n<p>If more hosts fail than have been planned for, then the priorities are used to determine the order in which VMs are restarted.  This ensures that in over-committed pools, the most important VMs are restarted first.  
Although the pool will start priority 1 VMs first, they might not finish booting before the priority 2 VMs, and so this should not be used as the basis for service ordering.</p>\n<p>Note that it's very important to <strong>ensure that a VM is agile</strong> when protecting it by HA.  If the VM is not agile (e.g. has a physical CD drive mapped in from a host), then it can only be assigned Best Effort restart since it is tied to one host.</p>\n<h2 id=\"xencenter-support-for-ha\"><a class=\"anchor\" aria-hidden=\"true\" href=\"#xencenter-support-for-ha\"></a>XenCenter support for HA</h2>\n<p>The best practice for HA is not to make configuration changes while it is enabled.  Instead, it is intended to be the &quot;2am safeguard&quot; which will restart hosts in the event of a problem when there isn't a human administrator nearby.  If you are actively making configuration changes such as applying patches, then HA should be disabled for the duration of these changes.</p>\n<p>XenCenter makes some common changes under HA much more user-friendly, which I asked <a href=\"http://community.citrix.com/blogs/citrite/ewanm/\">Ewan Mellor</a> (the principal GUI engineer) about:</p>\n<ul>\n<li>Normally a protected VM cannot be shut down via the CLI or from\nwithin the guest (a shutdown from within the guest will\nautomatically restart it).  If you try to shut down from XenCenter,\nit will give you the option of unprotecting the VM first and then\nshutting it down.  Thus, accidental in-guest shutdowns won’t result in\ndowntime, but administrators can still stop a protected guest if\nthey really want to.</li>\n<li>If you want to reboot a host when HA is enabled, XenCenter\nautomatically uses the hypothetical planning calculation to\ndetermine if this would invalidate the failover plan.  If it doesn’t\naffect it, then the host is shut down normally.  
If the plan would\nbe violated, but the <em>nhtol</em> is greater than 1, XenCenter will give\nthe administrator the option of lowering the <em>nhtol</em> value by 1. \nThis reduces the overall resilience of the pool, but always ensures\nthat at least one host failure will be tolerated.  When the host\ncomes back up, the plan is automatically recalculated and the\noriginal <em>nhtol</em> value restored if appropriate.</li>\n<li>If you try to apply a hotfix, then XenCenter will disable HA for the\nduration of the pool patching wizard.  It is important to manually\nkeep an eye on hotfix application to ensure that host failures do\nnot disrupt the operation of the pool.</li>\n</ul>\n<p>So, I hope this short article has given you a taster... just kidding! This post is almost as long as my PhD thesis, but then, HA is a complex topic. Please do feel free to get back to me with comments and feedback about how we can improve it in the future releases, or if you just love it the way it is.  Many thanks to <a href=\"https://dave.recoil.org\">Dave Scott</a>, <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, Ewan Mellor and Hugh Warrington for their input to this article.</p>",
      "url": "https://anil.recoil.org/notes/peeking-under-the-hood-of-high-availability",
      "external_url": "https://web.archive.org/web/20090126095912/http://community.citrix.com/blogs/citrite/anilma/2008/09/17/Peeking+under+the+hood+of+High+Availability",
      "title": "Peeking under the hood of High Availability",
      "summary": "Learn how XenServer's High Availability feature works, including host failure detection and automatic VM restarts.",
      "date_published": "2008-09-17T00:00:00.000000Z",
      "date_modified": "2008-09-17T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "xen",
        "citrix",
        "consensus",
        "distributed"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/48sxx-hjt49",
      "content_html": "<p>You won’t be surprised to hear that we spend a lot of time improving\n<a href=\"http://www.citrix.com/XenApp\">XenApp</a> performance when running on\n<a href=\"http://www.citrix.com/XenServer\">XenServer</a>. Although there are some\ngood benchmark comparisons available (such as the <a href=\"http://community.citrix.com/x/_4ENAg\">Tolly\nGroup</a> report), I still get a lot\nof customers asking about what the “secret sauce” is. I sat down with\nGeorge Dunlap, the lead XenServer performance engineer to chat about the\nvery first optimisation we did back in XenServer 4.0 last year.</p>\n<p>Before we dive in, we first need to explain how a normal operating\nsystem handles memory. George explains:</p>\n<blockquote>\n<p>Modern desktop and server processors don’t access memory directly\nusing its physical address. They use ‘<a href=\"http://en.wikipedia.org/Virtual_Memory\">virtual\nmemory</a>’ to separate the\naddresses that processes use to read and write memory from the actual\nmemory itself. This allows operating systems to hide from processes\nall the dirty details of how much memory there is, where in physical\nmemory the process needs to write to, and so on.</p>\n<p>However, the actual processor still needs to translate from a\n<a href=\"http://en.wikipedia.org/wiki/Virtual_address\">virtual address</a> to the\nphysical memory address in order to actually read and write any\nmemory. This translation is done with something called <a href=\"http://en.wikipedia.org/wiki/Page_tables\">page\ntables</a>.</p>\n<p>Page tables are used to implement virtual memory by mapping virtual\naddresses to physical addresses. The operating system constructs page\ntables using physical memory addresses, and then puts the physical\naddress of the “top-level” page table into a hardware register called\nthe ‘base pointer’. 
Then the processor will read these page tables to\ntranslate virtual addresses to physical addresses as needed, before\nreading and writing to physical memory.</p>\n</blockquote>\n<p>Most modern processor types have some sort of paging mechanism, although\nXenServer is specifically tuned for\n<a href=\"http://en.wikipedia.org/wiki/X86-64\">x86-64</a> CPUs. An excellent book on\nthe general topic is <a href=\"http://en.wikipedia.org/wiki/Special:BookSources/0130313580\">Modern Operating\nSystems</a> by\n<a href=\"http://www.cs.vu.nl/~ast/\">Andrew Tanenbaum</a>. When XenServer creates\nWindows VMs, it takes advantage of the <a href=\"http://en.wikipedia.org/wiki/X86_virtualization\">virtualization\nextensions</a> in modern\nCPUs, which requires special memory handling in Xen. George explains\nthis further:</p>\n<blockquote>\n<p>When we create a virtual machine, we virtualize the memory as well;\nthat means that the guest operating system’s idea of physical memory\ndoes not match up to real physical memory on the host. Traditionally,\nwhat the guest thinks of as physical memory is called “physical\nmemory”, and what the hypervisor thinks of as physical memory is\ncalled “machine memory”. Since this terminology is a bit confusing,\nXen tends to call what the guest thinks of as physical memory as\n“guest physical” memory, just to help make things more clear.</p>\n</blockquote>\n<blockquote>\n<p>This means that any fully-virtualized operating system, like Windows,\nwill create page tables using guest physical memory, and will point\nthe base pointer at the guest physical address of the top-level page\ntable. Unfortunately, the hardware still needs to map from virtual\nmemory address to machine addresses, not guest physical addresses.</p>\n</blockquote>\n<blockquote>\n<p>In order to allow this to happen, the hypervisor sets up <strong>shadow\npage tables</strong>. 
These page tables, generated by the hypervisor, are\ncopies of the guest page tables, but with the guest physical addresses\nconverted into machine physical addresses. The guest cannot access\nthem directly, and they don’t reside in the guest’s physical memory;\nthey’re generated out of a pool of memory that the hypervisor\nallocates when a VM is created, called shadow page table memory.</p>\n</blockquote>\n<blockquote>\n<p>What this means is that whenever the guest operating system wants to\nmap some new memory, after it writes the data into the page table but\nbefore it can actually use it, the hypervisor needs to translate the\nchange to the guest page table into changes to the shadow page table.\nSo any workload that involves a lot of this will necessarily involve\nthe hypervisor a lot, which causes overhead.</p>\n</blockquote>\n<p>So shadow page tables are our mechanism of giving a guest an interface\nwhich is identical to real hardware (so it doesn’t need to be modified),\nbut still intercepting changes before they reach the real hardware. You\ncan find more details from the <a href=\"http://www.xensource.com/files/summit_3/XenSummit_Shadow2.pdf\">XenSummit 2006\ntalk</a> or\nfrom the 2005 <a href=\"http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-migration-nsdi-pre.pdf\">NSDI\npaper</a>.\nSo how is this all relevant to XenApp performance? Back to George…</p>\n<blockquote>\n<p>The hypervisor allocates a certain amount of memory for each VM to\nuse for shadow page tables; this is called <strong>shadow page table\nmemory</strong>. 
As new page tables are created and old ones aren’t used\nanymore, the hypervisor cycles through this shadow page table memory.\nWhen it needs a new page and there isn’t enough, it will ‘unshadow’\nthe guest page tables that haven’t been used for the longest time to\nreclaim shadow memory, so that it can use more.</p>\n</blockquote>\n<blockquote>\n<p>We don’t know ahead of time how much shadow memory a given workload\nwill use, but we can estimate based on the amount of memory that the\nVM has. We allocate enough shadow memory for each page to be mapped\nonce, more or less, then add an extra 50% to have some slack. For all\nthe workloads we’ve tested, that’s been enough – except XenApp.</p>\n</blockquote>\n<blockquote>\n<p>XenApp is the one workload we’ve found that requires more shadow page\ntable memory than our standard default. Because XenApp generally\nstarts hundreds of copies of the same process, the same memory ends up\nmapped in hundreds of different processes. What happens when all of\nthose processes are active is that XenServer is continually\nunshadowing one process’ page tables in order to shadow another\nprocess’ page tables, only to have to re-shadow the original ones a\nsecond or two later when it runs again! This is called\n<a href=\"http://en.wikipedia.org/wiki/Thrash_(computer_science)\">thrashing</a>,\nwhen there’s not enough of a limited resource.</p>\n</blockquote>\n<p>Once the bottleneck was discovered, the solution was simple. In\nXenServer 4.1, we created a special XenServer application template\ncalled <em>“Citrix XenApp”</em>, which has an increased shadow multiplier that\nreserves more shadow memory for the guest when it starts. This is also a\ngood example of how templates hide the complexities of performance\ntuning from the user, while still permitting custom modifications if they\nare required. 
For example, on your XenServer host with a VM called\n“XenApp”, you could view the shadow multiplier by using the CLI:</p>\n<pre><code class=\"language-bash\"># xe vm-list name-label=XenApp params=HVM-shadow-multiplier\n  HVM-shadow-multiplier ( RW)    : 4.000\n</code></pre>\n<p>The same value is also available from XenCenter in the Optimization\npane, but of course do remember that the default value was chosen\nthrough extensive testing and doesn’t need to be changed. Most of the\nother templates in XenServer also have carefully tuned settings (e.g.\nthe hardware platform flags) to ensure smooth running, or in the case of\nLinux templates, to support <a href=\"http://docs.xensource.com/XenServer/4.1.0/1.0/en_gb/sdk.html#id2553443\">para-virtual\ninstallation</a>.\nThis is why it’s so important that you not use the <em>“Other Install\nMedia”</em> template in preference to a more specialised one!</p>\n<p>I mentioned at the beginning of this post that this was the first of\nmany XenApp optimisations. We’ve just released the <a href=\"https://www.citrix.com/English/ss/downloads/details.asp?downloadId=1679827&amp;productId=683148\">public\nbeta</a>\nof the latest XenServer (“Orlando”) which is even faster. The story of\nwhat those improvements are, and the tools which George and his team\nuse to analyze the inner workings of Xen, are a topic for a future\npost. For now, get downloading XenServer and start virtualizing your\nXenApp installations! Or if you’re feeling inspired, go over to\n<a href=\"http://xen.org/\">xen.org</a>, check out the source, and get coding…</p>",
      "url": "https://anil.recoil.org/notes/shedding-some-light-on-xenapp-on-xenserver-performance-tuning",
      "external_url": "https://web.archive.org/web/20090131071235/http://community.citrix.com/blogs/citrite/anilma/2008/08/04/Shedding+some+light+on+XenApp+on+XenServer+performance+tuning",
      "title": "Shedding light on XenApp on XenServer performance tuning",
      "summary": "Optimize XenApp performance on XenServer with expert tuning tips.",
      "date_published": "2008-08-04T00:00:00.000000Z",
      "date_modified": "2008-08-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "xen",
        "systems",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://doi.org/10.59350/fxmfq-3b085",
      "content_html": "<p>I thought I’d kick off my Citrix blog with a question I get pretty often\nfrom Linux enthusiasts: how to install unsupported Linux distributions\non <a href=\"https://xenserver.com\">XenServer</a> 4.1.</p>\n<p>The most common solution people find is to use the &quot;Other Install Media&quot;\ntemplate, insert the distribution installation CD, and find that the\nmouse cursor doesn’t work when they boot into X11. The reason for this\nis that they are using the hardware-assisted emulation mode of\ninstalling Linux. In this mode (dubbed “HVM”), all input and output is\nemulated, and in particular the mouse interface uses the USB tablet\ninterface. If the distribution doesn’t include a driver for USB tablets,\nthen no mouse will appear.</p>\n<p>Windows guests run at high-speed in HVM mode due to the installation of\nthe XenServer tools which install <a href=\"http://xen.org/files/summit_3/xen-pv-drivers.pdf\">high-speed\ndrivers</a>, but these\nare not necessary for Linux distributions since they can be run in\n<a href=\"http://en.wikipedia.org/wiki/Paravirtualization\">para-virtualized</a> mode\n(dubbed “PV”). This involves obtaining a Xen-enabled PV kernel from the\ndistribution, and modifying the VM record in XenServer to boot into this\nkernel instead of HVM mode. The XenServer built-in templates for popular\ndistributions such as RHEL, CentOS or SUSE Linux already automate all\nthis and are in PV mode from the installer onwards.</p>\n<p>In the remainder of this post, I’ll explain how to take a distribution\nwithout direct support (<a href=\"http://www.ubuntu.com/\">Ubuntu</a>\n<a href=\"https://wiki.ubuntu.com/HardyHeron\">8.04</a>), get it installed in HVM\nmode on XenServer 4.1, and convert it to PV mode with a XenCenter\ngraphical console.</p>\n<ul>\n<li>\n<p>Download the &quot;<a href=\"http://www.ubuntu.com/GetUbuntu/download\">Alternative Installation\nCD</a>&quot;. 
The main\ninstallation CD uses graphical mode, which won't install as well in\nHVM mode due to the use of esoteric 16-bit mode instructions for the\ngraphics operations. The 16-bit emulation mechanisms vary between\nprocessors (with better support on AMD chips, and a software\ninstruction emulator required on Intel VT chips). However, the\nUbuntu alternate CD uses a text-based installer which works fine.</p>\n</li>\n<li>\n<p>Create a new VM on the XenServer 4.1 host using the &quot;Windows Server\n2003&quot; template. This template is set up with a sensible set of\nhardware emulation flags and default disks, and so is a good base\nfor the HVM installation of Ubuntu as well. Attach the Ubuntu ISO\nyou just downloaded to the VM, and proceed to install Ubuntu as\nnormal. You should install it onto the first disk, to make the\nsubsequent steps in this guide easier.</p>\n</li>\n<li>\n<p>When the installation is finished, reboot the VM (don't forget to\ndetach the installation ISO first). It should boot up in HVM mode\ninto the graphical login screen. The XenCenter display will show it\nas not being optimized, which is fine. At this stage, I prefer to\nwork via a remote command-line using SSH. Open up a Terminal from\nUbuntu, and run &quot;<code>sudo apt-get install openssh-server</code>&quot;. Then find\nout the VM's IP address with &quot;<code>ifconfig eth0</code>&quot;, and then connect to\nit remotely. Alternatively, you can continue to type in the commands\ndirectly into the terminal as well.</p>\n</li>\n<li>\n<p>On the Ubuntu guest, you now need to install the latest Xen version\nof the Ubuntu kernel:</p>\n<ul>\n<li>Install the Linux kernel virtual package with\n&quot;<code>sudo apt-get install linux-image-xen</code>&quot;. 
This is a virtual\npackage which pulls in the latest Xen kernel and modules, in my\ncase <code>2.6.24.19.21</code>.</li>\n<li>You now need to work around a\n<a href=\"http://www.mail-archive.com/grub-devel@gnu.org/msg06024.html\">bug</a>\nin grub. Due to the switch in recent versions of Linux to work\nwith the hypervisor-independent\n<a href=\"http://xen.xensource.com/files/xensummit_4/xen-paravirt_ops_Fitzhardinge.pdf\">paravirt_ops</a>\ninterface, <code>update-grub</code> doesn't update the grub configuration\nwith your newly installed Xen kernel. To fix this:\n<ul>\n<li>\n<p>Open <code>/boot/grub/menu.lst</code> in your favourite editor.</p>\n</li>\n<li>\n<p>Scroll to the bottom to the kernel list, and find the entry\nwhich looks like:</p>\n<pre><code>title           Ubuntu 8.04, kernel 2.6.24-16-generic\nroot            (hd0,0)\nkernel          /boot/vmlinuz-2.6.24-16-generic root=UUID=&lt;uuid&gt; ro quiet splash\ninitrd          /boot/initrd.img-2.6.24-16-generic\nquiet\n</code></pre>\n</li>\n<li>\n<p>Add a new entry which is similar to this, but change all\nreferences to the <code>2.6.24-16-generic</code> to the Xen kernel. In\n<code>/boot</code> I have <code>vmlinuz-2.6.24-19-xen</code>, so my new entry\nlooks like:</p>\n<pre><code>title           Ubuntu 8.04, kernel 2.6.24-19-xen\nroot            (hd0,0)\nkernel          /boot/vmlinuz-2.6.24-19-xen root=UUID=&lt;uuid&gt; ro quiet splash\ninitrd          /boot/initrd.img-2.6.24-19-xen\nquiet\n</code></pre>\n</li>\n<li>\n<p>Also edit the <code>default</code> entry in the <code>menu.lst</code> to match the\nnumber of the kernel you just added. I set mine to 3, since\nit is the fourth entry in the list and the indexing starts\nfrom 0.</p>\n</li>\n</ul>\n</li>\n</ul>\n</li>\n<li>\n<p>When this is done, shut down the guest but do not reboot it just\nyet. You first need to edit the VM record for your Ubuntu VM to\nconvert it to PV boot mode. 
From the control domain console of your\nXenServer:</p>\n<ul>\n<li>Determine the UUID of the Ubuntu VM by using the <code>xe</code> CLI:\n<ul>\n<li><code>xe vm-list name-label=Ubuntu params=uuid --minimal</code> : this\nwill print out the UUID of the VM named &quot;Ubuntu&quot;. If you are\nlogged into the control domain, pressing the <code>&lt;tab&gt;</code> key\nwill perform auto-completion of UUIDs in subsequent XE\ncommands, so you don't need to keep typing it in every time!</li>\n<li><code>xe vm-param-set uuid=&lt;uuid&gt; HVM-boot-policy=</code> : this will\nclear the HVM boot mode from the VM.</li>\n<li><code>xe vm-param-set uuid=&lt;uuid&gt; PV-bootloader=pygrub</code> : this\nwill switch the VM to using the pygrub bootloader which\nstarts the guest in PV mode by examining its filesystem for\na kernel.</li>\n<li><code>xe vm-param-set uuid=&lt;uuid&gt; PV-args=&quot;console=tty0 xencons=tty&quot;</code>\n: this configures the kernel boot arguments to display the\nlogin console on the correct TTY, so that it shows up in the\nXenCenter console.</li>\n</ul>\n</li>\n<li>Next, you need to flag the root disk of the VM as bootable so\nthat pygrub knows where to look for the PV kernel:\n<ul>\n<li><code>xe vm-disk-list uuid=&lt;uuid&gt;</code> and look for the UUID of the\nVBD for the disk. VBD stands for &quot;Virtual Block Device&quot; and\nrepresents how to map the virtual disk into the virtual\nmachine.</li>\n<li><code>xe vbd-param-set uuid=&lt;vbd uuid&gt; bootable=true</code> will set\nthe root disk VBD to be bootable.</li>\n</ul>\n</li>\n</ul>\n</li>\n<li>\n<p>You should be all set now! If you boot up the Ubuntu VM, it should\nstart up in text-mode with the high-speed PV kernel. 
If it doesn't\nwork due to an incorrect grub configuration, you can use the\n<code>xe-edit-bootloader</code> script in the XenServer control domain to edit\nthe <code>grub.conf</code> until it works.</p>\n</li>\n<li>\n<p>The next step is to install the XenServer tools within the guest, so\nthat metrics such as the network interface IP addresses are recorded\nand reported from XenCenter. To do this:</p>\n<ul>\n<li>Due to a portability issue with the default shell in Ubuntu\n(<a href=\"http://en.wikipedia.org/wiki/Debian_Almquist_shell\">dash</a>),\nyou will need to replace it by:\n<code>sudo apt-get -y install bash &amp;&amp; sudo dpkg-reconfigure dash</code>.\nWe've actually fixed this issue in future releases of XenServer,\nbut for XenServer 4.1 you will need to use <code>bash</code>.</li>\n<li>Attach the XenServer Tools ISO into the VM, and mount it into\nthe guest with <code>sudo mount /dev/xvdd /mnt</code></li>\n<li>Install the tools with\n<code>sudo dpkg -i /mnt/Linux/xe-guest-utilities_4.1.0-257_i386.deb</code>.</li>\n<li>The warnings about the VM being unoptimized should disappear,\nand additional information such as the IP address of the guest\nshould appear in XenCenter.</li>\n</ul>\n</li>\n<li>\n<p>In order to access the Ubuntu installation via the graphical\nconsole, you need to configure it to run\n<a href=\"http://www.realvnc.com/\">VNC</a> on the external network interface.\nXenCenter polls the guest to see if it is listening on the VNC port\n5900, and offers the option to switch to the graphical console if it\nfinds it. 
I followed the excellent instructions on this <a href=\"http://ubuntuforums.org/showpost.php?p=4963842&amp;postcount=1\">forum\npost</a>.\nTo summarise them:</p>\n<ul>\n<li>\n<p><code>sudo apt-get install vnc4server xinetd</code> : to install the\nrequired packages</p>\n</li>\n<li>\n<p>Edit <code>/etc/gdm/gdm.conf</code> and uncomment the\n<code>RemoteGreeter=/usr/lib/gdm/gdmlogin</code> line, and set the key\n<code>Enable=true</code> in the <code>[xdmcp]</code> section.</p>\n</li>\n<li>\n<p>Install a new service file for <code>xinetd</code> into\n<code>/etc/xinetd.d/Xvnc</code> with the following contents:</p>\n<pre><code>service Xvnc\n{\n  type = UNLISTED\n  disable = no\n  socket_type = stream\n  protocol = tcp\n  wait = no\n  user = nobody\n  server = /usr/bin/Xvnc\n  server_args = -inetd -query localhost -geometry 1024x768  -depth 16 -cc 3 -once -SecurityTypes=none -extension XFIXES\n  port = 5900\n}\n</code></pre>\n</li>\n<li>\n<p>The major difference from the forum poster is to run it on port\n5900, and not to restrict it to just localhost (since XenCenter\nalso needs to connect to it).</p>\n</li>\n<li>\n<p>Finally, restart the <code>xinetd</code> service by running\n<code>sudo /etc/init.d/xinetd restart</code>.</p>\n</li>\n</ul>\n</li>\n</ul>\n<p>Once you're done with this installation, you can shut down the VM and\nconvert it to a template. Any exports or clones will continue to run in\nPV mode, since the XenServer XVA export format records all of the\nmetadata required to re-create the VM records.</p>\n<p>Enjoy the Ubuntu on XenServer experience! Remember to report any issues\nyou have with the in-guest packages on the Ubuntu support forums, or\njust give them positive feedback.</p>\n<p>PS: many thanks to Andrew Peace and Ian Campbell for assistance. May\ntheir Linux beards remain long and uncut.</p>",
      "url": "https://anil.recoil.org/notes/installing-ubuntu-on-xenserver",
      "external_url": "https://web.archive.org/web/20090123155914/http://community.citrix.com/blogs/citrite/anilma/2008/07/02/Installing+Ubuntu+on+XenServer",
      "title": "Installing Ubuntu on XenServer",
      "summary": "This guide provides step-by-step instructions for installing Ubuntu as a paravirtualized (PV) guest on Citrix XenServer 4.1, including configuring the grub boot loader, installing XenServer tools, and setting up VNC for graphical console access.",
      "date_published": "2008-07-02T00:00:00.000000Z",
      "date_modified": "2008-07-02T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "xen",
        "linux",
        "ubuntu"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2008-mobisys-splittrust-1",
      "content_html": "<p>Paper on splitting trust between smartphones and web browsers at MobiSys 2008. This work with <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, Roy Want, and Trevor Pering extended our earlier crimeware research to enhance security on public terminals using mobile composition. The system splits web browsing functionality between an untrusted public terminal and a trusted personal mobile device, protecting users from keyloggers and screengrabbing attacks when using shared computers. We demonstrated how combining the convenience of public terminals with the security of personal devices could enable safer web browsing in public spaces.</p><h1>References</h1><ul><li>Sharp et al (2008). Enhancing web browsing security on public terminals using mobile composition. ACM. <a href=\"https://doi.org/10.1145/1378600.1378612\" target=\"_blank\"><i>10.1145/1378600.1378612</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2008-mobisys-splittrust-1",
      "title": "Enhancing web browsing security on public terminals using mobile composition",
      "summary": "Paper on splitting trust between smartphones and web browsers presented at MobiSys 2008.",
      "date_published": "2008-06-01T00:00:00.000000Z",
      "date_modified": "2008-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "mobile",
        "ubicomp",
        "web",
        "trust"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2008-mobisys-splittrust.pdf",
          "mime_type": "application/pdf",
          "title": "Enhancing web browsing security on public terminals using mobile composition"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/1378600.1378612",
          "doi": "10.1145/1378600.1378612",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2007-eurosys-melange-1",
      "content_html": "<p>Won best student paper for my PhD work on a high-performance functional packet parsing DSL at Eurosys 2007! The Melange work combined strong static typing with generative meta-programming to create MPL, a domain-specific language for describing Internet packet protocols. Our approach generated fast, zero-copy code that outperformed C implementations - we built fully-featured SSH and DNS servers that showed greater throughput and lower latency than OpenSSH and BIND. The work demonstrated that type-safe languages could eliminate the performance penalty typically associated with memory safety, opening the door to more secure protocol implementations.</p><h1>References</h1><ul><li>Madhavapeddy et al (2007). Melange: creating a \"functional\" internet. <a href=\"https://doi.org/10.1145/1272998.1273009\" target=\"_blank\"><i>10.1145/1272998.1273009</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2007-eurosys-melange-1",
      "title": "Melange: creating a \"functional\" internet",
      "summary": "Won best student paper award at Eurosys 2007 for PhD work on high-performance functional packet parsing DSL.",
      "date_published": "2007-06-01T00:00:00.000000Z",
      "date_modified": "2007-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "fp",
        "dsl",
        "systems",
        "parsing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2007-eurosys-melange.pdf",
          "mime_type": "application/pdf",
          "title": "Melange: creating a \"functional\" internet"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/1272998.1273009",
          "doi": "10.1145/1272998.1273009",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2006-puc-tags-1",
      "content_html": "<p>Journal paper on interacting with mobile services using camera-phones. This Personal and Ubiquitous Computing article presents both qualitative user-experience studies and quantitative pointing-device experiments with visual tags. We found that novice users could aim and click on visual tags remarkably quickly (under 3 seconds on average) and accurately (meeting our 6% error-rate threshold). The work with <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, and others revealed positive attitudes toward visual-tag applications while identifying important reservations about camera-phone technology more generally, providing concrete design lessons for future mobile interaction systems.</p><h1>References</h1><ul><li>Scott et al (2007). Interacting with mobile services: an evaluation of camera-phones and visual tags. <a href=\"https://doi.org/10.1007/s00779-006-0064-9\" target=\"_blank\"><i>10.1007/s00779-006-0064-9</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2006-puc-tags-1",
      "title": "Interacting with mobile services: an evaluation of camera-phones and visual tags",
      "summary": "Journal paper evaluating camera phone interaction with mobile services using visual tags.",
      "date_published": "2007-02-01T00:00:00.000000Z",
      "date_modified": "2007-02-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mobile",
        "pervasive-computing",
        "visual-tags"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2006-puc-tags.pdf",
          "mime_type": "application/pdf",
          "title": "Interacting with mobile services: an evaluation of camera-phones and visual tags"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/s00779-006-0064-9",
          "doi": "10.1007/s00779-006-0064-9",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2006-fighting-crimeware-1",
      "content_html": "<p>New paper <a href=\"/papers/2006-fighting-crimeware\">Fighting Crimeware: An Architecture for Split-Trust Web Applications</a> available. This Intel Research technical report presents a split-trust browsing architecture that splits web applications across a trusted personal device and an untrusted PC. Working with <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, Roy Want, and others at Intel, we developed a system where sensitive information entered on a phone's keypad cannot be read by PC-based keyloggers, and data displayed on the phone's screen is hidden from screengrabbing malware. We implemented a prototype using a commercial cell phone and showed it could defend against a range of crimeware attacks including active injection and browser compromises.</p>",
      "url": "https://anil.recoil.org/notes/2006-fighting-crimeware-1",
      "title": "Fighting Crimeware: An Architecture for Split-Trust Web Applications",
      "summary": "Paper presenting architecture for split-trust web applications to combat crimeware attacks.",
      "date_published": "2006-04-01T00:00:00.000000Z",
      "date_modified": "2006-04-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "web",
        "privacy",
        "malware"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2006-fighting-crimeware.pdf",
          "mime_type": "application/pdf",
          "title": "Fighting Crimeware: An Architecture for Split-Trust Web Applications"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2005-spin-splat-1",
      "content_html": "<p>Workshop paper on temporal automata for protocol specifications at SPIN 2005. We presented SPLAT, a tool that makes model-checking more accessible by helping developers bridge the gap between complex applications and their abstract models. The key challenge we addressed was how non-experts could create accurate abstractions without inadvertently hiding critical bugs, and how to determine whether counter-examples represent genuine bugs or just modeling artifacts. SPLAT combines static model-checking with dynamic enforcement of abstractions to validate the correspondence between models and implementations.</p><h1>References</h1><ul><li>Madhavapeddy et al (2005). SPLAT: A Tool for Model-Checking and Dynamically-Enforcing Abstractions. Springer. <a href=\"https://doi.org/10.1007/11537328_23\" target=\"_blank\"><i>10.1007/11537328_23</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2005-spin-splat-1",
      "title": "SPLAT: A Tool for Model-Checking and Dynamically-Enforcing Abstractions",
      "summary": "Workshop paper on temporal automata for protocol specifications presented at SPIN 2005.",
      "date_published": "2005-08-01T00:00:00.000000Z",
      "date_modified": "2005-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "formal-methods",
        "model-checking",
        "protocols",
        "verification"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2005-spin-splat.pdf",
          "mime_type": "application/pdf",
          "title": "SPLAT: A Tool for Model-Checking and Dynamically-Enforcing Abstractions"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/11537328_23",
          "doi": "10.1007/11537328_23",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2005-ubiapp-ubimedia-1",
      "content_html": "<p>Position paper on ubiquitous computing approaches to emerging stream media appliances. This work argued that the ubiquitous computing community needed to catch up with the rapid evolution of streaming media devices and services. We explored how ubicomp principles could be applied to the emerging landscape of media appliances, considering the challenges of integrating location-aware and context-sensitive features into streaming media systems that were becoming increasingly prevalent in everyday environments.</p><h1>References</h1><ul><li>Sharp et al (2005). Ubiquitious Computing needs to catch up with Ubiquitous Media. <a href=\"https://doi.org/10.1109/MPRV.2005.69\" target=\"_blank\"><i>10.1109/MPRV.2005.69</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2005-ubiapp-ubimedia-1",
      "title": "Ubiquitious Computing needs to catch up with Ubiquitous Media",
      "summary": "Position paper on ubiquitous computing approaches for emerging stream media appliances.",
      "date_published": "2005-07-01T00:00:00.000000Z",
      "date_modified": "2005-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ubicomp",
        "streaming",
        "media",
        "hci"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2005-ubiapp-ubimedia.pdf",
          "mime_type": "application/pdf",
          "title": "Ubiquitious Computing needs to catch up with Ubiquitous Media"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/MPRV.2005.69",
          "doi": "10.1109/MPRV.2005.69",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2005-ieee-audio-1",
      "content_html": "<p>New paper <a href=\"/papers/2005-ieee-audio\">Audio networking: the forgotten wireless technology</a> available. This IEEE Pervasive Computing article explores how audio can be used as a wireless networking technology for mobile devices. We investigated various modulation schemes for transferring data to nearby smartphones through audio signals, covering both the technical aspects and usability considerations. The work included a case study applying audio networking techniques to solve real-world problems in telephone conferencing, demonstrating how this &quot;forgotten&quot; wireless technology could enable new types of proximity-based interactions.</p><h1>References</h1><ul><li>Madhavapeddy et al (2005). Audio networking: the forgotten wireless technology. <a href=\"https://doi.org/10.1109/MPRV.2005.50\" target=\"_blank\"><i>10.1109/MPRV.2005.50</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2005-ieee-audio-1",
      "title": "Audio networking: the forgotten wireless technology",
      "summary": "Paper exploring audio-based networking as overlooked wireless communication technology.",
      "date_published": "2005-07-01T00:00:00.000000Z",
      "date_modified": "2005-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "networking",
        "audio",
        "wireless"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2005-ieee-audio.pdf",
          "mime_type": "application/pdf",
          "title": "Audio networking: the forgotten wireless technology"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/MPRV.2005.50",
          "doi": "10.1109/MPRV.2005.50",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2005-ubicomp-bluetooth-1",
      "content_html": "<p>Ubicomp paper on a study of indoor bluetooth propagation using the Active Bat system. This work with <a href=\"https://liquidx.net\">Alastair Tse</a> used ultrasonic location tracking to conduct fine-grained Bluetooth signal strength surveys, giving us unprecedented accuracy in mapping radio propagation indoors. We discovered that Bluetooth is actually poorly suited for fine-grained location inference due to hardware and specification limitations, and that device movement speed significantly impacts available bandwidth. The study provided valuable data sets for the research community and practical insights about the limitations of using Bluetooth for ubiquitous computing applications.</p><h1>References</h1><ul><li>Madhavapeddy et al (2005). A Study of Bluetooth Propagation Using Accurate Indoor Location Mapping. Springer. <a href=\"https://doi.org/10.1007/11551201_7\" target=\"_blank\"><i>10.1007/11551201_7</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2005-ubicomp-bluetooth-1",
      "title": "A Study of Bluetooth Propagation Using Accurate Indoor Location Mapping",
      "summary": "Ubicomp paper studying indoor Bluetooth signal propagation characteristics using Active Bat ultrasonic location system.",
      "date_published": "2005-07-01T00:00:00.000000Z",
      "date_modified": "2005-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "bluetooth",
        "ubicomp",
        "networking",
        "wireless",
        "mobile"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2005-ubicomp-bluetooth.pdf",
          "mime_type": "application/pdf",
          "title": "A Study of Bluetooth Propagation Using Accurate Indoor Location Mapping"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/11551201_7",
          "doi": "10.1007/11551201_7",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/c2k5-thoughts",
      "content_html": "<p>Finally had some time to get back from the OpenBSD hackathon and take\nstock of what I worked on. It was pretty interesting one this year, as I\nwent without having much idea of what to work on (unlike last year, when\nI had a mad backlog to catch up on).</p>\n<p>Some stuff I did during the week included:</p>\n<ul>\n<li>Clean up the <a href=\"http://www.openbsd.org/cgi-bin/cvsweb.cgi/src/usr.bin/ssh/atomicio.c\">atomicio</a>\ninterface used in <a href=\"http://www.openssh.com\">OpenSSH</a> and\n<em><a href=\"http://www.openbsd.org/cgi-bin/man.cgi?query=nc\">nc(1)</a></em> to\nprovide simpler semantics. Error checking from read/write functions\nare a real headache in C, as the functions return <code>-1</code> on error,\nwhich means a signed <code>ssize_t</code> is returned. However, they accept an\nunsigned value as the size of the buffer to process, which means\nthey could potentially return a value outside the range of the\nreturn value. This means you have to check if the return is <code>-1</code>,\nwhich indicates an error, and otherwise cast to a <code>size_t</code> to\ncorrectly get the buffer size back. With the new atomicio, it always\nreturns a <code>size_t</code>, and returns <code>0</code> to signal an error (with <code>errno</code>\ncontaining the error, and <code>EPIPE</code> being set for an <code>EOF</code> condition).</li>\n<li>Start looking at the Bluetooth stack to get L2CAP and RFCOMM\nsupport. We are half-way through un-netgraphing the FreeBSD stack\nand having a more traditional <code>netbt</code> socket interface (much like\n<code>netinet</code> or <code>netinet6</code>) to Bluetooth.</li>\n<li>Use <a href=\"http://cil.sf.net/\">CIL</a> to implement a few fun kernel\nsource-&gt;source transforms. <code>kerneltrace</code> just accepts a regular\nexpression and inserts a <code>printf</code> in the function prologue which\noutputs the function name and any arguments passed into it. 
Had this\nidea when chatting with <a href=\"http://www.monkey.org/~marius/\">Marius</a>,\nand it turned out to be very useful when trying to figure out\ndataflow in the Bluetooth stack (just compile with\n<code>make CC=&quot;/usr/local/bin/cilly --dokerneltrace --trace-regexp='ubt|ng_blue'&quot;</code>).\nThe second one was even simpler; <code>randomvars</code> assigns a non-zero\nvalue to every local variable in a function call to help track down\nuninitialized-local-variable bugs. Heres\n<a href=\"http://www.openbsd.org/cgi-bin/cvsweb.cgi/src/usr.bin/mg/search.c.diff?r1=1.15&amp;r2=1.16\">one</a>\nChad Loder found in\n<em><a href=\"http://www.openbsd.org/cgi-bin/man.cgi?query=mg\">mg(1)</a></em>.</li>\n<li>Other random <a href=\"http://marc.theaimsgroup.com/?l=openbsd-cvs&amp;m=111689009724884&amp;w=2\">signed/unsigned cleanups</a>\nin OpenSSH. Boring but important I guess...</li>\n</ul>\n<p>All in all, the hackathon re-motivated me to continue work on the\nOCaml-based daemons that <a href=\"https://dave.recoil.org\">Dave Scott</a> and I have been\nhacking on. I don't want to be fixing random buffer or integer overflows\nin an OpenBSD hackathon 5 years from now; we need to move on to more\nhigh-level issues.</p>",
      "url": "https://anil.recoil.org/notes/c2k5-thoughts",
      "title": "OpenBSD C2K5 thoughts",
      "summary": "Reflections on OpenBSD C2K5 hackathon projects and progress.",
      "date_published": "2005-06-04T00:00:00.000000Z",
      "date_modified": "2005-06-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "melange",
        "openbsd",
        "ocaml",
        "security",
        "livenotes"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2005-hotdep-spl-1",
      "content_html": "<p>Paper on temporal automata for protocol implementations at HotDep 2005. This work, done with <a href=\"https://dave.recoil.org\">Dave Scott</a>, explored the challenge of building internet servers that are simultaneously high-performance, dependable, and formally verified through model checking. We investigated how to use temporal automata to specify and verify protocol implementations without sacrificing the performance needed for production systems. The key insight was finding ways to make formal verification practical for real-world network services.</p>",
      "url": "https://anil.recoil.org/notes/2005-hotdep-spl-1",
      "title": "On the challenge of delivering high-performance, dependable, model-checked internet servers",
      "summary": "Paper on using temporal automata for protocol implementations presented at HotDep 2005.",
      "date_published": "2005-06-01T00:00:00.000000Z",
      "date_modified": "2005-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "formal-methods",
        "model-checking",
        "protocols",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2005-hotdep-spl.pdf",
          "mime_type": "application/pdf",
          "title": "On the challenge of delivering high-performance, dependable, model-checked internet servers"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2005-mc2r-visualtags-1",
      "content_html": "<p>While designing <a href=\"/projects/ubiqinteraction\">Spotcodes</a>, we realised that visual\ntags are a much better mechanism to advertise security keys to users instead\nof the error prone and much more difficult to use Bluetooth device discovery\nprotocol.  We duly implemented the direct system, and conducted a user study\nwith <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> who was down the corridor working with the HCI group in the\nComputer Lab.  The resulting journal paper on our SpotCode visual tag login\nsystem is a fun blend of systems and human factors.</p>\n<blockquote>\n<p>We do not believe that printed tags are competing with RFID and NFC for the prize of &quot;universally accepted connection establishment technology&quot;. Instead we observe that each offers complementary tradeoffs in terms of cost, data capacity, interaction distance, client-device compatibility and visibility. We predict that all three of these technologies will ultimately be integrated into mobile applications to provide consumers with the flexibility and functionality they require.</p>\n</blockquote>\n<p><em>(Update: and indeed, two decades on in 2025, this has played out pretty accurately. It is common now to use QRCodes to access services, and wireless scanning is a relatively rare thing to do but still available.)</em></p><h1>References</h1><ul><li>Scott et al (2005). Using visual tags to bypass Bluetooth device discovery. <a href=\"https://doi.org/10.1145/1055959.1055965\" target=\"_blank\"><i>10.1145/1055959.1055965</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2005-mc2r-visualtags-1",
      "title": "Using visual tags to bypass Bluetooth device discovery",
      "summary": "Journal paper on SpotCode visual tag login system, demonstrating visual tags as superior alternative to Bluetooth device discovery for security key advertisement.",
      "date_published": "2005-01-01T00:00:00.000000Z",
      "date_modified": "2005-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "spotcodes",
        "hci",
        "mobile",
        "bluetooth",
        "ubicomp"
      ],
      "attachments": [],
      "_references": [
        {
          "url": "https://doi.org/10.1145/1055959.1055965",
          "doi": "10.1145/1055959.1055965",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2005-ieee-smartphones-1",
      "content_html": "<p>New article out on using cameraphones to access site-specific services in IEEE\nPervasive Computing. This work with <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, and <a href=\"https://dave.recoil.org\">Dave Scott</a> explored\nhow smartphones could enhance location-based services like ticket machines and\ninformation kiosks while reducing deployment costs. We developed the Mobile\nService Toolkit, a complete client-server framework that lets site-specific\nservices automatically tailor their behavior using personal information stored\non users' phones. The paper presents case studies demonstrating how this\napproach could transform the way people interact with services in physical\nspaces.</p><h1>References</h1><ul><li>Scott et al (2005). Using smart phones to access site-specific services. <a href=\"https://doi.org/10.1109/MPRV.2005.44\" target=\"_blank\"><i>10.1109/MPRV.2005.44</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2005-ieee-smartphones-1",
      "title": "Using smart phones to access site-specific services",
      "summary": "IEEE Pervasive Computing article on using camera phones to access location-based services.",
      "date_published": "2005-01-01T00:00:00.000000Z",
      "date_modified": "2005-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mobile",
        "ubicomp",
        "location",
        "spatial"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2005-ieee-smartphones.pdf",
          "mime_type": "application/pdf",
          "title": "Using smart phones to access site-specific services"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1109/MPRV.2005.44",
          "doi": "10.1109/MPRV.2005.44",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/2005-bbphone-1",
      "content_html": "<p>Report on our hacking on the AT&amp;T Broadband Phone. This technical report describes our experiments with context-aware telephony, exploring how broadband phone networks could be enhanced with location and presence information to provide smarter calling features. We worked with <a href=\"mailto:ripduman.sohan@gmail.com\">Ripduman Sohan</a> and <a href=\"https://liquidx.net\">Alastair Tse</a> to prototype various context-aware services that could improve the user experience of IP telephony systems.</p>",
      "url": "https://anil.recoil.org/notes/2005-bbphone-1",
      "title": "The Broadband Phone Network: Experiences with Context-Aware Telephony",
      "summary": "Report on context-aware telephony experiments with AT&T Broadband Phone network.",
      "date_published": "2005-01-01T00:00:00.000000Z",
      "date_modified": "2005-01-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "telephony",
        "context-aware",
        "pervasive-computing"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2005-bbphone.pdf",
          "mime_type": "application/pdf",
          "title": "The Broadband Phone Network: Experiences with Context-Aware Telephony"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2004-spotcodes-1",
      "content_html": "<p>A technical report is now available on our SpotCode visual tag system, and includes a user study lead by <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> which tested its benefits against conventional mobile interfaces. The work demonstrates how camera-phones can be used for &quot;point-and-click&quot; interactions with visual tags in the real world, measuring both pointing speed and accuracy. We built a complete client/server framework to make it easy to develop applications using this technique, and validated it with prototype applications. The results showed that even novice users could quickly and accurately interact with site-specific mobile services using just their camera phones and public displays.</p><h1>References</h1><ul><li>Scott et al (2004). Using camera-phones to interact with context-aware mobile services. <a href=\"https://doi.org/10.48456/tr-609\" target=\"_blank\"><i>10.48456/tr-609</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/2004-spotcodes-1",
      "title": "Using camera-phones to interact with context-aware mobile services",
      "summary": "Technical report on SpotCode visual tag system with user study comparing benefits against conventional mobile interfaces.",
      "date_published": "2004-12-01T00:00:00.000000Z",
      "date_modified": "2004-12-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "spotcodes",
        "hci",
        "visual",
        "ubicomp"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2004-spotcodes.pdf",
          "mime_type": "application/pdf",
          "title": "Using camera-phones to interact with context-aware mobile services"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48456/tr-609",
          "doi": "10.48456/tr-609",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/spotcodes-nytimes",
      "content_html": "<p>In what is definitely our most exciting media coverage yet, <a href=\"/projects/ubiqinteraction\">Spotcodes</a> are featured in the New York Times!</p>\n<blockquote>\n<p>When you think of a public information kiosk, your mental picture might include greasy touch screens, broken trackballs and frozen monitors.\nBut researchers at an Intel-financed lab at Cambridge University have developed a way to replace displays like those with something portable, not to mention personal: a cellphone's built-in camera and screen. They and others plan to use commercially available hardware to turn the camera-equipped cellphone into a mouse, remote control, keyboard and more.\n<cite>-- <a href=\"https://www.nytimes.com/2004/10/07/technology/circuits/connecting-paper-and-online-worlds-by-cellphone-camera.html\">New York Times</a></cite></p>\n</blockquote>\n<p><a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a> got cited as I wasn't in the department that day when the journalist showed up at Intel Research!</p>\n<blockquote>\n<p>&quot;Instead of having all the hassle of putting things out in the environment that you have to maintain and that people can vandalize, you get a cheap PC, shove it in the back room of your shop and just put posters out front,&quot; said Richard Sharp, an Intel researcher here.</p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/spotcodes-nytimes",
      "external_url": "https://www.nytimes.com/2004/10/07/technology/circuits/connecting-paper-and-online-worlds-by-cellphone-camera.html",
      "title": "Connecting Paper and Online Worlds by Cellphone",
      "summary": "Researchers use cellphones to connect physical and online worlds, replacing public displays with portable technology.",
      "date_published": "2004-10-07T00:00:00.000000Z",
      "date_modified": "2004-10-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "spotcodes",
        "ubicomp",
        "mobile"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/2004-ubicomp-camera-1",
      "content_html": "<p>We gave a demo at <a href=\"https://www.ubicomp.org/ubicomp2004/\">UbiComp 2004</a> all the way in Tokyo\non our SpotCode visual tag system. It went very well, including some time to do some\nsightseeing in Japan and visit the sumo wrestling championships!</p>\n<p><div class=\"video-center\"><iframe title=\"passive-spiderman\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/9ef069e9-0aee-4c51-87c3-e09625711f99\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>",
      "url": "https://anil.recoil.org/notes/2004-ubicomp-camera-1",
      "title": "Using Camera-Phones to Enhance Human-Computer Interaction",
      "summary": "Demo of SpotCode visual tag system at UbiComp 2004 conference in Tokyo.",
      "date_published": "2004-09-01T00:00:00.000000Z",
      "date_modified": "2004-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "spotcodes",
        "ubicomp",
        "japan"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/2004-ubicomp-camera.pdf",
          "mime_type": "application/pdf",
          "title": "Using Camera-Phones to Enhance Human-Computer Interaction"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/mit-spotcodes",
      "content_html": "<p>We got more coverage of <a href=\"https://en.wikipedia.org/wiki/ShotCode\">SpotCodes</a> and our startup <a href=\"/projects/ubiqinteraction\">High Energy Magic</a>, leading to lots of interest in the technology.</p>\n<blockquote>\n<p>Public touch-screen displays such as airport check-in kiosks aren’t known for having versatile interfaces; they usually lack keyboards or pointing devices, limiting users to a few navigational buttons. But new software from High Energy Magic of Cambridge, England, turns a camera phone with a Bluetooth wireless connection into a portable mouse and keyboard that can take full command of public displays, doing away with the old touch screen. Working with Intel’s Cambridge research lab, High Energy Magic has developed a set of circular symbols, similar in concept to bar codes, that can be displayed by public terminals. Camera phones loaded with the company’s software can translate the symbols into data. Once a phone locks onto one of the symbols, it uses the Bluetooth short-range wireless protocol to send information about its size, position, and orientation to the computer running the display. The phone can then act as a mouse, manipulating on-screen controls such as scroll bars. The company plans to license the technology to businesses, such as travel agencies, that operate public kiosks.\n<cite>-- <a href=\"https://web.archive.org/web/20241202023917/https://cdn.technologyreview.com/s/403022/phone-it-in/\">MIT Technology Review</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/mit-spotcodes",
      "external_url": "https://cdn.technologyreview.com/s/403022/phone-it-in/",
      "title": "MIT Technology review covers SpotCodes",
      "summary": "MIT Technology Review covers SpotCodes technology by High Energy Magic.",
      "date_published": "2004-09-01T00:00:00.000000Z",
      "date_modified": "2004-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "spotcodes",
        "ubicomp",
        "mobile"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/netgames04-ctf-1",
      "content_html": "<p>The summer of 2004 was sufficient full of procrastination that the members\nof the <a href=\"https://web.archive.org/web/20041212123550/http://sn17.org/\">SN17 collective</a> in the\nComputer Lab decided to build a computer game. But it wasn't enough to just play\nthe game on our phones -- instead, we combined all the public displays in the corridors,\nand then added in a cutting-edge 5cm-accurate <a href=\"https://en.wikipedia.org/wiki/Active_Bat\">ActiveBAT</a>,\nand built a Symbian-based Capture The Flag game where we all had to run around and\ntag each other physically while tracking the flag virtually.</p>\n<p>Was it mad? Yes. Was it fun? Yes. Did it get us a paper into the SIGCOMM NetGames\nworkshop? Yes!</p>\n<blockquote>\n<p>Our novel contributions include: (i) creating a fast-paced, close quarters, location-aware game, (ii) exploring the tradeoffs between the accuracy of a location system, the I/O capabilities of current mobile hardware, and the latency of user feedback, and (iii) investigating the viability of Bluetooth as a component in a low-latency location-aware gaming infrastructure.</p>\n</blockquote><h1>References</h1><ul><li>Mansley et al (2004). Feedback, latency, accuracy: exploring tradeoffs in location-aware gaming. ACM Press. <a href=\"https://doi.org/10.1145/1016540.1016544\" target=\"_blank\"><i>10.1145/1016540.1016544</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/netgames04-ctf-1",
      "title": "Exploring tradeoffs in location-aware gaming using smartphones",
      "summary": "Paper on location-aware Capture The Flag game using Symbian phones, public displays, and ActiveBAT tracking, presented at SIGCOMM NetGames workshop.",
      "date_published": "2004-08-01T00:00:00.000000Z",
      "date_modified": "2004-08-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "games",
        "mobile",
        "bluetooth",
        "ubicomp",
        "networking",
        "vr"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/netgames04-ctf.pdf",
          "mime_type": "application/pdf",
          "title": "Feedback, latency, accuracy: exploring tradeoffs in location-aware gaming"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1145/1016540.1016544",
          "doi": "10.1145/1016540.1016544",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/wired-spotcode",
      "content_html": "<p>I gave a talk at <a href=\"https://web.archive.org/web/20050204012820/http://www.quernstone.com/notcon04/\">NotCon 2004</a> on SpotCodes, and it got covered in Wired Magazine!\nOf course, I shared the stage with a man telling the time <a href=\"https://www.wired.com/2004/06/from-the-prawn-of-time/\">via a prawn sandwich</a>, so the limelight wasn't all just mine...</p>\n<blockquote>\n<p>Anil Madhavapeddy and his colleagues at High Energy Magic think camera phones should be used for more than taking bad pictures. The company's SpotCode reader software lets camera phones recognize a circular tag and then communicate via Bluetooth with a local server.\n<cite>-- <a href=\"https://roxannekhamsi.com\">Roxanne Khamsi</a> for <a href=\"https://www.wired.com/2004/06/from-the-prawn-of-time/\">Wired</a></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/wired-spotcode",
      "external_url": "https://www.wired.com/2004/06/from-the-prawn-of-time/",
      "title": "From the prawn of time",
      "summary": "Talk at NotCon 2004 on SpotCodes and camera phone technology covered in Wired Magazine.",
      "date_published": "2004-06-07T00:00:00.000000Z",
      "date_modified": "2004-06-07T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "spotcodes",
        "ubicomp",
        "mobile"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/hem",
      "content_html": "<p>The <a href=\"/projects/ubiqinteraction\">SpotCode</a> cellphone software was spun out into a startup called High Energy Magic Ltd, and was covered on Slashdot.</p>\n<blockquote>\n<p>Check this out! High Energy Magic have announced a public beta of software to let you use your camera-phone as a physical mouse by just pointing and clicking and rotating it in the air.</p>\n</blockquote>\n<p>There were also articles on <a href=\"https://web.archive.org/web/20060505171702/http://www.linuxdevices.com/news/NS3157166681.html\">DeviceForge</a> that were picked up by quite a few outlets.</p>\n<p><em>Update: You can see some of the videos under <a href=\"/projects/ubiqinteraction\">Ubiquitous Interaction Devices</a> as well.</em></p>",
      "url": "https://anil.recoil.org/notes/hem",
      "external_url": "https://slashdot.org/story/04/05/27/1849209/cellphone-as-virtual-mouse-keyboard",
      "title": "Cellphone as a virtual mouse/keyboard",
      "summary": "Use your cellphone as a virtual mouse and keyboard with innovative software.",
      "date_published": "2004-05-27T00:00:00.000000Z",
      "date_modified": "2004-05-27T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ubicomp",
        "mobile",
        "spotcodes",
        "startups"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/audio-networking-1",
      "content_html": "<p>While working as an intern at Intel Research Cambridge, <a href=\"https://dave.recoil.org\">Dave Scott</a> and <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a> and I put together a fun system based on the emerging new class of smartphones. The project kicked off when we randomly experimented with our fancy Nokia smartphones and discovered that they didn't have anti-aliasing filters on the microphones!  We argued that</p>\n<blockquote>\n<p>[...] audio networking can be used as the basis for developing context-aware applications. Audio networking allows standard devices fitted with speakers and microphones (e.g. PDAs, laptops, desktop PCs and mobile phones) to exchange data and infer information about their environment.  One of the key advantages of audio networking is that it enables context-aware applications to be immediately deployed on a large scale without requiring users to purchase and install additional hardware.</p>\n</blockquote>\n<p>We used the lack of antialiasing filters to create a set of inaudible location beacons that would allow laptop computers to simply listen using their microphones and discover their current location without any advanced equipment being required!</p>\n<p><div class=\"video-center\"><iframe title=\"location\" width=\"100%\" height=\"315px\" src=\"https://crank.recoil.org/videos/embed/cda6e8e1-6f64-4f6f-aab0-8dc615836a51\" frameborder=\"0\" allowfullscreen sandbox=\"allow-same-origin allow-scripts allow-popups allow-forms\"></iframe></div></p>\n<p>You can read more details in the paper or in the <a href=\"/projects/ubiqinteraction\">Ubiquitous Interaction Devices</a> project page.</p><h1>References</h1><ul><li>Madhavapeddy et al (2003). Context-Aware Computing with Sound. Springer Berlin Heidelberg. <a href=\"https://doi.org/10.1007/978-3-540-39653-6_25\" target=\"_blank\"><i>10.1007/978-3-540-39653-6_25</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/audio-networking-1",
      "title": "Context-Aware Computing with Sound",
      "summary": "Research project using inaudible audio beacons for location discovery on smartphones without additional hardware, conducted as intern at Intel Research Cambridge.",
      "date_published": "2003-10-01T00:00:00.000000Z",
      "date_modified": "2003-10-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "ubicomp",
        "audio",
        "networking",
        "mobile"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/audio-networking.pdf",
          "mime_type": "application/pdf",
          "title": "Context-Aware Computing with Sound"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.1007/978-3-540-39653-6_25",
          "doi": "10.1007/978-3-540-39653-6_25",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/gcc-bounds",
      "content_html": "<p>After many rounds of review and helpful feedback from fellow developers,\nI merged my <a href=\"https://man.openbsd.org/gcc-local.1\">GCC static bounds checking extension</a> into OpenBSD today!</p>\n<blockquote>\n<p>Introduce a simple static checker for making sure that the bounds\nlength passed to common functions such as strlcpy/strlcat match the\nreal length of the buffer.  It also checks to make sure that the bound\nlength was not incorrectly derived from a sizeof(pointer) operation.</p>\n<p>Functions must be marked with the new attribute <strong>bounded</strong>, and warnings\nare turned on by -Wbounded.  Specifying -Wformat also enables bounds\nchecking for scanf(3) bounds to '%s' format variables. -Wall now turns\non -Wbounded also.</p>\n<p>The checking is pretty limited right now to constant parameters, and the\nbuffers must be statically declared, and not inside a record type.  This\nsimple checking still found hundreds of bugs around the ports tree though,\nand there have been no false positive warnings.</p>\n</blockquote>\n<p>You can read more details in the <a href=\"https://man.openbsd.org/gcc-local.1\"><em>gcc-local(1)</em></a> manual page as well.</p>",
      "url": "https://anil.recoil.org/notes/gcc-bounds",
      "external_url": "https://undeadly.org/cgi?action=article&sid=20030627104847",
      "title": "My static C bounds checker extension merged into OpenBSD",
      "summary": "OpenBSD merges static C bounds checker extension into its codebase.",
      "date_published": "2003-06-27T00:00:00.000000Z",
      "date_modified": "2003-06-27T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "openbsd",
        "compiler",
        "security",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/sam03-secpol-1",
      "content_html": "<p>My first ever academic paper, written with the expert guidance of <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and my PhD colleagues <a href=\"https://dave.recoil.org\">Dave Scott</a> and <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>!  We worked on a system call policy language to help constrain application access to privileged resources, and implemented this on OpenBSD using <a href=\"https://man.openbsd.org/OpenBSD-5.1/systrace.1\">systrace</a>. The paper describing the declarative language was presented at SAM 2003 in Las Vegas.</p>\n<blockquote>\n<p>&quot;Untrusted code&quot; is just as much a social problem as it\nis a technical problem. Looking for a complete solution\nis unrealistic: it is analogous to looking for a solution to\ncrime in general. With this in mind, we do not claim that\nour proposed framework is a panacea. However, although\na number of security problems remain (e.g. covert channel\nleakage), we claim that our system offers the potential to\nraise the security level of existing general purpose operating systems significantly.</p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/sam03-secpol-1",
      "title": "The Case for Abstracting Security Policies",
      "summary": "First academic paper on system call policy language to constrain application access to privileged resources, implemented on OpenBSD using systrace and presented at SAM 2003.",
      "date_published": "2003-06-01T00:00:00.000000Z",
      "date_modified": "2003-06-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "security",
        "openbsd",
        "kernel"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/sam03-secpol.pdf",
          "mime_type": "application/pdf",
          "title": "The Case for Abstracting Security Policies"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/opening-anil-recoil-org",
      "content_html": "<p>I've taken the opportunity to redesign my homepage and switch to its hopefully-permanent\nURL on <code>anil.recoil.org</code>. Many thanks to Jon Parise for giving me permission to base my\nHTML upon his homepage's, saving me lots of design trouble!</p>",
      "url": "https://anil.recoil.org/notes/opening-anil-recoil-org",
      "external_url": "https://web.archive.org/web/20031009224818/https://anil.recoil.org/",
      "title": "Moving to anil.recoil.org",
      "summary": "Redesigned homepage now at anil.recoil.org",
      "date_published": "2003-05-16T00:00:00.000000Z",
      "date_modified": "2003-05-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "selfhosting",
        "recoil"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/xen02-1",
      "content_html": "<p>The first technical report on the <a href=\"/projects/xen\">Xen Hypervisor</a> hypervisor is now available. I mainly\ncontributed to the early NetBSD port (but have run into a snag with the lack of\nlinear page tables in our paravirtual page implementation). This was the\ninitial documentation of <a href=\"/projects/xen\">Xen</a> at the Cambridge Computer Lab, describing the\nparavirtualization approach that would later become foundational to cloud\ncomputing.</p><h1>References</h1><ul><li>Barham et al (2003). Xen 2002. <a href=\"https://doi.org/10.48456/tr-553\" target=\"_blank\"><i>10.48456/tr-553</i></a></li></ul>",
      "url": "https://anil.recoil.org/notes/xen02-1",
      "title": "Xen 2002",
      "summary": "First technical report on Xen hypervisor with contributions to early NetBSD port.",
      "date_published": "2003-01-04T00:00:00.000000Z",
      "date_modified": "2003-01-04T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "xen",
        "cambridge",
        "computerlab",
        "systems"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.orghttps://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-553.pdf",
          "mime_type": "application/pdf",
          "title": "Xen 2002"
        }
      ],
      "_references": [
        {
          "url": "https://doi.org/10.48456/tr-553",
          "doi": "10.48456/tr-553",
          "cito": [
            "citesAsSourceDocument"
          ]
        }
      ]
    },
    {
      "id": "https://anil.recoil.org/notes/starting-phd",
      "content_html": "<p>I started my PhD at the Systems Research Group in Cambridge this week, based in\nthe <a href=\"https://www.cl.cam.ac.uk\">Computer Laboratory</a> and <a href=\"https://robinson.cam.ac.uk\">Robinson\nCollege</a>.  I'll still be working part-time at\n<a href=\"https://netapp.com\">NetApp</a>, but my primary focus will be on the <a href=\"/projects/xen\">Xen</a>\nhypervisor and other systems research topics.</p>",
      "url": "https://anil.recoil.org/notes/starting-phd",
      "external_url": "https://web.archive.org/web/20030218053705/http://www.cl.cam.ac.uk/Research/SRG/netos/people.html",
      "title": "Started PhD at Cambridge",
      "summary": "Started PhD in Systems Research at Cambridge University, focusing on Xen hypervisor.",
      "date_published": "2002-09-01T00:00:00.000000Z",
      "date_modified": "2002-09-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "xen",
        "phd",
        "cambridge",
        "computerlab",
        "systems"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/php-port-layout-openbsd",
      "content_html": "<p>I've committed a big improvement to the PHP port on OpenBSD, by switching from a complex set of FLAVOR tags over to a set of independently installing &quot;multi packages&quot;.</p>\n<p>The first thing I did was to import the core PHP without any extensions.</p>\n<pre><code>commit 15dc0f67ef5fd0cae9fb841e608b90b9f51c71ca\nAuthor: avsm &lt;avsm@openbsd.org&gt;\nDate:   Mon Jun 24 19:23:41 2002 +0000\n\n    Import php4-core-4.2.1\n    \n    Installs the barebones php4 with only the gettext, iconv and recode\n    modules compiled in.\n    \n    All of the other modules have to be installed as shared modules on\n    top of this.\n    \n    In addition to the Apache module, this package also includes a php\n    command-line binary which can be used in shell scripts.  The binary\n    uses the same /var/www/conf/php.ini file as the Apache module.\n    \n    There is some non-i386 breakage at the moment (notably macppc).\n</code></pre>\n<p>After that, I imported in the extensions system which has many more dependencies, and that generates the multi packages.</p>\n<pre><code>commit a5c226010f93bd3ce70667b801d6518354f44914\nAuthor: avsm &lt;avsm@openbsd.org&gt;\nDate:   Mon Jun 24 19:27:46 2002 +0000\n\n    Import php4-4.2.1 extensions\n    \n    This module generates a bunch of php4 extensions as shared modules,\n    and seperates them out into multiple packages.\n    \n    End result is that you can pkg_add individual modules now without\n    getting into the mess of flavors that we've had in the past.\n</code></pre>\n<p>This should make the use of <code>pkg_add</code> for PHP much simpler for new users. Any problems, please file a bug report or let me know.</p>",
      "url": "https://anil.recoil.org/notes/php-port-layout-openbsd",
      "title": "Streamlining PHP on OpenBSD",
      "summary": "Improved PHP on OpenBSD with simplified pkg_add usage and modular extensions.",
      "date_published": "2002-06-24T00:00:00.000000Z",
      "date_modified": "2002-06-24T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "openbsd",
        "php",
        "packaging"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/netapp-tr-3152-1",
      "content_html": "<p>After the <a href=\"/notes/mars-polar-lander\">Mars Polar Lander crashed</a>, I took a job at NetApp working as the\nproduct architect for <a href=\"https://en.wikipedia.org/wiki/NetCache\">NetCache</a>.  Among the hundreds\nof deployments that I help setup across the world, the most fun was figuring out how to scale\none of the biggest bands in the world at the time wanting to stream their concert live to a\nglobal audience.</p>\n<p>To pull this off, I moved to Sardinia where Tiscali was based, and worked with their team\nto integrate new support we added to NetCache for the RealPlayer RTSP protocol and extensions.\nThe concert ended up reaching a global audience of <a href=\"https://www.u2songs.com/news/the_history_mix_live_streams_of_u2_concerts\">5 million viewers</a> (huge at the time for streaming video across the Internet!) and\neventually well over 30 million watched the archived version.</p>\n<p>I wrote up a technical report at NetApp on setting up the CDN for the live streaming,\nwhich was picked up by a number of other streaming companies in the next few years.</p>",
      "url": "https://anil.recoil.org/notes/netapp-tr-3152-1",
      "title": "Streaming U2 live across the Internet",
      "summary": "Technical report on setting up CDN for U2's live concert stream from Sardinia, reaching 5 million viewers globally while working as NetApp product architect.",
      "date_published": "2002-04-01T00:00:00.000000Z",
      "date_modified": "2002-04-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "distributed",
        "internet",
        "streaming",
        "netapp",
        "caching",
        "italy"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/netapp-tr-3152.pdf",
          "mime_type": "application/pdf",
          "title": "Tiscali: How to build a Content Delivery Network"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/horde-cache",
      "content_html": "<p>While hacking on making Chora performant enough to work as the official PHP CVS web-viewer, I added in a general caching subsystem into the Horde PHP framework. Do let me know if you end up finding a use for it in your own applications.</p>\n<blockquote>\n<p>Add in a Cache framework for persistent storage and retrieval of cached\nobjects.  Consider it experimental for now.</p>\n<p>Basically works for Chora's needs ... implements a filesystem driver\nwhich tries to act sensibly (writes to a tmp file, then does an atomic\nrename to the cache object), to avoid synchronization issues.</p>\n<p>It does not cleanup the cached repository at the moment - needs to have\na garbage collection function done at some point.</p>\n<p><cite> -- <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20010820/003116.html\">Anil Madhavapeddy</a></cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/horde-cache",
      "title": "Added a caching subsystem to Horde",
      "summary": "Horde PHP framework now includes a general caching subsystem for improved performance.",
      "date_published": "2001-08-20T00:00:00.000000Z",
      "date_modified": "2001-08-20T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "horde",
        "storage",
        "email",
        "php"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/chora-live-on-php",
      "content_html": "<p>I spent a chunk of time through the year working on the <a href=\"https://horde.org\">Horde</a> project. I began when I got commit to <a href=\"https://www.horde.org/apps/imp/\">IMP webmail</a> to fix some bugs in the MIME rendering for our Recoil deployment. You can see my code commits on the <a href=\"https://marc.info/?a=97359997900001&amp;r=6\">horde-cvs</a> mailing list archive.</p>\n<p>After getting to grips with the PHP code, I then went on to totally rewrite the <a href=\"https://www.horde.org/apps/chora/\">Chora</a> version control viewer so that the CVS repositories for Horde could be browsed online instead of only via the command line.</p>\n<p>I'm extremely proud to report that the <a href=\"http://php.net\">PHP project</a> has <a href=\"https://lists.horde.org/archives/dev/Week-of-Mon-20010806/002886.html\">now deployed Chora</a> for production use to serve up <code>cvs.php.net</code>, making it our biggest user by far. Thanks for making my day, Rasmus!</p>\n<blockquote>\n<p>I switched Chora over to be the default web cvs system behind cvs.php.net\nnow.  The old viewcvs site is still available at viewcvs.php.net (dns may\nnot have updated yet)\n<cite> -- <a href=\"https://lists.horde.org/archives/dev/Week-of-Mon-20010806/002886.html\">Rasmus Lerdorf</a>, php.net</cite></p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/chora-live-on-php",
      "external_url": "https://lists.horde.org/archives/dev/Week-of-Mon-20010806/002886.html",
      "title": "Chora now the production CVS viewer for PHP",
      "summary": "Chora is now the production CVS viewer for PHP, replacing viewcvs on cvs.php.net.",
      "date_published": "2001-08-05T00:00:00.000000Z",
      "date_modified": "2001-08-05T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "email",
        "web",
        "horde",
        "selfhosting",
        "recoil",
        "php"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/chora-internationalised",
      "content_html": "<p>One of the coolest things about hacking on the <a href=\"https://horde.org\">Horde</a> framework is that it gives me lots of features for free that I can use in my web applications. The latest thing I added to the Chora CVS viewer today is the internationalisation framework, so that the frontend can be translated to multiple languages.</p>\n<p>I've added in a simple <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20010730/002975.html\">German translation</a> to start with, but please contribute your own strings if you get the opportunity.</p>",
      "url": "https://anil.recoil.org/notes/chora-internationalised",
      "title": "Added internationalisation to the Chora viewer",
      "summary": "Chora viewer now supports internationalisation with multiple language translations available.",
      "date_published": "2001-08-03T00:00:00.000000Z",
      "date_modified": "2001-08-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "horde",
        "php",
        "cvs"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/openfx",
      "content_html": "<p>Slashdot covers the GPL release of <a href=\"http://openfx.org\">OpenFX</a>, which I worked on with Stuart Ferguson (my brother's PhD supervisor in Queen's University Belfast).</p>\n<blockquote>\n<p>It has a renderer and raytrace engine, NURBS support, kinematics-based animation, morphing, a plugin API - and it's under the GPL. Currently only for Windows, but they're working on a Linux and FreeBSD port.</p>\n</blockquote>",
      "url": "https://anil.recoil.org/notes/openfx",
      "external_url": "https://tech.slashdot.org/story/01/02/10/0340210/gpled-3d-modeler-and-renderer",
      "title": "GPL release of OpenFX",
      "summary": "OpenFX is now available under the GPL with advanced features like raytracing and NURBS support.",
      "date_published": "2001-02-10T00:00:00.000000Z",
      "date_modified": "2001-02-10T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "3d",
        "openfx",
        "opensource",
        "graphics"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/commit-access-to-php",
      "content_html": "<p>I've been maintaining <a href=\"http://php.net\">PHP</a> on OpenBSD for a while now, including the core package distributed as binary packages.</p>\n<p>So as of today, the core team has decided I'm trustworthy enough to have my own commit bit to the central PHP repository, where I can commit code fixes and also maintain the <a href=\"https://www.php.net/manual/en/install.unix.openbsd.php\">OpenBSD on PHP</a> official instructions. You can contact me on <code>avsm@php.net</code> if you need any help!</p>",
      "url": "https://anil.recoil.org/notes/commit-access-to-php",
      "title": "I am now a core PHP developer",
      "summary": "I'm now a core PHP developer with commit access to the central repository.",
      "date_published": "2001-01-09T00:00:00.000000Z",
      "date_modified": "2001-01-09T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "php",
        "opensource",
        "horde",
        "openbsd"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/openbsd-developer",
      "content_html": "<p>I've been using OpenBSD for a few years now as the primary OS for <a href=\"\">##recoil</a>\nand have been contributing fixes and ports when I get a chance. So I'm\nincredibly excited to report that the project leader, Theo de Raadt, has\ninvited me to become an OpenBSD developer. I've registered my keys now, and\nwill be known as <code>avsm@openbsd.org</code>!</p>\n<p>My first commit is to start fixing up the PHP port, which I have been working\non in <a href=\"https://news-web.php.net/php.qa/652\">PHP-land</a> for a while now.</p>\n<pre><code>commit 93d5cc5ae56b22b19aa3bce34d38fa260b882d16\nAuthor: avsm &lt;avsm@openbsd.org&gt;\nDate:   Tue Dec 26 23:35:43 2000 +0000\n\n    - update to php-4.0.4\n    - bump NEED_VERSION\n    - no longer need extra distfile number4.tar.gz since it has\n      been integrated into the main distribution\n    - ltconfig, mysql socket patches are in main distribution now,\n      so they are removed.  Note that the ltconfig patch was only\n      applied to the 4_0_4 branch by the PHP team, so we will have\n      to resubmit it for the next version, unless libtool-cvs has\n      been updated with our information.\n    - Since php3/4 conflict with each other anyway, versioning is\n      not needed.\n</code></pre>\n<p>Many thanks to Jakob for the help with getting started, and for ok'ing my first commit.</p>",
      "url": "https://anil.recoil.org/notes/openbsd-developer",
      "title": "I'm now an OpenBSD developer",
      "summary": "I'm now an official OpenBSD developer, contributing fixes and ports.",
      "date_published": "2000-12-26T00:00:00.000000Z",
      "date_modified": "2000-12-26T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "openbsd",
        "opensource",
        "packaging",
        "recoil",
        "php"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/horde-developer",
      "content_html": "<p>After contributing some patches, I've now got the honour of becoming a <a href=\"https://www.horde.org/community/team\">core team</a> member of the <a href=\"https://horde.org\">Horde</a> project. Many thanks to Chuck Hagenbuch and Jon Parise for their trust in me!</p>\n<p>I'm planning on fixing bugs in IMP and the webmail subsystem, and am getting interested in version control and CVS as well, so I'm going to look at Whups and Chora ore. You can follow my commits on the <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20001016/author.html\">Horde-CVS</a> commit archives, where I am <code>avsm@horde.org</code>!</p>",
      "url": "https://anil.recoil.org/notes/horde-developer",
      "title": "I'm now a Horde core team member",
      "summary": "Author joins Horde core team, contributing to IMP and webmail development.",
      "date_published": "2000-10-16T00:00:00.000000Z",
      "date_modified": "2000-10-16T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "horde",
        "php",
        "email",
        "opensource"
      ],
      "attachments": [],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/netapp-tr-3071-1",
      "content_html": "<p>Although the Mars Polar Lander ended up <a href=\"https://en.wikipedia.org/wiki/Mars_Polar_Lander#See_also\">crashing</a>, the website itself was one of the busiest websites in the world at the time during the approach to the landing. I was the person handling the website architecture and the amazing <code>webmaster@mars.nasa.gov</code> account at the time. I worked closely <a href=\"/notes/mars-polar-lander\">with Sun</a> and NetApp and wrote up a technical report on how the Mars Polar Lander website acceleration architecture worked. The report detailed our distributed web site architecture using caching and load balancing to handle massive traffic spikes, with lessons applicable to designing scalable internet services. It was an early real-world example of content distribution networks and cacheability design principles.</p>",
      "url": "https://anil.recoil.org/notes/netapp-tr-3071-1",
      "title": "Paper on the NASA Mars Polar Lander website architecture",
      "summary": "Technical report on Mars Polar Lander website acceleration architecture, one of the busiest websites during approach to landing.",
      "date_published": "2000-07-01T00:00:00.000000Z",
      "date_modified": "2000-07-01T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "nasa",
        "mars",
        "space",
        "distributed",
        "web",
        "netapp"
      ],
      "attachments": [
        {
          "url": "https://anil.recoil.org/papers/netapp-tr-3071.pdf",
          "mime_type": "application/pdf",
          "title": "Application of a Distributed Web Site Acceleration: Mars Polar Lander"
        }
      ],
      "_references": []
    },
    {
      "id": "https://anil.recoil.org/notes/mars-polar-lander",
      "content_html": "<p>In my capacity as <a href=\"/papers/netapp-tr-3071\">webmaster</a> of the Mars Polar Lander, I submitted a note to Slashdot. Although our amazing distributed website took quite a beating (with some estimated 1 in 4 Internet users trying to access it simultaneously), the Lander itself <a href=\"https://www.wired.com/1999/12/mars-lander-wont-phone-home/\">sadly crashed</a>.  On the bright side, I got mentioned in a <a href=\"https://web.archive.org/web/20020106163651/http://www.sun.com/smi/Press/sunflash/1999-12/sunflash.991202.1.html\">Sun press release</a> because of the Sun Netra T1 servers they gave us to host the website!</p>\n<p>You can read more about the architecture behind the site in &quot;<a href=\"/papers/netapp-tr-3071\">Application of a Distributed Web Site Acceleration: Mars Polar Lander</a>&quot;.</p>",
      "url": "https://anil.recoil.org/notes/mars-polar-lander",
      "external_url": "https://science.slashdot.org/story/99/12/03/0755200/mars-polar-lander-lands-today",
      "title": "Slashdot covers the Mars Polar Lander",
      "summary": "Mars Polar Lander's webmaster shares story of website survival despite spacecraft crash.",
      "date_published": "1999-12-03T00:00:00.000000Z",
      "date_modified": "1999-12-03T00:00:00.000000Z",
      "authors": [
        {
          "name": "Anil Madhavapeddy",
          "url": "https://orcid.org/0000-0001-8954-2428",
          "avatar": "https://anil.recoil.org/images/anil-headshot.webp"
        }
      ],
      "tags": [
        "mars",
        "space",
        "distributed",
        "nasa",
        "netapp"
      ],
      "attachments": [],
      "_references": []
    }
  ]
}