Available · MPhil · 2026 · KC Sivaramakrishnan

An io_uring IO implementation for Lean

Modern operating systems are increasingly moving away from the syscall-at-a-time POSIX model towards shared-memory submission/completion queues like Linux's io_uring, Windows IoRing and macOS' various async interfaces. In my VMIL 2025 keynote at ICFP/SPLASH, I argued that it's time for language runtimes to treat these shared-memory rings as first-class citizens, with POSIX relegated to a compatibility backend.

Lean 4 is an interesting target for this kind of work because it is simultaneously a proof assistant and a practical programming language with its own runtime. Its IO story today is comparatively thin though as it leans (pun intended) on its C runtime for blocking file and socket operations. There is room to design a principled, verified async IO layer from scratch rather than retrofitting one.

This project would design and implement an io_uring-backed IO subsystem for Lean. Most of the work will be specifying the submission/completion queue pair: a bounded, fixed-capacity, lock-free ring shared between a single producer (the Lean program submitting work) and a single consumer (the kernel completing it), with memory-ordering semantics that must be respected exactly for correctness. The student would:

  1. Model the ring structure. Define the submission queue (SQ) and completion queue (CQ) as Lean data structures, with explicit head/tail indices, entry layouts, and the subset of io_uring opcodes needed for a useful demo (read, write, accept, recv, send, close).
  2. Specify concurrent semantics. Lean's existing concurrency story is mostly sequential IO with a few task primitives. The project will almost certainly require defining novel abstractions for shared-memory concurrency like monotonic indices, release/acquire fences, and a story for reasoning about interleavings between kernel and user space.
  3. Wire it into the Lean runtime. Use Lean's FFI to mmap the ring buffers, register file descriptors and drive io_uring_enter, exposing a clean high-level API (ideally something close to Eio's Flow.copy) that hides the ring machinery behind typed resource handles.
  4. Build a demo. A small HTTP echo server or parallel file copy that demonstrates measurable throughput wins over a naive blocking-IO Lean baseline, like the ocaml-uring parallel cp example.

The ambitious stretch goal is to prove the ring's key invariants inside Lean itself. There is growing expertise in Ilya Sergey's group at NUS (more here) on mechanised metatheory for concurrent data structures to ask for help!

A successful project would produce a working io_uring backend usable from real Lean programs, a clean specification of the ring's concurrent semantics, benchmarks against the existing Lean IO path, and as a stretch a machine-checked proof that the ring never loses or duplicates submissions under the single-producer/single-consumer discipline.