/ Ideas / Functional ABNF parser generators

This is an idea proposed in 2011 as a Cambridge Computer Science Part II project, and has been completed by Nicholas Skehin. It was supervised by Anil Madhavapeddy as part of my Functional Internet Services project.

Summary

Writing internet servers is a difficult proposition. On some levels it seems as though we haven’t made much progress since the 1970s, as popular servers such as Apache and nginx for HTTP, BIND for DNS and qmail for IMAP for many Internet protocols still tend to be written in C. While it is not impossible to write robust software in C, it does tend to be extremely difficult and almost all of the above have suffered from their fair share of security vulnerabilities. With the advent of higher level programming languages, this does not need to be the case any longer. Modern functional languages such as OCaml and Haskell can be competitive performance-wise with C on many workloads. In many cases their emphasis on purity where possible comes with significant benefits when moving towards an environment where concurrent execution is the norm rather than the exception.

This project aimed to build a functional parser for the IMAP email protocol in OCaml, and to compare its performance and flexibility against a C-based parser. IMAP is a very complex protocol with many quirks and has endured several buggy implementations through the years on both the server and the client side. Since writing a parser for IMAP by hand was going to be tedious and error prone, this project focusses on how better tooling to make writing parsers for internet servers a more manageable and pain-free experience. Specifically, it investigated writing ABNF generators for OCaml, since IMAP was already specified using that.

Related Reading

Links

The dissertation PDF isn't available publically but should be in the Cambridge Computer Lab archives somewhere. The ABNFComp tool that was built is also available on request from the author, but not published.

Related Ideas