home Anil Madhavapeddy, Professor of Planetary Computing  

Programming FPGAs using OCaml / Feb 2025

With the vast amount of data we have these days for our planetary computing processing, it's naturally tempting to use more hardware offload. The obvious choice, GPGPUs, are not a great fit for the problem due to the difficulty of unlocking high data parallelism for geospatial data. So it's back to an old technology I worked on twelve years ago in the form of FPGAs!

FPGAs are a very flexible way to execute boolean combinatorial logic, but are notoriously difficult to program. We have two possible angles to explore to address this. One is to design more declarative DSLs for data processing that compile to the FPGAs, such as Michael Dales work on Yirgacheffe or Omar Tanner's work on in-memory compressive computation. The other angle is to work on the low-level API to programming the FPGAs, to get away from Verilog and program in our favourite high-level language...OCaml! KC Sivaramakrishnan and I have started making a list of resources for programming FPGAs in OCaml for our own education.

HardCaml was originally a side project by Andy Ray. He gave a great presentation about it at ORConf 2015. Later on in the project's lifecycle, he moved it to being maintained by Jane Street, where is used in production and is open source. The first two resources to learn about HardCaml are to listen to the Signals and Threads episode with Andy, and then to read the 2023 paper:

Unlike high level synthesis (HLS), Hardcaml allows for low level control of the underlying hardware for maximum productivity, while abstracting away many of the tedious aspects of traditional hardware definition languages (HDLs) such as Verilog or VHDL. The richness of OCaml’s type system combined with Hardcaml’s fast circuit elaboration checks reduces the chance of user-introduced bugs and erroneous connections with features like custom type defining, type-safe parameterized modules and elaboration-time bit-width inference and validation.

Hardcaml tooling emphasizes fast feedback through simulation, testing, and verification. It includes both a native OCaml cycle-accurate and an event-driven simulator. Unit tests can live in the source code and include digital ASCII waveforms representing the simulator’s output. Hardcaml also provides tools for SAT proving and formal verification. Hardcaml is industrially proven, and has been used at Jane Street internally for many large FPGA designs.

Let's look at the source code repository next to see some more code. HardCaml is easily installable via opam, so there appears to be few barriers to getting the software up and running. For the development lifecycle, there are a few other packages to ease the interfacing with the FPGA hardware:

  • Hardcaml_waveterm is a terminal-based digital waveform viewer. These are usable in expect tests or from an interactive terminal application. I love a good terminal user interface, particularly now that I've shifted to Ghostty with extremely good UTF-8 and colour support, so this is a very good sign.
  • Hardcaml_c then converts a Hardcaml design over to C, where it can be compiled into a cycle-accurate simulation model and Hardcaml_verilator does the same except for the open-source verilator Verilog emulator.

Let's look at some examples. There is a hardcaml_circuits repository with some interesting designs in HardCaml. Picking some at random:

  • There's a sorting network that arranges a fixed configuration of compare-and-swaps to sort data. The network's structure is static (so it can be implemented easily in hardware), but the library abstracts its implementation to allow plugging in different compare-and-swap and data structures. Looking at the OCaml interface, it's an OCaml functor over the compare-and-swap function, and has implementations in the module for a merge sort and a bitonic merge. This is already quite instructive to compare vs a software implementation, as for my Foundations of CS course where I teach merge strategies quite early on.
  • For floating point calculations, we generally do CORDIC algorithms which perform vector rotations iteratively to solve trig functions. The cordic.mli interface here is very readable, with nice use of OCaml features such as algebraic data types to express the equations themselves. The implementation of arctan shows how elegantly the OCaml implementation expresses the CORDIC equation as a higher level function.

Is HardCaml worth learning?

I was curious to see what HardCaml's been used for recently. Most notably, it took home awards at the ZPrize in 2022, winning the multi-scalar multiplication track. So this thing is right up there with other HDLs in terms of producing high performing circuits!

There are two good blog posts about each of the implementations:

The web-based waveform view for the NTT transformer
The web-based waveform view for the NTT transformer

More relevantly to my interested in geospatial processing, there is a JPEG decoder in HardCaml which looks rather exciting. It implements the JPEG baseline profile with arbitrary huffman tables for encoding, along with a more work-in-progress decoder. A GeoTIFF implementation would be a fun starter project to port to HardCaml!

Some ideas for student projects

Moving on from prizes, there is also a MIPS processor in HardCaml designed by a couple of students at Penn State. They've also written a series of great blog posts about their adventures in learning HardCaml as students.

Sadiq Jaffer and I have also been discussing the possibility of using computational SSDs to accelerate vector databases, which would be a game-changer for the huge datasets we're throwing around at the moment.

I'm going to continue to explore this further, and will update this note with any more resources I found. Please do send me any ideas you might have! (Update 2025/02/07): Immediately after posting this, two interesting responses came up:

The VGA controller is here and the hardcaml output works nicely with yosys and open lane tooling and verilator. So far it seems to work in simulation and on an FPGA (output recording video, see bottom of this on how it got recorded).

Yet to find out whether it'll work in a physical chip (they say the tape out will be done in April). I particularly like the waveforms in source code for unit test (see the above VGA example).

Thanks also to Andy Ray for feedback and corrections to this post.

# 7th Feb 2025   iconnotes embedded fpga hardware networking ocaml

Related News