Macro- and Micro-benchmarking in OCaml

This idea was proposed in 2012 as a Cambridge Computer Science Part II project and has been completed by Sebastian Funk. It was supervised by Anil Madhavapeddy as part of the OCaml Labs project.

Summary

Benchmarking is the measurement of statistics such as run time, memory allocations, and garbage collections in a running program in order to analyze its performance and behaviour. Scientifically evaluating and understanding the performance of a program often involves a cycle of:

  1. making performance observations about the program
  2. finding a potential hypothesis, i.e. a cause for this performance behaviour
  3. making predictions on experiments based on this hypothesis
  4. comparing the predictions against the actual benchmark results to evaluate the hypothesis.

Doing all of this requires an effective and robust framework that makes these observations continuously and is not biased by the choice of hypothesis or by the observations themselves. More generally, any performance improvement relies on robust and precise measurements.
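As an illustration of the kind of observation involved, the following is a minimal sketch that uses only the OCaml standard library. The `observation` record and the `observe` helper are hypothetical names introduced here; they are not part of the project or of any benchmarking library.

```ocaml
(* Hypothetical helper, for illustration only: record run time and GC
   activity around a single call of [f]. *)
type observation = {
  seconds : float;          (* processor time, as reported by Sys.time *)
  minor_words : float;      (* words allocated in the minor heap *)
  major_collections : int;  (* major GC cycles completed during the call *)
}

let observe f =
  let w0 = Gc.minor_words () in
  let g0 = (Gc.quick_stat ()).Gc.major_collections in
  let t0 = Sys.time () in
  let result = f () in
  let t1 = Sys.time () in
  let w1 = Gc.minor_words () in
  let g1 = (Gc.quick_stat ()).Gc.major_collections in
  ( result,
    { seconds = t1 -. t0;
      minor_words = w1 -. w0;
      major_collections = g1 - g0 } )

let () =
  let _, obs = observe (fun () -> List.init 100_000 (fun i -> i * 2)) in
  Printf.printf "time: %.6fs, minor words: %.0f, major GCs: %d\n"
    obs.seconds obs.minor_words obs.major_collections
```

A single observation like this is noisy, and the measurement itself allocates a little, which is one reason a more careful framework is needed.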

Benchmarking can be split into two perspectives: micro-benchmarking, which measures a single (small) function repeatedly to collect statistics for a regression, and macro-benchmarking, which measures the performance of a complete program or library, often in a single run. This project aims to improve the benchmarking infrastructure in OCaml for both micro- and macro-benchmarking.
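The micro-benchmarking idea can be sketched with the standard library alone: time batches of increasing size, then fit a line to (runs, time) so that the slope estimates the cost of one run and the intercept absorbs fixed overhead. The names `fit_line`, `time_batch` and `micro_bench` are assumptions made for this sketch, and it uses a single predictor (the number of runs) rather than the multivariate regression discussed below; it is not the Core Bench API.

```ocaml
(* Ordinary least squares for y = intercept + slope * x.
   Returns (intercept, slope). *)
let fit_line points =
  let n = float_of_int (List.length points) in
  let sum f = List.fold_left (fun acc p -> acc +. f p) 0.0 points in
  let mean_x = sum fst /. n and mean_y = sum snd /. n in
  let sxy = sum (fun (x, y) -> (x -. mean_x) *. (y -. mean_y)) in
  let sxx = sum (fun (x, _) -> (x -. mean_x) ** 2.0) in
  let slope = sxy /. sxx in
  (mean_y -. slope *. mean_x, slope)

(* Time [runs] consecutive calls of [f] as one batch. *)
let time_batch f runs =
  let t0 = Sys.time () in
  for _i = 1 to runs do
    (* opaque_identity keeps the compiler from discarding the call *)
    ignore (Sys.opaque_identity (f ()))
  done;
  Sys.time () -. t0

(* Measure several batch sizes and regress total time on batch size. *)
let micro_bench ?(batch_sizes = [ 100; 200; 400; 800; 1600; 3200 ]) f =
  let points =
    List.map (fun runs -> (float_of_int runs, time_batch f runs)) batch_sizes
  in
  fit_line points

let () =
  let data = List.init 1_000 (fun i -> i) in
  let overhead, per_run =
    micro_bench (fun () -> List.fold_left ( + ) 0 data)
  in
  Printf.printf "fixed overhead: %.3e s, per-run cost: %.3e s\n"
    overhead per_run
```

Fitting a slope across batch sizes, rather than dividing one total by the run count, keeps constant measurement overhead out of the per-run estimate.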

The project aims to add event tracing to OCaml by adding instrumentation to the Core Bench library using Camlp4. Event tracing then serves as the tool for macro-benchmarking, alongside multivariate regression for micro-benchmarking. Together they are used to analyze the performance of commonly used libraries, and to expose and explain abnormalities and performance differences between implementations. On a meta-level, this study will give insight into which predictors are useful for a multivariate regression, and in which circumstances they provide interesting results, as well as into how event tracing can be used efficiently and compactly in large libraries.
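To make the notion of event tracing concrete, here is a hand-written sketch of what instrumented code might record: an enter and an exit event, each with a timestamp, around every traced call. The `event` type, `trace_log` and `trace` are hypothetical names for this illustration. The project's tracer generated its instrumentation with Camlp4, whereas this sketch wraps calls by hand, and its event format is an assumption rather than the project's own.

```ocaml
(* Hypothetical event format, for illustration only. *)
type event = { name : string; phase : [ `Enter | `Exit ]; time : float }

(* Recorded events, newest first. *)
let trace_log : event list ref = ref []

let record name phase =
  trace_log := { name; phase; time = Sys.time () } :: !trace_log

(* Run [f], logging an event on entry and on exit (even if [f] raises). *)
let trace name f =
  record name `Enter;
  Fun.protect ~finally:(fun () -> record name `Exit) f

let rec fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

let () =
  let result = trace "outer" (fun () -> trace "fib" (fun () -> fib 20)) in
  Printf.printf "result: %d\n" result;
  List.iter
    (fun e ->
      let phase = match e.phase with `Enter -> "enter" | `Exit -> "exit" in
      Printf.printf "%-5s %-5s at %.6fs\n" phase e.name e.time)
    (List.rev !trace_log)
```

Pairing each enter with its exit gives the time spent in every traced region of a complete program run, which is the macro-benchmarking view described above.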

Links

The dissertation is available to students on request from Anil Madhavapeddy, but it is not online anywhere. The source code (a Camlp4 event tracer) has been superseded by modern event tracing.

Sebastian Funk went on to work at Jane Street on OCaml after his project, and one 2019 talk on his subsequent work can be seen below.

Related Ideas