The source of the instability was hard to track down, but it seems to be the web-server's automatic spawning of FastCGI processes, combined with lighttpd failing to handle a SIGCHLD somewhere when a child process crashes. To sort this out, I converted all the Ruby on Rails setups (this blog and Nick’s) to use externally spawned FastCGI processes.
That leaves only our Mercurial vhost, hg.recoil.org, to switch to FastCGI. I couldn’t find a module for this anywhere, so I lashed up some Python glue to do the job.
You can download the small distribution for Mercurial 0.9 (hg-fcgi-0.9.tar.gz). It contains a FastCGI library written by someone else, the Python files that glue the Mercurial and FastCGI libraries together, and a simple rc script to launch the external web process. The instructions are for lighttpd: install the Python files somewhere, modify them to point to the Mercurial directory, run the rc script to start the daemon, and then add something similar to the following to your lighttpd config file:
fastcgi.server = (
  ".fcgi" => ( "localhost" => (
    "socket" => "/var/cache/fcgi/sites/hg.recoil.org/dirsock"
  )),
  ".hg" => ( "localhost" => (
    "socket" => "/var/cache/fcgi/sites/hg.recoil.org/sock"
  )),
)
Also add “index.fcgi” to index-file.names in the config file, and touch it in the vhost directory to create an empty file (this is to avoid getting a 404 error and instead pass it through to the FastCGI process). Similarly, touch a .hg file for every repository you want to serve. You could do this differently by passing through a URL prefix and modifying the Python appropriately, but I prefer finer control over what we’re serving.
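The touch step is easy to script if you serve more than a couple of repositories. This is only a rough sketch and not part of the hg-fcgi distribution: the helper name touch_placeholders and the repository names are made up, it is written for Python 3, and it assumes each placeholder is an empty file named after the repository with a .hg extension sitting in the vhost directory, so that it matches the ".hg" entry in fastcgi.server.

```python
import os
import tempfile


def touch_placeholders(docroot, repo_names):
    """Create empty index.fcgi and <name>.hg files in docroot.

    Hypothetical helper: the <name>.hg naming is an assumption about
    how the placeholder files are laid out in the vhost directory.
    Returns the list of paths that were touched.
    """
    touched = []
    filenames = ["index.fcgi"] + [name + ".hg" for name in repo_names]
    for filename in filenames:
        path = os.path.join(docroot, filename)
        # Append mode creates a missing file and never truncates an
        # existing one, which is close to what touch(1) does.
        with open(path, "a"):
            pass
        touched.append(path)
    return touched


if __name__ == "__main__":
    # Try it against a throwaway directory rather than a live vhost.
    with tempfile.TemporaryDirectory() as docroot:
        touch_placeholders(docroot, ["project-a", "project-b"])
        print(sorted(os.listdir(docroot)))
```

Point docroot at the real vhost directory once the listing looks right. Because the files are opened in append mode, re-running it after adding a repository leaves the existing placeholders alone.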
Hope this is useful. I won’t bother submitting it back to the Mercurial list, as it looks like the official hg repo has a different code layout; I’ll check it out later on when I have a bit more time and integrate it properly.
I have no idea whether or not this will actually improve our stability, but it’s at least easier to move onto a different web-server now that everything is FastCGI. All I need now is an OpenBSD/php5-fastcgi port, which doesn’t seem to exist (yet).