<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>No Content, No Fuss: Category recoil</title>
    <link>http://anil.recoil.org/blog/articles/category/recoil</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Anil Madhavapeddy</description>
    <item>
      <title>Trac spamming is taking down Melange</title>
      <description>&lt;p&gt;After loads of reports that the &lt;a href="http://melange.recoil.org/"&gt;Melange&lt;/a&gt; site keeps going 505 and crashing, I took a look at why.  Turns out several spam crawlers were going mental and repeatedly adding tickets with spam links.  Around 600 tickets and thousands of comments later, the process decided it had enough and terminated.&lt;/p&gt;

&lt;p&gt;I've deleted all tickets (even the valid ones were spammed into oblivion) and turned off comment creation and modification for anonymous users.  A look on the &lt;a href="http://trac.edgewall.org/wiki"&gt;Trac Wiki&lt;/a&gt; shows that there are some &lt;a href="http://trac.edgewall.org/wiki/SpamFilter"&gt;SpamFilter&lt;/a&gt; extensions being developed which I'll investigate at some point.&lt;/p&gt;</description>
      <pubDate>Mon, 02 Apr 2007 11:15:00 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:db28ddc2-d81c-45ff-95a3-5a280fc78ba5</guid>
      <author>anil@recoil.org (Anil Madhavapeddy)</author>
      <link>http://anil.recoil.org/blog/articles/2007/04/02/trac-spamming-is-taking-down-melange</link>
      <category>recoil</category>
    </item>
    <item>
      <title>Google Webmaster tools</title>
      <description>&lt;p&gt;The conversion of the Recoil web services to external FastCGI pinned our &lt;a href="http://trac.edgewall.org/"&gt;Trac&lt;/a&gt; installation at &lt;a href="http://melange.recoil.org/"&gt;Melange&lt;/a&gt; as the source of the CPU hogging.  It turned out the Google crawler was indexing the entire source tree via Trac, causing it to go ballistic.&lt;/p&gt;

&lt;p&gt;I then stumbled on the latest cool Googlism: the &lt;a href="https://www.google.com/webmasters/tools/"&gt;Google Webmaster Tool&lt;/a&gt;, which lets you register your sites and displays options, diagnostics and statistics about how the Google crawler views your website.
I turned down the frequency at which Google hits the Trac installation (as well as installing a suitable &lt;a href="http://melange.recoil.org/robots.txt"&gt;robots.txt&lt;/a&gt; file).  This solved the immediate problem, but some of the search statistics were fun to check out as well.&lt;/p&gt;

&lt;p&gt;It turns out the &lt;a href="http://anil.recoil.org/gallery/"&gt;gallery&lt;/a&gt; is pretty highly ranked for &lt;a href="http://images.google.com/"&gt;image searches&lt;/a&gt;.  My trips to Japan seem to have made it big, with popular searches including "&lt;a href="http://images.google.com/images?q=shibuya&amp;amp;hl=en&amp;amp;ie=UTF-8&amp;amp;oe=UTF-8&amp;amp;sa=N&amp;amp;tab=wi"&gt;Shibuya&lt;/a&gt;", "&lt;a href="http://images.google.com/images?q=tokyo%20at%20night&amp;amp;hl=en&amp;amp;ie=UTF-8&amp;amp;oe=UTF-8&amp;amp;sa=N&amp;amp;tab=wi"&gt;tokyo at night&lt;/a&gt;", and "&lt;a href="http://images.google.com/images?q=japanese%20roof&amp;amp;hl=en&amp;amp;ie=UTF-8&amp;amp;oe=UTF-8&amp;amp;sa=N&amp;amp;tab=wi"&gt;japanese roof&lt;/a&gt;".  My random pictures of &lt;a href="http://images.google.com/images?q=buffalo%20india&amp;amp;hl=en&amp;amp;ie=UTF-8&amp;amp;oe=UTF-8&amp;amp;sa=N&amp;amp;tab=wi"&gt;Indian buffaloes&lt;/a&gt;, &lt;a href="http://images.google.com/images?svnum=10&amp;amp;hl=en&amp;amp;lr=&amp;amp;q=smoggy+skyline&amp;amp;btnG=Search"&gt;smoggy skylines&lt;/a&gt; and &lt;a href="http://images.google.com/images?q=fried%20ice%20cream&amp;amp;hl=en&amp;amp;ie=UTF-8&amp;amp;oe=UTF-8&amp;amp;sa=N&amp;amp;tab=wi"&gt;fried ice-cream&lt;/a&gt; seem especially popular as well.  It's a weird old Internet eh?&lt;/p&gt;

&lt;p&gt;The gallery has fallen a bit by the wayside in recent months.  I'll update it when I get back to Cambridge!&lt;/p&gt;</description>
      <pubDate>Thu, 28 Dec 2006 15:04:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:ae998dee-78ec-4f05-b751-1ff800d3f880</guid>
      <author>anil@recoil.org (Anil Madhavapeddy)</author>
      <link>http://anil.recoil.org/blog/articles/2006/12/28/google-webmaster-tools</link>
      <category>travel</category>
      <category>recoil</category>
      <category>net</category>
    </item>
    <item>
      <title>Looking at my Spam statistics</title>
      <description>&lt;p&gt;The switch to &lt;a href="http://smtpd.develooper.com/"&gt;qpsmtpd&lt;/a&gt; does seem to have reduced my spam intake somewhat, so out of curiosity I looked at the statistics from 2 years of &lt;a href="http://www.procmail.org/"&gt;procmail&lt;/a&gt; logs to see what's been happening in terms of filtering effectiveness.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://anil.recoil.org/blog/files/mailstats-dec2006.png" rel="lightbox" title="Ham/Spam stats for 2004-2006"&gt;&lt;img style="float:right" src="/blog/files/mailstats-dec2006-thumb.png" alt="mlgalleryedit" /&gt; &lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A quick &lt;a href="http://www.openbsd.org/cgi-bin/cvsweb.cgi/ports/mail/p5-Log-Procmail"&gt;import and bug-fix&lt;/a&gt; of &lt;a href="http://search.cpan.org/dist/Log-Procmail/"&gt;Log::Procmail&lt;/a&gt; into OpenBSD, and some lashed up Perl and gnuplot later, the graph on the right showed up.  The red and green are ham and spam respectively, as classified by &lt;a href="http://www.spamassassin.org/"&gt;SpamAssassin&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The large amount of ham in 2004 was not actually real mail, but mostly postmaster bounces from forged spam; I am currently forced to destroy all domain bounces without even reading them due to the sheer volume.  This is something that &lt;a href="http://www.openspf.org/"&gt;Sender Permitted From&lt;/a&gt; promises to help solve once we determine if any of our users send &lt;code&gt;@recoil.org&lt;/code&gt; mail from sources other than our mail server.&lt;/p&gt;

&lt;p&gt;Since the turn of this year the amount of spam has jumped, but more concerningly, SpamAssassin has been missing increasing amounts, and it's been flowing through straight to my Inbox (despite &lt;a href="http://wiki.apache.org/spamassassin/RuleUpdates"&gt;sa-update&lt;/a&gt; running daily).  I'm going to do these graphs again in a few months and see just how much the switch to the new paranoid SMTP has helped.&lt;/p&gt;</description>
      <pubDate>Wed, 27 Dec 2006 00:01:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:e79fc438-9464-4f07-9476-5b96fad27392</guid>
      <author>anil@recoil.org (Anil Madhavapeddy)</author>
      <link>http://anil.recoil.org/blog/articles/2006/12/27/looking-my-spam-statistics</link>
      <category>recoil</category>
      <category>net</category>
    </item>
    <item>
      <title>Christmas Spam Cleanup</title>
      <description>&lt;p&gt;It's Christmas Day, I've eaten far too much, and am lounging around doing the now-traditional Annual Recoil Cleanup as the year's todo list has grown ever larger.  I've been meaning to switch from our venerable &lt;a href="http://cr.yp.to/qmail.html"&gt;qmail-smtpd&lt;/a&gt; for some years now, and finally made the move over to &lt;a href="http://smtpd.develooper.com/"&gt;qpsmtpd&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;qpsmtpd is a drop-in replacement for the SMTP portion of qmail, and is written in Perl with a number of plug-ins which let us increase our paranoia levels considerably.  It's a pity we have to do this, but the policy of 'accept anything' has been under increasing stress for the last few years, and when I looked at my e-mail stats last night, I realised over 99.99% of my incoming e-mail was some kind of virus or spam.  Even a 1% miss rate on SpamAssassin is enough to chuck 100s of mails into my inbox!&lt;/p&gt;

&lt;p&gt;So now the new e-mail setup at Recoil includes virus scanning via the wonderful &lt;a href="http://www.clamav.net/"&gt;clamav&lt;/a&gt;, reverse DNS RBL lookups via &lt;a href="http://www.rfc-ignorant.org"&gt;rfc-ignorant.org&lt;/a&gt;, and even early-chatter detection of viruses which blindly blast messages before the initial SMTP greeting has completed.  I'm hoping to enable global &lt;a href="http://www.spamassassin.org/"&gt;SpamAssassin&lt;/a&gt; checking soon if all else is stable and I don't get bleating about missing mail from our users.&lt;/p&gt;

&lt;p&gt;I played with &lt;a href="http://greylisting.org/"&gt;Greylisting&lt;/a&gt; as well to see if it had improved from my &lt;a href="http://anil.recoil.org/blog/articles/2004/07/29/playing-with-spammers"&gt;earlier experiments&lt;/a&gt; a couple of years ago.  Unfortunately, it still looks as if there are many broken MTAs out there which don't cope well with rejection, and manual whitelists are required, which sounds a bit unreliable for setups like ours which sometimes don't get looked at for years on end (ahem).&lt;/p&gt;

&lt;p&gt;So it's with a tear in my eye that I wave goodbye to &lt;em&gt;qmail-smtpd&lt;/em&gt;, the first ever network-facing service deployed on Recoil back in 1998, and incredibly, the only one I've never had to upgrade in the 8 years since.&lt;/p&gt;</description>
      <pubDate>Mon, 25 Dec 2006 23:35:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:9f3dd154-e452-4519-a627-989a1d2d282b</guid>
      <author>anil@recoil.org (Anil Madhavapeddy)</author>
      <link>http://anil.recoil.org/blog/articles/2006/12/25/christmas-spam-cleanup</link>
      <category>recoil</category>
    </item>
    <item>
      <title>Migrating blog to Typo</title>
      <description>&lt;p&gt;I've migrated away from my custom-written blog (farewell &lt;a href="http://anil.recoil.org/blog/articles/2005/05/23/arise-mlblog"&gt;mlblog&lt;/a&gt;!) to the very nice &lt;a href="http://typosphere.org/"&gt;Typo&lt;/a&gt; based on Rails.  This is mainly because I don't have the time to hack together all of the XML-RPC functions required to support the &lt;a href="http://www.sixapart.com/movabletype/"&gt;MovableType API&lt;/a&gt; in OCaml, and the live-search feature now present in the sidebar is very cool!&lt;/p&gt;

&lt;p&gt;To summarise the last six months, I submitted my PhD, broke my knee, went to the OpenBSD hackathon, fixed my knee, released &lt;a href="http://melange.recoil.org/"&gt;some OCaml code&lt;/a&gt; (still very low-key as it needs a lot of polish) and am happily working at &lt;a href="http://xensource.com/"&gt;XenSource&lt;/a&gt; in Cambridge now!  PhD viva is next Friday...!&lt;/p&gt;</description>
      <pubDate>Fri, 30 Jun 2006 19:25:24 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:352454a6-e51a-467e-bf8d-865d7967cc0b</guid>
      <author>anil@recoil.org (Anil Madhavapeddy)</author>
      <link>http://anil.recoil.org/blog/articles/2006/06/30/migrating-blog-to-typo</link>
      <category>cambridge</category>
      <category>recoil</category>
      <trackback:ping>http://anil.recoil.org/blog/articles/trackback/73</trackback:ping>
    </item>
    <item>
      <title>Arise, MLBlog</title>
      <description>&lt;p&gt;
I got tired of trying to combine the various blogging and gallery tools
into something that did what I wanted: take a simple directory of images,
blog entries, links, papers and output a nice HTML/RSS version of the
directory.  So I hacked up a quick blog tool in OCaml that does the trick,
and put it live.
&lt;/p&gt;

&lt;p&gt;
It's got quite a few rough edges at the moment, especially to do with
the lack of date archives and the large number of images on the front page
of the &lt;a href="http://anil.recoil.org/gallery/"&gt;gallery&lt;/a&gt;.
Note that the location of the RSS links has changed as well, as I've
switched to using &lt;a href="http://www.nongnu.org/ocamlrss"&gt;ocaml-rss&lt;/a&gt; for outputting
it instead of the homebrew format used before in blosxom.
One thing that has come out really well is the use of a flat tag namespace instead of the previous directory structure; it allows me to share stories and images among multiple categories without needing symlink hacks.
&lt;/p&gt;</description>
      <pubDate>Mon, 23 May 2005 21:00:04 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:3e604f8d-7eba-4a06-a997-3185d8790bb5</guid>
      <author>avsm</author>
      <link>http://anil.recoil.org/blog/articles/2005/05/23/arise-mlblog</link>
      <category>hacking</category>
      <category>recoil</category>
    </item>
    <item>
      <title>Grand Recoil update done</title>
      <description>&lt;p&gt;Phew!  The big re-arrangement to decommission the venerable F630 &lt;a href="http://www.netapp.com/"&gt;Netapp&lt;/a&gt; is finally pretty much done.  Now we have a new beefy &lt;a href="http://fork.recoil.org"&gt;fork&lt;/a&gt;, a dual-Opteron screaming along as the main mail and CVS server (running the &lt;a href="http://www.openbsd.org/amd64.html"&gt;OpenBSD/amd64&lt;/a&gt; port in SMP mode).  &lt;a href="http://quick.recoil.org/"&gt;Quick&lt;/a&gt; takes up the web serving duties, running &lt;a href="http://www.acme.com/software/thttpd/"&gt;thttpd&lt;/a&gt; and &lt;a href="http://www.apache.org/"&gt;Apache&lt;/a&gt;.  Hidden away from public glare is "chunk", serving up 400Gb of storage for backups and "multimedia content" to both machines.&lt;/p&gt;

&lt;p&gt;The speed increase from removing NFS from the equation has been pretty incredible.  &lt;a href="http://cr.yp.to/proto/maildir.html"&gt;Maildir&lt;/a&gt; is a nice format, but it really needs a top-notch NFS client to avoid dying from &lt;a href="openbsd2"&gt;opendir&lt;/a&gt; overload when users have tens of thousands of mails in a single folder.  Not to mention the locking headaches that NFS introduces as well (on local disks, &lt;a href="http://dovecot.org/"&gt;Dovecot&lt;/a&gt; can now use its index files much more effectively).  Overall, I'm pretty happy with a local disk / rsync combination until some maniac steps up to improve OpenBSD's NFS client to Solaris or FreeBSD levels.&lt;/p&gt;

&lt;p&gt;On the software side, things have really improved regarding 64-bit cleanliness, and only one package, &lt;a href="http://courier.sf.net/"&gt;maildrop&lt;/a&gt;, had a problem - it uses C++ exceptions which aren't supported by the amd64 toolchain on OpenBSD yet.  A swift switch to &lt;a href="http://www.procmail.org/"&gt;procmail&lt;/a&gt; later, all worked peachily.  &lt;a href="http://cr.yp.to/qmail.html"&gt;qmail&lt;/a&gt; doesn't compile out of the box, but the excellent &lt;a href="http://www.qmail.org/netqmail/"&gt;netqmail&lt;/a&gt; patchset integrates fixes for this.  Only outstanding task is to bootstrap &lt;a href="http://www.polstra.com/projects/freeware/ezm3/"&gt;ezm3&lt;/a&gt; on amd64 in order to run &lt;a href="http://www.cvsup.org/"&gt;CVSup&lt;/a&gt; (or perhaps &lt;a href="http://www.cvsync.org/"&gt;cvsync&lt;/a&gt; would be simpler).&lt;/p&gt;
      <pubDate>Mon, 30 Aug 2004 12:59:32 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:c1cf8c6b-cd13-4282-9889-2dadc1beb7bc</guid>
      <author>avsm</author>
      <link>http://anil.recoil.org/blog/articles/2004/08/30/grand-recoil-update-done</link>
      <category>recoil</category>
    </item>
    <item>
      <title>Enough travelling already</title>
      <description>&lt;p&gt;Sitting in the &lt;a href="http://www.panynj.gov/aviation/jfkframe.HTM"&gt;New York JFK&lt;/a&gt; &lt;a href="http://www.ual.com"&gt;United&lt;/a&gt; lounge (which incidentally, has free wifi).  Spent a great week in Princeton with Sandy Fraser at &lt;a href="http://www.fraserresearch.org"&gt;Fraser Research&lt;/a&gt; - full coverage available in Dave "blogthief" Scott's &lt;a href="http://recoil.org/~djs/blog/princeton-2004/"&gt;blog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Ordered a &lt;a href="http://www.cyclades.com/products/2/ts_series"&gt;Cyclades TS-800&lt;/a&gt; to act as serial console server for the &lt;a href="http://flirble.org"&gt;Flirble&lt;/a&gt; and &lt;a href="http://recoil.org"&gt;Recoil&lt;/a&gt; machine clusters.  Hopefully this should cure the woes with remote management of x86 servers (why, oh why, can't they have something as cool as &lt;a href="http://www.sun.com/servers/alom.html"&gt;Sun LOM&lt;/a&gt;?) &lt;/p&gt;</description>
      <pubDate>Fri, 27 Aug 2004 00:20:36 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:674da2b8-e5dd-40f3-a759-18f727638cc2</guid>
      <author>avsm</author>
      <link>http://anil.recoil.org/blog/articles/2004/08/27/enough-travelling-already</link>
      <category>travel</category>
      <category>recoil</category>
      <category>usa</category>
      <category>princeton</category>
    </item>
    <item>
      <title>Eradicating PHP from the face of Recoil</title>
      <description>&lt;p&gt;
I used to do an awful lot of &lt;a href="http://www.php.net/"&gt;PHP&lt;/a&gt; hacking, but over the last few years, the boring gods of &lt;a href="http://www.openbsd.org/"&gt;security&lt;/a&gt; and &lt;a href="http://caml.inria.fr/"&gt;correctness&lt;/a&gt; have snared me, leaving me frustrated with the effort and resources required to create and maintain dynamic web content.
&lt;/p&gt;

&lt;p&gt;
So I've converted my site over to static HTML, and started switching our main web-servers to use &lt;a href="http://www.acme.com/software/thttpd/"&gt;thttpd&lt;/a&gt; instead of &lt;a href="http://www.apache.org/"&gt;Apache&lt;/a&gt;.  The performance difference has been absolutely staggering, as the humble hardware behind &lt;a href="http://fork.recoil.org/"&gt;fork.recoil.org&lt;/a&gt; managed to survive a &lt;a href="http://slashdot.org/article.pl?sid=04/05/27/1849209"&gt;slashdotting&lt;/a&gt; and mentions on &lt;a href="http://www.opengl.org/"&gt;opengl.org&lt;/a&gt; (for the latest release of OpenFX) without breaking a sweat.
&lt;/p&gt;

&lt;p&gt;
thttpd is great; it uses the BSD &lt;a href="http://www.openbsd.org/cgi-bin/man.cgi?query=kqueue"&gt;kqueue(2)&lt;/a&gt; kernel event mechanism, and is single-threaded (removing the endearing fork-bomb effect Apache has when hit by a burst of traffic).  There are quite a few good programs to help generate static content as well; my new blogging tool of choice is &lt;a href="http://www.blosxom.com/"&gt;blosxom&lt;/a&gt;, which fits into the UNIX way of doing things absolutely perfectly.
&lt;/p&gt;</description>
      <pubDate>Sun, 25 Jul 2004 23:46:49 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:baf997e7-311f-45c0-a36d-d4308d0719be</guid>
      <author>avsm</author>
      <link>http://anil.recoil.org/blog/articles/2004/07/25/eradicating-php-from-the-face-of-recoil</link>
      <category>hacking</category>
      <category>recoil</category>
    </item>
    <item>
      <title>Style tweaking</title>
      <description>&lt;p&gt;Did a bit of style tweaking on my journal, so now multiple entries in one day are collected under one group.  Works well on my Powerbook so far...  Thanks to &lt;a href="http://www.livejournal.com/users/jon/"&gt;Jon Parise's LiveJournal&lt;/a&gt; for the inspiration.  I'm off to the Robinson Fifth Week Blue's Jazz party now!&lt;/p&gt;</description>
      <pubDate>Wed, 13 Nov 2002 22:03:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:2f14b2fb-8112-44de-85d7-6f5d5dbc63f9</guid>
      <author>avsm</author>
      <link>http://anil.recoil.org/blog/articles/2002/11/13/style-tweaking</link>
      <category>recoil</category>
    </item>
  </channel>
</rss>
