anil madhavapeddy // anil.recoil.org


Trac spamming is taking down Melange

Posted by Anil Madhavapeddy Mon, 02 Apr 2007 10:15:00 GMT

After loads of reports that the Melange site keeps going 505 and crashing, I took a look at why. Turns out several spam crawlers were going mental and repeatedly adding tickets with spam links. Around 600 tickets and thousands of comments later, the process decided it had had enough and terminated.

I've deleted all tickets (even the valid ones were spammed into oblivion) and turned off comment creation and modification for anonymous users. A look on the Trac Wiki shows that there are some SpamFilter extensions being developed which I'll investigate at some point.

no comments

Google Webmaster tools

Posted by Anil Madhavapeddy Thu, 28 Dec 2006 15:04:00 GMT

The conversion of the Recoil web services to external FastCGI pinned our Trac installation at Melange as the source of the CPU hogging. It turned out the Google crawler was indexing the entire source tree via Trac, causing it to go ballistic.

I then stumbled on the latest cool Googlism: the Google Webmaster Tool, which lets you register your sites and displays options, diagnostics and statistics about how the Google crawler views your website. I turned down the frequency at which Google hits the Trac installation (as well as installing a suitable robots.txt file). This solved the immediate problem, but some of the search statistics were fun to check out as well.
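
As a rough illustration of the robots.txt half of that fix, the sketch below uses Python's standard urllib.robotparser to check what a rule set would block. The /trac/browser and /trac/changeset paths are assumptions made up for the example, not the rules actually installed on Melange.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of the source browser and changesets
# while leaving the wiki crawlable. The paths are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trac/browser
Disallow: /trac/changeset
"""


def allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/trac/wiki", "/trac/browser/trunk/src"):
        print(path, allowed(path))
```

A well-behaved crawler would skip the source tree under these rules; the spam bots from the later Trac post would of course ignore them.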

It turns out the gallery is pretty highly ranked for image searches. My trips to Japan seem to have made it big, with popular searches including "Shibuya", "tokyo at night", and "japanese roof". My random pictures of Indian buffaloes, smoggy skylines and fried ice-cream seem especially popular as well. It's a weird old Internet eh?

The gallery has fallen a bit by the wayside in recent months. I'll update it when I get back to Cambridge!

2 comments

Looking at my Spam statistics

Posted by Anil Madhavapeddy Wed, 27 Dec 2006 00:01:00 GMT

The switch to qpsmtpd does seem to have reduced my spam intake somewhat, so out of curiosity I looked at the statistics from 2 years of procmail logs to see what's been happening in terms of filtering effectiveness.

A quick import and bug-fix of Log::Procmail into OpenBSD, and some lashed up Perl and gnuplot later, the graph on the right showed up. The red and green are ham and spam respectively, as classified by SpamAssassin.
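
The real pipeline was Log::Procmail plus Perl and gnuplot; the sketch below swaps in plain Python purely to show the shape of the tally. It assumes the usual procmail log abstract (a "From" line followed by an indented "Folder:" line) and assumes that spam is delivered to a folder with "spam" in its name, which is a guess about the setup rather than something taken from the logs.

```python
import re
from collections import Counter
from datetime import datetime

FROM_RE = re.compile(r"^From \S+\s+(.+)$")
FOLDER_RE = re.compile(r"^\s+Folder:\s+(\S+)")


def tally(lines, spam_marker="spam"):
    """Count deliveries per (YYYY-MM, 'ham' or 'spam') from procmail log lines."""
    counts = Counter()
    month = None
    for line in lines:
        m = FROM_RE.match(line)
        if m:
            try:
                # ctime-style stamp, e.g. "Mon Dec 25 23:35:00 2006" (C locale).
                stamp = datetime.strptime(m.group(1).strip(), "%a %b %d %H:%M:%S %Y")
                month = stamp.strftime("%Y-%m")
            except ValueError:
                month = None  # unparseable date: skip this delivery
            continue
        m = FOLDER_RE.match(line)
        if m and month is not None:
            # Assumption: the destination folder name reveals the classification.
            kind = "spam" if spam_marker in m.group(1).lower() else "ham"
            counts[(month, kind)] += 1
            month = None
    return counts
```

The resulting counts can be dumped one month per row for gnuplot or any other plotting tool.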

The large amount of ham in 2004 was not actually real mail, but mostly postmaster bounces from forged spam; I am currently forced to destroy all domain bounces without even reading them due to the sheer volume. This is something that Sender Permitted From promises to help solve once we determine if any of our users send @recoil.org mail from sources other than our mail server.

Since the turn of this year the amount of spam has jumped, but more concerningly, SpamAssassin has been missing increasing amounts, and it's been flowing through straight to my Inbox (despite sa-update running daily). I'm going to do these graphs again in a few months and see just how much the switch to the new paranoid SMTP has helped.

no comments

Christmas Spam Cleanup

Posted by Anil Madhavapeddy Mon, 25 Dec 2006 23:35:00 GMT

It's Christmas Day, I've eaten far too much, and am lounging around doing the now-traditional Annual Recoil Cleanup as the year's todo list has grown ever larger. I've been meaning to switch from our venerable qmail-smtpd for some years now, and finally made the move over to qpsmtpd.

qpsmtpd is a drop-in replacement for the SMTP portion of qmail, and is written in Perl with a number of plug-ins which let us increase our paranoia levels considerably. It's a pity we have to do this, but the policy of 'accept anything' has been under increasing stress for the last few years, and when I looked at my e-mail stats last night, I realised over 99.99% of my incoming e-mail was some kind of virus or spam. Even a 1% miss rate on SpamAssassin is enough to chuck 100s of mails into my inbox!
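
To make the arithmetic concrete, here is a back-of-the-envelope sketch. The volume of 50,000 messages is an invented figure for illustration; only the 99.99% and 1% rates come from the text above.

```python
def missed_spam(total_messages, spam_fraction, miss_rate):
    """Expected number of spam messages that slip past the filter."""
    return total_messages * spam_fraction * miss_rate


if __name__ == "__main__":
    # Hypothetical volume of 50,000 messages; the rates are from the post.
    print(round(missed_spam(50_000, 0.9999, 0.01)))
```

At that assumed volume a 1% miss rate lets roughly 500 messages through, which is the "100s of mails" scale.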

So now the new e-mail setup at Recoil includes virus scanning via the wonderful clamav, reverse DNS RBL lookups via rfc-ignorant.org, and even early-chatter detection of viruses which blindly blast messages before the initial SMTP greeting has completed. I'm hoping to enable global SpamAssassin checking soon if all else is stable and I don't get bleating about missing mail from our users.
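
The early-chatter idea can be sketched in a few lines: hold back the greeting briefly and see whether the client has already started talking. This is a minimal Python illustration of the technique, not qpsmtpd's plugin code; the function names, the one-second wait, the reply text and the mail.example.org banner are all assumptions.

```python
import select
import socket


def is_early_talker(conn, wait_seconds=1.0):
    """Return True if the client sent data before we greeted it.

    A socket that becomes readable during the wait has either sent bytes
    or hung up; both are treated as misbehaviour here.
    """
    readable, _, _ = select.select([conn], [], [], wait_seconds)
    return bool(readable)


def greet(conn, wait_seconds=1.0):
    """Send the SMTP banner, or a rejection if the client spoke first."""
    if is_early_talker(conn, wait_seconds):
        conn.sendall(b"554 SMTP protocol violation: spoke before greeting\r\n")
        return False
    conn.sendall(b"220 mail.example.org ESMTP\r\n")  # hypothetical host name
    return True
```

A legitimate MTA waits for the 220 line, so it only pays the short delay; software that fires its whole message at the socket immediately gets caught.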

I played with Greylisting as well to see if it had improved from my earlier experiments a couple of years ago. Unfortunately, it still looks as if there are many broken MTAs out there which don't cope well with rejection, and manual whitelists are required, which sounds a bit unreliable for setups like ours which sometimes don't get looked at for years on end (ahem).
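
For anyone who hasn't met greylisting, the core of it is small enough to sketch. The Python below is a simplified in-memory illustration, not the plugin I tried: the class name, the 300-second delay and the whitelist handling are assumptions, and a real deployment would persist and expire its state.

```python
import time


class Greylist:
    """Temp-fail the first delivery attempt from each unseen triplet."""

    def __init__(self, min_delay=300, whitelist=()):
        self.min_delay = min_delay        # seconds a sender must wait before retrying
        self.whitelist = set(whitelist)   # client IPs of MTAs known to retry badly
        self.first_seen = {}              # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient, now=None):
        """Return 'accept' or 'tempfail' for one delivery attempt."""
        now = time.time() if now is None else now
        if client_ip in self.whitelist:
            return "accept"
        triplet = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"
```

The whitelist is exactly the weak spot described above: every MTA that mishandles the temporary failure has to be added by hand.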

So it's with a tear in my eye that I wave goodbye to qmail-smtpd, the first ever network-facing service deployed on Recoil back in 1998, and incredibly, the only one I've never had to upgrade in the 8 years since.

no comments

Migrating blog to Typo

Posted by Anil Madhavapeddy Fri, 30 Jun 2006 18:25:24 GMT

I've migrated away from my custom-written blog (farewell mlblog!) to the very nice Typo based on Rails. This is mainly because I don't have the time to hack together all of the XML-RPC functions required to support the MovableType API in OCaml, and the live-search feature now present in the sidebar is very cool!

To summarise the last six months, I submitted my PhD, broke my knee, went to the OpenBSD hackathon, fixed my knee, released some OCaml code (still very low-key as it needs a lot of polish) and am happily working at XenSource in Cambridge now! PhD viva is next Friday...!

3 comments | no trackbacks

Arise, MLBlog

Posted by avsm Mon, 23 May 2005 20:00:04 GMT

I got tired of trying to combine the various blogging and gallery tools into something that did what I wanted: take a simple directory of images, blog entries, links, papers and output a nice HTML/RSS version of the directory. So I hacked up a quick blog tool in OCaml that does the trick, and put it live.

It's got quite a few rough edges at the moment, especially to do with the lack of date archives and the large number of images on the front page of the gallery. Note that the location of the RSS links has changed as well, as I've switched to using ocaml-rss for outputting it instead of the homebrew format used before in blosxom. One thing that has come out really well is the use of a flat tag namespace instead of the previous directory structure; it allows me to share stories and images among multiple categories without needing symlink hacks.
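
The flat tag namespace boils down to an inverted index from tag to entries. The tool itself is written in OCaml; the Python sketch below only illustrates the data structure, and the entry names and tags in the usage note are invented.

```python
from collections import defaultdict


def build_tag_index(entries):
    """Map each tag to the sorted list of entries carrying it.

    `entries` maps an entry name to its set of tags, so one entry can
    appear under several categories without being copied or symlinked.
    """
    index = defaultdict(list)
    for name, tags in entries.items():
        for tag in tags:
            index[tag].append(name)
    return {tag: sorted(names) for tag, names in index.items()}
```

For example, an image tagged both "gallery" and "japan" and a story tagged "blog" and "japan" would both be listed under "japan", with neither living in a particular directory.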

no comments

Grand Recoil update done

Posted by avsm Mon, 30 Aug 2004 11:59:32 GMT

Phew! The big re-arrangement to decommission the venerable F630 Netapp is finally pretty much done. Now we have a new beefy fork, a dual-Opteron screaming along as the main mail and CVS server (running the OpenBSD/amd64 port in SMP mode). Quick takes up the web serving duties, running thttpd and Apache. Hidden away from public glare is "chunk", serving up 400Gb of storage for backups and "multimedia content" to both machines.

The speed increase from removing NFS from the equation has been pretty incredible. Maildir is a nice format, but it really needs a top-notch NFS client to avoid dying from opendir overload when users have tens of thousands of mails in a single folder. Not to mention the locking headaches that NFS introduces as well (on local disks, Dovecot can now use its index files much more effectively). Overall, I'm pretty happy with a local disk / rsync combination until some maniac steps up to improve OpenBSD's NFS client to Solaris or FreeBSD levels.

On the software side, things have really improved regarding 64-bit cleanliness, and only one package, maildrop, had a problem: it uses C++ exceptions, which aren't supported by the amd64 toolchain on OpenBSD yet. A swift switch to procmail later, all worked peachily. qmail doesn't compile out of the box, but the excellent netqmail patchset integrates fixes for this. The only outstanding task is to bootstrap ezm3 on amd64 in order to run CVSup (or perhaps cvsync would be simpler).

no comments

Enough travelling already

Posted by avsm Thu, 26 Aug 2004 23:20:36 GMT

Sitting in the New York JFK United lounge (which, incidentally, has free wifi). Spent a great week in Princeton with Sandy Fraser at Fraser Research; full coverage is available in Dave "blogthief" Scott's blog.

Ordered a Cyclades TS-800 to act as serial console server for the Flirble and Recoil machine clusters. Hopefully this should cure the woes with remote management of x86 servers (why, oh why, can't they have something as cool as Sun LOM?)

no comments





Copyright © 2003-2006 by Anil Madhavapeddy. All rights reserved.
Original design used with kind permission from Jon Parise.