Thursday, December 9, 2004

Fighting SPAM with DNSBL

I've been getting an average of 20,000 spam emails a day on one of my servers. Apparently some nice spammer included a domain I own as a target for his zombies. That means I kind of get DDOS'ed with spam :P

Most approaches to filtering spam don't work well when you're only getting a spam or two from each IP that connects to your server. For instance, one very nice way of catching spammers is by placing a few honeypots around and then blocking whatever IP sends mail to them. Unfortunately the kind of spam I'm getting is really dumb, in the form of messages to addresses that _don't_ exist. This causes the message to bounce back to the faked originating address. I say it is dumb because the person who actually receives the bounce gets it in "error" form, not as the clean original message, and thus will more than likely not read it. Even if they do, they'll be pretty sure that they didn't send the message, and will not click on the spam link that they supposedly sent someone else. SIGH!

Anyway, some pathetic spammer with a fairly big botnet thinks it's a great idea and decided to bounce some of his trash off my server. I'd really like to block that spam _before_ it gets delivered to my SMTP server (Exim in my case--yes it's very l33t). That being the case I created a tiny Perl script to tail the Exim log files and block access to port 25 from any IP that sent me spam. The idea was to prevent any further spam from that IP from even connecting to my box.
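For the curious, here is roughly what that idea looks like. This is a minimal Python sketch, not the Perl script itself. The log pattern is an assumption (peer address in square brackets on a line containing "rejected RCPT"), and the names offending_ip, block_command and new_blocks are made up for illustration. It only builds the iptables commands; actually running them is left out on purpose.

```python
import ipaddress
import re

# ASSUMPTION: a rejected delivery is logged on one line that has the peer
# address in square brackets and the words "rejected RCPT" after it.
# Adjust the pattern to whatever your own Exim log really looks like.
REJECT_RE = re.compile(r"\[(?P<ip>[0-9.]+)\].*rejected RCPT")


def offending_ip(line):
    """Return the IPv4 address from a rejection line, or None."""
    match = REJECT_RE.search(line)
    if not match:
        return None
    try:
        # Validate so that garbage in the log never reaches the firewall.
        return str(ipaddress.IPv4Address(match.group("ip")))
    except ValueError:
        return None


def block_command(ip):
    """Build (but do not run) an iptables rule dropping SMTP from ip."""
    return ["iptables", "-A", "INPUT", "-s", ip,
            "-p", "tcp", "--dport", "25", "-j", "DROP"]


def new_blocks(lines, already_blocked):
    """Yield one command per IP that has not been blocked yet."""
    for line in lines:
        ip = offending_ip(line)
        if ip and ip not in already_blocked:
            already_blocked.add(ip)
            yield block_command(ip)
```

Feeding it the lines of the log (or the output of a tail) together with a set of already-blocked IPs gives one command per new offender, which could then be handed to subprocess.run by a script running as root.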

That worked fine and dandy, with only a small problem (or two). Very few IPs returned to spam me again. As I said, this guy's botnet is quite large, and many of his zombies have dialup or dynamic IP DSL/cable. The other problem is that there are just _a lot_ of them. A single day of logging resulted in over 15,000 IPs added to my firewall.

OK, let's go to plan B. Lots of other people are getting this spam, right? Let's see what they're doing about it! Turns out that a very efficient way of dealing with this type of bot is to let a pool of servers rat on the IPs that are delivering spam. That way other servers can block their spam _before_ it's delivered. I guess this is how Vipul's Razor works, but I've never gotten around to installing it. I just used the lazy approach: filter whatever everyone else is filtering.

Most people don't worry that much about spam because they get only a few messages a day. ISPs and large companies, however, _do_ mind. And so a few "central" facilities for consolidating these spam sources were born. To distribute the data, a very clever approach is used: DNS. The fact that everyone who uses the Internet already uses DNS, and that it is distributed, has built-in caching, and deals with IPs, makes it the prime candidate for the job. All that has to be done is to create a dummy reverse zone (separate from the real in-addr.arpa one), and then clients can query the database using W.Z.Y.X.dnsbl.domain.tld to check if IP X.Y.Z.W is blacklisted. BTW "DNSBL" simply means DNS Block (or Black) List.
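To make the lookup concrete, here is a small Python sketch of what a client does. The function names are mine, and dnsbl.domain.tld is the same placeholder zone as above, not a real list. It assumes the usual convention that a listed IP resolves to some address while an unlisted one does not resolve at all.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Return the name to look up: the IP's octets reversed, then the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """True if the zone returns an address for this IP.

    Any resolution failure (NXDOMAIN, but also a timeout or a dead
    resolver) is treated as "not listed", so a broken list never
    blocks mail by accident.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, dnsbl_query_name("192.0.2.1", "dnsbl.domain.tld") gives "1.2.0.192.dnsbl.domain.tld", and is_listed() does the actual DNS query through the system resolver, so it gets the caching for free.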

This all sounds quite complicated, but to implement it with Exim takes only a few lines. Exim4 supports ACLs (Access Control Lists), so all you have to do is add an ACL entry:

  deny   hosts    = !+relay_from_hosts
         message  = $sender_host_address is listed \
                    at $dnslist_domain
         dnslists = dnsbl.njabl.org : \
                    bl.spamcop.net : \
                    dnsbl.sorbs.net : \
                    blackholes.five-ten-sg.com : \
                    cbl.abuseat.org : \
                    psbl.surriel.com : \
                    list.dsbl.org


I chose not to check for spam from anything in my relay_from_hosts list (for obvious reasons). You basically choose a message to use when rejecting (and logging) a spam delivery attempt, and specify a list of domains to be used for the reverse mapping checks. Normally these DNS servers will return NXDOMAIN for regular IPs, or an address such as 127.0.0.2 for known spam sources.

So there you have it. I came up with my list of DNSBL sources by searching the excellent OpenRBL (a kind of DNSBL meta-search) for the spam sources that reached my box.

Also note that one of my DNSBL sources is psbl.surriel.com. This is Rik van Riel's (of Linux kernel hacking fame) site, and is powered by Spamikaze, a tool that I plan to run on one of my boxes soon. The plan is to have my own DNSBL based on the spam that still gets through to my box.

I'll end this entry with a big "THANKS!" to all the projects mentioned (this is all free, folks) and look forward to paying them back with some pizza and beer in the future.

Friday, November 19, 2004

If you're a Dilbert fan...

...then surely you must be a geek, just like me. I can't say that I'm insanely into comic strips or anything like that, but a visit to comics.com made me want to have all Dilbert strips really bad.

While I'm not prepared to pony up for their paid service, I did subscribe to the free "Basic" service to see what it looks like. Not being able to wait to start my collection, I decided to fetch the 30 or so strips that are available in the Archive for free.

Obviously I'm not the first person to have such an urge. The guys at comics.com don't use a very common filename scheme for their content, presumably for the exact reason of making mass-fetching harder. A quick search on Google showed that every geek and his grandma has already written a script to fetch these strips. Instead of making my life easier, this simply proved that every minimally proud Dilbert fan must make his/her own script.

And so I proceeded to code my hack, jdilbert. Don't expect much--it's just a hack--but it does work. So much so that my collection now contains exactly 31 strips :)

Update: I rewrote this in Ruby and it now handles more comics. Fetch jdstrips now :) I put it in a weekly cron job so that I never miss any.

Sunday, November 7, 2004

Multi-head screenshot

I was talking to my good friend gzp on AIM, bragging about my new (but made from old parts) box, and the fact that it had RAID1, RAID5, three heads, etc. So he asked me for a screenshot. I tried to use xwd to make it--to no avail--so he told me about scrot.

And what can I say, it's just sweet! Here's the result.