Sunday, March 27, 2005

Using the Compaq PA-1 with Linux

I purchased one of these little guys in 2001 and used it very little at the time. It took forever to transfer songs, only worked under Winblows, and the capacity was very limited (they come with two 32MB MMC cards).

Fast-forward 4 years and podcasting is born. It's really nice to always have new interesting content to listen to while you're commuting or working outside. So I downloaded a bunch of podcasts (in MP3 format for now) and proceeded to install the Compaq-bundled RioPort software inside a Winblows session in VMWare. File transfers worked one third of the time, and drained the batteries pretty badly. The real problem, however, was the time wasted in other parts of the process. Waiting one full minute for RioPort to read the list of transferred files was too much. As with anything else in Winblows, it's not that the apps suck that much, but the OS just makes the user experience a real nightmare.

Being a happy resident of a non-DMCA encumbered country, I decided to reverse engineer the filesystem used to store the files to the flash cards. This way I can transfer files without using the PA-1 itself, which saves on the USB hassles and uses zero battery power. The first step was to dump a working flash image and examine it using a binary editor (bvi in this case).

It turns out that the filesystem was created by a company called Eiger M&C, which doesn't seem to be doing business anymore. I even tried emailing their contact listed on their website (last updated in 2002), but of course got no reply. To make a long story short, I ended up successfully reverse engineering most of the filesystem format, and used a bare bones version of it as the basis for a small Python script.

And so was born jdeigerfs v0.1 (3.4KB) :) It allows you to generate a mm.img file that contains a filesystem image that you can copy to any flash card. I use 32MB and 64MB MMC cards on my device, but your device may use other cards/sizes. All should work, up to 128MB per card. From what I can tell, the format reserves 128KB for the FAT, which holds a 1-to-1024 mapping of the flash card (128KB x 1024 = 128MB), so anything over 128MB would actually cause the FAT to overwrite the first file on the card.
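The 128MB limit follows from simple arithmetic. Here is a minimal sketch of it, assuming (my reading of the "1 to 1024 mapping", not something confirmed by any spec) that each FAT byte describes 1024 bytes of card space. The names `FAT_BYTES`, `BYTES_PER_FAT_BYTE`, `max_card_bytes` and `fat_bytes_needed` are made up for illustration and are not taken from jdeigerfs.

```python
# Assumption: the FAT area is a fixed 128KB, and each FAT byte
# covers 1024 bytes of the card. Both figures come from the
# description above; the exact on-disk layout is not documented.
FAT_BYTES = 128 * 1024
BYTES_PER_FAT_BYTE = 1024
MIB = 1024 * 1024


def max_card_bytes(fat_bytes=FAT_BYTES, ratio=BYTES_PER_FAT_BYTE):
    """Largest card the reserved FAT area can describe."""
    return fat_bytes * ratio


def fat_bytes_needed(card_bytes, ratio=BYTES_PER_FAT_BYTE):
    """FAT bytes needed for a card of the given size (rounded up)."""
    return -(-card_bytes // ratio)


if __name__ == "__main__":
    print(max_card_bytes() // MIB, "MB maximum card size")
    for size_mb in (32, 64, 128, 256):
        needed = fat_bytes_needed(size_mb * MIB)
        fits = "fits" if needed <= FAT_BYTES else "overflows the FAT area"
        print(f"{size_mb}MB card needs {needed // 1024}KB of FAT: {fits}")
```

Under that assumption a 256MB card would need 256KB of FAT, twice what is reserved, which is why the table would run into the first file.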

The script is barely usable. Actually it's a bit better than that: it's in a works-for-me state. I decided to release it early so that if anyone else has any use for it I can get feedback at an early stage, although I don't plan on making any major improvements to it. Now I can finally test whether the claim to support AAC is true (hard to believe for 2000 hardware). Later.

Thursday, March 3, 2005

Multi-DVD backups using zero disk space

Like a few million other people, I have started doing my backups on DVD+RW. With a capacity of 4.7GB (that's 4.7 billion bytes, not 4.7 * 2^30 bytes), fast write speeds (compared to CD-RW) and the ability to reuse the media thousands of times, it's hard to ask for more.
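To put a number on the decimal-versus-binary difference, here is a quick sketch. It takes the nominal 4.7 billion bytes at face value; the exact usable capacity of a given disc may differ slightly, and the names `DVD_BYTES` and `dvd_capacity_gib` are only for illustration.

```python
# Nominal DVD capacity as advertised: 4.7 billion bytes (decimal GB).
DVD_BYTES = 4_700_000_000
GIB = 2 ** 30  # one binary gigabyte


def dvd_capacity_gib(dvd_bytes=DVD_BYTES):
    """Capacity expressed in binary gigabytes (2^30 bytes each)."""
    return dvd_bytes / GIB


if __name__ == "__main__":
    print(f"{dvd_capacity_gib():.2f} binary GB")  # prints 4.38 binary GB
```

So a tool that reports sizes in binary units will show a "4.7GB" disc as holding a bit under 4.4GB.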

Unfortunately the problem when backing up to DVD starts when you have to choose a format. You could theoretically use any default filesystem that your OS likes and burn that directly to the media, but it would be highly incompatible with any other OS. One of the desired characteristics of a backup is the ability to restore easily under any circumstance (or any OS).

That basically leaves us with ISO-9660 as a format. Virtually every OS supports that. Of course there's your Rock Ridge extensions for Unix, and your Joliet for Windows, but that's easy to implement (most software supports both). The problem is, even with these extensions, the ISO-9660 format is pretty limited. It needs a lot of hand-holding in order to solve duplicate filenames (inside different directories, which is quite common in any filesystem), and the most common utility to generate such a filesystem (mkisofs) tends to require _a lot_ of switches to do what you want.

OK, so all we have to do is come up with a script to feed mkisofs with the proper switches, resolve the duplicate filenames, and we're set, right? Not quite.

Making an ISO-9660 image of your data and then burning it would require lots of temporary storage: up to the full 4.7GB per disc. In lots of situations that temporary space just won't be available, because your /tmp or /home partitions are too full to fit the image. And that's often why we need to back up in the first place--to free up some space. How can you free up space when you need _more_ space to do it? Sounds like asking a bank manager for a loan--he'll want you to prove that you already have the money in order to lend it to you!

Back to software land, we'll need a neat utility called growisofs. It is named like that for historical reasons, but can actually burn the DVD for you, as well as making the ISO-9660 filesystem. The strategy here will be to identify the files that we're backing up, and group them until we reach the media size, then provide that list of files to growisofs so that it can make the filesystem and burn it on the fly, without using temporary storage :)
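The grouping step described above can be sketched roughly like this. This is not the actual script, just a minimal illustration of the idea: walk the directory, add files to the current batch until the next one would not fit, then start a new batch. The names `DVD_BYTES`, `walk_files` and `group_files` are made up, and the nominal 4.7 billion bytes ignores filesystem overhead, so a real run should leave some margin.

```python
import os

# Assumption: nominal media size. ISO-9660 metadata takes some room
# too, so a smaller figure is safer in practice.
DVD_BYTES = 4_700_000_000


def walk_files(root):
    """Yield (path, size) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path, os.path.getsize(path)


def group_files(files, capacity=DVD_BYTES):
    """Split (path, size) pairs into batches that each fit one disc.

    Files are kept whole and in order, so a batch is closed as soon
    as the next file would overflow it. A single file larger than
    the disc cannot be placed at all.
    """
    batches, current, used = [], [], 0
    for path, size in files:
        if size > capacity:
            raise ValueError(f"{path} is larger than one disc")
        if current and used + size > capacity:
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(group_files(walk_files(".")), start=1):
        print(f"disc {number}: {len(batch)} files")
```

Each batch would then be handed to growisofs as the list of files for one disc. Because files are taken in order rather than packed optimally, a big file arriving late in a batch can leave a fair amount of the disc unused.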

Another alternative would be to use the mkisofs -stream-media-size switch, but that way we could end up splitting a file across discs (I think--I didn't actually test this), which is not what I personally want from my backups. Notice that my technique can waste a lot of space if you have lots of huge files, and won't work at all if you have files larger than 4.7GB. I use this script to back up my pictures, music, and data. For movies and other large files I create a directory, move files to fit nicely inside the 4.7GB, and then back up "." (the current directory) using the same script. Works quite nicely.

Please also note that this script is not for production purposes. It's a hack that I came up with to do simple yet effective backups to DVD. Again, works fine for me. YMMV.

And finally for the script itself. You can find it here. It takes only 2 command line parameters: the volume label prefix, and the directory to back up. The volume label of your burned DVDs will be the prefix with "_01", "_02" and so on appended. There's a bug: when the backup is finished, it'll still ask you for one more DVD. Just press Enter and it will quit harmlessly (without turning your DVD into a coaster ;)).
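The labelling scheme is simple enough to show in a couple of lines. This is only a sketch of the naming convention described above, not code lifted from the script; `volume_label` is a made-up name, and what happens past disc 99 is my assumption.

```python
def volume_label(prefix, disc_number):
    """Build the label for one disc: prefix + '_' + two-digit number.

    Assumption: numbers above 99 simply grow to three digits.
    """
    return f"{prefix}_{disc_number:02d}"


if __name__ == "__main__":
    for n in range(1, 4):
        print(volume_label("PHOTOS", n))
```

With a prefix of PHOTOS the first three discs come out as PHOTOS_01, PHOTOS_02 and PHOTOS_03.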

Thursday, January 6, 2005

My own DNSBL

My trash folder used to hold about 2,000 spam (and non-spam) messages. Any mail older than 7 days is automatically deleted. Most of what was there never got to my email client, because I use bogofilter to do Bayesian spam filtering.

That worked well on its own until I started getting _tons_ of spam. I wrote a bunch of scripts to identify the offending IPs and compile them into my own DNSBL (DNS Block List). It is publicly available at That's not a homepage, but a domain for the reverse IP lookups.
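For anyone who hasn't seen how a DNSBL lookup works: the client reverses the octets of the IP, appends the list's domain, and asks DNS for an A record. By the usual convention an answer (typically in 127.0.0.x) means "listed" and a not-found error means "not listed". Here is a minimal sketch of that. The zone `dnsbl.example.org` is a placeholder, not my list's real domain, and the function names are made up for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address.

    Example: 192.0.2.99 in zone dnsbl.example.org becomes
    99.2.0.192.dnsbl.example.org
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the DNSBL answers for this IP.

    Assumption: any successful A lookup counts as listed, and a
    resolver error counts as not listed. A real filter would also
    want to tell a timeout apart from a genuine not-found.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical zone; substitute the DNSBL you actually use.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

This is why the domain isn't a homepage: it only exists to answer these reversed-IP queries.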

Since I started using this DNSBL, my trash folder trimmed down to about 200 messages (for the week). That includes my legitimate email (which I read and then delete). Not bad :) It also unloads my mail server, and, most importantly, makes spammers really angry. And poor. And suicidal (I wish).

The current count for is about 70,000 IPs. I don't add blocks, only single IPs. I don't remove IPs unless I feel like it. I don't recommend that anyone use this DNSBL to actually block messages, but instead to flag spam as part of some greater process, such as using SpamAssassin or another similar tool.

That's about it. At a rate of about 2,000 new IPs every day (boy do I get spammed!), I'll probably have over 100,000 spam sources identified by the time you read this! Bring on the zombie botnets!

Update: The database has over 1 million IPs now. Scary.