Monthly Archives: May 2007

Since Kayhan asked, I finally got around to finishing my script to back up the mail in a gmail account to local files.

gmail-backup.py

The only requirement this has (besides python and a gmail account of course) is libgmail.

This is in keeping with the theme from my blogger backup that no matter how much I like or trust a company, the only backups I really trust are my own.

First a disclaimer, then a few notes about the script. I cannot stress enough that I take no responsibility for how Google will react to you using this to access your account. It waits 10 seconds between each folder for safety, although that can be changed. That being said, I used this a ton of times on my email account while testing today and haven’t noticed any problems. And I know other people do much worse with their accounts. But still, use some common sense and be careful (although I’m not sure what that means in this case).

Now, on to more interesting stuff. This script downloads each label (including inbox, spam, starred, and all) and saves each one as a separate mbox file. Although the ‘all’ label is redundant for the most part, it is necessary to catch any unlabelled mail. The mbox files should be readable by most mail programs, but they have only been tested with pine, mutt, and BSD mail (the standard mail command). These programs actually seem a bit pickier about the format than the mbox documentation implies. Currently, all mail is listed as new, but I hope to fix that in the next release. Incremental backups are not possible at the moment, but I think they may become possible if I play around with the message ids.
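To make the one-mbox-per-label idea concrete, here is a minimal sketch using Python's standard `mailbox` module. It is not the code from gmail-backup.py: the function name `save_labels_as_mbox`, its parameters, and the `.mbox` file naming are all made up for illustration, and it assumes the raw messages have already been fetched (the real script gets them through libgmail, which is not shown here).

```python
import mailbox
import os
import time


def save_labels_as_mbox(messages_by_label, out_dir=".", delay_seconds=10):
    """Write one mbox file per label and return {label: path}.

    messages_by_label is a hypothetical input: a dict mapping a label name
    to a list of raw RFC 822 message strings that were fetched elsewhere.
    """
    paths = {}
    for index, (label, raw_messages) in enumerate(messages_by_label.items()):
        # Pause between labels, mirroring the 10 second wait described above.
        if index > 0 and delay_seconds:
            time.sleep(delay_seconds)

        # Assumed naming scheme; label names containing "/" would need cleaning.
        path = os.path.join(out_dir, label + ".mbox")
        box = mailbox.mbox(path)
        try:
            for raw in raw_messages:
                # mboxMessage supplies the "From " separator line mbox requires.
                box.add(mailbox.mboxMessage(raw))
            box.flush()
        finally:
            box.close()
        paths[label] = path
    return paths
```

Calling it with something like `save_labels_as_mbox({"inbox": [...], "all": [...]})` would leave `inbox.mbox` and `all.mbox` in the current directory, each of which can then be opened with a mail reader.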

Unlike some other things I’ve done, I do actually plan on updating this in the near future. libgmail can access gmail contacts too, so I may even decide to grab that info as well.

I trust Google. I really do. At least more than I trust any other company that I deal with regularly.

At the same time, I’ve been using computers long enough that there is only one kind of backup I trust completely: a backup that I control on a local medium.

So, in light of that, I have a python script to help create backups of blogs on blogger.

The script, blogger_backup.py, is available on my webpage under the GPL v2.0.

This is a fairly primitive script, but it does have some nice features (especially the fact that it works well unsupervised as a cronjob). There are two main requirements. First, you must have python installed. Any halfway reasonable UNIX-like system will have it, and it exists for Windows as well. Second, you must set the feeds in the blogger dashboard to ‘full’.

Once those two things are taken care of, just run the script followed by the name of the blog. In my case, it would be:

./blogger-backup.py netpurgatory

This gets an xml file with the 100 most recent posts and the 100 most recent comments. (Kayhan pointed out that, contrary to what I thought, you can only grab 100 of each with this script, not 1000. Hopefully I can find some way around that before moving to the Google API.) It cannot get more than that, nor can it get any photos you have uploaded (which probably live in a picasa album). I hope to fix both by moving from a simple python script to the Google APIs for blogger and picasa. Those provide much more powerful features (but they require installed libraries) and should allow for a more complete backup. For the moment, however, my script will do.
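For anyone curious what fetching the feeds might look like, here is a rough sketch, not the actual contents of the script. It assumes the blog lives at `<name>.blogspot.com` and uses Blogger's public `/feeds/posts/default` and `/feeds/comments/default` feed addresses with the `max-results` query parameter. The function names and the output file names are invented for this example, and it writes posts and comments to two separate files, which may differ from what the real script does.

```python
import os
from urllib.parse import urlencode
from urllib.request import urlopen


def feed_urls(blog_name, max_results=100):
    """Build the post and comment feed URLs for a blogspot-hosted blog."""
    # Assumption: the blog is served from <blog_name>.blogspot.com.
    base = "https://%s.blogspot.com/feeds" % blog_name
    query = urlencode({"max-results": max_results})
    return {
        "posts": "%s/posts/default?%s" % (base, query),
        "comments": "%s/comments/default?%s" % (base, query),
    }


def backup_blog(blog_name, out_dir="."):
    """Download both feeds and save each as an xml file; return the paths."""
    saved = []
    for kind, url in feed_urls(blog_name).items():
        # Hypothetical naming scheme, e.g. netpurgatory-posts.xml
        path = os.path.join(out_dir, "%s-%s.xml" % (blog_name, kind))
        with urlopen(url) as response, open(path, "wb") as out:
            out.write(response.read())
        saved.append(path)
    return saved
```

The feeds only contain complete posts when the blog's feed setting is ‘full’, which is why that setting is a requirement.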

I’ve got something in the works to back up gmail accounts as well (using the libgmail libraries), but that will have to wait a bit.