Personal Backups: A Lesson in Computing Safety

Over at the Daring Fireball blog, John Gruber has a nice article about how his extensive backups recently saved him from losing data on a hard drive that died. I know the utilities he speaks of (SuperDuper and DiskWarrior) and can vouch for their usefulness on Macintoshes (although I prefer to use psync instead of SuperDuper).

It was Merlin Mann over at 43Folders who noticed the article and then wrote his own take on John’s article.

For Linux and Unix, backup options are much more varied. Two of the most widely known programs are Bacula and Amanda, both enterprise-level backup tools. For personal use, I prefer to use rsync to make copies of my home directories. Many tools use rsync to make backups; one is from Mike Rubel, from way back in 2004, described in a comprehensive article titled “Easy Automated Snapshot-Style Backups with Linux and Rsync.” Another good article is from Joe Brockmeier on Linux.com, titled “Back up like an expert with rsync.”
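
To make the snapshot idea concrete, here is a minimal sketch in Python that builds an rsync command for a dated snapshot. It relies on rsync's `--link-dest` option, which hard-links unchanged files against an earlier snapshot so each dated directory looks like a full copy while only changed files take new space. This is my own illustration, not the script from either article: the function names, the one-directory-per-date layout, and the example paths are all assumptions.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List, Optional


def build_snapshot_command(source: str, snapshot_root: str, stamp: str,
                           previous: Optional[str] = None) -> List[str]:
    """Build an rsync command that copies `source` into a dated snapshot.

    The layout (snapshot_root/<stamp>) and all names are hypothetical.
    Use an absolute snapshot_root: rsync resolves a relative --link-dest
    against the destination directory, not the current directory.
    """
    root = PurePosixPath(snapshot_root)
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Files unchanged since the previous snapshot become hard links.
        cmd.append(f"--link-dest={root / previous}")
    # Trailing slash on the source: copy its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", str(root / stamp) + "/"]
    return cmd


def run_snapshot(cmd: List[str]) -> None:
    """Run the command; raises CalledProcessError if rsync reports failure."""
    subprocess.run(cmd, check=True)
```

Something like `build_snapshot_command("/home/me", "/mnt/backup", "2009-06-01", previous="2009-05-31")` returns the argument list without running anything, so you can print and inspect it before handing it to `run_snapshot` from a nightly cron job.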

One popular (and simple) backup program is Dirvish, which I believe uses rsync behind the scenes. I’ve used the KDE app keep before, which was an easy and pleasant experience. The program rdiff-backup is also commonly recommended.

Whatever you use, the most important thing is: do it! The easier and more automated the better: if you dread making backups, you won’t make them.

Also, don’t settle for just one backup: what happens if you need to retrieve a file from a while back – or your primary backup system fails? Best is multiple backups with multiple methods: back up to another disk and to the Internet (using services like SpiderOak or Box.net or even Ubuntu One).

Lastly, there are a couple of Java-based backup programs – specifically, Areca and plan/b. Cory Buford wrote about these programs for Linux.com in 2008. Still, it is hard to see how Java-based programs can reliably read all filesystem attributes and restore them without problems. Java, after all, is available from HP for the OpenVMS platform; do these programs restore all ODS-5 attributes? Do they work with Linux filesystems like XFS, JFS, and ext4? Perhaps someone can fill us in…

Whatever the case is – whatever the tool – go back up your personal systems now! I know I will be backing up. Don’t wait until your data is lost.

Webware 100

CNet has released their 2009 list of the 100 Best Web applications in 10 categories, plus the editor’s choice for the best Web applications that weren’t otherwise included.

There are quite a few, including just about every major browser on the planet. There are a few that are not in the lists, but should be. Here are some of my favorites that are and aren’t included:

Zoho

Zoho (a winner in the Productivity section) is unlike any other documentation suite online: they have everything – and the most interesting stuff is free. I keep wanting to use them, and would if my work were web-only. One of the most important reasons I like Thinkfree Office is the seamless integration between the desktop and the web; Evernote (another entry) does this too.

Evernote

Evernote was one of the Editor’s Picks. Evernote is essentially an electronic collection of notes that gets synchronized with their servers and made available to you online. Thus, you can work at your desk with desktop speeds, and let it update to the web so you can look at your notes on the go.

Pidgin

Pidgin (a winner in the Communications category) is the former GAIM instant messaging client, and supports a variety of services, as well as plug-ins. What makes Pidgin so nice is that it runs on everything – it really does. There are versions for Windows and Linux, a related client for the Macintosh called Adium (built on the same libpurple library), and a text console version called Finch. What’s not to like?

Wikipedia

Wikipedia (one of the winners in Search and Reference) is an online encyclopedia that you can edit. If you find a mistake, don’t just complain: fix it! I edit regularly – any time I find bad English, I correct it – doing my part to make Wikipedia an excellent resource.

Not only that, but there is also the French Wikipedia or the Russian Wikipedia – or numerous others that could also use your help – even an Esperanto Wikipedia!

Thinkfree Office

How did they miss Thinkfree Office? This is one of my favorite applications, and I use it daily. I bought the Macintosh version ages ago (before web synchronization was as nice as it is now).

Not giving Thinkfree Office a place in the awards is a real mistake.

Data.gov

This is brand new – perhaps just too new for the awards – but the United States government put all the public data they had available onto Data.gov and made it easily available to all. Certainly, it is of most interest to United States citizens – but a lot of the data should be interesting to others as well.

LinkedIn

LinkedIn, to me, is a social networking web site for adults. Professionalism is paramount, and connections can truly be useful and helpful. You can get back in touch with old colleagues and catch up on what they are doing, and more. Not including LinkedIn was a real surprise for me also.

SpiderOak

SpiderOak provides excellent backup service with multi-platform support: Windows, Linux – it’s supported. Old versions of files – and deleted files – can be retrieved from the user interface on whatever platform you are using. Very simple, and very easy.

Toodledo

Toodledo is a To Do List manager: simple, clean, and easy to use. It integrates with iGoogle, with Firefox, and others, along with numerous export and import capabilities. If you are willing to keep your To Do list online (sadly, I wasn’t), this is a must – especially for GTD adherents.

ReadItLater

The Read It Later application is nothing less than brilliant: every time you see a web site you want to read, don’t read it (wasting otherwise productive time); save it and read it later. This is a wonderful idea, and I use it all the time. Now if only I could remember to actually read them…

Wolfram Alpha

WolframAlpha, the new offering from Wolfram, is absolute genius. It is like a fact-based search engine – like a cross between Wikipedia, Google, and the CIA World Factbook – but even that doesn’t cover it all. If it has to do with facts or computation, WolframAlpha can handle it.

And that doesn’t even cover Wolfram’s other offerings, like: WolframTones, free computer-generated tones for your mobile phone; Wolfram Demonstrations, explaining and demonstrating mathematical concepts at all skill levels; Wolfram Mathworld, a one-stop resource on mathematics; and even more!

At one time I seriously considered a career in mathematics; this site is a mathematician’s dream come true…


Backup in depth

In security circles, there is often talk about “defense in depth.” This means that a security system does not rely on a single element to accomplish its goal; the “defense in depth” strategy is a way of removing single points of failure from security mechanisms. That is, if one element in the security infrastructure goes down (such as a firewall failing), other elements will be waiting to prevent an attacker from getting further in.

Backup in depth (my term) is similar. In one environment I was privileged to be in, the database administrator and I worked out a backup plan like this: each database would be backed up on the machine itself (backup #1); this backup would be saved to a location on a central server for up to 30 days (backup #2); and both the database servers and the central repository would be backed up to tape daily (backups #3 and #4). In at least one case, having the database backup on the local disk saved the database administrator from a long, drawn-out restore from tape.
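
The 30-day window on the central server is the sort of rule that is easy to automate. Here is a minimal sketch of how the cleanup decision might look in Python. It is not what we actually ran; the function name, the file names, and the idea of passing in a name-to-timestamp mapping are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_backups(backups: Dict[str, datetime], now: datetime,
                    keep_days: int = 30) -> List[str]:
    """Return the names of backups older than the retention window, oldest first.

    `backups` maps a (hypothetical) backup file name to the time it was taken.
    A backup taken exactly `keep_days` ago is still kept.
    """
    cutoff = now - timedelta(days=keep_days)
    old = [(taken, name) for name, taken in backups.items() if taken < cutoff]
    return [name for taken, name in sorted(old)]
```

The function only decides what has aged out; deleting is left to the caller, which makes it easy to print the list and check it before anything is removed.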

When you are backing up your own personal data, this is also a good procedure to follow. Don’t rely just on tape or a remote site. Back up your data in several ways and in several locations (varying by ease of access and completeness of backup).

You could, for example, save your home directory to SpiderOak (the remote backup facility I mentioned earlier) and a copy to an external USB drive. SpiderOak thus provides the space and deep history, and the external drive provides immediate and fast restores that are not dependent on the Internet.

Virtual environments provide an inherent ability to create a “backup in depth” – the host can be backed up (including the virtual environments) and the virtual environments can do a standard backup.

With multiple backups in place, restoring a file should not be a problem in most cases – or restoring entire directories or systems. Isn’t that worth taking some time to accomplish on your personal machines?

RAID is not a backup!

This post describes the author’s experience of almost losing his data on a RAID disk set. He also gives good details on why RAID is not a backup and how he rectified the situation. Remember: RAID is not a backup!

When working with corporate systems, a complete, reliable, and tested backup system is important. RAID does not protect you against many (or even most) disasters that could happen.

RAID is designed to protect against one thing: disk failure. It does not protect against user error, operator error, site destruction, and many more possibilities.

So how do I back things up? I must admit, I’ve improved my backup strategies of late. I currently have several tools that I use and would recommend to you:

  • SpiderOak. This is an online backup service that offers the first 2 GB of backup space free. They also keep multiple versions of each file, so if you want a file from two versions back, it’ll still be there. This service is worth paying for, I’d say.
  • For my Mac, I’ve used PsyncX periodically (albeit not automated). It has come in handy more than once as my laptop died several times – I’ve one of those iBooks that was notorious for video hardware that failed annually (and Apple would fix for free, but never admitted fault). If you’ve a Mac, get an external drive and use PsyncX to save your home directory off. Also recommended: put your applications in your home directory, not the system directory: restoring your home directory will then be enough to get your applications back.
  • For UNIX, the similar alternative to PsyncX is rsync: again, get an external drive and save your home directory off to it regularly.
  • Also, come at it from the other direction: save your configuration by putting it into a cfengine or puppet setup and saving that as well. If the machine fails, running cfengine or puppet on startup will restore the system to its original state.
  • One other item – that may seem a bit unusual – is using Thinkfree Office. Thinkfree Office gives you a way to save documents locally and have them mirrored in the Internet cloud, and you can manipulate your documents on the web as well. Of course, this only works for documents that Thinkfree Office can edit.
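
Whichever of these tools you use, a copy on an external drive is only useful if it actually matches the original. As a rough sketch (my own, not part of PsyncX or rsync), the following Python compares SHA-256 digests of every file under a source directory with the same file under the backup directory. The function names and the report format are assumptions, and it reads each file fully into memory, so it suits a home directory of documents better than a disk full of video.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def tree_digests(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    base = Path(root)
    digests = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            name = path.relative_to(base).as_posix()
            # Reads the whole file at once; fine for a sketch, slow for huge files.
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def backup_problems(source: str, backup: str) -> List[str]:
    """List files that are missing from the backup or differ from the source."""
    src, dst = tree_digests(source), tree_digests(backup)
    problems = []
    for name, digest in src.items():
        if name not in dst:
            problems.append(f"missing: {name}")
        elif dst[name] != digest:
            problems.append(f"differs: {name}")
    return problems
```

An empty list from `backup_problems` means every file in the source has an identical copy in the backup. Extra files that exist only in the backup are deliberately ignored, since older files lingering there are not a problem for a restore.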

It would seem that cfengine v3 is now available for download – that will have to be a subject for a new article.