RAID is not a backup!

This post describes the authors experience, almost losing his data on a RAID disk set. He also gives good details on why RAID is not a backup and how he rectified the situation. Remember: RAID is not a backup!

When working with corporate systems, a complete, reliable, and tested backup system is important. RAID does not protect you against many (or even most) disasters that could happen.

RAID is designed to protect against one thing: disk failure. It does not protect against user error, operator error, site destruction, and many more possibilities.

So how do I back things up? I must admit, I’ve improved my backup strategies of late. I currently have several tools that I use and would recommend to you:

  • SpiderOak. This is an online backup service which offers the first 2Gb backup free. They also maintain multiple version backup, so if you want a file from two versions back, it’ll still be there. This service is worth paying for, I’d say.
  • For my Mac, I’ve used PsyncX periodically (albeit not automated). It has come in handy more than once as my laptop died several times – I’ve one of those iBooks that was notorious for video hardware that failed annually (and Apple would fix for free, but never admitted fault). If you’ve a Mac, get an external drive and use PsyncX to save your home directory off. Also recommended: put your applications in your home directory, not the system directory: restoring your home directory will then be enough to get your applications back.
  • For UNIX, the similar alternative to PsyncX is rsync: again, get an external drive and save your home directory off to it regularly.
  • Also, come at it from the other direction: save your configuration by putting it into a cfengine or puppet setup and saving that as well. If the machine fails, running cfengine or puppet on startup will restore the system to its original state.
  • One other item – that may seem a bit unusual – is using Thinkfree Office. Thinkfree Office gives you a way to save documents locally and have them mirrored in the Internet cloud – and you can also manipulate your documents on the web as well. Of course, this is only entirely true for documents that Thinkfree Office can edit.

It would seem that cfengine v3 is now available for download – that will have to be a subject for a new article.

JournalSpace Dies by Data Loss

The blogging site JournalSpace has been shut down after there was significant data loss without backups. The entrepeneur’s blog has more information – apparently, the most likely cause seems to be sabotage by a former IT staff person, combined with the lack of working backups.

What can we learn from this unfortunate incident? There are a number of things to note here:

  • Remove all access for former staff in its entirety – don’t skimp! All access, passwords, server access, everything. Lock it down. If you have only one IT staffer, you are also at risk: you need to be able to call on someone who can lock out your fired (or laid off) employee completely.
  • Disk RAID is not a backup solution. RAID protects you from disk failure, not from “data failure” or operator mistakes. Do not forget to have a complete backup solution in place. It also pays to enable a “hot spare” so that if one of the disks fail, that there is still protection from disk loss.
  • Have a backup solution. You must have a comprehensive backup plan working, tested, and implemented.
  • Have a working backup solution. This point cannot be stressed enough: Test your backup solution before you need to use it! When the data is gone is no time to realize that the backups are useless. Test your backups in real-world scenarios as well: one story described a backup solution that was well-tested, then the tapes went off-site in the operator’s car. Unfortunately, sitting in the car caused the tapes to be demagnetized and this was realized only after the data was gone. Test those backups!

The dreadful story of JournalSpace might have had a different ending if they had only tested their backups: that alone would have saved them. However, solutions (like security) should be in depth: working backups might not be enough next time.