This post is about backups, specifically my backup strategy and its latest evolution. Yes, that thing I suspect most of us working in technology think about, do something half-hearted about and then forget about.
We recently had a few power outages in relatively quick succession (3 in the space of a few hours) which managed to kill, in “interesting” (read: not immediately apparent) ways, a number of items of hardware in our house. These included our router (the fault eventually turned out to be a failed SSD, the one holding the OS) and the USB hard disk that my backups resided on. The upshot was that I had no copy of the router’s disk, nor the backups that should have protected against that loss.
Now, I have not had off-site backups for a very long time, not since my backup was small enough to store on a couple of rewritable CDs. This was a risk that, until this happened, I accepted. My thinking had been that a dual hardware failure (taking out both the original and the backup) seemed very unlikely, and that the most plausible situation that might result in a total loss (a burglary where the original machine and backup device were taken, or a fire in our “office” room) was somewhat mitigated by my documents being stored in Dropbox (“other cloud sync-and-share providers are available”). But the total loss of my key machine’s disk together with the failure of the drive holding the backup has rattled me, so I have come round to needing a solution which incorporates both off-site and on-site backups.
I had been toying with creating my own backup solution (ugh!), as I didn’t think there was a solution that would fit what I wanted – however I think I’ve come up with something workable using my existing software that meets all my requirements:
- Both on-site copies (for instant access when restoring) and off-site copies of the backups
- No remote client required – I don’t have root on all the systems I back up
- At least 1 full copy off-site at all times (i.e. no risk of total loss if a disastrous event occurred while the off-site copy was being updated)
- Secure (I don’t trust Dropbox sufficiently to put anything I wouldn’t be okay with becoming public in it – my accounts, for example, are held purely within our home)
- Cost-effective (both in terms of outlay, which largely boils down to being space-efficient with the backup data, and in terms of my time to set it up and manage it, which also has a value)
I’ve long been a user of BackupPC and I really, really like the product. Due to its file-level de-duplication, it is extremely space-efficient if one is backing up more than a single machine with the same operating system, and a lot of thought has gone into making it an efficient and streamlined system (see how it handles compression with a custom zlib-based format to minimise memory overhead on inflation, for example) with no remote client to install. The main reason I had for considering writing my own was that I struggled to find a solution to the off-site backup problem: BackupPC’s de-duplication relies on hard links within the pool, which makes file-level copying of the pool with ‘cp’ or ‘rsync’ a very expensive process.
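To make the hard-link cost concrete, here is a minimal Python sketch. It is not BackupPC code: the `pool` and `pc/<host>` directory names and the file names are a simplified stand-in I have made up for illustration. It shows one pooled file reached through several paths, and the kind of per-inode bookkeeping that any copy preserving hard links has to carry while it walks the tree.

```python
import os
import tempfile


def hardlink_stats(root):
    """Walk root and return (number_of_paths, number_of_distinct_files).

    A copy that preserves hard links must remember every (device, inode)
    pair it has already seen, which is what the `seen` set models here.
    """
    seen = set()
    paths = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            paths += 1
            seen.add((st.st_dev, st.st_ino))
    return paths, len(seen)


def demo():
    """Build a tiny hypothetical pool: one file, hard-linked from 3 hosts."""
    with tempfile.TemporaryDirectory() as root:
        pool = os.path.join(root, "pool")
        os.makedirs(pool)
        pooled = os.path.join(pool, "abc123")  # made-up pool entry name
        with open(pooled, "w") as handle:
            handle.write("identical contents on every host\n")
        for host in ("host1", "host2", "host3"):  # hypothetical hosts
            host_dir = os.path.join(root, "pc", host)
            os.makedirs(host_dir)
            # Each host's backup tree points at the same inode as the pool.
            os.link(pooled, os.path.join(host_dir, "motd"))
        return hardlink_stats(root)


if __name__ == "__main__":
    paths, files = demo()
    print(f"{paths} paths, {files} distinct file(s) on disk")
```

In this toy tree there are four paths but only one file's worth of data. Scale that up to a real pool with millions of paths and the table of seen inodes is what makes a link-preserving file-level copy so heavy.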
While replacing the failed hardware, I decided to take the plunge and replace my NAS device too. It is a little over 8 years old, reached the limit of its capacity (2TB) some time ago and cannot be expanded further. It’s a little Netgear ReadyNAS that has proven itself to be very reliable, so I replaced it with a newer-generation model with some bigger disks. The new one supports creating iSCSI targets, as well as the usual myriad of file-based access methods, so I have created a new BackupPC iSCSI LUN, which I’ve partitioned, formatted with ext4 and mounted on my Debian box for BackupPC to use. While musing about this, and the single point of failure I introduce by using the NAS shares for file storage and the iSCSI LUN for backups on the same device, this solution popped into my head over lunch today:
What I plan to do is convert the plain partition on the iSCSI target to an LVM logical volume. I can then snapshot this (the ReadyNAS supports manual snapshots of the iSCSI LUN, but these appear to have to be initiated by hand through the web interface, making them hard to script) and use ‘dd’ to duplicate the point-in-time snapshot to a USB disk of the same capacity as the LUN for off-site backup, without the overhead of cp/rsync having to track all the hard links in memory. By rotating 2 disks for the off-site backups I meet my “at least one full copy off-site at any one time” requirement. Thanks to falling disk costs and increased density, having 2 USB disks for off-site backup is now doable for under £100 and in a 2.5″ form factor, which wasn’t possible when I last set up BackupPC.
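The snapshot-then-dd plan can be sketched as a small Python helper that only builds the commands, so they can be reviewed before anything is run as root. This is a rough outline under assumptions, not a tested procedure: the volume group `vg_backup`, logical volume `backuppc`, snapshot name, 10G copy-on-write size and the USB device path `/dev/sdX` are all placeholders I have invented. The `lvcreate`, `dd` and `lvremove` options used are standard ones.

```python
import shlex


def offsite_copy_commands(vg, lv, usb_device,
                          snap_name="backuppc_snap", snap_size="10G"):
    """Return the commands for one off-site copy, in the order to run them.

    All arguments are placeholders: vg/lv name the LVM volume BackupPC
    lives on, usb_device is the whole USB disk going off-site, and
    snap_size is the copy-on-write space for changes made during the copy.
    """
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap_name}"
    return [
        # 1. Freeze a point-in-time view of the BackupPC volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # 2. Block-level copy of the snapshot; no hard links to track.
        ["dd", f"if={snapshot}", f"of={usb_device}", "bs=4M", "conv=fsync"],
        # 3. Drop the snapshot so it stops consuming copy-on-write space.
        ["lvremove", "--yes", snapshot],
    ]


if __name__ == "__main__":
    # Dry run: print the commands rather than executing them.
    for command in offsite_copy_commands("vg_backup", "backuppc", "/dev/sdX"):
        print(shlex.join(command))
```

Printing instead of executing is deliberate, since a mistyped `of=` target would overwrite the wrong disk. The snapshot only needs enough space to absorb writes made while the copy runs, and it should be removed afterwards so it does not fill up.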