I finally got to spend some time on how to back up my machines: my home machines and my laptop. This was triggered when I got back from a week in Egypt and my LaCie NAS wouldn't start. I finally did manage to get it started by connecting through USB and whacking it with a hammer. However, this triggered something I've been thinking about for a while - to buy a backup disk (NAS) with RAID 1 and implement a "proper" backup strategy.
I decided on the QNAP TS-209 Pro II, which is Linux-based. This device supports loads of stuff (DLNA, FTP, web server, DDNS, printers, MySQL etc), but I will at least initially only use the file server bit (Samba). I do not want a full-blown PC for this, since I want something simple, which runs all the time with low energy usage, doesn't produce too much heat, doesn't expect a monitor etc. So, a NAS seems just about the right thing for me. This unit was also simple to install and set up. Here's a simplification of my environment:
- My "main" machine is my desktop machine.
- I also have a laptop, which is what I'm using when I'm at customer sites and when I do training.
- I have a USB drive, mainly used with the above laptop.
- And of course the NAS.
- There are other machines as well (including USB disks), but I won't discuss them here since they don't change the basic principles.
A long time ago I decided on the concept of ownership, i.e., each folder is owned by one particular machine.
Some 3-4 years ago, I had a disk crash which made me lose two weeks' worth of email. I decided that I cannot rely on manually copying folders around. Perhaps my most important folder is a "document" folder (where I, among other things, have the Outlook PST file) on my desktop machine. I wrote a .NET console program which reads some config info from a local SQL Server and then creates a folder with the current date as its name and copies the content of my "document" folder to this new folder. After this is done, it removes every folder older than 1 week - except for folders created on day 1 of the month. I put this in the startup group. This has served me well for this "document" folder - mainly because it is small in size (some 200 MB). But over time, things have become a bit more complicated. A few examples:
- I have a "courseFiles" folder on my laptop and this should be owned by the laptop. I might for instance modify a demo file when I'm doing training.
- I have over time realized that some stuff is too large to have in the "documents" folder, like SQL Server videos I've produced.
- I have virtual machines. I can't have them copied every time I start my machine, and I can't keep many, many generations of them.
- I have ghost images which include virtual machines. The same applies as above.
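The generation-style backup of the "document" folder described above can be sketched roughly as follows. This is not my .NET program, just a minimal Python illustration of the same rule. The function names, the ISO date format for the folder names and the exact reading of "older than 1 week" (strictly more than 7 days) are assumptions made for the example, and the config lookup in SQL Server is left out.

```python
import shutil
from datetime import date, timedelta
from pathlib import Path


def generations_to_remove(folder_names, today, keep_days=7):
    """Return the dated folder names that the retention rule would delete."""
    cutoff = today - timedelta(days=keep_days)
    doomed = []
    for name in folder_names:
        try:
            stamp = date.fromisoformat(name)  # assumed naming: YYYY-MM-DD
        except ValueError:
            continue  # not one of the dated folders; leave it alone
        # Remove generations older than the cutoff, but always keep
        # the ones created on day 1 of a month.
        if stamp < cutoff and stamp.day != 1:
            doomed.append(name)
    return doomed


def run_generation_backup(source, backup_root, today=None):
    """Copy source into a folder named after today's date, then prune."""
    today = today or date.today()
    source, backup_root = Path(source), Path(backup_root)
    target = backup_root / today.isoformat()
    if not target.exists():  # only one generation per day
        shutil.copytree(source, target)
    names = [p.name for p in backup_root.iterdir() if p.is_dir()]
    for name in generations_to_remove(names, today):
        shutil.rmtree(backup_root / name)
```

Keeping the retention rule in its own function makes it easy to check which generations would be deleted before anything is actually removed.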
So it was time to expand on my simple "documents" backup solution. I have some very important requirements:
- I can reinstall OS and applications. I already have a small script which produces a file with what apps are installed (autostarted) so I know what to reinstall if I have to whack a machine. So, no backup of binary files.
- I don't want to virtualize everything. I don't feel like paying the penalty for it. I'm too pedantic, so I know I can spend days just to get some little thing working in a virtual environment - and I don't have that time. Also, I don't see how virtualization would change anything. I would still have "productivity OSs" where I have important files which I need to back up.
- A backup needs to work with plain files, in the same folder structure as the source. I.e., I don't want to rely on some backup app whenever I need to restore. Nor do I want some "diff" strategy only to realize that my base is corrupted.
- A folder has an owner. This is the machine which owns the folder. For instance, the "documents" folder is owned by my desktop machine - any changes made in that folder on my laptop should be discarded and disappear.
- The NAS is the backup station. This is where all backups go. Some folders I also want on some other machine ("documents" on laptop, "courseFiles" on desktop) but such a folder should be treated as read-only.
- I want my backup files distributed and independent of each other. I.e., I do NOT want a distributed system which fails if one machine is lost (think RAID 0). What I DO want is a distributed system which can survive several failures - like the house burning down and several machines lost (think distributed RAID 1 with several mirrors). Now, don't mistake my RAID analogy for some real-time replication solution - since I want to be able to find an older version of a file on some backup machine if I happen to destroy the owning file.
- Having only one machine (laptop) is not an option for me.
I found a backup program which suits my needs: SyncBack from http://www.2brightsparks.com/. This has a lot of features and functionality, and it allows the level of customization that I need. Here's how I have created my backup definitions (one per folder):
- I have three root folders on my NAS "backup" share. One per source (desktop, laptop and laptop USB disk).
- On each owning machine, I create one backup job per folder that the machine owns.
- For some folders, I also create a "downstream" backup. Say for instance the "courseFiles" folder. This is owned by the laptop, but I want to have a copy available on my desktop machine as well. Of course I have an "upstream" backup definition from my laptop to the NAS. But I also have a "downstream" job from the NAS to the desktop machine.
- All backup definitions are defined as source always wins (not newest file), and delete a file if it exists on the target but not on the source. This is important since it implements pure one-way "replication".
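To make the "source always wins, delete what is missing on the source" rule concrete, here is a minimal Python sketch of such a one-way mirror. It only illustrates the rule; it is not how SyncBack works internally, and the function name `mirror` is made up for the example. Note that it never looks at timestamps: a file edited on the target is simply overwritten.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source, target):
    """Make target an exact copy of source; the source always wins."""
    source, target = Path(source), Path(target)
    target.mkdir(parents=True, exist_ok=True)
    source_names = {p.name for p in source.iterdir()}

    # Delete anything on the target that does not exist on the source.
    for item in target.iterdir():
        if item.name not in source_names:
            if item.is_dir() and not item.is_symlink():
                shutil.rmtree(item)
            else:
                item.unlink()

    # Copy from source to target, ignoring which side is newer.
    for item in source.iterdir():
        dest = target / item.name
        if item.is_dir():
            if dest.exists() and not dest.is_dir():
                dest.unlink()  # a file is in the way of a folder
            mirror(item, dest)
        else:
            if dest.is_dir():
                shutil.rmtree(dest)  # a folder is in the way of a file
            # Skip the copy only when the contents are already identical.
            if not dest.exists() or not filecmp.cmp(item, dest, shallow=False):
                shutil.copy2(item, dest)
```

With something like this, an "upstream" job is a call with the owning machine's folder as source and the NAS folder as target, and a "downstream" job is the same call with the NAS folder as source and the read-only copy on the other machine as target.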
It took me a couple of hours to learn the backup app and set up the jobs. (It took even longer to do the initial copying of folders to my NAS (something I did before setting up the backup jobs) - but this was only because of the sheer volume.) But thanks to SyncBack (complemented with my own small applet for generation-type backups) I now have a backup solution which is easy to understand and hopefully can survive multiple failures.