Wednesday, 25 February 2015

Occasional Rsnapshot v1.3.0

It is almost exactly 1 year and a half since I came up with the idea of having a way of making backups using Rsnapshot automatically triggered by my laptop when I have the backup media connected to my laptop. This could mean connecting a USB drive directly to the laptop or mounting a NFS/sshfs share in my home network. Today I tagged Occasional Rsnapshot the v1.3.0 version, the first released version that makes sure even when you connect your backup media occasionally, your Rsnapshot backups are done if and when it makes sense to do it, according to the rsnapshot.conf file and the status of the existing backups on the backup media.

Quoting from the README, here is what Occasional Rsnapshot does:

This is a tool that allows automatic backups using rsnapshot when the external backup drive or remote backup media is connected.

Although the ideal setup would be to have periodic backups on a system that is always online, this is not always possible. But when the connection is done, the backup should start fairly quickly and should respect the daily/weekly/... schedules of rsnapshot so that it accurately represents history.

In other words, if you backup to an external drive or to some network/internet connected storage that you don't expect to have always connected (which is is case with laptops) you can use occasional_rsnapshot to make sure your data is backed up when the backup storage is connected.

occasional_rsnapshot is appropriate for:
  • laptops backing up on:
    • a NAS on the home LAN or
    • a remote or an internet hosted storage location
  • systems making backups online (storage mounted locally somehow)
  • systems doing backups on an external drive that is not always connected to the system
The only caveat is that all of these must be mounted in the local file system tree somehow by any arbitrary tool, occasional_rsnapshot or rsnapshot do not care, as long as the files are mounted.

So if you find yourself in a simillar situation, this script might help you to easily do backups in spite of the occasional availability of the backup media, instead of no backups at all. You can even trigger backups semi-automatically when you remember to or decide is time to backup, by simply pulging in your USB backup HDD.

But how did I end up here, you might ask?

In December 2012 I was asking about suggestions for backup solutions that would work for my very modest setup with Linux and Windows so I can backup my and my wife's system without worrying about loss of data.

One month later I was explaining my concept of a backup solution that would not trust the backup server, and leave to the clients as much as possible the decision to start the backup at their desired time. I was also pondering on the problems I might encounter.

From a security PoV, what I wanted was that:
  1. clients would be isolated from each other
  2. even in the case of a server compromise:
    • the data would not be accessible since they would be already encrypted before leaving the client
    • the clients could not be compromised


The general concept was sane and supplemental security measures such as port knocking and initiation of backups only during specific time frames could be added.

The problem I ran to was that when I set up this in my home network a sigle backup cycle would take more than a day, due to the fact that I wanted to do backup of all of my data and my server was a humble Linksys NSLU2 with a 3TB storage attached on USB.

Even when the initial copy was done by attaching the USB media directly to the laptop, so the backup would only copy changed data, the backup with the HDD attached to the NSLU2 was not finished even after more than 6 hours.

The bottleneck was the CPU speed and the USB speed. I tried even mounting the storage media over sshfs so the tiny xscale processor in the NSLU2 would not be bothered by any of the rsync computation. This proved to an exercise in futility, any attempt to put the NSLU2 anywhere in the loop resulted in an unacceptable and impractically long backup time.

All these attempts, of course, took time, but that meant that I was aware I still didn't have appropriate backups and I wasn't getting closer to the desired result.

So this brings us August 2013, when I realized I was trying to manually trigger Rsnapshot backups from time to time, but having to do all sorts of mental gymnastics and manual listing to evaluate if I needed to do monthly, weekly and daily backups or if weekly and daily was due.

This had to stop.
Triggering a backup should happen automatically as soon as the backup media is available, without any intervention from the user.
I said.

Then I came up with the basic concept for Occasional Rsnapshot: a very silent script that would be called from  cron every 5 minutes, would check if the backup media is mounted, if is not, exit silently to not generate all sorts of noise in cron emails, but if mounted, compute which backup intervals should be triggered, and trigger them, if the appropriate amount of time passed since the most recent backup in that backup interval.

Occasional Rsnapshot version v1.3.0 is the 8th and most recent release of the script. Even if I used Occasional Rsnapshot since the day 1, v1.3.0 is the first one I can recommend to others, without fearing they might lose data due to it.

The backup media can be anything, starting from your regular USB mounted HDD, your sshfs mounted backup partition on the home NAS server to even a remote storage such as Amazon S3 online storage, and there are even brief instructions on how to do encrypted backups for the cases where you don't trust the remote storage.

So if you think you might find anything that I described remotely interesting, I recommend downloading the latest release of Occasional Rsnapshot, go through the README and trying it out.

Feedback and bug reports are welcome.
Patches are welcomed with a 'thank you'.
Pull requests are eagerly waited for :) .

No comments: