In addition to my computer backups, I have a large cache of static files on my NAS. Some of these files are very large; they never change, are rarely added to, and can all be retrieved from elsewhere (either by re-downloading from the internet or re-copying from a physical disk). Losing them would not be a catastrophe, so backing them up is a convenience that saves recreating the cache from scratch. I therefore chose to keep a single off-site copy, made with rsync, on external disks (3 of them, to accommodate all of the files at a sensible size/price point for the drives). The alternative was to add more storage to the NAS to increase the size of the local backup volume, which would in turn have meant buying larger off-site disks.

Create/update backup

These commands mount the external disk (encrypted as a matter of course, in case it is stolen from the off-site location, although nothing on it should be sensitive), run the copy script (which picks the relevant filter based on its own name), unmount the disk and use fsck to check the filesystem’s consistency before closing the encrypted volume and powering off the USB disk:

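DISK_NO=1
SCRIPT_DIR_PATH=/path/to/backup-scripts
# Unlock the encrypted partition (found by its GPT partition label) and mount it
sudo cryptsetup luksOpen "/dev/disk/by-partlabel/VT-BACKUP-DISK-${DISK_NO}" VT-BACKUP
sudo mount /dev/mapper/VT-BACKUP /mnt
# Run the copy, printing the start and end times and the elapsed time
date && time bash "${SCRIPT_DIR_PATH}/copy-disk-${DISK_NO}.bash" ; date
sudo umount /mnt
# Force a filesystem check while the volume is still unlocked
sudo fsck -f /dev/mapper/VT-BACKUP
sudo cryptsetup luksClose /dev/mapper/VT-BACKUP
# Power off the USB disk; adjust /dev/sdb to whichever device the disk appears as
sudo udisksctl power-off -b /dev/sdb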
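
One risk in the sequence above is that if the mount step fails, the copy would be written into the empty /mnt directory on the root filesystem instead of the backup disk. A minimal sketch of a guard, assuming the `mountpoint` utility from util-linux is installed; the function name `require_mounted` is hypothetical and not part of my actual scripts:

```shell
# Hypothetical helper: succeed only if the given directory is a mount point.
require_mounted() {
    local dir=$1
    if ! mountpoint -q "$dir"; then
        echo "Nothing is mounted on $dir; refusing to run the copy." >&2
        return 1
    fi
}

# Usage (same variables as the commands above):
# require_mounted /mnt && bash "${SCRIPT_DIR_PATH}/copy-disk-${DISK_NO}.bash"
```

Chaining the copy behind the check with `&&` means a failed unlock or mount stops the run before rsync touches anything.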

Copy disk script

The script is named copy-disk-x.bash, where ‘x’ is the number of the disk filter to use.

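# Destination: the mounted backup disk
TARGET="/mnt/"

# Take the disk number from the trailing digits of this script's name.
# sed -n ... p prints nothing when the name has no trailing number,
# so the emptiness check below can catch it.
disknum=$( basename "$0" .bash | sed -n 's/^.*[^0-9]\([0-9]\+\)$/\1/p' )
scriptdir="$( dirname "$( realpath "$0" )" )"

# Source: the parent of the scripts directory (trailing slash copies its contents)
SOURCE="$( realpath "$scriptdir/.." )/"

if [[ -z $disknum ]]
then
        echo "Unable to determine disk number from script name." >&2
        exit 1
fi

rsync -avP --delete -f "merge $scriptdir/disk$disknum.filter" "$SOURCE" "$TARGET"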
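
When deriving a number from a script name with sed, it is worth knowing that without `-n` and the `p` flag sed prints non-matching input unchanged, so a name with no trailing digits produces a non-empty result and an emptiness check never fires. A minimal sketch of an extraction that prints nothing on a non-match; the function name `disk_number` is hypothetical, used here only for illustration:

```shell
# Hypothetical helper: print the trailing number of a script name such as
# copy-disk-2.bash, or nothing at all if the name does not end in digits.
# [0-9][0-9]* is the POSIX-portable spelling of "one or more digits".
disk_number() {
    basename "$1" .bash | sed -n 's/^.*[^0-9]\([0-9][0-9]*\)$/\1/p'
}

disk_number /path/to/backup-scripts/copy-disk-2.bash   # prints 2
disk_number copy-disk.bash                             # prints nothing
```

Requiring a non-digit before the captured group also keeps multi-digit numbers whole, since the greedy `.*` can no longer swallow the leading digits.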

Rsync filters

The exact filters, which decide which files go on which disk, were worked out by manual inspection and a bit of du -s to determine how to split the files sensibly to fit on the disks. A copy of all of the scripts and filters was added to each backup (so any of them can be inspected to see which filters were used/which files should be on which disk).
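
The du -s inspection can be sketched roughly as follows. This is an illustration rather than what I actually ran: the function names `filter_for_name` and `t_split_sizes` are hypothetical, and it assumes the entries directly under T are directories whose names do not start with a dot (hidden entries are skipped by the glob).

```shell
# Hypothetical helper: name the filter a top-level entry of T would fall
# under, mirroring an a-m / n-z split on the first letter.
filter_for_name() {
    case $1 in
        [n-zN-Z]*) echo disk2 ;;
        *)         echo disk1 ;;
    esac
}

# Print the total size in KiB that each half of the split would occupy.
t_split_sizes() {
    local t_dir=$1 entry size total1=0 total2=0
    for entry in "$t_dir"/*; do
        [ -e "$entry" ] || continue
        # du -sk prints "<size in KiB><TAB><path>"; keep only the size
        size=$(du -sk -- "$entry" | cut -f1)
        if [ "$(filter_for_name "$(basename "$entry")")" = disk2 ]; then
            total2=$((total2 + size))
        else
            total1=$((total1 + size))
        fi
    done
    printf 'disk1 %s KiB\ndisk2 %s KiB\n' "$total1" "$total2"
}

# Usage: t_split_sizes /path/to/T
```

Moving the letter boundary in `filter_for_name` and re-running gives a quick way to see whether a different split would balance the disks better.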

disk1.filter:

# Prevent rsync trying to delete 'lost+found' folder on backup disk
- /lost+found/
# Include the backup scripts and filters
+ /backup-scripts/
+ /backup-scripts/**
# Exclude directories in 'T' named n-z (in either case);
# the trailing / limits the match to directories
- /T/[n-zN-Z]*/
# Include all the other files in T
+ /T/
+ /T/**
# Exclude everything else
- *

disk2.filter:

# Prevent rsync trying to delete 'lost+found' folder on backup disk
- /lost+found/
# Include the backup scripts and filters
+ /backup-scripts/
+ /backup-scripts/**
# Include 'T' files named n-z (in either case) only
# (i.e. all files in T excluded in disk1.filter)
+ /T/
+ /T/[n-zN-Z]**
# Exclude everything else
- *

disk3.filter:

# Prevent rsync trying to delete 'lost+found' folder on backup disk
- /lost+found/
# Include the backup scripts and filters
+ /backup-scripts/
+ /backup-scripts/**
# Exclude T (backed up onto disks 1 and 2)
- /T/
# Include everything else
+ **
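
Filters like the ones above can be checked before any data is copied by asking rsync for a dry run against an empty directory. A minimal sketch, assuming it is run from the directory that holds backup-scripts and T; the function name `preview_filter` is hypothetical:

```shell
# Hypothetical helper: list what rsync would copy for one filter file,
# without writing anything. Comparing against an empty temporary directory
# makes rsync report every file the filter selects.
preview_filter() {
    local source=$1 filter=$2 empty status
    empty=$(mktemp -d) || return 1
    # -n is a dry run; --out-format='%n' prints one selected path per line
    rsync -an --out-format='%n' -f "merge $filter" "$source/" "$empty/"
    status=$?
    rmdir "$empty"
    return "$status"
}

# Usage:
# preview_filter . backup-scripts/disk1.filter
```

Comparing the listings for the three filters is a quick way to spot a file that would land on no disk, or on more than one.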

Prepare a new drive

The drives arrive formatted with NTFS, and the newest one also had an EFI boot partition. I want a Linux filesystem and encryption, so they need to be repartitioned, have encryption set up, and then have the encrypted volume formatted. Run these commands as the root user:

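DISK_NO=1
# Double-check the device name: everything on this disk is destroyed
DISK_DEVICE=/dev/sdb
# New GPT partition table with a single labelled partition spanning the disk
parted ${DISK_DEVICE} mklabel gpt
parted ${DISK_DEVICE} mkpart VT-BACKUP-DISK-${DISK_NO} ext4 0% 100%
# Encrypt the partition, then unlock it to create the filesystem inside
cryptsetup luksFormat /dev/disk/by-partlabel/VT-BACKUP-DISK-${DISK_NO}
cryptsetup luksOpen /dev/disk/by-partlabel/VT-BACKUP-DISK-${DISK_NO} VT-BACKUP
mkfs.ext4 /dev/mapper/VT-BACKUP
# Hand the root of the new filesystem to the user who runs the copy
mount /dev/mapper/VT-BACKUP /mnt
chown laurence:store /mnt
umount /mnt
cryptsetup luksClose /dev/mapper/VT-BACKUP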
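
After preparing a drive it can be reassuring to confirm that the labelled partition exists and really carries a LUKS header. A minimal sketch, to be run as root like the commands above; the function names `backup_partition` and `check_backup_disk` are hypothetical, while `cryptsetup isLuks` is the standard cryptsetup action for this test:

```shell
# Hypothetical helper: path of the partition for a given disk number,
# following the VT-BACKUP-DISK-<n> partition label convention used above.
backup_partition() {
    echo "/dev/disk/by-partlabel/VT-BACKUP-DISK-$1"
}

# Succeeds only if the labelled partition exists and has a LUKS header.
check_backup_disk() {
    local part
    part=$(backup_partition "$1")
    if [ ! -e "$part" ]; then
        echo "No partition labelled VT-BACKUP-DISK-$1 found." >&2
        return 1
    fi
    cryptsetup isLuks "$part" || {
        echo "$part is not a LUKS volume." >&2
        return 1
    }
}

# Usage: check_backup_disk 1 && echo "Disk 1 looks ready."
```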