Until now I have been using SaltStack to apply configuration, although in some cases that means removing default settings. In my new home lab I have deployed systems by doing bare-metal restores from live-system backups. Predominantly due to hardware differences, some of the configuration SaltStack applies to the live systems has to be undone to make the lab clones work. I think of this as “anti-configuration-management”.

UPS/NUT

The live network is protected by a UPS, which is not present in the lab, so NUT is loudly complaining that it cannot talk to it.

To fix this, I simply configured SaltStack to remove the packages (leaving the configuration behind is possibly untidy, but I do not see it as a problem as it keeps the lab clone a closer copy of the live system):

remove-nut-packages:
  pkg.removed:
    - pkgs:
      - nut-server
      - nut-client

I applied this by creating a roles.remove-nut state, which I use within the lab in place of the role that adds and configures NUT on the live network.
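
As a rough illustration, the lab’s top.sls might swap this state in for the live network’s NUT role. The target pattern and the roles.nut name below are hypothetical, since the actual role-to-state mapping is not part of this post:

# Illustrative lab top.sls entry only - adjust targets and role names to suit
base:
  '*':
    - roles.remove-nut   # lab: remove the NUT packages
    # - roles.nut        # live network: install and configure NUT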

BackupPC

In the live network, BackupPC has used an iSCSI target on my NAS for storage since 2018. The lab network uses one of the USB “off-site” backups as its source, which is closer to how the old live system worked. As the old NAS in my lab doesn’t support iSCSI, in order to replace my temporary backup-restore system with a clone of the live system I need to reconfigure the cloned system to use the USB drive instead of the iSCSI target.

SaltStack configuration management

Because this only affects how the BackupPC store gets mounted at /var/lib/backuppc, no changes were needed to any of the states managing backuppc itself. I added a new state to create the extra directories needed to work with OverlayFS:

{% for (dir, data) in salt['pillar.get']('overlay-fs-dirs', {}).items() %}
overlayfs-directory-{{ dir }}:
  file.directory:
    - name: {{ dir }}
    - user: {{ data.get('user', 'root') }}
    - group: {{ data.get('group', 'root') }}
    - dir_mode: {{ data.get('mode', '0o755') }}
{% endfor %}

My existing ‘extra-mounts’ state remained unchanged:

{% set mount_info = salt['pillar.get']('extra-mounts', {}) %}
{% for (mount_point, data) in mount_info.items() %}
mount-{{ mount_point }}:
  mount.mounted:
    - name: {{ mount_point }}
    - device: {{ data.device }}
    - fstype: {{ data.fstype }}
    - dump: {{ data.get('dump', 0) }}
    - pass_num: {{ data.get('pass', 2) }}
  {% if 'options' in data %}
    - opts:
    {%- for option in data.options %}
      - {{ option }}
    {%- endfor %}
  {% endif %}
    - persist: {{ data.get('persist', True) }}
    - mount: {{ data.get('mount', True) }}
  {% if 'extra-requires' in data %}
    - require:
    {%- for req in data['extra-requires'] %}
      - {{ req }}
    {%- endfor %}
  {% endif %}
  {% if 'extra-require-in' in data %}
    - require_in:
    {%- for req in data['extra-require-in'] %}
      - {{ req }}
    {%- endfor %}
  {% endif %}
    - extra_mount_invisible_keys:
# Workaround for SaltStack issues #45136 and #46178 - fixed 27 Feb 2018 (MR #46175), fixed in versions >2017.7.5 and >2018.3.1 (Debian stretch currently at 2016.11.2, 2018-11-25)
      - credentials
      - retry
      - secretfile
# End Workaround
      - noauto

  {% if 'make-mountpoint' in data %}
mountpoint-{{ mount_point }}:
  file.directory:
    - name: {{ mount_point }}
    - user: {{ data['make-mountpoint'].user }}
    - group: {{ data['make-mountpoint'].group }}
    - mode: {{ data['make-mountpoint'].mode }}
    - makedirs: True
    {% if 'require' in data['make-mountpoint'] %}
    - require:
      {%- for req in data['make-mountpoint'].require %}
      - {{ req }}
      {%- endfor %}
    {% endif %}
    - require_in:
      - mount: mount-{{ mount_point }}
  {% endif %}
{% endfor %}

A new pillar file, covering the read-only mount and the overlay on top of it, completes the change:

roles:
  overlayfs: True

overlay-fs-dirs:
  /tmp/backuppc-rw:
    user: backuppc
    group: backuppc
  /tmp/backuppc-overlay-work:
    user: root
    group: root

extra-mounts:
  /media/backuppc-ro:
    device: /dev/mapper/backuppc-store
    fstype: ext4
    persist: false
    make-mountpoint:
      user: backuppc
      group: backuppc
      mode: 0o750
    options:
      - noauto
      - ro
  /var/lib/backuppc:
    device: overlay
    fstype: overlay
    extra-require-in:
      - service: backuppc-server
    extra-requires:
      - mount: /media/backuppc-ro
      - file: /tmp/backuppc-rw
      - file: /tmp/backuppc-overlay-work
      - pkg: backuppc-server
    options:
      - lowerdir=/media/backuppc-ro
      - upperdir=/tmp/backuppc-rw
      - workdir=/tmp/backuppc-overlay-work
      - index=on
      - metacopy=on
      - noauto

noauto is unnecessary with persist: false (since the option only has meaning in fstab) but it’s included for completeness and to protect the system (it won’t try to mount on boot) if the persist option were not respected or (inadvertently) removed and the mount ended up in /etc/fstab.
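
For reference, the overlay entry in this pillar has Salt do the rough equivalent of the following manual mount (noauto is dropped here because, as noted above, it only matters in fstab):

mount -t overlay overlay \
    -o lowerdir=/media/backuppc-ro,upperdir=/tmp/backuppc-rw,workdir=/tmp/backuppc-overlay-work,index=on,metacopy=on \
    /var/lib/backuppc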

Unlocking the USB drive is still a manual affair, using the device name backuppc-store for the unlocked volume to match the configuration of the iSCSI device from the live network. Despite this, I have deliberately put mounting it into the configuration management, as it causes a failure that I can detect when running SaltStack against my systems if the device is unavailable.

To unlock, manually (in read-only mode):

cryptsetup luksOpen --readonly /dev/disk/by-partlabel/backuppc-offsite[0-9] backuppc-store

I am thinking about deploying something like tang (Red Hat documentation) and/or HashiCorp Vault to provide these credentials in a way that still protects my data - the purpose of the encryption being to not have to worry (too much) about disposal of failed disks, rather than to prevent unauthorised access when attached to the local network.
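
If I do go down the tang route, binding the off-site disks would presumably follow the usual clevis workflow, along these lines (the tang URL and the exact partition label are placeholders - this is a sketch of the standard tooling, not something I have set up):

# Bind an extra LUKS key slot to the tang server (the passphrase remains as a fallback)
clevis luks bind -d /dev/disk/by-partlabel/backuppc-offsite1 tang '{"url": "http://tang.example.lan"}'

# Unlock via the tang server instead of typing the passphrase
clevis luks unlock -d /dev/disk/by-partlabel/backuppc-offsite1 -n backuppc-store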

Mount/umount scripts

In the meantime, I added four convenience scripts, pushed out via SaltStack, to mount/unmount either the iSCSI or the local (off-site/DR) devices, including the commands to unlock the encryption. The scripts still prompt for the passphrase, so interactive use is still required, but it means I no longer have to keep copying-and-pasting the iSCSI commands from my previous blog post (or, now, this one for the USB disk in the lab) when the system is restarted.

/usr/local/sbin/backuppc-mount-iscsi

#!/bin/bash

set -eufo pipefail

if [[ ! $UID -eq 0 ]]
then
    echo "This command must be run as root." >&2
    exit 1
fi

MOUNT_POINT=/var/lib/backuppc

if grep -q "$MOUNT_POINT" /proc/mounts
then
    echo "$MOUNT_POINT already mounted."
    exit
else
    if iscsiadm --mode session | grep -q backuppc
    then
        echo "iscsi already logged in"
    else
        # Login to iSCSI
        iscsiadm --mode node --targetname "iqn.1994-11.com.netgear:isolinear:6349f3fd:backuppc" --login
        # Wait to give device chance to appear
        sleep 1
    fi

    if [[ -e /dev/mapper/backuppc-pv ]]
    then
        echo "/dev/mapper/backuppc-pv already exists (encrypted volume already unlocked)"
    else
        # Open encrypted filesystem (see lsblk to locate the filesystem)
        cryptsetup luksOpen /dev/disk/by-partlabel/BackupPC backuppc-pv

        # Give device chance to appear
        sleep 1
    fi

    # LVM will automagically have found the volume group and logical volume,
    # so it can just be mounted (assuming /etc/fstab is correct)
    mount "$MOUNT_POINT"
fi

# Try and start the service, if not running
if systemctl is-active backuppc &>/dev/null
then
    echo "BackupPC already running"
else
    systemctl start backuppc
fi

echo "$MOUNT_POINT successfully mounted and BackupPC service started."

/usr/local/sbin/backuppc-umount-iscsi

#!/bin/bash

set -eufo pipefail

if [[ ! $UID -eq 0 ]]
then
    echo "This command must be run as root." >&2
    exit 1
fi

MOUNT_POINT=/var/lib/backuppc

if systemctl is-active backuppc &>/dev/null
then
    systemctl stop backuppc
else
    echo "BackupPC not running"
fi

if grep -q "$MOUNT_POINT" /proc/mounts
then
    umount "$MOUNT_POINT"
else
    echo "$MOUNT_POINT not mounted."
fi

if vgdisplay | grep -q backuppc
then
    # Deactivate the LVM volume group
    vgchange -a n backuppc
fi

if [ -e /dev/mapper/backuppc-pv ]
then
    # Close the encrypted volume
    cryptsetup luksClose backuppc-pv
fi

if iscsiadm --mode session | grep -q backuppc
then
    # Logout with the iSCSI server (target)
    iscsiadm --mode node --targetname "iqn.1994-11.com.netgear:isolinear:6349f3fd:backuppc" --logout
fi

echo "$MOUNT_POINT fully unmounted and all connections/encrypted volumes closed."

/usr/local/sbin/backuppc-mount-usb-dr

#!/bin/bash

# Globbing is required to match /dev/disk/by-partlabel/backuppc-offsite[0-9], so -f (noglob) is not set
set -euo pipefail

if [[ ! $UID -eq 0 ]]
then
    echo "This command must be run as root." >&2
    exit 1
fi

MOUNT_POINT=/var/lib/backuppc
OVERLAY_RW=/tmp/backuppc-rw
OVERLAY_WORK=/tmp/backuppc-overlay-work

if grep -q "$MOUNT_POINT" /proc/mounts
then
    echo "$MOUNT_POINT already mounted."
    exit
else
    if [[ -e /dev/mapper/backuppc-store ]]
    then
        echo "/dev/mapper/backuppc-store already exists (encrypted volume already unlocked)"
    else
        # Open encrypted filesystem (see lsblk to locate the filesystem)
        cryptsetup luksOpen --readonly /dev/disk/by-partlabel/backuppc-offsite[0-9] backuppc-store

        # Give device chance to appear
        sleep 1
    fi

    if [[ -d "$OVERLAY_RW" ]]
    then
        echo "WARNING: removing stale $OVERLAY_RW" >&2
        rm -rf "$OVERLAY_RW"
    fi
    if [[ -d "$OVERLAY_WORK" ]]
    then
        echo "WARNING: removing stale $OVERLAY_WORK" >&2
        rm -rf "$OVERLAY_WORK"
    fi

    # Create new overlay directories (so will start from read-only DR copy's point in time)
    mkdir "$OVERLAY_RW"
    mkdir "$OVERLAY_WORK"
    chown backuppc:backuppc "$OVERLAY_RW"

    if [[ -d /media/backuppc-ro ]]
    then
        echo "Read-only mountpoint exists."
    else
        mkdir /media/backuppc-ro
    fi

    mount -o ro /dev/mapper/backuppc-store /media/backuppc-ro

    # Assuming /etc/fstab is correct
    mount "$MOUNT_POINT"
fi

# Try and start the service, if not running
if systemctl is-active backuppc &>/dev/null
then
    echo "BackupPC already running"
else
    systemctl start backuppc
fi

echo "$MOUNT_POINT successfully mounted (read-only with read-write overlay) and BackupPC service started."

/usr/local/sbin/backuppc-umount-usb-dr

#!/bin/bash

# Globbing is required to match the USB disk by partition label, so -f (noglob) is not set
set -euo pipefail

if [[ ! $UID -eq 0 ]]
then
    echo "This command must be run as root." >&2
    exit 1
fi

MOUNT_POINT=/var/lib/backuppc
OVERLAY_RW=/tmp/backuppc-rw
OVERLAY_WORK=/tmp/backuppc-overlay-work

if systemctl is-active backuppc &>/dev/null
then
    systemctl stop backuppc
else
    echo "BackupPC not running"
fi

if grep -q "$MOUNT_POINT" /proc/mounts
then
    umount "$MOUNT_POINT"
else
    echo "$MOUNT_POINT not mounted."
fi

if grep -q /media/backuppc-ro /proc/mounts
then
    umount /media/backuppc-ro
else
    echo "Read-only mountpoint not mounted."
fi

# Destroy the overlay filesystems - for DR we do not want to preserve
# anything BackupPC might have written over the pristine snapshot.
rm -rf "$OVERLAY_WORK"
rm -rf "$OVERLAY_RW"

# Give everything chance to settle, to avoid 'device busy' errors from
# device-mapper
sleep 1

# Close the encrypted volume
if [ -e /dev/mapper/backuppc-store ] 
then
    cryptsetup luksClose backuppc-store
else
    echo "Encrypted volume not open."
fi

# Power off the USB drive
[ -e /dev/disk/by-partlabel/backuppc-offsite[0-9] ] && udisksctl power-off -b /dev/disk/by-partlabel/backuppc-offsite[0-9]

echo "$MOUNT_POINT fully unmounted, all overlays destroyed and encrypted volumes closed."

Final thought on BackupPC DR

I am considering whether Ansible might be better for setting up the DR read-only mounts (and installing BackupPC). With a clone of the necessary playbooks placed alongside the backups on the off-site disks, they would be more self-contained as bare-metal DR recovery media. It could make the only prerequisites a supported PC with Ansible and the standard encryption tools installed, along with the passphrase to unlock the encrypted off-site copy. Everything else could be installed and configured by Ansible when the playbook is run.
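
As a sketch of what I have in mind (the module names are standard Ansible ones, but the host target and the task list are hypothetical and incomplete), a playbook stored alongside the backups might look something like:

# Hypothetical DR playbook kept on the off-site disk with the backups
- hosts: dr-recovery-host
  become: true
  tasks:
    - name: Install BackupPC
      ansible.builtin.apt:
        name: backuppc
        state: present

    # (tasks to unlock the LUKS volume and mount /media/backuppc-ro read-only would go here)

    - name: Create the overlay upper and work directories
      ansible.builtin.file:
        path: "{{ item.path }}"
        state: directory
        owner: "{{ item.owner }}"
        group: "{{ item.owner }}"
      loop:
        - { path: /tmp/backuppc-rw, owner: backuppc }
        - { path: /tmp/backuppc-overlay-work, owner: root }

    - name: Mount the overlay over /var/lib/backuppc
      # Note: state 'mounted' also records the mount in fstab, unlike persist: false in the Salt setup
      ansible.posix.mount:
        path: /var/lib/backuppc
        src: overlay
        fstype: overlay
        opts: lowerdir=/media/backuppc-ro,upperdir=/tmp/backuppc-rw,workdir=/tmp/backuppc-overlay-work
        state: mounted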

The idea of this appeals to me, having gone through the pain of setting up SaltStack from scratch in order to get configuration management back before starting to (re)configure systems for recovery - Ansible seems a much stronger product from a DR perspective. Although something similar could probably be achieved with salt-ssh, it would still require more work to perform DR than an Ansible playbook approach.