In the exciting* conclusion to using Ansible to find dynamic client IPs and to solve the PXE/iPXE/Debian installer/dropbear/OS identity crisis, I use that work to automate the post-install configuration of systems installed with my generic Debian install preseed. In this post I go through the install Ansible playbook (including some tweaks to the Debian installer preseed file), the post-install configuration (bootstrap) playbook, the remote unlocking playbook and the reinstall playbook.

(*other opinions are available)

Install playbook

This playbook (which I called install.yaml) installs a new host. Currently it expects the host to be manually turned on. It presumes that the install is fully automated and that, once complete, the host will boot into the new OS with SSH listening. In the future, I intend to make the PXE configuration more dynamic (setting the specific host to PXE boot into the automated install, for example) which will result in some changes being needed to this.

For now I have manually set my development host to PXE boot directly into the pre-seeded Debian install via a MAC-specific override, per my previous posts on targeted PXE booting and improved iPXE configuration. As this is destructive, and the host-specific change to the DHCP configuration alters the server's normal behaviour, I have crafted the playbook to only run if the special variable INSTALL_HOSTS is defined (it is used as the target host or hosts for the playbook). I have not set this variable anywhere in the inventory; it has to be explicitly specified via ansible-playbook's -e command line option.
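
For example, a run might be launched like this (the inventory filename and host name are placeholders for whatever is in your setup):

ansible-playbook -i inventory.yaml install.yaml -e INSTALL_HOSTS=test-host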

As the install is automatic, if the host boots the pre-seeded installer option, essentially all this playbook does is configure the DHCP server and wait for the install to complete.

It configures the DHCP server to ignore the client id for the MAC of the host being installed, which at present must already be known via a variable that can be set on the host in the inventory. I have not explored whether there are alternative ways to discover the MAC of the host (e.g. if the switch port of the host is known) that would still make for a workable process. The playbook then waits for the install to complete by waiting for SSH to be listening on the target host. Once the OS is installed, it reverts the changes to the DHCP server to restore the previous (standard) behaviour. I chose the approach of reconfiguring the DHCP server as I believe it is the most robust: it supports installing via multiple methods (PXE, as I am doing here, boot from USB or CD etc.), each of which would cause a different number of DHCP requests, and it copes if different NICs behave differently regarding the client id they report and/or the number of times they DHCP before iPXE gets loaded.
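
The DHCP role itself was covered in earlier posts so I will not repeat it here, but as an illustrative sketch, if the DHCP server is isc-dhcp-server the effect of the host-ignore-client-id tasks is roughly equivalent to adding a host declaration along these lines (host name and MAC are made up):

# Illustrative only - approximately what the dhcp role's host-ignore-client-id
# tasks achieve on isc-dhcp-server (name and MAC are placeholders).
host test-host {
  hardware ethernet 52:54:00:12:34:56;
  # Ignore any client id the client sends and track the lease by MAC alone,
  # so PXE, iPXE, the installer and the installed OS all get the same IP.
  ignore-client-uids true;
}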

Once the DHCP server is configured to ignore the client id, the IP address will not change (once one is assigned) because none of the network (PXE) boot loader (at least on the machines I tested on - this is part of the NIC's firmware, so others could behave differently), iPXE or the Debian installer issue a DHCP release, so each new DHCP request just gets issued the (from the server's perspective) existing lease. The only case I foresee this not working is a race condition that could only happen if the DHCP lease time was extremely short (a few seconds). In that situation, an existing lease might expire in one of the transitions between the parts (PXE/iPXE/installer/installed OS) of the process (within any one of these, the lease should be renewed by the DHCP client once ½ of the lease time has elapsed).

---
- hosts: localhost
  # Don't use facts, so save some time by not bothering to gather them.
  gather_facts: false
  any_errors_fatal: true
  tasks:
    # If this task fails, Ansible will abort the whole playbook and not
    # run subsequent plays on any host.
    - ansible.builtin.assert:
        that: INSTALL_HOSTS is defined
        fail_msg: Set host to be installed in variable INSTALL_HOSTS - note this action may be destructive!
- hosts: '{{ INSTALL_HOSTS }}'
  # Cannot guarantee host is up yet.
  gather_facts: false
  tasks:
    # How else can the client be identified? Could we look up the MAC from a switch or existing DHCP lease, perhaps? - probably one for the reinstall script rather than install?
    - name: MAC is known for the client
      ansible.builtin.assert:
        that: mac_address is defined
        fail_msg: No mac_address fact/variable set for {{ inventory_hostname }}
    # All installs going to be done via DHCP (as the installer only DHCPs) so need to configure DHCP server in all cases.
    - name: DHCP server ignores client id (so IP won't change during install if using DHCP)
      # * uses host's `mac_address` variable
      # * uses `all` group `dhcp_server_host` variable
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: host-ignore-client-id
    - name: Configuration changes have been applied
      ansible.builtin.meta: flush_handlers
    # XXX TODO - use pdu to power on? What if already on (e.g. as part of reinstall)?
    # XXX Probably need to check it's off as a precondition if using PDU to trigger boot?
    - name: DHCP IP address is known
      # * uses host's `mac_address` variable
      # * uses `all` group `dhcp_server_host` variable
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: lookup-host
      vars:
        wait_for_lease: true
    - name: DHCP IP address is being used for ansible_host
      block:
        - ansible.builtin.set_fact:
            ansible_host: "{{ dhcp_lease.ip }}"
        - ansible.builtin.debug:
            msg: After host discovery, ansible_host is set to '{{ ansible_host }}'.
    - name: SSH daemon is up (which means auto-install has finished)
      delegate_to: localhost
      ansible.builtin.wait_for:
        host: "{{ ansible_host }}"
        port: 22
        # This could take a long time - wait up to 30 minutes (1800 seconds)
        timeout: 1800
    - name: New SSH host key is known
      delegate_to: localhost
      ansible.builtin.command: /usr/bin/ssh-keyscan -H -T10 -tecdsa "{{ hostvars[inventory_hostname]['ansible_host'] | default(inventory_hostname) }}"
      register: new_ssh_hostkey
      changed_when: false  # Always a read operation, never changes anything
    - name: No old keys are in local known_hosts
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      ansible.builtin.known_hosts:
        name: '{{ hostvars[inventory_hostname].ansible_host | default(inventory_hostname) }}'
        state: absent
    - name: No old keys are in local known_hosts
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      ansible.builtin.known_hosts:
        name: '{{ dhcp_lease.ip }}'
        state: absent
    - name: New host key is in local known_hosts
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      ansible.builtin.known_hosts:
        name: '{{ hostvars[inventory_hostname].ansible_host | default(inventory_hostname) }}'
        key: '{{ new_ssh_hostkey.stdout }}'
    - name: New host key is in local known_hosts
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      ansible.builtin.known_hosts:
        name: '{{ dhcp_lease.ip }}'
        key: '{{ new_ssh_hostkey.stdout }}'
    - name: DHCP server no longer ignores client id
      # * uses host's `mac_address` variable
      # * uses `all` group `dhcp_server_host` variable
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: host-unignore-client-id
# Proceed to bootstrap new host
- name: New host is bootstrapped
  ansible.builtin.import_playbook: bootstrap.yaml
  vars:
    BOOTSTRAP_HOSTS: "{{ INSTALL_HOSTS }}"
- name: Dynamic IPs are removed from local known hosts
  hosts: '{{ INSTALL_HOSTS }}'
  gather_facts: false # Won't need them
  tasks:
    - name: No old keys are in local known_hosts
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      # dhcp_lease.ip was found earlier
      ansible.builtin.known_hosts:
        name: '{{ dhcp_lease.ip }}'
        state: absent
...

Preseed improvements

Initial root password

Deciding on an initial root password

Although I did not specify it in my previous post, I was using a very simple (equivalent to the common r00tme example) root password in my initial preseed, which was convenient for developing up to this point.

I considered a number of options to improve this initial position, all set via the preseed file:

  • Use a complex but static initial root password
  • Not specify a root password (setting d-i passwd/root-login to false), but instead pre-create an ansible user with a complex static initial password (the Debian installer will give the user sudo rights)
  • Generate a unique complex root password per install
  • Not specify a root password but pre-create an ansible user with a unique complex password per install

At the moment, the preseed file is static, and changing that increases the complexity of the solution (generating anything per-install requires generating a preseed file as part of the install process). So far, this setup is designed to keep the initial install generic and to do customisation using a configuration management tool (currently Ansible); generating custom, per-install preseed files blurs this boundary. For these reasons I decided not to do any per-install password setting in the preseed, despite it being a more secure posture.

Having ruled out per-install passwords, I considered whether to configure an initial root password or to disable root login and pre-create an ansible user. As the installer will give that user unrestricted sudo access out of the box (password login, no ssh key, so single-factor login), this provides limited security advantage over using root directly (the superuser being called ansible rather than root is an example of security through obscurity).

I also considered that using a static password in the preseed does not preclude rotating that password periodically to improve security (the only constraint being that no installs can be in progress at the time of rotation).

Generating the initial root password

I generated a password (using my previous quick and dirty password-generation hack) and stored it in /kv/install/initial_root_password in the vault:

PASS_FILE=$(mktemp)
chmod 600 ${PASS_FILE}  # Be sure only we can read it
tr -dc '[:print:]' < /dev/urandom | head -c32 > ${PASS_FILE}
vault kv put -mount=kv /install/initial_root_password password=@${PASS_FILE}
rm -f ${PASS_FILE}

To generate the crypted hash (for the preseed file), with bash (to set it statically for now):

vault read -field=password /kv/install/initial_root_password | mkpasswd -m sha512crypt -R 656000 -s

The -R 656000 just matches the default used by Ansible’s ansible.builtin.password_hash filter.

or, with Ansible (which could be useful if templating the preseed file, e.g. for password rotation) - note I improved on this when I refreshed the PXE configuration after writing this bit but before publishing this post:

{{ lookup('community.hashi_vault.vault_read', 'kv/install/initial_root_password').data.password | ansible.builtin.password_hash }}

Not sending a client id with DHCP requests

Post configuration, there is a discrepancy between the DHCP request issued from the initramfs environment (for remote unlocking) and the one issued by the OS after the encryption has been successfully unlocked. Specifically, the OS client (dhclient by default) sends a client id in its requests and the initramfs client does not, which causes an RFC2131-compliant server to treat them as different clients and therefore issue a different dynamic address in response to each.

The OS client can be configured not to send a client id by adding client no to the interface’s configuration in /etc/network/interfaces, which means both DHCP clients are treated as the same machine by the server. This results in a single IP address being issued to both, so the address does not change between the encryption unlock and the OS being fully loaded, and the one system does not consume an extra address from the pool (until the lease on the first address expires - I had already proven empirically that a DHCP release is not issued for the first address).
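
For illustration, the relevant stanza in /etc/network/interfaces ends up looking something like this (the interface name is a placeholder):

allow-hotplug ens18
iface ens18 inet dhcp
# Do NOT send client-id, to match initramfs behaviour so IP does not
# change between initramfs and OS.
client no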

Setting configuration with preseed

Using the same pattern I used before, I created a script called preseed-no-dhcp-client-id that appends commands to the preseed/late-command script being built up in /tmp (very meta), which is run after the install is complete but before the system is rebooted:

#!/bin/sh
# ^^ no bash in installer environment, only BusyBox

# Die on error
set -e

cat - >>/tmp/late-command-script <<EOF
## BEGIN ADDED BY preseed-no-dhcp-client-id preseed/include_command
in-target /usr/bin/sed -i '/^iface .* inet dhcp$/a # Do NOT send client-id, to match initramfs behaviour so IP does not\n# change between initramfs and OS.\nclient no' /etc/network/interfaces
## END ADDED BY preseed-no-dhcp-client-id preseed/include_command
EOF

This is copied to the preseed web-server by the existing Ansible playbook (just another filename in the loop) and then added to the list of scripts fetched by preseed/include_command in the preseed configuration file itself:

d-i preseed/include_command string                                  \
  for file in                                                       \
    preseed-script-headers                                          \
    preseed-crypto-key                                              \
    preseed-ssh-setup                                               \
    preseed-remove-dummy-lv                                         \
    preseed-no-dhcp-client-id                                       \
  ; do                                                              \
    wget -P /tmp $( dirname $( debconf-get preseed/url ) )/$file && \
    chmod 500 /tmp/$file &&                                         \
    /tmp/$file;                                                     \
  done;

For the future, this (the list of files) should be turned into a single variable that is used for both the files to push out to the web-server and the loop in the preseed configuration file (so there is one list to maintain).

Post install configuration

The post-install Ansible playbook is what does all of the configuration; the base OS install is entirely generic. The playbook consists of a number of plays, which I will go through step by step.

The preseed deploys some ssh keys from the HashiCorp Vault so that, initially, Ansible can log in as root without a password to do the bootstrap.

1. Check that BOOTSTRAP_HOSTS is set

Unlike the install playbook, it should be relatively safe (if not thoroughly tested) to (re)run the bootstrap against an already provisioned host, so requiring that a specific host is targeted (via the BOOTSTRAP_HOSTS variable, similar to the INSTALL_HOSTS used before) would not strictly be necessary. However, several tasks that run before the ansible user is created are hardcoded to use the root user, and these will fail once SSH is secured to prevent remote root logins. Those tasks are tagged root so they can be skipped, but doing so requires intimate knowledge of the process and is intended for debugging/troubleshooting purposes - e.g. if the play does not complete. The install playbook sets BOOTSTRAP_HOSTS to INSTALL_HOSTS when including the bootstrap playbook.
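
For example (the host name is a placeholder), these are the sort of invocations I mean - the second, with --skip-tags root, is only for re-running against a host that has already been secured:

# Normal bootstrap of a freshly installed host (the install playbook does
# this automatically via import_playbook):
ansible-playbook -i inventory.yaml bootstrap.yaml -e BOOTSTRAP_HOSTS=test-host

# Re-run against an already bootstrapped host, skipping the tasks that are
# hardcoded to log in directly as root (which would otherwise fail):
ansible-playbook -i inventory.yaml bootstrap.yaml -e BOOTSTRAP_HOSTS=test-host --skip-tags root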

- hosts: localhost
  # Don't use facts, so save some time by not bothering to gather them.
  gather_facts: false
  any_errors_fatal: true
  tasks:
    # If this task fails, Ansible will abort the whole playbook and not
    # run subsequent plays on any host.
    - ansible.builtin.assert:
        that: BOOTSTRAP_HOSTS is defined
        fail_msg: Set host to be bootstrapped in variable BOOTSTRAP_HOSTS - note this playbook will fail on already bootstrapped hosts!

2. Install python

In order for Ansible modules to work, Python needs to be installed. It is not present by default on a minimal Debian install, such as the one done by my preseed, so installing it (using the ansible.builtin.raw module, which does not require Python) is the first thing that needs to happen:

- name: Python is available
  hosts: '{{ BOOTSTRAP_HOSTS }}'
  # Fact gathering will fail if no python on remote host
  gather_facts: false
  tags: root
  vars:
    ansible_become_method: su
    # Keys should let us in as root - ssh password auth will be
    # disabled by default.
    ansible_user: root
  tasks:
    - name: Python is installed
      become: true
      # XXX This assumes Debian - need to be cleverer for other OSs
      # Redirects all apt-get output to stderr, so it can be seen if a
      # failure happens but stdout is only 'Present', 'Installed' or
      # 'Failed'
      ansible.builtin.raw: bash -c '(which python3 && echo "Present") || (apt-get -y install python3 && echo "Installed") || echo "Failed"'
      register: python_install_output
      # Changed when we had to install python
      changed_when: python_install_output.stdout_lines[-1] == 'Installed'
      # Failed if python wasn't there and (in the logical sense) it
      # didn't install.
      failed_when: python_install_output.stdout_lines[-1] not in ['Installed', 'Present']

3. Set the hostname

The Debian installer configures the LVM volume group to be named according to the hostname. This is a design decision that I like: it means that, e.g. for recovery, a disk can easily be placed into another Linux system without worrying about clashing volume group names - a clash means jumping through more hoops (renaming a volume group) to make the logical volumes accessible, which I have experienced with RedHat-family systems. However, it also means that when I change the hostname I want to rename the volume group to match (the preseed builds the host with the hostname unconfigured-hostname).

Before writing the playbook, I manually worked out the process required:

  1. Change hostname: hostnamectl hostname test
  2. Update /etc/hosts with the new local hostname:

     sed -i 's/127.0.1.1.*$/127.0.1.1 test.dev.internal test/' /etc/hosts
    
  3. Reboot for the correct (new) hostname to be logged in /etc/lvm/backup when the VG is renamed
  4. Rename the volume group, based on https://wiki.debian.org/LVM#Renaming_a_volume_group:

    vgrename needs to be done while logged directly in as root as it causes /home to unmount. This is why it is done at this stage of the process, before securing ssh (below) as that disables remote root.

    1. Rename the volume group:

       vgrename unconfigured-hostname-vg vg_$( hostname -s )
      
    2. Create symlinks to the old logical volume names (or the system will not be able to reboot):

       cd /dev/mapper
       for lv in /dev/mapper/vg_$( hostname -s | sed 's/-/--/g' )-*
       do
         ln -s "${lv##*/}" "/dev/mapper/unconfigured--hostname--vg-${lv##*-}"
       done
      
    3. Update paths of filesystems in /etc/fstab:

       sed -i "s#unconfigured--hostname--vg#vg_$( hostname -s | sed 's/-/--/g' )#g" /etc/fstab
      
    4. Update path to resume partition in /etc/initramfs-tools/conf.d/resume:

       sed -i "s#unconfigured--hostname--vg#vg_$( hostname -s | sed 's/-/--/g' )#g" /etc/initramfs-tools/conf.d/resume
      
    5. Update paths to filesystems in /boot/grub/grub.cfg (I think this is just the root filesystem?):

       sed -i "s#unconfigured--hostname--vg#vg_$( hostname -s | sed 's/-/--/g' )#g" /boot/grub/grub.cfg
      
    6. Update initramfs with the new paths:

       update-initramfs -c -k all
      
    7. Reboot with the new paths:

       reboot
      
    8. Update grub (which will fail before the reboot because it uses the mounted rather than the configured filesystems):

       update-grub
      
  5. Finally, just for neatness, update the hostname in the comments in the host’s ssh keys:

     sed -i "s#unconfigured-hostname#$( hostname -s )#g" /etc/ssh/ssh_host_*.pub
    

The Ansible play follows this exact same process:

- name: Hostname is correct
  hosts: '{{ BOOTSTRAP_HOSTS }}'
  tags: root
  vars:
    ansible_become_method: su
    # Keys should let us in as root - ssh password auth will be
    # disabled by default.
    ansible_user: root
  handlers:
    - name: Reboot
      become: true
      ansible.builtin.reboot:
  tasks:
    - name: Hostname is set
      become: true
      ansible.builtin.hostname:
        name: '{{ inventory_hostname }}'
      notify: Reboot
    - name: Hostname is correct in /etc/hosts
      become: true
      ansible.builtin.lineinfile:
        # Use dns.domain which is the domain from the DHCP server.
        # I do not know how reliable this is to guarantee the domain can be found?
        line: 127.0.1.1 {{ inventory_hostname }}.{{ ansible_facts.dns.domain }} {{ inventory_hostname }}
        path: /etc/hosts
        regexp: '^127.0.1.1\s'
        state: present
      notify: Reboot
    - name: Hostname changes have applied
      ansible.builtin.meta: flush_handlers
    # LVM rename process based on https://wiki.debian.org/LVM#Renaming_a_volume_group
    - name: LVM Volume group is named correctly
      become: true
      # Should look at community.general.lvg_rename but not in the
      # version of community.general (5.8.0) installed.
      ansible.builtin.command:
        # N.B. vgrename needs to be done while logged directly in as
        # root as it causes /home to unmount (so pre-securing ssh).
        argv:
          - /usr/sbin/vgrename
          - unconfigured-hostname-vg
          - vg_{{ inventory_hostname }}
      when: "'unconfigured-hostname-vg' in ansible_facts.lvm.vgs"
      register: lvm_rename
      notify: Reboot
    - name: Update LVM VG dependent configuration
      block:
        - name: List of logical volumes is known
          ansible.builtin.find:
            paths: /dev/mapper/
            file_type: any  # Default 'file' doesn't match symlinks
            patterns: "vg_{{ inventory_hostname | replace('-', '--') }}-*"
          register: lv_names
        - name: Links for old volume group names exist
          become: true
          ansible.builtin.file:
            path: "{{ item.path | replace('/vg_' + (inventory_hostname | replace('-', '--')) + '-', '/unconfigured--hostname--vg-') }}"
            src: '{{ item.path }}'
            state: link
          loop: '{{ lv_names.files }}'
        - name: Configuration files are correct
          become: true
          ansible.builtin.replace:
            path: '{{ item }}'
            regexp: unconfigured--hostname--vg
            replace: vg_{{ inventory_hostname | replace('-', '--') }}
          loop:
            - /etc/fstab
            - /etc/initramfs-tools/conf.d/resume
            - /boot/grub/grub.cfg
          notify: Reboot
        - name: Initramfs is correct
          become: true
          ansible.builtin.command:
            argv:
              - /usr/sbin/update-initramfs
              - -c
              - -k
              - all
          notify: Reboot
        - name: Logical volume changes have applied
          ansible.builtin.meta: flush_handlers
        - name: Grub configuration is up to date
          become: true
          ansible.builtin.command: /usr/sbin/update-grub
      when: lvm_rename.changed
    - name: List of ssh host public key files is known
      ansible.builtin.find:
        paths: /etc/ssh
        file_type: any  # Shouldn't be symlinks but why risk it?
        patterns: ssh_host_*.pub
      register: ssh_host_keys
    - name: Hostname is correct in host key comments
      become: true
      ansible.builtin.replace:
        path: '{{ item.path }}'
        regexp: unconfigured-hostname
        replace: '{{ inventory_hostname }}'
      loop: '{{ ssh_host_keys.files }}'

4. Secure user accounts

This next play installs sudo, adds an ansible user account with permission to use sudo, sets up some ssh keys with access to the ansible user, and generates, stores and sets both the ansible and root account passwords to random, host-specific values. The passwords are set using the recipe from my previous post, which seeds the password hash’s salt from the hostname to make the task idempotent (otherwise a new hash is generated on each run, so the password always appears to be updated even though the password itself is unchanged).

- name: User accounts are secured
  hosts: '{{ BOOTSTRAP_HOSTS }}'
  tags: root
  vars:
    ansible_become_method: su
    # Keys from install process should let us in as root - ssh
    # for root will be `prohibit-password` by default.
    ansible_user: root
  tasks:
    - name: Sudo is installed
      become: true
      ansible.builtin.package:
        name: sudo
        state: present
    - name: New passwords are stored in vault
      delegate_to: localhost
      community.hashi_vault.vault_write:
        path: kv/hosts/{{ inventory_hostname }}/users/{{ item }}
        data:
          password: "{{ lookup('ansible.builtin.password', '/dev/null', chars=['ascii_letters', 'digits', 'punctuation']) }}"
      loop:
        - root
        - ansible
    - name: New passwords are set on remote host
      become: true
      ansible.builtin.user:
        name: "{{ item }}"
        password: "{{ lookup('community.hashi_vault.vault_read', 'kv/hosts/' + inventory_hostname + '/users/' + item).data.password | ansible.builtin.password_hash('sha512', 65534 | random(seed=inventory_hostname) | string) }}"
      loop:
        - root
        - ansible
    - name: Ansible user can sudo
      become: true
      ansible.builtin.user:
        name: ansible
        append: true
        groups:
          - sudo
    - name: Ansible user ssh keys are correct
      ansible.posix.authorized_key:
        user: ansible
        # Use the same keys that allow root in for now - probably need to revisit this?
        key: "{{ lookup('community.hashi_vault.vault_read', 'kv/install/initial_root_keys').data.ssh_keys }}"
        state: present

5. Secure SSH

Until this point, my playbook has been hard-coded to log in as root and use su as the escalation method. Once the ansible user is created, the default settings in my inventory should begin working - to be sure, this play begins with a sanity check that Ansible can become root without any play-specific settings. This is to try to avoid accidentally locking myself out of the system by locking it down when all is not well. Provided that it can become root, the play then disables logging in directly as root, as well as interactive login methods, so only key-based access will work.

- name: Lockdown SSH
  hosts: '{{ BOOTSTRAP_HOSTS }}'
  handlers:
    - name: Restart sshd
      become: true
      ansible.builtin.service:
        name: sshd
        state: restarted
  tasks:
    - name: Can login and escalate using default (should be `ansible` user) credentials (sanity check)
      become: true
      ansible.builtin.command: /usr/bin/whoami
      changed_when: false
      register: sanity_check_output
    - name: Assert successfully became root (sanity check - belt and braces)
      ansible.builtin.assert:
        that:
          - sanity_check_output.stdout == 'root'
    # Can definitely now login and escalate to root - proceed to
    # disable root login and prohibit password ssh logins.
    - name: Root ssh login is denied
      become: true
      ansible.builtin.lineinfile:
        line: PermitRootLogin no
        path: /etc/ssh/sshd_config
        regexp: '^#?PermitRootLogin\s'
        state: present
      notify: Restart sshd
    - name: Password authentication is denied
      become: true
      ansible.builtin.lineinfile:
        line: PasswordAuthentication no
        path: /etc/ssh/sshd_config
        regexp: '^#?PasswordAuthentication\s'
        state: present
      notify: Restart sshd
    - name: Challenge response authentication is denied (Debian <12)
      become: true
      ansible.builtin.lineinfile:
        line: ChallengeResponseAuthentication no
        path: /etc/ssh/sshd_config
        regexp: '^#?ChallengeResponseAuthentication\s'
        state: present
      notify: Restart sshd
      # Changed to KbdInteractiveAuthentication in Bookworm (12)
      when: ansible_facts.distribution == 'Debian' and (ansible_facts.distribution_major_version | int) < 12
    - name: Challenge response authentication is denied (Debian 12+)
      become: true
      ansible.builtin.lineinfile:
        line: KbdInteractiveAuthentication no
        path: /etc/ssh/sshd_config
        regexp: '^#?KbdInteractiveAuthentication\s'
        state: present
      notify: Restart sshd
      # Changed from ChallengeResponseAuthentication in Bookworm (12)
      when: ansible_facts.distribution == 'Debian' and (ansible_facts.distribution_major_version | int) >= 12

6. Setup LUKS encryption passphrase and remote unlocking

During the preseeded install, LUKS encryption is set up with a dynamically generated key embedded in the unencrypted initial ram disk. Before any data ends up on the system, this key needs to be replaced with a per-system key that is not stored on the system, and the existing key revoked, so that neither the unencrypted boot partition nor anything else on the system is sufficient to unlock the encryption.

In addition to replacing the encryption key, the play installs the dropbear ssh server with initramfs integration, which provides an early-boot SSH service that can be used to remotely unlock the encryption during the boot process.

I initially set up dropbear with a late_command in the preseed file, which configured an ssh key that allowed remote unlocking:

# Dropbear options:
# -I 600 - Disconnect session after 60s of inactivity
# -j - Disable local port forwarding
# -k - Disable remote port forwarding
# -p 2222 - Listen on port 2222
# -s - Do not allow password authentication (won't work anyway in the initramfs environment)
# /etc/dropbear/initramfs/dropbear.conf was /etc/dropbear-initramfs/config in Bullseye - moved in Bookworm
d-i preseed/late_command string \
  in-target sed -i 's/^#?DROPBEAR_OPTIONS=.*$/DROPBEAR_OPTIONS="-I 600 -j -k -p 2222 -s"/' /etc/dropbear/initramfs/dropbear.conf && \
  in-target mkdir -m 700 -p /root/.ssh && \
  in-target /usr/bin/sh -c 'echo \'ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBGELav9hG7S1Kohs5QyEsrBIXLbT18tdTZCFg5rUITwxXg1JDKlzuR7v+8zLmbzWCBs0IR8QA9EBw0099h8QW3A= laurence@core\' > /root/.ssh/authorized_keys' && \
  in-target cp /root/.ssh/authorized_keys /etc/dropbear/initramfs/authorized_keys && \
  in-target update-initramfs -u;

This command:

  1. Configures dropbear-initramfs’s options for the early initramfs ssh daemon (to allow remote unlocking of the encrypted volume)
  2. Creates root’s .ssh/authorized_keys file (to allow root to login with that key to perform initial configuration of the system)
  3. Copies root’s .ssh/authorized_keys file to dropbear-initramfs’s authorized_keys to allow that key to also login to unlock
  4. Runs update-initramfs to ensure the dropbear changes are incorporated in the initial ram disk image

Making dropbear listen on a different port serves several purposes:

  • For automation, being on a different port allows (e.g.) Ansible to easily tell whether the system is waiting to be unlocked or the “normal” ssh daemon is listening post-boot
  • The dropbear service presents different host keys from the main OS ssh daemon, and having it on a different port allows clients to manage this easily (see the client configuration sketch after this list)
  • It allows the dropbear ssh service, for remote unlocking, to be firewalled differently (being on a different port makes it easy for firewalls external to this system to be more restrictive about which remote systems can access it)
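
On the client side, one option is to treat the unlock service as a separate “host” in ~/.ssh/config, for example (names are illustrative and not part of the playbooks):

Host test-host-unlock
    HostName test-host.dev.internal
    Port 2222
    User root
    # Keep the dropbear host key separate from the main OS host key
    UserKnownHostsFile ~/.ssh/known_hosts.unlock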

My play configures dropbear the same way. It uses community.crypto.luks_device, so the community.crypto collection needs adding to requirements.yaml and installing with ansible-galaxy collection install -r requirements.yaml if not already present.
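
A minimal requirements.yaml covering the collections these playbooks use might look like this (a sketch - pin versions as required):

---
collections:
  - name: community.crypto
  - name: community.hashi_vault
  - name: ansible.posix
  - name: ansible.utils
...

With the collections installed, the play itself is: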

- name: Disk encryption is setup for remote unlocking
  hosts: '{{ BOOTSTRAP_HOSTS }}'
  handlers:
    - name: Current initramfs is updated
      become: true
      ansible.builtin.command: /usr/sbin/update-initramfs -u
    - name: Old dropbear host key is deleted
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      ansible.builtin.known_hosts:
        name: '[{{ hostvars[inventory_hostname].ansible_host | default(inventory_hostname) }}]:2222'
        state: absent
    - name: Existing dhclient lease is cleared
      become: true
      ansible.builtin.file:
        path: /var/lib/dhcp/dhclient.{{ ansible_facts.default_ipv4.interface }}.leases
        state: absent
  tasks:
    - name: dropbear is installed
      become: true
      ansible.builtin.package:
        name: dropbear-initramfs
        state: present
    - name: Unlock ssh keys are deployed
      become: true
      ansible.posix.authorized_key:
        user: root
        path: /etc/dropbear/initramfs/authorized_keys
        manage_dir: false
        # Use the same keys that allow root in for now - probably need to revisit this?
        key: "{{ lookup('community.hashi_vault.vault_read', 'kv/install/initial_root_keys').data.ssh_keys }}"
        state: present
      notify: Current initramfs is updated
    - name: Dropbear configuration is correct
      become: true
      ansible.builtin.lineinfile:
        line: DROPBEAR_OPTIONS="-I 600 -j -k -p 2222 -s"
        path: /etc/dropbear/initramfs/dropbear.conf
        regexp: '^#?DROPBEAR_OPTIONS='
        state: present
      notify:
        - Current initramfs is updated
        - Old dropbear host key is deleted
    - name: Installer LUKS unlock key file is `stat`ed
      become: true
      ansible.builtin.stat:
        path: /etc/keys/luks-lvm.key
      register: luks_key_stat
    - name: LUKS passphrase is setup
      block:
        - name: Block device and filesystem types are known
          ansible.builtin.command: /usr/bin/lsblk -o PATH,FSTYPE -J
          register: block_path_type_json
          # Read only operation - never changes anything
          changed_when: false
        - name: Encrypted block devices are known
          ansible.builtin.set_fact:
            encrypted_block_devices: >-
              {{
                  (block_path_type_json.stdout | from_json).blockdevices
                  |
                  selectattr('fstype', 'eq', 'crypto_LUKS')
                  |
                  map(attribute='path')
              }}
        - name: Only one encrypted device exists
          ansible.builtin.assert:
            that:
              - encrypted_block_devices | length == 1
        - name: Encrypted device name is stored
          ansible.builtin.set_fact:
            encrypted_block_device: "{{ encrypted_block_devices | first }}"
        - name: New passphrase is stored in vault
          delegate_to: localhost
          community.hashi_vault.vault_write:
            path: kv/hosts/{{ inventory_hostname }}/luks/passphrase
            data:
              passphrase: "{{ lookup('ansible.builtin.password', '/dev/null', chars=['ascii_letters', 'digits', 'punctuation'], length=40) }}"
        - name: New passphrase is set
          become: true
          community.crypto.luks_device:
            new_passphrase: "{{ lookup('community.hashi_vault.vault_read', 'kv/hosts/' + inventory_hostname + '/luks/passphrase').data.passphrase }}"
            keyfile: /etc/keys/luks-lvm.key
            device: "{{ encrypted_block_device }}"
        - name: Installer generated key file is removed for unlocking
          become: true
          community.crypto.luks_device:
            remove_keyfile: /etc/keys/luks-lvm.key
            passphrase: "{{ lookup('community.hashi_vault.vault_read', 'kv/hosts/' + inventory_hostname + '/luks/passphrase').data.passphrase }}"
            device: "{{ encrypted_block_device }}"
        - name: Installer generated key file is removed from crypttab
          become: true
          ansible.builtin.replace:
            path: /etc/crypttab
            regexp: '/etc/keys/luks-lvm\.key'
            replace: 'none'
          # Needs to be correct in initramfs or cryptroot-unlock will not prompt for passphrase
          notify: Current initramfs is updated
        - name: Installer generated key file is removed from disk
          become: true
          ansible.builtin.file:
            path: /etc/keys/luks-lvm.key
            state: absent
          # Key file will need removing from initramfs
          notify: Current initramfs is updated
        - name: No keys are copied to initramfs
          become: true
          ansible.builtin.lineinfile:
            line: '#KEYFILE_PATTERN='
            path: /etc/cryptsetup-initramfs/conf-hook
            regexp: '^#?KEYFILE_PATTERN='
            state: present
          notify: Current initramfs is updated
      when: luks_key_stat.stat.exists

7. Test remote unlocking

Finally, before the playbook ends, I reboot the target and test that the remote unlock works - this verifies that the unlock passphrase in the vault is correct and that the whole process works before any data that might possibly be important is written inside the encrypted filesystem (at this point everything could be recreated by re-running the install and bootstrap process).

At the moment the encryption unlocking is a playbook in its own right, which has to be included as its own play (ansible.builtin.import_playbook can only be used at the top/play level). It (and a number of other things, like creating/refreshing the LUKS passphrase) would be better as tasks in a role, but that is left as an exercise for the future.

- name: Host is rebooted to test unlock
  hosts: '{{ BOOTSTRAP_HOSTS }}'
  tasks:
    - name: Host is rebooted
      become: true
      # Ansible's reboot command waits, and checks, for the host to
      # come back, which will never happen. Even with async (see below)
      # there is a race condition if the ssh connection gets closed (by
      # the shutdown process) before Ansible has disconnected so it is
      # necessary to delay the shutdown command by longer than the
      # async value, in order to avoid this problem.
      ansible.builtin.shell: 'sleep 2 && /usr/bin/systemctl reboot --message="Ansible triggered reboot for LUKS unlock sanity check."'
      # Run in background, waiting 1 second before closing connection
      async: 1
      # Launch in fire-and-forget mode - with a poll of 0 Ansible skips
      # polling entirely and moves to the next task, which is precisely
      # what we need.
      poll: 0
# XXX turn this into either tasks or role, so it can be part of the above play.
- name: Check remote unlock is working
  ansible.builtin.import_playbook: unlock-crypt.yaml
  vars:
    UNLOCK_HOSTS: '{{ BOOTSTRAP_HOSTS }}'

Remotely unlocking the encrypted root filesystem

Once I had dropbear set up, and before I finished the bootstrap playbook, I developed a playbook to unlock systems remotely using their passphrase from the vault. The dropbear environment does not have a python interpreter, so most Ansible modules will not work directly.

My first attempt was to use the ansible.builtin.raw module to run the unlock command on the remote system; however, that module does not support passing a value on stdin, so some other means has to be used to get the unlock key to the unlock command. Other modules, including ansible.builtin.copy, require python on the remote system (which is not available inside the initramfs), which makes this harder.

---
- hosts: all
  gather_facts: false
  tasks:
    - name: Wait for dropbear ssh daemon to be up
      delegate_to: localhost
      ansible.builtin.wait_for:
        host: "{{ ansible_host }}"
        port: 2222
    - name: Unlock encrypted disk
      # No python in the initramfs. Not ideal - briefly exposes
      # password via proc within initramfs environment.
      # ansible.builtin.raw doesn't support passing directly to stdin.
      ansible.builtin.raw: echo -n {{ unlock_key | quote }} | cryptroot-unlock
      vars:
        # Pulled this out so it can be replaced with a lookup in
        # the future.
        unlock_key: supersecretkey
        ansible_user: root  # Temporarily login directly as root
        # Temporarily login to the dropbear initramfs daemon that
        # I configured on a different port.
        ansible_port: 2222
    - name: Wait for host to be properly up
      delegate_to: localhost
      ansible.builtin.wait_for:
        host: "{{ ansible_host }}"
        port: 22
...

After getting this functioning, I went looking for a better solution and found a feature request for stdin support, however it has been closed due to its age. This did lead me (via https://github.com/gsauthof/dracut-sshd/issues/32) to https://github.com/gsauthof/playbook/blob/master/fedora/initramfs/ansible/unlock_tasks.yml, which uses a command task delegated to localhost to run ssh locally and issue the unlock command, with the key advantage of being able to pass the passphrase in on stdin. While this would be ugly in most contexts, it is little worse than using the raw module, and it is reasonable to expect the ssh client to be installed where the playbook is being run (even on Windows). Whether this last assumption (that an ssh client is installed) holds true when using containers to execute playbooks (as I presume Ansible Automation Platform does), I do not yet know.

Taking this method, to reduce the risk of exposing the unlock key, I ended up with this:

---
- hosts: "{{ UNLOCK_HOSTS | default('all') }}"
  # Host will not be in a state to gather_facts if waiting to be unlocked
  gather_facts: false
  tasks:
    - name: Attempt to find connection details if needed
      # * uses host's `mac_address` variable
      # * uses `all` group `dhcp_server_host` variable
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: lookup-host
      # Cannot be part of the block or Ansible applies the when
      # to all the included tasks, including those that are
      # delegated (and hence the test evaluated against the
      # delegated host rather than the current host).
      when: >-
        ansible_host == inventory_hostname
        and
        inventory_hostname is not ansible.utils.resolvable
    - name: Wait for dropbear ssh daemon to be up
      delegate_to: localhost
      ansible.builtin.wait_for:
        host: "{{ dhcp_lease.ip | default(hostvars[inventory_hostname]['ansible_host']) | default(inventory_hostname) }}"
        port: 2222
    - name: Unlock encrypted disk
      # No python in the initramfs. Work around ansible.builtin.raw
      # not supporting stdin (https://github.com/ansible/ansible/issues/34556)
      delegate_to: localhost
      # Accept new (but not changed) host keys - so first connection after
      # install works (provided old key has been removed)
      ansible.builtin.command:
        cmd: >
          ssh
          -o StrictHostKeyChecking=accept-new
          -p 2222
          -l root
          {{ dhcp_lease.ip | default(hostvars[inventory_hostname]['ansible_host']) | default(inventory_hostname) }}
          cryptroot-unlock
        stdin: "{{ unlock_key }}"
        stdin_add_newline: false
      vars:
        unlock_key: >-
          {{
            lookup(
              'community.hashi_vault.vault_read',
              'kv/hosts/' + inventory_hostname + '/luks/passphrase'
            ).data.passphrase
          }}
    - name: Wait for host to be properly up
      delegate_to: localhost
      ansible.builtin.wait_for:
        host: "{{ dhcp_lease.ip | default(hostvars[inventory_hostname]['ansible_host']) | default(inventory_hostname) }}"
        port: 22
...

Reinstall playbook

To round this off, I wrote a reinstall playbook based on my previous work doing the same with Rocky Linux. It just destroys the partition table and reboots the host, before launching the install playbook above:

---
- hosts: localhost
  # Don't use facts, so save some time by not bothering to gather them.
  gather_facts: false
  any_errors_fatal: true
  tasks:
    # If this task fails, Ansible will abort the whole playbook and not
    # run subsequent plays on any host.
    - ansible.builtin.assert:
        that: REDEPLOY_HOSTS is defined
        fail_msg: Set host to be deployed in variable REDEPLOY_HOSTS - note this action is destructive!
- hosts: '{{ REDEPLOY_HOSTS }}'
  tasks:
    # Required to blow away GPT partition table later. Do this early so
    # if there's a problem installing it will fail early (before any
    # destructive action has been taken)
    - name: Ensure gdisk is installed
      become: true
      ansible.builtin.package:
        name: gdisk
        state: present
    # Ironically, have to install this new package just to immediately
    # destroy the machine with Ansible's ansible.builtin.expect module.
    - name: Install pexpect python module
      become: true
      ansible.builtin.package:
        name: python3-pexpect # For Rocky 8 - maybe different on others?
        state: present
    - name: Get list of partitions
      ansible.builtin.set_fact:
        disks_with_partitions: "{{ disks_with_partitions + [item.key] }}"
      loop: "{{ ansible_facts.devices | dict2items }}"
      vars:
        disks_with_partitions: []
      when: item.value.removable == '0' and item.value.partitions | length > 0
      loop_control:
        label: "{{ item.key }}"
    - name: Destroy host's disk partition table(s) (to enable fall through to other boot methods for auto-reinstall)
      become: true
      ansible.builtin.expect:
         command: gdisk /dev/{{ item }}
         # Although this is a map (and therefore unordered), each prompt
         # will only appear once so I am not worried about multiple
         # matches happening.
         responses:
           # x == Enter expert mode
           'Command \(\? for help\):': x
           # z == zap (destroy) GPT partition table
           'Expert command \(\? for help\):': z
           # Ansible doesn't seem to substitute `{{ item }}` in a key,
           # so have to do a looser match. Will always be on a disk,
           # never a partition, so should not end with a digit. On my
           # systems `[a-z]+` seems sufficient.
           'About to wipe out GPT on /dev/[a-z0-9]+. Proceed\? \(Y/N\):': Y
           'Blank out MBR\? \(Y/N\):': Y
      loop: "{{ disks_with_partitions }}"
    - name: Reboot host
      become: true
      # Ansible's reboot command waits, and checks, for the host to
      # come back, which will never happen. Even with async (see below)
      # there is a race condition if the ssh connection gets closed (by
      # the shutdown process) before Ansible has disconnected so it is
      # necessary to delay the shutdown command by longer than the
      # async value, in order to avoid this problem.
      ansible.builtin.shell: 'sleep 2 && /usr/bin/systemctl reboot --message="Ansible triggered reboot for system redeployment."'
      # Run in background, waiting 1 second before closing connection
      async: 1
      # Launch in fire-and-forget mode - with a poll of 0 Ansible skips
      # polling entirely and moves to the next task, which is precisely
      # what we need.
      poll: 0
    - name: Old keys are removed from local known_hosts
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      ansible.builtin.known_hosts:
        name: '{{ hostvars[inventory_hostname].ansible_host | default(inventory_hostname) }}'
        state: absent
    - name: Old crypt unlock keys are removed from local known_hosts
      delegate_to: localhost
      # Only allow one thread to update the known_hosts file at a time
      throttle: 1
      ansible.builtin.known_hosts:
        name: '[{{ hostvars[inventory_hostname].ansible_host | default(inventory_hostname) }}]:2222'
        state: absent
- name: Begin install process
  ansible.builtin.import_playbook: install.yaml
  vars:
    INSTALL_HOSTS: '{{ REDEPLOY_HOSTS }}'
...

Ansible defaults

Hosts deployed this way should default to logging in with the ansible user, using local keys (as ssh password login is entirely disabled), and using the system-specific password from the vault to escalate privileges:

ansible_user: ansible
ansible_become_password: "{{ lookup('community.hashi_vault.vault_read', 'kv/hosts/' + inventory_hostname + '/users/ansible').data.password }}"

In the longer term, these settings should become the default for all of my systems.

Future actions

This post ends with a list of actions that need to be done at some point:

  • Bring host-specific pxe config for automated preseeded (re)install under Ansible control (as opposed to manual)
  • The list of scripts deployed to the preseed web-server and then fetched by the preseed configuration file should be turned into a single variable that is used for both (so there is one list to maintain)
  • Modify the DHCP lease lookup to “wait until a lease newer than x appears”, rather than relying on the client id being consistent between, for example, dropbear and the OS (then drop changing the network configuration from the preseed install process)
  • Make the encryption unlock a role instead of a playbook
  • Make replacing the generated LUKS key with a passphrase part of the same role, and general enough that it can also be used to rotate the passphrases periodically