Turning unlock crypt Ansible playbook into a role
I previously created a playbook to remotely unlock my encrypted root partitions, which works well, but as a playbook it can only be run directly or pulled in via `ansible.builtin.import_playbook`. To make it more flexible, for example to allow it to be used as part of a reboot cycle inside another role, I decided to turn it into a role.
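To illustrate the limitation, reusing the playbook as-is has to happen at the playbook level, something like this sketch (the wrapping `site.yaml` and its contents are assumed for illustration):

```yaml
# site.yaml - import_playbook only works at the playbook level;
# the unlock sequence cannot be invoked from inside a role's tasks.
- name: Unlock encrypted hosts first
  ansible.builtin.import_playbook: unlock-crypt.yaml

- name: Main play
  hosts: all
  roles:
    - some-other-role  # hypothetical
```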
The first part of my existing playbook deals with finding the host's connection information from the DHCP server, if the `inventory_hostname` is not resolvable and `ansible_host` has not already been set to a specific value. I decided not to include this, in order to keep the role focused on the task at hand (unlocking) - it will accept the target host as an argument, falling back to `ansible_host` and finally `inventory_hostname`. (Aside: I am dissatisfied with my current "look up the IP from the DHCP server" discovery method, as there are a number of problematic edge-cases; it is on my backlog to revise it.)
The new role
The `tasks/main.yaml` of the new `unlock-crypt` role is almost a copy-and-paste of the existing playbook's unlock sequence:
```yaml
---
- name: Wait for dropbear ssh daemon to be up
  delegate_to: localhost
  ansible.builtin.wait_for:
    host: "{{ unlock_crypt_target_host }}"
    port: "{{ unlock_crypt_port }}"
- name: Unlock encrypted disk
  # No python in the initramfs. Work around ansible.builtin.raw
  # not supporting stdin (https://github.com/ansible/ansible/issues/34556)
  delegate_to: localhost
  # Accept new (but not changed) host keys - so first connection after
  # install works (provided old key has been removed)
  ansible.builtin.command:
    cmd: >
      ssh
      -o StrictHostKeyChecking=accept-new
      -p {{ unlock_crypt_port }}
      -l root
      {{ unlock_crypt_target_host }}
      cryptroot-unlock
    stdin: "{{ unlock_crypt_key }}"
    stdin_add_newline: false
- name: Wait for host to be properly up
  delegate_to: localhost
  ansible.builtin.wait_for:
    host: "{{ unlock_crypt_target_host }}"
    port: "{{ unlock_crypt_check_port }}"
...
```
The `meta/argument_specs.yaml` file looked like this:
```yaml
---
argument_specs:
  main:
    short_description: Unlocks the LUKS disk of a remote system
    author: Laurence Alexander Hurst
    options:
      unlock_crypt_target_host:
        description: Target host (must be accessible by `ssh`ing to this name/address)
        type: str
        default: "{{ hostvars[inventory_hostname]['ansible_host'] | default(inventory_hostname) }}"
      unlock_crypt_port:
        description: Port of the ssh server on the target host to which to issue the unlock command
        type: int
        default: 22
      unlock_crypt_key:
        description: Unlock key
        type: str
        required: true
      unlock_crypt_check_port:
        description: Port to wait to appear on the target server to confirm it is unlocked and ready
        type: int
        default: 22
...
```
and the `defaults/main.yaml` file:
```yaml
---
unlock_crypt_port: 22
unlock_crypt_target_host: "{{ hostvars[inventory_hostname]['ansible_host'] | default(inventory_hostname) }}"
unlock_crypt_check_port: 22
...
```
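For reference, the resulting role is small; its layout looks like this (a sketch - `argument_specs` is the filename Ansible's role argument validation looks for, and `.yml` is also accepted):

```
roles/unlock-crypt/
├── defaults/
│   └── main.yaml
├── meta/
│   └── argument_specs.yaml
└── tasks/
    └── main.yaml
```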
Using the new role
Initially I did a straight drop-in replacement of the existing task in my `unlock-crypt.yaml` playbook:
```yaml
- name: Unlock remote host
  ansible.builtin.include_role:
    name: unlock-crypt
  vars:
    unlock_crypt_target_host: >-
      {{
        dhcp_lease.ip
        | default(hostvars[inventory_hostname]['ansible_host'])
        | default(inventory_hostname)
      }}
    unlock_crypt_port: 2222
    unlock_crypt_key: >-
      {{
        lookup(
          'community.hashi_vault.vault_read',
          'kv/hosts/' + inventory_hostname + '/luks/passphrase'
        ).data.passphrase
      }}
```
However, when I wanted to use this in another role (specifically installing Proxmox Virtual Environment, which requires a kernel change, and therefore a reboot, before installing the main packages) it made sense to move the port and key variables to a group's variables. I originally did this via the domain-level groups, but that relied on a fact to correctly determine the domain, which may not work while the host is still locked (depending on whether cached facts are available). In the end I created a `new_world_order` group, on which I also set `ansible_user` to the specific user I created host-specific passwords for, as this should eventually be the norm for all my hosts.
```yaml
unlock_crypt_port: 2222
unlock_crypt_key: >-
  {{
    lookup(
      'community.hashi_vault.vault_read',
      'kv/hosts/' + inventory_hostname + '/luks/passphrase'
    ).data.passphrase
  }}
```
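For context, these variables sit in the group's variable file, with the group itself defined in the inventory - a minimal sketch (the host name, file paths and user name here are hypothetical):

```yaml
# inventory/group_vars/new_world_order.yaml holds the two variables
# above; the group membership lives in the inventory, e.g.:
all:
  children:
    new_world_order:
      hosts:
        pve1.example.org:  # hypothetical host
      vars:
        ansible_user: ansible  # hypothetical user name
```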
The use of the IP address by preference might be specific to the installation process (I need to think this through - the changing client ID problem makes me think this is a nuanced problem), so I left this setting local to the `unlock-crypt.yaml` playbook for now.
With these variables defined at the group level, using this in another role was easy (although you can see from the preceding comment that this is not the final arrangement - I have just not yet decided on the best way to proceed):
```yaml
# XXX this is very site specific... (maybe we need to take a reboot
# method/role as an argument, defaulting to
# ansible.builtin.reboot?)
- name: Host is rebooted if kernel is updated
  become: true
  # Ansible's reboot command waits, and checks, for the host to
  # come back, which will never happen. Even with async (see below)
  # there is a race condition if the ssh connection gets closed (by
  # the shutdown process) before Ansible has disconnected, so it is
  # necessary to delay the shutdown command by longer than the
  # async value, in order to avoid this problem.
  ansible.builtin.shell: 'sleep 2 && /usr/bin/systemctl reboot --message="Ansible triggered reboot for LUKS unlock sanity check."'
  # Run in background, waiting 1 second before closing connection
  async: 1
  # Launch in fire-and-forget mode - with a poll of 0 Ansible skips
  # polling entirely and moves to the next task, which is precisely
  # what we need.
  poll: 0
  when: kernel_updated.changed
- name: Unlock remote host
  # Uses:
  #   unlock_crypt_port from `all` group_vars
  #   unlock_crypt_key from `all` group_vars
  ansible.builtin.include_role:
    name: unlock-crypt
  when: kernel_updated.changed
```
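One way the XXX comment might be resolved - a sketch only, with a hypothetical `reboot_role` variable, not something I have settled on - is to make the reboot mechanism an argument, falling back to a plain `ansible.builtin.reboot` when no site-specific role is given:

```yaml
- name: Host is rebooted if kernel is updated (site-specific method)
  ansible.builtin.include_role:
    name: "{{ reboot_role }}"
  when:
    - kernel_updated.changed
    - reboot_role is defined

- name: Host is rebooted if kernel is updated (default method)
  become: true
  ansible.builtin.reboot:
  when:
    - kernel_updated.changed
    - reboot_role is not defined
```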