One of the big bits I have not yet migrated from SaltStack to Ansible is my router, including DHCP and DNS. The management subnet was not configured to allow PXE booting, so I decided this was the right opportunity to make a start on that migration to Ansible. I did this by expanding my existing DHCP role with additional tasks.

What I have done is highly specific to ISC’s DHCP server, which is the only one SaltStack was configuring. At some point I would like to configure dnsmasq the same way, having gone to the effort of supporting it for the rest of the DHCP role.

Global configuration

The main bit of additional global configuration for ISC’s DHCP server is the definition of the extra options and classes:

ddns-update-style none;
default-lease-time 3600;
max-lease-time 7200;
authoritative;
log-facility local7;

option capwap code 138 = ip-address;
option arch code 93 = unsigned integer 16;
option apc-vendor-cookie code 43 = string;
option rfc3442-classless-static-routes code 121 = array of integer 8;
class "vendor-class" {
  match option vendor-class-identifier;
}

In the SaltStack version, the values for default-lease-time and max-lease-time were hardcoded (the options and classes were not); however, I changed these to variables with default values to avoid carrying the hardcoded values forward as technical debt.

I initially created a server.yaml task file in my DHCP role that used ansible.builtin.lineinfile and ansible.builtin.blockinfile for the options and classes respectively. My thinking was that it would be easier to add options, in particular, as and when they are needed in other task files; however, I found this unsatisfactory as it complicated removing and/or renaming options. Instead, I changed to templating an entire global.conf file that I then included in the main dhcpd.conf. (All of the options and classes were defined in one block, like the one above, in my original SaltStack version.)
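
For illustration, the abandoned approach looked something like this - a reconstructed sketch rather than the actual task, using one of the options above as the example:

- name: CAPWAP option is defined
  become: true
  # Reconstructed sketch of the abandoned lineinfile-per-option approach
  # (not the original task) - removing or renaming an option later means
  # knowing what the old line looked like, which is what made this messy.
  ansible.builtin.lineinfile:
    path: /etc/dhcp/dhcpd.conf
    regexp: '^option capwap '
    line: option capwap code 138 = ip-address;
  notify: isc-dhcp-server is restarted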

The template looks like this:

ddns-update-style none;
default-lease-time {{ dhcp_default_lease_time }};
max-lease-time {{ dhcp_max_lease_time }};
authoritative;
log-facility local7;

# Custom options
{% for option in dhcp_options %}
option {{ option.name }} code {{ option.code }} = {{ option.type }};
{% endfor %}
# Classes
{% for class in dhcp_classes %}
class "{{ class.name }}" {
  {{ class.configuration | indent(2) }}
}
{% endfor %}

The tasks file that deploys this modular configuration file and includes it in the main configuration file looks like this - note that the configuration of /etc/default/isc-dhcp-server, which defines the interfaces the DHCP server listens on, has not yet been completed (I will look at this below).

---
- name: DHCP server software is installed
  become: true
  ansible.builtin.package:
    name: isc-dhcp-server
    state: present
- name: Modular DHCP configuration folder exists
  become: true
  ansible.builtin.file:
    path: /etc/dhcp/dhcpd.d
    state: directory
    owner: root
    group: root
    mode: '700'
- name: Global configuration file is correct
  become: true
  # Uses:
  # * dhcp_default_lease_time
  # * dhcp_max_lease_time
  # * dhcp_options
  # * dhcp_classes
  ansible.builtin.template:
    dest: /etc/dhcp/dhcpd.d/global.conf
    owner: root
    group: root
    mode: '400'
    src: dhcpd/global.conf
  notify: isc-dhcp-server is restarted
- name: Global configuration is included
  become: true
  ansible.builtin.lineinfile:
    path: /etc/dhcp/dhcpd.conf
    # This is necessary to ensure classes and options are defined before
    # any other includes (which may use them).
    insertbefore: ^include
    firstmatch: true  # Insert before first match, not last.
    line: include "/etc/dhcp/dhcpd.d/global.conf";
  notify: isc-dhcp-server is restarted
# XXX Need to configure /etc/default/isc-dhcp-server
- name: DHCP server is running
  become: true
  ansible.builtin.service:
    name: isc-dhcp-server
    enabled: true
    state: started
...

I added the new entry point to meta/argument_specs.yaml:

server:
  short_description: Install then configure the DHCP server's global settings.
  options:
    dhcp_default_lease_time:
      description: Default lease time for DHCP responses.
      type: int
      default: 3600
    dhcp_max_lease_time:
      description: >-
        Maximum lease time for DHCP responses (if the client requests
        longer than the default).
      type: int
      default: 7200
    dhcp_options:
      description: Custom options to define in the DHCP server configuration.
      type: list
      default: []
      elements: dict
      options:
        name:
          description: Option name in DHCP configuration file
          type: str
          required: true
        code:
          description: DHCP option code
          type: int
          required: true
        type:
          description: DHCP option type (e.g. `ip-address`, `string`).
          type: str
          required: true
    dhcp_classes:
      description: Custom classes to define in the DHCP server configuration.
      type: list
      default: []
      elements: dict
      options:
        name:
          description: Class name in DHCP configuration file
          type: str
          required: true
        configuration:
          description: Class configuration for the DHCP configuration file.
          type: str
          required: true

I added the default values to defaults/main.yaml in the dhcp role:

dhcp_default_lease_time: 3600
dhcp_max_lease_time: 7200
dhcp_options: []
dhcp_classes: []

I added a new play to my site.yaml playbook for the dhcp_servers group that does the global configuration. For now, I have hardcoded the dhcp_options and dhcp_classes variables at this level. I was experimenting with not having a default route on some of my restricted subnets, and using option 121 (rfc3442-classless-static-routes) to push out specific routes; however, most embedded devices, including my TP-Link EAP225 wireless access points and HP Microserver iLO interface, do not seem to accept these routes and only send traffic to the IP specified in the router option (hence it is commented out):

- hosts: dhcp_servers
  tags: dhcp
  tasks:
    - name: DHCP global configuration is correct
      ansible.builtin.import_role:
        name: dhcp
        tasks_from: server
      vars:
        dhcp_options:
          - name: capwap
            code: 138
            type: ip-address
          - name: client-system-architecture
            code: 93
            type: unsigned integer 16
          - name: apc-vendor-cookie
            code: 43
            type: string
          # - name: rfc3442-classless-static-routes
          #   code: 121
          #   type: array of integer 8
        dhcp_classes:
          - name: vendor-class
            configuration: match option vendor-class-identifier;
          - name: user-class
            configuration: match option user-class;
          - name: pxe-client-architecture
            configuration: |-
              match if substring(option vendor-class-identifier, 0, 9) = "PXEClient"
                and not (option user-class = "iPXE");
              match option client-system-architecture;
              next-server {{ network_pxe_server }};

Subnet configuration

After the global configuration, I needed to configure the subnets that the DHCP server manages. As with the global configuration, I opted to make this modular and include it in the main configuration (as opposed to my old SaltStack configuration, which generated a single monolithic configuration file). I created an add-subnet.yaml task file for the dhcp role - I considered having the main server.yaml file include this for each subnet but decided it was easier to keep the subnet configuration separate from the main server configuration.

The tasks file looks like this:

---
- name: Modular DHCP configuration folder exists
  become: true
  ansible.builtin.file:
    path: /etc/dhcp/dhcpd.d
    state: directory
    owner: root
    group: root
    mode: '700'
- name: Subnet configuration file is correct
  become: true
  # Uses:
  # * dhcp_subnet_name
  # * dhcp_subnet
  # * dhcp_ranges
  # * dhcp_router
  # * dhcp_subclasses
  # * dhcp_domain_name
  # * dhcp_domain_search
  # * dhcp_domain_name_servers
  ansible.builtin.template:
    dest: /etc/dhcp/dhcpd.d/subnet-{{ dhcp_subnet_name }}.conf
    owner: root
    group: root
    mode: '400'
    src: dhcpd/subnet.conf
  notify: isc-dhcp-server is restarted
- name: Subnet configuration is included
  become: true
  ansible.builtin.lineinfile:
    path: /etc/dhcp/dhcpd.conf
    line: include "/etc/dhcp/dhcpd.d/subnet-{{ dhcp_subnet_name }}.conf";
  notify: isc-dhcp-server is restarted
...

And the configuration template looked like this:

# {{ dhcp_subnet_name }} subnet configuration
subnet {{ dhcp_subnet | ansible.utils.ipaddr('network') }} netmask {{ dhcp_subnet | ansible.utils.ipaddr('netmask') }} {
{% for range in dhcp_ranges %}
  range {{ range.start }} {{ range.end }};
{% endfor %}
  option domain-name "{{ dhcp_domain_name }}";
  option domain-name-servers {{ dhcp_domain_name_servers }};
  option domain-search "{{ dhcp_domain_search }}";
  option routers {{ dhcp_router }};

{% for class_match in dhcp_subclasses | default([]) %}
{%   for subclass_match in class_match.subclasses %}
  subclass "{{ class_match.class_name }}" {{ subclass_match.match }} {
    {{ subclass_match.configuration | indent(4) }}
  }
{%   endfor %}
{% endfor %}
}

And the argument_specs.yaml entry for this new role entry point looked like this:

add-subnet:
  short_description: Add a new managed subnet to the DHCP server configuration.
  options:
    dhcp_subnet_name:
      description: Name for the subnet - used for the configuration filename.
      type: str
      required: true
    dhcp_subnet:
      description: The subnet in CIDR (xx.xx.xx.xx/xx) format
      type: str
      required: true
    dhcp_ranges:
      description: The ranges of dynamic addresses
      type: list
      elements: dict
      options:
        start:
          description: The start of the range
          type: str
          required: true
        end:
          description: The end of the range
          type: str
          required: true
    dhcp_domain_name:
      description: Domain name to be handed out to DHCP clients
      type: str
      required: true
    dhcp_domain_name_servers:
      description: Domain name servers to be handed out to DHCP clients
      type: str
      required: true
    dhcp_domain_search:
      description: Search domains to be handed out to DHCP clients
      type: str
      required: true
    dhcp_router:
      description: >-
        Router(s) to be handed out to DHCP clients (multiple
        routers are comma separated)
      type: str
      required: true
    dhcp_subclasses:
      description: Subclass matches to include in this subnet's configuration
      type: list
      elements: dict
      required: false
      options:
        class_name:
          description: Class name for this set of submatches
          type: str
          required: true
        subclasses:
          description: List of subclass matches for the specified `class_name`
          type: list
          elements: dict
          required: true
          options:
            match:
              description: Match for `class_name` this subclass represents
              type: str
              required: true
            configuration:
              description: Configuration for this subclass match
              type: str
              required: true

To use it, I added the subnets to a new inventory variable, networks, and constructed a loop over it in the new play for the dhcp_servers group that I added to site.yaml above:

- name: DHCP subnet configuration is correct
  ansible.builtin.include_role:
    name: dhcp
    tasks_from: add-subnet
  loop: '{{ networks | dict2items }}'
  vars:
    dhcp_subnet_name: '{{ item.value.name }}'
    dhcp_subnet: '{{ item.key }}'
    dhcp_ranges:
      - start: '{{ item.value.dhcp_start }}'
        end: '{{ item.value.dhcp_end }}'
    dhcp_router: '{{ item.value.router }}'
    dhcp_subclasses: '{{ item.value.subclasses | default([]) }}'
    dhcp_domain_name: '{{ item.value.domain_name }}'
    dhcp_domain_search: '{{ item.value.domain_search }}'
    dhcp_domain_name_servers: '{{ item.value.domain_name_servers }}'

The variable is a bit complicated as I used some YAML anchors and aliases to deduplicate what I could, particularly for the subnets on which I enabled PXE booting and for overriding the DNS settings on the subnets I don’t want using my internal DNS:

# Catch-22 for bootstrapping - domain must be correctly set on the
# host being targeted.
network_default: &net_def
  domain_name: '{{ ansible_facts.domain }}'
  domain_name_servers: router.{{ ansible_facts.domain }}
  domain_search: '{{ ansible_facts.domain }}'

network_public_dns: &net_pub_dns
  domain_name_servers: 8.8.8.8

# PXE related settings
network_pxe_server: starfleet-command.{{ ansible_facts.domain }}
network_subclasses_pxe_user_class: &net_pxe_subclass_user_class
  class_name: user-class
  subclasses:
    - match: '"iPXE"'
      configuration: |-
        next-server {{ network_pxe_server }};
        filename "ipxe.cfg";
network_subclasses_pxe_client_architecture: &net_pxe_subclass_client_arch
  # Client architecture:
  # 0 == BIOS
  # 6 == 32-bit x86 EFI
  # 7 == 64-bit x86 EFI
  # 10 == 32-bit ARM EFI
  # 11 == 64-bit ARM EFI
  class_name: pxe-client-architecture
  subclasses:
    - match: '00:00'
      configuration: |-
        filename "undionly.kpxe";
    - match: '00:07'
      configuration: |-
        filename "ipxe.efi";

networks:
  192.168.10.0/24:
    <<: *net_def
    name: management
    dhcp_start: 192.168.10.100
    dhcp_end: 192.168.10.150
    router: router-mgmt.{{ ansible_facts.domain }}
    subclasses:
      # PXE subclasses
      - *net_pxe_subclass_user_class
      - *net_pxe_subclass_client_arch
      # Specific hardware vendor settings - only required on management network
      - class_name: vendor-class
        subclasses:
          - match: '"TP-LINK"'
            configuration: |-
              option capwap omada.{{ ansible_facts.domain }};
          - match: '"APC"'
            configuration: |-
              option apc-vendor-cookie 01:04:31:41:50:43;
  192.168.20.0/24:
    <<: *net_def
    name: main
    dhcp_start: 192.168.20.100
    dhcp_end: 192.168.20.160
    router: router.{{ ansible_facts.domain }}
    subclasses:
      # PXE subclasses
      - *net_pxe_subclass_user_class
      - *net_pxe_subclass_client_arch
  192.168.30.0/24:
    <<: [*net_pub_dns, *net_def]
    name: iot
    dhcp_start: 192.168.30.100
    dhcp_end: 192.168.30.200
    router: router-iot.{{ ansible_facts.domain }}
  192.168.31.0/24:
    <<: [*net_pub_dns, *net_def]
    name: cctv
    dhcp_start: 192.168.31.100
    dhcp_end: 192.168.31.120
    router: router-cctv.{{ ansible_facts.domain }}
  192.168.32.0/24:
    <<: [*net_pub_dns, *net_def]
    name: solar
    dhcp_start: 192.168.32.100
    dhcp_end: 192.168.32.120
    router: router-solar.{{ ansible_facts.domain }}
  192.168.40.0/24:
    <<: [*net_pub_dns, *net_def]
    name: guest
    dhcp_start: 192.168.40.100
    dhcp_end: 192.168.40.160
    router: router-guest.{{ ansible_facts.domain }}
  192.168.50.0/24:
    <<: *net_def
    name: services
    dhcp_start: 192.168.50.100
    dhcp_end: 192.168.50.200
    router: router-service.{{ ansible_facts.domain }}

Configuring which interfaces to be the DHCP server for

I mentioned earlier that I needed to configure /etc/default/isc-dhcp-server.

I thought long and hard about how to identify which interfaces the DHCP server needs to listen on. I considered:

  • Including interfaces based on whether they are designated as the router - but this precludes having different devices acting as router and DHCP server on one network.
  • Including interfaces based on whether they are given static IP assignments in the network - but this precludes a DHCP server having a static IP on a network it is not the DHCP server for (i.e. being the DHCP server on one network but also having a static IP on another network where a different host/device is the DHCP server).
  • Explicitly marking interfaces on which it should be the DHCP server - this gives maximum flexibility as both of the above scenarios can be catered for, so I decided to adopt this approach.

To keep the role as flexible as possible, and avoid contaminating it with anything bespoke to my inventory, I just added the interfaces as a straight list to the dhcp role’s server entry point:

dhcp_interfaces:
  description: List of network interfaces to configure the DHCP server to listen on.
  type: list
  elements: str
  required: true

Configuring the actual list of interfaces in the file is a very simple ansible.builtin.lineinfile, which directly replaced the placeholder comment # XXX Need to configure /etc/default/isc-dhcp-server from before:

- name: /etc/default/isc-dhcp-server is correctly configured
  become: true
  ansible.builtin.lineinfile:
    path: /etc/default/isc-dhcp-server
    regex: ^#?INTERFACESv4=
    line: INTERFACESv4="{{ dhcp_interfaces | join(' ') }}"
  notify: isc-dhcp-server is restarted

The complicated bit is generating this list, which for now I added to the play targeting dhcp_servers in site.yaml (which is growing a bit large and I should split up into smaller playbooks):

# Build map of MAC -> interfaces
- name: MAC address to interface names is populated
  ansible.builtin.set_fact:
    mac_interface_map: >-
      {{
        mac_interface_map
        |
        combine({
          interface_details.macaddress: item
        })
      }}
  vars:
    # Initial value, fact will take precedence once defined
    mac_interface_map: {}
    interface_details: '{{ ansible_facts[item] }}'
  loop: '{{ ansible_facts.interfaces }}'
  # Skip VLAN subinterfaces and interfaces with no MAC (e.g. VPN and loopback)
  when: "'.' not in item and 'macaddress' in interface_details"
- name: Interfaces that are designated for DHCP serving are included
  ansible.builtin.set_fact:
    dhcp_interface_list: >-
      {{
        dhcp_interface_list
        +
        [
          {
            'mac_address': item.mac_address,
            'interface': mac_interface_map[item.mac_address] + (item.vlan is defined) | ternary('.' + item.vlan | default('') | string, ''),
            'vlan': item.vlan | default(omit),
          }
        ]
      }}
  loop: '{{ interfaces }}'
  vars:
    # Initial value for dhcp_interface_list - fact will take
    # priority once initialised.
    dhcp_interface_list: []
  when: item.dhcp_server | default(false)
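
For a single tagged interface, this produces entries like the following (eno1 is a hypothetical physical interface name, standing in for whatever the MAC maps to on the host):

# Hypothetical resulting dhcp_interface_list entry for VLAN 1 on eno1
- mac_address: aa:bb:cc:dd:ee:ff
  interface: eno1.1
  vlan: 1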

Finally, I just passed this list into the existing call to import the server tasks from the dhcp role:

vars:
  dhcp_interfaces: "{{ dhcp_interface_list | map(attribute='interface') }}"

In my inventory, I added a new variable with the interfaces (specifying mac_address, which is used to map to the OS’s interface names). The names are currently not used but will be very shortly.

interfaces:
  - name: xxx-mgmt
    mac_address: aa:bb:cc:dd:ee:ff
    vlan: 1
    dhcp_server: true
    cnames:
      - router-mgmt
  # Main network - main interfaces, name is implicitly this inventory_host
  - mac_address: aa:bb:cc:dd:ee:ff
    vlan: 2
    dhcp_server: true
    cnames:
      - router
  - name: xxx-iot
    mac_address: aa:bb:cc:dd:ee:ff
    vlan: 3
    dhcp_server: true
    cnames:
      - router-iot
  - name: xxx-cctv
    mac_address: aa:bb:cc:dd:ee:ff
    vlan: 4
    dhcp_server: true
    cnames:
      - router-cctv
  - name: xxx-solar
    mac_address: aa:bb:cc:dd:ee:ff
    vlan: 5
    dhcp_server: true
    cnames:
      - router-solar
  - name: xxx-guest
    mac_address: aa:bb:cc:dd:ee:ff
    vlan: 6
    dhcp_server: true
    cnames:
      - router-guest
  - name: xxx-service
    mac_address: aa:bb:cc:dd:ee:ff
    vlan: 7
    dhcp_server: true
    cnames:
      - router-service
  - name: xxx-external
    mac_address: ff:ee:dd:cc:bb:aa
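
With this inventory, and still assuming the MAC maps to a physical interface called eno1 (as in the sketch above), the rendered line in /etc/default/isc-dhcp-server ends up as:

INTERFACESv4="eno1.1 eno1.2 eno1.3 eno1.4 eno1.5 eno1.6 eno1.7"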

N.B. I now have two ways of specifying mac_address: a top-level variable used for DHCP discovery (which does not work for multi-homed systems) and now the interfaces list. I needed to revisit my DHCP discovery and update it to use the interfaces variable instead, removing the defunct mac_address, which I do below.

Host-specific configuration

I already have a template for host-specific configurations from my work on ignoring client ids, where I used /etc/dhcp/dhcpd.d/host-{{ inventory_hostname }}.conf as the host-specific filename template and included those files in the main configuration, like I did with the subnet configuration above.

Thinking about the filename, in light of the changes to the inventory variables adding aliases for hosts, I decided to retain a file per-host (inventory_hostname based naming) as it seems more manageable for all interfaces on a specific host to be in the same file. However, this complicates adding individual lines of configuration to specific aliases (as I did for ignoring the client id), so I will need to reimplement that piece - but that needs revisiting to use the new interfaces rather than mac_address host variable, anyway.

A further problem I have is that isc-dhcp-server does not support multi-homed clients (from the dhcpd.conf manual):

Please note that the current implementation assumes clients only have a single network interface. A client with two network interfaces will see unpredictable behavior. This is considered a bug, and will be fixed in a later release.

In practice, there are two issues:

  1. The same host configuration cannot have multiple MAC addresses (e.g. if the same host has two interfaces); each MAC needs to be treated as a different host in the configuration.
  2. The same MAC address cannot be separately configured for different subnets (e.g. different host name per network). This means that each VLAN configured on the same host cannot be given a separate hostname unless a static IP assignment is provided (as a unique static IP per subnet can be configured via multiple host blocks for the same MAC).

Dnsmasq would allow me to ignore the client-provided name but still do dynamic DNS with the configured DNS names. The problem with migrating my network to use dnsmasq for DHCP and DNS is that it does not support zones (or an equivalent) to, for example, only present internal resolution to some networks and only allow public internet resolution on others (e.g. IoT and guest).

For now, I resigned myself to needing to assign IP addresses when a host has a presence on multiple VLANs through one interface - at some point I may need to migrate to Kea, ISC’s replacement for ISC DHCP server, as that allows host configuration to be created per subnet, which would solve the problem.
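
As an illustration of that workaround, the host blocks for one MAC with a presence on two subnets end up looking something like this (hypothetical names, MAC and addresses):

# Hypothetical example: the same MAC declared in two host blocks, one
# fixed-address per subnet, so each presence gets its own static IP.
host example-mgmt {
  hardware ethernet aa:bb:cc:dd:ee:ff;
  fixed-address 192.168.10.250;
}
host example-iot {
  hardware ethernet aa:bb:cc:dd:ee:ff;
  fixed-address 192.168.30.250;
}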

The basic add-host.yaml task file I added to the dhcp role was very similar to the add-subnet.yaml I created above:

---
- name: Modular DHCP configuration folder exists
  become: true
  ansible.builtin.file:
    path: /etc/dhcp/dhcpd.d
    state: directory
    owner: root
    group: root
    mode: '700'
- name: Host configuration file is correct
  become: true
  # Uses:
  # * dhcp_host_name
  # * dhcp_host_configuration
  ansible.builtin.template:
    dest: /etc/dhcp/dhcpd.d/host-{{ dhcp_host_name }}.conf
    owner: root
    group: root
    mode: '400'
    src: dhcpd/host.conf
  notify: isc-dhcp-server is restarted
- name: Host configuration is included
  become: true
  ansible.builtin.lineinfile:
    path: /etc/dhcp/dhcpd.conf
    line: include "/etc/dhcp/dhcpd.d/host-{{ dhcp_host_name }}.conf";
  notify: isc-dhcp-server is restarted
...

The dhcp_host_configuration variable is a list of configurations for the host, dhcp_host_name is just used for the filename. The configuration file template looks like this:

# {{ dhcp_host_name }} host configuration
{% for config_entry in dhcp_host_configuration %}
host {{ config_entry.name }} {
  hardware ethernet {{ config_entry.mac_address }};
  ignore-client-uids {{ config_entry.ignore_client_id | default(false) | ternary('true', 'false') }};
{%   if 'ip4_address' in config_entry.keys() and config_entry.ip4_address | default(false) %}
  fixed-address {{ config_entry.ip4_address }};
{%   endif %}
}
{% endfor %}

The variables, and the definition of their contents, I added to the dhcp role’s argument_specs.yaml:

add-host:
  short_description: Add a new managed host to the DHCP server configuration.
  options:
    dhcp_host_name:
      description: Name for the host - used for the configuration filename.
      type: str
      required: true
    dhcp_host_configuration:
      description: The configurations for the host (one per interface)
      type: list
      elements: dict
      options:
        name:
          description: The name of the configuration's `host` entry
          type: str
          required: true
        mac_address:
          description: The MAC address of the interface
          type: str
          required: true
        ignore_client_id:
          description: Whether to tell the DHCP server to ignore the host's client ID or not
          type: bool
          default: false
          required: false
        ip4_address:
          description: A static IP address to configure the DHCP server to issue to this host
          type: str
          required: false

To the existing networks configuration variable, I added a new key to each network called ip4_assignments, which links IPs in the subnet to the interface name (which is implicitly the inventory’s host name if the interface has no name). I did it this way, instead of attaching the IP address to the host (as I did with SaltStack), because it makes it much easier to track which IP addresses are statically assigned or available in a network.

networks:
  192.168.10.0/24:
    #...
    ip4_assignments:
      192.168.10.250: xxx-mgmt
  192.168.20.0/24:
    #...
    ip4_assignments:
      192.168.20.250: xxx
  192.168.30.0/24:
    #...
    ip4_assignments:
      192.168.30.250: xxx-iot
  192.168.31.0/24:
    #...
    ip4_assignments:
      192.168.31.250: xxx-cctv
  192.168.32.0/24:
    #...
    ip4_assignments:
      192.168.32.250: xxx-solar
  192.168.40.0/24:
    #...
    ip4_assignments:
      192.168.40.250: xxx-guest
  192.168.50.0/24:
    #...
    ip4_assignments:
      192.168.50.250: xxx-service

Finally, I added add-host.yaml to my top-level site.yaml (as I mentioned before, I really need to start splitting this up) for each host I needed to configure in the DHCP server. Ironically, perhaps, this is the most complicated bit as I had to wrangle the various data structures that I designed for my personal ease into the arguments that I designed to keep the roles straightforward. I will step through this piece-by-piece…

Firstly, I created a map of names to IP addresses - combining and reversing the IP-to-name maps from each network:

- name: Map of all names to static IP addresses is known
  ansible.builtin.set_fact:
    name_ip_map: >-
      {{
        name_ip_map
        |
        combine({
          item.value: item.key
        })
      }}
  vars:
    name_ip_map: {}  # Fact will take precedence once set
  loop: "{{ networks.values() | map(attribute='ip4_assignments') | map('dict2items') | flatten }}"

Next, I created a dictionary of DHCP host information from the list of interfaces for each host, grouped by host (so they end up in a per-host configuration file). This is the most complicated bit:

- name: Host interface information is wrangled into dhcp host information
  # This works by building a list of (host, interface) for
  # every interface defined in the inventory then looping over
  # that to create a dictionary mapping host (in an ansible
  # inventory sense): list of dhcp host information for each
  # interface.
  block:
    - name: List of all interfaces is known
      ansible.builtin.set_fact:
        all_host_interfaces: >-
          {{
            all_host_interfaces
            +
            [item.key] | product(item.value.interfaces | default([]))
          }}
      vars:
        all_host_interfaces: []  # fact will take precedence once defined
      loop: '{{ hostvars | dict2items }}'
      loop_control:
        label: '{{ item.key }}'
    - name: List of DHCP host configurations is known
      ansible.builtin.set_fact:
        # Note that the DHCP host name defaults to the inventory
        # host name (`item.0`) if not defined.
        dhcp_host_configurations: >-
          {{
            dhcp_host_configurations
            |
            combine({
              item.0:
                dhcp_host_configurations[item.0] | default([])
                +
                [{
                  'mac_address': item.1.mac_address,
                  'name': item.1.name | default(item.0),
                  'ip4_address': name_ip_map[item.1.name | default(item.0)] | default(None),
                }]
            })
          }}
      vars:
        dhcp_host_configurations: {}  # fact will take precedence once defined
      loop: '{{ all_host_interfaces }}'
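
For example, one of the printer hosts that I add below ends up with an entry like this (its single unnamed interface defaults to the inventory hostname):

printer-1:
  - mac_address: 00:11:22:33:44:55
    name: printer-1
    ip4_address: 192.168.20.30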

Finally, I use the structure I have just built to loop over each host (in an inventory sense) and add all of its DHCP host information:

- name: DHCP host configuration is correct
  ansible.builtin.include_role:
    name: dhcp
    tasks_from: add-host
  loop: '{{ dhcp_host_configurations.keys() }}'
  vars:
    dhcp_host_name: '{{ item }}'
    dhcp_host_configuration: '{{ dhcp_host_configurations[item] }}'

To test this, I added my two printers (which are currently still in the “main” (vlan 20) subnet). I chose the printers having already ascertained that their drivers insist on referring to at least one of them by IP address and will not accept a hostname. This means that, unfortunately, they will be difficult (if not impossible) to migrate to a more dynamic setup, which is my ultimate goal. First, I added them to the network definition:

  192.168.20.0/24:
    <<: *net_def
    name: main
    dhcp_start: 192.168.20.100
    dhcp_end: 192.168.20.160
    router: router.{{ ansible_facts.domain }}
    subclasses:
      # PXE subclasses
      - *net_pxe_subclass_user_class
      - *net_pxe_subclass_client_arch
    ip4_assignments:
      192.168.20.30: printer-1 # Old printer
      192.168.20.31: printer-2 # New printer
      192.168.20.250: xxx

And then added them as hosts to my existing dummy group (which will not target them for any configuration):

dummy:
  hosts:
    # Printers (currently dummy, don't connect to them at the moment)
    printer-1:
      interfaces:
        - mac_address: 00:11:22:33:44:55
    printer-2:
      interfaces:
        - mac_address: 11:22:33:44:55:66

Aside from the obviously dummy hostnames and MAC addresses, this is a complete copy and paste of the configuration - which worked perfectly.

Tidying up old files

The final piece of this particular puzzle is removing defunct hosts and subnets if they are removed from the configuration (if they change, or new ones are added, their configuration should be updated by these existing tasks).

Fortunately, in order to configure them, we have just built a list of all the valid subnets and hosts, so this is relatively straightforward, especially considering the per-subnet or per-host file approach. To determine what needs to be removed, I just needed to obtain a list of relevant configuration files (host-*.conf and subnet-*.conf in /etc/dhcp/dhcpd.d) from the server and identify any that are not in the desired lists. These defunct ones then need to be removed from being included in the main configuration and then deleted - I called this remove-defunct-files.yaml in the dhcp role’s tasks:

---
- name: List of subnet files is known
  become: true
  ansible.builtin.find:
    paths:
      - /etc/dhcp/dhcpd.d
    patterns:
      - subnet-*.conf
  register: dhcp_subnet_configuration_files
- name: List of host files is known
  become: true
  ansible.builtin.find:
    paths:
      - /etc/dhcp/dhcpd.d
    patterns:
      - host-*.conf
  register: dhcp_host_configuration_files
- name: Files to remove is known
  ansible.builtin.set_fact:
    dhcp_defunct_files: >-
      {{
        dhcp_defunct_files
        +
        item.file_list | reject('in', item.permitted_files)
      }}
  vars:
    dhcp_defunct_files: []  # Fact will take precedence when set
  loop:
    - permitted_files: >-
        {{
          ['subnet-']
          | product(dhcp_permitted_subnets)
          | product(['.conf'])
          | map('map', 'join', '')
          | map('join', '')
        }}
      file_list: >-
        {{
          dhcp_subnet_configuration_files.files
          | map(attribute='path')
          | map('basename')
        }}
    - permitted_files: >-
        {{
          ['host-']
          | product(dhcp_permitted_hosts)
          | map('join', '')
          | product(['.conf'])
          | map('join', '')
        }}
      file_list: >-
        {{
          dhcp_host_configuration_files.files
          | map(attribute='path')
          | map('basename')
        }}
- name: Defunct files are not included in main configuration file
  become: true
  ansible.builtin.lineinfile:
    path: /etc/dhcp/dhcpd.conf
    line: include "/etc/dhcp/dhcpd.d/{{ item }}";
    state: absent
  notify: isc-dhcp-server is restarted
  loop: '{{ dhcp_defunct_files }}'
- name: Defunct files are removed
  become: true
  # No need to notify isc-dhcp-server as already not in the main
  # config, so not active part of configuration (unless it was
  # just removed, in which case task above will have notified).
  ansible.builtin.file:
    path: /etc/dhcp/dhcpd.d/{{ item }}
    state: absent
  loop: '{{ dhcp_defunct_files }}'
...

The arguments, which I added to the role’s argument_specs.yaml, are just a list of permitted subnets and hosts:

remove-defunct-files:
  short_description: Remove configuration files for hosts and subnets not in the list
  options:
    dhcp_permitted_subnets:
      description: List of permitted subnet names
      type: list
      elements: str
      required: true
    dhcp_permitted_hosts:
      description: List of permitted host names
      type: list
      elements: str
      required: true

Finally, I just needed to add one task to my play for dhcp_servers, as the information for these arguments is already available:

- name: Superfluous DHCP configurations do not exist
  ansible.builtin.include_role:
    name: dhcp
    tasks_from: remove-defunct-files
  vars:
    dhcp_permitted_hosts: '{{ dhcp_host_configurations.keys() }}'
    dhcp_permitted_subnets: "{{ networks.values() | map(attribute='name') }}"

Updating DHCP IP discovery and ignoring client ID

Previously I wrote Ansible tasks to look up IPs for DHCP clients and, later, to ignore client IDs. These both used a mac_address scalar value I set on the host in inventory. I have just added an interfaces array variable that allows specifying multiple mac_address values for multi-homed systems, so these need updating and the mac_address variable removing.

For existing single-homed machines with a mac_address I needed to move this variable’s value to the new interfaces variable. This was a case of replacing, e.g.:

mac_address: 12:34:56:78:90:ab

with:

interfaces:
  - mac_address: 12:34:56:78:90:ab

DHCP IP discovery

In the spirit of updating documentation before changing the code, I updated the argument specification for the lookup-host entry point in the dhcp role’s meta/argument_specs.yaml:

mac_addresses:
  description: >-
    The mac addresses of the host to lookup the IP for (in the
    format aa:bb:cc:dd:ee:ff). The first address for any of the
    supplied `mac_addresses` will be found. To find addresses
    for multiple `mac_addresses`, call this entry point
    repeatedly.

As can be seen from the description, my intention is that an IP (as in exactly one IP) will be found for any of the MAC addresses in the list. If more than one IP is wanted, e.g. the IP for each interface, then the role can be applied repeatedly to find the IP for each interface wanted.

The code change required to lookup-host.yaml is a simple change of | selectattr('mac', 'eq', mac_address) to | selectattr('mac', 'in', mac_addresses) in each of the places that a filter is used to find the desired matches. When I first tested the code, I found it returned a long-expired lease - one that expired in January (it is currently late March). I therefore also added a filter to only return unexpired leases - I did not make this an optional filter because I cannot imagine a situation where the expired lease would be useful (cf. waiting for a current one). The additional filter, which I chained after the new mac filter, is | selectattr('expires', 'gt', now().timestamp()).

The complete, revised, lookup-host.yaml file is:

---
- name: DHCP leases are fetched from DHCP server
  # Uses `dhcp_server_host` for where to connect
  ansible.builtin.include_tasks: lookup-dhcp-leases.yaml
- name: dhcp_lease for the requested MAC address is set
  ansible.builtin.set_fact:
    dhcp_lease:
      ip: "{{ this_lease.ip | ansible.utils.ipaddr }}"
      hostname: "{{ this_lease.hostname }}"
      expires: "{{ this_lease.expires }}"
  vars:
    # Sort sorts ascending, so `last` is the most recent lease
    this_lease: >-
      {{
        dhcp_leases
        | selectattr('mac', 'in', mac_addresses)
        | selectattr('expires', 'gt', now().timestamp())
        | sort(attribute='expires')
        | last
      }}
  when: dhcp_leases is defined and (dhcp_leases | selectattr('mac', 'in', mac_addresses) | selectattr('expires', 'gt', now().timestamp()) | length > 0)
- name: dhcp_lease is None if no leases for the MAC found
  ansible.builtin.set_fact:
    dhcp_lease: null
  when: dhcp_leases is not defined or (dhcp_leases | selectattr('mac', 'in', mac_addresses) | selectattr('expires', 'gt', now().timestamp()) | length == 0)
- name: Recursion delay of {{ wait_for_lease_delay }}s has happened, if required
  delegate_to: localhost  # Stop Ansible trying to connect to the host to do the wait_for
  ansible.builtin.wait_for:
    timeout: "{{ wait_for_lease_delay }}"
  when: wait_for_lease and dhcp_lease is none
- name: Recursion has happened, if required
  ansible.builtin.include_tasks: lookup-host.yaml
  when: wait_for_lease and dhcp_lease is none
...

Ignoring client ID

My install.yaml playbook, which orchestrates the initial install of an OS, configures the DHCP server to temporarily ignore the client’s client ID, as the install process uses several IDs and consistency in the IP address makes it much easier to orchestrate with Ansible (particularly as different numbers of DHCP clients and requests are involved depending on the method used to start the installer). This used the mac_address inventory variable that I have just replaced with the interfaces list.

I have just added a way to configure the host-specific DHCP settings. Previously, I had added entry points (task files) to the dhcp role to ignore the client ID by adding specific lines to do so. Now that I have a way to fully configure host settings, including ignoring the client ID, I want to use this and render the old method redundant.

I initially tried to directly update the ignore-client-ID tasks in my existing install.yaml. In order to capture the configuration information for the host configuration, following the pattern I used in site.yaml (this somewhat duplicates that, and I do need to refactor to remove the duplication), I needed the networks variable I configured in the domain’s group variables.

In the code in my site.yaml, I used the ansible_facts.domain variable, from the host’s discovered facts, to determine the domain on “real” hosts, and the second part of splitting the inventory_hostname on . once on hosts in the dummy group. This does not work for use in the tasks targeting the host to be installed in install.yaml, which is how I originally updated the code - adding delegation to the add-host entry point tasks (with delegate_to, in the same way as the IP lookup works).

I soon realised that this was not the best approach, and that I should instead extract the tasks that alter the host’s DHCP configuration into plays, targeting the DHCP servers, before and after the install tasks (which target the host to be installed). This removed the need to delegate any tasks in relation to the client ID alteration and meant I could use the ansible_facts.domain (which I had ended up prompting for, as detecting it from the local host was unreliable - I will explain why below) from the DHCP servers to look up their networks. I kept, however, lookup-host as an entry point that does use delegation to run tasks on the DHCP server - this feels right to me. The looking up of the IP address is a task about the host being looked up, the fact is set on that host, and so I think it can logically be thought of as the current target host delegating the looking up of its own IP address to the DHCP server.

The change that I did keep, from this failed workaround, was to replace every use of ansible_facts.domain with domain, a new variable. This allowed me to combine the previous two ways of detecting the domain (via targeting all:!dummy and dummy in separate plays to add the domain group) into a single task that works on both, where the method used is embedded in a group-level variable. This also makes overriding, if needed, easy - which I was doing with a play-local variable via vars_prompt initially, before I changed to targeting the dhcp servers with the add-host entry point of the dhcp role.

In the all group, I set this group level variable to default to the existing value:

domain: '{{ ansible_facts.domain }}'

There were a lot of places where I made this replacement, for example the hostname variable (also set on the all group), which changed from:

hostname: "{{ inventory_hostname }}{% if '.' not in inventory_hostname and ansible_facts.domain %}.{{ ansible_facts.domain }}{% endif %}"

to:

hostname: "{{ inventory_hostname }}{% if '.' not in inventory_hostname %}.{{ domain }}{% endif %}"

and the network defaults (set on the domain group) changed from:

network_default: &net_def
  domain_name: '{{ ansible_facts.domain }}'
  domain_name_servers: router.{{ ansible_facts.domain }}
  domain_search: '{{ ansible_facts.domain }}'

to:

network_default: &net_def
  domain_name: '{{ domain }}'
  domain_name_servers: router.{{ domain }}
  domain_search: '{{ domain }}'

I created a new group-level variable file for the dummy group, group_vars/dummy.yaml, which sets the domain variable to the domain found by splitting the hostname once on . and taking the second value:

---
domain: "{{ inventory_hostname.split('.', 1)[1] }}"
...

Using this domain variable allowed me to combine the old two methods of determining the domain (to add the domain group) into one. The old code:

- hosts: all:!dummy
  tags: always  # Always add extra groups and lookup sudo password
  tasks:
    - name: Group hosts by domain (mainly for environment detection)
      ansible.builtin.group_by:
        key: domain_{{ ansible_facts.domain | replace('.', '_') }}
# ...
# Group dummy hosts, without attempting to connect to them
- hosts: dummy
  gather_facts: no
  tags: always  # Always add extra groups
  tasks:
    - name: Group hosts by domain (mainly for environment detection)
      ansible.builtin.group_by:
        key: domain_{{ domain | default(inventory_hostname.split('.', 1)[1]) | replace('.', '_') }}

Is replaced with:

- hosts: all
  tags: always  # Always add extra groups
  # Dummy group cannot gather facts but non-dummies should have
  # their facts gathered above.
  gather_facts: no
  tasks:
    - name: Group hosts by domain (mainly for environment detection)
      ansible.builtin.group_by:
        key: domain_{{ domain | replace('.', '_') }}

The comment regarding non-dummy hosts having their facts gathered refers to this not being the first play in the playbook (the earlier play targets hosts: all:!dummy and has a task to run ansible.builtin.setup, which populates ansible_facts). I also combined the revised sudo password lookup, which I initially put in the play with the domain configuration, into the play that automatically uses a local connection for the current host. That revised play now looks like this:

- hosts: all:!dummy
  # Need correct ansible_connection value to gather facts, which is not
  # yet set for everything.
  gather_facts: no
  tags: always  # Always set connection and lookup sudo password
  tasks:
    - block:
        - name: Find local hostname
          delegate_to: localhost
          ansible.builtin.command: /bin/hostname
          register: local_hostname
          changed_when: false # A read-only command (in this form)
          check_mode: false # Run even in check-mode
        - name: Set connection to local for this machine
          # Magic to delegate to the inventory host that is the local
          # machine.
          delegate_to: "{{ local_hostname.stdout }}"
          # Magic to set this fact on the delegated host - which is the
          # local one.
          delegate_facts: yes
          ansible.builtin.set_fact:
            ansible_connection: local
      run_once: yes
    - name: Ensure fact cache is up to date
      ansible.builtin.setup:
    - name: Ansible sudo password is retrieved from vault, if known
      delegate_to: localhost
      community.hashi_vault.vault_read:
        # So many things can determine the remote username (
        # ansible_user variable, SSH_DEFAULT_USER environment
        # variable, .ssh/config, etc. etc.) it's safer to use the
        # discovered fact.
        path: kv/hosts/{{ inventory_hostname }}/users/{{ ansible_facts.user_id }}
      register: sudo_pass
      # No password in vault is fine - will just not set it.
      failed_when: false
    - name: sudo password is set for host, if found in the vault
      ansible.builtin.set_fact:
        ansible_become_password: '{{ sudo_pass.data.data.password }}'
      when: "'data' in sudo_pass"

In my install.yaml, I added a check that there is at least one MAC for the host being installed, alongside the existing check that INSTALL_HOSTS is set (this should only be passed by explicitly typing the target host to --extra-vars on the command line, as a safety check that the correct host is being targeted). As you can see from the comment, I hope to support “discovering” uninstalled machines via other methods (e.g. specifying the switch port they are attached to and querying the switch) but for now MAC and DHCP are it (there’s also a reference to “the reinstall script” that doesn’t exist yet):

- hosts: localhost
  # Don't use facts, so save some time by not bothering to gather them.
  gather_facts: false
  any_errors_fatal: true
  tasks:
    # If this task fails, Ansible will abort the whole playbook and not
    # run subsequent plays on any host.
    - name: INSTALL_HOSTS is set
      ansible.builtin.assert:
        that: INSTALL_HOSTS is defined
        fail_msg: >-
          Set host to be installed in variable INSTALL_HOSTS - note
          this action may be destructive!
    # How else can the client be identified? Could we look up the
    # MAC from a switch or existing DHCP lease, perhaps? - possibly
    # one for the reinstall script rather than install?
    - name: All hosts matched by INSTALL_HOSTS can be discovered
      ansible.builtin.assert:
        that: >-
          hostvars[item].interfaces
          | map(attribute='mac_address')
          | length > 0
        fail_msg: >-
          At least one interface with a MAC address (set in
          `mac_address`) must exist on {{ item }} for discovery.
      loop: "{{ query('ansible.builtin.inventory_hostnames', INSTALL_HOSTS) }}"

Next, I took the DHCP host configuration from site.yaml as a template and added a new play (below this sanity check) that reconfigures just the INSTALL_HOSTS to ignore the DHCP client’s supplied client ID. I had to add remove-mac-address-client-id, which was a task file included by a handler notified by the now defunct host-ignore-client-id entry point:

- hosts: dhcp_servers
  tags: dhcp
  tasks:
    - name: Group hosts by domain (mainly for environment detection)
      ansible.builtin.group_by:
        key: domain_{{ domain | replace('.', '_') }}
    - name: List of all mac addresses is known
      ansible.builtin.set_fact:
        all_install_host_macs: >-
          {{
            all_install_host_macs
            +
            hostvars[item].interfaces
            | map(attribute='mac_address')
            | default([])
          }}
      vars:
        all_install_host_macs: []  # fact will take precedence once defined
      loop: "{{ query('ansible.builtin.inventory_hostnames', INSTALL_HOSTS) }}"
    - name: Map of all names to static IP addresses is known
      ansible.builtin.set_fact:
        name_ip_map: >-
          {{
            name_ip_map
            |
            combine({
              item.value: item.key
            })
          }}
      vars:
        name_ip_map: {}  # Fact will take precedence once set
      loop: >-
        {{
          networks.values()
          | map(attribute='ip4_assignments')
          | map('dict2items')
          | flatten
        }}
    - name: List of all interfaces is known
      ansible.builtin.set_fact:
        all_host_interfaces: >-
          {{
            all_host_interfaces
            +
            [item] | product(hostvars[item].interfaces | default([]))
          }}
      vars:
        all_host_interfaces: []  # fact will take precedence once defined
      loop: "{{ query('ansible.builtin.inventory_hostnames', INSTALL_HOSTS) }}"
    - name: List of DHCP host configurations is known
      ansible.builtin.set_fact:
        # Note that the DHCP host name defaults to the inventory
        # host name (`item.0`) if not defined.
        dhcp_host_configurations: >-
          {{
            dhcp_host_configurations
            |
            combine({
              item.0:
                dhcp_host_configurations[item.0] | default([])
                +
                [{
                  'mac_address': item.1.mac_address,
                  'name': item.1.name | default(item.0),
                  'ip4_address': name_ip_map[item.1.name | default(item.0)] | default(None),
                }]
            })
          }}
      vars:
        dhcp_host_configurations: {}  # fact will take precedence once defined
      loop: '{{ all_host_interfaces }}'
    - name: DHCP host configuration is correct (ignores client id)
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: add-host
      loop: '{{ dhcp_host_configurations.keys() }}'
      vars:
        dhcp_host_name: '{{ item }}'
        # Temporarily ignore client id and remove any static IP
        # map, as static IPs do not show up in isc-dhcp-server's ip
        # leases and, for the install, I don't want to pre-suppose which
        # interface will be the one to PXE boot.
        dhcp_host_configuration: >-
          {{
            dhcp_host_configurations[item]
            | map('combine', {'ip4_address': None})
            | map('combine', {'ignore_client_id': true})
          }}
    - name: All but latest lease with client ID are removed
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: remove-mac-address-client-id
      vars:
        mac_addresses: '{{ all_install_host_macs }}'

The remove-mac-address-client-id changes were literally to turn the tasks that removed the individual mac_address from the leases files into a loop over the list mac_addresses, in both the isc-dhcp-server and dnsmasq specific versions. In the dnsmasq file, remove-mac-address-client-id-dnsmasq.yaml, I also changed the task that finds the latest lease and re-adds it to the lease file without a client ID (so the current lease does not change) to loop over the mac_addresses. I also removed the “mac address dhcp client id is removed” handler from the dhcp role’s handlers as it is now redundant.
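
As a purely illustrative sketch of the shape of the change (the helper file name here is hypothetical; in reality the individual tasks themselves gained the loop), it amounts to:

# Illustrative sketch only: the real task files manipulate the lease
# file directly; the change is simply looping them over `mac_addresses`
# instead of acting on a single `mac_address`. The included file name
# below is hypothetical.
- name: All but the latest lease with a client ID are removed, per MAC
  ansible.builtin.include_tasks: remove-one-mac-client-id.yaml
  loop: '{{ mac_addresses }}'
  loop_control:
    loop_var: mac_address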

To use the new lookup-host entry point, I needed to set the mac_addresses variable. I added this as the first task on the (existing) task list targeting the INSTALL_HOSTS:

- hosts: '{{ INSTALL_HOSTS }}'
  # Cannot guarantee host is up yet.
  gather_facts: false
  tasks:
    - name: List of mac addresses is known
      ansible.builtin.set_fact:
        mac_addresses: "{{ interfaces | map(attribute='mac_address') }}"

Updating the playbook to actually look up based on this was, because I was lazy with the variable names, just a case of updating a comment:

    - name: DHCP IP address is known
      # * uses `mac_addresses` variable from above
      # * uses `all` group `dhcp_server_host` variable
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: lookup-host
      vars:
        wait_for_lease: true

Finally, I added a new play to reverse the ignoring of the client ID. This goes after the one that targets INSTALL_HOSTS to orchestrate the actual install but before the play that includes the “bootstrap” playbook to (e.g.) install python and secure the new host:

# Finally, restore the original DHCP configuration - doing this last
# ensures the IP doesn't change, even during the bootstrap, until
# the entire play is complete.
- hosts: dhcp_servers
  # Same tag as dhcp pre-install play as all the variables are
  # set there, so if setup is skipped this must be too.
  tags: dhcp
  # a) we just gathered them above (for the domain) and b) don't
  # think we need any for this.
  gather_facts: false
  tasks:
    # From the ansible.builtin.set_fact documentation:
    # > These variables will be available to subsequent plays during
    # > an ansible-playbook run via the host they were set on.
    # (So no need to re-process anything to set them up again.)
    - name: DHCP host configuration is correct (no longer ignores client id)
      ansible.builtin.include_role:
        name: dhcp
        tasks_from: add-host
      vars:
        dhcp_host_name: '{{ item }}'
        dhcp_host_configuration: "{{ dhcp_host_configurations[item] }}"
      loop: '{{ dhcp_host_configurations.keys() }}'

Finally, using this revised method of configuring the dhcp_servers as a template, I added plays to add and remove the host-specific iPXE configuration for the auto-install, which I had previously been doing manually. These I added before the DHCP server configuration play and before the bootstrap play, respectively:

- hosts: ipxe_servers
  gather_facts: false  # Don't need any facts
  tasks:
    - name: List of all mac addresses is known
      ansible.builtin.set_fact:
        all_install_host_macs: >-
          {{
            all_install_host_macs
            +
            hostvars[item].interfaces
            | map(attribute='mac_address')
            | default([])
          }}
      vars:
        all_install_host_macs: []  # fact will take precedence once defined
      loop: "{{ query('ansible.builtin.inventory_hostnames', INSTALL_HOSTS) }}"
    - name: Auto install is setup for hosts being installed
      become: true
      ansible.builtin.file:
        path: /srv/tftp/mac-{{ item | replace(':', '-') }}.ipxe
        # Must be unqualified due to tftp-hpa running in a chroot
        src: auto-install.ipxe
        state: link
      loop: '{{ all_install_host_macs }}'

and

- hosts: ipxe_servers
  gather_facts: false  # Don't need any facts
  tasks:
    # From the ansible.builtin.set_fact documentation:
    # > These variables will be available to subsequent plays during
    # > an ansible-playbook run via the host they were set on.
    # (So no need to re-process anything to set them up again.)
    - name: Auto install is removed for hosts being installed
      become: true
      ansible.builtin.file:
        path: /srv/tftp/mac-{{ item | replace(':', '-') }}.ipxe
        state: absent
      loop: '{{ all_install_host_macs }}'

Fixing unlock-crypt playbook

Finally, the unlock-crypt.yaml playbook also uses the DHCP lookup to find the host’s IP address if the hostname is not resolvable. Fortunately, as with the install.yaml playbook, this is just a case of gathering the host’s MAC addresses and updating a comment:

- name: List of mac addresses is known (if required)
  ansible.builtin.set_fact:
    mac_addresses: "{{ interfaces | default([]) | map(attribute='mac_address') }}"
  when: >-
    ansible_host == inventory_hostname
    and
    inventory_hostname is not ansible.utils.resolvable
- name: At least one MAC is known for the client (if required)
  ansible.builtin.assert:
    that: mac_addresses | length > 0
    fail_msg: No `mac_address`s in interfaces for {{ inventory_hostname }}
  when: >-
    ansible_host == inventory_hostname
    and
    inventory_hostname is not ansible.utils.resolvable
- name: Attempt to find connection details if needed
  # * uses `mac_addresses` variable from above
  # * uses `all` group `dhcp_server_host` variable
  ansible.builtin.include_role:
    name: dhcp
    tasks_from: lookup-host
  # Cannot be part of the block or Ansible applies the when
  # to all the included tasks, including those that are
  # delegated (and hence the test evaluated against the
  # delegated host rather than the current host).
  when: >-
    ansible_host == inventory_hostname
    and
    inventory_hostname is not ansible.utils.resolvable

Tidying up defunct tasks

Finally, I removed the now defunct host-ignore-client-id and host-unignore-client-id entry points from the dhcp role’s meta/argument_specs.yaml file and deleted host-ignore-client-id.yaml, host-ignore-client-id-dnsmasq.yaml, host-ignore-client-id-isc-dhcp-server.yaml, host-unignore-client-id.yaml, host-unignore-client-id-dnsmasq.yaml and host-unignore-client-id-isc-dhcp-server.yaml from the role’s tasks folder.

Rinse and repeat for dnsmasq

Previously I had ensured my dhcp role worked for both dnsmasq and isc-dhcp-server. I use Dnsmasq in the ephemeral VM networks that I create and destroy for testing, due to its very simple and easy configuration and the abundance of services it provides (as much as I usually hate things that violate the Unix “do one thing and do it well” philosophy, the convenience is clear), including DHCP, DNS and TFTP out of the box. However, it does not support some of the features of ISC’s DHCP Server (or that combined with Bind9) that I find particularly useful for some of my network isolation (e.g. DNS views).

The work I have just done to configure the server was done on my live network (out of a desire for haste, although it has taken far longer than I hoped), so I only wrote the ISC server version. I will need to create Dnsmasq versions of:

  • server.yaml
  • add-subnet.yaml
  • add-host.yaml
  • remove-defunct-files.yaml

Fortunately, this all looks relatively straightforward - in some ways more so than ISC DHCP Server, as Dnsmasq does not need explicit includes added to/removed from its main configuration for a modular approach. However, I have kept this as a task for another day…
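
For a flavour of why, the dnsmasq equivalent of a subnet definition boils down to a couple of lines - a hypothetical hand-written example using the management network’s range and an assumed router IP, not output from the (as yet unwritten) template:

# Hypothetical dnsmasq equivalent of the management subnet - the router
# IP is assumed, and the lease time matches dhcp_default_lease_time.
dhcp-range=192.168.10.100,192.168.10.150,1h
dhcp-option=option:router,192.168.10.1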