Using Ansible to find IP addresses for DHCP clients
This post follows on from automating Debian install, picking up the post-install configuration from the point of having a (generic) baseline OS installed. I am using Ansible for this, but any other configuration management tool could be used: I have personally used SaltStack, Puppet and cfengine to implement the same approach (a simple, minimal OS install that hands off to a configuration management tool for host-specific customisation) with a variety of Linux distributions. I like this approach for three reasons: all host-specific configuration and data is in one location; it is clear whether a setting comes from the install script (e.g. kickstart/preseed), the configuration management tool, or both; and because all customisations are applied by the configuration management tool, they remain correct (and are corrected by the tool if they deviate) throughout the lifetime of the system.
Finding connection information for unconfigured hosts
In order to connect to a host to configure it correctly, we need to somehow link the host to the information about that host. The easiest way for me is to use the MAC address as an identifier. I will also want the MAC address when it comes to targeting PXE booting (which will probably be the subject of a future blog post about end-to-end automation of (re)installs), so I consider this approach a “technical investment” (cf. technical debt). I configured this as a variable on the host in the inventory:
all:
  hosts:
    proxmox-1:
      # Quoted so YAML always reads the value as a string
      mac_address: "08:00:27:96:56:50"
For this development work I used dnsmasq, but my live network uses ISC’s DHCP server, so I wrote support for both. The DHCP host on which to look up the IP address is an argument, dhcp_server_host, that I set at the all group level to look up the first host in the dhcp_servers group (dhcp_server_host: "{{ groups.dhcp_servers | default([None]) | first }}"). This works for my current environment while keeping the role flexible for other setups.
The proof-of-concept code to select which method to use just tests whether a specific file exists, either the dnsmasq leases file or ISC’s leases utility (if, for some reason, both exist then both sets of tasks will be included and the latter will override the first).
If not configured, ansible_host takes the value of the inventory host name. Testing whether ansible_host is undefined is therefore useless (it is always set to something) so, to tell if it has been explicitly set, the test is “is ansible_host different to inventory_hostname?”, although there is a clear problem if the ansible_host variable is explicitly (if redundantly) set to inventory_hostname.
Detecting DHCP server software
This ended up as a task file called detect-dhcp-server-software.yaml in a dhcp role:
---
# This originated from a playbook to lookup IPs by mac address. For
# that reason, it uses lease related artefacts to identify the
# software and "delegate_to" to run on the dhcp host (presuming that
# anything running directly against the dhcp server would not need to
# use this set of tasks). Both of these decisions may need to be
# revisited in the future.
# I did try removing the delegation and apply it at the "include_tasks"
# level, however delegate_to is not permitted on an include_tasks by
# Ansible so it has to be done this way. Delegation will not occur if
# `dhcp_server_host` is set to `None`.
- name: dnsmasq specific file is `stat`ed
  delegate_to: "{{ dhcp_server_host }}"
  ansible.builtin.stat:
    path: /var/lib/misc/dnsmasq.leases
  register: dnsmasq_leases_file
- name: isc-dhcp-server specific file is `stat`ed
  delegate_to: "{{ dhcp_server_host }}"
  ansible.builtin.stat:
    path: /usr/sbin/dhcp-lease-list
  register: isc_dhcp_leases_command
- name: Know if DHCP server software is dnsmasq
  ansible.builtin.set_fact:
    dhcp_server_software: dnsmasq
  when: dnsmasq_leases_file.stat.exists
- name: Know if DHCP server software is isc-dhcp-server
  ansible.builtin.set_fact:
    dhcp_server_software: isc-dhcp-server
  when: isc_dhcp_leases_command.stat.exists
- name: Detected software is reported
  ansible.builtin.debug:
    msg: "DHCP server is {{ dhcp_server_software }}"
...
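To make the detection order easier to see, here is a rough Python sketch of the same logic (it is not part of the role). The function name and the root parameter are my own inventions for illustration; root only exists so the sketch can be tried against a scratch directory instead of a real DHCP server.

```python
from pathlib import Path
from typing import Optional

# The two artefacts the task file checks for.
DNSMASQ_LEASES = Path("/var/lib/misc/dnsmasq.leases")
ISC_LEASE_LIST = Path("/usr/sbin/dhcp-lease-list")


def detect_dhcp_server_software(root: Path = Path("/")) -> Optional[str]:
    """Return the detected software name, or None if neither artefact exists.

    `root` is a hypothetical parameter (not in the role) that lets the
    checks be pointed at a fake filesystem tree.
    """
    software = None
    if (root / DNSMASQ_LEASES.relative_to("/")).exists():
        software = "dnsmasq"
    # Checked second, so it overrides dnsmasq when both artefacts exist,
    # mirroring the order of the two set_fact tasks.
    if (root / ISC_LEASE_LIST.relative_to("/")).exists():
        software = "isc-dhcp-server"
    return software
```

One difference worth noting: the sketch returns None when nothing is found, whereas the task file would leave dhcp_server_software undefined in that case.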
Since this task file is not intended to be used directly (it is used by other tasks in the role to transparently flex to different server software), I did not add an argument specification for it to meta/argument_specs.yaml in the role. If I did, the only argument would be the DHCP server host, dhcp_server_host.
Looking up the DHCP IP lease
Originally, this just found the host’s lease (from its mac_address variable) in each server software’s leases file. I later added the wait_for_lease option, to optionally wait until a lease exists if there is not one. When I did this, it became easier to retrieve a full set of leases (in a standardised format) from the server software in use, and then use a common set of tasks to work with this abstract information. This task file is called lookup-host.yaml and it recurses by including itself, if needed, until a lease exists (yes, this will loop infinitely (well, until the call stack overflows) if none appears).
---
- name: DHCP leases are fetched from DHCP server
  # Uses `dhcp_server_host` for where to connect
  ansible.builtin.include_tasks: lookup-dhcp-leases.yaml
- name: dhcp_lease for the requested MAC address is set
  ansible.builtin.set_fact:
    dhcp_lease:
      ip: "{{ this_lease.ip | ansible.utils.ipaddr }}"
      hostname: "{{ this_lease.hostname }}"
      expires: "{{ this_lease.expires }}"
  vars:
    # Sort sorts ascending, so `last` is the most recent lease
    this_lease: >-
      {{
        dhcp_leases
        | selectattr('mac', 'eq', mac_address)
        | sort(attribute='expires')
        | last
      }}
  when: dhcp_leases is defined and (dhcp_leases | selectattr('mac', 'eq', mac_address) | length > 0)
- name: dhcp_lease is None if no leases for the MAC found
  ansible.builtin.set_fact:
    dhcp_lease: null
  when: dhcp_leases is not defined or (dhcp_leases | selectattr('mac', 'eq', mac_address) | length == 0)
- name: Recursion delay of {{ wait_for_lease_delay }}s has happened, if required
  delegate_to: localhost  # Stop Ansible trying to connect to the host to do the wait_for
  ansible.builtin.wait_for:
    timeout: "{{ wait_for_lease_delay }}"
  when: wait_for_lease and dhcp_lease is none
- name: Recursion has happened, if required
  ansible.builtin.include_tasks: lookup-host.yaml
  when: wait_for_lease and dhcp_lease is none
...
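The selection and retry logic can also be sketched in plain Python, which I find easier to reason about than the recursive include. This is only an illustration under my own assumptions: the function names, the fetch_leases callable and the max_attempts cap are all hypothetical and do not exist in the role (which, as noted, retries without a limit). It needs Python 3.9 or later for the list[dict] annotations.

```python
import time
from typing import Callable, Optional


def latest_lease(leases: list[dict], mac_address: str) -> Optional[dict]:
    """Return the lease with the greatest `expires` for this MAC, or None."""
    matches = [lease for lease in leases if lease["mac"] == mac_address]
    if not matches:
        return None
    # Ascending sort, so the final element is the most recent lease.
    return sorted(matches, key=lambda lease: lease["expires"])[-1]


def lookup_host(
    fetch_leases: Callable[[], list[dict]],
    mac_address: str,
    delay: float = 5,
    max_attempts: int = 60,  # hypothetical cap; the role has no equivalent
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll for a lease, pausing `delay` seconds between attempts."""
    for attempt in range(max_attempts):
        lease = latest_lease(fetch_leases(), mac_address)
        if lease is not None:
            return lease
        if attempt < max_attempts - 1:
            sleep(delay)
    return None
```

A loop with an attempt cap is one way to avoid the unbounded recursion; whether a cap is desirable depends on how long an install can reasonably take before the host requests a lease.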
The lookup-dhcp-leases.yaml tasks file just includes the software-specific lookup-dhcp-leases-<software>.yaml file for the DHCP server software in use:
---
- name: DHCP server software is detected (if not already set)
  ansible.builtin.include_tasks: detect-dhcp-server-software.yaml
  when: dhcp_server_software is undefined
- name: Appropriate lookup tasks are included
  ansible.builtin.include_tasks: lookup-dhcp-leases-{{ dhcp_server_software }}.yaml
...
It is worth noting that dhcp_server_host can be explicitly set to null/None (in YAML/Python parlance), in which case any delegate_to: "{{ dhcp_server_host }}" tasks will run against the current inventory host (I found this out empirically).
I found the format of dnsmasq’s leases file frustratingly hard to locate in any documentation; in the end I found a mailing list post which describes it:
Fields in order:
- Time of lease expiry as epoch time. Can be changed at compile time to remaining lease time (in seconds) or total lease renewal time.
- MAC address.
- IP address.
- Computer name, if known (always unqualified).
- Client-ID, if known.
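As a quick illustration of that field order, here is a minimal Python sketch that splits lease lines on whitespace. The function name is hypothetical, the sample line in the usage note is made up, and it assumes the default compile-time setting (expiry stored as epoch time).

```python
from typing import Optional


def parse_dnsmasq_leases(text: str) -> list[dict]:
    """Parse the contents of a dnsmasq leases file into a list of dicts."""
    leases = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank lines and anything without the four core fields
        expires, mac, ip, hostname = fields[:4]
        # The client-id is the optional fifth field.
        client_id: Optional[str] = fields[4] if len(fields) > 4 else None
        leases.append(
            {
                "expires": int(expires),  # epoch seconds
                "mac": mac,
                "ip": ip,
                "hostname": hostname,
                "client_id": client_id,
            }
        )
    return leases
```

For example, the invented line "1692119193 08:00:27:96:56:50 192.168.0.50 proxmox-1 01:08:00:27:96:56:50" would become a dictionary whose expires value is the integer 1692119193.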
For dnsmasq, the task file reads the leases file directly:
---
- name: DHCP leases file is read from the dhcp server
  delegate_to: "{{ dhcp_server_host }}"
  # On my system this file is world-readable, so no special
  # permissions required.
  # Treat this space-delimited file like a CSV for ease of parsing.
  community.general.read_csv:
    delimiter: ' '
    dialect: unix
    fieldnames:
      - expires
      - mac
      - ip
      - hostname
      - client_id
    path: /var/lib/misc/dnsmasq.leases
  register: dhcp_leases_csv
- name: CSV data is in correct data types and format
  ansible.builtin.set_fact:
    dhcp_leases: >-
      {{ dhcp_leases + [
        {
          'mac': lease.mac,
          'ip': lease.ip,
          'hostname': lease.hostname,
          'expires': lease.expires,
        }
      ] }}
  loop: "{{ dhcp_leases_csv.list }}"
  vars:
    dhcp_leases: []  # Start with empty list, fact will take precedence
    lease: "{{ item | combine({ 'expires': item.expires | int }) }}"
...
For ISC’s DHCP Server, I used the dhcp-lease-list tool. This has some limitations: chiefly, it presumes a single MAC address does not have multiple leases using different client ids (or, at least, it does not expose the client id, making this impossible to detect, and by default it only shows the latest lease for each MAC address). This is one of the reasons I chose to ignore this situation in my tasks. I could, alternatively, parse the software’s leases file, but its format is complex (using delimited blocks, rather than lines, per client) and would be hard to parse using existing Ansible modules.
---
- name: DHCP leases file is read from the dhcp server
  delegate_to: "{{ dhcp_server_host }}"
  # Despite the command being in /sbin, on my system the leases file is
  # world-readable, so no special permissions were required.
  ansible.builtin.command: /usr/sbin/dhcp-lease-list --parsable --all
  changed_when: false  # Read-only command
  register: dhcp_leases_output
- name: Leases results are parsed
  ansible.builtin.set_fact:
    # Example lease:
    # MAC aa:bb:cc:dd:ee:ff IP 192.168.0.1 HOSTNAME -NA- BEGIN 2023-08-15 16:06:33 END 2023-08-15 17:06:33 MANUFACTURER
    dhcp_leases_parsed: >-
      {{ dhcp_leases_parsed + [
        dict(
          ['mac', 'ip', 'hostname', 'start_time', 'end_time', 'manufacturer']
          |
          zip(
            item
            |
            regex_search(
              '^MAC (?P<mac>[0-9a-f:]+) IP (?P<ip>[0-9\\.]+) HOSTNAME (?P<hostname>[^ ]+) BEGIN (?P<start_time>[0-9-: ]+) END (?P<end_time>[0-9-: ]+) MANUFACTURER (?P<manufacturer>.*)$',
              '\g<mac>',
              '\g<ip>',
              '\g<hostname>',
              '\g<start_time>',
              '\g<end_time>',
              '\g<manufacturer>',
            )
          )
        )
      ] }}
  vars:
    dhcp_leases_parsed: []  # Start with empty list, fact will take precedence
  loop: "{{ dhcp_leases_output.stdout_lines }}"
- name: Parsed output is in correct format and data types
  ansible.builtin.set_fact:
    dhcp_leases: >-
      {{ dhcp_leases + [
        {
          'mac': lease.mac,
          'ip': lease.ip,
          'hostname': lease.hostname,
          'expires': lease.expires
        }
      ] }}
  loop: "{{ dhcp_leases_parsed }}"
  vars:
    dhcp_leases: []  # Start with empty list, fact will take precedence
    # Times in leases file are UTC but datetime will assume localtime
    # unless timezone is explicit.
    lease: >-
      {{
        item
        |
        combine({
          'expires':
            (
              (item.end_time ~ '+0000')
              |
              to_datetime(format='%Y-%m-%d %H:%M:%S%z')
            ).timestamp()
            |
            int
        })
      }}
...
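The two parsing steps (regex extraction, then converting the UTC end time to epoch seconds) are easier to follow side by side in Python. This is a sketch rather than a translation of the tasks: the names are hypothetical, and unlike the regex in the task it treats the space after MANUFACTURER as optional, which is my assumption so that the example line from the comment matches as printed.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# One line of `dhcp-lease-list --parsable` output, as shown in the task comment.
LEASE_LINE = re.compile(
    r"^MAC (?P<mac>[0-9a-f:]+) IP (?P<ip>[0-9.]+) HOSTNAME (?P<hostname>[^ ]+) "
    r"BEGIN (?P<start_time>[0-9: -]+) END (?P<end_time>[0-9: -]+) "
    r"MANUFACTURER ?(?P<manufacturer>.*)$"  # optional space: my assumption
)


def parse_lease_line(line: str) -> Optional[dict]:
    """Return a standardised lease dict, or None if the line does not match."""
    match = LEASE_LINE.match(line)
    if match is None:
        return None
    # The times are UTC; attach the timezone explicitly so the conversion
    # does not depend on the local timezone of the machine running this.
    end = datetime.strptime(match["end_time"], "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return {
        "mac": match["mac"],
        "ip": match["ip"],
        "hostname": match["hostname"],
        "expires": int(end.timestamp()),
    }
```

Returning None for non-matching lines is a choice made for the sketch; the Ansible task has no equivalent guard, so an unexpected line in the command output would need handling separately there.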
I added both lookup-dhcp-leases and lookup-host argument specifications to meta/argument_specs.yaml in the role, which Ansible uses to validate that the role’s task file is being accessed with the correct variables defined. This file supports using different task files as the entry point:
---
argument_specs:
  lookup-host:
    short_description: >-
      Lookup the IP address of the current inventory host.
    description: >
      Lookup the current inventory host's IP address (from its
      `mac_address`) on the DHCP server specified by
      `dhcp_server_host` and set `dhcp_lease` accordingly. It
      assumes only the most recent IP lease for the MAC address is
      relevant - i.e. does not consider the case of multiple client
      ids for the same MAC.
      `dhcp_lease` will be a dictionary with the keys `ip`,
      `hostname`, `expires` which holds the IPv4 address,
      hostname and expiry (in seconds since Unix epoch) of the
      most recent lease according to the DHCP server.
    options:
      dhcp_server_host:
        description: >-
          The inventory host to delegate the lookup tasks to (i.e.
          the DHCP server).
        type: str
      mac_address:
        description: >-
          The mac address of the host to lookup the IP for (in the
          format aa:bb:cc:dd:ee:ff)
        type: str
        required: true
      wait_for_lease:
        description: >-
          If a DHCP IP lease is not found for the mac address, retry
          until one becomes available (when `true`).
        type: bool
        default: false
      wait_for_lease_delay:
        description: >-
          Time (in seconds) between retries when `wait_for_lease`
          is `true`.
        type: int
        default: 5
  lookup-dhcp-leases:
    short_description: >-
      Lookup all the leases on the DHCP server.
    description: >
      Lookup the DHCP leases from the DHCP server. Will try to detect the
      dhcp software if `dhcp_server_software` is not set.
      `dhcp_leases` will be a list of dictionaries with the keys `mac`,
      `ip`, `hostname`, `expires` which holds the MAC address,
      IPv4 address, hostname and expiry (in seconds since Unix epoch)
      of each lease. (Note, due to limitations in isc's dhcp-lease-list
      output, does not include client id).
    options:
      dhcp_server_host:
        description: >-
          The inventory host to delegate the lookup tasks to (i.e.
          the DHCP server).
        type: str
...
I added the default values I specified to the role’s defaults/main.yaml:
---
wait_for_lease: false
wait_for_lease_delay: 5
...
Using the role
The recipe to use the role for looking up the IP when the inventory host cannot be looked up and no specific ansible_host value has been given is:
- name: Host's dhcp details are known
  # * uses host's `mac_address` variable
  # * uses default value for `dhcp_server_host` variable (which
  #   looks up from `dhcp_servers` group)
  ansible.builtin.include_role:
    name: dhcp
    tasks_from: lookup-host
  when: >-
    ansible_host == inventory_hostname
    and
    inventory_hostname is not ansible.utils.resolvable
However, the Debian installer will always use DHCP in my environment, so for the purposes of automating the installation process a DHCP lookup will take place every time (even if inventory_hostname is resolvable).
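For reference, the condition in the recipe above boils down to a two-part test, sketched here in Python. The function names are hypothetical, and is_resolvable is only a rough stand-in for the ansible.utils.resolvable test (I have not checked that the two agree in every case); the resolvable parameter exists so the decision can be exercised without touching DNS.

```python
import socket
from typing import Callable


def is_resolvable(name: str) -> bool:
    """Return True if the system resolver can find an address for the name."""
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        return False
    return True


def needs_dhcp_lookup(
    ansible_host: str,
    inventory_hostname: str,
    resolvable: Callable[[str], bool] = is_resolvable,
) -> bool:
    """Decide whether to fall back to looking the host up on the DHCP server."""
    # ansible_host defaults to inventory_hostname, so equality is read as
    # "not explicitly set" (with the caveat about redundant settings).
    not_explicitly_set = ansible_host == inventory_hostname
    return not_explicitly_set and not resolvable(inventory_hostname)
```

Passing a stub such as lambda name: False for resolvable gives a quick way to check the decision for an unconfigured, unresolvable host.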
I will pick up from here in another post. I started working on this one on 22nd August and decided to split it up in order to get it published and break up what became a very long piece…