Following on from my previous post on PXE booting Debian Installer with network-console (SSH access), I wanted to take this a stage further and fully automate deployments, but in a very targeted way. My idea is that some hosts (e.g. Proxmox nodes) should reinstall and, ultimately, re-add themselves to the cluster automatically, whereas others should continue to boot into the interactive Debian installer by default. Fortunately this is very easy to achieve with iPXE.

iPXE provides a decent amount of information about the local system as “settings”, which can be used as variables inside its “script” (configuration) file. Combined with iPXE’s || operator, which runs an alternative command if the preceding one fails, we can implement ‘try this file, if it does not exist try that file and finally, if nothing else exists, do this’ in an iPXE script. The list of settings and values for the current host can be seen by running config from the iPXE shell (useful for developing scripts).

I added logic that allows me to override the default (display a menu) for any host by creating a file named after its serial number, MAC address (using hyphens instead of colons, as tftpd-hpa did not seem to like serving files with colons), hostname (this only uses the DHCP-provided hostname and does not fall back to rDNS, unlike the Debian Installer) or product (useful for detecting types of systems, e.g. VirtualBox for VirtualBox VMs). The chain --replace is equivalent to the exec call in other languages - entirely replacing the current script with the one being chained. Without --replace the current script would resume once the chained one finished.

#!ipxe
chain --replace serial-${serial}.ipxe ||
chain --replace mac-${mac:hexhyp}.ipxe ||
chain --replace host-${hostname}.ipxe ||
chain --replace product-${product}.ipxe ||
goto menu

:menu
menu Please choose boot option
item --gap Installers:
item bullseye-amd64 Debian Bullseye (11) 64-bit installer
item buster-amd64 Debian Buster (10) 64-bit installer
item centos-7-x86_64 CentOS 7 64-bit installer
item dban-2.3.0 Darik's Boot and Nuke (2.3.0)
item jessie-amd64 Debian Jessie (8) 64-bit installer
item stretch-amd64 Debian Stretch (9) 64-bit installer
item --gap iPXE options:
item shell Start iPXE shell
item exit Exit iPXE and proceed to next boot option
choose --default exit option && goto ${option}
goto menu

:shell
shell
goto menu

:exit
exit

:bullseye-amd64
kernel ../bullseye-amd64/linux initrd=initrd.gz -- quiet
initrd ../bullseye-amd64/initrd.gz
boot

:buster-amd64
kernel ../buster-amd64/linux initrd=initrd.gz -- quiet
initrd ../buster-amd64/initrd.gz
boot

:centos-7-x86_64
kernel ../centos-7-x86_64/vmlinuz initrd=initrd.img -- ip=dhcp inst.repo=http://www.mirrorservice.org/sites/mirror.centos.org/7/os/x86_64/ rhgb quiet
initrd ../centos-7-x86_64/initrd.img
boot

:dban-2.3.0
kernel ../dban-2.3.0/dban-2.3.0.bzi nuke="dwipe" silent vga=785
boot

:jessie-amd64
kernel ../jessie-amd64/linux initrd=initrd.gz -- quiet
initrd ../jessie-amd64/initrd.gz
boot

:stretch-amd64
kernel ../stretch-amd64/linux initrd=initrd.gz -- quiet
initrd ../stretch-amd64/initrd.gz
boot
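
When deciding which override file to create, it can help to reproduce the lookup order outside of iPXE. The following is a minimal shell sketch, not part of my setup: ipxe_resolve is a hypothetical helper that takes a directory plus the four values as iPXE would expand them (the MAC already in hyphenated form) and prints the first override file that exists, or "menu" if none does.

```shell
#!/bin/sh
# Hypothetical helper: emulate the fallback chain from the main script
# against a local directory (e.g. the TFTP boot directory).
# Arguments: directory, serial, MAC (hyphenated, as ${mac:hexhyp} expands),
# DHCP hostname, product.
ipxe_resolve() {
    dir=$1
    serial=$2
    mac=$3
    host=$4
    product=$5
    # Same order as the chain commands: serial, mac, host, product.
    for candidate in \
        "serial-${serial}.ipxe" \
        "mac-${mac}.ipxe" \
        "host-${host}.ipxe" \
        "product-${product}.ipxe"
    do
        if [ -f "${dir}/${candidate}" ]; then
            printf '%s\n' "${candidate}"
            return 0
        fi
    done
    # Nothing matched, so the main script would fall through to its menu.
    printf 'menu\n'
}
```

For example, ipxe_resolve /srv/tftp/boot SERIAL123 08-00-27-3a-1f-60 testvm VirtualBox (with made-up values) reports which file a host with those settings would end up chaining. It only checks that the file exists; it does not validate the script inside it.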

Override for a class of hosts

To change the boot process for one or more hosts, I just create the corresponding file - for example to make all VirtualBox VMs automatically boot into the current Debian installer after 20 seconds (retaining the options to open the shell or exit), I would create product-VirtualBox.ipxe in the same directory as the above main configuration with this content:

#!ipxe
:menu
menu Please choose boot option
item --gap Installers:
item bullseye-amd64 Debian Bullseye (11) 64-bit installer
item --gap iPXE options:
item shell Start iPXE shell
item exit Exit iPXE and proceed to next boot option
choose --default bullseye-amd64 --timeout 20000 option && goto ${option}
goto menu

:shell
shell
goto menu

:exit
exit

:bullseye-amd64
kernel ../bullseye-amd64/linux initrd=initrd.gz -- quiet
initrd ../bullseye-amd64/initrd.gz
boot

Override for a specific host

To make a specific host automatically start the Debian installer with an SSH network console after 20 seconds, ready to install, I would create serial-SYSTEM-SERIAL-NO.ipxe, host-SYSTEM-DHCP-HOSTNAME.ipxe or mac-aa-bb-cc-dd-ee-ff.ipxe, where ‘SYSTEM-SERIAL-NO’ is the system’s built-in serial number, ‘SYSTEM-DHCP-HOSTNAME’ is the hostname given to it by the DHCP server and ‘aa-bb-cc-dd-ee-ff’ is the PXE-booting network interface’s MAC address (lower case hex with hyphens, in my configuration), with this content:

#!ipxe
:menu
menu Please choose boot option
item --gap Installers:
item bullseye-amd64 Debian Bullseye (11) 64-bit installer (network-console)
item --gap iPXE options:
item shell Start iPXE shell
item exit Exit iPXE and proceed to next boot option
choose --default bullseye-amd64 --timeout 20000 option && goto ${option}
goto menu

:shell
shell
goto menu

:exit
exit

:bullseye-amd64
kernel ../bullseye-amd64/linux initrd=initrd.gz auto-install/enable=true netcfg/get_hostname=debian netcfg/get_domain= preseed/url=http://debian-preseed/d-i/buster/preseed-network-console.cfg -- quiet
initrd ../bullseye-amd64/initrd.gz
boot
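
MAC addresses are usually copied from somewhere that prints them with colons and often in upper case, so the file name needs converting by hand. As a small sketch (mac_override_name is a hypothetical helper name, not something from my setup), the conversion to the lower case, hyphenated form used for the file name can be done in shell:

```shell
#!/bin/sh
# Hypothetical helper: turn a MAC address such as 08:00:27:3A:1F:60 into
# the override file name used by the main script (lower case hex digits,
# hyphens instead of colons).
mac_override_name() {
    printf 'mac-%s.ipxe\n' \
        "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr ':' '-')"
}
```

Running mac_override_name 08:00:27:3A:1F:60 prints mac-08-00-27-3a-1f-60.ipxe, which is the name to create alongside the main configuration.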

Future enhancements

At the moment, my setup fetches the installer kernels and ramdisk images from the distribution’s mirrors (internet mirrors on my live network, local mirrors in my lab) and puts copies in the TFTP server’s directory. This is for historic reasons, from when I was using older PXE bootloaders; iPXE can fetch everything (including its configuration file) via HTTP, so there is no reason to continue doing this. At some point I should change to using the copy from the mirror directly, rather than maintaining extra local copies of these files.
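
For illustration, the copy step for a Debian release might look something like the sketch below. This is not my actual script: the function names are hypothetical, the default mirror (deb.debian.org) is an assumption, and the destination layout simply mirrors the ../bullseye-amd64/ style paths used in the menu entries above. The URL that netboot_url builds is also the sort of address an iPXE kernel or initrd line could reference directly over HTTP.

```shell
#!/bin/sh
# Assumed default mirror; override by setting MIRROR before calling.
MIRROR=${MIRROR:-http://deb.debian.org/debian}

# Hypothetical helper: build the netboot URL for a Debian installer file.
# $1 = release codename, $2 = architecture, $3 = file (linux or initrd.gz)
netboot_url() {
    printf '%s/dists/%s/main/installer-%s/current/images/netboot/debian-installer/%s/%s\n' \
        "$MIRROR" "$1" "$2" "$2" "$3"
}

# Hypothetical helper: copy kernel and ramdisk into <tftp root>/<release>-<arch>/.
# $1 = release codename, $2 = architecture, $3 = TFTP root directory
fetch_installer() {
    dest="$3/$1-$2"
    mkdir -p "$dest" || return 1
    for file in linux initrd.gz; do
        curl --fail --silent --show-error --location \
            --output "$dest/$file" "$(netboot_url "$1" "$2" "$file")" || return 1
    done
}
```

A call such as fetch_installer bullseye amd64 /srv/tftp would populate /srv/tftp/bullseye-amd64/. Nothing is downloaded until fetch_installer is called, and older releases may need MIRROR pointing somewhere that still carries them.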

Ansible playbook

I did some of the work on this while sitting in a car park waiting for my wife, on a laptop with a very flaky wireless connection (probably due to the distance from the car to the access point). As I was effectively disconnected from the internet, I did my testing in VirtualBox (hence the pattern-matching for that class of device in the example above). I added a second network interface to an existing Debian VM and created another VM, with no disk, to PXE boot. I even flipped it between UEFI and BIOS mode a few times to verify everything was working smoothly in both configurations.

The Debian box I was using had previously been used for some Ansible testing, so already had it installed. I therefore created an Ansible “playbook” to set up the local machine as a PXE server on one of its interfaces, including product-specific and host-specific (via MAC address) boot menus (distinguishable by their menu titles for testing purposes). When I got home, I was pleasantly surprised to see I had reproduced almost exactly how the PXE setup I created in 2013, in SaltStack, works - right down to the key names used in the underlying data to configure the interfaces.

In order to use the ansible.utils.ipaddr helper (which is very handy, not least to programmatically extract information from CIDR-style IP addresses/masks) I had to install the netaddr Python module:

# Install manually:
pip install netaddr
# or, preferably, add to requirements.txt (self-documenting the dependency) and:
pip install -r requirements.txt

Not needing any infrastructure, and being able to write, deploy and run playbooks on any host without special configuration or privileges on either end (unless you want to do something specific that requires elevated access), is growing on me. As long as Ansible is installed, or installable, and Python/PowerShell (Linux/Windows respectively) is available on the remote system, it “just works” out of the box even in isolated environments, be they air-gapped lab “playgrounds” or random VMs spun up on a whim to test something. This is really easy compared to having to build a central controller or (re)configure anything.

The playbook looks like this (enp0s8 is the second interface attached to an internal network and 08:00:27:3a:1f:60 was the MAC address of the test, diskless, VM):

---
- hosts: localhost
  gather_facts: no
  vars:
    dhcp:
      domain_name: vbox.internal
      interfaces:
        enp0s8:
          ip4_address: 192.168.0.1/24
          dynamic_ranges:
            - start: 192.168.0.100
              end: 192.168.0.150
          enable_pxe: yes
  handlers:
    - name: restart isc-dhcp-server
      become: yes
      ansible.builtin.service:
        name: isc-dhcp-server
        state: restarted
  tasks:
    - name: Install required packages
      become: yes
      ansible.builtin.package:
        name:
          - ipxe
          - tftpd-hpa
          - isc-dhcp-server
    - name: Copy UEFI and BIOS bootloaders to tftp served directory
      become: yes
      ansible.builtin.copy:
        src: '{{ item }}'
        dest: /srv/tftp/boot/
        remote_src: yes
        mode: 00444
        directory_mode: 00755
        owner: root
        group: nogroup
      loop:
        - /usr/lib/ipxe/ipxe.efi
        - /usr/lib/ipxe/undionly.kpxe
    - name: Configure DHCP network interface with static address
      become: yes
      ansible.builtin.copy:
        dest: /etc/network/interfaces.d/{{ item[0] }}
        content: |
          allow-hotplug {{ item[0] }}
          iface {{ item[0] }} inet static
            address {{ item[1].ip4_address }}
        owner: root
        group: root
        mode: 00444
      register: configured_interfaces
      loop: '{{ dhcp.interfaces.items() }}'
    - name: Bring up configured interfaces
      become: yes
      ansible.builtin.command: ifup {{ item.item[0] }}
      when: item.changed
      loop: '{{ configured_interfaces.results }}'
    - name: Configure isc-dhcp-server to only serve on correct interface
      become: yes
      ansible.builtin.lineinfile:
        path: /etc/default/isc-dhcp-server
        search_string: INTERFACESv4=
        line: INTERFACESv4="{{ ' '.join(dhcp.interfaces.keys()) }}"
      notify: restart isc-dhcp-server
    - name: Configure isc-dhcp-server
      become: yes
      ansible.builtin.copy:
        dest: /etc/dhcp/dhcpd.conf
        content: |
          option domain-name "{{ dhcp.domain_name }}";
          default-lease-time 600;
          max-lease-time 7200;
          ddns-update-style none;
          authoritative;
          get-lease-hostnames true;
          option arch code 93 = unsigned integer 16;
          {% for subnet in dhcp.interfaces.values() %}
          subnet {{ subnet.ip4_address | ansible.utils.ipaddr('network') }} netmask {{ subnet.ip4_address | ansible.utils.ipaddr('netmask') }} {
          {%   for range in subnet.dynamic_ranges %}
            range {{ range.start }} {{ range.end }};
          {%   endfor %}

          {%   if subnet.enable_pxe %}
            if exists user-class and option user-class = "iPXE" {
              filename "boot/main.ipxe";
            } else {
              # Determine client architecture:
              # 0 == BIOS
              # 6 == 32-bit x86 EFI
              # 7 == 64-bit x86 EFI
              # 10 == 32-bit ARM EFI
              # 11 == 64-bit ARM EFI
              if exists arch {
                if option arch = 00:00 {
                  filename "boot/undionly.kpxe";
                } elsif option arch = 00:07 {
                  filename "boot/ipxe.efi";
                }
              }
            }
            next-server {{ subnet.ip4_address | ansible.utils.ipaddr('address') }};
          {%   endif %}
          }
          {% endfor %}

        owner: root
        group: root
        mode: 00444
      notify: restart isc-dhcp-server
    - name: iPXE main file
      become: yes
      ansible.builtin.copy:
        dest: /srv/tftp/boot/main.ipxe
        mode: 00444
        owner: root
        group: nogroup
        content: |
          #!ipxe
          chain --replace serial-${serial}.ipxe ||
          chain --replace mac-${mac:hexhyp}.ipxe ||
          chain --replace host-${hostname}.ipxe ||
          chain --replace product-${product}.ipxe ||
          goto menu


          :menu
          menu Please choose boot option
          item --gap Installers:
          item --gap iPXE options:
          item shell Start iPXE shell
          item exit Exit iPXE and proceed to next boot option
          choose --default exit option && goto ${option}
          goto menu

          :shell
          shell
          goto menu

          :exit
          exit
    - name: iPXE VirtualBox product file
      become: yes
      ansible.builtin.copy:
        dest: /srv/tftp/boot/product-VirtualBox.ipxe
        mode: 00444
        owner: root
        group: nogroup
        content: |
          #!ipxe
          :menu
          menu (VirtualBox menu) Please choose boot option
          item --gap Installers:
          item --gap iPXE options:
          item shell Start iPXE shell
          item exit Exit iPXE and proceed to next boot option
          choose --default exit option && goto ${option}
          goto menu

          :shell
          shell
          goto menu

          :exit
          exit
    - name: iPXE mac-specific file
      become: yes
      ansible.builtin.copy:
        dest: /srv/tftp/boot/mac-08-00-27-3a-1f-60.ipxe
        mode: 00444
        owner: root
        group: nogroup
        content: |
          #!ipxe
          :menu
          menu (MAC 08:00:27:3A:1F:60 menu) Please choose boot option
          item --gap Installers:
          item --gap iPXE options:
          item shell Start iPXE shell
          item exit Exit iPXE and proceed to next boot option
          choose --default exit option && goto ${option}
          goto menu

          :shell
          shell
          goto menu

          :exit
          exit
...