For nearly 2 years I have been using Let’s Encrypt (like half the tech world) for SSL certificates on my public-facing projects and services. I have decided to try to extend their use to my internal sites too, and do away with running my own certificate authority except for a few niche cases (OpenVPN, for example).

In my current use case, all of the services that require an SSL certificate reside on my cloud-based virtual server, which makes handling them very easy - I have a script which runs the update and triggers a restart or reload (as appropriate) of any daemons that might need to know if the certificate is updated.

The new use case is a bit more complicated: some of the systems that need a certificate are embedded, so the certificate will need to be requested on a different host (a Linux server) and then pushed onto the client with either an API call or a tool.

As my DNS is hosted by the excellent Mythic Beasts, who also provide my VPN, I use their DNS API hook with Dehydrated to automate management of my Let’s Encrypt certificates.

Dehydrated only allows the use of one hook script, so in order to automate the updating of certificates and refreshing of affected services I have a daily cron job that checks whether dehydrated’s certificate is newer than the destination one and, if it is, copies the new one over and restarts or reloads the daemons. This is crude and should be done using dehydrated’s deploy-cert hook point instead. As I now need to be a bit more sophisticated for these embedded systems, and a one-size-fits-all copy-and-reload recipe is no longer going to work, I am revisiting this hack.

Mythic Beasts also maintain a tool called dehydrated-code-rack, which dispatches any number of hooks using run-parts. run-parts comes from the debianutils package (described as “Miscellaneous utilities specific to Debian”), so it is probable the tool will not work on other systems (e.g. FreeBSD, Fedora, Red Hat and derivatives like CentOS). I have submitted a pull-request to make it work on other systems but fortunately all of my Linux systems are running Debian.
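
The dispatching idea can be sketched without run-parts. The function below is a minimal illustration, not code-rack's actual implementation: the name run_hook_dir is made up for this example, and it assumes each hook point has its own directory of executables.

```shell
# Hypothetical helper: run every executable file in a directory, in sorted
# (glob) order, passing along any extra arguments. Stops at the first failure.
run_hook_dir() {
  local dir="$1"
  shift
  local script
  for script in "$dir"/*
  do
    # Skip anything that is not an executable regular file (or a symlink to
    # one). This also covers an empty directory, where the glob stays literal.
    if [ -f "$script" ] && [ -x "$script" ]
    then
      "$script" "$@" || return $?
    fi
  done
  return 0
}
```

A dispatcher built this way would call something like run_hook_dir /etc/dehydrated/hook/deploy-cert "$@" once it has worked out which hook point dehydrated is invoking.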

Migrating from single hook and cron job to code-rack

Step 1 is to unpick my existing solution and migrate it to a process that supports multiple hooks, which I can then replicate on my new set-up. I could just implement it in the new place but having 2 different ways the process works is a recipe for confusion in the future and complicates maintaining it. While I am doing this, I will migrate my existing cron job to use hooks as well.

Download code-rack

code-rack expects to be in the same place as the hooks (the hook script uses dirname $0 to locate the hook directories). To work around this, I have fetched their git repository to a separate directory then symlinked the relevant scripts into the hook directory.

Fetch code-rack with git, make the hook directory, symlink the script into it and create the hook-point folders, then set permissions so that the dehydrated user can run but not modify the hooks:

git clone https://github.com/mythic-beasts/dehydrated-code-rack.git /etc/dehydrated/code-rack
mkdir /etc/dehydrated/hook
ln -s /etc/dehydrated/code-rack/code-rack /etc/dehydrated/hook/
mkdir /etc/dehydrated/hook/{available,deploy-challenge,clean-challenge,deploy-cert,unchanged-cert,invalid-challenge,request-failure,exit-hook}
chown -R root:root /etc/dehydrated/hook
chmod -R 755 /etc/dehydrated/hook

Migrate the existing hook

Once code-rack is in place, this is simple. Like code-rack, I git cloned the dns-api into a subdirectory of /etc/dehydrated called dehydrated-mythic-dns01 (although it occurs to me as I type this that /usr/local/lib/dehydrated might be a more appropriate place for both of them). If yours is elsewhere, adjust accordingly.

First step is to replace the HOOK configuration setting for dehydrated with code-rack instead of the DNS api directly. I had created a file in /etc/dehydrated/conf.d with this setting:

HOOK=/etc/dehydrated/hook/code-rack
HOOK_CHAIN=yes

The second step is to add the dns-api hooks to code-rack’s stack, with some symlinks. Like code-rack, the dns-api script expects to find a file in dirname $0/../common/, so that extra directory needs to be created and the file symlinked. The actual hook files (but not the common file, which is sourced) need to be executable, so double check the permissions on the symlink targets.

mkdir /etc/dehydrated/hook/common
for file in clean-challenge common deploy-challenge
do
  ln -s /etc/dehydrated/dehydrated-mythic-dns01/$file/mythic-dns01 /etc/dehydrated/hook/$file/
done
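
Double checking the symlinks can itself be scripted. This is a minimal sketch under my own assumptions: the function name check_hooks is hypothetical, and it only looks at one hook-point directory at a time. It should not be pointed at the common directory, since the file there is sourced and need not be executable.

```shell
# Hypothetical check: report entries that are dangling symlinks or whose
# target is not executable. Prints one line per problem and returns
# non-zero if any were found.
check_hooks() {
  local dir="$1" entry status=0
  for entry in "$dir"/*
  do
    if [ -L "$entry" ] && [ ! -e "$entry" ]
    then
      echo "dangling symlink: $entry"
      status=1
    elif [ -f "$entry" ] && [ ! -x "$entry" ]
    then
      echo "not executable: $entry"
      status=1
    fi
  done
  return $status
}
```

Running it against each of the deploy-challenge and clean-challenge directories after creating the symlinks should print nothing if everything is in order.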

Migrate the service cron job

My current cron job does two things:

  1. If a certificate dehydrated has generated or renewed (in /var/lib/dehydrated/certs) is newer than the one in use (in /etc/ssl) then copy the new one over
  2. If any certificates have been copied then reload my webservers and mail servers
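
The first of those steps boils down to a "copy if newer" test. The following is a rough sketch of that logic rather than the actual cron job; the function name copy_if_newer is made up for illustration.

```shell
# Hypothetical helper: copy src over dst only when src is newer than dst
# (or dst does not exist yet).
# Returns 0 if a copy happened, 1 if nothing needed doing, 2 if cp failed.
copy_if_newer() {
  local src="$1" dst="$2"
  if [ ! -e "$dst" ] || [ "$src" -nt "$dst" ]
  then
    cp "$src" "$dst" || return 2
    return 0
  fi
  return 1
}
```

A cron job built on this would call it for each certificate file and only reload the daemons if at least one call returned 0.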

In the new hook process, I need to separate these into two hooks.

Copy the certificate

The hook is only called when a certificate is updated, so I can get rid of the check for whether it is newer and just do the copy based on the domain we are told has been updated:

#!/bin/bash

SRCDIR=/var/lib/dehydrated/certs
TGTDIR=/etc/ssl

set -e
echo "This script ($0) will abort on first error." >&2

for file in cert.pem chain.pem fullchain.pem privkey.pem
do
  cp -v "$SRCDIR/$DOMAIN/$file" "$TGTDIR/$DOMAIN/$file"
  if [ "$file" = "privkey.pem" ]
  then
    # Make sure private key remains private
    mode=600
  else
    mode=644
  fi
  chmod -v "$mode" "$TGTDIR/$DOMAIN/$file"
done

This script now requires pre-creating the directory for the domain in /etc/ssl and ensuring dehydrated has permission to update the files in there (simplest is to have the directory owned by the dehydrated user).
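
As an alternative to cp followed by chmod, install can copy a file and set its mode in a single command. This is a minimal sketch assuming GNU coreutils (as found on Debian); the function name deploy_cert_file is hypothetical and it is not the script shown above.

```shell
# Hypothetical helper: copy one certificate file into place with the right
# mode. Only the private key needs to stay unreadable to other users.
deploy_cert_file() {
  local src="$1" dst="$2" mode=644
  if [ "$(basename "$src")" = "privkey.pem" ]
  then
    mode=600
  fi
  # install copies the file and applies the requested mode to the copy
  install -m "$mode" "$src" "$dst"
}
```

Inside the loop this would be called as deploy_cert_file "$SRCDIR/$DOMAIN/$file" "$TGTDIR/$DOMAIN/$file". Like the cp version, it relies on the dehydrated user being able to write to the target directory.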

Reload the services

The last thing to do is reload services if the certificates are updated. On this system, I only need to reload (not restart) some daemons when the certificate is updated and, as I only have one certificate, I could use the bundled services-reload script with an environment variable that reloads them all. However, I want to do something a bit less crude with my new services and be able to reload specific services only if the certificate they are using is updated. The first piece of this puzzle is to be able to reload specific services via a script. Taking inspiration from how Munin’s plug-ins work, I have created this hook (which needs to live outside the hook-point directory) that can be symlinked into the hook point to reload specific services based on the symlink name:

#!/bin/bash

# symlink this file to 'reload_<name of service to reload>'

filename="$(basename "$0")"
service="${filename#*_}"
echo "Reloading $service..."
/usr/bin/sudo /bin/systemctl reload "$service"

Note that the user dehydrated runs as (‘dehydrated’ on my system) needs nopasswd sudo permission to run ‘systemctl reload’, which can be granted with this sudoers.d file:

dehydrated    hostname=(root) NOPASSWD: /bin/systemctl reload *

For example, to reload nginx I can create the following symlink (assuming my script is in /etc/dehydrated/hook/available/reload_):

ln -s /etc/dehydrated/hook/available/reload_ /etc/dehydrated/hook/deploy-cert/reload_nginx
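
The name handling relies on the shell's "remove shortest matching prefix" expansion. As a small illustration (the function name service_from_hook_name is made up for this example), everything up to and including the first underscore is stripped from the file name:

```shell
# Hypothetical helper mirroring the reload_ hook's name handling
service_from_hook_name() {
  local filename
  filename="$(basename "$1")"
  # '#*_' removes the shortest prefix that ends in an underscore
  echo "${filename#*_}"
}

service_from_hook_name deploy-cert/reload_nginx    # prints: nginx
service_from_hook_name reload_php7.4-fpm           # prints: php7.4-fpm
```

Because only the first underscore is treated as the separator, service names that themselves contain underscores still come through intact.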

Fixing code-rack

I then tested this with a manual run of dehydrated and found that code-rack does not work with the current version in Debian Stable:

$ sudo -u dehydrated dehydrated -c
# INFO: Using main config file /etc/dehydrated/config
# INFO: Using additional config file /etc/dehydrated/conf.d/hook-code-rack.sh
# INFO: Using additional config file /etc/dehydrated/conf.d/mail.sh
# INFO: Using additional config file /etc/dehydrated/conf.d/mythicbeasts-dns-api.sh
# INFO: Using additional config file /etc/dehydrated/conf.d/user.sh
/bin/mkdir: cannot create directory ‘/etc/dehydrated/hook/this-hookscript-is-broken--dehydrated-is-working-fine--please-ignore-unknown-hooks-in-your-script’: Permission denied
run-parts: failed to open directory /etc/dehydrated/hook/this-hookscript-is-broken--dehydrated-is-working-fine--please-ignore-unknown-hooks-in-your-script: No such file or directory

To fix this, I submitted another pull request that adds a catch to exit on unknown hooks and then cloned from my fork (https://github.com/loz-hurst/dehydrated-code-rack.git), having merged it into my master branch.
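
Judging by the error output above, dehydrated deliberately calls the hook with a made-up hook name to check that unknown hooks are ignored. The kind of guard needed can be sketched as follows. This is an illustration only, not the contents of the pull request; the function name is_known_hook is hypothetical and the list is simply the hook names mentioned in this post.

```shell
# Hypothetical guard: succeed only for hook names we know how to dispatch
is_known_hook() {
  case "$1" in
    deploy_challenge|clean_challenge|invalid_challenge) return 0 ;;
    deploy_cert|unchanged_cert|sync_cert|deploy_ocsp) return 0 ;;
    request_failure|generate_csr|startup_hook|exit_hook) return 0 ;;
    *) return 1 ;;
  esac
}
```

A dispatcher would run is_known_hook "$1" || exit 0 before doing anything else, so that an unrecognised hook name is skipped quietly instead of being turned into a directory look-up.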

Once I had made this change, the new set-up worked fine.

While I was doing this I also noticed that 4 hooks were missing (startup_hook, generate_csr, deploy_ocsp and sync_cert) and the HEADERS variable is not provided for the request_failure hook. I will shortly need the generate_csr hook to fetch the CSR from my embedded systems, so I will have to fix these too.

Dispatching specific hooks based on domain

Step 2 is to facilitate running different hooks depending on the certificate being updated, so that the CSR can be fetched and the signed certificate uploaded to the embedded system only for that one certificate - other certificates will be deployed locally.

The script

To do this I created a new script called ifdomain_, again taking inspiration from how Munin (and hence my service script above) work:

#!/bin/bash

# symlink this file to 'ifdomain_<domain with '-' in place of '.'>_<hook to run>'

filename="$(basename "$0")"

next_hook_dir="$(dirname "$(realpath "$0")")"

# Strip the 'ifdomain_' prefix, then split the rest at the first underscore
name_without_prefix="${filename#*_}"
dash_domain="${name_without_prefix%%_*}"
next_hook="${name_without_prefix#*_}"

if [[ "${DOMAIN//./-}" = "$dash_domain" ]]
then
	echo "Running $next_hook for domain $DOMAIN"
	"$next_hook_dir/$next_hook"
else
	echo "Skipping $next_hook, domain ${DOMAIN//./-} is not $dash_domain"
fi

For example, to reload nginx if the ‘example.com’ domain’s certificate is updated (assuming my script is in /etc/dehydrated/hook/available/ifdomain_ and there is a /etc/dehydrated/hook/available/reload_nginx script, which would be a symlink to my earlier reload_ script):

ln -s /etc/dehydrated/hook/available/ifdomain_ /etc/dehydrated/hook/deploy-cert/ifdomain_example-com_reload_nginx

As a slight aside: I originally wrote this using dashes as separators and underscores to replace dots in the domain name. However, dashes are common in domain and host names, which broke the script when I encountered them, so I swapped the two.
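
To see how a symlink name is meant to be taken apart with underscores as the separators, here is a small sketch of the intended parsing. The function name parse_ifdomain_name is made up for illustration and the second link name is an invented example.

```shell
# Hypothetical helper: print "<dashed domain> <next hook>" for a link name
parse_ifdomain_name() {
  local filename name_without_prefix
  filename="$(basename "$1")"
  name_without_prefix="${filename#*_}"   # drop the 'ifdomain_' prefix
  # '%%_*' keeps the text before the first underscore (the dashed domain),
  # '#*_' keeps the text after it (the hook to run)
  echo "${name_without_prefix%%_*} ${name_without_prefix#*_}"
}

parse_ifdomain_name ifdomain_example-com_reload_nginx
# prints: example-com reload_nginx
parse_ifdomain_name ifdomain_my-host-example-com_docker-restart_web
# prints: my-host-example-com docker-restart_web
```

One limitation of this naming scheme is that replacing dots with dashes is not reversible: a-b.example.com and a.b.example.com both become a-b-example-com, so a link for one would also match the other.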

Restarting docker container

One of my internal tools is running in a docker container. Updating the SSL certificate requires restarting the container in order for it to notice the change. Fortunately this is easy to facilitate using a very similar recipe to the one I wrote for reloading services:

#!/bin/bash

# symlink this file to 'docker-restart_<name of container to restart>'

filename="$(basename "$0")"
container="${filename#*_}"
echo "Restarting docker container $container..."
/usr/bin/sudo /usr/bin/docker restart "$container"

Again, the user dehydrated runs as (‘dehydrated’ on my system) needs nopasswd sudo permission to run ‘docker restart’, which can be granted with this sudoers.d file:

dehydrated    hostname=(root) NOPASSWD: /usr/bin/docker restart *

HP iLO certificates

Two of the embedded systems I want to deploy SSL certificates to are HP servers with the Integrated Lights-Out (iLO) out-of-band management interface.

The first thing I did was to create an iLO user for Dehydrated to use to update the certificate. I granted it only permissions to configure the iLO:

[Screenshot: iLO user privileges, with only the permission to configure the iLO granted]

The iLOs can be remotely managed using the python-hpilo project. Conveniently, this is packaged and in Debian’s repository, so I just installed it.

I created a configuration file per server in /etc/dehydrated/hp-ilo to avoid putting credentials in the script or on the command line. As the file names match the server names, it is easy to have the script automatically use the correct one. It is also convenient to put some settings for the certificate in there:

[ilo]
login = username
password = password

[certificate]
organizational_unit = my department
organization = my organisation
locality = My city
state = My county
country = GB

The certificate settings are needed, otherwise the CSR will be generated with HP’s built-in values of “O = Hewlett Packard Enterprise, OU = ISS, L = Houston, ST = Texas, C = US”. These values are referenced in the command I use to get the CSR from the iLO: they are not picked up automatically just from being present in the configuration file, but keeping them there (rather than hard-coding them into the hook script) means they can be overridden per iLO if desired.

After ensuring that the file has appropriate permissions (owned by the ‘dehydrated’ user and mode 0400), I tested with hpilo_cli:

hpilo_cli -c server.domain.tld.ini server.domain.tld get_uid_status

This should output the current state of the server’s ID light (‘OFF’ in my case).

Next we need the hook script, which needs to be called at the ‘generate_csr’ stage (hint: use my ‘ifdomain_’ script from earlier in this blog post to only trigger it for the iLO domains). I called it ‘hp-ilo-csr’.

#!/bin/bash

ILO_CONF_DIR=/etc/dehydrated/hp-ilo

ILO_CONF_FILE="$ILO_CONF_DIR/$DOMAIN.ini"

# Check config file exists
if [ ! -f "$ILO_CONF_FILE" ]
then
  echo "No iLO configuration file for $DOMAIN (looked for $DOMAIN.ini in $ILO_CONF_DIR)" >&2
  exit 1
fi

# Sanity check connection is working by checking UID light status
if ! hpilo_cli -c "$ILO_CONF_FILE" "$DOMAIN" get_uid_status >/dev/null
then
  echo "Error checking connectivity to iLO (failed to get UID light status) for $DOMAIN" >&2
  exit 1
fi

# If that succeeded, request certificate
if ! hpilo_cli -c "$ILO_CONF_FILE" "$DOMAIN" certificate_signing_request country='$certificate.country' state='$certificate.state' locality='$certificate.locality' organization='$certificate.organization' organizational_unit='$certificate.organizational_unit' common_name="$DOMAIN" | grep -v '^>>>'
then
  echo "CSR request failed, retrying..." >&2
  # If request failed, assume iLO is still generating the CSR and retry every 5 seconds for up to 10 minutes
  succeeded=0
  i=0
  while [ $i -lt 120 ]  # 120 * 5s waiting = 10 minutes
  do
    sleep 5
    if hpilo_cli -c "$ILO_CONF_FILE" "$DOMAIN" certificate_signing_request country='$certificate.country' state='$certificate.state' locality='$certificate.locality' organization='$certificate.organization' organizational_unit='$certificate.organizational_unit' common_name="$DOMAIN" 2>/dev/null | grep -v '^>>>'
    then
      # succeeded this time
      succeeded=1
      echo "Got CSR after $(( i + 1 )) re-attempts." >&2
      break
    fi
    i=$(( i + 1 ))
  done
  if [ $succeeded -eq 0 ]
  then
    echo "ERROR: unable to obtain CSR from iLO!" >&2
    exit 1
  fi
fi

Again, dehydrated-code-rack was lacking support for the generate_csr hook point - cue another pull request.
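
The retry loop in hp-ilo-csr repeats the whole hpilo_cli command line. One way to tidy that up would be a generic retry helper. This is a sketch of the idea rather than what the script above does, and the function name retry is my own choice:

```shell
# Hypothetical helper: run a command until it succeeds, sleeping `delay`
# seconds between attempts and giving up after `max_attempts` tries.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true
  do
    if "$@"
    then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]
    then
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}
```

With the hpilo_cli and grep pipeline wrapped in a shell function (say fetch_csr, also a made-up name), retry 121 5 fetch_csr would give the same behaviour as the script: one initial attempt plus up to 120 retries at 5 second intervals.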

Finally, I just need to upload the signed certificate (I called this one hp-ilo-deploy). This script does not need a sanity check at the start, as a failure of the certificate upload is always an error. With CSR generation, a failure in normal use can simply indicate that the CSR is still being generated, hence the connectivity check there to catch other errors first:

#!/bin/bash

ILO_CONF_DIR=/etc/dehydrated/hp-ilo

ILO_CONF_FILE="$ILO_CONF_DIR/$DOMAIN.ini"

# Check config file exists
if [ ! -f "$ILO_CONF_FILE" ]
then
  echo "No iLO configuration file for $DOMAIN (looked for $DOMAIN.ini in $ILO_CONF_DIR)" >&2
  exit 1
fi

hpilo_cli -c "$ILO_CONF_FILE" "$DOMAIN" import_certificate certificate="$( cat "$CERTFILE" )"

exit $?  # use the hpilo_cli exit status as this script's exit status

All in all, it took 5 days of tinkering in the 30 minutes or so after work to get this all working but I am pleased that it is.