Mirroring pip packages
As I was trying to install a recent version of Ansible in my air-gapped home lab network, I discovered that mirroring Python packages, e.g. from PyPI, is quite difficult despite pip’s download option. The --platform, --python-version, --implementation and --abi options are supposed to allow downloading for another platform, but finding the right combination of options was tricky. In particular, the cryptography package (which often causes me issues) would not download for Debian Bullseye on either Buster or macOS with any combination I tried. In the end, I resorted to the approach I adopted for Gentoo: using a Docker container to download for the platform inside the container.
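For reference, the general shape of a cross-platform pip download invocation is sketched below. The values used (manylinux2014_x86_64, Python 3.9 as shipped by Bullseye, CPython) and the ./wheels destination are assumptions for illustration, not a combination that is known to resolve cryptography. pip only accepts the platform options together with either --only-binary=:all: or --no-deps.

```python
def build_pip_download_cmd(package, dest, platform, python_version,
                           implementation, abi):
    """Build an argv list for `pip download` targeting another platform."""
    return [
        "pip", "download",
        "--dest", dest,
        # pip requires this (or --no-deps) when targeting a foreign platform
        "--only-binary=:all:",
        "--platform", platform,
        "--python-version", python_version,
        "--implementation", implementation,
        "--abi", abi,
        package,
    ]


# Assumed values for Debian Bullseye on x86_64 (CPython 3.9)
cmd = build_pip_download_cmd(
    "cryptography", "./wheels", "manylinux2014_x86_64", "39", "cp", "cp39"
)
print(" ".join(cmd))
```

Whether this finds a matching wheel depends entirely on which wheels have been published for the package, which is why the container approach is more predictable.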
docker role
I started by separating the setup-docker sub-task file (tasks/setup-docker.yaml) from the gentoo-mirror role into a role in its own right, called docker. I now need Docker in more than just that one role so, applying the DRY (don’t repeat yourself) principle, now is the time to pull it out.
The role takes no arguments; the only variable is a built-in fact for the distribution’s release, so all that is needed is the task file. This remains Debian-specific for the host; I should look at creating a version that works on macOS (and maybe Windows) too. The tasks go in the role’s tasks/main.yaml file:
---
- name: Ensure Docker repository is available
  become: yes
  ansible.builtin.apt_repository:
    # Match what is currently in SaltStack (repo and filename) so they
    # do not end up fighting.
    repo: deb https://download.docker.com/linux/debian {{ ansible_facts.distribution_release }} stable
    filename: docker
    state: present
- name: Ensure Docker is installed
  become: yes
  ansible.builtin.package:
    name:
      - docker-ce
      - docker-ce-cli
      - containerd.io
      - python3-docker  # So Ansible can manage docker
    state: present
...
Updating gentoo-mirror role
In gentoo-mirror’s tasks/main.yaml I removed the following include (and deleted the tasks/setup-docker.yaml file):
ansible.builtin.include_tasks: setup-docker.yaml
and created meta/main.yaml with the contents:
---
dependencies:
- role: docker
...
Creating the pip-mirrors role
This has the same dependency, so needs an identical meta/main.yaml file:
---
dependencies:
- role: docker
...
This role does take arguments: where to mirror to, a list of platforms to download for and a list of packages to download. The specification for these arguments goes in meta/argument_specs.yaml:
---
argument_specs:
  main:
    short_description: Mirror pip packages, using docker to fetch platform-specific version
    options:
      target:
        type: str
        required: true
        description: Base directory to mirror to
      platforms:
        type: list
        required: true
        elements: dict
        options:
          image:
            type: str
            required: true
            description: Docker image name to use to fetch pip
          pre-command:
            type: str
            description: Pre-pip command to run (e.g. to install pip in the container). No command will be run if not provided.
          name:
            type: str
            description: sub-directory name to use (defaults to the value of `image` with colons replaced by hyphens)
      packages:
        type: list
        required: true
        elements: str
        description: List of packages to fetch with pip (passed directly to pip, so anything pip accepts (e.g. version constraints) can be included)
...
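The default for name can be illustrated with a small sketch. The function platform_dir is a hypothetical helper, not part of the role; it mirrors the Jinja expression the tasks use, `item.name | default(item.image | replace(':', '-'))`.

```python
def platform_dir(platform):
    """Return the mirror sub-directory name for one `platforms` entry."""
    # Use `name` when it is given, otherwise derive it from the image
    # name by replacing colons with hyphens.
    return platform.get("name", platform["image"].replace(":", "-"))


print(platform_dir({"image": "debian:bullseye"}))
print(platform_dir({"image": "debian:bullseye", "name": "stable"}))
```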
The tasks to fetch the packages just use the docker images from the platforms list to download all of the packages. This goes in tasks/main.yaml:
---
- name: Make sure the target directories exist
  ansible.builtin.file:
    path: "{{ target }}/{{ item.name | default(item.image | replace(':', '-')) }}"
    state: directory
  loop: '{{ platforms }}'
- name: Fetch packages
  become: yes
  community.docker.docker_container:
    name: do_pip_fetch
    container_default_behavior: no_defaults  # Stop warning
    cleanup: yes
    detach: no
    # This originally used `pip download -d` instead of `pip wheel -w`
    # but that caused problems with missing build dependencies in
    # isolated (air-gapped) environments. `pip wheel` will produce
    # built binaries for the platform, even if that means building
    # locally.
    command: "bash -c '{{ item['pre-command'] | default('/bin/true') }} && pip wheel -w /mnt/ {{ packages | map('quote') | join(' ') }}'"
    image: '{{ item.image }}'
    mounts:
      - source: "{{ target }}/{{ item.name | default(item.image | replace(':', '-')) }}"
        target: /mnt
        type: bind
  loop: '{{ platforms }}'
...
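The way the container command is assembled can be sketched in Python. The helper build_fetch_argv is hypothetical and returns an argument list rather than one string. Ansible's quote filter behaves like shlex.quote, so a specifier such as ansible>=4 gets wrapped in single quotes, which in the task above would sit inside the outer single-quoted bash -c string. Keeping the script as a single list element (docker_container's command accepts a list as well as a string) is one possible way to keep the two quoting layers apart.

```python
import shlex


def build_fetch_argv(packages, pre_command=None):
    """Return an argv list equivalent to the role's `bash -c '...'` command."""
    # Fall back to a no-op when no pre-command is given, as the role does.
    setup = "/bin/true" if pre_command is None else pre_command
    # Quote each package so version constraints survive the inner shell.
    quoted = " ".join(shlex.quote(p) for p in packages)
    return ["bash", "-c", f"{setup} && pip wheel -w /mnt/ {quoted}"]


print(build_fetch_argv(["ansible"]))
print(build_fetch_argv(
    ["ansible>=4"],
    "apt-get update && apt-get -y install python3-pip",
))
```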
Using pip-mirrors role
To use this role to download ansible for Debian Bullseye (current stable, at time of writing):
- role: pip-mirrors
  target: '{{ mirror_base_path }}/pip'
  platforms:
    - image: debian:bullseye
      pre-command: 'apt-get update && apt-get -y install python3-pip'
  packages:
    - ansible
  tags: ['pip']
N.B. This will never remove packages from the mirror, which might become a problem over time.
Installing from mirror
Unfortunately, simply uploading the files to a web server doesn’t conform to the layout required by pip, which requires normalising the names of the packages and placing them in the appropriate locations. The easiest way I found to use the mirror (until it becomes large) was to download the packages with wget’s recursion:
# Create a temporary folder
outdir=$(mktemp -d)
# Fetch with wget in temporary directory:
#   -r = recursive
#   -A whl = download only files with the extension 'whl' (was
#            `-A whl,gz,tar` when using `pip download` instead of
#            `pip wheel` to create mirror but now will always be whl)
#   -np = do not recurse to parent directories
#   -nd = do not create the server's directory hierarchy locally
pushd "$outdir"
wget -r -A whl -np -nd http://mirror/mirrors/pip/debian-bullseye/
popd
# Install Ansible
#   -f = where to look for packages
pip install -f "$outdir" --no-index ansible
# Tidy up
rm -rf "$outdir"
A simpler solution would probably be to NFS-mount the mirror, though; I’m sure it could be used by the OS that way too.
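Returning to the layout pip expects from an index: PEP 503 describes a "simple" repository with one page per project, under a normalised project name, linking to that project's files. The sketch below shows the normalisation and generates such pages for a flat directory of wheels. The function names, the simple/ prefix and the relative ../../ link target are assumptions for illustration. This is not something the role does.

```python
import re
from collections import defaultdict
from urllib.parse import quote


def normalize(name):
    """Normalise a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_index_pages(wheel_filenames, files_url="../../"):
    """Map index file paths to HTML for a PEP 503 style index.

    files_url is an assumed URL prefix, relative to each project page,
    under which the wheel files themselves are served.
    """
    projects = defaultdict(list)
    for filename in sorted(wheel_filenames):
        # A wheel filename starts with the distribution name, then '-'.
        projects[normalize(filename.split("-", 1)[0])].append(filename)

    def page(links):
        body = "\n".join(links)
        return "<!DOCTYPE html>\n<html><body>\n" + body + "\n</body></html>\n"

    # Root page: one link per normalised project name.
    pages = {
        "simple/index.html": page(
            '<a href="{0}/">{0}</a>'.format(name) for name in sorted(projects)
        )
    }
    # Project pages: one link per wheel belonging to that project.
    for name, files in projects.items():
        pages["simple/{}/index.html".format(name)] = page(
            '<a href="{}{}">{}</a>'.format(files_url, quote(f), f)
            for f in files
        )
    return pages


pages = simple_index_pages([
    "ansible_core-2.11.6-py3-none-any.whl",
    "PyYAML-5.4.1-cp39-cp39-linux_x86_64.whl",
])
for path in sorted(pages):
    print(path)
```

With pages like these written next to the wheels, pip could in principle be pointed at the mirror with --index-url instead of fetching everything first, though I have not set the mirror up that way.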