This post could be subtitled “down the rabbit hole”. I needed to reboot my router (kernel update) and, when I set about doing so, found that a backup job was still running 4 hours after it had started. Looking into why, I found the backups were taking 5 hours on average. The size of the backup seemed a little on the large side for what the box does, so I set about seeing whether it could be reduced, which led me to find some missing configuration…

My cruft

I started with a du of the root directories (the command is sketched after this list), quickly discovering two very large (>5GB) directories:

  • in /srv there was a rescued-from-old-install directory. After copying over a couple of files that I thought might not be stored elsewhere (other than the backups) I deleted this.
  • in my home directory is a synchronised copy of my NextCloud self-hosted sync-n-share folder. This one is needed: synchronising it to my always-on router is part of my belt-and-braces strategy to ensure that the folder’s contents are captured in at least one backup. Reducing the size of the contents, however, is on my (very long) “todo” list.
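
For reference, the du invocation was along these lines (a sketch, not the exact command; -x keeps du on one filesystem and sort -rh orders the human-readable sizes largest-first):

$ sudo du -xh --max-depth=1 / | sort -rh | head -n 15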

Apt cache

The next largest directory was /var/cache, and this was almost entirely due to the size of the apt sub-folder. To remove obsolete (no longer downloadable from the mirror) packages from apt’s cache (/var/cache/apt by default) run:

apt-get autoclean

On my router, this reduced the space consumed from over 2GB to 426MB.

My intention is to add a cronjob to run apt-get autoclean on a regular basis to help manage the size of this cache directory in future.
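
A minimal sketch of such a job as a weekly cron drop-in (the file name and the choice of cron.weekly are my assumptions, since I have not written it yet):

#!/bin/sh
# Hypothetical /etc/cron.weekly/apt-autoclean: quietly prune cached
# packages that can no longer be downloaded from the configured mirrors.
apt-get autoclean -qq

The script needs to be executable for run-parts to pick it up.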

Fail2Ban

The next largest directory, /var/lib/fail2ban, turned out to contain one file, fail2ban’s database:

$ ls -lh /var/lib/fail2ban/fail2ban.sqlite3
-rw------- 1 root root 1.4G Oct 19 10:48 /var/lib/fail2ban/fail2ban.sqlite3

Investigating the contents, I found that fail2ban had banned nothing in over a year:

$ sudo sqlite3 /var/lib/fail2ban/fail2ban.sqlite3
SQLite version 3.27.2 2019-02-25 16:06:06
Enter ".help" for usage hints.
sqlite> .tables
bans        fail2banDb  jails       logs
sqlite> .schema bans
CREATE TABLE bans(jail TEXT NOT NULL, ip TEXT, timeofban INTEGER NOT NULL, data JSON, FOREIGN KEY(jail) REFERENCES jails(name) );
CREATE INDEX bans_jail_timeofban_ip ON bans(jail, timeofban);
CREATE INDEX bans_jail_ip ON bans(jail, ip);
CREATE INDEX bans_ip ON bans(ip);
sqlite> select count(1) from bans;
2151497
sqlite> select count(1) from fail2banDb;
1
sqlite> select count(1) from jails;
1
sqlite> select count(1) from logs;
1
sqlite> select strftime('%d-%m-%Y %H:%M:%f', min(timeofban), 'unixepoch') from bans;
24-11-2018 21:44:21.000
sqlite> select strftime('%d-%m-%Y %H:%M:%f', max(timeofban), 'unixepoch') from bans;
06-07-2021 12:08:45.000

I was also surprised that there was only one jail. Looking into the configuration, it transpired that my old fail2ban configuration, last updated in 2012 according to revision control, never made it into the (at the time, new) Salt-managed configuration in 2013. A simpler configuration was in Salt and deployed on the router, although the firewall no longer allows any remote access, so the risk from the missing configuration was minimal.
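
For illustration, a jail of the simpler time-limited sort lives in /etc/fail2ban/jail.local and looks something like this (a sketch with made-up values, not my actual configuration):

# Hypothetical time-limited jail: ban for 10 minutes after 3 failures
[sshd]
enabled  = true
maxretry = 3
bantime  = 600

The old configuration’s permanent bans would instead use a negative bantime, which fail2ban treats as “forever”.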

As the historic ban information is useless (the old configuration permanently banned repeat offenders after 3 previous bans across all monitored services, but the current configuration only applies time-limited bans), I stopped fail2ban and removed the file before restarting it (which recreated the file), as suggested on ServerFault. It is noteworthy that newer versions of fail2ban (0.11.1+) include the fix “purge database will be executed now”, but my server has 0.10 installed. To clear this database:

systemctl stop fail2ban
rm /var/lib/fail2ban/fail2ban.sqlite3
systemctl start fail2ban

This reduced the size considerably:

$ ls -lh /var/lib/fail2ban/fail2ban.sqlite3
-rw------- 1 root root 56K Oct 19 22:19 /var/lib/fail2ban/fail2ban.sqlite3
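
On a version with the purge fix (0.11.1+), the database could instead be kept in check with fail2ban’s dbpurgeage setting; a sketch, with my choice of a one-day retention:

# Hypothetical /etc/fail2ban/fail2ban.local: keep only a day of ban history
[Definition]
dbpurgeage = 1d

My 0.10 install will have to make do with the occasional manual clear-out.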

Finally

After doing this bit of tidying, I reduced the total size of the files on the server (excluding the NextCloud folder) by nearly 60%, or 30% if the large NextCloud folder is included, which I hope will make the backups take a bit less time going forward.