We have a recurring problem with disk space being exhausted on the root filesystem of a system, the root cause of which is gnome-terminal holding open file-handles to very large deleted temporary files in /tmp. I suspect there is a bug in gnome-terminal not closing the handles to its scrollback buffer (possibly only when set to unlimited scrollback, as some users have).
The oddity with this is that df will show the filesystem as 100% full with, e.g., 31GB used, yet du -x will only be able to account for, e.g., half of the total, suggesting that only 16GB of files exist on the disk.
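The mismatch can be reproduced on a small scale. The following is a minimal sketch, assuming a Linux system with /proc mounted; the scratch file and the choice of descriptor 9 are purely illustrative and have nothing to do with how gnome-terminal itself manages its files.

```shell
#!/bin/sh
# Sketch (Linux only): a file that is deleted while still open keeps its
# disk blocks allocated, so df counts it but du cannot find it.

scratch=$(mktemp)             # illustrative stand-in for a scrollback temp file
exec 9>"$scratch"             # hold it open for writing on descriptor 9
echo "scrollback data" >&9
rm -f "$scratch"              # unlink it: the name is gone, the data is not

# Child processes inherit descriptor 9, so readlink can inspect its own copy.
# The kernel marks the link target of an unlinked file with " (deleted)".
link_target=$(readlink /proc/self/fd/9)
echo "fd 9 -> $link_target"

if [ -e "$scratch" ]; then
    echo "name still present in the directory tree"
else
    echo "name gone from the directory tree (du will not count it)"
fi

exec 9>&-                     # closing the last open handle frees the space
```

The space is only returned to the filesystem once the last descriptor referring to the file is closed, which is why a long-running process can pin many gigabytes that no directory listing will show.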
Diagnosing the stale file-handles requires breaking out lsof and grepping for open deleted files, like this:
lsof | grep ' (deleted)$'
The output tells you the following information (in column order):
- process name
- process id (PID)
- file descriptor number and mode (‘r’ for read, ‘w’ for write and ‘u’ for read and write)
- type (usually ‘REG’ for regular file in this case)
- device numbers separated by commas
- node number
- name of the file
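As a cross-check, roughly the same information can be read straight from /proc, with the added benefit of showing how much space each handle is pinning. This is a sketch only, assuming Linux and GNU stat; the function name list_deleted_fds is made up for illustration.

```shell
#!/bin/sh
# Sketch (Linux, GNU coreutils): list open descriptors whose target has been
# deleted, printing "size-in-bytes <TAB> /proc/PID/fd/FD <TAB> target".
# list_deleted_fds is a hypothetical helper name, not a standard tool.

list_deleted_fds() {
    for fd in /proc/[0-9]*/fd/*; do
        # Unreadable or vanished entries make readlink fail; skip them.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)')
                # -L follows the /proc link to the still-open file itself.
                size=$(stat -Lc %s "$fd" 2>/dev/null) || continue
                printf '%s\t%s\t%s\n' "$size" "$fd" "$target"
                ;;
        esac
    done
}

# Largest offenders first.
list_deleted_fds | sort -rn | head -n 20
```

Run as an ordinary user, this only sees that user's own processes; running it as root covers the whole system. The PID and descriptor number appear directly in the /proc path of each result.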
Once the offending files have been located, recovery can be effected by several methods - in increasing order of finesse (the last two were christened ‘axe’ and ‘scalpel’ by a colleague):
- sledgehammer - reboot the machine. This will close and reset all file handles.
- axe - kill (with -9 if necessary) the offending process. This will probably annoy the user whose process is holding open the files but, in the case of gnome-terminal at least, it very effectively releases the space.
- scalpel - use truncate to resize the files behind the open handles to zero size; however, this may destabilise the program if it has internally cached information about the file size and does not have very robust file access code. The basic form of the command is
truncate -s0 /proc/$PID/fd/$FD
where $PID and $FD can be found from columns 2 and 4 of the