Mind the inode!

What’s the first action you take when a log says: no space left on device? Mine is:

$ df -h /mount/point

Surprisingly at first, a filesystem can be full even when there’s still plenty of free space on it. It’s something that has rarely happened to me, so rarely that I usually forget to check the other side of the coin: free inodes!

To check whether you’ve run out of inodes, use the -i option of df:

tx0@wallace:~$ df -i /tmp/
Filesystem       Inodes  IUsed    IFree IUse% Mounted on
/dev/sdb2      18317312 394930 17922382    3% /
tx0@wallace:~$ df -h /tmp/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb2       275G   11G  251G   4% /

This is from my computer, not from a starving server, but you see the point: occupation is 4% by size, but 3% by inodes. The filesystem is full if at least one of the two reaches 100%. So check the inodes too!
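Since either metric can hit 100% first, a monitoring check should look at both sides. Here is a minimal sketch using GNU df’s --output option; the default mount point and the 90% threshold are illustrative choices, not values from the story above:

```shell
#!/bin/sh
# Sketch: a filesystem is effectively full when EITHER block usage
# OR inode usage hits 100%. Mount point and threshold are examples.
MOUNT="${1:-.}"
THRESHOLD=90

# GNU df can emit just the percentage columns we care about.
usage=$(df --output=pcent "$MOUNT" | tail -n 1 | tr -dc '0-9')
iusage=$(df --output=ipcent "$MOUNT" | tail -n 1 | tr -dc '0-9')
iusage=${iusage:-0}   # some filesystems report '-' for inodes

echo "space ${usage}% inodes ${iusage}%"
if [ "$usage" -ge "$THRESHOLD" ] || [ "$iusage" -ge "$THRESHOLD" ]; then
    echo "WARNING: $MOUNT is close to full"
fi
```

Dropping a script like this into your monitoring would have caught the case below long before services started failing.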

Inode exhaustion is rare, but it can happen: for example, when an application produces (and keeps for future reference) a huge number of very small files, such as profiles, input data, ini files, XML files and the like. I faced this problem working with hadoop from Cloudera CDH4.7. In the /tmp/hadoop-hive/mapred/local/archive/ directory, hadoop or hive was keeping track of all past jobs. The directory contained 400K subdirectories, each with its load of small files. The filesystem was at 35% of its space, but 100% of its inodes were allocated. To free the inodes, I first ran:
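When inodes run out, the first question is where they all went. One way to hunt for inode-heavy directories is to group every entry by its parent with GNU find and count; the demo tree below is a stand-in I made up for illustration, but pointed at a real filesystem this would have flagged the archive directory immediately:

```shell
#!/bin/sh
# Sketch: find the directories holding the most entries (each entry
# costs an inode). The demo tree here is purely illustrative.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/many" "$DEMO/few"
for i in $(seq 1 200); do : > "$DEMO/many/f$i"; done
: > "$DEMO/few/only_one"

# Print each entry's parent directory, then count per directory:
# the top line is the biggest inode consumer.
find "$DEMO" -xdev -printf '%h\n' | sort | uniq -c | sort -rn | head -3

rm -rf "$DEMO"
```

The -xdev flag keeps find from crossing into other mounted filesystems, which matters when you are diagnosing exactly one of them.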

$ ls -tr /tmp/hadoop-hive/mapred/local/archive/ > /tmp/inode_list

This gave me a list of the 400K directories, ordered from oldest to newest. I then edited the file with vi to cut off a tail of 50K entries, to be sure to keep the most recent ones. Then I started a for loop inside a screen session to delete everything left in the list:
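The manual vi edit can also be scripted with head: keep everything except the newest entries, i.e. the tail of an oldest-first listing. The numbers below are a small stand-in for the real 400K-entry list and the 50K entries that were kept:

```shell
#!/bin/sh
# Sketch: trim the newest KEEP entries off an oldest-first listing,
# leaving only the old entries on the deletion list.
# seq stands in for the real ls -tr output; KEEP=20 stands in for 50K.
LIST=$(mktemp)
seq 1 100 > "$LIST"
KEEP=20

total=$(wc -l < "$LIST")
head -n "$((total - KEEP))" "$LIST" > "$LIST.to_delete"

wc -l < "$LIST.to_delete"   # 80: only the oldest entries remain

rm -f "$LIST" "$LIST.to_delete"
```

On a 400K-line file this is both faster and less error-prone than scrolling through it in vi.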

$ cd /tmp/hadoop-hive/mapred/local/archive/
$ for f in $(cat /tmp/inode_list); do echo "$f"; rm -rf -- "./$f"; done
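An alternative I would consider today is to let find do the selection and the deletion in one pass, which sidesteps the word-splitting pitfalls of looping over cat. Note the hedges: the original cleanup kept the newest 50K entries by position, whereas this sketch uses an age cutoff, and both the archive path stand-in and the 30-day threshold are illustrative:

```shell
#!/bin/sh
# Sketch: delete only top-level entries older than 30 days.
# ARCHIVE stands in for /tmp/hadoop-hive/mapred/local/archive/.
ARCHIVE=$(mktemp -d)
mkdir "$ARCHIVE/job_old" "$ARCHIVE/job_new"
touch -d '40 days ago' "$ARCHIVE/job_old"

# -mindepth/-maxdepth 1 restricts the match to the entries themselves;
# -exec ... + batches the removals instead of forking rm per entry.
find "$ARCHIVE" -mindepth 1 -maxdepth 1 -mtime +30 -print \
    -exec rm -rf -- {} +

ls "$ARCHIVE"   # only job_new survives

rm -rf "$ARCHIVE"
```

Batching with -exec ... + also makes a noticeable speed difference when you are removing hundreds of thousands of entries.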

The inode pressure started to fall, and I got the opportunity to restart blocked services like Hiveserver2.