February 16, 2013

Unix: Why do “df” and “du” report different disk usage?

Let's understand the fundamental difference between the two commands first.

  • "df" reports usage from filesystem-level accounting: the used and free block counts that the filesystem itself maintains
  • "du" reports usage by walking a directory tree and adding up the space used by each file it can reach
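As a rough illustration of the two approaches, here is a minimal Python sketch. The function names df_used_bytes and du_used_bytes are made up for this example, and it assumes a Unix-like system where st_blocks is reported in 512-byte units; it is not how the real commands are implemented.

```python
import os


def df_used_bytes(path):
    """Used bytes on the filesystem holding `path`, from filesystem-wide counters."""
    st = os.statvfs(path)
    # Total blocks minus free blocks, in units of the fragment size.
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def du_used_bytes(path):
    """Bytes allocated to `path` and to every entry found by walking beneath it."""
    seen = set()  # (device, inode) pairs, so hard links are counted only once

    def allocated(p):
        st = os.lstat(p)  # lstat: count a symlink itself, not its target
        key = (st.st_dev, st.st_ino)
        if key in seen:
            return 0
        seen.add(key)
        # Assumption: st_blocks is in 512-byte units (true on Linux and most Unix systems).
        return st.st_blocks * 512

    total = allocated(path)
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            total += allocated(os.path.join(root, name))
    return total
```

The first function never looks at a single file and the second never looks at the filesystem counters, so anything that only one of them can see (metadata, files with no directory entry, other mount points) shows up as a difference.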

There are many reasons for the discrepancy between the numbers reported by the two commands. However, it is _mostly_ caused by the removal of a large file that a process still holds open. Removing a file only deletes its directory entry; the inode and its data blocks are not released until the last open file handle is closed. "du" walks directories, so it no longer sees the file, whereas "df" still counts its blocks as used - which is why "du" shows the space as freed whereas "df" does not.
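The removed-but-still-open situation is easy to reproduce. The following is a small sketch, assuming a POSIX system and a local (non-NFS) temporary directory; the function name unlinked_but_open is invented for this example.

```python
import os
import tempfile


def unlinked_but_open(size=1024 * 1024):
    """Create a file, remove its name while it is still open, and inspect it."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"\0" * size)
        handle.flush()
        os.unlink(path)                 # the directory entry is gone ...
        st = os.fstat(handle.fileno())  # ... but the inode is still reachable via the handle
        result = {
            "name_exists": os.path.exists(path),
            "link_count": st.st_nlink,
            "size_via_handle": st.st_size,
        }
    # Leaving the with-block closes the handle; only then can the space be released.
    return result
```

While the handle is open, the file has no name and a link count of zero, yet it still has its full size. A directory walk cannot find it, but the filesystem still has to account for its blocks.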

How do you fix the discrepancy in such cases? The simplest way is to avoid removing files that are held open by some process. However, if you do accidentally remove such a file, restart the application/process that holds the open file descriptor - this is the proper way to clean up, since the space is released as soon as the descriptor is closed. There is also a workaround of clearing the file descriptor manually/directly, although it is not a recommended procedure.
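To know which process to restart, you first have to find out who is holding the removed file. The sketch below is Linux-specific: it assumes a /proc filesystem in which each /proc/<pid>/fd/<n> entry is a symbolic link whose target ends in " (deleted)" once the file has been removed. The function name find_deleted_open_files is hypothetical, and without root privileges it only sees processes you are allowed to inspect.

```python
import os


def find_deleted_open_files(proc_root="/proc"):
    """Return (pid, fd, target) for open descriptors whose file has been removed."""
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # the process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # the descriptor was closed in the meantime
            if target.endswith(" (deleted)"):
                hits.append((int(entry), int(fd), target))
    return hits


if __name__ == "__main__":
    for pid, fd, target in find_deleted_open_files():
        print(pid, fd, target)
```

The process IDs in the output are the candidates for a restart.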

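Another reason for the mismatch is block granularity. "df" counts complete blocks (8 KB blocks, for example, as seen in the above techbit), whereas "du", when asked for actual file sizes (for example with the --apparent-size option of GNU "du"), counts only the bytes in each file (like 1 KB if the file is actually only 1 KB). By default most "du" implementations also count allocated blocks, so this gap shows up mainly when actual sizes are being compared. If there is a large number of such small files, the difference can add up significantly.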
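The effect of block granularity can be sketched with a little arithmetic. The helper names below are invented for this example, the 1000-file figure is purely illustrative, and the 512-byte unit for st_blocks is an assumption that holds on Linux and most Unix systems.

```python
import os


def allocated_size(apparent_size, block_size):
    """Round a file size up to a whole number of blocks."""
    blocks = -(-apparent_size // block_size)  # ceiling division
    return blocks * block_size


def apparent_and_allocated(path):
    """Return (bytes of content, bytes allocated on disk) for one file."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


# 1000 files of 1 KB each on a filesystem with 8 KB blocks (illustrative numbers)
files, size, block = 1000, 1024, 8192
apparent_total = files * size
allocated_total = files * allocated_size(size, block)
print(apparent_total, allocated_total)  # 1024000 8192000
```

In this example roughly 1 MB of content occupies about 8 MB of disk. On a real system the allocated figure depends on the filesystem; some store very small files inside the inode and report no data blocks at all.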

There are many other possible reasons for the mismatch as well. Provided below are a few links for further reading:

http://linuxshellaccount.blogspot.com/2008/12/why-du-and-df-display-different-values.html
http://blog.thilelli.net/post/2008/10/18/Discrepancies-Between-df-And-du-Outputs
http://sysunconfig.net/aixtips/df_du_diff_out.txt
