Friday, August 17, 2012

Speeding up "fuser" Activity


"fuser" is a great utility on Unix systems for determining who has a file open, or simply whether anyone has it open.  For Oracle maintenance, it's handy when cleaning up tracefiles or audit files.  Oracle dumps plenty of files under [a|b|u]dump, yet it's not wise to blindly remove them as part of regular cleanup.  If a process still has a file open, removing it doesn't release the space until that process closes it.

The problem is "fuser" can also be very slow.  The point of the utility is to find all users who have a given file open, but that literally means checking every process on the system to see whether it holds a descriptor for that file; in DBA terms, "do a full scan" of all processes.  On a system with thousands of processes and thousands of files to check, the impact can be rather large.
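To make the "full scan" concrete, here is a rough sketch of what checking one file against every process amounts to on a Linux-style /proc.  It is only an illustration of the cost, not how "fuser" is actually implemented, and the function name scan_all_pids is made up for this example.

```shell
# Hypothetical helper: print the PID of every process that has file $1 open.
# Assumes a Linux-style /proc where /proc/<pid>/fd/* are symlinks to open files.
# Descriptors of other users' processes are only visible with enough privileges.
scan_all_pids() {
    sap_target=$(readlink -f "$1") || return 1
    # Every descriptor of every process gets examined: the "full scan".
    for sap_fd in /proc/[0-9]*/fd/*; do
        if [ "$(readlink "$sap_fd" 2>/dev/null)" = "$sap_target" ]; then
            # Strip "/proc/" and the trailing "/fd/N" to leave just the PID.
            sap_pid=${sap_fd#/proc/}
            echo "${sap_pid%%/*}"
        fi
    done | sort -u
}
```

Calling scan_all_pids once per audit file repeats that whole walk for each file, which is where the time goes.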

The good news is that Oracle is nice enough to frequently include the PID of the owning process within the filename itself.  Since these audit and tracefiles are exclusive to a PID, you can grab the PID off the filename and check one spot to see if that process still has the file open, instead of checking all processes on the system.
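Grabbing the PID off the filename can be done with plain shell parameter expansion.  This is a small sketch that assumes names shaped like ora_<PID>.aud (the PID being the second "_"-separated field); pid_from_aud_name is a hypothetical helper, and other Oracle versions may name their files differently.

```shell
# Hypothetical helper: print the PID embedded in an audit file name.
# Assumes names like ora_<PID>.aud; returns non-zero if no numeric PID is found.
pid_from_aud_name() {
    pfa_name=${1##*/}           # drop the directory part
    pfa_name=${pfa_name%.aud}   # drop the .aud suffix
    pfa_pid=${pfa_name#*_}      # drop everything through the first underscore
    pfa_pid=${pfa_pid%%_*}      # keep only the second underscore-separated field
    case $pfa_pid in
        ''|*[!0-9]*) return 1 ;;   # not a plain number: refuse to guess
        *) echo "$pfa_pid" ;;
    esac
}
```

Refusing non-numeric results keeps a stray file that happens to match the glob from being checked against a bogus /proc path.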

As an example, the following shows a "fuser" method and a lookup-by-PID method for checking all audit files to see if any are open:

"fuser" Method
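# Ask fuser about each audit file; every call scans all processes on the system.
ls -1 $ORACLE_BASE/admin//adump/ora*.aud | while read AUD_FILE
do
   # Non-empty output means at least one process has the file open.
   [[ -n "`fuser "$AUD_FILE" | cut -d':' -f1`" ]] && echo "File $AUD_FILE is open"
done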

Lookup-by-PID Method
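ls -1 $ORACLE_BASE/admin//adump/ora*.aud | while read AUD_FILE
do
   # Assumes names like ora_<PID>.aud, so the PID is the second "_"-separated field.
   AUD_FILE_BASE=`basename "$AUD_FILE" .aud`
   AUD_FILE_PID=`echo "$AUD_FILE_BASE" | cut -d'_' -f2`
   # Look only at that one process's descriptors.  If the process is gone, ls
   # fails quietly and the count is 0.  -F matches the path literally, and -ge 1
   # also catches a file held open on more than one descriptor.
   if [ `ls -l /proc/$AUD_FILE_PID/fd 2>/dev/null | grep -cF "$AUD_FILE"` -ge 1 ]; then
      echo "File $AUD_FILE is open"
   fi
done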
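
The per-PID check can also be written without "ls" and "grep" by reading the descriptor symlinks directly.  This is a minimal sketch assuming a Linux-style /proc and a "readlink" that supports -f; is_open_by_pid is a hypothetical helper and was not part of the timings that follow.

```shell
# Hypothetical helper: succeed (exit 0) if process $2 holds file $1 open.
# Assumes a Linux-style /proc where /proc/<pid>/fd/* are symlinks to open files.
is_open_by_pid() {
    iobp_target=$(readlink -f "$1") || return 1
    for iobp_fd in /proc/"$2"/fd/*; do
        # A missing or unreadable process leaves the glob unexpanded; readlink
        # then prints nothing and the comparison simply fails.
        [ "$(readlink "$iobp_fd" 2>/dev/null)" = "$iobp_target" ] && return 0
    done
    return 1
}
```

Inside the loop, is_open_by_pid "$AUD_FILE" "$AUD_FILE_PID" && echo "File $AUD_FILE is open" would take the place of the "ls | grep" test.  Comparing whole paths avoids partial matches on similar filenames.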

The "Lookup" method takes a few extra lines of code, but the performance difference is dramatic.  I tested this on a system with 2500+ active processes and a few thousand audit files, using "time" to compare the two methods.

For "fuser":
real    8m3.663s
user    0m11.197s
sys     0m50.258s

For "Lookup-by-PID":
real    0m0.951s
user    0m0.272s
sys     0m0.895s

Not every system has that many files to check, so I ran both tests against just 1 file, still with 2500+ active processes.  The difference was roughly 1000x: "fuser" = 0m3.144s, "Lookup-by-PID" = 0m0.003s.

For me the bottom line is that you can have a huge, positive impact on performance if you're willing to add a smidge of intelligence to your code.