Friday, May 25, 2012

tar + find + "." = Too Many Files

I know this blog post has a somewhat strange title, especially for a DBA, but it's all related to an issue I ran into recently that was more difficult to resolve than it should have been, mostly due to time constraints.

I was working on an issue under Oracle 10gR2.  I had to upload all trace and log files under /bdump and /udump to My Oracle Support (MOS) for help in analyzing the problem.  The system was a 4-node RAC, and even with regular file cleanup jobs running nightly those directories had hundreds of files each.  But I knew the specific time range and wanted every file and directory from it, so I simply ran "tar" on the file list generated by a "find" command using the "-mmin" argument.

Looking more closely after the first few tarballs were created, I found that ALL files were getting tar'ed each time, as if the "find" command's argument were being ignored.  Run on its own, "find" worked as expected, but when its output was used as input for "tar" I got every file.

It turns out my problem was the "." directory.  I don't think twice about seeing the current directory (".") or the parent directory ("..") in listings, but they can clearly affect the output of commands.  As a simple example of what I ran into, let's say we have 5 files of 1KB, 2KB, ... 5KB in size and need to create a tarball of only those larger than 2KB.

First, create files for the simple test:

for KB in 1 2 3 4 5
do
   dd if=/dev/zero of=${KB}kb_file.txt bs=1024 count=$KB
done

% ls -ltr
total 24
-rw-r--r--  1 oracle oinstall 5120 May 25 16:18 5kb_file.txt
-rw-r--r--  1 oracle oinstall 4096 May 25 16:18 4kb_file.txt
-rw-r--r--  1 oracle oinstall 3072 May 25 16:18 3kb_file.txt
-rw-r--r--  1 oracle oinstall 2048 May 25 16:18 2kb_file.txt
-rw-r--r--  1 oracle oinstall 1024 May 25 16:18 1kb_file.txt

Next, show that the "find" command gets what I want:

% find . -size +2049c -ls
 98960    4 drwxr-xr-x   2 oracle   oinstall     4096 May 25 16:18 .
 98971    4 -rw-r--r--   1 oracle   oinstall     3072 May 25 16:18 ./3kb_file.txt
 98972    4 -rw-r--r--   1 oracle   oinstall     4096 May 25 16:18 ./4kb_file.txt
 98973    8 -rw-r--r--   1 oracle   oinstall     5120 May 25 16:18 ./5kb_file.txt

And last, see how this works with "tar":

% tar -cvf 3kb_or_bigger.tar `find . -size +2049c -print`
./
./1kb_file.txt
./2kb_file.txt
./3kb_file.txt
./4kb_file.txt
./5kb_file.txt
tar: ./3kb_or_bigger.tar: file is the archive; not dumped
./3kb_file.txt
./4kb_file.txt
./5kb_file.txt

As can be seen, every file is in the tarball, followed by a second copy of just the ones I really wanted.  What's happening is that the "." directory entry is itself 4096 bytes, so it passes the "-size" test and shows up in the "find" output (it is the first line of the listing above).  That "." is then passed to "tar", and "tar" archives a directory recursively, pulling in every file under it.  Filtering on filename and/or file type in the "find" command would resolve this, but at the time I was taking every shortcut possible.  You can bet that I'll respect the "." directory more from now on!
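For anyone who wants to try the file-type filter, here is a minimal sketch of one way to do it, not the commands from the original incident. It assumes GNU find and GNU tar (for "-print0", "--null", "-T -" and "--no-recursion"); the scratch directory and the archive name are made up for the test. Adding "-type f" keeps "." out of the list, "--no-recursion" is a second safety net in case a directory slips through anyway, and writing the archive outside the searched directory means "find" can never pick up the tarball itself.

```shell
#!/bin/sh
# Sketch only: assumes GNU find and GNU tar.
# The directory and archive names below are hypothetical.
workdir=$(mktemp -d)
mkdir "$workdir/trace"
cd "$workdir/trace" || exit 1
archive="$workdir/bigger_than_2kb.tar"

# Recreate the five test files of 1KB through 5KB.
for KB in 1 2 3 4 5
do
   dd if=/dev/zero of=${KB}kb_file.txt bs=1024 count=$KB 2>/dev/null
done

# -type f      : regular files only, so "." is never listed
# -print0      : NUL-separated names, safe for odd filenames
# --null -T -  : tar reads that NUL-separated list from stdin
# --no-recursion : tar will not descend into any directory it is given
find . -type f -size +2049c -print0 |
   tar -cf "$archive" --no-recursion --null -T -

# List what actually went into the archive.
tar -tf "$archive" | sort
```

The same pattern should carry over to a time-based search by swapping the "-size" test for the "-mmin" test. If directory entries are wanted in the archive too, dropping "-type f" and keeping "--no-recursion" stores the directory itself without dragging in its contents.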
