Sunday, May 13, 2018

"grep" with Colors can't find its Results

I ran into an interesting situation recently where one of our standard cleanup scripts for Oracle trace files, logs, etc. could not find an entry in /etc/oratab even though it was there.  The problem was that "grep" didn't know what color $ORACLE_SID was.

Obviously $ORACLE_SID isn't set to a specific color, but when you use color options with "grep", the special characters that display colors can come back to bite you.  For example, let's say you are using ASM for an Oracle database and want to switch your environment to point to that ASM instance, and do so programmatically.  The following shows that the instance exists and is also defined in "/etc/oratab":

% ps -fu oracle | grep asm_smon_ | grep -v grep
oracle     9234      1  0  2017 ?        00:14:39 asm_smon_+ASM
% grep "^+ASM" /etc/oratab
+ASM:/grid/app/12.1.0/oracle:N          # line added by Agent

To switch the environment in a script you can simply tell "oraenv" to not prompt for input after you set $ORACLE_SID:

% cat x.sh
export ORAENV_ASK="NO"
export ORACLE_SID=`grep "^+ASM" /etc/oratab | cut -d\: -f1`
. oraenv

And when running the simple script:

% chmod +x x.sh
% x.sh
ORACLE_HOME = [/dnbusr1/oracle] ?

Wait!  I had validated that the instance I was going to use, "+ASM", existed in "/etc/oratab".  Why couldn't "oraenv" find the value?  To debug, I made a copy of "/usr/local/bin/oraenv", added "set -x" to that local copy, and re-ran my script to see what was going on.  At first I didn't see the problem:

% x.sh
+++ SILENT=
+++ '[' 0 -gt 0 ']'
+++ case ${ORACLE_TRACE:-""} in
+++ N=
+++ C=
+++ echo '\c'
+++ grep c
+++ N=-n
+++ '[' /ora01/app/oracle/product/12.1.0/db_1 = 0 ']'
+++ OLDHOME=/ora01/app/oracle/product/12.1.0/db_1
+++ case ${ORAENV_ASK:-""} in
+++ NEWSID="^+ASM"
+++ export ORACLE_SID
++++ dbhome "^+ASM"
+++ ORAHOME=/dnbusr1/oracle
+++ case $? in
+++ echo -n 'ORACLE_HOME = [/dnbusr1/oracle] ? '
ORACLE_HOME = [/dnbusr1/oracle] ? +++ read NEWHOME

Just for yucks I decided to send the output to a logfile and review the results in "vi" and sure enough, that helped identify the problem:

% x.sh 1>x.log 2>&1

% vi x.log
+++ SILENT=
+++ case ${ORACLE_TRACE:-""} in
+++ N=
+++ C=
+++ echo '\c'
+++ grep c
+++ N=-n
+++ '[' /grid/app/11.2.0.4/oracle = 0 ']'
+++ OLDHOME=/grid/app/11.2.0.4/oracle
+++ case ${ORAENV_ASK:-""} in
+++ NEWSID='^[[01;31m^[[K+ASM^[[m^[[K1'
+++ export ORACLE_SID
++++ dbhome '^[[01;31m^[[K+ASM^[[m^[[K1'

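The escape sequences wrapped around "+ASM" ("^[[01;31m^[[K" before it and "^[[m^[[K" after it) add color to the string.  They were invisible in the first trace because the terminal interpreted them as color instead of printing them.  And why were they there?  Because someone had added color options to "grep" with:  export GREP_OPTIONS="--color=always".  In my script I used "grep" to find the Oracle instance (in case the environment I was working on had a different value, such as on a cluster), so the color escape sequences were stored in the variable along with the string, and "oraenv" found no matching entry in "/etc/oratab".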
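
One way to keep a script immune to whatever color settings a user has exported is to neutralize them at the point of the lookup.  The following is only a sketch and assumes GNU "grep"; "lookup_sid" is a hypothetical helper name (it is not part of "oraenv"), and a throwaway sample file stands in for "/etc/oratab" so the example can run anywhere:

```shell
#!/bin/sh
# Hypothetical helper (not part of oraenv): print the SID from an
# oratab-style file, ignoring any grep color settings in the environment.
# $1 = SID to look for, $2 = file to search (defaults to /etc/oratab).
lookup_sid() (
    # The body runs in a subshell, so the caller's GREP_OPTIONS is untouched.
    unset GREP_OPTIONS
    # --color=never is explicit insurance.  $1 is used as a basic regular
    # expression, in which "+" is a literal character.
    grep --color=never "^$1:" "${2:-/etc/oratab}" | cut -d: -f1
)

# Demo against a copy of the oratab line shown earlier.
sample=$(mktemp)
echo '+ASM:/grid/app/12.1.0/oracle:N          # line added by Agent' > "$sample"

ORACLE_SID=$(
    # Simulate the troublesome setting for this one lookup only.
    GREP_OPTIONS='--color=always'; export GREP_OPTIONS
    lookup_sid +ASM "$sample"
)
export ORACLE_SID
echo "ORACLE_SID=$ORACLE_SID"

rm -f "$sample"
```

With GNU "grep" this prints "ORACLE_SID=+ASM" with no escape sequences in the value, so it is safe to hand to "oraenv".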

So is there a happy medium where you can use colors to help identify found strings yet not have those colors interfere in cases like the one described above?  Yes, for the most part.  The best option is to use "--color" or "--color=auto" instead of "--color=always".  The two are equivalent ("auto" is what "grep" assumes when "--color" is given without a value), and in either case color is added only when standard output is connected to a terminal, to partially quote the "man" page for "ls".  In other words, a command like "grep "^+ASM" /etc/oratab" run at a terminal would include color, whereas "grep "^+ASM" /etc/oratab | cut -d\: -f1" would not because its output is piped.
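
The difference between the two settings, and a guard a script could apply before trusting a looked-up value, can be sketched as follows.  This assumes GNU "grep"; "has_escape" is a hypothetical helper name, and the input line is a stand-in for an "/etc/oratab" entry:

```shell
#!/bin/sh
# Hypothetical guard: succeeds when the value contains an ESC (0x1b) byte,
# which is the first byte of every terminal color sequence.
has_escape() {
    esc=$(printf '\033')
    case $1 in
        *"$esc"*) return 0 ;;
        *)        return 1 ;;
    esac
}

line='+ASM:/grid/app/12.1.0/oracle:N'

# Both greps write to a pipe, so only --color=always emits color codes.
forced=$(printf '%s\n' "$line" | grep --color=always '^+ASM' | cut -d: -f1)
auto=$(printf '%s\n' "$line" | grep --color=auto '^+ASM' | cut -d: -f1)

for value in "$forced" "$auto"; do
    if has_escape "$value"; then
        echo "rejected: value contains escape sequences"
    else
        echo "accepted: $value"
    fi
done
```

A check like this lets a script fail loudly instead of passing a polluted $ORACLE_SID on to "oraenv".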