Thursday, November 3, 2011

"ulimit", ASM, and Thinking Things Through

The other day while reviewing an instance's alert log I found an interesting message:
WARNING:Oracle instance running on a system with low open file descriptor
        limit. Tune your system to increase this limit to avoid
        severe performance degradation.

A quick search of MOS pointed me to bug 5862719.  It turns out on our RAC system ( on RHEL 4, the /etc/init.d/init.crsd file has the command "ulimit -n unlimited" in it, which fails and drops the file descriptor limit to its default, which in our case is 1024.
% ps -ef | grep smon_dchdb
oracle   20720     1  0 Oct31 ?        00:00:16 ora_smon_dchdb1

% grep "open files" /proc/20720/limits
Max open files            1024                 1024                 files

The system recently crashed (which was why I was checking the alert log) and so all auto-started Oracle processes inherited their ulimits from the init processes.
My point of this isn't to re-document an Oracle bug but emphasize the importance of taking a step back periodically and thinking things through.  Is an FD limit of 1024 potentially hurting our system, performance-wise?  If so, can I duplicate this elsewhere to prove the impact, justifying an emergency fix?
On Linux it's pretty easy to check on open file counts for processes running under the same account.  As "oracle" I issued the following:
# Check all "oracle" processes associated with database "dchdb" and ASM, grab the PID
# and current command, then count how many files are opened and display the results.
ps -f -u oracle | grep -E '(dchdb1|ASM)' | awk '{print $2" "$8}' | \
   while read IN_PID IN_CMD
      echo "Files opened for PID $IN_PID $IN_CMD: `ls -1 /proc/${IN_PID}/fd 2>/dev/null | wc -l`"
   done | sort -nr -k7,7 | head -6

#-- output --#
Files opened for PID 21810 ora_rbal_dchdb1: 28
Files opened for PID 21790 ora_asmb_dchdb1: 28
Files opened for PID 22080 ora_arc1_dchdb1: 27
Files opened for PID 22174 ora_qmnc_dchdb1: 26
Files opened for PID 22078 ora_arc0_dchdb1: 26
Files opened for PID 20859 ora_lck0_dchdb1: 24

No process has more than 28 files open, so a limit of 1024 seems pretty safe.  Yet the database has a few hundred datafiles, so why isn't the number higher?  This is where ASM comes in to play.  ASM bypasses the file system so database "files" aren't managed by the OS and FD's, so for the most part FD limits have a different context when dealing with ASM.
Result?  While I believe it's not always wise to leave code as is even though 1 or more commands/parts are failing each time, in this case spending a bit of time to think things through allowed this change to be scheduled on a lower priority.

Friday, October 21, 2011

Generating Extra Rows and the Right Cardinality

Periodically I find code executed on databases I support that is used to generate x number of extra rows, normally to produce a set of literal values as a table.  For example, recently I found the following embedded as an inline view in a rather large query:

(SELECT LAST_DAY(ADD_MONTHS(:in_date, -1)) work_date, -1 rws_back
   FROM dual UNION
 SELECT LAST_DAY(ADD_MONTHS(:in_date, -2)), -2
   FROM dual UNION
 SELECT LAST_DAY(ADD_MONTHS(:in_date, -3)), -3
   FROM dual UNION
 SELECT LAST_DAY(ADD_MONTHS(:in_date, -13)), -13
   FROM dual) in_dt_view

The idea was to generate a 13-row table, with each row holding a date counting backwards 1 month from the previous row.  Those 26 lines of SQL could have easily been written as:

(SELECT /*+ CARDINALITY(d 13) */ LAST_DAY(ADD_MONTHS(sysdate, (level*-1))) work_date
      , (level*-1) rws_back
   FROM dual d
CONNECT BY level <= 13) in_dt_view

Using "CONNECT BY level <=" is a handy way of generating x number of rows for any number of purposes, not only in situations like the above one but also for generating data when conducting tests.  This change isn't meant to tune the query (it does take a smidge less Shared Pool space) but make it easier to write (imagine if they needed 48 months back in time). 

Using "CONNECT BY level <=" isn't something new as there are many examples of this technique on the internet.  But for whatever reason it doesn't seem to be used that often in production code, instead mostly in code for generating data and/or presentations.

But why the CARDINALITY hint?  Without it the optimizer thinks only 1 row will be returned as DUAL has only 1 row and the optimizer doesn't try to determine the row count generated by "CONNECT BY …".  Tom Kyte has a great discussion about this hint at

Monday, April 18, 2011

Connection Total Mismatch between Listener and Database

Normally I don't spend a lot of time reviewing connection details in my client's listener logs, but I came across a situation recently where failover and connection distribution were being questioned, so I figured it'd be best to review listener log data just to be safe.  What I ended up finding was that connection counts as recorded in the listener logs was nearly twice that of what "logons cumulative" statistic was showing, which in my mind wasn't right.
To perform any type of listener log analysis, I first needed some sort of tool, as the system involved is a 4-node OLTP RAC with 100,000's of connections happening per day.  Fortunately Arup Nanda did most of this work for me, based on his DBAzine article on 9/16/08 entitled "Mining Information from the Listener Log".  I used his notes to create an External Table for pointing to a given listener log and an Oracle function for properly parsing out details within lines of the listener log.  (See the end of this article for SQL queries involved to help analyze the data.)
As for the hardware and database involved, the following uses simplified names of targets to help illustrate the problem:

  • 4-node RAC running RHEL 4.8 and Oracle
  • Nodes are named RAC1, RAC2, RAC3, and RAC4 with one 1 Instance of the database on each, named inst1, inst2, inst3, and inst4
  • 2 Oracle Services are defined to use Inst1 and Inst2 as a strictly OLTP set of balanced Instances (service "inst1and2") and use Inst3 and Inst4 as a strictly Batch set of balanced Instances (service "batch1and2").
Back to the problem, I found that Service "inst1and2" was getting 1,000's of connections per hour against Instances Inst3 and Inst4, which from my understanding couldn't happen.  I used "srvctl" as one way to validate this Service's definition, just to be sure:
% srvctl config service -d $db_unique_name -a

inst1and2 PREF: inst1 inst2 AVAIL:  TAF: NONE
batch1and2 PREF: inst3 inst4 AVAIL:  TAF: NONE

As a last check, I asked the Application team to send me the exact connection string they were using.  Sure enough, this pointed out the problem:

Testing this from my PC using "sqlplus user/pass@'' ", I reproduced the problem.  I found that RAC3 / 4's listener log recorded a connection:
18-APR-2011 09:52:03 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=inst1and2)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC))(CID=(PROGRAM=C:\oracle\product\10.2.0\db_1\bin\sqlplus.exe)(HOST=myserver123)(USER=ME))) * (ADDRESS=(PROTOCOL=tcp)(HOST=...)(PORT=1234)) * establish * inst1and2 * 0

… and also RAC1 / 2's listener log recorded a connection:
18-APR-2011 09:52:03 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=inst1and2)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC))(CID=(PROGRAM=C:\oracle\product\10.2.0\db_1\bin\sqlplus.exe)(HOST=myserver123)(USER=ME))(INSTANCE_NAME=inst1)) * (ADDRESS=(PROTOCOL=tcp)(HOST=...)(PORT=2345)) * establish * inst1and2 * 0

Notice the string "…INSTANCE_NAME…" in the listener log entry for RAC1 / 2.  According to MOS document 263599.1, when this is present it represents a redirected connection, which is true in the test case above.  Simple enough, except that this string DOESN'T occur in any of listener log entries in question.  Further down in the same MOS document, there's a short blurb about thin jdbc connections and that if these connections are redirected, there won't be any mention of this in the listener log, as in the string INSTANCE_NAME won't be present.
This matches my case, as all application servers involved are connecting via thin jdbc.  MOS document 263599.1 also points out that "lsnrctl services" can also point out connect redirects.  Checking RAC4 I see the following:

Service "inst1and2" has 2 instance(s).
  Instance "inst1", status READY, has 1 handler(s) for this service...
      "DEDICATED" established:658909 refused:0 state:ready
  Instance "inst2", status READY, has 1 handler(s) for this service...
      "DEDICATED" established:737610 refused:0 state:ready

For Service "inst1and2" on RAC4, 659,909 connections have been redirected to "inst1" and 737,610 connections have been redirected to "inst2", all since the listener was last restarted.
Although all of this helps account for any connection count discrepancies between the listener log and database statistic "logons cumulative", I'm left with having to check "lsnrctl services" and connect strings from application servers to help tie it all together.  It'd sure be a lot easier, at least to me, for Oracle to have included a message like "redirect" within the listener log entry to point out what happened.

SQL Code

The following is an example of a query I used to analyze listener log data via an external table.  I'm including this as I found a few cases where the data didn't line up as expected and I had to use a bit that knowledge to handle most of those cases.
SELECT min_log_date, max_log_date,
       CASE WHEN parse_listener_log_line_fnc(connect_string, 'HOST') IS NOT NULL AND
                 parse_listener_log_line_fnc(connect_string, 'HOST') NOT IN ('__jdbc__')
               THEN parse_listener_log_line_fnc(connect_string, 'HOST')
            ELSE REGEXP_REPLACE(REGEXP_REPLACE(protocol_info,
                                '(^.*HOST=)([0-9]+.[0-9]+.[0-9]+.[0-9]+).*\$', '\2'),
                                '(^.*HOST=)(lnx|racvip){1}([0-9]*\$', '\2\3')
       END source_machine
     , NVL(parse_listener_log_line_fnc(connect_string, 'PROGRAM'),
           parse_listener_log_line_fnc(connect_string, 'HOST'))
     , CASE WHEN connect_string IN ('ping', 'service_update', 'status') THEN connect_string
            WHEN protocol_info IN ('services', 'status') THEN protocol_info
            ELSE action
       END operation
     , NVL(service_name, protocol_info) service, return_code, COUNT(*)
  FROM listener_log_tb
     , (SELECT MIN(log_date) min_log_date, MAX(log_date) max_log_date
          FROM listener_log_tb)
 GROUP BY min_log_date, max_log_date
        , CASE WHEN parse_listener_log_line_fnc(connect_string, 'HOST') IS NOT NULL AND
                    parse_listener_log_line_fnc(connect_string, 'HOST') NOT IN ('__jdbc__')
                  THEN parse_listener_log_line_fnc(connect_string, 'HOST')
               ELSE REGEXP_REPLACE(REGEXP_REPLACE(protocol_info,
                                   '(^.*HOST=)([0-9]+.[0-9]+.[0-9]+.[0-9]+).*\$', '\2'),
                                   '(^.*HOST=)(lnx|racvip){1}([0-9]*\$', '\2\3')
        , NVL(parse_listener_log_line_fnc(connect_string, 'PROGRAM'),
              parse_listener_log_line_fnc(connect_string, 'HOST'))
        , CASE WHEN connect_string IN ('ping', 'service_update', 'status') THEN connect_string
               WHEN protocol_info IN ('services', 'status') THEN protocol_info
               ELSE action
        , NVL(service_name, protocol_info), return_code;