Friday, October 21, 2011

Generating Extra Rows and the Right Cardinality

Periodically I find code executed on databases I support that is used to generate x number of extra rows, normally to produce a set of literal values as a table.  For example, recently I found the following embedded as an inline view in a rather large query:

(SELECT LAST_DAY(ADD_MONTHS(:in_date, -1)) work_date, -1 rws_back
   FROM dual UNION
 SELECT LAST_DAY(ADD_MONTHS(:in_date, -2)), -2
   FROM dual UNION
 SELECT LAST_DAY(ADD_MONTHS(:in_date, -3)), -3
   FROM dual UNION
 SELECT LAST_DAY(ADD_MONTHS(:in_date, -13)), -13
   FROM dual) in_dt_view

The idea was to generate a 13-row table, with each row holding a date counting backwards 1 month from the previous row.  Those 26 lines of SQL could have easily been written as:

(SELECT /*+ CARDINALITY(d 13) */ LAST_DAY(ADD_MONTHS(sysdate, (level*-1))) work_date
      , (level*-1) rws_back
   FROM dual d
CONNECT BY level <= 13) in_dt_view

Using "CONNECT BY level <=" is a handy way of generating x number of rows for any number of purposes, not only in situations like the above one but also for generating data when conducting tests.  This change isn't meant to tune the query (it does take a smidge less Shared Pool space) but make it easier to write (imagine if they needed 48 months back in time). 

Using "CONNECT BY level <=" isn't something new as there are many examples of this technique on the internet.  But for whatever reason it doesn't seem to be used that often in production code, instead mostly in code for generating data and/or presentations.

But why the CARDINALITY hint?  Without it the optimizer thinks only 1 row will be returned as DUAL has only 1 row and the optimizer doesn't try to determine the row count generated by "CONNECT BY …".  Tom Kyte has a great discussion about this hint at