OR Dear Me!  When Cost Based OR  Expansion Gets it Wrong – Part 3

…continued from Part 1 and Part 2.

OK, let’s return to original query before with no index in place and lets review the optimizer’s estimate of how many rows will be returned, and try to deduce where it came from. Note throughout this article, I will be referring to sections in Jonathan Lewis‘s essential book, Cost-Based Oracle Fundamentals (CBOF).

SQL> CREATE TABLE t (c NUMBER(2, 0) NOT NULL,
  2*                 p VARCHAR2(4000));
Table T created.

SQL> INSERT /*+append*/ INTO t
  2      SELECT
  3          mod(ROWNUM, 100) c,
  4          rpad('X', 4000)  padding
  5      FROM
  6          dual
  7      CONNECT BY
  8*         ROWNUM <= 100000;

100 000 rows inserted.

SQL> exec dbms_stats.gather_table_stats(null, 't', method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.

SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE
  2          (c BETWEEN  0 AND  1) OR (c BETWEEN  2 AND  3) OR (c BETWEEN  4 AND  5) OR (c BETWEEN  6 AND  7) OR (c BETWEEN  8 AND  9)
  3       OR (c BETWEEN 10 AND 11) OR (c BETWEEN 12 AND 13) OR (c BETWEEN 14 AND 15) OR (c BETWEEN 16 AND 17) OR (c BETWEEN 18 AND 19)
  4       OR (c BETWEEN 20 AND 21) OR (c BETWEEN 22 AND 23) OR (c BETWEEN 24 AND 25) OR (c BETWEEN 26 AND 27) OR (c BETWEEN 28 AND 29)
  5       OR (c BETWEEN 30 AND 31) OR (c BETWEEN 32 AND 33) OR (c BETWEEN 34 AND 35) OR (c BETWEEN 36 AND 37) OR (c BETWEEN 38 AND 39)
  6       OR (c BETWEEN 40 AND 41) OR (c BETWEEN 42 AND 43) OR (c BETWEEN 44 AND 45) OR (c BETWEEN 46 AND 47) OR (c BETWEEN 48 AND 49)
  7       OR (c BETWEEN 50 AND 51) OR (c BETWEEN 52 AND 53) OR (c BETWEEN 54 AND 55) OR (c BETWEEN 56 AND 57) OR (c BETWEEN 58 AND 59)
  8       OR (c BETWEEN 60 AND 61) OR (c BETWEEN 62 AND 63) OR (c BETWEEN 64 AND 65) OR (c BETWEEN 66 AND 67) OR (c BETWEEN 68 AND 69)
  9       OR (c BETWEEN 70 AND 71) OR (c BETWEEN 72 AND 73) OR (c BETWEEN 74 AND 75) OR (c BETWEEN 76 AND 77) OR (c BETWEEN 78 AND 79)
 10       OR (c BETWEEN 80 AND 81) OR (c BETWEEN 82 AND 83) OR (c BETWEEN 84 AND 85) OR (c BETWEEN 86 AND 87) OR (c BETWEEN 88 AND 89)
 11*      OR (c BETWEEN 90 AND 91) OR (c BETWEEN 92 AND 93) OR (c BETWEEN 94 AND 95) OR (c BETWEEN 96 AND 97) OR (c BETWEEN 98 AND 99);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost -predicate');
...
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:01.18 |     100K|    100K|
|   1 |  HASH UNIQUE       |      |      1 |      1 |      1 |00:00:01.18 |     100K|    100K|
|   2 |   TABLE ACCESS FULL| T    |      1 |  77857 |    100K|00:00:01.35 |     100K|    100K|
----------------------------------------------------------------------------------------------

I’m going to refer to the individual predicates as Pnn, so P1 is “c BETWEEN 0 AND 1”, P2 is “C BETWEEN 2 AND 3” and so on up to P50 which is “c BETWEEN 98 AND 99”. My aim is to show why the optimizer estimated 77,857 rows for (P1 OR P2 OR … OR P50).

Let’s first check what information the database has about the table T and column C from database statistics.

SELECT num_rows FROM user_tables WHERE table_name = 'T';

   NUM_ROWS
___________
     100000


SQL> SELECT histogram,
  2      num_nulls,
  3      num_distinct,
  4      utl_raw.cast_to_number(low_value)  low_value,
  5      utl_raw.cast_to_number(high_value) high_value
  6  FROM
  7      user_tab_cols
  8  WHERE
  9          table_name = 'T'
 10*     AND column_name = 'C';

   HISTOGRAM    NUM_NULLS    NUM_DISTINCT    LOW_VALUE    HIGH_VALUE
____________ ____________ _______________ ____________ _____________
NONE                    0             100            0            99

I’ll be referring to value from NUM_ROWS, NUM_DISTINCT, LOW_VALUE and HIGH_VALUE throughout this article, in addition I will be rounding some numbers as we go along as the optimizer does.

We’re going to be applying a standard procedure of divide and conquer for these calculations, so let’s start with P1.

SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE (c BETWEEN  0 AND  1);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost');

                                                                                PLAN_TABLE_OUTPUT
_________________________________________________________________________________________________
SQL_ID  9cnx92h4yupdc, child number 0
-------------------------------------
SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE (c
...
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.64 |     100K|    100K|
|   1 |  HASH UNIQUE       |      |      1 |      1 |      1 |00:00:00.64 |     100K|    100K|
|*  2 |   TABLE ACCESS FULL| T    |      1 |   2010 |   2000 |00:00:00.16 |     100K|    100K|
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("C"<=1 AND "C">=0))

As you can see from the Predicate Information section, Oracle has split P1 into two predicates “C >= 0” and “C <= 1”. And as the optimizer knows that C >= LOW_VALUE should match all rows, it effectively ignores this and uses “C <= 1”.

CBOR (p53 case 4) gives us the formula for selectivity, and cardinality into which we can insert the values from the statistics.

Sel(P1) = Sel("C <= 1")
(1 - LOW_VALUE)/(HIGH_VALUE-LOW_VALUE)+1/NUM_DISTINCT
=(1-0)/(99-0)+1/100
=0.020101

Card(P1)=Sel(P1)*NUM_ROWS
=0.020101*100000
=2010

Let's move to P2.
SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE (c between 2 and 3);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost -predicate');

                                                                                PLAN_TABLE_OUTPUT
_________________________________________________________________________________________________
...
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.66 |     100K|    100K|
|   1 |  HASH UNIQUE       |      |      1 |      1 |      1 |00:00:00.66 |     100K|    100K|
|*  2 |   TABLE ACCESS FULL| T    |      1 |   3010 |   2000 |00:00:01.17 |     100K|    100K|
----------------------------------------------------------------------------------------------

This is CBOR p53 case 6, so selectivity and cardinality are calculated as follows

Sel(P2) = (3-2)/(HIGH_VALUE-LOW_VALUE)+2/NUM_DISTINCT
=1/99+2/100
=0.030101

Card(P2)
=Sel(P2)*NUM_ROWS
=0.030101*100000
=3010

So next we need Card(P1 OR P2). You might think that the optimizer would simply add the estimates of 2,010 rows from the P1 and 3,010 rows from the P2 but we can see that it doesn’t.

SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE (c BETWEEN  0 AND  1) OR (c BETWEEN 2 AND 3);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost -predicate');
...
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.68 |     100K|    100K|
|   1 |  HASH UNIQUE       |      |      1 |      1 |      1 |00:00:00.68 |     100K|    100K|
|*  2 |   TABLE ACCESS FULL| T    |      1 |   4960 |   3000 |00:00:00.63 |     100K|    100K|
----------------------------------------------------------------------------------------------

The reason for this is that while it might be obvious to us that any rows matched by P1 will not be matched by P2, the optimizer doesn’t know this so it has to use the following calculations (CBOF p56)

Sel(P1 OR P2) 
=Sel(P1)+Sel(P2)-Sel(P1)*Sel(P2)
=0.030101+0.020101-0.030101*0.020101
=0.049596

Card(P1 OR P2)
=Sel(P1 OR P2)* NUM_ROWS
=0.049596*100000
=4960

Calculation for P3 is pretty much the same as for P2 and we get SEL(P3)=0.030101

And also following the same pattern in P2 we get

Sel(P1 OR P2 OR P3)
=Sel(P1 OR P2)+Sel(P3)-Sel(P1 OR P2)*Sel(P3)
=0.049596+0.030101-0.049596*0.030101
=0.078204

Card(P1 OR P2 OR P3)=
Sel(P1 OR P2 OR P3)*NUM_ROWS
=0.078204*100000
=7821

This matches the estimate that the optimizer shows us for the query

----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.68 |     100K|    100K|
|   1 |  HASH UNIQUE       |      |      1 |      1 |      1 |00:00:00.68 |     100K|    100K|
|   2 |   TABLE ACCESS FULL| T    |      1 |   7821 |   6000 |00:00:00.20 |     100K|    100K|
----------------------------------------------------------------------------------------------

This pattern repeats all the way up to P1 OR P2 … P49 (I used a spreadsheet) to give us

SEL(P1 OR ... P49)
=0.77403

CARD(P1 OR ... P49)
=0.77403*100000
=77403

This matches what the optimizer tells us it estimates for this query

--------------------------------------------
| Id  | Operation          | Name | E-Rows |
--------------------------------------------
|   0 | SELECT STATEMENT   |      |        |
|   1 |  HASH UNIQUE       |      |      1 |
|   2 |   TABLE ACCESS FULL| T    |  77403 |
--------------------------------------------

Finally we come to P50, and this is the same edge case that P1 was so we get SEL(P50)=0.020101 and the calcuations are as follows

SEL(P1 OR ... OR P49 OR P50)
= SEL(P1 OR ... OR P49) + SEL(P50) - SEL(P1 OR ... P49)* SEL(P50)
= 0.77403 + 0.020101 - 0.77403 * 0.020101
= 0.778572

CARD(P1 OR ... OR P49 OR P50)
=SEL(P1 OR ... OR P49 OR P50)*NUM_ROWS
= 0.778572*10000
= 77857

Fortunately this matches the optimizer has said is it’s estimated cardinality for the full query.

I think it’s useful at this point to compare how the estimated number of rows deviates from the actual number of rows matched as the number of predicates increases. This is shown in the chart below, and you can see that with less than 30 predicates the optimizer overestimates the number of rows returned, and that as the we get more than 30 predicates this turns into an under-estimation.

In the next article, we will compare this behaviour with the case where Cost Based Or Expansion occurs.

OR Dear Me!  When Cost Based OR  Expansion Gets it Wrong – Part 2

I promised in Part 1 that next we would drill into where the cardinality estimate for the non-transformed query comes about, but, never having been someone to refrain from kicking the Cost Based Optimizer when it’s down, I realised i can fool it even better with some skewed data, and a histogram. Note I am testing on 21c but I believe behaviour should reproduce on any version after 12c.

SQL> CREATE TABLE t (
  2      c NUMBER(2, 0) NOT NULL,
  3      p VARCHAR2(4000)
  4* );

Table T created.

SQL> INSERT /*+append*/ INTO t
  2      SELECT
  3          CASE WHEN ROWNUM <= 100000 THEN mod(ROWNUM, 100)
  4          ELSE 99 END c,
  5          rpad('X', 4000)  padding
  6      FROM
  7          dual
  8      CONNECT BY
  9*         ROWNUM <= 199000;

199 000 rows inserted.

SQL> CREATE INDEX i ON t (c);

Index I created.

SQL> EXEC dbms_stats.gather_table_stats(null, 't', method_opt=>'for all columns size 100');

PL/SQL procedure successfully completed.

So we have a table with 199,000 rows, of which for each value of c between 0 and 98 there are 1,000 rows in the table, and the for the remaining 100,000 then c equals 99. In addition there is a a histogram with 100 buckets on c, so that the optimizer has perfect knowledge of this data distribution. Check out how this affects the execution plan and cardinality estimates for our query now.

SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE
  2          (c BETWEEN  0 AND  1) OR (c BETWEEN  2 AND  3) OR (c BETWEEN  4 AND  5) OR (c BETWEEN  6 AND  7) OR (c BETWEEN  8 AND  9)
  3       OR (c BETWEEN 10 AND 11) OR (c BETWEEN 12 AND 13) OR (c BETWEEN 14 AND 15) OR (c BETWEEN 16 AND 17) OR (c BETWEEN 18 AND 19)
  4       OR (c BETWEEN 20 AND 21) OR (c BETWEEN 22 AND 23) OR (c BETWEEN 24 AND 25) OR (c BETWEEN 26 AND 27) OR (c BETWEEN 28 AND 29)
  5       OR (c BETWEEN 30 AND 31) OR (c BETWEEN 32 AND 33) OR (c BETWEEN 34 AND 35) OR (c BETWEEN 36 AND 37) OR (c BETWEEN 38 AND 39)
  6       OR (c BETWEEN 40 AND 41) OR (c BETWEEN 42 AND 43) OR (c BETWEEN 44 AND 45) OR (c BETWEEN 46 AND 47) OR (c BETWEEN 48 AND 49)
  7       OR (c BETWEEN 50 AND 51) OR (c BETWEEN 52 AND 53) OR (c BETWEEN 54 AND 55) OR (c BETWEEN 56 AND 57) OR (c BETWEEN 58 AND 59)
  8       OR (c BETWEEN 60 AND 61) OR (c BETWEEN 62 AND 63) OR (c BETWEEN 64 AND 65) OR (c BETWEEN 66 AND 67) OR (c BETWEEN 68 AND 69)
  9       OR (c BETWEEN 70 AND 71) OR (c BETWEEN 72 AND 73) OR (c BETWEEN 74 AND 75) OR (c BETWEEN 76 AND 77) OR (c BETWEEN 78 AND 79)
 10       OR (c BETWEEN 80 AND 81) OR (c BETWEEN 82 AND 83) OR (c BETWEEN 84 AND 85) OR (c BETWEEN 86 AND 87) OR (c BETWEEN 88 AND 89)
 11*      OR (c BETWEEN 90 AND 91) OR (c BETWEEN 92 AND 93) OR (c BETWEEN 94 AND 95) OR (c BETWEEN 96 AND 97) OR (c BETWEEN 98 AND 99);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost -predicate');
                                                                                                               PLAN_TABLE_OUTPUT
________________________________________________________________________________________________________________________________
SQL_ID  1hwut074tng4n, child number 0
-------------------------------------
...
-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |      1 |        |      1 |00:00:06.81 |     199K|    199K|
|   1 |  HASH UNIQUE                           |                 |      1 |      1 |      1 |00:00:06.81 |     199K|    199K|
|   2 |   VIEW                                 | VW_ORE_1B35BA0F |      1 |  25785 |    199K|00:00:12.57 |     199K|    199K|
|   3 |    UNION-ALL                           |                 |      1 |        |    199K|00:00:09.03 |     199K|    199K|
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   2001 |   2000 |00:00:00.87 |    2005 |   1998 |
|   5 |      INDEX RANGE SCAN                  | I               |      1 |   2001 |   2000 |00:00:00.02 |       5 |      0 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1980 |   2000 |00:00:01.44 |    2006 |   1998 |
|   7 |      INDEX RANGE SCAN                  | I               |      1 |   1980 |   2000 |00:00:00.02 |       6 |      0 |
|   8 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1940 |   2000 |00:00:01.46 |    2006 |   1998 |
|   9 |      INDEX RANGE SCAN                  | I               |      1 |   1940 |   2000 |00:00:00.02 |       6 |      0 |
|  10 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1883 |   2000 |00:00:01.33 |    2006 |   1998 |
|  11 |      INDEX RANGE SCAN                  | I               |      1 |   1883 |   2000 |00:00:00.02 |       6 |      0 |
|  12 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1810 |   2000 |00:00:01.30 |    2006 |   1998 |
|  13 |      INDEX RANGE SCAN                  | I               |      1 |   1810 |   2000 |00:00:00.02 |       6 |      0 |
|  14 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1722 |   2000 |00:00:01.34 |    2006 |   1998 |
|  15 |      INDEX RANGE SCAN                  | I               |      1 |   1722 |   2000 |00:00:00.02 |       6 |      0 |
|  16 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1624 |   2000 |00:00:01.32 |    2006 |   1998 |
|  17 |      INDEX RANGE SCAN                  | I               |      1 |   1624 |   2000 |00:00:00.02 |       6 |      0 |
|  18 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1516 |   2000 |00:00:01.30 |    2006 |   1998 |
|  19 |      INDEX RANGE SCAN                  | I               |      1 |   1516 |   2000 |00:00:00.02 |       6 |      0 |
|  20 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1403 |   2000 |00:00:01.34 |    2006 |   1998 |
|  21 |      INDEX RANGE SCAN                  | I               |      1 |   1403 |   2000 |00:00:00.02 |       6 |      0 |
|  22 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1286 |   2000 |00:00:01.36 |    2005 |   1998 |
|  23 |      INDEX RANGE SCAN                  | I               |      1 |   1286 |   2000 |00:00:00.02 |       5 |      0 |
|  24 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1169 |   2000 |00:00:01.48 |    2006 |   1998 |
|  25 |      INDEX RANGE SCAN                  | I               |      1 |   1169 |   2000 |00:00:00.02 |       6 |      0 |
|  26 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1053 |   2000 |00:00:01.52 |    2006 |   1998 |
|  27 |      INDEX RANGE SCAN                  | I               |      1 |   1053 |   2000 |00:00:00.02 |       6 |      0 |
|  28 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    940 |   2000 |00:00:01.59 |    2006 |   1998 |
|  29 |      INDEX RANGE SCAN                  | I               |      1 |    940 |   2000 |00:00:00.02 |       6 |      0 |
|  30 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    832 |   2000 |00:00:01.82 |    2006 |   2004 |
|  31 |      INDEX RANGE SCAN                  | I               |      1 |    832 |   2000 |00:00:00.02 |       6 |      4 |
|  32 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    730 |   2000 |00:00:01.49 |    2006 |   2004 |
|  33 |      INDEX RANGE SCAN                  | I               |      1 |    730 |   2000 |00:00:00.02 |       6 |      4 |
|  34 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    635 |   2000 |00:00:01.49 |    2006 |   2004 |
|  35 |      INDEX RANGE SCAN                  | I               |      1 |    635 |   2000 |00:00:00.02 |       6 |      4 |
|  36 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    549 |   2000 |00:00:01.44 |    2006 |   2004 |
|  37 |      INDEX RANGE SCAN                  | I               |      1 |    549 |   2000 |00:00:00.02 |       6 |      4 |
|  38 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    470 |   2000 |00:00:01.44 |    2006 |   2004 |
|  39 |      INDEX RANGE SCAN                  | I               |      1 |    470 |   2000 |00:00:00.02 |       6 |      4 |
|  40 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    399 |   2000 |00:00:01.51 |    2006 |   2004 |
|  41 |      INDEX RANGE SCAN                  | I               |      1 |    399 |   2000 |00:00:00.02 |       6 |      4 |
|  42 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    337 |   2000 |00:00:01.45 |    2005 |   2003 |
|  43 |      INDEX RANGE SCAN                  | I               |      1 |    337 |   2000 |00:00:00.03 |       5 |      3 |
|  44 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    282 |   2000 |00:00:01.45 |    2006 |   2004 |
|  45 |      INDEX RANGE SCAN                  | I               |      1 |    282 |   2000 |00:00:00.02 |       6 |      4 |
|  46 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    235 |   2000 |00:00:01.41 |    2006 |   2004 |
|  47 |      INDEX RANGE SCAN                  | I               |      1 |    235 |   2000 |00:00:00.02 |       6 |      4 |
|  48 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    194 |   2000 |00:00:01.47 |    2006 |   2004 |
|  49 |      INDEX RANGE SCAN                  | I               |      1 |    194 |   2000 |00:00:00.03 |       6 |      4 |
|  50 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    159 |   2000 |00:00:01.50 |    2006 |   2004 |
|  51 |      INDEX RANGE SCAN                  | I               |      1 |    159 |   2000 |00:00:00.02 |       6 |      4 |
|  52 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    129 |   2000 |00:00:01.48 |    2006 |   2004 |
|  53 |      INDEX RANGE SCAN                  | I               |      1 |    129 |   2000 |00:00:00.02 |       6 |      4 |
|  54 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    105 |   2000 |00:00:01.45 |    2006 |   2004 |
|  55 |      INDEX RANGE SCAN                  | I               |      1 |    105 |   2000 |00:00:00.02 |       6 |      4 |
|  56 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     84 |   2000 |00:00:01.48 |    2006 |   2004 |
|  57 |      INDEX RANGE SCAN                  | I               |      1 |     84 |   2000 |00:00:00.02 |       6 |      4 |
|  58 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     67 |   2000 |00:00:01.44 |    2006 |   2004 |
|  59 |      INDEX RANGE SCAN                  | I               |      1 |     67 |   2000 |00:00:00.03 |       6 |      4 |
|  60 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     54 |   2000 |00:00:01.40 |    2006 |   2004 |
|  61 |      INDEX RANGE SCAN                  | I               |      1 |     54 |   2000 |00:00:00.02 |       6 |      4 |
|  62 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     42 |   2000 |00:00:01.52 |    2006 |   2004 |
|  63 |      INDEX RANGE SCAN                  | I               |      1 |     42 |   2000 |00:00:00.02 |       6 |      4 |
|  64 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     33 |   2000 |00:00:01.46 |    2005 |   2003 |
|  65 |      INDEX RANGE SCAN                  | I               |      1 |     33 |   2000 |00:00:00.02 |       5 |      3 |
|  66 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     26 |   2000 |00:00:01.48 |    2006 |   2004 |
|  67 |      INDEX RANGE SCAN                  | I               |      1 |     26 |   2000 |00:00:00.02 |       6 |      4 |
|  68 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     20 |   2000 |00:00:01.42 |    2006 |   2004 |
|  69 |      INDEX RANGE SCAN                  | I               |      1 |     20 |   2000 |00:00:00.03 |       6 |      4 |
|  70 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     16 |   2000 |00:00:01.42 |    2006 |   2004 |
|  71 |      INDEX RANGE SCAN                  | I               |      1 |     16 |   2000 |00:00:00.02 |       6 |      4 |
|  72 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     12 |   2000 |00:00:01.50 |    2006 |   2004 |
|  73 |      INDEX RANGE SCAN                  | I               |      1 |     12 |   2000 |00:00:00.02 |       6 |      4 |
|  74 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      9 |   2000 |00:00:01.53 |    2006 |   2004 |
|  75 |      INDEX RANGE SCAN                  | I               |      1 |      9 |   2000 |00:00:00.02 |       6 |      4 |
|  76 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      7 |   2000 |00:00:01.48 |    2006 |   2004 |
|  77 |      INDEX RANGE SCAN                  | I               |      1 |      7 |   2000 |00:00:00.02 |       6 |      4 |
|  78 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      5 |   2000 |00:00:01.71 |    2006 |   2004 |
|  79 |      INDEX RANGE SCAN                  | I               |      1 |      5 |   2000 |00:00:00.04 |       6 |      4 |
|  80 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      4 |   2000 |00:00:01.59 |    2006 |   2004 |
|  81 |      INDEX RANGE SCAN                  | I               |      1 |      4 |   2000 |00:00:00.02 |       6 |      4 |
|  82 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      3 |   2000 |00:00:01.43 |    2006 |   2004 |
|  83 |      INDEX RANGE SCAN                  | I               |      1 |      3 |   2000 |00:00:00.03 |       6 |      4 |
|  84 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      2 |   2000 |00:00:01.44 |    2006 |   2004 |
|  85 |      INDEX RANGE SCAN                  | I               |      1 |      2 |   2000 |00:00:00.02 |       6 |      4 |
|  86 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      2 |   2000 |00:00:01.42 |    2005 |   2003 |
|  87 |      INDEX RANGE SCAN                  | I               |      1 |      2 |   2000 |00:00:00.02 |       5 |      3 |
|  88 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.42 |    2006 |   2004 |
|  89 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  90 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.34 |    2006 |   2004 |
|  91 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.05 |       6 |      4 |
|  92 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.50 |    2006 |   2004 |
|  93 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  94 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.43 |    2006 |   2004 |
|  95 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      4 |
|  96 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.52 |    2006 |   2004 |
|  97 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      4 |
|  98 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.51 |    2006 |   2004 |
|  99 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      4 |
| 100 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.38 |    2006 |   2004 |
| 101 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      4 |
| 102 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      9 |    101K|00:00:03.62 |     101K|    101K|
| 103 |      INDEX RANGE SCAN                  | I               |      1 |      9 |    101K|00:00:01.25 |     199 |    197 |
-----------------------------------------------------------------------------------------------------------------------------

So although all 199,000 rows from the table are actually examined (A-Rows), the estimate (E-Rows) was that only 25,785 would be. This discrepancy is starting to become problematic.

Pay particular attention to the last two lines in the execution plan. We can see (A-Rows) column that 101K rows were retrieved, so can deduce that these lines correspond to the (c BETWEEN 98 and 99) predicate. The optimizer has estimated that only 9 rows will be retrieved and has done so using the index.

If I execute the query with that single predicate then the cardinality estimate is correct, and the optimizer (sensibly) choses to do a full table scan rather than using the index.

SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE (c BETWEEN 98 AND 99);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost -predicate');

                                                                                PLAN_TABLE_OUTPUT
_________________________________________________________________________________________________
...
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:01.74 |     199K|    199K|
|   1 |  HASH UNIQUE       |      |      1 |      1 |      1 |00:00:01.74 |     199K|    199K|
|   2 |   TABLE ACCESS FULL| T    |      1 |    101K|    101K|00:00:02.38 |     199K|    199K|
----------------------------------------------------------------------------------------------


15 rows selected.

In the next article we will return to the simpler case without histograms, and start to examine where the cardinality estimates are calculated.

OR Dear Me!  When Cost Based OR  Expansion Gets it Wrong – Part 1

The Cost Based OR Expansion transformation allows the optimizer to generate execution plans which often perform better than could be otherwise achieved. But sometimes, as in a case I have been recently investigating, it can cause things to go wrong.  I have modeled the behavior and will demonstrate in it this article.  Planned subsequent articles will drill into the calculations both with and without the transformation, and potential work-arounds.
I’m not going to waste time on describing the basics of the transformation itself, if that’s what you need I suggest you check out articles by Nigel Bayliss, Nenad Noveljic and Mohamed Houri.

SQL> CREATE TABLE t (c NUMBER(2, 0) NOT NULL,
  2*                 p VARCHAR2(4000));
Table T created.

SQL> INSERT /*+append*/ INTO t
  2      SELECT
  3          mod(ROWNUM, 100) c,
  4          rpad('X', 4000)  padding
  5      FROM
  6          dual
  7      CONNECT BY
  8*         ROWNUM <= 100000;

100 000 rows inserted.

SQL> exec dbms_stats.gather_table_stats(null, 't', method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.

So we have a table containing 100,000 rows, and that column C contains 100 distinct values, ranging from 0 to 99.  For simplicity I have not created  histogram on C, but the existance of a histogram does not significantly affect the behaviour I will demonstrate.

SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE
  2          (c BETWEEN  0 AND  1) OR (c BETWEEN  2 AND  3) OR (c BETWEEN  4 AND  5) OR (c BETWEEN  6 AND  7) OR (c BETWEEN  8 AND  9)
  3       OR (c BETWEEN 10 AND 11) OR (c BETWEEN 12 AND 13) OR (c BETWEEN 14 AND 15) OR (c BETWEEN 16 AND 17) OR (c BETWEEN 18 AND 19)
  4       OR (c BETWEEN 20 AND 21) OR (c BETWEEN 22 AND 23) OR (c BETWEEN 24 AND 25) OR (c BETWEEN 26 AND 27) OR (c BETWEEN 28 AND 29)
  5       OR (c BETWEEN 30 AND 31) OR (c BETWEEN 32 AND 33) OR (c BETWEEN 34 AND 35) OR (c BETWEEN 36 AND 37) OR (c BETWEEN 38 AND 39)
  6       OR (c BETWEEN 40 AND 41) OR (c BETWEEN 42 AND 43) OR (c BETWEEN 44 AND 45) OR (c BETWEEN 46 AND 47) OR (c BETWEEN 48 AND 49)
  7       OR (c BETWEEN 50 AND 51) OR (c BETWEEN 52 AND 53) OR (c BETWEEN 54 AND 55) OR (c BETWEEN 56 AND 57) OR (c BETWEEN 58 AND 59)
  8       OR (c BETWEEN 60 AND 61) OR (c BETWEEN 62 AND 63) OR (c BETWEEN 64 AND 65) OR (c BETWEEN 66 AND 67) OR (c BETWEEN 68 AND 69)
  9       OR (c BETWEEN 70 AND 71) OR (c BETWEEN 72 AND 73) OR (c BETWEEN 74 AND 75) OR (c BETWEEN 76 AND 77) OR (c BETWEEN 78 AND 79)
 10       OR (c BETWEEN 80 AND 81) OR (c BETWEEN 82 AND 83) OR (c BETWEEN 84 AND 85) OR (c BETWEEN 86 AND 87) OR (c BETWEEN 88 AND 89)
 11*      OR (c BETWEEN 90 AND 91) OR (c BETWEEN 92 AND 93) OR (c BETWEEN 94 AND 95) OR (c BETWEEN 96 AND 97) OR (c BETWEEN 98 AND 99);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost -predicate');

...

----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:01.18 |     100K|    100K|
|   1 |  HASH UNIQUE       |      |      1 |      1 |      1 |00:00:01.18 |     100K|    100K|
|   2 |   TABLE ACCESS FULL| T    |      1 |  77857 |    100K|00:00:01.35 |     100K|    100K|
----------------------------------------------------------------------------------------------

The optimizer estimated (E-Rows) that only 77,857 rows would need to be retrieved, whereas in reality (as should be obvious by comparing the query with the data) all 100,000 rows needed to be retrieved. While the estimate is not perfect, it’s reasonably close,  and as the database doesn’t have any alternatives choses to perform a full table scan, on my test system completing in about a second.

Contrast that behaviour with what happens after an index is created.

SQL> CREATE INDEX i ON t (c);

Index I created.

SQL> SELECT /*+gather_plan_statistics*/ DISTINCT trim(p) p FROM t WHERE
  2         (c BETWEEN  0 AND  1) OR (c BETWEEN  2 AND  3) OR (c BETWEEN  4 AND  5) OR (c BETWEEN  6 AND  7) OR (c BETWEEN  8 AND  9)
  3      OR (c BETWEEN 10 AND 11) OR (c BETWEEN 12 AND 13) OR (c BETWEEN 14 AND 15) OR (c BETWEEN 16 AND 17) OR (c BETWEEN 18 AND 19)
  4      OR (c BETWEEN 20 AND 21) OR (c BETWEEN 22 AND 23) OR (c BETWEEN 24 AND 25) OR (c BETWEEN 26 AND 27) OR (c BETWEEN 28 AND 29)
  5      OR (c BETWEEN 30 AND 31) OR (c BETWEEN 32 AND 33) OR (c BETWEEN 34 AND 35) OR (c BETWEEN 36 AND 37) OR (c BETWEEN 38 AND 39)
  6      OR (c BETWEEN 40 AND 41) OR (c BETWEEN 42 AND 43) OR (c BETWEEN 44 AND 45) OR (c BETWEEN 46 AND 47) OR (c BETWEEN 48 AND 49)
  7      OR (c BETWEEN 50 AND 51) OR (c BETWEEN 52 AND 53) OR (c BETWEEN 54 AND 55) OR (c BETWEEN 56 AND 57) OR (c BETWEEN 58 AND 59)
  8      OR (c BETWEEN 60 AND 61) OR (c BETWEEN 62 AND 63) OR (c BETWEEN 64 AND 65) OR (c BETWEEN 66 AND 67) OR (c BETWEEN 68 AND 69)
  9      OR (c BETWEEN 70 AND 71) OR (c BETWEEN 72 AND 73) OR (c BETWEEN 74 AND 75) OR (c BETWEEN 76 AND 77) OR (c BETWEEN 78 AND 79)
 10      OR (c BETWEEN 80 AND 81) OR (c BETWEEN 82 AND 83) OR (c BETWEEN 84 AND 85) OR (c BETWEEN 86 AND 87) OR (c BETWEEN 88 AND 89)
 11*     OR (c BETWEEN 90 AND 91) OR (c BETWEEN 92 AND 93) OR (c BETWEEN 94 AND 95) OR (c BETWEEN 96 AND 97) OR (c BETWEEN 98 AND 99);

   P
____
X

SQL> SELECT * FROM dbms_xplan.display_cursor(format=>'allstats last -bytes -cost -predicate');
                                          
...

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |      1 |        |      1 |00:00:05.81 |     100K|    100K|
|   1 |  HASH UNIQUE                           |                 |      1 |      1 |      1 |00:00:05.81 |     100K|    100K|
|   2 |   VIEW                                 | VW_ORE_1B35BA0F |      1 |  26653 |    100K|00:00:10.36 |     100K|    100K|
|   3 |    UNION-ALL                           |                 |      1 |        |    100K|00:00:08.46 |     100K|    100K|
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   2010 |   2000 |00:00:01.52 |    2005 |   2004 |
|   5 |      INDEX RANGE SCAN                  | I               |      1 |   2010 |   2000 |00:00:00.02 |       5 |      4 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   2950 |   2000 |00:00:02.81 |    2006 |   2004 |
|   7 |      INDEX RANGE SCAN                  | I               |      1 |   2950 |   2000 |00:00:00.02 |       6 |      4 |
|   8 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   2832 |   2000 |00:00:02.35 |    2006 |   2004 |
|   9 |      INDEX RANGE SCAN                  | I               |      1 |   2832 |   2000 |00:00:00.02 |       6 |      4 |
|  10 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   2666 |   2000 |00:00:02.30 |    2006 |   2004 |
|  11 |      INDEX RANGE SCAN                  | I               |      1 |   2666 |   2000 |00:00:00.02 |       6 |      4 |
|  12 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   2462 |   2000 |00:00:01.71 |    2006 |   2004 |
|  13 |      INDEX RANGE SCAN                  | I               |      1 |   2462 |   2000 |00:00:00.02 |       6 |      4 |
|  14 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   2231 |   2000 |00:00:01.95 |    2006 |   2004 |
|  15 |      INDEX RANGE SCAN                  | I               |      1 |   2231 |   2000 |00:00:00.05 |       6 |      4 |
|  16 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1985 |   2000 |00:00:01.84 |    2006 |   2004 |
|  17 |      INDEX RANGE SCAN                  | I               |      1 |   1985 |   2000 |00:00:00.02 |       6 |      4 |
|  18 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1736 |   2000 |00:00:01.71 |    2006 |   2004 |
|  19 |      INDEX RANGE SCAN                  | I               |      1 |   1736 |   2000 |00:00:00.02 |       6 |      4 |
|  20 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1492 |   2000 |00:00:02.08 |    2006 |   2004 |
|  21 |      INDEX RANGE SCAN                  | I               |      1 |   1492 |   2000 |00:00:00.02 |       6 |      4 |
|  22 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1262 |   2000 |00:00:01.93 |    2005 |   2003 |
|  23 |      INDEX RANGE SCAN                  | I               |      1 |   1262 |   2000 |00:00:00.02 |       5 |      3 |
|  24 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |   1051 |   2000 |00:00:02.01 |    2006 |   2004 |
|  25 |      INDEX RANGE SCAN                  | I               |      1 |   1051 |   2000 |00:00:00.02 |       6 |      4 |
|  26 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    863 |   2000 |00:00:02.04 |    2006 |   2004 |
|  27 |      INDEX RANGE SCAN                  | I               |      1 |    863 |   2000 |00:00:00.04 |       6 |      4 |
|  28 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    698 |   2000 |00:00:01.52 |    2006 |   2004 |
|  29 |      INDEX RANGE SCAN                  | I               |      1 |    698 |   2000 |00:00:00.02 |       6 |      4 |
|  30 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    557 |   2000 |00:00:01.59 |    2006 |   2004 |
|  31 |      INDEX RANGE SCAN                  | I               |      1 |    557 |   2000 |00:00:00.02 |       6 |      4 |
|  32 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    439 |   2000 |00:00:01.56 |    2006 |   2004 |
|  33 |      INDEX RANGE SCAN                  | I               |      1 |    439 |   2000 |00:00:00.02 |       6 |      4 |
|  34 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    343 |   2000 |00:00:01.69 |    2006 |   2004 |
|  35 |      INDEX RANGE SCAN                  | I               |      1 |    343 |   2000 |00:00:00.02 |       6 |      4 |
|  36 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    264 |   2000 |00:00:02.02 |    2006 |   2004 |
|  37 |      INDEX RANGE SCAN                  | I               |      1 |    264 |   2000 |00:00:00.02 |       6 |      4 |
|  38 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    202 |   2000 |00:00:01.59 |    2006 |   2004 |
|  39 |      INDEX RANGE SCAN                  | I               |      1 |    202 |   2000 |00:00:00.03 |       6 |      4 |
|  40 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    153 |   2000 |00:00:01.41 |    2006 |   2005 |
|  41 |      INDEX RANGE SCAN                  | I               |      1 |    153 |   2000 |00:00:00.02 |       6 |      5 |
|  42 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |    115 |   2000 |00:00:01.42 |    2005 |   2003 |
|  43 |      INDEX RANGE SCAN                  | I               |      1 |    115 |   2000 |00:00:00.02 |       5 |      3 |
|  44 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     86 |   2000 |00:00:01.60 |    2006 |   2004 |
|  45 |      INDEX RANGE SCAN                  | I               |      1 |     86 |   2000 |00:00:00.02 |       6 |      4 |
|  46 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     64 |   2000 |00:00:01.40 |    2006 |   2004 |
|  47 |      INDEX RANGE SCAN                  | I               |      1 |     64 |   2000 |00:00:00.02 |       6 |      4 |
|  48 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     47 |   2000 |00:00:01.47 |    2006 |   2004 |
|  49 |      INDEX RANGE SCAN                  | I               |      1 |     47 |   2000 |00:00:00.02 |       6 |      4 |
|  50 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     35 |   2000 |00:00:01.46 |    2006 |   2004 |
|  51 |      INDEX RANGE SCAN                  | I               |      1 |     35 |   2000 |00:00:00.02 |       6 |      4 |
|  52 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     26 |   2000 |00:00:01.44 |    2006 |   2004 |
|  53 |      INDEX RANGE SCAN                  | I               |      1 |     26 |   2000 |00:00:00.02 |       6 |      4 |
|  54 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     19 |   2000 |00:00:01.47 |    2006 |   2004 |
|  55 |      INDEX RANGE SCAN                  | I               |      1 |     19 |   2000 |00:00:00.02 |       6 |      4 |
|  56 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     14 |   2000 |00:00:01.37 |    2006 |   2004 |
|  57 |      INDEX RANGE SCAN                  | I               |      1 |     14 |   2000 |00:00:00.02 |       6 |      4 |
|  58 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |     10 |   2000 |00:00:01.35 |    2006 |   2004 |
|  59 |      INDEX RANGE SCAN                  | I               |      1 |     10 |   2000 |00:00:00.04 |       6 |      4 |
|  60 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      8 |   2000 |00:00:01.39 |    2006 |   2004 |
|  61 |      INDEX RANGE SCAN                  | I               |      1 |      8 |   2000 |00:00:00.02 |       6 |      4 |
|  62 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      6 |   2000 |00:00:01.40 |    2006 |   2004 |
|  63 |      INDEX RANGE SCAN                  | I               |      1 |      6 |   2000 |00:00:00.03 |       6 |      4 |
|  64 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      4 |   2000 |00:00:01.53 |    2005 |   2003 |
|  65 |      INDEX RANGE SCAN                  | I               |      1 |      4 |   2000 |00:00:00.02 |       5 |      3 |
|  66 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      3 |   2000 |00:00:01.41 |    2006 |   2004 |
|  67 |      INDEX RANGE SCAN                  | I               |      1 |      3 |   2000 |00:00:00.02 |       6 |      4 |
|  68 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      2 |   2000 |00:00:01.42 |    2006 |   2005 |
|  69 |      INDEX RANGE SCAN                  | I               |      1 |      2 |   2000 |00:00:00.03 |       6 |      5 |
|  70 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      2 |   2000 |00:00:01.72 |    2006 |   2004 |
|  71 |      INDEX RANGE SCAN                  | I               |      1 |      2 |   2000 |00:00:00.03 |       6 |      4 |
|  72 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.45 |    2006 |   2004 |
|  73 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      4 |
|  74 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.51 |    2006 |   2004 |
|  75 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  76 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.42 |    2006 |   2004 |
|  77 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  78 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.56 |    2006 |   2004 |
|  79 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  80 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.64 |    2006 |   2004 |
|  81 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  82 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.42 |    2006 |   2004 |
|  83 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  84 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.56 |    2006 |   2004 |
|  85 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  86 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.39 |    2005 |   2003 |
|  87 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       5 |      3 |
|  88 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.41 |    2006 |   2004 |
|  89 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      4 |
|  90 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.52 |    2006 |   2004 |
|  91 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      4 |
|  92 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.45 |    2006 |   2004 |
|  93 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  94 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.45 |    2006 |   2004 |
|  95 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
|  96 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.41 |    2006 |   2005 |
|  97 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.03 |       6 |      5 |
|  98 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.42 |    2006 |   2004 |
|  99 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
| 100 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.43 |    2006 |   2004 |
| 101 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
| 102 |     TABLE ACCESS BY INDEX ROWID BATCHED| T               |      1 |      1 |   2000 |00:00:01.41 |    2006 |   2004 |
| 103 |      INDEX RANGE SCAN                  | I               |      1 |      1 |   2000 |00:00:00.02 |       6 |      4 |
-----------------------------------------------------------------------------------------------------------------------------

After adding the index, the estimated number of rows that to be fetched has dropped from 77,857 to 26,653, and the rows are now fetched using the index, resulting in an execution plan that takes either 6 or 8 seconds (depending on which line of the plan you choose to believe) to complete.

In the next article we will drill into how the original cardinality estimate of 77,857 came about, before we see how the Cost Based OR Expansion caused things to go wrong.

Adaptive Cursor Sharing not working as expected

We’ve been experiencing some issues with multiple child cursors for a given SQL statement, and I’ve just spent some time working on building a reproducible testcase of one problem that I thought I’d share in the hopes of documenting the behavior.

It is a python script connecting to an Oracle Database (tested against 12.2 and 19.6), so requires the cx_Oracle Python module.

import cx_Oracle
import argparse 

def setup_t1(con):
    cursor = con.cursor()
    try:
        cursor.execute("drop table t1")
    except Exception:
        pass
    cursor.execute("create table t1 (c1 number, n1 nchar(2), n2 nchar(2))")
    cursor.execute("insert into t1 select 1, to_char(round(rownum/10)),  to_char(mod(rownum, 2)) from all_objects where rownum <= 20")
    con.commit
    cursor.callproc("sys.dbms_stats.gather_table_stats", (None, 't1'))
    cursor.close()                    

def setup_t2(con):
    cursor = con.cursor()
    try:
        cursor.execute("drop table t2")
    except Exception:
        pass
    cursor.execute("create table t2(c1 number)")
    cursor.execute("create index i2 on t2(c1)")
    cursor.execute("insert into t2(c1) select rownum from all_objects where rownum <= 1000")
    con.commit
    cursor.callproc("sys.dbms_stats.gather_table_stats", \
                    keywordParameters = dict(ownname = None, tabname = "t2", method_opt = "for all columns size auto"))
    cursor.close()                    

def run_query(con, p1, p2):
    cursor = con.cursor()
    query = "select  count(*) from t1, t2 where t1.n1 = :1 and t1.n2 = :2 and t1.c1 = t2.c1"
    cursor.execute(query,[p1, p2])
    cursor.fetchall()
    cursor.close()

def run_query_0(con):
    run_query(con, '0','0')    

def run_query_0_space(con):
    run_query(con, '0 ','0 ')  

def get_sql_id(con):
    cursor = con.cursor()
    query = "select prev_sql_id from v$session where sid = userenv('SID')"
    cursor.execute(query)     
    row = cursor.fetchone()    
    cursor.close()
    return row[0]

def dump_sql(con, sql_id):
    cursor = con.cursor()    
    cursor.execute("select plan_hash_value, count(*) from v$sql where sql_id = :1 and is_shareable = 'Y' and is_obsolete = 'N' group by plan_hash_value", [sql_id])     
    rows = cursor.fetchall()    
    for row in rows:
        print (row)
    cursor.close()

parser = argparse.ArgumentParser()
parser.add_argument('--connection', type=str)
args = parser.parse_args()

con = cx_Oracle.connect(args.connection)
setup_t1(con)
setup_t2(con)
run_query_0(con)
run_query_0_space(con)
run_query_0(con)
sql_id = get_sql_id(con)
con.close

con = cx_Oracle.connect(args.connection)
for x in range(0, 80):
    run_query_0(con)
    run_query_0_space(con)

dump_sql(con, sql_id)
con.close

Results are as below, showing 81 active child cursors.

('66dbrch2wu34f', 3482066175, 1)
('66dbrch2wu34f', 2913850814, 80)

I’m not sure why it’s necessary to close and re-open the connection between the first three executions (when it becomes bind aware) and the subsequent executions, but it is. If both the nchar columns are char (or nvarchar2) the issue doesn’t reproduce.

It seems the database is not correctly handling the Cursor Selectivity Cubes:

SELECT
    predicate,
    range_id,
    low,
    high,
    COUNT(*)
FROM
    v$sql_cs_selectivity
WHERE
    sql_id = '66dbrch2wu34f'
GROUP BY
    predicate,
    range_id,
    low,
    high

PREDICATE                                  RANGE_ID LOW        HIGH         COUNT(*)
---------------------------------------- ---------- ---------- ---------- ----------
=1                                                0 0.300000   0.366667          317

SQL>
It also seems like the fact that the plan is adaptive is a factor. If I hint the query to not allow adaptive plans, with OPT_PARAM('_optimizer_adaptive_plans','false') I don't see the problem.
select * from dbms_xplan.display_cursor('66dbrch2wu34f', null, format=>'+adaptive')

-----------------------------------------------------------------------------------------
|   Id  | Operation                      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |      |       |       |     4 (100)|          |
|     1 |  SORT AGGREGATE                |      |     1 |    17 |            |          |
|  *  2 |   HASH JOIN                    |      |     3 |    51 |     4   (0)| 00:00:01 |
|-    3 |    NESTED LOOPS                |      |     3 |    51 |     4   (0)| 00:00:01 |
|-    4 |     STATISTICS COLLECTOR       |      |       |       |            |          |
|  *  5 |      TABLE ACCESS STORAGE FULL | T1   |     3 |    39 |     2   (0)| 00:00:01 |
|- *  6 |     INDEX RANGE SCAN           | I2   |     1 |     4 |     2   (0)| 00:00:01 |
|     7 |    INDEX STORAGE FAST FULL SCAN| I2   |  1000 |  4000 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."C1"="T2"."C1")
   5 - storage(("T1"."N1"=SYS_OP_C2C(:1) AND "T1"."N2"=SYS_OP_C2C(:2)))
       filter(("T1"."N1"=SYS_OP_C2C(:1) AND "T1"."N2"=SYS_OP_C2C(:2)))
   6 - access("T1"."C1"="T2"."C1")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

ORDS Issues in Oracle 19c? Check your Service Names

We upgraded our first database from 12.2 to 19c last week, and encountered a nasty issue with ORDS.  Credit goes to my colleagues Mingda Lu and Au Chun Kei for doing the hard work in understanding what was causing the issue.

The issue can be demonstrated with the Oracle DB Developer VM.  I have created a RESTful Web Service following the oracle-base guide.

A quick test with wget shows that everything is OK with the default settings:

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 04:23:46-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK

Note however that ORDS is configured with the PDB default service name ‘orcl’

[oracle@localhost ~]$ grep servicename /u01/userhome/oracle/ords/vmconfig/ords/defaults.xml
<entry key="db.servicename">orcl</entry>

What happens if use a different service name for ORDS?  I’ll create a couple of services to demonstrate the issue.

[oracle@localhost ~]$ sql system/oracle@localhost:1521/orcl

SQLcl: Release 19.1 Production on Sat Feb 15 04:58:57 2020
Copyright (c) 1982, 2020, Oracle. All rights reserved.
Last Successful login time: Sat Feb 15 2020 04:59:02 -05:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> exec sys.dbms_service.create_service('orcl_ords', 'orcl_ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.start_service('orcl_ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.create_service('orcl.ords', 'orcl.ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.start_service('orcl.ords');
PL/SQL procedure successfully completed.

If I change db.servicename entry to ‘orcl_ords’ in defaults.xml, restart ORDS and retest, all is OK.

[oracle@localhost wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 05:03:17-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK

However if I change the service name to the other new service, orcl.ords, and restart ORDS, then it starts up without any problems, but when testing the service we get an error.

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 05:05:57-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 503 Service Unavailable
2020-02-15 05:06:01 ERROR 503: Service Unavailable.

The error stacks from ORDS seems to gives some clues as to what’s going on.

WARNING: The database user for the connection pool named |apex|pu|, is not authorized to proxy to the schema named HR
oracle.dbtools.common.jdbc.ConnectionPoolConfigurationException: The database user for the connection pool named |apex|pu|, is not authorized to proxy to the schema named HR
at oracle.dbtools.common.jdbc.ConnectionPoolExceptions.from(ConnectionPoolExceptions.java:46)
at oracle.dbtools.common.jdbc.ConnectionPoolExceptions.from(ConnectionPoolExceptions.java:53)
Caused by: oracle.dbtools.common.ucp.ConnectionLabelingException: Error occurred when attempting to configure url: jdbc:oracle:thin:@//localhost:1521/orcl.ords with labels: {oracle.dbtools.jdbc.label.schema=HR}
at oracle.dbtools.common.ucp.LabelingCallback.handle(LabelingCallback.java:147)
at oracle.dbtools.common.ucp.LabelingCallback.proxyToSchema(LabelingCallback.java:210)
at oracle.dbtools.common.ucp.LabelingCallback.configure(LabelingCallback.java:76)
Caused by: java.sql.SQLException: ORA-01017: invalid username/password; logon denied
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:494)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:441)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:436)
at oracle.jdbc.driver.T4CTTIfun.processError(T4CTTIfun.java:1027)

This seems to imply there’s some issue with the proxy authentication mechanism when using the orcl.ords service, however testing from sqlplus, all seems to be OK.

[oracle@localhost ords]$ sql ords_public_user[hr]/oracle@localhost:1521/orcl.ords

SQLcl: Release 19.1 Production on Sat Feb 15 05:09:46 2020

Copyright (c) 1982, 2020, Oracle. All rights reserved.

Last Successful login time: Fri May 31 2019 16:29:03 -04:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0


SQL>

From our testing, any service with a name with the format <pdb_name>.<any_text> exhibits the problem. We have used service names with such a format in 12.2 without issues, so it seems this is new behaviour introduced in 18c or 19c.

We’ve also noticed that when checking v$services, the value for con_id for the ‘problem’ service is 1 which may give a clue as to what’s going on, although it only seems to cause a problem for ORDS.

SQL> select con_id, network_name from v$services where network_name in ('orcl_ords', 'orcl.ords');
CON_ID NETWORK_NAME
_________ _______________
1 orcl.ords
3 orcl_ords

ords.enable_schema fails with “ORA-06598: insufficient INHERIT PRIVILEGES privilege”

This is issue I always hit settitng up a test environment using Oracle REST Data Services (ORDS).  Using SYSTEM user to call ords.enable_schema throws ORA-06598.  According to the documentation it should succeed (note the SYSTEM user has DBA role) .

Only database users with the DBA role can enable or disable a schema other than their own.

[oracle@localhost ~]$ sql system/oracle

SQLcl: Release 19.1 Production on Sat Feb 15 02:20:33 2020
Copyright (c) 1982, 2020, Oracle.  All rights reserved.
Last Successful login time: Sat Feb 15 2020 02:20:36 -05:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> SHOW USER
USER is "SYSTEM"

SQL> select role from session_roles where role = 'DBA';
   ROLE
_______
DBA
SQL> BEGIN
  2     ords.enable_schema(
  3        p_enabled             => TRUE,
  4        p_schema              => 'HR',
  5        p_url_mapping_type    => 'BASE_PATH',
  6        p_url_mapping_pattern => 'hr',
  7        p_auto_rest_auth      => FALSE);
  8
  9    COMMIT;
 10  END;
 11  /
BEGIN
*
ERROR at line 1:
ORA-06598: insufficient INHERIT PRIVILEGES privilege
ORA-06512: at "ORDS_METADATA.ORDS", line 1
ORA-06512: at line 2

The solution is simple, grant inherit privileges on the current user (SYSTEM) to the ORDS_METADATA user.

SQL> show user
USER is "SYSTEM"
SQL> grant inherit privileges on user SYSTEM to ORDS_METADATA;
Grant succeeded.

Once this is completed, the call to ords.enable_schema is successful.

SQL> BEGIN
2 ords.enable_schema(
3 p_enabled => TRUE,
4 p_schema => 'HR',
5 p_url_mapping_type => 'BASE_PATH',
6 p_url_mapping_pattern => 'hr',
7 p_auto_rest_auth => FALSE);
8
9 COMMIT;
10 END;
11 /

PL/SQL procedure successfully completed.

ORA-01017 starting DB with srvctl? Consider firing your (oracle) agent!

Warning please don’t blindly follow the steps here without doing your own analysis of the risks involved, and ideally without getting Oracle support involved, I was hesitant to publish this, but as I’ve been in contact with someone else and it’s helped them workaround (to some extent) an issue they have been having I think it’s worth putting out there.

We had a problem during a dataguard switch-over (luckily planned switch-over for patching rather than a disaster situation) where Grid Infrastructure (clusterware) was unable to bring up one of the databases, it kept throwing “ORA-01017: invalid username/password”. Starting the database the ‘traditional way’ using “sqlplus / as sysdba” had no such problems.

Reviewing Oracle Support, particularly Doc ID 2313555.1 we identified some non-standard configuration in the Oracle home used for this database, but even after resolving them, the error persisted.

At times like these you realize (or at least I did) how little is published about the internals of how clusterware and the oracle databases it manages interact.

I suspected that restarting the entire clusterware stack would resolve the issue but that was difficult as this node also managed a production database which we didn’t want to take down.

However I guessed that restarting the clusterware agent for the oracle user might fix the problem. The executable is oraagant.bin and the process owner is oracle. I believe this is the process clusterware uses to actually start the database (You’ll also probably notice a similar process owned by grid and orarootagent.bin running as root).

I killed the oracle agent process and crossed my fingers. Luckily clusterware re-spawned this process and afterwards we were able to restart the problem instance without any problems.

Please re-read the first paragraph if you are considering to apply this work-around, and don’t blame me if you break anything, if it helps though I’m happy to take the credit!

Adding covering fields to a Primary Key Index

This is something to file under the (admittedly rather large) category of things that I wasn’t aware that the Oracle database could do.

While tuning a query, I wanted to use a common technique of adding fields to an index to eliminate a “Table Access by Index RowID” operation, however this particular case was a complicated  by the fact that the index was supporting the primary key, and the table was large and frequently accessed.

This is probably easiest demonstrated by the (much simplified) example below:

SQL> create table singles(id number generated always as identity,
2                       artist varchar2(255),
3                       title varchar2(255),
4                       constraint singles_pk primary key (id));

Table SINGLES created.

SQL>
SQL> insert into singles (artist,
2                        title)
3              values ('Chesney Hawkes',
4                      'The One And Only');

1 row inserted.

SQL> commit;

Commit complete.

SQL> select index_name from user_indexes where table_name = 'SINGLES';
INDEX_NAME
_____________
SINGLES_PK

SQL> select artist from singles where id = 1;
ARTIST
_________________
Chesney Hawkes

SQL> select * from dbms_xplan.display_cursor(format=>'BASIC');
PLAN_TABLE_OUTPUT
_____________________________________________________
EXPLAINED SQL STATEMENT:
------------------------
select artist from singles where id = 1

Plan hash value: 3923658952

--------------------------------------------------
| Id  | Operation                   | Name       |
--------------------------------------------------
|   0 | SELECT STATEMENT            |            |
|   1 |  TABLE ACCESS BY INDEX ROWID| SINGLES    |
|   2 |   INDEX UNIQUE SCAN         | SINGLES_PK |
--------------------------------------------------

14 rows selected.

Note that adding a new index on (id, artist) makes the plan more efficient:

SQL> create index i_singles_covering on singles(id, artist);

Index I_SINGLES_COVERING created.

SQL> select artist from singles where id = 1;
ARTIST
_________________
Chesney Hawkes

SQL> select * from dbms_xplan.display_cursor(format=>'BASIC');
PLAN_TABLE_OUTPUT
__________________________________________________
EXPLAINED SQL STATEMENT:
------------------------
select artist from singles where id = 1

Plan hash value: 1012019734

-----------------------------------------------
| Id  | Operation        | Name               |
-----------------------------------------------
|   0 | SELECT STATEMENT |                    |
|   1 |  INDEX RANGE SCAN| I_SINGLES_COVERING |
-----------------------------------------------

13 rows selected.

However we’ve now got two indexes, SINGLES_PK on (id) and I_SINGLES_COVERING on (id, artist).  SINGLES_PK is redundant, but being used to support the Primary Key:

SQL> select index_name from user_indexes where table_name = 'SINGLES';
           INDEX_NAME
_____________________
SINGLES_PK
I_SINGLES_COVERING

Now it is possible for a primary key to be supported by I_SINGLES_COVERING, but initially I thought I’d have to choose between dropping and re-creating the primary key to use the new index, or leaving the system in the non-optimal state of having the two indexes.

However I came across this blog post from Richard Foote, which referenced another post from Jonathan Lewis.  It describes the following technique of modifying the constraint to use the new index without needing to re-recreate it.  It’s worth noting that the index SINGLES_PK that the database automatically created to initially support the primary key gets dropped during this operation.

SQL> alter table singles
  2        modify constraint singles_pk
  3        using index i_singles_covering;
Table SINGLES altered.


SQL> select index_name from user_indexes where table_name = 'SINGLES';
INDEX_NAME
_____________________
I_SINGLES_COVERING

One thing I observed my testing was that if I created i_singles_covering as a unique index (id is unique as it’s the primary key, so obviously combination of id & artist must also be unique) then the database was unwilling to use this index to support the primary key:

SQL> create unique index i_singles_covering on singles(id, artist);

Index I_SINGLES_COVERING created.

SQL> alter table singles
2          modify constraint singles_pk
3          using index i_singles_covering;

ORA-14196: Specified index cannot be used to enforce the constraint.
14196. 00000 - "Specified index cannot be used to enforce the constraint."
*Cause: The index specified to enforce the constraint is unsuitable
for the purpose.
*Action: Specify a suitable index or allow one to be built automatically.

This case is documented by Oracle Support Document ID 577253.1 which states:

We cannot use a prefix of a unique index to enforce a unique constraint. We can use a whole unique index or a prefix of a non-unique index to do that. This is the way Oracle was designed.

However I can’t off-hand think of any technical reason for this limitation.

 

SQL Plan Directives: Gotta Purge ‘Em All

A tip I picked up from Nigel Bayliss regards purging SQL Plan Directives, I’ve been using it a lot recently and can’t see it documented elsewhere.

As some background these records, exposed via the DBA_SQL_PLAN_DIRECTIVES view, are cardinality corrections created and used when running with Adaptive Statistics enabled.  There is a job that should automatically purge all records unused for longer than the value of SPD_RETENTION_WEEKS, but we’ve experienced occasions when this job doesn’t work as expected.

The records can be individually purged by calling DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE, but that’s a pain if you’ve got a lot of them

However what the documentation doesn’t mention is that you can call the procedure, passing in NULL for the mandatory directive_id parameter:

exec sys.DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE (NULL);

This will purge all records based on retention rules that the auto-purge job follows.  If you really want to Purge ‘Em All then you can set the retention to 0 before calling the procedure.

exec sys.DBMS_SPD.SET_PREFS('SPD_RETENTION_WEEKS', '0');

 

Planned (Cursor) Obsolescence

I’ll start this blog-post by posing a question. Is it possible to have multiple records in v$sql for a given sql_id and child_number combination? While the title of this blog post may give you some clues, I’ll admit I’d always assumed that those values uniquely identified a child cursor.

As a bit of background we had a database availability situation this week, which we narrowed down to SGA issues, specifically bug 15881004 “Excessive SGA memory usage with Extended Cursor Sharing”. Some of our more complex SQL Statements were getting many (more than 700) child cursors. The reported reason for the child was “Bind mismatch(33)”.  Probably caused by bug 14176247 “Many child cursors using Adaptive Cursor Sharing with binds (due to BIND_EQUIV_FAILURE)”, although that is listed as fixed in 12.1 and this instance is running on 12.2.

We resolved the immediate issue by flushing the shared pool (admittedly not a great solution, but sometimes you got to do what you got to do), and created SQL Plan Baselines for those problem SQL statements so they would each just get one plan and child cursor.

We plan to monitor more closely for any SQL statements that do have many child cursors, however we need to make sure that even if that does happen it doesn’t break the system again.  One thing that seemed promising is the _cursor_obsolete_threshold parameter.  We had already reduced this parameter down to 1024 from it’s default of 8192 based on Mike Deitrich’s blog post but with this incident were considering reducing it further.  I think it’s wise to be wary of messing too much with underscore parameters but per Doc ID 2431353.1 Oracle Support say “the … parameter can be adjusted case-to-case basis should there be a problem”. For sure we had a significant problem with the setting at 1024 so plan to reduce further to 512.

We involved super consultant Stefan Koehler to review our findings and action plan, he was broadly in agreement, even recommending further reduction of the parameter value to 256.  However something puzzling me which I asked him was “What actually happens if the number of child cursors hit the value specified by this parameter”.  His answer “Well what happens is this … if your parent cursor got more than _cursor_obsolete_threshold child cursors it invalidates the parent (and in consequence all childs) and it starts from 0 again”

I was skeptical, my expectation was that Oracle would just invalidate the oldest unused child cursor and then re-use that child number.  Another thing puzzling me was happens if some of the child cursors were still held open?  Time to test this out for myself…
First let me demonstrate how I can get 4 child cursors for a given SQL Statement using different values of optimizer_index_cost_adj as a quick hack.
SQL> alter system flush shared_pool;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> select prev_sql_id from v$session where sid=sys_context('userenv','sid');
PREV_SQL_ID
9tz4qu4rj9rdp

SQL> alter session set optimizer_index_cost_adj=1;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=2;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=3;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> select sql_id, child_number, executions from v$sql where sql_id = '9tz4qu4rj9rdp';
SQL_ID            CHILD_NUMBER   EXECUTIONS
9tz4qu4rj9rdp                0            1
9tz4qu4rj9rdp                1            1
9tz4qu4rj9rdp                2            1
9tz4qu4rj9rdp                3            1

Let me reduce _cursor_obsolete_threshold at session level and re-run the test.

SQL> alter system flush shared_pool;
SQL> alter session set "_cursor_obsolete_threshold"=2;
SQL> alter session set optimizer_index_cost_adj=100;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=1;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=2;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=3;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> select sql_id, child_number, executions from v$sql where sql_id = '9tz4qu4rj9rdp';
SQL_ID            CHILD_NUMBER   EXECUTIONS
9tz4qu4rj9rdp                0            1
9tz4qu4rj9rdp                1            1
9tz4qu4rj9rdp                0            1
9tz4qu4rj9rdp                1            1

Whoah…. each combination of sql_id and child number has two entries (not what I was expecting to see). To get a more full picture we need to look at a couple of additional fields, namely ‘address’ and ‘is_obsolete’.

SQL> select sql_id,  address, child_number, is_obsolete from v$sql where sql_id = '9tz4qu4rj9rdp';
SQL_ID          ADDRESS              CHILD_NUMBER IS_OBSOLETE
9tz4qu4rj9rdp   00000000610AB500                0 Y
9tz4qu4rj9rdp   00000000610AB500                1 Y
9tz4qu4rj9rdp   0000000073DDE788                0 N
9tz4qu4rj9rdp   0000000073DDE788                1 N

Although we tend to use sql_id as our handle for the parent cursor, Oracle actually uses the ‘Address’ field, and when the _cursor_obsolete_threshold value is exceeded, Oracle allocates a new parent cursor with a new ‘Address’.  This explains how Oracle copes when old child cursors are held open, they still stay in the shared pool, keeping their address, but are marked as obsolete, able to be aged out when they are no longer in use.

The other lessons here, firstly that Stefan knows his stuff, but also whenever someone tells you something, don’t just take it on trust, it’s normally easy to validate for yourself, and you may learn something about how Oracle works along the way