Views expressed here are solely that of my own. Please make sure that you test the code/queries that appear on my blog before applying them to the production environment.

Friday, July 15, 2011

Effects of indexes on Oracle Cost Based Optimizer (CBO) execution plan selection

Followings are some tests that I performed on an Oracle 11gR2 database and my comments about the effects of the indexes on Oracle Cost Based Optimizer (CBO) execution plan selection. The results can change on different database versions or on the databases configured with different database parameters. Below are my output that I got at the time of my tests.

[oracle@localhost uural]$ sqlplus / as sysdba

SQL*Plus: Release 11.1.0.7.0 - Production on Thu Jul 14 14:15:11 2011

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

First I create a test user
SQL> !more cre_user-U0001.sql
drop user u0001 cascade;
create user u0001 identified by u0001;
grant create session to u0001;
alter user u0001 quota 5G on users;
grant create table to u0001;
grant dba to u0001;
SQL> @cre_user-U0001

User dropped.


User created.


Grant succeeded.


User altered.


Grant succeeded.


Grant succeeded.

Then I connect with that test user
SQL> conn u0001/u0001
Connected.

Create a big sample table
SQL> create table t1 as
  2  select
  3    rownum id,
  4    dbms_random.string('U',6) v1,
  5    lpad(rownum,10) v2,
  6    rpad('x',100) padding
  7  from
  8    dual
connect by level <= 1000000;  9

Table created.
Create an index on id column
SQL> create index t1_01 on t1(id);

Index created.
gather table statistics with cascade option, which will gather index statistics also
SQL> exec dbms_stats.gather_table_stats('','t1',estimate_percent => null,cascade => true,method_opt => 'for all columns size 1');

PL/SQL procedure successfully completed.

SQL> set lines 160

SQL> desc t1
 Name                                                                                      Null?    Type
 ----------------------------------------------------------------------------------------- -------- ------------------------------------------------------------
 ID                                                                                                 NUMBER
 V1                                                                                                 VARCHAR2(4000)
 V2                                                                                                 VARCHAR2(40)
 PADDING                                                                                            VARCHAR2(100)
set auto trace to explain only, we are only interested in the execution plans right now
SQL> set autot trace exp
SQL> sho autot
autotrace TRACEONLY EXPLAIN
When we try to select all columns with a condition using an indexed column, CBO decides to do an "INDEX RANGE SCAN" and after finding the related rowids of the rows matching the index key in where condition, then select the rows from table by using these rowids.
SQL> select * from t1 where id = 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2362378154

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   124 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   124 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_01 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=100000)
When we select only the indexed column from the table CBO goes directly to the table and does a full table scan(TABLE ACCESS FULL). Although this column is indexed, it did not use the index for this sql, because there is no "NOT NULL" constraint on indexed column "ID" and there can be NULL values for this column in the table which they did not exist in the corresponding index, since the index keys consisting of null values are not kept in an index.
SQL> select id from t1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000K|  4882K|  4895   (1)| 00:00:59 |
|   1 |  TABLE ACCESS FULL| T1   |  1000K|  4882K|  4895   (1)| 00:00:59 |
--------------------------------------------------------------------------
This time we query again the indexed column by giving a not null condition, and CBO uses associated index. It does not go to table because indexed column value also exists in the index as index key so there is no need to go to the table.Although, actually we know that there is only one row matching the condition, CBO does not know it and it does a "INDEX RANGE SCAN" instead of "INDEX UNIQUE SCAN" which could only be an option if we created a unique index on "ID" column.
SQL> select id from t1 where id = 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2311469903

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |     1 |     5 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| T1_01 |     1 |     5 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ID"=100000)
Again for the count operation since there is a not null condition in the where clause CBO computes this count only using the index without going to the table.
SQL> select count(*) from t1 where id = 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1142046246

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |     5 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |       |     1 |     5 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_01 |     1 |     5 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=100000)
Lets now set the indexed column to "NOT NULL" and see what will change.
SQL> alter table t1 modify id not null;

Table altered.
Now CBO knows that there is no NULL value for the indexed column in the table and since all the values of indexed column is also recorded in the corresponding index there is no need to go to table. CBO chooses a "INDEX FAST FULL SCAN" (index ffs) which is done by using multi block read (mbr) technique, and because of that, returning rows from this operation will not be ordered by indexed column value.
SQL> select id from t1;

Execution Plan
----------------------------------------------------------
Plan hash value: 93400700

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |  1000K|  4882K|   609   (1)| 00:00:08 |
|   1 |  INDEX FAST FULL SCAN| T1_01 |  1000K|  4882K|   609   (1)| 00:00:08 |
------------------------------------------------------------------------------
This time we want the result in order and CBO chooses "INDEX FULL SCAN" which will read all the index keys from the beginning leaf block to the end leaf block in the order of indexed key and return them again without going to the table.
SQL> select id from t1 order by id;

Execution Plan
----------------------------------------------------------
Plan hash value: 2357002434

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |  1000K|  4882K|  2235   (1)| 00:00:27 |
|   1 |  INDEX FULL SCAN | T1_01 |  1000K|  4882K|  2235   (1)| 00:00:27 |
--------------------------------------------------------------------------
This is very common execution plan CBO uses the index to get the rowids matching the criteria and then goes to table to fetch the matching rows by using the rowids got from the index.
SQL> select * from t1 where id = 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2362378154

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   124 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   124 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_01 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=100000)
This time CBO does the ordering by using the index although there is no where condition containing the indexed column. Since the order by column is an indexes column, and all the values of that column exist in the index tree because of the "NOT NULL" constraint on that column, CBO finds the rowids of all rows in order from the index and then gets the associated rows from the table by using these rowids.
SQL> select * from t1 order by id;

Execution Plan
----------------------------------------------------------
Plan hash value: 331831787

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |  1000K|   118M| 20103   (1)| 00:04:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |  1000K|   118M| 20103   (1)| 00:04:02 |
|   2 |   INDEX FULL SCAN           | T1_01 |  1000K|       |  2235   (1)| 00:00:27 |
-------------------------------------------------------------------------------------
Because of the "NOT NULL" constraint CBO knows that all values of the indexed column are index tree and the count of them equals to the count of rows in the table. It does a "INDEX FAST FULL SCAN" by using multi block read operation.
SQL> select count(*) from t1;

Execution Plan
----------------------------------------------------------
Plan hash value: 442041690

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |   609   (1)| 00:00:08 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_01 |  1000K|   609   (1)| 00:00:08 |
-----------------------------------------------------------------------
Now make the indexed column unique
SQL> alter table t1 add constraint t1_c_01 unique (id);

Table altered.
As we can see CBO still chooses "INDEX RANGE SCAN", because although we made the indexed column UNIQUE, the index is still non-unique
SQL> select * from t1 where id = 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2362378154

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   124 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   124 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_01 |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=100000)
Lets drop and recreate the index as a unique index
SQL> drop index t1_01;
drop index t1_01
           *
ERROR at line 1:
ORA-02429: cannot drop index used for enforcement of unique/primary key


SQL> alter table t1 drop constraint t1_c_01;

Table altered.

SQL> drop index t1_01;

Index dropped.

SQL> create unique index t1_01 on t1(id);

Index created.

SQL> exec dbms_stats.gather_table_stats('','t1',estimate_percent => null,cascade => true,method_opt => 'for all columns size 1');

PL/SQL procedure successfully completed.
Now we see that CBO decides to do a "INDEX UNIQUE SCAN" which will be cheaper than a "INDEX RANGE SCAN" in means of cost.
SQL> select * from t1 where id = 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2407846814

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   124 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   124 |     3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | T1_01 |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=100000)
Again CBO chooses "INDEX UNIQUE SCAN" without going to the table since all the indexed column values exists in the index because of the "NOT NULL" constrained on indexed column.
SQL> select id from t1 where id = 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2394460924

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |     5 |     2   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| T1_01 |     1 |     5 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ID"=100000)
Nothings changed in the execution plan after changing the index to a unique one.
SQL> select * from t1 order by id;

Execution Plan
----------------------------------------------------------
Plan hash value: 331831787

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |  1000K|   118M| 19965   (1)| 00:04:00 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |  1000K|   118M| 19965   (1)| 00:04:00 |
|   2 |   INDEX FULL SCAN           | T1_01 |  1000K|       |  2097   (1)| 00:00:26 |
-------------------------------------------------------------------------------------
Lets give an example how CBO chooses to scan the index leaf blocks from the end to the beginning when we give an order by in descending order by using "INDEX FULL SCAN DESCENDING".
SQL> select * from t1 order by id desc;

Execution Plan
----------------------------------------------------------
Plan hash value: 2369582229

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |  1000K|   118M| 19965   (1)| 00:04:00 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |  1000K|   118M| 19965   (1)| 00:04:00 |
|   2 |   INDEX FULL SCAN DESCENDING| T1_01 |  1000K|       |  2097   (1)| 00:00:26 |
-------------------------------------------------------------------------------------
Again if we only select only the indexed column, column values comes from only the index without going to the table in descending order.
SQL> select id from t1 order by id desc;

Execution Plan
----------------------------------------------------------
Plan hash value: 1356827296

------------------------------------------------------------------------------------
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |  1000K|  4882K|  2097   (1)| 00:00:26 |
|   1 |  INDEX FULL SCAN DESCENDING| T1_01 |  1000K|  4882K|  2097   (1)| 00:00:26 |
------------------------------------------------------------------------------------

SQL>

That's all folks.
I hope my demonstrations and comments can give you any little idea about how the Oracle CBO works and decides the execution plans in the presence of indexes.

1 comment:

Frank said...

Thanks, very useful.