Friday, May 25, 2012

Minimizing query execution time using oracle hints

Querying on a indexed columns generally improves query performance but in some cases if all columns in where clause are indexed columns degrades query performances. And this can be observed if table you are querying has humongous data (*many* rows). In that scenario, execution time of query on a table (which has many rows) will increase as the number of indexed columns in the where condition increase.

For example: Below are the columns present in dcspp_order table

Indexes that present in this table


Query 1: Querying dcspp_order with creation_date and state, only state is indexed column
set serveroutput on
variable n number
exec :n := dbms_utility.get_time;
SELECT count(*) FROM dcspp_order WHERE  creation_date>='01-Jan-02' and creation_date <='12-Dec-02' and state = 'INCOMPLETE';
exec dbms_output.put_line( ((dbms_utility.get_time-:n)/100) || ' seconds....' );

 The output -
  COUNT(*)
----------
         0
1 row selected.
35.07 seconds....
PL/SQL procedure successfully completed.

Execution plan for above query-
1 Every row in the table PROFILEV5.DCSPP_ORDER  is read.
2 For the rows returned by step 1, filter out rows depending on filter criteria.
3 The rows were sorted to support a group operation (MAX,MIN,AVERAGE, SUM, etc).
4 Rows were returned by the SELECT statement.

Query 2 : Querying dcspp_order with last_modified_date and state, both columns are indexed
set serveroutput on
variable n number
exec :n := dbms_utility.get_time;
SELECT count(*) FROM dcspp_order WHERE last_modified_date>='01-Jan-02' and last_modified_date <='12-Dec-02' and state = 'INCOMPLETE';
exec dbms_output.put_line( ((dbms_utility.get_time-:n)/100) || ' seconds....' );
The output -
  COUNT(*)
----------
         0

1 row selected.
224.43 seconds....
PL/SQL procedure successfully completed.

Execution Plan for above query -
1 One or more rows were retrieved using index PROFILEV5.IX_DCSPP_ORDER_01 .  The index was scanned in ascending order..
2 One or more rows were retrieved using index PROFILEV5.ORDER_LASTMOD_IDX .  The index was scanned in ascending order..
3 The result sets from steps 1, 2 were joined (hash).
4 A view definition was processed, either from a stored view PROFILEV5.index$_join$_001  or as defined by steps 3.
5 For the rows returned by step 4, filter out rows depending on filter criteria.
6 The rows were sorted to support a group operation (MAX,MIN,AVERAGE, SUM, etc).
7 Rows were returned by the SELECT statement.

As you can observe Query 2 execution time (224.43 secs) is more than Query 1 execution time (35.07 secs) , even though Query 2 has both indexed columns in the condition.

Why?
If you observe the execution plan of Query 1, STATE column index - PROFILEV5.IX_DCSPP_ORDER_01 is not considered that's because of condition also has a column creation_date which doesn't have any index. So there is a full table scan in this case. Now If you observe the execution plan of Query 2, since all the columns in the condition has indexes,  STATE column index PROFILEV5.IX_DCSPP_ORDER_01 and LAST_MODIFED_DATE column index PROFILEV5.ORDER_LASTMOD_IDX are considered while executing, first scans through STATE index and next scan through LAST_MODIFED_DATE index and then results from both are joined which is taking much time since there are hugh number of rows in dcspp_order table.

How to optimize Query 2 execution time?
 Using oracle hints we can ask to skip one index, which will avoid hash join.

Query 2 optimizing using oracle hints
set serveroutput on
variable n number
exec :n := dbms_utility.get_time;
SELECT /*+NO_INDEX(dcspp_order IX_DCSPP_ORDER_01)*/  count(*) FROM dcspp_order WHERE last_modified_date>='01-Jan-02' and last_modified_date <='12-Dec-02' and state = 'INCOMPLETE';
exec dbms_output.put_line( ((dbms_utility.get_time-:n)/100) || ' seconds....' );
output -
  COUNT(*)
----------
         0

1 row selected.
28.17 seconds....
PL/SQL procedure successfully completed.

Execution plan -
1 Every row in the table PROFILEV5.DCSPP_ORDER  is read.
2 For the rows returned by step 1, filter out rows depending on filter criteria.
3 The rows were sorted to support a group operation (MAX,MIN,AVERAGE, SUM, etc).
4 Rows were returned by the SELECT statement.

So execution time Query 2 is reduced from 224.43 seconds to 28.17 seconds.
We used NO_INDEX oracle hint to skip STATE column index to avoid hash join.

Hope this helps