I’ve been looking at a lot of application SQL this week with the hope of getting some easy changes done to boost performance. I go about this by looking at where the time is going and seeing what can be done to fix that problem. A few of the problematic SQLs follow the pattern of
optimizer_mode=first_rows_10
select *
from main_table m
join lookup_table l1
on ...
join lookup_table l2
on ...
join lookup_table l3
on ...
where m.slightly_selective = :blah1
and m.not_at_all_selective_but_neccessary = :blah2
order by m.some_date , m.primary_key
i.e find some rows that you care about, join them to a lot of other tables to fill out some data, return the earliest 10 rows (by ordering against the main table columns only and only fetching 10 results).
If you have an index which covers your main filter columns and the order by clause then you can end up with a nice first rows plan which loops through the rows ordered as desired and stops when it gets 10 results. If you don’t have that index then you will end up having to visit all the rows that match the predicates, join them all to the lookup tables and then sort them and then take the first 10 rows. I aim to improve this scenario without having to add indexes to cover all of your order by conditions.
Problem Demo
First we set up some tables
create table big_table as select * from all_objects;
create table big_table_2 as select * from big_table;
create index big_idx_perfect on big_table (object_type,object_name,object_id);
create unique index big2_idx1 on big_table_2 (object_id);
Our query will be
select *
from big_table t1
join big_table_2 t2
on t1.object_id = t2.object_id
where t1.object_type ='TABLE'
order by t1.object_name, t1.object_id
We’ve got an index which covers both our filter and the order by columns in the right order, so the explain plan we get is
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 2948 | 19 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 11 | 2948 | 19 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 11 | 2948 | 19 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 1918 | 250K| 8 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | BIG_IDX_PERFECT | 11 | | 3 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | BIG2_IDX1 | 1 | | 0 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID | BIG_TABLE_2 | 1 | 134 | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."OBJECT_TYPE"='TABLE')
5 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Note how we don’t have any sorting done in the plan because we are accessing everything in the right order by using the index.
What happens when we replace the perfectly crafted index with one that only allows us to filter?
drop index big_idx_perfect;
create index big_idx_filter on big_table (object_type,object_name);
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1918 | 501K| | 489 (5)| 00:00:01 |
| 1 | SORT ORDER BY | | 1918 | 501K| 624K| 489 (5)| 00:00:01 |
|* 2 | HASH JOIN | | 1918 | 501K| | 388 (6)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| BIG_TABLE | 1918 | 250K| | 191 (4)| 00:00:01 |
| 4 | TABLE ACCESS FULL| BIG_TABLE_2 | 72892 | 9538K| | 196 (7)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
3 - filter("T1"."OBJECT_TYPE"='TABLE')
It turns out that my index isn’t even good enough to provide selectivity (which is no surprise if you look at what it’s filtering on). By losing the index ordered access we have to now get all the rows and sort them all, we have to do full table scans and a hash join just to return 10 rows.
Problem Fixing
What if we were to tell Oracle to place an additional order by once it has read from our first table. Once the rows are in order, it can successfully use a first rows approach to join from the ordered rows until it reaches the right number of results (or it finishes). This is easy to do if you can rewrite the SQL, you just need to add a subquery
select *
from (select * from big_table order by object_name, object_id ) t1
join big_table_2 t2
on t1.object_id = t2.object_id
where t1.object_type ='TABLE'
order by t1.object_name, t1.object_id
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1918 | 501K| | 489 (5)| 00:00:01 |
| 1 | SORT ORDER BY | | 1918 | 501K| 624K| 489 (5)| 00:00:01 |
|* 2 | HASH JOIN | | 1918 | 501K| | 388 (6)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| BIG_TABLE | 1918 | 250K| | 191 (4)| 00:00:01 |
| 4 | TABLE ACCESS FULL| BIG_TABLE_2 | 72892 | 9538K| | 196 (7)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("BIG_TABLE"."OBJECT_ID"="T2"."OBJECT_ID")
3 - filter("BIG_TABLE"."OBJECT_TYPE"='TABLE')
Oh, it’s the same plan that sorts all the results before allowing it to return. What happened? Well, Oracle thinks it knows better than us and has rewritten the query back to how it was before. If we grab the outline from the plan, we can see if we can control the behaviour.
select * from dbms_xplan.display(format=>'outline')
...
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
FULL(@"SEL$73523A42" "T1"@"SEL$2")
USE_HASH(@"SEL$F1D6E378" "T2"@"SEL$1")
LEADING(@"SEL$F1D6E378" "T1"@"SEL$1" "T2"@"SEL$1")
FULL(@"SEL$F1D6E378" "T2"@"SEL$1")
NO_ACCESS(@"SEL$F1D6E378" "T1"@"SEL$1")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$3")
OUTLINE(@"SEL$2")
MERGE(@"SEL$1" >"SEL$3")
OUTLINE_LEAF(@"SEL$F1D6E378")
ELIMINATE_OBY(@"SEL$2")
OUTLINE_LEAF(@"SEL$73523A42")
FIRST_ROWS(10)
DB_VERSION('18.1.0')
OPTIMIZER_FEATURES_ENABLE('18.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
It’s the ELIMINATE_OBY
which looks suspicious to me, so we want to tell Oracle don’t do that. Let’s lookup the counter hint
select inverse from v$sql_hint where name ='ELIMINATE_OBY';
INVERSE
----------------------------------------------------------------
NO_ELIMINATE_OBY
Easy! So we need to stick that in our subquery too
Solution – Order By Placement
select *
from (select /*+ no_eliminate_oby */ * from big_table t1 order by t1.object_name, t1.object_id ) t1
join big_table_2 t2
on t1.object_id = t2.object_id
where t1.object_type ='TABLE'
order by t1.object_name, t1.object_id
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 6070 | 11 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 10 | 6070 | 11 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 10 | 6070 | 11 (0)| 00:00:01 |
| 3 | VIEW | | 1918 | 885K| 1 (0)| 00:00:01 |
| 4 | SORT ORDER BY | | 1918 | 250K| 192 (5)| 00:00:01 |
|* 5 | TABLE ACCESS FULL | BIG_TABLE | 1918 | 250K| 191 (4)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | BIG2_IDX1 | 1 | | 0 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE_2 | 1 | 134 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter("T1"."OBJECT_TYPE"='TABLE')
6 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Success! And we’ve even got a lower cost, suggesting this transformation was not costed.
How do we know whether this rewrite would be beneficial?
This has an easy answer: benchmark it.
In reality, first rows style queries are still going to be difficult to get. TABLE ACCESS BY INDEX ROWID BATCHED
and ADAPTIVE JOIN
are a couple of things that have been introduced that prevent ordered results keeping their order as you join. You would have to give up on these (and probably a few other performance improvements) in order to benefit. In my opinion, this is going to be heavily waited towards my query rewrite still being faster – better to not read rows at all than to read a lot with faster reads.
If Oracle could cost this transformation then it would be a very easy win for complex TOP-N queries. In the mean time, we will have to identify cases which look like they would benefit and do the rewrite ourselves.
Identifying Candidate SQLs
In theory, we should be able to look for query plans that could benefit. The following criteria could be considered:
- Order by in the select statement
- First_rows optimizer mode/rownum filter after the order by/fetch first rows syntax
- All columns in the order by are coming from one table
- This table is driving the query
- Optimizer expects many more rows to be found in the the driving table than it expects to want to fetch
I will try to write up a query which can identify these cases in bulk in the future.
For those queries, all we need to do is take the order by clause and replicate it into a subquery using the driving table and use our NO_ELIMINATE_OBY
hint.
The real solution would be for this query transformation to be done by the optimizer and costed. The Oracle cost-based optimizer is already able to do a similar query transformation with group by clauses since 11g . If the optimizer is able to do the transformation for you, then tuning this in live applications can be as simple as creating a SQL Plan Baseline or a SQL Patch to inject a hint.