I had a hard time coming up with a snappy title for this, I don’t think it’s going to be particularly google-able but never mind.
The problem I’m tackling here is that we have a query which outer joins two tables but the main filter is achieved with a coalesce (or NVL) of a column from the outer joined table and the main table. I’m going to go over how I investigated why this was slow and how I would solve the problem with a query rewrite (both when I can touch the application code, and when I cant’).
To put this into some context and make the problem more understandable, let’s create some sample data.
We have two tables: as_main which contains names of things and as_optional which contains alternative names for the same things that should be used if they exist. They are joined together on their own primary keys. I’ve added indexes on the name columns in both.
create table as_main (pkcol number primary key ,padding_col varchar2(200 char) ,name varchar2(100 char) ); create index as_main_name on as_main (name); create table as_optional (pkcol number primary key ,padding_col varchar2(200 char) ,alternative_name varchar2(100 char) ); create index as_optional_name on as_optional (alternative_name); insert into as_main (pkcol, padding_col, name) select rownum, rpad('x',200,'x'), dbms_random.string('a',30) from dual connect by rownum <=10000; insert into as_optional (pkcol, padding_col, alternative_name) select pkcol, rpad('x',200,'x'), dbms_random.string('a',30) from as_main where mod(pkCol, 5) = 0; commit; exec dbms_stats.gather_table_stats(null,'as_main') exec dbms_stats.gather_table_stats(null,'as_optional')
The business logic for this data model requires that for each row in as_main, if another row exists in as_optional then display the alternative_name from there instead. E.g. if I wanted to look up the data with pkcol=100, I would run:
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where m.pkcol = 100;
And we get a very sensible and cheap looking plan
Plan hash value: 1756420787 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 70 | 3 (0)| 00:00:01 | | 1 | NESTED LOOPS OUTER | | 1 | 70 | 3 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| AS_MAIN | 1 | 35 | 2 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | SYS_C0033781918 | 1 | | 1 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID| AS_OPTIONAL | 1 | 35 | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | SYS_C0033781919 | 1 | | 0 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("M"."PKCOL"=100) 5 - access("O"."PKCOL"(+)=100)
But what happens if I want to search based on that computed name column? Let’s fiter on coalesce(m.name, o.alternative_name)
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where coalesce(m.name, o.alternative_name) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj';
Plan hash value: 202792823 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10000 | 683K| 145 (0)| 00:00:01 | |* 1 | FILTER | | | | | | |* 2 | HASH JOIN RIGHT OUTER| | 10000 | 683K| 145 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | AS_OPTIONAL | 2000 | 70000 | 25 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | AS_MAIN | 10000 | 341K| 120 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(COALESCE("M"."NAME","O"."ALTERNATIVE_NAME")='fvzXQcNfOuXnBgMwYlz zMkFPKYxDrj') 2 - access("M"."PKCOL"="O"."PKCOL"(+))
Yikes! The driving filter is being executed on plan line 1, after a full tablescan of each table. We have to read and join every row (note the lack of predicate applied to lines 3 and 4) before we can start applying our filter, that’s going to be nasty when these tables grow.
You might be wondering if there’s a way for the query to be executed in a fast way at all. You could filter on rows in as_main that match the name filter and check to make sure they don’t have a joining row in as_optional, you can then filter on rows in as_optional that match the name filter and join back to as_main. I’ll share the SQL version of this later on but let’s see if we can get the optimizer to do the work for us.
We used coalesce, but there are some tricks that the optimizer has with NVL so let’s see if changing that helps.
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where nvl(o.alternative_name, m.name) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj';
Plan hash value: 202792823 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10000 | 683K| 145 (0)| 00:00:01 | |* 1 | FILTER | | | | | | |* 2 | HASH JOIN RIGHT OUTER| | 10000 | 683K| 145 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | AS_OPTIONAL | 2000 | 70000 | 25 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | AS_MAIN | 10000 | 341K| 120 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(NVL("O"."ALTERNATIVE_NAME","M"."NAME")='fvzXQcNfOuXnBgMwYlzzMkFP KYxDrj') 2 - access("M"."PKCOL"="O"."PKCOL"(+))
Nope! Same plan. What if we expand the NVL filter a bit:
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where (o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' or (o.alternative_name is null and m.name ='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'));
Plan hash value: 202792823 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 350 | 145 (0)| 00:00:01 | |* 1 | FILTER | | | | | | |* 2 | HASH JOIN RIGHT OUTER| | 5 | 350 | 145 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | AS_OPTIONAL | 2000 | 70000 | 25 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | AS_MAIN | 10000 | 341K| 120 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' OR "O"."ALTERNATIVE_NAME" IS NULL AND "M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' ) 2 - access("M"."PKCOL"="O"."PKCOL"(+))
Hmm, no change again. What if we force the or expansion with a use_concat hint?
select /*+use_concat*/m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where (o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' or (o.alternative_name is null and m.name ='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'));
Plan hash value: 2005024459 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 6 | 420 | 148 (0)| 00:00:01 | | 1 | CONCATENATION | | | | | | |* 2 | FILTER | | | | | | | 3 | NESTED LOOPS OUTER | | 1 | 70 | 3 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID BATCHED| AS_MAIN | 1 | 35 | 2 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | AS_MAIN_NAME | 1 | | 1 (0)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID | AS_OPTIONAL | 1 | 35 | 1 (0)| 00:00:01 | |* 7 | INDEX UNIQUE SCAN | SYS_C0033779786 | 1 | | 0 (0)| 00:00:01 | |* 8 | FILTER | | | | | | |* 9 | HASH JOIN RIGHT OUTER | | 5 | 350 | 145 (0)| 00:00:01 | | 10 | TABLE ACCESS FULL | AS_OPTIONAL | 2000 | 70000 | 25 (0)| 00:00:01 | | 11 | TABLE ACCESS FULL | AS_MAIN | 10000 | 341K| 120 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("O"."ALTERNATIVE_NAME" IS NULL) 5 - access("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj') 7 - access("M"."PKCOL"="O"."PKCOL"(+)) 8 - filter("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' AND (LNNVL("O"."ALTERNATIVE_NAME" IS NULL) OR LNNVL("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'))) 9 - access("M"."PKCOL"="O"."PKCOL"(+))
We’re sort of there, part of the query will lead on the name index on as_main, the other part is not. We can take a closer look at the predicates against the second part of the concatenation for clues.
8 - filter("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' AND (LNNVL("O"."ALTERNATIVE_NAME" IS NULL) OR LNNVL("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'))) 9 - access("M"."PKCOL"="O"."PKCOL"(+))
We can see that Oracle is still choosing to execute this as an outer join (from the operation in the plan or the presense of (+)s in the predicates). It’s then deciding that it can only filter on
"O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
After executing the join. We know that the filter could legally be applied on line 10 if the join on line 9 was an inner join (which we know can happen as we’re filtering on it being not null). The optimizer doesn’t seem to spot this. Perhaps it wants to apply the LNNVL predicate at the same time (which must be done after the join becaues it uses both sides of the join). We know that the LNNVL isn’t necessary because the two sides of the OR condition are mutually exclusive, you can’t have a column equal to a value and also null, but this always gets added as a fail safe with OR is expanded to union all. Note: LNNVL (condition) is the equivalent of (condition = false or condition is unknown) ie the 3-valued logic form of anything but true.
Now that we’ve done some digging into whether we can help Oracle do the query transformation itself, it might be time to give in and just rewrite the query ourselves. This is my manually rewritten query
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' union all select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where m.name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' and o.alternative_name is null;
All I’ve done here is copy and pasted the OR version of the query and stuck one side of the OR condition in one subquery and the other side in the other. It is logically identical to the first version (remember the sides of the OR are mutually exclusive so we can get away with the union all), but let’s see how the optimizer treats it.
Plan hash value: 236470842 ----------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 140 | 6 (0)| 00:00:01 | | 1 | UNION-ALL | | | | | | | 2 | NESTED LOOPS | | 1 | 70 | 3 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 1 | 70 | 3 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID BATCHED| AS_OPTIONAL | 1 | 35 | 2 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | AS_OPTIONAL_NAME | 1 | | 1 (0)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | SYS_C0033779785 | 1 | | 0 (0)| 00:00:01 | | 7 | TABLE ACCESS BY INDEX ROWID | AS_MAIN | 1 | 35 | 1 (0)| 00:00:01 | |* 8 | FILTER | | | | | | | 9 | NESTED LOOPS OUTER | | 1 | 70 | 3 (0)| 00:00:01 | | 10 | TABLE ACCESS BY INDEX ROWID BATCHED| AS_MAIN | 1 | 35 | 2 (0)| 00:00:01 | |* 11 | INDEX RANGE SCAN | AS_MAIN_NAME | 1 | | 1 (0)| 00:00:01 | | 12 | TABLE ACCESS BY INDEX ROWID | AS_OPTIONAL | 1 | 35 | 1 (0)| 00:00:01 | |* 13 | INDEX UNIQUE SCAN | SYS_C0033779786 | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj') 6 - access("M"."PKCOL"="O"."PKCOL") 8 - filter("O"."ALTERNATIVE_NAME" IS NULL) 11 - access("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj') 13 - access("M"."PKCOL"="O"."PKCOL"(+)) Note ----- - this is an adaptive plan
We now do some lookups against both tables on our driving filter, compute the join from either side and then combine them. This should be much more optimal than the full table scans. But we had to do the transformation for the optimizer here.
SQL Translation
What if we didn’t have the luxury of being able to modify the code? We can’t use SQL Plan Baselines or SQL Patches because there’s no way Oracle would use our desired plans (the closest you’d get would include that LNNVL filter), but perhaps Oracle has another trick up its sleeve?
Enter SQL Translation Framework. I was reminded about this in a recent Twitter exchange, it allows you (from 12.1 onwards) to modify the SQL being sent by an application before it gets parsed. The main piece of documentation relevent to us is https://docs.oracle.com/en/database/oracle/oracle-database/12.2/drdas/SQLtranslation-and-examples.html#GUID-E519283E-9BD8-4354-B612-72A07D7A5016. I had a look for a demo and found a thread started by Bob Bryla, it was answered by Solomon Yakobson with some code I found very useful: https://community.oracle.com/thread/3996327. You should read the thread as I won’t include all the code here.
I’ll note that you can also use dbms_advanced_rewrite to rewrite this exact query, but I need to be able to accept any literal value for the string.
To use Solomon’s sample code, we just need to figure out the regexp_replace arguments that will convert our badly performing query to the transformed query. Regular expressions are tricky to understand at first but we’ll be using them very simplistically so hopefully you can follow along(I recommend using online demos to see what does what – https://regexr.com/ is usually my goto)
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where coalesce(m.name, o.alternative_name) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
First, we need to escape anything that would mean something special in regular expressions – parenthesise and full stops (periods).
select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name from as_main m left join as_optional o on m\.pkcol = o\.pkcol where coalesce\(m\.name, o\.alternative_name\) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
Next, we need to chunk up the query
(select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name from as_main m left join as_optional o on m\.pkcol = o\.pkcol where coalesce\(m\.name, o\.alternative_name\) = )('fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj')
This gives us two groups, \1 which gives us the static part of the query, \2 which gives us the (currently static) value we’re comparing the coalesce against. Next step is to allow for further values in our string and ensure that we are referring to the complete query.
^(select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name from as_main m left join as_optional o on m\.pkcol = o\.pkcol where coalesce\(m\.name, o\.alternative_name\) = )('\w+')$
\w means a word character (alphabetic character, a number, or an underscore). If we apply this regular expression to our query then \2 would give us ‘fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj’.
^ means the start of a string and $ means the end of a string, so we can wrap our query in those to ensure that the regular expression won’t be triggered if there is anything on either side of our query.
Now we need to form the query that this can be replaced by: first the manual translation we came up with before
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' union all select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where m.name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' and o.alternative_name is null
We need to replace the known variables with our regular expression groups:
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where o.alternative_name = \2 union all select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where m.name = \2 and o.alternative_name is null
Now let’s put it all together, using q strings (like Solomon) so we don’t need to worry about escaping our quotes. We only need to modify the user defined PL/SQL package (the one that’s been defined as attr_translator):
create or replace package body u1_translator is procedure translate_sql( sql_text in clob, translated_text out clob ) is begin translated_text := regexp_replace( sql_text, q'#^(select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name from as_main m left join as_optional o on m\.pkcol = o\.pkcol where coalesce\(m\.name, o\.alternative_name\) = )('\w+')$#', 'select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where o.alternative_name = \2 union all select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where m.name = \2 and o.alternative_name is null' ); end; procedure translate_error( error_code in binary_integer, translated_code out binary_integer, translated_sqlstate out varchar2 ) is begin null; end; End; /
We’ve mimiced the white space so the formatting of the script is a little funny. We could concatenate the string together with carriage returns and line feeds (yes, Oracle runs on Windows just fine), or we could do it on one line and use \s+ for any whitespace (which includes new lines and carriage returns). Ultimately, I don’t think that will add readability here.
To test this, we just need to set up a session to use the SQL Translation Profile (see the linked code) and run our query
select m.pkcol, coalesce(o.alternative_name, m.name) name from as_main m left join as_optional o on m.pkcol = o.pkcol where coalesce(m.name, o.alternative_name) = 'someOtherValue'
Now uses
----------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 140 | 6 (0)| 00:00:01 | | 1 | UNION-ALL | | | | | | | 2 | NESTED LOOPS | | 1 | 70 | 3 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 1 | 70 | 3 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID BATCHED| AS_OPTIONAL | 1 | 35 | 2 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | AS_OPTIONAL_NAME | 1 | | 1 (0)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | SYS_C0033779785 | 1 | | 0 (0)| 00:00:01 | | 7 | TABLE ACCESS BY INDEX ROWID | AS_MAIN | 1 | 35 | 1 (0)| 00:00:01 | |* 8 | FILTER | | | | | | | 9 | NESTED LOOPS OUTER | | 1 | 70 | 3 (0)| 00:00:01 | | 10 | TABLE ACCESS BY INDEX ROWID BATCHED| AS_MAIN | 1 | 35 | 2 (0)| 00:00:01 | |* 11 | INDEX RANGE SCAN | AS_MAIN_NAME | 1 | | 1 (0)| 00:00:01 | | 12 | TABLE ACCESS BY INDEX ROWID | AS_OPTIONAL | 1 | 35 | 1 (0)| 00:00:01 | |* 13 | INDEX UNIQUE SCAN | SYS_C0033779786 | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("O"."ALTERNATIVE_NAME"='someOtherValue') 6 - access("M"."PKCOL"="O"."PKCOL") 8 - filter("O"."ALTERNATIVE_NAME" IS NULL) 11 - access("M"."NAME"='someOtherValue') 13 - access("M"."PKCOL"="O"."PKCOL"(+))
Excellent!
A challenge for the reader wouldbe to make this as dynamic as possible – what if we want it to support different columns being selected? What if we wanted to support additional predicates? What if we wanted to support this as a subquery? What if we want to support different outer join conditions (would that even be valid to rewrite)? What if we want to support different tables all-together?
Obviously, the best solution would be for the optimizer to do it for us (and cost the transformation)!