Driving an outer join query from the coalesce of the outer joined table and the main table

I had a hard time coming up with a snappy title for this, I don’t think it’s going to be particularly google-able but never mind.

The problem I’m tackling here is that we have a query which outer joins two tables but the main filter is achieved with a coalesce (or NVL) of a column from the outer joined table and the main table. I’m going to go over how I investigated why this was slow and how I would solve the problem with a query rewrite (both when I can touch the application code, and when I cant’).

To put this into some context and make the problem more understandable, let’s create some sample data.

We have two tables: as_main which contains names of things and as_optional which contains alternative names for the same things that should be used if they exist. They are joined together on their own primary keys. I’ve added indexes on the name columns in both.

create table as_main 
  (pkcol number primary key
  ,padding_col varchar2(200 char) 
  ,name varchar2(100 char)
  );
create index as_main_name on as_main (name);
create table as_optional 
  (pkcol number primary key
  ,padding_col varchar2(200 char)
  ,alternative_name varchar2(100 char)
  );
create index as_optional_name on as_optional (alternative_name);
insert into as_main (pkcol, padding_col, name)
select rownum, rpad('x',200,'x'), dbms_random.string('a',30) from dual connect by rownum <=10000;
insert into as_optional (pkcol, padding_col, alternative_name)
select pkcol, rpad('x',200,'x'), dbms_random.string('a',30) from as_main where mod(pkCol, 5) = 0;
commit;
exec dbms_stats.gather_table_stats(null,'as_main')
exec dbms_stats.gather_table_stats(null,'as_optional')

The business logic for this data model requires that for each row in as_main, if another row exists in as_optional then display the alternative_name from there instead. E.g. if I wanted to look up the data with pkcol=100, I would run:

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  m.pkcol = 100;

And we get a very sensible and cheap looking plan

Plan hash value: 1756420787
 
------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |     1 |    70 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER          |                 |     1 |    70 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| AS_MAIN         |     1 |    35 |     2   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SYS_C0033781918 |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| AS_OPTIONAL     |     1 |    35 |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN         | SYS_C0033781919 |     1 |       |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("M"."PKCOL"=100)
   5 - access("O"."PKCOL"(+)=100)

But what happens if I want to search based on that computed name column? Let’s fiter on coalesce(m.name, o.alternative_name)

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  coalesce(m.name, o.alternative_name) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj';
Plan hash value: 202792823
 
--------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 10000 |   683K|   145   (0)| 00:00:01 |
|*  1 |  FILTER                |             |       |       |            |          |
|*  2 |   HASH JOIN RIGHT OUTER|             | 10000 |   683K|   145   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL   | AS_OPTIONAL |  2000 | 70000 |    25   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL   | AS_MAIN     | 10000 |   341K|   120   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(COALESCE("M"."NAME","O"."ALTERNATIVE_NAME")='fvzXQcNfOuXnBgMwYlz
              zMkFPKYxDrj')
   2 - access("M"."PKCOL"="O"."PKCOL"(+))

Yikes! The driving filter is being executed on plan line 1, after a full tablescan of each table. We have to read and join every row (note the lack of predicate applied to lines 3 and 4) before we can start applying our filter, that’s going to be nasty when these tables grow.

You might be wondering if there’s a way for the query to be executed in a fast way at all. You could filter on rows in as_main that match the name filter and check to make sure they don’t have a joining row in as_optional, you can then filter on rows in as_optional that match the name filter and join back to as_main. I’ll share the SQL version of this later on but let’s see if we can get the optimizer to do the work for us.

We used coalesce, but there are some tricks that the optimizer has with NVL so let’s see if changing that helps.

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  nvl(o.alternative_name, m.name) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj';
Plan hash value: 202792823
 
--------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 10000 |   683K|   145   (0)| 00:00:01 |
|*  1 |  FILTER                |             |       |       |            |          |
|*  2 |   HASH JOIN RIGHT OUTER|             | 10000 |   683K|   145   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL   | AS_OPTIONAL |  2000 | 70000 |    25   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL   | AS_MAIN     | 10000 |   341K|   120   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(NVL("O"."ALTERNATIVE_NAME","M"."NAME")='fvzXQcNfOuXnBgMwYlzzMkFP
              KYxDrj')
   2 - access("M"."PKCOL"="O"."PKCOL"(+))

Nope! Same plan. What if we expand the NVL filter a bit:

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where   (o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' or (o.alternative_name is null and m.name ='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'));
Plan hash value: 202792823
 
--------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             |     5 |   350 |   145   (0)| 00:00:01 |
|*  1 |  FILTER                |             |       |       |            |          |
|*  2 |   HASH JOIN RIGHT OUTER|             |     5 |   350 |   145   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL   | AS_OPTIONAL |  2000 | 70000 |    25   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL   | AS_MAIN     | 10000 |   341K|   120   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' OR 
              "O"."ALTERNATIVE_NAME" IS NULL AND "M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
              )
   2 - access("M"."PKCOL"="O"."PKCOL"(+))

Hmm, no change again. What if we force the or expansion with a use_concat hint?

select /*+use_concat*/m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where   (o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' or (o.alternative_name is null and m.name ='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'));
Plan hash value: 2005024459
 
----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |     6 |   420 |   148   (0)| 00:00:01 |
|   1 |  CONCATENATION                         |                 |       |       |            |          |
|*  2 |   FILTER                               |                 |       |       |            |          |
|   3 |    NESTED LOOPS OUTER                  |                 |     1 |    70 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| AS_MAIN         |     1 |    35 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | AS_MAIN_NAME    |     1 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID        | AS_OPTIONAL     |     1 |    35 |     1   (0)| 00:00:01 |
|*  7 |      INDEX UNIQUE SCAN                 | SYS_C0033779786 |     1 |       |     0   (0)| 00:00:01 |
|*  8 |   FILTER                               |                 |       |       |            |          |
|*  9 |    HASH JOIN RIGHT OUTER               |                 |     5 |   350 |   145   (0)| 00:00:01 |
|  10 |     TABLE ACCESS FULL                  | AS_OPTIONAL     |  2000 | 70000 |    25   (0)| 00:00:01 |
|  11 |     TABLE ACCESS FULL                  | AS_MAIN         | 10000 |   341K|   120   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("O"."ALTERNATIVE_NAME" IS NULL)
   5 - access("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj')
   7 - access("M"."PKCOL"="O"."PKCOL"(+))
   8 - filter("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' AND 
              (LNNVL("O"."ALTERNATIVE_NAME" IS NULL) OR LNNVL("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj')))
   9 - access("M"."PKCOL"="O"."PKCOL"(+))

We’re sort of there, part of the query will lead on the name index on as_main, the other part is not. We can take a closer look at the predicates against the second part of the concatenation for clues.

8 - filter("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj' AND 
           (LNNVL("O"."ALTERNATIVE_NAME" IS NULL) OR LNNVL("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj')))
9 - access("M"."PKCOL"="O"."PKCOL"(+))

We can see that Oracle is still choosing to execute this as an outer join (from the operation in the plan or the presense of (+)s in the predicates). It’s then deciding that it can only filter on

"O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'

After executing the join. We know that the filter could legally be applied on line 10 if the join on line 9 was an inner join (which we know can happen as we’re filtering on it being not null). The optimizer doesn’t seem to spot this. Perhaps it wants to apply the LNNVL predicate at the same time (which must be done after the join becaues it uses both sides of the join). We know that the LNNVL isn’t necessary because the two sides of the OR condition are mutually exclusive, you can’t have a column equal to a value and also null, but this always gets added as a fail safe with OR is expanded to union all. Note: LNNVL (condition) is the equivalent of (condition = false or condition is unknown) ie the 3-valued logic form of anything but true.

Now that we’ve done some digging into whether we can help Oracle do the query transformation itself, it might be time to give in and just rewrite the query ourselves. This is my manually rewritten query

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
union all
select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  m.name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
and    o.alternative_name is null;

All I’ve done here is copy and pasted the OR version of the query and stuck one side of the OR condition in one subquery and the other side in the other. It is logically identical to the first version (remember the sides of the OR are mutually exclusive so we can get away with the union all), but let’s see how the optimizer treats it.

Plan hash value: 236470842
 
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                  |     2 |   140 |     6   (0)| 00:00:01 |
|   1 |  UNION-ALL                             |                  |       |       |            |          |
|   2 |   NESTED LOOPS                         |                  |     1 |    70 |     3   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                        |                  |     1 |    70 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| AS_OPTIONAL      |     1 |    35 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | AS_OPTIONAL_NAME |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN                  | SYS_C0033779785  |     1 |       |     0   (0)| 00:00:01 |
|   7 |    TABLE ACCESS BY INDEX ROWID         | AS_MAIN          |     1 |    35 |     1   (0)| 00:00:01 |
|*  8 |   FILTER                               |                  |       |       |            |          |
|   9 |    NESTED LOOPS OUTER                  |                  |     1 |    70 |     3   (0)| 00:00:01 |
|  10 |     TABLE ACCESS BY INDEX ROWID BATCHED| AS_MAIN          |     1 |    35 |     2   (0)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN                  | AS_MAIN_NAME     |     1 |       |     1   (0)| 00:00:01 |
|  12 |     TABLE ACCESS BY INDEX ROWID        | AS_OPTIONAL      |     1 |    35 |     1   (0)| 00:00:01 |
|* 13 |      INDEX UNIQUE SCAN                 | SYS_C0033779786  |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access("O"."ALTERNATIVE_NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj')
   6 - access("M"."PKCOL"="O"."PKCOL")
   8 - filter("O"."ALTERNATIVE_NAME" IS NULL)
  11 - access("M"."NAME"='fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj')
  13 - access("M"."PKCOL"="O"."PKCOL"(+))
 
Note
-----
   - this is an adaptive plan

We now do some lookups against both tables on our driving filter, compute the join from either side and then combine them. This should be much more optimal than the full table scans. But we had to do the transformation for the optimizer here.

SQL Translation

What if we didn’t have the luxury of being able to modify the code? We can’t use SQL Plan Baselines or SQL Patches because there’s no way Oracle would use our desired plans (the closest you’d get would include that LNNVL filter), but perhaps Oracle has another trick up its sleeve?

Enter SQL Translation Framework. I was reminded about this in a recent Twitter exchange, it allows you (from 12.1 onwards) to modify the SQL being sent by an application before it gets parsed. The main piece of documentation relevent to us is https://docs.oracle.com/en/database/oracle/oracle-database/12.2/drdas/SQLtranslation-and-examples.html#GUID-E519283E-9BD8-4354-B612-72A07D7A5016. I had a look for a demo and found a thread started by Bob Bryla, it was answered by Solomon Yakobson with some code I found very useful: https://community.oracle.com/thread/3996327. You should read the thread as I won’t include all the code here.

I’ll note that you can also use dbms_advanced_rewrite to rewrite this exact query, but I need to be able to accept any literal value for the string.

To use Solomon’s sample code, we just need to figure out the regexp_replace arguments that will convert our badly performing query to the transformed query. Regular expressions are tricky to understand at first but we’ll be using them very simplistically so hopefully you can follow along(I recommend using online demos to see what does what – https://regexr.com/ is usually my goto)

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  coalesce(m.name, o.alternative_name) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'

First, we need to escape anything that would mean something special in regular expressions – parenthesise and full stops (periods).

select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name
from   as_main m
left join as_optional o 
  on   m\.pkcol = o\.pkcol
where  coalesce\(m\.name, o\.alternative_name\) = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'

Next, we need to chunk up the query

(select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name
from   as_main m
left join as_optional o 
  on   m\.pkcol = o\.pkcol
where  coalesce\(m\.name, o\.alternative_name\) = )('fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj')

This gives us two groups, \1 which gives us the static part of the query, \2 which gives us the (currently static) value we’re comparing the coalesce against. Next step is to allow for further values in our string and ensure that we are referring to the complete query.

^(select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name
from   as_main m
left join as_optional o 
  on   m\.pkcol = o\.pkcol
where  coalesce\(m\.name, o\.alternative_name\) = )('\w+')$

\w means a word character (alphabetic character, a number, or an underscore). If we apply this regular expression to our query then \2 would give us ‘fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj’.
^ means the start of a string and $ means the end of a string, so we can wrap our query in those to ensure that the regular expression won’t be triggered if there is anything on either side of our query.

Now we need to form the query that this can be replaced by: first the manual translation we came up with before

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  o.alternative_name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
union all
select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  m.name = 'fvzXQcNfOuXnBgMwYlzzMkFPKYxDrj'
and    o.alternative_name is null

We need to replace the known variables with our regular expression groups:

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  o.alternative_name = \2
union all
select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  m.name = \2
and    o.alternative_name is null

Now let’s put it all together, using q strings (like Solomon) so we don’t need to worry about escaping our quotes. We only need to modify the user defined PL/SQL package (the one that’s been defined as attr_translator):

create or replace
  package body u1_translator
    is
      procedure translate_sql(
                              sql_text        in  clob,
                              translated_text out clob
                            )
        is
        begin
           translated_text := regexp_replace(
                                             sql_text,
                                             q'#^(select m\.pkcol, coalesce\(o\.alternative_name, m\.name\) name
from   as_main m
left join as_optional o 
  on   m\.pkcol = o\.pkcol
where  coalesce\(m\.name, o\.alternative_name\) = )('\w+')$#',
                                             'select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  o.alternative_name = \2
union all
select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  m.name = \2
and    o.alternative_name is null'
                                           );
     end;
     procedure translate_error(
                               error_code          in  binary_integer,
                               translated_code    out binary_integer,
                               translated_sqlstate out varchar2
                             )
       is
       begin
           null;
     end;
End;
/

We’ve mimiced the white space so the formatting of the script is a little funny. We could concatenate the string together with carriage returns and line feeds (yes, Oracle runs on Windows just fine), or we could do it on one line and use \s+ for any whitespace (which includes new lines and carriage returns). Ultimately, I don’t think that will add readability here.

To test this, we just need to set up a session to use the SQL Translation Profile (see the linked code) and run our query

select m.pkcol, coalesce(o.alternative_name, m.name) name
from   as_main m
left join as_optional o 
  on   m.pkcol = o.pkcol
where  coalesce(m.name, o.alternative_name) = 'someOtherValue'

Now uses

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                  |     2 |   140 |     6   (0)| 00:00:01 |
|   1 |  UNION-ALL                             |                  |       |       |            |          |
|   2 |   NESTED LOOPS                         |                  |     1 |    70 |     3   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                        |                  |     1 |    70 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| AS_OPTIONAL      |     1 |    35 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | AS_OPTIONAL_NAME |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN                  | SYS_C0033779785  |     1 |       |     0   (0)| 00:00:01 |
|   7 |    TABLE ACCESS BY INDEX ROWID         | AS_MAIN          |     1 |    35 |     1   (0)| 00:00:01 |
|*  8 |   FILTER                               |                  |       |       |            |          |
|   9 |    NESTED LOOPS OUTER                  |                  |     1 |    70 |     3   (0)| 00:00:01 |
|  10 |     TABLE ACCESS BY INDEX ROWID BATCHED| AS_MAIN          |     1 |    35 |     2   (0)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN                  | AS_MAIN_NAME     |     1 |       |     1   (0)| 00:00:01 |
|  12 |     TABLE ACCESS BY INDEX ROWID        | AS_OPTIONAL      |     1 |    35 |     1   (0)| 00:00:01 |
|* 13 |      INDEX UNIQUE SCAN                 | SYS_C0033779786  |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access("O"."ALTERNATIVE_NAME"='someOtherValue')
   6 - access("M"."PKCOL"="O"."PKCOL")
   8 - filter("O"."ALTERNATIVE_NAME" IS NULL)
  11 - access("M"."NAME"='someOtherValue')
  13 - access("M"."PKCOL"="O"."PKCOL"(+))

Excellent!

A challenge for the reader wouldbe to make this as dynamic as possible – what if we want it to support different columns being selected? What if we wanted to support additional predicates? What if we wanted to support this as a subquery? What if we want to support different outer join conditions (would that even be valid to rewrite)? What if we want to support different tables all-together?

Obviously, the best solution would be for the optimizer to do it for us (and cost the transformation)!

Conditional Outer Joins forcing Nested Loops

This is yet another post inspired by a problem on the ODC https://community.oracle.com/message/14839371#14839371. The OP hasn’t mentioned any performance problem, but this is a classic recipe for issues. We want to left join to another table but only when a condition against our driving table has been met.

Let’s copy the table from the post and demo the conditional outer join scenario (I’ve changed the data types to remove a few other quirks):

drop table oe_order_headers_a;
create table oe_order_headers_a  
(header_id         number   not null
,order_number      number   not null
,order_source_id   number
,attribute20       number
);  
  
insert into oe_order_headers_a(header_id, order_number, order_source_id, attribute20) values (1411757, 10000103, null, null );  
insert into oe_order_headers_a(header_id, order_number, order_source_id, attribute20) values (1411758, 50000408, 1021, 1411757);  
insert into oe_order_headers_a(header_id, order_number, order_source_id, attribute20) values (1411759, 50000409, 2, 1411757);  
insert into oe_order_headers_a(header_id, order_number, order_source_id, attribute20) values (1411760, 50000410, null, null);  
insert into oe_order_headers_a(header_id, order_number, order_source_id, attribute20) values (1411761, 10000105, 2, 1411761);  
insert into oe_order_headers_a(header_id, order_number, order_source_id, attribute20) values (1411762, 50000411, 1021, 1411762);  
insert into oe_order_headers_a(header_id, order_number, order_source_id, attribute20) values (1411763, 50000412, 12, 1411763);  
   
select oh.order_number  
      ,oh.order_source_id  
      ,oh2.order_number parent_order_number  
from   oe_order_headers_a oh  
left join oe_order_headers_a oh2 
  on  (oh.order_source_id = 1021 and oh.attribute20 = oh2.header_id)  
where  oh.order_number between 50000000 and 60000000;


ORDER_NUMBER ORDER_SOURCE_ID PARENT_ORDER_NUMBER
------------ --------------- -------------------
    50000408            1021            10000103
    50000411            1021            50000411
    50000412              12
    50000409               2
    50000410

And if we look at the explain plan and some stats (of a second execution) via SQL*Plus autotrace, we get:

ANDY@pdb1>set autotrace on
ANDY@pdb1>/

Execution Plan
----------------------------------------------------------
Plan hash value: 2432841757

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                    |     5 |   325 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN OUTER   |                    |     5 |   325 |     7  (15)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| OE_ORDER_HEADERS_A |     5 |   195 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| OE_ORDER_HEADERS_A |     7 |   182 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OH"."ATTRIBUTE20"="OH2"."HEADER_ID"(+) AND
              "OH"."ORDER_SOURCE_ID"=CASE  WHEN ("OH2"."HEADER_ID"(+) IS NOT NULL) THEN 1021
              ELSE 1021 END )
   2 - filter("OH"."ORDER_NUMBER">=50000000 AND "OH"."ORDER_NUMBER"<=60000000)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
        825  bytes sent via SQL*Net to client
        572  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

The plan looks ok, we’re doing a full scan of the same table twice but in reality we would probably have two different tables here. If you take a look at the predicates section, you’ll notice that Uncle Oracle has done something very clever, it’s transformed our conditional outer join condition (the oh.order_source_id = 1021 bit) into another equality join condition:

 "OH"."ORDER_SOURCE_ID"=CASE  WHEN ("OH2"."HEADER_ID"(+) IS NOT NULL) THEN 1021
              ELSE 1021 END )

This is what allows us to do the hash join, you’re not allowed to do a hash join without equality conditions – after all, you have to hash some expression and compare that to a hashed expression from the other side.

Let’s take it a step further, what about if we wanted to also outer join to our other table when our order_source_id is 1022. We would instinctively just change our equality condition into an IN condition, right? Well, here’s what happens:

ANDY@pdb1>select oh.order_number
  2        ,oh.order_source_id
  3        ,oh2.order_number parent_order_number
  4  from   oe_order_headers_a oh
  5  left join oe_order_headers_a oh2
  6    on  (oh.order_source_id in (1021,1022) and oh.attribute20 = oh2.header_id)
  7  where  oh.order_number between 50000000 and 60000000
  8  /

ORDER_NUMBER ORDER_SOURCE_ID PARENT_ORDER_NUMBER
------------ --------------- -------------------
    50000408            1021            10000103
    50000409               2
    50000410
    50000411            1021            50000411
    50000412              12


Execution Plan
----------------------------------------------------------
Plan hash value: 1726314521

-------------------------------------------------------------------------------------------
| Id  | Operation            | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                    |     5 |   260 |    18   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER  |                    |     5 |   260 |    18   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | OE_ORDER_HEADERS_A |     5 |   195 |     3   (0)| 00:00:01 |
|   3 |   VIEW               |                    |     1 |    13 |     3   (0)| 00:00:01 |
|*  4 |    FILTER            |                    |       |       |            |          |
|*  5 |     TABLE ACCESS FULL| OE_ORDER_HEADERS_A |     1 |    26 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OH"."ORDER_NUMBER">=50000000 AND "OH"."ORDER_NUMBER"<=60000000)
   4 - filter("OH"."ORDER_SOURCE_ID"=1021 OR "OH"."ORDER_SOURCE_ID"=1022)
   5 - filter("OH"."ATTRIBUTE20"="OH2"."HEADER_ID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         23  consistent gets
          0  physical reads
          0  redo size
        826  bytes sent via SQL*Net to client
        572  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

We are now doing a nested loop to that full tablescan of our outer table! Taking a look at the predicates we can see that Uncle Oracle is no longer working his magic for us: we are doing a filter predicate on line 4 followed by the join predicate on line 5. On the bright side, we are only doing that full table scan once per row in our driving table that matches our conditional join predicate but this is 2 times as many as we really need to.

I’ll note that if our main condition was on something that could use an index and was fine running via nested loop then we wouldn’t have a problem.

How do we address this? We do the work for the optimizer, we transform that conditional join predicate into an equality join predicate.

If we tried to write this how Oracle rewrote the first query internally, it’s easy to see why it wasn’t able this time: it’s difficult! The best I could come up with following the pattern was:

select oh.order_number  
      ,oh.order_source_id  
      ,oh2.order_number parent_order_number  
from   oe_order_headers_a oh  
      ,oe_order_headers_a oh2 
where  oh.attribute20 = oh2.header_id (+)
and    oh.order_source_id in (case  when (oh2.header_id(+) is not null) then 1021 else 1021 end , case  when (oh2.header_id(+) is not null) then 1022 else 1022 end)
and    oh.order_number between 50000000 and 60000000;

ERROR at line 7:
ORA-01719: outer join operator (+) not allowed in operand of OR or IN

This doesn’t even solve our equality requirement, we would have to expand the IN into two union all branches… it’s just not worth it.

Luckily, there is a more obvious transformation which is to do the transformation on the left side of the join. I would naturally write it like this:

select oh.order_number  
      ,oh.order_source_id  
      ,oh2.order_number parent_order_number  
from   oe_order_headers_a oh  
left join oe_order_headers_a oh2 
  on  (case when oh.order_source_id in (1021,1022) then oh.attribute20 else null end = oh2.header_id)  
where  oh.order_number between 50000000 and 60000000;


Execution Plan
----------------------------------------------------------
Plan hash value: 2432841757

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                    |     5 |   325 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN OUTER   |                    |     5 |   325 |     7  (15)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| OE_ORDER_HEADERS_A |     5 |   195 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| OE_ORDER_HEADERS_A |     7 |   182 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OH2"."HEADER_ID"(+)=CASE "OH"."ORDER_SOURCE_ID" WHEN 1021 THEN
              "OH"."ATTRIBUTE20" WHEN 1022 THEN "OH"."ATTRIBUTE20" ELSE NULL END )
   2 - filter("OH"."ORDER_NUMBER">=50000000 AND "OH"."ORDER_NUMBER"<=60000000)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
        825  bytes sent via SQL*Net to client
        572  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

Back to the standard performance of our hash join from before, bingo!