Real Time Materialized Views pre 12.2? (Part 1)

Oh, yes!

Like many others, I was excited when Oracle announced that in 12.2 you could have always fresh MViews with a change to your fast refresh Mviews. If you want to read further then the first step is the documentation https://docs.oracle.com/database/122/SQLRF/CREATE-MATERIALIZED-VIEW.htm ENABLE ON QUERY COMPUTATION is the clause you’re after. Franck Pachot has a very detailed write up as well https://blog.dbi-services.com/12cr2-real-time-materialized-view-on-query-computation/. Now, you might be in the same boat as many – you don’t have 12.2 because you’re using on-premises and your company doesn’t have plans to upgrade to 12.2 because upgrading is time consuming… but you still want to use real time MVs don’t you?

So do I! So I gave it a shot, my theory: if Oracle can refresh a materialized view using only a materialized view log then one can view the real time data using just the materialized data and the materialized view log. We will be limited by only materialized views that are fast refreshable, and we’ll have to work out the logic ourselves. I will present solutions for some basic (but useful!) materialized views, you will have to adapt them to fit your scenario which means you will have to understand what’s going on. This will be delivered in parts, this post will serve as an introduction, it may even be enough to get you started on your own solution!

First things first, we’re going to need to understand what goes into a materialized view log. With so many options, it’s easy to get a little mixed up so let’s do some sample cases. Try these on your own system, I tested them on my home system running Oracle 12.1.0.2 EE.

First a basic table with a primary key and some sample data:

create table base_t_1 (pk_col number primary key, number_col number, string_col varchar2(40));
create sequence seq_1;
insert into base_t_1 (pk_col, number_col, string_col)
select seq_1.nextVal, trunc(dbms_random.value(0,100)), dbms_random.string('x',10) 
from   dual connect by rownum <=1000;
commit;

We’ve got a basic table with a thousand rows and a few random numbers in one column and random strings in another column. Now we create a materialized view log on the table, without using any special clauses:

create materialized view log on base_t_1;
Materialized view log created.

That was easy, so what does the materialized view log look like? We can use the user_mview_logs data dictionary view to describe the mview logs of the logged in user , the master column is the table that the mview log is on. I’m using a saved script called print, I’ll share this another time but essentially it is the same as Tom Kyte’s print_table stored procedure but as an anonymous block.

ANDY@orcl>@print "select * from user_mview_logs where master = 'BASE_T_1'"

LOG_OWNER                     : ANDY
MASTER                        : BASE_T_1
LOG_TABLE                     : MLOG$_BASE_T_1
LOG_TRIGGER                   :
ROWIDS                        : NO
PRIMARY_KEY                   : YES
OBJECT_ID                     : NO
FILTER_COLUMNS                : NO
SEQUENCE                      : NO
INCLUDE_NEW_VALUES            : NO
PURGE_ASYNCHRONOUS            : NO
PURGE_DEFERRED                : NO
PURGE_START                   :
PURGE_INTERVAL                :
LAST_PURGE_DATE               :
LAST_PURGE_STATUS             :
NUM_ROWS_PURGED               :
COMMIT_SCN_BASED              : NO
STAGING_LOG                   : NO
-----------------

ANDY@orcl>desc MLOG$_BASE_T_1
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- -------------------
 PK_COL                                                         NUMBER
 SNAPTIME$$                                                     DATE
 DMLTYPE$$                                                      VARCHAR2(1)
 OLD_NEW$$                                                      VARCHAR2(1)
 CHANGE_VECTOR$$                                                RAW(255)
 XID$$                                                          NUMBER
We can see our primary key column has automatically been added. Ifwe included rowid in the create mview log statement we’d get that too. The log is currently empty, so let’s see what we get when we insert a new row:
insert into base_t_1 (pk_col, number_col, string_col)
select seq_1.nextVal, trunc(dbms_random.value(0,100)), dbms_random.string('x',10) 
from   dual;
@print "select * from MLOG$_BASE_T_1"

PK_COL                        : 1001
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 562962838326956
-----------------

How can we interpret this? It’s telling us the row affected has a primary key of 1001 (the next value in the sequence), it was added to the log using a dml type of I (Insert), and it’s the New version of the row being logged. The other columns: snaptime$$ is used by Oracle to coordinate refreshes of multiple mviews (and we may not get into that), change_vector$$ is an internal representation of what’s changed – it’s just a hex string that we can convert to binary, here FE->11111110, some of those 1s must mean we’ve made new values for 3 of our columns, we don’t have much information to base this on so let’s just leave it to one side for now. The final column is xid$$, that must be something to do with the transaction of the insert, I’ve kept the transaction open so let’s see what we can find. Here is a query that fetches the v$transaction row for the users session (by looking up the taddr value from v$session for the Sid of the users session) 

ANDY@orcl>@print "select * from v$transaction where addr = (select taddr from v$session where sid = sys_context('userenv','sid'))"

ADDR                          : 00007FF866C8CCB8
XIDUSN                        : 2
XIDSLOT                       : 3
XIDSQN                        : 3756
UBAFIL                        : 5
UBABLK                        : 148
UBASQN                        : 774
UBAREC                        : 33
STATUS                        : ACTIVE
START_TIME                    : 02/28/17 22:56:30
START_SCNB                    : 7161539
START_SCNW                    : 0
START_UEXT                    : 0
START_UBAFIL                  : 5
START_UBABLK                  : 148
START_UBASQN                  : 774
START_UBAREC                  : 30
SES_ADDR                      : 00007FF86AEE7730
FLAG                          : 3587
SPACE                         : NO
RECURSIVE                     : NO
NOUNDO                        : NO
PTX                           : NO
NAME                          :
PRV_XIDUSN                    : 0
PRV_XIDSLT                    : 0
PRV_XIDSQN                    : 0
PTX_XIDUSN                    : 0
PTX_XIDSLT                    : 0
PTX_XIDSQN                    : 0
DSCN-B                        : 0
DSCN-W                        : 0
USED_UBLK                     : 1
USED_UREC                     : 4
LOG_IO                        : 69
PHY_IO                        : 2
CR_GET                        : 9640
CR_CHANGE                     : 0
START_DATE                    : 28/02/2017 22:56:30
DSCN_BASE                     : 0
DSCN_WRAP                     : 0
START_SCN                     : 7161539
DEPENDENT_SCN                 : 0
XID                           : 02000300AC0E0000
PRV_XID                       : 0000000000000000
PTX_XID                       : 0000000000000000
CON_ID                        : 0
-----------------
v$transaction has an xid column in hexadecimal and our mview log has one using decimal, let’s compare:
ANDY@orcl>select to_char(562962838326956,'fm0xxxxxxxxxxxxx') bin_to_hex from dual;

BIN_TO_HEX
---------------
02000300000eac
Close, but perhaps we’re just comparing to the wrong XID. If we consider the 3 elements of the transaction describing where the undo exists:
@print "select XIDUSN, XIDSlot, XIDSQN, to_char(XIDUSN,'fm0X') XIDUSN_hex, to_char(xidSlot,'fm0XXX') xidSlot_hex, to_char(xidSQN,'fm0XXXXXXXXXX') XIDSQN_hex from v$transaction where addr = (select taddr from v$session where sid = sys_context('userenv','sid'))"

ANDY@orcl>/
XIDUSN                        : 2
XIDSLOT                       : 3
XIDSQN                        : 3756
XIDUSN_HEX                    : 02
XIDSLOT_HEX                   : 0003
XIDSQN_HEX                    : 00000000EAC
-----------------
That’s more like it! Maybe we should move on with another DML? Let’s update the row we just inserted and see what happens
update base_t_1 set number_col = 2 where pk_col = 1001;
ANDY@orcl>@print "select * from MLOG$_BASE_T_1"

PK_COL                        : 1001
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 562962838326956
-----------------
PK_COL                        : 1001
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : U
CHANGE_VECTOR$$               : 04
XID$$                         : 562962838326956
-----------------
First thing you should notice is the original row still exists, you should appreciate that it might complicate matters for us (on the other hand, without knowing the full history of a row would also give us problems). The new row tells us we did an update to the row with primary key 1001. The old_new$$ column has been set to ‘U’, this must mean that the row has been updated – it doesn’t matter whether you’re looking at the old or new version as all we have is the primary key, to get the information about the row we’d have to join back to the master table. The change_vector$$ translates (hex to bin) to 00000100, we know that we’ve modified the second column of base_t_1. What happens when we modify it again for a second time?
update base_t_1 set string_col = '3' where pk_col = 1001;
ANDY@orcl>@print "select * from MLOG$_BASE_T_1"

----
Including the previous rows
-----------------
PK_COL                        : 1001
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : U
CHANGE_VECTOR$$               : 08
XID$$                         : 562962838326956
-----------------
This change_vector$$ translates to 00001000
So have we got anything there that we can use for a materialized view? Not immediately, in order to get the data needed for a fresh MView we’d need to go back to the master table to get the new row. That doesn’t sound very effective, and presumably we can do better? Yes!
Let’s rollback, and create the materialized view log with some extra columns:
ANDY@orcl>create materialized view log on base_t_1 with (number_col, string_col);

Materialized view log created.
And the same tests:
insert into base_t_1 (pk_col, number_col, string_col)
select seq_1.nextVal, trunc(dbms_random.value(0,100)), dbms_random.string('x',10) 
from   dual;
ANDY@orcl>@print "select * from MLOG$_BASE_T_1"

PK_COL                        : 1003
NUMBER_COL                    : 0
STRING_COL                    : 92V1H71O2K
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 1407490847674749
-----------------

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.05
ANDY@orcl>
ANDY@orcl>update base_t_1 set number_col = 2 where pk_col = 1003;

1 row updated.

Elapsed: 00:00:00.00
ANDY@orcl>@print "select * from MLOG$_BASE_T_1"

PK_COL                        : 1003
NUMBER_COL                    : 0
STRING_COL                    : 92V1H71O2K
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 1407490847674749
-----------------
PK_COL                        : 1003
NUMBER_COL                    : 0
STRING_COL                    : 92V1H71O2K
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : O
CHANGE_VECTOR$$               : 04
XID$$                         : 1407490847674749
-----------------
I’ve only done one insert and one update here just to demonstrate the first problem: even though we have the rest of the columns from the table, we only have the Old state of the row that was updated. Let’s refer to the documentation to see what we can do better

new_values_clause::=

Description of new_values_clause.gif follows

So trying with including new values:
create materialized view log on base_t_1 with (number_col, string_col) including new values;

insert into base_t_1 (pk_col, number_col, string_col)
select seq_1.nextVal, trunc(dbms_random.value(0,100)), dbms_random.string('x',10) 
from   dual;
ANDY@orcl>@print "select * from MLOG$_BASE_T_1"

PK_COL                        : 1004
NUMBER_COL                    : 54
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 1126041640766467
-----------------

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
ANDY@orcl>
ANDY@orcl>update base_t_1 set number_col = 2 where pk_col = 1004;

1 row updated.

Elapsed: 00:00:00.01
ANDY@orcl>@print "select * from MLOG$_BASE_T_1"

PK_COL                        : 1004
NUMBER_COL                    : 54
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 1126041640766467
-----------------
PK_COL                        : 1004
NUMBER_COL                    : 54
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : O
CHANGE_VECTOR$$               : 04
XID$$                         : 1126041640766467
-----------------
PK_COL                        : 1004
NUMBER_COL                    : 2
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : 04
XID$$                         : 1126041640766467
-----------------
We’ve got the complete new version of the row. Can you spot the next problem? Is it obvious which version of the 1004 row is the latest? We know that we couldn’t have inserted the row after the update. For the update, the new version of the row has the old_new$$ flag as ‘N’, so we’ve got that. What happens when we modify the row further?
ANDY@orcl>update base_t_1 set number_col = 1 where pk_col = 1004;

1 row updated.

Elapsed: 00:00:00.03
ANDY@orcl>@print "select * from MLOG$_BASE_T_1"

PK_COL                        : 1004
NUMBER_COL                    : 54
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 1126041640766467
-----------------
PK_COL                        : 1004
NUMBER_COL                    : 54
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : O
CHANGE_VECTOR$$               : 04
XID$$                         : 1126041640766467
-----------------
PK_COL                        : 1004
NUMBER_COL                    : 2
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : 04
XID$$                         : 1126041640766467
-----------------
PK_COL                        : 1004
NUMBER_COL                    : 2
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : O
CHANGE_VECTOR$$               : 04
XID$$                         : 1126041640766467
-----------------
PK_COL                        : 1004
NUMBER_COL                    : 1
STRING_COL                    : YB3XWVR9GI
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : 04
XID$$                         : 1126041640766467
-----------------
We could be “clever” and weave through the different versions of the rows linking New’s to Old’s etc and joining on the column values – it may work but I sure hope there’s a better way. What else is in the documentation?

“Use the WITH clause to indicate whether the materialized view log should record the primary key, rowid, object ID, or a combination of these row identifiers when rows in the master are changed. You can also use this clause to add a sequence to the materialized view log to provide additional ordering information for its records.”

Let’s try the sequence clause, rolling back and dropping the mview log first.

create materialized view log on base_t_1 with sequence (number_col, string_col) including new values;

 into base_t_1 (pk_col, number_col, string_col)
select seq_1.nextVal, trunc(dbms_random.value(0,100)), dbms_random.string(‘x’,10)
from   dual;
update base_t_1 set number_col = 2 where pk_col = 1005;
update base_t_1 set number_col = 3 where pk_col = 1005;
@print “select * from MLOG$_BASE_T_1”
PK_COL                        : 1005
NUMBER_COL                    : 14
STRING_COL                    : SH5LQKHTIV
SEQUENCE$$                    : 310001
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : I
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : FE
XID$$                         : 1688897104907956
-----------------
PK_COL                        : 1005
NUMBER_COL                    : 14
STRING_COL                    : SH5LQKHTIV
SEQUENCE$$                    : 310002
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : O
CHANGE_VECTOR$$               : 04
XID$$                         : 1688897104907956
-----------------
PK_COL                        : 1005
NUMBER_COL                    : 2
STRING_COL                    : SH5LQKHTIV
SEQUENCE$$                    : 310003
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : 04
XID$$                         : 1688897104907956
-----------------
PK_COL                        : 1005
NUMBER_COL                    : 2
STRING_COL                    : SH5LQKHTIV
SEQUENCE$$                    : 310004
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : O
CHANGE_VECTOR$$               : 04
XID$$                         : 1688897104907956
-----------------
PK_COL                        : 1005
NUMBER_COL                    : 3
STRING_COL                    : SH5LQKHTIV
SEQUENCE$$                    : 310005
SNAPTIME$$                    : 01/01/4000 00:00:00
DMLTYPE$$                     : U
OLD_NEW$$                     : N
CHANGE_VECTOR$$               : 04
XID$$                         : 1688897104907956
-----------------
Perfect! We’ve got our sequence$$ column now, the highest value per primary key tells us the most up to date version of the row.
I think that’s enough for part one. Part two we will create a simple materialized view on this base table (select * from base_table) and see how we can combine the mview log and the mview to create a fresh picture of the base table for query. I plan on publishing part two Wednesday 8th March so stay tuned, if you want further clarification on anything then leave me a comment or reach out to me on twitter @AndrewSayer_
Advertisements