Tuesday 24 November 2009

Table Expressions, Cardinality, SYS_OP_ATG and KOKBF$

Recently I was involved in discussion on OTN form (see http://forums.oracle.com/forums/thread.jspa?threadID=986657)
and I found an interesting information inside the posted execution plan which was in "Predicate information" section and was

filter(SYS_OP_ATG(VALUE(KOKBF$),2,3,2)='HQPRM003')


Googling showed some interesting information in Vlad Sadilovski's blog that KOKBF$ is a generic alias given the table expressions. What was more interesting was undocumented SYS_OP_ATG function which attracted my interest. So I decided to create a simple case to observe what is going on.


SQL> CREATE TYPE demo_object AS OBJECT
2 ( ID NUMBER(6)
3 , first_name VARCHAR2(30)
4 , last_name VARCHAR2(20));
5 /

Type created.

SQL> create type demo_object_t as AS TABLE OF demo_object;
2 /

Type created.

SQL> CREATE or replace FUNCTION demo_func( p_num_rows number )
2 RETURN demo_object_t
3 as
4 r_demo_object_t demo_object_t;
5 BEGIN
6 SELECT demo_object(rownum,dbms_random.string('k',30), dbms_random.string('k',20))
7 BULK COLLECT INTO r_demo_object_t
8 FROM dual connect by level <= p_num_rows;
9 RETURN r_demo_object_t;
10 END;
11 /

Function created.

SQL> select * from table(demo_func(10));

ID FIRST_NAME LAST_NAME
---------- ------------------------------ --------------------
1 POXXOPVQXYPXUIGCQLVBWLNXXLUEQC JAIAUCWHIMCRRJQXWUJY
2 SVYYKMZRBERFQFZKAQQOOSDNKMIJSG JMAOAUMHSWGNMYRCBSAG
3 LKNMLDIGIDTJECSRZDNSGHECNGVEFL TUUFTVPIVIXGRWYZFMCF
4 UIAEJZRIRROBBCINFTKNAXJEPFGKPP LPNGQCTLNLMEJWLGTWQJ
5 MGKMJJOHOZPCYEXEDVOWYSYUJMDDBF PTQACVZBNFCJOMQKJNRA
6 DSOODWTOPLPDAFZRXXGKKGECIPCHHB MHWPUTQJOSGOFOKCALJW
7 VTTVCLHRWLXGVMAKWQPOJDMLYUIEOS MXABXPTDHQENPKEHYCFZ
8 QJPWITNBARETOWMUVOQHABEEYNKLOD UYOEOUROCXHXXVNQILYU
9 PZDJEPXAHXXIAOVTOGKPSCMNDORNQX IZLRNYOLEWSPWDMCQMCY
10 FBOKISOVXUNFASOLGOZZPGVHSEFDAG MXOEIQSZMLTBREPGBDSM

10 rows selected.



So now we are ready to see what is going on.


SQL> select * from table(dbms_xplan.display);

Plan hash value: 2056388305

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
-----------------------------------------------------------------------------------------------


The estimated cardinality is 8168 rows what is a default value used for 8k block size.
The actual cardinality was 10 rows. With additional condition id=8 the estimate changes to 82 rows.


SQL> explain plan for
2 select a.id, a.first_name,a.last_name
3 from table(demo_func(10)) a
4 where a.id = 8;

Explained.

SQL> select * from table(dbms_xplan.display);

Plan hash value: 2056388305

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 82 | 164 | 29 (0)| 00:00:01 |
|* 1 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)


I wanted to correct the cardinality estimate and I used undocumented hint CARDINALITY which is available since Oracle9i.


SQL> explain plan for
2 select /*+ cardinality( a 1 ) */ a.id, a.first_name,a.last_name
3 from table(demo_func(10)) a
4 where a.id = 8;

Explained.

SQL> select * from table(dbms_xplan.display);

Plan hash value: 2056388305

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 29 (0)| 00:00:01 |
|* 1 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)


I also used event 10053 to get CBO trace and there I found the following information:


SQL> alter session set events '10053 trace name context forever';

Session altered.

SQL> explain plan for
2 select /*+ cardinality( a 1 ) */ a.id, a.first_name,a.last_name
3 from table(demo_func(10)) a
4 where a.id = 8;

Explained.

SQL> alter session set events '10053 trace name context off';



***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: KOKBF$ Alias: KOKBF$ (NOT ANALYZED)
#Rows: 8168 #Blks: 100 AvgRowLen: 100.00
Access path analysis for KOKBF$
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for KOKBF$[KOKBF$]
Table: KOKBF$ Alias: KOKBF$
Card: Original: 8168.000000 >> Single Tab Card adjusted from:81.680000 to:1.000000
Rounded: 1 Computed: 1.00 Non Adjusted: 81.68
Access Path: TableScan
Cost: 29.29 Resp: 29.29 Degree: 0
Cost_io: 29.00 Cost_cpu: 6429744
Resp_io: 29.00 Resp_cpu: 6429744
Best:: AccessPath: TableScan
Cost: 29.29 Degree: 1 Resp: 29.29 Card: 1.00 Bytes: 0

***************************************


My next experiment was creating a join between two row sets returned by the function and I have used the cardinality hint again.


SQL> explain plan for
2 select /*+ cardinality(a 1) cardinality (b 1 ) */ a.id, a.first_name,a.last_name
3 from table(demo_func(10)) a, table(demo_func(10)) b
4 where a.id = 8
5 and a.id = b.id;

Explained.

SQL> select * from table(dbms_xplan.display);

Plan hash value: 2170227849

------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 59 (2)| 00:00:01 |
|* 1 | HASH JOIN | | 1 | 4 | 59 (2)| 00:00:01 |
|* 2 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
|* 3 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


1 - access(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))
2 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)
3 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)


In the CBO trace file I found this interesting piece of information where the SQL was
transformed in the following way:


SELECT /*+ OPT_ESTIMATE (TABLE "B" ROWS=1.000000 ) OPT_ESTIMATE (TABLE "A" ROWS=1.000000 ) */
"A"."SYS_NC_ROWINFO$"."ID" "ID",
"A"."SYS_NC_ROWINFO$"."FIRST_NAME" "FIRST_NAME",
"A"."SYS_NC_ROWINFO$"."LAST_NAME" "LAST_NAME"
FROM (SELECT VALUE (kokbf$) "SYS_NC_ROWINFO$"
FROM TABLE ("JOC"."DEMO_FUNC" (10)) "KOKBF$") "A",
(SELECT VALUE (kokbf$) "SYS_NC_ROWINFO$"
FROM TABLE ("JOC"."DEMO_FUNC" (10)) "KOKBF$") "B"
WHERE "A"."SYS_NC_ROWINFO$"."ID" = 8
AND "A"."SYS_NC_ROWINFO$"."ID" = "B"."SYS_NC_ROWINFO$"."ID"

Query block SEL$1 (#0) unchanged
PL/SQL function (DEMO_FUNC) is not secure.


Here CBO transformed CARDINALITY hint to a new undocumented hint used in SQL Profiles
OPT_ESTIMATE (TABLE "B" ROWS=1.000000 ) which does exactly the same -
scales down the number of rows to 1 in this case.

Another very interesting piece of information here is how the statement is internally transformed to use SYS_NC_ROWINFO$ which is a virtual column that references the row XMLType instance and is there since Oracle9i. In later releases this was replaced with a new OBJECT_VALUE pseudo column but obviousl CBO still uses that internally (for more details see "Oracle Database SQL Language Reference")

There was also other interesting transformation present which was:


SELECT /*+ OPT_ESTIMATE (TABLE "KOKBF$" ROWS=1.000000 ) OPT_ESTIMATE (TABLE "KOKBF$" ROWS=1.000000 ) */
sys_op_atg (VALUE (kokbf$), 1, 2, 0) "ID",
sys_op_atg (VALUE (kokbf$), 2, 3, 0) "FIRST_NAME",
sys_op_atg (VALUE (kokbf$), 3, 4, 0) "LAST_NAME"
FROM TABLE ("JOC"."DEMO_FUNC" (10)) "KOKBF$",
TABLE ("JOC"."DEMO_FUNC" (10)) "KOKBF$"
WHERE sys_op_atg (VALUE (kokbf$), 1, 2, 0) = 8
AND sys_op_atg (VALUE (kokbf$), 1, 2, 0) = sys_op_atg (VALUE (kokbf$), 1, 2, 0)

Query block SEL$2B5797BE (#0) unchanged
query block SEL$1 transformed to SEL$2B5797BE (#0)
Considering Query Transformations on query block SEL$2B5797BE (#0)
**************************
Query transformations (QT)
**************************
CBQT: copy not possible on query block SEL$2B5797BE (#0) because nested table
CBQT bypassed for query block SEL$2B5797BE (#0): Cannot copy query block.


Here Oracle uses undocumented function sys_op_atg which returns a column from object type.
The first parameter specifies the starting column, the second one specifies the ending column and for the third parameter I was not able to figure out what does it mean. In all cases which I have found on Interenet or in Oracle manuals this parameter had always value of 2.

Then CBO was trying to push the filter condition inside function execution what was of course not possible.

FPD: Considering simple filter push in query block SEL$2B5797BE (#0)


In next step CBO generated transitive predicate which could be observed under the "Predicate Information" section of execution plan.


SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8 AND SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=SYS_OP_ATG(VALUE(KOKBF$),1,2,2)
try to generate transitive predicate from check constraints for query block SEL$2B5797BE (#0)
finally: SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8 AND SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=SYS_OP_ATG(VALUE(KOKBF$),1,2,2) AND SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8

FPD: transitive predicates are generated in query block SEL$2B5797BE (#0)
SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8


From my presentation at IUOG back in 2005 about the Extensible optimizer I used the trick to associate the function with statistics type. Of course I could use also DYNAMIC_SAMPLING but this is more elegant way. Here is what I have done:


SQL> CREATE OR REPLACE TYPE demo_func_stats_type AS OBJECT (
2 dummy_attribute NUMBER,
3 STATIC FUNCTION odcigetinterfaces (p_interfaces OUT SYS.odciobjectlist)
4 RETURN NUMBER,
5 STATIC FUNCTION odcistatstablefunction (
6 p_function IN SYS.odcifuncinfo,
7 p_stats OUT SYS.odcitabfuncstats,
8 p_args IN SYS.odciargdesclist,
9 p_num_rows IN NUMBER
10 )
11 RETURN NUMBER
12 );
13 /

Type created.

SQL> CREATE OR REPLACE TYPE BODY demo_func_stats_type AS
2 STATIC FUNCTION odcigetinterfaces (p_interfaces OUT SYS.odciobjectlist)
3 RETURN NUMBER IS
4 BEGIN
5 p_interfaces :=
6 SYS.odciobjectlist (SYS.odciobject ('SYS', 'ODCISTATS2'));
7 RETURN odciconst.success;
8 END odcigetinterfaces;
9 STATIC FUNCTION odcistatstablefunction (
10 p_function IN SYS.odcifuncinfo,
11 p_stats OUT SYS.odcitabfuncstats,
12 p_args IN SYS.odciargdesclist,
13 p_num_rows IN NUMBER
14 )
15 RETURN NUMBER IS
16 BEGIN
17 p_stats := SYS.odcitabfuncstats (p_num_rows);
18 RETURN odciconst.success;
19 END odcistatstablefunction;
20 END;
21 /

Type body created.

SQL> explain plan for
2 select a.id, a.first_name,a.last_name
3 from table(demo_func(10)) a, table(demo_func(1000)) b
4 where a.id = 8
5 and a.id = b.id;

Explained.

SQL> select * from table(dbms_xplan.display);

Plan hash value: 2170227849

------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 67 | 268 | 59 (2)| 00:00:01 |
|* 1 | HASH JOIN | | 67 | 268 | 59 (2)| 00:00:01 |
|* 2 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
|* 3 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


1 - access(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))
2 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)
3 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)



Here we can see that the cardinality estimate is quite off - it should be 1. Therefore I now use my statistics type and associate it with demo_func.


SQL> associate statistics with functions demo_func using demo_func_stats_type;

Statistics associated.

SQL> explain plan for
2 select a.id, a.first_name,a.last_name
3 from table(demo_func(10)) a, table(demo_func(1000)) b
4 where a.id = 8
5 and a.id = b.id;

Explained.

SQL> select * from table(dbms_xplan.display);

Plan hash value: 2170227849

------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 59 (2)| 00:00:01 |
|* 1 | HASH JOIN | | 1 | 4 | 59 (2)| 00:00:01 |
|* 2 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
|* 3 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | | | | |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


1 - access(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))
2 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)
3 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)


As we can see the CBO now correctly estimates the number of rows.
In the CBO trace file we can find the following information:


***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: KOKBF$ Alias: KOKBF$ (NOT ANALYZED)
#Rows: 8168 #Blks: 100 AvgRowLen: 100.00
***********************
Table Stats::
Table: KOKBF$ Alias: KOKBF$ (NOT ANALYZED)
#Rows: 8168 #Blks: 100 AvgRowLen: 100.00
Access path analysis for KOKBF$
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for KOKBF$[KOKBF$]
Calling user-defined function card function...
Bind :3 Value 10
JOC.DEMO_FUNC_STATS_TYPE.ODCIStatsTableFunction returned:
num_rows : 10
Table: KOKBF$ Alias: KOKBF$
Card: Original: 10.000000 Rounded: 1 Computed: 0.10 Non Adjusted: 0.10
Access Path: TableScan
Cost: 29.29 Resp: 29.29 Degree: 0
Cost_io: 29.00 Cost_cpu: 6429744
Resp_io: 29.00 Resp_cpu: 6429744
Best:: AccessPath: TableScan
Cost: 29.29 Degree: 1 Resp: 29.29 Card: 0.10 Bytes: 0

Access path analysis for KOKBF$
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for KOKBF$[KOKBF$]
Calling user-defined function card function...
Bind :3 Value 1000
JOC.DEMO_FUNC_STATS_TYPE.ODCIStatsTableFunction returned:
num_rows : 1000
Table: KOKBF$ Alias: KOKBF$
Card: Original: 1000.000000 Rounded: 10 Computed: 10.00 Non Adjusted: 10.00
Access Path: TableScan
Cost: 29.29 Resp: 29.29 Degree: 0
Cost_io: 29.00 Cost_cpu: 6429744
Resp_io: 29.00 Resp_cpu: 6429744
Best:: AccessPath: TableScan
Cost: 29.29 Degree: 1 Resp: 29.29 Card: 10.00 Bytes: 0

***************************************


As we can see the user-defined statistics function which CBO calls during the optimization phase resolves the problem of erroneously determined cardinality.

One can disassociate statistics with the following command:

SQL> disassociate statistics from functions demo_func;

Statistics disassociated.


Now we can check what is going on in run time. We use gather_plan_statistics hint on statement level to gather all execution statistics.


SQL> select /*+ gather_plan_statistics */ a.id, a.first_name,a.last_name
2 from table(demo_func(10)) a,
3 table(demo_func(1000)) b
4 where a.id = 8
5 and a.id = b.id;

ID FIRST_NAME LAST_NAME
---------- ------------------------------ --------------------
8 EMRFXDHQISONDDVTYLMPQDRDIWAHMG BMUUKQWCUCKBYGGKMBEM


SQL> select * from table(dbms_xplan.display_cursor(NULL,NULL, 'allstats last'));

SQL_ID 4kyvk4q692gqu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ a.id, a.first_name,a.last_name
from table(demo_func(10)) a, table(demo_func(1000)) b where a.id = 8
and a.id = b.id

Plan hash value: 2170227849

----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
|* 1 | HASH JOIN | | 1 | 1 | 1 |00:00:00.14 | 3 |
|* 2 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | 1 | | 1 |00:00:00.01 | 3 |
|* 3 | COLLECTION ITERATOR PICKLER FETCH| DEMO_FUNC | 1 | | 1 |00:00:00.14 | 0 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))
2 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)
3 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)=8)


From this plan we can see that each row source is returned by "COLLECTION ITERATOR PICKLER FETCH" and that function DEMO_FUNC is executed only once (Starts = 1).

Of course there is no chance to push predicate as a.id=b.id and a.id=8 inside the function execution. Therefore Oracle has to use SYS_OP_ATG function to decompose returned object and perform the join operation only after function calls complete there execution.

We can see that this could be very inefficient. To resolve the possible performance issues one should not use functions (this is true also for pipelined functions) if the function result is used in later join operations or the majority of returned rows is filtered out later on. One possible solution would be to pass some conditions as function parameters, but the best solution is to replace a function call with inline view which contains the query used inside function. Then the CBO has more possibility and place for optimization and can optimize the whole statement.

If I return back to the origin for this post - we got the best result when we had replaced function call with inline view containing the query from the function. The CBO can't really push any kind of predicates inside the function call and the only possibility is to use function parameter for passing them.

This was also a good example to point out the problems of cardinality estimates for table functions and how one can use extensible optimizer to resolve it.

3 comments:

  1. Hi Joze,

    a small addtition to this:
    >The estimated cardinality is 8168 rows what is a default value used for 8k block size.
    I remember reading metalink note which explained it as "we assume a default of a 100 blocks table with 100 bytes per row for table() functions", which results in a figure of 8168. Unfortunately today I can't find the doc ID.

    ReplyDelete
  2. Timur,

    Thanks for pointing that out. I must say that I have never seen that note but I just accepted this number as some kind of a default.

    ReplyDelete
  3. There's an article about setting cardinality for pipelined and table functions at
    www.oracle-developer.net/display.php?id=427.

    ReplyDelete