Chinar Aliyev

12c Adaptive Sampling: part II
This is the second article of my 12c Adaptive Sampling series. In part I, we saw how dynamic sampling estimates single table and group by cardinality. In this part we will focus on how Adaptive Dynamic Sampling works with joins.

The model

Let's create the following two tables and gather statistics without histograms
 
create table t1
as select * from dba_users;
 
create table t2
as select * from dba_objects;
 
 
execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1');
execute dbms_stats.gather_table_stats(user,'t2',method_opt=>'for all columns size 1');
 
Then I will set optimizer_dynamic_sampling to the new 12c value (11)
 
alter session set optimizer_dynamic_sampling=11;
 
And finally I will execute the following simple two-table join query
 
 
select count(*)
from
t1,t2
where t1.username=t2.owner;
 
The execution plan of the above query is

SQL_ID  a28zr3kmq7psn, child number 0
-------------------------------------
---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |  52070 |  55220 |
|   3 |    TABLE ACCESS FULL| T1   |      1 |     42 |     42 |
|   4 |    TABLE ACCESS FULL| T2   |      1 |  92254 |  92254 |
---------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."USERNAME"="T2"."OWNER")
Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)
 

The investigation

 

In previous releases, dynamic sampling was not applied to estimate join cardinality. Here the estimate is quite good. Let's look at the corresponding 10053 and 10046 trace files, respectively:
 
10053 trace file
SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN
Join Card: 92254.000000 = outer (42.000000) * inner (92254.000000) * sel (0.023810)
>> Join Card adjusted from 92254.000000 to 52070.040000 due to adaptive dynamic sampling, prelen=2
Adjusted Join Cards: adjRatio=0.564420 cardHjSmj=52070.040000 cardHjSmjNPF=52070.040000 cardNlj=52070.040000 cardNSQ=52070.040000 cardNSQ_na=92254.000000
Join Card - Rounded: 52070 Computed: 52070.040000
Outer table: T1 Alias: T1
10046 trace file
SQL ID: 86cd0yqkg18hx
Plan Hash: 3696410285
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
optimizer_features_enable(default) no_parallel result_cache(snapshot=3600)
*/ SUM(C1)
FROM
(SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2#0") */ 1 AS C1 FROM
"T2" SAMPLE BLOCK(50.5051, 8) SEED(1) "T2#0", "T1" "T1#1" WHERE
("T1#1"."USERNAME"="T2#0"."OWNER")) innerQuery
 
 
call    count    cpu elapsed   disk  query current  rows
------- -----  ----- -------  ----- ------ -------  ----
Parse       1   0.00    0.00      0      2       0     0
Execute     1   0.00    0.00      0      0       0     0
Fetch       1   0.03    0.03      0    763       0     1
------- -----  ----- -------  ----- ------ -------  ----
total       3   0.03    0.03      0    765       0     1
 
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 105     (recursive depth: 1)
 
Rows   Row Source Operation
-----  --------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
  26298    HASH JOIN
     42     TABLE ACCESS FULL T1
  44761     TABLE ACCESS SAMPLE T2


we can infer that Oracle estimated the cardinality of just a fraction of the join and then extrapolated to the whole join (let me call this method 1).
By sampling only table T2, at a sampling rate of 50.5051%, Oracle estimated that the hash join between the T2 sample and T1 returns 26,298 rows. Then, via a simple arithmetic guess, Oracle estimated that the entire hash join operation will generate

(26,298 rows / 50.5051) * 100 = 52,069.99 ≈ 52,070
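This scaling step can be reproduced with a quick calculation (the sampled row count and the sampling percentage are the ones reported in the trace above):

```python
# Scale the join cardinality observed on the 50.5051% block sample of T2
# up to the full join (the "guess" step of method 1).
sampled_join_rows = 26298   # rows returned by the DS_SVC query's hash join
sample_percent = 50.5051    # SAMPLE BLOCK percentage applied to T2

full_join_estimate = sampled_join_rows / sample_percent * 100
print(round(full_join_estimate))  # 52070, matching E-Rows of the hash join
```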
You might ask why table T1 has not been sampled. As already explained in part I of this series, the whole strategy Oracle follows with Adaptive Dynamic Sampling is “Estimate a cardinality for a fraction of the single table, then guess the cardinality of the whole result set”. To follow the same strategy when estimating join cardinality, Oracle may use two methods:
  1. Sample a fraction of the large table in the join, join that sampled fraction to the second (small) table, and use the cardinality resulting from that join to derive the cardinality of the entire join result set, as we saw through the above “guess” formula.
  2. When both tables in the join are large (what does “large” mean here? When does the database consider a segment “large”? I think it depends on the I/O count, i.e. how much I/O is needed to read the segment, and this criterion may be controllable via a threshold), Oracle samples a fraction of both tables, using different sampling sizes (depending on the table sizes), joins these “small” result sets and computes the cardinality of that small fraction of the join.
How efficient are these methods in practice? To answer this question, I will proceed step by step.

create table t3
as
select OWNER, TABLE_NAME,
       COLUMN_NAME, DATA_TYPE, DATA_TYPE_MOD, DATA_TYPE_OWNER,
       DATA_LENGTH, DATA_PRECISION, DATA_SCALE, NULLABLE, COLUMN_ID,
       DEFAULT_LENGTH, NUM_DISTINCT, LOW_VALUE, HIGH_VALUE,
       DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED, SAMPLE_SIZE,
       CHARACTER_SET_NAME, CHAR_COL_DECL_LENGTH,
       GLOBAL_STATS, USER_STATS, AVG_COL_LEN, CHAR_LENGTH, CHAR_USED,
       V80_FMT_IMAGE, DATA_UPGRADED, HISTOGRAM, DEFAULT_ON_NULL,
       IDENTITY_COLUMN, SENSITIVE_COLUMN,
       EVALUATION_EDITION, UNUSABLE_BEFORE, UNUSABLE_BEGINNING
from dba_tab_columns;

execute dbms_stats.gather_table_stats(user,'t3',method_opt=>'for all columns size 1');

alter system flush shared_pool;

alter session set optimizer_dynamic_sampling=11;

select
count(*)
from
t2
,t3
where
t2.owner=t3.owner and
t2.object_name=t3.table_name and
t2.object_type='TABLE';


 
And the execution plan is
 

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |  31573 |  30286 |
|*  3 |    TABLE ACCESS FULL| T2   |      1 |   1938 |   2479 |
|   4 |    TABLE ACCESS FULL| T3   |      1 |    110K|    110K|
---------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T2"."OWNER"="T3"."OWNER" AND "T2"."OBJECT_NAME"="T3"."TABLE_NAME")
3 - filter("T2"."OBJECT_TYPE"='TABLE')
Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)

 

 
As you can see, the optimizer estimates are not bad. In this example the 10046 trace file reveals that Oracle executed two dynamic sampling statements. The first one concerns the sampling of table T2, as shown below (this choice seems to be due to the presence of a predicate on T2 using a constant value):
 

Plan Hash: 3252009800
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
optimizer_features_enable(default) no_parallel result_cache(snapshot=3600)
*/ SUM(C1)
FROM
(SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2") */ 1 AS C1 FROM "T2"
SAMPLE BLOCK(50.5051, 8) SEED(1) "T2" WHERE ("T2"."OBJECT_TYPE"='TABLE'))
innerQuery

call     count       cpu    elapsed       disk      query    current     rows
------- ------  -------- ---------- ---------- ---------- ----------  -------
Parse        1      0.00       0.00          0          2          0        0
Execute      1      0.00       0.00          0          0          0        0
Fetch        1      0.01       0.01        118        760          0        1
------- ------  -------- ---------- ---------- ---------- ----------  -------
total        3      0.01       0.01        118        762          0        1

Rows     Row Source Operation
-------  ---------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
    979    TABLE ACCESS SAMPLE T2

(979 rows / 50.5051) * 100 = 1938.42 ≈ 1938
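The same arithmetic, as a short sketch, confirms the T2 single-table estimate (979 sampled rows at a 50.5051% block sample):

```python
# Scale the sampled row count of T2 back up to the full table.
t2_estimate = 979 / 50.5051 * 100
print(round(t2_estimate, 2))  # 1938.42, the value injected later via OPT_ESTIMATE
```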


 
While the second sampling statement concerns the join operation, using the OPT_ESTIMATE hint, as shown below:
 

SQL ID: 6wxwa7hvmmnb3
Plan Hash: 2702931906
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
optimizer_features_enable(default) no_parallel result_cache(snapshot=3600)
OPT_ESTIMATE(@"innerQuery", TABLE, "T2#1", ROWS=1938.42) */ SUM(C1)
FROM
(SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T3#0") */ 1 AS C1 FROM
"T3" SAMPLE BLOCK(44.1258, 8) SEED(1) "T3#0", "T2" "T2#1" WHERE
("T2#1"."OBJECT_TYPE"='TABLE') AND ("T2#1"."OWNER"="T3#0"."OWNER") AND
("T2#1"."OBJECT_NAME"="T3#0"."TABLE_NAME")) innerQuery

call     count       cpu    elapsed       disk      query    current     rows
------- ------  -------- ---------- ---------- ---------- ----------  -------
Parse        1      0.00       0.00          0          2          0        0
Execute      1      0.00       0.00          0          0          0        0
Fetch        1      0.04       0.04        143       2352          0        1
------- ------  -------- ---------- ---------- ---------- ----------  -------
total        3      0.05       0.05        143       2354          0        1

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
  13932    HASH JOIN
   2479     TABLE ACCESS FULL T2
  49701     TABLE ACCESS SAMPLE T3

 
 
 
Finally, putting the two pieces together: Oracle started by sampling table T2 and estimated a T2 cardinality of 1938. It then fed this estimate into the join sampling query (via the OPT_ESTIMATE hint) and sampled the join using a sample size of 44.1258%, with which it found a join cardinality of 13,932. Using this sampled join cardinality, Oracle finally derived the cardinality of the entire join result set:

13932 * 100 / 44.1258 = 31573.36 ≈ 31573
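Chaining the two steps in a short sketch (all numbers come from the traces above):

```python
# Step 1: single-table estimate for T2 from its own 50.5051% block sample.
t2_card = 979 / 50.5051 * 100          # ~1938.42, passed on via OPT_ESTIMATE

# Step 2: the join query samples T3 at 44.1258% and its hash join returns
# 13932 rows; scaling that back up yields the final join cardinality.
join_card = 13932 / 44.1258 * 100
print(round(join_card))                # 31573, the E-Rows of the hash join
```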
 
This estimation follows method 2 (see point 2 above), where both tables in the join are large. But what happens with really big tables?

Having investigated a case where the first method (see point 1 above) was used and one where the second method was used, I will now grow both tables considerably and repeat the experiment. The growth is obtained by issuing several insert into t2 (t3) select * from t2 (t3) statements until I got the following size picture:

SEGMENT_NAME            MB
------------       -------
T2                 1600.00
T3                 1813.00

TABLE_NAME     NUM_ROWS   BLOCKS
------------   --------   ------
T3               110996     1813
T2                92254      158

These inserts have not been followed by a call to the dbms_stats package, so the statistics are now stale:
select count(*) from t2;  -- returns 11,808,512 rows
select count(*) from t3;  -- returns 14,207,488 rows
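Note that the actual counts are exactly 128 times the stale dictionary figures, i.e. each table was doubled seven times by the insert ... select statements:

```python
# Each "insert into t select * from t" doubles the table, so seven rounds
# multiply the row count by 2**7 = 128.
assert 92254 * 2**7 == 11_808_512     # T2: stale num_rows -> actual count
assert 110996 * 2**7 == 14_207_488    # T3: stale num_rows -> actual count
print("each table was doubled 7 times (x128)")
```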
 
The following query
 
select
count(*)
from
t2
,t3
where
t2.object_name=t3.table_name and
t2.object_type='TABLE';
 
 
when executed with optimizer_dynamic_sampling set to 11, produces the 10046 trace file excerpt reproduced below:

 

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
optimizer_features_enable(default) no_parallel result_cache(snapshot=3600)
*/ SUM(C1)
FROM
(SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2") */ 1 AS C1 FROM "T2"
SAMPLE BLOCK(0.391869, 8) SEED(1) "T2" WHERE ("T2"."OBJECT_TYPE"='TABLE')) innerQuery

Rows     Row Source Operation
-------  --------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
   1114    TABLE ACCESS SAMPLE T2

 
 
In this case Oracle uses method 1 to estimate the T2 table cardinality, as shown in the above trace file: because of the filter predicate on T2, Oracle expects the predicate to reduce the data coming from T2 and therefore considers sampling worthwhile.
 
Additionally the same trace file shows the following SQL statement.
 

SQL ID: 22fjpqb6fyafj
Plan Hash: 527772662
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
optimizer_features_enable(default) no_parallel result_cache(snapshot=3600)
OPT_ESTIMATE(@"innerQuery", TABLE, "T2#1", ROWS=284278.875) */ SUM(C1)
FROM
(SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2#1") */ 1 AS C1 FROM
"T2" SAMPLE BLOCK(0.391869, 8) SEED(1) "T2#1", "T3" "T3#0" WHERE
("T2#1"."OBJECT_NAME"="T3#0"."TABLE_NAME") AND ("T2#1"."OBJECT_TYPE"=
'TABLE')) innerQuery

call     count       cpu    elapsed       disk      query    current     rows
------- ------  -------- ---------- ---------- ---------- ----------  -------
Parse        1      0.00       0.00          0          0          0        0
Execute      1      0.00       0.00          0          0          0        0
Fetch        1      0.93       1.09      18893      27006          0        0
------- ------  -------- ---------- ---------- ---------- ----------  -------
total        3      0.93       1.09      18893      27006          0        0

Rows     Row Source Operation
-------  --------------------
      0  RESULT CACHE
      0   SORT AGGREGATE
      0    HASH JOIN (cr=0 pr=0 pw=0 …)
1298249     TABLE ACCESS FULL T3
      0     TABLE ACCESS SAMPLE T2 (cr=0 pr=0 pw=0 …)

 
 
 
Notice that the above query has not been fully executed: after scanning about 1.3 million rows from table T3, Oracle stopped the query, because it had worked out that the sampling it was doing would not be efficient in this situation.

So, how did Oracle manage to get a join cardinality in this case?
 
From the 10053 trace file of the original query we can isolate the lines below, related to the join estimation:
 
Best NL cost: 24711383775.180614
resc: 24711383775.180614 resc_io: 24667427343.000000 resc_cpu: 475388814033344
resp: 24711383775.180614 resp_io: 24667427343.000000 resc_cpu: 475388814033344
SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN
Join Card: 587113.315151 = outer (284278.875000) * inner (110996.000000) * sel (1.8607e-05)
Join Card - Rounded: 587113 Computed: 587113.315151
 
Here the outer table (T2) cardinality was estimated using dynamic sampling, but for the inner table (T3) Oracle used num_rows from the dictionary (object statistics). These dictionary statistics are stale and far from reality. Sampling was applied only to T2, because T2 has a filter column, so Oracle assumed sampling would be worthwhile there. From user_tab_col_statistics:
 
 

TABLE     COLUMN        NUM_DISTINCT   DENSITY
T2        OBJECT_NAME   53744          0.0000186067281929145
T3        TABLE_NAME    9003           0.000111074086415639
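Plugging these figures into the standard join cardinality formula reproduces the 10053 trace value; the selectivity used is the density of T2.OBJECT_NAME, i.e. 1/num_distinct:

```python
# join card = outer_card * inner_card * join_selectivity
outer = 284278.875        # T2 cardinality from dynamic sampling
inner = 110996            # T3 num_rows from the (stale) dictionary
sel = 1 / 53744           # density of T2.OBJECT_NAME, ~1.8607e-05

join_card = outer * inner * sel
print(round(join_card))   # 587113, matching "Join Card: 587113.315151"
```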

 
From this information the CBO picked a join selectivity of 1/num_distinct of t2.object_name (which is exactly the density of that column). The final execution plan was:
 

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |    587K|    500M|
|   3 |    TABLE ACCESS FULL| T3   |      1 |    110K|     14M|
|*  4 |    TABLE ACCESS FULL| T2   |      1 |    284K|    317K|
---------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T2"."OBJECT_NAME"="T3"."TABLE_NAME")
4 - filter("T2"."OBJECT_TYPE"='TABLE')
Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)


It is interesting to note that the T3 table cardinality has not been estimated using sampling; instead Oracle used the stale dictionary statistics, and this is the reason why the join cardinality estimate was so bad. Re-gathering statistics will give a better T3 table cardinality estimate, but even with refreshed statistics (including histograms) the join cardinality estimate will still not be good. Indeed, after I gathered statistics (including histograms), the new execution plan is:
 

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |     75M|    500M|
|*  3 |    TABLE ACCESS FULL| T2   |      1 |    284K|    317K|
|   4 |    TABLE ACCESS FULL| T3   |      1 |     14M|     14M|
---------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T2"."OBJECT_NAME"="T3"."TABLE_NAME")
3 - filter("T2"."OBJECT_TYPE"='TABLE')
Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)

 
In this case, with both dynamic sampling at level 11 and fresh, representative statistics present, the T2 table cardinality is estimated at 284K versus an actual 317K. So even when accurate statistics are available, with dynamic sampling at level 11 Oracle will still sample table T2, because it expects sampling to be efficient on a table carrying a filter predicate.
 
This is why I decided to check the same query without the filter predicate:
 

select
count(*)
from
t2
,t3
where
t2.object_name=t3.table_name
-- and t2.owner=t3.owner  -- I commented this predicate part
;


 
The above query now contains only an equality on the join columns. In this case, as the corresponding CBO trace file shows, Oracle completely ignores dynamic sampling. It was skipped not because of the large size of the tables, but because there is no filter predicate on either table and fresh, accurate statistics exist (the CBO would skip dynamic sampling here even if the statistics were stale). The trace shows clearly that Oracle used dictionary statistics to estimate both the single-table and the join cardinalities:
 

SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T2[T2]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

*** 2014-10-10 06:24:57.445
** Performing dynamic sampling initial checks. **
** Not using old style dynamic sampling since ADS is enabled.
Table: T2  Alias: T2
Card: Original: 11810048.000000  Rounded: 11810048  Computed: 11810048.000000  Non Adjusted: 11810048.000000
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T3[T3]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

*** 2014-10-10 06:24:57.445
** Performing dynamic sampling initial checks. **
** Not using old style dynamic sampling since ADS is enabled.
Table: T3  Alias: T3
Card: Original: 14208000.000000  Rounded: 14208000  Computed: 14208000.000000  Non Adjusted: 14208000.000000
Best NL cost: 1029989931239.724365
resc: 1029989931239.724365  resc_io: 1024777132847.000000  resc_cpu: 56376414617314440
resp: 1029989931239.724365  resp_io: 1024777132847.000000  resc_cpu: 56376414617314440
SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN
Join Card:  3122156184.578744 = outer (11810048.000000) * inner (14208000.000000) * sel (1.8607e-05)

*** 2014-10-10 06:24:59.238
Join Card - Rounded: 3122156185 Computed: 3122156184.578744
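A short sketch confirms that this final figure is just the plain dictionary-statistics formula again, with selectivity 1/53744:

```python
# join card = outer_card * inner_card * join_selectivity, no sampling involved
outer = 11_810_048        # T2 cardinality (dictionary)
inner = 14_208_000        # T3 cardinality (dictionary)
sel = 1 / 53744           # 1/num_distinct(T2.OBJECT_NAME), ~1.8607e-05

print(round(outer * inner * sel))  # 3122156185, matching the trace
```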

 

 
In the next (and final) article I will discuss the optimizer's additional mechanisms for estimating join cardinality, and I will summarize the whole series.