Oracle：嵌套回圈的內行源-估計行不正確？-有解無憂

我的理解是，嵌套回圈連接的內部行源的解釋計劃中的估計行數反映了該嵌套回圈僅一次迭代的行數。

在以下示例中，解釋計劃的第 6 步是嵌套回圈連接的內部行源，該連接每次通過一個 ROWID 獲取一行。因此，它的估計行數應該為 1（每個 ROWID 只有 1 行）。

為什么第 6 步table access by index ROWID顯示 100（我希望它顯示 1）？

使用 Oracle 19c 企業版。

drop table "C";
drop table "P";

create table "P"
  ( p_id  NUMBER
  , p_num NUMBER
  , p_pad CHAR(200 byte)
  )
;

insert
  into "P" (p_id, p_num, p_pad)
  select level
       , mod(level-1,200/2)
       , ' '
    from dual
    connect by level <= 200
;

create table "C"
  ( c_id  NUMBER
  , p_id  NUMBER
  , c_pad CHAR(200 byte)
  )
;

insert /*  append enable_parallel_dml parallel (auto) */
  into "C" (c_id, p_id, c_pad)
  with
    "D" as
    ( select /*  materialize */ null from dual connect by level <= 100
    )
    select rownum      c_id
         , p_id        p_id
         , ' '         c_pad
      from "P", "D"
;
commit;

create index IX_P on p (p_num);   
create unique index IU_P on p (p_id);
alter table p add constraint UK_P unique (p_id) rely using index IU_P enable validate;
alter table C add constraint R_C foreign key (p_id) references p (p_id) rely enable validate;
create index IR_C on _C (p_id);
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'P', cascade => true);
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'C', cascade => true);

select /*  optimizer_features_enable('19.1.0')
           use_nl (P C) */
       *
  from "P"
       join "C"
         on P.p_id = C.p_id
            and P.p_num = 1
;
plan hash value: 3840235794

----------------------------------------------------------------------------------------------
| id  | Operation                             | name | rows  | Bytes | cost (%CPU)| time     |
----------------------------------------------------------------------------------------------
|   0 | select statement                      |      |   200 | 83000 |   205   (0)| 00:00:01 |
|   1 |  nested LOOPS                         |      |   200 | 83000 |   205   (0)| 00:00:01 |
|   2 |   nested LOOPS                        |      |   200 | 83000 |   205   (0)| 00:00:01 |
|   3 |    table access by index ROWID BATCHED| P    |     2 |   414 |     3   (0)| 00:00:01 |
|*  4 |     index range scan                  | IX_P |     2 |       |     1   (0)| 00:00:01 |
|*  5 |    index range scan                   | IR_C |   100 |       |     1   (0)| 00:00:01 |
|   6 |   table access by index ROWID         | C    |   100 | 20800 |   101   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("P"."P_NUM"=1)
   5 - access("P"."P_ID"="C"."P_ID")

外行源步驟 3 乘以內行源步驟 5 = 嵌套回圈步驟 2。

但是，外行源步驟 2 乘以內行源步驟 6 <> 嵌套回圈步驟 1。

我同意第 1 步的總數應該是 200，但不明白為什么第 6 步的估計行數為 100。

為什么第 6 步的估計行數為 100 而不是 1？

提前致謝。

uj5u.com熱心網友回復：

在這里，您可以看到外部預期的行數 NESTED LOOP

select p_id, count(*) from C where p_id in (
select p_id from P where p_num = 1)
group by p_id;

      P_ID   COUNT(*)
---------- ----------
         2        100
       102        100

所以實際上每次迭代都希望得到 100 行。

如果您使用提示運行查詢，gather_plan_statistics您可以看到Starts實際行數和總行數A-Rows。

select /*  gather_plan_statistics use_nl (P C) */
       *
  from "P"
       join "C"
         on P.p_id = C.p_id
            and P.p_num = 1

SQL_ID  927pggk6scpwt, child number 0
-------------------------------------
select /*  gather_plan_statistics use_nl (P C) */        *   from "P"   
     join "C"          on P.p_id = C.p_id             and P.p_num = 1
 
Plan hash value: 2326820011
 
--------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |      |      1 |        |    200 |00:00:00.01 |     213 |
|   1 |  NESTED LOOPS                         |      |      1 |    200 |    200 |00:00:00.01 |     213 |
|   2 |   NESTED LOOPS                        |      |      1 |    200 |    200 |00:00:00.01 |      13 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| P    |      1 |      2 |      2 |00:00:00.01 |       5 |
|*  4 |     INDEX RANGE SCAN                  | IX_P |      1 |      2 |      2 |00:00:00.01 |       3 |
|*  5 |    INDEX RANGE SCAN                   | IR_C |      2 |    100 |    200 |00:00:00.01 |       8 |
|   6 |   TABLE ACCESS BY INDEX ROWID         | C    |    200 |    100 |    200 |00:00:00.01 |     200 |
--------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("P"."P_NUM"=1)
   5 - access("P"."P_ID"="C"."P_ID")

即操作 5開始了兩次（列Starts），總共有 200 行（列A-Rows）

操作6開始，每次200次得到一個排。

uj5u.com熱心網友回復：

我認為這個 Oracle 檔案段落很好地解釋了這種情況：

多個嵌套回圈操作偶爾會出現在一個連接的執行計劃中，這表明 Oracle 使用了嵌套回圈批處理優化技術。此方法所做的是將兩個行源的單個連接轉換為驅動行源與探測行源的一個副本的連接，該副本連接到 ROWID 上的自身副本；由于我們現在有三個行源，我們至少需要兩個嵌套回圈。用于在 ROWID 上執行自聯接的探測行源副本用于過濾行，因此它將TABLE ACCESS BY ... ROWID在執行計劃中具有相應的條目。這種基于成本的優化通常可以減少 I/O，盡管執行計劃可能無法透明地顯示收益。

您示例中的第 6 步是“探測行源副本”；它基本上是表 C 的快取版本，因此它有 100 行。但是它的成本在所有外部嵌套回圈之間共享——該表只被訪問過一次——所以它已經包含在第 2 步的總數中。（我認為？）

轉載請註明出處，本文鏈接：https://www.uj5u.com/net/356378.html

標籤：甲骨文嵌套循环 sql-执行计划行号基数估计

上一篇：如何在沒有固定長度的欄位值子字串上連接表？

下一篇：按每個組擴展缺失日期的資料