您的位置:首页 > 运维架构

Hash join与nested loops join的效率比较

2013-10-14 20:34 381 查看
创建测试表
SQL> create table t as select rownum id from dba_objects;

Table created.

SQL> create index idx_t on t(id);

Index created.

SQL> exec dbms_stats.gather_table_stats('sys','t',cascade=>true);

PL/SQL procedure successfully completed.

SQL> create table t1 as select id,'T1' name from t where id<1000;

Table created.

SQL> create index idx_t1 on t1(id);

Index created.

SQL> exec dbms_stats.gather_table_stats('sys','t1',cascade=>true);

PL/SQL procedure successfully completed.

上面的表中t表有7万多行,t1表1000行数据

SQL> set autotrace traceonly;
SQL> select * from t,t1 where t1.id<100 and t.id=t1.id;

99 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3627535484

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 98 | 1176 | 5 (20)| 00:00:01 |
|* 1 | HASH JOIN | | 98 | 1176 | 5 (20)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T | 99 | 495 | 2 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T1 | 99 | 693 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T"."ID"="T1"."ID")
2 - access("T"."ID"<100)
3 - filter("T1"."ID"<100)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
13 consistent gets
0 physical reads
0 redo size
2881 bytes sent via SQL*Net to client
590 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
99 rows processed

从上面看出CBO默认根据统计信息计算使用Hash join方式,逻辑读13

SQL> select /*+ use_nl(t,t1) */ * from t,t1 where t1.id<100 and t.id=t1.id;

99 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 967599162

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 98 | 1176 | 58 (2)| 00:00:01 |
| 1 | NESTED LOOPS | | 98 | 1176 | 58 (2)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T | 99 | 495 | 2 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T1 | 1 | 7 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T"."ID"<100)
3 - filter("T1"."ID"<100 AND "T"."ID"="T1"."ID")

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
412 consistent gets
0 physical reads
0 redo size
2881 bytes sent via SQL*Net to client
590 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
99 rows processed
当使用hint强制使用nested join方式逻辑读为412,远远超过了hash join

SQL> select /*+ use_merge(t,t1) */ * from t,t1 where t1.id<100 and t.id=t1.id;

99 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3760646495

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 98 | 1176 | 5 (20)| 00:00:01 |
| 1 | MERGE JOIN | | 98 | 1176 | 5 (20)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T | 99 | 495 | 2 (0)| 00:00:01 |
|* 3 | SORT JOIN | | 99 | 693 | 3 (34)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| T1 | 99 | 693 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T"."ID"<100)
3 - access("T"."ID"="T1"."ID")
filter("T"."ID"="T1"."ID")
4 - filter("T1"."ID"<100)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
13 consistent gets
0 physical reads
0 redo size
2881 bytes sent via SQL*Net to client
590 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
99 rows processed

当使用hint强制使用Merge join方式逻辑读与hash join一致,但是多了一次内存排序,因此最优的还是hash join

继续下面实验:
也可以通过hint改变基数的方式走nested loop join
SQL> select /*+ cardinality(t1 1) */ * from t,t1 where t1.id<100 and t.id=t1.id;

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 12 | 3 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 7 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_T | 1 | 5 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------

从这里就可以看出在连接一个小表的时候CBO会选择nested loop join

SQL> delete t1 where id>10;

989 rows deleted.

SQL> commit;

Commit complete.

SQL> select count(*) from t1;

COUNT(*)
----------
10

SQL> select * from t,t1 where t.id=t1.id;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1557655968

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 110 | 12 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 10 | 110 | 12 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_T | 1 | 5 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T"."ID"="T1"."ID")

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
14 consistent gets
0 physical reads
0 redo size
804 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed

从上面看出当连接一张小表的时候,CBO计算出使用nested loops join方式

SQL> select /*+ use_hash(t,t1) */ * from t,t1 where t.id=t1.id;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1444793974

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 110 | 35 (3)| 00:00:01 |
|* 1 | HASH JOIN | | 10 | 110 | 35 (3)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T | 73961 | 361K| 32 (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T"."ID"="T1"."ID")

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
122 consistent gets
0 physical reads
0 redo size
804 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed

当我们强制使用hash join 可以看到逻辑读增加到了122,相差100多

SQL> select /*+ use_merge(t,t1) */ * from t,t1 where t.id=t1.id;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4143939902

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 110 | 169 (2)| 00:00:03 |
| 1 | MERGE JOIN | | 10 | 110 | 169 (2)| 00:00:03 |
| 2 | INDEX FULL SCAN | IDX_T | 73961 | 361K| 166 (1)| 00:00:02 |
|* 3 | SORT JOIN | | 10 | 60 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T"."ID"="T1"."ID")
filter("T"."ID"="T1"."ID")

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
804 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
10 rows processed

使用merge join 逻辑读最低,但是多了一次内存排序,CBO最终还是选择了nested loops join

SQL> drop index idx_t ;

Index dropped.

SQL> exec dbms_stats.gather_table_stats('sys','t',cascade=>true);

PL/SQL procedure successfully completed.

SQL> select * from t,t1 where t.id=t1.id;

20 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1444793974

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 220 | 67 (3)| 00:00:01 |
|* 1 | HASH JOIN | | 20 | 220 | 67 (3)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 10 | 60 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T | 147K| 722K| 64 (2)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T"."ID"="T1"."ID")

Statistics
----------------------------------------------------------
135 recursive calls
0 db block gets
250 consistent gets
0 physical reads
0 redo size
1097 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
20 rows processed

如上当删除索引后cbo会选择hash join

总结:当多表连接的时候,如果是大表之间hash join速度最快,如果其中一个表相对来说比较小,并且关联字段上有索引的时候nested loops 速度比较快,上面实验也证实了如果CBO判断基数不准确会导致错误的执行计划。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签:  oracle Hash join nes