Oracle IN vs. EXISTS and NOT IN vs. NOT EXISTS: How They Work and How They Perform

2013-10-18 16:02
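Many people are still unsure about IN versus EXISTS and NOT IN versus NOT EXISTS. Some go as far as banning NOT IN and requiring NOT EXISTS everywhere. Is NOT EXISTS really always more efficient? The experiments below put that to the test.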

First, create some test data:

SQL> drop table test1 purge;

SQL> drop table test2 purge;

SQL> create table test1 as select * from dba_objects where rownum <=1000;

SQL> create table test2 as select * from dba_objects;

SQL> exec dbms_stats.gather_table_stats(user,'test1');

SQL> exec dbms_stats.gather_table_stats(user,'test2');

SQL> set autotrace traceonly

IN and EXISTS: mechanics and performance experiment:

SQL> select * from test1 t1 where t1.object_id in (select t2.object_id from test2 t2);

1000 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3819917785

----------------------------------------------------------------------------

| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |       |   997 | 84745 |   168   (3)| 00:00:03 |

|*  1 |  HASH JOIN SEMI    |       |   997 | 84745 |   168   (3)| 00:00:03 |

|   2 |   TABLE ACCESS FULL| TEST1 |  1000 | 80000 |     5   (0)| 00:00:01 |

|   3 |   TABLE ACCESS FULL| TEST2 | 50687 |   247K|   162   (2)| 00:00:02 |

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

         95  consistent gets

          0  physical reads

          0  redo size

      45820  bytes sent via SQL*Net to client

       1111  bytes received via SQL*Net from client

         68  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       1000  rows processed

SQL> select *  from test1 t1

  2   where exists (select 1 from test2 t2 where t1.object_id = t2.object_id);

1000 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3819917785

----------------------------------------------------------------------------

| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |       |   997 | 84745 |   168   (3)| 00:00:03 |

|*  1 |  HASH JOIN SEMI    |       |   997 | 84745 |   168   (3)| 00:00:03 |

|   2 |   TABLE ACCESS FULL| TEST1 |  1000 | 80000 |     5   (0)| 00:00:01 |

|   3 |   TABLE ACCESS FULL| TEST2 | 50687 |   247K|   162   (2)| 00:00:02 |

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

Statistics

----------------------------------------------------------

          0  recursive calls

          0  db block gets

         95  consistent gets

          0  physical reads

          0  redo size

      45820  bytes sent via SQL*Net to client

       1111  bytes received via SQL*Net from client

         68  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

       1000  rows processed
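Conclusion: in Oracle 10g, IN and EXISTS are effectively the same in this test. Both statements get the same plan (Plan hash value 3819917785), a HASH JOIN SEMI between the two tables, and both do 95 consistent gets. A 10053 event trace also shows that the two statements end up transformed into the same SQL.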
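To make the semi-join idea concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module rather than Oracle, so it only illustrates that the IN form and the EXISTS form are logically equivalent; it says nothing about Oracle's optimizer. The helper `hash_semi_join` and the tiny data set are hypothetical, and the helper is a simplified model of a hash semi join, not Oracle's implementation.

```python
import sqlite3

def hash_semi_join(outer_ids, inner_ids):
    # Simplified model (assumption): hash one input, probe with the other.
    # Each outer id is returned at most once, however many inner rows match.
    lookup = set(inner_ids)
    return [oid for oid in outer_ids if oid in lookup]

outer = list(range(1, 11))                    # stands in for test1
inner = [1, 2, 3, 4, 5, 1, 2, 3, 20, 30]      # stands in for test2, with duplicates

conn = sqlite3.connect(":memory:")
conn.execute("create table test1 (object_id integer)")
conn.execute("create table test2 (object_id integer)")
conn.executemany("insert into test1 values (?)", [(i,) for i in outer])
conn.executemany("insert into test2 values (?)", [(i,) for i in inner])

in_rows = [r[0] for r in conn.execute(
    "select t1.object_id from test1 t1 "
    "where t1.object_id in (select t2.object_id from test2 t2) "
    "order by t1.object_id")]

exists_rows = [r[0] for r in conn.execute(
    "select t1.object_id from test1 t1 "
    "where exists (select 1 from test2 t2 where t1.object_id = t2.object_id) "
    "order by t1.object_id")]

semi_rows = hash_semi_join(outer, inner)

print(in_rows, exists_rows, semi_rows)
```

All three lists contain the same ids, and the duplicates in the inner table do not multiply the outer rows. That is the defining property of a semi join as opposed to a regular join.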

NOT IN and NOT EXISTS: mechanics and performance experiment:

An example where NOT EXISTS is more efficient than NOT IN


SQL> select count(*) from test1 where object_id not in(select object_id from test2);

Execution Plan

----------------------------------------------------------

Plan hash value: 3641219899

-----------------------------------------------------------------------------

| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

-----------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |       |     1 |     4 | 81076   (2)| 00:16:13 |

|   1 |  SORT AGGREGATE     |       |     1 |     4 |            |          |

|*  2 |   FILTER            |       |       |       |            |          |

|   3 |    TABLE ACCESS FULL| TEST1 |  1000 |  4000 |     5   (0)| 00:00:01 |

|*  4 |    TABLE ACCESS FULL| TEST2 |     1 |     5 |   162   (2)| 00:00:02 |

-----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE

              LNNVL("OBJECT_ID"<>:B1)))

   4 - filter(LNNVL("OBJECT_ID"<>:B1))

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

       9410  consistent gets

          0  physical reads

          0  redo size

        407  bytes sent via SQL*Net to client

        385  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          1  rows processed

SQL> select count(*) from test1 t1 where not exists

    (select 1 from test2 t2 where t1.object_id=t2.object_id);

Execution Plan

----------------------------------------------------------

Plan hash value: 240185659

-----------------------------------------------------------------------------

| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

-----------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |       |     1 |     9 |   168   (3)| 00:00:03 |

|   1 |  SORT AGGREGATE     |       |     1 |     9 |            |          |

|*  2 |   HASH JOIN ANTI    |       |     3 |    27 |   168   (3)| 00:00:03 |

|   3 |    TABLE ACCESS FULL| TEST1 |  1000 |  4000 |     5   (0)| 00:00:01 |

|   4 |    TABLE ACCESS FULL| TEST2 | 50687 |   247K|   162   (2)| 00:00:02 |

-----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets

        717  consistent gets

          0  physical reads

          0  redo size

        407  bytes sent via SQL*Net to client

        385  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          1  rows processed

          
An example where NOT IN is more efficient than NOT EXISTS

SQL> Set autotrace off

SQL> drop table test1 purge;

Table dropped.

SQL> drop table test2 purge;

Table dropped.

SQL> create table test1 as select * from dba_objects where rownum <=5;

Table created.

SQL> create table test2 as select * from dba_objects;

Table created.

SQL> Insert into test2 select * from dba_objects;

50687 rows created.

SQL> Insert into test2 select * from test2;

101374 rows created.

SQL> Insert into test2 select * from test2;

202748 rows created.

SQL> Commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'test1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'test2');

PL/SQL procedure successfully completed.

SQL> Set autotrace traceonly

SQL> select count(*) from test1 where object_id not in(select object_id from test2);

Execution Plan

----------------------------------------------------------

Plan hash value: 3641219899

-----------------------------------------------------------------------------

| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

-----------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |       |     1 |     3 |  3143   (2)| 00:00:38 |

|   1 |  SORT AGGREGATE     |       |     1 |     3 |            |          |

|*  2 |   FILTER            |       |       |       |            |          |

|   3 |    TABLE ACCESS FULL| TEST1 |     5 |    15 |     3   (0)| 00:00:01 |

|*  4 |    TABLE ACCESS FULL| TEST2 |     8 |    40 |  1256   (2)| 00:00:16 |

-----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE

              LNNVL("OBJECT_ID"<>:B1)))

   4 - filter(LNNVL("OBJECT_ID"<>:B1))

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets
         23  consistent gets

          0  physical reads

          0  redo size

        407  bytes sent via SQL*Net to client

        385  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          1  rows processed

SQL> select count(*) from test1 t1 where not exists

    (select 1 from test2 t2 where t1.object_id=t2.object_id);

Execution Plan

----------------------------------------------------------

Plan hash value: 240185659

-----------------------------------------------------------------------------

| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

-----------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |       |     1 |     8 |  1263   (3)| 00:00:16 |

|   1 |  SORT AGGREGATE     |       |     1 |     8 |            |          |

|*  2 |   HASH JOIN ANTI    |       |     1 |     8 |  1263   (3)| 00:00:16 |

|   3 |    TABLE ACCESS FULL| TEST1 |     5 |    15 |     3   (0)| 00:00:01 |

|   4 |    TABLE ACCESS FULL| TEST2 |   405K|  1981K|  1253   (2)| 00:00:16 |

-----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

Statistics

----------------------------------------------------------

          1  recursive calls

          0  db block gets
       5609  consistent gets

          0  physical reads

          0  redo size

        407  bytes sent via SQL*Net to client

        385  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          0  sorts (memory)

          0  sorts (disk)

          1  rows processed
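Conclusion: in these tests the difference between NOT IN and NOT EXISTS is the difference between nested loops and a hash join. NOT IN is executed with a FILTER operation, which works much like nested loops: for each row of test1 it runs the subquery against test2 and can stop at the first matching row. NOT EXISTS is executed as a HASH JOIN ANTI, which reads all of test2. Comparing the two is therefore a comparison of nested loops against a hash join. In these examples:

    NOT IN beats NOT EXISTS (23 vs. 5609 consistent gets) with 5 rows in test1 and just over 400,000 rows in test2.

    NOT EXISTS beats NOT IN (717 vs. 9410 consistent gets) with 1000 rows in test1 and 50687 rows in test2.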
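The trade-off can be illustrated with a toy cost model in Python. This is only a sketch under stated assumptions: the functions `not_in_filter` and `hash_anti_join` are hypothetical, they count inner rows visited instead of consistent gets, they assume there are no NULL keys, and they ignore details of Oracle's real operators. The data is arranged so that every outer row has a match near the start of the inner table, which is roughly the situation in the second experiment.

```python
def not_in_filter(outer, inner):
    """FILTER-style: for each outer row, scan inner until the first match."""
    visited = 0
    result = []
    for o in outer:
        found = False
        for i in inner:
            visited += 1
            if i == o:
                found = True
                break          # early exit: the rest of inner is never read
        if not found:
            result.append(o)
    return result, visited

def hash_anti_join(outer, inner):
    """Hash-style: read every inner row once, keep outer rows with no match."""
    visited = 0
    seen = set()
    for i in inner:
        visited += 1
        seen.add(i)
    return [o for o in outer if o not in seen], visited

# Case A: tiny outer table, large inner table, matches found almost at once.
small_outer = [1, 2, 3, 4, 5]
big_inner = list(range(1, 1001)) * 4           # 4000 rows
a_filter = not_in_filter(small_outer, big_inner)
a_hash = hash_anti_join(small_outer, big_inner)

# Case B: larger outer table, so the repeated partial scans add up.
mid_outer = list(range(1, 201))                # 200 rows
mid_inner = list(range(1, 1001))               # 1000 rows
b_filter = not_in_filter(mid_outer, mid_inner)
b_hash = hash_anti_join(mid_outer, mid_inner)

print("A: filter visited", a_filter[1], "hash visited", a_hash[1])
print("B: filter visited", b_filter[1], "hash visited", b_hash[1])
```

In case A the filter approach touches far fewer inner rows than the hash approach, because it stops at the first match for each of the five outer rows. In case B the repeated scans outweigh the single full pass of the hash approach. Neither form is faster in general; it depends on the row counts and on how early a match is found.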


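NOT IN and NOT EXISTS differ in one more important way: when the column returned by the subquery contains NULL values, NOT IN returns no rows at all, which is usually not the result the query author expects. NOT EXISTS is not affected. See:
http://blog.csdn.net/stevendbaguo/article/details/8270572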
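The NULL behaviour can be reproduced with a few rows. The sketch below again uses Python's standard-library sqlite3 module instead of Oracle, on the assumption that both follow the standard three-valued logic for NOT IN. The table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test1 (object_id integer)")
conn.execute("create table test2 (object_id integer)")
conn.executemany("insert into test1 values (?)", [(1,), (2,), (3,)])
# test2 contains one matching id and one NULL
conn.executemany("insert into test2 values (?)", [(1,), (None,)])

# NOT IN: for ids 2 and 3 the comparison with NULL is unknown, so the
# predicate is never true and no row qualifies.
not_in = conn.execute(
    "select count(*) from test1 "
    "where object_id not in (select object_id from test2)").fetchone()[0]

# NOT EXISTS: the NULL row in test2 simply never matches, so 2 and 3 qualify.
not_exists = conn.execute(
    "select count(*) from test1 t1 "
    "where not exists (select 1 from test2 t2 "
    "where t1.object_id = t2.object_id)").fetchone()[0]

# Excluding NULLs in the subquery makes NOT IN agree with NOT EXISTS.
not_in_no_nulls = conn.execute(
    "select count(*) from test1 "
    "where object_id not in "
    "(select object_id from test2 where object_id is not null)").fetchone()[0]

print(not_in, not_exists, not_in_no_nulls)
```

If the subquery column can be NULL and NOT IN is still preferred, adding an `is not null` condition inside the subquery, as in the third query, keeps the two forms consistent.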