Oracle table joins: hash join and nested loop join
I. Pagination for ordinary queries
There are two SQL statements to choose from:
A: select * from (select rownum rn,syslog.* from syslog) where rn>10 and rn<=20
B: select * from(select rownum rn,syslog.* from syslog where rownum<=20) where rn>10
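The difference: statement B is more efficient when rn is small, that is, when the user is browsing the first few pages, because the inner rownum<=20 lets Oracle stop fetching once 20 rows have been returned, whereas A has to number every row of syslog before the outer filter applies. The further back the page, the closer the two become in query time.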
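The early-stop behaviour behind statement B can be pictured with a small Python model. This is only a sketch of the idea, not of Oracle's executor: the `scan`, `page_a` and `page_b` names and the in-memory `syslog` list are made up for illustration.

```python
from itertools import islice

def scan(table, counter):
    """Yield rows one at a time, counting how many rows the scan has read."""
    for row in table:
        counter["read"] += 1
        yield row

def page_a(table, lo, hi, counter):
    # Statement A: number every row first, then filter on rn in the outer query.
    numbered = list(enumerate(scan(table, counter), start=1))
    return [row for rn, row in numbered if lo < rn <= hi]

def page_b(table, lo, hi, counter):
    # Statement B: the inner "rownum <= hi" stops the scan after hi rows.
    numbered = list(enumerate(islice(scan(table, counter), hi), start=1))
    return [row for rn, row in numbered if rn > lo]

# Hypothetical stand-in for the syslog table.
syslog = [f"event-{i}" for i in range(1, 100001)]

reads_a, reads_b = {"read": 0}, {"read": 0}
rows_a = page_a(syslog, 10, 20, reads_a)
rows_b = page_b(syslog, 10, 20, reads_b)

print(rows_a == rows_b)                    # same page either way
print(reads_a["read"], reads_b["read"])    # rows read by each approach
```

Both functions return the same page, but the model of B reads only as many rows as the upper bound of the page, which is why the gap shrinks as the page number grows.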
II. How the first_rows strategy affects pagination
Create the test table:
1.create table page_test as select rownum id,t.* from syslog t;
2.select * from (select /*+ first_rows */ rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=5) where rn>0;
3.select * from (select rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=5) where rn>0;
With no index on the table, statements 2 and 3 take about the same time, and both are fairly slow.
Now add an index to the table:
4.create index ind_page_test_id on page_test(id);
Re-running statements 2 and 3 shows that query time drops sharply, and that 2 runs faster than 3.
Next, query pages further back:
5.select * from (select /*+ first_rows */ rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=500005) where rn>500000;
6.select * from (select rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=500005) where rn>500000;
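Running statements 5 and 6 shows that 6 now takes less time than 5. So when paginating, use the hinted form (statement 2) for low page numbers and the plain form (statement 6) for high page numbers.

Pagination strategy for multi-table joins: when writing the SQL, make the table with less data the driving table (list it first after FROM, ahead of the other tables) and create an index on the table with more data. Note that the cost-based optimizer normally chooses the join order itself; the order in the FROM clause is only enforced under a hint such as ordered.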
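One plausible way to read these results is through the two join methods: a nested loop join with an index on the larger table can hand back its first row almost immediately, while a hash join has to read one whole input before it returns anything. The Python sketch below is a toy model under that assumption; `nested_loop_join`, `hash_join`, `small`, `big` and `big_index` are hypothetical names, and the counters only count rows touched.

```python
def nested_loop_join(driving, index_on_inner, stats):
    """For each driving row, probe an index on the inner table."""
    for key, left in driving:
        stats["probes"] += 1
        for right in index_on_inner.get(key, []):
            yield key, left, right

def hash_join(build, probe, stats):
    """Build a hash table on one whole input, then stream the other input."""
    table = {}
    for key, left in build:
        stats["build_rows"] += 1
        table.setdefault(key, []).append(left)
    for key, right in probe:
        stats["probe_rows"] += 1
        for left in table.get(key, []):
            yield key, left, right

# Hypothetical data: a small driving table and a much larger joined table.
small = [(i, f"s{i}") for i in range(1, 1001)]
big = [(i, f"b{i}") for i in range(1, 100001)]

# Stand-in for an existing index on the large table's join column.
big_index = {}
for key, value in big:
    big_index.setdefault(key, []).append(value)

nl_stats = {"probes": 0}
first_nl = next(nested_loop_join(small, big_index, nl_stats))

hj_stats = {"build_rows": 0, "probe_rows": 0}
first_hj = next(hash_join(small, big, hj_stats))

print(first_nl, nl_stats)
print(first_hj, hj_stats)
```

In this model the nested loop needs a single index probe to produce the first joined row, while the hash join reads all 1000 build rows first. That startup cost matters little once most of the result has to be produced anyway, which fits the observation that the unhinted statement wins on late pages.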
III. Pagination with a sort requirement
7. Plain
select * from (select rownum countnum, t.* from (select * from page_test where sourcetype = '登陆' order by useroid desc) t where rownum <= 50) where countnum > 0
8. With an analytic function
select * from (select row_number() over(order by useroid desc) countnum, t.* from page_test t where sourcetype = '登陆') where countnum <= 50 and countnum > 0
Tested with the following code:
/* Create the view */
create or replace view stats as
select 'STAT...' || a.name name, b.value
  from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
union all
select 'LATCH.' || name name, gets
  from v$latch;

/* Create the temporary table */
create global temporary table run_stats (
  runid varchar2(20),
  name  varchar2(80),
  value number
) on commit preserve rows;

/* Create the package */
create or replace package runstats_pgk as
  procedure rs_start;
  procedure rs_middle;
  procedure rs_stop(p_difference_threshold in number default 0);
end;
/

create or replace package body runstats_pgk as
  g_start number;
  g_run1  number;
  g_run2  number;

  -- Snapshot the statistics before the first run.
  procedure rs_start is
  begin
    delete from run_stats;
    insert into run_stats select 'before', stats.* from stats;
    g_start := dbms_utility.get_time;
  end;

  -- Snapshot between the two runs.
  procedure rs_middle is
  begin
    g_run1 := (dbms_utility.get_time - g_start);
    insert into run_stats select 'after 1', stats.* from stats;
    g_start := dbms_utility.get_time;
  end;

  -- Snapshot after the second run and print the comparison.
  procedure rs_stop(p_difference_threshold in number default 0) is
  begin
    g_run2 := (dbms_utility.get_time - g_start);
    dbms_output.put_line('run1 ran in ' || g_run1 || ' hsecs');
    dbms_output.put_line('run2 ran in ' || g_run2 || ' hsecs');
    dbms_output.put_line('run1 ran in ' || round(g_run1/g_run2*100, 2) || '% of the time');
    dbms_output.put_line(chr(9));
    insert into run_stats select 'after 2', stats.* from stats;
    dbms_output.put_line(rpad('Name',30) || lpad('Run1',10) || lpad('Run2',10) || lpad('Diff',10));
    for x in (
      select rpad(a.name,30) ||
             to_char(b.value-a.value, '9,999,999') ||
             to_char(c.value-b.value, '9,999,999') ||
             to_char((c.value-b.value)-(b.value-a.value), '9,999,999') data
        from run_stats a, run_stats b, run_stats c
       where a.name = b.name
         and b.name = c.name
         and a.runid = 'before'
         and b.runid = 'after 1'
         and c.runid = 'after 2'
         and c.value-a.value > 0
         and abs((c.value-b.value)-(b.value-a.value)) > p_difference_threshold
       order by abs((c.value-b.value)-(b.value-a.value))
    ) loop
      dbms_output.put_line(x.data);
    end loop;
    dbms_output.put_line(chr(9));
    dbms_output.put_line(lpad('Run1',10) || lpad('Run2',10) || lpad('Diff',10) || lpad('Pct',8));
    for x in (
      select to_char(run1, '9,999,999') ||
             to_char(run2, '9,999,999') ||
             to_char(diff, '9,999,999') ||
             to_char(round(run1/run2*100, 2), '9,999,999') || '%' data
        from (select sum(b.value-a.value) run1,
                     sum(c.value-b.value) run2,
                     sum((c.value-b.value)-(b.value-a.value)) diff
                from run_stats a, run_stats b, run_stats c
               where a.name = b.name
                 and b.name = c.name
                 and a.runid = 'before'
                 and b.runid = 'after 1'
                 and c.runid = 'after 2'
                 and a.name like 'LATCH%')
    ) loop
      dbms_output.put_line(x.data);
    end loop;
  end;
end;
/

/* Create the stored procedure */
create or replace procedure sp_test is
  num1 number;
  num2 number;
begin
  runstats_pgk.rs_start;
  -- Run 1: rownum-based pagination (statement 7).
  for i in 1..100 loop
    select count(*) into num1 from (
      select * from (
        select rownum countnum, t.* from (
          select * from page_test where sourcetype='登陆' order by useroid desc) t
         where rownum<=50)
       where countnum>0);
  end loop;
  runstats_pgk.rs_middle;
  -- Run 2: analytic-function pagination (statement 8).
  for i in 1..100 loop
    select count(*) into num2 from (
      select * from (
        select row_number() over(order by useroid desc) countnum, t.*
          from page_test t where sourcetype='登陆')
       where countnum>0 and countnum<=50);
  end loop;
  runstats_pgk.rs_stop(100);
end sp_test;
/
/* Run the test */
set serveroutput on size 20000
exec sp_test;
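Result: on Oracle 8i, analytic functions are not well suited to pagination; in 9i, where they were specifically optimized, analytic functions can outperform rownum.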
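Note on a new concept, Oracle hints: the cost-based optimizer (CBO) is smart, and in the vast majority of cases it picks the right execution plan, which lightens the DBA's load. Occasionally, though, it outsmarts itself and picks a very poor plan, making a statement extremely slow.
Oracle hints are a mechanism for telling the optimizer to build the execution plan the way we specify. Hints can control:
1) The type of optimizer used.
2) The goal of the cost-based optimizer: all_rows or first_rows.
3) The access path for a table: full table scan, index scan, or direct access by rowid.
4) The join method between tables.
5) The join order between tables.
6) The degree of parallelism of the statement.
Apart from the RULE hint, using any other hint automatically switches the statement to the CBO; if the data dictionary holds no statistics at that point, default statistics are used. So if you use the CBO or hints, it is best to analyze tables and indexes regularly, which makes CBO execution plans more accurate.
When the optimizer does choose a poor plan, the DBA has to step in and tell it which access path or join method to use, so that the statement runs efficiently.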
IV. Speeding up pagination with query conditions
Add an index on the columns used in the query conditions:
create index ind_page_test_owner_type_id on page_test(useroid,sourcetype,id);
analyze index ind_page_test_owner_type_id compute statistics;
select *
from (select rid
from (select row_number() over(order by id) rn, rowid rid
from page_test
where useroid = 28570
and sourcetype = '登陆') t
where rn > 0
and rn <= 50) b,
page_test t
where t.rowid = b.rid
select /*+ ordered use_nl(b,t) */ *
from (select rid
from (select row_number() over(order by id) rn, rowid rid
from page_test
where useroid = 28570
and sourcetype = '登陆') t
where rn > 0
and rn <= 50) b,
page_test t
where t.rowid = b.rid
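The approach above is quite efficient for paginating a single large table. For tables that have to be joined before paginating, paginate the small table first, then join it to the large table, with the small table listed first. If a hash join shows up in the plan, the hint /*+ ordered use_nl(b,t) */ can switch it to a nested loop.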
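The idea in the two queries above is to page through narrow index entries first and touch the wide table rows only for the rowids on the requested page. A rough Python model of that two-step lookup follows; the dictionary `table`, the sorted list `index`, the integer rowids and the function names are all illustrative assumptions, not Oracle structures.

```python
# Heap "table": rowid -> full (wide) row. Rowids here are made-up integers.
table = {
    rid: {
        "id": rid,
        "useroid": 28570 if rid % 2 else 1,
        "sourcetype": "登陆",
        "payload": "x" * 200,
    }
    for rid in range(1, 1001)
}

# Composite "index" on (useroid, sourcetype, id): narrow sorted entries plus rowid.
index = sorted(
    (row["useroid"], row["sourcetype"], row["id"], rid) for rid, row in table.items()
)

def page_rowids(useroid, sourcetype, lo, hi):
    """Inner query: number matching index entries by id, keep rowids with lo < rn <= hi."""
    matches = [e for e in index if e[0] == useroid and e[1] == sourcetype]
    return [rid for rn, (_, _, _, rid) in enumerate(matches, start=1) if lo < rn <= hi]

def fetch_page(useroid, sourcetype, lo, hi):
    """Outer query: visit the wide rows only for the rowids on this page."""
    return [table[rid] for rid in page_rowids(useroid, sourcetype, lo, hi)]

page = fetch_page(28570, "登陆", 0, 50)
print(len(page), page[0]["id"], page[-1]["id"])
```

Only 50 wide rows are read for the page, however many rows match the condition; the numbering and filtering work happens on the small index entries.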
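Note: rowid is the physical address of a row, while rownum is a pseudocolumn that numbers rows as the query returns them. They are used as follows.
A: When a query contains order by, rownum is assigned first and the order by sort is applied afterwards, which is usually not what we want. A subquery gives the intended result: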
SELECT ROWNUM,t.tid FROM (SELECT tid FROM test ORDER BY col) t;
B: Look up records by rowid; access by rowid is generally the fastest way to fetch a row.
ROWNUM can also be used to limit the number of rows a query returns.
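Point A is easy to get wrong, so here is a small Python model of the two evaluation orders. It assumes three hypothetical rows fetched in tid order; `same_block` mimics rownum and order by in one query block, and `subquery` mimics sorting in a subquery and numbering outside it.

```python
# Hypothetical rows of the test table, in the order they are fetched.
rows = [
    {"tid": 1, "col": "c"},
    {"tid": 2, "col": "a"},
    {"tid": 3, "col": "b"},
]

# Same query block: rows are numbered as fetched, then sorted by col,
# so the numbers come out scrambled.
fetched = list(enumerate(rows, start=1))
same_block = [(rn, r["tid"]) for rn, r in sorted(fetched, key=lambda p: p[1]["col"])]

# Subquery form: sort first, then number the sorted rows.
ordered = sorted(rows, key=lambda r: r["col"])
subquery = [(rn, r["tid"]) for rn, r in enumerate(ordered, start=1)]

print(same_block)  # [(2, 2), (3, 3), (1, 1)]
print(subquery)    # [(1, 2), (2, 3), (3, 1)]
```

Only the subquery form gives row numbers that follow the sort order, which is what a page filter on rownum needs.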
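I had originally meant to study Oracle's hash join and nested loop join here, but in the end I decided that real understanding comes from practice, so I will try them out when I actually need them.
Pagination caveats
1. When union all is used with views, a wrong execution plan may be chosen:
Cause: the query against the view does not use the index.
Solutions:
1) Query the base tables instead of the view.
2) Rewrite the SQL to use an analytic function.
2. Fetching records by rowid is generally optimal, but the execution plan is sometimes unstable; in that case the only option is to adjust the plan by hand with hints and experiment.
3. When pagination uses order by and the order by column contains equal values, the same rows can show up on consecutive pages. Fixes:
1) Add id to the order by columns.
2) On 9i and later, use analytic functions.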
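The duplicate-row problem in caveat 3 comes from ties: rows with equal sort values have no guaranteed relative order, so two page requests can see them arranged differently. The Python sketch below models this by feeding the same rows to the sort in two different arrival orders; the data and the `page` helper are invented for illustration.

```python
def page(arrival_order, sort_key, start, stop):
    """Sort the rows as they arrived, then slice out one page of ids."""
    ordered = sorted(arrival_order, key=sort_key)  # stable: ties keep arrival order
    return [r["id"] for r in ordered[start:stop]]

def by_useroid(row):
    return row["useroid"]

def by_useroid_then_id(row):
    return (row["useroid"], row["id"])

# Six hypothetical rows that all tie on useroid.
rows = [{"id": i, "useroid": 7} for i in range(1, 7)]

# Assumption: the two page requests see the rows in different physical order.
run1 = rows
run2 = list(reversed(rows))

# Sorting on useroid alone: page 2 repeats the rows already shown on page 1.
p1 = page(run1, by_useroid, 0, 3)
p2 = page(run2, by_useroid, 3, 6)
print(p1, p2)            # [1, 2, 3] [3, 2, 1]

# Adding id as a tie-breaker makes the order total, so the pages are disjoint.
s1 = page(run1, by_useroid_then_id, 0, 3)
s2 = page(run2, by_useroid_then_id, 3, 6)
print(s1, s2)            # [1, 2, 3] [4, 5, 6]
```

With a unique column appended to the sort key, every row has exactly one position, so a page boundary always falls in the same place no matter how the rows arrive.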
Summary: database tuning takes a lot of adjusting and testing before you find the best solution.