您的位置:首页 > 数据库 > Oracle

oracle的表连接hash join、nested loop join

2013-10-29 16:45 211 查看
一.普通查询分页

有两条sql可控选择:

A: select * from (select rownum rn,syslog.* from syslog) where rn>10 and rn<=20

B: select * from(select rownum rn,syslog.* from syslog where rownum<=20) where rn>10

二者区别:B语句在参数rn较小,即用户翻前面一些页数时,查询效率更高,不过越到后面,二者查询效率越接近。

二.first_rows查询策略对分页的影响

创建测试表:

1.create table page_test as select * from rownum id,t.* from syslog t;
2.select * from (select /*+ first_rows */ rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=5) where rn>0;
3.select * from (select rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=5) where rn>0;
表未加索引,发现2、3所耗时间相差不多,而且比较长。
现在给表加索引:
4.create index ind_page_test_id on page_test(id);
重新执行上面2、3两条sql,发现查询时间大大降低,但2比3执行更快。
继续,往后查询
5.select * from (select /*+ first_rows */ rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=500005) where rn>500000;
6.select * from (select rownum rn,a.source from page_test a,page_test b,page_test c where a.id=b.id and b.id=c.id and rownum<=500005) where rn>500000;
 

重新执行上面5、6两条sql,发现6的时间反而比5少。因此,在执行分页是,页码低的应该使用hint提示的sql语句2,页面高的,使用普通查询sql语句6。三、多表关联查询分页策略:写sql时,以数据量小的为驱动表(就是写在from后面跟表的前面),并在数据量大的表建立索引。
三、带排序需求的分页

7.普通

select *
from (select rownum countnum, t.*
from (select *
from page_test
where sourcetype = '登陆'
order by useroid desc) t
where rownum <= 50)
where countnum > 0


8.带分析函数的

select *
from (select row_number() over(order by useroid desc) countnum, t.*
from page_test t
where sourcetype = '登陆')
where rownum <= 50 and countnum > 0


通过代码测试:

/*创建视图*/
create or replace view stats as select 'stat...'||a.name name,b.value from v$statname a,v$mystat b
where a.STATISTIC#=b.STATISTIC# union all select 'latch.'||name name,gets from v$latch
/*创建临时表*/
create global temporary table run_stats
(
runid varchar2(20),
name varchar2(80),
value number
) on commit preserve rows;
/*创建包*/
create or replace package runstats_pgk
as
procedure rs_start;
procedure rs_middle;
procedure rs_stop(p_difference_threshold in number default 0)
end;
create or replace package body runstats_pgk
as
q_start number;
q_run1 number;
q_run2 number;

procedure rs_start
is
begin
delete from run_stats;

insert into run_stats
select 'before',stats.* from stats;
g_start :=dbms_utility.get_time;
end;
procedure rs_middle
is
begin
g_run1 :=(dbms_utility.get_time-g_start);
insert into run_stats
select 'after 1',stats.* from stats;
g_start := dbms_utility.get_time;
end;
procedure rs_stop(p_difference_threshold in number default 0)
is
begin
g_run2 :=(dbms_utility.get_time-g_start);
dbms_output.put_line
('run1 ran in'||g_run1||'hsecs');
dbms_output.put_line
('run2 ran in'||g_run2||'hsecs');
dbms_output.put_line
('run1 ran in'||round(g_run1/g_run2*100,2)||'% if the time');
dbms_output.put_line(char(9));

insert into run_stats
select 'after 2',stats.* from stats;
dbms_output.put_line
(rpad('Name',30)||lpad('Run1',10)||lpad('Run2',10)||lpad('Diff',10));

for x in
(select rpad(a.name,30)||to_char(b.value-a.value,'9,999,999')||
to_char(c.value-b.value,'9,999,999')||
to_char((c.value-b.value)-(b.value-a.value),'9,999,999' date
from run_stats a,run_state b,run_stats c
where a.name=b.name
and b.name=c.name
and a.runid='before'
and b.runid='after 1'
and c.runid='after 2'
and c,value-a.value>0
and abs((c.value-b.value)-(b.value-a.value))>p_difference_threshold
order by abs((c.value-b.value)-(b.value-a.value))
)loop
dbms_output.put_line(x.data);
end loop;
dbms_output.put_line(chr(9));
dbms_output.put_line
(lpad('Run1',10)||lpad('Run2',10)||lpad('Diff',10)||lpad('Pct',8));
for x in
(select to_char(run1,'9,999,999')||
to_char(run2,'9,999,999')||
to_char(diff,'9,999,999')||
to_char(round(run1/run2*100,2),'9,999,999')||'%'data
from
(select sum(b.value-a.value) run1,sum(c.value-b.value) run2,
sum(c.value-b.value)-(b.value-a.value)) diff
from run_stats a,run_stats b,run_stats c
where a.name=b.name
and b.name=c.name
and a.runid='before'
and b.runid='after 1'
and c.runid='after 2'
and a.name like 'LATCH%'
)
)loop
dbms_output.put_line(x.data);
end loop;
end;
end;

/*创建存储过程*/
create or replace procedure sp_test is
num1 number;
num2 number;
begin
runstats_pgk.rs_start;
for i in 1..100 loop
select count(*) into num1 from
(
select * from(
select rownum countnum,t.* from(
select * from page_test where sourcetype='登陆' order by useroid desc) t where rownum<=50
) where countnum>0
);
end loop;
runstats_pgk.rs_middle;
for i in 1..100 loop
select count(*) into num2 from
(
select * from(
select row_number() over(order by useroid desc) countnum,t.*
from page_test t where sourcetype='登陆')
where countnum>0 and countnum<=50
);
end loop;
runstats_pgk.rs_stop(100);
end sp_test;
/*测试*/
set serveroutput on size 20000
exec sp_test;

结果:在oracle 8i分析函数不适合作分页,9i中分析函数经过专门优化后可以比rownum有更好的性能。

备注新概念:oracle_hints:基于代价的优化器是很聪明的,在绝大多数情况下它会选择正确的优化器,减轻了DBA的负担。但有时它也聪明反被聪明误,选择了很差的执行计划,使某个语句的执行变得奇慢无比。

Oracle Hints是一种机制,用来告诉优化器按照我们的告诉它的方式生成执行计划。我们可以用Oracle Hints来实现:
1) 使用的优化器的类型
2) 基于代价的优化器的优化目标,是all_rows还是first_rows。
3) 表的访问路径,是全表扫描,还是索引扫描,还是直接利用rowid。
4) 表之间的连接类型
5) 表之间的连接顺序
6) 语句的并行程度

除了”RULE”提示外,一旦使用的别的提示,语句就会自动的改为使用CBO优化器,此时如果你的数据字典中没有统计数据,就会使用缺省的统计数据。所以建议大家如果使用CBO或Hints提示,则最好对表和索引进行定期的分析(这样能使基于cbo的执行计划更加准确)。

 

此时就需要DBA进行人为的干预,告诉优化器使用我们指定的存取路径或连接类型生成执行计划,从而使语句高效的运行。

四、带查询条件的分页的速度优化
对查询条件加索引

create index ind_page_test_owner_type_id on page_test(useroid,sourcetype,id);
analyze index ind_page_test_owner_type_id compute statistics;

select *
  from (select rid
          from (select row_number() over(order by id) rn, rowid rid
                  from page_test
                 where useroid = 28570
                   and sourcetype = '登陆') t
         where rn > 0
           and rn <= 50) b,
       page_test t
 where t.rowid = b.rid
 
 select /*+ ordered user_nl(b,t) */ *
  from (select rid
          from (select row_number() over(order by id) rn, rowid rid
                  from page_test
                 where useroid = 28570
                   and sourcetype = '登陆') t
         where rn > 0
           and rn <= 50) b,
       page_test t
 where t.rowid = b.rid

以上对单个大表的分页是颇有效率的,而对于某些需要关联再分页的表,可以先进行小表的分页,再关联大表,并将小表放前面,如果出现了hashjoin,也可以通过/*+ ordered user_nl(b,t) */改为nested loop
备注:rowid是记录真实位置,rownum是伪列。应用如下

A:当查询语句中包含order by时, 会先执行rownum再按order by排序, 通常这不是我们想要的. 可以引入子查询来实现我们想要的结果.
SELECT ROWNUM,t.tid FROM (SELECT tid FROM test ORDER BY col) t;

B:利用rowid来查询记录,而且通过rowid查询记录是查询速度最快的查询方法
可通过ROWNUM限制返回结果的记录数(行数)

                     SQL> select rownum from test where rownum<10; 2)通过ROWNUM为表中某列产生一个唯一(UNIQUE)值                      SQL> UPDATE table_name SET column_name = ROWNUM;                      //将rownum指定为该行某列的值。

oracle的表连接hash join、nested loop join这个本来想研究的,后面还是觉得实践出真知,用到,再试。

分页注意事项

1.对视图进行union all时,可能执行错误的执行计划:


分析原因:对于视图的查询,没有用到索引

解决办法:

1)将对视图的查询改为对基表的查询

2)使用分析函数,把sql改为

 

2.一般利用rowid去获得记录,是最优的,但有时候也会出现执行计划不稳定的情况,这样就只能通过hint,手动改动执行计划来慢慢摸索了。

3.对于分页有order by的排序,由于order by字段值有相等的情况,导致前后页有数据重复。

1.order by字段加上id。

2.9i版本以上使用分析函数。

总结:数据库调优,一定要多调多试,才能找到最优的解决办法。


 

内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: