您的位置:首页 > 数据库 > Oracle

使用Oracle SQL trace时需要注意的问题

2011-08-25 15:29 411 查看
经常使用Sql Trace和10046 event来诊断Oracle数据库性能问题。而level超过1的10046事件通常称为extended sql trace,通常用于诊断确定的单个SQL、存储过程或会话的性能问题,具有如下的几个优点:
  可以得到SQL执行时实际的执行计划。

  可以得到SQL执行时所花时间的具体分布,CPU消耗了多长时间,多块读消耗了多长时间等等。

  可以得到SQL执行时的各种与性能相关的统计数据,逻辑读、物理读、fetch次数、parse次数等等。

  不仅能够用于性能测试,同时能够用于诊断正在执行的SQL或存储过程的性能。

  有很多的工具用于格式化生成的trace文件,除了Oracle自带的TKPROF、Metalink Note 224270.1 Trace Analyzer,以及第三方的免费工具如orasrp,《Troubleshooting Oracle Performance》作者开发的TVD$XTAT,甚至还有商业化的软件Hotsos Profiler等。

  不过前段时间在用10046事件诊断一个性能问题的时候,却让生成的结果误导了。后来仔细检查发现,在会话开启sql trace的情况下,SQL语句会重新解析,导致开启sql trace之后与开启之前相比,执行计划可能发生了变化,导致sql trace的结果不能真实地反映会话执行SQL的情况,在分析时容易发生偏差。

  下面是一个测试:

  测试的环境是Oracle 10.2.0.1 for Windows,不过前面提到的案例,是发生在Oracle 9i下的,所以9i和10g都有这个问题,而11g目前还没有测试过,有兴趣的朋友可以在11g上进行测试。

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
首先创建一个sql文件,内容为:

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag;

  创建一个列上数据有倾斜的表:

  SQL>
create table t1 (value
number ,flag
number,pad
varchar2(2000));

  表已创建。

  SQL>
insert into t1
select rownum,mod(rownum,2000),lpad('x',1000,'x')
from dba_objects;

  已创建49796行。

  SQL>
commit;

  提交完成。

  SQL>
insert into t1
select rownum,3000,lpad('x',1000,'x')
from dba_objects
where rownum<=10000;

  已创建10000行。

  SQL>
commit;

  提交完成。

  SQL>
create index t1_idx
on t1(flag);

  索引已创建。

  SQL>
exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'T1',cascade=>true,method_opt=>'for
all indexed columns');

  PL/SQL 过程已成功完成。

  SQL>
select column_name,num_distinct,num_buckets
from user_tab_columns
where table_name='T1';

  COLUMN_NAME NUM_DISTINCT NUM_BUCKETS

  ------------------------------ ------------ -----------

  VALUE

  FLAG 2030
75

  PAD

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag;

  创建一个列上数据有倾斜的表:

  SQL>
create table t1 (value
number ,flag
number,pad
varchar2(2000));

  表已创建。

  SQL>
insert into t1
select rownum,mod(rownum,2000),lpad('x',1000,'x')
from dba_objects;

  已创建49796行。

  SQL>
commit;

  提交完成。

  SQL>
insert into t1
select rownum,3000,lpad('x',1000,'x')
from dba_objects
where rownum<=10000;

  已创建10000行。

  SQL>
commit;

  提交完成。

  SQL>
create index t1_idx
on t1(flag);

  索引已创建。

  SQL>
exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'T1',cascade=>true,method_opt=>'for
all indexed columns');

  PL/SQL 过程已成功完成。

  SQL>
select column_name,num_distinct,num_buckets
from user_tab_columns
where table_name='T1';

  COLUMN_NAME NUM_DISTINCT NUM_BUCKETS

  ------------------------------ ------------ -----------

  VALUE

  FLAG 2030
75

  PAD
  在创建的测试表中,FLAG列有2001个不同的值,其中,0-1999之间每个值约为25个,而有一个特殊的值3000,有10000个。收集统计信息时,在FLAG列上收集了直方图。

  下面运行test.sql:

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
SQL>
var v_flag
number;

  SQL>
exec :v_flag:=3000;

  SQL>
set autot on stat

  SQL>
@test

  SUM(VALUE)

  ----------

  50005000

  统计信息

  -------------------------------------------------------

  0 recursive calls

  0 db block gets

  8575 consistent gets

  0 physical reads

  0 redo size

  412 bytes sent via SQL*Net
to client

  384 bytes received via SQL*Net
from client

  2 SQL*Net roundtrips
to/from client

  0 sorts (memory)

  0 sorts (disk)

  1 rows processed

  SQL>
var v_flag
number;

  SQL>
exec :v_flag:=3000;

  SQL>
set autot on stat

  SQL>
@test

  SUM(VALUE)

  ----------

  50005000

  统计信息

  -------------------------------------------------------

  0 recursive calls

  0 db block gets

  8575 consistent gets

  0 physical reads

  0 redo size

  412 bytes sent via SQL*Net
to client

  384 bytes received via SQL*Net
from client

  2 SQL*Net roundtrips
to/from client

  0 sorts (memory)

  0 sorts (disk)

  1 rows processed
  我们来看看SQL的执行计划: 

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
SQL>
select sql_id,hash_value,child_number,executions
from v$sql
where sql_text
like '%testsql%'
and sql_text
not
like '%v$sql%';

  SQL_ID HASH_VALUE CHILD_NUMBER EXECUTIONS

  ------------- ---------- ------------ ----------

  8hdvb5nwfqm8r 954944791
0
2

  SQL>
select *
from
table(dbms_xplan.display_cursor('8hdvb5nwfqm8r',0));

  PLAN_TABLE_OUTPUT

  ---------------------------------------

  SQL_ID 8hdvb5nwfqm8r, child number
0

  -------------------------------------

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag

  Plan hash value:
3724264953

  ---------------------------------------------------------------------------

  | Id
| Operation
| Name | Rows
| Bytes
| Cost (%CPU)| Time
|

  ---------------------------------------------------------------------------

  |
0
| SELECT STATEMENT
| |
| |
765 (100)|
|

  |
1
| SORT AGGREGATE
| |
1
| 17
| |
|

  |*
2
| TABLE ACCESS
FULL| T1
| 9863
| 163K|
765 (1)|
00:00:07
|

  ---------------------------------------------------------------------------

  Predicate Information (identified by operation id):

  ---------------------------------------------------

  2
- filter("FLAG"=:V_FLAG)

  SQL>
select sql_id,hash_value,child_number,executions
from v$sql
where sql_text
like '%testsql%'
and sql_text
not
like '%v$sql%';

  SQL_ID HASH_VALUE CHILD_NUMBER EXECUTIONS

  ------------- ---------- ------------ ----------

  8hdvb5nwfqm8r 954944791
0
2

  SQL>
select *
from
table(dbms_xplan.display_cursor('8hdvb5nwfqm8r',0));

  PLAN_TABLE_OUTPUT

  ---------------------------------------

  SQL_ID 8hdvb5nwfqm8r, child number
0

  -------------------------------------

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag

  Plan hash value:
3724264953

  ---------------------------------------------------------------------------

  | Id
| Operation
| Name | Rows
| Bytes
| Cost (%CPU)| Time
|

  ---------------------------------------------------------------------------

  |
0
| SELECT STATEMENT
| |
| |
765 (100)|
|

  |
1
| SORT AGGREGATE
| |
1
| 17
| |
|

  |*
2
| TABLE ACCESS
FULL| T1
| 9863
| 163K|
765 (1)|
00:00:07
|

  ---------------------------------------------------------------------------

  Predicate Information (identified by operation id):

  ---------------------------------------------------

  2
- filter("FLAG"=:V_FLAG)
  可以看到,由于绑定变量窥视,SQL语句走了全表扫描,没有使用索引。

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
SQL>
exec :v_flag:=300;

  SQL>
set autot on stat

  SQL>
@test

  SUM(VALUE)

  ----------

  607500

  统计信息

  ----------------------------------------------------------

  0 recursive calls

  0 db block gets

  8575 consistent gets

  0 physical reads

  0 redo size

  411 bytes sent via SQL*Net
to client

  384 bytes received via SQL*Net
from client

  2 SQL*Net roundtrips
to/from client

  0 sorts (memory)

  0 sorts (disk)

  1 rows processed

  SQL>
exec :v_flag:=300;

  SQL>
set autot on stat

  SQL>
@test

  SUM(VALUE)

  ----------

  607500

  统计信息

  ----------------------------------------------------------

  0 recursive calls

  0 db block gets

  8575 consistent gets

  0 physical reads

  0 redo size

  411 bytes sent via SQL*Net
to client

  384 bytes received via SQL*Net
from client

  2 SQL*Net roundtrips
to/from client

  0 sorts (memory)

  0 sorts (disk)

  1 rows processed
  从上面的输出可以看到,在变量=300时,仍然走了全表扫描,仍然没有使用索引。

wordend
  使用sqlplus创建一个新的到oracle的连接,并执行测试SQL:

  开启另一个会话,

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
 SQL>
select sid,serial#
from v$session
where sid=(select sid
from v$mystat
where rownum=1);

  SID SERIAL#

  ---------- ----------

  22
153

  SQL>
var v_flag
number;

  SQL>
exec :v_flag:=500;

  SQL>
set autot on stat

  SQL>
@test

  SUM(VALUE)

  ----------

  612500

  统计信息

  ----------------------------------------------------------

  0 recursive calls

  0 db block gets

  8575 consistent gets

  0 physical reads

  0 redo size

  411 bytes sent via SQL*Net
to client

  384 bytes received via SQL*Net
from client

  2 SQL*Net roundtrips
to/from client

  0 sorts (memory)

  0 sorts (disk)

  1 rows processed

  SQL>
select sid,serial#
from v$session
where sid=(select sid
from v$mystat
where rownum=1);

  SID SERIAL#

  ---------- ----------

  22
153

  SQL>
var v_flag
number;

  SQL>
exec :v_flag:=500;

  SQL>
set autot on stat

  SQL>
@test

  SUM(VALUE)

  ----------

  612500

  统计信息

  ----------------------------------------------------------

  0 recursive calls

  0 db block gets

  8575 consistent gets

  0 physical reads

  0 redo size

  411 bytes sent via SQL*Net
to client

  384 bytes received via SQL*Net
from client

  2 SQL*Net roundtrips
to/from client

  0 sorts (memory)

  0 sorts (disk)

  1 rows processed
  从SQL执行得到的逻辑读来看,仍然是走的全表扫描。这里很明显使用索引更为合适,这就是使用绑定变量引起的副作用。

  然而,如果我们打这个这个会话的10046 事件,则结果如何呢?

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
 SQL>
exec sys.dbms_system.set_ev(22,153,10046,12,'');

  SQL>
exec :v_flag:=410;

  SQL>
@test

  SUM(VALUE)

  ----------

  610250

  SQL>
exec sys.dbms_system.set_ev(22,153,10046,12,'');

  SQL>
exec :v_flag:=410;

  SQL>
@test

  SUM(VALUE)

  ----------

  610250
  下面是trace文件的内容:

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
 PARSING IN
CURSOR #1
len=59 dep=0
uid=55 oct=3
lid=55 tim=7703132042
hv=954944791 ad='27cf5884'

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag

  END
OF STMT

  PARSE #1:c=0,e=309,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=7703132037

  BINDS #1:

  kkscoacd

  Bind#0

  oacdty=02 mxl=22(22)
mxlc=00 mal=00
scl=00 pre=00

  oacflg=03 fl2=1000000
frm=00 csi=00
siz=24
off=0

  kxsbbbfp=0819ca94 bln=22 avl=03
flg=05

  value=410

  EXEC #1:c=15625,e=8864,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=7703164320

  WAIT #1: nam='SQL*Net message to client'
ela=
5 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703165526

  FETCH #1:c=0,e=237,p=0,cr=27,cu=0,mis=0,r=1,dep=0,og=1,tim=7703166929

  WAIT #1: nam='SQL*Net message from client'
ela=
187 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703168520

  FETCH #1:c=0,e=3,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,tim=7703169851

  WAIT #1: nam='SQL*Net message to client'
ela=
5 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703171170

  WAIT #1: nam='SQL*Net message from client'
ela=
421 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703172845

  STAT #1 id=1 cnt=1
pid=0 pos=1
obj=0 op='SORT AGGREGATE (cr=27
pr=0 pw=0 time=260 us)'

  STAT #1 id=2 cnt=25
pid=1 pos=1
obj=51606 op='TABLE ACCESS BY
INDEX ROWID T1 (cr=27 pr=0 pw=0 time=439 us)'

  STAT #1 id=3 cnt=25
pid=2 pos=1
obj=51607 op='INDEX RANGE SCAN
T1_IDX (cr=2 pr=0 pw=0 time=36 us)'

  PARSING IN
CURSOR #1
len=59 dep=0
uid=55 oct=3
lid=55 tim=7703132042
hv=954944791 ad='27cf5884'

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag

  END
OF STMT

  PARSE #1:c=0,e=309,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=7703132037

  BINDS #1:

  kkscoacd

  Bind#0

  oacdty=02 mxl=22(22)
mxlc=00 mal=00
scl=00 pre=00

  oacflg=03 fl2=1000000
frm=00 csi=00
siz=24
off=0

  kxsbbbfp=0819ca94 bln=22 avl=03
flg=05

  value=410

  EXEC #1:c=15625,e=8864,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=7703164320

  WAIT #1: nam='SQL*Net message to client'
ela=
5 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703165526

  FETCH #1:c=0,e=237,p=0,cr=27,cu=0,mis=0,r=1,dep=0,og=1,tim=7703166929

  WAIT #1: nam='SQL*Net message from client'
ela=
187 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703168520

  FETCH #1:c=0,e=3,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,tim=7703169851

  WAIT #1: nam='SQL*Net message to client'
ela=
5 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703171170

  WAIT #1: nam='SQL*Net message from client'
ela=
421 driver id=1111838976 #bytes=1
p3=0 obj#=-1
tim=7703172845

  STAT #1 id=1 cnt=1
pid=0 pos=1
obj=0 op='SORT AGGREGATE (cr=27
pr=0 pw=0 time=260 us)'

  STAT #1 id=2 cnt=25
pid=1 pos=1
obj=51606 op='TABLE ACCESS BY
INDEX ROWID T1 (cr=27 pr=0 pw=0 time=439 us)'

  STAT #1 id=3 cnt=25
pid=2 pos=1
obj=51607 op='INDEX RANGE SCAN
T1_IDX (cr=2 pr=0 pw=0 time=36 us)'
  可以看,mis=1,表明发生了硬解析,op=’INDEX RANGE SCAN T1_IDX 表明使用使用了索引,而op=’SORT AGGREGATE (cr=27 表明逻辑读只有27,逻辑读幅下降,的确是使用了索引。

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
 SQL>
select sql_id,hash_value,child_number,executions
from v$sql
where sql_text
like '%testsql%'
and sql_text
not
like '%v$sql%';

  SQL_ID HASH_VALUE CHILD_NUMBER EXECUTIONS

  ------------- ---------- ------------ ----------

  8hdvb5nwfqm8r 954944791
0
7

  8hdvb5nwfqm8r 954944791
1
1

  SQL>
select *
from
table(dbms_xplan.display_cursor('8hdvb5nwfqm8r',1));

  PLAN_TABLE_OUTPUT

  -------------------------------------

  SQL_ID 8hdvb5nwfqm8r, child number
1

  -------------------------------------

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag

  Plan hash value:
1563165360

  ---------------------------------------------------------------------------------------

  | Id
| Operation
| Name | Rows
| Bytes
| Cost (%CPU)| Time
|

  ---------------------------------------------------------------------------------------

  |
0
| SELECT STATEMENT
| |
| |
30 (100)|
|

  |
1
| SORT AGGREGATE
| |
1
| 17
| |
|

  |
2
| TABLE ACCESS
BY INDEX ROWID| T1
| 33
| 561
| 30 (0)|
00:00:01
|

  |*
3
| INDEX RANGE SCAN
| T1_IDX
| 33
| |
1 (0)|
00:00:01
|

  ---------------------------------------------------------------------------------------

  Predicate Information (identified by operation id):

  ---------------------------------------------------

  3
- access("FLAG"=:V_FLAG)

  SQL>
select sql_id,hash_value,child_number,executions
from v$sql
where sql_text
like '%testsql%'
and sql_text
not
like '%v$sql%';

  SQL_ID HASH_VALUE CHILD_NUMBER EXECUTIONS

  ------------- ---------- ------------ ----------

  8hdvb5nwfqm8r 954944791
0
7

  8hdvb5nwfqm8r 954944791
1
1

  SQL>
select *
from
table(dbms_xplan.display_cursor('8hdvb5nwfqm8r',1));

  PLAN_TABLE_OUTPUT

  -------------------------------------

  SQL_ID 8hdvb5nwfqm8r, child number
1

  -------------------------------------

  select
/*+ testsql
*/ sum(value)
from t1
where flag=:v_flag

  Plan hash value:
1563165360

  ---------------------------------------------------------------------------------------

  | Id
| Operation
| Name | Rows
| Bytes
| Cost (%CPU)| Time
|

  ---------------------------------------------------------------------------------------

  |
0
| SELECT STATEMENT
| |
| |
30 (100)|
|

  |
1
| SORT AGGREGATE
| |
1
| 17
| |
|

  |
2
| TABLE ACCESS
BY INDEX ROWID| T1
| 33
| 561
| 30 (0)|
00:00:01
|

  |*
3
| INDEX RANGE SCAN
| T1_IDX
| 33
| |
1 (0)|
00:00:01
|

  ---------------------------------------------------------------------------------------

  Predicate Information (identified by operation id):

  ---------------------------------------------------

  3
- access("FLAG"=:V_FLAG)
  从上面的输出可以看到,这条SQL已经产生了另外一个子游标(child cursor)。而查看新的子游标的执行计划,已经使用了索引。

  由此,可以看出,开启sql trace之后,会话执行的SQL将不会重用未开启sql trace时生成的执行计划,而发生硬解析。由于硬解析时绑定变量窥视,将会使硬解析产生的子游标,其执行计划可能发生了变化。

  在最后,我们来看看引起游标不能共享的原因是什么?

Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/
 SQL>
select
* from v$sql_shared_cursor
where sql_id='8hdvb5nwfqm8r';

  SQL_ID ADDRESS CHILD_AD CHILD_NUMBER U S O O S L S E B P I S T A B D L T R I I R L I O S M U T N F A I T D L D B P C S R P T M B M R O P M F L

  ------------- -------- -------- ------------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

  8hdvb5nwfqm8r 27CF5884 27CF57A0 0 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N

  8hdvb5nwfqm8r 27CF5884 27CCC968 1 N N N N Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N

  SQL>
select *
from v$sql_shared_cursor
where sql_id='8hdvb5nwfqm8r';

  SQL_ID ADDRESS CHILD_AD CHILD_NUMBER U S O O S L S E B P I S T A B D L T R I I R L I O S M U T N F A I T D L D B P C S R P T M B M R O P M F L

  ------------- -------- -------- ------------ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

  8hdvb5nwfqm8r 27CF5884 27CF57A0 0 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N

  8hdvb5nwfqm8r 27CF5884 27CCC968 1 N N N N Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
  从结果可以看到,STATS_ROW_MISMATCH导致了游标不能共享,Oracle文档这个列的解释是:The existing statistics do not match the existing child cursor。

  在最后,使用10g的dbms_monitor包来打开会话的sql trace,测试结果完全相同。

wordend
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: