您的位置:首页 > 数据库 > Oracle

oracle 11.2.0.3 rac hanganalyze 案例

2015-11-13 20:09 621 查看
环境:

rhel6.5 oracle 11.2.0.3.8 rac 两个节点

客户说db无法使用,需要dba介入处理。

dba登陆任一个节点的os,执行了如下的命令:

[oracle@hosta ~]$ sqlplus '/ as sysdba'

SQL*Plus: Release 11.2.0.3.0 Production on Fri Nov 13 19:19:30 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> oradebug setorapname reco
Oracle pid: 22, Unix process pid: 13200, image: oracle@hosta (RECO)
SQL> oradebug unlimit
Statement processed.
SQL> oradebug -g all hanganalyze 3
Hang Analysis in /u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_13164.trc
SQL> oradebug -g all hanganalyze 3
Hang Analysis in /u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_13164.trc
SQL> oradebug -g all dump systemstate 266
Statement processed.
SQL> oradebug -g all dump systemstate 266
Statement processed.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
[oracle@hosta ~]$ date
Fri Nov 13 19:22:22 CST 2015
[oracle@hosta ~]$
--->注意:如上命令的来源:
How to Collect Diagnostics for Database Hanging Issues (文档 ID 452358.1)


/u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_13164.trc的内容截取如下:

*** 2015-11-13 19:20:11.392
DIAG sends out oradebug command to nodes [0,1]

*** 2015-11-13 19:20:11.454
****************************************************************
PORADEBUG BEGIN ORIGINATING INST:1 SERIAL:1 PID:24794
PORADATA COMMAND:hanganalyze 3
session (kjzddbx) switches to a new action
PORADATA TRACEFILE /u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_13164.trc
PORADEBUG END ORIGINATING INST:1 SERIAL:1 PID:24794
****************************************************************

*** 2015-11-13 19:20:12.060
===============================================================================
HANG ANALYSIS:
instances (db_name.oracle_sid): orcl.orcl1, orcl.orcl2
oradebug_node_dump_level: 3
analysis initiated by oradebug
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 19:20:11 ]
NOTE: scheduling delay has not been sampled for 0.494255 secs    0.000000 secs from [ 19:20:07 - 19:20:12 ], 5 sec avg
0.000000 secs from [ 19:19:12 - 19:20:12 ], 1 min avg
0.000124 secs from [ 19:15:12 - 19:20:12 ], 5 min avg
===============================================================================

Chains most likely to have caused the hang:
[a] Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
[b] Chain 2 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 2 Signature Hash: 0x38c48850
[c] Chain 3 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 3 Signature Hash: 0x38c48850

===============================================================================
Non-intersecting chains:

-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (orcl.orcl1)
os id: 24694
process id: 44, oracle@hosta
session id: 10
session serial #: 193
}
is waiting for 'enq: TX - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x8000e
p3: 'sequence'=0x452
time in wait: 6 min 10 sec
timeout after: never
wait id: 37
blocking: 0 sessions
current sql: update scott.emp set ename=ename|| '1+1' where empno='7369'
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1876<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+1865<-kslwaitctx()+163<-kjusuc()+3619<-ksipgetctxi()+1759<-ksqcmi()+24096<-ksqgtlctx()+3488<-ksqgelctx()+569<-ktuGetTxForXid()+131<-ktcwit1()+313<-kdddgb()+8525<-kdusru()+461<-updrowFastPath()+1209<-qerupFetch()+2554<-updaul()+1309<-updThreePhaseExe()+324<-updexe()+443<-opiexe()+10085<-kpoal8()+2224<-opiodr()+916<-ttcpip()+2242<-opitsk()+1673<-opiino()+966<-opiodr()+916<-opidrv()+570<-sou2o()+103
wait history:
* time between current wait and wait #1: 0.000239 sec
1.       event: 'gc current block 2-way'
time waited: 0.001155 sec
wait id: 36              p1: ''=0x4
p2: ''=0x97
p3: ''=0x2000001
* time between wait #1 and #2: 0.000273 sec
2.       event: 'Disk file operations I/O'
time waited: 0.000010 sec
wait id: 35              p1: 'FileOperation'=0x2
p2: 'fileno'=0x4
p3: 'filetype'=0x2
* time between wait #2 and #3: 0.002898 sec
3.       event: 'SQL*Net message from client'
time waited: 2.574032 sec
wait id: 34              p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
}
and is blocked by
=> Oracle session identified by:
{
instance: 1 (orcl.orcl1)
os id: 18342
process id: 49, oracle@hosta
session id: 137
session serial #: 2227
}
which is waiting for 'SQL*Net message from client' with wait info:
{
p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
time in wait: 7 min 44 sec
timeout after: never
wait id: 877
blocking: 3 sessions
current sql: <none>
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1876<-sspuser()+112<-__sighandler()<-read()+14<-nttfprd()+343<-nsbasic_brc()+376<-nsbrecv()+69<-nioqrc()+485<-opikndf2()+978<-opitsk()+847<-opiino()+966<-opiodr()+916<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<-__libc_start_main()+253<-_start()+36
wait history:
* time between current wait and wait #1: 0.000301 sec
1.       event: 'SQL*Net message to client'
time waited: 0.000016 sec
wait id: 876             p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.001011 sec
2.       event: 'SQL*Net message from client'
time waited: 269 min 8 sec
wait id: 875             p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000097 sec
3.       event: 'SQL*Net message to client'
time waited: 0.000006 sec
wait id: 874             p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
}

Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------

===============================================================================
Intersecting chains:

-------------------------------------------------------------------------------
Chain 2:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 2 (orcl.orcl2)
os id: 24814
process id: 44, oracle@hostb
session id: 10
session serial #: 3041
}
is waiting for 'enq: TX - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x8000e
p3: 'sequence'=0x452
time in wait: 343 min 6 sec
timeout after: never
wait id: 133
blocking: 0 sessions
current sql:  update scott.emp set ename=ename|| '2' where empno='7369'
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1876<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+1865<-kslwaitctx()+163<-kjusuc()+3619<-ksipgetctxi()+1759<-ksqcmi()+24096<-ksqgtlctx()+3488<-ksqgelctx()+569<-ktuGetTxForXid()+131<-ktcwit1()+313<-kdddgb()+8525<-kdusru()+461<-updrowFastPath()+1209<-qerupFetch()+2554<-updaul()+1309<-updThreePhaseExe()+324<-updexe()+443<-opiexe()+10085<-kpoal8()+2224<-opiodr()+916<-ttcpip()+2242<-opitsk()+1673<-opiino()+966<-opiodr()+916<-opidrv()+570<-sou2o()+103
wait history:
* time between current wait and wait #1: 0.000407 sec
1.       event: 'gc cr block 2-way'
time waited: 0.002959 sec
wait id: 132             p1: ''=0x3
p2: ''=0xf0
p3: ''=0x1f
* time between wait #1 and #2: 0.000215 sec
2.       event: 'Disk file operations I/O'
time waited: 0.000011 sec
wait id: 131             p1: 'FileOperation'=0x2
p2: 'fileno'=0x3
p3: 'filetype'=0x2
* time between wait #2 and #3: 0.000245 sec
3.       event: 'gc current block 2-way'
time waited: 0.001150 sec
wait id: 130             p1: ''=0x4
p2: ''=0x97
p3: ''=0x2000001
}
and is blocked by 'instance: 1, os id: 18342, session id: 137',
which is a member of 'Chain 1'.

Chain 2 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 2 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
Chain 3:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 2 (orcl.orcl2)
os id: 1112
process id: 50, oracle@hostb
session id: 255
session serial #: 3581
}
is waiting for 'enq: TX - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x8000e
p3: 'sequence'=0x452
time in wait: 7 min 12 sec
timeout after: never
wait id: 23
blocking: 0 sessions
current sql: update scott.emp set ename=ename|| '2+1' where empno='7369'
short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1876<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+160<-ksliwat()+1865<-kslwaitctx()+163<-kjusuc()+3619<-ksipgetctxi()+1759<-ksqcmi()+24096<-ksqgtlctx()+3488<-ksqgelctx()+569<-ktuGetTxForXid()+131<-ktcwit1()+313<-kdddgb()+8525<-kdusru()+461<-updrowFastPath()+1209<-qerupFetch()+2554<-updaul()+1309<-updThreePhaseExe()+324<-updexe()+443<-opiexe()+10085<-kpoal8()+2224<-opiodr()+916<-ttcpip()+2242<-opitsk()+1673<-opiino()+966<-opiodr()+916<-opidrv()+570<-sou2o()+103
wait history:
* time between current wait and wait #1: 0.005374 sec
1.       event: 'SQL*Net message from client'
time waited: 42.635240 sec
wait id: 22              p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000005 sec
2.       event: 'SQL*Net message to client'
time waited: 0.000005 sec
wait id: 21              p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000340 sec
3.       event: 'SQL*Net message from client'
time waited: 0.000568 sec
wait id: 20              p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
}
and is blocked by 'instance: 1, os id: 18342, session id: 137',
which is a member of 'Chain 1'.

Chain 3 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 3 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------

===============================================================================
Extra information that will be dumped at higher levels:
[level  4] :   1 node dumps -- [LEAF] [LEAF_NW]
[level  5] :   3 node dumps -- [NO_WAIT] [INVOL_WT] [SINGLE_NODE] [NLEAF] [SINGLE_NODE_NW]

State of ALL nodes-->可以看到,在执行“oradebug -g all hanganalyze 3”命令的节点的diag文件中,才有这个“State of ALL nodes”字样。
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[9]/1/10/193/0x95231978/24694/NLEAF/[136]---->nodenum=9的 session(即:1号实例上sid为10的session)的blocker(阻塞者)为nodenum=136的 session
[136]/1/137/2227/0x95376400/18342/LEAF/------>nodenum=136的 session是blocker(阻塞者)
[481]/2/10/3041/0x95231978/24814/NLEAF/[136]->nodenum=481的 session(即:2号实例上sid为10的session)的blocker(阻塞者)为nodenum=136的 session
[726]/2/255/3581/0x94d0a760/1112/NLEAF/[136]->nodenum=726的 session(即:2号实例上sid为255的session)的blocker(阻塞者)为nodenum=136的 session

*** 2015-11-13 19:20:12.061
===============================================================================
END OF HANG ANALYSIS
===============================================================================

*** 2015-11-13 19:20:12.064
===============================================================================
HANG ANALYSIS DUMPS:
oradebug_node_dump_level: 3
===============================================================================

State of LOCAL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[9]/1/10/193/0x95231978/24694/NLEAF/[136]
[136]/1/137/2227/0x95376400/18342/LEAF/

No processes qualify for dumping.

===============================================================================
HANG ANALYSIS DUMPS: END
===============================================================================
session (kjzddbx) switches to a new action
PORADATA SERIAL:1 RINST:2 RDIAG:31036
PORADATA SERIAL:1 RINST:2 REMOTE EXEC OSPID:31036
PORADATA SERIAL:1 RINST:2 TRACEFILE:/u02/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_31036.trc
PORADATA SERIAL:1 RINST:2  DONE

/u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_13164.trc的内容摘录完毕。

 

/u02/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_31036.trc的内容摘录如下:

*** 2015-11-13 19:20:11.423
****************************************************************
PORADEBUG BEGIN ORIGINATING INST:1 SERIAL:1 PID:24794
PORADATA COMMAND:hanganalyze 3
session (kjzddbx) switches to a new action
PORADATA TRACEFILE /u02/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_31036.trc
PORADEBUG END ORIGINATING INST:1 SERIAL:1 PID:24794
****************************************************************

*** 2015-11-13 19:20:12.064
===============================================================================
HANG ANALYSIS DUMPS:
oradebug_node_dump_level: 3
===============================================================================

State of LOCAL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[481]/2/10/3041/0x95231978/24814/NLEAF/[136]---->可以看到,sid为10的session的blocker(阻塞者)是 nodenum=136,而这个136不在State of LOCAL nodes中,这说明这个nodenum=136在对方实例上。
[726]/2/255/3581/0x94d0a760/1112/NLEAF/[136]---->可以看到,sid为255的session的blocker(阻塞者)是 nodenum=136,而这个136不在State of LOCAL nodes中,这说明这个nodenum=136在对方实力上。
->上面的2是指2号实例,该编号从1开始。

No processes qualify for dumping.

===============================================================================
HANG ANALYSIS DUMPS: END
===============================================================================
session (kjzddbx) switches to a new action

*** 2015-11-13 19:20:12.284
Oradebug command console output: <none>


/u02/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_31036.trc的内容摘录完毕。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: