Oracle RAC环境下定位并杀掉最终阻塞的会话-续
2018-04-21 08:47
357 查看
之前在《Oracle RAC环境下定位并杀掉最终阻塞的会话》中,最终使用一个SQL查询出RAC实例之间的所有阻塞关系。但是实际在某些极端的生产环境,是不允许执行复杂的SQL语句,即使允许执行可能现场也不方便复制SQL,手敲的话效率低下,那么本文就介绍另一种简单的方法来快速定位最终阻塞会话,也就是DBA常用的oradebug hanganalyze。
1.模拟故障
2.oradebug hanganalyze
3.分析trace文件
如果按照之前的SQL查询,结果如下:
oradebug hanganalyze 3
因为我这里是RAC环境,需要分析所有实例:
oradebug -g all hanganalyze 3
Chain 1: 可以看到实例1的会话29被实例2的会话148阻塞,实例2的会话148又被实例1的会话26阻塞;
Chain 2: 可以看到实例2的会话23被实例2的会话148阻塞,而实例2的会话148又在第一个chain中。
可以发现这与我之前用SQL查询的结果是一样的意思,都可以做到快速定位最终阻塞会话是实例1的会话26,与客户确认后杀掉即可。
附:oradebug hanganalyze 3分析的trace文件中的核心信息
可以看到这种oradebug hanganalyze生成的trc文件并没有那么的难理解,反而这种chain的描述更加清楚,身为一名专业的Oracle DBA,我们是必须要掌握这种方法的。
1.模拟故障
2.oradebug hanganalyze
3.分析trace文件
1.模拟故障
直接根据《Oracle RAC环境下定位并杀掉最终阻塞的会话》中的来模拟,不再赘述。如果按照之前的SQL查询,结果如下:
SYS@jyzhao1 >--cascade blocking@gv$session SYS@jyzhao1 >select * 2 from (select a.inst_id, a.sid, a.serial#, 3 a.sql_id, 4 a.event, 5 a.status, 6 connect_by_isleaf as isleaf, 7 sys_connect_by_path(a.SID||'@'||a.inst_id, ' <- ') tree, 8 level as tree_level 9 from gv$session a 10 start with a.blocking_session is not null 11 connect by (a.sid||'@'||a.inst_id) = prior (a.blocking_session||'@'||a.blocking_instance)) 12 where isleaf = 1 13 order by tree_level asc; INST_ID SID SERIAL# SQL_ID EVENT STATUS ISLEAF TREE TREE_LEVEL ---------- ---------- ---------- ------------- ----------------------------------- -------- ---------- ------------------------------ ---------- 1 26 3479 SQL*Net message from client INACTIVE 1 <- 148@2 <- 26@1 2 1 26 3479 SQL*Net message from client INACTIVE 1 <- 29@1 <- 148@2 <- 26@1 3 1 26 3479 SQL*Net message from client INACTIVE 1 <- 23@2 <- 148@2 <- 26@1 3 SYS@jyzhao1 >
2.oradebug hanganalyze
如今假设我们的环境的确不方便使用SQL查询,那么就需要用到oradebug hanganalyze来分析。oradebug hanganalyze 3
因为我这里是RAC环境,需要分析所有实例:
oradebug -g all hanganalyze 3
SYS@jyzhao1 >oradebug -g all hanganalyze 3 Hang Analysis in /opt/app/oracle/diag/rdbms/jyzhao/jyzhao1/trace/jyzhao1_diag_1919.trc
3.分析trace文件
我们去分析这个生成的trc文件,可以很清楚的看到HANG分析部分,存在两个chain,比如我这个实验的情况就是:Chain 1: 可以看到实例1的会话29被实例2的会话148阻塞,实例2的会话148又被实例1的会话26阻塞;
Chain 2: 可以看到实例2的会话23被实例2的会话148阻塞,而实例2的会话148又在第一个chain中。
可以发现这与我之前用SQL查询的结果是一样的意思,都可以做到快速定位最终阻塞会话是实例1的会话26,与客户确认后杀掉即可。
附:oradebug hanganalyze 3分析的trace文件中的核心信息
*** 2018-04-21 07:51:44.975 =============================================================================== HANG ANALYSIS: instances (db_name.oracle_sid): jyzhao.jyzhao1, jyzhao.jyzhao2 oradebug_node_dump_level: 3 analysis initiated by oradebug os thread scheduling delay history: (sampling every 1.000000 secs) 0.000000 secs at [ 07:51:44 ] NOTE: scheduling delay has not been sampled for 0.305046 secs 0.000000 secs from [ 07:51:40 - 07:51:45 ], 5 sec avg 0.000000 secs from [ 07:50:45 - 07:51:45 ], 1 min avg 0.000000 secs from [ 07:46:45 - 07:51:45 ], 5 min avg vktm time drift history =============================================================================== Chains most likely to have caused the hang: [a] Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'<='enq: TX - row lock contention' Chain 1 Signature Hash: 0x42598823 [b] Chain 2 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'<='enq: TX - row lock contention' Chain 2 Signature Hash: 0x42598823 =============================================================================== Non-intersecting chains: ------------------------------------------------------------------------------- Chain 1: ------------------------------------------------------------------------------- Oracle session identified by: { instance: 1 (jyzhao.jyzhao1) os id: 2712 process id: 42, oracle@jyrac1 (TNS V1-V3) session id: 29 session serial #: 81 } is waiting for 'enq: TX - row lock contention' with wait info: { p1: 'name|mode'=0x54580006 p2: 'usn<<16 | slot'=0xe0002 p3: 'sequence'=0x3a3 time in wait: 8 min 21 sec timeout after: never wait id: 37 blocking: 0 sessions current sql: update emp set job = 'CEO' where empno = 7839 short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+178<-ksliwat()+2022<-kslwaitctx()+163<-k jusuc()+3400<-ksipgetctxi()+1759<-ksqcmi()+20798<-ksqgtlctx()+3501<-ksqgelctx()+557<-ktuGetTxForXid()+131<-ktcwit1()+336<-kdddgb()+8587<-kdusru()+460<-kauupd()+412<-updrow ()+2167<-qerupFetch()+860<-updaul()+1378<-updThreePhaseExe()+318<-updexe()+638<-opiexe()+10378<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-op iodr()+917<-opidrv()+570<-sou2o( wait history: * time between current wait and wait #1: 0.010123 sec 1. event: 'gc current block 2-way' time waited: 0.000622 sec wait id: 36 p1: ''=0x7 p2: ''=0x6483 p3: ''=0x2000001 * time between wait #1 and #2: 0.007260 sec 2. event: 'gc cr block 2-way' time waited: 0.000501 sec wait id: 35 p1: ''=0x6 p2: ''=0x523 p3: ''=0x2c * time between wait #2 and #3: 0.000462 sec 3. event: 'gc cr block 2-way' time waited: 0.000689 sec wait id: 34 p1: ''=0x6 p2: ''=0xb0 p3: ''=0x2b } and is blocked by => Oracle session identified by: { instance: 2 (jyzhao.jyzhao2) os id: 23427 process id: 41, oracle@jyrac2 (TNS V1-V3) session id: 148 session serial #: 17715 } which is waiting for 'enq: TX - row lock contention' with wait info: { p1: 'name|mode'=0x54580006 p2: 'usn<<16 | slot'=0x50008 p3: 'sequence'=0x9e6 time in wait: 9 min 9 sec timeout after: never wait id: 152 blocking: 2 sessions current sql: update emp set job = 'MANAGER' where empno = 7788 short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+178<-ksliwat()+2022<-kslwaitctx()+163<-k jusuc()+3400<-ksipgetctxi()+1759<-ksqcmi()+20798<-ksqgtlctx()+3501<-ksqgelctx()+557<-ktuGetTxForXid()+131<-ktcwit1()+336<-kdddgb()+8587<-kdusru()+460<-kauupd()+412<-updrow ()+2167<-qerupFetch()+860<-updaul()+1378<-updThreePhaseExe()+318<-updexe()+638<-opiexe()+10378<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-op iodr()+917<-opidrv()+570<-sou2o( wait history: * time between current wait and wait #1: 0.005023 sec 1. event: 'gc cr block 2-way' time waited: 0.000478 sec wait id: 151 p1: ''=0x3 p2: ''=0xc0 p3: ''=0x19 * time between wait #1 and #2: 0.000295 sec 2. event: 'gc cr block 2-way' time waited: 0.000821 sec wait id: 150 p1: ''=0x3 p2: ''=0x362 p3: ''=0x1a * time between wait #2 and #3: 0.000294 sec 3. event: 'gc cr block 2-way' time waited: 0.000431 sec wait id: 149 p1: ''=0x3 p2: ''=0xc0 p3: ''=0x19 } and is blocked by => Oracle session identified by: { instance: 1 (jyzhao.jyzhao1) os id: 1648 process id: 32, oracle@jyrac1 (TNS V1-V3) session id: 26 session serial #: 3479 } which is waiting for 'SQL*Net message from client' with wait info: { p1: 'driver id'=0x62657100 p2: '#bytes'=0x1 time in wait: 9 min 28 sec timeout after: never wait id: 168 blocking: 3 sessions current sql: <none> short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-read()+14<-ntpfprd()+117<-nsbasic_brc()+376<-nsbrecv()+69<-nioqrc()+4 95<-opikndf2()+978<-opitsk()+831<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253 wait history: * time between current wait and wait #1: 0.000010 sec 1. event: 'SQL*Net message to client' time waited: 0.000003 sec wait id: 167 p1: 'driver id'=0x62657100 p2: '#bytes'=0x1 * time between wait #1 and #2: 0.000239 sec 2. event: 'gc current grant 2-way' time waited: 0.000337 sec wait id: 166 p1: ''=0x7 p2: ''=0x6483 p3: ''=0x2010001 * time between wait #2 and #3: 0.000196 sec 3. event: 'db file sequential read' time waited: 0.000824 sec wait id: 165 p1: 'file#'=0x7 p2: 'block#'=0x6483 p3: 'blocks'=0x1 } Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'<='enq: TX - row lock contention' Chain 1 Signature Hash: 0x42598823 ------------------------------------------------------------------------------- =============================================================================== Intersecting chains: ------------------------------------------------------------------------------- Chain 2: ------------------------------------------------------------------------------- Oracle session identified by: { instance: 2 (jyzhao.jyzhao2) os id: 23488 process id: 42, oracle@jyrac2 (TNS V1-V3) session id: 23 session serial #: 12635 } is waiting for 'enq: TX - row lock contention' with wait info: { p1: 'name|mode'=0x54580006 p2: 'usn<<16 | slot'=0xe0002 p3: 'sequence'=0x3a3 time in wait: 8 min 34 sec timeout after: never wait id: 39 blocking: 0 sessions current sql: update emp set sal = 15000 where empno = 7839 short stack: ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-semtimedop()+10<-skgpwwait()+178<-ksliwat()+2022<-kslwaitctx()+163<-k jusuc()+3400<-ksipgetctxi()+1759<-ksqcmi()+20798<-ksqgtlctx()+3501<-ksqgelctx()+557<-ktuGetTxForXid()+131<-ktcwit1()+336<-kdddgb()+8587<-kdusru()+460<-kauupd()+412<-updrow ()+2167<-qerupFetch()+860<-updaul()+1378<-updThreePhaseExe()+318<-updexe()+638<-opiexe()+10378<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk()+1710<-opiino()+969<-op iodr()+917<-opidrv()+570<-sou2o( wait history: * time between current wait and wait #1: 0.000226 sec 1. event: 'gc cr block 2-way' time waited: 0.000605 sec wait id: 38 p1: ''=0x3 p2: ''=0xc0 p3: ''=0x19 * time between wait #1 and #2: 0.011878 sec 2. event: 'gc cr block 2-way' time waited: 0.000610 sec wait id: 37 p1: ''=0x3 p2: ''=0xc0 p3: ''=0x19 * time between wait #2 and #3: 0.000316 sec 3. event: 'Disk file operations I/O' time waited: 0.000007 sec wait id: 36 p1: 'FileOperation'=0x2 p2: 'fileno'=0x3 p3: 'filetype'=0x2 } and is blocked by 'instance: 2, os id: 23427, session id: 148', which is a member of 'Chain 1'. Chain 2 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'<='enq: TX - row lock contention' Chain 2 Signature Hash: 0x42598823 ------------------------------------------------------------------------------- ===============================================================================
可以看到这种oradebug hanganalyze生成的trc文件并没有那么的难理解,反而这种chain的描述更加清楚,身为一名专业的Oracle DBA,我们是必须要掌握这种方法的。
相关文章推荐
- Oracle RAC环境下的阻塞(blocking blocked)介绍和实例演示
- Oracle RAC环境下的阻塞(blocking blocked)介绍和实例演示
- 如何定位Oracle数据库被锁阻塞会话的根源
- oracle 阻塞会话的查看与解除
- ORACLE RAC--如何定位追踪产生的core文件
- Oracle RAC 环境下的连接管理
- Java路径问题最终解决方案—可定位所有资源的相对路径寻址
- 在WinRt环境下异步调试定位问题
- 也谈windows下cx_Oracle的配置,环境连接乱码种种问题最终解决
- v$session——记录数据库当前会话的阻塞、等待信息!
- Oracle rac集群环境中的特殊问题
- 巧用backtrace系列函数,在不具备gdb环境的Linux系统上大致定位段错误位置
- quartz在集群环境下的最终解决方案
- HDNL性能调优(3)Oracle RAC 环境
- UWB定位环境搭建
- Java路径问题最终解决方案—可定位所有资源的相对路径寻址
- UPDATE 时, 如何避免数据定位处理被阻塞
- linux环境下无阻塞键盘输入以及MAP容器中value为数组的一些问题
- 基于CentOS与VmwareStation10搭建Oracle11G RAC 64集群环境:4.安装Oracle RAC FAQ-4.5.安装Grid,创建ASM磁盘组空间不足
- oracle RAC环境vip异常的修复