高并发压力下导致数据库bug
2014-12-22 16:27
218 查看
环境信息: linux 6.1 + oracle11.2.0.3 RAC
问题现象:
学校晚上6点选课,人数大概有3000,7点时,数据库报错如下(数据库到6点多还是可以连接的),数据库hung住了。
Tue Dec 16 18:00:33 2014
Dumping diagnostic data in directory=[cdmp_20141216180033], requested by (instance=2, osid=24917 (M001)), summary=[incident=45211].
Tue Dec 16 18:02:23 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:03:23 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:04:54 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:05:55 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:07:26 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:08:26 2014
PMON failed to acquire latch, see PMON dump
.........................................
.........................................
Tue Dec 16 19:31:32 2014
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x18] [PC:0x9379D18, kglic0()+1086] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/trace/ywkdb2_m003_28984.trc (incident=44523):
ORA-07445: exception encountered: core dump [kglic0()+1086] [SIGSEGV] [ADDR:0x18] [PC:0x9379D18] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/incident/incdir_44523/ywkdb2_m003_28984_i44523.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue Dec 16 20:00:23 2014
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x18] [PC:0x9379D18, kglic0()+1086] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/trace/ywkdb2_m000_30227.trc (incident=44811):
ORA-07445: exception encountered: core dump [kglic0()+1086] [SIGSEGV] [ADDR:0x18] [PC:0x9379D18] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/incident/incdir_44811/ywkdb2_m000_30227_i44811.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
最后排查,是数据库该版本的一个bug。
详见:
ORA-7445 Core Dump [Kglic0()+712] (文档 ID 1325437.1)
MMON Slave Process Reports ORA-7445 [kglic0]/[kksIterCursorStat] Error, Plus Database Hangs (文档 ID 1487720.1)
排查过程:
根据ORA-07445提报的错误,排查oracle官方文档,发现在官方文档中标明这是一个oracle bug;文档中说明,在以下条件下可触发该bug:
这几点,目前数据库状态都符合:
1)。目前数据库版本是11.2.0.3
2)。报错提示是ORA-07445: [kglic0()]
3)。在trc文件中,有kksIterCursorStat提示
4)。_kghdsidx_count值>1
5)。查询select count(*) from v$sql_bind_capture where name like ':SYS%';总数>0
根据官方文档的处理意见,修改隐含参数"_kghdsidx_count"=1,重启数据库服务可以解决此问题。
[align=justify] [/align]
另外根据官方文档中的延伸说明:Bug 12340939 ORA-7445 [kglic0] can occur capturing cursor stats for V$SQLSTATS也可能会触发该问题,修改隐含参数
"_cursor_stats_enabled"=FALSE,重启数据库服务可以解决此问题。
修改后,学校第二天选课没有再报错了。
案例总结:
1.有问题先查MOS
2.具体到这个问题,之前没有出现这个问题,选课时出来了,说明一点:在大数据高并发情况下,这个bug显现了。
导致问题的原因:
During SQL statistics analysis for query execution, we load dependency information into the Library Cache. These dependency memory pieces can be overwritten by other sessions without causing an error. When the original session runs into the wrong dependency information it can lead to the internal error and performance issues during subsequent queries of these Library Cache objects.
---简而言之:Library Cache由于并发压力,历史信息找不到了
或者
During SQL statistics analysis for query execution, we load dependency information into the Library Cache. These dependency memory pieces can be overwritten by other sessions without causing an error. When the original session runs into the wrong dependency information it can lead to the internal error and performance issues during subsequent queries of these Library Cache objects.
---简而言之:表的统计信息有问题
问题现象:
学校晚上6点选课,人数大概有3000,7点时,数据库报错如下(数据库到6点多还是可以连接的),数据库hung住了。
Tue Dec 16 18:00:33 2014
Dumping diagnostic data in directory=[cdmp_20141216180033], requested by (instance=2, osid=24917 (M001)), summary=[incident=45211].
Tue Dec 16 18:02:23 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:03:23 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:04:54 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:05:55 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:07:26 2014
PMON failed to acquire latch, see PMON dump
Tue Dec 16 18:08:26 2014
PMON failed to acquire latch, see PMON dump
.........................................
.........................................
Tue Dec 16 19:31:32 2014
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x18] [PC:0x9379D18, kglic0()+1086] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/trace/ywkdb2_m003_28984.trc (incident=44523):
ORA-07445: exception encountered: core dump [kglic0()+1086] [SIGSEGV] [ADDR:0x18] [PC:0x9379D18] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/incident/incdir_44523/ywkdb2_m003_28984_i44523.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue Dec 16 20:00:23 2014
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x18] [PC:0x9379D18, kglic0()+1086] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/trace/ywkdb2_m000_30227.trc (incident=44811):
ORA-07445: exception encountered: core dump [kglic0()+1086] [SIGSEGV] [ADDR:0x18] [PC:0x9379D18] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/ywkdb/ywkdb2/incident/incdir_44811/ywkdb2_m000_30227_i44811.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
最后排查,是数据库该版本的一个bug。
详见:
ORA-7445 Core Dump [Kglic0()+712] (文档 ID 1325437.1)
MMON Slave Process Reports ORA-7445 [kglic0]/[kksIterCursorStat] Error, Plus Database Hangs (文档 ID 1487720.1)
排查过程:
根据ORA-07445提报的错误,排查oracle官方文档,发现在官方文档中标明这是一个oracle bug;文档中说明,在以下条件下可触发该bug:
这几点,目前数据库状态都符合:
1)。目前数据库版本是11.2.0.3
2)。报错提示是ORA-07445: [kglic0()]
3)。在trc文件中,有kksIterCursorStat提示
4)。_kghdsidx_count值>1
5)。查询select count(*) from v$sql_bind_capture where name like ':SYS%';总数>0
根据官方文档的处理意见,修改隐含参数"_kghdsidx_count"=1,重启数据库服务可以解决此问题。
[align=justify] [/align]
另外根据官方文档中的延伸说明:Bug 12340939 ORA-7445 [kglic0] can occur capturing cursor stats for V$SQLSTATS也可能会触发该问题,修改隐含参数
"_cursor_stats_enabled"=FALSE,重启数据库服务可以解决此问题。
修改后,学校第二天选课没有再报错了。
案例总结:
1.有问题先查MOS
2.具体到这个问题,之前没有出现这个问题,选课时出来了,说明一点:在大数据高并发情况下,这个bug显现了。
导致问题的原因:
During SQL statistics analysis for query execution, we load dependency information into the Library Cache. These dependency memory pieces can be overwritten by other sessions without causing an error. When the original session runs into the wrong dependency information it can lead to the internal error and performance issues during subsequent queries of these Library Cache objects.
---简而言之:Library Cache由于并发压力,历史信息找不到了
或者
During SQL statistics analysis for query execution, we load dependency information into the Library Cache. These dependency memory pieces can be overwritten by other sessions without causing an error. When the original session runs into the wrong dependency information it can lead to the internal error and performance issues during subsequent queries of these Library Cache objects.
---简而言之:表的统计信息有问题
相关文章推荐
- 简单演示 Oracle 数据库并发导致行级锁
- Oracle 数据库针对表主键列并发导致行级锁简单演示
- 又见bug,动态调整SGA参数导致数据库crash
- 并发数据库事务缺锁导致的数据不一致情况:丢失更新,脏读,不可重复读,幻读
- centos 7.2系统bug导致HGDB后台进程崩溃,数据库无法访问
- 简单演示 Oracle 数据库并发导致段级锁(表级锁)
- ASMB的BUG(ORA-04030 kfmditer)导致数据库宕机
- ASMB的BUG(ORA-04030 kfmditer)导致数据库宕机
- 并发数据库压力测试的shell脚本代码
- 并发操作导致数据库不一致性
- oracle:数据库版本问题导致的bug
- 高并发下Netty4底层bug导致直接内存溢出分析
- 大量并发SQL导致数据库性能问题诊断优化
- 简单处理K->V缓存失效后高并发访问导致数据库负载急剧上升的问题
- 由 BUG 引发 ORA-15064 进而导致数据库实例意外中止
- 并发数据库压力测试的shell脚本代码
- MySQL + PHP的模式在大并发压力下导致MySQL中存在大量僵死进程
- 数据库并发导致的脏读 不可重复读 幻读的处理方法
- 关于服务器并发访问导致重复写数据库的问题