A case study: ASM resource in UNKNOWN state caused by sqlnet.ora
2015-07-21 21:52
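1. Background
I had not touched my own test environment for a long time, and one day I needed it to investigate a GoldenGate (GG) issue. When I started the RAC, the ASM instance on one node ended up in UNKNOWN state, and the listener on the same node was down and reported an error when started. The system configuration was as follows: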
OS:RHEL 5.5
CRS:10.2.0.5
DB:10.2.0.5
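The OCR and related cluster files are stored on OCFS, while the data files are stored in ASM.
2. Problem analysis
2.1. Starting the cluster
While the cluster was starting, the nodeapps on node101 all came up normally, which indicates that the OCR and related devices were fine; a check of the OCR status also came back normal. The cluster state after startup was as follows: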
[oracle@node101 bdump]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....s1.inst application OFFLINE OFFLINE
ora....s2.inst application OFFLINE OFFLINE
ora....prod.db application OFFLINE OFFLINE
ora.dgrac.db application ONLINE ONLINE node102
ora....c1.inst application ONLINE OFFLINE
ora....c2.inst application ONLINE ONLINE node102
ora.ggsplx.db application OFFLINE OFFLINE
ora....p1.inst application OFFLINE OFFLINE
ora....p2.inst application OFFLINE OFFLINE
ora....SM1.asm application ONLINE UNKNOWN node101
ora....01.lsnr application ONLINE OFFLINE
ora....101.gsd application ONLINE ONLINE node101
ora....101.ons application ONLINE ONLINE node101
ora....101.vip application ONLINE ONLINE node101
ora....SM2.asm application ONLINE ONLINE node102
ora....02.lsnr application ONLINE ONLINE node102
ora....102.gsd application ONLINE ONLINE node102
ora....102.ons application ONLINE ONLINE node102
ora....102.vip application ONLINE ONLINE node102
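With this many resources it is easy to miss the ones that matter. The following is a minimal sketch that filters `crs_stat -t` style output down to resources whose Target is ONLINE but whose State is not. The function name `find_unhealthy_resources` is hypothetical, and the sketch assumes the five-column layout shown above.

```shell
#!/usr/bin/env bash
# List resources whose Target is ONLINE but whose State is anything else.
# Reads `crs_stat -t` style text on stdin.
# Assumed columns: Name Type Target State [Host].
find_unhealthy_resources() {
    awk '$1 ~ /^ora/ && $3 == "ONLINE" && $4 != "ONLINE" { print $1, $4 }'
}

# Usage on a cluster node (assumes crs_stat is on PATH):
#   crs_stat -t | find_unhealthy_resources
```

Resources whose Target is OFFLINE are skipped on purpose, since they were never meant to be running.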
2.2. Analyzing the logs
crsd.log showed the following errors during cluster startup:
2014-09-22 14:11:16.939: [ CRSRES][1499146560]0Attempting to start `ora.node101.ASM1.asm` on member `node101`
2014-09-22 14:11:17.066: [ CRSRES][1501247808]0Attempting to start `ora.node101.vip` on member `node101`
2014-09-22 14:11:17.918: [ CRSRES][1503349056]0Attempting to start `ora.node102.ASM2.asm` on member `node102`
2014-09-22 14:11:17.924: [ CRSRES][1505450304]0Attempting to start `ora.node102.vip` on member `node102`
2014-09-22 14:11:23.786: [ CRSRES][1501247808]0Start of `ora.node101.vip` on member `node101` succeeded.
2014-09-22 14:11:25.565: [ CRSRES][1501247808]0startRunnable: setting CLI values
2014-09-22 14:11:25.604: [ CRSRES][1505450304]0Start of `ora.node102.vip` on member `node102` succeeded.
2014-09-22 14:11:27.450: [ CRSRES][1501247808]0Attempting to start `ora.node101.LISTENER_NODE101.lsnr` on member `node101`
2014-09-22 14:11:29.452: [ CRSRES][1505450304]0Attempting to start `ora.node102.LISTENER_NODE102.lsnr` on member `node102`
2014-09-22 14:11:36.367: [ CRSAPP][1499146560]0StartResource error for ora.node101.ASM1.asm error code = 1
2014-09-22 14:11:39.567: [ CRSAPP][1501247808]0StartResource error for ora.node101.LISTENER_NODE101.lsnr error code = 1
2014-09-22 14:11:41.746: [ CRSAPP][1499146560]0StopResource error for ora.node101.ASM1.asm error code = 1
2014-09-22 14:11:41.819: [ CRSRES][1501247808]0Start of `ora.node101.LISTENER_NODE101.lsnr` on member `node101` failed.
2014-09-22 14:11:42.054: [ CRSRES][1499146560]0X_OP_StopResourceFailed : Stop Resource failed
(File: rti.cpp, line: 1808
2014-09-22 14:11:42.055: [ CRSRES][1499146560][ALERT]0`ora.node101.ASM1.asm` on member `node101` has experienced an unrecoverable failure.
2014-09-22 14:11:42.055: [ CRSRES][1499146560]0Human intervention required to resume its availability.
2014-09-22 14:11:42.971: [ CRSRES][1505450304]0Start of `ora.node102.LISTENER_NODE102.lsnr` on member `node102` succeeded.
2014-09-22 14:11:43.064: [ CRSRES][1499146560]0CRS-1028: Dependency analysis failed because of:
'Resource in UNKNOWN state: ora.node101.ASM1.asm'
2014-09-22 14:11:43.209: [ CRSRES][1543223616]0startRunnable: setting CLI values
2014-09-22 14:11:43.226: [ CRSRES][1543223616]0Attempting to start `ora.node101.LISTENER_NODE101.lsnr` on member `node101`
2014-09-22 14:11:44.344: [ CRSAPP][1543223616]0StartResource error for ora.node101.LISTENER_NODE101.lsnr error code = 1
2014-09-22 14:11:45.448: [ CRSRES][1543223616]0Start of `ora.node101.LISTENER_NODE101.lsnr` on member `node101` failed.
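Most of the log above is routine "Attempting to start" and "succeeded" noise. As a rough sketch, the failure lines can be pulled out with a single `grep`. The function name `crsd_failures` and the pattern list are my own choices based on the messages seen in this excerpt, not an exhaustive list of CRS error formats.

```shell
#!/usr/bin/env bash
# Print only the lines of a crsd.log that point at a failure.
# $1: path to the log file.
# The patterns are taken from the messages in this excerpt and are
# an assumption; other CRS versions may word failures differently.
crsd_failures() {
    grep -E 'error code = [1-9]|failed|unrecoverable failure|CRS-[0-9]{4}' "$1"
}

# Usage (the log path is an assumption; adjust to the local CRS home):
#   crsd_failures "$ORA_CRS_HOME/log/$(hostname)/crsd/crsd.log"
```

Continuation lines that follow a matched message, such as the quoted resource name after CRS-1028, are not printed, so the full log is still worth opening once the failing resource is known.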
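Prompted by the log, I tried starting the ASM instance manually. It can be started, but only when a password is supplied; the password-less OS-authenticated login is rejected: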
[oracle@node101 admin]$ export ORACLE_SID=+ASM1
[oracle@node101 admin]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Mon Sep 22 15:30:51 2014
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
ERROR:
ORA-01031: insufficient privileges
Enter user-name:
Starting the listener manually failed as well, this time with a valid node checking configuration error (TNS-00584):
[oracle@node101 admin]$ lsnrctl start
LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 22-SEP-2014 15:32:01
Copyright (c) 1991, 2010, Oracle. All rights reserved.
Starting /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 10.2.0.5.0 - Production
System parameter file is /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Error listening on: (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
TNS-12560: TNS:protocol adapter error
TNS-00584: Valid node checking configuration error
Listener failed to start. See the error message(s) above...
After stopping the resource with crs_stop -f <res_name> and starting it again, ASM reported the following error:
[oracle@node101 admin]$ srvctl start asm -n node101
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:SQL*Plus: Release 10.2.0.5.0 - Production on Mon Sep 22 15:36:11 2014
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:Enter user-name: ERROR:
node101:ora.node101.ASM1.asm:ORA-01031: insufficient privileges
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:Enter user-name: SP2-0306: Invalid option.
node101:ora.node101.ASM1.asm:Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER}]
node101:ora.node101.ASM1.asm:where <logon> ::= <username>[/<password>][@<connect_identifier>] | /
node101:ora.node101.ASM1.asm:Enter user-name: Enter password:
node101:ora.node101.ASM1.asm:ERROR:
node101:ora.node101.ASM1.asm:ORA-01005: null password given; logon denied
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus
node101:ora.node101.ASM1.asm:
CRS-0215: Could not start resource 'ora.node101.ASM1.asm'.]]
[PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:SQL*Plus: Release 10.2.0.5.0 - Production on Mon Sep 22 15:36:11 2014
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:Enter user-name: ERROR:
node101:ora.node101.ASM1.asm:ORA-01031: insufficient privileges
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:Enter user-name: SP2-0306: Invalid option.
node101:ora.node101.ASM1.asm:Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER}]
node101:ora.node101.ASM1.asm:where <logon> ::= <username>[/<password>][@<connect_identifier>] | /
node101:ora.node101.ASM1.asm:Enter user-name: Enter password:
node101:ora.node101.ASM1.asm:ERROR:
node101:ora.node101.ASM1.asm:ORA-01005: null password given; logon denied
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:
node101:ora.node101.ASM1.asm:SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus
node101:ora.node101.ASM1.asm:
CRS-0215: Could not start resource 'ora.node101.ASM1.asm'.]]
Starting it once more:
[oracle@node101 admin]$ srvctl start asm -n node101
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.node101.ASM1.asm' has placement error.]]
[PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.node101.ASM1.asm' has placement error.]]
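2.3. Locating the problem
From this information it is clear that CRS could not bring ASM up because logging in to ASM required a password. That pointed to the parameters in sqlnet.ora. Opening the file showed:
TCP.VALIDNODE_CHECKING=yes (only permitted nodes may connect; this caused the listener startup failure)
SQLNET.AUTHENTICATION_SERVICES=none (this caused the password-less login to ASM to fail)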
SQLNET.EXPIRE_TIME=1
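A quick way to spot these two settings on other nodes is to scan sqlnet.ora before starting the cluster. The sketch below is hypothetical (the function name `check_sqlnet` and the warning texts are mine). It also assumes that valid node checking becomes a problem when neither TCP.INVITED_NODES nor TCP.EXCLUDED_NODES is set, which is one plausible trigger for TNS-00584 but is not confirmed by the file contents shown here.

```shell
#!/usr/bin/env bash
# Warn about the two sqlnet.ora settings blamed in this case.
# $1: path to sqlnet.ora. Prints one WARN line per finding.
check_sqlnet() {
    local file=$1
    local cfg
    # Drop comments and blanks, upper-case so matching ignores case.
    cfg=$(sed -e 's/#.*//' "$file" | tr -d ' \t' | tr '[:lower:]' '[:upper:]')

    if printf '%s\n' "$cfg" | grep -q '^TCP\.VALIDNODE_CHECKING=YES'; then
        # Assumption: with no node list at all the listener may refuse to start.
        if ! printf '%s\n' "$cfg" | grep -Eq '^TCP\.(INVITED|EXCLUDED)_NODES='; then
            echo "WARN: TCP.VALIDNODE_CHECKING=yes without a node list"
        fi
    fi

    # NONE switches off OS authentication, so "/ as sysdba" is rejected.
    if printf '%s\n' "$cfg" | grep -Eq '^SQLNET\.AUTHENTICATION_SERVICES=\(?NONE\)?$'; then
        echo "WARN: SQLNET.AUTHENTICATION_SERVICES=none disables OS authentication"
    fi
}

# Usage:
#   check_sqlnet "$ORACLE_HOME/network/admin/sqlnet.ora"
```

The check only reads the file. Parameter values that span several lines are not handled by this sketch.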
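3. Resolution
The analysis above confirms that an inappropriate sqlnet.ora configuration prevented the cluster services from starting, so the problem was resolved with the following steps: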
mv sqlnet.ora sqlnet.ora.bak
crs_stop -f <asm_service>
srvctl start asm -n <node_name>
srvctl start listener -n <node_name>
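Moving the whole file aside also drops settings that were harmless, such as SQLNET.EXPIRE_TIME. As an alternative that I did not use in this case, the sketch below keeps a backup and comments out only the two offending parameters. The function name `disable_sqlnet_params` is hypothetical, and the sketch assumes each parameter sits on a single line.

```shell
#!/usr/bin/env bash
# Back up a sqlnet.ora and comment out the two parameters blamed in this case.
# $1: path to sqlnet.ora. The original is preserved as "$1.bak".
# Assumption: each parameter is written on one line (no continuation lines).
disable_sqlnet_params() {
    local file=$1
    cp -p "$file" "$file.bak" || return 1
    awk '{
        key = toupper($0)          # match parameter names case-insensitively
        gsub(/[ \t]/, "", key)     # ignore spacing around the name and "="
        if (key ~ /^(TCP\.VALIDNODE_CHECKING|SQLNET\.AUTHENTICATION_SERVICES)=/)
            print "# " $0
        else
            print
    }' "$file.bak" > "$file"
}

# Usage:
#   disable_sqlnet_params "$ORACLE_HOME/network/admin/sqlnet.ora"
```

After the edit, the crs_stop and srvctl steps above are run unchanged.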