A Case of Fixing an ASM Resource in UNKNOWN State Caused by sqlnet.ora

2015-07-21 21:52

1. Background

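I had not used my own test environment for a long time, until I needed it to investigate a GoldenGate (GG) issue. When I started RAC, the ASM instance on one node came up in the UNKNOWN state, and the listener on that node was not running and reported an error when started. The system configuration was as follows: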

OS:RHEL 5.5
CRS:10.2.0.5
DB:10.2.0.5
The OCR and related files are stored on OCFS, while the data files are stored in ASM.

2. Problem Analysis
2.1. Starting the Cluster

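While the cluster was starting, the nodeapps on node101 came up normally, which indicates that the OCR and related devices were fine. A check of the OCR status also came back normal. The cluster status after startup was as follows: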

[oracle@node101 bdump]$ crs_stat -t

Name           Type           Target    State     Host       

------------------------------------------------------------

ora....s1.inst application    OFFLINE   OFFLINE              

ora....s2.inst application    OFFLINE   OFFLINE              

ora....prod.db application    OFFLINE   OFFLINE              

ora.dgrac.db   application    ONLINE    ONLINE    node102    

ora....c1.inst application    ONLINE    OFFLINE              

ora....c2.inst application    ONLINE    ONLINE    node102    

ora.ggsplx.db  application    OFFLINE   OFFLINE              

ora....p1.inst application    OFFLINE   OFFLINE              

ora....p2.inst application    OFFLINE   OFFLINE              

ora....SM1.asm application    ONLINE    UNKNOWN   node101    
ora....01.lsnr application    ONLINE    OFFLINE              

ora....101.gsd application    ONLINE    ONLINE    node101    

ora....101.ons application    ONLINE    ONLINE    node101    

ora....101.vip application    ONLINE    ONLINE    node101    

ora....SM2.asm application    ONLINE    ONLINE    node102    

ora....02.lsnr application    ONLINE    ONLINE    node102    

ora....102.gsd application    ONLINE    ONLINE    node102    

ora....102.ons application    ONLINE    ONLINE    node102    

ora....102.vip application    ONLINE    ONLINE    node102
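
With this many resources it is easy to miss the ones that matter. The following is a minimal sketch that filters `crs_stat -t` output down to the rows whose State differs from their Target. It assumes the five-column layout shown above; `crs_mismatches` is a hypothetical helper name, not an Oracle tool.

```shell
#!/bin/sh
# List resources from `crs_stat -t` output (read on stdin) whose State
# differs from their Target. Assumed columns: Name Type Target State [Host].
# `crs_mismatches` is a hypothetical helper name, not part of Oracle CRS.
crs_mismatches() {
    # Header and separator rows are skipped because their second field
    # is not "application".
    awk '$2 == "application" && $3 != $4 { print $1, $3, $4 }'
}

# Example with three rows copied from the output above.
printf '%s\n' \
    'ora....SM1.asm application    ONLINE    UNKNOWN   node101' \
    'ora....01.lsnr application    ONLINE    OFFLINE' \
    'ora....101.vip application    ONLINE    ONLINE    node101' |
    crs_mismatches
# prints:
#   ora....SM1.asm ONLINE UNKNOWN
#   ora....01.lsnr ONLINE OFFLINE
```

On a live node the same function could be fed directly with `crs_stat -t | crs_mismatches`. Resources whose Target is OFFLINE are deliberately ignored, since they are not expected to be running.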

2.2. Analyzing the Logs

crsd.log showed the following errors during cluster startup:

2014-09-22 14:11:16.939: [  CRSRES][1499146560]0Attempting to start `ora.node101.ASM1.asm` on member `node101`

2014-09-22 14:11:17.066: [  CRSRES][1501247808]0Attempting to start `ora.node101.vip` on member `node101`

2014-09-22 14:11:17.918: [  CRSRES][1503349056]0Attempting to start `ora.node102.ASM2.asm` on member `node102`

2014-09-22 14:11:17.924: [  CRSRES][1505450304]0Attempting to start `ora.node102.vip` on member `node102`

2014-09-22 14:11:23.786: [  CRSRES][1501247808]0Start of `ora.node101.vip` on member `node101` succeeded.

2014-09-22 14:11:25.565: [  CRSRES][1501247808]0startRunnable: setting CLI values

2014-09-22 14:11:25.604: [  CRSRES][1505450304]0Start of `ora.node102.vip` on member `node102` succeeded.
2014-09-22 14:11:27.450: [  CRSRES][1501247808]0Attempting to start `ora.node101.LISTENER_NODE101.lsnr` on member `node101`

2014-09-22 14:11:29.452: [  CRSRES][1505450304]0Attempting to start `ora.node102.LISTENER_NODE102.lsnr` on member `node102`

2014-09-22 14:11:36.367: [  CRSAPP][1499146560]0StartResource error for ora.node101.ASM1.asm error code = 1

2014-09-22 14:11:39.567: [  CRSAPP][1501247808]0StartResource error for ora.node101.LISTENER_NODE101.lsnr error code = 1

2014-09-22 14:11:41.746: [  CRSAPP][1499146560]0StopResource error for ora.node101.ASM1.asm error code = 1

2014-09-22 14:11:41.819: [  CRSRES][1501247808]0Start of `ora.node101.LISTENER_NODE101.lsnr` on member `node101` failed.

2014-09-22 14:11:42.054: [  CRSRES][1499146560]0X_OP_StopResourceFailed : Stop Resource failed

(File: rti.cpp, line: 1808

2014-09-22 14:11:42.055: [  CRSRES][1499146560][ALERT]0`ora.node101.ASM1.asm` on member `node101` has experienced an unrecoverable failure.

2014-09-22 14:11:42.055: [  CRSRES][1499146560]0Human intervention required to resume its availability.

2014-09-22 14:11:42.971: [  CRSRES][1505450304]0Start of `ora.node102.LISTENER_NODE102.lsnr` on member `node102` succeeded.

2014-09-22 14:11:43.064: [  CRSRES][1499146560]0CRS-1028: Dependency analysis failed because of:

'Resource in UNKNOWN state: ora.node101.ASM1.asm'

2014-09-22 14:11:43.209: [  CRSRES][1543223616]0startRunnable: setting CLI values

2014-09-22 14:11:43.226: [  CRSRES][1543223616]0Attempting to start `ora.node101.LISTENER_NODE101.lsnr` on member `node101`

2014-09-22 14:11:44.344: [  CRSAPP][1543223616]0StartResource error for ora.node101.LISTENER_NODE101.lsnr error code = 1

2014-09-22 14:11:45.448: [  CRSRES][1543223616]0Start of `ora.node101.LISTENER_NODE101.lsnr` on member `node101` failed.
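
The excerpt above interleaves messages for both nodes, so it can help to reduce it to just the resources that failed to start. This is a rough sketch that assumes the `StartResource error for <resource> error code = <n>` wording seen in this log; `crs_start_errors` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the unique resource names that hit "StartResource error" in crsd.log
# text read from stdin. `crs_start_errors` is a hypothetical helper name;
# the message format is assumed from the excerpt in this post.
crs_start_errors() {
    sed -n 's/.*StartResource error for \(.*\) error code = .*/\1/p' |
        LC_ALL=C sort -u
}

# Example using three lines copied from the log excerpt above.
crs_start_errors <<'EOF'
2014-09-22 14:11:36.367: [  CRSAPP][1499146560]0StartResource error for ora.node101.ASM1.asm error code = 1
2014-09-22 14:11:39.567: [  CRSAPP][1501247808]0StartResource error for ora.node101.LISTENER_NODE101.lsnr error code = 1
2014-09-22 14:11:44.344: [  CRSAPP][1543223616]0StartResource error for ora.node101.LISTENER_NODE101.lsnr error code = 1
EOF
# prints:
#   ora.node101.ASM1.asm
#   ora.node101.LISTENER_NODE101.lsnr
```

Here both failing resources are on node101, which matches the UNKNOWN and OFFLINE rows in the status output.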

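Following the hint in the log, I tried starting the ASM instance by hand. It could be started, but only by supplying a password; OS-authenticated login was rejected: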

[oracle@node101 admin]$ export ORACLE_SID=+ASM1

[oracle@node101 admin]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Sep 22 15:30:51 2014

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

ERROR:

ORA-01031: insufficient privileges

Enter user-name: 

Starting the listener failed as well, this time with a valid node checking error:

[oracle@node101 admin]$ lsnrctl start

LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 22-SEP-2014 15:32:01

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Starting /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 10.2.0.5.0 - Production

System parameter file is /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora

Log messages written to /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log

Error listening on: (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))

TNS-12560: TNS:protocol adapter error

TNS-00584: Valid node checking configuration error

Listener failed to start. See the error message(s) above...

I then stopped the resource with crs_stop -f <res_name> and tried to start it again, but ASM reported an error:

[oracle@node101 admin]$ srvctl start asm -n node101
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:SQL*Plus: Release 10.2.0.5.0 - Production on Mon Sep 22 15:36:11 2014

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:Enter user-name: ERROR:

node101:ora.node101.ASM1.asm:ORA-01031: insufficient privileges

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:Enter user-name: SP2-0306: Invalid option.

node101:ora.node101.ASM1.asm:Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER}]

node101:ora.node101.ASM1.asm:where <logon>  ::= <username>[/<password>][@<connect_identifier>] | /

node101:ora.node101.ASM1.asm:Enter user-name: Enter password:

node101:ora.node101.ASM1.asm:ERROR:

node101:ora.node101.ASM1.asm:ORA-01005: null password given; logon denied

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

node101:ora.node101.ASM1.asm:

CRS-0215: Could not start resource 'ora.node101.ASM1.asm'.]]

  [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:SQL*Plus: Release 10.2.0.5.0 - Production on Mon Sep 22 15:36:11 2014

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:Enter user-name: ERROR:

node101:ora.node101.ASM1.asm:ORA-01031: insufficient privileges

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:Enter user-name: SP2-0306: Invalid option.

node101:ora.node101.ASM1.asm:Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER}]

node101:ora.node101.ASM1.asm:where <logon>  ::= <username>[/<password>][@<connect_identifier>] | /

node101:ora.node101.ASM1.asm:Enter user-name: Enter password:

node101:ora.node101.ASM1.asm:ERROR:

node101:ora.node101.ASM1.asm:ORA-01005: null password given; logon denied

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:

node101:ora.node101.ASM1.asm:SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

node101:ora.node101.ASM1.asm:

CRS-0215: Could not start resource 'ora.node101.ASM1.asm'.]]

Starting it once more:

[oracle@node101 admin]$ srvctl start asm -n node101

PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [CRS-1028: Dependency analysis failed because of:

CRS-0223: Resource 'ora.node101.ASM1.asm' has placement error.]]

  [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "node101", [CRS-1028: Dependency analysis failed because of:

CRS-0223: Resource 'ora.node101.ASM1.asm' has placement error.]]

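2.3. Locating the Problem

The output above confirms that CRS could not bring up ASM because logging in to ASM required a password. That made me think of the parameters in sqlnet.ora. The file contained:

TCP.VALIDNODE_CHECKING=yes (only explicitly permitted nodes may connect; this is what made the listener fail to start with TNS-00584)
SQLNET.AUTHENTICATION_SERVICES=none (disables OS authentication, so the passwordless "/ as sysdba" login to ASM fails with ORA-01031)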

SQLNET.EXPIRE_TIME=1
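
To check other nodes for the same two settings, something like the following sketch could be used. `check_sqlnet` is a hypothetical helper name, the pattern only covers the single-line `name=value` form shown above, and the sample file is a temporary copy of those three lines rather than a real sqlnet.ora.

```shell
#!/bin/sh
# check_sqlnet FILE: print the lines of a sqlnet.ora-style file that enable
# valid node checking or set the authentication services to NONE.
# `check_sqlnet` is a hypothetical helper name; only single-line
# name=value entries are recognised.
check_sqlnet() {
    grep -i -E '^[[:space:]]*(TCP\.VALIDNODE_CHECKING[[:space:]]*=[[:space:]]*yes|SQLNET\.AUTHENTICATION_SERVICES[[:space:]]*=[[:space:]]*\(?[[:space:]]*none)' "$1"
}

# Example against a temporary file holding the three parameters from this post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
TCP.VALIDNODE_CHECKING=yes
SQLNET.AUTHENTICATION_SERVICES=none
SQLNET.EXPIRE_TIME=1
EOF
check_sqlnet "$sample"
# prints:
#   TCP.VALIDNODE_CHECKING=yes
#   SQLNET.AUTHENTICATION_SERVICES=none
rm -f "$sample"
```

The function exits with a non-zero status when neither setting is present, so it can also be used in a shell conditional.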

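3. Resolution

The analysis above confirms that an unsuitable sqlnet.ora configuration was preventing the cluster services from starting, so I resolved the problem with the following steps: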

mv sqlnet.ora sqlnet.ora.bak
crs_stop -f <asm_service>
srvctl start asm -n <node_name>
srvctl start listener -n <node_name>
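
Moving the whole file aside also drops SQLNET.EXPIRE_TIME. As an alternative sketch (not the fix used in this post, and untested against a real cluster), the two offending parameters could be commented out while the rest of the file is kept. `disable_sqlnet_settings` is a hypothetical helper name, and it assumes single-line `name=value` entries.

```shell
#!/bin/sh
# disable_sqlnet_settings FILE: write a copy of FILE to stdout in which
# TCP.VALIDNODE_CHECKING and SQLNET.AUTHENTICATION_SERVICES are commented out.
# Hypothetical helper; multi-line parameter values are not handled.
disable_sqlnet_settings() {
    awk '{
        key = toupper($0)
        if (key ~ /^[ \t]*(TCP\.VALIDNODE_CHECKING|SQLNET\.AUTHENTICATION_SERVICES)[ \t]*=/)
            print "# " $0
        else
            print $0
    }' "$1"
}

# Example against a temporary file holding the three parameters from this post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
TCP.VALIDNODE_CHECKING=yes
SQLNET.AUTHENTICATION_SERVICES=none
SQLNET.EXPIRE_TIME=1
EOF
disable_sqlnet_settings "$sample"
# prints:
#   # TCP.VALIDNODE_CHECKING=yes
#   # SQLNET.AUTHENTICATION_SERVICES=none
#   SQLNET.EXPIRE_TIME=1
rm -f "$sample"
```

The output would be reviewed and then written over sqlnet.ora (after taking a backup) before running the crs_stop and srvctl steps above.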