CRS can only start one ASM instance
2012-12-04 22:43
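Today a friend phoned to say that one of his customers kept running into problems while installing RAC. The cluster software was installed, but the database could not be installed. The network there was too poor for remote access, so I had to gather information and diagnose over QQ.
First, I had them run crs_stat -t to see the state of each resource: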
[oracle@rac01 bin]$ ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac01
ora....01.lsnr application ONLINE ONLINE rac01
ora.rac01.gsd application ONLINE ONLINE rac01
ora.rac01.ons application ONLINE ONLINE rac01
ora.rac01.vip application ONLINE ONLINE rac01
ora....SM2.asm application ONLINE OFFLINE
ora....02.lsnr application ONLINE ONLINE rac02
ora.rac02.gsd application ONLINE ONLINE rac02
ora.rac02.ons application ONLINE ONLINE rac02
ora.rac02.vip application ONLINE ONLINE rac02
ASM on rac02 is not up, and ps -ef shows no ASM process there either:
[root@rac02 etc]# ps -ef|grep asm
oracle 21691 15798 0 11:11 pts/1 00:00:00 more /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_pmon_12899.trc
root 21892 19317 0 11:11 pts/3 00:00:00 grep asm
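When the resource list is long, the mismatch between Target and State can be picked out mechanically. The following is a minimal Python sketch, assuming the whitespace-separated layout of crs_stat -t shown above; the function name find_mismatches and the shortened sample text are illustrative and not part of any Oracle tool.

```python
# Sketch: flag resources whose State differs from Target in `crs_stat -t` output.
# find_mismatches and SAMPLE are hypothetical names used only for illustration.

SAMPLE = """\
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac01
ora....01.lsnr application ONLINE ONLINE rac01
ora....SM2.asm application ONLINE OFFLINE
ora.rac02.vip application ONLINE ONLINE rac02
"""


def find_mismatches(text):
    """Return (name, target, state) for every resource where state != target."""
    mismatches = []
    for line in text.splitlines():
        fields = line.split()
        # Resource rows start with "ora." and carry at least name, type, target, state.
        if len(fields) < 4 or not fields[0].startswith("ora."):
            continue
        name, _type, target, state = fields[:4]
        if target != state:
            mismatches.append((name, target, state))
    return mismatches


if __name__ == "__main__":
    for name, target, state in find_mismatches(SAMPLE):
        print(f"{name}: target={target} state={state}")
```

In practice the text would come from the real command output rather than a hard-coded sample.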
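Quite often, especially in virtual machines, crs_start runs into minor problems that a restart usually clears, so I tried restarting CRS with crs_stop -all followed by crs_start -all.
The start reported errors: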
[oracle@rac01 bin]$ ./crs_start -all
Attempting to start `ora.rac01.vip` on member `rac01`
Attempting to start `ora.rac02.vip` on member `rac02`
Attempting to start `ora.rac02.ASM2.asm` on member `rac02`
Attempting to start `ora.rac01.ASM1.asm` on member `rac01`
Start of `ora.rac01.vip` on member `rac01` succeeded.
Start of `ora.rac02.vip` on member `rac02` succeeded.
Attempting to start `ora.rac01.LISTENER_RAC01.lsnr` on member `rac01`
Attempting to start `ora.rac02.LISTENER_RAC02.lsnr` on member `rac02`
Start of `ora.rac01.LISTENER_RAC01.lsnr` on member `rac01` succeeded.
Start of `ora.rac02.LISTENER_RAC02.lsnr` on member `rac02` succeeded.
Start of `ora.rac01.ASM1.asm` on member `rac01` failed.
rac02 : CRS-1019: Resource ora.rac01.ASM1.asm (application) cannot run on rac02
Start of `ora.rac02.ASM2.asm` on member `rac02` succeeded.
Attempting to start `ora.rac01.gsd` on member `rac01`
Attempting to start `ora.rac01.ons` on member `rac01`
CRS-1002: Resource 'ora.rac02.ons' is already running on member 'rac02'
Attempting to start `ora.rac02.gsd` on member `rac02`
Start of `ora.rac01.gsd` on member `rac01` succeeded.
Start of `ora.rac02.gsd` on member `rac02` succeeded.
Start of `ora.rac01.ons` on member `rac01` succeeded.
CRS-0215: Could not start resource 'ora.rac01.ASM1.asm'.
CRS-0223: Resource 'ora.rac02.ons' has placement error.
The key line in the errors above is: rac02 : CRS-1019: Resource ora.rac01.ASM1.asm (application) cannot run on rac02. Checking crs_stat -t again shows:
[oracle@rac01 bin]$ ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE OFFLINE
ora....01.lsnr application ONLINE ONLINE rac01
ora.rac01.gsd application ONLINE ONLINE rac01
ora.rac01.ons application ONLINE ONLINE rac01
ora.rac01.vip application ONLINE ONLINE rac01
ora....SM2.asm application ONLINE ONLINE rac02
ora....02.lsnr application ONLINE ONLINE rac02
ora.rac02.gsd application ONLINE ONLINE rac02
ora.rac02.ons application ONLINE ONLINE rac02
ora.rac02.vip application ONLINE ONLINE rac02
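The failures in a long crs_start transcript are easy to miss among the success messages. Here is a small Python sketch that pulls out the failed starts and the distinct CRS- codes; the regular expressions are assumptions based only on the message formats visible in the output above, and summarize is a hypothetical helper name.

```python
import re

# A shortened copy of the crs_start output shown earlier, used as sample input.
SAMPLE = """\
Start of `ora.rac01.ASM1.asm` on member `rac01` failed.
rac02 : CRS-1019: Resource ora.rac01.ASM1.asm (application) cannot run on rac02
Start of `ora.rac02.ASM2.asm` on member `rac02` succeeded.
CRS-1002: Resource 'ora.rac02.ons' is already running on member 'rac02'
CRS-0215: Could not start resource 'ora.rac01.ASM1.asm'.
CRS-0223: Resource 'ora.rac02.ons' has placement error.
"""

# Patterns inferred from the transcript, not from an official format description.
START_RE = re.compile(r"Start of `([^`]+)` on member `([^`]+)` (succeeded|failed)\.")
CODE_RE = re.compile(r"\bCRS-\d{4}\b")


def summarize(text):
    """Return (failed_starts, codes): failed (resource, member) pairs and
    the distinct CRS-NNNN codes in order of first appearance."""
    failed = []
    codes = []
    for line in text.splitlines():
        match = START_RE.search(line)
        if match and match.group(3) == "failed":
            failed.append((match.group(1), match.group(2)))
        for code in CODE_RE.findall(line):
            if code not in codes:
                codes.append(code)
    return failed, codes


if __name__ == "__main__":
    failed, codes = summarize(SAMPLE)
    print("failed starts:", failed)
    print("CRS codes:", codes)
```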
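It looks as if the ASM instance can only be started on one node at a time: this time ASM2 on rac02 came up and ASM1 on rac01 did not. Time to look at the ASM log.
Looking for bdump under the ASM admin directory: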
[oracle@rac01 admin]$ cd +ASM
[oracle@rac01 +ASM]$ ls
hdump pfile
[oracle@rac01 +ASM]$ ll
total 8
drwxr-x--- 2 oracle oinstall 4096 Oct 9 10:30 hdump
drwxr-x--- 2 oracle oinstall 4096 Oct 9 10:30 pfile
[oracle@rac01 +ASM]$
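There is no bdump log for ASM?! That suggests CRS never even got as far as bringing up the ASM instance. (In 11g the alert log and trace files are written under the diag directory seen in the ps output above, so an empty admin directory is only a hint.) So I went one level up and looked at the CRS log: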
Oracle Database 11g CRS Release 11.1.0.6.0 - Production Copyright 1996, 2007 Oracle. All rights reserved.
2010-10-09 10:03:29.059: [ default][4277080288] CRS Daemon Starting
2010-10-09 10:03:29.060: [ CRSMAIN][4277080288] Checking the OCR device
2010-10-09 10:03:29.079: [ CRSMAIN][4277080288] Connecting to the CSS Daemon
2010-10-09 10:03:29.080: [ COMMCRS][1107462464]clsc_connect: (0x1c7d86e0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac01_))
2010-10-09 10:03:29.081: [ CSSCLNT][4277080288]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac01_)), rc 9
2010-10-09 10:03:29.081: [ CRSRTI][4277080288] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2010-10-09 10:03:30.082: [ COMMCRS][1107462464]clsc_connect: (0x1c7d86c0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac01_))
2010-10-09 10:03:30.083: [ CSSCLNT][4277080288]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac01_)), rc 9
2010-10-09 10:03:30.083: [ CRSRTI][4277080288] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2010-10-09 10:03:31.084: [ COMMCRS][1107462464]clsc_connect: (0x1c7d86c0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac01_))
2010-10-09 10:03:31.084: [ CSSCLNT][4277080288]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac01_)), rc 9
2010-10-09 10:03:31.084: [ CRSRTI][4277080288] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2010-10-09 10:06:52.786: [ CRSMAIN][4277080288] CRSD running as the Privileged user
2010-10-09 10:06:52.823: [ CLSVER][4277080288] Active Version from OCR:10.1.0.2.0
2010-10-09 10:06:52.823: [ CLSVER][4277080288] Active Version is less than Software Version
2010-10-09 10:06:52.823: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:06:53.824: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:06:54.825: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:06:55.826: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:06:56.827: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:06:57.828: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:06:58.829: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:06:59.831: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:00.831: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:01.832: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:02.833: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:03.834: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:04.835: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:05.836: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:06.837: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:07.838: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:08.839: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:09.840: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:10.841: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:11.842: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:12.843: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:13.844: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:14.845: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:15.846: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:16.847: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:17.848: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:18.849: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:19.850: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:20.851: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:21.852: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:22.853: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:23.854: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:24.855: [ CSSCLNT][4277080288]clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying
2010-10-09 10:07:40.874: [ CLSVER][4277080288] Registered in CSS group crs_version
2010-10-09 10:07:40.874: [ CRSMAIN][4277080288] Initializing OCR
2010-10-09 10:07:40.875: [ CLSVER][1117952320] Monitoring the crs_version group for *** change notification
2010-10-09 10:07:40.875: [ CLSVER][1117952320] Doing grpstat on crs_version group
2010-10-09 10:07:40.875: [ CLSVER][1117952320] Returned from grpstat with event 1
2010-10-09 10:07:40.875: [ CLSVER][1117952320] Doing grpstat on crs_version group
2010-10-09 10:07:40.911: [ OCRRAW][4277080288]proprioo: for disk 0 (/dev/sdd1), id match (1), my id set (1084942139,1028247821) total id sets (1), 1st set (1084942139,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2010-10-09 10:07:41.074: [ CRSD][4277080288] ENV Logging level for Module: allcomp 0
2010-10-09 10:07:41.084: [ CRSD][4277080288] ENV Logging level for Module: default 0
2010-10-09 10:07:41.093: [ CRSD][4277080288] ENV Logging level for Module: OCRRAW 0
2010-10-09 10:07:41.102: [ CRSD][4277080288] ENV Logging level for Module: OCROSD 0
2010-10-09 10:07:41.111: [ CRSD][4277080288] ENV Logging level for Module: OCRCAC 0
2010-10-09 10:07:41.121: [ CRSD][4277080288] ENV Logging level for Module: COMMCRS 0
2010-10-09 10:07:41.130: [ CRSD][4277080288] ENV Logging level for Module: COMMNS 0
Judging from the log, this looks like a CSS problem. CSS stands for Cluster Synchronization Services, which the documentation describes as follows: "Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using third-party clusterware, then the css process interfaces with your clusterware to manage node membership information." In short, it is responsible for node membership control and communication between the nodes.
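Dozens of identical retry lines hide how long CRSD actually waited for CSS. The sketch below collapses repeated messages and records the first and last timestamp of each; it assumes the line layout seen in the log excerpt above (timestamp, module in brackets, thread id in brackets, message), and both LINE_RE and summarize are illustrative names.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt: "<timestamp>: [ MODULE][thread]message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): \[\s*(\w+)\]\[\d+\]\s*(.*)$"
)

RETRY = "clssgsGroupJoin: CSS has not reached fatal mode.Registration is not yet safe. Retrying"

SAMPLE = f"""\
2010-10-09 10:03:29.079: [ CRSMAIN][4277080288] Connecting to the CSS Daemon
2010-10-09 10:03:29.081: [ CRSRTI][4277080288] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2010-10-09 10:06:52.823: [ CSSCLNT][4277080288]{RETRY}
2010-10-09 10:07:24.855: [ CSSCLNT][4277080288]{RETRY}
2010-10-09 10:07:40.874: [ CLSVER][4277080288] Registered in CSS group crs_version
"""


def summarize(text):
    """Group identical messages: {(module, message): (count, first_ts, last_ts)}."""
    groups = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip banners and anything that is not a log record
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        key = (match.group(2), match.group(3))
        if key in groups:
            count, first, _last = groups[key]
            groups[key] = (count + 1, first, ts)
        else:
            groups[key] = (1, ts, ts)
    return groups


if __name__ == "__main__":
    for (module, message), (count, first, last) in summarize(SAMPLE).items():
        span = (last - first).total_seconds()
        print(f"{count:3d}x over {span:7.3f}s [{module}] {message[:60]}")
```

Run against the full log, a summary like this makes the gap between "Connecting to the CSS Daemon" and "Registered in CSS group crs_version" visible at a glance.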
I tried pinging each node: ping rac01, ping rac02, ping rac01-priv and ping rac02-priv all worked. I then verified SSH user equivalence: ssh rac01 date, ssh rac02 date, ssh rac01-priv date and ssh rac02-priv date all worked as well.
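These reachability and SSH checks can be scripted so they are easy to repeat on each node. This is a minimal sketch, assuming a Linux ping (the -c option) and an OpenSSH client (the BatchMode option); the host list mirrors the names used in this case, and build_checks and run_checks are hypothetical helper names.

```python
import subprocess

# Host names from this case; adjust for another cluster.
HOSTS = ["rac01", "rac02", "rac01-priv", "rac02-priv"]


def build_checks(hosts):
    """Build one ping and one ssh command per host."""
    checks = []
    for host in hosts:
        checks.append(["ping", "-c", "1", host])
        # BatchMode makes ssh fail instead of prompting for a password,
        # which is what matters when testing user equivalence.
        checks.append(["ssh", "-o", "BatchMode=yes", host, "date"])
    return checks


def run_checks(checks, timeout=15):
    """Run each command and return (command_string, succeeded) pairs."""
    results = []
    for cmd in checks:
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            ok = proc.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            ok = False  # command missing, not executable, or hung
        results.append((" ".join(cmd), ok))
    return results


if __name__ == "__main__":
    for cmd, ok in run_checks(build_checks(HOSTS)):
        print("OK  " if ok else "FAIL", cmd)
```

Passing these checks only shows basic reachability and SSH equivalence; it says nothing about which interfaces the clusterware has registered for the interconnect.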
I then tried starting ASM on rac01 again with srvctl, and this produced a very important error message:
[oracle@rac01 dbs]$ srvctl start asm -n rac01
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac01", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac01", [rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:SQL*Plus: Release 11.1.0.6.0 - Production on Sat Oct 9 11:58:36 2010
rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:Copyright (c) 1982, 2007, Oracle. All rights reserved.
rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:Enter user-name: Connected to an idle instance.
rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:SQL> ORA-03113: end-of-file on communication channel
rac01:ora.rac01.ASM1.asm:SQL> Disconnected
rac01:ora.rac01.ASM1.asm:
CRS-0215: Could not start resource 'ora.rac01.ASM1.asm'.]]
[PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac01", [rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:SQL*Plus: Release 11.1.0.6.0 - Production on Sat Oct 9 11:58:36 2010
rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:Copyright (c) 1982, 2007, Oracle. All rights reserved.
rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:Enter user-name: Connected to an idle instance.
rac01:ora.rac01.ASM1.asm:
rac01:ora.rac01.ASM1.asm:SQL> ORA-03113: end-of-file on communication channel
rac01:ora.rac01.ASM1.asm:SQL> Disconnected
rac01:ora.rac01.ASM1.asm:
CRS-0215: Could not start resource 'ora.rac01.ASM1.asm'.]]
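PRKS-1009 and CRS-0215 only report that the start failed, but together with the ORA-03113 raised during ASM startup, and with basic connectivity already verified, they pointed me fairly firmly at the network interface configuration. Checking with oifcfg: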
[oracle@rac01 bin]$ ./oifcfg getif
eth0 10.0.253.0 global public
eth1 192.168.253.0 global cluster_interconnect
eth2 192.168.130.0 global cluster_interconnect
eth3 192.168.131.0 global cluster_interconnect
[oracle@rac01 bin]$
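I asked, and the 130 and 131 subnets connect to the storage; they have nothing to do with private communication between the RAC nodes. The rac0x-priv names are on the 253 subnet, so eth2 and eth3 should not be registered as cluster_interconnect, as the hosts file shows: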
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
#Public IP
10.0.253.151 rac01
10.0.253.152 rac02
#Private IP
192.168.253.151 rac01-priv
192.168.253.152 rac02-priv
#Virtual IP
10.0.253.156 rac01-vip
10.0.253.157 rac02-vip
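The comparison between the oifcfg getif output and the hosts file can be written down as a small check. The sketch below is only illustrative: it assumes a /24 netmask for every subnet (oifcfg getif does not print the mask) and assumes private names end in -priv as they do here; private_addresses and unexpected_interconnects are hypothetical helper names.

```python
import ipaddress

# Output of `oifcfg getif` from this case.
OIFCFG = """\
eth0 10.0.253.0 global public
eth1 192.168.253.0 global cluster_interconnect
eth2 192.168.130.0 global cluster_interconnect
eth3 192.168.131.0 global cluster_interconnect
"""

# Relevant part of the hosts file from this case.
HOSTS = """\
127.0.0.1 localhost.localdomain localhost
#Private IP
192.168.253.151 rac01-priv
192.168.253.152 rac02-priv
"""


def private_addresses(hosts_text, suffix="-priv"):
    """Collect addresses whose host names end with the private-name suffix."""
    addrs = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if any(name.endswith(suffix) for name in fields[1:]):
            addrs.append(ipaddress.ip_address(fields[0]))
    return addrs


def unexpected_interconnects(oifcfg_text, priv_addrs, prefix_len=24):
    """Return (interface, subnet) pairs registered as cluster_interconnect
    whose subnet contains none of the private addresses.
    prefix_len=24 is an assumption, not something oifcfg reports."""
    unexpected = []
    for line in oifcfg_text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[3] != "cluster_interconnect":
            continue
        iface, subnet = fields[0], fields[1]
        network = ipaddress.ip_network(f"{subnet}/{prefix_len}")
        if not any(addr in network for addr in priv_addrs):
            unexpected.append((iface, subnet))
    return unexpected


if __name__ == "__main__":
    for iface, subnet in unexpected_interconnects(OIFCFG, private_addresses(HOSTS)):
        print(f"{iface}/{subnet} is registered as cluster_interconnect "
              "but no private host name resolves into it")
```

For the data in this case the check flags eth2 and eth3, which matches what the customer said about the storage networks.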
Remove them with oifcfg delif:
[oracle@rac01 bin]$ ./oifcfg delif -global eth2/192.168.130.0
[oracle@rac01 bin]$ ./oifcfg delif -global eth3/192.168.131.0
[oracle@rac01 bin]$ ./oifcfg getif
eth0 10.0.253.0 global public
eth1 192.168.253.0 global cluster_interconnect
Restart CRS once more:
-- In one window, run crs_stop:
[oracle@rac01 bin]$ ./crs_stop -all
-- In another window, watch the status:
[oracle@rac01 bin]$ ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac01
ora....01.lsnr application OFFLINE OFFLINE
ora.rac01.gsd application OFFLINE OFFLINE
ora.rac01.ons application ONLINE ONLINE rac01
ora.rac01.vip application OFFLINE OFFLINE
ora....SM2.asm application OFFLINE OFFLINE
ora....02.lsnr application OFFLINE OFFLINE
ora.rac02.gsd application OFFLINE OFFLINE
ora.rac02.ons application OFFLINE OFFLINE
ora.rac02.vip application OFFLINE OFFLINE
Two resources were still running, the ASM instance and a node application (ONS) on rac01, so I stopped them with srvctl:
srvctl stop asm -n rac01
srvctl stop nodeapps -n rac01
[oracle@rac01 bin]$ ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application OFFLINE OFFLINE
ora....01.lsnr application OFFLINE OFFLINE
ora.rac01.gsd application OFFLINE OFFLINE
ora.rac01.ons application OFFLINE OFFLINE
ora.rac01.vip application OFFLINE OFFLINE
ora....SM2.asm application OFFLINE OFFLINE
ora....02.lsnr application OFFLINE OFFLINE
ora.rac02.gsd application OFFLINE OFFLINE
ora.rac02.ons application OFFLINE OFFLINE
ora.rac02.vip application OFFLINE OFFLINE
Start everything again:
[oracle@rac01 bin]$ ./crs_start -all
Attempting to start `ora.rac01.vip` on member `rac01`
Attempting to start `ora.rac02.vip` on member `rac02`
Attempting to start `ora.rac02.ASM2.asm` on member `rac02`
Attempting to start `ora.rac01.ASM1.asm` on member `rac01`
Start of `ora.rac01.vip` on member `rac01` succeeded.
Start of `ora.rac02.vip` on member `rac02` succeeded.
Attempting to start `ora.rac01.LISTENER_RAC01.lsnr` on member `rac01`
Attempting to start `ora.rac02.LISTENER_RAC02.lsnr` on member `rac02`
Start of `ora.rac02.ASM2.asm` on member `rac02` succeeded.
Start of `ora.rac01.ASM1.asm` on member `rac01` succeeded.
Start of `ora.rac01.LISTENER_RAC01.lsnr` on member `rac01` succeeded.
Start of `ora.rac02.LISTENER_RAC02.lsnr` on member `rac02` succeeded.
CRS-1002: Resource 'ora.rac01.ons' is already running on member 'rac01'
CRS-1002: Resource 'ora.rac02.ons' is already running on member 'rac02'
Attempting to start `ora.rac01.gsd` on member `rac01`
Attempting to start `ora.rac02.gsd` on member `rac02`
Start of `ora.rac01.gsd` on member `rac01` succeeded.
Start of `ora.rac02.gsd` on member `rac02` succeeded.
CRS-0223: Resource 'ora.rac01.ons' has placement error.
CRS-0223: Resource 'ora.rac02.ons' has placement error.
[oracle@rac01 bin]$ ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE rac01
ora....01.lsnr application ONLINE ONLINE rac01
ora.rac01.gsd application ONLINE ONLINE rac01
ora.rac01.ons application ONLINE ONLINE rac01
ora.rac01.vip application ONLINE ONLINE rac01
ora....SM2.asm application ONLINE ONLINE rac02
ora....02.lsnr application ONLINE ONLINE rac02
ora.rac02.gsd application ONLINE ONLINE rac02
ora.rac02.ons application ONLINE ONLINE rac02
ora.rac02.vip application ONLINE ONLINE rac02
Done. The RAC database installation can now continue!
Original article: http://www.oracleblog.org/working-case/crs-can-only-start-one-asm-instance/