11g RAC: CRS fails to start

2015-06-01 14:52
[root@racnode2 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
CRS has failed again: Cluster Ready Services cannot be reached on racnode2.

[root@racnode2 ~]# crsctl check cssd
CRS-272: This command remains for backward compatibility only
Cluster Synchronization Services is online

[root@racnode2 ~]# crsctl check crsd
CRS-272: This command remains for backward compatibility only
Cannot communicate with Cluster Ready Services

[root@racnode2 ~]# crsctl check evmd
CRS-272: This command remains for backward compatibility only
Cannot communicate with Event Manager
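
The checks above can be condensed into one filter. This is only a sketch: unreachable_components is a hypothetical helper name, not an Oracle tool, and it assumes the "Cannot communicate with ..." wording shown in the output above.

```shell
# Reads crsctl check output on stdin and prints the name of every component
# that is reported as unreachable. Handles lines with or without the
# "CRS-nnnn: " prefix, since both forms appear in the transcripts above.
unreachable_components() {
    sed -n 's/^\(CRS-[0-9]*: \)\{0,1\}Cannot communicate with //p'
}

# Possible usage on a cluster node:
#   crsctl check crs | unreachable_components
```

An empty result would mean no component was reported as unreachable in the text that was fed in; it says nothing about components that were not checked.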

Trying to start CRS only reports that the stack is already active:
[root@racnode2 ~]# crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

[oracle@racnode1 ~]$ olsnodes -n
racnode1 1
racnode2 2
[oracle@racnode1 ~]$ olsnodes -i
PRCO-19: Failure retrieving list of nodes in the cluster
PRCO-4: OCR initialization failed
PROC-32: Cluster Ready Services on the local node is not running Messaging error

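Is the OCR corrupted?
Re-importing a backup does not work (the -local option targets the local registry, OLR, and the import is refused while OHASD is running):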
[root@racnode1 ~]# ocrconfig -local -import /u01/crs_home/product/cdata/racnode1/backup_20150413_193124.olr
PROTL-19: Cannot proceed while the Oracle High Availability Service is running

[root@racnode1 ~]# ps -ef |grep d.bin
root 2489 1 0 09:37 ? 00:00:01 /u01/crs_home/product/bin/ohasd.bin reboot
grid 2636 1 0 09:38 ? 00:00:00 /u01/crs_home/product/bin/gipcd.bin
grid 2643 1 0 09:38 ? 00:00:00 /u01/crs_home/product/bin/mdnsd.bin
grid 2655 1 0 09:38 ? 00:00:00 /u01/crs_home/product/bin/gpnpd.bin
grid 2717 1 0 09:38 ? 00:00:01 /u01/crs_home/product/bin/ocssd.bin
root 2833 1 0 09:41 ? 00:00:00 /u01/crs_home/product/bin/octssd.bin reboot
grid 2848 1 0 09:41 ? 00:00:00 /u01/crs_home/product/bin/evmd.bin
grid 3072 1 0 09:42 ? 00:00:00 /u01/crs_home/product/bin/oclskd.bin
root 3608 2788 0 09:49 pts/1 00:00:00 grep d.bin

[root@racnode1 ~]# crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME                 TARGET  STATE    SERVER    STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm           1  ONLINE  ONLINE   racnode1  Started
ora.crsd          1  ONLINE  OFFLINE
ora.cssd          1  ONLINE  ONLINE   racnode1
ora.cssdmonitor   1  ONLINE  ONLINE   racnode1
ora.ctssd         1  ONLINE  ONLINE   racnode1  ACTIVE:0
ora.diskmon       1  ONLINE  ONLINE   racnode1
ora.drivers.acfs  1  ONLINE  ONLINE   racnode1
ora.evmd          1  ONLINE  ONLINE   racnode1
ora.gipcd         1  ONLINE  ONLINE   racnode1
ora.gpnpd         1  ONLINE  ONLINE   racnode1
ora.mdnsd         1  ONLINE  ONLINE   racnode1

Only ora.crsd has not started.
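
With a longer resource list, the one resource that should be up but is not can be picked out mechanically. A minimal sketch, assuming the one-line-per-resource layout shown in this post (name, instance number, TARGET, STATE); on some systems crsctl prints the resource name on its own line, in which case this filter would need adjusting. offline_init_resources is a hypothetical helper name.

```shell
# Reads `crsctl stat res -t -init` output on stdin and prints every resource
# whose TARGET is ONLINE while its STATE is OFFLINE.
# Field layout assumed: $1 = name, $2 = instance, $3 = TARGET, $4 = STATE.
offline_init_resources() {
    awk '$1 ~ /^ora\./ && $3 == "ONLINE" && $4 == "OFFLINE" { print $1 }'
}

# Possible usage:
#   crsctl stat res -t -init | offline_init_resources
```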

[grid@racnode1 crsd]$ tail -500 crsd.log

[ OCRAPI][3010044496]a_init_clsss: failed to call clsu_get_private_ip_addr (7)
2015-06-01 09:42:37.584: [ OCRAPI][3010044496]a_init:13!: Clusterware init unsuccessful : [44]
2015-06-01 09:42:37.584: [ CRSOCR][3010044496] OCR context init failure. Error: PROC-44: 缃.??板.?.?缁..?f.浣.腑?洪. 缃.??板.?.?缁..?f.浣..璇.[7]
2015-06-01 09:42:37.584: [ CRSD][3010044496][PANIC] CRSD exiting: Could not init OCR, code: 44
2015-06-01 09:42:37.584: [ CRSD][3010044496] Done.
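
When the message text is unreadable, as it is here, the numeric PROC code is still usable. A small sketch for pulling the distinct codes out of crsd.log; proc_errors is a hypothetical helper name, and the -a flag is there only so grep treats a log with mis-encoded bytes as text.

```shell
# Reads a crsd.log excerpt on stdin and prints each distinct PROC-nn code once.
proc_errors() {
    grep -a -o 'PROC-[0-9][0-9]*' | sort -u
}

# Possible usage:
#   tail -500 crsd.log | proc_errors
```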

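The log points to an OCR error, PROC-44 (the message text above is unreadable because of a character-encoding problem).
The voting disks, however, are fine: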
[root@racnode1 crsd]# crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 2c284f8178974f76bf16baabb5920203 (/dev/asm-diska) [ORC]
2. ONLINE 438ef4b51ebd4ffabfbf9153db6d90d1 (/dev/asm-diskb) [ORC]
3. ONLINE a9dfb2767c3e4fc7bfe1168f675262ad (/dev/asm-diskc) [ORC]

The ASM instance is up:
[grid@racnode1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Mon Jun 1 10:36:15 2015

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL>
GROUP_NUMBER HEADER_STATUS STATE PATH UDID
------------ ------------------------ ---------------- -------------------------------------------------- ----------
3 MEMBER NORMAL /dev/asm-diskc
3 MEMBER NORMAL /dev/asm-diska
1 MEMBER NORMAL /dev/asm-diske
2 MEMBER NORMAL /dev/asm-diskf
3 MEMBER NORMAL /dev/asm-diskb

The disk group holding the voting disks (ORC) is online.
Only the redo log disk group (REDO) is not mounted:
SQL> Select group_number,name,state,type,total_mb,free_mb From v$asm_diskgroup;

GROUP_NUMBER NAME STATE TYPE TOTAL_MB FREE_MB
------------ ------------------------------------------------------------ ---------------------- ------------ ---------- ----------
1 ARCHI MOUNTED EXTERN 5120 4688
2 DATA MOUNTED EXTERN 10240 8061
3 ORC MOUNTED NORMAL 3072 2146
0 REDO DISMOUNTED 0 0
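
For a quick scripted check of the same listing, the dismounted groups can be filtered out of the spooled query output. This is a sketch under the assumption that the output is whitespace-separated with the column order shown above (GROUP_NUMBER, NAME, STATE, ...); dismounted_diskgroups is a hypothetical helper name.

```shell
# Reads the v$asm_diskgroup listing on stdin and prints the NAME of every
# disk group whose STATE column is DISMOUNTED.
dismounted_diskgroups() {
    awk '$3 == "DISMOUNTED" { print $2 }'
}
```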

It mounts without problems:
SQL> ALTER DISKGROUP redo MOUNT;
Diskgroup altered.

SQL> Select group_number,name,state,type,total_mb,free_mb From v$asm_diskgroup;

GROUP_NUMBER NAME STATE TYPE TOTAL_MB FREE_MB
------------ ------------------------------------------------------------ ---------------------- ------------ ---------- ----------
1 ARCHI MOUNTED EXTERN 5120 4688
2 DATA MOUNTED EXTERN 10240 8061
3 ORC MOUNTED NORMAL 3072 2146
4 REDO MOUNTED EXTERN 3072 2977

Could it be that CRS cannot reach ASM? Check the listener, start it, and wait for dynamic registration:
[grid@racnode1 ~]$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 01-JUN-2015 10:48:28

Copyright (c) 1991, 2009, Oracle. All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Linux Error: 2: No such file or directory
[grid@racnode1 ~]$ lsnrctl start

LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 01-JUN-2015 10:48:39

Copyright (c) 1991, 2009, Oracle. All rights reserved.

Starting /u01/crs_home/product/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 11.2.0.1.0 - Production
System parameter file is /u01/crs_home/product/network/admin/listener.ora
Log messages written to /u01/crs_home/base/diag/tnslsnr/racnode1/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 11.2.0.1.0 - Production
Start Date 01-JUN-2015 10:48:39
Uptime 0 days 0 hr. 0 min. 1 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/crs_home/product/network/admin/listener.ora
Listener Log File /u01/crs_home/base/diag/tnslsnr/racnode1/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
The listener supports no services
The command completed successfully

SQL> alter system register;

System altered.

After that, lsnrctl status took a very long time to return.

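Dynamic registration is driven by the instance parameter file (for example the LOCAL_LISTENER setting).
Static registration is read from listener.ora.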

SQL> show parameter spfile

NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
spfile string +ORC/racnode-cluster/asmparame
terfile/registry.253.876943459

SQL> show parameter lis

NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
listener_networks string
local_listener string
remote_listener string

SQL> alter system set LOCAL_LISTENER='(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.4.100)(PORT = 1521))';

System altered.

This also took a long time.

[grid@racnode1 racnode1]$ vi alertracnode1.log
[grid@racnode1 racnode1]$ pwd
2015-06-01 11:34:13.433
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:15.469
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:17.498
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:19.546
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:21.618
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:23.689
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:25.739
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:27.784
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:29.824
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:31.851
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:33.892
[ohasd(2489)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode1'.
2015-06-01 11:34:33.892
[ohasd(2489)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.
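
The alert log shows ohasd retrying ora.crsd every two seconds until it gives up. Counting the CRS-2765 entries for one resource gives a quick measure of how often that happened. A minimal sketch; count_resource_failures is a hypothetical helper name, and it matches on the message code and the quoted resource name only, so it does not depend on the language of the message text.

```shell
# count_resource_failures RESOURCE
# Reads the clusterware alert log on stdin and prints how many CRS-2765
# ("resource has failed") entries mention the given resource name.
count_resource_failures() {
    grep -a 'CRS-2765' | grep -a -c -F "'$1'"
}

# Possible usage:
#   count_resource_failures ora.crsd < alertracnode1.log
```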

Look at crsd.log again:
2015-06-01 11:34:16.514: [ OCRAPI][2567017040]a_init:13!: Clusterware init unsuccessful : [44]
2015-06-01 11:34:16.515: [ CRSOCR][2567017040] OCR context init failure. Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2015-06-01 11:34:16.515: [ CRSD][2567017040][PANIC] CRSD exiting: Could not init OCR, code: 44
2015-06-01 11:34:16.515: [ CRSD][2567017040] Done.

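This time, viewed as the grid user, the actual error text is readable: PROC-44 is a network address / network interface error.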

[grid@racnode1 crsd]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6

#node1
192.168.4.100 racnode1.localdomain racnode1
192.168.4.110 racnode1-vip.localdomain racnode1-vip
192.168.5.100 racnode1-priv.localdomain racnode1-priv
#node2
192.168.4.101 racnode2.localdomain racnode2
192.168.4.111 racnode2-vip.localdomain racnode2-vip
192.168.5.101 racnode2-priv.localdomain racnode2-priv

192.168.4.121 scan-cluster.localdomain scan-cluster

Check whether each of these addresses answers ping. Result: the private interconnect address does not respond.
[grid@racnode1 cssd]$ oifcfg iflist
eth0 192.168.4.0

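The second NIC (eth1) is attached to the system but has not been activated.

Activate eth1 from the operating system's graphical network tool: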
[grid@racnode1 cssd]$ oifcfg iflist
eth0 192.168.4.0
eth1 192.168.5.0
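
The comparison that exposed the problem, private addresses in /etc/hosts against the networks listed by oifcfg iflist, can be sketched as a script. Assumptions that are not in the original post: private host names end in -priv, addresses are IPv4, and the netmask is /24. missing_priv_subnets is a hypothetical helper name.

```shell
# missing_priv_subnets HOSTS_FILE IFLIST_FILE
# Prints "name address network" for each -priv entry in HOSTS_FILE whose
# /24 network is absent from IFLIST_FILE (saved `oifcfg iflist` output,
# where the second column is the network address).
missing_priv_subnets() {
    awk '
        FILENAME == ARGV[1] { nets[$2] = 1; next }   # first file: iflist
        /^[ \t]*#/ { next }                          # skip hosts comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /-priv$/) {
                    split($1, o, ".")
                    net = o[1] "." o[2] "." o[3] ".0"   # assumes a /24 mask
                    if (!(net in nets)) print $i, $1, net
                    break
                }
            }
        }
    ' "$2" "$1"
}

# Possible usage:
#   oifcfg iflist > /tmp/iflist.txt
#   missing_priv_subnets /etc/hosts /tmp/iflist.txt
```

No output means every -priv network was found in the interface list. Before eth1 was activated, this sketch would have reported the 192.168.5.0 network as missing.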

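After waiting about 10 minutes, everything came back up.
Why the NIC was not activated after the shutdown and reboot still needs further investigation.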