root.sh Fails to Start HAIP as Default Gateway is Configured for Private Network VLAN (Doc ID 1366211

2016-05-31 22:34

Applies to:

Oracle Server - Enterprise Edition - Version 11.2.0.2 and later

Information in this document applies to any platform.

Symptoms

When installing 11.2.0.2 Grid Infrastructure on a two-node RAC cluster with a VLAN configured for the underlying network, root.sh fails with:

......

Start of resource "ora.cluster_interconnect.haip" failed

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'db1'

CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:

Start action for HAIP aborted

CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'db1' failed

CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'db1'

CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'db1' succeeded

CRS-4000: Command Start failed, or completed with errors.

Failed to start Oracle Clusterware stack

Failed to start High Availability IP at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1043.

/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

This happens on both nodes.

Changes

New installation

Cause

The problem occurs because IP address 10.1.15.254/24 is configured on the Cisco switch as the default gateway for the private network VLAN. This causes HAIP to retrieve MAC address e8:b7:48:e3:10:d4, which is associated with 10.1.15.254/24, instead of the real MAC address 00:10:3e:14:8e:19 of the private network adapter (10.1.15.30), and HAIP startup fails with a MAC address conflict.

orarootagent_root.log shows:

2011-09-20 01:34:29.591: [ USRTHRD][1099024704] {0:0:167} HAIP: initializing to 1 interfaces

2011-09-20 01:34:29.592: [ USRTHRD][1099024704] {0:0:167} HAIP: configured to use 1 interfaces

2011-09-20 01:34:29.595: [ USRTHRD][1099024704] {0:0:167} HAIP: Updating member info HAIP1;10.1.15.0#0

2011-09-20 01:34:29.595: [ USRTHRD][1099024704] {0:0:167} InitializeHaIps[ 0] infList 'inf eth1, ip 10.1.15.30, sub 10.1.15.0'

2011-09-20 01:34:29.596: [ USRTHRD][1099024704] {0:0:167} Error in getting Key SYSTEM.network.haip.group.cluster_interconnect.interface.valid in OCR

2011-09-20 01:34:29.598: [ CLSINET][1099024704] failed to open OLR HAIP subtype SYSTEM.network.haip.group.cluster_interconnect.interface.valid key, rc=4

2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} HAIP reset on new modified startup, ipSize 0 != numInf 1

2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} HAIP: starting inf 'eth1', suggestedIp '', assignedIp ''

2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} Thread:[NetHAWork]start {

2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} Thread:[NetHAWork]start }

2011-09-20 01:34:29.598: [ USRTHRD][1119660352] {0:0:167} [NetHAWork] thread started

2011-09-20 01:34:29.598: [ USRTHRD][1119660352] {0:0:167} Arp::sCreateSocket {

2011-09-20 01:34:29.627: [ USRTHRD][1119660352] {0:0:167} Arp::sCreateSocket }

2011-09-20 01:34:29.627: [ USRTHRD][1119660352] {0:0:167} Starting Probe for ip 169.254.12.247

2011-09-20 01:34:29.627: [ USRTHRD][1119660352] {0:0:167} Transitioning to Probe State

2011-09-20 01:34:30.115: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe {

2011-09-20 01:34:30.115: [ USRTHRD][1119660352] {0:0:167} Arp::sSend: sending type 1

2011-09-20 01:34:30.115: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe }

2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} PROBE: got conflicting source ip 169.254.12.247, addr e8:b7:48:e3:10:d4

2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} PROBE: conflict detected src { 169.254.12.247, e8:b7:48:e3:10:d4 }, target { 0.0.0.0, 00:10:3e:14:8e:19 }

2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} Starting Probe for ip 169.254.38.147

2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} Transitioning to Probe State

2011-09-20 01:34:30.760: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe {

2011-09-20 01:34:30.760: [ USRTHRD][1119660352] {0:0:167} Arp::sSend: sending type 1

2011-09-20 01:34:30.760: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe }

2011-09-20 01:34:30.762: [ USRTHRD][1119660352] {0:0:167} PROBE: got conflicting source ip 169.254.38.147, addr e8:b7:48:e3:10:d4

2011-09-20 01:34:30.762: [ USRTHRD][1119660352] {0:0:167} PROBE: conflict detected src { 169.254.38.147, e8:b7:48:e3:10:d4 }, target { 0.0.0.0, 00:10:3e:14:8e:19 }

...

<< The probe is repeated 10 times with a different HAIP IP address each time, and then the start is aborted:

2011-09-20 01:34:35.459: [ USRTHRD][1119660352] {0:0:167} Rate limiting attempts, numConflict 10

2011-09-20 01:35:29.501: [ AGFW][1113356608] {0:0:167} Created alert : (:CRSAGF00113:) : Aborting the command: start for resource: ora.cluster_interconnect.haip 1 1

2011-09-20 01:35:35.708: [ora.cluster_interconnect.haip][1115457856] {0:0:167} [start] Start of HAIP aborted

2011-09-20 01:35:35.709: [ AGENT][1115457856] {0:0:167} UserErrorException: Locale is

2011-09-20 01:35:35.709: [ora.cluster_interconnect.haip][1115457856] {0:0:167} [start] clsnUtils::error Exception type=2 string=

CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:

Start action for HAIP aborted
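
To confirm quickly that every failed probe points at the same foreign MAC address, the conflict lines in orarootagent_root.log can be summarized. The following is a minimal sketch, not an Oracle tool: the function name `summarize_conflicts` is hypothetical, and the regular expression is written only against the "PROBE: conflict detected" lines quoted in this note, so other versions may format these lines differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... PROBE: conflict detected src { 169.254.12.247, e8:b7:48:e3:10:d4 },
#       target { 0.0.0.0, 00:10:3e:14:8e:19 }
CONFLICT_RE = re.compile(
    r"PROBE: conflict detected src \{ (?P<src_ip>[\d.]+), (?P<src_mac>[0-9a-fA-F:]{17}) \}, "
    r"target \{ (?P<dst_ip>[\d.]+), (?P<dst_mac>[0-9a-fA-F:]{17}) \}"
)


def summarize_conflicts(log_text):
    """Count conflicts per (conflicting MAC, local adapter MAC) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CONFLICT_RE.search(line)
        if match:
            key = (match.group("src_mac").lower(), match.group("dst_mac").lower())
            counts[key] += 1
    return counts


# Two lines copied from the log excerpt in this note.
sample = """\
2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} PROBE: conflict detected src { 169.254.12.247, e8:b7:48:e3:10:d4 }, target { 0.0.0.0, 00:10:3e:14:8e:19 }
2011-09-20 01:34:30.762: [ USRTHRD][1119660352] {0:0:167} PROBE: conflict detected src { 169.254.38.147, e8:b7:48:e3:10:d4 }, target { 0.0.0.0, 00:10:3e:14:8e:19 }
"""

if __name__ == "__main__":
    for (src_mac, dst_mac), n in summarize_conflicts(sample).items():
        print(f"{n} conflict(s): remote {src_mac} vs local {dst_mac}")
```

If a single remote MAC address accounts for all conflicts regardless of the probed 169.254.x.x address, that address is the one to track down on the network.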

The network configuration shows that MAC address e8:b7:48:e3:10:d4 is not defined on any of the host's physical network interfaces:

$ /sbin/ifconfig -a

eth0 Link encap:Ethernet HWaddr 00:10:3E:58:3E:E7

     inet addr:10.2.14.30 Bcast:10.2.14.255 Mask:255.255.255.0

     inet6 addr: fe80::216:3eff:fe58:3ee7/64 Scope:Link

     UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

     RX packets:4273973 errors:0 dropped:0 overruns:0 frame:0

     TX packets:3176416 errors:0 dropped:0 overruns:0 carrier:0

     collisions:0 txqueuelen:1000

     RX bytes:4309493182 (4.0 GiB) TX bytes:2326925399 (2.1 GiB)

eth1 Link encap:Ethernet HWaddr 00:10:3E:14:8E:19

     inet addr:10.1.15.30 Bcast:10.1.15.255 Mask:255.255.255.0

     inet6 addr: fe80::216:3eff:fe14:8e19/64 Scope:Link

     UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

     RX packets:1441782 errors:0 dropped:0 overruns:0 frame:0

     TX packets:1156267 errors:0 dropped:0 overruns:0 carrier:0

     collisions:0 txqueuelen:1000

     RX bytes:935044730 (891.7 MiB) TX bytes:682093588 (650.4 MiB)
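
The same check can be scripted. The sketch below assumes the legacy net-tools `ifconfig` layout shown in the listing above (an interface name followed by `Link encap:... HWaddr ...`); the helper names `local_macs` and `is_local` are hypothetical, and newer `ifconfig` or `ip link` output would need a different pattern.

```python
import re

# Legacy net-tools layout, for example:
#   eth1 Link encap:Ethernet HWaddr 00:10:3E:14:8E:19
HWADDR_RE = re.compile(
    r"^(?P<ifname>\S+)\s+Link encap:\S+\s+HWaddr\s+(?P<mac>[0-9A-Fa-f:]{17})",
    re.MULTILINE,
)


def local_macs(ifconfig_output):
    """Map each interface name to its lower-cased hardware address."""
    return {
        m.group("ifname"): m.group("mac").lower()
        for m in HWADDR_RE.finditer(ifconfig_output)
    }


def is_local(mac, ifconfig_output):
    """Return True if `mac` belongs to one of the listed interfaces."""
    return mac.lower() in local_macs(ifconfig_output).values()


# Header lines taken from the ifconfig listing in this note.
sample = """\
eth0 Link encap:Ethernet HWaddr 00:10:3E:58:3E:E7
     inet addr:10.2.14.30 Bcast:10.2.14.255 Mask:255.255.255.0
eth1 Link encap:Ethernet HWaddr 00:10:3E:14:8E:19
     inet addr:10.1.15.30 Bcast:10.1.15.255 Mask:255.255.255.0
"""

if __name__ == "__main__":
    print(local_macs(sample))
    print(is_local("e8:b7:48:e3:10:d4", sample))
```

A conflicting MAC address that is not local to any cluster node has to come from another device on the private network segment.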

According to the network administrator, MAC address e8:b7:48:e3:10:d4 is associated with IP 10.1.15.254/24, which was created on the Cisco switch as the gateway IP for the private network VLAN.

#show int Vlan15

Vlan15 is up, line protocol is up

Hardware is EtherSVI, address is e8b7.48e3.10d4 (bia e8b7.48e3.10d4)

Description: Cluster

Internet address is 10.1.15.254/24
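
Note that the switch prints the address in Cisco's dotted notation (e8b7.48e3.10d4), while the HAIP log and ifconfig use colon-separated octets. A small sketch for comparing the two follows; `normalize_mac` is a hypothetical helper name, not part of any Oracle or Cisco tool.

```python
def normalize_mac(mac):
    """Convert a MAC in dotted, colon, or dash notation to lower-case aa:bb:cc:dd:ee:ff."""
    hex_digits = "".join(ch for ch in mac if ch not in ".:-").lower()
    if len(hex_digits) != 12 or any(ch not in "0123456789abcdef" for ch in hex_digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # Regroup the 12 hex digits into six two-digit octets.
    return ":".join(hex_digits[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    switch_mac = normalize_mac("e8b7.48e3.10d4")    # from "show int Vlan15"
    log_mac = normalize_mac("e8:b7:48:e3:10:d4")    # from orarootagent_root.log
    print(switch_mac == log_mac)
```

Normalizing both sides before comparing avoids missing a match because of notation or letter case.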

Solution

It is recommended to place the private network on dedicated switches. If a VLAN is used for the private network instead, no gateway is needed for that VLAN on the Cisco switch.

After removing the gateway IP 10.1.15.254/24 from the Cisco switch, deconfigure the failed Grid Infrastructure installation.

As the root user, on each node except the last:

# $GRID_HOME/crs/install/rootcrs.pl -deconfig -force

On the last node:

# $GRID_HOME/crs/install/rootcrs.pl -deconfig -force -lastnode

Then rerun root.sh as the root user:

# $GRID_HOME/root.sh
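
The ordering of the deconfig commands above can be sketched as a small helper that only builds the command strings and does not execute anything. The function name `deconfig_plan` is hypothetical, and the node name db2 in the usage example is an assumption (only db1 appears in the output quoted in this note).

```python
def deconfig_plan(nodes, grid_home="$GRID_HOME"):
    """Return (node, command) pairs; only the final node gets -lastnode."""
    if not nodes:
        raise ValueError("at least one node is required")
    base = f"{grid_home}/crs/install/rootcrs.pl -deconfig -force"
    plan = [(node, base) for node in nodes[:-1]]
    plan.append((nodes[-1], base + " -lastnode"))
    return plan


if __name__ == "__main__":
    # db2 is an assumed name for the second node of the two-node cluster.
    for node, command in deconfig_plan(["db1", "db2"]):
        print(f"{node}: # {command}")
```

Each command is still run by hand as root on the node it is listed for.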