
Voting Disk and OCR: Basics, Management, Backup, and Recovery

2016-09-13 11:06


Oracle Clusterware includes three components:

voting disks, Oracle Cluster Registry (OCR), and Oracle Local Registry (OLR).

Voting Disks: Voting disks manage information about node membership. Each voting disk must be accessible by all nodes in the cluster for nodes to be members of the cluster.

OCR: OCR manages Oracle Clusterware and Oracle RAC database configuration information; that is, it stores the RAC configuration and registration data.

OLR: OLR resides on every node in the cluster and manages Oracle Clusterware configuration information for each particular node.

– The OLR on a node holds the Oracle Clusterware configuration of that node only.

Voting Disks

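The main purpose of the voting disk is to decide, when a split-brain occurs, which partition keeps control; the other partitions must be evicted from the cluster.

To protect a RAC cluster against split-brain, the voting disk lets Clusterware identify the failed node and evict it from the cluster.

Voting disk availability is therefore critical. Oracle recommends relying on ASM disk group redundancy to keep the voting disks highly available.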

Voting disk redundancy modes

External redundancy: A disk group with external redundancy can store only one voting disk.

– External redundancy uses a single disk. Redundancy is provided outside Oracle, for example by storage-level mirroring or by manual backups to a file system.

Normal redundancy: A disk group with normal redundancy can store up to three voting disks.

– Normal redundancy is managed by Oracle and needs at least three disks in at least three failgroups.

High redundancy: A disk group with high redundancy can store up to five voting disks.

– High redundancy needs at least five disks.
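
The odd disk counts above (1, 3, 5) follow from the majority rule: a node must be able to access more than half of the voting disks to stay in the cluster. A minimal sketch of that arithmetic; the function name `tolerated_failures` is made up for illustration and is not an Oracle tool:

```shell
#!/bin/sh
# Hypothetical helper: given the number of voting disks, print how many
# can be lost while a strict majority (more than half) stays reachable.
tolerated_failures() {
    total=$1
    majority=$(( total / 2 + 1 ))   # smallest count that is more than half
    echo $(( total - majority ))
}

for n in 1 3 5; do
    echo "$n voting disk(s): survives the loss of $(tolerated_failures "$n")"
done
```

So external redundancy tolerates no voting disk loss at the Clusterware level, normal redundancy tolerates one, and high redundancy tolerates two.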

Failgroups of a normal-redundancy disk group that holds the voting disks

col name for a12
col path for a16
set linesize 120
select name,path,failgroup from v$asm_disk;
SQL> select name,path,failgroup from v$asm_disk;

NAME         PATH                 FAILGROUP
------------ -------------------- ------------------------------
DATADG_0000  /dev/asm-diske       DATADG_0000
SYSDG_0001   /dev/asm-diskc       SYSDG_0001
SYSDG_0000   /dev/asm-diskb       SYSDG_0000
DATADG_0001  /dev/asm-diskf       DATADG_0001
SYSDG_0002   /dev/asm-diskd       SYSDG_0002
EXTDG_0000   /dev/asm-diskg       EXTDG_0000

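Note: for comparison, the redundancy requirements for ordinary data storage (as opposed to voting disks) are:
External redundancy:  one disk, at least one failgroup
Normal redundancy:    two disks, at least two failgroups
High redundancy:      three disks, at least three failgroups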


Managing Voting Disks

Listing the voting disks

crsctl query css votedisk
[grid@rac1 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   83685a6237234f1bbf59ad2468978166 (/dev/asm-diskb) [SYSDG]
2. ONLINE   66a45fb0a3684f46bf6aac677a4cee08 (/dev/asm-diskc) [SYSDG]
3. ONLINE   14dc980afad64fe7bf7fdf45bfc899ef (/dev/asm-diskd) [SYSDG]
Located 3 voting disk(s).
File Universal Id: the unique identifier (GUID) of each voting disk
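
For routine checks it can help to reduce this listing to a single number. The following is a sketch only: `count_online_votedisks` is a hypothetical helper name, and it assumes the row layout shown in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: count voting disks reported as ONLINE.
# Reads `crsctl query css votedisk` output on stdin.
count_online_votedisks() {
    # Data rows look like " 1. ONLINE   <FUID> (<path>) [<diskgroup>]"
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Sample rows copied from the listing above. On a live cluster you would
# pipe the real command instead:
#   crsctl query css votedisk | count_online_votedisks
sample=' 1. ONLINE   83685a6237234f1bbf59ad2468978166 (/dev/asm-diskb) [SYSDG]
 2. ONLINE   66a45fb0a3684f46bf6aac677a4cee08 (/dev/asm-diskc) [SYSDG]
 3. ONLINE   14dc980afad64fe7bf7fdf45bfc899ef (/dev/asm-diskd) [SYSDG]
Located 3 voting disk(s).'

printf '%s\n' "$sample" | count_online_votedisks
```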


Backing up voting disks

ASM RAC 11.2: Why Voting Disk Are Not Listed Thru ASMCMD or SQL*Plus?, And How To Backup Voting Disk In ASM 11.2? (Doc ID 1369079.1)

1)In previous releases, backing up the voting disks using a dd command was a required post installation task. With Oracle Clusterware 11g Release 2, backing up and restoring a voting disk using the dd command may result in the loss of the voting disk, so this procedure is not supported.

– Before 11gR2, voting disks could be backed up with the dd command. From 11gR2 on, with the voting files on ASM disks, dd is no longer supported.

2) Backing up voting disks manually is no longer required because voting disk data is backed up automatically in the OCR as part of any configuration change and voting disk data is automatically restored to any added voting disks.

– Voting disk data is backed up automatically together with the OCR, so no manual voting disk backup is needed.

3) The next note shows a clear example about how to restore the OCR & Votedisk on release 11.2:

=> How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems (Document 1062983.1)

Because voting disk data is backed up together with the OCR, the OCR backup listing shows when, and under which policy, the voting disk data was last backed up.

Viewing the OCR backups

ocrconfig -showbackup
[grid@rac1 ~]$ ocrconfig -showbackup
rac1     2016/09/12 07:14:22     /app/grid/11.2.0/cdata/rac/backup00.ocr
rac1     2016/09/12 03:14:20     /app/grid/11.2.0/cdata/rac/backup01.ocr
rac1     2016/09/11 23:14:18     /app/grid/11.2.0/cdata/rac/backup02.ocr
rac1     2016/09/11 03:14:06     /app/grid/11.2.0/cdata/rac/day.ocr
rac1     2016/09/07 19:13:35     /app/grid/11.2.0/cdata/rac/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
rac1: the name of the node that took the backup; the other columns are the backup time and the backup file location.


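OCR backup policy

An OCR backup is generated automatically every 4 hours, and the last 3 are retained.

The CRSD process also generates an OCR backup once a day and retains the last 2.

The CRSD process also generates an OCR backup once a week and retains the last 2.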
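
To compare a cluster against this policy, the backup listing can be tallied by kind. The sketch below is an illustration rather than an Oracle tool: `summarize_ocr_backups` is a hypothetical name, and the classification relies only on the file names seen in the `ocrconfig -showbackup` output shown earlier (backupNN.ocr, day.ocr, week.ocr).

```shell
#!/bin/sh
# Hypothetical helper: tally automatic OCR backups by kind, based on the
# file names in `ocrconfig -showbackup` output (read from stdin).
summarize_ocr_backups() {
    awk '
        {
            n = split($NF, parts, "/")   # last field is the backup path
            file = parts[n]              # keep only the file name
            if (file ~ /^backup[0-9]+\.ocr$/)   four_hourly++
            else if (file ~ /^day.*\.ocr$/)     daily++
            else if (file ~ /^week.*\.ocr$/)    weekly++
        }
        END { printf "4-hourly=%d daily=%d weekly=%d\n", four_hourly, daily, weekly }'
}

# Sample copied from the listing above.
sample='rac1     2016/09/12 07:14:22     /app/grid/11.2.0/cdata/rac/backup00.ocr
rac1     2016/09/12 03:14:20     /app/grid/11.2.0/cdata/rac/backup01.ocr
rac1     2016/09/11 23:14:18     /app/grid/11.2.0/cdata/rac/backup02.ocr
rac1     2016/09/11 03:14:06     /app/grid/11.2.0/cdata/rac/day.ocr
rac1     2016/09/07 19:13:35     /app/grid/11.2.0/cdata/rac/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available'

printf '%s\n' "$sample" | summarize_ocr_backups
```

Lines that are not backup entries, such as the PROT-25 message, are simply not counted.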

Steps

Adding and deleting a voting disk

crsctl delete css votedisk FUID
crsctl add css votedisk path_to_vote_disk


Checking voting disk status

[grid@rac1 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   83685a6237234f1bbf59ad2468978166 (/dev/asm-diskb) [SYSDG]
2. ONLINE   66a45fb0a3684f46bf6aac677a4cee08 (/dev/asm-diskc) [SYSDG]
3. ONLINE   14dc980afad64fe7bf7fdf45bfc899ef (/dev/asm-diskd) [SYSDG]
Located 3 voting disk(s).


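Simulating corruption

dd if=/dev/zero of=/dev/sdd bs=1024
bs=1024 sets the block size to 1024 bytes (not MB). With no count given, dd writes until the device is full: 1,048,576 blocks of 1024 bytes, or 1,073,741,824 bytes, which dd reports as 1.1 GB.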
[root@rac1 ~]# dd if=/dev/zero of=/dev/sdd bs=1024
dd: writing "/dev/sdd": No space left on device
1048577+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB) copied, 86.6881 s, 12.4 MB/s
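
The byte count in the dd output can be checked by hand: it is the block size in bytes multiplied by the number of records written. A small worked example using the figures from the transcript:

```shell
#!/bin/sh
# dd counts "records" in units of the block size (bs), which is in bytes.
bs=1024               # bytes per block, from bs=1024
records_out=1048576   # full blocks written, from the dd output above
bytes=$(( bs * records_out ))
echo "$bytes bytes"
echo "$(( bytes / 1024 / 1024 )) MiB"
```

The result is 1024 MiB (1 GiB); dd prints 1.1 GB because it reports the size in decimal gigabytes.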


Status after the corruption

[grid@rac1 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   83685a6237234f1bbf59ad2468978166 (/dev/asm-diskb) [SYSDG]
2. ONLINE   66a45fb0a3684f46bf6aac677a4cee08 (/dev/asm-diskc) [SYSDG]
3. PENDOFFL 14dc980afad64fe7bf7fdf45bfc899ef (/dev/asm-diskd) [SYSDG]    <-- PENDOFFL: this disk is damaged
Located 3 voting disk(s).
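
A state such as PENDOFFL is easy to miss when scanning the listing by eye. As a sketch, the non-ONLINE rows can be filtered out; `list_unhealthy_votedisks` is a hypothetical helper name and assumes the row layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "STATE FUID (path)" for every voting disk whose
# state is not ONLINE. Reads `crsctl query css votedisk` output on stdin.
list_unhealthy_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $2 != "ONLINE" { print $2, $3, $4 }'
}

# Sample rows copied from the listing above.
sample=' 1. ONLINE   83685a6237234f1bbf59ad2468978166 (/dev/asm-diskb) [SYSDG]
 2. ONLINE   66a45fb0a3684f46bf6aac677a4cee08 (/dev/asm-diskc) [SYSDG]
 3. PENDOFFL 14dc980afad64fe7bf7fdf45bfc899ef (/dev/asm-diskd) [SYSDG]
Located 3 voting disk(s).'

printf '%s\n' "$sample" | list_unhealthy_votedisks
```

An empty result means every listed voting disk is ONLINE.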


– Check the ASM disk status

col name format a15;
col path format a25;
col  failgroup format a20;

select dg.name, d.path, d.failgroup, d.failgroup_type
from v$asm_diskgroup dg, v$asm_disk d
where dg.group_number = d.group_number and dg.name = 'SYSDG'
order by dg.name, d.failgroup, d.path;

NAME            PATH                      FAILGROUP            FAILGRO
--------------- ------------------------- -------------------- -------
SYSDG           /dev/asm-diskb            SYSDG_0000           REGULAR
SYSDG           /dev/asm-diskc            SYSDG_0001           REGULAR    -- /dev/asm-diskd no longer appears in this query


Trying drop disk and then add disk (the drop fails because ASM no longer finds the disk in the disk group)

SQL> ALTER DISKGROUP SYSDG drop disk '/dev/asm-diskd';
ALTER DISKGROUP SYSDG drop disk '/dev/asm-diskd'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15054: disk "/DEV/ASM-DISKD" does not exist in diskgroup "SYSDG"


Querying again (after some time, the PENDOFFL disk has vanished from the voting disk list; the reason is not yet understood)

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
[grid@rac1 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   83685a6237234f1bbf59ad2468978166 (/dev/asm-diskb) [SYSDG]
2. ONLINE   66a45fb0a3684f46bf6aac677a4cee08 (/dev/asm-diskc) [SYSDG]


Adding the disk /dev/asm-diskd back into the disk group

alter diskgroup SYSDG add failgroup SYSDG_0002 disk '/dev/asm-diskd' rebalance power 11;
SQL> col name format a15;
SQL> col path format a25;
SQL> col  failgroup format a20;
SQL>
SQL> select dg.name, d.path, d.failgroup, d.failgroup_type
2  from v$asm_diskgroup dg, v$asm_disk d
3  where dg.group_number = d.group_number and dg.name = 'SYSDG'
4  order by dg.name, d.failgroup, d.path;

NAME            PATH                      FAILGROUP            FAILGRO
--------------- ------------------------- -------------------- -------
SYSDG           /dev/asm-diskb            SYSDG_0000           REGULAR
SYSDG           /dev/asm-diskc            SYSDG_0001           REGULAR
SYSDG           /dev/asm-diskd            SYSDG_0002           REGULAR

[grid@rac1 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   83685a6237234f1bbf59ad2468978166 (/dev/asm-diskb) [SYSDG]
2. ONLINE   66a45fb0a3684f46bf6aac677a4cee08 (/dev/asm-diskc) [SYSDG]
3. ONLINE   0da2e65317894f6abfa6c188f4954547 (/dev/asm-diskd) [SYSDG]
Located 3 voting disk(s).

Rebalance complete
SQL> select count(*) from v$asm_operation;

COUNT(*)
----------
0


Replacing /dev/asm-diskd with /dev/asm-diskh

crsctl delete css votedisk '0da2e65317894f6abfa6c188f4954547'
crsctl add css votedisk '/dev/asm-diskh'


Run as the grid user

[grid@rac1 ~]$ crsctl delete css votedisk '0da2e65317894f6abfa6c188f4954547'
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM


Run as root

cd /app/grid/11.2.0/bin
[root@rac1 bin]# ./crsctl delete css votedisk '0da2e65317894f6abfa6c188f4954547'
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM


Checking My Oracle Support (MOS)

Doc ID 1060146.1
CRS-4258: Addition and Deletion of Voting Files are not Allowed Because the Voting Files are on ASM in 11gR2
CAUSE
1. Seeing stale voting files is due to bug 9024611.
2. "delete" command is not available , only "replace" command is available when voting files are stored on  ASM diskgroup.
Please see Oracle Clusterware Administration and Deployment Guide 11g Release 2 (11.2).
In other words, crsctl delete is not supported when the voting files are in ASM; use the replace command instead.

$ crsctl replace votedisk  +asm_disk_group   --- Put available ASM diskgroup
$ crsctl query css votedisk         --- Check if voting files are all online on the new ASM diskgroup
$ crsctl replace votedisk +PLAY    -- Put the original ASM diskgroup where voting files were


Moving the voting disks by replacing the disk group

Create a new ASM disk group
create diskgroup DGSYS external redundancy DISK  '/dev/asm-diskh';
SQL> select name,state from v$asm_diskgroup;
NAME                           STATE
------------------------------ -----------
DATADG                         MOUNTED
EXTDG                          MOUNTED
SYSDG                          MOUNTED
DGSYS                          MOUNTED           -- DGSYS is the newly created disk group

crsctl replace votedisk +DGSYS
Failed to create voting files on disk group DGSYS.
Change to configuration failed, but was successfully rolled back.
CRS-4000: Command Replace failed, or completed with errors.


Check the alert log
/app/gridbase/diag/asm/+asm/+ASM1/trace
alert_+ASM1.log

NOTE: Voting File refresh pending for group 4/0x64885b18 (DGSYS)
NOTE: Attempting voting file creation in diskgroup DGSYS
ERROR: Voting file allocation failed for group DGSYS
Errors in file /app/gridbase/diag/asm/+asm/+ASM1/trace/+ASM1_ora_22033.trc:
ORA-15221: ASM operation requires compatible.asm of 11.2.0.0.0 or higher
NOTE: Attempting voting file refresh on diskgroup DGSYS
NOTE: Refresh completed on diskgroup DGSYS. No voting file found.


Check the compatibility attribute of DGSYS

select inst_id, name, state, type, free_mb, substr(compatibility,1,10) compatibility from gv$asm_diskgroup;

INST_ID NAME                           STATE       TYPE      FREE_MB COMPATIBIL
---------- ------------------------------ ----------- ------ ---------- ----------
2 DATADG                         MOUNTED     NORMAL      12328 11.2.0.0.0
2 EXTDG                          MOUNTED     EXTERN       6020 10.1.0.0.0
2 SYSDG                          MOUNTED     NORMAL       2146 11.2.0.0.0
2 DGSYS                          DISMOUNTED                  0 0.0.0.0.0
1 DATADG                         MOUNTED     NORMAL      12328 11.2.0.0.0
1 EXTDG                          MOUNTED     EXTERN       6020 10.1.0.0.0
1 SYSDG                          MOUNTED     NORMAL       2146 11.2.0.0.0
1 DGSYS                          MOUNTED     EXTERN        974 10.1.0.0.0


On the second node, mount the disk group first

Run on node2:
alter diskgroup DGSYS mount;


Set the compatibility attribute:

ALTER DISKGROUP DGSYS set ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';
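
Before retrying the replace, the value reported in the COMPATIBILITY column can be compared against the 11.2.0.0.0 minimum named in ORA-15221. A minimal sketch of such a dotted-version comparison; `version_at_least` is a hypothetical helper, not part of any Oracle tool:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if dotted version $1 >= $2,
# comparing field by field as numbers.
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(need, r, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = h[i] + 0; b = r[i] + 0   # missing fields count as 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

# The two values seen in the gv$asm_diskgroup output above.
for v in 10.1.0.0.0 11.2.0.0.0; do
    if version_at_least "$v" 11.2.0.0.0; then
        echo "$v: meets the compatible.asm minimum for voting files"
    else
        echo "$v: raise compatible.asm to at least 11.2.0.0.0 first"
    fi
done
```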


Retry the replace operation

[grid@rac1 ~]$ id
uid=502(grid) gid=5001(oinstall) groups=5001(oinstall),5002(dba),5004(asmadmin),5006(asmdba),5007(asmoper)
[grid@rac1 ~]$ crsctl replace votedisk +DGSYS
Successful addition of voting disk 8d1ff9be73b34f62bf2435a991909ccb.
Successful deletion of voting disk 83685a6237234f1bbf59ad2468978166.
Successful deletion of voting disk 66a45fb0a3684f46bf6aac677a4cee08.
Successful deletion of voting disk 0da2e65317894f6abfa6c188f4954547.
Successfully replaced voting disk group with +DGSYS.
CRS-4266: Voting file(s) successfully replaced


Check the current voting disks

[grid@rac1 ~]$ crsctl query css votedisk
## STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   8d1ff9be73b34f62bf2435a991909ccb (/dev/asm-diskh) [DGSYS]
Located 1 voting disk(s).


OCR (Oracle Cluster Registry)

OCR (Oracle Cluster Registry): The OCR contains information about all Oracle resources in the cluster.

OLR (Oracle Local Registry): The OLR is a registry similar to the OCR located on each node in a cluster, but contains information specific to each node.

– The OLR holds registry information for the local node only.

– The Oracle High Availability Services uses this information: OHAS manages its high-availability services from the OLR.

The OLR is stored locally on every cluster node; the default path is Grid_home/cdata/host_name.olr

For example:
[grid@rac2 cdata]$ ls -l
total 2908
drwxr-xr-x 2 grid oinstall      4096 Sep  7 14:41 localhost
drwxrwxr-x 2 grid oinstall      4096 Sep  7 14:41 rac
drwxr-xr-x 2 grid oinstall      4096 Sep  7 15:16 rac2
-rw------- 1 root oinstall 272756736 Sep 12 11:40 rac2.olr


Points to note about the OCR

1. If you upgrade from a previous version of Oracle Clusterware to 11g release 2 (11.2) and you want to store OCR in an Oracle ASM disk group, then you must set the ASM Compatibility attribute to 11.2.0.0.

2. You can store the OCR in an Oracle ASM disk group that has external redundancy. If a disk fails in the disk group, or if you bring down Oracle ASM, then you can lose the OCR because it depends on Oracle ASM for I/O.

– With an external redundancy disk group, any problem with that disk group makes the OCR inaccessible, and CRS resources cannot start.

To avoid this issue, add another OCR to a different disk group. Alternatively, you can store OCR on a block device, or on a shared file system using OCRCONFIG to enable OCR redundancy.

3. Oracle does not support storing the OCR on different storage types simultaneously, such as storing OCR on both Oracle ASM and a shared file system, except during a migration.

4. If Oracle ASM fails, then OCR is not accessible on the node on which Oracle ASM failed, but the cluster remains operational. The entire cluster only fails if the Oracle ASM instance on the OCR master node fails, if the majority of the OCR locations are in Oracle ASM, and if there is an OCR read or write access, then the crsd stops and the node becomes inoperative.

– If the OCR becomes inaccessible on a node, Clusterware on that node cannot run; nodes that are not affected keep running.

5. OCR inherits the redundancy of the disk group. If you want high redundancy for OCR, you must configure the disk group with high redundancy when you create it.

6. The OCR holds the cluster configuration and lives on shared storage. Only one node in the cluster, the Oracle Clusterware master node, performs reads and writes such as configuration updates against the OCR disk. All other nodes read the configuration and keep a copy in memory at run time. When the configuration changes, the master node propagates the change to the other nodes.

7. Because every node holds its own copy of the configuration, changes that are not synchronized across nodes lead to the amnesia problem.

8. Oracle records the location of the OCR on shared storage in /etc/oracle/ocr.loc (Linux). This file is configured automatically during installation.

On Solaris the file is /var/opt/oracle/ocr.loc.

Checking the OCR location and status

ocrcheck

[grid@rac1 trace]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3132
Available space (kbytes) :     258988
ID                       :  534658389
Device/File Name         :     +SYSDG
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check bypassed due to non-privileged user
OCRCHECK displays all OCR locations that are registered and whether they are available (online).


Create an external redundancy disk group

create diskgroup DGSYS external redundancy disk '/dev/asm-diskh' attribute 'compatible.asm' = '11.2','compatible.rdbms' = '11.2';


The following commands must be run as root

[root@rac1 bin]# ./ocrconfig
ocrconfig      ocrconfig.bin
[root@rac1 bin]# ./ocrconfig -add +dgsys
PROT-30: The Oracle Cluster Registry location to be added is not usable
PROC-50: The Oracle Cluster Registry location to be added is inaccessible on nodes rac2.
On node rac2, mount the disk group:
alter diskgroup dgsys mount;


Adding an OCR storage location

[root@rac1 bin]# ./ocrconfig -add +dgsys
./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3148
Available space (kbytes) :     258972
ID                       :  534658389
Device/File Name         :     +SYSDG
Device/File integrity check succeeded
Device/File Name         :     +dgsys          -- +dgsys has been added as an OCR location
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded

Logical corruption check succeeded

[root@rac1 bin]# ./ocrconfig -delete +SYSDG   -- try deleting the original OCR location in SYSDG
[root@rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3148
Available space (kbytes) :     258972
ID                       :  534658389
Device/File Name         :     +dgsys
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
[root@rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   ab1928abe53f4f17bfce2337ffe0aa02 (/dev/asm-diskb) [SYSDG]
2. ONLINE   9476505073584fecbf9ef65a12525956 (/dev/asm-diskc) [SYSDG]
3. ONLINE   3d0fabe8d6434fa5bfdf389deba3df3b (/dev/asm-diskd) [SYSDG]
Located 3 voting disk(s).

As shown above, the voting disks and the OCR are now stored in different disk groups.


Replacing an OCR location with replace

ocrconfig -replace current_OCR_location -replacement new_OCR_location
ocrconfig -replace +DGSYS -replacement +SYSDG
[root@rac1 bin]# ./ocrconfig -replace +DGSYS -replacement +SYSDG
PROT-28: Cannot delete or replace the only configured Oracle Cluster Registry location

PROT-00028: Cannot delete or replace the only configured Oracle Cluster Registry location.
Cause: A delete or replace operation was performed when there was only 1 configured Oracle Cluster Registry location. The above listed operation can only be performed when there are at least 2 configured Oracle Cluster Registry locations.
-- A replace is not possible while only one OCR location is configured; at least two locations must already exist.
Action: Execute 'ocrconfig -add new location' to add the new Oracle Cluster Registry location followed by 'ocrconfig -delete existing location' command.
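
The PROT-28 condition can be anticipated by counting the configured locations in the ocrcheck output before attempting a delete or replace. This is a sketch under the assumption that each configured location appears as one "Device/File Name" line, as in the listings above; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count configured OCR locations in `ocrcheck` output
# (read from stdin) by counting the "Device/File Name" lines.
count_ocr_locations() {
    awk '/Device\/File Name/ { n++ } END { print n + 0 }'
}

# Succeeds only if at least two locations are configured, which is the
# precondition stated in the PROT-28 message.
can_delete_or_replace_ocr() {
    [ "$(count_ocr_locations)" -ge 2 ]
}

# Sample lines copied from the ocrcheck output after +SYSDG was deleted.
sample='Device/File Name         :     +dgsys
Device/File integrity check succeeded
Device/File not configured
Device/File not configured'

if printf '%s\n' "$sample" | can_delete_or_replace_ocr; then
    echo "ok: another location remains after a delete or replace"
else
    echo "refuse: only one OCR location configured (would hit PROT-28)"
fi
```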


OCR backups

Automatic backups: Oracle Clusterware automatically creates OCR backups every four hours. At any one time, Oracle Database always retains the last three backup copies of OCR. The CRSD process that creates the backups also creates and retains an OCR backup for each full day and at the end of each week. You cannot customize the backup frequencies or the number of files that Oracle Database retains.

– Automatic backups run every 4 hours and the last three copies are kept. They are taken by the crsd process, and the schedule cannot be customized.

Manual backups: Use the ocrconfig -manualbackup command to force Oracle Clusterware to perform a backup of OCR at any time, rather than wait for the automatic backup. The -manualbackup option is especially useful when you want to obtain a binary backup on demand, such as before you make changes to the OCR. The OLR only supports manual backups.

– Manual backups are taken with ocrconfig. The OLR can only be backed up manually.

Physical backups of the OCR

Viewing backups

ocrconfig -showbackup

[grid@rac1 ~]$ ocrconfig -showbackup|awk '/^rac/{print}'
rac2     2016/09/13 07:08:25     /app/grid/11.2.0/cdata/rac/backup00.ocr
rac2     2016/09/13 03:08:24     /app/grid/11.2.0/cdata/rac/backup01.ocr
rac2     2016/09/12 23:08:23     /app/grid/11.2.0/cdata/rac/backup02.ocr
rac1     2016/09/12 03:14:20     /app/grid/11.2.0/cdata/rac/day.ocr
rac1     2016/09/07 19:13:35     /app/grid/11.2.0/cdata/rac/week.ocr
rac2     2016/09/13 09:01:40     /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr

oerr prot 25  -- look up the error message


Dumping the OCR contents to a readable text file (for inspection only; the dump cannot be used as an OCR backup)

ocrdump /tmp/ocrdump.txt


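Default location of OCR backup files: Grid_home/cdata/cluster_name

A physical backup cannot be restored by simply copying the file over the OCR with operating-system commands; doing so leaves the OCR unusable.

A physical backup can only be restored with restore; import is not supported for it.

Physical restore (recovering the OCR from an automatic backup with restore)

Preface: If you are storing the OCR on an Oracle ASM disk group, and that disk group is corrupt, then you must restore the Oracle ASM disk group using Oracle ASM utilities, and then mount the disk group again before recovering the OCR. Recover the OCR by running the command ocrconfig -restore.

If the disk group holding the OCR is damaged, restore the disk group first and recover the OCR afterwards.

Restore steps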

[root@rac2 bin]# ./ocrconfig -restore /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr

PROT-19: Cannot proceed while the Cluster Ready Service is running

ocrconfig -restore can only be run while CRS is stopped.

1. olsnodes: list the cluster nodes

2. Stop CRS

./crsctl stop crs    (run as root)
./crsctl stop crs -f (if the stop fails, force it; CRS must be stopped on all nodes)


3. Run the restore

ocrconfig -restore  /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr
[root@rac2 bin]# ./ocrconfig -restore  /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr
PROT-35: The configured OCR locations are not accessible.
[root@rac2 bin]# ls -l /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr
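-rw------- 1 root root 7471104 Sep 13 09:01 /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr


4. The error above most likely means that the processes needed to reach the OCR location were not running, so the configured OCR locations could not be accessed even though the backup file exists.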

Start the Oracle Clusterware stack on one node in exclusive mode by running the following command as root:
(Ignore any errors that display)
./crsctl  start crs -excl


Log:

[root@rac2 bin]# ./crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac2'
CRS-2676: Start of 'ora.mdnsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac2'
CRS-2676: Start of 'ora.gpnpd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac2'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rac2'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac2'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac2'
CRS-2676: Start of 'ora.drivers.acfs' on 'rac2' succeeded
CRS-2676: Start of 'ora.ctssd' on 'rac2' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac2'
CRS-2676: Start of 'ora.asm' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac2'
CRS-2676: Start of 'ora.crsd' on 'rac2' succeeded


Check the processes

[root@rac2 ~]# ps -ef|grep d.bin
root     30456     1  0 09:26 ?        00:00:00 /app/grid/11.2.0/bin/ohasd.bin exclusive
grid     30637     1  0 09:26 ?        00:00:00 /app/grid/11.2.0/bin/mdnsd.bin
grid     30648     1  0 09:26 ?        00:00:00 /app/grid/11.2.0/bin/gpnpd.bin
grid     30661     1  0 09:26 ?        00:00:00 /app/grid/11.2.0/bin/gipcd.bin
grid     30706     1  0 09:26 ?        00:00:00 /app/grid/11.2.0/bin/ocssd.bin -X
root     30844     1  0 09:27 ?        00:00:00 /app/grid/11.2.0/bin/octssd.bin
root     31157     1  0 09:27 ?        00:00:00 /app/grid/11.2.0/bin/crsd.bin reboot
root     31325 30575  0 09:28 pts/1    00:00:00 grep d.bin


Check resource status

crsctl stat res -t -init

[root@rac2 bin]# ./crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.

Check whether crsd is running. If it is, stop it by running the following command as root:
crsctl stop resource ora.crsd -init       -- crsd must not be running during the restore; check with ps -ef|grep crsd
[root@rac2 bin]# ./crsctl stop resource ora.crsd -init
CRS-2673: Attempting to stop 'ora.crsd' on 'rac2'
CRS-2677: Stop of 'ora.crsd' on 'rac2' succeeded

./crsctl start has
./crsctl stat res -t


5. Retry the restore

ocrconfig -restore  /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr
[root@rac2 bin]# ./ocrconfig -restore  /app/grid/11.2.0/cdata/rac/backup_20160913_090140.ocr
[root@rac2 bin]#
The command completes successfully.


6. ocrcheck

7. crsctl stop crs -f

8. crsctl start crs

9. cluvfy comp ocr -n all -verbose

Verify the OCR integrity of all of the cluster nodes that are configured as part of your cluster by running the following CVU command:
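[grid@rac2 ~]$ cluvfy comp ocr -n all -verbose|grep -v ^$
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ASM Running check passed. ASM is running on all specified nodes
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
Disk group for ocr location "+dgsys" available on all the nodes
Disk group for ocr location "+SYSDG" available on all the nodes
NOTE:
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check passed
Verification of OCR integrity was successful.

Done!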


Logical backups of the OCR

You should also export the OCR contents before and after making significant configuration changes, such as adding or deleting nodes from your environment, modifying Oracle Clusterware resources, and upgrading, downgrading or creating a database.

– In other words, take a logical (export) backup of the OCR around any such cluster change.

Physical versus logical OCR backups

ocrconfig -restore works with physical backups (the automatic ones and those taken with -manualbackup).
ocrconfig -export pairs with ocrconfig -import.
ocrconfig -restore and export/import cannot be mixed.


Restoring from a logical backup

1. olsnodes
2. crsctl stop crs
crsctl stop crs -f  (execute on all nodes)
3. crsctl start crs -excl (starts the required processes)
crsctl stop resource ora.crsd -init
4. ocrconfig -import file_name
5. ocrcheck
6. crsctl stop crs -f
crsctl start crs
7. cluvfy comp crs -n all -verbose
You can only import an exported OCR.
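
The sequence above can be captured as a script so the order is not left to memory. What follows is a sketch, not a tested recovery procedure: the `run` wrapper, the function name, and the export file path /tmp/ocr_export.dmp are all made up for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the logical-restore sequence. By default it only prints the
# commands; set DRY_RUN=0 (as root, on a real cluster) to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restore_ocr_from_export() {
    export_file=$1                           # file produced by ocrconfig -export
    run olsnodes                             # 1. list the cluster nodes
    run crsctl stop crs -f                   # 2. stop the stack (repeat on every node)
    run crsctl start crs -excl               # 3. start one node in exclusive mode
    run crsctl stop resource ora.crsd -init  #    crsd must not be running
    run ocrconfig -import "$export_file"     # 4. import the logical backup
    run ocrcheck                             # 5. check the OCR
    run crsctl stop crs -f                   # 6. leave exclusive mode ...
    run crsctl start crs                     #    ... and start normally
    run cluvfy comp crs -n all -verbose      # 7. verify with CVU
}

# /tmp/ocr_export.dmp is a placeholder path, not a file from this article.
restore_ocr_from_export /tmp/ocr_export.dmp
```

A production version would also stop at the first failing command and would handle the "on every node" part of step 2, which this single-node sketch leaves to the operator.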


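OLR (Oracle Local Registry)

The Oracle Local Registry holds the registry information local to a node.

Check it as root (as the grid user the command fails for lack of privileges)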

[root@rac2 bin]# ./ocrcheck -local|grep -v ^$
Status of Oracle Local Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       2668
Available space (kbytes) :     259452
ID                       : 2112081575
Device/File Name         : /app/grid/11.2.0/cdata/rac2.olr
Device/File integrity check succeeded
Local registry integrity check succeeded
Logical corruption check succeeded


Viewing a dump

ocrdump -local -stdout


Exporting and importing the OLR

ocrconfig -local -export file_name
ocrconfig -local -import file_name


Physical backup

ocrconfig -local -manualbackup
ocrdump -local -backupfile olr_backup_file_name
ocrconfig -local -backuploc new_olr_backup_path


Restore OLR

crsctl stop crs
ocrconfig -local -restore file_name
ocrcheck -local
crsctl start crs
cluvfy comp olr


Reference: the OCR tree structure

http://blog.itpub.net/23135684/viewspace-715989/

The OCR is organized as a tree, and its data is stored as key-value pairs.

root
├─SYSTEM
│  ├─css
│  ├─language
│  ├─version
│  ├─ORA_CRS_HOME
│  ├─local_only
│  ├─evm
│  ├─crs
│  └─OCR
├─DATABASE
│  ├─NODEAPPS
│  ├─LOG
│  ├─ASM
│  ├─DATABASES
│  │  ├─SERVICE
│  │  └─INSTANCE
│  └─ONS
└─CRS


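The three top-level keys serve the following purposes:

1) The SYSTEM key holds data for the main Oracle Clusterware processes CSSD, CRSD, and EVMD;

2) The DATABASE key holds data for the RAC databases registered with Oracle Clusterware;

3) The CRS key holds resource profile information, used to maintain the availability of other applications registered with Oracle Clusterware.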