Recovering a Recovery Pending Database After 2 of 3 Cluster Nodes Fail: Exploring the Options
2013-11-09 19:08
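You may run into a situation where two of the three nodes in a cluster go down. The database on the one remaining node then goes into the Recovery Pending state and becomes inaccessible. The AlwaysOn availability group and its replica also stop looking healthy and show a Resolving state. Suppose there is no backup of the data, the two failed nodes cannot be brought back, and you still need to return the Recovery Pending database to a normal, accessible state. What do you do? This post explores possible solutions.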
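As a starting point, the symptoms described above can be confirmed from T-SQL. The following is a minimal sketch, assuming SQL Server 2012 or later and the database name ASRS_F1 that appears in the error messages in this post. It only reads catalog views and DMVs and changes nothing.

```sql
-- Database state: the affected database is expected to show RECOVERY_PENDING.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'ASRS_F1';

-- State of the local availability replica: role_desc is expected to be RESOLVING.
SELECT ag.name AS group_name,
       rs.role_desc,
       rs.operational_state_desc,
       rs.connected_state_desc
FROM sys.dm_hadr_availability_replica_states AS rs
JOIN sys.availability_groups AS ag
    ON ag.group_id = rs.group_id
WHERE rs.is_local = 1;
```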
The first attempts were Detach and Take Offline, and both failed.
Detach and Take Offline both report the following error:
The operation cannot be performed on database "ASRS_F1" because it is involved in a database mirroring session or an availability group. Some operations are not allowed on a database that is participating in a database mirroring session or in an availability group.
ALTER DATABASE statement failed. (Microsoft SQL Server, Error: 1468)
Rename reports the following error:
Database 'ASRS_F1' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details. (Microsoft SQL Server, Error: 945)
Delete reports the following error (I tried this in a test environment just to see what would happen; never experiment like this in production):
The database 'ASRS_F1' is currently joined to an availability group. Before you can drop the database, you need to remove it from the availability group. (Microsoft SQL Server, Error: 3752)
Forcing the database offline (ALTER DATABASE [ASRS_F1] SET OFFLINE WITH ROLLBACK IMMEDIATE) reports the following error:
Msg 1468, Level 16, State 1, Line 1
The operation cannot be performed on database "ASRS_F1" because it is involved in a database mirroring session or an availability group. Some operations are not allowed on a database that is participating in a database mirroring session or in an availability group.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.
Restoring (RESTORE DATABASE ASRS_F1 WITH RECOVERY) does not work either:
Msg 3104, Level 16, State 1, Line 1
RESTORE cannot operate on database 'ASRS_F1' because it is configured for database mirroring or has joined an availability group. If you intend to restore the database, use ALTER DATABASE to remove mirroring or to remove the database from its availability group.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
Turning HADR off (ALTER DATABASE ASRS_F1 SET HADR OFF) does not work either:
Msg 35220, Level 16, State 1, Line 1
Could not process the operation. AlwaysOn Availability Groups replica manager is waiting for the host computer to start a Windows Server Failover Clustering (WSFC) cluster and join it. Either the local computer is not a cluster node, or the local cluster node is not online. If the computer is a cluster node, wait for it to join the cluster. If the computer is not a cluster node, add the computer to a WSFC cluster. Then, retry the operation.
Following the hint given by most of the earlier error messages, I tried removing the database from the group:
ALTER AVAILABILITY GROUP agASRS REMOVE DATABASE ASRS_F1
That fails as well:
Msg 35220, Level 16, State 1, Line 1
Could not process the operation. AlwaysOn Availability Groups replica manager is waiting for the host computer to start a Windows Server Failover Clustering (WSFC) cluster and join it. Either the local computer is not a cluster node, or the local cluster node is not online. If the computer is a cluster node, wait for it to join the cluster. If the computer is not a cluster node, add the computer to a WSFC cluster. Then, retry the operation.
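Error 35220 points at the WSFC cluster rather than at the database itself. A minimal sketch for checking how this instance sees the cluster, assuming SQL Server 2012 or later. With two of the three nodes gone the cluster has most likely lost quorum, and in that case the first query may return no rows at all.

```sql
-- Cluster name and quorum state as seen by this instance.
SELECT cluster_name, quorum_type_desc, quorum_state_desc
FROM sys.dm_hadr_cluster;

-- Per-member view: which members are up and how many quorum votes each holds.
SELECT member_name, member_type_desc, member_state_desc, number_of_quorum_votes
FROM sys.dm_hadr_cluster_members;
```

Both queries are read-only, so they are safe to run while deciding what to try next.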
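Next I thought of disabling AlwaysOn Availability Groups:
Unfortunately, clicking Apply popped up a dialog [screenshot not preserved].
When I finally clicked OK to exit, the service restarted once more and no error was shown. After refreshing and checking the properties again, the feature had, surprisingly, been disabled.
Trying the removal again: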
ALTER AVAILABILITY GROUP agASRS REMOVE DATABASE ASRS_F1
produces this error:
Msg 35221, Level 16, State 1, Line 1
Could not process the operation. AlwaysOn Availability Groups replica manager is disabled on this instance of SQL Server. Enable AlwaysOn Availability Groups, by using the SQL Server Configuration Manager. Then, restart the SQL Server service, and retry the currently operation. For information about how to enable and disable AlwaysOn Availability Groups, see SQL Server Books Online.
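Whether the feature is really disabled after the service restart can also be checked from T-SQL instead of the properties dialog. A minimal sketch using two documented server properties:

```sql
-- IsHadrEnabled:     0 = AlwaysOn Availability Groups disabled, 1 = enabled.
-- HadrManagerStatus: 0 = not started (pending communication),
--                    1 = started and running,
--                    2 = not started and failed.
SELECT SERVERPROPERTY('IsHadrEnabled')     AS is_hadr_enabled,
       SERVERPROPERTY('HadrManagerStatus') AS hadr_manager_status;
```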
Whatever I tried, nothing seemed to work. My feeling was that updating the relevant system tables might do it:
SELECT NAME,STATE,state_desc,replica_id,group_database_id FROM SYS.DATABASES
But the update fails:
Msg 259, Level 16, State 1, Line 1
Ad hoc updates to system catalogs are not allowed.
In response to this error I changed the related option through sp_configure, but the update still failed...
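The post does not say which sp_configure option was changed. Assuming it was the legacy 'allow updates' option, the sketch below shows how to inspect it. Since SQL Server 2005 that option is kept only for backward compatibility and has no effect, because direct updates to system tables are no longer supported. That would be consistent with the update still being rejected.

```sql
-- Assumption: the option in question is 'allow updates' (not stated in the post).
-- Read-only check of the configured value and the value currently in use.
SELECT name, value, value_in_use, description
FROM sys.configurations
WHERE name = N'allow updates';
```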
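Among all these attempts, I found one approach that makes the database usable again:
1. Stop the SQL Server service.
2. Copy the data files (mdf, ldf) somewhere else, for example to another server.
3. Attach the data files.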
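Step 3 can be done in T-SQL on the server that received the files. A minimal sketch, assuming the copied files are named ASRS_F1.mdf and ASRS_F1_log.ldf and sit under D:\Data. Both the file names and the path are hypothetical, since the post does not give them.

```sql
-- Run on the server that received the copied files.
-- File names and paths are placeholders, not taken from the original post.
CREATE DATABASE ASRS_F1
ON (FILENAME = N'D:\Data\ASRS_F1.mdf'),
   (FILENAME = N'D:\Data\ASRS_F1_log.ldf')
FOR ATTACH;

-- Confirm that the attached database came online.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'ASRS_F1';
```

The target instance must be the same version as the source or newer, because a database cannot be attached to an older version of SQL Server.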
I still hope to find a way to recover the database directly on the original server.