Heartbeat + DRBD + MySQL: Building a Highly Available MySQL Database Service
2011-03-30 10:45
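I. What is DRBD
DRBD consists of a kernel module and its supporting scripts and is used to build high-availability clusters. It works by mirroring an entire block device over the network, so you can think of it as a network RAID 1.
DRBD takes incoming data, writes it to the local disk, and sends it to the other host, which then stores it on its own disk. The other components needed are a cluster membership service, such as TurboHA or Heartbeat, and an application that can run on top of a block device.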
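II. Setting up the test environment
1. Two CentOS 5.5 machines. If you have no spare hardware, virtual machines will do.
CentOS 1
IP: 192.168.1.251
CentOS 2
IP: 192.168.1.252
2. Edit /etc/hosts on both machines so that the two files are identical:
[root@node1 ~]# cat /etc/hosts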
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 CentOS localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.1.251 node1
192.168.1.252 node2
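After the change, reboot so that the hostnames are node1 and node2 respectively.
If the hostname did not change, check whether another file also sets HOSTNAME, for example /etc/sysconfig/network.
Make sure that node1 can ping node2 and that node2 can ping node1.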
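Before moving on, the hosts entries can be checked with a small script. This is only a sketch: hosts_has_entry is a helper name I made up for illustration, and it simply looks for a non-comment line that maps the given IP to the given name.

```shell
#!/bin/sh
# Sketch: verify that a hosts file maps a name to the expected IP.
# hosts_has_entry is a hypothetical helper, not part of any package.
hosts_has_entry() {
    # $1 = hosts file, $2 = hostname, $3 = expected IP
    awk -v name="$2" -v ip="$3" '
        /^[ \t]*#/ { next }                    # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example use on either node:
#   hosts_has_entry /etc/hosts node1 192.168.1.251 && \
#   hosts_has_entry /etc/hosts node2 192.168.1.252 && echo "hosts OK"
```

The function returns success only when both the IP and the name appear on the same line, which is what DRBD and Heartbeat rely on later when they refer to node1 and node2 by name.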
III. Installing DRBD
1. Install DRBD and the DRBD kernel module. If the machines have Internet access, you can install them with yum:
[root@node1 ~]# yum search drbd
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* addons: centos.candishosting.com.cn
* base: centos.ustc.edu.cn
* extras: centos.candishosting.com.cn
* updates: centos.ustc.edu.cn
================================ Matched: drbd =================================
drbd.i386 : Distributed Redundant Block Device driver for Linux
drbd82.i386 : Distributed Redundant Block Device driver for Linux
drbd83.i386 : Distributed Redundant Block Device driver for Linux
kmod-drbd.i686 : drbd kernel module(s)
kmod-drbd-PAE.i686 : drbd kernel module(s)
kmod-drbd-xen.i686 : drbd kernel module(s)
kmod-drbd82.i686 : drbd82 kernel module(s)
kmod-drbd82-PAE.i686 : drbd82 kernel module(s)
kmod-drbd82-xen.i686 : drbd82 kernel module(s)
kmod-drbd83.i686 : drbd83 kernel module(s)
kmod-drbd83-PAE.i686 : drbd83 kernel module(s)
kmod-drbd83-xen.i686 : drbd83 kernel module(s)
The related packages are listed. Install the ones we want, on both machines:
[root@node1 ~]# yum install -y drbd83 kmod-drbd83
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* addons: centos.candishosting.com.cn
* base: centos.ustc.edu.cn
* extras: centos.candishosting.com.cn
* updates: centos.ustc.edu.cn
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package drbd83.i386 0:8.3.8-1.el5.centos set to be updated
---> Package kmod-drbd83.i686 0:8.3.8-1.el5.centos set to be installed
--> Finished Dependency Resolution
............... (remaining output omitted)
[root@node1 ~]# lsmod | grep drbd
drbd 228528 3
The DRBD module has been loaded successfully.
2. Write the DRBD configuration file
Create the DRBD configuration file /etc/drbd.conf on both machines, with identical contents:
[root@node1 ~]# vi /etc/drbd.conf
#
# please have a a look at the example configuration file in
# /usr/share/doc/drbd83/drbd.conf
global { usage-count yes; }
common { syncer { rate 10M; } }
resource db {
    protocol C;
    net {
    }
    on node1 {
        device /dev/drbd1;
        disk /dev/sdb1;
        address 192.168.1.251:7789;
        meta-disk internal;
    }
    on node2 {
        device /dev/drbd1;
        disk /dev/sdb1;
        address 192.168.1.252:7789;
        meta-disk internal;
    }
}
3. Create the DRBD device nodes:
for i in $(seq 0 15) ; do mknod /dev/drbd$i b 147 $i ; done
Initialize the meta-data area:
[root@node1 ~]# drbdadm create-md db
md_offset 21467942912
al_offset 21467910144
bm_offset 21467254784
Found ext3 filesystem
20964792 kB data area apparently used
20964116 kB left usable by current configuration
Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
* use external meta data (recommended)
* shrink that filesystem first
* zero out the device (destroy the filesystem)
Operation refused.
Command 'drbdmeta 1 v08 /dev/sdb1 internal create-md' terminated with exit code 40
drbdadm create-md db: exited with code 40
As you can see, an error occurred.
Cause: you created your filesystem before you created your DRBD resource, or
you created your filesystem on your backing device, rather than your DRBD,
neither of which is a problem by itself, except, as the error message tries to hint, you need to enlarge the device (e.g.
lvextend), shrink the filesystem (e.g. resize2fs), or place the DRBD metadata somewhere else (external meta data).
DRBD tries to detect an existing use of the block device in question. E.g. if it detects an existing file system that uses
all the available space (as is default for most filesystems), and you try to use DRBD with internal meta data, there is no
room for the internal meta data; creating that would corrupt the last few MiB of the existing file system.
If re-creating the filesystem on the DRBD is an option, one way to "zero out the device (destroy the filesystem)", and then
recreate it on the DRBD is
Solution: zero out the start of the backing device, which destroys the existing filesystem
dd if=/dev/zero bs=1M count=1 of=/dev/sdXYZ; sync
drbdadm create-md $r
drbdadm -- -o primary $r
mkfs /dev/drbdY
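To see why create-md refused, it helps to compare the two sizes it printed. The short calculation below uses only the figures from the error output above; the variable names are mine.

```shell
#!/bin/sh
# Figures copied from the drbdadm create-md output in this article.
used_kb=20964792      # "data area apparently used"
usable_kb=20964116    # "left usable by current configuration"

# Space the internal meta data would claim at the end of /dev/sdb1.
shortfall_kb=$((used_kb - usable_kb))
echo "filesystem is ${shortfall_kb} kB too large for internal meta data"
```

In other words the existing ext3 filesystem fills the whole partition, and the internal meta data needs the last few hundred kB of it. Since the partition holds nothing worth keeping in this setup, wiping it is the simplest of the three options.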
With the method in hand, let's apply it:
------------------------------------------------------------------
[root@node1 ~]# dd if=/dev/zero bs=1M count=1 of=/dev/sdb1;sync
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0570659 seconds, 18.4 MB/s
Initialize the meta-data area again:
[root@node1 ~]# drbdadm create-md db
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success
The word "success" shows that it worked. Perform the steps above on both machines.
-----------------------------------------------------------------
Now DRBD can be started. Run one of the following on each host; any of the three works:
[root@node1 ~]# /etc/init.d/drbd start
[root@node1 ~]# service drbd start
[root@node1 ~]# drbdadm up all
Check the port after startup; it is the one specified in drbd.conf.
The output of netstat -ant contains a line like this:
[root@node1 ~]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.1.251:7789 192.168.1.252:52601 ESTABLISHED
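The port check can also be scripted rather than read by eye. The following is a sketch only; drbd_link_up is a name I chose, and it assumes the column layout of the netstat -ant output shown above.

```shell
#!/bin/sh
# Sketch: succeed if netstat-style input shows an ESTABLISHED TCP
# connection involving the given port. drbd_link_up is a hypothetical name.
drbd_link_up() {
    # $1 = port; netstat -ant output is read from stdin
    awk -v port="$1" '
        $1 == "tcp" && $6 == "ESTABLISHED" {
            if ($4 ~ (":" port "$") || $5 ~ (":" port "$")) ok = 1
        }
        END { exit ok ? 0 : 1 }
    '
}

# Example use: netstat -ant | drbd_link_up 7789 && echo "DRBD peers connected"
```

Both the local and the foreign address are checked because only one side of the connection uses port 7789 as its local port.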
4. Check the current DRBD state on both machines:
[root@node1 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:20964116
[root@node2 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
0: cs:Unconfigured
1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:20964116
/proc/drbd shows the current DRBD state. ro:Secondary/Secondary means that both hosts are in the secondary (standby) role.
ds is the disk state; both sides are Inconsistent.
This is because DRBD cannot tell which side is the primary, or whose disk data should be treated as authoritative, so we have to designate one host as the primary for the initial synchronization.
5. On node1, run:
[root@node1 ~]# drbdsetup /dev/drbd1 primary -o
Check the DRBD state on both machines again:
[root@node1 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
ns:131140 nr:0 dw:0 dr:132096 al:0 bm:7 lo:0 pe:16 ua:30 ap:0 ep:1 wo:b oos:20833460
[>....................] sync'ed: 0.7% (20344/20472)M delay_probe: 16
finish: 0:49:08 speed: 7,040 (6,876) K/sec
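The roles are now Primary/Secondary. The primary's disk state is UpToDate and the secondary's is Inconsistent.
The third line of the device status shows that synchronization is in progress, that is, the primary is sending its disk data to the secondary. The current progress is [>....................] sync'ed: 0.7% (20344/20472)M
After synchronization completes, check the state on both machines again:
[root@node1 ~]# cat /proc/drbd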
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@node2 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Both disk states are UpToDate, which means the synchronization has finished.
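If you would rather not read /proc/drbd by hand each time, the three fields discussed here (cs, ro, ds) can be pulled out with awk. This is a sketch written against the 8.3 output format shown in this article; drbd_state is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print "cs ro ds" for one DRBD minor from /proc/drbd-style input.
# drbd_state is a hypothetical helper written for this article's 8.3 output.
drbd_state() {
    # $1 = minor number (1 for /dev/drbd1); input on stdin
    awk -v minor="$1" '
        $1 == (minor ":") {
            for (i = 2; i <= NF; i++) {
                split($i, kv, ":")
                if (kv[1] == "cs") cs = kv[2]
                if (kv[1] == "ro") ro = kv[2]
                if (kv[1] == "ds") ds = kv[2]
            }
            print cs, ro, ds
        }
    '
}

# Example use: drbd_state 1 < /proc/drbd
# On the primary, a finished sync prints:
#   Connected Primary/Secondary UpToDate/UpToDate
```

A loop that sleeps until the ds field reads UpToDate/UpToDate would be one way to wait for the initial sync before creating the filesystem.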
You can now mount the DRBD device on the primary and use it. The DRBD device on the secondary cannot be mounted, because it exists to receive data from the primary and is managed by DRBD.
[root@node1 ~]# mkfs.ext3 /dev/drbd1
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
2621440 inodes, 5241029 blocks
262051 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
160 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000
Writing inode tables: 45/160
[root@node1 ~]# mount /dev/drbd1 /mnt
[root@node1 ~]# touch /mnt/test.txt
[root@node1 ~]# ls /mnt/
lost+found test.txt
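The file test.txt created on the primary node1 is also stored in full on the DRBD partition of the secondary node2.
This is DRBD's network RAID 1 behavior: every write on the primary is replicated to the corresponding disk partition on the secondary, which gives you a live copy of the data.
Switching the DRBD primary and secondary
Sometimes you need to swap the primary and secondary roles. Proceed as follows.
On the primary, unmount the DRBD device first:
[root@node1 ~]# umount /mnt
[root@node1 ~]# drbdadm secondary db
[root@node1 ~]# cat /proc/drbd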
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r----
ns:464580 nr:0 dw:464580 dr:165 al:368 bm:123 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Both hosts are now secondary.
On node2, promote it to primary:
[root@node2 ~]# drbdadm primary db
[root@node2 ~]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
ns:0 nr:464580 dw:464580 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@node2 ~]# mount /dev/drbd1 /mnt
[root@node2 ~]# ls /mnt/
lost+found test.txt
node2 is now the primary.
DRBD switchover commands
drbdadm secondary db   # demote the primary to secondary
drbdadm primary db     # promote the secondary to primary
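The manual switchover can be wrapped in two small helpers so that the order of operations is hard to get wrong. This is a sketch, not something shipped with DRBD: run, drbd_demote and drbd_promote are names I invented, and DRY_RUN=1 only prints the commands so the sequence can be reviewed first.

```shell
#!/bin/sh
# Sketch of the manual switchover as two helpers. Names are hypothetical.
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the current primary: release the device, then demote.
drbd_demote() {   # $1 = resource, $2 = mount point
    run umount "$2" && run drbdadm secondary "$1"
}

# Run on the peer afterwards: promote, then mount.
drbd_promote() {  # $1 = resource, $2 = DRBD device, $3 = mount point
    run drbdadm primary "$1" && run mount "$2" "$3"
}

# Example use:
#   node1# drbd_demote db /mnt
#   node2# drbd_promote db /dev/drbd1 /mnt
```

The && between the steps matters: if the unmount fails because something still uses the filesystem, the demotion is not attempted.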
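Note: in my tests only the primary could read and write the DRBD device; the secondary could not mount it, so it could not read it either. I first suspected my drbd.conf, but this is how DRBD behaves in single-primary mode: a device in the Secondary role cannot be opened.
DRBD is now configured and switches over correctly. Next we make the two machines highly available, that is, a hot-standby pair. I won't explain this in detail; anyone who has worked with HA will be familiar with it.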
IV. Installing heartbeat, mysql and mysql-server
1. Install heartbeat, mysql and mysql-server on both node1 and node2:
[root@node1 ~]# yum install -y heartbeat mysql mysql-server
Then switch the primary role back to node1:
[root@node2 ~]# umount /mnt
[root@node2 ~]# drbdadm secondary db
[root@node1 ~]# drbdadm primary db
[root@node1 ~]# mount /dev/drbd1 /mnt
The following steps are performed on node1 only.
Start MySQL once on node1 so that it initializes its data files:
[root@node1 ~]# service mysqld start
[root@node1 ~]# service mysqld stop
Stopping MySQL: [ OK ]
[root@node1 ~]# cp /var/lib/mysql/* /mnt/ -ar
---------------------------------------------------------------------
Configure heartbeat:
[root@node1 ~]# cp /usr/share/doc/heartbeat-2.1.3/haresources /etc/ha.d/
[root@node1 ~]# cp /usr/share/doc/heartbeat-2.1.3/ha.cf /etc/ha.d/
[root@node1 ~]# cp /usr/share/doc/heartbeat-2.1.3/authkeys /etc/ha.d/
[root@node1 ~]# chmod 600 /etc/ha.d/authkeys
[root@node1 ~]# vi /etc/ha.d/authkeys
Remove the leading # from the following two lines in authkeys:
auth 1
1 crc
Configure ha.cf:
[root@node1 ~]# vi /etc/ha.d/ha.cf
The file looks like this after editing:
[root@node1 ~]# grep -v "^#" /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
warntime 5
initdead 120
ucast eth0 192.168.1.252
auto_failback on
node node1
node node2
ping 192.168.1.1
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
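The four timing values in ha.cf depend on each other: keepalive is the interval between heartbeats, warntime is when a late heartbeat gets logged, deadtime is when the peer is declared dead, and initdead is the more generous limit used right after boot. The sketch below checks that they are ordered sensibly. check_ha_timing is a name I made up, the values are assumed to be whole seconds, and the "initdead at least twice deadtime" rule is how I remember the comments in the sample ha.cf, so treat it as a guideline rather than a hard requirement.

```shell
#!/bin/sh
# Sketch: sanity-check the timing directives of an ha.cf file.
# check_ha_timing is a hypothetical helper; values are assumed to be whole seconds.
check_ha_timing() {
    # $1 = path to ha.cf
    awk '
        $1 == "keepalive" { keepalive = $2 }
        $1 == "warntime"  { warntime  = $2 }
        $1 == "deadtime"  { deadtime  = $2 }
        $1 == "initdead"  { initdead  = $2 }
        END {
            ok = (keepalive + 0 < warntime + 0) &&
                 (warntime + 0 < deadtime + 0) &&
                 (initdead + 0 >= 2 * deadtime)
            exit ok ? 0 : 1
        }
    ' "$1"
}

# Example use: check_ha_timing /etc/ha.d/ha.cf || echo "review the timing values"
```

With the values used here (2, 5, 10 and 120) the check passes.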
Add a script that promotes DRBD and mounts the filesystem, and place it in /etc/ha.d/resource.d:
[root@node1 ~]# vi /etc/ha.d/resource.d/mysqld_umount
#!/bin/sh
unset LC_ALL; export LC_ALL
unset LANGUAGE; export LANGUAGE
prefix=/usr
exec_prefix=/usr
. /etc/ha.d/shellfuncs
case "$1" in
    'start')
        # Promote the local DRBD resources, then mount the replicated device
        # where MySQL expects its data directory.
        #/sbin/drbdadm -- --do-what-I-say primary all
        /sbin/drbdadm primary all
        #drbdsetup /dev/drbd1 primary -o
        /bin/mount /dev/drbd1 /var/lib/mysql
        ;;
    'pre-start')
        ;;
    'post-start')
        ;;
    'stop')
        # Release the filesystem first, then demote so the peer can take over.
        /bin/umount /var/lib/mysql
        /sbin/drbdadm secondary all
        ;;
    'pre-stop')
        ;;
    'post-stop')
        ;;
    *)
        echo "Usage: $0 { start | pre-start | post-start | stop | pre-stop | post-stop }"
        ;;
esac
exit 0
-----------------------------------------------------------------
[root@node1 ~]# chmod +x /etc/ha.d/resource.d/mysqld_umount
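The script above handles only start and stop. If you want it to answer a status request as well, a mount check is one possible basis. The following is a sketch under my own assumptions: is_mounted is a hypothetical helper, and printing "running" or "stopped" is my guess at what a caller would want to see, not something taken from the Heartbeat documentation.

```shell
#!/bin/sh
# Sketch: report whether a mount point is active, for a possible 'status' action.
# is_mounted is a hypothetical helper; the mounts file is a parameter so the
# function can be tried against a copy instead of the live /proc/mounts.
is_mounted() {
    # $1 = mount point, $2 = mounts file (defaults to /proc/mounts)
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit found ? 0 : 1 }' "${2:-/proc/mounts}"
}

# A status branch for the case statement could then look like this:
#   'status')
#       if is_mounted /var/lib/mysql; then echo "running"; else echo "stopped"; fi
#       ;;
```

Comparing the second column exactly avoids false matches on paths that merely start with /var/lib/mysql.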
Configure haresources
Add the following line:
node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld
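The haresources line packs several things into one row: the preferred node, then the resources in the order they are started (they are stopped in the reverse order). The sketch below just splits the line to make that structure visible; describe_haresources is an illustrative name of my own.

```shell
#!/bin/sh
# Sketch: split a haresources line into its parts. Purely illustrative.
describe_haresources() {
    # $1 = one haresources line
    set -- $1                       # word-split on whitespace
    echo "preferred node: $1"
    shift
    for res in "$@"; do
        case "$res" in
            *::*) echo "resource: ${res%%::*} (argument: ${res#*::})" ;;
            *)    echo "resource: $res" ;;
        esac
    done
}

describe_haresources "node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld"
```

The order matters here: the virtual IP comes up first, then mysqld_umount promotes DRBD and mounts /var/lib/mysql, and only then is mysqld started on top of the mounted data directory.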
-------------------------------------------------------------
That completes the configuration. Copy the whole /etc/ha.d directory over to node2:
[root@node1 ~]# scp -r /etc/ha.d 192.168.1.252:/etc
The authenticity of host '192.168.1.252 (192.168.1.252)' can't be established.
RSA key fingerprint is 6f:e8:cd:f9:b6:8d:aa:4a:e3:28:41:63:86:aa:40:66.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.1.252' (RSA) to the list of known hosts.
root@192.168.1.252's password:
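You are prompted for the root password of the other machine. Once the transfer is finished, remember to edit /etc/ha.d/ha.cf on node2 and change ucast eth0 192.168.1.252 to the peer's address, that is, the eth0 IP of node1: ucast eth0 192.168.1.251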
Start the heartbeat service on node1 and watch the log:
[root@node1 ~]# service heartbeat start
Starting High-Availability services:
2010/08/20_11:30:18 INFO: Resource is stopped
[ OK ]
[root@node1 ~]# tail /var/log/ha-log -f
heartbeat[6466]: 2010/08/20_11:30:32 info: Status update for node node2: status up
heartbeat[6466]: 2010/08/20_11:30:32 info: Comm_now_up(): updating status to active
heartbeat[6466]: 2010/08/20_11:30:32 info: Local status now set to: 'active'
heartbeat[6466]: 2010/08/20_11:30:32 info: Starting child client "/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[6479]: 2010/08/20_11:30:33 info: Starting "/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 6479)
harc[6475]: 2010/08/20_11:30:33 info: Running /etc/ha.d/rc.d/status status
heartbeat[6466]: 2010/08/20_11:30:33 info: Status update for node node2: status active
harc[6495]: 2010/08/20_11:30:34 info: Running /etc/ha.d/rc.d/status status
ipfail[6479]: 2010/08/20_11:30:37 info: Status update: Node node2 now has status active
ipfail[6479]: 2010/08/20_11:30:40 info: Asking other side for ping node count.
ipfail[6479]: 2010/08/20_11:30:43 info: No giveup timer to abort.
heartbeat[6466]: 2010/08/20_11:30:43 info: remote resource transition completed.
heartbeat[6466]: 2010/08/20_11:30:43 info: remote resource transition completed.
heartbeat[6466]: 2010/08/20_11:30:43 info: Initial resource acquisition complete (T_RESOURCES(us))
IPaddr[6548]: 2010/08/20_11:30:45 INFO: Resource is stopped
heartbeat[6512]: 2010/08/20_11:30:46 info: Local Resource acquisition completed.
harc[6599]: 2010/08/20_11:30:46 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[6599]: 2010/08/20_11:30:46 received ip-request-resp IPaddr::192.168.1.250/24/eth0:1 OK yes
ResourceManager[6620]: 2010/08/20_11:30:46 info: Acquiring resource group: node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld
IPaddr[6647]: 2010/08/20_11:30:48 INFO: Resource is stopped
ResourceManager[6620]: 2010/08/20_11:30:48 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.250/24/eth0:1 start
IPaddr[6745]: 2010/08/20_11:30:50 INFO: Using calculated netmask for 192.168.1.250: 255.255.255.0
IPaddr[6745]: 2010/08/20_11:30:50 INFO: eval ifconfig eth0:0 192.168.1.250 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[6716]: 2010/08/20_11:30:51 INFO: Success
ResourceManager[6620]: 2010/08/20_11:30:52 info: Running /etc/ha.d/resource.d/mysqld_umount start
ResourceManager[6620]: 2010/08/20_11:30:53 info: Running /etc/init.d/mysqld start
Start the service on node2 and watch the log:
[root@node2 ~]# service heartbeat start
Starting High-Availability services:
2010/08/20_11:30:30 INFO: Resource is stopped
[ OK ]
[root@node2 ~]# tail /var/log/ha-log -f
heartbeat[6149]: 2010/08/20_11:30:33 info: Local status now set to: 'active'
heartbeat[6149]: 2010/08/20_11:30:33 info: Starting child client "/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[6149]: 2010/08/20_11:30:33 info: Status update for node node1: status active
heartbeat[6170]: 2010/08/20_11:30:33 info: Starting "/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 6170)
harc[6178]: 2010/08/20_11:30:34 info: Running /etc/ha.d/rc.d/status status
ipfail[6170]: 2010/08/20_11:30:42 info: Ping node count is balanced.
heartbeat[6149]: 2010/08/20_11:30:43 info: local resource transition completed.
heartbeat[6149]: 2010/08/20_11:30:43 info: Initial resource acquisition complete (T_RESOURCES(us))
heartbeat[6194]: 2010/08/20_11:30:43 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys node2] to acquire.
heartbeat[6149]: 2010/08/20_11:30:43 info: remote resource transition completed.
Check the ports on both machines to see whether MySQL was started by heartbeat:
[root@node1 ~]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 192.168.1.252:7789 192.168.1.251:41577 ESTABLISHED
tcp 0 0 192.168.1.252:48782 192.168.1.251:7789 ESTABLISHED
[root@node2 ~]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.1.251:7789 192.168.1.252:48782 ESTABLISHED
tcp 0 0 192.168.1.251:41577 192.168.1.252:7789 ESTABLISHED
tcp 0 0 :::22 :::* LISTEN
Create a table and insert some rows so that we can verify the data is replicated:
[root@node1 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.77 Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create database db;
Query OK, 1 row affected (0.00 sec)
mysql> use db;
Database changed
mysql> create table t (id int(10),name char(10));
Query OK, 0 rows affected (0.09 sec)
mysql> insert into t values(001,"ganxing"),(002,"boobooke"),(003,"bbk"),(004,"abc");
Query OK, 4 row affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from t;
+------+----------+
| id | name |
+------+----------+
| 1 | ganxing |
| 2 | boobooke |
| 3 | bbk |
| 4 | abc |
+------+----------+
4 rows in set (0.01 sec)
Stop the heartbeat service on node1 and check whether node2 takes over with the data intact:
[root@node1 ~]# service heartbeat stop
[root@node1 ~]# tail /var/log/ha-log -f
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBWRITE process 6470 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBREAD process 6471 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBWRITE process 6472 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBREAD process 6473 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: Core process 6471 exited. 5 remaining
heartbeat[6466]: 2010/08/20_11:36:08 info: Core process 6470 exited. 4 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: Core process 6469 exited. 3 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: Core process 6473 exited. 2 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: Core process 6472 exited. 1 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: node1 Heartbeat shutdown complete.
Look at the takeover log on node2:
[root@node2 ~]# tail /var/log/ha-log -f
heartbeat[6149]: 2010/08/20_11:36:06 info: Received shutdown notice from 'node1'.
heartbeat[6149]: 2010/08/20_11:36:06 info: Resources being acquired from node1.
heartbeat[6208]: 2010/08/20_11:36:06 info: acquire local HA resources (standby).
heartbeat[6208]: 2010/08/20_11:36:07 info: local HA resource acquisition completed (standby).
heartbeat[6149]: 2010/08/20_11:36:07 info: Standby resource acquisition done [all].
heartbeat[6209]: 2010/08/20_11:36:07 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys node2] to acquire.
harc[6234]: 2010/08/20_11:36:08 info: Running /etc/ha.d/rc.d/status status
mach_down[6250]: 2010/08/20_11:36:08 info: Taking over resource group IPaddr::192.168.1.250/24/eth0:1
ResourceManager[6276]: 2010/08/20_11:36:09 info: Acquiring resource group: node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld
IPaddr[6303]: 2010/08/20_11:36:10 INFO: Resource is stopped
ResourceManager[6276]: 2010/08/20_11:36:11 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.250/24/eth0:1 start
IPaddr[6401]: 2010/08/20_11:36:12 INFO: Using calculated netmask for 192.168.1.250: 255.255.255.0
IPaddr[6401]: 2010/08/20_11:36:13 INFO: eval ifconfig eth0:0 192.168.1.250 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[6372]: 2010/08/20_11:36:13 INFO: Success
ResourceManager[6276]: 2010/08/20_11:36:14 info: Running /etc/ha.d/resource.d/mysqld_umount start
ResourceManager[6276]: 2010/08/20_11:36:15 info: Running /etc/init.d/mysqld start
heartbeat[6149]: 2010/08/20_11:36:18 WARN: node node1: is dead
heartbeat[6149]: 2010/08/20_11:36:18 info: Dead node node1 gave up resources.
heartbeat[6149]: 2010/08/20_11:36:18 info: Link node1:eth0 dead.
ipfail[6170]: 2010/08/20_11:36:18 info: Status update: Node node1 now has status dead
mach_down[6250]: 2010/08/20_11:36:19 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
heartbeat[6149]: 2010/08/20_11:36:19 info: mach_down takeover complete.
mach_down[6250]: 2010/08/20_11:36:19 info: mach_down takeover complete for node node1.
ipfail[6170]: 2010/08/20_11:36:19 info: NS: We are still alive!
ipfail[6170]: 2010/08/20_11:36:19 info: Link Status update: Link node1/eth0 now has status dead
ipfail[6170]: 2010/08/20_11:36:20 info: Asking other side for ping node count.
ipfail[6170]: 2010/08/20_11:36:20 info: Checking remote count of ping nodes.
Now verify that the data was replicated:
[root@node2 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.77 Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> use db
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from t;
+------+----------+
| id | name |
+------+----------+
| 1 | ganxing |
| 2 | boobooke |
| 3 | bbk |
| 4 | abc |
+------+----------+
4 rows in set (0.00 sec)
Service startup order
Let the system load drbd automatically at boot.
Start heartbeat from /etc/rc.local.
Do not start mysqld automatically at boot; heartbeat starts it on whichever node holds the resources.
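On CentOS 5 the startup order above could be set up roughly as follows. This is a sketch under my own assumptions: boot_setup is a hypothetical name, chkconfig is used for the init scripts, and the function defaults to a dry run that only prints what it would do. Set DRY_RUN=0 to apply it, on both nodes.

```shell
#!/bin/sh
# Sketch of the boot-time setup on CentOS 5, as a dry run by default.
# boot_setup is a hypothetical name; set DRY_RUN=0 to actually apply it.
boot_setup() {
    for cmd in "chkconfig drbd on" "chkconfig mysqld off"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then echo "$cmd"; else $cmd; fi
    done
    line="service heartbeat start"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "append to /etc/rc.local: $line"
    else
        # Add the line only if it is not already present.
        grep -qxF "$line" /etc/rc.local || echo "$line" >> /etc/rc.local
    fi
}

# Example use: boot_setup            (prints the planned changes)
#              DRY_RUN=0 boot_setup  (applies them, run as root)
```

Starting heartbeat from rc.local means it runs after the regular init scripts, so the drbd service is already up by the time heartbeat tries to promote the resource.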
DRBD[/b]是由内核模块和相关脚本而构成,用以构建高可用性的集群。其实现方式是通过网络来镜像整个设备。您可以把它看作是一种网络RAID1。
Drbd[/b] 负责接收数据,把数据写到本地磁盘,然后发送给另一个主机。另一个主机再将数据存到自己的磁盘中。其他所需的组件有集群成员服务,如TurboHA 或 心跳连接,以及一些能在块设备上运行的应用程序
二、搭建实验环境
1, 两台CentOS5.5机器,没有环境的可以虚拟机搭建
CentOS 1
IP:192.168.1.251
CentOS 2
IP:192.168.1.252
2,修改两机的/etc/hosts文件,两个都改成一样[root@node1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 CentOS localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.1.251 node1
192.168.1.252 node2
复制代码修改完成之后,重启,使得机器名分别是node1 node2
如果没有修改成功,请查看别的文件是否还有设定HOSTNAME选项,如/etc/sysconfig/network文件等。
在node1上ping node2;在node2上ping node1是相互能ping通的
三、安装DRBD[/b]
1、需要安装DRBD[/b]及DRBD[/b]内核模块,如果机器联网,要以用YUM来安装[root@node1 ~]# yum search drbd[/b]
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* addons: centos.candishosting.com.cn
* base: centos.ustc.edu.cn
* extras: centos.candishosting.com.cn
* updates: centos.ustc.edu.cn
================================ Matched: drbd[/b] =================================
drbd[/b].i386 : Distributed Redundant Block Device driver for Linux[/b]
drbd[/b]82.i386 : Distributed Redundant Block Device driver for Linux[/b]
drbd[/b]83.i386 : Distributed Redundant Block Device driver for Linux[/b]
kmod-drbd[/b].i686 : drbd[/b] kernel module(s)
kmod-drbd[/b]-PAE.i686 : drbd[/b] kernel module(s)
kmod-drbd[/b]-xen.i686 : drbd[/b] kernel module(s)
kmod-drbd[/b]82.i686 : drbd[/b]82 kernel module(s)
kmod-drbd[/b]82-PAE.i686 : drbd[/b]82 kernel module(s)
kmod-drbd[/b]82-xen.i686 : drbd[/b]82 kernel module(s)
kmod-drbd[/b]83.i686 : drbd[/b]83 kernel module(s)
kmod-drbd[/b]83-PAE.i686 : drbd[/b]83 kernel module(s)
kmod-drbd[/b]83-xen.i686 : drbd[/b]83 kernel module(s)
复制代码可以看到列出相关的包,安装我们相要的,两台都要安装[root@node1 ~]# yum install -y drbd[/b]83 kmod-drbd[/b]83
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* addons: centos.candishosting.com.cn
* base: centos.ustc.edu.cn
* extras: centos.candishosting.com.cn
* updates: centos.ustc.edu.cn
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package drbd[/b]83.i386 0:8.3.8-1.el5.centos set to be updated
---> Package kmod-drbd[/b]83.i686 0:8.3.8-1.el5.centos set to be installed
--> Finished Dependency Resolution
...............略
复制代码[root@node1 ~]# lsmod | grep drbd[/b]
drbd[/b] 228528 3
复制代码可以看到DRBD[/b]模块已加载成功
2、改写DRBD[/b]的配置文件
分别是两个机器里生成drbd[/b]的配置文件/etc/drbd[/b].conf,文件内容是相同的[root@node1 ~]# vi /etc/drbd[/b].conf
#
# please have a a look at the example configuration file in
# /usr/share/doc/drbd[/b]83/drbd[/b].conf
global { usage-count yes; }
common { syncer { rate 10M; } }
resource db {
protocol C;
net {
}
on node1 {
device /dev/drbd[/b]1;
disk /dev/sdb1;
address 192.168.1.251:7789;
meta-disk internal;
}
on node2 {
device /dev/drbd[/b]1;
disk /dev/sdb1;
address 192.168.1.252:7789;
meta-disk internal;
}
}
复制代码3、建立drbd[/b]设备文件:for i in $(seq 0 15) ; do mknod /dev/drbd[/b]$i b 147 $i ; done
复制代码初始化meta-data area:[root@node1 ~]# drbdadm create-md db
md_offset 21467942912
al_offset 21467910144
bm_offset 21467254784
Found ext3 filesystem
20964792 kB data area apparently used
20964116 kB left usable by current configuration
Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
* use external meta data (recommended)
* shrink that filesystem first
* zero out the device (destroy the filesystem)
Operation refused.
Command 'drbdmeta 1 v08 /dev/sdb1 internal create-md' terminated with exit code 40
drbdadm create-md db: exited with code 40
复制代码可以看到,出现了错误
原因you created your filesystem before you created your DRBD[/b] resource, or
you created your filesystem on your backing device, rather than your DRBD[/b],
neither of which is a problem by itself, except – as the error message tries to hint – you need to enlarge the device (e.g.
lvextend), shrink the filesystem (e.g. resize2fs), or place the DRBD[/b] metadata somewhere else (external meta data).
DRBD[/b] tries to detect an existing use of the block device in question. E.g. if it detects an existing file system that uses
all the available space (as is default for most filesystems), and you try to use DRBD[/b] with internal meta data, there is no
room for the internal meata data – creating that would corrupt the last few MiB of the existing file system.
If re-creating the filesystem on the DRBD[/b] is an option, one way to “zero out the device (destroy the filesystem)”, and then
recreate it on the DRBD[/b] is
解决办法:初始化磁盘文件格式
dd if=/dev/zero bs=1M count=1 of=/dev/sdXYZ; sync
drbdadm create-md $r
drbdadm -- -o primary $r
mkfs /dev/drbdY
复制代码有了方法,我们来解决
------------------------------------------------------------------[root@node1 ~]# dd if=/dev/zero bs=1M count=1 of=/dev/sdb1;sync
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0570659 seconds, 18.4 MB/s
再次来初始化meta-data area
[root@node1 ~]# drbdadm create-md db
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd[/b] meta data block successfully created.
success
复制代码可以看到success,表示成功。以上操作,在两机都要做
-----------------------------------------------------------------
现在我们可以启动DRBD[/b]了,分别在两台主机上执行,以下三种启动方法,都可以[root@node1 ~]# /etc/init.d/drbd[/b] start
[root@node1 ~]# service drbd[/b] start
[root@node1 ~]# drbdadm all up
复制代码查下启动后的端口,在drbd[/b].conf中有指出
netstat -ant的输出结果里有一行[root@node1 ~]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.1.251:7789 192.168.1.252:52601 ESTABLISHED
复制代码3、分别在两机看drbd[/b]当前的状态[root@node1 ~]# cat /proc/drbd[/b]
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:20964116
[root@node2 ~]# cat /proc/drbd[/b]
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
0: cs:Unconfigured
1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:20964116
复制代码"/proc/drbd[/b]"中显示了drbd[/b]当前的状态.ro:Secondary表示两台主机的状态,都是"备机"状态.
ds是磁盘状态,都是"不一致"状态.
这是由于,DRBD[/b]无法判断哪一方为主机,以哪一方的磁盘数据作为标准数据.所以,我们需要初始化
4、一个主机.在node1上执行:[root@node1 ~]#drbdsetup /dev/drbd[/b]1 primary -o
复制代码分别再次在两机看drbd[/b]当前的状态[root@node1 ~]# cat /proc/drbd[/b]
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
ns:131140 nr:0 dw:0 dr:132096 al:0 bm:7 lo:0 pe:16 ua:30 ap:0 ep:1 wo:b oos:20833460
[>....................] sync'ed: 0.7% (20344/20472)M delay_probe: 16
finish: 0:49:08 speed: 7,040 (6,876) K/sec
复制代码主备机状态分别是"主/备",主机磁盘状态是"实时",备机状态是"不一致".
在第3行,可以看到数据正在同步中,即主机正在将磁盘上的数据,传递到备机上.现在的进度是[>...................] sync'ed: 0.7%
(20344/20472)M
数据同步完成之后,再次分别看两机的状态[root@node1 ~]# cat /proc/drbd[/b]
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@node2 ~]# cat /proc/drbd[/b]
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
复制代码磁盘状态都是"实时",表示数据同步完成了.
你现在可以把主机上的DRBD[/b]设备挂载到一个目录上进行使用.备机的DRBD[/b]设备无法被挂载,因为它是
用来接收主机数据的,由DRBD[/b]负责操作.[root@node1 ~]# mkfs.ext3 /dev/drbd[/b]1
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux[/b]
Block size=4096 (log=2)
Fragment size=4096 (log=2)
2621440 inodes, 5241029 blocks
262051 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
160 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000
Writing inode tables: 45/160
[root@node1 ~]# mount /dev/drbd[/b]1 /mnt
[root@node1 ~]# touch /mnt/test.txt
[root@node1 ~]# ls /mnt/
lost+found test.txt
复制代码在主机node1上产生的文件test.txt,也完整的保存在备机node2的DRBD[/b]分区上.
这就是DRBD[/b]的网络RAID-1功能. 在主机上的任何操作,都会被同步到备机的相应磁盘分区上,达到数据备份的效果.
DRBD[/b]的主备机切换有时,你需要将DRBD[/b]的主备机互换一下.可以执行下面的操作:
在主机上,先要卸载掉DRBD[/b]设备[root@node1 ~]# umount /mnt
[root@node1 ~]# drbdadm secondary db
[root@node1 ~]# cat /proc/drbd[/b]
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r----
ns:464580 nr:0 dw:464580 dr:165 al:368 bm:123 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
复制代码现在,两台主机都是"备机".
在备机node2上,将它升级为"主机".[root@node2 ~]# drbdadm primary db
[root@node2 ~]# cat /proc/drbd[/b]
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
ns:0 nr:464580 dw:464580 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@node2 ~]# mount /dev/drbd[/b]1 /mnt
[root@node2 ~]# ls /mnt/
lost+found test.txt
复制代码现在node2成为主机了
DRBD[/b]相关切换命令
DRBD[/b]切换drbdadm secondary r0 //把主机切换成备机
drbdadm primary r0 //把备机切换成主机
复制代码注意,在我测试的情况。DRBD[/b]只有主机可以读写,备机不能够挂载,也就是说不能够读。可能是我的drbd[/b].conf文件配置方面的原因
上面的DRBD[/b]已经成功配置完成,可以正常切换,现在我们让两台成为高可用,所谓高可用,就是双机热备,互备。这里我不解释太多,相应搞过HA的人都清楚
四、安装heartbeat mysql mysql-server
1、主机node1和node2上安装heartbeat mysql mysql-server[root@node1 ~]# yum install -y heartbeat mysql mysql-server
[root@node2 ~]# umount /mnt
[root@node2 ~]# drbdadm secondary db
[root@node1 ~]# drbdadm primary db
[root@node1 ~]# mount /dev/drbd[/b]1 /mnt
复制代码以下只在node1上进行操作
在node1上启动mysql初始化数据库数据文件[root@node1 ~]# service mysqld start
[root@node1 ~]# service mysqld stop
Stopping MySQL: [ OK ]
[root@node1 ~]# cp /var/lib/mysql/* /mnt/ -ar
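After the copy it is worth confirming that the files on the DRBD device match the originals. The following is a rough sketch that compares two directory trees by checksum; tree_sum and same_tree are hypothetical helper names, and the demo runs on throwaway directories rather than on the real data directory.

```shell
#!/bin/sh
# Sketch: confirm that two directory trees hold the same files with the same contents.
# tree_sum and same_tree are hypothetical helper names.

# Print "checksum size path" for every regular file under a directory, sorted by path.
tree_sum() {
    (cd "$1" && find . -type f -exec cksum {} + | sort -k3)
}

# Succeed when both trees produce identical checksum listings.
same_tree() {
    [ "$(tree_sum "$1")" = "$(tree_sum "$2")" ]
}

# Demo on throwaway directories; on node1 you would compare /var/lib/mysql with /mnt.
src=$(mktemp -d) && dst=$(mktemp -d)
echo "demo" > "$src/ibdata1"
cp -a "$src"/. "$dst"/
if same_tree "$src" "$dst"; then echo "trees match"; else echo "trees differ"; fi
rm -rf "$src" "$dst"
```

Note that /mnt in this walkthrough also holds test.txt from the earlier DRBD test, so a comparison against it will report a difference until that file is removed.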
---------------------------------------------------------------------
Configure heartbeat. Copy the sample configuration files into /etc/ha.d:
[root@node1 ~]# cp /usr/share/doc/heartbeat-2.1.3/haresources /etc/ha.d/
[root@node1 ~]# cp /usr/share/doc/heartbeat-2.1.3/ha.cf /etc/ha.d/
[root@node1 ~]# cp /usr/share/doc/heartbeat-2.1.3/authkeys /etc/ha.d/
[root@node1 ~]# chmod 600 /etc/ha.d/authkeys
[root@node1 ~]# vi /etc/ha.d/authkeys
In authkeys, remove the leading # from the following two lines:
auth 1
1 crc
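The crc method only detects corrupted packets and does not authenticate the peer, which is acceptable on an isolated lab network. As an alternative to what this article uses, heartbeat also accepts sha1 with a shared key. The following is a small sketch that prints such a file; make_authkeys is a hypothetical helper name and the key length is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: print an authkeys file that uses sha1 with a random shared key.
# make_authkeys is a hypothetical helper name.
make_authkeys() {
    # 16 random bytes rendered as 32 hex characters.
    key=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
    printf 'auth 1\n1 sha1 %s\n' "$key"
}

# Print to stdout for inspection.
make_authkeys
```

To use it, redirect the output to /etc/ha.d/authkeys, keep the mode at 600, and copy the identical file to the other node, since both sides must share the same key.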
Configure ha.cf:
[root@node1 ~]# vi /etc/ha.d/ha.cf
The resulting file, with comment lines filtered out:
[root@node1 ~]# grep -v "^#" /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
warntime 5
initdead 120
ucast eth0 192.168.1.252
auto_failback on
node node1
node node2
ping 192.168.1.1
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
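The timing directives only make sense in a certain order: the heartbeat interval (keepalive) should be shorter than the late-heartbeat warning (warntime), which should be shorter than the time after which a node is declared dead (deadtime). The following sketch checks that ordering as a rule of thumb; hacf_value and check_timings are hypothetical helper names, and plain integer seconds are assumed.

```shell
#!/bin/sh
# Sketch: sanity-check the timing directives of an ha.cf file.
# hacf_value and check_timings are hypothetical helper names.

# Print the first value of a directive, ignoring comment lines.
hacf_value() {
    awk -v d="$1" '$1 == d { print $2; exit }' "$2"
}

# Succeed when keepalive < warntime < deadtime (plain integer seconds assumed).
check_timings() {
    k=$(hacf_value keepalive "$1")
    w=$(hacf_value warntime "$1")
    d=$(hacf_value deadtime "$1")
    [ "$k" -lt "$w" ] && [ "$w" -lt "$d" ]
}

# Demo with the values used in this article.
conf=$(mktemp)
cat > "$conf" <<'EOF'
keepalive 2
deadtime 10
warntime 5
initdead 120
EOF
if check_timings "$conf"; then echo "timings look consistent"; else echo "check the timings"; fi
rm -f "$conf"
```

On a real node you would call `check_timings /etc/ha.d/ha.cf` before starting heartbeat.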
Add a script that promotes DRBD to primary and mounts the file system, and place it under /etc/ha.d/resource.d:
[root@node1 ~]# vi /etc/ha.d/resource.d/mysqld_umount
#!/bin/sh
unset LC_ALL; export LC_ALL
unset LANGUAGE; export LANGUAGE
prefix=/usr
exec_prefix=/usr
. /etc/ha.d/shellfuncs
case "$1" in
'start')
#/sbin/drbdadm -- --do-what-I-say primary all
/sbin/drbdadm primary all
#drbdsetup /dev/drbd1 primary -o
/bin/mount /dev/drbd1 /var/lib/mysql
;;
'pre-start')
;;
'post-start')
;;
'stop')
/bin/umount /var/lib/mysql
/sbin/drbdadm secondary all
;;
'pre-stop')
;;
'post-stop')
;;
*)
echo "Usage: $0 { start | pre-start | post-start | stop | pre-stop | post-stop }"
;;
esac
exit 0
-----------------------------------------------------------------
[root@node1 ~]# chmod +x /etc/ha.d/resource.d/mysqld_umount
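The script above handles start and stop only. My understanding, which you should treat as an assumption, is that heartbeat in haresources mode may also call resource scripts with status and looks for wording such as "running" in the reply. The following sketch shows one way a status check could be written by looking up the mount point in the mounts table; is_mounted and report_status are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: a status check that a resource script like mysqld_umount could offer.
# is_mounted and report_status are hypothetical helper names.

# Succeed when the mount point appears in the mounts table (default: /proc/mounts).
is_mounted() {
    mountpoint=$1
    table=${2:-/proc/mounts}
    [ -r "$table" ] || return 1
    # Field 2 of each line in the mounts table is the mount point.
    awk -v m="$mountpoint" '$2 == m { found = 1 } END { exit !found }' "$table"
}

# Report in the "running"/"stopped" wording used by init-style scripts.
report_status() {
    if is_mounted "$1" "$2"; then echo "running"; else echo "stopped"; fi
}

report_status /var/lib/mysql
```

Inside the case statement this would become a 'status' branch that calls `report_status /var/lib/mysql`.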
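Configure haresources
Add the following line. node1 is the preferred owner, and the resources are started from left to right (virtual IP, then DRBD promotion and mount, then mysqld) and stopped in the reverse order:
node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld
-------------------------------------------------------------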
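Because the order of the words in the haresources line decides the start order, it can be handy to print it out before trusting it. The following is a tiny sketch that splits such a line into the preferred node and its resources; start_order is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: show the order in which the resources of one haresources line are started.
# start_order is a hypothetical helper name.
start_order() {
    # Word 1 is the preferred node; the remaining words are resources, started left to right.
    set -- $1
    node=$1
    shift
    n=1
    for res in "$@"; do
        echo "$n. $res (preferred node: $node)"
        n=$((n + 1))
    done
}

start_order 'node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld'
```

For this setup the important point is that mysqld comes after mysqld_umount, so the data directory is mounted before the database starts.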
That completes the configuration on node1. Copy the whole /etc/ha.d directory over to node2:
[root@node1 ~]# scp -r /etc/ha.d 192.168.1.252:/etc
The authenticity of host '192.168.1.252 (192.168.1.252)' can't be established.
RSA key fingerprint is 6f:e8:cd:f9:b6:8d:aa:4a:e3:28:41:63:86:aa:40:66.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.1.252' (RSA) to the list of known hosts.
root@192.168.1.252's password:
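You are prompted for the remote root password. Once the transfer finishes, edit /etc/ha.d/ha.cf on node2 and change the line ucast eth0 192.168.1.252 so that it points at the peer, that is, node1's eth0 address: ucast eth0 192.168.1.251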
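The ucast change on node2 can also be scripted instead of edited by hand. The following is a minimal sketch using sed; set_ucast_peer is a hypothetical helper name, and the interface name passed in is whatever node2 actually uses to reach node1.

```shell
#!/bin/sh
# Sketch: point the ucast directive of an ha.cf file at a new peer address.
# set_ucast_peer is a hypothetical helper name.
set_ucast_peer() {
    file=$1
    iface=$2
    peer=$3
    tmp=$(mktemp) || return 1
    # Rewrite only lines that start with "ucast"; leave everything else untouched.
    sed "s/^ucast[[:space:]].*/ucast $iface $peer/" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demo on a scratch file; on node2 the file would be /etc/ha.d/ha.cf.
conf=$(mktemp)
printf 'deadtime 10\nucast eth0 192.168.1.252\nnode node1\n' > "$conf"
set_ucast_peer "$conf" eth0 192.168.1.251
grep '^ucast' "$conf"
rm -f "$conf"
```

Writing through a temporary file and then back into the original keeps the file's existing permissions.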
Start the heartbeat service on node1 and watch the log:
[root@node1 ~]# service heartbeat start
Starting High-Availability services:
2010/08/20_11:30:18 INFO: Resource is stopped
[ OK ]
[root@node1 ~]# tail /var/log/ha-log -f
heartbeat[6466]: 2010/08/20_11:30:32 info: Status update for node node2: status up
heartbeat[6466]: 2010/08/20_11:30:32 info: Comm_now_up(): updating status to active
heartbeat[6466]: 2010/08/20_11:30:32 info: Local status now set to: 'active'
heartbeat[6466]: 2010/08/20_11:30:32 info: Starting child client "/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[6479]: 2010/08/20_11:30:33 info: Starting "/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 6479)
harc[6475]: 2010/08/20_11:30:33 info: Running /etc/ha.d/rc.d/status status
heartbeat[6466]: 2010/08/20_11:30:33 info: Status update for node node2: status active
harc[6495]: 2010/08/20_11:30:34 info: Running /etc/ha.d/rc.d/status status
ipfail[6479]: 2010/08/20_11:30:37 info: Status update: Node node2 now has status active
ipfail[6479]: 2010/08/20_11:30:40 info: Asking other side for ping node count.
ipfail[6479]: 2010/08/20_11:30:43 info: No giveup timer to abort.
heartbeat[6466]: 2010/08/20_11:30:43 info: remote resource transition completed.
heartbeat[6466]: 2010/08/20_11:30:43 info: remote resource transition completed.
heartbeat[6466]: 2010/08/20_11:30:43 info: Initial resource acquisition complete (T_RESOURCES(us))
IPaddr[6548]: 2010/08/20_11:30:45 INFO: Resource is stopped
heartbeat[6512]: 2010/08/20_11:30:46 info: Local Resource acquisition completed.
harc[6599]: 2010/08/20_11:30:46 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[6599]: 2010/08/20_11:30:46 received ip-request-resp IPaddr::192.168.1.250/24/eth0:1 OK yes
ResourceManager[6620]: 2010/08/20_11:30:46 info: Acquiring resource group: node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld
IPaddr[6647]: 2010/08/20_11:30:48 INFO: Resource is stopped
ResourceManager[6620]: 2010/08/20_11:30:48 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.250/24/eth0:1 start
IPaddr[6745]: 2010/08/20_11:30:50 INFO: Using calculated netmask for 192.168.1.250: 255.255.255.0
IPaddr[6745]: 2010/08/20_11:30:50 INFO: eval ifconfig eth0:0 192.168.1.250 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[6716]: 2010/08/20_11:30:51 INFO: Success
ResourceManager[6620]: 2010/08/20_11:30:52 info: Running /etc/ha.d/resource.d/mysqld_umount start
ResourceManager[6620]: 2010/08/20_11:30:53 info: Running /etc/init.d/mysqld start
Start the service on node2 and watch the log:
[root@node2 ~]# service heartbeat start
Starting High-Availability services:
2010/08/20_11:30:30 INFO: Resource is stopped
[ OK ]
[root@node2 ~]# tail /var/log/ha-log -f
heartbeat[6149]: 2010/08/20_11:30:33 info: Local status now set to: 'active'
heartbeat[6149]: 2010/08/20_11:30:33 info: Starting child client "/usr/lib/heartbeat/ipfail" (498,496)
heartbeat[6149]: 2010/08/20_11:30:33 info: Status update for node node1: status active
heartbeat[6170]: 2010/08/20_11:30:33 info: Starting "/usr/lib/heartbeat/ipfail" as uid 498 gid 496 (pid 6170)
harc[6178]: 2010/08/20_11:30:34 info: Running /etc/ha.d/rc.d/status status
ipfail[6170]: 2010/08/20_11:30:42 info: Ping node count is balanced.
heartbeat[6149]: 2010/08/20_11:30:43 info: local resource transition completed.
heartbeat[6149]: 2010/08/20_11:30:43 info: Initial resource acquisition complete (T_RESOURCES(us))
heartbeat[6194]: 2010/08/20_11:30:43 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys node2] to acquire.
heartbeat[6149]: 2010/08/20_11:30:43 info: remote resource transition completed.
Check the ports on both machines to see whether MySQL was started by heartbeat:
[root@node1 ~]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 192.168.1.252:7789 192.168.1.251:41577 ESTABLISHED
tcp 0 0 192.168.1.252:48782 192.168.1.251:7789 ESTABLISHED
[root@node2 ~]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.1.251:7789 192.168.1.252:48782 ESTABLISHED
tcp 0 0 192.168.1.251:41577 192.168.1.252:7789 ESTABLISHED
tcp 0 0 :::22 :::* LISTEN
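The same check can be automated by filtering the netstat output for a listening socket on port 3306. The following is a short sketch; port_listening is a hypothetical helper name, and it assumes the column layout of `netstat -ant` shown in this article.

```shell
#!/bin/sh
# Sketch: decide from `netstat -ant` output whether a TCP port is in LISTEN state.
# port_listening is a hypothetical helper; it reads netstat output on stdin.
port_listening() {
    # Column 4 is the local address, column 6 the state.
    awk -v p="$1" '$6 == "LISTEN" && $4 ~ (":" p "$") { found = 1 } END { exit !found }'
}

# Two sample lines in the same layout as the output in this article.
sample='tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 :::22 :::* LISTEN'
if printf '%s\n' "$sample" | port_listening 3306; then echo "mysqld is listening"; fi
```

On a node you would run `netstat -ant | port_listening 3306`; only the node that currently holds the resources should succeed.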
Create a table and insert some rows so that replication can be verified:
[root@node1 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.77 Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create database db;
Query OK, 1 row affected (0.00 sec)
mysql> use db;
Database changed
mysql> create table t (id int(10),name char(10));
Query OK, 0 rows affected (0.09 sec)
mysql> insert into t values(001,"ganxing"),(002,"boobooke"),(003,"bbk"),(004,"abc");
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from t;
+------+----------+
| id | name |
+------+----------+
| 1 | ganxing |
| 2 | boobooke |
| 3 | bbk |
| 4 | abc |
+------+----------+
4 rows in set (0.01 sec)
Stop the heartbeat service on node1 and check whether node2 takes over with the data intact:
[root@node1 ~]# service heartbeat stop
[root@node1 ~]# tail /var/log/ha-log -f
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBWRITE process 6470 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBREAD process 6471 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBWRITE process 6472 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: killing HBREAD process 6473 with signal 15
heartbeat[6466]: 2010/08/20_11:36:08 info: Core process 6471 exited. 5 remaining
heartbeat[6466]: 2010/08/20_11:36:08 info: Core process 6470 exited. 4 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: Core process 6469 exited. 3 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: Core process 6473 exited. 2 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: Core process 6472 exited. 1 remaining
heartbeat[6466]: 2010/08/20_11:36:09 info: node1 Heartbeat shutdown complete.
Check the takeover log on node2:
[root@node2 ~]# tail /var/log/ha-log -f
heartbeat[6149]: 2010/08/20_11:36:06 info: Received shutdown notice from 'node1'.
heartbeat[6149]: 2010/08/20_11:36:06 info: Resources being acquired from node1.
heartbeat[6208]: 2010/08/20_11:36:06 info: acquire local HA resources (standby).
heartbeat[6208]: 2010/08/20_11:36:07 info: local HA resource acquisition completed (standby).
heartbeat[6149]: 2010/08/20_11:36:07 info: Standby resource acquisition done [all].
heartbeat[6209]: 2010/08/20_11:36:07 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys node2] to acquire.
harc[6234]: 2010/08/20_11:36:08 info: Running /etc/ha.d/rc.d/status status
mach_down[6250]: 2010/08/20_11:36:08 info: Taking over resource group IPaddr::192.168.1.250/24/eth0:1
ResourceManager[6276]: 2010/08/20_11:36:09 info: Acquiring resource group: node1 IPaddr::192.168.1.250/24/eth0:1 mysqld_umount mysqld
IPaddr[6303]: 2010/08/20_11:36:10 INFO: Resource is stopped
ResourceManager[6276]: 2010/08/20_11:36:11 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.250/24/eth0:1 start
IPaddr[6401]: 2010/08/20_11:36:12 INFO: Using calculated netmask for 192.168.1.250: 255.255.255.0
IPaddr[6401]: 2010/08/20_11:36:13 INFO: eval ifconfig eth0:0 192.168.1.250 netmask 255.255.255.0 broadcast 192.168.1.255
IPaddr[6372]: 2010/08/20_11:36:13 INFO: Success
ResourceManager[6276]: 2010/08/20_11:36:14 info: Running /etc/ha.d/resource.d/mysqld_umount start
ResourceManager[6276]: 2010/08/20_11:36:15 info: Running /etc/init.d/mysqld start
heartbeat[6149]: 2010/08/20_11:36:18 WARN: node node1: is dead
heartbeat[6149]: 2010/08/20_11:36:18 info: Dead node node1 gave up resources.
heartbeat[6149]: 2010/08/20_11:36:18 info: Link node1:eth0 dead.
ipfail[6170]: 2010/08/20_11:36:18 info: Status update: Node node1 now has status dead
mach_down[6250]: 2010/08/20_11:36:19 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
heartbeat[6149]: 2010/08/20_11:36:19 info: mach_down takeover complete.
mach_down[6250]: 2010/08/20_11:36:19 info: mach_down takeover complete for node node1.
ipfail[6170]: 2010/08/20_11:36:19 info: NS: We are still alive!
ipfail[6170]: 2010/08/20_11:36:19 info: Link Status update: Link node1/eth0 now has status dead
ipfail[6170]: 2010/08/20_11:36:20 info: Asking other side for ping node count.
ipfail[6170]: 2010/08/20_11:36:20 info: Checking remote count of ping nodes.
Verify the data on node2:
[root@node2 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.77 Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> use db
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from t;
+------+----------+
| id | name |
+------+----------+
| 1 | ganxing |
| 2 | boobooke |
| 3 | bbk |
| 4 | abc |
+------+----------+
4 rows in set (0.00 sec)
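During a failover test the service is unavailable for a few seconds while the resources move, so a check that retries is more useful than a single attempt. The following is a minimal sketch of such a retry loop; wait_for is a hypothetical helper name, and the one-second interval is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: retry a command once per second until it succeeds or the attempts run out.
# wait_for is a hypothetical helper name.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Demo with a command that always succeeds.
wait_for 3 true && echo "service answered"
```

From a client machine you could run something like `wait_for 30 mysqladmin -h 192.168.1.250 ping` right after stopping heartbeat on node1, assuming an account that is allowed to connect through the virtual IP.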
Service startup order
Let the system load drbd automatically at boot.
Start heartbeat from /etc/rc.local.
Do not start mysqld automatically at boot; heartbeat starts it on the active node.
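The startup order described above can be expressed as a few commands. The following is a sketch for CentOS 5 that only prints the commands unless DRY_RUN=0 is set; run and boot_setup are hypothetical helper names, and the exact line appended to /etc/rc.local is an assumption about how you want heartbeat started.

```shell
#!/bin/sh
# Sketch: apply the boot-time settings described above, in dry-run mode by default.
# run is a hypothetical wrapper: it prints each command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

boot_setup() {
    run chkconfig drbd on      # load drbd automatically at boot
    run chkconfig mysqld off   # heartbeat starts mysqld, not init
    # Assumed rc.local entry that starts heartbeat late in the boot sequence.
    run sh -c 'echo "service heartbeat start" >> /etc/rc.local'
}

boot_setup
```

Review the printed commands first, then rerun with `DRY_RUN=0` as root on both nodes.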