
Linux High-Availability (HA) Clusters: Resource Management with heartbeat and crm

2017-09-18 14:45

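I. High-availability clusters: resource management with heartbeat and crm

1. Cluster working models
    A/P: two nodes working in an active/passive (primary/standby) model
    N-M (N > M): N nodes, M services
    N-N: N nodes, N services
    A/A: dual-active (active/active) model

2. How resources are moved between nodes
    rgmanager: failover domain, priority
    pacemaker:
        Resource stickiness: how strongly a resource prefers to stay on the node where it is currently running
        Resource constraints (three types):
            Location constraint: which node a resource prefers to run on
                inf: positive infinity
                n: a positive preference
                -n: a negative preference
                -inf: negative infinity
            Colocation constraint: how strongly resources tend to run on the same node
                inf: the resources must run together
                -inf: the resources must never run together
            Order constraint: the order in which resources are started and stopped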
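The score values above are combined when several constraints apply to the same resource and node. The following is a minimal sketch of that arithmetic, assuming the rules documented for Pacemaker (any finite value plus inf is inf, and -inf wins over everything, including inf). `add_scores` is a hypothetical helper written for illustration; it is not a heartbeat or Pacemaker command.

```shell
#!/bin/sh
# Hypothetical helper: combine two constraint scores.
# Each score is an integer or the literal string "inf" / "-inf".
# Assumption: Pacemaker-style rules, where -inf always wins.
add_scores() {
    a=$1
    b=$2
    if [ "$a" = "-inf" ] || [ "$b" = "-inf" ]; then
        # a mandatory "never" overrides everything, even inf
        printf '%s\n' "-inf"
    elif [ "$a" = "inf" ] || [ "$b" = "inf" ]; then
        printf '%s\n' "inf"
    else
        printf '%s\n' "$((a + b))"
    fi
}

add_scores 100 50      # prints 150
add_scores 100 -inf    # prints -inf
add_scores inf -inf    # prints -inf
```

This is why a single -inf location constraint is enough to keep a resource off a node, no matter how large its stickiness is.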
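3. How to keep the three resources of the web service (VIP, httpd, and filesystem) on the same node
    1. Colocation constraints
    2. A resource group

6. Resource agent (RA) types
    heartbeat legacy
    LSB
    OCF
    STONITH

7. Resource types
    primitive (native): a primary resource, which can run on only one node
    group: a resource group
    clone: a cloned resource, defined by the total number of clones and the maximum number of clones that may run on each node; typical uses are stonith and cluster filesystems
    master/slave: a master/slave resource

8. Distributed locks:

9. Graphical configuration: set "crm on" in ha.cf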

/usr/lib64/heartbeat/ha_propagate copies the configuration files to the other nodes.

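10. Installing the GUI
    heartbeat v2 uses crm as its cluster resource manager; this requires adding "crm on" to ha.cf.
    crm listens on 5560/tcp through the mgmtd process.
    On the host where hb_gui will be started, set a password for the hacluster user, then start the GUI with hb_gui.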

with quorum: the partition holds the required number of votes
without quorum: the partition does not hold the required number of votes
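The two states above come down to a vote count. As a minimal sketch, assuming the general strict-majority rule (a partition has quorum when it holds more than half of all votes), the check could look like this. `quorum_state` is a hypothetical helper, and heartbeat's own handling of two-node clusters may differ from this plain majority rule.

```shell
#!/bin/sh
# Hypothetical helper: report the quorum state of a partition.
# $1 = total votes in the cluster, $2 = votes held by this partition
# Assumption: plain strict majority, with no tie-breaker.
quorum_state() {
    total=$1
    held=$2
    if [ $((held * 2)) -gt "$total" ]; then
        echo "with quorum"
    else
        echo "without quorum"
    fi
}

quorum_state 3 2    # prints: with quorum
quorum_state 3 1    # prints: without quorum
quorum_state 2 1    # prints: without quorum (an even split is not a majority)
```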
11. Defining a highly available web service: VIP, httpd
from / to: the constraint takes the "to" resource as its base

web service: VIP, httpd, NFS
II. Configuration

1. ha.cf
    [root@snn heartbeat]# vim /etc/ha.d/ha.cf
    mcast eth0 694 1 0
    crm on

    [root@snn heartbeat]# /usr/lib64/heartbeat/ha_propagate
    Propagating HA configuration files to node datanode4.abc.com.
    ha.cf 100% 10KB 10.4KB/s 00:00
    authkeys 100% 694 0.7KB/s 00:00
    Setting HA startup configuration on node datanode4.abc.com.
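Before starting heartbeat it can help to confirm that the crm directive really is active in the file that was propagated. The sketch below is one way to do that; `crm_enabled` is a hypothetical helper, it only recognizes the exact "crm on" form used in this article, and the demo runs against a temporary file rather than the real /etc/ha.d/ha.cf.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given ha.cf enables crm.
# Only an uncommented "crm on" line (the form used in this article) matches.
crm_enabled() {
    grep -Eq '^[[:space:]]*crm[[:space:]]+on[[:space:]]*$' "$1"
}

# Demo against a temporary copy of the two settings shown above.
conf=$(mktemp)
printf 'mcast eth0 694 1 0\ncrm on\n' > "$conf"
if crm_enabled "$conf"; then
    echo "crm is enabled"
else
    echo "crm is not enabled"
fi
rm -f "$conf"
```

On a cluster node the same check would be run as `crm_enabled /etc/ha.d/ha.cf`.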
2. Note: haresources is incompatible with crm and is not read by it, so move it out of the way on both nodes
    [root@snn heartbeat]# mv /etc/ha.d/haresources /root

    The mv below is run on host datanode4:
    [root@datanode4 ha.d]# mv haresources /root/

    [root@snn heartbeat]# service heartbeat start
    logd is already running
    Starting High-Availability services: Done.

    [root@snn heartbeat]# ssh datanode4 'service heartbeat start'
    logd is already running
    Starting High-Availability services: Done.
3. Check the log
    [root@snn heartbeat]# tail -f /var/log/messages
    Jun 19 16:00:29 snn crmd: [2223]: notice: populate_cib_nodes: Node: datanode4.abc.com (uuid: 0862d824-047e-4826-9e26-21a7603f53c8)
    Jun 19 16:00:30 snn crmd: [2223]: notice: populate_cib_nodes: Node: snn.abc.com (uuid: 6009ca6a-56eb-4d35-872e-3b8dc0fc9851)
    Jun 19 16:00:30 snn crmd: [2223]: info: do_ha_control: Connected to Heartbeat
    Jun 19 16:00:30 snn crmd: [2223]: info: do_ccm_control: CCM connection established... waiting for first callback
    Jun 19 16:00:30 snn crmd: [2223]: info: do_started: Delaying start, CCM (0000000000100000) not connected
    Jun 19 16:00:30 snn crmd: [2223]: info: crmd_init: Starting crmd's mainloop
    Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client snn.abc.com/crmd now has status [online]
    Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client snn.abc.com/crmd now has status [online]
    Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client datanode4.abc.com/crmd now has status [online]
    Jun 19 16:00:30 snn cib: [2219]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
    Jun 19 16:00:30 snn cib: [2219]: info: mem_handle_event: instance=5, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4
    Jun 19 16:00:30 snn cib: [2219]: info: cib_ccm_msg_callback: PEER: datanode4.abc.com
    Jun 19 16:00:30 snn cib: [2219]: info: cib_ccm_msg_callback: PEER: snn.abc.com
    Jun 19 16:00:31 snn crmd: [2223]: info: do_started: Delaying start, CCM (0000000000100000) not connected
    Jun 19 16:00:31 snn crmd: [2223]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
    Jun 19 16:00:31 snn crmd: [2223]: info: mem_handle_event: instance=5, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4
    Jun 19 16:00:31 snn crmd: [2223]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=5)
    Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: NEW MEMBERSHIP: trans=5, nodes=2, new=2, lost=0 n_idx=0, new_idx=0, old_idx=4
    Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011CURRENT: datanode4.abc.com [nodeid=0, born=3]
    Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011CURRENT: snn.abc.com [nodeid=1, born=5]
    Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011NEW: datanode4.abc.com [nodeid=0, born=3]
    Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011NEW: snn.abc.com [nodeid=1, born=5]
    Jun 19 16:00:31 snn crmd: [2223]: info: do_started: The local CRM is operational
    Jun 19 16:00:31 snn crmd: [2223]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_CCM_CALLBACK origin=do_started ]
4. Check the cluster monitoring status (to print the status only once, use crm_mon --one-shot)
    [root@snn heartbeat]# crm_mon
    Refresh in 6s...

    ============
    Last updated: Fri Jun 19 16:11:34 2015
    Current DC: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851)
    2 Nodes configured.
    0 Resources configured.
    ============

    Node: datanode4.abc.com (0862d824-047e-4826-9e26-21a7603f53c8): online
    Node: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851): online
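The one-shot output is easy to post-process in a script. As a rough sketch, assuming the "Node: NAME (UUID): STATUS" line format shown above, the following filter prints only the names of online nodes. `online_nodes` is a hypothetical helper, and the demo feeds it the two node lines from the output above instead of calling crm_mon.

```shell
#!/bin/sh
# Hypothetical helper: print the names of nodes reported as online.
# Reads crm_mon-style text on stdin; assumes lines of the form
# "Node: NAME (UUID): STATUS" as in the output above.
online_nodes() {
    awk '$1 == "Node:" && $NF == "online" { print $2 }'
}

# Demo using the node lines from the crm_mon output above.
online_nodes <<'EOF'
Node: datanode4.abc.com (0862d824-047e-4826-9e26-21a7603f53c8): online
Node: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851): online
EOF
```

On a cluster node the filter would be used as `crm_mon --one-shot | online_nodes`.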

4. The crm command-line tool
    [root@snn heartbeat]# crm_sh
    /usr/sbin/crm_sh:31: DeprecationWarning: The popen2 module is deprecated. Use the subprocess module.
      from popen2 import Popen3
    crm # help
    Usage: crm (nodes|config|resources)
    crm # nodes
    crm nodes # help
    Usage: nodes (status|list)
    crm nodes # list
    <node id="0862d824-047e-4826-9e26-21a7603f53c8" uname="datanode4.abc.com" type="normal"/>
    <node id="6009ca6a-56eb-4d35-872e-3b8dc0fc9851" uname="snn.abc.com" type="normal"/>
    crm nodes #
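5. Installing heartbeat automatically creates a user named hacluster, but without a password, so one has to be set
    [root@snn heartbeat]# cat /etc/passwd |grep hacluster
    hacluster:x:498:498:heartbeat user:/var/lib/heartbeat/cores/hacluster:/sbin/nologin

    [root@snn heartbeat]# passwd hacluster
    Changing password for user hacluster.
    New password:
    BAD PASSWORD: it is WAY too short
    BAD PASSWORD: is too simple
    Retype new password:
    passwd: all authentication tokens updated successfully.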
6. Run hb_gui directly
    [root@snn ~]# hb_gui
    Traceback (most recent call last):
      File "/usr/bin/hb_gui", line 41, in <module>
        import gtk, gtk.glade, gobject
      File "/usr/lib64/python2.6/site-packages/gtk-2.0/gtk/__init__.py", line 64, in <module>
        _init()
      File "/usr/lib64/python2.6/site-packages/gtk-2.0/gtk/__init__.py", line 52, in _init
        _gtk.init_check()
    RuntimeError: could not open display

    The command fails with the error above: hb_gui is a GTK program, and no X display was available in this session.






IV. Defining the resources as a group

web server: VIP, filesystem mounted at /var/www/html

1. Delete the original primitive resources



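According to the log, NFS was mounted correctly on datanode4, but httpd started and then stopped again, which is strange.

4. On datanode4, try starting httpd by itself; it fails
    [root@datanode4 ~]# /etc/init.d/httpd restart
    Stopping httpd: [FAILED]
    Starting httpd: Syntax error on line 292 of /etc/httpd/conf/httpd.conf: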
5. Check the SELinux status; surprisingly, this is where the problem lies
    [root@datanode4 conf]# getenforce
    Enforcing
    [root@datanode4 conf]# setenforce 0
    [root@datanode4 conf]# getenforce
    Permissive

    Then change the configuration file to disabled:
    [root@datanode4 conf]# vim /etc/selinux/config
    SELINUX=disabled
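Note that setenforce 0 only changes the running mode until the next reboot, while /etc/selinux/config decides the mode at boot, so both nodes should be checked for both. The following is a small sketch for reading the configured mode; `selinux_configured_mode` is a hypothetical helper, and the demo uses a temporary file in place of the real configuration.

```shell
#!/bin/sh
# Hypothetical helper: print the SELINUX= value from a config file.
# Comment lines and the SELINUXTYPE= line are ignored because their
# first "="-separated field is not exactly "SELINUX".
selinux_configured_mode() {
    awk -F= '$1 == "SELINUX" { mode = $2 } END { print mode }' "$1"
}

# Demo against a temporary file shaped like /etc/selinux/config.
cfg=$(mktemp)
printf '# SELINUX= can take one of three values\nSELINUX=disabled\nSELINUXTYPE=targeted\n' > "$cfg"
selinux_configured_mode "$cfg"    # prints disabled
rm -f "$cfg"
```

Comparing this value with the output of getenforce on each node shows whether the running mode and the boot-time mode agree.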
6. Start httpd by itself again
    [root@datanode4 conf]# /etc/init.d/httpd start
    Starting httpd: [OK]
    [root@datanode4 conf]# /etc/init.d/httpd stop
    Stopping httpd:

V. Verification

1. Contents of index.html on the NFS server
    [root@datanode ~]# cat /web/htdocs/index.html
    <h1>datanode.abc.com</h1>

    [root@datanode ~]# ifconfig eth0
    eth0      Link encap:Ethernet  HWaddr 00:0C:29:50:AC:6E
              inet addr:  Bcast:  Mask:
              inet6 addr: fe80::20c:29ff:fe50:ac6e/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:83505 errors:0 dropped:0 overruns:0 frame:0
              TX packets:2037 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:7403212 (7.0 MiB)  TX bytes:228350 (222.9 KiB)
2. The VIP on datanode4 does not show up in plain ifconfig output, because it is not configured as an interface alias; use ip addr show instead
    [root@datanode4 html]# ifconfig eth0
    eth0      Link encap:Ethernet  HWaddr 00:0C:29:E1:2F:66
              inet addr:  Bcast:  Mask:
              inet6 addr: fe80::20c:29ff:fee1:2f66/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:147365 errors:0 dropped:0 overruns:0 frame:0
              TX packets:66651 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:20284443 (19.3 MiB)  TX bytes:14571080 (13.8 MiB)

    [root@datanode4 html]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:e1:2f:66 brd ff:ff:ff:ff:ff:ff
        inet brd scope global eth0
        inet brd scope global secondary eth0    (this line shows the VIP)
        inet6 fe80::20c:29ff:fee1:2f66/64 scope link
           valid_lft forever preferred_lft forever
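Since the VIP appears as a "secondary" address, a script can find out which node currently holds it by filtering for that keyword. The sketch below assumes the usual `ip addr show` line layout; `secondary_addrs` is a hypothetical helper, and the addresses in the demo are made-up values from the 192.0.2.0/24 documentation range, because the addresses in the output above were not preserved.

```shell
#!/bin/sh
# Hypothetical helper: print IPv4 addresses flagged "secondary".
# Reads `ip addr show` style text on stdin.
secondary_addrs() {
    awk '$1 == "inet" {
        for (i = 3; i <= NF; i++)
            if ($i == "secondary") print $2
    }'
}

# Demo with made-up addresses (192.0.2.0/24 is a documentation range).
secondary_addrs <<'EOF'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0
    inet 192.0.2.100/24 brd 192.0.2.255 scope global secondary eth0
EOF
```

On a cluster node this would be run as `ip addr show eth0 | secondary_addrs`; an empty result means the node does not currently hold the VIP.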



    [root@snn ~]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet scope host lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:b1:89:48 brd ff:ff:ff:ff:ff:ff
        inet brd scope global eth0
        inet brd scope global secondary eth0
        inet6 fe80::20c:29ff:feb1:8948/64 scope link
           valid_lft forever preferred_lft forever

