Mysql高可用集群MHA
2017-10-11 00:00
330 查看
思路
配置MHA基本环境——配置主从——配置MHA——测试
2) 构造4个通用的authorized_keys
在主机shenxiang61上把 主机shenxiang67/68/69的公钥Copy过来
将Copy过来的公钥写入authorized_keys中
查看生成的通用密钥
看到authorized_keys文件里面有4行记录,分别代表了访问主机shenxiang61/67/68/69的公用密钥。然后把这个authorized_keys公钥文件copy到主机shenxiang67/68/69上面同一个目录下。
3) 开始check验证如下
主机shenxiang67
主机shenxiang67
主机shenxiang68
主机shenxiang69
OK,4台服务器已经能实现两两互相ssh通了,不需要输入密码即可。
PS:如果不能实现任何两台主机互相之间可以无密码登录,后面的环节可能会有问题。
https://my.oschina.net/u/3409834/blog/1548979
验证授权用户
重启Mysql服务
Slave2中进行相同操作,过程略!
在三台机器中都做如下操作:
分别在master,slave1,slave2上安装数据节点包
MHA配置文件cnf
Manager上配置WriteVIP切换脚本masterha_ip_failover
正如黄底红字所示,一切准备就绪!就等着MYSQL无响应呢
再登录slave2服务器,看slave 的状态
测试成功,如上所示,原67主机MySQL连接失败,68主机当选新的master,61主机记录最近的日志,大功告成。
查看Write VIP 192.168.1.222
红色部分可以用manager的日志中截取
4) 查看日志
当有slave节点宕掉的情况是启动不了的,加上–ignore_fail_on_start即使有节点宕掉也能启动mha
或者
–orig_master_is_new_slave切换时加上此参数是将原master变为slave节点,如果不加此参数,原来的master将不启动
–running_updates_limit=10000 切换时候选master如果有延迟的话,mha切换不能成功,加上此参数表示延迟在此时间范围内都可切换(单位为s),但是切换的时间长短是由recover时relay日志的大小决定
手动在线切换mha,切换时需要将在运行的mha停掉后才能切换。
在备库先执行DDL,一般先stop slave,一般不记录mysql日志,可以通过set SQL_LOG_BIN = 0实现。然后进行一次主备切换操作,再在原来的主库上执行DDL。这种方法适用于增减索引,如果是增加字段就需要额外注意。
可以通过如下命令停止mha
masterha_stop –conf=/etc/app1.cnf
配置MHA基本环境——配置主从——配置MHA——测试
一、MHA配置环境
1、主机角色
主 机 名 | 角 色 | IP地址 | 软 件 |
Shenxiang61 | Manager | 192.168.1.61 | Mha4mysql-Manager Mha4mysql-node |
Shenxiang67 | Master | 192.168.1.67 | Mha4mysql-node |
Shenxiang68 | Slave1 | 192.168.1.68 | Mha4mysql-node |
Shenxiang69 | Slave2 | 192.168.1.69 | Mha4mysql-node |
2、系统配置
系 统 | 内 核 | Mysql版本 | 内 存 |
CentOS 6.7 | Linux version 2.6.32-573.el6.x86_64 | mysql Ver 14.14 Distrib 5.6.26 | 1G |
3、生成SSH无密钥证书
1) 分别在主机shenxiang61/67/68/69生成密钥,并把公钥写入authorized_keys文件[root@shenxiang61 ~]# ssh-keygen Generating public/private rsa key pair. Enter file in which to save the key (/root/.ssh/id_rsa): Created directory '/root/.ssh'. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_rsa. Your public key has been saved in /root/.ssh/id_rsa.pub. The key fingerprint is: 5f:f3:11:6b:ad:b0:d0:e2:37:18:99:02:a3:8e:b8:f4 root@shenxiang61 The key's randomart image is: +--[ RSA 2048]----+ | | | | | o . | | . o + + | | . S * = + .| | . o + * * o | |... . + + o | |... . . | |. E | +-----------------+ [root@shenxiang61 ~]# cd ./.ssh/ [root@shenxiang61 .ssh]# ls id_rsa id_rsa.pub [root@shenxiang61 .ssh]# cat id_rsa.pub > authorized_keys [root@shenxiang61 .ssh]# ls authorized_keys id_rsa id_rsa.pub
2) 构造4个通用的authorized_keys
在主机shenxiang61上把 主机shenxiang67/68/69的公钥Copy过来
[root@shenxiang61 .ssh]# scp 192.168.1.67:~/.ssh/id_rsa.pub ./id_rsa.pub.67 The authenticity of host '192.168.1.67 (192.168.1.67)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.1.67' (RSA) to the list of known hosts. root@192.168.1.67's password: id_rsa.pub 100% 398 0.4KB/s 00:00 [root@shenxiang61 .ssh]# scp 192.168.1.68:~/.ssh/id_rsa.pub ./id_rsa.pub.68 The authenticity of host '192.168.1.68 (192.168.1.68)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.1.68' (RSA) to the list of known hosts. root@192.168.1.68's password: id_rsa.pub 100% 398 0.4KB/s 00:00 [root@shenxiang61 .ssh]# scp 192.168.1.69:~/.ssh/id_rsa.pub ./id_rsa.pub.69 The authenticity of host '192.168.1.69 (192.168.1.69)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.1.69' (RSA) to the list of known hosts. root@192.168.1.69's password: id_rsa.pub 100% 398 0.4KB/s 00:00
将Copy过来的公钥写入authorized_keys中
[root@shenxiang61 .ssh]# ls authorized_keys id_rsa id_rsa.pub id_rsa.pub.67 id_rsa.pub.68 id_rsa.pub.69 known_hosts [root@shenxiang61 .ssh]# cat id_rsa.pub.67 >> authorized_keys [root@shenxiang61 .ssh]# cat id_rsa.pub.68 >> authorized_keys [root@shenxiang61 .ssh]# cat id_rsa.pub.69 >> authorized_keys
查看生成的通用密钥
[root@shenxiang61 .ssh]# cat authorized_keys ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA+JwgX9/jn5IJOQ8aWLY+3vV8UdJ4GMbhinjV4ePdGhwM+C/W9vqWsz9tCsDmf23pPjF45puM8FmfILaetld15fGc/AmmaCpb6ezKlFKgSecohK2XNhArNCMKwnLMDsQb+4QaEQPNXsZXThazUO7OgWlj7gODtiuPSYNlyQYd6PZreFPlz4WWj+zHQAuTptiNpXS1ZgITqH8EHOZmU4rEiFds/7hpwjdyCQOPMHT0DxN/ux6Ki8xhBGkBSovxtA+18jw4fUv2ETPq7bXy/wVhUjD4dcW3ebgCrFZmvS5PzNEGnGGa9swVWWad7TEt2PD1l2XpFW/VRx+TA/pHG2vzLw== root@shenxiang61 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAqiuEsTh6rNcuwGvujR48mRnB+mQXWghkQm0f7y5UXOqxO9B9oRzR66J4sBx8R3tdFwfMYZOWawGj0Uup7rvluoQhChK/lfge6zzMnjQXSTXwm0tqeuJhLHuK/NhZr7ikEjzUPKb/5Sh5fH0rODm+xMqRp8HN547+bsMA3aO3L8YizN2HQtGSb9DkedoJWJJ+6KwziZnkClO9bvwbpugbBRSaleV1dQByg/2eU/Dl82c0kMDxK6jny2tpwLWARt31W/1qui3ojXBxYJVkT9fgv2ON9JVZlKuWgHq8PIRkpVUPp78A+sXet5Lx0r2iF7AFLuowmCa87uIq9viAguYIUQ== root@shenxiang67 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAyiAMRZ6bV+TdtKn3JaQU1gVL+uRZOzw5PjUY/Xs30YZytj0SF6aYzg/rxzajrQcRA3pS0l2mle95POICzY8ohyzAokjNuKJQKvGlPwYuRCsHoqwACwDFGPB87TvSyqeX6OlpnxHxpdBCuKyVZd8BPvH6+HBqF7E/UtNAMWpwfIyZKcnV8hLxdSnMYRSmo1J/Y5HaO5CgVFhgYJHqaGs8MziEPSCjp53H39K6F1dqKVlq1zrp0E03YiX6Y9509uftSN0JgWdKK1KIKXkYjPj/RHAOTNld7GH4H01N9eUDEmcdX7ob/FX9XPSLhDgwYQV9kb64mrj9o/K1bwoakyNpNQ== root@shenxiang68 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAzup8f0SW49hBfQbeZ0jfrpMlxFPhV62ztEJkkn59E1RT5QS8aSjFvZuV47J4riTWxhfKzWfW0ZhrF/jibWb3hmc+pfj+F8a05NBVCmYNxt8M9MEKLdiQJgo6eHwHP8LkAIS937om1/pSpSviVJNakfDS3QliccNDntiWq3fWpZxpb3CV8DBy1VEBlf4bt6GWCNaWx9Sh+3gA7BqVMUQ5SQYgTdcUOf7Xf0LXbbgvKbEZZI6/6o8up2YWANf9q6Jrdyi3RPcs5HExgcWEnXsv9B8ViHF6TWXawcZ1buywbxPCk1OI73AR2RJIkuiKFpk6vNJ/KyyYMYVMaKH/e7slqw== root@shenxiang69
看到authorized_keys文件里面有4行记录,分别代表了访问主机shenxiang61/67/68/69的公用密钥。然后把这个authorized_keys公钥文件copy到主机shenxiang67/68/69上面同一个目录下。
[root@shenxiang61 .ssh]# scp authorized_keys 192.168.1.67:~/.ssh/ root@192.168.1.67's password: authorized_keys 100% 1592 1.6KB/s 00:00 [root@shenxiang61 .ssh]# scp authorized_keys 192.168.1.68:~/.ssh/ root@192.168.1.68's password: authorized_keys 100% 1592 1.6KB/s 00:00 [root@shenxiang61 .ssh]# scp authorized_keys 192.168.1.69:~/.ssh/ root@192.168.1.69's password: authorized_keys 100% 1592 1.6KB/s 00:00
3) 开始check验证如下
主机shenxiang67
[root@shenxiang61 ~]# ssh 192.168.1.67 Last login: Tue Feb 14 09:43:29 2017 from 192.168.1.69 [root@shenxiang67 ~]# exit logout Connection to 192.168.1.67 closed. [root@shenxiang61 ~]# ssh 192.168.1.68 Last login: Tue Feb 14 09:45:32 2017 from 192.168.1.69 [root@shenxiang68 ~]# exit logout Connection to 192.168.1.68 closed. [root@shenxiang61 ~]# ssh 192.168.1.69 Last login: Tue Feb 14 09:45:41 2017 from 192.168.1.68 [root@shenxiang69 ~]# exit logout Connection to 192.168.1.69 closed. [root@shenxiang61 ~]#
主机shenxiang67
[root@shenxiang67 ~]# ssh 192.168.1.61 Last login: Tue Feb 14 07:52:35 2017 from 192.168.1.33 [root@shenxiang61 ~]# exit logout Connection to 192.168.1.61 closed. [root@shenxiang67 ~]# ssh 192.168.1.68 Last login: Tue Feb 14 09:46:36 2017 from 192.168.1.61 [root@shenxiang68 ~]# exit logout Connection to 192.168.1.68 closed. [root@shenxiang67 ~]# ssh 192.168.1.69 Last login: Tue Feb 14 09:46:37 2017 from 192.168.1.61 [root@shenxiang69 ~]# exit logout Connection to 192.168.1.69 closed. [root@shenxiang67 ~]#
主机shenxiang68
[root@shenxiang68 ~]# ssh 192.168.1.61 Last login: Tue Feb 14 07:53:21 2017 from 192.168.1.67 [root@shenxiang61 ~]# exit logout Connection to 192.168.1.61 closed. [root@shenxiang68 ~]# ssh 192.168.1.67 Last login: Tue Feb 14 09:44:53 2017 from 192.168.1.61 [root@shenxiang67 ~]# exit logout Connection to 192.168.1.67 closed. [root@shenxiang68 ~]# ssh 192.168.1.69 Last login: Tue Feb 14 09:47:15 2017 from 192.168.1.67 [root@shenxiang69 ~]# exit logout Connection to 192.168.1.69 closed. [root@shenxiang68 ~]#
主机shenxiang69
[root@shenxiang69 ~]# ssh 192.168.1.61 Last login: Tue Feb 14 07:53:52 2017 from 192.168.1.68 [root@shenxiang61 ~]# exit logout Connection to 192.168.1.61 closed. [root@shenxiang69 ~]# ssh 192.168.1.67 Last login: Tue Feb 14 09:46:07 2017 from 192.168.1.68 [root@shenxiang67 ~]# exit logout Connection to 192.168.1.67 closed. [root@shenxiang69 ~]# ssh 192.168.1.68 Last login: Tue Feb 14 09:47:14 2017 from 192.168.1.67 [root@shenxiang68 ~]# exit logout Connection to 192.168.1.68 closed. [root@shenxiang69 ~]#
OK,4台服务器已经能实现两两互相ssh通了,不需要输入密码即可。
PS:如果不能实现任何两台主机互相之间可以无密码登录,后面的环节可能会有问题。
4、/etc/Hosts文件配置
[root@shenxiang61 ~]# vim /etc/hosts [root@shenxiang61 ~]# cat !$ cat /etc/hosts 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 192.168.1.61 manager 192.168.1.67 master 192.168.1.68 slave1 192.168.1.69 slave2 [root@shenxiang61 ~]# scp /etc/hosts master:/etc/ The authenticity of host 'master (192.168.1.67)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'master' (RSA) to the list of known hosts. hosts 100% 239 0.2KB/s 00:00 [root@shenxiang61 ~]# scp /etc/hosts slave1:/etc/ The authenticity of host 'slave1 (192.168.1.68)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'slave1' (RSA) to the list of known hosts. hosts 100% 239 0.2KB/s 00:00 [root@shenxiang61 ~]# scp /etc/hosts slave2:/etc/ The authenticity of host 'slave2 (192.168.1.69)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'slave2' (RSA) to the list of known hosts. hosts 100% 239 0.2KB/s 00:00
二、Mysql5.6.26源码编译安装
详情查看此文章:https://my.oschina.net/u/3409834/blog/1548979
三、Mysql主从配置
1、mysql主配置
1) 修改配置文件my.cnf
在[mysqld]下面添加如下内容[mysqld] log-bin=mysql-bin-master server-id = 1
2) 授权slave用户并重启服务
授权slave用户[root@shenxiang67 ~]# mysql -uroot -p -e "grant replication slave,replication client on *.* to 'slave'@'%' identified by '123456'" Enter password:********
验证授权用户
[root@shenxiang68 ~]# mysql -uslave -p123456 -hmaster Warning: Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.6.26-log Source distribution Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql>
重启Mysql服务
[root@shenxiang67 ~]# /etc/init.d/mysqld restart Shutting down MySQL. SUCCESS! Starting MySQL. SUCCESS!
3) 查看主机状态
要求记录file文件名称和position[root@shenxiang67 ~]# mysql -uroot -p -e "show master status\G" Enter password: *************************** 1. row *************************** File: mysql-bin-master.000005 Position: 120 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set:
2、mysql从配置
1) 修改配置文件my.cnf
在[mysqld]下面添加如下红色内容[mysqld] log-bin=mysql-bin-master server-id = 2 log_slave_updates = 1 read_only= 1
2) 授权slave用户并重启服务(MHA主从切换时会用到)
mysql> grant replication slave,replication client on *.* to 'slave'@'%' identified by '123456'; Query OK, 0 rows affected (0.00 sec)
3) 重启服务并添加Master信息
Slave1配置过程[root@shenxiang68 ~]# mysql -uroot -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 Server version: 5.6.26-log Source distribution Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> stop slave; change master to master_host='master',master_port=3306,master_user='slave',master_password='123456',master_log_file='mysql-bin-master.000005',master_log_pos=120; Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected, 2 warnings (0.01 sec) mysql> start slave; Query OK, 0 rows affected (0.01 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: master Master_User: slave Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin-master.000005 Read_Master_Log_Pos: 120 Relay_Log_File: shenxiang68-relay-bin.000002 Relay_Log_Pos: 290 Relay_Master_Log_File: mysql-bin-master.000005 Slave_IO_Running: Yes Slave_SQL_Running: Yes
Slave2中进行相同操作,过程略!
3、赋权MHA远程登录帐号
然后再就是赋权了,之前的一步赋权操作是权限是只有replication,MHA会在配置文件里要求能远程登录到数据库,所以要进行必要的赋权。在三台机器中都做如下操作:
mysql> grant all privileges on *.* to 'sxooky'@'%' identified by '123456'; mysql> flush privileges;
四、安装MHA
1、安装依赖包
[root@shenxiang61 mysql]# yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager --skip-broken Loaded plugins: fastestmirror, security Setting up Install Process Loading mirror speeds from cached hostfile * base: mirrors.aliyun.com * epel: mirrors.aliyun.com * extras: mirrors.aliyun.com * updates: mirrors.aliyun.com base | 3.7 kB 00:00 epel | 4.3 kB 00:00 extras | 3.4 kB 00:00 updates | 3.4 kB 00:00 Resolving Dependencies --> Running transaction check ---> Package perl-Config-Tiny.noarch 0:2.12-7.1.el6 will be installed Installed: perl-Config-Tiny.noarch 0:2.12-7.1.el6 perl-DBD-MySQL.x86_64 0:4.013-3.el6 perl-Log-Dispatch.noarch 0:2.27-1.el6 perl-Parallel-ForkManager.noarch 0:0.7.9-1.el6 Dependency Installed: perl-DBI.x86_64 0:1.609-4.el6 perl-Email-Date-Format.noarch 0:1.002-5.el6 perl-MIME-Lite.noarch 0:3.027-2.el6 perl-MIME-Types.noarch 0:1.28-2.el6 perl-Mail-Sender.noarch 0:0.8.16-3.el6 perl-Mail-Sendmail.noarch 0:0.79-12.el6 perl-MailTools.noarch 0:2.04-4.el6 perl-Params-Validate.x86_64 0:0.92-3.el6 perl-TimeDate.noarch 1:1.16-13.el6 Complete!
2、安装mha4mysql-node
上传node数据节点包,并传输至其他Mysql server[root@shenxiang61 MHA]# scp mha4mysql-node-0.53-0.el6.noarch.rpm slave1:~ The authenticity of host 'slave1 (192.168.1.68)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'slave1' (RSA) to the list of known hosts. mha4mysql-node-0.53-0.el6.noarch.rpm 100% 33KB 32.9KB/s 00:00 [root@shenxiang61 MHA]# scp mha4mysql-node-0.53-0.el6.noarch.rpm slave1:~ The authenticity of host 'slave1 (192.168.1.68)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'slave1' (RSA) to the list of known hosts. mha4mysql-node-0.53-0.el6.noarch.rpm 100% 33KB 32.9KB/s 00:00 [root@shenxiang61 MHA]# scp mha4mysql-node-0.53-0.el6.noarch.rpm slave2:~ The authenticity of host 'slave2 (192.168.1.69)' can't be established. RSA key fingerprint is ed:0a:c4:cd:d3:04:54:44:a0:6f:cd:35:64:94:89:53. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'slave2' (RSA) to the list of known hosts. mha4mysql-node-0.53-0.el6.noarch.rpm 100% 33KB 32.9KB/s 00:00
分别在master,slave1,slave2上安装数据节点包
[root@shenxiang61 ~]# rpm -ivh mha4mysql-node-0.53-0.el6.noarch.rpm Preparing... ########################################### [100%] 1:mha4mysql-node ########################################### [100%] [root@shenxiang67 MHA]# rpm -ivh mha4mysql-node-0.53-0.el6.noarch.rpm Preparing... ########################################### [100%] 1:mha4mysql-node ########################################### [100%] [root@shenxiang68 ~]# rpm -ivh mha4mysql-node-0.53-0.el6.noarch.rpm Preparing... ########################################### [100%] 1:mha4mysql-node ########################################### [100%] [root@shenxiang69 ~]# rpm -ivh mha4mysql-node-0.53-0.el6.noarch.rpm Preparing... ########################################### [100%] 1:mha4mysql-node ########################################### [100%]
3、安装mha4mysql-manager
[root@shenxiang61 MHA]# rpm -ivh mha4mysql-manager-0.53-0.el6.noarch.rpm error: Failed dependencies: perl(Time::HiRes) is needed by mha4mysql-manager-0.53-0.el6.noarch [root@shenxiang61 MHA]# yum install -y perl-Time-HiRes Downloading Packages: (1/11): perl-5.10.1-141.el6_7.1.x86_64.rpm | 10 MB 00:10 (2/11): perl-Compress-Raw-Zlib-2.021-141.el6_7.1.x86_64.rpm | 70 kB 00:00 (3/11): perl-Compress-Zlib-2.021-141.el6_7.1.x86_64.rpm | 46 kB 00:00 (4/11): perl-IO-Compress-Base-2.021-141.el6_7.1.x86_64.rpm | 70 kB 00:00 (5/11): perl-IO-Compress-Zlib-2.021-141.el6_7.1.x86_64.rpm | 136 kB 00:00 (6/11): perl-Module-Pluggable-3.90-141.el6_7.1.x86_64.rpm | 40 kB 00:00 (7/11): perl-Pod-Escapes-1.04-141.el6_7.1.x86_64.rpm | 33 kB 00:00 (8/11): perl-Pod-Simple-3.13-141.el6_7.1.x86_64.rpm | 213 kB 00:00 (9/11): perl-Time-HiRes-1.9721-141.el6_7.1.x86_64.rpm | 49 kB 00:00 (10/11): perl-libs-5.10.1-141.el6_7.1.x86_64.rpm | 579 kB 00:00 (11/11): perl-version-0.77-141.el6_7.1.x86_64.rpm | 52 kB 00:00 --------------------------------------------------------------------------------- Total 987 kB/s | 11 MB 00:11 Running rpm_check_debug Running Transaction Test Installed: perl-Time-HiRes.x86_64 4:1.9721-141.el6_7.1 Dependency Updated: perl.x86_64 4:5.10.1-141.el6_7.1 perl-Compress-Raw-Zlib.x86_64 1:2.021-141.el6_7.1 perl-Compress-Zlib.x86_64 0:2.021-141.el6_7.1 perl-IO-Compress-Base.x86_64 0:2.021-141.el6_7.1 perl-IO-Compress-Zlib.x86_64 0:2.021-141.el6_7.1 perl-Module-Pluggable.x86_64 1:3.90-141.el6_7.1 perl-Pod-Escapes.x86_64 1:1.04-141.el6_7.1 perl-Pod-Simple.x86_64 1:3.13-141.el6_7.1 perl-libs.x86_64 4:5.10.1-141.el6_7.1 perl-version.x86_64 3:0.77-141.el6_7.1 [root@shenxiang61 MHA]# rpm -ivh mha4mysql-manager-0.53-0.el6.noarch.rpm Preparing... ########################################### [100%] 1:mha4mysql-manager ########################################### [100%]
五、配置MHA并测试
1、配置
[root@shenxiang61 MHA]# mkdir -p /etc/masterha/app1 [root@shenxiang61 MHA]# cd [root@shenxiang61 ~]# touch /etc/masterha/app1/manager.log
MHA配置文件cnf
[root@shenxiang61 ~]# vim /etc/masterha/app1.cnf [root@shenxiang61 ~]# cat !$ cat /etc/masterha/app1.cnf [server default] manager_workdir=/etc/masterha/app1 #设置manager的工作目录 manager_log=/etc/masterha/app1/manager.log #日志文件路径 master_binlog_dir=/data #设置master 保存binlog的位置,以便MHA可以找到master的日志 user=sxooky password=123456 #之前数据库远程登录授权的MHA用户 ssh_user=root #用于测试ssh登录的 repl_user=slave repl_password=123456 #主从配置授权用户slave ping_interval=1 #master_ip_online_change_script="" #report_script="" #shutdown_script="" #设置故障发生后关闭故障主机脚本(该脚本的主要作用是关闭主机防止发生脑裂,这里没有使用) master_ip_failover_script=/usr/bin/masterha_ip_failover #设置自动failover时候的切换脚本 [server1] hostname=192.168.1.67 port=3306 master_binlog_dir=/data candidate_master=1 check_repl_delay=0 [server2] hostname=192.168.1.68 port=3306 master_binlog_dir=/var/lib/mysql candidate_master=1 check_repl_delay=0 [server3] hostname=192.168.1.69 port=3306 master_binlog_dir=/var/lib/mysql ignore_fail=1 no_master=1
2、Mysql write VIP配置(192.168.1.222)
Mysql-master server 网卡eth0绑定write VIP : 192.168.1.222[root@shenxiang67 ~]# ifconfig eth0:1 192.168.1.222/24 [root@shenxiang67 ~]# ifconfig eth0 Link encap:Ethernet HWaddr 00:0C:29:5E:02:B6 inet addr:192.168.1.67 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: fe80::20c:29ff:fe5e:2b6/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:79190 errors:0 dropped:0 overruns:0 frame:0 TX packets:19751 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:7314843 (6.9 MiB) TX bytes:5212776 (4.9 MiB) eth0:1 Link encap:Ethernet HWaddr 00:0C:29:5E:02:B6 inet addr:192.168.1.222 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:220 errors:0 dropped:0 overruns:0 frame:0 TX packets:220 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:18681 (18.2 KiB) TX bytes:18681 (18.2 KiB)
Manager上配置WriteVIP切换脚本masterha_ip_failover
[root@shenxiang61 ~]# cat /usr/bin/masterha_ip_failover #!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command,$ssh_user,$orig_master_host,$orig_master_ip, $orig_master_port,$new_master_host,$new_master_ip,$new_master_port ); my $vip = '192.168.1.222/24'; my $key = '1'; my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip"; my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down"; GetOptions( 'command=s' => \$command, 'ssh_user=s' => \$ssh_user, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, ); exit &main(); sub main { print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { my $exit_code = 1; eval { print "Disabling the VIP on old master: $orig_master_host \n"; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { my $exit_code = 10; eval { print "Enabling the VIP - $vip on the new master - $new_master_host \n"; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { print "Checking the Status of the script.. OK \n"; exit 0; } else { &usage(); exit 1; } } sub start_vip() { `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; } sub stop_vip() { return 0 unless ($ssh_user); `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub usage { print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n"; }
3、测试SSH
[root@shenxiang61 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf Tue Feb 14 11:27:35 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Tue Feb 14 11:27:35 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf.. Tue Feb 14 11:27:35 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf.. Tue Feb 14 11:27:35 2017 - [info] Starting SSH connection tests.. Tue Feb 14 11:27:36 2017 - [debug] Tue Feb 14 11:27:35 2017 - [debug] Connecting via SSH from root@192.168.1.67(192.168.1.67:22) to root@192.168.1.68(192.168.1.68:22).. Tue Feb 14 11:27:35 2017 - [debug] ok. Tue Feb 14 11:27:35 2017 - [debug] Connecting via SSH from root@192.168.1.67(192.168.1.67:22) to root@192.168.1.69(192.168.1.69:22).. Tue Feb 14 11:27:35 2017 - [debug] ok. Tue Feb 14 11:27:36 2017 - [debug] Tue Feb 14 11:27:36 2017 - [debug] Connecting via SSH from root@192.168.1.68(192.168.1.68:22) to root@192.168.1.67(192.168.1.67:22).. Tue Feb 14 11:27:36 2017 - [debug] ok. Tue Feb 14 11:27:36 2017 - [debug] Connecting via SSH from root@192.168.1.68(192.168.1.68:22) to root@192.168.1.69(192.168.1.69:22).. Tue Feb 14 11:27:36 2017 - [debug] ok. Tue Feb 14 11:27:37 2017 - [debug] Tue Feb 14 11:27:36 2017 - [debug] Connecting via SSH from root@192.168.1.69(192.168.1.69:22) to root@192.168.1.67(192.168.1.67:22).. Tue Feb 14 11:27:36 2017 - [debug] ok. Tue Feb 14 11:27:36 2017 - [debug] Connecting via SSH from root@192.168.1.69(192.168.1.69:22) to root@192.168.1.68(192.168.1.68:22).. Tue Feb 14 11:27:36 2017 - [debug] ok. Tue Feb 14 11:27:37 2017 - [info] All SSH connection tests passed successfully.
4、测试repl
[root@shenxiang61 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf Fri Feb 17 20:34:32 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Fri Feb 17 20:34:32 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf.. Fri Feb 17 20:34:32 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf.. Fri Feb 17 20:34:32 2017 - [info] MHA::MasterMonitor version 0.53. Fri Feb 17 20:34:32 2017 - [info] Dead Servers: Fri Feb 17 20:34:32 2017 - [info] Alive Servers: Fri Feb 17 20:34:32 2017 - [info] 192.168.1.67(192.168.1.67:3306) Fri Feb 17 20:34:32 2017 - [info] 192.168.1.68(192.168.1.68:3306) Fri Feb 17 20:34:32 2017 - [info] 192.168.1.69(192.168.1.69:3306) Fri Feb 17 20:34:32 2017 - [info] Alive Slaves: Fri Feb 17 20:34:32 2017 - [info] 192.168.1.68(192.168.1.68:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled Fri Feb 17 20:34:32 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306) Fri Feb 17 20:34:32 2017 - [info] Primary candidate for the new Master (candidate_master is set) Fri Feb 17 20:34:32 2017 - [info] 192.168.1.69(192.168.1.69:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled Fri Feb 17 20:34:32 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306) Fri Feb 17 20:34:32 2017 - [info] Not candidate for the new Master (no_master is set) Fri Feb 17 20:34:32 2017 - [info] Current Alive Master: 192.168.1.67(192.168.1.67:3306) Fri Feb 17 20:34:32 2017 - [info] Checking slave configurations.. Fri Feb 17 20:34:32 2017 - [warning] relay_log_purge=0 is not set on slave 192.168.1.68(192.168.1.68:3306). Fri Feb 17 20:34:32 2017 - [warning] relay_log_purge=0 is not set on slave 192.168.1.69(192.168.1.69:3306). Fri Feb 17 20:34:32 2017 - [info] Checking replication filtering settings.. Fri Feb 17 20:34:32 2017 - [info] binlog_do_db= , binlog_ignore_db= Fri Feb 17 20:34:32 2017 - [info] Replication filtering check ok. Fri Feb 17 20:34:32 2017 - [info] Starting SSH connection tests.. Fri Feb 17 20:34:34 2017 - [info] All SSH connection tests passed successfully. Fri Feb 17 20:34:34 2017 - [info] Checking MHA Node version.. Fri Feb 17 20:34:34 2017 - [info] Version check ok. Fri Feb 17 20:34:34 2017 - [info] Checking SSH publickey authentication settings on the current master.. Fri Feb 17 20:34:34 2017 - [info] HealthCheck: SSH to 192.168.1.67 is reachable. Fri Feb 17 20:34:34 2017 - [info] Master MHA Node version is 0.53. Fri Feb 17 20:34:34 2017 - [info] Checking recovery script configurations on the current master.. Fri Feb 17 20:34:34 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.53 --start_file=mysql-bin-master.000010 Fri Feb 17 20:34:34 2017 - [info] Connecting to root@192.168.1.67(192.168.1.67).. Creating /var/tmp if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /data, up to mysql-bin-master.000010 Fri Feb 17 20:34:34 2017 - [info] Master setting check done. Fri Feb 17 20:34:34 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Fri Feb 17 20:34:34 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=sxooky --slave_host=192.168.1.68 --slave_ip=192.168.1.68 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.26-log --manager_version=0.53 --relay_log_info=/data/relay-log.info --relay_dir=/data/ --slave_pass=xxx Fri Feb 17 20:34:34 2017 - [info] Connecting to root@192.168.1.68(192.168.1.68:22).. Checking slave recovery environment settings.. Opening /data/relay-log.info ... ok. Relay log found at /data, up to shenxiang68-relay-bin.000002 Temporary relay log file is /data/shenxiang68-relay-bin.000002 Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Fri Feb 17 20:34:35 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=sxooky --slave_host=192.168.1.69 --slave_ip=192.168.1.69 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.26-log --manager_version=0.53 --relay_log_info=/data/relay-log.info --relay_dir=/data/ --slave_pass=xxx Fri Feb 17 20:34:35 2017 - [info] Connecting to root@192.168.1.69(192.168.1.69:22).. Checking slave recovery environment settings.. Opening /data/relay-log.info ... ok. Relay log found at /data, up to shenxiang69-relay-bin.000002 Temporary relay log file is /data/shenxiang69-relay-bin.000002 Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Fri Feb 17 20:34:35 2017 - [info] Slaves settings check done. Fri Feb 17 20:34:35 2017 - [info] 192.168.1.67 (current master) +--192.168.1.68 +--192.168.1.69 Fri Feb 17 20:34:35 2017 - [info] Checking replication health on 192.168.1.68.. Fri Feb 17 20:34:35 2017 - [info] ok. Fri Feb 17 20:34:35 2017 - [info] Checking replication health on 192.168.1.69.. Fri Feb 17 20:34:35 2017 - [info] ok. Fri Feb 17 20:34:35 2017 - [info] Checking master_ip_failover_script status: Fri Feb 17 20:34:35 2017 - [info] /usr/bin/masterha_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.67 --orig_master_ip=192.168.1.67 --orig_master_port=3306 IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.222/24=== Checking the Status of the script.. OK Fri Feb 17 20:34:35 2017 - [info] OK. Fri Feb 17 20:34:35 2017 - [warning] shutdown_script is not defined. Fri Feb 17 20:34:35 2017 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK.
5、数据库宕机测试
1)启动MHA
后面重新开一个窗口,用于监控日志,因为这个是打开前台的进程,运行的时候光标会卡住[root@shenxiang61 ~]# >/etc/masterha/app1/manager.log [root@shenxiang61 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log 2>&1 & [1] 32432
2) 监控日志
新建监控日志的窗口,查看[root@shenxiang61 ~]# tailf /etc/masterha/app1/manager.log IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.222/24=== Checking the Status of the script.. OK Fri Feb 17 21:23:33 2017 - [info] OK. Fri Feb 17 21:23:33 2017 - [warning] shutdown_script is not defined. Fri Feb 17 21:23:33 2017 - [info] Set master ping interval 1 seconds. Fri Feb 17 21:23:33 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes. Fri Feb 17 21:23:33 2017 - [info] Starting ping health check on 192.168.1.67(192.168.1.67:3306).. Fri Feb 17 21:23:33 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
正如黄底红字所示,一切准备就绪!就等着MYSQL无响应呢
3) 主服务器宕机
手动暂停主数据库服务器[root@shenxiang67 ~]# /etc/init.d/mysqld stop Shutting down MySQL...... SUCCESS!
4) 再次查看日志与结果
日志[root@shenxiang61 ~]# tailf /etc/masterha/app1/manager.log IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.222/24=== Checking the Status of the script.. OK Fri Feb 17 21:23:33 2017 - [info] OK. Fri Feb 17 21:23:33 2017 - [warning] shutdown_script is not defined. Fri Feb 17 21:23:33 2017 - [info] Set master ping interval 1 seconds. Fri Feb 17 21:23:33 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes. Fri Feb 17 21:23:33 2017 - [info] Starting ping health check on 192.168.1.67(192.168.1.67:3306).. Fri Feb 17 21:23:33 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Fri Feb 17 21:28:38 2017 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Fri Feb 17 21:28:38 2017 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.53 --binlog_prefix=mysql-bin-master
Fri Feb 17 21:28:38 2017 - [info] HealthCheck: SSH to 192.168.1.67 is reachable.
Fri Feb 17 21:28:39 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Fri Feb 17 21:28:39 2017 - [warning] Connection failed 1 time(s)..
Fri Feb 17 21:28:40 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Fri Feb 17 21:28:40 2017 - [warning] Connection failed 2 time(s)..
Fri Feb 17 21:28:41 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Fri Feb 17 21:28:41 2017 - [warning] Connection failed 3 time(s)..
Fri Feb 17 21:28:41 2017 - [warning] Master is not reachable from health checker!
Fri Feb 17 21:28:41 2017 - [warning] Master 192.168.1.67(192.168.1.67:3306) is not reachable!
Fri Feb 17 21:28:41 2017 - [warning] SSH is reachable.
Fri Feb 17 21:28:41 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Fri Feb 17 21:28:41 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Feb 17 21:28:41 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Fri Feb 17 21:28:41 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Fri Feb 17 21:28:41 2017 - [info] Dead Servers:
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Alive Servers:
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.68(192.168.1.68:3306)
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.69(192.168.1.69:3306)
Fri Feb 17 21:28:41 2017 - [info] Alive Slaves:
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.68(192.168.1.68:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.69(192.168.1.69:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 17 21:28:41 2017 - [info] Checking slave configurations..
Fri Feb 17 21:28:41 2017 - [warning] relay_log_purge=0 is not set on slave 192.168.1.68(192.168.1.68:3306).
Fri Feb 17 21:28:41 2017 - [warning] relay_log_purge=0 is not set on slave 192.168.1.69(192.168.1.69:3306).
Fri Feb 17 21:28:41 2017 - [info] Checking replication filtering settings..
Fri Feb 17 21:28:41 2017 - [info] Replication filtering check ok.
Fri Feb 17 21:28:41 2017 - [info] Master is down!
Fri Feb 17 21:28:41 2017 - [info] Terminating monitoring script.
Fri Feb 17 21:28:41 2017 - [info] Got exit code 20 (Master dead).
Fri Feb 17 21:28:41 2017 - [info] MHA::MasterFailover version 0.53.
Fri Feb 17 21:28:41 2017 - [info] Starting master failover.
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] * Phase 1: Configuration Check Phase..
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] Dead Servers:
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Checking master reachability via mysql(double check)..
Fri Feb 17 21:28:41 2017 - [info] ok.
Fri Feb 17 21:28:41 2017 - [info] Alive Servers:
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.68(192.168.1.68:3306)
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.68(192.168.1.68:3306)
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.69(192.168.1.69:3306)
Fri Feb 17 21:28:41 2017 - [info] Alive Slaves:
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.68(192.168.1.68:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.69(192.168.1.69:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 17 21:28:41 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Fri Feb 17 21:28:41 2017 - [info] Executing master IP deactivatation script:
Fri Feb 17 21:28:41 2017 - [info] /usr/bin/masterha_ip_failover --orig_master_host=192.168.1.67 --orig_master_ip=192.168.1.67 --orig_master_port=3306 --command=stopssh --ssh_user=root
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.222/24===
Disabling the VIP on old master: 192.168.1.67
Fri Feb 17 21:28:41 2017 - [info] done.
Fri Feb 17 21:28:41 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Fri Feb 17 21:28:41 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] * Phase 3: Master Recovery Phase..
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] The latest binary log file/position on all slaves is mysql-bin-master.000010:120
Fri Feb 17 21:28:41 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.68(192.168.1.68:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.69(192.168.1.69:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 17 21:28:41 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin-master.000010:120
Fri Feb 17 21:28:41 2017 - [info] Oldest slaves:
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.68(192.168.1.68:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Feb 17 21:28:41 2017 - [info] 192.168.1.69(192.168.1.69:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:41 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:41 2017 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Fri Feb 17 21:28:41 2017 - [info]
Fri Feb 17 21:28:41 2017 - [info] Fetching dead master's binary logs..
Fri Feb 17 21:28:41 2017 - [info] Executing command on the dead master 192.168.1.67(192.168.1.67:3306): save_binary_logs --command=save --start_file=mysql-bin-master.000010 --start_pos=120 --binlog_dir=/data --output_file=/var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53
Creating /var/tmp if not exists.. ok.
Concat binary/relay logs from mysql-bin-master.000010 pos 120 to mysql-bin-master.000010 EOF into /var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog ..
Dumping binlog format description event, from position 0 to 120.. ok.
Dumping effective binlog data from /data/mysql-bin-master.000010 position 120 to tail(143).. ok.
Concat succeeded.
Fri Feb 17 21:28:42 2017 - [info] scp from root@192.168.1.67:/var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog to local:/etc/masterha/app1/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog succeeded.
Fri Feb 17 21:28:42 2017 - [info] HealthCheck: SSH to 192.168.1.68 is reachable.
Fri Feb 17 21:28:42 2017 - [info] HealthCheck: SSH to 192.168.1.69 is reachable.
Fri Feb 17 21:28:42 2017 - [info]
Fri Feb 17 21:28:42 2017 - [info] * Phase 3.3: Determining New Master Phase..
Fri Feb 17 21:28:42 2017 - [info]
Fri Feb 17 21:28:42 2017 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Fri Feb 17 21:28:42 2017 - [info] All slaves received relay logs to the same position. No need to resync each other.
Fri Feb 17 21:28:42 2017 - [info] Searching new master from slaves..
Fri Feb 17 21:28:42 2017 - [info] Candidate masters from the configuration file:
Fri Feb 17 21:28:42 2017 - [info] 192.168.1.68(192.168.1.68:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:42 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:42 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Feb 17 21:28:42 2017 - [info] Non-candidate masters:
Fri Feb 17 21:28:42 2017 - [info] 192.168.1.69(192.168.1.69:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Fri Feb 17 21:28:42 2017 - [info] Replicating from 192.168.1.67(192.168.1.67:3306)
Fri Feb 17 21:28:42 2017 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 17 21:28:42 2017 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Fri Feb 17 21:28:42 2017 - [info] New master is 192.168.1.68(192.168.1.68:3306)
Fri Feb 17 21:28:42 2017 - [info] Starting master failover..
Fri Feb 17 21:28:42 2017 - [info]
From:
192.168.1.67 (current master)
+--192.168.1.68
+--192.168.1.69
To:
192.168.1.68 (new master)
+--192.168.1.69
Fri Feb 17 21:28:42 2017 - [info]
Fri Feb 17 21:28:42 2017 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Fri Feb 17 21:28:42 2017 - [info]
Fri Feb 17 21:28:42 2017 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Fri Feb 17 21:28:42 2017 - [info] Sending binlog..
Fri Feb 17 21:28:42 2017 - [info] scp from local:/etc/masterha/app1/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog to root@192.168.1.68:/var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog succeeded.
Fri Feb 17 21:28:42 2017 - [info]
Fri Feb 17 21:28:42 2017 - [info] * Phase 3.4: Master Log Apply Phase..
Fri Feb 17 21:28:42 2017 - [info]
Fri Feb 17 21:28:42 2017 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Fri Feb 17 21:28:42 2017 - [info] Starting recovery on 192.168.1.68(192.168.1.68:3306)..
Fri Feb 17 21:28:42 2017 - [info] Generating diffs succeeded.
Fri Feb 17 21:28:42 2017 - [info] Waiting until all relay logs are applied.
Fri Feb 17 21:28:42 2017 - [info] done.
Fri Feb 17 21:28:42 2017 - [info] Getting slave status..
Fri Feb 17 21:28:42 2017 - [info] This slave(192.168.1.68)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin-master.000010:120). No need to recover from Exec_Master_Log_Pos.
Fri Feb 17 21:28:42 2017 - [info] Connecting to the target slave host 192.168.1.68, running recover script..
Fri Feb 17 21:28:42 2017 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=sxooky --slave_host=192.168.1.68 --slave_ip=192.168.1.68 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog --workdir=/var/tmp --target_version=5.6.26-log --timestamp=20170217212841 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53 --slave_pass=xxx
Fri Feb 17 21:28:43 2017 - [info]
MySQL client version is 5.6.26. Using --binary-mode.
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog on 192.168.1.68:3306. This may take long time...
Applying log files succeeded.
Fri Feb 17 21:28:43 2017 - [info] All relay logs were successfully applied.
Fri Feb 17 21:28:43 2017 - [info] Getting new master's binlog name and position..
Fri Feb 17 21:28:43 2017 - [info] mysql-bin-master.000013:120
Fri Feb 17 21:28:43 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.1.68', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin-master.000013', MASTER_LOG_POS=120, MASTER_USER='slave', MASTER_PASSWORD='xxx';
Fri Feb 17 21:28:43 2017 - [info] Executing master IP activate script:
Fri Feb 17 21:28:43 2017 - [info] /usr/bin/masterha_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.1.67 --orig_master_ip=192.168.1.67 --orig_master_port=3306 --new_master_host=192.168.1.68 --new_master_ip=192.168.1.68 --new_master_port=3306
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.222/24===
Enabling the VIP - 192.168.1.222/24 on the new master - 192.168.1.68
Fri Feb 17 21:28:43 2017 - [info] OK.
Fri Feb 17 21:28:43 2017 - [info] Setting read_only=0 on 192.168.1.68(192.168.1.68:3306)..
Fri Feb 17 21:28:43 2017 - [info] ok.
Fri Feb 17 21:28:43 2017 - [info] ** Finished master recovery successfully.
Fri Feb 17 21:28:43 2017 - [info] * Phase 3: Master Recovery Phase completed.
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] * Phase 4: Slaves Recovery Phase..
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] -- Slave diff file generation on host 192.168.1.69(192.168.1.69:3306) started, pid: 16824. Check tmp log /etc/masterha/app1/192.168.1.69_3306_20170217212841.log if it takes time..
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] Log messages from 192.168.1.69 ...
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Fri Feb 17 21:28:43 2017 - [info] End of log messages from 192.168.1.69.
Fri Feb 17 21:28:43 2017 - [info] -- 192.168.1.69(192.168.1.69:3306) has the latest relay log events.
Fri Feb 17 21:28:43 2017 - [info] Generating relay diff files from the latest slave succeeded.
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] -- Slave recovery on host 192.168.1.69(192.168.1.69:3306) started, pid: 16826. Check tmp log /etc/masterha/app1/192.168.1.69_3306_20170217212841.log if it takes time..
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] Log messages from 192.168.1.69 ...
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] Sending binlog..
Fri Feb 17 21:28:43 2017 - [info] scp from local:/etc/masterha/app1/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog to root@192.168.1.69:/var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog succeeded.
Fri Feb 17 21:28:43 2017 - [info] Starting recovery on 192.168.1.69(192.168.1.69:3306)..
Fri Feb 17 21:28:43 2017 - [info] Generating diffs succeeded.
Fri Feb 17 21:28:43 2017 - [info] Waiting until all relay logs are applied.
Fri Feb 17 21:28:43 2017 - [info] done.
Fri Feb 17 21:28:43 2017 - [info] Getting slave status..
Fri Feb 17 21:28:43 2017 - [info] This slave(192.168.1.69)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin-master.000010:120). No need to recover from Exec_Master_Log_Pos.
Fri Feb 17 21:28:43 2017 - [info] Connecting to the target slave host 192.168.1.69, running recover script..
Fri Feb 17 21:28:43 2017 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=sxooky --slave_host=192.168.1.69 --slave_ip=192.168.1.69 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog --workdir=/var/tmp --target_version=5.6.26-log --timestamp=20170217212841 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53 --slave_pass=xxx
Fri Feb 17 21:28:43 2017 - [info]
MySQL client version is 5.6.26. Using --binary-mode.
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.1.67_3306_20170217212841.binlog on 192.168.1.69:3306. This may take long time...
Applying log files succeeded.
Fri Feb 17 21:28:43 2017 - [info] All relay logs were successfully applied.
Fri Feb 17 21:28:43 2017 - [info] Resetting slave 192.168.1.69(192.168.1.69:3306) and starting replication from the new master 192.168.1.68(192.168.1.68:3306)..
Fri Feb 17 21:28:43 2017 - [info] Executed CHANGE MASTER.
Fri Feb 17 21:28:43 2017 - [info] Slave started.
Fri Feb 17 21:28:43 2017 - [info] End of log messages from 192.168.1.69.
Fri Feb 17 21:28:43 2017 - [info] -- Slave recovery on host 192.168.1.69(192.168.1.69:3306) succeeded.
Fri Feb 17 21:28:43 2017 - [info] All new slave servers recovered successfully.
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] * Phase 5: New master cleanup phease..
Fri Feb 17 21:28:43 2017 - [info]
Fri Feb 17 21:28:43 2017 - [info] Resetting slave info on the new master..
Fri Feb 17 21:28:43 2017 - [info] 192.168.1.68: Resetting slave info succeeded.
Fri Feb 17 21:28:43 2017 - [info] Master failover to 192.168.1.68(192.168.1.68:3306) completed successfully.
Fri Feb 17 21:28:43 2017 - [info]
----- Failover Report -----
app1: MySQL Master failover 192.168.1.67 to 192.168.1.68 succeeded
Master 192.168.1.67 is down!
Check MHA Manager logs at shenxiang61:/etc/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.1.67.
The latest slave 192.168.1.68(192.168.1.68:3306) has all relay logs for recovery.
Selected 192.168.1.68 as a new master.
192.168.1.68: OK: Applying all logs succeeded.
192.168.1.68: OK: Activated master IP address.
192.168.1.69: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.1.69: OK: Applying all logs succeeded. Slave started, replicating from 192.168.1.68.
192.168.1.68: Resetting slave info succeeded.
Master failover to 192.168.1.68(192.168.1.68:3306) completed successfully.
再登录slave2服务器,看slave 的状态
[root@shenxiang69 ~]# mysql -uroot -p123456 -e "show slave status\G" Warning: Using a password on the command line interface can be insecure. *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.1.68 Master_User: slave Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin-master.000007 Read_Master_Log_Pos: 120 Relay_Log_File: shenxiang69-relay-bin.000002 Relay_Log_Pos: 290 Relay_Master_Log_File: mysql-bin-master.000007 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 120 Relay_Log_Space: 469 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 2 Master_UUID: a6181d83-f21b-11e6-bd9a-000c29271de5 Master_Info_File: /data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0
测试成功,如上所示,原67主机MySQL连接失败,68主机当选新的master,61主机记录最近的日志,大功告成。
查看Write VIP 192.168.1.222
[root@shenxiang68 ~]# ifconfig eth0 Link encap:Ethernet HWaddr 00:0C:29:27:1D:E5 inet addr:192.168.1.68 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: fe80::20c:29ff:fe27:1de5/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:92752 errors:0 dropped:0 overruns:0 frame:0 TX packets:23849 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:8348697 (7.9 MiB) TX bytes:6319349 (6.0 MiB) eth0:1 Link encap:Ethernet HWaddr 00:0C:29:27:1D:E5 inet addr:192.168.1.222 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:920 errors:0 dropped:0 overruns:0 frame:0 TX packets:920 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:78012 (76.1 KiB) TX bytes:78012 (76.1 KiB)
6、重新把主机shenxiang67加入MHA集群
1) 查看新的Master(shenxiang68主机)状态
[root@shenxiang68 ~]# mysql -uroot -p123456 -e "show master status\G" Warning: Using a password on the command line interface can be insecure. *************************** 1. row *************************** File: mysql-bin-master.000007 Position: 120 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set:
2) 配置主从加入集群
mysql> stop slave; CHANGE MASTER TO MASTER_HOST='192.168.1.68',MASTER_PORT=3306,MASTER_LOG_FILE='mysql-bin-master.000013',MASTER_LOG_POS=120,MASTER_USER='slave',MASTER_PASSWORD='123456';start slave; Query OK, 0 rows affected (0.01 sec) Query OK, 0 rows affected, 2 warnings (0.01 sec) Query OK, 0 rows affected (0.01 sec)
红色部分可以用manager的日志中截取
[root@shenxiang61 ~]# grep -i "All other slaves should start" /etc/masterha/app1/manager.log Fri Feb 17 21:28:43 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.1.68', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin-master.000013', MASTER_LOG_POS=120, MASTER_USER='slave', MASTER_PASSWORD='xxx'; mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.1.68 Master_User: slave Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin-master.000013 Read_Master_Log_Pos: 120 Relay_Log_File: shenxiang67-relay-bin.000002 Relay_Log_Pos: 290 Relay_Master_Log_File: mysql-bin-master.000013 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 120 Relay_Log_Space: 469 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 2 Master_UUID: a6181d83-f21b-11e6-bd9a-000c29271de5 Master_Info_File: /data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 1 row in set (0.00 sec)
3) 重新启动MHA
清空日志,并启动MHA[root@shenxiang61 app1]# >/etc/masterha/app1/manager.log [root@shenxiang61 app1]# masterha_check_status --conf=/etc/masterha/app1.cnf app1 is stopped(2:NOT_RUNNING). [root@shenxiang61 app1]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log < /dev/null 2>&1 & [1] 16927 [root@shenxiang61 app1]# masterha_check_status --conf=/etc/masterha/app1.cnf app1 (pid:16927) is running(0:PING_OK), master:192.168.1.68
4) 查看日志
[root@shenxiang61 ~]# tailf /etc/masterha/app1/manager.log Fri Feb 17 21:49:51 2017 - [info] MHA::MasterMonitor version 0.53. Fri Feb 17 21:49:51 2017 - [info] Dead Servers: Fri Feb 17 21:49:51 2017 - [info] Alive Servers: Fri Feb 17 21:49:51 2017 - [info] 192.168.1.67(192.168.1.67:3306) Fri Feb 17 21:49:51 2017 - [info] 192.168.1.68(192.168.1.68:3306) Fri Feb 17 21:49:51 2017 - [info] 192.168.1.69(192.168.1.69:3306) Fri Feb 17 21:49:51 2017 - [info] Alive Slaves: Fri Feb 17 21:49:51 2017 - [info] 192.168.1.67(192.168.1.67:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled Fri Feb 17 21:49:51 2017 - [info] Replicating from 192.168.1.68(192.168.1.68:3306) Fri Feb 17 21:49:51 2017 - [info] Primary candidate for the new Master (candidate_master is set) Fri Feb 17 21:49:51 2017 - [info] 192.168.1.69(192.168.1.69:3306) Version=5.6.26-log (oldest major version between slaves) log-bin:enabled Fri Feb 17 21:49:51 2017 - [info] Replicating from 192.168.1.68(192.168.1.68:3306) Fri Feb 17 21:49:51 2017 - [info] Not candidate for the new Master (no_master is set) Fri Feb 17 21:49:51 2017 - [info] Current Alive Master: 192.168.1.68(192.168.1.68:3306) Fri Feb 17 21:49:51 2017 - [info] Checking slave configurations.. Fri Feb 17 21:49:51 2017 - [info] read_only=1 is not set on slave 192.168.1.67(192.168.1.67:3306). Fri Feb 17 21:49:51 2017 - [warning] relay_log_purge=0 is not set on slave 192.168.1.67(192.168.1.67:3306). Fri Feb 17 21:49:51 2017 - [warning] relay_log_purge=0 is not set on slave 192.168.1.69(192.168.1.69:3306). Fri Feb 17 21:49:51 2017 - [info] Checking replication filtering settings.. Fri Feb 17 21:49:51 2017 - [info] binlog_do_db= , binlog_ignore_db= Fri Feb 17 21:49:51 2017 - [info] Replication filtering check ok. Fri Feb 17 21:49:51 2017 - [info] Starting SSH connection tests.. Fri Feb 17 21:49:52 2017 - [info] All SSH connection tests passed successfully. Fri Feb 17 21:49:52 2017 - [info] Checking MHA Node version.. Fri Feb 17 21:49:53 2017 - [info] Version check ok. Fri Feb 17 21:49:53 2017 - [info] Checking SSH publickey authentication settings on the current master.. Fri Feb 17 21:49:53 2017 - [info] HealthCheck: SSH to 192.168.1.68 is reachable. Fri Feb 17 21:49:53 2017 - [info] Master MHA Node version is 0.53. Fri Feb 17 21:49:53 2017 - [info] Checking recovery script configurations on the current master.. Fri Feb 17 21:49:53 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.53 --start_file=mysql-bin-master.000013 Fri Feb 17 21:49:53 2017 - [info] Connecting to root@192.168.1.68(192.168.1.68).. Creating /var/tmp if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /data, up to mysql-bin-master.000013 Fri Feb 17 21:49:53 2017 - [info] Master setting check done. Fri Feb 17 21:49:53 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Fri Feb 17 21:49:53 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=sxooky --slave_host=192.168.1.67 --slave_ip=192.168.1.67 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.26-log --manager_version=0.53 --relay_log_info=/data/relay-log.info --relay_dir=/data/ --slave_pass=xxx Fri Feb 17 21:49:53 2017 - [info] Connecting to root@192.168.1.67(192.168.1.67:22).. Checking slave recovery environment settings.. Opening /data/relay-log.info ... ok. Relay log found at /data, up to shenxiang67-relay-bin.000002 Temporary relay log file is /data/shenxiang67-relay-bin.000002 Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Fri Feb 17 21:49:53 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=sxooky --slave_host=192.168.1.69 --slave_ip=192.168.1.69 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.26-log --manager_version=0.53 --relay_log_info=/data/relay-log.info --relay_dir=/data/ --slave_pass=xxx Fri Feb 17 21:49:53 2017 - [info] Connecting to root@192.168.1.69(192.168.1.69:22).. Checking slave recovery environment settings.. Opening /data/relay-log.info ... ok. Relay log found at /data, up to shenxiang69-relay-bin.000002 Temporary relay log file is /data/shenxiang69-relay-bin.000002 Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Fri Feb 17 21:49:53 2017 - [info] Slaves settings check done. Fri Feb 17 21:49:53 2017 - [info] 192.168.1.68 (current master) +--192.168.1.67 +--192.168.1.69 Fri Feb 17 21:49:53 2017 - [info] Checking master_ip_failover_script status: Fri Feb 17 21:49:53 2017 - [info] /usr/bin/masterha_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.68 --orig_master_ip=192.168.1.68 --orig_master_port=3306 IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.222/24=== Checking the Status of the script.. OK Fri Feb 17 21:49:53 2017 - [info] OK. Fri Feb 17 21:49:53 2017 - [warning] shutdown_script is not defined. Fri Feb 17 21:49:53 2017 - [info] Set master ping interval 1 seconds. Fri Feb 17 21:49:53 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes. Fri Feb 17 21:49:53 2017 - [info] Starting ping health check on 192.168.1.68(192.168.1.68:3306).. Fri Feb 17 21:49:53 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
六、mha日常维护命令
1、查看ssh登陆是否成功
masterha_check_ssh --conf=/etc/masterha/app1.cnf
2、查看复制是否建立好
masterha_check_repl --conf=/etc/masterha/app1.cnf
3、启动mha
nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &
当有slave节点宕掉的情况是启动不了的,加上–ignore_fail_on_start即使有节点宕掉也能启动mha
nohup masterha_manager –conf=/etc/masterha/app1.cnf –ignore_fail_on_start > /tmp/mha_manager.log < /dev/null 2>&1 &
4、检查启动的状态
masterha_check_status --conf=/etc/masterha/app1.cnf
5、停止mha
masterha_stop --conf=/etc/masterha/app1.cnf
6、failover后下次重启
每次failover切换后会在管理目录生成文件failover.complete ,下次在切换的时候会发现有这个文件导致切换不成功,需要手动清理掉。rm -rf /masterha/app1/app1.failover.complete 也可以加上参数–ignore_last_failover
7、手工failover
手工failover场景,master死掉,但是masterha_manager没有开启,可以通过手工failover:masterha_master_switch --conf=/etc/masterha/app1.cnf --dead_master_host=10.50.2.10 --master_state=dead --new_master_host=10.50.2.12 --ignore_last_failover
8、masterha_manager
是一种监视和故障转移的程序。另一方面,masterha_master_switch程序不监控主库。 masterha_master_switch可以用于主库故障转移,也可用于在线总开关。9、手动在线切换
masterha_master_switch --conf=/etc/app1.cnf --master_state=alive --new_master_host=192.168.119.74 --orig_master_is_new_slave
或者
masterha_master_switch --conf=/etc/app1.cnf --master_state=alive --new_master_host=192.168.119.74 --orig_master_is_new_slave --running_updates_limit=10000
–orig_master_is_new_slave切换时加上此参数是将原master变为slave节点,如果不加此参数,原来的master将不启动
–running_updates_limit=10000 切换时候选master如果有延迟的话,mha切换不能成功,加上此参数表示延迟在此时间范围内都可切换(单位为s),但是切换的时间长短是由recover时relay日志的大小决定
手动在线切换mha,切换时需要将在运行的mha停掉后才能切换。
在备库先执行DDL,一般先stop slave,一般不记录mysql日志,可以通过set SQL_LOG_BIN = 0实现。然后进行一次主备切换操作,再在原来的主库上执行DDL。这种方法适用于增减索引,如果是增加字段就需要额外注意。
可以通过如下命令停止mha
masterha_stop –conf=/etc/app1.cnf
相关文章推荐
- MySQL 高可用集群架构 MHA
- mysql高可用集群之MHA和Galera Cluster
- mysql高可用集群——MHA架构
- mysql的MHA高可用集群
- mysql进阶(三)MHA高可用集群
- centos6.4下mysql高可用架构MHA搭建
- MySQL高可用架构之MHA
- MYSQL + MHA +keepalive + VIP安装配置(二)--MHA的配置
- <练习>利用heartbeat实现双节点mysql高可用集群
- Corosync+pacemaker实现基于drbd分散式存储的mysql高可用集群
- 探索MySQL高可用架构之MHA(3)
- MySQL高可用集群之MySQL-MMM
- 基于MHA的MySQL高可用架构的实现
- MySQL高可用方案MHA的部署和原理
- 轻松构建Mysql高可用集群系统
- Mysql集群架构MHA应用实战
- Mysql5.5部署MHA
- MySQL高可用性之MHA
- 基于keepalived搭建MySQL的高可用集群
- MySQL-MHA高可用+Atlas读写分离配置