
Implementing a MySQL High-Availability Architecture with MHA

2017-11-22 22:02
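  MHA is an open-source high-availability tool for MySQL that automates master failover in a master-slave replication topology. When MHA detects that the master has failed, it promotes the slave holding the most recent data to be the new master, and it pulls additional information from the other nodes during the switch to avoid consistency problems. MHA also supports online switching of the master. Failover can complete within 30 seconds, and data consistency is preserved as far as possible while it runs.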



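MHA has two roles: the MHA Manager (management node) and the MHA Node (data node).

  MHA Manager: usually deployed on a dedicated machine, from which it can manage several master/slave clusters. Each master/slave cluster is called an application, and the manager coordinates the cluster as a whole.

  MHA Node: runs on every MySQL server. It provides scripts that parse and purge logs, which speeds up failover, and it acts mainly as the agent that receives and executes the manager's instructions, so it has to be installed on every MySQL node.

  Put simply, the manager monitors every node in the cluster and automatically identifies which one is the master. When the master goes down, the manager saves the old master's binary logs, finds the server with the most recent data, promotes it to master, applies the saved binary log events to it, and repoints the remaining nodes at the new master.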

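To summarize how MHA works:

 1. Save the binary log events from the crashed master.

 2. Identify the slave with the most recent updates.

 3. Apply the differential relay logs to the other slaves.

 4. Apply the binary log events saved from the master.

 5. Promote one slave to be the new master.

 6. Point the other slaves at the new master and resume replication.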
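
Step 2 of the list above can be illustrated with a small shell function. This is only a sketch of the idea, not MHA's own code: it assumes you have already collected one line per slave containing the host, its `Master_Log_File`, and its `Read_Master_Log_Pos` (the function name `pick_latest_slave` and the input format are made up for this example), and it returns the slave that has read furthest into the master's binary log.

```shell
# Illustrative only, not part of MHA.
# Input on stdin, one slave per line:
#   <host> <Master_Log_File> <Read_Master_Log_Pos>
# Output: the host that has read furthest into the master's binary log.
pick_latest_slave() {
    # Binlog file names carry a zero-padded sequence number, so a plain
    # string sort orders them; the position is compared numerically.
    LC_ALL=C sort -k2,2 -k3,3n | tail -n 1 | awk '{print $1}'
}
```

For example, feeding it `192.168.216.17 master-bin.000003 245` and `192.168.216.16 master-bin.000003 512` selects 192.168.216.16, because both slaves are on the same file and the second has the higher position.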

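[Note] The manager has to reach every node in the cluster, and the nodes also copy data among themselves, so all of these hosts need passwordless (key-based) SSH login to one another.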

[Lab] Building an MHA high-availability setup for MySQL

Lab environment:

Four hosts:

  manager:192.168.216.15

  master:192.168.216.13

  slave1:192.168.216.17

  slave2:192.168.216.16

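1. Install the node package on all four hosts. You can build from source or install with yum; yum is used here.

The manager host additionally needs the mha4mysql-manager package, which provides the monitoring.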

manager:
[root@centos7 ~]# yum install mha4mysql-manager-0.56-0.el6.noarch.rpm  mha4mysql-node-0.56-0.el6.noarch.rpm

master and slaves:
[root@centos7 ~]# yum install  mha4mysql-node-0.56-0.el6.noarch.rpm


2. First set up MySQL replication with one master and two slaves.

Master configuration:

[root@centos7 ~]# vim /etc/my.cnf
[mysqld]
server_id=1             #unique server ID for the master
log_bin=master-bin      #enable the binary log
relay_log=relay_log     #relay log
skip_name_resolve=on    #skip hostname resolution

[root@centos7 ~]# systemctl start mariadb
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> select user,host,password from mysql.user;
+-------+----------------+-------------------------------------------+
| user  | host           | password                                  |
+-------+----------------+-------------------------------------------+
| root  | localhost      |                                           |
| root  | 127.0.0.1      |                                           |
| root  | ::1            |                                           |
+-------+----------------+-------------------------------------------+
3 rows in set (0.00 sec)
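MariaDB [(none)]> grant replication slave,replication client on *.* to slave@'%' identified by 'centos';
#Create the replication account that the slaves use to copy data

MariaDB [(none)]> grant all on *.* to admin@'%' identified by 'centos';
#This is the privileged account used by the manager. It must exist on both slaves as well; if you create it on the master after replication is running, the grant replicates to the slaves automatically.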

MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000003 |      245 |              |                  |
+-------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
#Note the master's File and Position; the slaves must be pointed at these values

MariaDB [(none)]> show slave hosts;
Empty set (0.00 sec)
#Before any slave is configured, the master's slave host list is empty


3. Configure the two slave nodes:

[root@centos7 ~]# vim /etc/my.cnf
[mysqld]
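server_id=2         #every node in the replication cluster needs a unique ID; use 3 on slave2
log_bin=slave1-bin
relay_log=slave-relay-log
read_only=on        #a slave should be read-only
skip_name_resolve=on
relay_log_purge=0   #whether to purge relay logs automatically once they are no longer needed; 0 means off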

[root@centos7 ~]# systemctl start mariadb
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> change master to master_host='192.168.216.13',master_user='slave',master_password='centos',master_log_file='master-bin.000003',master_log_pos=245;
Query OK, 0 rows affected (0.01 sec)
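#Point the slave at the master and start replaying events from the master's binary log; the file and position must match the master's

MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
#Start the replication threads
MariaDB [(none)]> show slave status\G;  #check the slave's status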
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.216.13
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000003
Read_Master_Log_Pos: 245
Relay_Log_File: slave-relay-log.000002
Relay_Log_Pos: 530
Relay_Master_Log_File: master-bin.000003
Slave_IO_Running: Yes     #the IO thread is running normally
Slave_SQL_Running: Yes     #the SQL thread is running normally
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 245
Relay_Log_Space: 824
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)

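ERROR: No query specified
#This error is harmless: \G already terminates the statement, so the extra ';' is sent as an empty query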
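
The two fields that matter most in the output above are `Slave_IO_Running` and `Slave_SQL_Running`. As a rough sketch, they could be checked from a script instead of by eye. The function name `replication_healthy` is invented for this example, and it assumes the `show slave status\G` output is piped in on stdin.

```shell
# Illustrative helper, not part of MySQL or MHA.
# Reads `show slave status\G` output on stdin and returns success only
# when both the IO thread and the SQL thread report "Yes".
replication_healthy() {
    awk -F': *' '
        /^ *Slave_IO_Running:/  { io  = $2 }
        /^ *Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io ~ /^Yes/ && sql ~ /^Yes/) }
    '
}
```

On a slave it might be used as `mysql -uroot -e 'show slave status\G' | replication_healthy && echo "replication OK"`.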


4. Now check the master again; both slaves are listed:

[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> show slave hosts;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
|         3 |      | 3306 |         1 |
|         2 |      | 3306 |         1 |
+-----------+------+------+-----------+
2 rows in set (0.00 sec)
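#The Server_id column identifies each slave


In addition, when the master goes down and a new master takes over, the new master also needs an account that lets the other slaves replicate from it, so grant the replication user on the slaves in advance: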

MariaDB [(none)]> grant replication slave,replication client on *.* to slave@'%' identified by 'centos';


5. Next, configure the manager.

Installing mha4mysql-manager-0.56-0.el6.noarch.rpm does not create a configuration file, so it has to be written by hand:

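[root@centos7 ~]# vim /etc/mha_master/app1.cnf
#Comments are kept on their own lines so that they cannot be read as part of a value
[server default]
#privileged MySQL account used by the manager (created on the database nodes above)
user=admin
#its password
password=centos
#manager working directory
manager_workdir=/etc/mha_master/app1
#manager log file
manager_log=/etc/mha_master/manager.log
#working directory on the remote hosts
remote_workdir=/mydata/mha_master/app1
#SSH user
ssh_user=root
#replication account and its password
repl_user=slave
repl_password=centos
#interval between pings, in seconds
ping_interval=1

[server1]
#address of node 1
hostname=192.168.216.17
#SSH port of node 1
ssh_port=22
#whether this node may be promoted to master
candidate_master=1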

[server2]
hostname=192.168.216.16
ssh_port=22
candidate_master=1

[server3]
hostname=192.168.216.13
ssh_port=22
candidate_master=1
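
The three `[serverN]` sections differ only in the host address, so with more nodes they could be generated rather than typed. The function below is a convenience sketch under that assumption; `mha_server_stanzas` is a made-up name, and the fixed `ssh_port=22` and `candidate_master=1` simply mirror the values used in this lab.

```shell
# Illustrative helper: print one [serverN] section per host argument,
# using the same settings as the hand-written file above.
mha_server_stanzas() {
    local n=1 host
    for host in "$@"; do
        printf '[server%d]\nhostname=%s\nssh_port=22\ncandidate_master=1\n\n' "$n" "$host"
        n=$((n + 1))
    done
}
```

Running `mha_server_stanzas 192.168.216.17 192.168.216.16 192.168.216.13` prints the three sections in order, and the output can be appended below the `[server default]` section.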


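6. As noted earlier, MHA requires passwordless SSH between the nodes, so set up key-based login now. One host is shown as an example; repeat the same steps on the others.

[root@centos7 ~]# ssh-keygen
[root@centos7 ~]# ssh-copy-id -i ~/.ssh/id_rsa  root@192.168.216.13       #repeat for each of the other hosts (not shown)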
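
Because the same `ssh-copy-id` command has to be repeated for every host, a loop saves some typing. The following is a sketch only: the variable `MHA_HOSTS`, the function name `copy_key_to_all`, and the `DRY_RUN` switch are all invented for this example, and the host list is the lab environment from this article.

```shell
# Hosts from the lab environment (manager, master, slave1, slave2).
MHA_HOSTS="192.168.216.15 192.168.216.13 192.168.216.17 192.168.216.16"

# Copy the local public key to root on every host in MHA_HOSTS.
# With DRY_RUN=1 the commands are only printed, not executed.
copy_key_to_all() {
    local host
    for host in $MHA_HOSTS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@$host"
        else
            ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$host"
        fi
    done
}
```

Try `DRY_RUN=1 copy_key_to_all` first to see what would run, then run `copy_key_to_all` on each of the four hosts after `ssh-keygen`.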


7. Check the environment from the manager host and start the manager.

[root@web-server1 mha_master]# masterha_check_ssh -conf=/etc/mha_master/app1.cnf
#check that the SSH setup works
Wed Nov 22 14:27:16 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Nov 22 14:27:16 2017 - [info] Reading application default configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:16 2017 - [info] Reading server configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:16 2017 - [info] Starting SSH connection tests..
Wed Nov 22 14:27:17 2017 - [debug]
Wed Nov 22 14:27:16 2017 - [debug]  Connecting via SSH from root@192.168.216.17(192.168.216.17:22) to root@192.168.216.16(192.168.216.16:22)..
Wed Nov 22 14:27:17 2017 - [debug]   ok.
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.17(192.168.216.17:22) to root@192.168.216.13(192.168.216.13:22)..
Wed Nov 22 14:27:17 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [debug]
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.16(192.168.216.16:22) to root@192.168.216.17(192.168.216.17:22)..
Wed Nov 22 14:27:17 2017 - [debug]   ok.
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.16(192.168.216.16:22) to root@192.168.216.13(192.168.216.13:22)..
Wed Nov 22 14:27:18 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [debug]
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.13(192.168.216.13:22) to root@192.168.216.17(192.168.216.17:22)..
Wed Nov 22 14:27:18 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [debug]  Connecting via SSH from root@192.168.216.13(192.168.216.13:22) to root@192.168.216.16(192.168.216.16:22)..
Wed Nov 22 14:27:18 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [info] All SSH connection tests passed successfully.
#"successfully" means the check passed

[root@web-server1 mha_master]# masterha_check_repl -conf=/etc/mha_master/app1.cnf
#check that replication is set up correctly
Wed Nov 22 14:27:40 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Nov 22 14:27:40 2017 - [info] Reading application default configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:40 2017 - [info] Reading server configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:40 2017 - [info] MHA::MasterMonitor version 0.56.
Wed Nov 22 14:27:40 2017 - [info] GTID failover mode = 0
Wed Nov 22 14:27:40 2017 - [info] Dead Servers:
Wed Nov 22 14:27:40 2017 - [info] Alive Servers:
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.17(192.168.216.17:3306)
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.16(192.168.216.16:3306)
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info] Alive Slaves:
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.17(192.168.216.17:3306)  Version=5.5.52-MariaDB (oldest major version between slaves) log-bin:enabled
Wed Nov 22 14:27:40 2017 - [info]     Replicating from 192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.16(192.168.216.16:3306)  Version=5.5.52-MariaDB (oldest major version between slaves) log-bin:enabled
Wed Nov 22 14:27:40 2017 - [info]     Replicating from 192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Nov 22 14:27:40 2017 - [info] Current Alive Master: 192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info] Checking slave configurations..
Wed Nov 22 14:27:40 2017 - [warning]  relay_log_purge=0 is not set on slave 192.168.216.17(192.168.216.17:3306).
Wed Nov 22 14:27:40 2017 - [warning]  relay_log_purge=0 is not set on slave 192.168.216.16(192.168.216.16:3306).
Wed Nov 22 14:27:40 2017 - [info] Checking replication filtering settings..
Wed Nov 22 14:27:40 2017 - [info]  binlog_do_db= , binlog_ignore_db=
Wed Nov 22 14:27:40 2017 - [info]  Replication filtering check ok.
Wed Nov 22 14:27:40 2017 - [info] GTID (with auto-pos) is not supported
Wed Nov 22 14:27:40 2017 - [info] Starting SSH connection tests..
Wed Nov 22 14:27:42 2017 - [info] All SSH connection tests passed successfully.
Wed Nov 22 14:27:42 2017 - [info] Checking MHA Node version..
Wed Nov 22 14:27:42 2017 - [info]  Version check ok.
Wed Nov 22 14:27:42 2017 - [info] Checking SSH publickey authentication settings on the current master..
Wed Nov 22 14:27:43 2017 - [info] HealthCheck: SSH to 192.168.216.13 is reachable.
Wed Nov 22 14:27:43 2017 - [info] Master MHA Node version is 0.56.
Wed Nov 22 14:27:43 2017 - [info] Checking recovery script configurations on 192.168.216.13(192.168.216.13:3306)..
Wed Nov 22 14:27:43 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/mydata/mha_master/app1/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000003
Wed Nov 22 14:27:43 2017 - [info]   Connecting to root@192.168.216.13(192.168.216.13:22)..
Creating /mydata/mha_master/app1 if not exists..    ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to master-bin.000003
Wed Nov 22 14:27:43 2017 - [info] Binlog setting check done.
Wed Nov 22 14:27:43 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Nov 22 14:27:43 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='admin' --slave_host=192.168.216.17 --slave_ip=192.168.216.17 --slave_port=3306 --workdir=/mydata/mha_master/app1 --target_version=5.5.52-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Wed Nov 22 14:27:43 2017 - [info]   Connecting to root@192.168.216.17(192.168.216.17:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to slave-relay-log.000002
Temporary relay log file is /var/lib/mysql/slave-relay-log.000002
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Nov 22 14:27:44 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='admin' --slave_host=192.168.216.16 --slave_ip=192.168.216.16 --slave_port=3306 --workdir=/mydata/mha_master/app1 --target_version=5.5.52-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Wed Nov 22 14:27:44 2017 - [info]   Connecting to root@192.168.216.16(192.168.216.16:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to slave-relay-log.000002
Temporary relay log file is /var/lib/mysql/slave-relay-log.000002
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Nov 22 14:27:44 2017 - [info] Slaves settings check done.
Wed Nov 22 14:27:44 2017 - [info]
192.168.216.13(192.168.216.13:3306) (current master)
+--192.168.216.17(192.168.216.17:3306)
+--192.168.216.16(192.168.216.16:3306)

Wed Nov 22 14:27:44 2017 - [info] Checking replication health on 192.168.216.17..
Wed Nov 22 14:27:44 2017 - [info]  ok.
Wed Nov 22 14:27:44 2017 - [info] Checking replication health on 192.168.216.16..
Wed Nov 22 14:27:44 2017 - [info]  ok.
Wed Nov 22 14:27:44 2017 - [warning] master_ip_failover_script is not defined.
Wed Nov 22 14:27:44 2017 - [warning] shutdown_script is not defined.
Wed Nov 22 14:27:44 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
#"OK" means the check passed

[root@web-server1 mha_master]# nohup masterha_manager -conf=/etc/mha_master/app1.cnf &> /etc/mha_master/manager.log &
#start the manager and run it in the background
[1] 4330
[root@web-server1 mha_master]# jobs
[1]+  Running                 nohup masterha_manager -conf=/etc/mha_master/app1.cnf &>/etc/mha_master/manager.log &
[root@web-server1 mha_master]# masterha_check_status -conf=/etc/mha_master/app1.cnf
#check the manager's health
app1 (pid:4330) is running(0:PING_OK), master:192.168.216.13
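
If the status check is going to be run from a script, the current master can be pulled out of a line like the one above. This is a minimal sketch: `mha_current_master` is a hypothetical name, and the pattern is written against the exact output format shown in this transcript, so it prints nothing when the manager is not reporting `PING_OK`.

```shell
# Illustrative helper: read masterha_check_status output on stdin and
# print the master address when the manager reports 0:PING_OK.
mha_current_master() {
    sed -n 's/.*is running(0:PING_OK), master:\([^ ]*\).*/\1/p'
}
```

For example, `masterha_check_status -conf=/etc/mha_master/app1.cnf | mha_current_master` would print 192.168.216.13 at this point in the lab.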


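Directory listing: [screenshot not available]

With that, the MHA high-availability setup for MySQL is complete. Next, test it:

8. Stop the MySQL service on the master to simulate a crash: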

[root@centos7 ~]# systemctl stop mariadb


Now look at the log on the manager:

[root@centos7 mha_master]# ls
app1  app1.cnf  manager.log
[root@centos7 mha_master]# pwd
/etc/mha_master
[root@centos7 mha_master]# vim manager.log




The log shows clearly that the original master is down and that 192.168.216.17 has become the new master. Verify this on 192.168.216.17:

[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> show slave hosts;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
|         3 |      | 3306 |         2 |
+-----------+------+------+-----------+
1 row in set (0.00 sec)

MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| slave1-bin.000001 |      245 |              |                  |
+-------------------+----------+--------------+------------------+
1 row in set (0.00 sec)


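The database confirms that slave1 is now the master and that slave2 has been repointed to replicate from it. You can confirm this by checking slave2's database: [screenshot not available]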
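
One way to make that check on slave2 is to look at the `Master_Host` field of `show slave status\G`. The helper below is a small sketch for doing so from the shell; `replication_master_host` is an invented name, and no output is shown because it depends on the actual state of your slave.

```shell
# Illustrative helper: read `show slave status\G` output on stdin and
# print the host this slave is currently replicating from.
replication_master_host() {
    awk -F': *' '/^ *Master_Host:/ { print $2 }'
}
```

On slave2 it could be run as `mysql -uroot -e 'show slave status\G' | replication_master_host`; after the failover described here the expected answer is the address of slave1.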



The mha_master directory also contains a record of the failover: [screenshot not available]





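Note that after the original master has gone down, it does not automatically become a slave of the new master, even if its service is restarted later. To make it a slave of the new master, point it there by hand: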

[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> change master to master_host='192.168.216.17',master_user='slave',master_password='centos',master_log_file='slave1-bin.000001',master_log_pos=245;
Query OK, 0 rows affected (0.02 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)
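
The file name and position used above were read from `show master status` on the new master, which is only correct if nothing has been written there since the failover. As a hedged alternative, MHA normally records in the manager log the statement that the other slaves should use; the sketch below pulls out the last such line. The function name `mha_change_master_stmt` is hypothetical, the exact log wording is an assumption, and the password in the logged statement is expected to be masked, so the real one would have to be filled in by hand.

```shell
# Illustrative helper: print the last "CHANGE MASTER TO ..." statement
# found in the given manager log file.
# $1: path to the manager log, e.g. /etc/mha_master/manager.log
mha_change_master_stmt() {
    grep -o 'CHANGE MASTER TO .*' "$1" | tail -n 1
}
```

For example, `mha_change_master_stmt /etc/mha_master/manager.log` on the manager host prints the statement, if the log contains one.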


Then check the slave hosts on the new master (slave1) to see whether the change took effect:

[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> show slave hosts;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
|         1 |      | 3306 |         2 |
|         3 |      | 3306 |         2 |
+-----------+------+------+-----------+
2 rows in set (0.00 sec)


The new MySQL master-slave replication topology is now in place.