Implementing a MySQL High-Availability Architecture with MHA
2017-11-22 22:02
MHA is an open-source MySQL high-availability program that provides automated master failover for MySQL master/slave replication setups. When MHA detects that the master node has failed, it promotes the slave holding the most recent data to be the new master. During this process, MHA gathers additional information from the other nodes to avoid consistency problems. MHA also supports online master switchover, can complete a failover within about 30 seconds, and preserves data consistency as far as possible while doing so.
An MHA deployment has two roles: the MHA Manager (management node) and the MHA Node (data node).
MHA Manager: usually deployed on a separate machine to manage multiple master/slave clusters; each master/slave cluster is called an application. It coordinates and oversees the whole cluster.
MHA Node: runs on every MySQL server. It speeds up failover through scripts that monitor, parse, and purge logs; it is essentially an agent that carries out commands issued by the management node, and this agent must run on every MySQL node.
Put simply, the manager monitors every node in the cluster group and automatically identifies the master. When the master goes down, the manager saves the failed master's binary logs, finds the server holding the most recent data, promotes it to be the new master, applies the saved binary log events from the old master to the new one, and repoints the remaining nodes at the new master.
To summarize how MHA works:
1. Save binary log events from the crashed master
2. Identify the slave with the most recent updates
3. Apply the differential relay logs to the other slaves
4. Apply the binary log events saved from the master
5. Promote one slave to be the new master
6. Point the other slaves at the new master and resume replication
[Note] Because the manager must monitor every node in the cluster, and the machines in the cluster replicate data among themselves, passwordless (key-based) SSH login must be set up between all of them.
[Lab] Implementing MySQL high availability with MHA
Lab environment, four hosts:
manager:192.168.216.15
master:192.168.216.13
slave1:192.168.216.17
slave2:192.168.216.16
1. Install the node package on all four hosts. You can build from source or install via yum; yum is used here.
The manager host must additionally install the mha4mysql-manager package, which performs the monitoring.
manager:
[root@centos7 ~]# yum install mha4mysql-manager-0.56-0.el6.noarch.rpm mha4mysql-node-0.56-0.el6.noarch.rpm
master and slaves:
[root@centos7 ~]# yum install mha4mysql-node-0.56-0.el6.noarch.rpm
2. First, set up one-master/two-slave MySQL replication.
Master configuration:
[root@centos7 ~]# vim /etc/my.cnf
[mysqld]
server_id=1              # give the master a unique server ID
log_bin=master-bin       # enable the binary log
relay_log=relay_log      # relay log
skip_name_resolve=on     # skip name resolution
[root@centos7 ~]# systemctl start mariadb
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> select user,host,password from mysql.user;
+-------+----------------+-------------------------------------------+
| user  | host           | password                                  |
+-------+----------------+-------------------------------------------+
| root  | localhost      |                                           |
| root  | 127.0.0.1      |                                           |
| root  | ::1            |                                           |
+-------+----------------+-------------------------------------------+
3 rows in set (0.00 sec)
MariaDB [(none)]> grant replication slave,replication client on *.* to slave@'%' identified by 'centos';
# grant the replication user so the slaves can copy data
MariaDB [(none)]> grant all on *.* to admin@'%' identified by 'centos';
# this is the superuser for the manager; the two slaves need it as well. Once replication is running, you can issue it on the master and the slaves will replicate it automatically
MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000003 |      245 |              |                  |
+-------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
# note the master's File and Position; the slaves must point at these values
MariaDB [(none)]> show slave hosts;
Empty set (0.00 sec)
# before any slave is configured, the master's slave host list is empty
3. Configuration on the two slave nodes:
[root@centos7 ~]# vim /etc/my.cnf
[mysqld]
server_id=2              # every node in the replication cluster needs a unique ID; slave2 uses server_id=3
log_bin=slave1-bin
relay_log=slave-relay-log
read_only=on             # a slave should be read-only
skip_name_resolve=on
relay_log_purge=0        # do not automatically purge relay logs that are no longer needed (0 = off)
[root@centos7 ~]# systemctl start mariadb
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> change master to master_host='192.168.216.13',master_user='slave',master_password='centos',master_log_file='master-bin.000003',master_log_pos=245;
Query OK, 0 rows affected (0.01 sec)
# point the slave at the master and start replaying events from the master's binary log; the log file and position must match the master's SHOW MASTER STATUS
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
# start the replication threads
MariaDB [(none)]> show slave status\G     # check the slave's status
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.216.13
                  Master_User: slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000003
          Read_Master_Log_Pos: 245
               Relay_Log_File: slave-relay-log.000002
                Relay_Log_Pos: 530
        Relay_Master_Log_File: master-bin.000003
             Slave_IO_Running: Yes      # the IO thread is running
            Slave_SQL_Running: Yes      # the SQL thread is running
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 245
              Relay_Log_Space: 824
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)
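The two Yes flags above are the key health indicators. As a small illustrative sketch (not part of MHA itself), a shell function can parse SHOW SLAVE STATUS output and report whether both replication threads are running; the sample output is hard-coded here, but in practice it would come from `mysql -uroot -e 'show slave status\G'`:

```shell
#!/bin/sh
# Sketch: check that both replication threads report "Yes" in
# SHOW SLAVE STATUS\G output. The sample below is hard-coded for illustration.

slave_status='Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0'

check_replication() {
    # count how many of the two thread flags are "Yes"
    yes_count=$(printf '%s\n' "$1" | grep -cE 'Slave_(IO|SQL)_Running: Yes')
    if [ "$yes_count" -eq 2 ]; then
        echo "replication OK"
    else
        echo "replication BROKEN"
    fi
}

check_replication "$slave_status"
```

A check like this can run from cron on each slave as a cheap supplement to the manager's own monitoring.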
4. Now check the master host; the two slaves are clearly visible:
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> show slave hosts;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
|         3 |      | 3306 |         1 |
|         2 |      | 3306 |         1 |
+-----------+------+------+-----------+
2 rows in set (0.00 sec)
# the slaves are listed by their server_id values
Also, when the master goes down and a new master takes over, the new master will need a replication user so the other slaves can replicate from it. Therefore grant the replication user on the slaves in advance as well:
MariaDB [(none)]> grant replication slave,replication client on *.* to slave@'%' identified by 'centos';
5. Next, configure the manager.
The mha4mysql-manager-0.56-0.el6.noarch.rpm package does not ship a default configuration file, so the configuration file must be written by hand:
[root@centos7 ~]# vim /etc/mha_master/app1.cnf
[server default]
user=admin                               # the manager's superuser, created earlier on the database nodes
password=centos                          # its password
manager_workdir=/etc/mha_master/app1     # manager working directory
manager_log=/etc/mha_master/manager.log  # manager log file
remote_workdir=/mydata/mha_master/app1   # working directory on the remote hosts
ssh_user=root                            # SSH user
repl_user=slave                          # replication user granted on the nodes
repl_password=centos
ping_interval=1                          # ping interval in seconds
[server1]
hostname=192.168.216.17                  # address of node 1
ssh_port=22                              # SSH port of node 1
candidate_master=1                       # whether this node may be promoted to master
[server2]
hostname=192.168.216.16
ssh_port=22
candidate_master=1
[server3]
hostname=192.168.216.13
ssh_port=22
candidate_master=1
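Because app1.cnf is written by hand, a missing key in one of the [serverN] sections is an easy mistake. As a hedged sketch (the awk parse assumes the plain key=value layout used above), a small script can verify that every numbered server section declares a hostname before masterha_check_ssh is even run; the sample config file here is an illustrative copy, not the real /etc/mha_master/app1.cnf:

```shell
#!/bin/sh
# Sketch: verify every [serverN] section of an MHA-style config declares a
# hostname. A trimmed sample config is embedded for illustration; in practice
# you would point check_conf at /etc/mha_master/app1.cnf.

check_conf() {
    awk '
        /^\[server[0-9]+\]/ { if (insec && !seen) bad++; insec=1; seen=0; next }
        insec && /^hostname=/ { seen=1 }
        END { if (insec && !seen) bad++; print (bad ? "missing hostname" : "config OK") }
    ' "$1"
}

cat > /tmp/app1_sample.cnf <<'EOF'
[server default]
user=admin
[server1]
hostname=192.168.216.17
ssh_port=22
[server2]
hostname=192.168.216.16
EOF

check_conf /tmp/app1_sample.cnf
```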
6. As noted earlier, an MHA setup requires passwordless login between the nodes, so configure passwordless SSH. One machine is shown as an example; the others are configured the same way.
[root@centos7 ~]# ssh-keygen
[root@centos7 ~]# ssh-copy-id -i ~/.ssh/id_rsa root@192.168.216.13
# send the key to each of the other hosts in the same way (not shown one by one)
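Repeating ssh-copy-id for every host pair by hand is tedious. A hedged sketch of a distribution loop follows; the HOSTS list and the DRY_RUN switch are illustrative additions (not from the original setup), and with DRY_RUN=1 the script only prints what it would do:

```shell
#!/bin/sh
# Sketch: push the local public key to every node in the cluster.
# HOSTS and DRY_RUN are illustrative; set DRY_RUN=0 to actually copy keys.

HOSTS="192.168.216.13 192.168.216.15 192.168.216.16 192.168.216.17"
DRY_RUN=${DRY_RUN:-1}

distribute_keys() {
    for h in $HOSTS; do
        if [ "$DRY_RUN" -eq 1 ]; then
            echo "would run: ssh-copy-id -i ~/.ssh/id_rsa root@$h"
        else
            ssh-copy-id -i ~/.ssh/id_rsa "root@$h"
        fi
    done
}

distribute_keys
```

Run this once on each of the four hosts so that every node can reach every other node, which is what masterha_check_ssh verifies next.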
7. Check the environment from the manager host and start the manager.
[root@web-server1 mha_master]# masterha_check_ssh -conf=/etc/mha_master/app1.cnf     # check SSH connectivity between the nodes
Wed Nov 22 14:27:16 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Nov 22 14:27:16 2017 - [info] Reading application default configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:16 2017 - [info] Reading server configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:16 2017 - [info] Starting SSH connection tests..
Wed Nov 22 14:27:17 2017 - [debug]
Wed Nov 22 14:27:16 2017 - [debug]  Connecting via SSH from root@192.168.216.17(192.168.216.17:22) to root@192.168.216.16(192.168.216.16:22)..
Wed Nov 22 14:27:17 2017 - [debug]   ok.
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.17(192.168.216.17:22) to root@192.168.216.13(192.168.216.13:22)..
Wed Nov 22 14:27:17 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [debug]
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.16(192.168.216.16:22) to root@192.168.216.17(192.168.216.17:22)..
Wed Nov 22 14:27:17 2017 - [debug]   ok.
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.16(192.168.216.16:22) to root@192.168.216.13(192.168.216.13:22)..
Wed Nov 22 14:27:18 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [debug]
Wed Nov 22 14:27:17 2017 - [debug]  Connecting via SSH from root@192.168.216.13(192.168.216.13:22) to root@192.168.216.17(192.168.216.17:22)..
Wed Nov 22 14:27:18 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [debug]  Connecting via SSH from root@192.168.216.13(192.168.216.13:22) to root@192.168.216.16(192.168.216.16:22)..
Wed Nov 22 14:27:18 2017 - [debug]   ok.
Wed Nov 22 14:27:18 2017 - [info] All SSH connection tests passed successfully.
# "passed successfully" means the SSH check is good
[root@web-server1 mha_master]# masterha_check_repl -conf=/etc/mha_master/app1.cnf    # check that the replication setup is sound
Wed Nov 22 14:27:40 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Nov 22 14:27:40 2017 - [info] Reading application default configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:40 2017 - [info] Reading server configuration from /etc/mha_master/app1.cnf..
Wed Nov 22 14:27:40 2017 - [info] MHA::MasterMonitor version 0.56.
Wed Nov 22 14:27:40 2017 - [info] GTID failover mode = 0
Wed Nov 22 14:27:40 2017 - [info] Dead Servers:
Wed Nov 22 14:27:40 2017 - [info] Alive Servers:
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.17(192.168.216.17:3306)
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.16(192.168.216.16:3306)
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info] Alive Slaves:
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.17(192.168.216.17:3306)  Version=5.5.52-MariaDB (oldest major version between slaves) log-bin:enabled
Wed Nov 22 14:27:40 2017 - [info]     Replicating from 192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Nov 22 14:27:40 2017 - [info]   192.168.216.16(192.168.216.16:3306)  Version=5.5.52-MariaDB (oldest major version between slaves) log-bin:enabled
Wed Nov 22 14:27:40 2017 - [info]     Replicating from 192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Nov 22 14:27:40 2017 - [info] Current Alive Master: 192.168.216.13(192.168.216.13:3306)
Wed Nov 22 14:27:40 2017 - [info] Checking slave configurations..
Wed Nov 22 14:27:40 2017 - [warning]  relay_log_purge=0 is not set on slave 192.168.216.17(192.168.216.17:3306).
Wed Nov 22 14:27:40 2017 - [warning]  relay_log_purge=0 is not set on slave 192.168.216.16(192.168.216.16:3306).
Wed Nov 22 14:27:40 2017 - [info] Checking replication filtering settings..
Wed Nov 22 14:27:40 2017 - [info]  binlog_do_db= , binlog_ignore_db=
Wed Nov 22 14:27:40 2017 - [info]  Replication filtering check ok.
Wed Nov 22 14:27:40 2017 - [info] GTID (with auto-pos) is not supported
Wed Nov 22 14:27:40 2017 - [info] Starting SSH connection tests..
Wed Nov 22 14:27:42 2017 - [info] All SSH connection tests passed successfully.
Wed Nov 22 14:27:42 2017 - [info] Checking MHA Node version..
Wed Nov 22 14:27:42 2017 - [info]  Version check ok.
Wed Nov 22 14:27:42 2017 - [info] Checking SSH publickey authentication settings on the current master..
Wed Nov 22 14:27:43 2017 - [info] HealthCheck: SSH to 192.168.216.13 is reachable.
Wed Nov 22 14:27:43 2017 - [info] Master MHA Node version is 0.56.
Wed Nov 22 14:27:43 2017 - [info] Checking recovery script configurations on 192.168.216.13(192.168.216.13:3306)..
Wed Nov 22 14:27:43 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/mydata/mha_master/app1/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000003
Wed Nov 22 14:27:43 2017 - [info]   Connecting to root@192.168.216.13(192.168.216.13:22)..
  Creating /mydata/mha_master/app1 if not exists.. ok.
  Checking output directory is accessible or not.. ok.
  Binlog found at /var/lib/mysql, up to master-bin.000003
Wed Nov 22 14:27:43 2017 - [info] Binlog setting check done.
Wed Nov 22 14:27:43 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Nov 22 14:27:43 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='admin' --slave_host=192.168.216.17 --slave_ip=192.168.216.17 --slave_port=3306 --workdir=/mydata/mha_master/app1 --target_version=5.5.52-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Wed Nov 22 14:27:43 2017 - [info]   Connecting to root@192.168.216.17(192.168.216.17:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to slave-relay-log.000002
    Temporary relay log file is /var/lib/mysql/slave-relay-log.000002
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Wed Nov 22 14:27:44 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='admin' --slave_host=192.168.216.16 --slave_ip=192.168.216.16 --slave_port=3306 --workdir=/mydata/mha_master/app1 --target_version=5.5.52-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Wed Nov 22 14:27:44 2017 - [info]   Connecting to root@192.168.216.16(192.168.216.16:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to slave-relay-log.000002
    Temporary relay log file is /var/lib/mysql/slave-relay-log.000002
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Wed Nov 22 14:27:44 2017 - [info] Slaves settings check done.
Wed Nov 22 14:27:44 2017 - [info]
192.168.216.13(192.168.216.13:3306) (current master)
 +--192.168.216.17(192.168.216.17:3306)
 +--192.168.216.16(192.168.216.16:3306)
Wed Nov 22 14:27:44 2017 - [info] Checking replication health on 192.168.216.17..
Wed Nov 22 14:27:44 2017 - [info]  ok.
Wed Nov 22 14:27:44 2017 - [info] Checking replication health on 192.168.216.16..
Wed Nov 22 14:27:44 2017 - [info]  ok.
Wed Nov 22 14:27:44 2017 - [warning] master_ip_failover_script is not defined.
Wed Nov 22 14:27:44 2017 - [warning] shutdown_script is not defined.
Wed Nov 22 14:27:44 2017 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
# "OK" means the replication check passed
[root@web-server1 mha_master]# nohup masterha_manager -conf=/etc/mha_master/app1.cnf &> /etc/mha_master/manager.log &
# start the manager and send it to the background
[1] 4330
[root@web-server1 mha_master]# jobs
[1]+  Running    nohup masterha_manager -conf=/etc/mha_master/app1.cnf &>/etc/mha_master/manager.log &
[root@web-server1 mha_master]# masterha_check_status -conf=/etc/mha_master/app1.cnf     # check the manager's health
app1 (pid:4330) is running(0:PING_OK), master:192.168.216.13
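Note that masterha_manager exits after it performs a failover, so some installations wrap the health check in a small watchdog. The following is only a sketch of that idea: masterha_check_status is replaced by a stub function so the logic can run standalone; on a real manager host the actual command would be used, and "needs attention" would trigger an alert or a reviewed restart rather than just a message:

```shell
#!/bin/sh
# Sketch of a manager watchdog: if the health check stops reporting PING_OK,
# flag that the manager needs attention. masterha_check_status is stubbed out
# here (echoing the output seen above) so the sketch runs standalone.

masterha_check_status() {
    echo "app1 (pid:4330) is running(0:PING_OK), master:192.168.216.13"
}

watchdog_once() {
    status=$(masterha_check_status -conf=/etc/mha_master/app1.cnf)
    case "$status" in
        *PING_OK*) echo "manager healthy" ;;
        *)         echo "manager needs attention" ;;
    esac
}

watchdog_once
```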
At this point, the MySQL high-availability MHA setup is complete. Next, test it.
8. Stop the MySQL service on the master node to simulate a crash:
[root@centos7 ~]# systemctl stop mariadb
Now check the log on the manager:
[root@centos7 mha_master]# pwd
/etc/mha_master
[root@centos7 mha_master]# ls
app1  app1.cnf  manager.log
[root@centos7 mha_master]# vim manager.log
The log clearly shows that the original master is down and 192.168.216.17 has become the new master. Verify it:
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> show slave hosts;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
|         3 |      | 3306 |         2 |
+-----------+------+------+-----------+
1 row in set (0.00 sec)
MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| slave1-bin.000001 |      245 |              |                  |
+-------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
The output shows that slave1 has become the new master and that slave2 now replicates from it; you can confirm this from slave2's side by checking its slave status. The manager's working directory also keeps a record of the failover.
Note that after the original master goes down, it will not automatically become a slave of the new master even if its service is later restarted. To make it one, point it at the new master manually:
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> change master to master_host='192.168.216.17',master_user='slave',master_password='centos',master_log_file='slave1-bin.000001',master_log_pos=245;
Query OK, 0 rows affected (0.02 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)
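The log file and position in this CHANGE MASTER TO statement must match what SHOW MASTER STATUS reports on the new master. A small sketch can derive the statement from that tabular output; the sample output string below is hard-coded for illustration, while in practice it would come from `mysql -e 'show master status'` on the new master:

```shell
#!/bin/sh
# Sketch: build a CHANGE MASTER TO statement from SHOW MASTER STATUS output.
# The sample output below is hard-coded for illustration.

master_status='File Position Binlog_Do_DB Binlog_Ignore_DB
slave1-bin.000001 245'

build_change_master() {
    # $1 = SHOW MASTER STATUS output, $2 = new master's address
    host=$2
    line=$(printf '%s\n' "$1" | sed -n 2p)        # skip the header row
    file=$(printf '%s' "$line" | awk '{print $1}') # binary log file name
    pos=$(printf '%s' "$line" | awk '{print $2}')  # binary log position
    echo "change master to master_host='$host',master_user='slave',master_password='centos',master_log_file='$file',master_log_pos=$pos;"
}

build_change_master "$master_status" 192.168.216.17
```

Generating the statement this way avoids copying the file name and position by hand, which is where typos usually creep in.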
Once that is done, check the new master's (slave1's) slave hosts to confirm the setup succeeded:
[root@centos7 ~]# mysql -uroot
MariaDB [(none)]> show slave hosts;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
|         1 |      | 3306 |         2 |
|         3 |      | 3306 |         2 |
+-----------+------+------+-----------+
2 rows in set (0.00 sec)
The new MySQL replication topology is now up and running!