A complete explanation of MySQL high availability architecture: MHA architecture

A complete explanation of MySQL high availability architecture: MHA architecture

MHA (Master HA) is an open source MySQL high availability program that provides automating master failover functionality for the MySQL master-slave replication architecture. When MHA detects a master node failure, it will promote the slave node with the latest data to become the new master node. During this period, MHA will avoid consistency issues by obtaining additional information from other slave nodes. MHA also provides the online switching function of the master node, that is, switching the master/slave node on demand.
MHA is a mature MySQL high availability solution developed by Japanese Yoshinorim (formerly employed by DeNA and now employed by FaceBook). MHA can implement failover within 30 seconds and ensure data consistency to the greatest extent possible during failover. Taobao is currently developing a similar product, TMHA, which currently supports one master and one slave.

insert image description here

1. Introduction

MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. It was developed by youshimaton of Japan's DeNA company (currently working at Facebook) and is an excellent high-availability software for fault switching and master-slave promotion in a MySQL high-availability environment. During the MySQL failover process, MHA can automatically complete the database failover operation within 0 to 30 seconds, and during the failover process, MHA can ensure data consistency to the greatest extent to achieve true high availability.

2. Composition

It consists of two parts: MHA Manager (management node) and MHA Node (data node).
When the master fails, it can automatically promote the slave with the latest data to the new master, and then re-point all other slaves to the new master. The entire failover process is completely transparent to the application.

3. Working Process

During the MHA automatic failover process, MHA attempts to save the binary logs from the downed primary server to ensure that data is not lost to the greatest extent, but this is not always possible. For example, if the primary server hardware fails or cannot be accessed via ssh, MHA cannot save the binary log and only performs a failover, losing the latest data. Using MySQL 5.5's semisynchronous replication, the risk of data loss can be greatly reduced. MHA can be combined with semisynchronous replication. If only one slave has received the latest binary log, MHA can apply the latest binary log to all other slave servers, thus ensuring data consistency of all nodes.

4. Architecture

To build MHA, a replication cluster must have at least three database servers, one master and two slaves, that is, one server acts as the master, one server acts as the standby master, and the other server acts as the slave.

insert image description here

(1) Save binary log events from the crashed master;
(2) Identify the slave containing the latest update;
(3) Apply the difference relay log to other slaves;
(4) Apply the binary log events saved from the master;
(5) Promote a slave to a new master;
(6) Enable other slaves to connect to the new master for replication;

Manager Toolkit Main Features

masterha_check_ssh Check the SSH configuration status of MHA masterha_check_repl Check the MySQL replication status masterha_manger Start MHA
masterha_check_status checks the current MHA running status masterha_master_monitor checks whether the master is down masterha_master_switch controls failover (automatic or manual)
masterha_conf_host adds or deletes configured server information

Node Toolkit Features

save_binary_logs saves and copies the master's binary logs apply_diff_relay_logs identifies differential relay log events and applies the differential events to other slaves
filter_mysqlbinlog removes unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs Clear relay logs (will not block the SQL thread)

5. Example display

Preparations for deploying MHA:

Role ip address host name server_id type Monitor host 192.168.0.20 server01 - monitor replication group Master 192.168.0.50 server02 1 write Candicate master 192.168.0.60 server03 2 read Slave 192.168.0.70 server04 3 read

The master provides write services to the outside world, the standby master (the actual slave, host name server03) provides read services, and the slave also provides related read services. Once the master goes down, the standby master will be promoted to the new master, and the slave will point to the new master.

## 1. Turn off the firewall systemctl stop firewalld
systemctl disable firewalld
setenforce 0

## 2. Set the host name hostnamectl set-hostname Mysql1
hostnamectl set-hostname Mysql2
hostnamectl set-hostname Mysql3

## 3. Node settings: Master, slave1, slave2 configuration files /etc/my.cnf

## a. Master Node##
vim /etc/my.cnf
[mysqld]
server-id = 1
log_bin = master-bin
log-slave-updates = true
systemctl restart mysqld

##Slave1, Slave2 nodes##
vim /etc/my.cnf
server-id = 2 #The server-ids of the three servers cannot be the same log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
systemctl restart mysqld

## 4. Create a soft link ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/

## 5. One master and two slaves: All nodes are authorized with mysql -uroot -p
grant replication slave on *.* to 'myslave'@'192.168.80.%' identified by '123'; #Slave database synchronization use grant all privileges on *.* to 'mha'@'192.168.80.%' identified by 'manager'; #manager use grant all privileges on *.* to 'mha'@'Mysql1' identified by 'manager'; #Prevent the slave library from connecting to the master library through the host name grant all privileges on *.* to 'mha'@'Mysql2' identified by 'manager';
grant all privileges on *.* to 'mha'@'Mysql3' identified by 'manager';
## The binary file can be seen on the master, sync point show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000002 | 1745 | | | |
+-------------------+----------+--------------+------------------+-------------------+
## slave1, slave2 node data synchronization results show slave status\G		
// Make sure both IO and SQL threads are Yes, indicating that the synchronization is normal.
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
# Set slave1 and slave2 to read-only mode set global read_only=1;
# Create a database test##Insert a piece of data into the Master database to test whether it is synchronized##
create database test_db;
use test_db;
create table test(id int);
insert into test(id) values ​​(1);
# View from the database select * from test_db.test;
+------+
|id|
+------+
| 1 |
+------+

Install MHA software

(1) Install the MHA-dependent environment on all servers. First install the epel source

yum install epel-release --nogpgcheck -y

yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl - CPAN

(2) To install the MHA software package, you must first install the node component on all servers.

For each operating system version, it is different. Here, CentOS7.4 must choose version 0.57.
The node component must be installed on all servers first, and the manager component must be installed on the MHA-manager node last, because the manager depends on the node component.

cd /opt
tar zxvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install

(3) Install the manager component on the MHA manager node

cd /opt
tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install

7. Configure passwordless authentication on all servers (1) Configure passwordless authentication on all database nodes on the manager node

-----------------------------------------------------------------------------------------
#After the manager component is installed, several tools will be generated under /usr/local/bin, mainly including the following:
masterha_check_ssh checks the SSH configuration of MHAmasterha_check_repl checks the MySQL replication statusmasterha_manger starts the manager scriptmasterha_check_status checks the current MHA running statusmasterha_master_monitor checks whether the master is downmasterha_master_switch controls failover (automatic or manual)
masterha_conf_host Add or delete configured server information masterha_stop Shut down the manager

#After the node component is installed, several scripts will be generated under /usr/local/bin (these tools are usually triggered by the MHAManager script and do not require manual operation). The main ones are as follows:
save_binary_logs saves and copies the master's binary logs apply_diff_relay_logs identifies differential relay log events and applies the differential events to other slaves
filter_mysqlbinlog removes unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs Clear relay logs (does not block the SQL thread)

(2) Configure password-free authentication on mysql1 to the database nodes mysql2 and mysql3

ssh-keygen -t rsa #Press Enter all the way ssh-copy-id 192.168.80.10
ssh-copy-id 192.168.80.20
ssh-copy-id 192.168.80.30

(3) Configure password-free authentication on mysql2 to the database nodes mysql1 and mysql3

ssh-keygen -t rsa
ssh-copy-id 192.168.80.20
ssh-copy-id 192.168.80.30

(4) Configure password-free authentication on mysql3 to the database nodes mysql1 and mysql2

ssh-keygen -t rsa
ssh-copy-id 192.168.80.10
ssh-copy-id 192.168.80.30

8. Configure MHA on the manager node
(1) Copy the relevant scripts to the /usr/local/bin directory on the manager node

cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
//After copying, there will be four executable files ll /usr/local/bin/scripts/
----------------------------------------------------------------------------------------------------------
master_ip_failover #Script for VIP management during automatic switching master_ip_online_change #Vip management during online switching power_manager #Script for shutting down the host after a failure occurs send_report #Script for sending an alarm after a failure switch

(2) Copy the above VIP management script for automatic switching to the /usr/local/bin directory. Here, the master_ip_failover script is used to manage VIP and failover.

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

(3) The modified contents are as follows: (Delete the original contents, directly copy and modify the VIP related parameters)

vim /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
###############################Add content section#########################################
my $vip = '192.168.80.200'; #Specify the address of vipmy $brdc = '192.168.80.255'; #Specify the broadcast address of vipmy $ifdev = 'ens33'; #Specify the network card bound to vipmy $key = '1'; #Specify the serial number of the virtual network card bound to vipmy $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; #Represents that the value of this variable is ifconfig ens33:1 192.168.80.200
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down"; #This variable value represents ifconfig ens33:1 192.168.80.200 down
my $exit_code = 0; #Specify the exit status code as 0
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
## A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

Sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

(4) Create the MHA software directory and copy the configuration file. Here, the app1.cnf configuration file is used to manage the MySQL node server.

mkdir /etc/masterha
cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha

vim /etc/masterha/app1.cnf #Delete the original content, directly copy and modify the IP address of the node server [server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.80.20 -s 192.168.80.30
shutdown_script=""
ssh_user=root
user=mha

[server1]
hostname=192.168.80.10
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.80.20
port=3306

[server3]
hostname=192.168.80.30
port=3306

illustrate:

[server default]
manager_log=/var/log/masterha/app1/manager.log #manager log manager_workdir=/var/log/masterha/app1.log #manager working directory master_binlog_dir=/usr/local/mysql/data/ #The location where master saves binlog. The path here must be consistent with the path of binlog configured in master so that MHA can find it master_ip_failover_script=/usr/local/bin/master_ip_failover #Set the switching script for automatic failover, which is the script above master_ip_online_change_script=/usr/local/bin/master_ip_online_change #Set the switching script for manual switching password=manager #Set the password of the root user in mysql. This password is the password for creating the monitoring user in the previous article ping_interval=1 #Set the time interval for monitoring the main database and sending ping packets. The default is 3 seconds. Automatically failover after three attempts without response
remote_workdir=/tmp #Set the location where the remote MySQL binlog is saved when switching occurs repl_password=123 #Set the password of the replication user repl_user=myslave #Set the user of the replication user report_script=/usr/local/send_report #Set the script for sending alarms after switching secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.80.20 -s 192.168.80.30 #Specify the IP address of the slave server to be checked shutdown_script="" #Set the script to shut down the faulty host after a failure occurs (the main function of this script is to shut down the host to prevent brain split, which is not used here)
ssh_user=root #Set the ssh login username user=mha #Set monitoring user root

[server1]
hostname=192.168.80.10
port=3306

[server2]
hostname=192.168.80.20
port=3306
candidate_master=1
#Set to candidate master. After setting this parameter, the slave library will be promoted to the master library after the master-slave switch occurs, even if the slave library is not the latest slave in the cluster

check_repl_delay=0
#By default, if a slave lags behind the master by more than 100M relay logs, MHA will not select the slave as a new master. 
Because it takes a long time to recover this slave; by setting check_repl_delay=0, MHA triggers the switch and ignores the replication delay when selecting a new master. This parameter is very useful for hosts with candidate_master=1 set, because the candidate master must be the new master during the switch.

[server3]
hostname=192.168.80.30
port=3306

Enable the master's virtual IP

/sbin/ifconfig ens33:1 192.168.80.200/24

10. Test ssh passwordless authentication on the manager node

masterha_check_ssh -conf=/etc/masterha/app1.cnf

Tue Nov 26 23:09:45 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Nov 26 23:09:45 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Nov 26 23:09:45 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Nov 26 23:09:45 2020 - [info] Starting SSH connection tests..
Tue Nov 26 23:09:46 2020 - [debug] 
Tue Nov 26 23:09:45 2020 - [debug] Connecting via SSH from [email protected](192.168.80.20:22) to [email protected](192.168.80.30:22)..
Tue Nov 26 23:09:46 2020 - [debug] ok.
Tue Nov 26 23:09:47 2020 - [debug] 
Tue Nov 26 23:09:46 2020 - [debug] Connecting via SSH from [email protected](192.168.80.30:22) to [email protected](192.168.80.20:22)..
Tue Nov 26 23:09:47 2020 - [debug] ok.
Tue Nov 26 23:09:47 2020 - [info] All SSH connection tests passed successfully.
# Passwordless authentication successful

11. Test the MySQL master-slave connection on the manager node

masterha_check_repl -conf=/etc/masterha/app1.cnf

Tue Nov 26 23:10:29 2020 - [info] Slaves settings check done.
Tue Nov 26 23:10:29 2020 - [info] 
192.168.80.20(192.168.80.20:3306) (current master)
 +--192.168.80.30(192.168.80.30:3306)

Tue Nov 26 23:10:29 2020 - [info] Checking replication health on 192.168.80.30..
Tue Nov 26 23:10:29 2020 - [info] ok.
Tue Nov 26 23:10:29 2020 - [info] Checking master_ip_failover_script status:
Tue Nov 26 23:10:29 2020 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.80.20 --orig_master_ip=192.168.80.20 --orig_master_port=3306 


IN SCRIPT TEST====/sbin/ifconfig ens33:1 down==/sbin/ifconfig ens33:1 192.168.80.200===

Checking the Status of the script.. OK 
Tue Nov 26 23:10:29 2020 - [info] OK.
Tue Nov 26 23:10:29 2020 - [warning] shutdown_script is not defined.
Tue Nov 26 23:10:29 2020 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
# Master-slave connection successful
# Start MHA on the manager node
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
--remove_dead_master_conf: This parameter means that when a master-slave switch occurs, the IP address of the old master database will be removed from the configuration file.
--manger_log: Log storage location.
--ignore_last_failover: By default, if MHA detects consecutive downtimes and the interval between two downtimes is less than 8 hours,
  No failover will be performed. This restriction is made to avoid the ping-pong effect. This parameter means ignoring the files generated by the last MHA-triggered switch.
  By default, after the MHA switch occurs, the log directory will be recorded, that is, the log app1.failover.complete file set above.
  The next time you switch again, if the file is found in the directory, the switch will not be allowed to be triggered unless the file is deleted after the first switch.
  For convenience, this is set to --ignore_last_failover.
# b Check the MHA status and you can see that the current master is the Mysql1 node.
masterha_check_status --conf=/etc/masterha/app1.cnf

# a log mode cat /var/log/masterha/app1/manager.log | grep "current master"

# c Check whether the VIP address 192.168.80.200 of Mysql1 exists. This VIP address will not disappear because the manager node stops the MHA service.
ifconfig
//To shut down the manager service, you can use the following command.
masterha_stop --conf=/etc/masterha/app1.cnf
Or you can directly kill the process ID to shut it down.

Simulate failure

# aMonitor and observe log records on the manager node tail -f /var/log/masterha/app1/manager.log

# bStop the mysql service on the Master node Mysql1 systemctl stop mysqld
or pkill -9 mysql

# cAfter a normal automatic switch, the MHA process will exit. HMA will automatically modify the content of the app1.cnf file and delete the downed mysql1 node. Check whether mysql2 takes over VIP
ifconfig
Algorithm for failover of candidate master database:
1. Generally, the quality of slave databases is judged from (position/GTID). If there is a difference in data, the slave closest to the master becomes the candidate master.
2. If the data is consistent, select the alternate master database according to the order of configuration files.
3. Set a weight (candidate_master=1) to forcibly specify the candidate master according to the weight.
(1) By default, if a slave lags behind the master by 100MB of relay logs, it will fail even if it has weights.
(2) If check_repl_delay=0, it is forcibly selected as the backup master even if it lags behind by a large number of logs.

Bug Fixes

1. Restart the mysql service systemctl restart mysqld
2. Repair the master-slave: first check the binary files and synchronization points show master status;
3. The original master server mysql1 performs synchronization operations: change master to master_host='192.168.80.20',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=1745;
start slave;

4. Modify the configuration file app1.cnf on the manager node
vi /etc/masterha/app1.cnf
......
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.80.10 -s 192.168.80.30
......
[server1]
hostname=192.168.80.20
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.80.10
port=3306

[server3]
hostname=192.168.80.30
port=3306
5. The slave library must be set to read-only mode: the current slave library is mysql
set global read_only=1;

6. Start MHA on the manager node and check whether VIP drifts to mysql2
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

##Check the MHA status and you can see that the current master is the Mysql2 node masterha_check_status --conf=/etc/masterha/app1.cnf

This is the end of this article about MHA, a high-availability architecture for MySQL. For more information about MySQL high-availability architecture, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • How to use Python to collect MySQL MHA deployment and operation status information
  • Detailed deployment steps for MySQL MHA high availability configuration and failover
  • Steps to build MHA architecture deployment in MySQL
  • Summary of several error logs about MySQL MHA setup and switching
  • Mysql GTID Mha configuration method
  • Super deployment tutorial of MHA high availability failover solution under MySQL
  • MHA implements manual switching of MySQL master-slave database
  • Introduction to MySQL MHA operation status monitoring

<<:  In-depth analysis of the role of HTML <!--...--> comment tags

>>:  Solution to the problem that a line is left blank for no reason on the utf8-encoded page under IE and the utf8 page cannot be displayed

Recommend

Basic Implementation of AOP Programming in JavaScript

Introduction to AOP The main function of AOP (Asp...

Analysis of Facebook's Information Architecture

<br />Original: http://uicom.net/blog/?p=762...

Kill a bunch of MySQL databases with just a shell script like this (recommended)

I was woken up by a phone call early in the morni...

Implementation of Docker packaging image and configuration modification

I have encountered many problems in learning Dock...

Web front-end development experience summary

XML files should be encoded in utf-8 as much as p...

Example of implementing a virtual list in WeChat Mini Program

Table of contents Preface analyze Initial Renderi...

HTML insert image example (html add image)

Inserting images into HTML requires HTML tags to ...

Example of using supervisor to manage nginx+tomcat containers

need: Use docker to start nginx + tomcat dual pro...

Understanding and application scenarios of enumeration types in TypeScript

Table of contents 1. What is 2. Use Numeric Enume...

JS realizes the scrolling effect of announcement online

This article shares the specific code of JS to ac...

js implements a simple English-Chinese dictionary

This article shares the specific code of js to im...

Linux's fastest text search tool ripgrep (the best alternative to grep)

Preface Speaking of text search tools, everyone m...

Analysis of the ideas of implementing vertical tables in two ways in Vue project

Problem Description In our projects, horizontal t...

Install MySQL in Ubuntu 18.04 (Graphical Tutorial)

Tip: The following operations are all performed u...