Detailed explanation of MySQL semi-synchronization

Detailed explanation of MySQL semi-synchronization

Preface

When I was doing the second interview with Tencent after the new year, the first question I was asked after writing the algorithm was, what is MySQL semi-synchronization? I was totally confused at the time. I thought it was about MySQL's two-phase commit issue? After confirming it, it turned out to be not a two-stage submission. Then the interviewer saw that I didn’t even know what the question was about, so he skipped this question and went straight to the next one. So this time I will summarize the knowledge content of this part. There is a lot of text content, which may be a bit boring, but it is still interesting for those who are interested in this area.

MySQL master-slave replication

In large-scale projects, we usually use MySQL replication to create a MySQL master-slave cluster. Data synchronization can be performed mainly by configuring one or more backup databases for the server. The replication function is not only conducive to building high-performance applications, but also the basis for high availability, scalability, disaster recovery, backup, and data warehouse.

To put it simply, the master-slave replication of MySQL is used to achieve read-write separation. Compared with a single-point database that can read and write, it improves the performance of the business system and optimizes the user experience. In addition, the high availability of the database is achieved through master-slave replication. When the master node MySQL hangs up, the slave database can be used to take over.

Replication methods supported by MySQL

MySQL supports three replication modes:

  • Statement-based replication (also called logical replication) mainly means that the SQL statements executed on the master database will be repeated on the slave database. This is the type of replication that MySQL uses by default, which is more efficient. However, there are certain problems. If functions such as uuid() and rand() are used in SQL, the data copied to the slave database will be biased.
  • Row-based replication refers to copying updated data to the slave database instead of executing side statements. It is supported only from MySQL 5.1.
  • Hybrid replication uses statement replication by default. When it is found that the statement cannot accurately replicate data (for example, the statement contains functions such as uuid() and rand()), row-based replication is used.

Master-slave replication principle

The MySQL replication principle can be roughly divided into three steps:

  1. Record data changes on the master database in the binary log.
  2. The slave database copies the logs on the master database to its own relay log.
  3. The slave database reads the events in the relay log and replays them on top of the slave database data.

The main process is as follows:

Let's take a closer look at the three steps of copying:

The first step is to record binary logs on the master database. First, the master database must enable the binlog logging function and authorize the slave database to access it. One thing to note here is that the order in the binlog log is recorded in the order in which the transactions were submitted rather than the order in which each statement was executed.

Step 2: Copy binLog from the library to its local RelayLog. First, the slave database starts a working thread, called an I/O thread. The I/O thread establishes a normal client connection with the master database. Then, a special binary dump (binlog dump) thread is started on the master database. This dump thread reads events in the binlog. After catching up with the main database, it will go into hibernation until the main database notifies it of a new update statement. In this way, the binlog data is transferred to the relaylog on the slave library through the I/O thread on the slave library and the binlog dump thread on the master library.

Step 3: Start a SQL thread in the slave database, read events from the relaylog, and execute them in the slave database to update the slave database data.

==This replication architecture achieves the decoupling of event acquisition and event replay, and the running I/O thread can work independently of the SQL thread. However, this architecture also limits the replication process. The most important point is that queries running concurrently on the primary database can only be executed serially in the standby database because there is only one SQL thread to replay the events in the relay log. ==

When it comes to the problem of serial execution of master-slave replication, I thought of a problem I encountered before at work. There was a business scenario in which we had an operation to initialize a batch of data. The data was obtained from the interface of an external system. Then I used multiple threads in the thread pool to obtain data from the interface of the external system in parallel. After each thread obtained the data, it was directly inserted into the database. Then, after all the data has been entered into the database, a batch query is performed to query the data just inserted into the database and put it into ElasticSearch. As a result, the data put into ES was always incomplete. After studying it for a long time, it still didn't work. Finally, the problem was solved by letting the query go to the main database. At that time, I didn't know that this problem was caused by the serialization of MySQL master-slave replication.

MySQL master-slave replication mode

MySQL's master-slave replication actually supports multiple replication modes, including asynchronous replication, semi-synchronous replication, and GTID replication.

Asynchronous Mode

The default replication mode of MySQL is asynchronous mode, which mainly refers to the I/O thread on the MySQL master server. After writing data to binlong, it directly returns the successful data update to the client, regardless of whether the data is transmitted to the slave server and written to the relaylog. In this mode, copying data is actually risky. Once the data is only written to the binlog of the master database and has not been synchronized to the slave database in time, data loss will occur.

However, this mode is indeed the most efficient, because the function of changing data only needs to be completed in the main database, and copying data from the database will not affect the data writing operation of the main database.

As I said above, although this asynchronous replication mode is highly efficient, the risk of data loss is very high, so there is a semi-synchronous replication mode to be introduced later.

Semi-synchronous mode

MySQL has supported semi-synchronous master-slave replication mode since version 5.5 in the form of a plug-in. What is the semi-synchronous master-slave replication mode? Here is an explanation by way of comparison:

  • Asynchronous replication mode: As we have introduced above, in asynchronous replication mode, after the master database executes the transaction submitted by the client, as long as the execution logic is written to the binlog, it will immediately return to the client, and does not care whether the slave database is executed successfully. This will have a hidden danger, that is, when the binlog executed by the master database has not been synchronized to the slave database, the master database hangs up. At this time, the slave database will be forcibly promoted to the master database, which may cause data loss.
  • Synchronous replication mode: After the master database executes the transaction submitted by the client, it needs to wait until all slave databases have also executed the transaction before returning a successful execution to the client. Because it has to wait until all slave libraries are executed, the execution process will be blocked and waiting for the results to be returned, so the performance will be seriously affected.
  • Semi-synchronous replication mode: Semi-synchronous replication mode can be said to be a replication mode between asynchronous and synchronous. After the master database executes the transaction submitted by the client, it must wait for at least one slave database to receive the binlog and write the data to the relay log before returning a successful result to the client. The semi-synchronous replication mode improves data availability compared to the asynchronous mode, but it also produces a certain performance delay, at least the round-trip time of a TCP/IP connection.

In the semi-synchronous replication mode, it is clear that after a transaction is successfully committed, the transaction will exist in at least two places: the master database and one of the slave databases. The main principle is that when the master's dump thread notifies the slave, an ACK mechanism is added, which confirms whether the slave has received the transaction flag code. The master's dump thread not only sends binlog to the slave, but is also responsible for receiving the slave's ACK. When an exception occurs and the slave does not ACK the transaction, it will automatically downgrade to asynchronous replication until the exception is fixed and then automatically change to semi-synchronous replication.

The process of MySQL semi-synchronous replication is as follows:

Hidden dangers of semi-synchronous replication The semi-synchronous replication mode also has certain data risks. When a transaction is submitted to the master database and is waiting for the slave database ACK, if the master goes down, there will be two problems at this time.

  • The transaction has not been sent to the Slave: If the transaction has not been sent to the Slave, the client will resubmit the transaction after receiving the failure result. Because the resubmitted transaction is executed on the new Master, it will be executed successfully. Later, if the previous Master is restored, it will join the cluster as a Slave. At this time, the previous transaction will be executed twice. The first time is when this machine was previously the Master, and the second time is synchronized from the main database after it becomes a Slave.
  • The transaction has been synchronized to the Slave: Because the transaction has been synchronized to the Slave, when the client receives the failure result and submits the transaction again, the transaction will be executed twice on the current Slave machine.

In order to solve the above hidden dangers, MySQL has added a new semi-synchronous mode since version 5.7. The execution process of the new semi-synchronous method is to move the "Storage Commit" step to after the "Write Slave dump". This ensures that the master database transaction is committed only after the slave transaction is ACKed. MySQL 5.7.2 adds a new parameter for configuration: rpl_semi_sync_master_wait_point. This parameter has two configurable values:

  • AFTER_SYNC: When the parameter value is AFTER_SYNC, it means that the new semi-synchronous replication method is used.
  • AFTER_COMMIT: Indicates that the old semi-synchronous replication mode is used.

Starting from MySQL version 5.7.2, the default semi-synchronous replication mode is AFTER_SYNC mode, but this solution is not a panacea, because the AFTER_SYNC mode commits the transaction of the master database only after the transaction is synchronized to the slave. If the master hangs up while the master database is waiting for the slave database to synchronize successfully, the master transaction submission will fail, and the client will also receive the result of the transaction execution failure. However, the contents of binLog have been written to the Relay Log on the slave. At this time, the slave data will be more, but the problem of more data is generally not serious. More data is always better than less. MySQL, when it is unable to solve the problem of distributed data consistency, can guarantee that data will not be lost. More data is always better than losing data.

Here are some parameters of semi-synchronous replication mode:

mysql> show variables like '%Rpl%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | 10000 |
| rpl_semi_sync_master_trace_level | 32 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
| rpl_stop_slave_timeout | 31536000 |
+-------------------------------------------+------------+
-- Semi-synchronous replication mode switch rpl_semi_sync_master_enabled
-- Semi-synchronous replication, timeout, in milliseconds. When this time is exceeded, it automatically switches to asynchronous replication mode rpl_semi_sync_master_timeout
-- Introduced in MySQL 5.7.3, this variable sets how many slave responses the master needs to wait for before returning to the client. The default value is 1.
rpl_semi_sync_master_wait_for_slave_count
-- This value indicates whether the number of slaves in the current cluster can still meet the currently configured semi-synchronous replication mode. The default value is ON. When the semi-synchronous replication mode is not met, all slaves switch to asynchronous replication and this value will also become OFF
rpl_semi_sync_master_wait_no_slave
-- Represents the way semi-synchronous replication commits transactions. After 5.7.2, the default is AFTER_SYNC
rpl_semi_sync_master_wait_point 

GTID Mode

MySQL has introduced the GTID replication mode since version 5.6. GTID is the abbreviation of global transaction identifier. GTID is composed of UUID+TransactionId. UUID is the unique identifier of a single MySQL instance. When the MySQL instance is started for the first time, a server_uuid is automatically generated and written to the auto.cnf (mysql/data/auto.cnf) file in the data directory by default. TransactionId is the number of transactions executed on the MySQL server, and increases as the number of transactions increases. This ensures that the GTID is globally unique within a set of replicas.

In this way, you can clearly see through GTID which instance the current transaction was submitted from and the number of transactions submitted.

Let's look at the specific form of a GTID:

mysql> show master status;
+-----------+----------+--------------+------------------+-------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-----------+----------+--------------+------------------+-------------------------------------------+
| on.000003 | 187 | | | 76147e28-8086-4f8c-9f98-1cf33d92978d:1-322|
+-----------+----------+--------------+------------------+-------------------------------------------+
1 row in set (0.00 sec)

How GTIDs work

Due to the uniqueness of GTID in a set of master-slave replication clusters, it is guaranteed that each GTID transaction is executed only once on one MySQL. So how is this mechanism implemented? What is the principle of GTID?

When the slave server connects to the master server, it passes the GTID it has executed (Executed_Gtid_Set: the transaction code that has been executed) and the GTID it has obtained (Retrieved_Gtid_Set: the transaction number that the slave has received from the master) to the master server. The master server will send the missing GTID and the corresponding transactionID from the slave server to the slave server, allowing the slave server to complete the data. When the primary server goes down, the conf server that synchronizes data most successfully will be found and directly promoted to the primary server. If a slave server other than the most successful one is forced to become the master, a change command will be sent to the most successful server to complete the GTID, and then the forced server will be promoted to the master.

The main data synchronization mechanism can be divided into the following steps:

  • ==When the master updates data, it generates a GTID before the transaction and records it in the binlog. ==
  • ==The slave i/o thread writes the changed binlog to the relay log. ==
  • ==The sql thread obtains the GTID from the relay log, and then compares whether there is a record in the binlog on the slave side. ==
  • ==If there is a record, it means that the transaction of this GTID has been executed, and the slave will ignore this GTID. ==
  • ==If there is no record, the Slave will execute the GTID transaction from the relay log and record it in the binlog. ==
  • ==During the parsing process, determine whether there is a primary key. If there is no primary key, use the secondary index. If there is no secondary index, scan the entire table. ==

The initial structure is as follows

From the above figure, we can see that when the Master hangs up, Slave-1 has completed the Master's transactions, but Slave-2 is delayed a little, so it has not completed the Master's transactions. At this time, Slave-1 is promoted to the master. After Slave-2 connects to the new master (Slave-1), it transmits the latest GTID to the new master, and then Slave-1 starts sending transactions to Slave-2 from the next GTID of this GTID. This self-finding replication model reduces the possibility of transaction loss and the time to recover from failures.

Advantages and Disadvantages of GTID

Through the above analysis, we can conclude that the advantages of GTID are:

  • ==Each transaction corresponds to an execution ID, and a GTID will only be executed once on a server;==
  • ==GTID is used to replace the traditional replication method. The biggest difference between GTID replication and normal replication mode is that it does not require the specification of binary file name and location; ==
  • ==Reduce manual intervention and service downtime. When the main machine fails, the software can be used to promote a backup machine from among many backup machines to the main machine;==

The disadvantages of GTID are also obvious:

  • ==First, non-transactional storage engines are not supported; ==
  • ==Does not support replication of create table ... select statements (master database reports an error directly); (Principle: Two SQL statements will be generated, one is the DDL SQL statement for creating a table, and the other is the SQL statement for inserting data into. Since DDL will cause automatic commit, this SQL statement requires at least two GTIDs, but in GTID mode, only one GTID can be generated for this SQL statement)==
  • ==A SQL statement is not allowed to update a transaction engine table and a non-transaction engine table at the same time;==
  • ==In a MySQL replication group, all GTIDs are required to be enabled or disabled. ==
  • ==Enabling GTID requires restart (except mysql5.7);==
  • ==After GTID is enabled, the original traditional replication method is no longer used (unlike semi-synchronous replication, which can be downgraded to asynchronous replication after failure);==
  • ==For create temporary table and drop temporary table statements are not supported;==
  • ==sql_slave_skip_counter is not supported;==

In fact, there is a lot of content about GTID. If you want to study it in depth, you can read this article. Finally, let's talk about some necessary conditions for enabling GTID:

  • MySQL 5.6 version, add in the my.cnf file:
gtid_mode=on (required) # Enable gtid function log_bin=log-bin=mysql-bin (required) # Enable binlog binary log function log-slave-updates=1 (required) # You can also write 1 as on
enforce-gtid-consistency=1 (required) #You can also write 1 as on
  • For MySQL 5.7 or higher, add the following to the my.cnf file:
gtid_mode=on (required)
enforce-gtid-consistency=1 (required)
log_bin=mysql-bin (optional) #High availability switching, it is best to enable this function log-slave-updates=1 (optional) #High availability switching, it is best to enable this function

The above is a detailed explanation of MySQL semi-synchronization. For more information about MySQL semi-synchronization, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Summary of MYSQL full backup, master-slave replication, cascading replication, and semi-synchronization
  • MySQL semi-synchronous replication principle configuration and introduction detailed explanation
  • A brief talk about MySQL semi-synchronous replication
  • Mysql semi-synchronous replication principle and troubleshooting
  • In-depth analysis of semi-synchronous and asynchronous MySQL master-slave replication configuration

<<:  Several common CSS layouts (summary)

>>:  Zen HTML Elements Friends who use zen coding can collect it

Recommend

HTML user registration page settings source code

Design the web page shown above: <!DOCTYPE htm...

Common methods and problems of Docker cleaning

If you use docker for large-scale development but...

Detailed explanation of dynamically generated tables using javascript

*Create a page: two input boxes and a button *Cod...

MySQL uses binlog logs to implement data recovery

MySQL binlog is a very important log in MySQL log...

Implementation of mysql configuration SSL certificate login

Table of contents Preface 1. MySQL enables SSL co...

Solution to 404 Problem of Tomcat Installation in Docker

Find the containerID of tomcat and enter the toma...

MySQL free installation version configuration tutorial

This article shares the MySQL free installation c...

Detailed explanation of the construction and use of docker private warehouse

1. Download the repository image docker pull regi...

Simple implementation method of Linux process monitoring and automatic restart

Purpose: Under Linux, the server program may be d...

Detailed introduction to logs in Linux system

Table of contents 1. Log related services 2. Comm...

A brief analysis of the basic concepts of HTML web pages

What is a web page? The page displayed after the ...

Detailed analysis of the parameter file my.cnf of MySQL in Ubuntu

Preface Based on my understanding of MySQL, I thi...

33 ice and snow fonts recommended for download (personal and commercial)

01 Winter Flakes (Individual only) 02 Snowtop Cap...

Summary of MySQL development standards and usage skills

1. Naming conventions 1. Database names, table na...