PrefaceWhen I was doing the second interview with Tencent after the new year, the first question I was asked after writing the algorithm was, what is MySQL semi-synchronization? I was totally confused at the time. I thought it was about MySQL's two-phase commit issue? After confirming it, it turned out to be not a two-stage submission. Then the interviewer saw that I didn’t even know what the question was about, so he skipped this question and went straight to the next one. So this time I will summarize the knowledge content of this part. There is a lot of text content, which may be a bit boring, but it is still interesting for those who are interested in this area. MySQL master-slave replicationIn large-scale projects, we usually use MySQL replication to create a MySQL master-slave cluster. Data synchronization can be performed mainly by configuring one or more backup databases for the server. The replication function is not only conducive to building high-performance applications, but also the basis for high availability, scalability, disaster recovery, backup, and data warehouse. To put it simply, the master-slave replication of MySQL is used to achieve read-write separation. Compared with a single-point database that can read and write, it improves the performance of the business system and optimizes the user experience. In addition, the high availability of the database is achieved through master-slave replication. When the master node MySQL hangs up, the slave database can be used to take over. Replication methods supported by MySQLMySQL supports three replication modes:
Master-slave replication principleThe MySQL replication principle can be roughly divided into three steps:
The main process is as follows: Let's take a closer look at the three steps of copying: The first step is to record binary logs on the master database. First, the master database must enable the binlog logging function and authorize the slave database to access it. One thing to note here is that the order in the binlog log is recorded in the order in which the transactions were submitted rather than the order in which each statement was executed. Step 2: Copy binLog from the library to its local RelayLog. First, the slave database starts a working thread, called an I/O thread. The I/O thread establishes a normal client connection with the master database. Then, a special binary dump (binlog dump) thread is started on the master database. This dump thread reads events in the binlog. After catching up with the main database, it will go into hibernation until the main database notifies it of a new update statement. In this way, the binlog data is transferred to the relaylog on the slave library through the I/O thread on the slave library and the binlog dump thread on the master library. Step 3: Start a SQL thread in the slave database, read events from the relaylog, and execute them in the slave database to update the slave database data. ==This replication architecture achieves the decoupling of event acquisition and event replay, and the running I/O thread can work independently of the SQL thread. However, this architecture also limits the replication process. The most important point is that queries running concurrently on the primary database can only be executed serially in the standby database because there is only one SQL thread to replay the events in the relay log. ==
MySQL master-slave replication modeMySQL's master-slave replication actually supports multiple replication modes, including asynchronous replication, semi-synchronous replication, and GTID replication. Asynchronous ModeThe default replication mode of MySQL is asynchronous mode, which mainly refers to the I/O thread on the MySQL master server. After writing data to binlong, it directly returns the successful data update to the client, regardless of whether the data is transmitted to the slave server and written to the relaylog. In this mode, copying data is actually risky. Once the data is only written to the binlog of the master database and has not been synchronized to the slave database in time, data loss will occur. However, this mode is indeed the most efficient, because the function of changing data only needs to be completed in the main database, and copying data from the database will not affect the data writing operation of the main database. As I said above, although this asynchronous replication mode is highly efficient, the risk of data loss is very high, so there is a semi-synchronous replication mode to be introduced later. Semi-synchronous modeMySQL has supported semi-synchronous master-slave replication mode since version 5.5 in the form of a plug-in. What is the semi-synchronous master-slave replication mode? Here is an explanation by way of comparison:
In the semi-synchronous replication mode, it is clear that after a transaction is successfully committed, the transaction will exist in at least two places: the master database and one of the slave databases. The main principle is that when the master's dump thread notifies the slave, an ACK mechanism is added, which confirms whether the slave has received the transaction flag code. The master's dump thread not only sends binlog to the slave, but is also responsible for receiving the slave's ACK. When an exception occurs and the slave does not ACK the transaction, it will automatically downgrade to asynchronous replication until the exception is fixed and then automatically change to semi-synchronous replication. The process of MySQL semi-synchronous replication is as follows: Hidden dangers of semi-synchronous replication The semi-synchronous replication mode also has certain data risks. When a transaction is submitted to the master database and is waiting for the slave database ACK, if the master goes down, there will be two problems at this time.
In order to solve the above hidden dangers, MySQL has added a new semi-synchronous mode since version 5.7. The execution process of the new semi-synchronous method is to move the "Storage Commit" step to after the "Write Slave dump". This ensures that the master database transaction is committed only after the slave transaction is ACKed. MySQL 5.7.2 adds a new parameter for configuration: rpl_semi_sync_master_wait_point. This parameter has two configurable values:
Starting from MySQL version 5.7.2, the default semi-synchronous replication mode is AFTER_SYNC mode, but this solution is not a panacea, because the AFTER_SYNC mode commits the transaction of the master database only after the transaction is synchronized to the slave. If the master hangs up while the master database is waiting for the slave database to synchronize successfully, the master transaction submission will fail, and the client will also receive the result of the transaction execution failure. However, the contents of binLog have been written to the Relay Log on the slave. At this time, the slave data will be more, but the problem of more data is generally not serious. More data is always better than less. MySQL, when it is unable to solve the problem of distributed data consistency, can guarantee that data will not be lost. More data is always better than losing data. Here are some parameters of semi-synchronous replication mode: mysql> show variables like '%Rpl%'; +-------------------------------------------+------------+ | Variable_name | Value | +-------------------------------------------+------------+ | rpl_semi_sync_master_enabled | ON | | rpl_semi_sync_master_timeout | 10000 | | rpl_semi_sync_master_trace_level | 32 | | rpl_semi_sync_master_wait_for_slave_count | 1 | | rpl_semi_sync_master_wait_no_slave | ON | | rpl_semi_sync_master_wait_point | AFTER_SYNC | | rpl_stop_slave_timeout | 31536000 | +-------------------------------------------+------------+ -- Semi-synchronous replication mode switch rpl_semi_sync_master_enabled -- Semi-synchronous replication, timeout, in milliseconds. When this time is exceeded, it automatically switches to asynchronous replication mode rpl_semi_sync_master_timeout -- Introduced in MySQL 5.7.3, this variable sets how many slave responses the master needs to wait for before returning to the client. The default value is 1. rpl_semi_sync_master_wait_for_slave_count -- This value indicates whether the number of slaves in the current cluster can still meet the currently configured semi-synchronous replication mode. The default value is ON. When the semi-synchronous replication mode is not met, all slaves switch to asynchronous replication and this value will also become OFF rpl_semi_sync_master_wait_no_slave -- Represents the way semi-synchronous replication commits transactions. After 5.7.2, the default is AFTER_SYNC rpl_semi_sync_master_wait_point GTID ModeMySQL has introduced the GTID replication mode since version 5.6. GTID is the abbreviation of global transaction identifier. GTID is composed of UUID+TransactionId. UUID is the unique identifier of a single MySQL instance. When the MySQL instance is started for the first time, a server_uuid is automatically generated and written to the auto.cnf (mysql/data/auto.cnf) file in the data directory by default. TransactionId is the number of transactions executed on the MySQL server, and increases as the number of transactions increases. This ensures that the GTID is globally unique within a set of replicas. In this way, you can clearly see through GTID which instance the current transaction was submitted from and the number of transactions submitted. Let's look at the specific form of a GTID: mysql> show master status; +-----------+----------+--------------+------------------+-------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +-----------+----------+--------------+------------------+-------------------------------------------+ | on.000003 | 187 | | | 76147e28-8086-4f8c-9f98-1cf33d92978d:1-322| +-----------+----------+--------------+------------------+-------------------------------------------+ 1 row in set (0.00 sec) How GTIDs workDue to the uniqueness of GTID in a set of master-slave replication clusters, it is guaranteed that each GTID transaction is executed only once on one MySQL. So how is this mechanism implemented? What is the principle of GTID? When the slave server connects to the master server, it passes the GTID it has executed (Executed_Gtid_Set: the transaction code that has been executed) and the GTID it has obtained (Retrieved_Gtid_Set: the transaction number that the slave has received from the master) to the master server. The master server will send the missing GTID and the corresponding transactionID from the slave server to the slave server, allowing the slave server to complete the data. When the primary server goes down, the conf server that synchronizes data most successfully will be found and directly promoted to the primary server. If a slave server other than the most successful one is forced to become the master, a change command will be sent to the most successful server to complete the GTID, and then the forced server will be promoted to the master. The main data synchronization mechanism can be divided into the following steps:
The initial structure is as follows From the above figure, we can see that when the Master hangs up, Slave-1 has completed the Master's transactions, but Slave-2 is delayed a little, so it has not completed the Master's transactions. At this time, Slave-1 is promoted to the master. After Slave-2 connects to the new master (Slave-1), it transmits the latest GTID to the new master, and then Slave-1 starts sending transactions to Slave-2 from the next GTID of this GTID. This self-finding replication model reduces the possibility of transaction loss and the time to recover from failures. Advantages and Disadvantages of GTIDThrough the above analysis, we can conclude that the advantages of GTID are:
The disadvantages of GTID are also obvious:
In fact, there is a lot of content about GTID. If you want to study it in depth, you can read this article. Finally, let's talk about some necessary conditions for enabling GTID:
gtid_mode=on (required) # Enable gtid function log_bin=log-bin=mysql-bin (required) # Enable binlog binary log function log-slave-updates=1 (required) # You can also write 1 as on enforce-gtid-consistency=1 (required) #You can also write 1 as on
gtid_mode=on (required) enforce-gtid-consistency=1 (required) log_bin=mysql-bin (optional) #High availability switching, it is best to enable this function log-slave-updates=1 (optional) #High availability switching, it is best to enable this function The above is a detailed explanation of MySQL semi-synchronization. For more information about MySQL semi-synchronization, please pay attention to other related articles on 123WORDPRESS.COM! You may also be interested in:
|
<<: Several common CSS layouts (summary)
>>: Zen HTML Elements Friends who use zen coding can collect it
Design the web page shown above: <!DOCTYPE htm...
If you use docker for large-scale development but...
*Create a page: two input boxes and a button *Cod...
MySQL binlog is a very important log in MySQL log...
Table of contents Preface 1. MySQL enables SSL co...
I will explain the installation of MySQL under Wi...
Find the containerID of tomcat and enter the toma...
This article shares the MySQL free installation c...
1. Download the repository image docker pull regi...
Purpose: Under Linux, the server program may be d...
Table of contents 1. Log related services 2. Comm...
What is a web page? The page displayed after the ...
Preface Based on my understanding of MySQL, I thi...
01 Winter Flakes (Individual only) 02 Snowtop Cap...
1. Naming conventions 1. Database names, table na...