MySQL5.7 parallel replication principle and implementation

MySQL5.7 parallel replication principle and implementation

Anyone who has a little knowledge of data operations and maintenance knows that MySQL 5.5 and earlier use a single SQL thread for playback. If the Master QPS is slightly higher, there will be a delay. 5.6 is a library-based parallel playback mechanism. Only when there are multiple libraries can replication have an advantage. 5.7 is a group-based parallel playback. Transactions in the same group can be replayed in parallel to solve the delay problem.

MySQL 5.7 Parallel Replication Era

As we all know, MySQL replication delay is one of the problems that has been criticized. However, in the two previous blogs of Inside Jun (1, 2), it has been mentioned that MySQL 5.7 version already supports "real" parallel replication function, officially called enhanced multi-threaded slave (MTS for short). Therefore, the replication delay problem has been greatly improved. Even in the NetEase e-commerce application where Inside Jun works, the problem of delay of several hours has been completely eliminated. However, I found that many friends still don’t understand this “great” feature that is worthy of being recorded in history, so I decided to share it with you. In short, after version 5.7, the replication delay problem will never exist.

MySQL 5.6 Parallel Replication Architecture

It is true that MySQL version 5.6 also supports so-called parallel replication, but its parallelism is only based on schema, that is, based on library. If there are multiple schemas in the user's MySQL database instance, it can indeed be of great help to the speed of slave replication. The architecture of MySQL 5.6 parallel replication is as follows:

The red framed part in the above figure is the key to achieving parallel replication. Prior to MySQL 5.6, there were two threads on the slave server: the I/O thread and the SQL thread. The I/O thread is responsible for receiving binary logs (or more precisely, binary log events), and the SQL thread plays back binary logs. If the parallel replication function is enabled in MySQL version 5.6, the SQL thread becomes the coordinator thread, which is mainly responsible for the following two parts:

  • If it is determined that parallel execution is possible, the binary log of the transaction is selected to execute in the worker thread
  • If it is determined that parallel execution is not possible, such as the operation is a DDL or a cross-schema transaction operation, wait until all worker threads are completed before executing the current log.

This means that the coordinator thread does not only send logs to the worker thread, but can also replay logs itself, but all operations that can be parallelized are delivered by the worker thread. The coordinator thread and worker are typical producer and consumer models.

The above mechanism implements schema-based parallel replication, but there are two problems. First, the crash safe function is difficult to implement because transactions that are executed later may be completed first due to parallel replication. Then, when a crash occurs, the processing logic of this part is relatively complicated. From the code point of view, 5.6 introduces the Low-Water-Mark tag to solve this problem. From the design point of view, it hopes to solve this problem with the help of the idempotence of the log. However, the binary log playback of 5.6 cannot achieve idempotence. Another critical issue is that the parallel replication effect of this design is not very high. If the user instance has only one library, parallel playback cannot be achieved, and the performance may even be worse than the original single-threaded one. A single database with multiple tables is a more common scenario than multiple databases with multiple tables.

MySQL 5.7 Parallel Replication Based on Group Commit

MySQL 5.7 can be called true parallel replication. The main reason for this is that the playback of the slave server is consistent with the host, that is, the slave server will play back the same parallel execution as the master server.

The idea of ​​parallel replication in MySQL 5.7 is simple and easy to understand. In a nutshell: all transactions submitted by a group can be replayed in parallel, because these transactions have entered the prepare phase of the transaction, which means that there is no conflict between the transactions (otherwise it would be impossible to submit).

To be compatible with MySQL 5.6's library-based parallel replication, 5.7 introduces a new variable, slave-parallel-type, whose configurable values ​​are:

  • DATABASE: Default value, library-based parallel replication
  • LOGICAL_CLOCK: Parallel replication based on group commit

GTID support for parallel replication

How to know whether a transaction is in a group is another problem, because the original MySQL does not provide such information. In MySQL 5.7, the design is to store group commit information in GTID. So what if the user does not enable the GTID function, that is, sets the parameter gtid_mode to OFF? Therefore, MySQL 5.7 introduced a binary log event type called Anonymous_Gtid, such as:

mysql> SHOW BINLOG EVENTS in 'mysql-bin.000006';
 +------------------+-----+----------------+-----------+-------------+-----------------------------------------------+
 | Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
 +------------------+-----+----------------+-----------+-------------+-----------------------------------------------+
 | mysql-bin.000006 | 4 | Format_desc | 88 | 123 | Server ver: 5.7.7-rc-debug-log, Binlog ver: 4 |
 | mysql-bin.000006 | 123 | Previous_gtids | 88 | 194 | f11232f7-ff07-11e4-8fbb-00ff55e152c6:1-2 |
 | mysql-bin.000006 | 194 | Anonymous_Gtid | 88 | 259 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
 | mysql-bin.000006 | 259 | Query | 88 | 330 | BEGIN |
 | mysql-bin.000006 | 330 | Table_map | 88 | 373 | table_id: 108 (aaa.t) |
 | mysql-bin.000006 | 373 | Write_rows | 88 | 413 | table_id: 108 flags: STMT_END_F |
 .....

This means that in MySQL 5.7, even if GTID is not enabled, there will be an Anonymous_Gtid before each transaction starts, and this GTID contains group commit information.

LOGICAL_CLOCK

However, through the above SHOW BINLOG EVENTS, we did not find any information about group submission. However, through the mysqlbinlog tool, users can find the internal information submitted by the group:

# mysqlbinlog mysql-bin.0000006 | grep last_committed
#150520 14:23:11 server id 88 end_log_pos 259 CRC32 0x4ead9ad6 GTID last_committed=0 sequence_number=1
#150520 14:23:11 server id 88 end_log_pos 1483 CRC32 0xdf94bc85 GTID last_committed=0 sequence_number=2
#150520 14:23:11 server id 88 end_log_pos 2708 CRC32 0x0914697b GTID last_committed=0 sequence_number=3
#150520 14:23:11 server id 88 end_log_pos 3934 CRC32 0xd9cb4a43 GTID last_committed=0 sequence_number=4
#150520 14:23:11 server id 88 end_log_pos 5159 CRC32 0x06a6f531 GTID last_committed=0 sequence_number=5
#150520 14:23:11 server id 88 end_log_pos 6386 CRC32 0xd6cae930 GTID last_committed=0 sequence_number=6
#150520 14:23:11 server id 88 end_log_pos 7610 CRC32 0xa1ea531c GTID last_committed=6 sequence_number=7
...

It can be found that compared with the original binary log content, there are more last_committed and sequence_number. Last_committed indicates the number of the last transaction committed when the transaction was committed. If transactions have the same last_committed, it means that these transactions are in a group and can be played back in parallel. For example, there are 6 transactions with last_committed being 0, which means that 6 transactions were committed during group commit, and these 6 transactions can be replayed in parallel on the slave.

The last_committed and sequence_number mentioned above represent the so-called LOGICAL_CLOCK. First, let's look at the definition of LOGICAL_CLOCK in the source code:

class Logical_clock
 {
 private:
 int64 state;
 /*
 Offset is subtracted from the actual "absolute time" value at
 logging a replication event. That is the event holds logical
 timestamps in the "relative" format. They are meaningful only in
 the context of the current binlog.
 The member is updated (incremented) per binary log rotation.
 */
 int64 offset;
 ......

The state is an auto-incrementing value. The offset is updated every time the binary log is rotated, and the state value at the time of the rotation is recorded. In fact, state and offset record the global count value, while the binary log only stores the relative value of the current file. The scenarios for using LOGICAL_CLOCK are as follows:

class MYSQL_BIN_LOG: public TC_LOG
 {
 ...
 public:
 /* Committed transactions timestamp */
 Logical_clock max_committed_transaction;
 /* "Prepared" transactions timestamp */
 Logical_clock transaction_counter;
 ...

You can see that two Logical_clock variables are defined in the MYSQL_BIN_LOG class:

  • max_committed_transaction: records the logical_clock of the last group commit, representing last_committed in the above mysqlbinlog
  • transaction_counter: records the logcial_clock of each transaction in the current group submission, representing the sequence_number in the above mysqlbinlog

Parallel replication test

The following figure shows the QPS of the slave server after MTS is enabled. The test tool is the single-table full update test of sysbench. The test results show that the performance is best under 16 threads, and the QPS of the slave can reach more than 25,000. Further increasing the number of parallel execution threads to 32 does not bring higher improvement. The QPS of the original single-thread playback is only around 4000, which shows the performance improvement brought by MySQL 5.7 MTS. However, since the test is on a single table, the MTS mechanism of MySQL 5.6 is completely powerless.

mysql

MySQL 5.7 Parallel Replication

Parallel replication configuration and tuning

master_info_repository

After enabling the MTS function, be sure to set the parameter master_info_repostitory to TABLE, which can improve performance by 50% to 80%. This is because when parallel replication is enabled, the update of the meta-master.info file will increase significantly, and the competition for resources will also increase. In previous versions of InnoSQL, parameters were added to control the frequency of refreshing the master.info file, or even to prevent it from being refreshed. Because refreshing this file is unnecessary, recovery based on the master-info.log file is itself unreliable. In MySQL 5.7, Insider recommends setting master_info_repository to TABLE to reduce this overhead.

slave_parallel_workers

If slave_parallel_workers is set to 0, MySQL 5.7 degenerates to the original single-threaded replication. However, if slave_parallel_workers is set to 1, the SQL thread function is converted to the coordinator thread, but there is only one worker thread for playback, which is also single-threaded replication. However, there are some differences between these two performances. Because there is one more forwarding of the coordinator thread, the performance of slave_parallel_workers=1 is worse than that of 0. In Inside's test, there is still a 20% performance drop, as shown in the following figure:

mysql
​​​​​​MySQL 5.7 Parallel Replication​​​​​​

This introduces another problem. If the load on the host is not large, the efficiency of group submission will be low. It is very likely that only one transaction will be submitted in each group. When replaying on the slave, although parallel replication is enabled, the performance will be worse than the original single-threaded performance, that is, the delay will increase. Smart friends, have you ever thought about optimizing this?

Enhanced Multi-Threaded Slave Configuration

Having said so much, it is actually very simple to enable enhanced multi-threaded slave. Just follow the following settings:

# slave
slave-parallel-type=LOGICAL_CLOCK
slave-parallel-workers=16
master_info_repository=TABLE
relay_log_info_repository=TABLE
relay_log_recovery=ON

Parallel replication monitoring

Replication monitoring can still be done with SHOW SLAVE STATUS\G, but MySQL 5.7 adds the following metadata tables under the performance_schema architecture, which allow users to monitor replication in a more detailed manner:

mysql> show tables like 'replication%';
 +---------------------------------------------+
 | Tables_in_performance_schema (replication%) |
 +---------------------------------------------+
 | replication_applier_configuration |
 | replication_applier_status |
 | replication_applier_status_by_coordinator |
 | replication_applier_status_by_worker |
 | replication_connection_configuration |
 | replication_connection_status |
 | replication_group_member_stats |
 | replication_group_members |
 +---------------------------------------------+
 8 rows in set (0.00 sec)

Summarize

The Enhanced Multi-Threaded Slave introduced in MySQL 5.7 solves the replication delay problem that has plagued MySQL for decades. This once again reminds some ignorant PostgreSQL users not to stay with their previous impressions of MySQL. Physical replication does not necessarily have an advantage over logical replication. The MTS of MySQL 5.7 can completely solve the delay problem.

Reference:

- http://www.ttlsa.com/mysql/mysql-5-7-enhanced-multi-thread-salve/

- http://moguhu.com/article/detail?articleId=129

- https://www.codercto.com/a/63073.html

- https://dev.mysql.com/doc/refman/5.7/en/replication-options-replica.html#sysvar_slave_preserve_commit_order

This is the end of this article about the principle and implementation of MySQL 5.7 parallel replication. For more relevant MySQL 5.7 parallel replication content, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Detailed explanation of MySQL master-slave replication and read-write separation
  • Common repair methods for MySQL master-slave replication disconnection
  • Analysis of three parameters of MySQL replication problem
  • Comprehensive analysis of MySql master-slave replication mechanism
  • MySQL Series 13 MySQL Replication

<<:  Summary of scientific principles and suggestions for web design and production

>>:  Docker5 full-featured harbor warehouse construction process

Recommend

The main differences between MySQL 4.1/5.0/5.1/5.5/5.6

Some command differences between versions: show i...

WeChat applet implements a simple handwritten signature component

Table of contents background: need: Effect 1. Ide...

MySQL slow query: Enable slow query

1. What is the use of slow query? It can record a...

How to install OpenSuse on virtualbox

The virtual machine is installed on the host mach...

Instructions for using MySQL isolation Read View

Which historical version can the current transact...

Deploy grafana+prometheus configuration using docker

docker-compose-monitor.yml version: '2' n...

How to use type enhancement without typingscript

Preface Due to the weak typing of JS, loose writi...

Summary of xhtml block level tags

* address - address * blockquote - block quote * c...

Detailed steps for installing Harbor, a private Docker repository

The installation of Harbor is pretty simple, but ...

MySQL replication mechanism principle explanation

Background Replication is a complete copy of data...

A simple way to change the password in MySQL 5.7

This is an official screenshot. After MySQL 5.7 i...

Detailed Example of MySQL InnoDB Locking Mechanism

1. InnoDB locking mechanism The InnoDB storage en...