A simple explanation of MySQL parallel replication

A simple explanation of MySQL parallel replication

1. Background of Parallel Replication

First of all, why is there this concept of parallel replication?

1. DBAs should know that MySQL replication is based on binlog.

2. MySQL replication consists of two parts, IO thread and SQL thread.

3. The IO thread is mainly used to pull the binlog passed by the receiving Master and write it to the relay log

4. The SQL thread is mainly responsible for parsing the relay log and applying it to the slave

5. In any case, the IO and SQL threads are single-threaded, while the master is multi-threaded, so there will inevitably be delays. In order to solve this problem, multi-threading came into being.

6. IO multithreading?

6.1 There is no need for multithreading of IO, because IO threads are not the bottleneck.

7. SQL multithreading?

7.1 Yes, the latest 5.6, 5.7, and 8.0 all implement multi-threading on the SQL thread to improve the concurrency of the slave.

Next, let’s take a look at MySQL’s efforts and achievements in parallel replication.

II. Key Points

Whether it can be done in parallel depends on whether there are lock conflicts between multiple transactions. This is the key. The following parallel replication principle is to see how to avoid lock conflicts

MySQL 5.6 schema-based parallel replication

slave-parallel-type=DATABASE (transactions in different databases, no lock conflicts)

As mentioned before, the purpose of parallel replication is to make the slave run as multi-threaded as possible. Of course, multi-threading based on the library level is also a way (transactions in different libraries, no lock conflicts)

Let's talk about the advantages first: It is relatively simple to implement and easy to use for users. Then let's talk about the disadvantages: Because it is based on the library, the granularity of parallelism is very coarse. Now the architecture of many companies is one library and one instance. For such an architecture, 5.6 parallel replication is powerless. Of course, there is also the order of master and slave transactions, which is also a big problem for 5.6

Without further ado, here are some pictures

4. MySQL 5.7 Parallel replication based on group commit

slave-parallel-type=LOGICAL_CLOCK : Commit-Parent-Based mode (transactions in the same group [same last-commit], no lock conflicts. In the same group, there must be no conflicts, otherwise they cannot be in the same group)
slave-parallel-type=LOGICAL_CLOCK : Lock-Based mode (even if the transactions are not in the same group, as long as there is no lock conflict [prepare phase] between the transactions, they can be concurrent. If they are not in the same group, as long as the prepare phases of N transactions can overlap, there is no lock conflict)

Group commit is described in detail in previous articles and will not be explained here. When MySQL5.7 commits a group, it also marks the transactions of each group. Now I think it is for the convenience of MTS.

Let's look at a set of binlogs first

last_committed=0 sequence_number=1
last_committed=1 sequence_number=2
last_committed=2 sequence_number=3
last_committed=3 sequence_number=4
last_committed=4 sequence_number=5
last_committed=4 sequence_number=6
last_committed=4 sequence_number=7
last_committed=6 sequence_number=8
last_committed=6 sequence_number=9
last_committed=9 sequence_number=10

4.1 Commit-Parent-Based Mode

4.2 Lock-Based Mode


5. MySQL 8.0 parallel replication based on write-set

Conflict detection based on primary key (binlog_transaction_depandency_tracking = COMMIT_ORDERE|WRITESET|WRITESET_SESSION, parallelization is possible if the primary key or non-empty unique key of the modified row does not conflict)
5.7.22 also supports the write-set mechanism

Transaction dependency: binlog_transaction_depandency_tracking = COMMIT_ORDERE|WRITESET|WRITESET_SESSION

COMMIT_ORDERE: Continue with group-based commit
WRITESET: Determine transaction dependencies based on write sets
WRITESET_SESSION: Based on the write set, but transactions in the same session will not have the same last_committed

Transaction detection algorithm: transaction_write_set_extraction = OFF| XXHASH64 | MURMUR32

MySQL will have a variable to store the HASH value of the submitted transaction. The values ​​of the primary key (or unique key) modified by all submitted transactions will be compared with the set of that variable after hashing to determine whether the modified row conflicts with it, and thus determine the dependency relationship.

The variable mentioned here can be set in size by: binlog_transaction_dependency_history_size

This granularity is at the row level. At this time, the parallel granularity is finer and the parallel speed is faster. In some cases, it is not an exaggeration to say that the parallelism of the slave exceeds that of the master (the master is a single-threaded write, and the slave can also play back in parallel)

6. How to make the slave's parallel replication and the master's transaction execution order consistent?

After 5.7.19, you can set slave_preserve_commit_order = 1

Official explanation:

For multithreaded slaves, enabling this variable ensures that transactions are externalized on the slave in the same order as they appear in the slave's relay log.
Setting this variable has no effect on slaves for which multithreading is not enabled.
All replication threads (for all replication channels if you are using multiple replication channels) must be stopped before changing this variable.
--log-bin and --log-slave-updates must be enabled on the slave.
In addition --slave-parallel-type must be set to LOGICAL_CLOCK.
Once a multithreaded slave has been started, transactions can begin to execute in parallel.
With slave_preserve_commit_order enabled, the executing thread waits until all previous transactions are committed before committing.
While the slave thread is waiting for other workers to commit their transactions it reports its status as Waiting for preceding transaction to commit.

The general implementation principle is: the excecution phase can be executed in parallel, and the binlog flush is performed in sequence. When the engine layer commits, it is also completed in the queue order according to binlog_order_commit

In other words, if this parameter is set, the slave will parallelize as the master does.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM.

You may also be interested in:
  • A brief analysis of MySQL's WriteSet parallel replication
  • A brief analysis of MySQL parallel replication
  • MySQL5.7 parallel replication principle and implementation

<<:  30 minutes to give you a comprehensive understanding of React Hooks

>>:  Tomcat components illustrate the architectural evolution of a web server

Recommend

Drawing fireworks effect of 2021 based on JS with source code download

This work uses the knowledge of front-end develop...

How to implement a multi-terminal bridging platform based on websocket in JS

Table of contents 1. What to debug 2. Features of...

Basic usage of find_in_set function in mysql

Preface This is a new function I came across rece...

Simple implementation method of Linux process monitoring and automatic restart

Purpose: Under Linux, the server program may be d...

How to get the dynamic number of remaining words in textarea

I encountered a case at work that I had never wri...

Tutorial diagram of installing mysql8.0.18 under linux (Centos7)

1 Get the installation resource package mysql-8.0...

WeChat applet implements simple calculator function

WeChat applet: Simple calculator, for your refere...

How to add shortcut commands in Xshell

As a useful terminal emulator, Xshell is often us...

Solution to the ineffectiveness of flex layout width in css3

Two-column layout is often used in projects. Ther...

Implementation of rewrite jump in nginx

1. New and old domain name jump Application scena...

Vue implements websocket customer service chat function

This article mainly introduces how to implement a...