A brief analysis of MySQL's WriteSet parallel replication

A brief analysis of MySQL's WriteSet parallel replication

【Historical Background】

I have been working as a MySQL-DBA for three years, and I have witnessed MySQL's evolution from "basically available", "MySQL can be used on marginal systems", to "Oh shit! Why don't you use MySQL";

As the saying goes! "The fate of a database depends on both the historical process and its own struggle!" I will not discuss the "historical process" here. As for the "self-struggle", I just want to talk about several key time nodes of parallel replication.

In general, MySQL has experienced three key time nodes for parallel replication so far: "inter-database concurrency", "group commit", and "write set". It can be said that every generation has its own talents, and the previous waves die on the beach. In general, the later ones are much better than the previous ones!

[Inter-library concurrency]

The theoretical basis for inter-database concurrency is this: there may be multiple libraries (schemas) in an instance, and there is no dependency between different libraries. Therefore, a separate SQL thread is started for each library (schema) on the slave side. In this way, the efficiency of master-slave replication can be improved through multi-threaded parallel replication.

This theory sounds good, but in fact there is only one business library for one instance, so this kind of concurrency between libraries is useless; that is to say, this method is applicable to relatively few scenarios, and this shortcoming was not solved until "group submission"!

【Group Submission】

The theoretical basis of group commit is this: if multiple transactions can be committed at the same time, this indirectly indicates that there is no conflict in the locks of these transactions, that is, they each hold different locks and do not affect each other; logically, we regard several transactions as a group, and assign them to SQL threads for execution in units of "groups" on the slave, so that multiple SQL threads can run in parallel; and the granularity of parallelism is not based on the database, the effect is better than "concurrency between databases".

This actually has some problems, because it requires a certain degree of concurrency on the database, otherwise there may be only one transaction in each group, which is no different from serial. To solve this problem, MySQL provides two parameters that want to wait before submitting, so that there are as many transactions as possible in the group to improve the efficiency of parallel replication.

" binlog_group_commit_sync_no_delay_count " sets a lower water mark, that is, a group must have enough transactions before committing; to prevent the group from never having enough transactions.

For multiple transactions, MySQL also provides another parameter " binlog_group_commit_sync_delay " based on time. This parameter indicates how long to wait at most. After this time, the transaction will be committed even if the transaction is not completed.

Personal experience! It is particularly difficult to find suitable values ​​for these two parameters. Even if they are suitable today, they may become inappropriate after a few days when the business changes. It would be great if MySQL could achieve an adaptive effect on its own. This adaptive effect can only be achieved through WriteSet (WriteSet is not achieved by automatically adjusting these two parameters, it uses a completely different solution approach).

【WriteSet】

What problem does WriteSet solve? Of course, the problem of "group submission" is solved! It's the same as saying nothing, so let's give an example (more academic); suppose you updated the row with id == 1 on the first day, and you updated the row with id == 2 on the second day, and on the third day a slave came to synchronize your data! Due to the nature of “group commit”, these two updates will be packaged into different “groups”, that is, there will be two groups; since there is only one transaction in each group, they are logically serial!

As a DBA, you can see that these two can actually be packaged into the same group, because they do not conflict with each other, and even if they are packaged into the same group, it will not cause data inconsistency. So you have two options

Method 1): Sister, you are bold enough to set "binlog_group_commit_sync_no_delay_count" to 2, that is, a group must contain at least two transactions, and set " binlog_group_commit_sync_delay " to more than 24 hours! If you really do it, you can go home, your database is too slow (the first update waited a day) to complete!

Method 2): Ask MySQL to use a small notebook to record what it has changed recently. If the data to be changed now does not conflict with the previous data, then they can be packaged into the same group; still using our previous example, since the id of the value changed on the second day is == 2, it does not conflict with the first day, so it can completely package the second day's update and the first day's update into the same group. This way, there are two transactions in the group, and when the slave replays them on the third day, there will be a parallel effect.

This little notebook is so awesome, can it be made bigger? certainly! The parameter binlog_transaction_dependency_history_size is the capacity of the small notebook; does my MySQL have this small notebook? If your MySQL is newer than mysql-5.7.22, the little notebook is built-in.

That is to say, "WriteSet" is built on the foundation of the giant "group submission", and it is self-adaptive packaging and grouping on the master, so you only need to add two parameters on the master

binlog_transaction_dependency_tracking = WRITESET # COMMIT_ORDER   
transaction_write_set_extraction = XXHASH64

Now that we have covered the theory, let’s look at the practice.

【WriteSet Practice】

I will not go into details about how to build a parallel replication environment based on WriteSet, which is to say, adding two more parameters on the master than the normal "group commit". I will just give the behavioral changes under the two parallel replication modes.

1): The target SQL we want to execute is as follows

create database tempdb;
use tempdb;
create table person(id int not null auto_increment primary key,name int);

insert into person(name) values(1);
insert into person(name) values(2);
insert into person(name) values(3);
insert into person(name) values(5);

2): Take a look at the grouping of the above SQL by group submission

3): See how write_set optimizes group submission

You can see that each insert can be executed in parallel, so they are grouped into the same group (same last_committed); last_committed, sequence_number, these two values ​​are recorded in the binlog. When parsing binlog, I usually use the following options

mysqlbinlog -vvv --base64-output='decode-rows' mysql-bin.000002

【Summarize】

WriteSet is a new parallel replication implementation built on the "group commit" method. It is more flexible than "group commit". Of course, due to the increased concurrency, WriteSet will have better performance than "group commit". When some WriteSets cannot resolve conflicts, they can smoothly transition to the "group commit" mode.

The above is a brief analysis of the details of MySQL WriteSet parallel replication. For more information about MySQL WriteSet parallel replication, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • A brief analysis of MySQL parallel replication
  • A simple explanation of MySQL parallel replication
  • MySQL5.7 parallel replication principle and implementation

<<:  How to insert a link in html

>>:  Implementation of rewrite jump in nginx

Recommend

About WeChat Mini Program to implement cloud payment

Table of contents 1. Introduction 2. Thought Anal...

Solution to the inaccessibility of Tencent Cloud Server Tomcat port

I recently configured a server using Tencent Clou...

Detailed explanation of how to view the current number of MySQL connections

1. View the detailed information of all current c...

React uses routing to redirect to the login interface

In the previous article, after configuring the we...

How to deploy MySQL master and slave in Docker

Download image Selecting a MySQL Image docker sea...

Introduction to MySQL statement comments

MySQL supports three types of comments: 1. From t...

Vue uses ECharts to implement line charts and pie charts

When developing a backend management project, it ...

Mysql queries the transactions being executed and how to wait for locks

Use navicat to test and learn: First use set auto...

How to install MySQL via SSH on a CentOS VPS

Type yum install mysql-server Press Y to continue...

Vue3 implements CSS infinite seamless scrolling effect

This article example shares the specific code of ...

Detailed introduction to Mysql date query

Query the current date SELECT CURRENT_DATE(); SEL...

How to use JS WebSocket to implement simple chat

Table of contents Short Polling Long-Polling WebS...