Comprehensive analysis of MySql master-slave replication mechanism

Comprehensive analysis of MySql master-slave replication mechanism

As a relational database, MySQL provides a built-in data replication mechanism, which allows advanced features such as high-availability architecture to be implemented based on its replication mechanism, making MySQL suitable for production environments without the need for additional plug-ins or other tools. This is one of the conditions for MySQL to be widely used in practice.

The replication mechanism based on MySQL can not only achieve high availability of the database, but also realize advanced features such as performance expansion, off-site disaster recovery, and hot and cold separation.

  • High availability: By configuring a certain replication mechanism, MySQL implements cross-host data replication, thereby achieving a certain degree of high availability. If you need to achieve higher availability, you only need to configure multiple copies or perform cascading replication to achieve the goal.
  • Performance expansion: Since the replication mechanism provides multiple data backups, in scenarios where read-write consistency requirements are not high, you can configure one or more replicas to distribute read requests to replica nodes, thereby improving the overall read-write performance.
  • Offsite disaster recovery: You only need to deploy the replica node to an offsite computer room to easily obtain a certain offsite disaster recovery capability. In practice, factors such as network latency that may affect the overall performance need to be considered.
  • Transaction separation: By configuring a replication mechanism and sending low-frequency, high-computation transactions to replica nodes for execution, these transactions can be prevented from competing with high-frequency transactions for computing resources, thereby avoiding overall performance issues.

To obtain the above capabilities, you need to understand the basic MySQL replication mechanism and select the appropriate configuration based on the actual application scenario.

Master-slave replication mechanism

MySQL implements master-slave replication based on binlog. The slave node tracks and obtains the latest updates in the master node binlog and replays them in itself, thereby replicating the master node data.

The following figure is a schematic diagram of the MySQL master-slave replication process. There are three threads involved in the whole process, and their responsibilities are:

  • Master node binlog dump thread: This thread is created after the slave node connects to the master node and is responsible for sending the newly written data in the binlog to the slave node. When reading binlog, the dump thread first acquires the binlog lock, releases it immediately after reading, and then sends the read data to the slave node.
  • Slave node I/O thread: The slave node I/O thread is responsible for sending data synchronization requests to the master node, receiving data sent by the master node and writing it to the relay-log.
  • Slave node SQL thread: This thread reads data updates from the relay-log and replays them.

Asynchronous replication

By default, MySQL's master-slave replication is asynchronous replication. Under this mechanism, the master node will respond to the client's request immediately after completing the local log writing, and the data replication process of the slave node is executed asynchronously.

Obviously, under this mechanism, since the replication process does not affect the primary node's response to client requests, there is no significant loss in overall performance compared to a single node.

However, under this mechanism, if the master node crashes when the data is committed but not synchronized to the slave node, if a master-slave switch occurs and new data is written, data loss or inconsistency may occur.

Semisynchronous replication

Starting from version 5.6, MySQL supports semisynchronous replication, which has the following differences compared to asynchronous replication:

After receiving the client's request, the master node must complete the log writing of its own node and wait for at least one slave node to complete the data synchronization response (or time out) before responding to the request.

The slave node will respond to the master node only after writing to the relay-log and completing the disk flush.

When the slave node responds to a timeout, the master node will degenerate the synchronization mechanism to asynchronous replication. After at least one slave node recovers and completes data catch-up, the master node will restore the synchronization mechanism to semi-synchronous replication.

It can be seen that compared with asynchronous replication, semi-synchronous replication improves data availability to a certain extent. When it has not degenerated to asynchronous replication, if the master node goes down, the data has been copied to at least one slave node.

At the same time, since the slave node needs to complete the response to the client, compared with asynchronous replication, this time requires more time for network interaction between the master and slave nodes and the time it takes for the slave node to write files and flush them to the disk. Therefore, the overall cluster's response performance to the client is bound to be reduced.

Master-slave replication format

Since MySQL's replication mechanism is based on binlog, the format of binlog determines the format of master-slave replication. There are two types of binlog: row-based and statement-based, so replication also has two corresponding formats.

Statement-Based Replication (SBR)

For statement-based replication, the binlog records only the executed statements. This method has the following advantages:

  • It has been in existence since version 3.23 and has been a proven technology for a long time.
  • Less data is written to the log file, which means less file writing and network transmission consumption, so that the master-slave replication can be completed faster overall, improving performance.
  • The log file records all statements executed on the database and can be used for auditing and other purposes.

There are the following disadvantages:

  • User-defined functions (UDFs) and functions whose execution results are uncertain cannot be copied.
  • When updating data, more row locks are required than row-based replication
  • For complex statements such as insert first and then update, the slave node needs to perform a complete corresponding replay, while row-based replication only needs to execute the final result.

Row-Based Replication (RBR)

In the row-based replication mechanism, the corresponding binlog is also row-based. In this case, each time data is updated and written to the binlog, changes in all affected rows are converted.

This replication method has the following advantages:

  • All data changes can be safely replicated and will not be affected by UDFs and special functions.
  • Most DBMS adopt this replication method, and the cost of knowledge migration is low.
  • When updating data, fewer row locks are required, resulting in higher performance.

There are the following disadvantages:

  • When DML involves large amounts of data, row-based logs will generate a large amount of log data. Large amounts of data mean longer time for log file writing and network transmission, which may lead to a significant deterioration in overall performance and may also cause concurrency problems.
  • It is not possible to view the executed statements through the log, and it is also impossible to know the statements executed on the slave node.

In actual architecture applications, it is necessary to reasonably use the master-slave replication mechanism according to the business characteristics of the system and select the appropriate master-slave replication format.

The above is the detailed content of the comprehensive analysis of the MySql master-slave replication mechanism. For more information about the MySql master-slave replication mechanism, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • MySQL5.7 parallel replication principle and implementation
  • Detailed explanation of MySQL master-slave replication and read-write separation
  • Common repair methods for MySQL master-slave replication disconnection
  • Analysis of three parameters of MySQL replication problem
  • MySQL Series 13 MySQL Replication

<<:  A brief summary of basic web page performance optimization rules

>>:  Example of using CSS3 to customize the style of input multiple-select box

Recommend

Record of the actual process of packaging and deployment of Vue project

Table of contents Preface 1. Preparation - Server...

Practice of multi-layer nested display of element table

There is a requirement for a list containing mult...

Implementation of deploying Apollo configuration center using docker in CentOS7

Apollo open source address: https://github.com/ct...

Detailed explanation of the execution principle of MySQL kill command

Table of contents Kill instruction execution prin...

Example of deploying MySQL on Docker

Table of contents 1 What is container cloud? 2 In...

Learn how to write neat and standard HTML tags

Good HTML code is the foundation of a beautiful w...

Implementing access control and connection restriction based on Nginx

Preface Nginx 's built-in module supports lim...

An example of how Tomcat manages Session

Learned ConcurrentHashMap but don’t know how to a...

Use of MySQL query rewrite plugin

Query Rewrite Plugin As of MySQL 5.7.6, MySQL Ser...

The implementation process of extracting oracle data to mysql database

In the migration of Oracle database to MySQL data...