Troubleshooting of master-slave delay issues when upgrading MySQL 5.6 to 5.7

Troubleshooting of master-slave delay issues when upgrading MySQL 5.6 to 5.7

Recently, when upgrading the Zabbix database from MySQL 5.6 to 5.7, a master-slave delay problem occurred. This problem has troubled me for a long time and has not been solved. It was finally solved yesterday. I sorted out the entire troubleshooting process and shared it with everyone.

Environmental Description:

The MySQL master database is version 5.6, and there are four slave databases, three of which are version 5.6 and one is version 5.7. The library and table structures of all masters and slaves are consistent. The 5.7 slave database has a lot of delays, while the 5.6 slave database has no problem. The business is monitored by Zabbix, and basically all of them are insert batch operations. Each insert SQL inserts about 400-1000 rows of data.

question:

The slave database of MySQL5.7 has a large number of delays, the relaylog is written to the disk normally, and the application to the database is slow. There is no pressure on disk IO and CPU. There is no difference between sync_binlog being 20000 or 0, max_allowed_packet=128M, innodb_flush_log_at_trx_commit=0, bulk_insert_buffer_size = 128M, binlog_format=row, sync_relay_log=10000, parallel replication is not used, SSL is not enabled, GDID is not enabled, and semi-synchronization is not enabled.

Troubleshooting process:

1: Check each performance-related parameter and find no abnormality.

2: Checking the network card, hard disk, changing the server, and restarting the database server had no effect. The 5.7 delay still existed, so hardware problems were ruled out.

3: 5.7 synchronizes the binlog of the main database 5.6 to the relaylog quickly and normally, but the efficiency of replaying the relaylog in the 5.7 database is extremely low.

4: Compare the show engine innodb status results of 5.6 and 5.7 slaves:

=============5.6===============================
---BUFFER POOL 1
Buffer pool size 655359
Buffer pool size, bytes 10737401856
Free buffers 1019
Database pages 649599
Old database pages 239773
Modified db pages 119309
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 10777670, not young 181119246
13.90 youngs/s, 157.51 non-youngs/s
Pages read 8853516, created 135760152, written 784514803
20.96 reads/s, 58.17 creates/s, 507.02 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 2 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 649599, unzip_LRU len: 0
I/O sum[209618]:cur[2], unzip sum[0]:cur[0]
=============5.7==============================
---BUFFER POOL 1
Buffer pool size 819100
Buffer pool size, bytes 13420134400
Free buffers 1018
Database pages 722328
Old database pages 266620
Modified db pages 99073
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 37153, not young 795
0.00 youngs/s, 0.00 non-youngs/s
Pages read 149632, created 572696, written 2706369
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 722328, unzip_LRU len: 453903
I/O sum[98685]:cur[0], unzip sum[882]:cur[6]
+++++++++++++++++++++++

By comparison, we found that unzip has a value in 5.7, but not in 5.6. We initially suspected that the cause of the delay is related to compression and decompression.

5: Use perf top -p pidof mysqld to view the 5.7 slave library

It is found that libz.so.1.2.7[.]cc32 accounts for a higher proportion than mysqld, at about 6%. This library is related to compression and decompression.

6: Change the innodb_compression_level to 0 (that is, do not enable compression, the default is 6, the range is 0-9), and observe that there is no effect and the delay still exists. only

The share of libz has gone down, but the share of libc-2.17.so has gone up, higher than mysqld, at around 9%. Use pstack to view the waiting issues of the decompression in the research institute.

7: Check the history tables of zabbix. In order to save disk space, these tables were compressed:

CREATE TABLE trends (
itemid bigint(20) unsigned NOT NULL,
clock int(11) NOT NULL DEFAULT '0',
num int(11) NOT NULL DEFAULT '0',
value_min double(16,4) NOT NULL DEFAULT '0.0000',
value_avg double(16,4) NOT NULL DEFAULT '0.0000',
value_max double(16,4) NOT NULL DEFAULT '0.0000',
PRIMARY KEY (itemid,clock),
KEY clock (clock)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8

I suspect it is related to the compression parameter ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8.

8: Rebuild all historical tables, remove ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8, resynchronize, gradually reduce the delay, and recover.

Question: Why does the same table structure cause master-slave delay in 5.7 but not in 5.6? This may be caused by the backward compatibility issues of compression and decompression in MySQL 5.7. I didn't investigate it further, but I reported a bug to the official and asked them to check the source code: http://bugs.mysql.com/100702.

Please use ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 with caution in production. After communicating with several experts in the industry, they said that the compression of MySQL versions before 8.0 is not very reliable, and using ZSTD for 8.0 is better.

This is the end of this article about the process of troubleshooting the master-slave delay problem when upgrading MySQL 5.6 to 5.7. For more information about the master-slave delay when upgrading MySQL 5.6 to 5.7, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • MySQL master-slave replication delay causes and solutions
  • Causes and solutions for MySQL master-slave synchronization delay
  • Detailed analysis of MySQL master-slave delay phenomenon and principle
  • Analysis and solution of MYSQL master-slave asynchrony delay principle
  • Detailed explanation of reducing MySQL master-slave data synchronization delay
  • In-depth explanation of MySQL master-slave replication delay problem
  • Solution to MySQL master-slave delay problem

<<:  Vue implements star rating with decimal points

>>:  js development plug-in to achieve tab effect

Recommend

Summary of Node.js service Docker container application practice

This article will not explain the use and install...

VMWare15 installs Mac OS system (graphic tutorial)

Installation Environment WIN10 VMware Workstation...

Code to enable IE8 in IE7 compatibility mode

The most popular tag is IE8 Browser vendors are sc...

How to Develop a Progressive Web App (PWA)

Table of contents Overview Require URL of the app...

Analysis of MySQL example DTID master-slave principle

Table of contents 1. Basic Concepts of GTID 2. GT...

Centering the Form in HTML

I once encountered an assignment where I was give...

Usage instructions for the docker create command

The docker create command can create a container ...

Docker's flexible implementation of building a PHP environment

Use Docker to build a flexible online PHP environ...

Recommended plugins and usage examples for vue unit testing

Table of contents frame First-class error reporti...

How to use ElementUI pagination component Pagination in Vue

The use of ElementUI paging component Pagination ...

Vue echarts realizes horizontal bar chart

This article shares the specific code of vue echa...