Recently, when upgrading the Zabbix database from MySQL 5.6 to 5.7, a master-slave delay problem occurred. This problem has troubled me for a long time and has not been solved. It was finally solved yesterday. I sorted out the entire troubleshooting process and shared it with everyone. Environmental Description: The MySQL master database is version 5.6, and there are four slave databases, three of which are version 5.6 and one is version 5.7. The library and table structures of all masters and slaves are consistent. The 5.7 slave database has a lot of delays, while the 5.6 slave database has no problem. The business is monitored by Zabbix, and basically all of them are insert batch operations. Each insert SQL inserts about 400-1000 rows of data. question: The slave database of MySQL5.7 has a large number of delays, the relaylog is written to the disk normally, and the application to the database is slow. There is no pressure on disk IO and CPU. There is no difference between sync_binlog being 20000 or 0, max_allowed_packet=128M, innodb_flush_log_at_trx_commit=0, bulk_insert_buffer_size = 128M, binlog_format=row, sync_relay_log=10000, parallel replication is not used, SSL is not enabled, GDID is not enabled, and semi-synchronization is not enabled. Troubleshooting process: 1: Check each performance-related parameter and find no abnormality. 2: Checking the network card, hard disk, changing the server, and restarting the database server had no effect. The 5.7 delay still existed, so hardware problems were ruled out. 3: 5.7 synchronizes the binlog of the main database 5.6 to the relaylog quickly and normally, but the efficiency of replaying the relaylog in the 5.7 database is extremely low. 4: Compare the show engine innodb status results of 5.6 and 5.7 slaves: =============5.6=============================== ---BUFFER POOL 1 Buffer pool size 655359 Buffer pool size, bytes 10737401856 Free buffers 1019 Database pages 649599 Old database pages 239773 Modified db pages 119309 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 10777670, not young 181119246 13.90 youngs/s, 157.51 non-youngs/s Pages read 8853516, created 135760152, written 784514803 20.96 reads/s, 58.17 creates/s, 507.02 writes/s Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 2 / 1000 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s LRU len: 649599, unzip_LRU len: 0 I/O sum[209618]:cur[2], unzip sum[0]:cur[0] =============5.7============================== ---BUFFER POOL 1 Buffer pool size 819100 Buffer pool size, bytes 13420134400 Free buffers 1018 Database pages 722328 Old database pages 266620 Modified db pages 99073 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 37153, not young 795 0.00 youngs/s, 0.00 non-youngs/s Pages read 149632, created 572696, written 2706369 0.00 reads/s, 0.00 creates/s, 0.00 writes/s Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s LRU len: 722328, unzip_LRU len: 453903 I/O sum[98685]:cur[0], unzip sum[882]:cur[6] +++++++++++++++++++++++ By comparison, we found that unzip has a value in 5.7, but not in 5.6. We initially suspected that the cause of the delay is related to compression and decompression. 5: Use perf top -p pidof mysqld to view the 5.7 slave library It is found that libz.so.1.2.7[.]cc32 accounts for a higher proportion than mysqld, at about 6%. This library is related to compression and decompression. 6: Change the innodb_compression_level to 0 (that is, do not enable compression, the default is 6, the range is 0-9), and observe that there is no effect and the delay still exists. only The share of libz has gone down, but the share of libc-2.17.so has gone up, higher than mysqld, at around 9%. Use pstack to view the waiting issues of the decompression in the research institute. 7: Check the history tables of zabbix. In order to save disk space, these tables were compressed: CREATE TABLE trends ( itemid bigint(20) unsigned NOT NULL, clock int(11) NOT NULL DEFAULT '0', num int(11) NOT NULL DEFAULT '0', value_min double(16,4) NOT NULL DEFAULT '0.0000', value_avg double(16,4) NOT NULL DEFAULT '0.0000', value_max double(16,4) NOT NULL DEFAULT '0.0000', PRIMARY KEY (itemid,clock), KEY clock (clock) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 I suspect it is related to the compression parameter ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8. 8: Rebuild all historical tables, remove ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8, resynchronize, gradually reduce the delay, and recover. Question: Why does the same table structure cause master-slave delay in 5.7 but not in 5.6? This may be caused by the backward compatibility issues of compression and decompression in MySQL 5.7. I didn't investigate it further, but I reported a bug to the official and asked them to check the source code: http://bugs.mysql.com/100702. Please use ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8 with caution in production. After communicating with several experts in the industry, they said that the compression of MySQL versions before 8.0 is not very reliable, and using ZSTD for 8.0 is better. This is the end of this article about the process of troubleshooting the master-slave delay problem when upgrading MySQL 5.6 to 5.7. For more information about the master-slave delay when upgrading MySQL 5.6 to 5.7, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope everyone will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: Vue implements star rating with decimal points
>>: js development plug-in to achieve tab effect
When using Docker containers, it is more convenie...
If you want to wrap the text in the textarea input...
Recently, when I was drawing an interface, I enco...
Note: In web development, after adding autocomplet...
1. Enable remote access to the docker server Log ...
Ⅰ. Problem description: Use CSS to achieve 3D con...
Table of contents Preface start React Lifecycle R...
There is a project developed on Mac, and the pack...
Table of contents 1. Use help information 2. Crea...
In the XHTML language, we all know that the ul ta...
Table of contents Overview What is lazy loading? ...
Table of contents 1. Project Integration 1. CDN i...
Table of contents Features Preservation strategy ...
"We're writing our next set of mobile pr...
This article example shares the specific code of ...