Reasons for the sudden drop in MySQL performance

Reasons for the sudden drop in MySQL performance

Sometimes you may encounter a situation where a SQL statement is executed normally and very quickly. But suddenly at some point the execution will be very slow, and this scenario cannot be reproduced and can only be sent randomly.

Reasons why SQL execution suddenly slows down

When explaining MySQL Redo log before, I mentioned the WAL mechanism. In order to ensure the speed of MySQL updates, when performing an update operation, the update content is first written to the redo log, and then the content of the redo log is applied to the disk when the system is idle.

When the contents of the memory data page (redo log) and the disk data page are inconsistent, the memory is also called a "dirty page". After the memory data is written to the disk, the data is consistent and the memory page is called a "clean page".

When memory data is written to disk, this process is called flush process. SQL execution suddenly becomes very slow and performance degrades. The reason may be related to the flush operation.

Because when the flush operation is in progress, the update operation will wait for the redo log to be written.

Causes of flush operation

Scenario 1: The redo log is full. At this time, the system will stop the update operation and push the check point forward to leave space in the redo log for continued writing.

Here it is assumed that the gap from CP to CP' has been written to the disk, and this part becomes a clean page. At this time, write pos can be written to this area.

Scenario 2: When the system is short of memory and needs new memory pages, it is found that the memory is insufficient and some data pages need to be eliminated. If the data page is dirty when it is eliminated, the dirty page must be written to disk.

At this time, there is a question: the content in the named redo log has been recorded in the log. If the memory is full, can't we just delete it? The next time you read in, apply the contents of the redo log to the disk.

The reason why we did not choose to clear the memory directly is for performance reasons. There are two situations when querying data:

  • First, the data page is in memory, and the memory is the correct result, which is returned directly
  • There is no data in the memory, so read it into the memory from the data file.

So this is more efficient.

Scenario 3: MySQL will enter the flush operation when the system is idle.

Scenario 4: When MySQL is shut down normally, dirty pages in memory will be flushed to disk.

The impact of flushing on performance

The third and fourth scenarios are relatively normal and there is no need to consider performance issues.

InnoDB will try to avoid the first scenario because in this case the entire system no longer accepts updates.

But sometimes human configuration errors occur, such as an instance with 128 GB of memory and innodb_io_capacity set to 20000. It is usually recommended to set the redo log to four 1GB files. But due to a configuration error, it was set to a 100M file.

Since the redo log is set too small, it will be filled up quickly. Write pos keeps catching up with checkpoint. At this point, the system can only stop all updates and advance the checkpoint.

The performance is that the disk IO is very small, but there is intermittent performance degradation.

For the second scenario, when the memory is insufficient, InnoDB will use the buffer pool to manage the memory.

Memory pages can be in three states in the buffer pool:

  • Unused data pages
  • I used it, but it is a clean page
  • Used, dirty page

Each data page header has an LSN, which is 8 bytes and increases with each modification.

Compare this LSN with the checkpoint LSN. The one smaller than the checkpoint LSN must be a clean page.

Since InnoDB's strategy is to use as much memory as possible, there are very few unused pages for long-running databases.

When it is found that the data page you want to read is not in the memory, you must apply for the data page from the buffer pool. And the data pages that have not been used for the longest time will be eliminated from memory

If it is a clean page, it can be released directly. If it is a dirty page, it must be flushed to disk first to become a clean page before it can be reused. If the dirty page is flushed in the following situations, the performance will be significantly affected:

Too many dirty pages need to be eliminated, resulting in longer query response time.
The log is full and updates are blocked.
To solve this problem, InnoDB uses a mechanism to control the proportion of dirty pages to avoid the above situation.

InooDB control strategy for flushing dirty pages

In InnoDB, the innodb_io_capacity parameter is used to tell InnoDB the current disk capacity of the host. It is recommended to set this value to the disk's IOPS.

You can test it with the fio tool:

fio -filename=$filename -direct=1 -iodepth 1 -thread -rw=randrw -ioengine=psync -bs=16k -size=500M -numjobs=10 -runtime=10 -group_reporting -name=mytest

Performance issues caused by innodb_io_capacity are common. For example, sometimes the system throughput (TPS) is low and writes are slow, but the disk IO is not high. It is possible that the parameter is set incorrectly. For example, the value of innodb_io_capacity is set very low, but the disk used is an SSD, which causes InnoDB to think that the system capacity is very poor, so the dirty page flushing is very slow. This causes dirty pages to accumulate, affecting query and update performance.

InnoDB mainly considers two factors when flushing disks:

  1. The proportion of dirty pages
  2. Redo log writing speed

Two numbers will be calculated first using these two factors separately.

innodb_max_dirty_pages_pct dirty page ratio upper limit, default 75%.

InnoDB calculates a number in the range of 0 - 100 based on the proportion of dirty pages (M). , the process is called F1 (M)

# M Dirty page ratio F1(M)
{
 if M>=innodb_max_dirty_pages_pct then
  return 100;
 return 100*M/innodb_max_dirty_pages_pct;
}

In addition, each time InnoDB writes a log, it will have a sequence number N. Then a number between 0 and 100 is calculated based on N. This calculation process is called F2(N).

N: The difference between the currently written sequence number and the checkpoint corresponding sequence number.

Finally, based on the two values ​​of F1 (M) and F2 (N), the larger value is taken as R, and then the engine can control the speed of flushing dirty pages according to innodb_io_capacity * R.

Therefore, no matter whether you are querying, you need to load data into the memory data page and eliminate the dirty page. Whether it is during an update or a disk flush, it is possible that the performance of MySQL will be degraded.

To avoid this situation, you should set the value of innodb_io_capacity reasonably and pay more attention to the proportion of dirty pages, so as not to let it approach 75%.

The dirty page ratio can be obtained in the following way:

mysql> use performance_schema;
mysql> select VARIABLE_VALUE into @a from global_status where VARIABLE_NAME = 'Innodb_buffer_pool_pages_dirty';
select VARIABLE_VALUE into @b from global_status where VARIABLE_NAME = 'Innodb_buffer_pool_pages_total';
select @a/@b;

In addition to the first time, when a query operation is in progress, if a dirty page needs to be flushed, if the neighbor of the dirty page is also a dirty page, the neighbor will be flushed together. If there happens to be a dirty page next to it, it will continue to be flushed. This is the reason why flush is too slow.

This behavior can be controlled by innodb_flush_neighbors . A value of 1 turns on the above mechanism, and a value of 0 turns it off.

For mechanical hard disks, a lot of random IO can be reduced, because the IOPS of mechanical hard disks is generally only a few hundred, and reducing random IO means performance improvement.

However, if you use a device with higher IOPS, such as SSD, IOPS is often not a bottleneck and you can just turn it off to reduce the response time of SQL statements.

In 8.0, the default is 0.

Summarize

This article mainly introduces how the flush operation during WAL may cause a sudden performance drop in MySQL.

The cause is usually due to insufficient memory, and the InnoDB flush process can be controlled by setting the appropriate innodb_io_capacity parameter.

The above are the details of the reasons why MySQL performance suddenly degraded. For more information about MySQL performance degradation, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • How to analyze MySQL query performance
  • Full steps to create a high-performance index in MySQL
  • Introduction to the use of MySQL performance stress benchmark tool sysbench
  • MySQL performance optimization index pushdown
  • Solutions to Mysql index performance optimization problems
  • MySQL performance optimization tips
  • MySQL 20 high-performance architecture design principles (worth collecting)
  • Summary of Mysql high performance optimization skills
  • Detailed explanation of GaussDB for MySQL performance optimization

<<:  After idea publishes web project, Tomcat server cannot find the project and its solution

>>:  Detailed explanation of Vue3 encapsulation Message message prompt instance function

Recommend

Docker deployment of Kafka and Spring Kafka implementation

This article mainly introduces the deployment of ...

The best solution for implementing digital plus and minus buttons with pure CSS

Preface: For the implementation of digital additi...

Common operation commands of MySQL in Linux system

Serve: # chkconfig --list List all system service...

The past two years with user experience

<br />It has been no more than two years sin...

Installation and configuration of MySQL 5.7.17 free installation version

MYSQL version: MySQL Community Server 5.7.17, ins...

Detailed explanation of the practical application of centos7 esxi6.7 template

1. Create a centos7.6 system and optimize the sys...

How to implement Linux deepin to delete redundant kernels

The previous article wrote about how to manually ...

Implementation of services in docker accessing host services

Table of contents 1. Scenario 2. Solution 3. Conc...

How to install docker under centos and remotely publish docker in springboot

Table of contents 1. Installation of JDK1.8 under...

Detailed tutorial on installing nvidia driver + CUDA + cuDNN in Ubuntu 16.04

Preparation 1. Check whether the GPU supports CUD...

Compile CPP files using G++ in Ubuntu

When I used g++ to compile the cpp file for the f...

Example code for implementing stacked carousel effect with HTML+CSS+JS

Effect: When the slideshow moves in one direction...

How to expand the disk space of Linux server

Table of contents Preface step Preface Today I fo...