Record a slow query event caused by a misjudgment of the online MySQL optimizer

Record a slow query event caused by a misjudgment of the online MySQL optimizer

Preface:

I received crazy slow query and request timeout alerts. I analyzed the anomalies from MySQL requests through metrics, and saw a lot of slow queries in cli —> show procedureslist. This SQL did not exist before, and this problem occurred later due to the increase in data volume. Although the feeds table is as large as 100 million, the frequent IO is not caused by inefficient innodb_buffer_pool_size because the feeds stream information has the characteristic of being hot recently. Later, after further explain execution plan analysis, the reason was found out. The MySQL query optimizer chose an index that it considered efficient.

The MySQL query optimizer is reliable in most cases! But you should be careful when your SQL language contains multiple indexes, as the final result is often a bit confusing. Because MySQL can only use one index for the same SQL, which one should I choose? When the amount of data is small, the MySQL optimizer will put the primary key index at the end and give priority to index and unique. When you reach a certain data level, and because your query operation has in, the MySQL query optimizer is likely to choose the primary key!

Remember one thing, MySQL query optimization is based on retrieval cost considerations, not time cost considerations. The optimizer calculates the cost based on the existing data status instead of actually executing the SQL statement.

Therefore, the MySQL optimizer cannot achieve the optimization effect every time. It cannot accurately estimate the cost. If you want to accurately obtain the cost of going through each index, you have to actually execute it once to know it. Therefore, cost analysis is just an estimate, and since it is an estimate, there will be misjudgments.

The table we are talking about here is the feed information flow table. We know that the feeds information flow table is not only frequently accessed, but also has a large amount of data. However, the data structure of this table is very simple, and the index is also simple. There are only two indexes in total, one is the primary key index, and the other is the unique key index.

As shown below, the size of this table has reached 100 million. Because there are enough cache front-ends and for various reasons, there is no time to shard the database and tables.

The problem is that when the amount of data is less than 100 million, the MySQL optimizer chooses to use the index index. When the amount of data exceeds 100 million, the MySQL query optimizer chooses to use the primary key index. The problem this brings is that the query speed is too slow.

This is the normal situation:

mysql> explain SELECT * FROM `feed` WHERE user_id IN (116537309,116709093,116709377) AND cid IN (1001,1005,1054,1092,1093,1095) AND id <= 128384713 ORDER BY id DESC LIMIT 0, 11 \G;
*************************** 1. row ***************************
      id: 1
 select_type: SIMPLE
    table: feed
  partitions: NULL
     type: range
possible_keys: PRIMARY,feed_user_target
     key: feed_user_target
   key_len: 6
     ref: NULL
     rows: 18
   filtered: 50.00
    Extra: Using where; Using index; Using filesort
1 row in set, 1 warning (0.00 sec)

For the same SQL statement, when the amount of data changes significantly, the MySQL query optimizer's choice of index also changes.

mysql> explain SELECT * FROM `feed` WHERE user_id IN (116537309,116709093,116709377) AND cid IN (1001,1005,1054,1092,1093,1095) AND id <= 128384713 ORDER BY id DESC LIMIT 0, 11 \G;
*************************** 1. row ***************************
      id: 1
 select_type: SIMPLE
    table: feed
     type: range
possible_keys: PRIMARY,feed_user_target
     key: PRIMARY
   key_len: 4
     ref: NULL
     rows: 11873197
    Extra: Using where
1 row in set (0.00 sec)

The solution is to use force index to force the query optimizer to use the index we give. I am using Python development environment here. Common Python ORMs have force index, ignore index, and user index parameters.

explain SELECT * FROM `feed` force index (feed_user_target) WHERE user_id IN (116537309,116709093,116709377) ...

So how can we prevent the problem that the MySQL optimizer chooses an inefficient index due to the increase in data?

I consulted several DBAs from different factories about this problem, and the answers they got were the same as our method. The problem can only be discovered through slow queries in the later stage, and then force index is specified in the SQL statement to solve the index problem. In addition, such problems will be avoided in the early stage of system launch, but business developers often cooperate with DBAs in the initial review work, but in the later stage, in order to save trouble, or they think there is no problem, MySQL query accidents occur.

I have only a vague understanding of the MySQL optimizer's index selection rules, and I plan to spend some time studying the rules later.

You may also be interested in:
  • How to locate MySQL slow queries
  • MySQL slow query optimization and slow query log analysis example tutorial
  • A brief talk about MySQL optimization tool - slow query
  • Mysql slow query optimization method and optimization principle
  • How to optimize MySQL performance through MySQL slow query
  • Detailed example of locating and optimizing slow query sql in MySQL

<<:  Vue integrates a rich text editor that supports image zooming and dragging

>>:  Use iptables and firewalld tools to manage Linux firewall connection rules

Recommend

Summary of common MySQL table design errors

Table of contents Mistake 1: Too many columns of ...

Detailed steps to install Docker mongoDB 4.2.1 and collect springboot logs

1: Install mongodb in docker Step 1: Install mong...

The difference between float and position attributes in CSS layout

CSS Layout - position Property The position attri...

Implementation example of Vue+Element+Springboot image upload

Recently, I happened to be in touch with the vue+...

Tutorial on installing mongodb under linux

MongoDB is cross-platform and can be installed on...

How to solve the element movement caused by hover-generated border

Preface Sometimes when hover pseudo-class adds a ...

Summary of the dockerfile-maven-plugin usage guide

Table of contents pom configuration Setting.xml c...

MySQL trigger simple usage example

This article uses examples to illustrate the simp...

Sample code for implementing menu permission control in Vue

When people are working on a backend management s...

Summary of MySQL character sets

Table of contents Character Set Comparison Rules ...

Summary of MySQL time statistics methods

When doing database statistics, you often need to...

The difference between Display, Visibility, Opacity, rgba and z-index: -1 in CSS

We often need to control the hidden, transparent ...