A case study on using MySQL InnoDB joint indexes correctly

One business feature queries the 5 most recently audited records.

SELECT `id`, `title`
FROM `th_content`
WHERE `audit_time` < 1541984478
 AND `status` = 'ONLINE'
ORDER BY `audit_time` DESC, `id` DESC
LIMIT 5;

Monitoring at the time showed CPU usage above 100%, and SHOW PROCESSLIST showed many similar queries in the Creating sort index state.
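As a rough sketch of how such a pile-up can be spotted, the process list can be grouped by state. This assumes an account allowed to see other sessions (the PROCESS privilege) and a server that still provides information_schema.PROCESSLIST; newer MySQL versions suggest performance_schema.processlist instead.

```sql
-- List currently running statements with their full text.
SHOW FULL PROCESSLIST;

-- Count running queries per state. Many sessions in
-- 'Creating sort index' suggest queries that sort their result
-- instead of reading it in index order.
SELECT state, COUNT(*) AS sessions
FROM information_schema.PROCESSLIST
WHERE command = 'Query'
GROUP BY state
ORDER BY sessions DESC;
```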

The table structure is:

CREATE TABLE `th_content` (
 `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
 `title` varchar(500) CHARACTER SET utf8 NOT NULL DEFAULT '' COMMENT 'Content title',
 `content` mediumtext CHARACTER SET utf8 NOT NULL COMMENT 'Main content',
 `audit_time` int(11) unsigned NOT NULL DEFAULT '0' COMMENT 'Audit time',
 `last_edit_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Last edit time',
 `status` enum('CREATED','CHECKING','IGNORED','ONLINE','OFFLINE') CHARACTER SET utf8 NOT NULL DEFAULT 'CREATED' COMMENT 'Information status',
 PRIMARY KEY (`id`),
 KEY `idx_at_let` (`audit_time`,`last_edit_time`)
)ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The table has a joint index whose leftmost column is audit_time, and no index covers status.

Let's analyze how the SQL above is executed:

Find, in the joint index, the primary key IDs of all entries whose audit_time is less than the given timestamp (if 1 million rows were audited before this timestamp, 1 million primary key IDs are read from the joint index)
MySQL may then apply an optimization (this resembles its Multi-Range Read optimization): sort those 1 million primary key IDs so that, in the following table lookups, primary keys that are close together can be fetched in one disk I/O.
Go back to the table row by row, read the 1 million rows, and filter out those that do not have status = 'ONLINE'
Finally, sort the remaining rows (if 500,000 of them are ONLINE, all 500,000 rows are sorted)

So although only 5 rows are returned, in the extreme example above the query actually reads 1 million rows and then sorts 500,000 of them.

So it is very inefficient.
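To check this reasoning against real data, one could count the rows involved in each step. This is only a sketch: the numbers depend entirely on the data, so none are shown here, and the counts themselves can be expensive on a large table (a replica may be a better place to run them).

```sql
-- Entries matched by the range condition alone: this is how many
-- primary key IDs are read from the joint index on audit_time.
SELECT COUNT(*)
FROM `th_content`
WHERE `audit_time` < 1541984478;

-- Rows left after the status filter: this is how many rows
-- have to be sorted before LIMIT 5 can be applied.
SELECT COUNT(*)
FROM `th_content`
WHERE `audit_time` < 1541984478
  AND `status` = 'ONLINE';
```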

A schematic diagram illustrates the query process in the first step. The pink part marks the rows that ultimately have to be looked up in the table.

I made up the data in the picture following the rules of index storage. If anything is wrong, please leave a comment to point it out. I hope the picture shows how a joint index is stored and how it is searched.

Improvement idea 1

A range condition makes it hard to use an index well. What would improve if we added a joint index on audit_time and status?

ALTER TABLE `th_content` ADD INDEX `idx_audit_status` (`audit_time`, `status`);

mysql> explain select `id`, `title` from `th_content` where `audit_time` < 1541984478 and `status` = 'ONLINE' order by `audit_time` desc, `id` desc limit 5;
+----+-------------+------------+-------+-----------------------------------+------------------+---------+------+--------+-------------+
| id | select_type | table      | type  | possible_keys                     | key              | key_len | ref  | rows   | Extra       |
+----+-------------+------------+-------+-----------------------------------+------------------+---------+------+--------+-------------+
|  1 | SIMPLE      | th_content | range | idx_at_ft_pt_let,idx_audit_status | idx_audit_status | 4       | NULL | 209754 | Using where |
+----+-------------+------------+-------+-----------------------------------+------------------+---------+------+--------+-------------+

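Detail: because audit_time is matched by a range condition, the second index column cannot be used to narrow the scan. Only audit_time is used, so key_len is 4 (the size of the INT column). With the index in idea 2 below, both columns are used and key_len is 5: 1 byte for the status ENUM plus 4 bytes for audit_time.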

Let's analyze the execution process after adding the index:

In the joint index, locate the entry with the largest audit_time that is less than the given timestamp
From there, scan the index entries one by one toward smaller audit_time values. Because audit_time < 1541984478 is a range condition, the status values within that range are scattered, so each entry has to be checked against status = 'ONLINE' until the 5th matching entry is found.
Go back to the table to fetch the required data for those 5 rows

In the diagram above, pink marks the entries that satisfy the condition on the first index column. Scanning them one by one, we find three matching records on this leaf node, so we continue to the left into the previous leaf node until 5 matching entries are found, and only then go back to the table.

Improvements

Because the status value is stored in the index, rows can be checked against status = 'ONLINE' without going back to the table. Only the 5 matching rows are read from the table, which greatly reduces IOPS.

Disadvantages of this index

If the first five entries scanned in idx_audit_status all have status ONLINE, only five entries need to be scanned.

If only 4 of the first 1 million entries scanned in idx_audit_status have status ONLINE, at least 1,000,001 entries must be scanned to get the required 5 rows. The number of index entries this query scans is therefore unpredictable.
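One way to get a feel for how many index entries are scanned on real data is to look at the session's handler counters around the query. This is a sketch under a few assumptions: the account may run FLUSH STATUS, idx_audit_status exists, and the counters are read as a rough indication only, since features such as index condition pushdown can change what is counted.

```sql
-- Reset this session's status counters.
FLUSH STATUS;

-- Run the query, forcing the index from idea 1.
SELECT `id`, `title`
FROM `th_content` FORCE INDEX (`idx_audit_status`)
WHERE `audit_time` < 1541984478
  AND `status` = 'ONLINE'
ORDER BY `audit_time` DESC, `id` DESC
LIMIT 5;

-- Handler_read_prev counts requests for the previous entry in
-- index order, which is how a descending scan walks the index.
-- A value far above 5 means many non-ONLINE entries were skipped.
SHOW SESSION STATUS LIKE 'Handler_read%';
```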

Improvement idea 2

ALTER TABLE `th_content` DROP INDEX `idx_audit_status`;
ALTER TABLE `th_content` ADD INDEX `idx_status_audit` (`status`, `audit_time`);

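With status as the leading column, all ONLINE entries sit next to each other in the index, ordered by audit_time and then by id (InnoDB appends the primary key to every secondary index entry). The query can locate the last ONLINE entry before the timestamp, read 5 entries backward, and go back to the table only for those 5 rows, so neither sorting nor table lookups put pressure on the server.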
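A quick way to check that the new index is used as intended is to run EXPLAIN again. The expectations in the comments are what the reasoning in this article predicts, not output captured from a real server.

```sql
-- Same query as before; only the available index has changed.
EXPLAIN
SELECT `id`, `title`
FROM `th_content`
WHERE `audit_time` < 1541984478
  AND `status` = 'ONLINE'
ORDER BY `audit_time` DESC, `id` DESC
LIMIT 5;

-- What to look for in the result:
--   key     : idx_status_audit
--   key_len : 5 (1 byte for status + 4 bytes for audit_time)
--   Extra   : no "Using filesort", because the index order for a
--             fixed status already matches the ORDER BY
```

If the optimizer picks a different index, adding FORCE INDEX (`idx_status_audit`) after the table name makes it possible to compare the two plans.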
