Recently, many friends asked children a question about LIMIT in the Q&A group. Let me briefly describe this question below. questionIn order for the story to develop smoothly, we must first have a table: CREATE TABLE t ( id INT UNSIGNED NOT NULL AUTO_INCREMENT, key1 VARCHAR(100), common_field VARCHAR(100), PRIMARY KEY (id), KEY idx_key1 (key1) ) Engine=InnoDB CHARSET=utf8; Table t contains 3 columns, the id column is the primary key, and the key1 column is the secondary index column. The table contains 10,000 records. When we execute the following statement, the secondary index idx_key1 is used: mysql> EXPLAIN SELECT * FROM t ORDER BY key1 LIMIT 1; +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+------+------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+------+------+ | 1 | SIMPLE | t | NULL | index | NULL | idx_key1 | 303 | NULL | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+------+------+ 1 row in set, 1 warning (0.00 sec) This is easy to understand because in the secondary index idx_key1, the key1 column is ordered. If the query is to obtain the first record sorted by the key1 column, MySQL only needs to obtain the first secondary index record from idx_key1, and then directly return to the table to obtain the complete record. However, if we change LIMIT 1 in the above statement to LIMIT 5000, 1, we need to scan the entire table and perform filesort. The execution plan is as follows: mysql> EXPLAIN SELECT * FROM t ORDER BY key1 LIMIT 5000, 1; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+ | 1 | SIMPLE | t | NULL | ALL | NULL | NULL | NULL | NULL | 9966 | 100.00 | Using filesort | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+ 1 row in set, 1 warning (0.00 sec) Some students don't understand: LIMIT 5000, 1 can also use the secondary index idx_key1. We can first scan the 5001th secondary index record and then perform a table return operation on the 5001th secondary index record. This cost is definitely better than full table scan + filesort. I regret to tell you that due to defects in MySQL implementation, the above ideal situation will not occur. It will only stupidly perform full table scan + filesort. Let's talk about what is going on. Server layer and storage engine layerAs we all know, MySQL is actually divided into the server layer and the storage engine layer:
The execution of a SQL statement in MySQL requires multiple interactions between the server layer and the storage engine layer to get the final result. For example, consider the following query: SELECT * FROM t WHERE key1 > 'a' AND key1 < 'b' AND common_field != 'a'; The server layer will analyze that the above statement can be executed using the following two solutions:
The server layer will analyze which of the above two solutions has a lower cost, and then select the solution with lower cost as the execution plan. Then the interface provided by the storage engine is called to actually execute the query. Here we assume that solution 2 is adopted, that is, to use the secondary index idx_key1 to execute the above query. Then the conversation between the server layer and the storage engine layer can be as follows: Server layer: "Hey, please check the first record in the ('a', 'b') interval of the idx_key1 secondary index, and return the complete record to me after returning the table." InnoDB responds: “Got it. I’ll check it right away.” Then, InnoDB quickly locates the first secondary index record in the scan interval ('a', 'b') through the B+ tree corresponding to the idx_key1 secondary index, and then returns the complete clustered index record to the server layer. After receiving the complete clustered index record, the server layer continues to determine whether the common_field!='a' condition is met. If not, the record is discarded, otherwise the record is sent to the client. Then say to the storage engine: "Please give me the next record"
InnoDB: “Got it, I’ll check it right away.” InnoDB finds the next secondary index record in the ('a', 'b') interval of idx_key1 based on the next_record attribute of the record, then performs a table return operation and returns the obtained complete clustered index record to the server layer.
After receiving the complete clustered index record, the server layer continues to determine whether the common_field!='a' condition is met. If not, the record is discarded, otherwise the record is sent to the client. Then say to the storage engine: "Please give me the next record" ... and then repeat the above process over and over again. until: That is, until InnoDB finds that the next secondary index record obtained according to the next_record of the secondary index record is not in the interval ('a', 'b'), it tells the server layer: "Okay, there is no next record in the interval ('a', 'b')" When the server layer receives the message from InnoDB that there is no next record, it ends the query. Now everyone knows the basic interaction process between the server layer and the storage engine layer. What the hell is LIMIT?It may be a bit surprising to everyone to say that MySQL will only process the content in the LIMIT clause when the server layer is ready to send records to the client. Take the following statement as an example: SELECT * FROM t ORDER BY key1 LIMIT 5000, 1; If you execute the above query using idx_key1, MySQL will handle it like this:
From the above process, we can see that since MySQL will not determine whether the LIMIT clause meets the requirements until the record is actually sent to the client, if the secondary index is used to execute the above query, it means that 5001 table return operations are required. When analyzing the execution plan, the server layer will feel that the cost of executing so many table returns is too high, and it is not as fast as a direct full table scan + filesort, so it chooses the latter to execute the query. what to do?Due to the limitations of MySQL's implementation of the LIMIT clause, is it not possible to speed up queries by using secondary indexes when processing statements such as LIMIT 5000, 1? In fact, it is not. Just rewrite the above statement into: SELECT * FROM t, (SELECT id FROM t ORDER BY key1 LIMIT 5000, 1) AS d WHERE t.id = d.id; In this way, SELECT id FROM t ORDER BY key1 LIMIT 5000, 1 exists as a separate subquery. Because the query list of the subquery has only one id column, MySQL can execute the subquery by scanning only the secondary index idx_key1, and then search table t based on the primary key value obtained in the subquery. This saves the need to return to the table for the first 5,000 records, greatly improving query efficiency! TucaoWhen will the guys who designed MySQL fix this super stupid implementation of the LIMIT clause? Users have to manually try to deceive the optimizer to improve query efficiency~ This is the end of this article about the LIMIT statement in MySQL. For more information about the LIMIT statement in MySQL, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: A brief discussion on read-only and disabled attributes in forms
>>: Example of implementing translation effect (transfrom: translate) with CSS3
Flex Basic Concepts Flex layout (flex is the abbr...
mysql-5.7.9 finally provides shutdown syntax: Pre...
<br />Question: Why is it not recommended to...
Several typical values of innodb_flush_method f...
Achieve results Code html <div class="css...
Table of contents 1. What is JSONP 2. JSONP cross...
Today, the company project needs to configure doc...
Mysql commonly used display commands 1. Display t...
When making a table page, sometimes the width set ...
Table of contents Why do we need Docker? Docker d...
Table of contents 1. JavaScript issues 2. Advanta...
Preface Dockerfile is a script interpreted by the...
Table of contents 1. Array.at() 2. Array.copyWith...
Dimensionality reduction of two-dimensional array...
Usage Environment In cmd mode, enter mysql --vers...