Reasons and optimization solutions for slow MySQL limit paging with large offsets

In MySQL, we usually use LIMIT to implement pagination, but once a table holds a large amount of data, the deeper you page, the slower the query becomes.

This article discusses why LIMIT pagination with a large offset is slow and how to optimize it. To simulate the situation, we first set up the table structure and the SQL to be executed.

Scenario simulation

Create table statement

The structure of the user table is simple: an id, a sex column, and name columns. To make changes in SQL execution time more obvious, the table has nine name columns (name1 through name9) so each row is fairly wide.

CREATE TABLE `user` (
 `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
 `sex` tinyint(4) NULL DEFAULT NULL COMMENT 'Gender 0-Male 1-Female',
 `name1` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name2` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name3` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name4` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name5` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name6` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name7` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name8` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 `name9` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT 'Name',
 PRIMARY KEY (`id`) USING BTREE,
 INDEX `sex`(`sex`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 9000001 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;

Data filling

A stored procedure fills the table with a total of 9,000,000 records. After the procedure finishes, a separate UPDATE statement modifies the gender column.

Note: this procedure takes a long time to run; on my machine it took 617.284 seconds.

-- Change the statement delimiter so the semicolons inside the body
-- do not terminate the CREATE PROCEDURE statement itself
DELIMITER ;;
CREATE DEFINER=`root`@`localhost` PROCEDURE `data`()
begin
 declare i int;
 set i=1;
 while(i<=9000000)do
  insert into user values(i,0,i,i,i,i,i,i,i,i,i);
  set i=i+1;
 end while;
end;;
DELIMITER ;

CALL data();

-- Set the gender of users with even ids to 1 (female)
update user set sex = 1 where id % 2 = 0;

SQL and execution time

SQL                                                     Execution time
select * from user where sex = 1 limit 100, 10;         0.005000s
select * from user where sex = 1 limit 1000, 10;        0.007000s
select * from user where sex = 1 limit 10000, 10;       0.016000s
select * from user where sex = 1 limit 100000, 10;      0.169000s
select * from user where sex = 1 limit 1000000, 10;     5.892000s
select * from user where sex = 1 limit 10000000, 10;    33.465000s

As you can see, the larger the limit offset, the longer the execution time.

Cause Analysis

First, let's walk through how this SQL statement executes, taking the first row in the table above as an example.

Since sex is an indexed column, MySQL walks the sex index tree and finds the entries with sex = 1.

Then, because a secondary index stores only the primary key id in its leaf nodes while the query asks for all columns, a table lookup (going back to the table) occurs: after an entry with sex = 1 is found in the sex index tree, the id stored in its leaf node is used to fetch the remaining columns of that row from the primary key (clustered) index, and the complete row is added to the result set. The first row has now been queried successfully.

The SQL statement specifies limit 100, 10, which asks for rows 101 to 110. MySQL, however, produces the first 110 matching rows and then discards the first 100, leaving only rows 101 to 110 in the result set before execution ends.

To summarize, in the above execution process, the reasons why the execution time of limit with large offset becomes longer are:

  • Selecting all columns forces a table lookup for every matching row
  • limit a, b reads the first a + b matching rows and then discards the first a of them

Combining these two causes: MySQL spends a large amount of time on table lookups whose results never appear in the result set, so the query gets slower and slower as the offset grows.
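The "read a + b rows, discard a" cost model can be sketched outside MySQL. Below is a minimal, purely illustrative Python simulation (the function name and generator are hypothetical, not anything MySQL exposes); each consumed row stands for one index hit plus one table lookup:

```python
def limit_offset(rows, offset, count):
    """Simulate LIMIT offset, count: the engine must still produce
    every skipped row before it can discard it."""
    page = []
    produced = 0          # rows the engine actually had to fetch
    for row in rows:      # each iteration = one index hit + one table lookup
        produced += 1
        if produced > offset:
            page.append(row)
        if len(page) == count:
            break
    return page, produced

# Rows with sex = 1 are the odd ids 1, 3, 5, ...
matching = (i for i in range(1, 9_000_001, 2))
page, touched = limit_offset(matching, 1_000_000, 10)
print(touched)  # 1000010 rows fetched just to return 10
```

The page contains only 10 rows, yet 1,000,010 rows had to be fetched to produce it, which mirrors why the timings above grow with the offset.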

Optimization plan

Covering Index

Since useless table lookups are the main cause of the slow query, the optimization is to reduce their number. For limit a, b, if we first fetch only the ids of rows a+1 through a+b and then do table lookups just for those rows, the a wasted lookups are avoided and the query is much faster.

This relies on a covering index: when every column a query needs is already present in a secondary index, MySQL can return the data directly from that index without going back to the primary key index, which significantly improves performance.

Based on this idea, the optimization solution is to query the primary key ID first, and then query other column data based on the primary key ID. The optimized SQL and execution time are shown in the following table.

Optimized SQL                                                                                      Execution time
select * from user a join (select id from user where sex = 1 limit 100, 10) b on a.id=b.id;        0.000000s
select * from user a join (select id from user where sex = 1 limit 1000, 10) b on a.id=b.id;       0.000000s
select * from user a join (select id from user where sex = 1 limit 10000, 10) b on a.id=b.id;      0.002000s
select * from user a join (select id from user where sex = 1 limit 100000, 10) b on a.id=b.id;     0.015000s
select * from user a join (select id from user where sex = 1 limit 1000000, 10) b on a.id=b.id;    0.151000s
select * from user a join (select id from user where sex = 1 limit 10000000, 10) b on a.id=b.id;   1.161000s

Sure enough, the execution efficiency has been significantly improved.
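This deferred-join pattern is not MySQL-specific. Here is a small runnable sketch using Python's built-in sqlite3 module as a stand-in for MySQL (a hypothetical 1,000-row user table instead of 9 million; note sqlite3 uses the LIMIT n OFFSET m spelling, and ORDER BY is added to make the page deterministic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, sex INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_sex ON user(sex)")
# Odd ids get sex = 1, matching the article's UPDATE
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?)",
    [(i, 1 if i % 2 else 0, f"name{i}") for i in range(1, 1001)],
)

# Deferred join: page over the ids in the subquery first,
# then look up the full rows only for those ten ids.
rows = conn.execute("""
    SELECT u.id, u.sex, u.name
    FROM user AS u
    JOIN (SELECT id FROM user
          WHERE sex = 1
          ORDER BY id
          LIMIT 10 OFFSET 100) AS t ON u.id = t.id
    ORDER BY u.id
""").fetchall()

print([r[0] for r in rows])  # ids 201, 203, ..., 219
```

The subquery touches only the index on sex plus the id, so the expensive full-row lookups happen for exactly ten rows.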

Conditional filtering

Of course, there is also a more limited approach: conditional filtering based on an ordered column.

For example, on the user table above, to fetch rows 1,000,001 to 1,000,010 with LIMIT pagination, I can write the SQL like this:

select * from user where sex = 1 and id > (select id from user where sex = 1 limit 1000000, 1) limit 10;

However, this optimization method has a precondition: the primary key id must be ordered. Under that condition, another field such as a creation time can be used instead of the primary key id, provided that the field is indexed.

In short, there are many limitations to using conditional filtering to optimize limit. It is generally recommended to use covering indexes for optimization.
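A closely related variant of conditional filtering is keyset (seek) pagination: instead of an offset, the client passes the last id it saw, and the next page starts strictly after it. A sketch, again with sqlite3 standing in for MySQL and hypothetical helper and table names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, sex INTEGER)")
conn.execute("CREATE INDEX idx_sex ON user(sex)")
conn.executemany("INSERT INTO user VALUES (?, ?)",
                 [(i, i % 2) for i in range(1, 1001)])

def next_page(last_id, size=10):
    # Seek past the last seen id instead of counting an offset:
    # the index range scan starts right at the page boundary.
    return [r[0] for r in conn.execute(
        "SELECT id FROM user WHERE sex = 1 AND id > ? ORDER BY id LIMIT ?",
        (last_id, size))]

page1 = next_page(0)          # [1, 3, ..., 19]
page2 = next_page(page1[-1])  # [21, 23, ..., 39]
```

The trade-off matches the limitation described above: pages can only be walked in order, and jumping directly to an arbitrary page number is not possible.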

Summary

This article analyzed why LIMIT pagination with a large offset is slow and presented corresponding optimizations, recommending the covering-index approach to reduce the execution time of large-offset LIMIT pagination.

I hope this helps everyone.

