1. Why limit gets slower the deeper you page

When we use limit for paging, the first few pages come back very quickly; for example, limit 200,25 returns instantly. But the deeper we page, the slower the query gets, and past about a million rows it becomes extremely sluggish. What is the principle behind this? Let's first look at what the query SQL looks like when we page far into the result set:

select * from t_name where c_name1='xxx' order by c_name2 limit 2000000,25;

The slowness of this query is caused by the large offset after limit. limit 2000000,25 means the database scans 2000025 rows, discards the first 2000000 of them, and returns only the remaining 25 rows to the user. This approach is obviously wasteful.

2. Simulating millions of rows of data

1. Create the employee and department tables, and write stored procedures to insert data

/* Department table, drop it if it exists */
drop table if EXISTS dep;
create table dep(
    id int unsigned primary key auto_increment,
    depno mediumint unsigned not null default 0,
    depname varchar(20) not null default "",
    memo varchar(200) not null default ""
);

/* Employee table, drop it if it exists */
drop table if EXISTS emp;
create table emp(
    id int unsigned primary key auto_increment,
    empno mediumint unsigned not null default 0,
    empname varchar(20) not null default "",
    job varchar(9) not null default "",
    mgr mediumint unsigned not null default 0,
    hiredate datetime not null,
    sal decimal(7,2) not null,
    comn decimal(7,2) not null,
    depno mediumint unsigned not null default 0
);

/* Function to generate a random string */
DELIMITER $
drop FUNCTION if EXISTS rand_string;
CREATE FUNCTION rand_string(n INT) RETURNS VARCHAR(255)
BEGIN
    DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    DECLARE return_str VARCHAR(255) DEFAULT '';
    DECLARE i INT DEFAULT 0;
    WHILE i < n DO
        SET return_str = CONCAT(return_str,SUBSTRING(chars_str,FLOOR(1+RAND()*52),1));
        SET i = i+1;
    END WHILE;
    RETURN return_str;
END $
DELIMITER ;

/* Function to generate a random department number */
DELIMITER $
drop FUNCTION if EXISTS rand_num;
CREATE FUNCTION rand_num() RETURNS INT(5)
BEGIN
    DECLARE i INT DEFAULT 0;
    SET i = FLOOR(100+RAND()*10);
    RETURN i;
END $
DELIMITER ;

/* Stored procedure: insert data into the emp table */
DELIMITER $
drop PROCEDURE if EXISTS insert_emp;
CREATE PROCEDURE insert_emp(IN START INT(10), IN max_num INT(10))
BEGIN
    DECLARE i INT DEFAULT 0;
    /* set autocommit = 0: turn off autocommit so the whole batch is committed at once */
    SET autocommit = 0;
    REPEAT
        SET i = i + 1;
        INSERT INTO emp(empno,empname,job,mgr,hiredate,sal,comn,depno) VALUES ((START+i),rand_string(6),'SALESMAN',0001,now(),2000,400,rand_num());
    UNTIL i = max_num END REPEAT;
    COMMIT;
END $
DELIMITER ;

/* Stored procedure: insert data into the dep table */
DELIMITER $
drop PROCEDURE if EXISTS insert_dept;
CREATE PROCEDURE insert_dept(IN START INT(10), IN max_num INT(10))
BEGIN
    DECLARE i INT DEFAULT 0;
    SET autocommit = 0;
    REPEAT
        SET i = i+1;
        INSERT INTO dep(depno,depname,memo) VALUES((START+i),rand_string(10),rand_string(8));
    UNTIL i = max_num END REPEAT;
    COMMIT;
END $
DELIMITER ;

2. Execute the stored procedures

/* Insert 120 department records */
call insert_dept(1,120);
/* Insert 5 million employee records */
call insert_emp(0,5000000);

Inserting 5 million records may take quite a while.
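Before running the comparisons below, it can help to confirm that the test data actually landed and to glance at the execution plan of a deep-offset query. The following sanity check is a small addition of mine, not part of the original article; the exact EXPLAIN output depends on your MySQL version and table statistics.

/* Sanity check (illustrative addition, not from the original article):
   verify the row counts produced by the procedures above, then use EXPLAIN
   to inspect how MySQL plans to execute a deep-offset query. */
select count(*) from dep;   -- expected: 120
select count(*) from emp;   -- expected: 5000000
explain select * from emp order by id limit 4800000,25;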
3. Four query methods

1. Ordinary limit paging

/* Offset 100, take 25 */
SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno order by a.id desc limit 100,25;

/* Offset 4800000, take 25 */
SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno order by a.id desc limit 4800000,25;

Execution results:

[SQL] SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno order by a.id desc limit 100,25;
Affected rows: 0
Time: 0.001s

[SQL] SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno order by a.id desc limit 4800000,25;
Affected rows: 0
Time: 12.275s

The deeper the page, the slower the query.

2. Index coverage + subquery optimization

Because the primary key id is indexed, we can first locate the id at the starting offset in the index tree, and then fetch the row data starting from that id. (A closely related "deferred join" variant is sketched after method 3 below.)

/* The subquery finds the id at offset 100; we then take the 25 rows from that position onward */
SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id >= (select id from emp order by id limit 100,1) order by a.id limit 25;

/* The subquery finds the id at offset 4800000; we then take the 25 rows from that position onward */
SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id >= (select id from emp order by id limit 4800000,1) order by a.id limit 25;

Execution results:

[SQL] SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id >= (select id from emp order by id limit 100,1) order by a.id limit 25;
Affected rows: 0
Time: 0.106s

[SQL] SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id >= (select id from emp order by id limit 4800000,1) order by a.id limit 25;
Affected rows: 0
Time: 1.541s

3. Redefine the starting position

Applicable to tables whose primary key is auto-incremented.

/* Remember that the id of the last row of the previous page was 100, so skip straight past it and read from id 101 */
SELECT a.id,a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id > 100 order by a.id limit 25;

/* Remember that the id of the last row of the previous page was 4800000, so skip straight past it and read from id 4800001 */
SELECT a.id,a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id > 4800000 order by a.id limit 25;

Execution results:

[SQL] SELECT a.id,a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id > 100 order by a.id limit 25;
Affected rows: 0
Time: 0.001s

[SQL] SELECT a.id,a.empno,a.empname,a.job,a.sal,b.depno,b.depname from emp a left join dep b on a.depno = b.depno where a.id > 4800000 order by a.id limit 25;
Affected rows: 0
Time: 0.000s

This is the most efficient method. No matter which page is requested, the time taken is basically the same, because after applying the condition only 25 rows are scanned.
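Before moving on to the last approach, here is the deferred-join variant of method 2 mentioned above: a derived table pages over the primary-key index alone, and the outer query then joins back to fetch the full rows. This sketch is my addition and was not benchmarked in the original article, so verify it with EXPLAIN against your own data.

/* Deferred-join variant of method 2 (illustrative addition, not from the
   original benchmarks): the derived table t pages over the primary key only,
   and the outer query fetches the 25 matching rows by id. */
SELECT a.empno,a.empname,a.job,a.sal,b.depno,b.depname
from emp a
inner join (select id from emp order by id limit 4800000,25) t on a.id = t.id
left join dep b on a.depno = b.depno
order by a.id;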
4. Downgrade strategy (Baidu's approach)

This strategy is the simplest and often the most effective: real-world queries over large data sets usually come with search conditions, and hardly anyone looks at content beyond page 100. So when the page number a user requests is too large, simply return an error. For example, Baidu search only lets you go up to page 76.

The above are the details of the four query optimization methods for paging over millions of rows of MySQL data.