When a table contains tens of thousands of records or more, querying all the results at once becomes very slow, and it only gets worse as the data grows. At that point, paging queries are required. There are many methods and optimization points for database paging queries; here are the ones I know.

Preparation

To test the optimizations listed below, an existing table is used:

Table name: orders_history

select count(*) from orders_history;

Returned result: 5709294 (about 5.7 million rows)
The three query times were: 8903 ms, 8323 ms, 8401 ms

General paging query

General paging queries can be implemented with a simple LIMIT clause, declared as follows:

SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset

The LIMIT clause specifies the number of records the SELECT statement returns. Note the following points:

- The first parameter specifies the offset of the first returned row.
- The second parameter specifies the maximum number of rows to return.
- If only one parameter is given, it indicates the maximum number of rows to return.
- To retrieve all rows from a given offset to the end of the result set, use some very large number as the second parameter (MySQL does not accept -1 here).
- The initial row offset is 0, not 1.

Here is an application example:

select * from orders_history where type=8 limit 1000,10;

This statement queries the 10 records after the 1000th record in the orders_history table, that is, records 1001 through 1010. By default, records in the data table are sorted by the primary key (usually id), so the above is equivalent to:

select * from orders_history where type=8 order by id limit 1000,10;

The three query times were: 3040 ms, 3063 ms, 3018 ms

For this query method, the following tests the impact of the number of returned records on query time:

select * from orders_history where type=8 limit 10000,1;
select * from orders_history where type=8 limit 10000,10;
select * from orders_history where type=8 limit 10000,100;
select * from orders_history where type=8 limit 10000,1000;
select * from orders_history where type=8 limit 10000,10000;

The three query times are as follows:

Query 1 record: 3072 ms, 3092 ms, 3002 ms
Query 10 records: 3081 ms, 3077 ms, 3032 ms
Query 100 records: 3118 ms, 3200 ms, 3128 ms
Query 1000 records: 3412 ms, 3468 ms, 3394 ms
Query 10000 records: 3749 ms, 3802 ms, 3696 ms

In addition, I ran more than ten further queries. Judging from the query times, it is safe to conclude that when fewer than 100 records are returned, there is essentially no difference in query time; as the number of returned records grows, the time spent grows with it.

Test for the query offset:

select * from orders_history where type=8 limit 100,100;
select * from orders_history where type=8 limit 1000,100;
select * from orders_history where type=8 limit 10000,100;
select * from orders_history where type=8 limit 100000,100;
select * from orders_history where type=8 limit 1000000,100;

The three query times are as follows:

Offset 100: 25 ms, 24 ms, 24 ms
Offset 1000: 78 ms, 76 ms, 77 ms
Offset 10000: 3092 ms, 3212 ms, 3128 ms
Offset 100000: 3878 ms, 3812 ms, 3798 ms
Offset 1000000: 14608 ms, 14062 ms, 14700 ms

As the offset grows, especially beyond 100,000, the query time increases dramatically. This paging method scans from the first record in the table, so the further back the requested page, the slower the query; and the more data in the table, the slower every query becomes.
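For reference, here is a minimal sketch of how an application typically derives the LIMIT offset from a page number. The page numbering convention and the page size of 100 are assumptions for illustration, not part of the tests above:

-- assuming pages are numbered from 1 and page_size = 100:
--   offset = (page - 1) * page_size
-- page 4 therefore reads rows 301 through 400:
select * from orders_history where type=8 order by id limit 300, 100;

The optimizations below all aim to avoid scanning the full rows for that ever-growing offset.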
Using subquery optimization

This method first locates the id at the offset position, and then queries forward from it. It is suitable when the id is monotonically increasing.

select * from orders_history where type=8 limit 100000,1;

select id from orders_history where type=8 limit 100000,1;

select * from orders_history where type=8 and id>=(select id from orders_history where type=8 limit 100000,1) limit 100;

select * from orders_history where type=8 limit 100000,100;

The query times for the four statements are as follows:

Statement 1: 3674 ms
Statement 2: 1315 ms
Statement 3: 1327 ms
Statement 4: 3710 ms

Notes on the above queries:

- Comparing statement 1 with statement 2: using select id instead of select * makes the query about 3 times faster.
- Comparing statement 2 with statement 3: the difference is only tens of milliseconds.
- Comparing statement 3 with statement 4: thanks to the speed of the select id subquery, statement 3 is about 3 times faster than statement 4.

This method is several times faster than the general paging query.

Use id to limit optimization

This method assumes the table's id is continuously increasing. The id range to query can then be calculated from the page number and page size, and queried with id between ... and ...:

select * from orders_history where type=2 and id between 1000000 and 1000100 limit 100;

Query times: 15 ms, 12 ms, 9 ms

This method greatly optimizes the query speed and can basically complete within tens of milliseconds. Its limitation is that it can only be used when the id range is clearly known; however, an auto-increment id field is usually added when a table is created, which makes this kind of paging query quite convenient.

There is another way to write it:

select * from orders_history where id >= 1000001 limit 100;

Of course, you can also use an IN query. This is often used when several tables are associated, paging by the id set queried from another table:

select * from orders_history where id in (select order_id from trade_2 where goods = 'pen') limit 100;

Note that some MySQL versions do not support using LIMIT inside a subquery of an IN clause.

Optimize using temporary tables

This method no longer belongs to query optimization proper; it is mentioned here in passing. The id-based optimization above requires a continuously increasing id, but in some scenarios, such as history tables or tables with missing data, the id is not continuous. In that case, you can consider building a temporary table that records the paging ids and using those ids for the paging queries (a sketch of this idea appears at the end of this article). This can greatly improve the speed of traditional paging queries, especially when the data volume reaches tens of millions of rows.

Description of the data table id

Generally, when creating tables in a database, a rule is made to add an auto-increment id field to each table to facilitate querying. If the amount of data, as in an order database, is very large, it is usually sharded into separate databases and tables. In that case, it is not recommended to use the database id as the unique business identifier; instead, a distributed, high-concurrency unique ID generator should be used, with another field in the data table storing this unique identifier.

First using a range query to locate the id (or an index), and then using the index to locate the data, can speed up queries several times. That is: select id first, then select *.
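To make that closing "select id first, then select *" pattern concrete, here is a minimal sketch of it written as a single deferred-join query; the 100000,100 page mirrors the earlier tests, and the query shape is an illustration rather than text from the original article:

select h.*
from orders_history h
join (
    -- the cheap part: walk the index and fetch only the 100 ids for this page
    select id from orders_history where type=8 order by id limit 100000, 100
) page on h.id = page.id;

The inner query only has to produce 100 ids; the outer join then fetches the full rows for just those ids.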
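Similarly, here is a minimal sketch of the temporary-table idea described above, for tables whose ids are not continuous. The table tmp_page_ids and its layout are hypothetical, invented for illustration, and a join is used instead of the IN form since, as noted above, some MySQL versions do not allow LIMIT inside an IN subquery:

-- collect the ids once, numbered in page order
-- (rownum is continuous even when id is not):
create temporary table tmp_page_ids (
    rownum int auto_increment primary key,
    id bigint not null
);
insert into tmp_page_ids (id)
select id from orders_history where type=8 order by id;

-- a page then becomes a cheap range scan on rownum,
-- e.g. offset 100000 with page size 100:
select h.*
from orders_history h
join tmp_page_ids t on h.id = t.id
where t.rownum between 100001 and 100100;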