A standard Group by statement contains sorting, grouping, and aggregation functions, such as select a,count(*) from t group by a; This statement uses a for sorting by default. If column a has no index, a temporary table will be created to count a and count(*), and then sorted by a through sort_buffer. Standard execution processstructure: create table t1(id int primary key, a int, b int, index(a)); delimiter;; create procedure idata() begin declare i int; set i=1; while(i<=1000)do insert into t1 values(i, i, i); set i=i+1; end while; end;; delimiter ; call idata(); The function is to insert 1000 statements into t1, from (1,1,1) to (1000,1000,1000). Execute select id%10 as m, count(*) as c from t1 group by m; Analysis: Using index means that this statement uses a covering index and selects index a, so there is no need to return to the table. process: 1. Create a temporary memory table with two fields m and c. The primary key is m. Step 2 If it is found that the total field length stored in the temporary table in memory reaches the size set by the parameter tmp_table_size, the temporary table in memory will be upgraded to a temporary table on disk, and then the traversal calculation will be restarted. The final sorting is the operation in the dashed box in the figure below. If the size of sort_buffer is not large enough, a temporary table will be used to assist in sorting. optimizationThe overall process of the unoptimized (that is, the grouping columns have no index) group by can be summarized as follows: because the data is unordered, a temporary table needs to be created, and then each data is judged one by one to which group it belongs, and finally sorted according to the grouping columns. Therefore, there are two approaches to optimization: Remove sortingIf the returned data does not need to be sorted, you can prohibit sorting, that is, change the above statement to select a,count(*) from t group by a order by null. Sort byIf the records are sorted by the sort field, the data becomes the following structure: In this way, when actually obtaining the fields to be returned or calculating the aggregate function, you only need to access them in sequence. When the column value becomes the next one, you know that the current group access is over and the previously counted data is returned directly. This avoids the creation of temporary tables, and sorting does not require additional sorting using sort_buffer. This greatly improves the efficiency of execution. accomplish1. If the grouping field is suitable for creating an index, create an index for the grouping field directly. MySQL version 5.7 supports the generated column mechanism, which is used to implement the associated update of column data. You can create a column z as follows, and then create an index on the z column (if it is MySQL 5.6 and earlier, you can also create a normal column and index to solve this problem) alter table t1 add column z int generated always as(id % 100), add index(z); Then parse: No temporary table or additional sorting is used, so performance is improved. 2. If the grouping field is not suitable (the usage rate is very low), you can use SQL_BIG_RESULT to try to optimize. By adding the SQL_BIG_RESULT hint to the group by statement, you can tell the optimizer that this statement involves a large amount of data and that you should use a temporary disk table directly. The MySQL optimizer sees that the disk temporary table is stored in a B+ tree, and its storage efficiency is not as high as that of an array. Therefore, since SQL_BIG_RESULT is used to indicate a large amount of data, considering the disk space, it is better to store it directly in an array. So after using SQL_BIG_RESULT the optimizer will use a disk temporary table of array structure. However, if the conditions for using a disk temporary table are not met, the disk temporary table will not be used. That is, when the sort_buffer space can store the total length of the fields to be returned and sorted, the array structure sort_buffer will be used. If the total field exceeds the sort_buffer size, then an array structure disk temporary table will be added to assist in sorting. If there is enough space in sort_buffer, the data will be sorted inside sort_buffer, which will serve as an index. Still using the above example, using SQL_BIG_RESULT alter table t1 add column z int generated always as(id % 100), add index(z); The specific process is as follows: 1. Initialize sort_buffer and put an integer field into it, denoted as m; Analysis: It can be seen that the temporary table is not used at this time, but sort_buffer is used directly for sorting, which saves the performance consumption caused by using the temporary table. Summarize 1. If there is no requirement to sort the results of the group by statement, add order by null after the statement; then generally there is no need to use a temporary table (the above two optimizations are both proposed under the premise of requiring sorting) The above is the detailed explanation of MySQL Group by optimization. For more information about MySQL Group by optimization, please pay attention to other related articles on 123WORDPRESS.COM! You may also be interested in:
|
<<: Detailed explanation of 5 solutions for CSS intermediate adaptive layout
>>: Solution to the failure of docker windows10 shared directory mounting
Preface Starting from React 16, the concept of Er...
Today, let's introduce several common text pr...
Whether it is Samba service or NFS service, the m...
Table of contents 1. Parent components and child ...
This article shares the specific code of Vue to i...
1. Table structure TABLE person id name 1 you 2 Y...
The specific method of installing CentOS 7.0 on V...
This article mainly explains how to install the M...
Preface The source code is only more than 100 lin...
Recently, when using select query in a project, I...
Table of contents 1. Concepts related to stored p...
Table of contents Introduction Download and insta...
1. Nginx status monitoring Nginx provides a built...
In Linux C/C++, thread-level operations are usual...
The HTML code for intercepting text beyond multipl...