Optimizing GROUP BY queries in MySQL

MySQL handles GROUP BY and DISTINCT queries similarly in most cases; in fact, it sometimes switches between the two during optimization. Both types of queries can benefit from indexing, and this is usually the most important way to optimize them.

When an index cannot be used, MySQL has two strategies for GROUP BY queries: grouping with a temporary table or grouping with a filesort. Either one can be the more efficient choice for a given query. You can use the SQL_BIG_RESULT and SQL_SMALL_RESULT hints to push the optimizer toward one of these methods.
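As a minimal sketch of the two hints, the queries below group sakila.actor by first_name. This assumes the stock Sakila schema, where first_name has no index, so the grouping cannot be satisfied from an index. The hints only express a preference; compare the EXPLAIN output and timings on your own data before relying on either one.

```sql
-- Hint that the result set is large: MySQL prefers sorting over
-- a keyed temporary table for the grouping.
SELECT SQL_BIG_RESULT first_name, COUNT(*) AS cnt
FROM sakila.actor
GROUP BY first_name;

-- Hint that the result set is small: MySQL prefers an in-memory
-- temporary table instead of sorting.
SELECT SQL_SMALL_RESULT first_name, COUNT(*) AS cnt
FROM sakila.actor
GROUP BY first_name;
```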

When you group a join by a value that comes from a lookup table, it is usually more efficient to group by the lookup table's identifier than by the value itself. For example, the following query is inefficient:

SELECT actor.first_name, actor.last_name, COUNT(*)
FROM sakila.film_actor
INNER JOIN sakila.actor USING(actor_id)
GROUP BY actor.first_name, actor.last_name;

The following query is more efficient:

SELECT actor.first_name, actor.last_name, COUNT(*)
FROM sakila.film_actor
INNER JOIN sakila.actor USING(actor_id)
GROUP BY film_actor.actor_id;

Grouping by actor.actor_id may be even more efficient than grouping by film_actor.actor_id; benchmark both on your own data to find out.

This query takes advantage of the fact that an actor's first and last names depend on actor_id, so it returns the same results. You cannot always select nongrouped columns and get the same result, however, and the server's SQL_MODE may even be configured to disallow it. When you know the values within each group are identical because they depend on the grouped column, or when you do not care which value you get, you can wrap the columns in MIN() or MAX() to work around this:

SELECT MIN(actor.first_name), MAX(actor.last_name), ...;
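Written out in full, the workaround might look like the sketch below. It assumes the rest of the statement is the same join and GROUP BY as the earlier film_actor query; the column aliases are added only for readability.

```sql
-- MIN() and MAX() are placeholders here: each actor_id group contains
-- exactly one first_name and one last_name, so they return that value.
SELECT MIN(actor.first_name) AS first_name,
       MAX(actor.last_name) AS last_name,
       COUNT(*) AS cnt
FROM sakila.film_actor
    INNER JOIN sakila.actor USING(actor_id)
GROUP BY film_actor.actor_id;
```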

Purists will say that you are grouping by the wrong thing, and they are right. A spurious MIN() or MAX() is a sign that the query is not structured correctly. However, sometimes all you want is for MySQL to execute the query as fast as possible. A purist would be satisfied with the following query:

SELECT actor.first_name, actor.last_name, c.cnt
FROM sakila.actor
    INNER JOIN (
        SELECT actor_id, COUNT(*) AS cnt
        FROM sakila.film_actor
        GROUP BY actor_id
    ) AS c USING(actor_id);

However, creating and populating the temporary table that the subquery requires may cost more than the rewrite saves. Remember that in older MySQL versions the temporary table built for a subquery has no indexes, which can slow down the join; MySQL 5.6 and later can add an index to a materialized derived table automatically.
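One way to check what the subquery costs on your server is to run EXPLAIN on the statement. The sketch below simply prefixes the query above with EXPLAIN; the exact plan depends on the MySQL version and the data, so treat the output as something to inspect rather than a fixed result.

```sql
-- Look for a row with select_type DERIVED (the materialized subquery)
-- and check whether the row for the derived table lists a key for the join.
EXPLAIN
SELECT actor.first_name, actor.last_name, c.cnt
FROM sakila.actor
    INNER JOIN (
        SELECT actor_id, COUNT(*) AS cnt
        FROM sakila.film_actor
        GROUP BY actor_id
    ) AS c USING(actor_id);
```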

Generally, selecting nongrouped columns in a grouped query is a bad idea, because the results are nondeterministic: they can change whenever an index changes or the optimizer picks a different strategy. We recommend setting the server's SQL_MODE to include ONLY_FULL_GROUP_BY, so that a badly written GROUP BY query produces an error instead of running. With ONLY_FULL_GROUP_BY enabled, every column in the SELECT list must appear in the GROUP BY clause, be wrapped in an aggregate function, or (in MySQL 5.7.5 and later) be functionally dependent on the grouped columns. If a query is rejected, you can split it into steps or use a subquery that first computes the groups and then joins back to fetch the remaining columns.
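The following sketch shows the effect of ONLY_FULL_GROUP_BY on the sakila.film_actor table. It changes the mode for the current session only; the alias names are made up for illustration.

```sql
-- Add ONLY_FULL_GROUP_BY to whatever modes the session already has.
-- NULLIF/CONCAT_WS avoid a leading comma when sql_mode is empty.
SET SESSION sql_mode =
    CONCAT_WS(',', NULLIF(@@SESSION.sql_mode, ''), 'ONLY_FULL_GROUP_BY');

-- Rejected with an error: film_id is neither grouped nor aggregated,
-- and it is not functionally dependent on actor_id.
SELECT actor_id, film_id, COUNT(*) AS cnt
FROM sakila.film_actor
GROUP BY actor_id;

-- Accepted: every selected column is either grouped or aggregated.
SELECT actor_id, MIN(film_id) AS lowest_film_id, COUNT(*) AS cnt
FROM sakila.film_actor
GROUP BY actor_id;
```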

In MySQL 5.7 and earlier, the server automatically sorts grouped results by the columns in the GROUP BY clause unless an ORDER BY clause specifies a different order. If you do not care about the order and see that this causes a filesort, you can add ORDER BY NULL to skip the automatic sort. In those versions you can also add ASC or DESC directly after the GROUP BY columns to sort in a given direction. MySQL 8.0 removed the implicit sort, and 8.0.13 removed the GROUP BY ASC/DESC syntax, so use an explicit ORDER BY there.
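A minimal sketch of ORDER BY NULL, again grouping sakila.actor by first_name (assumed to be unindexed, as in the stock Sakila schema). On servers that sort grouped results implicitly, comparing the Extra column of the two plans should show whether the filesort disappears; on servers without the implicit sort the plans are expected to be the same.

```sql
-- Plan with the server's default ordering behavior.
EXPLAIN
SELECT first_name, COUNT(*) AS cnt
FROM sakila.actor
GROUP BY first_name;

-- Plan when the caller states that no ordering is needed.
EXPLAIN
SELECT first_name, COUNT(*) AS cnt
FROM sakila.actor
GROUP BY first_name
ORDER BY NULL;
```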

Sometimes you can ask MySQL to compute super-aggregates over a grouped result by adding WITH ROLLUP after the GROUP BY clause, but the result may not be as well optimized as you need. Use EXPLAIN to check the execution plan, noting whether the grouping is done with a filesort or a temporary table, then remove WITH ROLLUP and compare the plan of the otherwise identical query. The comparison may point to a way to optimize.
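The comparison can be sketched as follows, using a per-actor film count on sakila.film_actor as a stand-in for your own query. WITH ROLLUP adds one extra row, with NULL in actor_id, that holds the grand total.

```sql
-- Plan with the super-aggregate row.
EXPLAIN
SELECT actor_id, COUNT(*) AS cnt
FROM sakila.film_actor
GROUP BY actor_id WITH ROLLUP;

-- Plan for the same grouping without WITH ROLLUP.
EXPLAIN
SELECT actor_id, COUNT(*) AS cnt
FROM sakila.film_actor
GROUP BY actor_id;
```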

Sometimes it is more efficient to compute the super-aggregates in your application, even though this means fetching more rows from the server. You can also nest a subquery in the FROM clause, or use a temporary table, to hold the intermediate results, and then use UNION to append the totals.
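As a rough illustration of the UNION approach, the query below produces the same rows as a WITH ROLLUP over actor_id without using the clause. It is a simplified sketch that scans sakila.film_actor twice rather than reusing stored intermediate results, so measure it against the WITH ROLLUP version before adopting it.

```sql
-- Per-actor counts.
SELECT actor_id, COUNT(*) AS cnt
FROM sakila.film_actor
GROUP BY actor_id
UNION ALL
-- Grand total, marked with NULL in actor_id like a ROLLUP row.
SELECT NULL, COUNT(*)
FROM sakila.film_actor;
```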

That said, the best option is usually to take WITH ROLLUP out of the query altogether and compute the totals in your application code.

Conclusion: When using GROUP BY, it is best to group by indexed columns. If you do not need a particular order, you can add ORDER BY NULL to avoid an unnecessary sort. If you cannot group by indexed columns, consider alternatives such as a subquery, and if you use WITH ROLLUP, check its performance with EXPLAIN before deciding how to optimize. To catch badly written grouped queries early, it is also best to enable ONLY_FULL_GROUP_BY.
