MySQL group query optimization method

MySQL handles GROUP BY and DISTINCT queries similarly in most cases; in fact, it sometimes switches between the two during optimization. Both types of queries can benefit from indexing, and this is usually the most important way to optimize them.

When an index cannot be used, MySQL has two strategies for a GROUP BY query: grouping through a temporary table or grouping through a filesort. Either one can be the more efficient choice for a given query. You can add the SQL_BIG_RESULT or SQL_SMALL_RESULT hint to the query to steer the optimizer toward one of them.
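As a minimal sketch of how these hints are written, the queries below place the hint directly after SELECT. They assume the sakila sample database and group by film.rating, a column assumed here to have no index. Whether a hint actually changes the plan depends on the server version and the data, so compare the EXPLAIN output with and without it.

```sql
-- Hint that the result set is large: the optimizer prefers sorting
-- over a temporary table keyed on the GROUP BY columns.
SELECT SQL_BIG_RESULT rating, COUNT(*) AS cnt
FROM sakila.film
GROUP BY rating;

-- Hint that the result set is small: the optimizer prefers a fast
-- in-memory temporary table over sorting.
SELECT SQL_SMALL_RESULT rating, COUNT(*) AS cnt
FROM sakila.film
GROUP BY rating;
```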

When the grouping column comes from a lookup table in a join, it is usually more efficient to group by the lookup table's id than by its values. For example, the following query is not as efficient as it could be:

SELECT actor.first_name, actor.last_name, COUNT(*)
FROM sakila.film_actor
INNER JOIN sakila.actor USING(actor_id)
GROUP BY actor.first_name, actor.last_name;

The following query is more efficient:

SELECT actor.first_name, actor.last_name, COUNT(*)
FROM sakila.film_actor
INNER JOIN sakila.actor USING(actor_id)
GROUP BY film_actor.actor_id;

Grouping by actor.actor_id may be even more efficient than grouping by film_actor.actor_id; benchmark both on your own data to find out.

This query relies on the fact that the actor's first and last name depend on actor_id, so it returns the same results as the first query. That is not true in general: selecting non-grouped columns does not always give the same result, and the server's SQL_MODE may even be configured to reject such queries. If you know the values within each group are identical because they depend on the grouping column, or if you do not care which value you get, you can wrap the columns in MIN() or MAX() to work around this:

SELECT MIN(actor.first_name), MAX(actor.last_name), ...;

Purists will say that you are grouping by the wrong thing, and they are right. A spurious MIN() or MAX() is a sign that the query is not structured correctly. However, sometimes you just want MySQL to execute the query as fast as possible. A purist would be satisfied with the following query:

SELECT actor.first_name, actor.last_name, c.cnt
FROM sakila.actor
INNER JOIN (
    SELECT actor_id, COUNT(*) AS cnt
    FROM sakila.film_actor
    GROUP BY actor_id
) AS c USING(actor_id);

However, creating and populating the temporary table that the subquery requires may cost more than bending relational theory a little. Keep in mind that the temporary table built for the subquery has no index unless the optimizer adds one (MySQL 5.6 and later can do this for some derived tables), so joining against it can be slow.

Generally, selecting non-grouped columns in a grouped query is a bad idea because the results are nondeterministic: they can change when an index changes or when the optimizer picks a different strategy. We recommend setting the server's SQL_MODE to include ONLY_FULL_GROUP_BY, so that a badly written GROUP BY query produces an error instead of running. With ONLY_FULL_GROUP_BY enabled, the SELECT list may contain only columns named in GROUP BY and aggregate expressions (MySQL 5.7.5 and later also accept columns that are functionally dependent on the grouped columns). If you need other columns, first compute the groups in a subquery or a separate query, then look up the remaining columns in a second step.
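The effect of ONLY_FULL_GROUP_BY can be tried out in a single session before changing the server configuration. The sketch below assumes the sakila sample database. Note that the SET statement as written replaces every other mode for the session, which keeps the example short; on a real server you would normally add the mode to the existing list.

```sql
-- Inspect the current mode, then enable ONLY_FULL_GROUP_BY for this
-- session only. This assignment replaces all other session modes.
SELECT @@SESSION.sql_mode;
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- Rejected with error 1055: film_id is neither grouped nor aggregated,
-- and it is not determined by actor_id.
SELECT actor_id, film_id, COUNT(*) AS cnt
FROM sakila.film_actor
GROUP BY actor_id;

-- Accepted: every selected column is grouped or aggregated.
SELECT actor_id, MIN(film_id) AS first_film_id, COUNT(*) AS cnt
FROM sakila.film_actor
GROUP BY actor_id;
```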

Unless ORDER BY specifies otherwise, MySQL automatically sorts grouped results by the columns listed in GROUP BY. If you do not care about the order and this sort causes a filesort, add ORDER BY NULL to skip it. You can also put ASC or DESC after a GROUP BY column to sort the results in that direction. Both behaviors apply to MySQL 5.7 and earlier: MySQL 8.0 no longer sorts GROUP BY results implicitly, and 8.0.13 removed the ASC and DESC modifiers from GROUP BY.
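A minimal sketch of suppressing the implicit sort, assuming the sakila sample database and a server that sorts GROUP BY results implicitly (MySQL 5.7 or earlier). The film.rating column is assumed to have no index, so the grouping cannot be resolved by an index scan. Compare the Extra column of the two plans to see whether the filesort disappears.

```sql
-- Plan with the implicit sort on the grouping column.
EXPLAIN
SELECT rating, COUNT(*) AS cnt
FROM sakila.film
GROUP BY rating;

-- Same query, telling MySQL that the order of the groups is irrelevant.
EXPLAIN
SELECT rating, COUNT(*) AS cnt
FROM sakila.film
GROUP BY rating
ORDER BY NULL;
```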

You can ask MySQL to add super-aggregate (subtotal and total) rows to a grouped result by appending WITH ROLLUP to the GROUP BY clause, but this may not be optimized as well as you need. Use EXPLAIN to check whether the grouping is done through a filesort or a temporary table, then remove WITH ROLLUP and check again to see whether the grouping method stays the same. Comparing the two plans may point to an optimization.
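The comparison described above could look like the following sketch, which again assumes the sakila sample database. In the WITH ROLLUP result, the extra row with rating set to NULL carries the grand total.

```sql
-- Plan with super-aggregation.
EXPLAIN
SELECT rating, COUNT(*) AS cnt
FROM sakila.film
GROUP BY rating WITH ROLLUP;

-- Plan for the same grouping without the super-aggregate row.
EXPLAIN
SELECT rating, COUNT(*) AS cnt
FROM sakila.film
GROUP BY rating;
```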

Sometimes it is more efficient to do the super-aggregation in the application, even though this means fetching more rows from the server. You can also nest a subquery in the FROM clause to hold intermediate results, and then use UNION to obtain the final result.
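One possible way to produce the total row without WITH ROLLUP is sketched below, assuming the sakila sample database. It is a simplified variant of the UNION approach: it runs a second aggregate over the base table rather than reusing an intermediate result, so whether it beats WITH ROLLUP has to be measured on your own data.

```sql
-- Per-group counts.
SELECT rating, COUNT(*) AS cnt
FROM sakila.film
GROUP BY rating
UNION ALL
-- Grand total, reported with NULL in the grouping column
-- just as WITH ROLLUP would report it.
SELECT NULL, COUNT(*)
FROM sakila.film;
```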

In most cases, though, the best approach is to drop WITH ROLLUP from the query and compute the totals in your application.

Conclusion: Group by indexed columns whenever possible. If the order of the results does not matter, use ORDER BY NULL to avoid the sort. If you cannot group by indexed columns, consider alternatives such as a subquery, and if you use WITH ROLLUP, check its performance with EXPLAIN before optimizing. Finally, enable ONLY_FULL_GROUP_BY to prevent nondeterministic results from grouped queries.
