Examples of optimization techniques for slow query efficiency in MySQL IN statements

Examples of optimization techniques for slow query efficiency in MySQL IN statements

The table structure is as follows. There are only 690 articles.

Article table article(id,title,content)
Tag table tag(tid,tag_name)
Tag article intermediate table article_tag (id, tag_id, article_id)

There is a tag whose tid is 135. Search for the article list with tag tid 135.

690 articles, using the following query, very slow:

select id,title from article where id in(
select article_id from article_tag where tag_id=135
)

This one is very fast:

select article_id from article_tag where tag_id=135

The query results are five articles with ids 428, 429, 430, 431, and 432

It is also quick to use the following sql to search for articles:

select id,title from article where id in(
428,429,430,431,432
)

Solution:

select id,title from article where id in(
select article_id from (select article_id from article_tag where tag_id=135) as tbt
)

Other solutions: (examples)

mysql> select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

In order to save space, the output content is omitted, the same below.

67 rows in set (12.00 sec)

Only 67 rows of data were returned, but it took 12 seconds. There may be many such queries in the system at the same time, and the system will definitely not be able to handle it. Use desc to see (note: explain is also OK)

mysql> desc select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| 1 | PRIMARY | abc_number_prop | ALL | NULL | NULL | NULL | NULL | 2679838 | Using where |
| 2 | DEPENDENT SUBQUERY | abc_number_phone | eq_ref | phone,number_id | phone | 70 | const,func | 1 | Using where; Using index |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
2 rows in set (0.00 sec)

It can be seen that more than two million rows will be scanned when executing this query. Is it because no index is created? Let's take a look

mysql>show index from abc_number_phone;
+------------------+------------+-------------+--------------+-----------------+-----------+------------+----------+--------+------+------------+--------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+-------------+--------------+-----------------+-----------+------------+----------+--------+------+------------+--------------+
| abc_number_phone | 0 | PRIMARY | 1 | number_phone_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 1 | phone | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 2 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | number_id | 1 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | created_by | 1 | created_by | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | modified_by | 1 | modified_by | A | 36879 | NULL | NULL | YES | BTREE | | |
+------------------+------------+-------------+--------------+-----------------+-----------+------------+----------+--------+------+------------+--------------+
6 rows in set (0.06 sec)
mysql>show index from abc_number_prop;
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abc_number_prop | 0 | PRIMARY | 1 | number_prop_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | number_id | 1 | number_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | created_by | 1 | created_by | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | modified_by | 1 | modified_by | A | 311268 | NULL | NULL | YES | BTREE | | |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.15 sec)

As can be seen from the above output, these two tables have indexes created on the number_id field.
Check if there is any problem with the subquery itself.

mysql> desc select number_id from abc_number_phone where phone = '82306839';
+----+-------------+------------------+------+---------------+-------+----------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+---------------+-------+----------+-------+------+--------------------------+
| 1 | SIMPLE | abc_number_phone | ref | phone | phone | 66 | const | 6 | Using where; Using index |
+----+-------------+------------------+------+---------------+-------+----------+-------+------+--------------------------+
1 row in set (0.00 sec)

No problem, we only need to scan a few rows of data, and the index will work.

Check it out:

mysql> select number_id from abc_number_phone where phone = '82306839';
+-----------+
| number_id |
+-----------+
| 8585 |
| 10720 |
|148644|
|151307|
|170691|
|221897|
+-----------+
6 rows in set (0.00 sec)

Directly put the data obtained from the subquery into the above query

mysql> select * from abc_number_prop where number_id in (8585, 10720, 148644, 151307, 170691, 221897);
67 rows in set (0.03 sec)

The speed is also fast. It seems that MySQL is not good enough when handling subqueries. I tried this on both MySQL 5.1.42 and MySQL 5.5.19 and both had this problem.

After searching the Internet, I found that many people have encountered this problem:

Reference 1: MySQL optimization: using joins instead of subqueries

Reference 2: MYSQL subquery and nested query optimization example analysis

According to the suggestions of these online materials, try using join instead.
Before modification:

select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

After modification:

select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
mysql> select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
67 rows in set (0.00 sec)

The effect is good, and the query time is almost 0. Let’s see how MySQL executes this query.

mysql>desc select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
+----+-------------+-------+------+-----------------+-----------+---------+----------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------+-----------+---------+----------------+------+--------------------------+
| 1 | SIMPLE | b | ref | phone,number_id | phone | 66 | const | 6 | Using where; Using index |
| 1 | SIMPLE | a | ref | number_id | number_id | 4 | eap.b.number_id | 3 | |
+----+-------------+-------+------+-----------------+-----------+---------+----------------+------+--------------------------+
2 rows in set (0.00 sec)

Summary: When the subquery is slow, you can use JOIN to rewrite the query for optimization.

There are also articles online saying that queries using JOIN statements are not necessarily always faster than those using subqueries.

The MySQL manual also mentions this, and the specific original text is in this chapter of the MySQL documentation:
I.3. Restrictions on Subqueries
13.2.8. Subquery Syntax

excerpt:

1) About subqueries using IN:

Subquery optimization for IN is not as effective as for the = operator or for IN(value_list) constructs.

A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.

The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery:

SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);

The optimizer rewrites the statement to a correlated subquery:

SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);

If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.

An implication is that an IN subquery can be much slower than a query written using an IN(value_list) construct that lists the same values ​​that the subquery would return.

2) About converting subqueries into joins:

The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.

An exception occurs for the case where an IN subquery can be rewritten as a SELECT DISTINCT join. Example:

SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE condition);

That statement can be rewritten as follows:

SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col AND condition;

But in this case, the join requires an extra DISTINCT operation and is not more efficient than the subquery

Summarize

The above is all the content of this article about the optimization techniques for the slow efficiency of MySQL in statement subquery. Friends who are interested can refer to: A brief discussion on the efficiency of MySQL subquery union and in, an introduction to enterprise production MySQL optimization, etc. If you have any questions, you can leave a message. Everyone is welcome to communicate and refer to it.

I hope this article is helpful to you.

You may also be interested in:
  • Subquery examples in MySQL
  • Detailed explanation of MySQL subqueries (nested queries), join tables, and combined queries
  • MySQL optimization: use join instead of subquery
  • Basic learning tutorial of table subquery and correlated subquery in MySQL
  • MySQL Notes: Introduction to Subquery Usage
  • Introduction to several common forms of MySQL subqueries
  • How to implement subquery in MySQL nested query

<<:  Example of using supervisor to manage nginx+tomcat containers

>>:  Reasons and solutions for the failure of React event throttling effect

Recommend

How to implement a binary search tree using JavaScript

One of the most commonly used and discussed data ...

Let's talk in detail about the props attributes of components in Vue

Table of contents Question 1: How are props used ...

Complete steps to use vue-router in vue3

Preface Managing routing is an essential feature ...

Vue integrates Tencent TIM instant messaging

This article mainly introduces how to integrate T...

A brief discussion on VUE uni-app conditional coding and page layout

Table of contents Conditional compilation Page La...

Steps to install RocketMQ instance on Linux

1. Install JDK 1.1 Check whether the current virt...

Basic notes on html and css (must read for front-end)

When I first came into contact with HTML, I alway...

Detailed explanation of HTML area tag

The <area> tag defines an area in an image ...

How to use physics engine joints in CocosCreator

Table of contents mousejoint mouse joint distance...

A brief introduction to mysql mycat middleware

1. What is mycat A completely open source large d...

Bug of Chinese input garbled characters in flex program Firefox

Chinese characters cannot be input in lower versio...

js implements mouse in and out card switching content

This article shares the specific code of js to re...

Detailed steps for using jib for docker deployment in Spring Cloud

Introduction to Jib Jib is a library developed by...

Detailed explanation of how to use Vue to load weather components

This article shares with you how to use Vue to lo...