MySQL slow query pitfalls

What are the consequences of a slow query? When I was younger, I assumed it just meant data came back more slowly and the user experience suffered. In fact, it is far more than that. I have been through several production incidents, and one of them was caused by a slow SQL query.

I remember it was a single query. With tens of thousands of rows it still returned within 0.2 seconds, but as the data grew rapidly over a period of time, it began taking 2-3 seconds! No index was hit, so every execution was a full table scan; the Extra column in EXPLAIN showed: Using where; Using temporary; Using filesort. MySQL was forced to sort with a temporary table. Because the query ran at high frequency, the DB thread pool quickly filled up under the concurrency, query requests piled up, the DB server's CPU sat at 100%+ for a long time, and requests timed out in large numbers. Eventually the system crashed. Cue the boss~

By the way, it was 8:30 pm on October 2nd. I was back in my hometown, Zaozhuang, sitting at a food stall shooting the breeze with my buddies. You can guess how embarrassing the scene that followed was.

As you can see, if a team doesn't take slow queries seriously enough, the risk is very high. After that incident, our boss announced: if a similar accident happens again in anyone's code, the developer and the department leader will be fired together. That scared a lot of leaders, and two DBA colleagues were quickly hired 🙂🙂🙂

Slow queries, as the name suggests, are queries that execute slowly. How slow? A query that exceeds the threshold set by the long_query_time parameter (default 10 seconds) is considered slow and needs to be optimized. Slow queries are recorded in the slow query log.

The slow query log is not enabled by default. If you need to optimize SQL statements, you can enable this feature; it lets you easily identify which statements need optimizing (think how terrible a single 10-second SQL statement would be).

Murphy's Law: What can go wrong will go wrong.

It's one of those laws that proves all too true. To head off problems before they occur, let's look at how to handle slow queries. This article is packed with practical material, so grab a glass of water; if you don't have time to read it now, bookmark it first!

1. Slow query configuration

1-1. Enable slow query

MySQL supports enabling the slow query log in two ways:

  • 1. Enter a command to enable it temporarily; it is automatically disabled after the MySQL service restarts;
  • 2. Configure the my.cnf (my.ini on Windows) system file; modifying the configuration file enables the slow query log persistently.

Method 1: Enable the slow query log with commands

Step 1: Query slow_query_log to check whether the slow query log is enabled:

mysql> show variables like '%slow_query_log%';
+---------------------+-----------------------------------+
| Variable_name       | Value                             |
+---------------------+-----------------------------------+
| slow_query_log      | OFF                               |
| slow_query_log_file | /var/lib/mysql/localhost-slow.log |
+---------------------+-----------------------------------+
2 rows in set (0.01 sec)

Step 2: Enable the slow query log:

set global slow_query_log='ON';

Step 3: Set the threshold above which a statement's execution time gets it recorded in the slow query log (long_query_time, unit: seconds; default 10 seconds).

As shown below, I set it to 1 second; SQL statements that take more than 1 second to execute will be recorded in the slow query log.

set global long_query_time=1;
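
Note that SET GLOBAL does not change the value already loaded into your current session. A quick sanity check is to open a new connection and look at the threshold:

show variables like 'long_query_time';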

Step 4: Query the storage location of the slow query log file:

mysql> show variables like '%slow_query_log_file%';
+---------------------+-----------------------------------+
| Variable_name       | Value                             |
+---------------------+-----------------------------------+
| slow_query_log_file | /var/lib/mysql/localhost-slow.log |
+---------------------+-----------------------------------+
1 row in set (0.01 sec)

slow_query_log_file specifies the storage path and file name of the slow query log (by default it sits alongside the data files).

Step 5: Verify that the slow query log is enabled

You need to exit the current MySQL terminal and log in again for the session to see the new settings;

Once the slow query log is configured, it records the following kinds of SQL statements when they meet the conditions:

  • Query Statement
  • Data modification statements
  • SQL that has been rolled back
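
A quick way to verify the whole setup: SLEEP simply burns time, so with the 1-second threshold above, the following statement should show up in the slow query log:

-- Takes about 2 seconds, exceeding the 1-second threshold
SELECT SLEEP(2);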

Method 2: Enable by configuring the my.cnf (my.ini for Windows) system file

(Version: MySQL 5.5 and above)

Add the following configuration under the [mysqld] section of the my.cnf file to enable the slow query log, as shown below:

[mysqld]
# Enable the slow query log
slow_query_log=ON
# Threshold (in seconds) above which a statement is recorded in the slow query log
long_query_time=1
# Optional; defaults to a file in the data directory
# slow_query_log_file=/var/lib/mysql/localhost-slow.log

After the database restarts, the slow query log is enabled persistently. Verify as follows:

mysql> show variables like '%_query_%';
+---------------------+-----------------------------------+
| Variable_name       | Value                             |
+---------------------+-----------------------------------+
| have_query_cache    | YES                               |
| long_query_time     | 1.000000                          |
| slow_query_log      | ON                                |
| slow_query_log_file | /var/lib/mysql/localhost-slow.log |
+---------------------+-----------------------------------+
4 rows in set (0.01 sec)

1-2. Introduction to slow query logs

Here is the anatomy of a slow query log entry for a SQL statement that took more than 1 second to execute (from a test); a sample entry follows the list:

  • Line 1: the time the entry was recorded
  • Line 2: user name, user IP, thread ID
  • Line 3: execution time [unit: seconds], time spent acquiring locks, number of result rows returned, number of rows scanned
  • Line 4: the timestamp at which the SQL was executed
  • Line 5: the SQL statement itself
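
For reference, a representative entry looks like the following; the time, host, and thread ID here are illustrative values rather than output copied from the exact test above:

# Time: 2021-10-02T20:30:00.123456Z
# User@Host: root[root] @ localhost []  Id:    12
# Query_time: 1.185000  Lock_time: 0.000100 Rows_sent: 148  Rows_examined: 1377809
SET timestamp=1633206600;
SELECT * from vio_basic_domain_info where app_name like '%陈哈哈%';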

2. Explain analysis of slow query SQL

Once you have found a statement in the slow query log, use the EXPLAIN keyword to simulate how the optimizer executes the SQL query and analyze it. The test table below is an app information table with 1.37 million rows; let's analyze it as an example.

The SQL example is as follows:

-- 1.185s
SELECT * from vio_basic_domain_info where app_name like '%陈哈哈%';

This is a common fuzzy query statement. The query took 1.185 seconds and returned 148 records.
The EXPLAIN results are shown in the table below: the statement does not use the index on the app_name field, the access type is a full table scan, and the scanned row count is 1,377,809 — the entire table.

mysql> EXPLAIN SELECT * from vio_basic_domain_info where app_name like '%陈哈哈%';
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table                 | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | vio_basic_domain_info | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 1377809 |    11.11 | Using where |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

When this SQL can use the index, it looks as follows. Query time: 0.156s, 141 rows returned:

-- 0.156s
SELECT * from vio_basic_domain_info where app_name like '陈哈哈%';

The EXPLAIN results are as follows. The SQL now uses the idx_app_name index, the access type is a range scan, and 141 rows are scanned. Because not all queried columns are in the index (SELECT *), one extra trip back to the table is needed to fetch the remaining columns.

mysql> EXPLAIN SELECT * from vio_basic_domain_info where app_name like '陈哈哈%';
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
| id | select_type | table                 | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | vio_basic_domain_info | NULL       | range | idx_app_name  | idx_app_name | 515     | NULL |  141 |   100.00 | Using index condition |
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

When this SQL uses a covering index, it looks as follows. Query time: 0.091s, 141 rows returned:

-- 0.091s
SELECT app_name from vio_basic_domain_info where app_name like '陈哈哈%';

The EXPLAIN results are as follows. The same index is used as in the SQL above, but since the queried column is contained in the index columns, the roughly 0.06s spent going back to the table is saved.

mysql> EXPLAIN SELECT app_name from vio_basic_domain_info where app_name like '陈哈哈%';
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table                 | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | vio_basic_domain_info | NULL       | range | idx_app_name  | idx_app_name | 515     | NULL |  141 |   100.00 | Using where; Using index |
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

So how do you analyze SQL from the EXPLAIN output? What does each column attribute represent? Let's take a look.

2-1. Introduction to each column attribute:

  • id: SELECT query sequence number, reflecting the execution priority. If it is a subquery, the id sequence number will increase. The larger the id value, the higher the priority and the earlier it will be executed.
  • select_type: indicates the type of query.
  • table: The table of the output result set. If an alias is set, it will also be displayed.
  • partitions: matching partitions
  • type: access method to the table
  • possible_keys: indicates the possible indexes to be used when querying
  • key: indicates the index actually used
  • key_len: length of the index field
  • ref: the columns or constants compared against the index
  • rows: the number of rows scanned (estimated number of rows)
  • filtered: The percentage of rows filtered by table conditions
  • Extra: Description and explanation of the execution

Of the columns above, type, key, rows, and Extra are the ones we rely on most when optimizing slow queries.

2-2. Commonly used attributes for slow query analysis

1. type:
The table access method indicates how MySQL finds the required rows in the table, also known as the "access type".

The types that exist are: ALL, index, range, ref, eq_ref, const, system, NULL (from left to right, performance from low to high). Here are three that we see every day:

  • ALL: (Full Table Scan) MySQL will traverse the entire table to find matching rows, often referred to as a full table scan
  • index: (Full Index Scan) The difference between index and ALL is that the index type only traverses the index tree
  • range: retrieve only the rows in a given range, using an index to select the rows

2. key

The key column shows the index actually used, usually one of the indexes listed in possible_keys. The MySQL optimizer generally picks the more suitable index by estimating the number of rows to scan; if no index is chosen, NULL is shown. The optimizer does sometimes pick the wrong index, in which case you can force MySQL to use or ignore a particular index by modifying the SQL (see the sketch after this list):

  • Force the use of an index: FORCE INDEX (index_name), USE INDEX (index_name)
  • Force an index to be ignored: IGNORE INDEX (index_name)
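
A quick sketch of both hints, reusing the idx_app_name index from the earlier examples (adapt the table and index names to your own schema); the hint goes immediately after the table name:

-- Force MySQL to use idx_app_name
SELECT * FROM vio_basic_domain_info FORCE INDEX (idx_app_name)
WHERE app_name LIKE '陈哈哈%';

-- Tell MySQL to ignore idx_app_name
SELECT * FROM vio_basic_domain_info IGNORE INDEX (idx_app_name)
WHERE app_name LIKE '陈哈哈%';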

3. rows
rows is the number of rows that MySQL estimates it will need to read (scan) to find the required row, which may not be exact.

4. Extra
This column shows some additional information, which is important.

Using index
The queried columns are covered by the index and the WHERE filter is on the leading column of the index; Extra shows Using index. This means the qualifying rows can be found directly via the index, with no need to go back to the table.

Note: The leading column generally refers to the first column or "first few columns" in a joint index, as well as the case of a single-column index; here I call it the leading column for ease of understanding.

Using where
This means the MySQL server filters rows after the storage engine retrieves them; in other words, the index did not do the filtering and the table was queried.

Possible causes:

  • The queried column is not covered by the index;
  • The WHERE filter condition is not on the leading column of the index, or the index cannot be used correctly;

Using temporary
This means that MySQL uses a temporary table when sorting the query results.

Using filesort
This means MySQL sorts the result with an external sort (filesort) rather than reading rows from the table in index order.

Using index condition
Not all queried columns are in the index, and the WHERE condition is a range on a leading column; the filter on index columns is pushed down to the storage engine (Index Condition Pushdown).

Using where; Using index
The queried columns are covered by the index, but the WHERE filter is on an index column that is not the leading column, or something else (such as a range condition) prevents a direct index seek. Extra shows Using where; Using index: the qualifying rows cannot be located purely by index seek, but since no table access is needed, the impact is usually small.
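
To make these Extra values concrete, here is a sketch against a hypothetical table t with a composite index idx_ab (a, b), where a is a VARCHAR column and c is not indexed; this assumes InnoDB with ICP enabled, and the exact Extra value can vary with the MySQL version and table statistics:

-- Extra: Using index (covered query, filter on the leading column a)
EXPLAIN SELECT a, b FROM t WHERE a = 'x';

-- Extra: Using where; Using index (covered query, but the filter skips the leading column)
EXPLAIN SELECT a, b FROM t WHERE b = 1;

-- Extra: Using index condition (prefix range on a; SELECT * must go back to the table)
EXPLAIN SELECT * FROM t WHERE a LIKE 'abc%';

-- Extra: Using filesort (ORDER BY on a column the index cannot provide order for)
EXPLAIN SELECT * FROM t ORDER BY c;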

3. Sharing some slow query optimization experience

3-1. Optimize LIMIT paging

Operations that require paging in the system are usually implemented using the limit plus offset method, with an appropriate order by clause. If there is a corresponding index, the efficiency is usually good, otherwise MySQL needs to do a lot of file sorting operations.

A really troublesome case is when the offset is very large, for example a query like LIMIT 1000000,10. MySQL must read through 1,000,010 rows and return only the last 10; the first 1,000,000 rows are read and then discarded. This is very expensive and produces slow queries.

One of the simplest ways to optimize such queries is to use index covering scans whenever possible, rather than querying all columns. Then do a join operation as needed and return the required columns. This will greatly improve efficiency when the offset is large.

For the following query:

-- Execution time: 1.379s
SELECT * from vio_basic_domain_info LIMIT 1000000,10;

The EXPLAIN analysis results:

mysql> EXPLAIN SELECT * from vio_basic_domain_info LIMIT 1000000,10;
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table                 | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+---------+----------+-------+
|  1 | SIMPLE      | vio_basic_domain_info | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 1377809 |   100.00 | NULL  |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+---------+----------+-------+
1 row in set, 1 warning (0.00 sec)

The biggest problem with this statement is the offset M in LIMIT M,N: each execution must first locate the first M qualifying rows from the whole table, discard them, and only then return the next N qualifying rows. If the table is very large, the filter field has no suitable index, and M is huge, the cost is enormous.

If instead each new query could start from the position where the previous one left off, we would only need to find 10 qualifying rows and note where the next query should begin; the next request starts directly from that position, and we never again scan and discard the first M rows on every query.
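
A minimal sketch of that idea (keyset pagination), assuming an auto-increment primary key; :last_seen_id is a placeholder the application fills in with the largest id of the previous page:

SELECT * from vio_basic_domain_info
WHERE id > :last_seen_id ORDER BY id LIMIT 10;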

There are generally the following ways to handle slow paging queries:

Idea 1: Construct a covering index

Modify the SQL to use a covering index. For example, if only a few fields such as app_name and createTime are needed, a joint index on the app_name and createTime fields makes the query a covering-index scan with no full table access. This suits scenarios with few queried columns and is not recommended when many columns are needed.
Time taken: 0.390s

mysql> EXPLAIN SELECT app_name,createTime from vio_basic_domain_info LIMIT 1000000,10;
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+---------+----------+-------------+
| id | select_type | table                 | partitions | type  | possible_keys | key          | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | vio_basic_domain_info | NULL       | index | NULL          | idx_app_name | 515     | NULL | 1377809 |   100.00 | Using index |
+----+-------------+-----------------------+------------+-------+---------------+--------------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Idea 2: Optimize offset

If a covering index cannot be used, the key is to find a cheap way to skip the first 1,000,000 rows. We can exploit the ordering of the auto-increment primary key: first look up the id of the 1,000,001st row, then fetch the 10 rows that follow it. This suits tables whose primary key id is auto-incremented.
Time taken: 0.471s

SELECT * from vio_basic_domain_info where 
  id >=(SELECT id from vio_basic_domain_info ORDER BY id limit 1000000,1) limit 10;

Principle: first use the primary key index to find the id of the 1,000,001st row, then fetch the 10 rows from that id onward directly by primary key. The EXPLAIN results below show this SQL's two-step execution.

mysql> EXPLAIN SELECT * from vio_basic_domain_info where id >=(SELECT id from vio_basic_domain_info ORDER BY id limit 1000000,1) limit 10;
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table                 | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | PRIMARY     | vio_basic_domain_info | NULL       | range | PRIMARY       | PRIMARY | 8       | NULL |      10 |   100.00 | Using where |
|  2 | SUBQUERY    | vio_basic_domain_info | NULL       | index | NULL          | PRIMARY | 8       | NULL | 1000001 |   100.00 | Using index |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
2 rows in set, 1 warning (0.40 sec)

Idea 3: Delayed association (deferred join)

Time taken: 0.439s
A deferred join suits tables of larger scale. The SQL is as follows:

SELECT * from vio_basic_domain_info inner join (select id from vio_basic_domain_info order by id limit 1000000,10) as myNew using(id);

Here a covering index plus a deferred join is used: the derived query uses the covering (primary key) index to quickly find just the 10 ids of the target page, and then those 10 ids are joined back to the table through the primary key index to fetch the full rows. (This approach degrades less as the table keeps growing.)

mysql> EXPLAIN SELECT * from vio_basic_domain_info inner join (select id from vio_basic_domain_info order by id limit 1000000,10) as myNew using(id);
+----+-------------+-----------------------+------------+--------+---------------+---------+---------+----------+---------+----------+-------------+
| id | select_type | table                 | partitions | type   | possible_keys | key     | key_len | ref      | rows    | filtered | Extra       |
+----+-------------+-----------------------+------------+--------+---------------+---------+---------+----------+---------+----------+-------------+
|  1 | PRIMARY     | <derived2>            | NULL       | ALL    | NULL          | NULL    | NULL    | NULL     | 1000010 |   100.00 | NULL        |
|  1 | PRIMARY     | vio_basic_domain_info | NULL       | eq_ref | PRIMARY       | PRIMARY | 8       | myNew.id |       1 |   100.00 | NULL        |
|  2 | DERIVED     | vio_basic_domain_info | NULL       | index  | NULL          | PRIMARY | 8       | NULL     | 1000010 |   100.00 | Using index |
+----+-------------+-----------------------+------------+--------+---------------+---------+---------+----------+---------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)

3-2. Check whether an index fails to take effect

Avoid starting a fuzzy query with the wildcard '%'; it causes the database engine to abandon the index and perform a full table scan, as follows:

SELECT * FROM t WHERE username LIKE '%陈哈哈%'

Optimization method: place the wildcard only after the search term, as follows:

SELECT * FROM t WHERE username LIKE '陈哈哈%'

If the requirement really is a leading-wildcard fuzzy search, the options are (see the sketch after this list):

  • Use the MySQL built-in function INSTR(str,substr), similar to indexOf() in Java, to find the position of the substring.
  • Use a FULLTEXT index and query it with MATCH ... AGAINST.
  • For large data volumes, use ElasticSearch or Solr, which can search billions of rows in seconds.
  • When the table is small (thousands of rows), don't get fancy; just use LIKE '%xx%'.
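
A minimal sketch of the first two options, assuming MySQL 5.7+ with InnoDB; the table t, column username, and index name ft_username are illustrative, and the ngram parser is what makes FULLTEXT work for CJK text:

-- Option 1: INSTR still scans the table, but expresses the same containment test
SELECT * FROM t WHERE INSTR(username, '陈哈哈') > 0;

-- Option 2: FULLTEXT index queried with MATCH ... AGAINST
ALTER TABLE t ADD FULLTEXT INDEX ft_username (username) WITH PARSER ngram;
SELECT * FROM t WHERE MATCH(username) AGAINST('陈哈哈' IN NATURAL LANGUAGE MODE);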

That said, fuzzy matching on large fields is a genuine weak spot of MySQL; guaranteeing the ACID properties of transactions costs too much performance for this kind of workload. If a real business scenario needs it, it is advisable to decisively move that data to a search or big-data engine such as ElasticSearch or HBase. No room for sentiment here~

Try to avoid NOT IN, which causes the engine to perform a full table scan; NOT EXISTS is recommended instead, as follows:

-- Does not use the index
SELECT * FROM t WHERE name NOT IN ('Timo','Captain');

-- Uses the index
SELECT * FROM t AS t1 WHERE NOT EXISTS
  (SELECT * FROM t AS t2 WHERE t2.name IN ('Timo','Captain') AND t1.id = t2.id);

Try to avoid OR, which causes the database engine to abandon the index and perform a full table scan, as follows:

SELECT * FROM t WHERE id = 1 OR id = 3

Optimization method: use UNION instead of OR, as follows:

SELECT * FROM t WHERE id = 1
   UNION
SELECT * FROM t WHERE id = 3

Try to avoid NULL-value checks, which cause the database engine to abandon the index and perform a full table scan, as follows:

SELECT * FROM t WHERE score IS NULL

Optimization method: give the field a default value of 0 and test against 0 instead, as follows:

SELECT * FROM t WHERE score = 0

Try to avoid expressions or function operations on the left side of the equals sign in the WHERE condition, which cause the database engine to abandon the index and perform a full table scan.
Move the expression or function to the right side of the equals sign, as follows:

-- Full table scan
SELECT * FROM T WHERE score/10 = 9
-- Uses the index
SELECT * FROM T WHERE score = 10*9

When the amount of data is large, avoid the where 1=1 pattern. It is usually added to make assembling query conditions convenient, but it can cause the database engine to abandon the index and perform a full table scan, as follows:

SELECT username, age, sex FROM T WHERE 1=1

Optimization method: have the code make the judgment while assembling the SQL: if there is no condition, drop the WHERE clause entirely; if there is, start with WHERE and join further conditions with AND. A sketch of the result follows.
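
A sketch of what the assembled SQL should look like, reusing the columns from the example above (the condition values are illustrative):

-- No conditions: omit the WHERE clause entirely
SELECT username, age, sex FROM T;

-- With conditions: WHERE for the first, AND for the rest
SELECT username, age, sex FROM T WHERE age > 18 AND sex = 'M';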

Do not use <> or != in query conditions
When filtering on an indexed column, avoid <> and != conditions. If the business genuinely needs a not-equal test, re-evaluate the index design: avoid indexing this field and rely on other indexed fields in the query conditions.

The WHERE condition contains only non-leading columns of a composite index. For example, a composite (joint) index covers three columns key_part1, key_part2, and key_part3, but the SQL statement omits the leading column key_part1; by the leftmost-prefix matching rule of MySQL joint indexes, the index will not be used.

-- Does not use the index
SELECT col1 FROM table WHERE key_part2=1 AND key_part3=2
-- Uses the index
SELECT col1 FROM table WHERE key_part1=1 AND key_part2=1 AND key_part3=2

Implicit type conversion prevents the index from being used. The following SQL cannot use the index correctly because the indexed column is a varchar while the given value is numeric, which triggers an implicit type conversion.

select col1 from table where col_varchar=123;
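
The fix, under the same assumption of a varchar column, is simply to pass the value as a string so no conversion is needed and the index can be used:

select col1 from table where col_varchar='123';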

Summary

Well, I hope this article has left you with some methods and experience for analyzing MySQL slow queries. Slow queries are an unavoidable topic in MySQL, and there are many ways for a query to be slow. How do you avoid them completely? Young man, I suggest you learn from your mistakes.

What we need to do is discover and resolve slow queries promptly. In fact, many slow queries appear passively: the data volume of some business surges, or fields and existing indexes are changed as requirements evolve. It may not be your fault, but you may still take the blame.
