Teach you MySQL query optimization analysis tutorial step by step

Teach you MySQL query optimization analysis tutorial step by step

Preface

MySQL is a relational database with strong query capabilities, high data consistency, high data security, and support for secondary indexes. However, its performance is slightly inferior to that of non-relational databases, especially when the data is above one million, it is easy to have slow queries. At this time, you need to analyze the reasons for the slow query. Generally, it is caused by the programmer's poor SQL writing, the lack of key index, or the invalid index.

At this time, the EXPLAIN command provided by MySQL is particularly important. It can analyze the SELECT statement and output detailed information about the SELECT execution for developers to optimize it in a targeted manner.

And just add Explain before the query statement:

EXPLAIN SELECT * FROM customer WHERE id < 100;

Prepare

First, you need to create two test tables and data:

CREATE TABLE `customer` ( 
 `id` BIGINT(20) unsigned NOT NULL AUTO_INCREMENT,
 `name` VARCHAR(50) NOT NULL DEFAULT '',
 `age` INT(11) unsigned DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `name_index` (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4

INSERT INTO customer (name, age) VALUES ('a', 1);
INSERT INTO customer (name, age) VALUES ('b', 2);
INSERT INTO customer (name, age) VALUES ('c', 3);
INSERT INTO customer (name, age) VALUES ('d', 4);
INSERT INTO customer (name, age) VALUES ('e', 5);
INSERT INTO customer (name, age) VALUES ('f', 6);
INSERT INTO customer (name, age) VALUES ('g', 7);
INSERT INTO customer (name, age) VALUES ('h', 8);
INSERT INTO customer (name, age) VALUES ('i', 9);
CREATE TABLE `orders` (
 `id` BIGINT(20) unsigned NOT NULL AUTO_INCREMENT,
 `user_id` BIGINT(20) unsigned NOT NULL DEFAULT 0, `product_name` VARCHAR(50) NOT NULL DEFAULT '',
 `productor` VARCHAR(30) NOT NULL DEFAULT '',
 PRIMARY KEY (`id`),
 KEY `user_product_detail_index` (`user_id`, `product_name`, `productor`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4

INSERT INTO orders (user_id, product_name, productor) VALUES (1, 'p1', 'WHH');
INSERT INTO orders (user_id, product_name, productor) VALUES (1, 'p2', 'WL');
INSERT INTO orders (user_id, product_name, productor) VALUES (1, 'p1', 'DX');
INSERT INTO orders (user_id, product_name, productor) VALUES (2, 'p1', 'WHH');
INSERT INTO orders (user_id, product_name, productor) VALUES (2, 'p5', 'WL');
INSERT INTO orders (user_id, product_name, productor) VALUES (3, 'p3', 'MA');
INSERT INTO orders (user_id, product_name, productor) VALUES (4, 'p1', 'WHH');
INSERT INTO orders (user_id, product_name, productor) VALUES (6, 'p1', 'WHH');
INSERT INTO orders (user_id, product_name, productor) VALUES (9, 'p8', 'TE');

EXPLAIN Output Format

The output of the EXPLAIN command is roughly as follows:

mysql> explain select * from customer where id = 1\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
 type: const
possible_keys: PRIMARY
 key: PRIMARY
 key_len: 8
 ref: const
 rows: 1
 filtered: 100.00
 Extra: NULL
1 row in set, 1 warning (0.00 sec)

The meaning of each column is as follows:

  • id: The identifier of the SELECT query. Each SELECT is automatically assigned a unique identifier.
  • select_type: The type of SELECT query.
  • table: which table is being queried partitions: matching partitions type: join type
  • possible_keys: possible indexes to use in this query
  • key: The exact index used in this query.
  • ref: which field or constant is used with the key
  • rows: Shows how many rows were scanned by this query. This is an estimate.
  • filtered: indicates the percentage of data filtered by this query condition
  • extra: additional information

Next, let's take a look at some of the more important fields.

select_type

  • SIMPLE - A simple select query that does not contain subqueries or UNIONs
  • PRIMARY - If the query contains any complex subqueries, the outermost query is marked as primary
  • UNION - Indicates that this query is the second or subsequent query of a UNION
  • DEPENDENT UNION - The second or subsequent query in a UNION that depends on the outer query
  • UNION RESULT - select result from UNION table
  • DERIVED - Subqueries included in the from list are marked as derived. MySQL will recursively execute these subqueries and place the results in a temporary table.
  • SUBQUERY - contains a subquery in the select or where list
  • DEPENDENT SUBQUERY - The first SELECT in a subquery depends on the outer query. That is, the subquery depends on the result of the outer query.

The most common query type should be SIMPLE. For example, when our query has no subquery or UNION query, it is usually SIMPLE type, for example:

mysql> explain select * from customer where id = 2\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
 type: const
possible_keys: PRIMARY
 key: PRIMARY
 key_len: 8
 ref: const
 rows: 1
 filtered: 100.00
 Extra: NULL
1 row in set, 1 warning (0.00 sec)

If we use UNION query, then the EXPLAIN output is similar to the following:

mysql> EXPLAIN (SELECT * FROM customer WHERE id IN (1, 2, 3)) 
 -> UNION
 -> (SELECT * FROM customer WHERE id IN (3, 4, 5));
+----+--------------+------------+------------+-------+---------------+--------+---------+------+------+------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+------------+-------+---------------+--------+---------+------+------+------+-----------------+
| 1 | PRIMARY | customer | NULL | range | PRIMARY | PRIMARY | 8 | NULL | 3 | 100.00 | Using where |
| 2 | UNION | customer | NULL | range | PRIMARY | PRIMARY | 8 | NULL | 3 | 100.00 | Using where |
| NULL | UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+------------+-------+---------------+--------+---------+------+------+------+-----------------+
3 rows in set, 1 warning (0.00 sec)

table

Indicates the table or derived table involved in the query

type

The type field is important because it provides an important basis for judging whether the query is efficient. Through the type field, we can judge whether the query is a full table scan or an index scan.

type Common types

Common values ​​for type are:

  • system: There is only one data in the table. This type is a special const type.
  • const: For equal value queries scanning against primary key or unique index, only one row of data is returned at most. Const queries are very fast because they only read once. For example, the following query uses the primary key index, so type is const.
mysql> explain select * from customer where id = 2\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
 type: const
possible_keys: PRIMARY
 key: PRIMARY
 key_len: 8
 ref: const
 rows: 1
 filtered: 100.00
 Extra: NULL
1 row in set, 1 warning (0.00 sec)

eq_ref: This type usually appears in multi-table join queries, indicating that for each result of the previous table, only one row of results in the next table can be matched. And the comparison operation of the query is usually =, which has high query efficiency. For example:

mysql> EXPLAIN SELECT * FROM customer, order_info WHERE customer.id = order_info.user_id\G
*************************** 1. row ***************************
  id: 1
 select_type: SIMPLE
 table: order_info
 partitions: NULL
  type: index
possible_keys: user_product_detail_index
  key: user_product_detail_index
 key_len: 314
  ref: NULL
  rows: 9
 filtered: 100.00
 Extra: Using where; Using index
*************************** 2. row ***************************
  id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
  type: eq_ref
possible_keys: PRIMARY
  key: PRIMARY
 key_len: 8
  ref: test.order_info.user_id
  rows: 1
 filtered: 100.00
 Extra: NULL
2 rows in set, 1 warning (0.00 sec)

ref: This type usually appears in multi-table join queries, for non-unique or non-primary key indexes, or queries using the leftmost prefix rule index.

For example, in the following example, the ref type query is used:

mysql> EXPLAIN SELECT * FROM customer, order_info WHERE customer.id = order_info.user_id AND order_info.user_id = 5\G
*************************** 1. row ***************************
  id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
  type: const
possible_keys: PRIMARY
  key: PRIMARY
 key_len: 8
  ref: const
  rows: 1
 filtered: 100.00
 Extra: NULL
*************************** 2. row ***************************
  id: 1
 select_type: SIMPLE
 table: order_info
 partitions: NULL
  type: ref
possible_keys: user_product_detail_index
  key: user_product_detail_index
 key_len: 9
  ref: const
  rows: 1
 filtered: 100.00
 Extra: Using index
2 rows in set, 1 warning (0.01 sec)

range: Indicates using index range query to obtain partial data records in the table through the index field range. This type usually appears in =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, IN() operations. When type is range, the ref field output by EXPLAIN is NULL, and the key_len field is the longest index used in this query.

For example, the following example is a range query:

mysql> EXPLAIN SELECT * FROM customer WHERE id BETWEEN 2 AND 8 \G
*************************** 1. row ***************************
  id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
  type: range
possible_keys: PRIMARY
  key: PRIMARY
 key_len: 8
  ref: NULL
  rows: 7
 filtered: 100.00
 Extra: Using where
1 row in set, 1 warning (0.00 sec)

index: indicates full index scan, which is similar to the ALL type, except that the ALL type scans the entire table, while the index type only scans all indexes without scanning data.

The index type usually appears when the data to be queried can be obtained directly in the index tree without scanning the data. When this is the case, the Extra field will display Using index.

For example:

mysql> EXPLAIN SELECT name FROM customer \G
*************************** 1. row ***************************
  id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
  type: index
possible_keys: NULL
  key: name_index
 key_len: 152
  ref: NULL
  rows: 10
 filtered: 100.00
 Extra: Using index
1 row in set, 1 warning (0.00 sec)

In the above example, the name field we are querying happens to be an index, so we can directly get the data from the index to meet the query requirements without querying the data in the table. Therefore, in this case, the value of type is index, and the value of Extra is Using index.

  • ALL: Indicates full table scan. This type of query is one of the worst performing queries. Generally speaking, we should not have ALL type queries, because such queries are a huge disaster for database performance when the amount of data is large. If a query is an ALL type query, then generally speaking, you can add indexes to the corresponding fields to avoid it.

The following is an example of a full table scan. You can see that in the full table scan, both the possible_keys and key fields are NULL, indicating that no index is used, and the rows are very large, so the overall query efficiency is very low.

mysql> EXPLAIN SELECT age FROM customer WHERE age = 20 \G*************************** 1. row ***************************
  id: 1
 select_type: SIMPLE
 table: customer
 partitions: NULL
  type: ALL
possible_keys: NULL
  key: NULL
 key_len: NULL
  ref: NULL
  rows: 10
 filtered: 10.00
 Extra: Using where
1 row in set, 1 warning (0.00 sec)

Performance comparison of type

Generally speaking, the performance relationship of different types is as follows:

ALL < index < range ~ index_merge < ref < eq_ref < const < system

The ALL type is the slowest because it is a full table scan under the same query conditions.

Although the index type query does not scan the entire table, it scans all indexes, so it is slightly faster than the ALL type. The following types all use indexes to query data, so they can filter part or most of the data, so the query efficiency is relatively high.

For a programmer, if he can ensure that the query reaches at least the range level or preferably the ref level, he is considered an excellent and responsible programmer.

  • ALL: (full table scan) Full table scan is undoubtedly the worst. If the data volume is in the millions or tens of millions, a full table scan will be very slow.
  • index: (full index scan) Full index file scan is much better than all. After all, finding data from the index tree is faster than finding data from the entire table.
  • range: retrieve only the rows in a given range, using the index to match the rows. The scope is narrowed, and of course it is faster than full table scan and full index file scan. SQL statements generally contain queries such as between, in, >, <, etc.
  • ref: A non-unique index scan is essentially an index access that returns all rows that match a single value. For example, when querying all colleagues in the company's R&D team, the matching results are multiple but not unique values.
  • eq_ref: Unique index scan, for each index key, there is a record matching it in the table. For example, if you query the CEO of a company, the matching result may only be one record.
  • const: indicates that the value can be found by indexing once. const is used to compare primary keys or unique indexes. Because there is only one row of data to match, MySQL can quickly convert the query into a constant if the primary key is placed in the where list.
  • system: The table has only one record (equal to the system table). This is a special column of the const type. It does not appear normally. Just understand it.

possible_key

possible_keys indicates the indexes that MySQL can use when querying. Note that even if some indexes appear in possible_keys, it does not mean that this index will actually be used by MySQL. The specific indexes that MySQL uses when querying are determined by the key field.

key

This field is the index that MySQL actually uses in the current query.

key_len

Indicates the number of bytes used by the query optimizer. This field can be used to evaluate whether the composite index is fully used, or only the leftmost fields are used.

The calculation rules for key_len are as follows:

  • String
    • char(n): n bytes in length
    • varchar(n): If it is utf8 encoding, it is 3
      n + 2 bytes; if utf8mb4 encoding, then 4
      n + 2 bytes.
  • Value type:
    • TINYINT: 1 byte
    • SMALLINT: 2 bytes
    • MEDIUMINT: 3 bytes
    • INT: 4 bytes
    • BIGINT: 8 bytes
  • Time Type
    • DATE: 3 bytes
    • TIMESTAMP: 4 bytes
    • DATETIME: 8 bytes
  • Field attributes: The NULL attribute takes up one byte. If a field is NOT NULL, it does not have this attribute.

Let's take two simple examples:

mysql> EXPLAIN SELECT * FROM order_info WHERE user_id < 3 AND product_name = 'p1' AND productor = 'WHH' \G
*************************** 1. row ***************************
   id: 1
 select_type: SIMPLE
  table: order_info
 partitions: NULL
   type: range
possible_keys: user_product_detail_index
   key: user_product_detail_index
  key_len: 9
   ref: NULL
   rows: 5
  filtered: 11.11
  Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

The above example queries the specified content from the table order_info. From the table creation statement, we can see that the table order_info has a joint index:

KEY `user_product_detail_index` (`user_id`, `product_name`, `productor`)

However, in the query WHERE user_id < 3 AND product_name = 'p1' AND productor = 'WHH', the user_id range query is performed first. According to the leftmost prefix matching principle, when a range query is encountered, the index matching stops. Therefore, in fact, the only index field we use is user_id. Therefore, in EXPLAIN, the displayed key_len is 9. Because the user_id field is BIGINT, it occupies 8 bytes, and the NULL attribute occupies one byte, so the total is 9 bytes. If we change the user_id field to BIGINT(20) NOT NULL DEFAULT '0', the key_length should be 8.

Because of the leftmost prefix matching principle, our query only uses the user_id field of the joint index, so the efficiency is not high.

Let's look at the next example:

mysql> EXPLAIN SELECT * FROM order_info WHERE user_id = 1 AND product_name = 'p1' \G;
*************************** 1. row ***************************
   id: 1
 select_type: SIMPLE
  table: order_info
 partitions: NULL
   type: ref
possible_keys: user_product_detail_index
   key: user_product_detail_index
  key_len: 161
   ref: const,const
   rows: 2
  filtered: 100.00
  Extra: Using index
1 row in set, 1 warning (0.00 sec)

In this query, we did not use range query, and the value of key_len is 161. Why? Because in our query condition WHERE user_id = 1 AND product_name = 'p1', only the first two fields in the joint index are used, so keyLen(user_id) + keyLen(product_name) = 9 + 50 * 3 + 2 = 161

rows

Rows is also an important field. Based on the statistics, the MySQL query optimizer estimates the number of rows that SQL needs to scan to find the result set.

This value directly shows the efficiency of SQL. In principle, the fewer rows, the better.

Extra

A lot of extra information in EXplain will be displayed in the Extra field, the common ones are as follows:

  • Using filesort

When Using filesort is in Extra, it means that MySQL needs additional sorting operations and cannot achieve the sorting effect through index order. Generally, it is recommended to optimize and remove Using filesort because such queries consume a lot of CPU resources.

For example, the following example:

mysql> EXPLAIN SELECT * FROM order_info WHERE user_id = 1 AND product_name = 'p1' \G;
*************************** 1. row ***************************
   id: 1
 select_type: SIMPLE
  table: order_info
 partitions: NULL
   type: ref
possible_keys: user_product_detail_index
   key: user_product_detail_index
  key_len: 161
   ref: const,const
   rows: 2
  filtered: 100.00
  Extra: Using index
1 row in set, 1 warning (0.00 sec)

Our index is

KEY `user_product_detail_index` (`user_id`, `product_name`, `productor`)

However, the query above is sorted by product_name, so the index cannot be used for optimization, which will result in Using filesort.

If we change the sorting basis to ORDER BY user_id, product_name, then Using filesort will not appear. For example:

mysql> EXPLAIN SELECT * FROM order_info ORDER BY user_id, product_name \G
*************************** 1. row ***************************
   id: 1
 select_type: SIMPLE
  table: order_info
 partitions: NULL
   type: index
possible_keys: NULL
   key: user_product_detail_index
  key_len: 253
   ref: NULL
   rows: 9
  filtered: 100.00
  Extra: Using index
1 row in set, 1 warning (0.00 sec)
  • Using index

"Covering index scan" means that the query can find the required data in the index tree without scanning the table data file, which often indicates good performance

  • Using temporary

The query uses temporary tables, which usually occur in sorting, grouping and multi-table join situations. The query efficiency is not high. It is recommended to optimize.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. Thank you for your support of 123WORDPRESS.COM.

You may also be interested in:
  • Implementation of MySQL select in subquery optimization
  • An article to master MySQL index query optimization skills
  • Summary of SQL query optimization knowledge points for MySQL tens of millions of big data
  • Mysql slow query optimization method and optimization principle
  • MySQL slow query optimization: the advantages of limit from theory and practice
  • MySQL query optimization: causes and solutions for slow queries

<<:  Analysis of permissions required to run docker

>>:  How to install Composer in Linux

Recommend

MySQL max_allowed_packet setting

max_allowed_packet is a parameter in MySQL that i...

WeChat applet selects the image control

This article example shares the specific code for...

Solution to mysql failure to start due to insufficient disk space in ubuntu

Preface Recently, I added two fields to a table i...

Native JS to implement image carousel JS to implement small advertising plug-in

Recently I want to use native JS to implement som...

How to use VirtualBox to build a local virtual machine environment on Mac

1. Big Data and Hadoop To study and learn about b...

Analysis of the Docker deployment Consul configuration process

Execute Command docker run -d --name consul -p 85...

How to install MySQL 5.7.17 and set the encoding to utf8 in Windows

download MySQL official download, select Windows ...

Detailed explanation of BOM and DOM in JavaScript

Table of contents BOM (Browser Object Model) 1. W...

Use of MySQL truncate table statement

The Truncate table statement is used to delete/tr...

Tutorial on installing Pycharm and Ipython on Ubuntu 16.04/18.04

Under Ubuntu 18.04 1. sudo apt install python ins...

InnoDB engine redo file maintenance method

If you want to adjust the size and number of Inno...