Detailed explanation of the execution plan explain command example in MySQL

Detailed explanation of the execution plan explain command example in MySQL

Preface

The explain command is the primary way to see how the query optimizer decided to execute a query.

This feature has limitations and doesn't always tell the truth, but its output is the best information available and is worth the time to understand because you can learn how the query is executed.

Calling EXPLAIN

When you add EXPLAIN before the SELECT statement, MySQL sets a flag on the query that, when the query plan is executed, causes it to return information about each step in the execution plan instead of executing it.

It returns one or more lines of information showing each part of the execution plan and the order in which they are executed.

Here is a simple explain effect:


There will be only one row in the output for each table in the query, and if the query is a join of two tables, there will be two rows in the output.

The alias form counts as one table, so if you join a table to itself, there will be two rows in the output.

The meaning of "table" here is quite broad, it can be a subquery, a union result, etc.

There are two variants of explain at the same time.

EXPLAIN EXTENDED tells the server to "reverse compile" the execution plan into a select statement.

You can see this generated statement by running show warnings immediately afterwards. This statement comes directly from the execution plan, not the original SQL statement, which has become a data structure at this point.

In most scenarios, it is different from the original statement. You can check how Query Optimizer converts the statement.

EXPLAIN EXTENDED is available in MySQL 5.0 and above, and a filtered column was added in 5.1.

EXPLAIN PARTITIONS shows the partitions that the query would access if the query was against a partitioned table.

It exists in MySQL 5.1 and above.

EXPLAIN Limitations:

explain will not tell you at all how triggers, stored procedures or UDFs affect queries

Stored procedures are not supported, although it is possible to manually extract queries and explain them individually

It does not tell you about specific optimizations that MySQL has made in its execution plan.

It does not display all information about the query’s execution plan

It does not distinguish between things with the same name, for example, it uses "filesort" for both in-memory sorts and temporary files, and it says "Using temporary" for both on-disk and in-memory temporary tables.

Can be misleading, for example, it can show a full index scan for a query with a small limit (mysql 5.1 explain shows more accurate information about the number of rows checked, but earlier versions do not take the limit into account)

Rewriting non-SELECT queries

mysql explain can only explain select queries, and will not explain stored procedure calls and insert, update, delete, or other statements. However, you can rewrite certain non-select queries to take advantage of explain. To achieve this, simply convert the statement into an equivalent select that accesses all of the same columns. Any additional columns must be in the select list, join clause, or where clause.

Suppose you want to rewrite the following update statement so that it can use explain

UPDATE sakila.actor
INNER JOIN sakila.film_actor USING (actor_id)
SET actor.last_update = film_actor.last_update;

The following explain statement is not equivalent to the above update because it does not require the server to obtain the last_update column from any table.

This distinction is very important. For example, the output shows that MySQL will use the covering index. However, when retrieving and updating the last_updated column, the covering index cannot be used. The following rewrite is closer to the original statement:


Rewriting the query like this isn't terribly scientific, but it's good enough to help understand what the query is doing.
(MySQL 5.6 will allow explanation of non-SELECT queries)

When displaying query plans, it is important to understand that there is no "equivalent" read query for a write query. A SELECT query only needs to find one copy of the data and return it. Any query that modifies data must look up and modify all copies of it in all indexes, which is often much more expensive than a seemingly equivalent SELECT query.

Columns in EXPLAIN

The meaning of each column in the explain result will be shown in the next section.

The lines in the output appear in the order that mysql actually executed the parts of the query, which is not always the same order as they appeared in the original SQL.

[id column]

This column always contains a number identifying the row to which the select belongs. If there are no subqueries or joins in the statement, there will be only one select, so each row will show a 1 in this column. Otherwise, the inner select statements are generally numbered sequentially, corresponding to their position in the original statement.

MySQL divides select queries into simple and complex types. Complex types can be divided into three categories: simple subqueries, so-called derived tables (subqueries in the from clause), and union queries.

Here is a simple subquery:

Subqueries and joins in the from clause add further complexity to the id column.

Here is a basic subquery in a from clause:


As you know, this query is executed with an anonymous temporary table. MySQL internally references this temporary table in the outer query through an alias (der). In more complex queries, you can see the ref column.

Finally, here is a union query:

Note the third extra line. The result of the union is always placed in an anonymous temporary table. MySQL then reads the result out of the temporary table. The temporary table does not appear in the original SQL, so its id column is null.

Compared to the previous example (which demonstrated the from clause of a subquery), the temporary table resulting from this query appears as the last row in the result, rather than the first.

So far this has been pretty straightforward, but mixing these three types of statements can make the output quite complicated, as we'll see shortly.

[select_type column]

This column shows whether the corresponding row is a simple or complex selection (and if the latter, which of the three complexity types). The simple value means that the query includes no subqueries or unions. If the query has any responsible subparts, the outermost part is marked as primary and the other parts are marked as follows:

SUBQUERY

A select that is contained in a subquery that is in the select list (in other words, not in the from clause) is marked as SUBQUERY

DERIVED

The DERIVED value is used to indicate that the select in the subquery contained in the FROM clause will be executed recursively by MySQL and the results will be placed in a temporary table. The server internally calls this a "derived table" because the temporary table is derived from the subquery.

UNION

The second and subsequent selects in a UNION are marked as unoin, and the first select is marked as if it were executed as part of the outer query. This is why the first select in the union in the previous example is shown as primary. If a union is contained in a subquery in the from clause, its first select is marked as derived.

UNION RESULT

The select used to retrieve the results from the anonymous temporary table of the union is marked as UNION RESULT.

In addition to these values, SUBQUERY and UNION can also be marked as DEPENDENT and UNCACHEABLE.

DEPENDENT means that the select depends on data found in the outer query.

UNCACHEABLE means that some characteristics of the select prevent the results from being cached in an Item_cache.

(Item_cache is undocumented and is not the same thing as the query cache, although it can be negated by some of the same type of constructs, such as the RAND() function.)

【table column】

This column shows which table the corresponding row is accessing, and in the usual case it is quite straightforward: it is the table, or a column of that table (if an alias is defined in the SQL).

You can observe the join order that MySQL's join optimizer chooses for the query from top to bottom in this column. For example, you can see that in the following query, the join order that MySQL chooses is different from the order specified in the statement:


MySQL's execution plan is always a left-side depth-first tree. If you turn this plan upside down, you can read the leaf nodes in order, which directly correspond to the rows in explain. The previous query plan looks like the following figure:

Derived tables and unions

When there is a subquery or a union in the from clause, the table column becomes much more complicated. In these scenarios, there is really no "table" to refer to, because the anonymous temporary table created by MySQL only exists during the query execution.

When there is a subquery in the from clause, the table column is of the form <derivedN>, where N is the id of the subquery. This always refers "forward" -- in other words, N points to the next line in the explain output.

When there is a union, the table column of the union result contains a list of ids participating in the union. This always "references backwards" because the union result appears after all participating rows in the union, and if there are more than 20 ids in the list, the table column is truncated to prevent it from being too long, at which point it is impossible to see all the values. Fortunately, it is still possible to infer which rows are included, because you can see the id of the first row, and everything that appears between this row and the union result will be included in some way.

An example of a complex select type

Here is a meaningless query that we use here as a compact example of a certain type of complex select:


The limit clause is just for convenience in case you plan to execute the program without explaining the results.

The following are some of the results of explain:

We intentionally made each query part access a different table so that we could figure out the problem, but it's still hard to solve, starting from above:

Line 1 refers forward to der_1. This query is marked as <derived3>, which is line 2 in the original SQL. To know which lines in the output refer to the select statement in <derived3>, read on.

The 2nd row has an id of 3 because it is part of the 3rd select in the query. It is classified as derived because it is nested inside a subquery in the from clause. It is the 4th row in the original SQL.

The id of the third row is 2, which is the third row in the original SQL. Note that it is after the row with a higher id, suggesting that it will be executed later, which is reasonable. It is classified as a DEPENDENT SUBQUERY, which means that its results depend on the outer query (that is, a correlated subquery). The outer query in this example starts at row 2 and selects data from der_1.

The 4th row is classified as a union, which means it is the second or subsequent select in the union, and its table is <derived6>, which means it retrieves data from the subquery of the from clause and appends it to the temporary table of the union. As before, to find the explain line that shows the query plan for this subquery, scroll down.

Line 5 is the der_2 subquery in line 8 of the original SQL, and explain calls it <derived6>.

Line 6 is a normal subquery in the select list of <derived6>, and its id is 7, which is very important...

...because it is greater than 5, which is the id of row 7. Why is it important? because it shows the bounds of the <derived6> subquery. When explain outputs a row with a select type of derived, it indicates the beginning of a "nested range". If the subsequent row has a smaller id (in this case, 5 is smaller than 6), it means the nested scope has been closed. This lets us know that line 7 is part of the select list that retrieves data from <derived6> - for example, the first part of the select list on line 4 (line 7 in the original SQL). This example is fairly easy to understand without having to know the meaning and rules of nested scopes, but of course sometimes it is not that easy. Another thing to note about this line in the output is that, because of the user variables, it is listed as UNCACHEABLE SUBQUERY.

The last line, union result, represents the stage of reading rows from the temporary table of the union. You can work backwards from this line if you wish, and it returns the results for rows with ids 1 and 4, which reference <derived3> and <derived6> respectively.

As you can see, these complex combinations of select types can make the explain output quite difficult to understand. Understanding the rules makes it easier, but it still takes more time.

Reading explain output often requires jumping around in the list. For example, if you look at the first line of output again, you won't know it's part of a union just by staring at it. You won't understand until you see the last line.

[type column]

The MySQL user manual says that this column shows the "association type," but we think it's more accurate to say the access type—in other words, how MySQL decides to find rows in the table. Here are the most important access methods, in order from worst to best:

ALL:

This is called a full table scan, which means that MySQL must scan the entire table, from beginning to end, to find the required row. (There are exceptions, such as when limit is used in the query, or "Using distinct/not exists" is displayed in the extra column.

index:

This is the same as a full table scan, except that MySQL scans the table in index order rather than row order. Its main advantage is that it avoids sorting; its biggest disadvantage is the overhead of reading the entire table in index order. This usually means that accessing rows in random order will be very expensive.

If you see "Using index" in the extra column, it means that MySQL is using index covering. It only scans the index data instead of every row in the index order. It is much less expensive than scanning the full table in the index order.

range:

A range scan is a restricted index scan that starts at a certain point in the index and returns rows that match this range of values. This is better than a full index scan because it does not need to traverse the entire index. The obvious scan is a query with between or > in the where clause.

When MySQL uses an index to find a range of values, such as in() and or lists, it also appears as a range scan. However, these two are actually quite different access types, with important differences in performance.

The cost of this type of scan is comparable to that of an indexed scan.

ref:

This is an index access (sometimes called an index seek) that returns all rows that match a single value. However, it may find multiple qualifying rows, so it is a hybrid of a seek and a scan. This type of index access occurs only when using a nonunique index or a nonunique prefix of a unique index. It is called ref because the index is compared to a reference value. This reference value is either a constant or the result value from the previous table in a multi-table query.

ref_or_null is a variation on ref that means mysql must do a second search to find null entries in the results of the initial search.

eq_ref:

Using this index lookup, MySQL knows to return at most one record that meets the criteria. This access method can be seen when MySQL uses a primary key or unique index lookup, and it compares them with a reference value. MySQL optimizes this type of access very well because it knows that it does not need to estimate the range of matching rows or continue searching after it finds a matching row.

const, system:

MySQL uses these access types when it can optimize part of a query and turn it into a constant. For example, if you select a row's primary key by putting it in the where clause, MySQL can turn the query into a constant and effectively remove the table from the join.

null:

This access method means that MySQL can decompose the query statement during the optimization phase and does not even need to access the table or index during the execution phase. For example, selecting the minimum value from an indexed column can be done by simply looking up the index, without accessing the table at execution time.

[possible_keys column]

This column shows which indexes the query can use, based on the columns accessed by the query and the comparison operators used. This list is created early in the optimization process, so some of the listed indexes may not be useful for subsequent optimization processes.

[key column]

This column shows which index MySQL decides to use to optimize access to the table. If the index does not appear in the possible_keys column, then MySQL chose it for another reason—for example, it might have chosen a covering index even without a where clause.

In other words, possible_keys reveals which index can help to perform the lookup efficiently, while key shows which index the optimizer can use to minimize the query cost. Here is an example:

[key_len column]

This column shows the number of bytes used by MySQL in the index. If MySQL is using only some columns in the index, you can use this value to figure out which columns. Remember that MySQL 5.5 and earlier versions can only use the leftmost prefix of the index. For example, the primary key of film_actor is two smallint columns, and each smallint column is two bytes, so each item in the index is 4 bytes. The following is an example query:


Based on the key_len column in the results, we can infer that the query uses the only first column, the actor_id column, to perform the index lookup. When we calculate the column usage, we must take the character set page in the character column into account.

View the execution plan:


The average length in this query is 13 bytes, which is the total length of columns a and b. Column a is 3 characters, each of which is at most 3 bytes in utf8, and column b is a 4-byte integer.

MySQL does not always show how much of an index is really used, for example, if you perform a LIKE query on a prefix pattern match it will show that the full width of the column is being used.

The key_len column shows the maximum length possible in the index field, not the actual number of bytes used by the data in the table; in the previous example MySQL always displays 13 bytes, even if column a happens to contain only one character in length. In other words, key_len is calculated by looking up the table definition, not the data in the table.

[ref column]

This column shows the column or constant used by the previous table to find the value in the index of the key column record. The following is an example showing the combination of join conditions and aliases. Note that the ref column reflects how the film table is aliased as f in the query text:


[rows column]

This column is the number of rows that MySQL estimates it will have to read to find the required row. This number is the number of loops in the inline loop join plan; that is, it is not the number of rows that MySQL thinks it will eventually read from the table, but the average number of rows that MySQL must read to find those rows that meet the criteria at each point in the query. (This criteria includes the conditions given in the SQL, as well as the current columns from the previous table in the join order.)

Depending on the table statistics and index selection, this estimate may be very inaccurate. In MySQL 5.0 and earlier versions, it does not reflect the limit clause. For example, the following query will not actually check 1057 rows.

By multiplying the values ​​of all rows columns, you can roughly estimate the number of rows that the entire query will examine. For example, the following query will examine approximately 2600 rows.


It is important to remember that this number is the number of rows MySQL thinks it will examine, not the number of rows in the result set. Also, realize that many optimizations, such as associative buffers and caches, cannot affect the displayed number of rows. MySQL may not actually read all the rows it estimates, and it does not know anything about operating system or hardware caches.

[Extra column]

This column contains additional information that does not fit in the other columns. Most of the values ​​that can appear here are documented in the MySQL User Manual.

The most common and most important values ​​are as follows.

“Using index”

This value indicates that MySQL will use a covering index to avoid accessing the table. Do not confuse covering indexes with index access types.

“Using where”

This means that the MySQL server will filter rows after the storage engine retrieves them. Many where conditions involve columns in the index and can be checked by the storage engine when (and if) it reads the index, so not all queries with a where clause will display "Using where". Sometimes the presence of "Using where" is a hint that the query could benefit from a different index.

“Using temporary”

This means that MySQL will use a temporary table when sorting the query results.

“Using filesort”

This means that MySQL will use an external index to sort the results, rather than reading the rows from the table in index order. MySQL has two file sorting algorithms, both of which can be done in memory or on disk. Explain will not tell you which file sorting MySQL will use, nor will it tell you whether the sorting will be done in memory or on disk.

“Range checked for each record(index map: N)”

This means that there is no good index to use, and the new index will be re-evaluated on each row of the join. N is a bitmap of the index shown in the possible_keys column and is redundant.

Output in tree format

MySQL users often prefer to format the output of explain into a tree to more accurately display the execution plan. In fact, the way to view the execution plan in explain is indeed a bit clumsy, and the tree structure is not suitable for tabular output. When there are a large number of values ​​in the extra column, the disadvantages are more obvious. The same is true for using union. Union is different from other types of joins that MySQL can do, and it is not suitable for explain.

If you have a full understanding of the rules and characteristics of explain, it is also feasible to use a tree-structured execution plan. But this is a bit tedious and is best left to automated tools. Percona Toolkit includes pt-visual-explain, which is such a tool.

Improvements in MySQL 5.6

MySQL 5.6 will include an important improvement to explain: the ability to explain queries like update, insert, etc. Although DML statements can be converted into equivalent "select" queries and explained, the results do not fully reflect how the statements are executed, so this is still very helpful. We have tried to use that technique when developing something like pt-upgrade in Percona Toolkit, and we have found more than once that when converting queries to selects, the optimizer cannot execute the code path we expect. Therefore, explaining a query without converting it into a select is very helpful for us to understand what happens during the execution process.

MySQL 5.6 will also include a series of changes to the query optimization and execution engine that allow anonymous temporary tables to be materialized as late as possible, rather than always creating and populating them when optimizing and executing the part of the query that uses this temporary table. This will allow MySQL to directly interpret query statements with subqueries without actually executing the subqueries first.

Finally, MySQL 5.6 will improve the optimizer by adding optimization tracking capabilities to the server, allowing users to view the decisions made by the optimizer, as well as the inputs (for example, the cardinality of an index) and the reasons for the decisions. This is very helpful, not only in understanding the execution plan chosen by the server, but also why it chose that plan.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM.

You may also be interested in:
  • In-depth analysis of explain in MySQL query optimization
  • Detailed explanation of explain usage in MySQL
  • MySQL summary explain
  • MySQL performance analysis and explain usage instructions
  • Detailed explanation of the role of explain in MySQL
  • Detailed explanation of the use of mysql explain (analysis index)
  • Detailed explanation of the EXPLAIN command and its usage in MySQL
  • MYSQL explain execution plan
  • Detailed explanation of EXPLAIN command in MySQL
  • EXPLAIN statement and usage examples in MySQL

<<:  Detailed explanation of Linux CPU load and CPU utilization

>>:  Demonstration of building ElasticSearch middleware and common interfaces under centos7 in Linux system

Recommend

Six tips to increase web page loading speed

Secondly, the ranking of keywords is also related ...

CSS3 implementation example of rotating only the background image 180 degrees

1. Mental Journey When I was writing the cockpit ...

Summary of MySQL date and time functions (MySQL 5.X)

1. MySQL gets the current date and time function ...

Implementation of Mysql User Rights Management

1. Introduction to MySQL permissions There are 4 ...

JavaScript to achieve click image flip effect

I was recently working on a project about face co...

Navicat connects to MySQL8.0.11 and an error 2059 occurs

mistake The following error occurs when connectin...

Vue implements a search box with a magnifying glass

This article shares with you how to use Vue to im...

Detailed steps for Linux account file control management

In the Linux system, in addition to various accou...

How MySQL supports billions of traffic

Table of contents 1 Master-slave read-write separ...

JavaScript parseInt() and Number() difference case study

Learning objectives: The two functions parseInt()...

Detailed use cases of vue3 teleport

Official Website https://cli.vuejs.org/en/guide/ ...

javascript to switch pictures by clicking a button

This article example shares the specific code of ...

React's transition from Class to Hooks

Table of contents ReactHooks Preface WhyHooks? Fo...