Detailed explanation of MySQL joint query optimization mechanism

Detailed explanation of MySQL joint query optimization mechanism

MySQL federated query execution strategy.

Take a UNION query as an example. When MySQL executes a UNION query, it treats them as a series of single query statements, puts the corresponding results into a temporary table, and finally reads them out and returns them. In MySQL, each independent query is a join query, and the same is true for returning results from a temporary table.

In this case, MySQL's join query execution is simple - it treats the join query here as a nested loop join query. This means that MySQL will run one loop to read rows from one table, and then run a nested loop to read matching rows from the next table. This process continues until all matching rows in the union query are found. Then construct the return result according to the columns required in the SELECT statement. As shown in the following query statement:

SELECT tb1.col1, tb2.col2
FROM tb1 INNER JOIN tb2 USING (col3)
WHERE tb1.col1 IN(5,6);

The actual pseudo code that MySQL might execute is as follows:

outer_iter = iterator over tb1 where col1 IN(5,6);
outer_row = outer_iter.next;
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3;
	inner_row = inner_iter.next
    while inner_row
    	output [outer_row.col1, inner_row.col2];
        inner_row = inner_iter.next;
	end
    outer_row = outer.iter.next;
end

After conversion to pseudo code, it looks like this

outer_iter = iterator over tb1 where col1 IN(5,6);
outer_row = outer_iter.next;
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3;
	inner_row = inner_iter.next
    if inner_row
        while inner_row
            output [outer_row.col1, inner_row.col2];
            inner_row = inner_iter.next;
        end
    else
    	output [outer_row.col1, NULL];
	end
    outer_row = outer.iter.next;
end

Another way to visualize the query plan is to use a swim lane diagram. The following figure shows a swimlane diagram for an inner join query.

MySQL executes all kinds of queries in basically the same way. For example, when a subquery needs to be executed first in the FROM condition, the result is first put into a temporary table, and then the temporary table is treated as a normal table and then joined for processing. MySQL also uses temporary tables when performing union queries, and then rewrites the right join query into an equivalent left join. In short, the current version of MySQL will convert various queries into this way of processing as much as possible (the latest version MySQL5.6 and later introduced more complex processing methods).

Of course, not all legal SQL query statements can do this, and some queries may perform poorly in this way.

Execution plan

Unlike many other database products, MySQL does not generate bytecodes for query statements to execute query plans. In fact, the query execution plan is a tree of instructions, and the query execution engine generates query results based on this tree. The final query plan contains enough information to reconstruct the original query. If you execute EXPLAIN EXTENDED on the query statement (MySQL 8 and later do not need to add EXTENDED), and then execute SHOW WARNINGS, you can see the reconstructed query.

Conceptually, multi-table queries can be represented by a tree. For example, a query with four tables might look like the following tree. This is called a balanced tree in computers.

However this is not how MySQL executes queries. As mentioned earlier, MySQL always starts from one table and then looks for matching rows in the next table. Therefore, MySQL's query plan looks like the following left-deep join tree.

Federated Query Optimizer

The most important part of MySQL's query optimizer is the joint query optimizer, which determines the optimal order of executing multi-table queries. You can often get the same results by using multiple sequences of join queries. The federated query optimizer attempts to estimate the costs of these plans and then chooses the lowest-cost plan to execute.

The following is an example of a union query that returns the same results, but in a different order.

SELECT film.film_id, film.title, film.release_year, actor.actor_id, actor.first_name, actor.last_name
FROM sakila.film
INNER JOIN sakila.film_actor USING(film_id)
INNER JOIN sakila.actor USING(actor_id);

There may be some different ways of querying here. For example, MySQL can start from the film table, use the film_id index of film_actor to find the corresponding actor_di value, and then use the primary key from the actor table to find the corresponding actor data row. An Oracle user might state: "The film table is the driving table for film_actor, and film_actor is the driving table for the actor table." The results of using Explain parsing are as follows:

******** 1.row ********
id: 1
select_type: SIMPLE
table: actor
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 200
Extra:
******** 2.row ********
id: 1
select_type: SIMPLE
table: film_actor
type: ref
possible_keys: PRIMARY, idx_fk_film_id
key: PRIMARY
key_len: 2
ref: sakila.film.film_id
rows: 1
Extra: USING index
******** 3.row ********
id: 1
select_type: SIMPLE
table: film
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: sakila.film_actor.film_id
rows: 1
Extra: 

This execution plan is very different from what we expected. MySQL starts with the actor table first and then goes in reverse order. Is this actually more efficient? We can add STRAIGHT_JOIN to EXPLAIN to avoid optimization:

EXPLAIN SELECT STRAIGHT_JOIN film.film_id, film.title, film.release_year, actor.actor_id, actor.first_name, actor.last_name
FROM sakila.film
INNER JOIN sakila.film_actor USING(film_id)
INNER JOIN sakila.actor USING(actor_id);
******** 1.row ********
id: 1
select_type: SIMPLE
table: film
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 951
Extra:
******** 2.row ********
id: 1
select_type: SIMPLE
table: film_actor
type: ref
possible_keys: PRIMARY, idx_fk_film_id
key: idx_fk_film_id
key_len: 2
ref: sakila.film.film_id
rows: 1
Extra: USING index
******** 3.row ********
id: 1
select_type: SIMPLE
table: actor
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: sakila.film_actor.actor_id
rows: 1
Extra: 

This explains why MySQL needs to execute the query in reverse order, which results in fewer rows being examined.

  • Querying the film table first will require 951 queries for film_actor and actor (outermost loop)
  • If you bring the actor table to the front, you only need to query the other tables 200 times.

From this example, we can see that MySQL's joint query optimizer can reduce query costs by adjusting the order of query tables. Reordered join queries are usually very effective optimizations, often improving performance several times. If there is no performance improvement, you can also use STRAIGHT_JOIN to avoid reordering and use the query method you think is best. This situation is rarely encountered in practice, and in most cases, the joint query optimizer will do a better job than humans.

The query optimizer views the union to build a query execution tree with the lowest completion cost. If possible, it starts with all single-table plans and checks all possible subtree combinations. Unfortunately, a join query of N tables has N factorial number of possible combinations. This is called the search space of all possible query plans, and it grows very quickly. A joint index of 10 tables would have 3,628,800 different ways! Once the search space grows too large, query optimization will take a very long time. At this time, the server will stop performing full analysis and instead complete the optimization in a manner similar to a greedy algorithm. This number is controlled by the optimizer_search_depth system variable, which can be modified by yourself.

You may also be interested in:
  • MySQL beginners can say goodbye to the troubles of grouping and aggregation queries
  • Let's talk about MySQL joint query in detail
  • Detailed explanation of the principles and usage examples of MySQL join query, union query, and subquery
  • Analysis of MySQL multi-table joint query operation examples
  • MySQL database aggregate query and union query operations

<<:  How to quickly install nginx under Windows and configure it to start automatically

>>:  Web Design Tutorial (4): About Materials and Expressions

Recommend

User experience of portal website redesign

<br />From the launch of NetEase's new h...

CentOS 7 method to modify the gateway and configure the IP example

When installing the centos7 version, choose to co...

Summary of MySql storage engine and index related knowledge

Storage Engine What is a database storage engine?...

Box-shadow and drop-shadow to achieve irregular projection example code

When we want to add a shadow to a rectangle or ot...

mysql5.7.17 installation and configuration example on win2008R2 64-bit system

123WORDPRESS.COM has explained to you the install...

Five ways to achieve automatic page jump in HTML

In the previous article, we introduced three comm...

Does MySql need to commit?

Whether MySQL needs to commit when performing ope...

Vue+video.js implements video playlist

This article shares the specific code of vue+vide...

Summary of MySQL 8.0 memory-related parameters

Theoretically, the memory used by MySQL = global ...

CSS3 to achieve simple white cloud floating background effect

This is a very simple pure CSS3 white cloud float...

Some tips on using the HTML title attribute correctly

If you want to hide content from users of phones, ...

Thirty HTML coding guidelines for beginners

1. Always close HTML tags In the source code of p...