Detailed explanation of MySQL joint query optimization mechanism

Detailed explanation of MySQL joint query optimization mechanism

MySQL federated query execution strategy.

Take a UNION query as an example. When MySQL executes a UNION query, it treats them as a series of single query statements, puts the corresponding results into a temporary table, and finally reads them out and returns them. In MySQL, each independent query is a join query, and the same is true for returning results from a temporary table.

In this case, MySQL's join query execution is simple - it treats the join query here as a nested loop join query. This means that MySQL will run one loop to read rows from one table, and then run a nested loop to read matching rows from the next table. This process continues until all matching rows in the union query are found. Then construct the return result according to the columns required in the SELECT statement. As shown in the following query statement:

SELECT tb1.col1, tb2.col2
FROM tb1 INNER JOIN tb2 USING (col3)
WHERE tb1.col1 IN(5,6);

The actual pseudo code that MySQL might execute is as follows:

outer_iter = iterator over tb1 where col1 IN(5,6);
outer_row = outer_iter.next;
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3;
	inner_row = inner_iter.next
    while inner_row
    	output [outer_row.col1, inner_row.col2];
        inner_row = inner_iter.next;
	end
    outer_row = outer.iter.next;
end

After conversion to pseudo code, it looks like this

outer_iter = iterator over tb1 where col1 IN(5,6);
outer_row = outer_iter.next;
while outer_row
	inner_iter = iterator over tb2 where col3 = outer_row.col3;
	inner_row = inner_iter.next
    if inner_row
        while inner_row
            output [outer_row.col1, inner_row.col2];
            inner_row = inner_iter.next;
        end
    else
    	output [outer_row.col1, NULL];
	end
    outer_row = outer.iter.next;
end

Another way to visualize the query plan is to use a swim lane diagram. The following figure shows a swimlane diagram for an inner join query.

MySQL executes all kinds of queries in basically the same way. For example, when a subquery needs to be executed first in the FROM condition, the result is first put into a temporary table, and then the temporary table is treated as a normal table and then joined for processing. MySQL also uses temporary tables when performing union queries, and then rewrites the right join query into an equivalent left join. In short, the current version of MySQL will convert various queries into this way of processing as much as possible (the latest version MySQL5.6 and later introduced more complex processing methods).

Of course, not all legal SQL query statements can do this, and some queries may perform poorly in this way.

Execution plan

Unlike many other database products, MySQL does not generate bytecodes for query statements to execute query plans. In fact, the query execution plan is a tree of instructions, and the query execution engine generates query results based on this tree. The final query plan contains enough information to reconstruct the original query. If you execute EXPLAIN EXTENDED on the query statement (MySQL 8 and later do not need to add EXTENDED), and then execute SHOW WARNINGS, you can see the reconstructed query.

Conceptually, multi-table queries can be represented by a tree. For example, a query with four tables might look like the following tree. This is called a balanced tree in computers.

However this is not how MySQL executes queries. As mentioned earlier, MySQL always starts from one table and then looks for matching rows in the next table. Therefore, MySQL's query plan looks like the following left-deep join tree.

Federated Query Optimizer

The most important part of MySQL's query optimizer is the joint query optimizer, which determines the optimal order of executing multi-table queries. You can often get the same results by using multiple sequences of join queries. The federated query optimizer attempts to estimate the costs of these plans and then chooses the lowest-cost plan to execute.

The following is an example of a union query that returns the same results, but in a different order.

SELECT film.film_id, film.title, film.release_year, actor.actor_id, actor.first_name, actor.last_name
FROM sakila.film
INNER JOIN sakila.film_actor USING(film_id)
INNER JOIN sakila.actor USING(actor_id);

There may be some different ways of querying here. For example, MySQL can start from the film table, use the film_id index of film_actor to find the corresponding actor_di value, and then use the primary key from the actor table to find the corresponding actor data row. An Oracle user might state: "The film table is the driving table for film_actor, and film_actor is the driving table for the actor table." The results of using Explain parsing are as follows:

******** 1.row ********
id: 1
select_type: SIMPLE
table: actor
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 200
Extra:
******** 2.row ********
id: 1
select_type: SIMPLE
table: film_actor
type: ref
possible_keys: PRIMARY, idx_fk_film_id
key: PRIMARY
key_len: 2
ref: sakila.film.film_id
rows: 1
Extra: USING index
******** 3.row ********
id: 1
select_type: SIMPLE
table: film
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: sakila.film_actor.film_id
rows: 1
Extra: 

This execution plan is very different from what we expected. MySQL starts with the actor table first and then goes in reverse order. Is this actually more efficient? We can add STRAIGHT_JOIN to EXPLAIN to avoid optimization:

EXPLAIN SELECT STRAIGHT_JOIN film.film_id, film.title, film.release_year, actor.actor_id, actor.first_name, actor.last_name
FROM sakila.film
INNER JOIN sakila.film_actor USING(film_id)
INNER JOIN sakila.actor USING(actor_id);
******** 1.row ********
id: 1
select_type: SIMPLE
table: film
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 951
Extra:
******** 2.row ********
id: 1
select_type: SIMPLE
table: film_actor
type: ref
possible_keys: PRIMARY, idx_fk_film_id
key: idx_fk_film_id
key_len: 2
ref: sakila.film.film_id
rows: 1
Extra: USING index
******** 3.row ********
id: 1
select_type: SIMPLE
table: actor
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: sakila.film_actor.actor_id
rows: 1
Extra: 

This explains why MySQL needs to execute the query in reverse order, which results in fewer rows being examined.

  • Querying the film table first will require 951 queries for film_actor and actor (outermost loop)
  • If you bring the actor table to the front, you only need to query the other tables 200 times.

From this example, we can see that MySQL's joint query optimizer can reduce query costs by adjusting the order of query tables. Reordered join queries are usually very effective optimizations, often improving performance several times. If there is no performance improvement, you can also use STRAIGHT_JOIN to avoid reordering and use the query method you think is best. This situation is rarely encountered in practice, and in most cases, the joint query optimizer will do a better job than humans.

The query optimizer views the union to build a query execution tree with the lowest completion cost. If possible, it starts with all single-table plans and checks all possible subtree combinations. Unfortunately, a join query of N tables has N factorial number of possible combinations. This is called the search space of all possible query plans, and it grows very quickly. A joint index of 10 tables would have 3,628,800 different ways! Once the search space grows too large, query optimization will take a very long time. At this time, the server will stop performing full analysis and instead complete the optimization in a manner similar to a greedy algorithm. This number is controlled by the optimizer_search_depth system variable, which can be modified by yourself.

You may also be interested in:
  • MySQL beginners can say goodbye to the troubles of grouping and aggregation queries
  • Let's talk about MySQL joint query in detail
  • Detailed explanation of the principles and usage examples of MySQL join query, union query, and subquery
  • Analysis of MySQL multi-table joint query operation examples
  • MySQL database aggregate query and union query operations

<<:  How to quickly install nginx under Windows and configure it to start automatically

>>:  Web Design Tutorial (4): About Materials and Expressions

Recommend

mysql IS NULL using index case explanation

Introduction The use of is null, is not null, and...

Tutorial on setting up scheduled tasks to backup the Oracle database under Linux

1. Check the character set of the database The ch...

Steps to deploy hyper-V to achieve desktop virtualization (graphic tutorial)

The hardware requirements for deploying Hyper-V a...

How to use CocosCreator object pool

Table of contents Preface: Specific operations St...

JavaScript Basics: Scope

Table of contents Scope Global Scope Function Sco...

vue-router hook function implements routing guard

Table of contents Overview Global hook function R...

JavaScript to implement limited time flash sale function

This article shares the specific code of JavaScri...

I have compiled a few cool design sites that I think are good.

You must have inspiration to design a website. Goo...

In-depth explanation of MySQL user account management and permission management

Preface The MySQL permission table is loaded into...

Significantly optimize the size of PNG images with CSS mask (recommended)

This article is welcome to be shared and aggregat...

Detailed use cases of vue3 teleport

Official Website https://cli.vuejs.org/en/guide/ ...

How to implement https with nginx and openssl

If the server data is not encrypted and authentic...

Suggestions on creating business HTML emails

Through permission-based email marketing, not onl...

Axios cancels repeated requests

Table of contents Preface 1. How to cancel a requ...

Why do select @@session.tx_read_only appear in DB in large quantities?

Find the problem When retrieving the top SQL stat...