Preface: I have always wanted to know how a SQL statement is executed and in what order it is executed. Then I checked and summarized the information from various parties and came up with the following blog post. This article will explore the knowledge from the perspective of MySQL overall architecture ---> query execution process ---> statement execution order. 1. Overview of MySQL architecture: The architecture is best viewed in pictures, with necessary explanatory text. The picture below is based on a picture in a reference book, and my own understanding is added to it. From the above figure we can see that the entire architecture is divided into two layers. The upper layer is MySQLD, which is called the 'SQL Layer', and the lower layer is various storage engines that provide interfaces to the upper layer, which is called the 'Storage Engine Layer'. The functions of other modules and components can be easily understood from their names, so I will not go into details here. 2. Query execution process Let me go a little further and explain the query execution process based on my own understanding: 1. Connect 1.1 The client initiates a Query request and listens to the client's 'connection management module' to receive the request 1.2 Forward the request to 'Connection/Thread Module' 1.3 Call the 'user module' to perform authorization check 1.4 After passing the check, the 'Connection/Thread Module' takes out the idle cached connection thread from the 'Thread Connection Pool' and connects it to the client request. If it fails, a new connection request is created 2. Processing 2.1 First query the cache to check whether the query statement matches completely, then check whether you have the permission. If both are successful, directly retrieve the data and return it 2.2 If the previous step fails, it is transferred to the 'command parser', which generates a parse tree after lexical analysis and syntax analysis 2.3 Next is the preprocessing stage, which handles the semantics that the parser cannot resolve, checks permissions, etc., and generates a new parse tree 2.4 Transfer it to the corresponding module for processing 2.5 If it is a SELECT query, the query optimizer will do a lot of optimization and generate an execution plan. 2.6 After receiving the request, the module checks whether the connected user has the permission to access the target table and target field through the 'Access Control Module' 2.7 If yes, call the 'table management module' to check whether the table cache exists. If yes, directly corresponding table and acquire lock, otherwise reopen the table file 2.8According to the meta data of the table, obtain the storage engine type and other information of the table, and call the corresponding storage engine processing through the interface 2.9 When data changes occur during the above process, if the logging function is turned on, they will be recorded in the corresponding binary log file 3. Results 3.1After the query request is completed, the result set is returned to the 'connection/thread module' 3.2 The returned value may also be a corresponding status indicator, such as success or failure. 3.3 'Connect to/Thread Module' to perform subsequent cleanup work and continue to wait for requests or disconnect from the client A brief summary in one picture 3. SQL parsing order Next, let's take a step further and look at the past and present of a SQL statement. First, let's look at the example sentence SELECT DISTINCT < select_list > FROM < left_table > < join_type > JOIN < right_table > ON < join_condition > WHERE < where_condition > GROUP BY < group_by_list > HAVING < having_condition > ORDER BY < order_by_condition > LIMIT < limit_number > However, the execution order is as follows FROM <left_table> ON <join_condition> <join_type> JOIN <right_table> WHERE <where_condition> GROUP BY <group_by_list> HAVING <having_condition> SELECT DISTINCT <select_list> ORDER BY <order_by_condition> LIMIT <limit_number> Although I didn't expect it to be like this, it is still very natural and harmonious at first glance. Where to get it from? Constantly filter the conditions, choose the same or different ones, and sort them. Then you know to take the first few. That being the case, let’s take a look at the details step by step. Preparation 1. Create a test database create database testQuery 2. Create a test table CREATE TABLE table1 ( uid VARCHAR(10) NOT NULL, name VARCHAR(10) NOT NULL, PRIMARY KEY(uid) )ENGINE=INNODB DEFAULT CHARSET=UTF8; CREATE TABLE table2 ( oid INT NOT NULL auto_increment, uid VARCHAR(10), PRIMARY KEY(oid) )ENGINE=INNODB DEFAULT CHARSET=UTF8; 3. Insert data INSERT INTO table1(uid,name) VALUES('aaa','mike'),('bbb','jack'),('ccc','mike'),('ddd','mike'); INSERT INTO table2(uid) VALUES('aaa'),('aaa'),('bbb'),('bbb'),('bbb'),('ccc'),(NULL); 4. Final desired result SELECT a.uid, count(b.oid) AS total FROM table1 AS a LEFT JOIN table2 AS b ON a.uid = b.uid WHERE a. NAME = 'mike' GROUP BY a.uid HAVING count(b.oid) < 2 ORDER BY total DESC LIMIT 1; ! Let’s start our SQL parsing journey now! 1. FROM When multiple tables are involved, the output of the left table will be used as the input of the right table, and then a virtual table VT1 will be generated. (1-J1) Cartesian product Calculate the Cartesian product (CROSS JOIN) of two related tables and generate virtual table VT1-J1. mysql> select * from table1,table2; +-----+------+-----+------+ | uid | name | oid | uid | +-----+------+-----+------+ | aaa | mike | 1 | aaa | | bbb | jack | 1 | aaa | | ccc | mike | 1 | aaa | | ddd | mike | 1 | aaa | | aaa | mike | 2 | aaa | | bbb | jack | 2 | aaa | | ccc | mike | 2 | aaa | | ddd | mike | 2 | aaa | | aaa | mike | 3 | bbb | | bbb | jack | 3 | bbb | | ccc | mike | 3 | bbb | | ddd | mike | 3 | bbb | | aaa | mike | 4 | bbb | | bbb | jack | 4 | bbb | | ccc | mike | 4 | bbb | | ddd | mike | 4 | bbb | | aaa | mike | 5 | bbb | | bbb | jack | 5 | bbb | | ccc | mike | 5 | bbb | | ddd | mike | 5 | bbb | | aaa | mike | 6 | ccc | | bbb | jack | 6 | ccc | | ccc | mike | 6 | ccc | | ddd | mike | 6 | ccc | | aaa | mike | 7 | NULL | | bbb | jack | 7 | NULL | | ccc | mike | 7 | NULL | | ddd | mike | 7 | NULL | +-----+------+-----+------+ rows in set (0.00 sec) (1-J2)ON Filter Based on the virtual table VT1-J1, all columns that meet the ON predicate condition are filtered out to generate the virtual table VT1-J2. Note: Due to grammatical restrictions, 'WHERE' is used here instead, from which readers can also feel the subtle relationship between the two; mysql> SELECT -> * -> FROM -> table1, -> table2 -> WHERE -> table1.uid = table2.uid -> ; +-----+------+-----+------+ | uid | name | oid | uid | +-----+------+-----+------+ | aaa | mike | 1 | aaa | | aaa | mike | 2 | aaa | | bbb | jack | 3 | bbb | | bbb | jack | 4 | bbb | | bbb | jack | 5 | bbb | | ccc | mike | 6 | ccc | +-----+------+-----+------+ rows in set (0.00 sec) (1-J3) Add external columns If an outer join (LEFT, RIGHT, FULL) is used, the columns in the main table (retained table) that do not meet the ON condition will also be added to VT1-J2 as external rows to generate a virtual table VT1-J3. mysql> SELECT -> * -> FROM -> table1 AS a -> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid; +-----+------+------+------+ | uid | name | oid | uid | +-----+------+------+------+ | aaa | mike | 1 | aaa | | aaa | mike | 2 | aaa | | bbb | jack | 3 | bbb | | bbb | jack | 4 | bbb | | bbb | jack | 5 | bbb | | ccc | mike | 6 | ccc | | ddd | mike | NULL | NULL | +-----+------+------+------+ rows in set (0.00 sec) Below is a very vivid explanation diagram about 'SQL JOINS' found on the Internet. If it infringes your rights, please let us know and delete it. Thank you. 2. WHERE The temporary table generated in the VT1 process is filtered, and the columns that meet the WHERE clause are inserted into the VT2 table. Notice: At this time, due to grouping, aggregate operations cannot be used; nor can the alias created in SELECT be used; Difference from ON: If there are external columns, ON filters the associated table, and the main table (retained table) returns all columns; If no external columns are added, the effects are the same; application: The filtering of the main table should be placed in WHERE; For related tables, use ON if you query conditions first and then connect, and use WHERE if you connect first and then query conditions. mysql> SELECT -> * -> FROM -> table1 AS a -> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid -> WHERE -> a.NAME = 'mike'; +-----+------+------+------+ | uid | name | oid | uid | +-----+------+------+------+ | aaa | mike | 1 | aaa | | aaa | mike | 2 | aaa | | ccc | mike | 6 | ccc | | ddd | mike | NULL | NULL | +-----+------+------+------+ rows in set (0.00 sec) 3. GROUP BY This clause will group the table generated in VT2 according to the columns in GROUP BY. Generate a VT3 table. Notice: The columns used in the subsequent processing statements, such as SELECT and HAVING, must be included in the GROUP BY. For those that do not appear, aggregate functions must be used. reason: GROUP BY changes the reference to the table, converting it to a new reference mode, and the number of columns that can be used for the next level of logical operations will be reduced; My understanding is: According to the grouping field, records with the same grouping field are merged into one record, because each group can only return one record unless it is filtered out, and the fields not in the grouping field may have multiple values, and multiple values cannot be put into one record, so these multi-valued columns must be converted into single values through aggregate functions; mysql> SELECT -> * -> FROM -> table1 AS a -> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid -> WHERE -> a.NAME = 'mike' -> GROUP BY -> a.uid; +-----+------+------+------+ | uid | name | oid | uid | +-----+------+------+------+ | aaa | mike | 1 | aaa | | ccc | mike | 6 | ccc | | ddd | mike | NULL | NULL | +-----+------+------+------+ rows in set (0.00 sec) 4. HAVING This clause filters the different groups in the VT3 table and only acts on the grouped data. The clauses that meet the HAVING condition are added to the VT4 table. mysql> SELECT -> * -> FROM -> table1 AS a -> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid -> WHERE -> a.NAME = 'mike' -> GROUP BY -> a.uid -> HAVING -> count(b.oid) < 2; +-----+------+------+------+ | uid | name | oid | uid | +-----+------+------+------+ | ccc | mike | 6 | ccc | | ddd | mike | NULL | NULL | +-----+------+------+------+ rows in set (0.00 sec) 5. SELECT This clause processes the elements in the SELECT clause and generates a VT5 table. (5-J1) Calculate the expression in the SELECT clause and generate VT5-J1 (5-J2)DISTINCT Find the duplicate columns in VT5-1 and delete them to generate VT5-J2 If the DISTINCT clause is specified in the query, a temporary in-memory table is created (if it cannot fit in memory, it needs to be stored on disk). The table structure of this temporary table is the same as the virtual table VT5 generated in the previous step. The difference is that a unique index is added to the column for the DISTINCT operation to eliminate duplicate data. mysql> SELECT -> a.uid, -> count(b.oid) AS total -> FROM -> table1 AS a -> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid -> WHERE -> a.NAME = 'mike' -> GROUP BY -> a.uid -> HAVING -> count(b.oid) < 2; +-----+-------+ | uid | total | +-----+-------+ | ccc | 1 | | ddd | 0 | +-----+-------+ rows in set (0.00 sec) 6. ORDER BY From the table in VT5-J2, sort the results according to the conditions of the ORDER BY clause to generate the VT6 table. Notice: The only place where an alias can be used is in SELECT; mysql> SELECT -> a.uid, -> count(b.oid) AS total -> FROM -> table1 AS a -> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid -> WHERE -> a.NAME = 'mike' -> GROUP BY -> a.uid -> HAVING -> count(b.oid) < 2 -> ORDER BY -> total DESC; +-----+-------+ | uid | total | +-----+-------+ | ccc | 1 | | ddd | 0 | +-----+-------+ rows in set (0.00 sec) 7.LIMIT The LIMIT clause selects the specified row data starting from the specified position from the VT6 virtual table obtained in the previous step. Notice: The impact of the positive and negative values of offset and rows; When the offset is large, the efficiency is very low. You can do this: Use subquery to optimize. In the subquery, first get the maximum id from the index, then sort in reverse order, and then get N rows of result sets. Using INNER JOIN optimization, the JOIN clause also gives priority to obtaining the ID list from the index, and then directly associates the query to obtain the final result. mysql> SELECT -> a.uid, -> count(b.oid) AS total -> FROM -> table1 AS a -> LEFT JOIN table2 AS b ON a.uid = b.uid -> WHERE -> a.NAME = 'mike' -> GROUP BY -> a.uid -> HAVING -> count(b.oid) < 2 -> ORDER BY -> total DESC -> LIMIT 1; +-----+-------+ | uid | total | +-----+-------+ | ccc | 1 | +-----+-------+ row in set (0.00 sec) This is the end of the SQL parsing journey. The above figure summarizes it: Reference books:
end: Well, this journey of in-depth understanding is almost over here. Although it is not very in-depth, it is just some things pieced together, and I referred to some books I have read before. The master's writing is indeed different. And I also got a lot of things in the process, the most important thing is to further realize the grandeur of the computer software world~ In addition, due to my limited knowledge, there are inevitably omissions and errors. If you find any, please let me know and correct them. Thank you~ Summarize The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM. You may also be interested in:
|
<<: Windows uses VMware to create a Linux virtual machine and install the CentOS7.2 operating system
>>: Analyze how uniapp dynamically obtains the interface domain name
This article explains the difference between arro...
Table of contents Preface Active withdrawal Excep...
An absolute URL is used to represent all the conte...
Table of contents Packaging, launching and optimi...
Install the required environment 1. gcc installat...
MySQL server has gone away issue in PHP 1. Backgr...
Recently, I participated in the development of th...
<br />Original text: http://andymao.com/andy...
Preface The project requires a circular menu. I s...
Introduction to Dockerfile Docker can automatical...
This article shares the specific code of Vue to i...
Use Docker to build a flexible online PHP environ...
Table of contents Examples from real life Slow qu...
1. Setting up nginx virtual host With virtual hos...
1. Installation Tip: There is currently no offici...