Detailed explanation of how a SQL statement is executed in MySQL

Detailed explanation of how a SQL statement is executed in MySQL

Overview

I have recently started learning MySQL related knowledge. I have organized and shared it based on the knowledge points I have learned and my own understanding. This article will analyze the execution process of the next SQL statement in MySQL, including how SQL queries flow inside MySQL and how SQL statements are updated.

1. Analysis of MySQL architecture

The following is a brief architecture diagram of MySQL:

MySQL is mainly divided into Server layer and storage engine layer

Server layer : mainly includes connectors, query cache, analyzer, optimizer, executor, etc. All cross-storage engine functions are implemented in this layer, such as stored procedures, triggers, views, functions, etc. There is also a general log module binglog log module.

Storage engine : Mainly responsible for data storage and reading, it adopts a replaceable plug-in architecture and supports multiple storage engines such as InnoDB, MyISAM, Memory, etc. Among them, the InnoDB engine has its own log module redolog module.

InnoDB version 5.5.5 is used as the default engine.

Connectors

It is mainly responsible for user login to the database and user identity authentication, including verification of account password, permissions and other operations. If the user account password is passed, the connector will query all permissions of the user in the permission table. After that, the permission logic judgment in this connection will depend on the permission data read at this time. In other words, as long as the connection is not disconnected, the user will not be affected even if the administrator modifies the user's permissions.

Query Cache

After the connection is established, when executing a query statement, the cache will be queried first. Mysql will first verify whether the SQL has been executed and cache it in the memory in the form of Key-Value. The Key is the query estimate and the Value is the result set. If the cache key is hit, it will be returned directly to the client. If it is not hit, subsequent operations will be performed and the results will be cached after completion for the next call. Of course, when the cache query is actually executed, the user's permissions will still be checked to see if there are query conditions for the table.

It is not recommended to use cache for Mysql query, because for frequently updated data, the effective time of cache is too short, and the effect is often not good. For data that is not frequently updated, it is still possible to use cache. The cache function was deleted after Mysql version 8.0. The official also believes that this function has relatively few actual application scenarios, so it was simply deleted.

Analyzer

If mysql does not hit the cache, it will enter the analyzer, which is mainly used to analyze the purpose of the SQL statement. The analyzer is also divided into several steps:

The first step is lexical analysis. An SQL statement consists of multiple strings. First, we need to extract keywords, such as select, propose the query table, propose the field name, propose the query conditions, and so on. After completing these operations, you will enter the second step.

The second step, syntax analysis, is mainly to determine whether the SQL you entered is correct and whether it conforms to the MySQL syntax.

After completing these two steps, MySQL is ready to start executing, but how to execute it and how to execute it to get the best result? This is when the optimizer comes into play.

Optimizer

The role of the optimizer is to execute the optimal execution plan it deems (although sometimes it is not optimal), such as how to choose an index when there are multiple indexes, how to choose the association order when querying multiple tables, etc.

Actuator

After the execution plan is selected, MySQL is ready to start executing. First, it will check whether the user has permission before execution. If the user does not have permission, an error message will be returned. If the user has permission, the engine interface will be called and the result of the interface execution will be returned.

2. Statement Analysis

2.1 Query Statement

Having said so much above, how is a SQL statement executed? In fact, our sql can be divided into two types, one is query and the other is update (add, update, delete). Let's analyze the query statement first. The statement is as follows:

select * from tb_student A where A.age='18' and A.name='张三';

Combined with the above description, we analyze the execution flow of this statement:

  1. First check whether the statement has permission. If not, an error message is returned directly. If permission is granted, before MySQL 8.0, the cache will be queried first, using this SQL statement as the key to query whether there is a result in the memory. If so, the cache will be directly cached. If not, proceed to the next step.
  2. Through the analyzer, lexical analysis is performed to extract the key elements of the SQL statement. For example, the statement above is a query select, the table name to be queried is tb_student, all columns need to be queried, and the query condition is the id of this table = '1'. Then determine whether there are any syntax errors in the SQL statement, such as whether the keywords are correct, etc. If the check is OK, proceed to the next step.
  3. The next step is for the optimizer to determine the execution plan. The above SQL statement can have two execution plans:

a. First query the student table whose name is "Zhang San", and then determine whether his age is 18.
b. First find the students who are 18 years old, and then search for students whose name is "Zhang San".

The optimizer then selects the solution with the best execution efficiency based on its own optimization algorithm (the optimizer believes that this may not always be the best solution). Then after confirming the execution plan, you are ready to start execution.

Perform permission check. If there is no permission, an error message will be returned. If there is permission, the database engine interface will be called and the engine's execution result will be returned.

2.2 Update Statement

The above is the execution process of a query SQL, so let's take a look at how an update statement is executed? The sql statement is as follows:

update tb_student A set A.age='19' where A.name='张三';

Let's modify Zhang San's age. In the actual database, this age field will definitely not be set, otherwise you will be beaten by the technical director. In fact, this statement will basically follow the process of the previous query, but it is necessary to record logs when executing updates, which will introduce the log module. The log module that comes with MySQL is binlog (archive log), which can be used by all storage engines. The commonly used InnoDB engine also comes with a log module redo log. We will discuss the execution process of this statement in InnoDB mode. The process is as follows:

  1. First query the data of Zhang San. If there is a cache, it will also be used.
  2. Then get the query statement, change age to 19, and call the engine API interface to write this row of data. The InnoDB engine saves the data in memory and records the redo log at the same time. At this time, the redo log enters the prepare state and tells the executor that the execution is completed and can be submitted at any time.
  3. After receiving the notification, the executor records the binlog, then calls the engine interface to submit the redo log to the committed state.
  4. Update completed.

Some students will definitely ask here, why do we need two logging modules? Can't we use one logging module? This is the previous MySQL mode. The MyISAM engine does not have a redo log, so we know that it does not support transactions. So it does not mean that only one log module cannot be used, but the InnoDB engine supports transactions through redo log. Then, some students may ask, can I use two log modules but not make it so complicated? Why does redo log need to introduce the prepare pre-commit status? Here we use proof by contradiction to explain why we do this?

  1. First write the redo log and commit it directly, then write the binlog. Suppose after writing the redo log, the machine crashes and the binlog is not written. After the machine restarts, the machine will restore the data through the redo log, but the binlog does not record the data at this time. When the machine is backed up later, this piece of data will be lost, and the master-slave synchronization will also lose this piece of data.
  2. Write binlog first, then write redo log. Suppose after writing binlog, the machine restarts abnormally. Since there is no redo log, the machine cannot recover this record. However, binlog has another record. Then, the same reason as above will cause data inconsistency.

If the redo log two-phase commit method is adopted, the situation will be different. After writing the binglog, submitting the redo log will prevent the above problems from occurring, thus ensuring data consistency. So the question is, is there an extreme situation? Assume that the redo log is in the pre-commit state and the binglog has been written. What will happen if an abnormal restart occurs at this time? This depends on the processing mechanism of MySQL. The processing process of MySQL is as follows:

  1. Determine whether the redo log is complete. If it is complete, commit it immediately.
  2. If the redo log is in pre-commit but not committed state, the system will determine whether the binlog is complete. If it is complete, the redo log will be committed; otherwise, the transaction will be rolled back.

This solves the problem of data consistency.

Conclusion

  1. Mysql is mainly divided into the server layer and the engine layer. The server layer mainly includes connectors, query cache, analyzer, optimizer, executor, and a log module (binlog). This log module can be shared by all execution engines.
  2. The engine layer is plug-in-based and currently mainly includes MyISAM, InnoDB, Memory, etc.
  3. The execution process of sql is divided into two categories. One category is for query and other processes as follows: permission check ---》query cache ---》analyzer ---》optimizer ---》permission check ---》executor ---》engine
  4. The execution process for update statements is as follows: Analyzer ----》Permission check ----》Executor---》Engine---redo log prepare---》binlog---》redo log commit

IV. Reference

"Building a MySQL knowledge network together"

The above is a detailed explanation of how a SQL statement is executed in MySQL. I hope it will be helpful to you. If you have any questions, please leave me a message and I will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website!

You may also be interested in:
  • Solutions to MySQL batch insert and unique index problems
  • Will mysql's in invalidate the index?
  • Detailed explanation of how to migrate a MySQL database to another machine
  • Example of using go xorm to operate mysql
  • Mysql classic high-level/command line operation (quick) (recommended)
  • A brief discussion on the differences between the three major databases: Mysql, SqlServer, and Oracle
  • The use of mysql unique key in query and related issues
  • How to implement scheduled backup of MySQL database
  • Detailed explanation of mysql download and installation process
  • MySQL slow query optimization: the advantages of limit from theory and practice

<<:  Vue implementation counter case

>>:  How to automatically execute the task schedule crontab every few minutes in a specified time period on Linux

Recommend

Axios cancel request and avoid duplicate requests

Table of contents origin status quo Cancel reques...

What is the base tag and what does it do?

The <base> tag specifies the default addres...

nginx+tomcat example of accessing the project through the domain name

I was curious about how to access the project usi...

How does JS understand data URLs?

Table of contents Overview Getting started with d...

Detailed process of building nfs server using Docker's NFS-Ganesha image

Table of contents 1. Introduction to NFS-Ganesha ...

Detailed explanation of custom events of Vue components

Table of contents Summarize <template> <...

How to print highlighted code in nodejs console

Preface When the code runs and an error occurs, w...

Implementing simple chat room dialogue based on websocket

This article shares the specific code for impleme...

Issues with using Azure Container Registry to store images

Azure Container Registry is a managed, dedicated ...

In-depth explanation of the style feature in Vue3 single-file components

Table of contents style scoped style module State...

MySQL uses limit to implement paging example method

1. Basic implementation of limit In general, the ...

Detailed tutorial on deploying Jenkins based on docker

0. When I made this document, it was around Decem...

Nginx one domain name to access multiple projects method example

Background Recently, I encountered such a problem...

A brief summary of basic web page performance optimization rules

Some optimization rules for browser web pages Pag...

Mac node deletion and reinstallation case study

Mac node delete and reinstall delete node -v sudo...