Getting Started with Mysql--sql execution process

Getting Started with Mysql--sql execution process

1. Process

2. Core Architecture

Simply put, MySQL is mainly divided into the server layer and the storage engine layer:

  • Server layer: mainly includes connectors, query cache, analyzer, optimizer, executor, etc. All cross-storage engine functions are implemented in this layer, such as stored procedures, triggers, views, functions, etc. There is also a general log module binglog log module.
  • Storage engine: Mainly responsible for data storage and reading, it adopts a replaceable plug-in architecture and supports multiple storage engines such as InnoDB, MyISAM, Memory, etc. Among them, the InnoDB engine has its own log module redolog module. The most commonly used storage engine now is InnoDB, which has been used as the default storage engine since MySQL 5.5.5.

2.1 Introduction to basic components of the server layer

1. Connectors

Connectors are primarily concerned with authentication and permission-related functions, like a high-level doorman.

It is mainly responsible for user login to the database and user identity authentication, including verification of account password, permissions and other operations. If the user account password is passed, the connector will query all permissions of the user in the permission table. After that, the permission logic judgment in this connection will depend on the permission data read at this time. In other words, as long as the connection is not disconnected, the user will not be affected even if the administrator modifies the user's permissions.

2. Query cache (removed after MySQL 8.0)

The query cache is mainly used to cache the SELECT statement we executed and the result set of the statement.

After the connection is established, when executing a query statement, the cache will be queried first. MySQL will first verify whether the SQL has been executed and cache it in the memory in the form of Key-Value. The Key is the query estimate and the Value is the result set. If the cache key is hit, it will be returned directly to the client. If it is not hit, the subsequent operations will be performed and the results will be cached after completion for the next call. Of course, when the cache query is actually executed, the user's permissions will still be checked to see if there are query conditions for the table.

It is not recommended to use cache for MySQL queries because query cache invalidation may be very frequent in actual business scenarios. If you update a table, all query caches on this table will be cleared. For data that is not updated frequently, it is still possible to use cache.

Therefore, in most cases we do not recommend using query cache.

The cache function was deleted after MySQL version 8.0. The official also believed that this function had few actual application scenarios, so it was simply deleted.

3. Analyzer

If MySQL does not hit the cache, it will enter the analyzer, which is mainly used to analyze the purpose of the SQL statement. The analyzer is also divided into several steps:

  1. The first step is lexical analysis. An SQL statement consists of multiple strings. First, we need to extract keywords, such as select, propose the query table, propose the field name, propose the query conditions, and so on. After completing these operations, you will enter the second step.
  2. The second step, syntax analysis, is mainly to determine whether the SQL you entered is correct and whether it conforms to the MySQL syntax.

After completing these two steps, MySQL is ready to start executing, but how to execute it and how to execute it to get the best results? This is when the optimizer comes into play.

4. Optimizer

The role of the optimizer is to execute the optimal execution plan it deems (which may not be optimal sometimes. This article provides an in-depth explanation of this part of knowledge), such as how to choose an index when there are multiple indexes, how to choose the association order when querying multiple tables, etc.

It can be said that after the optimizer, the specific execution of this statement has been determined.

5. Actuator

After the execution plan is selected, MySQL is ready to start execution. First, it will check whether the user has permission before execution. If the user does not have permission, an error message will be returned. If the user has permission, the engine interface will be called and the result of the interface execution will be returned.

3. Statement Analysis

3.1 Query Statement

SQL can be divided into two types, one is query and the other is update (add, update, delete). Let's analyze the query statement first. The statement is as follows:

select * from tb_student A where A.age='18' and A.name=' 张三';

Combined with the above description, we analyze the execution flow of this statement:

First check whether the statement has permission. If not, an error message is returned directly. If permission is granted, before MySQL 8.0, the cache will be queried first, using this SQL statement as the key to check whether there is a result in memory. If so, the cache is directly cached. If not, proceed to the next step.

Through the analyzer, lexical analysis is performed to extract the key elements of the SQL statement. For example, the statement above is a query select, the table name to be queried is tb_student, all columns need to be queried, and the query condition is the id of this table = '1'. Then determine whether the SQL statement has any syntax errors, such as whether the keywords are correct, etc. If the check is OK, proceed to the next step.

The next step is for the optimizer to determine the execution plan. The above SQL statement can have two execution plans:

a. First query the student table for the student whose name is "Zhang San", and then determine whether his age is 18.
b. First find the students who are 18 years old, and then search for students whose name is "Zhang San".

The optimizer then selects the solution with the best execution efficiency based on its own optimization algorithm (the optimizer believes that this may not always be the best solution). Then after confirming the execution plan, you are ready to start execution.

Perform permission check. If there is no permission, an error message will be returned. If there is permission, the database engine interface will be called and the engine's execution result will be returned.

3.2 Update Statement

The above is the execution process of a query SQL, so let's take a look at how an update statement is executed? The sql statement is as follows:

update tb_student A set A.age='19' where A.name=' 張三';

Let's modify Zhang San's age. In the actual database, this age field will definitely not be set, otherwise you will be beaten by the technical director. In fact, this statement will basically follow the process of the previous query, but it is necessary to record logs when executing updates, which will introduce the log module. MySQL's built-in log module is binlog (archive log), which can be used by all storage engines. Our commonly used InnoDB engine also comes with a log module redo log (redo log). We will discuss the execution process of this statement in InnoDB mode. The process is as follows:

First query the data of Zhang San. If there is a cache, it will also be used.

Then get the query statement, change age to 19, and call the engine API interface to write this row of data. The InnoDB engine saves the data in memory and records the redo log. At this time, the redo log enters the prepare state and tells the executor that the execution is complete and can be submitted at any time.

After receiving the notification, the executor records the binlog, then calls the engine interface to submit the redo log to the committed state.

Update completed.

Some students here will definitely ask, why do we need two log modules? Can't we use one log module?

This is because MySQL was not initially compatible with the InnoDB engine (the InnoDB engine was inserted into MySQL by other companies as a plug-in). The native engine of MySQL is MyISAM, but we know that redo log is unique to the InnoDB engine and other storage engines do not have it. This results in the lack of crash-safe capability (the crash-safe capability means that even if the database restarts abnormally, previously submitted records will not be lost), and binlog logs can only be used for archiving.

It does not mean that it is not possible to use only one log module, but the InnoDB engine supports transactions through redo log. Then, some students may ask, can I use two log modules but not make it so complicated? Why does redo log need to introduce the prepare pre-commit status? Here we use proof by contradiction to explain why we do this?

First write the redo log and commit it directly, then write the binlog. Suppose after writing the redo log, the machine crashes and the binlog is not written. After the machine restarts, the machine will restore the data through the redo log. However, the binlog does not record the data at this time. When the machine is backed up later, this piece of data will be lost. At the same time, the master-slave synchronization will also lose this piece of data.

Write binlog first, then write redo log. Suppose after writing binlog, the machine restarts abnormally. Since there is no redo log, the machine cannot recover this record. However, binlog has another record. Then, the same reason as above will cause data inconsistency.

If the redo log two-phase commit method is adopted, the situation will be different. After writing the binglog, submitting the redo log will prevent the above problems from occurring, thus ensuring data consistency. So the question is, is there an extreme situation? Assume that the redo log is in the pre-commit state and the binglog has been written. What will happen if an abnormal restart occurs at this time? This depends on the processing mechanism of MySQL. The processing process of MySQL is as follows:

Determine whether the redo log is complete. If it is complete, commit it immediately.

If the redo log is in the pre-committed but not committed state, the transaction will be rolled back if the binlog is complete.

This solves the problem of data consistency.

4. Conclusion

  • MySQL is mainly divided into the server layer and the engine layer. The server layer mainly includes connectors, query cache, analyzer, optimizer, executor, and a log module (binlog). This log module can be shared by all execution engines, and redolog is only available in InnoDB.
  • The engine layer is plug-in-based and currently mainly includes MyISAM, InnoDB, Memory, etc.
  • The SQL execution process is divided into two categories. One category is for query processes as follows: Permission check -> query cache -> analyzer -> optimizer -> permission check -> executor -> engine
  • The execution process for update statements is as follows: Analyzer ----》Permission check ----》Executor—》Engine—redo log prepare—》binlog—》redo log commit

This article ends here. I hope you can pay more attention to other content on 123WORDPRESS.COM!

You may also be interested in:
  • Summary of basic operations for MySQL beginners
  • MySQL Beginner's Guide - Quick Reference
  • Getting Started with MySQL - Concepts

<<:  Regular expression usage in CSS selectors

>>:  12 Javascript table controls (DataGrid) are sorted out

Recommend

In-depth analysis of Nginx virtual host

Table of contents 1. Virtual Host 1.1 Virtual Hos...

How to build a React project with Vite

Table of contents Preface Create a Vite project R...

Vue project implements graphic verification code

This article example shares the specific code of ...

Vue parent component calls child component function implementation

Vue parent component calls the function of the ch...

CSS sets the list style and creates the navigation menu implementation code

1. Set the list symbol list-style-type: attribute...

Nodejs plug-in and usage summary

The operating environment of this tutorial: Windo...

React Native scaffolding basic usage detailed explanation

Build the project Execute the command line in the...

In-depth analysis of nginx+php-fpm service HTTP status code 502

One of our web projects has seen an increase in t...

Detailed explanation of the role of explain in MySQL

1. MYSQL index Index: A data structure that helps...

Docker uses nextcloud to build a private Baidu cloud disk

Suddenly, I needed to build a private service for...

How to use Celery and Docker to handle periodic tasks in Django

As you build and scale your Django applications, ...

Analysis of uniapp entry-level nvue climbing pit record

Table of contents Preface Hello World image Set b...

Detailed explanation of Docker compose orchestration tool

Docker Compose Docker Compose is a tool for defin...