MySQL infrastructure tutorial: detailed explanation of the query statement execution process

MySQL infrastructure tutorial: detailed explanation of the query statement execution process

Preface

I have always wanted to know how a SQL statement is executed and in what order it is executed. Then I checked and summarized the information from various parties and came up with the following article.

This note mainly records the basic architecture of MySQL and how a query statement is executed.

Let’s take a look at the detailed introduction.

For example, when we query an id=2 from the student table

select * from student where id=2;

Before explaining the execution flow of this statement, let's take a look at the basic architecture of MySQL.

The picture comes from Geek Time's MySQL practice. The picture describes the logical architecture of MySQL.

  • The server layer includes connectors, query cache, analyzer, optimizer, and executor, covering most of the core service functions of MySQL, as well as all built-in functions. All cross-storage engine functions are implemented in this layer, such as stored procedures, triggers, views, etc.
  • The storage engine layer is responsible for storing and retrieving data. Its architecture mode is plug-in-based and supports multiple storage engines such as InnoDB, MyISAM, Memory, etc. The innoDB engine is usually used.


Connectors

Before we use the database, we need to connect to the database. The connection statement is

mysql -h $ip -u $username -p $password

Our connector handles this process. The main function of the connector is to establish a connection with the client, obtain permissions, maintain and manage the connection. If the user's permissions change during the use of the connector, it will not take effect immediately, because the user permissions are read when connecting, and the permissions can only be updated by reconnecting.

The protocol for the connector to communicate with the client is TCP. After the connection is established, you can use show processlist to see the number of connections executed.

At the same time, if the connection time exceeds 8 hours, it will be automatically disconnected in the sleep state. This is the default setting of MySQL. If it is not disconnected, then this process can be called a long connection.

Corresponding to this is the short connection, which means disconnecting after executing one or several operations.

When long connections are used continuously, a lot of memory resources will be occupied. After MySQL 5.7, you can use the mysql_reset_connection statement to reinitialize resources.

Query Cache

After connecting, you are connected to the database and can now execute statements.

When executing a statement, MySQL first queries the cache to see if such a statement has been executed before. MySQL will store the previously executed statements and results in the form of key-value (of course, there is a certain storage and validity period). If the cache exists, the cached result is returned directly.

The caching workflow is

  • The server receives the SQL and searches the cache table using the SQL and some other conditions as keys
  • If the cache is found, the cache is returned directly
  • If the cache is not found, the SQL query is executed, including the original SQL parsing, optimization, etc.
  • After executing the SQL query results, cache the SQL query results into the cache table

Of course, if this table is modified, then all caches using this table will no longer be valid and the query cache entries will be cleared. Therefore, it is inappropriate to cache statements in a table that is modified repeatedly, because the cache will be effective at any time, so the hit rate of the query cache will be greatly reduced, which is not very cost-effective.

When data is being written to this table, the cache of this table (hit cache, cache write, etc.) will be invalid. In Innodb, if a transaction modifies this table, the cache of this table will be invalid before the transaction is committed. Before this transaction is committed, related queries of this table cannot be cached.

Generally speaking, if it is a static table or a table that rarely changes, it can be cached, and the hit rate will be very high.

Let's talk about when to use the cache. It is a difficult topic to measure whether turning on the cache will improve system performance.

  • Judged by cache hit rate, cache hit rate = cache hit count (Qcache_hits) / query count (Com_select)
  • By cache write rate, write rate = cache write times (Qcache_inserts) / query times (Qcache_inserts)
  • Judged by hit-write ratio, ratio = hit count (Qcache_hits) / write count (Qcache_inserts), in high-performance MySQL, it is called an index that can better reflect performance improvement. Generally speaking, a ratio of 3:1 is considered effective for query cache, and 10:1 is the best.

Analyzer

When the query cache is ineffective or there is no cache, the MySQL server will use an analyzer to analyze the statement. The analyzer is also called a parser.

The MySQL analyzer consists of two parts. The first part is used for lexical analysis to scan the character stream and identify single words according to the word formation rules. MySQL uses Flex to generate a lexical scanner. MySQL keywords and function keywords are defined in sql/lex.h and stored in two arrays. The function of the second part is syntax analysis. On the basis of lexical analysis, word sequences are combined into grammatical phrases, and finally a syntax tree is generated and submitted to the optimizer. The syntax analyzer uses Bison and defines the syntax rules in sql/sql_yacc.yy. Then generate the syntax tree according to relational algebra theory.

The above explanation of the analyzer is too official and complicated. In fact, the analyzer is mainly used to perform "lexical analysis" and then understand what the database statement is doing and what it means.

At this time, if the analyzer finds that there is a problem with this statement, it will report an error, such as ERROR 1064 (42000): You have an error in your SQL syntax

Optimizer

After the analyzer has finished analyzing and knows what the statement does, the next step is to use a dedicated optimizer to optimize the statement. The optimizer's task is to find the best solution to execute the SQL query. Most query optimizers, including the MySQL query optimizer, search to a more or less optimal level among all possible query evaluation scenarios.

The optimizer mainly selects an optimal execution plan, which is designed to reduce overhead and improve execution efficiency.

The MySQL optimizer is a very complex component that uses many optimization strategies to generate an optimal execution plan:

  • Redefine the order of association of tables (when multiple tables are associated and queried, they are not necessarily in the order specified in SQL, but there are some techniques to specify the association order)
  • Optimize MIN() and MAX() functions (to find the minimum value of a column, if the column has an index, you only need to find the leftmost end of the B+Tree index, otherwise you can find the maximum value. See below for the specific principle)
  • Terminate the query early (for example, when using Limit, the query will be terminated immediately after finding a result set that satisfies the quantity)
  • Optimize sorting (In the old version of MySQL, two-transfer sorting is used, that is, first read the row pointer and the field to be sorted in memory to sort them, and then read the data row according to the sorting result. The new version uses a single-transfer sorting, that is, read all the data rows at once, and then sort them according to the given column. For I/O-intensive applications, the efficiency will be much higher)

As MySQL continues to develop, the optimization strategies used by the optimizer are also constantly evolving. Here we only introduce a few very common and easy-to-understand optimization strategies.

Actuator

After the analyzer knows what the statement is supposed to do and the optimizer knows how to do it, the next step is execution, which is handed over to the executor.

When executing, the executor first determines whether the user has execution permission on the table. If not, an error message such as denied is returned.

If you have permission, the table will be opened and execution will continue. When opening a table, the executor uses the interface of the engine based on the engine defined in the table.

Finally, the statement is executed to obtain data and return it to the client.

Summarize

After MySQL gets the SQL statement, the general process is as follows:

0. The connector is responsible for communicating with the client

1. Query cache: First query the cache to see if there is a kv cache

2. Parser: responsible for parsing and forwarding sql

3. Preprocessor: Verify the parsed SQL tree

4. Optimizer: Get an execution plan

5. Query execution engine: The executor executes the statement to obtain the data result set

6. Return the data to the caller.

Well, the above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support of 123WORDPRESS.COM.

You may also be interested in:
  • 15 basic SQL query statements that beginners must read
  • SqlServer basics data retrieval, query sorting statements
  • SQL-based query statements

<<:  IIS7~IIS8.5 delete or modify the server protocol header Server

>>:  How to install and uninstall IIS7 components using the WIN2008 server command line

Recommend

Vue achieves seamless carousel effect

This article shares the specific code of Vue to a...

Mysql 8.0.18 hash join test (recommended)

Hash Join Hash Join does not require any indexes ...

Solve the problem that vue project cannot carry cookies when started locally

Solve the problem that the vue project can be pac...

JavaScript common statements loop, judgment, string to number

Table of contents 1. switch 2. While Loop 3. Do/W...

Implementation of installing and uninstalling CUDA and CUDNN in Ubuntu

Table of contents Preface Install the graphics dr...

MySQL lock control concurrency method

Table of contents Preface 1. Optimistic Locking A...

Detailed explanation of Svn one-click installation shell script under linxu

#!/bin/bash #Download SVN yum -y install subversi...

Two ways to exit bash in docker container under Linux

If you want to exit bash, there are two options: ...

Solution to Ubuntu cannot connect to the network

Effective solution for Ubuntu in virtual machine ...

Practical record of vue using echarts word cloud chart

echarts word cloud is an extension of echarts htt...

Detailed explanation of jQuery chain calls

Table of contents Chain calls A small case Chain ...

Case study of dynamic data binding of this.$set in Vue

I feel that the explanation of this.$set on the I...