How Database SQL SELECT Queries Work

How Database SQL SELECT Queries Work

As Web developers, although we are not professional DBAs, we cannot do without databases. Most developers only know how to use four classic SQL statements: select, insert, delete, and update. So much so that we have never studied how they work, here we talk about how select works in the database.

The most classic topic in B/S architecture is the three-tier architecture, which can be roughly divided into the data layer, business logic layer and presentation layer. The function of the data layer is generally to interact with the database, such as querying records. We often write query SQL and then call the program to execute the SQL. But what is its internal workflow like? I think most of my friends, like me, are not sure which step to do first and which step to do next.

Step 1: The application sends the query SQL statement to the server for execution

When we execute SQL statements in the data layer, the application will connect to the corresponding database server and send the SQL statements to the server for processing.

Step 2: The server parses the requested SQL statement

1. SQL plan cache. Friends who often use query analyzer probably know such a fact. Often a query statement takes a very long time to execute when it is run for the first time, but if you run the same statement immediately or within a certain period of time, the query results will be returned in a very short time.

reason:

  • After receiving a query request, the server does not immediately query the database, but instead looks in the plan cache in the database to see if there is a corresponding execution plan. If so, it directly calls the compiled execution plan, saving the compilation time of the execution plan.
  • If the row being queried already exists in the data buffer storage area, there is no need to query the physical file. Instead, the data is retrieved from the cache. This way, retrieving data from the memory is much faster than reading data from the hard disk, thereby improving query efficiency. The data buffer storage area will be mentioned later.

2. If there is no corresponding execution plan in the SQL plan cache, the server will first perform syntax validation on the SQL statement requested by the user. If there is a syntax error, the server will end the query operation and return the corresponding error message to the application that called it.

Note: The error message returned at this time will only contain basic syntax error information, such as select written as selec. If the error message contains a column that does not exist in the table, the server will not check it at this time, because it is only a syntax verification. Whether the semantics are correct will be handled in the next step.

3. After the syntax conforms, it starts to verify whether its semantics are correct, for example, whether the table name, column name, stored procedure and other database objects actually exist. If it is found that some of them do not exist, an error will be reported to the application and the query will be terminated.

4. The next step is to obtain the parsing lock of the object. When we query a table, the server will first lock the object to ensure the consistency of the data. If it is not locked, data is inserted at this time, but because there is no lock, the query has read the record, and some insertions will be rolled back due to transaction failure, which will cause dirty reads.

5. The next step is to verify the database user's permissions. Even if the SQL statement syntax and semantics are correct, the query results may not be obtained at this time. If the database user does not have the corresponding access rights, the server will report an insufficient permission error to the application. In larger projects, there are often several database connection strings in a project. These database users have different permissions, some are read-only permissions, some are write-only permissions, and some are readable and writable. Different users are selected to execute according to different operations. If you are not careful, no matter how perfect your SQL statement is, it will be useless.

6. The last step of the analysis is to determine the final execution plan. After the syntax, semantics, and permissions are verified, the server will not return the results to you immediately. Instead, it will optimize your SQL and select different query algorithms to return them to the application in the most efficient form. For example, when doing a table join query, the server will ultimately decide whether to use hashjoin, mergejoin, or loopjoin based on the cost, which index will be more efficient, etc. However, its automatic optimization is limited. If you want to write efficient query SQL, you still need to optimize your SQL query statements.

Once the execution plan is determined, it will be saved in the SQL plan cache. The next time there is the same execution request, it will be directly taken from the plan cache to avoid recompiling the execution plan.

Step 3: Statement execution

After the server has finished parsing the SQL statement, it will know what the statement actually means, and then it will actually execute the SQL statement.

There are two situations at this time:

  • If the data rows contained in the query statement have been read into the data buffer storage area, the server will directly read the data from the data buffer storage area and return it to the application, avoiding reading from the physical file and improving the query speed.
  • If the data row is not in the data buffer storage area, the record will be read from the physical file and returned to the application, and the data row will be written to the data buffer storage area for next use.

Note: There are several types of SQL caches. Interested friends can search for them here. Sometimes, due to the existence of caches, it is difficult for us to see the optimization results immediately. Because the second execution is very fast due to the existence of caches, we usually eliminate the cache first, and then compare the performance before and after optimization. Here are several common methods:

DBCC DROPCLEANBUFFERS

Removes all clear buffers from the buffer pool.

DBCC FREEPROCCACHE

Removes all elements from the procedure cache.

DBCC FREESYSTEMCACHE

Releases all unused cache entries from all caches.

The SQL Server 2005 database engine cleans up unused cache entries in the background in advance to make memory available for current entries. However, you can manually remove unused entries from all caches using this command.

This can only basically eliminate the impact of SQL cache. There seems to be no solution to completely eliminate the cache. If you have one, please let me know.

Conclusion: Only by knowing the operation process of the service executing the SQL submitted by the application can we debug our application well.

  • Make sure the SQL syntax is correct;
  • Ensure the SQL semantic correctness, that is, whether the object exists;
  • Whether the database user has the corresponding access rights.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. Thank you for your support of 123WORDPRESS.COM. If you want to learn more about this, please check out the following links

You may also be interested in:
  • Mysql master/slave database synchronization configuration and common errors
  • PHP singleton mode database connection class and page static implementation method
  • Two ideas for implementing database horizontal segmentation
  • MySQL database case sensitivity issue
  • Python uses adbapi to implement asynchronous storage of MySQL database
  • Talk about some experience in database field design
  • The difference between MySQL database stored procedures and transactions
  • The meaning and calculation method of QPS and TPS of MySQL database
  • How to backup and restore the mysql database if it is too large
  • How to reasonably use the redundant fields of the database

<<:  Detailed explanation of direct routing in cross-host communication of Docker containers

>>:  Vue custom table column implementation process record

Recommend

How to read the regional information of IP using Nginx and GeoIP module

Install GeoIP on Linux yum install nginx-module-g...

js to implement collision detection

This article example shares the specific code of ...

Teach you to quickly build a web cluster project based on nginx

Table of contents 1. Project Environment 2. Proje...

The use and difference between JavaScript pseudo-array and array

Pseudo-arrays and arrays In JavaScript, except fo...

Detailed steps for configuring Tomcat server in IDEA 2020

The steps for configuring Tomcat in IDEA 2020 are...

HTML basics summary recommendation (paragraph)

HTML Paragraph Paragraphs are defined by the <...

How to install MySQL and MariaDB in Docker

Relationship between MySQL and MariaDB MariaDB da...

Detailed explanation of keepAlive use cases in Vue

In development, it is often necessary to cache th...

How to add custom system services to CentOS7 systemd

systemd: The service systemctl script of CentOS 7...

Vue implementation counter case

This article example shares the specific code of ...

Native JS object-oriented typing game

This article shares the specific code of JS objec...

A simple method to regularly delete expired data records in MySQL

1. After connecting and logging in to MySQL, firs...

Steps to build MHA architecture deployment in MySQL

Table of contents MAH 1. Introduction to MAH Arch...