A brief discussion on the principle of shallow entry and deep exit of MySQL

A brief discussion on the principle of shallow entry and deep exit of MySQL

1. Overview of the page

The data we insert into MySQL is eventually stored in the page. In the design of InnoDB, pages are connected through a doubly linked list.

The rows of data stored in the page are connected through a singly linked list.

The User Records area in the above figure is used to store row data. So why is InnoDB designed this way? Assuming we don’t have the concept of pages, how can we quickly query and get results for thousands of data when we query? As we all know, MySQL has good performance, but without pages, all we have left is to traverse the data one by one.

How does that page achieve quick search? In the current page, you can traverse through the single linked list connecting each record in User Records . If the record is not found in the current page, you can quickly jump to the next page for query through the next page pointer.

2. Infimum and Supremum

Someone might say that you did not solve the problem by traversing User Records , you simply divided the data into groups. If my data is not on the current page at all, do I still have to traverse every piece of data on the previous page? This is too inefficient.

Of course, MySQL has also taken this problem into consideration, so there is actually an area in the page called The Infimum and Supremum Records .

With Infimum Record and Supremum Record , the query now does not need to traverse all User Records on a certain page, but only needs to compare these two records with the target record to be queried. For example, if the data id = 101 , it is obviously not on the current page. Then you can jump to the next page for retrieval through the next page pointer.

3. Use Page Directory

Someone might ask, aren’t all your User Records singly linked lists? Even if I know that the data I'm looking for is on the current page, in the worst case, don't I still have to traverse each page 100 times to find the data I'm looking for? You call this efficient?

I have to say that this is indeed a problem, but it is one that MySQL has already considered. Yes, traversing one by one is indeed very inefficient. To solve this problem, MySQL adds another area to the page Page Directory .

As the name suggests, Page Directory is a directory with many slots, each of which points to a record in User Records . As you can see, a slot is created every few pieces of data. In fact, the data I gave in the figure is very strictly in accordance with its settings. In a complete page, there will be a slot for every 6 data.

The design of Page Directory reminds you of another data structure, the skip list, except that only one layer of index is abstracted here.

MySQL will create the corresponding Slot when new data is added. With Page Directory , you can perform a rough binary search on the data of a page. As for why it is rough, after all, Page Directory does not contain complete data, and the result of binary search can only be an approximate location. After finding this approximate location, it is necessary to return to User Records and continue to traverse and match one by one.

However, this efficiency is much higher than the original version we just talked about.

4. The true appearance of the page

If I throw out the various components and concepts of the page right at the beginning, I myself cannot accept it, as it seems very rigid. Secondly, people who are not familiar with the page may not understand why the page is designed in this way. So I presented the general appearance of the page to you according to a set of ideas for querying a piece of data.

In fact, there are many other fields stored on the page, as well as other areas, but these will not affect our understanding of the page. Therefore, after having a clearer understanding of the page, we can take a look at what the real page looks like.

The above picture shows the actual composition of the page. In addition to what we mentioned before, there are some more that we haven’t talked about before, such as File Header , Page Header , Free Space , and File Tailer . Let’s look at them one by one.

4.1、File Header

In fact, File Header has been discussed above, but it is not called this name. The previous page pointer and next page pointer mentioned above actually belong to File Header , and there are many other data besides them.

In fact, I am quite resistant to listing a bunch of parameters and telling you the size of this one and what that one is used for. For those of us who need to understand the page in detail, it is enough to know only two for now, which are:

  • FIL_PAGE_PREV
  • FIL_PAGE_NEXT

These two variables are the previous page pointer and next page pointer mentioned above. They are called pointers for the convenience of understanding. In fact, they are the offset of the page on the disk.

4.2 Page Header

Compared with File Header , the data in Page Header is more familiar to us. I have drawn a picture here to list the contents in detail.

All of them are listed here because understanding the meaning of these parameters and why we need to set them can help us better understand the principles and structure of the page. The details can be explained by looking at the pictures.

I also want to complain here. Too many blogs are too rigid in their writing. For example, the parameter PAGE_HEAP_TOP , many blogs just call HEAP the heap. This is just like writing a comment Init and calling it initialization. It’s better not to write it. In fact, if you study it, you will find that the heap here actually refers to User Records.

There are two parameters that may be a bit confusing, namely PAGE_N_HEAP and PAGE_N_RECS , both of which are the number of records in the current User Records . The only difference is that PAGE_N_HEAP contains records marked for deletion, while PAGE_N_RECS contains all the data we can actually query.

4.3、Infimum & Supremum Records

As mentioned above, Infimum & Supremum Records will record the maximum and minimum records of the current page. In fact, it is not accurate. A more accurate description is the opening interval of the minimum and maximum records. Because in fact, Infimum Records will be smaller than the minimum value in the current page, and Supremum Records will be larger than the maximum value in the current page.

4.4 User Records

User Records can be said to be the part we come into contact with most often, after all, our data is ultimately here. After the page is initialized, there is no data in User Records . As the system runs and data is generated, the data in User Records will continue to expand, and the corresponding Free Space will gradually decrease.

We have already discussed the concepts in User Records . Here I will only talk about one point that I think is very important, that is, the order.

We know that in a clustered index, the Key is actually arranged in the order of Primary Key . Is this also the case with User Records ? When we insert a new data into User Records , will the existing data be reordered according to the order of Primary Key ?

The answer is no, because this will reduce the efficiency of MySQL processing.

The data in User Records is guaranteed by the pointing of the single linked list pointer, that is, the actual performance of the row data on the disk is queued according to the insertion order, with the first data in front and the later data in the back. It's just that a single linked list between the row data in User Records forms an order arranged according to Primary Key .

To express it in a diagram, it is roughly as follows:

4.5 Free Space

This part is actually discussed in other modules in disguise. Initially, User Records is completely empty. When new data comes in, space will be requested in Free Space . When there is no space Free Space , it means that a new page needs to be requested. There is nothing special about it.

4.6 Page Directory

This is not much different from what was discussed above, so I will skip it.

4.7 File Trailer

This is mainly to prevent the page from failing during the process of being flushed to disk due to extreme unexpected situations (network problems, fire, natural disasters), which may cause data inconsistency, that is, the formation of dirty pages.

There is only one component inside:

V. Conclusion

At this point, I think we have talked about everything about pages. I personally think that understanding the underlying page principles will help us use MySQL more friendly and rationally, so that it can exert its ultimate performance.

The above is a brief discussion of the detailed content of the shallow entry and deep exit principle of MySQL. For more information about the MySQL page principle, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • PHP page, MySQL database conversion to utf-8 garbled code, utf-8 encoding problem summary
  • How to use LIMIT for paging in MySQL
  • How to modify the Innodb data page size to optimize MySQL
  • Spring MVC+MyBatis+MySQL to implement paging function example
  • Nodejs mysql paging method
  • Detailed explanation of MySQL limit usage and performance analysis of paging query statements
  • PHP+MySQL to implement a simple login, registration and password modification web page
  • MySQL million-level data paging query optimization solution
  • Mysql sorting and paging (order by & limit) and existing pitfalls
  • MySQL paging query method for millions of data volumes and its optimization suggestions

<<:  How to Completely Clean Your Docker Data

>>:  CSS positioning layout (position, positioning layout skills)

Recommend

Vue.js implements timeline function

This article shares the specific code of Vue.js t...

Comparison of the usage of EXISTS and IN in MySQL

1. Usage: (1) EXISTS usage select a.batchName,a.p...

Application of CSS3 animation effects in activity pages

background Before we know it, a busy year is comi...

A brief analysis of CSS :is() and :where() coming to browsers soon

Preview versions of Safari (Technology Preview 10...

Practical TypeScript tips you may not know

Table of contents Preface Function Overloading Ma...

JavaScript implementation of a simple addition calculator

This article example shares the specific code of ...

WeChat applet implements user login module server construction

I chose node.js to build the server. Friends who ...

Detailed process of compiling and installing Storm on Kylin V10 server

1 Introduction Apache Storm is a free, open sourc...

How to use partitioning to optimize MySQL data processing for billions of data

When MySQL queries tens of millions of data, most...

Write a formal blog using XHTML CSS

The full name of Blog should be Web log, which mea...

How to backup MySQL regularly and upload it to Qiniu

In most application scenarios, we need to back up...

How to perform query caching in MySQL and how to solve failures

We all know that we need to understand the proper...

Vue realizes the percentage bar effect

This article shares the specific code of Vue to r...