Detailed explanation of the buffer pool in MySQL

Everyone knows that MySQL ultimately has to write its data to disk, and that disk reads and writes are very slow compared with memory. Yet when we execute SQL, both reads and writes usually return quickly, not nearly as slowly as you might expect.

You may say: I have an index, so of course it is fast. But my friend, index files are stored on disk too, and searching them also generates disk I/O. And if several operations hit the same row at around the same time, won't that mean a lot of repeated disk I/O?

Maybe you have already thought: can't I just keep the data in memory? There is no doubt that memory is faster than disk. That's right, but how should it be kept there? This is the topic we are going to talk about today: the buffer pool.

Dear readers, please follow me~

Figure: Mind map

Getting Started with the Buffer Pool

As mentioned above, when a SQL statement operates on a row, it cannot go straight to disk every time. There needs to be a buffer in between; otherwise, who could bear a trip all the way down to disk for every single operation?

This is where the buffer pool comes in. Simply put, it is an area of memory. One reason it exists is to avoid hitting the disk on every access: the most frequently accessed data is kept in this cache so that it can be read and modified much faster.

Now that we know what it is for, it helps to place the buffer pool in the overall MySQL architecture: it belongs to the InnoDB storage engine and sits between SQL execution above and the data files on disk below.

As for its internal components, data pages and index pages are not the only things kept in the buffer pool; it holds several other types of content as well.

Application of buffer pool

Now that you know about the buffer pool, you are probably most interested in the role it plays in SQL execution. In the previous article, we briefly covered how a SQL statement is executed, but did not touch on the buffer pool. This time, we again use a SQL statement as the entry point.

When a SQL statement performs a read, the result is returned directly if the data page containing the requested rows is already in memory. Otherwise, that data page is first loaded from disk into memory, and then the result is returned.

The same applies to writes. If the data page containing the row to be modified is in memory, the row is modified there and the result is returned (there are follow-up steps, of course, such as flushing the modified page back to disk later). If it is not, the page is first read from disk into memory and then modified.
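
The read and write paths described above can be sketched as a toy model. This is only an illustration, not how InnoDB is implemented: the class name TinyBufferPool, the dictionaries standing in for disk and memory, and the dirty set are all assumptions made for the example.

```python
class TinyBufferPool:
    """Toy model: cached pages live in a dict, and the disk is simulated by another dict."""

    def __init__(self, disk):
        self.disk = disk        # page_no -> {row_id: value}, stands in for data files
        self.pages = {}         # page_no -> in-memory copy of that page
        self.dirty = set()      # pages changed in memory but not yet written back
        self.disk_reads = 0     # how many times we had to go to "disk"

    def _get_page(self, page_no):
        if page_no not in self.pages:               # miss: load the whole page first
            self.disk_reads += 1
            self.pages[page_no] = dict(self.disk[page_no])
        return self.pages[page_no]                  # hit: served from memory

    def read_row(self, page_no, row_id):
        return self._get_page(page_no)[row_id]

    def write_row(self, page_no, row_id, value):
        self._get_page(page_no)[row_id] = value     # modify the in-memory page
        self.dirty.add(page_no)                     # writing back happens later


disk = {1: {"a": 10, "b": 20}, 2: {"c": 30}}
pool = TinyBufferPool(disk)
pool.read_row(1, "a")        # miss: page 1 is loaded
pool.read_row(1, "b")        # hit: same page, no disk read
pool.write_row(1, "a", 11)   # hit: modified in memory only
pool.write_row(2, "c", 31)   # miss: page 2 is loaded, then modified
print(pool.disk_reads)       # 2
```

Four operations cost only two disk reads here, because every access after the first one to a page is served from memory.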

Okay, now let's get back to the question we started with. Why is disk operation slow, but SQL execution is not slow? I believe you probably know it by now.

The existence of the buffer pool greatly reduces the overhead caused by disk I/O. If the data page where the data row to be operated is located exists in the cache, it does not need to be read from the disk. This way you can get the result quickly after execution.

Buffer pool pre-reading mechanism

As we can see, the less disk I/O a statement needs, the faster it runs. So is there a better way to handle the disk I/O that loading data pages makes unavoidable? If it cannot be avoided altogether, reducing the number of disk I/O operations is still an option, right?

This is the "read-ahead" (pre-reading) feature of MySQL that we are going to talk about. It is a way for InnoDB to optimize I/O by fetching multiple data pages into the buffer pool in advance, before they are actually requested. Disk reads and writes work in pages (think of a page as a fixed-size block of data, 16 KB by default in InnoDB), so at least one full page is read each time. If the data needed next is in a page that has already been read, there is no need to go to disk again, which reduces disk I/O.
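
To get a feel for why fetching pages in advance helps, here is a deliberately simplified sketch. The function scan and its read_ahead parameter are hypothetical; InnoDB's real read-ahead is triggered by access patterns and is more involved than "always fetch the next few pages".

```python
def scan(pages_to_read, read_ahead):
    """Count the disk requests needed to read the given pages in order.

    read_ahead is a made-up knob for this sketch: how many following pages
    are fetched together with a missing page in the same request.
    """
    cached = set()
    requests = 0
    for page_no in pages_to_read:
        if page_no in cached:
            continue                                   # already in memory
        requests += 1                                  # one trip to disk
        cached.update(range(page_no, page_no + 1 + read_ahead))
    return requests


sequential = list(range(100, 164))        # 64 consecutive pages
print(scan(sequential, read_ahead=0))     # 64
print(scan(sequential, read_ahead=7))     # 8
```

For a sequential scan the number of disk requests drops from 64 to 8 in this model. The benefit depends entirely on the prefetched pages actually being used afterwards.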

You can check the page size from the MySQL command line, for example with SHOW VARIABLES LIKE 'innodb_page_size'; which reports the value in bytes (16384 by default).

Buffer pool space management

You may wonder: if the buffer pool is such a great thing, why not put all the data in it? It would be wonderfully fast, whereas going to disk is as slow as an old ox pulling a cart.

Hey, brother, wake up. Even setting aside the fact that memory is volatile, the buffer pool has a size limit. Then you may be puzzled again: since the buffer pool is limited in size, how are the data pages I read in managed? If other data pages have already taken up the space, where do mine go?

This brings us to the space management of the buffer pool. The key is deciding how pages are arranged in the pool and which pages are evicted, according to some policy, so that the pool never "overflows" while frequently used data stays in it.

Traditional LRU elimination method

The buffer pool manages cache pages with a scheme based on LRU. Let's first look at how traditional LRU would manage them.

LRU stands for Least Recently Used. As the name suggests, when space is needed, the page that has gone unused for the longest time is evicted first.

There are two situations here:

(1) The cache page is already in the buffer pool

In this case, the cache page is moved to the head of the LRU linked list. Nothing has to be read from disk and no other cache page has to be evicted.

For example, if the data to be accessed is in page 6, page 6 is simply moved to the head of the linked list, and no cache page is evicted.

(2) The cache page is not in the buffer pool

In this case, the corresponding data page has to be read from disk and placed at the head of the linked list, and the cache page at the tail is evicted.

For example, if the data to be accessed is in page 60 and page 60 is not in the buffer pool, it is loaded and placed at the head of the linked list, and page 17 at the tail is evicted.
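
Both situations can be reproduced with a minimal sketch of a plain LRU list built on Python's collections.OrderedDict. Pages 6, 17 and 60 come from the example above; the class name LRUList, the capacity of 5 and the other page numbers are assumptions for illustration.

```python
from collections import OrderedDict


class LRUList:
    """Plain LRU: the first key is the head (most recent), the last key is the tail."""

    def __init__(self, capacity, pages=()):
        self.capacity = capacity
        self.pages = OrderedDict((p, True) for p in pages)   # given in head-to-tail order

    def access(self, page_no):
        """Access a page and return the evicted page number, or None."""
        evicted = None
        if page_no not in self.pages:
            if len(self.pages) >= self.capacity:
                evicted, _ = self.pages.popitem(last=True)    # drop the tail
            self.pages[page_no] = True                        # "load from disk"
        self.pages.move_to_end(page_no, last=False)           # move to the head
        return evicted

    def order(self):
        return list(self.pages)


lru = LRUList(capacity=5, pages=[3, 9, 6, 25, 17])
evicted_on_hit = lru.access(6)      # case (1): page 6 is already cached
print(evicted_on_hit, lru.order())  # None [6, 3, 9, 25, 17]
evicted_on_miss = lru.access(60)    # case (2): page 60 must be loaded
print(evicted_on_miss, lru.order()) # 17 [60, 6, 3, 9, 25]
```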

Doesn't that look simple, and doesn't it seem to do the job of evicting cache pages from the buffer pool? But consider two problems:

Pre-reading failure

As mentioned above, the buffer pool's read-ahead mechanism may preload adjacent data pages. Suppose two adjacent pages, 20 and 21, are loaded, but only page 20 is ever accessed. Both pages now sit at the head of the linked list, and to make room for them the pages at the tail were evicted, even though those evicted pages may have been accessed frequently. This is a read-ahead failure: a page preloaded into the buffer pool is never used, yet it pushed out useful data. Isn't that unreasonable?

Buffer pool pollution

There is another situation: when a SQL statement scans a large amount of data or performs a full table scan, a large number of data pages are loaded into the buffer pool and can push out all the pages that were already there, including the hot ones. This is also unreasonable. It is called buffer pool pollution, and it can cause MySQL performance to drop sharply, because the evicted hot pages must then be read from disk again.
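
A small simulation makes the pollution problem concrete. It replays a workload against a plain LRU list; the function run_lru, the capacity of 8 and all page numbers are made up for the sketch.

```python
from collections import OrderedDict


def run_lru(capacity, accesses):
    """Replay page accesses against a plain LRU list and return the cached pages."""
    lru = OrderedDict()                         # first key = head, last key = tail
    for page_no in accesses:
        if page_no not in lru:
            if len(lru) >= capacity:
                lru.popitem(last=True)          # evict the tail
            lru[page_no] = True
        lru.move_to_end(page_no, last=False)    # every access goes to the head
    return set(lru)


hot = [1, 2, 3, 4]                                   # pages that are used all the time
workload = hot * 10 + list(range(1000, 1008))        # then one scan touching 8 new pages
cached = run_lru(capacity=8, accesses=workload)
print(sorted(cached))        # [1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007]
print(cached & set(hot))     # set()
```

Although the hot pages were accessed ten times each and the scanned pages only once, a single scan is enough to push every hot page out of the list.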

Separation of hot and cold data

It seems that traditional LRU alone cannot meet the space management requirements of the buffer pool. Therefore, MySQL uses a design based on LRU that separates hot and cold data.

That is, the LRU linked list is divided into two parts, one for the hot data area and the other for the cold data area.

When a data page is first loaded into the buffer pool, it is placed at the head of the cold data area. Only if the page is accessed again after it has been there for 1 second (the default of the innodb_old_blocks_time parameter, which is specified in milliseconds) is it moved to the head of the hot data area.

You may wonder why we have to wait 1 second before moving it to the hot data area. Think about it: what if a data page is accessed right after it is loaded into the cold data area and then never again? Promoting it would waste space in the hot data area. And if a page is not accessed again after that 1 second, it probably will not be accessed often in the future, so there is no need to move it to the hot data area. When free cache pages run short, pages are simply evicted from the tail of the cold data area.
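
The rule can be sketched with two Python lists. This is a simplified model under stated assumptions, not InnoDB's implementation: the class HotColdLRU, its capacities, and the choice to measure the waiting time from the moment a page is loaded are all made up for illustration, and time is passed in explicitly so the example is deterministic.

```python
class HotColdLRU:
    """Toy two-area LRU; index 0 of each list is the head."""

    def __init__(self, hot_capacity, cold_capacity, old_blocks_time=1.0):
        self.hot, self.cold = [], []
        self.hot_capacity = hot_capacity
        self.cold_capacity = cold_capacity
        self.old_blocks_time = old_blocks_time   # seconds, mirrors the 1 s in the text
        self.loaded_at = {}                      # page_no -> time it entered the cold area

    def access(self, page_no, now):
        if page_no in self.hot:
            self.hot.remove(page_no)
            self.hot.insert(0, page_no)
        elif page_no in self.cold:
            if now - self.loaded_at[page_no] >= self.old_blocks_time:
                self.cold.remove(page_no)            # stayed long enough: promote
                del self.loaded_at[page_no]
                self.hot.insert(0, page_no)
                if len(self.hot) > self.hot_capacity:
                    self._insert_cold(self.hot.pop(), now)   # hot tail falls back to cold
            # otherwise it was accessed too soon after loading and stays put
        else:
            self._insert_cold(page_no, now)          # new pages always start cold

    def _insert_cold(self, page_no, now):
        self.cold.insert(0, page_no)
        self.loaded_at[page_no] = now
        if len(self.cold) > self.cold_capacity:
            del self.loaded_at[self.cold.pop()]      # evict from the cold tail


pool = HotColdLRU(hot_capacity=4, cold_capacity=4)
for p in [1, 2, 3, 4]:
    pool.access(p, now=0.0)          # loaded into the cold area
for p in [1, 2, 3, 4]:
    pool.access(p, now=2.0)          # accessed again after more than 1 s: promoted
for p in range(1000, 1008):          # a scan: each page is touched twice immediately
    pool.access(p, now=3.0)
    pool.access(p, now=3.0)
print(pool.hot)                      # [4, 3, 2, 1]
print(pool.cold)                     # [1007, 1006, 1005, 1004]
```

The scanned pages only compete with each other in the cold area, so the hot pages survive the same kind of scan that wipes out a plain LRU list.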

Another case: when a data page is already in the hot data area, should it be moved to the head of the linked list every time it is accessed? You probably already sense that this is unreasonable. Pages in the hot data area are accessed frequently by definition; if every access moved a page to the head, the list would be reshuffled constantly, which is pure overhead.

What should we do then? MySQL optimizes this: a page in the last 3/4 of the hot data area is moved to the head of the linked list when it is accessed, while a page in the first 1/4 is left where it is.
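
As a rough sketch of this rule, the helper below only moves a page when it sits outside the first quarter of the hot list. The function name touch_hot and the page numbers are hypothetical, and the quarter boundary is computed with simple integer division for the sake of the example.

```python
def touch_hot(hot, page_no):
    """Access a page in the hot list (index 0 is the head).

    The page is moved to the head only if it is outside the first quarter.
    """
    position = hot.index(page_no)
    if position >= len(hot) // 4:        # in the last 3/4: worth moving
        hot.pop(position)
        hot.insert(0, page_no)
    return hot                           # in the first 1/4: left untouched


hot = [10, 11, 12, 13, 14, 15, 16, 17]
touch_hot(hot, 11)     # position 1 is in the first quarter: no move
print(hot)             # [10, 11, 12, 13, 14, 15, 16, 17]
touch_hot(hot, 16)     # position 6 is in the last 3/4: moved to the head
print(hot)             # [16, 10, 11, 12, 13, 14, 15, 17]
```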

Okay, that’s all about the buffer pool. In this issue, we discussed why the buffer pool can speed up SQL execution, and also discussed how to manage the buffer pool space. Welcome to discuss in the comment section.

Summary

Application of buffer pool

The buffer pool greatly reduces the overhead caused by disk I/O. Because the data pages containing the rows being read or modified are loaded into the buffer pool, later operations on them are served from memory, which speeds up SQL execution.

Buffer pool pre-reading mechanism

To reduce disk I/O, InnoDB fetches multiple data pages into the buffer pool in advance. This is called read-ahead (pre-reading).

Buffer pool space management

  • Traditional LRU leads to two problems in the buffer pool: read-ahead failure and buffer pool pollution. On its own, it is therefore not suitable for buffer pool space management.
  • Building on LRU, MySQL separates hot and cold data by dividing the LRU linked list into two parts, a hot data area and a cold data area, which addresses both read-ahead failure and buffer pool pollution.
