Detailed explanation of InnoDB architecture and features (summary of InnoDB storage engine reading notes)

Background Threads

•Master Thread

The core background thread is mainly responsible for asynchronously refreshing the buffer pool data to the disk. For example, refreshing dirty pages, merging insert buffers, recycling undo pages, etc.

Operations once per second:

1. The log buffer is flushed to disk, even if the transaction has not been committed. This operation always occurs, so that no matter how large the transaction is, the commit time is very short.

2. When the IO pressure is very small (the number of IOs occurring within 1 second is less than 5% of innodb_io_capacity), merge the insert buffer of 5% of innodb_io_capacity.

3. When the dirty page ratio is greater than innodb_max_dirty_pages_cnt, flush the dirty pages in innodb_io_capacity buffer pools to disk. Otherwise, if innodb_adaptive_flush is turned on, the appropriate number of dirty pages to be refreshed is selected based on buf_flush_get_desired_flush_rate.

Operation every 10 seconds:

1. If the IO operation in the past 10 seconds is less than innodb_io_capacity, flush the dirty pages in innodb_io_capacity buffer pools to disk.

2. Merge 5% innodb_io_capacity insert buffers.

3. Flush the log buffer to disk.

4. Delete useless undo pages.

5. If the proportion of dirty pages in the buffer pool exceeds 70%, refresh innodb_io_capacity dirty pages to disk again. Otherwise flush 10% innodb_io_capacity dirty pages.

Background loop (when the database is idle or closed):

1. Delete useless undo pages.

2. Merge innodb_io_capacity insert buffers.

Flush loop (database idle):

1. Refresh innodb_io_capacity dirty pages

•IO Thread

The Innodb storage engine makes extensive use of AIO, and the IO Thread is mainly responsible for the callback of IO requests. This can be adjusted using the innodb_read_io_threads and innodb_write_io_threads parameter lists.

•Purge Thread

After the transaction is committed. The undolog associated with this transaction may no longer be needed. Purge Thread is used to recycle unnecessary undo pages.

•PageCleaner Thread

Responsible for refreshing dirty pages. Reduce the work of the master thread and the blocking of user query threads.

Memory buffer pool

For modification operations on pages in the database, the pages in the buffer pool are first modified and then refreshed to disk at a certain frequency. This means that instead of triggering a flush back to disk every time a page in the buffer pool is modified, it is flushed back to disk through the checkpoint technology. The size of the buffer pool can be configured through innodb_buffer_pool_size.

The data page types of the buffer pool include: data pages, index pages, undo pages, insert buffers, adaptive hash indexes, lock information stored innodb, and dictionary information.

The InnoDB storage engine now allows multiple buffer pool instances. This reduces lock contention by hashing to different buffer pool instances. This parameter can be set via innodb_buffer_pool_instance.

The buffer pool is a large memory area that is managed by the database using the LRU algorithm. But considering the operation of full table scan. Therefore, the simple LRU algorithm is not adopted. The midpoint position added to the LRU list. The newly read page is not placed directly at the head of the lru list, but at the midpoint position. By default, at 5/8 of the lru list length. Controlled by the parameter innodb_old_blocks_pct.

Insert Buffer

For insert and update operations of non-clustered indexes, the Innodb storage engine does not insert directly into the index page, but into the Insert Buffer. Then the insertbuffer and auxiliary index leaf nodes are merged at a certain frequency. This often combines multiple random insertions into a single operation. Greatly improved performance of non-clustered index inserts.

Innodb uses insertbuffer conditions:

• The index is a non-clustered index

•The index is not unique (if it is unique, you need to find the index to ensure uniqueness)

Insert Buffer Internal Implementation

The data structure of Insert Buffer is a B+ tree. After MySQL 4.1, there is only one B+ tree globally, which is responsible for inserting buffers for auxiliary indexes of all tables. In addition, this tree is stored in a shared tablespace, which is ibdata1 by default. Therefore, if you only use the independent tablespace ibd file to restore the table data, it may fail. It is also necessary to restore the auxiliary indexes on the table through the insert buffer data.

The non-leaf nodes of the Insert Buffer store the query key, which is constructed as space (4 bytes) + marker (1 byte) + offset (4 bytes). space indicates the table space ID of the table where the record is located, and offset indicates the page offset. The marker is used to be compatible with the old version.

Insert Buffer leaf points are constructed like space + marker + offset + metadata + records. space, marker, offset have the same meanings as above. The IBUF_REC_OFFSET_COUNT in the metadata stores a two-byte integer, which is used to sort the order in which records enter the Insert Buffer. By visiting this sequence again, we can get the correct value of the record. Starting from the 5th column of the Insert Buffer leaf node, the actual records inserted are shown.

After enabling the Insert Buffer index, records of auxiliary index pages may be inserted into the Insert Buffer B+ tree. In order to ensure that each merge insert into the buffer is successful, there must be a place to mark the available space of each auxiliary index page. Insert Buffer is marked with a special page whose type is Insert Buffer Bitmap. Each Insert Buffer Bitmap page is used to track 16384 pages, or 256 zones. Each Insert Buffer Bitmap page is on the second page of the 16384 pages. Each auxiliary index page occupies 4 bytes in the bit map, which is mainly used to indicate the available number of auxiliary index pages.

Merge Insert Buffer

Records in the Insert Buffer are merged into the real auxiliary index in the following cases:

• Auxiliary index pages are read into the buffer pool;

• When the Insert Buffer Bitmap page tracks that there is no free space on the auxiliary index page;

• Master Thread scheduling;

In this way, multiple record operations on the auxiliary index page are merged into the original auxiliary index page through one operation, thereby improving performance.

Double Write

InsertBuffer improves the performance of the Innodb storage engine, while the two writes improve the reliability of the data pages.

You may wonder, if a write failure occurs, can't it be recovered through the redo log? This is indeed a solution, but you must know that the redo log records the physical operations of the page, such as offset 800, writing the 'aaa' record. However, if the page is already corrupted, there is no point in redoing it. This means that before modifying a page, there must be a correct copy of the page. When a write failure occurs, the page is first restored through the copy of the page and then redone. This is a double write.

Double write consists of two parts, one part is the double write buffer in the memory. The other part is 128 consecutive pages in the shared table space on the physical disk, which is the same size as in memory (2M). When refreshing pages in the buffer pool, they are not written directly to disk, but memcpy to the double write buffer. Then, 1M data is written to the shared tablespace in two batches through the double write buffer, and then fsync is called immediately to synchronize the disk. This write overhead is not very large because the double write pages of the shared tablespace are continuous. After completing the writing of the double write page, writing the pages in the double write buffer to each table space is a discrete write.

If the operating system crashes while writing the page to disk. Then, during recovery, a copy of the page can be found from the double write buffer page in the shared tablespace. Copy it to the tablespace and then apply the redo log.

Adaptive HASH Index

The Innodb storage engine monitors queries on each index page on the table and may find that creating a hash index can improve the query speed. Then a hash index is created, called adaptive hash index (AHI).

AHI has a requirement that the consecutive access patterns to this page must be the same. For example, for a joint index like (a, b), the access pattern can be enabled as follows:

WHERE a = xxx

WHERE a = xxx and b = yyy

The same access mode means the same query conditions. If the above query operations are performed alternately. AHI will not be established.

In addition, AHI requires that if the same pattern is accessed 100 times, the page is accessed N times through the pattern, where N = records in the page * 1/16

Refresh neighboring pages

When refreshing a dirty page, the Innodb storage engine will check all pages in the area where the page is located. If it is a dirty page, it will be refreshed together.

Asynchronous IO

Innodb uses asynchronous IO to handle disk operations.

Check Point Technology

In order to avoid data loss, transaction database systems generally adopt a write ahead log strategy. That is, when a transaction is committed, the redo log is written first, and then the page is modified.

However, the redo log cannot grow infinitely, and the buffer value (dirty pages that have not been flushed to disk) cannot be infinite. Even if it can be infinitely large, it will take a long time to recover after a database crash. So Check Point technology is needed, which can solve:

• Reduce database recovery time;

• When the buffer pool is insufficient, dirty pages can be flushed to disk;

• When the redo log is unavailable (redo logs are recycled), flush dirty pages to disk;

When the database is restarted after a crash, there is no need to redo all the logs. Because the pages before the Check Point have been refreshed to disk, the database only needs to recover the redo logs after the Check Point. This significantly shortens recovery time.

For Innodb, the versions are actually compared by LSN (Log Sequence Number). The LSN is an 8-byte number. Each page has an LSN, the redo log has an LSN, and the CheckPoint also has an LSN. This can be observed by the following command

mysql> show engine innodb status\G;

.............

Log sequence number 92561351052

Log flushed up to 92561351052

Last checkpoint at 92561351052

The above detailed explanation of InnoDb system architecture and features (Summary of Innodb storage engine reading notes) is all the content that the editor shares with you. I hope it can give you a reference. I also hope that you will support 123WORDPRESS.COM.

You may also be interested in:

Mysql5.5 InnoDB storage engine configuration and optimization
MySQL Storage Engine Summary
A brief analysis of the advantages and disadvantages of choosing InnoDB and MyISAM as MySQL storage engines
The difference between MySQL storage engines InnoDB and MyISAM
InnoDB storage engine changes table shared space to independent space
In-depth discussion: Comparison of MySQL database MyISAM and InnoDB storage engines

<<: Solve the problem that shutting down Tomcat using shutdown.bat will shut down other Tomcats

>>: JavaScript imitates Jingdong carousel effect

Mysql SQL statement operation to add or modify primary key

Detailed explanation of InnoDB architecture and features (summary of InnoDB storage engine reading notes)

Mysql SQL statement operation to add or modify primary key

Summary of MySQL composite indexes

Detailed process of installing and configuring MySQL and Navicat prenium

How to use the Marquee tag in XHTML code

How to use the EXPLAIN command in SQL

Solution to the problem of Windows Server 2008 r2 server automatically restarting for no reason

Why web page encoding uses utf-8 instead of gbk or gb2312?

MAC+PyCharm+Flask+Vue.js build system

The difference and usage between div and span

The difference between Input's size and maxlength attributes

Recommend

Detailed explanation of :key in VUE v-for

Docker file storage path, modify port mapping operation mode

In-depth understanding of CSS @font-face performance optimization

MySQL 8.0.15 winx64 decompression version installation and configuration method graphic tutorial

How to install MySQL 5.7 on Ubuntu and configure the data storage path

Example code for implementing a QR code scanning box with CSS

JS+AJAX realizes the linkage of province, city and district drop-down lists

Specific use of the wx.getUserProfile interface in the applet

Detailed explanation of the TARGET attribute of the HTML hyperlink tag A

Talk about the 8 user instincts behind user experience in design

MySQL foreign key (FOREIGN KEY) usage case detailed explanation

Summary of shell's method for determining whether a variable is empty

HTML meta viewport attribute detailed description

A brief introduction to MySQL storage engine

How to use cookies to remember passwords for 7 days on the vue login page