PrefaceWhen it comes to database transactions, a lot of transaction-related knowledge will easily pop up in everyone's mind, such as the ACID characteristics of transactions, isolation levels, problems solved (dirty reads, non-repeatable reads, phantom reads), etc., but few people may really understand how these transaction characteristics are implemented and why there are four isolation levels. In the previous article, we have learned about the implementation principle of transaction isolation in MySQL. Today, let’s continue to talk about the implementation principle of MySQL persistence. Of course, MySQL is extensive and profound, and omissions in the article are inevitable. Criticisms and corrections are welcome. illustrate The transaction implementation logic of MySQL is located at the engine layer, and not all engines support transactions. The following instructions are based on the InnoDB engine. InnoDB reading and writing data principleBefore we continue, we need to understand how InnoDB reads and writes data. We know that database data is stored on disks, and we also know that the cost of disk I/O is very high. If the disk must be accessed every time data is read or written, the efficiency of the database will be very low. To solve this problem, InnoDB provides Buffer Pool as a buffer for accessing database data. The Buffer Pool is located in memory and contains a mapping of some data pages on disk. When data needs to be read, InnoDB will first try to read it from the Buffer Pool. If it cannot read the data, it will read it from the disk and put it into the Buffer Pool. When writing data, it will first write to the Buffer Pool page, mark such page as dirty, and put it on a special flush list. These modified data pages will be flushed to the disk at some point in the future (this process is called dirty flushing, which is the responsibility of other background threads). As shown in the following figure: The advantage of this design is that it can convert a large amount of disk I/O into memory reading and writing, and merge multiple modifications to a page into one I/O operation (flushing the entire page once it is dirty), avoiding access to the disk for each read and write operation, thereby greatly improving the performance of the database. Persistence DefinitionPersistence means that once a transaction is committed, its changes to the database should be permanent, and subsequent operations or failures should not have any impact on the modifications of this transaction. From the previous introduction, we know that InnoDB uses Buffer Pool to improve the performance of reading and writing. However, the Buffer Pool is in memory and is volatile. If MySQL suddenly crashes after a transaction is committed, and the modified data in the Buffer Pool has not been refreshed to the disk at this time, data will be lost and the persistence of the transaction cannot be guaranteed. To solve this problem, InnoDB introduced redo log to achieve persistence of data modifications. When data is modified, InnoDB not only modifies the data in the Buffer Pool, but also records the operation in the redo log and ensures that the redo log is written to the disk earlier than the corresponding page (usually when the transaction is committed), which is often called WAL. If MySQL crashes suddenly and the data has not been flushed back to disk, after restarting, MySQL will use the redo log that has been written to disk to recover the data pages that have not been flushed to disk. Implementation principle: redo logTo improve performance, similar to data pages, redo logs also consist of two parts: one is the log buffer in memory, which is volatile; the other is the redo log file on the disk, which is persistent. Redo log is a physical log that records the status of physical pages in the database. When data is modified, InnoDB not only modifies the data in the Buffer Pool, but also records the operation in the redo log buffer. When the transaction is committed, the redo log buffer is flushed to disk and recorded in the redo log file. If MySQL crashes, you can read the data in the redo log file when it restarts to restore the database. This way, there is no need to flush the data in real time every time a transaction is submitted. Writing processNote:
benefitWhen a transaction is committed, writing to the redo log has three main advantages over directly flushing the log: Flushing is random I/O, but writing redo logs is sequential I/O. Sequential I/O is much faster than random I/O and is not necessary. Should I write redo log first or modify data first?A DML operation may involve data modification and redo log recording. What is the order in which they are executed? Some articles on the Internet say that data should be modified first and redo log should be recorded later, while others say that redo log should be recorded first and data should be modified later. So what is the real situation? First of all, from the above description, we know that the redo log buffer will be written to the redo log file when the transaction is committed, and the flushing is at some subsequent time, so it is certain that the redo log is recorded first and the data page is modified later (of course, the WAL log is written first). The next question is whether to write the redo log buffer first or modify the Buffer Pool first. To understand this problem, we first need to understand the execution process of a DML in InnoDB. The execution process of a DML involves data modification, locking, unlocking, redo log recording, and undo log recording, which also need to ensure atomicity. InnoDB uses MTR (Mini-transactions) to ensure the atomicity of a DML operation. First, let’s look at the definition of MTR:
MTR is a short atomic operation that cannot be rolled back because it is atomic itself. Changes to data pages must go through MTR, which records changes to data pages caused by DML operations in the redo log. Let’s take a quick look at the MTR process:
It can be seen from this that InnoDB modifies the Buffer Pool first and then writes the redo log buffer. The process of recovering dataIn any case, InnoDB will attempt to perform recovery operations when it starts. During the recovery process, redo log is required, and if binlog is enabled, binlog and undo log are also required. Because it is possible that the data has been written to the binlog, but the database crashed before the redo log was flushed to disk (transactions are a feature of the InnoDB engine, and modified data may not be committed, while binlog is a feature of the MySQL service layer, and modified data will be recorded). At this time, the redo log, binlog, and undo log are required to determine whether there are any uncommitted transactions, and to roll back or commit the uncommitted transactions. The following is a brief description of the process of restoring data using only redo logs:
What is LSN?LSN, also known as log sequence number, is a monotonically increasing 64-bit unsigned integer. Both redo log and data pages store LSN, which can be used as a basis for data recovery. A larger LSN indicates that the change described by the referenced log record occurred later. What is Checkpoint?Checkpoint represents a save point, and all changes to the data page before this point (log LSN < Checkpoint LSN) have been written to the disk file. InnoDB records a Checkpoint after each disk flush and records the latest redo log LSN in the Checkpoint LSN, which is convenient for determining the starting point when restoring data. The above is a detailed explanation of the persistence implementation principle of transactions in MySQL. For more information about the persistence of MySQL transactions, please pay attention to other related articles on 123WORDPRESS.COM! You may also be interested in:
|
<<: Full analysis of web page elements
>>: Vue realizes screen adaptation of large screen pages
When encapsulating Vue components, I will still u...
The emergence of jQuery has greatly improved our ...
Copy code The code is as follows: <html> &l...
Menu bar example 1: Copy code The code is as foll...
Recently, the company happened to be doing live b...
MYSQL commonly used query commands: mysql> sel...
I believe that many users who make websites will ...
Table of contents Deploy tomcat 1. Download and d...
Table of contents 1. Use default parameters inste...
Table of contents Jenkins installation Install Ch...
It is very simple to install Scala environment in...
Nginx global variables There are many global vari...
Today I got familiar with the mouse zooming effect...
filter is generally used to filter certain values...
1. Download VMware Workstation 64 version https:/...