The two most important logs in MySQL are the redo log and the binlog (binary log): the binlog belongs to the MySQL Server layer, while the redo log belongs to the InnoDB storage engine layer.

1 Redo log

1.1 What is the redo log

The redo log exists to guarantee the durability of transactions, the D in ACID. Durability: once a transaction is committed, the changes it makes to the data in the database are permanent and must survive a subsequent database failure. There are two kinds of redo logs, physical and logical. In InnoDB, the redo log is in most cases a physical log, recording the physical changes made to data pages (the actual data values).

1.2 Function of the redo log

The main function of the redo log is to recover data after a database crash.

1.3 Composition of the redo log

The redo log has two parts: the redo log buffer, held in memory, and the redo log files, stored on disk.

1.4 When redo logs are recorded

After a data modification completes, the change is written to the redo log buffer before the dirty page is flushed to disk: modify first, then log. (A dirty page is a page in memory that is inconsistent with its copy on disk; dirty does not mean corrupted.) The redo log buffer is written out to the redo log files on disk in the following cases:

1. when a transaction commits;
2. once per second, by a background thread;
3. when more than half of the log buffer is in use;
4. when a checkpoint occurs.
The first and fourth cases always trigger the write; whether the second and third actually flush to disk depends on the relevant server parameter settings. Creating an index also records redo log.

1.5 An example of the whole redo process

Take an update transaction as an example.
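The "modify first, then log" ordering and the flush triggers above can be sketched as a toy model. The class and function names here are illustrative, not InnoDB's actual code; the redo "file" is a temporary file, and flush() uses the same fsync system call discussed in section 1.6:

```python
import os, tempfile

class RedoLogBuffer:
    """Toy redo log buffer: records accumulate in memory and are
    flushed to the log file at commit, at a checkpoint, or when the
    buffer is more than half full."""
    def __init__(self, capacity=16):
        self.records = []
        self.capacity = capacity
        self.fd, self.path = tempfile.mkstemp()

    def append(self, record):
        self.records.append(record)
        if len(self.records) > self.capacity // 2:
            self.flush()                   # buffer more than half full

    def flush(self):                       # also called at commit/checkpoint
        for page_id, value in self.records:
            os.write(self.fd, f"{page_id}={value}\n".encode())
        os.fsync(self.fd)                  # durable before flush() returns
        self.records.clear()

buffer_pool = {}                           # page_id -> value (dirty pages)
redo = RedoLogBuffer()

def update(page_id, value):
    buffer_pool[page_id] = value           # 1. modify the page in memory first
    redo.append((page_id, value))          # 2. then record the change in the log

update(1, "a")
update(2, "b")
redo.flush()                               # a commit forces the log to disk
with open(redo.path) as f:
    log_on_disk = f.read()
os.close(redo.fd)
os.remove(redo.path)
print(log_on_disk)
```

Note that the dirty pages in `buffer_pool` were never written to disk here; only the log was made durable, which is exactly what allows the pages to be flushed lazily later.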
1.6 Guaranteeing durability

1.6.1 The Force Log at Commit mechanism

The Force Log at Commit mechanism is what implements transaction durability. While a transaction operates in memory, its log records are written to the redo log buffer; before the transaction is allowed to commit, all of its log records must first be written to the redo log files on disk. To guarantee that every record really reaches the redo log file, an fsync system call is used to make sure the log sitting in the OS buffer is fully written to the log file on disk. The fsync system call takes a file descriptor (fd) as input and operates on the file that fd points to; it does not return until the disk write has completed, so when it returns successfully, the data is guaranteed to be safely on disk. That is why fsync suits programs such as databases.

1.6.2 The innodb_flush_log_at_trx_commit parameter

InnoDB provides the parameter innodb_flush_log_at_trx_commit to control the log flushing strategy at commit time. It takes three values: 0, 1, and 2.
innodb_flush_log_at_trx_commit = 1 (the default): the log is written and flushed to disk on every commit. No committed data is lost even if the system crashes, but because every commit writes to disk, IO performance is the worst of the three.
innodb_flush_log_at_trx_commit = 0: the log stays in the in-memory log buffer and is written and flushed to disk once per second. This is effectively a user-maintained buffer in memory, which reduces traffic into the OS buffer and performs best, but if the system crashes between flushes, up to one second of data is lost.
innodb_flush_log_at_trx_commit = 2: every commit writes the log into the OS buffer, and fsync() is called once per second to push the OS buffer into the log file on disk. Even when fsync is not called, the OS gradually writes the buffered data to disk on its own, so a system crash loses less data than with 0. At the same time, because every commit writes into the OS buffer, performance is worse than with 0, but still better than with 1. With either 0 or 2, some committed data can be lost on a crash.

1.6.3 A small performance test

The performance gap between the settings is huge. Here is a simple test.

```sql
-- Create a test table
drop table if exists test_flush_log;
create table test_flush_log(id int, name char(50)) engine=innodb;

-- Create a stored procedure that inserts a given number of rows,
-- committing after every single insert
drop procedure if exists proc;
delimiter $$
create procedure proc(i int)
begin
    declare s int default 1;
    declare c char(50) default repeat('a',50);
    while s<=i do
        start transaction;
        insert into test_flush_log values(null,c);
        commit;
        set s=s+1;
    end while;
end$$
delimiter ;
```

One hundred thousand rows are inserted below.

Ⅰ With innodb_flush_log_at_trx_commit = 1 (the default):

```sql
test> call proc(100000)
[2021-07-25 13:22:02] completed in 27 s 350 ms
```

It takes a full 27.35 s.

Ⅱ With innodb_flush_log_at_trx_commit = 2:

```sql
test> set @@global.innodb_flush_log_at_trx_commit=2;
test> truncate test_flush_log;
test> call proc(100000)
[2021-07-25 13:27:33] completed in 5 s 774 ms
```

It takes only 5.774 s, a large improvement.

Ⅲ With innodb_flush_log_at_trx_commit = 0:

```sql
test> set @@global.innodb_flush_log_at_trx_commit=0;
test> truncate test_flush_log;
test> call proc(100000)
[2021-07-25 13:30:34] completed in 3 s 537 ms
```

It takes only 3.537 s, better still. Clearly, performance is very poor with the default value of 1. Changing it to 0 or 2 greatly improves performance, but seriously weakens safety. Alternatively, we can modify the stored procedure: move the transaction start and commit outside the loop and commit everything at once, reducing the IO frequency.
```sql
drop procedure if exists proc;
delimiter $$
create procedure proc(i int)
begin
    declare s int default 1;
    declare c char(50) default repeat('a',50);
    start transaction;
    while s<=i do
        insert into test_flush_log values(null,c);
        set s=s+1;
    end while;
    commit;
end$$
delimiter ;
```

1.6.4 Mini-transactions

A mini-transaction (mtr) is InnoDB's mechanism for processing small internal modifications. It guarantees the consistency of the data within a data page under concurrent operations and database exceptions. Mini-transactions must follow three protocols: the FIX rule, write-ahead logging, and force-log-at-commit.
Write-ahead logging can be enforced at different strictness levels. The strictest is full-sync: fsync guarantees the record is flushed to disk before returning, maximizing data safety. The middle level is write-only, which guarantees the record has been handed to the operating system; such data survives a process-level crash but not an OS crash. The least strict is no-sync, which keeps records in a memory buffer with no guarantee of reaching the file system. Force-log-at-commit requires that all of a transaction's mini-transaction logs be flushed to disk when the transaction commits.

1.7 How the redo log is written

The figure above shows how the redo log is written into the log buffer. Each mini-transaction corresponds to a single DML operation, such as an update statement.
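The three write-ahead-log strictness levels described in 1.6.4 (no-sync, write-only, full-sync) can be sketched as flush policies over one file. This is a toy Logger assuming POSIX file semantics, not InnoDB code:

```python
import os, tempfile

class Logger:
    """Toy logger demonstrating the three durability levels."""
    def __init__(self, path):
        self.buf = []                       # process-level memory buffer
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)

    def log(self, record, level):
        if level == "no-sync":
            self.buf.append(record)         # lost even on a process crash
            return
        os.write(self.fd, record)           # in the OS page cache:
        if level == "full-sync":            #   survives a process crash
            os.fsync(self.fd)               # on disk: survives an OS crash

fd, path = tempfile.mkstemp()
os.close(fd)
lg = Logger(path)
lg.log(b"a", "no-sync")
lg.log(b"b", "write-only")
lg.log(b"c", "full-sync")
os.close(lg.fd)
with open(path, "rb") as f:
    data = f.read()
os.remove(path)
print(data)   # the no-sync record b"a" never left process memory
```

Only the write-only and full-sync records reach the file; the difference between those two is not visible in the file contents but in what survives an OS crash or power loss.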
1.8 Log blocks

The redo log is stored in blocks of 512 bytes each. Whether in the in-memory redo log buffer, in the operating system buffer, or in the redo log files, the log is always stored in these 512-byte blocks. Each log block header consists of four fields: LOG_BLOCK_HDR_NO (the block number), LOG_BLOCK_HDR_DATA_LEN (the number of bytes of data written in the block), LOG_BLOCK_FIRST_REC_GROUP (the offset of the first log record group beginning in this block), and LOG_BLOCK_CHECKPOINT_NO (the checkpoint number).
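The arithmetic of the 512-byte blocks can be made concrete. The 12-byte header and 4-byte trailer sizes below are InnoDB's documented block layout, which leaves 496 usable data bytes per block; the helper function itself is just an illustration:

```python
BLOCK_SIZE = 512
HEADER = 12      # block number, data length, first record offset, checkpoint no
TRAILER = 4      # block checksum
DATA_PER_BLOCK = BLOCK_SIZE - HEADER - TRAILER   # usable bytes per block

def blocks_needed(record_bytes):
    """How many 512-byte log blocks a redo record of this size spans."""
    return -(-record_bytes // DATA_PER_BLOCK)    # ceiling division

print(DATA_PER_BLOCK)        # 496
print(blocks_needed(100))    # 1
print(blocks_needed(1000))   # 3  (1000 = 496 + 496 + 8)
```

This is why a single large redo record can span several blocks, and why each block header needs LOG_BLOCK_FIRST_REC_GROUP to locate the first record that actually begins inside it.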
1.9 Log groups

A log group is a grouping of redo logs, consisting of multiple redo log files of the same size; the number of files is determined by the parameter innodb_log_files_in_group. The group is a logical concept, but its directory can be set with the variable innodb_log_group_home_dir.

2 Undo log

2.1 About the undo log

The purpose of the undo log is to guarantee the atomicity of database transactions. Atomicity means a transaction is an indivisible unit of work: either all of its operations take effect, or none of them do.
Undo logs usually exist in the form of logical logs. Roughly speaking, when a record is deleted, the undo log records a corresponding insert, and vice versa; when a record is updated, a counter-update record is generated. Undo logs are recorded in segments; each undo operation occupies one undo log segment. Writing an undo log also generates redo log, because the undo log itself needs durability protection.

2.2 Undo log segments

To ensure that concurrent transactions do not conflict when writing their respective undo logs, InnoDB manages undo logs in segments. The segment that holds them is called a rollback segment, and each rollback segment contains 1024 undo log segments. MySQL 5.5 and later support 128 rollback segments, which can hold 128 * 1024 concurrent undo operations; the number of rollback segments can be configured via the innodb_rollback_segments variable.

2.3 Purge

MySQL handles operations on clustered-index columns as follows. For a statement such as delete from t where a = 1, if a is the clustered index (primary key), no actual deletion is performed; instead, the record whose primary key equals 1 has its delete flag set to 1, so the record remains in the B+ tree. Similarly, an update does not modify the record in place: the old record is marked as deleted and a new record is created. So when are the old-version records actually removed? InnoDB removes them through the undo log, in what is called the purge operation. InnoDB runs dedicated purge threads for this (their number is configurable via innodb_purge_threads), and each purge thread performs a purge pass every 10 seconds. In InnoDB's undo log design, the undo logs of multiple transactions may share one page, and the storage order of undo logs within a page is arbitrary; InnoDB therefore maintains a history list that links the undo logs in the order in which their transactions committed.
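The "logical inverse" nature of undo records described above can be sketched as a toy rollback. The table and log structures are illustrative, not InnoDB's on-disk format:

```python
# Each change appends its logical inverse to the undo log; rolling
# back replays those inverses in reverse (newest-first) order.
table = {}       # primary key -> row value
undo_log = []    # list of (operation, key, value) inverses

def insert(key, value):
    table[key] = value
    undo_log.append(("delete", key, None))        # inverse of insert

def delete(key):
    undo_log.append(("insert", key, table[key]))  # inverse of delete
    del table[key]

def update(key, value):
    undo_log.append(("update", key, table[key]))  # counter-update
    table[key] = value

def rollback():
    while undo_log:
        op, key, value = undo_log.pop()           # newest first
        if op == "delete":
            del table[key]
        else:                                     # insert or update
            table[key] = value

insert(1, "a")
update(1, "b")
insert(2, "c")
delete(2)
rollback()
print(table)   # {} -- every change was compensated
```

Replaying the inverses newest-first matters: undoing the update of key 1 before undoing its insert would fail, which is why real undo chains are walked from the most recent record backwards.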
During purge, InnoDB first takes the first record needing cleanup from the history list, here trx1. After cleaning it, InnoDB continues to look for cleanable records in the undo page where trx1 resides: it finds trx3, then trx5, but discovers that trx5 is still referenced by other transactions and cannot be cleaned. It therefore returns to the history list and finds that the next record is trx2; it locates the undo page where trx2 resides and cleans trx6 and trx4 in turn. Since every record in undo page 2 has now been cleaned, that page can be reused. This design, searching the history list first and then sweeping within the undo page, avoids large numbers of random reads and so improves the efficiency of purge.

3 InnoDB recovery

3.1 Data page flushing rules and checkpoints

Data in the memory buffer pool that has not yet been flushed to disk is called dirty data. Since both data and logs live in pages, "dirty pages" covers both dirty data pages and dirty log pages. In InnoDB, the checkpoint is the only rule governing flushing data to disk: when a checkpoint is triggered, dirty data in memory is flushed to disk. The InnoDB storage engine has two kinds of checkpoints: the sharp checkpoint, which flushes all dirty pages at once (for example when a redo log file is about to be reused), and the fuzzy checkpoint, which flushes only a portion of the dirty pages at a time. The most common fuzzy checkpoint is:
Master thread checkpoint: controlled by the master thread, which flushes a certain percentage of dirty pages to disk every second or every 10 seconds. Because flushing dirty pages takes time to complete, the position of the checkpoint is recorded in the redo log only after each flush finishes.

3.2 LSN

3.2.1 The LSN concept

LSN stands for log sequence number and occupies 8 bytes in InnoDB. From LSNs we can learn the following information: the version of a data page (how up to date it is), the total amount of redo log written so far, and the position of the checkpoint.
LSNs are stored in two places: in the redo log itself, and in the header of every data page, where each page records the LSN of the last change applied to it.
Clearly, if the LSN stored in a page is smaller than the LSN in the redo log, the page is missing recent changes, and that data would be lost without recovery. You can view the current LSN values with show engine innodb status (in the LOG section of its output).
3.2.2 LSN processing flow

(1) First, the data page is modified in memory and the LSN is recorded in the data page; call it data_in_buffer_lsn.
(2) When the data page is modified (almost simultaneously), a redo record is written to the redo log buffer and the corresponding LSN is recorded; call it redo_log_in_buffer_lsn.
(3) After the log is written to the buffer, whenever one of the log-flushing rules is triggered, the redo log is flushed to the redo log file on disk and the corresponding LSN is recorded in the file; call it redo_log_on_disk_lsn.
(4) Data pages cannot stay in memory forever. In some cases a checkpoint is triggered to flush dirty pages in memory (both data dirty pages and log dirty pages) to disk. When the checkpoint's dirty-page flush completes, the checkpoint's LSN position is recorded in the redo log; call it checkpoint_lsn.
(5) Recording the checkpoint position is fast, just setting a marker, but flushing the data pages may not be: a checkpoint may have many data pages to flush, which takes time. Each data page flushed along the way records its own LSN; call it data_page_on_disk_lsn.

In the figure above, the horizontal lines from top to bottom represent: the timeline, the LSN recorded in the in-buffer data page (data_in_buffer_lsn), the LSN recorded in the on-disk data page (data_page_on_disk_lsn), the LSN recorded in the in-buffer redo log (redo_log_in_buffer_lsn), the LSN recorded in the on-disk redo log file (redo_log_on_disk_lsn), and the LSN recorded by the checkpoint (checkpoint_lsn). Assume that at the start (12:00:00) all log pages and data pages have been flushed to disk and the checkpoint LSN has been recorded; at this moment, all the LSNs are identical.
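The ordering constraints among these LSNs can be captured in a small invariant check. This is a simplification with illustrative names (real InnoDB tracks "pages flushed up to" as the oldest dirty-page modification, which this toy ignores), and the example values assume the initial common LSN was 100, since the walkthrough leaves the starting value unspecified:

```python
# Invariants: nothing on disk can be ahead of the corresponding buffer,
# data pages can never be flushed ahead of the flushed log (write-ahead
# logging), and the checkpoint can never be ahead of the flushed log.
def lsn_state_ok(data_in_buffer_lsn, data_page_on_disk_lsn,
                 redo_log_in_buffer_lsn, redo_log_on_disk_lsn,
                 checkpoint_lsn):
    return (redo_log_on_disk_lsn <= redo_log_in_buffer_lsn
            and data_page_on_disk_lsn <= redo_log_on_disk_lsn   # WAL
            and checkpoint_lsn <= redo_log_on_disk_lsn
            and data_page_on_disk_lsn <= data_in_buffer_lsn)

# Log flushed to 150 while data pages and checkpoint still sit at 100:
# a legal state (this is what position 2 in the walkthrough looks like).
print(lsn_state_ok(150, 100, 150, 150, 100))  # True

# A data page flushed to 200 while the log is only flushed to 150
# would violate write-ahead logging.
print(lsn_state_ok(150, 200, 150, 150, 100))  # False
```

Any state where the second call would return True is exactly what the checkpoint mechanism, described below, is designed to prevent.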
Assume a transaction starts and immediately executes an update. After it completes, the data page in the buffer and the redo log both record the updated LSN value, say 110. If you run show engine innodb status at this moment, position ① in the figure, it will show the in-buffer log sequence number at 110, while the flushed-log, flushed-page, and checkpoint LSNs still hold their earlier, smaller value.
Next, a delete statement is executed and the LSN grows to 150. At 12:00:01 the log-flushing rule is triggered (one such rule: the default flush frequency controlled by innodb_flush_log_at_timeout is 1 second). The LSN in the on-disk redo log file is then brought up to match the LSN in the in-buffer redo log, so both equal 150. Running show engine innodb status now, position ② in the figure, shows both log LSNs at 150, while the on-disk data page LSN and the checkpoint LSN still hold their earlier value.
Then an update statement is executed, and the LSN in the buffer grows to 300, position ③ in the figure. Assume a checkpoint occurs afterwards, position ④ in the figure. As mentioned earlier, the checkpoint triggers the flushing of data pages and log pages to disk, but this takes time to complete. So before the data-page flush finishes, checkpoint_lsn still holds the previous checkpoint's value, while the LSNs of the data pages and log pages on disk have already been growing toward 300.
However, the relative sizes of "log flushed up to" and "pages flushed up to" cannot be predicted at that moment, because log flushing may run faster than, equal to, or slower than data flushing. What the checkpoint mechanism does guarantee is that data pages are never flushed ahead of the log: if the data-flushing progress catches up with the log-flushing progress, data flushing pauses until the log-flushing progress is ahead again. Once both the data pages and the log pages have been flushed to disk, position ⑤, all the LSNs equal 300. As time passes, at 12:00:02, position ⑥ in the figure, the log-flushing rule triggers again, but the in-buffer and on-disk log LSNs are already equal, so no flush happens; show engine innodb status at this moment shows all LSNs equal. Then an insert statement is executed; assume the in-buffer LSN grows to 800, position ⑦ in the figure, where the relative sizes of the LSNs are the same as at position ①. Then a commit is performed, position ⑧. By default a commit triggers a log flush but not a data flush, so show engine innodb status will show both log LSNs at 800, while "pages flushed up to" and the checkpoint LSN lag behind.
Finally, as time passes, another checkpoint appears, position ⑨ in the figure. This checkpoint does not trigger a log flush, because the log's LSN was already synchronized before the checkpoint occurred. Assume the data is flushed to disk so quickly that the intermediate state cannot be captured; show engine innodb status will then show all LSNs equal.

3.3 InnoDB recovery behavior

When InnoDB starts, it performs a recovery operation regardless of why it last shut down. The checkpoint LSN marks the point up to which data pages on disk are guaranteed to have been flushed, so recovery only needs to replay the portion of the log starting from the checkpoint. For example, suppose the database crashes when the LSN of the last checkpoint is 10000 and a transaction is in the committed state. On startup, the LSNs of the data pages on disk are checked; if a page's LSN is smaller than the corresponding LSN in the log, recovery starts from the checkpoint. There is another situation: a checkpoint flush was in progress before the crash, and the data-page flushing had run ahead of the log-page flushing, so the LSN recorded in a data page is greater than the LSN in the flushed log. This is detected during recovery, and the part that is ahead of the log's progress is not redone, since being ahead means that work is already done and needs no redo. In addition, the transaction log is idempotent: multiple operations producing the same result are recorded only once in the log. The binary log is not idempotent: every operation is recorded, and during recovery every record is executed again, which is much slower.
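The recovery rule above (start from the checkpoint, skip pages already ahead of the log, and rely on redo being idempotent) can be sketched as follows. The record and page structures are illustrative, not InnoDB's formats:

```python
# Replay redo records past the checkpoint, but only onto pages whose
# on-disk LSN is older than the record; records at or below the page
# LSN were already applied before the crash, and because redo is
# physical ("set this page to this value"), reapplying one is harmless.
checkpoint_lsn = 10000
redo_records = [(10010, "p1", "v1"),
                (10020, "p2", "v2"),
                (10030, "p1", "v3")]                # (lsn, page, value)
pages = {"p1": (10010, "v1"),                       # p1 flushed up to 10010
         "p2": (9000, "old")}                       # p2 is behind the log

def recover():
    for lsn, page, value in redo_records:
        if lsn <= checkpoint_lsn:
            continue                  # flushed before the checkpoint
        page_lsn, _ = pages[page]
        if lsn > page_lsn:            # page is behind this record: redo it
            pages[page] = (lsn, value)

recover()
print(pages)   # {'p1': (10030, 'v3'), 'p2': (10020, 'v2')}
recover()      # idempotent: a second pass changes nothing
print(pages)
```

Note that the record at LSN 10010 is skipped on the first pass too: page p1 was flushed with that change already applied, which is exactly the "ahead of the log" case the text describes.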
For example, suppose the initial value of id in a record is 2; an update sets it to 3, and a second update sets it back to 2. The transaction log records the unchanged page, so no recovery work is needed, whereas the binary log records both update operations, and both are executed again during recovery, making it slower than transaction log recovery.
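The id 2 → 3 → 2 example can be made concrete with a toy comparison of the replay work each log implies (not MySQL's actual log formats):

```python
# Physical redo records the final page state; since the page already
# matches, recovery applies nothing. The logical binlog records each
# statement, and recovery re-executes both updates.
initial = {"id": 2}

redo = []                                   # page unchanged: nothing to redo
binlog = [("update", "id", 3),
          ("update", "id", 2)]              # both statements recorded

page = dict(initial)
for _, col, val in binlog:                  # binlog recovery replays every op
    page[col] = val

print(len(redo), len(binlog))   # 0 2
print(page)                     # {'id': 2} -- same end state, more work
```

Both paths reach the same final state; the difference is purely how much work recovery has to repeat.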