MySQL series: redo log, undo log and binlog detailed explanation

MySQL series: redo log, undo log and binlog detailed explanation

Implementation of transactions

The redo log ensures the persistence of transactions, and the undo log is used to help with transaction rollback and MVCC functions.

InnoDB storage engine architecture

redo log

Write Ahead Log Strategy

When a transaction is committed, the redo log is written first and then the page is modified; when data is lost due to a crash, the data can be recovered through the redo log.

  • InnoDB first puts the redo log information into the redo log cache
  • Refresh to redo log files at a certain frequency

Redo log files: By default, there are two files named ib_logfile1 and ib_logfile2 in the data directory of the InnoDB storage engine. Each InnoDB storage engine has at least one redo log file group, and each file group has at least two redo log files.

The following figure 1 shows that the redo log group runs in a circular writing mode. The InnoDB storage engine writes ib_logfile1 first, and when it reaches the end of the file, it switches to the redo log file ib_logfile2.

As shown in Figure 2, adding an OS Buffer helps to understand the fsync process.

The log group is called the redo log group, which is a logical concept. The InnoDB storage engine actually has only one log group.

The first redo log file in the log group stores four 512-byte blocks in its first 2KB:

Redo log buffer flushed to disk

The following three situations refresh:

  • The Master Thread flushes the redo log buffer to the redo log file every second
  • When each transaction commits, the redo log buffer is flushed to the redo log file
  • When the remaining space in the redo log buffer pool is less than 1/2, the redo log is flushed to the redo log file

To supplement the second of the three situations mentioned above, the triggering of the disk write process is controlled by the parameter innodb_flush_log_at_trx_commit, which indicates how to handle the redo log when the commit operation is performed.

The valid values ​​of the parameter innodb_flush_log_at_trx_commit are 0, 1, and 2.

  • 0 means that when a transaction is committed, the redo log of the transaction is not written to the log file on disk, but waits for the main thread to refresh it every second.
  • 1 means writing the redo log buffer to disk synchronously when executing commit, that is, accompanied by a call to fsync
  • 2 means writing the redo log to disk asynchronously, that is, writing it to the file system cache. There is no guarantee that the redo log file will be written when committing.

0, when the database crashes, some logs are not flushed to disk, so the transactions of the last period of time will be lost.
2. When the operating system crashes, the part of the transaction that has not been flushed from the file system cache to the redo log file will be lost after the database is restarted.

The following diagram helps to understand

Redo log blocks

In the InnoDB storage engine, redo logs are stored in 512 bytes. This means that the redo log cache and redo log files are saved in blocks, with each block being 512 bytes.

The redo log header is 12 bytes and the redo log tail is 8 bytes, so each redo log block can actually store 492 bytes.

Redo log format

The redo log is recorded in page-based format. By default, the page size of InnoDB is 16KB (controlled by the innodb_page_size variable). A page can store a large number of log blocks (each 512 bytes), and the log blocks record the changes in the data pages.

The format of the log body is divided into 4 parts:

  • redo_log_type: occupies 1 byte and indicates the log type of the redo log.
  • space: indicates the ID of the tablespace. After compression, the occupied space may be less than 4 bytes.
  • page_no: Indicates the page offset, which is also compressed.
  • redo_log_body represents the data part of each redo log, and the corresponding function will be called for parsing during recovery. For example, the contents written into the redo log by an insert statement and a delete statement are different.

The following figures show the general recording methods of insert and delete.

redo log recovery

The following LSN (Log Sequence Number) represents the checkpoint. When the database crashes at LSN 10000, the recovery operation only recovers logs in the range of LSN10000-LSN13000.

undo log

The role of undo log

Undo is a logical log that simply restores the database to its original state logically; all modifications are logically canceled, but the data structure and the page itself may be different after the rollback.

Undo log has two functions: providing rollback and multiple row version control (MVCC).

When the InnoDB storage engine rolls back, for each INSERT, a DELETE is completed; for each DELETE, an INSERT is executed; for each UPDATE, a reverse UPDATE is executed to put the previous row back.

MVCC: When a user reads a row of records, if the record has been occupied by other transactions, the current transaction can read the previous row version information through undo, thereby achieving non-locking reading.

Undo log storage method

The innodb storage engine manages undo in segments. The rollback segment is called a rollback segment, and each rollback segment has 1024 undo log segments.

In the old version, only one rollback segment was supported, so only 1024 undo log segments could be recorded. Later, MySQL 5.5 can support 128 rollback segments, that is, support 128*1024 undo operations. You can also customize the number of rollback segments through the variable innodb_undo_logs (before version 5.6, this variable was innodb_rollback_segments). The default value is 128.

Undo logs are stored in the shared tablespace by default.

Transaction commit undo log processing

When a transaction commits, the InnoDB storage engine does the following two things:

  • The undo log is put into a list for later purge. Whether the undo log and the page it is located on can be finally deleted is determined by the purge thread.
  • Determine whether the page where the undo log is located can be reused. If so, allocate it to the next transaction.

When a transaction is committed, the undo log is first put into the linked list, and then it is determined whether the used space of the undo page is less than 3/4. If so, it means that the undo page can be reused, and then the new undo log is recorded after the current undo log.

Undo log is divided into:

  • insert undo log
  • update undo log

Because of transaction isolation, the insert undo log is not visible to other transactions, so the undo log can be deleted directly after the transaction is committed without the need for a purge operation.

Update undo log records the undo logs generated by delete and update operations. The undo log may need to provide an MVCC mechanism, so it cannot be deleted when it is submitted.

There are two cases for update:

  • If the date column is not a primary key column, the undo log directly records how it was updated in reverse order. That is, the update is performed directly.
  • The operation of updating the primary key can be divided into two steps:
  • First, mark the original primary key record as deleted, so an undo log of type TRX_UNDO_DEL_MARK_REC needs to be generated
  • After that, a new record is inserted, generating an undo log of type TRX_UNDO_INSERT_MARK_REC.

When InnoDB purges, it first searches for the undo log in the history list, and then searches for the undo log in the undo page; this can avoid a large number of random read operations, thereby improving purge efficiency.

MVCC (Multi-version Concurrency Control)

MVCC actually adds two hidden columns after each row of records to record the creation version number and the deletion version number. When each transaction is started, it has a unique incrementing version number.

MVCC only works at REPEATABLE READ and READ COMMITTED isolation levels. There is no version problem for read uncommitted, and serialization locks all read rows.

Example:

Insert operation: The creation version number of the record is the transaction version number

If you insert a record, the transaction id is assumed to be 1, and the creation version number is also 1

id name create version delete version
1 test 1

Update operation: first mark the old version number as deleted, the version number is the current version number, and then insert a new record

For example, transaction 2 updates the name field
update table set name = 'new test' where id = 1;

The original record is marked for deletion, with a deletion version of 2, and a new record is inserted, with a creation version of 2

id name create version delete version
1 test 1 2
1 new test 2


Deletion operation: Use the transaction version as the deletion version number

For example, transaction 3 deletes the record
delete from table where id = 1;

id name create version delete version
1 test 2 3

Query Operation

Records that meet the following two conditions can be queried by the transaction:

  • InnoDB only searches for data rows whose versions are earlier than the current transaction version
  • The deleted version of a row is either undefined or greater than the current version number, which ensures that the row read by the transaction has not been deleted before the transaction starts.

MVCC benefits: reduce lock contention and improve performance

binlog

Concept and function of binary files

The binary log records all operations that change the MySQL database (excluding SELECT, SHOW, etc., because the data is not modified)

The main functions of binary files are:

  • Recovery: Recovery of some data requires binary logs
  • Replication: Synchronize a remote MySQL (slave) with another MySQL database (master) in real time by copying and executing binary logs
  • Audit: Users can audit the information in the binary log to determine whether there is an injection attack on the database.

Three binary file formats

MySQL 5.1 introduced the binlog_format parameter, which can be set to STATEMENT, ROW, and MIX.

  • STATEMENT: The binary file records the logical SQL statements of the log.
  • ROW: records changes to table rows. If ROW mode is set, you can set the InnoDB transaction isolation level to READ_COMMITTED for better concurrency.
  • MIX: MySQL uses STATEMENT format to record binary files by default, but ROW is used in some cases. Possible situations are:
  • The storage engine of the table is NDB, and all DML operations on the table are performed in ROW format.
  • Use of uncertain functions such as UUID(), USER(), CURRENT_USER(), FOUND_ROWS(), ROW_COUNT()
  • Using the INSERT DELAY statement
  • Using user defined functions
  • Using a temporary table

The difference between redo log and binary file

(Binary files are used for POINT-IN-TIME (PIT) recovery and the establishment of a master-slave replication environment.

  • The binary file records all log records related to the MySQL database, including logs of other storage engines such as InnoDB and MyISAM. The redo log of the InnoDB storage engine only records transaction logs related to the storage engine itself.
  • The recorded contents are different. Regardless of whether the user sets the binary log file record format to STATEMENT, ROW, or MIXED, it records the specific operation contents of a transaction, that is, the log is a logical log. The redo log files of the InnoDB storage engine record the physical changes to each page.
  • In addition, the writing time page is different. The binary log file is only committed before the transaction is committed, that is, it is only written to the disk once, no matter how large the transaction is at this time. However, during the transaction, redo log entries are continuously written to the redo log file.

group commit

If the transaction is not a read-only transaction, an fsync operation is required each time the transaction is committed to ensure that the redo log has been written to disk. However, the disk fsync performance is limited. To improve the disk fsync efficiency, current databases provide a group commit function, which means that multiple transaction logs can be refreshed at one time to ensure that they are written to the file.

For InnoDB group commit, a two-stage operation is performed:

  • Modify the information corresponding to the transaction in memory and write the log to the redo log buffer
  • Calling fsync will ensure that the logs are written to disk from the redo log buffer.

Before InnoDB1.2, the group commit function fails when binary files are opened:

After opening the binary file, the steps are as follows:
1) When a transaction is committed, the InnoDB storage engine performs a prepare operation
2) MySQL database upper layer writes binary files
3) InnoDB writes logs to redo log files

a) Modify the information corresponding to the transaction in memory and write the log to the redo log buffer b) Call fsync to ensure that the log is written from the redo log buffer to disk

To ensure that the writing order of the upper binary files of the MySQL database is consistent with the InnoDB transaction commit order, MySQL uses the prepare_commit_mutex lock internally. Therefore, step a) in step 3) cannot be performed when other transactions are executing step b), which causes the row commit function to fail.

The solution is BLGC (Binary Log Group Commit)

The MySQL 5.6 BLGC implementation is divided into three stages:

  • Flush phase: write the binary file of each transaction to memory
  • Sync phase: refresh the binary in memory to disk. If there are multiple transactions in the queue, only one fsync operation is required to complete the writing of the binary log. This is BLGC.
  • Commit stage: The leader calls the storage engine layer transaction commit in order. Since InnoDB already supports group commit, the problem of group commit failure caused by locking prepare_commit_mutex is solved.

This is the end of this article about the detailed explanation of MySQL redo log, undo log and binlog. For more relevant MySQL redo log, undo log and binlog content, please search 123WORDPRESS.COM's previous articles or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Understand the difference between redo log and binlog in MySQL in one article
  • The difference between redo log and binlog in MySQL
  • A brief analysis of the differences between undo, redo and binlog in MySQL
  • Analysis of MySQL crash recovery based on Redo Log and Undo Log
  • Detailed explanation of redo log and undo log in MySQL
  • Summary of MySQL Undo Log and Redo Log
  • In-depth analysis of MySQL 8.0 redo log
  • Detailed explanation of MySQL redo log (redo log) and rollback log (undo logo)
  • In-depth understanding of MySQL redo log redo log

<<:  Some references about colors in HTML

>>:  Summary of 10 amazing tricks of Element-UI

Recommend

Research on the Input Button Function of Type File

<br />When uploading on some websites, a [Se...

Three common uses of openlayers6 map overlay (popup window marker text)

Table of contents 1. Write in front 2. Overlay to...

Methods for deploying MySQL services in Docker and the pitfalls encountered

I have been learning porters recently. I feel lik...

Analysis of GTK treeview principle and usage

The GtkTreeView component is an advanced componen...

Docker uses dockerfile to start node.js application

Writing a Dockerfile Taking the directory automat...

W3C Tutorial (13): W3C WSDL Activities

Web Services are concerned with application-to-ap...

Vue+thinkphp5.1+axios to realize file upload

This article shares with you how to use thinkphp5...

Detailed explanation of count without filter conditions in MySQL

count(*) accomplish 1. MyISAM: Stores the total n...

How to implement blank space in Taobao with CSS3

Make a blank space for Taobao: When you shrink th...

IDEA2021 tomcat10 servlet newer version pitfalls

Because the version I used when I was learning wa...

A brief discussion on the magic of parseInt() in JavaScript

cause The reason for writing this blog is that I ...

Detailed explanation of MySQL information_schema database

1. Overview The information_schema database is th...

Summary of pitfalls encountered in installing mysql and mysqlclient on centos7

1. Add MySQL Yum repository MySQL official websit...

Detailed explanation of using JavaScript WeakMap

A WeakMap object is a collection of key/value pai...