Introduction to the three essential logs for MySQL database interviews

Let's start with a question: how does MySQL ensure that data is not lost?

To ensure that data is not lost, MySQL needs the following two capabilities:
(1) It can restore the database to its state at any point in time;
(2) If MySQL crashes and restarts at any moment, committed data is not lost, and the changes made by uncommitted transactions are rolled back automatically.

This brings us to today's topic. The first capability relies on the bin log; the second relies on the redo log and the undo log.

Before looking at the three logs, let's look at how MySQL processes a data update:

[Figure: the MySQL data update process]

The above picture contains the general relationship between the three types of logs: redo log, bin log, and undo log. Now let's get to the point.

1. redo log (transaction log of MySQL storage engine InnoDB)

MySQL data is stored on disk. If every read or write required disk I/O, performance under concurrent load would be poor. To avoid this, InnoDB uses an in-memory cache called the Buffer Pool, which holds copies of some of the data pages on disk and relieves the pressure on the disk.

When data is read, the cache is checked first. If the data is not in the cache, it is read from disk and placed in the cache. When data is written, it is written to the cache first. The data page in the cache now differs from the copy on disk; such a page is called a dirty page. Modified pages in the Buffer Pool are flushed to disk periodically according to the configured policy. This process is called flushing dirty pages.
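
The read and write paths above can be sketched with a toy model. This is only an illustration, not InnoDB's implementation: the class `BufferPool`, its methods, and the dictionary standing in for the data files are all hypothetical names chosen for this example.

```python
class BufferPool:
    """Toy page cache: reads and writes go through memory, dirty pages are flushed later."""

    def __init__(self, disk):
        self.disk = disk      # page_id -> contents; stands in for the data files
        self.pages = {}       # cached copies of pages
        self.dirty = set()    # pages modified in memory but not yet written to disk

    def read(self, page_id):
        if page_id not in self.pages:            # cache miss: load the page from disk
            self.pages[page_id] = self.disk[page_id]
        return self.pages[page_id]

    def write(self, page_id, value):
        self.pages[page_id] = value              # only the cached copy changes
        self.dirty.add(page_id)                  # the page is now a dirty page

    def flush_dirty_pages(self):
        for page_id in sorted(self.dirty):       # write every dirty page back to disk
            self.disk[page_id] = self.pages[page_id]
        self.dirty.clear()


disk = {1: "Xiao Wang"}
pool = BufferPool(disk)
pool.write(1, "Xiao Zhang")
print(disk[1])            # Xiao Wang (the disk still holds the old value)
pool.flush_dirty_pages()
print(disk[1])            # Xiao Zhang
```

Between `write` and `flush_dirty_pages` the new value exists only in memory, which is exactly the window in which a crash would lose it.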

So the question is: if MySQL crashes and restarts before the modified data in the Buffer Pool has been flushed to disk, that data is lost and the durability of the transaction cannot be guaranteed. What should we do?

The redo log solves this problem. When the database modifies data, it first writes a record of the update to the redo log and then modifies the data in the Buffer Pool. When the transaction commits, fsync is called to flush the redo log to disk (this is the behavior with the default setting innodb_flush_log_at_trx_commit = 1). The modified data pages in the cache are flushed to disk later, asynchronously, by background threads.

Note: At this point the transaction's state in the redo log is prepare; it has not yet truly committed. The state changes to commit only after the bin log has been written to disk, and only then is the transaction truly committed.

How is the redo log written?

The redo log has a fixed size and is written in a circular manner: when the end is reached, writing continues from the beginning, like a ring. This design works because the redo log records changes to data pages. Once a modified data page in the Buffer Pool has been flushed to disk, the redo records for that change are no longer needed, and new log records can overwrite them.

Note: If the redo log is full, the changes described by the records about to be erased must first be flushed to disk. While old records are being erased to free space, new update requests have to wait, and MySQL performance degrades. It is therefore important to size the redo log appropriately for high-concurrency workloads.
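
A minimal sketch of this ring behavior follows. It is a toy model under stated assumptions: the class `CircularRedoLog`, the counters `write_pos` and `checkpoint`, and the `flush_page` callback are hypothetical names for illustration, and each record is assumed to correspond to exactly one dirty page.

```python
class CircularRedoLog:
    """Toy fixed-size redo log that reuses slots only after their pages are flushed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0     # number of records written so far
        self.checkpoint = 0    # number of records whose pages are already on disk

    def free_slots(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def advance_checkpoint(self, count, flush_page):
        # Flush the pages behind the oldest records so their slots can be reused.
        for _ in range(min(count, self.write_pos - self.checkpoint)):
            flush_page(self.slots[self.checkpoint % self.capacity])
            self.checkpoint += 1

    def append(self, record, flush_page):
        if self.free_slots() == 0:
            # The log is full: the writer must wait for a flush before overwriting.
            self.advance_checkpoint(1, flush_page)
        self.slots[self.write_pos % self.capacity] = record
        self.write_pos += 1


flushed = []
log = CircularRedoLog(capacity=3)
for record in ["r1", "r2", "r3", "r4"]:
    log.append(record, flushed.append)
print(flushed)     # ['r1']
print(log.slots)   # ['r4', 'r2', 'r3']
```

The fourth write finds the log full, so it has to flush the page behind `r1` before it can reuse that slot. That forced flush is the stall described in the note above.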

What is crash-safe capability?

The InnoDB engine is crash-safe: no matter at which stage of the commit process MySQL crashes, the integrity of transactions is guaranteed after the restart, and committed data is not lost. This capability is provided by the redo log. When MySQL restarts after a crash, it automatically checks the redo log and reapplies the modifications that had not yet been written to the data files.
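
The write path and the restart recovery can be put together in a small simulation. This is a deliberately simplified sketch: the class `ToyInnoDB` and its methods are hypothetical, `commit` stands in for the fsync of the redo log, and the model ignores the prepare state and the undo log discussed elsewhere in this article.

```python
class ToyInnoDB:
    """Toy write-ahead logging: log first, change memory, make the log durable on commit."""

    def __init__(self):
        self.disk_pages = {}     # durable data pages
        self.redo_on_disk = []   # durable redo records
        self.buffer_pool = {}    # in-memory pages, lost on a crash
        self.redo_buffer = []    # in-memory redo records, lost on a crash

    def update(self, page_id, value):
        self.redo_buffer.append((page_id, value))   # 1. record the change in the redo log
        self.buffer_pool[page_id] = value           # 2. then modify the cached page

    def commit(self):
        # Stand-in for fsync: the buffered redo records become durable.
        self.redo_on_disk.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def crash(self):
        # Everything held only in memory disappears.
        self.buffer_pool.clear()
        self.redo_buffer.clear()

    def recover(self):
        # Replay the durable redo records onto the data pages.
        for page_id, value in self.redo_on_disk:
            self.disk_pages[page_id] = value


db = ToyInnoDB()
db.update(1, "Xiao Zhang")
db.commit()
db.update(2, "not committed")
db.crash()                 # no dirty page was ever flushed
db.recover()
print(db.disk_pages)       # {1: 'Xiao Zhang'}
```

The committed change survives even though its data page never reached disk, while the uncommitted change is gone because its redo record was never made durable.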

2. undo log (rollback log of MySQL storage engine InnoDB)

The undo log records the state of the data before it is modified. It is a logical log used for rollback, and it is the key to guaranteeing the atomicity of transactions.
For example, suppose the name field of the record with id = 1 is changed from Xiao Wang to Xiao Zhang. When the transaction executes UPDATE X SET name = 'Xiao Zhang' WHERE id = 1, a record with the opposite logic, UPDATE X SET name = 'Xiao Wang' WHERE id = 1, is first written to the undo log. If the transaction then fails for any reason, the undo log can be used to roll the data back to its state before the transaction ran.
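
The example can be mimicked with a toy transaction that saves the old value before applying each update. The class `ToyTransaction` and the in-memory dictionary standing in for table X are hypothetical, and the real undo log stores records in InnoDB's own format rather than Python tuples.

```python
class ToyTransaction:
    """Toy transaction that keeps before-images so its changes can be undone."""

    def __init__(self, table):
        self.table = table     # row id -> dict of column values
        self.undo_log = []     # (row_id, column, old_value), newest entry last

    def update(self, row_id, column, new_value):
        old_value = self.table[row_id][column]
        self.undo_log.append((row_id, column, old_value))   # save the old value first
        self.table[row_id][column] = new_value              # then apply the change

    def rollback(self):
        # Undo the changes in reverse order, newest first.
        while self.undo_log:
            row_id, column, old_value = self.undo_log.pop()
            self.table[row_id][column] = old_value


X = {1: {"name": "Xiao Wang"}}
txn = ToyTransaction(X)
txn.update(1, "name", "Xiao Zhang")
print(X[1]["name"])   # Xiao Zhang
txn.rollback()
print(X[1]["name"])   # Xiao Wang
```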

So the question is: if a record of the same transaction is modified multiple times, do we need to write the undo log of the state before the data modification every time?

Yes. InnoDB writes an undo record for every modification of a row, even within the same transaction, and links the records together so that the changes can be undone in reverse order (this is also what makes it possible to roll back to a savepoint). Each modification is recorded in the redo log as well. The undo log is responsible for rollback, and the redo log is responsible for rollforward.

What is rollback and rollforward?

(1) Rollback

A rollback applies to uncommitted transactions. Some of the dirty pages modified by such a transaction may already have been flushed to disk. If the database crashes and restarts at this point, a rollback is needed to remove those uncommitted modifications from disk.

(2) Rollforward

A rollforward applies to a transaction that has committed but whose changes have only partly reached disk: some of the dirty pages it modified were flushed, while the rest were still in the Buffer Pool. If the database crashes and restarts at this point, the redo log is used to recover the changes that had not been flushed and write them to disk.

3. bin log (archive log; a binary logical log at the MySQL server layer, independent of the storage engine)

The bin log records all operations that users perform to change the database (query statements are excluded, because they do not modify the data). It is called an archive log because, unlike the redo log, it does not overwrite earlier records in a circle but keeps appending. The default maximum size of a bin log file is 1 GB (configurable through the max_binlog_size parameter). When a file exceeds the maximum, a new file is created and writing continues there.
Note: A transaction is written to the bin log as a whole and is never split across files. If the file reaches the maximum size while a transaction is still being written, no new file is created; the current file keeps growing until the transaction is complete. The actual size of a bin log file can therefore exceed max_binlog_size.
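
The rotation rule in the note can be illustrated with a toy model. The class `ToyBinlog` and its methods are hypothetical, sizes are arbitrary units rather than bytes, and the sketch assumes the size check happens only after a whole transaction has been written.

```python
class ToyBinlog:
    """Toy bin log: appends whole transactions and rotates files between transactions."""

    def __init__(self, max_binlog_size):
        self.max_binlog_size = max_binlog_size
        self.files = [[]]      # each file is a list of (txn_id, event_size) entries

    def file_sizes(self):
        return [sum(size for _, size in entries) for entries in self.files]

    def write_transaction(self, txn_id, event_sizes):
        # All events of one transaction go into the current file, even past the limit.
        for size in event_sizes:
            self.files[-1].append((txn_id, size))
        # Rotate only once the transaction is complete.
        if self.file_sizes()[-1] >= self.max_binlog_size:
            self.files.append([])


binlog = ToyBinlog(max_binlog_size=100)
binlog.write_transaction(1, [60])
binlog.write_transaction(2, [30, 30, 30])   # pushes the first file past the limit
binlog.write_transaction(3, [10])           # lands in the second file
print(binlog.file_sizes())                  # [150, 10]
```

The first file ends at 150 units although the limit is 100, because transaction 2 was not split across files.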

As the introduction above shows, the bin log is mainly used for master-slave replication and point-in-time recovery of the database.

So the question is: can we do without the bin log? (Why do we need the bin log if we already have the redo log?)

It depends on the scenario:

In master-slave mode, the bin log is required, because the slave relies on it to synchronize data;

In stand-alone mode, if point-in-time recovery is not needed, the bin log is not required, because the redo log already provides crash safety.

Conversely, the redo log cannot replace the bin log. Once the modified data pages have been written to disk, the corresponding redo log records are overwritten, so the redo log cannot be used for operations such as restoring historical data. The redo log is also implemented at the InnoDB engine layer, and not every storage engine has one.

What is the difference between redo log and bin log?

[Figure: comparison of the redo log and the bin log]

What is redo log two-phase commit, and why is it needed?
After the data in memory is updated, the engine layer writes the redo log and sets the transaction's state to prepare; this is the first phase. The server layer then writes the bin log, and the state in the redo log is set to commit; this is the second phase. The purpose of two-phase commit is to keep the data in the bin log and the redo log consistent.
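
The order of the steps can be written down as a short simulation. This is a sketch under stated assumptions: the function `two_phase_commit`, the dictionary used as a redo log, the list used as a bin log, and the `crash_after` parameter are all hypothetical and exist only to show where a crash can interrupt the sequence.

```python
def two_phase_commit(txn_id, redo_log, binlog, crash_after=None):
    """Run the commit steps in order, stopping early to simulate a crash.

    redo_log maps a transaction id to "prepare" or "commit".
    binlog is a list of transaction ids whose bin log records were written.
    crash_after may be "prepare" or "binlog" (hypothetical crash points).
    """
    redo_log[txn_id] = "prepare"    # phase 1: engine writes the redo log, state prepare
    if crash_after == "prepare":
        return
    binlog.append(txn_id)           # server layer writes the bin log
    if crash_after == "binlog":
        return
    redo_log[txn_id] = "commit"     # phase 2: redo log state changes to commit


redo_log, binlog = {}, []
two_phase_commit(1, redo_log, binlog)
two_phase_commit(2, redo_log, binlog, crash_after="binlog")
two_phase_commit(3, redo_log, binlog, crash_after="prepare")
print(redo_log)   # {1: 'commit', 2: 'prepare', 3: 'prepare'}
print(binlog)     # [1, 2]
```

Transactions 2 and 3 both stop in the prepare state, but only transaction 2 reached the bin log. Recovery can use that difference to decide which one to commit and which one to roll back.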

What would happen without two-phase commit?
1) Suppose the redo log is written first and the bin log second. That is, the redo log has no prepare stage: once written, it is set directly to commit, and only then is the bin log written. If the database crashes after the redo log is written but before the bin log is written, MySQL recovers from the redo log after the restart. The data pages on disk then contain a change that is missing from the bin log, so the two are inconsistent.
2) Suppose the bin log is written first and the redo log second. If the database crashes after the bin log is written but before the redo log is written, the bin log contains a change that is missing from the data pages on disk. The next time the bin log is used to restore data, the restored data will be inconsistent with the original data.

Describe the redo log crash recovery process.
If the redo log record for the transaction is complete (state commit), recover directly from the redo log;
If the redo log record is in the prepare state but not yet in the commit state, check whether the transaction's bin log is complete. If it is, commit the transaction and recover from the redo log. If it is not, roll the transaction back.
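
These rules amount to a small decision function. The sketch below is illustrative only: the function `recover` and its inputs (a dictionary of redo log states and a list of transaction ids found complete in the bin log) are hypothetical simplifications of what the server inspects at startup.

```python
def recover(redo_log, binlog):
    """Decide, for each transaction in the redo log, whether to commit or roll back.

    redo_log maps a transaction id to "prepare" or "commit".
    binlog lists the transaction ids whose bin log records are complete.
    """
    decisions = {}
    for txn_id, state in redo_log.items():
        if state == "commit":
            decisions[txn_id] = "commit"      # redo log is complete
        elif txn_id in binlog:
            decisions[txn_id] = "commit"      # prepared, and the bin log is complete
        else:
            decisions[txn_id] = "rollback"    # prepared, but the bin log is incomplete
    return decisions


print(recover({1: "commit", 2: "prepare", 3: "prepare"}, [1, 2]))
# {1: 'commit', 2: 'commit', 3: 'rollback'}
```

Whichever branch is taken, the redo log and the bin log agree afterwards: a transaction is kept only if both logs contain it.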
