Detailed explanation of the execution process of mysql update statement

Detailed explanation of the execution process of mysql update statement

There was an article about the execution process of MySQL query statements before. Here is a summary of the execution process of update statements. Since update involves data modification, it is easy to infer that the update statement is more complicated than the select statement.

1. Preparation

Create a test table

CREATE TABLE `test` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `c` int(11) NOT NULL DEFAULT '0' COMMENT 'value',
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='test table';

Insert three pieces of data

INSERT INTO `test` (`c`) VALUES (1), (2), (3);

2. Testing

If I want to add 1 to the c value of the first data, then

UPDATE `test` SET `c` = `c` + 1 WHERE `id` = 1;

According to our usual thinking, we just need to find this record, change its value, and save it.

But let's look into the details. Since it involves modifying data, it involves logs.

3 Operation Order

3.1 Search records: The executor first searches the engine for the row with id=1. ID is the primary key, and the engine directly uses tree search to find this row. If the data page where the row with id=1 is located is already in the memory, it is directly returned to the executor; otherwise, it needs to be read from the disk into the memory first and then returned;

3.2 The executor gets the row data returned by the engine, changes num to 2, gets a new row of data, and then calls the engine interface to write this new row of data;

3.3 The engine updates the new row of data into memory and records the update operation in the redo log . At this time, the redo log is in the prepare state.

3.4 The engine informs the executor that the execution is complete and you can call my interface to submit the transaction at any time;

3.5 The executor generates the binlog of this operation and writes the binlog to disk.

3.6 The executor calls the engine's commit transaction interface , and the engine changes the redo log that has just been written to the commit state, and the update is completed.

binlog is the coordinator of the two-phase commit implementation in MySQL. It assigns a transaction ID to each transaction: XID
Phase 1 :
Start the transaction. The redo log and undo log have recorded the corresponding logs. At this time, the transaction status is prepare
Phase 2 :

After binlog completes write和fsync , the transaction must be committed successfully. Otherwise, the transaction is rolled back and commit is sent, undo information is cleared, redo is flushed, and the transaction status is set to completed.
4. Two types of logs 4.1 Redo log

A redo log is usually a physical log that records physical modifications to data pages rather than how a particular row or rows are modified. It is used to restore physical data pages after submission (restore data pages, and can only be restored to the last submitted position).

The general update process is as follows:

Directly query the original data and update it immediately; first find a temporary notebook to make a record, and then perform the calculation and update when you are not busy/during the settlement.

The first approach is not very optimistic under high concurrent IO conditions. Therefore, the second method is generally adopted.

There is also such a problem in MySQL. If every update operation needs to be written to the disk, and the disk also needs to find the corresponding record and then update it, the IO cost and search cost of the whole process will be very high. In order to solve this problem, the designers of MySQL used the idea of ​​redo log to improve update efficiency.

The entire process of operating the temporary notebook and the original data also corresponds to the WAL (Write-Ahead Logging) technology often mentioned in MySQL. The full name of WAL is Write-Ahead Logging . Its key point is to write the log first and then write to the disk . Specifically, when a record needs to be updated, the InnoDB engine will first write the record to the redo log (notepad) and update the memory, at which point the update is complete. At the same time, the InnoDB engine will update the operation records in this notebook to the disk when appropriate, and this update is often done when the system is relatively idle.

Question: How to solve the problem of redo log (Notepad) being full?
InnoDB's redo log is of fixed size. For example, it can be configured as a group of 4 files, each of which is 1GB in size. Then this "notepad" can record a total of 4GB of operations. Start writing from the beginning, and when you reach the end, go back to the beginning and write in a loop, as shown in the figure below.

Write pos is the position of the current record. It moves backwards while writing. After writing to the end of file No. 3, it returns to the beginning of file No. 0. Checkpoint is the current position to be erased, which is also moved backward and cyclic. Before erasing the record, the record must be updated to the data file.

The space between write pos and checkpoint is the empty space on the "notepad" that can be used to record new operations. If the write pos catches up with the checkpoint, it means that the "pink board" is full. At this time, no new updates can be performed. It is necessary to stop and erase some records first to advance the checkpoint.

With redo log, InnoDB can ensure that even if the database restarts abnormally, the previously submitted records will not be lost. This capability is called crash-safe .

4.2 Archive log binlog

MySQL as a whole actually consists of two parts: one is the server layer, which mainly handles MySQL functional aspects; the other is the engine layer, which is responsible for specific storage-related matters. The redo log above is a log specific to the InnoDB engine, and the server layer also has its own log, called binlog (archive log).

Why are there two logs?

Initially, there was no InnoDB engine in MySQL. The engine that comes with MySQL is MyISAM, but MyISAM does not have crash-safe capabilities, and binlog logs can only be used for archiving. InnoDB was introduced into MySQL by another company in the form of a plug-in. Since relying solely on binlog does not provide crash-safe capabilities, InnoDB uses another log system, namely redo log, to achieve crash-safe capabilities.

There are three differences between these two logs.

redo log is specific to the InnoDB engine; binlog is implemented at the server layer of MySQL and can be used by all engines.

Redo log is a physical log that records "what changes have been made to a certain data page"; binlog is a logical log that records the original logic of the statement, such as "add 1 to the c field of the row with ID=2".

Redo log is written in a cycle, and its space will eventually run out; binlog can be written in appended form. "Append write" means that after the binlog file reaches a certain size, it will switch to the next one and will not overwrite the previous log.

5 Two-Phase Commit

The writing of redo log is divided into two steps: prepare and commit, which is called "two-phase commit".

5.1 Why is “two-phase commit” necessary? This is to make the logic between the two logs consistent.

Binlog records all logical operations in an " append write " format. If your DBA promises that recovery can be done within half a month, then the backup system will definitely save all binlogs from the last half a month, and the system will regularly back up the entire database. The "regularly" here depends on the importance of the system, it can be once a day or once a week.

When you need to restore to a specified second, for example, at 2 pm one day, you find that a table was accidentally deleted at 12 pm and need to recover the data, you can do this:

First, find the most recent full backup. If you are lucky, it may be the backup from last night. Restore from this backup to the temporary database.

Then, starting from the backup time, take out the backed up binlogs one by one and replay them to the time before the table was accidentally deleted at noon.

In this way, your temporary database will be the same as the online database before the accidental deletion. Then you can take the table data out of the temporary database and restore it to the online database as needed.

After talking about the data recovery process, let’s talk about why the log requires “two-phase commit”. Let's use proof by contradiction to explain this.

5.2 If two-phase commit is not used

Since redo log and binlog are two independent logics, if two-phase commit is not used, either the redo log is written first and then the binlog is written, or the order is reversed.

Let's see what problems arise with these two approaches.

Still using the previous update statement as an example, the value of field c in the current row with ID=1 is 1.

5.2.1 Write redo log first, then binlog

Suppose that the MySQL process restarts abnormally when the redo log is written but the binlog is not. As we said before, after the redo log is written, the data can still be recovered even if the system crashes, so the value of c in this row after recovery is 2. However, since the binlog crashed before it was finished writing, this statement was not recorded in the binlog. If this binlog needs to be used to restore the temporary database, the temporary database will be missing this update because the binlog of this statement is lost. The value of c in the restored row is 1, which is different from the value in the original database.

5.2.2 Write binlog first, then redo log

If the transaction crashes after the binlog is written, the redo log has not been written yet and the transaction is invalid after crash recovery, so the value of c in this row is 1. However, the log "Changing c from 1 to 2" has been recorded in the binlog. Therefore, when binlog is used for recovery later, the c value of this row in the restored temporary database is 2, which is different from the value in the original database.

It can be seen that if "two-phase commit" is not used, the state of the database may be inconsistent with the state of the database restored using its log.

In fact, this process is not only needed to recover data after an incorrect operation. When capacity expansion is needed, that is, when more backup databases are needed to increase the system's read capability, the common practice now is to use full backup plus application binlog to achieve this. This "inconsistency" will cause inconsistencies in your master-slave database online.

Simply put, both redo log and binlog can be used to indicate the commit status of a transaction, and two-phase commit keeps these two states logically consistent.

The redo log is used to ensure crash-safe capabilities. When the innodb_flush_log_at_trx_commit parameter is set to 1, it means that the redo log of each transaction is directly persisted to disk. It is recommended to set this parameter to 1 to ensure that data is not lost after an abnormal restart of MySQL.

When the sync_binlog parameter is set to 1, it means that the binlog of each transaction is persisted to disk. It is also recommended to set this parameter to 1, so that the binlog is not lost after an abnormal restart of MySQL.

The above is the detailed content of the detailed explanation of the execution process of the MySQL update statement. For more information about the execution process of the update statement, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Detailed usage of mysql update statement
  • Several ways to update batches in MySQL
  • MySQL UPDATE statement detailed explanation
  • Example tutorial on using UPDATE statement in MySQL
  • How to update multiple records in mysql at one time
  • Detailed explanation of MySQL database insert and update statements
  • How is an update statement executed in MySQL?

<<:  A brief discussion on how to cancel requests and prevent duplicate requests in axios

>>:  Solve the problem that PhpStorm fails to connect to VirtualBox

Recommend

An article to solve the echarts map carousel highlight

Table of contents Preface toDoList just do it Pre...

Some key points of website visual design

From handicraft design to graphic design to web de...

Using Openlayer in Vue to realize loading animation effect

Note: You cannot use scoped animations! ! ! ! via...

Detailed explanation of mkdir command in Linux learning

Table of contents Preface 1. Basic knowledge of f...

JavaScript to implement a simple clock

This article example shares the specific code for...

Common rule priority issues of Nginx location

Table of contents 1. Location / Matching 2. Locat...

my.cnf parameter configuration to optimize InnoDB engine performance

I have read countless my.cnf configurations on th...

Solve the problem of MySql8.0 checking transaction isolation level error

Table of contents MySql8.0 View transaction isola...

Linux 6 steps to change the default remote port number of ssh

The default ssh remote port in Linux is 22. Somet...

Security configuration and detection of SSL after the website enables https

It is standard for websites to enable SSL nowaday...

A complete example of implementing a timed crawler with Nodejs

Table of contents Cause of the incident Use Node ...

Summary of MySQL development standards and usage skills

1. Naming conventions 1. Database names, table na...

Analysis of MySQL joint index function and usage examples

This article uses examples to illustrate the func...

Two implementation codes of Vue-router programmatic navigation

Two ways to navigate the page Declarative navigat...