There was an article about the execution process of MySQL query statements before. Here is a summary of the execution process of update statements. Since update involves data modification, it is easy to infer that the update statement is more complicated than the select statement. 1. Preparation Create a test table CREATE TABLE `test` ( `id` int(11) NOT NULL AUTO_INCREMENT, `c` int(11) NOT NULL DEFAULT '0' COMMENT 'value', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='test table'; Insert three pieces of data INSERT INTO `test` (`c`) VALUES (1), (2), (3); 2. Testing If I want to add 1 to the c value of the first data, then UPDATE `test` SET `c` = `c` + 1 WHERE `id` = 1; According to our usual thinking, we just need to find this record, change its value, and save it. But let's look into the details. Since it involves modifying data, it involves logs. 3 Operation Order 3.1 Search records: The executor first searches the engine for the row with id=1. ID is the primary key, and the engine directly uses tree search to find this row. If the data page where the row with id=1 is located is already in the memory, it is directly returned to the executor; otherwise, it needs to be read from the disk into the memory first and then returned; 3.2 The executor gets the row data returned by the engine, changes num to 2, gets a new row of data, and then calls the engine interface to write this new row of data; 3.3 The engine updates the new row of data into memory and records the update operation in the redo log . At this time, the redo log is in the prepare state. 3.4 The engine informs the executor that the execution is complete and you can call my interface to submit the transaction at any time; 3.5 The executor generates the binlog of this operation and writes the binlog to disk. 3.6 The executor calls the engine's commit transaction interface , and the engine changes the redo log that has just been written to the commit state, and the update is completed. After A redo log is usually a physical log that records physical modifications to data pages rather than how a particular row or rows are modified. It is used to restore physical data pages after submission (restore data pages, and can only be restored to the last submitted position). The general update process is as follows: Directly query the original data and update it immediately; first find a temporary notebook to make a record, and then perform the calculation and update when you are not busy/during the settlement. The first approach is not very optimistic under high concurrent IO conditions. Therefore, the second method is generally adopted. There is also such a problem in MySQL. If every update operation needs to be written to the disk, and the disk also needs to find the corresponding record and then update it, the IO cost and search cost of the whole process will be very high. In order to solve this problem, the designers of MySQL used the idea of redo log to improve update efficiency. The entire process of operating the temporary notebook and the original data also corresponds to the WAL (Write-Ahead Logging) technology often mentioned in MySQL. The full name of WAL is Write-Ahead Logging . Its key point is to write the log first and then write to the disk . Specifically, when a record needs to be updated, the InnoDB engine will first write the record to the redo log (notepad) and update the memory, at which point the update is complete. At the same time, the InnoDB engine will update the operation records in this notebook to the disk when appropriate, and this update is often done when the system is relatively idle. Question: How to solve the problem of redo log (Notepad) being full? Write pos is the position of the current record. It moves backwards while writing. After writing to the end of file No. 3, it returns to the beginning of file No. 0. Checkpoint is the current position to be erased, which is also moved backward and cyclic. Before erasing the record, the record must be updated to the data file. The space between write pos and checkpoint is the empty space on the "notepad" that can be used to record new operations. If the write pos catches up with the checkpoint, it means that the "pink board" is full. At this time, no new updates can be performed. It is necessary to stop and erase some records first to advance the checkpoint. With redo log, InnoDB can ensure that even if the database restarts abnormally, the previously submitted records will not be lost. This capability is called crash-safe . 4.2 Archive log binlog MySQL as a whole actually consists of two parts: one is the server layer, which mainly handles MySQL functional aspects; the other is the engine layer, which is responsible for specific storage-related matters. The redo log above is a log specific to the InnoDB engine, and the server layer also has its own log, called binlog (archive log). Why are there two logs? Initially, there was no InnoDB engine in MySQL. The engine that comes with MySQL is MyISAM, but MyISAM does not have crash-safe capabilities, and binlog logs can only be used for archiving. InnoDB was introduced into MySQL by another company in the form of a plug-in. Since relying solely on binlog does not provide crash-safe capabilities, InnoDB uses another log system, namely redo log, to achieve crash-safe capabilities. There are three differences between these two logs. redo log is specific to the InnoDB engine; binlog is implemented at the server layer of MySQL and can be used by all engines. Redo log is a physical log that records "what changes have been made to a certain data page"; binlog is a logical log that records the original logic of the statement, such as "add 1 to the c field of the row with ID=2". Redo log is written in a cycle, and its space will eventually run out; binlog can be written in appended form. "Append write" means that after the binlog file reaches a certain size, it will switch to the next one and will not overwrite the previous log. 5 Two-Phase Commit The writing of redo log is divided into two steps: prepare and commit, which is called "two-phase commit". 5.1 Why is “two-phase commit” necessary? This is to make the logic between the two logs consistent. Binlog records all logical operations in an " append write " format. If your DBA promises that recovery can be done within half a month, then the backup system will definitely save all binlogs from the last half a month, and the system will regularly back up the entire database. The "regularly" here depends on the importance of the system, it can be once a day or once a week. When you need to restore to a specified second, for example, at 2 pm one day, you find that a table was accidentally deleted at 12 pm and need to recover the data, you can do this: First, find the most recent full backup. If you are lucky, it may be the backup from last night. Restore from this backup to the temporary database. Then, starting from the backup time, take out the backed up binlogs one by one and replay them to the time before the table was accidentally deleted at noon. In this way, your temporary database will be the same as the online database before the accidental deletion. Then you can take the table data out of the temporary database and restore it to the online database as needed. After talking about the data recovery process, let’s talk about why the log requires “two-phase commit”. Let's use proof by contradiction to explain this. 5.2 If two-phase commit is not used Since redo log and binlog are two independent logics, if two-phase commit is not used, either the redo log is written first and then the binlog is written, or the order is reversed. Let's see what problems arise with these two approaches. Still using the previous update statement as an example, the value of field c in the current row with ID=1 is 1. 5.2.1 Write redo log first, then binlog Suppose that the MySQL process restarts abnormally when the redo log is written but the binlog is not. As we said before, after the redo log is written, the data can still be recovered even if the system crashes, so the value of c in this row after recovery is 2. However, since the binlog crashed before it was finished writing, this statement was not recorded in the binlog. If this binlog needs to be used to restore the temporary database, the temporary database will be missing this update because the binlog of this statement is lost. The value of c in the restored row is 1, which is different from the value in the original database. 5.2.2 Write binlog first, then redo log If the transaction crashes after the binlog is written, the redo log has not been written yet and the transaction is invalid after crash recovery, so the value of c in this row is 1. However, the log "Changing c from 1 to 2" has been recorded in the binlog. Therefore, when binlog is used for recovery later, the c value of this row in the restored temporary database is 2, which is different from the value in the original database. It can be seen that if "two-phase commit" is not used, the state of the database may be inconsistent with the state of the database restored using its log. In fact, this process is not only needed to recover data after an incorrect operation. When capacity expansion is needed, that is, when more backup databases are needed to increase the system's read capability, the common practice now is to use full backup plus application binlog to achieve this. This "inconsistency" will cause inconsistencies in your master-slave database online. Simply put, both redo log and binlog can be used to indicate the commit status of a transaction, and two-phase commit keeps these two states logically consistent. The redo log is used to ensure crash-safe capabilities. When the innodb_flush_log_at_trx_commit parameter is set to 1, it means that the redo log of each transaction is directly persisted to disk. It is recommended to set this parameter to 1 to ensure that data is not lost after an abnormal restart of MySQL. When the sync_binlog parameter is set to 1, it means that the binlog of each transaction is persisted to disk. It is also recommended to set this parameter to 1, so that the binlog is not lost after an abnormal restart of MySQL. The above is the detailed content of the detailed explanation of the execution process of the MySQL update statement. For more information about the execution process of the update statement, please pay attention to other related articles on 123WORDPRESS.COM! You may also be interested in:
|
<<: A brief discussion on how to cancel requests and prevent duplicate requests in axios
>>: Solve the problem that PhpStorm fails to connect to VirtualBox
Table of contents Preface toDoList just do it Pre...
From handicraft design to graphic design to web de...
Note: You cannot use scoped animations! ! ! ! via...
Table of contents Preface 1. Basic knowledge of f...
This article example shares the specific code for...
Table of contents 1. Location / Matching 2. Locat...
I have read countless my.cnf configurations on th...
Table of contents MySql8.0 View transaction isola...
The default ssh remote port in Linux is 22. Somet...
It is standard for websites to enable SSL nowaday...
Database application is an indispensable part of ...
Table of contents Cause of the incident Use Node ...
1. Naming conventions 1. Database names, table na...
This article uses examples to illustrate the func...
Two ways to navigate the page Declarative navigat...