PrefaceFirst, let's review what a transaction is. A transaction is the smallest unit of work for database operations. It is a series of operations performed as a single logical unit of work. These operations are submitted to the system as a whole and are either all executed or not executed. A transaction is a set of indivisible operations (logical unit of work). Characteristics of transactions: Atomicity: Atomicity means that all operations contained in a transaction are either completely successful or completely failed and rolled back. Consistency: The result of transaction execution must be to change the database from one consistent state to another consistent state. Isolation: The execution of a transaction cannot be interfered with by other transactions. That is, the operations and data used within a transaction are isolated from other concurrent transactions, and concurrently executed transactions cannot interfere with each other. Durability: Once a transaction is committed, the changes it makes to the data in the database should be permanent. Subsequent other operations or failures should not have any impact on its execution results. 1. Brief Analysis of the Underlying Principles of TransactionsAtomicity:Implementation principle: undo log Undo log is also called rollback log, which is the basis for achieving atomicity and isolation of transactions. When a transaction modifies the database, InnoDB generates a corresponding undo log; if the transaction fails or rollback is called, causing the transaction to be rolled back, the information in the undo log can be used to roll back the data to the state before the modification. Undo log is a logical log that records information related to SQL execution. When a rollback occurs, InnoDB will do the opposite of the previous work according to the contents of the undo log: for each insert, a delete will be executed when rolling back; for each delete, an insert will be executed when rolling back; for each update, an opposite update will be executed when rolling back to change the data back. Undo is stored in a special segment inside the database, which is called the undo segment. The undo segment is located in the shared tablespace. Undo is a logical log, so it only restores the database logic to its original state. Undo log will generate redo log, that is, the generation of undo log will be accompanied by the generation of redo log, because undo log also needs persistent protection. The undo log execution record is before each data is written or modified. Principle of undo log usage: Each row of data in the database table will have two more columns, DATA_TRX_ID and DATA_ROLL_PTR (there may also be a column DB_ROW_ID, which is automatically added when there is no default primary key). DATA_TRX_ID indicates the transaction version of the current data, and DATA_ROLL_PTR points to the transaction version just copied to The old version record in the undo log chain. The undo log is a linked list. If multiple transactions modify the file multiple times, undo logs will continue to be generated and a pointing relationship will be established through DATA_ROLL_PTR. Let me illustrate this with a picture: In this way, once the transaction is rolled back, MySQL can restore the data with the help of undo log, thereby ensuring the atomicity of uncommitted transactions. PersistenceImplementation principle: redo log Since InnoDB is the storage engine of MySQL, data is stored on disk. In order to reduce disk IO and improve reading performance, InnoDB provides a cache pool - Buffer Pool. The Buffer Pool contains the mapping of some data pages in the disk, which serves as a buffer for accessing the database: when reading data from the database, it is first read from the Buffer Pool. If the data is not in the Buffer Pool, it is read from the disk and then put into the Buffer Pool; when writing data to the database, it is first written to the Buffer Pool. The modified data in the Buffer Pool is periodically refreshed to the disk (this process is called flushing). However, this also brings a new problem. If MySQL crashes and the modified data in the Buffer Pool has not been refreshed to the disk, data will be lost and the persistence of the transaction cannot be guaranteed. In order to solve this problem, redo log, also called redo log, was introduced. When data is modified, in addition to modifying the data in the Buffer Pool, the operation is also recorded in the redo log; when the transaction is committed, the fsync interface is called to flush the redo log to disk. If MySQL crashes, you can read the data in the redo log when it restarts and restore the database. The redo log uses WAL (Write-ahead logging). All modifications are written to the log before being submitted, ensuring that data will not be lost due to MySQL downtime, thus meeting the persistence requirements. The redo log is for physical pages and is executed concurrently. The last commit will overwrite the uncommitted data. Local redo log: The redo log also has a buffer - the redo log buffer. When the transaction is committed, all modification information will be refreshed to the disk. Users can also modify the refresh policy by controlling the value of the variable innodb_flush_log_at_trx_commit (default 1). For example, if the value is set to 2, the refresh is controlled to occur every second, so that transactions are The transaction will be faster, but there may be a risk of log loss. The redo log is recorded after the SQL statement is executed. Since redo log also needs to be stored and involves disk IO, why do we still use it? (1) The storage of redo log is sequential, while cache synchronization is a random operation. (2) Cache synchronization is based on data pages, and the size of data transmitted each time is larger than the redo log. The redo log is used to recover data and ensure the persistence of committed transactions. Isolation: principle: consistency:Consistency is rather special. The aforementioned atomicity, persistence, and isolation all serve consistency. In addition, consistency also relies on the guarantees provided by the database itself, such as SQL syntax verification, column type insertion data type verification, and also on the guarantees provided by the application layer. For example, for a transfer operation, developers are required to deduct the transferor's balance and increase the recipient's balance. If there is a problem at the application level, consistency cannot be guaranteed. 2. Analysis of the underlying principles of isolation levelsIn the analysis of the underlying principles of transactions, the principle of isolation was not discussed in detail, so we will briefly introduce it here. First, let me introduce MySQL's MVCC (MultiVersion Concurrency Control), which is called multi-version concurrency control. It is implemented by relying on undo log and read view . Undo log has been introduced above and will not be repeated here. Read view is different from the database view we usually understand. It is used to determine the visibility of the current data version. ReadView has four main properties: (1). m_ids represents all currently active transaction IDs when the ReadView is generated. Active means that the transaction has been started but not yet committed. (2). min_trx_id indicates the smallest transaction ID among the currently active mIds; (3). max_trx_id indicates the maximum transaction ID when generating ReadView, which is not necessarily the largest transaction ID in mIds; (4).creator_trx_id indicates the transaction ID that created the ReadView. Note: Every time a transaction is opened, the transaction ID is incremented once. The transaction ID can be regarded as a global auto-increment variable. The first transaction opened may not be committed before the later transactions, such as long connections, so do not assume that max_trx_id is the maximum value in mIds. How does read view use the above four attributes to determine which version of data a transaction should read? If the data_trx_id of the accessed version is less than the minimum value in m_ids, it means that the transaction that generated the version has been committed before the ReadView is generated, and the version can be accessed by the current transaction. If the data_trx_id attribute value of the accessed version is the same as the creator_trx_id value in ReadView, it means that the current transaction is accessing the record that it has modified, so the version can be accessed by the current transaction. If the data_trx_id attribute value of the accessed version is greater than the max_trx_id value in ReadView, it indicates that the transaction that generated this version was opened after the current transaction generated ReadView, so this version cannot be accessed by the current transaction. If the data_trx_id attribute value of the accessed version is between the min_trx_id and max_trx_id of the ReadView, it is necessary to determine whether the trx_id attribute value is in the m_ids list. If it is, it means that the transaction that generated the version when the ReadView was created is still active and the version cannot be accessed. If it is not, it means that the transaction that generated the version when the ReadView was created has been committed and the version can be accessed. When a transaction wants to read a row of data, it first uses the above rules to determine the latest version of the data, that is, the row of records. If it is found to be accessible, it is read directly. If it is found to be inaccessible, it finds the undo log through the DATA_ROLL_PTR pointer and recursively searches for each version until it reads the version it can read. If it cannot be read, it returns empty. Therefore, when accessing data, a view will be created in the database, and the logical result of the view will be used as the basis for access: READ UNCOMMITED (uncommitted read): This isolation level directly returns the latest value on the record, and there is no view concept. Because reading does not add any locks, the write operation modifies the data during the reading process, which will cause dirty reads. The advantage is that it can improve concurrent processing performance and achieve parallel reading and writing. READ COMMITED : Under this isolation level, this view is created at the beginning of each SQL statement execution. InnoDB uses exclusive locks in READ COMMITTED, and uses the MVCC mechanism instead of locking data when reading. In other words, it adopts a read-write separation mechanism. REPEATABLE READ : At this isolation level, the view is created when the transaction starts and is used throughout the transaction. SERIALIZABLE : This isolation level directly uses locking to prevent parallel access. At this point, you may have discovered that MySQL can actually prevent phantom reads at the repeatable read isolation level with the help of MVCC. ConclusionAtomicity: Use undo log (rollback log) to implement rollback, thereby ensuring the atomicity of uncommitted transactions; Persistence: Use redo log to achieve data recovery, thereby ensuring the persistence of committed transactions; Isolation: Use locks and MVCC ideas to achieve read-write separation, read-read parallelism, and read-write parallelism; Consistency: Achieve consistency through rollback, recovery, and isolation in concurrent environments. This is the end of this article about the underlying principles of MySQL transactions and isolation levels. For more information about MySQL transactions and isolation levels, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: Detailed explanation of non-parent-child component value transfer in Vue3
>>: Introduction to installing JDK under Linux, including uninstalling OpenJDK
So which one of these formats, GIF, PNG, and JPG,...
Redis is a distributed cache service. Caching is ...
Nginx first decides which server{} block in the c...
Sometimes you need to create some test data, base...
I had always wanted to learn Kubernetes because i...
Docker container connection 1. Network port mappi...
1. Introduction to Layer 4 Load Balancing What is...
JSONObject is just a data structure, which can be...
Develop a number guessing game that randomly sele...
Prepare a CentOS6 installation disk (any version)...
Table of contents 1. Project construction 2: Dire...
Transactions ensure the atomicity of multiple SQL...
This is the content of React 16. It is not the la...
Table of contents 1. What is multi-instance 2. Pr...
Table of contents Preface Introduction to Session...