In-depth analysis of MySQL database transactions and locks

1. Basic Concepts

A transaction is a set of operations that satisfies the ACID properties. A transaction can be committed with COMMIT and rolled back with ROLLBACK. While it executes, the data passes through intermediate states; only the consistent states before and after the transaction are the states that actually exist in the database tables.

2. ACID

Atomicity: a transaction is treated as an indivisible minimum unit; either all of its operations are committed successfully, or they all fail and are rolled back. Rollback is implemented with the undo log, which records the modifications performed by the transaction; during rollback these modifications are simply reversed.

Undo Log: to guarantee atomicity, InnoDB backs data up to the undo log before modifying it. If an error occurs or the user executes a ROLLBACK statement, the backup in the undo log is used to restore the data to the state it was in before the transaction started. Unlike the redo log, there is no separate undo log file on disk; undo records are stored in special segments inside the database, called undo segments, which reside in the shared tablespace. InnoDB also adds hidden fields to every row: a 6-byte transaction ID (DB_TRX_ID), a 7-byte rollback pointer (DB_ROLL_PTR), and a hidden row ID.
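
A minimal sketch of this behavior, using a hypothetical table t (the table and values are illustrative only):

-- hypothetical table used throughout the examples below
CREATE TABLE t (id INT PRIMARY KEY, val INT) ENGINE=InnoDB;
INSERT INTO t VALUES (1, 100);

START TRANSACTION;
UPDATE t SET val = 200 WHERE id = 1;   -- the old value 100 is kept in the undo log
ROLLBACK;                              -- the undo log is replayed, restoring val to 100

SELECT val FROM t WHERE id = 1;        -- returns 100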

Consistency: The database maintains a consistent state before and after the transaction is executed. In the consistent state, all transactions have the same reading results for the same data.

Isolation : Changes made by one transaction are not visible to other transactions until they are finally committed.

Durability: once a transaction is committed, its changes are saved to the database permanently; even if the system crashes, the results of the transaction are not lost. After a crash, the redo log is used for recovery, which is how durability is achieved. Unlike the undo log, which records logical changes to data, the redo log records the physical changes made to data pages.
The four properties are related as follows:
1. Only when consistency is satisfied is the result of a transaction correct.
2. Without concurrency, transactions execute serially and isolation is naturally satisfied; in that case, satisfying atomicity is enough to satisfy consistency.
3. With concurrency, multiple transactions execute in parallel; to satisfy consistency, transactions must satisfy not only atomicity but also isolation.
4. Transactions must be durable in order to cope with system crashes.

3. AutoCommit

MySQL uses auto-commit mode by default: if you do not explicitly start a transaction with START TRANSACTION, every statement is treated as its own transaction and committed automatically.
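
A small sketch of controlling this behavior (the system variable names are standard MySQL):

SELECT @@autocommit;      -- 1 by default: every statement commits on its own
SET autocommit = 0;       -- disable auto-commit for the current session

START TRANSACTION;
UPDATE t SET val = val + 1 WHERE id = 1;
COMMIT;                   -- nothing is visible to other sessions until this point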

4. Transaction Isolation Level

Read uncommitted: Modifications in a transaction are visible to other transactions even if they are not committed.

Read committed: A transaction can only read changes made by committed transactions. In other words, changes made by a transaction are not visible to other transactions until they are committed.

Repeatable read: guarantees that reading the same data multiple times within the same transaction returns the same result.

Serializable: forces transactions to execute serially so that they do not interfere with each other and no concurrency consistency problems can occur. This isolation level is implemented with locking, because the locking mechanism ensures that conflicting transactions run one at a time, i.e. serially.
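
A sketch of inspecting and changing the isolation level (the variable is @@transaction_isolation in MySQL 8.0, @@tx_isolation in older versions):

SELECT @@transaction_isolation;                            -- REPEATABLE-READ is InnoDB's default
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;    -- change it for this session only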

5. Concurrency consistency issues

Background

In a concurrent environment, the isolation of transactions is difficult to ensure, so many concurrent consistency issues will arise.

Main scenarios

Lost modification: an update made by one transaction is overwritten by an update made by another transaction. For example, transactions T1 and T2 both modify the same record; T1 modifies and commits it first, T2 modifies it afterwards, and T2's change overwrites T1's.
Business scenario: a user can change an address by editing the address information, by setting a default address, or by deleting an address, and all three operations go through the same update statement. The interface exposed for updating an address therefore has to support setting the default address as well; editing the address information and setting the default address must not be exposed as two separate update interfaces, because the upper-level service would then issue two update calls for a single action, which can easily lead to lost modifications. A two-session sketch of the problem is shown below.
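
A two-session sketch of a lost update, where both sessions read a row and then write the whole row back (the address table and its columns are hypothetical):

-- both sessions first read the same row
SELECT street, is_default FROM address WHERE id = 1;                   -- ('old street', 0)

-- session 1: user edits the street and writes the whole row back
UPDATE address SET street = 'new street', is_default = 0 WHERE id = 1;

-- session 2: user sets the default flag and also writes the whole row back
UPDATE address SET street = 'old street', is_default = 1 WHERE id = 1;
-- session 1's street change has now been silently overwritten (lost modification)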

Dirty read: under different transactions, the current transaction can read data that another transaction has not yet committed. For example, T1 modifies a value but does not commit it, and T2 then reads that value. If T1 rolls back the modification, the data T2 read is dirty data.

Non-repeatable read: the same data set is read multiple times within one transaction, and before that transaction ends another transaction also accesses the same data set and modifies it. Because of the second transaction's modification, the two reads of the first transaction may return different results. For example: T2 reads a value, T1 modifies it, and when T2 reads the value again the result differs from the first read.

Phantom read: essentially a special case of non-repeatable read. T1 reads a range of data, T2 inserts new rows into that range, and when T1 reads the range again the result differs from the first read.
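
A two-session sketch of a non-repeatable read under READ COMMITTED, using the hypothetical table t from above:

-- session 1
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT val FROM t WHERE id = 1;        -- returns 100

-- session 2 (autocommit on)
UPDATE t SET val = 200 WHERE id = 1;   -- committed immediately

-- session 1, same transaction as above
SELECT val FROM t WHERE id = 1;        -- now returns 200: the two reads differ (non-repeatable read)
COMMIT;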

Summary:
The main cause of concurrency inconsistency problems is that transaction isolation is broken, and the solution is to guarantee isolation through concurrency control. Concurrency control can be achieved through locking, but having users manage locks themselves is quite complicated, so database management systems provide transaction isolation levels, which let users deal with concurrency consistency issues in a simpler way.

6. Lock

Lock granularity:

Row-level lock: only the row (or the portion of data) that needs to be modified is locked, rather than the whole resource. The chance of lock contention is small and system concurrency is high.

Table-level lock: the entire table is locked. The amount of locked data is large, which greatly increases the probability of lock contention and sharply reduces the system's concurrent performance.

Note: locking consumes resources. All lock operations (acquiring locks, releasing locks, checking lock status) add system overhead, so the smaller the lock granularity, the greater the overhead. When choosing a lock granularity, you need to trade off lock overhead against concurrency.
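
A quick sketch of the two granularities on the hypothetical table t:

-- table-level lock (explicit)
LOCK TABLES t WRITE;                        -- blocks all other sessions from reading or writing t
UNLOCK TABLES;

-- row-level lock (InnoDB, inside a transaction)
START TRANSACTION;
SELECT * FROM t WHERE id = 1 FOR UPDATE;    -- locks only the row with id = 1
COMMIT;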

Lock types

Read-write lock

Exclusive lock, abbreviated as X lock, also called write lock.
When a transaction adds an X lock to data object A, it can both read and update A. While the lock is held, no other transaction can place any lock on A.

Shared lock, abbreviated as S lock, also called read lock.
When a transaction adds an S lock to data object A, it can read A but not update it. While the lock is held, other transactions can add S locks to A but cannot add an X lock.

Intention Lock

Intention locks are table-level locks, but they do not actually lock any data; they only announce a transaction's intention to lock rows.

With both row-level and table-level locks present, if transaction T wants to add an X lock to table A, it must first check whether any other transaction has locked table A or any row in it. Checking every row of table A would be very time-consuming.
Intention locks add IX and IS on top of the existing X and S locks. IX and IS are table-level locks used to indicate that a transaction intends to add an X lock or an S lock to some row in the table.
There are two rules:
Before a transaction obtains an S lock on a data row, it must first obtain an IS lock (or a stronger lock) on the table.
Before a transaction obtains an X lock on a data row, it must first obtain an IX lock on the table.
With intention locks, if transaction T wants to add an X lock to table A, it only needs to check whether another transaction holds an X/IX/S/IS lock on the table itself. If one does, another transaction is using the table or some row in it, and T's table-level X lock request is blocked.

IS and IX locks are all compatible with each other, because they only express an intention to lock rows in the table, not an actual lock on them.
The compatibility relationship here applies to table-level locks, and a table-level IX lock is compatible with row-level X locks on different rows, so two transactions can each X-lock a different row of the same table.
(Transaction T1 wants to add an X lock to data row R1, and transaction T2 wants to add an X lock to data row R2 in the same table. Both transactions must first add an IX lock to the table, but IX locks are compatible with each other, and an IX lock is also compatible with the other row's X lock. Therefore both transactions can successfully lock and modify two different rows of the same table.)
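
A two-session sketch of this situation (lock inspection via performance_schema.data_locks is available in MySQL 8.0):

-- session 1
START TRANSACTION;
SELECT * FROM t WHERE id = 1 FOR UPDATE;   -- takes IX on t, then X on the row id = 1

-- session 2 (does not block: IX is compatible with IX)
START TRANSACTION;
SELECT * FROM t WHERE id = 2 FOR UPDATE;   -- takes IX on t, then X on the row id = 2

-- inspect the held locks (MySQL 8.0)
SELECT object_name, lock_type, lock_mode FROM performance_schema.data_locks;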

7. MySQL implicit and explicit locking

Implicit locking: MySQL's InnoDB storage engine uses a two-phase locking protocol; it automatically acquires locks when needed according to the isolation level, and all locks are released at the same time when the transaction ends. This is called implicit locking.

Two-phase locking protocol: locking and unlocking are divided into two phases. Serializable scheduling means that, through concurrency control, the result of concurrently executed transactions is the same as the result of executing them serially; serially executed transactions do not interfere with each other, so no concurrency consistency problems arise.

Explicit locking can also be requested with specific statements: SELECT ... LOCK IN SHARE MODE; (shared lock) and SELECT ... FOR UPDATE; (exclusive lock). The locks are released automatically when the transaction completes and is committed.
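
A sketch of explicit locking reads inside a transaction (in MySQL 8.0, SELECT ... FOR SHARE is the newer spelling of LOCK IN SHARE MODE):

START TRANSACTION;
SELECT val FROM t WHERE id = 1 LOCK IN SHARE MODE;   -- S lock: other sessions can read but not update the row
SELECT val FROM t WHERE id = 1 FOR UPDATE;           -- X lock: other sessions can neither lock nor update the row
UPDATE t SET val = val + 1 WHERE id = 1;
COMMIT;                                              -- all locks are released here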

MySQL three-level locking protocol

Level 1 locking protocol: when transaction T wants to modify data A, it must first add an X lock and release it only after T ends. This solves the lost-modification problem: two transactions cannot modify the same data at the same time, so one transaction's modification cannot be overwritten by the other's.

Level 2 locking protocol: on top of level 1, a transaction must add an S lock before reading data A and may release the S lock immediately after reading. This solves the dirty-read problem: if another transaction is modifying A, it holds an X lock under level 1, so the S lock cannot be acquired and uncommitted data cannot be read.

Level 3 locking protocol: on top of level 2, the S lock taken when reading data must be held until the transaction ends. This solves the non-repeatable-read problem: while A is being read, other transactions cannot acquire an X lock on A, so the data cannot change between reads.

8. Lock implementation of InnoDB engine

MVCC

Multi-version concurrency control (MVCC) is the concrete way MySQL's InnoDB storage engine implements isolation levels; it is used to implement the READ COMMITTED and REPEATABLE READ levels. The READ UNCOMMITTED level always reads the latest data row, which is a very weak requirement and does not need MVCC.

As mentioned in the locking section, locking can solve the concurrency consistency problems that arise when multiple transactions execute simultaneously. In practice, reads greatly outnumber writes, so read-write locks were introduced to avoid unnecessary lock contention, for example by allowing reads not to exclude each other. However, in a read-write lock, reads and writes are still mutually exclusive. MVCC uses the idea of multiple versions instead: write operations update the latest version snapshot, while read operations read an older version snapshot, so there is no mutual exclusion between them, which is similar to CopyOnWrite.

In MVCC, a transaction's modification operations (DELETE, INSERT, UPDATE) add a new version snapshot to the data row. The most fundamental cause of dirty reads and non-repeatable reads is that a transaction reads uncommitted changes of other transactions, so to solve these problems MVCC stipulates that read operations can only read committed snapshots. Of course, a transaction can read its own uncommitted snapshots; that is not considered a dirty read.

System version number SYS_ID: It is an increasing number. Every time a new transaction is started, the system version number will automatically increase.

Transaction version number TRX_ID: The system version number when the transaction starts.

MVCC's multi-version refers to multiple versions of snapshots, which are stored in the Undo log. The log connects all snapshots of a data row through the rollback pointer ROLL_PTR.

INSERT, UPDATE, and DELETE operations create a log and write the transaction version number TRX_ID. DELETE can be considered a special UPDATE, which also sets the DEL field to 1.

ReadView

MVCC maintains a ReadView structure, which mainly contains the list of uncommitted (active) transactions in the current system (TRX_IDs), as well as the minimum (TRX_ID_MIN) and maximum (TRX_ID_MAX) values of that list.

When a SELECT is performed, the relationship between the TRX_ID of a data row snapshot and TRX_ID_MIN / TRX_ID_MAX determines whether that snapshot can be used.

TRX_ID < TRX_ID_MIN: the data row snapshot was changed before all currently uncommitted transactions, so it can be used.

TRX_ID > TRX_ID_MAX: the data row snapshot was changed by a transaction that started after the current transaction, so it cannot be used.

TRX_ID_MIN <= TRX_ID <= TRX_ID_MAX: whether the snapshot can be used depends on the isolation level.

Committed read: if TRX_ID is in the TRX_IDs list, the transaction that produced the data row snapshot has not been committed, so the snapshot cannot be used; otherwise it has been committed and can be used.

Repeatable read: the snapshot cannot be used in either case. If it could be used, other transactions could also read this data row snapshot and modify it, so the value the current transaction reads from this row would change, which is exactly a non-repeatable read. If a data row snapshot cannot be used, follow the rollback pointer ROLL_PTR in the undo log to the next (older) snapshot and apply the checks above again.

Snapshot read and current read

Snapshot read: MVCC's SELECT operation reads data from a snapshot and requires no locking.

Current read: MVCC's other operations, which modify the database, require locking in order to read the latest data. So MVCC does not avoid locking entirely; it only avoids locking for plain SELECT operations.

If a SELECT does need to take locks, you can force a locking read, for example with the shared-lock and exclusive-lock statements mentioned earlier.
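
A two-session sketch of the difference under REPEATABLE READ, using the hypothetical table t:

-- session 1
START TRANSACTION;
SELECT val FROM t WHERE id = 1;              -- snapshot read: no locks, reads from the transaction's snapshot

-- session 2 commits a change in the meantime (autocommit on)
UPDATE t SET val = 999 WHERE id = 1;

-- session 1, same transaction
SELECT val FROM t WHERE id = 1;              -- still the old value (snapshot read)
SELECT val FROM t WHERE id = 1 FOR UPDATE;   -- current read: locks the row and returns 999
COMMIT;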

Next-Key Locks

Concept: Next-Key Locks are a lock implementation of MySQL's InnoDB storage engine. MVCC alone cannot solve the phantom read problem; Next-Key Locks exist to solve it. At the REPEATABLE READ isolation level, MVCC combined with Next-Key Locks solves the phantom read problem.

Record Locks: lock the index record of a row, not the row itself. If the table has no index, InnoDB automatically creates a hidden clustered index and locks its records.

Gap Locks: lock the gaps between index records, but not the records themselves. For example, while a transaction is executing the following statement, other transactions cannot insert a row with c = 15 into table t, because the gap between 10 and 20 is locked: SELECT c FROM t WHERE c BETWEEN 10 AND 20 FOR UPDATE;

Next-Key Locks: a combination of Record Locks and Gap Locks. A next-key lock locks both the index record and the gap before it, i.e. a range that is open at the front and closed at the back. For example, if an index contains the values 10, 11, 13, and 20, the following ranges can be locked: (-∞, 10], (10, 11], (11, 13], (13, 20], (20, +∞).
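
A sketch of how this prevents phantoms, assuming the table t from the statement above has an index on column c containing the values 10, 11, 13 and 20:

-- session 1 (REPEATABLE READ)
START TRANSACTION;
SELECT c FROM t WHERE c BETWEEN 10 AND 20 FOR UPDATE;   -- next-key locks the matching records and the gaps around them

-- session 2
INSERT INTO t (c) VALUES (15);                          -- blocks until session 1 commits: 15 falls inside a locked gap

-- session 1
SELECT c FROM t WHERE c BETWEEN 10 AND 20 FOR UPDATE;   -- same result as the first read: no phantom rows
COMMIT;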

9. Conclusion

The theory above is extensive, but it underpins the entire development process. When facing various business scenarios, you need to judge, based on the database's isolation level, whether transactions may cause deadlocks, data inconsistency, and other such problems. One of MySQL's strongest features is that it avoids phantom reads at the RR (Repeatable Read) level, balancing performance and data safety.
When developing microservices or distributed projects, try to keep transactions as simple as possible so that they do not hold locks on the affected rows for long. This also helps ensure data consistency.
