Detailed explanation of the implementation principle of transaction isolation level in MySQL

Preface

When it comes to database transactions, a lot of transaction-related knowledge will easily pop up in everyone's mind, such as the ACID characteristics of transactions, isolation levels, problems solved (dirty reads, non-repeatable reads, phantom reads), etc., but few people may really understand how these transaction characteristics are implemented and why there are four isolation levels.

Today we will first talk about the implementation principle of transaction isolation in MySQL, and will continue to publish articles to analyze the implementation principles of other features.

Of course, MySQL is extensive and profound, and omissions in the article are inevitable. Criticisms and corrections are welcome.

illustrate

The transaction implementation logic of MySQL is located at the engine layer, and not all engines support transactions. The following instructions are based on the InnoDB engine.

definition

Isolation means that after different transactions are submitted and executed one after another, the final results are serial. That is to say, for a transaction, during its execution, the data changes it perceives should only be those caused by its own operations, and there should be no data changes caused by other transactions.

Isolation solves the problems that arise from concurrent transactions.

Standard SQL isolation levels

The simplest way to implement isolation is to execute each transaction serially. If the previous transaction has not been completed, the subsequent transactions will wait. However, this implementation method obviously has low concurrency efficiency and is not suitable for use in actual environments.

In order to solve the above problems and achieve different degrees of concurrency control, the SQL standard setters proposed different isolation levels: read uncommitted, read committed, repeatable read, and serializable read. The highest isolation level is serialized read. In other isolation levels, since transactions are executed concurrently, some problems are more or less allowed. See the matrix below:

Isolation level (+: allowed, -: not allowed)	Dirty Read	Non-repeatable read	Phantom Read
Read Uncommitted	+	+	+
Commit Read	-	+	+
Repeatable Read	-	-	+
Serialized Read	-	-	-

Note that MySQL's InnoDB engine solves the problem of non-repeatable reads through MVCC at the committed read level, and solves the problem of phantom reads through gap locks at the repeatable read level. See the analysis below for details.

Implementation principle

Implementation principle of standard SQL transaction isolation level

The problem we encountered above is actually the control problem under concurrent transactions. The most common way to solve concurrent transactions is pessimistic concurrency control (that is, locks in the database). The implementation of standard SQL transaction isolation levels relies on locks. Let's see how it is implemented:

Transaction Isolation Level	Implementation
Read Uncommitted (RU)	The transaction does not lock the data currently being read; When a transaction updates certain data (that is, the moment the update occurs), a row-level shared lock must be added to it first, which will not be released until the transaction ends.
Read Committed (RC)	The transaction adds a row-level shared lock to the data currently being read (locked only when reading), and releases the row-level shared lock immediately after reading the row; When a transaction updates certain data (that is, the moment the update occurs), a row-level exclusive lock must be added to it first, and it will not be released until the transaction ends.
Repeatable Read (RR)	When a transaction reads a piece of data (that is, the moment it starts reading), it must first add a row-level shared lock to it, which is not released until the transaction ends; When a transaction updates certain data (that is, the moment the update occurs), a row-level exclusive lock must be added to it first, and it will not be released until the transaction ends.
Serialized read (S)	When a transaction reads data, it must first add a table-level shared lock to it, which is not released until the transaction ends; When a transaction updates data, it must first add a table-level exclusive lock to it, which is not released until the transaction ends.

It can be seen that when only locks are used to implement isolation level control, frequent locking and unlocking are required, and read-write conflicts are likely to occur (for example, at the RC level, transaction A updates data row 1, and transaction B reads data row 1 before transaction A commits, and must wait for transaction A to commit and release the lock).

In order to solve the problem of read-write conflicts without locking, MySQL introduced the MVCC mechanism. For details, please see my previous analysis article: Understand optimistic locking, pessimistic locking and MVCC in the database in one article.

InnoDB transaction isolation level implementation principle

Before we proceed, we need to understand a few concepts:

1. Locking read and consistent non-locking read

Locking read: Actively lock reads in a transaction, such as SELECT ... LOCK IN SHARE MODE and SELECT ... FOR UPDATE. Row shared locks and row exclusive locks are added respectively. The classification of locks can be found in my previous analysis article: MySQL lock classification you should know).

https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html

Consistent non-locking read: InnoDB uses MVCC to provide a database snapshot at a certain point in time to the transaction's query. The query sees changes made by transactions committed before that point in time, but not changes made by later or uncommitted transactions (except this transaction). That is to say, after the transaction is started, the data seen by the transaction are all the data at the moment the transaction is started, and subsequent modifications of other transactions will not be visible in this transaction.

Consistent read is the default mode for InnoDB to process SELECT statements at RC and RR isolation levels. A consistent non-locking read does not set any locks on the tables it accesses, so other transactions can concurrently read or modify them while a consistent non-locking read is being performed on the tables.

https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html

2. Current read and snapshot read

Current Reading

What is read is the latest version. Operations such as UPDATE, DELETE, INSERT, SELECT ... LOCK IN SHARE MODE, and SELECT ... FOR UPDATE are all current reads. Why are they called current reads? That is, it reads the latest version of the record. When reading, it must ensure that other concurrent transactions cannot modify the current record, and the read record will be locked.

Snapshot Read

What is read is the snapshot version, that is, the historical version. For example, an unlocked SELECT operation is a snapshot read, that is, an unlocked non-blocking read. The premise of a snapshot read is that the isolation level is not an uncommitted read or a serialized read level, because an uncommitted read always reads the latest data row, not the data row that conforms to the current transaction version, and a serialized read will lock the table.

3. Implicit locking and explicit locking

Implicit Locking

InnoDB uses a two-phase locking protocol during transaction execution (without actively performing explicit locking):

Locking can be performed at any time, and InnoDB will automatically lock when needed according to the isolation level;
The lock is released only when commit or rollback is executed, and all locks are released at the same time.

Explicit Locking

InnoDB also supports explicit locking through specific statements (storage engine layer)

select ... lock in share mode //Shared lock select ... for update //Exclusive lock

Display locks at the MySQL Server layer:

lock table
unlock table

After understanding the above concepts, let's take a look at how InnoDB transactions are implemented (the following reads refer to non-actively locked selects)

Transaction Isolation Level	Implementation
Read Uncommitted (RU)	The transaction does not lock the data currently being read, and all data is read currently; When a transaction updates certain data (that is, the moment the update occurs), a row-level shared lock must be added to it first, which will not be released until the transaction ends.
Read Committed (RC)	The transaction does not lock the data currently being read and is a snapshot read; When a transaction updates certain data (that is, the moment the update occurs), it must first add a row-level exclusive lock (Record) to it, which is not released until the transaction ends. Through snapshots, MySQL solves the problem of non-repeatable reads at this level.
Repeatable Read (RR)	The transaction does not lock the data currently being read and is a snapshot read; When a transaction updates certain data (that is, the moment the update occurs), it must first add a row-level exclusive lock (Record, GAP, Next-Key) to it, which is not released until the transaction ends. Through gap locks, MySQL solves the problem of phantom reads at this level.
Serialized read (S)	When a transaction reads data, it must first add a table-level shared lock to it, which is not released until the transaction ends. This is all current reading. When a transaction updates data, it must first add a table-level exclusive lock to it, which is not released until the transaction ends.

As you can see, InnoDB solves the problem of read-write conflicts very well through MVCC, and solves the problems of phantom reads and non-repeatable reads that occur at the standard level one level in advance, greatly improving the concurrency capability of the database.

Some common misunderstandings

Does phantom read include delete?

Non-repeatable read: A row is read multiple times before and after, the data content is inconsistent, and it is an update and delete operation for other transactions. To solve this problem, use row shared locks to lock until the end of the transaction (that is, RR level. Of course, MySQL uses MVCC to solve this problem at the RC level).

Phantom reads: Phantom reads occur when the same query generates different sets of rows at different times. This is caused by insert operations of other transactions. To solve this problem, lock the entire table until the end of the transaction (that is, S level. Of course, MySQL uses gap locks at RR level to solve this problem).

Many articles on the Internet mention phantom reads and committed reads. Some say that phantom reads include deletes, while others say that deletes should be considered a committed read problem. So what is the truth? Let's actually look at the official documentation of MySQL (as follows)

The so-called phantom problem occurs within a transaction when the same query produces different sets of rows at different times. For example, if a SELECT) is executed twice, but returns a row the second time that was not returned the first time, the row is a "phantom" row.
https://dev.mysql.com/doc/refman/5.7/en/innodb-next-key-locking.html

It can be seen that phantom reads are aimed at changes before and after the result set, so it seems that the delete situation should be classified as a phantom read. However, if we actually analyze the implementation principle of the standard SQL listed above at the RR level, we will know that the RR level of the standard SQL will add a row shared lock to the found data rows, so other transactions cannot delete these data rows at this time. Therefore, under RR, there will be no phantom read due to delete, that is, phantom read does not include delete.

Can MVCC solve the phantom reading problem?

Many articles on the Internet say that MVCC or MVCC+gap locks solve the phantom read problem. In fact, MVCC cannot solve the phantom read problem. For example:

begin;

#Assuming that the users table is empty, the data found below is empty select * from users; #No lock #At this time, another transaction is committed and a data with id=1 is inserted select * from users; #Read the snapshot, the data found is empty update users set name='mysql' where id=1; #update is the current read, so the update is successful and an updated snapshot is generated select * from users; #Read the snapshot and find out a record with id 1, because MVCC can find the snapshot generated by the current transaction commit;

It can be seen that the data rows checked before and after are inconsistent, and phantom reads have occurred. Therefore, MVCC alone cannot solve the phantom read problem. The solution to the phantom read problem depends on gap locks. as follows:

begin;

#Assume that the users table is empty, the data found below is empty select * from users lock in share mode; #Add shared lock #At this time, another transaction B wants to commit and inserts a data with id=1. Due to the gap lock, it has to wait select * from users; #Read the snapshot, the data found is empty update users set name='mysql' where id=1; #update is the current read, and no update is performed because no data exists select * from users; #Read the snapshot, the data found is empty commit;

#Transaction B is successfully submitted and data is inserted

Note that to solve the phantom read problem at the RR level, we need to explicitly lock it, otherwise it will not be locked during the query.

The above is a detailed explanation of the implementation principle of transaction isolation levels in MySQL. For more information about MySQL transaction isolation levels, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:

How to choose transaction isolation level in MySQL project
Detailed explanation of transaction isolation levels in MySql study notes
Description of the default transaction isolation level of mysql and oracle
Briefly describe the four transaction isolation levels of MySql
Mysql case analysis of transaction isolation level

<<: Implementation of importing and exporting docker images

>>: CSS method of controlling element height from bottom to top and from top to bottom

Some notes on installing fastdfs image in docker

Detailed explanation of the implementation principle of transaction isolation level in MySQL

Preface

illustrate

definition

Standard SQL isolation levels

Implementation principle

Implementation principle of standard SQL transaction isolation level

InnoDB transaction isolation level implementation principle

Some common misunderstandings

Does phantom read include delete?

Can MVCC solve the phantom reading problem?

Some notes on installing fastdfs image in docker

Example of how to implement keepalived+nginx high availability

JavaScript Sandbox Exploration

Detailed tutorial on installing mysql-8.0.13 (zip installation) on windows 10 system

Implementation of running springboot project with Docker

About ROS2 installation and docker environment usage

A brief discussion on using Cartesian product principle to query multiple tables in MySQL

Detailed explanation of using Docker to build externally accessible MySQL

Solve the problem of using less in Vue

Understand the use of CSS3's all attribute

Recommend

How to install binary MySQL on Linux and crack MySQL password

MySQL 8.0 user and role management principles and usage details

How to deploy MySQL and Redis services using Docker

Solution for coexistence of multiple versions of PHP under Linux system (super simple)

JavaScript to implement checkbox selection or cancellation

Summary of JavaScript's setTimeout() usage

Detailed explanation of HTML tables

Linux uses if to determine whether a directory exists.

Docker image export, import and copy example analysis

Steps to install MySQL on Windows using a compressed archive file

Practice of using Tinymce rich text to customize toolbar buttons in Vue

Detailed tutorial for installing nginx on centos8 (picture and text)

Do you know how to use Vue to take screenshots of web pages?

React uses routing to redirect to the login interface

New ideas for time formatting in JavaScript toLocaleString()