Detailed explanation of MySQL phantom reads and how to eliminate them

Detailed explanation of MySQL phantom reads and how to eliminate them

This is a popular science article about database isolation levels. It aims to understand the famous phantom read phenomenon in databases. In order to focus on the topic, dirty reads and non-repeatable reads are not discussed.

Transaction Isolation Level

MySQL has four transaction isolation levels:

Read-uncommitted: Dirty read, non-repeatable read, and phantom read problems exist. Read-committed: No dirty read, but non-repeatable read and phantom read problems exist. Repeatable-read: No dirty read, non-repeatable read problems exist, but phantom read problems exist. Serialization: Solve dirty read, non-repeatable read, and phantom read problems, but the execution is completely serial, and the performance is the lowest.

What is phantom reading?

Wrong understanding of phantom read: phantom read means that transaction A executes two select operations to obtain different data sets, that is, select 1 obtains 10 records and select 2 obtains 11 records. This is not actually a phantom read, but a type of non-repeatable read, which only occurs at the RU RC level and does not occur at the MySQL default RR isolation level.

Here is my understanding of phantom reading:

Phantom reads do not mean that the result sets obtained from multiple reads in a transaction are different. What is more important about phantom reads is that the data state represented by the result set obtained from a certain select operation cannot support subsequent business operations. To be more specific: the select record does not exist, and you are going to insert this record, but when you execute insert, you find that this record already exists and cannot be inserted, just like an illusion

An example may make it easier to understand:

mysql> show create table user\G
*************************** 1. row ***************************
 Table: user
Create Table: CREATE TABLE `user` (
 `id` int(11) NOT NULL,
 `name` varchar(32) DEFAULT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Start two transactions T1 & T2 respectively, and set their isolation levels to Reaptable-Read:

T1:

mysql> set global transaction isolation level repeatable read;      
​
mysql> begin;
mysql> select * from user;
mysql> insert into user values ​​(1, 'jeff');
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
​
mysql> select * from user;

T2:

mysql> set global transaction isolation level repeatable read;      
​
mysql> begin;
mysql> insert into user values ​​(1, 'jeff');
mysql> commit;

T1 transaction checks whether there is a record with id 1 in the table, and inserts it if not.

T2 inserts interference records, causing phantom reads in T1.

In the above example, it is necessary to ensure that transaction T1 executes begin before executing transaction T2.

In the above example, a phantom read occurs at T1 because the data state read by T1 has a semantic conflict with the subsequent actions: when querying, it is clearly prompted that the record does not exist, but when inserting, it is prompted that the primary key is repeated, which is similar to the appearance of a phantom, so it is called a phantom read.

How to eliminate phantom reads

MySQL currently has two ways to eliminate phantom reads:

1. Manually add a row X lock to the select operation (SELECT ... FOR UPDATE). The reason is that the row lock in InnoDB locks the index. Even if the current record does not exist, the current transaction will obtain a record lock (if the record exists, a row X lock is added, if it does not exist, a next-key lock gap X lock is added). In this way, other transactions cannot insert records of this index, eliminating phantom reads.
2. Further increase the isolation level to SERIALIZABLE
Test the effect

mysql> begin;
​
mysql> select * from user where id = 2 for update;
mysql> insert into user values ​​(2, 'tony');

mysql> commit;

T2:

mysql> begin;
​
mysql> insert into user values ​​(2, 'jimmy');
ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'

Now when T1 is queried with for update, the index will be locked in Innodb (even if it does not exist currently), so the insert of transaction T2 will be blocked until T1 is committed. In this way, T1 succeeds. For T1, the phantom read is indeed eliminated, but the insert of T2 will report a duplicate primary key, which is also in line with expectations.

As for another way to improve the isolation level and eliminate phantom reads, those who are interested can try it themselves. I will not repeat it here. Its essence is similar, except that the system replaces manual locking.

Summarize

RR is the default isolation level of MySQL transactions. It is a compromise between transaction security and performance. After correctly understanding phantom reads, developers can decide whether to prevent phantom reads according to their needs.

SERIALIZABLE is pessimistic and believes that phantom reads will always occur, so it will automatically and implicitly add exclusive locks to the resources required by the transaction. Other transactions accessing this resource will be blocked and waiting. The transaction is safe, but performance needs to be carefully considered.

InnoDB locks are for indexes, which requires attention. Lock the row record. If it exists, add an X lock. Otherwise, add a next-key lock / gap lock / gap lock. Therefore, InnoDB can realize the pre-occupancy of a certain record by a transaction. As long as the transaction is still there, other transactions will not be able to occupy it. There will be a special article discussing locks later.

The above is a detailed explanation of MySQL phantom reads and how to eliminate them. For more information about MySQL phantom reads and how to eliminate them, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Example of using MySQL transaction features to implement concurrent and safe auto-increment ID
  • Solution to PHP+MySQL high-concurrency locked transaction processing problem
  • Can MySQL's repeatable read level solve phantom reads?
  • Detailed explanation of how MySQL solves phantom reads
  • Mysql transaction concurrency problem solution
  • MySQL Series 10 MySQL Transaction Isolation to Implement Concurrency Control
  • How to solve the phantom read problem in MySQL
  • mysql+mybatis implements stored procedure + transaction + multi-concurrent serial number acquisition
  • Detailed explanation of concurrent dirty read + non-repeatable read + phantom read in Mysql transactions

<<:  Detailed explanation of the TARGET attribute of the HTML hyperlink tag A

>>:  How to set the memory size of Docker tomcat

Recommend

Talk about important subdirectory issues in Linux system

/etc/fstab Automatically mount partitions/disks, ...

JavaScript flow control (loop)

Table of contents 1. for loop 2. Double for loop ...

How to deploy Rancher with Docker (no pitfalls)

Must read before operation: Note: If you want to ...

mysql indexof function usage instructions

As shown below: LOCATE(substr,str) Returns the fi...

Docker installation rocketMQ tutorial (most detailed)

RocketMQ is a distributed, queue-based messaging ...

Use crontab to run the script of executing jar program regularly in centOS6

1. Write a simple Java program public class tests...

Eight examples of how Vue implements component communication

Table of contents 1. Props parent component ---&g...

Basic tutorial on using explain statement in MySQL

Table of contents 1. Overview 1. Explain statemen...

Implementation method of Nginx+tomcat load balancing cluster

The experimental environment is as follows Here y...

How to enable the slow query log function in MySQL

The MySQL slow query log is very useful for track...

Example of how to install nginx to a specified directory

Due to company requirements, two nginx servers in...

How to export mysql query results to csv

To export MySQL query results to csv , you usuall...