introduction When I was learning more about database theory, I learned about the possible problems with different isolation levels of things. For a better understanding, we tested and reproduced these problems in the MySQL database. Dirty reads and non-repeatable reads can be easily reproduced under the corresponding isolation levels. However, as for phantom reads, I found that they did not occur under the repeatable read isolation level. At that time, I thought that maybe MySQL did something to deal with phantom reads? test: Create a test table dept: CREATE TABLE `dept` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=utf8 insert into dept(name) values("Logistics Department")
According to the above process, it is expected that the first select of transaction 1 will query one piece of data, and the second select will query two pieces of data (including the data submitted by transaction 2). However, in actual testing, it was found that the second select actually only queries and processes one piece of data. However, according to the implementation of repeatable read in database theory (exclusive locks and shared locks), this should not be the case. Before we understand the actual reasons, let’s review the relevant theories of things. Database Principles Theory thing Transaction generally refers to something to be done or done. In computer terms, it refers to a program execution unit that accesses and possibly updates various data items in a database. A transaction consists of all operations performed between the start of a transaction and the end of a transaction. In a relational database, a transaction can be a set of SQL statements or an entire program. Why do we have things? A database transaction typically consists of a sequence of operations that read or write to the database. It exists for two purposes:
Characteristics of things Transactions have four characteristics: atomicity, consistency, isolation, and durability. These four properties are often referred to as ACID properties.
Several properties between things are not a set of equivalent concepts: If there is only one thing at any time, it is naturally isolated, and consistency can be achieved as long as atomicity is guaranteed. If there is concurrency, atomicity and isolation must be guaranteed to ensure consistency. Problems in concurrent database transactions If transaction isolation is not considered, the following problems may occur:
Exclusive lock, shared lock Exclusive lock, also known as X lock, write lock. Shared lock, also known as S lock, read lock. The relationship between read-write locks is as follows:
That is, the relationship between read-write locks can be summarized as: multiple reads and single write Isolation level of transactions There are several isolation levels in transactions:
Implementation of isolation levels in MySQL The above content explains some concepts of database theory, but in databases such as MySQL and ORACLE, for performance reasons, they are not implemented completely according to the theories introduced above. MVCC Multi-Version Concurrency Control (MVCC) is a way to implement isolation levels in MySQL based on optimistic locking theory. It is used to implement read committed and repeatable read isolation levels. Implementation (isolation level is repeatable read) Before talking about how to implement it, let’s introduce two concepts:
In MySQL, two fields are added after each record in the table: Create version number: When creating a row of data, assign the current system version number as the creation version number Delete version number: When deleting a row of data, the current system version number is assigned as the deletion version number SELECT The rule for reading data during select is: the creation version number <= the current transaction version number, and the deletion version number is empty or > the current transaction version number. The creation version number <= the current transaction version number ensures that the retrieved data will not contain data created in transactions started later. This is why we don't find the data added later in the initial example. The deletion version number is empty or > the current transaction version number, which ensures that the data has not been deleted at least before the transaction is started, and is data that should be checked out. INSERT When inserting, assign the current system version number to the creation version number field. UPDATE Insert a new record, save the current transaction version number as the row creation version number, and save the current transaction version number to the originally deleted row. In fact, the update here is implemented through delete and insert. DELETE When deleting, the current system version number is assigned to the deletion version number field to indicate in which transaction the row of data will be deleted, even if the data is not actually deleted when the commit is performed. The data will not be found even if the data is opened according to the select rule. Does MVCC really solve phantom reading? From our initial test examples and the theoretical support above, it seems that the phantom read problem has been solved through MVCC in MySQL. Since serial read seems to be meaningless, we continue testing with doubts. Pre-test data:
Based on the above results, we expect the following result:
But in fact, our experience is: We originally hoped to change the department of the first data to Finance, but in the end both data were modified. This result tells us that the MySQL repeatable read isolation level does not completely solve the problem of phantom reads, but solves the phantom read problem when reading data. However, the phantom read problem still exists for modification operations, which means that MVCC's solution to phantom reads is not thorough. Snapshot read and current read When the above situation occurs, we need to know why it happens. After consulting some information, I found that at the RR level, although the MVCC mechanism makes data repeatable, the data we read may be historical data, not the latest data in the database. This method of reading historical data is called snapshot read, and the method of reading the latest version of the database is called current read. select snapshot read When performing a select operation, InnoDB will perform a snapshot read by default, and will record the result of this select. The data of this snapshot will be returned in subsequent selects. Even if other transactions are committed, it will not affect the data of the current select, thus achieving repeatable read. The snapshot is generated when select is executed for the first time. That is to say, suppose A starts a transaction and then does not perform any operations. At this time, B inserts a piece of data and commits it. At this time, A executes select, then the returned data will include the data added by B. It doesn't matter if other transactions are committed afterwards, because the snapshot has been generated and subsequent selects are based on the snapshot. Current Reading For operations that modify data (update, insert, delete), the current read mode is used. When executing these operations, the latest records will be read, and even data submitted by other transactions can be queried. Suppose you want to update a record, but the data has been deleted and committed in another transaction. If you update it, there will be a conflict, so you need to know the latest data when updating. It is precisely because of this that the situation we tested above occurred. The current read of select needs to be manually locked: select * from table where ? lock in share mode; select * from table where ? for update; There is a problem to explain At the beginning of the test, I thought that using the begin statement was to start a transaction. So in the second test above, because transaction 1 was started first, the newly added data of transaction 2 was found in transaction 1. At that time, I thought that this was inconsistent with the select rules in the previous MVCC, so I did the following test: SELECT * FROM information_schema.INNODB_TRX //Used to query the currently executing transaction It can be seen that if you just execute the begin statement, a transaction is not actually started. Next, add a select statement after begin: Therefore, it is important to understand that a transaction is actually opened only after operations such as adding, deleting, modifying, and checking the data have been performed. How to solve phantom reads Obviously, the repeatable read isolation level cannot completely solve the problem of phantom reads. If we need to solve phantom reads in our project, there are two ways:
In fact, the above two methods are not used in many projects. The performance of serialized reading is too poor, and in fact, phantom reading is often completely acceptable to us. Summarize The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. Thank you for your support of 123WORDPRESS.COM. You may also be interested in:
|
<<: Vue integrates Tencent TIM instant messaging
>>: Docker installs and runs the rabbitmq example code
1. Download and decompress 1. Introduction to Zoo...
MyISAM and InnoDB are the most common storage eng...
Written in front There are two ways to upgrade My...
Content 1. Give readers a reason to stay. Make the...
Nginx, pronounced "engine x," is an ope...
Although Microsoft has done a lot of research and ...
1. Zabbix backup [root@iZ2zeapnvuohe8p14289u6Z /]...
1. Introduction When a web project is published o...
Table of contents uni-app Introduction HTML part ...
MySQL supports hash and btree indexes. InnoDB and...
Preface I wrote an article about rem adaptation b...
In this article we assume you already know the ba...
Preface It is said that if the people doing opera...
I recently wrote a mobile page at work, which was...
How to install PHP7 on Linux? 1. Install dependen...