Implementation of MySQL's MVCC multi-version concurrency control

Implementation of MySQL's MVCC multi-version concurrency control

1 What is MVCC

The full name of MVCC is: Multiversion concurrency control. It processes the memory read within the transaction when providing concurrent access to the database to avoid the concurrency problem of write operations blocking read operations.

For example, programmer A is reading certain content in the database, and programmer B is modifying this content (assuming that the modification is within a transaction, which lasts about 10 seconds). A may see inconsistent data within these 10 seconds. How can A always read consistent data before B commits?

There are several ways to deal with this. The first one is lock-based concurrency control. When programmer B starts to modify the data, he locks the data. When programmer A reads the data at this time, he finds that he cannot read it and is in a waiting state. He can only read the data after B completes the operation. This ensures that A will not read inconsistent data, but this will affect the running efficiency of the program. There is another one: MVCC. When each user connects to the database, he sees a snapshot of the database at a specific moment. Before B's transaction is committed, A always reads a snapshot of the database at a specific moment and will not read the data modifications in B's transaction. Only when B's transaction is committed can A read B's modifications.

A database that supports MVCC, when updating certain data, does not overwrite the old data with the new data, but instead marks the old data as outdated and adds a new data version elsewhere. Therefore, multiple versions of the same data are stored, but only one is the most recent.

MVCC provides a solution for time consistency. When reading transactions under MVCC, a timestamp or transaction ID is usually used to determine which database state and which version of data to access. Read transactions and write transactions are isolated from each other and will not affect each other. Assume that the same piece of data is accessed by both read transactions and write transactions. In fact, the write transaction will create a new data version, while the read transaction accesses the old data version. The read transaction will not access the new data version until the write transaction is committed.

There are two ways to implement MVCC. The first way is to save multiple versions of data records in the database. When these different versions of data are no longer needed, the garbage collector reclaims these records. This method is adopted by PostgreSQL and Firebird/Interbase. SQL Server uses a similar mechanism, but the difference is that the old version data is not stored in the database, but in another database different from the main database, tempdb. The second implementation method only saves the latest version of data in the database, but dynamically reconstructs the old version of data when undo is used. This method is used by Oracle and MySQL/InnoDB.

2. InnoDB's MVCC implementation mechanism

MVCC can be considered a variant of row-level locking, which can avoid locking operations in many cases and thus has lower overhead. Most MVCC implementations implement non-blocking read operations, and write operations only lock necessary rows. InnoDB's MVCC implementation is achieved by saving a snapshot of data at a certain point in time. No matter how long a transaction takes to execute, the data seen within it is consistent. That is, transactions will not affect each other during execution. Below we briefly describe the implementation of MVCC in InnoDB.

InnoDB's MVCC is implemented by saving two hidden columns after each row of records: one column saves the creation time of the row, and the other saves the expiration time (deletion time) of the row. Of course, the time here is not a timestamp, but a system version number. Every time a new transaction is started, the system version number will increase. At the RR isolation level, MVCC operates as follows:

Select the operation.

InnoDB only searches for data rows whose versions are earlier than (or equal to) the current transaction version. It is ensured that the rows read by a transaction either exist before the transaction starts, or are records inserted or modified by the transaction itself.

The deleted version of a row is either undefined or greater than the current transaction version number. You can ensure that rows read by a transaction were not deleted before the transaction began.

insert operation. Save the current version number of the newly inserted row as the row version number.

delete operation. Save the current version number of the deleted row as the deletion mark.

update operation. It becomes a combination of insert and delete operations. The inserted row saves the current version number as the row version number, and the delete saves the current version number to the original row as a deletion marker.

Since the old data is not actually deleted, it must be cleaned up. InnoDB will start a background thread to perform the cleanup. The specific rule is to delete the rows whose version number is smaller than the current system version. This process is called purge.

3. A simple example

create table yang( 
    id int primary key auto_increment, 
    name varchar(20));
}

Assume that the system version number starts from 1.

INSERT

InnoDB stores the current system version number as the version number for each newly inserted row.
The first transaction ID is 1;

start transaction;
insert into yang values(NULL,'yang');
insert into yang values(NULL,'long');
insert into yang values(NULL,'fei');
commit;

The corresponding table in the data is as follows (the last two columns are hidden columns and we cannot see them through the query statement)

SELECT

InnoDB checks each row against the following two conditions:
a. InnoDB will only search for data rows whose versions are earlier than the current transaction version (that is, the system version number of the row is less than or equal to the system version number of the transaction). This ensures that the rows read by the transaction either exist before the transaction starts or are inserted or modified by the transaction itself.
b. The deleted version of the row is either undefined or greater than the current transaction version number, which ensures that the row read by the transaction has not been deleted before the transaction starts.
Only records that satisfy both a and b can be returned as query results.

DELETE

InnoDB saves the current system version number (transaction ID) as a deletion identifier for each deleted row.
See the following specific example analysis:
The second transaction, ID is 2;

start transaction;
select * from yang; //(1)
select * from yang; //(2)
commit;

Assumption 1

Assume that during the execution of transaction ID 2, when (1) is reached, another transaction ID 3 inserts a record into the table.
The third transaction ID is 3;

start transaction;
insert into yang values(NULL,'tian');
commit;

The data in the table is as follows:

Then, (2) in transaction 2 is executed. Since the creation time of the data with id=4 (transaction ID is 3), the ID of the current transaction is 2, and InnoDB only searches for data rows with transaction IDs less than or equal to the current transaction ID, the data row with id=4 will not be retrieved in (2) in transaction 2. The data retrieved by the two select statements in transaction 2 will only be as follows:

Assumption 2

Assume that during the execution of transaction ID 2, it has just executed to (1). Assume that after the transaction has executed transaction 3, it then executes transaction 4.
The fourth transaction:

start transaction;  
delete from yang where id=1;
commit;

At this point the tables in the database are as follows:

Next, transaction (2) with transaction ID 2 is executed. According to the SELECT search condition, it will search for rows whose creation time (ID of creation transaction) is less than the current transaction ID and whose deletion time (ID of deletion transaction) is greater than the current transaction ID. The row with id=4 has been mentioned above, and the row with id=1 has a deletion time (ID of deletion transaction) greater than the current transaction ID. Therefore, the select * from yang in transaction 2 (2) will also search for the data with id=1. Therefore, the data searched by the two select statements in transaction 2 are as follows:

UPDATE
When InnoDB executes UPDATE, it actually inserts a new row of records and saves its creation time as the ID of the current transaction, and also saves the deletion time from the current transaction ID to the row to be UPDATEd.

Assumption 3
Assume that after executing transaction 2 (1), other users execute transactions 3 and 4. At this time, another user performs an UPDATE operation on this table:
The fifth task:

start transaction;
update yang set name='Long' where id=2;
commit;

According to the update principle, a new row will be generated, and the transaction ID will be added to the deletion time column of the original column to be modified, resulting in the following table:

Continue to execute (2) of transaction 2. According to the search conditions of the select statement, the following table is obtained:

The same result is obtained as in transaction 2 (1) select.

This is the end of this article about the implementation of MySQL MVCC multi-version concurrency control. For more relevant MySQL MVCC multi-version concurrency control content, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Implementation of MySQL multi-version concurrency control MVCC
  • Detailed explanation of MySQL multi-version concurrency control mechanism (MVCC) source code
  • In-depth study of MySQL multi-version concurrency control MVCC
  • Analysis of the underlying principle of MySQL multi-version concurrency control MVCC
  • Implementation of MySQL Multi-version Concurrency Control MVCC
  • Mysql MVCC multi-version concurrency control details

<<:  JavaScript Function Currying

>>:  User Experience Summary

Recommend

Differences between MySQL MyISAM and InnoDB

the difference: 1. InnoDB supports transactions, ...

Solution to incomplete text display in el-tree

Table of contents Method 1: The simplest way to s...

Detailed explanation of common methods of JavaScript arrays

Table of contents Common array methods pop() unsh...

Issues with Rancher deployment and importing K8S clusters

Rancher deployment can have three architectures: ...

Detailed explanation of the difference between alt and title

These two attributes are often used, but their di...

The difference between method=post/get in Form

Form provides two ways of data transmission - get ...

Steps to set up HTTPS website based on Nginx

Table of contents Preface: Encryption algorithm: ...

Solution for Docker Swarm external verification load balancing not taking effect

Problem Description I created three virtual machi...

How webpack implements static resource caching

Table of contents introduction Distinguish betwee...

100 ways to change the color of an image using CSS (worth collecting)

Preface “When it comes to image processing, we of...

Detailed explanation of storage engine in MySQL

MySQL storage engine overview What is a storage e...

MySQL NULL data conversion method (must read)

When using MySQL to query the database and execut...

Solution to the problem of passing values ​​between html pages

The first time I used the essay, I felt quite awkw...

A brief discussion on two current limiting methods in Nginx

The load is generally estimated during system des...