Detailed explanation of the MySQL MVCC mechanism principle

Detailed explanation of the MySQL MVCC mechanism principle

What is MVCC

MVCC, the full name of which is Multi-Version Concurrency Control, is multi-version concurrency control. MVCC is a concurrency control method, which is generally used in database management systems to implement concurrent access to databases and transactional memory in programming languages.

We know that in general, when we use the MySQL database, we use the Innodb storage engine. The Innodb storage engine supports transactions. So when multiple threads execute transactions at the same time, concurrency problems may occur. At this time, a method that can control concurrency is needed, and MVCC plays this role.

Mysql lock and transaction isolation level

Before understanding the principles of the MVCC mechanism, you need to first understand the MySQL lock mechanism and transaction isolation level. Leaving aside the MyISAM storage engine, for the Innodb storage engine, there are two types of locks: row locks and table locks. Table locks lock the entire table in one operation, which has the largest lock granularity but the lowest performance and will not cause deadlocks. Row lock locks a row at a time, which has a small lock granularity and high concurrency, but may cause deadlock.

Innodb row locks are divided into shared locks (read locks) and exclusive locks (write locks). When a transaction adds a read lock to a row, other transactions are allowed to read the row, but write operations are not allowed. Other transactions are also not allowed to add a write lock to the row, but a read lock can be added.

When a transaction adds a write lock to a row, other transactions are not allowed to write to this row, but they can read it. At the same time, other transactions are not allowed to add a read-write lock to this row.

Let's take a look at the transaction isolation levels of MySQL, which are divided into the following four levels:

  1. Read uncommitted: A transaction can read data that other transactions have not yet committed, resulting in dirty reads. For example, there is a salary table. Transaction A is started first, and then the salary of the employee with id 1 is queried. Assume that the salary is 1000 at this time. At this time, transaction B is also started and an update operation is performed, reducing the salary of the employee with id 1 by 100, but the transaction is not committed. At this time, if the query operation of transaction A is executed again, the data updated by transaction B can be read. If transaction B is rolled back at this time, transaction A will read "dirty" data. When transaction A performs an update operation, phantom reads may also occur.
  2. Read Committed: A transaction can only read data modified by another committed transaction, and after other transactions modify and commit the data once, the transaction can query the latest value. In the same example, this time the transaction isolation level is Read Committed, and transaction B does not commit the transaction, transaction A cannot read the data updated by transaction B, thus avoiding the generation of dirty data. However, after transaction B is committed, transaction A will find that the data has changed when executing the same query again. This is called non-repeatable read, which means that the results of executing the same query multiple times in the same transaction are inconsistent. At the same time, phantom reads still exist.
  3. Repeatable read: After a transaction reads a record for the first time, even if other transactions modify the value of the record and commit, when the transaction reads the record again, it still reads the value of the first read, rather than reading different data each time. This is repeatable read. This isolation level solves the problem of non-repeatability, but phantom reads may still occur.
  4. Serialization: This isolation level does not cause dirty reads or phantom reads because all operations on the same record are serial. However, this is not a concurrent transaction.

MySQL undo log

MVCC relies on Mysql's undo log at the bottom layer. The undo log records the operations of the database. Because the undo log is a logical log, it can be understood that when a record is deleted, the undo log will record a corresponding insert record. When a record is updated, the undo log will record an opposite update record. When the transaction fails and needs to be rolled back, it can be rolled back by reading the corresponding content in the undo log. MVCC makes use of the undo log.

Implementation principle of MVCC

The implementation of MVCC utilizes the implicit fields, undo log and ReadView of the database. First, let's look at the implicit fields. In fact, MySQL implicitly records the following hidden fields behind each row in the table: DB_TRX_ID (the ID of the most recently modified (modify/insert) transaction), DB_ROLL_PTR (rollback pointer, pointing to the previous version of this record), and DB_ROW_ID (auto-increment ID. If the data table does not have a primary key, the clustered index is created with this ID by default).

There are two types of undo logs: insert undo log, which is an undo log generated when a new record is inserted. It is only needed when a transaction is rolled back and can be discarded immediately after the transaction is committed. Update undo log, which is an undo log generated when a transaction is updated or deleted. It is not only needed when the transaction is rolled back, but also when snapshots are read. Therefore, it cannot be deleted casually. The corresponding log will be cleared uniformly by the purge thread only when the fast read or transaction rollback does not involve the log. MVCC uses update undo log.

In fact, the undo log records a version chain. Suppose there is a record in the database as follows:

Now there is a transaction A that modifies this record and changes the name to tom. The operation flow at this time is:

  • Transaction A first adds a row lock to the row record
  • Then copy the row record to the undo log as an old version
  • After copying, change the name of the row to tom, and then change the value of DB_TRX_ID of the row to the id of transaction A. At this time, assume that the id of transaction A is 1, and point the DB_POLL_PTR of the row to the record copied to the undo log.
  • After the transaction is committed, the lock is released

The situation at this point is as follows:

At this time, another transaction B modifies this record and changes age to 28. The operation flow at this time is:

  • Transaction B adds a row lock to the row record
  • The row record is copied to the undo log as an old version. If the undo log already has a record, a new undo log is inserted at the front of the undo log of the row record as the header of the linked list.
  • After copying, change the age of the row to 28, and then change the value of DB_TRX_ID of the row to the id of transaction B. At this time, assume that the id of transaction B is 2, and point the DB_POLL_PTR of the row to the record copied to the undo log.
  • Release the lock after the transaction is committed

The situation at this point is as follows:

From the above, we can see that modifications made to the same row of records by different transactions or the same transaction will cause the undo log of the row of records to form a version chain. The head of the undo log chain is the most recent old record, and the tail of the chain is the earliest old record.

Now let's assume a situation. Let's assume that neither transaction A nor transaction B has been committed. At this time, there is a transaction C that modifies the record named tom and changes age to 30. Then the transaction is committed. The id of transaction C is 3. Similarly, a record will be inserted into the undo log. At this time, the DB_TRX_ID of the first record in the undo log version chain is 3.

Now there is a transaction D, which queries the record with name tom. At this time, snapshot read will be enabled. The snapshot is a data snapshot triggered by the query operation at the beginning of the transaction. Unlocked read is snapshot read by default under the repeatable read isolation level. In contrast to snapshot read, there is also a current read. All update operations are current reads. A read view is generated during snapshot read. At the moment when the transaction executes snapshot read, a current snapshot of the database is generated to record and maintain the ID of the currently active transaction. Because the transaction ID is auto-incremental, the newer the transaction, the larger the ID. The read view follows the visibility algorithm, and whether it is visible requires some judgment. In addition to recording the currently active transaction ID, the read view also records the currently created maximum transaction ID. When reading a snapshot, it needs to be compared with the read view to obtain the visibility result.

Read view mainly compares the ID of the current transaction with the ID of the active transaction in the system. The comparison rules are as follows:

First, the Read view contains an array of transaction IDs that are active in the system at the time the Read view is generated, temporarily called id_list

Then the Read view will record the smallest transaction ID in the id_list, temporarily called low_id

Finally, the Read view will also record a transaction ID that has not been assigned in the system when the Read view is generated, which is the current largest transaction ID + 1, temporarily called high_id

  • If the current transaction ID is less than low_id, the current transaction is visible
  • If the current transaction ID is greater than high_id, the current transaction is not visible.
  • The current transaction is greater than low_id and less than high_id. Then determine whether it is in id_list. If it is, it means that the active transaction has not been committed yet. The current transaction is not visible, but it is visible to the active transaction itself. If it is not in id_list, the current transaction is visible

If the visibility result is invisible, you need to use DB_ROLL_PTR to get the DB_TRX_ID of the record from the undo log for comparison. By traversing the version chain until a DB_TRX_ID that meets specific conditions is found, the old record with this DB_TRX_ID is the latest old version that the current transaction can see.

The above is the detailed content of the detailed explanation of the MySQL MVCC mechanism principle. For more information about the MySQL MVCC mechanism principle, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Detailed explanation of MySQL transaction isolation level and MVCC
  • How is MySQL transaction isolation achieved?
  • In-depth understanding of Mysql transaction isolation level and locking mechanism issues
  • Solve the problem of MySql8.0 checking transaction isolation level error
  • Analysis of the underlying principle of MySQL multi-version concurrency control MVCC
  • Implementation of MySQL Multi-version Concurrency Control MVCC
  • Mysql MVCC multi-version concurrency control details
  • MYSQL transaction isolation level and MVCC

<<:  How to set up cross-domain access in IIS web.config

>>:  How to keep the content within the container when the flex layout is stretched by child elements

Recommend

Nginx external network access intranet site configuration operation

background: The site is separated from the front ...

Idea deployment tomcat service implementation process diagram

First configure the project artifacts Configuring...

In-depth explanation of MySQL stored procedures (in, out, inout)

1. Introduction It has been supported since versi...

Detailed explanation of how to use CMD command to operate MySql database

First: Start and stop the mysql service net stop ...

Detailed explanation of flex layout in CSS

Flex layout is also called elastic layout. Any co...

Install mysql5.7 on Ubuntu 18.04

Ubuntu 18.04 installs mysql 5.7 for your referenc...

The difference between html, xhtml and xml

Development Trends: html (Hypertext Markup Languag...

Detailed explanation of Angular routing basics

Table of contents 1. Routing related objects 2. L...

Negative distance (empathy) - iterative process of mutual influence

Negative distance refers to empathy. Preface (rai...

Summary of the use of html meta tags (recommended)

Meta tag function The META tag is a key tag in th...

HTML version declaration DOCTYPE tag

When we open the source code of a regular website...

WeChat applet custom tabbar component

This article shares the specific code of the WeCh...

Problems encountered in using MySQL

Here are some problems encountered in the use of ...

React Fragment Introduction and Detailed Usage

Table of contents Preface Motivation for Fragment...