Detailed explanation of the MySQL MVCC mechanism principle

Detailed explanation of the MySQL MVCC mechanism principle

What is MVCC

MVCC, the full name of which is Multi-Version Concurrency Control, is multi-version concurrency control. MVCC is a concurrency control method, which is generally used in database management systems to implement concurrent access to databases and transactional memory in programming languages.

We know that in general, when we use the MySQL database, we use the Innodb storage engine. The Innodb storage engine supports transactions. So when multiple threads execute transactions at the same time, concurrency problems may occur. At this time, a method that can control concurrency is needed, and MVCC plays this role.

Mysql lock and transaction isolation level

Before understanding the principles of the MVCC mechanism, you need to first understand the MySQL lock mechanism and transaction isolation level. Leaving aside the MyISAM storage engine, for the Innodb storage engine, there are two types of locks: row locks and table locks. Table locks lock the entire table in one operation, which has the largest lock granularity but the lowest performance and will not cause deadlocks. Row lock locks a row at a time, which has a small lock granularity and high concurrency, but may cause deadlock.

Innodb row locks are divided into shared locks (read locks) and exclusive locks (write locks). When a transaction adds a read lock to a row, other transactions are allowed to read the row, but write operations are not allowed. Other transactions are also not allowed to add a write lock to the row, but a read lock can be added.

When a transaction adds a write lock to a row, other transactions are not allowed to write to this row, but they can read it. At the same time, other transactions are not allowed to add a read-write lock to this row.

Let's take a look at the transaction isolation levels of MySQL, which are divided into the following four levels:

  1. Read uncommitted: A transaction can read data that other transactions have not yet committed, resulting in dirty reads. For example, there is a salary table. Transaction A is started first, and then the salary of the employee with id 1 is queried. Assume that the salary is 1000 at this time. At this time, transaction B is also started and an update operation is performed, reducing the salary of the employee with id 1 by 100, but the transaction is not committed. At this time, if the query operation of transaction A is executed again, the data updated by transaction B can be read. If transaction B is rolled back at this time, transaction A will read "dirty" data. When transaction A performs an update operation, phantom reads may also occur.
  2. Read Committed: A transaction can only read data modified by another committed transaction, and after other transactions modify and commit the data once, the transaction can query the latest value. In the same example, this time the transaction isolation level is Read Committed, and transaction B does not commit the transaction, transaction A cannot read the data updated by transaction B, thus avoiding the generation of dirty data. However, after transaction B is committed, transaction A will find that the data has changed when executing the same query again. This is called non-repeatable read, which means that the results of executing the same query multiple times in the same transaction are inconsistent. At the same time, phantom reads still exist.
  3. Repeatable read: After a transaction reads a record for the first time, even if other transactions modify the value of the record and commit, when the transaction reads the record again, it still reads the value of the first read, rather than reading different data each time. This is repeatable read. This isolation level solves the problem of non-repeatability, but phantom reads may still occur.
  4. Serialization: This isolation level does not cause dirty reads or phantom reads because all operations on the same record are serial. However, this is not a concurrent transaction.

MySQL undo log

MVCC relies on Mysql's undo log at the bottom layer. The undo log records the operations of the database. Because the undo log is a logical log, it can be understood that when a record is deleted, the undo log will record a corresponding insert record. When a record is updated, the undo log will record an opposite update record. When the transaction fails and needs to be rolled back, it can be rolled back by reading the corresponding content in the undo log. MVCC makes use of the undo log.

Implementation principle of MVCC

The implementation of MVCC utilizes the implicit fields, undo log and ReadView of the database. First, let's look at the implicit fields. In fact, MySQL implicitly records the following hidden fields behind each row in the table: DB_TRX_ID (the ID of the most recently modified (modify/insert) transaction), DB_ROLL_PTR (rollback pointer, pointing to the previous version of this record), and DB_ROW_ID (auto-increment ID. If the data table does not have a primary key, the clustered index is created with this ID by default).

There are two types of undo logs: insert undo log, which is an undo log generated when a new record is inserted. It is only needed when a transaction is rolled back and can be discarded immediately after the transaction is committed. Update undo log, which is an undo log generated when a transaction is updated or deleted. It is not only needed when the transaction is rolled back, but also when snapshots are read. Therefore, it cannot be deleted casually. The corresponding log will be cleared uniformly by the purge thread only when the fast read or transaction rollback does not involve the log. MVCC uses update undo log.

In fact, the undo log records a version chain. Suppose there is a record in the database as follows:

Now there is a transaction A that modifies this record and changes the name to tom. The operation flow at this time is:

  • Transaction A first adds a row lock to the row record
  • Then copy the row record to the undo log as an old version
  • After copying, change the name of the row to tom, and then change the value of DB_TRX_ID of the row to the id of transaction A. At this time, assume that the id of transaction A is 1, and point the DB_POLL_PTR of the row to the record copied to the undo log.
  • After the transaction is committed, the lock is released

The situation at this point is as follows:

At this time, another transaction B modifies this record and changes age to 28. The operation flow at this time is:

  • Transaction B adds a row lock to the row record
  • The row record is copied to the undo log as an old version. If the undo log already has a record, a new undo log is inserted at the front of the undo log of the row record as the header of the linked list.
  • After copying, change the age of the row to 28, and then change the value of DB_TRX_ID of the row to the id of transaction B. At this time, assume that the id of transaction B is 2, and point the DB_POLL_PTR of the row to the record copied to the undo log.
  • Release the lock after the transaction is committed

The situation at this point is as follows:

From the above, we can see that modifications made to the same row of records by different transactions or the same transaction will cause the undo log of the row of records to form a version chain. The head of the undo log chain is the most recent old record, and the tail of the chain is the earliest old record.

Now let's assume a situation. Let's assume that neither transaction A nor transaction B has been committed. At this time, there is a transaction C that modifies the record named tom and changes age to 30. Then the transaction is committed. The id of transaction C is 3. Similarly, a record will be inserted into the undo log. At this time, the DB_TRX_ID of the first record in the undo log version chain is 3.

Now there is a transaction D, which queries the record with name tom. At this time, snapshot read will be enabled. The snapshot is a data snapshot triggered by the query operation at the beginning of the transaction. Unlocked read is snapshot read by default under the repeatable read isolation level. In contrast to snapshot read, there is also a current read. All update operations are current reads. A read view is generated during snapshot read. At the moment when the transaction executes snapshot read, a current snapshot of the database is generated to record and maintain the ID of the currently active transaction. Because the transaction ID is auto-incremental, the newer the transaction, the larger the ID. The read view follows the visibility algorithm, and whether it is visible requires some judgment. In addition to recording the currently active transaction ID, the read view also records the currently created maximum transaction ID. When reading a snapshot, it needs to be compared with the read view to obtain the visibility result.

Read view mainly compares the ID of the current transaction with the ID of the active transaction in the system. The comparison rules are as follows:

First, the Read view contains an array of transaction IDs that are active in the system at the time the Read view is generated, temporarily called id_list

Then the Read view will record the smallest transaction ID in the id_list, temporarily called low_id

Finally, the Read view will also record a transaction ID that has not been assigned in the system when the Read view is generated, which is the current largest transaction ID + 1, temporarily called high_id

  • If the current transaction ID is less than low_id, the current transaction is visible
  • If the current transaction ID is greater than high_id, the current transaction is not visible.
  • The current transaction is greater than low_id and less than high_id. Then determine whether it is in id_list. If it is, it means that the active transaction has not been committed yet. The current transaction is not visible, but it is visible to the active transaction itself. If it is not in id_list, the current transaction is visible

If the visibility result is invisible, you need to use DB_ROLL_PTR to get the DB_TRX_ID of the record from the undo log for comparison. By traversing the version chain until a DB_TRX_ID that meets specific conditions is found, the old record with this DB_TRX_ID is the latest old version that the current transaction can see.

The above is the detailed content of the detailed explanation of the MySQL MVCC mechanism principle. For more information about the MySQL MVCC mechanism principle, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Detailed explanation of MySQL transaction isolation level and MVCC
  • How is MySQL transaction isolation achieved?
  • In-depth understanding of Mysql transaction isolation level and locking mechanism issues
  • Solve the problem of MySql8.0 checking transaction isolation level error
  • Analysis of the underlying principle of MySQL multi-version concurrency control MVCC
  • Implementation of MySQL Multi-version Concurrency Control MVCC
  • Mysql MVCC multi-version concurrency control details
  • MYSQL transaction isolation level and MVCC

<<:  How to set up cross-domain access in IIS web.config

>>:  How to keep the content within the container when the flex layout is stretched by child elements

Recommend

Why TypeScript's Enum is problematic

Table of contents What happened? When to use Cont...

Example of pre-rendering method for Vue single page application

Table of contents Preface vue-cli 2.0 version vue...

How to effectively compress images using JS

Table of contents Preface Conversion relationship...

Understanding v-bind in vue

Table of contents 1. Analysis of key source code ...

Summary of various forms of applying CSS styles in web pages

1. Inline style, placed in <body></body&g...

MySQL password modification example detailed explanation

MySQL password modification example detailed expl...

CSS3 achieves infinite scrolling/carousel effect of list

Effect Preview Ideas Scroll the current list to t...

The easiest way to install MySQL 5.7.20 using yum in CentOS 7

The default database of CentOS7 is mariadb, but m...

How to split and merge multiple values ​​in a single field in MySQL

Multiple values ​​combined display Now we have th...

Use the CSS border-radius property to set the arc

Phenomenon: Change the div into a circle, ellipse...

How to configure mysql5.6 to support IPV6 connection in Linux environment

Introduction: This article mainly introduces how ...

Do you know how to use the flash wmode attribute in web pages?

When doing web development, you may encounter the...

Implementing a web player with JavaScript

Today I will share with you how to write a player...

JS implements a simple todoList (notepad) effect

The notepad program is implemented using the thre...

How to call a piece of HTML code together on multiple HTML pages

Method 1: Use script method: Create a common head...