Detailed explanation of multi-version concurrency control of large objects in MySQL

MySQL 8.0: MVCC for Large Objects in InnoDB

In this article, I will explain the Multi-Version Concurrency Control (MVCC) design for Large Objects (LOBs) in MySQL InnoDB storage engine. MySQL 8.0 has a new feature that allows users to partially update large objects, including JSON documents. With this partial update feature, the way MVCC works for LOBs has changed when a LOB is partially updated. For normal updates (full updates), MVCC will work just like previous versions. Let's look at how MVCC works when partial updates are not involved, and then consider the use case of partial updates to LOBs.

MVCC General Updates

I use the term regular updates to refer to updates that are not partial updates. I will explain how MVCC can be used to routinely update large objects with an example. I will use the following mtr(1) test case for this purpose:

 create table t1 ( f1 int primary key , f2 longblob ) engine = innodb ; 
 insert into t1 values ( 1 , repeat ( 'a' , 65536 ) ) ; 
 
 start transaction; 
 update t1 set f2 = repeat ( 'b' , 65536 ) where f1 = 1 ; 
 
 -- echo # Connection con1: 
 -- For users who use the MySQL client, you may need to establish a new link by opening another terminal window, the same below.
 connect ( con1 , localhost , root , , ) ; 
 -- echo # Must see the old value 'aaaaaaaaaaa' 
 select f1 , right ( f2 , 10 ) from t1 order by f1 ; 
 
 -- echo # Connection default: 
 connection default; 
 disconnect con1; 
 commit ; 
 
 drop table t1 ;

In order to understand the following explanation, it is very important to understand the above test case carefully.

The test scenario is as follows:

Initially, table t1 contains a single record (R1).
Transaction trx1 updates the record to the new value.
While trx1 is still active, another transaction trx2 is reading the records. It will read the old value.

Table t1 contains only one record (R1). But trx1 and trx2 will see two different values. The table actually contains only the latest values (the values seen by trx1), while the values or records seen by trx2 are obtained from the undo log records. Let's look at the picture below to understand it better.

Initial state: before update operation

The following figure shows the situation before the update operation. The undo log is empty. The clustered index of a table contains one row. There is a LOB in the table. The clustered index record contains a reference to the LOB.

Final state: After the update operation

Now let's look at what happens after the update operation.

Here are some important observations:

There are two LOBs in the user tablespace - old LOB and new LOB. Old LOBs are only accessible through undo logs. The clustered index record points to the new LOB.
The update operation has created an undo log record containing the update vector. This undo log record points to the old LOB.
Clustered index records point to undo log records through the DB_ROLL_PTR system column. This roll pointer points to an undo log record that can be used to build a previous version of the clustered index record.
Undo records do not contain the LOB itself. Instead it contains only references to LOBs stored in the user tablespace.
The LOB references stored in undo log records are different from the LOB references stored in clustered index records.

The steps taken by the transaction in connection 1 are as follows:

The transaction looks at R1 and determines that no transaction that modified the clustered index record has been committed. This means that it cannot read that record (because the default isolation level is REPEATABLE READ).
It looks at the DB_ROLL_PTR in R1 and finds the undo log record. Build the previous version of R1 using undo logging.
It reads the older R1 for this build. Note that this version is not available in clustered index records. But it is built on the fly using undo records.
While R1 points to the new LOB, the old version of R1 constructed points to the old LOB. So the result contains the old LOB.

This is how MVCC for LOBs works when no partial updates are involved.

MVCC Partial Update

Let's look at another example to understand how MVCC works in the case of partial updates. We need another example because partial updating of JSON documents is currently only supported through the functions json_set() and json_replace().

 create table t2 ( f1 int primary key , j json ) engine = InnoDB ; 
 set @ elem_a = concat ( '"' , repeat ( 'a' , 200 ) , '"' ) ; 
 set @ elem_a_with_coma = concat ( @ elem_a , ',' ) ; 
 set @json_doc = concat ( "[" , repeat ( @elem_a_with_coma , 300 ) , @elem_a , "]" ) ; 
 
 insert into t2 ( f1 , j ) values ( 1 , @ json_doc ) ; 
 
 start transaction; 
 update t2 set j = json_set ( j , '$[200]' , repeat ( 'b' , 200 ) ) where f1 = 1 ; 
 
 -- echo # Connection con1: 
 connect ( con1 , localhost , root , , ) ; 
 -- echo # Must see the old value 'aaaaaaaaaaa...' 
 select json_extract ( j , '$[200]' ) from t2 ; 
 
 -- echo # Connection default: 
 connection default; 
 disconnect con1; 
 commit ;

The scenario is the same as the previous example. Just the longblob field has been changed to a JSON document. The data loaded is also slightly different to conform to the JSON format.

Tip: You can add the statement set debug = '+d, innodb_lob_print' to the above mtr test cases (in both) to print the LOB indexes in the server log file. The LOB index will be printed immediately after insertion. The LOB index will give you the structure of the stored LOB object.
Before a partial update operation

The initial conditions before the full or partial update operation are the same and have been given above. But in the figure below, some additional information is provided.

Let's look at some other information shown in the graph:

The LOB reference stored in the clustered index record now contains the LOB version number v1. During the initial insert operation, it is set to 1 and incremented on each partial update.
Each LOB data page has an entry in the LOB index. Each entry contains LOB version information. Whenever a LOB data page is modified, it will be copied to a new LOB data page with the new data, and a new LOB index entry will be created with an incremented LOB version number.

Additional information is the LOB version number. This is available in the LOB reference in the clustered index record and in each entry of the LOB index.

After a partial update operation

The following diagram illustrates the situation after a partial update operation.

The most important optimization here is that there is still only one LOB in the user tablespace. Update only those LOB data pages that need to be modified. This single LOB after the partial update operation contains both the old and new versions of the LOB. This is illustrated by the v1 and v2 labels on the LOB data page in the figure.

Another important observation is that the LOB references in the undo log and clustered index records point to the same LOB. But the LOB reference contains a different version number. The LOB reference in the undo log record contains v1 (the old version number), and the LOB reference in the clustered index record contains the new version number v2.

Purpose of LOB version numbers

As shown above, different LOB references with different version numbers point to the same LOB. A single LOB contains parts from different versions. The LOB version number is used to get the correct version that the various LOB references point to. In this section, we will see how this is done.

A LOB index contains a list of LOB pages that make up a LOB. It contains the page number of the LOB data page, the amount of data contained in each LOB data page, and a version number. Each node of this list is called a LOB index entry. Each LOB index entry contains a list of old versions. Let's look at a diagram illustrating the structure of the above partial update test case.

Initially, before the partial update is done, the LOB index contains a total of 4 entries. The page numbers of the four entries are 5, 6, 7, and 8. No LOB index entry has an old version. All four entries have a version number of 1.

After some of the updates were completed, we noticed that page 9 had replaced page 7, and page 7 was now considered an older version of page 9. Page number 9 has version number 2, and page number 7 has version number 1.

After the partial update is complete, when the LOB is accessed through a LOB reference with version number 1, the first index entry on page 5 is looked at. It has a version number of 1. If the version number in the index entry is less than or equal to the version number in the LOB reference, that entry will be read. Therefore, page 5 will be read. Then the index entry for page 6 will be viewed. It has version number 1, so that will be read. Then the index entry for page number 9 will be viewed. It has version number 2. But the lob reference has version number 1. If the version number in the index entry is greater than the version number in the LOB reference, the entry will not be read. Since the entry for page number 9 has version 2, its older version will be viewed. The index entry with page number 7 will be checked. It has version number 1, so that will be read. After this, the index entry with page number 8 is checked. It has version number 1, so it will also be read. This is how you access legacy LOBs.

After the partial update is complete, when the LOB is accessed through the LOB reference with version number 2, the first index entry on page 5 is looked at. It has a version number of 1. If the version number in the index entry is less than or equal to the version number in the LOB reference, that entry will be read. So it will read page numbers 5,6,9,8 in sequence. Since the version number is always <= 2, there is no need to use an older version to access page number 7.

One thing to remember is that LOBs do not exist independently in InnoDB. It is considered an extension of the clustered index record. Whether a LOB is visible to a transaction is not handled by the LOB module. The LOB module only processes clustered index records. If a transaction accesses a LOB, it means that it has determined with the help of the DB_TRX_ID in the clustered index record that it can look at the LOB (not a specific version of the LOB). So we don't worry about that aspect in the LOB module. We focus only on providing the correct content for a given LOB version number.

in conclusion

In this article, we saw how MVCC is done in InnoDB for large objects. When a LOB is partially updated, multiple LOB references can point to the same LOB. But they will have different version numbers. Using these LOB version numbers, the correct LOB content can be accessed.

Hope you found this information useful.

Thank you for using MySQL!

Notes:

(1) Mtr is the abbreviation of Mini-transaction, which literally means small thing. Compared with logical things, we call it physical things. Belongs to the underlying module of the Innodb storage engine. Mainly used for locking and logging information.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM.

You may also be interested in:

Detailed explanation of MySQL multi-version concurrency control mechanism (MVCC) source code
Implementation of MySQL's MVCC multi-version concurrency control
MySQL concurrency control principle knowledge points
Implementation of MySQL multi-version concurrency control MVCC
Detailed explanation of the four transaction isolation levels in MySQL
Detailed explanation and comparison of the four transaction isolation levels in MySQL
In-depth analysis of MySQL transaction isolation and its impact on performance
Tutorial on the relationship between Innodb transaction isolation level and lock in MySQL
Introduction to MySQL database transaction isolation level (Transaction Isolation Level)
MySQL Series 10 MySQL Transaction Isolation to Implement Concurrency Control

<<: Dockerfile echo specifies the method of implementing multiple lines of text in the specified file

>>: Problems encountered when uploading images using axios in Vue

Detailed tutorial on installing Mysql5.7.19 on Centos7 under Linux

Detailed explanation of multi-version concurrency control of large objects in MySQL

Detailed tutorial on installing Mysql5.7.19 on Centos7 under Linux

Detailed explanation of the solution to the error in creating a user and granting permissions in mysql8.0

TypeScript learning notes: type narrowing

Solution for VMware Workstation Pro not running on Windows

JavaScript to imitate the registration and login function of Xiaomi official website

CSS3 realizes various graphic effects of small arrows

Web Design Tutorial (7): Improving Web Design Efficiency

How to backup MySQL regularly and upload it to Qiniu

Bugs encountered when using mybatis-generator with mysql8.0.3 in IDEA

Detailed explanation of Vue3.0 + TypeScript + Vite first experience

Recommend

JavaScript implements asynchronous submission of form data

The reason why MySQL uses B+ tree as its underlying data structure

The simplest MySQL data backup and restore tutorial in history (Part 2) (Part 36)

Vue implements infinite loading waterfall flow

MySQL table and column comments summary

15-minute parallel artifact GNU Parallel Getting Started Guide

Implementation code for partial refresh of HTML page

Linux command line quick tips: How to locate a file

Detailed explanation of how to customize the style of CSS scroll bars

Detailed explanation of MySQL 8.0 password expiration policy

Connector configuration in Tomcat

MySQL InnoDB MRR Optimization Guide

Detailed explanation of compatibility issues and solutions for setting max-width and min-width in table

Implement 24+ array methods in JavaScript by hand

A brief analysis of the matching priority of Nginx configuration location