Comprehensive analysis of optimistic locking, pessimistic locking and MVCC in MySQL

Comprehensive analysis of optimistic locking, pessimistic locking and MVCC in MySQL

Preface

In the actual use of the database, we often encounter situations where we do not want data to be written or read at the same time. For example, in a flash sale scenario, two requests simultaneously read that the system still has 1 item in stock, and then update the stock to 0 one after another. This will cause an oversold situation, and the actual inventory of the goods will not correspond to our records.

In order to solve problems such as data inconsistency caused by resource competition, we need a mechanism to ensure correct access and modification of data. In the database, this mechanism is database concurrency control. Among them, optimistic concurrency control, pessimistic concurrency control and multi-version concurrency control are the main technical means used for database concurrency control.

Pessimistic concurrency control

nature

Wikipedia: In relational database management systems, pessimistic concurrency control (also known as "pessimistic locking", abbreviated as "PCC") is a method of concurrency control. It prevents a transaction from modifying data in a way that affects other users. If a transaction performs an operation that reads a row of data and applies a lock, other transactions can only perform operations that conflict with the lock when the transaction releases the lock.

In fact, the pessimistic lock we often talk about is not an actual lock, but a concurrency control idea. Pessimistic concurrency control is pessimistic about data modification, and believes that conflicts will inevitably arise when data is accessed by the outside world. Therefore, locking is used in the process of data processing to ensure exclusive use of resources.

The database lock mechanism is actually implemented based on the viewpoint of pessimistic concurrency control, and according to actual usage, the database lock can be divided into many categories. For details, please see my later article.

Implementation

The locking process of the database pessimistic lock is as follows:

After starting a transaction, apply for a certain type of lock to the data that needs to be locked according to the operation type: for example, a shared row lock. If the lock is successful, continue with the subsequent operations. If the data has been locked by other locks and it conflicts with the lock to be added now, the lock will fail (for example, an exclusive lock has been added). At this time, you need to wait for other locks to be released (deadlock may occur)
Release the lock after completing the transaction

Pros and Cons

advantage:

Pessimistic concurrency control adopts a conservative strategy: "Get the lock first, and then access the data if it succeeds." This ensures that data acquisition and modification are carried out in an orderly manner, so it is suitable for use in an environment with more writes and fewer reads. Of course, using pessimistic locking cannot maintain very high performance, but under the premise that optimistic locking cannot provide better performance, pessimistic locking can ensure data security.

shortcoming:
Since locking is required and there may be lock conflicts or even deadlocks, pessimistic concurrency control increases system overhead, reduces system efficiency, and also reduces system parallelism.

Optimistic concurrency control

nature

Wikipedia: In relational database management systems, optimistic concurrency control (also known as "optimistic locking", Optimistic Concurrency Control, abbreviated as "OCC") is a concurrency control method. It assumes that multi-user concurrent transactions will not affect each other during processing, and each transaction can process the part of the data affected by it without generating locks.
Optimistic concurrency control is optimistic about data modification, believing that even in a concurrent environment, external operations on data generally will not cause conflicts, so it will not lock the data. Instead, before submitting data updates, each transaction will first check whether other transactions have modified the data after the transaction reads the data. If other transactions have updated the transaction, conflict information will be returned, allowing the user to decide how to proceed, such as retrying or rolling back.

It can be seen that optimistic locking is not actually a lock, and does not even use locks to implement concurrency control. Instead, it uses other methods to determine whether data can be modified. Optimistic locking is generally a locking mechanism implemented by the user. Although no actual lock is used, it can produce a locking effect.

Implementation

CAS (Compare and swap) is a well-known lock-free algorithm. Lock-free programming means realizing variable synchronization between multiple threads without using locks, that is, realizing variable synchronization without threads being blocked, so it is also called non-blocking synchronization. The scheme for implementing non-blocking synchronization is called a "lock-free programming algorithm".

Optimistic locking is basically implemented based on the CAS (Compare and swap) algorithm. Let's first look at the CAS process. The process of a CAS operation can be represented by the following C code:

int cas(long *addr, long old, long new)
{
 /* Executes atomically. */
 if(*addr != old)
  return 0;
 *addr = new;
 return 1;
}

CAS has three operands, the memory value V, the old expected value A, and the new value B to be modified. If and only if the expected value A and the memory value V are the same, change the memory value V to B, otherwise do nothing. The entire CAS operation is an atomic operation and is indivisible.

The implementation of optimistic locking is similar to the above process, mainly in the following ways:

  • Version number tag: Add a new field in the table: version, which is used to save the version number. When getting data, get the version number at the same time, and then use the following command to update the data: update xxx set version=version+1,… where … version="old version" and …. At this time, whether the update is successful is determined by judging whether the number of affected rows in the returned result is 0. If the update fails, it means that other requests have already updated the data.
  • Timestamp: Same as version number, but determined by timestamp. Generally speaking, many data tables have an update time field. By using this field to judge, there is no need to add a new field.
  • Fields to be updated: If there is no timestamp field and you do not want to add new fields, you can consider using the fields to be updated to make a judgment. Because updated data generally changes, you can compare the old value of the field to be updated with the current value of the database before updating. If there is no change, update it.
  • All field marks: All fields in the data table are used for judgment. This is equivalent to locking not only a few fields, but also the entire data row. As long as the data in this row changes, it will not be updated.

Pros and Cons

advantage:

Optimistic concurrency control does not actually lock, so there is no additional overhead, and deadlock problems are not likely to occur. It is suitable for concurrent scenarios with more reads and fewer writes. Because there is no additional overhead, it can greatly improve the performance of the database.

shortcoming:
Optimistic concurrency control is not suitable for concurrent scenarios with more writes than reads, because there will be many write conflicts, resulting in multiple waits and retries for data writing. In this case, its overhead is actually higher than that of pessimistic locking. Moreover, the business logic of optimistic locking is more complicated than that of pessimistic locking. The business logic must take into account failures and waiting for retries, and it cannot avoid direct modifications to the database by other third-party systems.

Multi-version concurrency control

nature

Wikipedia: Multiversion concurrency control (MCC or MVCC) is a type of concurrency control commonly used in database management systems and is also used in programming languages ​​to implement transactional memory.

Both optimistic concurrency control and pessimistic concurrency control ensure the serializability of transactions by resolving competition conditions between transactions by delaying or terminating corresponding transactions. Although the previous two concurrency control mechanisms can indeed fundamentally solve the problem of serializability of concurrent transactions, they are actually solving the problem of write conflicts. The difference between the two lies in the different degrees of optimism about write conflicts (pessimistic locks can also solve read-write conflicts, but the performance is average). In actual use, database read requests are many times more than write requests. If we can solve the problem of read-write concurrency, we can greatly improve the database read performance, and this is what multi-version concurrency control can do.

Unlike pessimistic concurrency control and optimistic concurrency control, MVCC is designed to solve the problem of multiple, long-term read operations starving write operations caused by read-write locks, that is, to solve the problem of read-write conflicts. MVCC can be used in conjunction with any of the previous two mechanisms to improve the read performance of the database.

Pessimistic locking of databases is based on the consideration of improving concurrent performance, and generally implements multi-version concurrency control at the same time. Not only MySQL, but other database systems including Oracle, PostgreSQL, etc. have also implemented MVCC, but their implementation mechanisms are different because MVCC does not have a unified implementation standard.

In general, the emergence of MVCC is a solution proposed by the database because it is dissatisfied with using pessimistic locking to solve the read-write conflict problem due to low performance.

Implementation

MVCC is implemented by saving a snapshot of the data at a certain point in time. The data items read by each transaction are a historical snapshot, which is called a snapshot read. Unlike the current read, the data read by the snapshot read may not be the latest, but snapshot isolation ensures that the data seen by the entire transaction is the data status when it starts. A write operation does not overwrite an existing data item, but creates a new version that becomes visible only when the transaction is committed.

Current read and snapshot read

What are current read and snapshot read in MySQL InnoDB?

Current read operations such as select lock in share mode (shared lock), select for update; update, insert, delete (exclusive lock) are all current reads. Why are they called current reads? That is, it reads the latest version of the record. When reading, it must ensure that other concurrent transactions cannot modify the current record, and the read record will be locked.

Snapshot reads are like select operations without locking, that is, non-blocking reads without locking. The premise of snapshot reads is that the isolation level is not uncommitted reads or serialization levels, because uncommitted reads always read the latest data rows, rather than the data rows that conform to the current transaction version. Serialization locks all rows read.

Pros and Cons

MVCC allows most read operations without locking. This design makes data reading operations simple, has good performance, and ensures that only rows that meet the criteria are read. The disadvantage is that each row of records requires additional storage space, more row checking work, and some additional maintenance work.

Applicable scenarios

Pessimistic Lock

The locking concurrency control used to resolve read-write conflicts and write-write conflicts is suitable for situations where there are more writes than reads and serious write conflicts. Because pessimistic locks are locked when reading data, scenarios with more reads will require frequent locking and a lot of waiting time. In the case of serious write conflicts, using pessimistic locks can ensure data consistency. High data consistency requirements can solve problems such as dirty reads, phantom reads, non-repeatable reads, first-class update loss, and second-class update loss.

Optimistic Locking

Lock-free concurrency control to resolve write-write conflicts is suitable for situations where there are more reads than writes, because if there are a large number of write operations, the possibility of write conflicts will increase, and the business layer needs to retry continuously, which will greatly reduce system performance. Data consistency requirements are not high, but very high response speed is required. It cannot solve dirty reads, phantom reads, and non-repeatable reads, but it can solve the update loss problem.

MVCC

Lock-free concurrency control that resolves read-write conflicts is combined with the above two to improve their read performance

The above is the detailed content of the comprehensive analysis of optimistic locking, pessimistic locking and MVCC in MySQL. For more information about optimistic locking, pessimistic locking and MVCC in MySQL, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Introduction to MySQL isolation level, lock and MVCC
  • Summary of the differences between MySQL storage engines MyISAM and InnoDB
  • Setting the engine MyISAM/InnoDB when creating a data table in MySQL
  • Detailed explanation of memory management of MySQL InnoDB storage engine
  • Detailed explanation of various locks in the InnoDB storage engine in MySQL
  • Detailed explanation of the data page structure of MySQL's InnoDB storage engine
  • MySQL storage engines InnoDB and MyISAM
  • MYSQL database Innodb engine mvcc lock implementation principle

<<:  css input[type=file] style beautification (input upload file style)

>>:  Vue implements the drag and drop sorting function of the page div box

Recommend

The 6 Most Effective Ways to Write HTML and CSS

This article shares the 6 most effective methods,...

Solution to MySQLSyntaxErrorException when connecting to MySQL using bitronix

Solution to MySQLSyntaxErrorException when connec...

Detailed steps for installing rockerChat in docker and setting up a chat room

Comprehensive Documentation github address https:...

A QQ chat room based on vue.js

Table of contents Introduction The following is a...

Implementation of fastdfs+nginx cluster construction

1. Introduction to fastdfs 1. What is fastdfs Fas...

Some tips for using less in Vue projects

Table of contents Preface 1. Style penetration 1....

Vendor Prefix: Why do we need a browser engine prefix?

What is the Vendor Prefix? Vendor prefix—Browser ...

How to add conditional expressions to aggregate functions in MySql

MySQL filtering timing of where conditions and ha...

Using js to achieve waterfall effect

This article example shares the specific code of ...

Summary of MySQL injection bypass filtering techniques

First, let’s look at the GIF operation: Case 1: S...

Linux yum package management method

Introduction yum (Yellow dog Updater, Modified) i...

Robots.txt detailed introduction

Robots.txt is a plain text file in which website ...

Docker container connection implementation steps analysis

Generally speaking, after the container is starte...

Solution to the routing highlighting problem of Vue components

Preface Before, I used cache to highlight the rou...

Div css naming standards css class naming rules (in line with SEO standards)

There are many tasks to be done in search engine o...