In-depth explanation of MySQL common index and unique index

In-depth explanation of MySQL common index and unique index

Scenario

1. Maintain a citizen system with a field for ID number

2. The business code can ensure that no two duplicate ID numbers are written (if the business cannot guarantee this, you can rely on the unique index of the database to constrain it)

3. Common SQL query statements: SELECT name FROM CUser WHERE id_card = 'XXX'

4. Create an index

  • The ID number is relatively large, so it is not recommended to set it as the primary key
  • From a performance perspective, should you choose a normal index or a unique index?

Assume that the values ​​in field k are not repeated


Query process

1. Query statement: SELECT id FROM T WHERE k=5

2. Query process

  • Starting from the root of the B+ tree, search by layer to the leaf node, which is the data page in the lower right corner of the above figure.
  • Use binary search to locate specific records within the data page

3. For common indexes

  • Find the first record that meets the condition (5,500), then find the next record until the first record that does not meet k=5 is found

4. For unique index

  • Since the index defines uniqueness, the search stops after the first record that meets the conditions is found.

Performance Differences

1. Performance difference: very small

2. InnoDB data is read and written in units of data pages, the default is 16KB

3. When a record needs to be read, the record itself is not read from the disk, but read in units of data pages.

4. When the record with k=5 is found, the data page where it is located is already in memory

5. For ordinary indexes, only one more pointer search and one more calculation are required – CPU consumption is very low

  • If the record k=5 happens to be the last record of the data page, then if you want to get the next record, you need to read the next data page.
  • Very low probability: For an integer field index, a data page (16KB, compact format) can store approximately 745 values

change buffer

1. When a data page needs to be updated, if the data page is in memory, it is updated directly

2. If the data page is not in memory, without affecting data consistency

  • InnoDB caches these update operations in the change buffer
  • There is no need to read this data page from disk (random read)
  • The next time a query needs to access this data page, the data page is read into memory and the operations related to this data page in the change buffer are performed (merge)

3. Change buffer is persistent data, which has a copy in memory and is also written to disk

4. Record the update operation in the channge buffer first to reduce random disk reads and improve the execution speed of the statement

5. In addition, reading data pages into memory requires occupying the buffer pool. Using channge buffer can avoid occupying memory and improve memory utilization

6. The change buffer uses the memory in the buffer pool and cannot be increased indefinitely. The control parameter innodb_change_buffer_max_size

# Default is 25, maximum is 50
mysql> SHOW VARIABLES LIKE '%innodb_change_buffer_max_size%';
+-------------------------------+-------+
| Variable_name | Value |
+-------------------------------+-------+
| innodb_change_buffer_max_size | 25 |
+-------------------------------+-------+

merge

1. merge: apply the operations in the change buffer to the original data page

2. The execution process of merge

  • Read data pages from disk into memory (old version data pages)
  • Find the change buffer record for this data page from the change buffer (possibly multiple)
    Then execute them one by one to get the new version of the data page
  • Write redolog, including: table update of data page + change buffer change

3. After the merge is executed, the data page in the memory and the disk page corresponding to the change buffer have not been modified and are dirty pages.

  • Through other mechanisms, dirty pages are flushed to the corresponding physical disk pages.

4. Trigger timing

  • Visit this data page
  • The system background thread merges periodically
  • The database is shut down normally

Conditions of Use

1. For a unique index, all update operations must first determine whether the operation violates the uniqueness constraint.

2. The update of unique index cannot use change buffer, only common index can use change buffer

  • The primary key cannot use the change buffer either.
  • For example, to insert (4,400), you must first determine whether there is a record with k=4 in the table. The premise of this determination is to read the data page into memory.
  • Since the data page has been read into the memory, it is enough to directly update the data page in the memory without writing the change buffer.

Usage scenarios

1. The more changes a data page has recorded in the change buffer before it is merged, the greater the benefit.

2. For businesses with more writes than reads, the probability of a page being accessed immediately after it is written is extremely low, so the use of change buffer is most effective at this time.

  • For example, billing and logging systems

3. If the update mode of a business is: query will be performed immediately after writing

  • Although the update operation is recorded in the change buffer, the data page will be read from the disk immediately after the query, triggering the merge process.
  • Instead of reducing random reads, it increases the cost of maintaining the change buffer.

Update Process

Insert(4,400)

The target page is in memory

  • For the unique index, find the position between 3 and 5, determine if there is no conflict, and insert this value
  • For a normal index, find the position between 3 and 5 and insert this value
  • Performance Difference: Minimal

The target page is not in memory

1. For a unique index, you need to read the data page into memory, determine if there is any conflict, and insert the value

  • Random disk reads are very expensive

For common indexes, just record the update operation in the change buffer.

  • Reduced random disk reads, significantly improved performance

Index selection

1. There is not much difference in query performance between common index and unique index. The main consideration is update performance. It is recommended to choose common index.

2. Scenarios where it is recommended to close the change buffer

  • If all updates are followed by queries for this record
  • Control parameter innodb_change_buffering
mysql> SHOW VARIABLES LIKE '%innodb_change_buffering%';
+-------------------------+-------+
| Variable_name | Value |
+-------------------------+-------+
| innodb_change_buffering | all |
+-------------------------+-------+

# Valid Values ​​(>= 5.5.4)
none / inserts / deletes / changes / purges / all

# Valid Values ​​(<= 5.5.3)
none / inserts

# The predecessor of the change buffer is the insert buffer, which can only optimize insert operations

change buffer + redolog

Update Process

Current state of the k-tree: After finding the corresponding location, the data page Page 1 where k1 is located is in memory, and the data page Page 2 where k2 is located is not in memory

INSERT INTO t(id,k) VALUES (id1,k1),(id2,k2); 


# Memory: buffer pool
# redolog:ib_logfileX
# Data table space: t.ibd
# System tablespace: ibdata1

1. Page 1 is in memory, update the memory directly

2. Page 2 is not in memory, record in the changer buffer: add (id2,k2) to Page 2

3. The above two actions are included in the redolog (disk sequential write)

4. The transaction is now complete and the cost of executing the update statement is very low

  • Write twice to memory + once to disk

5. When a transaction is committed, the change buffer operation records are also recorded in the redolog

  • Therefore, the change buffer can be restored during crash recovery.

The dotted line is the background operation and does not affect the response time of the update operation

Reading process

Assumption: The read statement occurs shortly after the update statement, the data in memory is still there, and it has nothing to do with the system tablespace (ibdata1) and redolog (ib_logfileX).

SELECT * FROM t WHERE k IN (k1,k2); 


1. Read Page 1 and return it directly from memory (at this time, Page 1 may still be a dirty page and has not actually been written to disk)

2. Read Page 2, read the data page into memory through random disk read, and then apply the operation log in the change buffer (merge)

  • Generate a correct version and return

Improve update performance

1. redolog: saves random disk write IO consumption (sequential write)

2. Change buffer: saves IO consumption of random disk reads

References

"MySQL Practice 45 Lectures"

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM.

You may also be interested in:
  • Detailed Analysis of the Selection of MySQL Common Index and Unique Index
  • Which one should I choose between MySQL unique index and normal index?
  • Efficiency comparison between common index and unique index in MySQL
  • Detailed explanation of the difference between MySQL normal index and unique index

<<:  Comparison of two implementation methods of Vue drop-down list

>>:  Detailed tutorial on installing Docker and nvidia-docker on Ubuntu 16.04

Recommend

Example of how to reference environment variables in Docker Compose

In a project, you often need to use environment v...

Detailed explanation of the use of CSS3 rgb and rgba (transparent color)

I believe everyone is very sensitive to colors. C...

My CSS framework - base.css (reset browser default style)

Copy code The code is as follows: @charset "...

Tutorial on upgrading from Centos7 to Centos8 (with pictures and text)

If you upgrade in a formal environment, please ba...

CSS injection knowledge summary

Modern browsers no longer allow JavaScript to be ...

How to develop uniapp using vscode

Because I have always used vscode to develop fron...

Docker overlay realizes container intercommunication across hosts

Table of contents 1. Docker configuration 2. Crea...

Detailed analysis of MySQL instance crash cases

[Problem description] Our production environment ...

How to use CSS counters to beautify ordered lists of numbers

In web design, it is very important to use an org...

Detailed explanation of how to install PHP curl extension under Linux

This article describes how to install the PHP cur...

Detailed explanation of the this pointing problem in JavaScript

Summarize Global environment ➡️ window Normal fun...

How to implement Mysql scheduled tasks under Linux

Assumption: The stored procedure is executed ever...

Summary of the unknown usage of "!" in Linux

Preface In fact, the humble "!" has man...