In-depth explanation of MySQL common index and unique index

In-depth explanation of MySQL common index and unique index

Scenario

1. Maintain a citizen system with a field for ID number

2. The business code can ensure that no two duplicate ID numbers are written (if the business cannot guarantee this, you can rely on the unique index of the database to constrain it)

3. Common SQL query statements: SELECT name FROM CUser WHERE id_card = 'XXX'

4. Create an index

  • The ID number is relatively large, so it is not recommended to set it as the primary key
  • From a performance perspective, should you choose a normal index or a unique index?

Assume that the values ​​in field k are not repeated


Query process

1. Query statement: SELECT id FROM T WHERE k=5

2. Query process

  • Starting from the root of the B+ tree, search by layer to the leaf node, which is the data page in the lower right corner of the above figure.
  • Use binary search to locate specific records within the data page

3. For common indexes

  • Find the first record that meets the condition (5,500), then find the next record until the first record that does not meet k=5 is found

4. For unique index

  • Since the index defines uniqueness, the search stops after the first record that meets the conditions is found.

Performance Differences

1. Performance difference: very small

2. InnoDB data is read and written in units of data pages, the default is 16KB

3. When a record needs to be read, the record itself is not read from the disk, but read in units of data pages.

4. When the record with k=5 is found, the data page where it is located is already in memory

5. For ordinary indexes, only one more pointer search and one more calculation are required – CPU consumption is very low

  • If the record k=5 happens to be the last record of the data page, then if you want to get the next record, you need to read the next data page.
  • Very low probability: For an integer field index, a data page (16KB, compact format) can store approximately 745 values

change buffer

1. When a data page needs to be updated, if the data page is in memory, it is updated directly

2. If the data page is not in memory, without affecting data consistency

  • InnoDB caches these update operations in the change buffer
  • There is no need to read this data page from disk (random read)
  • The next time a query needs to access this data page, the data page is read into memory and the operations related to this data page in the change buffer are performed (merge)

3. Change buffer is persistent data, which has a copy in memory and is also written to disk

4. Record the update operation in the channge buffer first to reduce random disk reads and improve the execution speed of the statement

5. In addition, reading data pages into memory requires occupying the buffer pool. Using channge buffer can avoid occupying memory and improve memory utilization

6. The change buffer uses the memory in the buffer pool and cannot be increased indefinitely. The control parameter innodb_change_buffer_max_size

# Default is 25, maximum is 50
mysql> SHOW VARIABLES LIKE '%innodb_change_buffer_max_size%';
+-------------------------------+-------+
| Variable_name | Value |
+-------------------------------+-------+
| innodb_change_buffer_max_size | 25 |
+-------------------------------+-------+

merge

1. merge: apply the operations in the change buffer to the original data page

2. The execution process of merge

  • Read data pages from disk into memory (old version data pages)
  • Find the change buffer record for this data page from the change buffer (possibly multiple)
    Then execute them one by one to get the new version of the data page
  • Write redolog, including: table update of data page + change buffer change

3. After the merge is executed, the data page in the memory and the disk page corresponding to the change buffer have not been modified and are dirty pages.

  • Through other mechanisms, dirty pages are flushed to the corresponding physical disk pages.

4. Trigger timing

  • Visit this data page
  • The system background thread merges periodically
  • The database is shut down normally

Conditions of Use

1. For a unique index, all update operations must first determine whether the operation violates the uniqueness constraint.

2. The update of unique index cannot use change buffer, only common index can use change buffer

  • The primary key cannot use the change buffer either.
  • For example, to insert (4,400), you must first determine whether there is a record with k=4 in the table. The premise of this determination is to read the data page into memory.
  • Since the data page has been read into the memory, it is enough to directly update the data page in the memory without writing the change buffer.

Usage scenarios

1. The more changes a data page has recorded in the change buffer before it is merged, the greater the benefit.

2. For businesses with more writes than reads, the probability of a page being accessed immediately after it is written is extremely low, so the use of change buffer is most effective at this time.

  • For example, billing and logging systems

3. If the update mode of a business is: query will be performed immediately after writing

  • Although the update operation is recorded in the change buffer, the data page will be read from the disk immediately after the query, triggering the merge process.
  • Instead of reducing random reads, it increases the cost of maintaining the change buffer.

Update Process

Insert(4,400)

The target page is in memory

  • For the unique index, find the position between 3 and 5, determine if there is no conflict, and insert this value
  • For a normal index, find the position between 3 and 5 and insert this value
  • Performance Difference: Minimal

The target page is not in memory

1. For a unique index, you need to read the data page into memory, determine if there is any conflict, and insert the value

  • Random disk reads are very expensive

For common indexes, just record the update operation in the change buffer.

  • Reduced random disk reads, significantly improved performance

Index selection

1. There is not much difference in query performance between common index and unique index. The main consideration is update performance. It is recommended to choose common index.

2. Scenarios where it is recommended to close the change buffer

  • If all updates are followed by queries for this record
  • Control parameter innodb_change_buffering
mysql> SHOW VARIABLES LIKE '%innodb_change_buffering%';
+-------------------------+-------+
| Variable_name | Value |
+-------------------------+-------+
| innodb_change_buffering | all |
+-------------------------+-------+

# Valid Values ​​(>= 5.5.4)
none / inserts / deletes / changes / purges / all

# Valid Values ​​(<= 5.5.3)
none / inserts

# The predecessor of the change buffer is the insert buffer, which can only optimize insert operations

change buffer + redolog

Update Process

Current state of the k-tree: After finding the corresponding location, the data page Page 1 where k1 is located is in memory, and the data page Page 2 where k2 is located is not in memory

INSERT INTO t(id,k) VALUES (id1,k1),(id2,k2); 


# Memory: buffer pool
# redolog:ib_logfileX
# Data table space: t.ibd
# System tablespace: ibdata1

1. Page 1 is in memory, update the memory directly

2. Page 2 is not in memory, record in the changer buffer: add (id2,k2) to Page 2

3. The above two actions are included in the redolog (disk sequential write)

4. The transaction is now complete and the cost of executing the update statement is very low

  • Write twice to memory + once to disk

5. When a transaction is committed, the change buffer operation records are also recorded in the redolog

  • Therefore, the change buffer can be restored during crash recovery.

The dotted line is the background operation and does not affect the response time of the update operation

Reading process

Assumption: The read statement occurs shortly after the update statement, the data in memory is still there, and it has nothing to do with the system tablespace (ibdata1) and redolog (ib_logfileX).

SELECT * FROM t WHERE k IN (k1,k2); 


1. Read Page 1 and return it directly from memory (at this time, Page 1 may still be a dirty page and has not actually been written to disk)

2. Read Page 2, read the data page into memory through random disk read, and then apply the operation log in the change buffer (merge)

  • Generate a correct version and return

Improve update performance

1. redolog: saves random disk write IO consumption (sequential write)

2. Change buffer: saves IO consumption of random disk reads

References

"MySQL Practice 45 Lectures"

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. If you have any questions, you can leave a message to communicate. Thank you for your support for 123WORDPRESS.COM.

You may also be interested in:
  • Detailed Analysis of the Selection of MySQL Common Index and Unique Index
  • Which one should I choose between MySQL unique index and normal index?
  • Efficiency comparison between common index and unique index in MySQL
  • Detailed explanation of the difference between MySQL normal index and unique index

<<:  Comparison of two implementation methods of Vue drop-down list

>>:  Detailed tutorial on installing Docker and nvidia-docker on Ubuntu 16.04

Recommend

Detailed explanation of JavaScript array deduplication

Table of contents 1. Array deduplication 2. Dedup...

Detailed explanation of writing multiple conditions of CSS: not

The :not pseudo-class selector can filter element...

Method for comparing the size of varchar type numbers in MySQL database

Create a test table -- --------------------------...

Basic usage tutorial of MySQL slow query log

Slow query log related parameters MySQL slow quer...

How to set the border of a web page table

<br />Previously, we learned how to set cell...

Echarts tutorial on how to implement tree charts

Treemaps are mainly used to visualize tree-like d...

mysql replace part of the field content and mysql replace function replace()

[mysql] replace usage (replace part of the conten...

JavaScript code to achieve a simple calendar effect

This article shares the specific code for JavaScr...

How to reasonably use the redundant fields of the database

privot is the intermediate table of many-to-many ...

Summary and analysis of commonly used Docker commands and examples

Table of contents 1. Container lifecycle manageme...

Detailed discussion of the differences between loops in JavaScript

Table of contents Preface Enumerable properties I...

Solution to the inconsistency between crontab execution time and system time

Preface In LINUX, periodic tasks are usually hand...

JS implements city list effect based on VUE component

This article example shares the specific code for...

Will css loading cause blocking?

Maybe everyone knows that js execution will block...

Let's talk about the size and length limits of various objects in MySQL

Table of contents Identifier length limit Length ...