Details of the underlying data structure of MySQL indexes

Details of the underlying data structure of MySQL indexes

1. Index Type

1. B+ Tree

Why B+ tree instead of B tree?

First, let's look at the structural differences between B-tree and B+ tree.

B-tree structure:

B+ Tree:

You can see:

  • The B-tree has satellite data (a row of data in the data table) on each node, while the B+ tree only has satellite data on leaf nodes. This means that for disk sectors of the same size, the B+ tree can store more leaf nodes and require fewer disk IO times; it also means that the search efficiency of the B+ tree is more stable, and the fastest time complexity of B-tree data query is O(1).
  • Each node of a B-tree appears only once, and all nodes of a B+ tree appear in leaf nodes. All leaf nodes of the B+ tree form an ascending linked list, which is suitable for interval range search, while the B tree is not suitable.

2. What are the differences between MyISAM and InnoDB's B+ tree index implementations (clustered index and non-clustered index)?

First you need to understand clustered indexes and non-clustered indexes.

Clustered index:

In a clustered index, the leaf pages contain all the data for the row, and the node pages contain the index columns. InnoDB clusters data by primary key. If no primary key is defined, a unique non-empty index column is selected instead. If there is no such index, InnoDB implicitly defines a primary key as the clustered index.

Data distribution of clustered index:

In a clustered index, in addition to the primary key index, there is also a secondary index. The leaf nodes in the secondary index do not store "row pointers" but primary key values, which are used as "pointers" to the rows. This means that when searching for a row through a secondary index, the storage engine needs to find the leaf node of the secondary index to obtain the corresponding primary key value, and then search for the corresponding row in the clustered index based on this value, which is also called "back to the table". Of course, you can avoid table repetition by using covering indexes or InnoDB 's adaptive indexes to reduce such repetitive work.

Note : Each leaf node in a clustered index contains not only the complete data row, but also the transaction ID, rollback pointer for transactions and MVCC.

3. Non-clustered index

The primary key index and secondary index of a non-clustered index are no different in structure, both of which store "row pointers" pointing to the physical address of the data on the leaf nodes.

Primary key index and secondary index of clustered index:

Primary key index and secondary index of non-clustered index:

4. Advantages and disadvantages of clustered index

advantage:

Store related data together (for example, group all the user's emails together by user ID), otherwise each data read may cause a disk IO
Faster data access. Store indexes and data in the same B+ tree. It is usually faster to retrieve data from a clustered index than from a non-clustered index. Using a covering query, you can directly use the primary key value in the page node.

shortcoming:

If all data can be stored in memory, sequential access is no longer necessary, and clustered indexes have no advantage. Insertion speed depends on the insertion order. Random insertion can cause page splits and holes. Use OPTIMIZE TABLE to rebuild the table. Every insertion, update, and deletion requires maintenance of index changes, which is very expensive. Secondary indexes may be larger than expected because the primary key columns of the referenced rows are included in the node.

5. Hash Index

Hash indexes are implemented based on hash tables. Only queries that exactly match all columns of the index are valid, which means that hash indexes are suitable for equal value queries.

Specific implementation: For each row of data, the storage engine calculates a hash code for all index columns. The hash index stores all hash codes in the index and saves a pointer to each data row in the hash table.

In MySQL, only the Memory engine explicitly supports hash indexes, although Memory engine also supports B-tree indexes.

Note: The Memory engine supports non-unique hash indexes. The way to resolve conflicts is to store multiple record pointers with the same hash value in the form of a linked list.

6. Adaptive Hash Index

When InnoDB notices that certain index values ​​are used very frequently, it creates a hash index based on the B+ tree index in memory, so that the B+ tree index also has some advantages of the hash index, such as fast hash search.

This is the end of this article about the details of the underlying data structure of MySQL indexes. For more information about the underlying data structure of MySQL indexes, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • How to construct a table index in MySQL
  • How to maintain MySQL indexes and data tables
  • Detailed introduction to MySQL database index
  • Detailed explanation of MySQL database index
  • MySQL Data Optimization - Multi-layer Index
  • MySQL Database Indexes and Transactions
  • Detailed explanation of the principles of indexing MySQL tables

<<:  How to solve the problem of not getting form value after submitting html form input using disabled

>>:  Solution for using Baidu share on Https page

Recommend

Vue implements left and right sliding effect example code

Preface The effect problems used in personal actu...

How to remove inline styles defined by the style attribute (element.style)

When modifying Magento frequently, you may encount...

Introduction to local components in Vue

In Vue, we can define (register) local components...

MySQL Series 11 Logging

Tutorial Series MySQL series: Basic concepts of M...

MySQL log system detailed information sharing

Anyone who has worked on a large system knows tha...

Detailed explanation of the two modes of Router routing in Vue: hash and history

hash mode (default) Working principle: Monitor th...

Learn one minute a day to use Git server to view debug branches and fix them

Debug branch During the normal development of a p...

50 Beautiful FLASH Website Design Examples

Flash enabled designers and developers to deliver...

Solve the problem that PhpStorm fails to connect to VirtualBox

Problem description: When phpstorm's SFTP hos...

Linux type version memory disk query command introduction

1. First, let’s have a general introduction to th...

Recommend a cool flashing alarm button

The effect is as follows: The code is as follows ...

Implementation of MySQL scheduled database backup (full database backup)

Table of contents 1. MySQL data backup 1.1, mysql...

VMware vsphere 6.5 installation tutorial (picture and text)

vmware vsphere 6.5 is the classic version of vsph...