Details of the underlying data structure of MySQL indexes

Table of contents

1. Index Type

1. B+ Tree
2. What are the differences between MyISAM and InnoDB's B+ tree index implementations (clustered index and non-clustered index)?
3. Non-clustered index
4. Advantages and disadvantages of clustered index
5. Hash Index
6. Adaptive Hash Index

1. Index Type

1. B+ Tree

Why B+ tree instead of B tree?

First, let's look at the structural differences between B-tree and B+ tree.

B-tree structure:

B+ Tree:

You can see:

In a B-tree, every node carries satellite data (the row of the data table that belongs to the key), while in a B+ tree only the leaf nodes carry satellite data. Because the internal nodes of a B+ tree hold only keys and child pointers, a disk page of the same size holds more keys, the tree is shallower, and a lookup needs fewer disk I/O operations. Lookup cost is also more stable: every B+ tree search descends to a leaf, whereas a B-tree search can finish anywhere between the root (the best case, O(1)) and a leaf.
Each key appears only once in a B-tree, while in a B+ tree every key appears in a leaf node, and the keys in the internal nodes are copies used only for routing. The leaf nodes of a B+ tree are linked into an ascending list, which makes range scans cheap; a B-tree has no such list and has to walk up and down the tree to perform one.
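
To make the range-scan point concrete, here is a minimal sketch of a chain of B+ tree leaves. It is not how InnoDB lays out pages; the names Leaf, build_leaf_chain and range_scan are made up for illustration, and a binary search over the first key of each leaf stands in for the descent through the internal nodes.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    """A simplified B+ tree leaf: sorted keys, their rows, link to the next leaf."""
    keys: List[int]
    rows: List[str]
    next: Optional["Leaf"] = None


def build_leaf_chain(items, per_leaf=3):
    """Pack (key, row) pairs into leaves and link the leaves in key order."""
    items = sorted(items)
    leaves = []
    for i in range(0, len(items), per_leaf):
        chunk = items[i:i + per_leaf]
        leaves.append(Leaf([k for k, _ in chunk], [r for _, r in chunk]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(leaves, low, high):
    """Return the rows with low <= key <= high."""
    if not leaves:
        return []
    # Stand-in for the tree descent: find the last leaf whose first key <= low.
    first_keys = [leaf.keys[0] for leaf in leaves]
    start = max(bisect_right(first_keys, low) - 1, 0)
    leaf = leaves[start]
    result = []
    # From here on only the leaf links are followed; the tree is not revisited.
    while leaf is not None:
        for key, row in zip(leaf.keys, leaf.rows):
            if key > high:
                return result
            if key >= low:
                result.append(row)
        leaf = leaf.next
    return result
```

After one search for the lower bound, the scan only follows next links, which is why a query such as "WHERE id BETWEEN 4 AND 8" suits a B+ tree.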

2. What are the differences between MyISAM and InnoDB's B+ tree index implementations (clustered index and non-clustered index)?

First you need to understand clustered indexes and non-clustered indexes.

Clustered index:

In a clustered index, the leaf pages contain all the data for the row, and the non-leaf (node) pages contain only the index columns. InnoDB clusters data by primary key. If no primary key is defined, InnoDB uses the first unique index whose columns are all NOT NULL instead. If there is no such index either, InnoDB generates a hidden clustered index (named GEN_CLUST_INDEX) on an internal 6-byte row ID.
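
The selection rule can be written down as a small function. This is only a sketch of the rule as described above, not InnoDB code: IndexDef and choose_clustered_index are hypothetical names, and the order of the list stands in for the order in which the indexes were defined.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class IndexDef:
    """Hypothetical description of one index on a table."""
    name: str
    columns: Tuple[str, ...]
    unique: bool = False
    primary: bool = False


def choose_clustered_index(indexes: List[IndexDef], nullable_columns: Set[str]) -> str:
    """Return the name of the index the table would be clustered on."""
    # 1. An explicit primary key always wins.
    for idx in indexes:
        if idx.primary:
            return idx.name
    # 2. Otherwise the first unique index whose columns are all NOT NULL.
    for idx in indexes:
        if idx.unique and not any(c in nullable_columns for c in idx.columns):
            return idx.name
    # 3. Otherwise the hidden index on the internal row ID.
    return "GEN_CLUST_INDEX"
```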

Data distribution of clustered index:

A table organized around a clustered index can also have secondary indexes in addition to the primary key index. The leaf nodes of a secondary index do not store "row pointers" but primary key values, which serve as the "pointers" to the rows. This means that when searching for a row through a secondary index, the storage engine first finds the leaf node of the secondary index to obtain the primary key value, and then uses that value to search the clustered index for the row. This second lookup is often called going "back to the table". A covering index avoids the second lookup entirely, because every column the query needs is already in the secondary index, and InnoDB's adaptive hash index can make the repeated lookups cheaper.

Note: Each record in a clustered index leaf contains not only the complete data row but also a transaction ID and a rollback pointer, which are used for transactions and MVCC.
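
The two-step lookup can be sketched with two dictionaries. This is a toy model under stated assumptions: the table, the column names and the functions find_by_email and find_id_by_email are invented for illustration, and plain dicts stand in for the two B+ trees.

```python
# Clustered index: primary key -> complete row.
clustered = {
    1: {"id": 1, "email": "ann@example.com", "name": "Ann"},
    2: {"id": 2, "email": "bob@example.com", "name": "Bob"},
    3: {"id": 3, "email": "cat@example.com", "name": "Cat"},
}

# Secondary index on email: email -> primary key value (not a row address).
secondary_email = {row["email"]: pk for pk, row in clustered.items()}

# Counts how often the clustered index had to be searched.
stats = {"clustered_lookups": 0}


def find_by_email(email):
    """Like SELECT * ... WHERE email = ?: needs both indexes."""
    pk = secondary_email.get(email)      # step 1: secondary index gives the PK
    if pk is None:
        return None
    stats["clustered_lookups"] += 1      # step 2: "back to the table"
    return clustered[pk]


def find_id_by_email(email):
    """Like SELECT id ... WHERE email = ?: covered by the secondary index."""
    return secondary_email.get(email)    # the PK is already in the leaf
```

Calling find_by_email increments the counter, while find_id_by_email never touches the clustered index, which is the saving a covering index provides.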

3. Non-clustered index

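In a non-clustered layout, which is what MyISAM uses, the index and the data are stored separately. The primary key index and the secondary indexes have the same structure: the leaf nodes of both store "row pointers" that point to the physical address of the row in the data file. The only difference is that the primary key index must be unique and non-null.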
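
For comparison with the clustered layout, here is a rough sketch of the non-clustered one. The list position plays the role of the physical address, dicts again stand in for the B+ trees, and all names (data_file, primary_index, name_index) are illustrative only.

```python
from collections import defaultdict

# The data file: rows in insertion order. The list position is the
# stand-in for the physical address that a row pointer refers to.
data_file = [
    {"id": 30, "name": "Cat"},
    {"id": 10, "name": "Ann"},
    {"id": 20, "name": "Ann"},
]

# Primary key index: key -> row pointer (unique, so one pointer per key).
primary_index = {row["id"]: pos for pos, row in enumerate(data_file)}

# Secondary index: same structure, but one key may map to several pointers.
name_index = defaultdict(list)
for pos, row in enumerate(data_file):
    name_index[row["name"]].append(pos)


def find_by_id(pk):
    """Primary key lookup: index, then one hop into the data file."""
    pos = primary_index.get(pk)
    return None if pos is None else data_file[pos]


def find_by_name(name):
    """Secondary lookup: also index, then one hop; no primary key involved."""
    return [data_file[pos] for pos in name_index.get(name, [])]
```

Both lookups reach the row in one hop from the index, and the rows themselves are not kept in primary key order.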

Primary key index and secondary index of clustered index:

Primary key index and secondary index of non-clustered index:

4. Advantages and disadvantages of clustered index

Advantages:

Related data is stored together (for example, all of a user's emails can be grouped by user ID, so they can be read from a few pages); without clustering, each row read may cost a separate disk I/O.
Faster data access. The index and the data live in the same B+ tree, so retrieving a row through the clustered index is usually faster than through a non-clustered index. Queries served by a covering index can use the primary key values stored in the leaf nodes directly.

Disadvantages:

Clustering mainly helps I/O-bound workloads. If all the data fits in memory, the order of access matters little, and the clustered index has no advantage.
Insertion speed depends on the insertion order. Inserting in random primary key order causes page splits and leaves holes in the pages; after loading a lot of data this way, OPTIMIZE TABLE can be used to rebuild the table.
Every insertion, update, and deletion has to maintain the index, which is expensive; updating the clustered key is especially costly because the row has to move to a new position.
Secondary indexes may be larger than expected, because their leaf nodes contain the primary key columns of the rows they reference.
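
The effect of insertion order can be seen in a small simulation. This is a deliberately simplified model, not InnoDB's actual split algorithm: pages are fixed-capacity sorted lists, a full page is split in half, and an insert beyond the end of the last full page simply starts a new page. The function names and the capacity of 8 keys per page are assumptions made for the example.

```python
import random
from bisect import bisect_right, insort


def insert_all(keys, capacity=8):
    """Insert distinct keys one by one; return (pages, number_of_splits)."""
    pages = []
    splits = 0
    for key in keys:
        if not pages:
            pages.append([key])
            continue
        # Find the page whose key range should hold this key.
        idx = max(bisect_right([p[0] for p in pages], key) - 1, 0)
        page = pages[idx]
        if len(page) < capacity:
            insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Appending past the end: leave the full page alone, open a new one.
            pages.append([key])
        else:
            # The page is full and the key belongs inside it: split in half.
            mid = len(page) // 2
            left, right = page[:mid], page[mid:]
            pages[idx:idx + 1] = [left, right]
            splits += 1
            insort(right if key >= right[0] else left, key)
    return pages, splits


def fill_factor(pages, capacity=8):
    """Fraction of the allocated slots that actually hold a key."""
    return sum(len(p) for p in pages) / (len(pages) * capacity)


if __name__ == "__main__":
    ordered = list(range(2000))
    shuffled = ordered[:]
    random.Random(42).shuffle(shuffled)
    for label, keys in (("sequential", ordered), ("random", shuffled)):
        pages, splits = insert_all(keys)
        print(label, len(pages), "pages,", splits, "splits,",
              round(fill_factor(pages), 2), "fill")
```

In this model, sequential keys fill every page completely and never split, while shuffled keys trigger splits and leave pages partly empty, which is the "holes" problem described above.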

5. Hash Index

Hash indexes are implemented on top of hash tables. Only queries that exactly match all columns of the index can use it, which means that hash indexes are suitable for equality lookups but not for range queries, sorting, or matching on only some of the indexed columns.

Specific implementation: For each row of data, the storage engine calculates a hash code over all the indexed columns. The hash index stores these hash codes in a hash table, and each entry keeps a pointer to the corresponding data row.

In MySQL, only the Memory engine explicitly supports hash indexes, although Memory engine also supports B-tree indexes.

Note: The Memory engine supports non-unique hash indexes. Collisions are resolved by chaining: the row pointers of all records with the same hash value are kept in a linked list.
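
A hash index with chaining might be sketched as follows. This is an illustration of the idea only, not the Memory engine's implementation: the class name HashIndex, the bucket count and the use of Python's built-in hash() are all assumptions, and a Python list plays the role of each linked list.

```python
class HashIndex:
    """Toy hash index over one or more columns of an in-memory table."""

    def __init__(self, table, columns, n_buckets=8):
        self.table = table          # list of row dicts; position = row pointer
        self.columns = columns
        self.buckets = [[] for _ in range(n_buckets)]
        for ptr, row in enumerate(table):
            # One hash code is computed over ALL indexed columns of the row.
            code = hash(tuple(row[c] for c in columns))
            # Entries with the same slot are chained in the same bucket.
            self.buckets[code % n_buckets].append((code, ptr))

    def lookup(self, *values):
        """Return the row pointers whose indexed columns equal values."""
        if len(values) != len(self.columns):
            return []               # a partial key cannot be hashed
        code = hash(tuple(values))
        hits = []
        for stored_code, ptr in self.buckets[code % len(self.buckets)]:
            if stored_code != code:
                continue            # different key that shares the bucket
            row = self.table[ptr]
            # Equal hash codes do not prove equal values, so check the row.
            if all(row[c] == v for c, v in zip(self.columns, values)):
                hits.append(ptr)
        return hits
```

With an index on (first, last), a lookup for ("Ann", "Lee") returns every matching row pointer, while a lookup on "Ann" alone returns nothing, because the hash is computed over both columns together.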

6. Adaptive Hash Index

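When InnoDB notices that certain index values are accessed very frequently, it automatically builds a hash index in memory on top of the B+ tree index, so that lookups for those values get some of the benefits of a hash index, such as fast equality search. This behavior is fully automatic: the hash index cannot be created manually, although the feature can be turned off with the innodb_adaptive_hash_index option.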
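
The general idea can be illustrated with a toy lookup structure that promotes hot keys into a hash table. This is only a sketch of the concept and does not reflect InnoDB's real heuristics or data structures: the class AdaptiveLookup, the hot_threshold parameter and its value are invented, and a binary search over a sorted list stands in for the B+ tree search.

```python
from bisect import bisect_left
from collections import Counter


class AdaptiveLookup:
    """Sorted-list search that caches frequently requested keys in a dict."""

    def __init__(self, items, hot_threshold=3):
        items = sorted(items)
        self.keys = [k for k, _ in items]
        self.rows = [r for _, r in items]
        self.hot_threshold = hot_threshold   # hypothetical promotion rule
        self.hits = Counter()                # how often each key was searched
        self.hash_index = {}                 # hot key -> position in rows
        self.tree_searches = 0               # searches that used the slow path

    def get(self, key):
        # Fast path: the key was promoted earlier, one hash probe is enough.
        if key in self.hash_index:
            return self.rows[self.hash_index[key]]
        # Slow path: ordered search, standing in for the B+ tree descent.
        self.tree_searches += 1
        pos = bisect_left(self.keys, key)
        if pos == len(self.keys) or self.keys[pos] != key:
            return None
        self.hits[key] += 1
        if self.hits[key] >= self.hot_threshold:
            self.hash_index[key] = pos       # promote the hot key
        return self.rows[pos]
```

Keys that are requested only occasionally keep using the ordered search; only keys that cross the threshold get a hash entry, so the hash table stays small.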
