1. What is an index?An index is a decentralized storage structure created to speed up the retrieval of data rows in a table. (Just like the dictionary we used when we were young, it would be faster to find the corresponding word with a dictionary) 2. Why do we need indexes?First we need to understand some concepts and knowledge
Through the above concepts, we roughly know what the index is used for - design the index system in advance, and when we query data, reduce the interaction with IO to improve our query efficiency. 3. How to design an index system?Let’s first understand a few concepts
—— key : the value stored in the actual data row - file address (pointer, we need to rely on the file address to find where the data file is stored)
—— From the above, we can see that our data format is of KV type. Knowing the KV format data, we know what data structure to use to store it, including hash table , tree ( binary tree , binary search tree , binary balanced tree , red-black tree , B tree , B+ tree ) 4.What is the MYSQL index system?Why not store it in the format mentioned above? As we all know, MySQL's index system uses B+ tree . Why is it B+ tree? Next, we analyze why other storage structures don’t work one by one. Before that, we still need to understand two prerequisites: OLAP and OLTP The more data we store, the larger the corresponding index will be. When we read from disk to memory, IO problems will arise. So do we create indexes for indexes? No, so MySQL uses the B+ tree 5. Hash TableThe above is the storage structure of the hash table. Let's discuss the advantages and disadvantages of this type of storage structure:
advantage: So is there a hash index in MySQL?
6. Tree6.1 Binary TreeThe binary tree itself is unordered. When we search for data, we have to compare the data with each node one by one to see if it meets our data requirements, which is inefficient. 6.2 Binary Search Tree (BST) Characteristics of binary search tree: data must be inserted in order, the left subtree must be smaller than the root node, and the right subtree must be guaranteed to be larger than the root node. Therefore, using a binary search tree compared to a binary tree obviously improves query efficiency. 6.3 Balanced Binary Tree (AVL Tree)According to the problems exposed by the binary search tree, we use the AVL tree to balance the tree by left or right rotation. However, in order to ensure balance, rotation is necessary when inserting data, compensating for the improvement in query performance by the loss in insertion performance . It's fine if I read more and write less, but if I have the same number of read and write requests, it's not suitable. 6.4 Red-Black TreesThe red-black tree is also balanced by left and right rotations, and there is also color-changing behavior. The longest subtree only needs to be no longer than twice the shortest subtree...so the query performance and insertion performance can be approximately balanced . However, as data is inserted, it is found that the depth of the tree will become deeper. The deeper the depth, the more IO times it will have, which affects the efficiency of data reading. 6.5 B-TreesIn view of the problems exposed by the red-black tree, how should we improve the efficiency of reading? Can we change from an ordered binary tree to an ordered multi-branch tree so that we can store more data? A degree of 4 means that a node stores three data values, and any values exceeding that must be transformed. So how is the actual data stored? We need the key and the complete data row The above picture shows how the B-tree actually stores data. Each node has three elements: key , pointer , and data . Let's idealize it and assume that keys and pointers do not take up space, and one piece of data takes up 1k of space. Then Disk 1 can store 16 pieces of data, Disk 3 also has 16 pieces of data, and Disk 8 also has 16 pieces of data. In this case, we can only store 16 + 16 + 16 = 4096 records, which is obviously a bit too little. In fact, keys and pointers also take up space. So we can't help but wonder, why is the amount of stored data so small? 6.6 B+TreeThe evolution from B-tree to B+tree: non-leaf nodes do not store data, only leaf nodes store data In the above figure, we can assume that p1 and 28 are a group of 10 bytes , so the first layer can store 16000/10=1600 such size, the second layer is also 1600, and the third layer data occupies 1kb, which is 16 records, so the total storage is 1600 1600 16=40960000 ( 40.96 million ) records The MySQL index structure is generally 3 to 4 layers, but there is one issue that needs attention. Assuming we have a three-layer storage structure, how can we store more data? Answer: The shorter the key length, the better. For varchar with a length less than 4 bytes, use varchar; for varchar with a length greater than 4 bytes, use int. According to the characteristics of B+ tree, it has large storage capacity and fast query, so MySQL uses B+ tree. SummarizeThis concludes the explanation of why the MySQL index system uses B+ trees. If I have said anything wrong, I hope you can remind me and correct it. This concludes this article on why MySQL's index system uses B+ trees. For more information about MySQL index B+ trees, please search for previous articles on 123WORDPRESS.COM or continue browsing the following related articles. I hope you will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: An example of vertical centering of sub-elements in div using Flex layout
>>: How to install Elasticsearch7.6 cluster in docker and set password
The role of virtual DOM First of all, we need to ...
Install Nginx First pull the centos image docker ...
Table of contents TOKEN Timer Refresher 2. Intern...
pthread_create function Function Introduction pth...
Yesterday, I wrote a blog about the circular prog...
1. It is best to add a sentence like this before t...
Effect picture: html: <div class='site_bar...
SSH stands for Secure Shell, which is a secure tr...
summary Docker-compose can easily combine multipl...
For front-end developers, ensuring that the code ...
1. float+overflow:hidden This method mainly trigg...
This article example shares the specific code of ...
Today, this article introduces some basic concept...
First download JDK. Here we use jdk-8u181-linux-x...
Adding the rel="nofollow" attribute to ...