MySQL's MyISAM and InnoDB storage engines both use B+ tree indexes by default (shown as "BTREE" when you inspect the index type). This article covers two things: why MySQL chose the B+ tree as its index structure, and the common MySQL index optimization ideas that follow from the B+ tree's properties.
Why can't the index fit entirely in memory?
The choice of index structure rests on one premise: when the data volume is large, the index cannot be fully loaded into memory. Why not? Assuming the index is organized as a tree, a rough estimate goes as follows:
Suppose the whole index is kept in memory. If an index entry (key plus row pointer) is on the order of one tenth the size of its data row, then for every 2G of data stored on disk, about 200MB of memory is taken up by the index: an index:data ratio of roughly 1/10.
Is a 1/10 ratio large? Physical disk is far cheaper than memory. Take a server with 16G of memory and a 1T disk: to fill the 1T disk you would need at least 100G of memory for the indexes, far more than 16G. Once you account for tables having several indexes, composite indexes, and smaller data rows, the actual ratio is usually above 1/10 and can reach 1/3. In an index-based storage architecture, that ratio is simply too high for the index to be fully loaded into memory.
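If you want to see what the ratio looks like on your own tables, MySQL exposes per-table data and index sizes in information_schema; a minimal query (the schema name 'my_db' is a placeholder) might look like this:

```sql
-- Rough per-table index:data ratio; DATA_LENGTH and INDEX_LENGTH are in bytes.
SELECT table_name,
       data_length,
       index_length,
       ROUND(index_length / data_length, 2) AS index_data_ratio
FROM information_schema.TABLES
WHERE table_schema = 'my_db'   -- placeholder: your database name
ORDER BY data_length DESC;
```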
Other structural issues
Since the index cannot be held in memory, it has to live on disk (or SSD). Memory reads and writes are thousands of times faster than disk (depending on the implementation), so the core problem becomes "how to reduce the number of disk reads and writes". To start, set the operating system's paging mechanism aside and assume every read and write goes straight to disk. Then:
BST, AVL and RBT reduce the number of reads and writes from O(n) to O(log2(n)); compared with a plain BST, AVL and RBT are self-balancing, which caps the worst case at O(log2(n)). With an auto-increment primary key the keys themselves arrive in order, and the read/write count of a tree structure is bounded by the tree height: the lower the tree, the fewer the reads and writes, while self-balancing keeps the tree's shape stable. To go further, we bring in the B-tree and the B+ tree.
What problem does the B-tree solve?
Many articles mistakenly call the B-tree a "B-minus tree", probably a misreading of its English name "B-Tree" (worse still, some call it a binary tree or a binary search tree). The confusion is most common when it is mentioned alongside the B+ tree: if there is a B "plus" tree, surely there must be a B "minus" tree. In fact, the English name of the B+ tree is "B+-Tree".
Setting its maintenance operations aside, a B-tree looks like an m-way search tree (m being the maximum number of subtrees), with lookup complexity O(logm(n)). More importantly, the B-tree comes with simple, efficient maintenance operations that keep its depth between roughly log(ceil(m/2))(n) and logm(n), which keeps the tree very flat.
Again, don't fixate on time complexity. Unlike purely in-memory algorithms, the number of disk IOs is the dominant cost here. You can verify that the B-tree and the AVL tree have the same asymptotic complexity, but because the B-tree has far fewer levels and therefore far fewer disk IOs, in practice it outperforms binary trees such as AVL.
Like a binary search tree, each node stores keys and subtrees, except that here a node holds many keys and subtrees, all kept in sorted order.
The operating system's paging mechanism exists to extend memory onto external storage and to speed up disk access. A page is usually 4KB (the same size as a disk block; see analyses of inode and block). The OS loads data from disk into memory one page at a time (amortizing the seek cost) and writes modified pages back to disk at a suitable moment. To exploit this, each B-tree node can be sized to roughly one page (making m very large), so that loading a single page brings in one complete node and is enough to choose the subtree at the next level; the same holds level by level. From the paging point of view, an AVL (or RBT) node is just a B-tree node with 1 key and 2 subtrees, and since logically adjacent nodes are usually not physically adjacent on disk, most of a 4KB page read in for such a node is wasted.
Assume the key and the subtree pointer each occupy 4B. A B-tree node then takes up at most m * (4 + 4) = 8m bytes, and a page is 4KB, so m = 4 * 1024 / 8 = 512. For a 512-way B-tree holding 10 million records, the maximum depth is about log(512/2)(10^7) ≈ 3, call it 4 levels. By comparison, a binary tree such as AVL has depth log2(10^7) ≈ 23.25, about 24 levels, more than 5 times deeper. That is how shallow a B-tree index is.
On top of that, the B-tree is very friendly to the principle of locality: if the key is small (like the 4B auto-increment key above), then beyond the paging benefit, the CPU cache can accelerate things further through read-ahead.
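The arithmetic above can be sanity-checked directly in MySQL, under the same assumptions as the text (4B keys, 4B child pointers, 4KB pages, 10 million rows); LOG(b, x) here is MySQL's log-to-base-b function:

```sql
SELECT
  4 * 1024 / (4 + 4) AS m,            -- fan-out: about 512 subtrees per node
  LOG(512 / 2, 1e7)  AS btree_depth,  -- about 3, i.e. roughly 3-4 levels
  LOG(2, 1e7)        AS binary_depth; -- about 23, i.e. roughly 24 levels
```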
What problems does the B+ tree solve?
The remaining problems of the B-tree
To actually use a B-tree as a database index, a few problems remain:
Question 1
A record in a data table has multiple fields, so locating the primary key alone is not enough: the data row itself must also be located. There are 3 candidate solutions, analysed below:
Solution 1: store the data row directly in the node, next to its key.
Solution 2: add a field to each node entry that stores the location (a pointer) of the data row.
Solution 3: keep only keys in non-leaf nodes, and store data row locations only in the leaf nodes.
Solution 1 is rejected out of hand: storing data rows shrinks the number of subtrees per page, so m drops and the tree height grows.
In Solution 2 a field is added to each node entry. Assuming a 4B pointer, the new fan-out is m = 4 * 1024 / 12 ≈ 341, and the maximum depth is log(341/2)(10^7) ≈ 3.14, still about 4.
In Solution 3 the fan-out m and the depth are unchanged, and the lookup cost becomes a stable O(logm(n)), since every search descends to a leaf. Solution 3 is worth considering.
Question 2
In real workloads, range queries are very frequent. A B-tree can only locate a single index position (which may correspond to multiple rows), so it handles range queries poorly. Two plans require only minor changes, compared below:
Plan 1: after locating the start of the range, continue with an in-order traversal of the tree until the end of the range; the structure itself does not change.
Plan 2: link the leaf nodes together in key order; after locating the start of the range, scan forward along the leaf chain until the end of the range.
At first glance Plan 1 looks better than Plan 2: the time complexity and the constant factors are the same, and Plan 1 needs no structural change. But don't forget the principle of locality. Whether the nodes store data rows or data row locations, Plan 2 keeps the advantage that paging and caching can pre-read the next node in the scan, whereas Plan 1 suffers from nodes that are logically adjacent but physically scattered on disk.
Introducing the B+ tree
To sum up, Solution 2 for Question 1 combined with Plan 1 for Question 2 gives one design (an index based on the B-tree), while Solution 3 for Question 1 combined with Plan 2 for Question 2 gives another (an index based on the B+ tree). In practice some databases and file systems use B-trees and others use B+ trees; for reasons this code monkey does not yet understand, mainstream databases, MySQL included, mostly choose the B+ tree. Compared with the B-tree, the main changes are:
Non-leaf nodes store only keys and child pointers; data row locations (or the data rows themselves) live only in the leaf nodes.
Leaf nodes are chained together in key order, so a range query descends the tree once to find the start of the range and then scans along the leaf chain.
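As a concrete illustration (the table, column and index names below are made up for this sketch), a range query over an indexed column is exactly the case the leaf chain serves: the tree is descended once to find the first qualifying entry, then the scan follows the linked leaves.

```sql
-- Hypothetical table; idx_created(created_at) is a B+ tree secondary index.
CREATE TABLE orders (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  created_at DATETIME NOT NULL,
  amount     DECIMAL(10, 2) NOT NULL,
  KEY idx_created (created_at)
) ENGINE = InnoDB;

-- For a selective enough range, EXPLAIN typically reports access type "range"
-- on idx_created: locate '2014-05-01' in a leaf, then walk the leaf chain.
EXPLAIN
SELECT id, amount
FROM orders
WHERE created_at BETWEEN '2014-05-01' AND '2014-05-31';
```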
Insertion, deletion and lookup in B-trees and B+ trees
For B-tree insertion and deletion, refer for now to section "6. Insertion and deletion operations of B-trees" in "From B-tree, B+ tree, B* tree to R-tree"; B+ tree insertion and deletion work similarly and are not repeated here.
MySQL index optimization
Given the properties of the B+ tree, the common MySQL index optimization ideas are easy to understand. Differences between storage engines are ignored for now.
Prefer an auto-increment key as the primary key
In the earlier analysis, with a 4B auto-increment key as the index key, m reaches 512 and the tree height is only 3 or so. An auto-increment key brings two benefits.
First, an auto-increment key is usually an integer type such as int, so the key is compact, m can be very large, and the index takes up little space. As an extreme counter-example, with a 50B varchar key (length included), m = 4 * 1024 / 54 ≈ 76 and the maximum depth is log(76/2)(10^7) ≈ 4.43, call it 5. Add the cost of cache misses and string comparisons and the time cost rises sharply; meanwhile, growing the key from 4B to 50B inflates the whole index tree horribly (and if secondary indexes use the primary key to locate data rows, the space blow-up is even worse).
Second, the auto-increment property means new rows are always inserted at the rightmost edge of the index tree, so node splits are rare. Ideally the index tree stays "full"; a full tree is shorter, and node merges on deletion are rarer too.
A lesson from experience: I once used a varchar(100) column holding a containerId as the primary key. Within 3 or 4 days the 100G database was full, and the DBA let me know her contempt by email... Afterwards an auto-increment column was added as the primary key, with containerId as a unique secondary index, and the savings in both time and space were substantial.
Leftmost prefix matching
An index can be as simple as a single column (a) or as complex as multiple columns (a, b, c, d), i.e. a composite index, in which case the key is built from those columns in order. The index can only be used for equal-value matching on the key; as soon as a range condition (>, <, between, a left-anchored like) is hit, the columns after it cannot use the index any further and the remaining matching degenerates into a linear scan. The order of the columns therefore determines how many columns the index can serve. With an index (a, b, c, d) and the condition a = 1 and b = 2 and c > 3 and d = 4, columns a, b and c are matched in turn within each node, but d is not. That is the leftmost prefix matching principle.
= and in conditions are reordered automatically
You do not need to worry about the written order of = and in conditions: MySQL automatically reorders them to match as many index columns as possible. With an index (a, b, c, d), the conditions c > 3 and b = 2 and a = 1 and d < 4, or a = 1 and c > 3 and b = 2 and d < 4, both work: MySQL normalizes them to a = 1 and b = 2 and c > 3 and d < 4, matching a, b and c in turn (see the example below).
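A minimal sketch of these two points, with a made-up table and index:

```sql
-- Hypothetical table with a compact auto-increment primary key and a
-- composite index on (a, b, c, d).
CREATE TABLE t (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  a  INT NOT NULL,
  b  INT NOT NULL,
  c  INT NOT NULL,
  d  INT NOT NULL,
  KEY idx_abcd (a, b, c, d)
) ENGINE = InnoDB;

-- a, b and c can use idx_abcd; d cannot, because the range condition on c
-- ends the prefix match.
SELECT * FROM t WHERE a = 1 AND b = 2 AND c > 3 AND d = 4;

-- Same conditions in a different written order: the optimizer normalizes
-- them, so this is equivalent to the query above.
SELECT * FROM t WHERE c > 3 AND b = 2 AND a = 1 AND d = 4;

-- Skipping b breaks the prefix: only column a can use idx_abcd here.
SELECT * FROM t WHERE a = 1 AND c > 3;
```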
Index columns must not take part in calculations
Query conditions that wrap an index column in a calculation are index-unfriendly (and may not use the index at all), for example from_unixtime(create_time) = '2014-05-29'. The reason is simple: how would the matching key be found inside a node? A linear scan would have to redo the calculation for every entry, which is far too expensive; a binary search would require knowing how values compare after going through from_unixtime. So index columns must not take part in calculations. The condition above should instead be written as create_time = unix_timestamp('2014-05-29').
Extend an existing index rather than creating a new one
If an index (a) already exists and an index (a, b) is needed, prefer changing index (a) into index (a, b). The cost of building a brand-new index is obvious, whereas modifying index (a) into (a, b) lets MySQL transform the existing B+ tree directly through splits, merges and similar operations.
No need for indexes related by prefix inclusion
If an index (a, b) already exists, there is no need to create index (a); an index (b), however, may still be worth creating if needed.
Choose columns with high discrimination as index columns
Easy to understand: if gender is used as the index, it can only split 10 million rows into two halves (say 5 million male and 5 million female), and the index is nearly useless. Discrimination is measured as count(distinct <col>) / count(*), the proportion of distinct values; the larger it is, the better. A unique key has a discrimination of 1, while some status or gender columns have a discrimination close to 0 on large data sets. There is no hard threshold, but for columns used in joins a value above 0.1 is commonly required, meaning each lookup scans about 10 rows on average.
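Two of these points written out as queries (the table and column names are placeholders):

```sql
-- Index-unfriendly: the indexed column is wrapped in a function, so an index
-- on create_time cannot be used for the comparison.
SELECT * FROM t_order WHERE from_unixtime(create_time) = '2014-05-29';

-- Better: move the calculation to the constant side and leave the column bare.
SELECT * FROM t_order WHERE create_time = unix_timestamp('2014-05-29');

-- Estimating the discrimination (selectivity) of a candidate index column.
SELECT COUNT(DISTINCT gender) / COUNT(*) AS discrimination FROM t_user;
```

The closer that last figure is to 1, the more useful an index on the column will be.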