Data Structure - Tree (III): Multi-way Search Tree B-tree, B+ tree

Data Structure - Tree (III): Multi-way Search Tree B-tree, B+ tree

Multi-way search tree

  1. Height of a complete binary tree: O(log2N), where 2 is the logarithm
  2. Height of a complete M-way search tree: O(logmN), where M is the logarithm, the number of nodes at each level of the tree
  3. M-way search tree is mainly used to solve the problem of data storage that is too large to be loaded into memory. By increasing the number of nodes in each layer and storing more data in each node, more data can be stored in one layer, thereby reducing the height of the tree and reducing the number of disk accesses when searching for data.
  4. Therefore, the more nodes in each layer and the more keywords each node contains, the shorter the tree height will be. However, determining data at each node is slower, but B-tree focuses on disk performance bottlenecks, so the overhead of searching data at a single node can be ignored.

B-Tree

B-tree is an M-way search tree. B-tree is mainly used to solve the imbalance of M-way search tree which causes the tree height to become higher, just like the performance problem caused by the degeneration of binary tree into linked list. The B-tree ensures the balance of the M-way search tree by controlling and adjusting the nodes in each layer, such as node separation, node merging, and splitting the parent node upward to add a new layer when a layer is full. The specific rules are as follows:

  1. The number of child trees of the root node is between 2 and M, and the number of child trees of other non-leaf nodes is between M/2 and M. If the number of child trees exceeds M due to splitting, the parent node needs to be split recursively upwards. When a parent node that does not need to be split is found, the splitting stops. The splitting process continues until the root node. If the root node needs to be split, two roots will be generated, so a new root needs to be created to use these two roots as child nodes. At this time, the height of the tree will increase by 1.
  2. The value of the keyword of each non-leaf node increases from left to right. The i-th keyword represents the smallest keyword in subtree i+1; (where i is between 1 and (2 to M) for the root node and between 1 and (M/2 to M) for other non-leaf nodes);
  3. All data items of the B-tree are stored in leaf nodes. Non-leaf nodes do not store data. Non-leaf nodes only store keywords used to indicate the search direction, namely indexes. This helps load more non-leaf nodes into memory, making it easier to search for data;
  4. All leaf nodes are at the same depth and each leaf node contains L/2 to L items of data.

Size options: M and L

  1. M is the order or number of paths of the B-tree
  2. L is the maximum number of data items that can be stored in each leaf node
  3. In a B-tree, each node is a disk block, so M and L need to be determined based on the size of the disk block.

Disk block size and M calculation

  1. Each non-leaf node stores keywords and pointers to child trees. The specific number is: for a B-tree of order M, each non-leaf node stores M-1 keywords and M pointers to child trees. Therefore, if the size of each keyword is 8 bytes (such as the Java long type is 8 bytes) and each pointer is 4 bytes, then each non-leaf node of the M-order B-tree requires: 8 * (M-1) + 4 * M = 12M - 8 bytes.
  2. If it is stipulated that each non-leaf node (disk block) does not occupy more than 8K of memory, that is, 8192, then the maximum value of M is 683, that is, 683*12-8=8192.

Number of leaf node data items L

  1. If the size of each data item is also 256 bytes, then since the disk block size is 8K, or 8192 bytes, and each leaf node can store L/2 to L data items, each leaf node can store at most: 8192/256=32 data items, that is, the size of L is 32.
  2. The structure of a 5-order B-tree is as follows, that is, M and L are equal to 5: each non-leaf node contains at most M-1=5-1=4 keywords, including M, that is, 5 pointers to subtrees. If L is equal to 5, each leaf node can store up to 5 data items.

B+ Tree

The structure of the B+ tree is basically the same as that of the B tree. The only difference is that the leaf nodes of the B+ tree are connected by pointers to form a linked list, so it is easy to traverse all the leaf nodes, that is, to obtain all or all data items in a certain range of the search keyword. MySQL's InnoDB storage engine uses B+ tree as the index implementation.

The above is the detailed integration of multi-way search tree B-tree and B+ tree introduced by the editor. I hope it will be helpful to everyone. If you have any questions, please leave me a message and the editor will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website!

You may also be interested in:
  • Summary of B-tree index knowledge points in MySQL optimization
  • A brief discussion on MySQL B-tree index and index optimization summary
  • Complete B-tree algorithm Java implementation code
  • In-depth understanding of C language B-tree

<<:  Ubuntu 18.04 installs pyenv, pyenv-virtualenv, virtualenv, Numpy, SciPy, Pillow, Matplotlib

>>:  The solution to the page not refreshing after the route changes after react jumps

Recommend

MySql quick insert tens of millions of large data examples

In the field of data analysis, database is our go...

jQuery combined with CSS to achieve the return to top function

CSS Operations CSS $("").css(name|pro|[...

Html+CSS floating advertisement strip implementation

1.html part Copy code The code is as follows: <...

Detailed explanation of Docker container data volumes

What is Let’s first look at the concept of Docker...

Installation and configuration of mysql 8.0.15 under Centos7

This article shares with you the installation and...

How to separate static and dynamic state by combining Apache with Tomcat

Experimental environment Apache and Tomcat are bo...

Detailed tutorial on installing mysql under Linux

1. Shut down the mysql service # service mysqld s...

Analysis of Linux boot system methods

This article describes how to boot the Linux syst...

Examples of clearfix and clear

This article mainly explains how to use clearfix a...