Data Structure - Tree (III): Multi-way Search Tree B-tree, B+ tree

Data Structure - Tree (III): Multi-way Search Tree B-tree, B+ tree

Multi-way search tree

  1. Height of a complete binary tree: O(log2N), where 2 is the logarithm
  2. Height of a complete M-way search tree: O(logmN), where M is the logarithm, the number of nodes at each level of the tree
  3. M-way search tree is mainly used to solve the problem of data storage that is too large to be loaded into memory. By increasing the number of nodes in each layer and storing more data in each node, more data can be stored in one layer, thereby reducing the height of the tree and reducing the number of disk accesses when searching for data.
  4. Therefore, the more nodes in each layer and the more keywords each node contains, the shorter the tree height will be. However, determining data at each node is slower, but B-tree focuses on disk performance bottlenecks, so the overhead of searching data at a single node can be ignored.

B-Tree

B-tree is an M-way search tree. B-tree is mainly used to solve the imbalance of M-way search tree which causes the tree height to become higher, just like the performance problem caused by the degeneration of binary tree into linked list. The B-tree ensures the balance of the M-way search tree by controlling and adjusting the nodes in each layer, such as node separation, node merging, and splitting the parent node upward to add a new layer when a layer is full. The specific rules are as follows:

  1. The number of child trees of the root node is between 2 and M, and the number of child trees of other non-leaf nodes is between M/2 and M. If the number of child trees exceeds M due to splitting, the parent node needs to be split recursively upwards. When a parent node that does not need to be split is found, the splitting stops. The splitting process continues until the root node. If the root node needs to be split, two roots will be generated, so a new root needs to be created to use these two roots as child nodes. At this time, the height of the tree will increase by 1.
  2. The value of the keyword of each non-leaf node increases from left to right. The i-th keyword represents the smallest keyword in subtree i+1; (where i is between 1 and (2 to M) for the root node and between 1 and (M/2 to M) for other non-leaf nodes);
  3. All data items of the B-tree are stored in leaf nodes. Non-leaf nodes do not store data. Non-leaf nodes only store keywords used to indicate the search direction, namely indexes. This helps load more non-leaf nodes into memory, making it easier to search for data;
  4. All leaf nodes are at the same depth and each leaf node contains L/2 to L items of data.

Size options: M and L

  1. M is the order or number of paths of the B-tree
  2. L is the maximum number of data items that can be stored in each leaf node
  3. In a B-tree, each node is a disk block, so M and L need to be determined based on the size of the disk block.

Disk block size and M calculation

  1. Each non-leaf node stores keywords and pointers to child trees. The specific number is: for a B-tree of order M, each non-leaf node stores M-1 keywords and M pointers to child trees. Therefore, if the size of each keyword is 8 bytes (such as the Java long type is 8 bytes) and each pointer is 4 bytes, then each non-leaf node of the M-order B-tree requires: 8 * (M-1) + 4 * M = 12M - 8 bytes.
  2. If it is stipulated that each non-leaf node (disk block) does not occupy more than 8K of memory, that is, 8192, then the maximum value of M is 683, that is, 683*12-8=8192.

Number of leaf node data items L

  1. If the size of each data item is also 256 bytes, then since the disk block size is 8K, or 8192 bytes, and each leaf node can store L/2 to L data items, each leaf node can store at most: 8192/256=32 data items, that is, the size of L is 32.
  2. The structure of a 5-order B-tree is as follows, that is, M and L are equal to 5: each non-leaf node contains at most M-1=5-1=4 keywords, including M, that is, 5 pointers to subtrees. If L is equal to 5, each leaf node can store up to 5 data items.

B+ Tree

The structure of the B+ tree is basically the same as that of the B tree. The only difference is that the leaf nodes of the B+ tree are connected by pointers to form a linked list, so it is easy to traverse all the leaf nodes, that is, to obtain all or all data items in a certain range of the search keyword. MySQL's InnoDB storage engine uses B+ tree as the index implementation.

The above is the detailed integration of multi-way search tree B-tree and B+ tree introduced by the editor. I hope it will be helpful to everyone. If you have any questions, please leave me a message and the editor will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website!

You may also be interested in:
  • Summary of B-tree index knowledge points in MySQL optimization
  • A brief discussion on MySQL B-tree index and index optimization summary
  • Complete B-tree algorithm Java implementation code
  • In-depth understanding of C language B-tree

<<:  Ubuntu 18.04 installs pyenv, pyenv-virtualenv, virtualenv, Numpy, SciPy, Pillow, Matplotlib

>>:  The solution to the page not refreshing after the route changes after react jumps

Recommend

Steps to repair grub.cfg file corruption in Linux system

Table of contents 1. Introduction to grub.cfg fil...

Example code for implementing a simple search engine with MySQL

Table of contents Preface Introduction ngram full...

A brief discussion on React native APP updates

Table of contents App Update Process Rough flow c...

How to set up PostgreSQL startup on Ubuntu 16.04

Since PostgreSQL is compiled and installed, you n...

JS achieves five-star praise case

This article shares the specific code of JS to ac...

A detailed introduction to for/of, for/in in JavaScript

Table of contents In JavaScript , there are sever...

Add crontab scheduled tasks to debian docker container

Now most of the Docker images are based on Debian...

How to replace all tags in html text

(?i) means do not match case. Replace all uppercas...

Example code of vue custom component to implement v-model two-way binding data

In the project, you will encounter custom public ...

MySQL index cardinality concept and usage examples

This article uses examples to explain the concept...

How to enter directory/folder in Linux without using CD command

As we all know, without the cd command, we cannot...

A colorful cat under Linux

Friends who have used the Linux system must have ...

jQuery implements the function of adding and deleting employee information

This article shares the specific code of jQuery t...