Data Structure - Tree (III): Multi-way Search Tree B-tree, B+ tree

Multi-way search tree

Height of a complete binary tree: O(log2N), where 2 is the logarithm
Height of a complete M-way search tree: O(logmN), where M is the logarithm, the number of nodes at each level of the tree
M-way search tree is mainly used to solve the problem of data storage that is too large to be loaded into memory. By increasing the number of nodes in each layer and storing more data in each node, more data can be stored in one layer, thereby reducing the height of the tree and reducing the number of disk accesses when searching for data.
Therefore, the more nodes in each layer and the more keywords each node contains, the shorter the tree height will be. However, determining data at each node is slower, but B-tree focuses on disk performance bottlenecks, so the overhead of searching data at a single node can be ignored.

B-Tree

B-tree is an M-way search tree. B-tree is mainly used to solve the imbalance of M-way search tree which causes the tree height to become higher, just like the performance problem caused by the degeneration of binary tree into linked list. The B-tree ensures the balance of the M-way search tree by controlling and adjusting the nodes in each layer, such as node separation, node merging, and splitting the parent node upward to add a new layer when a layer is full. The specific rules are as follows:

The number of child trees of the root node is between 2 and M, and the number of child trees of other non-leaf nodes is between M/2 and M. If the number of child trees exceeds M due to splitting, the parent node needs to be split recursively upwards. When a parent node that does not need to be split is found, the splitting stops. The splitting process continues until the root node. If the root node needs to be split, two roots will be generated, so a new root needs to be created to use these two roots as child nodes. At this time, the height of the tree will increase by 1.
The value of the keyword of each non-leaf node increases from left to right. The i-th keyword represents the smallest keyword in subtree i+1; (where i is between 1 and (2 to M) for the root node and between 1 and (M/2 to M) for other non-leaf nodes);
All data items of the B-tree are stored in leaf nodes. Non-leaf nodes do not store data. Non-leaf nodes only store keywords used to indicate the search direction, namely indexes. This helps load more non-leaf nodes into memory, making it easier to search for data;
All leaf nodes are at the same depth and each leaf node contains L/2 to L items of data.

Size options: M and L

M is the order or number of paths of the B-tree
L is the maximum number of data items that can be stored in each leaf node
In a B-tree, each node is a disk block, so M and L need to be determined based on the size of the disk block.

Disk block size and M calculation

Each non-leaf node stores keywords and pointers to child trees. The specific number is: for a B-tree of order M, each non-leaf node stores M-1 keywords and M pointers to child trees. Therefore, if the size of each keyword is 8 bytes (such as the Java long type is 8 bytes) and each pointer is 4 bytes, then each non-leaf node of the M-order B-tree requires: 8 * (M-1) + 4 * M = 12M - 8 bytes.
If it is stipulated that each non-leaf node (disk block) does not occupy more than 8K of memory, that is, 8192, then the maximum value of M is 683, that is, 683*12-8=8192.

Number of leaf node data items L

If the size of each data item is also 256 bytes, then since the disk block size is 8K, or 8192 bytes, and each leaf node can store L/2 to L data items, each leaf node can store at most: 8192/256=32 data items, that is, the size of L is 32.
The structure of a 5-order B-tree is as follows, that is, M and L are equal to 5: each non-leaf node contains at most M-1=5-1=4 keywords, including M, that is, 5 pointers to subtrees. If L is equal to 5, each leaf node can store up to 5 data items.

B+ Tree

The structure of the B+ tree is basically the same as that of the B tree. The only difference is that the leaf nodes of the B+ tree are connected by pointers to form a linked list, so it is easy to traverse all the leaf nodes, that is, to obtain all or all data items in a certain range of the search keyword. MySQL's InnoDB storage engine uses B+ tree as the index implementation.

The above is the detailed integration of multi-way search tree B-tree and B+ tree introduced by the editor. I hope it will be helpful to everyone. If you have any questions, please leave me a message and the editor will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website!

You may also be interested in:

Summary of B-tree index knowledge points in MySQL optimization
A brief discussion on MySQL B-tree index and index optimization summary
Complete B-tree algorithm Java implementation code
In-depth understanding of C language B-tree

<<: Ubuntu 18.04 installs pyenv, pyenv-virtualenv, virtualenv, Numpy, SciPy, Pillow, Matplotlib

>>: The solution to the page not refreshing after the route changes after react jumps

Detailed explanation of dynamic Christmas tree through JavaScript

Data Structure - Tree (III): Multi-way Search Tree B-tree, B+ tree

Detailed explanation of dynamic Christmas tree through JavaScript

JavaScript to dynamically load and delete tables

Unable to define IE6 font: 13px size is invalid, IE6 automatically displays a larger font solution

MySQL daily statistics report fills in 0 if there is no data on that day

Use Rem layout to achieve adaptive

Supplementary article on front-end performance optimization

A brief discussion on the magic of parseInt() in JavaScript

MySQL PXC builds a new node with only IST transmission (recommended)

Tutorial on installing jdk1.8 on ubuntu14.04

JavaScript removes unnecessary properties of an object

Recommend

Steps to repair grub.cfg file corruption in Linux system

Example code for implementing a simple search engine with MySQL

A brief discussion on React native APP updates

Solution to the problem of Failed when configuring IDEA to connect to MYSQL database

How to set up PostgreSQL startup on Ubuntu 16.04

JS achieves five-star praise case

A detailed introduction to for/of, for/in in JavaScript

Add crontab scheduled tasks to debian docker container

How to replace all tags in html text

Example code of vue custom component to implement v-model two-way binding data

Analyze the problem of transferring files and other parameters in the upload component of element-ui

MySQL index cardinality concept and usage examples

How to enter directory/folder in Linux without using CD command

A colorful cat under Linux

jQuery implements the function of adding and deleting employee information