Preface Since the most important data structure in MySQL index is B+ tree, let's first briefly talk about the principle of B+ tree. B+ Tree Principle 1. Data Structure B Tree refers to Balance Tree, which is a balanced tree. A balanced tree is a search tree where all leaf nodes are located at the same level. B+ Tree is implemented based on B Tree and leaf node sequential access pointers. It has the balance of B Tree and improves the performance of interval query through sequential access pointers. In a B+ Tree, the keys in a node are arranged non-decreasingly from left to right. If the left and right adjacent keys of a pointer are keyi and keyi+1, and they are not null, then all the keys of the node pointed to by the pointer are greater than or equal to keyi and less than or equal to keyi+1. 2. Operation When performing a search operation, first perform a binary search on the root node to find a pointer to a key, and then recursively search the node pointed to by the pointer. Until a leaf node is found, then a binary search is performed on the leaf node to find the data corresponding to the key. Insertion and deletion operations will destroy the balance of the balanced tree. Therefore, after the insertion and deletion operations, the tree needs to be split, merged, rotated, etc. to maintain the balance. 3. Comparison with Red-Black Tree Balanced trees such as red-black trees can also be used to implement indexes, but file systems and database systems generally use B+ Tree as the index structure, mainly for the following two reasons: 1. Fewer searches The time complexity of a balanced tree search operation is equal to the tree height h, which is roughly O(h)=O(logdN), where d is the out-degree of each node. The out-degree of the red-black tree is 2, while the out-degree of the B+ Tree is generally very large, so the tree height h of the red-black tree is obviously much larger than that of the B+ Tree, and the number of searches is also greater. (II) Using the disk pre-reading feature In order to reduce disk I/O, the disk is often not read strictly on demand, but reads in advance every time. During the pre-read process, the disk performs sequential reading. Sequential reading does not require disk seek and only requires a short rotation time, so the speed is very fast. The operating system generally divides memory and disk into solid-size blocks, each of which is called a page, and memory and disk exchange data in units of pages. The database system sets the size of an index node to the size of a page so that a node can be fully loaded in one I/O. And by utilizing the pre-reading feature, adjacent nodes can also be pre-loaded. MySQL Indexes Indexes are implemented at the storage engine level, not at the server level, so different storage engines have different index types and implementations. 1. B+Tree Index Is the default index type for most MySQL storage engines. Since there is no need to scan the entire table, only the tree needs to be searched, so the search speed is much faster. In addition to searching, it can also be used for sorting and grouping. You can specify multiple columns as index columns, and multiple index columns together form the key. Applicable to full key value, key value range and key prefix search, among which key prefix search is only applicable to the leftmost prefix search. If the search is not in the order of the index columns, the index cannot be used. InnoDB's B+Tree index is divided into primary index and auxiliary index. The data field of the leaf node of the primary index records the complete data record. This indexing method is called a clustered index. Because it is impossible to store data rows in two different places, a table can have only one clustered index. The data field of the leaf node of the auxiliary index records the value of the primary key. Therefore, when using the auxiliary index for search, you need to first find the primary key value and then search in the primary index. 2. Hash Index Hash indexes can be searched in O(1) time, but they lose orderliness: they cannot be used for sorting and grouping; they only support exact searches, and cannot be used for partial searches or range searches. The InnoDB storage engine has a special feature called "adaptive hash index". When an index value is used very frequently, a hash index will be created on top of the B+Tree index. This gives the B+Tree index some of the advantages of the hash index, such as fast hash search. 3. Full-text indexing The MyISAM storage engine supports full-text indexing, which is used to find keywords in text instead of directly comparing them for equality. The search condition uses MATCH AGAINST instead of the normal WHERE. The full-text index is implemented using an inverted index, which records the mapping of keywords to the documents in which they are located. The InnoDB storage engine also began to support full-text indexing in MySQL version 5.6.4. 4. Spatial Data Index The MyISAM storage engine supports spatial data indexes (R-Tree) and can be used for geographic data storage. Spatial data indexes index data from all dimensions and can effectively use any dimension for combined queries. GIS related functions must be used to maintain data. Index optimization 1. Independent columns When performing a query, the index column cannot be part of an expression or a function parameter, otherwise the index cannot be used. For example, the following query cannot use the index on the actor_id column: SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5; 2. Multi-column index When you need to use multiple columns as conditions for a query, using a multi-column index will provide better performance than using multiple single-column indexes. For example, in the following statement, it is best to set actor_id and film_id as multi-column indexes. SELECT film_id, actor_ id FROM sakila.film_actor WHERE actor_id = 1 AND film_id = 1; 3. Order of index columns Put the most selective index columns first. The selectivity of an index refers to the ratio of unique index values to the total number of records. The maximum value is 1, in which case each record has a unique index corresponding to it. The higher the selectivity, the more efficient the query. For example, in the results shown below, customer_id has a higher selectivity than staff_id, so it is best to put the customer_id column at the front of the multi-column index. SELECT COUNT(DISTINCT staff_id)/COUNT(*) AS staff_id_selectivity, COUNT(DISTINCT customer_id)/COUNT(*) AS customer_id_selectivity, COUNT(*) FROM payment; staff_id_selectivity: 0.0001 customer_id_selectivity: 0.0373 COUNT(*): 16049 4. Prefix Index For columns of BLOB, TEXT, and VARCHAR types, you must use a prefix index to index only the beginning characters. The selection of prefix length needs to be determined based on index selectivity. 5. Covering Index The index contains the values of all the fields that need to be queried. It has the following advantages:
6. Leftmost Prefix Principle As the name implies, the leftmost is first, and any consecutive index can be matched starting from the leftmost. The essence of joint index: When you create a (a,b,c) joint index, it is equivalent to creating a (a) single-column index. If you want the (a,b) joint index and the (a,b,c) joint index to take effect, you can only use the three combinations of a and a,b and a,b,c. Advantages of Indexes
Conditions for using indexes
summary Index is a very important function in MySQL. If you can make good use of indexes in daily development, you can greatly improve the execution performance of SQL statements, so it is very necessary to understand the principles behind it. The above is the full content of this article. I hope it will be helpful for everyone’s study. I also hope that everyone will support 123WORDPRESS.COM. You may also be interested in:
|
<<: 5 Ways to Send Emails in Linux Command Line (Recommended)
>>: vue $set implements assignment of values to array collection objects
Table of contents First install wget View Help Ma...
Table of contents 1. Installation requirements 2....
In a web page, the <input type="file"...
Download the Windows version of Nginx from the Ng...
This article example shares the specific code of ...
Preface Regarding HugePages and Oracle database o...
When we are doing front-end development, we will ...
1. When the mobile terminal processes the list sl...
When we encounter a fault, we often think about h...
Table of contents 1. Component Registration 1.1 G...
Remark: The amount of data in this article is 1 m...
The following is the code for building an ssh ser...
This article shares the specific code for JavaScr...
Why optimize: With the launch of the actual proje...
When the Docker container exits, the file system ...