Preface

Most people have used MySQL and its indexes, but do you know how to create appropriate ones? While a table is small, a poorly chosen index has little effect on performance, but as the data grows, performance can drop sharply. This article is a summary of MySQL indexes. If there are any errors, please point them out in the comments.

Index Basics

We all know how to look up a word in a dictionary: first find the word's page number in the index, then turn to that page to read the entry. MySQL indexes work in a similar way: MySQL first finds the value in the index, then uses the matching index record to locate the corresponding data row. Take the following SQL statement:

select * from student where code='2333'

If there is an index on the code column, MySQL uses it to find the rows whose value is '2333', then reads all the data in those rows and returns it.

Index Type

B-Tree Index

The vast majority of indexes are B-Tree indexes (B-Tree or a variant such as B+Tree), and this is the type we usually use. In MySQL, the MyISAM storage engine uses B-Tree and InnoDB uses B+Tree; the difference between the two is easy to look up online.

A tree-structured index speeds up data access: the storage engine no longer needs to scan the entire table to find the required data. Instead, it starts at the root node and descends level by level, comparing the search value with the keys in each node to choose the next child, much as a binary search narrows its range. Because each step discards most of the remaining data, the index can greatly improve query speed.

B-Tree indexes support the following types of queries. Assume that the student table has a single multi-column index on (name, age, weight). The following queries can all use this index:
Match full column values: the columns of the index are matched by exact value. For example, the query name='abc' and age=12 uses the first and second columns of the index.
Match a leftmost prefix: only the leading part of the index is used. For example, the query name='ggg' uses only the first column of the index, and the query name='ggg' and age=12 uses the first and second columns.
Match a column prefix: only the beginning of a column value is matched. For example, the query name like 'g%' finds records whose name starts with g and uses the first column of the index.
Match a range of values: for example, the query name > 'abc' and name < 'bcd'.
Match one column exactly and another by range: for example, the query name='abc' and age > 12.

In general, a B-Tree index is suitable for searches on a leftmost prefix of the index: the query conditions must start with the first index column and must not skip columns. With the index above, queries on name, on name and age, or on name, age and weight can use the index, but queries on age alone, or on age and weight, cannot. The order in which the conditions are written in the WHERE clause does not matter, because the optimizer matches them to the index columns.

Hash Index

A hash index is built on a hash table and only takes effect for exact matches on all columns of the index. In MySQL, only the Memory engine explicitly supports hash indexes, and it is also that engine's default index type. InnoDB cannot create hash indexes, but it has a feature called the adaptive hash index: when certain index values are accessed very frequently, the engine builds a hash index in memory on top of the B-Tree index, so that the B-Tree index gains some of the advantages of a hash index. This behavior is automatic and internal: you cannot choose what gets hashed, although the feature as a whole can be switched on or off with the innodb_adaptive_hash_index system variable.

High-performance indexing strategy

Below are some common indexing strategies.

Independent columns

This one is simple: the indexed column must stand alone on one side of the comparison. If it is part of an expression or passed to a function, the index cannot be used. For example:

select * from student where age+1=12

Even if the age column is indexed, this query cannot use the index. Writing the condition as age=11 lets MySQL use it.

Prefix Indexes and Index Selectivity

If you need to index a very long string column, indexing the whole column makes the index large and slow. One optimization is to simulate a hash index: compute a hash value of the column, store it in an extra column, and index that column. Another is to create a prefix index, which indexes only the first characters of the field. This greatly reduces the space used, and the index is much faster to build. But prefix indexes also have disadvantages: MySQL cannot use them for ORDER BY or GROUP BY, and it cannot use them as covering indexes.
The key is to decide how many characters to index: the prefix should be short enough to save space, yet long enough to keep the index selective. There are two ways to help choose the length.

One way is to check that the prefix values are evenly distributed, that is, that no prefix is shared by a disproportionate number of rows. For example, to evaluate indexing the first 3 characters of the name field, count the rows for each 3-character prefix and look at the most frequent ones (for instance the top 8). The prefix is reasonable if those counts are small and similar to one another.
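The distribution check can be sketched as a small runnable script. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it needs no server. The person table contents and the helper name top_prefixes are hypothetical, and SQLite's substr(name, 1, 3) plays the role that LEFT(name, 3) would play in MySQL.

```python
import sqlite3

# Hypothetical sample data; a real check would run against the actual table.
NAMES = ["anderson", "andrews", "andrade", "baker", "bailey",
         "carter", "carlson", "davis", "evans", "fisher"]


def top_prefixes(conn, length, limit=8):
    """Return (prefix, row_count) pairs for the most common prefixes."""
    return conn.execute(
        "SELECT substr(name, 1, ?) AS pref, COUNT(*) AS cnt "
        "FROM person GROUP BY pref ORDER BY cnt DESC, pref LIMIT ?",
        (length, limit),
    ).fetchall()


# Build an in-memory table so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.executemany("INSERT INTO person (name) VALUES (?)", [(n,) for n in NAMES])

# Show the most common 3-character prefixes and how many rows share each.
for pref, cnt in top_prefixes(conn, 3):
    print(pref, cnt)
```

With this sample data, the 3-character prefix "and" is shared by three rows, while a 5-character prefix gives every prefix a count of 1, so the longer prefix discriminates better here.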
If the counts for the most common prefixes are relatively large, the prefix does not discriminate well enough, and the number of indexed characters should be increased until the selectivity of the prefix is close to the selectivity of the entire column. In other words, these counts should be as small as possible.

The other way is to calculate the selectivity of the full column and make the selectivity of the prefix close to it. The following statement calculates the full-column selectivity:

-- number of distinct strings / total number of rows = full-column selectivity
select count(distinct name)/count(*) from person;

The following statement calculates the selectivity of the first 3 characters of the column:

-- number of distinct 3-character prefixes / total number of rows
select count(distinct left(name,3))/count(*) from person;

Keep increasing the prefix length until the selectivity approaches the full-column selectivity and a longer prefix no longer improves it noticeably.

Creation Method

-- assume the optimal length is 4
alter table person add key (name(4));

Multi-column indexes

Many people have this misunderstanding: if a query combines conditions on several fields with AND, wouldn't indexing each field separately give the best performance? It does not. MySQL will generally pick just one of those indexes for the lookup. In this case you should create a multi-column index (also called a joint index) so that several fields can be used in a single index search. Note that the query must use the index columns starting from the leftmost one. MySQL 5.0 and later versions introduced the "index merge" strategy.
With index merge, multiple single-column indexes can be used together to some extent, as in the following query:

-- MySQL uses the name index and the age index to find rows separately, then merges the results
-- with AND instead of OR, it finds the rows from each index and keeps the intersection
select * from person where name = 'bob' or age=12

However, relying on this is not recommended. With many AND or OR conditions, the caching, sorting and merging steps of the algorithm consume a lot of CPU and memory.

Choose the appropriate index column order

In a multi-column B-Tree index, the column order determines the sort order: entries are sorted by the leftmost column first, then by the second column, and so on. As a rule of thumb, a good multi-column index puts the most selective column first and the less selective ones after it, so that the index filters out rows as early as possible. For how to calculate selectivity, see the section on prefix indexes.

Clustered Index

A clustered index is not a separate index type but a way of storing data; the details depend on the implementation. InnoDB's clustered index stores the index values and the data rows in the same structure. Because a row cannot be stored in two places at once, a table can have only one clustered index.

InnoDB clusters the data by the primary key. If no primary key is defined, InnoDB uses a unique index whose columns are all NOT NULL instead. If there is no such index either, InnoDB implicitly defines a primary key to serve as the clustered index.

The main advantage of a clustered index is that it keeps related data together, which reduces disk I/O and improves query efficiency. But there are also disadvantages: a lookup through any other (secondary) index finds the primary key value first and then needs a second search in the clustered index to fetch the row, and inserting rows out of primary key order is slower than inserting them in order.
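One such cost, the extra lookup through a secondary index, can be pictured with a toy model. This is only a sketch in Python and not InnoDB's actual data structures: a dict stands in for the clustered index, a sorted list of (name, primary key) pairs stands in for a secondary index on name, and the sample rows and the function name find_by_name are hypothetical.

```python
from bisect import bisect_left

# Clustered index: primary key -> full row. The row data itself lives here.
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 12},
    3: {"id": 3, "name": "carol", "age": 25},
    4: {"id": 4, "name": "bob", "age": 41},
}

# Secondary index on name: sorted (name, primary key) pairs, no row data.
secondary_name = sorted((row["name"], pk) for pk, row in clustered.items())


def find_by_name(name):
    """Find rows by name: search the secondary index, then the clustered index."""
    rows = []
    # Step 1: locate the first entry for this name in the secondary index.
    i = bisect_left(secondary_name, (name,))
    while i < len(secondary_name) and secondary_name[i][0] == name:
        pk = secondary_name[i][1]
        # Step 2: use the primary key for a second lookup to fetch the full row.
        rows.append(clustered[pk])
        i += 1
    return rows


print(find_by_name("bob"))
```

Every matching entry in the secondary index costs one more lookup in the clustered structure, which is the second search that the text refers to.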
Covering Index

Simply put, a covering index contains all the columns the query needs, so the data can be read from the secondary index alone, with no second search in the clustered index by primary key. InnoDB stores the indexed values in the leaf nodes of the index, so if every field a query needs is included in one index and that index is used, the query can be much faster. For example:

-- if name is indexed, the name value is read directly from the index leaf node, with no second search
select name from person where name = 'abc'
-- if there is a joint index on (name, age), the data is returned directly, with no second search
select name,age from person where name='abc' and age=12

Sorting using an index

MySQL can also use an index for sorting. The index can be used only when the ORDER BY columns appear in the same order as in the index and all columns are sorted in the same direction (all ascending or all descending). Note: ORDER BY may list fewer columns than the index has, but they must form a leftmost prefix of the index, in order. For example:

-- assume a joint index on (name, age, sex)
-- these can use the index for sorting
select ... order by name desc, age desc
select ... order by name desc,age desc,sex desc
-- these cannot use the index for sorting
select ... order by name desc,sex desc
select ... order by name desc,age asc

Finish

This article is based on MySQL 5.5. Newer versions may use different strategies.