Share some key interview questions about MySQL index

Preface

An index is a data structure that sorts one or more column values in a database and helps the database to efficiently obtain data.

If we use an analogy, the index in the database is equivalent to the table of contents in a book. When we want to find a knowledge point in the book, we can directly look for it in the table of contents instead of searching on every page of the book. However, this also brings up a disadvantage of the index. When modifying the database, it takes more time to modify the index.

But do you really understand MySQL indexes? These questions will help you understand several important knowledge points about indexes.

1. What is the leftmost prefix principle?

The following answers are all based on MySQL's InnoDB engine

For example, for the following table

If we create an index based on the name field, using a B+ tree structure, the approximate index structure is as follows

If we want to perform a fuzzy search to find the IDs of all people whose names begin with "张", the SQL statement is

select ID from table where name like '张%'

Because in the B+ tree structure index, the index items are sorted according to the order of the fields that appear in the index definition. When searching, the index can quickly locate Zhang Yi with an ID of 100, and then directly traverse to the right all people whose names begin with Zhang until the condition is not met.

That is to say, after we find the first person who meets the conditions, we can directly traverse to the right. Since the index is ordered, all people who meet the conditions will be gathered together.

This method of positioning to the left and then traversing to the right is what we call the leftmost prefix principle.

2. Why use B+ tree instead of hash table as index?

1. The hash table maps the index field into the corresponding hash code and then stores it in the corresponding position. In this way, if we want to perform fuzzy search, the hash table structure is obviously not supported and we can only traverse the table. The B+ tree can quickly find the corresponding data through the leftmost prefix principle.

2. If we want to perform a range search, such as searching for people with IDs between 100 and 400, the hash table does not support this either, and we can only traverse the entire table.

3. The index field is mapped into a hash code through hashing. If many fields happen to be mapped to the hash code of the same value, the resulting index structure will be a very long linked list, which will greatly increase the search time.

3. What is the difference between a primary key index and a non-primary key index?

For example, for the following table (actually, a k field is added to the above table), ID is the primary key.

The schematic diagram of primary key index and non-primary key index is as follows:

Where R represents the value of an entire row.

It is not difficult to see from the figure that the difference between the primary key index and the non-primary key index is: the leaf nodes of the non-primary key index store the primary key value, while the leaf nodes of the primary key index store the entire row of data. The non-primary key index is also called a secondary index, and the primary key index is also called a clustered index.

Let's perform queries based on these two structures to see what the differences are in their queries.

1. If the query statement is select * from table where ID = 100 , that is, the primary key query method, then only the B+ tree of ID needs to be searched.

2. If the query statement is select * from table where k = 1 , that is, a non-primary key query, the k index tree is searched first to obtain ID = 100, and then the ID index tree is searched again. This process is also called table return.

Now, do you know the difference between them?

4. Why is it recommended to use an auto-incrementing primary key index?

For this primary key index tree

If we insert a row of data with ID = 650, we can just insert it on the far right.

However, if a row of data with ID = 350 is inserted, since the B+ tree is ordered, the leaf nodes below need to be moved to make room for inserting the data with ID = 350, which will be time-consuming. If the data page where R4 is located is full, a page split operation will be required, which will be even worse.

However, if our primary key is auto-incremental, each inserted ID will be larger than the previous one, so we only need to insert it at the end each time, without moving the position, splitting, etc., which can improve performance. That is why it is recommended to use an index with auto-increment primary key.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. Thank you for your support of 123WORDPRESS.COM.

You may also be interested in:

Mysql FAQ Collection
Introduction to the latest 21 MySQL high-frequency interview questions in 2019
Advanced MySQL Database Interview Questions with Answers
Answers to several high-frequency MySQL interview questions
A collection of MySQL questions and answers that are often asked in interviews to lay a solid foundation

<<: A brief analysis of the use of watchEffect in Vue3

>>: The process of installing SVN on Ubuntu 16.04.5LTS