MySQL database index interview questions (basic programmer skills)

Introduction

Indexing is one of the harder topics in MySQL, but it is also an essential basic skill for programmers. In everyday project development, indexes are an important means of SQL optimization. In job interviews, interviewers often use them to assess a candidate's grasp of database performance tuning. Every programmer should therefore understand how indexes work and be able to apply that understanding to real database queries. This article explains MySQL indexes from the ground up, focusing on the index principles that index design builds on. After reading it, you should be able to hold your own with an Alibaba interviewer when explaining how MySQL uses an index to find data. Ready? Let's get started.

Indexing principle

Before designing and optimizing indexes, let's first build a solid understanding of how indexes work, because every design and optimization decision rests on understanding the underlying principles.

Many people know that the same SQL query, on the same table with the same data, can perform very differently depending on whether an index is used. Why is the gap so large? Put simply, if the business data is a dictionary, then the index is the directory of that dictionary. If I ask you to look up a word and you do not use the directory, you can only flip through the pages one by one. If you are unlucky, you may have to reach the last page before finding the word. This is the legendary full table scan. If we search through the directory instead, we can quickly locate the page the word is on and then find the word itself. The power of an index lies in improving the efficiency of data queries. Now that we have an intuitive picture of indexes, let's take a deeper look.

The InnoDB storage engine in MySQL uses a B+ tree as the data structure for its indexes (the advantages and disadvantages of alternatives such as the B tree and the hash index are not the focus of this article and will not be covered here). Let's look at how the B+ tree of an index grows on disk, step by step.

1. Data page

In daily project development, most of our business data lives in relational databases, and the data of every table is eventually stored on the server's hard disk. Have you ever wondered how this data is actually stored? The tables we work with every day in MySQL are logical tables that people can understand. On disk, the data is stored as data pages. A data page is the basic unit of exchange between disk and memory. The InnoDB storage engine does not operate on the data pages on disk directly; it works with them through the buffer pool. A data page is made up of several areas, including the File Header, Page Header, User Records, Free Space, and Page Directory.

[Figure: Data page structure]

Adjacent data pages reference each other through a doubly linked list, and each data page holds a set of data rows. The data page size of MySQL's InnoDB storage engine is 16KB by default. InnoDB uniquely locates a data page by its page number, so each data page has its own page number. Each data page has a header (the File Header in InnoDB's page layout) that stores the page number of the current page, the page number of the next page, and the page number of the previous page.

Adjacent data pages reference each other through pointers, and each pointer holds the page number of the neighboring page. Each data page stores a contiguous segment of data. The record header of each data row stores the address offset of the next row, which can be thought of as a pointer to the next row. Inside a data page, the data rows therefore form a singly linked list, ordered by primary key id from smallest to largest.
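
The two linked structures described above can be sketched in a few lines of Python. This is only an illustration of the idea, not InnoDB's on-disk format: the class names Row and DataPage, the helper functions, and the sample rows are all made up for this example.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Row:
    """A data row; next_row stands in for the next-record offset in the record header."""
    id: int
    payload: str
    next_row: Optional["Row"] = None


@dataclass
class DataPage:
    """A simplified data page: prev/next page numbers plus a singly linked list of rows."""
    page_no: int
    prev_page: Optional[int] = None
    next_page: Optional[int] = None
    first_row: Optional[Row] = None

    def rows(self) -> Iterator[Row]:
        """Walk the rows of this page by following the next_row pointers."""
        row = self.first_row
        while row is not None:
            yield row
            row = row.next_row


def build_page(page_no: int, records: List[Tuple[int, str]]) -> DataPage:
    """Link the records of one page in ascending primary key order."""
    page = DataPage(page_no)
    prev = None
    for rid, payload in sorted(records):
        row = Row(rid, payload)
        if prev is None:
            page.first_row = row
        else:
            prev.next_row = row
        prev = row
    return page


def link_pages(pages: List[DataPage]) -> None:
    """Connect adjacent pages in both directions by page number."""
    for left, right in zip(pages, pages[1:]):
        left.next_page = right.page_no
        right.prev_page = left.page_no


pages = [build_page(1, [(3, "c"), (1, "a"), (2, "b")]),
         build_page(2, [(5, "e"), (4, "d")])]
link_pages(pages)
ids_in_order = [row.id for page in pages for row in page.rows()]
print(ids_in_order)  # [1, 2, 3, 4, 5]
```

Walking the pages in order and the rows inside each page in order yields every id in ascending order, which is the property the index relies on later.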

From the data page structure above, we can see that each time a row is inserted, the User Records area grows and the Free Space area shrinks accordingly. When the Free Space area is used up, a page split occurs and a new data page is formed. Note that if we use an auto-increment primary key in MySQL, the rows are guaranteed to be arranged in increasing id order. If we assign the primary key ourselves and it is not auto-incrementing, a row inserted later may have a smaller primary key than rows inserted earlier. In that case, when the page is split, MySQL rearranges the rows by primary key. You may wonder why the rows must be ordered by primary key. The reason lies in later queries: keeping the rows in the data pages in primary key order is what makes the index work at all.
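
Here is a minimal Python sketch of this behavior. It makes two loud simplifications: a page is assumed to hold a fixed number of ids (PAGE_CAPACITY is a hypothetical constant; real pages are limited by bytes, not row count), and a full page is always split at its midpoint, which is not necessarily what InnoDB does. The function name insert_id is also made up.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical: real pages are bounded by bytes (16KB), not by row count


def insert_id(pages: list, new_id: int) -> None:
    """Insert new_id into a list of sorted pages, splitting a page when it overflows."""
    if not pages:
        pages.append([new_id])
        return
    # Pick the last page whose smallest id is <= new_id (or the first page).
    firsts = [page[0] for page in pages]
    idx = max(bisect.bisect_right(firsts, new_id) - 1, 0)
    page = pages[idx]
    bisect.insort(page, new_id)      # keep ids ascending inside the page
    if len(page) > PAGE_CAPACITY:    # no free space left: split into two pages
        mid = len(page) // 2
        pages[idx:idx + 1] = [page[:mid], page[mid:]]


# Auto-increment style keys: every new id lands at the end.
sequential = []
for key in [1, 2, 3, 4, 5, 6]:
    insert_id(sequential, key)
print(sequential)  # [[1, 2], [3, 4, 5, 6]]

# Hand-assigned keys: 25 and 5 arrive late and must be placed among existing rows.
out_of_order = []
for key in [10, 20, 30, 40, 25, 5]:
    insert_id(out_of_order, key)
print(out_of_order)  # [[5, 10, 20], [25, 30, 40]]
```

In both cases every page stays sorted, and every id in a page is smaller than every id in the page after it.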

2. Page Directory

Each data page has its own page directory. The Page Directory area in the page structure above is used to locate data rows. The rows in a data page are divided into groups, and each slot in the page directory corresponds to one group. When querying, InnoDB uses the id to find the matching slot (the slots are ordered, so this can be done by binary search), follows the slot to the corresponding group of rows, and traverses that group until the target row is found.
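
The slot lookup can be sketched as follows. This is an illustration under assumptions: GROUP_SIZE is a hypothetical fixed group size, each slot is modeled as the largest id in its group, and the function names are invented for the example.

```python
import bisect

GROUP_SIZE = 4  # hypothetical fixed group size for this sketch


def build_directory(sorted_ids):
    """Split sorted ids into groups; each slot records the largest id in its group."""
    groups = [sorted_ids[i:i + GROUP_SIZE]
              for i in range(0, len(sorted_ids), GROUP_SIZE)]
    slots = [group[-1] for group in groups]
    return slots, groups


def find_in_page(slots, groups, target):
    """Binary-search the slots, then scan only the one matching group."""
    idx = bisect.bisect_left(slots, target)  # first slot whose largest id >= target
    if idx == len(slots):
        return None                           # target is larger than every id in the page
    for rid in groups[idx]:                   # short linear scan inside one group
        if rid == target:
            return (idx, rid)
    return None


slots, groups = build_directory(list(range(1, 11)))
print(slots)                            # [4, 8, 10]
print(find_in_page(slots, groups, 7))   # (1, 7)
print(find_in_page(slots, groups, 11))  # None
```

Instead of walking all ten rows, the lookup for id 7 inspects the slot list and then at most one group.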

3. Index principle analysis

(1) Indexing Basics

With the data page basics from the two sections above, the index principle becomes easier to understand. Without an index, every query is a full table scan: it traverses each row in a data page, then moves on through all the data pages until it finds a row that meets the condition. Query efficiency is therefore very low. How can we improve it? Could we keep a primary key directory, like the directory of a dictionary, that maps to data page numbers? The answer is yes, and this is exactly what MySQL does. MySQL uses a primary key directory, which is in fact the legendary primary key index, to optimize data queries. Each entry in the primary key directory contains two important elements: the smallest primary key in a data page, and the page number of that data page. Data can then be located through this directory.

For example, suppose the directory entry for page number 1 has a smallest primary key of id=1 and the next entry has a smallest primary key of id=8. To query the row with primary key id=5, you first search the primary key directory. Since id=5 is greater than id=1 but less than id=8, the row must be in the data page with page number 1.

In reality, a MySQL table will have many data pages, so the primary key directory will contain many entries. In that case, the right data page is located by binary search over the directory, and the row is then found inside that page.
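
The directory lookup can be sketched with Python's bisect module. The first two entries follow the example above (smallest ids 1 and 8, with id=1 on page 1); the page numbers 2 and 3 and the third entry are made up, as is the function name locate_page.

```python
import bisect

# Hypothetical directory: (smallest primary key in the page, page number), sorted by key.
directory = [(1, 1), (8, 2), (15, 3)]


def locate_page(directory, target_id):
    """Return the number of the page that could contain target_id, or None."""
    min_ids = [min_id for min_id, _ in directory]
    # The rightmost entry whose smallest id is <= target_id is the candidate page.
    idx = bisect.bisect_right(min_ids, target_id) - 1
    if idx < 0:
        return None  # target is smaller than every key in the table
    return directory[idx][1]


print(locate_page(directory, 5))   # 1
print(locate_page(directory, 8))   # 2
print(locate_page(directory, 0))   # None
```

The directory only narrows the search to one page. Whether the row actually exists is decided by searching inside that page.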

(2) Index page

Internet companies are growing rapidly and their business volume is huge, so the amount of data in their databases is very large as well. Tables with millions or tens of millions of rows are common. With the primary key directory described above, a very large number of primary keys and data page numbers would need to be stored in one place. Even with binary search, querying such a directory becomes relatively inefficient.

MySQL actually stores index entries in index pages. When the amount of data is large, there are correspondingly more index entries, so dedicated index pages are used to hold them. Above these index pages, a further layer locates the right index page in the same way, using the smallest primary key and the index page number. This gives a layered structure in which every id stored in an upper layer is the smallest id of the page it points to.

If the entries in an index page keep growing, the index page is split as well. The index pages thus form multiple levels, and the three kinds of pages (upper-level index pages, index pages, and data pages) together make up what we call a B+ tree. This structure lets a query finish far more efficiently than a full table scan. Only the leaf nodes of a B+ tree store row data. The B+ tree built on the primary key is the primary key index, also called the clustered index. Its fundamental idea is divide and conquer. Too much data? Split it into many data pages. Too many data pages? Organize them with index pages. Too many index pages? Index them with another level of index pages.
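
To get a feel for how few levels this takes, here is a rough back-of-the-envelope calculation in Python. Every size except the 16KB page is an assumption made for this sketch: an index entry is taken to be 14 bytes (an 8-byte key plus a 6-byte page pointer), an average row is taken to be 1KB, and page headers and other overhead are ignored.

```python
PAGE_SIZE = 16 * 1024   # InnoDB default page size in bytes
ENTRY_SIZE = 8 + 6      # assumed: 8-byte key plus 6-byte page pointer per index entry
ROW_SIZE = 1024         # assumed average row size in bytes

entries_per_index_page = PAGE_SIZE // ENTRY_SIZE  # 1170
rows_per_data_page = PAGE_SIZE // ROW_SIZE        # 16


def max_rows(levels: int) -> int:
    """Rough row capacity of a tree with `levels` levels, the data page level included."""
    return entries_per_index_page ** (levels - 1) * rows_per_data_page


print(max_rows(2))  # 18720
print(max_rows(3))  # 21902400
```

Under these assumptions, a tree only three levels deep already covers roughly 20 million rows, which is why a lookup by primary key touches only a handful of pages.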

Let's take a look at how a query runs through the B+ tree. For example, to query the row with id=3, the search starts in the top-level index page, which determines that it should go to index page 3. Within index page 3, id=3 falls under the entry whose smallest id is 1, so the search continues to index page 1. That index page in turn determines that the row should be in the data page with page number 1. Finally, we traverse this data page and find the corresponding row.
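
The descent can be sketched in Python as a loop that repeats the directory lookup at each level. This is a toy model, not InnoDB's implementation: the classes IndexPage and LeafPage, the tiny tree, and the row values are all invented, and a dict stands in for the search inside a data page.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class LeafPage:
    """A data page: the only level that stores rows."""
    page_no: int
    rows: Dict[int, str]


@dataclass
class IndexPage:
    """An index page: the smallest id of each child page, plus the children themselves."""
    min_ids: List[int]
    children: List[Union["IndexPage", LeafPage]]


def search(node, target_id):
    """Descend from the root to a data page; return (page_no, row or None, path taken)."""
    path = []
    while isinstance(node, IndexPage):
        # Rightmost child whose smallest id is <= target_id (or the first child).
        idx = max(bisect.bisect_right(node.min_ids, target_id) - 1, 0)
        path.append(idx)
        node = node.children[idx]
    return node.page_no, node.rows.get(target_id), path


leaves = [LeafPage(n, {i: f"row-{i}" for i in range(3 * n - 2, 3 * n + 1)})
          for n in range(1, 5)]                       # pages 1..4 hold ids 1..12
left = IndexPage([1, 4], leaves[0:2])
right = IndexPage([7, 10], leaves[2:4])
root = IndexPage([1, 7], [left, right])

print(search(root, 3))    # (1, 'row-3', [0, 0])
print(search(root, 11))   # (4, 'row-11', [1, 1])
```

Each step discards every subtree but one, so the number of pages read grows with the height of the tree rather than with the number of rows.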

The B+ tree above, made up of index pages and data pages, is a clustered index. We can of course also create ordinary (secondary) indexes on other fields. The leaf nodes of a secondary index store the indexed field values together with the corresponding primary key id, not the full row. Such an index therefore has the back-to-table problem: after finding the id, the query must go back to the clustered index and look up the full row by that id. Only then can all the columns requested by select * be returned. This extra lookup can be avoided with a covering index, where the index itself already contains every column the query needs.
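
The difference between the two access paths can be sketched in Python. Everything here is hypothetical: the table with id, name, and age columns, the dict standing in for the clustered index, the sorted list standing in for a secondary index on name, and the function names.

```python
import bisect

# Stand-in for the clustered index: primary key -> full row.
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 41},
}

# Stand-in for a secondary index on name: sorted (name, primary key) pairs.
name_index = sorted((row["name"], pk) for pk, row in clustered.items())


def find_ids_by_name(name):
    """Search the secondary index; it yields primary keys, not full rows."""
    i = bisect.bisect_left(name_index, (name,))
    ids = []
    while i < len(name_index) and name_index[i][0] == name:
        ids.append(name_index[i][1])
        i += 1
    return ids


def select_all_by_name(name):
    """All columns requested: each id costs one more lookup in the clustered index."""
    return [clustered[pk] for pk in find_ids_by_name(name)]


def select_id_name_by_name(name):
    """Only id and name requested: the secondary index already holds both values."""
    return [{"id": pk, "name": name} for pk in find_ids_by_name(name)]


print(select_all_by_name("bob"))      # [{'id': 2, 'name': 'bob', 'age': 25}]
print(select_id_name_by_name("bob"))  # [{'id': 2, 'name': 'bob'}]
```

The second function never touches the clustered dict, which is the saving a covering index provides.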

Summary

This article walked step by step through the indexing principle of MySQL's InnoDB engine, building up the B+ tree index structure and explaining how a query finds its data. I hope it has given you a deeper understanding of indexes. In a later article, I will look from a practical perspective at how to design indexes and how to deal with index failures.
