To be honest, this question involves a lot of core knowledge of MySQL, and it can bring up a lot of topics, just like when you are testing your knowledge of computer networks and are asked, "What happens after you enter the URL and press Enter?" to see how much you can say. To be honest, I was asked this question during my previous interview at Tencent, but I gave a very poor answer. I had never thought about the relevant reasons before, which resulted in me being unable to come up with an answer for the time being. So today, I will take you to talk about the reasons in detail. I believe you will gain something after reading it, otherwise you can hit me. Start pretending: Classification discussion If a SQL statement executes very slowly, does that mean it will be slow every time it is executed? Or is it normal in most cases, and sometimes very slow? So I think we have to discuss the following two situations. 1. Most of the time it is normal, but it may be very slow occasionally. 2. When the amount of data remains unchanged, this SQL statement has always been executed very slowly. For these two situations, let's analyze the possible reasons. For occasional slow situations An SQL statement runs normally in most cases, but occasionally runs very slowly. In this case, I think there is nothing wrong with the writing of the SQL statement itself, but it is caused by other reasons. What could that be? The database is refreshing dirty pages and I am helpless When we want to insert a piece of data into the database or update a piece of data, we know that the database will update the data of the corresponding field in the memory. However, after the update, these updated fields will not be synchronized and persisted to the disk immediately. Instead, these updated records will be written to the redo log. When it is idle, the latest data will be synchronized to the disk through the log in the redo log. However, the capacity of the redo log is limited. If the database is always busy and updated frequently, the redo log will soon be full. At this time, there is no way to wait until idle time to synchronize the data to the disk. You can only suspend other operations and focus on synchronizing the data to the disk. At this time, it will cause our normal SQL statements to suddenly execute very slowly. Therefore, when the database is synchronizing data to the disk, it may cause our SQL statements to execute very slowly. What can I do if I can't get the lock? This is relatively easy to think of. The statement we want to execute happens to involve a table that is being used by someone else and is locked. We cannot get the lock and can only wait for others to release the lock. Or, the table is not locked, but a row to be used is locked. At this time, there is nothing I can do. If we want to determine whether it is really waiting for the lock, we can use the show processlist command to view the current status. Here I want to remind you that it is best to record some commands. Anyway, I have been asked several commands and I don’t know how to write them, haha. Next, let's analyze the second situation. I think the analysis of the second situation is the most important. For the situation that has always been so slow If this SQL statement executes so slowly every time when the amount of data is the same, then you should carefully consider your SQL writing. Let's analyze the reasons that may cause our SQL statements to execute very poorly. Let's first assume that we have a table with the following two fields, the primary key id, and two common fields c and d. mysql> CREATE TABLE `t` ( `id` int(11) NOT NULL, `c` int(11) DEFAULT NULL, `d` int(11) DEFAULT NULL, PRIMARY KEY (`id`) )ENGINE=InnoDB; Sorry, no index is used. The index is not used. I think this reason is something that many people can think of. For example, if you want to query this statement select * from t where 100 <c and c < 100000; The field is not indexed It happens that there is no index on your c field, so sorry, you can only scan the whole table, and you will not experience the fun of not having an index, so this time the query statement will be very slow. The field has an index, but the index is not used Well, now you have added an index to the c field, and then run another query: select * from t where c - 1 = 1000; I would like to ask you a question, will index query be used when querying in this way? The answer is no. If we perform calculations on the left side of the field, then I'm sorry, the index will not be used during the query. So, everyone should pay attention to the situation where there is an index on this field, but due to your own negligence, the system does not use the index. The correct query should be as follows select * from t where c = 1000 + 1; Someone might say, if there is an operation on the right side, then we can use the index? Wouldn't the database automatically optimize for us and automatically convert c - 1=1000 to c = 1000+1? I'm sorry, I really can't help you, so you have to be careful. Function operation results in no use of index If we perform function operations on the fields during the query, the index will not be used. For example select * from t where pow(c,2) = 1000; Here I am just giving an example, assuming that the function pow is to raise c to the power of n. In fact, there may not be a function pow(c,2). In fact, this is very similar to the calculation on the left above. So, when a statement executes very slowly, it may be that the statement does not use the index. However, you need to be able to analyze the specific reasons why the index is not used. The three reasons I listed above should be the most common ones. Haha, the database chose the wrong index. When we perform a query operation, for example select * from t where 100 < c and c < 100000; We know that there is a difference between primary key index and non-primary key index. The value stored in the primary key index is the data of the entire row field, while the value stored in the non-primary key index is not the data of the entire row field, but the value of the primary key field. If you don't understand it, you can read my article: Interview Tips: MySQL Index Related, which talks about the difference between primary key index and non-primary key index. In other words, if we use the index of the c field, we will eventually find the value of the corresponding primary key, and then use the primary key index based on the primary key value to find the entire row of data to be returned. Okay, after talking so much, I actually just want to tell you that even if you have an index on the c field, the system will not necessarily use the index on the c field, but may directly scan the entire table to find all data that meets 100 < c and c < 100000. Why is this happening? In fact, when the system executes this statement, it will make a prediction: Will the number of rows scanned by the c index be less, or will the number of rows scanned by directly scanning the entire table be less? Obviously, the fewer scan lines, the better, because fewer scan lines means fewer I/O operations. If the entire table is scanned, the number of scans is the total number of rows in the table, assuming it is n; if index c is used, after we find the primary key through index c, we have to use the primary key index to find the data of the entire row, that is, we need to use the index twice. Moreover, we don’t know how many rows of data meet the condition 100 c < and c < 10000. What if all the data in this table meets the condition? This means that when using the c index, not only does it scan n rows, but it also has to scan the index twice for each row of data. Therefore, it is possible for the system to perform a full table scan instead of an index. How does the system judge? The judgment comes from the system's prediction. That is, if the c field index is to be used, the system will predict how many rows will need to be scanned to use the c field index. If it predicts that a large number of rows will be scanned, it may not use the index and scan the entire table directly. So the question is, how does the system make predictions and judgments? Let me tell you here how the system makes judgments, although my neck is a little sore from writing so much. The system makes judgments based on the index's discrimination. The more different values an index has, the fewer indexes with the same value, which means the index has a higher discrimination. We also call the discrimination cardinality, that is, the higher the discrimination, the larger the cardinality. Therefore, a larger cardinality means that there are fewer rows that meet the condition 100 < c and c < 10000. Therefore, the larger the cardinality of an index, the more advantages it has in index query. So the question is, how do we know the cardinality of this index? Of course, the system will not traverse all the data to obtain the cardinality of an index, as the cost is too high. The index system predicts the cardinality of the index by traversing part of the data, that is, by sampling. After talking so much, here comes the key point, which is sampling. There is a possibility of errors. That is to say, the cardinality of the index c is actually very large, but when sampling, unfortunately, the cardinality of this index is predicted to be very small. For example, if the cardinality of the data you sampled happens to be very small, you might mistakenly think that the cardinality of the index is very small. Then, haha, the system will not use the c index, but will directly scan the entire file. So, after saying so much, we come to the conclusion that due to statistical errors, the system did not use the index but a full table scan, which is also the reason why our SQL statements execute very slowly. Let me make it clear here that the system determines whether to use the index and the prediction of the number of scan rows is actually only one of the reasons. Whether this query statement needs to use a temporary table or whether it needs to be sorted will also affect the system's choice. However, sometimes we can also query by forcing the index, for example select * from t force index(a) where c < 100 and c < 100000; We can also show index from t; To check whether the cardinality of the index matches the actual analyze table t; If it is inconsistent with the actual situation, we can recalculate the cardinality of the index. You can use this command To re-analyze the statistics. Since the cardinality of the index is predicted incorrectly, it also means that when our query statement has multiple indexes, the system may also choose the wrong index, which may also be a reason why SQL execution is very slow. Okay, that’s all I have to say for now. I think it’s great that you have come up with so much. Let me make a summary below. ### Summarize The above is my summary and understanding. For the last part, I am afraid that many people who don’t understand databases well may choose the wrong index, so I explained it in detail. Now I will summarize the above. When a SQL statement is executed very slowly, we will discuss two situations: 1. Most of the time it is normal, but occasionally it is slow, there are the following reasons (1) The database is refreshing dirty pages. For example, the redo log is full and needs to be synchronized to the disk. (2) During execution, a lock is encountered, such as a table lock or a row lock. 2. This SQL statement is always executed very slowly. There are the following reasons. (1) Index not used: For example, the field has no index; the index cannot be used due to calculations or function operations on the field. (2) The database selected the wrong index. If you have any additions, you can also add them in the comment section. The above is a detailed explanation of the reasons why SQL statements execute slowly. I hope it will be helpful to you. If you have any questions, please leave me a message and I will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website! You may also be interested in:
|
<<: How to run Hadoop and create images in Docker
>>: Analyze how to automatically generate Vue component documentation
1. Copy the configuration file to the user enviro...
Solution: Bind the click event to the audio compo...
This article example shares the specific code of ...
Table of contents 1. Introduction to calculator f...
All the orchestration files and configuration fil...
Table of contents Preface 1. Project Architecture...
The development of Docker technology provides a m...
1. Simulate database data 1-1 Create database and...
1: I won’t go into the details of how to install ...
Table of contents 1 Difference 1.1 Space Occupanc...
Table of contents Preface 1. Cause of the problem...
join() method: connects all elements in an array ...
I don't know if you have used the frameset at...
illustrate: VMware IOInsight is a tool that helps...
This article is an integrated article on how to c...