Some methods to optimize query speed when MySQL processes massive data

In real projects I have worked on, I found that once a MySQL table grows to millions of rows, the efficiency of ordinary SQL queries drops sharply, and with many conditions in the where clause the query speed becomes intolerable. I once tested a conditional query on an indexed table containing more than 4 million records, and the query took 40 seconds. No user will put up with a delay like that, so improving the efficiency of SQL statements matters a great deal. The following are 30 SQL query optimization methods that are widely circulated on the Internet:

1. Try to avoid using the != or <> operator in the where clause, otherwise the engine will often abandon the index and perform a full table scan.

2. To optimize the query, try to avoid full table scans. First, consider creating indexes on the columns involved in where and order by.

3. Avoid testing fields for null in the where clause where you can, because some engines abandon the index and perform a full table scan (MySQL itself can use an index for is null, so check the plan with explain), such as:
select id from t where num is null
You can set a default value of 0 on num to ensure that there is no null value in the num column in the table, and then query it like this:
select id from t where num=0

4. Try to avoid using or to connect conditions in the where clause, otherwise the engine may abandon the index and perform a full table scan, such as:
select id from t where num=10 or num=20
You can query like this:
select id from t where num=10
union all
select id from t where num=20
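
The rewrite above is only safe if both forms return the same rows. Here is a minimal sketch for checking that, assuming Python's built-in sqlite3 module as a stand-in engine so it runs without a MySQL server. The index name idx_t_num and the sample data are made up for illustration.

```python
import sqlite3

# Stand-in data: table t and column num follow the article's example;
# the index name and the generated rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, num integer)")
conn.execute("create index idx_t_num on t(num)")
conn.executemany("insert into t (num) values (?)",
                 [(n % 50,) for n in range(1000)])

or_sql = "select id from t where num=10 or num=20"
union_sql = ("select id from t where num=10 "
             "union all "
             "select id from t where num=20")

# Sort both result sets so the comparison ignores row order.
or_ids = sorted(row[0] for row in conn.execute(or_sql))
union_ids = sorted(row[0] for row in conn.execute(union_sql))

print(or_ids == union_ids)
```

union all does not remove duplicates, so the two forms agree only when no row can satisfy both branches, as with num=10 and num=20 here. On MySQL, compare the explain output of both statements before committing to the longer form.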

5. The following query will also result in a full table scan, because the pattern starts with a percent sign:
select id from t where name like '%abc%'
To improve efficiency, you can consider full-text retrieval.

6. Use in and not in with caution, otherwise they may lead to a full table scan, such as:
select id from t where num in(1,2,3)
For continuous values, use between instead of in:
select id from t where num between 1 and 3

7. If parameters (local variables) are used in the where clause, a full table scan can also result. SQL resolves local variables only at run time, but the optimizer must choose an access plan at compile time and cannot defer that choice. At compile time the value of the variable is still unknown, so it cannot be used as an input for index selection. The following statement will perform a full table scan:
select id from t where num=@num
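You can force the query to use the index instead (the hint below is SQL Server syntax; in MySQL, write force index(index name) after the table name):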
select id from t with(index(index name)) where num=@num

8. Try to avoid expression operations on fields in the where clause, as this will cause the engine to abandon the index and perform a full table scan. For example:
select id from t where num/2=100
Should be changed to:
select id from t where num=100*2
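
To see the difference between the two forms, you can ask the engine for its plan. The following is a sketch under stated assumptions: it uses Python's built-in sqlite3 module and SQLite's explain query plan as a stand-in for MySQL's explain, and the index name idx_t_num and the helper plan() are hypothetical. SQLite reports SCAN when it walks every row and SEARCH when it seeks through an index.

```python
import sqlite3

# Stand-in schema: table t and column num come from the article;
# the index name and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, num integer)")
conn.execute("create index idx_t_num on t(num)")
conn.executemany("insert into t (num) values (?)",
                 [(n,) for n in range(1000)])

def plan(sql):
    """Return SQLite's description of how it will read the table."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

expr_plan = plan("select id from t where num/2=100")   # expression on the column
bare_plan = plan("select id from t where num=100*2")   # bare column, constant on the right

print(expr_plan)
print(bare_plan)
```

The two predicates match the same rows only when / is exact division, as it is in MySQL. On engines where dividing two integers truncates (SQLite and SQL Server, for example), num=201 also satisfies num/2=100, so check the semantics before rewriting.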

9. Try to avoid applying functions to fields in the where clause, as this will cause the engine to abandon the index and perform a full table scan. For example:
select id from t where substring(name,1,3)='abc' -- ids whose name starts with abc
select id from t where datediff(day,createdate,'2005-11-30')=0 -- ids created on '2005-11-30'
Should be changed to:
select id from t where name like 'abc%'
select id from t where createdate>='2005-11-30' and createdate<'2005-12-1'
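
The date rewrite can be checked the same way. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine. SQLite has no datediff, so its date() function plays the role of the function wrapped around the column, and the dates are stored as zero-padded text (hence '2005-12-01' rather than '2005-12-1'). The index name and the helpers ids() and plan() are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, createdate text)")
conn.execute("create index idx_t_createdate on t(createdate)")
conn.executemany(
    "insert into t (createdate) values (?)",
    [
        ("2005-11-29 23:59:59",),  # day before: excluded
        ("2005-11-30 00:00:00",),  # first instant of the day: included
        ("2005-11-30 12:30:00",),
        ("2005-11-30 23:59:59",),  # last second of the day: included
        ("2005-12-01 00:00:00",),  # next day: excluded
    ],
)

func_sql = "select id from t where date(createdate)='2005-11-30'"
range_sql = ("select id from t where createdate>='2005-11-30' "
             "and createdate<'2005-12-01'")

def ids(sql):
    """Run a query and return its ids in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

def plan(sql):
    """Return SQLite's description of how it will read the table."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

func_ids, range_ids = ids(func_sql), ids(range_sql)
func_plan, range_plan = plan(func_sql), plan(range_sql)

print(func_ids == range_ids)
print(func_plan)
print(range_plan)
```

The half-open range (>= the start of the day, < the start of the next day) is what keeps the boundary rows correct without any function on the column.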

10. Do not perform functions, arithmetic operations, or other expression operations on the left side of the "=" in the where clause, otherwise the system may not be able to use the index correctly.

11. When an indexed field is used as a condition and the index is a composite index, the first field of the index must appear in the conditions for the system to use the index; otherwise the index will not be used. Keep the order of the fields in the conditions consistent with the index order as far as possible.
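
The leading-field rule can be illustrated with a small experiment. This is a sketch, assuming Python's built-in sqlite3 module as a stand-in engine. The orders table, its columns, the index name and the helper plan() are all hypothetical and do not come from the original tips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table orders ("
             "id integer primary key, customer_id integer, "
             "status text, note text)")
# Composite index: customer_id is the first (leading) field.
conn.execute("create index idx_orders_customer_status "
             "on orders(customer_id, status)")
conn.executemany(
    "insert into orders (customer_id, status, note) values (?, ?, ?)",
    [(n % 20, "paid" if n % 3 else "open", "n/a") for n in range(300)],
)

def plan(sql):
    """Return SQLite's description of how it will read the table."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# The condition uses the leading field only.
first_only = plan("select note from orders where customer_id=7")
# The conditions use both fields, in index order.
both = plan("select note from orders "
            "where customer_id=7 and status='paid'")
# The condition skips the leading field.
second_only = plan("select note from orders where status='paid'")

print(first_only)
print(both)
print(second_only)
```

In this stand-in, the first two queries are answered by an index search and the third falls back to a scan. MySQL's explain shows the equivalent information in its key and type columns.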

12. Do not write meaningless queries, such as those that require the generation of an empty table structure:
select col1,col2 into #t from t where 1=0
This type of code will not return any result set, but will consume system resources. It should be changed to this:
create table #t(…)

13. In many cases, using exists instead of in is a good choice:
select num from a where num in(select num from b)
Replace it with the following:
select num from a where exists(select 1 from b where num=a.num)
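
Before swapping one form for the other, it is worth confirming that they return the same rows. Here is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine. The tables a and b follow the example above and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table a (num integer)")
conn.execute("create table b (num integer)")
conn.executemany("insert into a (num) values (?)",
                 [(n,) for n in range(1, 11)])
# b contains a duplicate (4) and a value missing from a (20).
conn.executemany("insert into b (num) values (?)",
                 [(2,), (4,), (4,), (6,), (20,)])

in_sql = "select num from a where num in(select num from b)"
exists_sql = ("select num from a "
              "where exists(select 1 from b where num=a.num)")

# Sort both result sets so the comparison ignores row order.
in_nums = sorted(row[0] for row in conn.execute(in_sql))
exists_nums = sorted(row[0] for row in conn.execute(exists_sql))

print(in_nums == exists_nums)
```

Both forms return each matching row of a once, even though b contains a duplicate. Which one runs faster depends on the engine and the data, so compare the plans on your own tables.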

14. Not all indexes are effective for queries. The optimizer chooses a plan based on the data in the table, and when the indexed column contains many repeated values it may not use the index. For example, if a table has a sex field that is almost half male and half female, an index on sex will do little for query efficiency.

15. More indexes are not always better. Although indexes can improve the efficiency of the corresponding selects, they also reduce the efficiency of inserts and updates, because indexes may have to be rebuilt during an insert or update. How to build indexes therefore requires careful consideration and depends on the specific situation. A table should preferably have no more than 6 indexes. If it has more, consider whether the indexes on rarely used columns are necessary.

16. Avoid updating clustered index data columns as much as possible, because the order of clustered index data columns is the physical storage order of table records. Once the column value changes, the order of the entire table records will be adjusted, which will consume considerable resources. If the application system needs to frequently update clustered index data columns, you need to consider whether the index should be built as a clustered index.

17. Use numeric fields where possible. If a field contains only numerical information, try not to design it as a character type, as this will reduce the performance of queries and joins and increase storage overhead. This is because the engine compares the characters of a string one by one when processing queries and joins, whereas a numeric type needs only one comparison.

18. Use varchar/nvarchar instead of char/nchar whenever possible. First, variable-length fields take up less storage space. Second, for queries, searching in a relatively small field is obviously more efficient.

19. Do not use select * from t anywhere. Replace "*" with a specific field list and do not return any unused fields.

20. Try to use table variables instead of temporary tables. If the table variable contains a lot of data, be aware that the indexes are very limited (only the primary key index).

21. Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

22. Temporary tables are not unusable. Using them appropriately can make certain routines more efficient, for example, when you need to repeatedly reference a data set in a large table or a commonly used table. However, for one-time events, it is better to use a derived table.

23. When creating a new temporary table, if a large amount of data is inserted at once, use select into instead of create table to avoid generating a large volume of log records and to improve speed. If the amount of data is small, create table first and then insert, to ease the load on the system tables.

24. If temporary tables are used, be sure to explicitly delete all of them at the end of the stored procedure: truncate table first, then drop table. This avoids locking the system tables for a long time.

25. Try to avoid using cursors because of their poor efficiency. If the data operated by the cursor exceeds 10,000 rows, you should consider rewriting it.

26. Before using cursor-based methods or temporary table methods, you should first look for set-based solutions to solve the problem. Set-based methods are usually more effective.

27. Like temporary tables, cursors are not unusable. Using a FAST_FORWARD cursor with small data sets is often superior to other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that include "aggregates" in the result set will generally execute faster than using cursors. If development time permits, try both the cursor-based approach and the set-based approach to see which one works better.

28. Put SET NOCOUNT ON at the beginning of every stored procedure and trigger, and SET NOCOUNT OFF at the end, so that no DONE_IN_PROC message is sent to the client after each statement in the stored procedure or trigger executes.

29. Try to avoid returning large amounts of data to the client. If the amount of data is too large, consider whether the corresponding demand is reasonable.

30. Try to avoid large transaction operations and improve the system's concurrency capabilities.
