Causes and solutions for slow MySQL queries

There are many reasons for slow query speed, the most common ones are as follows:

1. No index, or the index is not used (the most common cause of slow queries, and a defect in program design)
2. Low I/O throughput, creating a bottleneck
3. No computed columns were created, so queries are not optimized
4. Insufficient memory
5. Slow network speed
6. The query returns too much data (use multiple queries or other methods to reduce the amount of data)
7. Locks or deadlocks (also a very common cause of slow queries, and a defect in program design)
8. Read/write contention for resources (check active users and blocking with sp_lock and sp_who)
9. Unnecessary rows and columns are returned
10. The query statement itself is poorly written and not optimized

You can optimize queries by:

1. Put data, logs, and indexes on different I/O devices to increase read speed. (Tempdb could previously be placed on RAID 0, but SQL Server 2000 no longer supports this.) The larger the data, the more important improving I/O becomes.
2. Split tables vertically and horizontally to reduce table size (see sp_spaceused)
3. Upgrade the hardware
4. Build indexes based on the query conditions, optimize the indexes and access paths, and limit the size of the result set. Keep the fill factor appropriate (the default of 0 is usually best) and keep indexes as small as possible, preferring columns with few bytes (see the documentation on index creation). Do not build a standalone index on a column with few distinct values, such as a gender column.
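As a sketch of point 4 (the table, column, and index names here are hypothetical, not from the original text): a narrow composite index built on the columns a frequent query filters on, while skipping a low-selectivity column such as gender:

```sql
-- Hypothetical table: orders(order_id, customer_id, order_date, gender, ...)
-- A narrow composite index on the columns the frequent query filters on:
CREATE INDEX IX_orders_customer_date ON orders (customer_id, order_date)

-- This query can then seek on the index instead of scanning the table:
SELECT order_id, order_date
FROM orders
WHERE customer_id = 42 AND order_date >= '2004-01-01'

-- A standalone index on a two-valued column such as gender is rarely useful:
-- CREATE INDEX IX_orders_gender ON orders (gender)  -- avoid
```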
5. Improve network speed;
6. Expand the server's memory. Windows 2000 and SQL Server 2000 can support 4-8 GB of memory. Configure virtual memory according to the services running concurrently on the machine: when running Microsoft SQL Server 2000, consider setting virtual memory to 1.5 times the physical memory installed in the computer. If you also install the full-text search feature and plan to run the Microsoft Search service for full-text indexing and queries, consider setting virtual memory to at least 3 times the physical memory. Then set the SQL Server max server memory configuration option to 1.5 times the physical memory (half the virtual-memory setting).
7. Increase the number of server CPUs; however, understand that parallel processing needs more resources, such as memory, than serial processing. SQL Server automatically evaluates and chooses between parallel and serial processing: a single task is broken into subtasks that can run on different processors; for example, a query's sorting, joining, scanning, and GROUP BY operations can execute simultaneously. SQL Server chooses the optimal degree of parallelism based on system load; complex, CPU-heavy queries are the best candidates. Update operations (UPDATE, INSERT, DELETE), however, cannot be processed in parallel.
8. For LIKE queries, a plain index is often not enough, while a full-text index consumes space. LIKE 'a%' can use the index; LIKE '%a' cannot; and with LIKE '%a%' the query time is proportional to the total length of the field values, so use VARCHAR rather than CHAR, and build a full-text index on fields with very long values.
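To illustrate point 8 with a sketch (person is a hypothetical table with an index on name):

```sql
SELECT * FROM person WHERE name LIKE 'a%'   -- leading prefix: the index can be used
SELECT * FROM person WHERE name LIKE '%a'   -- leading wildcard: the index cannot be used
SELECT * FROM person WHERE name LIKE '%a%'  -- wildcards on both sides: full scan,
                                            -- time grows with total field length
```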
9. Separate the DB server from the application server; separate OLTP from OLAP
10. Distributed partitioned views can be used to implement a federation of database servers. A federation is a group of servers that are administered separately but cooperate to share the system's processing load. This mechanism of partitioning data to form a federation can scale a group of servers to support the processing needs of a large multi-tier Web site. For more information, see "Designing Federated Database Servers" and the "Partitioned Views" topic in the SQL Server help files.

a. Before implementing partitioned views, you must first horizontally partition the table
b. After creating the member table, define a distributed partitioned view on each member server, and each view has the same name. Thus, queries that reference the distributed partitioned view name can be run on any member server. The system operates as if there is a replica of the original table on each member server, but in fact there is only one member table and one distributed partitioned view on each server. The location of the data is transparent to the application.

11. Rebuild indexes with DBCC DBREINDEX and DBCC INDEXDEFRAG; shrink data and log files with DBCC SHRINKDATABASE and DBCC SHRINKFILE. Set the log to shrink automatically. For large databases, do not enable automatic database growth, as it reduces server performance. There is much to pay attention to when writing T-SQL; here are the common points. First, the process by which the DBMS handles a query plan is as follows:

1. Lexical and grammatical check of query statements
2. Submit the statement to the DBMS query optimizer
3. The optimizer performs algebraic optimization and access path optimization
4. Generate query plan from precompiled module
5. Then submit it to the system for processing and execution at the appropriate time
6. Finally, the execution result is returned to the user. Next, look at SQL Server's data storage structure: a page is 8 KB (8060 bytes usable), eight pages form an extent, and data is stored in B-trees.

12. The difference between COMMIT and ROLLBACK: ROLLBACK rolls back the whole transaction; COMMIT commits the current transaction. There is no need to open transactions inside dynamic SQL. If a transaction is needed, open it outside, for example: BEGIN TRAN EXEC(@s) COMMIT TRAN, or write the dynamic SQL as a function or stored procedure.
13. Use a WHERE clause in the SELECT statement to limit the number of rows returned and avoid table scans. Returning unnecessary data wastes the server's I/O resources, increases the network load, and reduces performance. If the table is very large, locking it during a table scan blocks other connections from accessing it, with serious consequences.
14. SQL comments have no effect on execution
15. Avoid cursors as much as possible; they consume many resources. If row-by-row processing is required, try non-cursor techniques first: client-side loops, temporary tables, table variables, subqueries, CASE expressions, and so on. Cursors can be classified by the fetch options they support: forward-only cursors must fetch rows sequentially from first to last (FETCH NEXT is the only allowed fetch operation and is the default), while scrollable cursors allow fetching any row at any position. Cursor technology became quite powerful in SQL 2000, and its purpose is to support loops. There are four concurrency options:

READ_ONLY: positioned updates through the cursor are not allowed, and no locks are taken on the rows that make up the result set.

OPTIMISTIC WITH VALUES: optimistic concurrency control is a standard part of transaction-control theory, used when there is only a small chance that a second user will update a row in the interval between opening the cursor and updating the row. When a cursor is opened with this option, no locks are held on its rows, which helps maximize throughput. If the user attempts to modify a row, the row's current values are compared with the values obtained when the row was last fetched. If any value has changed, the server knows someone else has updated the row and returns an error; if the values are the same, the server performs the modification.

OPTIMISTIC WITH ROW VERSIONING: this optimistic concurrency option is based on row versioning. With row versioning, the table must have some kind of version identifier the server can use to determine whether the row has changed since it was read into the cursor. In SQL Server this capability is provided by the timestamp data type, a binary number that represents the relative order of changes in the database. Each database has a global current timestamp value, @@DBTS. Whenever a row with a timestamp column is changed in any way, SQL Server first stores the current @@DBTS value in the timestamp column and then increments @@DBTS. If a table has a timestamp column, the timestamp is recorded at the row level, so the server can compare a row's current timestamp value with the one stored when the row was last fetched to determine whether it has been updated; the server need not compare the values of all columns, only the timestamp column. If an application requires row-versioning-based optimistic concurrency on a table that has no timestamp column, the cursor defaults to value-based optimistic concurrency control.

SCROLL LOCKS: this option implements pessimistic concurrency control, in which the application attempts to lock database rows as they are read into the cursor result set. With a server cursor, an update lock is placed on each row as it is read into the cursor. If the cursor is opened within a transaction, the transaction's update locks are held until the transaction is committed or rolled back, and the cursor lock is dropped when the next row is fetched; if the cursor is opened outside a transaction, the lock is dropped when the next row is fetched. Therefore, whenever full pessimistic concurrency control is required, the cursor should be opened within a transaction. An update lock prevents any other task from acquiring an update or exclusive lock, and thus from updating the row; it does not block shared locks, however, so it does not prevent another task from reading the row unless that task also requests a read with an update lock.

Scroll locks: these cursor concurrency options can generate scroll locks based on the lock hints specified in the SELECT statement of the cursor definition. A scroll lock is acquired on each row as it is fetched and held until the next fetch or until the cursor is closed, whichever comes first. On the next fetch, the server acquires scroll locks for the newly fetched rows and releases the scroll locks for the previously fetched ones. Scroll locks are independent of transaction locks and can persist after a commit or rollback. If the option to close cursors on commit is off, COMMIT does not close any open cursors, and scroll locks are retained past the commit to maintain the isolation of the fetched data. The type of scroll lock acquired depends on the cursor concurrency option and the lock hint in the cursor SELECT statement:

Lock hint   Read-only   Optimistic (values)   Optimistic (row versioning)   Scroll locks
No hint     Unlocked    Unlocked              Unlocked                      Update
NOLOCK      Unlocked    Unlocked              Unlocked                      Unlocked
HOLDLOCK    Shared      Shared                Shared                        Update
UPDLOCK     Error       Update                Update                        Update
TABLOCKX    Error       Unlocked              Unlocked                      Update
Other       Unlocked    Unlocked              Unlocked                      Update

*Specifying the NOLOCK hint makes the table on which it is specified read-only within the cursor.
16. Use Profiler to trace queries, get the time each query takes, and locate problem SQL; use the Index Tuning Wizard to optimize indexes.
17. Note the difference between UNION and UNION ALL: UNION ALL is faster because it does not remove duplicates.
18. Use DISTINCT only when necessary. Like UNION, it slows the query down, and duplicate records are usually not a problem in queries.
19. Do not return unnecessary rows and columns when querying.
20. Use sp_configure 'query governor cost limit' or SET QUERY_GOVERNOR_COST_LIMIT to limit the resources a query may consume. When the estimated cost of a query exceeds the limit, the server cancels it before it runs. SET LOCK_TIMEOUT sets the lock wait time.
21. Use SELECT TOP 100 / TOP 10 PERCENT to limit the number of rows returned, or SET ROWCOUNT to limit the rows affected by an operation.
22. Before SQL 2000, avoid predicates such as "IS NULL", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE", and "LIKE '%500'", because they cannot use an index and force a table scan. Also, do not apply functions such as CONVERT or SUBSTRING to a column name in the WHERE clause. If a function is unavoidable, create a computed column and index it instead, or rewrite the predicate: change WHERE SUBSTRING(firstname,1,1) = 'm' to WHERE firstname LIKE 'm%' (an index scan). Keep functions away from column names. Also, do not create too many or overly large indexes. NOT IN scans the table multiple times; use EXISTS, NOT EXISTS, IN, or LEFT OUTER JOIN instead, especially left joins. EXISTS is faster than IN, and NOT operations are the slowest. If a column can contain NULL, its index formerly did not work; the SQL 2000 optimizer can now handle this, so IS NULL, NOT, NOT EXISTS, and NOT IN can be optimized, but "<>" and the like still cannot and use no index.
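As a sketch of the rewrites in point 22, reusing the chineseresume and personmember tables that appear later in this article (the exact columns are assumptions):

```sql
-- A function on the column defeats the index:
-- SELECT * FROM personmember WHERE SUBSTRING(firstname, 1, 1) = 'm'
-- Index-friendly equivalent:
SELECT * FROM personmember WHERE firstname LIKE 'm%'

-- NOT IN can scan the table repeatedly; NOT EXISTS is usually cheaper:
SELECT a.personMemberID
FROM chineseresume a
WHERE NOT EXISTS (
    SELECT 1 FROM personmember b
    WHERE b.referenceid = a.personMemberID
)
```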
23. Use Query Analyzer to view the query plan of the SQL statement and evaluate whether it is the optimized SQL. Generally, 20% of the code takes up 80% of the resources, and our optimization focus is on these slow areas.
24. If you find that the query does not use the index when using IN or OR, use explicit declaration to specify the index:
SELECT * FROM PersonMember (INDEX = IX_Title) WHERE processid IN ('male', 'female')
25. Pre-calculate the results to be queried and put them in the table, and then SELECT them when querying. This was the most important means before SQL7.0. For example, the calculation of hospitalization fees.
26. MIN() and MAX() can use appropriate indexes.
27. A database principle is that the closer the code is to the data, the better. Therefore prefer Default first, then Rules, Triggers, Constraints (foreign keys, primary keys, CHECK, UNIQUE, the maximum length of a data type, and so on are all constraints), then Procedures. This not only reduces maintenance work and improves the quality of the program, it also executes faster.
28. To insert a large binary value into an Image column, use a stored procedure; never insert it with an inline INSERT statement (I do not know whether JAVA behaves this way). The application first converts the binary value into a character string (doubling its size), and the server converts it back to binary after receiving it. A stored procedure avoids these conversions:

CREATE PROCEDURE p_insert (@image image) AS
INSERT INTO table1 (Fimage) VALUES (@image)

Calling this stored procedure from the front end and passing the binary value as a parameter significantly improves processing speed.
29. BETWEEN is faster than IN in some cases: BETWEEN can locate a range through the index more quickly. The difference can be seen with the query optimizer.

select * from chineseresume where title in ('male','female')
select * from chineseresume where title between 'female' and 'male'

The results are the same, but since IN may be evaluated many times, it can sometimes be slower.
30. When necessary, create indexes for global or local temporary tables, which can sometimes increase the speed, but this is not necessarily the case because indexes also consume a lot of resources. Its creation is the same as the actual table.
31. Do not create useless transactions, for example when generating reports; this wastes resources. Use transactions only when they are necessary.
32. An OR clause can be decomposed into multiple queries connected with UNION. Their speed depends only on whether indexes are used; if the query needs a composite index, UNION ALL is more efficient. Multiple OR clauses do not use the index, so rewrite them in UNION form and then try to match the indexes. The key question is whether indexes are used.
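A sketch of the OR-to-UNION rewrite in point 32, reusing the personmember table from this article (the column values are assumptions); each branch can then match its own index:

```sql
-- The OR across two differently indexed columns may force a table scan:
-- SELECT * FROM personmember
-- WHERE firstname = 'Mary' OR referenceid = 'JCNPRH39681'

-- Rewritten with UNION so each branch can use its own index
-- (UNION removes duplicates; UNION ALL is cheaper when none can occur):
SELECT * FROM personmember WHERE firstname = 'Mary'
UNION
SELECT * FROM personmember WHERE referenceid = 'JCNPRH39681'
```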
33. Use views as little as possible; they are inefficient. Operations on views are slower than direct operations on tables, so stored procedures can be used instead. In particular, do not nest views; nested views make it harder to find the original data. The essence of a view is an optimized SQL statement with a query plan generated and stored on the server. When retrieving data from a single table, do not use a view that points to multiple tables; retrieve directly from the table or from a view that contains only that table, otherwise unnecessary overhead is added and the query is disturbed. To speed up view queries, SQL Server has added indexed views.
34. Do not use DISTINCT and ORDER BY when it is not necessary. These actions can be performed on the client side instead. They add extra overhead. This is the same principle as UNION and UNION ALL.

select top 20 ad.companyname,comid,position,ad.referenceid,worklocation, 
convert(varchar(10),ad.postDate,120) as postDate1,workyear,degreedescription FROM 
jobcn_query.dbo.COMPANYAD_query ad where referenceID in('JCNAD00329667','JCNAD132168','JCNAD00337748','JCNAD00338345', 
'JCNAD00333138','JCNAD00303570','JCNAD00303569', 
'JCNAD00303568','JCNAD00306698','JCNAD00231935','JCNAD00231933', 
'JCNAD00254567','JCNAD00254585','JCNAD00254608', 
'JCNAD00254607','JCNAD00258524','JCNAD00332133','JCNAD00268618', 
'JCNAD00279196','JCNAD00268613') order by postdate desc

35. In the list of values after IN, put the most frequently occurring value first and the least frequent last, to reduce the number of comparisons.
36. SELECT INTO locks system tables (sysobjects, sysindexes, etc.), blocking access from other connections. Create temporary tables with an explicit CREATE TABLE statement instead of SELECT INTO.

drop table t_lxh
begin tran
select * into t_lxh from chineseresume where name = 'XYZ'
--commit

Run SELECT * FROM sysobjects in another connection and you can see that SELECT INTO locks the system tables; CREATE TABLE also locks them (whether the table is temporary or permanent). So never use SELECT INTO inside a transaction!!! If the temporary table is used frequently, use a real table or a table variable instead.
37. Redundant rows can generally be eliminated before the GROUP BY and HAVING clauses, so try not to rely on them to eliminate rows. Their optimal execution order is: the WHERE clause selects all qualifying rows, GROUP BY groups and aggregates them, and HAVING eliminates redundant groups. This keeps the cost of GROUP BY and HAVING small and the query fast. Grouping and HAVING over large row sets consume many resources. If GROUP BY is used only for grouping, without aggregate calculations, DISTINCT is faster.
38. Updating multiple records at once is faster than updating one record at a time, i.e. batch processing is better.
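Point 38 as a sketch, reusing the employee table mentioned later in this article (the column names are assumptions):

```sql
-- Row-by-row: one statement, one round trip, one log record per row
UPDATE employee SET salary = salary * 1.1 WHERE emp_id = 1
UPDATE employee SET salary = salary * 1.1 WHERE emp_id = 2
-- ...

-- Set-based batch: a single scan and a single transaction for all rows
UPDATE employee SET salary = salary * 1.1 WHERE emp_id BETWEEN 1 AND 100
```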
39. Use temporary tables less often. Try to use result sets and Table-type variables instead. Table-type variables are better than temporary tables.
40. In SQL 2000, computed columns can be indexed. The conditions are as follows:

a. The computed column's expression is deterministic
b. The column is not of TEXT, NTEXT, or IMAGE type
c. The following options must be configured: ANSI_NULLS = ON, ANSI_PADDING = ON, ...
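A minimal sketch of an indexed computed column under the conditions above (SQL Server 2000 syntax; the table and the partial option list are assumptions — see the help files for the complete set of required SET options):

```sql
SET ANSI_NULLS ON
SET ANSI_PADDING ON

CREATE TABLE fee_item (
    item_id   int   NOT NULL,
    qty       int   NOT NULL,
    unit_cost money NOT NULL,
    total AS (qty * unit_cost)   -- deterministic expression, no TEXT/NTEXT/IMAGE
)

CREATE INDEX IX_fee_item_total ON fee_item (total)
```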

41. Put data processing on the server where possible to reduce network overhead, for example by using stored procedures. A stored procedure is compiled, optimized SQL organized into an execution plan and stored in the database; it is a collection of control-flow statements and is of course fast. Dynamic SQL that is executed repeatedly can use a temporary stored procedure, which (like a temporary table) is placed in tempdb. In the past, because SQL Server did not support complex mathematical calculations, that work had to be placed in other tiers, increasing network overhead. SQL 2000 supports UDFs and now handles complex calculations. Do not make a function's return value too large, as this causes great overhead; user-defined functions consume many resources when executed, like cursors. If large results must be returned, use a stored procedure.
42. Do not call the same function repeatedly within one statement; it wastes resources. It is faster to put the result in a variable and reference that.
43. SELECT COUNT(*) is inefficient, so try to rewrite it; EXISTS is fast. Also note the difference: SELECT COUNT(nullable_field) FROM Table and SELECT COUNT(non_null_field) FROM Table return different values, because COUNT(column) skips NULLs!
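Point 43 sketched against the chineseresume table used elsewhere in this article:

```sql
-- Slow existence check: counts every matching row
IF (SELECT COUNT(*) FROM chineseresume WHERE name = 'XYZ') > 0
    PRINT 'found'

-- Faster: EXISTS stops at the first matching row
IF EXISTS (SELECT 1 FROM chineseresume WHERE name = 'XYZ')
    PRINT 'found'

-- COUNT(column) skips NULLs, COUNT(*) does not, so these can differ:
SELECT COUNT(*) AS all_rows, COUNT(name) AS non_null_names FROM chineseresume
```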
44. When the server has enough memory, set the number of threads equal to the maximum number of connections + 5 for maximum efficiency. Otherwise, set the number of threads below the maximum number of connections and let SQL Server's thread pool handle it; if it is still set to the maximum number of connections + 5, server performance suffers severely.
45. Access your tables in a certain order. If you lock table A first and then table B, you must lock them in this order in all stored procedures. If you (accidentally) lock table B first and then table A in a stored procedure, this may cause a deadlock. Deadlocks can be difficult to detect if the locking sequence is not carefully designed in advance.
46. Monitor the load on the corresponding hardware through SQL Server Performance Monitor. If the Memory: Page Faults/sec counter occasionally goes high, it indicates threads competing for memory at that moment; if it is consistently high, memory may be the bottleneck.

Process:
1. % DPC Time: the percentage of processor time spent receiving and servicing deferred procedure calls (DPCs) during the sample interval. (DPCs run at a lower priority than standard interrupts.) Because DPCs execute in privileged mode, the DPC percentage is a portion of the privileged-time percentage. These times are counted separately and are not part of the interrupt-count total. This total shows average busy time as a percentage of instance time.
2. % Processor Time: if this value consistently exceeds 95%, the bottleneck is the CPU. Consider adding a processor or switching to a faster one.
3. % Privileged Time: the percentage of non-idle processor time spent in privileged mode. (Privileged mode is a processing mode designed for operating-system components and hardware drivers; it allows direct access to hardware and all memory. The other mode is user mode, a restricted mode designed for applications, environment subsystems, and integral subsystems. The operating system switches application threads into privileged mode to access operating-system services.) Privileged time includes time spent servicing interrupts and DPCs. A high privileged-time ratio may be caused by a large number of interrupts from a failing device. This counter shows average busy time as a fraction of sample time.
4. % User Time: reflects CPU-consuming database operations, such as sorting and executing aggregate functions. If this value is very high, consider adding indexes, using simpler table joins, horizontally splitting large tables, and similar measures to reduce it. Physical Disk: Current Disk Queue Length counter: this value should not exceed 1.5-2 times the number of disks; to improve performance, add disks. SQLServer: Cache Hit Ratio counter: the higher the better. If it is consistently below 80%, consider adding memory. Note that this value accumulates from the time SQL Server starts, so after a period of time it will not reflect the current state of the system.

47. Analysis of SELECT emp_name FROM employee WHERE salary > 3000: if salary is of type FLOAT, the optimizer converts the predicate to CONVERT(float, 3000), because 3000 is an integer. We should write 3000.0 when programming instead of letting the DBMS perform the conversion at run time. The same applies to conversions between character and integer data.
48. Join conditions and the order in which they are written:

select a.personMemberID, * from chineseresume a,personmember b where personMemberID
= b.referenceid and a.personMemberID = 'JCNPRH39681' (A = B, B = 'number')

select a.personMemberID, * from chineseresume a,personmember b where a.personMemberID
= b.referenceid and a.personMemberID = 'JCNPRH39681' and b.referenceid = 'JCNPRH39681' (A = B, B = 'number', A = 'number')

select a.personMemberID, * from chineseresume a,personmember b where b.referenceid
= 'JCNPRH39681' and a.personMemberID = 'JCNPRH39681' (B = 'Number', A = 'Number')

49.

(1) IF no person-in-charge code is entered THEN code1 = 0; code2 = 9999 ELSE code1 = code2 = person-in-charge code END IF. The SQL executed is: SELECT person_in_charge_name FROM P2000 WHERE person_in_charge_code >= :code1 AND person_in_charge_code <= :code2

(2) IF no person-in-charge code is entered THEN SELECT person_in_charge_name FROM P2000 ELSE code = person-in-charge code; SELECT person_in_charge_name FROM P2000 WHERE person_in_charge_code = :code END IF

The first method uses one SQL statement; the second uses two. When no code is entered, the second is clearly more efficient because it has no restriction; when a code is entered, the second is still more efficient, not only because it has one fewer restriction but because an equality comparison is the fastest query operation. We should not shy away from this extra work when writing programs.

50. Regarding JOBCN's new paged-query method (below): use the performance optimizer to analyze where the bottleneck is. If it is I/O or network speed, the following optimization is effective; if it is CPU or memory, the current method is better. Compare the following variants; they show that the smaller the index, the better.

begin 
DECLARE @local_variable table (FID int identity(1,1),ReferenceID varchar(20)) 
insert into @local_variable (ReferenceID) 
select top 100000 ReferenceID from chineseresume order by ReferenceID 
select * from @local_variable where Fid > 40 and fid <= 60 
end

and

begin 
DECLARE @local_variable table (FID int identity(1,1),ReferenceID varchar(20)) 
insert into @local_variable (ReferenceID) 
select top 100000 ReferenceID from chineseresume order by updatedate 
select * from @local_variable where Fid > 40 and fid <= 60 
end

and, for comparison, the same logic using a temporary table:

begin 
create table #temp (FID int identity(1,1),ReferenceID varchar(20)) 
insert into #temp (ReferenceID) 
select top 100000 ReferenceID from chineseresume order by updatedate 
select * from #temp where Fid > 40 and fid <= 60 
drop table #temp 
end

Analysis and solutions for slow MySQL queries

1. Locate the slow-running SQL statements, for example those that do not complete within 2 seconds.

show engines;                            -- view the available storage engines
show variables like 'slow%';             -- view the slow-query log settings
show variables like 'long%';             -- view the slow-query threshold (long_query_time)
set long_query_time=1;                   -- change the slow-query threshold to 1 second
set global slow_query_log='ON';          -- enable the slow query log
show full processlist;                   -- see which threads are running
show variables like '%max_connections%'; -- view the maximum number of connections
show status like 'Threads_connected%';   -- view the current number of connections

2. Solutions

1. Start with the SQL itself: run it once in Navicat to see how long it takes, and check whether SQL best-practice rules are followed and whether constructs such as SELECT * or IN appear.

2. Is the maximum number of connections insufficient for the current number of connections? Consider increasing it.

3. Optimize indexes: index frequently queried columns, and do not index columns of types such as TEXT.

4. Split the database and tables (sharding), and dedicate some tables to queries.

5. Enable the database cache.

6. Upgrade the server hardware.
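A hedged MySQL sketch of steps 1 and 3 (the table and column names are hypothetical): EXPLAIN shows whether the slow statement uses an index, and ALTER TABLE adds one if it does not:

```sql
-- type=ALL in the plan output means a full table scan:
EXPLAIN SELECT * FROM user_account WHERE email = 'a@example.com';

-- Add an index on the filtered column, then re-run EXPLAIN to confirm
-- the access type has improved (e.g. to ref) instead of ALL:
ALTER TABLE user_account ADD INDEX idx_email (email);
```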
