Preface MySQL continued to maintain its strong growth trend in database popularity in 2016. More and more customers are building their applications on MySQL databases, or even migrating from Oracle to MySQL. However, some customers encounter problems such as slow response time and full CPU when using MySQL database. Alibaba Cloud RDS expert service team has helped cloud customers solve many urgent problems. Some common SQL problems that appear in the "ApsaraDB Expert Diagnosis Report" are summarized below for your reference. 1. LIMIT statement Pagination query is one of the most commonly used scenarios, but it is also usually the place where problems are most likely to occur. For example, for the following simple statement, the general DBA's solution is to add a composite index on the type, name, and create_time fields. In this way, conditional sorting can effectively utilize the index and quickly improve performance. SELECT * FROM operation WHERE type = 'SQLStats' AND name = 'SlowLog' ORDER BY create_time LIMIT 1000, 10; Well, probably more than 90% of DBAs would stop here to solve this problem. But when the LIMIT clause becomes "LIMIT 1000000,10", programmers will still complain: Why is it still slow when I only take 10 records? You should know that the database does not know where the 1,000,000th record starts, and even if there is an index, it needs to be calculated from scratch. When this kind of performance problem occurs, in most cases it is because the programmer is lazy. In scenarios such as front-end data browsing and paging, or large data batch export, the maximum value of the previous page can be used as a parameter as a query condition. The SQL is redesigned as follows: SELECT * FROM operation WHERE type = 'SQLStats' AND name = 'SlowLog' AND create_time > '2017-03-16 14:00:00' ORDER BY create_time limit 10; Under the new design, query time is basically fixed and will not change as the amount of data grows. 2. Implicit conversion Mismatch between query variables and field definition types in SQL statements is another common error. For example, the following statement: mysql> explain extended SELECT * > FROM my_balance b > WHERE b.bpn = 14000000123 > AND b.isverified IS NULL ; mysql> show warnings; | Warning | 1739 | Cannot use ref access on index 'bpn' due to type or collation conversion on field 'bpn' The field bpn is defined as varchar(20), and MySQL's strategy is to convert the string into a number before comparison. The function acts on the table field and the index becomes invalid. The above situation may be parameters automatically filled in by the application framework, rather than the original intention of the programmer. There are many complex application frameworks nowadays. While they are convenient to use, you should also be careful of the pitfalls they may cause you. 3. Update and delete associations Although MySQL 5.6 introduced the materialization feature, it is important to note that it currently only optimizes query statements. Updates or deletions need to be manually rewritten into JOINs. For example, in the following UPDATE statement, MySQL actually executes a loop/nested subquery (DEPENDENT SUBQUERY), and its execution time can be imagined. UPDATE operation o SET status = 'applying' WHERE o.id IN (SELECT id FROM (SELECT o.id, o.status FROM operation o WHERE o.group = 123 AND o.status NOT IN ( 'done' ) ORDER BY o.parent, o.id LIMIT 1) t); Execution plan: +----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+ | 1 | PRIMARY | o | index | | PRIMARY | 8 | | 24 | Using where; Using temporary | | 2 | DEPENDENT SUBQUERY | | | | | | | | | Impossible WHERE noticed after reading const tables | | 3 | DERIVED | o | ref | idx_2,idx_5 | idx_5 | 8 | const | 1 | Using where; Using filesort | +----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+ After being rewritten as JOIN, the subquery selection mode changes from DEPENDENT SUBQUERY to DERIVED, and the execution speed is greatly accelerated, from 7 seconds to 2 milliseconds. UPDATE operation o JOIN (SELECT o.id, o.status FROM operation o WHERE o.group = 123 AND o.status NOT IN ( 'done' ) ORDER BY o.parent, o.id LIMIT 1) t ON o.id = t.id SET status = 'applying' The execution plan is simplified to +----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+ | 1 | PRIMARY | | | | | | | | | Impossible WHERE noticed after reading const tables | | 2 | DERIVED | o | ref | idx_2,idx_5 | idx_5 | 8 | const | 1 | Using where; Using filesort | +----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+ 4. Mixed sorting MySQL cannot use indexes for mixed sorting. But in some scenarios, there are still opportunities to use special methods to improve performance. SELECT * FROM my_order o INNER JOIN my_appraise a ON a.orderid = o.id ORDER BY a.is_reply ASC, a.appraise_time DESC LIMIT 0, 20 The execution plan shows a full table scan: +----+-------------+-------+--------+-------------+---------+-----------------------+---------+-+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra +----+-------------+-------+--------+-------------+---------+-----------------------+---------+-+ | 1 | SIMPLE | a | ALL | idx_orderid | NULL | NULL | NULL | 1967647 | Using filesort | | 1 | SIMPLE | o | eq_ref | PRIMARY | PRIMARY | 122 | a.orderid | 1 | NULL | +----+-------------+-------+--------+---------+---------+---------+----------------+---------+-+ Since is_reply has only two states, 0 and 1, we rewrite it as follows, and the execution time is reduced from 1.58 seconds to 2 milliseconds. SELECT * FROM ((SELECT * FROM my_order o INNER JOIN my_appraise a ON a.orderid = o.id AND is_reply = 0 ORDER BY appraise_time DESC LIMIT 0, 20) UNION ALL (SELECT * FROM my_order o INNER JOIN my_appraise a ON a.orderid = o.id AND is_reply = 1 ORDER BY appraise_time DESC LIMIT 0, 20)) t ORDER BY is_reply ASC, appraisetime DESC LIMIT 20; 5. EXISTS Statement When MySQL treats the EXISTS clause, it still uses the nested subquery execution method. Such as the following SQL statement: SELECT * FROM my_neighbor n LEFT JOIN my_neighbor_apply sra ON n.id = sra.neighbor_id AND sra.user_id = 'xxx' WHERE n.topic_status < 4 AND EXISTS(SELECT 1 FROM message_info m WHERE n.id = m.neighbor_id AND m.inuser = 'xxx') AND n.topic_type <> 5 The execution plan is: +----+--------------------+-------+------+-----+------------------------------------------+---------+-------+---------+ -----+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+ | 1 | PRIMARY | n | ALL | | NULL | NULL | NULL | 1086041 | Using where | | 1 | PRIMARY | sra | ref | | idx_user_id | 123 | const | 1 | Using where | | 2 | DEPENDENT SUBQUERY | m | ref | | idx_message_info | 122 | const | 1 | Using index condition; Using where | +----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+ Removing exists and changing it to join can avoid nested subqueries and reduce the execution time from 1.93 seconds to 1 millisecond. SELECT * FROM my_neighbor n INNER JOIN message_info m ON n.id = m.neighbor_id AND m.inuser = 'xxx' LEFT JOIN my_neighbor_apply sra ON n.id = sra.neighbor_id AND sra.user_id = 'xxx' WHERE n.topic_status < 4 AND n.topic_type <> 5 New execution plan: +----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+ | 1 | SIMPLE | m | ref | | idx_message_info | 122 | const | 1 | Using index condition | | 1 | SIMPLE | n | eq_ref | | PRIMARY | 122 | ighbor_id | 1 | Using where | | 1 | SIMPLE | sra | ref | | idx_user_id | 123 | const | 1 | Using where | +----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+ 6. Conditional Pushdown The cases where the outer query conditions cannot be pushed down to complex views or subqueries include:
As shown in the following statement, the execution plan shows that the condition acts after the aggregate subquery: SELECT * FROM (SELECT target, Count(*) FROM operation GROUP BY target) t WHERE target = 'rm-xxxx' +----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+ | 1 | PRIMARY | <derived2> | ref | <auto_key 0 > | <auto_key0> | 514 | const | 2 | Using where | | 2 | DERIVED | operation | index | idx_4 | idx_4 | 519 | NULL | 20 | Using index | +----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+ After determining that the query conditions can be directly pushed down from a semantic point of view, rewrite it as follows: SELECT target, Count(*) FROM operation WHERE target = 'rm-xxxx' GROUP BY target The execution plan becomes: +----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+ | 1 | SIMPLE | operation | ref | idx_4 | idx_4 | 514 | const | 1 | Using where; Using index | +----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+ For a detailed explanation of why MySQL external conditions cannot be pushed down, please refer to the article: http://mysql.taobao.org/monthly/2016/07/08 7. Narrow down the scope in advance First, the initial SQL statement: SELECT * FROM my_order o LEFT JOIN my_userinfo u ON o.uid = u.uid LEFT JOIN my_productinfo p ON o.pid = p.pid WHERE ( o.display = 0 ) AND ( o.ostaus = 1 ) ORDER BY o.selltime DESC LIMIT 0, 15 The original intention of this SQL statement is: first do a series of left joins, and then sort and take the first 15 records. It can also be seen from the execution plan that the estimated number of sorted records in the last step is 900,000 and the time consumption is 12 seconds. +----+-------------+-------+--------+---------------+---------+---------+----------------+--------+----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+----------------+--------+----------------------------------------------------+ | 1 | SIMPLE | o | ALL | NULL | NULL | NULL | NULL | 909119 | Using where; Using temporary; Using filesort | | 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | o.uid | 1 | NULL | | 1 | SIMPLE | p | ALL | PRIMARY | NULL | NULL | NULL | 6 | Using where; Using join buffer (Block Nested Loop) | +----+-------------+-------+--------+---------------+---------+---------+----------------+--------+----------------------------------------------------+ Since the final WHERE condition and sorting are both for the leftmost primary table, you can sort my_order in advance to reduce the amount of data before doing a left join. After SQL is rewritten, the execution time is reduced to about 1 millisecond. SELECT * FROM ( SELECT * FROM my_order o WHERE ( o.display = 0 ) AND ( o.ostaus = 1 ) ORDER BY o.selltime DESC LIMIT 0, 15 ) LEFT JOIN my_userinfo u ON o.uid = u.uid LEFT JOIN my_productinfo p ON o.pid = p.pid ORDER BY o.selltime DESC limit 0, 15 Check the execution plan again: the subquery participates in JOIN after being materialized (select_type=DERIVED). Although the estimated number of rows to be scanned is still 900,000, the actual execution time becomes very small after using the index and the LIMIT clause. +----+-------------+------------+--------+---------------+---------+---------+---+--------+----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+---------------+---------+---------+---+--------+----------------------------------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 15 | Using temporary; Using filesort | | 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | o.uid | 1 | NULL | | 1 | PRIMARY | p | ALL | PRIMARY | NULL | NULL | NULL | 6 | Using where; Using join buffer (Block Nested Loop) | | 2 | DERIVED | o | index | NULL | idx_1 | 5 | NULL | 909112 | Using where | +----+-------------+------------+--------+---------------+---------+---------+---+--------+----------------------------------------------------+ 8. Pushing down intermediate result sets Let's look at the following example that has been initially optimized (the main table in the left join takes precedence over the query condition): SELECT a.*, c.allocated FROM ( SELECT resourceid FROM my_distributed WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) a LEFT JOIN ( SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated FROM my_resources GROUP BY resourcesid) c ON a.resourceid = c.resourcesid Are there any other problems with this statement? It is not difficult to see that subquery c is a full-table aggregation query, which will cause the performance of the entire statement to deteriorate when the number of tables is particularly large. In fact, for subquery c, the final result set of the left join only cares about the data that can match the resourceid of the main table. Therefore, we can rewrite the statement as follows, and the execution time is reduced from the original 2 seconds to 2 milliseconds. SELECT a.*, c.allocated FROM ( SELECT resourceid FROM my_distributed WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) a LEFT JOIN ( SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated FROM my_resources r, ( SELECT resourceid FROM my_distributed WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) a WHERE r.resourcesid = a.resourcesid GROUP BY resourcesid) c ON a.resourceid = c.resourcesid But subquery a appears multiple times in our SQL statement. This writing method not only has additional overhead, but also makes the entire statement appear complicated. Rewrite again using the WITH statement: WITH A ( SELECT resourceid FROM my_distributed WHERE isdelete = 0 AND cusmanagercode = '1234567' ORDER BY salecode limit 20) SELECT a.*, c.allocated FROM a LEFT JOIN ( SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated FROM my_resources r, a WHERE r.resourcesid = a.resourcesid GROUP BY resourcesid) c ON a.resourceid = c.resourcesid Summarize The database compiler generates an execution plan, which determines how the SQL is actually executed. But the compiler only does its best, and not all database compilers are perfect. Most of the scenarios mentioned above also have performance issues in other databases. Only by understanding the characteristics of the database compiler can you avoid its shortcomings and write high-performance SQL statements. When programmers design data models and write SQL statements, they should bring in the idea or awareness of algorithms. When writing complex SQL statements, you should develop the habit of using WITH statements. Concise and clear SQL statements can also reduce the burden on the database. Well, that’s all for this article. I hope the content of this article will be of certain reference value to your study or work. Thank you for your support of 123WORDPRESS.COM. You may also be interested in:
|
<<: How to encapsulate WangEditor rich text component in Angular
>>: Implementation of docker redis5.0 cluster cluster construction
This is a website I imitated when I was self-stud...
Table property settings that work well: Copy code ...
I have roughly listed some values to stimulate ...
Call How to call Amap API? The official open docu...
1. Problem During development, when inserting dat...
Download address: https://dev.mysql.com/downloads...
In fact, this is also a clickbait title, and it c...
Table of contents 1. What is Function Anti-shake?...
Table of contents 1. Basic grammar 2. Filter by c...
Table of contents for loop While Loop do-while lo...
How long has it been since I updated my column? H...
【question】 We have an HP server. When the SSD wri...
Meta tag function The META tag is a key tag in th...
The specific method is as follows: CSS Code Copy ...
Introduction: Compared with traditional image ver...