Eight common SQL usage examples in MySQL

Eight common SQL usage examples in MySQL

Preface

MySQL continued to maintain its strong growth trend in database popularity in 2016. More and more customers are building their applications on MySQL databases, or even migrating from Oracle to MySQL. However, some customers encounter problems such as slow response time and full CPU when using MySQL database.

Alibaba Cloud RDS expert service team has helped cloud customers solve many urgent problems. Some common SQL problems that appear in the "ApsaraDB Expert Diagnosis Report" are summarized below for your reference.

1. LIMIT statement

Pagination query is one of the most commonly used scenarios, but it is also usually the place where problems are most likely to occur.

For example, for the following simple statement, the general DBA's solution is to add a composite index on the type, name, and create_time fields. In this way, conditional sorting can effectively utilize the index and quickly improve performance.

SELECT * 
FROM operation 
WHERE type = 'SQLStats' 
  AND name = 'SlowLog' 
ORDER BY create_time 
LIMIT 1000, 10;

Well, probably more than 90% of DBAs would stop here to solve this problem.

But when the LIMIT clause becomes "LIMIT 1000000,10", programmers will still complain: Why is it still slow when I only take 10 records?

You should know that the database does not know where the 1,000,000th record starts, and even if there is an index, it needs to be calculated from scratch. When this kind of performance problem occurs, in most cases it is because the programmer is lazy.

In scenarios such as front-end data browsing and paging, or large data batch export, the maximum value of the previous page can be used as a parameter as a query condition. The SQL is redesigned as follows:

SELECT * 
FROM operation 
WHERE type = 'SQLStats' 
AND name = 'SlowLog' 
AND create_time > '2017-03-16 14:00:00' 
ORDER BY create_time limit 10;

Under the new design, query time is basically fixed and will not change as the amount of data grows.

2. Implicit conversion

Mismatch between query variables and field definition types in SQL statements is another common error. For example, the following statement:

mysql> explain extended SELECT * 
  > FROM my_balance b 
  > WHERE b.bpn = 14000000123 
  > AND b.isverified IS NULL ;
mysql> show warnings;
| Warning | 1739 | Cannot use ref access on index 'bpn' due to type or collation conversion on field 'bpn'

The field bpn is defined as varchar(20), and MySQL's strategy is to convert the string into a number before comparison. The function acts on the table field and the index becomes invalid.

The above situation may be parameters automatically filled in by the application framework, rather than the original intention of the programmer. There are many complex application frameworks nowadays. While they are convenient to use, you should also be careful of the pitfalls they may cause you.

3. Update and delete associations

Although MySQL 5.6 introduced the materialization feature, it is important to note that it currently only optimizes query statements. Updates or deletions need to be manually rewritten into JOINs.

For example, in the following UPDATE statement, MySQL actually executes a loop/nested subquery (DEPENDENT SUBQUERY), and its execution time can be imagined.

UPDATE operation o 
SET status = 'applying' 
WHERE o.id IN (SELECT id 
    FROM (SELECT o.id, 
        o.status 
      FROM operation o 
      WHERE o.group = 123 
        AND o.status NOT IN ( 'done' ) 
      ORDER BY o.parent, 
         o.id 
      LIMIT 1) t);

Execution plan:

+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
| 1 | PRIMARY | o | index | | PRIMARY | 8 | | 24 | Using where; Using temporary |
| 2 | DEPENDENT SUBQUERY | | | | | | | | | Impossible WHERE noticed after reading const tables |
| 3 | DERIVED | o | ref | idx_2,idx_5 | idx_5 | 8 | const | 1 | Using where; Using filesort |
+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+

After being rewritten as JOIN, the subquery selection mode changes from DEPENDENT SUBQUERY to DERIVED, and the execution speed is greatly accelerated, from 7 seconds to 2 milliseconds.

UPDATE operation o 
  JOIN (SELECT o.id, 
       o.status 
      FROM operation o 
      WHERE o.group = 123 
       AND o.status NOT IN ( 'done' ) 
      ORDER BY o.parent, 
        o.id 
      LIMIT 1) t
   ON o.id = t.id 
SET status = 'applying'

The execution plan is simplified to

+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
| 1 | PRIMARY | | | | | | | | | Impossible WHERE noticed after reading const tables |
| 2 | DERIVED | o | ref | idx_2,idx_5 | idx_5 | 8 | const | 1 | Using where; Using filesort |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+

4. Mixed sorting

MySQL cannot use indexes for mixed sorting. But in some scenarios, there are still opportunities to use special methods to improve performance.

SELECT * 
FROM my_order o 
  INNER JOIN my_appraise a ON a.orderid = o.id 
ORDER BY a.is_reply ASC, 
   a.appraise_time DESC 
LIMIT 0, 20

The execution plan shows a full table scan:

+----+-------------+-------+--------+-------------+---------+-----------------------+---------+-+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra 
+----+-------------+-------+--------+-------------+---------+-----------------------+---------+-+
| 1 | SIMPLE | a | ALL | idx_orderid | NULL | NULL | NULL | 1967647 | Using filesort |
| 1 | SIMPLE | o | eq_ref | PRIMARY | PRIMARY | 122 | a.orderid | 1 | NULL |
+----+-------------+-------+--------+---------+---------+---------+----------------+---------+-+

Since is_reply has only two states, 0 and 1, we rewrite it as follows, and the execution time is reduced from 1.58 seconds to 2 milliseconds.

SELECT * 
FROM ((SELECT *
   FROM my_order o 
    INNER JOIN my_appraise a 
      ON a.orderid = o.id 
       AND is_reply = 0 
   ORDER BY appraise_time DESC 
   LIMIT 0, 20) 
  UNION ALL 
  (SELECT *
   FROM my_order o 
    INNER JOIN my_appraise a 
      ON a.orderid = o.id 
       AND is_reply = 1 
   ORDER BY appraise_time DESC 
   LIMIT 0, 20)) t 
ORDER BY is_reply ASC, 
   appraisetime DESC 
LIMIT 20;

5. EXISTS Statement

When MySQL treats the EXISTS clause, it still uses the nested subquery execution method. Such as the following SQL statement:

SELECT *
FROM my_neighbor n 
  LEFT JOIN my_neighbor_apply sra 
    ON n.id = sra.neighbor_id 
     AND sra.user_id = 'xxx' 
WHERE n.topic_status < 4 
  AND EXISTS(SELECT 1 
     FROM message_info m 
     WHERE n.id = m.neighbor_id 
       AND m.inuser = 'xxx') 
  AND n.topic_type <> 5

The execution plan is:

+----+--------------------+-------+------+-----+------------------------------------------+---------+-------+---------+ -----+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+
| 1 | PRIMARY | n | ALL | | NULL | NULL | NULL | 1086041 | Using where |
| 1 | PRIMARY | sra | ref | | idx_user_id | 123 | const | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | m | ref | | idx_message_info | 122 | const | 1 | Using index condition; Using where |
+----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+

Removing exists and changing it to join can avoid nested subqueries and reduce the execution time from 1.93 seconds to 1 millisecond.

SELECT *
FROM my_neighbor n 
  INNER JOIN message_info m 
    ON n.id = m.neighbor_id 
     AND m.inuser = 'xxx' 
  LEFT JOIN my_neighbor_apply sra 
    ON n.id = sra.neighbor_id 
     AND sra.user_id = 'xxx' 
WHERE n.topic_status < 4 
  AND n.topic_type <> 5

New execution plan:

+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
| 1 | SIMPLE | m | ref | | idx_message_info | 122 | const | 1 | Using index condition |
| 1 | SIMPLE | n | eq_ref | | PRIMARY | 122 | ighbor_id | 1 | Using where |
| 1 | SIMPLE | sra | ref | | idx_user_id | 123 | const | 1 | Using where |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+

6. Conditional Pushdown

The cases where the outer query conditions cannot be pushed down to complex views or subqueries include:

  • Aggregate subquery;
  • Subqueries containing LIMIT;
  • UNION or UNION ALL subquery;
  • Subqueries in output fields;

As shown in the following statement, the execution plan shows that the condition acts after the aggregate subquery:

SELECT * 
FROM (SELECT target, 
    Count(*) 
  FROM operation 
  GROUP BY target) t 
WHERE target = 'rm-xxxx'
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
|
1
| PRIMARY |
 <derived2> 
| ref |
 <auto_key
0
> 
| <auto_key0> |
514
| const |
2
| Using where |
| 2 | DERIVED | operation | index | idx_4 | idx_4 | 519 | NULL | 20 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+

After determining that the query conditions can be directly pushed down from a semantic point of view, rewrite it as follows:

SELECT target, 
  Count(*) 
FROM operation 
WHERE target = 'rm-xxxx' 
GROUP BY target

The execution plan becomes:

+----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+
| 1 | SIMPLE | operation | ref | idx_4 | idx_4 | 514 | const | 1 | Using where; Using index |
+----+-------------+-----------+------+---------------+-------+---------+-------+------+--------------------+

For a detailed explanation of why MySQL external conditions cannot be pushed down, please refer to the article: http://mysql.taobao.org/monthly/2016/07/08

7. Narrow down the scope in advance

First, the initial SQL statement:

SELECT * 
FROM my_order o 
    LEFT JOIN my_userinfo u 
       ON o.uid = u.uid
    LEFT JOIN my_productinfo p 
       ON o.pid = p.pid 
WHERE ( o.display = 0 ) 
    AND ( o.ostaus = 1 ) 
ORDER BY o.selltime DESC 
LIMIT 0, 15

The original intention of this SQL statement is: first do a series of left joins, and then sort and take the first 15 records. It can also be seen from the execution plan that the estimated number of sorted records in the last step is 900,000 and the time consumption is 12 seconds.

+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+----------------------------------------------------+
| 1 | SIMPLE | o | ALL | NULL | NULL | NULL | NULL | 909119 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | o.uid | 1 | NULL |
| 1 | SIMPLE | p | ALL | PRIMARY | NULL | NULL | NULL | 6 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+----------------------------------------------------+

Since the final WHERE condition and sorting are both for the leftmost primary table, you can sort my_order in advance to reduce the amount of data before doing a left join. After SQL is rewritten, the execution time is reduced to about 1 millisecond.

SELECT * 
FROM (
SELECT * 
FROM my_order o 
WHERE ( o.display = 0 ) 
    AND ( o.ostaus = 1 ) 
ORDER BY o.selltime DESC 
LIMIT 0, 15
) 
   LEFT JOIN my_userinfo u 
       ON o.uid = u.uid 
   LEFT JOIN my_productinfo p 
       ON o.pid = p.pid 
ORDER BY o.selltime DESC
limit 0, 15

Check the execution plan again: the subquery participates in JOIN after being materialized (select_type=DERIVED). Although the estimated number of rows to be scanned is still 900,000, the actual execution time becomes very small after using the index and the LIMIT clause.

+----+-------------+------------+--------+---------------+---------+---------+---+--------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+---+--------+----------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 15 | Using temporary; Using filesort |
| 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | o.uid | 1 | NULL |
| 1 | PRIMARY | p | ALL | PRIMARY | NULL | NULL | NULL | 6 | Using where; Using join buffer (Block Nested Loop) |
| 2 | DERIVED | o | index | NULL | idx_1 | 5 | NULL | 909112 | Using where |
+----+-------------+------------+--------+---------------+---------+---------+---+--------+----------------------------------------------------+

8. Pushing down intermediate result sets

Let's look at the following example that has been initially optimized (the main table in the left join takes precedence over the query condition):

SELECT a.*, 
     c.allocated 
FROM ( 
       SELECT resourceid 
       FROM my_distributed 
          WHERE isdelete = 0 
          AND cusmanagercode = '1234567' 
          ORDER BY salecode limit 20) a 
LEFT JOIN 
     ( 
       SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated 
       FROM my_resources 
          GROUP BY resourcesid) c 
ON a.resourceid = c.resourcesid

Are there any other problems with this statement? It is not difficult to see that subquery c is a full-table aggregation query, which will cause the performance of the entire statement to deteriorate when the number of tables is particularly large.

In fact, for subquery c, the final result set of the left join only cares about the data that can match the resourceid of the main table. Therefore, we can rewrite the statement as follows, and the execution time is reduced from the original 2 seconds to 2 milliseconds.

SELECT a.*, 
     c.allocated 
FROM ( 
          SELECT resourceid 
          FROM my_distributed 
          WHERE isdelete = 0 
          AND cusmanagercode = '1234567' 
          ORDER BY salecode limit 20) a 
LEFT JOIN 
     ( 
          SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated 
          FROM my_resources r, 
              ( 
                   SELECT resourceid 
                   FROM my_distributed 
                   WHERE isdelete = 0 
                   AND cusmanagercode = '1234567' 
                   ORDER BY salecode limit 20) a 
          WHERE r.resourcesid = a.resourcesid 
          GROUP BY resourcesid) c 
ON a.resourceid = c.resourcesid

But subquery a appears multiple times in our SQL statement. This writing method not only has additional overhead, but also makes the entire statement appear complicated. Rewrite again using the WITH statement:

WITH A 
( 
     SELECT resourceid 
     FROM my_distributed 
     WHERE isdelete = 0 
     AND cusmanagercode = '1234567' 
     ORDER BY salecode limit 20)
SELECT a.*, 
     c.allocated 
FROM a 
LEFT JOIN 
     ( 
          SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated 
          FROM my_resources r, 
              a 
          WHERE r.resourcesid = a.resourcesid 
          GROUP BY resourcesid) c 
ON a.resourceid = c.resourcesid

Summarize

The database compiler generates an execution plan, which determines how the SQL is actually executed. But the compiler only does its best, and not all database compilers are perfect.

Most of the scenarios mentioned above also have performance issues in other databases. Only by understanding the characteristics of the database compiler can you avoid its shortcomings and write high-performance SQL statements.

When programmers design data models and write SQL statements, they should bring in the idea or awareness of algorithms.

When writing complex SQL statements, you should develop the habit of using WITH statements. Concise and clear SQL statements can also reduce the burden on the database.

Well, that’s all for this article. I hope the content of this article will be of certain reference value to your study or work. Thank you for your support of 123WORDPRESS.COM.

You may also be interested in:
  • Solving problems encountered when importing and exporting Mysql
  • MySql quick insert tens of millions of large data examples
  • Common causes and solutions for slow MySQL SQL statements

<<:  How to encapsulate WangEditor rich text component in Angular

>>:  Implementation of docker redis5.0 cluster cluster construction

Recommend

Vue implements a small weather forecast application

This is a website I imitated when I was self-stud...

Table related arrangement and Javascript operation table, tr, td

Table property settings that work well: Copy code ...

Discuss the value of Web standards from four aspects with a mind map

I have roughly listed some values ​​to stimulate ...

Tutorial on installing mysql5.7.36 database in Linux environment

Download address: https://dev.mysql.com/downloads...

5 cool and practical HTML tags and attributes introduction

In fact, this is also a clickbait title, and it c...

Implementation and usage scenarios of JS anti-shake throttling function

Table of contents 1. What is Function Anti-shake?...

Essential conditional query statements for MySQL database

Table of contents 1. Basic grammar 2. Filter by c...

Looping methods and various traversal methods in js

Table of contents for loop While Loop do-while lo...

Detailed explanation of CSS animation attribute keyframes

How long has it been since I updated my column? H...

Detailed analysis and testing of SSD performance issues in MySQL servers

【question】 We have an HP server. When the SSD wri...

Summary of the use of html meta tags (recommended)

Meta tag function The META tag is a key tag in th...

Solution for mobile browsers not supporting position: fix

The specific method is as follows: CSS Code Copy ...

How to use Tencent slider verification code in Vue3+Vue-cli4 project

Introduction: Compared with traditional image ver...