Three strategies for rewriting MySQL query statements

Three strategies for rewriting MySQL query statements

When optimizing problematic queries, we need to change the way we get query results—but this doesn’t mean getting the same result set from MySQL. Sometimes we can transform a query into a form that gets the same results but with better performance. However, we also need to consider rewriting queries to obtain different results, because this can improve development efficiency. You can also modify the application code to achieve the same effect. This article will introduce some tips on how to rewrite queries.

Complex query and step-by-step query

An important query design question is whether it is better to decompose a complex query into multiple simpler queries. Traditional database design emphasizes solving a large amount of work with as few queries as possible. In the past, this would have been better. This is because of the higher cost of network communication in the past and the load on the query parser and optimizer.

However, this advice does not apply much to MySQL, because MySQL handles establishing and disconnecting connections very efficiently and responds quickly to simple queries. Today's network speeds have also increased significantly compared to before. Depending on the server version, MySQL can run more than 100,000 simple queries per second on a normal machine and complete 2,000 query communications per second on a Gigabit network. Therefore, distributing queries is not as bad as it used to be.

Compared to the number of data rows traversed per second, the connection response is still relatively slow. In memory data, this time reaches milliseconds. Of course, using as many queries as possible is still a good choice. However, sometimes we can improve performance by splitting a complex query into several simpler queries. Next we'll show some examples.

Using too many queries is a common mistake in programming. For example, some applications execute 10 separate queries to fetch 10 rows of data (using a loop to fetch them one by one), when this could be accomplished with a single query that fetches 10 rows of data. Therefore, this does not advocate splitting queries every time, but it depends on the actual situation.

Split query statement

Another approach is to split the query and reassemble it. By splitting large data queries into smaller queries to reduce the number of rows affected each time.

Cleaning old data is a classic example. Periodic data cleaning requires removing large amounts of data, which can lock large amounts of data rows for long periods of time. This operation also generates transaction logs, consumes a lot of resources, and can block small queries that should not be interrupted. Splitting the DELETE statements into medium-sized queries can significantly improve performance and reduce the extra latency caused by repeated queries when the queries are repeated. For example, the following delete statement:

DELETE FROM messages WHERE created < DATE_SUB(NOW(), INTERVAL 3 MONTH);

The pseudo code of the application is as follows:

rows_affected = 0
do {
  rows_affected = do_query(
  "DELETE FROM messages WHERE created < DATE_SUB(NOW(), INTERVAL 3 MONTH)
  LIMIT 10000")
  } while rows_affected > 0

Deleting 10,000 rows at a time is a large enough task to make each query efficient. A sufficiently short task will reduce the impact on the server (transactional storage engines benefit from this). It is also a good idea to insert some sleep time in the DELETE statements to spread the load over time and shorten the duration of locks held.

Disassembling joint query

Many high-performance applications break down joint queries. You can split the union query into multiple single-table queries and then combine the results in the application. For example:

SELECT * FROM tag
	JOIN tag_post ON tag_post.tag_id=tag.id
  JOIN post ON tag_post.post_id=post.id
WHERE tag.tag='mysql';

This union query can be broken down into the following parts.

SELECT * FROM tag WHERE tag='mysql';
SELECT * FROM tag_post WHERE tag_id=1234;
SELECT * FROM post WHERE post.id IN (123, 456, 567, 9098, 8904);

Note: tag_id=1234 and post.id IN (123, 456, 567, 9098, 8904) are the values ​​obtained based on the results of the previous query. Why do this? At first glance this seems unnecessary - it just increases the number of queries. However, this reconstruction of the query can bring the following advantages:

  • The caching mechanism will be more effective. Many applications use ORM to map data tables directly. In this example, if the object with tag mysql is already cached, the first query will be skipped. If posts with id 123, 567 or 9908 are in the cache, you can remove them from the IN list. With this strategy, query cache will benefit accordingly. If only one of the tables changes frequently, breaking up the join query can reduce the number of cache invalidations.
  • Executing these queries separately can sometimes reduce the chance of table locks.
  • This way it is easy to scale the database and move tables to different machines.
  • The queries themselves can be optimized. In this example, using an IN query instead of a union query may allow MySQL to sort the row IDs and retrieve the rows more optimally.
  • Can reduce redundant row accesses. Using this approach means that the data row is only retrieved once, while in a joint query it is possible to retrieve the same data repeatedly. For this reason, this decomposition may also reduce overall network load and memory usage.
  • To expand on this, you can also replace the nested loops of MySQL join queries by manually performing hash join queries, which may also be more efficient.

Finally, we can see that by breaking down the union query, the cache can be more reusable, the multi-server distributed data solution can be simpler, and IN queries can be used in large data tables to replace union queries or multiple repeated queries on the same table.

The above are the details of the three strategies for MySQL to rewrite query statements. For more information about MySQL rewriting query statements, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Use of MySQL query rewrite plugin
  • An article to understand the execution process of MySQL query statements
  • Detailed explanation of the idea of ​​using Python sql statements to perform multi-condition fuzzy queries on mysql database
  • Summary of MySQL database like statement wildcard fuzzy query
  • A brief discussion on the problem of passing parameters when using in in pymysql query statements
  • MySQL fuzzy query statement collection
  • Detailed explanation of the process of querying user permissions using mysql statements
  • Summary of MySQL common SQL statements including complex SQL queries
  • In-depth explanation of SQL statement execution (MySQL architecture overview -> query execution process -> SQL parsing order)

<<:  How to configure Http, Https, WS, and WSS in Nginx

>>:  Web design and production test questions and reference answers

Recommend

Common usage of regular expressions in Mysql

Common usage of Regexp in Mysql Fuzzy matching, c...

Analysis and solution of a.getAttribute(href,2) problem in IE6/7

Brief description <br />In IE6 and 7, in a ...

A simple method to merge and remove duplicate MySQL tables

Scenario: The crawled data generates a data table...

How to add a certificate to docker

1. Upgrade process: sudo apt-get update Problems ...

Why MySQL database avoids NULL as much as possible

Many tables in MySQL contain columns that can be ...

Brief analysis of MySQL union and union all

In the database, both UNION and UNION ALL keyword...

Brief analysis of mysql scheduled backup tasks

Introduction In a production environment, in orde...

How to install openssh from source code in centos 7

Environment: CentOS 7.1.1503 Minimum Installation...

Several mistakes that JavaScript beginners often make

Table of contents Preface Confusing undefined and...

JavaScript implements double-ended queue

This article example shares the specific code of ...

Getting Started with MySQL - Concepts

1. What is it? MySQL is the most popular relation...

MySQL InnoDB transaction lock source code analysis

Table of contents 1. Lock and Latch 2. Repeatable...

JavaScript event loop case study

Event loop in js Because JavaScript is single-thr...