How to analyze MySQL query performance

Query optimization, index optimization, and table design optimization are closely linked. If you have extensive experience writing MySQL queries, you will know how to design tables and indexes that support efficient queries; likewise, knowing how a table is designed helps you understand how its structure affects your queries. Even so, good table and index design cannot save a poorly written query: if the query statement itself is bad, performance will still be poor.

Before trying to write fast queries, remember that speed is measured in terms of response time. A query is a task made up of multiple subtasks, and each subtask consumes time. To optimize a query, we need to eliminate as many subtasks as possible or make them execute faster. Note: sometimes we also need to consider a query's impact on other queries in the system, in which case we should also reduce its resource consumption as much as possible. In general, you can think of the query lifecycle as the whole interaction sequence from client to server: parsing the statement, planning the query, executing it, and returning data to the client. Execution is the most important stage; it involves a large number of calls to the storage engine to retrieve rows, plus post-processing of the data such as grouping and sorting.

Along the way, the query also spends time on network transmission, CPU processing, gathering statistics and planning, waiting for locks, and fetching rows from the storage engine. These operations consume memory, CPU, and I/O. In each case, if an operation is performed needlessly, performed too many times, or performed too slowly, extra time is incurred. The goal of query optimization is to avoid these situations, either by eliminating or reducing the operations or by making them run faster.

Note that we cannot draw an exact diagram of the query lifecycle here; the point is to understand the lifecycle and to think about how much time each of these steps consumes. With that foundation, you can start optimizing query statements.
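One rough way to see where a statement spends its time is MySQL's session profiling feature (deprecated in favor of the Performance Schema, but still available). A minimal sketch, assuming the sakila sample database is installed:

SET profiling = 1;
SELECT COUNT(*) FROM sakila.film_actor;
SHOW PROFILES;             -- recent statements with their total duration
SHOW PROFILE FOR QUERY 1;  -- per-stage timing: parsing, optimizing, sending data, and so on
                           -- (the number 1 is illustrative; use the Query_ID reported by SHOW PROFILES)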

Slow query basics: optimizing data acquisition

The most fundamental reason for poor query performance is processing too much data. Some queries genuinely must sift through large amounts of data and cannot be helped, but that is the exception. Most bad queries can be improved by accessing less data. The following two steps are useful for analyzing poorly performing queries:

  1. Find out whether the application is retrieving more data than it needs. Usually this means it is fetching too many rows or too many columns.
  2. Find out whether the MySQL server is examining more rows than necessary.

Check if unnecessary data is being requested from the database

Some queries ask the database server for more data than they need and then discard part of it. This adds work for the MySQL server, increases network load, and consumes more memory and CPU on the application server. Here are some typical mistakes:

  1. Fetching unnecessary rows: A common misconception is to assume that MySQL returns only the results you actually use, rather than computing and returning the entire result set. This mistake is common among developers familiar with other database systems, who are used to issuing a SELECT that returns many rows, reading the first N rows, and then discarding the rest of the result set (for example, fetching the 100 most recent articles for a news site but displaying only 10 of them on the front page). They assume MySQL will stop after producing 10 rows, but MySQL actually generates the complete result set; the client then fetches all of the data and discards most of it. The best solution is to add a LIMIT clause to the query (see the sketch at the end of this list).
  2. Fetching all columns in a multi-table join: If you only need the actors who appear in the film Academy Dinosaur, don't write the SQL statement like this:
SELECT * FROM sakila.actor
INNER JOIN sakila.film_actor USING(actor_id)
INNER JOIN sakila.film USING(film_id)
WHERE sakila.film.title = 'Academy Dinosaur';

This returns all columns from the three tables involved in the join. A better approach is to write it like this:

SELECT sakila.actor.* FROM sakila.actor
INNER JOIN sakila.film_actor USING(actor_id)
INNER JOIN sakila.film USING(film_id)
WHERE sakila.film.title = 'Academy Dinosaur';
  3. Fetching all columns: When you see a query like SELECT *, be skeptical: do you really need every column? Most likely not. Fetching all columns defeats covering indexes and increases I/O, memory consumption, and CPU load. Some DBAs simply forbid SELECT * for this reason, and it also reduces the problems caused when someone later modifies the table's columns. Of course, requesting unneeded data isn't always bad: it can simplify development and improve code reuse, and as long as you know it affects performance, that can be a valid trade-off. Similarly, if the application uses a caching layer, the cache hit rate can improve; in that case, fetching and caching complete objects may be preferable to running many separate queries that each retrieve only part of the object.
  4. Fetching the same data repeatedly: If you're not careful, it's easy to write application code that fetches the same data over and over. For example, to display a user's profile picture next to each of their comments, you might fetch it once per comment. A more efficient approach is to fetch it once, cache it, and reuse it for the rest of the comment list.
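As a concrete sketch of the first mistake above, using a hypothetical articles table (the table and column names are made up for illustration), a LIMIT clause keeps the server from producing rows the page will never show:

-- Returns the full result set; the application then discards all but the first 10 rows
SELECT * FROM articles ORDER BY published_at DESC;

-- Returns only the 10 rows the page needs; with an index on published_at,
-- MySQL can also stop reading rows early instead of building the whole result
SELECT * FROM articles ORDER BY published_at DESC LIMIT 10;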

Check if MySQL is processing too much data

Once you've made sure that your queries aren't fetching unnecessary data, you can look for queries that process too much data before returning results. In MySQL, the simplest metrics of query cost are:

  1. Response time
  2. The number of rows processed
  3. The number of rows returned

None of these metrics is a perfect measure of query performance, but together they give a rough idea of how much data MySQL touches internally while executing a query and roughly how fast it runs. All three are recorded in the slow query log, so reviewing the slow query log for queries that process excessive amounts of data is one of the best starting points for query optimization.
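If the slow query log is not already enabled, it can be switched on at runtime. A minimal sketch (the one-second threshold is an illustrative value, not a recommendation):

SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;         -- threshold in seconds; applies to new connections
SHOW VARIABLES LIKE 'slow_query_log%';  -- confirm the setting and the log file location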

Response time

First, note that the response time we observe is only a symptom; in reality it is more complicated than it looks. Response time has two components: service time and queue time. Service time is the time the server spends actually processing the query. Queue time is the time the server spends not executing the query because it is waiting for some resource, such as an I/O operation to complete or a row lock to be released. The problem is that you cannot accurately split response time into these two parts unless you can measure each of them separately, which is difficult to do. The most common and important waits are I/O and locks, but there are others as well.

As a result, response time is not constant under different loads. Other factors, such as storage engine locks, high concurrency, and hardware, also affect it. So when you examine a response time, first determine whether it is reasonable for the query itself. A useful tool here is the Query Quick Upper Bound Estimate (QUBE) method: examine the query plan and the indexes it uses, work out how many sequential and random I/O operations are required, and multiply by the time the hardware needs for each kind of operation. Adding these up gives an estimate you can compare against the observed response time, which helps you judge whether a slow response is caused by the query itself or by something else.
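As a rough, hypothetical illustration of QUBE (the timings below are assumptions, not measurements): suppose a plan needs one random I/O on a spinning disk to reach the index, at roughly 10 ms, plus about 100 sequential row reads from memory at roughly 0.001 ms each. The upper-bound estimate is about 10 ms + 0.1 ms, or roughly 10 ms. If the observed response time is 500 ms, most of that time is probably queueing for locks or I/O rather than the work of the query itself.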

Rows processed and rows returned

When analyzing a query, it is useful to think about the number of rows it processes, because this gives an intuitive sense of how the query obtains the data we need. However, it is not a perfect tool for finding bad queries, because not all row accesses are equal: accessing fewer rows is faster, and fetching rows from memory is much faster than fetching them from disk.

Ideally, the number of rows processed would equal the number of rows returned, but in practice this is rarely the case. For example, when a join is used to build each returned row, the server must read several rows to generate one row of output. The ratio of rows processed to rows returned is usually small, between 1:1 and 10:1, but it can sometimes be orders of magnitude larger.

Data row processing and retrieval types

When thinking about the cost of a query, consider the cost of fetching a single row from a table. MySQL uses several retrieval methods to find and return a row; some need to examine many rows, while others can produce the result without examining any rows at all.

The retrieval method appears in the type column of the EXPLAIN output. The types range from a full table scan through index scans, range scans, and unique index lookups to constant references. Each of these is faster than the one before it, because the amount of data that must be read decreases at each step. You don't need to memorize every retrieval type, but you should understand the basic idea.

If there is no good retrieval type, the best way to solve the problem is to add a suitable index. Indexes enable MySQL to examine less data and thus retrieve rows more efficiently. For example, take the following simple query:

EXPLAIN SELECT * FROM sakila.film_actor WHERE film_id=1;

This query returns 10 rows of data, and EXPLAIN shows that MySQL uses the ref retrieval type on the idx_fk_film_id index:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: film_actor
type: ref
possible_keys: idx_fk_film_id
key: idx_fk_film_id
key_len: 2
ref: const
rows: 10
Extra:

The EXPLAIN output shows that MySQL estimates it needs to examine only 10 rows to complete the query; in other words, the optimizer knows how to pick a retrieval type that makes the query efficient. What happens if there is no suitable index? MySQL has to fall back on a worse retrieval type, which we can see by dropping the index and running EXPLAIN again:

ALTER TABLE sakila.film_actor DROP FOREIGN KEY fk_film_actor_film;
ALTER TABLE sakila.film_actor DROP KEY idx_fk_film_id;
EXPLAIN SELECT * FROM sakila.film_actor WHERE film_id=1;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: film_actor
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5073
Extra: Using where

As expected, the retrieval type becomes a full table scan (ALL), and MySQL estimates that it must examine 5073 rows to complete the query. Using where in the Extra column indicates that the MySQL server applies the WHERE condition to discard rows read from the storage engine that do not match. In general, MySQL can apply a WHERE condition in three ways, from best to worst:

  1. Apply the condition during the index lookup itself, eliminating nonmatching rows; this happens at the storage engine layer.
  2. Use a covering index (shown as Using index in the Extra column) to avoid touching table rows, filtering out nonmatching data after reading the index entries; this happens at the server layer but does not require reading rows from the table (a sketch follows this list).
  3. Read the rows from the table and then filter out the nonmatching ones (shown as Using where in the Extra column); this happens at the server layer and requires reading rows from the table before they can be filtered.
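As a small illustration of the covering index case, here is a sketch assuming the sakila schema with the idx_fk_film_id index back in place. Because InnoDB secondary index entries also carry the primary key columns, actor_id can be returned from the index alone:

-- The condition is resolved in the index and no table rows need to be read,
-- so EXPLAIN should report "Using index" in the Extra column
EXPLAIN SELECT actor_id FROM sakila.film_actor WHERE film_id = 1;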

The example above demonstrates the importance of good indexes: they let MySQL use good retrieval types and examine only the rows it needs. However, adding an index does not always mean that the number of rows MySQL examines will match the number it returns. Consider, for example, the following COUNT() aggregate query.

SELECT actor_id, COUNT(*) FROM sakila.film_actor GROUP BY actor_id;

This query returns only 200 rows, but it must read thousands of rows to build that result set. An index cannot reduce the number of rows that need to be examined for a query like this.
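One way to see how many rows the storage engine actually read (as opposed to how many were returned) is to compare the session's handler counters before and after running the statement. A minimal sketch:

FLUSH STATUS;
SELECT actor_id, COUNT(*) FROM sakila.film_actor GROUP BY actor_id;
SHOW SESSION STATUS LIKE 'Handler_read%';  -- roughly how many row read requests the query made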

Unfortunately, MySQL does not tell you how many of the examined rows were actually used to build the result set; it only reports the total number of rows examined. Many of those rows are filtered out by the WHERE condition and contribute nothing to the result. In the earlier example, after the sakila.film_actor index was dropped, the query read every row in the table but kept only 10 of them for the result set. Understanding how many rows the server examined versus how many it returned helps you understand the query itself. If a query examines a large number of rows but returns only a few, you can try the following:

  1. Use covering indexes, which save the storage engine from fetching full rows because the needed columns come straight from the index.
  2. Change the schema; for example, build a summary table for statistical queries (see the sketch after this list).
  3. Rewrite complex queries so that the MySQL optimizer can execute them in a more optimal way.
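As an example of the second suggestion, a summary table for the actor film counts above might look like the following sketch. The table name is made up for illustration, and in practice you would also need to decide how and how often to refresh it:

-- Hypothetical summary table holding one precomputed count per actor
CREATE TABLE sakila.actor_film_count (
  actor_id SMALLINT UNSIGNED NOT NULL PRIMARY KEY,
  film_count INT UNSIGNED NOT NULL
);

-- Populate (or periodically rebuild) the summary from the detail table
INSERT INTO sakila.actor_film_count (actor_id, film_count)
SELECT actor_id, COUNT(*) FROM sakila.film_actor GROUP BY actor_id;

-- Reads now touch about 200 summary rows instead of thousands of detail rows
SELECT actor_id, film_count FROM sakila.actor_film_count;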

This concludes our look at how to analyze MySQL query performance.
