Optimizing the slow query of MySQL aggregate statistics data

Optimizing the slow query of MySQL aggregate statistics data

Written in front

When we operate the database in our daily life, such as order table, access record table, and product table.

We often deal with statistical problems such as calculating the sum of data columns and the number of data rows.

As the business develops, these tables will become larger and larger. If not handled properly, the speed of querying statistics will become slower and slower until the business can no longer tolerate it.

Therefore, we need to first understand and think about these scenario knowledge points, and reserve some optimization space to support business development at the beginning of the design.

SQL Aggregate Function

In MySQL and other data, aggregate functions are supported to facilitate data calculation.

The following methods are common

Take the average value AVG()
SUM()
Maximum value MAX()
Minimum value MIN()
COUNT()

Demonstrate several simple SQL statements:

Query the total number of orders with u_id 100

select count(id) from orders where u_id = 100;

Query the total consumption of orders with u_id 100

select sum(order_amount) from orders where u_id = 100;

Find the best-selling products

select max(sell_num) from goods

Statistics on the number and total amount of orders in July

select count(id) as count, sum(order_amount) as total_amount 
from orders where order_date between 20190701 and 20190731 and is_pay = 1

If at this time, the total number of order tables is 100 million. And this SQL statement runs very slowly, how should we troubleshoot and optimize it?

Some students may say: There are many rows, so add an index on the date field, so that the screening will be very fast.

The total number is 100 million. Assuming there are 10 million orders in July, the screening speed will naturally increase a lot after adding the index. But is our problem really solved at this point?

In this kind of aggregate function, the result needs to be calculated by traversing each piece of data. For example, if we want to count the total number of orders, we need to read the order amount of each row and then add them up.

That is to say, in this statistical SQL, you need to first filter 10 million data from 100 million data, and then traverse these data for calculation. It will be very slow at this time.

Adding indexes does not solve the problem of slow aggregate function statistics

Optimizing the aggregation statistics solution

Budget in Advance

Create a statistical data table, separated by date, such as: how many orders were sold on 20190801, the amount and other data.
When an order is generated (the data can be counted after payment is completed), the amount and quantity are added to the corresponding date in the statistical data table.

It should be noted that if there are scenarios such as refunds that will affect the reduction of data, remember to handle them accordingly

When we need to count the data for August, we only need to traverse and calculate the thirty or so rows of data for this month.

Timed landing

We can use easyswoole, scheduled tasks, etc. To calculate the total at a fixed time (for example, every 20 minutes), and then update it to the statistical data table.

Advantages: There is less processing to be done, and there is no need to change the API such as refund operation. It only needs to rely on the data of the original order table, and regularly count and refresh the statistical data.

It should be noted that different delivery frequencies should be set according to the popularity of different orders. For example, if the probability of data changes within a week is relatively large, it may be delivered within 20 minutes. The data from a year ago is unlikely to change, so you can choose to synchronize it once a day, or even not refresh it if you are sure it will not change.

Summarize

Indexes cannot solve the problem of slow SQL statements for statistical aggregation of data

Aggregate functions should be used with caution or not at all, because we cannot estimate how many rows of data we need to scan to calculate the future data volume.

The optimization plan is inseparable from the statistical table, and the statistical data calculated in a certain period need to be stored.

This is the end of this article about the slow query of MySQL aggregate statistical data. For more relevant content about the slow query of MySQL aggregate statistical data, please search for previous articles on 123WORDPRESS.COM or continue to browse the related articles below. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • MySQL and PHP basics and applications: data query
  • A brief discussion on whether too many MySQL data queries will cause OOM
  • Detailed explanation of MySQL database tens of millions of data query and storage
  • MySQL json format data query operation
  • MySQL big data query optimization experience sharing (recommended)
  • MySQL and PHP basics and applications: data query statements

<<:  Detailed explanation of Promises in JavaScript

>>:  How to dynamically modify container port mapping in Docker

Recommend

Specific usage of CSS compound selectors

Intersection Selector The intersection selector i...

How to use DCL to manage users and control permissions in MySQL

DCL (Data Control Language): Data control languag...

Solution to the failure of loading dynamic library when Linux program is running

Unable to load dynamic library under Linux When t...

Docker container operation instructions summary and detailed explanation

1. Create and run a container docker run -it --rm...

Does Mysql ALTER TABLE lock the table when adding fields?

Table of contents Before MySQL 5.6 After MySQL 5....

MySQL Series Database Design Three Paradigm Tutorial Examples

Table of contents 1. Knowledge description of the...

The use of v-model in vue3 components and in-depth explanation

Table of contents Use two-way binding data in v-m...

7 Ways to Write a Vue v-for Loop

Table of contents 1. Always use key in v-for loop...

mysql method to recursively search for all child nodes of a menu node

background There is a requirement in the project ...

3 different ways to clear the option options in the select tag

Method 1 Copy code The code is as follows: documen...

Getting Started with CSS3 Animation in 10 Minutes

Introduction Animation allows you to easily imple...

MySQL SQL statement analysis and query optimization detailed explanation

How to obtain SQL statements with performance iss...

Use h1, h2, and h3 tags appropriately

In the process of making web pages, it is inevita...

Summary of several MySQL installation methods and configuration issues

1. MySQL rpm package installation # Download the ...