Detailed explanation of how to query the difference between each hour's data and the previous hour's data in MySQL

1. Introduction

The requirement is to get, for a given time range, the difference and the ratio between each hour's row count and the previous hour's row count. I thought this would be a very simple sql query, but after thinking about it for two minutes I found it was not. I couldn't find a reference solution on the Internet, so I had to work it out slowly by myself.

At first I had no idea, so I asked a DBA colleague. He said he didn't know how to do it either and told me to write a php script to do the calculation. That seemed like overkill: I only wanted to check some data as a one-off, and I didn't believe it couldn't be done directly in sql. Okay, let's try it and figure it out as we go.

The method I ended up with is a clumsy one. If you have a simpler way, please feel free to share it. The comment section is waiting for you!

MySQL version (the server is actually MariaDB):

mysql> select version();
+---------------------+
| version()           |
+---------------------+
| 10.0.22-MariaDB-log |
+---------------------+
1 row in set (0.00 sec)

2. Query the difference between each hour and the previous hour

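1. Split the requirement

First run each part as a separate query to see what the data looks like; this makes it easier to combine them later.

(1) Get the row count for each hour

For ease of display the hours are merged here: the format specifier %h is the 12-hour clock, so only hours 01 to 12 appear, and each row combines the morning and afternoon hour with the same number (midnight shows up as 12). This is intentional, not a bug, but use %H (24-hour clock) if you need 24 separate hourly rows.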

select count(*) as nums,date_format(log_time,'%Y-%m-%d %h') as days from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days;
+-------+---------------+
| nums  | days          |
+-------+---------------+
| 15442 | 2020-04-19 01 |
| 15230 | 2020-04-19 02 |
| 14654 | 2020-04-19 03 |
| 14933 | 2020-04-19 04 |
| 14768 | 2020-04-19 05 |
| 15390 | 2020-04-19 06 |
| 15611 | 2020-04-19 07 |
| 15659 | 2020-04-19 08 |
| 15398 | 2020-04-19 09 |
| 15207 | 2020-04-19 10 |
| 14860 | 2020-04-19 11 |
| 15114 | 2020-04-19 12 |
+-------+---------------+
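
To make the effect of %h concrete, here is a minimal sketch that compares it with %H on literal timestamps, followed by a 24-hour variant of the query above. The literal values are made up for illustration, and the half-open range (< instead of <=) is my own adjustment rather than part of the original query; it keeps a row stamped exactly 2020-04-20 00:00:00 out of the result.

```sql
-- %h is the 12-hour clock (01..12); %H is the 24-hour clock (00..23).
select date_format('2020-04-19 01:30:00', '%Y-%m-%d %h') as am_12h,        -- 2020-04-19 01
       date_format('2020-04-19 13:30:00', '%Y-%m-%d %h') as pm_12h,        -- 2020-04-19 01 (same label as 01:30)
       date_format('2020-04-19 00:30:00', '%Y-%m-%d %h') as midnight_12h,  -- 2020-04-19 12
       date_format('2020-04-19 13:30:00', '%Y-%m-%d %H') as pm_24h;        -- 2020-04-19 13

-- 24-hour variant of the hourly count: up to 24 rows per day instead of 12.
select count(*) as nums,
       date_format(log_time, '%Y-%m-%d %H') as days
from test
where log_time >= '2020-04-19 00:00:00'
  and log_time <  '2020-04-20 00:00:00'
group by days;
```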

(2) Get the row count for the previous hour

select count(*) as nums1,date_format(date_sub(date_format(log_time,'%Y-%m-%d %h'),interval -1 hour),'%Y-%m-%d %h') as days from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days;
+-------+---------------+
| nums1 | days          |
+-------+---------------+
| 15114 | 2020-04-19 01 |
| 15442 | 2020-04-19 02 |
| 15230 | 2020-04-19 03 |
| 14654 | 2020-04-19 04 |
| 14933 | 2020-04-19 05 |
| 14768 | 2020-04-19 06 |
| 15390 | 2020-04-19 07 |
| 15611 | 2020-04-19 08 |
| 15659 | 2020-04-19 09 |
| 15398 | 2020-04-19 10 |
| 15207 | 2020-04-19 11 |
| 14860 | 2020-04-19 12 |
+-------+---------------+

Note:

1) The previous hour's count is obtained by shifting the hour label with the date_sub() function. Be careful with the direction: date_sub(date, interval -1 hour) subtracts a negative interval, so it returns the hour after date, not the hour before. Each hour's count is therefore labelled with the following hour, which means that under any label H the value nums1 is the count of hour H - 1. For details, refer to the manual: https://www.w3school.com.cn/sql/func_date_sub.asp
2) The outermost date_format keeps the format consistent with the query above. Without it, the date comes back as 2020-04-19 04:00:00, which is inconvenient to compare.
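
The direction of the shift is easy to get backwards, so here is a small sketch on a literal timestamp. It only illustrates how date_sub() treats a negative interval; the literal values are made up for the example.

```sql
-- Subtracting a negative interval moves the timestamp forward.
select date_sub('2020-04-19 04:00:00', interval -1 hour) as shifted_forward,  -- 2020-04-19 05:00:00
       date_add('2020-04-19 04:00:00', interval 1 hour)  as same_result,      -- 2020-04-19 05:00:00
       date_sub('2020-04-19 04:00:00', interval 1 hour)  as shifted_back;     -- 2020-04-19 03:00:00
```

Labelling the count of hour 04 as 05 is what lets a later comparison on equal labels line up hour 05 with the count of hour 04.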

2. Put the two result sets together and take a look

select nums ,nums1,days,days1 
from 
(select count(*) as nums,date_format(log_time,'%Y-%m-%d %h') as days from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days) as m,
(select count(*) as nums1,date_format(date_sub(date_format(log_time,'%Y-%m-%d %h'),interval -1 hour),'%Y-%m-%d %h') as days1 from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days1) as n;

+-------+-------+---------------+---------------+
| nums  | nums1 | days          | days1         |
+-------+-------+---------------+---------------+
| 15442 | 15114 | 2020-04-19 01 | 2020-04-19 01 |
| 15442 | 15442 | 2020-04-19 01 | 2020-04-19 02 |
| 15442 | 15230 | 2020-04-19 01 | 2020-04-19 03 |
| 15442 | 14654 | 2020-04-19 01 | 2020-04-19 04 |
| 15442 | 14933 | 2020-04-19 01 | 2020-04-19 05 |
| 15442 | 14768 | 2020-04-19 01 | 2020-04-19 06 |
| 15442 | 15390 | 2020-04-19 01 | 2020-04-19 07 |
| 15442 | 15611 | 2020-04-19 01 | 2020-04-19 08 |
| 15442 | 15659 | 2020-04-19 01 | 2020-04-19 09 |
| 15442 | 15398 | 2020-04-19 01 | 2020-04-19 10 |
| 15442 | 15207 | 2020-04-19 01 | 2020-04-19 11 |
| 15442 | 14860 | 2020-04-19 01 | 2020-04-19 12 |
| 15230 | 15114 | 2020-04-19 02 | 2020-04-19 01 |
| 15230 | 15442 | 2020-04-19 02 | 2020-04-19 02 |
| 15230 | 15230 | 2020-04-19 02 | 2020-04-19 03 |
...

This combination behaves like a nested loop in a program, with nums as the outer loop and nums1 as the inner loop: because the two subqueries are listed without any join condition, each value of nums is paired with every value of nums1. It is similar to this php program:

foreach($arr as $k=>$v){
    foreach($arr1 as $k1=>$v1){
        // every $v is paired with every $v1
    }
}

In that case, can we do what we usually do in a program: find the matching entries in the two loops and then calculate the difference? The dates in the two result sets have exactly the same format, so they can be used as the matching condition.
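
The matching described here can also be written as a join condition, so that only the pairs with equal dates are returned. This is a sketch of that alternative, not the approach the article takes next. It reuses the table test and the column log_time from the queries above, keeps the article's %h format, and writes the forward shift as date_add(..., interval 1 hour), which is equivalent to date_sub(..., interval -1 hour).

```sql
select m.days, m.nums, n.nums1
from
  (select count(*) as nums,
          date_format(log_time, '%Y-%m-%d %h') as days
   from test
   where log_time >= '2020-04-19 00:00:00'
     and log_time <= '2020-04-20 00:00:00'
   group by days) as m
inner join
  (select count(*) as nums1,
          date_format(
            date_add(date_format(log_time, '%Y-%m-%d %h'), interval 1 hour),
            '%Y-%m-%d %h') as days1
   from test
   where log_time >= '2020-04-19 00:00:00'
     and log_time <= '2020-04-20 00:00:00'
   group by days1) as n
  on m.days = n.days1   -- keep only the pairs whose hour labels match
order by m.days;
```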

3. Use case ... when to calculate the difference

select (case when days = days1 then (nums - nums1) else 0 end) as diff
from 
(select count(*) as nums,date_format(log_time,'%Y-%m-%d %h') as days from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days) as m,
(select count(*) as nums1,date_format(date_sub(date_format(log_time,'%Y-%m-%d %h'),interval -1 hour),'%Y-%m-%d %h') as days1 from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days1) as n;

Result:
+------+
| diff |
+------+
|  328 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
|    0 |
| -212 |
|    0 |
|    0 |
...

As you can see, case ... when computes the difference only when the two dates are equal, which is similar to this php program:

foreach($arr as $k=>$v){
    foreach($arr1 as $k1=>$v1){
        if($k == $k1){
            // compute the difference, e.g. $v - $v1
        }
    }
}

The result contains a lot of zeros and a few calculated values. If we filter out the zeros, we are almost there.

4. Filter out the rows with a result of 0 and check the final data

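Here, having is used to filter the query result. diff is a column alias, which a where clause cannot reference, but MySQL allows having to filter on select-list aliases even when the outer query has no group by clause. Two things to note: the expression is now nums1 - nums (previous hour minus current hour), so a positive value means a decrease, and having diff <> 0 also drops any hour whose count happens to equal the previous hour's. Let's try it.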

select (case when days = days1 then (nums1 - nums) else 0 end) as diff
from 
(select count(*) as nums,date_format(log_time,'%Y-%m-%d %h') as days from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days) as m,
(select count(*) as nums1,date_format(date_sub(date_format(log_time,'%Y-%m-%d %h'),interval -1 hour),'%Y-%m-%d %h') as days1 from test where 1 and log_time >='2020-04-19 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days1) as n having diff <>0;

Result:
+------+
| diff |
+------+
| -328 |
|  212 |
|  576 |
| -279 |
|  165 |
| -622 |
| -221 |
|  -48 |
|  261 |
|  191 |
|  347 |
| -254 |
+------+

These are the calculated results. To verify them, here are the first few rows worked out by hand.

Difference = last hour - this hour (nums1 - nums), so a positive value means a decrease:

This hour   Last hour   Difference
15442       15114       -328
15230       15442        212
14654       15230        576
14933       14654       -279
14768       14933        165

The differences match. To get the ratio instead, change the expression to case when days = days1 then (nums1 - nums)/nums1 else 0 end.
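
Putting the difference and the ratio into one query, a possible version is sketched below. It filters the pairs with where days = days1 rather than having diff <> 0, so hours whose count did not change are kept. The round() call and the column name ratio are my additions for readability, not part of the original.

```sql
select days,
       nums,
       nums1,
       nums1 - nums                      as diff,   -- previous hour minus current hour
       round((nums1 - nums) / nums1, 4)  as ratio   -- share of the previous hour's count that was lost
from
  (select count(*) as nums,
          date_format(log_time, '%Y-%m-%d %h') as days
   from test
   where log_time >= '2020-04-19 00:00:00'
     and log_time <= '2020-04-20 00:00:00'
   group by days) as m,
  (select count(*) as nums1,
          date_format(
            date_sub(date_format(log_time, '%Y-%m-%d %h'), interval -1 hour),
            '%Y-%m-%d %h') as days1
   from test
   where log_time >= '2020-04-19 00:00:00'
     and log_time <= '2020-04-20 00:00:00'
   group by days1) as n
where days = days1
order by days;
```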

5. Get the decrease between this hour and the last hour, and count the hours in each range of decrease

We can extend the original case ... when expression with more conditions to divide the ratio into ranges, then group by the range and count the hours in each one. This sql is a bit complicated; modify it as needed. I have tested it and it works. Two caveats: the first branch (< 0.1) also catches negative ratios, that is, hours where the count increased, and a ratio exactly equal to a boundary such as 0.1 matches no branch, falls into else 0 and is filtered out.

select case 
when days = days1 and (nums1 - nums)/nums1 < 0.1 then 0.1
when days = days1 and (nums1 - nums)/nums1 > 0.1 and (nums1 - nums)/nums1 < 0.2 then 0.2
when days = days1 and (nums1 - nums)/nums1 > 0.2 and (nums1 - nums)/nums1 < 0.3 then 0.3
when days = days1 and (nums1 - nums)/nums1 > 0.3 and (nums1 - nums)/nums1 < 0.4 then 0.4
when days = days1 and (nums1 - nums)/nums1 > 0.4 and (nums1 - nums)/nums1 < 0.5 then 0.5
when days = days1 and (nums1 - nums)/nums1 > 0.5 then 0.6
 else 0 end as diff,count(*) as diff_nums
from 
(select count(*) as nums,date_format(log_time,'%Y-%m-%d %h') as days from test where 1 and log_time >='2020-03-20 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days) as m,
(select count(*) as nums1,date_format(date_sub(date_format(log_time,'%Y-%m-%d %h'),interval -1 hour),'%Y-%m-%d %h') as days1 from test where 1 and log_time >='2020-03-20 00:00:00' and log_time <= '2020-04-20 00:00:00' group by days1) as n group by diff having diff >0;

Result:

+------+-----------+
| diff | diff_nums |
+------+-----------+
|  0.1 |       360 |
|  0.2 |        10 |
|  0.3 |         1 |
|  0.4 |         1 |
+------+-----------+
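
A variant of the bucketing is sketched below, under my own assumptions about the intended ranges: increases get their own bucket instead of being counted under 0.1, and each range is half-open so that a ratio exactly on a boundary is still counted. The bucket labels, the column name decrease_range and the alias t are hypothetical; the two inner subqueries are the ones used above.

```sql
select case
         when ratio < 0   then 'increase'
         when ratio < 0.1 then '0.0-0.1'   -- branches are tested in order,
         when ratio < 0.2 then '0.1-0.2'   -- so each one only needs an upper bound
         when ratio < 0.3 then '0.2-0.3'
         when ratio < 0.4 then '0.3-0.4'
         when ratio < 0.5 then '0.4-0.5'
         else '0.5+'
       end as decrease_range,
       count(*) as diff_nums
from (
  select (n.nums1 - m.nums) / n.nums1 as ratio
  from
    (select count(*) as nums,
            date_format(log_time, '%Y-%m-%d %h') as days
     from test
     where log_time >= '2020-03-20 00:00:00'
       and log_time <= '2020-04-20 00:00:00'
     group by days) as m
  inner join
    (select count(*) as nums1,
            date_format(
              date_sub(date_format(log_time, '%Y-%m-%d %h'), interval -1 hour),
              '%Y-%m-%d %h') as days1
     from test
     where log_time >= '2020-03-20 00:00:00'
       and log_time <= '2020-04-20 00:00:00'
     group by days1) as n
    on m.days = n.days1
) as t
group by decrease_range;
```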

Conclusion

1. sql is much like program code: split the requirement into parts and combine them step by step, and most requirements can be met. If you are intimidated from the start, you naturally won't be able to write it.
2. That said, complex calculations are generally better done in application code, which is usually quicker to write. The more complex the sql, the less efficient it tends to be.
3. DBA colleagues are sometimes unreliable, so in the end you still have to rely on yourself.
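
As for a simpler way: on servers that support window functions (MySQL 8.0 or later, MariaDB 10.2 or later, so not the 10.0.22 server used in this article), the previous hour's count can be read with lag() instead of joining the table to itself. This is a swapped-in technique, not the method described above; a minimal sketch, assuming the same table and the 24-hour format %H:

```sql
select days,
       nums,
       lag(nums) over (order by days)        as nums1,  -- count of the previous row
       lag(nums) over (order by days) - nums as diff    -- previous minus current; NULL for the first row
from (
  select date_format(log_time, '%Y-%m-%d %H') as days,
         count(*) as nums
  from test
  where log_time >= '2020-04-19 00:00:00'
    and log_time <  '2020-04-20 00:00:00'
  group by days
) as h
order by days;
```

Note that lag() looks at the previous row, not the previous hour: if an hour has no rows at all, the comparison silently skips over the gap.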

Supplement: the time in the MySQL database differs from the actual time by 8 hours

url=jdbc:mysql://127.0.0.1:3306/somedatabase?characterEncoding=utf-8&serverTimezone=GMT%2B8

Append &serverTimezone=GMT%2B8 to the JDBC connection url in the database configuration, as in the line above.
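
Before changing the connection url, it can help to check which time zone the server and the current session are actually using. A small sketch with standard system variables and functions; the column aliases are only for illustration, and the right setting depends on your environment:

```sql
select @@global.time_zone  as global_tz,    -- SYSTEM means the server follows the OS time zone
       @@session.time_zone as session_tz,
       now()               as session_now,  -- current time in the session time zone
       utc_timestamp()     as utc_now;      -- current time in UTC
```

Comparing session_now with utc_now shows the offset that the session applies.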
