MySQL grouping queries and aggregate functions

MySQL grouping queries and aggregate functions

Overview

I believe we often encounter such scenarios: we want to know the average spending amount of people who buy cosmetics on Tmall on Double Eleven (this may help to locate the price range of the goods); or the proportion of cosmetics consumption in different age groups (this may help to estimate the inventory of goods).

At this time, group query is needed. The purpose of group query is to divide the data into multiple logical groups (people who buy cosmetics are one group, and people of different age groups who buy cosmetics are also groups), and perform aggregation calculations on each group:.

The syntax format of group query is as follows:

 select cname, group_fun,... from tname [where condition]
 group by group_expression [having group_condition];

To explain:

1. group_fun represents an aggregate function, which refers to a function that performs aggregate calculations on grouped data.

2. group_expression represents the grouping expression. Multiple expressions are allowed and separated by commas.

3. group_condition After grouping, the process of conditionally filtering the grouped data.

4. In the grouping syntax, the fields that appear after select are either the fields after group by or the columns of the aggregate function. Other types will report an exception, which will be explained in detail in the following content.

Before talking about grouping, let's take a look at aggregate functions, which are an important part of the grouping query syntax format. We often need to summarize data without actually retrieving them, so MySQL provides special functions. These functions can be used to calculate the data we need for analysis and report generation.

Aggregate functions

There are several types of aggregate functions.

function illustrate
AVG() Returns the average value of the specified field
COUNT() Returns the number of rows in the query result
MAX() Returns the maximum value of the specified field
MIN() Returns the minimum value of the specified field
SUM() Returns the sum of the specified fields

AVG() Function

AVG() finds the average value of a specific column by counting the number of rows in a table and summing the values ​​in that column. AVG() can be used to return the average of all columns, or it can be used to return the average of a specific column or row.

The following example returns the average age of users in the users table:

mysql> select * from user2;
+----+--------+------+----------+-----+
| id | name | age | address | sex |
+----+--------+------+----------+-----+
| 1 | brand | 21 | fuzhou | 1 |
| 2 | helen | 20 | quanzhou | 0 |
| 3 | sol | 21 | xiamen | 0 |
| 4 | weng | 33 | guizhou | 1 |
| 5 | selina | 25 | NULL | 0 |
| 6 | anny | 23 | shanghai | 0 |
| 7 | annd | 24 | shanghai | 1 |
| 8 | sunny | NULL | guizhou | 0 |
+----+--------+------+----------+-----+
8 rows in set

mysql> select avg(age) from user2;
+----------+
| avg(age) |
+----------+
| 23.8571 |
+----------+
1 row in set

Note:

1. AVG() can only be used to determine the average of a specific numeric column.
2. The AVG() function ignores rows with NULL column values, so the age values ​​in the above figure are divided by 7 after accumulation, not by 8.

COUNT() Function

The COUNT() function counts. You can use COUNT() to determine the number of rows in a table that meet the criteria.

There are three ways to express count: count(*), count(specific field), and count(constant). The following demonstrates the usage of count(*) and count(cname).

mysql> select * from user2;
+----+--------+------+----------+-----+
| id | name | age | address | sex |
+----+--------+------+----------+-----+
| 1 | brand | 21 | fuzhou | 1 |
| 2 | helen | 20 | quanzhou | 0 |
| 3 | sol | 21 | xiamen | 0 |
| 4 | weng | 33 | guizhou | 1 |
| 5 | selina | 25 | NULL | 0 |
| 6 | anny | 23 | shanghai | 0 |
| 7 | annd | 24 | shanghai | 1 |
| 8 | sunny | NULL | guizhou | 0 |
+----+--------+------+----------+-----+
8 rows in set

mysql> select count(*) from user2 where sex=0;
+----------+
| count(*) |
+----------+
| 5 |
+----------+
1 row in set

mysql> select count(age) from user2 where sex=0;
+------------+
| count(age) |
+------------+
| 4 |
+------------+
1 row in set

As you can see, both retrieve the number of female users. Count(*) has one more than count(age) because age contains a null value.

So: if you specify a column name, rows where the value of the specified column is empty are ignored by the COUNT() function, but if an asterisk (*) is used in the COUNT() function, they are not ignored.

MAX() and MIN() Functions

MAX() returns the maximum value in the specified column, and MIN() returns the minimum value in the specified column.

mysql> select * from user2;
+----+--------+------+----------+-----+
| id | name | age | address | sex |
+----+--------+------+----------+-----+
| 1 | brand | 21 | fuzhou | 1 |
| 2 | helen | 20 | quanzhou | 0 |
| 3 | sol | 21 | xiamen | 0 |
| 4 | weng | 33 | guizhou | 1 |
| 5 | selina | 25 | NULL | 0 |
| 6 | anny | 23 | shanghai | 0 |
| 7 | annd | 24 | shanghai | 1 |
| 8 | sunny | NULL | guizhou | 0 |
+----+--------+------+----------+-----+
8 rows in set

mysql> select max(age),min(age) from user2;
+----------+----------+
| max(age) | min(age) |
+----------+----------+
| 33 | 20 |
+----------+----------+
1 row in set

Note: Similarly, the MAX() and MIN() functions ignore rows where the column value is NULL.

SUM Function

SUM() is used to return the sum (total) of the values ​​in the specified column. The following returns the sum of all ages. Again, null values ​​are ignored.

mysql> select * from user2;
+----+--------+------+----------+-----+
| id | name | age | address | sex |
+----+--------+------+----------+-----+
| 1 | brand | 21 | fuzhou | 1 |
| 2 | helen | 20 | quanzhou | 0 |
| 3 | sol | 21 | xiamen | 0 |
| 4 | weng | 33 | guizhou | 1 |
| 5 | selina | 25 | NULL | 0 |
| 6 | anny | 23 | shanghai | 0 |
| 7 | annd | 24 | shanghai | 1 |
| 8 | sunny | NULL | guizhou | 0 |
+----+--------+------+----------+-----+
8 rows in set

mysql> select sum(age) from user2;
+----------+
| sum(age) |
+----------+
| 167 |
+----------+
1 row in set

Group query

Data preparation, assuming that we have an order table as follows (recording the user's order amount and order time):

mysql> select * from t_order;
+---------+-----+-------+--------+---------------------+------+
| orderid | uid | uname | amount | time | year |
+---------+-----+-------+--------+---------------------+------+
| 20 | 1 | brand | 91.23 | 2018-08-20 17:22:21 | 2018 |
| 21 | 1 | brand | 87.54 | 2019-07-16 09:21:30 | 2019 |
| 22 | 1 | brand | 166.88 | 2019-04-04 12:23:55 | 2019 |
| 23 | 2 | helyn | 93.73 | 2019-09-15 10:11:11 | 2019 |
| 24 | 2 | helyn | 102.32 | 2019-01-08 17:33:25 | 2019 |
| 25 | 2 | helyn | 106.06 | 2019-12-24 12:25:25 | 2019 |
| 26 | 2 | helyn | 73.42 | 2020-04-03 17:16:23 | 2020 |
| 27 | 3 | sol | 55.55 | 2019-08-05 19:16:23 | 2019 |
| 28 | 3 | sol | 69.96 | 2020-09-16 19:23:16 | 2020 |
| 29 | 4 | weng | 199.99 | 2020-06-08 19:55:06 | 2020 |
+---------+-----+-------+--------+---------------------+------+
10 rows in set

Single field grouping

That is, group a certain field, such as grouping users, and output their user ID, order quantity, and total amount:

mysql> select uid,count(uid),sum(amount) from t_order group by uid;
+-----+------------+-------------+
| uid | count(uid) | sum(amount) |
+-----+------------+-------------+
| 1 | 3 | 345.65 |
| 2 | 4 | 375.53 |
| 3 | 2 | 125.51 |
| 4 | 1 | 199.99 |
+-----+------------+-------------+
4 rows in set

Multi-field grouping

That is, group multiple fields, such as grouping users, and then group their order data from different years, and output the order quantity and total consumption:

mysql> select uid,count(uid) as nums,sum(amount) as totalamount,year from t_order group by uid,year;
+-----+------+-------------+------+
| uid | nums | totalamount | year |
+-----+------+-------------+------+
| 1 | 1 | 91.23 | 2018 |
| 1 | 2 | 254.42 | 2019 |
| 2 | 3 | 302.11 | 2019 |
| 2 | 1 | 73.42 | 2020 |
| 3 | 1 | 55.55 | 2019 |
| 3 | 1 | 69.96 | 2020 |
| 4 | 1 | 199.99 | 2020 |
+-----+------+-------------+------+
7 rows in set

Conditional filtering before grouping: where

This is very simple. Before grouping (group by), we use the where keyword to filter the conditions and extract the data we need. Assume that we only need to list the data after August 2019. There are only 6 qualified source data, and two of them are grouped in the same year:

mysql> select uid,count(uid) as nums,sum(amount) as totalamount,year from t_order where time > '2019-08-01' group by uid,year;
+-----+------+-------------+------+
| uid | nums | totalamount | year |
+-----+------+-------------+------+
| 2 | 2 | 199.79 | 2019 |
| 2 | 1 | 73.42 | 2020 |
| 3 | 1 | 55.55 | 2019 |
| 3 | 1 | 69.96 | 2020 |
| 4 | 1 | 199.99 | 2020 |
+-----+------+-------------+------+
5 rows in set

Conditional filtering after grouping: having

Sometimes we need to filter the data after grouping. In this case, we need to use the having keyword to filter the data. Under the above conditions, we need to retrieve the data that has been consumed more than once:

mysql> select uid,count(uid) as nums,sum(amount) as totalamount,year from t_order where time > '2019-08-01' group by uid,year having nums>1;
+-----+------+-------------+------+
| uid | nums | totalamount | year |
+-----+------+-------------+------+
| 2 | 2 | 199.79 | 2019 |
+-----+------+-------------+------+
1 row in set

Here we need to distinguish between where and having:

Where is to filter the records before grouping (aggregation), while having is to filter the results after grouping and finally return the filtered results.

Having can be understood as a two-level query, that is, the query operation containing having first obtains the SQL query result table without the having clause, and then uses the having condition on this result table to filter out the matching records, and finally returns these records. Therefore, having can be followed by an aggregate function, and this aggregate function does not have to be the same as the aggregate function after select.

Sorting after grouping

The order condition is placed after the group by condition, which means that after calculating the total consumption and consumption frequency of each user, the total consumption of the user is sorted in descending order.

mysql> select uid,count(uid) as nums,sum(amount) as totalamount from t_order group by uid;
+-----+------+-------------+
| uid | nums | totalamount |
+-----+------+-------------+
| 1 | 3 | 345.65 |
| 2 | 4 | 375.53 |
| 3 | 2 | 125.51 |
| 4 | 1 | 199.99 |
+-----+------+-------------+
4 rows in set

mysql> select uid,count(uid) as nums,sum(amount) as totalamount from t_order group by uid order by totalamount desc;
+-----+------+-------------+
| uid | nums | totalamount |
+-----+------+-------------+
| 2 | 4 | 375.53 |
| 1 | 3 | 345.65 |
| 4 | 1 | 199.99 |
| 3 | 2 | 125.51 |
+-----+------+-------------+
4 rows in set

Limit after grouping

The limit keyword is usually placed at the end of the statement. For example, based on our search above, we limit 1 to only retrieve the item with the highest consumption amount and skip the others.

mysql> select uid,count(uid) as nums,sum(amount) as totalamount from t_order group by uid order by totalamount desc limit 1;
+-----+------+-------------+
| uid | nums | totalamount |
+-----+------+-------------+
| 2 | 4 | 375.53 |
+-----+------+-------------+
1 row in set

The order in which keywords are executed

We can see that we used the keywords where, group by, having, order by, and limit above. If used together, they have a certain order. The wrong order will cause an exception. The syntax format is as follows:

 select cname from tname
 where [original table query condition]
 group by [grouping expression]
 having [group filter condition]
 order by [sort condition]
 limit [offset,] count;
mysql> select uid,count(uid) as nums,sum(amount) as totalamount from t_order where time > '2019-08-01' group by uid having totalamount>100 order by totalamount desc limit 1;
+-----+------+-------------+
| uid | nums | totalamount |
+-----+------+-------------+
| 2 | 3 | 273.21 |
+-----+------+-------------+
1 row in set

Summarize

1. In the grouping syntax, the fields that appear after select are either the fields after group by or the columns of the aggregate function. Other types will report an exception: you can try it yourself.

2. The execution order of the grouping keywords: where, group by, having, order by, limit. The order cannot be changed, otherwise an exception will be reported: You can try it yourself.

The above is the details of MySQL grouping query and aggregate function. For more information about MySQL grouping query and aggregate function, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Detailed explanation of MySQL single table query operation examples [syntax, constraints, grouping, aggregation, filtering, sorting, etc.]
  • Detailed explanation of the implementation principle of MySQL group query Group By
  • Detailed explanation of the group by statement in MySQL database group query
  • MySQL beginners can say goodbye to the troubles of grouping and aggregation queries

<<:  JavaScript dynamically generates a table with row deletion function

>>:  JavaScript to achieve skin effect (change background)

Recommend

Mysql5.7.14 Linux version password forgotten perfect solution

In the /etc/my.conf file, add the following line ...

Learn MySQL execution plan

Table of contents 1. Introduction to the Implemen...

HTML Grammar Encyclopedia_HTML Language Grammar Encyclopedia (Must Read)

Volume Label, Property Name, Description 002 <...

CSS3 simple cutting carousel picture implementation code

Implementation ideas First, create a parent conta...

How to install ionCube extension using pagoda

1. First install the pagoda Installation requirem...

WeChat applet implements waterfall flow paging scrolling loading

This article shares the specific code for WeChat ...

innerHTML Application

Blank's blog: http://www.planabc.net/ The use...

MySQL Basic Tutorial: Detailed Explanation of DML Statements

Table of contents DML statements 1. Insert record...

calc() to achieve full screen background fixed width content

Over the past few years, there has been a trend i...

Analysis of the principles of docker containers

Table of contents 01 What is the essence of a con...

Vue ElementUI Form form validation

Form validation is one of the most commonly used ...

How to use MySQL stress testing tools

1. MySQL's own stress testing tool - Mysqlsla...

MySql common query command operation list

MYSQL commonly used query commands: mysql> sel...