Usage and performance optimization techniques of aggregate function count in MySQL

The environment of this article is Windows 10, and the MySQL version is 5.7.12-log

1. Basic Use

There are two basic functions of count:

Count the number of data in a column;
Count the number of rows in the result set;

Used to obtain the number of data that meet the conditions. However, there are some situations that are different from the impression in use, such as when count is applied to one column, multiple columns, and when * is used to express the entire row, the effect is different.

The example table is as follows:

CREATE TABLE `NewTable` (
`id` int(11) NULL DEFAULT NULL ,
`name` varchar(30) NULL DEFAULT NULL ,
`country` varchar(50) NULL DEFAULT NULL ,
`province` varchar(30) NULL DEFAULT NULL ,
`city` varchar(30) NULL DEFAULT NULL 
)ENGINE=InnoDB

這里寫圖片描述

1.1 Do not count NULL values

If there are NULL values, they will be filtered out in the returned results.

select count(country) from person;

The returned results are as follows:

這里寫圖片描述

If the data item that meets the conditions does not exist, the structure returns 0. This method is often used to determine whether data that meets the conditions exists; the returned data type is bigint.

1.2 Processing of count(*)

The processing of count(*) is a little different. It will return the number of all data, but will not filter out NULL values. It is not equivalent to expanding into all columns, but will directly ignore all columns and count all rows directly. The statement is as follows:

select count(*) from person;

The returned results are as follows:

這里寫圖片描述

When you want to return the number of all data, but don't want to include columns that are all NULL, it is impossible to use count(*). However, in 1.1, it is mentioned that count will filter out NULL when acting on a column, so is it correct to write it directly like this?

select count(id, `name`, country, province, city) from person;

That's wrong. Count can only work on a single column, not multiple columns, so the above writing is wrong.

In addition, the count(*) statement has been optimized in the MyISAM storage engine. The number of data rows in each table is stored in the storage engine and can be obtained quickly. However, in transactional storage engines, such as InnoDB, multiple transactions are involved.

1.3 Processing of count(distinct …)

count(distinct ...) returns the number of rows that are distinct but not NULL. This is different from using only distinct because distinct does not filter NULL values. For details, see How to use distinct in MySQL.

- If there is no data that meets the conditions, it returns 0;
- This statement can be applied to multiple columns. When there is a difference between the columns, the entire row of data is considered different, which has the same effect as distinct applied to multiple columns.

select count(DISTINCT country) from person;

The returned results are as follows:

這里寫圖片描述

But for the combination of count(*) and count(distinct), as follows:

select count(DISTINCT *) from person;

This statement is wrong and cannot be executed, so it is different from select count(DISTINCT *) from person.

2. Performance Optimization

Usually, the count(*) operation requires scanning a large number of rows in the data table. Avoiding scanning a large amount of data becomes the key to optimizing the statement. This issue can be considered from the following two perspectives.

2.1 Optimization at the database level

2.1.1 For count(*)

MySQL has been optimized for count(*) internally. The query using explain is as follows:

EXPLAIN select count(*) from person;

這里寫圖片描述

It can be seen that the query does not use a full table scan or an index, and does not even need to query the data table. In the example database above, the storage engine of the library is InnoDB, and there is neither a primary key nor an index.

2.2 Counting a single column

The query is as follows:

EXPLAIN select count(country) from person where id > 2;

這里寫圖片描述

It was found that the entire table was scanned without a primary key and index. One of the most direct ways to avoid scanning a large number of rows in the data is to use indexes:

When id is set as a general index: INDEX abc (id) USING BTREE .

Execute the query as follows:

EXPLAIN select count(country) from person where id > 2;

The results are as follows:

這里寫圖片描述

At this time, it is found that the index is not used and the full table scan is still performed. When the following is executed:

EXPLAIN select count(country) from person where id > 4;

The results are as follows:

這里寫圖片描述

This is a range query using an index, which is obviously better than the above.

But the question is, why do we use indexes sometimes and not sometimes? In the first query above, possible keys were detected but not used? If anyone knows please explain!

Set id as the primary key and execute the query as follows:

EXPLAIN select count(country) from person where id > 2;

The results are as follows:

這里寫圖片描述

2.2 Optimizing at the application level

To optimize at the application level, you can consider introducing a cache subsystem into the system architecture, such as Memcached, which was commonly used in the past, or Redis, which is very popular now, but this will increase the complexity of the system.

MySQL group by and aggregate functions (sum, count, etc.) examples

First, let's take a look at MySQL aggregate functions.

A special function in MySQL: aggregate function, SUM, COUNT, MAX, MIN, AVG, etc. The fundamental difference between these functions and other functions is that they generally act on multiple records. For example:

SELECT SUM(score) FROM table

This sql means to query the sum of all score columns in the table table.

Next, we use an example to explain how to use aggregate functions in the group by statement.

The book table is as follows:

id	first_name	last_name	city
1	Jason	Martin	Toronto
2	Alison	Mathews	Vancouver
3	James	Mathews	Vancouver
4	Celia	Rice	Vancouver
5	David	Larry	New York

Now we need to group the cities and get the number of records in each group. We need the count aggregation function.

SELECT *,count(*) FROM book GROUP BY city

The result is:

id	first_name	last_name	city	count(*)
1	Jason	Martin	Toronto	1
2	Alison	Mathews	Vancouver	3
5	David	Larry	New York	1

First, divide the returned records into multiple groups based on city. This is the literal meaning of GROUP BY. After grouping, aggregate functions are then used to operate on different fields (one or more records) in each group.

You may also be interested in:

MySQL aggregate function sorting
MySQL grouping queries and aggregate functions
Analysis of MySQL query sorting and query aggregation function usage
How to add conditional expressions to aggregate functions in MySql
MySQL essential basics: grouping function, aggregate function, grouping query detailed explanation

<<: Method to detect whether ip and port are connectable

>>: Solve the problem of yum installation error Protected multilib versions

Three ways to draw a heart shape with CSS

Usage and performance optimization techniques of aggregate function count in MySQL

Three ways to draw a heart shape with CSS

Detailed tutorial on installing Hbase 2.3.5 on Vmware + Ubuntu18.04

Ubuntu 20.04 firewall settings simple tutorial (novice)

Detailed explanation of JavaScript error capture

Problems encountered when uploading images using axios in Vue

JS+Canvas realizes dynamic clock effect

How to remotely log in to the MySql database?

WeChat applet realizes left-right linkage

CentOS 7 switching boot kernel and switching boot mode explanation

A Brief Discussion on the Navigation Window in Iframe Web Pages

Recommend

jQuery+swiper component realizes the timeline sliding year tab switching effect

MySQL 5.7.13 winx64 installation and configuration method graphic tutorial (win10)

Telnet is moved to busybox-extras in Alpine image

jQuery custom magnifying glass effect

How to configure Nginx load balancing

MySQL Community Server 8.0.11 installation and configuration method graphic tutorial

Example of using @media responsive CSS to adapt to various screens

PHP-HTMLhtml important knowledge points notes (must read)

MySql development of automatic synchronization table structure

Deleting the database and running away? How to use xtraback to back up the MySQL database

Implementation of Docker deployment of Django+Mysql+Redis+Gunicorn+Nginx

Introduction to the common API usage of Vue3

Detailed explanation of galera-cluster deployment in cluster mode of MySQL

Example of horizontal arrangement of li tags in HTMl

How to simplify Redux with Redux Toolkit