Detailed explanation of multiple implementation methods of Mysql counting by conditions

Detailed explanation of multiple implementation methods of Mysql counting by conditions

Recently, I was adding a series of statistical functions to the backend of a certain website, and encountered many situations where counting by conditions was required. I have tried several methods, and I have briefly recorded them below for your reference.

Problem Description

To make the discussion easier to understand, I have simplified the issue a bit and removed a lot of the background.

Once upon a time there was an emperor who had 50 concubines. These concubines gave birth to 100,000 sons for him unjustly. The emperor was very distressed. It was difficult to manage such a large number of sons. Moreover, he wanted to know how many sons each concubine had given birth to so that he could reward them according to their merits. This was very difficult to do. So the emperor hired a programmer to help him write a program to use a database to store the information of all his sons, so that he could use the program to count and manage them.

The structure of the database is as follows:

id The prince's unique number
mother The unique number of the prince's mother

The emperor divided his concubines into two levels, the Concubines in the Celestial Palace (numbered less than 25) and the Concubines in the Underground Palace (numbered greater than or equal to 25). He wanted to know which one had stronger fertility, the Concubines in the Celestial Palace or the Concubines in the Underground Palace. So, programmers started writing SQL Queries.

Method 1: Using GROUP BY

SQL Query

SELECT COUNT(*) FROM `prince` GROUP BY `mother` > 24;

Execution Results

count(*)
50029
49971

Running time on 100,000 rows: 0.0335 seconds

analyze

The biggest problem with this GROUP BY approach is that it is impossible to distinguish the results obtained. Which of these two numbers is the number of princes born to the ladies in the Heavenly Palace, and which one is the number of princes born to the ladies in the Underground Palace? have no idea. So, even though it gives a total, it's meaningless.

Therefore, in order to distinguish the statistical results, the condition mother > 24 must also be reflected as a field in the result set. The modified SQL is as follows:

SELECT COUNT(*) AS `number`, `mother` > 24 AS `type` FROM `prince` GROUP BY `mother` > 24;

Execution Results

number type
50029 0
49971 1

When the conditional expression is used as a field, the value of the field is the value of the conditional expression. Therefore, in our example, type = 1 means that the value of mother > 24 is 1. Therefore, the numbers in the second row represent the number of princes born to the ladies in the underground palace.

After the modifications, we can see that the goddesses in the heavenly palace are slightly better.

Pros and Cons

The disadvantages are obvious. Since conditional expressions are used as the basis for grouping, it can only make binary divisions and is not suitable for situations where multiple categories need to be divided for statistical purposes. For example, it would be impossible to count the number of children born by concubines No. 1 to 10, No. 11 to 24, and No. 25 to 50 respectively.

In addition, since GROUP BY is used, sorting is involved and the execution time is longer.

I haven't found any advantages to this approach yet.

Method 2: Using nested SELECT

The goal can also be achieved by using nested SELECTs, counting the data under a condition in each SELECT clause, and then integrating these statistical data with a main SELECT.

SQL Query

SELECT 
  ( SELECT COUNT( * ) FROM `prince` WHERE `mother` >24 ) AS `digong`, 
  ( SELECT COUNT( * ) FROM `prince` WHERE `mother` <=24 ) AS `tiangong`

Execution Results

digong tiangong
49971 50029

Running time on 100,000 rows: 0.0216 seconds

analyze

This nested SELECT method is very intuitive. It counts the values ​​under each condition separately and summarizes them at the end. It is easy to understand and is no different from natural language.

Pros and Cons

The advantage is that it is intuitive and faster than GROUP BY. Although there are three SELECT statements, which seems to be two more than the GROUP BY solution, it does not involve sorting, which saves a lot of time.

The disadvantage may be that there are too many sentences, and students who are obsessed with the number of sentences may feel uncomfortable.

Method 3: Using CASE WHEN

The CASE WHEN statement is very powerful and can define flexible query conditions, making it very suitable for classification statistics.

SQL Query

SELECT 
  COUNT( CASE WHEN `mother` >24 THEN 1 ELSE NULL END ) AS `digong`, 
  COUNT( CASE WHEN `mother` <= 24 THEN 1 ELSE NULL END ) AS `tiangong`
FROM prince

Execution Results

digong tiangong
49971 50029

Running time on 100,000 rows: 0.02365825 seconds

analyze

The key to this method is

COUNT( CASE WHEN `mother` >24 THEN 1 ELSE NULL END )

Here, COUNT and CASE WHEN are used together to achieve categorical counting. First use CASE WHEN. When the condition is met, set the field value to 1. When the condition is not met, set the field value to NULL. Then the COUNT function only counts non-NULL fields. Then the problem is solved.

Pros and Cons

As for the advantage, this method does not involve sorting, so the running time is comparable to that of method 2, and the number of SELECT statements is reduced to 1.

The disadvantage is that the sentences are relatively long, and students who are obsessed with sentence length may feel uncomfortable.

Summarize

For conditional counting of certain categories, try not to use GROUP BY to avoid sorting and speed up the execution of the query.

If you need to classify based on the value of a field, and the value of the field is variable, for example, the emperor wants to count the number of children born by each concubine, and he may keep marrying many concubines. In this case, using methods 2 and 3 is not very effective, and it is simpler and more convenient to use a GROUP BY.

The above is the full content of this article. I hope it will be helpful for everyone’s study. I also hope that everyone will support 123WORDPRESS.COM.

You may also be interested in:
  • Two methods to implement MySQL group counting and range aggregation
  • Detailed explanation of MySQL to obtain statistical data for each day and each hour of a certain period of time
  • Summary of MySQL time statistics methods
  • mysql obtains statistical data within a specified time period
  • Implementing high-performance and high-concurrency counter solutions in MySQL (such as article click counts)
  • Mysql auto_increment recount (let id start from 1)
  • Detailed explanation of the process of creating a counter using PHP and MYSQL

<<:  Detailed explanation of the process of setting up nvidia-docker environment on Linux server

>>:  Vue realizes the sliding cross effect of the ball

Recommend

Various types of MySQL indexes

What is an index? An index is a data structure th...

js to achieve sliding carousel effect

This article shares the specific code of js to ac...

Tutorial on installing MySQL 5.7.28 on CentOS 6.2 (mysql notes)

1. Environmental Preparation 1.MySQL installation...

Detailed explanation of nginx installation, deployment and usage on Linux

Table of contents 1. Download 2. Deployment 3. Ng...

Detailed explanation of nginx request header data reading process

In the previous article, we explained how nginx r...

Solution to 1067 when Mysql starts in Windows

I just started working a few days ago and install...

Javascript implements simple navigation bar

This article shares the specific code of Javascri...

Examples of optimistic locking and pessimistic locking in MySQL

The task of concurrency control in a database man...

Vue implements irregular screenshots

Table of contents Image capture through svg CSS p...

A brief discussion on two current limiting methods in Nginx

The load is generally estimated during system des...

JavaScript realizes magnifying glass special effects

The effect to be achieved: When the mouse is plac...

Prevent HTML and JSP pages from being cached and re-fetched from the web server

After the user logs out, if the back button on the...