Let's talk about the performance of MySQL's COUNT(*)

Let's talk about the performance of MySQL's COUNT(*)

Preface

Basically, programmers in the workplace use count(*), count(1) or count(primary key) to count the number of rows in a database table. Do you understand the differences and performance between them?

In fact, during the development process, it is a very time-consuming operation for programmers to count the total number of rows in a large table. So which method should we use to count faster?

Next, let’s talk about the method and performance of counting the total number of rows in MySQL.

Which is faster: count(*), count(1), or count(primary key)?

1. Create a table and insert 10 million records for experimental testing:

# Create a test table CREATE TABLE `t6` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(50) NOT NULL,
 `status` tinyint(4) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `idx_status` (`status`)
)ENGINE=InnoDB DEFAULT CHARSET=utf8;

# Create a stored procedure to insert 1000w data CREATE PROCEDURE insert_1000w()
BEGIN
  DECLARE i INT;
  SET i=1;
  WHILE i<=10000000 DO
    INSERT INTO t6(name,status) VALUES('god-jiang-666',1);
    SET i=i+1;
  END WHILE;
END;

#Call the stored procedure to insert 10 million rows of data call insert_1000w();

2. Analyze the experimental results

# It took 0.572 seconds select count(*) from t6; 

insert image description here

# It took 0.572 seconds select count(1) from t6; 

insert image description here

# It took 0.580 seconds select count(id) from t6; 

insert image description here

# It took 0.620 seconds select count(*) from t6 force index (primary); 

insert image description here

From the above experiment, we can conclude that count(*) and count(1) are the fastest, followed by count(id), and the slowest is when count uses a mandatory primary key .

Let's continue to test their respective execution plans:

explain select count(*) from t6;
show warnings; 

insert image description here

insert image description here

explain select count(1) from t6;
show warnings; 

insert image description here

insert image description here

explain select count(id) from t6;
show warnings; 

insert image description here

insert image description here

explain select count(*) from t6 force index (primary);
show warnings; 

insert image description here

insert image description here

From the above experiment, we can draw the following three points:

  1. The MySQL query optimizer rewrites count(*) to count(0) and selects the idx_status index.
  2. Both count(1) and count(id) select the idx_statux index
  3. After adding force index (primary), the forced index is used.

This idx_status is equivalent to a secondary auxiliary index tree. Its purpose is to illustrate that when InnoDB processes count(*), if there is an auxiliary index tree, it will give priority to the auxiliary index tree to count the total number of rows.

To verify the conclusion that count(*) will give priority to the auxiliary index tree, let's continue with the following experiment:

# Delete the idx_status index and continue to execute count(*)
alter table t6 drop index idx_status;

explain select count(*) from t6; 

insert image description here

From the above experiment, we can conclude that if the auxiliary index tree idx_status is deleted, count(*) will choose to use the primary key index. So the conclusion is: count(*) will give priority to the auxiliary index. If there is no auxiliary index, the primary key index will be used.

Why does count(*) prefer secondary indexes?

Prior to MySQL 5.7.18, InnoDB processed count(*) statements by scanning the clustered index.

Starting with MySQL 5.7.18, InnoDB handles COUNT(*) statements by traversing the smallest available secondary index. If no secondary index exists, the clustered index is scanned.

Why does the new version use secondary indexes to process count(*)?

Because the leaf nodes of the InnoDB secondary index tree store the primary key, and the leaf nodes of the primary key index tree store the entire row of data, the secondary index tree is smaller than the primary key index tree. Therefore, the query optimizer gives priority to secondary indexes based on cost considerations. So index count(*) is faster than count(primary key).

Summarize

The conclusion of this article is count(*)=count(1)>count(id) .

Why is count(id) slower without the primary key index? Because count(id) needs to retrieve the primary key, then check whether it is empty, and then accumulate it, which is more expensive.

Count(*) will count all NOT NULL and NULL fields, while count(id) will not count NULL fields, so we should try to use NOT NULL when building tables and give it a default of empty.

Finally, when you want to total the number of rows in a database table in the future, you can boldly use count(*) or count(1).

References

  • Chapter 6 of "High Performance MySQL" (3rd Edition) Optimizing COUNT() Queries
  • "MySQL Practice 45 Lectures" by Lin Xiaobin

This is the end of this article about the performance of MySQL COUNT(*). For more information about MySQL COUNT(*), please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Optimized implementation of count() for large MySQL tables
  • Usage and performance optimization techniques of aggregate function count in MySQL
  • Sharing on count optimization issues in innodb in mysql
  • Detailed explanation of the correct use of the count function in MySQL
  • A brief discussion on MySQL count of rows
  • Summary of methods to improve mysql count
  • Detailed explanation of count without filter conditions in MySQL
  • Summary of the differences between count(*), count(1) and count(col) in MySQL
  • Detailed explanation of mySQL count data examples in multiple tables
  • Use and optimization of MySQL COUNT function

<<:  Limit HTML text box input to only numbers and decimal points

>>:  Use non-root users to execute script operations in docker containers

Recommend

Introduction to useRef and useState in JavaScript

Table of contents 1. useState hook 2. useRef hook...

Solution to the problem of invalid line-height setting in CSS

About the invalid line-height setting in CSS Let&...

Introduction to MySQL role functions

Table of contents Preface: 1. Introduction to rol...

How to use Linux to calculate the disk space occupied by timed files

Open the scheduled task editor. Cent uses vim to ...

Causes and solutions for slow MySQL query speed and poor performance

1. What affects database query speed? 1.1 Four fa...

Web page creation for beginners: Learn to use HTML's hyperlink A tag

The hyperlink a tag represents a link point and i...

How to implement dual-machine master and backup with Nginx+Keepalived

Preface First, let me introduce Keepalived, which...

Stealing data using CSS in Firefox

0x00 Introduction A few months ago, I found a vul...

How to install mysql5.7 in windows

First download the compressed version of mysql, t...

base target="" controls the link's target open frame

<base target=_blank> changes the target fram...

HTML implements the function of detecting input completion

Use "onInput(event)" to detect whether ...

Why is the scroll bar on the web page set on the right?

Why are the scroll bars of the browsers and word ...

W3C Tutorial (15): W3C SMIL Activities

SMIL adds support for timing and media synchroniz...