Optimized implementation of count() for large MySQL tables

Optimized implementation of count() for large MySQL tables

The following is my judgment based on the data structure of the B+ tree and my speculation on the experimental results. If there are any errors, please correct me!

Today I experimented with the optimization of MySQL count() operation. The following discussion is based on MySQL 5.7 InnoDB storage engine and x86 Windows operating system.

The structure of the created table is as follows (the amount of data is 1 million):

表結構

First, the question is which is faster among MySQL's count(*), count(PK), and count(1).
The results are as follows:

這里寫圖片描述

這里寫圖片描述

這里寫圖片描述

There is no difference! After adding the WHERE clause, the time for the three queries is also the same, so I won’t post the pictures.

When I was working in the company before, I wrote a SQL statement of select count(*) from table , which was very slow when there was a lot of data. So how to optimize it?

This starts with the InnoDB index. The InnoDB index is B+Tree.

For the primary key index: it only stores data on leaf nodes, its key is the primary key , and its value is the entire data .
For auxiliary indexes: key is the column for creating the index, and value is the primary key.

This gives us two pieces of information:
1. The entire data will be found according to the primary key
2. Only the primary key can be found based on the secondary index, and then the remaining information must be found through the primary key.

So if we want to optimize the count(*) operation, we need to find a short column and create a secondary index for it.
In my case it is status , although its "severelity" is almost 0.

Create an index first: ALTER TABLE test1 ADD INDEX ( status );
Then query as shown below:

這里寫圖片描述

It can be seen that the query time dropped from 3.35s to 0.26s, and the query speed increased by nearly 13 times .

If the index is the str column, what will the result be?
Create an index first: alter table test1 add index (str)
The results are as follows:

這里寫圖片描述

As you can see, the time is 0.422s, which is also very fast, but it is still about 1.5 times slower than status column.

To be more daring, I will do an experiment. I will delete the index of the status column, create a joint index of status and left(omdb,200) (this column has an average of 1000 characters), and then check the query time.
Create an index: alter table test1 add index ( status ,omdb(200))
The results are as follows:

這里寫圖片描述

The time is 1.172s

alter table test1 add index (status,imdbid);

Replenish! !
Be careful about index failures!
After the index is created, it looks like this:

這里寫圖片描述

You can see that key_len is 6, and Extra's description is using index.

If the index fails:

這里寫圖片描述

There are many situations in which indexes become invalid, such as using functions, != operations, etc. For details, please refer to the official documentation.

I haven't done a deep study on MySQL, and the above is based on my judgment based on the B+ tree data structure and speculation on experimental results. If there are any mistakes, please correct me!

This is the end of this article about the optimized implementation of count() for large MySQL tables. For more relevant content on count() optimization for large MySQL tables, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Usage and performance optimization techniques of aggregate function count in MySQL
  • Sharing on count optimization issues in innodb in mysql
  • Let's talk about the performance of MySQL's COUNT(*)
  • Detailed explanation of the correct use of the count function in MySQL
  • A brief discussion on MySQL count of rows
  • Summary of methods to improve mysql count
  • Detailed explanation of count without filter conditions in MySQL
  • Summary of the differences between count(*), count(1) and count(col) in MySQL
  • Detailed explanation of mySQL count data examples in multiple tables
  • Use and optimization of MySQL COUNT function

<<:  Common HTML tag writing errors

>>:  Web development tutorial cross-domain solution detailed explanation

Recommend

MySQL master-slave synchronization principle and application

Table of contents 1. Master-slave synchronization...

Creating a Secondary Menu Using JavaScript

This article example shares the specific code of ...

HTML table tag tutorial (21): row border color attribute BORDERCOLOR

To beautify the table, you can set different bord...

Tutorial on deploying jdk and tomcat on centos7 without interface

1. Install xshell6 2. Create a server connection ...

How to Rename a Group of Files at Once on Linux

In Linux, we usually use the mv command to rename...

How to set up the terminal to run applications after Ubuntu starts

1. Enter start in the menu bar and click startup ...

MySQL tutorial data definition language DDL example detailed explanation

Table of contents 1. Introduction to the basic fu...

Implementation of waterfall layout in uni-app project

GitHub address, you can star it if you like it Pl...

Linux Operation and Maintenance Basic System Disk Management Tutorial

1. Disk partition: 2. fdisk partition If the disk...

CentOS 7.6 installation of MySQL 5.7 GA version tutorial diagram

Table of contents Environment Preparation Environ...

SQL function to merge a field together

Recently, I need to query all the fields in a rel...