We are all familiar with the MySQL count() function, which is used to count the number of rows in each table. But if your table gets bigger and bigger, and you use the InnoDB engine, you will find that the calculation speed will get slower and slower. In this article, we will first introduce the principles and reasons for implementing count(), then analyze the performance of different uses of count, and finally provide a solution for tables that need to change frequently and need to count the number of rows. Implementation of Count() InnoDB and MyISAM are commonly used data engines for MySQL. Due to the differences in their implementation, the efficiency of count() operations is also different. For MyISAM, it stores the total number of rows in each table on disk, so when using count(*) calculation, the efficiency is very high and the result is returned directly. However, if a where condition is added, a search will still be performed, so the efficiency is not high. For InnoDB, when performing count(*) operations, data is read from the engine row by row, and then the count is accumulated. Naturally, as the table becomes larger, the efficiency becomes lower. So why can't InnoDB record in the table like MyISAM? The reason is that InnoDB has more transaction support features than MyISAM, but it also requires certain trade-offs. Due to the control of MVCC, MySQL has the ability of concurrency. That is to say, at the same time, the number of rows of the table returned by InnoDB is not fixed. The number of rows seen by the transaction is related to the consistency view after it is enabled. In other words, the data version that each transaction can see is different, and it can only be judged row by row. Like the following transaction, assuming that table t has 10,000 records:
For Session A, Session B is not visible because it is not submitted. Session C is submitted, but it is submitted after Session A is started, so it is also invisible. So it is 10000. As for Session B, Session C was committed before it was started and inserted another entry, so the result is 10002. In fact, InnoDB has made optimizations when performing count(*) operations. When performing count(*) operations, since the ordinary index will save the id value of the primary key, it will find the smallest ordinary index tree for search instead of traversing the primary key index tree.
In addition, when using show table status, you can also query the number of rows very quickly, but it should be noted that this command uses the value of index statistics for sampling and estimation. The official document says that the error can be 40%-50%. But what should we do if we really need to get the number of rows in a table in real time? Manually save the number of tables Use a cache system to store counts For tables that are updated, you might want to use a cache system to support them. For example, Redis is used to store the total number of rows in a table. Each time an entry is inserted into the database, the Redis count increases by one, and decreases by one. This makes the read and write operations seem fast, but there are some problems. The cache system will lose updates: The data in Redis memory needs to be synchronized to the disk regularly, but there is no way to prevent abnormal restart of Redis. For example, after inserting data into Redis, Redis restarts and the data is not persisted to the hard disk. At this time, after restarting Redis, you can execute the count(*) operation from the database and then update it to Redis. A full table scan is still possible. The logic is not precise: Suppose a page needs to display the number of rows in a table and each piece of data. When implementing, you can first get the quantity from Redis, and then get the record from the database. But it may happen that:
For Session B, at time T2, it is found that the number of Redis entries is one less than that of the database.
For Session B, at time T2, it is found that the number of Redis records is 1 more than that of the database. In fact, the problem occurs because Redis and database record query are not in the same transaction. Save with database MySQL itself supports transactions due to the support of the InnoDB engine. Therefore, by replacing the Redis insert operation with an update operation in the database, you can take advantage of the transaction features at the RR level to ensure data accuracy. Another point is that, thanks to the support of redo log, crash-safe can be guaranteed when MySQL encounters an exception. Execution efficiency of different count usages count() itself is an aggregate function, which evaluates the returned result set row by row. If the parameter is not NULL, it will continue to accumulate and finally return the result. Therefore, count(*), count(id), and count(1) all mean returning the total number of rows in the result set that meet the conditions. And count(field) indicates the fields that are not NULL in the data rows that meet the conditions. For count(id), InnoDB will traverse the entire table, take out each row id, and give it to the server layer. The server determines whether the id is empty and then accumulates it. For count(1), InnoDB will traverse the entire table but will not retrieve the value. The server layer will put 1 in itself and then accumulate it. Therefore, count(1) is faster than count(*) because it does not require parsing data rows and copying field values. For count(field), if the field is defined as not null, it will be read line by line, and it will be determined that it cannot be null, and then accumulated. If it can be null when defined, the value needs to be removed during execution and accumulated only if it is not null. The exception is count(*), which is specially optimized. It does not take values, but directly accumulates them by row, and finds the smallest index tree for calculation. Summarize The execution efficiency of the MySQL count() function is related to the underlying data engine. MyISAM does not add where conditions, the query will be very fast, but it does not support transactions. InnoDB supports transactions. However, due to the implementation of MVCC, each query requires scanning rows one by one, which is inefficient. The solution is to design an external cache such as Redis to save records. However, there are cases of abnormal restarts and inaccurate data. A solution is to create a new table in InnoDB to save the records. Finally, InnoDB makes independent optimizations for count(*), while other count operations require additional operations. The above is a brief discussion of the details of MySQL count, which counts the number of rows. For more information about MySQL count, please pay attention to other related articles on 123WORDPRESS.COM! You may also be interested in:
|
<<: Vue.js performance optimization N tips (worth collecting)
>>: Detailed explanation of Linux dynamic library generation and usage guide
The solution to the background tiling or border br...
This article shares a collection of Java problems...
Forgot your MySQL password twice? At first I did ...
When we are writing a page, we often encounter a ...
Preface In order to ensure the consistency and in...
<br />Question: Why is it not recommended to...
In cells, light border colors can be defined indi...
Table of contents Create an image File Structure ...
Create a container [root@server1 ~]# docker run -...
During normal project development, if the MySQL v...
Permission denied: The reason for this is: there ...
This article introduces how to configure Nginx to...
Linux is generally used as a server, and the serv...
This article shares the specific code of Vue to i...
Table of contents 4 isolation levels of MySQL Cre...