A brief analysis of MySQL cardinality statistics

1. What is cardinality?
Cardinality is the number of distinct values in a column of a MySQL table.

If the column has a unique index, its cardinality equals the number of rows.

If the column is sex and the only allowed values are male and female, its cardinality is 2.

The higher the cardinality relative to the row count, the more worthwhile it is to index the column. The MySQL optimizer also uses cardinality when choosing which index to use in an execution plan.

The cardinality of each indexed column in a table can be viewed with the SHOW INDEX statement, which reports the estimate in its Cardinality column.
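A minimal sketch of how to look at these values, assuming a table named tableName (a placeholder, the same one used with ANALYZE TABLE later in this article). Both statements return InnoDB's stored estimate, not an exact count.

```sql
-- One row per indexed column; the Cardinality column holds the estimate.
SHOW INDEX FROM tableName;

-- The same estimates, queried through information_schema.
SELECT INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME, CARDINALITY
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()   -- assumes a default database is selected
  AND TABLE_NAME = 'tableName';
```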

Take this classic example:
There is a column called sex. It stores only male or female, so its cardinality is at most 2.
There is no need to create an index on sex. To speed up queries that filter on sex, MySQL has to build and maintain a separate B+Tree for the new index. But with only two distinct values, the index can do little more than locate the first matching entry. After that, MySQL still has to walk through every row with that value (about half the table if the values are evenly split), so an index on sex brings almost no benefit.
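To judge whether a column is worth indexing, it can help to compute its exact cardinality and its selectivity (distinct values divided by total rows). The query below is only an illustrative sketch: tableName is a placeholder, and it assumes the table has a sex column as in the example above. It scans the whole table, so it is slow on large tables.

```sql
-- Exact figures from a full scan, unlike the sampled estimate InnoDB stores.
SELECT COUNT(DISTINCT sex)            AS exact_cardinality,
       COUNT(*)                       AS total_rows,
       COUNT(DISTINCT sex) / COUNT(*) AS selectivity
FROM tableName;
```

A selectivity close to 1 suggests a good index candidate. A value close to 0, which is what a two-value column gives on a large table, suggests the index will rarely help.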

2. When does InnoDB update the cardinality?
The parameter innodb_stats_auto_recalc controls whether InnoDB automatically recalculates persistent statistics. The default is 1 (ON, enabled); setting it to 0 (OFF) disables automatic recalculation.
When it is enabled, cardinality information is recalculated after more than 10% of the rows in a table have changed.
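As a rough sketch, the setting can be inspected globally and overridden for one table. tableName is a placeholder, and the per-table option only has an effect on tables that use persistent statistics.

```sql
-- Check whether automatic recalculation is enabled globally.
SHOW VARIABLES LIKE 'innodb_stats_auto_recalc';

-- Turn automatic recalculation off for this table only; its statistics
-- then change only when ANALYZE TABLE is run explicitly.
ALTER TABLE tableName STATS_AUTO_RECALC = 0;
```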

3. Cardinality is an estimate

Cardinality is not updated in real time, and it is an estimate obtained by sampling.

The exact formula matters less than the method.

What matters is that the estimate is computed from randomly sampled data pages.

The number of sampled pages is set by the parameter innodb_stats_persistent_sample_pages; the default value is 20.

This means the cardinality is not exact, and the result can differ noticeably from one calculation to the next.
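One way to see the sampling at work is to read the persistent statistics tables directly. This is a sketch under a few assumptions: the table uses persistent statistics, tableName is a placeholder, a default database is selected, and the account is allowed to read the mysql schema.

```sql
-- Number of index pages sampled per index when statistics are computed.
SHOW VARIABLES LIKE 'innodb_stats_persistent_sample_pages';

-- Estimated distinct values for each index prefix (n_diff_pfxNN rows),
-- together with the number of pages that were actually sampled.
SELECT index_name, stat_name, stat_value, sample_size, stat_description
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
  AND table_name = 'tableName'
  AND stat_name LIKE 'n_diff_pfx%';
```

Comparing stat_value with an exact COUNT(DISTINCT ...) on the same columns gives a feel for how far the estimate is off.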

4. Persistent Cardinality

The parameter innodb_stats_persistent controls whether cardinality statistics are persisted to disk. It has been enabled (ON) by default since MySQL 5.6.6.

You can also set STATS_PERSISTENT=1 on an individual table, which enables persistent statistics for that table regardless of the global innodb_stats_persistent setting.

The advantage of enabling it is that the statistics survive a MySQL restart instead of being recalculated afterwards, which saves that work and keeps execution plans stable.
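A short sketch of checking the global setting and enabling persistence for a single table. tableName is a placeholder, and reading mysql.innodb_table_stats assumes the account has access to the mysql schema.

```sql
-- Global default for newly created or altered tables.
SHOW VARIABLES LIKE 'innodb_stats_persistent';

-- Enable persistent statistics for one table, whatever the global value is.
ALTER TABLE tableName STATS_PERSISTENT = 1;

-- The persisted row estimate and the time it was last refreshed.
SELECT n_rows, last_update
FROM mysql.innodb_table_stats
WHERE database_name = DATABASE()
  AND table_name = 'tableName';
```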

5. How to manually update the cardinality?

Executing the following SQL makes InnoDB recalculate the cardinality (even if you did not realize it has that effect).

So try to run it during off-peak hours.

ANALYZE TABLE tableName;

If too few pages are sampled, the calculated cardinality can be badly wrong, which will most likely cause the MySQL optimizer to choose the wrong index. In that case, increase innodb_stats_persistent_sample_pages appropriately. Increasing it too much, however, may make ANALYZE TABLE run slowly.

Conversely, if ANALYZE TABLE runs too slowly, you can decrease innodb_stats_persistent_sample_pages appropriately. However, this may make the cardinality estimate less accurate.

If the two cannot be balanced, consider reducing the number of indexed columns in the table or limiting the number of partitions to reduce ANALYZE TABLE complexity. The number of columns in the table's primary key also matters, because the primary key columns are appended to each nonunique index.
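The sample size can be changed globally or for a single table. The values 64 and 128 below are arbitrary illustrations, not recommendations, and tableName is a placeholder. The right number depends on table size and on how long ANALYZE TABLE is allowed to run.

```sql
-- Global setting: applies to tables without their own STATS_SAMPLE_PAGES.
SET GLOBAL innodb_stats_persistent_sample_pages = 64;

-- Per-table override, useful when only one table gets poor estimates.
ALTER TABLE tableName STATS_SAMPLE_PAGES = 128;

-- Recompute the statistics with the new sample size, then inspect them.
ANALYZE TABLE tableName;
SHOW INDEX FROM tableName;
```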

References:

https://dev.mysql.com/doc/refman/5.7/en/innodb-persistent-stats.html

https://dev.mysql.com/doc/refman/5.7/en/innodb-analyze-table-complexity.html
