A brief analysis of MySQL cardinality statistics

A brief analysis of MySQL cardinality statistics

1. What is the cardinality?
Cardinality refers to the number of different values ​​​​in a column of a MySQL table.

If this type of index is unique, then the cardinality = number of rows.

If this column is sex, and the enumeration type only has male and female, then its cardinality is 2

The higher the Cardinality, the more worthy the column is to be indexed. MySQL execution plan also selects indexes based on Cardinality.

The cardinality of each column in the table can be seen in the following way.

Take this classic example:
There is a column called sex. The values ​​stored in the sex column are either male or female, and its maximum cardinality is 2.
There is no need to create an index for sex. Because, in order to improve your sex-based query speed, MySQL will create a brand new B+Tree for the new index you selected. But your sex has only two values. For MySQL, even if it creates a B+Tree index for the column you specify, when the query is actually executed, at most one binary query is performed, and the remaining operations can only be traversal, so it is not meaningful to create an index for sex.

2. When does InnoDB update the cardinality?
Parameter: innodb_stats_auto_recalc controls whether MySQL actively recalculates this persistent information. The default is 1 for true and 0 for false.
By default, cardinality information is recalculated when the rows in a table change by more than 10%.

3. The base number is estimated

The base number is not updated in real time! And it is a value estimated through sampling!

As for what the formula for the base is, it may not be important.

It is important to know that it is an estimate calculated by randomly sampling data pages.

And the number of randomly sampled pages can be set by the parameter innodb_stats_persistent_sample_pages, the default value is 20.

This means that the base value is not accurate, and even the results of your calculations may differ greatly each time.

4. Persistent Cardinality

You can use the parameter innodb_stats_persistent to control whether the cardinality is persisted. The default setting is off.

Of course you can set STATS_PERSISTENT=1 for a single table and its innodb_stats_persistent will be automatically enabled.

The advantage of enabling it is that restarting MySQL will not recalculate this value, thus speeding up the restart process.

5. How to actively update the base?

Executing the following SQL will trigger InnoDB to update the cardinality (even if you don't realize it will update the cardinality).

So try to choose a low business peak period

analyze table tableName;

If the number of samples is too small, the calculated cardinality will be extremely wrong. That will most likely cause the MySQL optimizer to choose the wrong index. This is where you can increase the value appropriately. But increasing it too much may cause ANALYZE TABLE to run slowly.

Conversely, ANALYZE TABLE runs too slowly. You can adjust the value of the parameter innodb_stats_persistent_sample_pages appropriately. However, this may lead to inaccurate cardinality calculation.

If there is no way to balance the relationship between the two. Consider reducing the number of indexed columns in the table or limiting the number of partitions to reduce ANALYZE TABLE complexity. The number of columns in the table's primary key is also important because the primary key columns are appended to each nonunique index.

refer to:

https://dev.mysql.com/doc/refman/5.7/en/innodb-persistent-stats.html

https://dev.mysql.com/doc/refman/5.7/en/innodb-analyze-table-complexity.html

The above is a brief analysis of the details of MySQL cardinality statistics. For more information about MySQL cardinality statistics, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Analysis of the principles of Mysql dirty page flush and shrinking table space
  • Recommend several MySQL related tools
  • A brief analysis of MySQL's lru linked list
  • MySQL Query Cache and Buffer Pool
  • mysql method to recursively search for all child nodes of a menu node
  • What is a MySQL tablespace?
  • How to locate MySQL slow queries
  • MySQL Flush-List and dirty page flushing mechanism

<<:  Detailed explanation of web page loading progress bar (recommended)

>>:  Solution to the problem of insufficient storage resource pool of Docker server

Recommend

Front-end JavaScript housekeeper package.json

Table of contents 1. Required attributes 1. name ...

Demystifying the HTML 5 Working Draft

The World Wide Web Consortium (W3C) has released a...

Pay attention to the use of HTML tags in web page creation

HTML has attempted to move away from presentation...

In-depth explanation of special permissions SUID, SGID and SBIT in Linux

Preface For the permissions of files or directori...

Summary of experience in using div box model

Calculation of the box model <br />Margin + ...

Nginx implements https website configuration code example

https base port 443. It is used for something cal...

Why is it not recommended to use index as key in react?

1. Compare the old virtual DOM with the new virtu...

How does MySQL connect to the corresponding client process?

question For a given MySQL connection, how can we...

Core skills that web front-end development engineers need to master

The content involved in Web front-end development...

MySQL InnoDB tablespace encryption example detailed explanation

Preface Starting from MySQL 5.7.11, MySQL support...

MySQL PXC builds a new node with only IST transmission (recommended)

Demand scenario: The existing PXC environment has...

Docker memory monitoring and stress testing methods

The Docker container that has been running shows ...

How to solve the problem of margin overlap

1. First, you need to know what will trigger the v...

Basic use of javascript array includes and reduce

Table of contents Preface Array.prototype.include...