A brief analysis of MySQL cardinality statistics

A brief analysis of MySQL cardinality statistics

1. What is the cardinality?
Cardinality refers to the number of different values ​​​​in a column of a MySQL table.

If this type of index is unique, then the cardinality = number of rows.

If this column is sex, and the enumeration type only has male and female, then its cardinality is 2

The higher the Cardinality, the more worthy the column is to be indexed. MySQL execution plan also selects indexes based on Cardinality.

The cardinality of each column in the table can be seen in the following way.

Take this classic example:
There is a column called sex. The values ​​stored in the sex column are either male or female, and its maximum cardinality is 2.
There is no need to create an index for sex. Because, in order to improve your sex-based query speed, MySQL will create a brand new B+Tree for the new index you selected. But your sex has only two values. For MySQL, even if it creates a B+Tree index for the column you specify, when the query is actually executed, at most one binary query is performed, and the remaining operations can only be traversal, so it is not meaningful to create an index for sex.

2. When does InnoDB update the cardinality?
Parameter: innodb_stats_auto_recalc controls whether MySQL actively recalculates this persistent information. The default is 1 for true and 0 for false.
By default, cardinality information is recalculated when the rows in a table change by more than 10%.

3. The base number is estimated

The base number is not updated in real time! And it is a value estimated through sampling!

As for what the formula for the base is, it may not be important.

It is important to know that it is an estimate calculated by randomly sampling data pages.

And the number of randomly sampled pages can be set by the parameter innodb_stats_persistent_sample_pages, the default value is 20.

This means that the base value is not accurate, and even the results of your calculations may differ greatly each time.

4. Persistent Cardinality

You can use the parameter innodb_stats_persistent to control whether the cardinality is persisted. The default setting is off.

Of course you can set STATS_PERSISTENT=1 for a single table and its innodb_stats_persistent will be automatically enabled.

The advantage of enabling it is that restarting MySQL will not recalculate this value, thus speeding up the restart process.

5. How to actively update the base?

Executing the following SQL will trigger InnoDB to update the cardinality (even if you don't realize it will update the cardinality).

So try to choose a low business peak period

analyze table tableName;

If the number of samples is too small, the calculated cardinality will be extremely wrong. That will most likely cause the MySQL optimizer to choose the wrong index. This is where you can increase the value appropriately. But increasing it too much may cause ANALYZE TABLE to run slowly.

Conversely, ANALYZE TABLE runs too slowly. You can adjust the value of the parameter innodb_stats_persistent_sample_pages appropriately. However, this may lead to inaccurate cardinality calculation.

If there is no way to balance the relationship between the two. Consider reducing the number of indexed columns in the table or limiting the number of partitions to reduce ANALYZE TABLE complexity. The number of columns in the table's primary key is also important because the primary key columns are appended to each nonunique index.

refer to:

https://dev.mysql.com/doc/refman/5.7/en/innodb-persistent-stats.html

https://dev.mysql.com/doc/refman/5.7/en/innodb-analyze-table-complexity.html

The above is a brief analysis of the details of MySQL cardinality statistics. For more information about MySQL cardinality statistics, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Analysis of the principles of Mysql dirty page flush and shrinking table space
  • Recommend several MySQL related tools
  • A brief analysis of MySQL's lru linked list
  • MySQL Query Cache and Buffer Pool
  • mysql method to recursively search for all child nodes of a menu node
  • What is a MySQL tablespace?
  • How to locate MySQL slow queries
  • MySQL Flush-List and dirty page flushing mechanism

<<:  Detailed explanation of web page loading progress bar (recommended)

>>:  Solution to the problem of insufficient storage resource pool of Docker server

Recommend

A comprehensive analysis of what Nginx can do

Preface This article only focuses on what Nginx c...

HTML css js implements Tab page sample code

Copy code The code is as follows: <html xmlns=...

Quickly solve the problem of slow startup after Tomcat reconfiguration

During the configuration of Jenkins+Tomcat server...

Common structural tags in XHTML

structure body, head, html, title text abbr, acro...

MySQL series tutorials for beginners

Table of contents 1. Basic concepts and basic com...

Summary of HTML formatting standards for web-based email content

1. Page requirements 1) Use standard headers and ...

Vue realizes simple effect of running light

This article shares the specific code of Vue to a...

Summary of four situations of joint query between two tables in Mysql

Generally speaking, in order to get more complete...

How to implement real-time polygon refraction with threejs

Table of contents Preface Step 1: Setup and front...

Detailed explanation of Nginx http resource request limit (three methods)

Prerequisite: nginx needs to have the ngx_http_li...

CSS3 realizes the effect of triangle continuous enlargement

1. CSS3 triangle continues to zoom in special eff...

Example code for implementing a hollow mask layer with CSS

Contents of this article: Page hollow mask layer,...

CSS easily implements fixed-ratio block-level containers

When designing H5 layout, you will usually encoun...

Several ways to use v-bind binding with Class and Style in Vue

Adding/removing classes to elements is a very com...