Overview of MySQL Statistics

Overview of MySQL Statistics

MySQL executes SQL through the process of SQL parsing and query optimization. The parser breaks down the SQL into data structures and passes them to subsequent steps. The query optimizer finds the best solution for executing the SQL query and generates an execution plan. The query optimizer determines how SQL is executed, which depends on the database statistics. Below we introduce the relevant content of innodb statistics in MySQL 5.7.

There are two types of storage for MySQL statistics: non-persistent and persistent statistics.

1. Non-persistent statistics

Non-persistent statistics are stored in memory and will be lost if the database is restarted. There are two ways to set non-persistent statistics:

1 global variables,

INNODB_STATS_PERSISTENT=OFF

2 CREATE/ALTER table parameters,

STATS_PERSISTENT=0

Non-persistent statistics are automatically updated in the following situations:

1 Execute ANALYZE TABLE

2 When innodb_stats_on_metadata=ON, execute SHOW TABLE STATUS, SHOW INDEX, and query TABLES, STATISTICS under INFORMATION_SCHEMA

3. With the --auto-rehash function enabled, log in using the mysql client

4 The table is opened for the first time

5 Since the last update of statistics, 1/16 of the data in the table has been modified

The disadvantages of non-persistent statistical information are obvious. If a large number of tables start to update statistical information after the database is restarted, it will have a great impact on the instance, so persistent statistical information is currently used.

2. Persistent Statistics

Starting from 5.6.6, MySQL uses persistent statistics by default, that is, INNODB_STATS_PERSISTENT=ON, and persistent statistics are stored in the tables mysql.innodb_table_stats and mysql.innodb_index_stats.

Persistent statistics are automatically updated in the following situations:

1 INNODB_STATS_AUTO_RECALC=ON

In this case, 10% of the data in the table is modified.

2 Adding a new index

innodb_table_stats is the statistical information of the table, and innodb_index_stats is the statistical information of the index. The meaning of each field is as follows:

innodb_table_stats

database_name

Database Name

table_name

Table name

last_update

The last time the statistics were updated

n_rows

Number of rows in the table

clustered_index_size

The number of pages in the clustered index

sum_of_other_index_sizes

The number of pages in other indexes

innodb_index_stats

database_name

Database Name

table_name

Table name

index_name

Index Name

last_update

The last time the statistics were updated

stat_name

Statistics Name

stat_value

The value of the statistic

sample_size

Sample size

stat_description

Type Description

To better understand innodb_index_stats, create a test table for illustration:

CREATE TABLE t1 (
 a INT, b INT, c INT, d INT, e INT, f INT,
 PRIMARY KEY (a, b), KEY i1 (c, d), UNIQUE KEY i2uniq (e, f)
)ENGINE=INNODB;

The written data is as follows:

To view the statistics of the t1 table, focus on the stat_name and stat_value fields.

When tat_name=size: stat_value indicates the number of indexed pages

When stat_name=n_leaf_pages: stat_value indicates the number of leaf nodes

When stat_name=n_diff_pfxNN: stat_value indicates the number of unique values ​​in the index field. Here is a detailed description:

1. n_diff_pfx01 indicates the number after distinct in the first column of the index. For example, column a of PRIMARY has only one value 1, so when index_name='PRIMARY' and stat_name='n_diff_pfx01', stat_value=1.

2. n_diff_pfx02 indicates the number of distinct values ​​in the first two columns of the index. For example, the e and f columns of i2uniq have 4 values, so when index_name='i2uniq' and stat_name='n_diff_pfx02', stat_value=4.

3. For non-unique indexes, the primary key index will be added after the original column. For example, if index_name='i1' and stat_name='n_diff_pfx03', the primary key column a is added after the original index columns c and d. The distinct result of (c, d, a) is 2.

Understanding the specific meanings of stat_name and stat_value can help us troubleshoot why the appropriate index is not used during SQL execution. For example, if the stat_value of an index n_diff_pfxNN is much smaller than the actual value, the query optimizer may believe that the index has poor selectivity, which may lead to the use of the wrong index.

3. Dealing with inaccurate statistical information

We checked the execution plan and found that the correct index was not used. If it is caused by a large difference in the statistical information in innodb_index_stats, you can handle it in the following ways:

1. Manually update statistics. Note that a read lock will be added during the execution:

ANALYZETABLE TABLE_NAME;

2. If the statistics are still inaccurate after the update, consider increasing the data pages sampled from the table. There are two ways to modify it:

a) Global variable INNODB_STATS_PERSISTENT_SAMPLE_PAGES, default value is 20;

b) A single table can specify sampling for that table:

ALTER TABLE TABLE_NAME STATS_SAMPLE_PAGES=40;

After testing, the maximum value of STATS_SAMPLE_PAGES here is 65535. If it exceeds this value, an error will be reported.

Currently, MySQL does not provide a histogram function. In some cases (such as uneven data distribution), simply updating statistical information may not necessarily result in an accurate execution plan. The only way is to specify the index through index hint. The new version 8.0 will add the histogram function. Let us look forward to MySQL becoming more and more powerful!

You may also be interested in:
  • Gearman + MySQL to achieve persistence operation example
  • Detailed explanation of deploying MySQL using Docker (data persistence)
  • Detailed explanation of Java emoji persistence in MySQL
  • MySQL 8 new features: how to modify persistent global variables
  • MySQL 8 new features: detailed explanation of persistence of auto-increment primary key
  • Reasons why MySQL 8.0 statistics are inaccurate
  • Detailed explanation of MySQL persistent statistics

<<:  How to configure SSL certificate in nginx to implement https service

>>:  Stop using absolute equality operators everywhere in JS

Recommend

Detailed explanation of Tomcat configuration and optimization solutions

Service.xml The Server.xml configuration file is ...

Vue component communication method case summary

Table of contents 1. Parent component passes valu...

Docker custom network container interconnection

Table of contents Preface –link Custom Network As...

Summary of the use of html meta tags (recommended)

Meta tag function The META tag is a key tag in th...

Linux dual network card binding script method example

In Linux operation and configuration work, dual n...

Let you understand the working principle of JavaScript

Table of contents Browser kernel JavaScript Engin...

JavaScript realizes magnifying glass special effects

The effect to be achieved: When the mouse is plac...

Analyze the duration of TIME_WAIT from the Linux source code

Table of contents 1. Introduction 2. First, let&#...

How to ensure transaction characteristics of MySQL InnoDB?

Preface If someone asks you "What are the ch...

Let's talk about the characteristics and isolation levels of MySQL transactions

The Internet is already saturated with articles o...

How to install Solr 8.6.2 in Docker and configure the Chinese word segmenter

1. Environment version Docker version 19.03.12 ce...

WeChat applet implements SMS login in action

Table of contents 1. Interface effect preview 2.u...

JavaScript Closures Explained

Table of contents 1. What is a closure? 1.2 Memoi...

Common ways to optimize Docker image size

The Docker images we usually build are usually la...