Example to explain the size of MySQL statistics table

Example to explain the size of MySQL statistics table

Counting the size of each table in each database is one of the simplest requirements of data governance. This article will count the data volume of each database and table in MySQL from two aspects: sampling statistical results and precise statistical results.

1. Statistical estimation of data volume

The MySQL data dictionary library information_schema records the estimated amount of data (inaccurate for InnoDB engine tables, accurate for MyISAM engine tables) and information such as data size, index size, and table fragment size.

If you want to know the approximate data volume of each database and table, you can directly query information_schema.tables for statistics. For example:

SELECT table_schema,table_name,table_rows,data_length+index_length+ data_free data_size
 FROM information_schema.`TABLES` WHERE table_schema IN ('db1','db2');

The unit of data_size is B

As mentioned above, the number of data items and sizes in the statistical information are based on the sampling statistics of part of the data, which may be different from the actual size. The larger the table, the more obvious the difference. If you want to know the actual situation of each table, you need to use the subsequent methods.

2. Count the actual amount of data

If you want to count the actual size of each table, you have to traverse each table and calculate the number of records. You can check the size of each table by checking the table space size (each table has an independent table space). The purpose of accurate statistics can be achieved by following the steps below.

Creating Paths

Create a working directory to save scripts and temporary files, etc.

mkdir -p /usr/local/data_size

Create statistics database and table

Create a statistics database on the database instance that needs statistics.

SQL> create database bak_db;

Create a stored procedure for statistics

SQL> use bak_db;
SQL>CREATE PROCEDURE `p_db_size`()
BEGIN
DECLARE v_id INT;
DECLARE v_maxid INT;
DECLARE v_tbname VARCHAR(50);
DECLARE v_dbname VARCHAR(50);
DECLARE v_sql_upd VARCHAR(200);
SET v_id =(SELECT MIN(id) FROM bak_db.tb_size);
SET v_maxid =(SELECT MAX(id) FROM bak_db.tb_size);
WHILE v_id <=v_maxid
DO
SET v_tbname = (SELECT tbname FROM bak_db.tb_size WHERE id=v_id);
SET v_dbname = (SELECT dbname FROM bak_db.tb_size WHERE id=v_id);
SET v_sql_upd = CONCAT('update bak_db.tb_size set tb_rows=(select count(*) from ',v_dbname,".",v_tbname,") where id=",v_id);
  SET @v_sql_upd := v_sql_upd;
  PREPARE stmt FROM @v_sql_upd;
  EXECUTE stmt ;
  DEALLOCATE PREPARE stmt;
  SET v_id = v_id +1;
END WHILE;
  END;

Creating a Script

vim data.sh
/* Insert the following content */

#!/bin/bash
cd /usr/local/data_size
du -s /data/mysql/mysql3306/data/db1/* |grep -v ".frm" |grep -v ".opt" >/usr/local/data_size/data_size     
du -s /data/mysql/mysql3306/data/db2/* |grep -v ".frm" |grep -v ".opt">>/usr/local/data_size/data_size

# The next 4 steps are to splice into sql 
awk '{print "insert into bak_db.tb_size(size,tb_route)values("""$0}' /usr/local/data_size/data_size >/usr/local/data_size/data_size1  
awk '{print $0";"}' /usr/local/data_size/data_size1 >/usr/local/data_size/data_size.sql
sed -i "s#\t#,'#g" /usr/local/data_size/data_size.sql
sed -i "s#;#');#g" /usr/local/data_size/data_size.sql

# Create a statistics table /usr/local/mysql5.7/bin/mysql -uroot -p'Test#123456' -h 192.168.28.132 -e "drop table if exists bak_db.tb_size;CREATE TABLE IF NOT EXISTS bak_db.tb_size ( id INT(11) NOT NULL PRIMARY KEY AUTO_INCREMENT ,size INT,tb_route VARCHAR(200),tbname VARCHAR(50),dbname VARCHAR(50),tb_rows INT(11));" 

# Import data /usr/local/mysql5.7/bin/mysql -uroot -p'Test#123456' -h 192.168.28.132 -e "use bak_db;truncate table bak_db.tb_size;source /usr/local/data_size/data_size.sql;"

# Generate library name and table name. Of course, this step can also be obtained from the data field. /usr/local/mysql5.7/bin/mysql -uroot -p'Test#123456' -h 192.168.28.132 -e "use bak_db;UPDATE bak_db.tb_size SET tbname=REPLACE(SUBSTRING_INDEX(tb_route,'/',-1),'.ibd','');"
 /usr/local/mysql5.7/bin/mysql -uroot -p'Test#123456' -h 192.168.28.132 -e "UPDATE bak_db.tb_size SET dbname=LEFT (SUBSTRING_INDEX(tb_route,'/',-2),INSTR(SUBSTRING_INDEX(tb_route,'/',-2),'/')-1);"
sleep 10 # If the previous steps are in the master database, it is recommended to pause for a while to avoid failure to obtain the table and content when performing statistics later. If the previous steps are in the slave database, this step can be omitted echo 'start call procedure' 
# Call the stored procedure to count the number of records in each table /usr/local/mysql5.7/bin/mysql -uroot -p'Test#123456' -h 192.168.28.132 -e "use bak_db;call bak_db.p_db_size();"

# Export the table and data /usr/local/mysql5.7/bin/mysqldump -uroot -p'Test#123456' -h 192.168.28.132 --single-transaction bak_db tb_size >/usr/local/data_size/tb_size.sql

# Import the table and results into the master database (the slave database is equivalent to deleting and rebuilding it once)
/usr/local/mysql5.7/bin/mysql -uroot -p'Test#123456' -h 192.168.28.128 -e "use bak_db;source /usr/local/data_size/tb_size.sql;"

The results are as follows:

It can be seen that the exact value is very different from the value in the statistical information, and the larger the table, the more obvious the difference.

TIPS: There is still a lot of room for optimization in the precise statistical script in this article. It was written in a hurry. You can adjust it as needed. My level is limited, so please feel free to correct me. If you have any questions, please feel free to communicate with me.

The above is an example to explain the details of MySQL statistical library table size. For more information about MySQL statistical library table size, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Overview of MySQL Statistics
  • A small Shell script to accurately count the number of rows in each Mysql table
  • MySQL implements multi-table association statistics (subquery statistics) example
  • Python implements Mysql data statistics and numpy statistical functions
  • sqlserver/mysql statistics of continuous time period data by day, hour, minute [recommended]
  • Detailed explanation of MySQL to obtain statistical data for each day and each hour of a certain period of time
  • PHP+MySQL to achieve daily data statistics optimization operation example within a period of time
  • PHP advertising click statistics code (php+mysql)
  • Reasons why MySQL 8.0 statistics are inaccurate

<<:  A brief analysis of React's understanding of state

>>:  Detailed explanation of how to use Docker to deploy Django+MySQL8 development environment

Recommend

Docker Basics

Preface: Docker is an open source application con...

JavaScript to achieve the effect of clicking on the self-made menu

This article shares the specific code of JavaScri...

How to store text and pictures in MySQL

Large Text Data Types in Oracle Clob long text ty...

Detailed tutorial on building a local idea activation server

Preface The blogger uses the idea IDE. Because th...

Example of using the href attribute and onclick event of a tag

The a tag is mainly used to implement page jump, ...

The implementation principle of Tomcat correcting the JDK native thread pool bug

To improve processing power and concurrency, Web ...

mysql implements the operation of setting multiple primary keys

User table, ID number must be unique, mobile phon...

How to deploy MongoDB container with Docker

Table of contents What is Docker deploy 1. Pull t...

Summary of various common join table query examples in MySQL

This article uses examples to describe various co...

Summary of changes in the use of axios in vue3 study notes

Table of contents 1. Basic use of axio 2. How to ...

Common HTML tag writing errors

We better start paying attention, because HTML Po...

How to apply TypeScript classes in Vue projects

Table of contents 1. Introduction 2. Use 1. @Comp...

Solve the problem that the docker container cannot ping the external network

Today, when I was building a redis environment in...