Causes of MySQL Table FragmentationRegarding the causes of table fragmentation in MySQL, let's briefly summarize. The causes of fragmentation may vary depending on the MySQL Engine. These differences are not deeply understood and analyzed here. This article focuses only on the InnoDB engine. If there are any deficiencies or errors in the summary, please point them out. The data of the InnoDB table is stored in pages, and each page can store multiple records. These records are organized in a tree structure, which is called a B+ tree index. Both the data in the table and the auxiliary index use the B+ tree structure. This B+ tree index that maintains all the data in the table is called a clustered index and is organized by primary key. The leaf nodes of a clustered index contain the values of all fields in the row, and the leaf nodes of a secondary index contain the index columns and the primary key columns. In InnoDB, when some rows are deleted, they are only marked as "deleted" instead of being physically deleted from the index, so the space is not really released and reclaimed. InnoDB's Purge thread will clean up these unused index keys and rows asynchronously. However, the freed space is still not returned to the operating system for reuse, which results in many holes in the pages. If the table structure contains dynamic-length fields, these holes may not even be reused by InnoDB to store new rows because there is insufficient space. For more information about this, you can refer to the blog Overview of fragmented MySQL InnoDB tables. In addition, deleting data will cause blank space to appear in the page. A large number of random DELETE operations will inevitably cause discontinuous blank space in the data file. When inserting data, these blank spaces will be used, resulting in discontinuous storage locations for the data. The physical storage order is different from the logical sorting order, which is data fragmentation. A large number of UPDATEs will also cause file fragmentation. The smallest physical storage allocation unit of Innodb is a page, and UPDATE may also cause page splits. Frequent page splits will make pages sparse and irregularly filled, so the data will eventually be fragmented.
The data storage of the table may also become fragmented. However, fragmentation of data storage is more complicated than indexing. There are three types of data fragmentation. ##The following part is excerpted from [High Performance MySQL]## Row fragmentationThis fragmentation means that the data rows are stored as multiple pieces in multiple places. Even if the query only accesses one row from the index. Row fragmentation can also cause performance degradation. Intra-row fragmentationRow fragmentation occurs when logically sequential pages or rows are not stored sequentially on disk. Row fragmentation has a significant impact on operations such as full table scans and clustered index scans, which benefit from data being stored sequentially on disk. Free space fragmentationRemaining space fragmentation refers to a large amount of free space in a data page. This can cause the server to read a lot of unnecessary data. This causes waste. For MyISAM tables, all three types of fragmentation may occur. But InnoDB does not have short row fragmentation; InnoDB moves short rows and writes them into one fragment. InnoDb will move short rows and rewrite them into one fragment. The official document 14.15.4 Defragmenting a Table describes how to reduce table fragmentation as follows (very concise, MySQL official documents are often concise, with a lot of information, but no detailed introduction):
Random insertions or deletions from a secondary index can cause the index to become fragmented. Fragmentation means that the physical ordering of the index pages on disk is not close to the index ordering recorded on the pages, or that there are many unused pages in the 64-page blocks allocated to the index. One symptom of fragmentation is a table taking up more space than it "should" take up. How much, exactly, is hard to determine. All InnoDB data and indexes are stored in B-trees, whose fill factor may vary between 50% and 100%. Another symptom of fragmentation is that a table scan like this takes much longer than it "should" take How to find severely fragmented tables in MySQLRegarding table fragmentation in MySQL, there are generally two ways to find out the fragmentation in MySQL. Method 1: Use show table status from xxxx like 'xxxx' \G;The first xxx: the name of the database where the table is located, the second xxx: the name of the table to be queried. This method is actually not very practical. For example, you can only query the fragmentation of a single table (do you have to try it for each table in a database?), and you cannot query the fragmentation of all tables in a database or the entire instance. This is just a reference method. mysql> create table frag_tab_myisam -> ( -> id int, -> name varchar(63) ->) engine=MyISAM; Query OK, 0 rows affected (0.00 sec) mysql> insert into frag_tab_myisam -> values(1, 'it is only test row 1'); Query OK, 1 row affected (0.00 sec) mysql> mysql> insert into frag_tab_myisam -> values(2, 'it is only test row 2'); Query OK, 1 row affected (0.00 sec) mysql> mysql> mysql> insert into frag_tab_myisam -> values(3, 'it is only test row 3'); Query OK, 1 row affected (0.00 sec) mysql> mysql> insert into frag_tab_myisam -> values(4, 'it is only test row 4'); Query OK, 1 row affected (0.00 sec) mysql> mysql> show table status from kkk like 'frag_tab_myisam' \G; As shown in the following screenshot, if there is no DML operation, the size of Data_free is 0 Then we delete 2 records from the database. As shown below, the size of Data_free is 64KB. mysql> delete from frag_tab_myisam where id =1; Query OK, 1 row affected (0.00 sec) mysql> delete from frag_tab_myisam where id = 3; Query OK, 1 row affected (0.00 sec) Method 2: Query information_schema.TABLES to obtain table fragmentation information.As shown below, this is a classic script I compiled to query table fragmentation. You can make a lot of derivatives on it: for example, query the table fragmentation of a database. Or the table has more than 50M free space. You can set the query conditions according to your needs. Skip this here. SELECT CONCAT(table_schema, '.', table_name) AS TABLE_NAME ,engine AS TABLE_ENGINE ,table_type AS TABLE_TYPE ,table_rows AS TABLE_ROWS ,CONCAT(ROUND(data_length / ( 1024 * 1024), 2), 'M') AS TB_DATA_SIZE ,CONCAT(ROUND(index_length / ( 1024 * 1024), 2), 'M') AS TB_IDX_SIZE ,CONCAT(ROUND((data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') AS TOTAL_SIZE ,CASE WHEN data_length = 0 THEN 0 ELSE ROUND(index_length / data_length, 2) END AS TB_INDX_RATE ,CONCAT(ROUND( data_free / 1024 / 1024,2), 'MB') AS TB_DATA_FREE ,CASE WHEN (data_length + index_length) = 0 THEN 0 ELSE ROUND(data_free/(data_length + index_length),2) END AS TB_FRAG_RATE FROM information_schema.TABLES ORDER BY data_free DESC; SELECT CONCAT(table_schema, '.', table_name) AS TABLE_NAME ,engine AS TABLE_ENGINE ,table_type AS TABLE_TYPE ,table_rows AS TABLE_ROWS ,CONCAT(ROUND(data_length / ( 1024 * 1024), 2), 'M') AS TB_DATA_SIZE ,CONCAT(ROUND(index_length / ( 1024 * 1024), 2), 'M') AS TB_IDX_SIZE ,CONCAT(ROUND((data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') AS TOTAL_SIZE ,CASE WHEN data_length = 0 THEN 0 ELSE ROUND(index_length / data_length, 2) END AS TB_INDX_RATE ,CONCAT(ROUND( data_free / 1024 / 1024,2), 'MB') AS TB_DATA_FREE ,CASE WHEN (data_length + index_length) = 0 THEN 0 ELSE ROUND(data_free/(data_length + index_length),2) END AS TB_FRAG_RATE FROM information_schema.TABLES WHERE ROUND(DATA_FREE/1024/1024,2) >=50 ORDER BY data_free DESC; SELECT TABLE_SCHEMA ,TABLE_NAME ,ENGINE ,ROUND(((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024), 2) AS SIZE_MB ,ROUND(DATA_FREE/1024/1024,2) AS FREE_SIZ_MB FROM information_schema.TABLES WHERE DATA_FREE >=10*1024*1024 ORDER BY FREE_SIZ_MB DESC; How to reduce table fragmentation in MySQLIn MySQL, you can use OPTIMIZE TABLE and ALTER TABLE XXXX ENGINE = INNODB to reduce fragmentation. A brief introduction to these two methods is as follows: OPTIMIZE TABLEOPTIMIZE TABLE reorganizes the physical storage of tables and indexes, reducing storage space usage and improving IO efficiency when accessing tables. The exact changes made to each table depend on the storage engine used by that table. OPTIMIZE TABLE supports the following table types: INNODB, MYISAM, ARCHIVE, and NDB. It reorganizes the physical pages of table data and indexes, which is effective in reducing the space occupied and optimizing IO when accessing the table. The OPTIMIZE operation will temporarily lock the table, and the larger the amount of data, the longer it will take. After OPTIMIZE TABLE, the changes in the table are related to the storage engine. For MyISAM, PTIMIZE TABLE works as follows:
The original English text is as follows: For MyISAM tables, OPTIMIZE TABLE works as follows: 1. If the table has deleted or split rows, repair the table. 2. If the index pages are not sorted, sort them. 3. If the table's statistics are not up to date (and the repair could not be accomplished by sorting the index), update them. For InnoDB, PTIMIZE TABLE works as follows For InnoDB tables, OPTIMIZE TABLE maps to ALTER TABLE ... FORCE , which rebuilds the table to update index statistics and free up unused space in the clustered index. When you run it on an InnoDB table, it appears in the output of OPTIMIZE TABLE as follows: mysql> OPTIMIZE TABLE foo; +----------+----------+----------+-------------------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +----------+----------+----------+-------------------------------------------------------------------+ | test.foo | optimize | note | Table does not support optimize, doing recreate + analyze instead | | test.foo | optimize | status | OK | +----------+----------+----------+-------------------------------------------------------------------+ OPTIMIZE TABLE uses online DDL for normal and partitioned InnoDB tables, thus reducing downtime for concurrent DML operations. The rebuild of the table is triggered by OPTIMIZE TABLE and is done under the cover of ALTER TABLE ... FORCE. Exclusive table locks are taken only briefly during the prepare and commit phases of the operation. During the prepare phase, metadata is updated and intermediate tables are created. During the commit phase, table metadata changes are committed. OPTIMIZE TABLE rebuilds the table using the table copy method under the following conditions:
OPTIMIZE TABLE does not support online DDL for InnoDB tables that contain FULLTEXT indexes. Instead, use the copy table method. InnoDB uses a page allocation method to store data and does not suffer from fragmentation like traditional storage engines such as MyISAM. When considering whether to run optimization, consider the workload of transactions that the server will process:
Updates to a row typically rewrite data in the same page when there is enough space for the row, depending on the data type and the row format. See Section 14.9.1.5, “How Compression Works for InnoDB Tables”, and Section 14.11, “InnoDB Row Formats”. Highly concurrent workloads may leave gaps in indexes over time because InnoDB keeps multiple versions of the same data through its MVCC mechanism. See Section 14.3, “InnoDB Multi-Versioning”. In addition, for InnoDB tables with innodb_file_per_table=1, OPTIMIZE TABLE will reorganize the physical storage of tables and indexes and release free space to the operating system. That is to say, OPTIMIZE TABLE [tablename] is only applicable to independent table spaces. For more details about OPTIMIZE TABLE, see https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html. I feel the official documentation is quite detailed. ALTER TABLE table_name ENGINE = Innodb;This is actually a NULL operation. On the surface, it does nothing, but in fact it reorganizes the fragments. When the optimization operation is performed, an empty ALTER command is actually executed, but this command also plays an optimization role. It will rebuild the entire table and delete unused blank space.
Question 1: Is it better to use OPTIMIZE TABLE or ALTER TABLE xxxx ENGINE = INNODB?In fact, for the InnoDB engine, ALTER TABLE xxxx ENGINE = INNODB executes an empty ALTER TABLE operation. OPTIMIZE TABLE is equivalent to ALTER TABLE ... FORCE. Referring to the description above, in some cases, OPTIMIZE TABLE or ALTER TABLE xxxx ENGINE= INNODB is basically the same. But in some cases, ALTER TABLE xxxx ENGINE= INNODB is better. For example, the old_alter_table system variable is not enabled, etc. In addition, for MyISAM type tables, using ALTER TABLE xxxx ENGINE = INNODB is obviously better than OPTIMIZE TABLE. Question 2: Will index fragments on the ALTER TABLE xxxx ENGINE= INNODB table be defragmented?ALTER TABLE ENGINE = INNODB will reorganize the data and indexes on the clustered index. If you want to verify it experimentally, you can compare the size of index_length before and after executing this command. Other ToolsNetizens suggest using pt tools or gh-ost to reduce table fragmentation. I haven’t used such tools yet, but they probably encapsulate the above two commands. This will not be introduced in detail here. References:High-performance MySQL https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html https://dev.mysql.com/doc/refman/8.0/en/innodb-file-defragmenting.html https://lefred.be/content/overview-of-fragmented-mysql-innodb-tables/ https://yq.aliyun.com/articles/41166 http://mysql.taobao.org/monthly/2015/08/05/ This is the end of this article about MySQL table defragmentation and space recovery methods. For more information about MySQL table defragmentation and space recovery, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: How to connect to a remote docker server with a certificate
>>: Combining XML and CSS styles
introduction Let's start with our content. I ...
Usually the pictures uploaded by users need to be...
Usage of time difference functions TIMESTAMPDIFF ...
The installation method of MySQL5.7 rpm under Lin...
1. Create a folder to store nginx shell scripts /...
People who often use MySQL may encounter the foll...
This article shares with you the specific method ...
Table of contents 1. Main functions 2. Implementa...
The css animation of the rotating flip effect, th...
1. Set and change the root password Check whether...
1. Requirements When using the Vue.js framework t...
Install the nvidia graphics card driver under Ubu...
I believe everyone is familiar with database inde...
In Dockerfile, run, cmd, and entrypoint can all b...
This article shares the specific code of the js n...