Optimize the storage efficiency of BLOB and TEXT columns in InnoDB tables

Optimize the storage efficiency of BLOB and TEXT columns in InnoDB tables

First, let's introduce a few key points about the MySQL InnoDB engine storage format:

1. InnoDB can choose to use a shared table space or an independent table space. It is recommended to use an independent table space for easy management and maintenance. Enable the innodb_file_per_table option. After 5.5, you can dynamically modify it online to take effect, and execute ALTER TABLE xx ENGINE = InnoDB to convert the existing table into an independent tablespace. For versions earlier than 5.5, you need to restart after modifying this option for it to take effect.

2. The default size of InnoDB data page is 16KB. After version 5.6, the new option innodb_page_size can be modified. In versions before 5.6, you can only modify the source code and recompile. However, it is not recommended to modify this configuration unless you are very clear about its advantages and disadvantages.

3. When new data is written to the InnoDB data page, 1/16 of the space will be reserved. The reserved space can be used for subsequent new record writing, reducing the overhead of frequently adding new data pages.

4. Each data page needs to store at least 2 rows of records. Therefore, the maximum length of a row record is theoretically 8KB, but in fact it should be smaller because there are some InnoDB internal data structures to store.

5. Limited by the InnoDB storage method, if the data is written sequentially, the ideal fill rate of the data page is 15/16, but it is generally impossible to guarantee complete sequential writing. Therefore, the fill rate of the data page is generally 1/2 to 15/16. Therefore, it is best for each InnoDB table to have an auto-increment column as the primary key so that new records can be written as sequentially as possible.

6. When the data page fill rate is less than 1/2, InnoDB will shrink it to release free space.

7. The InnoDB engine of MySQL 5.6 currently supports four formats: COMPACT , REDUNDANT , DYNAMIC , and COMPRESSED . The default format is COMPACT. COMPRESSED is rarely used and is not recommended (see the next item). If you need to use the compression feature, you can directly consider the TokuDB engine.

8. The COMPACT row format can save about 20% of storage space compared to REDUNDANT, and COMPRESSED can save about 50% of storage space compared to COMPACT, but it will cause TPS to drop by 90%. Therefore, the use of COMPRESSED row format is strongly discouraged.

9. When the row format is DYNAMIC or COMPRESSED, long columns such as TEXT/BLOB (long columns, which may also be other long columns, not necessarily only TEXT/BLOB types, depending on the specific situation) will be completely stored in an independent data page, and the clustered index page will only use a 20-byte pointer to point to the new page. This is the so-called off-page, similar to ORACLE's row migration, which wastes a lot of disk space and has poor I/O performance. Therefore, it is strongly recommended not to use BLOB, TEXT, or VARCHAR column types with a length exceeding 255.

10. When the InnoDB file format ( innodb_file_format ) is set to Antelope and the row format is COMPACT or REDUNDANT, BLOB, TEXT, or long VARCHAR columns only store the first 768 bytes in the clustered index page (the maximum 768 bytes is used to facilitate the creation of prefix indexes). The rest of the content is stored in additional pages, even if it is only one byte more. Therefore, all column lengths should be as short as possible.

11. The page of BLOB, TEXT, or long VARCHAR column stored in off-page is exclusive and cannot be shared. Therefore, it is strongly discouraged to use multiple long columns in a table.

In summary, if you really need to store BLOB, TEXT, or long VARCHAR columns in an InnoDB table in actual business, here are some suggestions:

1. Try to serialize and compress all data and store them in the same column to avoid multiple off-pages.

2. Convert columns with actual maximum storage length less than 255 to VARCHAR or CHAR type (if it is variable-length data, there is no difference between the two; if it is fixed-length data, use CHAR type).

3. If it is not possible to integrate all columns into one column, you can make a second choice and split them into multiple sub-tables according to the maximum length of each column. Try to make the total row length of each sub-table less than 8KB to reduce the frequency of off-page.

4. The above suggestions are based on the premise that the data page is the default 16KB. If you change it to 8KB or other sizes, please test it yourself based on the above theory to find the most appropriate value.

5. When the length of a character column is less than 255, whether CHAR or VARCHAR is used for storage, or the length of the VARCHAR column is defined as 255, the actual table space will not increase.

Summarize

The above is the full content of this article. I hope that the content of this article will have certain reference learning value for your study or work. Thank you for your support of 123WORDPRESS.COM. If you want to learn more about this, please check out the following links

You may also be interested in:
  • MySQL database migration quickly exports and imports large amounts of data
  • Shell script to operate MySQL database to delete duplicate data
  • How to add, delete and modify columns in MySQL database
  • MySQL implements a solution similar to Oracle sequence
  • mysql code to implement sequence function
  • Can't connect to local MySQL through socket ''/tmp/mysql.sock'' solution
  • Detailed explanation of rpm installation in mysql
  • A complete list of commonly used MySQL functions (classified and summarized)
  • Use MySQL master-slave configuration to achieve read-write separation and reduce database pressure
  • mysql+spring+mybatis to realize code configuration of database read-write separation

<<:  How to solve the problem of zabbix monitoring causing Chinese garbled characters in the graphical interface due to PHP problems

>>:  A brief talk about Rx responsive programming

Recommend

Detailed explanation of slave_exec_mode parameter in MySQL

Today I accidentally saw the parameter slave_exec...

A brief discussion on Flex layout and scaling calculation

1. Introduction to Flex Layout Flex is the abbrev...

Do you know the weird things in Javascript?

Our veteran predecessors have written countless c...

An article teaches you JS function inheritance

Table of contents 1. Introduction: 2. Prototype c...

Solve the problem of margin merging

1. Merge the margins of sibling elements The effe...

Using jQuery to implement the carousel effect

What I bring to you today is to use jQuery to imp...

MySQL trigger syntax and application examples

This article uses examples to illustrate the synt...

MySQL common backup commands and shell backup scripts sharing

To back up multiple databases, you can use the fo...

How to use linux commands to convert and splice audio formats

Install FFmpeg flac eric@ray:~$ sudo apt install ...