MySQL advanced features - detailed explanation of the concept and mechanism of data table partitioning

Partitioning mechanism

SELECT query
INSERT Operation
DELETE Operation
UPDATE Operation

Types of Partitions

MySQL implements partitioning as a wrapper over a set of hidden underlying tables, which means that indexes are defined per partition rather than across the entire table. This differs from Oracle, where indexes and tables can be partitioned in more flexible and complex ways.

MySQL decides which partition a row belongs to by evaluating the conditions defined in the PARTITION BY clause. When executing a query, the query optimizer prunes partitions: the query does not examine every partition, only those that can contain the requested data.

The main purpose of partitioning is to act as a coarse form of indexing and clustering over the table. This reduces the amount of the table that has to be accessed and keeps related rows stored close together. The benefits are most significant in the following scenarios:

The table is too large to fit in memory, or it holds a small set of frequently accessed "hot" rows alongside a large amount of historical data.
Partitioned data is easier to maintain than unpartitioned data. For example, old data can be easily cleared by deleting the entire partition, and single partitions can also be easily optimized, checked, and repaired.
Partitioning allows data to be physically distributed across the storage, which enables the server to use multiple hard drives more efficiently.
You can use partitioning to avoid bottlenecks in certain workloads.
For data backup, individual partitions can be backed up or restored individually, which is very beneficial for large data sets.
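
The maintenance benefits listed above can be sketched with a few statements. This is only an illustration: the table `access_log`, its columns, and its partition names are hypothetical and do not come from the original article. CHECK and ANALYZE are used here because some storage engines, InnoDB among them, may rebuild the whole table for OPTIMIZE PARTITION.

```sql
-- Hypothetical log table, used only for illustration.
CREATE TABLE access_log (
  logged_on DATE NOT NULL,
  message   VARCHAR(255)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(logged_on)) (
  PARTITION p_2022   VALUES LESS THAN (2023),
  PARTITION p_2023   VALUES LESS THAN (2024),
  PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Purge all rows from 2022 and earlier by dropping their partition.
ALTER TABLE access_log DROP PARTITION p_2022;

-- Check a single partition, then refresh its index statistics.
ALTER TABLE access_log CHECK PARTITION p_2023;
ALTER TABLE access_log ANALYZE PARTITION p_2023;
```

Dropping a partition removes its rows without a row-by-row DELETE, which is why it is usually much cheaper for purging old data.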

The implementation details of MySQL partitioning are complex, so this article concentrates on their performance implications. For more detail, see the partitioning chapter of the MySQL manual. Partitioning also brings some problems and limitations of its own:

The commands to create and alter tables are more complex.
Each table can have a maximum of 1024 partitions (the limit was raised to 8192 in MySQL 5.6.7).
In MySQL 5.1, the partitioning expression must be an integer or return an integer; in MySQL 5.5 and later, columns can be used for partitioning in some cases.
Any primary key or unique index must contain all columns in the partitioning expression.
Foreign key constraints cannot be used.
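
As a minimal sketch of the primary key rule in the list above, the following hypothetical table (the name `orders_pk_demo` and its columns are assumptions made for illustration) partitions on `order_date`, so `order_date` has to be part of the primary key.

```sql
-- Hypothetical table. Declaring PRIMARY KEY (id) alone would be rejected,
-- because order_date is used in the partitioning expression.
CREATE TABLE orders_pk_demo (
  id         BIGINT NOT NULL AUTO_INCREMENT,
  order_date DATE NOT NULL,
  amount     DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id, order_date)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(order_date)) (
  PARTITION p_until_2019 VALUES LESS THAN (2020),
  PARTITION p_rest       VALUES LESS THAN MAXVALUE
);
```

The cost of this workaround is that `id` alone is no longer guaranteed unique by the key, which is one reason the restriction matters in schema design.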

Partitioning mechanism

As mentioned earlier, a partitioned table actually consists of multiple hidden physical tables, each represented by a handler object. We cannot access the partitions directly. Each partition is managed by the storage engine in the usual way (all partitions must therefore use the same storage engine), and an index on the table is actually a set of indexes, one on each hidden physical table. From the storage engine's perspective, a partition is just a table: the engine does not know whether a table it manages is independent or a partition of a larger table. Operations on a partitioned table are implemented through the following logical steps:

SELECT query

When querying a partitioned table, the partitioning layer opens and locks all hidden partitions, the query optimizer determines which partitions can be ignored, and the partitioning layer then calls the storage engine that manages the remaining partitions through the handler API to obtain the query results.

INSERT Operation

When a row of data is inserted, the partition layer opens and locks all partitions, then determines which partition stores the current data row, and stores the data row in the corresponding partition.

DELETE Operation

When deleting a row of data, the partitioning layer opens and locks all partitions, checks which partition contains the row of data, and then sends the delete request to that partition.

UPDATE Operation

When modifying a row of data, the partitioning layer opens and locks all partitions, checks which partition contains the row, fetches the row and modifies it, determines which partition should contain the modified row, sends an insert request to that partition, and sends a delete request to the original partition.

Some of the above operations support partition filtering (that is, ignoring irrelevant partitions). For example, when deleting a row, the server needs to locate the row first. If the WHERE clause contains a condition that matches the partitioning expression, the server can ignore the partitions that cannot contain the row. The same is true for UPDATE operations. INSERT operations are filtered by nature: the server works out the single partition that should receive the row and does not touch the others.
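
The UPDATE and DELETE behaviour described above can be illustrated with a small sketch. The table `orders_demo`, its columns, and its partition names are hypothetical and chosen only for this example.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE orders_demo (
  id         INT NOT NULL,
  order_date DATE NOT NULL,
  PRIMARY KEY (id, order_date)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(order_date)) (
  PARTITION p_before_2020 VALUES LESS THAN (2020),
  PARTITION p_2020        VALUES LESS THAN (2021),
  PARTITION p_rest        VALUES LESS THAN MAXVALUE
);

-- YEAR('2019-06-01') = 2019, so this row is stored in p_before_2020.
INSERT INTO orders_demo (id, order_date) VALUES (1, '2019-06-01');

-- Changing the partitioning column moves the row: it is inserted into
-- p_2020 and deleted from p_before_2020.
UPDATE orders_demo SET order_date = '2020-06-01' WHERE id = 1;

-- The WHERE clause constrains the partitioning column, so the server can
-- skip partitions that cannot contain the row.
DELETE FROM orders_demo
WHERE id = 1
  AND order_date >= '2020-01-01' AND order_date < '2021-01-01';
```

A DELETE that filtered on `id` alone would still work, but it would give the server no basis for ignoring any partition.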

Although the partitioning layer opens and locks all partitions, this does not mean the partitions stay locked. A storage engine that implements its own row-level locking, such as InnoDB, instructs the partitioning layer to release the partition-level locks. The overall locking and unlocking sequence is similar to that of queries against ordinary InnoDB tables.

Types of Partitions

MySQL supports several types of partitioning. The most commonly used type is range partitioning, which is partitioning by different ranges of values or expressions of certain columns. For example, the following statement divides sales data into different partitions based on year:

CREATE TABLE sales (
  order_date DATETIME NOT NULL
  -- Other column definitions
) ENGINE=InnoDB PARTITION BY RANGE(YEAR(order_date)) (
  PARTITION p_2018 VALUES LESS THAN (2018),
  PARTITION p_2019 VALUES LESS THAN (2019),
  PARTITION p_2020 VALUES LESS THAN (2020),
  PARTITION p_other VALUES LESS THAN MAXVALUE
);

A variety of functions can be used in the partition clause. The main requirement is that it must return a non-constant, deterministic integer. In the above example the YEAR function is used, but other functions can also be used, such as TO_DAYS(). Using time intervals for partitioning is a common approach for date-based data.
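
To see whether the optimizer prunes partitions for a query on the sales table above, the execution plan can be inspected. This is a sketch that assumes the table has already been created in the current database; the exact partitions reported depend on the server version and on how the optimizer evaluates the partitioning function.

```sql
-- Assumes the sales table defined above already exists.
-- MySQL 5.7 and later: plain EXPLAIN output includes a "partitions" column.
-- MySQL 5.1 to 5.6: write EXPLAIN PARTITIONS SELECT ... instead.
EXPLAIN
SELECT COUNT(*)
FROM sales
WHERE order_date >= '2019-01-01' AND order_date < '2020-01-01';

-- List the partitions of the table with their upper bounds and
-- approximate row counts.
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'sales';
```

Note that the condition is written against the bare `order_date` column. Wrapping the column in some other function in the WHERE clause generally prevents the optimizer from pruning.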

MySQL also supports key, hash, and list partitioning methods, and some also support subpartitioning (which is rarely used in practice). In MySQL 5.5 and later, you can use the RANGE COLUMNS partition type to partition directly by date-based columns without using functions to convert dates to integers. Other common partitioning techniques include:

Partition by key to help reduce InnoDB mutex contention.
Partition by range on a modulo expression to create a round-robin set of partitions. For example, if you only need to keep the data for the last few days, you can partition on the date modulo 7, or simply on the day of the week.
Suppose the table has an auto-incrementing id as its primary key, but you also want the hot, recent data to be clustered together. Because the timestamp column is not part of the primary key, you cannot partition by timestamp. Instead, you can use HASH(id DIV 1000000), which places each block of 1,000,000 consecutive id values in the same partition. This achieves the desired effect without changing the primary key. It has a further advantage: you do not need to keep creating new partitions to hold new data, as you would with range partitioning on dates.
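
The last two techniques can be sketched as follows. Both tables, their column names, and the partition count of 16 are assumptions made for illustration, not values taken from the original article.

```sql
-- Hypothetical table: id is the only primary key column, so created_at
-- could not be used in the partitioning expression.
CREATE TABLE events_by_id (
  id         BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  created_at DATETIME NOT NULL,
  payload    VARCHAR(255)
) ENGINE=InnoDB
PARTITION BY HASH (id DIV 1000000)
PARTITIONS 16;  -- the partition count is an arbitrary choice

-- Hypothetical round-robin table: one partition per day of the week.
-- DAYOFWEEK() returns 1 (Sunday) through 7 (Saturday).
CREATE TABLE events_last_week (
  created_at DATETIME NOT NULL,
  payload    VARCHAR(255)
) ENGINE=InnoDB
PARTITION BY HASH (DAYOFWEEK(created_at))
PARTITIONS 7;

-- Before a partition is reused for a new week, its old rows can be
-- emptied in one step (TRUNCATE PARTITION needs MySQL 5.5 or later).
-- HASH stores a row in partition MOD(expr, 7), so Sunday rows are in p1.
ALTER TABLE events_last_week TRUNCATE PARTITION p1;
```

In the first table, ids 0 to 999,999 map to partition p0, the next million to p1, and so on, wrapping around after 16 blocks, so recently inserted rows stay together in one partition at a time.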
