MySQL learning record: bloody incident caused by KEY partition

Background and Requirements

A sample of the data in the business table tb_image is shown below. The id column is unique, but image_no is not. image_no is the number assigned to each source file. Each source file generates several files in the business system, and each generated file is uniquely identified by id:

Key facts about the business table tb_image:

  • Queries are by image_no or by id;
  • Existing data: about 20 million rows;
  • Daily growth: about 40,000 rows;
  • Daily query volume: about 200,000;
  • It is not a consumer-facing (to-C) system, so peak concurrency has a predictable ceiling.

Solution Selection

Based on this analysis, splitting the data across multiple databases and tables is completely unnecessary. If we split the table within a single database, queries must still work by both image_no and id. One option is redundant sharding: keep one copy of the data sharded by image_no and another copy sharded by id. The other option is to shard only by image_no and, for queries by id, have the business layer query every shard and merge the results, or introduce third-party middleware.
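The second option (shard by image_no only, fan out for id) can be sketched as follows. This is a minimal in-memory illustration, not the author's implementation: the shard count, the use of crc32 as the routing hash, and all function names are assumptions made for the example.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example

def shard_for(image_no: str) -> int:
    """Pick a shard from image_no using a stable hash (crc32 is an assumption)."""
    return zlib.crc32(image_no.encode("utf-8")) % NUM_SHARDS

# Each shard is modelled as an in-memory list of rows.
shards = [[] for _ in range(NUM_SHARDS)]

def insert(row: dict) -> None:
    shards[shard_for(row["image_no"])].append(row)

def find_by_image_no(image_no: str) -> list:
    # The sharding key is known, so only one shard is touched.
    return [r for r in shards[shard_for(image_no)] if r["image_no"] == image_no]

def find_by_id(row_id: int) -> list:
    # id is not the sharding key, so every shard is scanned and the results merged.
    return [r for shard in shards for r in shard if r["id"] == row_id]

insert({"id": 1, "image_no": "A001"})
insert({"id": 2, "image_no": "A001"})
insert({"id": 3, "image_no": "B002"})
print(find_by_image_no("A001"))
print(find_by_id(3))
```

The fan-out in find_by_id is the cost of this option: every query by id touches all shards, which is the extra complexity the partition approach avoids.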

Splitting a single database into multiple tables is still fairly complex, so we decided to use MySQL's partitioning feature instead. According to our capacity estimate, a partitioned table with 128 partitions (each holding on the order of 10 million rows) can keep the business running stably for at least 15 years (the orange part in the figure is closest to the actual growth of our business):

In addition, plain RANGE, LIST, and HASH partitioning require an integer-valued partitioning expression and so cannot partition directly on a VARCHAR column, which is why we decided to use KEY partitioning. According to the official documentation, it hashes the partitioning key with a MySQL built-in algorithm and then takes the result modulo the number of partitions.

Performance Testing

With image_no chosen as the partitioning key and 128 as the number of partitions, it was time to load data for feasibility and performance testing. Why 128? Because 1.1 billion / 10 million = 110 ≈ 128, and programmers like powers of 2, you know. However, the partition count of 128 is where the nightmare began.
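The sizing arithmetic can be written out in a few lines. The function name and the round-up-to-a-power-of-two step are illustrative assumptions that mirror the reasoning in the text; nothing here is a MySQL requirement.

```python
def partitions_needed(projected_rows: int, rows_per_partition: int):
    """Return (minimum partition count, next power of two at or above it)."""
    # Integer ceiling division avoids floating-point rounding.
    minimum = -(-projected_rows // rows_per_partition)
    # Round up to the next power of two, as the text does (110 -> 128).
    power_of_two = 1 << (minimum - 1).bit_length()
    return minimum, power_of_two

# Figures from the text: 1.1 billion projected rows, 10 million rows per partition.
minimum, chosen = partitions_needed(1_100_000_000, 10_000_000)
print(minimum, chosen)
```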

I inserted 100,000 rows into the 128-partition table. A surprising thing happened: none of the odd-numbered partitions (p1, p3, p5, …, p2n-1) contained any data, while every even-numbered partition held plenty of data, and not very evenly either. As shown in the following figure:

Note: the .ibd file of each odd-numbered partition is 112 KB. This is the initial size when the partitioned table is created, so it holds no data. We can verify this with SQL: select partition_name, partition_expression, table_rows from information_schema.partitions where table_schema = schema() and table_name = 'image_subpart'; Some of the results are shown in the following figure:

Are 100,000 rows not enough to show the problem? On average that is nearly 800 rows per partition! OK, let's do something drastic: I inserted another 9.9 million rows, for a total of 10 million. The result was the same: the odd-numbered partitions had no data, and the even-numbered partitions had all of it.
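To check a result set like the one returned by the information_schema.partitions query, a small helper makes empty partitions easy to spot. This is a sketch: it assumes the rows have already been fetched as (partition_name, table_rows) pairs, and the sample values below are made up for illustration, not the author's measurements. Keep in mind that table_rows is only an estimate for InnoDB tables.

```python
def summarize_partitions(rows):
    """Summarize (partition_name, table_rows) pairs from the partitions query."""
    rows = list(rows)
    empty = [name for name, count in rows if count == 0]
    non_empty = [count for _, count in rows if count > 0]
    return {
        "partitions": len(rows),
        "empty": empty,
        "min_rows": min(non_empty) if non_empty else 0,
        "max_rows": max(non_empty) if non_empty else 0,
    }

# Made-up sample that mimics the pattern described: odd partitions are empty.
sample = [("p0", 1510), ("p1", 0), ("p2", 1642), ("p3", 0)]
print(summarize_partitions(sample))
```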

Questions to think about

Let's recall the principle of KEY partitioning: MySQL computes a hash of the partitioning key with its built-in hash algorithm and then takes the result modulo the number of partitions. This is stated in the official MySQL documentation, section 22.2.5 KEY Partitioning: https://dev.mysql.com/doc/refman/5.7/en/partitioning-key.html. The original text is as follows:

Partitioning by key is similar to partitioning by hash, except that where hash partitioning employs a user-defined expression, the hashing function for key partitioning is supplied by the MySQL server. NDB Cluster uses MD5() for this purpose; for tables using other storage engines, the server employs its own internal hashing function which is based on the same algorithm as PASSWORD().

**There can't be such a crappy hash algorithm in this world, right?** No matter what algorithm you write, it wouldn't come out this uneven, right? At this point I wondered whether some configuration was causing it, but show variables lists no partition-related variables.

At this point, ten thousand horses galloped through my mind. Could the documentation and the source code be out of sync? Well, let's look at the MySQL source code, since the source is closest to the truth. The code for KEY partitioning is in the file sql_partition.cc. I have excerpted the key part below. At first glance nothing is wrong: it calculates the hash value of the partitioning fields and then takes it modulo the number of partitions:

/**
 Calculate part_id for (SUB)PARTITION BY KEY
 @param file Handler to storage engine
 @param field_array Array of fields for PARTTION KEY
 @param num_parts Number of KEY partitions
 @param func_value[out] Returns calculated hash value
 @return Calculated partition id
*/
inline
static uint32 get_part_id_key(handler *file,
               Field **field_array,
               uint num_parts,
               long long *func_value)
{
 DBUG_ENTER("get_part_id_key");
 // Calculate the hash value of the partition field
 *func_value = file->calculate_key_hash_value(field_array);
 // Modulo the number of partitions
 DBUG_RETURN((uint32) (*func_value % num_parts));
}

In despair, I searched for "uneven KEY partition data" in a search engine. Among the results was a thread on the CSDN forum (https://bbs.csdn.net/topics/390857704), in which a grassroots expert named Hua Xia Xiao Zu answered as follows:

A colleague analyzed and measured the password function and found that for key partitioning, the number of partitions must be a prime number to ensure that every partition has data. I tested it with 11 through 17 partitions. Only with 11, 13, and 17 partitions was the data basically evenly distributed.

At this point, another ten thousand horses galloped past. But while I was wondering WHAT THE F**K, I was also a little excited, because I might have found a solution (although I still didn't know why MySQL's built-in hash algorithm behaves like this). I then tested KEY partitioning again and concluded the following:

  1. If the number of partitions is even, such as 40, 64, or 128 (PARTITIONS 64), the odd-numbered partitions (p1, p3, p5, p7, …, p2n-1) receive no data at all.
  2. If the number of partitions is odd but not prime, such as 63 or 121 (PARTITIONS 63), every partition receives data, but the distribution is uneven.
  3. If the number of partitions is prime, such as 137 or 31 (PARTITIONS 137), every partition receives data and the distribution is very even.
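One general way a hash-then-modulo scheme can leave buckets empty is when every hash value shares a common factor with the bucket count. The sketch below uses a deliberately flawed toy hash that only returns even numbers. It is not MySQL's internal algorithm and is not offered as the explanation for the results above (it does not reproduce the second observation, for instance). It only illustrates why the choice of modulus can matter. All names are hypothetical.

```python
import zlib
from collections import Counter

def toy_even_hash(key: str) -> int:
    """Deliberately flawed toy hash: always returns an even number."""
    return zlib.crc32(key.encode("utf-8")) * 2

def bucket_counts(keys, num_buckets):
    """Count how many keys land in each bucket under hash % num_buckets."""
    counts = Counter(toy_even_hash(k) % num_buckets for k in keys)
    return [counts.get(i, 0) for i in range(num_buckets)]

# Synthetic keys standing in for image_no values.
keys = [f"IMG{n:08d}" for n in range(100_000)]

for n in (128, 127):
    counts = bucket_counts(keys, n)
    empty = sum(1 for c in counts if c == 0)
    print(f"{n} buckets: {empty} empty, min={min(counts)}, max={max(counts)}")
```

With this toy hash, an even value modulo an even bucket count is always even, so every odd bucket stays empty. An odd bucket count has no such blind spot, because 2 and an odd modulus share no common factor.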

The figure below shows the situation after I changed the number of partitions to 127 and inserted 1 million rows. The SQL query confirms that every partition holds almost the same amount of data:

Summary

I was shocked to find no official explanation for such a huge pitfall in MySQL's KEY partitioning. There is, however, a MySQL bug report: Bug #72428, "Partition by KEY() results in uneven data distribution".

Readers who are interested can try to dig deeper into this issue. I will also find time to go further into the MySQL source code and find out why its hash algorithm is so sensitive to the number of partitions.
