Table partitioning is different from database partitioning. So what should we pay attention to when using table partitioning? Today we will look at the details of MySQL database table partitioning. 1. Partition column index constraints If the table has a primary key or unique key, the partition column of the partition table must be included in the primary key or unique key list. This is to ensure the efficiency of the primary key. Otherwise, it will be troublesome if the items in the same primary key area are in partition A and partition B respectively. 2. Conditions for each partition type range Each partition contains those rows whose values of the partitioning expression lie within a given contiguous interval. These intervals must be continuous and cannot overlap. List only supports integer fields or expressions that return integers. The value list in each partition list must be an integer. The hash type only supports integer fields or expressions that return integers. The key type only supports column names (one or more column names), and does not support expressions 3. Partition available functions ABS() CEILING() (see CEILING() and FLOOR(), immediately following this list) DAY() DAYOFMONTH() DAYOFWEEK() DAYOFYEAR() DATEDIFF() EXTRACT() FLOOR() (see CEILING() and FLOOR(), immediately following this list) HOUR() MICROSECOND() MINUTE() MOD() MONTH() QUARTER() SECOND() TIME_TO_SEC() TO_DAYS() WEEKDAY() YEAR() YEARWEEK() Notice: Because the partition function does not include the FROM_UNIXTIME function, it is not possible to partition by converting timestamps to time. You can only partition by date or datetime. For example, by year we can use: PARTITION BY RANGE (YEAR(date)) By month: PARTITION BY RANGE(date div 100) #div will convert the date into an integer, for example: 2014-12-01 -> 20141201, 100 will remove two digits from the end, and the final result is 201412 An example of partitioning by order: CREATE TABLE `order` ( `order_id` bigint(19) NOT NULL DEFAULT '0' COMMENT 'Order ID: Year, month, day, hour, minute, second, 12-digit, 7-digit random number', `date` date NOT NULL DEFAULT '0000-00-00' COMMENT 'Order date', `amount` int(11) DEFAULT NULL COMMENT 'Payment amount, in cents', `status` tinyint(1) DEFAULT '0' COMMENT '0: Waiting for payment 1: Payment successful 2: Payment failed 3: Verification failed', `addtime` int(10) DEFAULT NULL COMMENT 'Order adding time', PRIMARY KEY (`order_id`,`date`) )ENGINE=MyISAM DEFAULT CHARSET=utf8; Because we cannot use timestamps to partition by time, we add a date field. This field and order_id are used as the primary key. We know that the partition column must be included in the primary key. Next we use date to calculate the year-month combination to partition ALTER TABLE order PARTITION BY RANGE( date DIV 100) ( PARTITION p_2014_06 VALUES LESS THAN (201407), PARTITION p_2014_07 VALUES LESS THAN (201408), PARTITION p_2014_08 VALUES LESS THAN (201409), PARTITION p_2014_09 VALUES LESS THAN (201410), PARTITION p_2014_10 VALUES LESS THAN (201411), PARTITION p_catch_all VALUES LESS THAN MAXVALUE ); In the above, LESS THAN MAXVALUE sets the last partition p_catch_all, so you cannot add partitions using the add method. The following statements are not available: You can only split the last p_catch_all partition into two. This has the advantage that the data in the p_catch_all partition will not be lost. The merging and splitting of data are performed using REORGANIZE PARTITION. alter table order reorganize partition p_catch_all into ( partition p_2014_11 values less than (201412), partition p_catch_all values less than maxvalue ); Merge partitions: alter table order reorganize partition p_2014_10,p_2014_11,p_catch_all into ( partition p_catch_test values less than MAXVALUE ); Why not assign it to p_catch_all? Because the partition will be reported to exist. Why do we need to include the last partition p_catch_all when merging? Because except for the last partition, the reorganized partition ranges cannot change the total range. alter table table name remove partitioning Note: The above statement can be executed in 5.5, but there seems to be a problem in 5.6. You need to test it first After partitioning, if the where condition is a range, the partitioning will not work, such as where date >= '2014-01-01' And date <= '2014-01-31' Supplement: Four partition types of MySQL tables 1. What is table partitioning Generally speaking, table partitioning is to divide a large table into several small tables according to conditions. MySQL 5.1 began to support data table partitioning. For example, if a user table has more than 6 million records, the table can be partitioned according to the storage date or the location. Of course, partitioning can also be based on other conditions. 2. Why partition the table? To improve scalability, manageability, and increase database efficiency for large tables and tables with varying access patterns. Some advantages of partitioning include: Can store more data than a single disk or file system partition. For data that has lost its meaning of being saved, it is usually easy to delete the data by deleting the partitions related to the data. Conversely, in some cases, the process of adding new data can be conveniently implemented by adding a new partition specifically for that new data. Other advantages commonly associated with partitioning include those listed below. These features in MySQL Partitioning are not yet implemented, but are high on our priority list; we hope to include them in the production version of 5.1. Some queries can be greatly optimized, mainly by saving the data that satisfies a given WHERE clause in one or more partitions, so that the search does not need to search other remaining partitions. Because partitions can be modified after a partitioned table is created, you can reorganize your data to improve the efficiency of commonly used queries if you did not do so when you first configured the partitioning scheme. Queries involving aggregate functions such as SUM() and COUNT() can be easily parallelized. A simple example of such a query is "SELECT salesperson_id, COUNT (orders) as order_total FROM sales GROUP BY salesperson_id;". By "parallel", it means that the query can be performed on each partition simultaneously, and the final result is simply obtained by summing the results of all partitions. By spreading data queries across multiple disks, you can achieve greater query throughput. 3. Partition Type RANGE partitioning: Assign multiple rows to partitions based on column values that fall within a given continuous interval. LIST partitioning: Similar to RANGE partitioning, the difference is that LIST partitioning is based on the column value matching a value in a discrete value set. HASH partitioning: A partition is selected based on the return value of a user-defined expression that is calculated using the column values of the rows to be inserted into the table. This function can contain any expression valid in MySQL that produces a non-negative integer value. KEY partitioning: Similar to HASH partitioning, the difference is that KEY partitioning only supports the calculation of one or more columns, and the MySQL server provides its own hash function. One or more columns must contain integer values. RANGE partitioning Assigns multiple rows to partitions based on column values that fall within a given contiguous interval. These intervals must be continuous and non-overlapping, and are defined using the VALUES LESS THAN operator. The following are examples. Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT NOT NULL, store_id INT NOT NULL ) partition BY RANGE (store_id) ( partition p0 VALUES LESS THAN (6), partition p1 VALUES LESS THAN (11), partition p2 VALUES LESS THAN (16), partition p3 VALUES LESS THAN (21) ); According to this partitioning scheme, all rows corresponding to employees working in stores 1 to 5 are stored in partition P0, employees working in stores 6 to 10 are stored in P1, and so on. Note that each partition is defined in order, from lowest to highest. This is a requirement of the PARTITION BY RANGE syntax; in this respect, it is similar to a "switch ... case" statement in C or Java. For a new row containing the data (72, 'Michael', 'Widenius', '1998-06-25′, NULL, 13), it is easy to determine that it will be inserted into the p2 partition, but what will happen if a store numbered 21 is added? In this scenario, since there is no rule to include stores with store_id greater than 20, the server will not know where to save the row, which will result in an error. You can avoid this error by using a "catchall" VALUES LESS THAN clause in the CREATE TABLE statement, which catches all values greater than the highest value you explicitly specify: Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT NOT NULL, store_id INT NOT NULL ) PARTITION BY RANGE (store_id) ( PARTITION p0 VALUES LESS THAN (6), PARTITION p1 VALUES LESS THAN (11), PARTITION p2 VALUES LESS THAN (16), PARTITION p3 VALUES LESS THAN MAXVALUE ); MAXVALUE represents the largest possible integer value. Now, all rows with store_id column values greater than or equal to 16 (the highest value defined) will be stored in partition p3. At some point in the future, when the number of stores has grown to 25, 30, or more, you can use the ALTER TABLE statement to add new partitions for stores 21-25, 26-30, and so on. In almost the same structure, you can also partition the table based on the employee's job code, that is, based on consecutive intervals of job_code column values. For example - assuming that 2-digit job codes are used to represent general (store) workers, 3-digit codes represent office and support staff, and 4-digit codes represent management, you can create the partitioned table using the following statement: Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT NOT NULL, store_id INT NOT NULL ) PARTITION BY RANGE (job_code) ( PARTITION p0 VALUES LESS THAN (100), PARTITION p1 VALUES LESS THAN (1000), PARTITION p2 VALUES LESS THAN (10000) ); In this example, all rows related to store workers will be stored in partition p0, all rows related to office and support staff will be stored in partition p1, and all rows related to management will be stored in partition p2. It is also possible to use an expression in the VALUES LESS THAN clause. The most notable restriction here is that MySQL must be able to evaluate the expression returned as part of the LESS THAN (<) comparison; therefore, the expression cannot evaluate to NULL. For this reason, the hired, separated, job_code, and store_id columns of the employee table have been defined as NOT NULL. In addition to being able to split the table data based on store number, you can also use an expression based on one of two DATE (date) to split the table data. For example, suppose you want to partition the table based on the year each employee left the company, that is, the value of YEAR(separated). An example of a CREATE TABLE statement that implements this partitioning scheme is as follows: Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT ) PARTITION BY RANGE (YEAR(separated)) ( PARTITION p0 VALUES LESS THAN (1991), PARTITION p1 VALUES LESS THAN (1996), PARTITION p2 VALUES LESS THAN (2001), PARTITION p3 VALUES LESS THAN MAXVALUE ); In this scheme, records of all employees hired before 1991 are stored in partition p0, records of all employees hired between 1991 and 1995 are stored in partition p1, records of all employees hired between 1996 and 2000 are stored in partition p2, and information on all workers hired after 2000 is stored in partition p3. Note: This optimization is not yet enabled in the MySQL 5.1 source code; however, work is in progress. LIST partition Similar to partitioning by RANGE, the difference is that LIST partitioning is based on the column value matching a value in a discrete set of values. LIST partitioning is implemented by using "PARTITION BY LIST(expr)", where "expr" is a column value or an expression based on a column value that returns an integer value, and then defining each partition by "VALUES IN (value_list)", where "value_list" is a comma-separated list of integers. Note: As of MySQL 5.1, when using LIST partitioning, it is possible to match only lists of integers. Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT ); Assume that there are 20 audio-visual stores distributed in 4 regions with distribution rights, as shown in the following table: ==================== Regional Store ID Number North District 3, 5, 6, 9, 17 East District 1, 2, 10, 11, 19, 20 West District 4, 12, 13, 14, 18 Central District 7, 8, 15, 16 ==================== To partition the table in such a way that rows belonging to stores in the same region are stored in the same partition, you can use the following "CREATE TABLE" statement: Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT ) PARTITION BY LIST(store_id) PARTITION pNorth VALUES IN (3,5,6,9,17), PARTITION pEast VALUES IN (1,2,10,11,19,20), PARTITION pWest VALUES IN (4,12,13,14,18), PARTITION pCentral VALUES IN (7,8,15,16) ); This makes it easy to add or delete employee records for a specific region in the table. For example, suppose all the video stores on the West Side were sold to other companies. Then all the records (rows) related to the employees working in the West District video store can be deleted using the query "ALTER TABLE employees DROP PARTITION pWest;", which is much more efficient than the DELETE query "DELETE query DELETE FROM employees WHERE store_id IN (4,12,13,14,18);" which has the same effect. [Key Points]: If you try to insert a row whose column value (or the return value of the partitioning expression) is not in the partition value list, the "INSERT" query will fail and report an error. For example, assuming the above scheme is used for LIST partitioning, the following query will fail: Sql code: INSERT INTO employees VALUES(224, 'Linus', 'Torvalds', '2002-05-01', '2004-10-12', 42, 21); This is because the "store_id" column value 21 cannot be found in the list of values used to define partitions pNorth, pEast, pWest, or pCentral. It is important to note that LIST partitions do not have definitions such as "VALUES LESS THAN MAXVALUE" that include other values. Any value that will match must be found in the list of values. HASH partition The partitions are selected based on the return value of a user-defined expression that is evaluated using column values for the rows to be inserted into the table. This function can contain any expression valid in MySQL that produces a non-negative integer value. To partition a table using HASH partitioning, add a "PARTITION BY HASH (expr)" clause to the CREATE TABLE statement, where "expr" is an expression that returns an integer. It can just be the name of a column whose field type is a MySQL integer. Additionally, you will most likely need to add a "PARTITIONS num" clause after it, where num is a non-negative integer representing the number of partitions the table will be split into. Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT ) PARTITION BY HASH(store_id) PARTITIONS 4; If you do not include a PARTITIONS clause, the number of partitions will default to 1. Exception: For NDB Cluster tables, the default number of partitions will be the same as the number of cluster data nodes, with this correction likely to take into account any MAX_ROWS setting in order to ensure that all rows can fit into the partitions. LINER HASH MySQL also supports linear hashing, which differs from regular hashing in that linear hashing uses a linear powers-of-two algorithm, whereas regular hashing uses the modulus of the hash function value. The only difference in syntax between linear hash partitioning and regular hash partitioning is the addition of the "LINEAR" keyword in the "PARTITION BY" clause. Sql code: CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT '1970-01-01', separated DATE NOT NULL DEFAULT '9999-12-31', job_code INT, store_id INT ) PARTITION BY LINEAR HASH(YEAR(hired)) PARTITIONS 4; Suppose an expression expr, when using the linear hash function, the partition to which the record will be saved is partition N of num partitions, where N is obtained according to the following algorithm: 1. Find the next power of 2 greater than num. We call this value V, which can be obtained by the following formula: 2. V = POWER(2, CEILING(LOG(2, num))) (For example, suppose num is 13. Then LOG(2,13) is 3.7004397181411. CEILING(3.7004397181411) is 4, then V = POWER(2,4), which is equal to 16). 3. Set N = F(column_list) & (V – 1). 4. When N >= num: Set V = CEIL(V / 2) Set N = N & (V – 1) For example, suppose table t1, using linear hash partitioning and with 4 partitions, is created by the following statement: CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE) PARTITION BY LINEAR HASH( YEAR(col3) ) PARTITIONS 6; Now suppose that you want to insert two rows into table t1, one record with the col3 column value of '2003-04-14' and the other record with the col3 column value of '1998-10-19'. The partition to which the first record will be saved is determined as follows: V = POWER(2, CEILING(LOG(2,7))) = 8 N = YEAR('2003-04-14′) & (8 – 1) = 2003 & 7 = 3 (3 >= 6 is false: the record will be saved in partition #3) The partition number to which the second record will be saved is calculated as follows: V = 8 N = YEAR('1998-10-19′) & (8-1) = 1998 & 7 = 6 (6 >= 4 is true: additional steps are required) N = 6 & CEILING(5 / 2) = 6 & 3 = 2 (2 >= 4 is false: the record will be saved in partition #2) The advantage of linear hash partitioning is that adding, deleting, merging, and splitting partitions becomes faster, which is beneficial for processing tables with extremely large amounts of data (1000s of gigabytes). Its disadvantage is that the distribution of data between partitions is unlikely to be balanced compared to the data distribution obtained using conventional HASH partitioning. KEY partition Similar to HASH partitioning, the difference is that KEY partitioning only supports calculation of one or more columns, and the MySQL server provides its own hash function. One or more columns must contain integer values. Sql code: CREATE TABLE tk ( col1 INT NOT NULL, col2 CHAR(5), col3 DATE ) PARTITION BY LINEAR KEY (col1) PARTITIONS 3; Using the keyword LINEAR in KEY partitioning has the same effect as using it in HASH partitioning. The partition number is obtained by the powers-of-two algorithm instead of the modulus algorithm. Summarize The above is all the content of this article about the precautions for MySQL database table partitioning. Interested friends can refer to: How to delete MySQL table data, several important MySQL variables, MYSQL subquery and nested query optimization example analysis, etc. I hope it will be helpful to everyone. If you have any questions, please leave a message. Everyone is welcome to communicate and discuss. You may also be interested in:
|
<<: A brief discussion on Nginx10m+ high concurrency kernel optimization
>>: Use JavaScript to create page effects
During project development, our database data is ...
The DIV floating effect (fixed position) is imple...
In actual web page development, accordions also a...
Problem Description As we all know, the SQL to so...
Six steps to install MySQL (only the installation...
Anchor tag usage: Linking to a specific location i...
This article mainly focuses on the installation a...
Preface I believe many students are already famil...
MySQL query not using index aggregation As we all...
Table of contents A JSON is built on two structur...
introduction Let's start with our content. I ...
I recently used Dreamweaver to make a product pres...
Implementing responsive layout with CSS Responsiv...
When using the docker-maven-plugin plug-in, Maven...
This article mainly introduces the effect of div ...