The pitfall of MySQL numeric type auto-increment

The pitfall of MySQL numeric type auto-increment

When designing table structures, numeric types are one of the most common types, but using numeric types well is not as simple as imagined, for example:

  • How to design an auto-incrementing primary key for massive concurrent Internet business? Is it enough to use INT?
  • How to design the account balance? Is using the DECIMAL type foolproof?

All of the above are wrong!

Although the numeric type may seem simple, the above-mentioned problem of "incomplete design thinking" is easy to occur in the design of table structure architecture (especially in the case of massive concurrent Internet scenarios).

Number Types

Integer Types

The MySQL database supports the integer types supported by the SQL standard: INT and SMALLINT. In addition, the MySQL database also supports integer types such as TINYINT, MEDIUMINT, and BIGINT (Table 1 shows the storage space and value range of various integer types):

MySQL Data Types Meaning (signed)
tinyint(m) 1 byte range (-128~127)
smallint(m) 2 bytes range (-32768~32767)
mediumint(m) 3 bytes range (-8388608~8388607)
int(m) 4 bytes range (-2147483648~2147483647)
bigint(m) 8 bytes range (+-9.22*10 to the 18th power)

In the integer type, there are signed and unsigned attributes, which represent the value range of the integer, and the default is signed. When designing, I do not recommend that you deliberately use unsigned attributes, because when doing some data analysis, the results returned by SQL may not be the results you want.

Let’s take a look at an example of a “sales table”, whose table structure and data are as follows. It is important to note that the column sale_count uses the unsigned attribute (that is, the column is designed to store values ​​greater than or equal to 0):

mysql> SHOW CREATE TABLE sale_G

*************************** 1. row ***************************

       Table: sale

Create Table: CREATE TABLE `sale` (

  `sale_date` date NOT NULL,

  `sale_count` int unsigned DEFAULT NULL,

  PRIMARY KEY (`sale_date`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

1 row in set (0.00 sec)


mysql> SELECT * FROM sale;

+------------+------------+

| sale_date | sale_count |

+------------+------------+

| 2020-01-01 | 10000 |

| 2020-02-01 | 8000 |

| 2020-03-01 | 12000 |

| 2020-04-01 | 9000 |

| 2020-05-01 | 10000 |

| 2020-06-01 | 18000 |

+------------+------------+

6 rows in set (0.00 sec)

Among them, sale_date represents the date of sale, and sale_count represents the sales quantity per month. Now there is a requirement that the boss wants to count the changes in monthly sales volume to make business decisions. This SQL statement requires a non-equijoin, but it's not too difficult to write:

SELECT 

    s1.sale_date, s2.sale_count - s1.sale_count AS diff

FROM

    sale s1

        LEFT JOIN

    sale s2 ON DATE_ADD(s2.sale_date, INTERVAL 1 MONTH) = s1.sale_date

ORDER BY sale_date;

However, during the execution, because the column sale_count uses the unsigned attribute, the following result is thrown:

ERROR 1690 (22003): BIGINT UNSIGNED value is out of range in '(`test`.`s2`.`sale_count` - `test`.`s1`.`sale_count`)'

As you can see, MySQL prompts the user that the calculated result is out of range. In fact, MySQL requires that unsigned values ​​remain unsigned after subtraction, otherwise an error will be reported.

To avoid this error, you need to set the database parameter sql_mode to NO_UNSIGNED_SUBTRACTION to allow the subtraction result to be signed, so that you can get the final desired result:

mysql> SET sql_mode='NO_UNSIGNED_SUBTRACTION';

Query OK, 0 rows affected (0.00 sec)

SELECT


    s1.sale_date,

    IFNULL(s2.sale_count - s1.sale_count,'') AS diff

FROM

    sale s1

    LEFT JOIN sale s2

    ON DATE_ADD(s2.sale_date, INTERVAL 1 MONTH) = s1.sale_date

ORDER BY sale_date;


+------------+-------+

| sale_date | diff |

+------------+-------+

| 2020-01-01 | |

| 2020-02-01 | 2000 |

| 2020-03-01 | -4000 |

| 2020-04-01 | 3000 |

| 2020-05-01 | -1000 |

| 2020-06-01 | -8000 |

+------------+-------+

6 rows in set (0.00 sec)

Floating point types and high precision types

In addition to integer types, commonly used digital types include floating-point and high-precision types.
Previous versions of MySQL had floating-point types Float and Double, but these types are not high-precision and are not SQL standard types, so they are not recommended for use in real production environments. Otherwise, during calculations, the final calculation results will be wrong due to precision type issues.
More importantly, starting from MySQL 8.0.17, when creating a table using the Float or Double type, the following warning will be thrown: MySQL reminds users that they should not use the above floating-point types, and even reminds that floating-point types will be deprecated in later versions

Specifying number of digits for floating point data types is deprecated and will be removed in a future release

The high-precision DECIMAL type among the numeric types can be used. When declaring a column of this type, you can (and usually must) specify the precision and scale, for example:

salary DECIMAL(8,2)

Where 8 is the precision (precision means the number of leading digits stored in the value) and 2 is the scale (scale means the number of digits stored after the decimal point). Usually in table structure design, the DECIMAL type can be used to represent user salaries, account balances, and other services accurate to 2 decimal places.

However, when used in massively concurrent Internet businesses, the DECIMAL type is not recommended for the amount field. Instead, the INT integer type is recommended (the reasons will be analyzed below).

Practical Design of Business Table Structure

Integer type and auto-increment design

In real business scenarios, integer types are most commonly used to represent the quantity of an item in business. For example, the sales quantity in the above table, or the inventory quantity and purchase frequency in e-commerce. In business, another common and important use of integer types is as the primary key of a table, that is, to uniquely identify a row of data.
Integer combined with the attribute auto_increment can realize the self-increment function, but when using auto-increment as the primary key in the table structure design, you should pay special attention to the following two points. If you do not pay attention, it may cause a catastrophic blow to the business:

  • Use BIGINT as the primary key instead of INT;
  • The auto-increment value is not persistent and may be backtracked (before MySQL 8.0).

From Table 1, we can see that the maximum range of INT is at the level of 4.2 billion. In the application of real Internet business scenarios, it is easy to reach the maximum value. For example, some flow tables and log tables have a daily data volume of 10 million. After 420 days, the upper limit of the INT type will be reached.
Therefore, when using an auto-increment integer as the primary key, always use BIGINT instead of INT. Do not use INT to save 4 bytes. When the upper limit is reached, changing the table structure will be a huge burden and pain.
This leads to an interesting question: if the upper limit of the INT type is reached, how will the database perform? Will it become 1 again? We can verify this with the following SQL statement:

mysql> CREATE TABLE t (

    -> a INT AUTO_INCREMENT PRIMARY KEY

    -> );


mysql> INSERT INTO t VALUES (2147483647);

Query OK, 1 row affected (0.01 sec)


mysql> INSERT INTO t VALUES (NULL);

ERROR 1062 (23000): Duplicate entry '2147483647' for key 't.PRIMARY'

It can be seen that when the INT upper limit is reached, a duplicate error will be reported when the auto-increment insertion is performed again, and the MySQL database will not automatically reset it to 1.
The second issue that needs special attention is that before MySQL 8.0, the auto-increment value is not persistent, and the auto-increment value may have backtracking problems!

mysql> SELECT * FROM t;

+---+

| a |

+---+

| 1 |

| 2 |

| 3 |

+---+

3 rows in set (0.01 sec)


mysql> DELETE FROM t WHERE a = 3;

Query OK, 1 row affected (0.02 sec)


mysql> SHOW CREATE TABLE t\G

*************************** 1. row ***************************

       Table: t

Create Table: CREATE TABLE `t` (

  `a` int NOT NULL AUTO_INCREMENT,

  PRIMARY KEY (`a`)

) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

1 row in set (0.00 sec

It can be seen that after deleting the record with the auto-increment value of 3, the next auto-increment value is still 4 (AUTO_INCREMENT=4). There is no error here, and the auto-increment will not be backtracked. However, if the database is restarted at this time, the starting value of the auto-increment of table t will become 3 again after the database is started, that is, the auto-increment value will be backtracked. The details are as follows:

mysql> SHOW CREATE TABLE t\G

*************************** 1. row ***************************

       Table: t

Create Table: CREATE TABLE `t` (

  `a` int NOT NULL AUTO_INCREMENT,

  PRIMARY KEY (`a`)

) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

1 row in set (0.00 s

To completely solve this problem, there are two methods:

  • Upgrade MySQL version to version 8.0, and the auto-increment value of each table will be persistent;
  • If you cannot upgrade the database version, it is strongly recommended not to use the auto-increment data type as the primary key in the core business table.

In fact, in the process of designing massive Internet architecture, in order to better scalability of distributed architecture in the future, it is not recommended to use integer type as primary key, and string type is more recommended.

Funds field design

In the business design of user balance, fund account balance, digital wallet, change, etc., since the fields are all capital fields, programmers usually use the DECIMAL type as the field selection, because this can be accurate to the minute, such as: DECIMAL(8,2).

CREATE TABLE User (

  userId BIGINT AUTO_INCREMENT,

  money DECIMAL(8,2) NOT NULL,

  ......

)

In the design standards for massive Internet services, the DECIMAL type is not recommended. Instead, it is recommended to convert DECIMAL to an integer type. In other words, it is recommended to store funds in cents rather than yuan. For example, 1 yuan is stored as integer type 100 in the database.

If the value range of the amount field is expressed using DECIMAL, how do you define the length? Because the DECIMAL type is a variable-length field, if you want to define an amount field, it is not enough to define it as DECIMAL(8,2). This can only mean that the maximum storage value is 999999.99, which means millions of funds can be stored.

The user's amount must be stored in a field of at least tens of billions, while the GDP amount field of the Statistics Bureau may reach tens of trillions. It is difficult to unify the definition using the DECIMAL type.
Another important thing is that the DECIMAL type is a binary encoding method, and its calculation efficiency is far less efficient than that of integer types. Therefore, it is recommended to use BIG INT to store amount-related fields.

Fields are stored in separate storage, but even so, BIG INT can store amounts in the gigabytes. Here, 1 trillion = 1 trillion.

The advantage of this is that all amount-related fields are fixed-length fields, occupying 8 bytes, and are highly efficient in storage. Another point is that direct integer calculation is more efficient.
Note that in database design, we place great emphasis on fixed-length storage because fixed-length storage has better performance.

Let's look at how records are stored in the database, roughly as follows:

If an update occurs, the original space of record 1 cannot accommodate the storage space of record 1 after the update. Therefore, the database will mark record 1 as deleted and find new space for record 1, such as:

In the above figure, *Record 1 indicates the space originally occupied by record 1, and this space will become fragmented space and cannot be used anymore unless the table space is manually defragmented.

So, when using BIG INT to store an amount field, how do you represent the data in the decimal point? In fact, this part can be completely handled and displayed by the front end. As for the database itself, it only needs to be stored by points.

This is the end of this article about the pitfalls of MySQL numeric type auto-increment. For more relevant content about MySQL numeric type auto-increment, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Detailed explanation of single-row function code of date type in MySQL
  • Mysql date formatting and complex date range query
  • MySQL method of generating random numbers, strings, dates, verification codes and UUIDs
  • Analysis of MySQL's problem with sorting numbers in strings
  • MySQL example to explain single-row functions and character math date process control

<<:  JavaScript to achieve window display effect

>>:  Web page experience: planning and design

Recommend

MYSQL database basics - Join operation principle

Join uses the Nested-Loop Join algorithm. There a...

Native JS to implement login box email prompt

This article shares a native JS implementation of...

How to add abort function to promise in JS

Table of contents Overview Promise Race Method Re...

Implementation of MySQL index-based stress testing

1. Simulate database data 1-1 Create database and...

Example of Vue implementing fixed bottom component

Table of contents 【Effect】 【Implementation method...

Steps to enable MySQL database monitoring binlog

Preface We often need to do something based on so...

Introduction to local components in Vue

In Vue, we can define (register) local components...

Detailed explanation of JavaScript stack and copy

Table of contents 1. Definition of stack 2. JS st...

Implementing a web calculator based on JavaScript

This article shares the specific code of JavaScri...

Common solutions for Mysql read-write separation expiration

The pitfalls of MySQL read-write separation The m...

Realize super cool water light effect based on canvas

This article example shares with you the specific...

Encapsulate a simplest ErrorBoundary component to handle react exceptions

Preface Starting from React 16, the concept of Er...

js to make a simple calculator

This article shares the specific code of making a...

Docker image creation Dockerfile and commit operations

Build the image There are two main ways to build ...