Do you know the common MySQL design errors?

Thanks to the Internet, we can use fragments of time, such as waiting for a bus or riding the subway, to study and look things up anytime and anywhere. It has also made it easy for people to share what they know and discuss it with others who have the same interests and needs.

However, sharing that is this convenient also means the quality of what gets shared varies widely, so it is easy to pick up wrong information. Most of these errors arise because technology moves quickly and authors have little spare time to update content they have already published. To spare later learners the misunderstanding, let's look at some common errors in MySQL design specifications.

Primary key design

Wrong design specification: Use an auto-increment ID as the primary key, and do not use UUID, MD5, HASH, or string values as the primary key.

This specification appears in many articles. The advantages of an auto-increment primary key are its small storage footprint, its ordered values, and its ease of use.

Let's first look at the disadvantages of the auto-increment primary key:

  • The auto-increment value is generated on the server side and must be protected by an AUTO-INC lock. Under a large number of concurrent insert requests, generating the value can become a performance bottleneck;
  • An auto-increment value is unique only within the current instance, not globally, which makes it unsuitable for a distributed architecture;
  • Exposed values can easily lead to security issues. If the product ID is an auto-increment primary key, users can fetch other products simply by changing the ID value, and in serious cases they can work out how many products are stored in the database;
  • MGR (MySQL Group Replication) may introduce performance problems with auto-increment keys.

Because the auto-increment value is generated on the MySQL server, it must be protected by an AUTO-INC lock, and a large number of concurrent inserts can turn that lock into a performance bottleneck. In MySQL, the parameter innodb_autoinc_lock_mode controls how long the AUTO-INC lock is held. Tuning this parameter gets the best performance out of auto-increment, but the other problems listed above remain. Therefore, in high-concurrency scenarios, it is recommended to use a UUID as the primary key or a custom primary key generated by the business.

We can use the UUID() function directly in MySQL to get the value of UUID.

mysql> SELECT UUID();
+--------------------------------------+
| UUID()                               |
+--------------------------------------+
| 23ebaa88-ce89-11eb-b431-0242ac110002 |
+--------------------------------------+
1 row in set (0.00 sec)

Note that a UUID stores its timestamp with the time fields reversed: the low time bits come first and the high time bits come last. As a result, the first 4 bytes of a UUID keep changing "randomly" over time and do not increase monotonically. Non-sequential values cause scattered (random) I/O when inserted, which creates a performance bottleneck. This is the biggest disadvantage of UUID compared with auto-increment.

To solve this problem, MySQL 8.0 introduced the function UUID_TO_BIN, which can convert the UUID string:

  • When its second argument is TRUE, it moves the high time bits to the front, which solves the ordering problem when inserting UUIDs;
  • It removes the "-" separators, which carry no information, to save storage space;
  • It converts the string into a binary value, shrinking storage from 36 bytes to 16 bytes.

Next, we convert the previous UUID string 23ebaa88-ce89-11eb-b431-0242ac110002 through the function UUID_TO_BIN, and the binary value is as follows:

mysql> SELECT UUID_TO_BIN('23ebaa88-ce89-11eb-b431-0242ac110002', TRUE) AS UUID_BIN;
+------------------------------------+
| UUID_BIN                           |
+------------------------------------+
| 0x11EBCE8923EBAA88B4310242AC110002 |
+------------------------------------+
1 row in set (0.01 sec)

In addition, MySQL 8.0 provides the function BIN_TO_UUID, which converts the binary value back into a UUID string.

Versions before MySQL 8.0 have no UUID_TO_BIN/BIN_TO_UUID functions, but the same conversion can be done with a user-defined function. At the application layer, you can write the corresponding functions in your own programming language.
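As one possible application-layer version, here is a minimal Python sketch of the byte reordering that UUID_TO_BIN performs when its second argument is TRUE. The function names uuid_to_bin and bin_to_uuid and the swap parameter are chosen here for illustration only; they are not part of any MySQL client library.

```python
import uuid


def uuid_to_bin(text: str, swap: bool = True) -> bytes:
    """Return the 16-byte form of a UUID string (hypothetical helper).

    The standard layout is time_low (4 bytes), time_mid (2), time_hi (2),
    then 8 more bytes. With swap=True the time fields are reordered to
    time_hi, time_mid, time_low so the high time bits come first.
    """
    raw = uuid.UUID(text).bytes
    if not swap:
        return raw
    return raw[6:8] + raw[4:6] + raw[0:4] + raw[8:]


def bin_to_uuid(data: bytes, swap: bool = True) -> str:
    """Reverse uuid_to_bin and return the 36-character UUID string."""
    if swap:
        # Stored order is time_hi, time_mid, time_low; restore the original.
        data = data[4:8] + data[2:4] + data[0:2] + data[8:]
    return str(uuid.UUID(bytes=data))


packed = uuid_to_bin("23ebaa88-ce89-11eb-b431-0242ac110002")
print(packed.hex().upper())  # 11EBCE8923EBAA88B4310242AC110002
print(bin_to_uuid(packed))   # 23ebaa88-ce89-11eb-b431-0242ac110002
```

The packed value is the same 16 bytes that the UUID_TO_BIN call above returns for this UUID, so a value produced this way could be stored in a BINARY(16) column.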

Of course, many readers also worry about the performance and storage space of UUIDs. Here I ran some insertion performance tests, and the results are shown in the following table:

As you can see, the ordered UUID provided by MySQL 8.0 has the best performance, even better than the auto-increment ID. In addition, since the result of the UUID_TO_BIN conversion is 16 bytes, only 8 bytes more than the auto-increment ID, the final storage space is only 3 GB more than with the auto-increment ID.

And because a UUID guarantees global uniqueness, the benefits of using UUID far outweigh those of an auto-increment ID. You may be used to auto-increment primary keys, but in high-concurrency scenarios it is recommended to use a globally unique value such as a UUID as the primary key.

Of course, good as UUID is, in a distributed scenario the primary key needs to carry some additional information to keep later secondary-index queries efficient, so it is recommended to generate the primary key with custom business logic. When concurrency and data volume are not that large, the ordered UUID is still recommended. Don't assume that a UUID cannot be used as a primary key.

Design of financial fields

Wrong design specification: Amounts in financial data must use the DECIMAL type.

The reasoning is that FLOAT and DOUBLE are approximate floating-point types, while DECIMAL is an exact fixed-point type. Therefore, financial fields such as user balances and product prices are generally designed as DECIMAL, which is accurate to the cent.

However, design standards for high-volume Internet services do not recommend the DECIMAL type. Instead, they recommend converting DECIMAL to an integer type. In other words, store amounts in cents rather than yuan. For example, 1 yuan is stored in the database as the integer 100.

The advantages of the bigint type are as follows:

  • DECIMAL is implemented as a binary-encoded format, and calculations on it are less efficient than on bigint.
  • A bigint field is fixed-length and efficient to store, whereas the size of a DECIMAL depends on its declared width. In data design, fixed-length storage performs better.
  • A signed bigint holds values up to about 9.2 × 10^18, so amounts stored in cents can reach roughly 9.2 × 10^16 yuan, which is more than enough.
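The cents-based storage described above can be sketched at the application layer as follows. This is a minimal Python illustration, not a prescribed implementation; the helper names yuan_to_cents and cents_to_yuan and the BIGINT_MAX constant are assumptions made for this example.

```python
from decimal import Decimal

# Largest value of a signed 64-bit integer, the range of a signed BIGINT.
BIGINT_MAX = 2**63 - 1


def yuan_to_cents(amount: str) -> int:
    """Convert a yuan amount given as text (e.g. "19.99") to integer cents."""
    cents = Decimal(amount) * 100
    if cents != cents.to_integral_value():
        raise ValueError(f"{amount!r} has more than two decimal places")
    value = int(cents)
    if abs(value) > BIGINT_MAX:
        raise ValueError(f"{amount!r} does not fit in a signed BIGINT")
    return value


def cents_to_yuan(cents: int) -> str:
    """Format integer cents as a yuan string with two decimal places."""
    sign = "-" if cents < 0 else ""
    whole, frac = divmod(abs(cents), 100)
    return f"{sign}{whole}.{frac:02d}"


# Binary floating point cannot represent 0.1 and 0.2 exactly...
print(0.1 + 0.2 == 0.3)  # False
# ...while integer cents add up exactly.
print(yuan_to_cents("0.1") + yuan_to_cents("0.2") == yuan_to_cents("0.3"))  # True
print(cents_to_yuan(yuan_to_cents("19.99") * 3))  # 59.97
```

Parsing the input through Decimal rather than float keeps the conversion itself exact, so the only values that ever reach the database are integers.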

Use of enumeration fields

Wrong design specification: Avoid using ENUM types.

In earlier projects, fields such as user gender, whether a product is on the shelf, or whether a comment is hidden were simply defined as tinyint, with the meaning of the values 0 and 1 recorded in the column comment.

The problems with this design are also obvious:

  • Unclear meaning: The table may have been designed by a colleague. If you don't remember it well, you have to read the column comments every time, and sometimes you even have to check the database while coding to confirm what a field means.
  • Dirty data: Although the application layer can restrict inserted values in code, the values can still be changed through SQL and visualization tools.

For fields with a fixed set of option values like these, it is recommended to use the ENUM string type together with the strict mode of SQL_MODE.

In MySQL 8.0.16 and later, you can use CHECK constraints directly instead of the ENUM type.

Moreover, enumeration values are usually defined as single characters such as "Y" and "N", which take up very little space. However, if the option values are not fixed and may grow as the business develops, enumeration fields are not recommended.
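To illustrate why a database-level constraint prevents the dirty data described above, here is a small Python sketch using the built-in sqlite3 module. It runs against SQLite, not MySQL, so it only demonstrates the idea of a CHECK constraint that restricts a column to "Y" and "N"; the table product and the column on_shelf are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: on_shelf may only hold 'Y' or 'N'.
conn.execute(
    """
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        on_shelf TEXT NOT NULL CHECK (on_shelf IN ('Y', 'N'))
    )
    """
)

# A valid value is accepted.
conn.execute("INSERT INTO product (on_shelf) VALUES ('Y')")

# An invalid value is rejected by the database itself, no matter which
# client or tool sends the statement.
try:
    conn.execute("INSERT INTO product (on_shelf) VALUES ('X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(rejected, row_count)  # True 1
```

Because the rule lives in the table definition, it applies equally to application code, ad hoc SQL, and visualization tools.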

Index number limit

Wrong design specification: Limit the number of indexes on each table; a table cannot have more than 5 indexes.

MySQL has no such limit of 5 indexes per table (InnoDB itself allows up to 64 secondary indexes on a table). If business queries genuinely need an index, create it. Don't follow the number blindly.

Using subqueries

Wrong design specification: Avoid subqueries.

In fact, this specification was correct for older versions of MySQL. Earlier versions optimized subqueries poorly, so in many OLTP scenarios online services were required to avoid subqueries as much as possible.

However, MySQL 8.0 optimizes subqueries far better, so you can use them safely in the new version.

Subqueries are easier for people to understand than JOINs. For example, suppose we want to count the users who did not publish an article in 2020.

SELECT COUNT(*)
FROM user
WHERE id NOT IN (
    SELECT user_id
    FROM blog
    -- half-open range, so all of 2020-12-31 is covered if publish_time is a DATETIME
    WHERE publish_time >= '2020-01-01' AND publish_time < '2021-01-01'
);

As you can see, the logic of the subquery is very clear: NOT IN excludes the users who appear in the article table for 2020.

If we write it with LEFT JOIN instead:

SELECT COUNT(*)
FROM user LEFT JOIN blog
ON user.id = blog.user_id AND blog.publish_time >= '2020-01-01' AND blog.publish_time < '2021-01-01'
WHERE blog.user_id IS NULL;

Although LEFT JOIN also meets the requirement, it is harder to understand.
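As a quick sanity check that the two forms return the same count, here is a small Python sketch using the built-in sqlite3 module. It runs on SQLite rather than MySQL, so it says nothing about MySQL's execution plan; the sample rows are made up for illustration, the date range is written as a half-open interval, and user_id is declared NOT NULL because NOT IN behaves differently when the subquery can return NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user (id INTEGER PRIMARY KEY);
    CREATE TABLE blog (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        publish_time TEXT NOT NULL
    );
    -- Made-up sample data: only user 1 published in 2020.
    INSERT INTO user (id) VALUES (1), (2), (3);
    INSERT INTO blog (user_id, publish_time) VALUES
        (1, '2020-03-15 10:00:00'),
        (1, '2020-07-01 09:30:00'),
        (2, '2019-12-31 23:59:59');
    """
)

NOT_IN_SQL = """
SELECT COUNT(*)
FROM user
WHERE id NOT IN (
    SELECT user_id
    FROM blog
    WHERE publish_time >= '2020-01-01' AND publish_time < '2021-01-01'
)
"""

LEFT_JOIN_SQL = """
SELECT COUNT(*)
FROM user LEFT JOIN blog
  ON user.id = blog.user_id
 AND blog.publish_time >= '2020-01-01' AND blog.publish_time < '2021-01-01'
WHERE blog.user_id IS NULL
"""

not_in_count = conn.execute(NOT_IN_SQL).fetchone()[0]
left_join_count = conn.execute(LEFT_JOIN_SQL).fetchone()[0]
print(not_in_count, left_join_count)  # 2 2
```

Users 2 and 3 have no article in 2020, so both queries count two users.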

Viewing the execution plans of the two SQL statements with EXPLAIN shows that they are the same.

The above figure shows clearly that both the subquery and the LEFT JOIN are eventually converted into a left hash join, so the two SQL statements take the same time to execute. That is, in MySQL 8.0 the optimizer automatically rewrites the IN subquery into the best JOIN execution plan, which significantly improves performance.

Summary

After reading the content above, I believe everyone has a new understanding of MySQL. These common errors can be summarized as follows:

  • A UUID can also be used as a primary key. In the tests above, an ordered UUID performed better than an auto-increment primary key, and the extra space it occupies is negligible.
  • Besides decimal, you can also use bigint for financial fields and store amounts in cents.
  • For fields with fixed option values, it is recommended to use enumeration fields before MySQL 8.0.16 and CHECK constraints from MySQL 8.0.16 onward. Do not use 0, 1, or 2 to represent states.
  • There is no rule that a table cannot have more than 5 indexes. Add or remove indexes according to business needs.
  • MySQL 8.0 optimizes subqueries well, so they can be used with confidence.
