In-depth analysis of why MySQL does not recommend using uuid or snowflake id as primary key

Preface: When designing tables, MySQL's official guidance recommends against using UUIDs or non-sequential, non-repeating snowflake IDs (long, unique values) as primary keys, and recommends a continuous auto-increment primary key (auto_increment) instead. So why is UUID not recommended? What are its drawbacks? In this post we analyze the problem and explore the underlying reasons.

1: MySQL and program examples

1.1: To illustrate the problem, we first create three tables: user_key_auto, user_uuid, and user_random_key, which use an auto-increment primary key, a UUID primary key, and a random key as the primary key, respectively. Following the controlled-variable method, only the primary-key generation strategy differs between the tables; every other field is exactly the same. We then test each table's insert and query speed:

Note: the random key here is actually an ID produced by the snowflake algorithm: a non-sequential, non-repeating, irregular 18-digit long value.
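For reference, a snowflake-style ID packs a millisecond timestamp, a machine ID, and a per-millisecond sequence counter into one 64-bit long. The minimal sketch below (the common 41/10/12-bit layout and the custom epoch are assumptions, not necessarily what the test program uses) shows why such IDs are long, unique, and not sequential across machines:

```java
// Minimal snowflake-style generator (a sketch; layout and epoch are assumed).
class SnowflakeId {
    private static final long EPOCH = 1546300800000L; // 2019-01-01 UTC, an assumed custom epoch
    private final long machineId;                     // 0..1023 (10 bits)
    private long lastTimestamp = -1L;
    private long sequence = 0L;                       // 12-bit counter within one millisecond

    SnowflakeId(long machineId) {
        if (machineId < 0 || machineId > 1023) {
            throw new IllegalArgumentException("machineId must be in 0..1023");
        }
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long ts = System.currentTimeMillis();
        if (ts == lastTimestamp) {
            sequence = (sequence + 1) & 0xFFF;        // wrap the 12-bit sequence
            if (sequence == 0) {                      // exhausted: wait for the next millisecond
                while (ts <= lastTimestamp) {
                    ts = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = ts;
        // 41 bits of timestamp | 10 bits of machine ID | 12 bits of sequence
        return ((ts - EPOCH) << 22) | (machineId << 12) | sequence;
    }

    public static void main(String[] args) {
        SnowflakeId gen = new SnowflakeId(1);
        System.out.println(gen.nextId()); // currently an 18-digit long value
    }
}
```

IDs from a single generator instance are monotonically increasing, but values from different machines (different machineId) interleave, which is why the article treats them as effectively random from the index's point of view.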

Auto-increment ID table:

UUID table:

Random primary key table:
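The original screenshots of the three table definitions are not reproduced here. The DDL below is a sketch reconstructed from the column lists in the INSERT statements later in the article; the column types and lengths are assumptions:

```sql
-- Sketch: columns taken from the INSERT statements below; types are assumed.
CREATE TABLE user_key_auto (
    id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- sequential primary key
    user_id   BIGINT,
    user_name VARCHAR(64),
    sex       TINYINT,
    address   VARCHAR(255),
    city      VARCHAR(64),
    email     VARCHAR(128),
    state     TINYINT,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- user_uuid is identical except the key is a client-supplied UUID string:
--   id VARCHAR(36) NOT NULL, PRIMARY KEY (id)
-- user_random_key is identical except the key is a client-supplied snowflake long:
--   id BIGINT NOT NULL, PRIMARY KEY (id)
```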

1.2: Theory alone is not enough, so let's go straight to the code and use Spring's JdbcTemplate to run the test:

Technical stack: Spring Boot + JdbcTemplate + JUnit + hutool. The program connects to a test database and writes the same amount of data under the same conditions for each table, then compares the insert times to gauge efficiency. To make the test as realistic as possible, all data (names, email addresses, addresses, and so on) is randomly generated. The program has been uploaded to Gitee; the address is at the bottom of the article.

package com.wyq.mysqldemo;
import cn.hutool.core.collection.CollectionUtil;
import com.wyq.mysqldemo.databaseobject.UserKeyAuto;
import com.wyq.mysqldemo.databaseobject.UserKeyRandom;
import com.wyq.mysqldemo.databaseobject.UserKeyUUID;
import com.wyq.mysqldemo.diffkeytest.AutoKeyTableService;
import com.wyq.mysqldemo.diffkeytest.RandomKeyTableService;
import com.wyq.mysqldemo.diffkeytest.UUIDKeyTableService;
import com.wyq.mysqldemo.util.JdbcTemplateService;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.util.StopWatch;
import java.util.List;
@SpringBootTest
class MysqlDemoApplicationTests {

  @Autowired
  private JdbcTemplateService jdbcTemplateService;

  @Autowired
  private AutoKeyTableService autoKeyTableService;

  @Autowired
  private UUIDKeyTableService uuidKeyTableService;

  @Autowired
  private RandomKeyTableService randomKeyTableService;


  @Test
  void testDBTime() {

    StopWatch stopwatch = new StopWatch("Execute sql time consumption");


    /**
     * auto_increment key task
     */
    final String insertSql = "INSERT INTO user_key_auto(user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?)";

    List<UserKeyAuto> insertData = autoKeyTableService.getInsertData();
    stopwatch.start("Automatically generate key table task starts");
    long start1 = System.currentTimeMillis();
    if (CollectionUtil.isNotEmpty(insertData)) {
      boolean insertResult = jdbcTemplateService.insert(insertSql, insertData, false);
      System.out.println(insertResult);
    }
    long end1 = System.currentTimeMillis();
    System.out.println("Time consumed by auto key: " + (end1 - start1));

    stopwatch.stop();


    /**
     * UUID key
     */
    final String insertSql2 = "INSERT INTO user_uuid(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";

    List<UserKeyUUID> insertData2 = uuidKeyTableService.getInsertData();
    stopwatch.start("UUID key table task starts");
    long begin = System.currentTimeMillis();
    if (CollectionUtil.isNotEmpty(insertData2)) {
      boolean insertResult = jdbcTemplateService.insert(insertSql2, insertData2, true);
      System.out.println(insertResult);
    }
    long over = System.currentTimeMillis();
    System.out.println("Time consumed by UUID key: " + (over - begin));

    stopwatch.stop();


    /**
     * Random long value key
     */
    final String insertSql3 = "INSERT INTO user_random_key(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";
    List<UserKeyRandom> insertData3 = randomKeyTableService.getInsertData();
    stopwatch.start("Random long value key table task starts");
    long start = System.currentTimeMillis();
    if (CollectionUtil.isNotEmpty(insertData3)) {
      boolean insertResult = jdbcTemplateService.insert(insertSql3, insertData3, true);
      System.out.println(insertResult);
    }
    long end = System.currentTimeMillis();
    System.out.println("Random key task time consumed: " + (end - start));
    stopwatch.stop();


    String result = stopwatch.prettyPrint();
    System.out.println(result);
  }
}

1.3: Program write results

user_key_auto write result:

user_random_key write result:

user_uuid write result:

1.4: Efficiency test results

With 1.3 million rows already in each table, let's insert another 100,000 rows and see the results:

As you can see, at a data volume of about 1 million rows, UUID inserts are already the slowest, and once the tables pass 1.3 million rows, UUID performance deteriorates sharply. The overall efficiency ranking by elapsed time is auto_key > random_key > uuid: UUID is the least efficient, and its efficiency drops off steeply as the data volume grows. Why does this happen? Let's explore the question:

2: Comparison of index structures using UUID and auto-increment ID

2.1: Internal index structure with an auto-increment ID

Auto-increment primary key values are sequential, so InnoDB stores each record directly after the previous one. When the page's maximum fill factor is reached (InnoDB's default is 15/16 of the page size, with 1/16 reserved for future modifications):

① The next record is written to a new page. When data is loaded in this order, index pages fill with nearly sequential records, which maximizes the page fill rate and avoids wasted space.

② A newly inserted row lands directly after the current largest row, so MySQL can locate and address it quickly, without extra time spent computing a position for the new row.

③ Page splits and fragmentation are reduced.

2.2: Internal index structure with a UUID

Because UUIDs are completely irregular compared with sequential auto-increment IDs, a new row's key is not necessarily greater than the previous primary key, so InnoDB cannot always append new rows at the end of the index. Instead it must find a suitable position for each new row and allocate space for it, which requires a lot of extra work. The resulting lack of order scatters the data and causes the following problems:

①: The target page for a write may already have been flushed to disk and evicted from the cache, or may not have been loaded into the cache yet. InnoDB then has to find the page and read it from disk into memory before inserting, causing a large amount of random I/O.

②: Because writes arrive out of order, InnoDB must frequently split pages to make room for new rows. Page splits move large amounts of data, and a single insert can require modifying at least three pages.

③: Frequent page splits leave pages sparse and irregularly filled, which eventually leads to data fragmentation.
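By contrast with the sequential case, the keys that trigger these problems carry no ordering at all. A random version-4 UUID in plain JDK Java (the class and variable names below are just for illustration) looks like:

```java
import java.util.UUID;

class UuidKeyExample {
    public static void main(String[] args) {
        // Version-4 UUIDs are random: consecutive values have no ordering
        // relationship, so as primary keys they land on arbitrary index pages.
        String id = UUID.randomUUID().toString();
        System.out.println(id);          // 36 characters, e.g. "550e8400-e29b-..."
        System.out.println(id.length()); // prints 36
    }
}
```

Stored as a 36-character string, the key is also several times wider than an 8-byte long, which further inflates the clustered index and every secondary index that references it.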

After random values (UUIDs or snowflake IDs) have been loaded into the clustered index (InnoDB's default index organization), it is sometimes necessary to run OPTIMIZE TABLE to rebuild the table and restore good page fill, which takes additional time.
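As a concrete example, using the UUID table from this test, the rebuild can be triggered with:

```sql
-- Rebuild the table to defragment its pages after heavy random-key inserts.
-- For InnoDB, OPTIMIZE TABLE is executed as ALTER TABLE ... FORCE (a full rebuild).
OPTIMIZE TABLE user_uuid;
```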

Conclusion: with InnoDB, you should insert rows in primary-key order whenever possible, and use monotonically increasing clustered key values for new rows as much as possible.

2.3: Disadvantages of using an auto-increment ID

So is an auto-increment ID entirely harmless? No. Auto-increment IDs have problems of their own:

①: If someone crawls your database, they can infer your business growth from the auto-increment IDs, making it easy to analyze your business situation.

②: Under high-concurrency insert load, inserting by primary key causes noticeable lock contention in InnoDB. The upper bound of the primary key becomes a contention hot spot because all inserts happen there, and concurrent inserts can cause gap-lock contention.

③: The AUTO_INCREMENT lock mechanism causes contention for the auto-increment lock, which costs some performance.

Appendix: to mitigate the AUTO_INCREMENT lock contention problem, tune the innodb_autoinc_lock_mode configuration.
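For reference, the current mode can be inspected with the statement below; note that the variable is a startup option (set in my.cnf or on the command line), not changeable at runtime:

```sql
SHOW VARIABLES LIKE 'innodb_autoinc_lock_mode';
-- 0 = traditional: table-level AUTO-INC lock held until the statement ends
-- 1 = consecutive: the default in MySQL 5.7
-- 2 = interleaved: the default in MySQL 8.0; best insert concurrency,
--     but auto-increment values from concurrent statements may interleave
```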

3: Summary

This post started from the question posed at the beginning: we created the tables and used JdbcTemplate to benchmark different ID-generation strategies for bulk inserts, then analyzed how each ID scheme behaves inside MySQL's index structure, along with its advantages and disadvantages, to explain in depth why UUIDs and random non-repeating IDs lose performance on insert. In actual development, it is best to follow MySQL's official recommendation and use auto-increment IDs. MySQL runs deep, and there are many internal optimization points still worth learning.

Appendix: This blog demo address: https://gitee.com/Yrion/mysqlIdDemo

This concludes this in-depth analysis of why MySQL does not recommend using UUIDs or snowflake IDs as primary keys.
