Preface: When designing a table in MySQL, MySQL officially recommends not using UUID or non-continuous and non-repeating snowflake IDs (long and unique), but recommends continuous self-incrementing primary key IDs. The official recommendation is auto_increment. So why is UUID not recommended? What are the disadvantages of using UUID? In this blog, we will analyze this problem and explore the internal reasons. 1: mysql and program examples 1.1: To illustrate this problem, we first create three tables, namely user_auto_key, user_uuid, and user_random_key, which represent the automatically increasing primary key, uuid as the primary key, and random key as the primary key, respectively. We keep the others completely unchanged. According to the control variable method, we only generate the primary key of each table using different strategies, while the other fields are exactly the same, and then test the insertion speed and query speed of the table: Note: The random key here actually refers to the ID calculated by the snowflake algorithm, which is not continuous, repeated, or irregular: a string of 18-bit long values ID automatically generates table: User UUID Table Random primary key table: 1.2: Theory alone is not enough, let's go directly to the program and use spring's jdbcTemplate to implement the incremental test: Technical framework: springboot+jdbcTemplate+junit+hutool. The principle of the program is to connect to your own test database, and then write the same amount of data in the same environment to analyze the insert time to synthesize its efficiency. In order to achieve the most realistic effect, all data is randomly generated, such as name, email address, and address. The program has been uploaded from gitee, and the address is at the bottom of the article. package com.wyq.mysqldemo; import cn.hutool.core.collection.CollectionUtil; import com.wyq.mysqldemo.databaseobject.UserKeyAuto; import com.wyq.mysqldemo.databaseobject.UserKeyRandom; import com.wyq.mysqldemo.databaseobject.UserKeyUUID; import com.wyq.mysqldemo.diffkeytest.AutoKeyTableService; import com.wyq.mysqldemo.diffkeytest.RandomKeyTableService; import com.wyq.mysqldemo.diffkeytest.UUIDKeyTableService; import com.wyq.mysqldemo.util.JdbcTemplateService; import org.junit.jupiter.api.Test; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.boot.test.context.SpringBootTest; import org.springframework.util.StopWatch; import java.util.List; @SpringBootTest class MysqlDemoApplicationTests { @Autowired private JdbcTemplateService jdbcTemplateService; @Autowired private AutoKeyTableService autoKeyTableService; @Autowired private UUIDKeyTableService uuidKeyTableService; @Autowired private RandomKeyTableService randomKeyTableService; @Test void testDBTime() { StopWatch stopwatch = new StopWatch("Execute sql time consumption"); /** * auto_increment key task */ final String insertSql = "INSERT INTO user_key_auto(user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)"; List<UserKeyAuto> insertData = autoKeyTableService.getInsertData(); stopwatch.start("Automatically generate key table task starts"); long start1 = System.currentTimeMillis(); if (CollectionUtil.isNotEmpty(insertData)) { boolean insertResult = jdbcTemplateService.insert(insertSql, insertData, false); System.out.println(insertResult); } long end1 = System.currentTimeMillis(); System.out.println("Time consumed by auto key: " + (end1 - start1)); stopwatch.stop(); /** * uudID key */ final String insertSql2 = "INSERT INTO user_uuid(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?,?)"; List<UserKeyUUID> insertData2 = uuidKeyTableService.getInsertData(); stopwatch.start("UUID key table task starts"); long begin = System.currentTimeMillis(); if (CollectionUtil.isNotEmpty(insertData)) { boolean insertResult = jdbcTemplateService.insert(insertSql2, insertData2, true); System.out.println(insertResult); } long over = System.currentTimeMillis(); System.out.println("Time consumed by UUID key: " + (over - begin)); stopwatch.stop(); /** * Random long value key */ final String insertSql3 = "INSERT INTO user_random_key(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?,?)"; List<UserKeyRandom> insertData3 = randomKeyTableService.getInsertData(); stopwatch.start("Random long value key table task starts"); Long start = System.currentTimeMillis(); if (CollectionUtil.isNotEmpty(insertData)) { boolean insertResult = jdbcTemplateService.insert(insertSql3, insertData3, true); System.out.println(insertResult); } Long end = System.currentTimeMillis(); System.out.println("Random key task time consumed: " + (end - start)); stopwatch.stop(); String result = stopwatch.prettyPrint(); System.out.println(result); } 1.3: Program write results user_key_auto write result: User_random_key write result: The result of writing user_uuid table: 1.4: Efficiency test results When the existing data volume is 1.3 million, let's test inserting 100,000 data to see what the result will be: It can be seen that when the data volume is about 1 million, the insertion efficiency of uuid is the lowest, and when 1.3 million data are added in the subsequent sequence, the time of uuid drops sharply. The overall efficiency ranking based on the time usage is: auto_key>random_key>uuid. UUID has the lowest efficiency. When the amount of data is large, the efficiency drops sharply. So why does this phenomenon occur? With doubts, let's explore this issue: 2: Comparison of index structures using uuid and auto-increment id 2.1: Using the internal structure of the auto-increment id The values of the auto-increment primary key are sequential, so Innodb stores each record behind a record. When the maximum fill factor of the page is reached (the default maximum fill factor of InnoDB is 15/16 of the page size, and 1/16 of the space will be reserved for future modifications): ① The next record will be written to a new page. Once the data is loaded in this order, the primary key page will be filled with nearly sequential records, which increases the maximum fill rate of the page and avoids page waste. ②The newly inserted row will be located under the original largest row. MySQL locates and addresses the row quickly, and does not consume extra time to calculate the position of the new row. ③Reduce page splitting and fragmentation 2.2: Index internal structure using uuid Because UUID is completely irregular compared to sequential auto-increment IDs, the value of a new row does not necessarily have to be greater than the value of the previous primary key, so InnoDB cannot always insert new rows at the end of the index. Instead, it needs to find a new suitable position for the new row to allocate new space. This process requires a lot of additional operations. The lack of order in the data will lead to scattered data distribution, which will cause the following problems: ①: The target page to be written may have been flushed to disk and removed from the cache, or has not yet been loaded into the cache. InnoDB has to find and read the target page from disk into memory before inserting, which will cause a large amount of random IO ②: Because writes are out of order, InnoDB has to frequently split pages to allocate space for new rows. Page splits result in moving a large amount of data, and at least three pages need to be modified for one insertion. ③: Due to frequent page splits, pages become sparse and irregularly filled, which eventually leads to data fragmentation After loading the random values (uuid and snowflake id) into the clustered index (the default index type of innodb), it is sometimes necessary to do an OPTIMEIZE TABLE to rebuild the table and optimize the page filling, which will take some time. Conclusion: When using InnoDB, you should insert rows in the auto-increment order of the primary key as much as possible, and use monotonically increasing clustering key values to insert new rows as much as possible. 2.3: Disadvantages of using auto-increment id So is there no harm in using an auto-incrementing id? No, there are also the following problems with self-incrementing IDs: ①: Once someone crawls your database, they can obtain your business growth information based on the database's self-incrementing ID, and it is easy to analyze your business situation ②: For high concurrent loads, InnoDB will cause obvious lock contention when inserting by primary key. The upper bound of the primary key will become a hot spot for contention because all insertions occur here. Concurrent insertions will cause gap lock contention. ③: The Auto_Increment lock mechanism will cause the snatch of the auto-increment lock, resulting in a certain performance loss Appendix: To improve the lock contention problem of Auto_increment, you need to tune the configuration of innodb_autoinc_lock_mode Three: Summary This blog starts with the question raised at the beginning, creating a table and using jdbcTemplate to test the performance of different ID generation strategies in inserting large amounts of data. It then analyzes the different ID mechanisms in the MySQL index structure and their advantages and disadvantages, and explains in depth why UUID and random non-repeating IDs have performance loss in data insertion, and explains this issue in detail. In actual development, it is best to use auto-increment ID according to the official recommendation of MySQL. MySQL is profound and there are still many points worth optimizing inside that we need to learn. Appendix: This blog demo address: https://gitee.com/Yrion/mysqlIdDemo This concludes this article on in-depth analysis of why MySQL does not recommend using UUID or Snowflake ID as primary key. For more relevant content about MySQL UUID or Snowflake ID as primary key, please search 123WORDPRESS.COM's previous articles or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: Linux uses bond to implement dual network cards to bind a single IP sample code
>>: More elegant processing of dates in JavaScript based on Day.js
Table of contents text 1. Prepare the machine 2. ...
console.log( [] == ![] ) // true console.log( {} ...
This article uses examples to illustrate the usag...
Create a new server.js yarn init -y yarn add expr...
Red and pink, and their hexadecimal codes. #99003...
Preface When sharing a page, you hope to click th...
First: <abbr> or <acronym> These two s...
When we write code, we often need to know the dif...
Table of contents 1. Introduction to teleport 1.1...
Steps: 1. Install MySQL database 1. Download the ...
1. Function Introduction sed (Stream EDitor) is a...
Table of contents What is MVCC MVCC Implementatio...
1. Introduction to TypeScript The previous articl...
1. Install MySQL software Download and install My...
This article shares the specific code of JavaScri...