There are many ways to generate a global ID. Here is a simple solution: using MySQL's auto-increment ID to generate a globally unique ID. 1. Create a table that only needs two fields: CREATE TABLE `guid` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `stub` char(1) NOT NULL DEFAULT '' COMMENT 'stub field, used to occupy the pit', PRIMARY KEY (`id`), UNIQUE KEY `uk_stub` (`stub`) -- set stub as a unique index) ENGINE=MyISAM AUTO_INCREMENT=1000000000 DEFAULT CHARSET=utf8; Specify the auto-increment start value: alter table guid auto_increment=1000000000, which ensures that the ID is 10 digits (it is almost impossible to increase it to 11 digits). 2. Define mybatis mapper: @Mapper public interface GuidMapper { //Get a globally unique ID * @return */ // replace into afs_guid(stub) values('a'); // select last_insert_id(); @Insert("REPLACE INTO guid (stub) VALUES('a')") @SelectKey(statement = {"SELECT LAST_INSERT_ID()"}, keyProperty = "guidHolder.id", before = false, resultType = long.class) int getGuid( @Param("guidHolder") GuidHolder guidHolder); @Data public static class GuidHolder{ private long id; private String stub; } 3. Testing GuidMapper.GuidHolder guidHolder = new GuidMapper.GuidHolder(); int i = guidMapper.getGuid(guidHolder); long guid = guidHolder.getId(); // guid is the returned ID Tail Concurrency safety issues REPLACE INTO is similar to INSERT and is safe. It will first check whether the primary key or unique key is repeated. If it is repeated, it will delete the original one and add a new one to replace the original one. SELECT LAST_INSERT_ID() is bound to the MySQL connection. On the current connection, the operation triggers the change of the auto_increment value, and the new value is obtained. This value is only visible to the current connection. Other connections will only get the value after it changes the auto_increment. The above two points ensure concurrency safety. In addition, even if you manually reduce the value of id, it will continue to increase based on the last auto-increment after the next replace into. This is because manually modifying the value of id will not change the value of auto_increment. Supplementary knowledge: How to ensure the generation of distributed unique global IDs in high concurrency clusters Preface The system's unique ID is a problem we often encounter when designing a system, and we often struggle with this problem. This article is to provide you with an idea for generating a distributed unique global ID generation solution, I hope it can help everyone. Please feel free to point out any shortcomings! ! question Why do we need a distributed globally unique ID and what are the business requirements for distributed IDs? In complex distributed systems, it is often necessary to uniquely identify a large amount of data and messages, such as in Meituan Dianping’s finance, payment, catering, and hotel The data in the system of products such as Maoyan Movies is gradually increasing. After the database is divided into different tables, a unique ID is needed to identify a piece of data or information. In particular, Ian's orders, riders, and coupons all need to be identified by a unique ID. At this time, a system that can generate a globally unique ID is very necessary Some rigid requirements for ID generation rules Globally unique Trend increasing The clustered index is used in the MySQL InnoDB engine. Since most RDBMS use the Btree data structure to store indexes, we should try to use ordered primary keys to ensure write performance when choosing primary keys. Monotonically increasing Ensure that the next ID is greater than the previous ID, such as transaction version number, IM incremental message, sorting and other special requirements Information Security If the ID is continuous, malicious users can easily crawl the URL in sequence. If it is an order number, it is dangerous because competitors can directly know our daily order volume. Therefore, in some application scenarios, the ID needs to be irregular to make it difficult for competitors to guess. With timestamp You can also quickly understand when this distributed ID is generated during development Availability requirements for ID number generation systems High Availability When I issue a request to obtain a distributed ID, the server will guarantee that it will create a unique distributed ID for me in 99.999% of the cases. Low latency Send a request to get a distributed ID, the server must be fast, extremely fast High QPS For example, if 100,000 distributed ID creation requests come at the same time, the server must be able to handle it and successfully create 100,000 distributed IDs at once. General universal solution UUID UUID.randomUUID() , the standard form of UUID contains 32 hexadecimal digits, divided into five segments by hyphens, 36 characters in the form of 8-4-4-4-12, with very high performance, locally generated, and no network consumption. Problems Poor database access performance because UUIDs are unordered Unordered, its generation order cannot be predicted, and it cannot generate increasing and ordered numbers First of all, distributed IDs are generally used gradually, but according to the official recommendation of MySQL, the primary key should be as short as possible. Each UUID is very long, so it is not recommended. Primary key: When ID is used as the primary key, there will be some problems in certain environments. For example, UUID is not suitable for use as a DB primary key. MySQL has a clear explanation. Index, B+ tree index split Since the distributed ID is the primary key, and the primary key contains an index, and the MySQL index is implemented through a B+ tree, each time new UUID data is inserted, the underlying B+ tree of the index will be modified for query optimization. Because UUID data is unordered, each insertion of UUID data will cause a major modification to the primary key's B+ tree, which is very bad. The insertion is completely unordered, which will not only cause some intermediate nodes to split, but also create many unsaturated nodes in vain, greatly reducing the performance of database insertion. UUID can only guarantee global uniqueness, but does not satisfy the trend of increasing or monotonically increasing. Database auto-increment primary key Single Machine In a distributed system, the main principle of the database's auto-increment ID mechanism is implemented by the database's auto-increment ID and the replace into of the MySQL database. The replace into here is similar to the insert function, with the difference that replace into first tries to insert into the data list. If it is found that this row of data already exists in the table (based on the primary key or unique index), it will be deleted first and then inserted. Otherwise, the new data will be directly inserted. REPLACE INTO means inserting a record. If the value of the unique index in the table conflicts, the old data is replaced.
Every time we insert, we find that the original data will be replaced and the ID will increase This is satisfying Incremental Monotonicity Uniqueness In a distributed situation with low concurrency, this solution can be used to obtain a globally unique ID. Cluster Distributed Cluster Is the database's self-increment ID mechanism suitable for distributed IDs? The answer is not suitable It is difficult to expand the system horizontally. For example, after defining the step size and the number of machines, what should we do if we need to add a machine? Suppose there is a machine numbered: 1, 2, 3, 4, 5 (the step size is 1), and we need to expand the capacity of one machine. We can do this: set the initial value of the second machine to be much higher than the first one. It seems okay, but if there are 100 machines online, how to expand the capacity at this time is a nightmare. Therefore, the system horizontal expansion plan is complex and difficult to implement. The database pressure is still very high. Every time an ID is obtained, the database must be read and written, which greatly affects performance and does not meet the low latency and high QPS rules in distributed IDs (under high concurrency, if the ID is obtained from the database, it will greatly affect performance) Generate global ID strategy based on Redis Standalone version Because Redis is single-threaded, atomicity is inherently guaranteed and can be achieved using the atomic operations INCR and INCRBY INCRBY: Set the growth step size Cluster distribution Note: In the case of Redis cluster, different growth steps need to be set as in MySQL, and the key must be set with a validity period. You can use Redis cluster to obtain higher throughput. Assume that there are 5 Redis servers in a cluster. You can initialize the value of each Redis server to 1, 2, 3, 4, and 5 respectively, and then set the step size to 5. The IDs generated by each Redis are:
However, the problem is that the maintenance and configuration of the Redis cluster are rather troublesome. Because a single point of failure is required, the sentinel But the main problem is that for an ID, you need to introduce the entire Redis cluster, which feels like overkill. Snowflake Algorithm What is Twitter's distributed auto-incrementing ID algorithm, Snowflake Initially, Twitter migrated its storage system from MySQL to Cassandra (an open source distributed NoSQL database system developed by Facebook). Because Cassandra did not have a sequential ID generation mechanism, it developed such a globally unique ID generation service. Twitter's distributed snowflake algorithm SnowFlake can generate 260,000 auto-incrementing sortable IDs per second. Twitter's SnowFlake generates IDs in a time-ordered manner The result of the ID generated by the SnowFlake algorithm is a 64-bit integer, which is a Long type (the maximum length after conversion to a string is 19). There will be no ID collision in the distributed system (differentiated by datacenter and workerID) and the efficiency is high. In a distributed system, there are some scenarios that require a globally unique ID. The basic requirements for generating an ID are: In a distributed environment, global uniqueness is required Generally, it needs to be monotonically increasing, because generally unique IDs will exist in the database, and the characteristic of InnoDB is to store the content in the leaf nodes on the primary key index, and it increases from left to right. Therefore, considering the database performance, it is generally best to generate monotonically increasing IDs. To prevent ID conflicts, 36-bit UUIDs can be used, but UUIDs have some disadvantages. First, they are relatively long, and second, UUIDs are generally unordered. You may also need no rules, because if you use a unique ID as the order number, you need this rule to prevent others from knowing the number of orders per day. structure Several core components of the snowflake algorithm In Java, 64-bit certificates are of type long, so the ID generated by the SnowFlake algorithm is stored in the long class. Part I The highest bit in binary is the sign bit, 1 represents a negative number and 0 represents a positive number. The generated ID is usually an integer, so the highest bit is fixed to 0. Part 2 The second part is the 41-bit timestamp bit, which is used to record the timestamp in milliseconds. 41 bits can represent 2^41 -1 numbers If it is only used to represent positive integers, the range that can be represented is: 0 - 2^41 -1. The reason for minus 1 is that the range of values that can be represented starts from 0, not 1. That is to say, 41 bits can represent the value of 2^41 - 1 milliseconds, which is 69.73 years when converted into units of years. Part 3 The third part is the work machine ID, 10 bits are used to record the work machine ID Can be deployed on 2^10 = 1024 nodes, including 5-digit datacenterId (data center, computer room) and 5-digit workerID (machine code) The maximum positive integer that can be represented by 5 bits is 2^5 = 31 numbers, which can represent different data centers and machine codes. Part 4 The positive integer that 12 bits can be used to represent is 2^12 = 4095, that is, 0 1 2 … 4094 can be used to represent 4095 ID numbers generated by the same machine and the same timestamp. SnowFlake can guarantee All generated IDs increase in time trend There will be no duplicate IDs in the entire distributed system because datacenterId and workerId are used to distinguish them. accomplish The snowflake algorithm is written in Scala, and some people use Java to implement it. The github address is /** * Twitter's Snowflake algorithm -- Java implementation * * @author beyond * @date 2016/11/26 */ public class SnowFlake { /** * Starting timestamp */ private final static long START_STMP = 1480166465631L; /** * The number of bits each part occupies*/ private final static long SEQUENCE_BIT = 12; //Number of digits occupied by the serial number private final static long MACHINE_BIT = 5; //Number of digits occupied by the machine identifier private final static long DATACENTER_BIT = 5;//Number of digits occupied by the data center/** * Maximum value of each part*/ private final static long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT); private final static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT); private final static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT); /** * The displacement of each part to the left*/ private final static long MACHINE_LEFT = SEQUENCE_BIT; private final static long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT; private final static long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT; private long datacenterId; //Data centerprivate long machineId; //Machine IDprivate long sequence = 0L; //Serial numberprivate long lastStmp = -1L; //Last timestamppublic SnowFlake(long datacenterId, long machineId) { if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) { throw new IllegalArgumentException("datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0"); } if (machineId > MAX_MACHINE_NUM || machineId < 0) { throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0"); } this.datacenterId = datacenterId; this.machineId = machineId; } /** * Generate the next ID * * @return */ public synchronized long nextId() { long currStmp = getNewstmp(); if (currStmp < lastStmp) { throw new RuntimeException("Clock moved backwards. Refusing to generate id"); } if (currStmp == lastStmp) { //In the same millisecond, the sequence number increases automatically sequence = (sequence + 1) & MAX_SEQUENCE; //The number of sequences in the same millisecond has reached the maximum if (sequence == 0L) { currStmp = getNextMill(); } } else { //In different milliseconds, the sequence number is set to 0 sequence = 0L; } lastStmp = currStmp; return (currStmp - START_STMP) << TIMESTMP_LEFT //Timestamp part | datacenterId << DATACENTER_LEFT //Data center part | machineId << MACHINE_LEFT //Machine ID part | sequence; //Serial number part} private long getNextMill() { long mill = getNewstmp(); while (mill <= lastStmp) { mill = getNewstmp(); } return mill; } private long getNewstmp() { return System.currentTimeMillis(); } public static void main(String[] args) { SnowFlake snowFlake = new SnowFlake(2, 3); for (int i = 0; i < (1 << 12); i++) { System.out.println(snowFlake.nextId()); } } } Project implementation experience hutools toolkit Address: https://github.com/looly/hutool SpringBoot integrates snowflake algorithm Introduce the hutool tool class <dependency> <groupId>cn.hutool</groupId> <artifactId>hutool-all</artifactId> <version>5.3.1</version> </dependency> Integration /** * Snowflake Algorithm* * @author: Moxi* @create: 2020-04-18-11:08 */ public class SnowFlakeDemo { private long workerId = 0; private long datacenterId = 1; private Snowflake snowFlake = IdUtil.createSnowflake(workerId, datacenterId); @PostConstruct public void init() { try { //Convert the network ip into long workerId = NetUtil.ipv4ToLong(NetUtil.getLocalhostStr()); } catch (Exception e) { e.printStackTrace(); } } /** * Get the snowflake ID * @return */ public synchronized long snowflakeId() { return this.snowFlake.nextId(); } public synchronized long snowflakeId(long workerId, long datacenterId) { Snowflake snowflake = IdUtil.createSnowflake(workerId, datacenterId); return snowflake.nextId(); } public static void main(String[] args) { SnowFlakeDemo snowFlakeDemo = new SnowFlakeDemo(); for (int i = 0; i < 20; i++) { new Thread(() -> { System.out.println(snowFlakeDemo.snowflakeId()); }, String.valueOf(i)).start(); } } } Get the result 1251350711346790400 1251350711346790402 1251350711346790401 1251350711346790403 1251350711346790405 1251350711346790404 1251350711346790406 1251350711346790407 1251350711350984704 1251350711350984706 1251350711350984705 1251350711350984707 1251350711350984708 1251350711350984709 1251350711350984710 1251350711350984711 1251350711350984712 1251350711355179008 1251350711355179009 1251350711355179010 Pros and Cons advantage The milliseconds are in the high dimension, the auto-increment sequence is in the low dimension, and the entire ID is increasing in trend. It does not rely on third-party systems such as databases, and is deployed as a service, which has higher stability and very high performance in generating IDs. Bits can be allocated according to their own business characteristics, which is very flexible shortcoming Depends on the machine clock. If the machine clock is dialed back, duplicate IDs will be generated. It is increasing on a single machine, but because it involves a distributed environment, the clocks on each machine cannot be completely synchronized, and sometimes there will be situations where the global increase is not present. This shortcoming can be considered unimportant. Generally, distributed IDs only require an increasing trend, and do not strictly require an increasing trend. 90% of the requirements only require an increasing trend. Other Supplements In order to solve the problem of clock dialing back, which leads to duplicate IDs, someone later proposed a solution. Baidu's open source distributed unique ID generator UidGenerator Leaf - Meituan Dianping Distributed ID Generation System The above article, a simple ID generation strategy: Implementation of generating a globally unique ID from a Mysql table, is all I have to share with you. I hope it can give you a reference and I also hope that you will support 123WORDPRESS.COM. You may also be interested in:
|
<<: JavaScript to show and hide the drop-down menu
>>: Detailed tutorial on configuring nginx for https encrypted access
Table of contents 1. Default values for functio...
Starting from this section, we will explain the i...
This article example shares the specific code for...
<iframe src="./ads_top_tian.html" all...
First, let's talk about the in() query. It is...
1.MySQL replication concept It means transferring...
The communication modes of vue3 components are as...
Rendering If you want to achieve the effect shown...
This article uses an example to describe how to q...
What are the lifecycle functions of React compone...
Add rules to the el-form form: Define rules in da...
LocalStorage stores Boolean values Today, when I ...
Preface JavaScript is not like other languages ...
Adding the attribute selected = "selected&quo...
1. Use xshell to connect to the virtual machine, ...