1. Business scenario introduction Suppose there is an e-commerce system that uses Of course, some friends know how to split the database according to provinces/regions or certain business relationships. OK, now the question is, how to ensure that data is stored in different tables in different libraries? Let the library reduce concurrency pressure? How should we formulate the rules for sub-library and table division? Don't worry, it's coming soon 2. Horizontal database and table division method1.RANGEThe first method is to specify a data range to split the table, for example, from 1 to 1000000, 1000001-2000000, using one table for one million, as shown in the following figure Insert picture description here Of course, this method requires maintaining the table ID, especially in a distributed environment. For this distributed ID, it is recommended to use Advantages of the RANGE method: Simple expansion, just build the database and table in advance Disadvantages of the RANGE method: Most reads and writes will access new data, which creates an IO bottleneck. This will cause excessive pressure on the new database and is not recommended. 2. HASH modulus To solve the IO bottleneck problem of In this way, the data can be dispersed in different databases and tables, avoiding the problem of IO bottleneck. Advantages of HASH modulus method: It can ensure that data is evenly distributed in different databases and tables, reducing database pressure Disadvantages of HASH modulus method: expansion is difficult, and the hash value needs to be recalculated and allocated to different databases and tables each time data is migrated 3. Consistent HASHModulo by HASH is not the most perfect way, so what is it? Using consistent HASH algorithm can perfectly solve the problem Common HASH algorithm: A common hash algorithm maps a binary value of arbitrary length to a shorter binary value of fixed length. This small binary value is called a hash value. A hash value is a unique and extremely compact numeric representation of a piece of data. The shortcomings of ordinary Consistent HASH algorithm: Use the commonly used This ring is connected end to end. Suppose there are three database server nodes OK, now let's assume that node3 fails. You will find that user3 will fall on node4. You will find that through the analysis of node addition and deletion, the consistent hashing algorithm can minimize data migration while maintaining monotonicity. Such an algorithm is very suitable for distributed clusters, avoiding large amounts of data migration and reducing server pressure. Of course there is still one problem that needs to be solved, and that is balance. From the figure we can see that when there are relatively few server nodes, a problem will arise, that is, a large amount of data will inevitably be concentrated on one node, and very little data will be concentrated on another node. In order to solve this data skew problem, the consistent hashing algorithm introduces a virtual node mechanism, which calculates multiple hashes for each service node and places a node at each calculation result location, called a virtual node. The specific approach is to first determine the number of virtual nodes associated with each physical node, and then add a number after the IP or host name. For example, in the above case, three virtual nodes can be calculated for each server, so the hash values of " For example, user1 is located on Advantages of the consistent HASH method: virtual nodes can ensure that data is evenly distributed in different databases and tables, and adding or deleting nodes does not affect the data of other nodes, with high availability and strong disaster tolerance. Disadvantages of the consistent modulus method: Well, compared with the above two, it can be considered that there are none. 3. Unit TestingOK, no more nonsense, next is the unit test, assuming there are three nodes, each node has three virtual nodes package com.hyh.core.test; import com.hyh.utils.common.StringUtils; import org.junit.Test; import java.util.LinkedList; import java.util.List; import java.util.SortedMap; import java.util.TreeMap; /** * Consistency HASH TEST * * @Author heyuhua * @create 2021/1/31 19:50 */ public class ConsistentHashTest { //List of servers to be added to the Hash ring private static String[] servers = {"192.168.5.1", "192.168.5.2", "192.168.5.3"}; //Real node list. Considering the scenarios where the server goes online and offline, that is, the addition and deletion scenarios will be more frequent, it is better to use LinkedList here private static List<String> realNodes = new LinkedList<>(); //Virtual node, key represents the hash value of the virtual node, value represents the name of the virtual node private static SortedMap<Integer, String> virtualNodes = new TreeMap<>(); //One real node corresponds to 3 virtual nodes private static final int VIRTUAL_NODES = 3; /** * Test the consistency of HASH with virtual nodes */ @Test public void testConsistentHash() { initNodes(); String[] users = {"user1", "user2", "user3", "user4", "user5", "user6", "user7", "user8", "user9"}; for (int i = 0; i < users.length; i++) System.out.println("[" + users[i] + "]'s hash value is " + getHash(users[i]) + ", routed to node[" + getServer(users[i]) + "]"); } /** * First add the original server to the real node list */ public void initNodes() { for (int i = 0; i < servers.length; i++) realNodes.add(servers[i]); for (String str : realNodes) { for (int i = 0; i < VIRTUAL_NODES; i++) { String virtualNodeName = str + "-virtual node" + String.valueOf(i); int hash = getHash(virtualNodeName); System.out.println("Virtual node [" + virtualNodeName + "] is added, hash value is " + hash); virtualNodes.put(hash, virtualNodeName); } } System.out.println(); } //Use FNV1_32_HASH algorithm to calculate the Hash value of the server. The method of rewriting hashCode is not used here, and the final effect is no different private static int getHash(String str) { final int p = 16777619; int hash = (int) 2166136261L; for (int i = 0; i < str.length(); i++) hash = (hash ^ str.charAt(i)) * p; hash += hash << 13; hash ^= hash >> 7; hash += hash << 3; hash ^= hash >> 17; hash += hash << 5; // If the calculated value is negative, take its absolute value if (hash < 0) hash = Math.abs(hash); return hash; } //Get the node to be routed to private static String getServer(String key) { //Get the hash value of the key int hash = getHash(key); // Get all Maps greater than the Hash value SortedMap<Integer, String> subMap = virtualNodes.tailMap(hash); String virtualNode; if (subMap.isEmpty()) { //If there is no hash value greater than the key, start from the first node Integer i = virtualNodes.firstKey(); //Return the corresponding server virtualNode = virtualNodes.get(i); } else { //The first Key is the node closest to the node in clockwise direction Integer i = subMap.firstKey(); //Return the corresponding server virtualNode = subMap.get(i); } //The virtualNode virtual node name needs to be intercepted if (StringUtils.isNotBlank(virtualNode)) { return virtualNode.substring(0, virtualNode.indexOf("-")); } return null; } } Here we simulate the situation where 9 user objects are routed after being hashed. See the results Summarize: It is strongly recommended to use consistent This is the end of this article about the details of You may also be interested in:
|
<<: JavaScript basics for loop and array
>>: Differences between FLOW CHART and UI FLOW
(1) Introduction: clipboard.js is a lightweight J...
Each of these 16 sites is worth reading carefully,...
This article introduces the sample code of CSS3 t...
The Golden Rule No matter how many people are wor...
0. Environment Operating system for this article:...
When we are doing front-end development, we will ...
I will be learning MySQL next semester. I didn...
Using the image service deployed by docker stack,...
The detailed process of using MySQL database with...
1. Introduction: I think the changes after mysql8...
Whether the a tag opens a new page: (1) Baidu Ency...
1. HTML part <Col span="2">Upload...
1. Docker pull pulls the image When using $ docke...
1. Install Baidu Eslint Rule plugin npm i -D esli...
Table of contents Preface Introduction to QueryCa...