In-depth understanding of Linux load balancing LVS

1. LVS load balancing

A load balancing cluster (LB cluster) distributes incoming requests across multiple servers. Commonly used open source load balancing software includes Nginx, LVS, and HAProxy; commercial hardware load balancers include F5, NetScaler, and others.

2. Basic introduction of load balancing LVS

The architecture and principle of an LB cluster are simple: when a user request comes in, it first reaches the Director Server, which then distributes it to a backend real server according to the configured scheduling algorithm. To prevent users from seeing different data depending on which machine handled their request, shared storage is needed so that every real server serves the same data.

LVS is an open source project initiated by Dr. Zhang Wensong; the official website is http://www.linuxvirtualserver.org. LVS is now part of the standard Linux kernel. The technical goal of LVS is to use its load balancing technology together with the Linux operating system to build a high-performance, highly available Linux server cluster with good reliability, scalability and manageability, thereby achieving optimal performance at low cost.

An LVS cluster uses IP load balancing and content-based request distribution. The scheduler has a high throughput and distributes requests evenly across the servers, and it automatically masks server failures, so that a group of servers behaves as a single high-performance, highly available virtual server. The structure of the cluster is transparent to clients, and neither client nor server programs need to be modified.

3. LVS Architecture

The principle of load balancing is simple: when a client initiates a request, the request is sent to the Director Server (scheduler), which distributes it to one of the backend real servers according to the configured scheduling algorithm, so that the load is spread evenly. But HTTP connections are stateless. Imagine this scenario: I log in to Taobao, add a product I like to my shopping cart, and then refresh the page. Because of load balancing, the scheduler may pick a different server to serve me this time, and everything in my shopping cart is gone, which makes for a very poor user experience. Therefore shared storage is required to ensure that the data a user requests is the same regardless of which server responds. For this reason, LVS load balancing is organized as a three-layer architecture (these are the main components of LVS load balancing):

As shown in the figure:

Detailed introduction to each level of LVS:

3.1 Load Balancer Layer

Located at the front end of the entire cluster system, this layer consists of one or more load schedulers (Director Servers). The LVS module is installed on the Director Server, whose main function is similar to that of a router: it maintains the routing table set up for LVS and uses it to distribute user requests to the application servers (Real Servers) in the Server Array layer. The Real Server monitoring module ldirectord also runs on the Director Server; it checks the health of each Real Server's service, removes a Real Server from the LVS routing table when it becomes unavailable, and adds it back when it recovers.
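As a hedged illustration of the health checking just described, an ldirectord configuration might look roughly like the sketch below. The directive layout follows the common ldirectord.cf format, but exact options can vary by version; the VIP 192.168.10.100, the real-server addresses, and the healthcheck.html page are made-up examples:

```
# /etc/ha.d/ldirectord.cf -- illustrative sketch only
checktimeout=3        # seconds before a failed check marks the RS as down
checkinterval=5       # seconds between health checks
autoreload=yes        # re-read this file when it changes
quiescent=no          # remove failed real servers from the IPVS table entirely

virtual=192.168.10.100:80
    real=192.168.10.11:80 gate
    real=192.168.10.12:80 gate
    service=http
    request="healthcheck.html"   # hypothetical page requested on each RS
    receive="OK"                 # expected content of the response
    scheduler=rr
    protocol=tcp
    checktype=negotiate
```

With quiescent=no, a Real Server that fails its check is removed from the IPVS table and re-added once the check succeeds again, which matches the behavior described above.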

3.2 Server Array Layer

This layer consists of the machines that actually run the application services. A Real Server can be a web server, mail server, FTP server, DNS server, and so on. The Real Servers are connected by a high-speed LAN, or by a WAN when they are distributed across locations. In practice, the Director Server can also act as a Real Server at the same time.

3.3 Shared Storage Layer

This is a storage area that provides shared storage space and content consistency for all Real Servers. Physically it usually consists of disk array devices. To provide content consistency, data is generally shared via the NFS network file system; however, NFS does not perform well in busy systems, in which case a cluster file system such as Red Hat's GFS can be used. An analogy: a company must keep one shared ledger so that its staff stay coordinated. Otherwise, if a customer pays employee A and is later served by employee B, B has no record of the payment and insists the customer never paid; at that point it is no longer merely a user-experience problem but a data-consistency problem.

4. Implementation Principle of LVS

(1) When a user request reaches the load balancing scheduler (Director Server), the scheduler passes the request into kernel space.

(2) The PREROUTING chain receives the user request first, determines that the destination IP is a local IP, and passes the packet to the INPUT chain.

(3) IPVS works on the INPUT chain. When a user request arrives at INPUT, IPVS compares it with the cluster services it has defined. If the request matches a cluster service, IPVS forcibly modifies the destination IP address and port of the packet and sends the new packet to the POSTROUTING chain.

(4) The POSTROUTING chain finds that the destination IP address of the packet is now one of its backend servers, and the packet is finally sent to that backend server via routing.
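The "cluster service" that IPVS matches against on the INPUT chain is defined with the ipvsadm tool. A minimal, hedged sketch (the VIP 192.168.10.100 and real-server address are made-up examples; the forwarding methods -m/-g/-i correspond to the NAT, DR and TUN modes described later):

```bash
# Define a virtual (cluster) service: TCP on the VIP, port 80, round-robin scheduling
ipvsadm -A -t 192.168.10.100:80 -s rr

# Register a real server for that service (here -m = NAT forwarding; -g = DR, -i = TUN)
ipvsadm -a -t 192.168.10.100:80 -r 192.168.10.11:80 -m

# List the IPVS table that incoming packets are compared against
ipvsadm -Ln
```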

5. Working Principle of LVS

LVS has four working modes: NAT, DR, TUN, and FULL-NAT. Comparing them: because of how it works, NAT is the simplest to configure, but it puts the most pressure on the scheduler, so its efficiency is the lowest. DR and TUN work in similar ways, but in DR all hosts must be on the same physical network segment, whereas in TUN the hosts can be distributed across locations, with one server in New York and another in Shenzhen. In practice, the most widely used mode is DR.

6. LVS-related terms

(1) DS: Director Server refers to the front-end load balancer node.

(2) RS: Real Server, the real working server at the backend.

(3) VIP: Virtual IP, the externally visible IP address that faces user requests; it is the destination address of the user's requests.

(4) DIP: Director Server IP is the IP address mainly used for communicating with internal servers.

(5) RIP: Real Server IP: the IP address of the backend server.

(6) CIP: Client IP: the IP address of the access client.

7. NAT Mode - Network Address Translation

NAT mode is implemented with network address translation. When the scheduler (LB) receives a client request packet (whose destination IP is the VIP), it decides, according to the scheduling algorithm, which backend real server (RS) should handle the request. The scheduler then rewrites the destination IP address and port of the client's request packet to the chosen RS's IP address (RIP), so that the RS receives the request. After the RS has processed the request, it sends the response via its default route (in NAT mode the RS's default route must point to the LB server) back to the LB. The LB then rewrites the source address of the response packet to the virtual address (VIP) and sends it back to the client.

VS/NAT is the simplest method: all RealServers only need to point their gateways at the Director, and the clients can run any operating system. However, the number of RealServers a single Director can drive in this way is relatively limited. In VS/NAT mode, the Director can also act as a RealServer. The VS/NAT architecture is shown in the figure.
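A minimal director-side VS/NAT sketch under assumed example addresses (VIP 192.168.10.100 on the client-facing side; DIP 172.16.0.1 and real servers 172.16.0.11/172.16.0.12 on the internal side):

```bash
# On the Director: allow the kernel to forward the rewritten packets to the real servers
sysctl -w net.ipv4.ip_forward=1

# Virtual service bound to the VIP, weighted round-robin scheduling
ipvsadm -A -t 192.168.10.100:80 -s wrr

# Real servers added in masquerading (NAT) mode with -m
ipvsadm -a -t 192.168.10.100:80 -r 172.16.0.11:80 -m -w 1
ipvsadm -a -t 192.168.10.100:80 -r 172.16.0.12:80 -m -w 1
```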

8. Working Principle of NAT Mode

(1) When the user request reaches the Director Server, the request data packet will first go to the PREROUTING chain in the kernel space. At this time, the source IP of the message is CIP and the destination IP is VIP.

(2) PREROUTING checks and finds that the destination IP of the data packet is the local machine, and sends the data packet to the INPUT chain.

(3) IPVS compares the service requested by the data packet to see if it is a cluster service. If so, it modifies the target IP address of the data packet to the backend server IP, and then sends the data packet to the POSTROUTING chain. At this time, the source IP of the message is CIP and the destination IP is RIP.

(4) The POSTROUTING chain selects a route and sends the data packet to the Real Server.

(5) The Real Server compares and finds that the target is its own IP address, and starts to construct a response message and send it back to the Director Server. At this time, the source IP of the message is RIP and the destination IP is CIP.

(6) Before responding to the client, the Director Server will modify the source IP address to its own VIP address and then respond to the client. At this time, the source IP of the message is VIP and the destination IP is CIP.
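On the real servers, the only LVS-specific requirement in NAT mode is the default route implied by steps (5) and (6): responses must return through the Director so that it can rewrite the source address back to the VIP. A hedged sketch, assuming the Director's internal address (DIP) is 172.16.0.1 as in the example above:

```bash
# On each Real Server: point the default route at the Director's DIP
ip route add default via 172.16.0.1

# Verify that responses will leave via the Director
ip route show
```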

9. DR Mode - Direct Routing Mode

DR mode uses direct routing to implement the virtual server. Its connection scheduling and management are the same as in VS/NAT and VS/TUN; only the packet forwarding method differs. VS/DR rewrites the MAC address of the request packet and sends the request to the Real Server, and the Real Server returns the response directly to the client, eliminating the IP tunneling overhead of VS/TUN. This gives the highest performance of the three forwarding methods, but it requires that the Director Server and the Real Servers each have a network card on the same physical network segment.

The Director and the RealServers must be connected through a contiguous, uninterrupted LAN segment. The VIP bound on each RealServer is configured on a non-ARP network device (such as lo or tunl): the Director's VIP is visible to the outside world, while the RealServers' VIP is not. A RealServer's own address (RIP) can be either a private address or a public address.

DR mode rewrites the destination MAC address of the request packet and sends the request to the real server, and the real server returns its response directly to the client. Like TUN mode, DR mode can greatly improve the scalability of the cluster; moreover, it has no IP tunneling overhead and does not require the real servers to support the IP tunnel protocol. However, the scheduler (LB) and the real servers (RS) must each have a network card on the same physical network segment, i.e. they must be in the same LAN environment.
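A minimal director-side VS/DR sketch, reusing the assumed example addresses (VIP 192.168.10.100, real servers 192.168.10.11 and 192.168.10.12 on the same LAN segment, director NIC eth0):

```bash
# On the Director: the VIP is configured on a normal interface so the Director answers ARP for it
ip addr add 192.168.10.100/32 dev eth0

# Virtual service on the VIP
ipvsadm -A -t 192.168.10.100:80 -s rr

# Real servers added in direct-routing (gatewaying) mode with -g
ipvsadm -a -t 192.168.10.100:80 -r 192.168.10.11:80 -g
ipvsadm -a -t 192.168.10.100:80 -r 192.168.10.12:80 -g
```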

9.1. Working Principle of DR Mode

(1) First, the user sends a request from CIP to the VIP.

(2) As the figure shows, the same VIP must be configured on both the Director Server and the Real Servers. When the user request reaches the front-end router of the cluster network, the source address of the packet is CIP and the destination address is VIP. The router broadcasts an ARP request asking who owns the VIP; since every node in the cluster has the VIP configured, whichever node answers first would receive the request, and the cluster would be meaningless. To prevent this, we can either configure a static route (or static ARP entry) on the gateway router so that the VIP maps to the Director Server, or use some mechanism to stop the Real Servers from answering ARP requests for the VIP. Either way, the user's request packets then always pass through the Director Server.

(3) When the user request reaches the Director Server, the requested data packet will first go to the PREROUTING chain in the kernel space. At this time, the source IP of the packet is CIP and the destination IP is VIP.

(4) PREROUTING checks and finds that the destination IP of the data packet is the local machine, so it sends the data packet to the INPUT chain.

(5) IPVS checks whether the packet requests a defined cluster service. If so, it rewrites the source MAC address of the request frame to the MAC address of the DIP interface and the destination MAC address to the MAC address of the chosen RS (the interface holding the RIP), and then passes the packet to the POSTROUTING chain. The source and destination IP addresses are not modified; only the MAC addresses are changed.

(6) Since the DS and the RS are on the same network, the packet is delivered at Layer 2. The POSTROUTING chain sees that the destination MAC address is the RS's MAC address, and the frame is sent to that Real Server.

(7) The RS sees that the destination MAC address of the request frame is its own MAC address and accepts the packet. After processing is complete, the response packet is built on the lo interface (where the VIP is configured) and sent out through the eth0 network card. At this time, the source IP address is VIP and the destination IP is CIP.

(8) The response message is finally delivered to the client.

There are three ways to handle the ARP problem when configuring DR:

  • The first method: statically bind the VIP to the Director's MAC address on the router. Once bound, the router never needs to ARP for the VIP again, so all traffic for the VIP goes to the Director. The binding is static, so it does not expire. The prerequisite is that we have permission to configure the routing device; if the router belongs to the carrier and we cannot operate it, this simple method is not feasible.
  • The second method: some distributions (such as Red Hat) ship a tool called arptables, which is similar to iptables but performs access control based on ARP/MAC. We define arptables rules on each Real Server so that it does not answer ARP requests whose target address is the local VIP, or so that such ARP replies are not allowed to leave the host. The gateway therefore never learns the Real Servers' VIP, and only the Director's ARP reply for the VIP reaches it. This method also works.
  • The third method: relatively new kernels provide two kernel parameters. The first, arp_ignore, defines the response level when ARP requests are received; the second, arp_announce, defines the announcement level when the host advertises its own addresses. [Tip: current systems generally support these parameters in the kernel, and adjusting them is the simplest approach, since it relies neither on extra tools such as arptables nor on access to external routing configuration. In practice this third method is the one usually used.]

arp_ignore: defines the response level when ARP requests are received:

0: Reply to ARP requests for any local IP address configured on any interface. (default)

1: Reply only if the target IP address is a local address configured on the interface that received the request.

2: Reply only if the target IP address is a local address configured on the receiving interface and the sender's IP address is in the same subnet as that interface.

3: Do not reply for local addresses with host scope; only reply for global and link-scope addresses.

4-7: Reserved, not used.

8: Do not reply to any ARP request for local addresses.

arp_announce: defines the announcement level when the host advertises its local addresses:

0: Use any local address, configured on any interface. (default)

1: Try to avoid advertising local addresses that are not in the target's subnet for this interface.

2: Always use the best local address: ignore the source address of the IP packet and prefer an address configured on the outgoing interface.
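On each Real Server, the third method therefore comes down to a few kernel parameters plus binding the VIP to the loopback interface. A hedged sketch, reusing the assumed example VIP 192.168.10.100:

```bash
# On each Real Server: do not answer ARP for addresses that are only on lo (the VIP),
# and always advertise the outgoing interface's own address, never the VIP
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.lo.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
sysctl -w net.ipv4.conf.lo.arp_announce=2

# Bind the VIP to lo with a /32 mask so the RS accepts packets addressed to the VIP
# without announcing it on the LAN
ip addr add 192.168.10.100/32 dev lo
```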

9.2. Characteristics of DR Mode

  • Ensure that the front-end router sends all packets whose destination address is the VIP to the Director Server, not to an RS.
  • The Director and the RS share the same VIP.
  • The RS can use a private address or a public address; if a public address is used, the RIP can be accessed directly over the Internet.
  • The RS and the Director Server must be on the same physical network.
  • All request packets go through the Director Server, but response packets do not pass through it.
  • Neither address translation nor port translation is supported.
  • The RS can run most common operating systems.
  • The RS gateway must never point to the DIP (because responses are not allowed to go back through the Director).
  • The VIP is configured on the lo interface of the RS.
  • DR mode is the most widely used mode in practice.
  • Disadvantage: the RS and the DS must be in the same machine room.

10. Tunnel Mode

10.1. Working Principle of Tunnel Mode

(1) When the user request reaches the Director Server, the request packet first goes to the PREROUTING chain in kernel space. At this time, the source IP of the packet is CIP and the destination IP is VIP.

(2) PREROUTING checks and finds that the destination IP of the data packet is the local machine, and sends the data packet to the INPUT chain.

(3) IPVS checks whether the packet requests a defined cluster service. If so, it encapsulates the request packet in an additional IP header whose source IP is DIP and destination IP is RIP, and then sends it to the POSTROUTING chain. At this time, the outer source IP is DIP and the outer destination IP is RIP.

(4) The POSTROUTING chain sends the data packet to the RS based on the latest encapsulated IP message (because an extra IP header is encapsulated in the outer layer, it can be understood as being transmitted through a tunnel at this time). At this time, the source IP is DIP and the target IP is RIP.

(5) After receiving the message, RS finds that it is its own IP address, so it receives the message. After removing the outermost IP, it will find that there is another layer of IP header inside, and the target is its own lo interface VIP. Then RS starts to process the request. After processing is completed, it sends it to the eth0 network card through the lo interface and then passes it out. At this time, the source IP is VIP and the target IP is CIP.

(6) The response message is finally delivered to the client.

10.2. Features of Tunnel Mode

  • RIP, VIP, and DIP should all be public network addresses.
  • The RS gateway cannot and does not point to the DIP.
  • All request packets go through the Director Server, but response packets do not pass through it.
  • Port mapping is not supported.
  • The RS operating system must support the IP tunneling (IPIP) protocol.
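A hedged VS/TUN sketch: the director forwards with -i (IP-in-IP encapsulation), and each real server needs the ipip module and the VIP on its tunnel interface. The VIP 192.168.10.100 and real-server address 203.0.113.11 are made-up examples:

```bash
# On the Director: real servers added in tunneling mode with -i
ipvsadm -A -t 192.168.10.100:80 -s rr
ipvsadm -a -t 192.168.10.100:80 -r 203.0.113.11:80 -i

# On each Real Server: load the IPIP module and put the VIP on the tunnel interface
modprobe ipip
ip addr add 192.168.10.100/32 dev tunl0
ip link set tunl0 up

# Relax reverse-path filtering so decapsulated packets addressed to the VIP are accepted
sysctl -w net.ipv4.conf.tunl0.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
```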

11. LVS Scheduling Algorithm

Fixed scheduling algorithms: rr, wrr, dh, sh

Dynamic scheduling algorithms: wlc, lc, lblc, lblcr

Fixed scheduling algorithm: the scheduler will not judge whether the backend server is busy or not, and will dispatch the request as usual.

Dynamic scheduling algorithm: The scheduler will determine the busyness of the backend server and then dynamically dispatch requests based on the scheduling algorithm.
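With ipvsadm, the scheduling algorithm is chosen per virtual service via the -s option, using the algorithm names listed above (a hedged sketch with the example VIP used earlier):

```bash
# Create a service with weighted least-connection (dynamic) scheduling
ipvsadm -A -t 192.168.10.100:80 -s wlc

# Later, switch the same service to source hashing (fixed) with -E (edit virtual service)
ipvsadm -E -t 192.168.10.100:80 -s sh
```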

11.1. rr: round robin

This algorithm is the simplest, which is to dispatch requests to different servers in a round-robin manner. The biggest feature of this algorithm is its simplicity. The polling algorithm assumes that all servers have the same ability to process requests. The scheduler will evenly distribute all requests to each real server, regardless of the backend RS configuration and processing capacity, and distribute them very evenly. The disadvantage of this scheduling is that no matter how busy the backend server is, the scheduler will send the requests in sequence. If the requests on server A are completed quickly, but the requests on server B continue, server B will be very busy all the time, while server A will be very idle, which will not achieve a balance.

11.2. wrr: weighted round robin

This algorithm adds the concept of weight to the rr algorithm. Each RS can be assigned a weight; the higher the weight, the more requests it receives (a weight of 0 means the server is given no new requests). It is mainly an optimization of and supplement to rr: LVS takes the performance of each server into account by giving it a weight. If server A has weight 1 and server B has weight 2, the requests scheduled to B will be roughly twice those scheduled to A. The higher a server's weight, the more requests it handles.
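Continuing the hedged example above, the weights are assigned when the real servers are added, and can be changed on a running service with -e:

```bash
# Server .12 (weight 2) receives roughly twice as many requests as server .11 (weight 1)
ipvsadm -A -t 192.168.10.100:80 -s wrr
ipvsadm -a -t 192.168.10.100:80 -r 192.168.10.11:80 -g -w 1
ipvsadm -a -t 192.168.10.100:80 -r 192.168.10.12:80 -g -w 2

# Drain server .11 gracefully by lowering its weight to 0 (no new connections are scheduled to it)
ipvsadm -e -t 192.168.10.100:80 -r 192.168.10.11:80 -g -w 0
```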

11.3. dh: destination hashing scheduling algorithm

Simply put, requests of the same type are assigned to the same backend server; for example, requests ending in .jpg and similar suffixes are forwarded to the same node. This algorithm is not aimed at true load balancing but at classified management of resources; it is mainly used in systems with cache nodes to improve the cache hit rate.

11.4. sh: Source hash scheduling algorithm

That is, requests from the same IP address are sent to the same backend server, provided that server is working properly and not overloaded. This can solve the session-sharing problem, but there is a drawback: many companies, communities, and schools share a single public IP address, which leads to an uneven distribution of requests.

11.5. lc: least-connection

This algorithm will decide who to distribute the request to based on the number of connections of the backend RS. For example, if the number of connections of RS1 is less than that of RS2, the request will be sent to RS1 first. The problem here is that session persistence, i.e. session sharing, cannot be achieved.

11.6. wlc: weighted least-connection

This has an additional concept of weighting compared to the minimum number of connections, that is, a weight value is added on the basis of the minimum number of connections. When the number of connections is similar, the larger the weight value, the higher the priority of the request being assigned.

11.7. lblc: locality-based least-connection scheduling algorithm

Requests for the same destination address are assigned to the same RS if that server is not yet overloaded; otherwise they are assigned to the RS with the fewest connections, which is then considered first for subsequent assignments.

The above is the detailed content for in-depth understanding of Linux load balancing LVS. For more information about Linux load balancing LVS, please pay attention to other related articles on 123WORDPRESS.COM!
