Deploy Varnish cache proxy server based on Centos7

1. Varnish Overview

1. Introduction to Varnish

Varnish is a high-performance, open-source reverse proxy server and HTTP accelerator. It uses a modern software architecture designed to work closely with current hardware. Compared with the traditional Squid, Varnish is faster, performs better, and is easier to manage, which is why many large websites have begun replacing Squid with Varnish and why Varnish has developed so quickly.

Key features of Varnish:

(1) Cache location: either memory or disk can be used;
(2) Log storage: logs are stored in shared memory;
(3) Supports the use of virtual memory;
(4) Precise time management, i.e. control over the time attributes of cached objects;
(5) State-engine architecture: different caching and proxying tasks are handled by different engines;
(6) Cache management: a binary heap is used to manage cached objects, ensuring that expired data is purged promptly;
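The binary-heap idea in point (6) can be sketched in a few lines: because the earliest-expiring object always sits at the heap root, cleanup can stop as soon as the root is still fresh. This is an illustrative toy, not Varnish's actual implementation.

```python
import heapq
import time

class ExpiryHeap:
    """Toy sketch of binary-heap cache expiry (illustrative, not Varnish's code)."""

    def __init__(self):
        self._heap = []   # entries of (expiry_timestamp, key)
        self._store = {}  # key -> cached value

    def put(self, key, value, ttl):
        self._store[key] = value
        heapq.heappush(self._heap, (time.time() + ttl, key))

    def evict_expired(self, now=None):
        now = time.time() if now is None else now
        evicted = []
        # The earliest expiry is always at the root, so we stop
        # as soon as the root object is still fresh.
        while self._heap and self._heap[0][0] <= now:
            _, key = heapq.heappop(self._heap)
            evicted.append(key)
            self._store.pop(key, None)
        return evicted
```

This is why a heap gives "timely data cleanup": the expiry thread never scans the whole cache, it only pops expired roots.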

2. Comparison and similarities between Varnish and Squid

  • Both are open-source software;
  • Both are reverse proxy servers;

Advantages of Varnish

(1) Stability: under the same workload, the Squid server fails more often than Varnish, because Squid needs to be restarted frequently;
(2) Faster access: Varnish reads all cached data directly from memory, while Squid reads from disk;
(3) More concurrent connections: Varnish establishes and releases TCP connections much faster than Squid;

Disadvantages of Varnish

(1) Once the Varnish process restarts, all cached data is released from memory and every request goes to the backend servers; under high concurrency this puts great pressure on the backend;
(2) Behind a load balancer, requests for the same URL may land on different Varnish servers, so each of them has to fetch from the backend; the same URL then ends up cached on several servers, wasting Varnish's cache capacity and degrading performance;

Solutions to Varnish Disadvantages

For disadvantage 1: when traffic is heavy, it is recommended to use Varnish's memory cache mode backed by multiple Squid/Nginx servers. This mainly prevents a flood of requests from penetrating to the origin when the Varnish service or server in front is restarted: Squid/Nginx acts as a second-layer cache and compensates for the fact that Varnish's in-memory cache is lost on restart.
For disadvantage 2: configure URL hashing on the load balancer so that each URL is pinned to a single Varnish server;
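The URL-hashing fix for disadvantage 2 can be sketched like this: hash the URL and take it modulo the node count, so every request for a given URL lands on the same Varnish server. The node names are hypothetical.

```python
import hashlib

def pick_varnish_node(url, nodes):
    """Pin each URL to one cache node by hashing the URL (illustrative sketch)."""
    digest = hashlib.md5(url.encode()).hexdigest()
    # The same URL always produces the same digest, hence the same node.
    return nodes[int(digest, 16) % len(nodes)]
```

A real load balancer (e.g. Nginx with `hash $request_uri;`) does the same thing internally; the point is determinism, not the specific hash function.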

3. How Varnish works

When the Varnish server receives a request from a client, it first checks whether there is data in the cache. If so, it responds directly to the client. If not, it requests the corresponding resource from the backend server, caches it locally on the Varnish server, and then responds to the client.

Whether data is cached is decided by rules and by the type of the requested page, for example based on the Cache-Control header in the request and on whether a cookie is set. These rules are implemented in the VCL configuration file.
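The caching decision described above can be sketched as a small predicate over the request/response headers. This is a simplified illustration of the kind of rule VCL expresses, not Varnish's built-in logic.

```python
def is_cacheable(headers):
    """Sketch of a VCL-style cacheability rule: skip caching when
    Cache-Control forbids it or a Cookie marks the content as personalized."""
    cc = headers.get("Cache-Control", "").lower()
    if "no-store" in cc or "private" in cc:
        return False
    if "Cookie" in headers:  # personalized content should not be shared
        return False
    return True
```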

4. Varnish simple architecture

Varnish is divided into management process and child process

  • Management process: manages the child process, compiles the VCL configuration, and applies it to the different state engines;
  • Child process: spawns a thread pool whose threads handle user requests and return results via hash lookup;

Common threads generated by child processes are

  • Accept thread: receives new connection requests and responds;
  • Worker thread: handles sessions and processes requested resources;
  • Expiry thread: clears expired content in the cache;
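The accept/worker split above can be sketched with a queue and a small thread pool: one side enqueues incoming requests, worker threads drain the queue. This is a generic illustration of the pattern, not Varnish's threading code.

```python
import queue
import threading

def run_pool(requests, n_workers=2):
    """Toy accept/worker sketch: the caller plays the accept thread,
    a pool of worker threads services the queued requests."""
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            req = tasks.get()
            if req is None:       # poison pill: shut this worker down
                break
            with lock:
                results.append(f"served:{req}")
            tasks.task_done()

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for r in requests:            # the "accept" side hands work to the pool
        tasks.put(r)
    tasks.join()                  # wait until every request is served
    for _ in workers:
        tasks.put(None)
    for w in workers:
        w.join()
    return results
```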

5. Varnish main configuration part

  • Backend configuration: Add a reverse proxy server node to Varnish, at least one;
  • ACL configuration: add access control lists to Varnish and allow or deny access based on these lists;
  • Probes configuration: Add rules to Varnish to detect whether the backend server is normal, so as to facilitate switching or disabling the corresponding backend server;
  • Directors configuration: add load balancing mode to Varnish to manage multiple backend servers;
  • Core subroutine configuration: add backend server switching, request caching, access control, error handling and other rules to Varnish;

6. VCL built-in variables (also called objects)


(1) req: variables available while Varnish handles the client's request;
(2) bereq: variables available while Varnish requests the backend server;
(3) beresp: variables available while Varnish processes the backend server's response;
(4) resp: variables available while Varnish responds to the client;
(5) obj: the cache object, i.e. the cached backend response;
(6) now: returns the current timestamp;

Client

client.ip: returns the client's IP address
client.port: returns the client's source port (in Varnish 4.0 and later this requires the std module: import std; then std.port(client.ip))
client.identity: the client identification string, used to identify the client (defaults to the client IP)

server

server.hostname: the server's hostname
server.identity: the server identification string
server.ip: returns the server's IP address
server.port: returns the server's port number (requires the std module: std.port(server.ip))

Client request req (the object sent by the client request)

  • req: the data structure of the entire client request
  • req.backend_hint: specifies the backend node for the request (e.g. sending .gif requests to an image server)
  • req.can_gzip: whether the client accepts gzip transfer encoding (most browsers support common compression formats)
  • req.hash_always_miss: force a cache miss, i.e. fetch fresh data instead of reading from the cache
  • req.hash_ignore_busy: ignore busy objects in the cache (avoids blocking when, for example, two Varnish servers compete for the same resource)
  • req.http: the headers of the HTTP request
  • req.method: the request method (such as GET or POST)
  • req.proto: the HTTP protocol version used by the client request
  • req.restarts: the number of restarts, with a default maximum of 4 (often used to determine whether the request has already been processed)
  • req.url: the requested URL
  • req.xid: unique request id. Varnish adds an X-Varnish header to the response; the first number is the id of the current request, and the second (present on a hit) is the id of the request that populated the cache.
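The X-Varnish header described under req.xid is easy to interpret programmatically: one number means a miss, two numbers mean a hit. A small parser sketch (the sample header values below are hypothetical):

```python
def parse_x_varnish(header_value):
    """Split an X-Varnish header into (request_xid, cache_xid).
    On a cache hit the header carries two ids; on a miss only the request id."""
    parts = header_value.split()
    request_xid = int(parts[0])
    cache_xid = int(parts[1]) if len(parts) > 1 else None
    return request_xid, cache_xid
```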

Varnish requests the backend server (bereq)

  • bereq: the data structure of the entire backend request
  • bereq.backend: the configuration of the requested backend node
  • bereq.between_bytes_timeout: the timeout between bytes received from the backend
  • bereq.http: the HTTP headers sent to the backend
  • bereq.method: the request method sent to the backend
  • bereq.proto: the HTTP protocol version of the request sent to the backend
  • bereq.retries: the retry count for the same request
  • bereq.uncacheable: the requested data is not cached
  • bereq.url: the URL of the request sent to the backend
  • bereq.xid: unique request id

The backend server returns data to Varnish (beresp)

  • beresp: the backend server's response data
  • beresp.backend.ip: IP address of the backend server that handled the request
  • beresp.backend.name: node name of the responding backend server
  • beresp.do_gunzip: defaults to false; decompress the object before caching
  • beresp.grace: extra grace time granted after cache expiry
  • beresp.http: the HTTP headers of the response
  • beresp.keep: how long to keep the cached object beyond its lifetime
  • beresp.proto: HTTP version of the response
  • beresp.reason: HTTP status message returned by the backend server
  • beresp.status: status code returned by the backend server
  • beresp.storage_hint: specifies which storage (e.g. memory) to save to
  • beresp.ttl: the remaining cache lifetime of the object
  • beresp.uncacheable: do not cache the data

storage

  • storage.<name>.free_space: free space in the storage (bytes)
  • storage.<name>.used_space: used space in the storage (bytes)
  • storage.<name>.happy: health status of the storage node
  • deliver: send data to the client
  • fetch: fetch data from the backend and cache it locally

7. Specific Function Statements

  • ban(expression): removes matching objects from the cache;
  • call(subroutine): calls a subroutine;
  • hash_data(input): adds input to the hash key;
  • new(): creates a new VCL object; only allowed in the vcl_init subroutine;
  • return(): ends the current subroutine and specifies the next action;
  • rollback(): restores the HTTP headers to their original state (deprecated; use std.rollback() instead);
  • synthetic(STRING): synthesizer; defines the page and status code returned to the client;
  • regsub(str, regex, sub): replaces the first match of the regular expression in the string;
  • regsuball(str, regex, sub): replaces all matches;
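The difference between regsub() and regsuball() maps directly onto a limited vs. unlimited regex substitution. A quick sketch of the equivalent behavior in Python:

```python
import re

def regsub(s, regex, sub):
    # Replace only the first match, like VCL's regsub().
    return re.sub(regex, sub, s, count=1)

def regsuball(s, regex, sub):
    # Replace every match, like VCL's regsuball().
    return re.sub(regex, sub, s)
```

Note that VCL uses PCRE while Python's `re` is a close but not identical dialect; simple patterns behave the same.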

8. Varnish request processing steps


Varnish request processing steps

Receive state (vcl_recv): the entry state of request processing. Based on the VCL rules it decides whether the request should go to pass (vcl_pass), pipe (vcl_pipe), or lookup (local cache query).
Lookup state: the request is looked up in the hash table. If found, it enters the hit (vcl_hit) state; otherwise it enters the miss (vcl_miss) state.
Pass (vcl_pass) state: the request goes directly to the backend, i.e. into the fetch (vcl_fetch) state.
Fetch (vcl_fetch) state: the request is sent to the backend, the data is obtained, and it is stored locally according to the settings.
Deliver (vcl_deliver) state: the acquired data is sent to the client and the request completes.
Pipe state: a direct connection is established between the client and the backend server, and data is retrieved from the backend without further inspection.
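The state flow above (recv → hash lookup → hit/miss → fetch → deliver) can be walked through with a toy dictionary cache. This is a sketch of the flow just described, not real Varnish:

```python
def handle_request(url, cache, fetch_backend):
    """Toy walk through the recv -> hash -> hit/miss -> fetch -> deliver states."""
    states = ["recv", "hash"]
    if url in cache:                 # lookup succeeds: hit
        states += ["hit", "deliver"]
        return cache[url], states
    states += ["miss", "fetch"]      # lookup fails: go to the backend
    body = fetch_backend(url)
    cache[url] = body                # store the object locally
    states.append("deliver")
    return body, states
```

On the second request for the same URL the backend is never called, which is the whole point of the cache.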

vcl_recv subroutine: the entry point of request processing; hands the request on via return (action);
vcl_pipe subroutine: pipe-mode processing, mainly used to pass the backend response straight through to the client; the response content can still be adjusted before it is returned.
vcl_pass subroutine: pass-mode processing, similar to the hash cache mode but without any caching.
vcl_hit subroutine: called in hash cache mode when a cached object is found; used for cache processing, and may discard or modify the cache.
vcl_miss subroutine: called in hash cache mode when no cached object is found; decides whether to fetch the response from the backend, and can switch to pass mode.
vcl_hash subroutine: generates the hash value used as the cache lookup key. hash_data(string) specifies what the key is composed of, so different cache entries can be produced for the same page based on, for example, the client IP or a cookie.
vcl_purge subroutine: called after a matching cached object has been purged; used to acknowledge purge requests.
vcl_deliver subroutine: the client delivery subroutine, called after vcl_backend_response (non-pipe mode) or after vcl_hit; can be used to append response headers, cookies, etc.
vcl_backend_fetch subroutine: called before a backend request is sent; can change the request address or other information, or abandon the request.
vcl_backend_response subroutine: called after the backend responds; can modify the cache lifetime and other cache-related information.
vcl_backend_error subroutine: called when backend processing fails; handles the error-page display, and can customize the error response or modify beresp.status and beresp.http.Location for redirection.
vcl_synth subroutine: customizes response content; reached via synthetic() and return (synth). Here the error display content can be customized, and resp.status and resp.http.Location can be modified for redirection.
vcl_init subroutine: called first when the VCL is loaded, to initialize VMODs; it does not take part in request processing and runs only once per load.
vcl_fini subroutine: called when the current VCL configuration is unloaded, to clean up VMODs; it does not take part in request processing and runs only after the VCL is discarded.
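The key composition that vcl_hash performs (URL plus Host header, falling back to the server IP) can be sketched as follows; the host names in the test are hypothetical.

```python
import hashlib

def make_cache_key(url, host=None, server_ip="127.0.0.1"):
    """Sketch of vcl_hash-style key composition: hash the URL together with
    the Host header (or the server IP when no Host is present), so the same
    path on different virtual hosts gets distinct cache entries."""
    h = hashlib.sha256()
    h.update(url.encode())
    h.update((host or server_ip).encode())
    return h.hexdigest()
```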

2. Install Varnish

Download varnish package link: https://pan.baidu.com/s/1OfnyR-5xFuxMUYJTnhQesA Extraction code: m9q4

In addition to the Varnish server, bring up two more web servers to provide web pages.

[root@localhost ~]# yum -y install autoconf automake libedit-devel libtool ncurses-devel pcre-devel pkgconfig python-docutils python-sphinx
[root@localhost ~]# tar zxf varnish-4.0.3.tar.gz 
[root@localhost ~]# cd varnish-4.0.3/
[root@localhost varnish-4.0.3]# ./configure && make && make install
[root@localhost varnish-4.0.3]# cp etc/example.vcl /usr/local/var/varnish/
//Copy the sample VCL as the Varnish main configuration file
[root@localhost /]# vim /usr/local/var/varnish/example.vcl
//Edit the Varnish main configuration; replace the original content with the following
vcl 4.0;
import directors;
import std;
backend default {
  .host = "127.0.0.1";
  .port = "80";
}
probe backend_healthcheck {
    .url="/";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;
    .threshold = 3;
}
backend web1 {
    .host = "192.168.148.132";
    .port = "80";
    .probe = backend_healthcheck;
}
backend web2 {
    .host = "192.168.148.133";
    .port = "80";
    .probe = backend_healthcheck;
}
acl purgers {
    "127.0.0.1";
    "localhost";
    "192.168.148.0/24";
    !"192.168.148.133";
}
sub vcl_init {
    new web_cluster = directors.round_robin();
    web_cluster.add_backend(web1);
    web_cluster.add_backend(web2);
}
//Delete all the original subroutines and add the following
sub vcl_recv {
    set req.backend_hint = web_cluster.backend();
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Not Allowed."));
    }
    return (purge);
}
if (req.method != "GET" &&
    req.method != "HEAD" &&
    req.method != "PUT" &&
    req.method != "POST" &&
    req.method != "TRACE" &&
    req.method != "OPTIONS" &&
    req.method != "PATCH" &&
    req.method != "DELETE") {
        return (pipe);
    }
if (req.method != "GET" && req.method != "HEAD") {
    return (pass);
}
if (req.url ~ "\.(php|asp|aspx|jsp|do|ashx|shtml)($|\?)") {
    return (pass);
}
if (req.http.Accept-Encoding) {
    if (req.url ~ "\.(bmp|png|gif|jpg|jpeg|ico|gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)$") {
    unset req.http.Accept-Encoding;
} elseif (req.http.Accept-Encoding ~ "gzip") {
        set req.http.Accept-Encoding = "gzip";
    } elseif (req.http.Accept-Encoding ~ "deflate") {
        set req.http.Accept-Encoding = "deflate";
    } else {
    unset req.http.Accept-Encoding;
    }
   }
if (req.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico|gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
    unset req.http.cookie;
    return (hash);
}
if (req.restarts == 0) {
    if (req.http.X-Forwarded-For) {
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
    } else {
    set req.http.X-Forwarded-For = client.ip;
    }
}
return (hash);
}
sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
    hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}
sub vcl_hit {
    if (req.method == "PURGE") {
        return (synth(200, "Purged."));
    }
    return (deliver);
}
sub vcl_miss {
  if (req.method == "PURGE") {
        return (synth(404, "Purged."));
    }
    return (fetch);
}
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.CXK = "HIT-from-varnish";
        set resp.http.X-Cache-Hits = obj.hits;
    } else {
    set resp.http.X-Cache = "MISS";
    }
    unset resp.http.X-Powered-By;
    unset resp.http.Server;
    unset resp.http.X-Drupal-Cache;
    unset resp.http.Via;
    unset resp.http.Link;
    unset resp.http.X-Varnish;
    set resp.http.xx_restarts_count = req.restarts;
    set resp.http.xx_Age = resp.http.Age;
    set resp.http.hit_count = obj.hits;
    unset resp.http.Age;
    return (deliver);
}

sub vcl_purge {
    return (synth(200,"success"));
}
sub vcl_backend_error {
    if (beresp.status == 500 ||
        beresp.status == 501 ||
        beresp.status == 502 ||
        beresp.status == 503 ||
        beresp.status == 504) {
        return (retry);
    }
}
sub vcl_fini {
    return (ok);
}
[root@localhost /]# varnishd -f /usr/local/var/varnish/example.vcl -s malloc,200M -a 0.0.0.0:80
//Start the service

The first web server

[root@localhost ~]# yum -y install httpd
[root@localhost ~]# echo aaa > /var/www/html/index.html
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl start httpd

The second web server

[root@localhost ~]# yum -y install httpd
[root@localhost ~]# echo bbb > /var/www/html/index.html
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl start httpd

Restart varnishd as follows:

[root@localhost /]# netstat -anpt | grep 80
[root@localhost /]# killall -9 varnishd
[root@localhost /]# varnishd -f /usr/local/var/varnish/example.vcl -s malloc,200M -a 0.0.0.0:80

Client access is as follows:

Refresh

[root@localhost /]# curl -X "PURGE" 192.168.148.130
// Clear the cache 

Varnish Configuration File Explanation

vcl 4.0;
import directors;
import std;
# Default backend definition. Set this to point to your content server.
probe backend_healthcheck {
    .url = "/";        #Access the backend server root path
    .interval = 5s;    #Request interval
    .timeout = 1s;     #Request timeout
    .window = 5;       #Number of polls: 5
    .threshold = 3;    #3 failures mean the backend server is abnormal
}
backend web1 {                    #Define a backend server
    .host = "192.168.1.7";        #IP or domain name of the backend host
    .port = "80";                 #Port number of the backend server
    .probe = backend_healthcheck; #Health check using the rules defined in backend_healthcheck
}
backend web2 {
    .host = "192.168.1.8";
    .port = "80";
    .probe = backend_healthcheck;
}
acl purgers {        #Define an access control list
    "127.0.0.1";
    "localhost";
    "192.168.1.0/24";
    !"192.168.1.8";
}
sub vcl_init { #vcl_init initialization subroutine: create the backend host group, i.e. directors
    new web_cluster = directors.round_robin(); #Use new to create a director object with the round_robin algorithm
    web_cluster.add_backend(web1);             #Add backend server nodes
    web_cluster.add_backend(web2);
}
sub vcl_recv {
    set req.backend_hint = web_cluster.backend(); #Send the request to the backend nodes defined by web_cluster
    if (req.method == "PURGE") {                  #Check whether the client request method is PURGE
        if (!client.ip ~ purgers) { #If yes, check whether the client's IP address is in the ACL access control list.
            return (synth(405, "Not Allowed.")); #If not, return a 405 status code to the client and return the defined page.
    }
    return (purge); #If it is defined by ACL, it will be handled by purge.
}
if (req.method != "GET" &&
    req.method != "HEAD" &&
    req.method != "PUT" &&
    req.method != "POST" &&
    req.method != "TRACE" &&
    req.method != "OPTIONS" &&
    req.method != "PATCH" &&
    req.method != "DELETE") {    #Check the client's request method
        return (pipe);
    }
if (req.method != "GET" && req.method != "HEAD") {
    return (pass); #If it is not GET or HEAD, pass it.
}
if (req.url ~ "\.(php|asp|aspx|jsp|do|ashx|shtml)($|\?)") {
    return (pass); #When the client accesses a file ending with .php, pass it to pass for processing.
}
if (req.http.Accept-Encoding) {
    if (req.url ~ "\.(bmp|png|gif|jpg|jpeg|ico|gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)$") {
    unset req.http.Accept-Encoding; #Remove the compression types accepted by the client for these file types
    } elseif (req.http.Accept-Encoding ~ "gzip") {
        set req.http.Accept-Encoding = "gzip"; #If there is a gzip type, mark the gzip type.
    } elseif (req.http.Accept-Encoding ~ "deflate") {
        set req.http.Accept-Encoding = "deflate";
    } else {
    unset req.http.Accept-Encoding; #Other undefined pages also cancel the compression type received by the client.
    }
   }
if (req.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico|gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
    unset req.http.cookie; #Cancel the client's cookie value.
    return (hash); #Forward the request to the hash subroutine, that is, check the local cache.
}
if (req.restarts == 0) {                #Only on the first (non-restarted) pass
    if (req.http.X-Forwarded-For) {     #If the header already exists, append the client's IP address
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
    } else {
    set req.http.X-Forwarded-For = client.ip;
    }
}
return (hash);
}
sub vcl_hash {
    hash_data(req.url); #View the page requested by the client and perform hashing
    if (req.http.host) {
        hash_data(req.http.host); #Hash on the client's Host header
    } else {
        hash_data(server.ip); #Set the server IP
    }
    return (lookup);
}
sub vcl_hit {
    if (req.method == "PURGE") { #If it is HIT and the client request type is PURGE, return the 200 status code and return the corresponding page.
        return (synth(200, "Purged."));
    }
    return (deliver);
}
sub vcl_miss {
  if (req.method == "PURGE") {
        return (synth(404, "Purged.")); #If it is a miss, return 404
    }
    return (fetch);
}
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.CXK = "HIT-from-varnish"; #Set http header X-Cache = hit
        set resp.http.X-Cache-Hits = obj.hits; #Return the number of cache hits
    } else {
    set resp.http.X-Cache = "MISS";
    }
    unset resp.http.X-Powered-By;   #Hide the web framework version
    unset resp.http.Server;         #Hide the server software name
    unset resp.http.X-Drupal-Cache; #Hide the caching framework
    unset resp.http.Via;            #Hide the content source
    unset resp.http.Link;           #Hide the HTML hyperlink address
    unset resp.http.X-Varnish;      #Hide the Varnish id
    set resp.http.xx_restarts_count = req.restarts; #Record the number of request restarts
    set resp.http.xx_Age = resp.http.Age;           #Show the age of the cached file
    #set resp.http.hit_count = obj.hits;            #Show the number of cache hits
    #unset resp.http.Age;
    return (deliver);
}
sub vcl_pass {
    return (fetch); #Fetch the data returned by the backend server
}
sub vcl_backend_response {
    set beresp.grace = 5m; #Extra grace time after cache expiry
    if (beresp.status == 499 || beresp.status == 404 || beresp.status == 502) {
        set beresp.uncacheable = true; #Do not cache when the backend responds with these status codes
    }
    if (bereq.url ~ "\.(php|jsp)(\?|$)") {
        set beresp.uncacheable = true; #Do not cache dynamic (php/jsp) pages
    } else {
        if (bereq.url ~ "\.(css|js|html|htm|bmp|png|gif|jpg|jpeg|ico)($|\?)") {
        set beresp.ttl = 15m; #Cache files with these extensions for 15 minutes
        unset beresp.http.Set-Cookie;
        } elseif (bereq.url ~ "\.(gz|tgz|bz2|tbz|zip|rar|mp3|mp4|ogg|swf|flv)($|\?)") {
            set beresp.ttl = 30m; #Cache for 30 minutes
            unset beresp.http.Set-Cookie;
        } else {
            set beresp.ttl = 10m; #Default cache lifetime: 10 minutes
            unset beresp.http.Set-Cookie;
        }
    }
    return (deliver);
}
sub vcl_purge {
    return (synth(200,"success"));
}
sub vcl_backend_error {
    if (beresp.status == 500 ||
        beresp.status == 501 ||
        beresp.status == 502 ||
        beresp.status == 503 ||
        beresp.status == 504) {
        return (retry); #Retry the request when the backend returns one of these status codes
    }
}
sub vcl_fini {
    return (ok);
}
