Detailed explanation of nginx proxy_cache cache configuration

Preface:

My work involves online live streaming, where video playback and download rely on some video delivery techniques. To serve a complete video, the current mainstream approach is to slice the video stream into segments and store them on a file server; when a user wants to watch, a back-to-origin server requests the slices one by one from the file server and returns them to the user for playback.

Today we will focus on the origin server's cache configuration and a reasonable caching strategy.

Using the case of configuring a cache for the origin server, this article explains a complete cache configuration mechanism in detail, which can be applied to any other caching scenario.

Today's explanation is divided into four points:

  • What does the back-to-origin server do?
  • Why the origin server needs a cache
  • How to configure the cache
  • How to build a complete caching mechanism for real business scenarios

Back to the origin server:

The back-to-origin server is referred to simply as the origin server below. As shown in the figure, during the file download process it sits between the CDN and the file server and acts as a download hub.

Origin architecture: the origin is an nginx+php web server stack, as shown in the figure:


However, if the origin server simply receives a request, downloads the resource, and returns it, the following unoptimized issues inevitably arise:

1. The CDN may trigger multiple back-to-origin requests for the same resource.

2. The origin downloads the same resource multiple times, wasting network bandwidth and adding unnecessary latency.

To address these problems, a cache layer is added at the origin, using nginx's built-in proxy_cache module.

Proxy_cache principle:

The working principle of the proxy_cache module is shown in the figure:


How to configure the proxy_cache module

Add the following code to the nginx.conf file:

http {
  ......
  proxy_cache_path /data/nginx/tmp-test levels=1:2 keys_zone=tmp-test:100m inactive=7d max_size=1000g;
}

Code Explanation:

proxy_cache_path: the path where cache files are stored

levels: the directory hierarchy of the cache; levels=1:2 means two levels of directories (1 hex character for the first level, 2 for the second)

keys_zone: the name of the cache zone and the size of its shared memory (here 100 MB, used for keys and metadata)

inactive: a cached item that is not accessed within this time (here 7 days) is removed

max_size: the maximum size of the cache on disk; when the cache is full, the least recently used entries are evicted

After the configuration is complete, reload or restart nginx. If no error is reported, the proxy_cache configuration has taken effect.

Check the /data/nginx/ directory and you will find that a tmp-test folder has been created.
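The on-disk layout under that folder is deterministic: the cache file name is the MD5 of the proxy_cache_key, and the levels=1:2 directories are taken from the tail of that digest. The following sketch reproduces the layout (the URI /tmp-test/demo.mp4 is a hypothetical example):

```python
import hashlib

def nginx_cache_path(root, key, levels=(1, 2)):
    """Reproduce nginx's cache file layout: the cache file name is the
    MD5 hex digest of proxy_cache_key, and each level in `levels` takes
    that many hex characters from the end of the digest, right to left."""
    digest = hashlib.md5(key.encode()).hexdigest()
    parts, pos = [], len(digest)
    for n in levels:
        parts.append(digest[pos - n:pos])
        pos -= n
    return "/".join([root.rstrip("/")] + parts + [digest])

# With proxy_cache_key $uri and levels=1:2, a request for
# /tmp-test/demo.mp4 would be cached under a path like this one.
print(nginx_cache_path("/data/nginx/tmp-test", "/tmp-test/demo.mp4"))
```

This mirrors the example in the nginx docs, where a key hashing to ...65029c lands in .../c/29/b7f54b2df7773722d382f4809d65029c.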

How to use proxy_cache

Add the following code to your corresponding nginx vhost server configuration file:

location /tmp-test/ {
    proxy_cache tmp-test;
    proxy_cache_valid 200 206 304 301 302 10d;
    proxy_cache_key $uri;
    proxy_set_header Host $host:$server_port;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass http://127.0.0.1:8081/media_store.php/tmp-test/;
}

Configuration item description:

proxy_cache tmp-test: use the cache zone named tmp-test defined above

proxy_cache_valid 200 206 304 301 302 10d: cache responses with these HTTP status codes for 10 days

proxy_cache_key $uri: use the URI as the unique cache key; the key is hashed to locate the cache file

proxy_set_header: set custom HTTP headers to be sent to the real backend server

proxy_pass: the upstream path requests are forwarded to; note whether the trailing / is required

At this point, the most basic proxy_cache function has been configured successfully. When the uri successfully matches the location, proxy_cache will take effect.

After adding proxy_cache, the request process changes:

1. First visit:


On the first access, proxy_cache finds no corresponding cache file (a cache MISS), so when the first request completes, proxy_cache saves the response:

2. The cache is saved, as shown in the figure:


3. When the same URL is requested a second time and reaches the origin again, proxy_cache finds the corresponding cache file (a cache HIT) and returns it directly to the requester without invoking the PHP program again, as shown in the figure:


Raising some questions:

At this point, the most basic proxy_cache configuration and access flow have been covered. However, the basic configuration often cannot meet real business needs, and the following questions and requirements come up:

  1. Cache files need to be purged proactively
  2. The cache is written to a single disk; what happens when the disk is full?
  3. How to make the origin support resumable (range) downloads, and what caching strategy to use for them
  4. If clients issue Range requests (segmented downloads) for a large resource with the same URI, how are the requests distinguished?
  5. The requester also needs to be told the resource's expiration time
  6. Log statistics: how to log HIT/MISS fields and gather statistics on them

Faced with the above questions, we will solve them one by one.

Problem 1: Proactively purging the cache

Use the nginx proxy_cache_purge module, which is the counterpart of proxy_cache with the opposite function. Design: start another server block in nginx; when a resource's cache needs to be purged, access this server from the local machine. For example, requesting 127.0.0.1:8083/tmp-test/TL39ef7ea6d8e8d48e87a30c43b8f75e30.txt purges that resource's cache file. Configuration:

location /tmp-test/ {
    allow 127.0.0.1;                  # only allow access from this machine
    deny all;                         # deny all other IPs
    proxy_cache_purge tmp-test $uri;  # purge the cache entry
}

proxy_cache_purge: the cache-purging directive; tmp-test: the keys_zone to purge from; $uri: the parameter used to generate the key. The proxy_cache_purge flow is shown in the figure:
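The purge location can live in its own server block. A minimal sketch, assuming port 8083 as in the example above; everything else mirrors the tmp-test configuration:

```nginx
# Hypothetical standalone purge server; port 8083 follows the example
# above, and the zone name matches the tmp-test keys_zone.
server {
    listen 127.0.0.1:8083;

    location /tmp-test/ {
        allow 127.0.0.1;                  # purge only from this machine
        deny all;
        proxy_cache_purge tmp-test $uri;  # purge the entry keyed by $uri
    }
}
```

Note that proxy_cache_purge is not part of stock open-source nginx; it comes from the third-party ngx_cache_purge module (or NGINX Plus), so nginx must be built with it.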


Question 2: What to do when the cache disk is full?

Since the cache path is a single directory, only one disk gets written, and it fills up quickly. There are two ways to solve this:

1. Combine multiple disks into a disk array. Disadvantage: the usable storage space shrinks.

2. Cleverly exploit the directory structure of proxy_cache_path. With levels=1:2, cache files live in a two-level directory tree whose names are derived from the hash of the key, as shown in the figure:


There are 16 first-level directories (0-f) and 16*16=256 second-level directories under each, 16*16*16=4096 second-level directories in total. Make a soft link for each first-level directory 0-f, pointing it at a directory on the disk you choose, as shown in the figure:


Through the soft links, directories on different disks become the actual storage paths, so multiple disks are used and no single disk fills up.
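A sketch of the symlink setup, splitting the 16 first-level directories across two disks. The /tmp paths below are placeholders for demonstration; in production they would be the real cache root and per-disk directories:

```shell
#!/bin/sh
# Demo: spread the 16 first-level cache directories (0-f) across two
# disks via symlinks. All paths here are placeholder demo paths.
CACHE_ROOT=/tmp/nginx-cache-demo
DISK1=/tmp/nginx-demo-disk1
DISK2=/tmp/nginx-demo-disk2

mkdir -p "$CACHE_ROOT"
for d in 0 1 2 3 4 5 6 7; do          # first 8 directories on disk 1
    mkdir -p "$DISK1/$d"
    ln -sfn "$DISK1/$d" "$CACHE_ROOT/$d"
done
for d in 8 9 a b c d e f; do          # remaining 8 directories on disk 2
    mkdir -p "$DISK2/$d"
    ln -sfn "$DISK2/$d" "$CACHE_ROOT/$d"
done
ls -l "$CACHE_ROOT"
```

The links must exist before nginx starts writing to the cache, otherwise nginx creates the first-level directories itself as real directories on the original disk.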

Question 3: Supporting Range requests (resumable downloads)

After the cache proxy is added, Range requests initiated by clients fail, as shown in the figure:


The reason the Range header is not passed downstream is as follows:

When the cache proxy forwards an HTTP request to the backend server, it changes the HTTP headers and drops some of them. The Range header is dropped, so the backend nginx server never receives it, and the segmented download fails. The forwarded headers therefore need to be configured explicitly. For example:

location /tmp-test/ {
        proxy_cache tmp-test;
        proxy_cache_valid 200 206 304 301 302 10d;
        proxy_cache_key $uri;
        proxy_set_header Range $http_range;
        proxy_pass http://127.0.0.1:8081/media_store.php/tmp-test/;
}

The key line is proxy_set_header Range $http_range;: it takes the Range value from the client request ($http_range) and sets it as the Range header of the proxied request.

Question 4: With Range support enabled, proxy_cache_key must be reconfigured

If clients issue Range requests (segmented downloads) against a large resource with the same URI, how does the proxy cache tell the cached entries apart? With proxy_cache_key $uri, a normal request and a Range request for the same URI share the same key, so proxy_cache is likely to return the wrong content. As shown in the figure:


The solution: change the configuration to proxy_cache_key $http_range$uri;. This makes the key unique per range, so the content cached first and the cached content retrieved later stay consistent, whether the request is a normal one or any particular Range request.
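Putting questions 3 and 4 together, a location block along these lines (same upstream as the earlier example) both forwards the Range header and keys the cache on it:

```nginx
# Sketch combining range forwarding (question 3) with a range-aware
# cache key (question 4); the upstream address is from the earlier example.
location /tmp-test/ {
    proxy_cache tmp-test;
    proxy_cache_valid 200 206 304 301 302 10d;
    proxy_cache_key $http_range$uri;          # distinct key per range
    proxy_set_header Range $http_range;       # pass Range to the backend
    proxy_pass http://127.0.0.1:8081/media_store.php/tmp-test/;
}
```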

Question 5: How to configure the return expiration time

The backend uses the returned expiration time to tell the requester which resources may be cached and which may not.

Parameter          Normal request   Range request
Expiration time    returned         not returned

To prevent the requester from caching a partial (range) response as the complete resource, we return an expiration time for normal requests and omit it for Range requests. This can be configured in nginx as follows:

location /media_store.php {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index media_store.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
    if ($http_range = '') {
        expires 2592000s;
    }
}

The $http_range check is added in the backend location that proxy_pass points at; expires sets the expiration time, and 2592000s (30 days) is the cache lifetime returned to the client.

Question 6: How to expose cache hits in the HTTP headers and in the nginx log

Solution:

Use the nginx variable $upstream_cache_status, which records the cache status of the request:

HIT on a cache hit, MISS on a miss (other values such as EXPIRED, BYPASS, and STALE also exist).

Add this to the nginx server block that returns the response:

add_header Nginx-Cache "$upstream_cache_status";

And add the variable to the nginx log format:

log_format combinedio …$upstream_cache_status;
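As a sketch, a complete combinedio definition might extend the stock "combined" format with the cache status (the field order here is an assumption, matching the "…" in the line above):

```nginx
# Hypothetical full definition: the stock "combined" fields plus
# $upstream_cache_status appended at the end.
log_format combinedio '$remote_addr - $remote_user [$time_local] '
                      '"$request" $status $body_bytes_sent '
                      '"$http_referer" "$http_user_agent" '
                      '$upstream_cache_status';
access_log /var/log/nginx/access.log combinedio;
```

Counting HIT vs. MISS lines in this log then gives the cache hit ratio.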

Screenshot of the HTTP response headers:


Screenshot of nginx log:


Summary:

This concludes the walkthrough of a complete caching strategy. The solution covers not only the basic cache configuration but also the problems met in real deployments, such as disk expansion, cache purging, resumable downloads, cache expiration times, and cache-hit reporting. Applied flexibly, it can handle most scenarios, however complex. These are the pitfalls I have hit in my work, refined and summarized over time; I hope they are helpful to readers.

