Tutorial on processing static resources in Tomcat

Preface

In Tomcat, every request is handled by a servlet, and requests for static resources are no exception. The default web.xml configures a DefaultServlet to serve static resources; it supports caching and resumable downloads (HTTP range requests).

The basic processing steps of DefaultServlet are as follows:

  • Check whether the resource is cached
  • Check whether the conditions in the optional If-* request header fields (such as If-Modified-Since and If-None-Match) are met
  • Set response header fields, such as Content-Type, Content-Length, ETag, and Last-Modified
  • Check whether the conditions for sendfile are met; if not, copy the content to the output stream

Next, we will mainly analyze the design and implementation of resource caching and the processing of the If header field.

1. Resource Cache Design

Disk access is much slower than memory access, so caching some static resources in memory lets the server respond faster.

In version 6.0.53, Tomcat's static resource handling is built on some JNDI APIs, although in practice it has little to do with JNDI. The relevant classes and their core methods and properties are as follows:

Cache related classes:

  • ResourceCache: Cache implementation, providing resource search, loading, and destruction functions
  • CacheEntry: A cache entry, including the cache name, such as /tomcat.gif, the resource and resource attributes, and the corresponding directory

Resource directory related classes:

  • EmptyDirContext: Mainly used in embedded mode, behaves as if no resources are available
  • FileDirContext: File system-based resource directory service
  • WARDirContext: Directory service based on a WAR file
  • Resource: Encapsulates resource content, mainly byte data and an input stream
  • ResourceAttributes: Resource attributes, mainly content length and last modified time
  • ProxyDirContext: A proxy for the resource cache and directory services, providing functions such as looking up cached resources and checking whether a cache entry has expired

By default, the maximum size of the cache is 10 MB, the maximum size of a single cached resource is 512 KB, and the cache TTL is 5 seconds.

Typically, resources are loaded when the Mapper maps a request to the Wrapper that handles static resources. The call chain is as follows:

Mapper.map(MessageBytes, MessageBytes, MappingData)
└─Mapper.internalMap(CharChunk, CharChunk, MappingData)
  └─Mapper.internalMapWrapper(Mapper$Context, CharChunk, MappingData)
    └─ProxyDirContext.lookup(String)
      └─ProxyDirContext.cacheLookup(String)
        └─ResourceCache.lookup(String)
          └─ResourceCache.find(CacheEntry[], String)

Cache entries are kept in an internal array sorted by resource name, where the resource name is the request path. The find method performs a binary search on this array by name. There are two cases: cache hit and cache miss.
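The lookup described above can be sketched as follows. This is a minimal illustration and not Tomcat's actual code: NamedEntry and SortedCacheSketch are hypothetical names, and the copy-on-insert strategy is an assumption made to keep the example short.

```java
/** Hypothetical stand-in for a cache entry; only the name matters here. */
final class NamedEntry {
    final String name;

    NamedEntry(String name) {
        this.name = name;
    }
}

final class SortedCacheSketch {

    /** Binary search by name. Returns the index of the entry, or -1 on a miss. */
    static int find(NamedEntry[] sorted, String name) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = sorted[mid].name.compareTo(name);
            if (cmp == 0) {
                return mid;
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }

    /** Returns a new array with the entry inserted at its sorted position. */
    static NamedEntry[] insert(NamedEntry[] sorted, NamedEntry entry) {
        int pos = 0;
        while (pos < sorted.length && sorted[pos].name.compareTo(entry.name) < 0) {
            pos++;
        }
        NamedEntry[] result = new NamedEntry[sorted.length + 1];
        System.arraycopy(sorted, 0, result, 0, pos);
        result[pos] = entry;
        System.arraycopy(sorted, pos, result, pos + 1, sorted.length - pos);
        return result;
    }
}
```

Keeping the array sorted makes each lookup O(log n) at the cost of an O(n) insert, which is a reasonable trade when reads far outnumber cache loads.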

On a cache miss, the cacheLookup method creates a new CacheEntry object and calls the cacheLoad method to add it to the cache array of ResourceCache. Before the entry is added, the following steps are performed on it:

  • Get and initialize cache resource properties, mainly the file's contentLength and lastModified
  • If the file length is less than 512KB, then load the file content into memory
  • Mark the cache as existing and set the cache timestamp

On a cache hit, the cache entry is validated:

  • Check whether it has expired, that is, whether the current time is greater than the timestamp set for the cache entry
  • If it has expired, check whether the resource content has been modified
  • If it has been modified, clear this cache entry and read the latest content

That is the basic flow of resource caching.
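The validation steps above can be sketched like this. It is a simplified model rather than Tomcat's implementation: TtlEntrySketch and its fields are hypothetical, and detecting a change by comparing content length and last-modified time is an assumption of this sketch.

```java
/** Hypothetical, simplified cache entry holding only what validation needs. */
final class TtlEntrySketch {
    long contentLength; // size recorded when the entry was loaded
    long lastModified;  // modification time recorded when the entry was loaded
    long timestamp;     // the entry is trusted without checks until this time

    TtlEntrySketch(long contentLength, long lastModified, long timestamp) {
        this.contentLength = contentLength;
        this.lastModified = lastModified;
        this.timestamp = timestamp;
    }

    /**
     * Returns true if the cached copy may still be served at time 'now'.
     * A false result means the caller should evict the entry and reload it.
     */
    static boolean revalidate(TtlEntrySketch entry, long now, long ttlMillis,
                              long currentLength, long currentLastModified) {
        if (now <= entry.timestamp) {
            return true; // not expired yet, skip the file system check
        }
        boolean unchanged = entry.contentLength == currentLength
                && entry.lastModified == currentLastModified;
        if (unchanged) {
            entry.timestamp = now + ttlMillis; // extend the trust window
        }
        return unchanged;
    }
}
```

With a TTL of 5 seconds, the file system is consulted at most once per entry every 5 seconds, no matter how many requests hit the entry in between.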

2. Processing of the If Header Field

The client receives the requested resource and caches it. When the client requests the resource again, the server uses specific request header fields to verify whether the resource has been modified. If it has not changed, the server returns only a 304 Not Modified response; otherwise, it returns the content of the resource. This saves bandwidth.

Two pairs of header fields are used for resource validation: Last-Modified + If-Modified-Since and ETag + If-None-Match.

Last-Modified + If-Modified-Since: the timestamps have a resolution of one second. The logic is straightforward: if the last modification time of the server resource is not later than the value of If-Modified-Since, the resource has not changed. The counterpart of If-Modified-Since is If-Unmodified-Since, which works like an assertion: the request is processed only if the resource has not been modified after the given time. If it has been modified after that time, a 412 Precondition Failed error is returned.
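The two date checks can be sketched as follows. This is an illustration under stated assumptions, not Tomcat's code: ModifiedSinceSketch and its method names are hypothetical, header values are assumed to use the RFC 1123 date format, and both sides are truncated to whole seconds before comparison.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

final class ModifiedSinceSketch {

    /** Parses an HTTP date such as "Mon, 06 May 2019 07:20:44 GMT" to epoch millis. */
    static long parseHttpDate(String value) {
        return ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant()
                .toEpochMilli();
    }

    /** If-Modified-Since: true means the server can answer 304 Not Modified. */
    static boolean notModifiedSince(long lastModifiedMillis, String ifModifiedSince) {
        long resourceSeconds = lastModifiedMillis / 1000;
        long headerSeconds = parseHttpDate(ifModifiedSince) / 1000;
        return resourceSeconds <= headerSeconds;
    }

    /** If-Unmodified-Since: true means the server should answer 412 Precondition Failed. */
    static boolean preconditionFailed(long lastModifiedMillis, String ifUnmodifiedSince) {
        long resourceSeconds = lastModifiedMillis / 1000;
        long headerSeconds = parseHttpDate(ifUnmodifiedSince) / 1000;
        return resourceSeconds > headerSeconds;
    }
}
```

Truncating to seconds matters because a file system may record milliseconds while the header cannot carry them; without it, a resource modified at 07:20:44.500 would never compare as unchanged against a header value of 07:20:44.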

Timestamp validation has several disadvantages:

  • A file's modification time may change while its content remains unchanged
  • Changes made within the same second cannot be detected
  • The server may not be able to obtain the file's last modification time accurately

Therefore, HTTP introduced the ETag. An ETag (entity tag) is an identifier for a specific version of a resource; it can be regarded as a token that the server generates for the resource and uses to verify whether the resource has been modified. HTTP only stipulates that the ETag value is placed in double quotes, but does not stipulate what the content is or how it is generated. Tomcat's logic for generating the ETag is "W/\"" + contentLength + "-" + lastModified + "\"", where the 'W/' prefix marks the tag as a weak validator (the prefix itself is case-sensitive).
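The concatenation quoted above can be wrapped in a small helper to see what it produces. EtagSketch and weakEtag are hypothetical names used only for this illustration; the expression inside is the one given in the text.

```java
final class EtagSketch {

    /** Builds a weak ETag from the content length and the last-modified time in millis. */
    static String weakEtag(long contentLength, long lastModifiedMillis) {
        return "W/\"" + contentLength + "-" + lastModifiedMillis + "\"";
    }
}
```

For a 72259-byte file last modified at 1557127244000 (Mon, 06 May 2019 07:20:44 GMT), weakEtag returns W/"72259-1557127244000". Because the tag is derived only from size and modification time, it inherits the limits of timestamp validation, which is one reason it is marked as weak.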

ETag + If-None-Match: the value of If-None-Match consists of one or more ETags separated by commas. If the ETag of the server resource does not match any of them, the requested resource has been modified; otherwise, it has not changed. The header also has a special value, the asterisk (*), which matches any existing version of the resource. It is mainly used when uploading resources, usually with the PUT method, to check whether the resource already exists.

In addition, If-None-Match has a higher priority than If-Modified-Since: if If-None-Match is present, the last modification time is not checked. The counterpart of If-None-Match is If-Match, which also works like an assertion: the request is processed only when the ETag of the resource matches one of the listed values; otherwise a 412 Precondition Failed error is returned. It is often used together with range requests for resumable downloads.
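The matching and precedence rules can be sketched as follows. This is a simplified illustration, not Tomcat's code: IfNoneMatchSketch and its methods are hypothetical, the header is split naively on commas (so tags containing a comma are not handled), and tags are compared with the weak comparison from RFC 7232, which ignores a leading W/.

```java
final class IfNoneMatchSketch {

    /** Strips the weak marker so that W/"x" and "x" compare as equal. */
    static String opaqueTag(String etag) {
        String trimmed = etag.trim();
        return trimmed.startsWith("W/") ? trimmed.substring(2) : trimmed;
    }

    /** True if the current ETag matches any tag listed in If-None-Match. */
    static boolean matches(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch.trim().equals("*")) {
            return true; // "*" matches any existing resource
        }
        String current = opaqueTag(currentEtag);
        for (String candidate : ifNoneMatch.split(",")) {
            if (opaqueTag(candidate).equals(current)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Decides whether a GET can be answered with 304 Not Modified.
     * If If-None-Match is present, the date-based result is ignored.
     */
    static boolean notModified(String ifNoneMatch, boolean dateSaysUnmodified,
                               String currentEtag) {
        if (ifNoneMatch != null) {
            return matches(ifNoneMatch, currentEtag);
        }
        return dateSaysUnmodified;
    }
}
```

The dateSaysUnmodified parameter stands for the outcome of the If-Modified-Since check, passed in as a plain boolean to keep the sketch focused on the ETag logic.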

The core code of Tomcat to implement this part is as follows:

// Returns true if every conditional header is satisfied and the resource should be served;
// returns false if one of the checks has already sent a 304 or 412 response.
protected boolean checkIfHeaders(HttpServletRequest request,
        HttpServletResponse response, ResourceAttributes resourceAttributes)
        throws IOException {
    return checkIfMatch(request, response, resourceAttributes)
            && checkIfModifiedSince(request, response, resourceAttributes)
            && checkIfNoneMatch(request, response, resourceAttributes)
            && checkIfUnmodifiedSince(request, response, resourceAttributes);
}

2.1 A Complete Request Flow

Taking a request for the static resource /main.css as an example, the response headers for the first request are as follows:

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"72259-1557127244000"
Last-Modified: Mon, 06 May 2019 07:20:44 GMT
Content-Type: text/css
Content-Length: 72259
Date: Mon, 06 May 2019 07:20:57 GMT

For the second request, the key request header fields are:

Cache-Control: max-age=0
Connection: keep-alive
Host: localhost:8080
If-Modified-Since: Mon, 06 May 2019 07:20:44 GMT
If-None-Match: W/"72259-1557127244000"

After receiving the request, the server compares the ETag. A match means that the resource has not been modified, and the response is as follows:

HTTP/1.1 304 Not Modified
Server: Apache-Coyote/1.1
ETag: W/"72259-1557127244000"
Date: Mon, 06 May 2019 07:21:46 GMT

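Note: To reproduce this, request a text-type resource (such as the CSS file above). If you use the Chrome browser, make sure the cache is enabled (leave "Disable cache" unchecked in the developer tools).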

2.2 Accept-Ranges

In the response above, the server sets an Accept-Ranges: bytes header, which means that the client may request a portion of the resource by byte range. A client that sees this header can try to resume an interrupted transfer.

Range parsing simply follows the HTTP specification, so we will not analyze it in detail here. For details, see RFC 7233, Section 2.3.
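To give a feel for what range parsing involves, here is a minimal sketch that handles a single byte range. It is not Tomcat's parser: ByteRangeSketch is a hypothetical class, multi-range requests (comma-separated) are deliberately rejected, and invalid input yields null instead of an error response.

```java
final class ByteRangeSketch {
    final long start; // first byte position, inclusive
    final long end;   // last byte position, inclusive

    private ByteRangeSketch(long start, long end) {
        this.start = start;
        this.end = end;
    }

    /** Parses "bytes=first-last", "bytes=first-" or "bytes=-suffix"; returns null if unusable. */
    static ByteRangeSketch parse(String header, long length) {
        if (header == null || !header.startsWith("bytes=") || header.indexOf(',') >= 0) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0) {
            return null;
        }
        String first = spec.substring(0, dash);
        String last = spec.substring(dash + 1);
        try {
            long start;
            long end;
            if (first.isEmpty()) {
                // Suffix form: the final N bytes of the resource.
                if (last.isEmpty()) {
                    return null;
                }
                long suffix = Long.parseLong(last);
                if (suffix <= 0) {
                    return null;
                }
                start = Math.max(0, length - suffix);
                end = length - 1;
            } else {
                start = Long.parseLong(first);
                end = last.isEmpty() ? length - 1 : Math.min(Long.parseLong(last), length - 1);
            }
            if (start < 0 || start > end) {
                return null;
            }
            return new ByteRangeSketch(start, end);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Value for the Content-Range header of a 206 Partial Content response. */
    String contentRange(long length) {
        return "bytes " + start + "-" + end + "/" + length;
    }
}
```

For the 72259-byte /main.css above, a client that already holds the first 72000 bytes could send Range: bytes=72000- and would receive the remaining 259 bytes with Content-Range: bytes 72000-72258/72259.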

3. SendFile Processing

The last step checks whether sendfile is supported. It is supported in NIO mode and provides zero copy: it saves one copy into application memory by writing data directly from the kernel to the channel. Tomcat tries to use this method to send files larger than 48 KB.
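In plain Java NIO, the same idea is exposed through FileChannel.transferTo. The sketch below is not Tomcat's connector code: SendfileSketch, useSendfile, and transfer are hypothetical names, the threshold constant simply mirrors the 48 KB figure above, and the target is assumed to be a blocking channel. Whether the transfer really avoids the copy into application memory depends on the operating system and on the type of the target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SendfileSketch {

    /** Assumed threshold, taken from the 48 KB figure in the text. */
    static final long SENDFILE_THRESHOLD = 48 * 1024;

    /** Only files larger than the threshold are sent with sendfile. */
    static boolean useSendfile(long contentLength) {
        return contentLength > SENDFILE_THRESHOLD;
    }

    /** Transfers the whole file to the target channel and returns the bytes sent. */
    static long transfer(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = source.size();
            long position = 0;
            while (position < size) {
                // transferTo may send fewer bytes than requested, so loop.
                long sent = source.transferTo(position, size - position, target);
                if (sent <= 0) {
                    break; // the target accepted nothing more
                }
                position += sent;
            }
            return position;
        }
    }
}
```

Small files stay on the ordinary copy path, since for them the setup cost of a sendfile transfer can outweigh the saved copy.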

4. Summary

Tomcat's implementation of static resource processing is fairly complete, but it is still slightly inferior to web servers such as Nginx, because they serve static resources directly, while Tomcat has to perform an additional mapping step. In practice, static and dynamic content are usually separated so that Tomcat can focus on processing dynamic requests.
