Understand all aspects of HTTP Headers with pictures and text

What are HTTP Headers

HTTP stands for "Hypertext Transfer Protocol". The entire World Wide Web uses this protocol. Almost everything you see in the browser, including this article, is transmitted via HTTP.

HTTP Headers are the core of HTTP requests and responses, and they carry information about the client browser, the requested page, the server, and so on.

Example

When you type a URL in the browser address bar, your browser will make an HTTP request similar to the following:
GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Host: net.tutsplus.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120
Pragma: no-cache
Cache-Control: no-cache

The first line is called the "request line" and describes the basic information of the request; the rest are HTTP headers.

After the request is completed, your browser may receive the following HTTP response:

HTTP/1.x 200 OK
Transfer-Encoding: chunked
Date: Sat, 28 Nov 2009 04:36:25 GMT
Server: LiteSpeed
Connection: close
X-Powered-By: W3 Total Cache/0.8
Pragma: public
Expires: Sat, 28 Nov 2009 05:36:25 GMT
Etag: "pub1259380237;gz"
Cache-Control: max-age=3600, public
Content-Type: text/html; charset=UTF-8
Last-Modified: Sat, 28 Nov 2009 03:50:37 GMT
X-Pingback: http://net.tutsplus.com/xmlrpc.php
Content-Encoding: gzip
Vary: Accept-Encoding, Cookie, User-Agent

<!-- ... rest of the html ... -->

The first line is called the "status line". The HTTP headers follow it, and after a blank line the output starts (in this case some HTML output).

You can't see the HTTP headers when you view the page source, although they are sent to the browser along with the content you can see.

Loading this page also causes the browser to send further HTTP requests for other resources, such as images, CSS files, and JavaScript files.

Let’s look at the details below.

How to view HTTP Headers

The following Firefox extensions can help you analyze HTTP headers:

1. Firebug

2. Live HTTP Headers

In PHP, you can also inspect headers from code:

  • getallheaders() is used to get the request headers. You can also use the $_SERVER array.
  • headers_list() is used to get the response headers.

Later in this article you will see some examples using PHP.

HTTP Request Structure

The first line, called the "request line", consists of three parts:

  • "method" indicates what type of request this is. The most common request types are GET, POST and HEAD.
  • “path” refers to the path after the host. For example, when you request “http://net.tutsplus.com/tutorials/other/top-20-mysql-best-practices/”, the path will be “/tutorials/other/top-20-mysql-best-practices/”.
  • "protocol" contains "HTTP" and the version number, and modern browsers will use 1.1.

Each of the remaining lines is a "Name: Value" pair. They contain various information about the request and your browser. For example, "User-Agent" indicates your browser version and the operating system you are using. "Accept-Encoding" tells the server that your browser can accept compressed output such as gzip.

Most of these headers are optional. The HTTP request can even be shortened to this:

GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Host: net.tutsplus.com

And you will still receive a valid response from the server.
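The structure described above (a request line followed by "Name: Value" pairs) can be sketched in a few lines of PHP. This is only an illustration: parse_http_request() is a hypothetical helper, not a PHP built-in, and it assumes a well-formed request with one header per line.

```php
<?php
// Hypothetical helper: split a raw HTTP request into its request line and headers.
// Assumes well-formed input; malformed lines would trigger PHP warnings.
function parse_http_request(string $raw): array
{
    // The header section ends at the first blank line.
    $head  = preg_split("/\r?\n\r?\n/", $raw, 2)[0];
    $lines = preg_split("/\r?\n/", trim($head));

    // Request line: method, path and protocol, separated by spaces.
    [$method, $path, $protocol] = explode(' ', array_shift($lines), 3);

    $headers = [];
    foreach ($lines as $line) {
        // Split on the first colon only, because values (URLs, times) may contain colons.
        [$name, $value] = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }

    return [
        'method'   => $method,
        'path'     => $path,
        'protocol' => $protocol,
        'headers'  => $headers,
    ];
}

$request = parse_http_request(
    "GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1\r\n" .
    "Host: net.tutsplus.com\r\n\r\n"
);
echo $request['method'], ' ', $request['headers']['Host'], "\n";
// prints "GET net.tutsplus.com"
```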

Request Type

The three most common request types are: GET, POST and HEAD. You may already be familiar with the first two from your experience writing HTML.

GET: Get a document

Most of the html, images, js, css, ... that are transmitted to the browser are requested through the GET method. It is the primary method for acquiring data.

For example, to fetch articles from Nettuts+, the first line of the http request would typically look like this:

GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1

Once the HTML is loaded, the browser sends further GET requests to fetch the images, like this:

GET /wp-content/themes/tuts_theme/images/header_bg_tall.png HTTP/1.1

Forms can also be sent via the GET method. Here is an example:

<form action="foo.php" method="GET">
First Name: <input name="first_name" type="text" />
Last Name: <input name="last_name" type="text" />
<input name="action" type="submit" value="Submit" />
</form>

When this form is submitted, the HTTP request will look like this:

GET /foo.php?first_name=John&last_name=Doe&action=Submit HTTP/1.1
...

You can send form input to the server by appending it to the query string.
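As a rough illustration of how the form fields end up in the query string, the sketch below builds the same request line with PHP's http_build_query() and decodes it again with parse_str(). The field values are the ones from the form example; the variable names are only illustrative.

```php
<?php
// Field names and values taken from the form example above.
$fields = ['first_name' => 'John', 'last_name' => 'Doe', 'action' => 'Submit'];

// http_build_query() URL-encodes each value and joins the pairs with "&".
$query = http_build_query($fields, '', '&');

$request_line = "GET /foo.php?" . $query . " HTTP/1.1";
echo $request_line, "\n";
// prints "GET /foo.php?first_name=John&last_name=Doe&action=Submit HTTP/1.1"

// parse_str() performs the reverse step, much as PHP does when it fills $_GET.
parse_str($query, $decoded);
```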

POST: Send data to the server

Although you can use the GET method to attach data to the URL and send it to the server, in many cases it is more appropriate to use POST to send data to the server. Sending large amounts of data via GET is not practical and has certain limitations.

It is common practice to send form data using a POST request. Let's modify the above example to use POST:

<form action="foo.php" method="POST">
First Name: <input name="first_name" type="text" />
Last Name: <input name="last_name" type="text" />
<input name="action" type="submit" value="Submit" />
</form>

Submitting this form creates an HTTP request like this:

POST /foo.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost/test.php
Content-Type: application/x-www-form-urlencoded
Content-Length: 43

first_name=John&last_name=Doe&action=Submit

There are three things to note here:

  • The path on the first line has become simply /foo.php , without the query string.
  • The Content-Type and Content-Length headers have been added; they describe the data being sent.
  • All data is sent after the headers in the form of a query string.
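To make the three points above concrete, here is a minimal sketch that assembles the text of such a POST request by hand. build_post_request() is a hypothetical helper written for this illustration; real clients (browsers, cURL) add more headers.

```php
<?php
// Hypothetical helper: build the raw text of a form POST request.
function build_post_request(string $host, string $path, array $fields): string
{
    // The body uses the same encoding as a query string.
    $body = http_build_query($fields, '', '&');

    return "POST {$path} HTTP/1.1\r\n"
        . "Host: {$host}\r\n"
        . "Content-Type: application/x-www-form-urlencoded\r\n"
        // Content-Length is the size of the body in bytes.
        . "Content-Length: " . strlen($body) . "\r\n"
        // A blank line separates the headers from the body.
        . "\r\n"
        . $body;
}

$raw = build_post_request('localhost', '/foo.php', [
    'first_name' => 'John',
    'last_name'  => 'Doe',
    'action'     => 'Submit',
]);
echo $raw, "\n";
```

For these three fields the body is 43 bytes long, which matches the Content-Length value in the request shown above.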

POST requests can also be used with AJAX, applications, cURL, etc. And all file upload forms are required to use the POST method.

HEAD: Receive header information

HEAD is very similar to GET, except that the server does not send the content part of the HTTP response. When you send a HEAD request, it means that you are only interested in the response code and the HTTP headers, not the document itself.

This method allows the browser to determine whether the page has been modified, thereby controlling the cache. It is also possible to determine whether the requested document exists.

For example, if your website has many links, you can simply send a HEAD request to each of them to determine if there are any broken links, which is much faster than using GET.

HTTP Response Structure

When the browser sends an HTTP request, the server answers with an HTTP response. Leaving the content aside, the response consists of a status line followed by headers, as in the example near the beginning of this article.

The first piece of information on the status line is the protocol. Servers currently use HTTP/1.0 or HTTP/1.1.

Next comes the status code with a short message. Code 200 means that the request succeeded and that the server will return the requested document after the headers.

We've all seen "404" pages. When you request a path that does not exist, the server responds with 404 instead of 200.

The rest of the response is similar to the HTTP request: headers with information about the server software, when the page or file was last modified, the mime-type, and so on.

Again, these headers are optional.

HTTP Status Codes

  • 2xx codes indicate a successful request.
  • 3xx codes indicate a redirect.
  • 4xx codes indicate a problem with the request.
  • 5xx codes indicate a server problem.
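Because the first digit of a status code identifies its group, a client can branch on it without knowing every individual code. A minimal sketch, where status_class() is a hypothetical helper name and not a PHP built-in:

```php
<?php
// Hypothetical helper: map a status code to the group described in the list above.
function status_class(int $code): string
{
    // Integer division by 100 leaves only the first digit of a three-digit code.
    switch (intdiv($code, 100)) {
        case 2:
            return 'success';
        case 3:
            return 'redirect';
        case 4:
            return 'request problem';
        case 5:
            return 'server problem';
        default:
            return 'other';
    }
}

echo status_class(404), "\n";
// prints "request problem"
```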

200 Success (OK)

As mentioned earlier, 200 is used to indicate a successful request.

206 Partial Content

If a client requests only a certain range of a file, the server returns 206.

This is usually used for download management, resuming downloads, or downloading files in chunks.
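A client asks for part of a file with a Range request header such as "Range: bytes=0-499", and the server describes the part it returns in a Content-Range header. The sketch below shows how a script might interpret a single byte range; resolve_byte_range() is a hypothetical helper, and multi-range requests are deliberately not handled.

```php
<?php
// Hypothetical helper: resolve a single "bytes=start-end" range against a file size.
// Returns [start, end] (inclusive), or null if the range is malformed or cannot be satisfied.
function resolve_byte_range(string $header, int $size): ?array
{
    if (!preg_match('/^bytes=(\d*)-(\d*)$/', trim($header), $m)
        || ($m[1] === '' && $m[2] === '')) {
        return null;
    }

    if ($m[1] === '') {
        // Suffix form "bytes=-200": the last 200 bytes of the file.
        $length = (int) $m[2];
        if ($length === 0) {
            return null;
        }
        $start = max(0, $size - $length);
        $end   = $size - 1;
    } else {
        $start = (int) $m[1];
        // Open-ended form "bytes=500-" runs to the end of the file.
        $end = ($m[2] === '') ? $size - 1 : min((int) $m[2], $size - 1);
    }

    return ($start < $size && $start <= $end) ? [$start, $end] : null;
}

// Example: the first 500 bytes of a 1000-byte file.
[$start, $end] = resolve_byte_range('bytes=0-499', 1000);
echo "HTTP/1.1 206 Partial Content\n";
echo "Content-Range: bytes {$start}-{$end}/1000\n";   // Content-Range: bytes 0-499/1000
echo "Content-Length: " . ($end - $start + 1) . "\n"; // Content-Length: 500
```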

404 Not Found

This one is easy to understand: the server sends a 404 status code when the requested page or file cannot be found.

401 Unauthorized

Password protected pages will return this status. If you do not enter the correct password, the browser shows a 401 error page.

Note that this applies only to pages protected by HTTP authentication, where the browser pops up a login prompt asking for the username and password.

403 Forbidden

If you do not have permission to access a page, a 403 status is returned. This usually happens when you try to open a folder that doesn't have an index page. If your server settings do not allow viewing directory contents, then you will see a 403 error.

There are other ways to restrict access. For example, you can block visitors by IP address with the help of an .htaccess file:

order allow,deny
deny from 192.168.44.201
deny from 224.39.163.12
deny from 172.16.7.92
allow from all

302 (or 307) Moved Temporarily and 301 Moved Permanently

These two status codes tell the browser to redirect. For example, URL shortening services such as bit.ly respond with a redirect, which is also how they know who clicks on their links.

Browsers handle 302 and 301 in much the same way, but search engine crawlers treat them differently. For example, if your website is under maintenance, you could redirect visitors to another address with a 302; crawlers will keep the original address and check your pages again later. A 301 redirect, by contrast, tells crawlers that your website has permanently moved to the new address.

500 Internal Server Error

This code usually appears when a page script crashes. Unlike PHP, most CGI scripts do not output error messages to the browser. If a fatal error occurs, they just send a 500 status code, and you need to check the server error log to troubleshoot.

Complete List

The HTTP specification contains a complete description of the status codes. You can also check http://tools.jb51.net/table/http_status_code.

HTTP Headers in HTTP Requests

Now let's look at some common headers found in HTTP requests.

All of these headers can be found in PHP's $_SERVER array. You can also use the getallheaders() function to get all the request headers at once.

Host

An HTTP request is sent to a specific IP address, but most servers have the ability to host multiple websites under the same IP address, so the server must know which domain name the browser is requesting.

Host: rlog.cn

This is just the base hostname, with the domain and subdomains.

In PHP, this can be viewed via $_SERVER['HTTP_HOST'] or $_SERVER['SERVER_NAME'].

User-Agent

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)

This header can carry the following information:

  • Browser name and version number.
  • Operating system name and version number.
  • Default language.

This is a common method some websites use to collect information about their visitors. For example, you can determine if a visitor is accessing your site from a mobile phone and decide whether to direct them to a mobile site that performs well at lower resolutions.

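In PHP, you can get the User-Agent through $_SERVER['HTTP_USER_AGENT'].

// the header is optional, so check that it exists first
if (isset($_SERVER['HTTP_USER_AGENT']) && strpos($_SERVER['HTTP_USER_AGENT'], 'MSIE 6') !== false) {
    echo "Please stop using IE6!";
}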

Accept-Language

Accept-Language: en-us,en;q=0.5

This information can indicate the user's default language setting. If the website has different language versions, this information can be used to redirect the user's browser.

It can carry multiple languages separated by commas. Each language may carry a "q" value between 0 and 1 that indicates the user's preference for it; a language without a "q" value, typically the first one, is the most preferred.

In PHP, use $_SERVER["HTTP_ACCEPT_LANGUAGE"] to get this information.

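// the header is optional, so fall back to an empty string
$accept_language = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
if (substr($accept_language, 0, 2) == 'fr') {
    header('Location: http://french.mydomain.com');
    exit; // stop so that nothing else is sent after the redirect
}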
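Checking only the first two characters ignores the "q" values. If you need the full preference order, a sketch along the following lines may help. parse_accept_language() is a hypothetical helper, not part of PHP, and it assumes the usual "language;q=value" syntax.

```php
<?php
// Hypothetical helper: turn an Accept-Language value into [language => weight],
// sorted from most to least preferred.
function parse_accept_language(string $header): array
{
    $weights = [];
    foreach (explode(',', $header) as $item) {
        $parts = explode(';', trim($item));
        $tag   = strtolower(trim($parts[0]));
        if ($tag === '') {
            continue;
        }

        // A language without an explicit "q" value has the default weight of 1.
        $q = 1.0;
        foreach (array_slice($parts, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $weights[$tag] = $q;
    }

    // Sort by weight, highest first, keeping the language tags as keys.
    arsort($weights);
    return $weights;
}

$languages = parse_accept_language('en-us,en;q=0.5');
echo key($languages), "\n";
// prints "en-us"
```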

Accept-Encoding

Accept-Encoding: gzip,deflate

Most modern browsers support gzip compression and report this to the server. The server can then send compressed HTML to the browser, which can reduce the size by up to 80%, saving download time and bandwidth.

In PHP, you can use $_SERVER["HTTP_ACCEPT_ENCODING"] to get this information. However, ob_gzhandler() checks this value automatically when it is used as an output callback, so you don't need to check it manually.

// enables output buffering
// and all output is compressed if the browser supports it
ob_start('ob_gzhandler');

If-Modified-Since

If a page is already cached in your browser, the next time you visit it the browser will check whether the document has been modified by sending a header like this:

If-Modified-Since: Sat, 28 Nov 2009 06:38:19 GMT

If the document has not been modified since this time, the server will return "304 Not Modified" and will not return the content. The browser will then read the content from its cache.

In PHP, this can be detected using $_SERVER['HTTP_IF_MODIFIED_SINCE'].

// assume $last_modify_time is the time the output was last updated
// did the browser send an If-Modified-Since header?
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
    // if the browser cache matches the modify time
    if ($last_modify_time == strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
        // send a 304 header, and no content
        header("HTTP/1.1 304 Not Modified");
        exit;
    }
}

There is also an HTTP header called Etag, which is used to determine whether the cached information is correct, which we will explain later.

Cookie

As the name implies, it will send the cookie information stored in your browser to the server.

Cookie: PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120; foo=bar

It is a set of name-value pairs separated by semicolons. Cookies can also contain a session id.

In PHP, individual cookies can be accessed from the $_COOKIE array. You can directly use the $_SESSION array to get session variables. If you need a session id, you can use the session_id() function instead of cookies.

echo $_COOKIE['foo'];
// output: bar
echo $_COOKIE['PHPSESSID'];
// output: r2t5uvjq435r4q7ib3vtdjq120
session_start();
echo session_id();
// output: r2t5uvjq435r4q7ib3vtdjq120

Referer

As the name implies, the header will contain the referring url information.

For example, if I visit the Nettuts+ homepage and click a link, the browser will send this header to the server:

Referer: http://net.tutsplus.com/

In PHP, this value can be obtained through $_SERVER['HTTP_REFERER'].

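if (isset($_SERVER['HTTP_REFERER'])) {
    $url_info = parse_url($_SERVER['HTTP_REFERER']);
    // is the surfer coming from Google?
    if (isset($url_info['host'], $url_info['query']) && $url_info['host'] == 'www.google.com') {
        parse_str($url_info['query'], $vars);
        if (isset($vars['q'])) {
            // escape the keyword, since the Referer header is user-controlled input
            echo "You searched on Google for this keyword: " . htmlspecialchars($vars['q']);
        }
    }
}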
// if the referring url was:
// http://www.google.com/search?source=ig&hl=en&rlz=&=&q=http+headers&aq=f&oq=&aqi=g-p1g9
// the output will be:
// You searched on Google for this keyword: http headers

You may have noticed that the word "referrer" is misspelled as "referer". Unfortunately, the misspelling made it into the official HTTP specification and stuck.

Authorization

When a page requires authorization, the browser pops up a login window. After you enter a username and password, the browser sends the HTTP request again, this time with a header like this:

Authorization: Basic bXl1c2VyOm15cGFzcw==

The credentials in the header are base64 encoded. For example, base64_decode('bXl1c2VyOm15cGFzcw==') returns 'myuser:mypass'.

In PHP, this value can be obtained using $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'].
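PHP normally does this decoding for you, but the step is simple enough to sketch by hand. parse_basic_authorization() below is a hypothetical helper that takes the header value and returns the two parts; it assumes the Basic scheme only.

```php
<?php
// Hypothetical helper: decode the value of an "Authorization: Basic ..." header.
// Returns ['user' => ..., 'password' => ...], or null if the value is not valid Basic auth.
function parse_basic_authorization(string $header): ?array
{
    if (!preg_match('/^Basic\s+(\S+)$/i', trim($header), $m)) {
        return null;
    }

    // Strict mode makes base64_decode() return false on invalid characters.
    $decoded = base64_decode($m[1], true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null;
    }

    // Split on the first colon only, since the password itself may contain colons.
    [$user, $password] = explode(':', $decoded, 2);
    return ['user' => $user, 'password' => $password];
}

$credentials = parse_basic_authorization('Basic bXl1c2VyOm15cGFzcw==');
echo $credentials['user'], "\n";
// prints "myuser"
```

Note that base64 is an encoding, not encryption: anyone who can read the header can recover the password.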

We will explain more details in the WWW-Authenticate section.

HTTP Headers in HTTP Responses

Now let's look at some common headers found in HTTP responses.

In PHP, you can set response headers with header(). PHP already sends some headers automatically, for example when loading content or setting cookies. You can see the headers that have been sent or are about to be sent with the headers_list() function, and you can use headers_sent() to check whether the headers have already been sent.

Cache-Control

w3.org defines it as: "The Cache-Control general-header field is used to specify directives which MUST be obeyed by all caching mechanisms along the request/response chain." These "caching mechanisms" include gateways and proxies, such as those your ISP may use.

For example:

Cache-Control: max-age=3600, public

"public" means that the response can be cached by anyone, and "max-age" indicates the number of seconds that the cache is valid. Allowing your website to be cached greatly reduces download time and bandwidth, while also increasing browser loading speeds.

You can also prevent caches from reusing a stored response without first checking with the server by setting the "no-cache" directive:

Cache-Control: no-cache

See w3.org for more details.
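A Cache-Control value is a comma-separated list of directives, some of which carry a value. A rough sketch of how a script could read them follows. parse_cache_control() is a hypothetical helper, and it does not handle quoted values that themselves contain commas.

```php
<?php
// Hypothetical helper: split a Cache-Control value into its directives.
// Directives without a value (such as "public") are stored as true.
function parse_cache_control(string $header): array
{
    $directives = [];
    foreach (explode(',', $header) as $item) {
        $item = trim($item);
        if ($item === '') {
            continue;
        }
        $pair = explode('=', $item, 2);
        $name = strtolower(trim($pair[0]));
        // Strip surrounding whitespace and double quotes from the value, if there is one.
        $directives[$name] = isset($pair[1]) ? trim($pair[1], " \t\"") : true;
    }
    return $directives;
}

$directives = parse_cache_control('max-age=3600, public');
echo $directives['max-age'], "\n";
// prints "3600"
```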

Content-Type

This header contains the "mime-type" of the document. The browser will decide how to parse the document based on this parameter. For example, an html page (or a php page with html output) would return something like this:

Content-Type: text/html; charset=UTF-8

'text' is the document type, 'html' is the document subtype. This header also includes more information, such as charset.

If it is an image, the response will be:

Content-Type: image/gif

The browser can use the mime-type to decide whether to open the document with an external program or with one of its own extensions. The following example may launch Adobe Reader:

Content-Type: application/pdf

When serving a file directly, Apache will usually determine the mime-type of the document automatically and add the appropriate header. Most browsers also have a certain degree of fault tolerance: when the header is missing or incorrect, they try to detect the mime-type themselves.

In PHP you can use finfo_file() to detect the mime-type of a file.

Content-Disposition

This header tells the browser to open a file download window instead of trying to parse the content of the response. For example:

Content-Disposition: attachment; filename="download.zip"

It will cause the browser to display a file download dialog box.

Note that an appropriate Content-Type header should still be sent:

Content-Type: application/zip
Content-Disposition: attachment; filename="download.zip"

Content-Length

When the content is about to be transmitted to the browser, the server can use this header to inform the browser of the size (bytes) of the file to be transmitted.

Content-Length: 89123

This information is quite useful for file downloads. That's how the browser knows the progress of the download.

For example, here is a dummy script I wrote to simulate a slow download.

// it's a zip file
header('Content-Type: application/zip');
// 1 million bytes (about 1 megabyte)
header('Content-Length: 1000000');
// load a download dialogue, and save it as download.zip
header('Content-Disposition: attachment; filename="download.zip"');
// 1000 times 1000 bytes of data
for ($i = 0; $i < 1000; $i++) {
    echo str_repeat(".", 1000);
    // sleep to slow down the download
    usleep(50000);
}

Because the browser knows the total size, its download dialog can show the progress of the download.

Now, I comment out the Content-Length header:

// it's a zip file
header('Content-Type: application/zip');
// the browser won't know the size
// header('Content-Length: 1000000');
// load a download dialogue, and save it as download.zip
header('Content-Disposition: attachment; filename="download.zip"');
// 1000 times 1000 bytes of data
for ($i = 0; $i < 1000; $i++) {
    echo str_repeat(".", 1000);
    // sleep to slow down the download
    usleep(50000);
}

This time the browser only tells you how much has been downloaded, not how much there is to download in total, and the progress bar cannot show the progress.

Etag

This is another header that is generated for caching purposes. It will look like this:

Etag: "pub1259380237;gz"

The server may respond to the browser with this information along with each file sent. This value can contain the document's last modified date, file size, or file checksum. The browser caches it along with the received documents. The next time the browser requests the same file, it will send the following HTTP request:

If-None-Match: "pub1259380237;gz"

If the Etag value of the requested document matches it, the server will send a 304 status code instead of 200, and no content is returned. The browser will then load the file from its cache.
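The comparison the server performs can be sketched as follows. Both function names are hypothetical, and using an MD5 checksum of the content as the Etag is just one possible choice assumed for this illustration. The sketch also assumes that Etag values contain no commas.

```php
<?php
// Hypothetical helper: derive an Etag from the content (here: an MD5 checksum in quotes).
function etag_for(string $content): string
{
    return '"' . md5($content) . '"';
}

// Hypothetical helper: does the If-None-Match value sent by the browser match the current Etag?
function is_not_modified(?string $if_none_match, string $etag): bool
{
    if ($if_none_match === null) {
        return false; // the browser has no cached copy to validate
    }
    if (trim($if_none_match) === '*') {
        return true; // "*" matches any existing version of the document
    }
    foreach (explode(',', $if_none_match) as $candidate) {
        $candidate = trim($candidate);
        // Ignore the weak-validator prefix when comparing.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $etag) {
            return true;
        }
    }
    return false;
}

$etag = etag_for('<html>...</html>');
// In a real script the second value would come from $_SERVER['HTTP_IF_NONE_MATCH'].
if (is_not_modified($etag, $etag)) {
    echo "304 Not Modified\n";
}
```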

Last-Modified

As the name implies, this header indicates when the document was last modified, expressed in GMT:

Last-Modified: Sat, 28 Nov 2009 03:50:37 GMT

// $file is the path of the file being served
$modify_time = filemtime($file);
header("Last-Modified: " . gmdate("D, d M Y H:i:s", $modify_time) . " GMT");

It provides another caching mechanism. A browser might send a request like this:

If-Modified-Since: Sat, 28 Nov 2009 06:38:19 GMT

We have already discussed this in the If-Modified-Since section.

Location

This header is used for redirection. The server MUST send this header if the response code is 301 or 302. For example, when you visit http://www.nettuts.com your browser will receive the following response:

HTTP/1.x 301 Moved Permanently
...
Location: http://net.tutsplus.com/
...

In PHP you can redirect visitors this way:
header('Location: http://net.tutsplus.com/');

By default, a 302 status code will be sent. If you want to send a 301, write this:

header('Location: http://net.tutsplus.com/', true, 301);

Set-Cookie

When a website needs to set or update a cookie in your browser, it uses a header like this:

Set-Cookie: skin=noskin; path=/; domain=.amazon.com; expires=Sun, 29-Nov-2009 21:42:28 GMT
Set-Cookie: session-id=120-7333518-8165026; path=/; domain=.amazon.com; expires=Sat Feb 27 08:00:00 2010 GMT

Each cookie is sent as a separate header. Note that setting cookies via js will not be reflected in the HTTP header.

In PHP, you can set cookies with the setcookie() function, and PHP will send the appropriate HTTP headers.

setcookie("TestCookie", "foobar");

It will send the following headers:

Set-Cookie: TestCookie=foobar

If no expiration date is specified, the cookie is deleted when the browser is closed.

WWW-Authenticate

A website might send this header to ask the browser for HTTP authentication. When the browser sees it in the response, it opens a login popup.

WWW-Authenticate: Basic realm="Restricted Area"

The PHP manual includes a simple example that demonstrates how to do this with PHP:

if (!isset($_SERVER['PHP_AUTH_USER'])) {
    header('WWW-Authenticate: Basic realm="My Realm"');
    header('HTTP/1.0 401 Unauthorized');
    echo 'Text to send if user hits Cancel button';
    exit;
} else {
    echo "<p>Hello {$_SERVER['PHP_AUTH_USER']}.</p>";
    echo "<p>You entered {$_SERVER['PHP_AUTH_PW']} as your password.</p>";
}

Content-Encoding

This header is typically set when the returned content is compressed.

Content-Encoding: gzip

In PHP, this header is set automatically when you use ob_gzhandler() as the output callback.

Original URL: http://css9.net/all-about-http-headers/
