In-depth analysis of nginx+php-fpm service HTTP status code 502

One of our web projects has recently seen its traffic and database load grow as the service expanded to new cities. We provide the interface, and downstream callers have recently reported a large number of 502 responses.

502 Bad Gateway usually means that the upstream (PHP in this case) failed. For PHP, the common causes are that a script runs longer than the configured timeout, or that the timeout is set so high that PHP processes stay busy for a long time and no idle worker is left to accept new requests.

In our project the cause was that the PHP execution time limit was too short. In that case we can raise the limit moderately to make the 502s go away first. After all, optimizing the slow code takes more time.

There are two settings that control PHP execution time: max_execution_time in php.ini and request_terminate_timeout in the php-fpm configuration. request_terminate_timeout is enforced by php-fpm regardless of max_execution_time, so if you don't want to change the global php.ini, you only need to change the php-fpm configuration.

Next, I will analyze in detail why PHP script execution exceeds the set time and causes nginx to return 502.

Let's set the scene first to reproduce the problem:

Nginx and PHP each start only one worker to facilitate tracking.

php-fpm's request_terminate_timeout is set to 3s.

Test script test.php

<?php
sleep(20);
echo 'ok';

go go go:

Visit www.v.com/test.php in the browser, and after 3 seconds the result is... 404? What?

Not a good start. Let's take a look at the nginx configuration file:

This location block redirects to a nicer-looking page when a 5xx error occurs, but I don't have a 50x.html file in /usr/share/nginx/html, so I got a 404 instead. That would get in the way of the analysis, so just comment it out. Visit again, wait 3 seconds, and the 'normal' 502 page finally appears.

Now that the environment is ready, let's go through the usual troubleshooting routine for web problems, starting with the error logs:

nginx:

Every error message is recv() failed (104: Connection reset by peer).

recv failed because the connection was reset. But why was it reset?

Let's take a look at the error log of php-fpm:

(Note that the php_admin_value[error_log] option in the php-fpm configuration specifies the PHP error log and overrides the one in php.ini. Here, however, we are not looking at PHP errors but at php-fpm errors. The php-fpm error log is specified by the error_log option in php-fpm.conf.)

Each request generates 2 WARNING and 1 NOTICE:

WARNING: Script execution timed out and terminated.

WARNING: The child process received a SIGTERM signal and exited.

NOTICE: A new child process was started (because I set pm.min_spare_servers = 1)

So when a PHP worker process times out, not only is the script terminated, the worker process exits as well. The connection reset that nginx reports therefore appears to be caused by the exit of the php worker: when a process exits, the kernel closes its sockets, and if a socket is closed while received data is still unread, Linux sends RST to the peer instead of the usual FIN.
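To make the reset easier to picture, here is a minimal Python sketch of the same situation on the loopback interface. It is not php-fpm code: the function names and the 0.2 second pause are made up for the illustration, and it assumes Linux behaviour, where closing a socket that still holds unread data makes the kernel send RST. A thread plays the killed worker, and the client plays nginx.

```python
import socket
import threading
import time


def worker_killed_mid_request(listener, request_sent):
    """Stand-in for the php-fpm worker: accept, never read, then go away."""
    conn, _ = listener.accept()
    request_sent.wait()   # the client has written its request
    time.sleep(0.2)       # let the bytes land in our receive buffer
    conn.close()          # unread data at close time, so the kernel sends RST


def demo_reset():
    """Return 'reset' if the client (playing nginx) sees ECONNRESET."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # any free port, only for this demo
    listener.listen(1)
    request_sent = threading.Event()
    worker = threading.Thread(
        target=worker_killed_mid_request, args=(listener, request_sent)
    )
    worker.start()

    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)
    client.sendall(b"x" * 896)        # same size as the request in the trace
    request_sent.set()
    try:
        data = client.recv(4096)
        outcome = "closed cleanly" if data == b"" else "got data"
    except ConnectionResetError:      # errno 104 on Linux
        outcome = "reset"
    finally:
        client.close()
        worker.join()
        listener.close()
    return outcome


if __name__ == "__main__":
    print(demo_reset())
```

If the worker read everything before closing, the client would see an orderly close (recv returning an empty result) instead of an error.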

From the logs we know that the PHP script timed out and the worker child process exited, which made nginx report Connection reset by peer. Next, let's use strace to see what PHP and nginx are doing:

php:

1. Accept a connection request from nginx (socket, bind, and listen were all done in the master). You can see that nginx's port is 47039. The accept call is made on FD0, that is, standard input; the FastCGI protocol specifies that the listening socket is passed to the application on this descriptor. The connected descriptor returned by accept is 3.

2. Read the data passed by nginx from FD3 in FastCGI protocol format; the parameter data received is 856 bytes. Why does it take 5 reads?

Because FastCGI packets are 8-byte aligned, and each consists of a header and a body. nginx first sends a begin-request packet, which carries the request ID, version, type and similar information (header and body are 8 bytes each), then a params packet with the GET parameters and environment variables (8-byte header, longer body), and finally a params packet with a header but no body, which marks the end of the parameters (8 bytes). So the first three reads fetch the header and body of the begin-request packet and the header of the params packet, the fourth read fetches the actual parameter data, and the last read fetches the header of the closing params packet. These five reads add up to 8+8+8+856+8=888 bytes, which is 8 bytes short of the 896 bytes that nginx writes below. The missing 8 bytes are most likely the header of the empty stdin packet that nginx appends to the end of the request, which PHP has not read yet when it is killed. With POST, the request body is sent in stdin packets as well.

3. Sleep for 20 seconds, which is the sleep(20) in the PHP script. The process is terminated during the sleep, so nothing follows, and strace exits as well.
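The packet layout described in step 2 can be sketched in a few lines of Python. This is an illustration based on the FastCGI record format (8-byte header, content, padding to a multiple of 8), not nginx's actual code. The two parameters and the path /var/www/test.php are invented for the example, so the sizes differ from the 856 bytes in the trace.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1
FCGI_PARAMS = 4
FCGI_STDIN = 5
FCGI_RESPONDER = 1
HEADER_LEN = 8


def record(rec_type, request_id, content=b""):
    """Build one record: 8-byte header, content, zero padding to 8-byte alignment."""
    padding = -len(content) % 8
    header = struct.pack(
        "!BBHHBB", FCGI_VERSION_1, rec_type, request_id, len(content), padding, 0
    )
    return header + content + b"\x00" * padding


def name_value(name, value):
    """Encode one name-value pair; lengths below 128 take a single byte."""
    def length(n):
        return bytes([n]) if n < 128 else struct.pack("!I", n | 0x80000000)
    return length(len(name)) + length(len(value)) + name + value


def split_records(stream):
    """Yield (type, content length, padding length) for each record in a stream."""
    offset = 0
    while offset < len(stream):
        _ver, rec_type, _req_id, clen, plen, _reserved = struct.unpack_from(
            "!BBHHBB", stream, offset
        )
        yield rec_type, clen, plen
        offset += HEADER_LEN + clen + plen


# Hypothetical parameters, only to have something to encode.
params = name_value(b"SCRIPT_FILENAME", b"/var/www/test.php")
params += name_value(b"REQUEST_METHOD", b"GET")

begin_body = struct.pack("!HB5x", FCGI_RESPONDER, 0)  # role, flags, 5 reserved bytes
request = (
    record(FCGI_BEGIN_REQUEST, 1, begin_body)  # header + 8-byte body
    + record(FCGI_PARAMS, 1, params)           # header + parameter data
    + record(FCGI_PARAMS, 1)                   # empty params: end of parameters
    + record(FCGI_STDIN, 1)                    # empty stdin: end of request body
)
layout = list(split_records(request))

print(layout)
print(len(request))
```

Each tuple in layout corresponds to one packet, and the total length is always a multiple of 8 because every record is padded.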

nginx:

1. Accept the request from the browser. You can see that the port on the browser side is 56434, the IP is 192.168.1.105, and the FD of the established connection is 3.

2. Receive the HTTP request from FD3.

3. Create a socket, FD21, to establish a connection with PHP.

4. Connect FD21; you can see that the connection goes to port 9000 on the local machine. Here nginx and php-fpm communicate over a TCP socket. If nginx and php-fpm are deployed on the same machine, a Unix domain socket is worth considering.

5. Write data to FD21 in FastCGI protocol format. The length written is 896 bytes, which matches the total worked out in the PHP trace above.

6. recvfrom on FD21 returns ECONNRESET (Connection reset by peer).

7. Write error information to FD9. It can be inferred that FD9 is the file descriptor of the nginx error log.

8. Close the connection with FD21.

9. Write 502 Bad Gateway to FD3, which is the information returned to the browser.

10. Write an access log to FD8. It can be inferred that FD8 is the file descriptor of the nginx access log.

Let's verify the guess about the nginx access log and error log. They are indeed FD8 and FD9, and both are opened in write mode.
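The same check can be scripted. Below is a small Python sketch that reads the Linux /proc filesystem to find out which file a descriptor points to and how it was opened. The function name describe_fd is made up for this example, and it assumes Linux with /proc mounted and permission to inspect the target process (same user or root).

```python
import os

# Access mode is the low two bits of the open flags (the O_ACCMODE mask).
ACCESS_MODES = {
    os.O_RDONLY: "read-only",
    os.O_WRONLY: "write-only",
    os.O_RDWR: "read-write",
}


def describe_fd(pid, fd):
    """Return (target path, access mode) for one descriptor of a process."""
    target = os.readlink(f"/proc/{pid}/fd/{fd}")
    with open(f"/proc/{pid}/fdinfo/{fd}") as info:
        for line in info:
            if line.startswith("flags:"):
                flags = int(line.split()[1], 8)  # the kernel prints flags in octal
                return target, ACCESS_MODES[flags & 0o3]
    raise RuntimeError("no flags line found in fdinfo")
```

Calling describe_fd with the nginx worker's PID and 8 or 9 as the descriptor should return the log file path together with its access mode.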

Then let's take a look at the transmission of the entire network packet in this process:

The easiest way is to capture the packets with tcpdump and then open the capture in a graphical analyzer such as Wireshark.

Because I only want to see the communication between nginx and php, and I know that the port of nginx is 47039, I can filter out the corresponding packets through tcp.srcport==47039.

You can see the exchange between nginx and php-fpm: 47039->9000 completes the three-way handshake, then sends data to 9000, 9000 replies with ACK, and after 3 seconds 9000 replies with RST. Everything checks out.

Notes:

SYN and FIN each consume one sequence number

ACK and RST do not consume a sequence number (packets 28 and 29 have the same seq and ack numbers)

Each byte of data consumes one sequence number (packet 29 carries 896 bytes with seq 4219146879, and the ack of packet 30 is 4219147775, exactly 896 higher)

RST does not require a reply.
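The sequence number rules in these notes come down to simple arithmetic modulo 2^32. Here is a short Python sketch; next_seq is a name made up for this example, and the numbers are the ones from packets 29 and 30 above.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def next_seq(seq, payload_len, syn=False, fin=False):
    """Sequence number the peer is expected to acknowledge for this segment."""
    # Each payload byte consumes one number; SYN and FIN consume one each.
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_MODULUS


# Packet 29: seq 4219146879 carrying the 896-byte FastCGI request.
expected_ack = next_seq(4219146879, 896)
print(expected_ack)
```

A pure ACK or RST has no payload and neither flag set, so next_seq returns the sequence number unchanged, which is why packets 28 and 29 share the same seq.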
