Troubleshooting process for Docker container suddenly failing to connect after port mapping

1. Background

Generally, for Docker containers that need to provide services externally, we use the -p option at startup to publish a port to the outside world. For example, when starting Docker Registry, we map port 5000 for external access:

docker run -d -p 5000:5000 registry

But recently I ran into a very strange situation: a Docker Registry was deployed in our R&D team's CentOS 7 test environment, with its port published to the outside world. After the container starts, it works normally for a while, but after an unpredictable interval, external hosts can no longer pull images from the registry and the pull fails with a timeout.

However, the registry can still be accessed normally from the Docker host itself.

External access is only restored after manually restarting the Docker daemon on the affected host, but the problem reappears after a while.
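The symptom above (local access works, external access times out) can be checked with a small probe. This is only a minimal sketch: the function names are my own, it assumes the registry serves plain HTTP on the published port, and 192.0.2.10 is a placeholder for the Docker host's address.

```shell
# Build the URL of the Registry HTTP API v2 base endpoint.
registry_url() {
    # $1 = host, $2 = port
    printf 'http://%s:%s/v2/\n' "$1" "$2"
}

# Probe the registry with a 5-second overall timeout.
# Any HTTP response counts as reachable; a timeout or refused
# connection counts as unreachable.
probe_registry() {
    if curl --silent --output /dev/null --max-time 5 "$(registry_url "$1" "$2")"; then
        echo "reachable"
    else
        echo "unreachable"
        return 1
    fi
}

# Example: run the first line on the Docker host and the second
# on an external machine, then compare the results.
#   probe_registry localhost 5000
#   probe_registry 192.0.2.10 5000
```

If the probe reports "reachable" on the host and "unreachable" from outside, the registry itself is fine and the problem lies somewhere on the network path into the container.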

2. Troubleshooting

When I first hit this problem, my first reaction was to ask the team whether anyone had restarted FirewallD, the firewall that ships with CentOS 7.

I configured this server myself. The firewall is on, but I had already opened the port, so the firewall blocking the connection could be ruled out. Still, since this article is a record of pitfalls, I describe that case here as well.

Case 1: The firewall is turned on but no ports are open

CentOS 7 ships with the FirewallD firewall enabled by default. We can check the status of FirewallD with the following command:

firewall-cmd --state 

If the output is "not running", FirewallD is not running and none of its rules are in effect. In this case, the firewall can be ruled out as the cause of the blocked connection.

If the output is "running", FirewallD is currently active. Use the following commands to see which ports and services are currently open:

firewall-cmd --list-ports
firewall-cmd --list-services 

In my example, the output shows that the firewall only opens port 80/tcp, the ssh service (22/tcp) and the dhcpv6-client service. Port 5000/tcp, which the Docker container maps, is not open.

There are two solutions:

1. Turn off the FirewallD service:

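If you don't need a firewall, just stop the FirewallD service. Note that stopping it does not keep it from starting again at the next boot if the service is still enabled.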

systemctl stop firewalld.service

2. Add a policy to open the specified port to the outside world:

For example, to open port 5000/tcp to the outside, we can use the following commands:

firewall-cmd --add-port=5000/tcp --permanent
firewall-cmd --reload

If you only want to open the port temporarily, drop the "--permanent" option from the first command and skip the reload. The rule then takes effect immediately, but it is lost the next time FirewallD is reloaded or restarted.
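The check and the fix from Case 1 can be combined into one small script. This is a sketch under my own assumptions: the function names are made up, and the simple comparison does not understand port ranges such as 5000-5010/tcp.

```shell
# Return success if the port spec (for example 5000/tcp) appears in a
# whitespace-separated port list such as the output of
# `firewall-cmd --list-ports`.
port_listed() {
    # $1 = port spec, $2 = port list
    for entry in $2; do
        [ "$entry" = "$1" ] && return 0
    done
    return 1
}

# Open the port permanently, but only when FirewallD is running
# and the port is not already listed.
ensure_port_open() {
    [ "$(firewall-cmd --state 2>/dev/null)" = "running" ] || return 0
    if ! port_listed "$1" "$(firewall-cmd --list-ports)"; then
        firewall-cmd --permanent --add-port="$1" && firewall-cmd --reload
    fi
}

# Example, as root:
#   ensure_port_open 5000/tcp
```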

Case 2: Manually restart the FirewallD service of CentOS 7

FirewallD is a new component introduced in CentOS 7. Simply put, it is a wrapper around iptables that simplifies firewall configuration.

However, FirewallD and Docker do not get along very well. When FirewallD starts (or restarts), it removes the DOCKER chain from iptables, causing Docker to not work properly:

FirewallD

CentOS-7 introduced firewalld, which is a wrapper around iptables and can conflict with Docker.

When firewalld is started or restarted it will remove the DOCKER chain from iptables, preventing Docker from working properly.

When using Systemd, firewalld is started before Docker, but if you start or restart firewalld after Docker, you will have to restart the Docker daemon.

Excerpted from Docker's official document "CentOS - Docker Documentation"

In CentOS 7, there is no problem if systemd starts the Docker service automatically at boot, because the Docker unit file explicitly declares "After=firewalld.service", which ensures that the Docker daemon starts after FirewallD.

(Docker: If you can’t afford to offend me, can you afford to hide from me?)

However, whenever a user manually restarts the FirewallD service, FirewallD deletes the DOCKER chain that the Docker daemon wrote to iptables, so the Docker daemon has to be restarted manually to rebuild the DOCKER chain.
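Whether the DOCKER chain is still there can be checked before restarting anything. A minimal sketch, assuming the chain lives in the nat table and the service is named docker; the function names are my own.

```shell
# Read `iptables -S` style output on stdin and succeed if it declares
# a user-defined chain named DOCKER (a line that is exactly "-N DOCKER").
has_docker_chain() {
    grep -q -x -F -e '-N DOCKER'
}

# Restart the Docker daemon only when the chain is missing.
repair_docker_chain() {
    if iptables -t nat -S | has_docker_chain; then
        echo "DOCKER chain present"
    else
        echo "DOCKER chain missing, restarting docker"
        systemctl restart docker
    fi
}

# Example, as root:
#   repair_docker_chain
```

If the chain is present and external access still fails, a restarted FirewallD is probably not the cause, which is exactly what happened in my case.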

However, when I asked the other two R&D personnel in the group, they both said that they had not touched it. I checked the shell history but couldn't find any corresponding records.

That's strange. But after a period of investigation, I finally found a new reason:

Case 3: IP_FORWARD is not enabled

Because we had not been able to locate the problem, whenever we found that the registry was inaccessible, someone on our R&D team would log into the host machine and restart the Docker daemon service.

Before I logged into the host server and restarted the Docker daemon service, I suddenly remembered another problem I had encountered when using Docker before: if the host machine does not have the IP_FORWARD function enabled, the Docker container will output a warning message when it starts:

WARNING: IPv4 forwarding is disabled. Networking will not work.

In that case the container cannot reach the external network, and the ports it publishes cannot be reached from the outside.

Could this failure be caused by the host machine's IP_FORWARD function not being enabled?

sysctl net.ipv4.ip_forward 

Sure enough, the output shows that the IP_FORWARD function of the current system is disabled!
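The same check can be scripted so that the result is easy to read. A small sketch; the helper name is my own, and it reads the value from /proc, which holds the same setting that sysctl reports.

```shell
# Translate the kernel value (0 or 1) into a readable state.
forwarding_state() {
    case "$1" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

# /proc/sys/net/ipv4/ip_forward holds the same value that
# `sysctl -n net.ipv4.ip_forward` prints.
if [ -r /proc/sys/net/ipv4/ip_forward ]; then
    echo "IPv4 forwarding: $(forwarding_state "$(cat /proc/sys/net/ipv4/ip_forward)")"
fi
```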

But the problem is, when I started the container, everything was fine and there was no output. How come the IP_FORWARD function was disabled while I was using it?

Wait, the Docker daemon service configures iptables automatically when it starts. Does it also check the IP_FORWARD setting and temporarily enable it for me?

With this assumption, I manually restarted the Docker daemon service.

Sure enough, the Docker daemon service checks the system's IP_FORWARD setting during startup. If IP_FORWARD is currently disabled, it temporarily enables it for us. However, this temporary setting can be reset later by other things happening on the system...

Although there is no conclusive evidence yet to pinpoint the specific cause of this failure, I now strongly suspect that it was caused by restarting the network service. The affected server is running a web project that our R&D team is developing, and one of its functions is to modify the network card's IP address. After modifying the IP address, this function automatically calls the following command to restart the network service:

systemctl restart network.service

Restarting the network service invalidates the temporary IP_FORWARD setting that the Docker daemon service applied automatically.

In addition, because the program calls the command directly, no trace is left in the shell history.
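Since the shell history is no help here, one way to gather evidence is to record when the setting actually flips and compare that with the time of the network restart. This is only a sketch of the idea, not something I ran during the original investigation; the function names and the 5-second interval are my own choices.

```shell
# Print a timestamped line when the value changed; print nothing otherwise.
report_change() {
    # $1 = previous value, $2 = current value
    if [ "$1" != "$2" ]; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') net.ipv4.ip_forward: $1 -> $2"
    fi
}

# Poll the kernel setting every 5 seconds until interrupted.
watch_ip_forward() {
    previous=$(cat /proc/sys/net/ipv4/ip_forward)
    while sleep 5; do
        current=$(cat /proc/sys/net/ipv4/ip_forward)
        report_change "$previous" "$current"
        previous=$current
    done
}

# Example: run in the background and append to a log file of your choice.
#   watch_ip_forward >> ip_forward.log &
```

The logged timestamps can then be compared with the output of "journalctl -u network.service" to see whether the change lines up with a restart of the network service.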

The fix is very simple, just a single command:

echo 'net.ipv4.ip_forward = 1' >> /usr/lib/sysctl.d/50-default.conf

After the execution is complete, restart the server or use the following command to load the configuration from the file:

sysctl -p /usr/lib/sysctl.d/50-default.conf 

That's it.
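A variation worth considering: files under /usr/lib/sysctl.d belong to the distribution's packages and may be replaced by an update, so the setting can also go into its own file under /etc/sysctl.d. The sketch below assumes that approach; the function name and the file name 99-ip-forward.conf are my own choices.

```shell
# Write the setting to a dedicated drop-in file.
# $1 = target directory (defaults to /etc/sysctl.d, which needs root).
persist_ip_forward() {
    dir=${1:-/etc/sysctl.d}
    printf 'net.ipv4.ip_forward = 1\n' > "$dir/99-ip-forward.conf"
}

# Example, as root:
#   persist_ip_forward
#   sysctl --system    # reload all sysctl configuration files
```

The directory argument only exists so the function can be tried against a scratch directory first. If another configuration file explicitly sets net.ipv4.ip_forward to 0, that file still needs to be corrected as well.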

3. Summary

The Docker daemon service will help us adjust many configuration items when it starts, such as the IP_FORWARD configuration that caused the problem this time.

The Docker daemon enables IP_FORWARD because the default network mode of a Docker container (bridge mode) assigns a private IP to each container. If the container needs to communicate with the outside world, NAT is required, and NAT only works when IP_FORWARD is enabled. This also explains why, when IP_FORWARD is disabled, a container using bridge mode can neither reach the outside world nor be reached from it.
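The NAT rule behind a published port can be looked up on the host. The helper below is a sketch with a name of my own; it assumes the rules are listed with "iptables -t nat -S DOCKER" and that a published port appears as a rule containing "--dport" and "--to-destination", as in the illustrative sample in the comment.

```shell
# Read `iptables -t nat -S DOCKER` style rules on stdin and print the
# container address that traffic for the given host port is rewritten to.
# Illustrative sample rule (addresses will differ on a real host):
#   -A DOCKER ! -i docker0 -p tcp -m tcp --dport 5000 -j DNAT --to-destination 172.17.0.2:5000
dnat_target() {
    # $1 = published host port
    awk -v port="$1" '
        {
            hit = 0; target = ""
            for (i = 1; i < NF; i++) {
                if ($i == "--dport" && $(i + 1) == port) hit = 1
                if ($i == "--to-destination") target = $(i + 1)
            }
            if (hit && target != "") print target
        }'
}

# Example, as root:
#   iptables -t nat -S DOCKER | dnat_target 5000
```

The rule rewrites the destination to the container's private address, but the kernel only passes the rewritten packet on to the bridge when IP_FORWARD is enabled, which is why the rule alone is not enough.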

However, under Linux, for security reasons, IP_FORWARD is disabled by default. The Docker daemon service checks whether IP_FORWARD is enabled when it starts. If it is not, the Docker daemon silently enables it. This setting is only temporary, though: it is not persisted anywhere and can be reset by other commands, such as a restart of the network service.

Still, this incident taught me a small lesson: when problems arise, don't panic. Make bold assumptions based on experience, then verify them, so that you fix the root cause and not just the symptoms.
