How to customize Docker images using Dockerfile

Customizing images using Dockerfile

Image customization really means customizing the configuration and files added at each layer. If we write the commands for each layer's modification, installation, construction, and operation into a script, and use this script to build and customize the image, then the problems of non-repeatability, opaque image builds, and image size are all solved. This script is the Dockerfile.

Dockerfile is a text file that contains instructions. Each instruction builds a layer, so the content of each instruction describes how the layer should be built.

Here we take customizing the nginx image as an example and use a Dockerfile to do it.

In a blank directory, create a text file and name it Dockerfile:

$ mkdir mynginx
$ cd mynginx
$ touch Dockerfile

Its contents are:

FROM nginx
RUN echo '<h1>Hello, Docker!</h1>' > /usr/share/nginx/html/index.html

This Dockerfile is very simple, just two lines. There are two instructions involved, FROM and RUN.

Dockerfile instructions explained

FROM specifies the base image

A customized image must be based on some existing image and built on top of it. FROM specifies this base image, so FROM is a required instruction in a Dockerfile and must be the first one.

There are many high-quality official images on the Docker Store, including service images that can be used directly, such as nginx, redis, mongo, mysql, etc.; there are also some images that are convenient for developing, building, and running applications in various languages, such as node, openjdk, python, etc. You can find an image that best meets your ultimate goal and customize it as the base image.

If you cannot find an image for the service you need, the official images also include more basic operating system images, such as Ubuntu, Debian, CentOS, etc. These operating systems' software repositories give us broader room to extend.

In addition to selecting an existing image as the base image, Docker also has a special image called scratch. This image is a virtual concept and does not actually exist. It represents a blank image.

FROM scratch
...

If you use scratch as the base image, it means that you are not based on any image, and the instructions written next will exist as the first layer of the image.

It is not uncommon to copy executable files directly into the image without being based on any system, such as swarm and coreos/etcd. For statically compiled programs under Linux, there is no need for the operating system to provide runtime support. All the required libraries are already in the executable file, so directly FROM scratch will make the image size smaller. Many applications developed using the Go language use this method to create images, which is one of the reasons why some people think that Go is a language that is particularly suitable for container microservice architecture.
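
As an illustration, a minimal sketch of such an image for a statically compiled program might look like this (the binary name app is an assumption; CMD is explained later in this article):

FROM scratch
# app must be a statically compiled Linux executable with no external dependencies
COPY app /
CMD ["/app"]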

RUN Execute command

The RUN instruction is used to execute command line instructions. Due to the powerful capabilities of the command line, the RUN instruction is one of the most commonly used instructions when customizing images. There are two formats:

Shell format: RUN <command>, just like the command entered directly in the command line. The RUN instruction in the Dockerfile just written is in this format.

RUN echo '<h1>Hello, Docker!</h1>' > /usr/share/nginx/html/index.html

exec format: RUN ["executable file", "parameter 1", "parameter 2"], which is more like the format in a function call.
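
For instance, a sketch of the exec-format counterparts of shell commands; note that the exec format does not invoke a shell, so shell features such as output redirection with > only work if a shell is called explicitly:

# exec format: arguments are given as a JSON array and no shell is involved
RUN ["apt-get", "install", "-y", "nginx"]
# shell features such as redirection require invoking a shell explicitly
RUN ["sh", "-c", "echo '<h1>Hello, Docker!</h1>' > /usr/share/nginx/html/index.html"]
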
Since RUN can execute commands just like a shell script, can we write one RUN per command, as if we were writing a shell script? For example:

FROM debian:jessie
RUN apt-get update
RUN apt-get install -y gcc libc6-dev make
RUN wget -O redis.tar.gz "http://download.redis.io/releases/redis-3.2.5.tar.gz"
RUN mkdir -p /usr/src/redis
RUN tar -xzf redis.tar.gz -C /usr/src/redis --strip-components=1
RUN make -C /usr/src/redis
RUN make -C /usr/src/redis install

As mentioned before, each instruction in Dockerfile will create a layer, and RUN is no exception. The behavior of each RUN is the same as the process of manually creating an image: a new layer is created, these commands are executed on it, and after the execution is completed, the changes to this layer are committed to form a new image.

The above writing method creates a 7-layer image. This is completely unnecessary, and many things that are not needed at runtime end up in the image, such as the compilation environment and updated package lists. The result is a very bloated, many-layered image, which not only increases build and deployment time but is also prone to errors. This is a common mistake among Docker beginners (and one I have made myself).

Union FS has a maximum number of layers. For example, AUFS used to have a maximum of 42 layers, but now it has a maximum of 127 layers.

The correct way to write the above Dockerfile should be as follows:

FROM debian:jessie
RUN buildDeps='gcc libc6-dev make' \
  && apt-get update \
  && apt-get install -y $buildDeps \
  && wget -O redis.tar.gz "http://download.redis.io/releases/redis-3.2.5.tar.gz" \
  && mkdir -p /usr/src/redis \
  && tar -xzf redis.tar.gz -C /usr/src/redis --strip-components=1 \
  && make -C /usr/src/redis \
  && make -C /usr/src/redis install \
  && rm -rf /var/lib/apt/lists/* \
  && rm redis.tar.gz \
  && rm -r /usr/src/redis \
  && apt-get purge -y --auto-remove $buildDeps

First of all, all the previous commands have only one purpose, which is to compile and install the redis executable file. So there's no need to build up a lot of layers, it's just a one-layer thing. Therefore, instead of using many RUN commands to correspond to different commands one by one, only one RUN instruction is used, and && is used to string together the required commands. The previous 7 layers were simplified to 1 layer. When writing a Dockerfile, always remind yourself that you are not writing a shell script, but defining how each layer should be built.

Also, line breaks are done here for formatting purposes. Dockerfile supports the Shell-like command line-breaking method of adding \ at the end of the line and the comment format of # at the beginning of the line. Good formatting, such as line breaks, indents, comments, etc., will make maintenance and troubleshooting easier, which is a good habit.
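
For instance, a purely illustrative fragment using both conventions:

# Install build dependencies; the trailing backslash continues the command
RUN apt-get update \
    && apt-get install -y \
        gcc \
        make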

In addition, you can see that a cleanup command is added at the end of this set of commands, which deletes the software required for compilation and building, cleans up all downloaded and expanded files, and also cleans up the apt cache files. This is a very important step. As mentioned before, images are stored in multiple layers. The content of each layer will not be deleted in the next layer and will always follow the image. Therefore, when building an image, you must ensure that each layer only adds what is really needed, and anything irrelevant should be cleaned up.

One of the reasons why many people who are new to Docker create very bloated images is that they forget to clean up irrelevant files at the end of each layer of the build.

Build the image

OK, let's go back to the Dockerfile of the customized nginx image. Now that we understand the contents of this Dockerfile, let's build this image.

Execute in the directory where the Dockerfile file is located:

$ docker build -t nginx:v3 .
Sending build context to Docker daemon 2.048 kB
Step 1: FROM nginx
---> e43d811ce2f4
Step 2: RUN echo '<h1>Hello, Docker!</h1>' > /usr/share/nginx/html/index.html
---> Running in 9cdc27646c7b
---> 44aa4490ce2c
Removing intermediate container 9cdc27646c7b
Successfully built 44aa4490ce2c

From the output of the command, we can clearly see the image building process. In Step 2, as we said before, the RUN instruction starts a container 9cdc27646c7b, executes the required command, and finally submits this layer 44aa4490ce2c, and then deletes the used container 9cdc27646c7b.

Here we use the docker build command to build the image. Its format is:

docker build [options] <context path/URL/->

Here we specify the name of the final image with -t nginx:v3. After the build succeeds, we can run this image directly, and the result is that the homepage is changed to Hello, Docker!.
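
For example, one possible way to run and verify it (the container name web3 and the port mapping are assumptions):

$ docker run --name web3 -d -p 80:80 nginx:v3
$ curl http://localhost
<h1>Hello, Docker!</h1>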

Image Build Context

If you pay attention, you will see a . at the end of the docker build command. . denotes the current directory, and the Dockerfile is in the current directory, so many beginners think this path specifies where the Dockerfile is located, which is inaccurate. Comparing with the command format above, you may notice that what it actually specifies is the context path. So what is a context?

First we need to understand how docker build works. At runtime Docker is divided into the Docker engine (the server-side daemon) and client tools. The Docker engine exposes a set of REST APIs, called the Docker Remote API, and client tools such as the docker command interact with the Docker engine through this API to accomplish various functions. So although we appear to be executing Docker functions locally, everything is actually done on the server side (in the Docker engine) via remote calls. This client/server design also makes it easy to operate a Docker engine on a remote server.

When we build an image, not all customizations are done through the RUN instruction. We often need to copy some local files into the image, such as through the COPY instruction, ADD instruction, etc. The docker build command builds the image, but it is not actually built locally, but on the server side, that is, in the Docker engine. So in this client/server architecture, how can the server obtain local files?

This introduces the concept of context. When building, the user will specify the path to build the image context. After the docker build command knows this path, it will package all the contents under the path and upload it to the Docker engine. In this way, after the Docker engine receives this context package, it will expand it to obtain all the files needed to build the image.

If you write this in your Dockerfile:

COPY ./package.json /app/

This does not mean copying the package.json in the directory where the docker build command is executed, nor does it mean copying the package.json in the directory where the Dockerfile is located, but rather copying the package.json in the context directory.

Therefore, the source file paths in instructions such as COPY are all relative paths. This is also why beginners often ask why COPY ../package.json /app or COPY /opt/xxxx /app does not work, because these paths are out of the scope of the context and the Docker engine cannot get the files at these locations. If you really need those files, you should copy them into the context directory.

Now you can understand the . at the end of the command docker build -t nginx:v3 .: it specifies the context directory. The docker build command packages the contents of that directory and hands them to the Docker engine to help build the image.

If we observe the docker build output, we can actually see this process of sending context:

$ docker build -t nginx:v3 .
Sending build context to Docker daemon 2.048 kB
...

Understanding the build context is important for avoiding mistakes when building images. For example, some beginners found that COPY /opt/xxxx /app did not work, so they simply put the Dockerfile in the root directory of the hard disk. As a result, docker build tried to send dozens of GB of context, which was extremely slow and likely to fail. That is because this approach asks docker build to package the entire hard drive, which is obviously a usage error.

Generally speaking, you should place the Dockerfile in an empty directory or in the project root directory. If the required file does not exist in this directory, you should copy it. If there are things in the directory that you really don't want to pass to the Docker engine during the build, you can write a .dockerignore file using the same syntax as .gitignore. This file is used to remove things that do not need to be passed to the Docker engine as context.
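
A minimal .dockerignore might look like this (the entries are purely illustrative):

.git
node_modules
*.log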

So why do some people mistakenly think that . specifies the directory where the Dockerfile is located? This is because by default, if you do not specify a Dockerfile, the file named Dockerfile in the context directory will be used as the Dockerfile.

This is just the default behavior. In fact, the Dockerfile file name does not have to be Dockerfile, and it does not have to be located in the context directory. For example, you can use the -f ../Dockerfile.php parameter to specify a file as the Dockerfile.
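
For example, a hypothetical invocation combining -f with a context directory:

$ docker build -t myapp:v1 -f ../Dockerfile.php .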

Of course, people usually use the default file name Dockerfile and place it in the image build context directory.

Other docker build usage

Build directly from Git repo

As you may have noticed, docker build also supports building from a URL, for example, you can build directly from a Git repo:

$ docker build https://github.com/twang2218/gitlab-ce-zh.git#:8.14
Sending build context to Docker daemon 2.048 kB
Step 1: FROM gitlab/gitlab-ce:8.14.0-ce.0
8.14.0-ce.0: Pulling from gitlab/gitlab-ce
aed15891ba52: Already exists
773ae8583d14: Already exists
...

This command line specifies the Git repo to build from, with the default master branch and 8.14/ as the build directory. Docker will git clone the project, switch to the specified branch, and enter the specified directory to start the build.

Build with the given tarball

$ docker build http://server/context.tar.gz

If the URL given is not a Git repo but a tarball, the Docker engine will download the tarball, automatically decompress it, and use it as the context to start the build.

Read Dockerfile from standard input to build

docker build - < Dockerfile

or

cat Dockerfile | docker build -

If a text file is passed as standard input, it is treated as a Dockerfile and a build is started. Since this form reads the contents of the Dockerfile directly from the standard input, it has no context, so it is not possible to COPY local files into the image like other methods.

Read context tarball from standard input for building

$ docker build - < context.tar.gz

If the standard input file format is found to be gzip, bzip2, or xz, it will be treated as a context compressed package, which will be directly expanded and treated as the context to start building.

COPY copy files

Format:

  • COPY <source path>... <destination path>
  • COPY ["<source path1>",... "<destination path>"]

Like the RUN instruction, COPY has two formats, one resembling a command line and one resembling a function call. The COPY instruction copies the files/directories at <source path> in the build context to the <destination path> location in a new layer of the image. For example:

COPY package.json /usr/src/app/

There can be multiple <source path>s, and they may even contain wildcards. The wildcard patterns must conform to Go's filepath.Match rules, for example:

COPY hom* /mydir/
COPY hom?.txt /mydir/

<destination path> can be an absolute path inside the container, or a path relative to the working directory (which can be set with the WORKDIR instruction). The destination path does not need to be created in advance; if it does not exist, it will be created before the files are copied.
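
For example, a small sketch combining WORKDIR with a relative destination:

WORKDIR /app
# relative destination: package.json ends up at /app/package.json
COPY package.json ./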

It is also worth noting that COPY preserves the source files' metadata, such as read, write, and execute permissions and file modification times. This is useful for image customization, especially when build-related files are managed with Git.

ADD More advanced file copying

The format and nature of ADD are basically the same as COPY, but ADD adds some functionality on top of COPY. For example, <source path> can be a URL, in which case the Docker engine will try to download the file at that link and place it at <destination path>. The downloaded file's permissions are automatically set to 600; if that is not what you want, an extra RUN layer is needed to adjust the permissions. Likewise, if the downloaded file is an archive that needs to be unpacked, an extra RUN layer is needed to decompress it. In that case it is more reasonable to simply use a RUN instruction with wget or curl to download, handle permissions, decompress, and then clean up unneeded files. So this feature is not very practical and is not recommended.

If <source path> is a tar compressed file, and the compression format is gzip, bzip2 or xz, the ADD command will automatically decompress the compressed file to <destination path>.

In some cases, this automatic decompression function is very useful, such as in the official image Ubuntu:

FROM scratch
ADD ubuntu-xenial-core-cloudimg-amd64-root.tar.gz /
...

But in some cases, if we really want to copy a compressed file into it without decompressing it, we cannot use the ADD command.

Docker's official Dockerfile best practices document recommends using COPY whenever possible, because the semantics of COPY are very clear: it just copies files. ADD bundles in more complex behavior that may not be obvious. The most suitable occasion for ADD is the one just mentioned, where automatic extraction is needed.

Also note that the ADD instruction will invalidate the image build cache, which may make the image build slower.

Therefore, when choosing between the COPY and ADD instructions, you can follow this principle: use the COPY instruction for all file copies, and use ADD only when automatic decompression is required.
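
A small sketch of this principle (the file names are assumptions):

# ADD automatically extracts a recognized local archive into the destination
ADD rootfs.tar.gz /
# COPY just copies: the archive arrives intact
COPY data.tar.gz /tmp/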

CMD container startup command

The format of the CMD instruction is similar to that of RUN, and also has two formats:

  • Shell format: CMD <command>
  • exec format: CMD ["executable file", "parameter 1", "parameter 2"...]
  • The parameter list format is: CMD ["parameter 1", "parameter 2"...] . After specifying the ENTRYPOINT instruction, use CMD to specify specific parameters.

When introducing containers before, I said that Docker is not a virtual machine, but a container is a process. Since it is a process, when starting the container, you need to specify the program and parameters to be run. The CMD instruction is used to specify the startup command of the default container main process.

At runtime, you can specify a new command to replace the default command in the image settings. For example, the default CMD of the ubuntu image is /bin/bash. If we directly docker run -it ubuntu, we will directly enter bash. We can also specify other commands to run at runtime, such as docker run -it ubuntu cat /etc/os-release . This is to replace the default /bin/bash command with the cat /etc/os-release command, which outputs the system version information.

In terms of instruction format, the exec format is generally recommended. It is parsed as a JSON array, so be sure to use double quotes " rather than single quotes.

If the shell format is used, the actual command is wrapped as an argument to sh -c and executed. For example:

CMD echo $HOME

In actual implementation, it will be changed to:

CMD [ "sh", "-c", "echo $HOME" ]

This is why we can use environment variables, because these environment variables will be parsed by the shell. When talking about CMD, we have to mention the issue of foreground and background execution of applications in containers. This is a common confusion among beginners.

Docker is not a virtual machine. Applications in the container should be executed in the foreground, rather than using upstart/systemd to start background services like in virtual machines and physical machines. There is no concept of background services in the container.

Beginners usually write CMD as:

CMD service nginx start

Then it was found that the container exited immediately after execution. Even when using the systemctl command in the container, it turns out that it cannot be executed at all. This is because they do not understand the concepts of foreground and background, do not distinguish the differences between containers and virtual machines, and still understand containers from the perspective of traditional virtual machines.

For a container, its startup program is the container application process. The container exists for the main process. When the main process exits, the container loses its meaning of existence and thus exits. Other auxiliary processes are not something it needs to care about.

Using the service nginx start command means you are hoping the init system will start the nginx service as a background daemon. As mentioned earlier, CMD service nginx start is translated to CMD ["sh", "-c", "service nginx start"], so the main process is actually sh. When the service nginx start command finishes, sh also finishes, and since sh is the main process, the container naturally exits.

The correct approach is to execute the nginx binary directly and require it to run in the foreground. For example:

CMD ["nginx", "-g", "daemon off;"]

ENTRYPOINT

The format of ENTRYPOINT is the same as that of RUN instruction, which is divided into exec format and shell format.

The purpose of ENTRYPOINT is the same as CMD: to specify the container's startup program and parameters. ENTRYPOINT can also be replaced at runtime, though it is slightly more involved than with CMD; it must be specified via the docker run option --entrypoint.
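
For example, a hedged sketch of overriding it at runtime:

$ docker run --entrypoint /bin/echo ubuntu:16.04 hello
hello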

When ENTRYPOINT is specified, the meaning of CMD changes. It no longer runs the command directly, but passes the contents of CMD as parameters to the ENTRYPOINT instruction. In other words, when it is actually executed, it becomes:

<ENTRYPOINT> "<CMD>"

So why do we need ENTRYPOINT when we have CMD? Is there any benefit of this <ENTRYPOINT> "<CMD>"? Let's look at a few scenarios.

Scenario 1: Use the image like a command

Suppose we need an image for finding out our current public IP; we can first implement it with CMD:

FROM ubuntu:16.04
RUN apt-get update \
&& apt-get install -y curl \
&& rm -rf /var/lib/apt/lists/*
CMD [ "curl", "-s", "http://ip.cn" ]

After building the image with docker build -t myip ., whenever we need to query the current public IP, we only need to execute:

$ docker run myip

Current IP: 61.148.226.66 From: Beijing Unicom

Well, it seems that we can use the image directly as a command, but commands always have parameters. What if we want to add parameters? For example, from the CMD above, we can see that the actual command is curl. If we want to display HTTP header information, we need to add the -i parameter. So can we just add the -i parameter to docker run myip?

$ docker run myip -i
docker: Error response from daemon: invalid header field value "oci runtime error: con
tainer_linux.go:247: starting container process caused \"exec: \\\"-i\\\": executable
file not found in $PATH\"\n".

We can see an error message saying the executable file cannot be found: executable file not found. As we said before, the command that follows the image name replaces the default value of CMD at runtime. So the -i here replaces the original CMD rather than being appended after the original curl -s http://ip.cn. And -i is not a command at all, so it cannot be found.

If we want to add the -i parameter, we must re-enter the command in its entirety:

$ docker run myip curl -s http://ip.cn -i

This is obviously not a good solution, and using ENTRYPOINT can solve this problem. Now we use ENTRYPOINT again to implement this image:

FROM ubuntu:16.04
RUN apt-get update \
  && apt-get install -y curl \
  && rm -rf /var/lib/apt/lists/*
ENTRYPOINT [ "curl", "-s", "http://ip.cn" ]

This time we will try to use docker run myip -i directly:

$ docker run myip

Current IP: 61.148.226.66 From: Beijing Unicom

$ docker run myip -i
HTTP/1.1 200 OK
Server: nginx/1.8.0
Date: Tue, 22 Nov 2016 05:12:40 GMT
Content-Type: text/html; charset=UTF-8
Vary: Accept-Encoding
X-Powered-By: PHP/5.6.24-1~dotdeb+7.1
X-Cache: MISS from cache-2
X-Cache-Lookup: MISS from cache-2:80
X-Cache: MISS from proxy-2_6
Transfer-Encoding: chunked
Via: 1.1 cache-2:80, 1.1 proxy-2_6:8006
Connection: keep-alive

Current IP: 61.148.226.66 From: Beijing Unicom

As you can see, this time it was successful. This is because when ENTRYPOINT exists, the content of CMD will be passed as a parameter to ENTRYPOINT, and here -i is the new CMD, so it will be passed as a parameter to curl, thus achieving the desired effect.

Scenario 2: Preparation before running the application

Starting a container is to start the main process, but sometimes, some preparation is required before starting the main process. For example, a database like MySQL may require some database configuration and initialization work, which must be completed before the final MySQL server is running.

In addition, you may want to avoid using the root user to start the service to improve security. Before starting the service, you need to perform some necessary preparations as the root user, and finally switch to the service user to start the service. Alternatively, in addition to services, other commands can still be executed using the root identity to facilitate debugging, etc.

These preparations have nothing to do with the container's CMD: whatever the CMD is, a preprocessing step needs to run first. In this case, you can write a script and put it in ENTRYPOINT. The script receives the parameters passed to it (that is, <CMD>) as the command and executes it at the end of the script. For example, this is what the official redis image does:

FROM alpine:3.4
...
RUN addgroup -S redis && adduser -S -G redis redis
...
ENTRYPOINT ["docker-entrypoint.sh"]
EXPOSE 6379
CMD [ "redis-server" ]
 

You can see that a redis user is created for the redis service, and at the end the ENTRYPOINT is set to the docker-entrypoint.sh script.

#!/bin/sh
...
# allow the container to be started with `--user`
if [ "$1" = 'redis-server' -a "$(id -u)" = '0' ]; then
  chown -R redis .
  exec su-exec redis "$0" "$@"
fi
exec "$@"

The script's behavior depends on the content of CMD: if it is redis-server, it switches to the redis user identity to start the server; otherwise it still executes as root. For example:

$ docker run -it redis id
uid=0(root) gid=0(root) groups=0(root)

ENV sets environment variables

There are two formats:

  1. ENV <key> <value>
  2. ENV <key1>=<value1> <key2>=<value2>...

This instruction is very simple, it just sets the environment variables. Whether it is other subsequent instructions, such as RUN, or applications at runtime, you can directly use the environment variables defined here.

ENV VERSION=1.0 DEBUG=on \
  NAME="Happy Feet"

This example demonstrates how to wrap lines and how to enclose values containing spaces in double quotes, which is consistent with the behavior under the shell.

Once the environment variable is defined, it can be used in subsequent instructions. For example, in the official node image Dockerfile, there is code similar to this:

ENV NODE_VERSION 7.2.0
RUN curl -SLO "https://nodejs.org/dist/v$NODE_VERSION/node-v$NODE_VERSION-linux-x64.tar.xz" \
  && curl -SLO "https://nodejs.org/dist/v$NODE_VERSION/SHASUMS256.txt.asc" \
  && gpg --batch --decrypt --output SHASUMS256.txt SHASUMS256.txt.asc \
  && grep " node-v$NODE_VERSION-linux-x64.tar.xz\$" SHASUMS256.txt | sha256sum -c - \
  && tar -xJf "node-v$NODE_VERSION-linux-x64.tar.xz" -C /usr/local --strip-components=1 \
  && rm "node-v$NODE_VERSION-linux-x64.tar.xz" SHASUMS256.txt.asc SHASUMS256.txt \
  && ln -s /usr/local/bin/node /usr/local/bin/nodejs

Here, the environment variable NODE_VERSION is defined first, and then in the RUN layer, $NODE_VERSION is used multiple times to customize operations. As you can see, when you upgrade the image build version in the future, you only need to update 7.2.0, which makes Dockerfile build maintenance easier.

The following directives support environment variable expansion:
ADD , COPY , ENV , EXPOSE , LABEL , USER , WORKDIR , VOLUME , STOPSIGNAL , ONBUILD .

From this command list, you can feel that environment variables can be used in many places and are very powerful. Through environment variables, we can use one Dockerfile to create more images by using different environment variables.

ARG build parameters

Format: ARG <parameter name>[=<default value>]

Build arguments have the same effect as ENV: both set environment variables. The difference is that environment variables set by ARG exist only in the build environment and will not be present when the container runs. However, do not use ARG to hold secrets such as passwords, because docker history can still reveal all the values.

The ARG instruction in a Dockerfile defines the parameter name and its default value. This default value can be overridden in the build command docker build using --build-arg <parameter name>=<value>.

In versions prior to 1.13, the parameter names passed with --build-arg had to be defined with ARG in the Dockerfile; in other words, every parameter specified with --build-arg had to be used in the Dockerfile, and an unused parameter caused an error and aborted the build. Starting with 1.13, this strict restriction has been relaxed: instead of exiting with an error, the build displays a warning and continues. This helps when a CI system builds different Dockerfiles with the same build command, avoiding the need to tailor the build command to each Dockerfile's contents.
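
A minimal sketch of how ARG and --build-arg fit together (the parameter name and values are assumptions):

FROM ubuntu:16.04
ARG APP_VERSION=1.0
# $APP_VERSION is available while building, but not in the running container
RUN echo "built version $APP_VERSION" > /version.txt

The default could then be overridden at build time with docker build --build-arg APP_VERSION=2.0 -t myapp .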

VOLUME defines anonymous volumes

The format is:

  • VOLUME ["<Path 1>", "<Path 2>"...]
  • VOLUME <path>

As we said before, when the container is running, the container storage layer should be kept free of write operations as much as possible. For database applications that need to save dynamic data, their database files should be saved in volumes. In order to prevent users from forgetting to mount the directory where dynamic files are stored as a volume during runtime, we can specify certain directories to be mounted as anonymous volumes in advance in the Dockerfile. In this way, if the user does not specify the mount during runtime, the application can also run normally without writing a large amount of data to the container storage layer.

VOLUME /data

The /data directory here will be automatically mounted as an anonymous volume at runtime, and any information written to /data will not be recorded in the container storage layer, thus keeping the container storage layer stateless. Of course, this mount setting can be overridden at runtime. For example:

docker run -d -v mydata:/data xxxx

In this command line, the named volume mydata is mounted to the /data location, replacing the anonymous volume mount configuration defined in the Dockerfile.

EXPOSE declares a port

The format is EXPOSE <port1> [<port2>...].

The EXPOSE instruction declares the port on which the runtime container provides its service. This is only a declaration; the application will not open that port at runtime just because of it. Writing such a declaration in the Dockerfile has two benefits. One is to help users of the image understand which port the image's service listens on, making port mapping easier to configure. The other is that when random port mapping is used at runtime, that is, with docker run -P, the EXPOSEd ports are mapped to random host ports automatically.

Additionally, there was a special use in early Docker versions. Previously, all containers ran in the default bridge network, so all containers could reach each other directly, which posed certain security issues. Hence the Docker engine parameter --icc=false: when it is specified, containers cannot access each other by default, unless they are connected with the --link option, and even then only the ports declared with EXPOSE in the image are reachable. This --icc=false usage has largely disappeared since the introduction of docker network, through which custom networks make interconnection and isolation between containers easy.

Be careful to distinguish EXPOSE from the runtime option -p <host port>:<container port>. -p maps a host port to a container port, exposing the container's service to the outside world, whereas EXPOSE merely declares which port the container intends to use and does not automatically map any port on the host.
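
For example, a sketch of random mapping with -P (the assigned host port will differ from run to run):

$ docker run -d -P --name web nginx
$ docker port web
80/tcp -> 0.0.0.0:32768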

WORKDIR specifies the working directory

The format is WORKDIR <working directory path>.

Use the WORKDIR instruction to specify the working directory (the current directory): from then on, the working directory of each subsequent layer is the specified directory. If the directory does not exist, WORKDIR creates it for you.

As mentioned earlier, a common mistake made by some beginners is to write Dockerfile as if it were a shell script. This misunderstanding may also lead to the following errors:

RUN cd /app
RUN echo "hello" > world.txt

If you build the image using this Dockerfile and run it, you will find that the /app/world.txt file cannot be found, or its content is not hello. The reason is actually very simple. In Shell, two consecutive lines are the same process execution environment, so the memory state modified by the previous command will directly affect the next command; in Dockerfile, the execution environment of these two lines of RUN commands is completely different, and they are two completely different containers. This is an error caused by not understanding the concept of layered storage in Dockerfile.

As mentioned before, each RUN starts a container, executes the command, and then commits the storage-layer file changes. The first layer's RUN cd /app merely changes the working directory of the current process, a change in memory only, and results in no file change at all. By the second layer, a brand-new container is started, which has nothing to do with the first layer's container and naturally cannot inherit in-memory changes from the previous layer's build.

Therefore, if you need to change the location of the working directory at each subsequent level, you should use the WORKDIR instruction.
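
The two-line example above, corrected with WORKDIR:

WORKDIR /app
RUN echo "hello" > world.txt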

USER specifies the current user

Format: USER <username>

The USER directive is similar to WORKDIR in that it changes the state of the environment and affects subsequent layers. WORKDIR changes the working directory, and USER changes the identity of the subsequent layers that execute commands such as RUN, CMD, and ENTRYPOINT. Of course, like WORKDIR, USER only helps you switch to a specified user. This user must have been created in advance, otherwise the switch cannot be made.

RUN groupadd -r redis && useradd -r -g redis redis
USER redis
RUN [ "redis-server" ]
 

If you want to switch identity partway through a script that is executed as root, for example to run a service process as an established user while still allowing other commands to be run as root for debugging, do not use su or sudo: they require complicated configuration and often fail in environments without a TTY. gosu is recommended instead.

# Create the redis user, and use gosu to execute commands as another user
RUN groupadd -r redis && useradd -r -g redis redis
# Download gosu
RUN wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/1.7/gosu-amd64" \
  && chmod +x /usr/local/bin/gosu \
  && gosu nobody true
# Set CMD, and execute it as the other user
CMD [ "exec", "gosu", "redis", "redis-server" ]
 

HEALTHCHECK

Format:

  • HEALTHCHECK [options] CMD <command> : Set the command to check the health of the container
  • HEALTHCHECK NONE: If the base image has health check instructions, use this line to block its health check instructions

The HEALTHCHECK instruction tells Docker how to determine whether the status of the container is normal. This is a new instruction introduced in Docker 1.12.

Before the HEALTHCHECK instruction, the Docker engine could only determine whether the container was in an abnormal state by whether the main process in the container had exited. This is fine in many cases, but if the program enters a deadlock or infinite loop, the application process does not exit, but the container can no longer provide services. Before 1.12, Docker would not detect this state of the container and would not reschedule it, resulting in some containers being unable to provide services but still accepting user requests.

Since 1.12, Docker has provided the HEALTHCHECK instruction, which specifies a line of command to determine whether the service status of the container's main process is still normal, thereby more realistically reflecting the actual status of the container.

When an image has the HEALTHCHECK instruction specified, containers started from it have an initial health status of starting. After a health check succeeds, the status changes to healthy; if the check fails a certain number of times in a row, it changes to unhealthy.

HEALTHCHECK supports the following options:

  • interval=<interval>: the interval between two health checks, the default is 30 seconds;
  • timeout=<duration>: The timeout period for the health check command to run. If this time is exceeded, the health check is considered a failure. The default time is 30 seconds.
  • retries=<number>: After a specified number of consecutive failures, the container status is considered unhealthy. The default is 3 times.

Like CMD and ENTRYPOINT, HEALTHCHECK can only appear once. If multiple entries are given, only the last one will take effect.

The command following HEALTHCHECK [option] CMD has the same format as ENTRYPOINT, which can be divided into shell format and exec format. The return value of the command determines whether the health check is successful or not: 0: Success; 1: Failure; 2: Reserved, do not use this value.

Suppose we have an image that is a simple web service, and we want to add a health check to determine whether the web service is working properly. We can use curl to help determine whether the web service is working properly. The HEALTHCHECK of the Dockerfile can be written as follows:

FROM nginx
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
HEALTHCHECK --interval=5s --timeout=3s \
  CMD curl -fs http://localhost/ || exit 1
 

Here we set a check every 5 seconds (the interval is very short for the purpose of experimentation, but it should be relatively long in reality). If the health check command does not respond for more than 3 seconds, it is considered a failure, and curl -fs http://localhost/ || exit 1 is used as the health check command.

Use docker build to build this image:

$ docker build -t myweb:v1 .

After building, we start a container:

$ docker run -d --name web -p 80:80 myweb:v1

After running the image, you can see the initial status (health: starting) through docker container ls:

$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
03e28eb00bd0 myweb:v1 "nginx -g 'daemon off" 3 seconds ago Up 2 seconds (health: starting) 80/tcp, 443/tcp web

After waiting for a few seconds, run docker container ls again and you will see the health status change to (healthy):

$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
03e28eb00bd0 myweb:v1 "nginx -g 'daemon off" 18 seconds ago Up 16 seconds (health: healthy) 80/tcp, 443/tcp web

If the health check fails continuously for more than the number of retries, the status changes to (unhealthy).

To help with troubleshooting, the output of the health check commands (both stdout and stderr) is stored in the health status and can be viewed with docker inspect.

$ docker inspect --format '{{json .State.Health}}' upbeat_allen | python -m json.tool
{
  "FailingStreak": 0,
  "Log": [
    {
      "End": "2018-06-14T04:55:37.477730277-04:00",
      "ExitCode": 0,
      "Output": "<!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\n body {\n width: 35em;\n margin: 0 auto;\n font-family: Tahoma, Verdana, Arial, sans-serif;\n }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n",
      "Start": "2018-06-14T04:55:37.408045977-04:00"
    },
    {
      "End": "2018-06-14T04:55:42.553816257-04:00",
      "ExitCode": 0,
      "Output": "<!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\n body {\n width: 35em;\n margin: 0 auto;\n font-family: Tahoma, Verdana, Arial, sans-serif;\n }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n",
      "Start": "2018-06-14T04:55:42.480940888-04:00"
    },
    {
      "End": "2018-06-14T04:55:47.631694051-04:00",
      "ExitCode": 0,
      "Output": "<!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\n body {\n width: 35em;\n margin: 0 auto;\n font-family: Tahoma, Verdana, Arial, sans-serif;\n }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n",
      "Start": "2018-06-14T04:55:47.557214953-04:00"
    },
    {
      "End": "2018-06-14T04:55:52.708195002-04:00",
      "ExitCode": 0,
      "Output": "<!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\n body {\n width: 35em;\n margin: 0 auto;\n font-family: Tahoma, Verdana, Arial, sans-serif;\n }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n",
      "Start": "2018-06-14T04:55:52.63499573-04:00"
    },
    {
      "End": "2018-06-14T04:55:57.795117794-04:00",
      "ExitCode": 0,
      "Output": "<!DOCTYPE html>\n<html>\n<head>\n<title>Welcome to nginx!</title>\n<style>\n body {\n width: 35em;\n margin: 0 auto;\n font-family: Tahoma, Verdana, Arial, sans-serif;\n }\n</style>\n</head>\n<body>\n<h1>Welcome to nginx!</h1>\n<p>If you see this page, the nginx web server is successfully installed and\nworking. Further configuration is required.</p>\n\n<p>For online documentation and support please refer to\n<a href=\"http://nginx.org/\">nginx.org</a>.<br/>\nCommercial support is available at\n<a href=\"http://nginx.com/\">nginx.com</a>.</p>\n\n<p><em>Thank you for using nginx.</em></p>\n</body>\n</html>\n",
      "Start": "2018-06-14T04:55:57.714289056-04:00"
    }
  ],
  "Status": "healthy"
}

ONBUILD: making wedding clothes for others

Format: ONBUILD <other instructions>.

ONBUILD is a special instruction. The instructions that follow it, such as RUN, COPY, etc., are not executed when the current image is built; they are only executed when a new image is built based on the current image.

All the other instructions in a Dockerfile exist to customize the current image; only ONBUILD exists to help others customize images built on top of yours.

Suppose we want to create an image of an application written in Node.js. We all know that Node.js uses npm for package management, and all dependencies, configurations, startup information, etc. will be placed in the package.json file. After getting the program code, you need to perform npm install first to obtain all the required dependencies. Then you can start the application via npm start. Therefore, a Dockerfile is usually written like this:

FROM node:slim
RUN mkdir /app
WORKDIR /app
COPY ./package.json /app
RUN [ "npm", "install" ]
COPY . /app/
CMD [ "npm", "start" ]
 

Put this Dockerfile in the root directory of the Node.js project. After building the image, you can directly use it to start the container. But what if we have a second Node.js project that does something similar? OK, then copy this Dockerfile to the second project. What if there is a third project? Copy it again? The more copies of a file there are, the more difficult it is to control its versions. Let's continue looking at the maintenance issues in this scenario.

If during the development of the first Node.js project, a problem is found in the Dockerfile, such as a typo or the need to install additional packages, the developer can fix the Dockerfile and build it again, and the problem will be solved. The first project is fine, but what about the second one? Even though the original Dockerfile was copied and pasted from the first project, just because the first project fixed their Dockerfile, the second project's Dockerfile will not be automatically fixed.

So can we make a base image, and have each project use it? That way, when the base image is updated, the projects do not need to synchronize Dockerfile changes; rebuilding is enough to inherit the base image's updates. OK, let's see what that looks like. The above Dockerfile then becomes:

FROM node:slim
RUN mkdir /app
WORKDIR /app
CMD [ "npm", "start" ]
 

Here we take the project-related build instructions out and move them into the subprojects. Assuming the name of this base image is my-node, the Dockerfile in each project becomes:

FROM my-node
COPY ./package.json /app
RUN [ "npm", "install" ]
COPY . /app/
 

When the base image changes, each project will use this Dockerfile to rebuild the image and will inherit the updates of the base image.

So, is the problem solved? No. To be precise, only half of the problem has been solved. What if something in this Dockerfile needs to be adjusted? For example, npm install requires adding some parameters, what should I do? It is impossible to put this line of RUN into the base image, because it involves the ./package.json of the current project. Do we have to modify them one by one? Therefore, making a basic image in this way only solves the problem of changes in the first four instructions of the original Dockerfile, while changes in the last three instructions cannot be handled at all.

ONBUILD can solve this problem. Let's rewrite the Dockerfile of the base image using ONBUILD:

FROM node:slim
RUN mkdir /app
WORKDIR /app
ONBUILD COPY ./package.json /app
ONBUILD RUN [ "npm", "install" ]
ONBUILD COPY . /app/
CMD [ "npm", "start" ]

This time we return to the original Dockerfile, but prefix the project-related instructions with ONBUILD, so those three lines are not executed when building the base image. The Dockerfile for each project then becomes simply:

FROM my-node

Yes, that's the only line. When you use this one-line Dockerfile to build an image in each project directory, the three ONBUILD lines of the previous base image will start to execute, successfully copy the code of the current project into the image, and execute npm install for this project to generate an application image.
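
A sketch of the resulting two-step workflow (directory and tag names are assumptions):

$ docker build -t my-node .            # build the base image; ONBUILD lines are recorded, not executed
$ cd ../project1
$ docker build -t project1-app .       # FROM my-node: the ONBUILD lines execute now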

Reference: https://github.com/yeasy/docker_practice

The above is the full content of this article. I hope it will be helpful for everyone’s study. I also hope that everyone will support 123WORDPRESS.COM.
