Since I started working on Vulhub in 2017, I have been struggling with a troublesome problem: when writing a Dockerfile, how can I reduce the size of the resulting image?

1. Using Alpine Linux

Alpine Linux is a Linux distribution based on BusyBox and musl libc. Its biggest advantage is its small size: a bare Alpine base image is only 2.67MB compressed. Many official Docker images have Alpine variants, such as PHP. Comparing the two, the Alpine variant is about 1/5 the size of the regular image.

However, on Docker Hub, most images do not have Alpine variants, for example MySQL and PHP-Apache. If we need to build on these environments, we have to write Alpine versions ourselves or find third-party images.

Another disadvantage of Alpine is that it uses musl libc in place of the traditional glibc. When compiling software, you may run into unpredictable problems that cost a lot of unnecessary time.

2. Install only minimal dependencies

Package managers such as apt-get, yum, and apk are tools we must use when building images. Bare base images usually lack tools such as wget, curl, git, and gcc, so we have to install them manually.

Let's take apt as an example. When installing software, apt-get accepts the option --no-install-recommends, which skips packages that are recommended but not strictly required. This reduces the image size to some extent, but the side effect is that the installed software may lack some functionality. For example, wget will then be unable to verify the server certificate, and the download fails with an error.

Therefore, our general practice is to add this option and then explicitly install the extra packages we actually need:

    apt-get install --no-install-recommends wget ca-certificates

3. Clean up the mess for apt

Some tools are only used during the build phase. I don't want them to take up precious image space, so I delete these intermediate dependencies once the build is finished. Let's take apt as an example.
After installing and using these tools, we need to uninstall them and clean up the package lists that apt downloaded.
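A minimal sketch of that cleanup, assuming a Debian-based image. It reuses the two commands that appear in the section 4 Dockerfile (apt-get purge with auto-remove, and deleting /var/lib/apt/lists). The function names and the optional root-directory argument are hypothetical, added only so the sketch can be tried outside a container.

```shell
#!/bin/sh
# Sketch only: the function names and the optional ROOT argument are
# illustrative assumptions, not part of apt itself.

# Remove the package index files downloaded by `apt-get update`.
# ROOT defaults to empty, so inside a container the real
# /var/lib/apt/lists directory is cleaned.
clean_apt_lists() {
    root=${1:-}
    rm -rf "$root"/var/lib/apt/lists/*
}

# Uninstall build-only packages, plus any dependencies that nothing
# else needs any more.
purge_build_deps() {
    apt-get purge -y --auto-remove "$@"
}

# Typical use at the end of a build step:
#   purge_build_deps gcc wget && clean_apt_lists
```

Both steps only save space if they run in the same RUN instruction that installed the packages.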
In this process, we run into a tricky question: which dependencies are "unnecessary"?

For example, when compiling PHP we may use three things: wget, libxml, and gcc. All three must be installed before compiling PHP. After the compilation is finished we can uninstall wget and gcc, but not libxml, because libxml is a shared library that PHP depends on at run time. If we uninstall it, PHP fails because the shared library cannot be found:

    root@8eab53da8d5b:/# php -v
    php: error while loading shared libraries: libxml2.so.2: cannot open shared object file: No such file or directory

So, is there a convenient way to automatically keep the packages that provide shared libraries and delete everything else? Of course there is.

A simple way is to traverse the newly compiled executables, use the ldd command to list the shared library files they depend on, and look up the package that owns each file. The resulting packages are all the shared libraries that PHP depends on. Then we use apt-mark manual to mark these packages as manually installed, so that apt does not treat them as removable. After that, we can automatically uninstall the remaining unused packages. The complete shell script is as follows:

    find /usr/local -type f -executable -exec ldd '{}' ';' \
        | awk '/=>/ { print $(NF-1) }' \
        | sort -u \
        | xargs -r dpkg-query --search \
        | cut -d: -f1 \
        | sort -u \
        | xargs -r apt-mark manual \
    ; \
    apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false;

4. Try to install and uninstall intermediate dependencies in one step

A Docker image is a layered cake made up of layers. For an image built from a Dockerfile, the data in each of these layers is stored in the image, even if a later layer deletes files that an earlier layer added.
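To make the layering point concrete, here is a toy model in shell. It is not how Docker stores images internally: each "layer" is just a directory, and the "image size" is the total number of bytes across all layer directories. The function names are hypothetical. The `.wh.` marker mirrors the whiteout convention that OCI image layers use to record a deletion.

```shell
#!/bin/sh
# Toy model only: every "layer" is a plain directory and the "image size"
# is the sum of the bytes stored in all of them.

# total_bytes DIR...: print the number of bytes held by regular files
# under the given directories.
total_bytes() {
    find "$@" -type f -exec cat {} + | wc -c | tr -d ' '
}

demo_layers() {
    work=$(mktemp -d)
    mkdir "$work/layer1" "$work/layer2"

    # Layer 1 adds a 1000-byte file.
    printf '%1000s' '' > "$work/layer1/sample.dat"

    # Layer 2 "deletes" it. A later layer cannot modify an earlier one,
    # so the deletion is recorded as an empty marker file instead.
    : > "$work/layer2/.wh.sample.dat"

    # The 1000 bytes in layer 1 are still part of the total.
    total_bytes "$work/layer1" "$work/layer2"
    rm -rf "$work"
}
```

Calling demo_layers prints the combined size of both layers, which still includes the file that the second layer removed.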
For example, take the following Dockerfile:

    FROM alpine:3.12

    RUN truncate -s 50M /sample.dat
    RUN rm -rf /sample.dat

If we build it and check the size of the resulting image, it is 58MB. In comparison, the plain alpine:3.12 image is only 5.57MB, which means that even though we deleted /sample.dat in the last step, it still takes up space in the image.

Therefore, when deleting the "intermediate dependencies" mentioned above, we need to put the three parts (installation, use, and uninstallation) in a single step to ensure that the space is actually released. For example:

    FROM debian:buster

    RUN apt-get update \
        && apt-get install -y gcc \
        && gcc ... \
        && apt-get purge -y --autoremove gcc \
        && rm -rf /var/lib/apt/lists/*

5. Multi-stage builds

Docker 17.05 introduced the concept of multi-stage builds, which greatly simplifies all of the operations above.

In simple terms, multi-stage builds allow us to divide the build of a Docker image into multiple "stages". For example, in the common case of compiling software, we can give compilation its own stage and, once it is finished, copy the binary directly into a new base image. The biggest advantage is that the second image no longer contains any of the intermediate dependencies used in the compilation stage, which keeps it clean.

Take the most common kind of Java project as an example. When building the jar package we need tools such as the JDK and Maven, but at run time we only need a JRE. Comparing the sizes of the two images, the difference is more than double.
Taking the Shiro 1.2.4 environment in Vulhub as an example, the Dockerfile uses two stages:

    FROM maven:3-jdk-8 AS builder

    LABEL MAINTAINER="phithon <[email protected]>"

    COPY ./code/ /usr/src/

    WORKDIR /usr/src

    RUN cd /usr/src; \
        mvn -U clean package -Dmaven.test.skip=true

    FROM openjdk:8u102-jre

    LABEL MAINTAINER="phithon <[email protected]>"

    COPY --from=builder /usr/src/target/shirodemo-1.0-SNAPSHOT.jar /shirodemo-1.0-SNAPSHOT.jar

    EXPOSE 8080

    CMD ["java", "-jar", "/shirodemo-1.0-SNAPSHOT.jar"]

The first stage, named builder, compiles the project with Maven. The second stage copies the resulting jar into a JRE image. In the end, two images are left on the machine: one is the builder, and the other is the Shiro 1.2.4 environment we actually need. The latter can be used independently by any other user, while the former can be deleted directly.

For users, this means we no longer need to worry about how to delete intermediate dependencies after compiling in order to make the image smaller. Any dependencies used in the first stage are simply not carried over into the final production image.

However, multi-stage builds still suffer from the shared library problem described above. If we copy only the executable file out of the compilation stage, it will still fail in the new environment because its shared libraries cannot be found. Therefore, I personally feel that multi-stage builds are best suited to languages that are cross-platform or statically compiled, such as Java and Go, and are still not friendly to projects with heavier native dependencies, such as C and Python.

6. Use the slim version of the image

Careful readers may have noticed that the official Debian image on Docker Hub has a slim version, which is less than half the size of the default version. As the name suggests, slim is a slimmed-down variant of the image.

Some higher-level images are built on the slim version of Debian, such as Python. If we develop a Python project, we can use the slim variant of the Python image.

To sum up, the six methods do not interfere with each other, and we can use them at the same time.
But the fifth one, multi-stage builds, will be the mainstream method in the future.