1. Overview

A Docker image is built up from layers. These layers are stored under /var/lib/docker/<storage-driver>/. Several storage drivers exist, such as AUFS, OverlayFS, VFS, and Btrfs. You can check which storage driver is in use with the docker info command (the author's system is CentOS 7.4).

Ubuntu-like systems usually use AUFS by default, while the CentOS 7.1+ series uses OverlayFS. This article explains the image storage principle and storage structure with OverlayFS as the storage driver.

2. Introduction to OverlayFS

OverlayFS is a stacked file system that is built on top of other file systems (such as ext4 and xfs). It does not take part in partitioning the disk space itself; it only "merges" different directories of the underlying file system and presents the result to the user. This technique is called union mounting. Compared with AUFS, OverlayFS is faster and simpler to implement.

Docker provides two storage drivers for OverlayFS: overlay and overlay2. overlay2 is an improvement on overlay and uses inodes more efficiently. However, overlay2 has environmental requirements: Docker version 17.06.02+, and a host file system in ext4 or xfs format.

Union mounting

OverlayFS performs a union mount using three kinds of directories: the lower directory, the upper directory, and the work directory. There can be multiple lower directories. The work directory is a working directory used by OverlayFS itself: its content is cleared at mount time and is not visible to users while the file system is in use. The unified view presented to users once the union mount is complete is called the merged directory. The following mount example demonstrates how this works.
The mount command mounts OverlayFS with the following syntax:

mount -t overlay overlay -o lowerdir=lower1:lower2:lower3,upperdir=upper,workdir=work merged_dir

First create three directories A, B, and C, plus a worker directory. Then union-mount them at /tmp/test. Checking /tmp/test afterwards shows that directories A, B, and C have been merged, and that files with the same name appear to be "overwritten". Nothing is really overwritten: when the directories being merged contain files with the same name, the merged directory shows the file from the layer closest to it, that is, from the uppermost layer that contains that name. The mount options in effect can also be inspected with the mount command.

This technique is union mounting.

Overlay driver in Docker

With the principle of overlay covered, let's look at the overlay storage driver in Docker. The Docker official documentation illustrates how overlay works with a diagram showing three layers: lowerdir, upperdir, and merged.

lowerdir is the read-only image layer, which is in effect the rootfs. It corresponds to directories A and B in the demonstration above. Because an image can be split into many layers, lowerdir can consist of multiple directories.

upperdir is the layer above lowerdir. It is a read-write layer that is created when a container is started, and all changes to the container's data happen in this layer. It corresponds to directory C in the example.

merged is the mount point of the container, the unified view exposed to the user. It corresponds to /tmp/test in the example.

These layer directories are stored in /var/lib/docker/overlay2/, or in /var/lib/docker/overlay/ if the overlay driver is used.
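The "closest file wins" rule described above can be modeled in a few lines of Python. This is only a sketch of the lookup order, not of how the kernel implements OverlayFS: it assumes the mount used lowerdir=A:B and upperdir=C (the first lowerdir listed is the topmost lower layer), the file names and contents are made up for illustration, and it ignores whiteouts and directory merging.

```python
from typing import Dict, List, Optional

Layer = Dict[str, str]  # file name -> file content


def merged_view(upper: Layer, lowers: List[Layer]) -> Dict[str, str]:
    """Return the merged view: upper wins, then lowers from first to last."""
    merged: Dict[str, str] = {}
    # Walk from the bottom of the stack to the top so that higher layers
    # replace entries coming from the layers beneath them.
    for layer in reversed([upper] + lowers):
        merged.update(layer)
    return merged


def source_layer(name: str, upper: Layer, lowers: List[Layer]) -> Optional[int]:
    """Index of the layer that supplies `name` (0 = upper), or None if absent."""
    for index, layer in enumerate([upper] + lowers):
        if name in layer:
            return index
    return None


# Hypothetical contents for the lower directories A and B and the upper directory C.
A = {"a.txt": "from A", "same.txt": "from A"}
B = {"b.txt": "from B", "same.txt": "from B"}
C = {"c.txt": "from C"}

view = merged_view(upper=C, lowers=[A, B])
for name in sorted(view):
    print(name, "->", view[name], "(layer", source_layer(name, C, [A, B]), ")")
```

Both A and B contain same.txt, but the merged view only exposes the copy from A because A sits above B in the stack. The copy in B still exists on disk; it is merely hidden.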
Demo

Start a container. Checking the overlay mount point then shows the mounted merged directory together with its lowerdir, upperdir, and workdir. With overlay2 there can be multiple lowerdirs, and they are referenced through symbolic links, which are explained later.

How it works

How does the overlayfs storage driver behave when data is read or modified in the container? The read and write processes are as follows.

Read:
Modify:
Precautions
3. Overlay2 image storage structure

Pull an Ubuntu image from the repository. The output shows that four image layers are pulled. These four layers are now stored under /var/lib/docker/overlay2/.

That directory also contains an extra directory named l, which holds symbolic links to all layers. The links use short names so that the mount arguments do not hit the page size limit (these are the short directory names seen in the mount output in the demo).

The directory of the bottom image layer contains a diff directory and a link file. The diff directory stores the content of the layer, and the link file records the layer's short name.

Each layer above the bottom one additionally has a work directory and a lower file. The lower file records the short name of the parent layer, and the work directory is the working directory passed to the union mount.

How are these directories and the image organized together? The answer is through metadata. Metadata is divided into image metadata and layer metadata.

Image metadata

Image metadata is stored in the /var/lib/docker/image/<storage_driver>/imagedb/content/sha256/ directory, in files named after the image ID. The image ID can be viewed with docker images. These files store, as JSON, the image's rootfs information, creation time, build history, and container configuration, including the startup Entrypoint and CMD.

For example, the ID of the ubuntu image is 47b19964fb50. View its metadata (formatted as JSON in vim with :%!python -m json.tool) and look at the composition of its rootfs. Each diff_id in rootfs corresponds to one image layer. The entries are ordered: read from top to bottom, they run from the lowest layer to the top layer of the image.

How does diff_id relate to the layers?
Specifically, Docker uses each diff_id in rootfs, together with the history information, to compute a content-addressable index (the chainID). The chainID is associated with the layer metadata, which in turn is associated with the files of each image layer.

Layer metadata

A layer corresponds to the concept of an image layer. Before Docker 1.10, images were managed through a graph structure. Each image layer had metadata recording the layer's build information and the ID of its parent image layer, and the top image layer recorded additional information that served as metadata for the whole image. The graph maintained a tree of image layers based on the image ID (that is, the ID of the top image layer) and the parent layer ID recorded in each layer.

One of the biggest changes in image metadata management since Docker 1.10 is that the image layer's own metadata has been simplified: a layer now consists only of its file archive. After a user downloads an image layer, Docker builds local layer metadata on the host from the layer archive and the image metadata, including diff, parent, size, and so on. When Docker pushes a new image layer generated on the host to a registry, the host-side metadata related to that layer is not packaged and uploaded with it.

Docker defines two interfaces, Layer and RWLayer, which describe the operations on read-only layers and read-write layers respectively. It also defines roLayer and mountedLayer, which implement these two interfaces. roLayer describes an immutable image layer, and mountedLayer describes a readable and writable container layer.
Specifically, roLayer stores the chainID that indexes the image layer, the layer's checksum diffID, the parent image layer (parent), the cacheID under which the storage driver stores the layer's files, the size of the layer, and a few other fields. This metadata is stored in the /var/lib/docker/image/<storage_driver>/layerdb/sha256/<chainID>/ directory.

Each chainID directory contains three files: cache-id, diff, and size.

cache-id file: a UUID randomly generated by Docker that serves as the index to the image layer's directory, that is, the name of the directory under /var/lib/docker/overlay2/. This is why the layer directory can be found from the chainID. For example, for chainID d801a12f6af7beff367268f99607376584d8b2da656dcd8656973b7ad9779ab4 the cache-id is 130ea10d6f0ebfafc8ca260992c8d0bef63a1b5ca3a7d51a5cd1b1031d23efd5, so the layer is stored in /var/lib/docker/overlay2/130ea10d6f0ebfafc8ca260992c8d0bef63a1b5ca3a7d51a5cd1b1031d23efd5.

diff file: stores the layer's diff_id from the image metadata (it matches the corresponding entry in diff_ids).

size file: stores the size of the image layer.

Among the attributes of a layer, diffID is computed with the SHA256 algorithm from the content of the layer's file archive. chainID is a content-addressable index computed from the diffID of the current layer and of all its ancestor layers. For the bottom layer, chainID is equal to diffID; for every other layer, chainID(n) = SHA256(chainID(n-1) + " " + diffID(n)).
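A minimal Python sketch of this calculation, together with the chainID to directory lookup through cache-id, might look like the following. It follows the chain ID definition in the OCI image specification. The function names chain_ids and layer_dir, their parameters, and the sample diff IDs are illustrative assumptions and not part of Docker; the directory layout is the one described in this article.

```python
import hashlib
from pathlib import Path
from typing import List


def chain_ids(diff_ids: List[str]) -> List[str]:
    """Compute the chainID of every layer from diff_ids ordered bottom to top."""
    result: List[str] = []
    for diff_id in diff_ids:
        if not result:
            # Bottom layer: the chainID is simply its diffID.
            result.append(diff_id)
        else:
            # Other layers: hash the string "<parent chainID> <diffID>".
            data = f"{result[-1]} {diff_id}".encode("utf-8")
            result.append("sha256:" + hashlib.sha256(data).hexdigest())
    return result


def layer_dir(docker_root: Path, driver: str, chain_id: str) -> Path:
    """Resolve a chainID to its storage directory via the cache-id file."""
    hex_part = chain_id.split(":", 1)[1]  # strip the "sha256:" prefix
    cache_id_file = (
        docker_root / "image" / driver / "layerdb" / "sha256" / hex_part / "cache-id"
    )
    cache_id = cache_id_file.read_text().strip()
    return docker_root / driver / cache_id


# Hypothetical diff IDs (bottom layer first); real ones come from the
# rootfs.diff_ids list in the image metadata.
example_diff_ids = ["sha256:" + "1" * 64, "sha256:" + "2" * 64, "sha256:" + "3" * 64]
for diff_id, chain_id in zip(example_diff_ids, chain_ids(example_diff_ids)):
    print(diff_id[:19], "->", chain_id[:19])
```

On a real host, layer_dir(Path("/var/lib/docker"), "overlay2", chain_id) would need read access to /var/lib/docker, which normally requires root.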
The mountedLayer metadata stores the init layer and mount point information of the read-write container layer: the ID of the container's init layer (init-id), the ID used for the union mount (mount-id), and the chainID of the container layer's parent image layer (parent). The relevant files are located in the /var/lib/docker/image/<storage_driver>/layerdb/mounts/<container_id>/ directory.

Start a container with ID 3c96960b3127 and view its three mountedLayer files. You can see that initID is mountID with -init appended, and that both are names of directories stored under /var/lib/docker/overlay2/. The mount command also shows the mountID of the container's mount: it corresponds to a directory under /var/lib/docker/overlay2/, whose merged subdirectory is the unified view presented by overlayfs. After a file is created inside the container, the file is visible in the merged directory on the host.

About the init layer

The init layer's directory is named <uuid>-init. The layer is sandwiched between the read-only layers and the read-write layer and stores information such as /etc/hosts and /etc/resolv.conf. It is needed because some files or directories that belong to the image layers, such as the hostname, have to be changed by the user when the container starts, yet the image layers do not allow modification. A separate init layer is therefore mounted at startup, and the changes are made to the files in the init layer instead. These changes usually apply only to the current container, and when docker commit turns the container into an image, the init layer is not committed. The files of this layer are stored in /var/lib/docker/overlay2/<init_id>/diff.

Summary

From the above, the complete layer stack of a container consists of three parts: the read-only image layers, the init layer, and the read-write container layer.
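The three parts can be located on disk from the mountedLayer files. The following Python sketch reads init-id, mount-id, and parent for one container and derives the directories described above. The function name container_layer_paths and its parameters are hypothetical, and it assumes container_id is the full ID used as the directory name under layerdb/mounts/, not the shortened form shown by docker ps.

```python
from pathlib import Path
from typing import Dict


def container_layer_paths(docker_root: Path, driver: str, container_id: str) -> Dict[str, str]:
    """Read init-id, mount-id and parent for a container and derive its directories."""
    mounts = docker_root / "image" / driver / "layerdb" / "mounts" / container_id
    init_id = (mounts / "init-id").read_text().strip()
    mount_id = (mounts / "mount-id").read_text().strip()
    parent = (mounts / "parent").read_text().strip()

    store = docker_root / driver
    return {
        # chainID of the top read-only image layer the container is based on
        "parent_chain_id": parent,
        # init layer content (/etc/hosts, /etc/resolv.conf, ...)
        "init_layer": str(store / init_id / "diff"),
        # read-write container layer
        "container_layer": str(store / mount_id / "diff"),
        # unified view mounted for the running container
        "merged_view": str(store / mount_id / "merged"),
    }
```

On a real host this would be called as container_layer_paths(Path("/var/lib/docker"), "overlay2", full_container_id), which normally requires root. The read-only image layers themselves are reached by following parent_chain_id through layerdb/sha256/<chainID>/cache-id.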
4. Conclusion

This article introduced the image storage principle with overlayfs as the storage driver. The data of each image layer is stored in the /var/lib/docker/overlay2/<uuid>/diff directory, the init layer data is stored in the /var/lib/docker/overlay2/<init-id>/diff directory, and the container layer data is stored in the /var/lib/docker/overlay2/<mount_id>/diff directory, with the unified view exposed at /var/lib/docker/overlay2/<mount_id>/merged. Docker ties these directories together through image metadata and layer metadata, using content addressing (chainID), to form the file system that the container runs on.

References:
"Use the OverlayFS storage driver"
"Storage Management of Docker Images"