1. The origin of fork

The idea of forking appeared around 1963, six years before the first version of UNIX ran on the PDP-7.

Consider a common flowchart. Its branching part, the fork, is a vivid picture of the idea. The branches that split off at a branch point are logically independent of one another, which is the precondition for parallelism, so each branch can be expressed as a separate process. At that time "process" was just a general term; it did not yet mean what it means in a modern operating system.

The join is the synchronization point where several parallel processes must wait for one another for some reason, that is, the point where they converge. In multi-threaded programming this point is still called join today, for example the join method of Java's Thread and the pthread_join function of the pthread library. In a broad sense, join also covers any point that must be passed serially, such as a critical section. Reducing the number of join points improves parallel efficiency.

[Figure: the original fork diagram from Conway's paper]

Another innovation in Conway's paper is that he separated the process (which later became the operating system concept of a process) from the processor that executes it (that is, the CPU core), and abstracted a scheduling layer between them. The general idea is that the number of active processors in the system should be the minimum of the total number of processors and the number of parallel processes. This means the scheduler can treat all processors in a multi-processor system as one resource pool and all processes in the system as its consumers, and schedule them in a unified way.

After UNIX introduced fork, this multi-processor parallel design concept became deeply rooted in the core of UNIX.
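The fork and join points described above can be sketched with ordinary threads. This is only an illustration of the pattern, not Conway's original notation: the function name fork_join and the idea of giving each branch its own result slot are assumptions made for this example.

```python
import threading


def fork_join(tasks):
    """Run each task on its own thread (fork), then wait for all of them (join)."""
    results = [None] * len(tasks)

    def run(index, task):
        # Each branch writes only to its own slot, so the branches stay independent.
        results[index] = task()

    threads = [
        threading.Thread(target=run, args=(index, task))
        for index, task in enumerate(tasks)
    ]
    for thread in threads:
        thread.start()  # fork: a new branch of control begins here
    for thread in threads:
        thread.join()   # join: the caller waits until that branch has finished
    return results


if __name__ == "__main__":
    print(fork_join([lambda: 1 + 1, lambda: 2 * 3, lambda: 10 - 4]))
```

The only place where the branches have to wait for one another is the final loop of join calls, which is why fewer join points mean less time spent waiting.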
Conway's idea went on to influence UNIX, and later Linux, right up to the present.

2. Early UNIX overlay technology

Next, let's look at another thread in the history of UNIX fork. The original UNIX of 1969 ran in a way that seems very strange now. Most material starts from UNIX v6, which is already a relatively "modern" version, so few people have seen what the original UNIX looked like. Even the PDP-7 UNIX source code from 1970 that can still be consulted is a version from after fork was introduced, and the original version before that is almost impossible to find (you may say that UNIX at that time was not yet called UNIX, but who cares...).

The original UNIX was a time-sharing system with only two shell processes, one belonging to each terminal.
That's it. To demonstrate time-sharing at all, the original UNIX supported the minimum of two terminals. Note that the original UNIX had no fork, no exec, and not even the concept of multiple processes. To achieve time-sharing, there were only two simple shell processes in the system.

In fact, the original UNIX used a table with only two entries to hold all processes (obviously, this looks funny...). Of course, "table" here is an abstract and simple notion, because the system at that time was written in PDP-7 assembly and had none of the later C data structures.

Now consider how the shell process on one of the terminals works. A question immediately arises: how does this shell process execute other command programs? If the system can hold at most two processes and each terminal has only one shell process, what does that shell process do when it has to run another command program? This question needs some thought...

Note: do not judge the first version of UNIX from 1969 with modern eyes. To modern eyes, executing a program must create a new process. That was not true in the first version of UNIX.

The answer is that there is no need to create a new process at all. Just load the command program's code into memory and overwrite the shell's code! When the command finishes, the shell's code is loaded back over the command program's code. For a single terminal, the system simply kept repeating this overwriting loop (described in the Process control section of the paper): the shell reads a command, the command's code is loaded over the shell, the command runs, and on exit the shell is loaded over the command again.

This was how things stood before fork was introduced into UNIX. There was always the same single process on a terminal. Sometimes it executed the shell's code, and sometimes it executed the code of a particular command program.
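The overwriting loop can be pictured with a small simulation. This is a toy model and not PDP-7 UNIX code: the function overlay_loop, the commands table, and the list of typed lines are all hypothetical names introduced for this sketch. It only tracks which program currently occupies the terminal's single process.

```python
def overlay_loop(commands, input_lines):
    """Simulate one terminal's single process under the overlay scheme.

    `commands` maps a command name to a function and `input_lines` stands in
    for what the user types; both are hypothetical, for illustration only.
    Returns a trace of which program occupied the process's memory over time.
    """
    trace = []
    resident = "shell"          # the one and only process starts out as the shell
    for line in input_lines:
        trace.append(resident)  # the shell is resident and reads a command line
        name = line.strip()
        if name not in commands:
            continue            # unknown command: the shell simply stays resident
        resident = name         # the command's code is loaded over the shell's code
        trace.append(resident)
        commands[name]()        # the command does its work
        resident = "shell"      # exit: the shell is loaded back over the command
    trace.append(resident)
    return trace


if __name__ == "__main__":
    table = {"ls": lambda: None, "date": lambda: None}
    print(overlay_loop(table, ["ls", "date"]))
```

No second process ever appears in the trace: the same slot alternates between the shell and whatever command was typed.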
[Figure: the structure of an overlay program, from the book "FreeBSD Operating System Design and Implementation"]

At that time, however, this logic had not yet been encapsulated into an exec system call. It was all done explicitly by each process.
The exec logic was part of the shell program, while the logic for loading the shell back, which every command program needed, was encapsulated in the exit call.

3. The appearance of fork before its introduction into UNIX

In 1963, Melvin Conway proposed the fork idea as a means of executing processes in parallel on multiple processors. The 1969 Thompson version of UNIX had only two shell processes and used overlay technology to execute commands.

So far, the picture is this: the Thompson version of UNIX has no fork, no exec, no wait, and its only call, a library-function-like exit, is very different from today's exit system call. Clearly the Thompson version of UNIX is not a multi-process system, just a simple two-terminal time-sharing system that happens to run!

1. The birth of UNIX fork

How was fork introduced into UNIX? The story starts with the inherent problems of the Thompson version of UNIX and its overlay technology, which the original paper describes. To solve these problems, Thompson thought of a very simple solution.
Obviously, the command program must not overwrite the shell process. The solution is to use a technique called "swapping". Swapping and overlaying are both ways of letting several programs share a limited amount of memory. The difference lies in the direction: overlaying loads new code into memory on top of what is already there, while swapping moves what is in memory out to disk so that it is preserved.

Using swapping to solve the overwriting problem means creating a new process.
UNIX had to be modified, and a process table with only two entries was obviously no longer enough. Of course, the solution is not difficult.

When it comes to efficiency, creating from scratch is worse than copying. The most direct way to create a new process is to copy the current shell process and do the overwriting in the copy. The command program overwrites the copied new process, while the terminal's current shell process is swapped out to disk to keep it intact.

With overlay and swap combined, UNIX was one step closer to modernization!

Having decided to copy the current process, the next question is how to copy it.

Now let's get back to fork. After Conway proposed the idea of fork, implementations soon appeared (as Conway himself said, he only proposed an idea and did not implement it). Project Genie is considered one of the more complete systems to implement fork.

The fork of Project Genie does not just blindly copy the process. It gives fine-grained control over the forked process, such as how much memory to allocate and which resources to copy. Clearly, Project Genie's fork is aimed at Conway's multi-processor parallel logic.

As the saying goes, creating is worse than copying. If UNIX wanted to implement process copying, there was a ready-made template in Project Genie. However, Project Genie's fork was too complicated and too sophisticated for UNIX, which did not need such fine control. UNIX only wanted the forked process to be overwritten, not to run any parallel logic on multiple processors.

In other words, UNIX simply borrowed the copy logic of fork to accomplish something else.

So UNIX implemented fork very crudely: copy the parent process completely. This is the fork system call we still use today. UNIX took a shortcut.
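What "copy the parent process completely" means can be seen with the fork call as exposed by a modern system. The sketch below uses Python's os.fork, os.waitpid and os.WEXITSTATUS, which exist on UNIX-like systems; the function name fork_copies_parent and the use of the exit status to carry the child's value back are choices made for this example only.

```python
import os


def fork_copies_parent():
    """Fork once and show that the child works on its own copy of the parent's data."""
    value = 1
    pid = os.fork()            # the calling process is duplicated here
    if pid == 0:
        # Child: it starts as a copy of the parent, so it also sees value == 1.
        value += 41            # this changes only the child's copy
        os._exit(value)        # report the child's value through its exit status
    # Parent: fork returned the child's pid; wait for the child to finish.
    _, status = os.waitpid(pid, 0)
    child_value = os.WEXITSTATUS(status)
    return value, child_value


if __name__ == "__main__":
    parent_value, child_value = fork_copies_parent()
    print("parent:", parent_value, "child:", child_value)
```

The parent still holds 1 after the child has changed its copy to 42, which is the behaviour the shell relies on: whatever happens in the copy, the original shell stays intact.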
UNIX fork was born!

Compare the situation before and after. Before fork, each terminal had a single process that was overwritten back and forth between the shell and a command program. After fork, the shell stays intact as the parent process, and a copied child process is the one that gets overwritten by the command.

Thus UNIX officially started its journey of modernization, which continues to this day.

2. UNIX fork-exec

There is not much to say about exec. It is simply an encapsulation of the overwrite logic described above. Once it existed, programmers no longer had to write the overwrite logic themselves and could call the exec system call directly.

Thus the classic UNIX fork-exec sequence was formed.

3. UNIX fork/exec/exit/wait

It is worth mentioning that after fork was introduced into UNIX, the semantics of exit changed dramatically.

In the original 1969 Thompson version of UNIX, there was only one process per terminal, so overwriting always took place between the shell program and some command program.
After fork was introduced, the shell still overwrote the forked shell child process with a specific command program when executing a command. But when the command finished, the exit logic could no longer load the shell over the current command program, because the shell had never ended. It had merely been swapped out to disk as the parent process (later, when memory was large enough to hold multiple processes, even the swapping was not needed).

So what should exit load over the current process?

The answer is: nothing. Taking exit at its literal meaning, the process just needs to end itself.

In line with the principle of managing your own resources, exit only needs to clean up what the process itself has allocated, for example its own memory space and some other data structures.

As for the child process itself, since it was created by the parent process, it is managed and released by the parent process, which is what wait is for.

Thus the classic four-piece set of UNIX process management was formally complete: fork, exec, exit, and wait.
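Put together, the four calls form the sequence a shell uses to run a command. The following is a minimal sketch on a modern UNIX-like system using Python's os.fork, os.execvp, os._exit and os.waitpid. The function name run_command is hypothetical, exit code 127 for a failed exec is borrowed from common shell convention, and the sketch assumes the child exits normally rather than being killed by a signal.

```python
import os
import sys


def run_command(argv):
    """Run argv as a child process and return its exit status."""
    pid = os.fork()                    # fork: copy the calling process
    if pid == 0:
        try:
            os.execvp(argv[0], argv)   # exec: overwrite the copy with the command
        except OSError:
            os._exit(127)              # exit: exec failed, so the child ends itself
    _, status = os.waitpid(pid, 0)     # wait: the parent collects and releases the child
    return os.WEXITSTATUS(status)      # meaningful only for a normal exit (assumed here)


if __name__ == "__main__":
    # sys.executable is used so the example does not depend on any particular
    # command being installed.
    print(run_command([sys.executable, "-c", "raise SystemExit(3)"]))
```

Unlike the 1969 overlay loop, the caller here is never overwritten: only the forked copy is replaced by the command, and the parent resumes as soon as wait returns.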