The difference between fork(), vfork(), exec() and clone()


I was looking to find the difference between these four on Google and I expected there to be a huge amount of information on this, but there really wasn't any solid comparison between the four calls.

I set about compiling a basic at-a-glance look at the differences between these system calls, and here's what I got. Is all this information correct, and am I missing anything important?

Fork : The fork call basically makes a duplicate of the current process, identical in almost every way (not everything is copied over, for example, resource limits in some implementations but the idea is to create as close a copy as possible).

The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return code of fork - the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works - if not, no child is created and the parent gets an error code.
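As a minimal sketch of the return-value convention described above (the helper name fork_and_wait is made up for illustration, not taken from any of the sources), the parent and child can be told apart like this:

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once, the child exits with child_status,
 * and the parent waits for it.
 * Returns the child's exit status as seen by the parent, or -1 on error. */
int fork_and_wait(int child_status)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");           /* fork failed: no child was created */
        return -1;
    }
    if (pid == 0) {
        /* Child: fork() returned 0. getppid() here is the parent's PID. */
        _exit(child_status);
    }
    /* Parent: fork() returned the PID of the child. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_wait(7) from main() should give back 7 in the parent, since that is the status the child exited with.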

Vfork : The basic difference between vfork and fork is that when a new process is created with vfork(), the parent process is temporarily suspended, and the child process might borrow the parent's address space. This strange state of affairs continues until the child process either exits, or calls execve(), at which point the parent process continues.

This means that the child process of a vfork() must be careful to avoid unexpectedly modifying variables of the parent process. In particular, the child process must not return from the function containing the vfork() call, and it must not call exit() (if it needs to exit, it should use _exit(); actually, this is also true for the child of a normal fork()).
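A hedged sketch of what a well-behaved vfork() child looks like: it does nothing except call execve(), or _exit() if that fails. The function name run_with_vfork and the 127 exit code are my own choices for illustration (127 just mirrors the shell's "command not found" convention).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program at `path` in a vfork() child.
 * Returns the program's exit status, 127 if execve() failed, -1 on error. */
int run_with_vfork(const char *path, char *const argv[])
{
    extern char **environ;

    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: may be borrowing the parent's address space, so it only
         * calls execve() or _exit(). No return, no exit(), no stdio. */
        execve(path, argv, environ);
        _exit(127);               /* reached only if execve() failed */
    }
    /* Parent: resumes once the child has exec'd or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/sh" with the arguments {"/bin/sh", "-c", "exit 5", NULL} should return 5, assuming /bin/sh exists on the system.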

Exec : The exec call replaces the entire current process image with a new program. It loads the program into the current process space and runs it from the entry point. Control never returns to the original program unless the exec call fails.
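To illustrate the "control never returns unless exec fails" point, here is a small sketch using execvp() in a forked child. The name run_program and the 127 status are assumptions for the example; the line after execvp() runs only when the exec itself failed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, then replaces the child with argv[0].
 * Returns the program's exit status, 127 if the exec failed, -1 on error. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);    /* on success, this call never returns */
        _exit(127);               /* only reached when execvp() failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The PID of the child stays the same across the exec; only the program running inside the process changes.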

Clone : Clone, like fork, creates a new process. Unlike fork, this call allows the child process to share parts of its execution context with the calling process, such as the memory space, the table of file descriptors, and the table of signal handlers.

When the child process is created with clone, it executes the function application fn(arg). (This differs from fork, where execution continues in the child from the point of the original fork call.) The fn argument is a pointer to a function that is called by the child process at the beginning of its execution. The arg argument is passed to the fn function.

When the fn(arg) function application returns, the child process terminates. The integer returned by fn is the exit code for the child process. The child process may also terminate explicitly by calling exit(2) or after receiving a fatal signal.
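A minimal sketch of the fn(arg) behaviour using the glibc clone() wrapper on Linux. The names child_fn and clone_and_wait and the 64 KiB stack size are arbitrary choices for the example. No sharing flags are passed here, only SIGCHLD so the parent can wait for the child in the usual way.

```c
#define _GNU_SOURCE               /* needed for clone() in <sched.h> */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

/* Runs in the child; its return value becomes the child's exit code. */
static int child_fn(void *arg)
{
    return *(int *)arg;
}

/* Hypothetical helper: clones a child that runs child_fn(&code).
 * Returns the child's exit status, or -1 on error. */
int clone_and_wait(int code)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (!stack)
        return -1;

    /* The stack grows downward on most architectures, so pass its top. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE, SIGCHLD, &code);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike fork(), the child never "returns from clone()"; it starts in child_fn and ends when that function returns.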

Information gathered from:

Differences between fork and exec

http://www.allinterview.com/showanswers/59616.html

http://www.unixguide.net/unix/programming/1.1.2.shtml

http://linux.about.com/library/cmd/blcmdl2_clone.htm

Thanks for taking the time to read this ! :)

Why must the vfork child not call exit()? Or not return? Doesn't exit() just use _exit()? I'm also trying to understand :)

@Gnuey: because it is potentially (if it's implemented differently from fork(), which it is in Linux, and probably all BSDs) borrowing its parent's address space. Anything it does, besides calling execve() or _exit(), has a great potential to mess up the parent. In particular, exit() calls atexit() handlers and other "finalizers", e.g. it flushes stdio streams. Returning from a vfork() child would potentially (same caveat as before) mess up the parent's stack.

I was wondering what happens to parent process's threads; Are all of them cloned or only the thread that calls the fork syscall?

@LazerSharks vfork produces a thread-like process where memory is shared without copy-on-write protections, so doing stack stuff could trash the parent process.

Javier

vfork() is an obsolete optimization. Before good memory management, fork() made a full copy of the parent's memory, so it was pretty expensive. Since in many cases a fork() was followed by exec(), which discards the current memory map and creates a new one, it was a needless expense. Nowadays, fork() doesn't copy the memory; it's simply set as "copy on write", so fork()+exec() is just as efficient as vfork()+exec().

clone() is the syscall used by fork(). With some parameters, it creates a new process; with others, it creates a thread. The difference between them is just which data structures (memory space, processor state, stack, PID, open files, etc.) are shared or not.
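As a rough illustration of "which data structures are shared", this sketch runs the same child function with and without CLONE_VM (Linux, glibc clone() wrapper). The names set_to_42 and value_after_child and the stack size are assumptions made for the example.

```c
#define _GNU_SOURCE               /* needed for clone() and CLONE_VM */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Runs in the child: writes 42 through the pointer it was given. */
static int set_to_42(void *arg)
{
    *(volatile int *)arg = 42;
    return 0;
}

/* Hypothetical helper: returns the parent's view of a variable after the
 * child wrote 42 to it. extra_flags is 0 (separate, copy-on-write memory)
 * or CLONE_VM (memory shared with the parent). Returns -1 on error. */
int value_after_child(int extra_flags)
{
    static volatile int value;    /* static, so it is not on any stack */
    const size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);
    if (!stack)
        return -1;

    value = 0;
    pid_t pid = clone(set_to_42, stack + stack_size,
                      extra_flags | SIGCHLD, (void *)&value);
    if (pid < 0) {
        free(stack);
        return -1;
    }
    pid_t waited = waitpid(pid, NULL, 0);
    free(stack);
    return waited < 0 ? -1 : value;
}
```

value_after_child(0) behaves like fork(): the child's write lands in its own copy and the parent still sees 0. value_after_child(CLONE_VM) is thread-like: the parent sees 42.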

Related: Is it true that fork() calls clone() internally?

vfork avoids the need to temporarily commit much more memory just so one can call exec, and it is still more efficient than fork, even if not by nearly as high a degree as before. Thus, one can avoid having to overcommit memory just so a hunking big program can spawn a child process. So it is not just a performance boost; it might make spawning feasible at all.

Actually, I have witnessed first-hand how fork() is far from cheap when your RSS is big. I presume this is because the kernel still has to copy all the page tables.

It has to copy all the page tables, set all writable memory copy-on-write in both processes, flush the TLB, and then it has to revert all the changes to the parent (and flush the TLB again) on exec.

vfork is still useful in Cygwin (a kernel-emulating DLL that runs on top of Microsoft Windows). Cygwin cannot implement an efficient fork, as the underlying OS does not have one.

ninjalj

execve() replaces the current executable image with another one loaded from an executable file.

fork() creates a child process.

vfork() is a historical optimized version of fork(), meant to be used when execve() is called directly after fork(). It turned out to work well on non-MMU systems (where fork() cannot work in an efficient manner) and when fork()ing processes with a huge memory footprint to run some small program (think Java's Runtime.exec()). POSIX has standardized posix_spawn() to replace these latter two, more modern uses of vfork().

posix_spawn() does the equivalent of a fork()/execve(), and also allows some fd juggling in between. It's supposed to replace fork()/execve(), mainly for non-MMU platforms.
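A short sketch of posix_spawn with one bit of "fd juggling": a file action that opens /dev/null as the child's standard output before the program starts. The helper name spawn_quietly is hypothetical, and posix_spawnp() is used here so that argv[0] is looked up on PATH.

```c
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: spawns argv[0] with stdout sent to /dev/null.
 * Returns the program's exit status, or -1 on error. */
int spawn_quietly(char *const argv[])
{
    posix_spawn_file_actions_t actions;
    if (posix_spawn_file_actions_init(&actions) != 0)
        return -1;

    /* In the child, open /dev/null as fd 1 before the program runs. */
    if (posix_spawn_file_actions_addopen(&actions, STDOUT_FILENO,
                                         "/dev/null", O_WRONLY, 0) != 0) {
        posix_spawn_file_actions_destroy(&actions);
        return -1;
    }

    pid_t pid;
    /* posix_spawnp() returns an error number instead of setting errno. */
    int err = posix_spawnp(&pid, argv[0], &actions, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&actions);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

There is no window here in which user code runs in a half-set-up child, which is what makes the call usable on platforms where fork() is expensive or unavailable.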

pthread_create() creates a new thread.

clone() is a Linux-specific call, which can be used to implement anything from fork() to pthread_create(). It gives a lot of control. Inspired by rfork().

rfork() is a Plan-9 specific call. It's supposed to be a generic call, allowing several degrees of sharing, between full processes and threads.

Thanks for adding more information than was actually asked for; it saved me time.

Plan 9 is such a tease.

For those who can't remember what MMU means: "Memory management unit" - further reading on Wikipedia

ZarathustrA

fork() - creates a new child process, which is a complete copy of the parent process. Child and parent processes use different virtual address spaces, which are initially populated by the same memory pages. Then, as both processes execute, the virtual address spaces begin to differ more and more, because the operating system lazily copies the memory pages that are written by either of the two processes and assigns each process its own independent copy of the modified pages. This technique is called Copy-On-Write (COW).

vfork() - creates a new child process, which is a "quick" copy of the parent process. In contrast to the fork() system call, child and parent processes share the same virtual address space. NOTE! Because they use the same virtual address space, parent and child also use the same stack; as with the classic fork(), the child starts with the same stack pointer and instruction pointer as the parent. To prevent unwanted interference between parent and child, which use the same stack, execution of the parent process is frozen until the child calls either exec() (which creates a new virtual address space and switches to a different stack) or _exit() (which terminates the process). vfork() is an optimization of fork() for the "fork-and-exec" model. It can be 4-5 times faster than fork(), because unlike fork() (even with COW kept in mind), the implementation of the vfork() system call does not include the creation of a new address space (the allocation and setting up of new page directories).

clone() - creates a new child process. Various parameters of this system call specify which parts of the parent process must be copied into the child process and which parts will be shared between them. As a result, this system call can be used to create all kinds of execution entities, from threads to completely independent processes. In fact, the clone() system call is the base used for the implementation of pthread_create() and the whole family of fork() system calls.

exec() - resets all the memory of the process, loads and parses the specified executable binary, sets up a new stack and passes control to the entry point of the loaded executable. This system call never returns control to the caller and serves to load a new program into an already existing process. Together with fork(), it forms the classical UNIX process management model called "fork-and-exec".
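The observable effect of copy-on-write after fork() can be sketched like this: the child writes to a variable, and the parent's copy stays untouched. The function name parent_value_after_child_write is made up for the example; it only shows the separation of the address spaces, not the page-level mechanics.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child overwrites `value`, the parent then
 * reports what it still sees. Returns the parent's value, or -1 on error. */
int parent_value_after_child_write(void)
{
    volatile int value = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 99;               /* the write goes to the child's own copy */
        _exit(value == 99 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;                 /* the parent's copy is unchanged */
}
```

The parent gets 1 back even though the child stored 99 at the same virtual address.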

Note that the BSD and POSIX requirements for vfork are so weak that it would be legal to make vfork a synonym of fork (and POSIX.1-2008 removes vfork from the spec entirely). If you happen to test your code on a system that synonymizes them (e.g. most post-4.4 BSDs aside from NetBSD, pre-2.2.0-pre6 Linux kernels, etc.), it may work even if you violate the vfork contract, then explode if you run it elsewhere. Some of those that simulate it with fork (e.g. OpenBSD) still guarantee the parent doesn't resume running until the child execs or _exits. It's ridiculously non-portable.

regarding the last sentence of your 3rd point: I noticed on Linux using strace that while indeed the glibc wrapper for fork() calls the clone syscall, the wrapper for vfork() calls the vfork syscall

user991800

fork(), vfork() and clone() all call do_fork() to do the real work, but with different parameters.

asmlinkage int sys_fork(struct pt_regs regs)
{
    return do_fork(SIGCHLD, regs.esp, &regs, 0);
}

asmlinkage int sys_clone(struct pt_regs regs)
{
    unsigned long clone_flags;
    unsigned long newsp;

    clone_flags = regs.ebx;
    newsp = regs.ecx;
    if (!newsp)
        newsp = regs.esp;
    return do_fork(clone_flags, newsp, &regs, 0);
}

asmlinkage int sys_vfork(struct pt_regs regs)
{
    return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.esp, &regs, 0);
}

#define CLONE_VFORK 0x00004000  /* set if the parent wants the child to wake it up on mm_release */
#define CLONE_VM    0x00000100  /* set if VM shared between processes */

SIGCHLD means the child sends this signal to its parent when it exits.

For fork, the child and parent have independent VM page tables, but for efficiency fork does not actually copy any pages; it just marks all the writable pages read-only. When a process then tries to write to such a page, a page fault occurs and the kernel allocates a new page, copied from the old one, with write permission. That's called "copy on write".

For vfork, the virtual memory is shared exactly between child and parent. Because of that, parent and child can't be awake concurrently, since they would interfere with each other. So the parent sleeps at the end of do_fork() and wakes up when the child calls exit() or execve(), since the child then owns a new page table. Here is the code (in do_fork()) where the parent sleeps:

if ((clone_flags & CLONE_VFORK) && (retval > 0))
    down(&sem);
return retval;

Here is the code (in mm_release(), called by exit() and execve()) that wakes the parent:

up(tsk->p_opptr->vfork_sem);

sys_clone() is more flexible, since you can pass any clone_flags to it. pthread_create() calls this system call with many clone_flags.

Summary: fork(), vfork() and clone() create child processes that share different amounts of resources with the parent process. We can also say that vfork() and clone() can create threads (actually they are processes, since they have an independent task_struct), since they share the VM page table with the parent process.

Raj Kannan B.

In fork(), either the child or the parent may execute first, depending on CPU scheduling. But in vfork(), the child will surely execute first; after the child terminates, the parent executes.

Wrong. vfork() can just be implemented as fork().

After any kind of fork, it is not defined which runs first, parent or child.

@Raj: You have some conceptual misunderstandings if you think after forking there is an implicit notion of serial order. Forking creates a new process and then returns control to both processes (each returning a different pid) - the operating system can schedule the new process to run in parallel if such a thing makes sense (e.g. multiple processors). If for some reason you need these processes to execute in a particular serial order, then you need additional synchronization that forking does not provide; frankly, you probably would not even want a fork in the first place.
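One hedged sketch of the "additional synchronization" mentioned above: a pipe used as a gate, so the child cannot proceed until the parent has done its step. All names here (run_parent_first, gate, trace) are invented for the example, and a second pipe is used only to record the order.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forces "parent first, then child" with a pipe.
 * On success returns 0 and stores the observed order in `order`. */
int run_parent_first(char order[3])
{
    int gate[2], trace[2];
    order[0] = '\0';
    if (pipe(gate) < 0)
        return -1;
    if (pipe(trace) < 0) {
        close(gate[0]);
        close(gate[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char token;
        close(gate[1]);
        /* Blocks here until the parent writes the token (or closes). */
        if (read(gate[0], &token, 1) != 1)
            _exit(1);
        _exit(write(trace[1], "C", 1) == 1 ? 0 : 1);
    }

    close(gate[0]);
    /* The parent's step is recorded first, then the child is released. */
    int ok = write(trace[1], "P", 1) == 1 && write(gate[1], "x", 1) == 1;
    close(gate[1]);
    close(trace[1]);
    waitpid(pid, NULL, 0);

    ssize_t n = read(trace[0], order, 2);
    close(trace[0]);
    order[n > 0 ? n : 0] = '\0';
    return (ok && n == 2) ? 0 : -1;
}
```

Whatever the scheduler decides, the recorded order is "PC", because the child is blocked in read() until the parent lets it go.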

Actually @AjayKumarBasuthkar and @ninjalj, you are both wrong. With vfork(), the child runs first. It is in the man pages; the parent's execution is suspended until the child either dies or execs. And ninjalj, look up the kernel source code. There is no way to implement vfork() as fork() because they pass different arguments to do_fork() within the kernel. You can, however, implement vfork with the clone syscall.

@ZacWimer: see ShadowRanger's comment to another answer stackoverflow.com/questions/4856255/… Old Linux did synonymize them, as apparently BSDs other than NetBSD (which tends to be ported to a lot of non-MMU systems) do. From the Linux manpage: In 4.4BSD it was made synonymous to fork(2) but NetBSD introduced it again; see ⟨netbsd.org/Documentation/kernel/vfork.html⟩. In Linux, it has been equivalent to fork(2) until 2.2.0-pre6 or so.
