albaugh@dms.UUCP (Mike Albaugh) (01/17/90)
From article <610@ssp11.idca.tds.philips.nl>, by willy@idca.tds.PHILIPS.nl (Willy Konijnenberg): [quoting a lot of people arguing about MMU's and the needs of *nix, then getting to the heart of the matter:] > I don't think you should try to think of relocating a program once > it has been running for a while. You have no way of knowing what it is > doing with pointers. Precisely why systems like the MAC (and minix on the PC) use only segment-relative addressing. Well, actually, it gets a little more complicated on the MAC, but believe me you don't want to know :-) > When you run a unix-like system, there is one additional point where > this scheme slows the system down, in addition to the relocation work > during program load. > As Craig noted, when the program fork()s, you have two programs that need > to be located at the same virtual (== physical with no MMU) address space > to run, so for every context switch, you must check whether the program is > at the proper place and if not, swap things around (in memory, not necessarily > to disk), which dramatically increases context switch overhead. Although not mentioned here, someone else on this thread (mis)stated that the MMU was also needed for protection, which is not strictly true. A scheme like that implemented on the early mid-range 360's can provide protection without relocation, and can do it in _parallel_ with the fecth, so there is no performance penalty. There will always be a performance penalty for relocation, but it may be masked by other, still slower, parts of the memory system. I just wanted to get that point out of the way early. > Fortunately, this is normally not much of a problem, since usually a program > does an exec() shortly after the fork() and this exec() can fix the problem. > > This scheme is not very elegant, but it allows one to run a unix system > on hardware like ST, Mac and Amiga. SO--- Why do we _still_ use fork() for all these near-trivial cases. I have been mucking around with computers for over 20 years but am not really familiar with *nix. I would like a reality check on this. I'm also not asking on comp.unix... because I'm afraid that would be like asking pointed questions about the trinity in a seminary :-) Since comp.arch folk have to deal with _implementing_ this stuff, I thought I'd get a more reasoned response ( 1/2 :-). Anyway, I can see a few reasons to use fork: 1) It can be used for part of spawn _and_ for actual task-splitting (problem subdivision). Why have two calls when one will do? 2) On machines that are only (or mainly) swapping anyway, there is no penalty, so what the heck. 3) By just (effectively) copying the entire memory space, we don't need to keep track of just which parts actualy _need_ to be passed to the new task (laziness as a virtue :-). 4) "We have always done it this way". (my personal feelings are that 1 & 2 were the original reasons while 3 & 4 are the reason we are stuck with it now) Against this we have the problems mentioned above with handling *nix programs on machines without dynamic relocation. Also, even machines that _can_ do relocation don't get fork for free: 1) Machines with base/bounds registers may need to copy the whole memory image to a new area. If they have two sets (e.g. KA10) they might get away with "only" copying the data segment. 2) Paging machines still need to at least mark all data pages "copy on write", which may involve traversing the segment and page tables in software. For a large image this can be time consuming. Also, I'd imagine it's a real judgement call whether to deal with a page at a time or just punt and do the copy as soon as "enough" of the image has changed to make write-trap handling a nuisance. And all this hassle so that three or four instructions later the program can overlay itself (most of the time). I must be missing something major here. Can someone tell me what? Mike > Willy Konijnenberg <willy@idca.tds.philips.nl> | Mike Albaugh (albaugh@dms.UUCP || {...decwrl!pyramid!}weitek!dms!albaugh) | Atari Games Corp (Arcade Games, no relation to the makers of the ST) | 675 Sycamore Dr. Milpitas, CA 95035 voice: (408)434-1709 | The opinions expressed are my own (Boy, are they ever)
andrew@frip.WV.TEK.COM (Andrew Klossner) (01/17/90)
Yep, most uses of fork() are quickly followed by exec(). But the exec() doesn't happen within a couple of instructions. Typically this sort of thing happens: fork() child does: close files that child won't need; open (or otherwise establish) standard input, output, and error files; exec(argc, argv, environment_p); print_error("exec failure"); exit(); Wrapping up all the interesting child manipulation of files into options to one huge spawn() call would be cumbersome. Berkeley 4.1BSD Unix addressed the problem by inventing the vfork() syscall, which creates a new child thread but runs it in the parent's environment. This eliminates the need to clone writable memory, or to set up a new set of page table entries with "copy-on-write" set. The parent thread is suspended until the exec(), and any change that the child makes to memory before the exec() are seen by the parent. vfork() is marked for extinction in a future BSD release. We extinguished it on our BSD-derived 68k workstation (for reasons that were sound at the time). Yes, you have to build a duplicate set of page tables in order to implement copy-on-write, but for processes that aren't huge, this isn't a large part of fork() handling. The kernel has to do a lot of other work besides just building the PTEs. For huge processes, though, we've observed that PTE creation time dominates fork() time. -=- Andrew Klossner (uunet!tektronix!frip.WV.TEK!andrew) [UUCP] (andrew%frip.wv.tek.com@relay.cs.net) [ARPA]
pete@sun1102.UUCP (Peter R. Carpenter) (01/17/90)
In article <952@dms.UUCP> albaugh@dms.UUCP (Mike Albaugh) writes: > > SO--- Why do we _still_ use fork() for all these near-trivial >cases. [stuff deleted] >1) It can be used for part of spawn _and_ for actual task-splitting > (problem subdivision). Why have two calls when one will do? > Actually, the Berkley folks have a system call vfork(), which does the fork/exec in one operation. But of course, it is not compatible with Sys V. --- Pete Carpenter, Cirrus Logic Inc, 1463 Centre Pointe Dr, Milpitas, CA 95035 {amdahl,ames,apple,bunker,pyramid}!oliveb!tymix!cirrusl!pete 408-945-8300 ---------------------------------------------------------------------------
chris@mimsy.umd.edu (Chris Torek) (01/17/90)
In article <5875@orca.wv.tek.com> andrew@frip.WV.TEK.COM (Andrew Klossner) writes: >... you have to build a duplicate set of page tables in order to >implement copy-on-write, but for processes that aren't huge, this isn't >a large part of fork() handling. The kernel has to do a lot of other >work besides just building the PTEs. For huge processes, though, we've >observed that PTE creation time dominates fork() time. This is useful information. It does, however, make some assumptions: 1. PTEs. Not all machines with virtual memory require PTEs (e.g., all MIPS-based machines: page translation is done in software). 2. PTEs (if present) must be copied. One valid approach, possible on some (particularly two level page table) machines, is to mark the primary pages invalid or unwritable. Since most uses of fork are of the form: switch (pid = fork()) { case 0: diddle this, that, and the other thing; execv(name, argv); error(notfound); _exit(1); /* NOTREACHED */ case -1: error(cannotfork); break; default: record(pid); break; } the kernel could: a. suspend execution of the parent b. hand PTEs to the child, mark them `must not be written' or `invalid' c. start a timer to allow resumption of the parent d. let the child run. When it gets page faults, make copies of exactly those PTEs and pages needed to let it continue. When it exits or execs, cancel the timer, and move the PTEs back, replacing those that were copied with the originals e. if the timer expires, suspend the child, copy the PTEs back to the parent (replacing modified PTEs and pages with originals), and then resume both processes. This is a fair amount of work (particularly in data structures for tracking original and copied PTEs) and may well not be worth the effort---but it might turn out to help tremendously. The kernel could choose whether to use this trick based on process size. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
tony@cairo.UUCP (Tony Anzelmo) (01/17/90)
In article <1239@cirrusl.UUCP> pete@cirrusl (Pete Carpenter) writes: >In article <952@dms.UUCP> albaugh@dms.UUCP (Mike Albaugh) writes: >> >> SO--- Why do we _still_ use fork() for all these near-trivial >>cases. [stuff deleted] > >>1) It can be used for part of spawn _and_ for actual task-splitting >> (problem subdivision). Why have two calls when one will do? >> > >Actually, the Berkley folks have a system call vfork(), which does the >fork/exec in one operation. But of course, it is not compatible with Sys V. ^^^ Vfork does not do fork/exec in "one" operation. It simply avoids copying the parent's address space to the child on the assumption that the child will invoke exec (and therefore replace that address space). The parent loans its address space to the child until an execve (or exit) is performed by the child. During this time, the parent is suspended while the child uses its address space. There are also some strange anomalies with vfork (as compared to fork). One of the more interesting ones is the ability of the child to modify the parent's address space such that the parent sees those changes upon resumption. Since the original intent of vfork was to provide an "efficient" fork, some vendors have eliminated it by using "copy-on-write" in their fork implementations. This technique reduces the address space copying to only portions that are written and amortizes those costs across the child (and parent) execution. However, the subtle semantic of vfork with regard to the anomaly mentioned above is lost. Tony Anzelmo
dwc@cbnewsh.ATT.COM (Malaclypse the Elder) (01/18/90)
In article <952@dms.UUCP>, albaugh@dms.UUCP (Mike Albaugh) writes: > > > Fortunately, this is normally not much of a problem, since usually a program > > does an exec() shortly after the fork() and this exec() can fix the problem. > > > > This scheme is not very elegant, but it allows one to run a unix system > > on hardware like ST, Mac and Amiga. > > SO--- Why do we _still_ use fork() for all these near-trivial > cases. I have been mucking around with computers for over 20 years but am > not really familiar with *nix. I would like a reality check on this. > I'm also not asking on comp.unix... because I'm afraid that would be > like asking pointed questions about the trinity in a seminary :-) Since > comp.arch folk have to deal with _implementing_ this stuff, I thought I'd > get a more reasoned response ( 1/2 :-). Anyway, I can see a few reasons > to use fork: > > 1) It can be used for part of spawn _and_ for actual task-splitting > (problem subdivision). Why have two calls when one will do? > 2) On machines that are only (or mainly) swapping anyway, there is no > penalty, so what the heck. > 3) By just (effectively) copying the entire memory space, we don't > need to keep track of just which parts actualy _need_ to be > passed to the new task (laziness as a virtue :-). > 4) "We have always done it this way". > > (my personal feelings are that 1 & 2 were the original reasons while > 3 & 4 are the reason we are stuck with it now) > actually, we indirectly address this type of question in a paper we are presenting at the winter usenix in washington next week. the paper is titled "insuring improved vm performance: some no-fault policies" and discusses some things we did to improve vm performance in unix system v release 4. in it, we argue that the idea that all forks are followed IMMEDIATELY by execs is misunderstood. it all depends on what you mean by IMMEDIATELY. to fully understand the scale of IMMEDIATELY, you have to think about how forks and execs are used. on MOST systems, the majority of forks are executed by programs that are classified as "shells". these shells are command interpreters whose function is to provide a command level interface to users and often also provide a programming language for executing "shell scripts". all major shells (bourne shell, ksh, csh) provide for such things as i/o redirection, argument expansion, etc. in other words, the shell must allow for the child to have an environment that may be substantially different than the parent. and some shells provide such things as history files, command recall, etc. the bottom line is that for these shells, more than just "a couple of instructions" are executed between fork and exec. thus, it really was a stroke of genius to have a fork call that duplicates the parent and allow the child to craft its environment the way that it wants to since you have to provide primitives to allow a process to change its environment anyway. the alternative is to have a single call with a HUGE argument list to specify all the options. and maintaining compatibility with new release would be a major hassle under this scheme. > Against this we have the problems mentioned above with handling *nix > programs on machines without dynamic relocation. Also, even machines > that _can_ do relocation don't get fork for free: > > 1) Machines with base/bounds registers may need to copy the whole > memory image to a new area. If they have two sets (e.g. KA10) > they might get away with "only" copying the data segment. > 2) Paging machines still need to at least mark all data pages > "copy on write", which may involve traversing the segment > and page tables in software. For a large image this can > be time consuming. Also, I'd imagine it's a real judgement > call whether to deal with a page at a time or just punt and > do the copy as soon as "enough" of the image has changed to > make write-trap handling a nuisance. > > And all this hassle so that three or four instructions later the > program can overlay itself (most of the time). I must be missing something > major here. Can someone tell me what? > on one hand its a matter of your point of view. do you design the operating system to fit the machine or do you design it to provide a nice interface to programmers/users (and put minimal requirements on the machine to support your operating system design)? we can argue religiously about it but it all comes down to costs and market niches. for some, increasing the system cost for an mmu more than offsets the cost of having programmers write code to "do it with mirrors". for others (depending on market niche), it may not pay. to summarize, i believe that people argue about fork/exec without really thinking about who uses it and how it is used. if you accept my assertion that shells are the almost exclusive users of the fork/exec interface, you will realize that more than just 3 or 4 instructions are executed between forks and execs. and since shells tend to be reasonably sized processes, the concern about doing forks of large processes is currently unjustified. but nothing stays constant and these large new graphical user interfaces may change things. as a final plug, read the paper in the conference proceedings. i even present a nice new way of avoiding copy on write faults after forks which is pretty clever (a biased opinion). danny chen att!hocus!dwc
henry@utzoo.uucp (Henry Spencer) (01/18/90)
In article <1239@cirrusl.UUCP> pete@cirrusl (Pete Carpenter) writes: >Actually, the Berkley folks have a system call vfork(), which does the >fork/exec in one operation.... Nope, sorry, wrong, vfork is just a variant of fork that runs faster at the cost of truly horrible semantics. It was basically a kludge to get around some hardware/firmware problems that made it impossible to do a proper copy-on-write optimized fork on the early VAXen. It's already greatly outlived its real usefulness. -- 1972: Saturn V #15 flight-ready| Henry Spencer at U of Toronto Zoology 1990: birds nesting in engines | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
peter@ficc.uu.net (Peter da Silva) (01/19/90)
> Actually, the Berkley folks have a system call vfork(), which does the > fork/exec in one operation. But of course, it is not compatible with Sys V. I thought it sort of deferred the actual fork until the exec occurred, just duplicating the file table and other "cheap" resources. And of course once you have a VM system fork() is just fine and vfork() is a meaningless optimisation. Just mark all the data pages copy-on-write. So why didn't they put vfork() in 2BSD? -- _--_|\ Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ \_.--._/ Xenix Support -- it's not just a job, it's an adventure! v "Have you hugged your wolf today?" `-_-'
andrew@frip.WV.TEK.COM (Andrew Klossner) (01/19/90)
> From: dwc@cbnewsh.ATT.COM (Malaclypse the Elder) > Organization: The Legion of Dynamic Discord > ... we indirectly address this type of question in a paper we > are presenting at the winter usenix ... > danny chen > att!hocus!dwc Folks, when you refer to something interesting from yourself and your organization, would you please undo the schmaltz in your "name" and "organization" fields? It would be nice in this case to have a better attribution than just "ATT." (I don't believe the bit about the Legion ... :-) ) > since shells tend to be reasonably > sized processes, the concern about doing forks of large processes is > currently unjustified. Think of shell escapes in emacs. Or vi. In my favorite BASIC programming environment (no cracks please) it takes 30 seconds to do "!ls". -=- Andrew Klossner (uunet!tektronix!frip.WV.TEK!andrew) [UUCP] (andrew%frip.wv.tek.com@relay.cs.net) [ARPA]
ccplumb@lion.waterloo.edu (Colin Plumb) (01/19/90)
The other way to avoid fork() is to recognise that the only thing that survives the eventual exec() is kernel-maintained state, so a call (which takes the place of fork()) to ask the kernel to create a new process state, and an additional argument to all state-changing system calls to specify which state (since now more than one can be associated with a single thread of execution) to use will achieve the same effect. exec() is replaced by something which gives one of the states (it doesn't really matter which) a different load image and starts it executing. This is kind of like vfork() except it doesn't even try to fake two threads. The essential idea stolen from Unix is that the system calls uised to manipulate the child's environment are the same ones used to manipulate the parent's. P.S. It actually may be useful to have two states maintained for a long time. It gives you two ownerships, so you can avoid the effective uid kludge to a large degree, and more signal bits if you use them for IPC, etc. The only problem is, which process gets signalled for bus error, etc.? -- -Colin
pete@sun1102.UUCP (Peter R. Carpenter) (01/19/90)
In article <1990Jan17.204433.18006@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: >Nope, sorry, wrong, The tone of your posting sucks. Read the preceding articles, the original poster asked a simple question: Why can't fork/exec be done together? I simply replied that there was such a beast, and pointed out that it is not portable. --- Pete Carpenter, Cirrus Logic Inc, 1463 Centre Pointe Dr, Milpitas, CA 95035 {amdahl,ames,apple,bunker,pyramid}!oliveb!tymix!cirrusl!pete 408-945-8300 ---------------------------------------------------------------------------
jdarcy@pinocchio.encore.com (Jeff d'Arcy) (01/19/90)
peter@ficc.uu.net (Peter da Silva): . And of course once you have a VM system fork() is just fine and vfork() is . a meaningless optimisation. Just mark all the data pages copy-on-write. So . why didn't they put vfork() in 2BSD? Not all VM systems have copy-on-write. Jeff d'Arcy OS/Network Software Engineer jdarcy@encore.com Encore has provided the medium, but the message remains my own
sms@WLV.IMSD.CONTEL.COM (Steven M. Schultz) (01/19/90)
In article <IC51D77xds13@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes: >> Actually, the Berkley folks have a system call vfork(), which does the >> fork/exec in one operation. But of course, it is not compatible with Sys V. > >And of course once you have a VM system fork() is just fine and vfork() is >a meaningless optimisation... So why didn't they put vfork() in 2BSD? They did. Put vfork() in 2BSD that is. Works very nicely too. 'vmstat' even reports how many forks() and vforks() were done along with the number (and averages) and amount of clicks of memory copied for each type of fork. Steven M. Schultz sms@wlv.imsd.contel.com