throopw@xyzzy.UUCP (Wayne A. Throop) (07/06/87)
I'd like to clarify a few issues I seem to have been unclear on, since
several people seem to have slightly misinterpreted what I said.

> jerry@oliveb.UUCP (Jerry F Aguirre)
> After looking at the parameters of "create_process" and the extended
> parameters, and the optional extended extended parameters I considered
> the fork exec of Unix a much cleaner and more elegant solution.

I agree entirely.  I think fork() and exec() cleanly and elegantly
partition the work of process creation in a quite good manner.  I was
concerned to argue that *if* optimization is wanted, *then*
create_process() is a cleaner optimization than vfork().  And while I
don't think create_process() is as generally useful as vfork(), I think
it is certainly cleaner and therefore preferable, *if* an optimized
process creation path must be made.

I further argued that such an optimization is attractive, because fork()
implies some work that cannot be avoided by copy-on-demand, and the
response was:

> ron@topaz.rutgers.edu (Ron Natalie)
> Crap.  No extra space is involved as there are already unused bits in
> the page tables in most implementations.  The few systems that already
> do copy-on-write forking don't end up using any additional CPU time.

I doubt this a great deal.  A *great* deal.  First, any scheme which
involves using status bits on the physical PTE rather than generating
new LTEs is trading CPU time for memory space.  Second, even if the bits
"are already there" in some sense, they are now being consumed and
manipulated, at some cost.

Further, I assume that COW fork() consuming no "additional CPU time"
relative to vfork() means that the process in question writes to no
pages of memory whatsoever.  Pretty hard for me to see how such a
process does the return from the fork() call, and does any useful work
before the (presumably relatively prompt) exec() that often occurs in
the vfork() case.
But perhaps call/return is done in registers, and the process touches no
memory whatsoever in fiddling with file descriptors and whatnot.
Perhaps... but I doubt it.

Again, my point is that fork() involves semantics that vfork() does not,
and while the difference in their costs can be made quite small, it will
never be zero.  This point was simply a quibble with Guy's original
statement that there is *no* difference whatsoever.

Clearly, fork() can be made orders of magnitude less costly than it
typically is now, and can be made *nearly* as efficient as vfork().
Even to the point (on some machines or implementations) where vfork() or
other process creation "cheats" become unnecessary.

Chris has this to say about efficiency:

> chris@mimsy.UUCP (Chris Torek)
>>But fork() can *never* be as efficient as vfork(), and sometimes this
>>efficiency is crucial.
> I disagree with the first part of this statement.  If you wish to
> claim that on a Vax or other conventional two-level page table
> architecture, fork() cannot be as efficient as vfork(), *that* I
> can believe.  On the MIPS R2000, I think a copy-on-write fork()
> could be just as efficient.

Well, in the trivial sense that one could add a busy loop to a vfork()
implementation to slow it down to be equal to or slower than a fork()
implementation, I agree with Chris and disagree with my statement above.

But if the MIPS R2000 really has no additional cost of managing the
split of address spaces and copy of a page or two of data that is
written to by the child, I'll be very surprised indeed.  I agree that
the difference can be made very small, perhaps even negligible, but I
stick to my guns and insist that I see no way to make it zero.

>>Even if you can create copy-on-demand during fork(), the database
>>needed to keep track of these pages consumes kernel (possibly virtual)
>>memory space, and the creation and upkeep of this data consumes
>>kernel CPU time.
> What database?
> Obviously you are considering some particular
> implementation.  There are no doubt others, in which this operation
> is almost free---costing about the same as vfork().

Yes, yes: here I agree entirely.  "Almost free" and "about the same".
But I *don't* agree that it can be *totally* free and cost *exactly* the
same.  The semantics of the way memory is shared in vfork() and is not
in fork() makes this (as near as I can tell) impossible.

> (Vfork also avoids copying both
> pages and PTEs, whereas even copy-on-write fork must copy PTEs.
> But who says machines even *have* PTEs?)

Not me.  But any virtual memory system must spend CPU or memory or
silicon to achieve the *effect* of a PTE.  The cost cannot be zero.

>> What *should* have
>>been coined is the ability to create a process running a new executable
>>image in a single system call.
> As others have said, the big problem here is the need to alter some
> of the per-process information before running the new image.

And as others have answered (thanks Bob Larson), this is not as
difficult as it might seem at first, at least for simple cases.

Again, my claim is that *if* one wishes to make an optimized process
create, the way to do it is to use a create process call for cases where
it is possible and profitable, and fall back to the more thorough
semantics of fork() when necessary, and avoid vfork() like the plague.

Perhaps I overargued the case for there actually being such an optimized
process creation function, but I still think it is just the ticket in
many cases, just as rename() is the ticket in many cases.

--
"C'mon Merle."  "You coming or going?"  "Both."
                                --- from "Blood of Amber"
--
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw
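
[Illustration of the single-call process creation argued for in the post
above.  This is only a sketch, and it does not use the create_process()
call under discussion: it assumes the later POSIX posix_spawn()
interface as a stand-in.  The helper name spawn_with_stdout() and its
parameters are hypothetical.  The "file actions" list is one possible
answer to the objection that per-process state must be altered before
the new image runs: the change (here, redirecting standard output) is
described up front and applied as part of the one call.]

```c
#include <assert.h>
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: start `path` with arguments `argv`, with the
 * child's standard output redirected to `outfile`, then wait for it.
 * Returns the child's exit status, or -1 on any failure.
 */
int spawn_with_stdout(const char *path, char *const argv[],
                      const char *outfile)
{
    posix_spawn_file_actions_t fa;
    pid_t pid;
    int status, err;

    assert(path != NULL && argv != NULL && outfile != NULL);

    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;

    /* Describe the per-process change in advance: open outfile as fd 1. */
    err = posix_spawn_file_actions_addopen(&fa, STDOUT_FILENO, outfile,
                                           O_WRONLY | O_CREAT | O_TRUNC,
                                           0644);
    if (err == 0)
        err = posix_spawn(&pid, path, &fa, NULL, argv, environ);

    posix_spawn_file_actions_destroy(&fa);
    if (err != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

[Anything the file actions and spawn attributes cannot express still
needs the full fork()/exec() path, which matches the "fall back to
fork() when necessary" position taken in the post.]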
chris@mimsy.UUCP (Chris Torek) (07/10/87)
In article <129@xyzzy.UUCP> throopw@xyzzy.UUCP (Wayne A. Throop) writes:
>Further, I assume that COW fork() consuming no "additional CPU time"
>relative to vfork() means that the process in question writes to no
>pages of memory whatsoever.

No, it gets to write to one or two stack pages.

>... in the trivial sense that one could add a busy loop to a vfork()
>implementation to slow it down to be equal to or slower than a fork()
>implementation, I agree with Chris and disagree with my statement above.

This is not necessary.

>But if the MIPS R2000 really has no additional cost of managing the
>split of address spaces and copy of a page or two of data that is
>written to by the child, I'll be very surprised indeed.

Since the virtual to physical translation is done by kernel code, one
need not do anything special to split an address space: the two programs
can point to the same mappings.  The child part of a fork usually reads
about like this:

        register int pid;

        if ((pid = fork()) == 0) {
                dup2(a, 0);
                dup2(b, 1);
                signal(SIGINT, SIG_IGN);
                execv(p, v);
                _exit(1);
        }

All of this writes to register and stack (not static data), so the trick
is that in fork(), you pre-copy one or two pages of stack space, and
share all the rest.  The advance copies are already writable, so there
are no extra faults before the exec().  The parent process is likely to
write to its data space, but you prevent it from running for a moment
until the child execs or until some (short) timeout has expired.

>... any virtual memory system must spend CPU or memory or
>silicon to achieve the *effect* of a PTE.  The cost cannot be zero.

True, but the cost may be the same whether the pretend-PTE is being
shared via a vfork style fork or via a copy-on-write style fork().  This
means that the cost for the vfork style sharing is greater than it might
have to be if the hardware were different, but who cares?  We were
trying to eliminate vfork anyway.

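
[For readers who want to run the child-side pattern quoted above, here
is a minimal self-contained sketch.  The helper name run_and_capture(),
the pipe plumbing, and the exit code 127 are assumptions made for the
illustration and are not from the post.  As in the fragment, the child
touches only locals and descriptors between fork() and execv().]

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run path/argv with standard output sent down a
 * pipe, and copy up to cap-1 bytes of that output into buf
 * (NUL-terminated).  Returns the child's exit status, or -1 on failure.
 */
int run_and_capture(const char *path, char *const argv[],
                    char *buf, size_t cap)
{
    int fds[2], status;
    pid_t pid;
    size_t used = 0;
    ssize_t n;

    assert(buf != NULL && cap > 0);

    if (pipe(fds) < 0)
        return -1;

    if ((pid = fork()) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: adjust descriptors and signals, then exec. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        signal(SIGINT, SIG_IGN);
        execv(path, argv);
        _exit(127);             /* reached only if execv() failed */
    }

    /* Parent: read the child's output until EOF or the buffer is full. */
    close(fds[1]);
    while (used < cap - 1 &&
           (n = read(fds[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

[The sketch assumes descriptors 0 through 2 are already open in the
caller, so that neither pipe end lands on descriptor 1.]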
>Again, my claim is that *if* one wishes to make an optimized process
>create, the way to do it is to use a create process call for cases where
>it is possible and profitable . . . .

Certainly.  I am just playing devil's advocate anyway.  All the hardware
*I* have around here makes copy-on-write fork inherently more expensive
than vfork fork.  I do not know whether the difference is noticeable.
(Maybe I should hack up copy-on-write this evening just to find
out :-) .)

(No, have to get 4.3BSD working on that 8250 first.  No trouble, just a
few thousand lines of BI code to write, working without manuals as
usual [*].  Now where did I put those Mach tapes? . . .)

-----
[*] Actually, DEC may be reversing their trend: The KDB50 manuals that
came with the machine actually have a substantial fraction of the
information I need.  This is, at least, a good sign.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu     Path:   seismo!mimsy!chris