lewine@cheshirecat.webo.dg.com (Donald Lewine) (12/21/90)
Submitted-by: lewine@cheshirecat.webo.dg.com (Donald Lewine)

[[I hope that this has not been covered in detail on comp.std.unix. Delivery of the newsgroup has been uneven for the last few weeks.]]

[It's not been so much delivery as posting, due to lessened attention on my part because of lack of funding, and also because of no snitch reports. -mod]

Draft 5 of POSIX.1a defines qfork() by saying:

"The qfork() function shall be identical to the fork() function with the following exception: behavior is undefined if the child process executes any code between the return from qfork() and the succeeding call to one of the exec functions or _exit()."

This seems to be a very harsh restriction. The following code seems like it would be undefined:

	status = qfork();
	if (status == 0) execve(...);

I would propose replacing the phrase "executes any code" with "calls any function defined in this standard or the C standard {8}". I think that does what you mean.

--------------------------------------------------------------------
Donald A. Lewine (508) 870-9008 Voice
Data General Corporation (508) 366-0750 FAX
4400 Computer Drive. MS D112A
Westboro, MA 01580 U.S.A.
uucp: uunet!dg!lewine Internet: lewine@cheshirecat.webo.dg.com

Volume-Number: Volume 22, Number 35
jason@cnd.hp.com (Jason Zions) (12/25/90)
Submitted-by: jason@cnd.hp.com (Jason Zions)

>"The qfork() function shall be identical to the fork() function with
> the following exception: behavior is undefined if the child process
> executes any code between the return from qfork() and the succeeding
> call to one of the exec functions or _exit()."
>
>This seems to be a very harsh restriction. The following code seems
>like it would be undefined:
>	status = qfork();
>	if (status == 0) execve(...);
>
>I would propose replacing the phrase: "executes any code" with "calls
>any function defined in this standard or the C standard {8}" I think
>that does what you mean.

I think that loosens the restriction too much. The intent of the text, I believe, is that *doing* anything between qfork() and exec*() results in undefined behavior. Checking a variable doesn't *do* anything in this sense. The text tries to sidestep the issue of "is qfork() a 4.2BSD-style share-memory pseudo-fork or is it a real fork or what?" An application which takes an action after qfork() and before exec*() that depends upon the implementation of qfork() being any one of those things is inherently unportable.

Instead of replacing "executes any code", I think you could just add the phrase "which modifies memory or calls any function" and maintain the intent. Examining variables doesn't depend upon the virtual memory relationship between child and parent, but munging a stack for a function call might behave differently and hence must be rendered undefined.

Jason Zions

Volume-Number: Volume 22, Number 39
eggert@twinsun.uucp (Paul Eggert) (12/27/90)
Submitted-by: eggert@twinsun.uucp (Paul Eggert)
jason@cnd.hp.com (Jason Zions) writes:

	The intent of the [Posix.1a draft 5] text, I believe, is that *doing* anything
	between qfork() and exec*() [or _exit()] results in undefined behavior.

I hope this is not the intent. Current applications often do the following
actions between forking and execing in a child process:

	Redirect file descriptors using open(), close(), dup2().
	Call write() and _exit() if the exec*() fails.
	Change signal handling, uid/gid/pgid/session, working/root directory.
	Call user-defined functions that modify their local variables.

Both fork() and vfork() permit these actions. Why should qfork() prohibit them?
Volume-Number: Volume 22, Number 42
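To make the first two items on that list concrete, here is a minimal sketch of the usual idiom with plain fork(): the child redirects its standard output, execs, and reports failure with write() and _exit(). The helper name run_with_stdout_to() and the exit codes 126 and 127 are assumptions chosen for illustration; nothing here comes from the draft. Under the draft's wording for qfork(), every statement in the child branch would be "code executed between the return and the exec".

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the draft): run argv[0] with its
 * standard output redirected to `path`.
 * Returns the child's exit status, or -1 on failure. */
int run_with_stdout_to(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: everything below runs between fork() and exec. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);

        execvp(argv[0], argv);

        /* Only reached if the exec failed: report and leave. */
        static const char msg[] = "exec failed\n";
        if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
            /* Nothing more can be done about a failed write here. */
        }
        _exit(127);
    }

    /* Parent: wait for the child and decode its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argument vector, for example { "echo", "hello", NULL }, and inspect the returned status.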
terri_watson@cis.ohio-state.edu (12/27/90)
Submitted-by: terri_watson@cis.ohio-state.edu

Of course, assuming that the parent doesn't care about the return value of qfork(), one could manage to conform to the restrictions by:

	if (qfork() == 0) exit(execve(...));

or for the truly sick-at-heart:

	(status = qfork()) ? 1 : exit(execve(...));

(Of course it _could_ be argued that, in the second example, the very assignment of status = qfork() violates the rules, but I thought the humor value was high.)

But _I'm_ not writing code like that! <grin>

Terri

Volume-Number: Volume 22, Number 40
barmar@think.uucp (Barry Margolin) (12/27/90)
Submitted-by: barmar@think.uucp (Barry Margolin)

In article <16213@cs.utexas.edu> jason@cnd.hp.com (Jason Zions) writes:
>Instead of replacing "executes any code", I think you could just add the
>phrase "which modifies memory or calls any function" and maintain the
>intent. Examining variables doesn't depend upon the virtual memory
>relationship between child and parent, but munging a stack for a function
>call might behave differently and hence must be rendered undefined.

Rather than "modifies memory" I suggest it be "modifies any variables" or something like it that refers to high-level objects created by the C program. Just examining a variable may modify memory; on a stack machine, comparing a variable with zero generally requires pushing the variable onto the stack, and even on a register machine loading the variable into a register might force the register's old value to be written to memory.

--
Barry Margolin, Thinking Machines Corp.
barmar@think.com
{uunet,harvard}!think!barmar

Volume-Number: Volume 22, Number 41
lewine@cheshirecat.rtp.dg.com (Donald Lewine) (12/27/90)
Submitted-by: lewine@cheshirecat.rtp.dg.com (Donald Lewine)

In article <16213@cs.utexas.edu>, jason@cnd.hp.com (Jason Zions) writes:
|>
|> I think that loosens the restriction too much. The intent of the text, I
|> believe, is that *doing* anything between qfork() and exec*() results in
|> undefined behavior. Checking a variable doesn't *do* anything in this sense.
|> The text tries to sidestep the issue of "is qfork() a 4.2BSD-style
|> share-memory pseudo-fork or is it a real fork or what?" An application which
|> takes an action after qfork() and before exec*() that depends upon the
|> implementation of qfork() being any one of those things is inherently
|> unportable.
|>
|> Instead of replacing "executes any code", I think you could just add the
|> phrase "which modifies memory or calls any function" and maintain the
|> intent. Examining variables doesn't depend upon the virtual memory
|> relationship between child and parent, but munging a stack for a function
|> call might behave differently and hence must be rendered undefined.
|>

I think I would vote "NO" on qfork(). I think that there are two better solutions:

(1) Just use fork() and require the implementation to do it in an efficient manner.

(2) Add some new functions (fexec() ?) which do the fork() and exec() in one call. I know that this is not existing practice but neither was sigemptyset() or tcgetispeed(). This may be another case where it is better to define a new interface than to try to describe the existing practice. [[Also, qfork() is not quite vfork() so it can be shot down on the same basis.]]

As I think about it, (2) is a much nicer solution to the problem than qfork(). The library can then implement fexecl() as a vfork() followed by an execl() or a fork() followed by an execl(). The error handling is clean and the semantics are obvious.

--------------------------------------------------------------------
Donald A. Lewine (508) 870-9008 Voice
Data General Corporation (508) 366-0750 FAX
4400 Computer Drive. MS D112A
Westboro, MA 01580 U.S.A.
uucp: uunet!dg!lewine Internet: lewine@cheshirecat.webo.dg.com

Volume-Number: Volume 22, Number 58
peter@ficc.ferranti.com (Peter da Silva) (12/28/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <16066@cs.utexas.edu> lewine@cheshirecat.webo.dg.com (Donald Lewine) writes:
> I would propose replacing the phrase: "executes any code" with "calls
> any function defined in this standard or the C standard {8}" I think
> that does what you mean.

How about "executes any code that changes the state of the program". So, for example:

	static int child;

	child = 0;
	if(qfork() == 0) {
		child = 1;
		exec...;
	}

At this point, unless I'm confused about legal interpretations of "qfork()", the value of "child" is indeterminate.

--
Peter da Silva. `-_-' "Eat hot digital death, mainframe scum!"
+1 713 274 5180. 'U` -- Attack of the Killer Micros.
peter@ferranti.com

Volume-Number: Volume 22, Number 43
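For comparison, a minimal sketch of the same experiment with an ordinary fork(), where the answer is not indeterminate: the child's assignment lands in the child's own copy of the variable, so the parent still reads 0. The function name child_flag_seen_by_parent() is an assumption made for this illustration, and the child calls _exit() rather than exec to keep the example small. With a shared-memory implementation of qfork() the parent could instead observe 1, which is exactly the ambiguity described above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int child;

/* Hypothetical demonstration function: returns the value of `child`
 * as seen by the parent after the child process has set its own copy
 * to 1, or -1 on error. */
int child_flag_seen_by_parent(void)
{
    child = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        child = 1;      /* under fork() this touches only the child's copy */
        _exit(0);
    }

    /* Parent: wait so the child's store has certainly happened. */
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return child;
}
```

With a real fork() the function returns 0 in the parent regardless of what the child stored.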
marc@arnor.uucp (12/28/90)
Submitted-by: marc@arnor.uucp

It would be useful to know why the function is being proposed. One assumes an efficiency improvement, which implies that the specifiers have an implementation in mind.

Also, it should be remembered that unix systems don't execute C - they execute machine instructions generated by the C compiler. So it is necessary to specify the behavior in machine terms if the compiler writers are going to comply. In particular, there is nothing to prevent the compiler from moving certain computations to the space between the qfork and the exec! Does a compiler need to recognize qfork and exec as special?

Finally - if the intent is to "bundle" fork and exec together, assuming only that the fork succeeds, would it not be better to propose fexec* - a set of exec calls which fork first? Of course, this makes it absolutely clear that nothing can happen between fork and exec. If the combined function is then deemed useless, how can the qfork/exec idiom be better?

--
Marc Auslander <marc@ibm.com>

Volume-Number: Volume 22, Number 44
jfh@rpp386.cactus.org (John F Haugh II) (12/29/90)
Submitted-by: jfh@rpp386.cactus.org (John F Haugh II)

In article <16271@cs.utexas.edu> peter@ficc.ferranti.com (Peter da Silva) writes:
>How about "executes any code that changes the state of the program". So,
>for example:

executing =any= code changes the state of the program. that's the whole problem with this restriction - how much code is too much.

>At this point, unless I'm confused about legal interpretations of "qfork()",
>the value of "child" is indeterminate.

what is probably needed is a "spawn()" function (god, i never thought i'd advocate such a critter) which can be responsible for understanding the legalese. if the only thing you can do after "qfork()" is "exec()", why not merge the two steps into a single function? sounds like the only way to get it right anyhow.

--
John F. Haugh II UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832 Domain: jfh@rpp386.cactus.org
"While you are here, your wives and girlfriends are dating handsome American movie and TV stars. Stars like Tom Selleck, Bruce Willis, and Bart Simpson."

Volume-Number: Volume 22, Number 47
peter@ficc.ferranti.com (Peter da Silva) (12/30/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <16307@cs.utexas.edu> jfh@rpp386.cactus.org (John F Haugh II) writes:
> what is probably needed is a "spawn()" function (god, i never thought i'd
> advocate such a critter) which can be responsible for understanding the
> legalese.

Wow, this is the same man who ever so politely flamed me for daring to make such a suggestion.

fork can be implemented on a large number of O/Ses, but it's rather expensive. If the POSIX standard includes something like a spawn(), that'll sure increase the efficiency of a lot of POSIX software on systems that have been shoehorned into the model.

Yes, fork() is a cleaner method of creating new processes. Yes, it takes a fairly complex calling sequence to get spawn() to have anything like the functionality of fork()...exec(). But I think it'd be worthwhile to let a little heresy in in exchange for making POSIX more palatable to folks in poorer environments.

The few cases where spawn() won't fit would usually be better addressed by something like threads anyway...

--
Peter da Silva. `-_-' "Eat hot digital death, mainframe scum!"
+1 713 274 5180. 'U` -- Attack of the Killer Micros.
peter@ferranti.com

Volume-Number: Volume 22, Number 48
mason@tmsoft.uucp (Dave Mason) (12/31/90)
Submitted-by: mason@tmsoft.uucp (Dave Mason)

In article <16307@cs.utexas.edu> jfh@rpp386.cactus.org (John F Haugh II) writes:
>In article <16271@cs.utexas.edu> peter@ficc.ferranti.com (Peter da Silva) writes:
>>How about "executes any code that changes the state of the program". So,
>>for example:

>executing =any= code changes the state of the program. that's the whole
>problem with this restriction - how much code is too much.

The real requirement is presumably: ``Must not execute any code that changes MEMORY.'' As both the parent and child have their own register sets. Now, expressing that in a high-level way that is portable may be quite a trick. (Think of SPARC vs. 386 vs. HP/3000!)

>>At this point, unless I'm confused about legal interpretations of "qfork()",
>>the value of "child" is indeterminate.

Not if it's a register variable.

>what is probably needed is a "spawn()" function (god, i never thought i'd
>advocate such a critter) which can be responsible for understanding the
>legalese. if the only thing you can do after "qfork()" is "exec()", why
>not merge the two steps into a single function? sounds like the only way
>to get it right anyhow.

Not really. Assuming qfork in the parent can make sure there is nothing on its stack (that it needs to retrieve later) before it executes the system call instruction, and the child doesn't do anything except:

	a) make system calls that change its KERNEL state (open files, UID, etc.)
	b) change register variables

qfork can do everything useful that vfork can. (And because there's no memory being changed by the child that can be inspected by the parent, a fork implementation of qfork is still legal.)

(Personally I think the whole vfork/qfork/spawn thing is a horrible hack, but if we're going to be stuck with it, let's at least do it right!)

--
"Don't break it if you can't fix it."
../Dave Mason <mason%tmsoft@cs.toronto.edu>

Volume-Number: Volume 22, Number 49
jason@cnd.hp.com (Jason Zions) (01/01/91)
Submitted-by: jason@cnd.hp.com (Jason Zions)

>Also, it should be remembered that unix systems don't execute C - they
>execute machine instructions generated by the C compiler. So it is
>necessary to specify the behavior in machine terms if the compiler
>writers are going to comply. In particular, there is nothing to
>prevent the compiler from moving certain computations to the space
>between the qfork and the exec! Does a compiler need to recognize
>qfork and exec as special?

That's specious. Of *course* the compiler needs to recognize system calls as sync points. It wouldn't do for the compiler to migrate the instructions which initialize a write() buffer to after the write() call, would it?

>Finally - if the intent is to "bundle" fork and exec together,
>assuming only that the fork succeeds, would it not be better to
>propose fexec* - a set of exec calls which fork first? Of course,
>this makes it absolutely clear that nothing can happen between fork
>and exec. If the combined function is then deemed useless, how can
>the qfork/exec idiom be better?

Um, how about "existing practice"? More than that, there's a whole raft of common extensions revolving around various forms of qfork() that many would like to see remain as possible extensions.

Jazz

Volume-Number: Volume 22, Number 50
jfh@rpp386.cactus.org (John F Haugh II) (01/02/91)
Submitted-by: jfh@rpp386.cactus.org (John F Haugh II)

In article <16370@cs.utexas.edu> mason@tmsoft.uucp (Dave Mason) writes:
>The real requirement is presumably: ``Must not execute any code that
>changes MEMORY.'' As both the parent and child have their own register
>sets. Now, expressing that in a high-level way that is portable may
>be quite a trick. (Think of SPARC vs. 386 vs. HP/3000!)

Yes, that is the essence of the problem - there are CPUs out there which have a very small number of CPU registers (perhaps only two or three) available to the user. As I recall, the TI 9900 has one register which points to a location in memory where the rest of the "registers" exist, and of course older HP CPUs have their own ideas about where data is stored.

--
John F. Haugh II UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832 Domain: jfh@rpp386.cactus.org
"While you are here, your wives and girlfriends are dating handsome American movie and TV stars. Stars like Tom Selleck, Bruce Willis, and Bart Simpson."

Volume-Number: Volume 22, Number 53
guido@cwi.nl (Guido van Rossum) (01/03/91)
Submitted-by: guido@cwi.nl (Guido van Rossum)

peter@ficc.ferranti.com (Peter da Silva) writes:
>Yes, fork() is a cleaner method of creating new processes. Yes, it takes
>a fairly complex calling sequence to get spawn() to have anything like
>the functionality of fork()...exec(). But I think it'd be worthwhile to
>let a little heresy in in exchange for making POSIX more palatable to
>folks in poorer environments.

I know of precedents even in OS'es that support fork(): Amoeba and Topaz support a variant of what you call spawn(). (Note that the spawn() functions found in Microsoft C for MS-DOS emulate either just exec() or fork()+exec()+wait(), which is much less powerful, but all that MS-DOS can support (last time I looked).)

Amoeba's UNIX emulation supports fork(), but since Amoeba has no virtual memory (yet), it is fairly expensive. An alternative function is provided, "newproc()", which creates a child process running a different program (and, because it is Amoeba, also running on a different processor, in the average case) just like fork()+exec() would do, only much cheaper since the parent's address space never gets copied.

Amoeba's newproc() lets you change the two perhaps most important bits of "kernel state" that programs fiddle with between fork() and exec(): the set of signals to be ignored and the set of open file descriptors. The interface lets you specify a bitmask of signals that are to be ignored in the child (or -1 to inherit the parent's ignored signals) and an array of file descriptors which provides a mapping between file descriptors in the parent and in the child (also with an option to inherit all file descriptors from the parent).

Amoeba's library functions popen() and system() have been changed to use newproc(), and the shell uses newproc() for most simple program invocations (environment manipulations and a few other things make it fall back on fork()). The performance gain was well worth the hacking. The newproc() interface could also be implemented on UNIX using [v]fork() and exec(), although extreme cases of file descriptor permutations could fail if not enough spare file descriptors were available.

The Topaz operating system (an Ultrix clone for Firefly multiprocessors developed at DEC's Systems Research Center in Palo Alto) has a similar but more complete feature in its Modula-2+ (and now Modula-3?) version of the OS interface, not because Topaz doesn't have virtual memory (it does), but because the average Modula-2+ binary is several megabytes. In Topaz, you create a descriptor for the new process, which represents its relevant kernel state. The descriptor is initialized to inherit all state from the parent, and you can call library functions that modify various parts of the descriptor; this is the equivalent of what you would do between fork() and exec() in real UNIX. Finally you make a system call that presents the descriptor to the kernel for creation. Yes, it's a bit more tedious, but it has all the required functionality, unlike (it seems to me) the proposed qfork() with its not-well-understood restrictions on modifying memory.

--Guido
--
Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Well I'm a plumber. I can't act."

Volume-Number: Volume 22, Number 55
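The remark that a newproc()-style call could be layered on fork() and exec() can be sketched roughly as follows. This is not Amoeba's interface: the names spawn_mapped(), struct fd_map and MAX_MAP are hypothetical, signal handling is left out, and the child simply exits with 127 on any failure. The two-pass copy shows why permutations need spare descriptors: each source is first parked on a descriptor above every target, so that a later dup2() cannot clobber a source that is still needed.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_MAP 16   /* arbitrary limit for this sketch */

/* Hypothetical mapping entry: parent_fd becomes child_fd in the child. */
struct fd_map {
    int parent_fd;
    int child_fd;
};

/* Hypothetical newproc()-like call built on fork() and exec.
 * Returns the child's pid in the parent, or -1 on error. */
pid_t spawn_mapped(const char *file, char *const argv[],
                   const struct fd_map *map, int n)
{
    if (n < 0 || n > MAX_MAP)
        return -1;

    pid_t pid = fork();
    if (pid != 0)
        return pid;              /* parent, or fork() failure (-1) */

    /* Child. Find a descriptor number above every target. */
    int spare_base = 0;
    for (int i = 0; i < n; i++)
        if (map[i].child_fd >= spare_base)
            spare_base = map[i].child_fd + 1;

    /* Pass 1: park a copy of each source at or above spare_base.
     * This is the step that fails when spare descriptors run out. */
    int tmp[MAX_MAP];
    for (int i = 0; i < n; i++) {
        tmp[i] = fcntl(map[i].parent_fd, F_DUPFD, spare_base);
        if (tmp[i] < 0)
            _exit(127);
    }

    /* Pass 2: move each parked copy onto its target descriptor. */
    for (int i = 0; i < n; i++) {
        if (dup2(tmp[i], map[i].child_fd) < 0)
            _exit(127);
        close(tmp[i]);
    }

    execvp(file, argv);
    _exit(127);                  /* only reached if the exec failed */
}
```

Descriptors not named in the map are simply inherited here; a fuller version would need some way to say "close everything else", which this sketch does not attempt.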
donn@hpfcdc.fc.hp.com (Donn Terry) (01/03/91)
Submitted-by: donn@hpfcdc.fc.hp.com (Donn Terry)

To add to the qfork discussion:

1) The purpose of qfork() is similar to that of vfork(), but detailed discussions end up showing that there really is nothing that is safe to do between the qfork() and the exec() that will not cause problems on some architecture or other. That's the reason for the name change from vfork().

2) 1003.5 (Ada) has entry points "start_process[_search]", which are spawn()-like animals that take a data structure telling them what to do with file descriptors, user IDs and the like between the fork() and exec() that underlie them. (In Ada, fork() + exec() is "unsafe"; "unsafe" is a dirty word in the Ada community.) This might be (observation of fact, not an opinion) a place to start for something that does serve the fork(), do a few safe things, exec() sequence.

Donn Terry
Speaking only for myself.

Volume-Number: Volume 22, Number 56
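The "data structure that says what to do with file descriptors" approach is close in spirit to the posix_spawn() family that later POSIX revisions adopted. As a rough illustration only (it is not the 1003.5 interface, and the helper name spawn_with_stdout_to() is an assumption), the caller records the descriptor change in a file-actions object and hands it to the library, so no user code runs in the child before the new program starts.

```c
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] with standard output sent to `path`.
 * Returns the child's exit status, or -1 on failure. */
int spawn_with_stdout_to(const char *path, char *const argv[])
{
    posix_spawn_file_actions_t fa;
    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;

    /* Describe the child's descriptor setup instead of executing it:
     * open `path` as descriptor 1 in the new process. */
    int rc = posix_spawn_file_actions_addopen(&fa, STDOUT_FILENO, path,
                                              O_WRONLY | O_CREAT | O_TRUNC,
                                              0644);
    pid_t pid = -1;
    if (rc == 0)
        rc = posix_spawnp(&pid, argv[0], &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    if (rc != 0)
        return -1;

    /* Parent: wait for the child and decode its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

How a failed exec is reported (an error return from posix_spawnp() or a child that exits with 127) differs between implementations, so the sketch treats only the success path as well defined.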
lenox@media-lab.MEDIA.MIT.EDU (Lenox H. Brassell) (01/05/91)
Submitted-by: lenox@media-lab.MEDIA.MIT.EDU (Lenox H. Brassell)

In article <16478@cs.utexas.edu>, guido@cwi.nl (Guido van Rossum) writes:
> (Note that the
> spawn() functions found in Microsoft C for MS-DOS emulate either just
> exec() or fork()+exec()+wait(), which is much less powerful, but
> all that MS-DOS can support (last time I looked).)
>

Using Microsoft C under OS/2, you can pass P_NOWAIT as the first argument to the MSC spawn() functions, and the child process will run asynchronously. The spawn() functions return the child process's PID in this case.

Although MSC is certainly not a UNIX compiler, this "prior art" might be a good place to start if POSIX needs a "safe" qfork()+exec() service.

--lenox (lenox@media-lab.mit.edu)

Volume-Number: Volume 22, Number 59
domo@tsa.co.uk (Dominic Dunlop) (01/05/91)
Submitted-by: domo@tsa.co.uk (Dominic Dunlop)

In article <16483@cs.utexas.edu> lewine@dg.uucp (Donald Lewine) writes:
> I think I would vote "NO" on qfork(). I think that there are
> two better solutions:
> (1) Just use fork() and require the implementation to do it
> in an efficient manner.

Well, I know that we POSIX folks want to rule the world, but just how ugly a world will we put up with in order that we can rule it? qfork() (and vfork()) special-case a particular usage of the process creation mechanism in order to give implementors an easier time of it. By now we know that in a ``from the ground up'' virtual memory implementation of UNI*X with half-way useful memory management hardware, copy on write and similar finessing can make the general-case fork() call as efficient as any special-case variant, and, unlike the variants, is free from any threat that sixteen ton weights will be dropped on any programmer who steps out of line.

That said, POSIX has nothing to say about the efficiency of any particular implementation: that's a quality issue, not a conformance issue. One hopes that in the kind of free market that standards are supposed to encourage, better quality will win out over poorer quality, other things being equal.

So, yes, I'm in favour of keeping just fork(), and letting implementors worry about how slick they need to make it. After all, there are fewer implementors in this world than applications programmers, so it seems to make sense to localize the pain involved to the smaller group. Sorry about that.

I am aware that the efficient implementation of fork() is a real headache on some architectures, and particularly in hosted POSIX, but, well, so's cooking up fake inodes (or parts thereof). Happily, I hear nobody suggesting that we define unsafe versions of stat() to get around that problem.

Just how far should we bend over backwards to accommodate history? Remember that every extra function we define has to be maintained on all implementations for ever more (more or less), and that every extra function is something else that programmers have to learn about.

> (2) Add some new functions (fexec() ?) which do the fork() and
> exec() in one call. I know that this is not existing practice
> but neither was sigemptyset() or tcgetispeed(). This may be
> another case where it is better to define a new interface than
> to try to describe the existing practice. [[Also, qfork() is
> not quite vfork() so it can be shot down on the same basis.]]

I don't like this much either, but it might be an acceptable compromise if the effect of the new functions was defined in the standard in terms of fork() and exec() family functions. This would make it easy to bring existing implementations with an efficient fork() into line. Please let's resist the temptation to add new functionality to exec (for example) on the way past.

By the way, what line (if any) are the .5 (Ada) folks taking on this issue? How does all this square (if at all) with Ada's concept of a task?

--
Dominic Dunlop

Volume-Number: Volume 22, Number 60
peter@ficc.ferranti.com (Peter da Silva) (01/05/91)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

( Guido lists a couple of systems where a function with spawn() type semantics (create a new process and load a program)... )

Also, most DEC operating systems use this sort of call for process creation.

AmigaOS does things in the opposite order: you load a segment and then use CreateProc to start executing it.

--
Peter da Silva. `-_-' "Eat hot digital death, mainframe scum!"
+1 713 274 5180. 'U` -- Attack of the Killer Micros.
peter@ferranti.com

Volume-Number: Volume 22, Number 61
loren@Eng.Sun.COM (Loren L. Hart) (01/05/91)
Submitted-by: loren@Eng.Sun.COM (Loren L. Hart)

In article <16478@cs.utexas.edu> guido@cwi.nl (Guido van Rossum) writes:
>peter@ficc.ferranti.com (Peter da Silva) writes:
>
>>Yes, fork() is a cleaner method of creating new processes. Yes, it takes
>>a fairly complex calling sequence to get spawn() to have anything like
>>the functionality of fork()...exec(). But I think it'd be worthwhile to
>>let a little heresy in in exchange for making POSIX more palatable to
>>folks in poorer environments.
>
>I know of precedents even in OS'es that support fork(): Amoeba and
>Topaz support a variant of what you call spawn(). (Note that the
>spawn() functions found in Microsoft C for MS-DOS emulate either just
>exec() or fork()+exec()+wait(), which is much less powerful, but
>all that MS-DOS can support (last time I looked).)

The Ada POSIX Draft currently has some sort of spawn command, since fork is not necessarily safe with respect to Ada Tasks. I don't remember the specific routine, but it is likely to go through some change before the Ada version of the standard is finalized.

I would hope that this spawn capability will be merged back into the 1003.1 standard at some point in the future.

------------------------------------------------------------
Mr. Loren L. Hart
The Ada Ace Group, Inc.
PO Box 36195
San Jose, CA 95158
loren@cup.portal.com

Volume-Number: Volume 22, Number 62
mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (01/07/91)
Submitted-by: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)

>By now we
>know that in a ``from the ground up'' virtual memory implementation of
>UNI*X with a half-way useful memory management hardware copy on write and
>similar finessing can make the general-case fork() call as efficient as
>any special-case variant, and, unlike the variants, is free from any
>threat that sixteen ton weights will be dropped on any programmer who
>steps out of line.

You should also know that copy-on-write fork(), unlike vfork(), is inherently buggy and cannot be a general-purpose useful memory management mechanism.

If you have 50MB of swap space and want to fork() a 30MB process to exec a shell of less than 1MB, you can't. With COW fork(), there is a workaround. But the workaround is so incomplete that the system sometimes deadlocks.

Thus, fork(), even COW fork(), is not a proper mechanism to fork-exec other processes.

If you insist on using fork(), someday, thirty-two ton weights will drop on your head.

Masataka Ohta

PS

I can't understand why the name vfork() is changed to qfork(). On crippled systems which cannot support true vfork(), the implementors can make vfork() just fork(). It is actually done on NeXT. What's wrong with that?

Volume-Number: Volume 22, Number 64
gwyn@smoke.brl.mil (Doug Gwyn) (01/08/91)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn)

In article <16213@cs.utexas.edu> jason@cnd.hp.com (Jason Zions) writes:
>I think that loosens the restriction too much. The intent of the text, I
>believe, is that *doing* anything between qfork() and exec*() results in
>undefined behavior. Checking a variable doesn't *do* anything in this sense.
>The text tries to sidestep the issue of "is qfork() a 4.2BSD-style
>share-memory pseudo-fork or is it a real fork or what?"

We (IEEE P1003) deliberately omitted vfork() from the POSIX spec because it was not necessary, given a decent implementation of fork(). Why is this notion being reintroduced (quite carelessly so far as I can tell from the quotes so far) for 1003.1a?

Volume-Number: Volume 22, Number 66
jfh@rpp386.cactus.org (John F Haugh II) (01/14/91)
Submitted-by: jfh@rpp386.cactus.org (John F Haugh II)

In article <16873@cs.utexas.edu> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>You should also know that copy-on-write fork(), unlike vfork(), is inherently
>buggy and can not be a general-purpose useful memory management mechanism.

You are confusing theory with implementation. There is nothing "inherently" buggy with either vfork() or copy-on-write fork(). vfork() is fairly inflexible, and as pointed out by other writers, completely superfluous given a properly implemented fork().

>If you have 50MB swap space and want to fork() 30MB process to exec less
>than 1MB shell, you can't. With COW fork(), there is workaround. But the
>workaround is so incomplete that the system sometimes deadlocks.

Again, the problem you are alluding to results from the choice of early or late allocation of paging space. If you choose early allocation, you are correct - you can't fork() a 30MB process with only 20MB remaining. And yes, if you choose late allocation it is possible to deadlock, but only in the cases where you are doing more than you are with vfork(). Thus your complaint is simply invalid. If I modify no pages between fork() and exec() with late allocated COW fork(), I will =never= run out of page space simply because I required no additional pages.

Any scenario where I do modify a page is unsuitable for vfork(), so there is no room for comparison of the merits of fork() with vfork().

>Thus, fork(), even COW fork(), is not a proper mechanism to fork-exec
>other processes.

If you wish to describe some operation which is a simple fork-exec then you are correct. However, process creation frequently involves more than forking and execing a new command. It often involves the creation of IPC mechanisms (pipes, etc), signal manipulation, I/O redirection, ad nauseam.

--
John F. Haugh II UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832 Domain: jfh@rpp386.cactus.org
"While you are here, your wives and girlfriends are dating handsome American movie and TV stars. Stars like Tom Selleck, Bruce Willis, and Bart Simpson."

Volume-Number: Volume 22, Number 67
peter@ficc.ferranti.com (Peter da Silva) (01/14/91)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <16875@cs.utexas.edu> gwyn@smoke.brl.mil (Doug Gwyn) writes:
> We (IEEE P1003) deliberately omitted vfork() from the POSIX spec
> because it was not necessary, given a decent implementation of fork().

POSIX is not supposed to be a standard for UNIX only. In many non-UNIX environments a "decent implementation of fork" is quite difficult and may even be impossible. A poor implementation of fork() is likely to clobber performance, given how common it is in UNIX code.

I suspect that in practice most implementations will use an efficient call in the internals of system() and similar library routines. Either direct spawns or an implementation of qfork that looks something like:

	qfork()
	{
		Save stack context.
		Vector open(), exit(), etc to pseudo routines that save flags
		(either adjust jump table or set a flag that tells open
		we're "in a qchild").
		Return 0;
	}

	qopen() /* new version of open */
	{
		open file, save descriptor
	}

	...

	qexec() /* new version of exec */
	{
		call spawn, passing it the info for the new execution
		environment (juggling fds, signals, etc).
		Return execution environment to the one set up back at
		the qfork (juggle fds, etc)
		longjmp back to saved stack context, returning child pid.
	}

This gets around the cost of creating a new execution context that will simply be juggled briefly and discarded. Something like this will be needed to get decent performance out of systems like VMS where creating a new process is quite expensive.

Why not standardise this so portable programs can take advantage of it?

--
Peter da Silva. `-_-' "Have you hugged your wolf today?"
+1 713 274 5180. 'U`
peter@ferranti.com

Volume-Number: Volume 22, Number 68
domo@tsa.co.uk (Dominic Dunlop) (01/15/91)
Submitted-by: domo@tsa.co.uk (Dominic Dunlop) For those who have been following this issue, last week's meeting of 1003.1 decided that [qv]fork() would be dropped from the next draft of 1003.1a, the extensions to POSIX.1. It will be replaced in the next or a subsequent draft with a proposal for a spawn() family, analogous to the exec() family. Work already done by 1003.5 (Ada bindings) on an interface of this type will be taken into account, and the opportunity will be taken to address some very knotty problems which arise when one thread out of many in a single process decides to replace the whole process image. Threads are strictly a 1003.4 (realtime) issue, but it makes sense for any new interface proposed by 1003.1 in this area to make life easier for 1003.4, rather than more difficult. (Cue 1003.4 folks, to tell us all what a pig fork-exec is in a multi-threaded process.) -- Dominic Dunlop Volume-Number: Volume 22, Number 69
richard@aiai.ed.ac.uk (Richard Tobin) (01/15/91)
Submitted-by: richard@aiai.ed.ac.uk (Richard Tobin) In article <16895@cs.utexas.edu> jfh@rpp386.cactus.org (John F Haugh II) writes: >Any scenario where I do modify a page is unsuitable for vfork(), so there >is no room for comparison of the merits of fork() with vfork(). Most programs that use vfork() change the values of variables in the child. This is perfectly reasonable, so long as the parent doesn't rely on the value of those variables. Of course, these programs don't usually change many variables, so a copy-on-write fork() won't need many pages in this case. A c-o-w fork() with late allocation of pages could almost always be as robust as vfork() by pre-allocating a few pages. Surely the problem is when it's being used as a *real* fork(), and the program fails much later when it modifies one variable too many. I prefer to think of vfork() as being a way to save a process's kernel state - file descriptors etc - and having that state automatically restored when an exec() is done. -- Richard -- Richard Tobin, JANET: R.Tobin@uk.ac.ed AI Applications Institute, ARPA: R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin Volume-Number: Volume 22, Number 70
rml@hpfcdc.fc.hp.com (Bob Lenk) (01/16/91)
Submitted-by: rml@hpfcdc.fc.hp.com (Bob Lenk) In article <16895@cs.utexas.edu> jfh@rpp386.cactus.org (John F Haugh II) writes: > Again, the problem you are alluding to results from the choice of early > or late allocation of paging space. If you choose early allocation, you > are correct - you can't fork() a 30MB process with only 20MB remaining. > And yes, if you choose late allocation it is possible to deadlock, but > only in the cases where you are doing more than you are with vfork(). This sounds like a good argument for two mechanisms, one with early allocation and the other with late (or perhaps no) allocation. If the OS chooses blindly from a single interface (fork) it will sometimes choose "wrong" according to the needs of the application. There could be an interface to select the behavior of fork() rather than two separate calls; there are some advantages to each approach. In what I've read in this discussion about qfork(), I'm nervous about the spec that *nothing* is permitted between it and exec. If nothing is permitted portably, then a single spawn call should be defined. The whole advantage of separate calls is that things *can* be done in between. People will take advantage of this even if the standard calls the behavior implementation-defined, unspecified, or undefined, and the result is that people will write less-than-portable code. If the standard defines qfork(), I think it should define a useful set of operations permitted with well-defined results in the child prior to exec. Bob Lenk rml@fc.hp.com {uunet,hplabs}!fc.hp.com!rml Volume-Number: Volume 22, Number 72
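The kind of work Bob Lenk has in mind between the fork and the exec can be sketched in C. This is only an illustration under assumptions: it uses plain fork() rather than the draft's qfork(), and the function name run_redirected and the exit codes 126/127 are invented for the example. The child touches nothing but its own kernel state (one descriptor) before calling exec.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with its standard output redirected to outfile.
 * Returns the child's exit status, or -1 if it could not be started
 * or did not exit normally.  (Hypothetical helper, not from any draft.)
 */
int run_redirected(char *const argv[], const char *outfile)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: only kernel state is changed before the exec. */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);
        execv(argv[0], argv);
        _exit(127);                     /* exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A single spawn call would have to take the redirection as a parameter; with separate calls it is just ordinary code in the child, which is the advantage the post describes.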
gwyn@smoke.brl.mil (Doug Gwyn) (01/17/91)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn) In article <16992@cs.utexas.edu> peter@ficc.ferranti.com (Peter da Silva) writes: >In article <16875@cs.utexas.edu> gwyn@smoke.brl.mil (Doug Gwyn) writes: >> We (IEEE P1003) deliberately omitted vfork() from the POSIX spec >> because it was not necessary, given a decent implementation of fork(). >POSIX is not supposed to be a standard for UNIX only. In many non-UNIX >environments a "decent implementation of fork" is quite difficult ... Excuse me, but you're quite wrong. P1003 decided deliberately that we (I was there) would not compromise the (1003.1) interface in order to accommodate "layered" implementations, for example on non-UNIX based operating system kernels. Volume-Number: Volume 22, Number 73
peter@ficc.ferranti.com (Peter da Silva) (01/18/91)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva) I said: > >POSIX is not supposed to be a standard for UNIX only. In many non-UNIX > >environments a "decent implementation of fork" is quite difficult ... In article <17010@cs.utexas.edu> gwyn@smoke.brl.mil (Doug Gwyn) writes: > Excuse me, but you're quite wrong. P1003 decided deliberately that we > (I was there) would not compromise the (1003.1) interface in order to > accommodate "layered" implementations, for example on non-UNIX based > operating system kernels. I don't think I'm wrong here, unless you're leaving something out. There's a difference between: P1003 will not compromise the interface to accommodate layered implementations. And: P1003 is for UNIX only. And I fail to see how an extension that happens to make it easier to write portable programs that remain reasonably efficient on layered implementations compromises the interface. Nobody's saying "get rid of fork()" this time. -- Peter da Silva. `-_-' peter@ferranti.com +1 713 274 5180. 'U` "Have you hugged your wolf today?" Volume-Number: Volume 22, Number 75
addw@phcomp.co.uk (Alain Williams) (01/23/91)
Submitted-by: addw@phcomp.co.uk (Alain Williams) > There's a difference between: > > P1003 will not compromise the interface to accommodate > layered implementations. > > And: > > P1003 is for UNIX only. > > And I fail to see how an extension that happens to make it easier to > write portable programs that remain reasonably efficient on layered > implementations compromises the interface. Nobody's saying "get rid > of fork()" this time. OK, so you are writing a program that you intend to port onto every Posix machine in the universe. Do you use fork()/exec() or do you use spawn()? If you are writing the system(3C) function the answer is easy. If your application does a little more work in between fork() & exec(), do you jump through hoops to use whatever spawn() turns out to be and ``damage'' your implementation on true UNIX boxes for a few non-UNIX ones? I guess that what you would do is to use good old #ifdef to get the best of both worlds. So I don't think that it really matters what we end up with as long as the fork()/exec() alternative is well defined so that we only have to do the non-UNIX posix port once. What would probably be quite useful is a #define FORK_IS_PAINFULL that we can test and thus compile in the spawn() code. You should not forget that these standards are supposed to make life easier for application writers. Also the (good) application writer takes a pragmatic approach and does the best job that he can in a given environment; the most important thing is to get it working reasonably well - and quickly. Alain Williams +44 734 461232 phLOGIN our Turnkey Security Login utility is available NOW - ask me for info. Volume-Number: Volume 22, Number 81
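The #ifdef approach suggested above might look roughly like the following sketch. Several things here are assumptions: FORK_IS_PAINFULL is only the macro name floated in the post, not a real feature-test macro; run_program is an invented helper; and since no spawn() had been specified at the time, the later-standardized posix_spawn() from <spawn.h> stands in for "whatever spawn() turns out to be".

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifdef FORK_IS_PAINFULL          /* hypothetical macro from the post */
#include <spawn.h>
extern char **environ;
#endif

/* Run argv[0] and return its exit status, or -1 on failure. */
int run_program(char *const argv[])
{
    pid_t pid;

#ifdef FORK_IS_PAINFULL
    /* Layered systems: one combined call, no intermediate child state. */
    if (posix_spawn(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
#else
    /* Traditional UNIX: fork, then exec in the child. */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);              /* exec failed */
    }
#endif

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The caller sees one interface either way, so the non-UNIX port is done once, inside this function.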
michael@CS.UCLA.EDU (michael gersten) (01/25/91)
Submitted-by: michael@CS.UCLA.EDU (michael gersten) In article <17010@cs.utexas.edu> gwyn@smoke.brl.mil (Doug Gwyn) writes: >Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn) > >In article <16992@cs.utexas.edu> peter@ficc.ferranti.com (Peter da Silva) writes: >>In article <16875@cs.utexas.edu> gwyn@smoke.brl.mil (Doug Gwyn) writes: >>> We (IEEE P1003) deliberately omitted vfork() from the POSIX spec >>> because it was not necessary, given a decent implementation of fork(). >>POSIX is not supposed to be a standard for UNIX only. In many non-UNIX >>environments a "decent implementation of fork" is quite difficult ... > >Excuse me, but you're quite wrong. P1003 decided deliberately that we >(I was there) would not compromise the (1003.1) interface in order to >accommodate "layered" implementations, for example on non-UNIX based >operating system kernels. May I humbly ask what was wrong with vfork? As I understand it, vfork's semantics were those of a virtual fork-- conceptually two execution threads would return from the call, and they may or may not be sharing data space--any program that relied on one or the other was by definition broken. Now, with that definition, vfork() is pretty trivial to implement, even on a naked 68000 with no mmu. And on better, more complete systems, it can be identical to fork. So it's a call that is at least as efficient, if not more so, on a larger variety of hardware platforms. Now, why was it removed? What is wrong with it? Volume-Number: Volume 22, Number 82
sef@kithrup.COM (Sean Eric Fagan) (01/29/91)
Submitted-by: sef@kithrup.COM (Sean Eric Fagan) In article <17402@cs.utexas.edu> michael@CS.UCLA.EDU (michael gersten) writes: >So its a call that is at least, if not more, efficient on a larger >variety of hardware platforms. >Now, why was it removed? What is wrong with it? Why have it at *all*? If it is functionally equivalent to fork(), why in k&r's name add another call that does exactly the same thing? -- Sean Eric Fagan | "I made the universe, but please don't blame me for it; sef@kithrup.COM | I had a bellyache at the time." -----------------+ -- The Turtle (Stephen King, _It_) Any opinions expressed are my own, and generally unpopular with others. Volume-Number: Volume 22, Number 85
gwyn@smoke.brl.mil (Doug Gwyn) (01/30/91)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn) In article <17402@cs.utexas.edu> michael@CS.UCLA.EDU (michael gersten) writes: >Now, why was it removed? What is wrong with it? vfork() wasn't removed; rather it was never added. The base document (/usr/group 1984 Standard) did not have vfork() but it did have fork(). There was no need for a second flavor of fork(). Standards and systems in general should provide one sufficiently good way to perform a given operation. There were only two major arguments for vfork(): efficiency of fork/exec, which is not a convincing argument, and that it provides a form of sharing data space between two processes, which was judged to be an undesirable form of providing such a facility. Volume-Number: Volume 22, Number 86
domo@tsa.co.uk (Dominic Dunlop) (01/31/91)
Submitted-by: domo@tsa.co.uk (Dominic Dunlop) In article <17402@cs.utexas.edu> michael@CS.UCLA.EDU (michael gersten) writes: > May I humbly ask what was wrong with vfork? Yup. I'll humbly try to answer. > > As I understand it, vfork's semantics was a virtual fork-- > conceptually two execution threads would return from the call, > and they may or may not be sharing data space--any program > that relied on one or the other was by definition broken. The problem -- one problem -- is in coming up with a ``portable'' definition of ``data space''. On what we currently assume to be ``vanilla flavour'' architectures such as that of the 68000 which you cite, it's fairly obvious. But on others, it's not. What about registers? Are they data space? No? Even on architectures with register windows which may or may not map onto main memory addresses? Bear in mind that such exotica are not so exotic any more: RISCs use them widely. It seems that any definition which is safe on all architectures is liable to constrain what one may do between [qv]fork() and exec() so greatly that it turns out to be better to define a combined spawn() function. This would make it less likely that the POSIX standards would contain explicit or implicit assumptions about the architecture of the hardware on which a conforming implementation runs. While this might be nice for aging architectures such as that of (say) the Unisys 1100 series, more importantly, it would not constrain architectural advances of the future needlessly to conform to the nineteen seventies' ideas of what was a ``clean machine'' in order to be able efficiently to implement POSIX interfaces. > Now, why was it removed? What is wrong with it? -- see also comp.std.unix Volume 22, Number 69. -- Dominic Dunlop Volume-Number: Volume 22, Number 93
eggert@twinsun.uucp (Paul Eggert) (02/01/91)
Submitted-by: eggert@twinsun.uucp (Paul Eggert) ... any definition which is safe on all architectures is liable to constrain what one may do between [qv]fork() and exec() so greatly that it turns out to be better to define a combined spawn() function. What's wrong with the following definition, which permits the usual actions between fork() and exec()? Isn't this definition easy to explain and support? vfork() acts like fork(), except: 1. Any variables that are common to both parent and child, and are changed by the child before it exits or execs, have undefined values in the parent when its vfork() returns. 2. The child may not call unsafe standard functions (these are nonreentrant; see Posix 1003.1-1988 section 3.3.1.3(3)(f)). 3. The child may not return from the function that called vfork(), either explicitly or via longjmp(). 4. The program must #include <vfork.h>. (2) follows from (1). (4) is common practice, and gets around the exotic architecture problem. (2)'s phrase ``common to both parent and child'' lets the child call reentrant functions, because their automatic variables do not exist in the parent. I don't much like vfork(), but it's common practice, it is much faster on many hosts, and many widely distributed Unix programs use it. By all means, let's invent other primitives if they're needed, but why not first standardize the primitives we already have? Volume-Number: Volume 22, Number 96
sef@kithrup.COM (Sean Eric Fagan) (02/03/91)
Submitted-by: sef@kithrup.COM (Sean Eric Fagan) In article <17572@cs.utexas.edu> eggert@twinsun.uucp (Paul Eggert) writes: >What's wrong with the following definition, which permits the usual actions >between fork() and exec()? Isn't this definition easy to explain and support? [stuff] Because one of the few reasons I would like vfork() (or something similar) is now missing: the parent does not execute until the child has exit'ed or exec'ed something. -- Sean Eric Fagan | "I made the universe, but please don't blame me for it; sef@kithrup.COM | I had a bellyache at the time." -----------------+ -- The Turtle (Stephen King, _It_) Any opinions expressed are my own, and generally unpopular with others. Volume-Number: Volume 22, Number 99
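The ordering Sean Fagan values in vfork() (the parent does not proceed until the child has exec'ed or exited) can be approximated with plain fork() by a different, well-known technique: a pipe whose write end is marked close-on-exec. The sketch below is not vfork() and makes no claim about any draft; the function name spawn_and_wait_for_exec and its return convention are invented for the illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start argv[0]; return only after the child has exec'ed or exited.
 * Returns 0 and stores the pid in *pidp if the exec succeeded, the
 * child's errno value if the exec failed (child already reaped), or
 * -1 if the pipe, fork, or read itself failed.
 */
int spawn_and_wait_for_exec(char *const argv[], pid_t *pidp)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    /* The kernel closes the write end on a successful exec. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        execv(argv[0], argv);
        int err = errno;            /* exec failed: tell the parent why */
        if (write(fds[1], &err, sizeof err) < 0) {
            /* nothing more the child can do */
        }
        _exit(127);
    }

    close(fds[1]);
    *pidp = pid;

    int err = 0;
    ssize_t n;
    do {
        n = read(fds[0], &err, sizeof err);   /* blocks until exec or exit */
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    if (n == (ssize_t)sizeof err) {
        waitpid(pid, NULL, 0);      /* reap the failed child */
        return err;
    }
    return n == 0 ? 0 : -1;         /* EOF means the exec happened */
}
```

As a side effect the parent also learns why an exec failed, which the bare fork()/exec() pair does not report directly.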
mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (02/03/91)
Submitted-by: mohta@necom830.cc.titech.ac.jp (Masataka Ohta) In article <17527@cs.utexas.edu> domo@tsa.co.uk writes: >The problem -- one problem -- is in coming up with a ``portable'' >definition of ``data space''. These are 'problems' (actually, not a problem) of C, not UNIX. There is no problem about data space. C has a clear and portable notion of what data space is: register and memory. That's all. It has very little to do with UNIX or vfork(). >On what we currently assume to be >``vanilla flavour'' architectures such as that of the 68000 which you >cite, it's fairly obvious. But on others, it's not. What about >registers? Are they data space? No? Even on architectures with >register windows which may or may not map onto main memory addresses? >Bear in mind that such exotica are not so exotic any more: RISCs use >them widely. Clearly, on exotic architectures, a C (not UNIX) pointer may point to a register. It may be an exotic feature of C. But it is never a problem of C, UNIX, or vfork(). >It seems that any definition which is safe on all architectures is >liable to constrain what one may do between [qv]fork() and exec() so >greatly No. First, list every operation which is safe between fork() and exec() *and* between BSD vfork() and exec(). Then, those are the safe operations of POSIX vfork() on *all* architectures. >that it turns out to be better to define a combined spawn() >function. Most (perhaps more than 90%) of the cases where fork/exec is necessary are covered by system(). spawn() is not necessary. The rest are special cases, where a combined spawn() can help very little and separate [v]fork() and exec() are really necessary. Is it a role of POSIX to define unnecessary and totally alien functions and badly modify UNIX? Don't try to reinvent wheels. Masataka Ohta Volume-Number: Volume 22, Number 100
mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (02/03/91)
Submitted-by: mohta@necom830.cc.titech.ac.jp (Masataka Ohta) In article <16994@cs.utexas.edu> richard@aiai.ed.ac.uk (Richard Tobin) writes: >Of course, these programs don't usually change many variables, so a >copy-on-write fork() won't need many pages in this case. A c-o-w >fork() with late allocation of pages could be as robust as >vfork() almost always by pre-allocating a few pages. That is the case where COW fork() with late allocation of pages or vfork() must be used. >Surely the problem is when it's being used as a *real* fork(), and the >program fails much later when it modifies one variable too many. If fork() is used as a real fork(), it is very probable that it modifies its data space many times. The severe problem is that the program cannot control the failure. It immediately dies. With non-COW fork() or COW fork() with immediate allocation of pages, such a failure can be detected as an error return value of fork() and may be processed in a controlled fashion. Masataka Ohta Volume-Number: Volume 22, Number 101
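What "processed in a controlled fashion" could mean in practice is sketched below, assuming an implementation that reports a shortage of paging space at fork() time through the documented EAGAIN or ENOMEM errors. The helper names and the retry policy are invented for the illustration; they are not part of any draft.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Nonzero if a fork() failure looks like a resource shortage. */
int fork_error_is_resource_shortage(int err)
{
    return err == EAGAIN || err == ENOMEM;
}

/*
 * Call fork(), retrying a bounded number of times on resource
 * shortage.  Returns what fork() returns; on final failure returns
 * -1 with errno left as set by the last fork() attempt.
 */
pid_t fork_with_retry(int attempts, unsigned delay_seconds)
{
    for (;;) {
        pid_t pid = fork();
        if (pid >= 0)
            return pid;                 /* parent (pid > 0) or child (0) */
        if (!fork_error_is_resource_shortage(errno) || --attempts <= 0)
            return -1;                  /* give up; the caller decides */
        sleep(delay_seconds);           /* wait for resources to free up */
    }
}
```

With late allocation there is no such return value to test: the shortage shows up later, when the child touches a page, and the program has no chance to react.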
mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (02/07/91)
Submitted-by: mohta@necom830.cc.titech.ac.jp (Masataka Ohta) In article <17633@cs.utexas.edu> peter@ficc.ferranti.com (Peter da Silva) writes: >Subject: Re: spawn() wars... please... not again... And you have already lost the war. So, please! not again! >Leave in the >fork() call, but allow a more efficient (and, let's face it, easier to >understand) alternative: spawn(). In the last war, you couldn't even show a specification of spawn(), because of its complexity. Every UNIX programmer understands fork() and exec(), but can't understand spawn() without its specification. >Leave in the fork() call, So, you are not trying to eliminate fork(). You should also preserve exec(), because exec() has its own purpose and several programs are actually using it without fork(). >No. Those are the safe operations between fork() and exec() on UNIX. > >POSIX looks like it's going to comprise far more than UNIX. If fork() and exec() exist in POSIX, many (if POSIX should be useful, all) operations are safe between fork() and exec(). >Look, I know you don't like spawn(). But in a lot of environments... INCLUDING >ONES THAT ARE OTHERWISE QUITE CAPABLE OF SUPPORTING A POSIX ABI... it is *not* >possible to do a safe and efficient implementation of fork(). A lot of? Can you name them? Anyway, it is the problem of that environment. It should provide a safe and inefficient implementation of fork() and a safe and efficient implementation of system(). If a programmer wants to squeeze out extra performance in some case which cannot be covered by system() (does such a case actually exist?), he can do so by not using POSIX there if he thinks efficiency is more important than the ABI. >Let's say you define vfork() as "set a flag that all posix calls that deal >with uid, signals, files, etc... look at, so they just write a "script" of >actions to take on behalf of the new process". I can't understand what you are saying. "set a flag"? What? 
>> Most (perhaps, more than 90%) of cases where fork/exec is necessary >> is covered by system(). spawn() is not necessary. > > No, system() and popen() can not, ever, let you pass a set of > arguments to a program without diddling by the shell. When you > have no way of knowing whether that shell will be sh, csh, ksh, > or even rc what can you do to protect yourself? Read the manual! System() and popen() always use "/bin/sh". > Who knows, I can easily imagine DEC setting things up so a user > could set his shell to DCL and hose *everything* up. User's shell has nothing to do with the behaviour of system() nor popen(). > Using system() in programs like (for example) uucp, mail handlers, > and so on is a security hole you can drive a truck through. Yes, it can be a security hole if used improperly just like many system calls. I'm sure spawn() can also be a security hole. So what? Masataka Ohta Volume-Number: Volume 22, Number 112
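The disagreement about system() comes down to who parses the arguments. A small C sketch of the difference follows; run_via_shell and run_direct are invented names, and fork()/execv() stand in for whichever process-creation primitive ends up in the standard. With system(), the command string is handed to the shell, so a ';' inside what was meant to be one argument starts a second command. With an argv vector, the same bytes reach the program untouched.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a command line through the shell, as system() does. */
int run_via_shell(const char *command)
{
    int status = system(command);
    return (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}

/* Run argv[0] directly; no shell ever parses the arguments. */
int run_direct(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);                     /* exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_via_shell("/bin/echo a; exit 5") lets the shell act on the "exit 5", whereas run_direct() with the vector { "/bin/echo", "a; exit 5", NULL } passes "a; exit 5" to echo as a single argument. Whether that counts as a hole in system() or as improper use of it is exactly what the two posters dispute.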
std-unix-request@uunet.uu.net (John S. Quarterman) (02/08/91)
Submitted-by: std-unix-request@uunet.uu.net (John S. Quarterman) Unfortunately, I'm beginning to hear from readers who are deleting unread every message in the qfork/spawn discussion. I have to admit I tend to agree that there's more heat than light being shown on all sides of this one. So, could we please either get back to reasoned technical discussion, or move on to something else? John S. Quarterman, moderator, comp.std.unix and std-unix@uunet.uu.net Volume-Number: Volume 22, Number 113