henry@utzoo.uucp (Henry Spencer) (03/04/89)
In article <85126@felix.UUCP> preston@felix.UUCP (Preston Bannister) writes:
>I've never been particularly fond of fork() as a primitive operation...
>...In the "real" world, most fork() calls
>are immediately followed with an exec() call.  The copy of the
>process that fork() creates, exec() discards, so why bother making
>the copy...

This is a common misconception.  The fact is, in the "real" world, very few fork() calls are *immediately* followed by an exec() call.  There is almost always some manipulation of file descriptors, signals, etc. in between.
--
The Earth is our mother;       | Henry Spencer at U of Toronto Zoology
our nine months are up.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
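The kind of work that sits between fork() and exec() can be sketched as follows. This is only an illustration of the point above, not code from any poster: the helper name run_with_stdout_to and the choice of redirecting stdout and resetting SIGINT are assumptions made for the example.

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout redirected to out_path.
   Returns the child's exit status, or -1 on failure. */
int run_with_stdout_to(const char *out_path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the manipulation done "in between" fork() and exec(). */
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        close(fd);
        /* An ignored signal stays ignored across exec(), so restore
           the default in case the parent was ignoring SIGINT. */
        signal(SIGINT, SIG_DFL);
        execvp(argv[0], argv);
        _exit(127);             /* reached only if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

None of the child's setup touches the parent's descriptors or signal state, which is the flexibility the separate fork() and exec() calls buy.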
preston@felix.UUCP (Preston Bannister) (03/07/89)
From article <1989Mar3.174924.977@utzoo.uucp>, by henry@utzoo.uucp (Henry Spencer):
> In article <85126@felix.UUCP> preston@felix.UUCP (Preston Bannister) writes:
>>I've never been particularly fond of fork() as a primitive operation...
>>...In the "real" world, most fork() calls
>>are immediately followed with an exec() call.  The copy of the
>>process that fork() creates, exec() discards, so why bother making
>>the copy...
>
> This is a common misconception.  The fact is, in the "real" world,
> very few fork() calls are *immediately* followed by an exec() call.
> There is almost always some manipulation of file descriptors, signals,
> etc. in between.

Oops...  What I _meant_ (as opposed to what I said :-) was that fork() calls are almost always followed very closely by exec().  The point I am trying to make is that the child-is-copy-of-parent semantic of fork() is irrelevant to _most_ programs.

What most programs do between fork() and exec() is exactly what Henry has listed above.  The semantics actually needed by most programs that use fork()/exec() could be met by cloning the file, signal, etc. state from the parent process without copying the *entire* data segment of the parent.

The semantics of the BSD vfork() call (kludge?) are close.  For those of you who can't do "man vfork" :-) the vfork() call creates a "new system context" for the child process without copying the entire parent's address space.  The child runs out of the parent's address space until it calls exec().  This is a kludge as the child is using the parent's _stack_ and could cause major problems if exec() is not called in the same procedure as vfork().

Come to think of it, you could make a "safe" version of vfork() fairly easily:

    int safe_vfork(init_fn)
    int (*init_fn)();
    {
        int pid = vfork();
        if (pid==0) _exit((*init_fn)(pid));
        return pid;
    }

Where init_fn would call exec()... did I miss something?
--
Preston L. Bannister
USENET     : hplabs!felix!preston
BIX        : plb
CompuServe : 71350,3505
GEnie      : p.bannister
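For comparison, the idea of handing the child its file and signal state without copying the parent's data can be sketched with posix_spawnp(), an interface POSIX standardized well after this thread. Nobody in the thread mentions it; it is used here only as an assumption-laden illustration, and the helper name spawn_with_stdout_to is hypothetical.

```c
#include <fcntl.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: start argv[0] with stdout sent to out_path and
   SIGINT reset to its default.  The caller never runs as a forked copy.
   Returns the child's exit status, or -1 on failure. */
int spawn_with_stdout_to(const char *out_path, char *const argv[])
{
    posix_spawn_file_actions_t actions;
    posix_spawnattr_t attr;
    sigset_t defaults;
    pid_t pid;
    int status, err;

    if (posix_spawn_file_actions_init(&actions) != 0)
        return -1;
    if (posix_spawnattr_init(&attr) != 0) {
        posix_spawn_file_actions_destroy(&actions);
        return -1;
    }

    /* File-descriptor state: open out_path as the child's stdout. */
    err = posix_spawn_file_actions_addopen(&actions, STDOUT_FILENO, out_path,
                                           O_WRONLY | O_CREAT | O_TRUNC, 0644);

    /* Signal state: SIGINT takes its default action in the child. */
    sigemptyset(&defaults);
    sigaddset(&defaults, SIGINT);
    if (err == 0)
        err = posix_spawnattr_setsigdefault(&attr, &defaults);
    if (err == 0)
        err = posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETSIGDEF);

    if (err == 0)
        err = posix_spawnp(&pid, argv[0], &actions, &attr, argv, environ);

    posix_spawn_file_actions_destroy(&actions);
    posix_spawnattr_destroy(&attr);
    if (err != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The trade-off is the one debated in this thread: the set of things the child can have done to it is fixed by the interface, where fork() followed by exec() lets the child run arbitrary code in between.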
throopw@bert.dg.com (Wayne A. Throop) (03/09/89)
> preston@felix.UUCP (Preston Bannister)
>> henry@utzoo.uucp (Henry Spencer) (paraphrased by Wayne Throop)

>> [... there is many a dup() 'twixt fork() and exec() ...]

> What I _meant_ (as opposed to what I said :-) was that fork() calls are
> almost always followed very closely by exec().  The point I am trying to
> make is that the child-is-copy-of-parent semantic of fork() is irrelevant to
> _most_ programs.  [... and vfork() better represents real needs ...]

Well.... no.  Vfork does nothing that a properly implemented fork cannot do (that is, essentially nothing aside from the copying of the data that the child actually faults on for writing).  Vfork was a self-admitted kludge introduced because the VM system of some version of BSD lost in the mists of time didn't do an adequate job in supporting fork.

On the other hand, I agree that it *might* be a good thing to support a non-copying-semantic method of process creation.  It's just that the interaction of vfork() with exit() and exec() is bogus, and a much more excellent method would be to have push/pop of file descriptor and signal state operations, along with a create-process operation.  This would give the advantages of vfork (that is, efficiency with inadequate or overly simple VM implementations) without the appearance of kludgery.

On the other hand, we've just added three system calls to handle a case which is just a subset of the possibilities given by appropriate use of fork() and exec().  Leave aside the fact that the case covered by the three additional system calls is the most common... it's still pretty clear that fork() + exec() is exactly the right division of responsibility if what you want is neat, sweet, and petite.
--
Have you heard of the wonderful one-hoss shay,
That was built in such a logical way
It ran a hundred years to a day?
                --- Oliver Wendell Holmes
--
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw
rec@indetech.UUCP (Rick Cobb) (03/09/89)
Encore has a research system in which fork is broken down to a very fine level of granularity.  See ``Variable Weight Processes with Flexible Shared Resources'' in the _1989 Winter USENIX Conference Proceedings_, available from the USENIX Association, P.O. Box 2299, Berkeley.

Their approach has two elements to it: first, you make all the process-specific stuff in the kernel (the user and proc structs, call them together the process structure) be accessed through a pointer.  I.e., instead of

    process -> file_descriptor_array

you do

    process -> file_descript_p -> file_descriptor_array

You also stuff reference counts in the stuff that's pointed to.  Now, you have a kernel where a parent or child can choose *exactly* which parts of their environments to share, and which ones not to share.

Then, what sfork() does is to just (1) create a new process pointer (providing a new runtime context only), and (2) copy the references from the process structure of the parent to the child.  If you want a *real* fork, you then (could) use the ``resctl'' call to ask that you be given a copy of the parent's resource, instead of the resource itself, or that you be given a freshly initialized structure to point to.  They do give a system call interface called fork() which does all the copying.

This facility is intended primarily for enhancing sharability of resources; for example, if you do no copying (just an sfork()), you get something about as light as a Mach thread.  I haven't thought about what changes or additional features you might give to exec in this system; it seems you might be able to get a faster fork/exec pair from it, though.
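The pointer-plus-reference-count arrangement described above can be sketched in user-space C. This is a toy model, not Encore's kernel code: apart from file_descript_p, which comes from the post, every name here (fd_table, process_share, process_unshare, MAX_FDS) is hypothetical, and process_unshare only loosely stands in for what the post attributes to ``resctl''.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_FDS 8

/* A resource reached through a pointer, carrying its own reference count. */
struct fd_table {
    int refcount;
    int fds[MAX_FDS];
};

/* Toy "process structure": it holds a reference, not the table itself. */
struct process {
    struct fd_table *file_descript_p;
};

/* Allocate a zeroed table owned by exactly one process. */
struct fd_table *fd_table_new(void)
{
    struct fd_table *t = calloc(1, sizeof *t);
    if (t != NULL)
        t->refcount = 1;
    return t;
}

/* sfork-style step: the child points at the parent's table. */
void process_share(struct process *child, const struct process *parent)
{
    child->file_descript_p = parent->file_descript_p;
    child->file_descript_p->refcount++;
}

/* "Real fork" step: give proc a private copy of a shared table.
   Returns 0 on success, -1 if memory runs out. */
int process_unshare(struct process *proc)
{
    struct fd_table *old = proc->file_descript_p;
    struct fd_table *copy;

    if (old->refcount == 1)
        return 0;               /* already private, nothing to copy */
    copy = malloc(sizeof *copy);
    if (copy == NULL)
        return -1;
    memcpy(copy->fds, old->fds, sizeof copy->fds);
    copy->refcount = 1;
    old->refcount--;
    proc->file_descript_p = copy;
    return 0;
}
```

A child that only calls process_share() sees every change the parent makes to the table; one that also calls process_unshare() gets the independent copy a conventional fork() would have produced.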
preston@felix.UUCP (Preston Bannister) (03/10/89)
From article <4566@cs.Buffalo.EDU>, by ugkamins@sunybcs.uucp (John Kaminski):
> It is my opinion that fork() assumes nothing about anything.  It
> is merely a standard system call that has standard semantics --
> i.e., you will get another independently running process with the
> same program and the same variables, the same state of just about
> everything...and the parent gets a pid (the returned value).  The
> manual for the UNIX system in question usually clearly tells you
> the semantics (in the case of MINIX, I guess that would be the
> description for V7 UNIX).

Ah, yes but the "standard semantics" make assumptions about what the machine can do.  It is somewhere between difficult and impossible to implement fork() efficiently on a "real" memory processor.

> Also, I agree that most of the time that exec() of some kind is
> usually performed after most fork()s.  However, it is not always
> desired.  Take for example my attempt at a real-time communications
> program ...[deleted]
> ... it seemed to me that the only way to prevent the program
> from blocking on read() calls was to have two independent
> processes, each of which could block all it wanted while waiting
> for characters, because the other process would still be running or
> ready to run.

Oh good! :-)  This is an example of where what you really want is something that does _less_ than fork().  What you want for this problem is separate processes _without_ separate address spaces.  The buzzwords are "lightweight processes", or Mach OS "threads".

Your real-time communications program can be most efficient if the processes _share_ the same data.  Your subprocesses that will spend most of their time blocked waiting for I/O probably don't need a _copy_ of all the program's data.

> A fork() as defined under UNIX would have been
> great, but os9fork() requires a string argument which is the
> module name (executable files are roughly the equivalent of
> modules) to be "exec()ed" after the new process is created.  I had
> given up on the project due to lack of time, but what I was
> attempting is something like os9fork(argv[0]) and have the program
> somehow determine if it had already been started and if not,
> signal the started one that it was the copy....as you can see, it
> got complicated REAL fast.

OS-9 runs on processors (6809, 68000, others?) for which Unix-style fork() cannot be efficiently implemented, which I'm sure is why the os9fork() call is different.  I suspect that OS-9 has a call for starting a subprocess that looks something like:

    start_process(proc,stacksize)

Where proc is the address of a function, and stacksize is the number of bytes to allocate for the process's stack.  (Can someone who knows OS-9 verify this?)  You will find this both easier and more efficient for your purposes.

> In short, if you are given the choice of fork() then exec() or
> forkexec(), I'll take the control offered by separate functions
> any day.

I'm not arguing against separate functions, just against the semantics of fork().
--
Preston L. Bannister
USENET     : hplabs!felix!preston
BIX        : plb
CompuServe : 71350,3505
GEnie      : p.bannister
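The "separate processes without separate address spaces" idea can be sketched with POSIX threads, used here as a stand-in for the lightweight processes mentioned above. This is an assumption made for illustration; it is not the OS-9 call speculated about, nor Mach's thread interface, and the names shared, reader and send_and_receive are hypothetical. One thread blocks in read() while the caller keeps running, and both see the same buffer with no copy made.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>

/* State visible to both threads; nothing is copied when the reader starts. */
struct shared {
    int fd;             /* descriptor the reader blocks on */
    char buf[64];       /* filled in by the reader thread */
    ssize_t len;        /* bytes stored in buf, or -1 on error */
};

/* Reader thread: free to block, since the other thread keeps running. */
static void *reader(void *arg)
{
    struct shared *s = arg;

    s->len = read(s->fd, s->buf, sizeof s->buf - 1);
    if (s->len >= 0)
        s->buf[s->len] = '\0';
    return NULL;
}

/* Hypothetical helper: start a reader on a pipe, send msg to it from the
   calling thread, and return how many bytes the reader received. */
ssize_t send_and_receive(const char *msg, struct shared *s)
{
    int p[2];
    pthread_t tid;
    ssize_t sent;

    if (pipe(p) < 0)
        return -1;
    s->fd = p[0];
    s->len = -1;
    if (pthread_create(&tid, NULL, reader, s) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* The caller is not blocked while the reader waits for input. */
    sent = write(p[1], msg, strlen(msg));
    close(p[1]);                /* reader sees EOF if nothing was written */
    pthread_join(tid, NULL);
    close(p[0]);
    return sent < 0 ? -1 : s->len;
}
```

After send_and_receive() returns, the caller reads the result straight out of s->buf, which is the sharing that a fork()ed child with its own copy of the data could not offer without extra machinery.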
karl@decwrl.dec.com (Karl Rowley) (03/11/89)
fork() is expensive if you have to copy the entire process to do it.  Some operating systems with virtual memory implement 'copy on write' for virtual memory pages.  When a fork() is done, no memory is actually copied.  A page is copied when the parent or child writes to it.  This makes fork() a whole lot cheaper.

The fork()/exec() combination may not always be efficient, but it is very flexible.

Karl Rowley
Evans and Sutherland Computer Division
escd!karl@decwrl.dec.com
...!decwrl!escd!karl
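A program cannot tell from the inside whether its kernel copies pages eagerly or on write, but the behavior that copy-on-write has to preserve is easy to sketch: a write in the child must never show up in the parent. The variable and function names below are hypothetical, chosen only for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;

/* Fork, let the child overwrite its counter, and return the value the
   parent still sees afterwards.  Returns -1 on any failure. */
int parent_value_after_child_write(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* On a copy-on-write system, this store is the point at which
           the kernel would give the child its own copy of the page. */
        counter = 42;
        _exit(counter == 42 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return counter;             /* the parent's copy is untouched */
}
```

A child that goes straight to exec() after a few descriptor changes writes to very few pages, which is why copy-on-write takes most of the cost out of the fork()/exec() pair.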