[comp.unix.aix] vfork

ransom@perelandra.austin.ibm.com (Jeff Peek) (06/28/91)

In article <1991Jun27.221208.14845@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
>In article <141@ssi.UUCP> cjr@ssi.UUCP (Cris J. Rhea) writes:
>>* The rs6k like the u370 (that's the 3090) has no vfork().
>
>vfork() is a hack, although, I will admit, it does have its uses.  The only
>legitimate use of vfork() is that the child will execute before the parent.
>However, POSIX does not (yet?) have a vfork() like function; IBM may be
>waiting for POSIX to decide what to do before they add one.  (POSIX did
>have, last time I checked a draft, a proposal for a 'qfork()' which was
>almost-but-not-quite vfork().  So be it.)
>

There is a vfork() which is just a mapping to fork(), located in libbsd.a. The fork()
on the RISC System/6000 does have the parent do a swtch() allowing the child first 
chance to run.

-- 
Jeff Peek
AIX Operating System Architecture -- IBM Personal Systems Programming        
ransom@perelandra.austin.ibm.com 	VNET PEEK at AUSVMQ T/L 793-3935
Austin, TX

lance@mpd.tandem.com (Lance Hartmann) (06/29/91)

In article <8903@awdprime.UUCP> ransom@perelandra.austin.ibm.com (Jeff Peek) writes:
>In article <1991Jun27.221208.14845@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
>>In article <141@ssi.UUCP> cjr@ssi.UUCP (Cris J. Rhea) writes:
>>>* The rs6k like the u370 (that's the 3090) has no vfork().
>>
>>vfork() is a hack, although, I will admit, it does have its uses.  The only
>>legitimate use of vfork() is that the child will execute before the parent.
>>[REMAINDER DELETED]

"...only legitimate use...child will execute before parent"???  The MAIN reason
to use vfork() is when it is desired to spawn a child that does nothing
but an execve().  With this scenario, it is unnecessary to copy the
address space of the parent, so using vfork() is much more efficient.
Also, don't forget to use _exit() (note the leading underscore '_') INSTEAD
of exit() in the event the execve() fails so that you don't hose the parent's
stdio.
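The pattern described above can be sketched in C as follows.  This is
only an illustration, not code from the thread: the helper name
spawn_and_wait and the exit status 127 for a failed exec are
assumptions (127 follows the shell convention).  The child does nothing
but execve(), and falls back to _exit() so that the parent's stdio
buffers are never touched.

```c
#define _DEFAULT_SOURCE		/* glibc: expose vfork() under strict -std= modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper (name is ours): run `path` with `argv`, wait for
 * it, and return its exit status.  Returns -1 if the child could not be
 * created or did not exit normally.
 */
int spawn_and_wait(const char *path, char *const argv[])
{
	int status;
	pid_t pid = vfork();

	if (pid == -1)
		return -1;		/* vfork() itself failed */
	if (pid == 0) {
		/* Child: may be borrowing the parent's memory, so do
		 * nothing here except exec or _exit. */
		execve(path, argv, environ);
		_exit(127);		/* exec failed; skip stdio cleanup */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with "/bin/sh" and the argument vector { "sh", "-c", "exit 3", 0 },
the helper should return 3 on a system where /bin/sh exists.  On a
system where vfork() is just fork(), the same code still works.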
-- 
Lance G. Hartmann - cs.utexas.edu!devnull!lance (Internet)
-------------------------------------------------------------------------------
DISCLAIMER:  All opinions/actions expressed herein reflect those of my VERY OWN
and shall NOT bear any reflection upon Tandem or anyone else for that matter.

deraadt@cpsc.ucalgary.ca (Theo de Raadt) (06/29/91)

In article <1991Jun29.072930.24674@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
>>Also, don't forget to use _exit() (note the leading underscore '_') INSTEAD
>>of exit() in the event the execve() fails so that you don't hose the parent's
>>stdio.
>Yep.  That's what happens when you use the hack vfork().

Sean, I'm disappointed in you.

% cat > test.c
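#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
	printf("hello ");	/* no newline: stays in the stdio buffer */
	switch(fork()) {
	case 0:
		printf("world\n");	/* child flushes its copy of "hello " too */
		exit(0);
	case -1:
		perror("fork");
		exit(1);
	default:
		exit(0);	/* parent flushes its own "hello " */
	}
}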
% cc test.c -o test
% ./test
hello % hello world
./test
hello world
hello %

That's the way that stdio works.
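The duplicated "hello " comes from the unflushed stdio buffer being
copied by fork().  A minimal sketch of the effect and of the usual cure
(flush before forking) follows.  The helper name bytes_after_fork and
the use of tmpfile() are assumptions made for the illustration, not
part of the test above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: buffer "hello " in a temp file stream,
 * fork, let the child call exit(), and report how many bytes reached
 * the file.  Returns -1 on error.
 */
long bytes_after_fork(int flush_first)
{
	FILE *fp = tmpfile();
	long size;
	pid_t pid;

	if (fp == NULL)
		return -1;
	fprintf(fp, "hello ");		/* sits in the stdio buffer */
	if (flush_first)
		fflush(fp);		/* empty the buffer before fork() */

	pid = fork();
	if (pid == -1) {
		fclose(fp);
		return -1;
	}
	if (pid == 0)
		exit(0);		/* child: exit() flushes its copy */
	waitpid(pid, NULL, 0);

	fflush(fp);			/* parent writes its own copy, if any */
	fseek(fp, 0L, SEEK_END);
	size = ftell(fp);
	fclose(fp);
	return size;
}
```

Without the flush, both processes write their copy of the six buffered
bytes, so the file should end up holding 12; with the flush it should
hold 6.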
 <tdr.
--

SunOS 4.1.1: /usr/include/vm/as.h, Line 49    | Theo de Raadt
typo? Should the '_'  be an 's'??             | deraadt@cpsc.ucalgary.ca

drake@drake.almaden.ibm.com (06/29/91)

In article <351@devnull.mpd.tandem.com> lance@mpd.tandem.com (Lance Hartmann) writes:
>                                                                The MAIN reason
>to use vfork() is when it is desired to spawn a child that does nothing
>but an execve().  With this scenario, it is unnecessary to copy the
>address space of the parent, so using vfork() is much more efficient.

While some systems with inefficient implementations of "fork()" require
a "vfork" mechanism for efficiency's sake, this is not a universal
situation.  Many systems have efficient implementations of "fork()",
making "vfork()" much less necessary, and completely eliminating the
issue mentioned in the referenced article.

The RISC System/6000's fork() implementation is quite efficient, making
"vfork" unnecessary for performance reasons.  Nonetheless, vfork() *is*
provided on the RISC System/6000, in the BSD compatibility library,
for the convenience of those porting BSD sources.

Sam Drake / IBM Almaden Research Center 
Internet:  drake@ibm.com            BITNET:  DRAKE at ALMADEN
Usenet:    ...!uunet!ibmarc!drake   Phone:   (408) 927-1861

sef@kithrup.COM (Sean Eric Fagan) (06/29/91)

In article <351@devnull.mpd.tandem.com> lance@mpd.tandem.com (Lance Hartmann) writes:
>"...only legitimate use...child will execute before parent"???  The MAIN reason
>to use vfork() is when it is desired to spawn a child that does nothing
>but an execve().  With this scenario, it is unnecessary to copy the
>address space of the parent, so using vfork() is much more efficient.

Gee, on my system, which does not have vfork(), a fork() does not cause the
address space of the parent to be copied.  That's what COW is for.

vfork() is a hack.  Period.  The history I have gotten was that it was done
because the VAXen Berkeley had at the time were not completely capable of
doing COW (bug in the microcode, apparently), therefore, to ease fork()'s
cost, they created a new system call.

For extremely large processes, many processors and their implementations
will have some problems with using COW forks that vfork() would handle.
However, this need not be the case (I can think of a few tricks that can be
done, and if you throw in hardware assist there are an amazing number of
things you can do).

>Also, don't forget to use _exit() (note the leading underscore '_') INSTEAD
>of exit() in the event the execve() fails so that you don't hose the parent's
>stdio.

Yep.  That's what happens when you use the hack vfork().

-- 
Sean Eric Fagan  | "What *does* that 33 do?  I have no idea."
sef@kithrup.COM  |           -- Chris Torek
-----------------+              (torek@ee.lbl.gov)
Any opinions expressed are my own, and generally unpopular with others.

shore@theory.TC.Cornell.EDU (Melinda Shore) (06/29/91)

In article <351@devnull.mpd.tandem.com> lance@mpd.tandem.com (Lance Hartmann) writes:
>The MAIN reason
>to use vfork() is when it is desired to spawn a child that does nothing
>but an execve().  With this scenario, it is unnecessary to copy the
>address space of the parent, so using vfork() is much more efficient.

Since most contemporary Unixes support copy on write, there is no
compelling reason to use vfork when writing new code.  vfork only
exists because there was a bug in the Vax mm hardware a long, long time
ago (when BSD was first being developed).  And, as we all know, once
something goes into a system it can never come out.  Vfork lives on,
even though we don't need it.
-- 
                  Software longa, hardware brevis
Melinda Shore - Cornell Information Technologies - shore@tc.cornell.edu

shore@theory.TC.Cornell.EDU (Melinda Shore) (06/30/91)

In article <DERAADT.91Jun29025758@fsa.cpsc.ucalgary.ca> deraadt@cpsc.ucalgary.ca (Theo de Raadt) writes:
>Sean, I'm disappointed in you.

[ code deleted ]

>That's the way that stdio works.

Yes, it is, but that isn't what's being discussed.  After a vfork, both
processes are sharing the same address space and data structures.  One
process can change the contents of a data structure without informing
the other, in the absence of an agreed-upon locking mechanism.  Do you
*really* want one process free()-ing all the FILE * out from under the
other process?

Fork works differently, of course, because the two processes are
(logically) not sharing the same address space.
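The logical separation after fork() can be checked with a short
sketch.  The variable name counter and the helper name
parent_view_after_child_write are ours, chosen for the illustration.
The child changes its own copy of a global and the parent's copy stays
as it was.  Doing the same after a real vfork() would be exactly the
kind of sharing described above, which is why it is not attempted
here.

```c
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;		/* illustrative global */

/*
 * Hypothetical helper: fork, let the child overwrite `counter`, and
 * return the value the parent still sees.  Returns -1 on error.
 */
int parent_view_after_child_write(void)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		counter = 42;	/* changes only the child's copy */
		_exit(counter == 42 ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return counter;		/* parent's copy is untouched */
}
```

The helper should return 1, whether the kernel copies the data segment
eagerly or uses copy on write.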

jfh@rpp386.cactus.org (John F Haugh II) (06/30/91)

In article <889@rufus.UUCP> drake@drake.almaden.ibm.com writes:
>The RISC System/6000's fork() implementation is quite efficient, making
>"vfork" unnecessary for performance reasons.  Nonetheless, vfork() *is*
>provided on the RISC System/6000, in the BSD compatibility library,
>for the convenience of those porting BSD sources.

I don't know what you call "particularly efficient", but I have measured
fork/exit performance and AIX v3 is worse than SVR1 on a 12MHz 68000 that
I tested in 1986.  As I recall a S/6000 Model 530 produced about 8 or 10
fork/exits per second, compared to about 45 per second on an Mpulse/XL
that I was testing for my employer in 1986 (Pinnacle Systems, Inc.,
Garland TX)

The test is real simple -

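#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main (void)
{
	int	i;

	for (i = 0;i < 10000;i++)
		if (fork () == 0)
			exit (0);	/* child: leave at once */
		else
			while (wait ((int *) 0) != -1)
				;	/* parent: reap the child */
	return 0;
}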

I tried it on the 386 here and got 68 per second.  What does your
S/6000 give you?
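For anyone who wants a rate without an external timer, here is a
self-timing sketch of the same loop.  It is a variant under stated
assumptions, not the program posted above: the helper name
forks_per_second is ours, the child uses _exit() rather than exit(),
and the clock is gettimeofday().

```c
#include <stdlib.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run n fork/exit/wait cycles and return the
 * number of cycles per second of wall-clock time, or -1.0 on error.
 */
double forks_per_second(int n)
{
	struct timeval start, end;
	double elapsed;
	int i;

	if (n <= 0 || gettimeofday(&start, NULL) == -1)
		return -1.0;
	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid == -1)
			return -1.0;
		if (pid == 0)
			_exit(0);	/* child: exit immediately */
		if (waitpid(pid, NULL, 0) == -1)
			return -1.0;	/* parent: reap before next cycle */
	}
	if (gettimeofday(&end, NULL) == -1)
		return -1.0;
	elapsed = (end.tv_sec - start.tv_sec)
	    + (end.tv_usec - start.tv_usec) / 1e6;
	return elapsed > 0.0 ? n / elapsed : -1.0;
}
```

Calling forks_per_second(10000) matches the loop count used above.
The figure depends heavily on the size of the calling process and on
the machine, so compare numbers only between runs of the same binary.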
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh@rpp386.cactus.org
"UNIX signals are not interrupts.  Worse, SIGCHLD/SIGCLD is not even a UNIX
 signal, it's an abomination."  -- Doug Gwyn

deraadt@cpsc.ucalgary.ca (Theo de Raadt) (06/30/91)

In article <19439@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
> I don't know what you call "particularly efficient", but I have measured
> fork/exit performance and AIX v3 is worse than SVR1 on a 12MHz 68000 that
> I tested in 1986.  As I recall a S/6000 Model 530 produced about 8 or 10
> fork/exits per second, compared to about 45 per second on an Mpulse/XL
> that I was testing for my employer in 1986 (Pinnacle Systems, Inc.,
> Garland TX)
> The test is real simple -
>  main ()
>  {
>  int i;
>
>      for (i = 0;i < 10000;i++)
>          if (fork () == 0)
>              exit ();
>          else
>              while (wait (0) != -1)
>                  ;
>   }

I'll bite and shove a vfork() in there instead of fork(). Here are
times for a Sun4/490, sunos4.1.1.

 fork()      101.1 real         5.6 user        90.2 sys	  99  fork/sec
vfork()        9.7 real         0.5 user         6.5 sys	1030 vfork/sec

I suggest those who say that vfork() is not needed anymore try this test
on their machine.
 <tdr.

sef@kithrup.COM (Sean Eric Fagan) (06/30/91)

In article <19439@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>As I recall a S/6000 Model 530 produced about 8 or 10
>fork/exits per second, compared to about 45 per second on an Mpulse/XL
>that I was testing for my employer in 1986 (Pinnacle Systems, Inc.,
>Garland TX)
[code deleted]
>I tried it on the 386 here and got 68 per second.  What does your
>S/6000 give you?

Kithrup, a 25MHz '386 running SCO UNIX 3.2v2, got 82.981 fork/exit's per
second.  The RS/6000 at work (I don't know what model, sorry, nor even how
to find that out 8-)), got 153.02 fork/exit's per second.

That's a bit better than 8 or 10 per second, I'd say.

sef@kithrup.COM (Sean Eric Fagan) (07/01/91)

In article <DERAADT.91Jun30030305@fsa.cpsc.ucalgary.ca> deraadt@cpsc.ucalgary.ca (Theo de Raadt) writes:
>I suggest those who say that vfork() is not needed anymore try this test
>on their machine.

vfork()       1m8.09s real       0m30.38s user   0m35.55s sys    147 vfork/sec
 fork()       1m7.51s real       0m29.15s user   0m35.57s sys    148  fork/sec

Yep, I can see what a win vfork() is.

dennis@gpu.utcs.utoronto.ca (Dennis Ferguson) (07/01/91)

In article <19439@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>In article <889@rufus.UUCP> drake@drake.almaden.ibm.com writes:
>>The RISC System/6000's fork() implementation is quite efficient, making
>>"vfork" unnecessary for performance reasons.  Nonetheless, vfork() *is*
>
>I don't know what you call "particularly efficient", but I have measured
>fork/exit performance and AIX v3 is worse than SVR1 on a 12MHz 68000 that
>I tested in 1986.  As I recall a S/6000 Model 530 produced about 8 or 10
>fork/exits per second, compared to about 45 per second on an Mpulse/XL
>that I was testing for my employer in 1986 (Pinnacle Systems, Inc.,
>Garland TX)

I'm not sure about "8 or 10 fork/exits per second", but I do know that
inverted page tables and segments make a copy-on-write fork() quite
difficult and expensive to implement.  For the RT it was easy to write
a program which ran under Mach (full copy-on-write fork()) at about 5%
of the speed that it would under AOS (traditional copy-all-data fork()).
Page tables are global, rather than per-process, on the RT and the
RS/6000, and memory sharing is done by segments rather than pages.  This
makes it nearly impossible to avoid rewriting the (global) page tables
during process switches if you are doing copy-on-write forks, and this
can be very expensive particularly if both the parent and child continue
running after the fork().  This, by the way, is a reason why one might
wish to let the child run first after a fork().  You can save yourself
a lot of grief if the child has the decency to exit() or execve() quickly
after the fork(), thus avoiding the expensive process switches between
child and parent altogether.

I have no idea what AIX V3 actually does about this.  There are
alternatives to pure copy-on-write (e.g. copy-on-access) which can save
some of the cost during process switches at the expense of other
tradeoffs.  I do note, however, that while I agree that vfork() is
a hack, the traditional fork()/vfork() combination was indeed a good
match for what you can do easily on a machine with inverted page tables
and segments: copy all of the data segment or copy none of it.  Whatever
IBM has done to make their fork() "efficient", I doubt that it is all
that "efficient" compared to machines which do copy-on-write with more
conventional memory management.  Certainly the tradeoffs would be
different.

Dennis Ferguson
University of Toronto

jfh@rpp386.cactus.org (John F Haugh II) (07/01/91)

In article <DERAADT.91Jun30030305@fsa.cpsc.ucalgary.ca> deraadt@cpsc.ucalgary.ca (Theo de Raadt) writes:
>I'll bite and shove a vfork() in there instead of fork(). Here are
>times for a Sun4/490, sunos4.1.1.
>
> fork()      101.1 real         5.6 user        90.2 sys	  99  fork/sec
>vfork()        9.7 real         0.5 user         6.5 sys	1030 vfork/sec
>
>I suggest those who say that vfork() is not needed anymore try this test
>on their machine.

No, all you've managed to prove is that you have a future as a marketing
strategist ;-)

This test is pretty meaningless when vfork() is used since vfork() does
nothing.  For a really good time, try this one -

-- exec.c --
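#include <unistd.h>
#include <sys/wait.h>

/* FORK is chosen at compile time with -DFORK=fork or -DFORK=vfork */
int main (void)
{
	int	i;

	for (i = 0;i < 1000;i++)
		if (FORK () == 0) {
			execl ("./exit", "exit", (char *) 0);
			_exit (127);	/* exec failed: do not fall back into the loop */
		} else
			while (wait ((int *) 0) != -1)
				;
}
-- exit.c --
#include <unistd.h>

int main (void)
{
	_exit (0);
}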
--

What we are trying to measure is the amount of time that it takes to
execute a =different= new process.  The "forks per second" rate gives
the maximum number of new processes per second.  Since vfork does
little or nothing, you need to measure "vfork/execs per second".  My
prediction is that you will find the difference isn't that impressive.
The exec() will cause a new address space to be cobbled up, while a
COW fork() will just dump the unscribbled on pages and not waste any
time.  The blindingly quick vfork() should lose its advantage to the
work that exec() must do anyway.

To run your little test, compile "exec.c" with

% cc -o vfork -DFORK=vfork exec.c
% cc -o fork -DFORK=fork exec.c
% cc -o exit exit.c

% timex ./fork
execution complete, exit code = 1

real	      1:11.24
user	         0.11
sys	      1:07.65
% calc 1000 / 71.24
	 14.03705783267827

and time both ./fork and ./vfork with timex as before.  I ran the
test here and got about 14 per second, which isn't all that awful.

And since a retraction is probably in order regarding the v3
fork/exec speed, I'll have to admit that the numbers I saw were
from before GA.  I was writing a stress test for a collection of
system calls I had written and couldn't figure out what the bottleneck
was.  It turned out to be slow fork/exec performance.  My
guess is that someone else noticed this and fixed it, or else
that debugging (or tracing, or ...) was turned off prior to GA
and the system picked up.

shore@theory.TC.Cornell.EDU (Melinda Shore) (07/01/91)

In article <DERAADT.91Jun30030305@fsa.cpsc.ucalgary.ca> deraadt@cpsc.ucalgary.ca (Theo de Raadt) writes:
>I suggest those who say that vfork() is not needed anymore try this test
>on their machine.

Well, since vfork on our AIX system (AIX/370) looks something like this:

	#define vfork fork

vfork doesn't buy us much :-).  vfork is a BSD thing, and you can't
count on finding it in every Unix.  As it is said, "the only well-
behaved program is one which doesn't do any system calls."

It's not surprising that there's a performance penalty even if the cost
is just that of copying the page tables.  Note that this is an
implementation deficiency, however, not something intrinsic.  Also,
this seems to me to be micro-optimization.  It's very, very rare to see
more than 3 forks/second on our machine, even when we have a large
number of users on doing interactive work.

deraadt@cpsc.ucalgary.ca (Theo de Raadt) (07/01/91)

In article <1991Jun30.175334.12063@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
>In article <DERAADT.91Jun30030305@fsa.cpsc.ucalgary.ca> I writes:
>>I suggest those who say that vfork() is not needed anymore try this test
>>on their machine.
> vfork()     1m8.09s real       0m30.38s user   0m35.55s sys    147 vfork/sec
>  fork()     1m7.51s real       0m29.15s user   0m35.57s sys    148  fork/sec
>
>  Yep, I can see what a win vfork() is.

No, if you'll look at the original figures that I posted, you'll see that
they say exactly WHY the machine you are on should have vfork()!

The data above says flatly that your machine does not have vfork(). If
it did have vfork(), I suggest that based on my figures you could probably
do about 1400 vfork()'s per second.

Geez, a sun3/50 can beat your vfork() behaviour! HAHAHAHAHA! I should
compare it to an 11/750, but I don't wish to brave that much slowness right
now -- it just might beat you though.
 <tdr.

jackv@turnkey.tcc.com (Jack F. Vogel) (07/01/91)

In article <1991Jun29.133535.8354@batcomputer.tn.cornell.edu> shore@theory.TC.Cornell.EDU (Melinda Shore) writes:
 
|Since most contemporary Unixes support copy on write, there is no
|compelling reason to use vfork when writing new code.  vfork only
|exists because there was a bug in the Vax mm hardware a long, long time
|ago (when BSD was first being developed).  And, as we all know, once
|something goes into a system it can never come out.  Vfork lives on,
|even though we don't need it.

Ah, but did you know that AIX on the 370 does not have copy on write??
I was told at some point that there was a hardware reason, and looking
around in the code I've seen comments claiming that there is inadequate
information at trap time to support it. However, I am dubious about this
claim, true the 370 is different in that the reference and modify bits
are not in the pte, they are in the storage array and the key protection
bits are in that array also, nevertheless there is a page protection bit
in the pte. So, unless I'm overlooking something, it seems like everything
you need is there. But, enough of my rambling, the fact of the matter is
that it doesn't do copy on write at present, so vfork() may not be such
a bad idea for code on the 370. The PS/2, on the other hand, does support
copy on write.

Disclaimer: I don't speak for my employer.


-- 
Jack F. Vogel			jackv@locus.com
AIX370 Technical Support	       - or -
Locus Computing Corp.		jackv@turnkey.TCC.COM

deraadt@cpsc.ucalgary.ca (Theo de Raadt) (07/01/91)

In article <19445@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
> In article <DERAADT.91Jun30030305@fsa.cpsc.ucalgary.ca> I write:
>>I'll bite and shove a vfork() in there instead of fork(). Here are
>>times for a Sun4/490, sunos4.1.1.
>>
>> fork()      101.1 real         5.6 user        90.2 sys	  99  fork/sec
>>vfork()        9.7 real         0.5 user         6.5 sys	1030 vfork/sec
>>
>>I suggest those who say that vfork() is not needed anymore try this test
>>on their machine.
>
>No, all you've managed to prove is that you have a future as a marketing
>strategist ;-)
>
>This test is pretty meaningless when vfork() is used since vfork() does
>nothing.  For a really good time, try this one -

That's exactly the point! vfork() does nothing, while fork() does lots of
unnecessary work!

vfork() does everything that a programmer would need a fork() to do in the
case of exec() or exit() behaviour.

Sticking an exec() in there makes your benchmark irrelevant; what you
are doing is hiding the tiny little amount of time that it takes to
vfork() or fork() with a large exec time.  That is statistically
misleading. Like the name of the book "Lying with statistics" :-). To
make it accurate again, you would need to remove the noise in the
benchmark by subtracting a loop of 10000 exec()'s; only then could you
determine how much work the fork() and vfork() has done, independently
of the enormous exec time.  That's not possible though.

vfork() doesn't do one thing that fork() does which is needed by the exec() --
fork() allocates a new upage (or equivalent) immediately, vfork() leaves this
till exec time. In the case of exit(), it never gets allocated.

Much as it's a crock, vfork() does indeed have advantages. All this talk about
how COW and inverted page tables are so difficult suggests to me that the
difference between fork() and vfork() on an AIX platform HAD THEY ADDED
VFORK(), would have been much greater even than what I see on a Sun4/470.

Probably more like 2500 vfork()'s per second.
 <tdr.

jet@karazm.math.uh.edu (J Eric Townsend) (07/01/91)

In article <19439@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>I tried it on the 386 here and got 68 per second.  What does your
>S/6000 give you?

for 1000 fork()s:

Sun SparcStation-2, 16Mb RAM, SunOS 4.1.1
0.7u 13.0s 0:15 88% o+56K 1+2io 0pf+ow

RS/6000 320, 32Mb RAM, AIX 3.1 rel 3005, 
2.87u 3.28s 0:06 %96 1+2k 0+0io 1pf+0w
--
J. Eric Townsend - jet@uh.edu - bitnet: jet@UHOU - vox: (713) 749-2126
Systems Wrangler, University of Houston Department of Mathematics
Skate UNIX! (curb fault: skater dumped)
PowerGlove mailing list: glove-list-request@karazm.math.uh.edu