[comp.unix.wizards] Is System V.4 fork reliable?

jr@oglvee.UUCP (Jim Rosenberg) (07/07/90)

Somewhere along in the development of System V, fork became an unreliable
system call.  At least it is on my (V.3) system.  I asked the net about
this, and after some completely wrong answers claiming we were out of swap
space, the story that emerged was (to cite an old posting, Jerry Gardner in
<3696@altos86.Altos.COM>):

> The fork() failures you are seeing are occurring when procdup() calls
> dupreg().  Dupreg() calls ptsalloc() which eventually calls getcpages() to
> allocate memory for page tables to map the new child process' u-area.  
> Apparently, the kernel is paranoid in one place here and it calls ptsalloc
> with a flag that doesn't allow it to sleep.

Apparently if sleep were allowed a deadlock could occur.  The result is that
an intensive burst of activity can cause fork to fail, even though really
the system is *not* out of resources and ought to be able to handle it.
(Since the page-stealing daemon is asynchronous, you can never guarantee
*exactly* when it will run.)

Numerous people suggested more RAM as the cure.  Right.  What that amounts
to saying is, "Get enough RAM so that you *NEVER* page."  I.e. V.3 has
virtual memory, but don't assume you can really use it.

The number of utilities that both use fork and also understand that under
some circumstances it ought to be *retried* if it fails is pitifully small.
(The shell simply reports the bogus message "No more processes".  On my
system when cron incurs a fork failure it logs that it is "rescheduling" the
job. Right.  cron "reschedules" into oblivion, ceasing to run *any* jobs.)

My question is:  Is this *FIXED* in V.4?  I went to the V.4 internals
tutorial at Usenix in D.C.  V.4 does have an asynchronous page-stealing
daemon and does have a kernel memory allocation call with a flag to either
sleep or not sleep.  Do any of the kmem_alloc() calls (if I remember the
name right, I don't have my notes handy) resulting from fork *not* allow
sleep?  If so I believe that would also give V.4 the lovely V.3 feature of
unreliable fork.  And in this I-hope-not case, the man page for fork(2)
should at least tell the truth and make clear the circumstances under which
fork should be retried.  And all the utilities which fork should be hacked
to actually do those retries.
---
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr@dsi.com                      /      /

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/08/90)

In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>The number of utilities that both use fork and also understand that under
>some circumstances it ought to be *retried* if it fails is pitifully small.

This is unfortunately true, but then the number of utilities that
understand that write() might successfully write fewer than the
requested number of bytes is also pitifully small.
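
The usual defensive loop looks something like this (a minimal sketch,
with error handling abbreviated):

	#include <unistd.h>

	/* Write "len" bytes from "buf" to "fd", coping with partial
	 * writes.  Returns 0 on success, -1 on a real error.
	 */
	int
	writeall(int fd, const char *buf, unsigned len)
	{
		int n;

		while (len > 0) {
			n = write(fd, buf, len);
			if (n < 0)
				return -1;	/* write() failed */
			buf += n;		/* skip what was written */
			len -= n;
		}
		return 0;
	}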

>(The shell simply reports the bogus message "No more processes".

Actually, the UNIX System V shells that I've encountered do keep retrying
the fork() operation for a while.  In fact, somewhere around SVR2.0 the
shell was changed to use an "exponential backoff" algorithm, i.e. the
delay between successive retries was doubled each time until some limit
was hit, at which time the shell would give up with "No more processes".
This algorithm, combined with a too-small per-user process limit, was
directly responsible for some UNIX System V vendors failing to pass the
operational demonstration in the large DA MINIS acquisition: because all
the "batch" scripts were executed under the same UID, several of them
quickly backed off to the point that they did not resume running until
well after the heavy load had subsided; basically the machine was idle
and the shells were sleeping like Rip van Winkle.
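
In outline the shell's loop was something like the sketch below (my
reconstruction of the idea, not the actual shell source; the initial
delay and the cutoff are invented numbers):

	#include <unistd.h>
	#include <errno.h>

	/* Retry fork() with exponentially increasing delays, giving
	 * up eventually -- at which point the shell printed
	 * "No more processes".
	 */
	int
	fork_backoff(void)
	{
		int pid;
		unsigned delay;

		for (delay = 1; ; delay *= 2) {
			if ((pid = fork()) >= 0)
				return pid;	/* 0 in child, child's pid in parent */
			if (errno != EAGAIN || delay > 64)
				return -1;	/* hard failure, or we give up */
			sleep(delay);	/* back off, then try again */
		}
	}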

jr@oglvee.UUCP (Jim Rosenberg) (07/09/90)

gwyn@smoke.BRL.MIL (Doug Gwyn) writes:

>In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>>(The shell simply reports the bogus message "No more processes".

>Actually, the UNIX System V shells that I've encountered do keep retrying
>the fork() operation for a while.  In fact, somewhere around SVR2.0 the
>shell was changed to use an "exponential backoff" algorithm, i.e. the
>delay between successive retries was doubled each time until some limit
>was hit, at which time the shell would give up with "No more processes".

Fascinating.  I don't think we're seeing this behavior, but all of our
users are either taken straight to our database manager sans shell by
.profile or use csh.  I bet the exponential backoff was *not* put into csh.
This is V.3.1.  I don't know when AT&T officially held its nose and blessed
csh as a "real" shell, but my impression is that it wasn't until V.3.2.

Does csh under V.4 have the exponential backoff?  I presume under BSD no
such thing is needed.
-- 
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr@dsi.com                      /      /

mab@druwy.ATT.COM (Alan Bland) (07/10/90)

In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>The number of utilities that both use fork and also understand that under
>some circumstances it ought to be *retried* if it fails is pitifully small.

fork(2) isn't the only System V call that needs to be retried under some
circumstances.  poll(2) is another one that can fail with EAGAIN, which means
it may work if you try again.  There are also others -- pay attention to
the errno descriptions in the programmer's reference guide.
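
For example (a sketch only; whether to bound the retries, and exactly
where <poll.h> lives and what type nfds is, varies between releases):

	#include <poll.h>
	#include <errno.h>

	/* Issue poll(), retrying as long as it fails with EAGAIN. */
	int
	poll_retry(struct pollfd *fds, unsigned long nfds, int timeout)
	{
		int n;

		do {
			n = poll(fds, nfds, timeout);
		} while (n < 0 && errno == EAGAIN);

		return n;
	}
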
--
-- Alan Bland
-- att!druwy!mab == mab@druwy.ATT.COM
-- AT&T Bell Laboratories, Denver CO
-- (303)538-3510

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/12/90)

In article <563@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>Fascinating.  I don't think we're seeing this behavior, ...

I'm not sure you understood what I was describing.  You would only see
that when fork()s started to fail, and that normally occurs only when
the user reaches his process-count limit, which is a system
configuration parameter.  The reason this occurred in the operational
demo that I described is that the vendor chose to set up all accounts
being used for the test with the same UID, thereby virtually assuring
that the process limit would be encountered.  Normal use of UNIX does
not have everybody operating under the same UID.

>Does csh under V.4 have the exponential backoff?  I presume under BSD no
>such thing is needed.

It has nothing to do with "BSD vs. SysV".  Any system with a bound on
the number of processes permitted for a given user would encounter
fork() failures (which could of course also occur if the system ran
out of space resources).  I hope my example made it clear why
exponential backoff was a poor strategy.  However, most shells should
attempt to recover from fork() failures to a reasonable degree.

lls@kings.co.uk (Lady Lodge Systems) (07/13/90)

In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>Somewhere along in the development of System V, fork became an unreliable
>system call....
>...
>The number of utilities that both use fork and also understand that under
>some circumstances it ought to be *retried* if it fails is pitifully small.
>(The shell simply reports the bogus message "No more processes"...

Or if you are using the command substitution construct (...`run cmd`...)
and the Bourne shell cannot fork, it simply hangs.  No retry, no error,
just a full stop!  Sheesh!  If the shell is broken about retrying forks,
is it any wonder that most other programs are as well?

>My question is:  Is this *FIXED* in V.4?

Damned Good Question!
-- 
 -------------------------------------------------------------------------
| Peter King.                          | Voice: +44 733 239445            |
| King Bros (Lady Lodge) Ltd           | EMAIL: root@kings.co.uk          |
| Lady Lodge House, Orton Longueville, | UUCP: ...mcvax!ukc!kings!root    |
| Peterborough, PE2 0HZ, England.      |                                  |
 -------------------------------------------------------------------------

jr@amanue.UUCP (Jim Rosenberg) (07/25/90)

In <32@kings.co.uk> lls@kings.co.uk (Lady Lodge Systems) writes:

>In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>>Somewhere along in the development of System V, fork became an unreliable
>>system call....

>>My question is:  Is this *FIXED* in V.4?

>Damned Good Question!

Which, alas, no one has answered.  Surely *somebody* reading this group can
look at the V.4 source and let us know.  To repeat:  do any kernel routines
called as a result of a fork issue kmem_alloc() with the flag KM_NOSLEEP?  If the
answer is yes then it would seem to me V.4 will have the same problem as V.3.
-- 
 Jim Rosenberg                                               -- cgh!amanue!jr
     CIS: 71515,124                              UUCP:         /    /    |
     WELL: jer                                          dsi.com  pitt!  ditka!
     BIX: jrosenberg    Internet: cgh!amanue!jr@dsi.com

ag@cbmvax.commodore.com (Keith Gabryelski) (07/25/90)

In article <480@amanue.UUCP> jr@amanue.UUCP (Jim Rosenberg) writes:
>In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>>Somewhere along in the development of System V, fork became an
>>unreliable system call....

Unreliable in what way?  fork() has always been documented to fail
if there aren't enough system resources.

Forgive me, I haven't been keeping up with c.u.w.

>>My question is:  Is this *FIXED* in V.4?
>
>Which, alas, no one has answered.  Surely *somebody* reading this
>group can look at the V.4 source and let us know.  To repeat: do any
>kernel routines called as a result of a fork issue kmem_alloc() with
>the flag KM_NOSLEEP?  If the answer is yes then it would seem to me
>V.4 will have the same problem as V.3.

When assigning a process id the kernel will try to allocate a proc_t
without sleeping [i.e., pass KM_NOSLEEP to *alloc()].  If this fails,
fork() will return EAGAIN.
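
Schematically, the pattern in question is this (not the actual SVR4
source; names and context approximate, not compilable as-is):

	/* in the fork path, while building the child's proc entry: */
	newp = (proc_t *)kmem_zalloc(sizeof (proc_t), KM_NOSLEEP);
	if (newp == NULL)
		return EAGAIN;	/* fail the fork() rather than sleep */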

What isn't reasonable about this?  fork() is documented (under SVR4)
to fail returning EAGAIN if:

	The system-imposed limit on the total number of processes
	under execution by a single user would be exceeded.

or

	Total amount of system memory available when reading via raw
	I/O is temporarily insufficient.

Although I don't know of any other UNIX variant that documents the
latter case (save SVR3), EAGAIN is a totally reasonable thing to check
for as a return error code from fork().

Pax, Keith

jr@oglvee.UUCP (Jim Rosenberg) (07/27/90)

In <13426@cbmvax.commodore.com> ag@cbmvax.commodore.com (Keith Gabryelski) writes:
>>In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>>>Somewhere along in the development of System V, fork became an
>>>unreliable system call....
>Unreliable in what way?  fork() has always been documented to fail
>if there aren't enough system resources.

*** OPEN DRAGON MOUTH ***

What do you count as a resource?  Disk space is a resource.  If I run out of
swap space, I'm the system administrator, this is *my* problem, I don't go
complaining to Dennis Ritchie because I don't have a rubber disk.  I also *can
fix it*.  It isn't fun, but I can reconfigure my disk for more swap space.

Is physical memory a resource?  Now it gets sticky.  The whole concept of
virtual memory is exactly supposed to be that I SHOULDN'T HAVE TO think of
physical memory as a resource; the operating system should handle that, as
long as what I'm asking for is reasonable.  If my system is so loaded that
there are no more process table slots, again, as system administrator that's
*my* problem.  And I can fix it.

But if system calls fail simply because of a very temporary bout of activity,
that is *not my problem*!  It's the kernel's problem.  At least it should be:
that's what "I'm paying the kernel for" so to speak.  And *I CAN'T FIX IT*
myself.  If utilities haven't been rewritten to do the right thing with EAGAIN
after fork(), and I'm only a binary licensee, what can I do about it?  (Except
climb the soap box in this newsgroup, of course!  :-))

>When assigning a process id the kernel will try to allocate a proc_t
>without sleeping [i.e., pass KM_NOSLEEP to *alloc()].  If this fails,
>fork() will return EAGAIN.

OK, so we have the word, V.4 will have some of the same problems as V.3.

>What isn't reasonable about this?  fork() is documented (under SVR4)
>to fail returning EAGAIN if:

	[citation from the man page for fork(2)]

Come on now.  How many times have you seen a writeup on how to do things with
fork() in a UNIX magazine or book -- and how many of those times has the
author done *ANYTHING* at all with EAGAIN??  It is simply a fact that the need
to retry forks that have failed with EAGAIN is not widely embedded out there
in UNIX culture.  I have not seen one single writeup in print that really
discusses this.  My favorite text on programming UNIX system calls is Marc
Rochkind's book, Advanced UNIX Programming.  Maybe I have an old edition, but
here's what he says about it:  "The only cause of an error is resource
exhaustion, such as insufficient swap space or too many processes already in
execution."  No mention of the fact that if you just wait for the page-
stealing daemon to go to work everything will be fine.  Is it "reasonable" for
my application to have to sleep when in fact it should be the kernel that does
the sleep for me?  Why should I have to guess how long to sleep?  My
application doesn't have to sleep to wait for a disk block to become available
-- the kernel does this for me.  Why shouldn't it do the same thing for memory
pages?

What retry policy should one use?  How long should one sleep?  How many
retries?  Or even a larger question:  How come there's no consensus on this?
How come this isn't one of those FrequentlyAskedQuestions?

Does every V.4 utility that forks in fact *do something sensible* with EAGAIN?
(Since having the application deal with this is so "reasonable" ...)  Once
again, those of you out there with source can answer this one easily enough.

The larger question is *why can't the kernel sleep* when it needs more memory
for a fork???  It appears there is a risk of some kind of deadlock.  Whose
problem is *this*?  Mine as system administrator?  When I brought this subject
up the first time, someone posted that they had, in fact, hacked their source
code to allow sleep under V.3.  This person said he got away with it.  That
leads me to believe that in fact he was simply lucky, but that the race
condition or deadlock, or whatever the problem is, is mighty obscure.  Perhaps
too obscure for anyone to *FIX*.

My personal view is that a kernel whose only mutual exclusion mechanisms are
sleep-wakeup and spl() just makes it too complicated to really fix this and
allow sleep.  Once upon a time there was a consensus that the kernel needed to
be rewritten from scratch.  Once upon a time there was the famous V.5 skunk
works doing exactly this, and based on what Bill Joy told me once it sounds
like it would have dealt with this kind of problem.  But then the skunk works
became politically untenable after the OSF rebellion, so now we seem not to
hear much talk about how important it is to rewrite the kernel.  (Except from
OSF, CMU, who say they've already done it ... :-))

We only hear talk that it is "reasonable" to bother the application with the
fact that the system just happens to be kind of busy at the moment, but busy
with a problem on which it *should* be able to sleep ...


And for those of you wizards out there who write articles on C for such mags
as UNIX Review & UNIX World, please **TELL PEOPLE** about this issue!
-- 
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr@dsi.com                      /      /

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/28/90)

In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
-But if system calls fail simply because of a very temporary bout of activity,
-that is *not my problem*!  It's the kernel's problem.  At least it should be:
-that's what "I'm paying the kernel for" so to speak.  And *I CAN'T FIX IT*
-myself.  If utilities haven't been rewritten to do the right thing with EAGAIN
-after fork(), and I'm only a binary licensee, what can I do about it?

Oh, good grief.  It is SILLY to say that the kernel should be redesigned
to compensate for bugs in application programs.

terryl@sail.LABS.TEK.COM (07/28/90)

In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>In <13426@cbmvax.commodore.com> ag@cbmvax.commodore.com (Keith Gabryelski) writes:
>>>In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>>>>Somewhere along in the development of System V, fork became an
>>>>unreliable system call....
>>Unreliable in what way?  fork() has always been documented to fail
>>if there aren't enough system resources.
>
>*** OPEN DRAGON MOUTH ***
>

     Delete MANY, MANY lines of bitching and moaning about why the "kernel"
should sleep if system resources can't be found for a fork, instead of just
returning an error condition....

>My personal view is that a kernel whose only mutual exclusion mechanisms are
>sleep-wakeup and spl() just makes it too complicated to really fix this and
>allow sleep.  Once upon a time there was a consensus that the kernel needed to
>be rewritten from scratch.  Once upon a time there was the famous V.5 skunk
>works doing exactly this, and based on what Bill Joy told me once it sounds
>like it would have dealt with this kind of problem.  But then the skunk works
>became politically untenable after the OSF rebellion, so now we seem not to
>hear much talk about how important it is to rewrite the kernel.  (Except from
>OSF, CMU, who say they've already done it ... :-))
>
>We only hear talk that it is "reasonable" to bother the application with the
>fact that the system just happens to be kind of busy at the moment, but busy
>with a problem on which it *should* be able to sleep ...


     Well, there are a couple of reasons why "it is reasonable to bother the
application" with the fact that system resources are a little scarce, and could
you try your request again later???? These reasons go WAY BACK (we're talking
ancient history here, boys and girls!!! (-:), and are some of the philosophies
espoused by Ritchie, Thompson, Kernighan, et al. (hope I spelled their names
right!!! (-:).

     The first is "KISS" (Keep It Simple, Stupid). Don't clutter up the code
with obscure and esoteric algorithms when a simpler one will do; makes the
code easier to maintain, and mucho easier to understand. Unfortunately, with
the sophistication of today's hardware, it is becoming increasingly difficult
to "KISS" (pun intended!!! (-:).

     A direct corollary of "KISS" is this: It is better to make 20 programs
each do one thing well than it is to make one program do 20 things. Not only
is it better, it is also easier, and promotes modularity and re-use of
components. Unfortunately, this concept was abandoned MANY, MANY moons ago,
and I for one am sad to see it go (I submit the options of `cat' and `ls' as
prime examples; I'm sure astute readers can think of their favorite examples
of this...).

     The second reason is this: for the most part, the "kernel" should NOT
impose policy; it should be up to the individual application to impose the
policy that it wants. If the "kernel" were to impose policy, it would limit
the application's choices, which Ritchie, et al. decided "is a bad thing to
do". The "kernel" should be thought of mainly as a "system resource manager",
with just enough policy to make things work. The file system is an excellent
example of this. As far as the "kernel" is concerned, files are just sequences
of bytes; the "kernel" makes no assumptions about what those sequences of
bytes really mean. It is up to the application to determine what they mean
(OK, yes, there are a COUPLE of places where the "kernel" knows a priori what
the bytes mean, e.g. an executable header...).

     So, as for Mr. Rosenberg's assertion that "the kernel SHOULD sleep on a
resource shortage": this imposes a policy upon the application that is
better left to the application.

     BTW, this is all IMHO, etc., and if I've misrepresented Mr. Ritchie's
position, apologies in advance, etc....


			Terry Laskodi
			     of
			Tektronix

sar0@cbnewsl.att.com (stephen.a.rago) (07/28/90)

In article <480@amanue.UUCP>, jr@amanue.UUCP (Jim Rosenberg) writes:
> In <32@kings.co.uk> lls@kings.co.uk (Lady Lodge Systems) writes:
> 
> >In article <561@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
> >>Somewhere along in the development of System V, fork became an unreliable
> >>system call....
> 
> >>My question is:  Is this *FIXED* in V.4?
> 
> Which, alas, no one has answered.  Surely *somebody* reading this group can
> look at the V.4 source and let us know.  To repeat:  do any kernel routines
> called as a result of a fork issue kmem_alloc() with the flag KM_NOSLEEP?

Yes.

peter@ficc.ferranti.com (Peter da Silva) (07/28/90)

In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
> The larger question is *why can't the kernel sleep* when it needs more memory
> for a fork???

Even if it can't do it right then, it could check right before leaving
kernel mode. Or if that's politically incorrect, change fork.o in libc.a
to implement the check for EAGAIN and do an exponential backoff (or
whatever).
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
<peter@ficc.ferranti.com>

martin@mwtech.UUCP (Martin Weitzel) (07/29/90)

In article <13435@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>-But if system calls fail simply because of a very temporary bout of activity,
>-that is *not my problem*!  It's the kernel's problem.  At least it should be:
>-that's what "I'm paying the kernel for" so to speak.  And *I CAN'T FIX IT*
>-myself.  If utilities haven't been rewritten to do the right thing with EAGAIN
>-after fork(), and I'm only a binary licensee, what can I do about it?
>
>Oh, good grief.  It is SILLY to say that the kernel should be redesigned
>to compensate for bugs in application programs.

Would you write the same if someone sold a disk controller with a
firmware problem that causes it sometimes to give erratic answers
until, say, the head happens to cross track 512, after which everything
runs normally again? Suppose further that this erratic behaviour is even
documented somewhere in the controller manual. Would you support the
view of the controller manufacturer who might say:

	"In fact, my controller is working PERFECTLY WELL; it's all
	the problem of those who haven't read the docs, which tell
	about this. It's SILLY to say I should redesign the firmware
	to compensate for bugs in the device drivers of UNIX, which
	should just make more retries if this particular problem shows
	up.  As the DOS driver I supply with my controller shows,
	there's no problem with my controller if you use it in the
	right way."

Well, if I put myself in the shoes of that particular manufacturer, I
could even take this view. If I put myself into the shoes of a kernel
code developer, I can well understand Doug Gwyn's view (and from the
many knowledgeable answers and valuable contributions Doug has posted
to this group, I think his view *is* the one of a person who knows kernel
stuff more than well).

But I'm in the position of an application developer, and as such I
take the view of the original poster, whose opinion is that a solution
for this problem belongs in the kernel.

Oh ... wait a moment, my telephone is ringing ...

"Who's there, Joe Customer? What, the program I sold you behaves
erratically? Ehmm, just retry a few times ... and btw. read the docs.
On page 277 you'll find a footnote that tells you that you should
just retry ..."

OK, here I'm back again. Oh folks, these SILLY customers! They always
want me to redesign this complex program, just because they can't use
it in the right way ... :-)
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83

jfh@rpp386.cactus.org (John F. Haugh II) (07/30/90)

In article <ZOY4ZO4@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>> The larger question is *why can't the kernel sleep* when it needs more memory
>> for a fork???
>Even if it can't do it right then, it could check right before leaving
>kernel mode. Or if that's politically incorrect, change fork.o in libc.a
>to implement the check for EAGAIN and do an exponential backoff (or
>whatever).

It isn't that the kernel =can't= sleep, but rather that someone decided
[ for some totally random reason, I suspect ... ] that the kernel
=shouldn't= sleep.  The solution isn't to add some kludge on top of the
system, but rather to put back the behavior that was always there -
the kernel sleeps in fork if it requires additional memory.

In the past, the kernel swapped the process if malloc() didn't return
with space.  So, this is a change in function, not some defect in
application code.
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org

friedl@mtndew.UUCP (Stephen J. Friedl) (07/30/90)

In article <18478@rpp386.cactus.org>, jfh@rpp386.cactus.org (John F. Haugh II) writes:
> 
> It isn't that the kernel =can't= sleep, but rather that someone decided
> [ for some totally random reason, I suspect ... ] that the kernel
> =shouldn't= sleep.  The solution isn't to add some kludge on top of the
> system, but rather to put back the behavior that was always there -
> the kernel sleeps in fork if it requires additional memory.

I've been following this and am not so sure I agree with this line
of thinking.  What if I >want< fork to fail if it can't do it (because
I want to reschedule the process that wants to fork for a later time)?
By allowing the calling program control right away, it can make its
own choice.  If you put this in the kernel then everybody does it the
same way: how long should it wait before giving up?  What kind of backoff
should it use?

     Steve

-- 
Stephen J. Friedl, KA8CMY / Software Consultant / Tustin, CA / 3B2-kind-of-guy
+1 714 544 6561  / friedl@mtndew.Tustin.CA.US  / {uunet,attmail}!mtndew!friedl

"I'm a simple girl; I wear a cat on my head." - Laura Dykstra @ NCR Orlando

terryl@sail.LABS.TEK.COM (08/01/90)

In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
-But if system calls fail simply because of a very temporary bout of activity,
-that is *not my problem*!  It's the kernel's problem.  At least it should be:
-that's what "I'm paying the kernel for" so to speak.  And *I CAN'T FIX IT*
-myself.  If utilities haven't been rewritten to do the right thing with EAGAIN
-after fork(), and I'm only a binary licensee, what can I do about it?


     Bitch and moan to your OS vendor; after all, to use some of your own
logic, "That's what I'm paying the kernel for", so to speak....

ag@cbmvax.commodore.com (Keith Gabryelski) (08/01/90)

In article <ZOY4ZO4@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>>The larger question is *why can't the kernel sleep* when it needs
>>more memory for a fork???
>
>Even if it can't do it right then, it could check right before leaving
>kernel mode. Or if that's politically incorrect, change fork.o in libc.a
>to implement the check for EAGAIN and do an exponential backoff (or
>whatever).

Or fix the application program.

Pax, Keith

peter@ficc.ferranti.com (Peter da Silva) (08/02/90)

Isn't the problem that EAGAIN is being overloaded for two purposes: out
of process table space (likely long-term on the time-scale programs
operate on, and so should be an error *for most programs*) and out of memory
(likely short-term, so you should back off and try again)?  Since only the
kernel can distinguish these two cases, it should be the one to implement
the retry (like, if the fork failed because it couldn't get the buffer, don't
drop all the way back to user mode: allocate a block, and when you get it
redo the fork... with the allocated block in hand, so it can't fail).
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
<peter@ficc.ferranti.com>

jr@oglvee.UUCP (Jim Rosenberg) (08/02/90)

In <18478@rpp386.cactus.org> jfh@rpp386.cactus.org (John F. Haugh II) writes:
>>In article <573@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>>> The larger question is *why can't the kernel sleep* when it needs more memory
>>> for a fork???
>It isn't that the kernel =can't= sleep, but rather that someone decided
>[ for some totally random reason, I suspect ... ] that the kernel
>=shouldn't= sleep.

I agree with the sentiments 100%, obviously, but I fear life is not so simple.

>In the past, the kernel swapped the process if malloc() didn't return
>with space.  So, this is a change in function, not some defect in
>application code.

But that was in swapping days, before V.3's hideous virtual memory.
Historically the kernel has never multitasked internally.  But somewhere along
the way, System V acquired "kernel daemons" -- in spite of lacking generalized
internal primitives for synchronizing true threads.  As I understand it, to
this day the only ways to synchronize flow of execution in the kernel are
sleep/wakeup and spl() (disabling interrupts).  These are *not good enough* for
generalized thread support.  Wakeups can be lost, as opposed to ups on a
semaphore.  Somebody decided that the dirty work of moving reclaimable memory
pages to swap space could be handled by an asynchronous "kernel daemon".  Now
just exactly HOW, without generalized thread support, is this daemon to be
synchronized with the part of the kernel that needs the memory when you fork?
The answer, apparently, is that they are synchronized only by that arcane
black magic that has come to surround the kernel.  You just have to "know"
what you can get away with to avoid deadlock and race conditions, and when you
can't avoid them you throw out the door "the functionality that used to be
there", to paraphrase John.

*THIS* is the reason for rewriting the kernel:

In <13435@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> Oh, good grief.  It is SILLY to say that the kernel should be redesigned
> to compensate for bugs in application programs.

Lighten up, Doug.  Some mighty heavy folks thought the kernel *needed*
threads.  I'd love to see some of the folks who've been lecturing me on KISS
and policy and whatnot say those things to the people who *were working* on
V.5 -- which *did have* kernel threads.  The V.5 project was not stopped (was
it stopped? :-)) for technical reasons but for political reasons.  Are the
technical reasons that got it started suddenly wrong?

I am truly mortified that **NO ONE** has posted an explanation of just what
the deadlock would be if the kernel did allow sleep in allocating memory for a
fork.  I fear no one understands it any more.  Dammit, Doug, it's not that I'm
too lazy to read a man page or to rewrite my fork code to do the right thing.
The point I'm making is that this functionality which we've lost since
swapping days is but the tip of a future iceberg.  The kernel *can* sleep on
the availability of a block in the buffer cache.  (Yeah, I know, there is no
more buffer cache in V.4, VM does it all ...)  If the page stealing daemon and
the "main part" of the kernel could communicate properly, using something
better than sleep/wakeup, then perhaps the kernel *could sleep* on the
availability of a memory page.  John & I & other folks are saying that we
thought this was the sort of thing kernels were supposed to be able to do.
They could once.  (BSD still can?)  If the System V kernel could, it would be
a better kernel.

And now folks are talking about bolting symmetric multiprocessing onto this
ballooning kernel.  If we can't get a clear explanation of just what the issue
is as to why the kernel can't sleep on more memory for a fork, what's going to
happen when all this is running on multiple processors?  Since I've provoked a
good deal of theological spouting, I can't resist asking what happened to the
old idea that UNIX was supposed to be *understandable*?

Meantime, if architects think they can get away with kernel threads without
full support for them, and we consumers of UNIX point out that by golly they
didn't really quite get away with it, hey, it's our bucks, we're entitled.
When I found that V.4, like V.3, uses an asynchronous page-stealing daemon, it
made me nervous.  I'm still nervous.
-- 
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr@dsi.com                      /      /

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (08/04/90)

All the applications programs that I know of assume that that there is
no point retrying a failed fork().  I've seen quite a few books about
programming in a UNIX environment, and I don't recall seeing a
recommendation that a failed fork() be retried.  (Yet authors usually
warn that a failed write() may succeed, and that a failed read() for
some types of descriptors should always be retried.)

So it seems that if fork() can fail but be likely to succeed if
retried, either AT&T needs to hold a MASSIVE publicity campaign telling
all these people (including its own staff programmers, who sometimes
write books and articles about UNIX programming) that they have been
writing buggy code all these years; OR it needs to modify its kernel.
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi

sar0@cbnewsl.att.com (stephen.a.rago) (08/04/90)

In article <575@oglvee.UUCP>, jr@oglvee.UUCP (Jim Rosenberg) writes:

[much rambling deleted...]

> I am truly mortified that **NO ONE** has posted an explanation of just what
> the deadlock would be if the kernel did allow sleep in allocating memory for a
> fork.

There would be a deadlock if the kernel had to wait for proc slots: if all
the processes were trying to fork, and nobody intended to exit, then every
process (except the kernel procs and init) would be waiting indefinitely in
the kernel for an event that would never occur.

However, the same deadlock is still possible when you kmem_alloc(KM_SLEEP)
a proc structure, only it's a lot less likely to occur because the memory
can be coming from other places (like STREAMS messages, etc.)  The only
time you'll deadlock is when no more memory is available and everyone is
waiting to allocate the space for their own proc structure.

Steve Rago
sar@attunix.att.com

als@bohra.cpg.oz (Anthony Shipman) (08/07/90)

At the risk of getting out of my depth!

In article <575@oglvee.UUCP>, jr@oglvee.UUCP (Jim Rosenberg) writes:
> I am truly mortified that **NO ONE** has posted an explanation of just what
> the deadlock would be if the kernel did allow sleep in allocating memory for a
> fork.  I fear no one understands it any more.  Dammit, Doug, it's not that I'm

There could be a deadlock between user-level processes.  Suppose a process
has used lots of kernel memory, e.g. with file locks, open files, open
inodes, etc., and then sleeps on more memory.  Then another such process
does the same thing.  They could end up deadlocked.  Since a shortage of
kernel memory is a critical thing, and these deadlocked processes have tied
up lots of it, the problem could spread.

A standard simple "naive" solution to break the deadlock is to kill one of the
processes at the request point. In this case, make the fork() fail. If several
retries fail then abort the program (and release memory). An indefinite wait
would be dangerous.

Of course this principle would have to be applied to every system call. Other
deadlock protection methods over general user processes are probably
inapplicable.


> the availability of a block in the buffer cache.  (Yeah, I know, there is no

The memory for buffers is preallocated at system boot time.

> Meantime, if architects think they can get away with kernel threads without
> full support for it, and us consumers of UNIX point out that by golly they
> didn't really quite get away with it, hey, it's our bucks, we're entitled.
> When I found that V.4, like V.3, uses an asynchronous page-stealing daemon, it
> made me nervous.  I'm still nervous.

And I'm bitten. My SCO UNIX 3.2 machine (appears to) deadlock frequently on 
going virtual (using the swap space for the first time).
-- 
Anthony Shipman                               ACSnet: als@bohra.cpg.oz.au
Computer Power Group
9th Flr, 616 St. Kilda Rd.,
St. Kilda, Melbourne, Australia

jfh@rpp386.cactus.org (John F. Haugh II) (08/08/90)

In article <553@bohra.cpg.oz> als@bohra.cpg.oz (Anthony Shipman) writes:
>A standard simple "naive" solution to break the deadlock is to kill one of the
>processes at the request point. In this case, make the fork() fail. If several
>retries fail then abort the program (and release memory). An indefinite wait
>would be dangerous.

I'm quite pleased to state that AIX V3.1 does not have this same problem
which AT&T has decided to introduce.  Not only does their fork() not fail
for transient shortages of real memory, but it has the added complexity
of working in an environment where almost all system calls are pre-emptable,
and frequently =are= pre-empted, page faulted, put to sleep, and a half
dozen other evils.

For more information on the implementation of the AIX kernel, please see
"Enhancements to the AIX Kernel for Support of Real-Time Applications",
by Kathy A. Bohrer and John T. O'Quin, IBM publication SA23-2619.
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org

boyd@necisa.ho.necisa.oz (Boyd Roberts) (08/09/90)

In article <553@bohra.cpg.oz> als@bohra.cpg.oz (Anthony Shipman) writes:
>At the risk of getting out of my depth!
>
> [stuff deleted]
>
>A standard simple "naive" solution to break the deadlock is to kill one of the
>processes at the request point. In this case, make the fork() fail. If several
>retries fail then abort the program (and release memory). An indefinite wait
>would be dangerous.
>

NO, NO, NO!  You do not _understand_.  _Never_ do that.  NEVER!

Which proc will you kill?  And when (if ever) will it run so it can exit()?

It's very simple.  When you're low on RAM & swap and processes are
fighting for this resource there's bugger all you can do.  Sure you
can purge a few sticky texts, but in a pathological case you run out
of resources and there is nothing you can do.

What is required are pre-emptive measures, such as:

   * memory limits
   * working set limits
   * warn that resources are about to expire (say 90% consumed)

With these things a human can make decisions that will best serve
the system's purpose.  You've only got so much cake so you'd better
choose well how it will best be eaten.  Prevent the disaster from
happening and then you won't have to code in disaster recovery.


Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

junk1@cbnews.att.com (eric.a.olson) (08/10/90)

In article <2123@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
>All the applications programs that I know of assume that there is
>no point retrying a failed fork().  I've seen quite a few books about
>programming in a UNIX environment, and I don't recall seeing a
>recommendation that a failed fork() be retried.


	Ummm.... well, just why do you think the errno
	is called 'EAGAIN'?