martin@mwtech.UUCP (Martin Weitzel) (07/29/90)
In article <7885@tekgvs.LABS.TEK.COM> terryl@sail.LABS.TEK.COM writes:
[many wise words about KISS principle and the spirit of unix deleted]
But IMHO it's not quite appropriate here. I think the question here is:
Who should retry if a fork fails?
To see the problem I think that we should generalize a little. Just
consider the case of disk reads for a moment. Surely, there's no
one of us who doesn't appreciate the ability of the device drivers
to issue retries(%) if a read fails, and that an error from a read in
an application can be considered to be a permanent error.
(%: Maybe, if I were about to write a program which tests for flaky
disk blocks, I wouldn't be so happy with kernel retries ...)
Of course, an application can choose to retry after bad reads and I've
had cases of "ill" disks, where running a program in the background
for some hours helped me to recover 100 % of "bad blocks" by patiently
retrying ... just 1 out of 100 reads or so happened to be successful.
On the other hand I would never wrap disk reads in "normal" programs
with a retry capability - why bother: the kernel drivers solve the
problem well in general.
Now, why is the situation so different with "fork"?
As I understand all the traffic here, the "real" problem is in fact
that in case of the EAGAIN error two very different problems may
exist: The one is more a "long-term" problem (no slots in the process
table or the per-user limit reached, where this could also be zombies caused
by careless programming techniques), the other is a very short-term
problem, which is difficult to correct in the kernel because of the complexity
of the algorithms in that area.
So I think the complaints here *are* right from the view of an application
developer, but instead of wrapping all the forks in application programs
with a retry capability, I think there's a more pragmatic (though not
ideal) approach: Why not enhance the interface to fork in the standard
library with a retry capability? For many of us, "library + kernel" are
more or less a monolithic block (we can't change either easily 1/2:-)), so
if an error from fork could then be treated as the described "long-term"
error condition, everything would be fine.
Well, only a suggestion, maybe someone will post such a piece of code
soon ...
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
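The "long-term" case mentioned above includes zombies left behind by a parent that never waits for its children. A minimal sketch of how a parent might free those slots before trying fork() once more, assuming a POSIX system with waitpid(); the function names reap_finished_children() and fork_after_reaping() are made up for illustration and do not come from any poster in this thread:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Collect every child that has already exited, without blocking.
 * Returns the number of children reaped.
 */
int reap_finished_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    for (;;) {
        pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;               /* one zombie gone, look for more */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;               /* interrupted by a signal, try again */
        break;                      /* 0: children still running; -1: none left */
    }
    return reaped;
}

/*
 * Hypothetical wrapper: if fork() fails with EAGAIN, reap our own
 * zombies and try exactly once more if any slot was freed.
 */
pid_t fork_after_reaping(void)
{
    pid_t pid = fork();

    if (pid == -1 && errno == EAGAIN) {
        if (reap_finished_children() > 0)
            pid = fork();
        else
            errno = EAGAIN;         /* waitpid() may have overwritten errno */
    }
    return pid;
}
```

This only helps when the calling process's own unreaped children are what is filling the table; it does nothing about a system-wide shortage or a transient kernel condition.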
boyd@necisa.ho.necisa.oz (Boyd Roberts) (07/31/90)
When fork() fails with EAGAIN it fails for a good reason.

How many lines of user-mode code does it take to code up retries?

About 20.

It would seem that there is some consensus to change the semantics of
fork() to retry. This would break a critical interface. System calls
do one thing, and one thing well.

A trivial addition to user-mode programs is what is required, and NOT
the re-definition of a well defined critical interface.

Leave the kernel and C library alone. Write your own re-trying fork().

NAME
    bork() - spawn new process with exponential backoff on failure

SYNOPSIS
    int bork(retries)
    int retries;

...

Boyd Roberts boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''
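The body of bork() is elided in the post above, so the following is not Boyd's code. It is a minimal sketch of what a user-mode retrying fork with exponential backoff could look like. The name fork_with_backoff(), the one-second starting delay and the 64-second cap are all assumptions chosen for illustration:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: call fork() and, if it fails with EAGAIN,
 * sleep and try again, doubling the delay each time. After
 * `retries` extra attempts it gives up and returns -1 with errno
 * as set by the last fork().
 */
pid_t fork_with_backoff(int retries)
{
    unsigned int delay = 1;         /* seconds; assumed starting point */
    pid_t pid;

    for (;;) {
        pid = fork();
        if (pid >= 0)
            return pid;             /* 0 in the child, child's pid in the parent */
        if (errno != EAGAIN || retries-- <= 0)
            return -1;              /* different error, or out of attempts */
        sleep(delay);
        if (delay < 64)
            delay *= 2;             /* assumed cap of about a minute per wait */
    }
}
```

The caller still has to handle a -1 return: backoff only covers the case where the shortage clears within the chosen number of attempts.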
jfh@rpp386.cactus.org (John F. Haugh II) (08/01/90)
In article <1809@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>When fork() fails with EAGAIN it fails for a good reason.
>
>How many lines of user-mode code does it take to code up retries?
>
>About 20.
>
>It would seem that there is some consensus to change the semantics of
>fork() to retry. This would break a critical interface. System calls
>do one thing, and one thing well.

fork() never failed before for lack of kernel memory - the change in
behavior was recently introduced - this is what people are complaining
about.

[ Gross simplification follows ... ]

The previous behavior was to allocate memory for the child, and if the
allocation failed, swap the entire image. But the process image was
left in memory, which had the effect of creating two copies of the
process - one in memory and one on the swapper. So, even when there
was a shortage of kernel memory, the process would be created, even
though it was swapped out.

This isn't some new function people want - it's the old, pre-V.4
function people want - you know, the same stuff that's been around
since, say, 6th Edition?
--
John F. Haugh II                UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832         Domain: jfh@rpp386.cactus.org
terryl@sail.LABS.TEK.COM (08/02/90)
In article <866@mwtech.UUCP> martin@mwtech.UUCP (Martin Weitzel) writes:
>In article <7885@tekgvs.LABS.TEK.COM> terryl@sail.LABS.TEK.COM writes:
>[many wise words about KISS principle and the spirit of unix deleted]
>
>But IMHO it's not quite appropriate here. I think the questions here is:
>
> Who should retry if a fork fails?

Well, I'll agree to disagree!!! (-: IMHO, the KISS principle and the
spirit of unix ARE appropriate here, and you deleted one of my main
reasons why with respect to your question "Who should retry if a fork
fails?"

And the reason is one of policy; as I said in my previous post, the
kernel SHOULD NOT be making policy decisions that are better left to
the application program. Having the kernel ALWAYS sleeping and retrying
if a fork() fails is a policy decision that may not always be
appropriate.

However, many people have brought up a point that I didn't think about
at first; the error code EAGAIN is overloaded to mean two totally
different things: one, a transient condition of not enough system
resources (e.g. disk swap space, real memory, etc.) to satisfy the
request, and two, a more permanent condition of running into some
system-wide limits (e.g. no more process slots, too many processes for
this user id, etc.), and I'll be happy to agree (-: on this point that
the two error conditions should be signaled differently.

Terry Laskodi
of
Tektronix
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/02/90)
In article <18479@rpp386.cactus.org> jfh@rpp386.cactus.org (John F. Haugh II) writes:
>fork() never failed before for lack of kernel memory - the change in
>behavior was recently introduced - this is what people are complaining
>about.

Actually, there have been several different implementations of fork(),
all of which have been able to fail for reasons of a transient shortage
of resources. The details are the only thing that has varied.
lls@kings.co.uk (Lady Lodge Systems) (08/02/90)
In article <1809@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>When fork() fails with EAGAIN it fails for a good reason.
>A trivial addition to user-mode programs is what is required, and NOT
>the re-definition of a well defined critical interface.
>Leave the kernel and C library alone. Write your own re-trying fork().

OK! Napalm dispenser armed.....

How about AT&T making the 'trivial' addition to their user mode programs?
The shell springs to mind! What are you suggesting? I make the trivial
addition of rewriting the shell?? How about the other programs that
I only have in binary form - rewrite them as well??

People are missing the whole point here. The vast majority of Un*x
users are binary only licencees and cannot easily change their code to
deal with EAGAIN returns in all of their (binary only) programs.

When fork fails through a temporary log-jam in the kernel surely the
kernel should deal with the situation before telling the calling
application that the world has just ended?

If fork fails because the process table is full we can fix that by
retuning for sensible limits - WE CAN'T FIX THE KERNEL GETTING ITS
INTERNALS SNARLED UP

The code that I write does check for EAGAIN and retry - however, other
utilities and packages that I call (including /bin/sh) do not - how do
I fix that?
--
-------------------------------------------------------------------------
| Peter King.                          | Voice: +44 733 239445          |
| King Bros (Lady Lodge) Ltd           | EMAIL: root@kings.co.uk        |
| Lady Lodge House, Orton Longueville, | UUCP: ...mcvax!ukc!kings!root  |
| Peterborough, PE2 0HZ, England.      |                                |
-------------------------------------------------------------------------
jfh@rpp386.cactus.org (John F. Haugh II) (08/03/90)
In article <13468@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>Actually, there have been several different implementations of fork(),
>all of which have been able to fail for reasons of a transient shortage
>of resources. The details are the only thing that have varied..

I've been spared the gory details of System V fork() most of my adult
life, but in my childhood (last week when this started) I did check
V6 and V7, and sure enough - neither of those fail for lack of anything
short of swap space.

Now, I have V.2 source laying around work someplace, I'd be happy to
look and see what fork() does in the case of no physical memory there.
But my recollection is, it doesn't fail for transient shortages of
physical memory.

Anyone with a fork() older than V.3 which fails for transient shortages
of memory is free to send me the sordid details. That way I can start
a list of which UNIX products not to purchase. Obviously the disease
starts about V.3.2 or so and I'd rather avoid the plague.
--
John F. Haugh II                UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832         Domain: jfh@rpp386.cactus.org
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/05/90)
In article <35@kings.co.uk> lls@kings.UUCP (Superuser) writes:
>The shell springs to mind! What are you suggesting? I make the trivial
>addition of rewriting the shell??

If you had been following the discussion, you should have already heard
that the shell DOES retry failed forks, several times with exponentially
increasing delay between them.

>People are missing the whole point here. The vast majority of Un*x
>users are binary only licencees and cannot easily change their code to
>deal with EAGAIN returns in all of their (binary only) programs.

If you have a problem with a program you've licensed in binary form,
you should submit a bug report to the vendor of the program.
boyd@necisa.ho.necisa.oz (Boyd Roberts) (08/08/90)
In article <35@kings.co.uk> lls@kings.UUCP (Superuser) writes:
>
>The code that I write does check for EAGAIN and retry - however, other
>utilities and packages that I call (including /bin/sh) do not - how do
>I fix that?

I've just read the System V.2.2 shell code and guess what? It retries
the fork() with backoff. So your shell must be broken.

The deal is that when fork() fails -- all is not well. The kernel is
not psychic and can't predict what will happen in the future. The caller
of fork() is responsible for taking action.

When fork() does fail what can you do? Not much, you can give up
in disgust or retry. BUT THERE IS NO GUARANTEE THAT ANY AMOUNT
OF RETRYING WILL GET YOU THAT NEW PROCESS.

What's happened is that a kernel resource has been exhausted and you have
no way of predicting whether any relevant resources in use will be freed
up in the future.

I'm with presotto & hume. But when you're faced with retrying you have
to ask the question -- is it worth it, and if so how long should I try?

Boyd Roberts boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''
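One way to make "how long should I try?" concrete is to fix a time budget first and derive the retry count from it. The sketch below is only illustrative arithmetic under assumed parameters: the function name retries_within_budget() and the doubling-with-cap schedule are inventions for this example and are not taken from any shell or kernel source.

```c
/*
 * Hypothetical helper: how many sleeps of a doubling backoff schedule
 * (first_delay, 2*first_delay, ... capped at max_delay) fit into
 * `budget` seconds in total? All values are in seconds.
 */
unsigned int retries_within_budget(unsigned int budget,
                                   unsigned int first_delay,
                                   unsigned int max_delay)
{
    unsigned int delay = first_delay;
    unsigned int waited = 0;        /* invariant: waited <= budget */
    unsigned int retries = 0;

    if (first_delay == 0)
        return 0;                   /* a zero delay would never use up the budget */

    while (delay <= budget - waited) {
        waited += delay;
        retries++;
        if (delay < max_delay)      /* double, but never past the cap */
            delay = (delay > max_delay / 2) ? max_delay : delay * 2;
    }
    return retries;
}
```

For example, with a 1-second first delay and a 64-second cap, a 31-second budget allows five retries (1 + 2 + 4 + 8 + 16 = 31), while a 30-second budget allows only four.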
blm@6sceng.UUCP (Brian Matthews) (08/09/90)
In article <1820@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
|When fork() does fail what can you do? Not much, you can give up
|in disgust or retry. BUT THERE IS NO GUARANTEE THAT THE ANY AMOUNT
|OF RETRYING WILL GET YOU THAT NEW PROCESS.

On the other hand, if you don't retry, you're pretty much guaranteed not
to get a new process.
--
Brian L. Matthews blm@6sceng.UUCP