[comp.unix.programmer] Catching termination of child process and system

vrm@cathedral.cerc.wvu.wvnet.edu (Vasile R. Montan) (01/22/91)

   I have a program which occasionally forks to do some processing.
In order to avoid having zombie processes hang around, I put the
following in my main routine:

void dowait()
{
  wait(0);
}

main()
{
   ...
   signal(SIGCHLD, dowait);
   ...
}

   However, in another place in the code, I call system() and look
at the return status to see if an error has occurred.  Without the
signal() call in the main routine, system() works fine, but with it,
system() always returns -1.  Is there an easy way to fix this?

**************** The above opinions are mine, all mine. *****************
Vasile R. Montan                           Bell Atlantic Software Systems 
                                           9 South High Street             
vrm@cerc.wvu.wvnet.edu                     Morgantown, WV 26505            
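
A likely cause, stated here as an assumption because the thread never
diagnoses it: on systems where system() does not block SIGCHLD, the
handler's wait(0) can reap the shell that system() started, so system()'s
own wait finds no child and the call returns -1.  One possible workaround,
sketched below, is to restore the default SIGCHLD disposition around the
call.  The wrapper name run_command is hypothetical.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

/* The handler from the question, given an explicit signal argument. */
void dowait(int sig)
{
    (void)sig;
    wait(0);
}

/* Hypothetical wrapper: run cmd with SIGCHLD at its default disposition
   so that only system() waits for the shell it starts, then put back
   whatever disposition was in place before. */
int run_command(const char *cmd)
{
    void (*old)(int) = signal(SIGCHLD, SIG_DFL);
    int status = system(cmd);

    signal(SIGCHLD, old);
    return status;
}
```

One caveat with this approach: a forked child of the program itself that
exits while system() is running does not trigger the handler, so it stays
a zombie until some later wait() collects it.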

gwc@root.co.uk (Geoff Clare) (02/18/91)

In comp.lang.c <9882@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:

>(This really belongs in a Unix newsgroup; however, I expect no further
>followups, i.e., I think this will be the decisive answer.)

Sorry to disappoint Chris, but I have something to add to his "decisive
answer".  I have cross-posted to comp.unix.programmer and directed
follow-ups there.  The discussion does have some relevance to 'C' since
it is about the format of the status returned by wait(), and on UNIX
systems this format also applies to the return value of the system()
function.

>The answer, then, is that to wait for a process whose id is `pid' you
>should use:

>	int w, status;

>	if (check_other_wait_results(pid, &status))	/* if necessary */
>	while ((w = wait(&status)) != pid) {
>		if (w == -1 && errno == EINTR)	/* ugly but sometimes... */
>			continue;		/* ...necessary */
>		record_other_wait_result(w, status);	/* if necessary */
>	}

>The exit status of the process, if any, is then `status >> 8' and the
>signal, if any, that caused the process to die is then `status & 0177'.
>The process left a core dump (`image' or `traceback data' to non-Unix
>folks) if `status & 0200' is nonzero.

POSIX does not specify the precise encoding of information in the status
returned by wait(), system(), etc., so portable programs should not
rely on the traditional encoding Chris describes above.  Instead macros
are provided in <sys/wait.h> to extract the relevant data from the status:

     WIFEXITED(status) is non-zero if the child exited normally, in which
case WEXITSTATUS(status) gives the exit code.

     WIFSIGNALED(status) is non-zero if the child was terminated by a signal,
and WTERMSIG(status) gives the signal number.

     WIFSTOPPED(status) is non-zero if the child was stopped by a signal,
and WSTOPSIG(status) gives the signal number.
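
As a rough sketch of how the macros above might be used on a system that
provides them, the helper below turns a status from wait() or system()
into a short description.  The name describe_status and the message
strings are illustrative assumptions, not part of POSIX.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write a description of status into buf.
   Returns the exit code if the child exited normally, -1 otherwise. */
int describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status)) {
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
        return WEXITSTATUS(status);
    }
    if (WIFSIGNALED(status)) {
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
        return -1;
    }
    if (WIFSTOPPED(status)) {
        /* Only reported when the caller asked for stopped children. */
        snprintf(buf, len, "stopped by signal %d", WSTOPSIG(status));
        return -1;
    }
    snprintf(buf, len, "unrecognized status");
    return -1;
}
```

A caller would pass the status filled in by wait() or waitpid(), or the
return value of system() once it has been checked against -1.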
-- 
Geoff Clare <gwc@root.co.uk>  (Dumb American mailers: ...!uunet!root.co.uk!gwc)
UniSoft Limited, London, England.   Tel: +44 71 729 3773   Fax: +44 71 729 3273

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (02/19/91)

In article <2608@root44.co.uk> gwc@root.co.uk (Geoff Clare) writes:
> The discussion does have some relevance to 'C' since
> it is about the format of the status returned by wait(),

No. It has absolutely nothing to do with C. We just happen to be talking
about it in C.

> POSIX does not specify the precise encoding of information in the status
> returned by wait(), system(), etc., so portable programs should not
> rely on the traditional encoding Chris describes above.  Instead macros
> are provided in <sys/wait.h> to extract the relevant data from the status:

Wrong. A program written according to your advice is decidedly
nonportable: it will work only on POSIX systems. A program written
according to Chris's advice will work under System V, BSD, and most
POSIX-based systems to boot. Someone who wants to plan for the future
should conditionally compile the POSIX code, though of course he'll
still have to use w & 0200 to get the core dump bit.

Portability is defined by the real world, not a standards committee.

---Dan

gwyn@smoke.brl.mil (Doug Gwyn) (02/19/91)

In article <2608@root44.co.uk> gwc@root.co.uk (Geoff Clare) writes:
>POSIX does not specify the precise encoding of information in the status
>returned by wait(), system(), etc., so portable programs should not
>rely on the traditional encoding Chris describes above.  Instead macros
>are provided in <sys/wait.h> to extract the relevant data from the status:

(1)  PORTABLE programs MUST follow Chris's recommendation; not all
existing UNIX environments provide the macros to which you alluded.
PORTABLE != POSIX

(2)  Does POSIX really neglect to specify the bits?  Certainly as of
the trial-use 1003.1 standard the bits were specified.  In any case,
all UNIX systems must continue to act as Chris described, regardless of
whether POSIX requires additional facilities for this.

gwc@root.co.uk (Geoff Clare) (02/20/91)

In <12673:Feb1900:07:4691@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>In article <2608@root44.co.uk> gwc@root.co.uk (Geoff Clare) writes:
>> The discussion does have some relevance to 'C' since
>> it is about the format of the status returned by wait(),

>No. It has absolutely nothing to do with C. We just happen to be talking
>about it in C.

I noticed you chopped off an important piece of text when quoting me.
It referred to the system() function, which is defined by ANSI 'C'.
A discussion of how the return value of system() is interpreted certainly
is relevant to 'C', even if the discussion only involves how the value is
interpreted on UNIX systems.  Anyway, now the discussion has moved to
comp.unix.programmer, there's no point in continuing the argument over
whether it is relevant to comp.lang.c.

>> POSIX does not specify the precise encoding of information in the status
>> returned by wait(), system(), etc., so portable programs should not
>> rely on the traditional encoding Chris describes above.  Instead macros
>> are provided in <sys/wait.h> to extract the relevant data from the status:

>Wrong. A program written according to your advice is decidedly
>nonportable: it will work only on POSIX systems. A program written
>according to Chris's advice will work under System V, BSD, and most
>POSIX-based systems to boot. Someone who wants to plan for the future
>should conditionally compile the POSIX code, though of course he'll
>still have to use w & 0200 to get the core dump bit.

If you re-read more carefully what I wrote, you'll realise you've done me an
injustice.  I said "portable programs should not *rely* on the traditional
encoding Chris describes above."  I did *not* say they should instead use
*only* the POSIX macros, which is what you infer I said.

A program which wants to be portable both to POSIX systems and to
traditional systems would need to do something like this:

#include <sys/wait.h>
#ifndef WIFEXITED
#define WIFEXITED(s)	(((s) & 0xff) == 0)
#endif
#ifndef WEXITSTATUS
etc......
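
One way the elided fallbacks might be filled in is sketched below.  It
assumes the traditional encoding quoted earlier in the thread (exit code
in `status >> 8', signal in `status & 0177', core dump bit `status & 0200'),
an int status rather than the BSD union wait, and a <sys/wait.h> that
exists on the target system.  The names STATUS_COREDUMP and exit_code_of
are hypothetical.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Each fallback is used only when the system header did not define it. */
#ifndef WIFEXITED
#define WIFEXITED(s)    (((s) & 0xff) == 0)
#endif
#ifndef WEXITSTATUS
#define WEXITSTATUS(s)  (((s) >> 8) & 0xff)
#endif
#ifndef WIFSTOPPED
#define WIFSTOPPED(s)   (((s) & 0xff) == 0177)
#endif
#ifndef WSTOPSIG
#define WSTOPSIG(s)     (((s) >> 8) & 0xff)
#endif
#ifndef WIFSIGNALED
#define WIFSIGNALED(s)  (((s) & 0177) != 0 && ((s) & 0xff) != 0177)
#endif
#ifndef WTERMSIG
#define WTERMSIG(s)     ((s) & 0177)
#endif

/* Hypothetical name: the traditional core dump bit, which the POSIX
   macros listed in this thread do not cover. */
#define STATUS_COREDUMP(s)  (((s) & 0200) != 0)

/* Hypothetical helper: exit code of cmd, or -1 if system() failed or
   the command did not exit normally. */
int exit_code_of(const char *cmd)
{
    int status = system(cmd);

    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

On a POSIX system the header's own definitions win and the fallbacks are
skipped; on a traditional system the fallbacks apply the old bit layout.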

>Portability is defined by the real world, not a standards committee.

The two are not mutually exclusive - the POSIX standards are a part of
the real world.  Widely portable programs need to allow for both POSIX
systems and traditional systems.
-- 
Geoff Clare <gwc@root.co.uk>  (Dumb American mailers: ...!uunet!root.co.uk!gwc)
UniSoft Limited, London, England.   Tel: +44 71 729 3773   Fax: +44 71 729 3273

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (02/20/91)

Geoff, somewhere out there is a novice UNIX programmer---let's call him
Joe---who reads your article, and, following your advice, doesn't rely
on the traditional encoding Chris described. Instead he uses the POSIX
macros in exactly the ``portable'' way you showed. He happens to be
using an Ultrix 4.1 system, so his programs work. Some time later he
distributes his code. Most of the administrators who try to install it
find that it doesn't even compile.

If there is even one person like that in the world, you did an injustice
by posting your article. If Doug Gwyn and I hadn't followed up, poor Joe
would have believed you and thought that POSIX code was portable code.
In your second article you were much more careful to point out a truly
portable method. That's good. Why didn't you do it in the first place?

I'm not saying that you were entirely unjustified in stating that
portable programs shouldn't rely on Chris's advice about the wait bits
(although I don't know anyone who would agree with you). I'm just saying
that such a statement will do more harm than good.

Keep this in mind next time you talk about ``portable'' code.

---Dan