[net.unix-wizards] WAITing for specific process

tosca@ihuxj.UUCP (L. Cole) (08/02/85)

The wait() system call returns the process id of a zombie child, but a
process may have more than one of these outstanding at a time. Wait()
is free (it appears) to return the pid of any of these zombie children.
What if you want to wait for a particular child? For example, one might
pass the pid of a process to be waited for. It's clear that the way
it works now is useful, but there are times it really gets in the way.
An example of this is the library functions popen/pclose, which
generate and wait for a child, possibly discarding a zombie that was
to be waited for later. Any suggestions as to what to do or why this
facility isn't provided?

				Chris Scussel
				ihnp4!ihuxj!tosca
				Bell Labs
				Naperville, IL

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/04/85)

> The wait() system call returns the process id of a zombie child, but a
> process may have more than one of these outstanding at a time. Wait()
> is free (it appears) to return the pid of any of these zombie children.
> What if you want to wait for a particular child? For example, one might
> pass the pid of a process to be waited for. It's clear that the way
> it works now is useful, but there are times it really gets in the way.
> An example of this is the library functions popen/pclose, which
> generate and wait for a child, possibly discarding a zombie that was
> to be waited for later. Any suggestions as to what to do or why this
> facility isn't provided?

The facility IS provided, to some degree:

	void
	waitfor( pid )			/* wait for process to exit */
		int	pid;		/* ID of process to wait on */
		{
		extern unsigned	sleep();
	#define	DELAY	2		/* test interval, in seconds */

		while ( kill( pid, 0 ) == 0 )
			sleep( (unsigned)DELAY );
		}

If one wants to pick up the exit status, then one runs into the
problem you mention, which is the possible consumption of the exit
statuses of other zombies.  One solution would be to have your
process keep track of all the processes it has created, in a single
code module, and stash the spurious zombie statuses for later calls
on waitfor().  The only children your process should have other than
those it created itself are the ones it is given by the shell when
it is last in a pipeline.  It doesn't matter if you eat those zombies.
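
The stashing idea described above could look roughly like the following.
This is a minimal sketch, not a definitive implementation: the name
waitfor_status(), the fixed-size table and MAXSTASH are assumptions made
for illustration, and it is written with ANSI prototypes rather than in
the older style used earlier in the thread. It relies only on wait(), and
keeps the statuses of any other children it happens to reap.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAXSTASH 32		/* hypothetical limit on remembered zombies */

static struct {
	pid_t pid;
	int   status;
} stash[MAXSTASH];
static int nstash = 0;

/*
 * Wait for one particular child and pick up its exit status.
 * Returns 0 on success, -1 if the child cannot be waited for
 * (no such child, or no room left to remember other zombies).
 */
int
waitfor_status(pid_t pid, int *statusp)
{
	int i, status;
	pid_t w;

	/* An earlier call may already have reaped this child. */
	for (i = 0; i < nstash; i++)
		if (stash[i].pid == pid) {
			*statusp = stash[i].status;
			stash[i] = stash[--nstash];	/* drop the entry */
			return 0;
		}

	for (;;) {
		if (nstash == MAXSTASH)
			return -1;	/* refuse to reap what we cannot store */
		w = wait(&status);
		if (w == -1) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal; retry */
			return -1;		/* no children left */
		}
		if (w == pid) {
			*statusp = status;
			return 0;
		}
		/* Some other child: remember its status for a later call. */
		stash[nstash].pid = w;
		stash[nstash].status = status;
		nstash++;
	}
}
```

This only helps if every wait() in the process goes through the same
module; a direct wait() elsewhere can still consume a status that
waitfor_status() is later asked for.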

ado@elsie.UUCP (Arthur David Olson) (08/05/85)

> > . . .What if you want to wait for a particular child?
> 
>   . . .#define	DELAY	2		/* test interval, in seconds */
> 
> 		while ( kill( pid, 0 ) == 0 )
> 			sleep( (unsigned)DELAY );
> 		}

Please note that this technique is not portable to 4.1bsd systems, where
attempts to use signal zero always cause kill to return a non-zero value.
--
UNIX is an AT&T Bell Laboratories trademark.
DELAY is an American Jurisprudence trademark.
--
	UUCP: ..decvax!seismo!elsie!ado    ARPA: elsie!ado@seismo.ARPA
	DEC, VAX and Elsie are Digital Equipment and Borden trademarks

shannon@sun.uucp (Bill Shannon) (08/06/85)

> > The wait() system call returns the process id of a zombie child, but a
> > process may have more than one of these outstanding at a time. Wait()
> > is free (it appears) to return the pid of any of these zombie children.
> > What if you want to wait for a particular child?

	...

> The facility IS provided, to some degree:
> 
> 	void
> 	waitfor( pid )			/* wait for process to exit */
> 		int	pid;		/* ID of process to wait on */
> 		{
> 		extern unsigned	sleep();
> 	#define	DELAY	2		/* test interval, in seconds */
> 
> 		while ( kill( pid, 0 ) == 0 )
> 			sleep( (unsigned)DELAY );
> 		}

Polling for a child's death is really gross.  If the child will run
for a long time, this keeps the parent "hot" (in memory) and wastes
cpu time.  If the child will finish quickly, this wastes real time, since
the parent sleeps out the rest of the polling interval.

> The only children your process should have other than
> those it created itself are the ones it is given by the shell when
> it is last in a pipeline.  It doesn't matter if you eat those zombies.

Ah, but what about the processes created by library routines that you
called?  As you start to build larger programs that make heavy use of
multiple processes (both directly in the program itself and indirectly
in library routines called by the program) you discover that the existing
facilities provided by UNIX are not nearly powerful enough to allow all
the users of sub-processes to cooperate without interfering with each
other.  However, they are powerful enough to build such a facility on
top of.  What's needed is the ability to start a process and be notified
of its termination, without interfering with any other such uses by other
parts of the same parent process.  A list of terminated processes' statuses
needs to be kept in the parent process, so that it can be examined by
something like the "waitfor" proposed above.  Switching to such a mechanism
almost certainly precludes the standard use of wait(), SIGCHLD, etc.  A
layer on top of them is needed and everyone needs to be convinced to use
only that layer (e.g. fread vs. read).  Maybe we should entertain proposals
for such a layer?

					Bill Shannon
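
As a starting point for discussion, a layer of the kind described above
might be sketched as follows. Everything here is hypothetical: the names
proc_start() and proc_wait(), the slot-number handle, and the fixed table
size MAXPROC are assumptions for illustration, not an existing interface.
The layer records every child it starts, stores the status of any of them
that wait() returns, and ignores children it did not start (such as those
inherited from a pipeline).

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAXPROC 16		/* hypothetical table size */

static struct proc {
	pid_t pid;
	int   in_use;		/* slot holds a child started by this layer */
	int   done;		/* child has terminated; status is valid */
	int   status;
} table[MAXPROC];

/* Start path with argv as a child.  Returns a slot number, or -1. */
int
proc_start(const char *path, char *const argv[])
{
	int i;
	pid_t pid;

	for (i = 0; i < MAXPROC && table[i].in_use; i++)
		;
	if (i == MAXPROC)
		return -1;		/* table full */
	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		execv(path, argv);
		_exit(127);		/* exec failed */
	}
	table[i].pid = pid;
	table[i].in_use = 1;
	table[i].done = 0;
	return i;
}

/* Wait for the child in the given slot.  Returns 0 on success, -1 on error. */
int
proc_wait(int slot, int *statusp)
{
	int i, status;
	pid_t w;

	if (slot < 0 || slot >= MAXPROC || !table[slot].in_use)
		return -1;
	while (!table[slot].done) {
		w = wait(&status);
		if (w == -1) {
			if (errno == EINTR)
				continue;	/* interrupted; retry */
			table[slot].in_use = 0;	/* child was reaped elsewhere */
			return -1;
		}
		/* Record the status if this is one of our children. */
		for (i = 0; i < MAXPROC; i++)
			if (table[i].in_use && !table[i].done &&
			    table[i].pid == w) {
				table[i].done = 1;
				table[i].status = status;
				break;
			}
		/* A pid not in the table is simply dropped. */
	}
	*statusp = table[slot].status;
	table[slot].in_use = 0;		/* free the slot for reuse */
	return 0;
}
```

A library routine and the main program could each call proc_start() and
later proc_wait() on their own slots without consuming each other's
statuses, provided neither calls wait() directly.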

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/09/85)

> ...  A list of terminated processes' statuses
> needs to be kept in the parent process, so that it can be examined by
> something like the "waitfor" proposed above.  Switching to such a mechanism
> almost certainly precludes the standard use of wait(), SIGCHLD, etc.  A
> layer on top of them is needed and everyone needs to be convinced to use
> only that layer (e.g. fread vs. read).  ...

Yes, a very good idea.  You still have to watch out for children
you may unknowingly have when started (as the last process in a
pipeline), but that should be no problem so long as it is taken
into account.