[comp.unix.wizards] Zombies ???

larry@MITRE.arpa (Larry Henry) (12/07/86)

	Does anyone out there have a clear understanding of exactly what
	situations create zombie processes ?? Or any reading material I
	can look at.

							-Larry.

	ps. I apologize if this is not the correct group to ask this question of.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/08/86)

In article <1327@brl-adm.ARPA> larry@MITRE.arpa (Larry Henry) writes:
>	Does anyone out there have a clear understanding of exactly what
>	situations create zombie processes ??

A process becomes a zombie when it terminates.
It is laid to rest when it is reported to its parent process
(or to process #1, init, if the original parent is no longer
alive) as the result of a wait() system call in the parent.

>	ps. I apologize if this is not the correct group to ask this question of.

Questions about UNIX should be posted to comp.unix.questions.

merlin@hqda-ai.UUCP (David S. Hayes) (12/08/86)

A zombie process is the visible artifact of the fact that Unix says
that a parent can read the exit status of a child.  When a child
process exits, a status value is generated.  (If it just returns from
main, it still has an exit value.  Normally zero, but it depends on
your compiler.)  The parent may want to read this status value, but
the child process itself is no longer needed.

When the child dies, its memory resources are reclaimed.  The process
table entry, which contains the exit status, is retained by the
kernel.  The table entry is kept so that the parent may pick up the
exit status (via wait(2)).  While the table entry is being preserved,
its status is ZOMBIE.  When the parent reads the child's exit status,
the remaining table entry for the child is purged from the system.
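
The lifecycle just described can be sketched in C roughly as follows.
This is a minimal illustration, not code from any particular system:
the function name reap_after_delay and its parameters are made up for
the example, and it uses waitpid(), the POSIX variant of wait(2), so
that it collects exactly the child it created.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately with `code`, leave it unreaped
 * for `linger` seconds, then collect its status with waitpid().
 * Returns the child's exit code, or -1 on error.
 * (Name and parameters are illustrative assumptions.)
 */
int reap_after_delay(int code, unsigned linger)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: terminate at once */

    /* Parent: once the child has exited and until we wait for it,
       ps(1) shows it as a zombie (state Z or <defunct>, depending
       on the system). */
    sleep(linger);

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* The child's process table entry has now been released;
       all that is left is the status we just read. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling reap_after_delay(7, 30) and running ps from another terminal
during the 30 seconds should show the child in the zombie state.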

All processes ZOMBIE for a short time: the time between the process
exit and their parent's wait().  Processes that you see on ps(1) as
ZOMBIE are the result of the parent process exiting before the child,
or exiting without reading the child's exit status.  When this
happens, the child (now considered an "orphan" process) becomes a child
of init (process 1).  The kernel doesn't think about this, though, and
just ZOMBIE's the child as usual.  Init, though, does not check the
exit status either.  Thus, the ZOMBIEs stick around until reboot.

The preceding was discovered on a VAX running 4.2BSD, as a result
of some problem with emacs.  I think Sys V runs the same way, but
I can't be sure, as I don't have a Sys V machine.

-- 
	David S. Hayes, The Merlin of Avalon
	PhoneNet:	(202) 694-6900
	ARPA:		merlin%hqda-ai@brl
	UUCP:		...!seismo!sundc!hqda-ai!merlin

mike@enmasse.UUCP (Mike Schloss) (12/09/86)

In article <1327@brl-adm.ARPA> larry@MITRE.arpa (Larry Henry) writes:
>Does anyone out there have a clear understanding of exactly what
>situations create zombie processes ?? Or any reading material I
>can look at.

	The following holds true for Sys V.  Don't know about the others.
	When a process exits, a zombie is created unless its parent has
	SIGCLD set to SIG_IGN.  This is described (somewhat poorly) in the
	WARNING section of signal(2).
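
	A small sketch of that behavior, under some assumptions: it
	uses the POSIX spelling SIGCHLD (SIGCLD is the older System V
	name), the function name is invented for the example, and it
	assumes a system that follows the System V convention, which
	POSIX later adopted.  Other systems may behave differently.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * With SIGCHLD ignored, fork a child that exits at once and then try
 * to wait for it.  Returns 1 if the system discarded the child without
 * leaving a zombie (the wait fails with ECHILD), 0 if a status was
 * still delivered, -1 on any other error.
 * (The function name is an illustrative assumption.)
 */
int child_vanishes_when_ignored(void)
{
    void (*old)(int);
    pid_t pid, got;
    int status, saved_errno;

    old = signal(SIGCHLD, SIG_IGN);     /* SIGCLD on older System V */
    if (old == SIG_ERR)
        return -1;

    pid = fork();
    if (pid < 0) {
        signal(SIGCHLD, old);
        return -1;
    }
    if (pid == 0)
        _exit(3);                       /* child: terminate at once */

    /* On systems that honor the convention, this blocks until the
       child is gone and then fails with ECHILD: there is no zombie
       and no status left to collect. */
    got = waitpid(pid, &status, 0);
    saved_errno = errno;
    signal(SIGCHLD, old);               /* restore previous handling */

    if (got == -1 && saved_errno == ECHILD)
        return 1;
    return got == pid ? 0 : -1;
}
```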

clewis@spectrix.UUCP (Chris Lewis) (12/09/86)

In article <165@hqda-ai.UUCP> merlin@hqda-ai.UUCP (David S. Hayes) writes:
>All processes ZOMBIE for a short time: the time between the process
>exit and their parent's wait().  Processes that you see on ps(1) as
>ZOMBIE are the result of the parent process exiting before the child,
>or exiting without reading the child's exit status.  When this
>happens, the child (now considered an "orphan" process) becomes a child
>of init (process 1).  The kernel doesn't think about this, though, and
>just ZOMBIE's the child as usual.  Init, though, does not check the
>exit status either.  

Close but not quite.

>Thus, the ZOMBIEs stick around until reboot.

Not true - normally.

[Sort of reminds me of what somebody told one of our customers: "You should 
do a "ps -l" frequently.  If you see the word "csh" that means
the process has crashed, and you should issue a "kill -9 <pid>" to clear
it".... Sigh... ]

"init" is issuing "wait"'s continuously - that's how it figgers out that a 
login session has gone away.  It has a loop something like this:

	/* fork/exec all getty's specified in /etc/ttys (or whatever) 
	   and remember their pids */
	while(1) {
	    pid = wait(&status);
	    if (pid == one of the getty processes init spawned)
		fork/exec a new getty
	    else
		/* ignore completely */
	}

Therefore, in a normally running system init will eventually "wait" for all
processes that have died without their parent having waited for them, and
all Zombies will disappear.  Usually pretty quickly.
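
The reaping half of the loop above can be tried out in an ordinary
process.  The following is only a sketch of the idea, not init's
actual source: spawn_and_reap is an invented name, and the children
are throwaway processes rather than gettys.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start up to `n` short-lived children, then reap them the way the
 * loop above does: keep calling wait() until nothing is left to wait
 * for.  Returns the number of children reaped.
 * (The function name is an illustrative assumption.)
 */
int spawn_and_reap(int n)
{
    int i, status, reaped = 0;
    pid_t pid;

    for (i = 0; i < n; i++) {
        pid = fork();
        if (pid < 0)
            break;              /* fork failed: reap what we have */
        if (pid == 0)
            _exit(0);           /* child: die immediately */
    }

    for (;;) {
        pid = wait(&status);    /* blocks until some child has died */
        if (pid > 0) {
            reaped++;           /* a real init would check here whether
                                   pid was a getty and respawn it */
            continue;
        }
        if (errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        break;                  /* ECHILD: no children remain */
    }
    return reaped;
}
```

Unlike this sketch, init never runs out of children to wait for on a
live system, so its loop does not terminate.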

Situations where Zombie processes stay around are:

	1) init has died or hung (repeat by: "kill -9 1" [You hear the
	   howling of the Banshee....])
	2) the process isn't totally dead.  This can happen when the process
	   is trying to die, but during attempts to close all of its open
	   devices, the driver hangs.  Especially when a tty driver hangs 
	   in "close" because of some odd state - common in some earlier 
	   EXORmacs UNIX systems.  Their not-quite-perfect tty driver and 
	   hardware would hang when a modem dropped the line - in this 
	   case, one "stuck" Zombie would prevent all other Zombies from 
	   being seen by init.

If your system ever starts getting large numbers of Zombies that are
staying around, first check to ensure that "init" is still running (process 1).
If it ain't, it'll be a Zombie too - You MUST reboot your system to recover
(tho, some systems autoreboot if init goes away - eg: Pyramids).  Once init
has gone away, no "new" getty's will be spawned until reboot.

Otherwise, suspect devices associated with the Zombie processes.  Until you 
have the "device hang" problem fixed you will always have to reboot to clear 
Zombies (if the number is increasing).  Contact your support organization.
-- 
Chris Lewis, Spectrix Microsystems Inc,
UUCP: {utzoo|utcs|yetti|genat|seismo}!mnetor!spectrix!clewis
ARPA: mnetor!spectrix!clewis@seismo.css.gov
Phone: (416)-474-1955

madd@bucsb.bu.edu.UUCP (Jim "Jack" Frost) (12/10/86)

In article <1327@brl-adm.ARPA> larry@MITRE.arpa (Larry Henry) writes:
>
>	Does anyone out there have a clear understanding of exactly what
>	situations create zombie processes ??

One way to do it is to open pipes with popen() from within a process
and never pclose() them.  If you repeatedly open a pipe using the same
variable and either omit the pclose() or use fclose() instead, the
child may never be waited for, and this should cause zombies.  Killing
the parent gets rid of all of them, BTW, since init then inherits them
and waits for them.

Example:

-- cut here --
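#include <stdio.h>
#include <unistd.h>	/* sleep() */

/* Deliberately leaks children: only pclose() is guaranteed to wait
   for the process that popen() started, and it is never called. */
int main(void)
{
  FILE *p;

  for (;;) {
    if ((p = popen("ps", "r")) != NULL) {
      /* do something with the data */
      fclose(p); /* <- wrong call for a popen() stream; omit this
                    statement for same effect */
    }
    sleep(5);
  }
}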
-- cut here --

Anyway, this should cause a new zombie every 5 seconds or so until the
parent is killed.
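
For contrast, here is a sketch of the same pattern with the stream
closed by pclose(), which POSIX specifies as waiting for the command
to terminate.  The function name run_and_reap is invented for the
example, and the feature-test macro is only there so that strict
compiler modes declare popen().

```c
#define _POSIX_C_SOURCE 200809L /* make popen()/pclose() visible */
#include <stdio.h>
#include <sys/wait.h>

/*
 * Run `cmd` through popen(), drain its output, and close the stream
 * with pclose(), which waits for the child so no zombie is left.
 * Returns the command's exit code, or -1 on failure.
 * (The function name is an illustrative assumption.)
 */
int run_and_reap(const char *cmd)
{
    char buf[256];
    int status;
    FILE *p = popen(cmd, "r");

    if (p == NULL)
        return -1;

    while (fgets(buf, sizeof buf, p) != NULL)
        ;                       /* do something with the data */

    status = pclose(p);         /* not fclose(): this reaps the child */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_and_reap("ps") in the loop instead of the popen()/fclose()
pair should leave no zombies behind.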

I'm sure there are others, but this one can be pretty nasty.  I found
it while playing with a daemon I was writing.
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
                   - Jim Frost * The Madd Hacker -
UUCP:  ..!harvard!bu-cs!bucsb!madd | ARPANET: madd@bucsb.bu.edu
CSNET: madd%bucsb@bu-cs            | BITNET:  cscc71c@bostonu
-------------------------------+---+------------------------------------
"Oh beer, oh beer." -- Me      |      [=(BEER) <- Bud the Beer (cheers!)

wolfe@winston.UUCP (12/10/86)

Here is a question about zombie processes.  Are they an artifact of the
implementation or would UNIX break if one did it some other way?

Offhand, I can think of storing the {PID, STATUS} pair (or triple if
you want resource usage a la 4.[23] BSD) in queues for each parent to
read whenever they get around to it.  This would eliminate the need for
having process blocks hanging around until someone reads the exit
status.
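
To make the idea concrete, here is a rough user-space sketch of such a
per-parent queue.  It is not taken from any kernel; every name in it
(exit_record, exit_queue, exit_queue_post, exit_queue_wait) is
hypothetical, and it ignores locking and the resource-usage field.

```c
#include <stdlib.h>

/* One pending exit report: the {PID, STATUS} pair. */
struct exit_record {
    int pid;
    int status;
    struct exit_record *next;
};

/* Per-parent FIFO of reports the parent has not yet collected. */
struct exit_queue {
    struct exit_record *head;
    struct exit_record *tail;
};

/* Called when a child dies: append its report to the parent's queue.
   Returns 0 on success, -1 if memory is exhausted. */
int exit_queue_post(struct exit_queue *q, int pid, int status)
{
    struct exit_record *r = malloc(sizeof *r);

    if (r == NULL)
        return -1;
    r->pid = pid;
    r->status = status;
    r->next = NULL;
    if (q->tail != NULL)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
    return 0;
}

/* Called on behalf of the parent's wait(): remove the oldest report.
   Returns the child's pid and stores its status through `status`
   (if non-NULL), or returns -1 if nothing is queued. */
int exit_queue_wait(struct exit_queue *q, int *status)
{
    struct exit_record *r = q->head;
    int pid;

    if (r == NULL)
        return -1;
    q->head = r->next;
    if (q->head == NULL)
        q->tail = NULL;
    pid = r->pid;
    if (status != NULL)
        *status = r->status;
    free(r);
    return pid;
}
```

Even in this form a small record per dead child has to be kept until
the parent asks for it, so the saving would be in the size of what is
retained rather than in retaining nothing.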

Does anyone know of implementations that do not use zombies?
-- 
Peter Wolfe			| ..decvax!microsoft!ubc-vision!winston!wolfe
New Media Technologies Ltd.	| ..ihnp4!alberta!ubc-vision!winston!wolfe
(604) 291-7111			|