[comp.unix.questions] <defunct> processes

dg@lakart.UUCP (David Goodenough) (04/29/88)

From article <3951@killer.UUCP>, by wnp@killer.UUCP (Wolf Paul):
] Can anyone enlighten me as to what causes a process to become "immortal"
] in System VR2,  or Microport UNIX System V/AT, to be more specific?
] 
] I have encountered this a number of times, where it would be impossible
] even for root to kill a process; if the parent process of the "immortal"
] process is killed, the child attaches itself to init, PID 1.
] 
] The only way to get rid of such an immortal process seems to be to reboot,
] which is rather drastic.
] 
] What causes a process to refuse to die? I thought signal 9 (kill) could
] not be intercepted or ignored?
] 
] Any comments welcome.
] 
] Wolf Paul

I have noticed a similar phenomenon with BSD4 - I once wrote a program that
did lots and lots of popen("command", "w") calls. I fired it up in the
background, and a minute later did a "ps ag" to see what was happening. My
process was there, but so were about 40 processes marked STAT == Z,
COMMAND == <defunct>. Trying to kill -9 these failed, even though I had gone
superuser. I got lucky in comparison to Mr. Paul - at least these went away
when the parent exited. However, as Mr. Paul asked, is it not the case that
kill -9 will terminate a process - no ifs, ands or buts? As an interesting
aside, I was running
TT == 0 (/dev/tty0), but the controlling TT of these defunct processes was
drifting all over hell's half acre: TT == co, then TT == h1, then TT == dx -
all for the same process!
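
The premise behind the question, that signal 9 cannot be caught or ignored,
can be checked directly. What follows is only a minimal sketch, assuming a
POSIX system with sigaction(2); can_ignore() is a made-up helper name, not
something from the posts above. It asks the kernel to ignore a signal and
reports whether the request was accepted.

```c
#define _XOPEN_SOURCE 700	/* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the kernel lets us ignore `signo`,
 * 0 if it refuses.  The previous disposition is restored on success.
 */
int can_ignore(int signo)
{
	struct sigaction sa, old;

	assert(signo > 0);

	sa.sa_handler = SIG_IGN;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;

	if (sigaction(signo, &sa, &old) < 0)
		return 0;		/* refused; POSIX specifies EINVAL */

	sigaction(signo, &old, NULL);	/* put things back as they were */
	return 1;
}
```

Calling can_ignore(SIGKILL) should report a refusal, while something like
can_ignore(SIGTERM) should succeed. So a process that survives kill -9 is
not ignoring the signal; something else is going on.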
--
	dg@lakart.UUCP - David Goodenough		+---+
							| +-+-+
	....... !harvard!adelie!cfisun!lakart!dg	+-+-+ |
						  	  +---+

guy@gorodish.Sun.COM (Guy Harris) (05/04/88)

> My process was there, but so were about 40 processes marked STAT == Z,
> COMMAND == <defunct>.  Trying to kill -9 these failed,

Just as shooting a corpse won't do anything interesting, either.

"<defunct>" means "defunct."  Those processes have already been killed;
however, a dead process remains around in "zombie" form (hence the "Z") until
the parent process does a "wait()" or variant thereof to pick up and dispose of
the corpse.
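
As a rough illustration of the above, here is a sketch that makes a zombie
on purpose, shoots it, and then disposes of it. It assumes a POSIX system
that provides waitid(2) with WNOWAIT; shoot_the_corpse() is a hypothetical
name chosen for this example.

```c
#define _XOPEN_SOURCE 700	/* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: the child exits with `code`, the parent sends
 * SIGKILL to the resulting zombie, then reaps it.  Returns the exit
 * status the parent finally collects, or -1 on error.
 */
int shoot_the_corpse(int code)
{
	siginfo_t info;
	int status;
	pid_t pid;

	assert(code >= 0 && code <= 255);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);		/* child: die immediately */

	/* Block until the child is dead, but leave the zombie in place. */
	if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
		return -1;

	/*
	 * The child is now a zombie ("Z" in ps).  Whether kill() reports
	 * success here varies between systems; either way nothing changes.
	 */
	(void)kill(pid, SIGKILL);

	/* Only this wait removes the entry from the process table. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The status collected at the end is still the child's own exit code, not
"killed by signal 9": the signal arrived after the process was already dead.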

> I got lucky in comparison to Mr. Paul - at least these went away when the
> parent exited.

When the parent exits, "init" gets custody of the body and disposes of it
rather quickly.

> As an interesting aside, I was running TT == 0 (/dev/tty0), but the
> controlling TT of these defunct processes was drifting all over hell's
> half acre: TT == co, then TT == h1, then TT == dx - all for the same process!

Long-standing 4BSD "ps" bug.  The controlling tty information comes from the U
area of the process, but zombies don't have U areas.  "ps" doesn't understand
the connection between these two facts; it still tries to extract the
controlling tty from its U-area buffer, so it picks up the controlling tty of
the last process whose U area it read.

The correct fix is to change the code at the beginning of "save()" from:

	if (mproc->p_stat != SZOMB && getu() == 0)
		return;
	ttyp = gettty();

to something like:

	if (mproc->p_stat != SZOMB) {
		if (getu() == 0)
			return;
		ttyp = gettty();
	} else
		ttyp = "?";	/* zombies are not attached to terminals */

jpayne@cs.rochester.edu (Jonathan Payne) (05/04/88)

<defunct> processes are processes that have terminated but haven't yet
been wait(2)'d for.  All processes must be waited for by somebody, usually
the parent, before they go completely away.  If you think about it, that
makes sense, because you hardly ever just want a process that finishes to
go away without being able to tell how it exited (via wait(2)).

If the parent dies before waiting for any of its children, it becomes
the responsibility of the parent of the parent to do the waiting, which
is usually init(8).  Init is smart enough to notice when a process it
just waited for is one whose exit means that a user has logged out, so it
can fire up another login, etc.

Anyway, I wonder how many other people have answered this message ...

john@jetson.UUCP (John Owens) (05/06/88)

In article <77@lakart.UUCP>, dg@lakart.UUCP (David Goodenough) writes:
> did lots and lots of popen("command", "w") calls. I fired it up in the background,
 . . . 
> there, but so were about 40 processes marked STAT == Z, COMMAND == <defunct>.
 . . .
> TT == 0 (/dev/tty0), but the controlling TT of these defunct processes was
> drifting all over hell's half acre: TT == co, then TT == h1, then TT == dx -
> all for the same process!

This is not the same situation.  These processes have actually terminated -
whether by calling exit() or by being killed by a signal.  The difference is
that they have not been "wait"ed for by their parent process, and are
considered "zombies" - only a small portion of their original information is
still present.  Specifically, the controlling tty is no longer present, so
ps will show random values there.

Mr. Paul's processes were not zombie processes - they were still "alive",
just hung in a kernel sleep(), as has been so well explained.

-- 
John Owens		SMART HOUSE Development Venture
john@jetson.UUCP	(old uucp) uunet!jetson!john
+1 301 249 6000		(internet) john%jetson.uucp@uunet.uu.net

tar@ksuvax1.cis.ksu.edu (Tim Ramsey) (05/06/88)

In article <9349@sol.ARPA> jpayne@cs.rochester.edu (Jonathan Payne) writes:
>
>If the parent dies before waiting for any of its children, it becomes
>the responsibility of the parent of the parent to do the waiting, which
>is usually init(8).  ...

Wrong.  When a process dies, init(8) inherits its children.  There is no
notion of "grandparent" in UNIX.

Tim Ramsey
--------
Timothy Ramsey (aka Nop)            Dept of Computing & Information Sciences
               "Deadlines amuse me"          Kansas State University
Internet: tar@ksuvax1.cis.ksu.edu            Manhattan, Kansas 66506
BITNET:   tar@KSUVAX1 -or- NOP@KSUVM              (913) 532-6350
UUCP:     {cbosgd,pyramid,ihnp4}!ncr-sd!ncrwic!ksuvax1!tar

john@jetson.UUCP (John Owens) (05/06/88)

> If the parent dies before waiting for any of its children, it becomes
> the responsibility of the parent of the parent to do the waiting, which
> is usually init(8).

Not to nitpick but just to keep correct information flowing....  ;-)

If the parent dies, the process is always "inherited" by init (pid 1),
not the parent of the parent. 
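
The inheritance rule can be observed from inside an orphan. The following
is a sketch only, assuming POSIX fork/pipe/waitpid; orphan_adopter() and its
dead_parent argument are names invented for the example. On the systems
discussed in this thread the pid it reports is 1; some later systems allow
a different process to be designated as the adopter, so the sketch merely
reports whatever getppid() says.

```c
#define _XOPEN_SOURCE 700	/* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: fork a child that forks a grandchild and then dies.
 * Returns the pid the orphaned grandchild sees as its parent afterwards,
 * or -1 on error.  *dead_parent receives the pid of the dead middle process.
 */
pid_t orphan_adopter(pid_t *dead_parent)
{
	int go[2], result[2];
	pid_t child, adopter = -1;
	char c;

	assert(dead_parent != NULL);
	if (pipe(go) < 0)
		return -1;
	if (pipe(result) < 0) {
		close(go[0]);
		close(go[1]);
		return -1;
	}

	child = fork();
	if (child == 0) {
		/* Middle process: create a grandchild, then exit at once. */
		pid_t grandchild = fork();

		if (grandchild == 0) {
			pid_t ppid;

			close(go[1]);
			close(result[0]);
			/* EOF on "go" means our creator is dead and reaped. */
			if (read(go[0], &c, 1) < 0)
				_exit(1);
			ppid = getppid();
			if (write(result[1], &ppid, sizeof ppid) != sizeof ppid)
				_exit(1);
			_exit(0);
		}
		_exit(grandchild < 0 ? 1 : 0);
	}

	/* Original process (the would-be "grandparent"). */
	close(go[0]);
	close(result[1]);
	if (child > 0 && waitpid(child, NULL, 0) == child) {
		*dead_parent = child;
		close(go[1]);		/* release the grandchild */
		if (read(result[0], &adopter, sizeof adopter) != sizeof adopter)
			adopter = -1;
	} else {
		close(go[1]);
	}
	close(result[0]);
	return adopter;
}
```

The grandchild is released only after its creator has been reaped, so the
pid it reports is that of whoever inherited it, not of the dead process.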

-- 
John Owens		SMART HOUSE Development Venture
john@jetson.UUCP	(old uucp) uunet!jetson!john
+1 301 249 6000		(internet) john%jetson.uucp@uunet.uu.net

rml@hpfcdc.HP.COM (Bob Lenk) (05/13/88)

> I have noticed a similar phenomenon with BSD4 - I wrote a program once that
> did lots and lots of popen("command", "w") calls. I fired it up in the background,
> and a minute later did a "ps ag" to see what was happening. My process was
> there, but so were about 40 processes marked STAT == Z, COMMAND == <defunct>.

Lots of folks have correctly explained that the parent must call
wait(2) or an equivalent to clean up zombies.  It's important to note
that the way to do this following popen(3) is with pclose(3).

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml
		rml%hpfcla@hplabs.hp.com

decot@hpisod2.HP.COM (Dave Decot) (05/14/88)

> P.S. In response to all those that replied to my questions Re: <defunct>
> processes, I discover the solution is simple. After every fclose(fp), where
> fp is the FILE * I got from popen, I do a wait(&j), and the zombies go away.

I'm surprised nobody mentioned to you that the routine pclose() (and not
fclose()) is supposed to be called to close stdio streams obtained
from popen().

Among other things, pclose() calls wait() to pick up the status of the
finished process and get rid of the zombie.
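
For concreteness, a minimal sketch of the popen()/pclose() pairing,
assuming a POSIX system with /bin/sh available; feed_command() is a
hypothetical wrapper written for this example, not a library routine.

```c
#define _XOPEN_SOURCE 700	/* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/*
 * Hypothetical wrapper: run `command` through the shell, write `text`
 * to its standard input, and return its exit status (or -1 on error).
 */
int feed_command(const char *command, const char *text)
{
	FILE *fp;
	int status;

	assert(command != NULL && text != NULL);

	fp = popen(command, "w");
	if (fp == NULL)
		return -1;

	fputs(text, fp);

	/*
	 * pclose(), not fclose(): it closes the stream AND waits for the
	 * child, so no <defunct> entry is left behind.
	 */
	status = pclose(fp);
	if (status == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A call such as feed_command("cat > /dev/null", "hello\n") leaves no zombie
afterwards, however many times it is repeated. No separate wait() is needed.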

Pclose() is usually documented on the same page as popen(), so I don't
understand how everybody could have missed this.

Dave Decot
Hewlett-Packard Company
decot%hpda@hplabs.hp.com

irick@ei.ecn.purdue.edu (GarBear Irick) (03/18/91)

OK, OK, here goes a really stupid question...

In a bit of server code I am working on, I decided that I would allow
multiple clients to connect to the server, meaning a fork() each time a
connection is established.  The child does its job, but when it finishes,
it hangs around as a <defunct> process after the exit().  I am using a
socket, and the final code goes somethin' like this:

  write(sock,"Bye!\n",5);
  close(sock);
  exit(0);

I thought that maybe signal(SIGCHLD,SIG_IGN) would fix it, but alas.  I
never thought I would post such a DUMB question to the net...  please drop
me a line at the address below, since I don't have much time to read news
this week!  Thanks...

--
Gary A. Irick,  Purdue University | "You can log out any time you like,
INTERNET: irick@ei.ecn.purdue.edu |  But you can never leave!"
UUCP:     ...!pur-ee!irick        |       (apologies to The Eagles)
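
One common way to deal with the situation described in the question above
is for the server to reap its children from a SIGCHLD handler. The
following is a sketch under stated assumptions, not a drop-in fix: it
assumes POSIX sigaction(2) and waitpid(2) with WNOHANG, and the names
reap_children(), install_reaper() and serve_fake_clients() are invented
here. Whether simply ignoring SIGCHLD prevents zombies has differed
between System V and BSD derived systems, so the sketch waits explicitly.

```c
#define _XOPEN_SOURCE 700	/* ask for POSIX.1-2008 declarations */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t reaped_count;

/* SIGCHLD handler: collect every child that has finished so far. */
static void reap_children(int signo)
{
	int saved_errno = errno;	/* waitpid() may clobber errno */

	(void)signo;
	/* One signal can stand for several dead children, hence the loop. */
	while (waitpid(-1, NULL, WNOHANG) > 0)
		reaped_count++;
	errno = saved_errno;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_reaper(void)
{
	struct sigaction sa;

	sa.sa_handler = reap_children;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
	return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Stand-in for the server loop: fork `n` children that exit right away
 * (where the real child would write, close and exit), then report how
 * many of them the handler collected.
 */
int serve_fake_clients(int n)
{
	struct timespec tick;
	int i, tries;

	assert(n >= 0);
	tick.tv_sec = 0;
	tick.tv_nsec = 10000000L;	/* 10 ms */

	reaped_count = 0;
	if (install_reaper() < 0)
		return -1;

	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid < 0)
			return -1;
		if (pid == 0)
			_exit(0);
	}

	/* Give the handler up to about five seconds to collect everyone. */
	for (tries = 0; reaped_count < n && tries < 500; tries++)
		nanosleep(&tick, NULL);
	return reaped_count;
}
```

In a real server, install_reaper() would be called once before the accept
loop. SA_RESTART is there so that a blocked accept() is normally resumed
rather than failing with EINTR when a child exits.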