[comp.sys.sun] Help : <exiting> cannot be killed

861087p@aucs.uucp (A N D R E A S) (01/19/89)

I can't seem to kill a process that is stuck in the <exiting> state.  I
tried sure kills, etc., but nothing seems to work.  Any ideas would be
helpful.  Please e-mail.

Thanks in advance.

[[ Try killing the process's parent.  <exiting> processes are usually
waiting for their parents to "wait" on them.  But I have seen <exiting>
processes get thoroughly stuck (not on SunOS, but on 4.2 BSD).  They
wouldn't go away without a reboot.  I'll put a detailed explanation of
<exiting> processes in the next issue.  It might even be right!  --wnl ]]

Andreas Pikoulas    USENET:   {uunet|watmath|utai|garfield}!dalcs!aucs!861087p
Acadia University   BITNET:   861087p@Acadia
Wolfville, NS       Internet: 861087p%Acadia.BITNET@CUNYVM.CUNY.EDU
CANADA  B0P 1X0     (902) 542-5623

phil@Rice.edu (William LeFebvre) (01/19/89)

Okay, let's see if I can get this right.  I guess I should warn readers
that I learned all this initially while working with 4.1 BSD.  But I
seriously doubt that something this low-level and fundamental has changed
substantially.

A process that "ps" shows as <exiting> is usually in a state that the
kernel calls the "zombie" state.  When you create a subprocess under Unix,
you can wait for its completion by calling the "wait" or "wait3" system
calls.  At that time, you find out how it exited.  If it was forced by a
signal, you find out which signal caused it to quit.  If it called "exit"
you get the exit number back.  But when a child actually exits, the kernel
has to store this information somewhere until the parent actually asks for
it.  So the kernel frees up as many of the resources that the child used as
possible (swap and page space, virtual memory maps, real memory, files,
etc.), but it hangs on to the "proc" structure (and maybe the "user"
structure, but I don't think so) so that it has a place to put this
information.  A process in this "almost exited" state is a zombie (it's
almost dead).
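
To make the "wait" step concrete, here is a minimal sketch in Python, whose
os module wraps the same system calls.  The function name run_child and the
exit code 3 are made up for illustration; they are not from the original
post.  It shows a parent learning either the exit number or the signal that
ended the child.

```python
import os
import signal

def run_child(action):
    """Fork a child, let it end as `action` dictates, and decode what wait reports."""
    pid = os.fork()
    if pid == 0:
        # Child: either exit with a code or kill itself with a signal.
        if action == "exit":
            os._exit(3)
        os.kill(os.getpid(), signal.SIGKILL)
        os._exit(0)  # not reached
    # Parent: waitpid blocks until the child has ended, then frees its slot.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))
    return ("other", status)

if __name__ == "__main__":
    print(run_child("exit"))
    print(run_child("signal"))
```

Between the moment the child ends and the moment waitpid returns, the
status word lives in the kernel's bookkeeping for the dead child, which is
why that bookkeeping cannot be freed any earlier.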

So, if a process creates many children but never calls "wait" and never
exits itself, when the children exit they will turn into zombies.  You
can't kill a zombie process because it's already on its way out.  However,
once the parent exits, all its children are inherited by the "init"
process, pid 1.  Init always waits for children, so any zombie processes
that it inherits will immediately be "waited" for and will go away.
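
The following is a rough sketch of that behaviour, again in Python and not
taken from the original post.  The name zombie_demo and the exit code 7 are
hypothetical.  The pipe is only a way for the parent to learn that the
child is on its way out without calling wait; the sketch assumes that a
signal sent after that point has no effect on the recorded exit status.

```python
import os
import signal

def zombie_demo():
    """Create a zombie, try to kill it, then reap it with waitpid."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_end)
        os._exit(7)          # child exits at once; its pipe end closes with it
    os.close(write_end)
    os.read(read_end, 1)     # returns b"" (EOF) once the child is exiting
    os.close(read_end)

    # The parent has not waited yet, so the child still holds a process slot.
    # kill() is accepted because the pid exists, but nothing is left to stop.
    os.kill(pid, signal.SIGKILL)

    _, status = os.waitpid(pid, 0)   # reaping is what removes the zombie
    try:
        os.kill(pid, 0)              # signal 0 only checks that the pid exists
        gone = False
    except ProcessLookupError:
        gone = True
    return os.WIFEXITED(status), os.WEXITSTATUS(status), gone

if __name__ == "__main__":
    print(zombie_demo())
```

The status that comes back is the child's own exit code, not the signal
sent afterwards, and the pid disappears only after the wait.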

It is possible that an <exiting> process is really waiting for some sort
of final cleanup before becoming a zombie.  I saw this happen all too
frequently under 4.2 BSD.  I can't recall now if they were actually in the
zombie state or were still in some other kernel state.  But the only way
to get rid of them was to reboot.  I have never seen that under SunOS.

			William LeFebvre
			Sun-Spots moderator
			Department of Computer Science
			Rice University
			<phil@Rice.edu>

ted@ames.arc.nasa.gov (Ted Schroeder) (02/01/89)

> Try killing the process's parent....--wnl

Of course, if the parent process was a daemon, as so often happens with
zombie tasks, you'll have to reboot to get rid of them.

      Ted Schroeder                   ted@Ultra.com
      Ultra Network Technologies      ...!ames!ultra!ted
      101 Daggett Drive           
      San Jose, CA 95134          
      408-922-0100

[[ Thinking about it, my suggestion was not very useful.  In most cases,
if an exiting process is lingering around for any length of time, it is
very likely because the process got hung up trying to flush an input or
output buffer for one of its open files (such as an open pseudo-terminal).
There's no way to get rid of those except to reboot the machine.  --wnl ]]

dongre@ames.arc.nasa.gov (Sumit Dongre) (02/14/89)

ultra!ted@ames.arc.nasa.gov (Ted Schroeder) writes:
> > Try killing the process's parent....--wnl
> There's no way to get rid of those except to reboot the machine.  --wnl

Come on computer people....even the movies can kill zombies.....

...ok..ok..I'll find a way...but...I need some help...  question...given a
process id of the child...can I locate the process control
structure...(probably yes)...then using that pointer and knowing the
structure of the control structure....AND NOT HAVING SOURCE CODE...can I
change the STATE for the process (probably TW : terminated and swapped )
to whatever...thus letting the kill command clean up nicely....

how about you guys at "sun"...there's just got to be a way...

[[ But will it clean up properly?  If it is stuck waiting for an output
queue to flush (a typical stuck-in-zombie scenario), simply changing a
number in the proc structure so that you can get rid of the process might
not take care of all the problems.  The terminal line (or pseudo tty)
might just remain stuck if you're lucky.  If you're not lucky, the system
could very well panic further down the road.  Perhaps I'm just naive about
Unix internals, but I think what you propose is pretty risky.  --wnl ]]

chuck@morgan.com (Chuck Ocheret) (03/06/89)

I have been able to remove <exiting> processes in the past when the
scenario is that the process is waiting for an output queue to flush.  A
short program that flushes the device's queue allows the process to rest
in peace.  For example for a terminal, an ioctl() call with a TCFLSH
command (for termio at least) can flush the offending output queue.  The
zombie problem occurs very often when testing new comm systems (maybe I'm
a sloppy developer) so I keep an "exorcist" program around for just such
occasions.

I would like to see a general purpose "exorcist" which, given a pid, can
determine what device the process is waiting for and take the appropriate
action.
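
A minimal sketch of such a flush, assuming you already know which terminal
device is involved.  This is not the poster's program: it uses Python's
termios.tcflush, the POSIX termios counterpart of the TCFLSH ioctl, and the
function names flush_output_queue and exorcise are invented here.  The demo
runs against a throwaway pseudo-terminal, not a genuinely stuck line, and
whether a flush frees a stuck process depends on the driver.

```python
import os
import pty
import termios

def flush_output_queue(fd):
    """Discard output queued on terminal `fd` that has not been sent yet."""
    termios.tcflush(fd, termios.TCOFLUSH)

def exorcise(tty_path):
    """Open a terminal device by path and flush its output queue."""
    # O_NOCTTY: do not let this device become our controlling terminal.
    # O_NONBLOCK: do not hang in open() waiting for carrier on a serial line.
    fd = os.open(tty_path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        flush_output_queue(fd)
    finally:
        os.close(fd)

if __name__ == "__main__":
    # Demonstrate on a scratch pseudo-terminal.
    master_fd, slave_fd = pty.openpty()
    os.write(slave_fd, b"pending output\n")
    exorcise(os.ttyname(slave_fd))
    os.close(slave_fd)
    os.close(master_fd)
```

Mapping a pid to the device it is blocked on is the hard part of a general
"exorcist", and this sketch does not attempt it.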

Chuck Ocheret
Morgan Stanley & Co., Inc.
1251 Avenue of the Americas
New York, N.Y.  10020
(212)703-4474
chuck@morgan.com

[[ When one of our terminal lines got stuck with an <exiting> process, I
ran a program that called ioctl() with TCFLSH (remembering that this
sometimes unstuck these beasts).  It had no effect.  So it doesn't always
work.  --wnl ]]