861087p@aucs.uucp (A N D R E A S) (01/19/89)
I can't seem to kill a process that is stuck in the <exiting> status. I tried sure kills etc., but nothing seems to work. Any ideas would be helpful. Please e-mail. Thanks in advance.

[[ Try killing the process's parent. <exiting> processes are usually waiting for their parents to "wait" on them. But I have seen <exiting> processes get thoroughly stuck (not on SunOS, but on 4.2 BSD). They wouldn't go away without a reboot. I'll put a detailed explanation of <exiting> processes in the next issue. It might even be right! --wnl ]]

Andreas Pikoulas      USENET: {uunet|watmath|utai|garfield}!dalcs!aucs!861087p
Acadia University     BITNET: 861087p@Acadia
Wolfville, NS         Internet: 861087p%Acadia.BITNET@CUNYVM.CUNY.EDU
CANADA B0P 1X0        (902) 542-5623
phil@Rice.edu (William LeFebvre) (01/19/89)
Okay, let's see if I can get this right. I guess I should warn readers that I learned all this initially while working with 4.1 BSD. But I seriously doubt that something this low-level and fundamental has changed substantially.

A process that "ps" shows as <exiting> is usually in a state that the kernel calls the "zombie" state. When you create a subprocess under Unix, you can wait for its completion by calling the "wait" or "wait3" system calls. At that time, you find out how it exited. If it was forced by a signal, you find out which signal caused it to quit. If it called "exit" you get the exit number back.

But when a child actually exits, the kernel has to store this information somewhere until the parent actually asks for it. So the kernel frees up as many of the resources that the child used as possible (swap and page space, virtual memory maps, real memory, files, etc.), but it hangs on to the "proc" structure (and maybe the "user" structure, but I don't think so) so that it has a place to put this information. A process in this "almost exited" state is a zombie (it's almost dead).

So, if a process creates many children but never calls "wait" and never exits itself, when the children exit they will turn into zombies. You can't kill a zombie process because it's already on its way out. However, once the parent exits, all its children are inherited by the "init" process, pid 1. Init always waits for children, so any zombie processes that it inherits will immediately be "waited" for and will go away.

It is possible that an <exiting> process is really waiting for some sort of final cleanup before becoming a zombie. I saw this happen all too frequently under 4.2 BSD. I can't recall now if they were actually in the zombie state or were still in some other kernel state. But the only way to get rid of them was to reboot. I have never seen that under SunOS.

William LeFebvre
Sun-Spots moderator
Department of Computer Science
Rice University
<phil@Rice.edu>
ted@ames.arc.nasa.gov (Ted Schroeder) (02/01/89)
> Try killing the process's parent....--wnl
Of course, if the parent process was a daemon as so often happens with zombie tasks, you'll have to reboot to get rid of them.
Ted Schroeder ted@Ultra.com
Ultra Network Technologies ...!ames!ultra!ted
101 Daggett Drive
San Jose, CA 95134
408-922-0100
[[ Thinking about it, my suggestion was not very useful. In most cases,
if an exiting process is lingering around for any length of time, it is
very likely because the process got hung up trying to flush an input or
output buffer for one of its open files (such as an open pseudo-terminal).
There's no way to get rid of those except to reboot the machine. --wnl ]]
dongre@ames.arc.nasa.gov (Sumit Dongre) (02/14/89)
ultra!ted@ames.arc.nasa.gov (Ted Schroeder) writes:
> > Try killing the process's parent....--wnl
> There's no way to get rid of those except to reboot the machine. --wnl

Come on computer people....even the movies can kill zombies.....

...ok..ok..I'll find a way...but...I need some help...

question...given a process id of the child...can I locate the process control structure...(probably yes)...then using that pointer and knowing the structure of the control structure....AND NOT HAVING SOURCE CODE...can I change the STATE for the process (probably TW : terminated and swapped ) to whatever...thus letting the kill command clean up nicely....

how about you guys at "sun"...there's just got to be a way...

[[ But will it clean up properly? If it is stuck waiting for an output queue to flush (a typical stuck-in-zombie scenario), simply changing a number in the proc structure so that you can get rid of the process might not take care of all the problems. The terminal line (or pseudo tty) might just remain stuck if you're lucky. If you're not lucky, the system could very well panic further down the road. Perhaps I'm just naive about Unix internals, but I think what you propose is pretty risky. --wnl ]]
chuck@morgan.com (Chuck Ocheret) (03/06/89)
I have been able to remove <exiting> processes in the past when the scenario is that the process is waiting for an output queue to flush. A short program that flushes the device's queue allows the process to rest in peace. For example, for a terminal, an ioctl() call with a TCFLSH command (for termio at least) can flush the offending output queue.

The zombie problem occurs very often when testing new comm systems (maybe I'm a sloppy developer) so I keep an "exorcist" program around for just such occasions. I would like to see a general purpose "exorcist" which, given a pid, can determine what device the process is waiting for and take the appropriate action.

Chuck Ocheret
Morgan Stanley & Co., Inc.
1251 Avenue of the Americas
New York, N.Y. 10020
(212)703-4474
chuck@morgan.com

[[ When one of our terminal lines got stuck with an <exiting> process, I ran a program that called ioctl() with TCFLSH (remembering that this sometimes unstuck these beasts). It had no effect. So it doesn't always work. --wnl ]]
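For readers who want to try the queue-flushing approach, here is a rough C sketch of what such an "exorcist" might look like for a terminal device. It is not Chuck's program. It swaps in the POSIX termios call tcflush() with TCOFLUSH, which is the counterpart of the termio TCFLSH ioctl mentioned above, and the function name flush_tty_output is invented for this example. The caller has to supply the device path, since working out which device a given pid is stuck on is not attempted here. As the moderator's note says, flushing does not always free the process.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: discard pending output on the terminal device
 * at `path`. Returns 0 on success, -1 if the device cannot be opened
 * or is not a terminal. */
int flush_tty_output(const char *path)
{
    /* O_NOCTTY: do not adopt the line as our controlling terminal.
     * O_NONBLOCK: do not hang in open() waiting for carrier. */
    int fd = open(path, O_RDWR | O_NOCTTY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* TCOFLUSH throws away output written but not yet transmitted. */
    int rc = tcflush(fd, TCOFLUSH);
    close(fd);
    return rc;
}
```

A small main() could pass its first argument to flush_tty_output() and report failure. Note that the discarded output is lost, so this is only appropriate for a line that is already wedged.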