[comp.sys.sun] Problem with <exitting> processes

sitongia@hao.ucar.edu (Leonard Sitongia) (11/23/88)

>From:    Loki Jorgenson rm 421 <loki@physicsa.mcgill.ca>
> 
> I am running a 3/180 as fileserver to four clients with an ALM-2
> multiplexor for 16 terminals and pc's.  On occasion, the ports which host
> connections to the local VAX, modems or terminals become 'hung up' by
> processes which ps describes as <exitting>....

I've seen this problem arise in two cases: one when a user sends an XOFF
(control-S) and then severs the connection to the computer, and the other
when a telnet session is similarly severed.  Unfortunately, I don't know
how to kill a process in this beyond-zombie state.  This brings up another
question: what is the "lbolt" WCHAN?

-Leonard E. Sitongia    System Programmer		 (303) 497-1509
USPS Mail: High Altitude Observatory P.O. Box 3000 Boulder CO  80307
Internet:               sitongia@hao.ucar.edu
SPAN:			NSFGW::"hao.ucar.edu!sitongia"	[NSFGW=9580]

souza@ucsd.edu (Steve Souza @eldest) (12/12/88)

Leonard E. Sitongia writes:
> This brings up another question: what is the "lbolt" WCHAN?

From what I understand of the system internals, lbolt is a wait channel
address used by device driver routines that need to delay but are not
expecting a device interrupt.  They call the kernel sleep() routine with
lbolt as a parameter, and the system issues a wakeup call with the address
of this variable every second (every 60 clock ticks on a 60 Hz machine).
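
As a rough illustration of the once-per-second wakeup described above, here
is a small userland sketch in C.  It is not kernel source: the names
lbolt_sim, wakeup_sim, hardclock_sim and struct clock_sim are made up for
this example, and only the idea (count clock ticks, issue a wakeup on the
lbolt address every hz ticks) is taken from the description.

```c
/* Hypothetical stand-in for the kernel's lbolt variable.  Only its
 * address matters, because the address is the wait channel. */
static int lbolt_sim;

struct clock_sim {
    int hz;             /* clock ticks per second, e.g. 60 */
    int ticks;          /* ticks seen since the last lbolt wakeup */
    int lbolt_wakeups;  /* number of wakeups issued on lbolt so far */
};

/* Stand-in for wakeup(chan): this sketch only counts calls on lbolt. */
static void wakeup_sim(struct clock_sim *c, const void *chan)
{
    if (chan == (const void *)&lbolt_sim)
        c->lbolt_wakeups++;
}

/* Called once per simulated clock tick. */
static void hardclock_sim(struct clock_sim *c)
{
    if (++c->ticks >= c->hz) {
        c->ticks = 0;
        wakeup_sim(c, &lbolt_sim);   /* one wakeup per elapsed second */
    }
}

/* Run n ticks at the given rate; return how many lbolt wakeups occurred. */
int lbolt_wakeups_after(int hz, int n)
{
    struct clock_sim c = { hz, 0, 0 };
    for (int i = 0; i < n; i++)
        hardclock_sim(&c);
    return c.lbolt_wakeups;
}
```

Under these assumptions, 180 ticks on a 60 Hz clock produce three wakeups,
and anything sleeping on the lbolt address would be made runnable each time.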

I'd be surprised to see a process in a "zombie" state if it were waiting 
on a reliably deterministic event like lbolt.

Steve Souza			ucsd!telesoft!souza, telesoft!souza@ucsd.edu
TELESOFT Inc., San Diego, CA	(619)457-2700 x277

guy@uunet.uu.net (Guy Harris) (12/20/88)

 >From what I understand of the system internals, lbolt is a wait channel
 >address used by device driver routines that need to delay but are not
 >expecting a device interrupt....

Yes, that's basically it.  However...

 >I'd be surprised to see a process in a "zombie" state if it were waiting 
 >on a reliably deterministic event like lbolt.

...a process may be "waiting on" "lbolt" only in the sense that it's
really waiting for something to happen, but that "something" doesn't
generate a wakeup, so instead it waits on "lbolt" and tests, each time
it's woken up, whether the "something" has really happened, and if not
goes back to sleep again.  I don't remember what the particulars of the
case in question were, but in the 4.xBSD tty driver (and the SunOS 4.0
streams code) if you are trying to do:

	1) a "set"-type "ioctl" on your controlling terminal when it's not
	   in your process group (that is, you're in the background) and
	   you're neither blocking nor ignoring SIGTTOU;

	2) a "read" on your controlling terminal when it's not in your
	   process group and you're not blocking or ignoring SIGTTIN;

	3) a "write" on your controlling terminal when it's not in your
	   process group, TOSTOP is set, and you're not blocking or
	   ignoring SIGTTOU;

the tty driver will send the signal in question to your process group, and
will block you on "lbolt" and then try again.  It will also block on
"lbolt" if it's out of clist blocks when it's trying to write to a
terminal.
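
The three cases above can be summarized as a small decision function.  This
is only a sketch of the rules as listed in this post, not the 4.xBSD or
SunOS driver code; struct proc_sim, enum tty_op, enum tty_sig and
tty_bg_signal are hypothetical names invented for the illustration.

```c
#include <stdbool.h>

enum tty_op  { TTY_READ, TTY_WRITE, TTY_SET_IOCTL };
enum tty_sig { TTYSIG_NONE, TTYSIG_TTIN, TTYSIG_TTOU };

/* Hypothetical per-process state relevant to the checks. */
struct proc_sim {
    bool in_tty_pgrp;   /* process group matches the terminal's */
    bool ttin_masked;   /* SIGTTIN is blocked or ignored */
    bool ttou_masked;   /* SIGTTOU is blocked or ignored */
};

/* Which signal would be sent to the process group before the caller
 * sleeps on lbolt and retries?  TTYSIG_NONE means none of the three
 * listed cases applies. */
enum tty_sig tty_bg_signal(const struct proc_sim *p, enum tty_op op,
                           bool tostop)
{
    if (p->in_tty_pgrp)
        return TTYSIG_NONE;            /* foreground: no job-control stop */

    switch (op) {
    case TTY_SET_IOCTL:                /* case 1 */
        return p->ttou_masked ? TTYSIG_NONE : TTYSIG_TTOU;
    case TTY_READ:                     /* case 2 */
        return p->ttin_masked ? TTYSIG_NONE : TTYSIG_TTIN;
    case TTY_WRITE:                    /* case 3: only when TOSTOP is set */
        return (tostop && !p->ttou_masked) ? TTYSIG_TTOU : TTYSIG_NONE;
    }
    return TTYSIG_NONE;
}
```

For a background process that masks neither signal, a read maps to SIGTTIN,
a "set"-type ioctl maps to SIGTTOU, and a write maps to SIGTTOU only when
TOSTOP is set.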

In addition, the 4.3BSD virtual memory code, and possibly the 4.2BSD code
(which is what SunOS prior to 4.0 uses), will sleep on "lbolt" if the
process can't be swapped out while expanding a page table, in the hope
that when it wakes up there'll be enough space to swap it out.

Some 4.3BSD VAX drivers use it as well.  Sun drivers may also do so.

In many of these cases, the process may be waiting for "something" that
takes a long time to happen, so it just repeatedly sleeps on "lbolt" until
it happens.  If so, the process may be in the middle of a forced close
done as part of an "exit", and the "close" may have to wait for this
"something" to happen, so a process could well be blocked on "lbolt" while
in "zombie" state.
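
The "sleep on lbolt, test, go back to sleep" pattern described in this post
can be sketched as a userland simulation in C.  Nothing here is kernel
code: struct event_sim, something_happened and poll_on_lbolt are
hypothetical names, each loop iteration stands in for one sleep on "lbolt"
(roughly one second), and the max_sleeps cutoff exists only so that the
simulation terminates.

```c
#include <stdbool.h>

/* Hypothetical model of the awaited "something": it becomes true after
 * a fixed number of seconds, or never if the value is negative. */
struct event_sim {
    int seconds_until_ready;
};

/* The condition the sleeper re-tests after every lbolt wakeup. */
static bool something_happened(const struct event_sim *e, int elapsed)
{
    return e->seconds_until_ready >= 0 && elapsed >= e->seconds_until_ready;
}

/* Returns the number of lbolt sleeps taken before the condition held,
 * or -1 if it still had not happened after max_sleeps sleeps. */
int poll_on_lbolt(const struct event_sim *e, int max_sleeps)
{
    int sleeps = 0;

    while (!something_happened(e, sleeps)) {
        if (sleeps == max_sleeps)
            return -1;   /* simulation gives up; a real process would not */
        sleeps++;        /* stands in for sleeping on lbolt for one second */
    }
    return sleeps;
}
```

In this model, an event that never arrives (for example, output that can
never drain) leaves the loop spinning until the cutoff, which is one way to
picture a process that stays blocked on "lbolt" indefinitely during exit.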