[comp.unix.wizards] Stopped processes with negative priority

barmar@think.com (Barry Margolin) (05/16/91)

From time to time I notice stopped processes with negative priority
(perhaps always -5, in case this is significant) on our Sun-4's (running
SunOS 4.1.1, but I think I remember seeing this in earlier SunOS releases
as well).  I generally notice it because any process with a negative
priority is included in the load average, and the load sometimes gets
pinned at 1.5-2 even though no processes are running hard.  If I send the
process a SIGCONT it immediately stops again, but this time with a positive
priority, so the load goes down to a reasonable level.

I suspect that the problem may be a kernel race condition.  The processes
in this state always seem to be full screen programs.  I suspect what's
happening is that the program catches SIGTSTP, resets tty modes, and then
sends itself a SIGSTOP to stop itself for real.  Maybe the process is being
stopped before the kernel has restored the priority.

Is this indicative of a real problem?  Are these processes using any more
system resources than ordinary stopped processes?  If they're counted in
the load average, then I assume this means that they're sitting on the
active process queue, so are they increasing the scheduler overhead?
-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

hansm@cs.kun.nl (Hans Mulder) (05/18/91)

In <1991May16.062850.14385@Think.COM> barmar@think.com (Barry Margolin) writes:

>From time to time I notice stopped processes with negative priority
>(perhaps always -5, in case this is significant) on our Sun-4's (running
>SunOS 4.1.1, but I think I remember seeing this in earlier SunOS releases
>as well).

For what it's worth, I just noticed one on a Sun-3, also running SunOS 4.1.1,
also at -5.

>I generally notice it because any process with a negative
>priority is included in the load average, and the load sometimes gets
>pinned at 1.5-2 even though no processes are running hard.  If I send the
>process a SIGCONT it immediately stops again, but this time with a positive
>priority, so the load goes down to a reasonable level.

Not here: I sent the process a SIGCONT and it switched back to raw mode
and then stopped on a SIGTTOU.  My shell dislikes raw input and died...

>I suspect that the problem may be a kernel race condition.  The processes
>in this state always seem to be full screen programs.  I suspect what's
>happening is that the program catches SIGTSTP, resets tty modes, and then
>sends itself a SIGSTOP to stop itself for real.  Maybe the process is being
>stopped before the kernel has restored the priority.

Mine was a full screen program (vi, to be specific).  Vi sends itself a
SIGTSTP, not a SIGSTOP.

>Is this indicative of a real problem?  Are these processes using any more
>system resources than ordinary stopped processes?  If they're counted in
>the load average, then I assume this means that they're sitting on the
>active process queue, so are they increasing the scheduler overhead?

Not necessarily: processes in a non-interruptible wait are also counted
in the load average.  As an extreme example, the parent in a vfork() is
counted in the load average when the only resource it has is a process
table entry.
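
As a toy model of the counting rule this thread describes (a process is
counted if it is runnable, or if its displayed priority is negative, whatever
its state), something like the following shows why a stopped process at -5
inflates the load. This is an assumption pieced together from the behaviour
reported above, not SunOS kernel source; every name in it is invented.

```c
#include <stddef.h>

enum pstate { P_RUN, P_SLEEP, P_STOP };

struct proc_sample {
    enum pstate state;
    int pri;                /* priority as ps displays it */
};

/* Count the processes that would contribute to the load average under
   the assumed rule: runnable, or negative priority in any state. */
int count_for_load(const struct proc_sample *procs, size_t n)
{
    int nrun = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (procs[i].state == P_RUN || procs[i].pri < 0)
            nrun++;
    }
    return nrun;
}
```

Under this model a stopped process with priority -5 counts exactly like a
process in a non-interruptible wait, even though neither is on a run queue.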

I guess processes in any kind of wait shouldn't be counted in the
load average.

--
Have a nice day,

Hans Mulder	hansm@cs.kun.nl