barmar@think.com (Barry Margolin) (05/16/91)
From time to time I notice stopped processes with negative priority (perhaps always -5, in case this is significant) on our Sun-4's (running SunOS 4.1.1, but I think I remember seeing this in earlier SunOS releases as well). I generally notice it because any process with a negative priority is included in the load average, and the load sometimes gets pinned at 1.5-2 even though no processes are running hard. If I send the process a SIGCONT it immediately stops again, but this time with a positive priority, so the load goes down to a reasonable level.

I suspect that the problem may be a kernel race condition. The processes in this state always seem to be full screen programs. I suspect what's happening is that the program catches SIGTSTP, resets tty modes, and then sends itself a SIGSTOP to stop itself for real. Maybe the process is being stopped before the kernel has restored the priority.

Is this indicative of a real problem? Are these processes using any more system resources than ordinary stopped processes? If they're counted in the load average, then I assume this means that they're sitting on the active process queue, so are they increasing the scheduler overhead?

--
Barry Margolin, Thinking Machines Corp.
barmar@think.com
{uunet,harvard}!think!barmar
hansm@cs.kun.nl (Hans Mulder) (05/18/91)
In <1991May16.062850.14385@Think.COM> barmar@think.com (Barry Margolin) writes:

>From time to time I notice stopped processes with negative priority
>(perhaps always -5, in case this is significant) on our Sun-4's (running
>SunOS 4.1.1, but I think I remember seeing this in earlier SunOS releases
>as well).

For what it's worth, I just noticed one on a Sun-3, also running SunOS 4.1.1, also at -5.

>I generally notice it because any process with a negative
>priority is included in the load average, and the load sometimes gets
>pinned at 1.5-2 even though no processes are running hard. If I send the
>process a SIGCONT it immediately stops again, but this time with a positive
>priority, so the load goes down to a reasonable level.

Not here: I sent the process a SIGCONT and it switched back to raw mode and then stopped on a SIGTTOU. My shell dislikes raw input and died...

>I suspect that the problem may be a kernel race condition. The processes
>in this state always seem to be full screen programs. I suspect what's
>happening is that the program catches SIGTSTP, resets tty modes, and then
>sends itself a SIGSTOP to stop itself for real. Maybe the process is being
>stopped before the kernel has restored the priority.

Mine was a full screen program (vi, to be specific). Vi sends itself a SIGTSTP, not a SIGSTOP.

>Is this indicative of a real problem? Are these processes using any more
>system resources than ordinary stopped processes? If they're counted in
>the load average, then I assume this means that they're sitting on the
>active process queue, so are they increasing the scheduler overhead?

Not necessarily: processes in a non-interruptible wait are also counted in the load average. As an extreme example, the parent in a vfork() is counted in the load average when the only resource it has is a process table entry.

I guess processes in any form of wait shouldn't be counted in the load average.

--
Have a nice day,
Hans Mulder hansm@cs.kun.nl