rtp1@quads.uchicago.edu (raymond thomas pierrehumbert) (03/27/91)
My DN10K running 10.1p is experiencing a peculiar problem. User response is slowing down dramatically, even when there is only one user and no user background processes (e.g. compiles take much longer than usual, logons take forever). When I'm next to the CPU, I hear extensive and repeated disk-seek noises even when nothing in particular is going on that should access the disk. If I reboot the 10K, everything is OK for a little while, but then the problem comes back. Has anybody experienced anything similar? Does this have anything to do with the "runaway init" process I've heard about here previously? If I contact the support line, are they likely to be of any help? (I haven't had the stomach to try them for about a year now.)
krowitz@RICHTER.MIT.EDU (David Krowitz) (03/27/91)
There are two tools you should be using:

1) /usr/apollo/bin/dspst with the -a option
2) /bin/ps -alx

The "runaway init" problem on earlier revs of SR10 for DN10000s was that the "init" process grew without bound. Run "dspst -a" and look at the page purifier processes and the Winchester disk I/O. If the page purifiers and the disk I/O are both active when the machine gets slow, then it's a good bet that the beast has run out of RAM and is page thrashing. Use "ps -alx" to check the real and virtual sizes (RSS and SZ columns) of the processes running on your machine. "init" should be something on the order of 1 or 2 MB (1000 to 2000 1 KB pages). If "init" is huge, the page purifiers are running, and the disk I/O is high -- then you probably have the runaway "init" problem.

-- David Krowitz
krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
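
For anyone who wants to automate the "ps -alx" check described above, here is a rough Python sketch. It is not from the original posts and rests on several assumptions: that the output has a header row naming PID, SZ and RSS columns, that fields up to those columns are whitespace-separated and never blank, that "init" is PID 1, and that SZ is reported in 1 KB pages. The function names, the SAMPLE text and its numbers are all made up for illustration, and the 2000-page limit is simply the upper end of the "1 or 2 MB" figure quoted above.

```python
# Hypothetical "ps -alx" output; the column layout and values are invented
# for illustration and may not match what a real DN10000 prints.
SAMPLE = """\
F UID PID PPID CP PRI NI ADDR SZ RSS WCHAN STAT TT TIME COMMAND
1 0 1 0 0 5 0 0 48500 3100 wait S ? 0:12 init
1 0 57 1 0 1 0 0 640 210 select S ? 0:03 inetd
"""


def find_process_size(ps_output, pid=1):
    """Return (SZ, RSS) for the given PID, or None if it cannot be found.

    Columns are located by name in the header row, so the exact column
    order does not matter as long as PID, SZ and RSS are all present.
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return None
    header = lines[0].split()
    try:
        pid_col = header.index("PID")
        sz_col = header.index("SZ")
        rss_col = header.index("RSS")
    except ValueError:
        return None  # header does not look like the assumed layout
    needed = max(pid_col, sz_col, rss_col)
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) <= needed:
            continue  # short or malformed row
        try:
            if int(fields[pid_col]) == pid:
                return int(fields[sz_col]), int(fields[rss_col])
        except ValueError:
            continue  # non-numeric field where a number was expected
    return None


def init_looks_runaway(ps_output, limit_pages=2000):
    """Return True if init's SZ exceeds limit_pages, None if init is not found.

    Assumes init is PID 1 and that SZ is in 1 KB pages.
    """
    sizes = find_process_size(ps_output, pid=1)
    if sizes is None:
        return None
    sz, _rss = sizes
    return sz > limit_pages


if __name__ == "__main__":
    print(find_process_size(SAMPLE))
    print(init_looks_runaway(SAMPLE))
```

A large SZ for "init" is only a hint. As noted above, it points to the runaway "init" problem only when "dspst -a" also shows the page purifiers and the disk I/O busy at the same time.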