david@ukma.UUCP (David Herron, NPR Lover) (11/23/85)
Has anybody else been having these?

What's happening is, up til a couple of weeks ago we'd been having a crash every other month or so where it would just print

	ERR CPU:02
	ERR CPU:04

on the console (no other messages) and start rebooting. In the last 2 weeks we've crashed at least 3 times to this problem.

We have:

	750, rev 7, WCS with G&H Floating point
	5 megs memory
	SI controller with Eagles, and an SI Tape drive
	1 dz, 2 dmf's, 1 emulex cs11 (dh emulator)
	1 deuna
	1 dup-11

the problem dates prior to the deuna and the dup-11, but I'm not sure how it relates to the rev 7 upgrade, or our use of G&H. (The G&H instructions weren't being executed at the time of the crash)

About 2 weeks ago we *did* have the air-conditioner break down, and due to a breakdown in communications, the system wasn't brought down promptly ... the machine room was hot for a few hours.

Since then we have also had 2 occasions where the machine would get EXTREMELY SLOW. The load average would shoot up to around 20 with a user load that would normally have a load average of 5 or thereabouts.

One time this happened, I fired up 'mon' and 'top' to see what was going on. Top was showing a very high system percentage (~70%) for a long period (10 minutes anyway) and most of the cpu (60%) was being divided between 2 processes. Mon showed that interrupts were at 1000 per second (!), 2000 chars per second character i/o (1000 in, 1000 out), and 70K per sec disk i/o. i.e. something was a little flaky and seeing lots of characters and generating lots of interrupts, etc.

The load average of 20 was from a lot of rnews processes niced at 15 and not getting run time. Renicing them down so they would run cleared up the clog and dropped the load average back to reasonable levels. And eventually the interrupts went away. (nobody was present at the machine to try plugging and unplugging wires, etc)

Any ideas?

-- David Herron, cbosgd!ukma!david, david@UKMA.BITNET.
Experience is something you don't get until just after you need it.