[comp.sys.sun] SCSI Disk Problems with SS1 and 6 disks.

proicou@afwl.af.mil (Mike Proicou) (01/26/91)
I'm having problems with the SCSI disks on my SPARCstation 1.  Every so
often I get SCSI transport errors that affect one or more of the drives.
The example below involves sd4, but the same thing has happened on sd2,
sd3, and sd5 as well.  I'm running a modified kernel configuration (SunOS
4.1) with 6 disks and 1 Exabyte tape.  I haven't been able to correlate
the errors with either heavy disk I/O or a lack of I/O.
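
One way to look for a pattern would be to tally the errors per disk from
/usr/adm/messages.  What follows is only a minimal sketch in Python.  It
assumes the log lines keep the format of the excerpt quoted further down
and that only the two messages shown there matter.  The names
tally_scsi_errors, ERROR_LINE, REPEAT_LINE and SAMPLE are made up for
illustration and are not part of any Sun tool.

```python
import re
from collections import Counter

# Error lines of interest, e.g. "vmunix: sd4:<TAB>disk not responding to selection".
# Assumption: only these two message texts are counted as errors.
ERROR_LINE = re.compile(
    r"vmunix:\s+(sd\d+):\s+(?:SCSI transport failed|disk not responding)"
)
# syslog's continuation line for repeated messages.
REPEAT_LINE = re.compile(r"last message repeated (\d+) times")


def tally_scsi_errors(lines):
    """Count SCSI error messages per disk, expanding syslog repeat lines."""
    counts = Counter()
    last_device = None  # disk named by the most recent error line, if any
    for line in lines:
        error_match = ERROR_LINE.search(line)
        if error_match:
            last_device = error_match.group(1)
            counts[last_device] += 1
            continue
        repeat_match = REPEAT_LINE.search(line)
        if repeat_match and last_device is not None:
            # "repeated N times" refers to the error line just before it.
            counts[last_device] += int(repeat_match.group(1))
        else:
            # Any unrelated line breaks the link to the previous error.
            last_device = None
    return counts


# The excerpt from this post, used as sample input.
SAMPLE = [
    "npbsun vmunix: sd4:\tSCSI transport failed: reason 'incomplete': retrying command",
    "npbsun last message repeated 14 times",
    "npbsun vmunix: sd4:\tdisk not responding to selection",
    "npbsun vmunix: sd4:\tdisk not responding to selection",
    "npbsun last message repeated 3 times",
    "npbsun vmunix: sd4:\tdisk not responding to selection",
    "npbsun last message repeated 3 times",
]

if __name__ == "__main__":
    for device, count in sorted(tally_scsi_errors(SAMPLE).items()):
        print(device, count)
```

To run it over the real log, the open file can be passed in directly,
e.g. tally_scsi_errors(open("/usr/adm/messages")), since the function
only needs something that yields one line at a time.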

The problem starts with (from /usr/adm/messages):

npbsun vmunix: sd4:	SCSI transport failed: reason 'incomplete': retrying command
npbsun last message repeated 14 times
npbsun vmunix: sd4:	disk not responding to selection
npbsun vmunix: sd4:	disk not responding to selection
npbsun last message repeated 3 times
npbsun vmunix: sd4:	disk not responding to selection
npbsun last message repeated 3 times

and eventually the system will panic like so:

 panic: iinactive
 syncing file systems... panic: iinactive
 00000 low-memory static kernel pages

Here's the device printout after the reboot:

 sd0 at esp0 target 3 lun 0
 sd0: <Quantum ProDrive 105S cyl 974 alt 2 hd 6 sec 35>
 sd1 at esp0 target 1 lun 0
 sd1: <Quantum ProDrive 105S cyl 974 alt 2 hd 6 sec 35>
 sd2 at esp0 target 2 lun 0
 sd2: <arte Imprimis Wren VII 94601-12G cyl 1920 alt 2 hd 15 sec 71>
 sd3 at esp0 target 0 lun 0
 sd3: <CDC Wren IV 94171-344 cyl 1545 alt 2 hd 9 sec 46>
 sd4 at esp0 target 4 lun 0
 sd4: <arte Imprimis Wren VII 94601-12G cyl 1920 alt 2 hd 15 sec 71>
 sd5 at esp0 target 6 lun 0
 sd5: <arte Imprimis Wren VII 94601-12G cyl 1920 alt 2 hd 15 sec 71>
 st1 at esp0 target 5 lun 0

After the reboot everything is fine until the next episode.  So far I've
been forced to reboot about every ten days.  Is there a way to fix this
without rebooting?

Any ideas?

Mike Proicou
proicou@npbsun.afwl.af.mil