[comp.sys.sgi] SCSI timeouts

dwatts@ki.UUCP (Dan Watts) (07/24/90)

Lately I've been getting a few errors like

  sc0,2: Resetting SCSI bus: timeout after 60 sec

These seem to occur randomly and infrequently.  My setup is
a Personal Iris with QIC-24, QIC-150, internal 350MB+ drive
and an external 650MB+ drive.  I don't recall seeing this
error before I put the external drive on.  All the hardware
works fine otherwise.  I was able to get the error to repeat
itself today by trying to tar some files from the external
drive to a 3M DC615A tape in the QIC-150 drive.  Should I
put my external disk at a lower SCSI address than the tape
drives?

% hinv
12 MHZ IP6 Processor
FPU: MIPS R2010A/R3010 VLSI Floating Point Chip Revision: 1.5
CPU: MIPS R2000A/R3000 Processor Chip Revision: 1.6
Data cache size: 8 Kbytes
Instruction cache size: 16 Kbytes
Main memory size: 8 Mbytes
Graphics board: GR1.2 
Integral Ethernet controller
Integral SCSI controller WD33C93
Disk drive: unit 3 on SCSI controller 0
Disk drive: unit 1 on SCSI controller 0
Tape drive: unit 4 on SCSI controller 0: QIC 24
Tape drive: unit 2 on SCSI controller 0: QIC 150
%
-- 
#####################################################################
# CompuServe: >INTERNET:uunet.UU.NET!ki!dwatts    Dan Watts         #
# UUCP      : ...!uunet!ki!dwatts                 Ki Research, Inc. #
############### New Dimensions In Network Connectivity ##############

dhinds@portia.Stanford.EDU (David Hinds) (07/24/90)

In article <822@ki.UUCP> dwatts@ki.UUCP (Dan Watts) writes:
>
>Lately I've been getting a few errors like
>
>  sc0,2: Resetting SCSI bus: timeout after 60 sec
>
>These seem to occur randomly and infrequently.  My setup is
>a Personal Iris with QIC-24, QIC-150, internal 350MB+ drive
>and an external 650MB+ drive.  I don't recall seeing this
>error before I put the external drive on.  All the hardware
>works fine otherwise.  I was able to get the error to repeat
>itself today by trying to tar some files from the external
>drive to a 3M DC615A tape in the QIC-150 drive.  Should I
>put my external disk at a lower SCSI address than the tape
>drives?

    We had SGI install a pair of external 3rd-party 720MB SCSI drives for
our 4D-240 system a few months ago.  They were CDC drives, but I forget the
model numbers.  Anyway, it was a model that SGI did not officially endorse
(they didn't bother telling us before charging us to install them, but
that's another story...).  We noticed the same SCSI timeout messages from
time to time.  We also noticed that data tended to not get written to the
drive from time to time, and long batch jobs reading/writing that drive
tended to disappear without trace.  So, we ended up sending back the drives
and getting one from SGI (costing maybe twice as much).  If I were you,
I would take a good look at whether you are losing any data, and try to
give the drive a good workout.
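
One way to give a drive that kind of workout is to write known data, read it
back, and compare checksums.  What follows is only a minimal sketch in Python:
the function name write_and_verify, the workout_NNN.bin file names, and the
default sizes are all made up for illustration, and the directory you pass in
is assumed to live on the drive under suspicion.

```python
import hashlib
import os


def write_and_verify(directory, n_files=4, size_bytes=1 << 20):
    """Write n_files of random data into directory, read each back,
    and return the names of files whose contents do not match."""
    expected = {}
    for i in range(n_files):
        # Hypothetical file naming, chosen only for this sketch.
        name = os.path.join(directory, "workout_%03d.bin" % i)
        data = os.urandom(size_bytes)
        with open(name, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        expected[name] = hashlib.sha256(data).hexdigest()

    mismatches = []
    for name, digest in expected.items():
        with open(name, "rb") as f:
            if hashlib.sha256(f.read()).hexdigest() != digest:
                mismatches.append(name)
        os.remove(name)  # clean up the test file
    return mismatches
```

An empty list back means every file read back intact.  Keep in mind that a
read shortly after a write may be satisfied from the system's buffer cache
rather than the disk, so using a total size well beyond main memory makes
the check more meaningful.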

 -David Hinds
  dhinds@popserver.stanford.edu

LES@SLACVM.BITNET (Len Sweeney) (07/25/90)

I've been seeing

     CIO: sc0,1: Resetting SCSI bus: timeout after 1 sec

on our 4D/70 since day one.  It occurs at boot, and once per copyright notice
in SYSLOG.  The guy that installed the system acted like it wasn't important,
and I haven't noticed any pain.

     Len Sweeney       LES@SLACVM.BITNET      415-926-2063

olson@anchor.esd.sgi.com (Dave Olson) (07/25/90)

In <822@ki.UUCP> dwatts@ki.UUCP (Dan Watts) writes:
| Lately I've been getting a few errors like
| 
|   sc0,2: Resetting SCSI bus: timeout after 60 sec
| 
| These seem to occur randomly and infrequently.  My setup is
| a Personal Iris with QIC-24, QIC-150, internal 350MB+ drive
| and an external 650MB+ drive.  I don't recall seeing this
| error before I put the external drive on.  All the hardware
| works fine otherwise.  I was able to get the error to repeat
| itself today by trying to tar some files from the external
| drive to a 3M DC615A tape in the QIC-150 drive.  Should I
| put my external disk at a lower SCSI address than the tape
| drives?

Some info about what OS release you are running, and the drive
types (mt -t /dev/.... status for the tape drives, and the
drive name that fx 'dksc(0,#)' gives you for the drives) might help.

The 60 second timeout probably indicates that the tape drive
was not able to reconnect after accepting a command for some
reason.  This could be as simple as the drive taking far longer
for some operation than the qualified drives.  It could also
be due to some problem with one of the disk drives.  Finally,
it could be a cabling or termination problem.
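
To see which device and which timeout value turn up most often, the reset
messages can be tallied from saved log lines.  This is a rough sketch in
Python, assuming the messages look like the ones quoted in this thread and
that "sc0,2" means controller 0, unit 2 (an interpretation, not something
confirmed here).  The names RESET_RE and count_resets are invented for the
example.

```python
import re

# Pattern modeled on the messages quoted in this thread, e.g.
#   sc0,2: Resetting SCSI bus: timeout after 60 sec
# Reading "sc0,2" as controller 0, unit 2 is an assumption.
RESET_RE = re.compile(
    r"sc(?P<ctlr>\d+),(?P<unit>\d+): Resetting SCSI bus: "
    r"timeout after (?P<secs>\d+) sec"
)


def count_resets(lines):
    """Return {(controller, unit, timeout_secs): count} for matching lines."""
    counts = {}
    for line in lines:
        m = RESET_RE.search(line)
        if m:
            key = (int(m.group("ctlr")), int(m.group("unit")),
                   int(m.group("secs")))
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Feeding it the lines of a saved log file gives one count per (controller,
unit, timeout) combination, which makes it easier to tell whether the resets
always name the same device.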

Check to be sure that you haven't configured anything in a 'star'
configuration (you need a straight through connection from the
system to the final drive, with only short stubs coming off of
the bus).  Finally, check to be sure that there are NO terminators
on any device, except possibly the last external device on the
bus, if it doesn't have external termination.

Changing the ID of any of the devices is very unlikely to have
any effect on the situation, unless you have ID'ed 2 of them
the same, which is unlikely from your hinv output.
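
The ID check can be done mechanically on saved hinv output.  The following is
a small sketch in Python that assumes the "Disk drive: unit N on SCSI
controller M" line format shown in the original post; DEVICE_RE and
duplicate_ids are illustrative names.  Since hinv may not list both of two
conflicting devices, an empty result is suggestive rather than proof.

```python
import re

# Matches lines such as
#   Tape drive: unit 2 on SCSI controller 0: QIC 150
# as they appear in the hinv output quoted in this thread.
DEVICE_RE = re.compile(
    r"^(?P<kind>\w+) drive: unit (?P<unit>\d+) "
    r"on SCSI controller (?P<ctlr>\d+)"
)


def duplicate_ids(hinv_lines):
    """Return {(controller, unit): [kinds]} for IDs listed more than once."""
    seen = {}
    for line in hinv_lines:
        m = DEVICE_RE.match(line.strip())
        if m:
            key = (int(m.group("ctlr")), int(m.group("unit")))
            seen.setdefault(key, []).append(m.group("kind"))
    return {key: kinds for key, kinds in seen.items() if len(kinds) > 1}
```

Run against the four drive lines in the hinv output above (units 3, 1, 4
and 2 on controller 0), this returns an empty dictionary.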
--

	Dave Olson

Life would be so much easier if we could just look at the source code.