[comp.sys.sgi] IRIS 3030 Disk Problems

roode@hydra.cf.uci.edu (Dana Roode) (08/21/89)

I'm no SGI expert (we use Sun, DEC, Apollo equipment), but I'm trying
to help someone whose disk appears to be dying.  After a recent power
failure, he started getting hard disk failures, and lots of them, in
different locations on the disk.  The system hangs a lot, despite our
having swapped out the bad spots using sifex.

If you are a 3030 expert, and could spare a few moments, please read
on:
-----------------------------------------------------------------------

We are going to try reformatting the disk, but sifex doesn't seem to be
documented anywhere.  Is that right?  After we get as much of the disk
backed up as possible, we will experiment with sifex to see if we can get
it to do what we want (reformat, do several verification/exercise passes).

He is running version 3.6 of the OS.  Is that current, old or ancient?

We will need to rebuild the OS after reformatting, but that process
doesn't appear to be documented very well.  We think we can put it
together - we have a backup "mkboot" tape with the root on it.  What
file number holds the cpio image of the /usr OS files on the 3.4/.5up/.6up
tapes?  That's the part of the puzzle we are unsure of.

If the disk has really bitten the dust, we'll replace it.  Is it something
that is easily found third party, or are we better off getting
it from SGI?  It's not a SCSI disk, is it?

Any help will be appreciated,

         dana roode
         roode@orion.cf.uci.edu
         bitnet: DFRoode@UCI

blbates@AERO4.LARC.NASA.GOV ("Brent L. Bates AAD/TAB MS294 x42854") (08/22/89)

   For starters, it may not be the disk.  We recently had to return our
storager board, again, for repairs.  It would not recognize either of our
disks or the tape drive, so we couldn't boot the system.  At one point it
looked as if it was going to boot, but then it said the disk was bad and
I know it couldn't be.  We replaced the storager board and the system booted
up fine and NO disk errors appeared.  I don't know if that is any help, but
I thought you should know.
   There is no documentation available on sifex, unless you have the
maintenance manuals, and only maintenance people have those.  The formatting
options require a password for use, but the passwords are in the maintenance manual.
   We have 3.6 OS and as far as I know that is the most recent version of
the system.  Section 4.5 of the Owner's Guide is on Crash Recovery and it
shows you how to reload the system from tape.  I have used the section
many times (too many times) to recover the system and I think it is fairly
straightforward to do.  Hopefully you won't have any problems.
   If the disk is completely dead, you could buy the same drive from a third
party for at least half the cost that SGI would charge you.  I have heard
from people who have done this and it seems to work.  Of course NO ONE at
SGI would recommend it, but you know how that is.  We have a 380Mb drive on
order, to replace the 170Mb drive that came with our machine.  As soon as
we have that installed and running I will post our results on info-iris.
   Our current drive is a Hitachi 512-17 ESDI drive.  You could probably
order it from a third party and install it yourselves without too much
problem. Also I think Hitachi has a 1 year warranty on the drive as opposed
to the 90 day warranty that SGI has.
   I hope this is of some help.
--

	Brent L. Bates
	NASA-Langley Research Center
	M.S. 294
	Hampton, Virginia  23665-5225
	(804) 864-2854
	E-mail: blbates@aero4.larc.nasa.gov or blbates@aero2.larc.nasa.gov

fsfacca@LERC08.LERC.NASA.GOV (Tony Facca) (08/22/89)

"Brent L. Bates AAD/TAB MS294 x42854" <blbates@aero4.larc.nasa.gov> writes:

>   If the disk is completely dead, you could buy the same drive from a third
>party for at least half the cost that SGI would charge you.  I have heard
>from people who have done this and it seems to work.  Of course NO ONE at
>SGI would recommend it, but you know how that is.  We have a 380Mb drive on
>order, to replace the 170Mb drive that came with our machine.  As soon as
>we have that installed and running I will post our results on info-iris.

I would be interested in hearing how this goes for you guys.  When you
do post your results, could you also include the make and model of the third
party drive, and its GSA price?  Also, are you really replacing the original
disk or just adding a second one?  Thanks.


--
-----------------------------------------------------------------------------
Tony Facca                     |     phone: 216-433-8318
NASA Lewis Research Center     |    
Cleveland, Ohio  44135         |     email: fsfacca@lerc08.lerc.nasa.gov
-----------------------------------------------------------------------------

blbates@AERO4.LARC.NASA.GOV ("Brent L. Bates AAD/TAB MS294 x42854") (08/22/89)

   We currently have two Hitachi 512-17, 170Mb drives.  We plan to replace
our system disk with a Hitachi DK514-38, 380Mb drive.  If this works, we
will probably replace our other disk too.  I have received email from
Jim Diamond indicating that we should be able to replace our data drive
without any problems; however, we weren't sure about the system disk.
This is why we ordered only one drive: if the drive doesn't work as a
system disk, then we will use it as a data disk.  However, if it does
work as a system disk, then we will probably get the second drive.
   Not too long ago I found out that there seems to be a problem with
using two 380Mb drives.  SGI said that the current hardware and 3.6 OS
will support two 380Mb drives.  I talked to someone who actually tried
to do this, and the system kept crashing.  It turned out that there was
a bug in some ROMs.  I don't know whether or when the correction got into any
other machines.  We plan to cross that bridge when we get to it.
--

	Brent L. Bates
	NASA-Langley Research Center
	M.S. 294
	Hampton, Virginia  23665-5225
	(804) 864-2854
	E-mail: blbates@aero4.larc.nasa.gov or blbates@aero2.larc.nasa.gov

markb@denali.sgi.com (Mark Bradley) (08/24/89)

In article <8908221806.AA03436@aero4.larc.nasa.gov>, blbates@AERO4.LARC.NASA.GOV ("Brent L. Bates AAD/TAB MS294 x42854") writes:
> 
>    We currently have two Hitachi 512-17, 170Mb drives.  We plan to replace
> our system disk with a Hitachi DK514-38, 380Mb drive.  If this works, we
> will probably replace our other disk too.  I have received email from
> Jim Diamond indicating that we should be able to replace our data drive
> without any problems; however, we weren't sure about the system disk.
> This is why we ordered only one drive: if the drive doesn't work as a
> system disk, then we will use it as a data disk.  However, if it does
> work as a system disk, then we will probably get the second drive.
>    Not too long ago I found out that there seems to be a problem with
> using two 380Mb drives.  SGI said that the current hardware and 3.6 OS
> will support two 380Mb drives.  I talked to someone who actually tried
> to do this, and the system kept crashing.  It turned out that there was
> a bug in some ROMs.  I don't know whether or when the correction got into any
> other machines.  We plan to cross that bridge when we get to it.

You should save yourself some cash by not trying to use the 514-38 on
your 3030.  It is a 15 MHz xfer rate drive.  The controller in your 3030
(unless you bought something from a vendor other than SGI) will only
support 10 MHz xfer rate disk drives.  The drives will kind-of-sort-of
work a bit, but will eventually scrog your data and offer all kinds of
wonderful 'sector not found's and 'header not found's and read and
write errors of various sorts.  The 514-38 is a very good drive, tho.
Too good for that archaic controller design (although it was the hottest
game in town in its time).  The only 380's that will work reliably on
the 3030 or 3130 are 10 MHz flavors, and you will have to give up some
capacity due to larger gap requirements.  That is, you will have to run
at 32 sectors/track even though most drives can be hard-sectored to as
high as 36 sectors/track.  Good luck.
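
As a rough illustration of the capacity given up at 32 sectors/track, here
is a minimal sketch.  It assumes formatted capacity scales linearly with
sectors per track and that the nominal 380 MB figure corresponds to 36
sectors/track; neither assumption comes from the post, and the function
name is made up for the example.

```python
def capacity_at(sectors_per_track, reference_capacity_mb, reference_sectors_per_track):
    """Scale formatted capacity linearly with the number of sectors per track."""
    return reference_capacity_mb * sectors_per_track / reference_sectors_per_track


# Assumption: the nominal 380 MB applies at 36 sectors/track.
nominal_mb = 380.0
reduced_mb = capacity_at(32, nominal_mb, 36)
lost_fraction = 1 - 32 / 36

print(f"{reduced_mb:.0f} MB usable, {lost_fraction:.1%} given up")
```

Under those assumptions the drive ends up at roughly 338 MB, about 11% less
than the nominal figure.  The real number depends on the drive's geometry
and on how the controller formats it.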

					markb


--
Mark Bradley				"Faster, faster, until the thrill of
IO Subsystems				 speed overcomes the fear of death."
Silicon Graphics Computer Systems
Mountain View, CA			     ---Hunter S. Thompson

mitch@rock.sgi.com (Thomas P. Mitchell) (08/24/89)

In article <8908221403.AA27953@lerc08.lerc.nasa.gov> fsfacca@LERC08.LERC.NASA.GOV (Tony Facca) writes:
>"Brent L. Bates AAD/TAB MS294 x42854" <blbates@aero4.larc.nasa.gov> writes:
>
>>   If the disk is completely dead

Don't forget to check DC power.  Bad DC power can make 
anything look bad.  Look at things before they go bad.

Remember that there is a 'sender' and a 'receiver' in the
system.  In fact there are layers of talking and
listening.  Clearly the CPU must talk to the controller, the
controller to its device, etc.  Then there must be a return
path.  It is often impossible to decide if it is the sender
or the listener that is having a problem.   This makes
'known' good parts very valuable as a diagnostic tool. 
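
The substitution idea can be sketched as a small simulation: swap in one
known-good spare at a time and see which swap makes the system work.
Everything here (the slot names, the part labels, the works() check) is
hypothetical and only models the reasoning; it does not talk to any hardware.

```python
def find_faulty(installed, spares, works):
    """Return the slots where swapping in a known-good spare fixes the system."""
    suspects = []
    for slot in installed:
        if slot not in spares:
            continue  # no known-good part available for this slot
        trial = dict(installed)
        trial[slot] = spares[slot]  # swap exactly one part
        if works(trial):
            suspects.append(slot)
    return suspects


# Hypothetical setup: the controller is the part that is actually bad.
installed = {"dc_power": "psu-A", "controller": "ctl-A", "disk": "disk-A"}
spares = {"controller": "ctl-spare", "disk": "disk-spare"}
bad_parts = {"ctl-A"}


def works(config):
    # The simulated system works only if no installed part is bad.
    return not (set(config.values()) & bad_parts)


print(find_faulty(installed, spares, works))
```

With a single bad part, only the swap that removes it clears the fault.
Note that a slot with no spare (dc_power here) can never be cleared or
blamed this way, which is why it pays to check DC power separately.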



 
Thomas P. Mitchell (ARPA:mitch@csd.sgi.com, UUCP:  {decwrl,ucbvax}!sgi!mitch )
Rainbows -- The best (well second best) reason for windows.