[unix-pc.general] yet another UNIXpc HD tale

wjc@ho5cad.ATT.COM (Bill Carpenter) (05/03/90)

Okay, okay.  I should have paid closer attention to all those billions
of "help me with my hard drive" messages, but I knew it would never
happen to me.  No, wait, that's not right.  In fact, it has happened
to me and those nearby often enough that we think it's routine, but
we're so woefully hardware-ignorant that we just choose between
sending them home under warranty or tossing them out and buying new
ones.  (No, we don't really toss them out.  We use them for curling
practice.)

This time, though, I've got a broken hard drive on my UNIXpc that is
so tantalizingly close to "ok" that I can hardly stand it.  Although
the drive is under warranty (no sweat), I'd sure like to be able to
get my data off it before it goes cross-country.  (What do you mean
"backups"?  What kind of wimp do you think I am? :-)

THE EQUIPMENT:

  Seagate ST-251
  UNIXpc has original WD 1010 controller chip

THE SYMPTOMS:

Failed to boot after a normal powerdown (the famous dancing little
boxes).  I can boot from the floppy with no trouble.  I can also boot
the machine from another swapped-in hard drive.

From the diagnostics disk, I always get "can't recal, response = 40"
no matter what I try.  Other disk tests like random seek,
nondestructive surface test, etc, all go okay (except that at the
beginning they seem to do a recal and report an error for that, then
after I hit return, they do the rest of the test with no complaint).

If I do a boot from "floppy boot" and do the trick to get a shell
prompt, I have no trouble mounting and unmounting the hard drive on
/mnt.  In fact, when it is mounted, I can read and write whatever I
please from it.  Yes, friends, I have the full power of UNIX at my
disposal :-).

THE BEST THING YOU COULD TELL ME:

Something like "Yeah, we see that all the time on the 251's.  All you
have to do is [your solution here] and it will spin like a champ for a
couple more months."

THE NEXT BEST THING YOU COULD TELL ME:

How to get my data off the disk after I've done a floppy boot.  As far
as I can figure out, I can't use the floppy for anything else after
I've booted off it ("filesystem busy").  If there is a way, I'd sure
like to hear about it.

I also have another UNIXpc within a few feet of the broken one, so if
there is a way to solve this puzzle using that, it's there.  I
couldn't do anything with the serial ports, could I?  (Yes, I know how
long it would take.)

As you can probably guess, neither of these machines has a tape drive
or a second hard drive upgrade.  (The latter looks pretty attractive,
but this is probably not an ideal time to try building it.)

THINGS I'VE ALREADY TRIED THAT MADE NOT MUCH DIFFERENCE:

1.  Tightening the screws that hold the hard drive together.
2.  Power cycling a bunch of times.
3.  Running the "ldrcopy" program from the floppy boot disk.
4.  Connecting and disconnecting power and data cables to make sure
    they're okay.
5.  Reseating a bunch of the big chips on the motherboard.
6.  Waiting a while.
7.  Looking at things really, really hard.

YOUR ADDITIONAL QUESTIONS:

Will be speedily answered, and your advice will be gratefully received.
--
   Bill Carpenter         att!hos1cad!wjc  or  attmail!bill
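
On the "how long it would take" aside about the serial ports, a rough
estimate can be sketched as below.  This is only back-of-the-envelope
arithmetic under stated assumptions, none of which come from the post: a
completely full 40 MB drive, 8 data bits plus one start and one stop bit
per byte, and no protocol overhead or retransmissions.  The function name
and the baud rates are illustrative.

```python
def serial_transfer_seconds(num_bytes, baud=9600, bits_per_byte=10):
    """Seconds needed to move num_bytes over an async serial line.

    bits_per_byte=10 assumes 8 data bits plus one start and one stop bit.
    Protocol overhead and retransmissions are ignored (assumption).
    """
    bytes_per_second = baud / bits_per_byte
    return num_bytes / bytes_per_second


# Assumption: the whole of a 40 MB drive has to cross the wire.
disk_bytes = 40 * 1024 * 1024

for baud in (1200, 9600, 19200):
    hours = serial_transfer_seconds(disk_bytes, baud) / 3600
    print(f"{baud:>6} baud: about {hours:.1f} hours")
```

Under those assumptions a full drive takes about 12 hours at 9600 baud,
so copying only the files that matter shortens the job considerably.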

thad@cup.portal.com (Thad P Floryan) (05/04/90)

Re: yet another UNIXpc HD tale
comp.sys.att,unix-pc.general


wjc@ho5cad.ATT.COM (Bill Carpenter) in <WJC.90May3175246@hoswjc.ho5cad.ATT.COM>
writes about his problems with a Seagate ST251 HD.  Sigh.  :-(

I had ten (10) of those suckers go belly-up on me, all after about 14 months
(two months after warranty expiration).  And I was in contact with the QA mgr
of Seagate in Scotts Valley CA and design engineers also at that facility to
rectify the problem.  Let's just say that I'll *NEVER*, *EVER* buy anything
mfd or sold by Seagate again; and I was deeply saddened to hear of Seagate's
acquisition of Imprimis (CDC's disk drive facility).  At present I stick with
quality drives such as Maxtor, Quantum, and Conner.

As I've posted to numerous newsgroups over the past several years, the problem
is a manufacturing defect; specifically "stiction" caused by excess lubricant
on the platters causing the heads to be "stuck" when they're parked.

Not to bore everyone again (though I have probably close to 1Mbyte archives
on disk problems of this nature), the solution is to either junk the drives
or to have them replattered; nothing else will FIX a drive with the problem.

There IS a temporary solution which will permit you to spin-up the drive and
retrieve your files.  Believe me, I was sweating icicles before I stumbled on
this solution several years ago.  *ALL* hard drives WILL FAIL; the only unknown
is "when?"  Thus, I have NO sympathy (anymore :-) for anyone who doesn't have
and uphold a HD backup regimen.

With all that said and done, here's how you can get the drive to spin up to
where you can retrieve your files:

	Remove the ST251 drive from your system.  Turn it upside down so that
	you see the printed circuit assembly.  Notice there's a center spindle
	which is the main shaft, and off to the corner is another spindle
	which is the stepper motor shaft.  Put the tip of your index finger
	on the stepper motor shaft and twist it back and forth a few times;
	you don't have to twist too far, and BE GENTLE ... your data IS still
	on the drive!

	Remount the drive into your system, power up and boot, then get your
	files off ASAP.

What you do with the drive after this point is up to you.  There are many
companies which will replatter your drive.  The specific problem (as seen
under microscope) with the ST251 is excess lubricant collecting in the PARK
area ('cause the heads during normal operation push the excess out beyond
cylinder 0 and INTO the center area (the PARK zone) much like windshield wipers
on a car's window) and a meniscus forming which effectively "glues" the heads
into the PARK area.

As I wrote to Usenet (and later found plagiarized in several technical mags),
you can demonstrate the phenomenon using two glass plates and a few drops of
water: place the dry plates together and note you can slide them around 
easily; now put a few drops of water between the plates and just TRY to move
or separate them.

Most of the ST251s that fail are assembled in Singapore, based on hundreds of
e-mail replies I've received; assembly line workers are over-lubing the
platters and the (sloppy or non-existent) QA doesn't detect the problem since
it's not until the drives are in service for a while that a noticeable excess
accumulates in the PARK area.  The lubricant is "normally" one or two
mono-molecular layer(s) thick; the defective drives exhibit signs of more than
5 layers of lubricant.  The lube CANNOT be removed inexpensively hence the
need to replatter if you still wish to keep the drive.

Thad

Thad Floryan [ thad@cup.portal.com (OR) ..!sun!portal!cup.portal.com!thad ]

mmengel@cuuxb.ATT.COM (~XT6561110~Marc Mengel~C25~M27~6184~) (05/04/90)

In article <WJC.90May3175246@hoswjc.ho5cad.ATT.COM> wjc@hos1cad.att.com (Bill Carpenter) writes:
>If I do a boot from "floppy boot" and do the trick to get a shell
>prompt, I have no trouble mounting and unmounting the hard drive on
>/mnt.  In fact, when it is mounted, I can read and write whatever I
>please from it.  Yes, friends, I have the full power of UNIX at my
>disposal :-).

	In that case, you might try the "Hard Disk Boot" floppy; this
	boots & loads a kernel from the floppy disk, but uses the hard
	disk as the root filesystem; this way your floppy drive is available
	for backups, etc.
-- 
 Marc Mengel					mmengel@cuuxb.att.com
 						attmail!mmengel
 						...!{lll-crg|att}!cuuxb!mmengel
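
Once the floppy drive is free for backups, as Marc suggests, the next
practical question is how many diskettes the job needs.  A minimal sketch
of that arithmetic follows.  The 360 KB per-diskette capacity is an
assumption, not a figure from the thread, so substitute the real formatted
capacity; archive-format overhead is ignored as well.

```python
import math


def floppies_needed(data_bytes, floppy_bytes=360 * 1024):
    """Number of diskettes needed to hold data_bytes.

    floppy_bytes defaults to an assumed 360 KB per diskette; pass the
    actual formatted capacity.  Archive overhead is not counted.
    """
    return math.ceil(data_bytes / floppy_bytes)


# Illustrative data sizes in megabytes (hypothetical, not from the thread).
for megabytes in (1, 5, 20):
    count = floppies_needed(megabytes * 1024 * 1024)
    print(f"{megabytes:>2} MB of files: {count} diskettes")
```

The count climbs quickly with the amount of data, which is a good argument
for saving only the irreplaceable files first.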