[comp.unix.wizards] BSD + CDC + Emulex

rackow@anl-mcs.arpa (Gene Rackow) (05/18/87)

I hope CDC is listening and I'll get this problem resolved.  I hate spending
the midnight hours formatting the soon to be powder.  I can't afford to
replace them with something better (read reliable).

Short summary first.  Use any drive with a CDC label as a boat anchor only.
I have had no problems with the Emulex controller.

System setup:  Vax 11/780 4.2bsd.  Drives are used for user file space
and 3rd and 4th swapping drives.  Other drives 2/rm05 (root;swap;usr;user),
1/rp06 (tmp;user), 2/rp07(user) 2/9771 (user;swap).  Clean "computer-room"
environment with several other machines and lots of disks.  In the years we
have had the VAX system, the only OTHER disk problems we have had are 1 crash on
an rm05 and 1 failing rp07, each after several years in use.  (Side note/Lots
of Eagles with NO problems)  (Also note other CDC exp. on different systems
here.)

FLAME ON | AFTERBURNER | NOVA | ...    (Lots and Lots of heat)

Time to do some yelling about REAL trash ----

We have been using the CDC 9771 drives for about 4 years.  The longest time
one has gone without a HEAD-CRASH is 16 months.  The average is about 10 months.
The last one was barely 4 months old.  You get very little if any warning
that anything is going wrong.  A soft error occurs about 10-12 hours before
crash, a hard error about 6-8 hours before, and then death.  We have had CDC
in to repair the drives time and time again.  Each time they claim we are the
only site with this kind of trouble, but we are about the only place using
them that they service from this office.

Some interesting notes:
1) For a drive that "has been out and stable" for some time, why is it that
every replacement head disk assembly (HDA) looks different?  A new filter
here, an added bracket there, remove the filter, etc.

2) When the service person arrives and checks the serial number of the HDA,
it is a "common" claim of "Oh! That is within serial number range ??-??,
where we recognize a problem of crashing heads.  This has been fixed
recently and you shouldn't have any more problems."   As they are leaving
I say "Bet I'll see you within the year to replace it." (Quotes not exact
as this is usually happening after many hours are spent rearranging file
systems to get the users all happy again).  They never have denied they
will have to replace it again.  It is a standing (not funny) joke we have.

Turn down AFTERBURNER and remove NOVA

We have gone to using the CDC drives as an on-line backup and for material
that does not change often.  In that way, if we have another failure, we do
not lose any work.  My own feelings are that if you need lots of tmp,
swap, or "I don't care if I lose it" type space, the 9771's are just what
you need.  

We have also had bad luck with the CDC "shoebox" FSD about 500Meg drives.
The 2 I have on another system have been replaced 2 times each in the last
year.  In another building here at the labs they have one that gets
replaced every 4-5 months.  At the time of my initial replacement the
service rep. stated that there was an informal trouble report for drive
HDA's with serial numbers under 20,000 and to be forewarned that crashes
were coming.  Between the first and second set of replacements the number
moved from 20k to 35k to 50k and to its present (I think) resting place
of 70000.

If someone were to GIVE me a CDC drive I would have to think about it for a
long time before deciding whether to use it.  I do not like to spend time
recovering from crashed heads.  They would probably have to throw in some
other incentive before I would go for it.

FLAME OFF

These views are mine and not those of the Lab.  With the problems we have
had many people in my group will agree with my comments.  If anyone wants
further comments please feel free to call or email.

Gene Rackow                              ARPA: rackow@anl-mcs.arpa
Mathematics and Computer Science         Voice: 312-972-7126
Argonne National Lab
Argonne Il.  60439

louie@sayshell.umd.edu (Louis A. Mamakos) (05/18/87)

In article <7424@brl-adm.ARPA> rackow@anl-mcs.arpa (Gene Rackow) writes:
>Short summary first.  Use any drive with a CDC label as a boat anchor only.
>I have had no problems with the Emulex controller.

This is my experience too.

> 					  Each time they claim we are the
>only site with this kind of trouble, but we are about the only place using
>them that they service from this office.

Gee, maybe they should talk to the service folks that "service" the
University of Maryland.

>2) When the service person arrives and checks the serial number of the HDA,
>it is a "common" claim of "Oh! That is within serial number range ??-??,
>where we recognize a problem of crashing heads.  This has been fixed
>recently and you shouldn't have any more problems." 

We've heard this too.  "Oh, you don't have the NEW HDA, that's your 
problem."

We bought a 9771 for our system (VAX 11/750, EMULEX controller).  It died
in about 9 months.  The Computer Science Department bought a pair, and they
didn't have any problems.  That is, until a month or so later.  The first
one died.  New HDA.  Big $$.  Before the service contract could be put in
place the second one did too.  More $$.

Now, we've got a pile of Eagles around here, and they work real fine.  When
we need disks, we buy Eagles.  They work.  CDC has to do something to
convince us that their drives, especially the 9771, are reliable.  They
have a real bad reputation around here.

This of course represents my views only, and not those of the University
of Maryland.  


Louis A. Mamakos  WA3YMH    Internet: louie@TRANTOR.UMD.EDU
University of Maryland, Computer Science Center - Systems Programming

rick@seismo.CSS.GOV (Rick Adams) (05/18/87)

We have had 2 CDC 9771's for about 3 years. We have had no problems of
any kind with them (This is a vax 780 with Emulex controller).

The 9771's monthly maintenance charge is about 1/2 that of an Eagle. That
says something about the perceived reliability of the drive to me.

--rick

chris@mimsy.UUCP (Chris Torek) (05/18/87)

In article <7424@brl-adm.ARPA> rackow@anl-mcs.arpa (Gene Rackow) writes:
>We have been using the CDC 9771 drives for about 4 years.  The longest time
>one has gone without a HEAD-CRASH is 16 months.  The average is about
>10 months. ... We have had CDC in to repair the drives time and time
>again.  Each time they claim we are the only site with this kind of
>trouble ....

You are not the only site.  We have had two for more than one year, and
a third for about 6 months.  The average time to death of the first two
has been 12 months (10 on one, 14 on the other).  The third one has not
been around long enough to die yet.

Restoring 670 megabytes is such fun,
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

ggs@ulysses.homer.nj.att.com (Griff Smith) (05/19/87)

In article <7424@brl-adm.ARPA>, rackow@anl-mcs.arpa writes:
> ...
> Short summary first.  Use any drive with a CDC label as a boat anchor only.
> I have had no problems with the Emulex controller.
> 
> Gene Rackow                              ARPA: rackow@anl-mcs.arpa
> Mathematics and Computer Science         Voice: 312-972-7126
> Argonne National Lab
> Argonne Il.  60439

There was a time when I would have agreed with the above, but our
experience over the last year has been the opposite.  We thought we had
a problem with bad blocks appearing on our 9772 drives, but it turned
out to be a bug in the firmware in the Emulex controller.  A new set of
controller PROMs cured the problem.  A year earlier we went through
many 340 mbyte CDC drives on our CCI POWER 6/32; it was agreed by CDC
and CCI that the drives were lemons, and CCI replaced them with a newer
model under our service contract.  The CDC drives on our systems are
now as reliable as any others I have seen.
-- 
Griff Smith	AT&T (Bell Laboratories), Murray Hill
Phone:		1-201-582-7736
UUCP:		{allegra|ihnp4}!ulysses!ggs
Internet:	ggs@ulysses.uucp

taw@spar.UUCP (05/19/87)

We have had quite a few problems with our CDC9771 drives.  Typically,
the failure mode has been death due to HDA failure after 18.00 months
of operation.  Quite literally, they seemed to die in a predictable
sequence based on when they had first been turned on.  We had a 100%
mortality rate on all the 9771s from one batch.

We now have quite a few 9772s, bought before the 71s began to die.  These
drives seem to work acceptably, but we haven't had them long enough
to be sure that they don't have the same problems.

--Tom

sanjour@cvl.umd.edu (Joe Sanjour) (05/22/87)

In article <1686@umd5.umd.edu>, louie@sayshell.umd.edu (Louis A. Mamakos) writes:
> We bought a 9771 for our system (VAX 11/750, EMULEX controller).  It died
> in about 9 months.  The Computer Science Department bought a pair, and they
> didn't have any problems.  That is, until a month or so later.  The first
> one died.  New HDA.  Big $$.  Before the service contract could be put in
> place the second one did too.  More $$.
> 
> Now, we've got a pile of Eagles around here, and they work real fine.  When
> we need disks, we buy Eagles.  They work.  CDC has to do something to
> convince us that their drives, especially the 9771, are reliable.  They
> have a real bad reputation around here.
> 
> This of course represents my views only, and not those of the University
> of Maryland.  

At least not all of the University of Maryland :-) We probably are the
only ones on campus who like the drives.

We have two 9771s (VAX 11/785, EMULEX controller). We have had them
for more than a year and they have given us absolutely no problems.
Admittedly, we had one that we hadn't installed yet sitting around in
its box when the CSC lost theirs. They borrowed it, only to find it DOA.

Eagles are fine too. We have a few here as well. They have had no problems
for as long as I have been here (> 2 years).

> Louis A. Mamakos  WA3YMH    Internet: louie@TRANTOR.UMD.EDU
> University of Maryland, Computer Science Center - Systems Programming

 ^-^	Joseph Sanjour				ARPA: sanjour@cvl.umd.edu
(- -)	Center for Automation Research		UUCP: seismo!cvl!sanjour
 \_/	University of Maryland			(301) 454-4526
	College Park, MD 20742