[net.unix-wizards] RA81's hanging??

sdyer@bbncca.ARPA (Steve Dyer) (10/23/84)

	>The RA81's are useless. They always hang. I can't get
	>them to do anything for love or money. Stick with
	>something else (we usually use Fuji Eagles).

As someone who has a new VAX780 and 3 RA81's in the process of being
installed, this (hopefully idle) comment has me scared to death.
Can any others out there corroborate this, or offer contrary
circumstantial evidence?

We've got the UDA50 on a UNIBUS of its own, by the way.
-- 
/Steve Dyer
{decvax,linus,ima,ihnp4}!bbncca!sdyer
sdyer@bbncca.ARPA

chris@umcp-cs.UUCP (Chris Torek) (10/23/84)

RA81s don't ``always hang''; however, there are lots of problems with them:

	a) nobody seems to have a decent driver for UDA50s;
	b) early UDA50s had microcode bugs that kept taking drives
	   off line;
	c) UDA50s take up too much Unibus +5 power;
	d) they're *slow* (at least for Unix; dunno about VMS);
	e) ??? who knows.

They do have the advantage that you can plug in lots of disk space to
one controller, and they don't need massbi (plural of massbus); i.e.,
they're relatively cheap.  (And 450 Mbytes per disk is kinda nice.)

We have two 750s with only RA81s and one 780 (in EE) with two RA81s and
two RK07s.  The RK07s are as bad as the RA81s for hardware problems
(packs go bad all the time and so forth).

I said before (and now repeat) that the ``df hangs'' bug with RA81s is
due to a bug in udopen(), where a command is started then a sleep is
done, but without blocking interrupts in between.  More serious (to me
at least) was the ``SHOULD REQUEUE OUTSTANDING I/O REQUESTS'' comment
in the UDA hard error code; we were having lots of system crashes due
to that, it seemed, so I wrote the requeue code and it seemed to help.
But we still have software problems with the driver.
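
For anyone who wants to see the shape of that udopen() race, here is a toy model in Python rather than kernel C. It is only a sketch: ToyKernel, open_racy and open_fixed are names made up for illustration, nothing here is taken from the real UDA50 driver, and it assumes the worst case where the controller finishes before the process gets to sleep.

```python
class ToyKernel:
    """Toy single-CPU model: one device, one process, one interrupt line."""

    def __init__(self):
        self.blocked = False   # True while interrupts are masked
        self.pending = False   # device has finished; interrupt not yet delivered
        self.asleep = False    # process is sleeping, waiting for the wakeup

    def start_command(self):
        # Worst case for the race: the device completes immediately.
        self.pending = True
        self._deliver()

    def _deliver(self):
        # Run the interrupt handler if one is pending and not masked.
        if self.pending and not self.blocked:
            self.pending = False
            self.asleep = False   # wakeup(): a no-op if nobody is asleep yet

    def sleep(self):
        # Going to sleep gives up the CPU with interrupts enabled again,
        # so a masked, pending interrupt is delivered at this point.
        self.asleep = True
        saved, self.blocked = self.blocked, False
        self._deliver()
        self.blocked = saved
        return not self.asleep    # True if woken, False if stuck forever


def open_racy(k):
    k.start_command()             # the interrupt may fire right here...
    return k.sleep()              # ...so the wakeup is already gone


def open_fixed(k):
    k.blocked = True              # mask interrupts (raise the priority level)
    k.start_command()
    woke = k.sleep()              # interrupt is delivered only once asleep
    k.blocked = False             # restore the previous level
    return woke


print("racy open woke up: ", open_racy(ToyKernel()))
print("fixed open woke up:", open_fixed(ToyKernel()))
```

In this model the racy open never wakes up, which is the ``df hangs'' symptom, and the version that blocks interrupts across the start and the sleep does.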
-- 
(This mind accidently left blank.)

In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

jon@boulder.UUCP (Jon Corbet) (10/24/84)

>	>The RA81's are useless. They always hang. I can't get
>	>them to do anything for love or money. Stick with
>	>something else (we usually use Fuji Eagles).
>
>As someone who has a new VAX780 and 3 RA81's in the process of being
>installed, this (hopefully idle) comment has me scared to death.
>Can any others out there corroborate this, or offer contrary
>circumstantial evidence?
>
>We've got the UDA50 on a UNIBUS of its own, by the way.

Well, we have 2 RA81's on a UDA50, and we have yet to have a single 
problem.   They are fast and reliable.

I guess I should mention that we run VMS, which should not make a
difference, but who knows?
--
Jonathan Corbet
National Center for Atmospheric Research, Field Observing Facility
{seismo|hplabs}!hao!boulder!jon

eric@milo.UUCP (Eric Bergan) (10/24/84)

	After some initial trouble (about 4 months' worth), we now seem
to have a fairly stable RA81 driver. It does seem to be slower than our
RP07; the only explanation I can give (since the drive specs are similar)
is massbus vs unibus. But we have three 780s which each have between two
and four RA81s, and another 780 with an RP07, RA81, and RA60. The trick
seems to be getting a good driver that works with your flavor of Unix (and
no, I still haven't seen the Sys V rel 2 driver I ordered from DEC).

	Another possible speed problem - does anyone know the definitive
m and n numbers for mkfs for an RA81 and an RA60?

-- 
					eric
					...!seismo!umcp-cs!aplvax!milo!eric

sasaki@harvard.ARPA (Marty Sasaki) (10/24/84)

We have about a dozen RA81's and have had lots of problems.
We have had three complete RA81's, and the UDA-50's that they
were attached to, swapped out. Most of the boards in the drives
have been changed at least once.

We run both UNIX and VMS and the drives don't care which system
you run them on.

Luckily for us, we have a DEC Field Circus contract, and the FE
does a pretty good job.

DEC feels that the problems have to do with heat. Apparently
something goes wrong when the boards get hot, even if they
cool off. We had an air conditioner problem about the time
the RA-81's started failing.
-- 
			Marty Sasaki
			Harvard University Science Center
			sasaki@harvard.{arpa,uucp}

henry@utzoo.UUCP (Henry Spencer) (10/25/84)

> They do have the advantage that you can plug in lots of disk space to
> one controller, and they don't need massbi (plural of massbus); i.e.,
> they're relatively cheap.  (And 450 Mbytes per disk is kinda nice.)

All this is also true of Fujitsu Eagles on Emulex controllers, and the
reliability of *that* hardware combination is, uh, better.
-- 
"You say you want to sell me RA81s?  I'll give you $5 apiece for them."

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

sohail@terak.UUCP (Sohail M. Hussain) (10/26/84)

> 
> 	>The RA81's are useless. They always hang. I can't get
> 	>them to do anything for love or money. Stick with
> 	>something else (we usually use Fuji Eagles).
> 
> As someone who has a new VAX780 and 3 RA81's in the process of being
> installed, this (hopefully idle) comment has me scared to death.

As someone who has been using a 750 with an ra81 for more than a year,
I can tell you that there must be more than one kind of ra81: ones that
never work, and ones that never hang. (I seem to have the latter.)

-- 
Sohail Hussain

uucp:	 ...{decvax,hao,ihnp4,seismo}!noao!terak!sohail
phone:	 602 998 4800
us mail: Terak Corporation, 14151 N 76th street, Scottsdale, AZ 85260

rpw3@redwood.UUCP (Rob Warnock) (10/27/84)

+---------------
| DEC feels that the problems have to do with heat. Apparently
| something goes wrong when the boards get hot, even if they
| cool off. We had an air conditioner problem about the time
| the RA-81's started failing.
| 			Marty Sasaki
+---------------

A general warning note (on ALL electronics, not just RA81's):

Some failure modes are STRONGLY accelerated by operation at elevated
temperatures, and the accelerated failure rate can continue on return
to normal temperatures. This is especially true of the big electrolytic
capacitors in power supplies. They handle a lot of ripple current (A.C.
component), and are protected only by their low A.C. impedance. If you
overheat those babies, they can dry out a little bit. This raises their
impedance, and they start running hot internally from then on (impedance
times current-squared equals power dissipated). This in turn causes them
to die at a much earlier age than spec'd, say like a week to a month
after the original temperature problem was fixed. (This can be checked
by looking at the ripple voltage ( = current x impedance).)
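
To put rough numbers on that, here is a small Python sketch of the two formulas above. The figures (5 A of ripple, 0.02 ohm healthy, 0.06 ohm after drying out) are made up for illustration and are not measurements from any real supply; the sketch also treats the impedance as purely resistive, which is a simplification.

```python
def cap_heating_watts(ripple_current_a, impedance_ohm):
    """Power dissipated inside the capacitor: current squared times impedance."""
    return ripple_current_a ** 2 * impedance_ohm


def ripple_voltage(ripple_current_a, impedance_ohm):
    """Ripple voltage seen across the capacitor: current times impedance."""
    return ripple_current_a * impedance_ohm


RIPPLE_A = 5.0        # hypothetical ripple current, amps
HEALTHY_OHM = 0.02    # hypothetical impedance of a good capacitor
DRIED_OHM = 0.06      # hypothetical impedance after overheating

for label, z in (("healthy", HEALTHY_OHM), ("dried out", DRIED_OHM)):
    watts = cap_heating_watts(RIPPLE_A, z)
    volts = ripple_voltage(RIPPLE_A, z)
    print(f"{label:>9}: {watts:.2f} W internal heating, {volts:.2f} V ripple")
```

With the same ripple current, tripling the impedance triples both the internal heating and the ripple voltage, which is why measuring ripple is a usable check for a damaged capacitor.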

Other similar effects: power transistors (e.g. in disk drive servos)
can get "hot spots" which continue to make the transistor run hot later;
power resistors can change value when overheated (sometimes +/- 30% or more)
causing servo loops to be less accurate or stable (leads to seek errors or
just plain data errors); motors/transformers/speakers can short "one turn"
of a winding (or worse, just get "leaky" between two turns). Generally,
such failures are in high-power sections of the equipment (but not always),
produce no immediately observable effect (negative feedback designs can hide
horrible sins!), and the only way you know there was any damage is that the
equipment dies sooner than (or is flakier than) similar equipment that was
never overheated.

I once worked on an old DEC KA-10 which had such big power supply caps.
At least one of them would blow up (LITERALLY!), spraying goop all over
the inside of the cabinet, almost exactly two weeks after any major air
conditioning failure. (The system was installed in 1970 and is still running,
and is in the South, so there were a few A/C failures in its life... ;-} )

Remember, temperature-induced failure rates roughly double for every
10 degrees C of temperature rise.
Failures that include volatile materials are accelerated worse than that
around the melting/boiling point of the particular substance. The insides
of some of those parts are NORMALLY near boiling, already.
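
As a worked example of that doubling rule, here is a short Python sketch. It assumes the plain rule of thumb only (a factor of two per 10 degrees C); the function name is made up for illustration, and it deliberately ignores the worse-than-doubling behavior near a melting or boiling point.

```python
def acceleration_factor(delta_t_c, doubling_step_c=10.0):
    """Failure-rate multiplier for running delta_t_c degrees C hotter.

    Rule of thumb only: the rate doubles for every doubling_step_c degrees.
    """
    return 2.0 ** (delta_t_c / doubling_step_c)


for rise in (0, 10, 20, 30):
    print(f"{rise:2d} C hotter: about {acceleration_factor(rise):.0f}x the failure rate")
```

So a machine room that runs 30 degrees C over normal during an air conditioning failure is, by this rule, wearing out its parts about eight times as fast for as long as the heat lasts.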

Rob Warnock

UUCP:	{ihnp4,ucbvax!amd}!fortune!redwood!rpw3
DDD:	(415)572-2607	(*new*)
Envoy:	rob.warnock/kingfisher
USPS:	510 Trinidad Ln, Foster City, CA  94404	(*new*)

jnelson@trwrba.UUCP (John T. Nelson) (11/01/84)

We have an 11/780 with an RA80 and RA60 hanging off of a single
UDA50.  We're running the vanilla driver from 4.2, and all in all
I've been reasonably satisfied.... except for the occasional
system hang....

The console spits out an RA hard error, sometimes accompanied by
a DMF silo overflow.  The system doesn't crash, it just sort
of sits there and twiddles its thumbs, ignoring everything except
a ^P.

I wouldn't scrap your hardware... there are a number of drivers
floating around and I'm sure that someone can give you a pointer
to a tested RA combo driver.



						- John