[net.arch] MTBF of Crays.

jackson@uiuccsb.UUCP (04/26/84)

#R:umcp-cs:-675300:uiuccsb:5600010:000:522
uiuccsb!jackson    Apr 26 11:42:00 1984

The first Cray-1 was delivered to Los Alamos Scientific Laboratory in 1976. 
This machine did not have error correction circuitry in its memory -- only
error detection.  It had an MTBF of 4 hours -- this is the figure that you
were probably referring to.  Crays now have SECDED (single-error-correcting,
double-error-detecting) logic, as any well-designed machine should.  I'm not
sure what the MTBF of the Cray-1 is now, but it sure is a LOT more than
4 hours!


Dan Jackson
UUCP:    {pur-ee!ihnp4}!uiucdcs!uiuccsb!jackson
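
For readers unfamiliar with SECDED, here is a minimal sketch of the idea on a
4-bit word, using a Hamming(7,4) code plus one overall parity bit.  The word
size, bit layout, and the function names encode/decode are assumptions made
for illustration only; this is not a description of the Cray's actual memory
logic, which protects a much wider word.

```python
def encode(nibble):
    """Encode 4 data bits (0..15) as an 8-bit SECDED codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the least significant bit
    code = [0] * 8            # index 0: overall parity; indices 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d      # data bits at non-power-of-two slots
    code[1] = code[3] ^ code[5] ^ code[7]       # check bit covering 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]       # check bit covering 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]       # check bit covering 4,5,6,7
    code[0] = sum(code[1:]) % 2                 # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected', or 'double'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                     # XOR of positions holding a 1
    overall = sum(code) % 2                     # 0 when overall parity still holds
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: assume one, located at `syndrome`
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome but even parity: two bits flipped, cannot correct.
        return "double", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

A single flipped bit anywhere in the codeword is repaired silently, while two
flipped bits are reported rather than miscorrected.  A memory with detection
only, as described for the first machine, would have to stop on either case.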

jlg@lanl-a.UUCP (04/27/84)

We have five Crays here at Los Alamos and the MTBF is lots larger than
4 hours.  I don't know what the official figures are, but we have had
some machines run for several weeks without unscheduled down time.  In
fact one machine ran long enough without down time that a slight error
in the clock rate had time to accumulate into a noticeable error in the
time of day.

davies@uiuccsb.UUCP (04/27/84)

#R:umcp-cs:-675300:uiuccsb:5600011:000:410
uiuccsb!davies    Apr 27 09:16:00 1984

As a side note, I have heard that the first Cray-1 (the one without error
correction) is still running somewhere (in England, I believe), and that 
its MTBF is now much more than 4 hours, as the memory chips most likely to
cause errors have been found and replaced over the years.  As a result, it
is not only very reliable but also runs faster than other Cray-1 machines
*because* it has no error correction!

ron@brl-vgr.ARPA (Ron Natalie <ron>) (05/01/84)

I think your rumor is placed on the wrong supercomputer.  The Cyber
Star originally had an MTBF of 5 hours.  I don't know if it ever got
better.

(Still a lot better than our 780 used to be).

-Ron

plb@omsvax.UUCP (Phil Barrett) (05/04/84)

An interesting side note about Crays and error correction.
A while ago I was talking to some CDC engineers who said 
that Seymour Cray (when he worked at CDC) was absolutely
dead set against ECC of any sort. The quote I heard (probably
untrue, but it makes a good story) was "Over my dead body".
Since ECC == good (:-)), the decision was made to use ECC. 
Seymour 'died' and went to heaven and the rest is history.

It's ironic to now hear that Crays are using ECC.

Since this is second (probably N-th with N approaching
aleph-null) hand, take it with a grain of salt.