[net.arch] MTBF on Crays

eugene@ames-lm.UUCP (05/04/84)

I believe my initial posting did not make it to the net.

We had the ILLIAC IV and a Cray-1S on our site.  We now have a
two-processor Cray-XMP, which will be upgraded to four processors, and
we will also be getting a Cray-1M and a Cray-2.

The ILLIAC IV had an initial MTBF of 15 minutes which improved slowly to
1 week.  Our C-1/1300 had an MTBF measured in months but was taken
down weekly for preventive maintenance.  We heard from one of the LLNL
people who went to Japan that the Japanese shipped their Cray-1 back to
the US after it had its first H/W failure after three months; this was
not reliable enough for them.

Now, the problem.  The XMP has two processors.  Our people have had to
redefine the concept of MTBF to mean the loss of both processors at
once.  If one processor fails and goes down for repairs, the other
processor continues.  This has created problems for the bean counters.
Still, the XMP is a reliable machine.
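
To make the two ways of counting concrete, here is a rough sketch in
Python.  Everything in it is invented for illustration: the function
names, the 1000-hour period, and the down intervals are hypothetical and
are not taken from any real XMP log.  It also simplifies MTBF to observed
hours divided by number of failures, ignoring repair time.

```python
def mtbf(observed_hours, failures):
    """Observed hours divided by failure count (infinite if none)."""
    return float("inf") if failures == 0 else observed_hours / failures


def overlaps(a, b):
    """True if two (start, end) down intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def whole_system_outages(down_a, down_b):
    """Count pairs of down intervals where both CPUs were out together."""
    return sum(1 for a in down_a for b in down_b if overlaps(a, b))


# Hypothetical log, in hours since the start of a 1000-hour period.
OBSERVED_HOURS = 1000
cpu_a_down = [(100, 110), (500, 520)]
cpu_b_down = [(505, 510), (800, 805)]

# Definition 1: every single-processor failure counts.
any_cpu = mtbf(OBSERVED_HOURS, len(cpu_a_down) + len(cpu_b_down))

# Definition 2: only periods with both processors down count.
both_cpus = mtbf(OBSERVED_HOURS, whole_system_outages(cpu_a_down, cpu_b_down))

print(any_cpu)    # 250.0
print(both_cpus)  # 1000.0
```

With these made-up numbers the same log gives an MTBF of 250 hours under
one definition and 1000 hours under the other, which is roughly the kind
of discrepancy that makes the accounting awkward.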

--eugene miya
  NASA Ames Res. Ctr