[comp.sys.alliant] ACE failures

eschle@forty2.UUCP (Patrik Eschle) (03/09/90)

Just a warning/hint: 

We ran diagnostics on our CEs last Wednesday because we suspected a
CE failure in the vector instructions: a Fortran program crashed
randomly when optimized with -Ogv but ran fine with -Og.
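
A crash is the lucky case; a silently wrong number is worse. One way
to look for that is to run the same program built with -Og and with
-Ogv and compare the numeric output. What follows is only a minimal
sketch in Python, not something we ran on the FX/80. The function
name, the tolerances and the assumption that each run writes one
number per line are all hypothetical.

```python
import math

def compare_results(scalar_lines, vector_lines, rel_tol=1e-9):
    """Return (line_number, scalar_value, vector_value) for every mismatch.

    scalar_lines / vector_lines: output of the -Og and -Ogv runs,
    assumed here to hold one number per line (an assumption of this sketch).
    """
    if len(scalar_lines) != len(vector_lines):
        raise ValueError("the two runs produced a different number of lines")
    mismatches = []
    for n, (s, v) in enumerate(zip(scalar_lines, vector_lines), start=1):
        a, b = float(s), float(v)
        # Vectorized code may reorder operations, so allow small differences.
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12):
            mismatches.append((n, a, b))
    return mismatches
```

An empty list means the two runs agree within the tolerance. Any
entry is a reason to run the hardware diagnostics.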

The IST diagnostics did indeed find errors in some of the vector
instructions on one CE. There was no indication of the fault other
than the crashing program, and I have no idea how long it has been
there or how many published results were calculated on that CE.

I therefore plan to run the diagnostics regularly from now on, say
every month together with the level 0 backup.

We also found that IST fails on test 108 (exponent/log) for the first
CE from the left in the CE crate, regardless of which CE board it is
and which slot it sits in. Even when we put an otherwise good CE in
slot 2 as the only CE, the test failed. We have no idea whether this
is a real fault or just a bug in the diagnostics. But then, who uses
exp/log in scientific programs ?-)
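
For anyone who does use exp/log, a user-level plausibility check is
to see whether exp(log(x)) returns x. This is only an illustrative
sketch in Python and has nothing to do with how IST test 108 works
internally, which we do not know. The function name, the sample
values and the tolerance are hypothetical.

```python
import math

def roundtrip_failures(samples, exp=math.exp, log=math.log, rel_tol=1e-12):
    """Return (x, exp(log(x))) for every positive sample that fails the round trip.

    exp and log are parameters so that the routines under suspicion
    can be substituted for the defaults.
    """
    failures = []
    for x in samples:
        y = exp(log(x))
        # Relative error check; x must be positive for log(x) to be defined.
        if abs(y - x) > rel_tol * abs(x):
            failures.append((x, y))
    return failures
```

A healthy exp/log pair should give an empty list for moderate values
such as 0.5, 1.0, 2.0, 10.0 and 1e6.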

A last remark: when we removed the CEs, we found a patch wire that
had come loose at one end, and we had to solder it back on. We also
found lots of ICs whose pins had not been trimmed. When the board was
inserted, these pins got bent and touched each other.

Patrik

Our configuration: FX/80 with four ACE, 3 MB IP, 1 VME IP, 
                   Concentrix 5.5.02



-- 
Patrik Eschle, Physics Institute University of  Zuerich (Switzerland)
uucp: uunet!mcsun!cernvax!forty2!eschle, bitnet: k538911@czhrzu1a.bitnet
             -> Send CHUUG mail to chuug@chuug <-