[comp.sys.sun] HELP: Fuji 2322 problems

tarsa@decvax.dec.com (06/26/90)
We have a hardware problem and I need to lean on the collective experience
of this group for help.  I am primarily a software weenie and what I know
of hardware I have learned by trying to keep our machines running on our
shoestring budget.  This problem is stretching my abilities.

The configuration in this story consists of a SUN 3/140 and a SUN 2/120
368M disk expansion pedestal containing Fujitsu 2322's.  The cabling
between them consists of twisted pair ribbon cables directly connecting
the drives to the controller.

Last week we had a problem where the disk pedestal was plugged into a
socket that had no ground on the grounding pin.  The system was useless and
examination with a VM showed between 30 and 110V AC flowing between the
two chassis via the data and command cables.  The 30V flowed when the
power switch was 'off', the 110 when it was 'on'.  The voltage leak
appeared to be coming from the AC line filter (I think it is an AC line
filter--it's the metal box that the incoming AC runs through prior to
entering the power supply).

Reconnecting ground solved the AC current flow problem.  Is it normal for
the AC filter to be conducting current to ground regardless of the power
switch setting?

Anyway, the system would not boot.  Somehow that didn't seem like too much
of a surprise given the situation, though I was not aware that simply
losing a ground could be so catastrophic.

However, I need to determine exactly what is broken so that I can get it
fixed. Test #1 consisted of replacing the controller and Multibus-to-VME
controller assy, assuming that the controller board got fried by the
current that would have flowed through it to ground in the CPU box.  No
change.

Test #2 consisted of leaving the new controller assy in place and taking
my only spare 2322 and plugging it into the disk box in place of one of
the questionable drives. Booting the drive from the console appeared to
give me a readable disk, though not usable at the time, since it is loaded
with SUN2 images.  Booting stand/diag over the network from a
hastily-established 'server' showed the drive as readable and properly
formatted (diag could read the labels).  Unfortunately, I used this as a
kind of boolean test and immediately went on to another test rather than
'playing around' with this working disk.

On the strength of Test #2, I assumed that some of the disk
electronics were kaput and so Test #3 consisted of replacing the most
easily accessible of the boards on one of the bad drives with one from the
good drive.  This resulted in a disk that diag showed to work for about
20-30 seconds before I began to get 'lost interrupt' errors and then 'no
return status' errors--the same errors that the broken drives exhibited at
boot time.

In an attempt to back out, I put the 'good' drive board back on the good
drive and re-ran Test #2.  My 'good' disk only works for about 3-5 minutes
before exhibiting the same problems as the bad disks. But since I didn't
spend a lot of time in Test #2 the first time, I am not sure if I
destroyed anything or not.

I then checked the power supply outputs for the disks (something I now
know I should have done prior to Test #1), but the voltages appear to be
fine.

Anyone out there recognize anything from this scenario?  Not being
electronically trained, I have no idea what kind of problems a floating
ground could cause and my "Field Service Engineer" experience has led me
to a blank wall.

Any help would be appreciated.  Send mail to me at tarsa@abyss.dec.com for
now, since our mail gateway is the afflicted machine.

Thanks,
Greg Tarsa

		33 Seabee Street	(no mail to this address today)
		Bedford, NH 03102	tarsa@elijah.mv.com
		(603)668-9226		{decuac,decvax}!elijah!tarsa