kenw%noah.arc.CDN@ean.ubc.ca (Ken Wallewein) (04/19/88)
I've been having some rather out-of-the-ordinary problems with our uVAXII. The
symptoms are slow disk I/O, page read errors, and system crashes with messages
such as 'ast in kernel mode'. Through extensive troubleshooting, we have
determined that about the only things that solve the problem are to:
a) add additional termination to the backplane, or
b) remove boards - it doesn't seem to matter which.
The configuration is a BA123 with the following bus layout:
Slot  AB      CD
====  ==      ==
  1   CPU
  2   8 MB memory (NS)
  3   8 MB memory (NS)
  4   TS05 *
  5   US Design tape controller -> Exabyte tape drive
  6   DPV11   DPV11
  7   LPV11   DEQNA
  8   Sigma RQD-11 SC disk controller -> CDC XMDII disk drive
  9   DHV11
 10   DHV11
 11   DHV11
 12   TQK50 * (empty)

  * indicates not currently installed
It's a pretty busy bus, but bus loads and power requirements have been checked
and are well within limits. Everything, including the power supply and
backplane, has been swapped.
The whole thing started when we had DEC install a second DPV-11 so we could
run a second X.25 link. Until we tried adding extra termination, the only way
we could get the system to run stably was to remove another board, and as I
said above, it doesn't seem to matter which. We recently replaced the
half-height TTI Exabyte controller with a full-height US Design version, and
found that I had to remove _another_ board. This is getting serious!
The BA123 normally runs at the maximum amount of termination recommended for
a Q-bus. Nevertheless, it looks like I'm going to have to add more. I need the
TK50 and TS05 back!
A couple of other potential solutions that have occurred to me are to:
a) go to a different backplane arrangement like two connected BA23's,
or a third party type, or
b) replace the disk controller with something more common, like a
Dilog one used by Digital Review.
These are long shots, I guess. I have no solid reasons to think that they
would work, and they are not simple to do.
The Sigma disk controller looks like a prime target for suspicion, simply
because it seems to be where the symptoms are centered. Replacing it does
affect the severity of the problem - an upgraded version makes the condition
worse. DEC doesn't have a comparable device to replace it with. They will take
it on contract, but we got it before they decided that, and we haven't
switched.
I've asked about a Q-bus analyzer. They don't seem to exist. Any
troubleshooting seems to have to be done by non-deterministic trial-and-error
methods. I'll give the local DEC people credit; they haven't tried to weasel
out of responsibility because of the non-DEC hardware, even though they told
me many times that they could. However, any further troubleshooting they do
will probably not be covered under any existing contract.
Well, that's my tale of woe. Does anybody have any ideas? Sigma claims to
have no other customers with similar problems. Does anybody know otherwise?
/kenw
                         A L B E R T A
Ken Wallewein           R E S E A R C H
kenw@noah.arc.cdn        C O U N C I L
-------