[comp.os.vms] Problems with Q-bus reflections

kenw%noah.arc.CDN@ean.ubc.ca (Ken Wallewein) (04/19/88)

  I've been having problems with our uVAXII which are rather out of the
ordinary. The symptoms are slow disk I/O, page read errors, and system crashes
with messages such as 'ast in kernel mode'. Through extensive
troubleshooting, we have determined that about the only things that solve the
problem are to:

	a) add additional termination to the backplane, or
	b) remove boards - it doesn't seem to matter which; or,
although I haven't tried these,

 	c) go to a different backplane arrangement like two connected BA23's,
	   or a third party type, or 
	d) replace the disk controller with something more common, like a 
	   Dilog.

  The configuration is a BA123 with the following bus layout:
	Slot	AB		CD
	====	==		==
	1		CPU
	2	8 MB memory (NS)
	3	8 MB memory (NS)
	4		TS05 *
	5	US Design tape controller  ->  Exabyte tape drive
	6	DPV11		DPV11
	7	LPV11		DEQNA
	8	Sigma RQD-11 SC disk controller  ->  CDC XMDII disk drive
	9		DHV11
	10		DHV11
	11		DHV11
	12	TQK50 *

	* indicates not currently installed

  The whole thing started when we had DEC install a second DPV-11 so we could 
run a second X.25 link. Until we tried adding extra termination, the only way 
we could get the system to run stably was to remove another board, and as I 
said above, it doesn't seem to matter which. We recently replaced the half-
height TTI Exabyte controller with a full-height US Design version, and found 
that I had to remove _another_ board. This is getting serious!

  Bus loads and power requirements have been checked. Everything, including the
power supply and backplane, has been swapped. 

  The BA123 normally runs at the maximum amount of termination recommended for
a Q-bus. Nevertheless, it looks like I'm going to have to add more. I need the
TK50 and TS05 back! 

  The Sigma disk controller looks like a prime target for suspicion, simply
because it seems to be where the symptoms are centered. Replacing it does
affect the severity of the problem - an upgraded version makes the condition
worse. DEC doesn't have a comparable device to replace it with. They will take
it on contract, but we got it before they decided that, and we haven't
switched. 

  I've asked about a Q-bus analyzer. They don't seem to exist. Any
troubleshooting seems to have to be done by non-deterministic trial-and-error
methods. I'll give the local DEC people credit; they haven't tried to weasel
out of responsibility because of the non-DEC hardware, even though they told
me many times that they could. However, any further troubleshooting they do
will probably not be covered under any existing contract. 

-------

kenw%noah.arc.CDN@ean.ubc.ca (Ken Wallewein) (04/19/88)

  I've been having problems with our uVAXII which are rather out of the
ordinary. The symptoms are slow disk I/O, page read errors, and system crashes
with messages such as 'ast in kernel mode'. Through extensive
troubleshooting, we have determined that about the only things that solve the
problem are to:
	a) add additional termination to the backplane, or
	b) remove boards - it doesn't seem to matter which.

  The configuration is a BA123 with the following bus layout:

	Slot	AB		CD
	====	==		==
	1		CPU
	2	8 MB memory (NS)
	3	8 MB memory (NS)
	4		TS05 *
	5	US Design tape controller  ->  Exabyte tape drive
	6	DPV11		DPV11
	7	LPV11		DEQNA
	8	Sigma RQD-11 SC disk controller  ->  CDC XMDII disk drive
	9		DHV11
	10		DHV11
	11		DHV11
	12	TQK50 *		(empty)

	* indicates not currently installed

  It's a pretty busy bus, but loads and power requirements have been checked and
are well within limits. Everything, including the power supply and backplane,
has been swapped. 
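
  For anyone who wants to repeat the load check on their own configuration,
here is a minimal sketch of the bookkeeping in Python. Everything numeric in
it is a placeholder: the per-board AC and DC load figures and both limits are
made-up values chosen only so the arithmetic is easy to follow, and the names
Board and bus_load_report are hypothetical. The real figures have to come from
each board's documentation and from the bus specification for the backplane
and termination in use.

```python
from dataclasses import dataclass


@dataclass
class Board:
    name: str
    ac_loads: float  # AC loading the board presents to the bus
    dc_loads: float  # DC loading the board presents to the bus


def bus_load_report(boards, ac_limit, dc_limit):
    """Sum AC and DC loads and compare them with caller-supplied limits."""
    ac_total = sum(b.ac_loads for b in boards)
    dc_total = sum(b.dc_loads for b in boards)
    return {
        "ac_total": ac_total,
        "dc_total": dc_total,
        "ac_margin": ac_limit - ac_total,
        "dc_margin": dc_limit - dc_total,
        "within_limits": ac_total <= ac_limit and dc_total <= dc_limit,
    }


# PLACEHOLDER values, not real specifications: every board is given the
# same invented loading so the totals are easy to verify by hand.
PLACEHOLDER_AC = 2.0
PLACEHOLDER_DC = 1.0

installed_names = [
    "CPU", "memory 1", "memory 2", "US Design tape controller",
    "DPV11 (AB)", "DPV11 (CD)", "LPV11", "DEQNA", "Sigma RQD-11",
    "DHV11 #1", "DHV11 #2", "DHV11 #3",
]
installed = [Board(n, PLACEHOLDER_AC, PLACEHOLDER_DC) for n in installed_names]

# The limits are placeholders too; substitute the documented ones.
report = bus_load_report(installed, ac_limit=30.0, dc_limit=20.0)
print(report)
```

  A check like this only confirms that the totals are inside the published
budget. As the symptoms here show, a bus can pass it and still misbehave.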

  The whole thing started when we had DEC install a second DPV-11 so we could 
run a second X.25 link. Until we tried adding extra termination, the only way 
we could get the system to run stably was to remove another board, and as I 
said above, it doesn't seem to matter which. We recently replaced the half-
height TTI Exabyte controller with a full-height US Design version, and found 
that I had to remove _another_ board. This is getting serious!

  The BA123 normally runs at the maximum amount of termination recommended for
a Q-bus. Nevertheless, it looks like I'm going to have to add more. I need the
TK50 and TS05 back! 

  A couple of other potential solutions that have occurred to me are to:
 	a) go to a different backplane arrangement like two connected BA23's,
	   or a third party type, or 
	b) replace the disk controller with something more common, like a 
	   Dilog one used by Digital Review.
These are long shots, I guess. I have no solid reasons to think that they
would work, and they are not simple to do. 

  The Sigma disk controller looks like a prime target for suspicion, simply
because it seems to be where the symptoms are centered. Replacing it does
affect the severity of the problem - an upgraded version makes the condition
worse. DEC doesn't have a comparable device to replace it with. They will take
it on contract, but we got it before they decided that, and we haven't
switched. 

  I've asked about a Q-bus analyzer. They don't seem to exist. Any
troubleshooting seems to have to be done by non-deterministic trial-and-error
methods. I'll give the local DEC people credit; they haven't tried to weasel
out of responsibility because of the non-DEC hardware, even though they told
me many times that they could. However, any further troubleshooting they do
will probably not be covered under any existing contract. 
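
  One way to make the trial-and-error a little more systematic is to log
every configuration tried and whether it ran stably, then check whether the
number of boards alone separates the good runs from the bad ones or whether
some particular board is implicated. The Python sketch below does that. The
trial data in it is invented purely for illustration (it is not a record of
the tests described in this post), and summarize_trials is a hypothetical
helper name.

```python
def summarize_trials(trials):
    """trials: list of (frozenset of installed boards, stable: bool)."""
    stable_sets = [boards for boards, stable in trials if stable]
    unstable_sets = [boards for boards, stable in trials if not stable]

    stable_counts = [len(b) for b in stable_sets]
    unstable_counts = [len(b) for b in unstable_sets]

    # Boards present in every unstable configuration...
    if unstable_sets:
        in_all_unstable = set.intersection(*map(set, unstable_sets))
    else:
        in_all_unstable = set()
    # ...minus boards that also appear in at least one stable configuration.
    in_any_stable = set().union(*stable_sets)
    suspects = in_all_unstable - in_any_stable

    # True when every stable run had fewer boards than every unstable run.
    count_separates = (
        bool(stable_counts and unstable_counts)
        and max(stable_counts) < min(unstable_counts)
    )
    return {
        "max_stable_count": max(stable_counts, default=None),
        "min_unstable_count": min(unstable_counts, default=None),
        "count_separates": count_separates,
        "suspects": suspects,
    }


# INVENTED example data: the full set fails, and removing any one board
# (three different ones are tried) gives a stable system.
full = frozenset({"DPV11 (AB)", "DPV11 (CD)", "LPV11", "DEQNA",
                  "DHV11 #1", "DHV11 #2", "DHV11 #3"})
trials = [
    (full, False),
    (full - {"LPV11"}, True),
    (full - {"DHV11 #3"}, True),
    (full - {"DEQNA"}, True),
]

result = summarize_trials(trials)
print(result)
```

  With data shaped like this, an empty suspects set together with
count_separates being true would match the "it doesn't seem to matter which"
observation: the board count, not any one board, is what tracks the failures.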

  Well, that's my tale of woe. Does anybody have any ideas? Sigma claims to 
have no other customers with similar problems. Does anybody know otherwise?

 /kenw
                                                                 A L B E R T A
Ken Wallewein                                                  R E S E A R C H
kenw@noah.arc.cdn                                                C O U N C I L


-------