[comp.bugs.4bsd] DEQNA woes

don@pyr.gatech.EDU (Don Deal) (03/24/89)

A Vaxstation II running 4.3 that we are using as a dedicated name server has
recently begun spurting a rash of:

	qerestart: restart qe0 <n>

console messages, where <n> is the nth restart since boot.  The system stops
answering nameserver queries, pings, etc.; however, initiating network
traffic from the console (via a ping or telnet) causes the DEQNA to be
restarted, and everything works again -- at least for a while.

I seem to vaguely remember this problem surfacing a year or so ago, but I
don't remember any specifics.  Clues anyone?
--
Don Deal 	don@anzio.gatech.edu    gatech!don

dgharriss@watmath.waterloo.edu (Dermot G. Harriss) (03/24/89)

In article <7705@pyr.gatech.EDU> don@pyr.gatech.edu (Don Deal) writes:
>recently begun spurting a rash of:
>	qerestart: restart qe0 <n>
>...
>I seem to vaguely remember this problem surfacing a year or so ago, but I
>don't remember any specifics.  Clues anyone?

That is due to a well-known DEQNA hardware/firmware bug: it locks up
under load.  Check the revision number on the board.  Rev C1 boards
(of which we have all too many) have this problem, and perhaps one
or two later revs do as well.  I believe the current bug-fixed rev is `F'
(sorry, can't be sure without shutting down a machine :-), but DEC can
tell you.  A DEQNA that has had the `operation' to fix the problem and
bring it up to rev `F'(?) has a PAL in the spare position under the i8544,
like so:
                          +-------------------------+
                          |           8544          |
                          +-------------------------+
                          +-------+    .............
                          |  PAL  |    :  new PAL  :
                          +-------+    '''''''''''''
                          +------------+  +---------+
                          | N82S167AN  |  | 74LS08N |
                          +------------+  +---------+
There are a bunch of fly wires coming from the new PAL.
Another solution: buy a DELQA :-)
							-- Dermot

jch@batcomputer.tn.cornell.edu (Jeffrey C Honig) (03/24/89)

In article <7705@pyr.gatech.EDU> don@pyr.gatech.edu (Don Deal) writes:
>A Vaxstation II running 4.3 that we are using as a dedicated name server has
>recently begun spurting a rash of:
>
>	qerestart: restart qe0 <n>
>

Older DEQNAs locked up under load; check with DEC to be sure your DEQNA
is up to rev level.  Sorry, but I don't remember what rev level it
should be up to.

Jeff

jg@crltrx.crl.dec.com (Jim Gettys) (03/27/89)

In article <7615@batcomputer.tn.cornell.edu> jch@tcgould.tn.cornell.edu (Jeffrey C Honig) writes:
>In article <7705@pyr.gatech.EDU> don@pyr.gatech.edu (Don Deal) writes:
>>A Vaxstation II running 4.3 that we are using as a dedicated name server has
>>recently begun spurting a rash of:
>>
>>	qerestart: restart qe0 <n>
>>
>
>Older DEQNA's locked up under load, check with DEC to be sure your DEQNA
>is up to rev level.  Sorry, but I don't remember what rev level it
>should be up to.

Also note that this may be a symptom of broadcast woes, as the problem was
triggered by collisions in particular.  A badly configured network with
a bunch of hosts running old network code can cause real grief.  This
generally happens when some machines believe in one IP broadcast address
and others believe in a different one, often because machines don't know
how to do subnet routing or have misconfigured broadcast addresses.  4.3BSD
is much less eager to be "helpful" and much less likely to forward broadcast
packets to a gateway, but other machines on your net may not be so kind.
			- Jim
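
To make the broadcast-address disagreement concrete, here is a rough
sketch (not taken from any of the systems discussed) of how hosts on one
wire can end up believing in different broadcast addresses.  The address
172.16.1.5 and the netmask 255.255.255.0 are made-up illustrations, and
the function names are hypothetical.  It assumes the two usual
conventions, host part all ones versus the host part all zeros used by
some older code (4.2BSD among them), and compares a host that knows the
subnet mask with one that only knows the classful network.

```python
import ipaddress


def classful_prefix(addr):
    """Prefix length a host with no subnet knowledge would assume."""
    first_octet = int(addr) >> 24
    if first_octet < 128:
        return 8    # class A
    if first_octet < 192:
        return 16   # class B
    if first_octet < 224:
        return 24   # class C
    raise ValueError("not a class A/B/C unicast address")


def broadcast_views(host, netmask):
    """Return the broadcast address each kind of host would believe in."""
    addr = ipaddress.IPv4Address(host)
    # Host that understands subnetting and uses the configured mask.
    subnet = ipaddress.IPv4Interface(f"{host}/{netmask}").network
    # Host that only knows the classful network number.
    classful = ipaddress.IPv4Network(
        f"{host}/{classful_prefix(addr)}", strict=False
    )
    return {
        "subnet_ones": str(subnet.broadcast_address),
        "subnet_zeros": str(subnet.network_address),
        "classful_ones": str(classful.broadcast_address),
        "classful_zeros": str(classful.network_address),
    }


if __name__ == "__main__":
    # Illustrative values only; substitute your own address and mask.
    for kind, bcast in broadcast_views("172.16.1.5", "255.255.255.0").items():
        print(f"{kind:15s} {bcast}")
```

A host that does not recognize one of these four values as a broadcast
may treat the packet as addressed to some other host and try to forward
it, which is the sort of "helpful" behaviour described above.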