[comp.unix.wizards] qe: Non existant memory interrupt

greg@duke.cs.unlv.edu (Greg Wohletz) (03/27/90)

In article <10333@cbmvax.commodore.com>, grr@cbmvax.commodore.com
(George Robbins) writes:
|> In article <1642@jimi.cs.unlv.edu> greg@unlv.edu (Greg Wohletz) writes:
|> > We have several microvax II's that we are using as fileservers.  They are
|> > running ultrix 3.1.  Periodically (about once every 24 hours) they crash
|> > with ``qe: Non existant memory interrupt''.  A peek at if_qe.c reveals the
|> > following comment...
|> > 
|> > So it would appear that this is an error condition from the controller
|> > itself.  Has anyone seen this before?  Is there a fix?  What is a
|> > non-existent memory interrupt?
|> 
|> Well, the first comment is certainly bogus, since (illegally) long packets
|> on your ethernet will cause a panic due to "chained packets".  I wouldn't
|> be too surprised if there is some network disease that could cause the
|> second.
|> 
|> What is the history of this problem?  Is it new with 3.1 or are the machines
|> new or is there some new system/software elsewhere on your network that has
|> triggered these panics?

We've had the microvaxes for several years.  About 6 months ago we
converted three of them into fileservers.  Until then we were running
2.0 on them, but we discovered severe NFS bugs in 2.0 that caused
frequent crashes, as well as problems with our old DEQNA boards, so
we upgraded to 3.1 and installed new DELQA boards.  This made things
a lot better, but we still get the ``non-existent memory interrupt''
panics daily...   I've found the following in the DELQA
documentation:

    There are three interrupt conditions:

    	o   Receive Interrupt Request, when a complete packet has
    	    been received.

    	o   Transmit Interrupt Request, when a transmission is
    	    completed.

    	o   Nonexistent Memory, when a Q-bus or memory access error
    	    occurs.

This seems to match well with the interrupt code, which looks like:

        if( csr & QE_RCV_INT )
                qerint( unit );
        if( csr & QE_XMIT_INT )
                qetint( unit );
        if( csr & QE_NEX_MEM_INT )
                panic("qe: Non existant memory interrupt");
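
To make that dispatch concrete, here is a minimal user-space sketch of
the same three tests, written so it compiles outside the kernel.  The
bit values, the qe_decode() name and the QE_ACT_* flags are assumptions
made up for illustration (the real definitions belong to the driver's
own header); the only point is that the three conditions are
independent bits in one CSR word, so a single interrupt can report
more than one of them.

```c
#include <stdint.h>

/* Placeholder bit positions for illustration only; take the real
 * values from the driver's register header, not from here. */
#define QE_RCV_INT      0x8000u
#define QE_XMIT_INT     0x0080u
#define QE_NEX_MEM_INT  0x0004u

/* Hypothetical flags describing what the handler would do. */
#define QE_ACT_RECV   0x1u   /* would call qerint(unit) */
#define QE_ACT_XMIT   0x2u   /* would call qetint(unit) */
#define QE_ACT_PANIC  0x4u   /* would call panic()      */

/* Decode a CSR snapshot into the set of actions the handler takes.
 * Each bit is tested on its own, mirroring the three separate ifs. */
unsigned qe_decode(uint16_t csr)
{
    unsigned actions = 0;

    if (csr & QE_RCV_INT)
        actions |= QE_ACT_RECV;
    if (csr & QE_XMIT_INT)
        actions |= QE_ACT_XMIT;
    if (csr & QE_NEX_MEM_INT)
        actions |= QE_ACT_PANIC;
    return actions;
}
```

Feeding it a CSR with both the receive and nonexistent-memory bits set
yields both actions, which matches the kernel code above: the receive
routine runs first and the panic follows.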

So the question is: what causes this interrupt?  Elsewhere in the
documentation it says:

    Nonexistent-Memory timeout: this is set if the DELQA times out
    while trying to access host memory.
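
As a way of picturing that description, here is a toy model in C.  It
is NOT how the DELQA or the Q-bus is implemented; struct toy_bus,
toy_dma_access() and NXM_FLAG are hypothetical names, and "no memory
answers" is reduced to a simple range check.  It only illustrates the
idea that a device-initiated access to an address with nothing behind
it ends with a status bit being set rather than with data.

```c
#include <stdint.h>

#define NXM_FLAG 0x0004u   /* placeholder status bit, illustration only */

/* Toy bus: memory answers for addresses 0 .. mem_size-1. */
struct toy_bus {
    uint32_t mem_size;     /* bytes of memory that actually respond */
    uint16_t csr;          /* device status word                    */
};

/* Model a device access of len bytes starting at addr.
 * Returns 0 if every byte lies in responding memory; otherwise sets
 * NXM_FLAG in the status word and returns -1 (the "timeout"). */
int toy_dma_access(struct toy_bus *bus, uint32_t addr, uint32_t len)
{
    /* Written this way so addr + len cannot overflow. */
    if (len > bus->mem_size || addr > bus->mem_size - len) {
        bus->csr |= NXM_FLAG;
        return -1;
    }
    return 0;
}
```

In this picture a buffer address that runs even one byte past the end
of responding memory is enough to raise the flag, which the driver
would then see on its next interrupt.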

So, I've come up with one possible theory: could the interrupt priority
of the DELQA be higher than the processor level set by the kernel when
manipulating the memory management registers?


|> Which board is actually involved?  If all else fails and they're DEQNA's you
|> might try upgrading to a newer board - see the VMS related DEQNA discussion
|> recently in comp.sys.dec.  A while back I had a DEQNA problem that turned out
|> to be a problem with jumpers on the *memory* card, but that was in a PDP11
|> Q-bus environment...

As I said above, the card is a DELQA less than 6 months old.  One other
possibility is the following piece of info from the manual:


The mode switch defines two possible modes of operation for the DELQA.
The preferred  mode is the  ``Normal mode'' which  indicates  that the
DELQA is operating  as a DELQA.  All  current DIGITAL software for the
DEQNA may  be used with  confidence for  the  DELQA  when the DELQA is
switched  to operate in Normal mode.   ``DEQNA-lock mode'' should only
be required for use with some non-DIGITAL  software drivers to achieve
compatibility with DEQNA programming features.


We currently have the boards set up the way they were shipped (normal
mode).  Perhaps I'll try putting them into DEQNA-lock mode and see if
this clears up the problem (What?  You thought Ultrix was ``current
DIGITAL software''?  Shame on you!)

Anyway, if anyone has any further insight I'd sure appreciate it;
otherwise I'll keep you posted.

    	    	    	    	    	    --Greg