greg@duke.cs.unlv.edu (Greg Wohletz) (03/27/90)
In article <10333@cbmvax.commodore.com>, grr@cbmvax.commodore.com (George Robbins) writes:
|> In article <1642@jimi.cs.unlv.edu> greg@unlv.edu (Greg Wohletz) writes:
|> > We have several microvax II's that we are using as fileservers. They are
|> > running ultrix 3.1. Periodically (about once every 24 hours) they crash
|> > with ``qe: Non existant memory interrupt''. A peek at if_qe.c reveals the
|> > following comment...
|> >
|> > So it would appear that this is an error condition from the controller
|> > itself. Has anyone seen this before? Is there a fix? What is a
|> > non-existent memory interrupt?
|>
|> Well, the first comment is certainly bogus, since (illegally) long packets
|> on your ethernet will cause a panic due to "chained packets". I wouldn't
|> be too surprised if there is some network disease that could cause the second.
|>
|> What is the history of this problem? Is it new with 3.1 or are the machines
|> new or is there some new system/software elsewhere on your network that has
|> triggered these panics?

We've had the microvaxes for several years. About 6 months ago we
converted three of them into fileservers; until then we were running
2.0 on them. We discovered severe NFS bugs in 2.0 that caused frequent
crashes, and we also discovered problems with our old DEQNA boards, so
we upgraded to 3.1 and installed new DELQA boards. This made things a
lot better, but we still get the ``non-existent memory interrupt''
panics daily...

I've found the following in the DELQA documentation:

    There are three interrupt conditions:

    o Receive Interrupt Request, when a complete packet has been received.
    o Transmit Interrupt Request, when a transmission is completed.
    o Nonexistent Memory, when a Q-bus or memory access error occurs.

This seems to match well with the interrupt code, which looks like:

    if( csr & QE_RCV_INT )
        qerint( unit );
    if( csr & QE_XMIT_INT )
        qetint( unit );
    if( csr & QE_NEX_MEM_INT )
        panic("qe: Non existant memory interrupt");

So the question is: what causes this interrupt? Elsewhere in the
documentation it says:

    Nonexistent-Memory timeout: this is set if the DELQA times out
    while trying to access host memory.

So I've come up with one possible theory: could the interrupt priority
of the DELQA be higher than the processor level set by the kernel when
manipulating the memory management registers?

|> Which board is actually involved? If all else fails and they're DEQNA's you
|> might try upgrading to a newer board - see the VMS related DEQNA discussion
|> recently in comp.sys.dec. A while back I had a DEQNA problem that turned out
|> to be a problem with jumpers on the *memory* card, but that was in a PDP11
|> Q-bus environment...

As I said above, the card is a DELQA less than 6 months old. One other
possibility is the following piece of info from the manual:

    The mode switch defines two possible modes of operation for the
    DELQA. The preferred mode is the ``Normal mode'' which indicates
    that the DELQA is operating as a DELQA. All current DIGITAL
    software for the DEQNA may be used with confidence for the DELQA
    when the DELQA is switched to operate in Normal mode. ``DEQNA-lock
    mode'' should only be required for use with some non-DIGITAL
    software drivers to achieve compatibility with DEQNA programming
    features.

We currently have the boards set up the way they were shipped (normal
mode). Perhaps I'll try putting them into DEQNA-lock mode and see if
this clears up the problem. (What? You thought Ultrix was ``current
DIGITAL software''? Shame on you!)

Anyway, if anyone has any further insight I'd sure appreciate it;
otherwise I'll keep you posted.

--Greg
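
P.S. For anyone who wants to poke at the dispatch logic outside the
kernel, here is a minimal userland sketch of the three status checks
from the interrupt code quoted in this post. The SK_ names and the bit
values are placeholders made up for illustration; they are not copied
from the Ultrix headers, and sk_decode() and sk_report() are
hypothetical helpers, not part of if_qe.c. The sketch only reports
which conditions a given CSR value would trigger, so nothing here
panics.

```c
#include <stdio.h>

/* Placeholder bit masks (assumptions, NOT taken from the Ultrix headers). */
#define SK_RCV_INT     0x8000u  /* receive interrupt request  */
#define SK_XMIT_INT    0x0080u  /* transmit interrupt request */
#define SK_NEX_MEM_INT 0x0004u  /* nonexistent memory         */

/* One flag per interrupt condition named in the DELQA documentation. */
enum sk_event { SK_EV_RCV = 1, SK_EV_XMIT = 2, SK_EV_NXM = 4 };

/* Return the set of conditions present in a CSR value.  The three
 * tests are independent, as in the driver snippet, so more than one
 * flag can be set by a single CSR value. */
unsigned sk_decode(unsigned csr)
{
    unsigned events = 0;

    if (csr & SK_RCV_INT)
        events |= SK_EV_RCV;
    if (csr & SK_XMIT_INT)
        events |= SK_EV_XMIT;
    if (csr & SK_NEX_MEM_INT)
        events |= SK_EV_NXM;
    return events;
}

/* Print one line per condition instead of calling a handler or panic(). */
void sk_report(unsigned csr)
{
    unsigned events = sk_decode(csr);

    if (events & SK_EV_RCV)
        printf("receive: a complete packet has been received\n");
    if (events & SK_EV_XMIT)
        printf("transmit: a transmission has completed\n");
    if (events & SK_EV_NXM)
        printf("nonexistent memory: Q-bus or memory access error\n");
}
```

Calling sk_report() with only the placeholder nonexistent-memory bit
set prints the single line that corresponds to the panic path in the
driver; setting several bits at once prints one line for each.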