v.wales%ucla-locus@sri-unix.UUCP (09/20/83)
From: Rich Wales <v.wales@ucla-locus>

On our VAX 11/780 running 4.1BSD, every so often some device on the UNIBUS starts spewing forth zero interrupt vectors. After about 250,000 of these, of course, UNIX's solution is to do a UNIBUS reset.

Clearly, it would be nice if I could figure out which device is causing all these zero vectors -- but I can't seem to figure out any way to get the culprit to 'fess up, since (Catch-22!) the only way I can think of to identify an interrupting device on the UNIBUS is by its vector.

By adding a few lines to sys/locore.s and dev/uba.c, I was able to tell that the guilty device is interrupting at IPL 15. That doesn't really help me much, though, because just about EVERY device I have interrupts at IPL 15.

Does anyone out there have any helpful hints? Our UNIBUS configuration is as follows, by the way:

    1 SI 9400 disk controller
    1 ABLE DH/DM
    6 DEC DZ-11's
    1 Proteon V2LNI interface
    1 Interlan Ethernet interface

Everything we have interrupts at IPL 15, except for the LNI interface and the DM half of the DH/DM (both of which interrupt at IPL 14).

-- Rich <wales@UCLA-LOCUS>
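The kind of instrumentation described above (tallying zero vectors by interrupt level) might look roughly like the following C sketch. It is not the code Rich added to locore.s or uba.c. All names (`zvcnt_by_br`, `ipl_to_br`, `zv_record`, `zv_count`) are hypothetical, and it assumes the IPL numbers quoted in the thread are hexadecimal, with IPL 0x14 through 0x17 corresponding to UNIBUS BR4 through BR7.

```c
#include <assert.h>

#define BR_LOWEST  4   /* BR4 */
#define BR_HIGHEST 7   /* BR7 */
#define BR_LEVELS  (BR_HIGHEST - BR_LOWEST + 1)

/* Hypothetical per-level tally of zero vectors (index 0 is BR4). */
static unsigned long zvcnt_by_br[BR_LEVELS];

/*
 * Map a processor IPL to a UNIBUS BR level.
 * Assumption: IPL 0x14..0x17 correspond to BR4..BR7.
 * Returns -1 for any IPL outside that range.
 */
int ipl_to_br(int ipl)
{
    if (ipl < 0x14 || ipl > 0x17)
        return -1;
    return ipl - 0x10;
}

/*
 * Record one zero vector seen at the given IPL.
 * Returns the BR level charged, or -1 if the IPL is not a UNIBUS level.
 */
int zv_record(int ipl)
{
    int br = ipl_to_br(ipl);

    if (br < 0)
        return -1;
    zvcnt_by_br[br - BR_LOWEST]++;
    return br;
}

/* Return how many zero vectors have been charged to a BR level (4..7). */
unsigned long zv_count(int br)
{
    assert(br >= BR_LOWEST && br <= BR_HIGHEST);
    return zvcnt_by_br[br - BR_LOWEST];
}
```

As the post notes, a tally like this only narrows the search to one request level; it cannot single out a device when most of them share that level.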
dmmartindale@watcgl.UUCP (Dave Martindale) (09/25/83)
First, is it a device suddenly spewing forth these vectors, or is it the slow, gradual collection of 250,000 of them over a long period? As distributed, the system never resets this count, and if you seldom crash this can become a problem.

One "normal" source of zero vectors is DEC interrupt controllers. Some of them are designed to speed up DMA transactions by throwing away bus grants if NPR is asserted. During a normal interrupt sequence, the device pulls down BR5 (in this case) and waits to see BG5. When it gets BG5, it returns SACK and then, once the previous transaction completes, asserts BBSY and INTR along with the vector. SACK is negated after INTR is asserted.

This is probably fine on PDP11's, but on the 780 the UBA doesn't know what priority the processor is at and thus can't issue BG's on its own. Thus it just passes the BR on to the processor as an interrupt request on the SBI, and when the UBA interrupt handler goes to read the appropriate BRRVR, the UBA then knows that the processor is ready to handle that interrupt and issues the BG. Then, if the grant is thrown away without producing an interrupt vector, the UBA just returns zero since it has to pass back something. This produces a zero vector.

(The above description is my own understanding of how this works, based on reading manuals and circuit diagrams and watching the bus. I could be wrong....)

Anyway, you would expect to get these frequently if you have devices which have this sort of interrupt controller (and I think the DZ's do) plus lots of Unibus DMA activity. The Unibus disk would provide the latter.

Now, if the zero-vector count builds up gradually, there really isn't much you can do practically about it; just reset it to zero every once in a while so you don't get Unibus resets.
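One way to picture "reset it to zero every once in a while" is a counter that is cleared at the start of each time window, so that only a burst inside a single window can reach the limit. The C sketch below is an illustration under that assumption, not the 4.1BSD code: the structure, the function names, and the idea of passing in the current time in seconds are all hypothetical, and the 250,000 limit is simply the figure quoted in the thread.

```c
#include <assert.h>

/* Limit quoted in the thread; the exact comparison used by the kernel is not shown there. */
#define ZV_LIMIT 250000L

/* Hypothetical bookkeeping for zero vectors on one UNIBUS adapter. */
struct zv_counter {
    long count;         /* zero vectors seen in the current window */
    long window_start;  /* time (seconds) at which the window began */
    long window_secs;   /* length of a window in seconds */
};

/* Start counting at time `now` with the given window length. */
void zv_init(struct zv_counter *zv, long now, long window_secs)
{
    assert(zv != 0 && window_secs > 0);
    zv->count = 0;
    zv->window_start = now;
    zv->window_secs = window_secs;
}

/*
 * Note one zero vector at time `now`.
 * Returns 1 if the limit was reached within a single window
 * (the caller would then do the UNIBUS reset), 0 otherwise.
 */
int zv_note(struct zv_counter *zv, long now)
{
    /* A stale window means the old count was slow accumulation: discard it. */
    if (now - zv->window_start >= zv->window_secs) {
        zv->count = 0;
        zv->window_start = now;
    }
    if (++zv->count >= ZV_LIMIT) {
        zv->count = 0;
        zv->window_start = now;
        return 1;
    }
    return 0;
}
```

With a one-hour window, 100,000 zero vectors per hour never trigger a reset no matter how long the system stays up, while 250,000 arriving within the same hour still do.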
If you really do get very large bursts of zero vectors all at once and can produce the problem on demand, or observe it while it is happening, you should be able to find out which device is actually requesting the interrupt, and which is throwing away the grant, with an oscilloscope or (better) a logic analyzer. Probably not an attractive prospect, but a useful last resort...

Hope this helps.

Dave Martindale