ps@uok.UUCP (10/24/84)
Subject: panic: trap type 11
Index: /usr/include/sys/seg.h 2.9bsd
Description:
The system crashes with a 'panic: trap type 11'. This is
quite possible on a system compiled with NOKA5 undefined
and data extending up into the fifth segment. (> 40K data+bss)
The machine may also just stop without saying very much (anything?)
if you get two segmentation violations in a short period of time.
In this last case, the machine will not even reboot itself because
it is spinning in trap().
Repeat-By:
Hard to repeat on demand. The problem seems to be timing
dependent. A possible scenario is:
    buffer gets mapped in:
        *KDSD5 = (BSIZE << 2) | RW
        *KDSA5 = click address in buffer pool
        ... (copying buffer)
    clock interrupts:
        map[0].se_addr = *KDSA5;
        map[0].se_desc = *KDSD5;
        *KDSD5 = seg5.se_desc;
        *KDSA5 = seg5.se_addr;
        ... (handle 1 second clock processing)
        ... ((++lbolt >= hz) && BASEPRI(ps)) is true
        ... drop the priority to spl1().
        ... finish processing, start the restormap stuff
        *KDSD5 = map[0].se_desc;
    disk interrupts:
        calls wakeup to notify process of completion of I/O
        no Savemap() done because *KDSA5 == seg5.se_addr
            and KDSD5 is not checked.
            (notice, KDSD5 wrong, KDSA5 correct)
        wakeup tries to reference something beyond the size
            of a buffer in the fifth segment,
        segmentation violation!
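
The interleaving above can be replayed in user space as a rough sketch. Nothing here is kernel code: the register pair is an ordinary struct, the octal values for the normal and buffer mappings are made up, and the name replay_race() is hypothetical. It only shows that an address-only test lets the disk interrupt run with a stale descriptor.

```c
#include <stdio.h>

/* Mock of one address/descriptor register pair; not the real hardware. */
struct mapent {
    unsigned se_addr;
    unsigned se_desc;
};

/* Stands in for *KDSA5 and *KDSD5. */
static struct mapent kds5;

/* Made-up values for the normal segment 5 mapping and a buffer mapping. */
static const struct mapent seg5   = { 01000, 077406 };
static const struct mapent bufmap = { 02000, 003406 };

/* The old savemap() test: only the address register is examined. */
static int old_would_save(void)
{
    return kds5.se_addr != seg5.se_addr;
}

/*
 * Replays the scenario step by step.  Returns 1 if the disk interrupt
 * ends up running with a descriptor that is not the normal one,
 * 0 otherwise.
 */
int replay_race(void)
{
    struct mapent clockmap;

    kds5 = bufmap;                    /* buffer gets mapped in */

    clockmap = kds5;                  /* clock interrupt saves the map */
    kds5 = seg5;                      /* ... and maps normal segment 5 */

    kds5.se_desc = clockmap.se_desc;  /* restore starts: descriptor only */

    /* The disk interrupt arrives here, before the address is restored. */
    if (old_would_save())
        return 0;                     /* map would have been saved and reset */

    return kds5.se_desc != seg5.se_desc;
}
```

Calling replay_race() returns 1 with these values: the address still matches seg5.se_addr, so no save happens, yet the descriptor is the buffer's.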
Fix:
The solution appears to be to change the savemap() macro
in /usr/include/sys/seg.h. Both the address and descriptor
registers need to be checked when deciding whether to call
Savemap().
It was:
    #define savemap(map) {if (*KDSA5 != seg5.se_addr) Savemap(map); \
        else map[0].se_desc = NOMAP; }
It should be changed to:
    #define savemap(map) \
    {\
        if (*KDSA5 != seg5.se_addr || *KDSD5 != seg5.se_desc)\
            Savemap(map);\
        else\
            map[0].se_desc = NOMAP;\
    }
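
To compare the two tests side by side, here is a small user-space sketch. The struct seg5regs type, the function names, and the register values are all hypothetical stand-ins; only the two conditions are taken from the macros above.

```c
#include <stdio.h>

/* Hypothetical stand-in for the KDSA5/KDSD5 register pair. */
struct seg5regs {
    unsigned addr;
    unsigned desc;
};

/* Condition used by the old macro: address register only. */
int old_needs_save(struct seg5regs cur, struct seg5regs norm)
{
    return cur.addr != norm.addr;
}

/* Condition used by the changed macro: address or descriptor differs. */
int new_needs_save(struct seg5regs cur, struct seg5regs norm)
{
    return cur.addr != norm.addr || cur.desc != norm.desc;
}

/* Prints both decisions for the four possible register states. */
void print_cases(void)
{
    const struct seg5regs norm = { 01000, 077406 };  /* made-up values */
    const struct seg5regs other = { 02000, 003406 }; /* made-up values */
    int a, d;

    for (a = 0; a < 2; a++) {
        for (d = 0; d < 2; d++) {
            struct seg5regs cur;

            cur.addr = a ? other.addr : norm.addr;
            cur.desc = d ? other.desc : norm.desc;
            printf("addr %s, desc %s: old=%d new=%d\n",
                   a ? "differs" : "same", d ? "differs" : "same",
                   old_needs_save(cur, norm), new_needs_save(cur, norm));
        }
    }
}
```

The two tests disagree only in the "address same, descriptor differs" state, which is the state the disk interrupt sees in the scenario under Repeat-By.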
There is also a small problem in sys/machdep.c in the Restormap()
routine. The two stores that restore KDSA5 and KDSD5 are done in
the wrong order and should be reversed.
It was:
    *KDSA5 = map[0].se_addr;
    *KDSD5 = map[0].se_desc;
It should be:
    *KDSD5 = map[0].se_desc;
    *KDSA5 = map[0].se_addr;
We have been running on a kernel with these changes for about two
weeks with no apparent problems. Prior to making these changes,
we would crash at random times, varying from about 5 minutes to 48
hours.
borman@decvax.UUCP (Dave Borman) (11/03/84)
One interesting thing to note about the fix for the mapping problem is that you wind up saving the map when you don't need to, but that's better than not saving it when you need to. The KA5 descriptor will have the modified bit clear when it is saved at boot time. Then, any time you touch anything in KA5, the modified savemap macro will force the save, even though it isn't needed. Depending on how far into KA5 your data extends, I've been considering just getting rid of the macro, since you'll wind up saving the map most of the time anyway. Any thoughts on this?

Speaking of 2.9, has anyone out there had trouble with mount/unmount? When running with RX50 floppies and several hundred buffers, I would have problems with the kernel cancelling write requests before all the buffers got written out. This caused me a few headaches and several rewrites of the unmount system call to make sure that 1) all the delayed writes get queued up to be written, and 2) the call does not return until they are all done. I suppose most people aren't mounting and unmounting file systems a lot.

-Dave Borman, Digital UNIX Engineering Group
decvax!borman
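
On the modified-bit point in this follow-up, a tiny user-space sketch may help. It assumes the modified bit is the "written into" bit at 0100 in the page descriptor, as I understand PDP-11 memory management to define it; the names PDR_W, touch_page() and desc_differs() are hypothetical, and the descriptor value is made up.

```c
/* Assumed position of the "written into" (modified) bit in a descriptor. */
#define PDR_W 0100u

/* Models the hardware setting the modified bit when the page is written. */
unsigned touch_page(unsigned desc)
{
    return desc | PDR_W;
}

/* The descriptor half of the changed savemap() test. */
int desc_differs(unsigned cur, unsigned saved)
{
    return cur != saved;
}
```

With a saved descriptor of 077406 (modified bit clear), desc_differs(touch_page(077406), 077406) is 1, so the changed macro would call Savemap() even though the mapping itself has not changed.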