[comp.unix.wizards] 4.2BSD MFIND panic

urban@spp2 (Mike Urban) (07/26/88)

About six weeks ago, I requested help from this newsgroup:
our 4.2bsd system, to which we had just added two disk drives
and several new file systems, was crashing daily with an
MFIND panic.  Two kind souls directed me to change NMOUNT
and MSWAPX in param.h, and the size of the c_mdev field
in cmap.h (swiping the bit from the c_blkno field).  This
I have done, and matters did improve somewhat.  The system
in question now crashes with the MFIND panic on a weekly
basis rather than daily (actually, it is more like once in
ten days, but let that pass).  Is there another problem
here that must be fixed, or should I begin to suspect
hardware?  Thanks in proverbial advance.

(``Convert to 4.3bsd,'' regardless of its obvious correctness, is
not an acceptable answer, and will elicit an irritated response)
   Mike Urban
	...!trwrb!trwspp!spp2!urban 

"You're in a maze of twisty UUCP connections, all alike"

terryl@tekcrl.CRL.TEK.COM (07/28/88)

In article <1368@spp2.UUCP> urban@spp2 (Mike Urban) writes:
>About six weeks ago, I requested help from this newsgroup:
>our 4.2bsd system, to which we had just added two disk drives
>and several new file systems, was crashing daily with an
>MFIND panic.  Two kind souls directed me to change NMOUNT
>and MSWAPX in param.h, and the size of the c_mdev field
>in cmap.h (swiping the bit from the c_blkno field).  This
>I have done, and matters did improve somewhat.  The system
>in question now crashes with the MFIND panic on a weekly
>basis rather than daily (actually, it is more like once in
>ten days, but let that pass).  Is there another problem
>here that must be fixed, or should I begin to suspect
>hardware?  Thanks in proverbial advance.


     Sorry, I didn't see your original posting (our news has been somewhat
flaky due to disk space problems lately).

     There is a long-standing bug in the VAX C compiler when combining
signed/unsigned casts (at least in the 4.2 C compiler; I have no idea if
it has been fixed in 4.3). Anyway, there are two changes you have to make:

     In the file /sys/sys/vm_mem.c (your path name may vary), find a call to
mfind() in the routine memall(); it's a multi-line call whose last line
looks like this:

	(daddr_t)(u_long)c->c_blkno))

     Take out the (daddr_t) cast. This is the bug I alluded to earlier. The
VAX C compiler folds both casts together somehow, but loses the fact that
c_blkno should be treated as an unsigned quantity, and sign-extends it when
passing it to mfind(), so a block number with its top bit set arrives as a
negative value.
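
     To make the effect concrete, here is a minimal sketch in portable C of
what zero extension versus sign extension does to a narrow field value. It
is not the kernel code: BLKNO_BITS and both function names are made up for
illustration, and the 20-bit width is only an assumption (use whatever width
c_blkno has in your cmap.h).

```c
/* Hypothetical field width; the real width of c_blkno is set in cmap.h. */
#define BLKNO_BITS 20

/* Widen a BLKNO_BITS-wide value with zero extension (the intended result). */
long widen_unsigned(unsigned long raw)
{
    unsigned long mask = (1UL << BLKNO_BITS) - 1;

    return (long)(raw & mask);
}

/* Widen the same bits with sign extension (what the bad cast amounts to). */
long widen_signed(unsigned long raw)
{
    unsigned long mask = (1UL << BLKNO_BITS) - 1;
    unsigned long sign = 1UL << (BLKNO_BITS - 1);
    unsigned long v = raw & mask;

    /* Flip the top field bit, then subtract it: values with that bit set
       come out negative, all others are unchanged. */
    return (long)(v ^ sign) - (long)sign;
}
```

     With the assumed 20-bit width, widen_unsigned(0x80000) gives 524288
while widen_signed(0x80000) gives -524288; small block numbers come out the
same either way, which would fit a panic that only shows up now and then.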

     In the file /sys/vax/vm_machdep.c (again, your path name may vary;
void where prohibited by law, etc.) in the routine chgprot(), there is
a call to munhash() (again, a multi-line call) that is identical to the
call to mfind in memall(). Again, look for the line:

	(daddr_t)(u_long)c->c_blkno))

     And, again, take out the (daddr_t) cast, for the reasons stated above.

     And a warning: in /sys/vax/locore.s, there are implicit assumptions
about the struct cmap (which is defined in cmap.h); specifically, the bit
offsets and lengths of various members of this structure are HARD CODED in
assembly language, so you shouldn't just go changing the structure definition
without making sure the bit offsets and lengths are updated in locore.s.
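
     One way to double-check the numbers before editing locore.s is to ask
the compiler where it actually put a field. The sketch below does that for
a stand-in structure; struct demo_cmap, its field widths, and the helper
names are all hypothetical, and bit-field layout is compiler-specific, so
run it with the same compiler that builds your kernel and substitute the
real struct cmap.

```c
#include <string.h>

/* Hypothetical stand-in for struct cmap; real fields and widths are in cmap.h. */
struct demo_cmap {
    unsigned int c_blkno:20;
    unsigned int c_mdev:5;
    unsigned int c_flags:7;
};

/* Scan the first word of *c: report the lowest set bit and how many bits are set. */
void field_span(const struct demo_cmap *c, int *offset, int *width)
{
    unsigned int word = 0;
    int bit;

    memcpy(&word, c, sizeof word);
    *offset = -1;
    *width = 0;
    for (bit = 0; bit < (int)(sizeof word * 8); bit++) {
        if (word & (1U << bit)) {
            if (*offset < 0)
                *offset = bit;
            (*width)++;
        }
    }
}

/* Fill only c_mdev with ones, then measure where those ones landed. */
void mdev_span(int *offset, int *width)
{
    struct demo_cmap c;
    unsigned int ones = ~0U;

    memset(&c, 0, sizeof c);
    c.c_mdev = ones;    /* truncated to the field's width on assignment */
    field_span(&c, offset, width);
}
```

     The offset and width that mdev_span() reports are the values to compare
against the constants hard coded in locore.s; repeat for each field you
changed.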



				Terry Laskodi
				     of
				Tektronix