[net.bugs.4bsd] tbuf par faults and 4.2bsd

bruce@godot.UUCP (Bruce Nemnich) (06/06/84)

> From: rees@apollo.UUCP
> Subject: tbuf par errors, once more
> Date: Mon, 4-Jun-84 08:23:32 EDT
> 
> Maybe this should go in the list of most-often asked questions about Unix?
> 
> Thanks to Dennis Ritchie, Sam Leffler, Andy Tannenbaum, and all the
> other people who helped straighten this out.
> 
> The translation buffer helps translate virtual addresses to real.  A
> tbuf par err is a parity error in the translation buffer.  The original
> 4.1bsd code didn't handle these errors correctly.  There is an ECO
> from DEC that reduces the number of these errors, but Unix should still
> be able to handle them correctly.
> 
> There are some bogus versions of this fix floating around.  They have
> mc5_mcesr&0xf instead of mc5_mcesr&0xe.  Make sure you have the right one.
> 
> I have not looked at the 4.2 code to see if this fix made it in.  This code
> is for 4.1bsd.

I just looked at the 4.2bsd code, because I have been having problems with
these faults.  Sure enough, the distributed version has the bogus fix.
Here's the 4.2bsd diff on /sys/vax/machdep.c:

810c810
< 		if ((mcf->mc5_mcesr&0xf) == MC750_TBPAR) {
---
> 		if ((mcf->mc5_mcesr&0xe) == MC750_TBPAR) {
-- 
--Bruce Nemnich, Thinking Machines Corporation, Waltham, MA
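
To make the difference between the two masks concrete, here is a minimal
sketch in C. It assumes MC750_TBPAR is 4, the value reported for mcesr later
in this thread. The function names are made up for illustration and are not
from machdep.c, and the thread does not say what the low bit of mcesr means.

```c
/* Assumed value: the thread reports a tbuf parity error as 4 in mcesr. */
#define MC750_TBPAR 4

/* Check as distributed with 4.2bsd: compares all four low bits. */
int tbpar_mask_f(unsigned int mcesr)
{
	return (mcesr & 0xf) == MC750_TBPAR;
}

/* Corrected check: bit 0 is masked off before the comparison. */
int tbpar_mask_e(unsigned int mcesr)
{
	return (mcesr & 0xe) == MC750_TBPAR;
}
```

With mcesr equal to 4 both checks match. With mcesr equal to 5 (bit 0 also
set) only the 0xe version matches, so the 0xf version would skip the tbuf
recovery path in that case.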

boylan@dicomed.UUCP (Chris Boylan) (06/06/84)

As I said before, the problem I am having with the tbuf errors
is that the flush-and-return fix doesn't work too well if (as it appears)
it gets a second tbuf par error before it finishes its work.

What's happening with our 750 is a 4 (MC750_TBPAR) in mcesr and
the thing fails to recover because it gets a second mchk 2 as
it's returning from the trap.  Since I am currently using the stock
4.2bsd ra81 driver, I do not get a dump of the system when this
happens, but it seems likely that the second error is
another tb parity error.

This being the case, I really don't see what can be done about
this aside from fixing the source of the problem,
thus my relief that DEC is actually shipping the REV 6 upgrade.

However, the local DEC people just talked to me some more about this
fix, and as they are currently distributing it, you CANNOT do
an autorestart and still load the microcode patches.  I am told there
is already one site in Minneapolis running with REV 6 and the new
microcode, which they load from a diagnostic cassette and then
reboot by hand.  This brings up two questions:

What is the straight story on the DEC UEG code to reload the
control store in the Unix boot cycle?  Is it being worked on, or released?
This sounds too good to be one or more of: free, available, true.

Do the patches HAVE to be loaded to run?  I'm sure not driving back
into work to reload the control store...  Our night usage is low
enough that we could just run degraded until morning, but I would
rather be up...

-- 

	Chris Boylan
	{mgnetp | ihnp4 | uwvax}!dicomed!boylan