[net.unix-wizards] 750 tbuf error revisited - help with micropatch?

sherouse@unc.UUCP (George W. Sherouse) (09/21/84)

	Has anyone sorted out a way of applying the micropatch to
	a rev.7 VAX 750 processor board?  We have just discovered
	that we have a rev.7 board - thus explaining our chronic
	problem with tbuf parity errors under BSD 4.2.

	The story from DEC is that VMS knows how to load the micropatch
	from the console TU58 at boot time and it's our tough luck
	that 4.2 doesn't know how, too.  Surely some site more blessed
	with wizardry has already solved this.  And what are ULTRIX sites
	doing?

	Thanks in advance for any help.

George W. Sherouse
Associate Physicist
Division of Radiation Therapy
North Carolina Memorial Hospital
Chapel Hill, NC   27514

(919) 966-1101

<decvax!mcnc!unc!godot!sherouse>

chris@umcp-cs.UUCP (Chris Torek) (09/23/84)

When our 750s (which we ordered with ULTRIX but are now running BRL 4.2)
boot they claim to be ``Updating 11/750 microcode''.  This happens before
the boot message, so presumably this is either in the boot ROMs or in
/boot (it sure isn't in /vmunix!).

Perhaps you can get DEC to sell you the ULTRIX boot code . . . .
-- 
(This page accidentally left blank.)

In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

rbbb@RICE.ARPA (10/02/84)

From:  David Chase <rbbb@RICE.ARPA>

This sounds vaguely like DEC field service bullshit to me. (pardon me while
I dig out my magic hat, VMS release notes, and 750 processor description
manual)

There is a "classic" 750 tbuf error bug, and it is not (to my knowledge)
fixed by upgrading the microcode.  The symptoms of this bug are:

1) if you can figure out the machine check information, it claims that BOTH
   tbufs are in error
2) the machine check stack is badly confused
3) the 750 passes diagnostics (usually)

This bug was fixed by swapping out boards till we got it right.  The line
to use on your FE's is "well, if our board's not broken, what does it hurt
you to swap?"  My officemate claims that there is no way that this bug
could be fixed with microcode, and that it is caused by a timing problem
between memory, datapath, and translation buffer.

I also know that microcode rev 94 does not have this bug, and rev 94 does
NOT need the patchable control store.  We have rev 94 in 6 750s, and
everything works fine.

To find your microcode rev, type "E/I/L 3E" to the console microcode.  This
will get the SID in hex, formatted as SS00MMHH.  The MM digits are the
microcode rev; interpret them as hex or as decimal, whichever gives a
number close to 94.
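
For anyone who would rather not do this by eye, here is a rough sketch of
pulling the MM field out of the console output.  It assumes only the
SS00MMHH layout described above.  The function names and the sample SID
strings in the usage note are made up for illustration and are not values
read from a real machine.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def parse_sid(sid_hex):
    """Split an 8-digit SID string laid out as SS00MMHH into its fields."""
    digits = sid_hex.strip().upper()
    if len(digits) != 8 or not set(digits) <= HEX_DIGITS:
        raise ValueError("expected exactly 8 hex digits, got %r" % sid_hex)
    return {"ss": digits[0:2], "mm": digits[4:6], "hh": digits[6:8]}


def microcode_rev_candidates(sid_hex):
    """Return the MM field read as hex and as decimal.

    The decimal reading is None when MM contains A-F.  Which of the two
    readings is the real rev is left to the reader: pick the one that
    lands near 94.
    """
    mm = parse_sid(sid_hex)["mm"]
    as_hex = int(mm, 16)
    as_decimal = int(mm, 10) if mm.isdigit() else None
    return as_hex, as_decimal
```

With a made-up SID of 02005E01, microcode_rev_candidates gives 94 for the
hex reading and None for the decimal one.  With a made-up 02009401 it gives
148 and 94, so the decimal reading would be the plausible one.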

Other microcode facts:

All revs greater than 94 need the PCS option, and thus need to load the
microcode from TU58 or elsewhere.

The minimum rev for 750s with CI750 from VMS 4.0 on will be 97; as of 3.7,
a warning message will be printed.  I get the feeling that the later
microcode fixes deal with things like fixup after exceptions and CI garbage.
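
The two thresholds above can be written down as a quick check.  This is
only a restatement of the numbers in this message (94 and 97), not anything
taken from DEC documentation, and the names are invented for the sketch.

```python
# Both constants come straight from the text of this post.
LAST_REV_WITHOUT_PCS = 94  # revs greater than this need the PCS option
CI750_MIN_REV = 97         # minimum rev with a CI750 from VMS 4.0 on


def needs_pcs(rev):
    """True if this microcode rev has to be loaded into the PCS."""
    return rev > LAST_REV_WITHOUT_PCS


def ok_for_ci750(rev):
    """True if this rev meets the stated minimum for a CI750 under VMS 4.0."""
    return rev >= CI750_MIN_REV
```

So rev 94 needs no PCS but falls short of the CI750 minimum, and rev 97
meets the minimum but must be loaded from TU58 or elsewhere.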

The microcode upgrades may be loaded by a running machine (though it won't
be possible to boot from CI750 unless it is loaded from TU58).  For VMS,
this is accomplished by using a dummy device driver WDA0 whose only
function is to reload the microcode at system boot and after power failures
(if you have battery backup on your memory).  This means that it should be
possible for Unix, too, given sufficient informed hacking.

If anyone knows anything more about what the various microcode revs
fix, or about the workings of WDA0, please let me know so I don't
misinform people.  I'm about to go tromping through the fiche looking for
WDDRIVER, but it is an "optional" device driver, so it may not be there.

drc