[net.bugs.4bsd] I have memory problems on my 750 4.1bsd

ptw (04/01/83)

     We had problems with some El-cheapo (non-DEC) memory in our 750; the
errors were being incorrectly reported as "tbuf par err".  Before we were on
USENET we got a patch to machdep.c (attributed to S. Leffler) by word of
mouth.  It was supposed to cure our translation buffer problems while we
waited for DEC to fix the hardware (we're still waiting).

     It turned out that we were not really having that many translation buffer
problems (the hint was that they always got worse when we put our flaky-puff
memory on line).  A careful examination of the hardware revealed that Unix was
not being careful enough: it was interpreting bus errors (memory double-bit
errors) as translation buffer errors.

     I suspect that if you had only the Leffler change, you might just
continue on after a memory error (thinking you had cleared a translation
buffer error) and have funny things happen.

     We made the following change to Leffler's change, so that unrecoverable
memory errors crash, but real translation buffer errors are still cleared and
processing continues.  (There probably should be a limit on the number of
times you will clear and continue in a certain time period...)  We have also
returned our El-cheapos.

583d582
< #define MC750_tbpar 2
623a623
> #define MC750_tbpar 4
646a647
> 		mtpr(TBIA, 0);                  /* assume bad, ala VMS */
690c691
< 		if ((type&0xf) == MC750_tbpar) {
---
> 		if ((mcf->mc5_mcesr&0xf) == MC750_tbpar) {
692d692
< 		    mtpr(TBIA, 0);

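     To make the test in the diff concrete, here is a minimal stand-alone
sketch of the decision it encodes: look only at the low four bits of the error
summary word and treat the value 4 as a translation buffer parity error.  The
constant comes from the diff; the function and enum names are made up for
illustration and are not in machdep.c, and the real handler acts directly
rather than returning a code.

```c
/* Value taken from the diff above; all other names here are hypothetical. */
#define MC750_TBPAR 4

enum mc_action {
	MC_FLUSH_AND_RETURN,	/* translation buffer parity: recoverable */
	MC_CRASH		/* anything else, e.g. a memory error */
};

/*
 * Classify a machine check from its error summary word.
 * Only the low four bits are compared, as in the patched test.
 */
enum mc_action
mc750_classify(unsigned int summary)
{
	if ((summary & 0xf) == MC750_TBPAR)
		return MC_FLUSH_AND_RETURN;
	return MC_CRASH;
}
```

     With this, a summary of 0x24 is treated as recoverable because its low
nibble is 4, while a summary of 2 is not.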

     The complete change to machdep.c from the 4.1 sources is as follows:

572a573,574
>  * Except on translation buffer errors, which are recoverable by invalidating
>  * the buffer and continuing.
620a623
> #define MC750_tbpar 4
643a647
> 		mtpr(TBIA, 0);                  /* assume bad, ala VMS */
680c684
< 		printf("\tva %x errpc %x mdr %x smr %x tbgpar %x cacherr %x\n",
---
> 		printf("\tva %x errpc %x mdr %x smr %x rdtimo %x tbgpar %x cacherr %x\n",
686a691,694
> 		if ((mcf->mc5_mcesr&0xf) == MC750_tbpar) {
> 		    printf("tbuf par: flushing and returning\n");
> 		    return;
> 		    }

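     The limit on clear-and-continue suggested earlier is not part of our
change.  One way it might look is a simple counter per time window, sketched
below.  Everything in it is an assumption for illustration: the struct, the
function names, and the idea of passing the current time in seconds from
whatever clock the kernel has handy.

```c
/* Hypothetical limiter: allow at most max_clears recoveries per window. */
struct tb_limiter {
	long	window_start;	/* time the current window began, in seconds */
	long	window_secs;	/* length of a window, in seconds */
	int	max_clears;	/* recoveries allowed per window */
	int	count;		/* recoveries seen in the current window */
};

void
tb_limiter_init(struct tb_limiter *l, int max_clears, long window_secs,
    long now)
{
	l->window_start = now;
	l->window_secs = window_secs;
	l->max_clears = max_clears;
	l->count = 0;
}

/*
 * Return 1 if it is still reasonable to flush and continue,
 * 0 if too many errors have been cleared in this window.
 */
int
tb_limiter_allow(struct tb_limiter *l, long now)
{
	if (now - l->window_start >= l->window_secs) {
		/* The old window has expired; start a fresh one. */
		l->window_start = now;
		l->count = 0;
	}
	if (l->count >= l->max_clears)
		return 0;
	l->count++;
	return 1;
}
```

     The handler would call tb_limiter_allow() just before the "flushing and
returning" message and crash instead when it returns 0.  A fixed window is the
simplest choice; it can let through up to twice the limit across a window
boundary, which seems acceptable for this purpose.
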
			     P. Tucker Withington
			     Automatix Incorporated
			     ...decvax!{wivax,genrad}!linus!vaxine!ptw
			     (617) 667-7900 x2044