[comp.unix.aux] Major bug in A/UX 1.1

mckenzie@june.cs.washington.edu (Neil McKenzie) (05/09/89)

I just installed the upgrade from A/UX 1.0.1 to A/UX 1.1 on my 
4 Mbyte Mac IIx ('030 version of the MacII).  There seems to be
a serious reliability problem with the new kernel running on my
computer.  I can make it crash by simply doing the following from
the shell:

	% rlogin localhost
	...
	% exit

The machine then dies and prints the error message:

	panic: sbflush 2

It is necessary to perform a cold restart at this point.
I've also seen similar problems when I run the new (color supported)
X11, running innocuous programs like "plaid" from a remote client.
The machine bombs and various messages come up:

	Double panic: mget

	Kernel bus error (then a register dump)
	Double panic: kernel memory management error


I tried using kconfig to increase various kernel parameters (as
described in Chapter 2 of the "X Window System User's Guide for A/UX"),
but this has no effect; the machine crashes the same way.

Fortunately I saved a copy of the 1.0.1 kernel and I am just using that
instead; it is highly robust.  Running the new X11 with the old kernel
is fine but just monochrome.  Both the new kernel and the new X11
server are necessary to make X11 work in color.


Questions:
Are there others (including people at Apple) who have seen these bugs?
Is there a fix in the works?

This problem is pretty serious -- serious enough to make this whole
upgrade unusable, since color X11 is the whole point of the upgrade,
for my purpose.

--Neil McKenzie (mckenzie@june.cs.washington.edu)

chuq@Apple.COM (Chuq Von Rospach) (05/09/89)

>I just installed the upgrade from A/UX 1.0.1 to A/UX 1.1 on my 
>4 Mbyte Mac IIx ('030 version of the MacII).  There seems to be
>a serious reliability problem with the new kernel running on my
>computer.  I can make it crash by simply doing the following from
>the shell:

>	% rlogin localhost
>	...
>	% exit

>The machine then dies and prints the error message:
>
>	panic: sbflush 2


I can't reproduce this. 

>I've also seen similar problems when I run the new (color supported)
>X11, running innocuous programs like "plaid" from a remote client.
>The machine bombs and various messages come up:
>
>	Double panic: mget
>
>	Kernel bus error (then a register dump)
>	Double panic: kernel memory management error

I can't reproduce this, either. Both of these error messages are somewhat
indicative of a system problem -- I would strongly suggest either a latent
hardware problem (very possibly on the ethernet card) that got tickled by 
the upgrade or a corrupted kernel. Other possibilities might be a memory
problem that passes self-test but occasionally flips a bit or a flakey 68030
chip. 

Since a 1.0.1 kernel works, I'd try building a new 1.1 kernel from scratch
and see whether it works. It sounds like something was corrupted to me.


Chuq Von Rospach      =|=     Editor,OtherRealms     =|=     Member SFWA/ASFA
         chuq@apple.com   =|=  CI$: 73317,635  =|=  AppleLink: CHUQ
      [This is myself speaking. No company can control my thoughts.]

Bookends. What a wonderful thought.

phil@Apple.COM (Phil Ronzone) (05/10/89)

In article <8150@june.cs.washington.edu> mckenzie@uw-june.UUCP (Neil McKenzie) writes:
>I just installed the upgrade from A/UX 1.0.1 to A/UX 1.1 on my 
>4 Mbyte Mac IIx ('030 version of the MacII).  There seems to be
>a serious reliability problem with the new kernel running on my
>computer.  I can make it crash by simply doing the following from
>the shell:
>
>	% rlogin localhost
>	...
>	% exit
>
>The machine then dies and prints the error message:
>
>	panic: sbflush 2
>
I can't reproduce any of these errors. I would suspect flakey hardware. I
just had an old Mac II that has been running without poweroff or failure for
almost two years -- and it started getting flakey. I finally took the cover
off and "stroked" the SIMMs and wiggled all the connectors and reseated the
PMMU. Now it behaves again. If the bug is repeatable, check your Ethernet
card to see if it has all the revs on it.
+-----------------------------------------------------------------------------+
|Philip K. Ronzone, Apple Computer, 10440 Bubb Rd, MS 58A, Cupertino, CA 95014|
|{amdahl,decwrl,sun,voder,nsc,mtxinu,dual,unisoft,...}!apple!phil             |
+-----------------------------------------+-----------------------------------+
| All "IMHOs" disclaimed and copyrighted. | Self defense is a human right ... |
+-----------------------------------------+-----------------------------------+

randy@violet.berkeley.edu (Randy Ballew) (05/10/89)

>>I've also seen similar problems when I run the new (color supported)
>>X11, running innocuous programs like "plaid" from a remote client.
>>The machine bombs and various messages come up:
>>
>>	Double panic: mget
>>
>>	Kernel bus error (then a register dump)
>>	Double panic: kernel memory management error
>
>I can't reproduce this, either. Both of these error messages are somewhat
>indicative of a system problem -- I would strongly suggest either a latent
>hardware problem (very possibly on the ethernet card) that got tickled by 
>the upgrade or a corrupted kernel. Other possibilities might be a memory
>problem that passes self-test but occasionally flips a bit or a flakey 68030
>chip. 
>
>Since a 1.0.1 kernel works, I'd try building a new 1.1 kernel from scratch
>and see whether it works. It sounds like something was corrupted to me.

I experienced a similar problem:  MC68030, A/UX 1.1, color X11, 8Mb, 
80Mb external disk drive.   Kernel was configured according to the 
instructions in "X Window System User's Guide for A/UX":

	NBUF=1000
	NINODE=600
	NFILE=400
	NPROC=100
	MAXUP=100

The system was left running X and an xterm or two (one with an rlogin) over the
weekend; on Monday it had crashed with a screenful of memory management error
messages and references to XMacII.  The machine was completely wedged; I was
unable to bring up MacOS.  Not being a MacWizard, I was stumped.  Fortunately,
I was able to locate our local Mac expert, who was able to boot the machine
from a floppy.  One unsolved mystery:  

	we had to disconnect the SCSI cable before  MacOS would come up.  

We were then able to do a sanity restore by "zapping the PRAM" and
resetting all its parameters.  The machine is running X again, and I'm waiting for
a repeat performance...  any tips?

Another problem:  the A/UX version[s] of cpp seem unable to cope with certain X
software.  I'm trying to get our hacked up X11R3 HP widgets to compile (they're
compiled and working on Sun 3.5 and 4.0, Ultrix 2.2 and 3.0).  I switched to
/usr/lib/big after /bin/cpp choked in <stdio.h> (!!).  Here's where I'm hung
now:

randy@scrapple [73] % cd MACII 
randy@scrapple [74] % make TextEdit.o
	/bin/cc -O -c -I..  -DmacII -B /usr/lib/big/ ../TextEdit.c
../TextEdit.c: 433: token too long
*** Error code 1

Stop.

Line 433 is about half way through the static declaration of the widget's
translations, which I'd rather not modify.

Unless someone has a less painful solution, my next shot is to make a GNU cpp.
Unfortunately,  gcc version 1.35 has no configuration files for AUX.  Before I
reinvent a wheel, has anyone successfully made gcc for AUX?  If so, any tips?

Thanks in advance...

Randy
	Randy Ballew
	randy@violet.berkeley.edu ucbvax!violet!randy