[net.unix-wizards] Make makes shell dump core....

jnelson@trwrba.UUCP (08/29/84)

Subject: Memory fault in shell using make...
Newsgroups: net.unix-wizards

	Subject: bug in 4.2 /bin/make

	SYNOPSIS:
	    make sometimes dumps core with a memory fault or segmentation fault

Yes, I have seen the same behavior with our make; however,
our shell blows up ONLY when make tries to do a make depend.

Needless to say, I'm somewhat concerned... especially if the
bug is in the kernel.

Using our brand new friend, Mr. DBX, I find that the command string
from the makefile is passed properly to the Bourne Shell, by make.
After the shell is vforked, we suddenly intercept a trap, and a maze of
4.2 signal routines is called (apparently due to some sort of
memory problem within the shell), ending in a core dump.

Make performs fine when I try to remake, say, "make" or
any other particular piece of code.  When I try to remake the
kernel, however, the Bourne shell goes bonkers.  But what
could be making the Bourne shell barf like this?

The makefiles in /usr/sys look okay, and I haven't made any
kernel changes yet.  Only reconfiguration of the system
hardware.  I don't think it has anything to do with swap space.

Now, I'm not exactly desperate or anything.... I took the cheap
and easy way out and reloaded the system code from tape (heh heh).
But I still find it disconcerting that such a thing can happen.
Thanks in advance if you have any suggestions or fixes.



						- John

stevesu@azure.UUCP (Steve Summit) (09/13/84)

I don't know why make depend is making the shell dump core, but I
do know why dbx is showing "a maze of 4.2 signal routines called
(apparently due to some sort of memory problem from within the
shell)."  It might have something to do with the core dump, and
it's good to know about, anyway.

The Bourne shell does memory management by a "hit or miss"
method.  It catches some signal (probably SIGSEGV, although I
have this memory that it's signal 12) to detect when it tries to
use memory it doesn't have yet.  The signal catch routine does an
sbrk(), and the shell then retries whatever memory access it was doing.
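
A minimal sketch of that fault-driven growth in C, not the shell's
actual source.  The names grow_on_fault, touch_beyond_break, and
GROW_STEP are made up for illustration, and the sketch assumes a
system where a store past the break raises SIGSEGV and where
returning from the handler restarts the faulting instruction.

```c
#define _DEFAULT_SOURCE          /* for sbrk() and sigaction() under strict -std modes */
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define GROW_STEP 4096           /* hypothetical growth increment, in bytes */

static volatile sig_atomic_t fault_count = 0;

/* SIGSEGV handler: extend the data segment a little and return, so that
 * the faulting store is attempted again. */
static void grow_on_fault(int sig)
{
    (void)sig;
    fault_count++;
    if (sbrk(GROW_STEP) == (void *)-1)
        _exit(1);                /* out of memory: give up rather than spin */
}

/* Store one byte `distance` bytes past the current break, relying on the
 * handler to move the break until the store succeeds.
 * Returns 0 if the byte reads back, -1 otherwise. */
int touch_beyond_break(size_t distance)
{
    struct sigaction sa, old;
    volatile char *p = (volatile char *)sbrk(0) + distance;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = grow_on_fault;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    *p = 'x';                    /* faults until the break has moved past p */

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous disposition */
    return *p == 'x' ? 0 : -1;
}
```

Calling touch_beyond_break(65536) should take a handful of faults
before the store goes through; fault_count records how many.  If the
handler were never installed, nothing would ever move the break.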

The problem is that the shell knows how to catch signals too
well.  It has this general signal catcher routine that ALWAYS
checks to see if signals were being ignored before it catches
them (and if you don't know why, read "Unix Programming" in
Volume 2 right away, in particular section 6).  It calls this
internal routine to catch SIGSEGV, and if for some reason SIGSEGV
was already being ignored, the handler never gets installed, so
sbrk() is never called.  The shell then loops forever retrying
an access that would only work once enough SIGSEGVs had been
caught to allocate the memory it needs.
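
The "check before you catch" idiom looks roughly like this in C.
This is a sketch of the general technique, not the shell's own
routine; catch_unless_ignored and note_signal are illustrative names.

```c
#include <signal.h>

typedef void (*handler_t)(int);

static volatile sig_atomic_t last_signal = 0;

/* Example handler: just remember which signal arrived. */
static void note_signal(int sig)
{
    last_signal = sig;
}

/* Install `handler` for `sig` unless the signal was already being
 * ignored when we were started.  Returns 1 if the handler was
 * installed, 0 if the signal was left ignored. */
int catch_unless_ignored(int sig, handler_t handler)
{
    /* signal() returns the previous disposition, so setting SIG_IGN
     * and looking at the result tells us whether it was ignored. */
    if (signal(sig, SIG_IGN) == SIG_IGN)
        return 0;
    signal(sig, handler);
    return 1;
}
```

For interrupt-style signals this is the polite thing to do, since a
background job started with them ignored stays immune.  Applied to
the memory-fault signal, it means a process that inherits SIGSEGV
ignored never gets its growth handler.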

You can demonstrate this by writing a little C program that
ignores SIGSEGV and then exec's a shell that does any nontrivial
task.  The times I tried it the shell sat there in a loop doing
nothing but eating up all of the cpu it could get its hands on.
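
Something along these lines reproduces the setup.  It is a sketch
with one liberty taken: it forks first so the caller gets the shell's
exit status back, where the test described above simply exec'd the
shell.  The function name and the /bin/sh path are assumptions.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `command` under /bin/sh with SIGSEGV ignored in the child.
 * Returns the shell's exit status, or -1 on failure or if the
 * shell was killed by a signal. */
int run_shell_ignoring_sigsegv(const char *command)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_IGN);   /* ignored dispositions survive exec */
        execl("/bin/sh", "sh", "-c", command, (char *)0);
        _exit(127);                 /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a shell that behaves as described above, the call would
presumably never return for a nontrivial command; a shell that does
not depend on SIGSEGV for its memory should just run the command and
exit normally.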

I suppose this could cause the core dump problem on the make
depend, somehow.  There may well be other problems.  The Bourne
shell is surprisingly strangely written for such a great program.
(By the way, I'm not trying to cut it down -- I use it in
preference to csh.)

A fix would be to put a check in the shell's internal catch
routine so that it catches SIGSEGV (or whatever the signal is)
regardless of whether it was being ignored.
(By the way, I haven't gotten around to doing this yet, so don't
blame me if it doesn't work or if it breaks something else.)  One
thing that complicates the task is that the shell sources don't
#include signal.h (perhaps it was written before there was one).
Instead, it has its own definitions for all the signals.
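
A hedged sketch of what such a check might look like, written against
<signal.h> rather than the shell's private definitions.  It is
untested against the real shell sources; MEMFAULT, install_handler,
and on_fault are hypothetical names, and MEMFAULT is assumed here to
be SIGSEGV.

```c
#include <signal.h>

#define MEMFAULT SIGSEGV   /* assumption: the signal used for memory growth */

typedef void (*handler_t)(int);

static volatile sig_atomic_t faults_seen = 0;

/* Placeholder handler standing in for the routine that calls sbrk(). */
static void on_fault(int sig)
{
    (void)sig;
    faults_seen++;
}

/* Install `handler` for `sig`.  Ordinary signals that were ignored on
 * entry stay ignored; the memory-fault signal is always caught.
 * Returns 1 if the handler was installed, 0 if the signal was left
 * ignored. */
int install_handler(int sig, handler_t handler)
{
    if (sig != MEMFAULT && signal(sig, SIG_IGN) == SIG_IGN)
        return 0;
    signal(sig, handler);
    return 1;
}
```

The only change from the usual idiom is the exemption for MEMFAULT, so
the treatment of interrupt and quit signals is left as it was.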

                                         Steve Summit
                                         tektronix!tekmdp!stevesu