jnelson@trwrba.UUCP (08/29/84)
Subject: Memory fault in shell using make...
Newsgroups: net.unix-wizards

Subject: bug in 4.2 /bin/make
SYNOPSIS: make sometimes dumps core with a memory fault or segmentation fault

Yes, I have seen the same behavior with our make; however, our shell blows up ONLY when make tries to do a make depend. Needless to say, I'm somewhat concerned... especially if the bug is in the kernel.

Using our brand new friend, Mr. DBX, I find that the command string from the makefile is passed properly to the Bourne shell by make. After the shell is vforked we suddenly intercept a trap, and a maze of 4.2 signal routines is called (apparently due to some sort of memory problem from within the shell), with a core dumped.

Make performs fine when I try to remake, say, ''make'' or any other particular piece of code. When I try to remake the kernel, however, the Bourne shell goes bonkers.

But what could be making the Bourne shell barf like this? The makefiles in /usr/sys look okay, and I haven't made any kernel changes yet, only a reconfiguration of the system hardware. I don't think it has anything to do with swap space.

Now, I'm not exactly desperate or anything.... I took the cheap and easy way out and reloaded the system code from tape (heh heh). But I still find it disconcerting that such a thing can happen. Thanks in advance if you have any suggestions or fixes.

- John
stevesu@azure.UUCP (Steve Summit) (09/13/84)
I don't know why make depend is making the shell dump core, but I do know why dbx is showing "a maze of 4.2 signal routines called (apparently due to some sort of memory problem from within the shell)." It might have something to do with the core dump, and it's good to know about anyway.

The Bourne shell does memory management by a "hit or miss" method. It catches some signal (probably SIGSEGV, although I have this memory that it's signal 12) to detect when it tries to use memory it doesn't have yet. The signal catch routine does an sbrk(), and the shell then retries whatever memory access it was doing.

The problem is that the shell knows how to catch signals too well. It has a general signal catcher routine that ALWAYS checks whether a signal was being ignored before it catches it (and if you don't know why, read "Unix Programming" in Volume 2 right away, in particular section 6). It calls this internal routine to catch SIGSEGV, and if for some reason SIGSEGV was already being ignored, the catch routine presumably never gets installed, so the shell loops forever retrying an access that would have worked once enough SIGSEGVs had been caught to allocate the memory it needs.

You can demonstrate this by writing a little C program that ignores SIGSEGV and then execs a shell that does any nontrivial task. The times I tried it, the shell sat there in a loop doing nothing but eating up all of the CPU it could get its hands on.

I suppose this could cause the core dump problem on the make depend, somehow. There may well be other problems. The Bourne shell is surprisingly strangely written for such a great program. (By the way, I'm not trying to cut it down -- I use it in preference to csh.)

A fix would be to put a check in the shell's internal catch routine to catch SIGSEGV (or whatever the signal is) regardless. (By the way, I haven't gotten around to doing this yet, so don't blame me if it doesn't work or if it breaks something else.)
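
For illustration only, here is a minimal sketch of the two behaviors described above: a catcher that declines to catch a signal that was ignored on entry, and the proposed variant that catches regardless. This is not the shell's actual source. The names catch_unless_ignored, catch_always, note_fault and fault_seen are hypothetical, and the sketch assumes a system that does provide <signal.h>.

```c
#include <signal.h>

typedef void (*handler_t)(int);

/* Hypothetical flag and handler, standing in for the shell's own
   memory-fault routine (which would call sbrk() and retry). */
volatile sig_atomic_t fault_seen = 0;

void note_fault(int sig)
{
    (void)sig;
    fault_seen = 1;
}

/* The conventional idiom: leave the signal alone if it was already
   being ignored when we started. Returns 1 if the handler was
   installed, 0 if the signal was left ignored. */
int catch_unless_ignored(int sig, handler_t handler)
{
    /* Setting SIG_IGN returns the previous disposition. */
    handler_t old = signal(sig, SIG_IGN);

    if (old == SIG_IGN)
        return 0;               /* was ignored: stay ignored */
    signal(sig, handler);
    return 1;
}

/* The proposed fix, in sketch form: for the memory-fault signal,
   install the handler no matter what the old disposition was.
   Returns 1 on success, 0 if signal() reported an error. */
int catch_always(int sig, handler_t handler)
{
    return signal(sig, handler) != SIG_ERR;
}
```

The first function is the right behavior for signals like SIGINT, where an ignored disposition is the parent's way of saying "leave this alone." The second reflects the suggestion that a signal the program depends on for its own memory management should be an exception to that rule.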

One thing that complicates the task is that the shell sources don't #include signal.h (perhaps the shell was written before there was one). Instead, they have all their own definitions for the signals.

Steve Summit
tektronix!tekmdp!stevesu
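
The experiment described in the post above (ignore SIGSEGV, then exec a shell) might look roughly like the following sketch. The function name run_shell_ignoring_sigsegv is hypothetical, and the fork()/waitpid() wrapper is an addition so that the caller can observe the shell's exit status; the post itself only mentions ignoring the signal and exec'ing.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "sh -c cmd" in a child that has SIGSEGV ignored.
   Returns the shell's exit status, or -1 if the child could not be
   started or did not exit normally. */
int run_shell_ignoring_sigsegv(const char *cmd)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* An ignored disposition is inherited across exec, so the
           shell starts life with SIGSEGV already ignored. */
        signal(SIGSEGV, SIG_IGN);
        execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        _exit(127);             /* exec failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a shell with the behavior reported in the post, a call such as run_shell_ignoring_sigsegv("ls -l /usr/sys") would be expected to hang with the child spinning rather than return. A shell that does not rely on SIGSEGV for memory allocation should simply run the command.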