jwp@uwmacc.UUCP (jeffrey w percival) (05/27/85)
In my previous posting I neglected to mention console messages...

For the large majority of crashes, there is no console message. Nothing at all. In those cases I type a <cntl/P> H and then reboot. For one of the crashes, though, there was this:

    ka6 = 7545
    aps = 141374
    pc = 343 ps = 110
    cpuerr = 160
    trap type 11
    panic: trap
    syncing disks...

This problem is really severe for us. It *does* seem to be related to CPU load; today I was doing a MAKEALL in /usr/src and crashed twice (the mysterious no-message type).

If you have any thoughts about this, even shots in the dark, please mail them to me.

--
Jeff Percival ...!uwvax!uwmacc!jwp
jwp@uwmacc.UUCP (jeffrey w percival) (05/29/85)
Our crashes haven't abated, but many responses have increased my awareness of the nature of our problem. Let me summarize:

We have an 11/70, and are a brand new 2.9 site. That is, we got the 2.9BSD tape, followed the installation guide, recompiled a local kernel, and have applied no patches of any kind to the distributed source files. Our changes to localopts.h were trivial; 2 DH's, etc. No changes in subtle (to me) things like stack size.

It appears that there's been a lot of 2.9 analysis and debugging, but because I'm new on the net (through this vax account), I've not had the benefit of any previous interactions. Some people have sent me mail mentioning "segmentation code", xp and ht driver bugs (we use both drivers), and so on, and I've asked a few respondents for more info, but has anyone maintained a list of the key fixes I have to make to keep from crashing two or three times a day?

Thanks for helping me on this!

--
Jeff Percival ...!uwvax!uwmacc!jwp
johanw@ttds.UUCP (Johan Wide'n) (05/31/85)
One thing to watch out for is a bad /etc/init. /etc/init as delivered to us did not reset several signals before execing other programs. This bug was pointed out by (ihnp4!inuxc!isrnix!greg)

>Received: by isrnix.UUCP; Wed, 13 Jun 84 18:03:38 EST
>To: BERKELEY!2bsd-people
>Subject: 2.9 problem
>
>
>    I'm having a problem with 2.9 - I am getting spurious signals sent
>to all the processes - the signal is signal 9 (I'm almost sure) and
>init (proc 1) does not get it (although everything else does). Has
>anyone else seen this? I have VFORK and MENLO_JCL turned on f.y.i.
>Any ideas?

---------------

>Received: by isrnix.UUCP; Thu, 14 Jun 84 18:11:15 EST
>To: BERKELEY!2bsd-people
>Subject: My earlier message about interrupts.
>
>
>    Found the culprit -
>
>    There is a bug in our distributed version of init.c. In dofork and
>runcom you should add the signal calls
>
>        signal(SIGINT, SIG_DFL);
>    and
>        signal(SIGTERM, SIG_DFL);
>
>Greg

So: check out your /usr/src/cmd/init.c. Here is a context diff:

*** init.c.org Wed May 18 20:54:04 1983
--- init.c Fri Apr 26 16:13:28 1985
***************
*** 201,206
      pid = fork();
      if(pid == 0) {
          open("/", 0);
          dup(0);
          dup(0);
--- 201,208 -----
      pid = fork();
      if(pid == 0) {
+         signal(SIGTERM, SIG_DFL);
+         signal(SIGINT, SIG_DFL);
          open("/", 0);
          dup(0);
          dup(0);
***************
*** 205,210
          dup(0);
          dup(0);
  #ifdef UCB_AUTOBOOT
          if ((howto & RB_SINGLE) || (howto & RB_NOFSCK))
              arg1 = "fastboot";
          else
--- 207,213 -----
          dup(0);
          dup(0);
  #ifdef UCB_AUTOBOOT
+         signal(SIGQUIT, SIG_DFL);
          if ((howto & RB_SINGLE) || (howto & RB_NOFSCK))
              arg1 = "fastboot";
          else
***************
*** 413,418
      pid = fork();
      if(pid == 0) {
          signal(SIGTERM, SIG_DFL);
          signal(SIGHUP, SIG_IGN);
          strcpy(tty, dev);
          strncat(tty, p->line, LINSIZ);
--- 416,425 -----
      pid = fork();
      if(pid == 0) {
          signal(SIGTERM, SIG_DFL);
+         signal(SIGINT, SIG_DFL);
+ #ifdef UCB_AUTOBOOT
+         signal(SIGQUIT, SIG_DFL);
+ #endif
          signal(SIGHUP, SIG_IGN);
          strcpy(tty, dev);
          strncat(tty, p->line, LINSIZ);

johanw@ttds
Johan Widen
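
For anyone wondering why the child has to reset the signals itself: a forked child starts with its parent's signal dispositions, and an ignored signal stays ignored even across an exec. The following is only a minimal sketch of that general rule, written against a present-day POSIX system. It is not the 2.9BSD init source, and the function name child_dies_on_sigterm is made up for illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo, not taken from init.c.
 * The parent ignores SIGTERM, forks, and the child sends SIGTERM to
 * itself.  If reset_in_child is nonzero the child first restores the
 * default dispositions, as the patch above does.
 * Returns 1 if the child was killed by SIGTERM, 0 if it survived and
 * exited normally, -1 on fork/wait failure.
 */
int child_dies_on_sigterm(int reset_in_child)
{
    void (*old)(int);
    pid_t pid;
    int status;

    old = signal(SIGTERM, SIG_IGN);     /* parent ignores SIGTERM */
    pid = fork();
    if (pid < 0) {
        signal(SIGTERM, old);
        return -1;
    }
    if (pid == 0) {
        if (reset_in_child) {
            signal(SIGTERM, SIG_DFL);   /* undo the inherited setting */
            signal(SIGINT, SIG_DFL);
        }
        raise(SIGTERM);                 /* fatal only with SIG_DFL */
        _exit(0);                       /* reached if SIGTERM was ignored */
    }
    signal(SIGTERM, old);               /* restore parent's disposition */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
}
```

Called with reset_in_child set, the child takes the default action and is terminated by the signal; called with it clear, the child inherits the ignore setting and carries on. The point is that whatever the parent has set up leaks into every child unless the child resets it before going on.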