gww@beatnix.UUCP (Gary Winiger) (03/13/88)
Subject: Previous bug report on abort(3) in error +Fix
Index: libc/gen/abort.c 4.3BSD

Description:
    I previously reported that when abort was called, there were
    certain cases in which it would continue with the user code
    before core was dumped.  Keith Bostic (bostic@ucbvax) kindly
    pointed out that there was probably some other error in the
    kernel handling of signals:

    ``A signal is sent as soon as the process is next scheduled to
    run.  The bug report assumed that there was a multiprocessor, so
    presumably the process could be running on one processor while
    the signal was being sent on another processor.  The problem is
    that the process was sending a signal to itself (that is the
    `abort' library routine sends the signal to the current process).
    It is semantically incorrect for a process to be running in user
    code and kernel code concurrently.  Therefore the process must
    have been stopped while it was doing the `kill' system call.
    Hence when it next ran in user code it would have to get the
    signal.  Clearly the Elxsi implementation subtly violates the
    semantics of UNIX!''

    Keith is clearly correct.  I have since found the problem.  The
    kernel was returning to the user code to allow the user to
    complete the `kill' system call before taking the action of the
    then pending signal.  My thanks to Keith for pointing this out.
Repeat-By:
Fix:
    Don't install the suggested sigpause.

Gary..
{ucbvax!sun,uunet,lll-lcc!lll-tis,amdahl!altos86,bridge2}!elxsi!gww

---------------------------------------------------------------------------
---------------------- ORIGINAL BUG REPORT --------------------------------
---------------------------------------------------------------------------
#Subject: abort(3) returns to user. +fix
#Index: libc/gen/abort.c 4.3BSD
#
#Description:
#    When abort is called, it may return to the user and continue
#    processing before core is dumped.
#Repeat-By:
#    Run a program that calls abort(3) on a multiprocessor system where
#    the kernel process is on a different cpu from the user process.
#    Examine the core dump stack trace with adb and notice that the
#    program has continued to run after abort was called.
#Fix:
#    It is possible for the user process calling abort() to continue
#    after the abort before the kernel process gets it stopped.
#    Add code to abort to wait for the kill signal to occur.
#
#    The attached code modification solves this problem at Elxsi.
#
#Gary..
#{ucbvax!sun,lll-lcc!lll-tis,amdahl!altos86,bridge2}!elxsi!gww
#--------- cut --------- snip --------- :.,$w diff -------------
#*** /tmp/,RCSt1001187 Thu Jun 18 17:56:07 1987
#--- abort.c Thu Jun 18 17:55:11 1987
#***************
#*** 1,5 ****
#--- 1,8 ----
#  /*
#   * $Log: abort.c,v $
#+  * Revision 1.2 87/06/18 17:54:12 gww
#+  * Guarantee abort doesn't return to user.
#+  *
#   * Revision 1.1 87/01/15 15:35:03 gww
#   * Initial revision
#   *
#***************
#*** 11,17 ****
#   */
#
#  #if defined(LIBC_SCCS) && !defined(lint)
#! static char *ERcsId = "$Header: abort.c,v 1.1 87/01/15 15:35:03 gww Exp $ ENIX BSD";
#  static char sccsid[] = "@(#)abort.c 5.3 (Berkeley) 3/9/86";
#  #endif LIBC_SCCS and not lint
#
#--- 14,20 ----
#   */
#
#  #if defined(LIBC_SCCS) && !defined(lint)
#! static char *ERcsId = "$Header: abort.c,v 1.2 87/06/18 17:54:12 gww Exp $ ENIX BSD";
#  static char sccsid[] = "@(#)abort.c 5.3 (Berkeley) 3/9/86";
#  #endif LIBC_SCCS and not lint
#
#***************
#*** 25,28 ****
#--- 28,32 ----
#      signal(SIGILL, SIG_DFL);
#      sigsetmask(~sigmask(SIGILL));
#      kill(getpid(), SIGILL);
#+     sigpause(~sigmask(SIGILL));
#  }