[comp.bugs.4bsd] Previous bug report on abort

gww@beatnix.UUCP (Gary Winiger) (03/13/88)

Subject: Previous bug report on abort(3) in error +Fix
Index:	libc/gen/abort.c 4.3BSD

Description:
	I previously reported that when abort was called, there were certain
	cases in which it would continue executing user code before core was dumped.
	Keith Bostic (bostic@ucbvax) kindly pointed out that there was probably
	some other error in the kernel handling of signals:

	``A signal is sent as soon as the process is next scheduled to run.
	  The bug report assumed that there was a multiprocessor, so presumably
	  the process could be running on one processor while the signal was
	  being sent on another processor.  The problem is that the process
	  was sending a signal to itself (that is, the `abort' library routine
	  sends the signal to the current process).  It is semantically
	  incorrect for a process to be running in user code and kernel code
	  concurrently.  Therefore the process must have been stopped while
	  it was doing the `kill' system call.  Hence when it next ran in user
	  code it would have to get the signal.  Clearly the Elxsi implementation
	  subtly violates the semantics of UNIX!''

	Keith is clearly correct.  I have since found the problem:  the kernel
	was returning to user code to let the process complete the `kill'
	system call before acting on the then-pending signal.
	My thanks to Keith for pointing this out.
Repeat-By:
Fix:
	Don't install the suggested sigpause; the error was in the Elxsi
	kernel's signal handling, not in abort(3).

Gary..
{ucbvax!sun,uunet,lll-lcc!lll-tis,amdahl!altos86,bridge2}!elxsi!gww

---------------------------------------------------------------------------
---------------------- ORIGINAL BUG REPORT --------------------------------
---------------------------------------------------------------------------
#Subject: abort(3) returns to user. +fix
#Index:	libc/gen/abort.c 4.3BSD
#
#Description:
#	When abort is called, it may return to the user and continue
#	processing before core is dumped.
#Repeat-By:
#	Run a program that calls abort(3) on a multiprocessor system where
#	the kernel process is on a different cpu from the user process.
#	Examine the core dump stack trace with adb and notice that the
#	program has continued to run after abort was called.
#Fix:
#	It is possible for the user process calling abort() to continue
#	after the abort before the kernel process gets it stopped.
#	Add code to abort to wait for the kill signal to occur.
#
#	The attached code modification solves this problem at Elxsi.
#
#Gary..
#{ucbvax!sun,lll-lcc!lll-tis,amdahl!altos86,bridge2}!elxsi!gww
#--------- cut --------- snip --------- :.,$w diff -------------
#*** /tmp/,RCSt1001187	Thu Jun 18 17:56:07 1987
#--- abort.c	Thu Jun 18 17:55:11 1987
#***************
#*** 1,5 ****
#--- 1,8 ----
#  /*
#   * $Log:	abort.c,v $
#+  * Revision 1.2  87/06/18  17:54:12  gww
#+  * Guarantee abort doesn't return to user.
#+  * 
#   * Revision 1.1  87/01/15  15:35:03  gww
#   * Initial revision
#   * 
#***************
#*** 11,17 ****
#   */
#  
#  #if defined(LIBC_SCCS) && !defined(lint)
#! static char *ERcsId = "$Header: abort.c,v 1.1 87/01/15 15:35:03 gww Exp $ ENIX BSD";
#  static char sccsid[] = "@(#)abort.c	5.3 (Berkeley) 3/9/86";
#  #endif LIBC_SCCS and not lint
#  
#--- 14,20 ----
#   */
#  
#  #if defined(LIBC_SCCS) && !defined(lint)
#! static char *ERcsId = "$Header: abort.c,v 1.2 87/06/18 17:54:12 gww Exp $ ENIX BSD";
#  static char sccsid[] = "@(#)abort.c	5.3 (Berkeley) 3/9/86";
#  #endif LIBC_SCCS and not lint
#  
#***************
#*** 25,28 ****
#--- 28,32 ----
#  	signal(SIGILL, SIG_DFL);
#  	sigsetmask(~sigmask(SIGILL));
#  	kill(getpid(), SIGILL);
#+ 	sigpause(~sigmask(SIGILL));
#  }