[comp.unix.wizards] trap causes longjmp botch

leo@philmds.UUCP (Leo de Wit) (06/12/88)

In article <11820@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
 [stuff deleted]
>Here is how I might write it:
>
>	case $# in
>	0)	echo "usage: $0 file [arguments to plot3d]" 1>&2; exit 1;;
>	esac
>
>	TEMP=/tmp/z$$			# make a unique temporary file name
>	/bin/rm -f $TEMP		# remove it if it exists
>	trap '/bin/rm -f $TEMP; exit' 0 1 2 3 15 # and again at exit or signal

    I used to write it that way myself ... until one day sh (/bin/sh, the
Bourne shell on an Ultrix machine) complained about a 'longjmp botch';
I don't remember whether there was a core dump.
    What is the problem with this trap? The normal signals, such as 1, 2, 3
and 15, are handled by signal handlers in the shell, installed with signal()
or sigvec() or whatever. Signal 0 isn't really a signal; the shell just uses
it to perform certain actions at closing time (putting chairs on tables,
removing any drunks left 8-). It could well have been implemented using
setjmp and longjmp (I dunno, I don't have the source); that could explain
the 'longjmp botch'. The last command in the trap is 'exit'. But when the
shell exits, it has to perform the trap! So another reason for the 'longjmp
botch' message could be the shell recursively trapping and exiting, each
calling the other, until it finally runs out of stack space.
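
    Whether a particular shell re-enters the exit trap can be checked by
counting how often the trap action runs. A minimal sketch, assuming a
POSIX-style sh; the function name count_exit_trap is made up for
illustration, and the subshell body ( ... ) keeps the trap and the exit
away from the calling shell:

```shell
# Hypothetical probe, not part of the original script.
count_exit_trap() (
    # Trap action that itself calls exit, like the trap quoted above.
    trap 'echo "trap ran"; exit' 0
    exit
)

# Each line of output corresponds to one run of the trap action.
count_exit_trap
```

More than one line of output would point to the recursion described above;
a single line means the shell does not re-run the exit trap from inside
itself.
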
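    Perhaps it is safer to reset the exit trap inside the trap action:
    trap '/bin/rm -f $TEMP; trap 0; exit' 0 1 2 3 15
In this case the exit trap (trap 0) is reset while the trap action runs,
before the final 'exit', so that 'exit' cannot trigger the trap again.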
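
    Put together with the quoted script, the whole pattern might look like
the sketch below. The function name cleanup_demo and the stand-in "work" are
assumptions for illustration; the subshell body ( ... ) is only there so
that the demonstration does not exit the calling shell.

```shell
# Sketch only: cleanup_demo is a hypothetical name, not from the posts above.
cleanup_demo() (
    TEMP=/tmp/z$$                  # temporary file name based on the shell's PID
    /bin/rm -f "$TEMP"             # remove it if it already exists

    # On exit (0) or on signals 1, 2, 3, 15: remove the file, reset the
    # exit trap so the following 'exit' cannot re-enter it, then exit.
    trap '/bin/rm -f "$TEMP"; trap 0; exit' 0 1 2 3 15

    echo "scratch data" > "$TEMP"  # stand-in for the real work
    echo "$TEMP"                   # print the name so a caller can check it
)
```

Calling cleanup_demo prints the temporary file name; once the function has
returned, that file should no longer exist.
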

    Leo.

chris@mimsy.UUCP (Chris Torek) (06/13/88)

In article <502@philmds.UUCP> leo@philmds.UUCP (Leo de Wit) writes:
>    Used to [use trap 'action' 0] myself that way too ... until one
>day the sh (/bin/sh; the Bourne shell on an Ultrix machine) complained
>about a 'longjmp botch'; I don't remember if there was a core dump.

Unless core dumps were otherwise disabled (an unwritable file `core' in the
current directory, `limit coredumpsize 0', etc.), there was.

>    What is the problem with this trap?  [explanation deleted]

The problem is that there is a bug in the 4.2BSD /bin/sh such that
`trap 0' outside of a script sometimes causes a longjmp to a stack
context that is no longer around.  Most likely this bug has been around
since V7, but is only caught now that longjmp unwinds the stack and
aborts if the frame is gone.

The bug does not occur when the trap is inside a script.

The bug remains in the 4.3BSD /bin/sh.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris