[comp.sys.isis] ISIS and signal handling on Unix V.3

garyb@conan.SanDiego.NCR.COM (Gary Boggs) (05/30/90)

[][][]

Hello,

I'm attempting to port ISIS (V1.3) to an NCR Tower which runs AT&T Unix
V.3 plus some Berkeleyisms. My problem is that my kernel is getting confused
by the ISIS threads package. When a signal (e.g. SIGCLD) comes in, the 
sendsig() routine that is supposed to build a stack frame for the signal 
catching function becomes baffled because the stack pointer is not in the 
normal stack region. It then gives up and aborts the process by calling exit().

Has anybody out there experienced this and found a solution? Any ideas?

Please respond by e-mail.

Thanks,

Gary Boggs
NCR Corp. E&M San Diego
gary.boggs@sandiego.ncr.com

ken@gvax.cs.cornell.edu (Ken Birman) (06/04/90)

In article <2731@ncr-sd.SanDiego.NCR.COM>
   gary.boggs@SanDiego.NCR.COM (Gary Boggs) writes:
>My [NCR Tower running ATT UNIX] kernel is getting confused by the
>ISIS threads package. When a signal (e.g. SIGCLD) comes in, the 
>sendsig() routine that is supposed to build a stack frame for the signal 
>catching function becomes baffled because the stack pointer is not in the 
>normal stack region. It then gives up and aborts the process by calling exit().
>
>Has anybody out there experienced this and found a solution? Any ideas?
>

This is a pretty typical problem for ISIS ports; other common ones are
that vfork() tends to die when called from an ISIS thread (no surprise;
vfork() doesn't know about ISIS stacks) and that on systems that use
setjmp/longjmp to implement context switching, longjmp will often check
the destination frame and refuse to jump other than "up" the stack.

In the case of AT&T UNIX, I thought that there was at least one threads
package that conforms to the POSIX threads definition and hence is certainly
solid enough for use in ISIS.  Presumably the people who built it would
have coded the signal stuff to work correctly.

Otherwise, you basically have two choices.  The first is to hack ISIS
so that it masks signals out except when running on the "system" stack;
this wouldn't actually be hard to do (a simple change to swtch() should
suffice) but would be sort of ugly.

swtch() is defined in clib/cl_task.c and in protos/pr_task.c.

The second approach would be to get a copy of the assembler code for the
routine in question and look for "range checks" on the stack pointer or
frame pointer register values.  Typically, aborts of this kind occur because
the routine has detected what it considers to be a fatal error, and if the
check is simply commented out, things will work fine.

You'll want to be especially careful of stack overflow if you code
any sort of significant signal handler, obviously.

Ken