garyb@conan.SanDiego.NCR.COM (Gary Boggs) (05/30/90)
[][][]
Hello,

I'm attempting to port ISIS (V1.3) to an NCR Tower which runs AT&T Unix V.3 plus some Berkeleyisms. My problem is that my kernel is getting confused by the ISIS threads package. When a signal (e.g. SIGCLD) comes in, the sendsig() routine that is supposed to build a stack frame for the signal catching function becomes baffled because the stack pointer is not in the normal stack region. It then gives up and aborts the process by calling exit().

Has anybody out there experienced this and found a solution? Any ideas? Please respond by e-mail.

Thanks,

Gary Boggs
NCR Corp. E&M San Diego
gary.boggs@sandiego.ncr.com

ken@gvax.cs.cornell.edu (Ken Birman) (06/04/90)
In article <2731@ncr-sd.SanDiego.NCR.COM> gary.boggs@SanDiego.NCR.COM (Gary Boggs) writes:
>My [NCR Tower running ATT UNIX] kernel is getting confused by the
>ISIS threads package. When a signal (e.g. SIGCLD) comes in, the
>sendsig() routine that is supposed to build a stack frame for the signal
>catching function becomes baffled because the stack pointer is not in the
>normal stack region. It then gives up and aborts the process by calling exit().
>
>Has anybody out there experienced this and found a solution? Any ideas?
>

This is a pretty typical problem for ISIS ports. Other common ones are that vfork() tends to die when called from an ISIS thread (no surprise; vfork() doesn't know about ISIS stacks) and that, on systems that use setjmp/longjmp to implement context switching, longjmp will often check the destination frame and refuse to jump other than "up" the stack.

In the case of AT&T UNIX, I thought that there was at least one threads package that conforms to the POSIX threads definition and hence is certainly solid enough for use in ISIS. Presumably the people who built it would have coded the signal stuff to work correctly.

Otherwise, you basically have two choices.

The first is to hack ISIS so that it masks signals out except when running on the "system" stack. This wouldn't actually be hard to do (a simple change to swtch() should suffice) but would be sort of ugly. swtch() is defined in clib/cl_task.c and in protos/pr_task.c.

The second approach would be to get a copy of the assembler code for the routine in question and look for "range checks" on the stack pointer or frame pointer register values. Typically, these types of aborts have to do with the routine detecting what it considers to be a fatal error, and if the check is simply commented out things will work fine. You'll want to be especially careful of stack overflow if you code any sort of significant signal handler, obviously.

Ken