rhl@grendel.Princeton.EDU (Robert Lupton (the Good)) (01/01/91)
A colleague has a 386 running Interactive's unix, and the LPI fortran compiler. He reported that a divide by zero in fortran crashes the system, so I wrote a trivial C programme to catch SIGFPE --- it works fine. So I wrote a stub to set the handler from fortran, and now it works fine ONCE --- if you run the fortran a second time it still crashes the system. I have no further ideas --- do any of you? Robert
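[A minimal sketch of the kind of stub described above, written against present-day POSIX rather than the 1991 ISC libraries. The entry point name `setfpe_` assumes the common convention of appending an underscore to Fortran-callable C symbols; the LPI compiler's actual convention is not given in the post. `fpe_exit` and `fpe_handler_installed` are hypothetical names used only for illustration.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler: report and exit rather than return. Returning from a handler
 * for a hardware-generated SIGFPE has undefined behaviour in POSIX. */
static void fpe_exit(int signo)
{
    static const char msg[] = "floating point exception caught\n";
    ssize_t n;

    (void)signo;
    /* write() and _exit() are async-signal-safe; printf() and exit() are not. */
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    _exit(1);
}

/* Hypothetical Fortran entry point: "CALL SETFPE" is assumed to resolve
 * to the C symbol setfpe_. Adjust the name for the compiler in use. */
void setfpe_(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fpe_exit;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGFPE, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Helper for checking: returns 1 if fpe_exit is the current SIGFPE handler. */
int fpe_handler_installed(void)
{
    struct sigaction cur;

    if (sigaction(SIGFPE, NULL, &cur) != 0)
        return 0;
    return cur.sa_handler == fpe_exit;
}
```

If the naming assumption holds, the Fortran side would call the stub once (CALL SETFPE) before doing any arithmetic.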
jackv@turnkey.tcc.com (Jack F. Vogel) (01/02/91)
In article <4983@idunno.Princeton.EDU> rhl@grendel.Princeton.EDU (Robert Lupton (the Good)) writes:
>A colleague has a 386 running Interactive's unix, and the LPI fortran
>compiler. He reported that a divide by zero in fortran crashes the
>system, so I wrote a trivial C programme to catch SIGFPE --- it works
>fine. So I wrote a stub to set the handler from fortran, and now it
>works fine ONCE --- if you run the fortran a second time it still
>crashes the system.

When will people learn that problem descriptions of the form "crashes
the system" are about as useful to a support person as "the car won't
run" would be to a mechanic :-}!

Seriously, you need to be more specific, what happens exactly? Does the
system panic and if so what is the panic message or type? Also, what
level of the system is this? What type hardware, etc, etc...

In any case, sounds like a fairly serious bug. If it panics, sounds like
a bug in the ISC trap code, a user application should just get a signal,
it should never panic the system. Of course, one might ask what the hell
someone is doing a divide-by-zero for anyway :-}, but that isn't meant as
an answer. I think someone at Interactive should take a close look at
this once you provide some more detail. I also have crossposted this
followup to sysv386 where it will definitely be seen.

Good Luck!

Disclaimer: I in no way speak for my employer, and certainly not ISC :-}.

--
Jack F. Vogel                      jackv@locus.com
AIX370 Technical Support              - or -
Locus Computing Corp.              jackv@turnkey.TCC.COM
src@scuzzy.in-berlin.de (Heiko Blume) (01/02/91)
rhl@grendel.Princeton.EDU (Robert Lupton (the Good)) writes:
>A colleague has a 386 running Interactive's unix, and the LPI fortran
>compiler. He reported that a divide by zero in fortran crashes the
>system, so I wrote a trivial C programme to catch SIGFPE --- it works
>fine. So I wrote a stub to set the handler from fortran, and now it
>works fine ONCE --- if you run the fortran a second time it still
>crashes the system.

either

/* POSIX signal facilities */
your_catcher() { printf("Division by zero\n"); }
main() { [...]
sigset(SIGFPE,your_catcher);
[...] }

or

/* plain signal facilities */
your_catcher() { signal(SIGFPE,your_catcher); printf("Division by zero\n"); }
main() { [...]
signal(SIGFPE,your_catcher);
[....] }

should do it. however, if you want to set/longjmp() with the POSIX
thing you must either engage sigrelse() or use sigsetjmp(2) and siglongjmp()
in the first place.
--
Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
public source archive [HST V.42bis]:
scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
uucp scuzzy!/src/README /your/home
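[A sketch of the sigsetjmp()/siglongjmp() approach mentioned above, using present-day POSIX sigaction() instead of sigset(). All function names here (`install_fpe_handler`, `guarded`, `faulting_op`, `harmless_op`, `fpe_seen`) are hypothetical. raise(SIGFPE) stands in for the faulting arithmetic because a floating-point division by zero usually yields infinity without trapping on current systems.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf fpe_env;
static volatile sig_atomic_t fpe_count;

static void on_fpe(int signo)
{
    (void)signo;
    fpe_count++;
    /* Jump back to the sigsetjmp() point. The signal mask saved there is
     * restored, so SIGFPE is unblocked again after the jump. */
    siglongjmp(fpe_env, 1);
}

/* Install the handler once; a sigaction() handler stays installed, so it
 * does not need to be re-armed inside the handler. */
int install_fpe_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fpe;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}

/* Run op(); return 0 if it completed, 1 if SIGFPE interrupted it. */
int guarded(void (*op)(void))
{
    if (sigsetjmp(fpe_env, 1) != 0)
        return 1;               /* arrived here via siglongjmp() */
    op();
    return 0;
}

/* Stand-in for a computation that faults. */
void faulting_op(void) { raise(SIGFPE); }

/* Stand-in for a computation that completes normally. */
void harmless_op(void) { }

/* Number of SIGFPE deliveries seen so far. */
int fpe_seen(void) { return (int)fpe_count; }
```

Calling guarded(faulting_op) repeatedly should keep returning 1, which is the "caught more than once" behaviour the two variants above aim for. With plain longjmp() the signal could stay blocked after the first delivery on systems that block it during the handler.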
buhrt@sawmill.uucp (Jeffery A Buhrt) (01/02/91)
>
>/* POSIX signal facilities */
>your_catcher() { printf("Division by zero\n"); }
>main() { [...]
>sigset(SIGFPE,your_catcher);
>[...] }
>
>/* plain signal facilities */
>your_catcher() { signal(SIGFPE,your_catcher); printf("Division by zero\n"); }
>main() { [...]
>signal(SIGFPE,your_catcher);
>[....] }
>
>should do it. however, if you want to set/longjmp() with the POSIX
>thing you must either engage sigrelse() or use sigsetjmp(2) and siglongjmp()
>in the first place.
> Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
>

Should work ... kind of...
as long as you are not on a '386 (as the original article asked).
The above code is very correct and works fine on all systems that I know
of except a 386/387 system.

The problem is probably not so much one divide by zero as it is
8 and 9 (the 387 stack is 8 deep): after eight divide by zero calls
the 387 stack is full and the next (or later) 387 call will die.

I do NOT have ISC and can't say if maybe the user->kernel switch after the
code exits doesn't reset his 387 stack (I assume if the system does
panic it would not be an emulator doing it).

I don't have a portable solution yet though. If you have a '387 and/or a
FULL 387 emulator you can call fsave and frstor to reset the 387 FPU stack,
but most of the default emulators (ex: Esix Rev. D) do not implement all
the instructions.

What is the proper way to reset the 387's stack? When you get an FPE the
stack is not reset, and eventually it overwrites the rest of user memory
and/or generates random FPU faults.

Below is a shar file of a stripped down version of Sc that causes a
387/387 emulator stack overflow, as far as I know on any 386/unix system.

Basically what happens:
	signal(SIGFPE, eval_fpe);
	...divide by 0 (8 times)
	signal(SIGFPE, quit_fpe);
	...do some other floating point operation
	FPU stack overflow

This code dies the same way on:
	Esix rev. D: cc, gcc
	and a Sequent Symmetry: cc, atscc, and gcc (all tested in BSD and USG)
with the message:
	"quit_fpe() not called in EvalAll() (STACK TRASHED)"
(which comes from a doprnt in sprintf in update())

1) unshar
2) pick the correct options in the Makefile for your system
3) make;fpedie
-- now turn on the fsave/frstor fix
5) If you are not on Esix, #define I387 in interp.c
6) make;fpedie
	-this works fine on the Symmetry, except on Esix:
	a) The 'struct fpusave' is part of the <sys/user.h> structure
	b) per the RevD manual page 7-6: FSAVE, FRSTOR are not defined

						-Jeff Buhrt
317-477-6000 {sequent, tippy.cs.purdue.edu}!sawmill!buhrt

#!/bin/sh
# This is a shell archive (shar 3.20)
# made 12/21/1990 19:03 UTC by buhrt@sawmill
# Source directory /users/buhrt/src/sc/z3/t/ok
#
# existing files WILL be overwritten
#
# This shar contains:
# length mode name
# ------ ---------- ------------------------------------------
#   2009 -rw-r--r-- Makefile
#   1716 -rw-r--r-- interp.c
#   1604 -rw-r--r-- sc.c
#    972 -rw-rw-r-- sc.h
#    964 -rw-r--r-- vmtbl.c
#
if touch 2>&1 | fgrep '[-amc]' > /dev/null
 then TOUCH=touch
 else TOUCH=true
fi
# ============= Makefile ==============
echo "x - extracting Makefile (Text)"
sed 's/^X//' << 'SHAR_EOF' > Makefile &&
X# Set SIGVOID if signal routines are type void. System 5.3, SunOS 4.X,
X# VMS and ANSI C Compliant systems use this. Most BSD systems and the
X# UNIXPC 'cc' do not.
X#SIGVOID=-DSIGVOID
XSIGVOID=
X
X# Set IEEE_MATH if you need setsticky() calls in your signal handlers
X#
X#IEEE_MATH=-DIEEE_MATH
XIEEE_MATH=
X
X# flags for lint
XLINTFLAGS=-abchxv
X
X# For ULTRIX: define the BSD4.2 section and SIGVOID above
X# tdw@cl.cam.ac.uk tested on Ultrix 3.1C-0
X
X# Use this for system AIX V3.1
X#CFLAGS= -O -DSYSV2 -DCHTYPE=int -DNLS
X#LDFLAGS=
X#LIB=
X
X# Use this for system V.2
X#CFLAGS= -O -DSYSV2
X#LDFLAGS=
X#LIB=
X
X# Use this for system V.3
X#CFLAGS= -O -DSYSV3
X#LDFLAGS=
X#LIB=
X
X# Microport
X#CFLAGS= -DSYSV2 -O -DUPORT -Ml
X#LDFLAGS=-Ml
X#LIB=
X
X# Use this for BSD 4.2
X#CFLAGS= -O -DBSD42
X#LDFLAGS=
X#LIB=
X
X# Use this for Sequent boxes
X#CC=atscc
XCC=gcc
XCFLAGS=-g -DBSD42
X#LDFLAGS= -s
XLDFLAGS= -g
XLIB=-ldmalloc
XPSCLIB=
X
X# Use this for BSD 4.3
X#CFLAGS= -O -DBSD43
X#LDFLAGS=
X#LIB=
X
X# Use this for SunOS 4.X if you have the System V package installed.
X# This will link with the System V curses which is preferable to the
X# BSD curses (especially helps scrolling on slow (9600bps or less)
X# serial lines).
X#
X# Be sure to define SIGVOID and RE_COMP above.
X#
X#CC=/usr/5bin/cc
X#CFLAGS= -O -DSYSV3
X#LDFLAGS=
X#LIB=
X
X# Use this for system III (XENIX)
X#CFLAGS= -O -DSYSIII
X#LDFLAGS= -i
X#LIB=
X
X# Use this for VENIX
X#CFLAGS= -DVENIX -DBSD42 -DV7
X#LDFLAGS= -z -i
X#LIB=
X
X# For SCO Unix V rel. 3.2.0
X# -compile using rcc, cc does not cope with gram.c
X# -edit /usr/include/curses.h, rcc does not understand #error
X# -link: make CC=cc, rcc's loader gets unresolved __cclass, __range
X# (rather strange,?)
X#CC=rcc
X#SIGVOID=-DSIGVOID
X#CFLAGS= -O -DSYSV3
X#LDFLAGS=
X#LIB=
X
X# The objects
XOBJS=sc.o interp.o vmtbl.o
X
Xfpedie:$(PAR) $(OBJS)
X	$(CC) ${CFLAGS} ${LDFLAGS} ${OBJS} ${LIB} -o fpedie
X
Xsc.o: sc.h sc.c
X	$(CC) ${CFLAGS} ${SIGVOID} -c sc.c
X
Xinterp.o: interp.c sc.h
X	$(CC) ${CFLAGS} ${IEEE_MATH} ${SIGVOID} -c interp.c
X
SHAR_EOF
$TOUCH -am 1221140290 Makefile &&
chmod 0644 Makefile ||
echo "restore of Makefile failed"
set `wc -c Makefile`;Wc_c=$1
if test "$Wc_c" != "2009"; then
	echo original size 2009, current size $Wc_c
fi
# ============= interp.c ==============
echo "x - extracting interp.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > interp.c &&
X/*#define I387 /* HERE is the bigee */
X
X#ifdef aiws
X#undef _C_func /* Fixes for undefined symbols on AIX */
X#endif
X
X#ifdef I387
X#include <sys/types.h>
X#include <i386/fpu.h>
Xstruct fpusave fpu_buf;
X#endif /* I387 */
X
X#ifdef IEEE_MATH
X#include <ieeefp.h>
X#endif /* IEEE_MATH */
X
X#include <signal.h>
X#include <setjmp.h>
X#include <stdio.h>
X
Xextern int errno; /* set by math functions */
X
X#include "sc.h"
X
Xjmp_buf fpe_save;
X
Xint quit_fpe();
X
Xdouble eval();
X
X#ifdef SIGVOID
Xvoid
X#else
Xint
X#endif
Xeval_fpe() /* Trap for FPE errors in eval */
X{
X	fputs("eval_fpe called\n", stderr);
X/* not sure if needed since we do a frstor */
X/*
X#ifdef i386
X	asm(" fnclex");
X	asm(" fwait");
X#else
X*/
X#ifdef IEEE_MATH
X	(void)fpsetsticky((fp_except)0); /* Clear exception */
X#endif /* IEEE_MATH */
X#ifdef PC
X	_fpreset();
X#endif
X/*#endif /* from #ifdef i386*/
X
X#ifdef I387
X	fputs("fpe_save\n", stderr);
X	asm(" frstor _fpu_buf ");
X#endif /* I387 */
X	/* re-establish signal handler for next time */
X	(void) signal(SIGFPE, eval_fpe);
X	longjmp(fpe_save, 1);
X}
X
Xdouble
Xeval(e)
Xregister struct enode *e;
X{
X	double denom;
X	denom = (double)0;
X	return((double)1/denom);
X}
X
X
X#ifdef SIGVOID
Xvoid
X#else
Xint
X#endif
Xquit_fpe()
X{
X	fputs("quit_fpe() not called in EvalAll() (STACK TRASHED)\n", stderr);
X	abort(); /* what might be left */
X	exit(1);
X}
X
Xvoid
XEvalAll () {
X	int i;
X	struct ent *p;
X
X	(void) signal(SIGFPE, eval_fpe);
X#ifdef I387
X	fputs("fsave", stderr);
X	asm(" fsave _fpu_buf ");
X	asm(" frstor _fpu_buf ");
X#endif /* I387 */
X
X	for (i=0; i<8; i++)
X		if (p = *ATBL(tbl,1,0))
X		{	double v;
X
X			if (setjmp(fpe_save)) {
X				v = (double)0.0;
X			} else {
X				v = eval (p->expr);
X			}
X		}
X
X	(void) signal(SIGFPE, quit_fpe);
X}
SHAR_EOF
$TOUCH -am 1221140290 interp.c &&
chmod 0644 interp.c ||
echo "restore of interp.c failed"
set `wc -c interp.c`;Wc_c=$1
if test "$Wc_c" != "1716"; then
	echo original size 1716, current size $Wc_c
fi
# ============= sc.c ==============
echo "x - extracting sc.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > sc.c &&
X/* SC A Spreadsheet Calculator
X * Main driver
X *
X * original by James Gosling, September 1982
X * modifications by Mark Weiser and Bruce Israel,
X * University of Maryland
X *
X * More mods Robert Bond, 12/86
X * More mods by Alan Silverstein, 3-4/88, see list of changes.
X * Currently supported by sequent!sawmill!buhrt (Jeff Buhrt)
X * $Revision: 6.12 $
X *
X */
X
X#include <stdio.h>
X#include "sc.h"
X
X#ifdef SYSV3
Xvoid exit();
X#endif
X
X/* Globals defined in sc.h */
Xstruct ent ***tbl;
X
Xchar line[FBUFLEN];
X
Xvoid update();
X
Xstruct enode *
Xnew_const(op, a1)
Xint op;
Xdouble a1;
X{
X	register struct enode *p;
X	p = (struct enode *) malloc ((unsigned)sizeof (struct enode));
X	p->op = op;
X	p->e.k = a1;
X	return p;
X}
X
X/* return a pointer to a cell's [struct ent *], creating if needed */
Xstruct ent *
Xlookat(row,col)
Xint row, col;
X{
X	register struct ent **pp;
X
X	pp = ATBL(tbl, row, col);
X	if (*pp == (struct ent *)0) {
X		*pp = (struct ent *) malloc((unsigned)sizeof(struct ent));
X		(*pp)->expr = new_const(O_CONST, (double)4);
X		(*pp)->v = (double) 0.0;
X	}
X	return(*pp);
X}
X
Xvoid
Xupdate ()
X{
X	struct ent *p1;
X
X	if (p1 = *ATBL(tbl, 0, 0))
X		(void) sprintf (line, "%.15g", p1 -> v);
X}
X
Xint
Xmain (argc, argv)
Xint argc;
Xchar **argv;
X{
X	/* setup the spreadsheet arrays, initscr() will get the screen size */
X	if (!growtbl(0, 0, 0))
X	{	exit(1);
X	}
X	lookat(0, 0);
X	lookat(1, 0);
X	lookat(2, 0);
X	lookat(3, 0);
X	lookat(4, 0);
X	lookat(5, 0);
X	lookat(6, 0);
X	lookat(7, 0);
X	lookat(8, 0);
X	EvalAll();
X	update();
X	EvalAll();
X
X	exit(0);
X}
SHAR_EOF
$TOUCH -am 1221140290 sc.c &&
chmod 0644 sc.c ||
echo "restore of sc.c failed"
set `wc -c sc.c`;Wc_c=$1
if test "$Wc_c" != "1604"; then
	echo original size 1604, current size $Wc_c
fi
# ============= sc.h ==============
echo "x - extracting sc.h (Text)"
sed 's/^X//' << 'SHAR_EOF' > sc.h &&
X/* SC A Table Calculator
X * Common definitions
X *
X * original by James Gosling, September 1982
X * modified by Mark Weiser and Bruce Israel,
X * University of Maryland
X * R. Bond 12/86
X * More mods by Alan Silverstein, 3-4/88, see list of changes.
X * $Revision: 6.12 $
X *
X */
X
X#define ATBL(tbl, row, col) (*(tbl + row) + (col))
X#define FBUFLEN 1024 /* buffer size for a single field */
X
Xstruct ent_ptr {
X	int vf;
X	struct ent *vp;
X};
X
X/* info for each cell, only alloc'd when something is stored in a cell */
Xstruct ent {
X	double v; /* v && label are set in EvalAll() */
X	struct enode *expr; /* cell's contents */
X	short flags;
X};
X
X/* stores type of operation this cell will preform */
Xstruct enode {
X	int op;
X	union {
X		double k; /* constant # */
X		struct ent_ptr v; /* ref. another cell */
X	} e;
X};
X
X/* op values */
X#define O_CONST 'k'
X
Xextern struct ent ***tbl; /* data table ref. in vmtbl.c and ATBL() */
X
X#define FALSE 0
X#define TRUE 1
SHAR_EOF
$TOUCH -am 1221140290 sc.h &&
chmod 0664 sc.h ||
echo "restore of sc.h failed"
set `wc -c sc.h`;Wc_c=$1
if test "$Wc_c" != "972"; then
	echo original size 972, current size $Wc_c
fi
# ============= vmtbl.c ==============
echo "x - extracting vmtbl.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > vmtbl.c &&
X# include <stdio.h>
X# include "sc.h"
X
Xextern char *malloc();
Xextern char *realloc();
X
X
X/*
X * grow the main && auxiliary tables (reset maxrows/maxcols as needed)
X * toprow &&/|| topcol tell us a better guess of how big to become.
X * we return TRUE if we could grow, FALSE if not....
X */
Xint
Xgrowtbl(rowcol, toprow, topcol)
Xint rowcol;
Xint toprow, topcol;
X{
X	struct ent ** nullit;
X	struct ent *** tnullit;
X	int maxrows, maxcols;
X	int cnt;
X	int i;
X
X	maxrows = maxcols = 20;
X	tbl = (struct ent ***)malloc((unsigned)(maxrows*sizeof(struct ent **)));
X	for(tnullit = tbl, cnt = 0; cnt < maxrows; cnt++, tnullit++)
X		*tnullit = (struct ent **)NULL;
X
X	/* fill in the bottom of the table */
X	for (i = 0; i < maxrows; i++)
X	{	if ((tbl[i] = (struct ent **)malloc((unsigned)(maxcols *
X				sizeof(struct ent **)))) == (struct ent **)0)
X		{	return(FALSE);
X		}
X		for(nullit = tbl[i], cnt = 0; cnt < maxcols; cnt++, nullit++)
X			*nullit = (struct ent *)NULL;
X	}
X
X	return(TRUE);
X}
SHAR_EOF
$TOUCH -am 1221140290 vmtbl.c &&
chmod 0644 vmtbl.c ||
echo "restore of vmtbl.c failed"
set `wc -c vmtbl.c`;Wc_c=$1
if test "$Wc_c" != "964"; then
	echo original size 964, current size $Wc_c
fi
exit 0
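[The 8-deep stack argument in the post above can be illustrated with a small software model. This is only a simplified sketch of the 387 register stack, not code that touches the real FPU: `struct x87_model` and the `model_*` functions are hypothetical, and the real chip's reaction to overflow depends on whether the invalid-operation exception is masked.]

```c
#include <assert.h>
#include <string.h>

#define X87_REGS 8

struct x87_model {
    int top;                    /* index of the current stack top     */
    int in_use[X87_REGS];       /* 1 = register holds a value, 0 = empty */
};

/* Like FNINIT: mark every register empty and reset the top pointer. */
void model_init(struct x87_model *m)
{
    memset(m, 0, sizeof *m);
}

/* Like FLD: returns 0 on success, or -1 if the register that would become
 * the new top is still occupied, which is the stack-overflow condition. */
int model_push(struct x87_model *m)
{
    int slot = (m->top + X87_REGS - 1) % X87_REGS;

    if (m->in_use[slot])
        return -1;
    m->in_use[slot] = 1;
    m->top = slot;
    return 0;
}

/* Like FSTP: frees the top register and moves the top pointer back. */
void model_pop(struct x87_model *m)
{
    assert(m->in_use[m->top]);
    m->in_use[m->top] = 0;
    m->top = (m->top + 1) % X87_REGS;
}

/* One evaluation that is abandoned after loading an operand, as when a
 * signal handler longjmp()s away before the result is stored and popped. */
int aborted_eval(struct x87_model *m)
{
    return model_push(m);       /* deliberately no matching model_pop() */
}

/* Count how many abandoned evaluations fit before a push overflows. */
int evals_until_overflow(struct x87_model *m)
{
    int n = 0;

    while (aborted_eval(m) == 0)
        n++;
    return n;
}
```

In this model each abandoned evaluation leaks one register, so eight of them fill the stack and the next load fails until something like model_init() (the analogue of reinitialising or restoring the FPU state) is applied.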
src@scuzzy.in-berlin.de (Heiko Blume) (01/03/91)
buhrt@sawmill.uucp (Jeffery A Buhrt) writes:
>>[signal handling stuff]

>Should work ... kind of...
>as long as you are not on a '386 (as the original article asked).
>The above code is very correct and works fine on all systems that I know
>of except a 386/387 system.

oh, well. at least it solves the original problem, that the signal wasn't
caught after the first occurrence :-)

>The problem is probably not so much one divide by zero as it is
>8 and 9 (the 387 stack is 8 deep), after eight divide by zero calls
>the 387 stack is full and the next (or later) 387 call will die.

>I do NOT have ISC and can't say if maybe the user->kernel switch after the
>code exits doesn't reset his 387 stack (I assume if the system does
>panic it would not be an emulator doing it).

i tried your fpedie on interactive 2.2.1 without a 387:

eval_fpe called
eval_fpe called
eval_fpe called
eval_fpe called
eval_fpe called
eval_fpe called
eval_fpe called
eval_fpe called
quit_fpe() not called in EvalAll() (STACK TRASHED)
IOT trap (core dumped)

not a panic at least. the #define I387 failed because there is no
struct fpusave in the header files. i kludged around a bit with _fpstate
but got only segmentation faults.

anyway, the <ieeefp.h> says "When a signal handler catches a FPE, it will
have a freshly initialized coprocessor (really?). [...] it gets a single
parameter of type struct _fpustackframe. [...] By modifying it, the state
of the coprocessor (and emulator?) can be changed upon return to the
main task."

since i don't know sh*t about the 387 innards i can't investigate this,
but it sounds like the right way to try.
--
Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
public source archive [HST V.42bis]:
scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
uucp scuzzy!/src/README /your/home
rhl@grendel.Princeton.EDU (Robert Lupton (the Good)) (01/04/91)
Let me defend my reputation. I started this thread, and apparently
wasn't careful enough. My problem was not that I didn't reinstall the
signal handler (I actually called exit instead of reinstalling it, but I
would have called signal() if the handler returned). The trouble was the
sequence
	for(;;) {
		Reboot machine
		run fortran code that divide checked with my handler
			(all OK so far. The programme exited. Back at the shell)
		run fortran code that divide checked with my handler
			(now the machine crashes)
	}
So the problem is not missing handlers, and I don't think that it is
this 8-deep stack on the '387 either; unless the fortrash handlers are
worse than I think, I doubt that they manage to generate an extra 7
divide checks while trying to clean up.
Robert
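[The run-it-twice sequence above can be scripted so that the two runs are compared mechanically. The following is a rough sketch in present-day POSIX C; it cannot reproduce a kernel crash, it only shows what two healthy runs should look like. `run_once`, `child_body`, `fpe_exit`, and the exit code 42 are hypothetical, and raise(SIGFPE) stands in for the Fortran divide check.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler that exits instead of returning, as described in the post. */
static void fpe_exit(int signo)
{
    (void)signo;
    _exit(42);                  /* arbitrary marker meaning "handler ran" */
}

/* Child body: install the handler, then trigger the signal. */
static void child_body(void)
{
    signal(SIGFPE, fpe_exit);
    raise(SIGFPE);
    _exit(0);                   /* reached only if the handler did not run */
}

/* Run the faulting child once. Returns its exit status, or -1 if it was
 * killed by a signal or could not be started. */
int run_once(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        child_body();           /* never returns */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a system without the reported problem, both the first and the second call to run_once() should report that the handler ran; a difference between the two runs would point at state surviving the first process exit.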