bferguso@ics.uci.edu (B. Ferguson) (08/05/89)
We have a FORTRAN program which runs correctly under VMS, UNICOS, and
Stellix.  When we run it on a DECstation 3100 we get the message:

    grow failed because stack limit exceeded, pid xxxxx, proc xxx
    ts = 0xc1, ds = 0xdc7, ss = 0x2f

The program then runs for a few more minutes before crashing with:

    Segmentation fault (core dumped)

Can anyone explain this message, and the possible cause?  We have 65mb of
swap space on local disk, and the machine is otherwise treated as diskless.
Thanks in advance.

Bruce.
graham@fuel.dec.com (kris graham) (08/05/89)
>We have a FORTRAN program which runs correctly under VMS, UNICOS, and Stellix.
>When we run it on a Decstation 3100 we get the message:
>grow failed because stack limit exceeded, pid xxxxx, proc xxx
>ts = 0xc1, ds = 0xdc7, ss = 0x2f
>The program then runs for a few more minutes before crashing with:
>Segmentation fault (core dumped)

Typing "limit" at the shell (DECstation 3100 UWS 2.1) shows the following:

    csh> limit
    cputime         unlimited
    filesize        unlimited
    datasize        85988 kbytes
    stacksize       512 kbytes
    coredumpsize    unlimited
    memoryuse       12744 kbytes

So, why limit your stacksize ;-) ?

    csh> limit stacksize unlimited

(or, you can set your own limit, like 2048, 4096, 8192, or unlimited)

is the one way to 'liberate' your Fortran program....unless something
weird is going on.
--
Christopher Graham
Digital Equipment Corp
Ultrix Resource Center
2 Penn Plaza
New York City
kent@ticipa.ti.com (Russell Kent) (08/10/89)
From article <20434@paris.ics.uci.edu>, by bferguso@ics.uci.edu (B. Ferguson):
>We have a FORTRAN program which runs correctly under VMS, UNICOS, and Stellix.
>When we run it on a Decstation 3100 we get the message:
>grow failed because stack limit exceeded, pid xxxxx, proc xxx
>ts = 0xc1, ds = 0xdc7, ss = 0x2f
>The program then runs for a few more minutes before crashing with:
>Segmentation fault (core dumped)

Kris Graham responds:
From article <1425@riscy.dec.com>, by graham@fuel.dec.com (kris graham):
> Typing "limit" at the shell (DECstation 3100 UWS 2.1) shows the following
>
>     csh> limit
>     cputime         unlimited
>     filesize        unlimited
>     datasize        85988 kbytes
>     stacksize       512 kbytes
>     coredumpsize    unlimited
>     memoryuse       12744 kbytes
>
> So, why limit your stacksize ;-) ?
>
>     csh> limit stacksize unlimited
>
> (or, you can set your own limit like 2048, 4096, 8192, unlimited)
>
> is the one way to 'liberate' your Fortran program....unless something
> weird is going on.

(Disclaimer: This is from memory, and I've killed millions of brain cells
since I first heard of it.)

There is a heuristic in the kernel's trap code for out-of-bounds memory
references.  This code tries to distinguish references "just below" the
top of the stack from any other references, so that the kernel can
automatically grow the process's stack when, for example, you make deep
function calls.  People who use FORTRAN (usually crystallography weenies
or pipe stress freaks ;-) typically have rather large arrays of numbers
allocated as local variables (and therefore placed on the stack in one
fell swoop).  Suddenly there may be a reference to a variable (or array
position) that is not "acceptably close" to the current top of stack.
The heuristic decides that this reference is an out-of-control pointer
reference and triggers a segmentation fault.  (Heuristics are "rules of
thumb" and don't always work, as in this case.)

So, how do you fix this?
Increasing your stack size probably _won't_ help.  You could use smaller
arrays.  You could use heap allocated (read: malloc()) arrays.  You could
recompile the fragment of the kernel that does this to increase its idea
of "acceptably close", although this isn't always possible (unless you're
Chris Torek or George Robbins ;-).  You could put an "initialization"
loop as _THE_VERY_FIRST_ACTION_ in the offending routine(s) that steps
through the array to slowly grow the stack.  It isn't necessary to
reference every array element, just enough to keep the growth chunks
down to reasonable sizes.  Example:

    FFT_invert (pa, pb)
    int pa[], pb[];
    {
    #define SCRATCH_SIZE 102400
        register int i;
        int scratch_array[SCRATCH_SIZE];

        /* Work around stack-growth heuristic in 16K chunks */
        for (i = 0; i < SCRATCH_SIZE; i += 16384 / sizeof(int)) {
            scratch_array[i] = 0;
        }
        :
        : The remainder of the function.
        :
    }

Of course, if you don't have source for the offending function(s), this
is not easy to do :-).  Hope I remembered this right!
--
Russell Kent                      UUCP: convex!smu!\
Texas Instruments                       sun!texsun! ti-csl!tifsil!kent
PO Box 655012   M/S 3635                ut-sally!im4u!/
Dallas, TX  75265
Voice: (214) 995-3501             TI-MSG: RAK9