bferguso@ics.uci.edu (B. Ferguson) (08/05/89)
We have a FORTRAN program which runs correctly under VMS, UNICOS, and
Stellix.  When we run it on a DECstation 3100 we get the message:

    grow failed because stack limit exceeded, pid xxxxx, proc xxx
    ts = 0xc1, ds = 0xdc7, ss = 0x2f

The program then runs for a few more minutes before crashing with:

    Segmentation fault (core dumped)

Can anyone explain this message, and the possible cause?  We have 65mb of
swap space on local disk, and the machine is otherwise treated as diskless.
Thanks in advance.

Bruce.
graham@fuel.dec.com (kris graham) (08/05/89)
>We have a FORTRAN program which runs correctly under VMS, UNICOS, and Stellix.
>When we run it on a Decstation 3100 we get the message:
>grow failed because stack limit exceeded, pid xxxxx, proc xxx
>ts = 0xc1, ds = 0xdc7, ss = 0x2f
>The program then runs for a few more minutes before crashing with:
>Segmentation fault (core dumped)

Typing "limit" at the shell (DECstation 3100 UWS 2.1) shows the following:

    csh> limit
    cputime         unlimited
    filesize        unlimited
    datasize        85988 kbytes
    stacksize       512 kbytes
    coredumpsize    unlimited
    memoryuse       12744 kbytes

So, why limit your stacksize ;-) ?

    csh> limit stacksize unlimited

(or, you can set your own limit, like 2048, 4096, 8192, or unlimited)

is the one way to 'liberate' your Fortran program....unless something
weird is going on.
--
Christopher Graham
Digital Equipment Corp
Ultrix Resource Center
2 Penn Plaza
New York City
kent@ticipa.ti.com (Russell Kent) (08/10/89)
From article <20434@paris.ics.uci.edu>, by bferguso@ics.uci.edu (B. Ferguson):
>We have a FORTRAN program which runs correctly under VMS, UNICOS, and Stellix.
>When we run it on a Decstation 3100 we get the message:
>grow failed because stack limit exceeded, pid xxxxx, proc xxx
>ts = 0xc1, ds = 0xdc7, ss = 0x2f
>The program then runs for a few more minutes before crashing with:
>Segmentation fault (core dumped)

Kris Graham responds:
From article <1425@riscy.dec.com>, by graham@fuel.dec.com (kris graham):
> Typing "limit" at the shell (DECstation 3100 UWS 2.1) shows the following
>
>     csh> limit
>     cputime         unlimited
>     filesize        unlimited
>     datasize        85988 kbytes
>     stacksize       512 kbytes
>     coredumpsize    unlimited
>     memoryuse       12744 kbytes
>
> So, why limit your stacksize ;-) ?
>
>     csh> limit stacksize unlimited
>
> (or, you can set your own limit like 2048, 4096, 8192, unlimited)
>
> is the one way to 'liberate' your Fortran program....unless something
> weird is going on.

(Disclaimer: This is from memory, and I've killed millions of brain cells
since I first heard of it.)

There is a heuristic in the kernel's trap code for out-of-bounds memory
references.  This code tries to distinguish references "just below" the
top of the stack from any other references, so that the kernel can
automatically grow the process's stack when, for example, you make deep
function calls.  People who use FORTRAN (usually crystallography weenies
or pipe stress freaks ;-) typically have rather large arrays of numbers
allocated as local variables (and therefore placed on the stack in one
fell swoop).  Suddenly there may be a reference to a variable (or array
position) that is not "acceptably close" to the current top of stack.
The heuristic decides that this reference is an out-of-control pointer
reference and triggers a segmentation fault.  (Heuristics are "rules of
thumb" and don't always work, as in this case.)

So, how do you fix this?
Increasing your stack size probably _won't_ help.  You could use smaller
arrays.  You could use heap allocated (read: malloc()) arrays.  You could
recompile the fragment of the kernel that does this to increase its idea
of "acceptably close", although this isn't always possible (unless you're
Chris Torek or George Robbins ;-).  You could put an "initialization"
loop as _THE_VERY_FIRST_ACTION_ in the offending routine(s) that steps
through the array to slowly grow the stack.  It isn't necessary to
reference every array element, just enough to keep the growth chunks
down to reasonable sizes.  Example:

    FFT_invert (pa, pb)
    int pa[], pb[];
    {
    #define SCRATCH_SIZE 102400
        register int i;
        int scratch_array[SCRATCH_SIZE];

        /* Work around stack-growth heuristic in 16K chunks */
        for (i = 0; i < SCRATCH_SIZE; i += 16384 / sizeof(int)) {
            scratch_array[i] = 0;
        }
        :
        : The remainder of the function.
        :
    }

Of course, if you don't have source for the offending function(s), this
is not easy to do :-).  Hope I remembered this right!
--
Russell Kent                      UUCP: convex!smu!\
Texas Instruments                       sun!texsun! ti-csl!tifsil!kent
PO Box 655012   M/S 3635                ut-sally!im4u!/
Dallas, TX  75265
Voice: (214) 995-3501             TI-MSG: RAK9