[comp.lang.c] Question on large arrays in C

jwp@uwmacc.UUCP (02/12/87)

I am running 4.3BSD on a MicroVax II.
I have a simple program:

----------
#include <stdio.h>
#define N 20480
main()
{
	double x[N];
	double y_1[N];
	double y_2[N];
	double y_3[N];
	double y_4[N];
	char *l = "";

	fprintf(stdout, "hi\n");
	exit(0);
}
----------

When I run it, I get "Segmentation violation".
dbx reports the violation to occur on the "char *l" line.
If I move the 5 array declarations up above main(),
under the define statement, the program works OK.
What is wrong with the program as listed above?
-- 
	Jeff Percival ...!uwvax!uwmacc!sal70!jwp or ...!uwmacc!jwp

chris@mimsy.UUCP (02/12/87)

In article <1051@uwmacc.UUCP> jwp@uwmacc.UUCP (Jeffrey W Percival) writes:
>I am running 4.3BSD on a MicroVax II. ...

with array declarations that require 20480*5*8 = 819200 bytes.

>When I run it, I get "Segmentation violation".

>If I move the 5 array declarations up above main(),
>under the define statement, the program works OK.
>What is wrong with the program as listed above?

Nothing.  This is a `feature' of a large virtual address space with
demand paging.  The system cannot tell whether many stack pointer
alterations are proper, so it uses a heuristic.  If the stack
pointer has been moved less than 512 kilobytes, the stack allocation
is a controlled one and is allowed.  If it has moved by more than
512K, it is assumed to be accidental, and the program is sent a
SIGSEGV.

This heuristic has an obvious flaw.  To fix it, Berkeley made the
limit not really 512K, but rather the `stacksize' resource limit
for the process.  This is set to 512K by init, and inherited by
all children.  It can be changed with the setrlimit() system call,
or with the C shell's built-in `limit' command:

	limit stacksize 1m

will raise it to one megabyte;

	unlimit stacksize

will raise it to the kernel's configured maximum, probably 16M.

Incidentally, on all machines with which I have worked, large arrays
are more efficiently accessed when static than when on the stack.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP:	seismo!mimsy!chris	ARPA/CSNet:	chris@mimsy.umd.edu

whb@vax135.UUCP (02/12/87)

In article <1051@uwmacc.UUCP> jwp@uwmacc.UUCP (Jeffrey W Percival) writes:
>#define N 20480
>main()
>{
>	double x[N];
>	double y_1[N];
>	double y_2[N];
>	double y_3[N];
>	double y_4[N];
> [etc.]
>When I run it, I get "Segmentation violation".
>If I move the 5 array declarations up above main(),
>under the define statement, the program works OK.
>What is wrong with the program as listed above?
>	Jeff Percival ...!uwvax!uwmacc!sal70!jwp or ...!uwmacc!jwp

Five arrays times 20480 doubles per array times 8 bytes per double
	= 819.2 kbytes!
This is larger than the standard stack allocation (which is where
these arrays are being put).

Moving the declarations out of main() puts them in the data
segment, which can get that much space with no problems.
Defining them in main() as static would do the same thing.
-- 
Wilson H. Bent, Jr.		... ihnp4!vax135!hoh-2!whb
AT&T - Bell Laboratories	(201) 949-1277
Disclaimer: My company has not authorized me to issue a disclaimer.

bob@cald80.UUCP (02/13/87)

In article <1051@uwmacc.UUCP> jwp@uwmacc.UUCP (Jeffrey W Percival) writes:
:I am running 4.3BSD on a MicroVax II.
:I have a simple program:
:
:----------
:#include <stdio.h>
:#define N 20480
:main()
:{
:	double x[N];
:	double y_1[N];
:	double y_2[N];
:	double y_3[N];
:	double y_4[N];
:	char *l = "";
:
:	fprintf(stdout, "hi\n");
:	exit(0);
:}
:----------
:
:When I run it, I get "Segmentation violation".
:dbx reports the violation to occur on the "char *l" line.
:If I move the 5 array declarations up above main(),
:under the define statement, the program works OK.
:What is wrong with the program as listed above?
:-- 

	My compiler (Ancient V7 68000 UNIX) won't even let me do that.
I don't think that aggregate assignments inside a block are allowed
on most systems that I hack on.  Also remember that local variables
hang out on the stack.  Your compiler may not catch a stack overrun
(819200 bytes may be a bit much even for a VAX).

	I generally stuff large arrays out in global area anyway.
That keeps you from having to pass them all the time.  No flames, I
know that globals are generally to be avoided unless really necessary
but I prefer to stuff big stuff out there in case I get stuck on
another machine with small stack (8086 and company).  On those
machines, 64K is all you're allowed in one data segment.

	Also, the local FORTrashers find globals easier to deal with
when trying to figger out what I did (seems that they associate them
with unnamed commons or some such).

	I might be completely off base on this (I don't really dig
into VAXEN much and am not sure of the architecture there) but I think
that this may be right.

-- 
					Bob Meyer
					Calspan ATC
					seismo!kitty!sunybcs!cald80!bob
					decvax!sunybcs!cald80!bob

flaps@utcsri.UUCP (Alan J Rosenthal) (02/13/87)

In article <1051@uwmacc.UUCP> jwp@uwmacc.UUCP writes:
>#include <stdio.h>
>#define N 20480
>main()
>{
>	double x[N];
>	double y_1[N];
>	double y_2[N];
>	double y_3[N];
>	double y_4[N];
>	char *l = "";
>
>	fprintf(stdout, "hi\n");
>	exit(0);
>}
>----------
>
>When I run it, I get "Segmentation violation".
>dbx reports the violation to occur on the "char *l" line.
>If I move the 5 array declarations up above main(),
>under the define statement, the program works OK.

Apparently you are overrunning the stack area (where auto variables are
often (and apparently, in this case) allocated), causing a segmentation
exception.  If you move the five double declarations "up above main()",
they are no longer auto variables and thus are no longer allocated on the
stack.

A more appropriate way to do this which continues to keep the scope of those
variables restricted to the main() function is to declare them as static,
in the same place as they are declared above, like:
	static double x[N];
, but this has radically different semantics in the case of a recursive
function call - namely, that the different function invocations will share
these variables rather than having private versions.  Hopefully your program
doesn't require this feature of auto variables, in which case inserting
'static' should solve your problem.

-- 

Alan J Rosenthal

UUCP: {backbone}!seismo!mnetor!utgpu!flaps, ubc-vision!utai!utgpu!flaps,
      or utzoo!utgpu!flaps (among other possibilities)
ARPA: flaps@csri.toronto.edu
CSNET: flaps@toronto
BITNET: flaps at utorgpu

bs@linus.UUCP (02/13/87)

In article <1051@uwmacc.UUCP>, jwp@uwmacc.UUCP (Jeffrey W Percival) writes:
> #include <stdio.h>
> #define N 20480
> main()
> {
> 	double x[N];
> 	double y_1[N];
> 	double y_4[N];
 
etc.
> When I run it, I get "Segmentation violation".
 
Try typing 'limit' to UNIX.  It will tell you what your current
program size and stack size limits are.  I suspect that, since your
variables are local and hence placed on the stack, you have
overflowed the stack size limit.

One can set stacksize limits via:

% limit stacksize  amount

where amount is how much space you want

Bob Silverman

bright@dataio.UUCP (02/13/87)

In article <1051@uwmacc.UUCP> jwp@uwmacc.UUCP (Jeffrey W Percival) writes:
]I am running 4.3BSD on a MicroVax II.
]I have a simple program:
]----------
]#include <stdio.h>
]#define N 20480
]main()
]{
]	double x[N];
]	double y_1[N];
]	double y_2[N];
]	double y_3[N];
]	double y_4[N];
]
]	fprintf(stdout, "hi\n");
]}
]----------
]When I run it, I get "Segmentation violation".
]What is wrong with the program as listed above?

You are using 5*8*20480 == 819200 bytes of stack, which is a bit much
for most machines...

levy@ttrdc.UUCP (02/17/87)

In article <4124@utcsri.UUCP>, flaps@utcsri.UUCP writes:
>>#define N 20480
>>main()
>>{
>>	double x[N];
>>	double y_1[N];
>>	double y_2[N];
>>	double y_3[N];
>>	double y_4[N];
>>	char *l = "";
>>When I run it, I get "Segmentation violation".
>>dbx reports the violation to occur on the "char *l" line.
>
>A more appropriate way to do this which continues to keep the scope of those
>variables restricted to the main() function is to declare them as static,
>in the same place as they are declared above, like:
>	static double x[N];
>, but this has radically different semantics in the case of a recursive
>function call - namely, that the different function invocations will share
>these variables rather than having private versions.  Hopefully your program
>doesn't require this feature of auto variables, in which case inserting
>'static' should solve your problem.
>Alan J Rosenthal

This will also result in an elephantine (HUGE) executable file on many
systems (if the compile gets that far, and doesn't run out of temp-file
ulimit or space first).  An extern (explicit, or implicitly so by being
declared outside of a function) does not have this problem (it will go
in bss) though it will no longer be hidden.

What WOULD be nice would be a way to do in C something analogous to the way
UNIX f77 treats the variable "d" in:

	program krunch
	double precision d (1 000 000)
c
c	dummy code so the array d does not get optimized out
c
	d(1)=d(1)+1.0
	end

Compiling this will result in an image with "d" in the .bss segment
(small executable file), yet the storage for "d" is not visible to other
modules which might be in the program.  A "static" declaration in C,
either inside or outside of a function, I have found to result in each
and every byte being initialized data.  Ugh.
-- 
 -------------------------------    Disclaimer:  The views contained herein are
|            dan levy            |  my own and are not at all those of my em-
|         an engihacker @        |  ployer or the administrator of any computer
| at&t computer systems division |  upon which I may hack.
|        skokie, illinois        |
 --------------------------------   Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
                                        allegra,ulysses,vax135}!ttrdc!levy

jtr485@umich.UUCP (02/19/87)

In article <1514@ttrdc.UUCP>, levy@ttrdc.UUCP writes:
> This will also result in an elephantine (HUGE) executable file on many
> systems (if the compile gets that far, and doesn't run out of temp-file
> ulimit or space first).  An extern (explicit, or implicitly so by being
> declared outside of a function) does not have this problem (it will go
> in bss) though it will no longer be hidden.

Why doesn't this have the same problem?  Statics and externs should have the
same allocation semantics, only the static scope of the variables should be
different.
> 					  A "static" declaration in C,
> either inside or outside of a function, I have found to result in each
> and every byte being initialized data.  Ugh.
>--dan levy

Then you have been dealing with some VERY POOR compilers.  Allocating code file
space for bss data which gets initialized to 0 is absurd, since it is trivial
to build prologue code (which will probably run faster than loading from disk)
to handle this.

--j.a.tainter

dave@onfcanim.UUCP (02/21/87)

In article <1514@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
>.  A "static" declaration in C,
>either inside or outside of a function, I have found to result in each
>and every byte being initialized data.  Ugh.

With what compiler?  The 4.2BSD C compiler puts statics, either external or
local to a function, into ".lcomm", which ends up in the bss segment.
Only initialized statics end up in the data segment.  Sounds like your
compiler or loader is lazy or broken.

steffen@ihlpg.UUCP (02/26/87)

> > 					  A "static" declaration in C,
> > either inside or outside of a function, I have found to result in each
> > and every byte being initialized data.  Ugh.
> >--dan levy
> 
> Then you have been dealing with some VERY POOR compilers.  Allocating code file
> space for bss data which gets initialized to 0 is absurd, since it is trivial
> to build prologue code (which will probably run faster than loading from disk)
> to handle this.

This is a bug in all PCC1 compilers I've encountered here at AT&T
Bell Labs.  It's easy to fix in some, but hard in others that expect
the assembler comm pseudo-op to mean global data. (Uninitialized global data
uses the comm pseudo-op.)
-- 


	Joe Steffen, AT&T Bell Labs, Naperville, IL, (312) 369-7395