onders@taac.ipl.rpi.edu (Timothy E. Onders) (08/07/90)
I've got a mysterious problem. I've taken the following program:

    main()
    {
        int r, c;
        int testm[1024][1024];

        r = 0;
        c = 0;
        for (r = 0 ; r < 1024 ; r++) {
            for (c = 0 ; c < 1024 ; c++) {
                testm[c][r] = c * 10000 + r;
            }
        }
    }

It compiles on both a Sun 3 and a Sun 4. When run on a Sun 4, it does fine.
When run on the Sun 3, attempting to write to the array gives a SEGV.
Attempting to read the array from dbx gives a "bad address" error. By
examining the addresses of the three variables, there seems to be enough
space for everything allocated. There is 4 megs of space between the array
starting address and the address of r. Any ideas what might be causing this
problem, and how I can get around it?

-Tim Onders
onders@ipl.rpi.edu
als@roxanne.mlb.semi.harris.com (Alan Sparks) (08/08/90)
In article <1990Aug7.003432.1984@rice.edu> onders@taac.ipl.rpi.edu (Timothy E. Onders) writes:
> [program omitted]
> When run on the Sun 3, attempting to write to the array gives a
> SEGV. Attempting to read the array from dbx gives a "bad address" error.
> by examining the addresses of the three variables, there seems to be
> enough space for everything allocated. There is 4 megs of space between
> the array starting address, and the address of r. Any ideas what might be
> causing this problem, and how I can get around it?

I haven't taken much time to research this really well... but I suspect
you've blown out the stack on the 68xxx CPU (especially with such a mondo
array). I duplicated your situation, then devised a couple of workarounds.

One workaround is to make the array "testm" static:

    static int testm[1024][1024];

This moves it off the stack, into the static data area.

Another workaround (especially if you want to reclaim storage) is to
dynamically allocate the array. One way to do it is:

    int **testm, i;

    testm = (int **) calloc(1024,sizeof(int *));
    for (i = 0; i < 1024; ++i)
        testm[i] = (int *) calloc(1024,sizeof(int));

The remainder of the code stays the same. To reclaim storage afterward:

    for (i = 0; i < 1024; ++i)
        cfree(testm[i]);
    cfree(testm);

Some variant of these workarounds will solve your problem. They work on a
local Sun 3/60 here.

Hope this helps.
-Alan
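One possible variant of the dynamic workaround, offered as a sketch rather than as anything posted in this thread: allocate the whole matrix as a single block and reach it through a pointer to a row. The testm[c][r] indexing stays the same and one free() reclaims everything. It assumes an ANSI C compiler; the names N, row_t and make_testm are made up for illustration.

```c
#include <stdlib.h>

#define N 1024                    /* matrix dimension from the original program */

typedef int row_t[N];             /* one row of N ints (hypothetical name) */

/* Hypothetical helper: allocate an N x N matrix on the heap as one block,
   fill it the way the original program does, and return it.
   Returns NULL if the allocation fails. The caller releases it with free(). */
row_t *make_testm(void)
{
    row_t *testm = (row_t *) malloc(N * sizeof(row_t));
    int r, c;

    if (testm == NULL)
        return NULL;

    for (r = 0; r < N; r++) {
        for (c = 0; c < N; c++) {
            testm[c][r] = c * 10000 + r;
        }
    }
    return testm;
}
```

A caller would write something like "row_t *m = make_testm();", check the result against NULL, use m[c][r] as before, and finish with free(m). Only the pointer itself lives on the stack, so the stack limit no longer matters for the array.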
beau@uunet.uu.net (Beau James) (08/08/90)
Your automatic array is "too big" for the kernel heuristic that decides
when it's time to grow the stack, vs. when to declare a user program
error. That heuristic is different on Sun-3s and Sun-4s, since stack
frames tend to be different sizes on those systems. (The heuristic may
also change from one SunOS release to another.)

This isn't really a SunOS issue, though; the behavior of most *nixes is
similar. Unix user programs never do anything to explicitly manage the
growth of their stack. If the program makes a reference beyond the end of
currently allocated stack [virtual] memory, the hardware traps. The kernel
looks to see "how far" the trapped reference was beyond the end of the
existing stack; if it was "close enough", the kernel decides that the
program was really just trying to grow the stack, so it allocates
additional [virtual] memory for the stack and reruns the instruction that
caused the trap, much like a standard VM page fault. On the other hand, if
the reference is "too far" past the stack, the kernel decides that it was
indeed an invalid reference, and sends the process a SIGSEGV.

As a general rule, this means it's a bad idea to make "big" objects
automatic. Better to have an automatic pointer to the object, and malloc()
it on the fly; or to make the object static. That approach is more
portable, also, since the precise definitions of "big", "too far", etc.
are very system (hardware and OS version) dependent.

Beau James                      beau@Ultra.COM
Ultra Network Technologies      {sun,ames}!ultra.com!beau
guy@uunet.uu.net (Guy Harris) (08/10/90)
> Your automatic array is "too big" for the kernel heuristic that decides
> when it's time to grow the stack, vs. when to declare a user program
> error. That heuristic is different on Sun-3s and Sun-4s, since stack
> frames tend to be different sizes on those systems.

No, the heuristic is essentially the same; the parameter the heuristic
uses, namely the stack limit, is different. It defaults to 2MB on a Sun-3
(and probably a Sun-2), and 8MB on a SPARC. The 1024 x 1024 array of 4-byte
ints takes 4MB, which is over the first limit and under the second.

His program worked just fine on a Sun-3 (well, an NS5000, but it *does*
have a 3E120 inside it, running 4.0.3) after I did "limit stacksize 8192k"
from the C shell to boost the stack limit to 8MB (can't set it in the
SunOS Bourne shell; the Korn and Bourne-again shells may let you set it).
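The stack limit mentioned above can also be inspected from inside a program. The following is a minimal sketch, assuming a system that provides the BSD/POSIX getrlimit() and setrlimit() calls with RLIMIT_STACK; the helper names fits_in_stack_limit and raise_stack_limit are made up for illustration. Whether a raised limit helps the process that is already running, or only programs started afterward, is system-dependent, so treat the second helper as an experiment rather than a guaranteed fix.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: report whether an automatic object of `bytes` bytes
   is smaller than the current (soft) stack limit.
   Returns 1 if it is, 0 if it is not, -1 if the limit cannot be read.
   This ignores whatever else is already on the stack, so a result of 1
   is necessary but not sufficient. */
int fits_in_stack_limit(unsigned long bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return bytes < (unsigned long) rl.rlim_cur;
}

/* Hypothetical helper: raise the soft stack limit to at least `bytes`,
   roughly what "limit stacksize" does from the C shell.
   Returns 0 on success (or if the limit is already high enough),
   -1 if the hard limit is too low or the call fails. */
int raise_stack_limit(unsigned long bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= bytes)
        return 0;                      /* nothing to do */
    if (rl.rlim_max != RLIM_INFINITY && rl.rlim_max < bytes)
        return -1;                     /* hard limit forbids it */
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

For the program in this thread, fits_in_stack_limit(1024UL * 1024UL * sizeof(int)) would be the call to try: with the 2MB default quoted for the Sun-3 it should come back 0, and with the 8MB default quoted for SPARC it should come back 1.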