[comp.sys.mac.programmer] global data in Think C

ted@cs.utexas.edu (Ted Woodward) (05/07/90)

Well, I just found out that string constants in Think C 4.0 get put into the
global data segment...bad move, guys.  Because of this, I have 47K of 'global'
data...

So what do I try?  I try this:
#define NUMELEMS 29
Str32 *myArray;	/*yes, I did a #include <appletalk.h>*/

InitElms()
{	myArray = (Str32 *) NewPtr(sizeof(Str32) * (long) NUMELEMS);
	myArray[0] = "generic text here";
}

and get 'illegal operation on array'.  So I can't say:
Str32 test;
test = "stuff";

?
This seems kinda silly.  I'd also like to be able to say something like this:

myArray = {"generic1","genereic2",...}
like in a declaration, so I can put the stuff in the heap instead of global
space...


-- 
Ted Woodward (ted@cs.utexas.edu)

Someone shot the food...

jholt@pro-sol.cts.com (joe holt) (05/07/90)

In-Reply-To: message from ted@cs.utexas.edu

47K of global space??  Regardless of where Think C is putting the strings,
*you're* obviously doing something wrong... :-(

Older versions of Think C did put the strings in a separate STR resource. 
This required some run-time address patching to the code, so they made this an
option in 4.0 and defaulted to OFF.  If you check the "separate STRs" box in
the "Set Project Type" dialog, you'll have your non-global strings.

I would invest some time in reading about Memory Management.  You should never
declare big arrays as static--they should always be dynamically allocated,
either with malloc(), or for Mac purists NewPtr() or NewHandle().

6600pete@ucsbuxa.ucsb.edu (GurgleKat [Pete Gontier]) (05/08/90)

From article <704@dimebox.cs.utexas.edu>, by ted@cs.utexas.edu (Ted Woodward):
> Well, I just found out that string constants in Think C 4.0 get put into the
> global data segment...bad move, guys.  Because of this, I have 47K of 'global'
> data...

Standard practice is to use STR# resources, GetIndString, and StringHandle.
This won't help, of course, if you're porting a large chunk of someone
else's code, but it doesn't look like in this case that you are.

> So what do I try?  I try this:
> #define NUMELEMS 29
> Str32 *myArray;	/*yes, I did a #include <appletalk.h>*/
> 
> InitElms()
> {	myArray = (Str32 *) NewPtr(sizeof(Str32) * (long) NUMELEMS);
> 	myArray[0] = "generic text here";
> }
> 
> and get 'illegal operation on array'.
>
> So I can't say:
> Str32 test;
> test = "stuff";

Looks to me like you are trying to declare a pointer to an array of Str32's,
which is fine, except you need to do something like this:

/****************************************************************************/
void pstrcat ( char *target, char *append ) {
	BlockMove ( &append [ 1 ], &target [ target [ 0 ] + 1 ], append [ 0 ] );
	target [ 0 ] = ( unsigned char ) target [ 0 ] + append [ 0 ];
}
 
#define NUMELEMS 29
 
typedef Str32 Str32Array [ 100 ]; /* arbitrary bounds */
Str32Array *myArray;
 
main ( ) {
	myArray = ( Str32Array * ) NewPtr ( sizeof ( Str32 ) * ( long ) NUMELEMS );
	( * myArray ) [ 0 ] [ 0 ] = '\0';
	pstrcat ( ( char * ) & ( ( * myArray ) [ 0 ] ), "\pgeneric text here" );
	DebugStr ( ( char * ) & ( ( * myArray ) [ 0 ] ) );
}
/****************************************************************************/

pstrcat is necessary because you can't just assign a string constant to an
array of characters in C. The language is low-level enough that the designers
assumed people would be insulted if they couldn't optimize string operations
for their own architecture. (BlockMove happens to be VERY efficient.)
If you wrote a pstrcpy, you could omit one of the above lines, but I seldom
have need for a pstrcpy without a pstrcat.
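
For anyone who wants the pair, here is a rough portable sketch of a pstrcpy next to a pstrcat. It assumes plain ANSI C, with memmove standing in for BlockMove, and PStr32 is a made-up stand-in for Str32, so treat it as an illustration rather than Toolbox code. Since "\p" is a compiler extension, portable callers would have to spell the length byte out by hand (e.g. "\003abc").

```c
#include <assert.h>
#include <string.h>

/* Pascal string: byte 0 holds the length, bytes 1..length hold the text. */
typedef unsigned char PStr32[33];   /* hypothetical stand-in for Str32 */

/* Copy the Pascal string src, length byte included, into dst. */
void pstrcpy(unsigned char *dst, const unsigned char *src)
{
    memmove(dst, src, (size_t)src[0] + 1);
}

/* Append src to dst; the caller must make sure dst has room. */
void pstrcat(unsigned char *dst, const unsigned char *src)
{
    assert(dst[0] + src[0] <= 255);   /* a length byte cannot count past 255 */
    memmove(&dst[dst[0] + 1], &src[1], src[0]);
    dst[0] = (unsigned char)(dst[0] + src[0]);
}
```

With a pstrcpy on hand, the "set the length byte to zero, then pstrcat" step collapses into a single call.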

Also, if you're going to be messing with Pascal data types, you should use
Pascal strings (note the "\p").

Finally, note that this particular solution does not help you, because it still
uses a string constant. But it might be instructive.

> I'd also like to be able to say something like this:
> myArray = {"generic1","genereic2",...}
> like in a declaration, so I can put the stuff in the heap instead of global
> space...

I'm not sure if I read you correctly. Here's what I interpret you to want:

/****************************************************************************/
main ( ) {
	char *myArray [ ] = { "\pa string", "\panother string" };
	DebugStr ( myArray [ 0 ] );
	DebugStr ( myArray [ 1 ] );
}
/****************************************************************************/

This is convenient, except that you're still stuck with string constants.

Here's how I'd do it in the heap:

/****************************************************************************/
main ( ) {
	Str255 s;
	StringHandle sh;
	GetIndString ( s, 128, 1 ); /* get first string out of 'STR#' ID 128 */
	sh = NewString ( s );
	DebugStr ( *sh );
}
/****************************************************************************/

This would be easy to expand to use an array of StringHandles. The strings are
easier to dereference then, but it's one more array you have to manage. It's
a trade-off. Also note that this fragment uses the Handle, which is a
relocatable heap object. They're not much fun in a big program, because
screwing one up inadvertently may take days to debug, but they make the Mac OS
possible, so you ought to use them as much as you can.
--
             Pete Gontier, Kiwi Software; Kiwi's opinions not presented here
InterNet 6600pete@ucsbuxa.ucsb.edu; BitNet 6600pete@ucsbuxa; AppleLink D0862

kk@mcnc.org (Krzysztof Kozminski) (05/08/90)

In article <704@dimebox.cs.utexas.edu> ted@cs.utexas.edu (Ted Woodward) writes:
>Well, I just found out that string constants in Think C 4.0 get put into the
>global data segment...bad move, guys.  Because of this, I have 47K of 'global'
>data...

What are your strings doing in the program code instead of being stored
in resources (so that you can change them without recompiling) ?  The
only strings that have a good reason to be in the code are those used
for debugging (hopefully enclosed in #ifdef's so that you can get rid
of them in the final version).

>So what do I try?  I try this:
>#define NUMELEMS 29
>Str32 *myArray;	/*yes, I did a #include <appletalk.h>*/
>
>InitElms()
>{	myArray = (Str32 *) NewPtr(sizeof(Str32) * (long) NUMELEMS);
>	myArray[0] = "generic text here";
>}

Where do you think the string "generic text here" will get put by the
compiler?  Yup, right into the global data segment.  The size of your
data segment would get reduced only if you had some redundant strings
(provided that the compiler is not smart enough to do this for you).

>and get 'illegal operation on array'.  So I can't say:
>Str32 test;
>test = "stuff";
>
>This seems kinda silly.

Let's see, 'test' is an array of 32 characters. "stuff" is a POINTER
to the location where character 's' is stored, followed by 'tuff' and '\0'.
You're assigning a wrong entity here.

What you have requires string copy, NOT assignment.

	strcpy(myArray[0], "generic text here");
	strcpy(test, "stuff");

This is rather silly, since it wastes lots of space (all this stuff trailing
the strings that are shorter than 31 characters).
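
A minimal sketch of the fixed-size-slot approach, to show both the copy and the waste. It assumes plain ANSI C with calloc standing in for NewPtr; Slot, SLOT_SIZE, alloc_slots, set_slot and slack are made-up names for illustration, not anything from the Mac headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_SIZE 33            /* assumed size of one Str32-like slot */
typedef char Slot[SLOT_SIZE];

/* Allocate `count` fixed-size slots on the heap; returns NULL on failure. */
Slot *alloc_slots(size_t count)
{
    return (Slot *)calloc(count, sizeof(Slot));
}

/* Arrays cannot be assigned in C, so copy the characters in instead. */
void set_slot(Slot *slots, size_t index, const char *text)
{
    assert(strlen(text) < SLOT_SIZE);   /* text plus '\0' must fit */
    strcpy(slots[index], text);
}

/* Bytes left unused after the text and its terminating '\0'. */
size_t slack(const char *slot)
{
    return SLOT_SIZE - (strlen(slot) + 1);
}
```

For "generic text here" (17 characters) the slack works out to 15 bytes per slot, and the literal itself still lives wherever the compiler puts string constants.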

Or you can do:
char **myArray;
...
	myArray = (char **) NewPtr(sizeof(*myArray) * (long) NUMELEMS);
	myArray[0] = "generic text here";

except that this is sorta silly, too, since you can just do the
initialization:

	char *myArray[] = { "generic text here", ...

that accomplishes exactly the same thing with less hassle ...

The bottom line: put the strings in 'STR#'.

KK
-- 
Kris Kozminski   kk@mcnc.org
"The party was a masquerade; the guests were all wearing their faces."

ted@cs.utexas.edu (Ted Woodward) (05/08/90)

This is in response to everyone criticizing my programming style for having
47K of globals, etc etc.

Well, I'm doing a port of moria from MPW C to Think C.  This isn't my code.
I've already put 10K of globals into dynamically allocated space, and the
other 47K is either library globals (ANSI has 3K, etc), strings, or static
variables.  That's right, Think C puts static variables into global space.
So, please, don't criticize me for using 47K of globals.  If you had actually
looked at what I was trying to do, I was trying to put global strings into the
heap, something I wouldn't try to do unless I had already put lots of other
stuff into the heap (easier stuff, like big arrays...)

To the people who sent me mail suggesting I check the strings as STRs,
thank you.  I'll try it when I get home tonight.

Now, as to what I wanted to do when I said I wanted to assign cstrings
to Str32's in the heap:
You can say char *temp[] = {"dslkfsdf","lkjasflkds","lkdsflsdf"};
and get an array of strings.  I want to be able to do this at runtime, to say:
char *temp;
main()
{   temp = {"asl0kjsaf", "askfjasjf","asl0kfja;0kjf"};
}

and be able to reference them by temp[2], etc...so what I decided to do is
create an array of *Str32.  then reference each string as temp[3], like
above...I really don't care about wasted space, as long as it's in the heap.
And this will be done at the beginning of the program, so fragmentation is
not a problem.  And I don't really feel like  making this a handle, because
they are a pain, and this won't need to be relocated (above sentence).

(Geez, am I being incoherent, or what?)

Anyway, I'll try just checking the STR option...hopefully, moria 5.1.4 for the
mac will be done tonight...

(Thanks, Jim)


-- 
Ted Woodward (ted@cs.utexas.edu)

Death now has Extra Speed!

c60c-3cf@e260-3f.berkeley.edu (Dan Kogai) (05/08/90)

In article <704@dimebox.cs.utexas.edu> ted@cs.utexas.edu (Ted Woodward) writes:
>Well, I just found out that string constants in Think C 4.0 get put into the
>global data segment...bad move, guys.  Because of this, I have 47K of 'global'
>data...
>So what do I try?  I try this:
>#define NUMELEMS 29
>Str32 *myArray;	/*yes, I did a #include <appletalk.h>*/
>InitElms()
>{	myArray = (Str32 *) NewPtr(sizeof(Str32) * (long) NUMELEMS);
>	myArray[0] = "generic text here";
                     ^^^^^^^^^^^^^^^^^^Problem
>}
>
>and get 'illegal operation on array'.  So I can't say:
>Str32 test;
>test = "stuff";
         ^^^^^Problem #2
>This seems kinda silly.  I'd also like to be able to say something like this:
>myArray = {"generic1","genereic2",...}
             ^^^^^^^^^^^^^^^^^^^^^Problem
>like in a declaration, so I can put the stuff in the heap instead of global
>space...

	Okay, I grokked your problem enough so let me explain.
	You keep forgetting that Str32 is a Pascal string!  You have to
put "\p" at the front of a Pascal string constant.  And you should use
something like pStrCopy() to copy it:  string assignment cannot be done
with "=" in C.  The second example has the same problem, and you forgot
"\p" there too.  You also forgot to check for a null pointer
in case NewPtr() couldn't find enough space.  Plus, since you are using
NUMELEMS as the CONSTANT size of the array, why bother allocating it
dynamically?  Dynamic allocation is necessary only when the size of the
array varies.  In your case you don't even have to bother writing InitElms()
and it's as simple as:

Str32 myarray[NUMELEMS]={
	"\pgeneric string here"
};

	Personally I hate Pascal strings, so I store all strings in the C
convention and convert them to Pascal whenever I need a Pascal string.
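
	Roughly what that conversion looks like, as a sketch in plain ANSI C.  The names c_to_pascal and pascal_to_c are made up for illustration (they are not the Think C library routines), and unlike an in-place conversion they copy into a separate buffer:

```c
#include <assert.h>
#include <string.h>

/* Write the C string src into dst as a Pascal string (length byte first).
   dst must hold at least strlen(src) + 1 bytes.  Returns 0 on success,
   -1 if src is longer than the 255 characters a length byte can count. */
int c_to_pascal(unsigned char *dst, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (len > 255)
        return -1;
    dst[0] = (unsigned char)len;
    memcpy(&dst[1], src, len);
    return 0;
}

/* The inverse: dst must hold at least src[0] + 1 bytes. */
void pascal_to_c(char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, &src[1], src[0]);
    dst[src[0]] = '\0';
}
```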

---
##################  Dan The "Think C fan" Man
+ ____  __  __   +  (Aka Dan Kogai)
+     ||__||__|  +  E-mail:     dankg@ocf.berkeley.edu
+ ____| ______   +  Voice:      415-549-6111
+ |     |__|__|  +  USnail:     1730 Laloma Berkeley, CA 94709
+ |___  |__|__|  +              U.S.A
+     |____|____ +  Disclaimer: I'd rather be blackmailed for my long .sig
+   \_|    |     +              than give up my cool name in Kanji. And my
+ <- THE MAN  -> +              citizenship of People's Republic o' Berkeley
##################              has nothing 2 do w/ whatever I post, ThanQ.

siegel@endor.harvard.edu (Rich Siegel) (05/08/90)

In article <704@dimebox.cs.utexas.edu> ted@cs.utexas.edu (Ted Woodward) writes:
>Well, I just found out that string constants in Think C 4.0 get put into the
>global data segment...bad move, guys.  Because of this, I have 47K of 'global'
>data...

	Turn on "Separate STRS" in the Set Project Type dialog, and string
and floating-point constants will go into their own space.

R.

~~~~~~~~~~~~~~~
 Rich Siegel
 Staff Software Developer
 Symantec Corporation, Language Products Group
 Internet: siegel@endor.harvard.edu
 UUCP: ..harvard!endor!siegel

"Don't try to understand 'em, just rope, throw, and brand 'em."

~~~~~~~~~~~~~~~

ech@cbnewsk.att.com (ned.horvath) (05/26/90)

From article <waIrjuG00UgyAPK458@andrew.cmu.edu>, by cc4b+@andrew.cmu.edu (Christopher Brian Cox):
> Rich,
> 
> Face it, the 32k global data limit is a BUG!
> I have a little program here called p2c.  It had 90+k of global data
> when I first compiled it.  After removing the un-initialized arrays
> and allocating them at runtime, it had 48+k of global data.
> That's with Separate STRS.
> The compiler should automatically allocate un-initialized arrays at
> runtime.  There still shouldn't be a 32k limit.
>  
> Chris Cox

Ease off.  32K is a restriction, an irritation at worst.  It is rooted in
the fact that the MC68000 uses 16-bit signed offsets for address-register
based addressing, and furthered by Apple's decision to use a program model
that includes "globals are accessed as negative offsets from a5."
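
To make the arithmetic concrete, here is a small sketch in portable C (not 68000 code, and the function names are made up for illustration).  A 16-bit signed displacement covers -32768 through 32767, so only the first 32K below a5 can be reached with a single displacement:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if `offset` fits in a 16-bit signed displacement field. */
int fits_disp16(long offset)
{
    return offset >= INT16_MIN && offset <= INT16_MAX;
}

/* Globals sit at negative offsets from a5, so a byte that lies
   `bytes_below_a5` bytes under a5 needs the displacement -bytes_below_a5. */
int global_reachable(long bytes_below_a5)
{
    assert(bytes_below_a5 >= 0);
    return fits_disp16(-bytes_below_a5);
}
```

By this reckoning a global 32768 bytes below a5 is the last one reachable; anything in a 48K data area beyond that point is not.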

There are ways around this for the compiler writer, but because of the
inherent limitations of the processor chip, they all require explicit
address arithmetic in the generated machine code.  That means larger,
slower programs.

You have a couple of alternatives.  One you've identified: using NewPtr or
NewHandle at run time for uninitialized arrays cut your "globals"
requirement in half.  That has the rather nice side effect that you can 
resize such arrays to match the problem size.

For large initialized arrays, consider "compiling" them into resources,
or reading them in from your data fork.  I won't argue that this is
convenient, and you might justifiably claim that Motorola, Apple, and
Think/Symantec have made your life a little tougher.  The only compiler
I know of that lets you duck the limit (for a price in speed and size) 
is Aztec C.
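
As a rough illustration of the read-it-from-a-file route, here is a sketch in plain ANSI C that uses stdio and malloc rather than the File Manager and NewPtr.  The function name load_strings and the one-string-per-line file layout are assumptions of mine, not anything Think C or the Toolbox provides:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read up to max_lines lines from fp into freshly malloc'ed strings.
   Lines longer than the 255-character buffer are split across entries.
   Returns the number of strings stored in out[]. */
size_t load_strings(FILE *fp, char **out, size_t max_lines)
{
    char buf[256];
    size_t n = 0;

    assert(fp != NULL && out != NULL);
    while (n < max_lines && fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strcspn(buf, "\n");    /* drop the trailing newline */
        char *copy = (char *)malloc(len + 1);
        if (copy == NULL)
            break;                          /* out of memory: stop early */
        memcpy(copy, buf, len);
        copy[len] = '\0';
        out[n++] = copy;
    }
    return n;
}
```

The strings then live only in the heap, and the table of pointers can itself be allocated at run time.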

=Ned Horvath=