[comp.sys.mac] compiler thoughts

oster@dewey.soe.berkeley.edu (David Phillip Oster) (10/06/87)

Steve Falco pointed out that I missed the point in my recent
posting about 32k global limits in most C and Pascal compilers on the
Mac. The problem isn't arrays, which are easy to program around, but
initialized data.  For example, on many systems bitmaps are passed
around as C source code that you are expected to compile into your
program. A single bitmap can easily be 1024x1024 bytes for a big
screen.

Because of the 32k global limit, few Mac C compilers can compile
such a C source code fragment.
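
For readers who have not seen the idiom, a bitmap shipped as C source
looks roughly like the sketch below. The names (icon_bits, ICON_WIDTH,
IconRow and so on) and the pixel data are made up for illustration.
The point is only that every byte of the initializer has to be placed
in the global data area. This toy is 32 bytes; a 1024x1024 byte bitmap
is 1,048,576 bytes, far past 32k.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 16x16, one-bit-per-pixel bitmap: 2 bytes per row.
   All names and pixel values here are illustrative only. */
#define ICON_WIDTH     16
#define ICON_HEIGHT    16
#define ICON_ROW_BYTES (ICON_WIDTH / 8)

/* Initialized data: the compiler must emit every one of these bytes
   into the program's global data area. */
static const unsigned char icon_bits[ICON_HEIGHT * ICON_ROW_BYTES] = {
    0xFF, 0xFF,                 /* top edge */
    0x80, 0x01, 0x80, 0x01,     /* 14 rows of left and right border */
    0x80, 0x01, 0x80, 0x01,
    0x80, 0x01, 0x80, 0x01,
    0x80, 0x01, 0x80, 0x01,
    0x80, 0x01, 0x80, 0x01,
    0x80, 0x01, 0x80, 0x01,
    0x80, 0x01, 0x80, 0x01,
    0xFF, 0xFF                  /* bottom edge */
};

/* Number of bytes of initialized data this bitmap occupies. */
size_t IconSizeInBytes(void)
{
    return sizeof icon_bits;
}

/* Pointer to the first byte of the given row. */
const unsigned char *IconRow(int row)
{
    assert(row >= 0 && row < ICON_HEIGHT);
    return &icon_bits[row * ICON_ROW_BYTES];
}
```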

LightSpeed C already has a preamble that interprets a resource of type
'DATA' to set up the global area. Also, if you are making a desk
accessory (or actually any kind of code resource except an application),
it can generate code that references globals indirectly off register
a4 (which is initialized to point to a block of locked storage in the
heap) instead of a5. It could easily extend the semantics of 'DATA'
resources to support an 'unbounded globals' option that did both these
things to get around any limits imposed by the environment.

shap@sfsup.UUCP (J.S.Shapiro) (10/20/87)

In article <21163@ucbvax.BERKELEY.EDU>, oster@dewey.soe.berkeley.edu.UUCP writes:
> Because of the 32k global limit, few mac C compilers can compile
> such a C source code fragment.
> 
Nothing requires you to observe this limitation. The 32k limit is the limit
of offsets into a text segment that you can get to *from the jump table*.
For intra-text-segment references your compiler is free to generate
relative jumps (you can't fix the address, because the code must be
relocatable). Most compilers I am aware of in fact leave it up to the
user to segment the code. Note also that as of the new ROMs the
resource size limit has been relaxed.

Jon Shapiro

oster@dewey.soe.berkeley.edu (David Phillip Oster) (10/20/87)

In article <21163@ucbvax.BERKELEY.EDU>, oster@dewey.soe.berkeley.edu.UUCP writes:
> Because of the 32k global limit, few mac C compilers can compile
> such a C source code fragment.

Jon Shapiro's reply shows me that I haven't been as clear as I might be.

Using any Mac C compiler, it is easy to have a C _program_ larger than 32k.

It is easy to have global, uninitialized arrays larger than 32k (you just
declare pointers to them as global variables, and initialize the pointers
with calls to NewPtr() in your initialization procedure).
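
A minimal sketch of that pattern follows. So that it compiles off the
Mac, it uses standard C malloc() as a stand-in; on the Mac the
allocation would be the NewPtr() call instead. The names gBigTable,
BIG_TABLE_BYTES, InitBigTable and BigTable are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical size, chosen only because it is larger than 32k. */
#define BIG_TABLE_BYTES (64L * 1024L)

/* Only this pointer lives in the global area; the array itself
   lives in the heap. */
static unsigned char *gBigTable = NULL;

/* Call once from the program's initialization procedure.
   Returns 1 on success, 0 if the allocation failed. */
int InitBigTable(void)
{
    assert(gBigTable == NULL);  /* guard against initializing twice */

    /* ASSUMPTION: malloc() stands in for NewPtr(BIG_TABLE_BYTES)
       so that the sketch is portable. */
    gBigTable = (unsigned char *) malloc((size_t) BIG_TABLE_BYTES);
    return gBigTable != NULL;
}

/* Accessor for the rest of the program. */
unsigned char *BigTable(void)
{
    return gBigTable;
}
```

The cost is one pointer's worth of global space, no matter how big the
array is.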

The problem is compiling a program that has a _single_ initialized
array of data that is larger than 32k. Such arrays might be generated
by C writing tools like the Unix Lex and Yacc. The problem is that the
C compilers handle initialized arrays differently from procedures.

The Macintosh is different from many 68000-based computers because it
allows executable code to move around while the code is running.
Because of this, Macintosh compilers generate position independent code.

Procedures live in code segments. These segments are a kind of
resource, so they live in the heap, as all resources do.  Most
compilers limit the size of code segments to 32k, because there are
some small, fast relative branch instructions in the 68000 processor
chip used in the Mac that use 16-bit signed offsets. Since most C
programs are written as many small files, it is not a problem to
allocate files to segments so that no segment overflows (and most
compilers warn you if a segment is getting too big).  (By the way, most
C compilers on IBMs don't warn you if a segment is too big; you find
out when your program mysteriously crashes.)
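
The arithmetic behind the 32k figure: a 16-bit signed offset can reach
from -32768 to +32767 bytes and no further. The sketch below just
checks that range. The function names are made up, and it is not a
model of how any particular compiler lays out a segment.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a branch displacement (in bytes) fits in a signed
   16-bit field, 0 otherwise. */
int FitsIn16BitDisplacement(long displacement)
{
    return displacement >= INT16_MIN && displacement <= INT16_MAX;
}

/* Narrows a displacement to the 16-bit field. The caller must have
   checked the range first; the assert documents that requirement. */
int16_t EncodeDisplacement(long displacement)
{
    assert(FitsIn16BitDisplacement(displacement));
    return (int16_t) displacement;
}
```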

Initialized data does not live in segments, and does not move around.
The Apple operating system strongly encourages applications to
allocate initialized data by reserving space above the stack and using
register A5 to point to it. (In fact, the format for an executable
application includes a field to tell the Launch system call how much
space to reserve above the stack for globals, initialized data, and
lookup tables to global procedures. Remember, procedures can move;
these lookup tables are used so one procedure can find another.)
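
The lookup table idea can be sketched in plain C as a table of
function pointers: callers go through a fixed slot instead of holding
a procedure's address, so only the slot has to be updated when the
procedure's address changes. This is an analogy only, not the actual
format of the Mac's table, and every name in it is hypothetical.

```c
#include <assert.h>

typedef int (*IntProc)(int);

/* Two stand-in "global procedures". */
int DoubleIt(int x) { return 2 * x; }
int NegateIt(int x) { return -x; }

/* Slot numbers are fixed; the addresses stored in the slots are not. */
enum { kSlotDouble, kSlotNegate, kSlotCount };

static IntProc gLookupTable[kSlotCount] = { DoubleIt, NegateIt };

/* Callers name a slot, never an address. */
int CallThroughTable(int slot, int arg)
{
    assert(slot >= 0 && slot < kSlotCount);
    return gLookupTable[slot](arg);
}

/* Stand-in for what happens when a procedure ends up somewhere else:
   the slot is rewritten and the callers are left untouched. */
void RebindSlot(int slot, IntProc proc)
{
    assert(slot >= 0 && slot < kSlotCount);
    gLookupTable[slot] = proc;
}
```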

In addition to Mac C compilers not handling more than 32k of
initialized data, most Mac C compilers can't handle a single procedure
that compiles to more than 32k of executable code. I've never seen
such a monstrosity, but once again, program writing tools may generate
such things.

There is a work-around if you really need more than 32k of initialized
data: Compile your data arrays on some other machine, and have that
other machine write the data to a file. Move the file to the Mac, and
either use it directly (by having your program read it into an
uninitialized array as part of its init routine) or use a second
helper program to copy the data file into a resource. Mark the
resource as "PreLoad, Non-Purgeable". That way, your Mac program can use
the big initialized array by just doing:

bigInitializedArray = (ArrayType *) *GetResource('GNRL', 128);
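
The other route (write the file elsewhere, read it into an
uninitialized array at init time) might look like the sketch below in
portable C. WriteTable and ReadTable are hypothetical names, and the
file path is whatever you choose. The sketch moves plain bytes; if the
array holds multi-byte values and the two machines disagree on byte
order, the writer would have to emit them in the Mac's order.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Run on the other machine: dump an initialized array to a binary
   file. Returns 1 on success, 0 on failure. */
int WriteTable(const char *path, const unsigned char *data, size_t count)
{
    FILE *f;
    size_t written;

    assert(path != NULL && data != NULL);
    f = fopen(path, "wb");
    if (f == NULL)
        return 0;
    written = fwrite(data, 1, count, f);
    if (fclose(f) != 0)
        return 0;
    return written == count;
}

/* Run from the program's init routine: read the file into a heap
   block. Returns NULL on failure; the caller owns the block. */
unsigned char *ReadTable(const char *path, size_t count)
{
    FILE *f;
    unsigned char *buf;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    buf = (unsigned char *) malloc(count);
    if (buf != NULL && fread(buf, 1, count, f) != count) {
        free(buf);              /* short read: treat as failure */
        buf = NULL;
    }
    fclose(f);
    return buf;
}
```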

I'm sorry to bother the net with this reply to Jon Shapiro's comment,
but my news reader shows his address as shap@sfsup.UUCP, which my
mailer can't understand.

--- David Phillip Oster            --A Sun 3/60 makes a poor Macintosh II.
Arpa: oster@dewey.soe.berkeley.edu --A Macintosh II makes a poor Sun 3/60.
Uucp: {uwvax,decvax,ihnp4}!ucbvax!oster%dewey.soe.berkeley.edu