[comp.sys.mac] What is wrong with the Sumacc C compiler

oster@dewey.soe.berkeley.edu.UUCP (10/30/87)

In article <7508@dartvax.UUCP> earleh@dartvax.UUCP (Earle R. Horton) writes:
>Perhaps someone would care to enlighten me as to why the sumacc compiler has
>apparently fallen into disuse.  Is it solely the inconvenience of having to
>download the compiled program (I can live with that) or is there something
>wrong with it?

Well, there are two things wrong with it:

1.) it is slow.
Unless you have a Mac with an ethernet connection to a _very_ fast machine,
it is much faster to compile and link locally than to compile on a remote
machine, link, rmake, and down load the result.

2.) it is buggy.  

You may have seen my shareware program, Calendar 1.9. It is loosely
based on a program that was broadcast to the net in, I think '84, that
was compiled in sumacc C. I liked the program, but it had an obvious
killer bug, as a result of the Sumacc compiler.  (The symptoms were,
if you used the Calendar desk accessory once, then you couldn't use it
again until you quit the current application.) After trying
unsuccessfully to contact the author, I took over development of the
program, and have added many new features and fixed many old bugs.

The authors of the compiler seem to be unaware that on a Macintosh
executable code can move while the program is running. Unlike all Macintosh
compilers, they generate position _dependent_ code, and have a funky
loader scheme to resolve non-relocatable references at program load
time. Eventually, the code moves, and all that position dependent code
points at never-neverland.

Conclusion:
Don't use sumacc C. Because of its authors' poor understanding of the
Macintosh execution time environment, the sumacc C compiler generates
incorrect code.

--- David Phillip Oster            --A Sun 3/60 makes a poor Macintosh II.
Arpa: oster@dewey.soe.berkeley.edu --A Macintosh II makes a poor Sun 3/60.
Uucp: {uwvax,decvax,ihnp4}!ucbvax!oster%dewey.soe.berkeley.edu

brian@ut-sally.UUCP (Brian H. Powell) (10/30/87)

In article <21522@ucbvax.BERKELEY.EDU>, oster@dewey.soe.berkeley.edu (David Phillip Oster) writes:
> 1.) it is slow.
> Unless you have a Mac with an ethernet connection to a _very_ fast machine,
> it is much faster to compile and link locally than to compile on a remote
> machine, link, rmake, and down load the result.

	Agreed.  I think a 9600 baud connection would be tolerable, though.

> 2.) it is buggy.

	Agreed.  Buggy but usable.  I only ran into one code generation bug.
The problem you describe is a feature, not a bug.

and let me add 3) you can't do segmentation.

	Segmentation, as we all know, is a means of getting around the problem
that the mac has with segments of code that are larger than 32K.  This meant
that you couldn't write a program in SUMacC that was larger than 32K.
     As I recall that was a bug in the resource manager (you couldn't have a
resource (e.g., CODE) that was larger than 32K.)  I don't know if there are
still inherent limitations that prevent a CODE resource from being larger than
32K.  Comments anyone?  (Aside:  an advantage of limiting code to 32K chunks
is that a compiler can always set aside 16 bits for an offset; it will never
have to set aside 32 bits, since everyplace it would ever want to jump (or
jsr) to is within 32K of where you are.)  (I would bet the SUMacC compiler is
probably intelligent enough to use the right size offsets (16 bits normally,
32 when needed.))
     I seem to recall, back when I was coordinating the updates to the SUMacC
"rmaker" resource compiler, that someone said they had added segmentation to
the SUMacC compiler.

> The authors of the compiler seem to be unaware that on a Macintosh
> executable code can move while the program is running. Unlike all Macintosh
> compilers, they generate position _dependent_ code, and have a funky
> loader scheme to resolve non-relocatable references at program load
> time. Eventually, the code moves, and all that position dependent code
> points at never-neverland.

	SUMacC does indeed do run-time relocation (once, the first time the
code is called).  If anybody ever wants to know how it works, you can ask me.
(That's what the "longruns" are when you rmaker the file.)
     This works fine for applications written in SUMacC because they are only
one segment.  The main segment is always locked in memory, so it never moves.
This doesn't work for desk ornaments, as described in the parent article.
Once closed, they are unlocked.  The next time the DA is opened, it's locked
again, but it may have moved in the meantime.  Boom.


     I consider SUMacC to be outdated, and I don't think it should be used as
a development environment for anything serious, especially something like
kermit.  SUMacC was great back when it was the only way other than the Lisa to
develop programs for the Mac.  It was also great for those of us who only had
128K Macs.  (Ever try to develop software on a 128K Mac?  I have.  Remember
the old Megamax compiler that used the screen memory...  Those were the good
old days...)
     For SUMacC to be competitive, it would have to generate
position-independent code, and it would have to provide segmentation.  Its
code generation is among the best (if not the best) considering it does a
reasonably good job at optimization.

Brian H. Powell
		UUCP:	...!uunet!ut-sally!brian
		ARPA:	brian@sally.UTEXAS.EDU

   _Work_					 _Not Work_
  Department of Computer Sciences		P.O. Box 5899
  Taylor Hall 2.124				Austin, TX 78763-5899
  The University of Texas at Austin		(512) 346-0835
  Austin, TX 78712-1188
  (512) 471-9536

earleh@dartvax.UUCP (Earle R. Horton) (11/01/87)

In article <9431@ut-sally.UUCP>, brian@ut-sally.UUCP (Brian H. Powell) writes:

>      As I recall that was a bug in the resource manager (you couldn't have a
> resource (e.g., CODE) that was larger than 32K.)  I don't know if there are
> still inherent limitations that prevent a CODE resource from being larger than
> 32K.  Comments anyone?  (Aside:  an advantage of limiting code to 32K chunks

The bug in the resource manager appears to affect writing resources larger
than 32k, and not reading them from disk.  The CODE resource for Kermit is
about 64k in size when compiled with the Sumacc compiler here.  Apparently,
you can have CODE resources as big as you want, as long as you don't use
the resource manager to write them out.

>      For SUMacC to be competitive, it would have to generate
> position-independent code, and it would have to provide segmentation.  It's
> code generation is among the best (if not the best) considering it does a
> reasonably good job at optimization.

It does a good job of linking, too.  Kermit made with Sumacc is about 71k,
while Kermit made with Megamax is 85k.  It's really tempting to use Sumacc,
since the tools available (make, emacs, csh, real background processing, etc.)
are about 100 times better than what exists on the Mac, and they just got
new disk drives for the VAX, and we can download files here using Darterminal
and Appletalk...

The large volume of information I received in reply to my original query
indicates that Sumacc probably is not a real good choice for doing Mac
development, but that it sure would be nice if it were.  I guess I will stick
with Lightspeed.  It may not optimize, but at least it compiles ROM calls
inline (well, most of them, anyway.)
-- 
*********************************************************************
*Earle R. Horton, H.B. 8000, Dartmouth College, Hanover, NH 03755   *
*********************************************************************

raylau@dasys1.UUCP (Raymond Lau) (11/03/87)

According to Apple, the 32K write bug has since been fixed.  I forget which
System version the fix was in.  There's also a Tech Note on how to fix it
yourself, so that you can anticipate and be prepared for old Systems.


-----------------------------------------------------------------------------
Raymond Lau                      {allegra,philabs,cmcl2}!phri\
Big Electric Cat Public Unix           {bellcore,cmcl2}!cucard!dasys1!raylau
New York, NY, USA                               {sun}!hoptoad/         
GEnie:RayLau   Delphi:RaymondLau   CIS:76174,2617
"Take it and StuffIt."

kdmoen@watcgl.UUCP (11/08/87)

oster@dewey.soe.berkeley.edu.UUCP (David Phillip Oster) writes:
...concerning why the sumacc compiler is no good...
>The authors of the compiler seem to be unaware that on a Macintosh
>executable code can move while the program is running. Unlike all Macintosh
>compilers, they generate position _dependent_ code, and have a funky
>loader scheme to resolve non-relocatable references at program load
>time. Eventually, the code moves, and all that position dependent code
>points at never-neverland.

Sorry, but this sounds blatantly impossible.  If the code were to move
while the program is running, then *all of the return addresses and
function pointers on the stack would become invalid* and the program
would crash, regardless of what compiler was used to create the program.

Now, I can see that maybe desk accessories might move around,
*but not application programs*.

Still, I will second your opinion that Sumacc is "not recommended".
I abandoned Sumacc a while back for MPW, on the theory that it was not
worth my time to maintain Sumacc when Apple would maintain MPW for me.
Now, of course, I am deeply regretting the fact that I can no longer
run my programs through lint...

PS: Has anybody heard rumours about the forthcoming availability of Lint
for any Mac C compiler??
-- 
Doug Moen
University of Waterloo Computer Graphics Lab
UUCP:     {ihnp4,watmath}!watcgl!kdmoen
INTERNET: kdmoen@cgl.waterloo.edu

kdmoen@watcgl.UUCP (11/08/87)

brian@ut-sally.UUCP (Brian H. Powell) writes:
)	Segmentation, as we all know, is a means of getting around the problem
)that the mac has with segments of code that are larger than 32K.  This meant
)that you couldn't write a program in SUMacC that was larger than 32K.

I have been developing/maintaining a very large Sumacc program since Sumacc
was first released.  (Before that, we used the Lisa Workshop & Lisa Pascal).
When we stopped using Sumacc earlier this year it was >100K, and it ran fine.

)     For SUMacC to be competitive, it would have to generate
)position-independent code, and it would have to provide segmentation.  It's
)code generation is among the best (if not the best) considering it does a
)reasonably good job at optimization.
)
)Brian H. Powell

When I translated the program to MPW C, the code size dropped to 2/3 of its
size under Sumacc.  This is despite the fact that I junked the compiler
supplied with Sumacc and replaced it with a slightly better version of the
MIT 68K C compiler, which I then proceeded to hack to improve code quality.
-- 
Doug Moen
University of Waterloo Computer Graphics Lab
UUCP:     {ihnp4,watmath}!watcgl!kdmoen
INTERNET: kdmoen@cgl.waterloo.edu

earleh@dartvax.UUCP (Earle R. Horton) (11/09/87)

In article <2283@watcgl.waterloo.edu>, 
	kdmoen@watcgl.waterloo.edu (Doug Moen) writes:
> 
> Sorry, but this sounds blatantly impossible.  If the code were to move
> while the program is running, then *all of the return addresses and
> function pointers on the stack would become invalid* and the program
> would crash, regardless of what compiler was used to create the program.
> 

Nevertheless, that's how it's done on the Mac.  I suggest you read the 
appropriate section of Inside Macintosh (Segment Loader in this case) the
next time you get the urge to say that something is "blatantly impossible."
-- 
*********************************************************************
*Earle R. Horton, H.B. 8000, Dartmouth College, Hanover, NH 03755   *
*********************************************************************

dwb@apple.UUCP (David W. Berry) (11/10/87)

In article <2283@watcgl.waterloo.edu> kdmoen@watcgl.waterloo.edu (Doug Moen) writes:
>
>Sorry, but this sounds blatantly impossible.  If the code were to move
>while the program is running, then *all of the return addresses and
>function pointers on the stack would become invalid* and the program
>would crash, regardless of what compiler was used to create the program.
	This (code moving while program runs) is exactly what happens.
There are two types of subroutine calls that get compiled into an application,
some of them are direct pc relative calls to other routines in the same
segment.  The rest are made via the "segment table" which is relative
to a5.  The segment table contains either a branch to the appropriate
address in memory if the segment is loaded or a trap (LoadSeg) that
will load the appropriate segment into memory and lock it in place
if it isn't.

	The return addresses on the stack are kept valid by the programmer
only unloading a segment (which actually changes the segment table
and marks the segment purgeable and moveable, but leaves it in core)
if he can verify that there are no calls to it outstanding on the
stack.  The function pointers on the stack are handled the same way.

	Function pointers in tables, etc. are handled by having them
jump into the segment table (which >never< moves once a program
begins executing) rather than jumping to the subroutine required.

	All in all, it's a rather useful way to implement poor
man's demand paging.  This is, by the way, what allows programs
such as Excel, which has more than 200K of code to execute in a
192K MultiFinder segment.  Much of that 200K of code is rarely
needed, and therefore, rarely if ever loaded, meaning the memory
can be used for other things.
-- 
	David W. Berry
	dwb@well.uucp                   dwb@Delphi
	dwb@apple.com                   973-5168@408.MaBell
Disclaimer: Apple doesn't even know I have an opinion and certainly
	wouldn't want if they did.

steele@unc.cs.unc.edu (Oliver Steele) (11/10/87)

kdmoen@watcgl.waterloo.edu (Doug Moen) writes:
>oster@dewey.soe.berkeley.edu.UUCP (David Phillip Oster) writes:
>...concerning why the sumacc compiler is no good...
>>The authors of the compiler seem to be unaware that on a Macintosh
>>executable code can move while the program is running. Unlike all Macintosh
>>compilers, they generate position _dependent_ code, and have a funky
>>loader scheme to resolve non-relocatable references at program load
>>time. Eventually, the code moves, and all that position dependent code
>>points at never-neverland.
>
>Sorry, but this sounds blatantly impossible.

Take a look at UnloadSeg().

>If the code were to move
>while the program is running, then *all of the return addresses and
>function pointers on the stack would become invalid* and the program
>would crash, regardless of what compiler was used to create the program.

That's why it's dangerous to call UnloadSeg() on segments that have live
routines.  The Sumacc compiler just makes it dangerous to call it on
segments that have ever been used, as well.

------------------------------------------------------------------------------
Oliver Steele				  ...!{decvax,ihnp4}!mcnc!unc!steele
							steele%unc@mcnc.org

  "When you believe in a loving God, life appears to be very funny."
							-- Garrison Keillor

raylau@dasys1.UUCP (Raymond Lau) (11/11/87)

In article <2283@watcgl.waterloo.edu>, kdmoen@watcgl.waterloo.edu (Doug Moen) writes:
> oster@dewey.soe.berkeley.edu.UUCP (David Phillip Oster) writes:
> ...concerning why the sumacc compiler is no good...
> >The authors of the compiler seem to be unaware that on a Macintosh
> >executable code can move while the program is running. Unlike all Macintosh
> >compilers, they generate position _dependent_ code, and have a funky
> >loader scheme to resolve non-relocatable references at program load
> >time. Eventually, the code moves, and all that position dependent code
> >points at never-neverland.
> 
> Sorry, but this sounds blatantly impossible.  If the code were to move
> while the program is running, then *all of the return addresses and
> function pointers on the stack would become invalid* and the program
> would crash, regardless of what compiler was used to create the program.
> 
> Now, I can see that maybe desk accessories might move around,
> *but not application programs*.

I think the reference to moveable code is that when a program is segmented, each
segment, except the main segment, is relocatable when and if loaded into
memory.  When anything in that segment is called, a jsr is made to the
jump table, which contains a jump to the actual procedure.  When there are
no references to procedures in that segment, it may be unloaded, and the
entry in the jump table would be replaced by a jmp to a routine which would
1. load in the segment and
2. finally call the original procedure desired.

I believe that is accurate, if not detailed.  It's been a while since I've
reviewed the info on the jump table...so, no promises.


-----------------------------------------------------------------------------
Raymond Lau                      {allegra,philabs,cmcl2}!phri\
Big Electric Cat Public Unix           {bellcore,cmcl2}!cucard!dasys1!raylau
New York, NY, USA                               {sun}!hoptoad/
GEnie:RayLau   Delphi:RaymondLau   CIS:76174,2617
"Take it and StuffIt."

pem@cadnetix.UUCP.UUCP (11/12/87)

In article <2283@watcgl.waterloo.edu> kdmoen@watcgl.waterloo.edu (Doug Moen) writes:
>oster@dewey.soe.berkeley.edu.UUCP (David Phillip Oster) writes:
>...concerning why the sumacc compiler is no good...
>>The authors of the compiler seem to be unaware that on a Macintosh
>>executable code can move while the program is running. Unlike all Macintosh
>>compilers, they generate position _dependent_ code, and have a funky
>>loader scheme to resolve non-relocatable references at program load
>>time. Eventually, the code moves, and all that position dependent code
>>points at never-neverland.
>
>Sorry, but this sounds blatantly impossible.  If the code were to move
>while the program is running, then *all of the return addresses and
>function pointers on the stack would become invalid* and the program
>would crash, regardless of what compiler was used to create the program.
>
>Now, I can see that maybe desk accessories might move around,
>*but not application programs*.

Doug, I'm afraid you clearly do not understand the execution-time
environment on the Macintosh.  As David is correct in stating, code
*does* move around at execution time--not code that is executing, but
code in the same program.  The Mac has a smart segment loader which
cooperates with the memory manager in a well-built (read well-written-
by-the-compiler) application.  While code in a segment is executing it
is in a LOCKED relocatable memory block; when no code in a segment is
executing or on the return stack, the block should be unlocked to
allow the memory manager to do what it needs to do most efficiently.
For maximum memory efficiency, explicit Segment Loader calls should be
put in an application's main event loop, as is done automatically by
Lightspeed C and can be done explicitly in Megamax C.

On a 68000-series machine, sufficient fancy addressing modes are
available to make absolute references to data unnecessary except for
using low-memory globals (which of course don't move anyway).  Absolute
references are what make code not "dynamically relocatable", and
all programs for the Mac are required to be dynamically relocatable.

Presumably, the sumacc application could be set up as a single
segment, but a single segment can't be more than 32K long.  I have no
hard data on what sumacc does, but if it doesn't cooperate with the
Mac operating system, it will obviously be vulnerable to breaking at
the change of the wind.

-- 
pem@cadnetix.UUCP  (nbires!isis!ico!cadnetix!pem)

drc@dbase.UUCP (Dennis Cohen) (11/13/87)

In article <1102@cadnetix.UUCP>, pem@cadnetix.UUCP (Paul Meyer) writes:
> In article <2283@watcgl.waterloo.edu> kdmoen@watcgl.waterloo.edu (Doug Moen) writes:
> 
> On a 68000-series machine, sufficient fancy addressing modes are
> available to make absolute references to data unneccessary except for
> using low-memory globals (which of course don't move anyway).  This
> kind of coding is what makes code not "dynamically relocatable", and
> all programs for the Mac are required to be dynamically relocatable.
> 
> Presumably, the sumacc application could be set up as a single
> segment, but a single segment can't be more than 32K long.  I have no
> hard data on what sumacc does, but if it doesn't cooperate with the
> Mac operating system, it will obviously be vulnerable to breaking at
> the change of the wind.
> 

Wrong!!!  There are compilers such as Sumacc, TDI Modula-2, and one of the
FORTRAN compilers (I forget the name, but marketed by DCM in Fort Worth) that
do NOT use the Mac Segment loader.  There is nothing sacred about using the
segment loader, it's just that this is the addressing mode used by the vast
majority of Mac compilers (because that's the way Lisa Pascal did it, probably).
Compilers that don't use the segment loader need to have some other way of
handling overlays and segmentation is all.  As a matter of fact, TDI advertises
the lack of segment loader support as a beneficial feature of their compiler!
I'm not sure that I agree with that in general, but it could be valuable for a
certain class of application (maybe CAD) as this also removes the stack frame
and allows for other forms of global data storage than A5-relative.

You'll notice that the underlying architecture is a MC680x0 and that other
machines with the same processor don't have the segment loader.  It's just a
set of traps that Apple provides in ROM and PTCH to make things a little bit
easier, kind of like the Control Manager.  You don't have to use it, but if
you don't, you'll probably need to replace its functionality with something
else.

In short, don't say it ain't so until you at least take the time to check it
out.

Dennis Cohen
Ashton-Tate Glendale Development Center
dBASE Mac Development Team
--------------------------
Disclaimer:  The above opinions are mine (proprietary) and do not belong to my
employer.

smethers@psu-cs.UUCP (Paul Smethers) (11/19/87)

In article <1102@cadnetix.UUCP> pem@cadnetix.UUCP (Paul Meyer) writes:
>For maximum memory efficiency, explicit Segment Loader calls should be
>put in an application's main event loop, as is done automatically by
>Lightspeed C and can be done explicitly in Megamax C.
 ^^^^^^^^^^^^
Is this true?  I'm unaware of LSC doing anything funny in the event loop
(and if they do, how do they know how to find an application's event loop?)

I'm sure this is a misunderstanding, since it is the programmer's
responsibility to put calls to unload segments into the event loop.

Paul Smethers