[net.micro.mac] >64k code segs. WAS: Aztec C assembler bug?

oster@ucblapis.berkeley.edu (David Phillip Oster) (05/29/86)

Because of a bug in the old ROMs' Resource Manager, resources, including
code resources, could not be bigger than 32K.  As a result, most compilers
are poorly debugged when it comes to putting more than that amount of code
in a single segment.

The Motorola 68000 processor itself compounds the problem by having
different rules for generating position-independent code depending on
whether 16-bit or 32-bit address offset instructions are used.

(Instances: the T.M.L. Pascal Version 1.11 linker
will generate incorrect code, without even giving you a warning, if you tell
it to put more than 32K of code in a segment.  The Sumacc C cross-development
system tries to handle the problem by not using the Mac Segment Loader,
putting all your code in one block, and patching references itself at load
time to try to get you position independence.  This scheme almost works -
it fails for desk accessories like Calendar, since in a desk accessory
the application is free to move around the code of the desk accessory
whenever the desk accessory is not being run.  Since relocation happens
only when the desk accessory comes in from disk, running the moved,
in-memory copy causes a bomb.  Setting the "needs lock" bit cures the symptom,
but not the problem of bad code generation.)

Breaking a program up into segments seems straightforward, but it has its
own set of pitfalls.  The big one is that before the break-up all the
procedures in the program were in the same segment.  Afterwards, an
ordinary procedure call might cause the heap to get re-arranged.
My example of this bug uses a source of segmentation that happens all the
time, but that most programmers don't think about:  Whenever you call a
routine in a package, you have the same potential for disaster that you
have when you call a routine in a different segment.

Example:
PROCEDURE Bomber;
VAR h : Handle;
BEGIN
  h := GetDataHandleProcedure;
  { h^ is handed out without locking h first }
  Qsort(h^, 0, GetHandleSize(h) div DATAITEMSIZE, @IUCompString);
END;
This bombs some of the time because when Qsort calls IUCompString,
IUCompString is going to call the International Utilities Package.
The package will be read in from disk if it is not already resident.
The resource manager will call the memory manager to get room to put the
package.  The memory manager might reshuffle the heap, and if it does,
h^ will no longer point at the data.
(Before the current release of the software, the Pascal version of IUCompString
incorrectly failed to save register A3, but that is another problem.)

The solution is simple: Use MacroScope to:
1.) prowl the source code files that make up a program to produce a text file
containing the calling tree of the entire program (a graph of what calls
what for the entire program.)
2.) resolve the calling graph against the linker source file to discover
what calling chains might potentially cause problems due to heap shuffling
and un-locked pointers.
3.) examine the actual calls to determine whether the arguments are locked
or not.  (Throw out those lines where it is easy to tell that all the
arguments are non-moveable.)
4.) produce an error message file containing only those lines that might
cause problems.

Unfortunately MacroScope hasn't been released for sale yet.

--- David Phillip Oster		-- "The goal of Computer Science is to
Arpa: oster@lapis.berkeley.edu  -- build something that will last at
Uucp: ucbvax!ucblapis!oster     -- least until we've finished building it."

dave@comm.UUCP (Dave Brownell) (06/01/86)

In article <761@jade.BERKELEY.EDU> oster@ucblapis.UUCP,
otherwise known as "David Phillip Oster" writes:

> [ Much prose omitted ]

> 2.) resolve the calling graph against the linker source file to discover
> what calling chains might potentially cause problems due to heap shuffling
> and un-locked pointers.

    Maybe I'm being naive ... but this is exactly the class of bugs
    that I like to fix by not making them in the first place.  Is it
    really THAT hard to obey the discipline of locking your handles
    when you pass "hard pointers" around?  Or of fixing someone else's
    code to do that?  (Or hardly ever using "hard pointers" at all?)

    I think a good top-down design in the first place can eliminate
    maybe 50% of the debugging cycle, and will give a better product
    in the end.  (As well as giving a nice clean calling graph!!)
-- 
Dave Brownell		{panda,genrad,harvard}!enmasse!comm!dave

"They sang long into the evening about their Truck and Radio."

oster@ucblapis.berkeley.edu (David Phillip Oster) (06/03/86)

Is it that hard to lock all your handles when you dereference them? Well,
yes and no.  The problem is that:
1) You often give out pointers to fields within handles. Particularly in C,
where some types are passed by value (Rects) and others by reference
(TypeLists), you often give out pointers without having any marker in the
source to say you are doing it.
 
2) If you write programs that use the whole machine, you can't just lock all
your handles before you begin and unlock them when you are done -- you'll
die of memory fragmentation.
 
Therefore, it is very important to lock handles when you need them locked,
and unlock them again as soon as possible.  Segmentation changes the rules
for when things need to be locked.
 
This kind of problem is made worse by bad design, but I saw no need to
mention in my original posting that one should always do a good design.