[comp.sys.mac.programmer] Summary: Speeding up MacApp code

philipc@runx.oz.au (Philip Craig) (04/14/91)

Here is the summary of the replies I got when I asked about speeding
up code written using MacApp. Thanks to all who responded.

From: Keith Rollin <keith%apple.com@cs.mu.oz>
Subject: Re: Speeding up MacApp programs
Organization: Apple Computer Inc., Cupertino, CA

In article <1991Apr8.101436.11202@runx.oz.au> you write:
>Greetings netters,
>	I am soon to be having a look at a MacApp-based project that runs
>very slowly, too slowly in fact to take to market as is. I was wondering
>if people here have any suggestions or brilliant ideas as to how to speed
>up MacApp code in large projects. I have an idea that the speed bottleneck
>is due to many (thousands?) of small objects being continually allocated and
>deallocated, but I'd be interested to see what other speed-up ideas people
>have.

Philip,

Definitely one place to look for speed problems is the area you
describe. If your program is allocating and de-allocating thousands of
objects at a high frequency, then that alone may be what is slowing it
down. If that's the case, then the only solution is to stop doing that.
MacApp itself is not constantly creating and disposing of thousands of
objects every time through the event loop, so the problem may be
solvable by looking at your own part of the source code.

MacApp 3.0 will alleviate this problem a little bit, as it implements
a low-level object cache in its object creation and deletion code.
However, it caches only a single object, so that won't help anyone
using thousands of objects.
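
As a rough illustration of what such a cache buys you, here is a
minimal one-slot cache sketched in portable C++ (this is not MacApp's
actual code; the slot size and the CachedAllocate/CachedFree names are
assumptions for illustration). It simply hangs on to the most recently
freed block and hands it back on the next allocation:

    // One-slot allocation cache, sketched in portable C++.  The slot
    // size and function names are assumptions, not MacApp's own.
    #include <cstdlib>
    #include <cstddef>

    static const std::size_t kSlotSize = 64;  // assumed max object size
    static void* gCachedSlot = 0;             // the single cached block

    void* CachedAllocate(std::size_t bytes)
    {
        if (gCachedSlot != 0 && bytes <= kSlotSize) {
            void* p = gCachedSlot;            // reuse the cached block
            gCachedSlot = 0;
            return p;
        }
        return std::malloc(bytes > kSlotSize ? bytes : kSlotSize);
    }

    void CachedFree(void* p, std::size_t bytes)
    {
        if (gCachedSlot == 0 && bytes <= kSlotSize) {
            gCachedSlot = p;                  // keep one block around
            return;
        }
        std::free(p);
    }

The win is that a free immediately followed by another small allocation
never has to go back to the underlying allocator; with thousands of
live objects churning at once, though, a single slot obviously doesn't
go very far.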

Mark Lannett also pointed out on Usenet that parts of the user
interface can cause problems. Specifically, in complex views, MacApp
walks the view hierarchy a lot, and when it does, it usually has to
call TView.Focus on each view. Since Focus involves a lot of region
manipulation, your program can slow down. However, this view walking
doesn't occur at the core of any tight loops inside MacApp, so these
speed hits are usually one-shots and shouldn't affect the overall
execution of your program.

In debug mode, MacApp implements a code profiler. Try turning that on
and seeing if it helps. In all honesty, it may not, as the profiler
ends up profiling mostly MacApp's debug code, but who knows...something
might turn up.

Just remember...regardless of the platform, _any_ complex program will
run slowly until it is optimized. When you program with MacApp, you
automatically start out with a complex platform, and it just might be
slow until you optimize it. Speed issues aren't due to MacApp per se;
they are problems with any large program, of which MacApp is an
example.

------------------------------------------------------------------------------
Keith Rollin  ---  Apple Computer, Inc. 
INTERNET: keith@apple.com
    UUCP: {decwrl, hoptoad, nsc, sun, amdahl}!apple!keith
"But where the senses fail us, reason must step in."  - Galileo
------------------------------------------------------------------------------
Date: Tue,  9 Apr 91 15:57:43 -0400 (EDT)


Howdy,

   I have some advice that you ought to take with a grain of salt.  I
have a large-scale project that allocates tons of pointer-based and
handle-based record structures and objects (mostly record structures;
MacApp is used for the interface, not the model/document).  When this
project was first ported to the Mac, NewPtr and NewHandle became
noticeably (seemingly exponentially) slower as the number of objects
increased, and performance was typically abysmal compared to Unix
virtual memory allocation (which had its own set of problems).

   I resorted to the following techniques.  First, I wrote bottlenecks
for allocating each different kind of data structure instead of calling
NewPtr or NewHandle directly (okay, this is mostly straight structure
stuff), e.g., NewValuePtr, NewGraphHdl, etc.  I then designed a
THeapManager object that holds references to "heaps", e.g., fValueHeap,
fGraphHeap, etc.  The THeapManager exists simply so the bottlenecks can
redirect allocation and deallocation requests to the appropriate heap;
each heap's job is to find space inside its blocks and to manage those
blocks.  So I have only one global gHeapManager, whose fields are a
THeap reference for each data type, and it has no other (current)
purpose or methods beyond IHeapMgr and Free.
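
   A rough sketch of that arrangement in portable C++ (the ValueRec and
GraphRec record types, the heap parameters, and the names are
illustrative assumptions, not the actual declarations; the THeap shown
here merely forwards to malloc, with the real pooling sketched after
the next two paragraphs):

    // Bottleneck allocators routed through a global heap manager.
    // ValueRec, GraphRec, and the heap parameters are made-up examples.
    #include <cstdlib>
    #include <cstddef>

    struct ValueRec { double value; long flags; };
    struct GraphRec { long nodeCount; long edgeCount; };

    class THeap {
    public:
        THeap(std::size_t recordSize, int recordsPerBlock)
            : fRecordSize(recordSize), fRecordsPerBlock(recordsPerBlock) {}
        // Placeholder only: a real THeap would carve records out of its
        // own blocks (sketched later); this one just forwards to malloc.
        void* Allocate()      { return std::malloc(fRecordSize); }
        void  Free(void* rec) { std::free(rec); }
    private:
        std::size_t fRecordSize;
        int         fRecordsPerBlock;
    };

    struct THeapManager {
        THeap fValueHeap;
        THeap fGraphHeap;
        THeapManager() : fValueHeap(sizeof(ValueRec), 256),
                         fGraphHeap(sizeof(GraphRec), 128) {}
    };

    static THeapManager gHeapManager;

    // The bottlenecks: the rest of the program calls these and never
    // calls NewPtr/NewHandle (here, malloc) directly.
    ValueRec* NewValuePtr()
        { return (ValueRec*) gHeapManager.fValueHeap.Allocate(); }
    void DisposeValuePtr(ValueRec* p)
        { gHeapManager.fValueHeap.Free(p); }
    GraphRec* NewGraphPtr()
        { return (GraphRec*) gHeapManager.fGraphHeap.Allocate(); }
    void DisposeGraphPtr(GraphRec* p)
        { gHeapManager.fGraphHeap.Free(p); }

   The point of the bottlenecks is that the allocation strategy behind
them can change (Memory Manager, pool, whatever) without touching any
call sites.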

   I wrote several standard (non-object, non-true-handle) THeap types
for my different needs.  The most common type I used was a pointer-based
"reusable heap." It holds two integers, a "record size" and "records per
block," and a TList of TBlocks.  When allocating, the THeap asks each
TBlock in its list for a free record until one returns an unused record
or else it determines it must allocate a new block.  When deleting, it
finds the block containing the structure to be deleted and tells that
block to free the structure.

   I'll skip over describing TBlock, which is my Unix-style brute-force
allocator that makes no attempt to reclaim deallocated space.  The
TReusableBlock has a reference to a chunk of memory and a list of
"pointers" (actually offsets) used to maintain a linked list of
unallocated records within that chunk.  When a TReusableBlock is
created, it looks at the record size and records per block of its
fHeap, allocates a chunk of memory of the matching size, and allocates
enough pointers for that block as well.  One doohickey I added recently
was an integer counter (a live-record count rather than a semaphore)
that tells when the block has been completely deallocated, so the block
can remove itself from the heap's list and delete itself when it is
empty; this helps minimize fragmentation over time and frees up some
memory.  The TReusableBlock uses its pointers as a "free record linked
list" with a head pointer, fFreePtr, so that freed records can be
handed out again when they are needed.  Allocation in a block is as
fast as checking whether fFreePtr is NIL (if it isn't, it points to the
first free record in the chunk of data).  I typecast all record
structures in the bottleneck procedures rather than defining zillions
of THeap or TBlock subclasses for each data type.
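
   Pulling the last two paragraphs together, a compact version of the
reusable heap and its blocks might look like the following portable C++
sketch (again illustrative, with assumed names and parameters, and with
slot indices standing in for the offset "pointers"):

    // A fixed-size-record pool: a heap that owns a list of blocks,
    // each block keeping a free list of record slots.  Sketch only.
    #include <cstddef>
    #include <vector>

    class TReusableBlock {
    public:
        TReusableBlock(std::size_t recordSize, int recordsPerBlock)
            : fRecordSize(recordSize), fCount(recordsPerBlock),
              fStorage(recordSize * recordsPerBlock),
              fNext(recordsPerBlock), fFreeHead(0), fUsed(0)
        {
            // Chain every slot into the free list: 0 -> 1 -> ... -> fCount.
            for (int i = 0; i < fCount; ++i)
                fNext[i] = i + 1;              // fCount means "end of list"
        }

        void* AllocateRecord()
        {
            if (fFreeHead == fCount)
                return 0;                      // block is full
            int slot  = fFreeHead;             // pop the first free slot
            fFreeHead = fNext[slot];
            ++fUsed;
            return &fStorage[slot * fRecordSize];
        }

        bool Contains(void* rec) const
        {
            const char* p = (const char*) rec;
            return p >= &fStorage[0] && p < &fStorage[0] + fStorage.size();
        }

        void FreeRecord(void* rec)
        {
            int slot = (int) (((char*) rec - &fStorage[0]) / fRecordSize);
            fNext[slot] = fFreeHead;           // push the slot back on the list
            fFreeHead   = slot;
            --fUsed;
        }

        bool Empty() const { return fUsed == 0; }

    private:
        std::size_t       fRecordSize;
        int               fCount;
        std::vector<char> fStorage;            // the raw chunk of memory
        std::vector<int>  fNext;               // free-list links (slot indices)
        int               fFreeHead;           // head of the free list
        int               fUsed;               // the live-record counter
    };

    class THeap {
    public:
        THeap(std::size_t recordSize, int recordsPerBlock)
            : fRecordSize(recordSize), fRecordsPerBlock(recordsPerBlock) {}

        ~THeap()
        {
            for (std::size_t i = 0; i < fBlocks.size(); ++i)
                delete fBlocks[i];
        }

        void* Allocate()
        {
            // Ask each existing block for a free record first.
            for (std::size_t i = 0; i < fBlocks.size(); ++i)
                if (void* rec = fBlocks[i]->AllocateRecord())
                    return rec;
            // All blocks are full: add a new one and allocate from it.
            fBlocks.push_back(new TReusableBlock(fRecordSize, fRecordsPerBlock));
            return fBlocks.back()->AllocateRecord();
        }

        void Free(void* rec)
        {
            for (std::size_t i = 0; i < fBlocks.size(); ++i)
                if (fBlocks[i]->Contains(rec)) {
                    fBlocks[i]->FreeRecord(rec);
                    if (fBlocks[i]->Empty()) { // shed blocks that empty out
                        delete fBlocks[i];
                        fBlocks.erase(fBlocks.begin() + i);
                    }
                    return;
                }
        }

    private:
        std::size_t                  fRecordSize;
        int                          fRecordsPerBlock;
        std::vector<TReusableBlock*> fBlocks;
    };

   Because the bottleneck procedures do the typecasting, one
THeap/TReusableBlock pair like this serves every record type; only the
record size and records per block differ from heap to heap.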

   It gets more complicated than this, but rest assured, it doesn't take
much effort to write this stuff.  It took me a couple of days to set it
up and maybe a total of 5 days scattered in the past year improving it. 
Although my techniques may sound very convoluted, I got a 200x
improvement over calling NewHandle and NewPtr directly.  One test
evaluation of a real model requiring ~3MB of memory used to spend 45
seconds allocating and now spends 2-3 seconds.  I didn't have
significant adverse effects on fragmentation because at the time most of
my structures were pointers anyway (I have since converted most large
structures to handles and have gradually lessened my dependence on
THeaps).  I have applied this to both static (fixed-size) record
structures and dynamically-sized record structures (usually dynamic
array structures).  My technique is particularly useful for structures
smaller than 24 bytes (the per-block/pointer overhead in the MM).  To
apply this to dynamic arrays, I type-define several small fixed-size
array structures (1, 2, 4, and 7 elements) and use a different heap for
each, then just use the Mac MM directly for arrays larger than 7
elements.  For handle stuff, I had to write HLock and HUnlock
bottlenecks for two of my types, e.g., LockValue, LockGraph, etc., to
avoid passing fake handles to the MM directly (all "handles" in my heaps
are unfortunately locked by definition).
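
   The bucketing for small dynamic arrays can be as simple as rounding
a requested element count up to the nearest fixed size and picking the
matching heap, with anything larger going straight to the general
allocator.  A sketch along those lines (the Element type, the bucket
pools, and the names are assumptions; malloc stands in for the Mac MM):

    // Size-bucketed allocation for small dynamic arrays: requests for
    // 1..7 elements round up to a fixed bucket (1, 2, 4, or 7 elements);
    // anything larger falls through to the general allocator.
    #include <cstdlib>
    #include <cstddef>

    typedef long Element;                  // assumed array element type

    // PoolAllocate/PoolFree stand in for the per-bucket heaps; here
    // they just forward to malloc/free.
    static void* PoolAllocate(int bucket)
        { return std::malloc(bucket * sizeof(Element)); }
    static void PoolFree(void* p, int /*bucket*/)
        { std::free(p); }

    static int BucketFor(int elements)
    {
        if (elements <= 1) return 1;
        if (elements <= 2) return 2;
        if (elements <= 4) return 4;
        if (elements <= 7) return 7;
        return 0;                          // 0 means "no bucket, use the MM"
    }

    Element* NewElementArray(int elements)
    {
        int bucket = BucketFor(elements);
        if (bucket != 0)
            return (Element*) PoolAllocate(bucket);
        return (Element*) std::malloc(elements * sizeof(Element));
    }

    void DisposeElementArray(Element* p, int elements)
    {
        int bucket = BucketFor(elements);
        if (bucket != 0)
            PoolFree(p, bucket);
        else
            std::free(p);
    }

   The dispose side has to be given the original element count (or the
bucket) so that it returns the array to the same pool it came from.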

   I don't know if your memory problems are similar, but I think even
with Pascal objects, if you allocate tons of them, then you aren't using
the Mac MM as it was intended, and your recourses are to (a) fix your
program to use the MM the way it was intended or (b) avoid using the MM
the way it was intended.  The latter method worked for me for a while
and bought me time, but now I have to succumb to the former in order to
reduce memory fragmentation (maybe System 7's virtual memory will bail
me out).  Good luck in your endeavors, and I hope this is of some help.

- Brian
Date: Tue, 9 Apr 91 15:28:47 -0700
From: Kent Sandvik <ksand%apple.com@cs.mu.oz>
Organization: Apple Computer Inc., Cupertino, CA
 
Phil,
 
The MacApp debugger has a performance utility that could show
some of the bottlenecks.
 
Cheers,
Kent, former Oz dude
 
 
--
Kent Sandvik, DTS junkie
-- 
| Philip Craig          | ACSnet,Ean,CSnet: philipc@runxtsa.runx.oz.au         |
| RUNX Unix Timeshare   | Arpa:  philipc%runxtsa.runx.oz.au@UUNET.UU.NET       |
| Wahroonga  NSW  2076  | Janet: philipc%runxtsa.runx.oz.au@UK.AC.UKC          |
| Australia             | UUCP:{uunet,mcvax}!munnari!runxtsa.runx.oz.au!philipc|