[comp.sys.m68k] inst. cache/optimization

tcd@vax5.cit.cornell.edu (01/10/91)

     This is probably a dumb question and I'm not sure if this is the most
appropriate group, but here goes anyway.  Suppose one is writing an assembly
language program for a Macintosh with 68020/68881 chips that has a loop
containing just enough instructions to fill the instruction cache twice.
It seems that one would then get zero cache hits.  Would it make sense to
set the Freeze Cache bit halfway through the loop, or do something similar
with the Cache Control Register?  In a similar vein, unrolling loops is
suggested as a way to improve performance, but is this still likely to be
a good idea if, say, the original version just fills the instruction cache?
     I have spent some time studying relevant sections of the User Guides
from Motorola, but have had a difficult time putting all the pieces together.
Can anyone recommend a good book that covers these sorts of issues, for the
68030 and 68882 chips as well?
Thanks,
Tim Dorcey

kdq@demott.com (Kevin D. Quitt) (01/11/91)

    To answer your second question first, unrolling loops speeds up code
because the loop tests and branches are eliminated.  For a loop wholly
inside the cache, branches are very inexpensive, as they generally fall
into the "best" category - 0 or 1 cycles for most common branches.  So,
if you can make it fit in the cache, make it fit. 
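
    As a rough illustration of "make it fit", here is a small Python
sketch (not 68020 code) that estimates how far a loop can be unrolled
before it outgrows the 68020's 256-byte instruction cache.  The byte
counts in the example are hypothetical, and the sketch assumes the loop
starts on a long-word boundary and that the iteration count is a
multiple of the unroll factor.

```python
CACHE_BYTES = 256  # 68020 on-chip instruction cache: 64 long words


def max_unroll(body_bytes, overhead_bytes, cache_bytes=CACHE_BYTES):
    """Largest unroll factor whose code still fits in the cache.

    body_bytes:     size of one copy of the loop body (hypothetical figure)
    overhead_bytes: size of the test-and-branch that closes the loop
    """
    if body_bytes <= 0:
        raise ValueError("body_bytes must be positive")
    return max((cache_bytes - overhead_bytes) // body_bytes, 0)


def branches_saved(iterations, factor):
    """Loop-closing branches avoided by unrolling `factor` times."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return iterations - iterations // factor


# Hypothetical loop: 20-byte body closed by a 4-byte branch.
print(max_unroll(20, 4))          # 12 copies: 12 * 20 + 4 = 244 bytes
print(branches_saved(1200, 12))   # 1100 of the 1200 branches disappear
```

Unrolling a thirteenth time would push this example to 264 bytes, past
the cache size, which is the point where the cheap in-cache branch stops
being the thing to optimize.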

    If your loop doesn't fit into the cache, it's probably still a
win to keep the cache active, since the cache is filled 32 bits at a
time, and some of your instructions are only going to be 16 bits (so the
second instruction in each long word will be a hit).  It's really a tough
call, though, since you could look at the timing charts and build some
super-efficient code for the portion you lock in (or, if your memory is
slow, you'd get the same effect).
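
    To put rough numbers on that trade-off, here is a deliberately
simplified Python model of the 68020 instruction cache (64 direct-mapped
entries of one long word each).  It is only a sketch: it assumes
straight-line code starting at address 0, fetches of a fixed size
(`fetch_bytes`, a made-up parameter), and a freeze that stays on once
set.  It ignores prefetch and real instruction timing.

```python
ENTRIES = 64      # direct-mapped entries in the 68020 instruction cache
LINE_BYTES = 4    # each entry holds one 32-bit long word


def hit_rate(loop_bytes, iterations, freeze_at=None, fetch_bytes=2):
    """Fraction of instruction fetches that hit in the modeled cache.

    freeze_at:   byte offset in the loop where the cache is frozen
                 (None means the cache is never frozen)
    fetch_bytes: 2 models all-16-bit instructions, 4 all-32-bit ones
    """
    tags = [None] * ENTRIES
    frozen = False
    hits = fetches = 0
    for _ in range(iterations):
        for addr in range(0, loop_bytes, fetch_bytes):
            if freeze_at is not None and addr >= freeze_at:
                frozen = True          # stays frozen on later passes
            line = addr // LINE_BYTES
            index = line % ENTRIES     # which cache entry is used
            tag = line // ENTRIES      # which long word occupies it
            fetches += 1
            if tags[index] == tag:
                hits += 1
            elif not frozen:
                tags[index] = tag      # a miss fills the entry
    return hits / fetches


# A 512-byte loop (twice the cache size), 100 passes:
print(hit_rate(512, 100, fetch_bytes=4))                 # never frozen
print(hit_rate(512, 100, freeze_at=256, fetch_bytes=4))  # frozen halfway
print(hit_rate(512, 100, fetch_bytes=2))                 # 16-bit code
```

In this model a loop of all-32-bit instructions gets no hits with the
cache left running and close to half with the freeze set at the midpoint,
while a loop of all-16-bit instructions already hits half the time
without any freeze.  Real code falls somewhere in between, which is why
it is a tough call.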



-- 
 _
Kevin D. Quitt         demott!kdq   kdq@demott.com
DeMott Electronics Co. 14707 Keswick St.   Van Nuys, CA 91405-1266
VOICE (818) 988-4975   FAX (818) 997-1190  MODEM (818) 997-4496 PEP last