[comp.sys.mac.programmer] BlockMove efficiency

palarson@watdragon.waterloo.edu (Paul Larson) (06/05/88)

I'm working on a Lightspeed Pascal program which basically juggles
single and double bytes from lookup tables into a coherent order.
I'm using the BlockMove call to accomplish this.  How efficient
is BlockMove for things like this?

I realize that the ideal thing to use would be assembler and MOVE.*
statements, but I'm not a pro yet.

Johan Larson, Programmer Aspirant

earleh@eleazar.dartmouth.edu (Earle R. Horton) (06/06/88)

In article <7212@watdragon.waterloo.edu> palarson@watdragon.waterloo.edu (Paul Larson) writes:
>
>I'm working on a Lightspeed Pascal program which basically juggles
>single and double bytes from lookup tables into a coherent order.
>I'm using the BlockMove call to accomplish this.  How efficient
>is BlockMove for things like this?
>

BlockMove isn't bad for general purpose use, and I probably couldn't
do better.  For specific cases, however, there is certainly room for
speed improvement.  BlockMove uses "move.b" for all moves, and strings
a few of them together to gain a speed increase over a loop.  If you
are moving WORDs (integers, shorts, what have you) then you can move
them much faster using "move.w" or "move.l".  If you are moving
LONGINTs, then you can move them a lot faster on a Mac II if they are
aligned properly and you use "move.l."

Take a look at the code which you would normally write to do the same
move function in your high-level language.  If it contains a loop,
then BlockMove is probably indicated.  If the source and destination
are always aligned on even bytes, then maybe you want to write some
assembler to do the same thing.  What percentage of time do you spend
actually doing the move?  If this is small, maybe looping is just
fine.

Too bad, but if you really want to know the best method for use with
your program, you have to time all of them, I think.

*********************************************************************
*Earle R. Horton, H.B. 8000, Dartmouth College, Hanover, NH 03755   *
*********************************************************************

lsr@Apple.COM (Larry Rosenstein) (06/07/88)

In article <8796@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle R. Horton) writes:
>
>BlockMove isn't bad for general purpose use, and I probably couldn't
>do better.  For specific cases, however, there is certainly room for
>speed improvement.  BlockMove uses "move.b" for all moves, and strings

The last statement doesn't appear to be true, from looking at the Mac Plus
and Mac II.  BlockMove uses MOVE.B's only if exactly one of the source and
destination addresses is odd.  The code uses MOVE.W and MOVE.L where
possible, and on the 68000 uses MOVEM.L if the number of bytes is greater
than 160.
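
The alignment test described above can be sketched in portable C.  This
is only an illustration, not Apple's code: sketch_copy is a hypothetical
name, the regions are assumed not to overlap, and the 4-byte step simply
stands in for MOVE.L.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the ROM routine: copy n bytes from src to
   dst (regions assumed not to overlap), choosing the move width from
   the low bit of each address. */
void sketch_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(d != NULL && s != NULL);

    /* Exactly one address is odd: no number of leading byte moves can
       make both even at once, so use byte moves throughout (the
       MOVE.B case). */
    if ((((uintptr_t)d ^ (uintptr_t)s) & 1u) != 0) {
        for (; n > 0; n--)
            *d++ = *s++;
        return;
    }

    /* Both addresses are odd: a single byte move makes both even. */
    if (((uintptr_t)d & 1u) != 0 && n > 0) {
        *d++ = *s++;
        n--;
    }

    /* Both addresses are even: move four bytes per step (standing in
       for MOVE.L).  Going through memcpy keeps this valid C on
       machines that are stricter about alignment than a 68000. */
    while (n >= 4) {
        uint32_t tmp;
        memcpy(&tmp, s, 4);
        memcpy(d, &tmp, 4);
        d += 4;
        s += 4;
        n -= 4;
    }

    /* Tail: the remaining 0..3 bytes. */
    for (; n > 0; n--)
        *d++ = *s++;
}
```

A call such as sketch_copy(table + 1, source + 1, 10) takes the
both-odd path, while sketch_copy(table + 1, source, 10) falls back to
byte moves.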

You are right, however, that in some special cases you can do better than
BlockMove, since BlockMove is written for the general case.  The code
distinguishes a few different cases, which takes a few cycles.  Also remember
that calling BlockMove via a trap takes a few extra microseconds; you can
get the address of BlockMove at the start of the program and JSR directly to
the code to eliminate this overhead.

-- 
		 Larry Rosenstein,  Object Specialist
 Apple Computer, Inc.  20525 Mariani Ave, MS 27-AJ  Cupertino, CA 95014
	    AppleLink:Rosenstein1    domain:lsr@Apple.COM
		UUCP:{sun,voder,nsc,decwrl}!apple!lsr

dan@Apple.COM (Dan Allen) (06/07/88)

>speed improvement.  BlockMove uses "move.b" for all moves, and strings
>a few of them together to gain a speed increase over a loop.  If you
>are moving WORDs (integers, shorts, what have you) then you can move
>them much faster using "move.w" or "move.l".  If you are moving
>LONGINTs, then you can move them a lot faster on a Mac II if they are
>aligned properly and you use "move.l."
>
>*********************************************************************
>*Earle R. Horton, H.B. 8000, Dartmouth College, Hanover, NH 03755   *
>*********************************************************************

BlockMove was considerably enhanced in the Mac Plus over the original
Mac, and the new improved version has been present on all machines since
the Mac Plus.

The new improved BlockMove uses a MOVE.L loop when possible, and for
blocks of memory larger than 124 bytes it uses a VERY FAST 12-register
MOVEM.L instruction that is about as optimized as possible.  The result?
_BlockMove is good for both general purpose and many specialized calls.
In fact, MultiFinder even uses BlockMove to do its low memory context
switching.  You can't get much better.  And what is neat is that it has
different strategies depending on how much you are moving.

Everyone ought to go out and call _BlockMove today!

Dan Allen
Software Explorer
Apple Computer

ephraim@think.COM (ephraim vishniac) (06/07/88)

In article <8796@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle R. Horton) writes:
>BlockMove isn't bad for general purpose use, and I probably couldn't
>do better.  For specific cases, however, there is certainly room for
>speed improvement.  BlockMove uses "move.b" for all moves, and strings
>a few of them together to gain a speed increase over a loop.  If you
>are moving WORDs (integers, shorts, what have you) then you can move
>them much faster using "move.w" or "move.l".  If you are moving
>LONGINTs, then you can move them a lot faster on a Mac II if they are
>aligned properly and you use "move.l."

Earle, the last time I disassembled _BlockMove (back in 64K ROM days),
the description you give above was wrong.  Unless someone at Apple has
completely lost his mind, I expect it's still wrong.

BlockMove checks on the relative alignment of the source and
destination.  When possible, it does bulk moves by saving off most of
the registers, then doing large MOVEM.L's to read/write as much data
as possible at each pass.  I seem to recall a four-instruction loop:
one to read from memory to registers; one to write from registers to
memory; one to fix up an address pointer because MOVEM has limited
choice of pre/post increment/decrement; and a dbra for loop control.
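
The four-step loop recalled above might look roughly like this in
portable C.  It is a sketch under assumptions, not a transcription of
the ROM: chunked_copy is a hypothetical name, the regions are assumed
not to overlap, and the count of twelve 32-bit "registers" per pass is
borrowed from a figure quoted elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { REGS = 12 };  /* assumed number of 32-bit registers per pass */

/* Hypothetical helper: copy n bytes in passes of REGS * 4 bytes,
   staging each pass in a small local array that plays the part of
   the saved-off registers. */
void chunked_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uint32_t regs[REGS];
    size_t passes = n / sizeof regs;

    assert(d != NULL && s != NULL);

    for (; passes > 0; passes--) {     /* loop control (the DBRA) */
        memcpy(regs, s, sizeof regs);  /* memory -> registers */
        memcpy(d, regs, sizeof regs);  /* registers -> memory */
        s += sizeof regs;              /* address fix-up, done by */
        d += sizeof regs;              /* hand in this sketch     */
    }

    /* Whatever is left is smaller than one pass; copy it plainly. */
    memcpy(d, s, n % sizeof regs);
}
```

With REGS set to 12, a 200-byte move makes four 48-byte passes and
leaves 8 bytes for the final plain copy.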

What's the source of your claim?


Ephraim Vishniac					  ephraim@think.com
Thinking Machines Corporation / 245 First Street / Cambridge, MA 02142-1214

     On two occasions I have been asked, "Pray, Mr. Babbage, if you put
     into the machine wrong figures, will the right answers come out?"

earleh@eleazar.dartmouth.edu (Earle R. Horton) (06/08/88)

In article <21737@think.UUCP> ephraim@vidar.think.com.UUCP (ephraim vishniac) writes:
>In article <8796@dartvax.Dartmouth.EDU> earleh@eleazar.dartmouth.edu (Earle R. Horton) writes:
[Incorrect statements by me.]
...
>What's the source of your claim?

I claim nearsightedness and temporary insanity in my analysis of BlockMove().
My "mbox" overfloweth.

I still maintain, however, that if you have "move"s in your program and you
have a real concern as to the speed, you will have to TIME IT to find the
absolute best method.  Also, the best method probably depends heavily on the
number of bytes to be moved.  If the number of bytes varies, there may well
be no best choice.
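
A minimal timing harness along those lines, in portable C.  Everything
here is an assumption made for illustration: the names are
hypothetical, memmove stands in for a general-purpose mover such as
BlockMove, and clock() is a coarse timer, so the repetition count has
to be large enough to register.  An optimizing compiler may also
simplify the repeated copies, so treat any numbers with suspicion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Candidate 1: a plain byte-at-a-time loop. */
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Candidate 2: the library's general-purpose mover. */
void copy_library(unsigned char *dst, const unsigned char *src, size_t n)
{
    memmove(dst, src, n);
}

typedef void (*copy_fn)(unsigned char *, const unsigned char *, size_t);

/* Return the CPU seconds taken by `reps` copies of n bytes using the
   given candidate. */
double time_copy(copy_fn fn, unsigned char *dst,
                 const unsigned char *src, size_t n, long reps)
{
    clock_t start, stop;
    long r;

    assert(fn != NULL && reps > 0);

    start = clock();
    for (r = 0; r < reps; r++)
        fn(dst, src, n);
    stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy for each candidate at several sizes (say 2, 16, 256
and 4096 bytes) and comparing the results per size shows whether one
method wins everywhere or only over part of the range.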

*********************************************************************
*Earle R. Horton, H.B. 8000, Dartmouth College, Hanover, NH 03755   *
*********************************************************************