[comp.sys.nsc.32k] Timing question

bob@reed.UUCP (Bob Ankeney) (08/23/89)

     A quick question:  Which (if either) is faster in execution:

     movw foo(r1),bar(r1)
               or
     movw foo[r1:b],bar[r1:b]

I don't have a manual with an instruction timing guide handy (and they're
such a pain to use anyway!)  Thanx for the help!

	Bob Ankeney

---------------------------------------------------------------------------
Bob Ankeney                    | "Wise men talk because they have         |
...!tektronix!reed!bob         |  something to say; fools, because        |
...!tektronix!bob@reed.BITNET  |  they have to say something."    - Plato |
...!percival!bob@agravain.UUCP |                                          |
                               | "He who talks by the yard and acts by    |
                               |  the inch should be kicked by the foot." |
---------------------------------------------------------------------------

gideony@microsoft.UUCP (Gideon Yuvall) (08/26/89)

In article <13223@reed.UUCP> bob@reed.UUCP (Bob Ankeney) writes:
>
>     A quick question:  Which (if either) is faster in execution:
...

A quick answer: compile the equivalent "C", under "-O -S -KC<processor>",
using NSC's compilers; you'll get NSC's best guess as to what is
fastest. That guess is probably MORE accurate than the manuals,
and is certainly no less accurate.
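
A minimal sketch of what "the equivalent C" might look like, for anyone who wants to try this. Only the names foo and bar come from the question; the element type (short, assumed to be 16 bits), the array size NWORDS, and the function name copy_word are assumptions made for illustration. The compiler may well pick a different addressing mode than either form in the question, so treat the generated assembly as a hint rather than a direct answer.

```c
#include <assert.h>
#include <stddef.h>

#define NWORDS 64               /* hypothetical array size */

short foo[NWORDS];              /* short assumed to be a 16-bit word */
short bar[NWORDS];

/* Copy one word from foo to bar at the same byte offset, mirroring
   "movw foo(r1),bar(r1)" where r1 holds a byte offset. */
void copy_word(size_t off)
{
    assert(off % sizeof(short) == 0);   /* keep the access aligned */
    assert(off < sizeof foo);           /* stay inside the arrays */
    *(short *)((char *)bar + off) = *(short *)((char *)foo + off);
}
```

Calling copy_word(3 * sizeof(short)) copies foo[3] into bar[3] and leaves the rest of bar alone.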
-- 
Gideon Yuval, gideony@microsof.UUCP, 206-882-8080 (fax:206-883-8101;TWX:160520)

kls@ditka.UUCP (Karl Swartz) (08/31/89)

In article <13223@reed.UUCP> bob@reed.UUCP (Bob Ankeney) writes:
>
>     A quick question:  Which (if either) is faster in execution:

According to my (ancient) manuals, for a 32032 ...

    movw foo(r1),bar(r1)		; 19 cycles
    movw foo[r1:b],bar[r1:b]		; 27 cycles

This assumes the source and destination words are aligned and there
are MMU delays (subtract 2 cycles from each case for no MMU).  Plus
there's the time to fetch an extra two bytes of instruction for the
indexed case.

In other words, index mode is a loser where a simpler addressing
mode can be used, certainly in this case, and probably in all cases.
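
Spelled out as arithmetic, the figures quoted above can be tallied as in the sketch below. The function and constant names are made up for illustration, and the extra instruction-fetch time for the indexed form is left out because no number is given for it.

```c
#include <assert.h>

/* Cycle counts quoted above for the 32032, MMU delays included. */
enum {
    DISP_CYCLES_MMU    = 19,    /* movw foo(r1),bar(r1)     */
    INDEXED_CYCLES_MMU = 27,    /* movw foo[r1:b],bar[r1:b] */
    MMU_PENALTY        = 2      /* subtract when there is no MMU */
};

/* Return the quoted count, adjusted down by 2 cycles when no MMU is present. */
int cycles_32032(int with_mmu_cycles, int has_mmu)
{
    assert(with_mmu_cycles >= MMU_PENALTY);
    return has_mmu ? with_mmu_cycles : with_mmu_cycles - MMU_PENALTY;
}
```

Without an MMU this gives 17 cycles for the displacement form and 25 for the indexed form, so the indexed form costs 8 cycles more either way.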

-- 
Karl Swartz		|UUCP		uunet!lll-winken!ames!hc!rt1!ditka!kls
1-505/667-7777 (work)	|Internet	kls@rt1.lanl.gov
1-505/672-3113 (home)	|BIX		kswartz
"I never let my schooling get in the way of my education."  (Twain)

george@wombat.UUCP (George Scolaro) (09/03/89)

In article <3878@ditka.UUCP> kls@ditka.UUCP (Karl Swartz) writes:
>In article <13223@reed.UUCP> bob@reed.UUCP (Bob Ankeney) writes:
>>
>>     A quick question:  Which (if either) is faster in execution:
>
>According to my (ancient) manuals, for a 32032 ...
>
>    movw foo(r1),bar(r1)		; 19 cycles
>    movw foo[r1:b],bar[r1:b]		; 27 cycles

On a 32532 (or 32gx32), going by the 32gx32 data 'sheet', the timing is:

     movw foo(r1),bar(r1)		; 4 cycles (about 5x fewer than the 32032)
     movw foo[r1:b],bar[r1:b]		; 8 cycles (about 3.4x fewer than the 32032)

The 32532 has hardware for the addressing modes but the following
still add to the overall instruction time:

	Memory relative		3 clocks (bit cisc)
	External (yuk)		8 clocks (very cisc)
	Scaled Indexing		2 clocks (little cisc)

Note: the above times are added for the source and/or the destination.
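
One way to read the numbers above: the 4-cycle movw is the base case, and each operand that uses scaled indexing adds 2 clocks, which reproduces the 8-cycle figure when both source and destination are indexed. The sketch below just encodes that reading; the names are hypothetical and the model is an inference from the figures in this post, not something taken from the data sheet.

```c
#include <assert.h>

enum {
    BASE_MOVW_CYCLES    = 4,    /* movw foo(r1),bar(r1) on the 32532 */
    SCALED_INDEX_CLOCKS = 2     /* added per scaled-index operand */
};

/* Estimated movw cycles given how many operands (0, 1 or 2) use scaled indexing. */
int movw_cycles_32532(int scaled_index_operands)
{
    assert(scaled_index_operands >= 0 && scaled_index_operands <= 2);
    return BASE_MOVW_CYCLES + scaled_index_operands * SCALED_INDEX_CLOCKS;
}
```

With two indexed operands this gives 4 + 2 + 2 = 8, matching the movw foo[r1:b],bar[r1:b] figure.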
-- 
George Scolaro
george@wombat
(try {pyramid|sun|vsi1|killer} !daver!wombat!george)