[net.micro.6809] Don't use C "register"

knudsen@ihwpt.UUCP (mike knudsen) (11/30/85)

The dhrystone benchmarks just posted
have re-discovered what I've suspected all along:
namely "registers" do no good, and may do harm, in MW C
for the 6809!  Reason: the U Reg is simply substituted for a
RAM location, so the LD's and STD's become TFR's.
TFR is the slowest instruction in the 6809 set.  It is quicker
to move to/from RAM (in almost any addressing mode)
than to TFR between two regs.
If you declare an "int *" as "register", MW C still
TFRs it to X to do the indexing
(and the incrementing, if needed),
then TFRs it back to U.   AAAgh!
No ethnic jokes, please--
but you've been warned!
You want speed -- declare everything "direct extern"
except arrays!
     mike k
PS: My own "sieve" tests last year made me suspicious enough
to look at the assembly code.
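
A minimal sketch of the two declaration styles compared above, for anyone who wants to repeat the experiment. The function names, the array size, and the `USE_MW_DIRECT` switch are hypothetical and not taken from the original benchmark. `direct` is a Microware C storage class; the macro defines it away on other compilers so the file still builds. Check the exact spelling of "direct extern" declarations against the MW C manual.

```c
#define N 100            /* illustrative array size */

#ifndef USE_MW_DIRECT    /* hypothetical switch: define it only under MW C */
#define direct           /* other compilers: make `direct` vanish */
#endif

int data[N];             /* arrays stay ordinary, per the advice above */

/* Style 1: loop pointer and accumulator declared "register". */
int sum_register(void)
{
    register int *p;
    register int sum = 0;

    for (p = data; p < data + N; p++)
        sum += *p;
    return sum;
}

/* Style 2: the same variables as direct-page globals. */
direct int *gp;
direct int gsum;

int sum_direct(void)
{
    gsum = 0;
    for (gp = data; gp < data + N; gp++)
        gsum += *gp;
    return gsum;
}
```

Both functions compute the same sum, so any timing difference comes from how the compiler handles the declarations. Comparing the generated assembly for the two loops should show whether the register version picks up the extra TFRs.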

goldman@ucsfcca.UUCP (Eric Goldman) (12/02/85)

[]
In article <593@ihwpt.UUCP> knudsen@ihwpt.UUCP (mike knudsen) writes:
>
>The dhrystone benchmarks just posted
>have re-discovered what I've suspected all along:
>namely "registers" do no good, and may do harm, in MW C
>for the 6809!  ....

Oops!  My fault.  I thought I had carefully proofread the "header line"
(i.e., the line containing, "The output from... WITH...") for each result in
my posting, but I reversed them.  Tonight, I reran the benchmarks with and
without registers, just to be certain.  The results, which I am re-posting,
are, indeed, the direct output from the programs; the header lines are now
correct:

	The output from the dhrystone benchmark WITHOUT registers:

	Dhrystone time for 25000 passes = 238
	This machine benchmarks at 105 dhrystones/second
	----------------------------------------------------------
	The output from the dhrystone benchmark WITH registers:

	Dhrystone time for 25000 passes = 234
	This machine benchmarks at 106 dhrystones/second

Admittedly, the results still support your statement that registers do not
seem to do much good using OS-9 MW C for the CoCo, at least in this benchmark.
But my apologies for the erroneous posting.
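
The arithmetic behind the two results can be sketched as follows. This assumes the reported time is in seconds and that the rate is truncated by integer division, which matches the figures above. The function names are illustrative.

```c
/* Dhrystones per second: passes divided by elapsed seconds,
   truncated by integer division. */
long dhrystones_per_second(long passes, long seconds)
{
    return passes / seconds;
}

/* Time saved by the faster run, in thousandths of the slower run's time. */
long saving_per_mille(long slow, long fast)
{
    return (slow - fast) * 1000L / slow;
}
```

With the posted numbers, `dhrystones_per_second(25000L, 238L)` is 105 and `dhrystones_per_second(25000L, 234L)` is 106. `saving_per_mille(238L, 234L)` is 16, so the register version finishes a little under 2 percent sooner.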

--Eric S. Goldman, M.D.
  UCSF School of Medicine
  ARPA: cope.ucsf!goldman@ucsf-cgl.ARPA
  UUCP: ucbvax!ucsfmis!cope.ucsf!goldman

don@gitpyr.UUCP (Don Deal) (12/03/85)

In article <593@ihwpt.UUCP> knudsen@ihwpt.UUCP (mike knudsen) writes:
> ...  Reason: the U Reg is simply substituted for a
>RAM location, so the LD's and STD's become TFR's.
>TFR is the slowest instruction in the 6809 set.  It is quicker
>to move to/from RAM (in almost any addressing mode)
>than to TFR between two regs.

  How do you figure?  The 'tfr' instruction takes 6 machine cycles, and
a load or store using direct addressing averages 5 cycles.  Using even
the simplest addressing mode, direct, is going to cost you on the average
of 4 cycles more than 'tfr', and the other addressing modes (inherent
excluded) are even more expensive.

  The 'cwai', 'swi2', and 'swi3' instructions are tied for being the
slowest on the 6809; they all consume 20 cycles.
-- 
D.L. Deal, Office of Computing Services, Georgia Tech, Atlanta GA, 30332-0275
Phone: (404) 894-6160 (office) 894-4669 (messages) / BITNET: cc100dd@gitvm1
uucp: ...!{akgua,allegra,amd,hplabs,ihnp4,masscomp,ut-ngp}!gatech!gitpyr!don

knudsen@ihwpt.UUCP (mike knudsen) (12/04/85)

>   How do you figure?  The 'tfr' instruction takes 6 machine cycles, and
> a load or store using direct addressing averages 5 cycles.  Using even
> the simplest addressing mode, direct, is going to cost you on the average
> of 4 cycles more than 'tfr', and the other addressing modes (inherent
> excluded) are even more expensive.

Okay, I admit to ambiguity in my posting, which was done under adverse
conditions.  I meant the ROUND-TRIP timing: it is STILL faster to
LDX from RAM, twiddle it, and STX back to RAM,
than to TFR U,X, twiddle, and TFR X,U.

Also, TFR takes 8 cycles for double-byte regs; 6 cycles is for A and B.
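
The round-trip comparison can be worked through with the cycle counts quoted in this thread. This is a sketch only: the figures are the posters' own, they disagree about TFR, and none of them has been checked against a 6809 data sheet here. The names are illustrative.

```c
/* Cycle figures as quoted by the posters in this thread. */
enum {
    DIRECT_LOAD_STORE = 5,  /* load or store, direct addressing (Deal) */
    TFR_DEAL          = 6,  /* TFR, per Deal */
    TFR_KNUDSEN       = 8   /* TFR of 16-bit registers, per Knudsen */
};

/* Load from RAM, then store back to RAM. */
int round_trip_ram(void)
{
    return 2 * DIRECT_LOAD_STORE;
}

/* TFR U,X then TFR X,U, for a given per-TFR cycle count. */
int round_trip_tfr(int tfr_cycles)
{
    return 2 * tfr_cycles;
}
```

On these figures the RAM round trip costs 10 cycles, against 12 with Deal's TFR count or 16 with Knudsen's. The "twiddle" in between costs the same either way, so it drops out of the comparison.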

>   The 'cwai', 'swi2', and 'swi3' instructions are tied for being the
> slowest on the 6809; they all consume 20 cycles.

Of course you're right.  I meant "slowest of normally used 
instructions."  CWAI is used to go to sleep, so doesn't really count.
I'm afraid that under OS-9, SWIs are pretty "normal" too.

I still say: Don't use TFR if you can help it,
and don't declare register in MW C.
	mike k