[net.micro.amiga] Who is stealing my cycles?

rokicki@navajo.STANFORD.EDU (Tomas Rokicki) (09/26/86)

[   / | \    cruisin' on down . . . ]

Well, folks, I have some bad news.  I wrote a test routine
to see just exactly how many CPU cycles I was getting from
my Amiga.  This routine was essentially nested dbra's:

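loop(i,j)
int i,j;
{
#asm
	move.w	8(a5),d1	; d1 = i, outer count
	move.w	10(a5),d0	; d0 = j, inner count
loop:	dbra	d0,loop		; spin until d0 wraps to -1
	dbra	d1,loop		; then rerun the inner loop, i more times
#endasm
}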

So, I pass it the appropriate parameters to consume 2,000,000,000
cycles, and time it.  This was with no other programs running, no
workbench loaded, and just a standard CLI open.  I even subtracted
the overhead.  I tried it in chip and in fast memory; I tried it
with my external memory disconnected; I tried it with the workbench
screen moved all the way down so it wasn't visible.  Still,
between 8 and 17 percent of the CPU disappeared somewhere.
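
For anyone who wants to check the arithmetic, here is a rough sketch of how the expected cycle count of the two dbra's could be estimated. It assumes the usual 68000 timings for dbra (10 clocks when the branch is taken, 14 when the counter runs out), 16-bit ints, no wait states, and it ignores the two move.w's and the call overhead. The name dbra_cycles is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: estimated 68000 clocks spent in the two dbra
 * instructions of loop(i, j), assuming no wait states.
 * Assumed timings: dbra taken = 10 clocks, dbra falling through = 14. */
uint64_t dbra_cycles(uint16_t i, uint16_t j)
{
    const uint64_t taken = 10, expired = 14;

    /* First inner pass: d0 starts at j, so j taken branches + 1 fall-through. */
    uint64_t first_pass = (uint64_t)j * taken + expired;

    /* Later inner passes: d0 re-enters as 0xFFFF, so 65535 taken + 1 fall-through. */
    uint64_t later_pass = 65535u * taken + expired;

    /* Outer dbra: taken i times, falls through once. */
    uint64_t outer = (uint64_t)i * taken + expired;

    return first_pass + (uint64_t)i * later_pass + outer;
}
```

Under these assumptions each extra outer iteration costs 655,374 clocks, so an outer count on the order of 3,050 comes to about 2,000,000,000 clocks.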

So it's not that much, but I'm interested in what the machine could
possibly be doing that eats the CPU.

I have an explanation for the 17% figure: when run out of chip
memory, the processor gets every other 3.6 MHz RAM slot.  Well,
running at 7.2 MHz, occasionally it wants the odd slot (a dbra
does two fetches but requires ten cycles, so it is guilty of
this) so the processor must wait two cycles before it gets the
bus.  But out of fast memory this shouldn't happen . . .
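
The chip memory explanation can be put into numbers with a very simplified model. This is a sketch under my own assumptions, not measured behaviour: the CPU is taken to get a bus slot only every 4 clocks, so an instruction whose length is not a multiple of 4 clocks gets stretched up to the next multiple. The function names are made up.

```c
/* Simplified model (an assumption, not measured behaviour): in chip memory
 * the CPU can start a bus access only on every 4th clock, so an instruction
 * whose length is not a multiple of 4 clocks is stretched to the next one. */
unsigned stretched_clocks(unsigned clocks)
{
    return (clocks + 3u) / 4u * 4u;
}

/* Share of CPU time lost to the stretching, in percent. */
double percent_lost(unsigned clocks)
{
    unsigned s = stretched_clocks(clocks);
    return 100.0 * (double)(s - clocks) / (double)s;
}
```

For a 10-clock dbra this gives 12 clocks and a loss of about 16.7%, which is where a figure near 17% would come from.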

What it comes down to is this: the fastest effective clock speed
I have been able to get is 6.6MHz.  Not to bitch, but I wouldn't
be upset to get more . . .
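
The effective clock figure is just cycles of work divided by elapsed time. A minimal sketch of that calculation follows; the function names are hypothetical, and the nominal clock you compare against (about 7.16 MHz on an NTSC machine, rounded to 7.2 above) is an assumption to fill in yourself.

```c
/* Effective clock in MHz: clocks of useful work divided by elapsed seconds. */
double effective_mhz(double cycles, double seconds)
{
    return cycles / seconds / 1.0e6;
}

/* Shortfall against the nominal clock, in percent. */
double percent_missing(double nominal_mhz, double measured_mhz)
{
    return 100.0 * (1.0 - measured_mhz / nominal_mhz);
}
```

Plugging in 6.6 MHz against a nominal 7.16 MHz gives a shortfall of about 7.8%, in line with the low end of the 8 to 17 percent range.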

-tom

grr@cbmvax.cbm.UUCP (George Robbins) (09/27/86)

In article <858@navajo.STANFORD.EDU> rokicki@navajo.STANFORD.EDU (Tomas Rokicki) writes:
>
>Well, folks, I have some bad news.  I wrote a test routine
>to see just exactly how many CPU cycles I was getting from
>my Amiga.  This routine was essentially nested dbra's:
>
...
>screen moved all the way down so it wasn't visible.  Still,
>between 8 and 17 percent of the CPU disappeared somewhere.
>
>So it's not that much, but I'm interested in what the machine could
>be possibly doing that eats the CPU.
>
>I have an explanation for the 17% figure; when run out of chip
>memory, the processor gets every other 3.6 MHz RAM slot.  Well,
>running at 7.2 MHz, occasionally it wants the odd slot (a dbra
>does two fetches but requires ten cycles, so it is guilty of
>this) so the processor must wait two cycles before it gets the
>bus.  But out of fast memory this shouldn't happen . . .
>
>What it comes down to is this; the fastest effective clock speed
>I have been able to get is 6.6MHz.  Not to bitch, but I wouldn't
>be upset to get more . . .
>
>-tom

First guess:  Did you account for wasted pre-fetch cycles on the branches?

-- 
George Robbins - now working for,	uucp: {ihnp4|seismo|caip}!cbmvax!grr
but no way officially representing	arpa: cbmvax!grr@seismo.css.GOV
Commodore, Engineering Department	fone: 215-431-9255 (only by moonlite)

daveb@cbmvax.cbm.UUCP (Dave Berezowski) (09/28/86)

In article <858@navajo.STANFORD.EDU> rokicki@navajo.STANFORD.EDU (Tomas Rokicki) writes:
>[   / | \    cruisin' on down . . . ]
>
>Well, folks, I have some bad news.  I wrote a test routine
>to see just exactly how many CPU cycles I was getting from
>my Amiga.  This routine was essentially nested dbra's:
>
>loop(i,j)
>int i,j
>{
>#asm
>	move.w	8(a5),d1
>	move.w	10(a5),d0
>loop:	dbra	d0,loop
>	dbra	d1,loop
>#endasm
>}
>
>So, I pass it the appropriate parameters to consume 2,000,000,000
>cycles, and time it.  This was with no other programs running, no
>workbench loaded, and just a standard CLI open.  I even subtracted
>the overhead.  I tried it in chip and in fast memory; I tried it
>with my external memory disconnected; I tried it with the workbench
>screen moved all the way down so it wasn't visible.  Still,
>between 8 and 17 percent of the CPU disappeared somewhere.
>
>So it's not that much, but I'm interested in what the machine could
>be possibly doing that eats the CPU.
>
>I have an explanation for the 17% figure; when run out of chip
>memory, the processor gets every other 3.6 MHz RAM slot.  Well,
>running at 7.2 MHz, occasionally it wants the odd slot (a dbra
>does two fetches but requires ten cycles, so it is guilty of
>this) so the processor must wait two cycles before it gets the
>bus.  But out of fast memory this shouldn't happen . . .
>
>What it comes down to is this; the fastest effective clock speed
>I have been able to get is 6.6MHz.  Not to bitch, but I wouldn't
>be upset to get more . . .
>
>-tom


	During your test did you disable/enable multitasking with
Forbid()/Permit() or disable/enable interrupts with Disable()/Enable()?

	If not, this could account for some of the overhead.
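
	Something along these lines is what I mean. It is only a sketch: HAVE_AMIGA_EXEC is a made-up build flag, run_undisturbed() and timed_work() are hypothetical names, and the empty stand-ins exist only so the code compiles off the Amiga. Forbid() and Permit() themselves are the real exec.library calls.

```c
#ifdef HAVE_AMIGA_EXEC            /* hypothetical flag: define it when building on an Amiga */
#include <proto/exec.h>           /* Forbid(), Permit() from exec.library */
#else
/* Host-side stand-ins so the sketch compiles elsewhere; they do nothing. */
static void Forbid(void) {}
static void Permit(void) {}
#endif

unsigned long passes = 0;

/* Stand-in for the timed loop(i, j) call. */
void timed_work(void)
{
    passes++;
}

/* Hypothetical wrapper: run the timed routine with task switching off.
 * Interrupts still run; Disable()/Enable() would be needed to stop those. */
void run_undisturbed(void (*work)(void))
{
    Forbid();
    work();
    Permit();
}
```

	Calling run_undisturbed(timed_work) brackets the measurement. Holding off interrupts with Disable() for a run this long is generally discouraged, so Forbid()/Permit() is the gentler first experiment.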

daveh@cbmvax.cbm.UUCP (Dave Haynie) (09/29/86)

> 
> [   / | \    cruisin' on down . . . ]
> 
> Well, folks, I have some bad news.  ...

> I have been able to get is 6.6MHz.  Not to bitch, but I wouldn't
> be upset to get more . . .
> 
> -tom

Just a thought... did you disable multitasking and interrupts during your
test program?  Any task running at the same or better priority is going
to want its share of the CPU time.

-- 
============================================================================
Dave Haynie    {caip,ihnp4,allegra,seismo}!cbmvax!daveh

	These opinions are my own, though if you try them out, and decide
	that you really like them, a small donation would be appreciated.