[comp.unix.aix] Why 256?

bglenden@mandrill.cv.nrao.edu (Brian Glendenning) (10/25/90)

I've just been reading the document "IBM RISC SYSTEM/6000 PERFORMANCE
TUNING FOR NUMERICALLY INTENSIVE FORTRAN AND C PROGRAMS" (hey - would
I invent all those caps!). Very nice.

The only thing I think I don't understand is the tuning summary entry
that says "Make sure leading dimension of arrays is not a multiple of
2 greater than or equal to 256."

I first thought this had to do with cases where you had to stride
through the array in other than the first dimension and the fact that
you would have increased cache misses, but I can't see why the number
"256" is picked on. What is the reasoning behind this statement? And
does this mean that FFTs and convolutions etc are doomed to be "slow"
on these machines?

(I would greatly appreciate an example of a hot FFT routine for the
RS/6000 if anyone has one).

Thanks!

Brian
--
       Brian Glendenning - National Radio Astronomy Observatory
bglenden@nrao.edu          bglenden@nrao.bitnet          (804) 296-0286

tif@doorstop.austin.ibm.com (Paul Chamberlain) (10/25/90)

bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
>that says "Make sure leading dimension of arrays is not a multiple of
>2 greater than or equal to 256."

Don't mistake me for someone who knows what he's talking about, but ...
Would it make sense if this had to do with interleaved memory and hitting
the same "bank" repeatedly on sequential accesses in the second dimension?

Paul Chamberlain | I do NOT represent IBM.     tif@doorstop, sc30661 at ausvm6
512/838-7008     | ...!cs.utexas.edu!ibmaus!auschs!doorstop.austin.ibm.com!tif

sdl@adagio.austin.ibm.com (Stephen Linam) (10/25/90)

In article <BGLENDEN.90Oct24211953@mandrill.cv.nrao.edu>,
bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
|> 
|> I've just been reading the document "IBM RISC SYSTEM/6000 PERFORMANCE
|> TUNING FOR NUMERICALLY INTENSIVE FORTRAN AND C PROGRAMS" (hey - would
|> I invent all those caps!). Very nice.
|> 
|> The only thing I think I don't understand is the tuning summary entry
|> that says "Make sure leading dimension of arrays is not a multiple of
|> 2 greater than or equal to 256."
|> 
|> I first thought this had to do with cases where you had to stride
|> through the array in other than the first dimension and the fact that
|> you would have increased cache misses, but I can't see why the number
|> "256" is picked on. 

Bits 18-24 (I think) of the virtual address control which entry
in the cache is used for a particular address.  If you are accessing
data which is 256, 512, 1024 etc. double words apart, successive
accesses keep landing on the same few entries, so you are reducing the
effective size of the cache and causing more frequent cache misses.
The document you refer to mentions the pathological case, 16k bytes or
2048 double words, where the cache is effectively only 4 lines (512 bytes).
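
The arithmetic can be checked with a rough sketch.  It is not taken from
the tuning document; it assumes a geometry that is only inferred from the
figures in this post: 128-byte lines (4 lines = 512 bytes), 128 cache
entries (7 index bits, 18-24), 4 lines per entry, and 8-byte double
words.  The function name effective_cache_bytes is made up for
illustration.

```python
# Assumed cache geometry, inferred from the post rather than from any manual:
LINE_BYTES = 128        # "4 lines (512 bytes)" implies 128-byte lines
NUM_SETS = 128          # 7 index bits (18-24) select one of 2**7 entries
WAYS = 4                # assumed: 4 lines share each entry
DOUBLEWORD_BYTES = 8


def effective_cache_bytes(stride_doublewords):
    """Bytes of cache reachable when stepping through memory at a fixed stride."""
    stride_bytes = stride_doublewords * DOUBLEWORD_BYTES
    sets_touched = set()
    # The entry index depends only on the address modulo NUM_SETS * LINE_BYTES,
    # so that many accesses cover one full period of the pattern.
    for k in range(NUM_SETS * LINE_BYTES):
        address = k * stride_bytes
        sets_touched.add((address // LINE_BYTES) % NUM_SETS)
    return len(sets_touched) * WAYS * LINE_BYTES


if __name__ == "__main__":
    for stride in (255, 256, 257, 512, 1024, 2048):
        print(stride, effective_cache_bytes(stride))
```

Under these assumptions a stride of 2048 double words leaves 512 bytes,
which matches the pathological case above, and a stride of 256 leaves
4096 bytes.  A stride of 257 reaches all 65536 bytes, which would explain
the usual advice to pad the leading dimension by one.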

--------------------------------------------------------------------
Stephen Linam   AWD Austin   T/L: 793-3674  Bell-net: (512) 832-3674
IBM Internet: sdl@adagio.austin.ibm.com        VNET: LINAM at AUSTIN
UUCP:  ...!cs.utexas.edu:ibmchs!auschs!adagio.austin.ibm.com!sdl