bglenden@mandrill.cv.nrao.edu (Brian Glendenning) (10/25/90)
I've just been reading the document "IBM RISC SYSTEM/6000 PERFORMANCE TUNING FOR NUMERICALLY INTENSIVE FORTRAN AND C PROGRAMS" (hey - would I invent all those caps!). Very nice.

The only thing I think I don't understand is the tuning summary entry that says "Make sure leading dimension of arrays is not a multiple of 2 greater than or equal to 256."

I first thought this had to do with cases where you had to stride through the array in other than the first dimension and the fact that you would have increased cache misses, but I can't see why the number "256" is picked on.

What is the reasoning behind this statement? And does this mean that FFTs and convolutions etc are doomed to be "slow" on these machines? (I would greatly appreciate an example of a hot FFT routine for the RS/6000 if anyone has one).

Thanks!

Brian
--
Brian Glendenning - National Radio Astronomy Observatory
bglenden@nrao.edu bglenden@nrao.bitnet (804) 296-0286
tif@doorstop.austin.ibm.com (Paul Chamberlain) (10/25/90)
bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
>that says "Make sure leading dimension of arrays is not a multiple of
>2 greater than or equal to 256."

Don't mistake me for someone who knows what he's talking about, but ...

Would it make sense if this had to do with interleaved memory and hitting the same "bank" repeatedly on sequential accesses in the second dimension?

Paul Chamberlain | I do NOT represent IBM. tif@doorstop, sc30661 at ausvm6
512/838-7008 | ...!cs.utexas.edu!ibmaus!auschs!doorstop.austin.ibm.com!tif
sdl@adagio.austin.ibm.com (Stephen Linam) (10/25/90)
In article <BGLENDEN.90Oct24211953@mandrill.cv.nrao.edu>, bglenden@mandrill.cv.nrao.edu (Brian Glendenning) writes:
|>
|> I've just been reading the document "IBM RISC SYSTEM/6000 PERFORMANCE
|> TUNING FOR NUMERICALLY INTENSIVE FORTRAN AND C PROGRAMS" (hey - would
|> I invent all those caps!). Very nice.
|>
|> The only thing I think I don't understand is the tuning summary entry
|> that says "Make sure leading dimension of arrays is not a multiple of
|> 2 greater than or equal to 256."
|>
|> I first thought this had to do with cases where you had to stride
|> through the array in other than the first dimension and the fact that
|> you would have increased cache misses, but I can't see why the number
|> "256" is picked on.

Bits 18-24 (I think) of the virtual address control which entry in the cache is used for a particular address. If you are accessing data which is 256, 512, 1024 etc. double words apart, successive accesses keep landing on the same few cache entries, so you are reducing the effective size of the cache and causing more frequent cache misses.

The document you refer to mentions the pathological case, 16k bytes or 2048 double words, where the cache is effectively only 4 lines (512 bytes).

--------------------------------------------------------------------
Stephen Linam AWD Austin T/L: 793-3674 Bell-net: (512) 832-3674
IBM Internet: sdl@adagio.austin.ibm.com VNET: LINAM at AUSTIN
UUCP: ...!cs.utexas.edu:ibmchs!auschs!adagio.austin.ibm.com!sdl
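
The set-index arithmetic described in the post above can be sketched roughly as follows. The cache geometry here is an assumption inferred from the figures in the post, not taken from IBM documentation: 7 index bits imply 128 sets, and "4 lines (512 bytes)" implies 4-way associativity with 128-byte lines. The function names are hypothetical, and the model ignores virtual-to-real translation and replacement policy.

```python
# Assumed geometry, inferred from the post (not from an IBM manual):
LINE_BYTES = 128    # "4 lines (512 bytes)" -> 128 bytes per line
NUM_SETS = 128      # 7 index bits (bits 18-24) -> 2**7 sets
WAYS = 4            # 4 lines remain usable in the pathological case
DOUBLE_BYTES = 8    # one double word


def set_index(addr):
    """Cache set selected by a byte address under the assumed geometry."""
    return (addr // LINE_BYTES) % NUM_SETS


def effective_cache_bytes(stride_doubles, accesses=4096):
    """Cache capacity actually reachable when stepping through memory
    with a fixed stride, measured in double words."""
    stride = stride_doubles * DOUBLE_BYTES
    sets_touched = {set_index(i * stride) for i in range(accesses)}
    return len(sets_touched) * WAYS * LINE_BYTES


if __name__ == "__main__":
    for stride in (1, 256, 257, 512, 1024, 2048):
        print(stride, effective_cache_bytes(stride))
```

Under this model a stride of 2048 double words reaches a single set (512 bytes), matching the pathological case quoted from the tuning guide, while padding a leading dimension of 256 to 257 spreads the accesses over every set again.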