tff@na.toronto.edu (Tom Fairgrieve) (07/13/90)
Does SGI have an optimized version of the BLAS (Basic Linear Algebra
Subroutines) available for the 4d/240?  If so, how does the performance of
this version compare to a version produced by the f77 compiler with the
-O3 optimization level set?  I'm interested in all 3 levels of the BLAS.

Thanks for any information,
Tom Fairgrieve
tff@na.utoronto.ca
jpp@pipo.corp.sgi.com (Jean-Pierre Panziera) (07/14/90)
In article <90Jul13.100737edt.8304@ephemeral.ai.toronto.edu>, tff@na.toronto.edu (Tom Fairgrieve) writes:
> From: tff@na.toronto.edu (Tom Fairgrieve)
> Subject: Basic Linear Algebra Subroutines (BLAS)
> Date: 13 Jul 90 14:08:02 GMT
> Organization: Department of Computer Science, University of Toronto
>
> Does SGI have an optimized version of the BLAS (Basic Linear Algebra
> Subroutines) available for the 4d/240? If so, how does the performance
> of this version compare to a version produced by the f77 compiler with
> -O3 optimization level set? I'm interested in all 3 levels of the BLAS.
>
> Thanks for any information,
> Tom Fairgrieve
> tff@na.utoronto.ca

As far as I know, SGI does not have an official version of BLAS3 (I may be
wrong).  However, I have optimized/parallelized a Fortran version of the
matrix multiplication routines of BLAS3, and I get pretty good results on
a 220-GTX:

	dgemm	 5-11 Mflops
	zgemm	10-14 Mflops
	sgemm	10-16 Mflops
	cgemm	12-17 Mflops

The lowest performance is for A * trans(B), the highest for trans(A) * B.
I am sure it can be improved, and I do not warrant that it is bug free.
bron@bronze.wpd.sgi.com (Bron Campbell Nelson) (07/17/90)
In article <90Jul13.100737edt.8304@ephemeral.ai.toronto.edu>, tff@na.toronto.edu (Tom Fairgrieve) writes:
> Does SGI have an optimized version of the BLAS (Basic Linear Algebra
> Subroutines) available for the 4d/240? If so, how does the performance
> of this version compare to a version produced by the f77 compiler with
> -O3 optimization level set? I'm interested in all 3 levels of the BLAS.

As far as I know, SGI does not have versions of the BLAS libraries.
However, Kuck and Associates, Inc. (KAI) in Illinois does sell math
libraries that are tuned to run on SGI multiprocessors.  If I remember
correctly (always a dangerous assumption), one customer was able to hit
over 50 MFLOPS on an 8-cpu machine using the KAI software.  My *personal*
opinion is that the KAI library is very good and very fast.

Contact KAI directly for more info.  I believe Debbie Carr is still their
marketing person: try dcarr@kai.com

Standard disclaimer: This is provided for information only.  Neither I nor
SGI make any warranties, either express or implied.  And so on blah blah
blah etc. etc.
--
Bron Campbell Nelson     bron@sgi.com  or possibly ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.