zazula@uazhe0.physics.arizona.edu (RALPH ZAZULA) (04/03/91)
I posted a short while back about the "slowness" of the DSPAPmtm
function. I received a reply (I forget from whom, sorry) that mentioned
that, when you call the DSPAPxxx routines, the DSP code gets loaded
*every* time, even if it is the same call over and over again.

I looked around and found the program:

    /NextDeveloper/Examples/DSP/ArrayProcessing/myAP

and it seemed that it had exactly what I wanted. That is, load the DSP
code once and call it repeatedly. So, I modified
.../DSP/ArrayProcessing/matrix/matrix.c to use the myAP code (i.e.,
replaced myAPvasl with myAPmtm and got rid of myAPvnot) and ran it
again. The results for 5000 iterations of DSP vs. 030 are:

    [27](bonehead)/users/zazula/Apps/myAP> time mytest
    DSP time: 59
    030 time: 1
    expected:
     0.040000 -0.020000
     0.060000  0.160000
    received:
     0.040000 -0.020000
     0.060000  0.160000

    mtm succeeded
    10.1u 15.9s 1:01 42% 0+0k 4+7io 0pf+0w

I used time(3) to get the times, so the resolution was only seconds.
Monitor tells me that the DSP portion of the code runs mostly in system
mode (if that means anything...).

Does anyone have an idea what is going on here? Is it just that the
myAP code is inefficient too? Is there a right way to load DSP code and
then call it repeatedly?

I'll include the modified matrix.c (really mytest.c) code at the end to
answer any questions about exactly how the above numbers were arrived
at. Also, I didn't use the -O option when compiling mytest.c (I had at
first thought that was why the 030 code ran faster: that maybe the test
loop was optimized down to one iteration).

Thanks

Ralph Zazula

|----------------------------------------------------------------------|
| Ralph Zazula                                      "Computer Addict!" |
| University of Arizona --- Department of Physics                      |
| UAZHEP::ZAZULA (DecNet/HEPNet)                                       |
| zazula@uazhe0.physics.arizona.edu (Internet)                         |
|----------------------------------------------------------------------|
| "You can twist perceptions, reality won't budge."       - Neil Peart |
|----------------------------------------------------------------------|

------------------ mytest.c (matrix.c using myAP) -----------------
/*
 * matrix.c
 * Test the matrix multiplication array processing DSP macro.
 * This code downloads two matrices to the DSP then calls the
 * C function DSPAPmtm(). This function contains the wrapped DSP array
 * processing macro mtm, and causes the DSP code to be downloaded
 * and executed. The resultant matrix is read back and verified
 * against the correct result.
 */
//#include <dsp/arrayproc.h>  /* include the array processing header */
#include "myAP.h"
#include "myAPmtm.h"
#include <stdio.h>            /* printf(), fprintf() */
#include <stdlib.h>           /* exit() */
#include <math.h>             /* needed only for fabs() */
#include <sys/time.h>
#include <sys/types.h>

#ifndef TRUE
#define TRUE 1
#endif
#ifndef FALSE
#define FALSE 0
#endif

#define B_ROWS 3        /* number of rows in b */
#define B_COLS 2        /* number of columns in b */
#define A_ROWS 2        /* number of rows in a */
#define A_COLS B_ROWS   /* number of columns in a, must == B_ROWS */
#define C_ROWS A_ROWS   /* number of rows in c */
#define C_COLS B_COLS   /* number of columns in c */
#define A_SIZE (A_ROWS * A_COLS)   /* number of elements in a */
#define B_SIZE (B_ROWS * B_COLS)   /* number of elements in b */
#define C_SIZE (C_ROWS * C_COLS)   /* number of elements in c */
#define A_IN  myAPGetLowestAddress()   /* a address */
#define B_IN  (A_IN + A_SIZE)          /* b address */
#define C_OUT (B_IN + B_SIZE)          /* c address */

/* Compare two floats to within an absolute tolerance of 1e-6 */
#define feq(a,b) (fabs((a)-(b))<.000001)

int main()
{
    float a[A_ROWS][A_COLS];   /* input matrix a */
    float b[B_ROWS][B_COLS];   /* input matrix b */
    float c[C_ROWS][C_COLS];   /* output matrix c */
    float d[C_ROWS][C_COLS];   /* correct result matrix d */
    int failed = FALSE;
    int i, j, k, l;
    long start,end;

    /* load some values into input arrays */
    a[0][0] = 0.1;  a[0][1] = 0.2;  a[0][2] = -0.1;
    a[1][0] = 0.3;  a[1][1] = 0.1;  a[1][2] = 0.4;
    b[0][0] = -0.2; b[0][1] = 0.5;
    b[1][0] = 0.4;  b[1][1] = -0.3;
    b[2][0] = 0.2;  b[2][1] = 0.1;

    DSPSetErrorFP(stderr);
    DSPEnableErrorLog();
    myAPInit();   /* initialize the DSP for array processing */

    /* put input arrays to the DSP */
    DSPWriteFloatArray((float *)a, DSP_MS_X, A_IN, 1, A_SIZE);
    DSPWriteFloatArray((float *)b, DSP_MS_X, B_IN, 1, B_SIZE);

    /* call the C interface function */
    // load the DSP program
    myAPmtm(A_IN, B_IN, C_OUT, B_ROWS, B_COLS, A_ROWS);

    /* time 5000 runs of the already-loaded DSP program */
    start = time(0);
    for(l=0; l<5000; l++){
        // printf("%u\n",l);
        myAPGo();
        if(myAPAwaitNotBusy(100)) {
            fprintf(stderr,"AP program is hung!\n");
            exit(1);
        }
    }
    printf("DSP time: %ld\n",(long)(time(0) - start));

    /* get output array from the DSP */
    DSPReadFloatArray((float *)c, DSP_MS_X, C_OUT, 1, C_SIZE);
    myAPFree();   /* free the DSP */

    /* compute correct result into d */
    start = time(0);
    for(l=0; l<5000; l++){
        // printf("%u\n",l);
        for (i = 0; i < A_ROWS; i++)
            for (j = 0; j < B_COLS; j++) {
                d[i][j] = 0;
                for (k = 0; k < A_COLS; k++)
                    d[i][j] += a[i][k] * b[k][j];
            }
    }
    printf("030 time: %ld\n",(long)(time(0) - start));

    /* display and compare computed result with DSP result */
    printf("expected:\n");
    for (i = 0; i < C_ROWS; i++) {
        for (j = 0; j < C_COLS; j++) {
            printf("%9f ", d[i][j]);
            if (!feq(c[i][j], d[i][j]))
                failed = TRUE;
        }
        printf("\n");
    }
    printf("received:\n");
    for (i = 0; i < C_ROWS; i++) {
        for (j = 0; j < C_COLS; j++)
            printf("%9f ", c[i][j]);
        printf("\n");
    }
    if (failed)
        printf("\n*** mtm FAILED! ***\n");
    else
        printf("\nmtm succeeded\n");
    return 0;
}