root@rice.edu (Brazos) (11/07/89)
In article <2598@brazos.Rice.edu> tjeerd@cana.sci.kun.nl writes:
>I'm trying to program a SUN4/260 CXP (GP2+CG5) with microcode version
>4.2.11 (comes with 4.0.3). I'm using the GPSI software and try to push the
>machine to its claimed limits. Drawing 1000 short 3D-vectors 1000 times
>with clearscreen takes me about 40 seconds which amounts to 25K vectors
>per second about 20% of the claimed speed of 150K vectors per sec.

I've seen programs spit out better than 150K vps with earlier versions of
the microcode. Could be something broke since then. Those programs didn't
include screen clear time, but screen clear for the full screen is only
about a tenth of a second.

>My test program consists of a part filling the shared memory with the
>clearscreen code (GP1_PR_ROP_NF), the 1000 1 pixel vectors (so they are
>dots actually) using GP1_XF_LINE_3D and a matrix multiplication
>(GP1_MUL_MAT_3D) and a part that calls gp1_post 1000 times. Couldn't be
>faster I think. Has anyone tried to speed up the GP2? My hunch is that a
>lot of time is spent in the copying from shared memory to the XP local
>memory.

Loading the data from the shared memory takes about four 100 nanosecond
cycles per point. A million points takes less than half a second out of
your 40. The points don't get copied into local memory; they get loaded
directly into floating point registers.

To get the advertised GP2 numbers, you have to be sure to pack many vectors
per GP1_XF_LINE_3D command. If you turn on z-buffering for the vectors,
your mileage WILL differ.

Don't do a gp1_sync on every buffer. Always use gp1_alloc to get fresh
buffers and free them on every end of buffer. Then you never have to wait
for the GP2; you just cycle through the available buffers.

Dan McCoy    {ucbvax,sun}!pixar!mccoy