[comp.sys.sgi] The Future

dmlaur@phoenix.Princeton.EDU (David M. Laur) (02/23/91)

--------- long ago, Kurt Akeley of SGI said ----------
>From: kurt@cashew.asd.sgi.com (Kurt Akeley)
>Subject: Re: Efficient use of lighting models
>Date: 29 Oct 90 20:43:05 GMT
>
>Here's the facts, make of them what you will.  Vertex data are transferred
>to GTX and VGX graphics systems using special 3-way operations (see other
>SGI publications for explanation).  A 3-way transfer takes 10 bus clocks to
>complete if its data are quad-word aligned, 14 bus clocks otherwise.  Since
>the bus clock is always 16 MHz, this translates into 1.6 million aligned
>transfers per second, and 1.15 million unaligned transfers per second.
>If both transform and fill limits support a call rate that is greater than
>1.15 million calls per second (counting each c, n, v, and t call) then
>quad-word alignment will improve performance.  This situation is common on
>well tuned VGX code, somewhat less common on GTX code.
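
The arithmetic behind those two rates can be checked with a few lines of C. This is only a sketch: the function name is made up for illustration, and the 16 MHz clock and the 10/14 clock counts are taken straight from the quoted post. Note that 16 MHz / 14 works out to roughly 1.14 million per second, which the post rounds to 1.15 million.

```c
#include <assert.h>

/* Hypothetical helper: transfers per second on a bus running at bus_hz
 * when each 3-way transfer occupies clocks_per_transfer bus clocks. */
static double transfers_per_second(double bus_hz, int clocks_per_transfer)
{
    assert(clocks_per_transfer > 0);
    return bus_hz / clocks_per_transfer;
}

/* Figures from the quoted post: 16 MHz bus, 10 clocks for an aligned
 * transfer, 14 clocks for an unaligned one. */
static double aligned_rate(void)   { return transfers_per_second(16.0e6, 10); }
static double unaligned_rate(void) { return transfers_per_second(16.0e6, 14); }
```

So a program that is otherwise able to issue more than about 1.14 million c/n/v/t calls per second is the case where alignment starts to matter.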

So, with respect to allocating "quad-word aligned" space for
floating point vertex data, and keeping in mind some of the issues
recently discussed in comp.arch (viz. the number of bits in
an int/long/float/pointer):

Is there a way to (upward) portably allocate space for four-dimensional
(e.g. xyzw or xyz,pad) floating point vertices, that will keep performance
on the fast path?

That is: does "quad-word" also mean "quad-float"?  Will it continue
to mean that if a 64-bit chip comes along?  Or does "quad-word"
mean "4*sizeof(int)", and therefore only happen to affect performance
of floats because sizeof(float)==sizeof(int) on the SGI 4Ds today?

I realize the -lmpc version of malloc will give a "quad-word" aligned
address (whatever that really means).  But it's not clear to me if
that library should (or can) be used on single-processor systems.
More importantly, I'm just trying to understand what aspect of
alignment I should really be concerned with when developing data
structures.
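
To make the question concrete, here is a minimal sketch of one way to get aligned vertex storage from plain malloc, without relying on any particular library. Everything in it is an assumption for illustration: it takes "quad-word" to mean 16 bytes (four 32-bit words), which is exactly the point in question, and the names VERTEX_ALIGN, Vertex4f, vertex_alloc, vertex_free and vertex_stride_is_aligned are made up. It also uses uintptr_t from <stdint.h>, which needs a newer C compiler than the ones shipping today.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* ASSUMPTION: "quad-word" means 16 bytes. Change this one constant if
 * the hardware turns out to mean something else. */
#define VERTEX_ALIGN ((size_t)16)

/* Hypothetical vertex type: xyzw, or xyz plus one float of padding. */
typedef struct { float x, y, z, w; } Vertex4f;

/* Nonzero if consecutive array elements all stay aligned, i.e. the
 * struct size is a multiple of the alignment. This stops being true
 * if sizeof(float) changes and the struct is no longer 16 bytes. */
int vertex_stride_is_aligned(void)
{
    return sizeof(Vertex4f) % VERTEX_ALIGN == 0;
}

/* Over-allocate, round the address up to a VERTEX_ALIGN boundary, and
 * stash the pointer malloc returned just below the aligned block. */
Vertex4f *vertex_alloc(size_t count)
{
    size_t extra = VERTEX_ALIGN - 1 + sizeof(void *);
    unsigned char *raw;
    uintptr_t start, aligned;

    if (count > (SIZE_MAX - extra) / sizeof(Vertex4f))
        return NULL;                      /* size would overflow */
    raw = malloc(count * sizeof(Vertex4f) + extra);
    if (raw == NULL)
        return NULL;

    start = (uintptr_t)(raw + sizeof(void *));
    aligned = (start + VERTEX_ALIGN - 1) & ~(uintptr_t)(VERTEX_ALIGN - 1);
    ((void **)aligned)[-1] = raw;         /* remembered for vertex_free */
    return (Vertex4f *)aligned;
}

/* Release a block obtained from vertex_alloc; NULL is accepted. */
void vertex_free(Vertex4f *v)
{
    if (v != NULL)
        free(((void **)v)[-1]);
}
```

With this, the first vertex starts on a 16-byte boundary, and every later vertex does too as long as vertex_stride_is_aligned() holds. Whether 16 bytes is the boundary the hardware actually cares about is the part I can't answer from here.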

Thanks for any insight;  I suppose follow-ups should be to comp.sys.sgi
particularly regarding graphics performance issues.

------

David Laur    dmlaur@gauguin.princeton.edu
Princeton University, Interactive Computer Graphics Lab
"Talking about music is like dancing about architecture" - Laurie Anderson

robert@texas.asd.sgi.com (Robert Skinner) (03/01/91)

In article <6518@idunno.Princeton.EDU>, dmlaur@phoenix.Princeton.EDU (David M. Laur) writes:
|> I realize the -lmpc version of malloc will give a "quad-word" aligned
|> address (whatever that really means).  But it's not clear to me if
|> that library should (or can) be used on single-processor systems.

libmpc.a can be used on single-processor systems, and with single
process applications.  There is a *slight* performance penalty if your
application is a single process, because all stdio and memory
allocation routines are semaphored.  (It is very slight, because the
semaphore will never block if the application is a single process).

And of course, your application will be larger, because libmpc is not
a shared library.

-- 
Robert Skinner
robert@sgi.com


	My father was a gambler down in Georgia,
	Wound up on the wrong end of a gun,
	I was born in the back seat of a Greyhound,
	Rollin' down highway forty-one.
 
			- The Allman Brothers