[comp.arch] Binaries and Max Speed

andy@solitary.Stanford.EDU (Andy Freeman) (01/01/89)

In article <2700003@prisma> mo@prisma writes:
[More on the discussion about SPARC runtime stall vs MIPS NOPs:
 Dave thinks that MIPS will be disadvantaged because a binary
 compatible SPARC that manages to avoid load stalls will run
 3% faster than one that doesn't, while the NOPs will always
 be there in MIPS code that isn't recompiled for an implementation
 that can avoid load stalls.]
>Running the same binaries, but faster, is clearly a win.

I don't know of any applications where one can't recompile to get the
last 3% if it is really important.  Perhaps someone can give an
example of one.  The requirements are simple - a 3% difference in
runtime is very important AND you don't get to recompile before
running the application on a different system.  (The deck is a bit
stacked as there are very few applications where 3% matters.)

-andy
UUCP:  {arpa gateways, decwrl, uunet, rutgers}!polya.stanford.edu!andy
ARPA:  andy@polya.stanford.edu
(415) 329-1718/723-3088 home/cubicle

arman@oahu.cs.ucla.edu (Arman Bostani) (01/01/89)

In article <5847@polya.Stanford.EDU> andy@solitary.Stanford.EDU (Andy Freeman) writes:
>In article <2700003@prisma> mo@prisma writes:
>[More on the discussion about SPARC runtime stall vs MIPS NOPs:
> Dave thinks that MIPS will be disadvantaged because a binary
> compatible SPARC that manages to avoid load stalls will run
> 3% faster than one that doesn't, while the NOPs will always
> be there in MIPS code that isn't recompiled for an implementation
> that can avoid load stalls.]
>>Running the same binaries, but faster, is clearly a win.
>
>I don't know of any applications where one can't recompile to get the
>last 3% if it is really important. -- stuff deleted --

In this case, I don't even see a reason for recompilation of source
programs. One can always write a "relatively" simple program to get
rid of the NOPs.

-- Arman Bostani // UCLA Computer Science Department // +1 213-825-3194
	3417 Boelter Hall // Los Angeles, California 90024-1596 // USA
	arman@CS.UCLA.EDU   ...!(ucbvax,rutgers)!ucla-cs!arman

bcase@cup.portal.com (Brian bcase Case) (01/03/89)

>>I don't know of any applications where one can't recompile to get the
>>last 3% if it is really important. -- stuff deleted --
>
>In this case, I don't even see a reason for recompilation of source
>programs. One can always write a "relatively" simple program to get
>rid of the NOPs. 

I claim this isn't true.  Getting rid of NOPs changes code addresses.
You'll have to find all places where code addresses are referenced,
including tables for case statements and other places where code
addresses are computed at run time (procedure parameters?), and change
them too.  This is a hard problem.

mo@prisma (01/04/89)

Well, it turns out that if your machine has a large base of
available applications, like the SPARC, for instance, you discover
that many of the vendors don't really want to be bothered to
do ANOTHER stock-keeping unit just for YOUR machine if their code will
run already, even if crippled to some degree.  As for 3%, on a
100-150 MIPS machine, that 3% is as fast as what was, until
recently, considered a reasonably quick (if not fast) machine.
I'm not claiming it's a gigantic win, but it can be a non-trivial one
for some applications.  Just remember that not everyone's problem
will currently run on a workstation-class machine.

Waste not, want not.

	-Mike

larry@mips.COM (Larry Weber) (01/04/89)

In article <13114@cup.portal.com> bcase@cup.portal.com (Brian bcase Case) writes:
>>>I don't know of any applications where one can't recompile to get the
>>>last 3% if it is really important. -- stuff deleted --
>>
>>In this case, I don't even see a reason for recompilation of source
>>programs. One can always write a "relatively" simple program to get
>>rid of the NOPs. 
>
>I claim this isn't true.  Getting rid of NOPs changes code addresses.
>You'll have to find all places where code addresses are referenced,
>including tables for case statements and other places where code
>addresses are computed at run time (procedure parameters?), and change
>them too.  This is a hard problem.

Hard problem? Not really; you just have to be careful to modify all the
addresses correctly.  Also, if your coding style (or the compiler's) is
to intermix code and data in a haphazard way, or if you like to
write into the code area (:-{), these add significant difficulty.

The MIPS program pixie is used to add instrumentation to a program
for performance analysis.  It must change the addresses to account
for the added code.  Deleting code is much the same.
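
To make the address fix-up concrete, here is a minimal sketch in Python.
It is not pixie's actual method. It assumes a toy program model:
fixed-size instructions stored as (opcode, operand) tuples, code addresses
that are simply list indices, and a separate jump table such as a compiler
might emit for a case statement. The names strip_nops and new_addr and the
opcodes are hypothetical. It also assumes every NOP is safe to delete,
which would not be true of, for example, a NOP in a branch delay slot.

```python
def strip_nops(code, jump_table):
    """Delete NOPs and rewrite every stored code address to match."""
    # new_addr[old] is the index the old instruction has after deletion.
    # A deleted NOP maps to the next surviving instruction.
    new_addr = []
    kept = 0
    for opcode, _ in code:
        new_addr.append(kept)
        if opcode != "nop":
            kept += 1
    new_addr.append(kept)  # address one past the end of the code

    new_code = []
    for opcode, operand in code:
        if opcode == "nop":
            continue
        if opcode in ("jump", "branch"):  # operand is a code address
            operand = new_addr[operand]
        new_code.append((opcode, operand))

    # Addresses stored as data (the case-statement table) move too.
    new_table = [new_addr[target] for target in jump_table]
    return new_code, new_table


code = [("load", 0), ("nop", None), ("add", 0),
        ("branch", 4), ("nop", None), ("jump", 0)]
new_code, new_table = strip_nops(code, jump_table=[2, 5])
```

The sketch only works because it knows which operands and which data words
are code addresses; finding them in a real binary is the part under dispute.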

Larry
-- 
-Larry Weber  DISCLAIMER: I speak only for myself, and I sometimes deny that.
UUCP: 	{ames,decwrl,prls,pyramid}!mips!larry  OR  larry@mips.com
DDD:  	408-991-0214 or 408-720-1700, x214
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

bcase@cup.portal.com (Brian bcase Case) (01/06/89)

>>I claim this isn't true.  Getting rid of NOPs changes code addresses.
>>You'll have to find all places where code addresses are referenced,
>>including tables for case statements and other places where code
>>addresses are computed at run time (procedure parameters?), and change
>>them too.  This is a hard problem.
>
>Hard problem? Not really; you just have to be careful to modify all the
>addresses correctly.  Also, if your coding style (or the compiler's) is
>to intermix code and data in a haphazard way, or if you like to
>write into the code area (:-{), these add significant difficulty.
>
>The MIPS program pixie is used to add instrumentation to a program
>for performance analysis.  It must change the addresses to account
>for the added code.  Deleting code is much the same.

Yes, I believe you probably can do it for MIPS code.  The first thing
that makes it possible is fixed-size and fixed-alignment instructions.
These are attributes that make distinguishing code from data either
trivial or unimportant (you can translate data as if it were code
without harm, it would amount to a pessimistic assumption).  Plus, it
seems that PIXIE uses an address correspondence array (something that
I thought *I* had thought of, sigh), which also helps (it keeps track
of the mapping of old code addresses to new code addresses so that
indirect branches will work OK).
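
A minimal Python sketch of such an address correspondence array follows.
It is an assumption about the general idea, not a description of how PIXIE
is implemented. Instructions are modeled as opcode strings, addresses as
list indices, and the names build_correspondence and indirect_target are
hypothetical. Addresses computed at run time are left holding their OLD
values and are translated at the moment of the indirect branch.

```python
def build_correspondence(old_code):
    """Return a list mapping each old instruction index to its new index."""
    table = []
    kept = 0
    for opcode in old_code:
        table.append(kept)  # a deleted NOP maps to the next kept instruction
        if opcode != "nop":
            kept += 1
    return table


def indirect_target(old_addr, table):
    # The extra lookup every indirect branch pays in the converted program.
    return table[old_addr]


old_code = ["load", "nop", "add", "nop", "ret"]
table = build_correspondence(old_code)

# A procedure pointer computed at run time still holds the old address
# of "ret" (index 4); the lookup redirects it to the new location.
new_target = indirect_target(4, table)
```

The appeal of this design is that data holding code addresses never has to
be found or rewritten; the cost is one table lookup per indirect branch.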

Thus, yes, I believe that you think this is not a hard problem for
*your* environment.  In general, however, which is how I took the
original poster's words and how I intended my response, this is certainly
a very hard problem.  Talk to Hunter Systems.  Guys like Phoenix
Technologies, who use dynamic analysis, have a much easier time, of
course.

Also, note that the address correspondence array is an extra level of
indirection that will have some effect on the performance of the
converted program.

I would like to know more about PIXIE, especially the techniques used.
Are there published reports to which I can refer?  Thanks....

andy@solitary.Stanford.EDU (Andy Freeman) (01/07/89)

In article <2700005@prisma> mo@prisma writes:
[We're discussing whether SPARC's stalling is a big advantage over MIPS's
 explicit NOPs for binary portability to implementations that don't need them.]
>run already, even if crippled to some degree.  As for 3%, on a
>100-150 MIPS machine, that 3% is as fast as what was, until
>recently, considered a reasonably quick (if not fast) machine.
>I'm not claiming it's a gigantic win, but it can be a non-trivial one
>for some applications.
     ^^^^
I'm still asking for a SINGLE example.  The application has to perform
acceptably on a 10 MIPS machine yet when it is run on a 100 MIPS
machine, the last 3% performance (almost a third of the performance on
the 10 MIPS box) is crucial.  Oh, and recompiling has to be out of the
question.  (If you push on this a bit more, you'll discover that there
are actually so few "dusty decks" that re-writing them is reasonable.)
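
The arithmetic behind "almost a third" can be checked with a short Python
sketch. The function name gain_in_mips is hypothetical, and the figures are
the ones quoted in this thread (3%, 10 MIPS, 100-150 MIPS), not measurements.

```python
def gain_in_mips(machine_mips, percent=3):
    """Throughput, in MIPS, represented by `percent` of a machine's speed."""
    return machine_mips * percent / 100


gain_100 = gain_in_mips(100)   # 3% of a 100 MIPS machine
gain_150 = gain_in_mips(150)   # 3% of a 150 MIPS machine

# Express the 100 MIPS gain as a share of an entire 10 MIPS machine.
share_of_10_mips_box = gain_100 / 10

print(gain_100, gain_150, share_of_10_mips_box)
```

So the disputed 3% is worth 3 to 4.5 MIPS on the faster machine, which is
30% of the whole 10 MIPS box at the low end.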

>Just remember that not everyone's problem will currently run on a
>workstation-class machine.

Netnews is the only problem I've got that runs on a workstation. :-)
(That's a small lie - they're also fine for mail, editing, and compiling.)

-andy
UUCP:  {arpa gateways, decwrl, uunet, rutgers}!polya.stanford.edu!andy
ARPA:  andy@polya.stanford.edu
(415) 329-1718/723-3088 home/cubicle