[comp.arch] FREQUENCY statements: fill a much-needed gap

mash@mips.UUCP (John Mashey) (01/11/88)

In article <28200085@ccvaxa> aglew@ccvaxa.UUCP writes:

>..> The Fortran FREQUENCY statement:
>..>   John Levine (paraphrased) says that the lesson is that the programmer 
>..>   should not provide the frequency hints, the system should.

>I would rather that you had said "the system should be able to",
>but "the programmer can if he really wants to".

>First, a high level directive like FREQUENCY can be generated either
>by the programmer or automatically by the language system, so why not
>make this functionality available to the programmer?

I suppose it would be OK to have FREQUENCY statements or similar
directives, although the code clutter might be awful.   In my opinion,
if much of this idea depends on human beings putting them in,
it's a *********Wrong Thing**************

1) Human beings should never supply information that computers can figure
out for themselves.  As noted earlier in this discussion, there indeed
may be cases where the computer cannot figure it out.  However:

2) If knowing the frequencies of branches (or of other things)
makes a big difference, there is a (conceptually) simple mechanism
for gathering the information mechanically, at least for user programs:
	a) Compile the program.
	b) Convert the program to a profiling version.
	c) Run the profiled version on one or more test cases to
	generate feedback files.
	d) Add the feedback files together [and note that this is important:
	some programs [Spice, for example] look very different with
	different kinds of inputs.
	e) Recompile and relink the program, applying whatever
	transforms are useful, given the information in the feedback files.
	Such transforms might include:
		1) FREQUENCY-like changes, if the architecture is actually
		helped by them
		*2) Selective inlining of procedures, at the call sites
		where it is actually worth it.
		3) Other code optimizations at compile time helped by
		frequency analysis.
		*4) Rearrangement to help cache efficiency

[MIPS software currently does a)-e) above, plus *2 and *4],
i.e., none of this is remotely theoretical.

3) Human intuition about performance is notoriously awful.  People
are often surprised when they see profile results.  If the way you
got most FREQUENCY statements was to profile something first,
why not let the computer do it?

4) Features like FREQUENCY are a crutch to fill the gap between
what a human knows and what the compiler can do.  C's "register"
declaration was such a crutch, appropriate at the time, since it
made a big difference, and optimizing compilers were not as widely
available as they are today.  Many current compilers almost ignore
"register" in favor of their own allocation.

FREQUENCY is like that, but not as useful, since it isn't spanning
much of gap, as current leading-edge software technology is quite
capable of gathering and using the needed information.
It is a feature that "fills a much-needed gap" (in the words of Ken T.)
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

blarson@skat.usc.edu (Bob Larson) (01/11/88)

Programmers often don't realy know how their code will be used.  This
means that not only will their intuitive guesses on how frequently
sections of code will be used will be bad, but also they may not be
able to pick a reasonable set of cases to profile.  (Would you like to
see various versions of a product based on what profiles were fead to
the optomizer?)

Also, there are times where the most common case is not the one that
needs optimization.  Would you want a missle tracking system optimized
for the case where there are no incoming missles?  (Many such
real-time applictations are coded in assembler, but more and more are
now in high-level languages.)
--
Bob Larson	Arpa: Blarson@Ecla.Usc.Edu	blarson@skat.usc.edu
Uucp: {sdcrdcf,cit-vax}!oberon!skat!blarson
Prime mailing list:	info-prime-request%fns1@ecla.usc.edu
			oberon!fns1!info-prime-request

segall@clash.rutgers.edu (Ed Segall) (01/13/88)

> 4) Features like FREQUENCY are a crutch to fill the gap between
> what a human knows and what the compiler can do.  C's "register"
> declaration was such a crutch, appropriate at the time, since it
> made a big difference, and optimizing compilers were not as widely
> available as they are today.  Many current compilers almost ignore
> "register" in favor of their own allocation.
> 
> FREQUENCY is like that, but not as useful, since it isn't spanning
> much of gap, as current leading-edge software technology is quite
> capable of gathering and using the needed information.
> It is a feature that "fills a much-needed gap" (in the words of Ken T.)


Aaarrrggghh!


Yes, it would be really nice to have a compiler figure everything out
in advance, and I wish the greatest encouragement to those who can get
a compiler to figure out which branch is most likely to be taken.
Yes, it it true that profiling (with appropriate input data, for
programs with data-dependent profiles) is the best way to do this.
However, predicting the most likely branch is not the only reason to
include a directive in a language to tell the compiler which branch to
optimize.  Perhaps FREQUENCY is a misnomer.  May I propose
MAKE_FASTEST_BRANCH, instead? This is only half-serious, of course,
and I hope the choice of a name doesn't turn into a "this name is
better than that name" contest, but there is an important semantic
difference between the two. 

I think the missle example showed it best (to paraphrase: you don't
want to optimize for the most common case, when no missles are coming
in.  You'd be better off optimizing the very uncommon case when they
are).  So, in mine humble opinion, it is a Very Good Thing to have a
compiler figure out (guess, usually) which branch is going to be
executed most frequently, and to use that information, but it is a
Very Bad Thing to ignore the need for communicating as much useful
information as possible to the compiler.  Sure, most such features
would go unused most of the time, often when used they would be wrong,
and the code clutter would be horrendous if they are used extensively,
(sounds like a candidate for Hypertext-style programming)  but please
don't say they shouldn't be developed and used.




--Ed
-- 

uucp:   ...{harvard, ut-sally, sri-iu, ihnp4!packard}!topaz!caip!segall
arpa:   SEGALL@CAIP.RUTGERS.EDU

mash@mips.UUCP (John Mashey) (01/13/88)

In article <2861@clash.rutgers.edu> segall@clash.rutgers.edu (Ed Segall) writes:
>> 4) Features like FREQUENCY are a crutch to fill the gap between
>> what a human knows and what the compiler can do....
 
>> FREQUENCY is like that, but not as useful, since it isn't spanning
>> much of gap, as current leading-edge software technology is quite
>> capable of gathering and using the needed information.
>> It is a feature that "fills a much-needed gap" (in the words of Ken T.)

>Aaarrrggghh!

Well, I thought my earlier posting might stir something up.

>Yes, it would be really nice to have a compiler figure everything out
>in advance, and I wish the greatest encouragement to those who can get
>a compiler to figure out which branch is most likely to be taken....

>I think the missle example showed it best (to paraphrase: you don't
>want to optimize for the most common case, when no missles are coming
>in.  You'd be better off optimizing the very uncommon case when they
>are).  So, in mine humble opinion, it is a Very Good Thing to have a
>compiler figure out (guess, usually) which branch is going to be
>executed most frequently, and to use that information, but it is a
>Very Bad Thing to ignore the need for communicating as much useful
>information as possible to the compiler.  Sure, most such features
>would go unused most of the time, often when used they would be wrong,
>and the code clutter would be horrendous if they are used extensively,
>(sounds like a candidate for Hypertext-style programming)  but please
>don't say they shouldn't be developed and used.

The missile case doesn't prove it at all:
	a) If a thing like a FREQUENCY statement makes a difference,
	you are in serious trouble. Anti-missile code is hard enough.
	b) You can certainly pick your test cases.  Everyone that I've
	known that worked on anti-missile stuff said they spent a
	lot of time simulating "here come the missiles", so presumably,
	data could be gathered on that case.

All of this is a whole lot like profiling in the first place, i.e.,
if you spend a lot of apriori time tuning something, and then you
profile it, you may be surprised that the time wasn't going where
you thought.  [this is not to say that you shouldn't do the level of
performance analysis in advance that is deserved by a project, merely
that tuning code without knowledge is often wasted effort.]

Architecture&language design, with regard to performance feature additions seem
like the same thing to me.  It would be a good M.S. Thesis for somebody
to modify a C compiler to take "advice" statements,  annotate some
useful real programs with them, and analyze the differences in performance.
Hopefully, they'd start with an optimizer in the first place.
I'm all for features (in cpus or languages) that help people get what they want;
on the other hand, every feature costs you, and if you added every feature
to C, that at least somebody has thought was a good idea, and might help
some special case, C would substantially larger than PL/I.

To get back to what started this, architecturally:
a) Wrong-guessed branches are slow, so work very hard to guess right.
b) All branches are fast.

So far, b) looks awfully good to me; I sure prefer hardware that lets people
get on with the job.

Finally, it is interesting that FREQUENCY has been a feature familiar to
people for about 30 years, and yet, it's been dropped from the original place
and not picked up elsewhere.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

glenn@godzilla.ele.toronto.edu (Glenn Mackintosh) (01/14/88)

In article <6107@oberon.USC.EDU> blarson@skat.usc.edu (Bob Larson) writes:
>Programmers often don't realy know how their code will be used.  This
>means that not only will their intuitive guesses on how frequently
>sections of code will be used will be bad, but also they may not be
>able to pick a reasonable set of cases to profile.  (Would you like to
>see various versions of a product based on what profiles were fead to
>the optomizer?)
>
>Also, there are times where the most common case is not the one that
>needs optimization.  Would you want a missle tracking system optimized
>for the case where there are no incoming missles?  (Many such
>real-time applications are coded in assembler, but more and more are
>now in high-level languages.)

I'm not really sure which side your arguing for here.  Maybe you were trying
to say what I'm about to and I missed it.  The first paragraph sounds like
an argument against things like the FREQUENCY statement but the second
paragraph seems like a good argument for them.  In the above missile example
the programmers knowledge of which direction MUST be optimized is probably
far more important and more correct than any specific group of test cases
could be.  Even a group of simulated attacks would likely not cover all the
cases and in fact they might tend to cancel each other out.  Therefore good
use can be made of profiling but you may still need to force it to optimize
in a specific direction.

My personal opinion is that hints from the programmer can be useful and
should be taken into account by the compiler.  The optimizer should weigh
this against things it discovers from profiling or other techniques and
possibly choose to go the opposite way.  The example above illustrates that
it can be important to be able to override the compiler though.


			Glenn Mackintosh (@ University of Toronto)

Standard disclaimer applies.

dick@ccb.ucsf.edu (Dick Karpinski) (01/20/88)

In article <2861@clash.rutgers.edu> segall@clash.rutgers.edu (Ed Segall) writes:
>Yes, it it true that profiling (with appropriate input data, for
>programs with data-dependent profiles) is the best way to do this.
>However, predicting the most likely branch is not the only reason ...
>
>I think the missle example showed it best (to paraphrase: you don't
>want to optimize for the most common case, when no missles are coming
>in.  You'd be better off optimizing the very uncommon case when they
>are).  

Computer programs exhibit unexpected behavior very often.  Even in this
odd missle case, I would rather try to simulate the environment in which
the missles are coming and instruct the compiler to respect the observed
frequencies rather than expect the programmer to guess correctly.  For
example, it may well be that the usual case is still the usual case even
when the missles are in view.  Only a well designed optimiser can be 
expected to do a good job.  People are seldom much like well designed
optimisers.

Dick

Dick Karpinski  Manager of Minicomputer Services, UCSF Computer Center
UUCP:  ...!ucbvax!ucsfcgl!cca.ucsf!dick        (415) 476-4529 (11-7)
BITNET:  dick@ucsfcca or dick@ucsfvm           Compuserve: 70215,1277  
USPS:  U-76 UCSF, San Francisco, CA 94143-0704   Telemail: RKarpinski   
Domain: dick@cca.ucsf.edu  Home (415) 658-6803  Ans 658-3797

eugene@pioneer.arpa (Eugene Miya N.) (01/20/88)

Ah population count!  The NSA needs that!

--eugene miya

hammond@faline.bellcore.com (Rich A. Hammond) (01/20/88)

One case that I would like to see the programmer comment for the
compiler and other readers is error recovery routines.  Quite
often the routines lack names that I immediately recognize as
error handling and the compiler has no way of knowing whether "perror"
is really the libc routine or a function declared by the programmer.

In this case the programmer could be relatively certain of the
frequency (or alternatively doesn't care if the error recovery is
slower) and the compiler could put code, such as save all the registers
before the error routine calls,  that would enable better code for the
case that we wish to go quickly.

This is somewhat different from commenting all branches and should be
helpful to both future code maintainers and the compiler and doesn't
depend on tracing to get useful results.

The problem that I have with using trace information is its effect
on benchmarks.  Optimizing a program for a given set of inputs will
make benchmarks run wonderfully, but may not reflect the case we
really want to measure.  Averaging the run time of the benchmark
with trace information and without might give us a better feel,
but we'd force a more complex compiler, since I assume it would
use two different branch prediction strategies, both with possible
bugs.

This isn't to say that we shouldn't use tracing, just that benchmarks
should indicate exactly what the compiler options and their effects
were.

Rich Hammond

mash@mips.COM (John Mashey) (01/22/88)

In article <1722@faline.bellcore.com> hammond@faline.UUCP (Rich A. Hammond) writes:

>In this case the programmer could be relatively certain of the
>frequency (or alternatively doesn't care if the error recovery is
>slower) and the compiler could put code, such as save all the registers
>before the error routine calls,  that would enable better code for the
>case that we wish to go quickly.
This is a specific instance of the general idea of moving register
saves and restores around when possible. Rich's case is:
	glurp()
	{
		if (something) {
			low-frequency-function-call
		} else {
			a few statements
		}
	}
You could move at least a return-address register-save/restore around
the function call (which our compiler folks some-times call "shrink-wrapping"
it.)  However, the idea really has nothing to do with function calls:
it could just as well be: 

	glurp()
	{
		if (something) {
			simple statment that needs few registers
		} else {
			a whole bunch of code that needs many regs
		}
	}

In this case, you move any save/restore code into the "else" part.
Has anybody actually implemented this? (we haven't yet)
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086