mash@mips.UUCP (John Mashey) (01/11/88)
In article <28200085@ccvaxa> aglew@ccvaxa.UUCP writes: >..> The Fortran FREQUENCY statement: >..> John Levine (paraphrased) says that the lesson is that the programmer >..> should not provide the frequency hints, the system should. >I would rather that you had said "the system should be able to", >but "the programmer can if he really wants to". >First, a high level directive like FREQUENCY can be generated either >by the programmer or automatically by the language system, so why not >make this functionality available to the programmer? I suppose it would be OK to have FREQUENCY statements or similar directives, although the code clutter might be awful. In my opinion, if much of this idea depends on human beings putting them in, it's a *********Wrong Thing************** 1) Human beings should never supply information that computers can figure out for themselves. As noted earlier in this discussion, there indeed may be cases where the computer cannot figure it out. However: 2) If knowing the frequencies of branches (or of other things) makes a big difference, there is a (conceptually) simple mechanism for gathering the information mechanically, at least for user programs: a) Compile the program. b) Convert the program to a profiling version. c) Run the profiled version on one or more test cases to generate feedback files. d) Add the feedback files together [and note that this is important: some programs [Spice, for example] look very different with different kinds of inputs. e) Recompile and relink the program, applying whatever transforms are useful, given the information in the feedback files. Such transforms might include: 1) FREQUENCY-like changes, if the architecture is actually helped by them *2) Selective inlining of procedures, at the call sites where it is actually worth it. 3) Other code optimizations at compile time helped by frequency analysis. *4) Rearrangement to help cache efficiency [MIPS software currently does a)-e) above, plus *2 and *4], i.e., none of this is remotely theoretical. 3) Human intuition about performance is notoriously awful. People are often surprised when they see profile results. If the way you got most FREQUENCY statements was to profile something first, why not let the computer do it? 4) Features like FREQUENCY are a crutch to fill the gap between what a human knows and what the compiler can do. C's "register" declaration was such a crutch, appropriate at the time, since it made a big difference, and optimizing compilers were not as widely available as they are today. Many current compilers almost ignore "register" in favor of their own allocation. FREQUENCY is like that, but not as useful, since it isn't spanning much of gap, as current leading-edge software technology is quite capable of gathering and using the needed information. It is a feature that "fills a much-needed gap" (in the words of Ken T.) -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: {ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com DDD: 408-991-0253 or 408-720-1700, x253 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
blarson@skat.usc.edu (Bob Larson) (01/11/88)
Programmers often don't realy know how their code will be used. This means that not only will their intuitive guesses on how frequently sections of code will be used will be bad, but also they may not be able to pick a reasonable set of cases to profile. (Would you like to see various versions of a product based on what profiles were fead to the optomizer?) Also, there are times where the most common case is not the one that needs optimization. Would you want a missle tracking system optimized for the case where there are no incoming missles? (Many such real-time applictations are coded in assembler, but more and more are now in high-level languages.) -- Bob Larson Arpa: Blarson@Ecla.Usc.Edu blarson@skat.usc.edu Uucp: {sdcrdcf,cit-vax}!oberon!skat!blarson Prime mailing list: info-prime-request%fns1@ecla.usc.edu oberon!fns1!info-prime-request
segall@clash.rutgers.edu (Ed Segall) (01/13/88)
> 4) Features like FREQUENCY are a crutch to fill the gap between > what a human knows and what the compiler can do. C's "register" > declaration was such a crutch, appropriate at the time, since it > made a big difference, and optimizing compilers were not as widely > available as they are today. Many current compilers almost ignore > "register" in favor of their own allocation. > > FREQUENCY is like that, but not as useful, since it isn't spanning > much of gap, as current leading-edge software technology is quite > capable of gathering and using the needed information. > It is a feature that "fills a much-needed gap" (in the words of Ken T.) Aaarrrggghh! Yes, it would be really nice to have a compiler figure everything out in advance, and I wish the greatest encouragement to those who can get a compiler to figure out which branch is most likely to be taken. Yes, it it true that profiling (with appropriate input data, for programs with data-dependent profiles) is the best way to do this. However, predicting the most likely branch is not the only reason to include a directive in a language to tell the compiler which branch to optimize. Perhaps FREQUENCY is a misnomer. May I propose MAKE_FASTEST_BRANCH, instead? This is only half-serious, of course, and I hope the choice of a name doesn't turn into a "this name is better than that name" contest, but there is an important semantic difference between the two. I think the missle example showed it best (to paraphrase: you don't want to optimize for the most common case, when no missles are coming in. You'd be better off optimizing the very uncommon case when they are). So, in mine humble opinion, it is a Very Good Thing to have a compiler figure out (guess, usually) which branch is going to be executed most frequently, and to use that information, but it is a Very Bad Thing to ignore the need for communicating as much useful information as possible to the compiler. Sure, most such features would go unused most of the time, often when used they would be wrong, and the code clutter would be horrendous if they are used extensively, (sounds like a candidate for Hypertext-style programming) but please don't say they shouldn't be developed and used. --Ed -- uucp: ...{harvard, ut-sally, sri-iu, ihnp4!packard}!topaz!caip!segall arpa: SEGALL@CAIP.RUTGERS.EDU
mash@mips.UUCP (John Mashey) (01/13/88)
In article <2861@clash.rutgers.edu> segall@clash.rutgers.edu (Ed Segall) writes: >> 4) Features like FREQUENCY are a crutch to fill the gap between >> what a human knows and what the compiler can do.... >> FREQUENCY is like that, but not as useful, since it isn't spanning >> much of gap, as current leading-edge software technology is quite >> capable of gathering and using the needed information. >> It is a feature that "fills a much-needed gap" (in the words of Ken T.) >Aaarrrggghh! Well, I thought my earlier posting might stir something up. >Yes, it would be really nice to have a compiler figure everything out >in advance, and I wish the greatest encouragement to those who can get >a compiler to figure out which branch is most likely to be taken.... >I think the missle example showed it best (to paraphrase: you don't >want to optimize for the most common case, when no missles are coming >in. You'd be better off optimizing the very uncommon case when they >are). So, in mine humble opinion, it is a Very Good Thing to have a >compiler figure out (guess, usually) which branch is going to be >executed most frequently, and to use that information, but it is a >Very Bad Thing to ignore the need for communicating as much useful >information as possible to the compiler. Sure, most such features >would go unused most of the time, often when used they would be wrong, >and the code clutter would be horrendous if they are used extensively, >(sounds like a candidate for Hypertext-style programming) but please >don't say they shouldn't be developed and used. The missile case doesn't prove it at all: a) If a thing like a FREQUENCY statement makes a difference, you are in serious trouble. Anti-missile code is hard enough. b) You can certainly pick your test cases. Everyone that I've known that worked on anti-missile stuff said they spent a lot of time simulating "here come the missiles", so presumably, data could be gathered on that case. All of this is a whole lot like profiling in the first place, i.e., if you spend a lot of apriori time tuning something, and then you profile it, you may be surprised that the time wasn't going where you thought. [this is not to say that you shouldn't do the level of performance analysis in advance that is deserved by a project, merely that tuning code without knowledge is often wasted effort.] Architecture&language design, with regard to performance feature additions seem like the same thing to me. It would be a good M.S. Thesis for somebody to modify a C compiler to take "advice" statements, annotate some useful real programs with them, and analyze the differences in performance. Hopefully, they'd start with an optimizer in the first place. I'm all for features (in cpus or languages) that help people get what they want; on the other hand, every feature costs you, and if you added every feature to C, that at least somebody has thought was a good idea, and might help some special case, C would substantially larger than PL/I. To get back to what started this, architecturally: a) Wrong-guessed branches are slow, so work very hard to guess right. b) All branches are fast. So far, b) looks awfully good to me; I sure prefer hardware that lets people get on with the job. Finally, it is interesting that FREQUENCY has been a feature familiar to people for about 30 years, and yet, it's been dropped from the original place and not picked up elsewhere. -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: {ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com DDD: 408-991-0253 or 408-720-1700, x253 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
glenn@godzilla.ele.toronto.edu (Glenn Mackintosh) (01/14/88)
In article <6107@oberon.USC.EDU> blarson@skat.usc.edu (Bob Larson) writes: >Programmers often don't realy know how their code will be used. This >means that not only will their intuitive guesses on how frequently >sections of code will be used will be bad, but also they may not be >able to pick a reasonable set of cases to profile. (Would you like to >see various versions of a product based on what profiles were fead to >the optomizer?) > >Also, there are times where the most common case is not the one that >needs optimization. Would you want a missle tracking system optimized >for the case where there are no incoming missles? (Many such >real-time applications are coded in assembler, but more and more are >now in high-level languages.) I'm not really sure which side your arguing for here. Maybe you were trying to say what I'm about to and I missed it. The first paragraph sounds like an argument against things like the FREQUENCY statement but the second paragraph seems like a good argument for them. In the above missile example the programmers knowledge of which direction MUST be optimized is probably far more important and more correct than any specific group of test cases could be. Even a group of simulated attacks would likely not cover all the cases and in fact they might tend to cancel each other out. Therefore good use can be made of profiling but you may still need to force it to optimize in a specific direction. My personal opinion is that hints from the programmer can be useful and should be taken into account by the compiler. The optimizer should weigh this against things it discovers from profiling or other techniques and possibly choose to go the opposite way. The example above illustrates that it can be important to be able to override the compiler though. Glenn Mackintosh (@ University of Toronto) Standard disclaimer applies.
dick@ccb.ucsf.edu (Dick Karpinski) (01/20/88)
In article <2861@clash.rutgers.edu> segall@clash.rutgers.edu (Ed Segall) writes: >Yes, it it true that profiling (with appropriate input data, for >programs with data-dependent profiles) is the best way to do this. >However, predicting the most likely branch is not the only reason ... > >I think the missle example showed it best (to paraphrase: you don't >want to optimize for the most common case, when no missles are coming >in. You'd be better off optimizing the very uncommon case when they >are). Computer programs exhibit unexpected behavior very often. Even in this odd missle case, I would rather try to simulate the environment in which the missles are coming and instruct the compiler to respect the observed frequencies rather than expect the programmer to guess correctly. For example, it may well be that the usual case is still the usual case even when the missles are in view. Only a well designed optimiser can be expected to do a good job. People are seldom much like well designed optimisers. Dick Dick Karpinski Manager of Minicomputer Services, UCSF Computer Center UUCP: ...!ucbvax!ucsfcgl!cca.ucsf!dick (415) 476-4529 (11-7) BITNET: dick@ucsfcca or dick@ucsfvm Compuserve: 70215,1277 USPS: U-76 UCSF, San Francisco, CA 94143-0704 Telemail: RKarpinski Domain: dick@cca.ucsf.edu Home (415) 658-6803 Ans 658-3797
eugene@pioneer.arpa (Eugene Miya N.) (01/20/88)
Ah population count! The NSA needs that! --eugene miya
hammond@faline.bellcore.com (Rich A. Hammond) (01/20/88)
One case that I would like to see the programmer comment for the compiler and other readers is error recovery routines. Quite often the routines lack names that I immediately recognize as error handling and the compiler has no way of knowing whether "perror" is really the libc routine or a function declared by the programmer. In this case the programmer could be relatively certain of the frequency (or alternatively doesn't care if the error recovery is slower) and the compiler could put code, such as save all the registers before the error routine calls, that would enable better code for the case that we wish to go quickly. This is somewhat different from commenting all branches and should be helpful to both future code maintainers and the compiler and doesn't depend on tracing to get useful results. The problem that I have with using trace information is its effect on benchmarks. Optimizing a program for a given set of inputs will make benchmarks run wonderfully, but may not reflect the case we really want to measure. Averaging the run time of the benchmark with trace information and without might give us a better feel, but we'd force a more complex compiler, since I assume it would use two different branch prediction strategies, both with possible bugs. This isn't to say that we shouldn't use tracing, just that benchmarks should indicate exactly what the compiler options and their effects were. Rich Hammond
mash@mips.COM (John Mashey) (01/22/88)
In article <1722@faline.bellcore.com> hammond@faline.UUCP (Rich A. Hammond) writes: >In this case the programmer could be relatively certain of the >frequency (or alternatively doesn't care if the error recovery is >slower) and the compiler could put code, such as save all the registers >before the error routine calls, that would enable better code for the >case that we wish to go quickly. This is a specific instance of the general idea of moving register saves and restores around when possible. Rich's case is: glurp() { if (something) { low-frequency-function-call } else { a few statements } } You could move at least a return-address register-save/restore around the function call (which our compiler folks some-times call "shrink-wrapping" it.) However, the idea really has nothing to do with function calls: it could just as well be: glurp() { if (something) { simple statment that needs few registers } else { a whole bunch of code that needs many regs } } In this case, you move any save/restore code into the "else" part. Has anybody actually implemented this? (we haven't yet) -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: {ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com DDD: 408-991-0253 or 408-720-1700, x253 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086