pg@sftig.UUCP (P.Glick) (12/06/84)
I don't understand why people would consider a compiler "smart" if it were to replace the two nested empty 'for' loops with assignments of the loop variables. I prefer a compiler to do the expected -- what I told it to do. These kinds of 'optimizations' can make programming a pain in the neck. When it comes to programming, I prefer to be the one that makes the mistakes and not find out that a compiler cannot generate the code I want it to generate. Paul Glick btlunix!pg
henry@utzoo.UUCP (Henry Spencer) (12/08/84)
> ... I prefer a compiler to do the expected -- what
> I told it to do. These kinds of 'optimizations' can make programming a pain
> in the neck. When it comes to programming, I prefer to be the one that makes
> the mistakes ...

The problem is that such "obviously wrong" optimizations as deleting entire
loops may be a side effect of complex combinations of different
optimizations, where each optimization in itself is eminently reasonable.
The solution is not dumb compilers, but compilers that can be told what to
do and what not to do. The addition of "volatile" to the ANSI C draft is a
major example.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (12/09/84)
> When it comes to programming, I prefer to be the one that makes
> the mistakes and not find out that a compiler cannot generate the code I
> want it to generate.

I empathize with Paul's position. However, the whole point of a compiler is
to keep you from having to worry about code generation. Practically all
production compilers, including PCC, rearrange the initial "dumb" parse tree
and generated code to produce logically equivalent but faster and smaller
code. Most often this is a desirable service.

It is quite true that this kind of optimization can get in the way when one
is trying to twiddle device registers and do other similar things beyond the
ken of the compiler. That is why the "volatile" specification (which
protects such code against optimization) being proposed for ANSI standard C
is important.
rcd@opus.UUCP (Dick Dunn) (12/14/84)
> I don't understand why people would consider a compiler "smart"
> if it were to replace the two nested empty 'for' loops with assignments
> of the loop variables. I prefer a compiler to do the expected -- what
> I told it to do. These kinds of 'optimizations' can make programming a pain
> in the neck...

I agree with this sentiment--but realize that it's as much wrapped up in
language design as in compiler implementation. Languages such as C (and a
VERY few others) allow you sufficient capability to say just what you want
done that the attitude above makes sense. On the other side, consider a
FORTRAN compiler for a vector machine. FORTRAN is so crippled in terms of
being able to express what's "really going on" in its problem domain that a
decent (let alone smart) compiler has to do a lot of optimization.
--
Dick Dunn    {hao,ucbvax,allegra}!nbires!rcd    (303)444-5710 x3086
   ...Are you making this up as you go along?
ecf_ujaj%jhu.csnet@csnet-relay.arpa (12/17/84)
Dick Dunn <rcd@opus> writes:
> FORTRAN is so crippled in terms of
> being able to express what's "really going on" in its problem domain that
> a decent (let alone smart) compiler has to do a lot of optimization.

Compilers that optimize code can break programs regardless of what language
that code is written in. Optimizing compilers are very useful as long as
the optimization is optional (or at least can be turned off when the code
involved may be sensitive to optimization).

-jj
rbbb@RICE.ARPA (12/17/84)
> Compilers that optimize code can break programs regardless of what
> language that code is written in. Optimizing compilers are very useful
> as long as the optimization is optional (or at least can be turned
> off when the code involved may be sensitive to optimization).

I must disagree with such a general statement. There are compilers that
claim to "optimize", and that do break code, but (especially nowadays)
compiler writers go to great pains to ensure that this is not the case, and
they DO provide ways for the programmer to inform the compiler that mystical
things are happening (e.g., a "volatile" declaration for variables that may
be modified asynchronously (this for FORTRAN!!)). An optimizing compiler
that breaks code is buggy, and should be reported as such (rather than
tolerated: "Oh, those optimizing compilers, such rascally programs. Can't
expect an easy time if you want your code to run fast."). Don't buy that
lie. Real compilers DO exist.

drc
rcd@opus.UUCP (Dick Dunn) (12/19/84)
> Dick Dunn <rcd@opus> writes:
> > FORTRAN is so crippled in terms of
> > being able to express what's "really going on" in its problem domain that
> > a decent (let alone smart) compiler has to do a lot of optimization.
>
> Compilers that optimize code can break programs regardless of what
> language that code is written in. Optimizing compilers are very useful
> as long as the optimization is optional (or at least can be turned
> off when the code involved may be sensitive to optimization).

Not quite fair to compilers. If optimizing breaks programs which conform to
the language definition, the optimizer is broken. If optimizing breaks a
program because the program is non-conforming (as, say, a program which
depends on order of evaluation in most languages), the program is at fault.
Using C assignments to manipulate or busy-wait on device registers is a gray
area, which is why (1) optimizers zing people on such code periodically and
(2) a mechanism like `volatile' has been proposed.

FORTRAN is less susceptible to clashes with the optimizer due to the
device-register sort of hassle but more susceptible to poor programming such
as parameter/COMMON aliasing, misunderstanding of modifying initialized
COMMON, etc. (And yes, I know--the program is equally broken whether it's
the compiler's fault or the programmer's.)

As to turning off optimization--sometimes you want certain optimizations on
and others off, though this can be annoying for compiler-writers.
--
Dick Dunn    {hao,ucbvax,allegra}!nbires!rcd    (303)444-5710 x3086
   ...Are you making this up as you go along?
boyle@ANL-MCS.ARPA (12/20/84)
The view "If optimizing breaks programs which conform to the language
definition, the optimizer is broken" is not universally held. For example,
on the IBM Fortran H compiler (and probably also on H-extended), if you
wrote a loop with an IF test to protect against division by zero, and if
the expression controlled by the IF (the expression with the divide) was
constant with respect to the loop, the expression would be moved out of the
loop, *but the test would not*. Hence the optimized program would divide by
zero where the unoptimized one would not. The Fortran Programmer's Guide
even contained the statement that "the program may not behave the same
after optimization as before". I never understood why they didn't optimize
all programs to a return instruction, so at least they would have been
fast!

Jim Boyle
Math. and Comp. Sci. Div.
Argonne Nat'l Lab.
ndiamond@watdaisy.UUCP (Norman Diamond) (12/20/84)
> > Optimizing compilers are very useful
> > as long as the optimization is optional (or at least can be turned
> > off when the code involved may be sensitive to optimization).
>
> Not quite fair to compilers. If optimizing breaks programs which conform
> to the language definition, the optimizer is broken. If optimizing breaks
> a program because the program is non-conforming (as, say, a program which
> depends on order of evaluation in most languages), the program is at fault.
> Using C assignments to manipulate or busy-wait on device registers is a
> gray area, which is why (1) optimizers zing people on such code
> periodically and (2) a mechanism like `volatile' has been proposed.
>
> FORTRAN is less susceptible to clashes with the optimizer due to the
> device-register sort of hassle but more susceptible to poor programming
> such as parameter/COMMON aliasing, misunderstanding of modifying
> initialized COMMON, etc.

It is usually necessary for optimizations to break certain obscure points
that conform to the language definition, e.g. the order in which STATEMENTS
are executed, so that code can be moved out of loops. Many useful
optimizations cannot be done when more than 99.9% of the language
definition must be respected. In turn, this is the reason that
optimizations must be optional. And if these optimizations were not
available, clamouring might start for a return to assembly languages.
--
Norman Diamond
UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdaisy!ndiamond
CSNET: ndiamond%watdaisy@waterloo.csnet
ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa
"Opinions are those of the keyboard, and do not reflect on me or higher-ups."
oacb2@ut-ngp.UUCP (oacb2) (12/21/84)
> FORTRAN is less susceptible to clashes with the optimizer due to the
> device-register sort of hassle but more susceptible to poor programming
> such as parameter/COMMON aliasing, misunderstanding of modifying
> initialized COMMON, etc.

However, the IBM H Level FORTRAN Programmer's Guide does warn that code
such as

      DO 100 I = 1 TO 10
          IF (Y .GT. 0) X(I) = SQRT(Y)
  100 CONTINUE

(with X appropriately declared as an array of course) can cause an error
because it tries to take the square root of a negative number. The
optimizer compiles this as if it were

      TEMP = SQRT(Y)
      DO 100 I = 1 TO 10
          IF (Y .GT. 0) X(I) = TEMP
  100 CONTINUE
--
Mike Rubenstein, OACB, UT Medical Branch, Galveston TX 77550
jeff@gatech.UUCP (Jeff Lee) (12/21/84)
> > Compilers that optimize code can break programs regardless of what
> > language that code is written in. Optimizing compilers are very useful
> > as long as the optimization is optional (or at least can be turned
> > off when the code involved may be sensitive to optimization).
>
> I must disagree with such a general statement. There are compilers that
> claim to "optimize", and that do break code, but (especially nowadays)
> compiler writers go to great pains to ensure that this is not the case,
> and they DO provide ways for the programmer to inform the compiler that
> mystical things are happening (e.g, "volatile" declaration for variables
> that may be modified asynchronously (this for FORTRAN!!)). An optimizing
> compiler that breaks code is buggy, and should be reported as such (rather
> than tolerated: "Oh, those optimizing compilers, such rascally programs.
> Can't expect an easy time if you want your code to run fast."). Don't buy
> that lie. Real compilers DO exist.

One of the things that was stressed in the graduate compiler class here at
Georgia Tech was that you don't make optimizations that will make errors
look like they happened in a different place (to assist in debugging), much
less break things. Code movement is a shaky thing. You can cause errors to
happen (divides by 0 that would not normally occur) by moving invariant
code outside of a loop. These are incorrect optimizations.

I always enjoyed CDC's old FTN4 compiler (FORTRAN 66). It had, as I
remember, 4 levels of optimization, with the last one invoked with the
argument UO --- Unsafe Optimizations. I have had code broken by this
because it makes certain assumptions about DO loops to try and keep things
in registers.

In response to the person who didn't want the compiler to make any
optimizations for him (paraphrase: I want that sucker to do what I tell
it), there are situations that need to be optimized that are hard for the
user to control.
Everyone has seen the case of optimization for common subexpressions, e.g.

	a = i + j + k;
	b = i + j;

turned into

	t = i + j;
	a = t + k;
	b = t;

or even

	b = i + j;
	a = b + k;

swapping an addition for a store (which may actually be a register and be
for free) or completely eliminating the addition. This is nice that it can
be done automatically, but the real benefit of common subexpression
elimination is in optimizing things like array references, where the
programmer (in most languages) has no way to optimize them out himself.
Take for example the code segment

	int a[3][5];
	...
	a[i-1][j-1] = 1;
	a[i-1][j]   = 2;
	a[i-1][j+1] = 3;

which should expand into

	*(a + 5 * (i - 1) + j - 1) = 1;
	*(a + 5 * (i - 1) + j    ) = 2;
	*(a + 5 * (i - 1) + j + 1) = 3;

which is pretty tacky, but what was asked for. With very little smarts in
your common subexpression remover you can turn this into

	t = a + 5 * (i - 1) + j;
	*(t - 1) = 1;
	*(t    ) = 2;
	*(t + 1) = 3;

and with a small amount of expression rearrangement (which can also be a
tricky movement, since theoretically things are associative but on a
computer they really aren't) you can finish it up with

	t = (a - 5) + 5 * i + j;
	*(t - 1) = 1;
	*(t    ) = 2;
	*(t + 1) = 3;

where (a - 5) is even a constant that can be calculated at compile time.
Thus with a little smarts we have turned 4 subtractions, 7 adds, and 3
multiplies into 1 subtraction, 3 adds, and 1 multiply. None of these
optimizations should have broken anything, yet they should almost triple
the speed of the index calculations. The only possible errors could be if
either i or j were a "volatile" variable, but the new C standard is fixing
this.

I realize that the normal C user can perform all these optimizations
himself, and that is one of the reasons that I like C. I can make these
optimizations if need be. What I would like, though, is for the compiler
to do these for me automatically so I don't have to mangle my code to make
it fast. Also, the FORTRAN, PASCAL, BASIC, PL/I, etc.
user is left to the mercy of the compiler to make these optimizations for
him.

Excuse me for dragging on so long. This wasn't supposed to be a lecture.
--
Jeff Lee
CSNet:	Jeff @ GATech		ARPA:	Jeff.GATech @ CSNet-Relay
uucp:	...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!jeff
jlg@lanl.ARPA (12/21/84)
> However, the IBM H Level FORTRAN Programmer's Guide does warn that code
> such as
>
>       DO 100 I = 1 TO 10
>           IF (Y .GT. 0) X(I) = SQRT(Y)
>   100 CONTINUE
>
> (with X appropriately declared as an array of course) can cause an error
> because it tries to take the square root of a negative number. The
> optimizer compiles this as if it were
>
>       TEMP = SQRT(Y)
>       DO 100 I = 1 TO 10
>           IF (Y .GT. 0) X(I) = TEMP
>   100 CONTINUE

This particular example should be optimized as

      IF ( Y.GT.0 ) TEMP=SQRT(Y)
      DO 100 I = 1,10          !THERE ARE NO 'TO' DELIMITERS IN FORTRAN
          X(I)=TEMP            !FOLLOW INDENTATION STANDARDS PLEASE
  100 CONTINUE

A good data flow analysis would have seen this. However, in general these
types of optimizations are called 'unsafe', for obvious reasons. I don't
think unsafe optimizations are allowed by the standard, since they change
the meaning of the code (in particular, they cause errors when none would
normally have occurred). IBM has simply stepped out of line here. IBM is
not alone; most vendors tend to have some violations of the standard
lurking around in their compilers.

The given example has a safe optimization available to it. It is possible
to come up with cases that don't, so in general this sort of optimization
should be considered unsafe (unless you have a really good data flow
analyzer available to guarantee safety).
------------------------------------------------------------------------------
The greatest derangement of the mind is to believe in something
because one wishes it to be so - Louis Pasteur

James Giles
cliff@unmvax.UUCP (12/21/84)
> > However, the IBM H Level FORTRAN Programmer's Guide does warn that code
> > such as
> >
> >       DO 100 I = 1 TO 10
> >           IF (Y .GT. 0) X(I) = SQRT(Y)
> >   100 CONTINUE
> >
> > (with X appropriately declared as an array of course) can cause an error
> > because it tries to take the square root of a negative number. The
> > optimizer compiles this as if it were
> >
> >       TEMP = SQRT(Y)
> >       DO 100 I = 1 TO 10
> >           IF (Y .GT. 0) X(I) = TEMP
> >   100 CONTINUE
>
> This particular example should be optimized as
>
>       IF ( Y.GT.0 ) TEMP=SQRT(Y)
>       DO 100 I = 1,10          !THERE ARE NO 'TO' DELIMITERS IN FORTRAN
>           X(I)=TEMP            !FOLLOW INDENTATION STANDARDS PLEASE
>   100 CONTINUE

C Not quite. You will wind up setting X(I) to zero or the last value of TEMP
C when Y is less than or equal to zero. How about something that would really
C be correct, like:
C
      IF (Y.LE.0) GO TO 100
      TEMP = SQRT(Y)
      DO 100 I = 1, 10
          X(I) = TEMP
  100 CONTINUE
C
C The greatest derangement of the mind is to believe in something
C because one wishes to be Louis Pasteur.
C
C --Cliff [Matthews]
C {purdue, cmcl2, ihnp4}!lanl!unmvax!cliff
C {csu-cs, pur-ee, convex, gatech, ucbvax}!unmvax!cliff
C 4744 Trumbull S.E. - Albuquerque NM 87108 - (505) 265-9143
mangoe@umcp-cs.UUCP (Charley Wingate) (12/22/84)
In article <18397@lanl.ARPA> jlg@lanl.ARPA writes:
>> However, the IBM H Level FORTRAN Programmer's Guide does warn that code
>> such as
>>
>>       DO 100 I = 1 TO 10
>>           IF (Y .GT. 0) X(I) = SQRT(Y)
>>   100 CONTINUE
>>
>> (with X appropriately declared as an array of course) can cause an error
>> because it tries to take the square root of a negative number. The
>> optimizer compiles this as if it were
>>
>>       TEMP = SQRT(Y)
>>       DO 100 I = 1 TO 10
>>           IF (Y .GT. 0) X(I) = TEMP
>>   100 CONTINUE
>
>This particular example should be optimized as
>
>      IF ( Y.GT.0 ) TEMP=SQRT(Y)
>      DO 100 I = 1,10          !THERE ARE NO 'TO' DELIMITERS IN FORTRAN
>          X(I)=TEMP            !FOLLOW INDENTATION STANDARDS PLEASE
>  100 CONTINUE

Wrong again. A correct optimization would be

      If (Y.GT.0) Temp=Sqrt(Y)
      Do 100 I=1,10
          If (Y.GT.0) X(I)=Temp
  100 Continue

If Y is negative, X should be unchanged, according to the original program.

My feeling is that an optimizer should produce the same "action" on some
well-defined level as the original unoptimized code. Well-defined in this
case should be equivalent to well-documented.

Charley Wingate   umcp-cs!mangoe
jlg@lanl.ARPA (12/22/84)
> Wrong again. A correct optimization would be
>
>       If (Y.GT.0) Temp=Sqrt(Y)
>       Do 100 I=1,10
>           If (Y.GT.0) X(I)=Temp
>   100 Continue

Quite right! My first version was wrong. But since 'Y' is a loop invariant,
the correct optimization is:

      IF (Y.GT.0) THEN
          TEMP=SQRT(Y)
          DO 100 I=1,10
              X(I)=TEMP
  100     CONTINUE
      ENDIF

This has the advantage of not going through the loop at all if the
assignments would all have been skipped. All these examples show the point
I originally made: these optimizations are unsafe unless you have a REALLY
GOOD data flow analyzer.
------------------------------------------------------------------------------
The greatest derangement of the mind is to believe in something
because one wishes it to be so - Louis Pasteur

James Giles
chris@umcp-cs.UUCP (Chris Torek) (12/23/84)
> It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops.

Depends on what you mean. If the loop is guaranteed to be executed at least
once, then moving loop-invariant expressions out of loops is fine. If the
loop might not get executed, then you can always optimize

	for i := j to k { ... invariant-expr ... }

into

	if j <= k then {
		temp := invariant-expr;
		i := j;
		do { ... temp ... } while ++i <= k;
	}

(please excuse the syntax). Note that the "do" is probably slightly more
efficient than the "for", and should probably be used to avoid degrading
performance when j always <= k.

Of course, if you allow such things as interrupts which modify variables
into your language, then the optimizer must be either MUCH smarter (has to
read and understand all your interrupt handlers [.5 :-)]) or you need hints
like ``volatile''.
--
(This line accidently left nonblank.)
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland
oacb2@ut-ngp.UUCP (oacb2) (12/23/84)
> It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops. Many useful
> optimizations cannot be done when more than 99.9% of the language
> definition must be respected.
> In turn, this is the reason that optimizations must be optional.
> And if these optimizations were not available, clamouring might start for a
> return to assembly languages.

NO! NO! NO! An optimizer that changes the meaning of the code is broken.
I agree that there is some discussion needed as to just what code should
mean, particularly in languages like C which are often used as fancy
assemblers, but at a very minimum an optimizer should never affect the
results of a program provided that:

	- no variables are changed by asynchronous processes unknown to
	  the compiler;
	- no use is made of variables whose values are undefined by the
	  language specification (e.g. the DO loop index in FORTRAN after
	  normal completion of the loop);
	- timing is not considered a result of the program.

If an optimizer cannot be written that does not change the meanings of
programs meeting these minimal requirements, we are better off without the
optimization. I'm more than a little tired of the troubles I've had
debugging legal programs because some turkey wrote an optimizer that shaves
a few microseconds off, but screws up in some rare case that I'm just lucky
enough to hit.

By the way, when I first ran into the problem I posted with DO loops in
FORTRAN, I considered it to be a bug. Now it's been documented (for at
least 10 years), but I still regard it as a bug. It's simply too outrageous
for a statement like

	IF (Y .GT. 0) X(I) = SQRT(Y)

to ever try to take the square root of a negative for me to accept it.
Unfortunately, too many (read: every one I've ever dealt with) computer
manufacturers think that it's legitimate for their software to do all kinds
of outrageous things if they just bury a sentence about it in the
documentation. Another goody from IBM, this time from the PL/I Programmer's
Guide:

	The use of bit strings in a multiprocessing environment may,
	occasionally, lead to incorrect results.
--
Mike Rubenstein, OACB, UT Medical Branch, Galveston TX 77550
rcd@opus.UUCP (Dick Dunn) (12/25/84)
>...It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops. Many useful optimizations
> cannot be done when more than 99.9% of the language definition must be
> respected.

Or, rephrasing the above ">", sometimes you get:

	The optimization was a success, but the program died.

I can't see this as anything other than a conflict resulting from either a
bad (or simply incomplete) language definition or a broken compiler. But
then I've always had trouble dealing with compromises between efficiency
and correctness.
--
Dick Dunn    {hao,ucbvax,allegra}!nbires!rcd    (303)444-5710 x3086
   ...Nothing left to do but smile, smile, smile.
herbie@watdcsu.UUCP (Herb Chong, Computing Services) (12/25/84)
> Another goody from IBM, this time from the PL/I Programmer's
> Guide:
>
>	The use of bit strings in a multiprocessing environment may,
>	occasionally, lead to incorrect results.

Taken out of context, this statement is indeed very silly. You have to read
the paragraphs before and after it to understand that the PL/I program can
easily be communicating with another process sharing the same address
space, possibly written in another language. No matter what the compiler
does, the silliness of the other program can wipe out the best code
generation.

The other thing is that the 370 architecture does not allow bit addressing.
If you want to examine or modify a bit, you have to fetch at least one byte
and store at least one byte. The byte you are interested in may easily
contain other bits that are being used by other processes. The PL/I
language requires that if you are modifying a bit string shared between two
processes (and therefore in the same address space), you must provide your
own synchronization, although it doesn't really come out and say so in one
place.

I might also mention (although this is nitpicking) that if you were
referring to the PL/I Checkout or Optimizing compilers, the manual is the
PL/I Language Reference, where this is explained in detail.

Herb Chong...

I'm user-friendly -- I don't byte, I nybble....
gnu@sun.uucp (John Gilmore) (12/26/84)
James Giles says:
> My first version was wrong. But 'Y' is a loop invariant, the
> correct optimization is:
>
>       IF (Y.GT.0) THEN
>           TEMP=SQRT(Y)
>           DO 100 I=1,10
>               X(I)=TEMP
>   100     CONTINUE
>       ENDIF
>
> This has the advantage of not going through the loop at all if the
> assignments would all have been skipped. All these examples show the point
> I originally made: these optimizations are unsafe unless you have a REALLY
> GOOD data flow analyzer.

You're absolutely right! Had you had a good data flow analyzer, it would
have detected that your optimization had changed the value of 'I' on exit
from this piece of code.
msb@lsuc.UUCP (Mark Brader) (12/26/84)
Ho hum, all of you are wrong. The CORRECT optimization for:
DO 100 I = 1 TO 10
IF (Y .GT. 0) X(I) = SQRT(Y)
100 CONTINUE
Is, in Ratfor syntax:
	IF (Y > 0) {
		TEMP = SQRT(Y)
		DO I = 1,10
			X(I) = TEMP
		}
	ELSE
		I = 11
(In some implementations the number 11 will be something else, most
likely 10, but leaving I unchanged is clearly wrong!)
Moral: This is not only trickier than it looks, it's trickier than that!
Mark Brader
jeff@gatech.UUCP (Jeff Lee) (12/28/84)
> It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops. Many useful optimizations
> cannot be done when more than 99.9% of the language definition must be
> respected.
>
> In turn, this is the reason that optimizations must be optional.

This is a truly incredible statement. I'm sure we can come up with many
more optimizations that work real well if we only respect 99% of the
language definition, but how useful are those "optimizations"? It seems to
me that when your optimizer gets finished with your code, it is no longer
compiling FORTRAN, Pascal, C, etc. but some dialect that doesn't do what
you intended for it to do. I don't want any compiler (and optimizer) that
doesn't "respect" all of the language definition. That is simply begging
and pleading for trouble.
--
Jeff Lee
CSNet:	Jeff @ GATech		ARPA:	Jeff.GATech @ CSNet-Relay
uucp:	...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!jeff
ndiamond@watdaisy.UUCP (Norman Diamond) (12/31/84)
> Ho hum... The CORRECT [sic] optimization for:
>
>	DO 100 I = 1 TO 10
>	IF (Y .GT. 0) X(I) = SQRT(Y)
> 100	CONTINUE
>
> Is, in Ratfor syntax:
>
>	IF (Y > 0) {
>		TEMP = SQRT(Y)
>		DO I = 1,10
>			X(I) = TEMP
>	}
>	ELSE
>		I = 11
>
> (In some implementations the number 11 will be something else, most
> likely 10, but leaving I unchanged is clearly wrong!)

The else case can leave anything at all in I and still be correct. The
value of I is undefined after completion of the DO loop, and when optimized
it remains undefined after the IF statement (whether the then-branch or
ELSE-branch is taken). The most optimal setting of I is to leave it alone,
requiring 0 machine instructions.

Otherwise, this suggestion was correct, as were several others.
--
Norman Diamond
UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdaisy!ndiamond
CSNET: ndiamond%watdaisy@waterloo.csnet
ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa
"Opinions are those of the keyboard, and do not reflect on me or higher-ups."
bsa@ncoast.UUCP (Brandon Allbery (the tame hacker on the North Coast)) (01/02/85)
It occurs to me that there are two ways to look at optimizers:

1) That an optimizer optimizes, not the language code, but ASSEMBLER code,
   and hence does not need to follow the language standard;

2) That an optimizer must follow the language standard exactly, and hence
   should optimize the language code itself.

If you accept (1), as every optimizer I've seen is designed to do, you are
likely to see an optimizer that will occasionally fail. If (2), there is
little optimization possible, especially in a language like Pascal where
you can't manipulate the machine directly.

All you who complain about optimizers breaking code: look at the current
discussion of the correct optimization of a piece of ForTran code. No one
has yet proposed an optimization (type 2) that does the same thing as the
original code. And if it's hard in ForTran, how much harder in Assembler?

Let's face it, guys -- placing any limitations (i.e. standards) on a
programming language is going to make an executing program in that language
slower. Optimizing is often simply ignoring some of the standards and
removing their effects from the generated code.

--bsa
--
Brandon Allbery @ decvax!cwruecmp!ncoast!bsa (..ncoast!tdi1!bsa business)
6504 Chestnut Road, Independence, Ohio 44131 (216) 524-1416
Who said you had to be (a) a poor programmer or (b) a security hazard to be
a hacker?
andrew@orca.UUCP (Andrew Klossner) (01/04/85)
} It occurs to me that there are two ways to look at optimizers:
}
} 1) That an optimizer optimizes, not the language code, but ASSEMBLER
}    code, and hence does not need to follow the language standard;
}
} 2) That an optimizer must follow the language standard exactly,
}    and hence should optimize the language code itself.
}
} If you accept (1), as every optimizer I've seen is designed to do, you
} are likely to see an optimizer that will occasionally fail. If (2),
} there is little optimization possible, especially in a language like
} Pascal where you can't manipulate the machine directly.

This shows a lack of familiarity with the computer science behind
optimization. Systems which improve the assembler output from a compiler
are commonly called "peephole" optimizers, because they never look at more
than a very small part of the program (they look at the program through a
"peephole"). Unix is the only system I've encountered in which the standard
language optimizer is just an assembly code improver.

The more traditional (read "mainframe") optimization strategy is to convert
the entire program into an intermediate form which preserves the meaning of
the code and yet is amenable to rearrangement into a more efficient program
with equivalent meaning. Once this is done, the intermediate code is
translated into assembler or machine code. So-called "global" optimizers
examine the entire program, using "data-flow" algorithms which gather
information by making several passes over the intermediate code; this
information is then used to make decisions about how the program can be
improved without changing its meaning.

Pascal programs are particularly amenable to this sort of optimization.
The standard example is the assignment "a[i]:=a[i]+1".
This cannot be expressed any more efficiently in Pascal, but in a good
intermediate code the translation can be arranged to reflect the fact that
computation of the address of a[i] need only be done once, and that an
"increment word in memory" instruction can be used.

} Let's face it, guys -- placing any limitations (i.e. standards) on a
} programming language is going to make an executing program in that
} language slower. Optimizing is often simply ignoring some of the
} standards and removing their effects from the generated code.

Not at all. An implementor who claims that she had to revise the language
in order to write an optimizer is being inexcusably lazy. Despite the
recent discussions about well-known optimizers breaking programs, such
behavior is considered incorrect by workers in the field.

A good and readable treatment of optimization is Aho and Ullman's
"Principles of Compiler Design", which has become the standard introductory
compiler text. The chapters on optimization emphasize that an optimization
which changes the semantics of the program is not valid.
--
Andrew Klossner   (decvax!tektronix!orca!andrew)      [UUCP]
                  (orca!andrew.tektronix@csnet-relay) [ARPA]
cem@intelca.UUCP (Chuck McManis) (01/04/85)
Again on the subject of "smart" compilers, specifically FORTRAN-IV on a
DEC 2060XE running TOPS-20.  The following code segment is approximate,
but the situation was the same:

      CALL FOO(X,0.,Z)
      ...
      X2 = 0.
      WRITE(5,10) X2
10    FORMAT(1X,F10.2)
      END

      SUBROUTINE FOO(X,Y,Z)
      ...
      Y=1.0
      RETURN
      END

The compiler optimized all constant references to 0.0 into a single
memory location containing 0.0, and the subroutine changed that value to
1.0.  All further references to 0. produced the value 1.  Not too swift.
(I know it is poor style, and have since stopped using constants in
subroutine calls.)

--Chuck
--
              - - - D I S C L A I M E R - - -
{ihnp4,fortune}!dual\         All opinions expressed herein are my
{qantel,idi}->   !intelca!cem   own and not those of my employer, my
{ucbvax,hao}!hplabs/          friends, or my avocado plant.  :-}
Mark Crispin <MRC@SU-SCORE.ARPA> (01/05/85)
I may be mistaken, but I believe that according to the standards it is
not only "poor style" to change a constant which was passed to a
subroutine; the behavior of the system after doing so is also undefined.
In any case, the object code from the TOPS-20 Fortran compiler is not the
only place where the value of a constant is changed as a result of such
practice.  This particular sort of bug is fairly well-known.

There are various flavors of Fortran compilers on 10/20 and IBM systems
which will check for this sort of thing at runtime (and also check for
out-of-bounds array references -- remember the neat tricks we did with
those??), but that checking code introduces a *lot* of overhead.  You
really have no hope of catching this at compile-time; the earliest you
can do it is when linking the modules together.
-------
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (01/06/85)
I thought Fortran-77 required constant actual parameters to be "copied in" rather than passed by reference?
alexc@dartvax.UUCP (Alex Colvin) (01/06/85)
(intelca.472) mentions what can only be called a horrendous compiler bug.
That one has been known since roughly FORTRAN II -- namely, passing
constants by reference.  The solution adopted by most such languages
(FORTRAN, PL/I) is to always pass copies of the constant.

Please understand that the aforementioned compiler is NOT optimizing.
It is compiling incorrect code.  The DTSS PL/I compiler does the correct
optimization in this case.  Iff the interprocedure analysis detects that
a (reference) parameter is not modified, then it passes constant
arguments by reference, without forcing a copy.  This is done ONLY if it
is known to be safe.

As to passing constants, I see nothing wrong with it.  I prefer it to
passing variables.  At least with a constant you can see what you're
getting.  True, magic constants (e.g. 501) should be given some sort of
explanatory name (e.g. StreetAddress).  But I see little point in
defining ZERO as a name for 0.  Pascal textbooks are particularly prone
to this sort of foolishness.  One defined NINE as 9.  If anything, they
should have defined it as RadixLessOne.  That would cause less surprise
when some idiot changes it to 7.
dgary@ecsvax.UUCP (D Gary Grady) (01/07/85)
Mark Crispin <MRC@SU-SCORE.ARPA> writes:
> I may be mistaken, but I believe according to the standards it is
> not only "poor style" to change a constant which was passed to a
> subroutine, it is also undefined what the behavior of the system after
> that should be.
> ...
> You really have no hope for catching this at compile-time; the earliest
> you can do it is when linking the modules together.

What's so hard about the calling program always moving the constant to a
work area and passing the address of that?  As I said before, that's
equivalent to just considering a constant a special case of an
expression.
--
D Gary Grady
Duke University Computation Center, Durham, NC  27706
(919) 684-4146
USENET:  {decvax,ihnp4,akgua,etc.}!mcnc!ecsvax!dgary
dgary@ecsvax.UUCP (D Gary Grady) (01/07/85)
> I thought Fortran-77 required constant actual parameters to be
> "copied in" rather than passed by reference?

FORTRAN parameters are always pass-by-reference.
--
D Gary Grady
Duke University Computation Center, Durham, NC  27706
(919) 684-4146
USENET:  {decvax,ihnp4,akgua,etc.}!mcnc!ecsvax!dgary
Mark Crispin <MRC@SU-SCORE.ARPA> (01/09/85)
The issue is NOT whether the compiler can avoid the problem.  The issue
is that the Fortran 77 standard disallows doing that.  If you do
something which is invalid Fortran, you shouldn't be surprised if strange
things happen.

Some compilers do indeed generate code which prevents constants from
being changed (and checks array bounds, etc.).  However, such compilers
are called "debugging compilers" because the object code is slower.  A
LOT slower.  Let's not forget the bad old days when we used to write code
which deliberately violated Fortran to do useful things (especially
out-of-bounds array references!!).

I really find it surprising that this issue seriously comes up on this
list.  C will bite you back MUCH WORSE than Fortran will if you write
invalid (albeit syntactically valid) C programs.
-------
jhf@lanl.ARPA (01/10/85)
> > I thought Fortran-77 required constant actual parameters to be
> > "copied in" rather than passed by reference?
>
> FORTRAN parameters are always pass-by-reference.

Absolutely not.  Both reference and value-result are acceptable
parameter-passing methods for Fortran.  Programs that are sensitive to
the difference, i.e., that create aliases to dummy arguments (the formal
parameters) which they redefine, are not standard conforming.  This
restriction, in my opinion, is entirely appropriate.  Efficiency of
parameter passing is of the greatest importance to the overall quality
of a compiler's target code, and architectural features and other
circumstances may make one of these methods much more desirable than the
other.  The compiler writer should be free to choose the appropriate
technique, and not be restricted to one of them for the sake of
guaranteeing the effects of some bad programs.

I might note that Ada imposes a similar restriction on parameters, but,
incredibly, the Pascal standard does not.  It requires call-by-reference
for var parameters!  (At least, it did the last time I looked.  If the
situation has changed, I stand duly corrected and chastised.)  I think
this is a huge mistake.  Not only does it unreasonably restrict
implementations, but it makes impossible the removal of a wart in the
language: that components of packed arrays or records may not be passed
as var parameters.  This restriction is there, of course, because there
will generally not be a reasonable way to implement call-by-reference
for objects that are not aligned on an addressable bit boundary.  This
directly conflicts with the notion that packed types differ from their
unpacked counterparts only in their storage requirements and access
times.

What the Pascal standard should have done is to restrict the aliasing of
var parameters in a manner similar to Fortran or Ada, and then to remove
the restriction on passing components of packed structures.  (Most
compilers would implement this by copy-in-copy-out on the calling side,
where necessary.)

Joe Fasel
Los Alamos National Laboratory
jhf@lanl.{arpa,uucp}
johnl@ima.UUCP (01/14/85)
> I thought Fortran-77 required constant actual parameters to be
> "copied in" rather than passed by reference?

No, the F77 standard is deliberately vague, so that either call by
reference or call by copy-in/copy-out is legitimate.  But let's move
such discussions to net.lang.  As I recall, the original argument was
about whether it was legitimate to optimize object code.  Sheesh.

John Levine, ima!johnl
rcd@opus.UUCP (Dick Dunn) (01/15/85)
> > I thought Fortran-77 required constant actual parameters to be
> > "copied in" rather than passed by reference?
>
> FORTRAN parameters are always pass-by-reference.

Please verify before you post absolutes like this.  FORTRAN parameters
are required to obey a certain semantics.  Call-by-reference is one
mechanism which satisfies the constraints.  Copy/restore is another.
--
Dick Dunn    {hao,ucbvax,allegra}!nbires!rcd    (303)444-5710 x3086
   ...A friend of the devil is a friend of mine.
jbn@wdl1.UUCP (01/16/85)
Wait until you see a vectorizing compiler for a supercomputer.
jbn@wdl1.UUCP (01/16/85)
Trouble comes when the language semantics are so loose that efficient
code generation is impossible without making assumptions that are true
almost all of the time, but not all the time.  In C, for example, there
is an assumption that pointers point to memory, not to non-memory
objects such as device registers.  In the Berkeley VAX compiler, code
sequences involving shifting and masking may generate VAX field-extract
instructions.  But the VAX does not implement these instructions for
UNIBUS space, so the optimization is not always valid.  Assuming that
the "volatile" keyword is a data attribute (i.e., one writes "static
volatile char device_input_register"), it is an appropriate way to deal
with the problem.