rcd@opus.UUCP (Dick Dunn) (12/14/84)
> I don't understand why people would consider a compiler "smart"
> if it were to replace the two nested empty 'for' loops with assignments
> of the loop variables.  I prefer a compiler to do the expected -- what
> I told it to do.  These kinds of 'optimizations' can make programming a pain
> in the neck...

I agree with this sentiment--but realize that it's as much wrapped up in
language design as in compiler implementation.  Languages such as C (and a
VERY few others) give you enough power to say just what you want done that
the attitude above makes sense.  On the other side, consider a FORTRAN
compiler for a vector machine.  FORTRAN is so crippled in terms of being
able to express what's "really going on" in its problem domain that a
decent (let alone smart) compiler has to do a lot of optimization.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Are you making this up as you go along?
rcd@opus.UUCP (Dick Dunn) (12/19/84)
> Dick Dunn <rcd@opus> writes:
> > FORTRAN is so crippled in terms of
> > being able to express what's "really going on" in its problem domain that
> > a decent (let alone smart) compiler has to do a lot of optimization.
>
> Compilers that optimize code can break programs regardless of what
> language that code is written in.  Optimizing compilers are very useful
> as long as the optimization is optional (or at least can be turned
> off when the code involved may be sensitive to optimization).

Not quite fair to compilers.  If optimizing breaks programs which conform
to the language definition, the optimizer is broken.  If optimizing breaks
a program because the program is non-conforming (as, say, a program which
depends on order of evaluation in most languages), the program is at fault.
Using C assignments to manipulate or busy-wait on device registers is a
gray area, which is why (1) optimizers zing people on such code
periodically and (2) a mechanism like `volatile' has been proposed.

FORTRAN is less susceptible to clashes with the optimizer due to the
device-register sort of hassle but more susceptible to poor programming
such as parameter/COMMON aliasing, misunderstanding of modifying
initialized COMMON, etc.  (And yes, I know--the program is equally broken
whether it's the compiler's fault or the programmer's.)

As to turning off optimization--sometimes you want certain optimizations
on and others off, though this can be annoying for compiler-writers.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Are you making this up as you go along?
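[Editor's note: the `volatile' mechanism Dunn mentions was only a proposal in 1984; it is standard C today. A minimal sketch of the busy-wait case -- the status word here stands in for a memory-mapped device register, and the names are illustrative:]

```c
#include <stdint.h>

#define READY_BIT 0x1u

/* Poll a status word until the ready bit is set.  Because the pointer
 * is volatile-qualified, the compiler must re-load *status on every
 * iteration; without the qualifier, an optimizer could legally hoist
 * the load out of the loop and spin forever on a stale value -- the
 * "zing" described above. */
static uint32_t wait_until_ready(volatile const uint32_t *status)
{
    while ((*status & READY_BIT) == 0)
        ;  /* busy-wait */
    return *status;
}
```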
ndiamond@watdaisy.UUCP (Norman Diamond) (12/20/84)
> > Optimizing compilers are very useful
> > as long as the optimization is optional (or at least can be turned
> > off when the code involved may be sensitive to optimization).
>
> Not quite fair to compilers.  If optimizing breaks programs which conform
> to the language definition, the optimizer is broken.  If optimizing breaks
> a program because the program is non-conforming (as, say, a program which
> depends on order of evaluation in most languages), the program is at fault.
> Using C assignments to manipulate or busy-wait on device registers is a
> gray area, which is why (1) optimizers zing people on such code
> periodically and (2) a mechanism like `volatile' has been proposed.
>
> FORTRAN is less susceptible to clashes with the optimizer due to the
> device-register sort of hassle but more susceptible to poor programming
> such as parameter/COMMON aliasing, misunderstanding of modifying
> initialized COMMON, etc.

It is usually necessary for optimizations to break certain obscure points
that conform to the language definition, e.g. which order STATEMENTS are
executed in so that code can be moved out of loops.  Many useful
optimizations cannot be done when more than 99.9% of the language
definition must be respected.

In turn, this is the reason that optimizations must be optional.  And if
these optimizations were not available, clamouring might start for a
return to assembly languages.
-- 
Norman Diamond
UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdaisy!ndiamond
CSNET: ndiamond%watdaisy@waterloo.csnet
ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa

"Opinions are those of the keyboard, and do not reflect on me or higher-ups."
oacb2@ut-ngp.UUCP (oacb2) (12/21/84)
> FORTRAN is less susceptible to clashes with the optimizer due to the
> device-register sort of hassle but more susceptible to poor programming
> such as parameter/COMMON aliasing, misunderstanding of modifying
> initialized COMMON, etc.

However, the IBM H Level FORTRAN Programmer's Guide does warn that code such
as

      DO 100 I = 1 TO 10
      IF (Y .GT. 0) X(I) = SQRT(Y)
  100 CONTINUE

(with X appropriately declared as an array of course) can cause an error
because it tries to take the square root of a negative number.  The optimizer
compiles this as if it were

      TEMP = SQRT(Y)
      DO 100 I = 1 TO 10
      IF (Y .GT. 0) X(I) = TEMP
  100 CONTINUE
-- 
Mike Rubenstein, OACB, UT Medical Branch, Galveston TX 77550
jlg@lanl.ARPA (12/21/84)
> However, the IBM H Level FORTRAN Programmer's Guide does warn that code such
> as
>
>       DO 100 I = 1 TO 10
>       IF (Y .GT. 0) X(I) = SQRT(Y)
>   100 CONTINUE
>
> (with X appropriately declared as an array of course) can cause an error
> because it tries to take the square root of a negative number.  The optimizer
> compiles this as if it were
>
>       TEMP = SQRT(Y)
>       DO 100 I = 1 TO 10
>       IF (Y .GT. 0) X(I) = TEMP
>   100 CONTINUE

This particular example should be optimized as

      IF ( Y.GT.0 ) TEMP=SQRT(Y)
      DO 100 I = 1,10       !THERE ARE NO 'TO' DELIMITERS IN FORTRAN
         X(I)=TEMP          !FOLLOW INDENTATION STANDARDS PLEASE
  100 CONTINUE

A good data flow analysis would have seen this.  However, in general these
types of optimizations are called 'unsafe', for obvious reasons.  I don't
think unsafe optimizations are allowed by the standard since they change
the meaning of the code (in particular, they cause errors when none would
normally have occurred).  IBM has simply stepped out of line here.  IBM is
not alone; most vendors tend to have some violations of the standard
lurking around in their compilers.

The given example has a safe optimization available to it.  It is possible
to come up with cases that don't, so in general this sort of optimization
should be considered unsafe (unless you have a really good data flow
analyzer available to guarantee safety).
------------------------------------------------------------------------------
     The greatest derangement of the mind is to believe in something
     because one wishes it to be so - Louis Pasteur

     James Giles
cliff@unmvax.UUCP (12/21/84)
> > However, the IBM H Level FORTRAN Programmer's Guide does warn that code
> > such as
> >
> >       DO 100 I = 1 TO 10
> >       IF (Y .GT. 0) X(I) = SQRT(Y)
> >   100 CONTINUE
> >
> > (with X appropriately declared as an array of course) can cause an error
> > because it tries to take the square root of a negative number.  The
> > optimizer compiles this as if it were
> >
> >       TEMP = SQRT(Y)
> >       DO 100 I = 1 TO 10
> >       IF (Y .GT. 0) X(I) = TEMP
> >   100 CONTINUE
>
> This particular example should be optimized as
>
>       IF ( Y.GT.0 ) TEMP=SQRT(Y)
>       DO 100 I = 1,10       !THERE ARE NO 'TO' DELIMITERS IN FORTRAN
>          X(I)=TEMP          !FOLLOW INDENTATION STANDARDS PLEASE
>   100 CONTINUE

C Not quite.  You will wind up setting X(I) to zero or the last value of TEMP
C when Y is less than or equal to zero.  How about something that would really
C be correct, like:
C
      IF (Y.LE.0) GO TO 100
      TEMP = SQRT(Y)
      DO 100 I = 1, 10
         X(I) = TEMP
  100 CONTINUE
C
C     The greatest derangement of the mind is to believe in something
C     because one wishes to be Louis Pasteur.
C
C		--Cliff [Matthews]
C		{purdue, cmcl2, ihnp4}!lanl!unmvax!cliff
C		{csu-cs, pur-ee, convex, gatech, ucbvax}!unmvax!cliff
C		4744 Trumbull S.E. - Albuquerque NM 87108 - (505) 265-9143
mangoe@umcp-cs.UUCP (Charley Wingate) (12/22/84)
In article <18397@lanl.ARPA> jlg@lanl.ARPA writes:

>> However, the IBM H Level FORTRAN Programmer's Guide does warn that code
>> such as
>>
>>       DO 100 I = 1 TO 10
>>       IF (Y .GT. 0) X(I) = SQRT(Y)
>>   100 CONTINUE
>>
>> (with X appropriately declared as an array of course) can cause an error
>> because it tries to take the square root of a negative number.  The
>> optimizer compiles this as if it were
>>
>>       TEMP = SQRT(Y)
>>       DO 100 I = 1 TO 10
>>       IF (Y .GT. 0) X(I) = TEMP
>>   100 CONTINUE
>
>This particular example should be optimized as
>
>      IF ( Y.GT.0 ) TEMP=SQRT(Y)
>      DO 100 I = 1,10       !THERE ARE NO 'TO' DELIMITERS IN FORTRAN
>         X(I)=TEMP          !FOLLOW INDENTATION STANDARDS PLEASE
>  100 CONTINUE

Wrong again.  A correct optimization would be

      If (Y.GT.0) Temp=Sqrt(Y)
      Do 100 I=1,10
         If (Y.GT.0) X(I)=Temp
  100 Continue

If Y is negative, X should be unchanged, according to the original program.

My feeling is that an optimizer should produce the same "action" on some
well-defined level as the original unoptimized code.  Well-defined in this
case should be equivalent to well-documented.

Charley Wingate   umcp-cs!mangoe
jlg@lanl.ARPA (12/22/84)
> Wrong again.  A correct optimization would be
>
>       If (Y.GT.0) Temp=Sqrt(Y)
>       Do 100 I=1,10
>          If (Y.GT.0) X(I)=Temp
>   100 Continue

Quite right!  My first version was wrong.  But since 'Y' is a loop
invariant, the correct optimization is:

      IF (Y.GT.0) THEN
         TEMP=SQRT(Y)
         DO 100 I=1,10
            X(I)=TEMP
  100    CONTINUE
      ENDIF

This has the advantage of not going through the loop at all if the
assignments would all have been skipped.  All these examples show the point
I originally made: these optimizations are unsafe unless you have a REALLY
GOOD data flow analyzer.
------------------------------------------------------------------------------
     The greatest derangement of the mind is to believe in something
     because one wishes it to be so - Louis Pasteur

     James Giles
chris@umcp-cs.UUCP (Chris Torek) (12/23/84)
> It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops.

Depends on what you mean.  If the loop is guaranteed to be executed at
least once, then moving loop-invariant expressions out of loops is fine.
If the loop might not get executed, then you can always optimize

	for i := j to k { ... invariant-expr ... }

into

	if j <= k then {
		temp := invariant-expr;
		i := j;
		do { ... temp ... } while ++i <= k;
	}

(please excuse the syntax).  Note that the "do" is probably slightly more
efficient than the "for", and should probably be used to avoid degrading
performance when j is always <= k.

Of course, if you allow such things as interrupts which modify variables
into your language, then the optimizer must be either MUCH smarter (it has
to read and understand all your interrupt handlers [.5 :-)]) or you need
hints like ``volatile''.
-- 
(This line accidently left nonblank.)
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland
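[Editor's note: Torek's pseudocode transcribes directly into C; a sketch, with a hypothetical invariant expression j*k standing in for invariant-expr:]

```c
/* Straightforward form: invariant-expr is re-evaluated every pass. */
static long sum_for(int j, int k)
{
    long s = 0;
    for (int i = j; i <= k; i++)
        s += (long)j * k;          /* invariant-expr */
    return s;
}

/* Guarded do-while form: the invariant is computed once, and the
 * body is never entered when j > k, so behaviour is identical. */
static long sum_do(int j, int k)
{
    long s = 0;
    if (j <= k) {
        long temp = (long)j * k;   /* hoisted invariant-expr */
        int i = j;
        do {
            s += temp;
        } while (++i <= k);
    }
    return s;
}
```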
oacb2@ut-ngp.UUCP (oacb2) (12/23/84)
> It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops.  Many useful
> optimizations cannot be done when more than 99.9% of the language
> definition must be respected.
>
> In turn, this is the reason that optimizations must be optional.
> And if these optimizations were not available, clamouring might start for a
> return to assembly languages.

NO! NO! NO!  An optimizer that changes the meaning of the code is broken.

I agree that there is some discussion needed as to just what code should
mean, particularly in languages like C which are often used as fancy
assemblers, but at a very minimum an optimizer should never affect the
results of a program provided:

   - no variables are changed by asynchronous processes unknown to the
     compiler.

   - no use is made of variables whose values are undefined by the language
     specification (e.g. the DO loop index in FORTRAN after normal
     completion of the loop).

   - timing is not considered a result of the program.

If an optimizer cannot be written that does not change the meanings of
programs meeting these minimal requirements, we are better off without the
optimization.  I'm more than a little tired of the troubles I've had
debugging legal programs because some turkey wrote an optimizer that shaves
a few microseconds off, but screws up in some rare case that I'm just lucky
enough to hit.

By the way, when I first ran into the problem I posted with DO loops in
FORTRAN, I considered it to be a bug.  Now it's been documented (for at
least 10 years), but I still regard it as a bug.  It's simply too
outrageous for a statement like

      IF (Y .GT. 0) X(I) = SQRT(Y)

to ever try to take the square root of a negative for me to accept it.

Unfortunately, too many (read: every one I've ever dealt with) computer
manufacturers think that it's legitimate for their software to do all kinds
of outrageous things if they just bury a sentence about it in the
documentation.  Another goody from IBM, this time from the PL/I
Programmer's Guide:

     The use of bit strings in a multiprocessing environment may,
     occasionally, lead to incorrect results.
-- 
Mike Rubenstein, OACB, UT Medical Branch, Galveston TX 77550
rcd@opus.UUCP (Dick Dunn) (12/25/84)
>...It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops.  Many useful optimizations
> cannot be done when more than 99.9% of the language definition must be
> respected.

Or, rephrasing the above ">", sometimes you get:

	The optimization was a success, but the program died.

I can't see that this is anything other than a conflict resulting from
either a bad (or simply incomplete) language definition or a broken
compiler.  But then I've always had trouble dealing with compromises
between efficiency and correctness.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Nothing left to do but smile, smile, smile.
herbie@watdcsu.UUCP (Herb Chong, Computing Services) (12/25/84)
> Another goody from IBM, this time from the PL/I Programmer's
> Guide:
>
>      The use of bit strings in a multiprocessing environment may,
>      occasionally, lead to incorrect results.

Taken out of context, this statement is indeed very silly.  You have to
read the paragraphs before and after it to understand that the PL/I program
can easily be communicating with another process sharing the same address
space, possibly one written in another language.  No matter what the
compiler does, the silliness of the other program can wipe out the best
code generation.

The other thing is that the 370 architecture does not allow bit addressing.
If you want to examine or modify a bit, you have to fetch at least one byte
and store at least one byte.  The byte you are interested in may easily
contain other bits that are being used by other processes.  The PL/I
language requires that if you are modifying a bit string shared between two
processes (and therefore in the same address space), you must provide your
own synchronization, although it doesn't really come out and say so in one
place.

I might also mention (although this is nitpicking) that if you were
referring to the PL/I Checkout or Optimizing compilers, the manual is the
PL/I Language Reference, where this is explained in detail.

Herb Chong...

I'm user-friendly -- I don't byte, I nybble....
gnu@sun.uucp (John Gilmore) (12/26/84)
> James Giles says:
> My first version was wrong.  But 'Y' is a loop invariant, the
> correct optimization is:
>
>       IF (Y.GT.0) THEN
>          TEMP=SQRT(Y)
>          DO 100 I=1,10
>             X(I)=TEMP
>   100    CONTINUE
>       ENDIF
>
> This has the advantage of not going through the loop at all if the
> assignments would all have been skipped.  All these examples show the point
> I originally made: these optimizations are unsafe unless you have a REALLY
> GOOD data flow analyzer.

You're absolutely right!  Had you had a good data flow analyzer, it would
have detected that your optimization had changed the value of 'I' on exit
from this piece of code.
msb@lsuc.UUCP (Mark Brader) (12/26/84)
Ho hum, all of you are wrong. The CORRECT optimization for:
      DO 100 I = 1 TO 10
      IF (Y .GT. 0) X(I) = SQRT(Y)
  100 CONTINUE
Is, in Ratfor syntax:
      IF (Y > 0) {
           TEMP = SQRT(Y)
           DO I = 1,10
                X(I) = TEMP
      }
      ELSE
           I = 11
(In some implementations the number 11 will be something else, most
likely 10, but leaving I unchanged is clearly wrong!)
Moral: This is not only trickier than it looks, it's trickier than that!
Mark Brader
jeff@gatech.UUCP (Jeff Lee) (12/28/84)
> It is usually necessary for optimizations to break certain obscure points
> that conform to the language definition, e.g. which order STATEMENTS are
> executed in so that code can be moved out of loops.  Many useful optimizations
> cannot be done when more than 99.9% of the language definition must be
> respected.
>
> In turn, this is the reason that optimizations must be optional.

This is a truly incredible statement.  I'm sure we can come up with many
more optimizations that work real well if we only respect 99% of the
language definition, but how useful are those "optimizations"?  It seems to
me that when your optimizer gets finished with your code, it is no longer
compiling FORTRAN, Pascal, C, etc... but some dialect that doesn't do what
you intended for it to do.  I don't want any compiler (and optimizer) that
doesn't "respect" all of the language definition.  That is simply begging
and pleading for trouble.
-- 
Jeff Lee
CSNet:	Jeff @ GATech		ARPA:	Jeff.GATech @ CSNet-Relay
uucp:	...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!jeff
ndiamond@watdaisy.UUCP (Norman Diamond) (12/31/84)
> Ho hum...  The CORRECT [sic] optimization for:
>
>       DO 100 I = 1 TO 10
>       IF (Y .GT. 0) X(I) = SQRT(Y)
>   100 CONTINUE
>
> Is, in Ratfor syntax:
>
>       IF (Y > 0) {
>            TEMP = SQRT(Y)
>            DO I = 1,10
>                 X(I) = TEMP
>       }
>       ELSE
>            I = 11
>
> (In some implementations the number 11 will be something else, most
> likely 10, but leaving I unchanged is clearly wrong!)

The else case can leave anything at all in I and still be correct.  The
value of I is undefined after completion of the DO loop, and when optimized
it remains undefined after the IF statement (whether the then-branch or
ELSE-branch is taken).  The cheapest way to produce an undefined value in I
is to leave it alone, requiring 0 machine instructions.

Otherwise, this suggestion was correct, as were several others.
-- 
Norman Diamond
UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdaisy!ndiamond
CSNET: ndiamond%watdaisy@waterloo.csnet
ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa

"Opinions are those of the keyboard, and do not reflect on me or higher-ups."
bsa@ncoast.UUCP (Brandon Allbery (the tame hacker on the North Coast)) (01/02/85)
It occurs to me that there are two ways to look at optimizers:

  1) That an optimizer optimizes, not the language code, but ASSEMBLER
     code, and hence does not need to follow the language standard;

  2) That an optimizer must follow the language standard exactly,
     and hence should optimize the language code itself.

If you accept (1), as every optimizer I've seen is designed to do, you are
likely to see an optimizer that will occasionally fail.  If (2), there is
little optimization possible, especially in a language like Pascal where
you can't manipulate the machine directly.

All you who complain about optimizers breaking code: look at the current
discussion of the correct optimization of a piece of ForTran code.  No one
has yet proposed an optimization (type 2) that does the same thing as the
original code.  And if it's hard in ForTran, how much harder in Assembler?

Let's face it, guys -- placing any limitations (i.e. standards) on a
programming language is going to make an executing program in that language
slower.  Optimizing is often simply ignoring some of the standards and
removing their effects from the generated code.

--bsa
-- 
Brandon Allbery @ decvax!cwruecmp!ncoast!bsa (..ncoast!tdi1!bsa business)
6504 Chestnut Road, Independence, Ohio 44131 (216) 524-1416
Who said you had to be (a) a poor programmer or (b) a security hazard
to be a hacker?
andrew@orca.UUCP (Andrew Klossner) (01/04/85)
} It occurs to me that there are two ways to look at optimizers:
}
}   1) That an optimizer optimizes, not the language code, but ASSEMBLER
}      code, and hence does not need to follow the language standard;
}
}   2) That an optimizer must follow the language standard exactly,
}      and hence should optimize the language code itself.
}
} If you accept (1), as every optimizer I've seen is designed to do, you
} are likely to see an optimizer that will occasionally fail.  If (2),
} there is little optimization possible, especially in a language like
} Pascal where you can't manipulate the machine directly.

This shows a lack of familiarity with the computer science behind
optimization.

Systems which improve the assembler output from a compiler are commonly
called "peephole" optimizers, because they never look at more than a very
small part of the program (they look at the program through a "peephole").
Unix is the only system I've encountered in which the standard language
optimizer is just an assembly code improver.

The more traditional (read "mainframe") optimization strategy is to convert
the entire program into an intermediate form which preserves the meaning of
the code and yet is amenable to rearrangement into a more efficient program
with equivalent meaning.  Once this is done, the intermediate code is
translated into assembler or machine code.  So-called "global" optimizers
examine the entire program, using "data-flow" algorithms which gather
information by making several passes over the intermediate code; this
information is then used to make decisions about how the program can be
improved without changing its meaning.

Pascal programs are particularly amenable to this sort of optimization.
The standard example is the assignment "a[i]:=a[i]+1".  This cannot be
expressed any more efficiently in Pascal, but in a good intermediate code
the translation can be arranged to reflect the fact that computation of
the address of a[i] need only be done once, and that an "increment word
in memory" instruction can be used.

} Let's face it, guys -- placing any limitations (i.e. standards) on a
} programming language is going to make an executing program in that language
} slower.  Optimizing is often simply ignoring some of the standards and
} removing their effects from the generated code.

Not at all.  An implementor who claims that she had to revise the language
in order to write an optimizer is being inexcusably lazy.  Despite the
recent discussions about well-known optimizers breaking programs, such
behavior is considered incorrect by workers in the field.

A good and readable treatment of optimization is Aho and Ullman's
"Principles of Compiler Design", which has become the standard introductory
compiler text.  The chapters on optimization emphasize that an optimization
which changes the semantics of the program is not valid.

  -- Andrew Klossner   (decvax!tektronix!orca!andrew)      [UUCP]
                       (orca!andrew.tektronix@csnet-relay) [ARPA]
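[Editor's note: Klossner's a[i]:=a[i]+1 example, sketched in C to show what the intermediate form captures -- a hand-written illustration, not actual compiler output:]

```c
/* Source-level view: a[i] is written as if the element were
 * addressed twice, once for the read and once for the write. */
static void incr_naive(int a[], int i)
{
    a[i] = a[i] + 1;
}

/* Intermediate-code view: the address of a[i] is computed once, and
 * a single read-modify-write is performed -- on many machines this
 * becomes one "increment word in memory" instruction. */
static void incr_opt(int a[], int i)
{
    int *p = &a[i];   /* common subexpression: one address computation */
    *p += 1;
}
```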
cem@intelca.UUCP (Chuck McManis) (01/04/85)
Again on the subject of "smart" compilers, specifically FORTRAN-IV on a DEC
2060XE running TOPS-20.  The following code segment is approximate, but the
situation was the same.

	CALL FOO(X,0.,Z)
	...
	X2 = 0.
	WRITE(5,10) X2
   10	FORMAT(1X,F10.2)
	END

	SUBROUTINE FOO(X,Y,Z)
	...
	Y=1.0
	RETURN
	END

The compiler optimized all constant references to 0.0 into a single memory
location containing 0.0, and the subroutine changed that value to 1.0.  All
further references to 0 produced the value 1.  Not too swift.  (I know it
is poor style, and have since stopped using constants in subroutine calls.)

--Chuck
-- 
    - - - D I S C L A I M E R - - -
{ihnp4,fortune}!dual\     All opinions expressed herein are my
{qantel,idi}-> !intelca!cem   own and not those of my employer, my
{ucbvax,hao}!hplabs/      friends, or my avocado plant. :-}
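[Editor's note: the bug can be simulated in C, where `zero_cell` plays the role of the single literal-pool slot the compiler used for every occurrence of 0. -- all names here are hypothetical:]

```c
/* The compiler stored the literal 0. once and passed its address to
 * the subroutine; zero_cell stands in for that shared slot. */
static float zero_cell = 0.0f;

/* SUBROUTINE FOO(X,Y,Z) ... Y=1.0 -- the assignment goes through the
 * dummy argument and corrupts the shared constant. */
static void foo(float *x, float *y, float *z)
{
    (void)x;
    (void)z;
    *y = 1.0f;
}
```

After `foo(&x, &zero_cell, &z)`, every later use of "0." reads the corrupted cell, which is exactly why `X2 = 0.` printed 1.00.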
alexc@dartvax.UUCP (Alex Colvin) (01/06/85)
(intelca.472) mentions what can only be called a horrendous compiler bug.
That one has been known since roughly FORTRAN II -- namely, passing
constants by reference.  The solution adopted by most such languages
(FORTRAN, PL/I) is to always pass copies of the constant.  Please
understand that the aforementioned compiler is NOT optimizing.  It is
compiling incorrect code.

The DTSS PL/I compiler does the correct optimization in this case.  Iff the
interprocedure analysis detects that a (reference) parameter is not
modified, then it passes constant arguments by reference, without forcing a
copy.  This is done ONLY if it is known to be safe.

As to passing constants, I see nothing wrong with it.  I prefer it to
passing variables.  At least with a constant you can see what you're
getting.  True, magic constants (e.g. 501) should be given some sort of
explanatory name (e.g. StreetAddress).  But I see little point in defining
ZERO as a name for 0.  Pascal textbooks are particularly prone to this sort
of foolishness.  One defined NINE as 9.  If anything they should have
defined it as RadixLessOne.  That would cause less surprise when some idiot
changes it to 7.
robert@cheviot.UUCP (Robert Stroud) (01/08/85)
<This line is a figment of your imagination>

There has been a lot of discussion recently about how to optimise the
Fortran program,

      DO 100 I=1,10
      IF (Y .GT. 0) X(I) = SQRT(Y)
  100 CONTINUE

Most of the suggestions have been wrong, and even those that are right
would not work in the presence of certain pathological cases of aliassing
and side-effects.  For example, suppose SQRT was a user function which
modified the COMMON block variable I as a side-effect.  Or suppose SQRT
gave a negative result and Y was EQUIVALENCE'd onto X(3).  Could even a
really good data flow analyser cope with such pathologies, and would it be
reasonable to expect it to?

If this was a SUBROUTINE fragment and X and Y were parameter or COMMON
block elements, then whether the optimisation was valid or not might well
depend on precisely how the subroutine was called; in other words, it would
work sometimes but not necessarily always.

I agree completely with the principle that optimisers should not modify the
semantics of programs, but there is a grey area where things get very
tricky.  I believe that there are a lot of little known rules for
programming in FORTRAN which are part of the language definition and which
attempt to prevent these pathological cases arising.  They only exist in
order to guarantee that certain optimisations will always be possible.  I
am thinking of a paper called "Serious Fortran" (I'm afraid I can't give a
more precise reference), and things like the restrictions on modifications
to loop variables in extended DO loop ranges, and the apparently little
known fact that the DO loop variable is UNDEFINED after executing a loop.
[This may be historical - can any Fortran expert shed more light??]

As an aside, in Pascal you are not allowed to modify a for loop variable or
alter the value of a with expression within the bounds of the statement.
But does anyone know of a Pascal compiler that actually checks for this and
prohibits even such blatant violations as...

      WITH P^ DO
      BEGIN
           P^ := P^.next;
      END;

In practice these rules are not enforced (or indeed cannot be enforced),
and so programs which are strictly speaking illegal will compile without
error but will give unexpected results when optimised.

An obvious solution to the problem is to try and introduce syntactic
restrictions which make aliassing and side-effects either impossible or
easily detectable.  This was the approach taken by the designers of the
language Euclid, but what started as Pascal got extremely complicated, and
I think it is fair to say that this approach is a lot harder than it looks.
Aliases and side-effects will be with us as long as we use languages with
variables and assignment.

But again, even if in theory all these nasty things are lurking beneath the
surface ready to bite us, in practice do we really run into such
pathological cases?  If things were really that bad, we wouldn't be able to
write any programs that worked (:-)!  Does anyone have any good horror
stories about being bitten by an alias or a side-effect?  [Preferably in
languages with proper abstraction - ie BASIC is excluded!!]

Robert Stroud,
Computing Laboratory,
University of Newcastle upon Tyne.
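[Editor's note: Stroud's EQUIVALENCE example -- Y aliased onto X(3) -- has a direct C analogue. A sketch with hypothetical routines, showing why hoisting the load of *y is only legal when y cannot alias x:]

```c
#define N 5

/* Faithful form: *y is re-read on every iteration, so a store into
 * x[] that aliases y feeds back into later iterations. */
static void store_naive(double x[N], const double *y)
{
    for (int i = 0; i < N; i++)
        x[i] = *y + 1.0;
}

/* "Optimized" form: the load is hoisted out of the loop.  This is
 * equivalent only if y does not alias any element of x. */
static void store_hoisted(double x[N], const double *y)
{
    double t = *y;                 /* hoisted load */
    for (int i = 0; i < N; i++)
        x[i] = t + 1.0;
}
```

With `y = &x[2]` the two versions diverge: the faithful loop sees its own store to `x[2]` on later iterations, while the hoisted loop never does.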
jhf@lanl.ARPA (01/10/85)
> > I thought Fortran-77 required constant actual parameters to be
> > "copied in" rather than passed by reference?
>
> FORTRAN parameters are always pass-by-reference.
> -- 
> D Gary Grady
> Duke University Computation Center, Durham, NC  27706

Absolutely not.  Both reference and value-result are acceptable parameter
passing methods for Fortran.  Programs that are sensitive to the
difference, i.e., that create aliases to actual parameters ("dummy
arguments") which they redefine, are not standard conforming.

This restriction, in my opinion, is entirely appropriate.  Efficiency of
parameter passing is of the greatest importance to the overall quality of a
compiler's target code, and architectural features and other circumstances
may make one of these methods much more desirable than the other.  The
compiler writer should be free to choose the appropriate technique, and not
be restricted to one of them for the sake of guaranteeing the effects of
some bad programs.

I might note that Ada imposes a similar restriction on parameters, but,
incredibly, the Pascal standard does not.  It requires call-by-reference
for var parameters!  (At least, it did the last time I looked.  If the
situation has changed, I stand duly corrected and chastised.)  I think this
is a huge mistake.  Not only does it unreasonably restrict implementations,
but it makes impossible the removal of a wart in the language: that
components of packed arrays or records may not be passed as var parameters.
This restriction is there, of course, because there will generally not be a
reasonable way to implement call-by-reference for objects that are not
aligned on an addressable bit boundary.  This directly conflicts with the
notion that packed types differ from their unpacked counterparts only in
their storage requirements and access times.

What the Pascal standard should have done is to restrict the aliasing of
var parameters in a manner similar to Fortran or Ada, and then to remove
the restriction on passing components of packed structures.  (Most
compilers would implement this by copy-in-copy-out on the calling side,
where necessary.)

Joe Fasel
Los Alamos National Laboratory
jhf@lanl.{arpa,uucp}
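[Editor's note: the divergence Fasel describes can be made concrete. When the same actual is passed for both parameters, reference and copy-in-copy-out (value-result) give different answers -- which is exactly why such aliasing is declared non-conforming. A sketch of a procedure whose body is "a := a + 1; b := b * 10":]

```c
/* Call-by-reference: each assignment writes through immediately,
 * so under aliasing the second statement sees the first's effect. */
static void update_byref(int *a, int *b)
{
    *a = *a + 1;
    *b = *b * 10;
}

/* Copy-in-copy-out (value-result): locals are copied in, updated
 * independently, then copied back in parameter order on return. */
static void update_copyinout(int *a_act, int *b_act)
{
    int a = *a_act, b = *b_act;    /* copy in */
    a = a + 1;
    b = b * 10;
    *a_act = a;                    /* copy out, a first... */
    *b_act = b;                    /* ...then b wins the race */
}
```

Starting from 5, the reference version yields 60 (increment, then scale the same cell); the value-result version yields 50 (the copy-out of b overwrites the copy-out of a).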
johnl@ima.UUCP (01/14/85)
>For example, suppose SQRT was a user function which
>modified the COMMON block variable I as a side-effect.  Or suppose, SQRT
>gave a negative result and Y was EQUIVALENCE'd onto X(3).

Ah, but it's not.  The Fortran 77 committee gave a lot of thought to making
sure that useful optimizations would be possible.  One of the things they
did was to define a list of "intrinsic" functions, of which SQRT is one,
and to say that they are well-behaved.  If for some masochistic reason you
want to define your own version of SQRT, you have to declare it in a
special EXTERNAL statement, and the compiler is duly warned to be careful.

John Levine, ima!johnl

PS: So how does this all work on a Perq?
joe@petsd.UUCP (Joe Orost) (01/15/85)
In article <206@cheviot.UUCP> robert@cheviot.UUCP (Robert Stroud) writes:

>There has been a lot of discussion recently about how to optimise
>the Fortran program,
>
>      DO 100 I=1,10
>      IF (Y .GT. 0) X(I) = SQRT(Y)
>  100 CONTINUE
>
>Most of the suggestions have been wrong, and even those that are right
>would not work in the presence of certain pathological cases of aliassing
>and side-effects.  For example, suppose SQRT was a user function which
>modified the COMMON block variable I as a side-effect.  Or suppose, SQRT
>gave a negative result and Y was EQUIVALENCE'd onto X(3).  Could even
>a really good data flow analyser cope with such pathologies and would it
>be reasonable to expect it to be able to cope??

No, SQRT cannot be a user function unless it is declared as "EXTERNAL".
Our compiler uses this info to keep a list of functions that are
"optimizable", that is, they have no side effects and return the same
result for the same argument.

regards,
joe
-- 
Full-Name:  Joseph M. Orost
UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
Phone:      (201) 870-5844
Location:   40 19'49" N / 74 04'37" W
robert@cheviot.UUCP (Robert Stroud) (01/18/85)
In article <206@cheviot.UUCP> robert@cheviot.UUCP (that's me!) wrote:
>For example, suppose SQRT was a user function which
>modified the COMMON block variable I as a side-effect. Or suppose SQRT
>gave a negative result and Y was EQUIVALENCE'd onto X(3). Could even
>a really good data flow analyser cope with such pathologies and would it
>be reasonable to expect it to be able to cope??

Judging by a couple of responses to this, I didn't make myself clear, for
which I apologise.  I was not suggesting writing another version of SQRT
with the same name but an eccentric behaviour; I was asking how an
optimising compiler would cope if the code was identical except that,
instead of calling SQRT, the programmer called an arbitrary function of
his own about which the compiler knew nothing.

My second example, aliasing Y onto X(3), would fail with a completely
standard SQRT function in any case, because Y would no longer be a loop
invariant.

However, getting away from the nitty-gritty of particular pathological
cases, I was really trying to make a more general point.  It would seem
from other discussions that examples like mine are banned by the small
print of the language standard.  The problem is that such deviations are
very hard, if not impossible, for a conventional compiler to detect, and
the compiler is therefore at liberty to do what it likes with
"non-standard" code.

Perhaps because I am a European (:-) I worry about this.  If it is
possible to write a non-standard program by accident which is incorrectly
optimised by a compiler that conforms to the standard, then we have a
potentially dangerous situation on our hands.  Suppose this program is let
loose on some application where lives are at risk?  Even if the
application is not that critical, a lot of time and money could be wasted
tracking down something which is not really a compiler bug.  Whether this
will be a problem or not depends on how easy it is to write a non-standard
program by accident, and whether such things are rare anyway.
It would presumably be possible to write something like "lint", only more
so, which crawled all over a large program looking for deviations from the
standard that an ordinary compiler would have neither the time nor the
inclination to discover.  Any opinions on this?

I don't see why the discussion should be restricted to Fortran.  Probably
the only way of eliminating aliases and such-like altogether would be to
program in a purely functional language without assignment or
side-effects.

Robert Stroud
Computing Laboratory, University of Newcastle upon Tyne

ARPA robert%cheviot%newcastle.mailnet@mit-multics
UUCP ...!ukc!cheviot!robert
rcd@opus.UUCP (Dick Dunn) (01/23/85)
> It would seem from other discussions that examples like mine are banned
> by the small print of the language standard.
>
> The problem is that such deviations are very hard if not impossible to
> detect by a conventional compiler which is therefore at liberty to do
> what it likes with "non-standard" code.

That may not be the sum of the problem.  A lot rests on how much "small
print" exists in the language standard.  One would like the union of
{fine points which ARE understood by the average programmer} with
{fine points detected by the compiler} to cover all of the nasty fine
points.

> Perhaps because I am a European (:-) I worry about this. If it is possible
> to write a non-standard program by accident which is incorrectly optimised
> by a compiler which conforms to the standard, then we have a potentially
> dangerous situation on our hands. Suppose this program is let loose on some
> application where lives are at risk? Even if the application is not that
> critical, a lot of time and money could be wasted tracking down something
> which is not really a compiler bug.

[I want to pick nits and say that if the program is non-standard, it is
not the compiler which is doing "incorrect optimization" but the
programmer who is doing "incorrect programming".]  It seems to be far too
difficult to create a real language, useful for writing significant
programs, which has no dark corners.

> It would presumably be possible to write something like "lint" only more so
> which crawled all over a large program looking for deviations from the
> standard that an ordinary compiler would have neither the time nor the
> inclination to discover.

But if it's not part of the compiler, it is much less likely to be used,
particularly if it's as noisy as lint.  Moreover, if it's more thorough
than lint, it will probably be even noisier.  Finally, if the compiler
doesn't have the time (and well it might not), that speaks poorly of the
chance that people will take the extra time to run it.
--
Dick Dunn   {hao,ucbvax,allegra}!nbires!rcd   (303)444-5710 x3086
   ...Keep your day job 'til your night job pays.