kent@xanth.UUCP (04/11/87)
In article <995@wanginst.EDU> mckeeman@wanginst.EDU (William McKeeman) writes:
>I am a bit frustrated by people talking past each other on this topic. The
>order of evaluation rules in C are a blanket over several subproblems.
>Conclusion: when making a pitch for a better C definition rule, identify
>which of the problems:
>	unrestricted optimization
>	overflow anomalies
>	roundoff anomalies
>	side effect anomalies
>it is designed to solve. If you don't intend for your rule to have
>language-wide implications, state the limited area of application.
>--
>W. M. McKeeman		mckeeman@WangInst
>Wang Institute		decvax!wanginst!mckeeman
>Tyngsboro MA 01879

I agree that this is an excellent idea, and, being nothing if not opinionated, and having freely confessed my disenchantment with C in its present state in this forum previously, I will start this discussion off with four new subjects corresponding to (Mr., Mrs., Miss, Ms., Dr., (?)) McKeeman's plan, and then sit back and read the responses.

A few times in 26 years of coding, I have found myself working at the limits of a data type, and having to rearrange code to assure that overflow or underflow did not take place during a sound, but not robust, calculation. Where control of execution order was needed, nothing else would do.

A previous poster has noted that a compiler writer who is on an optimization spree (not meant to be derogatory; I have a lot of respect for this skill, and spent a long bar evening listening to some justifiable bragging by one of the practitioners of this bit of arcana) will not respect anything you can do to force a certain code order, since s/he feels free to rearrange statements, elide temporaries, rearrange loops, rearrange blocks, and reduce strength (change your multiplies to adds, for example), and recent publications suggest that separate compilation may not be good enough, either, to prevent unexpected interactions of two pieces of code which were certainly separate in the programmer's mind.
Mathematical programmers (and, after its 30 years in continuous use for the purpose, a great many of these are FORTRAN users) who turn to C for whatever reason are occasionally astonished by the disrespect of C compilers for the parentheses the programmer just so carefully included to control execution order. Perhaps the proper response is to tell the mathematical programmer to look elsewhere for a language to use, but the problem exists for the less demanding user, too.

C is famous (notorious) for its variety of integer variables; signed and unsigned, and byte, word, and double word (and other) sizes make the creation of portable code a task for the cautious and insightful programmer. The utility of these many integer sizes is manifest: it allows system tables (structures) to be created with a minimum size by using just the right mix of integer variable sizes. But now, if arithmetic must be done with these variables, the problem of overflow arises. One solution is surely to pick the next larger integer size when overflow threatens. Usually, however, if the integer size is chosen to match the range of the data to be stored, this is an inappropriate and wasteful choice, compared to arranging the arithmetic in the proper order to avoid overflow. To do this, the compiler must be able to understand when the programmer wants the code executed exactly as written, and leave it alone.

The ANSI C dpANS, I understand, has chosen to overload the unary plus with this task. I have two problems with this. First, it is {counterintuitive, ugly, unmathematical} (pick one). I have noted earlier that, to the user coming from a secondary school environment, the unary plus is a no-op, just as s/he learned in algebra class. It really trashes out the appearance of an expression containing it (although C is already in the top five in the running for world's ugliest major language) (need I mention that is one of the opinions promised above?).
;-) Stray operators floating around don't look much like the mathematical equations to which the new user is presumably accustomed.

Second, this may provide insufficient or inappropriate granularity of control against the optimizing compiler. I may need to control the order of execution of a series of statements which the compiler may decide to rearrange. The unary plus doesn't cut it here, but the compiler author has promised to look everywhere for optimizations. How do I stop the code from exercising optimization where it is dangerous to the success of the program, without turning it off where it is harmless or beneficial? I think the standard should include a better, prettier method of control of optimization scope and degree.

Comments?

Kent.
--
The Contradictor -- Member HUP (Happily Unemployed Programmers)
Back at ODU to learn how to program better (after 25 years!)  Also a happy Amigan!
UUCP  : kent@xanth.UUCP or ...{sun,cbosgd,harvard}!xanth!kent
CSNET : kent@odu.csnet    ARPA : kent@xanth.cs.odu.edu
Voice : (804) 587-7760    USnail: P.O. Box 1559, Norfolk, Va 23501-1559
Copyright 1987 Kent Paul Dolan.  All Rights Reserved.  Author grants
retransmission rights recursively only.
"How about if we keep the human race around long enough to see a bit more of the universe?"
aeusemrs@csun.UUCP (04/12/87)
In article <820@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
[edited for space]
|A previous poster has noted that a compiler writer who is on an optimization
|spree (not meant to be derogatory; I have a lot of respect for this skill,
|and spent a long bar evening listening to some justifiable bragging by one of
|the practitioners of this bit of arcana) will not respect anything you can
|do to force a certain code order, since s/he feels free to rearrange
|statements, elide temporaries, rearrange loops, rearrange blocks, reduce
|strength (change your multiplies to adds, for example), and recent
|publications suggest that separate compilation may not be good enough,
|either, to prevent unexpected interactions of two pieces of code which were
|certainly separate in the programmer's mind.
|
|Second, this may provide insufficient or inappropriate granularity of
|control against the optimizing compiler. I may need to control the order
|of execution of a series of statements which the compiler may decide to
|rearrange. The unary plus doesn't cut it here, but the compiler author
|has promised to look everywhere for optimizations. How do I stop the code
|from exercising optimization where it is dangerous to the success of the
|program, without turning it off where it is harmless or beneficial? I
|think the standard should include a better, prettier method of control of
|optimization scope and degree.
|
|Comments?
[just a couple questions for now]
|
|Copyright 1987 Kent Paul Dolan. All Rights Reserved. Author grants
|retransmission rights recursively only.
[I guess I must include his copyright message...]

Does optimization mean the changing of the internal workings of something, such that the inputs and the results or outputs of the ``black box'' are the SAME, and, of course, the process is less consuming in some way, typically time? Is this definition used only in theory, or is it used in practice too?
Can you please give me an example of the case you alluded to in your last paragraph above? I have never felt the need to have anything but the full unrepressed optimizations, global or local, of the compiler. With that I have one comment: I can see the use of storing values into areas of memory (or was that i/o, :-)) declared as volatile in C.

What type of ``unexpected interactions of two pieces of code which were certainly separate in the programmer's mind'' were you referring to in your first paragraph?
--
Mike Stump, Cal State Univ, Northridge Comp Sci Department
uucp: {sdcrdcf, ihnp4, hplabs, ttidca, psivax, csustan}!csun!aeusemrs
kent@xanth.UUCP (Kent Paul Dolan) (04/13/87)
In article <820@xanth.UUCP> kent@xanth.UUCP I wrote:
[edited, paraphrased, etc. to save space; we've seen it twice already]

	...a compiler writer...feels free to [...muck about like mad several
	ways...], and recent publications suggest that separate compilation
	may not be good enough, either, to prevent unexpected interactions
	of two pieces of code which were certainly separate in the
	programmer's mind.

	Second, this may provide insufficient or inappropriate granularity
	of control against the optimizing compiler. [...] of execution of a
	series of statements which the compiler may decide to rearrange.
	The unary plus doesn't cut it here, but the compiler author has
	promised to look everywhere for optimizations. How do I stop the
	[compiler] from exercising optimization where it is dangerous to
	the success of the program, without turning it off where it is
	harmless or beneficial? I think the standard should include a
	better, prettier method of control of optimization scope and
	degree. Comments?

In article <568@csun.UUCP> aeusemrs@csun.UUCP (Mike Stump) writes:
>{just a couple questions for now}
>
>|Copyright 1987 Kent Paul Dolan. All Rights Reserved. Author grants
>|retransmission rights recursively only.
>[I guess I must include his copyright message...]

I think under the fair use doctrine you could have omitted the copyright, but I am no lawyer. This was yet another in a small groundswell of attempts to persuade the Stargate project not to do damage to the nature of USENet in the (laudable) process of attempting to save sites money on phone bills. At present, USENet postings may be freely passed from site to site; we would lose this ability for news passed via Stargate as presently envisioned.
The Stargate project's present plan proposes to copyright all the USENet material it broadcasts as a collective work, and to insist on recompense from Stargate subscribers who pass the material on over phone lines to additional sites, beyond the normal Stargate subscription fees. This problem is being adequately flamed in the Stargate newsgroup; please read/respond there, not here, on this topic.

>Does optimization mean the changing of the internal workings of something,
>such that the inputs and the results or outputs of the ``black box'' are the
>SAME, and of course, the process is less consuming in some way, typically
>time? Is this definition used only in theory, or is it used in practice too?

Optimization in non-pathological cases should be of the black-box type; if a=2, b=3, and c=4, then d = a-b+c has the same result as d = a+c-b. The problems being addressed in this (very, very long) discussion have to do with behavior in the pathological cases. As previous postings have shown, these are probably the majority, certainly in the case of floating point numbers.

For example, if the implementation of floating point numbers in a particular piece of hardware is a signed 4-bit exponent and a signed 12-bit mantissa, then the most positive representable number (in binary) is 11111111.1111. This being the case, then given (all binary):

	a = 10000000.0000;
	b = 00000001.0000;
	c = 10000000.0000;

the evaluation order (a - b) + c proceeds with no problem, while the evaluation order (a + c) - b produces an overflow. If there were a hardware implementation which rewarded doing the add before the subtract, then an optimizing compiler might do the evaluation the second way, even though the programmer, knowing the possible data values for a, b, and c, had written the code using the (a - b) + c format, and expected never to see an overflow from this piece of code.
The point, of course, is that the compiler does not know what values will be assigned to a, b, and c at run time, and so does not know if that optimization is safe or not. The reason that discussion has been so prolonged is that the programmer might know, and (some of us) want a nice way to tell the compiler not to muck about with that expression. The compiler writer, lacking guidance to the contrary, wants to muck about, because then the code will run faster, benchmark better, and the compiler will compete better in the marketplace (all else being equal).

>Can you please give me an example of the case you alluded to in your last
>paragraph above? I have never felt the need to have anything but the full
>unrepressed optimizations, global or local, of the compiler. With that I
>have one comment: I can see the use of storing values into areas of memory
>(or was that i/o, :-)) declared as volatile in C.

The volatile declaration (I have no access to the dpANS, thank heaven, or I'd read that, too) certainly takes care of part of the problem, but it gives compiler writers license to break TONS of existing code, which depended on K&R's statement that splitting expressions into separate statements was enough to guarantee the evaluation order. If I wrote (all "float" variables):

	temp = a - b;
	d = temp + c;

in a piece of existing code to avoid the above overflow, the compiler writer is now free to elide temp from the code, then go back and rearrange to achieve:

	d = (a + c) - b;

exactly what I was trying to avoid, unless I declare temp to be volatile. Would _you_ like to go back through your site's existing C code and find all instances where that is why temp was put in the code, and add the appropriate volatile declaration?
;-) How about if a little maintenance activity has produced:

	temp = a - b;
	/* 2000 lines of code not involving temp, a, b, c, or d */
	d = temp + c;

which the ambitious compiler writer is STILL free to optimize into:

	/* 2000 lines of code not involving temp, a, b, c, or d */
	d = (a + c) - b;

Aaaaarargh! ;-)

>What type of ``unexpected interactions of two pieces of code which were
>certainly separate in the programmer's mind'' were you referring to in your
>first paragraph?

	main(){...
		/* everybody floats */
		extern float foo();
		temp = a - b;
		d = foo(temp,c); /* only call of foo in any code
				    in the known universe ;-) */
		...
	}

	/* 2,000,000 lines of kernel code and several hundred separate
	   source files later, we finally find foo: ;-) */

	foo(temp,c)
	...{
		temp2 = temp + c;
		return temp2;
	}

Which the ambitious PhD student compiler writer doing a thesis on "AI Applications in Optimizing Compilers and Linkers in Spite of all Reasonable Attempts to Prevent it" is perfectly free to change, by putting the call to foo inline, and then converting to:

	d = (a + c) - b;

(aw, you guessed ;-).

>--
>Mike Stump, Cal State Univ, Northridge Comp Sci Department
>uucp: {sdcrdcf, ihnp4, hplabs, ttidca, psivax, csustan}!csun!aeusemrs

It's almost daylight; did any of that make more (any) sense?

Kent.
--
The Contradictor -- Member HUP (Happily Unemployed Programmers)
Back at ODU to learn how to program better (after 25 years!)  Also a happy Amigan!
UUCP  : kent@xanth.UUCP or ...{sun,cbosgd,harvard}!xanth!kent
CSNET : kent@odu.csnet    ARPA : kent@xanth.cs.odu.edu
Voice : (804) 587-7760    USnail: P.O. Box 1559, Norfolk, Va 23501-1559
Copyright 1987 Kent Paul Dolan.  All Rights Reserved.  Author grants
retransmission rights recursively only.
"How about if we keep the human race around long enough to see a bit more of the universe?"
mckeeman@wanginst.UUCP (04/13/87)
Let me focus on one particular point of the draft standard that may be read differently by C programmers and C compiler writers. First the raw data (from the July 9 draft from x3j11):

    2.1.2.3 Program execution
    ...
    Evaluation of an expression (specifically, one that involves
    assignment, incrementing, decrementing, or calling a function that
    does any of those operations) may produce _side_effects_, which are
    changes in the state of the execution environment incidental to the
    evaluation. At certain specified points in the execution sequence,
    called _sequence_points_, all side effects of previous evaluations
    shall be complete and no side effects of subsequent evaluations shall
    have taken place.

    3.6 Statements
    ...
    Except as indicated, statements are executed in sequence.
    ...
    The end of a _full_expression_ is a sequence point.

Analysis:

Point 1: the final appearance of "side" in the paragraph above should be stricken. In fact, it is necessary that _no_ effects of subsequent evaluations shall have taken place. Example:

	x=y; z=x;

The side effect of storing into x must precede the (non-side-effect) fetch of x in the second statement. I suppose that is what the authors intended.

Point 2: compilers may, in fact, discard explicit assignments.

	x=y; z=x;

In the case that the above is the only use of x, and absent directives such as volatile and implicit casts, some compiler writers feel free to not allocate storage for x and translate the above as:

	z=y;

The reasoning is that the _observable_ effect of the more efficient version is the same as the one the programmer wrote. This is harmless enough unless the programmer is poking around with a debugger or looking at a postmortem. If the x3j11 document is supposed to forbid this optimization, then it had better say something like "all assignments and {inc|dec}rements must result in updating the indicated memory location."

Point 3:
Point 2 can get you (or me) into trouble. Many compiler writers would treat the following assignments likewise:

	x=y+1; z=x-1;

turning them into

	z=y;

which is fine except where y==MAXINT. This is an example of optimizers that work except for expressions that overflow. I think x3j11 is OK on this point (the compiler writer is surely violating it), but x3j11 could explicitly forbid optimizations that fail to preserve semantics for expressions that {over|under}flow. Deep in the heart and professional aspirations of some compiler writers is an urge to ignore the non-arithmetic properties of computer arithmetic, and a suspicion that customers would rather have really fast code than code that preserves the semantics of overflow for those very rare cases where it matters.

End of Points.

The innocent examples above do not expose the subtlety of the problem for complex expressions, but resolving the meaning of x3j11 for the above should constrain/free the compiler writer as needed.

QUESTION: Is it worth tinkering with the x3j11 verbiage?
QUESTION: Should all explicit assignments be forced to memory?
QUESTION: Should compiled code overflow exactly the same way under all implementations?
--
W. M. McKeeman		mckeeman@WangInst
Wang Institute		decvax!wanginst!mckeeman
Tyngsboro MA 01879
dc@sdd.UUCP (04/14/87)
In article <820@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
>
>A few times in 26 years of coding, I have found myself working at the limits
>of a data type, and having to rearrange code to assure that overflow or
>underflow did not take place during a sound, but not robust, calculation.
>Where control of execution order was needed, nothing else would do.
>
>A previous poster has noted that a compiler writer who is on an optimization
>spree (not meant to be derogatory; I have a lot of respect for this skill,
>and spent a long bar evening listening to some justifiable bragging by one of
>the practitioners of this bit of arcana) will not respect anything you can
>do to force a certain code order, since s/he feels free to rearrange
>statements, elide temporaries, rearrange loops, rearrange blocks, reduce
>strength (change your multiplies to adds, for example), and recent
>publications suggest that separate compilation may not be good enough,
>either, to prevent unexpected interactions of two pieces of code which were
>certainly separate in the programmer's mind.
>
>The ANSI C dpANS, I understand, has chosen to overload the unary plus with
>this task. I have two problems with this. First, it is {counterintuitive,
>ugly, unmathematical} (pick one). I have noted earlier that, to the user
>coming from a secondary school environment, the unary plus is a no-op, just
>as s/he learned in algebra class. It really trashes out the appearance of
>an expression containing it (although C is already in the top five in the
>running for world's ugliest major language) (need I mention that is one of
>the opinions promised above?). ;-) Stray operators floating around don't
>look much like the mathematical equations to which the new user is
>presumably accustomed.
>
>Second, this may provide insufficient or inappropriate granularity of
>control against the optimizing compiler.
>I may need to control the order
>of execution of a series of statements which the compiler may decide to
>rearrange. The unary plus doesn't cut it here, but the compiler author
>has promised to look everywhere for optimizations. How do I stop the code
>from exercising optimization where it is dangerous to the success of the
>program, without turning it off where it is harmless or beneficial? I
>think the standard should include a better, prettier method of control of
>optimization scope and degree.
>
>Comments?

Why not utilize a #pragma NO_OPTIMIZE and a counter pragma, #pragma OPTIMIZE, around code that is to be executed as is? Perhaps we can even create various classes of pragmas for different optimizations: #pragma NO_OPTIMIZE LOOPS, #pragma NO_OPTIMIZE_STRENGTH, etc.

Daniel Corbett
V.P. Engineering
SDD Corp.
m5d@bobkat.UUCP (Mike McNally ) (04/16/87)
In article <827@xanth.UUCP> kent@xanth.UUCP (Kent Paul Dolan) writes:
>. . . If I wrote (all "float" variables):
>	temp = a - b;
>	d = temp + c;
>in a piece of existing code to avoid the above overflow, the
>compiler writer is now free to elide temp from the code, then go
>back and rearrange to achieve:
>	d = (a + c) - b;

Not a C compiler writer. A C compiler (as I know it; I suppose my complete ignorance of ANSI C may show here) is not supposed to do any non-expression-local optimizations, not even basic block type stuff. Thus, while a Pascal compiler may do such work, or for that matter an optimizing FORTRAN compiler, no C compiler will.
--
Mike McNally, mercifully employed at Digital Lynx ---
Where Plano Road the Mighty Flood of Forest Lane doth meet,
And Garland fair, whose perfumed air flows soft about my feet...
uucp: {texsun,killer,infotel}!pollux!bobkat!m5d   (214) 238-7474
guy%gorodish@Sun.COM (Guy Harris) (04/17/87)
>Why not utilize a #pragma NO_OPTIMIZE & a counter pragma #pragma
>OPTIMIZE around code that is to be executed as is? Perhaps we can even
>create various classes of pragmas for different optimizations: #pragma
>NO_OPTIMIZE LOOPS, #pragma NO_OPTIMIZE_STRENGTH, etc.

"We" can't create any sort of pragma classes. #pragmas are compiler-dependent. There's no guarantee that a particular compiler knows about the categories of optimization you know about, with the possible exception of "#pragma OPTIMIZE" and "#pragma NO_OPTIMIZE". The standard should do as little as possible to constrain the forms of optimizing technology used in compilers.
jm36#@andrew.cmu.edu.UUCP (04/18/87)
>From: kent@xanth.UUCP (Kent Paul Dolan)
>Subject: Re: C and overflow anomalies
>
>The volatile declaration (I have no access to the dpANS, thank
>heaven, or I'd read that, too) certainly takes care of part of the
>problem, but it gives compiler writers license to break TONS of
>existing code, which depended on K&R's statement that splitting
>expressions into separate statements was enough to guarantee
>the evaluation order. If I wrote (all "float" variables):
>	temp = a - b;
>	d = temp + c;
>in a piece of existing code to avoid the above overflow, the
>compiler writer is now free to elide temp from the code, then go
>back and rearrange to achieve:
>	d = (a + c) - b;
>exactly what I was trying to avoid, unless I declare temp to be
>volatile. Would _you_ like to go back through your site's existing C
>code and find all instances where that is why temp was put in the code,
>and add the appropriate volatile declaration? ;-)

The ambitious compiler writer is NOT free to make this optimization. There are such things as "sequence points" which occur between statements. The best that this could be optimized to is:

	d = +(a + c) - b;

With the additional restriction that "(a + c)" must be evaluated before "b" if there is any possibility of side-effects between "b" and either of "a" or "c".

_.John