[comp.lang.ada] "Optimizations" that change the meaning of a program

NCOHEN@IBM.COM (Norman COHEN) (01/16/88)

Noam Tene appears to be claiming that an optimizing compiler is free to
change the meaning of a program in order to achieve faster execution.
This is a dangerous attitude.  If a compiler has unlimited freedom to
change the meaning of a program because there is a "close" meaning with a
more efficient implementation, there is no point in defining a language
standard.  There is no point in teaching careful programming, because
the programmer no longer has control over the semantics of the compiled
program.

(-: I've just designed a superoptimizing compiler that "optimizes" any
program into a single branch instruction to the operating system's
"terminate-program" address.  Admittedly, this may radically change
the meaning of the program, but the improvement in execution time is also
radical! :-)

The designers of Ada agree with Tene that "the optimizer must be allowed
some room to work."  Therefore, they provide such room in a number of
specifically defined ways:

   1. Certain actions are defined to take place "in some order that is
      not defined by the language."  In the words of Ada RM 1.6(9),
      "this means that an implementation is allowed to execute these
      parts in any given order.... Furthermore, the construct is
      incorrect if execution of these parts in a different order would
      have a different effect."  This form of programming error is called
      an incorrect order dependency.  Tene's first two examples contain
      incorrect order dependencies, so a compiler is perfectly justified
      in evaluating operands in either order and a programmer has no
      grounds for complaint if he was incorrectly depending on a
      different order.
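
A minimal (hypothetical) Ada sketch of such an incorrect order dependency, of the kind in Tene's first two examples:

```ada
procedure ORDER_DEPENDENCY is
   COUNTER : INTEGER := 0;
   SUM     : INTEGER;

   function NEXT return INTEGER is
   begin
      COUNTER := COUNTER + 1;
      return COUNTER;
   end NEXT;

begin
   -- The operands of "+" are evaluated in some order not defined by
   -- the language.  Evaluating the left operand first gives
   -- 1*10 + 2 = 12; evaluating the right operand first gives
   -- 2*10 + 1 = 21.  Since the order affects the result, the
   -- construct is incorrect, and neither answer is the compiler's
   -- fault.
   SUM := NEXT * 10 + NEXT;
end ORDER_DEPENDENCY;
```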

   2. A compiler is permitted to optimize the program under the
      assumption that the program obeys each of the twelve specific rules
      in the RM whose violation makes execution erroneous.  The
      optimization described in Tene's third example is permitted
      (assuming the absence of a "synchronization point"--see RM 9.11(2)
      and 9.11(9)--within the loop) because it only changes the meanings
      of programs that violate the restrictions given in RM 9.11(3..6),
      and whose execution is therefore erroneous.  Since, in the words of
      RM 1.6(7), "The effect of erroneous execution is unpredictable,"
      the optimization is consistent with Ada rules.
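
A sketch of the kind of program this permission affects (names hypothetical).  The busy-wait loop below contains no synchronization point, so a compiler may load the shared variable into a register once:

```ada
procedure BUSY_WAIT is
   FLAG : BOOLEAN := FALSE;

   task SETTER;
   task body SETTER is
   begin
      FLAG := TRUE;     -- updated from another task
   end SETTER;

begin
   -- No entry call, accept statement, or other synchronization point
   -- occurs within this loop, so under RM 9.11 the program may not
   -- assume that SETTER's update to FLAG becomes visible here.  A
   -- compiler may therefore test a registered copy of FLAG, turning
   -- the loop into an infinite loop.  A program that depends on
   -- seeing the update violates RM 9.11(3..6) and is erroneous
   -- (unless FLAG is marked with pragma SHARED, which reinstates the
   -- canonical semantics).
   while not FLAG loop
      null;
   end loop;
end BUSY_WAIT;
```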

   3. If an optimization would change the meaning of the program only
      in cases where a predefined operation would have raised an
      exception in the unoptimized program, the optimization is permitted
      by RM 11.6, with minor restrictions.  (I find this distasteful, and
      it was a serious obstacle to a formal description of Ada semantics,
      but it is a concession to those who share Tene's philosophy.)

   4. RM 11.6(5) permits arithmetic expressions to be rearranged
      (using properties like the associative law) even if this will
      "introduce a further predefined exception."  (It has been argued
      that "further" means "in addition to some other exception that
      would have been raised anyway," but it has also been argued--by the
      same person!--that this paragraph permits raising an exception even
      if the unoptimized program would not have done so.)
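
For concreteness, a small (hypothetical) example of a rearrangement that introduces such an exception:

```ada
procedure REARRANGEMENT is
   A : INTEGER := INTEGER'LAST;
   B : INTEGER := 1;
   X : INTEGER;
begin
   -- In the canonical left-to-right order this computes
   -- (A - B) + B = INTEGER'LAST, raising no exception.  RM 11.6(5)
   -- nevertheless permits regrouping it as (A + B) - B, in which
   -- A + B overflows and raises NUMERIC_ERROR.
   X := A - B + B;
end REARRANGEMENT;
```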

However, Tene considers it appropriate for an optimizing compiler to take
even further liberties than those generously allowed by the Ada RM.  He
states, for example:

   "Your program includes the basic ingredients for making the program
   erroneous with a work-around (using the assignment to a constant) that
   would resolve the problem ONLY IF no optimization is done."

   "If you want optimization ... you must pay by being more careful with
   erroneous (or nearly erroneous) programs ...."

   "I think that expecting the results of an optimized compilation to be
   the SAME as those of the nonoptimized one for erroneous programs (or
   bad programs on the edge) is ridiculous."

I agree that the compiler has carte blanche with erroneous and
incorrectly order-dependent programs, but if a compiler treats "nearly
erroneous programs" or "bad programs on the edge" as if they WERE
erroneous, that compiler violates the Ada standard.  The Ada RM leaves no
room for doubt about this.  The permissibility of the optimizations
described in points 3 and 4 above is presented as an exception to the
following general rule in RM 11.6(2):

   "In general, when the language rules specify an order for certain
   actions (the _canonical_order_), an implementation may use an
   alternative order only if it can guarantee that the effect of the
   program is not changed by the reordering."

The writer of the program that deallocates a variable used earlier as
the initial-value expression in a constant declaration has every right to
expect that his program will execute normally.  Even if his program has a
vague resemblance to one that is erroneous, it is not itself erroneous.
(In fact, I would not even characterize it as a "bad program on the
edge."  I am a stickler for good Ada style, but this usage seems quite
appropriate to me.)
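
A sketch of the usage in question, with hypothetical names; the point is that S holds a copy of the value, so the subsequent deallocation is harmless under canonical Ada semantics:

```ada
with TEXT_IO, UNCHECKED_DEALLOCATION;
procedure COPY_THEN_FREE is
   subtype NAME is STRING (1 .. 5);
   type NAME_PTR is access NAME;
   procedure FREE is new UNCHECKED_DEALLOCATION (NAME, NAME_PTR);
   P : NAME_PTR := new NAME'("hello");
begin
   declare
      S : constant NAME := P.all;   -- S receives the VALUE of P.all
   begin
      FREE (P);                     -- P's storage may be reclaimed
      TEXT_IO.PUT_LINE (S);         -- S must still denote "hello"
   end;
end COPY_THEN_FREE;
```

A compiler that implements S by sharing P's storage (to avoid the copy) changes the meaning of this correct program, and that is exactly what RM 11.6(2) forbids.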

Tene suggests a NOOPTIMIZE pragma if one wants to guarantee that a
program will execute with the meaning stipulated by the Ada RM, but he's
got it backwards.  I have no problem with a LOCAL implementation-defined
pragma, e.g.,

   pragma SHARE_STRINGS;

that PERMITS certain normally unsafe optimizations to be performed.  The
user of such a pragma has explicitly requested the optimization and
knows that he must obey certain programming conventions (e.g., avoiding
unchecked deallocation of strings) if normal Ada semantics are to be
preserved.  In effect, such a pragma is a promise by the programmer to
the compiler to obey certain programming restrictions that will make
the optimization safe; execution of the program is erroneous if it
violates this promise.  This is the principle behind the SUPPRESS pragma,
for example.  Given that most programs spend a majority of their time
executing 5% of the program text, it makes sense to explicitly enable
potentially dangerous optimizations in these small regions of the text,
where their effects can be scrutinized, and to leave these optimizations
disabled by default in the vast majority of the program text, where
their impact on execution time would be negligible anyway.
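
The SUPPRESS pragma illustrates the principle.  In this sketch (the VECTOR type is my own), the pragma is a local promise that no index check could ever fail; if the promise is broken, execution is erroneous, but the region affected is small and easy to scrutinize:

```ada
type VECTOR is array (INTEGER range <>) of FLOAT;

procedure DOUBLE (A : in out VECTOR) is
   pragma SUPPRESS (INDEX_CHECK);   -- local promise: every index valid
begin
   for I in A'RANGE loop
      A(I) := 2.0 * A(I);           -- no index check generated here
   end loop;
end DOUBLE;
```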

If we step back from Ada and consider programming languages in general, I
find even the amount of freedom that the Ada RM grants to optimizers to
be excessive.  I am a "strict constructionist":  Any optimization that
preserves the semantics of the programming language should be allowed.
(Indeed, such an optimization should be called simply "good code
generation.")  Any "optimization" that changes the semantics of the
programming language, however slightly, should be forbidden.  (Such an
optimization is really the generation of a different program, presumably
more efficient, in the hope that its behavior will be close enough to
what the programmer really called for that nobody will care.)  To make
this strict approach practical, we must have a programming language in
which many optimizations are safe in this strict sense.  This requires,
for example, expressions that are completely devoid of side effects and
whose values are determined entirely by the values of the variables
mentioned in the expressions.  Such expressions, in turn, require
functions without global variables (or pointers, which are really indices
into a global "collection" variable).  I am convinced that mathematical
rigor in the design of programming languages, rather than inhibiting
useful optimizations, can increase the number of optimizations that are
truly safe and make it easier to determine that they are safe.
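
To sketch what I mean (Ada itself does not guarantee that functions are free of side effects, so take this as an illustration of a language that did):

```ada
function F (X : INTEGER) return INTEGER is
begin
   return X * X;   -- depends only on X: no globals, no side effects
end F;

-- ...

Y := F(X) + F(X);
-- Because F is a pure function of its argument, a compiler may
-- evaluate F(X) once and reuse the result (in effect, Y := 2*F(X))
-- without changing the program's meaning.  If F read or updated a
-- global variable, the same transformation would generate a
-- different program -- precisely the situation a rigorous language
-- design can rule out.
```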