rbbb@rice.EDU (04/18/87)
Comments on several of your messages:

Abstract machine--I believe that the "correct" use of an abstract machine in a language specification is to give (it is hoped) more meaning to a "specification by interpretation/translation". This can run the gamut from the code-environment-continuation-store model of denotational semantics (where a statement is just a functional transformation on the environment/continuation/store) to compilation to a PDP-11. I believe that a middle ground is more practical right now (and I would rather not go into the details; "I know one when I see one" :-).

How is one used?

(1) It is asserted that sensible people understand the meaning of behavior in the abstract machine (thus the need for a simple one).

(2) The language specification describes either a translation to abstract machine code, or the transformations applied to the abstract machine by each source language element. This specification may be VERY picky about evaluation order, etc., because here is where the MEANING of a program is defined.

(3) The observable behavior of the abstract machine is defined; this is probably a subset of the total behavior of the abstract machine, so that different compilers can generate different code for different machines (the actual code generated IS NOT part of the C specification).

(4) An implementation of a language is correct if the implementation has the same observed behavior as the abstract machine.

Notice that different notions of "observation" are required for debugging code and for correct interaction with device drivers. In C, for instance, it might be claimed that the volatile variables are observed after every statement in the program.
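Steps (1) through (4) can be sketched with a toy example. This is only an illustration under invented assumptions, not anything taken from a C specification: the stack machine, its five opcodes, and the names `run`, `same_observed`, `reference`, `folded`, and `wrong` are all hypothetical. The "observable behavior" is taken to be just the sequence of values passed to `EMIT`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical abstract machine: a tiny stack machine (all names invented). */
enum op { PUSH, ADD, MUL, EMIT, HALT };

struct insn { enum op op; int arg; };

#define STACK_MAX 32
#define TRACE_MAX 16

/* The observable behavior: only the sequence of EMITted values. */
struct trace { int vals[TRACE_MAX]; int len; };

/* Execute prog until HALT, recording observable events in t. */
static void run(const struct insn *prog, struct trace *t)
{
    int stack[STACK_MAX];
    int sp = 0;

    t->len = 0;
    for (;; prog++) {
        switch (prog->op) {
        case PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = prog->arg;
            break;
        case ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case EMIT:
            assert(sp >= 1 && t->len < TRACE_MAX);
            t->vals[t->len++] = stack[--sp];
            break;
        case HALT:
            return;
        }
    }
}

/* Step (4): two executions agree if their observed traces are identical. */
static int same_observed(const struct trace *a, const struct trace *b)
{
    return a->len == b->len &&
           memcmp(a->vals, b->vals, (size_t)a->len * sizeof a->vals[0]) == 0;
}

/* Step (2): the specified translation of "emit (2 + 3) * 4". */
static const struct insn reference[] = {
    {PUSH, 2}, {PUSH, 3}, {ADD, 0}, {PUSH, 4}, {MUL, 0}, {EMIT, 0}, {HALT, 0}
};

/* A compiler that folded the constants: different code, same observation. */
static const struct insn folded[] = {
    {PUSH, 20}, {EMIT, 0}, {HALT, 0}
};

/* An incorrect implementation that computes 2 + 3 * 4 instead. */
static const struct insn wrong[] = {
    {PUSH, 2}, {PUSH, 3}, {PUSH, 4}, {MUL, 0}, {ADD, 0}, {EMIT, 0}, {HALT, 0}
};
```

Running all three programs through `run` and comparing the traces with `same_observed` accepts `folded` and rejects `wrong`, even though `folded` shares almost no instructions with the reference translation. In this toy only `EMIT` is observed; for C one would have to decide what plays that role, for example the claim above that volatile variables are observed after every statement.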
Observing volatile variables after every statement is not, however, strong enough to rule out the following optimization:

    (original)
        while ((*csr & mask) == 0)
            ;                   /* busy waiting */

    (optimized)
        if ((*csr & mask) == 0)
            while (1)
                ;               /* infinite loop */

Presumably the examination of volatile variables must somehow be exposed in the abstract machine's behavior.

There are some situations where it may be desirable to leave the interpretation/translation unspecified; for example, the order of evaluation of arguments to a procedure, and perhaps of subexpressions. Note that if there are clearly no side-effects in the evaluated code, or if the compiler happens to be smart enough to determine that two subexpressions do not interfere with each other, it can still change the order of evaluation, because the final result will be the same. What this means is that optimizations may be prohibited in the abstract machine which are in fact legal in real compiled code; if the observable behavior is the same, then the optimization is ok. This brings us to:

Integer overflow--The standard OUGHT to say something about this, but I don't know if it does. If it doesn't, then any compiler writer for C who has ever seen another C compiler will say "AHA! longs/integers/shorts/chars are just integers modulo some power of two", and you can be sure that the writer will try to re-order statements that don't contain references to volatile variables. My gripe about rearrangement of floating point arithmetic is that it certainly does not obey algebraic rules; modulo integer arithmetic is not "theoretically associative and commutative", it *IS* associative and commutative ("Does this program work correctly?" "Theoretically, yes.")

Debugging and optimizations--read the work of Polle Zellweger and the work of John Hennessy. Zellweger's thesis is a good place to start. They actually talk about debugging optimized code, what optimizations you can do, what you cannot, and how compilers and debuggers can get along with each other.
(I hope I am not misrepresenting their work.)

There are three good reasons (in general) not to worry about debugging of optimized C code:

(1) If you are debugging it, it is likely that you plan to recompile several times anyway, and the time spent optimizing the program is not likely to be recovered while running the program.

(2) There are worthwhile optimizing transformations that make it difficult to understand what is going on when you debug a program.

(3) The bugs in your code should be independent of the optimizations applied, except in the case that (a) you are writing a device driver or (b) you are writing code for execution in a concurrent environment.

Obviously, (a) calls for use of volatile variables, and (b) calls for either volatile variables or a new compiler, depending upon whether this sort of programming is an occasional thing or a common thing. There ARE optimizations that are safe even in a concurrent context; it is not necessary to turn off all optimization.

If your code has different behavior depending upon whether or not it has been optimized, I suggest you run lint.

Sigh. I hope that this makes things a little clearer for someone. Comments?

David