padpowell@wateng.UUCP (PAD Powell) (10/12/84)
I have just run into a really fun thing with an optimizer. The problem
was in the code for a hardware level driver, which wanted to:
1. Stuff a value into a register.
2. Look at the register until a flag (bit) went high.
The code written was
struct regs{
int r1;
} *csr;
...
csr->r1 = ST_START;
while( (csr->r1 & ST_DONE) == 0 );
Well, imagine my surprise when the code generated only did:
1. loaded the ST_START value into CPU register (byte value actually)
2. placed the CPU register value into memory (word value)
3. Did not generate a test instruction, because the ST_START and
ST_DONE values were identical.
Now here is the question:
1. Was this legal code generation?
2. Note that this compiler did "simple" optimizations as part of the code
generation. Is this legal?
I know of several ways around this, but I thought that it should be addressed
by the ANSI standard.
Patrick Powell
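
A minimal sketch of the transformation described in the post, assuming hypothetical bit values (the post says only that ST_START and ST_DONE were identical). Without any hint that the location can change on its own, a compiler may reason that csr->r1 still holds the value it just stored, which makes the loop condition a compile-time constant. The function names are illustrative, not from the original driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values: the post only says the two were identical. */
#define ST_START 0x80
#define ST_DONE  0x80

struct regs {
    int r1;
};

/* The code as written in the post: no hint that r1 can change by itself. */
void start_as_written(struct regs *csr)
{
    assert(csr != NULL);
    csr->r1 = ST_START;
    while ((csr->r1 & ST_DONE) == 0)
        ;
}

/* What the optimizer may reduce it to: the stored value is forwarded to
 * the test, (ST_START & ST_DONE) == 0 is constant-false, and the loop
 * together with its test instruction disappears. */
void start_as_optimized(struct regs *csr)
{
    assert(csr != NULL);
    csr->r1 = ST_START;
}
```

On ordinary memory the two functions are indistinguishable, which is why the compiler considers the rewrite safe; only a location that changes behind the program's back exposes the difference.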
henry@utzoo.UUCP (Henry Spencer) (10/14/84)
> [Compiler is optimizing out a wait-for-hardware-done loop.]
> ...
> 1. Was this legal code generation?
> 2. Note that this compiler did "simple" optimizations as part of the code
> generation. Is this legal?

As to whether it's legal by K&R, the only answer is "mumble". This thorny
issue was never addressed in the old days. The draft ANSI standard has a
"volatile" declaration that you can use to tell the compiler "don't get
tricky with this variable, it may change underfoot".
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
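
A hedged sketch of how the "volatile" declaration mentioned above would apply to the original polling loop. The bit values, the function names, and the max_spins bound are assumptions added for illustration; the original loop was unbounded.

```c
#include <assert.h>
#include <stddef.h>

#define ST_START 0x01   /* hypothetical bit assignment */
#define ST_DONE  0x80   /* hypothetical bit assignment */

struct regs {
    int r1;
};

/* Poll until ST_DONE is set. Because csr points to a volatile object,
 * every iteration must re-read r1 from memory.
 * Returns the number of unsuccessful polls, or -1 on timeout. */
long wait_done(volatile struct regs *csr, long max_spins)
{
    long spins;

    assert(csr != NULL);
    for (spins = 0; spins < max_spins; spins++) {
        if (csr->r1 & ST_DONE)
            return spins;
    }
    return -1;
}

/* Start the device, then wait for completion. The store may not be
 * forwarded to the test, since the access is volatile. */
long start_and_wait(volatile struct regs *csr, long max_spins)
{
    assert(csr != NULL);
    csr->r1 = ST_START;
    return wait_done(csr, max_spins);
}
```

In a real driver csr would point at the device's register block; pointing it at an ordinary struct, as a test harness might, simply makes the wait time out unless the done bit is already set.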
bright@dataio.UUCP (Emperor) (10/15/84)
If you are using an optimizing compiler, and your code deals with hardware
registers and such, it is usually a good idea to turn off all
optimizations. Nearly all optimizing transformations applied to a program
will cause certain kinds of programs that deal with hardware to fail.
Since, however, 99% of code written does not deal directly with hardware,
and a good optimizer can double the speed of the resulting code, the
optimizer is worth keeping around.
Walter Bright
guy@rlgvax.UUCP (Guy Harris) (10/17/84)
> I have just run into a really fun thing with an optimizer. ...
> (Discussion of optimization that doesn't work on "volatile" locations
> like device registers)
>
> Now here is the question:
> 1. Was this legal code generation?
> 2. Note that this compiler did "simple" optimizations as part of the code
> generation. Is this legal?

I'd say "yes" to both questions. (BTW, if this was code for a VAX-11,
there's an undocumented "-i" flag to "c2" which turns off these
optimizations; the 4.2BSD Makefile uses it for anything declared as
"device-driver" in the "files" or "files.vax" file.)

> I know of several ways around this, but I thought that it should be addressed
> by the ANSI standard.

It is; there's a pseudo-storage-class called "volatile" which says "this
is subject to change without notice, so don't be clever and optimize
references to it." This is actually useful for things other than device
registers, given that the UNIX kernel has data within it shared by
multiple processes, and that several versions of UNIX, as well as other
OSes, support data shared between user processes.
Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy
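
As a small illustration of "volatile" outside device registers, here is a sketch using a flag set by a signal handler, which is the asynchronous-change case that standard C itself covers. It is a stand-in chosen for this example, not the shared-memory case described above; data shared between processes or threads generally needs locking or atomic operations in addition. The names got_signal, on_signal, and wait_for_flag_demo are hypothetical.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by the main flow. volatile keeps the
 * compiler from caching the value in a register across the loop. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Install the handler, deliver SIGINT to ourselves, and wait for the
 * flag. Returns the flag value seen after the wait. */
int wait_for_flag_demo(void)
{
    void (*old)(int) = signal(SIGINT, on_signal);

    assert(old != SIG_ERR);
    got_signal = 0;
    raise(SIGINT);              /* handler runs before raise() returns */
    while (!got_signal)
        ;                       /* re-reads got_signal on every pass */
    signal(SIGINT, old);        /* restore the previous disposition */
    return (int)got_signal;
}
```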
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/17/84)
> Well, imagine my surprise when the code generated only did:
> 1. loaded the ST_START value into CPU register (byte value actually)
> 2. placed the CPU register value into memory (word value)
> 3. Did not generate a test instruction, because the ST_START and
> ST_DONE values were identical.
>
> Now here is the question:
> 1. Was this legal code generation?
> 2. Note that this compiler did "simple" optimizations as part of the code
> generation. Is this legal?

Step 2 of the generated code was sloppy but legal... incomplete
optimization.

Optimizations certainly are legal, but your C compiler needs an escape to
avoid the sort of over-optimization you have described. Frequently this is
done by testing for references that can be seen at compile time to be
possible "I/O page" addresses and skipping optimizations for them. The
given example would not have been detected under this test since the CSR
variable was loaded at run-time, not compile-time.

The ANSI C committee was supposed to be considering this issue; the
BLISS-style "volatile" type modifier was mentioned as one possibility.
(This tells the compiler not to optimize any expression containing the
variable so flagged.) I don't know the outcome of the discussions.
tom@ucbcad.UUCP (10/18/84)
I ran into some similar problems when I was writing a device driver. I had
to put in some weird kludges to make things work. MOST of my problems
could be solved by avoiding the optimizer, but not all. In any case, I
wanted to use the optimizer to tighten the code as much as I could.

I concluded that there should be a storage class that indicates hardware
side-effects. I also thought it would be nice to have storage attributes
for read-only registers (I guess this is the "const" storage class in the
proposed ANSI standard -- I don't think much of the mnemonic value here,
but I suppose consistency is more important than mnemonics) and one for
write-only registers, so you would get a compile-time error if you tried
to read a write-only register or vice versa.

I'm sure there are other strange storage classes/attributes that people
would like to see in the standard. What do people think is a reasonable
set? I personally think the side-effect class is very important (IS this
in the proposed standard? I don't remember seeing it, but I may be
senile.), but the others are harder to justify.
Tom Laidig
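
A minimal sketch of how the read-only half of this wish can be expressed with the "const" and "volatile" qualifiers together. The register layout and function names are hypothetical. Standard C has no write-only qualifier, so the write-only case is covered here only by convention: the command register is reached through a function that never reads it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register block: one read-only and one write-only register. */
struct dev_regs {
    const volatile unsigned int status;   /* may change underfoot; program may not write it */
    volatile unsigned int command;        /* write-only by convention only */
};

/* Every call performs a fresh read of the status register. */
unsigned int read_status(const volatile struct dev_regs *dev)
{
    assert(dev != NULL);
    return dev->status;
}

/* Writes the command register without reading it. */
void write_command(volatile struct dev_regs *dev, unsigned int value)
{
    assert(dev != NULL);
    dev->command = value;
    /* dev->status = value;   <- rejected at compile time: the member is
     *                           const-qualified */
}
```

An accidental read of the command register still compiles, so that direction gets no compile-time check.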
thomson@uthub.UUCP (Brian Thomson) (10/19/84)
The 'volatile' keyword will tell an (optimizing) compiler that a location
can change value asynchronously, but sometimes that isn't enough.
Another common idiosyncrasy of hardware is that it is read-only or
write-only, or can only be accessed using byte or halfword or fullword
accesses. I believe the Ritchie PDP-11 compiler would compile
    a = b + c;
into
    mov b,a
    add c,a
in appropriate circumstances; this clearly will fail if 'a' refers to
a write-only register.

The second situation is exemplified by registers in nexus space on a VAX
(forgive me if you don't know what this means), which may only be accessed
as longwords, and by Unibus registers which may only be accessed as bytes
or halfwords. A favourite improvement made by /lib/c2 when compiling
    short a;
    ...
    if (a&1) ...
is to replace the straightforward
    bitw $1,_a
    jeql L1
by
    jlbc _a,L1
(Jump if Lower Bit Clear, that means) which is fine except that the first
sequence references _a as a halfword while the second does a longword
reference.

( the undocumented "-i" flag to vax /lib/c2 disables these optimizations
that alter reference size, and doesn't have anything to do with memory
volatility as suggested by an earlier article )
--
Brian Thomson, CSRI Univ. of Toronto
{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!uthub!thomson
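
The two hazards above can be sketched in C with fixed-width volatile accesses. This assumes a C99 compiler providing uint16_t; the helper names are hypothetical. The C standard leaves what counts as an access to a volatile object implementation-defined, so whether the generated instruction really uses the declared width should be checked against the compiler's documentation or its output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Test the low bit of a 16-bit register. The single volatile read is
 * written as a 16-bit access, and the bit test is done on the local copy. */
int low_bit_set16(const volatile uint16_t *reg)
{
    uint16_t value;

    assert(reg != NULL);
    value = *reg;               /* one 16-bit volatile read */
    return (value & 1u) != 0;
}

/* Store b + c into a write-only register: compute the sum in a local,
 * then perform exactly one write and no read of *reg. */
void store_sum(volatile int *reg, int b, int c)
{
    int sum;

    assert(reg != NULL);
    sum = b + c;
    *reg = sum;                 /* single volatile write */
}
```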
henry@utzoo.UUCP (Henry Spencer) (10/19/84)
A paranoid compiler could, presumably, assume that "volatile" meant not
only that the location might change underfoot, but also that there was
something strange about it and it had better be accessed in the most
straightforward way possible. My own thought would be that the special
keyword ("volatile" is perhaps not an ideal choice, it's too specific)
should simply mean that the compiler should be as paranoid as possible on
the given architecture. I suspect that trying to pin down the exact
semantics is both difficult and unwise.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
alan@apollo.uucp (Alan Lehotsky) (10/22/84)
APOLLO's implementation of C has support for both the notion of memory
being shared between asynchronous activities (VOLATILE) and memory-mapped
i/o (DEVICE). Rather than implement these as storage classes, we chose to
create an "extensible" extension mechanism with an attributes list.

The semantics of VOLATILE are based on the notion that the named variable
can be modified between any two references, so that the variable may not
be a component of any common subexpression, nor may it be hoisted out of
a loop.

DEVICE implies more stringent conditions on the optimizer and code
generator. (Still somewhat of a DWIM [Do What I Mean] situation, though)
In addition to implying VOLATILE, it indicates that extra references by
the compiled code will be MOST unwelcome. As an explicit example, a DEVICE
location will never be the target of a CLR instruction by the 68000 code
generator, as this instruction does a READ-MODIFY bus cycle! (which can
really annoy some of the dumber write-only Multibus peripheral cards!)

The flavor of the syntax for this extension is
    int devreg #attribute[volatile];
    short outmask #attribute[device(write)],
          inmask #attribute[device(read)];
(NO FLAMES ABOUT HERETICAL VIOLATIONS OF THE K&R "BIBLE", PLEASE)

We also support an "ADDRESS(expr)" attribute which allows a name to be
associated with a compile-time constant expression which denotes a memory
address.

All of the above functionality also appears in our PASCAL, with different
syntax (similar to VAX-11 PASCAL's attributes). The implementation and
semantics of this was based on very similar work I implemented in
DIGITAL's common BLISS compilers in the late 70's. (Just another example
of BLISS being a much better systems programming language than C will EVER
be.....)
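
The #attribute syntax above is a vendor extension. In portable C, one common way to keep the source from ever reading a write-only location is a software shadow copy, sketched below. This technique is not taken from the thread, and the wo_reg type and function names are hypothetical. It controls only what the source asks for; it cannot stop a code generator from choosing an instruction, such as the 68000 CLR mentioned above, that reads the location as a side effect.

```c
#include <assert.h>
#include <stddef.h>

/* A write-only register paired with a RAM copy of the last value written. */
struct wo_reg {
    volatile unsigned short *hw;    /* the device location; never read here */
    unsigned short shadow;          /* last value written to *hw */
};

/* Write a full value, updating the shadow first. */
void wo_write(struct wo_reg *r, unsigned short value)
{
    assert(r != NULL && r->hw != NULL);
    r->shadow = value;
    *r->hw = value;                 /* one volatile write, no read */
}

/* Set bits using the shadow copy instead of reading the device. */
void wo_set_bits(struct wo_reg *r, unsigned short bits)
{
    assert(r != NULL);
    wo_write(r, (unsigned short)(r->shadow | bits));
}

/* Clear bits using the shadow copy instead of reading the device. */
void wo_clear_bits(struct wo_reg *r, unsigned short bits)
{
    assert(r != NULL);
    wo_write(r, (unsigned short)(r->shadow & ~bits));
}
```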