padpowell@wateng.UUCP (PAD Powell) (10/12/84)
I have just run into a really fun thing with an optimizer. The problem
was in the code for a hardware level driver, which wanted to:
1. Stuff a value into a register.
2. Look at the register until a flag (bit) went high.
The code written was
    struct regs{
        int r1;
    } *csr;
    ...
    csr->r1 = ST_START;
    while( (csr->r1 & ST_DONE) == 0 );
Well, imagine my surprise when the code generated only did:
1. loaded the ST_START value into CPU register (byte value actually)
2. placed the CPU register value into memory (word value)
3. Did not generate a test instruction, because the ST_START and
ST_DONE values were identical.
Now here is the question:
1. Was this legal code generation?
2. Note that this compiler did "simple" optimizations as part of the code
generation. Is this legal?
I know of several ways around this, but I thought that it should be addressed
by the ANSI standard.
Patrick Powell
henry@utzoo.UUCP (Henry Spencer) (10/14/84)
> [Compiler is optimizing out a wait-for-hardware-done loop.]
> ...
> 1. Was this legal code generation?
> 2. Note that this compiler did "simple" optimizations as part of the code
> generation. Is this legal?

As to whether it's legal by K&R, the only answer is "mumble". This thorny
issue was never addressed in the old days. The draft ANSI standard has a
"volatile" declaration that you can use to tell the compiler "don't get
tricky with this variable, it may change underfoot".
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
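For illustration, here is roughly how the original polling loop might look once the register is declared "volatile". This is only a sketch: the bit values for ST_START and ST_DONE are made up (the original post says only that they happened to be identical), and the max_spins bound is an added assumption so the loop can be exercised against ordinary memory without hanging.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values, for illustration only. */
#define ST_START 0x01
#define ST_DONE  0x80

struct regs {
    int r1;
};

/* Spin until ST_DONE shows up in r1. Because the pointee is volatile,
 * every pass through the loop performs a fresh read of r1; the compiler
 * may not reuse an earlier value or drop the test.
 * max_spins is an assumed safety bound, not part of the original loop.
 * Returns 1 if ST_DONE was seen, 0 if the bound ran out first. */
int wait_done(volatile struct regs *csr, long max_spins)
{
    assert(csr != NULL);
    while ((csr->r1 & ST_DONE) == 0) {
        if (max_spins-- <= 0)
            return 0;
    }
    return 1;
}

/* Start the device, then poll for completion. */
int start_and_wait(volatile struct regs *csr, long max_spins)
{
    assert(csr != NULL);
    csr->r1 = ST_START;
    return wait_done(csr, max_spins);
}
```

On real hardware csr would point at the device's register block; against plain memory nothing ever sets ST_DONE, so start_and_wait simply times out.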
bright@dataio.UUCP (Emperor) (10/15/84)
If you are using an optimizing compiler, and your code deals with hardware
registers and such, it is usually a good idea to turn off all optimizations.
Nearly all optimizing transformations applied to a program will cause
certain kinds of programs that deal with hardware to fail. Since, however,
99% of code written does not deal directly with hardware, and a good
optimizer can double the speed of the resulting code, the optimizer is
worth keeping around.
Walter Bright
guy@rlgvax.UUCP (Guy Harris) (10/17/84)
> I have just run into a really fun thing with an optimizer. ...
> (Discussion of optimization that doesn't work on "volatile" locations
> like device registers)
>
> Now here is the question:
> 1. Was this legal code generation?
> 2. Note that this compiler did "simple" optimizations as part of the code
> generation. Is this legal?

I'd say "yes" to both questions. (BTW, if this was code for a VAX-11,
there's an undocumented "-i" flag to "c2" which turns off these
optimizations; the 4.2BSD Makefile uses it for anything declared as
"device-driver" in the "files" or "files.vax" file.)

> I know of several ways around this, but I thought that it should be addressed
> by the ANSI standard.

It is; there's a pseudo-storage-class called "volatile" which says "this is
subject to change without notice, so don't be clever and optimize references
to it." This is actually useful for things other than device registers,
given that the UNIX kernel has data within it shared by multiple processes,
and that several versions of UNIX, as well as other OSes, support data
shared between user processes.
Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/17/84)
> Well, imagine my surprise when the code generated only did:
> 1. loaded the ST_START value into CPU register (byte value actually)
> 2. placed the CPU register value into memory (word value)
> 3. Did not generate a test instruction, because the ST_START and
> ST_DONE values were identical.
>
> Now here is the question:
> 1. Was this legal code generation?
> 2. Note that this compiler did "simple" optimizations as part of the code
> generation. Is this legal?

Step 2 of the generated code was sloppy but legal... incomplete
optimization.

Optimizations certainly are legal, but your C compiler needs an escape to
avoid the sort of over-optimization you have described. Frequently this is
done by testing for references that can be seen at compile time to be
possible "I/O page" addresses and skipping optimizations for them. The given
example would not have been detected under this test since the CSR variable
was loaded at run-time, not compile-time.

The ANSI C committee was supposed to be considering this issue; the
BLISS-style "volatile" type modifier was mentioned as one possibility.
(This tells the compiler not to optimize any expression containing the
variable so flagged.) I don't know the outcome of the discussions.
tom@ucbcad.UUCP (10/18/84)
I ran into some similar problems when I was writing a device driver. I had
to put in some weird kludges to make things work. MOST of my problems could
be solved by avoiding the optimizer, but not all. In any case, I wanted to
use the optimizer to tighten the code as much as I could.

I concluded that there should be a storage class that indicates hardware
side-effects. I also thought it would be nice to have storage attributes for
read-only registers (I guess this is the "const" storage class in the
proposed ANSI standard -- I don't think much of the mnemonic value here, but
I suppose consistency is more important than mnemonics) and one for
write-only registers, so you would get a compile-time error if you tried to
read a write-only register or vice versa.

I'm sure there are other strange storage classes/attributes that people
would like to see in the standard. What do people think is a reasonable set?
I personally think the side-effect class is very important (IS this in the
proposed standard? I don't remember seeing it, but I may be senile.), but
the others are harder to justify.
Tom Laidig
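As a rough sketch of the read-only and write-only idea in C as it was later standardized: "const volatile" covers the read-only case, since a store through such a pointer is rejected at compile time. C has no matching qualifier for write-only registers, so the wrapper below is only a convention. The names ro_reg, wo_reg, ro_read and wo_write are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Read-only register: const stops this code from storing to it,
 * volatile stops the compiler from assuming the value stays put. */
typedef const volatile unsigned short ro_reg;

/* Write-only register: there is no qualifier for this, so this sketch
 * wraps the storage in a struct and offers only a store function.
 * Nothing prevents code from reading .raw directly; it is a convention,
 * not a compile-time guarantee. */
typedef struct {
    volatile unsigned short raw;
} wo_reg;

/* Fetch the current value of a read-only register. */
unsigned short ro_read(ro_reg *reg)
{
    assert(reg != NULL);
    return *reg;
}

/* Store a value to a write-only register without reading it first. */
void wo_write(wo_reg *reg, unsigned short value)
{
    assert(reg != NULL);
    reg->raw = value;
}
```

An assignment such as "*reg = 1" inside ro_read would fail to compile, which is the compile-time error asked for above; the write-only direction still relies on discipline.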
thomson@uthub.UUCP (Brian Thomson) (10/19/84)
The 'volatile' keyword will tell an (optimizing) compiler that a
location can change value asynchronously, but sometimes that isn't
enough.
Another common idiosyncrasy of hardware is that it is read-only
or write-only, or can only be accessed using byte or halfword or
fullword accesses.
I believe the Ritchie PDP-11 compiler would compile
    a = b + c;
into
    mov b,a
    add c,a
in appropriate circumstances; this clearly will fail if 'a' refers to
a write-only register.
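One defensive way to write such an assignment, sketched here under the assumption of a compiler that honours "volatile", is to form the sum in an ordinary local and then make a single store to the register. The function name store_sum is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Compute b + c in a local, then store it to the register exactly once.
 * The register is never used as scratch space, so it is never read. */
void store_sum(volatile int *reg, int b, int c)
{
    int tmp;

    assert(reg != NULL);
    tmp = b + c;    /* all arithmetic happens in tmp */
    *reg = tmp;     /* the only access to the register: one store */
}
```

With a volatile destination a conforming compiler should not use the register as an accumulator anyway; the temporary mainly makes the intent obvious to the reader.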
The second situation is exemplified by registers in nexus space
on a Vax (forgive me if you don't know what this means), which
may only be accessed as longwords, and by Unibus registers which
may only be accessed as bytes or halfwords.
A favourite improvement made by /lib/c2 when compiling
    short a;
    ...
    if (a&1) ...
is to replace the straightforward
    bitw $1,_a
    jeql L1
by
    jlbc _a,L1
(Jump if Lower Bit Clear, that means) which is fine except that
the first sequence references _a as a halfword while the second
does a longword reference.
( the undocumented "-i" flag to vax /lib/c2 disables
these optimizations that alter reference size, and doesn't have
anything to do with memory volatility as suggested by an earlier
article )
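A sketch of how the access size might be pinned down in later C, using a fixed-width type from <stdint.h> together with "volatile". This is hedged: the C standard leaves what counts as an access to a volatile object implementation-defined, so whether the compiler really issues a 16-bit bus cycle has to be checked against the compiler documentation or the generated code. The name low_bit_set is invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Test the low bit of a 16-bit register.
 * The register is read once, through a 16-bit volatile lvalue, into a
 * local; the bit test is then done on the local copy. */
int low_bit_set(const volatile uint16_t *reg)
{
    uint16_t snapshot;

    assert(reg != NULL);
    snapshot = *reg;                /* single read of the register */
    return (snapshot & 1u) != 0;    /* test performed on the copy */
}
```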
--
Brian Thomson, CSRI Univ. of Toronto
{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!uthub!thomson
henry@utzoo.UUCP (Henry Spencer) (10/19/84)
A paranoid compiler could, presumably, assume that "volatile" meant not
only that the location might change underfoot, but also that there was
something strange about it and it had better be accessed in the most
straightforward way possible. My own thought would be that the special
keyword ("volatile" is perhaps not an ideal choice, it's too specific)
should simply mean that the compiler should be as paranoid as possible
on the given architecture. I suspect that trying to pin down the exact
semantics is both difficult and unwise.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
alan@apollo.uucp (Alan Lehotsky) (10/22/84)
APOLLO's implementation of C has support for both the notion of
memory being shared between asynchronous activities (VOLATILE)
and memory-mapped i/o (DEVICE). Rather than implement these as
storage classes, we chose to create an "extensible" extension
mechanism with an attributes list.
The semantics of VOLATILE are based on the notion that the
named variable can be modified between any two references, so
that the variable may not be a component of any common subexpression,
nor may it be hoisted out of a loop.
DEVICE implies more stringent conditions on the optimizer and code
generator. (Still somewhat of a DWIM [Do What I Mean] situation, though)
In addition to implying VOLATILE, it indicates that extra references
by the compiled code will be MOST unwelcome. As an explicit example,
a DEVICE location will never be the target of a CLR instruction by the
68000 code generator, as this instruction does a READ-MODIFY bus cycle!
(which can really annoy some of the dumber write-only Multibus peripheral
cards!)
The flavor of the syntax for this extension is
int devreg #attribute[volatile];
short outmask #attribute[device(write)],
inmask #attribute[device(read)];
(NO FLAMES ABOUT HERETICAL VIOLATIONS OF THE K&R "BIBLE", PLEASE)
We also support an "ADDRESS(expr)" attribute which allows a name to
be associated with a compile-time constant expression which denotes
a memory address.
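In plain C without such an extension, the usual stand-in for an ADDRESS attribute is a macro that casts a constant address to a pointer to volatile. The sketch below is not Apollo's mechanism; REG32_AT and reg32_set_bits are names invented for this example, and any real device address would have to come from the hardware manual.

```c
#include <assert.h>
#include <stdint.h>

/* Treat the 32-bit word at a given address as a volatile register. */
#define REG32_AT(addr) (*(volatile uint32_t *)(uintptr_t)(addr))

/* Set some bits in the register at addr: one read followed by one write.
 * The alignment check is an assumption of this sketch, not a universal
 * hardware rule. */
void reg32_set_bits(uintptr_t addr, uint32_t bits)
{
    uint32_t value;

    assert(addr % sizeof(uint32_t) == 0);
    value = REG32_AT(addr);     /* read the current contents once */
    REG32_AT(addr) = value | bits;
}
```

Note that this read-then-write pattern is exactly what a write-only register cannot tolerate, which is why the DEVICE(write) distinction described above matters.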
All of the above functionality also appears in our PASCAL, with
different syntax (similar to VAX-11 PASCAL's attributes). The
implementation and semantics of this was based on very similar
work I implemented in DIGITAL's common BLISS compilers in the late
70's. (Just another example of BLISS being a much better systems
programming language than C will EVER be.....)