pcg@aber-cs.UUCP (Piercarlo Grandi) (12/03/88)
In article <9033@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
> Aggressive optimization is the new wave, brought on partly by all those
> C compiler races published in hobbyist magazines. Most of the time,
> faster-running code is an advantage, so generally this is a good thing.

My extreme hostility to "volatile" (and, as I had forgotten to mention, to making all standard procedure names "reserved words") stems exactly from this: C has been beautifully designed so as to allow for a simple compiler that does not do aggressive code optimization, and volatile (and "reserved word" procs) instead try to make respectable the idea that a complex, expensive optimizer will give you substantially better performance than a little ingenuity. In a sense volatile (and "reserved word" procs) are there mostly to allow cheating on well-known benchmarks by peddlers of complex compilers.

First of all, in a sense, volatile and register exclude each other; in traditional C all variables are volatile save for those declared to be register (and this explains why there is a restriction on &...). People who really care about speed know that register is perfectly adequate, IF USED COMPETENTLY, given a compiler that does a straightforward but COMPETENT generation job (most tuned PCCs), to make non-benchmark programs run fast.

Indeed, compiler selection of variables to put in registers, among those not declared volatile, is usually based on STATIC frequency of use, while the programmer knows better about DYNAMIC frequency of use, because he/she understands the algorithm and has an idea of its hot spots, either from theory or from profiling.

Arguments for volatile, as opposed to register, are essentially:

[1] you cannot trust programmers to be competent and understand what they are writing, or at least not enough to place appropriate register declarations; in that case an automatic optimizer will possibly improve things, even if it does a purely static analysis.
[2] you want people to pay a lot of money for your latest optimizing compiler technology for C, and you want it to be blessed by dpANS, so you can say of other compilers "they do not have/implement volatile" without people questioning whether it is useful at all (it's in the standard!).

As to point [1], unfortunately you need damn competent programmers who understand a lot of the subtleties of their program to place volatile where it is needed, and only there; otherwise they will be inclined, for safety, to declare everything in sight to be volatile...

As to point [2], it makes a lot of sense if you want your customers to believe that they can put in their crusty old C codes, and ZAP! they will run like greased lightning on your large register set CPU, without rewriting, thanks to the optimizer; they will normally not have read that old SIGPLAN paper that found the knee for C variables in registers at FOUR, and all the statistics on the median (low) complexity of expressions and statements.
And your customers probably are full of things like the dot product loop coded like this, by somebody that knows algebra but not programming:

    float a[M][X], b[X][N], c[M][N];
    int i, j, k;

    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            c[i][j] = 0.0;

    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            for (k = 0; k < X; k++)
                c[i][j] += a[i][k] * b[k][j];

instead of the obviously proper, better structured and efficient C way (slicing, and using register pointers):

    /*INLINE*/ float rowcol
    (
        register float *aix, register float *bxj,
        register unsigned e, register unsigned s
    )
    {
        register float p = 0.0;

        /* e products: aix walks a row (stride 1), bxj walks a column (stride s) */
        for (aix, bxj, e; e != 0; aix += 1, bxj += s, --e)
            p += *aix * *bxj;

        return p;
    }

    float a[M][X], b[X][N], c[M][N];
    float *cix, *aix;
    float *bxj, *cij;

    /* cix and aix step through the rows of c and a */
    for (cix = &c[0][0], aix = &a[0][0]; cix < &c[0][0]+(M*N); cix += N, aix += X)
        /* bxj starts at b[0][0] and moves one element per column of b */
        for (cij = cix, bxj = &b[0][0]+(cij-cix); cij < cix+N; cij++, bxj += 1)
            *cij = rowcol(aix,bxj,X,N);

Note also the following two points:

[1] Placing register is a lot safer than placing volatile, as you never make a program wrong by declaring a variable register, but it may be wrong if you fail to place volatile where it is needed.

[2] In any case, because of the 80%/20% rule, it is usually beneficial only to put a few variables in registers, and you do not usually need a lot of thought to identify them. Volatile means that all non-volatile variables may be put in registers, but usually this does not help a lot, because of the rule...

Summing up:

[1] volatile is useless; you only need a compiler that does good code generation (usually good peephole optimization is enough), and the ability to place register tags where necessary.

[2] volatile is a loss; it is a new keyword, a new and difficult concept, one that makes programs wrong if omitted, and one that gives the illusion that complex, highly optimizing compilers are really necessary for C, which was carefully designed, with several restrictions, and with register, not to require them.
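
For readers who want to try the two matrix product versions from the post, here is a minimal sketch that runs both on a small input and compares the results. The function names (mul_indexed, mul_sliced, products_agree) and the sizes M=2, X=3, N=2 are made up for illustration and do not come from the post. The sketch assumes the inner loop index is k and that bxj advances one element per column of b, which is what the indexed form b[k][j] requires.

```c
#define M 2
#define X 3
#define N 2

/* Dot product of e elements: aix advances by 1, bxj by stride s. */
float rowcol(const float *aix, const float *bxj, unsigned e, unsigned s)
{
    float p = 0.0f;

    for (; e != 0; aix += 1, bxj += s, --e)
        p += *aix * *bxj;
    return p;
}

/* Plain indexed triple loop (hypothetical name). */
void mul_indexed(float a[M][X], float b[X][N], float c[M][N])
{
    int i, j, k;

    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++) {
            c[i][j] = 0.0f;
            for (k = 0; k < X; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
}

/* Pointer-sliced version (hypothetical name).
 * Assumption: each 2-D array is walked as one flat run of floats through
 * a pointer to its first element. This is a common idiom, although a
 * strict reading of the C standard does not guarantee it. */
void mul_sliced(float a[M][X], float b[X][N], float c[M][N])
{
    float *cix, *cij;
    const float *aix, *bxj;

    /* cix and aix point at the start of row i of c and of a */
    for (cix = &c[0][0], aix = &a[0][0]; cix < &c[0][0] + M * N;
         cix += N, aix += X)
        /* bxj points at b[0][j], the top of column j; the stride is N */
        for (cij = cix, bxj = &b[0][0]; cij < cix + N; cij++, bxj++)
            *cij = rowcol(aix, bxj, X, N);
}

/* Returns 1 if both versions agree on a small fixed input, else 0. */
int products_agree(void)
{
    float a[M][X] = {{1, 2, 3}, {4, 5, 6}};
    float b[X][N] = {{7, 8}, {9, 10}, {11, 12}};
    float c1[M][N], c2[M][N];
    int i, j;

    mul_indexed(a, b, c1);
    mul_sliced(a, b, c2);
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            if (c1[i][j] != c2[i][j])
                return 0;
    return 1;
}
```

The inputs are small integers, so every product and sum is exact in float and the results can be compared with ==. Which version runs faster depends on the compiler and machine, and this sketch makes no claim about that.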
As to the last point, when I first studied C many years ago, I was really struck by the cleverness of having a register storage class, its careful definition, the obvious advantage of having the programmer select the few crucial variables to cache, obviating the need for a complex and large optimizer to do the same job, only worse.
--
Piercarlo "Peter" Grandi                  INET: pcg@cs.aber.ac.uk
Sw.Eng. Group, Dept. of Computer Science  UUCP: ...!mcvax!ukc!aber-cs!pcg
UCW, Penglais, Aberystwyth, WALES SY23 3BZ (UK)
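
The post above says that a program "may be wrong if you fail to place volatile where it is needed". One standard case is a flag set by a signal handler, sketched below. The names got_signal, on_signal and wait_for_flag are hypothetical, and the choice of SIGINT is only for illustration. The sketch assumes an ANSI-style compiler that provides volatile and sig_atomic_t.

```c
#include <signal.h>

/* Written by the handler, read by the main flow. The volatile qualifier
 * tells the compiler that the value can change outside the normal flow
 * of control, so every read must go to memory. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;        /* unused */
    got_signal = 1;
}

/* Installs the handler, raises the signal, then waits for the flag.
 * Returns 1 once the flag is seen, or -1 if the handler could not be
 * installed. */
int wait_for_flag(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;

    raise(SIGINT);      /* the handler runs before raise() returns */

    /* Without volatile, an optimizer that keeps got_signal in a register
     * would be allowed to test the stale copy forever when the signal
     * arrives asynchronously. */
    while (!got_signal)
        ;
    return 1;
}
```

A compiler that never caches ordinary variables in registers behaves the same with or without the qualifier here, which is the traditional behaviour the post describes.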
henry@utzoo.uucp (Henry Spencer) (12/06/88)
In article <319@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>...C has been beautifully designed so as to allow for a simple
>compiler that does not do aggressive code optimization...

However, it is a verifiable fact that in general your code runs faster if your C compiler does aggressive optimization. Many (not all) of the things an optimizer can do can be done by a C programmer, *if* he has enough time and patience to do a lot of fussy, boring bit-picking. In practice, programmers understandably prefer to leave that to the compiler, and simply don't bother doing it if the compiler can't.

Another point which should not be overlooked is that code that has been micro-optimized in this fashion tends to be (a) quite machine-specific (it may work on another machine, but the micro-optimizations will be all wrong), and (b) grossly unreadable.
--
SunOSish, adj: requiring  |     Henry Spencer at U of Toronto Zoology
32-bit bug numbers.       | uunet!attcan!utzoo!henry henry@zoo.toronto.edu