umh@vax5.cit.cornell.edu (04/09/91)
Something I wonder about. C has provision for register variables, which are supposed to run faster than standard variables. How come one never sees register variables in any code? Are modern RISC compilers sufficiently good that they automatically make a sensible choice of register variables? Can I make my code run slower by using them?

In a like vein, C has the feature that you can create variables local to some part of your function, something like

    if (condition) {
        int i;
        ...
    }

and i is invisible to the rest of your code. Can judicious use of this speed things up? The compiler should know it doesn't have to save i when it leaves that block, etc.

Maynard Handley
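A minimal sketch of the block-local idiom the question describes (the function and variable names are mine, purely for illustration): because i's scope ends with the block, the compiler knows its register or stack slot can be reused afterwards.

```c
/* Illustrative only: i exists solely inside the if-block. */
int sum_below(int n)
{
    int total = 0;
    if (n > 0) {
        int i;                  /* visible only within this block */
        for (i = 0; i < n; i++)
            total += i;
    }
    /* i is out of scope here; nothing about it need be saved */
    return total;
}
```

Whether this actually speeds anything up depends on the compiler, as the replies below discuss.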
jlg@cochiti.lanl.gov (Jim Giles) (04/09/91)
In article <1991Apr8.193155.3911@vax5.cit.cornell.edu>, umh@vax5.cit.cornell.edu writes:
|> Something I wonder about.
|> C has provision for register variables, which are supposed to run faster than
|> standard variables. How come one never sees register variables in any code?

I see them all the time. Most of my local variables in C code that I write are register. I wish it were the default mode for all automatic variables. According to the ANSI standard, the only thing that the register attribute means is that you can't point to that variable (that is, you can't use the 'address-of' operator (&) on it). This means that a smart compiler can optimize them in ways that it otherwise can't.

|> Are modern RISC compilers sufficiently good that they automatically make
|> sensible choice of register variables? [...]

No. At least I've not seen any that good. The problem is that _any_ use of the address of a variable requires turning off the register attribute - even call by reference to a procedure. Unless the compiler keeps track of all uses of all variables, it can't identify those that can be made register variables. Most compilers optimize on the 'basic block' level - and don't know much about the use of variables outside that level (except for the declaration). Still, as you say, a _really_ good compiler would make the register attribute redundant.

|> [...] Can I make my code run slower by using
|> them?

Yes. A low-quality compiler might try to force your register variables into real registers as often as possible. Even this sort of behaviour can be useful, but you have to choose your register variables very carefully, and knowing something about the target machine language (and especially how many registers it has) is a prerequisite. A better compiler will simply make use of the no-alias property of register variables, and on those compilers you should make everything register that you can.

|> [... nested scope ...]
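To make Giles's point concrete, here is a small sketch (function and variable names are mine): declaring a variable register guarantees the compiler it can never be aliased, because applying & to it is a constraint violation under ANSI C.

```c
/* Illustrative only: acc and i can never be pointed to, so the
   compiler may keep them in machine registers without worrying
   about stores through some alias. */
long dot3(const int *a, const int *b)
{
    register long acc = 0;      /* no pointer can ever refer to acc */
    register int i;

    for (i = 0; i < 3; i++)
        acc += (long)a[i] * b[i];

    /* long *p = &acc;   -- would not compile: & of a register variable */
    return acc;
}
```

The commented-out line is the whole content of the guarantee: with it enabled, a conforming compiler must diagnose the program.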
|> Can judicious use of this speed
|> things up - the compiler should know it doesn't have to save i when it leaves
|> that block etc?

Yes and no. An exceptionally bad compiler may actually do stack allocation chores for each nested scope. This could mean, for example, that a variable inside the scope of a loop could cause memory management calls for each pass through the loop. Fortunately, most compilers only do stack allocation on entry to a procedure and allocate enough for all nested scopes at that time.

However, most really good modern compilers will already know whether a variable is 'live' on exit or not. If you don't use the variable anywhere else in the code, a really good compiler will not do any redundant stores. In fact, even if you _do_ use it elsewhere in the code, but all paths to the additional use contain an assignment to the variable, the compiler will still not do the store. (Liveness analysis is more often available than the global data flow required to identify a variable which can be given the register attribute. I don't know why this is; the same mechanisms could carry the additional information.)

J. Giles
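A tiny sketch of the second case Giles describes (the function name is mine): every path from the initial store to the later use reassigns t first, so the initial store is dead and a compiler with liveness analysis need not emit it.

```c
/* Illustrative only: the store `t = 0` is dead, because both
   branches assign t again before its only use at the return. */
int abs_via_branches(int x)
{
    int t = 0;      /* dead store: never read before reassignment */

    if (x >= 0)
        t = x;
    else
        t = -x;

    return t;       /* only the branch assignments reach this use */
}
```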
marc@watson.ibm.com (Marc Auslander) (04/10/91)
The IBM Risc System 6000 compiler ignores the register declaration except for checking that no address is taken. All local variables which are not aliased are register allocated and spilled if necessary. Aliased variables must be forced home as needed.
--
Marc Auslander <marc@ibm.com>
mjs@hpfcso.FC.HP.COM (Marc Sabatella) (04/11/91)
>|> Are modern RISC compilers sufficiently good that they automatically make
>|> sensible choice of register variables? [...]
>
>No. At least I've not seen any that good. The problem is that _any_
>use of the address of a variable requires turning off the register
>attribute - even call by reference to a procedure. Unless the compiler
>keeps track of all uses of all variables, it can't identify those that
>can be made register variables.

Come on, Jim, you know better than this. Determining which variables are *candidates* for register allocation is TRIVIAL. While the actual allocation may not be as good as you can do by hand (since you have dynamic information, like how many times that loop will really be executed), it is usually almost as good as randomly putting "register" on most or all locals would do.

> Most compilers optimize on the 'basic
>block' level - and don't know much about the use of variables outside
>that level (except for the declaration). Still, as you say, a _really_
>good compiler would make the register attribute redundant.

Translation - any compiler taking advantage of the basic global optimization techniques you can read about in your favorite Dragon book. Certainly any compiler for a RISC chip.
andras@alzabo.ocunix.on.ca (Andras Kovacs) (04/11/91)
umh@vax5.cit.cornell.edu writes:
>C has provision for register variables, which are supposed to run faster than
>standard variables. How come one never sees register variables in any code?
>Are modern RISC compilers sufficiently good that they automatically make
>sensible choice of register variables? Can I make my code run slower by using
>them?

My compiler is Norcroft ARM C V1.54A (admittedly an early release). It puts the first 10 variables into registers and the others are loaded/stored on demand. So using 'register' means the same as moving the variable declaration into the first 10. Were the compiler better, it would analyze the code and decide register usage based on that analysis; but then 'register'-ing a var could have two possible effects:

 1, It is obeyed - but then it's better if you know what you are doing, otherwise you can indeed slow down the code, or
 2, It is disregarded - because the compiler trusts that it indeed knows the best allocation scheme.

I assume that good compilers use the second approach - not out of disregard for the programmer, but because either the programmer asks for the right var to be register and then it already is; or the compiler KNOWS that the register var would cost execution speed, and then what is the point of using it?

I hope my view is not too simplistic; could someone with actual experience follow up on the subject?

Andras
--
Andras Kovacs andras@alzabo.ocunix.on.ca Nepean, Ont.
jimp@cognos.UUCP (Jim Patterson) (04/11/91)
In article <1991Apr11.003431.24918@alzabo.ocunix.on.ca> andras@alzabo.ocunix.on.ca (Andras Kovacs) writes:
>... 'register'-ing a var
>could have two possible effects:
> 1, It is obeyed - but then better if you know what you are doing otherwise
> you can indeed slow down the code, or
> 2, Disregarded - because the compiler trusts itself that indeed it knows the
> best allocation scheme.
>
> I assume that good compilers use the second approach - not out of disregard
>to the programmer but either the programmer asks for the right var to be
>register and then it is already; or the compiler KNOWS that the register var
>would cost execution speed and then what is the point of using it?

I don't think a compiler can ever know absolutely that a given variable should be in a register whereas another should not, for "performance". It can't do anything more than guess at the actual code dynamics, and therefore would be allocating registers based on assumptions which could be totally erroneous. Here's an example which should illustrate this. Assume that there is 1 register free to allocate to either i or j. The outer for-loop is allocating an array of things. The inner loop is an error recovery routine and normally not executed; however, if a compiler assumes that there is a 50% chance that the first if-condition will succeed, it's going to weight j a lot higher than i for register allocation. This can't possibly be right if during normal execution malloc never fails.

    for (i=0 ; i<10000; ++i) {
        e=malloc(sizeof(element));
        if (!e) {
            for (j=0; j<10000; ++j)
                if (j < i)
                    free(table[j]);
            getout();   /* Leave routine now */
        }
        table[i] = e;
    }

We could make a rule that a variable in a conditional block is weighted enough lighter than one outside that the register allocation would go to i instead of j, but then we can just as easily reverse the test and find that now we should put j in the register, not i.
Neither allocation wins in all situations, but the compiler doesn't have enough information to know which is best.

Things are different if the compiler is optimizing for size, not speed. Then it becomes very clear which allocation yields the best (smallest) code. If a compiler optimizes for size, however, it should give the option of optimizing for speed as well, since they aren't usually the same. (The PL/I compilers I used to use gave you the option.)
--
Jim Patterson Cognos Incorporated
UUCP:uunet!mitel!cunews!cognos!jimp P.O. BOX 9707
PHONE:(613)738-1440 x6112 3755 Riverside Drive
NOT a Jays fan (not even a fan) Ottawa, Ont K1G 3Z4
jesup@cbmvax.commodore.com (Randell Jesup) (04/15/91)
In article <20715@lanl.gov> jlg@cochiti.lanl.gov (Jim Giles) writes:
>I see them all the time. Most of my local variables in C code that I
>write are register. I wish it were the default mode for all automatic
>variables. According to the ANSI standard, the only thing that the
>register attribute means is that you can't point to that variable (that
>is, you can't use the 'address-of' operator (&) on them). This means
>that a smart compiler can optimize them in ways that it otherwise can't.

A smart compiler notices that & is never used on the variable, removing that reason for putting 'register' everywhere.

>|> Are modern RISC compilers sufficiently good that they automatically make
>|> sensible choice of register variables? [...]
>
>No. At least I've not seen any that good. The problem is that _any_
>use of the address of a variable requires turning off the register
>attribute - even call by reference to a procedure. Unless the compiler
>keeps track of all uses of all variables, it can't identify those that
>can be made register variables. Most compilers optimize on the 'basic
>block' level - and don't know much about the use of variables outside
>that level (except for the declaration). Still, as you say, a _really_
>good compiler would make the register attribute redundant.

Umm, maybe you've spent too much time in the MSDOS world, or something equally horrid. In any case, modern compilers for even things like the Amiga (680x0) do global optimization, including register selection (and have for some time). Sometimes you can do better than they do, but they usually are as good or better. I no longer even bother with register except in extreme cases - and I'm more likely to drop to assembler if I need speed that much.

>However, most really good modern compilers will already know whether
>a variable is 'live' on exit or not. If you don't use the variable
>anywhere else in the code, a really good compiler will not do any
>redundant stores.
>In fact, even if you _do_ use it elsewhere in the
>code, but all paths to the additional use contain an assignment to the
>variable, the compiler will still not do the store. (Liveness analysis
>is more often available than the global data flow required to identify
>a variable which can be given register attribute. I don't know why this
>is, the same mechanisms could carry the additional information.)

I think you're underestimating the state of compilers out there.
--
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com BIX: rjesup
Disclaimer: Nothing I say is anything other than my personal opinion.
Thus spake the Master Ninjei: "To program a million-line operating
system is easy, to change a man's temperament is more difficult."
(From "The Zen of Programming") ;-)
coopteam@bcarh680.bnr.ca (Jean Laliberte) (04/15/91)
In article <9526@cognos.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
>I don't think a compiler can ever know absolutely that a given variable
>should be in a register whereas another should not, for "performance".
>It can't do anything more than guess at the actual code dynamics, and
>therefore would be allocating registers based on assumptions which
>could be totally erroneous. Here's an example which should illustrate
>this.
>
>[description deleted]
>
>for (i=0 ; i<10000; ++i) {
>    e=malloc(sizeof(element));
>    if (!e) {
>        for (j=0; j<10000; ++j)
>            if (j < i)
>                free(table[j]);
>        getout();   /* Leave routine now */
>    }
>    table[i] = e;
>}

Is there some reason register allocation couldn't also be dynamic? In this example, i would be stored in the register for the most part, but when the if() statement evaluates to true, 'store i' and 'load j' instructions would be executed, and j would then be the register variable (when the if() condition is over, i would be put back in the register). I think this would be faster than choosing one variable or the other.

>--
>Jim Patterson Cognos Incorporated
>UUCP:uunet!mitel!cunews!cognos!jimp P.O. BOX 9707
>PHONE:(613)738-1440 x6112 3755 Riverside Drive
>NOT a Jays fan (not even a fan) Ottawa, Ont K1G 3Z4

no signature. std disclaimer. using a 6809 which *does* only allow one register variable.
john@acorn.co.uk (John Bowler) (04/15/91)
In article <1991Apr11.003431.24918@alzabo.ocunix.on.ca> andras@alzabo.ocunix.on.ca (Andras Kovacs) writes:
>umh@vax5.cit.cornell.edu writes:
>>C has provision for register variables, which are supposed to run faster than
>>standard variables. How come one never sees register variables in any code?

(Try looking at traditional UNIX toolkit code :-)

>>Are modern RISC compilers sufficiently good that they automatically make
>>sensible choice of register variables? Can I make my code run slower by using
>>them?
>
> My compiler is Norcroft ARM C V1.54A (admittedly an early release).

Very, very early indeed :-).

> It puts
>the first 10 variables into registers and the others are loaded/stored on
>demand. Now using 'register' means the same as moving the variable declaration
>into the first 10.

Norcroft for the ARM no longer does this. Register colouring was implemented a long, long time ago (well, several years). For a long time Norcroft ignored the register directive completely (apart from the traditional whinge if you tried to take the address of a register variable :-). One of the more recent changes was to start taking note of the declaration again - it really will put the variable into a register if you ask it.

When compiling ``traditional'' unix code I use the definition:

    #define register ___type

(___type is a built-in type with no attributes, originally added to allow a safer offsetof macro; it means that the traditional aberration

    register a;

still compiles!)

> Would be the compiler better, it would analyze the code and
>decide register usage based on that analysis;

In addition to this, Norcroft now does proper variable lifetime analysis, so that the compiler knows when a variable is *really* no longer required, and hence knows when a register can be reused.
> but then 'register'-ing a var
>could have two possible effects:
> 1, It is obeyed - but then better if you know what you are doing otherwise
> you can indeed slow down the code, or
> 2, Disregarded - because the compiler trusts itself that indeed it knows the
> best allocation scheme.

I used to favour (1), but the problems caused by wanton addition of register declarations to code just because it used to be compiled by pcc (;-) have since caused me to favour (2). In theory there are isolated cases where the programmer really does know that the static analysis which the compiler does will give the wrong result. In practice very few programmers have the necessary training to be able to recognise these circumstances, and, of those who can do it, very few have the inclination. IMHO it is normally far better to *restructure the code* so that the static analysis is correct, or doesn't matter. Normally if the compiler cannot understand it, neither can I.

> I assume that good compilers use the second approach - not out of disregard
>to the programmer but either the programmer asks for the right var to be
>register and then it is already; or the compiler KNOWS that the register var
>would cost execution speed and then what is the point of using it?
>
> I hope my view is not too simplistic; could someone with actual experience
>follow up on the subject?

My experience is limited mainly to Norcroft and the ARM. I cannot see any justification for using ``register'' in this environment. If the compiler chooses to put the wrong thing into a register, I would much rather have the compiler writers fix the compiler than attempt to fix all my code. (If fixing the compiler wasn't an option I would choose a different approach, such as recoding the time-critical part in assembler.)
It is certainly true that the use of ``register'' is inherently non-portable; either it is being used for some machine-specific reason (eg, for its effect on variable values after a BSD UNIX vfork system call), or it is being used to provide a speed-up. In the latter case the benefit must be machine dependent; even if you only use one register declaration per function you could still upset compilers which do global optimisation.

The real benefits come from telling the compiler things which it cannot know, rather than attempting to do its job for it. For example:

    {
        int temp;
        function(&temp);    /* Result of function not required */
        ...                 /* temp is used as a temporary variable */
        calculations involving temp
        ...
    }

is much better written:

    {
        {
            int temp;
            function(&temp);    /* Result not required */
        }
        {
            int temp;
            ...                 /* temp is used as a temporary variable */
            calculations involving temp
            ...
        }
    }

The compiler doesn't read the comments (;-) so it doesn't know that the (potential) aliasing of temp at the function call is irrelevant; as a result, the code which performs the calculations is likely to be far less efficiently compiled (certainly if it calls any functions!). All the register declarations in the world will not help this, and yet similar things happen with monotonous regularity in typical C code.

The original poster asked about declaring variables within blocks (as in the second piece of code above) - this is the thing to do! By declaring variables only where the programmer thinks they are needed, and by ``undeclaring'' them (by closing the block) when they are finished with, the programmer helps the compiler by telling it quite clearly how long the values are needed. Most of the time a good compiler can work this out for itself; however, the cases where it cannot are often the cases where its register allocation will screw up.
Incidentally, someone else observed that this can be expensive because the compiler may insist on allocating space for variables when blocks are entered rather than at the entry to the function. Clearly the compiler does not *need* to do this; this is just a quality-of-implementation issue. Norcroft *does* do this, but the code overhead is, at most, a subtraction from the stack pointer on block entry and an addition to it on exit (on the ARM; in some cases the addition or subtraction may be combined with the first operation which stores a value to the stack). This is well worth it because of the saving in stack space - very important in the market at which the ARM architecture is aimed (cheap (<$1000) (RISC) PC's).

John Bowler (jbowler@acorn.co.uk)
jimp@cognos.UUCP (Jim Patterson) (04/16/91)
In article <1991Apr15.134417.24380@bigsur.uucp> coopteam@bcarh680.bnr.ca (Jean Laliberte) writes:
>In article <9526@cognos.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
>>I don't think a compiler can ever know absolutely that a given variable
>>should be in a register whereas another should not, for "performance".
>>
>>[description deleted]
>>[example deleted]
>>
>Is there some reason register allocation couldn't also be dynamic? In
>this example, i would be stored in the register for the most part,
>but when the if() statement evaluates to true, 'store i' and 'load j'
>instructions would be executed, and j would then be the register
>variable (when the if() condition is over, i would be put back in the
>register).

There's no reason, but it's obviously a trade-off as well. If you're loading it for a single reference, it might be more efficient to reference it on the stack. That is, there is a cost to loading and storing registers which needs to be factored into the overall cost computation. You also have to consider code which needs to reference both variables, e.g. the test "j < i" in the example I gave. Actually, the point is really that the compiler need not put j into a register at all, since it's in code that won't be executed most times the program is run.

I agree, it's an over-simplification to assume that only one of the two variables can go into a register. Consider instead a situation where the "for-i" block either does its "free-memory" loop using j, or another piece of logic using a variable "k" which is kept across iterations.

    for (i=0, k=-1; i<10000; ++i) {
        e=malloc(sizeof(element));
        if (!e) {
            for (j=0; j<10000; ++j)
                if (j < i)
                    free(table[j]);
            getout();   /* Leave routine now */
        } else {
            if (k >= 0)
                table[k]->link = e;
            ++k;
        }
        table[i] = e;
    }
    table[k]->link = 0;

If the compiler considers both paths equally likely it may continue to assign the single register to j, or could dump and load j and k into the single register as required.
However, we have information the compiler doesn't, to wit that the "j" path will not normally be executed, so we can see that allocating a register to j but not k, or spending any additional time in the "k" code segment to optimize the "j" code path, will not benefit performance. The best-performing compiler will put k into a register (if it has two), and leave j as a memory variable.

I know that many compiler optimizers are quite sophisticated, and likely do a better job than a naive programmer putting "register" on a few declarations. In fact, even an experienced programmer would not likely find a register allocation strategy that would do better than a good code optimizer over a range of architectures. My point is simply that due to lack of information, no optimizer can provide an optimal register allocation for best performance (i.e. speed) of a specific application and specific data inputs. The example cites a situation where a "statistical" guess as to the code's behaviour is likely to lead to wrong decisions by the optimizer.
--
Jim Patterson Cognos Incorporated
UUCP:uunet!mitel!cunews!cognos!jimp P.O. BOX 9707
PHONE:(613)738-1440 x6112 3755 Riverside Drive
Ottawa, Ont K1G 3Z4
mash@mips.com (John Mashey) (04/18/91)
In article <1991Apr8.193155.3911@vax5.cit.cornell.edu> umh@vax5.cit.cornell.edu writes:
>Something I wonder about.
>C has provision for register variables, which are supposed to run faster than
>standard variables. How come one never sees register variables in any code?
>Are modern RISC compilers sufficiently good that they automatically make
>sensible choice of register variables?

Yes. There are plenty of compilers around, at least {HP PA, MIPS, SPARC, IBM, but I'm sure others} which pretty much ignore whether or not you declare something register, and move variables in and out of registers (not necessarily the same ones) as makes sense.
--
-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD: 408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94086
dhoyt@ux.acs.umn.edu (David Hoyt) (04/18/91)
In <9537@cognos.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
> My point is simply
>that due to lack of information, no optimizer can provide an optimal
>register allocation for best performance (i.e. speed) of a specific
>application and specific data inputs. The example cites a situation
>where a "statistical" guess as to the code's behaviour is likely to
>lead to wrong decisions by the optimizer.

I am taking your quote slightly out of context; however, a compiler that uses static codepath analysis in conjunction with pc sampling (or other tracing mechanisms) can provide optimal (or near-optimal) register allocation. With good benchmarks the compiler would have sufficient information to 'know' the code's behavior in the more general case as well. I'm sure people have done research in this area, but does any commercial compiler take advantage of this kind of analysis?

david | dhoyt@ux.acs.umn.edu
kcollins@convex.com (Kirby L. Collins) (04/18/91)
In <9537@cognos.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
>If the compiler considers both paths equally likely it may
>continue to assign the single register to j, or could dump and load j
>and k into the single register as required. However, we have information
>the compiler doesn't, to wit that the "j" path will not normally be
>executed, so we can see that allocating a register to j but not k, or

Rather than provide a means for the programmer to "hand optimize" register allocation, a more useful solution would be to give the programmer a way to inform the compiler about branch frequency. If the compiler has as much information as the programmer, it ought to do as good a job as the programmer, or better (we hope $-).
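Such a hint did eventually appear, though well after this thread: GCC and Clang provide the `__builtin_expect` extension, which annotates a condition with its expected truth value so the compiler can favour the hot path in register allocation and code layout. A sketch, assuming a GCC-compatible compiler (the wrapper names here are mine, not part of the thread):

```c
#include <stdlib.h>

/* `unlikely` wraps the GCC/Clang __builtin_expect extension;
   the second argument is the value the condition usually takes. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical allocator wrapper: the failure test is marked as
   rarely true, so the compiler can keep the hot path's variables
   in registers and move the recovery code out of line. */
void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (unlikely(p == NULL))
        abort();            /* rarely-executed recovery path */
    return p;
}
```

Profile-guided feedback, which later posts in the thread describe, automates the same judgment instead of relying on the programmer's guess.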
meissner@osf.org (Michael Meissner) (04/19/91)
In article <kcollins.671991559@convex.convex.com> kcollins@convex.com (Kirby L. Collins) writes:
| In <9537@cognos.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
|
| >If the compiler considers both paths equally likely it may
| >continue to assign the single register to j, or could dump and load j
| >and k into the single register as required. However, we have information
| >the compiler doesn't, to wit that the "j" path will not normally be
| >executed, so we can see that allocating a register to j but not k, or
|
| Rather than provide a means for the programmer to "hand optimize" register
| allocation, a more useful solution would be to give the programmer a way
| to inform the compiler about branch frequency. If the compiler has as
| much information as the programmer, it ought to do as good or better a
| job than the programmer (we hope $-).

I seem to remember that Fortran II had a frequency statement, and that it was removed because they discovered the users were usually wrong. I think it's much better if you have some tool that automagically records which way the branches go, and run the program(s) under the appropriate test harnesses.
--
Michael Meissner email: meissner@osf.org phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142
Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?
zs@cs.mu.OZ.AU (Zoltan Somogyi) (04/19/91)
dhoyt@ux.acs.umn.edu (David Hoyt) writes:
> ... a compiler that
>uses static codepath analysis in conjunction with pc sampling (or other
>tracing mechanisms) can provide optimal (or near-optimal) register
>allocation. With good benchmarks the compiler would have sufficient
>information to 'know' the code's behavior in the more general case as well.
>I'm sure people have done research in this area, but does any commercial
>compiler take advantage of this kind of analysis?

There is at least one compiler suite (the one by MIPS) that has this kind of functionality. The idea is that you compile your program with profiling, run it a few times on the kind of data you want to optimize its behavior for (i.e. typical data), and then compile the program again, but this time give the data gathered by the profiling process to the compiler as well. This approach gives the compiler a reasonably close approximation to perfect information about the run-time behaviour of the program.

This information can be used not only for register allocation within procedures, but also for procedure inlining (only at the most frequent call sites) and interprocedural register allocation. The MIPS suite does all these things (and more).

Zoltan Somogyi zs@cs.mu.OZ.AU
johnl@iecc.cambridge.ma.us (John R. Levine) (04/22/91)
In article <MEISSNER.91Apr18150644@curley.osf.org> meissner@osf.org (Michael Meissner) writes:
>I seem to remember that Fortran-II had a frequency statement, and that
>it was removed because they discovered the users were usually wrong.

Actually it was the original Fortran I. The FREQUENCY statement let you specify the relative likelihood of each of the three branches from an IF statement (less than, equal to, and greater than zero) as well as the average number of trips through a DO loop. Fortran II dropped it, because it was infrequently used, and because it made little difference given the architectures and software technology of 1960. Apparently it wasn't even implemented right: the frequency numbers were applied to the wrong branches of the IF, and for several years nobody noticed.

>I think it's much better if you have some tool that automagically
>records which way the branches go, and run the program(s) under the
>appropriate test harnesses.

Indeed, though I wonder how much better that will really do than the usual heuristics, e.g. backward branches will be taken, forward branches won't.
--
John R. Levine, IECC, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@iecc.cambridge.ma.us, {ima|spdcc|world}!iecc!johnl
Cheap oil is an oxymoron.
keithh@tplrd.tpl.oz.au (Keith Harwood) (04/23/91)
In article <kcollins.671991559@convex.convex.com> kcollins@convex.com (Kirby L. Collins) writes:
>Rather than provide a means for the programmer to "hand optimize" register
>allocation, a more useful solution would be to give the programmer a way
>to inform the compiler about branch frequency.

FORTRAN II (sorry about the F... word), circa 1959, had the FREQUENCY statement for exactly that purpose.

Keith Harwood.