ram@nucsrl.UUCP (01/17/87)
Hi. This is my first posting in this newsgroup, so hold your flames if this is a dumb question.

C allows a "register ......" construct, which instructs the compiler to reserve a machine register to store that value. My question: given a fixed number of registers, how many are effectively usable for register declarations? I know this is machine dependent. Could somebody say how many register definitions I could use within a block of code on, say, a VAX? And please go on to mention the CPU/machine that allows the greatest number and the smallest number of such declarations. Is this number fixed, or does it change as the program runs?

Renu Raman
....ihnp4!nucsrl!ram
Northwestern Comp. Sci. Lab
cccmark@ucdavis.UUCP (Mark Nagel) (01/19/87)
In article <3950004@nucsrl.UUCP> ram@nucsrl.UUCP (Raman Renu) writes:
> C allows "register ......" construct which instructs the compiler
>to reserve a machine register to store that value. Now my question is,
>given a fixed number of registers, How many are effectively usable for
>the register declaration. I know this is machine dependent. Could
>somebody say how many register definitions I could use within a block
>of code say for a VAX. And please go on to mention the CPU/Machine that
>allows the greatest number and smallest number of such declarations.

On the machines I've worked on, the register declaration will use up to the total available registers on the CPU and then it is ignored (i.e. no error, just no register declaration either). Depending on the compiler, the register declaration will do anything from telling the compiler to put this variable in an available register or else (Macintosh w/Lightspeed C) to strongly advising the compiler to possibly put the variable in a register if it wouldn't be too much trouble (VAX/VMS C). I am not sure of exact numbers offhand, but they will vary according to compiler as well as CPU.

- Mark Nagel

ucdavis!deneb!cccmark@ucbvax.berkeley.edu (ARPA)
mdnagel@ucdavis (BITNET)
...!{sdcsvax|lll-crg|ucbvax}!ucdavis!deneb!cccmark (UUCP)
mwm@eris.BERKELEY.EDU (Mike Meyer) (01/19/87)
In article <83@ucdavis.UUCP> cccmark@deneb.UUCP (Mark Nagel) writes:
>On the machines I've worked on, the register declaration will use up to
>the total available registers on the CPU and then it is ignored (i.e. no
>error, just no register declaration either). Depending on the compiler,
>the register declaration will do anything from telling the compiler to put
>this variable in an available register or else (Macintosh w/Lightspeed C)
>to strongly advising the compiler to possibly put the variable in a register
>if it wouldn't be too much trouble (VAX/VMS C). I am not sure of exact
>numbers offhand, but they will vary according to compiler as well as CPU.

There is a false implication in the above: that it doesn't hurt to add register declarations. There is at least one compiler out there that effectively allocates registers from the last declared instead of the first, so that blindly adding registers to code can slow the generated code down.

The algorithm I use for allocating registers is as follows:

    Assign one register to each heavily-used variable. While there are fewer registers than N, and there are variables that are touched in loops, or more than a few times outside of loops, repeat for the next least heavily-used variables.

N depends on the expected target machines. Since I'm writing for 68K's and VAXen these days, it's 6. When I wrote for 11's and 4004 family machines, it was 3. Three isn't enough; I don't very often need more than 6, though.

If you feel that you need speed badly enough to want to KNOW which variables go into registers, and can't afford to pull the overhead of a subroutine, you probably oughta be hand-coding that routine in assembler for speed anyway.

	<mike
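The allocation rule described in the post above is something a programmer applies by hand while reading the source, but it can be sketched as a small C routine. Everything named here is hypothetical: the `var_use` record, the `FEW_REFS` cutoff of 3 (one reading of "more than a few times"), and the weighting that ranks any loop use above straight-line references are assumptions made for illustration, not part of the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FEW_REFS 3  /* assumed reading of "more than a few times" */

/* Hypothetical record of how one local variable is used. */
struct var_use {
    const char *name;
    int in_loop;         /* nonzero if the variable is touched inside a loop */
    int refs;            /* references outside of loops */
    int wants_register;  /* output: nonzero if it should be declared register */
};

/* Assumption: loop use outranks any plausible count of other references. */
static long usage_weight(const struct var_use *v)
{
    return (long)v->refs + (v->in_loop ? 100000L : 0L);
}

/* qsort comparator: heaviest use first. */
static int heaviest_first(const void *a, const void *b)
{
    long wa = usage_weight((const struct var_use *)a);
    long wb = usage_weight((const struct var_use *)b);
    return (wa < wb) - (wa > wb);
}

/* Mark at most n variables for register storage; return how many were marked. */
size_t pick_register_vars(struct var_use *vars, size_t count, size_t n)
{
    size_t i, picked = 0;

    assert(vars != NULL);
    qsort(vars, count, sizeof vars[0], heaviest_first);
    for (i = 0; i < count; i++) {
        int qualifies = vars[i].in_loop || vars[i].refs > FEW_REFS;
        vars[i].wants_register = (picked < n) && qualifies;
        if (vars[i].wants_register)
            picked++;
    }
    return picked;
}
```

With n set to 6 this matches the 68K/VAX budget mentioned in the post, and with n set to 3 the PDP-11 one; variables that never qualify stay unmarked even when registers are left over.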
cb@mitre-bedford.arpa (Christopher Byrnes) (01/20/87)
Several people have pointed out that the number of effective register declarations will vary from CPU architecture to architecture. Register declarations can also vary from `C' compiler to `C' compiler.

I've used several different 680x0 compilers. One (of unknown heritage) was effective for up to 6 "integer" or "short" data values (using 68000 registers d2 - d7, with d0 and d1 reserved for function returns and intermediate expressions) AND it was effective for up to 3 "pointer" values (using registers a2 - a4, with a0 and a1 reserved for function returns, a5 as a frame pointer, a6 as the stack pointer and a7 as the program counter). If you were careful, you could have up to 9 registers in use at once.

I'm now using the Sun `C' compiler on version 3.0 of Sun's UNIX system. Could someone tell me what the magic numbers are for effective register declarations on this `C' compiler? Does the `C' optimizer do register allocations correctly anyway (as some good compilers may do now)? I'd rather not have to wade through assembler listings to try and figure these magic numbers out again.

Thanks.

/* the usual disclaimers */

Christopher Byrnes
The MITRE Corporation
Burlington Road M/S A156
Bedford, Mass. 01730
cb@Mitre-Bedford.ARPA
...!decvax!linus!mbunix!cb.UUCP
greg@utcsri.UUCP (Gregory Smith) (01/20/87)
In article <2250@jade.BERKELEY.EDU> mwm@eris.BERKELEY.EDU (Mike Meyer) writes:
>There is a false implication in the above: that it doesn't hurt to add
>register declarations. There is at least one compiler out there that
>effectively allocates registers from the last declared instead of the
>first, so that blindly adding registers to code can slow the generated
>code down.

Furthermore, in most run-time environments, functions are expected to preserve that set of registers which are available for 'register' vars. So if you declare six register variables, they must be pushed on entry and popped on exit. If the function in question does very little, this pushing and popping may become a significant portion of the function's execution time.

It may seem silly to want a large number of register vars on a function that does very little. This problem applies, though, to any function that *Usually* does very little:

    /* update the data structure */
    Update(){
        register foo *first, *last, *current;
        register int loops, item_count, *bats_knees;
        extern int Dirty;	/* dirty flag */
        extern ...

        if(Dirty){
            ... mucho code using register vars ...
            Dirty = FALSE;
        }
    }

Assume Update() is called very frequently, but that Dirty is false on 98% of these calls. Then it looks bad, no? A fix might be to remove the 'Dirty' test from Update(), and use

    if(Dirty)Update();

whenever Update() was called.

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...
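The fix suggested above, testing the flag at the call site so that the register save and restore cost is only paid when there is real work to do, might look roughly like this in ANSI C. The names `UpdateSlow`, `UPDATE` and `rebuild_count` are hypothetical, and the loop is a stand-in for the "mucho code" of the original example.

```c
#include <assert.h>

#define FALSE 0
#define TRUE  1

int Dirty = FALSE;      /* dirty flag, as in the example above */
int rebuild_count = 0;  /* hypothetical counter, only here to make the effect visible */

/* The expensive routine: any register saves happen only when it is entered. */
void UpdateSlow(void)
{
    register int loops, item_count;

    item_count = 0;
    for (loops = 0; loops < 100; loops++)  /* stand-in for the real work */
        item_count += loops;
    assert(item_count == 4950);

    rebuild_count++;
    Dirty = FALSE;
}

/* The cheap test lives at the call site; no call is made when the flag is clear. */
#define UPDATE() do { if (Dirty) UpdateSlow(); } while (0)
```

Callers write `UPDATE();` where they previously wrote `Update();`. A thin wrapper function with no register variables of its own would be another way to get a similar effect, at the cost of one extra call.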
Leisner.Henr@xerox.com (marty) (01/21/87)
Some more insights on coding style using register variables for efficiency.

I've done a lot of real-time coding with Manx Aztec C for 8085 machines. Manx allows one register variable (stored in the BC pair). When the code can be non-reentrant, I've found it effective to pick one good variable (often a pointer to structure) to be register and the rest static. Stack operations are expensive on 8080 architectures and real-time performance is important. Any additional register declarations beyond the first Manx treats as auto (which means stack, which is undesirable). There is a compile time option to convert autos to statics.

I've usually been pretty happy with the assembly language this compiler generates. I feel if at all possible it is better to stay away from assembly language for any meaningful algorithms. Significant optimization can be performed on small machines by "fiddling" with the C source and understanding the compiler output.

Of course, oddball implementations written with the intent of increased speed should be thoroughly commented.

marty
leisner.henr@xerox.com
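The one-register-plus-statics style described above can be illustrated with a short sketch. This is not taken from any Manx documentation; the `sample` structure and the `mean_sample` function are made up for the example, and the only point is the storage classes: a single register pointer, with the remaining scratch variables static so that nothing has to live on the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical data structure for the example. */
struct sample {
    int value;
    struct sample *next;
};

/*
 * Mean of a non-empty list. NOT reentrant: the scratch variables are
 * static, so this must not be called from an interrupt handler that
 * could interrupt another call in progress.
 */
int mean_sample(struct sample *head)
{
    register struct sample *p;  /* the single register variable */
    static int total;           /* static instead of auto: no stack traffic */
    static int count;

    assert(head != NULL);
    total = 0;                  /* statics keep old values, so reset every call */
    count = 0;
    for (p = head; p != NULL; p = p->next) {
        total += p->value;
        count++;
    }
    return total / count;
}
```

The explicit reset at the top matters: unlike autos, the statics still hold whatever the previous call left in them.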
lcc.rich-wiz@locus.ucla.edu (Richard Mathews) (01/21/87)
> There is a false implication in the above: that it doesn't hurt to add
> register declarations. There is at least one compiler out there that
> effectively allocates registers from the last declared instead of the
> first, so that blindly adding registers to code can slow the generated
> code down.

There is another more common case where excess register declarations hurt performance. Consider, for example, the VAX compilers distributed with BSD and SYS V systems. When a function is called the compiler will only cause those registers to be saved which are actually "used" in the function. All registers allocated to register variables are considered to be used. If you declare a variable to be "register" and it is never accessed, you have wasted a "push" of this register (even if the "push" is actually built into the VAX's "calls" instruction).

If the variable is an argument, there is the added problem that the register must be loaded with the argument's value. This will be true on just about any architecture.

Someone here at LOCUS once made the following recommendations. Besides the above, these take into account the fact that moving a pointer in a register may give more of a performance gain than moving an integral variable into one. By "ref", this refers to the "typical" number of run time references (whatever that means) rather than the number of syntactic references. I don't know that I agree with these numbers, but they are probably the right order of magnitude for a lot of machines/compilers. I'd probably make all of these numbers a little lower.

    a. don't make a local pointer a register unless at least 3 refs are made
    b. don't make a parameter pointer a register unless at least 4 refs are made
    c. don't make a local integer a register unless 4 or 5 refs are made
    d. don't make a parameter integer a register unless 6 or more refs are made.
    e. be careful to look for various forms of loops when doing ref counting.

Richard M. Mathews
Locus Computing Corporation
lcc.richard@LOCUS.UCLA.EDU
lcc.richard@UCLA-CS
{ihnp4,trwrb}!lcc!richard
{randvax,sdcrdcf,ucbvax,trwspp}!ucla-cs!lcc!richard
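Rules a through d above amount to a small lookup table, which can be written down as a C helper. The names `var_kind`, `min_refs_for_register` and `worth_register` are hypothetical, and rule c ("4 or 5 refs") is read here as a threshold of 4, which is an assumption. Rule e is not captured: the reference count passed in is expected to be a run-time estimate that already accounts for loops.

```c
#include <assert.h>

/* Hypothetical classification of a variable for the rules above. */
enum var_kind { VAR_POINTER, VAR_INTEGER };

/* Smallest run-time reference count at which "register" is suggested. */
int min_refs_for_register(enum var_kind kind, int is_parameter)
{
    if (kind == VAR_POINTER)
        return is_parameter ? 4 : 3;  /* rules b and a */
    return is_parameter ? 6 : 4;      /* rules d and c (c read as 4: an assumption) */
}

/* Nonzero if a variable with this many estimated references clears the bar. */
int worth_register(enum var_kind kind, int is_parameter, int refs)
{
    assert(refs >= 0);
    return refs >= min_refs_for_register(kind, is_parameter);
}
```

Parameters carry the higher threshold in both rows because, as the post notes, the register has to be loaded with the argument's value on entry.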
jjw@celerity.UUCP (01/26/87)
In article <2250@jade.BERKELEY.EDU> mwm@eris.BERKELEY.EDU (Mike Meyer) writes:
>
>There is a false implication in the above: that it doesn't hurt to add
>register declarations. There is at least one compiler out there that
>effectively allocates registers from the last declared instead of the
>first, so that blindly adding registers to code can slow the generated
>code down.

My version of K&R states (page 193, section 8.1, "Storage Class Specifiers"):

    A register declaration is best thought of as an auto declaration,
    together with a hint to the compiler that the variables will be
    heavily used.
    Only the first few such declarations are effective.
             ^^^^^

This implies to me that a conforming compiler should allocate "registers" starting with the first declaration.
henry@utzoo.UUCP (Henry Spencer) (01/30/87)
The approach I use to registers was chosen based on three facts:

1. Many of the machines my stuff is going to run on -- including the one that is going to be my primary machine soon -- have many registers.

2. Some of the machines, however -- including the one that is my primary machine right now -- have few registers.

3. In general you cannot trust the compiler to be predictable in picking specific "register" variables to actually go into registers. There are too many complications (e.g. the 68000's which have two flavors of registers).

So if you read my code, you'll find me using both "register" and "REGISTER" in declarations. You will find "register" on about three variables per function, which is a not-uncommon number on register-poor machines (e.g. the pdp11/44 on which I write this). You will find "REGISTER" on the rest of the heavily-used variables (or all variables in functions that don't have many local variables). Up at the top of the code you'll find:

    #ifndef REGISTER
    #define REGISTER register
    #endif

and in the Makefile you'll find instructions saying "on a register-poor machine, put '-DREGISTER=' in CFLAGS". (This could be the other way 'round, but on the whole I prefer to consider "good" machines, e.g. register-rich ones here, the default and make the "poor" machines go through the hassle of having to explicitly compensate.)

Which variables get which? A somewhat ad-hoc decision, normally made during final review of working code rather than at code-writing time. Frequently-used variables, especially ones used in loops, get priority. Pointers used with the -> operator generally get priority over numeric variables, since using -> with a non-register pointer is often relatively expensive. Longs get a slight penalty, since my 44 can't put them in registers anyway. Parameters get a slight penalty, since putting one of them in a register often involves more startup overhead than putting a local variable in a register. Anything whose address is taken, of course, gets neither form of register prefix.

One could arguably do better with multiple classes of registers, to express priorities in more detail. In practice I seldom have enough local variables to make this worthwhile, and I doubt that it can be done well enough to show much consistent benefit across a wide range of hardware.
-- 
Legalize            Henry Spencer @ U of Toronto Zoology
freedom!            {allegra,ihnp4,decvax,pyramid}!utzoo!henry
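A short sketch of the two-tier scheme described above, written in ANSI C. The `item` structure and the `heaviest_index` function are invented for illustration; only the `REGISTER` macro and the `-DREGISTER=` convention come from the post. The pointer used with `->` in the loop and the running maximum get plain `register`; the less critical counters get `REGISTER`, which disappears when the file is compiled with `-DREGISTER=`.

```c
#include <assert.h>
#include <stddef.h>

#ifndef REGISTER
#define REGISTER register  /* register-rich machines are the default */
#endif

/* Hypothetical list node for the example. */
struct item {
    int weight;
    struct item *next;
};

/* Position (counting from 0) of the heaviest item in a non-empty list. */
int heaviest_index(const struct item *head)
{
    register const struct item *p;  /* first tier: dereferenced with -> in the loop */
    register int best;              /* first tier: compared on every pass */
    REGISTER int index;             /* second tier: dropped on register-poor machines */
    REGISTER int best_index = 0;

    assert(head != NULL);
    best = head->weight;
    for (p = head->next, index = 1; p != NULL; p = p->next, index++) {
        if (p->weight > best) {
            best = p->weight;
            best_index = index;
        }
    }
    return best_index;
}
```

Compiling the same source with `-DREGISTER=` in CFLAGS leaves only `p` and `best` as register requests, with no edits to the function itself.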
guy@gorodish.UUCP (02/10/87)
>My version of K&R states (page 193, section 8.1, "Storage Class Specifiers"):
>    A register declaration is best thought of as an auto declaration,
>    together with a hint to the compiler that the variables will be
>    heavily used. Only the first few such declarations are effective.
>
>This implies to me that a conforming compiler should allocate "registers"
>starting with the first declaration.

Well, no, I wouldn't go that far. The wording is too loose to be read as a requirement. The use of the word "hint" indicates that such declarations really aren't binding; the mention of the rules used by the compilers around at the time is there just to give the programmer an indication of which items would be put into registers.

It's probably a Good Idea to process declarations in the Ritchie compiler/PCC fashion if you don't use any other information to decide which variables to put into registers, but it's probably a Good Idea to offer the programmer the option of using other information, since they may not know how many and what kind of registers the machine the code is currently being compiled for has.

Fortunately, the ANSI C standard does not promise which declarations will be effective.
jjw@celerity.UUCP (02/12/87)
In response to my claim that compilers which conform to K&R should allocate "registers" starting with the first declaration, guy@sun.UUCP (Guy Harris) indicates:

>Fortunately, the ANSI C standard does not promise which declarations
>will be effective.

I believe this is unfortunate. If I have a program which can effectively use differing numbers of registers, how can I indicate which variables should go into registers in a machine/compiler independent manner?

For example, postulate a function which has more than 6 variables, 6 of which have the following characteristics:

    a       -- Is extremely frequently used. It is critical to performance
               that it be in a register.
    b, c    -- Are used very frequently. They should be in registers to
               obtain optimal performance.
    d, e, f -- Are used frequently. If possible, they should be in
               registers.

    The remaining variables are only used infrequently and should never
    be in registers in preference to those listed.

The question is -- How do I declare these variables so that I get the best performance on machines with 1, 3, 6, 8 ... registers available for register variables? I am trying to code in a machine and compiler independent manner. I do not want to reshuffle the declarations nor to have to re-define a "REGISTER" macro. In fact I don't even want to care about how many register variables the compiler allocates.

K&R's suggestion that the register variables are assigned to registers in order of appearance solves my problem -- I just put the variables in order of importance and let the compilers handle it from there.

The reason for my original posting was because of this. I think the K&R statement, "Only the first few such declarations are effective," is insightful and aids in producing machine independent code. Therefore I am saddened to see it ignored or forgotten.

As Guy says:

>It's probably a Good Idea to process declarations in the Ritchie
>compiler/PCC fashion if you don't use any other information to decide
>which variables to put into registers, but it's probably a Good Idea
>to offer the programmer the option of using other information, since
>they may not know how many and what kind of registers the machine the
>code is currently being compiled for has.

Except that I would replace his "but" with "because". Also, I don't understand what he means by "using other information." I assume the register declarations are the result of considering whatever information the programmer has about the operation of the program.
howard@cpocd2.UUCP (02/13/87)
In article <873@celerity.UUCP> jjw@celerity.UUCP (Jim (JJ) Whelan) writes:
>For example, postulate a function which has more than 6 variables 6 of
>which have the following characteristics:
>    a       -- Is extremely frequently used. It is critical to performance
>               that it be in a register.
>    b, c    -- Are used very frequently. They should be in registers to
>               obtain optimal performance.
>    d, e, f -- Are used frequently. If possible, they should be in
>               registers.
>    The remaining variables are only used infrequently and should never
>    be in registers in preference to those listed.
>The question is -- How do I declare these variables so that I get the best
>performance on machines with 1, 3, 6, 8 ... registers available for register
>variables? I am trying to code in a machine and compiler independent
>manner. I do not want to reshuffle the declarations nor to have to
>re-define a "REGISTER" macro. In fact I don't even want to care about how
>many register variables the compiler allocates.

Boy, there are sure a lot of things you "don't want" to do to get good code!

Seriously, there is an easy way to get approximately what you want, with a fixed amount of work PER MACHINE (not per program). Declare your variables as follows (assuming they are all ints):

    #include "register.h"

    main()
    {
        REG1 int a;
        REG2 int b;
        REG3 int c;
        REG4 int d;
        REG5 int e;
        REG6 int f;
    }

And then have register.h contain (assuming there are 3 usable registers):

    #define REG1 register
    #define REG2 register
    #define REG3 register
    #define REG4
    #define REG5
    #define REG6
    ...

If you do this for all your programs, then when you port to a new machine you only need to change ONE register.h file, once, and you're set! In actuality, this is oversimplified, since some machines have separate registers for integer, floating point, and/or pointer; and a double may eat up 2 registers!

A similar approach can be used to get total portability with respect to the length of short, int, and long. Just define INTn for n = 1 up to the maximum of the machine (example here assumes short=16, int=32, long=64):

    #define INT1  short
    ...
    #define INT16 short
    #define INT17 int
    ...
    #define INT32 int
    #define INT33 long
    ...
    #define INT64 long
    /* ... and likewise for UINT1 to UINT64 */

Then you declare each int with the precise number of bits you actually require:

    REG1 INT5 a;    /* This works with register scheme above. */
    INT16 b;
    INT16 c;
    INT10 d;
    INT18 e;        /* May pay off on a 36-bit machine! */
    INT60 f;        /* Just the thing for a Cray 2? */

Now of course, INT60 isn't very portable, but at least you'll know instantly every place in your program that needs to be fixed. You can also use, e.g.:

    #ifdef INT60
        /* simple code using 60-bit int */
    #else
        /* complex code to emulate 60-bit int */
    #endif

to get a better shot at portability. The drawback of this approach is that it requires you to understand (and declare) exactly how many bits each variable requires; but shouldn't you know that anyway?

(Note to wizards: you will have noticed that using a short instead of an int for a loop variable can cause performance degradation on some machines. If you're that smart, you should be able to figure out how to modify the above scheme to do what you want. "An exercise for the reader". It's not very hard.)

Wouldn't it be nice if UNIX was written this way? Then we wouldn't be arguing about whether or not we're stuck with sizeof(int) == sizeof(long)!

-- 
	Howard A. Landman
	...!intelca!mipos3!cpocd2!howard
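The two macro families described above can be combined in one small compilable sketch. To keep it self-contained, the contents of a hypothetical per-machine `register.h` are inlined at the top, set up for a machine with three usable registers, a `short` of at least 16 bits and an `int` of at least 18 bits (the last is an assumption about the target, checked at compile time). The `weighted_sum` function and the `*_wide_enough` typedef names are made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* ---- what a hypothetical per-machine register.h might contain ---- */
#define REG1 register
#define REG2 register
#define REG3 register
#define REG4            /* beyond the third, the request is dropped */

#define INT10 short
#define INT16 short
#define INT18 int       /* assumes int has at least 18 bits on this machine */

/* Compile-time checks: a negative array size stops the build if a type is too narrow. */
typedef char int16_wide_enough[(sizeof(INT16) * CHAR_BIT >= 16) ? 1 : -1];
typedef char int18_wide_enough[(sizeof(INT18) * CHAR_BIT >= 18) ? 1 : -1];
/* ---- end of register.h ---- */

/* Sum of values[i] * (i + 1) for i in [0, count). */
long weighted_sum(const INT16 *values, INT10 count)
{
    REG1 const INT16 *p = values;  /* most important: indexed on every pass */
    REG2 long total = 0;
    REG3 INT10 i;
    REG4 INT18 weight = 1;         /* REG4 expands to nothing here: a plain auto */

    assert(values != NULL && count >= 0);
    for (i = 0; i < count; i++) {
        total += (long)p[i] * weight;
        weight++;
    }
    return total;
}
```

Porting to a machine with more registers would mean changing only the `REGn` lines, and a machine whose `int` is too narrow for `INT18` fails at the typedef instead of misbehaving at run time.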
flaps@utcsri.UUCP (Alan J Rosenthal) (02/14/87)
In article <873@celerity.UUCP> jjw@celerity.UUCP (Jim (JJ) Whelan) writes:
>K&R's suggestion that the register variables are assigned to registers in
>order of appearance solves my problem -- I just put the variables in order
>of importance and let the compilers handle it from there.

Unfortunately, this is not sufficient in the case of register formals. Consider something like:

    f(n)
    register int n;
    {   register int i;

where it is considered more useful to put 'i' in a register than 'n'. It is not possible to arrange the declarations in the appropriate order, and

    f(nformal)
    int nformal;
    {   register int i,n = nformal;

, which is often recommended, wastes an int on all machines.

-- 
Alan J Rosenthal
UUCP: {backbone}!seismo!mnetor!utgpu!flaps, ubc-vision!utai!utgpu!flaps,
      or utzoo!utgpu!flaps (among other possibilities)
ARPA: flaps@csri.toronto.edu
CSNET: flaps@toronto
BITNET: flaps at utorgpu
greg@utcsri.UUCP (Gregory Smith) (02/17/87)
In article <4141@utcsri.UUCP> flaps@utcsri.UUCP (Alan J Rosenthal) writes:
>>order of appearance solves my problem -- I just put the variables in order
>>of importance and let the compilers handle it from there.
>
>Unfortunately, this is not sufficient in the case of register formals.
>Consider something like:
>
>    f(n)
>    register int n;
>    {   register int i;
>
>where it is considered more useful to put 'i' in a register than 'n'. It
>is not possible to arrange the declarations in the appropriate order, and
>
>    f(nformal)
>    int nformal;
>    {   register int i,n = nformal;
>
>, which is often recommended, wastes an int on all machines.
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

How so? To my understanding, declaring a formal to be register is equivalent to asking for a local register var which is to be initialized to the value of the formal. I.e:

    foo(x)
    register int x;
    {
        statements....

and

    foo(xf)
    {   register int x = xf;
        statements....

are exactly equivalent, provided a register is available. Thus the 'often recommended' solution only wastes an int when the local var (in this case n) cannot be put in a register.

Since most C implementations pass parameters on the stack, declaring a formal to be 'register' results in a copy operation from the stack to the register. This copy is implicit in the foo(x) example; the same copy is explicit in the foo(xf) example.

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...
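The two forms being compared in this exchange can be put side by side in ANSI C. The function names `sum_formal` and `sum_copy` are made up for the example, and whether a given compiler generates the same code for both is not something this sketch can show; it only illustrates that the explicit copy lets the programmer list `i` ahead of `n`, which matters on compilers that honor register requests in declaration order.

```c
#include <assert.h>

/* Register formal: the copy from the argument area into a register is implicit. */
long sum_formal(register int n)
{
    register long total = 0;

    assert(n >= 0);
    while (n > 0)
        total += n--;
    return total;
}

/* Explicit copy: i is declared first, so it is requested ahead of n. */
long sum_copy(int nformal)
{
    register int i, n = nformal;
    register long total = 0;

    assert(nformal >= 0);
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Both functions compute the sum 1 + 2 + ... + n; the only difference is where the register request for the parameter's value appears.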