[comp.lang.c] Available No. of Registers

ram@nucsrl.UUCP (01/17/87)

Hi,
  
   This is my first posting in this newsgroup. So hold your flames if this
is a dumb question.

   C allows the "register ......" construct, which instructs the compiler
to reserve a machine register to store that value.  Now my question is,
given a fixed number of registers, how many are effectively usable for
register declarations?  I know this is machine dependent.  Could
somebody say how many register declarations I could use within a block
of code, say for a VAX?  And please go on to mention the CPU/machine that
allows the greatest number and the smallest number of such declarations.

   Is this number fixed or does it change as the program runs?


                                         Renu Raman
                                     ....ihnp4!nucsrl!ram
                                     Northwestern Comp. Sci. Lab

cccmark@ucdavis.UUCP (Mark Nagel) (01/19/87)

In article <3950004@nucsrl.UUCP> ram@nucsrl.UUCP (Raman Renu) writes:
>   C allows the "register ......" construct, which instructs the compiler
>to reserve a machine register to store that value.  Now my question is,
>given a fixed number of registers, how many are effectively usable for
>register declarations?  I know this is machine dependent.  Could
>somebody say how many register declarations I could use within a block
>of code, say for a VAX?  And please go on to mention the CPU/machine that
>allows the greatest number and the smallest number of such declarations.

On the machines I've worked on, the register declaration will use up to
the total available registers on the CPU and then it is ignored (i.e. no
error, just no register declaration either).  Depending on the compiler,
the register declaration will do anything from telling the compiler to put
this variable in an available register or else (Macintosh w/Lightspeed C)
to strongly advising the compiler to possibly put the variable in a register
if it wouldn't be too much trouble (VAX/VMS C).  I am not sure of exact
numbers offhand, but they will vary according to compiler as well as CPU.

- Mark Nagel

ucdavis!deneb!cccmark@ucbvax.berkeley.edu               (ARPA)
mdnagel@ucdavis                                         (BITNET)
...!{sdcsvax|lll-crg|ucbvax}!ucdavis!deneb!cccmark      (UUCP)

mwm@eris.BERKELEY.EDU (Mike Meyer) (01/19/87)

In article <83@ucdavis.UUCP> cccmark@deneb.UUCP (Mark Nagel) writes:
>On the machines I've worked on, the register declaration will use up to
>the total available registers on the CPU and then it is ignored (i.e. no
>error, just no register declaration either).  Depending on the compiler,
>the register declaration will do anything from telling the compiler to put
>this variable in an available register or else (Macintosh w/Lightspeed C)
>to strongly advising the compiler to possibly put the variable in a register
>if it wouldn't be too much trouble (VAX/VMS C).  I am not sure of exact
>numbers offhand, but they will vary according to compiler as well as CPU.

There is a false implication in the above: that it doesn't hurt to add
register declarations. There is at least one compiler out there that
effectively allocates registers from the last declared instead of the
first, so that blindly adding registers to code can slow the generated
code down.

The algorithm I use for allocating registers is as follows: Assign one
register to the most heavily-used variable. While fewer than N variables
have been declared register, and there are still variables that are touched
in loops, or more than a few times outside of loops, repeat for the next
most heavily-used variable.

N depends on the expected target machines. Since I'm writing for 68K's
and VAXen these days, it's 6. When I wrote for 11's and 4004 family
machines, it was 3. Three isn't enough; I don't very often need more
than 6, though.
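
A rough sketch of the selection rule above, written out as C for
concreteness.  The struct, the field names, the loop weighting factor and
the cutoff for "more than a few times" (3 here) are all assumptions made
for illustration; the post describes a by-hand procedure, not a program.

```c
#include <stddef.h>

/* Hypothetical description of one local variable. */
struct var_use {
    const char *name;
    int uses_in_loops;      /* references inside loops */
    int uses_elsewhere;     /* references outside loops */
    int wants_register;     /* output: 1 if we would write "register" */
};

/* Assumed cutoff for "more than a few times outside of loops". */
#define FEW_USES 3

/* Loop references count for more; the factor 10 is a guess. */
static int weight(const struct var_use *v)
{
    return v->uses_in_loops * 10 + v->uses_elsewhere;
}

static int qualifies(const struct var_use *v)
{
    return v->uses_in_loops > 0 || v->uses_elsewhere > FEW_USES;
}

/* Mark at most n variables, heaviest first; return how many were marked. */
static int pick_registers(struct var_use *vars, size_t count, int n)
{
    int picked = 0;
    size_t i;

    for (i = 0; i < count; i++)
        vars[i].wants_register = 0;

    while (picked < n) {
        struct var_use *best = NULL;

        for (i = 0; i < count; i++) {
            struct var_use *v = &vars[i];

            if (v->wants_register || !qualifies(v))
                continue;
            if (best == NULL || weight(v) > weight(best))
                best = v;
        }
        if (best == NULL)
            break;          /* nothing left that is worth a register */
        best->wants_register = 1;
        picked++;
    }
    return picked;
}
```

With n set to 6 this marks every qualifying variable up to six; with n set
to 3 only the three heaviest ones are marked.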

If you feel that you need speed badly enough to want to KNOW which
variables go into registers, and can't afford to pull the overhead of
a subroutine, you probably oughta be hand-coding that routine in
assembler for speed anyway.

	<mike

cb@mitre-bedford.arpa (Christopher Byrnes) (01/20/87)

  Several people have pointed out that the number of effective register
declarations will vary from CPU architecture to architecture.  Register
declarations can also vary from `C' compiler to `C' compiler.  I've used
several different 680x0 compilers.  One (of unknown heritage) was effective
for up to 6 "integer" or "short" data values (using 68000 registers d2 - d7
with d0 and d1 reserved for function returns and intermediate expressions)
AND it was effective for up to 3 "pointer" values (using registers a2 - a4,
with a0 and a1 reserved for function returns, a5 as a frame pointer, a6 as
the stack pointer and a7 as the program counter).  If you were careful,
you could have up to 9 registers in use at once.

  I'm now using the Sun `C' compiler on version 3.0 of Sun's UNIX system.
Could someone tell me what the magic numbers are for effective register
declarations on this `C' compiler?  Does the `C' optimizer do register
allocations correctly anyways (as some good compilers may do now)?  I'd
rather not have to wade through assembler listings to try and figure these
magic numbers out again.  Thanks.


/* the usual disclaimers */		Christopher Byrnes
					The MITRE Corporation
					Burlington Road
					M/S A156
					Bedford, Mass. 01730

					cb@Mitre-Bedford.ARPA
					...!decvax!linus!mbunix!cb.UUCP

greg@utcsri.UUCP (Gregory Smith) (01/20/87)

In article <2250@jade.BERKELEY.EDU> mwm@eris.BERKELEY.EDU (Mike Meyer) writes:
>There is a false implication in the above: that it doesn't hurt to add
>register declarations. There is at least one compiler out there that
>effectively allocates registers from the last declared instead of the
>first, so that blindly adding registers to code can slow the generated
>code down.

Furthermore, in most run-time environments, functions are expected to
preserve that set of registers which are available for 'register' vars.
So if you declare six register variables, they must be pushed on entry
and popped on exit. If the function in question does very little, this
pushing and popping may become a significant portion of the function's
execution time.

It may seem silly to want a large number of register vars on a function
that does very little. This problem applies, though, to any function
that *Usually* does very little:

/* update the data structure */

Update(){
	register foo *first, *last, *current;
	register int loops, item_count, *bats_knees;
	extern int Dirty;		/* dirty flag */
	extern ...

	if(Dirty){
		... mucho code using register vars ...
		Dirty = FALSE;
	}
}

Assume Update() is called very frequently, but that Dirty is false on 98%
of these calls. Then it looks bad, no? A fix might be to remove the 'Dirty'
test from Update(), and use if(Dirty)Update(); whenever Update() was
called.
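
A minimal sketch of that fix, assuming a stand-in body for Update() and a
hypothetical UPDATE_IF_DIRTY() macro so that callers cannot forget the
test.  The counter update_calls exists only to make the effect visible; it
is not part of the original example.

```c
#ifndef FALSE
#define FALSE 0
#define TRUE  1
#endif

int Dirty = FALSE;          /* dirty flag, as in the example above */
static int update_calls;    /* hypothetical counter, for demonstration only */

/* The expensive routine: the register save/restore cost is paid here,
 * so it should be entered only when there is work to do. */
void Update(void)
{
    register int loops, item_count;

    update_calls++;
    item_count = 0;
    for (loops = 0; loops < 100; loops++)
        item_count += loops;        /* stand-in for the real work */
    (void)item_count;
    Dirty = FALSE;
}

/* Hypothetical wrapper: test the flag in the caller, before the call. */
#define UPDATE_IF_DIRTY()  do { if (Dirty) Update(); } while (0)
```

On the 98% of calls where Dirty is false, the caller now pays for one test
and no function entry at all.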

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...

Leisner.Henr@xerox.com (marty) (01/21/87)

Some more insights on coding style using register variables for
efficiency.

I've done a lot of real-time coding with Manx Aztec C for 8085 machines.
Manx allows one register variable (stored in the BC pair).

When the code can be non-reentrant, I've found it effective to pick one
good variable (often a pointer to structure) to be register and the rest
static.  Stack operations are expensive on 8080 architectures and
real-time performance is important.

Any additional register declarations beyond the first Manx treats as
auto (which means stack, which is undesirable).  There is a compile time
option to convert autos to statics.  

I've usually been pretty happy with the assembly language this compiler
generates.  I feel if at all possible it is better to stay away from
assembly language for any meaningful algorithms.  Significant
optimization can be performed on small machines by "fiddling" with the C
source and understanding the compiler output.  Of course, oddball
implementations written with the intent of increasing speed should be
thoroughly commented.

marty
leisner.henr@xerox.com

lcc.rich-wiz@locus.ucla.edu (Richard Mathews) (01/21/87)

> There is a false implication in the above: that it doesn't hurt to add
> register declarations. There is at least one compiler out there that
> effectively allocates registers from the last declared instead of the
> first, so that blindly adding registers to code can slow the generated
> code down.

There is another more common case where excess register declarations hurt
performance.  Consider, for example, the VAX compilers distributed with
BSD and SYS V systems.	When a function is called the compiler will only
cause those registers to be saved which are actually "used" in the function.
All registers allocated to register variables are considered to be used.
If you declare a variable to be "register" and it is never accessed, you
have wasted a "push" of this register (even if the "push" is actually
built into the VAX's "calls" instruction).

If the variable is an argument, there is the added problem that the register
must be loaded with the argument's value.  This will be true on just about
any architecture.

Someone here at LOCUS once made the following recommendations.  Besides the
above, these take into account the fact that moving a pointer into a register
may give more of a performance gain than moving an integral variable into
one.  Here "ref" means the "typical" number of run-time references
(whatever that means) rather than the number of syntactic references.  I
don't know that I agree with these numbers, but they are probably the right
order of magnitude for a lot of machines/compilers.  I'd probably make all
of these numbers a little lower.

	a. don't make a local pointer a register unless at least 3 refs are
		made
	b. don't make a parameter pointer a register unless at least 4 refs
		are made
	c. don't make a local integer a register unless 4 or 5 refs are made
	d. don't make a parameter integer a register unless 6 or more refs
		are made.
	e. be careful to look for various forms of loops when doing ref counting.
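
The thresholds in rules a through d can be written down as a small lookup,
sketched here under the assumption that the lower bound is meant where a
range is given ("4 or 5").  The enum and function names are hypothetical,
and the ref count is still something the programmer has to estimate by
hand.

```c
/* Hypothetical classification of a candidate variable. */
enum var_kind {
    LOCAL_POINTER,
    PARAM_POINTER,
    LOCAL_INTEGER,
    PARAM_INTEGER
};

/* Minimum "typical" run-time references before "register" is suggested. */
static int min_refs(enum var_kind kind)
{
    switch (kind) {
    case LOCAL_POINTER: return 3;   /* rule a */
    case PARAM_POINTER: return 4;   /* rule b */
    case LOCAL_INTEGER: return 4;   /* rule c: "4 or 5", lower bound taken */
    case PARAM_INTEGER: return 6;   /* rule d */
    }
    return 6;                       /* unknown kind: be conservative */
}

/* Nonzero if the estimated ref count meets the threshold for this kind. */
static int worth_register(enum var_kind kind, int refs)
{
    return refs >= min_refs(kind);
}
```

Parameters need more references than locals of the same type because of
the extra load on function entry described above.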

Richard M. Mathews
Locus Computing Corporation		       lcc.richard@LOCUS.UCLA.EDU
					       lcc.richard@UCLA-CS
				 {ihnp4,trwrb}!lcc!richard
       {randvax,sdcrdcf,ucbvax,trwspp}!ucla-cs!lcc!richard

jjw@celerity.UUCP (01/26/87)

In article <2250@jade.BERKELEY.EDU> mwm@eris.BERKELEY.EDU (Mike Meyer) writes:
>
>There is a false implication in the above: that it doesn't hurt to add
>register declarations. There is at least one compiler out there that
>effectively allocates registers from the last declared instead of the
>first, so that blindly adding registers to code can slow the generated
>code down.

My version of K&R states (page 193, section 8.1, "Storage Class Specifiers"):
	A register declaration is best thought of as an auto declaration,
	together with a hint to the compiler that the variables will be
	heavily used.  Only the first few such declarations are effective.
	                        ^^^^^

This implies to me that a conforming compiler should allocate "registers"
starting with the first declaration.

henry@utzoo.UUCP (Henry Spencer) (01/30/87)

The approach I use to registers was chosen based on three facts:

1. Many of the machines my stuff is going to run on -- including the one
	that is going to be my primary machine soon -- have many registers.

2. Some of the machines, however -- including the one that is my primary
	machine right now -- have few registers.

3. In general you cannot trust the compiler to be predictable in picking
	specific "register" variables to actually go into registers.  There
	are too many complications (e.g. the 68000s, which have two flavors
	of registers).

So if you read my code, you'll find me using both "register" and "REGISTER"
in declarations.  You will find "register" on about three variables per
function, which is a not-uncommon number on register-poor machines (e.g.
the pdp11/44 on which I write this).  You will find "REGISTER" on the rest
of the heavily-used variables (or all variables in functions that don't have
many local variables).  Up at the top of the code you'll find:

	#ifndef REGISTER
	#define	REGISTER	register
	#endif

and in the Makefile you'll find instructions saying "on a register-poor
machine, put '-DREGISTER=' in CFLAGS".  (This could be the other way 'round,
but on the whole I prefer to consider "good" machines, e.g. register-rich
ones here, the default and make the "poor" machines go through the hassle
of having to explicitly compensate.)
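
A short sketch of how a function might look under this convention.  The
list type and the function are invented for illustration; only the
REGISTER macro and the "-DREGISTER=" override come from the description
above.

```c
#include <stddef.h>

#ifndef REGISTER
#define REGISTER register   /* register-rich default; -DREGISTER= turns it off */
#endif

/* Hypothetical list node, used only for this example. */
struct node {
    int value;
    struct node *next;
};

/* Sum the values in a list and report its length through *lenp. */
long sum_list(struct node *head, int *lenp)
{
    register struct node *p;    /* used with -> inside the loop: first tier */
    REGISTER long total = 0;    /* heavily used, but second tier */
    REGISTER int len = 0;       /* likewise */

    for (p = head; p != NULL; p = p->next) {
        total += p->value;
        len++;
    }
    *lenp = len;
    return total;
}
```

On a register-poor machine, building with '-DREGISTER=' leaves p as the
only register request and turns total and len into plain autos.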

Which variables get which?  A somewhat ad-hoc decision, normally made during
final review of working code rather than at code-writing time.  Frequently-
used variables, especially ones used in loops, get priority.  Pointers used
with the -> operator generally get priority over numeric variables, since
using -> with a non-register pointer is often relatively expensive.  Longs
get a slight penalty, since my 44 can't put them in registers anyway.
Parameters get a slight penalty, since putting one of them in a register
often involves more startup overhead than putting a local variable in a
register.  Anything whose address is taken, of course, gets neither form
of register prefix.

One could arguably do better with multiple classes of registers, to express
priorities in more detail.  In practice I seldom have enough local variables
to make this worthwhile, and I doubt that it can be done well enough to show
much consistent benefit across a wide range of hardware.
-- 
Legalize			Henry Spencer @ U of Toronto Zoology
freedom!			{allegra,ihnp4,decvax,pyramid}!utzoo!henry

guy@gorodish.UUCP (02/10/87)

>My version of K&R states (page 193, section 8.1, "Storage Class Specifiers"):
>	A register declaration is best thought of as an auto declaration,
>	together with a hint to the compiler that the variables will be
>	heavily used.  Only the first few such declarations are effective.
>
>This implies to me that a conforming compiler should allocate "registers"
>starting with the first declaration.

Well, no, I wouldn't go that far.  The wording is too loose to be
read as a requirement.  The use of the word "hint" indicates that
such declarations really aren't binding; the mention of the rules
used by the compilers around at the time is there just to give the
programmer an indication of which items would be put into registers.
It's probably a Good Idea to process declarations in the Ritchie
compiler/PCC fashion if you don't use any other information to decide
which variables to put into registers, but it's probably a Good Idea
to offer the programmer the option of using other information, since
they may not know how many and what kind of registers the machine the
code is currently being compiled for has.

Fortunately, the ANSI C standard does not promise which declarations
will be effective.

jjw@celerity.UUCP (02/12/87)

In response to my claim that compilers which conform to K&R should allocate
"registers" starting with the first declaration guy@sun.UUCP (Guy Harris)
indicates:
>Fortunately, the ANSI C standard does not promise which declarations
>will be effective.

I believe this is unfortunate.  If I have a program which can effectively
use differing numbers of registers how can I indicate which variables
should go into registers in a machine/compiler independent manner?

For example, postulate a function which has more than 6 variables, 6 of
which have the following characteristics:

	a -- Is extremely frequently used.  It is critical to performance
	     that it be in a register.
	b, c -- Are used very frequently.  They should be in registers to
	        obtain optimal performance.
	d, e, f -- Are used frequently.  If possible, they should be in
	           registers.
	The remaining variables are only used infrequently and should never
	be in registers in preference to those listed.

The question is -- How do I declare these variables so that I get the best
performance on machines with 1, 3, 6, 8 ... registers available for register
variables?  I am trying to code in a machine and compiler independent
manner.  I do not want to reshuffle the declarations nor to have to
re-define a "REGISTER" macro.  In fact I don't even want to care about how
many register variables the compiler allocates.

K&R's suggestion that the register variables are assigned to registers in
order of appearance solves my problem -- I just put the variables in order
of importance and let the compilers handle it from there.  The reason for my
original posting was because of this.  I think the K&R statement, "Only the
first few such declarations are effective," is insightful and aids in
producing machine independent code.  Therefore I am saddened to see it
ignored or forgotten.
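
As an illustration of the ordering idea, here is a sketch in which the
register declarations are simply listed from most to least important.  The
function and its variables are invented for the example, and it only pays
off on a compiler that does honor the first few declarations.

```c
/* Hypothetical function: register declarations in order of importance. */
long weighted_sum(const int *data, int count, int scale)
{
    register const int *a = data;   /* critical: dereferenced every iteration */
    register long b = 0;            /* very frequent: running total */
    register int c = count;         /* very frequent: loop counter */
    register int d = 1;             /* frequent: current weight */
    int rarely_used = scale;        /* read once; never competes for a register */

    while (c-- > 0)
        b += (long)*a++ * d++;
    return b * rarely_used;
}
```

A compiler with one register available would keep a; one with three would
keep a, b and c; one with more would take d as well.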

As Guy says:
>It's probably a Good Idea to process declarations in the Ritchie
>compiler/PCC fashion if you don't use any other information to decide
>which variables to put into registers, but it's probably a Good Idea
>to offer the programmer the option of using other information, since
>they may not know how many and what kind of registers the machine the
>code is currently being compiled for has.

Except that I would replace his "but" with "because".  Also, I don't
understand what he means by "using other information."  I assume the
register declarations are the result of considering whatever information
the programmer has about the operation of the program.

howard@cpocd2.UUCP (02/13/87)

In article <873@celerity.UUCP> jjw@celerity.UUCP (Jim (JJ) Whelan) writes:
>For example, postulate a function which has more than 6 variables, 6 of
>which have the following characteristics:
>	a -- Is extremely frequently used.  It is critical to performance
>	     that it be in a register.
>	b, c -- Are used very frequently.  They should be in registers to
>	        obtain optimal performance.
>	d, e, f -- Are used frequently.  If possible, they should be in
>	           registers.
>	The remaining variables are only used infrequently and should never
>	be in registers in preference to those listed.
>The question is -- How do I declare these variables so that I get the best
>performance on machines with 1, 3, 6, 8 ... registers available for register
>variables?  I am trying to code in a machine and compiler independent
>manner.  I do not want to reshuffle the declarations nor to have to
>re-define a "REGISTER" macro.  In fact I don't even want to care about how
>many register variables the compiler allocates.

Boy, there are sure a lot of things you "don't want" to do to get good code!
Seriously, there is an easy way to get approximately what you want, with a
fixed amount of work PER MACHINE (not per program).  Declare your variables
as follows (assuming they are all ints):

	#include	"register.h"

	main()
	{
	REG1 int	a;
	REG2 int	b;
	REG3 int	c;
	REG4 int	d;
	REG5 int	e;
	REG6 int	f;
	}

And then have register.h contain (assuming there are 3 usable registers):

	#define	REG1	register
	#define	REG2	register
	#define	REG3	register
	#define	REG4
	#define	REG5
	#define	REG6
	...

If you do this for all your programs, then when you port to a new machine
you only need to change ONE register.h file, once, and you're set!

In actuality, this is oversimplified, since some machines have separate
registers for integer, floating point, and/or pointer; and a double may
eat up 2 registers!
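
One way to cope with separate register classes is to keep a separate
priority list per class.  The sketch below assumes a hypothetical machine
with two usable integer registers and one usable pointer register; the
IREGn and PREGn names are made up for the example.

```c
/* register.h sketch for a hypothetical machine with 2 usable integer
 * registers and 1 usable pointer register. */
#define IREG1   register    /* integer class, highest priority */
#define IREG2   register
#define IREG3               /* no third integer register: expands to nothing */
#define PREG1   register    /* pointer class, highest priority */
#define PREG2               /* no second pointer register */

/* Example use: count the occurrences of ch in the string s. */
int count_char(const char *s, int ch)
{
    PREG1 const char *p = s;
    IREG1 int n = 0;
    IREG2 int c = ch;

    while (*p != '\0')
        if (*p++ == c)
            n++;
    return n;
}
```

Porting to another machine still means editing one header; there are just
two (or three, with floating point) columns of macros instead of one.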

A similar approach can be used to get total portability with respect to
the length of short, int, and long.  Just define INTn for n = 1 up to
the maximum of the machine (example here assumes short=16, int=32, long=64):

	#define	INT1	short
	...
	#define	INT16	short
	#define	INT17	int
	...
	#define	INT32	int
	#define	INT33	long
	...
	#define	INT64	long
	/* ... and likewise for UINT1 to UINT64 */

Then you declare each int with the precise number of bits you actually require:

	REG1 INT5	a;	/* This works with register scheme above. */
	INT16		b;
	INT16		c;
	INT10		d;
	INT18		e;	/* May pay off on a 36-bit machine! */
	INT60		f;	/* Just the thing for a Cray 2? */

Now of course, INT60 isn't very portable, but at least you'll know instantly
every place in your program that needs to be fixed.  You can also use, e.g.:

	#ifdef INT60
		/* simple code using 60-bit int */
	#else
		/* complex code to emulate 60-bit int */
	#endif

to get a better shot at portability.  The drawback of this approach is that
it requires you to understand (and declare) exactly how many bits each variable
requires; but shouldn't you know that anyway?  (Note to wizards: you will
have noticed that using a short instead of an int for a loop variable can
cause performance degradation on some machines.  If you're that smart, you
should be able to figure out how to modify the above scheme to do what you
want.  "An exercise for the reader".  It's not very hard.)
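
One possible answer to that exercise, offered as a sketch rather than the
poster's own solution: keep the INTn names for storage and add a second
family of names for variables where speed matters more than size.  The
FINTn names are hypothetical, and the sketch assumes that plain int is the
machine's natural (fastest) integer size.

```c
#include <limits.h>

/* Storage-oriented names, as in the scheme above: the smallest
 * standard type that holds at least n bits. */
#define INT10   short
#define INT16   short
#if INT_MAX >= 131071       /* int can hold an 18-bit signed value */
#define INT18   int
#else
#define INT18   long
#endif

/* Speed-oriented names (hypothetical): never narrower than int. */
#define FINT10  int
#define FINT16  int
#define FINT18  INT18

/* Sum the first n elements of v.  The array uses the compact type,
 * the loop counter uses the fast one. */
long sum_first(const INT16 *v, FINT16 n)
{
    FINT16 i;
    long total = 0;

    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Arrays and structure members would use INTn to save space, while loop
counters and scratch variables would use FINTn.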

Wouldn't it be nice if UNIX was written this way?  Then we wouldn't be arguing
about whether or not we're stuck with sizeof(int) == sizeof(long)!
-- 

	Howard A. Landman
	...!intelca!mipos3!cpocd2!howard

flaps@utcsri.UUCP (Alan J Rosenthal) (02/14/87)

In article <873@celerity.UUCP> jjw@celerity.UUCP (Jim (JJ) Whelan) writes:
>K&R's suggestion that the register variables are assigned to registers in
>order of appearance solves my problem -- I just put the variables in order
>of importance and let the compilers handle it from there.

Unfortunately, this is not sufficient in the case of register formals.
Consider something like:

	f(n)
	register int n;
	{
	    register int i;

where it is considered more useful to put 'i' in a register than 'n'.  It
is not possible to arrange the declarations in the appropriate order, and

	f(nformal)
	int nformal;
	{
		register int i,n = nformal;

, which is often recommended, wastes an int on all machines.

-- 

Alan J Rosenthal

UUCP: {backbone}!seismo!mnetor!utgpu!flaps, ubc-vision!utai!utgpu!flaps,
      or utzoo!utgpu!flaps (among other possibilities)
ARPA: flaps@csri.toronto.edu
CSNET: flaps@toronto
BITNET: flaps at utorgpu

greg@utcsri.UUCP (Gregory Smith) (02/17/87)

In article <4141@utcsri.UUCP> flaps@utcsri.UUCP (Alan J Rosenthal) writes:
>>order of appearance solves my problem -- I just put the variables in order
>>of importance and let the compilers handle it from there.
>
>Unfortunately, this is not sufficient in the case of register formals.
>Consider something like:
>
>	f(n)
>	register int n;
>	{    register int i;
>
>where it is considered more useful to put 'i' in a register than 'n'.  It
>is not possible to arrange the declarations in the appropriate order, and
>
>	f(nformal)
>	int nformal;
>	{	register int i,n = nformal;
>
>, which is often recommended, wastes an int on all machines.
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^
How so? To my understanding, declaring a formal to be register is
equivalent to asking for a local register var which is to be initialized
to the value of the formal. I.e:

foo(x) register int x; {  statements....
and
foo(xf) { register int x = xf; statements....

are exactly equivalent, provided a register is available. Thus the 'often
recommended' solution only wastes an int when the local var (in this
case n) cannot be put in a register.

Since most C implementations pass parameters on the stack, declaring
a formal to be 'register' results in a copy operation from the stack
to the register. This copy is implicit in the foo(x) example; the same
copy is explicit in the foo(xf) example.
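
The two forms side by side, as a small compilable sketch.  It uses
prototype-style definitions rather than the old-style ones quoted above so
that current compilers accept it; the function names and bodies are
invented for the example.

```c
/* Register formal: the copy from the argument area into a register
 * (if one is available) is implicit on entry. */
int sum_formal(register int n)
{
    register int i;
    int total = 0;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Ordinary formal plus a register local: the same copy, written out.
 * Listing i first gives it priority over n on compilers that honor
 * declarations in order. */
int sum_copy(int nformal)
{
    register int i, n = nformal;
    int total = 0;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Both functions compute the same result; they differ only in who writes the
copy and in which variable gets first claim on a register.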

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...