[comp.arch] Available no. of registers

ram@nucsrl.UUCP (01/16/87)

Hi,
  
   This is my first posting in this newsgroup. So hold your flames if this
is a dumb question.

   C allows the "register ......" construct, which instructs the compiler
to reserve a machine register to store that value.  Now my question is:
given a fixed number of registers, how many are effectively usable for
register declarations?  I know this is machine dependent.  Could
somebody say how many register declarations I could use within a block
of code, say for a VAX?  And please go on to mention the CPU/machine that
allows the greatest number and smallest number of such declarations.

   Is this number fixed, or does it change as the program runs?


                                         Renu Raman
                                     ....ihnp4!nucsrl!ram
                                     Northwestern Comp. Sci. Lab

"it is a poor sort of memory that works backwards"

                                             ---The Queen

sjc@mips.UUCP (Steve Correll) (01/19/87)

In article <3810002@nucsrl.UUCP>, ram@nucsrl.UUCP (Raman Renu) writes:
>    This is my first posting in this newsgroup. So hold your flames if this
> is a dumb question.
> 
>    C allows the "register ......" construct, which instructs the compiler
> to reserve a machine register to store that value.  Now my question is:
> given a fixed number of registers, how many are effectively usable for
> register declarations?  I know this is machine dependent.  Could
> somebody say how many register declarations I could use within a block
> of code, say for a VAX?  And please go on to mention the CPU/machine that
> allows the greatest number and smallest number of such declarations.

Not a dumb question. The number of registers available depends
not only on the machine architecture, but also on which compiler you use,
and sometimes on the level of optimization you ask the compiler to
perform. Sometimes the compiler's documentation will answer your question.
Failing that, experimenting with the "-S" compiler option may reveal it.

A smart compiler will try to keep variables in registers whenever
possible, even without 'register' declarations, and will decide for
itself how best to use the available registers. But the use of pointers
can create 'aliases' (that is, you can refer to the same variable
either directly, or indirectly via a pointer). The compiler cannot in
general know which variable a pointer may be pointing to, and dares not
keep a variable in a register when a pointer reference may access the
'stale' copy of the variable in memory.
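
A minimal sketch of the aliasing problem just described, written in
present-day C.  The function names (bump, aliased_count, unaliased_sum)
are made up for illustration and do not come from any particular compiler
or library.

```c
/* Hypothetical helper: stores through whatever pointer it is handed. */
static void bump(int *p)
{
	*p += 10;
}

/* 'count' has its address taken, so bump() can reach it through 'p'.
 * A compiler that kept 'count' only in a register across the call
 * would return the stale value 1 instead of 11. */
int aliased_count(void)
{
	int count = 1;

	bump(&count);		/* modifies count indirectly */
	return count;		/* must see the updated value */
}

/* 'i' and 'sum' never have their addresses taken, so nothing can
 * alias them; they are safe candidates for registers. */
int unaliased_sum(int n)
{
	register int i, sum = 0;

	for (i = 1; i <= n; i++)
		sum += i;
	return sum;
}
```

Calling aliased_count() should yield 11 and unaliased_sum(4) should yield
10; the point is only that the first function forces 'count' to live
somewhere a pointer can find it, while the second does not.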

Given a smart compiler, it's often effective to put the 'register'
declaration on every variable which you happen to know won't be
referenced by a pointer.  Even if the compiler can't put them all in
registers, it can sometimes perform other optimizations as a result.

Given a not-so-smart compiler, which puts variables into registers only
on command, it's usually harmless to give too many 'register'
declarations--but you'd probably better put the most important variables
first, lest the compiler run out of registers before getting to the
important ones.

I myself have encountered machines with as few as 0 or 1 registers (stack
machines, early minicomputers) and as many as 128. I suspect I haven't run
the gamut.

-- 
...decwrl!mips!sjc						Steve Correll

radford@calgary.UUCP (Radford Neal) (01/21/87)

In article <926@mips.UUCP>, sjc@mips.UUCP (Steve Correll) writes:
> In article <3810002@nucsrl.UUCP>, ram@nucsrl.UUCP (Raman Renu) writes:
> > ...given a fixed number of registers, how many are effectively usable for
> > register declarations?

> Given a smart compiler, it's often effective to put the 'register'
> declaration on every variable which you happen to know won't be
> referenced by a pointer.  Even if the compiler can't put them all in
> registers, it can sometimes perform other optimizations as a result.

Shouldn't a compiler smart enough to allocate variables to registers be
smart enough to see that a local variable is never the operand of the
& (address-of) operator, and thus cannot be referenced by a pointer?
Both of these tasks seem to require that the procedure be completely
scanned before code generation.

> Given a not-so-smart compiler, which puts variables into registers only
> on command, it's usually harmless to give too many 'register'
> declarations--but you'd probably better put the most important variables
> first, lest the compiler run out of registers before getting to the
> important ones.

True. Unfortunately if you use inner scopes this isn't always possible.
E.g.

   proc()
   { register int blat, blog, blit;
     ...
     { register int glip, glop, glurp;
       ...
     }
   }

Even if you know exactly the order of priority for these variables to
be put in registers it's not possible to declare them in a way that will
result in the most important being put in registers on any machine, without
modifying the program to not use inner scopes, which could result in other
problems.

By the way, it is typical for 68000 C compilers to have two entirely 
separate sets of register variables, one for pointers, one for data. So
for example:

   register int a,b,c,d,e,f,g,h,i,j,k,l,m;
   register int *p;

Despite all those previous register int's, p probably gets put in a 
register.

    Radford Neal
    The University of Calgary

mwm@cuuxb.UUCP (Marc W. Mengel) (01/22/87)

In article <759@vaxb.calgary.UUCP> radford@calgary.UUCP writes:
>Shouldn't a compiler smart enough to allocate variables to registers be
>smart enough to see that a local variable is never the operand of the
>& (address-of) operator, and thus cannot be referenced by a pointer?
>Both of these tasks seem to require that the procedure be completely
>scanned before code generation.

Unfortunately, you don't have to take the address of a given variable
to use it, you merely have to take the address of a variable near it
and add an offset to it.  The way C is defined, this is quite legal.

For example, suppose I have a function f, declared as follows:
	f(p)
		struct { int a, b, c; } *p;
	{
	...
	}
And I call it as follows:
	b()
	{
		int a, b, c;

		f( &a );
	}
According to our venerable friends Kernighan & Ritchie, this is legal;
it is also used to some great extent in the older (v6 & v7) unix kernels.
>    Radford Neal
>    The University of Calgary


-- 
 Marc Mengel
 ...!ihnp4!cuuxb!mwm

johnt@microsoft.UUCP (01/22/87)

> Shouldn't a compiler smart enough to allocate variables to registers be
> smart enough to see that a local variable is never the operand of the
> & (address-of) operator, and thus cannot be referenced by a pointer?

God forbid, but there is probably some C program out there that relies
on being able to:
	int a,b, *ptr;

	ptr = &a;
	ptr++;
	/* ptr now points to b */
No flames please, if there is such a program out there, >I< didn't write it.

> . . . Unfortunately if you use inner scopes this isn't always possible.
> E.g.
> 
>    proc()
>    { register int blat, blog, blit;
>      ...
>      { register int glip, glop, glurp;
>        ...
>      }
>    }

I once used a 68000 compiler that would indeed ignore 'register' for inner
scope variables.  I ended up moving all the variable declarations to the head
of the procedure.
-----
John Tupper
...!decvax!microsoft!johnt

guy%gorodish@Sun.COM (Guy Harris) (01/22/87)

> Unfortunately, you don't have to take the address of a given variable
> to use it, you merely have to take the address of a variable near it
> and add an offset to it.  The way C is defined, this is quite legal.

Even if it is legal (which I doubt), it's not necessarily the right thing to
do.  As such, I consider it perfectly legal for an optimizing compiler to
pretend that sort of thing doesn't happen; if you use disgusting tricks like
that, turn the optimizer off.

> For example, suppose I have a function f, declared as follows:
> 	f(p)
> 		struct { int a, b, c; } *p;
> And I call it as follows:
> 	b()
> 	{
> 		int a, b, c;
> 
> 		f( &a );
> 	}
> According to our venerable friends Kernighan & Ritchie, this is legal;
> it is also used to some great extent in the older (v6 & v7) unix kernels.

Oh, really?  Could you tell me where in K&R they claim that a C
implementation must:

	1) put "b" and "c" in memory at all in this case?

	2) if it does put them in memory, put "a", "b", and "c" in memory
	   in the exact same fashion that the members of the structure in
	   question are put in memory?

In fact, because of a historical accident, that one *doesn't* work in our
implementation; "int" structure members are put on 16-bit boundaries, but
"int" automatic variables are put on 32-bit boundaries.  (Our compiler is
derived from the MIT 68000 compiler, which put 32-bit "int"s on 16-bit
boundaries because that's all it had to do; we later modified it to put
"int" automatic variables on 32-bit boundaries for efficiency on the 68020,
but didn't change the alignment rules for structure members for reasons of
binary compatibility).
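
A sketch of the portable alternative, assuming the caller is free to
declare a real structure: once the three values are members of one
struct, their layout is the same on both sides of the call.  The names
(struct triple, sum_triple, call_with_struct) are hypothetical.

```c
/* One declaration shared by caller and callee, so both agree on layout. */
struct triple {
	int a, b, c;
};

/* Callee: reads the members through a pointer to the real struct type. */
int sum_triple(const struct triple *p)
{
	return p->a + p->b + p->c;
}

/* Caller: passes the address of an actual struct triple, not the
 * address of the first of three unrelated automatic variables. */
int call_with_struct(void)
{
	struct triple t;

	t.a = 1;
	t.b = 2;
	t.c = 3;
	return sum_triple(&t);
}
```

Here call_with_struct() should return 6 regardless of how the compiler
aligns automatic variables or structure members.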

chris@mimsy.UUCP (Chris Torek) (01/22/87)

In article <1029@cuuxb.UUCP> mwm@cuuxb.UUCP (Marc W. Mengel) writes:
>Unfortunately, you don't have to take the address of a given variable
>to use it, you merely have to take the address of a variable near it
>and add an offset to it.  The way C is defined, this is quite legal.

Legal indeed, but the result is undefined.  Taking the address of
an addressable object is all right, and offsetting that is fine;
but no mention is made of where the resultant pointer points, if
anywhere.

>For example, suppose I have a function f, declared as follows:
>	f(p)  struct { int a, b, c; } *p; { ... }
>And I call it as follows:
>	b() { int a, b, c; f( &a ); }
>According to our venerable friends Kernighan & Ritchie, this is legal;

How so?

>it is also used to some great extent in the older (v6 & v7) unix kernels.

The kernels have many instances of code such as

	struct {	/* ioctl arguments, e.g. */
		int	fd;
		int	cmd;
		caddr_t	addr;
	} *uap;

The kernel, however, is one of those inherently machine specific
things; it is allowed to cheat.  This cheating is (generally) very
carefully controlled so as not to cause trouble with the compiler.

There is nothing in C that requires that independently declared
variables be adjacent in terms of pointer arithmetic.  Arrays must
be so, but on, e.g., a Pyramid, the thirteenth scalar stack variable
is nowhere near the first twelve:

	f()
	{
		int a, b, c, d, e, f, g, h, i, j, k, l;	/* registers */
		int m, n, o;	/* stack */
		...

As the registers in the Pyramid are addressable, the compiler simply
puts the first twelve variables that fit into the twelve free local
registers.  (There are twelve standard parameter registers as well.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP:	seismo!mimsy!chris	ARPA/CSNet:	chris@mimsy.umd.edu

tim@amdcad.UUCP (Tim Olson) (01/23/87)

In article <1029@cuuxb.UUCP>, mwm@cuuxb.UUCP (Marc W. Mengel) writes:
+---------------------------------------
| Unfortunately, you don't have to take the address of a given variable
| to use it, you merely have to take the address of a variable near it
| and add an offset to it.  The way C is defined, this is quite legal.
| 
| For example, suppose I have a function f, declared as follows:
| 	f(p)
| 		struct { int a, b, c; } *p;
| 	{
| 	...
| 	}
| And I call it as follows:
| 	b()
| 	{
| 		int a, b, c;
| 
| 		f( &a );
| 	}
| According to our venerable friends Kernighan & Ritchie, this is legal;
| it is also used to some great extent in the older (v6 & v7) unix kernels.
+---------------------------------------

What is legal is not necessarily correct.  The above will not run on Vaxen
with pcc (or, for that matter, any machine/compiler combination where stacks 
grow in a direction opposite that of global variables).  Anyone who uses
such a construct should be put out of our misery.

[ By the way, I first thought that this wouldn't work on any machine I knew
of, but it *actually works* on an IBM RT-PC!  Really! ]

	-- Tim Olson
	Advanced Micro Devices

meissner@dg_rtp.UUCP (01/23/87)

In article <1029@cuuxb.UUCP> mwm@cuuxb.UUCP (Marc W. Mengel) writes:
> 
> Unfortunately, you don't have to take the address of a given variable
> to use it, you merely have to take the address of a variable near it
> and add an offset to it.  The way C is defined, this is quite legal.
> 
> For example, suppose I have a function f, declared as follows:
> 	f(p)
> 		struct { int a, b, c; } *p;
> 	{
> 	...
> 	}
> And I call it as follows:
> 	b()
> 	{
> 		int a, b, c;
> 
> 		f( &a );
> 	}
> According to our venerable friends Kernighan & Ritchie, this is legal;
> it is also used to some great extent in the older (v6 & v7) unix kernels.

    And this poor coding method will produce obscure results when used on
compilers that try to optimize the variables (group small items near the
beginning of the stack frame so instructions with small displacements can
be used, or to try and group frequently used items together).  Even if
the compiler doesn't reorganize things, it plays havoc if the stack goes
in the opposite direction.
-- 
	Michael Meissner, Data General
	...mcnc!rti-sel!dg_rtp!meissner

radford@calgary.UUCP (01/24/87)

In article <1029@cuuxb.UUCP>, mwm@cuuxb.UUCP (Marc W. Mengel) writes:
> Unfortunately, you don't have to take the address of a given variable
> to use it, you merely have to take the address of a variable near it
> and add an offset to it.  The way C is defined, this is quite legal.
> 
> For example, suppose I have a function f, declared as follows:
> 	f(p)
> 		struct { int a, b, c; } *p;
> 	{
> 	...
> 	}
> And I call it as follows:
> 	b()
> 	{
> 		int a, b, c;
> 
> 		f( &a );
> 	}
> According to our venerable friends Kernighan & Ritchie, this is legal;
> it is also used to some great extent in the older (v6 & v7) unix kernels.

I know C is poorly designed, but is it really THIS bad?! Did Kernighan and
Ritchie suffer from this level of brain damage? Does anyone know?

I assume the proposed ANSI standard doesn't countenance this sort of 
behaviour...

Anyway, regardless of what old C manuals said, this is sufficiently 
ridiculous programming that compiler writers should ignore it and 
shoot any programmers who complain.

   Radford Neal

guy@gorodish.UUCP (01/25/87)

>God forbid, but there is probably some C program out there that relies
>on being able to:
>	int a,b, *ptr;
>
>	ptr = &a;
>	ptr++;
>	/* ptr now points to b */

No program that relies on that is correct C, so no compiler is
obliged to make it work.  In fact, the sooner such programs *are*
prevented, the better off we'll all be.

If the person *really* wanted that, they should have done

	int array[2], *ptr;

	ptr = &array[0];
	ptr++;
	/* ptr now points to array[1] */

and, if for some mysterious reason they insist on referring to these
objects as "a" and "b", they could do

	#define	a	array[0]
	#define	b	array[1]

Nowhere does C guarantee that contiguously-declared objects will be
given contiguous addresses in memory.
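
A minimal sketch of the array form in use, assuming a present-day C
compiler.  The helper names (sum_ints, sum_pair) are invented for
illustration.  Every pointer formed here stays inside the array or one
past its end, which is the case the language does define.

```c
#include <stddef.h>

/* Walks 'n' ints starting at 'base' with pointer arithmetic.  This is
 * only meaningful when 'base' points into an array of at least n ints. */
int sum_ints(const int *base, size_t n)
{
	const int *p;
	int sum = 0;

	for (p = base; p != base + n; p++)
		sum += *p;
	return sum;
}

/* The two values live in one array, so stepping from the first to the
 * second is ordinary array traversal. */
int sum_pair(void)
{
	int array[2];

	array[0] = 3;
	array[1] = 4;
	return sum_ints(array, 2);
}
```

By contrast, declaring "int a, b;" and calling sum_ints(&a, 2) asks the
helper to step from a to b, which is exactly the step nothing in the
language promises will land on b.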

fouts@orville.UUCP (01/25/87)

In article <763@vaxb.calgary.UUCP> radford@calgary.UUCP (Radford Neal) writes:
>I know C is poorly designed, but is it really THIS bad?! Did Kernighan and
>Ritchie suffer from this level of brain damage? Does anyone know?
>

A pointer points to an address in memory.  Manipulating items at offsets from
that address by using that pointer is just fine.  A common valid use is to
set a char pointer to the first character in a string and then increment the
pointer to examine successive characters.  There is nothing in the language
that requires the programmer to stop at the end of the string. . .

>I assume the proposed ANSI standard doesn't countenance this sort of 
>behaviour...
>

Of course it does.  A classic use of the above idiom (sort of from K&R, but
from memory; I don't have my copy handy) is:

char buf[9];			/* room for "a string" and its '\0' */
char *s = buf;
char *t = "a string";

. . .

while (*s++ = *t++) ;

. . .

This copies the string pointed to by t into memory, starting at
the place where s originally points.

>Anyway, regardless of what old C manuals said, this is sufficiently 
>ridiculous programming that compiler writers should ignore it and 
>shoot any programmers who complain.

Actually it would be quite difficult to allow C constructs such as those
above while prohibiting illegal references through pointers, but this isn't
surprising; even Ada can't protect you from misusing a pointer reference.

The magic in C is that any kind of pointer can be assigned the address of
any kind of data, provided that the data has an address (lvalue in K&R)
via the classic cast:

Some_kind X;
Other_kind *Y ;

Y = (Other_kind *) &X;

This is used heavily in C.  The semantics of many Un*x kernels and more than
a few applications programs would break without it.  The problem occurs when
you do pointer arithmetic on Y.  It is perfectly legitimate to write:

X = *(Some_kind *) (Y+1);

and get at whatever is one sizeof(Other_kind) beyond the original X in
memory.

Of course, if pointers have the same "shape," it becomes possible to leave
out the type casts, making the code difficult to port.

guy@gorodish.UUCP (01/25/87)

>A pointer points to an address in memory.  Manipulating items at offsets from
>that address by using that pointer is just fine.

A pointer points to an object, not an "address in memory".  You may
be able to get away with treating pointers as if they pointed to
addresses in memory in most C implementations, but there is nothing
in any C specification that requires this to work.

Furthermore, manipulating items at offsets from that address by using
that pointer is ONLY fine if the pointer points to a member of an
array.  See K&R, 7.4 "Additive operators".

> A common valid use is to set a char pointer to the first character in a
>string and then increment the pointer to examine successive characters.

But in this case the pointer is pointing to a member of an array,
since a string is stored in an array of characters.

>There is nothing in the language that requires the programmer to stop at
>the end of the string. . .

Depends on what you mean by "in the language".  There is nothing in
the language that guarantees that the pointer you get if you *don't*
stop at the end of the string has a meaningful, useful, or
intuitively-obvious value, or that it has the value that you'd
"expect" it to have.  It may happen to do so on existing
implementations, but don't count on it doing so on all
implementations.

>The problem occurs when you do pointer arithmetic on Y.  It is perfectly
>legitimate to write:
>
>X = *(Some_kind *) (Y+1);

Oh, no it isn't!  See 7.4 "Additive operators".  "Y" doesn't point to
an object in an array.

eager@amd.UUCP (01/27/87)

In article <213@ames.UUCP>, fouts@orville (Marty Fouts) writes:
> 
> The magic in C is that any kind of pointer can be assigned the address of
> any kind of data, provided that the data has an address (lvalue in K&R)
> via the classic cast:
> 
> Some_kind X;
> Other_kind *Y ;
> 
> Y = (Other_kind *) &X;
> 
> 
> This is used heavily in C.  The semantics of many Un*x kernels and more than
> a few applications programs would break without it.  The problem occurs when
> you do pointer arithmetic on Y.  It is perfectly legitimate to write:
> 
> X = *(Some_kind *) (Y+1);
> 
> and get at whatever is one sizeof(Other_kind) beyond the original X in
> memory.
> 
> Of course, if pointers have the same "shape," it becomes possible to leave
> out the type casts, making the code difficult to port.

Oh, what a mess!!  One of the parts of C which is cleaned up in the ANSI
draft standard is the use of casts to convert pointer values.  
In a conforming compiler, none of the above is true.  
The only conversions that may be performed on pointers are
to convert them to (void *) and back to the SAME type.  On some machines,
the conversion to (char *) and back is possible, but this is implementation
defined behavior, and not portable.

jdb@mordor.UUCP (01/27/87)

>Oh, what a mess!!  One of the parts of C which is cleaned up in the ANSI
>draft standard is the use of casts to convert pointer values.  ...
>The only conversions that may be performed on pointers are
>to convert them to (void *) and back to the SAME type.  On some machines,
>the conversion to (char *) and back is possible, but this is implementation
>defined behavior, and not portable. 

*Implicit* pointer conversion (without casts) is only permitted between
a pointer to an object and a pointer to void.  However, the draft
ANSI standard (October 10, 1986, section 3.3.4) says that explicit
conversions between other pointer types are allowed:

	A pointer to an object of one type may be converted to a
	pointer to an object of another type.  The resulting pointer
	might not be valid if it is improperly aligned for the type
	of object pointed to.  It is guaranteed, however, that a
	pointer to an object of a given alignment may be converted
	to a pointer to an object of a less strict alignment and
	back again; the result shall compare equal to the original
	pointer.  (An object that has type char has the least
	strict alignment.)
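
A small sketch of the round trip that passage guarantees, assuming a
present-day C compiler; the function name round_trip_ok is hypothetical.
The pointer goes to the least strictly aligned type (char *) and back,
and is compared with the original.

```c
/* Returns 1 if an int pointer survives conversion to char * and back. */
int round_trip_ok(void)
{
	int value = 42;
	int *orig = &value;
	char *bytes = (char *) orig;	/* to a less strict alignment */
	int *back = (int *) bytes;	/* and back again */

	return back == orig && *back == 42;
}
```

Going the other way (say, from an arbitrary char * to int *) is the
case the quoted text warns about: the result might not be valid if the
address is not suitably aligned for an int.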
-- 
  John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
  MILNET: jdb@mordor.s1.gov		(415) 422-0758
  UUCP: ...!ucbvax!decwrl!mordor!jdb 	...!seismo!mordor!jdb

radford@calgary.UUCP (01/27/87)

In article <213@ames.UUCP>, fouts@orville (Marty Fouts) writes:
> In article <763@vaxb.calgary.UUCP> radford@calgary.UUCP (Radford Neal) writes:
> >I know C is poorly designed, but is it really THIS bad?! Did Kernighan and
> >Ritchie suffer from this level of brain damage? Does anyone know?
> >
> 
> A pointer points to an address in memory.  Manipulating items at offsets from
> that address by using that pointer is just fine.  A common valid use is to
> set a char pointer to the first character in a string and then increment the
> pointer to examine successive characters.  There is nothing in the language
> that requires the programmer to stop at the end of the string. . .

Look, give me credit for some intelligence. When I wrote the first comment
above I was quite aware of all the usual uses of pointer arithmetic. I am
quite aware that a compiler must allocate the members of an array in such a
fashion that pointer arithmetic scans through them. I am NOT aware of any
requirement that local variables be allocated storage in the order they are
declared. Hence my remark expressing incredulity at someone claiming that
a compiler had to do this.

Note that it is quite irrelevant whether the compiler gives a compile-time
error for programs depending on this.

    Radford Neal

greg@utcsri.UUCP (Gregory Smith) (01/28/87)

In article <213@ames.UUCP> fouts@orville.UUCP (Marty Fouts) writes:
>A pointer points to an address in memory.  Manipulating items at offsets from
>that address by using that pointer is just fine.  A common valid use is to
>set a char pointer to the first character in a string and then increment the
>pointer to examine successive characters.  There is nothing in the language
>that requires the programmer to stop at the end of the string. . .

Given:
struct foo{
	char blat[3];
	float klotz;
} bar;

.. there is no 'i' such that the address of 'bar.blat[i]' is the same as
the address of 'bar.klotz' on any machine. Actually there has been a long
discussion on comp.lang.c recently about how to generate such an 'i'
portably, and it isn't pretty.
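
For comparison, a sketch using the offsetof macro from <stddef.h> in
standard C, a facility that postdates K&R.  It reports where the
compiler actually placed 'klotz', padding included, without indexing
past the end of 'blat'.  The function names are made up for
illustration.

```c
#include <stddef.h>

struct foo {
	char blat[3];
	float klotz;
};

/* Byte offset of klotz from the start of the struct, as laid out by
 * this particular compiler. */
size_t klotz_offset(void)
{
	return offsetof(struct foo, klotz);
}

/* Bytes of padding, if any, between the end of blat and klotz.
 * blat is the first member, so it starts at offset 0. */
size_t padding_before_klotz(void)
{
	return klotz_offset() - sizeof(((struct foo *) 0)->blat);
}
```

The padding is whatever the implementation chooses, which is why no
single 'i' can be written down that works everywhere.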

>The magic in C is that any kind of pointer can be assigned the address of
>any kind of data, provided that the data has an address (lvalue in K&R)
>via the classic cast:

This is exactly the kind of 'magic' that has given C an undeserved bad
name.
>
>Some_kind X;
>Other_kind *Y ;
>
>Y = (Other_kind *) &X;
>
>
>This is used heavily in C.
 Only when dealing with chunks of 'raw' memory, i.e. with malloc().
>  The semantics of many Un*x kernels and more than
>a few applications programs would break without it.  The problem occurs when
>you do pointer arithmetic on Y.  It is perfectly legitimate to write:
>
>X = *(Some_kind *) (Y+1);
>
>and get at whatever is one sizeof(Other_kind) beyond the original X in
>memory.

What that is, if anything, depends on many things. I'd rather use the
facilities built officially into the language to address things in a
well-defined way.

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...

tom@uw-warp.UUCP (01/29/87)

In article <170@microsoft.UUCP>, johnt@microsoft.UUCP (John Tupper) writes:
> I once used a 68000 compiler that would indeed ignore 'register' for inner
> scope variables.  I ended up moving all the variable declarations to the head
> of the procedure.
> -----
> John Tupper
> ...!decvax!microsoft!johnt

My Microsoft C Compiler Version 3.00 also disregards such declarations.
A pity, since I find that's where they're most useful.  But I'd really rather
just let an intelligent compiler handle register allocation most of the time.

-- 
Tom May.	uw-beaver!uw-nsr!uw-warp!tom