[net.lang.c] pointer question

plw@mgweed.UUCP (Pete Wilson) (03/10/84)

[the usual sacrifice...]

	Pardon my ignorance, but I have a question about 'pointers'.
Somehow I got the idea that a 'pointer' is an address, that is, a machine
memory address and I think this is the source of my confusion.
	Why are pointers to different data types different lengths? Why
SHOULD they be different lengths?

	It would be helpful if someone could enlighten me in this area.
Send replies by mail to:

		...!ihnp4!mgweed!plw

					Pete Wilson
					ATT/T-CP Montgomery Works
					Montgomery, Ill.
					...!ihnp4!mgweed!plw

barmar@mit-eddie.UUCP (Barry Margolin) (03/11/84)

Yes, a pointer is an address, but there are different kinds of
addresses.  Consider a machine where four characters fit into a word,
and one integer fits into a word (remember, this is just a
hypothetical).  Thus, you can reference four times as many characters as
you can integers, so you need at least two more bits in a char * than in
an int *.  On some machines this is not really a problem: normal pointers
are byte pointers, and word pointers are just pointers whose low order
two bits are both zero.  Other machines actually have two different
kinds of pointers.
-- 
			Barry Margolin
			ARPA: barmar@MIT-Multics
			UUCP: ..!genrad!mit-eddie!barmar
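
The hypothetical machine described above can be sketched in C.  This is
only an illustration: the names char_addr, word_of and is_word_aligned
are invented here, and the four-characters-per-word layout is the
assumption taken from the post, not a description of any real machine.

```c
#include <assert.h>

/* Hypothetical machine: memory is addressed by word, 4 characters per word. */
enum { CHARS_PER_WORD = 4 };

/* A character address is the word address plus two extra low-order bits
   that select one of the four characters inside that word. */
unsigned long char_addr(unsigned long word_addr, unsigned char_index)
{
    assert(char_index < CHARS_PER_WORD);
    return word_addr * CHARS_PER_WORD + char_index;
}

/* Recover the word that contains a given character address. */
unsigned long word_of(unsigned long caddr)
{
    return caddr / CHARS_PER_WORD;
}

/* A character address names the start of a word only when its low two
   bits are both zero. */
int is_word_aligned(unsigned long caddr)
{
    return (caddr % CHARS_PER_WORD) == 0;
}
```

With this encoding, char_addr(5, 3) is character address 23, which
word_of maps back to word 5.  A word pointer is then just a character
address whose low two bits are zero.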

hans@log-hb.UUCP (Hans Albertsson) (03/14/84)

[come & get it ]

	I know one machine where pointers are not just of different
	lengths, but are different data structures as well. I'm thinking
	of the DEC PDP-10, a 36-bit machine (VERY nice...), where
	most stuff gets put in an integral number of full words, and a
	pointer to one of these objects is 18 bits, that is, a half-word.
	
	However, characters can have one of many representations,
	usually, but not always, 7 bits each (other usual sizes are
	5 or 6 bits), and an integral number of characters is packed
	into a word, usually five 7-bit chars (= 35 bits) per word. The
	leftover bit is ignored... This means that a "pointer" to a
	character string is a complex structure, containing A) the
	starting address, B) the "Byte Size" in bits plus the "Byte
	Number" you're presently looking at. It takes up MOST of a
	36-bit word.
	
	Other examples probably exist.
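
A rough model of such a character pointer, written in C.  This is a
simplified sketch and not the real PDP-10 byte-pointer layout: the
struct byte_ptr fields and the helpers pack5, load_byte and next_byte
are hypothetical names, and a 36-bit word is simulated in the low bits
of a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t word36;            /* only the low 36 bits are used */

/* The three pieces of information the post lists. */
struct byte_ptr {
    unsigned addr;                  /* word address */
    unsigned size;                  /* byte size in bits, e.g. 7 */
    unsigned num;                   /* byte number in the word, 0 = leftmost */
};

/* Pack five 7-bit characters into one word; the low-order bit is unused. */
word36 pack5(const char s[5])
{
    word36 w = 0;
    for (int i = 0; i < 5; i++)
        w = (w << 7) | (word36)(s[i] & 0x7F);
    return w << 1;
}

/* Fetch the byte that bp designates. */
word36 load_byte(const word36 *mem, struct byte_ptr bp)
{
    assert(bp.size >= 1 && bp.size <= 36);
    assert(bp.num < 36 / bp.size);
    unsigned shift = 36 - bp.size * (bp.num + 1);
    word36 mask = ((word36)1 << bp.size) - 1;
    return (mem[bp.addr] >> shift) & mask;
}

/* Step to the next byte, moving to the next word when this one is used up. */
struct byte_ptr next_byte(struct byte_ptr bp)
{
    assert(bp.size >= 1 && bp.size <= 36);
    if (++bp.num == 36 / bp.size) {
        bp.num = 0;
        bp.addr++;
    }
    return bp;
}
```

Stepping such a pointer is more than adding one to an address, which is
why it cannot share a representation with an 18-bit word pointer.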

hamilton@uiucuxc.UUCP (03/15/84)

uiucuxc!hamilton    Mar 14 00:08:00 1984

check out pdp-10 addressing sometime...

martillo@ihuxt.UUCP (Yehoyaqim Shemtob Martillo) (03/16/84)

There is another side to the pointer question.

In most books, a pointer is a variable.  The number it contains is the
address.  The pointer itself has an address as well, unless it is
declared as a register variable.  On some machines even registers are
addressable parts of the address space.

usenet@abnjh.UUCP (usenet) (03/17/84)

Another example of a machine where pointers to ints are significantly
different from pointers to chars is the Sperry 1100 series.  They are
also 36 bit word addressable machines with add-ons to the instruction
set to allow working with character strings.  (Note, I said strings,
NOT individual characters -- and no, the latter is not a special case
of the former.  If you want the details, ask me by email, there is no
point in bothering the net with them.)  Extracting a character from a
word in memory is possible, but complicated.  Due to 'upward
compatibility' mania the basic pointer to integer will fit into a
half-word (18 bits) (And yes, this means you cannot have a program
bigger than 256K words. And yes, this is a pain.  And yes, later
models in the series allow you to get around this restriction by
standing on your head and spitting thru your teeth, but it is still a
pain.)  But a pointer to a character string encodes the starting word,
the starting character and the character size (usually a quarter
word -- 9 bits -- but other sizes are possible, including 6, 12
and 18 bits).  All of this uses up most of a 36 bit word.  The string
length is encoded in a separate word.

Rick Thomas.
ihnp4!abnjh!usenet   ihnp4!abnji!rbt

ajs@hpfcla.UUCP (03/21/84)

Then, for another  example, the HP9000 Series 500 has a lot of different
pointer  types, all of which fit in 32 bits.  The  context and the first
bit or two clue you into whether the pointer is absolute, is relative to
a register, or is a combination of a code or data segment  number and an
offset.  The relative  data  pointer  types can be used  interchangeably
with any  instruction  that  takes a  relative  data  pointer,  allowing
flexibility in what "kind" of memory the data is stored in, e.g.  how it
is accessed.

Alan Silverstein, Hewlett-Packard Fort Collins Systems Division, Colorado
{ihnp4 | hplabs}!hpfcla!ajs, 303-226-3800 x3053, N 40 31'31" W 105 00'43"

mcferrin@inuxc.UUCP (P McFerrin) (04/05/84)

A pointer has a length associated with it.  The reason for this is that it
permits such operations as auto-increment/decrement.  If the pointer
is to a char, then an autoincrement operation would bump the pointer
value by 1.  On the other hand, if the pointer were to a 64-byte complex
structure, an autoincrement operation would increment the pointer value by 64.
The compiler takes care of the boundary alignment problems, which will
result in the length of a pointer being adjusted to maintain proper
boundary alignment.

It is dangerous to increment a pointer yourself (e.g. p = p + 4).
By doing so, you are restricting your programs to a specific machine type,
since word size varies among different machines (an act against portability).

					Paul McFerrin
					AT&T Consumer Products
					...ihnp4!inuxc!mcferrin

ron@brl-vgr.ARPA (Ron Natalie <ron>) (04/05/84)

It is not dangerous to increment the pointer yourself:

	p = p + 4

means increment p by 4 items of the pointed-to type.  Totally machine
independent.

-Ron

gwyn@brl-vgr.ARPA (Doug Gwyn ) (04/06/84)

What you say is true but your example (p = p + 4) may be confusing.
If p is declared as a pointer to a 4-byte something, then
	++p;
and
	p = p + 1;
are equivalent.
	p = p + 4;
on the other hand makes the pointer point to the 4th something after
what it used to point to.  I think you meant using things like int p's:
	int	p;
	p = p + 4;
	do something with *(something *)p;
which is a (not recommended!) way to use an int as a pointer to
something.  If p is properly declared, then you would have had to
write your example in a way that makes the shenanigans obvious:
	something	*p;	/* 4 bytes per something */
	p = (something *)((int)p + 4);
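
The distinction can be sketched in C as follows.  The function names
are invented for illustration, and the byte-level version goes through
char * rather than the (int) cast shown above, since char * arithmetic
is the form C defines for stepping by bytes; that substitution is this
sketch's choice, not part of the original example.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary pointer arithmetic: n is scaled by sizeof(int) by the compiler. */
const int *advance_elements(const int *p, int n)
{
    return p + n;
}

/* Byte-level arithmetic: nothing is scaled, so the caller has to know
   the element size.  This is the machine-dependent style. */
const int *advance_bytes(const int *p, int n)
{
    assert(n % (int)sizeof(int) == 0);   /* keep the result aligned */
    return (const int *)((const char *)p + n);
}

/* Distance in bytes between two pointers into the same array. */
ptrdiff_t byte_distance(const int *from, const int *to)
{
    return (const char *)to - (const char *)from;
}
```

advance_elements(a, 4) always yields &a[4].  advance_bytes(a, 4) yields
&a[1] only where sizeof(int) happens to be 4, and that hard-wired
assumption is what hurts portability.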

barmar@mit-eddie.UUCP (Barry Margolin) (04/06/84)

--------------------
It is dangerous to increment pointer yourself (e.g. p = p + 4).
By doing so, you are restricting your programs to a specific machine type
since word size varies among different machines (an act against portability).

					Paul McFerrin
--------------------

I don't have my K&R handy, but I thought that adding an int to a pointer
was defined to scale the int appropriately.  There is nothing special
about auto-increment in this regard.
-- 
			Barry Margolin
			ARPA: barmar@MIT-Multics
			UUCP: ..!genrad!mit-eddie!barmar

archiel@hercules.UUCP (Archie Lachner) (04/07/84)

As I read Kernighan and Ritchie (K & R), the assertion that the statement

	p = p + 4;

where p is a pointer, is dangerous and non-portable is completely wrong.
In section 5.4, p. 98, it is stated that when an int or constant is
added to a pointer, it is first scaled by the size of the object the pointer
points to.  For example, and I am quoting directly now,

	"The construction

		p + n

	means the n-th object beyond the one p currently points to."

As far as I am concerned, K & R contains the definition of the C language.
Expressions such as "p + 4" above are not dangerous and are completely portable,
at least to machines and systems where the C compiler implements the language
correctly.
-- 

				Archie Lachner
				Logic Design Systems Division
				Tektronix, Inc.

uucp:    {ucbvax,decvax,pur-ee,cbosg,ihnss}!tektronix!teklds!archiel
CSnet:   archiel@tek
ARPAnet: archiel.tek@rand-relay

cobb@utcsstat.UUCP () (04/09/84)

If anyone out there is still not convinced about "p = p + 4" portability,
after reading Archie's explanation (Archie Lachner), I submit the
following:
	given a structure

		struct X {
			int a;
			int b;
			int c;
		};

	with a register pointer p of type struct X *;

	p = p + 4 	generates

	addl	$48,reg_n,reg_n		; add decimal 48 to register
					; VAX - berkeley C compiler

and
	add	$30,reg_n		; add octal 30 (24) to register
					; 11/70 - Bell C Compiler

	As you might've guessed, reg_n is p.
	In each case, the actual value added is <size of struct> * 4.

	Thus, any fears about "p = p + n" are quite unfounded.
	In fact, K&R states that a construct p[i] is actually
	treated as *(p + i).

If this is still not convincing enough, c'est la vie !!

	OZ

				Ozan S. Yigit (utzoo!utcsstat!cobb)
				(utzoo!yetti!ozan)
				Dept. of Computer Science
				York University
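
The same check can be made without reading assembler output.  A minimal
sketch, assuming nothing beyond the struct X given above; bytes_added
and same_element are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

struct X {
    int a;
    int b;
    int c;
};

/* Number of bytes the address actually moves when n is added to a
   struct X pointer. */
size_t bytes_added(size_t n)
{
    static struct X arr[16];
    struct X *p = arr;

    assert(n <= 16);                 /* stay within (or just past) arr */
    return (size_t)((char *)(p + n) - (char *)p);
}

/* p[i] and *(p + i) name the same object, so their addresses agree. */
int same_element(struct X *p, size_t i)
{
    return &p[i] == p + i;
}
```

On any conforming compiler bytes_added(4) equals 4 * sizeof(struct X),
whatever sizeof(struct X) turns out to be on that machine.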

barrett@hpcnob.UUCP (04/09/84)

How is p = p + 4 not portable when p is a pointer?  From K & R p 188:

   The result of the + operator  is the sum of the  operands.  A pointer to
   an  object  in an array and a value of any  integral  type may be added.
   The latter is in all cases converted to an address offset by multiplying
   it by the length of the object to which the pointer  points.  The result
   is a pointer of the same type as the original  pointer, and which points
   to  another  object in the same  array,  appropriately  offset  from the
   original  object.  Thus if P is a pointer  to an object in an array, the
   expression P+1 is a pointer to the next object in the array.

   No further type combinations are allowed for pointers.

It would seem that this is non-portable  only if used as something other
than an index into the  original  storage  area of the  pointer.  Do you
have an example of another case?

Dave Barrett (hplabs!hp-dcd!barrett)
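
A short sketch of the portable case, where the pointer stays inside the
array it started in.  The helper names are hypothetical.  Forming (but
not dereferencing) a pointer one element past the end is relied on
here; that allowance is spelled out in later ANSI C rather than in the
passage quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Add up count ints by stepping a pointer through the array. */
long sum(const int *first, size_t count)
{
    const int *end = first + count;  /* one past the last element */
    const int *p;
    long total = 0;

    for (p = first; p != end; p = p + 1)
        total += *p;
    return total;
}

/* Pointer subtraction is scaled the same way as addition, so the
   result is an element index, not a byte count. */
size_t index_of(const int *base, const int *p)
{
    assert(p >= base);
    return (size_t)(p - base);
}
```

Both functions behave identically regardless of the size of an int,
because every offset is expressed in elements.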