[comp.lang.c] Clarification needed on Pointers/Arrays

sam@lfcs.ed.ac.uk (S. Manoharan) (02/21/89)

To clarify my doubts regarding pointers/arrays, I ran the
following code. Result: I am more confused. Could someone
help me out?

I need to reason the likely outcome of 
   /* LINE 1 */  and /* LINE 2 */ 

main()
{
   static char *a[] = { "123456789", "bull", "fred", "foo" };
   /* array of char-pointers */

   printf("Entries %d\n",sizeof(a)/sizeof(char *));
   foo1(a);
   foo2(a);
}

foo1(b)
char *b;
{
   int i;

   printf("Entries %d\n",sizeof(b)/sizeof(char *));
   /* LINE 1 */ for ( i = 0; i < 10; ++i ) printf("%d: %c\n",i,b+i);
}

foo2(b)
char *b[];
{
   int i;
   /* LINE 2 */ printf("Entries %d\n",sizeof(b)/sizeof(char *));
   for ( i = 0; i < 4; ++i ) printf("%d: %s\n",i,b[i]);
}

--------
Output of the program was:

Entries 4
Entries 1
0: ^P              <--- Expected ... 1
1: ^Q              <--- Expected ... 2
2: ^R              <--- Expected ... 3
3: ^S              <--- Expected ... 4
4: ^T              <--- Expected ... 5
5: ^U              <--- Expected ... 6
6: ^V              <--- Expected ... 7
7: ^W              <--- Expected ... 8
8: ^X              <--- Expected ... 9
9: ^Z
Entries 1          <--- Expected ... 4
0: 123456789
1: bull
2: fred
3: foo
---
sam%uk.ac.edinburgh.lfcs%uk.ac.ucl.cs.nss@net.cs.relay

sabbagh@acf3.NYU.EDU (sabbagh) (02/22/89)

In article <1436@etive.ed.ac.uk> sam@lfcs.ed.ac.uk (S. Manoharan) writes:
>
>main()
>{
>   static char *a[] = { "123456789", "bull", "fred", "foo" };
>   /* array of char-pointers */
>
>   printf("Entries %d\n",sizeof(a)/sizeof(char *));
>   foo1(a);
>   foo2(a);
>}
>
>foo1(b)
>char *b;
>{
>   int i;
>
>   printf("Entries %d\n",sizeof(b)/sizeof(char *));
>   /* LINE 1 */ for ( i = 0; i < 10; ++i ) printf("%d: %c\n",i,b+i);
>}

Since b is a pointer to char, b+i is a pointer to char also. Maybe you 
want b[i]?

>
>foo2(b)
>char *b[];
>{
>   int i;
>   /* LINE 2 */ printf("Entries %d\n",sizeof(b)/sizeof(char *));
>   for ( i = 0; i < 4; ++i ) printf("%d: %s\n",i,b[i]);
>}
>

In this case, sizeof(b) == sizeof(char **) (i.e. a pointer to a pointer to
char).  It is no surprise that sizeof(char **) == sizeof(char *), although
the standard does not require it.

Now for my $0.02 on pointers vs. arrays:

Simply put, a pointer to blah should be considered a different type than blah.
Consider the following declarations:

	char fred, *p1, **p2;

Then &fred returns char *; *p1 returns char; *p2 returns char *, etc.

Now, according to K & R, the notation

	p1[j]

is in ALL WAYS equivalent to

	*(p1 + j)

In fact, the compiler makes this transformation during parsing! (Incidentally,
this implies that p1[j] == j[p1]) !!
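
A minimal sketch of that equivalence, written in ANSI C rather than the K&R
style used elsewhere in this thread. The function name same_element is
hypothetical, and the caller is assumed to keep j inside the object p1
points into.

```c
#include <stddef.h>

/* Returns 1 when the three spellings p1[j], *(p1 + j) and j[p1]
   all designate the same element. */
int same_element(const char *p1, size_t j)
{
    return &p1[j] == p1 + j        /* p1[j] is defined as *(p1 + j) */
        && &p1[j] == &j[p1]        /* addition commutes, so j[p1] works too */
        && p1[j] == *(p1 + j);
}
```

For example, same_element("123456789", 3) yields 1.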

So what are arrays?  They are POINTER CONSTANTS.  That is, if you declare

	char a[10]

then a == &a[0] is a constant; it is the address of the first element of the 
array.

Moral of the story: you can treat pointers like arrays, but you can't treat
arrays like pointers!

karl@haddock.ima.isc.com (Karl Heuer) (02/23/89)

In article <889@acf3.NYU.EDU> sabbagh@acf3.UUCP () writes:
>In article <1436@etive.ed.ac.uk> sam@lfcs.ed.ac.uk (S. Manoharan) writes:
>>main() {
>>   static char *a[] = { ... };
>>   foo1(a);
>>}
>>
>>foo1(b) char *b; { ... }

Type mismatch: foo1() invoked with (char **) but declared with (char *)
Later comments suggest that you meant to invoke foo1(a[0]) instead.

>>foo2(b) char *b[]; {

Warning: C does not allow arrays as formal arguments; declaration will be
interpreted as `char **b'

>>   printf("Entries %d\n",sizeof(b)/sizeof(char *));

Warning: previous rewrite causes sizeof(b) to be misleading.

>>   for ( i = 0; i < 4; ++i ) printf("%d: %s\n",i,b[i]);
>>}

wlint: 1 error and 2 warnings produced.  Stop.


>Simply put, a pointer to blah should be considered a different type than blah.

Of course.  It *is* a different type.  Even if blah is a function type.

>Now, according to K & R, the notation
>	p1[j]
>is in ALL WAYS equivalent to
>	*(p1 + j)

Minor nit: they are equivalent in expressions, but not in declarations.

>(Incidentally, this implies that p1[j] == j[p1]) !!

A fact used primarily by the Obfuscated C Contestants.  I wish X3J11 had fixed
this.  (They could use the same justification that they did for fixing `+ ='.)

>So what are arrays?  They are POINTER CONSTANTS.

Urk.  While this is true in a sense, it's been my observation that people who
think of them that way are taking the wrong path to understanding.  It becomes
a pointer after the transformation (array-valued expressions decay into
pointer-valued expressions when used in an rvalue context), in which case its
value is constant throughout its lifetime.  This is independent of its lvalue
properties, but some people believe that this is the reason that the lvalue is
not modifiable.  This is not the case; it would be a relatively simple
extension to the language to allow array copy.  (It is hindered only by the
fact that X3J11 didn't fix the declaration botch noted in the first warning
message above (and in fact paved the way for reinterpreting it according to
the Darnell syntax rather than the more logical Brader-Heuer syntax), and by
the opinion that array copy with compile-time array sizes is not especially
useful.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

chris@mimsy.UUCP (Chris Torek) (02/23/89)

In article <1436@etive.ed.ac.uk> sam@lfcs.ed.ac.uk (S. Manoharan) writes:
>I need to reason the likely outcome of 
>   /* LINE 1 */  and /* LINE 2 */ 

Okay.  Take a deep breath:

First, we need to get rid of irrelevant stuff.  So:

>   static char *a[] = { "123456789", "bull", "fred", "foo" };
>   foo1(a);
>   foo2(a);

The object `a' has the <type,value> pair% <array 4 of pointer to char,
[aggregate, not displayed]>.  In each call, this is converted to the
rvalue <pointer to pointer to char, &a[0]>.
-----
% C values are always typed.  You cannot talk about a value without
  also talking about its type (although you can talk about a type
  without any particular value).  This is why there is no nil pointer:
  there is a nil pointer-to-char, a nil pointer-to-int, and so forth,
  but there is no `nil pointer'.  There is also no 0, just a 0 short,
  a 0.0 float, a 0L long, and so forth.  Eliding the type is a popular
  way to confuse oneself.
-----

>foo1(b)
>char *b;
>{

All bets are off.  `b' does not match the actual argument's type, so
its value is not describable.

>   printf("Entries %d\n",sizeof(b)/sizeof(char *));

sizeof(b) is the size of `char *' (since b is `char *'), so the
divide produces 1.

>   /* LINE 1 */ for ( i = 0; i < 10; ++i ) printf("%d: %c\n",i,b+i);

The result of this is (as the VAX architecture manual likes to say)
UNPREDICTABLE (well, actually, the VAX A.M. sets it in small-caps :-) ).

>foo2(b)
>char *b[];
>{

Here, b has the type `pointer to pointer to char'.  Although b
was declared as `array ?? of pointer to char' (?? denotes missing
information), it is a declaration for a formal parameter, and the
compiler `adjusts' it according to the rvalue-promotion rule for
arrays.  As far as the compiler is concerned, you typed

	char **b;

---it completely forgets the empty []s, because this is a formal
parameter.

>   int i;
>   /* LINE 2 */ printf("Entries %d\n",sizeof(b)/sizeof(char *));

sizeof(b) is the size of `char **' (since b was declared that way);
the result of this division is unpredictable, but chances are it
will be either 0 (on word-oriented machines) or 1 (on others).

>   for ( i = 0; i < 4; ++i ) printf("%d: %s\n",i,b[i]);

This will print the four strings in the original a[].
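
A sketch of the usual workaround, assuming ANSI C: compute the element count
where the array type is still visible and pass it along. The names NELEMS,
entry_count and total_length are hypothetical, not from the original program.

```c
#include <stddef.h>
#include <string.h>

/* Element count; only meaningful where arr still has array type. */
#define NELEMS(arr) (sizeof(arr) / sizeof((arr)[0]))

static const char *a[] = { "123456789", "bull", "fred", "foo" };

/* Here a is still "array 4 of pointer to char", so this yields 4. */
size_t entry_count(void)
{
    return NELEMS(a);
}

/* The parameter is adjusted to const char **b, so sizeof(b) would only
   give the size of a pointer; the caller supplies the count n instead. */
size_t total_length(const char *b[], size_t n)
{
    size_t i, total = 0;

    for (i = 0; i < n; ++i)
        total += strlen(b[i]);
    return total;
}
```

A call such as total_length(a, entry_count()) then walks all four strings
without relying on sizeof inside the callee.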
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

sabbagh@acf3.NYU.EDU (sabbagh) (02/24/89)

In article <11840@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <889@acf3.NYU.EDU> sabbagh@acf3.UUCP () writes:
>>In article <1436@etive.ed.ac.uk> sam@lfcs.ed.ac.uk (S. Manoharan) writes:
>
>>Now, according to K & R, the notation
>>	p1[j]
>>is in ALL WAYS equivalent to
>>	*(p1 + j)
>
>Minor nit: they are equivalent in expressions, but not in declarations.

Agreed.

>
>>(Incidentally, this implies that p1[j] == j[p1]) !!
>
>A fact used primarily by the Obfuscated C Contestants.  I wish X3J11 had fixed
>this.  (They could use the same justification that they did for fixing `+ ='.)
>
>>So what are arrays?  They are POINTER CONSTANTS.
>
>Urk.  While this is true in a sense, it's been my observation that people who
>think of them that way are taking the wrong path to understanding.

Hmm. It depends on what you are trying to understand.  If you are trying to 
USE C, then it's the perfect way to understand them.

>It becomes
>a pointer after the transformation (array-valued expressions decay into
>pointer-valued expressions when used in an rvalue context), in which case its
>value is constant throughout its lifetime.  This is independent of its lvalue
>properties, but some people believe that this is the reason that the lvalue is
>not modifiable.  This is not the case; it would be a relatively simple
>extension to the language to allow array copy.  (It is hindered only by the
>fact that X3J11 didn't fix the declaration botch noted in the first warning
>message above (and in fact paved the way for reinterpreting it according to
>the Darnell syntax rather than the more logical Brader-Heuer syntax), and by
>the opinion that array copy with compile-time array sizes is not especially
>useful.)

I have a slightly different philosophy than you.  I use C as I can find it.
I use it on a whole bunch of flavors of UN*X and also Turbo C 2.0 at home.
The fact is, X3J11 DID NOT make the changes and C compilers treat the name
of an array as a constant.

Also, this is the way most people view C if they are ex-assembly language
programmers.  At NYU, I teach C as "portable assembly language" with some
success, and that is the way I understand it.

One explanation for why X3J11 did not "fix" the "declaration botch" is 
because it would change the way many systems programmers use C. I believe 
that things in C are the way they are in order to make it as close to
assembly language as possible and still be portable.  Thus something like

	int a[10],b[10];
	...
	a = b;

is easy enough to interpret as array copy, but it would be a "high-level"
construct that is not found in other semantic areas of C.

Before you start flaming me about machine architectures that support 
multi-word copies in one instruction (e.g. 8086 MOVS), remember that C
was developed in the early '70s on a PDP 11 which did not have such
instructions.  I guess that really means that it isn't such a bad idea
to allow array copies now ;-).

dmg@ssc-vax.UUCP (David Geary) (02/25/89)

In Message-ID: <1436@etive.ed.ac.uk>, S. Manoharan writes:

> I need to reason the likely outcome of 
>   /* LINE 1 */  and /* LINE 2 */ 

> main()
> {
>    static char *a[] = { "123456789", "bull", "fred", "foo" };
>   /* array of char-pointers */

  Ok, a is "an array of pointers to char".  a[0] holds the address
  where the '1' resides in the string "123456789", a[1] holds the
  address where the 'b' in "bull" resides in memory, etc.

>
>   printf("Entries %d\n",sizeof(a)/sizeof(char *));
>   foo1(a);
>   foo2(a);

    When you call foo1() and foo2(), you are actually doing:

    foo1(&a[0]);
    foo2(&a[0]);

    In the context of an argument to a function, the name of an array
    is the same as the address of the initial element of the array.

    Furthermore, realize that if a[0] is a pointer to char (which it is),
    then &a[0] is a pointer to a pointer to char, (char **)
> }
>
> foo1(b)
> char *b;

  NO!!  You didn't pass char *, you passed char ** - see above.

> {
>   int i;
>
>   printf("Entries %d\n",sizeof(b)/sizeof(char *));
>   /* LINE 1 */ for ( i = 0; i < 10; ++i ) printf("%d: %c\n",i,b+i);

  You are printing b+i as a character.  b holds the ADDRESS of a[0] in main.
  Therefore, you are trying to print the address of a[0] plus a
  constant (probably something ugly like:  0xfffe0 - a big number)
  as a character.  So, you will get garbage.

> }
>
> foo2(b)
> char *b[];

  Now this is better.  b is declared as an array of pointers to char,
  which matches what you passed from main().  Notice that the compiler
  treats this declaration the same as:  char **b;

> {
>   int i;
>   /* LINE 2 */ printf("Entries %d\n",sizeof(b)/sizeof(char *));
>   for ( i = 0; i < 4; ++i ) printf("%d: %s\n",i,b[i]);

  b points to the first element of an array of pointers to char, so b[i]
  is a pointer to char, so this will work correctly.

> }
>
>--------
> Output of the program was:
>
> Entries 4
> Entries 1
> 0: ^P              <--- Expected ... 1
  
  No, expect garbage, see above.

> 1: ^Q              <--- Expected ... 2
> 2: ^R              <--- Expected ... 3
> 3: ^S              <--- Expected ... 4
> 4: ^T              <--- Expected ... 5
> 5: ^U              <--- Expected ... 6
> 6: ^V              <--- Expected ... 7
> 7: ^W              <--- Expected ... 8
> 8: ^X              <--- Expected ... 9
> 9: ^Z
> Entries 1          <--- Expected ... 4
> 0: 123456789
> 1: bull
> 2: fred
> 3: foo
> ---

Here's a working version of foo1():

foo1(b)
  char *b[];  /* or char **b - same difference */
{
  int  i;
  char *p = *b; /* b holds the address of a[0] from main.
		   *b gives whatever value is at the address stored
		   in b, namely, a[0], which is the address of the '1'
		   in the string "123456789" in main(). */

  for(i=0; i < 9; ++i)  /* There are only 9 chars in string */
    printf("%d:  %c\n", i, *(p+i));
}
   

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ David Geary, Boeing Aerospace,               ~ 
~ #define    Seattle     RAIN                  ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

throopw@xyzzy.UUCP (Wayne A. Throop) (02/25/89)

> sabbagh@acf3.NYU.EDU (sabbagh)
>>>So what are arrays?  They are POINTER CONSTANTS.
>>Urk.  While this is true in a sense, it's been my observation that people who
>>think of them that way are taking the wrong path to understanding.
> Hmm. It depends on what you are trying to understand.  If you are trying to 
> USE C, then it's the perfect way to understand them.

How then to account for their behavior with "sizeof"?  I recommend
thinking of them not AS pointer constants, but as (almost) universally
CONVERTED TO pointer constants.  That accounts for their behavior
in sizeof, and even in multiply-dimensioned arrays (that is, if you
have an array of arrays, you have an array of... constants??? nah!).

So again... it's not that arrays ARE pointer constants, but that they
are CONVERTED TO pointer constants (almost always).  This seems to me
to be the least confusing way of thinking about it while using C.
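
A small sketch of the "converted to" view applied to an array of arrays,
assuming ANSI C. The names grid, grid_rows, grid_cols and row_stride_bytes
are hypothetical.

```c
#include <stddef.h>

int grid[3][5];   /* array 3 of array 5 of int */

/* Under sizeof no conversion happens, so the full array types are seen. */
size_t grid_rows(void) { return sizeof(grid) / sizeof(grid[0]); }
size_t grid_cols(void) { return sizeof(grid[0]) / sizeof(grid[0][0]); }

/* In a value context grid is converted to a pointer to its first row,
   type int (*)[5]; stepping that pointer by one skips a whole row. */
ptrdiff_t row_stride_bytes(void)
{
    int (*row)[5] = grid;

    return (char *)(row + 1) - (char *)row;
}
```

The elements of grid are themselves arrays, not pointer constants; each one
is only converted to a pointer when its value is used.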

--
All things scabbed and ulcerous,
All pox both great and small,
Putrid, foul and gangrenous,
The Lord God made them all.
          --- Monty Python
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (02/25/89)

> sabbagh@acf3.NYU.EDU (sabbagh)
>>>So what are arrays?  They are POINTER CONSTANTS.
> If you are trying to 
> USE C, then it's the perfect way to understand them.
 
For those that STILL don't understand, consider this:

int   A[10];
float F;
short S;

What is A when evaluated in an expression?  a constant pointer to an int.
What is F when evaluated in an expression?  a double.
What is S when evaluated in an expression?  an int.

(see K&R, K&R2, pANSI, any C manual or tutorial for verification of this)

So, if you really want to think of A itself as a pointer constant,
then you'd better think of S itself as an int and F itself as a double.
If so, you must really enjoy being confused.

As for the rest of us:

What is A?  an array of 10 ints.
What is S?  a short int.
What is F?  a float.

(see K&R, K&R2, pANSI, any C manual or tutorial for verification of this)
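
The distinction can be checked with sizeof, as in this sketch (ANSI C; the
function names are hypothetical). F is left out on purpose: K&R C evaluated
float expressions in double, but ANSI C permits float arithmetic, so the
result for F depends on the compiler.

```c
#include <stddef.h>

int   A[10];
short S;

/* The objects themselves keep their declared types. */
int array_is_ten_ints(void) { return sizeof(A) == 10 * sizeof(int); }
int short_is_short(void)    { return sizeof(S) == sizeof(short); }

/* Their values, once used in arithmetic, have the converted types. */
int array_value_is_pointer(void) { return sizeof(A + 0) == sizeof(int *); }
int short_value_is_int(void)     { return sizeof(S + 0) == sizeof(int); }
```

All four functions return 1 on a conforming implementation.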

gwyn@smoke.BRL.MIL (Doug Gwyn ) (02/26/89)

In article <23877@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
-int   A[10];
-float F;
-short S;
-What is A when evaluated in an expression?  a constant pointer to an int.
-What is F when evaluated in an expression?  a double.
-What is S when evaluated in an expression?  an int.

It all depends on the context.  For example, consider sizeof(whatever).

That's why it is best not to mentally map the type until the language
rules require that it be mapped.

One of our computer vendors once decided to map array names to pointer
prematurely, so that sizeof A returned 4 no matter how big the array A
was.  We screamed and they eventually fixed that bug.  Let that serve
as a warning of the danger of trying to map types before their time.

alistair@minster.york.ac.uk (03/01/89)

a is an array of pointers - not chars.  In other words, once it has been
passed to a function:
a is converted to a pointer to a pointer
*a is a pointer
**a is a char

so passing it to foo1 and 'pretending' it is a pointer to a char does not
work.  So, b is not pointing to an array of chars, but an array of pointers,
but when the compiler encounters b+i it increments b to the next
byte address - somewhere in the middle of a pointer.

Also, what you really intended to print out was *(b + i), b + i is an
address - which is why you happened to get control codes printed - but
incremented by one each time.
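
A sketch of what was probably intended, assuming ANSI C with a prototype in
scope; nth_char and first_string_char are hypothetical names. With the
prototype visible, a call such as nth_char(a, i) draws a diagnostic for the
mismatched pointer types instead of silently printing garbage.

```c
#include <stddef.h>

static const char *a[] = { "123456789", "bull", "fred", "foo" };

/* b is a pointer to char; *(b + i), i.e. b[i], is the character itself,
   whereas b + i alone is still an address. */
char nth_char(const char *b, size_t i)
{
    return *(b + i);
}

/* Pass one string, a[0], rather than the whole table of pointers.
   The caller is assumed to keep i below 9, the length of that string. */
char first_string_char(size_t i)
{
    return nth_char(a[0], i);
}
```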

In foo2 b is an address - the base address of an array of pointers.  On
your implementation (but see other discussions about sizes of pointers
in this topic) a char address is the same size as an array address -

karl@haddock.ima.isc.com (Karl Heuer) (03/02/89)

In article <890@acf3.NYU.EDU> sabbagh@acf3.UUCP () writes:
>	int a[10],b[10];
>	...
>	a = b;
>
>is easy enough to interpret as array copy, but it would be a "high-level"
>construct that is not found in other semantic areas of C.

Not so.  C has had struct copy for the last decade; I understand that array
copy was not added at the same time only because they couldn't find a clean
way to fit it into the existing language.  The invention of prototypes did
provide a (relatively) clean path, but unfortunately X3J11 didn't take it.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

gwyn@smoke.BRL.MIL (Doug Gwyn ) (03/02/89)

In article <11914@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
-In article <890@acf3.NYU.EDU> sabbagh@acf3.UUCP () writes:
->	int a[10],b[10];
->	a = b;
-Not so.  C has had struct copy for the last decade; I understand that array
-copy was not added at the same time only because they couldn't find a clean
-way to fit it into the existing language.  The invention of prototypes did
-provide a (relatively) clean path, but unfortunately X3J11 didn't take it.

Excuse me, but C array types were already too badly broken to be fixed
by anything X3J11 could do.  Of course we could have devised a new
language with arrays as first-class citizens, but it wouldn't be C.

There are problems with C arrays beyond the conversion of [] to * in
formal function parameters.

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (03/02/89)

In article <11914@haddock.ima.isc.com>, karl@haddock.ima.isc.com (Karl Heuer) writes:
> I understand that array
> copy was not added at the same time only because they couldn't find a clean
> way to fit it into the existing language.  The invention of prototypes did
> provide a (relatively) clean path, but unfortunately X3J11 didn't take it.

That's a bit of an understatement.
Not only didn't they take the path,
they demolished it so that no one else could ever take it in the future.
I've never heard a good explanation of why they thought it necessary
to take this action though.

karl@haddock.ima.isc.com (Karl Heuer) (03/06/89)

In article <9766@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <11914@haddock.ima.isc.com> karl@haddock (Karl Heuer) writes:
>>C has had struct copy for the last decade; I understand that array copy was
>>not added at the same time only because they couldn't find a clean way to
>>fit it into the existing language.  The invention of prototypes did provide
>>a (relatively) clean path, but unfortunately X3J11 didn't take it.
>
>Excuse me, but C array types were already too badly broken to be fixed
>by anything X3J11 could do.  Of course we could have devised a new
>language with arrays as first-class citizens, but it wouldn't be C.

I didn't say that X3J11 could have made arrays first-class citizens; I said
it could have added array copy (thus raising arrays from third-class citizens
to second-class, as was done with structs).  I believe something like this
would do it:

|If the left operand of the assignment operator has type `array [N] of T',
|then the right operand must have type `pointer to T' (possibly resulting from
|the decay of an array expression), and the first N elements of the array into
|which it points are copied.  This also applies to the implicit assignment of
|an actual argument to an array-typed formal argument (when a prototype is in
|scope), and to the implicit assignment caused by the `return' statement in an
|array-typed function.

As I noted in an earlier article (which seems to have expired now), one
problem with this is that array copy with a constant size is of limited
utility.  I agree that this cannot be fixed without inventing a new language.

Actually, I wasn't even expecting X3J11 to accept this untested feature; I
just wanted it to be a legally conforming extension.  Unfortunately, it now
collides with the kludge that allows "[]" to declare a pointer, in the special
case of a formal argument.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

mouse@mcgill-vision.UUCP (der Mouse) (03/09/89)

In article <890@acf3.NYU.EDU>, sabbagh@acf3.NYU.EDU (sabbagh) writes:
> [Thus] something like
> 	int a[10],b[10];
> 	a = b;
> is easy enough to interpret as array copy, but it would be a "high-level"
> construct that is not found in other semantic areas of C.

Other areas like....

	struct foo a, b;
	a = b;

for example?  I see no essential difference, except for the weight of
history: arrays have always been converted to pointers in expressions,
therefore they always must be.  (No similar requirement ever existed
for structs, of course.)
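
Since struct assignment does copy, a fixed-size array can borrow it by being
wrapped in a struct. The sketch below assumes ANSI C; struct int10,
copy_wrapped and copy_bare are hypothetical names, with memcpy shown as the
usual route for bare arrays.

```c
#include <string.h>

struct int10 { int v[10]; };   /* wrapper around a 10-element array */

/* Struct assignment copies the embedded array, element for element. */
void copy_wrapped(struct int10 *dst, const struct int10 *src)
{
    *dst = *src;
}

/* Bare arrays cannot be assigned; the parameters here are really
   pointers, so the length is spelled out and memcpy does the work. */
void copy_bare(int dst[10], const int src[10])
{
    memcpy(dst, src, 10 * sizeof dst[0]);
}
```

As with the proposals discussed above, the size is fixed at compile time.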

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu