[comp.lang.c] confusion with char *a and char a[NUM]

jjk@jupiter.astro.umd.edu (Jim Klavetter) (12/03/90)

I am not a beginner and I have checked my texts, man pages, and the
faq list (Thanks Steve, I think it is a good job).  I am still
confused.  I think part of the problem is with an inconsistency in C
and part is my understanding.

I can have two identical files except that one I declare a to be
	char a[NUM]
and the other has
	char *a with a malloc of NUM+1 characters.

I guess I can stop there and ask the general question, "what is the
difference between those two?"  If done properly, they will both be
NUM+1 bytes (or whatever a char is) of memory and should be accessible
either by a[3] or *(a+3) for the fourth element, for example.  Yet,
there are differences.

In both cases, the following is accepted by both my sun4 compiler and
gcc
	strcpy(string, a)
and there is no problem.

However, if I have
	a=strchr(string, ":");
I get the error message
	121: incompatible types in assignment
or some such thing (that one is from gcc).

The man page treats both the argument of strcpy() and the return value
of strchr() as type (char *).  So why this inconsistency?  Am I using
strcpy() wrong above and just getting away with a flaw in the
compiler, or is there actually an inconsistency here?

Thanks in advance.

jjk@astro.umd.edu
Jim Klavetter (accepting mail for Athabasca and Reudi)
Astronomy
UMD
College Park, MD  20742

gwyn@smoke.brl.mil (Doug Gwyn) (12/03/90)

In article <7656@umd5.umd.edu> jjk@astro.umd.edu (Jim Klavetter) writes:
>	char a[NUM]
>	char *a with a malloc of NUM+1 characters.
>I guess I can stop there and ask the general question, "what is the
>difference between those two?"

The first is an array of char of length NUM, and the second is a pointer
to char (which you claim has been obtained from malloc()).

>If done properly, they will both be NUM+1 bytes (or whatever a char is)
>of memory and should be accessible either by a[3] or *(a+3) for the
>fourth element, for example.

Wrong -- only the array IS "NUM" (not NUM+1) bytes of storage; the
pointer itself occupies a fixed amount of storage (typically four bytes)
no matter what it points to.

>	strcpy(string, a)

What happens here is obvious in the case that "a" is a pointer; the
pointer is simply copied-in as an argument to the function.  When "a"
is the name of an array, on the other hand, a special conversion rule
of C comes into play:  In most (but not all) expression contexts, the
name of an array is replaced by a pointer to the first element of the
array.  (An exception is when the array name is the operand of "sizeof".)
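
A minimal sketch of that conversion rule and of the sizeof exception.
The value of NUM and the function names are assumptions made up for
this illustration; they are not from the original program.

```c
#include <stddef.h>

#define NUM 100   /* hypothetical size, chosen only for this sketch */

/* The callee receives a pointer, so sizeof here measures the pointer. */
size_t size_in_callee(char *s)
{
    return sizeof s;
}

/* Operand of sizeof: the array name is NOT converted to a pointer,
   so this yields the size of the whole array. */
size_t size_of_array(void)
{
    char a[NUM];
    return sizeof a;
}

/* Function argument: the array name is converted to &a[0],
   so the callee only ever sees a char *. */
size_t size_seen_by_callee(void)
{
    char a[NUM];
    a[0] = '\0';
    return size_in_callee(a);
}
```

size_of_array() yields NUM, while size_seen_by_callee() yields whatever
sizeof(char *) is on the machine at hand.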

>	a=strchr(string, ":");
>	121: incompatible types in assignment

First, you need to #include <string.h> so that strchr() gets properly
declared as returning char* (otherwise it will be assumed to return int,
which is indeed incompatible with pointer types).  Assuming that a proper
declaration of strchr() is in scope, then assigning the char* return
value to a char* variable is okay, but assignment (of anything at all)
to an array is never okay.

ARRAYS ARE NOT POINTERS.

barmar@think.com (Barry Margolin) (12/04/90)

In article <7656@umd5.umd.edu> jjk@astro.umd.edu (Jim Klavetter) writes:
>However, if I have
>	a=strchr(string, ":");
>I get the error message
>	121: incompatible types in assignment
>or some such thing (that one is from gcc).

Doug Gwyn already answered most of your questions, but he missed one problem
with the above expression.  The second argument to strchr() is supposed to
be an int (according to the SunOS manual, and ANSI C declares it as int too),
not a char*.  Therefore, it should be

	a = strchr (string, ':');
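
For completeness, a small sketch of a call that compiles cleanly,
assuming the result is stored in a pointer variable rather than in an
array.  The function name colon_offset is hypothetical.

```c
#include <stddef.h>
#include <string.h>   /* declares strchr() as returning char * */

/* Returns the offset of the first ':' in s, or -1 if there is none. */
long colon_offset(const char *s)
{
    const char *p;           /* a pointer variable, so it can be assigned */

    p = strchr(s, ':');      /* second argument is a character, not a string */
    if (p == NULL)
        return -1L;
    return (long)(p - s);
}
```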
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

darcy@druid.uucp (D'Arcy J.M. Cain) (12/04/90)

In article <7656@umd5.umd.edu> jjk@astro.umd.edu (Jim Klavetter) writes:
>I can have two identical files except that one I declare a to be
>	char a[NUM]
>and the other has
>	char *a with a malloc of NUM+1 characters.
> [...]
>In both cases, the following is accepted by both my sun4 compiler and
>gcc
>	strcpy(string, a)
>and there is no problem.
>
>However, if I have
>	a=strchr(string, ":");
>I get the error message
>	121: incompatible types in assignment
>or some such thing (that one is from gcc).

I know there are people better qualified to answer this but I'd like to
attack this from a slightly different angle.  Perhaps it will shed some
light for those who haven't grasped the difference between pointers and
arrays.

#if A_IS_A_CONSTANT
#define a 3
#else
int a = 3;
#endif
...
foo(a);

When 'a' is a constant the value is passed to foo.  If it is a variable
then the CPU must get the value from memory and then use the resulting
value.  Note that in both cases foo gets exactly the same thing as an
argument (the value 3.)  There is no such thing as a 'constant' or
'variable' argument to a function.

Think of an array as a constant.  If you declare
    char a[NUM];
then a is a constant unchanging value.  It is almost like saying:
    #define a ADDRESS_OF_A_SPECIFIC_MEMORY_LOCATION
and when you pass this as an argument you pass the value of the specific
memory location.  If, on the other hand you declare
    char *a = ADDRESS_OF_A_SPECIFIC_MEMORY_LOCATION;
then you have created a variable and initialized it to a specific location.
The difference here is that the value stored at 'a' can be modified, for
example by assigning the return value from malloc.  However when you call
a function with this as an argument, the value is read from memory and that
value is sent to the function.  The called function gets exactly the same
kind of value in both cases.
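
A rough sketch of that last point, with an invented NUM and invented
function names: the callee is handed a char * either way and cannot
tell which kind of declaration the caller used.

```c
#include <stdlib.h>
#include <string.h>

#define NUM 32   /* hypothetical size for this sketch */

/* The callee just receives a char * value. */
size_t count_chars(const char *s)
{
    return strlen(s);
}

/* Returns 1 if the array and the malloc'ed pointer give the same
   result, 0 if they differ, -1 if malloc() fails. */
int same_result_for_array_and_pointer(void)
{
    char arr[NUM];              /* storage reserved by the compiler */
    char *ptr = malloc(NUM);    /* storage requested at run time */
    int same;

    if (ptr == NULL)
        return -1;
    strcpy(arr, "hello");
    strcpy(ptr, "hello");
    same = (count_chars(arr) == count_chars(ptr));
    free(ptr);
    return same;
}
```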

As to why your example doesn't work, you are in effect trying to assign
a returned address (from strchr()) to a constant (the array address 'a'.)
This is somewhat like saying
    3 = getchar();

I know that I have played fast and loose with some concepts but I think
that this may help some people.  I strongly suggest rereading the FAQ
if you think the above helped to drop the penny.

-- 
D'Arcy J.M. Cain (darcy@druid)     |
D'Arcy Cain Consulting             |   There's no government
West Hill, Ontario, Canada         |   like no government!
+ 416 281 6094                     |

salomon@ccu.umanitoba.ca (Dan Salomon) (12/05/90)

In article <7656@umd5.umd.edu> jjk@astro.umd.edu (Jim Klavetter) writes:
>
>I can have two identical files except that one I declare a to be
>	char a[NUM]
>and the other has
>	char *a with a malloc of NUM+1 characters.
>
   ...
>However, if I have
>	a=strchr(string, ":");
>I get the error message
>	121: incompatible types in assignment
>

No big mystery.
    char a[NUM];
declares "a" to be a constant pointer to an array of char that points
to the region allocated by the compiler.
Hence you cannot change its value.
    char *a;
declares a to be a variable pointer to an array of char.
This treatment allows the optimization of array accesses with
constant subscripts.
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba, Canada  R3T 2N2 / (204) 275-6682

chris@mimsy.umd.edu (Chris Torek) (12/05/90)

In article <14638@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>ARRAYS ARE NOT POINTERS.

(Which is of course correct.)

A few days ago, on one lunch venture, someone wanted to stop at the
campus bookstore and search for a particular book on graphics.  While
he was doing that, I took a random sample of the `C books' shelf and
read what each had to say about pointers and arrays.

I had time for only two books.  They were both wrong in at least one
specific, and both gave the wrong idea.

Part of the problem, then, with people's understanding of arrays and
pointers in C is that there has been a profusion of books about C,
many of which continue to promote myths about pointers and arrays.
(Perhaps there should be a note to this effect in the FAQ list.)
Remember:

	Being in print does not make it true.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

chris@mimsy.umd.edu (Chris Torek) (12/06/90)

In article <1990Dec4.214845.18949@ccu.umanitoba.ca>
salomon@ccu.umanitoba.ca (Dan Salomon) writes:
>    char a[NUM];
>declares "a" to be a constant pointer to an array of char that points
>to the region allocated by the compiler.

One more time:
	a is not a constant pointer.
	a is an array.

How can you tell?
	sizeof((char *)0) is typically 2, 4, or 8.
	  ((char *)0 is a constant nil-pointer-to-char.)
	sizeof(a) is always exactly NUM.
 => conclusion: a is not a constant pointer to char.

How else can you tell?
	&3 is illegal (cannot take address of any rvalue, including
	  any constant).
	&a is a pointer (in ANSI C only, not in K&R-1 C) of type
	  (char (*)[NUM]).
 => conclusion: a is not a constant; it has an address.

How else can you tell?
	There are no other ways.  `a=b' and `a++' give the same sort of
	  error that `3=b' and `3++' produce.  If you do not understand
	  that arrays are not pointers, the declaration of `a' is not going
	  to help either.
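
The &a test can be sketched as follows, assuming an ANSI C compiler.
NUM and the function names are made up for the illustration.  &a and
&a[0] refer to the same location but have different types, which shows
up in pointer arithmetic.

```c
#include <stddef.h>

#define NUM 100   /* hypothetical size for this sketch */

/* Returns 1 if &a and &a[0] refer to the same location. */
int same_location(void)
{
    char a[NUM];
    char (*pa)[NUM] = &a;   /* pointer to the whole array */
    char *p0 = &a[0];       /* pointer to the first element */

    return (void *)pa == (void *)p0;
}

/* Bytes skipped by adding 1 to a pointer-to-array: the whole array. */
size_t step_of_array_pointer(void)
{
    char a[NUM];
    char (*pa)[NUM] = &a;

    return (size_t)((char *)(pa + 1) - (char *)pa);
}

/* Bytes skipped by adding 1 to a pointer-to-char: one element. */
size_t step_of_element_pointer(void)
{
    char a[NUM];
    char *p0 = &a[0];

    return (size_t)((p0 + 1) - p0);
}
```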

>Hence you cannot change its value.

This conclusion is correct (see X3.159-1989; an array object, though an
lvalue, is not a modifiable lvalue), but for the wrong reason.

Note: the so-called `equivalence' between arrays and pointers is NOT
`array equals pointer'.  The rule is more complicated:

	In a value context, an object of type `array N of T' is
	transformed into a value of type `pointer to T' (discarding
	the constant N) by taking the address of the 0th element
	of that array.

Given:
	char *p, a[100];
	p = a;

this rule is applied as follows:

	<object, pointer to char, p>[object context] =
	  <object, array 100 of char, a>[value context];

The left hand side of the assignment statement is an object in an
object context, so nothing need be done.  The right hand side, however,
is an object in a value context.  For most objects (those that are
not arrays), the object is changed to a value simply by fetching its
current value.  This does not work for arrays since the `current value'
is (in this case) 100 `char's, which, back when Dennis Ritchie wrote
the first C compiler, was considered `too much work' to handle.  (So
were structure values; *that* has been rectified, but the array nuisance
has not.)  (Note: `too much work' is not just `on the part of the
compiler writer', but also `in terms of run-time code'.)  So the rule
above applies.  We have an `array N of T' (N=100, T=char), so we change
to a pointer to T by taking &array[0]:

	<object, pointer to char, p> =
	  <value, pointer to char, &a[0]>;

Now we have an assignment of the form `object = value;', so all that
is left is checking that the types match (they do) and doing the assignment.
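
A short sketch of both halves of this: the array in a value context
becoming &a[0], and the contrast with structure values, which can be
assigned as a whole even when they contain an array.  The struct and
function names are invented for the example.

```c
#include <string.h>

struct wrapped { char data[100]; };   /* hypothetical wrapper type */

/* Returns 1: p = a stores the same pointer value as &a[0]. */
int pointer_gets_first_element(void)
{
    char *p, a[100];

    p = a;                 /* a is in a value context here */
    return p == &a[0];
}

/* Returns 1: assigning a structure copies the array inside it. */
int struct_assignment_copies_array(void)
{
    struct wrapped x = { "copied" };
    struct wrapped y;

    y = x;                 /* all 100 chars are copied */
    return strcmp(y.data, "copied") == 0;
}
```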
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.brl.mil (Doug Gwyn) (12/06/90)

In article <28339@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>Part of the problem, then, with people's understanding of arrays and
>pointers in C is that there has been a profusion of books about C,
>many of which continue to promote myths about pointers and arrays.

The profusion of books about C during the past couple of years appears
to be primarily an attempt to "cash in" on the interest in C.  Many of
the several books that I briefly examined appeared to be written by
authors who had only recently "discovered" C and thus should have no
business trying to teach it.  My recommendation for C tutorial texts
is to get the current edition of the "old standbys" such as Tom Plum's
series or K&R (the latter is best for already-experienced non-C
programmers).  At least those authors know what they are talking about.

rmartin@clear.com (Bob Martin) (12/06/90)

In article <7656@umd5.umd.edu> jjk@astro.umd.edu (Jim Klavetter) writes:
>
>I think part of the problem is with an inconsistency in C
>and part is my understanding.
>
>I can have two identical files except that one I declare a to be
>	char a[NUM]
>and the other has
>	char *a with a malloc of NUM+1 characters.
>
>I guess I can stop there and ask the general question, "what is the
>difference between those two?"  If done properly, they will both be
>NUM+1 bytes (or whatever a char is) of memory and should be accessible
>either by a[3] or *(a+3) for the fourth element, for example.  Yet,
>there are differences.

	a[NUM] is an array of NUM chars. (NOT NUM+1 as the text implies).
	char *a is a pointer to a character, period.  In the first case
	the compiler has knowledge about the number of bytes in 'a' and
	reserves those bytes for you.  In the second case the compiler 
	knows nothing about the size of the object you are pointing to
	and depends on you to explicitly allocate the bytes.
>
>In both cases, the following is accepted by both my sun4 compiler and
>gcc
>	strcpy(string, a)
>and there is no problem.

	Right.  This is because of C's convention of treating the name
	of an array like a pointer to the first element of the array
	in an rvalue context.  
>
>However, if I have
>	a=strchr(string, ":");
>I get the error message
>	121: incompatible types in assignment
>or some such thing (that one is from gcc).

	Right again.  This is because 'a' is an lvalue in this statement.
	If 'a' was declared as an array, then in an lvalue context the name
	a represents the array, not a pointer at all.

	If you declared a as "char a[NUM]" you have set aside NUM bytes
	for your use.  Ask yourself what you mean when you say:
		"a=strchr(string, ':');"
	'a' is an array of NUM bytes, do you want to throw away those 
	bytes and now ask 'a' to be a pointer to the innards of 'string'?
>
>The man page treats both the argument of strcpy() and the return value
>of strchr() as type (char *).  So why this inconsistency?  Am I using
>strcpy() wrong above and just getting away with a flaw in the
>compiler, or is there actually an inconsistency here?

	There is no inconsistency.  The return value of strchr is char*,
	but 'a' (if declared as char a[NUM]) in the lvalue context is
	not a char*, it is a char[NUM].
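
The two things the assignment might have been meant to do can be
sketched separately.  This is only one guess at the intent; NUM and
the function names are assumptions for the illustration.

```c
#include <stddef.h>
#include <string.h>

#define NUM 32   /* hypothetical size for this sketch */

/* Intent 1: keep the array as storage and copy the text starting at
   the ':' into it.  Returns 1 on success, 0 if there is no ':' or the
   text does not fit. */
int copy_from_colon(char *dst, size_t dstsize, const char *string)
{
    const char *p = strchr(string, ':');

    if (p == NULL || strlen(p) >= dstsize)
        return 0;
    strcpy(dst, p);
    return 1;
}

/* Intent 2: just point into the innards of 'string'.  This needs a
   pointer, not an array, to hold the result. */
const char *point_at_colon(const char *string)
{
    return strchr(string, ':');
}

/* Exercises intent 1 with a real array; returns 1 if a holds ":value". */
int demo_copy(void)
{
    char a[NUM];

    return copy_from_colon(a, sizeof a, "key:value")
        && strcmp(a, ":value") == 0;
}
```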
>
>Thanks in advance.
>
>jjk@astro.umd.edu
>Jim Klavetter (accepting mail for Athabasca and Reudi)
>Astronomy
>UMD
>College Park, MD  20742


-- 
+-Robert C. Martin-----+:RRR:::CCC:M:::::M:| Nobody is responsible for |
| rmartin@clear.com    |:R::R:C::::M:M:M:M:| my words but me.  I want  |
| uunet!clrcom!rmartin |:RRR::C::::M::M::M:| all the credit, and all   |
+----------------------+:R::R::CCC:M:::::M:| the blame.  So there.     |

tps@chem.ucsd.edu (Tom Stockfisch) (12/07/90)

In article <28339@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>In article <14638@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>ARRAYS ARE NOT POINTERS.

>... I took a random sample of the `C books' shelf and
>read what each had to say about pointers and arrays.
>I had time for only two books.  They were both wrong in at least one
>specific, and both gave the wrong idea.

>Part of the problem, then, with people's understanding of arrays and
>pointers in C is that there has been a profusion of books about C,
>many of which continue to promote myths about pointers and arrays.
>Remember:  Being in print does not make it true.


Yeah, like K&R I, p. 111:
	
	"argv is a pointer to an array of pointers" *

I believe this is in Edition II, as well.

I think there is an excellent explanation for the confusion about arrays and
pointers:  the "base document" for C didn't always get it right.
I was hopelessly confused about pointers/arrays until I got Harbison & Steele's
tome.  I don't think it's a fault in the language, just in its original
explication.


* note to authors writing new C books:  "argv" is actually a
  "pointer to pointer to char", or, more descriptively, a "pointer to the
  first element of an array of pointers".
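
  A small sketch of that reading, using a hand-built vector in place of
  a real argv.  The names count_strings and count_demo are invented.

```c
#include <stddef.h>

/* argv-style parameter: a pointer to the first element of an array of
   char pointers, terminated by a null pointer. */
int count_strings(char **argv)
{
    int n = 0;

    while (argv[n] != NULL)
        n++;
    return n;
}

/* vec IS an array of pointers; passing it hands over &vec[0]. */
int count_demo(void)
{
    char *vec[] = { "prog", "-v", "file", NULL };

    return count_strings(vec);
}
```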
-- 

|| Tom Stockfisch, UCSD Chemistry	tps@chem.ucsd.edu

gwyn@smoke.brl.mil (Doug Gwyn) (12/09/90)

In article <928@chem.ucsd.EDU> tps@chem.ucsd.edu (Tom Stockfisch) writes:
>I think there is an excellent explanation for the confusion about arrays and
>pointers:  the "base document" for C didn't always get it right.

The Base Document was Appendix A, not the tutorial portion of K&R I.

I think in context, especially considering the preceding material,
the cited colloquial usage wasn't especially misleading.

Note that argv could not have been said to BE an array of string
pointers, which is what one would have liked to say, because arguments
cannot be arrays.

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (12/12/90)

Let me give you a different way of understanding the difference between
char *A and char B[NUM].

After you have used malloc to allocate NUM characters of storage for A,
both A and B can be used in a very similar way.  In some contexts any
reference to A or B is treated as the use of a pointer.

However, B is not a pointer.  It's just treated like one sometimes.  It
is a SYMBOLIC NAME FOR AN ADDRESS.  (And remember that a pointer is not
an address; it's a way of finding an address.)  In place of B you could
(if the language allowed it) use the absolute value of the address of
the array you call B.  You can pretend that the instant you use B, the
compiler converts it to the address for which B is a symbolic name.

A is not the same thing.  It is a VARIABLE, not an address.  The
variable is at some address, but when you use A, you usually
dereference that variable, and reach some other address.  This is why A
really *is* a pointer variable, while B is merely *treated* like one.

So, what happens when you do something like "A[3] = 7" or "B[3] = 7"?
In the first case, the compiler generates code to look at the CONTENTS
of the variable A, get those contents and add 3 to them, and treat the
result as an address.  B, however, is a symbolic name for an address.
The compiler generates code to directly take that address, add 3 to it,
and get another address.  In both cases, after the address is obtained,
a value of 7 is stored at that address.  (The above description is not
strictly accurate if A or B is on the stack.  For this description,
let's assume they are statically allocated and not in the stack.)
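
As a sketch only, with A and B statically allocated as assumed above
and a made-up NUM, the two stores look identical in the source even
though the address computation differs.  What code a given compiler
actually emits is not something this example can show.

```c
#include <stdlib.h>

#define NUM 10   /* hypothetical size for this sketch */

char B[NUM];     /* array: B names the storage itself */
char *A;         /* pointer variable: holds an address found at run time */

/* Returns 1 if both stores landed, -1 if malloc() fails. */
int store_through_both(void)
{
    int ok;

    A = malloc(NUM);
    if (A == NULL)
        return -1;
    A[3] = 7;    /* fetch the contents of A, add 3, store 7 there */
    B[3] = 7;    /* take the address B stands for, add 3, store 7 there */
    ok = (A[3] == 7 && B[3] == 7);
    free(A);
    A = NULL;
    return ok;
}
```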

The terms "pointer" and "address" are often erroneously used with the
same meaning.  They are not the same.  An address is a location.  A
pointer is a way of getting to a location.  Think of the pointer as an
arrow pointing somewhere, and an address as the place the arrow is
pointing to.  A, the pointer, is an arrow, and it points to the address
where malloc() gave us memory.  B, the address, doesn't point anywhere
-- it *is* already an address.  A (a pointer) could point to B (an
address).  But B (an address) can never point to A, because B is not a
pointer.
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi