gary@mit-eddie.UUCP (Gary Samad) (04/04/84)
I spent hours debugging this--anyone know why it is a problem?
Is it a 'feature' or is it a 'bug'?
in file foo.c:
	char ch[32];
	foo()
	{
	    strcpy(ch,"string");
	    printf("&ch=%x\n",ch);
	}
in file ref.c:
	extern char *ch;
	ref()
	{
	    printf("&ch=%x\n",ch);
	    printf(ch);
	}
in file main.c:
	main()
	{
	    foo();
	    ref();
	}
The program prints:
	&ch=2eb4
	&ch=64abc
	Segmentation fault (core dumped)
(The addresses are approximately correct)
The compiler didn't resolve the extern char correctly!
Replacing the 'extern char *ch' with 'extern char ch[]' fixes it!
Does anyone know why?
By the way, this is with the 4.1BSD compiler running under Eunice.
		Gary Samad
		decvax!genrad!mit-eddie!gary
gwyn@brl-vgr.ARPA (Doug Gwyn) (04/04/84)
The compiler is just fine.  You are confusing a char[] with a char *.
They are NOT the same thing (try printing their sizeofs, for example).
P.S.  "lint" would have caught this.
P.P.S.  You shouldn't print a (char *) with an int format specifier
unless you cast the (char *) to an int.  "lint" will NOT catch this,
although "Safe C" is supposed to.
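
The two points above can be checked with a short sketch in present-day C. The names arr, ptr and show_sizes are made up for illustration, and the %p format is an assumption about a current compiler rather than the 4.1BSD one discussed here.

```c
#include <assert.h>
#include <stdio.h>

char arr[32];        /* storage for 32 characters */
char *ptr = arr;     /* storage for one address */

void show_sizes(void)
{
    /* The array's size is fixed by its declaration. */
    assert(sizeof arr == 32);

    /* sizeof yields a size_t; cast so the argument matches %lu. */
    printf("sizeof arr = %lu\n", (unsigned long)sizeof arr);
    printf("sizeof ptr = %lu\n", (unsigned long)sizeof ptr);

    /* Print an address with %p and a (void *) argument, not %x. */
    printf("arr is at %p\n", (void *)arr);
}
```

The pointer's size depends on the machine (commonly 2, 4 or 8 bytes), which is why the two objects cannot stand in for each other.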
ark@rabbit.UUCP (Andrew Koenig) (04/04/84)
If, in one file, you say char ch[32]; and in another file, you say char *ch; then your program won't work. Reason: in the first file you have asked for memory to be associated with the external name "ch" that should contain a 32-character array, and in the second you have asked for the same memory to be associated with a character pointer. In those implementations with which I am familiar, the first four (or two) characters in the array will be interpreted as the address pointed to by the pointer. Arrays and pointers are simply different, though they can be used interchangeably in a few contexts.
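
A rough sketch of that reinterpretation, without actually lying to the linker: copy the leading bytes of the array into an integer wide enough to hold an address. The function names are hypothetical, and the sketch assumes uintptr_t exists and is the same size as a char *.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

char ch[32];

/* The bit pattern a mis-declared "extern char *ch" would load: the first
   few bytes of the array, read as if they were an address. */
uintptr_t bytes_read_as_pointer(void)
{
    uintptr_t bits = 0;
    assert(sizeof bits <= sizeof ch);
    memcpy(&bits, ch, sizeof bits);
    return bits;
}

/* The address that the correct array declaration yields. */
uintptr_t real_address(void)
{
    return (uintptr_t)ch;
}
```

After strcpy(ch, "string") on a 32-bit little-endian ASCII machine, bytes_read_as_pointer() would return 0x69727473, the characters 's', 't', 'r', 'i' taken as an address.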
jas@drutx.UUCP (04/04/84)
To paraphrase the question:
     Defining a global array "char ch[ 32 ];" in one file and
     declaring it externally as "extern char *ch" in another
     file causes bad craziness ("core dump").  Is this a bug
     or a feature?
It is most emphatically a feature.  An array of characters is not
the same thing as a character pointer.  Lying to the compiler about
the type of an external variable will result in severe retribution.
To wit:  a reference to ch in the file in which it was defined as
an array of 32 chars will be automatically converted by the compiler
to the address of the first element of the array, because an array
name is converted to a pointer to its first element when it appears
in an expression.
A reference to ch in a file in which it was declared as "extern char *"
will cause the compiler to issue code retrieving the POINTER VALUE STORED
AT THAT LOCATION, i.e., to take the first several chars in the array
("several" usually = 2 or 4, depending on the machine), and interpret
them as a pointer to a character somewhere.  Interpreting the CONTENTS
of "Hi, Mom!" as a character pointer will usually make you point at 
something you later wish you hadn't pointed at.
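
The difference between the two kinds of reference can be seen within a single file, where both declarations are honest. This is only a sketch; greeting, pointer and check_array_vs_pointer are names invented for it.

```c
#include <assert.h>

char greeting[] = "Hi, Mom!";   /* the array: nine bytes, counting the '\0' */
char *pointer = greeting;       /* a separate object holding the array's address */

void check_array_vs_pointer(void)
{
    /* The array name yields the address of its first element;
       nothing is fetched from memory to obtain it. */
    assert(greeting == &greeting[0]);

    /* The pointer name yields the address STORED in the pointer object. */
    assert(pointer == greeting);

    /* That pointer object lives somewhere other than the characters. */
    assert((void *)&pointer != (void *)greeting);
}
```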
Jim Shankland
..!ihnp4!druxy!jas
ks@ecn-ee.UUCP (04/05/84)
ecn-ee!ks    Apr  4 15:58:00 1984
There is an important distinction between the following:
	extern char ch[];
	extern char *ch;
ch[] indicates that you have reserved space elsewhere for some number of
characters and you can use ch as the address of the first reserved space.
*ch indicates that you reserved space for a pointer to some characters
which may or may not contain a valid value such that it points to some real
space that is holding some characters.  In the first case, ch is a "constant",
and in the second case, ch is a variable.
The confusion persists because when either form of ch is passed to a function
as a parameter, it is passed by value.  (The value of an array is the address
of its first element.)  All function parameters can be modified as if they
were automatic variables, so both forms are equivalent only for function
parameter declarations.
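
A small sketch of the parameter case, with hypothetical function names: both functions receive a char *, whichever way the parameter is written, and both may modify it like any local variable.

```c
#include <assert.h>
#include <stddef.h>

/* The [] form is only notation here; s has type char *. */
size_t length_a(char s[])
{
    size_t n = 0;
    while (*s++ != '\0')    /* s is an ordinary local, so it may be changed */
        n++;
    return n;
}

/* Same function with the parameter written as a pointer. */
size_t length_b(char *s)
{
    size_t n = 0;
    while (*s++ != '\0')
        n++;
    return n;
}

void check_parameters(void)
{
    char text[32] = "string";
    assert(length_a(text) == 6);   /* the array's address is passed by value */
    assert(length_b(text) == 6);
    assert(sizeof text == 32);     /* outside the call, text is still an array */
}
```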
In my opinion, this is a very elegant way of doing things, even if it
is confusing at first.
The REAL problem is that many C loaders do not flag the extern declaration
as an error.  They just load incorrect code.  So, the moral of the story is:
			>>>>	USE LINT    <<<<
The compiler is not meant to check for every little inconsistency.
That is why lint is around.
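
Besides lint, one arrangement that lets the compiler itself catch the mismatch (it is not proposed anywhere in this thread) is to keep the extern declaration in a header that the defining file also includes. The sketch below flattens that into a single file; ch.h, ref_length and check_shared are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* --- contents of a hypothetical shared header, ch.h --- */
extern char ch[];       /* declaration: ch is an array defined elsewhere */

/* --- foo.c would include ch.h and then define the array --- */
char ch[32];            /* agrees with the declaration, so this compiles */

/* Had the header said "extern char *ch;", the definition above would be
   diagnosed as a conflicting declaration instead of linking silently. */

/* --- ref.c would include ch.h and use the array --- */
size_t ref_length(void)
{
    return strlen(ch);
}

void check_shared(void)
{
    strcpy(ch, "string");
    assert(ref_length() == 6);
}
```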
					Kirk Smith
					Purdue EE
robert@erix.UUCP (Robert Virding) (04/06/84)
This is one instance when a pointer is not the same as an array. When the compiler sees extern char *ch; it assumes that ch is a pointer to a string of char. However when the compiler sees extern char ch[]; it assumes that ch IS the actual string, not a pointer. The difference in the code generated is how it actually references the external variable ch. I have also come across this feature in writing programs.
Robert Virding @ L M Ericsson, Stockholm
colonel@sunybcs.UUCP (George Sicherman) (04/09/84)
[tail +2]
It's just what you'd expect. The extern char *c expects to find a character address stored in a global word. Instead, it finds an array of bytes. If you call it extern char c[] the problem should go away.
Col. G. L. Sicherman
...seismo!rochester!rocksvax!sunybcs!colonel
john@edai.UUCP (John Hallam) (04/16/84)
Article 185 refers to the problems that arise when a name is declared
	char ch[...];		/* in the first file */
	extern char *ch;	/* in another file */
It appears that the name is not correctly linked.
------------------
The answer to the question raised is that it is not a compiler bug, but it might be called a feature, depending on your definition of those. It is a part of the language definition!
For those who know about l-values and r-values, the explanation is this: In C, array names (and function names) denote an r-value CONSTANT which is determined at link editing time; most other variable names denote l-values, and implicit contents coercion is done when necessary.
Thus in the above, the name 'ch' is first defined as an array (I use "first" in the sense that this declaration actually allocates space for the array) and denotes the address of the storage in which the characters will go. The second declaration informs the compiler that 'ch' is an l-value, i.e. the name now denotes the address of storage in which a pointer value can be put. Thus accessing 'ch' under the second declaration actually gives the value of the first word of the array!
The problem encountered here is more faulty explanation in K&R of identifier semantics than anything else. In an r-value context (when used in expressions) the two declarations give the same TYPE, but the pointer declaration implies a contents coercion (which in this case fetches the first word of the array) and the array declaration implies no contents coercion (because the name already denotes an r-value).
Conversely, you can assign to the pointer-declared name 'ch', because it is an l-value (this is just what l-value means -- can stand on the left of assignments), but if you try to assign to the array-declared name 'ch' you'll get a message 'Lvalue required' or something like it from the compiler.
I hope this makes things a little clearer.
John Hallam. (edai!john).
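
The assignment rule in the last paragraph can be sketched as follows. The names storage, other, p and check_assignment are invented for the example, and the exact wording of the compiler's complaint will vary.

```c
#include <assert.h>

char storage[32] = "string";
char other[] = "other";
char *p = storage;              /* p is an l-value holding an address */

void check_assignment(void)
{
    assert(*p == 's');

    p = other;                  /* allowed: stores a new address in p */
    assert(*p == 'o');

    /* storage = other; */      /* rejected: an array name cannot be assigned to */

    storage[0] = 'S';           /* the elements themselves are l-values */
    assert(storage[0] == 'S');
}
```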