[net.unix-wizards] PCC, lint bug

root%bostonu.csnet@csnet-relay.arpa (BostonU SysMgr) (08/27/85)

The following totally reasonable looking garbage compiles and passes
lint -hp without a peep. It printed garbage on my 4.2 VAX, core dumped
on my UNIX/PC (SYSV). I realize the difference between a two dimensional
array and a pointer to a pointer (or whatever, pluralize), apparently
neither C nor lint does. Sorry if this has been covered.

----------
main()
{
	int x[2][2] ;
	int **xp = x ;
	int i,j ;

	for(i=0 ; i < 2 ; i++)
		for(j=0 ; j<2 ; j++)
			printf("%d\n",x[i][j] = i+j) ;
	for(i=0 ; i < 2 ; i++)
		for(j=0 ; j < 2 ; j++)
			printf("%d\n",xp[i][j]) ;
}

----------
		-Barry Shein, Boston University
P.S. Found this while trying to speed up a hand translation of Footran->C

guy@sun.uucp (Guy Harris) (08/30/85)

This really belonged in net.lang.c, for reasons which will be apparent
shortly...

> The following totally reasonable looking garbage compiles and passes
> lint -hp without a peep. It printed garbage on my 4.2 VAX, core dumped
> on my UNIX/PC (SYSV). I realize the difference between a two dimensional
> array and a pointer to a pointer (or whatever, pluralize), apparently
> neither C nor lint does. Sorry if this has been covered.
> (excerpted)
> ----------
> 	int x[2][2] ;
> 	int **xp = x ;
> 			printf("%d\n",x[i][j] = i+j) ;
> 			printf("%d\n",xp[i][j]) ;

C does know the difference between "array of X" and "pointer to X"; however,
when the name of an "array of X" is used it evaluates to a pointer to the
first member of that array, hence a "pointer to X".

xp[i][j] is (xp[i])[j].  xp[i] is *(xp + i).  "xp" is a pointer to a pointer
to an "int", as is xp + i.  *(xp + i) is thus a pointer to an "int".
(xp[i])[j] is thus (*(xp + i))[j].  Call *(xp + i) Xp.  (xp[i])[j] is Xp[j].
This is *(Xp + j).  "Xp" is a pointer to an int, as is Xp + j, so *(Xp + j)
is an "int".  The code is perfectly legal C.  Any C compiler or "lint" which
*rejected* it would have a bug.  Why the program drops core is left as an
exercise for the reader.  (Hint - has what "xp" points to been initialized?
Is code that dereferences an uninitialized pointer likely to work?)

Does anybody else think that the array/pointer semi-equivalence is probably
one of the major causes of errors in C coding?

	Guy Harris

root%bostonu.csnet@csnet-relay.arpa (BostonU SysMgr) (09/01/85)

>This really belonged in net.lang.c, for reasons which will be apparent
>shortly...
>
>> The following totally reasonable looking garbage compiles and passes
>> lint -hp without a peep. It printed garbage on my 4.2 VAX, core dumped
>> on my UNIX/PC (SYSV). I realize the difference between a two dimensional
>> array and a pointer to a pointer (or whatever, pluralize), apparently
>> neither C nor lint does. Sorry if this has been covered.
> (excerpted)
> ----------
>> 	int x[2][2] ;
>> 	int **xp = x ;
>> 			printf("%d\n",x[i][j] = i+j) ;
>> 			printf("%d\n",xp[i][j]) ;
>
>C does know the difference between "array of X" and "pointer to X"; however,
>when the name of an "array of X" is used it evaluates to a pointer to the
>first member of that array, hence a "pointer to X".
>
>xp[i][j] is (xp[i])[j].  xp[i] is *(xp + i).  "xp" is a pointer to a pointer
>to an "int", as is xp + i.  *(xp + i) is thus a pointer to an "int".
>(xp[i])[j] is thus (*(xp + i))[j].  Call *(xp + i) Xp.  (xp[i])[j] is Xp[j].
>This is *(Xp + j).  "Xp" is a pointer to an int, as is Xp + j, so *(Xp + j)
>is an "int".  The code is perfectly legal C.  Any C compiler or "lint" which
>*rejected* it would have a bug.  Why the program drops core is left as an
>exercise for the reader.  (Hint - has what "xp" points to been initialized?
>Is code that dereferences an uninitialized pointer likely to work?)
>
>	Guy Harris

WRONG WRONG WRONG

THE  ERROR IS ALLOWING THE DECLARATION TO PASS BOTH C AND LINT:

	int x[STUFF][THING] ;	/* the name 'x' is a pointer to an int */
	int **xp = x ;		/* not a pointer to a pointer */

I do not believe *any* reading of 'x' lets it be a pointer to a pointer.
My, ahem, point stands: it's a bug in the compiler, not a misunderstanding
of C. The semantics of a two dimensional array (x[STUFF][THING]) are not
at all the same as those of an array of pointers (*x[STUFF]); the former
involves only a base pointer and STUFF*THING ints (in this case). Therefore,
perhaps, UNIX-WIZARDS, not net.lang.c, if I read intentions right (this is
where interesting global bugs go).

Here is the code for a trivial proof of this on a VAX (4.2):

int x[4][3] ;	/* make these externs so the names show up in the asm */
int **xp = x ;	/* wrong, type clash, but neither C nor LINT cares */
int i ;
foo()
{
	i = 4 ;
	x[2][2] = i ;
	xp[2][2] = i ;
}
(Now, cc -S:)
LL0:
	.data
	.comm	_x,48		<- 4*3*sizeof(int)
	.align	2
	.globl	_xp
_xp:
	.long	_x		<- wrong, type clash
	.comm	_i,4
	.text
	.align	1
	.globl	_foo
_foo:
	.word	L15
	jbr 	L17
L18:
	movl	$4,_i
	movl	_i,_x+32	<- doesn't deref anything, correct (x[2][2])
	movl	_xp,r0		<- merrily derefs a pointer (xp[2][2])
				this is correct code, but the decl should
				have caused warnings
	movl	8(r0),r0
	movl	_i,8(r0)
	ret
	.set	L15,0x0
L17:
	jbr 	L18
	.data

If you are still not convinced this is a bug, change all the occurrences
of 'int' to 'char' above and run it through C and lint; lint warns only
when given -hp ("possible pointer alignment problem" ?!?! Still wrong.)

		-Barry Shein, Boston University

guy@sun.uucp (Guy Harris) (09/03/85)

> 	int x[STUFF][THING] ;	/* the name 'x' is a pointer to an int */

WRONG WRONG WRONG

The name "x" is a pointer to an array of THING "int"s, not a pointer to an
"int".

(See article in net.lang.{c,f77} for a discussion of the way arrays work in
C.  They aren't equivalent to pointers.  The name of an array is equivalent
to a pointer to its first member, but if the members of an array are
themselves arrays a pointer to the first member of such an array is not
equivalent to a pointer to the first member of the first member of that
array.)

The type checking in PCC/lint is incorrect; it takes two types and keeps
stripping one layer of pointer/array off both of them until 1) one type is a
scalar type and the other isn't, which is an error or 2) both types are
scalar types, in which case the scalar types are compared.  "scalar"
includes "structure", etc. here...

	Guy Harris

friesen@psivax.UUCP (Stanley Friesen) (09/03/85)

In article <1152@brl-tgr.ARPA> root%bostonu.csnet@csnet-relay.arpa (BostonU SysMgr) writes:
>
>
>WRONG WRONG WRONG
>
>THE  ERROR IS ALLOWING THE DECLARATION TO PASS BOTH C AND LINT:
>
>	int x[STUFF][THING] ;	/* the name 'x' is a pointer to an int */
>	int **xp = x ;		/* not a pointer to a pointer */
>
>I do not believe *any* reading of 'x' lets it be a pointer to a pointer.

	Still wrong: x is a pointer to an array of THING int's!
This is *not* the same as a pointer to an int or a pointer to a
pointer to an int! The difference shows up in the following:

	(x + 1) == &x[1]  *not* &x[0][1]
That is, adding one to a pointer increments it by *the* *size* *of*
*the* *object* pointed to, in this case sizeof(int [THING]), not
sizeof(int). The correct declarations above would be:

	int x[STUFF][THING];
	int (*xp)[THING] = x;
-- 

				Sarima (Stanley Friesen)

UUCP: {ttidca|ihnp4|sdcrdcf|quad1|nrcvax|bellcore|logico}!psivax!friesen
ARPA: ttidca!psivax!friesen@rand-unix.arpa

jrife@fthood (09/04/85)

>Does anybody else think that the array/pointer semi-equivalence is probably
>one of the major causes of errors in C coding?
>
>	Guy Harris

I always thought the array/pointer semi-equivalence was one of the major causes
of easier to {read,understand} coding.  And, sometimes it's just necessary to
switch between pointer and array, for the sake of sanity.  At least I've felt
this way, but then my sanity has never been too solid.


--

					*********************************
					*				*
					* Jeff Rife			*
					* ihnp4!uiucuxc!fthood!jrife	*
					*				*
					* "Gene Simmons never had a	*
					*  personal computer when he	*
					*  was a kid."			*
					*	     --Berke Breathed	*
					*				*
					*********************************

root@bu-cs.UUCP (Barry Shein) (09/05/85)

>> 	int x[STUFF][THING] ;	/* the name 'x' is a pointer to an int */

>WRONG WRONG WRONG

>The name "x" is a pointer to an array of THING "int"s, not a pointer to an
>"int".
>	Guy Harris

RIGHT RIGHT RIGHT

My explanation was flawed; we agree on the PCC/LINT bug, which misses
clearly illegal and useless type clashes. Sigh.

	-Barry Shein, Boston University

scottha@copper.UUCP (Scott Hankerson) (09/05/85)

In article <1152@brl-tgr.ARPA> root%bostonu.csnet@csnet-relay.arpa (BostonU SysMgr) writes:
>
>>This really belonged in net.lang.c, for reasons which will be apparent
>>shortly...
>>
>>> The following totally reasonable looking garbage compiles and passes
>>> lint -hp without a peep. It printed garbage on my 4.2 VAX, core dumped
>>> on my UNIX/PC (SYSV). I realize the difference between a two dimensional
>>> array and a pointer to a pointer (or whatever, pluralize), apparently
>>> neither C nor lint does. Sorry if this has been covered.
>> (excerpted)
>> ----------
>>> 	int x[2][2] ;
>>> 	int **xp = x ;
>>> 			printf("%d\n",x[i][j] = i+j) ;
>>> 			printf("%d\n",xp[i][j]) ;
>>
>>C does know the difference between "array of X" and "pointer to X"; however,
>>when the name of an "array of X" is used it evaluates to a pointer to the
>>first member of that array, hence a "pointer to X".
>>
>>xp[i][j] is (xp[i])[j].  xp[i] is *(xp + i).  "xp" is a pointer to a pointer
>>to an "int", as is xp + i.  *(xp + i) is thus a pointer to an "int".
>>(xp[i])[j] is thus (*(xp + i))[j].  Call *(xp + i) Xp.  (xp[i])[j] is Xp[j].
>>This is *(Xp + j).  "Xp" is a pointer to an int, as is Xp + j, so *(Xp + j)
>>is an "int".  The code is perfectly legal C.  Any C compiler or "lint" which
>>*rejected* it would have a bug.  Why the program drops core is left as an
>>exercise for the reader.  (Hint - has what "xp" points to been initialized?
>>Is code that dereferences an uninitialized pointer likely to work?)
>>
>>	Guy Harris
>
>WRONG WRONG WRONG
>
>THE  ERROR IS ALLOWING THE DECLARATION TO PASS BOTH C AND LINT:
>
>	int x[STUFF][THING] ;	/* the name 'x' is a pointer to an int */
>	int **xp = x ;		/* not a pointer to a pointer */
>
>I do not believe *any* reading of 'x' lets it be a pointer to a pointer.
>	.
>	.
>	.
>		-Barry Shein, Boston University


Who is WRONG WRONG WRONG??  Page 104 of the 1978 edition of \The C
Programming Language/ by Kernighan and Ritchie says:

	   In C, by definition a two-dimensional array is really a
	one-dimensional array, each of whose elements is an array.
	Hence subscripts are written as

		day_tab[i][j]

	rather than

		day_tab[i, j]

	as in most languages.  Other than this, a two-dimensional array
	can be treated in much the same way as in other languages.

There's still the question of whether or not xp (in the original example)
is properly initialized.  But since when did C care if a pointer is
initialized?

Scott Hankerson
tektronix!copper!scottha

anton@ucbvax.ARPA (Jeff Anton) (09/07/85)

Those who have had enough of the junk should only read this first page.
>>>> 	int x[2][2] ;
>>>> 	int **xp = x ;
>>>> 			printf("%d\n",x[i][j] = i+j) ;
>>>> 			printf("%d\n",xp[i][j]) ;

int	**xp = x;  /* is wrong, wrong, wrong; lint should not accept this */
int	xp[][2] = x; /* is right */
int	(*xp)[2] = x; /* is also right */
int	*xp[2] = x; /* is WRONG */

I'm disgusted at the number of 'wizards' who are confused.
'x' should best be thought of as a pointer to array of two ints.
int	y[2][2][2]; /* y should best be thought of as a pointer to
			two dimensional array of ints ([2][2]) */
int	yp[][2][2] = y; /* is a proper pointer */

In article <81@copper.UUCP> scottha@copper.UUCP (Scott Hankerson) writes:
>In article <1152@brl-tgr.ARPA> root%bostonu.csnet@csnet-relay.arpa (BostonU SysMgr) writes:
>>
>>>This really belonged in net.lang.c, for reasons which will be apparent
>>>shortly...
>>>
>>>> The following totally reasonable looking garbage compiles and passes
>>>> lint -hp without a peep. It printed garbage on my 4.2 VAX, core dumped
>>>> on my UNIX/PC (SYSV). I realize the difference between a two dimensional
>>>> array and a pointer to a pointer (or whatever, pluralize), apparently
>>>> neither C nor lint does. Sorry if this has been covered.
>>> (excerpted)
>>> ----------
>>>> 	int x[2][2] ;
>>>> 	int **xp = x ;
>>>> 			printf("%d\n",x[i][j] = i+j) ;
>>>> 			printf("%d\n",xp[i][j]) ;
>>>
>>>C does know the difference between "array of X" and "pointer to X"; however,
---> This writer doesn't. (JAA)
>>>when the name of an "array of X" is used it evaluates to a pointer to the
>>>first member of that array, hence a "pointer to X".
>>>
>>>xp[i][j] is (xp[i])[j].  xp[i] is *(xp + i).  "xp" is a pointer to a pointer
>>>to an "int", as is xp + i.  *(xp + i) is thus a pointer to an "int".
>>>(xp[i])[j] is thus (*(xp + i))[j].  Call *(xp + i) Xp.  (xp[i])[j] is Xp[j].
>>>This is *(Xp + j).  "Xp" is a pointer to an int, as is Xp + j, so *(Xp + j)
>>>is an "int".  The code is perfectly legal C.  Any C compiler or "lint" which
>>>*rejected* it would have a bug.  Why the program drops core is left as an
>>>exercise for the reader.  (Hint - has what "xp" points to been initialized?
>>>Is code that dereferences an uninitialized pointer likely to work?)
>>>
>>>	Guy Harris
---> No cigar for Guy, He's got the semantics of 'xp' down, but not 'x'.
>>
>>WRONG WRONG WRONG
>>THE  ERROR IS ALLOWING THE DECLARATION TO PASS BOTH C AND LINT:
>>	int x[STUFF][THING] ;	/* the name 'x' is a pointer to an int */
>>	int **xp = x ;		/* not a pointer to a pointer */
>>I do not believe *any* reading of 'x' lets it be a pointer to a pointer.
>>		-Barry Shein, Boston University
---> Barry understands the problem but not the solution.
>
>Who is WRONG WRONG WRONG??  Page 104 of the 1978 edition of \The C
>Programming Language/ by Kernighan and Ritchie says:
>
>	   In C, by definition a two-dimensional array is really a
>	one-dimensional array, each of whose elements is an array.
---> K & R are right.  int (*xp)[2]; /* xp is a pointer to array of 2 ints */
---> Thus, xp++ moves xp past two (2) ints
>	Hence subscripts are written as
>
>		day_tab[i][j]
>
>	rather than
>
>		day_tab[i, j]
>
>	as in most languages.  Other than this, a two-dimensional array
>	can be treated in much the same way as in other languages.
>
>There's still the question of whether or not xp (in the original example)
>is properly initialized.  But since when did C care if a pointer is
>initialized?
>
>Scott Hankerson
>tektronix!copper!scottha

I thank those people who understand this stuff and are still reading.
C does have a problem here.  Consider the case where you wish to pass
multi-dimensional arrays to utility functions.  Currently, the receiving
function must be compiled with the proper dimensions. ex:
foo(matrix)	/* proper definition for passing 4x4 arrays */
double matrix[][4];
{ ....
C should let you pass what dimensions you want. ex:
bar(matrix, dim)
double matrix[][dim];
{ ....
This does add complexity, but the only way around the problem now
is to do your own array deref:
baz(matrix, dim)
double matrix[];	/* note one dimension */
int    dim;
{
	double d = matrix[2*dim+3];	/* to get matrix[2][3] */

Of course, you can avoid the whole mess by using
an array of pointers to arrays of type.  ex:
int	*x[2];
int	x0[2], x1[2];
/* later */
x[0]=x0; x[1]=x1;
But this requires run time code to set up properly.
(I think I've managed to send follow ups to net.lang.c automagically
for those people who are impulsive with 'f'.)
-- 
C knows no bounds.
					Jeff Anton
					U.C.Berkeley
					Ingres Group
					ucbvax!anton
					anton@BERKELEY.EDU

throopw@rtp47.UUCP (Wayne Throop) (09/10/85)

> I'm disgusted at the number of 'wizards' who are confused.
> 'x' should best be thought of as a pointer to array of two ints.
> int   y[2][2][2]; /* y should best be thought of as a pointer to
>                       two dimensional array of ints ([2][2]) */
> int   yp[][2][2] = y; /* is a proper pointer */
>                       Jeff Anton ucbvax!anton anton@BERKELEY.EDU

Now wait a minute.  As near as I can tell from this fragment, Jeff is
as confused as any other wizard.  Since an initializer is used in the
declaration

        int yp[][2][2] = y;

yp is clearly *not* a formal.  And this being the case, *yp* *is* *not*
(I repeat *not*) a pointer.  I assume what is meant to happen for the
two above declarations is

        int y[2][2][2];
        int (*yp)[2][2] = y;

The abomination Jeff gave above declares yp to be an array of 1
array of 2 array of 2 integers, and (if the compiler doesn't choke)
initializes yp[0][0][0] to be the expression (int)y.  Yuck.

I tried this C file on SysV lint:

    int x1[2][2];
    int x2[][2] = x1;  /* I hope lint complains here */

    int (*x3)[2] = x1; /* I hope lint doesn't complain here */

And lint quite properly complained, saying:

    warning: illegal combination of pointer and integer:
        (2)  operator =

It is interesting that pointer/array equivalence causes such problems,
when it is really so simple.   The *only* (I repeat *only*) place where
a declarator like "int x[]" declares x to be a pointer is in *formal*
declarations.  In static, external, or automatic declarations, *x* *is*
*an* *array* (of compiler or loader determined size).  And even in
formal declarations, x "should be" *thought of* as an array.

--
Note that Followup-To specifies net.lang.c
--
"People who live in glass houses, shouldn't"
-- 
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!rtp47!throopw

guy@sun.uucp (Guy Harris) (09/10/85)

> >Does anybody else think that the array/pointer semi-equivalence is probably
> >one of the major causes of errors in C coding?
> 
> I always thought the array/pointer semi-equivalence was one of the major
> causes of easier to {read,understand} coding.

You missed the point.  The array/pointer semi-equivalence seems to be one of
the major causes of *incorrect* code, as well.  It's not a question of how
readable the code is once it's written, it's a question of whether it'll get
written correctly in the first place.  Every so often, somebody asks why

file1.c:

extern int *foo;

file2.c:

int foo[42];

doesn't work.  The person asking the question obviously doesn't realize that
just because you can say "foo[13]" in either case doesn't mean that the two
declarations mean the same thing.  Another common question is "why doesn't
this work?":

	struct frobozz *foo;

	get_frobozz_value(foo);

which code proceeds to smash some arbitrary location pointed to by the
uninitialized pointer "foo".  I suspect this assumption that a pointer to a
structure, absent a structure for it to point to, is just as good as a
structure is fallout from pointer/array confusion.

	Guy Harris