[comp.lang.c] sizeof and multi-dimensional arrays

dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) (01/05/91)

Is the following a compiler bug or am I just confused?

char x[2][3];

  sizeof (*x)          gives 6
  sizeof (x[0])        gives 3.

What's the scoop?




--
Dave Eisen                      	There's something in my library
1447 N. Shoreline Blvd.                     to offend everybody.
Mountain View, CA 94043            --- Washington Coalition Against Censorship
(415) 967-5644                            dkeisen@Gang-of-Four.Stanford.EDU

fred@prisma.cv.ruu.nl (Fred Appelman) (01/05/91)

In <1991Jan5.050613.22303@Neon.Stanford.EDU> dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) writes:

>Is the following a compiler bug or am I just confused?
>
>char x[2][3];
>
>  sizeof (*x)          gives 6
>  sizeof (x[0])        gives 3.
>
>What's the scoop?
>

You are just confused. 
'x' is a two dimensional array of 2*3 elements of type char. Makes a total of
6. 'x[0]' and 'x[1]' are arrays with a length of 3 elements. So both arrays
have a size of 3.

Fred
-- 
Fred J.R. Appelman, 3D Computer Vision, Utrecht University
AZU, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands.
Telephone: +31-30-506710 Fax: +31-30-513399
e-mail: fred@cv.ruu.nl or appelman@cs.unc.edu

jak@sactoh0.SAC.CA.US (Jay A. Konigsberg) (01/06/91)

In article <fred.663069060@prisma> fred@prisma.cv.ruu.nl (Fred Appelman) writes:
>In <1991Jan5.050613.22303@Neon.Stanford.EDU> dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) writes:
>>Is the following a compiler bug or am I just confused?
>>
>>char x[2][3];
>>  sizeof (*x)          gives 6
>>  sizeof (x[0])        gives 3.
>>What's the scoop?
>
>You are just confused. 
>'x' is a two dimensional array of 2*3 elements of type char. Makes a total of
>6. 'x[0]' and 'x[1]' are arrays with a length of 3 elements. So both arrays
>have a size of 3.
>

Something is wrong here. I ran the program with the addition of sizeof(x)
and got the following:

sizeof(x[0])=3
sizeof(*x)=3
sizeof(x)=6

Also I bumped the array to "char x[5][6]" and got:

sizeof(x[0])=6
sizeof(*x)=6
sizeof(x)=30

This seems to be one of the finer differences between pointers and arrays.

sizeof(x)    makes sense as it is returning the total size declared for
	     the array.

sizeof(x[0]) makes sense as it returns the total size of that dimension
	     of the array.

sizeof(*x)   DOES NOT make sense. The size of a pointer on this machine
	     is 4 bytes. (Note: after adding "char *y;", sizeof(y) does return 4.)

-- 
-------------------------------------------------------------
Jay @ SAC-UNIX, Sacramento, Ca.   UUCP=...pacbell!sactoh0!jak
If something is worth doing, it's worth doing correctly.

tim@proton.amd.com (Tim Olson) (01/06/91)

In article <fred.663069060@prisma> fred@prisma.cv.ruu.nl (Fred Appelman) writes:
| In <1991Jan5.050613.22303@Neon.Stanford.EDU> dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) writes:
| 
| >Is the following a compiler bug or am I just confused?
| >
| >char x[2][3];
| >
| >  sizeof (*x)          gives 6
| >  sizeof (x[0])        gives 3.
| >
| >What's the scoop?
| >
| 
| You are just confused. 
| 'x' is a two dimensional array of 2*3 elements of type char. Makes a total of
| 6. 'x[0]' and 'x[1]' are arrays with a length of 3 elements. So both arrays
| have a size of 3.

That is correct, but the first sizeof is '*x', not 'x'.  Thus, it
appears to be a compiler bug -- they both should result in '3'.

--
	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)

rory@maccs.dcss.mcmaster.ca (Rory Jacobs) (01/06/91)

In article <4596@sactoh0.SAC.CA.US> jak@sactoh0.SAC.CA.US (Jay A. Konigsberg) writes:

>sizeof(*x)   DOES NOT make sense. The size of a pointer on this machine
>	     is 4 bytes. (Note: after adding "char *y;", sizeof(y) does return 4.)
>

But it does make sense.

In a sense the array name is a pointer to the array.  To access
the i-th element in an array you could write
   foo[i]

or

   *(foo + i)

Both statements return the i-th element.  

Now back to the original problem: since the above is true (convenient,
as pointer arithmetic is faster than array indexing), *x is equivalent
to *(x+0), which is x[0], and thus they have the same size.

Hope this helps,
   Rory

Rory Jacobs                                   Who me?!?
rory@maccs.dcss.mcmaster.ca                   Let's go Flyers!
...!uunet!uati!utgpu!maccs!rory               I thought it was easy...
Department of Computer Science and Systems    Boring (yawn)!
McMaster University, Hamilton, Ont            Let's have some fun.

wirzeniu@cs.Helsinki.FI (Lars Wirzenius) (01/06/91)

In article <4596@sactoh0.SAC.CA.US> jak@sactoh0.SAC.CA.US (Jay A. Konigsberg) writes:
>[ char x[5][6]; sizeof(*x) gives 6 ]
>sizeof(*x)   DOES NOT make sense. The size of a pointer on this machine
>	     is 4 bytes. (Note: after adding "char *y;", sizeof(y) does return 4.)

But *x isn't a pointer, it's an array.  First the type of x decays
from "array 5 of array 6 of char" into "pointer to array 6 of chars".
(See for example: _Standard_C_, by P.J.Plauger and Jim Brodie, page 74,
or K&R-2, Section A7.1, "Pointer Generation", page 200.)

This pointer is dereferenced with '*', and the result is an array of
type |char [6]|, which has the size 6.
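
The two steps can be written out in C. This is only a minimal sketch using
Jay's `char x[5][6]'; the function names are invented for illustration and
do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

char x[5][6];

/* Nonzero if x, used as a value, is a pointer to its first row. */
int decays_to_first_row(void)
{
	char (*p)[6] = x;	/* pointer to array 6 of char */

	return p == &x[0];
}

/* Size of the object produced by dereferencing that pointer. */
size_t size_of_deref(void)
{
	assert(sizeof(*x) == sizeof(char [6]));
	return sizeof(*x);
}

/* Under sizeof there is no decay: this is the whole array. */
size_t size_of_whole(void)
{
	return sizeof(x);
}
```

On a conforming compiler size_of_deref() should return 6 and
size_of_whole() should return 30, matching Jay's output.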
--
Lars Wirzenius    wirzeniu@cs.helsinki.fi    wirzenius@cc.helsinki.fi

bull@ccs.carleton.ca (Bull Engineers) (01/06/91)

]>>Is the following a compiler bug or am I just confused?
]>>
]>>char x[2][3];
]>>  sizeof (*x)          gives 6
]>>  sizeof (x[0])        gives 3.
]>>What's the scoop?
]>
]>You are just confused. 
]>'x' is a two dimensional array of 2*3 elements of type char. Makes a total of
]>6. 'x[0]' and 'x[1]' are arrays with a length of 3 elements. So both arrays
]>have a size of 3.
]>
]
]Something is wrong here. I ran the program with the addition of sizeof(x)
]and got the following:
]
]sizeof(x[0])=3
]sizeof(*x)=3
]sizeof(x)=6
]
]Also I bumped the array to "char x[5][6]" and got:
]
]sizeof(x[0])=6
]sizeof(*x)=6
]sizeof(x)=30
]
]This seems to be one of the finer differences between pointers and arrays.
]
]sizeof(x)    makes sense as it is returning the total size declared for
]             the array.
]
]sizeof(x[0]) makes sense as it returns the total size of that dimension
]             of the array.
]
]sizeof(*x)   DOES NOT make sense. The size of a pointer on this machine
]             is 4 bytes. (Note: after adding "char *y;", sizeof(y) does return 4.)
]

   Sorry, sizeof(*x) makes perfect sense.  Remember, the * operator
   means "evaluate what's at this address".  This means, that for
   two-dimensional arrays, *x and x[0] are identical by definition.  Try
   this with a three dimensional array z[2][3][4].  sizeof(z) = 24,
   sizeof(z[0]) = 12, and sizeof(*z) = 12 also.  Why?  Because *z
   dereferences the first (0th) dimension of z.
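
   The three-dimensional case can be tried with a sketch like the
   following. The post does not give an element type for z; the figures
   24 and 12 only work out for char, so char is assumed here. The
   function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

char z[2][3][4];	/* element type char is an assumption */

size_t size_of_z(void)
{
	return sizeof(z);	/* 2 * 3 * 4 chars */
}

size_t size_of_z0(void)
{
	return sizeof(z[0]);	/* array 3 of array 4 of char */
}

size_t size_of_star_z(void)
{
	assert(sizeof(*z) == sizeof(z[0]));	/* *z and z[0] name the same object */
	return sizeof(*z);
}

size_t size_of_star_star_z(void)
{
	return sizeof(**z);	/* array 4 of char */
}
```

   Each extra `*' strips one more dimension: 24, then 12, then 4.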

ping@cubmol.bio.columbia.edu (Shiping Zhang) (01/06/91)

In article <4596@sactoh0.SAC.CA.US> jak@sactoh0.SAC.CA.US (Jay A. Konigsberg) writes:

>Also I bumped the array to "char x[5][6]" and got:
 
>sizeof(x[0])=6
>sizeof(*x)=6
>sizeof(x)=30
 
>This seems to be one of the finer differences between pointers and arrays.
...
 
>sizeof(x[0]) makes sense as it returns the total size of that dimension
>	     of the array.
>
>sizeof(*x)   DOES NOT make sense. The size of a pointer on this machine
>	     is 4 bytes. (Note: after adding "char *y;", sizeof(y) does return 4.)

For an array x, *x is equivalent to x[0], so sizeof(x[0]) and sizeof(*x)
should give the same number. Don't confuse *x with &x: *x denotes a
pointer in a declaration, but in an expression it dereferences one.

-ping
 

chris@mimsy.umd.edu (Chris Torek) (01/07/91)

First, the instant replays (I guess I saw too much football yesterday :-) );
then a tutorial essay....


In article <1991Jan5.050613.22303@Neon.Stanford.EDU>
dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) asks why, with his compiler,
>char x[2][3];
>  sizeof (*x)          gives 6
>  sizeof (x[0])        gives 3.
>What's the scoop?

(The correct answer is `There is a bug in that compiler.')

In article <fred.663069060@prisma> fred@prisma.cv.ruu.nl (Fred Appelman)
writes:
>You are just confused. 
>'x' is a two dimensional array of 2*3 elements of type char. Makes a total of
>6. 'x[0]' and 'x[1]' are arrays with a length of 3 elements. So both arrays
>have a size of 3.

This is correct, but does not explain why the compiler produces 6 for
`sizeof (*x)'.  (Of course, no one without the source can explain the
particular bug in that compiler.)

In article <4596@sactoh0.SAC.CA.US> jak@sactoh0.SAC.CA.US (Jay A. Konigsberg)
adds:
>Something is wrong here.

(True enough.)

>sizeof(x)    makes sense as it is returning the total size declared for
>	      the array.
>sizeof(x[0]) makes sense as it returns the total size of that dimension
>	      of the array.

Right.

>sizeof(*x)   DOES NOT make sense. The size of a pointer on this machine
>	      is 4 bytes. (Note: after adding "char *y;", sizeof(y) does return 4.)

Not right.

In article <10303@hydra.Helsinki.FI> wirzeniu@cs.Helsinki.FI (Lars Wirzenius)
corrects Jay Konigsberg:
>But *x isn't a pointer, it's an array.  First the type of x decays
>from "array 5 of array 6 of char" into "pointer to array 6 of chars".
>(See for example: _Standard_C_, by P.J.Plauger and Jim Brodie, page 74,
>or K&R-2, Section A7.1, "Pointer Generation", page 200.)
>
>This pointer is dereferenced with '*', and the result is an array of
>type |char [6]|, which has the size 6.

This is exactly right.

Finally, in article <1991Jan5.232225.14909@ccs.carleton.ca> a mystery
person (`Engineers' seems rather an unlikely surname!) given as
bull@ccs.carleton.ca (Bull Engineers) writes:
>Sorry, sizeof(*x) makes perfect sense.  Remember, the * operator
>means "evaluate what's at this address".  This means, that for
>two-dimensional arrays, *x and x[0] are identical by definition.  Try
>this with a three dimensional array z[2][3][4].  sizeof(z) = 24,
>sizeof(z[0]) = 12, and sizeof(*z) = 12 also.  Why?  Because *z
>dereferences the first (0th) dimension of z.

This is awfully informal, but is the right idea.

[begin tutorial]

	Key concepts:
		types
		objects
		values
		contexts (object and value)
		address-of operator `&' changes object to value
		indirect operator `*' changes value to object
		arrays in object contexts remain arrays
		arrays in value contexts become values

C has five different `places' in which array identifiers (including []
and `*') can appear:

 - declarations and definitions:
	int i, a[10], *p;	/* local, global, extern, whatever */
   These can be further divided into formal parameters and all others.

 - `left hand sides' (`to the left of an assignment'):
	i = 3;
	a[2] = 4;
   This includes the `modifying' operators `++' and `--', i.e., in the
   expression
	a[3] = ++i;
   the `i' being incremented is in a `miniature left hand side' of its
   own.

 - `right hand sides':
	p = a;
   Here `p' is in a `left side', or `left value', or `lvalue', context,
   and `a' is in a `right side', or `right value', or `rvalue', context.

 - sizeof:
	sizeof(a)
   An identifier that follows sizeof is treated as if it were in a `left
   value' context.  (More on this in a bit.)

 - address-of operator:
	&i
   An identifier that follows an address-of ampersand (`&') is also treated
   as if it were an `lvalue'.

Aside from declarations and definitions, then, there are really only
two contexts here, `lvalue' and `rvalue'.  Since an `lvalue' identifier
need not actually appear on the left---as is the case with `++i'
above---I prefer to call these `object' and `value' contexts.  Other
books may use `lvalue' and `rvalue' respectively.

In an object context, we are interested in the object itself.  Usually
the variable name corresponds to some `address' (whatever that is; the
C language does not pin down addresses all that exactly, so that
whatever the system uses for addresses will probably suffice).  `i',
`a', and `p' above each have some address%.  Each variable has a type,
and so each of these addresses also has a type corresponding to the
variable's type:

      name   is a/an              so its address is a
      ----   -------              -------------------
	i    int                  pointer to int
	a    array 10 of int      pointer to array 10 of int%%
	p    pointer to int       pointer to pointer to int

This address is what the `&' operator produces.  The result of the `&'
operator is itself a value, not an object; a value does not have an
address and it is therefore illegal to try to take it, so `&(&i)' is
illegal.  (Most C compilers correctly diagnose this error, although
many do not correctly diagnose `&(&*p)'.  This does not make &(&*p)
legal: even though it *could* be defined as &p, it happens that it is
not.  If you want &p, write &p.)
-----
% Note that `i', `a', and `p' need not be given addresses unless the
  code takes those addresses with `&'.  A smart compiler can, if the
  machine allows it, put objects into machine registers or other
  `special' places.  In a few cases, it can do this even when the
  object's address is taken.  (One example occurs on Pyramid computers,
  where the registers have addresses.) The `register' keyword acts as a
  promise, and sometimes as a recommendation: `I promise not to take
  the address of this variable, and suggest that the compiler might put
  it in a machine register.'  Most modern compilers completely ignore
  the advice, and some do not even hold you to the promise.

%% In `old C' as defined by K&R 1st edition, &a is illegal.  This
  is no longer the case; &a is the address of the array `a', and its
  type is `pointer to array 10 of int'.
-----
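
The difference between `&a' (pointer to array 10 of int) and `&a[0]'
(pointer to int) shows up as soon as one is added to each. The following
is a small sketch of that, with invented function names; it assumes only
the declaration `int a[10]' from the table above.

```c
#include <assert.h>
#include <stddef.h>

int a[10];

/* Bytes between &a and &a + 1: one whole array. */
size_t step_of_array_pointer(void)
{
	int (*pa)[10] = &a;	/* pointer to array 10 of int */

	return (size_t)((char *)(pa + 1) - (char *)pa);
}

/* Bytes between &a[0] and &a[0] + 1: one int. */
size_t step_of_element_pointer(void)
{
	int *pe = &a[0];	/* pointer to int */

	return (size_t)((char *)(pe + 1) - (char *)pe);
}

/* Both pointers refer to the same place; only their types differ. */
int same_address(void)
{
	assert(sizeof(*&a) == sizeof a);
	return (void *)&a == (void *)&a[0];
}
```

The two pointers compare equal once converted to `void *', but stepping
the first moves by 10 * sizeof(int) bytes and stepping the second by
sizeof(int).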

`sizeof' is not really interested in the object's address, but on the
other hand, it is not interested in the object's value either.  Objects
that appear in `sizeof' contexts are used only for their type.  The
size of that type, whatever it is, is `spliced in' as though it were an
integral constant.  (Note that this constant has type `size_t'.)  In
other words, given `char c;', writing `sizeof c' is essentially the
same as writing `(size_t)1'.
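
One way to see that only the type is used is to put a function call
inside `sizeof' and check that the function never runs. This is a sketch
with hypothetical names (`counted', `calls'), not anything from the
original posting.

```c
#include <assert.h>
#include <stddef.h>

static int calls;	/* how many times counted() has actually run */

int counted(void)
{
	calls++;
	return 0;
}

/* Uses counted() only as a sizeof operand, then reports the call count. */
int calls_after_sizeof(void)
{
	size_t n = sizeof(counted());	/* only the type, int, is used */

	assert(n == sizeof(int));
	return calls;
}

size_t size_of_a_char(void)
{
	char c = 'x';

	(void)c;
	return sizeof c;	/* same as (size_t)1 */
}
```

(C99 variable-length arrays, which postdate this thread, are the one case
where a `sizeof' operand is evaluated.)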

This leaves assignments and value contexts (and declarations and
definitions, which I am ignoring).  Here things start to get a bit
peculiar.  For sizeof and address-of, we are only interested in the
size and type of the object that follows, but in assignments and
values, we need the value of the object as well---sometimes to fetch
it, sometimes to set it, sometimes both.  This is all well and good for
`simple' objects like `i', for pointers like `p', and (these days) even
for structure and union objects (with some restrictions).  But array
objects are different.  They get no respect.

An assignment to an array object is simply illegal.  (Note that the
initial value that may appear in a definition is not an assignment%:
it is an initializer.  That is why it is legal there.)  `i = 3;' is
fine, but `a = { 0,1,2,3,4,5,6,7,8,9 };' is not.  You might think,
then, that taking the value of an array would also be illegal.
-----
% Well, technically speaking, at least.  It looks and acts like an
  assignment, but the rules regarding what is and is not legal are
  different.
-----
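
A short sketch of what is and is not allowed. The structure `wrapped' is
a hypothetical name introduced here only to show that structure
assignment, unlike array assignment, is legal; the lines that would not
compile are left as comments.

```c
#include <assert.h>

struct wrapped { int v[10]; };	/* hypothetical wrapper, for illustration */

int copy_via_struct(void)
{
	int a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };	/* initializer: legal */
	struct wrapped s, t;
	int i;

	/* a = { 0, 1, 2 };     would not compile: this is not an initializer */
	/* int b[10]; b = a;    would not compile either: b cannot be assigned to */

	for (i = 0; i < 10; i++)
		s.v[i] = a[i];	/* element-by-element assignment is fine */
	t = s;			/* structure assignment copies the array inside */

	assert(t.v[0] == 0);
	return t.v[9];
}
```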

Here is where things get very strange.

Instead of being outlawed, an attempt to take the `value' of an array
is treated as an attempt to take the address of the first element of
the array (the one with subscript 0).  So in
	p = a;
the compiler pretends you wrote instead
	p = &a[0];
a[0] is an object of type `int', therefore its address is a value of
type `pointer to int', so we have an assignment with a `pointer to int'
on the left (p) and a `pointer to int' on the right (&a[0]) and everything
is okay.

There is a subtlety here as well.  How did we name a[0] in the first place?

The expression
	a[0]
breaks down into four sub-expressions:
	a
	0
	add
	indirect
As above, the `a' turns into the address of a[0].  To this value we
add 0 (leaving it unchanged) and then indirect.  This changes the value
`pointer to a[0]' into the object `a[0]'.  In other words, we have to
know where a[0] is in order to find a[0]!  So it is a good thing we can
find a[0] by asking for `a'.
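
The breakdown can be checked directly. In this sketch (function names
invented for the example) the same element is reached through the
subscript, through the explicit add-and-indirect form, and through the
reversed subscript that the commutative addition permits.

```c
#include <assert.h>

int a[10];

/* Nonzero if `p = a' stores the address of a[0]. */
int same_pointer(void)
{
	int *p = a;		/* the compiler treats this as p = &a[0] */

	return p == &a[0];
}

/* Stores a value in a[i] and reads it back three equivalent ways. */
int read_three_ways(int i)
{
	a[i] = 40 + i;
	assert(a[i] == *(a + i));	/* the definition of [] */
	assert(*(a + i) == i[a]);	/* addition commutes, so this works too */
	return *(a + i);
}
```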

Formally, then, the rule is:

    In a value context, an object of type `array N of T' (where N is an
    integral constant and T is a legal type) becomes a value of type
    `pointer to T' whose value is the address of the first element---
    element number 0---of that array.

Remember also that the `&' address-of operator takes an object and
produces a value, and that the `*' indirect operator takes a value
and produces an object.  For `&' the value produced has type `pointer
to ...' while for `*' the value consumed must have type `pointer to ...'.
In each case the `...' represents the type of the object (whether
consumed or produced).

Rewinding to the original question, then:
>char x[2][3];
>  sizeof (*x)          gives 6
>  sizeof (x[0])        gives 3.
>What's the scoop?

We can see that this is a compiler bug by expanding the two arguments
to `sizeof'.  These are each in object context and we want their types.
First we have

	*x

This means that x appears in a value context (`*' takes a value and
produces an object).  It had better come out as a value of type `pointer
to ...'.  Well: `x' is an `array 2 of array 3 of char', but as noted
above, an array in a value context gets changed:

    In a value context, an object of type `array N of T' (where N is an
    integral constant and T is a legal type) becomes a value of type
    `pointer to T' whose value is the address of the first element---
    element number 0---of that array.

so we have an array with N=2 and T=`array 3 of char'.  This becomes a
value of type `pointer to T', or in this case, `pointer to array 3 of
char', pointing to the first element of x (x[0]).  So we can apply the
indirecting `*'.  The indirection changes this `pointer to array 3 of char'
into the object `array 3 of char'.  Thus we want the size of an object
that is an `array 3 of char'; by definition, this is the value `3'.

To check `sizeof x[0]', do the same thing.  Write down the expression:

	sizeof x[0]

Break down the subexpression x[0] by rewriting according to its definition:

	*( (x) + (0) )

Handle the subexpression x+0, noting the contexts:

	*( [value] ( [value] (x) + [value] (0) ) )

`x' is an array in a value context, so apply The Rule from above:

	[value] (x)						=
	[value] <object, array 2 of array 3 of char, x>		=
	[value] <value, pointer to array 3 of char, &x[0]>

Adding 0 leaves the pointer unchanged, so apply the `*':

	*( <value, pointer to array 3 of char, &x[0]> )		=
	<object, array 3 of char, x[0]>

Now we have an object in an object context (target of `sizeof') so
we just read its type---`array 3 of char'---and decide its size: 3.
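
Both expansions can be checked with a few lines of C. This is only a
sketch: the function names are invented, and the expected values (3, 3
and 6) are what a conforming compiler should give, which Dave Eisen's
evidently did not.

```c
#include <assert.h>
#include <stddef.h>

char x[2][3];

size_t size_of_star_x(void)
{
	/* x[0] is defined as *(x + 0), so both operands name the same object */
	assert(sizeof(*x) == sizeof(*(x + 0)));
	return sizeof(*x);	/* array 3 of char */
}

size_t size_of_x0(void)
{
	return sizeof(x[0]);	/* array 3 of char */
}

size_t size_of_x(void)
{
	return sizeof(x);	/* array 2 of array 3 of char */
}
```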

Incidentally, sizeof can handle values as well as objects: `sizeof 3+4'
produces the same constant as `sizeof(int)'.  Sizeof is unique in this;
other C operators that take objects refuse to work on values.  Of
course, sizeof can also take a type in parentheses, which shows just
how special it is.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

bengsig@oracle.nl (Bjorn Engsig) (01/07/91)

Article <28950@mimsy.umd.edu> by chris@mimsy.umd.edu (Chris Torek) says:
|First, the instant replays (I guess I saw too much football yesterday :-) );
|then a tutorial essay....
More than 250 instructive lines - amazing that you have the time to
do it....
-- 
Bjorn Engsig,         E-mail: bengsig@oracle.com, bengsig@oracle.nl
ORACLE Corporation    Path:   uunet!orcenl!bengsig

            "Stepping in others footsteps, doesn't bring you ahead"

chris@mimsy.umd.edu (Chris Torek) (01/09/91)

(Two corrections from Karl Heuer)

In article <28950@mimsy.umd.edu> I wrote:
>C has five different `places' in which array identifiers (including []
>and `*') can appear:

`identifiers' is the wrong word (as it happens, it was left over from a
small edit I made in that section).  `Array notations' might be better.

And:
>Incidentally, sizeof can handle values as well as objects: `sizeof 3+4'
>produces the same constant as `sizeof(int)'.

Make that `sizeof(3+4)': otherwise it acts as (sizeof 3) + 4.
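
A small sketch of the difference, with made-up function names. Nothing
here depends on the actual size of an int, only on how the two
expressions are parsed.

```c
#include <assert.h>
#include <stddef.h>

size_t without_parens(void)
{
	return sizeof 3+4;	/* parsed as (sizeof 3) + 4 */
}

size_t with_parens(void)
{
	assert(sizeof(3+4) == sizeof(int));	/* 3+4 has type int */
	return sizeof(3+4);
}
```

On a machine with 4-byte ints the first returns 8 and the second 4.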

chris@mimsy.umd.edu (Chris Torek) (01/09/91)

In article <1991Jan8.234328.5075@sol.UVic.CA> gmclaren@sirius.UVic.CA
(Gavin  Mclaren) writes:
>>char x[2][3];
>What's to be confused about?  x is a pointer to a two dimensional array,

No, x is an <object, array 2 of array 3 of char>.  There is one thing to
be confused about already.  (To get a pointer to x, write `&x'.  This
gives a <value, pointer to array 2 of array 3 of char, &x>.)

>Is this perhaps an example of how some are confused by the pointer-array 
>equivalence _theory_ in the C language?

Yes.  There is no real `pointer-array equivalence'; there is, however, a
(one, single) rule that makes the use of arrays and pointers use the same
syntax.  That rule is:

    In a value context, an object of type `array N of T' (where N is
    some integral constant and T is a suitable type, including another
    `array Nprime of Tprime') is converted to a value of type `pointer
    to T' which points to the first element of the array, i.e., the one
    with subscript 0.

This conversion occurs only in value contexts.  (There is a completely
separate rule that applies to formal parameter declarations.)
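
The separate rule for formal parameters can be seen in a sketch like this
one. The names `second_char' and `demo_second_char' are hypothetical.
The parameter is written with array syntax, yet it behaves as a pointer:
its address is a `char **' and it can be incremented, neither of which
would be accepted for a real array object.

```c
#include <assert.h>

/* Declared with array syntax, but the parameter s really has type char *. */
char second_char(char s[10])
{
	char **pp = &s;	/* would be a type mismatch if s were an array */

	s++;		/* an array object could not be incremented */
	assert(*pp == s);
	return **pp;
}

char demo_second_char(void)
{
	char buf[10] = "hello";
	char (*pa)[10] = &buf;	/* buf is a real array: &buf points to an array */

	assert(sizeof(*pa) == 10);
	return second_char(buf);	/* buf becomes &buf[0] in this value context */
}
```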

>My best advice is to muddle through the FAQ, one more time....

The FAQ answers are not intended as full tutorials (there is not enough
space, among other things).

For much more detail about this, see my previous postings.