[comp.lang.c] Another silly question

kelly@nmtsun.nmt.edu (Sean Kelly) (04/27/89)

My CS instructor and I disagree about a certain moot point.  I have a text
book which says that

	*(a + i)	and	a[i]

are equivalent, given an array a, and int index i ... each gives the
value stored in a[i].  But he says that

	*(a + i)

is non-standard and would not expect it to go far on all _real_ C compilers
(_real_ meaning those compilers that are somewhat devoted to K & R or ANSI).
He expects that many compilers would instead add the value of i to the
pointer a, and then reference the item stored there.  I say that the
compiler's smart enough to realize what we're trying to achieve, and
won't do something like

	* (char *) ( (int) a + i )

which he thinks it will probably do on most machines.  It doesn't on our
Suns nor our VAX.

I don't have a copy of K&R's book, first or new edition, just _Programming_
_in_C_ by S. Kochan, which seems pretty valid.

What do you think?

--
Sean Kelly                    I'm not a number, I am a free man!
kelly@nmtsun.nmt.edu          --The Prisoner
--

gwyn@smoke.BRL.MIL (Doug Gwyn) (04/27/89)

In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes:
>He expects that many compilers would instead add the value of i to the
>pointer a, and then reference the item stored there.

In C, pointer arithmetic ALWAYS involves scaling by the size of the
pointed-to objects.  This is one of Dennis's really useful insights.
It is so fundamental to C that I have to worry about an instructor
who claims otherwise.
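
A minimal sketch of that scaling, assuming a C99 compiler. The helper name byte_offset_long is made up for illustration, and a + i must stay inside the array (or one past its end):

```c
#include <stddef.h>

/* Hypothetical helper: distance in bytes between a and a + i,
   measured by viewing both addresses as char pointers. */
size_t byte_offset_long(const long *a, size_t i)
{
    return (size_t)((const char *)(a + i) - (const char *)a);
}
```

For an array of long, byte_offset_long(a, 3) should come out as 3 * sizeof(long) rather than 3, which is the scaling the compiler applies on your behalf.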

krazy@claris.com (Jeff Erickson) (04/27/89)

From article <2459@nmtsun.nmt.edu>, by kelly@nmtsun.nmt.edu (Sean Kelly):
> He expects that many compilers would instead add the value of i to the
> pointer a, and then reference the item stored there.  I say that the
> compiler's smart enough to realize what we're trying to achieve, and
> won't do something like "* (char *) ((int) a+i)" which he thinks it
> will probably do on most machines.

You're right.  Your instructor is full of donkey doo-doo.

In fact, since a[i] = *(a+i), and (a+i)=(i+a), you can actually write i[a]
for a[i] and most compilers will take it!  (Every one I've tried has, anyway.)

I refer you to page 205 of K&R, second edition.

	"A pointer to an object in an array and a value of any integral
	type may be added.  The latter is converted to an address
	offset by multiplying it by the size of the object to which
	the pointer points.  The sum is the same type as the original
	pointer, and points to another object in the same array,
	appropriately offset from the original object.  Thus, if P
	is a pointer to an object in an array, the expression P+1 is
	a pointer to the next object in the array."

If I were you, I'd question your instructor's qualifications to his superior.
This is one of *the* most useful features of C.  He obviously isn't well-
versed in the language he's trying to teach you.    ~~~~~~~~~
-- 
Jeff Erickson       Claris Corporation  | Birdie, birdie, in the sky,
408/987-7309      Applelink: Erickson4  |   Why'd you do that in my eye?
krazy@claris.com     ames!claris!krazy  | I won't fret, and I won't cry.
       "I'm a heppy, heppy ket!"        |   I'm just glad that cows don't fly.

cik@l.cc.purdue.edu (Herman Rubin) (04/27/89)

In article <10135@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes:
> >He expects that many compilers would instead add the value of i to the
> >pointer a, and then reference the item stored there.
> 
> In C, pointer arithmetic ALWAYS involves scaling by the size of the
> pointed-to objects.  This is one of Dennis's really useful insights.
> It is so fundamental to C that I have to worry about an instructor
> who claims otherwise.

For the same operation, one way will be better on one machine, and a 
different way on another.  There are machines with index operations, where
the multiplication by the appropriate power of 2 is invisible hardware,
there are machines where increment and decrement for addresses is invisible
hardware, and machines where neither of these is the case.  I suspect that
the number of ways of doing this is comparable to the number of discussants
of this on comp.lang.c.

Now suppose I am doing some serious array operations, and I have to know
whether one array buffer is longer than another.  The elements are of type
long.  Do I have to do this multiplying and dividing by 4 all the time?
Another example of "user-friendly" which turns out to be "user-inimical."
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

ark@alice.UUCP (Andrew Koenig) (04/27/89)

In article <2459@nmtsun.nmt.edu>, kelly@nmtsun.nmt.edu (Sean Kelly) writes:

> My CS instructor and I disagree about a certain moot point.  I have a text
> book which says that

> 	*(a + i)	and	a[i]

> are equivalent, given an array a, and int index i ... each gives the
> value stored in a[i].  But he says that

> 	*(a + i)

> is non-standard and would not expect it to go far on all _real_ C compilers

You are right, *(a + i) is precisely equivalent to a[i].
Any compiler that gets that wrong is badly broken.

Moreover, many programs say

	a + i

instead of

	&a[i]

so there is a fair premium on getting at least that part of it right.
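
A small sketch of that interchangeability. The function name same_address is hypothetical, and i is assumed to be a valid index into the array:

```c
/* Returns 1 when a + i and &a[i] designate the same element. */
int same_address(int *a, int i)
{
    return (a + i) == &a[i];
}
```

On a conforming compiler this should return 1 for every valid index, since &a[i] is by definition &*(a + i).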

You might ask your instructor for an example of a compiler that
doesn't get *(a + i) right.
-- 
				--Andrew Koenig
				  ark@europa.att.com

kremer@cs.odu.edu (Lloyd Kremer) (04/27/89)

In article <2459@nmtsun.nmt.edu> kelly@nmtsun.nmt.edu (Sean Kelly) writes:

>My CS instructor and I disagree about a certain moot point.  I have a text
>book which says that
>
>	*(a + i)	and	a[i]
>
>are equivalent, given an array a, and int index i ... each gives the
>value stored in a[i].  But he says that
>
>	*(a + i)
>
>is non-standard and would not expect it to go far on all _real_ C compilers


The expressions *(a + i) and a[i] are absolutely synonymous in every way.
Either one could be defined as the other.  This fact is one of the foundational
pillars of the C Language.

Any C compiler that does not agree with this lacks knowledge of the most basic
fundamentals of the language and does not deserve to be called a C compiler.
I am tempted to make analogous remarks about C instructors.

An interesting corollary of this rule, often used in intentionally obfuscated
code, is:

	a[i]  ==  *(a + i)  ==  *(i + a)  ==  i[a]

Ask your instructor what he thinks 1["hello"] will evaluate to!
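
The chain of equalities above can be tried out with a sketch like the following. All function names are invented for illustration, and i is assumed to be within the bounds of the string:

```c
/* The same element fetched four ways; each body is one link
   in the chain a[i] == *(a + i) == *(i + a) == i[a]. */
char sub_normal(const char *s, int i)   { return s[i]; }
char sub_deref(const char *s, int i)    { return *(s + i); }
char sub_swapped(const char *s, int i)  { return *(i + s); }
char sub_reversed(const char *s, int i) { return i[s]; }

/* The string literal converts to a pointer to its first char,
   so this is *(1 + "hello"), the second character. */
char second_of_hello(void)
{
    return 1["hello"];
}
```

The reversed form is legal but belongs in obfuscated code only; the point of the sketch is that a conforming compiler treats all four spellings identically.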

-- 
					Lloyd Kremer
					Brooks Financial Systems
					...!uunet!xanth!brooks!lloyd
					Have terminal...will hack!

henry@utzoo.uucp (Henry Spencer) (04/27/89)

In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes:
>My CS instructor and I disagree about a certain moot point.  I have a text
>book which says that
>
>	*(a + i)	and	a[i]
>
>are equivalent, given an array a, and int index i ... each gives the
>value stored in a[i].  But he says that
>
>	*(a + i)
>
>is non-standard and would not expect it to go far on all _real_ C compilers
>(_real_ meaning those compilers that are somewhat devoted to K & R or ANSI).

Your instructor needs to read a book about C, and pay attention to it.  He's
obviously confusing C with assembler.  Your book, and you, are correct.
-- 
Mars in 1980s:  USSR, 2 tries, |     Henry Spencer at U of Toronto Zoology
2 failures; USA, 0 tries.      | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (04/28/89)

In article <1266@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin)
		[the infamous proponent of assembly language] writes:
-In article <10135@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
-> In C, pointer arithmetic ALWAYS involves scaling by the size of the
-> pointed-to objects.  This is one of Dennis's really useful insights.
-> It is so fundamental to C that I have to worry about an instructor
-> who claims otherwise.
-For the same operation, one way will be better on one machine, and a 
-different way on another.  There are machines with index operations, where
-the multiplication by the appropriate power of 2 is invisible hardware,
-there are machines where increment and decrement for addresses is invisible
-hardware, and machines where neither of these is the case.  I suspect that
-the number of ways of doing this is comparable to the number of discussants
-of this on comp.lang.c.
-Now suppose I am doing some serious array operations, and I have to know
-whether one array buffer is longer than another.  The elements are of type
-long.  Do I have to do this multiplying and dividing by 4 all the time?
-Another example of "user-friendly" which turns out to be "user-inimical."

I can make no sense whatsoever out of your comment.
	long a[ASIZE], b[BSIZE], *ap = &a[aindex], *bp = &b[bindex];
	if ( ASIZE > BSIZE )
		...
	if ( sizeof a > sizeof b )
		...
	if ( aindex > bindex )
		...
	if ( ap > bp )
		...
You don't have to do any "multiplying and dividing by 4 all the time".
Neither does the compiler.
There is virtually no sensible operation you can attempt with arrays
or pointers in C that requires you to deal with such scaling; it's
taken care of for you by the compiler.
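
A compilable sketch of those comparisons. The sizes 10 and 7 and the function names are assumptions chosen for illustration, and points_later is only meaningful for two pointers into the same array:

```c
#define ASIZE 10   /* assumed element counts, for illustration only */
#define BSIZE 7

/* Compare lengths by element count: no scaling involved. */
int a_has_more_elements(void)
{
    return ASIZE > BSIZE;
}

/* Compare lengths in bytes: sizeof already includes the scaling. */
int a_has_more_bytes(void)
{
    return sizeof(long[ASIZE]) > sizeof(long[BSIZE]);
}

/* Compare positions of two pointers into the SAME array. */
int points_later(const long *p, const long *q)
{
    return p > q;
}
```

Nowhere does the source multiply or divide by sizeof(long); whatever arithmetic the target needs is left to the compiler.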

jimb@hpmcaa.HP.COM (Jim Belesiu) (04/28/89)

I refer you to Kernighan and Ritchie's second edition of "The C Programming
Language", p99.  There you'll find it stated explicitly that

			a[i] is equivalent to *(a+i)

where *a and a[] reference the same data type.

Jim Belesiu

bill@twwells.uucp (T. William Wells) (04/28/89)

In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes:
: My CS instructor and I disagree about a certain moot point.  I have a text
: book which says that
:
:       *(a + i)        and     a[i]
:
: are equivalent, given an array a, and int index i

This is a fundamental identity in C. A failure to do this in a
compiler would be considered a *major*, as in withdraw the product,
BUG.

A failure to understand this marks one as not competent to program C.

: What do you think?

Get another CS instructor.

---
Bill                            { uunet | novavax } !twwells!bill

mat@mole-end.UUCP (Mark A Terribile) (04/28/89)

> ...  I have a text book which says that
> 	*(a + i)	and	a[i]
> are equivalent, given an array a, and int index i ... each gives the
> value stored in a[i].  But he says that
> 	*(a + i)
> is non-standard and would not expect it to go far on all _real_ C compilers
> ... He expects that many compilers would instead add the value of i to the
> pointer a, and then reference the item stored there.  I say that the
> compiler's smart enough to realize what we're trying to achieve, and
> won't do something like
> 
> 	* (char *) ( (int) a + i )
> 
> which he thinks it will probably do on most machines. ...

Oy vey!  Of course they are equivalent; that is how subscripting is *defined*
in C.  Further, any compiler that introduces the effect of spurious type
conversions of pointer expressions is broken.

And I can testify of my own knowledge that at least two C compiler families
with which some or most of us have experience transform the parse tree for the
subscripted expression into the parse tree for explicit indirection before
trying to generate code.  If THAT won't cause them to produce the same code
for both forms, there's very little that will.

K&R state quite literally that the two expressions are identical, and
further that

	a[ i ]

is the same as

	i[ i ]

I verified it on the PDP-11 compiler and on a Z-80 compiler derived from the
PDP-11 compiler; I also tried it on an early PCC.  I haven't tried it lately
on anything.  Why don't you try it on your favorite machine?  K&R say it should
work!
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile

guy@auspex.auspex.com (Guy Harris) (04/28/89)

 >My CS instructor and I disagree about a certain moot point.  I have a text
 >book which says that
 >
 >	*(a + i)	and	a[i]
 >
 >are equivalent, given an array a, and int index i ... each gives the
 >value stored in a[i].

Your textbook is correct.

 >But he says that
 >
 >	*(a + i)
 >
 >is non-standard and would not expect it to go far on all _real_ C compilers
 >(_real_ meaning those compilers that are somewhat devoted to K & R or ANSI).

Your instructor is incorrect.

 >He expects that many compilers would instead add the value of i to the
 >pointer a, and then reference the item stored there.

Yes, which gives the value stored in a[i].  "The pointer a" is really
"the pointer-valued expression generated by the conversion of the
array-valued expression 'a' into a pointer-valued expression that points
to the first element of the array 'a'"; if you add "i" to that
pointer-valued expression, you get a pointer to the "i"th element of the
array "a".  Dereference that pointer, and you get the "i"th element of
the array "a", or "a[i]".
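
The conversion described above can be checked with a short sketch. The function name decay_demo and the array contents are invented for illustration:

```c
/* Returns 1 if the array name converts to a pointer to element 0
   and adding an index to that pointer reaches the indexed element. */
int decay_demo(void)
{
    int a[4] = {10, 20, 30, 40};
    int *p = a;                  /* a converts to &a[0] here */

    return p == &a[0]            /* points at the first element */
        && *(p + 2) == a[2]      /* pointer + 2, dereferenced   */
        && *(a + 2) == 30;       /* same thing, spelled with a  */
}
```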

 >I say that the compiler's smart enough to realize what we're trying
 >to achieve, and won't do something like
 >
 >	* (char *) ( (int) a + i )
 >
 >which he thinks it will probably do on most machines.

You are correct; he is incorrect.  Perhaps he does not understand how
pointer addition works in C?  If you add an integral value N to a pointer,
it doesn't increment the address in that pointer by N storage units
(bytes on a byte-addressable machine, etc.); it can be thought of as
incrementing the address by N objects of the type to which that pointer
points.  In C, pointers have types, and those types are significant.

ftw@masscomp.UUCP (Farrell Woods) (04/28/89)

In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes:

>My CS instructor and I disagree about a certain moot point.  I have a text
>book which says that

>	*(a + i)	and	a[i]

>are equivalent, given an array a, and int index i ... each gives the
>value stored in a[i].  But he says that

>	*(a + i)

>is non-standard and would not expect it to go far on all _real_ C compilers

[deleted]

>What do you think?

I think your Sun and your Vax are better authorities on C than your instructor.
So are you, for that matter.  If the Sun and Vax compilers aren't "real" in
your instructor's terms, which compilers are?

What you describe is simple pointer math.

Find a K&R 1 and have your instructor start reading at the paragraph beginning
near the top of page 94.

-- 
Farrell T. Woods				Voice:  (508) 392-2471
Concurrent Computer Corporation			Domain: ftw@masscomp.com
1 Technology Way				uucp:   {backbones}!masscomp!ftw
Westford, MA 01886				OS/2:   Half an operating system

bill@twwells.uucp (T. William Wells) (04/29/89)

In article <9987@claris.com> krazy@claris.com (Jeff Erickson) writes:
: In fact, since a[i] = *(a+i), and (a+i)=(i+a), you can actually write i[a]
: for a[i] and most compilers will take it!  (Every one I've tried has, anyway.)

Try Microsoft. I don't know if it is true of the latest version, but
one of about two years ago wouldn't take it.

I made the mistake (don't ask why!) of putting this in some code
Proximity shipped; we got many complaints from people with broken
compilers. And not only Microsoft though I don't recall which others.
I certainly remember the embarrassment!

---
Bill                            { uunet | novavax } !twwells!bill

bill@twwells.uucp (T. William Wells) (04/29/89)

In article <1266@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
: Now suppose I am doing some serious array operations, and I have to know
: whether one array buffer is longer than another.  The elements are of type
: long.  Do I have to do this multiplying and dividing by 4 all the time?
: Another example of "user-friendly" which turns out to be "user-inimical."

Oh bullshit, Mr. Rubin. As usual, you are wanting PL/I++++, not C. And
a compiler that reads programmer's minds. If you really don't want to
do the division, maintain integer indexes, not pointers.

If you think that pointer manipulation is going to give you the
better code, use them. If you think that index manipulation is going
to give you the better code, use those. And if you can't figure which
is better, how, pray tell, do you figure the compiler will figure it
out? C at least gives you a fighting chance by giving you a choice.
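
The two styles can be put side by side in a sketch like this one, assuming C99 for the loop-scoped declarations. The function names are hypothetical, and which version generates better code depends on the compiler and the machine:

```c
#include <stddef.h>

/* Index style: the loop variable is a plain integer. */
long sum_by_index(const long *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Pointer style: the loop variable walks the elements directly. */
long sum_by_pointer(const long *v, size_t n)
{
    long total = 0;
    for (const long *p = v; p < v + n; p++)
        total += *p;
    return total;
}
```

Both functions compute the same sum, so the choice between them is purely a matter of which form suits the code and the target.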

---
Bill                            { uunet | novavax } !twwells!bill

mat@mole-end.UUCP (Mark A Terribile) (04/29/89)

> > 	*(a + i)	and	a[i]
> > are equivalent, given an array a, and int index i ...  [ ARE THEY?? ]
 
> Oy vey!  Of course ... that is how subscripting is *defined* in C. ...

> K&R state quite literally that the two expressions are identical, and [that]
> 	a[ i ]     is the same as    i[ i ]

Of course, that should read

  	a[ i ]     is the same as    i[ a ]

Pardon my gaffe!

Is this group c.beginners ?
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile

d87-hho@nada.kth.se (Henrik Holmstr|m) (05/01/89)

Do we need 28 follow-ups to a trivial question?  You know the rule, if you
see something obviously wrong or simple, don't just hit 'F'.  Wait a day
or two and see if someone else answered the question (or just mail the answer).

	Henrik Holmstr|m

sater@cs.vu.nl (Hans van Staveren) (05/01/89)

In article <1513@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>
> >My CS instructor and I disagree about a certain moot point.  I have a text
> >book which says that
> >
> >	*(a + i)	and	a[i]
>	/* LOTS OF STUFF DELETED */
>You are correct; he is incorrect.  Perhaps he does not understand how
>pointer addition works in C?  If you add an integral value N to a pointer,
>it doesn't increment the address in that pointer by N storage units
>(bytes on a byte-addressable machine, etc.); it can be thought of as
>incrementing the address by N objects of the type to which that pointer
>points.  In C, pointers have types, and those types are significant.

Just to show how old I am, let me tell you that in Unix V6 on the PDP 11,
the only machine it ran on, the expressions
	a + i
and
	i + a
with a a pointer and i an integer were not equivalent.
	a + i worked as it does nowadays
while	i + a worked as this guy's instructor fears.

I am even willing to admit I used this trick, but then in those days the way 
to get an unsigned was to declare it as a char* and casts were not invented
yet.

Language historians, take note!

	Hans van Staveren
	Vrije Universiteit
	Amsterdam, Holland

dg@lakart.UUCP (David Goodenough) (05/01/89)

From article <879@twwells.uucp>, by bill@twwells.uucp (T. William Wells):
> In article <9987@claris.com> krazy@claris.com (Jeff Erickson) writes:
> : In fact, since a[i] = *(a+i), and (a+i)=(i+a), you can actually write i[a]
> : for a[i] and most compilers will take it!  (Every one I've tried has, anyway.)
> 
> Try Microsoft. I don't know if it is true of the latest version, but
> one of about two years ago wouldn't take it.

GreenHills (the compiler supplied with our machine) gets all bent out of
shape about it too. Makes it a bear to compile obfuscated C programs, but
not much of a handicap otherwise :-)
It does get *(a + i) right though .....
-- 
	dg@lakart.UUCP - David Goodenough		+---+
						IHS	| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart.uucp@xait.xerox.com		  	  +---+

Tim_CDC_Roberts@cup.portal.com (05/01/89)

Ok, folks.  In regards to  "a[i] == *(a+i) == *(i+a) == i[a]", let me
refer to the oft-used example  2["hello"].

I agree that this works and is equivalent to "hello"[2].  I've seen it
in books and postings.  My simple question is why?  (Please don't submit
30 replies saying "because the book says so"...)  Doesn't that equivalence
imply that the pointer type is somehow "stronger" than the simple type?
Is that, in fact, the case?  Is a compiler forced to examine all of the
elements in a pointer expression and establish the "master type" of the
expression?  If I mix two pointer types, as in
  char * c;
  long * ell;
     return c + ell;
is this anarchy?  Is it a syntax error?  What is sizeof(*(c+ell))?

Inquiring minds want to know.

Tim_CDC_Roberts@cup.portal.com                | Control Data...
...!sun!portal!cup.portal.com!tim_cdc_roberts |   ...or it will control you.

kremer@cs.odu.edu (Lloyd Kremer) (05/02/89)

In article <17812@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:

>In regards to  "a[i] == *(a+i) == *(i+a) == i[a]", let me
>refer to the oft-used example  2["hello"].
>
>I agree that this works and is equivalent to "hello"[2].  I've seen it
>in books and postings.  My simple question is why?
>.........
>Is a compiler forced to examine all of the
>elements in a pointer expression and establish the "master type" of the
>expression?  If I mix two pointer types, as in
>  char * c;
>  long * ell;
>     return c + ell;
>is this anarchy?  Is it a syntax error?  What is sizeof(*(c+ell))?


Anarchy?  Yes, addition of two pointers has never been defined in C.
Syntax error?  I guess so.  Lint says, "operands of + have incompatible types."
Sizeof?  The expression is not defined, so its size certainly is not.

As to the conceptual implementation of a[i], the compiler sees a pointer a, and
an int i.  As has been shown, it does not matter which is in the brackets and
which is outside.  It does matter which is the pointer and which is the
integer, but since C is a type-oriented language, it does know this.

Many compilers immediately translate a[i] into *(a + i).  (Yet another
demonstration of their equivalence!)  a+i is an address which is evaluated as:
{machine address referenced by "a"} plus {"i" times sizeof(*a)}.
a[i] or *(a + i) is then the object of type *a located at that address.

Although 2["hello"] is cryptic, a compiler *should* get it right according to
the language definition (old or new).  If I observed a certain compiler to fail
on it, my confidence in that compiler to perform properly in other areas would
decrease by several orders of magnitude.

Another of these "confidence-diminishing" tests is 'sizeof("string")'.
The correct answer is 7.  Compilers that say 'sizeof(char *)' are broken.
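
That test can be written down as a short sketch. The function names are made up for illustration:

```c
#include <stddef.h>

/* sizeof applied to the literal itself: the operand keeps its
   array type, char[7] (six letters plus the terminating NUL). */
size_t literal_size(void)
{
    return sizeof("string");
}

/* sizeof applied to a pointer initialized from the literal:
   this is the size of a char pointer, whatever that is locally. */
size_t pointer_size(void)
{
    const char *p = "string";
    return sizeof p;
}
```

The first function should return 7 everywhere; the second returns sizeof(char *), which varies by platform.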

-- 
					Lloyd Kremer
					Brooks Financial Systems
					...!uunet!xanth!brooks!lloyd
					Have terminal...will hack!

cs132046@brunix (Garrett Fitzgerald) (05/03/89)

Umm... I'm kind of getting lost here. Is 2["hello"] == 'e'? And why
does it allow pointer/integer addition in this order?
--------------------------------
Campus Crusade for Cthulhu--when you're tired of the lesser of two evils.
Sarek of Vulcan, a.k.a. Garrett Fitzgerald
cs132046@brunix or st902620@brownvm.bitnet

chris@mimsy.UUCP (Chris Torek) (05/03/89)

In article <17812@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:
>... let me refer to the oft-used example  2["hello"].
>I agree that this works and is equivalent to "hello"[2].  I've seen it
>in books and postings.  My simple question is why?

The type of

	"hello"

is

	array 6 of char

(or

	char [6]

if you prefer) which, in all contexts except declarations and targets
of sizeof(), changes to

	pointer to char

with the value being the address of the first (zero'th) element of
the array.  So the types of the two expressions

	"hello"[2]

and

	2["hello"]

are

	(char *) [ (int) ]


and

	(int) [ (char *) ]

The [] syntax means `add the value of the object to the left to
the value of the object to the right, then dereference':

	* ( (char *) + (int) )

and

	* ( (int) + (char *) )

respectively.  Addition is defined on two cases: addition of scalar
types with other scalar types (such as int+int, or double+int, or
char+long) and addition involving pointers.  Both additions involve
pointers, so both follow these rules, which are:

	The result of <pointer to T> plus <integral expression whose
	value is N> is the address of the N'th object of type T `away
	from' the place where the pointer points, in the `increasing'
	direction if N is positive, and the `decreasing' direction if
	N is negative.

	The result of <integral expression whose value is N> plus
	<pointer to T> is the same as that of <pointer to T> plus
	<integral expression whose value is N>.

	No other additions involving pointers are legal.
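
These rules can be exercised with a sketch like the following. The function names are hypothetical, and the result must stay inside the array (or one past its end) for the arithmetic to be defined:

```c
/* <pointer to T> plus <integer N>: N elements away from p,
   forward if n is positive, backward if n is negative. */
const int *step(const int *p, int n)
{
    return p + n;
}

/* <integer N> plus <pointer to T>: defined to give the same result. */
const int *step_swapped(const int *p, int n)
{
    return n + p;
}
```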

>Doesn't that equivalence imply that the pointer type is somehow
>"stronger" than the simple type?

You might think of it as such; without a definition of `strength' there
is no way to say.

>Is a compiler forced to examine all of the elements in a pointer
>expression and establish the "master type" of the expression?

The compiler must look at both types in any dyadic operation (addition,
subtraction, multiplication, division, -> selection, . selection, etc.).
The result of the lookup can be found in a table in the language
definition.

>If I mix two pointer types ... is this anarchy?  Is it a syntax error?

If the operation is addition, it is a semantic error: there is no
definition for the result of addition of two pointers.  (The
subtraction operator allows two operands which are both pointers, but
they must have the same type.)
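
A sketch of the legal and illegal cases. The function name element_distance is invented, and both pointers are assumed to point into the same array:

```c
#include <stddef.h>

/* pointer - pointer (same type): the number of elements between
   the two, with type ptrdiff_t. */
ptrdiff_t element_distance(const long *from, const long *to)
{
    return to - from;
}

/* By contrast, a conforming compiler should reject the following,
   because no addition of two pointers is defined:

       const long *bad(const long *a, const long *b) { return a + b; }
*/
```

The result of the subtraction is already expressed in elements, not bytes, so no division by sizeof(long) appears in the source.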

>(Please don't submit 30 replies saying "because the book says so"...)

s/book/language definition/, and you have the answer above (but without
all the verbiage).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

sho@pur-phy (Sho Kuwamoto) (05/03/89)

In article <17812@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:
>If I mix two pointer types, as in
>  char * c;
>  long * ell;
>     return c + ell;
>is this anarchy?  Is it a syntax error?  What is sizeof(*(c+ell))?

This should be a syntax error.  Even
  long *a, *b;
  return(a+b)
is illegal.  However,
  long *a, *b;
  return(a-b); 
Is legit.  If a and b are pointers to different types, it is probably
still a syntax error.  On the other hand, I could be wrong.  I got
mildly crisped last time I fielded a question...

-Sho

jeffrey@algor2.UUCP (Jeffrey Kegler) (05/04/89)

In article <941@draken.nada.kth.se> d87-hho@nada.kth.se (Henrik Holmstr|m) writes:
>Do we need 28 follow-ups to a trivial question?  You know the rule, if you
>see something obviously wrong or simple, don't just hit 'F'.  Wait a day
>or two and see if someone else answered the question (or just mail the answer).
>
>	Henrik Holmstr|m

While I sympathize with Mr Holmstr|m, I hope his advice is not followed.  For a
start, if it were universally followed, you would get 28 answers 2 days late.

More important, even when the question is "silly", and I already know the
answer, the answers from 28 other people usually add to my knowledge.

The only real alternative would be a moderated group where the moderator had
a panel of C experts he called upon.  With the number of C experts posting
here the moderator would really have taken on a full time job.
-- 

Jeffrey Kegler, President, Algorists,
jeffrey@algor2.UU.NET or uunet!algor2!jeffrey
1762 Wainwright DR, Reston VA 22090

pc@cs.keele.ac.uk (Phil Cornes) (05/17/89)

From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com:
> Ok, folks.  In regards to  "a[i] == *(a+i) == *(i+a) == i[a]", let me
> refer to the oft-used example  2["hello"].
> I agree that this works and is equivalent to "hello"[2].  I've seen it
> in books and postings.  My simple question is why? 

C does not really support arrays, and the square bracket operator ([]) is
just syntactic sugar to make you think that it does! This works quite well
until you see things like "hello"[2] == 2["hello"] which only look odd if
you continue to think of them as arrays and not pointers.


> If I mix two pointer types, as in
>   char * c;
>   long * ell;
>      return c + ell;
> is this anarchy?  Is it a syntax error?  What is sizeof(*(c+ell))?
> 

In this case the question doesn't make sense because the addition operator's
function is undefined for two pointer type operands....

mesmo@Portia.Stanford.EDU (Chris Johnson) (05/18/89)

From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com:
> Ok, folks.  In regards to  "a[i] == *(a+i) == *(i+a) == i[a]", let me
> refer to the oft-used example  2["hello"].
> I agree that this works and is equivalent to "hello"[2].  I've seen it
> in books and postings.  My simple question is why? 

	The supposed proof of a[i] == i[a] rests on the faulty
	assumption that (x+y) == (y+x) in all contexts; this is
	not correct.

	When "+" denotes simple (ie int/float/etc) arithmetic, the
	operation commutes; when it denotes pointer arithmetic,
	commutation is not legal/meaningful.

	The statement that *(a+i) == *(i+a) is therefore invalid.
-- 
==============================================================================
 Chris M Johnson === mesmo@portia.stanford.edu === "Grad school sucks rocks"
            "Imitation is the sincerest form of plagiarism" -- ALF
==============================================================================

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/18/89)

In article <607@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes:
>C does not really support arrays, and the square bracket operator ([]) is
>just syntactic sugar to make you think that it does!

Just in case this misleads anyone, it should be noted that C really does
support arrays as distinct from pointers; however, pointers are fundamental
to C while arrays are second-class objects with "crippled" semantics.
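
One way to see the distinction is through sizeof and the address-of operator, as in this sketch. The function names and the array length 8 are assumptions for illustration:

```c
#include <stddef.h>

/* sizeof sees the array as an array: all 8 elements are counted. */
size_t array_bytes(void)
{
    long a[8];
    return sizeof a;
}

/* &a has type "pointer to array of 8 long", so adding 1 to it
   steps over the whole array, not over a single element. */
size_t stride_of_array_pointer(void)
{
    long a[8];
    return (size_t)((char *)(&a + 1) - (char *)&a);
}
```

The "crippled" part shows up elsewhere: an array cannot be assigned to or passed by value, and in most expressions its name converts to a pointer to its first element.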

rob@kaa.eng.ohio-state.edu (Rob Carriere) (05/18/89)

In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson)
writes:
>	The supposed proof of a[i] == i[a] rests on the faulty
>	assumption that (x+y) == (y+x) in all contexts; this is
>	not correct.

No it doesn't.  It relies on a direct statement in K&R I (pg 210, 1st par)

"Therefore, despite its asymmetric appearance, subscripting is a
commutative operation."

>	When "+" denotes simple (ie int/float/etc) arithmetic, the
>	operation commutes; when it denotes pointer arithmetic,
>	commutation is not legal/meaningful.

There is no such statement in K&R; in fact, the paragraph from which the
above quote came implies the opposite and so does the section on the
"+" operator (A7.4, pg188--189)

SR

tim@crackle.amd.com (Tim Olson) (05/18/89)

In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
| 	The supposed proof of a[i] == i[a] rests on the faulty
| 	assumption that (x+y) == (y+x) in all contexts; this is
| 	not correct.
| 
| 	When "+" denotes simple (ie int/float/etc) arithmetic, the
| 	operation commutes; when it denotes pointer arithmetic,
| 	commutation is not legal/meaningful.
| 
| 	The statement that *(a+i) == *(i+a) is therefore invalid.

Why do you think that commutation is not legal for pointer arithmetic?
It certainly is still associative:

	(pointer + 3) +5 	<==>	pointer + (3 + 5)

K&R simply say that the "+" operator (as well as "*", "&", "|", and "^")
is commutative and associative, without mentioning any restrictions.

The (d)PANS says, in the constraint section for additive operators that
"For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to an object type and the other shall have
integral type."

It doesn't say that "... or the first operand shall be a pointer...",
which certainly seems to mean that pointer addition is commutative.
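
Both properties can be checked with a sketch like this. The function names are hypothetical, and p is assumed to point at an element with at least eight more elements (or the one-past-the-end position) ahead of it, so that every intermediate pointer is valid:

```c
/* (pointer + 3) + 5 versus pointer + (3 + 5). */
int assoc_holds(const int *p)
{
    return (p + 3) + 5 == p + (3 + 5);
}

/* pointer + integer versus integer + pointer. */
int commute_holds(const int *p, int i)
{
    return p + i == i + p;
}
```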


	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/18/89)

In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
-	When "+" denotes simple (ie int/float/etc) arithmetic, the
-	operation commutes; when it denotes pointer arithmetic,
-	commutation is not legal/meaningful.
-	The statement that *(a+i) == *(i+a) is therefore invalid.

100% wrong!  If you don't know C any better than that, you should avoid
causing confusion and refrain from posting such misinformation.

chris@mimsy.UUCP (Chris Torek) (05/18/89)

In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU
(Chris Johnson) writes:
>The supposed proof of a[i] == i[a] rests on the faulty
>assumption that (x+y) == (y+x) in all contexts; this is
>not correct.  When "+" denotes simple (ie int/float/etc)
>arithmetic, the operation commutes; when it denotes pointer
>arithmetic, commutation is not legal/meaningful.

The latter assertion is exactly backwards.  Pointer arithmetic (in
the forms pointer+integer and integer+pointer) is guaranteed to be
commutative, while scalar addition is not: scalar addition of certain
values is not commutative on certain peculiar architectures---things
like negative zero or peculiar floating point values, for instance.

>The statement that *(a+i) == *(i+a) is therefore invalid.

Since pointer arithmetic is commutative, the above statement is wrong,
and *(a+i) is equivalent to *(i+a), so that a[i] and i[a] denote the
same object.  See K&R (either edition) chapter 5.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

usenet@TSfR.UUCP (usenet) (05/18/89)

In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
>	When "+" denotes simple (ie int/float/etc) arithmetic, the
>	operation commutes; when it denotes pointer arithmetic,
>	commutation is not legal/meaningful.

  Umm, I don't follow.  K&R says that `+' is a commutative operator (K&R 1,
  page 185) and I can't find any comment to the contrary in the section
  that discusses `+' in detail.


   -david parsons
   -orc@pell.uucp

ark@alice.UUCP (Andrew Koenig) (05/18/89)

In article <2336@Portia.Stanford.EDU>, mesmo@Portia.Stanford.EDU (Chris Johnson) writes:

> 	When "+" denotes simple (ie int/float/etc) arithmetic, the
> 	operation commutes; when it denotes pointer arithmetic,
> 	commutation is not legal/meaningful.

Yes it is.  Addition of integers and pointers is commutative.

> 	The statement that *(a+i) == *(i+a) is therefore invalid.

No, the statement is true.
-- 
				--Andrew Koenig
				  ark@europa.att.com

jejones@mcrware.UUCP (James Jones) (05/18/89)

A message asserts that surely

	(p + 3) + 5 == p + (3 + 5)

where p is a pointer, and so it is, but...in general, it might not be. 
We turn once again to the canonical counterexample, segmented
architectures, where it's not clear that

	(p - 5) + 6 == p + (-5 + 6)

since p - 5 might fall off the end of the segment, and after that, all
bets are likely to be off.

That said, I hasten to add that I agree that p + i == i + p; any bogosity
arising will arise no matter what order is used.

	James Jones

guy@auspex.auspex.com (Guy Harris) (05/18/89)

>C does not really support arrays, and the square bracket operator ([]) is
>just syntactic sugar to make you think that it does! This works quite well
>until you see things like "hello"[2] == 2["hello"] which only look odd if
>you continue to think of them as arrays and not pointers.

If one says "C does not really support arrays", one should be careful to
indicate what one means; arrays really *are* arrays, not pointers.  E.g.,
on most C implementations,

	int	a[33];

causes a block of "33*sizeof (int)" bytes to be allocated; if it has
static storage duration (i.e., either external or static), a symbol "a"
or "_a" or whatever will probably be defined, and will refer to the
first location in that block - *not* to a block of size "sizeof (int *)"
that contains a pointer to the block of size "33*sizeof (int)".

However, array-valued expressions are, in most (but *not* all!)
contexts, converted to pointer-valued expressions; this is why "[]" is
sort of syntactic sugar.  "a[i]" gets turned into "*(a + i)"; the reason
this would work for the array "a" defined above is that in the context
of the expression "*(a + i)", the array-valued expression "a" gets
converted to a pointer-valued expression that points to the first
element of "a"; adding "i" to the value of that expression yields a
pointer that points to the "i"th element of "a", and dereferencing that
pointer yields the value of the "i"th element of "a".

The distinction between "arrays are pointers" and "array-valued
expressions get converted to pointer-valued expressions" is important;
the question "why doesn't this program work:

	foo.c:

	...

	int a[33];

	...

	bar.c:

	...

	extern int *a;

	...

" surfaces periodically in this group.  If arrays and pointers really
were the same thing, that program might well work; the reason it doesn't
work is that arrays and pointers *aren't* the same thing.  Similarly,
for the array "a" described above, "sizeof a" is "33*sizeof (int)",
not "sizeof (int *)" (although some compiler writers may have been
confused as well, and made their compilers give "sizeof (int *)" for
"sizeof a").

guy@auspex.auspex.com (Guy Harris) (05/18/89)

>	When "+" denotes simple (ie int/float/etc) arithmetic, the
>	operation commutes; when it denotes pointer arithmetic,
>	commutation is not legal/meaningful.

Funny, X3J11 disagrees with you:

	3.3.6 Additive operators

	...

	Semantics

	...

	...In other words, if the expression "P" points to the "i"th
	element of an array object, the expressions "(P)+N"
	(equivalently, "N+(P)")...

>	The statement that *(a+i) == *(i+a) is therefore invalid.

The statement that "The statement that *(a+i) == *(i+a) is therefore
invalid" is therefore invalid.

It may make life miserable for compiler writers, but if so they should
have lobbied X3J11; it's probably too late now - go forth and fix your
compiler, if it can't cope with "i[a]". 

bph@buengc.BU.EDU (Blair P. Houghton) (05/19/89)

In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
>
>	The supposed proof of a[i] == i[a] rests on the faulty
>	assumption that (x+y) == (y+x) in all contexts; this is
>	not correct.

Oh yeah?

>	When "+" denotes simple (ie int/float/etc) arithmetic, the
>	operation commutes; when it denotes pointer arithmetic,
>	commutation is not legal/meaningful.
>
>	The statement that *(a+i) == *(i+a) is therefore invalid.

it implies that you were doing

	sometype *a, *i;

	something = a[i];
	something_else = i[a];

So, like, tell me.  When do you use pointers as indices?  I.e., if one of
the two variables, a or i, is an int, and the other is a pointer, then
you have leave to say that pointer[int] == int[pointer] because
*(pointer + int) == *(int + pointer) and because
*((pointer or int) + (the other)) == (pointer or int)[the other].

				--Blair
				  "I still think I should be able
				   to add pointers together."

pjh@mccc.UUCP (Pete Holsberg) (05/19/89)

In article <607@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes:
=From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com:
=> Ok, folks.  In regards to  "a[i] == *(a+i) == *(i+a) == i[a]", let me
=> refer to the oft-used example  2["hello"].
=> I agree that this works and is equivalent to "hello"[2].  I've seen it
=> in books and postings.  My simple question is why? 
=
=C does not really support arrays, and the square bracket operator ([]) is
=just syntactic sugar to make you think that it does! This works quite well
=until you see things like "hello"[2] == 2["hello"] which only look odd if
=you continue to think of them as arrays and not pointers.

Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
Thanks.
-- 
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690  

mat@mole-end.UUCP (Mark A Terribile) (05/19/89)

> > Ok, folks.  In regards to  "a[i] == *(a+i) == *(i+a) == i[a]", let me
> > refer to the oft-used example  2["hello"].

> > I agree that this works and is equivalent to "hello"[2].  I've seen it
> > in books and postings.  My simple question is why? 

For the reason that everyone has said:  x [ y ]  IS DEFINED AS  *( x + y )

and if one expression is legal, so is the other one.
 
> 	The supposed proof of a[i] == i[a] rests on the faulty
> 	assumption that (x+y) == (y+x) in all contexts; this is
> 	not correct.

Are you saying that  2[ "hello" ]  is not the same as  "hello"[ 2 ] ?  If
so, you are wrong.  The ordering of the operands does not matter.  C has
been this way from about the beginning and unless there is a specific item
in the pANSI spec (I find none in K&R-II) it is allowed.  (But see K&R-II,
section A8.6.2: ``Therefore, despite its asymmetric appearance, subscripting
is a commutative operation.'')
 
> 	When "+" denotes simple (ie int/float/etc) arithmetic, the
> 	operation commutes; when it denotes pointer arithmetic,
> 	commutation is not legal/meaningful.

It is.  Using K&R-II again, over and over in discussing the addition of an
integer to a pointer, they say ``one operand ... and the other operand ...''
*Never* are the first and second operands distinguished.  There is a reason for
this care.

> 	The statement that *(a+i) == *(i+a) is therefore invalid.

Not only

		*(a+i) == *(i+a)

for  a  of any pointer type (excluding pointer-to-function) and  i  of an
integral type, but

		*( a + i ) === * ( i + a )
		( a + i ) ==  ( i + a )
		*( a + i ) == * ( i + a )

etc.

If your compiler rejects  "hello"[ 2 ]  it is broken.  (Have you tried it, by
the way?)
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile

byron@pyr.gatech.EDU (Byron A Jeff) (05/19/89)

In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
-From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com:
-> Ok, folks.  In regards to  "a[i] == *(a+i) == *(i+a) == i[a]", let me
-> refer to the oft-used example  2["hello"].
-> I agree that this works and is equivalent to "hello"[2].  I've seen it
-> in books and postings.  My simple question is why? 
-
-	The supposed proof of a[i] == i[a] rests on the faulty
-	assumption that (x+y) == (y+x) in all contexts; this is
-	not correct.
-
-	When "+" denotes simple (ie int/float/etc) arithmetic, the
-	operation commutes; when it denotes pointer arithmetic,
-	commutation is not legal/meaningful.
-
-	The statement that *(a+i) == *(i+a) is therefore invalid.

Try this program on for size:

#include <stdio.h>

int main(void)
{
   char *p = "Goofy";

   printf("%c %c %d %d\n",*(p+2),*(2+p), *(p+2) == *(2+p), 2+p == p+2);

   return 0;
}

and its output:

o o 1 1

Any other assertions you'd like to make?

--- 
-==============================================================================
- Chris M Johnson === mesmo@portia.stanford.edu === "Grad school sucks rocks"
-            "Imitation is the sincerest form of plagiarism" -- ALF
-==============================================================================


-- 
Another random extraction from the mental bit stream of...
Byron A. Jeff
Georgia Tech, Atlanta GA 30332
Internet:	byron@pyr.gatech.edu  uucp:	...!gatech!pyr!byron

henry@utzoo.uucp (Henry Spencer) (05/19/89)

In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?

Incompetent compiler writers.
-- 
Subversion, n:  a superset     |     Henry Spencer at U of Toronto Zoology
of a subset.    --J.J. Horning | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

Tim_CDC_Roberts@cup.portal.com (05/20/89)

In <1176@mcrware.UUCP>, jejones@mcrware.UUCP (James Jones) writes:

>A message asserts that surely
>
>  (p + 3) + 5 == p + (3 + 5)
>
>where p is a pointer, and so it is, but...in general, it might not be. 
>We turn once again to the canonical counterexample, segmented
>architectures, where it's not clear that
>
>  (p - 5) + 6 == p + (-5 + 6)
>
>since p - 5 might fall off the end of the segment, and after that, all
>bets are likely to be off.

I disagree with this!  I assert that EVEN if the intermediate result
goes negative, the final value will be correct, even on segmented 
architectures.

It is true that it might be impossible or even dangerous to dereference 
the address (p - 5), but we aren't trying to DO that.

Example:  32 bit system.  Top 12 bits are a segment number, bottom 20 bits 
are an address.  Let's say p is at offset 2 in segment 0x012.

        p        = 0x01200002
        p-5      = 0x011ffffd
        (p-5)+6  = 0x01200003

Yes, the intermediate value is not a valid address, but I don't think that's 
important.

Tim_CDC_Roberts@cup.portal.com                | Control Data...
...!sun!portal!cup.portal.com!tim_cdc_roberts |   ...or it will control you.

 

guy@auspex.auspex.com (Guy Harris) (05/20/89)

>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?

If the compiler doesn't produce equally good code for those two cases,
it's because the compiler writer wasn't doing as good a job as s/he
could.  If it produces different but equally good code, I dunno;
possibly because the compiler writer didn't understand that "a[i]" is
equivalent to "*(a+i)", or decided for whatever reason to implement them
differently.

The existence of compilers that produce different code for those cases
does not, in any way, prove that the two expressions are inequivalent;
K&R First Edition points out that

	...The expression E1[E2] is identical (by definition) to
	*((E1)+(E2)).

and the December 7, 1988 ANSI C draft says that

	...The definition of the subscript operator [] is that E1[E2] is
	identical to (*(E1+(E2))).

so further discussion on whether they're equivalent in C is pointless -
they are, and that's that.  If somebody wants to debate whether they
*should* be equivalent, they can, but they're then talking about D or P,
say, not C. 

diamond@diamond.csl.sony.junet (Norman Diamond) (05/20/89)

In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:

>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
>Thanks.

Hear, hear.  Yes, there are a lot of broken implementations, and there
are a lot more implementations which are not broken but just weird.
Yes, this is one of the ways in which many implementations are weird,
and I also wonder why.  Anyone have any ideas?

--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.co.jp@relay.cs.net)
  The above opinions are my own.   |  Why are programmers criticized for
  If they're also your opinions,   |  re-implementing the wheel, when car
  you're infringing my copyright.  |  manufacturers are praised for it?

chris@mimsy.UUCP (Chris Torek) (05/20/89)

In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
[in response to article <607@kl-cs.UUCP> by pc@cs.keele.ac.uk (Phil Cornes)]
>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?

I have never observed one to do so.  There is no reason for a compiler
to generate different code, as the expressions are semantically identical.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

karl@haddock.ima.isc.com (Karl Heuer) (05/21/89)

In article <1657@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>[K&R and the pANS agree that a[i]==i[a],] so further discussion on whether
>they're equivalent in C is pointless - they are, and that's that.  If
>somebody wants to debate whether they *should* be equivalent, they can, but
>they're then talking about D or P, say, not C.

Or maybe ANSI C in a nearby parallel universe.  I thought it strange that
X3J11 outlawed "x+ =1" ("+=" is now a single token), but permitted "i[a]" (on
the grounds that they "saw no reason to forbid it").  Neither construct is
ever used outside the IOCCC, and outlawing "i[a]" would have been a small step
towards making arrays higher-class citizens than they are.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
(To anyone who's itching to say that i[hairy_array_expression] avoids a pair
of parentheses: if you code that way, I spit on your grandmother's shadow.)

henry@utzoo.uucp (Henry Spencer) (05/21/89)

In article <18560@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:
>I disagree with this!  I assert that EVEN if the intermediate result
>goes negative, the final value will be correct, even on segmented 
>architectures.

You are assuming that there will *be* a final value.  You may get a trap
the instant the intermediate result goes invalid, if pointer arithmetic
is being done by special pointer-arithmetic instructions.

Actually, even if you don't get a trap, pointer-arithmetic instructions
may do almost anything when presented with an invalid operand.  They don't
have to act like integer instructions.
-- 
Subversion, n:  a superset     |     Henry Spencer at U of Toronto Zoology
of a subset.    --J.J. Horning | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/21/89)

In article <13234@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>I thought it strange that X3J11 outlawed "x+ =1" ("+=" is now a single
>token), but permitted "i[a]" (on the grounds that they "saw no reason
>to forbid it").

It was never clear why the old C language reference manual said "the
two parts of a compound assignment operator are separate tokens" when
the formal grammar showed them as indivisible units.  That may have
simply been a description of the (somewhat sloppy) way the
implementation of the PCC lexer happened to work.  Another possibility
is that it was desired to guarantee that such an operator could be
constructed via preprocessing.  X3J11 allows the latter anyway.  It
seems far more likely that "x+ =1" is a typo than that it is intended.

"i[a]" on the other hand has actually been intentionally used by some
programmers, although most of us certainly don't recommend it.

>... outlawing "i[a]" would have been a small step
>towards making arrays higher-class citizens than they are.

I don't think you can ever make the existing C arrays first-class
objects without invalidating large amounts of existing correct code.
There are efforts underway to find a suitable language extension
that solves this problem (for the new class of objects provided by
the extension).

pjh@mccc.UUCP (Pete Holsberg) (05/21/89)

In article <1989May19.154248.426@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
=In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
=>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
=
=Incompetent compiler writers.

ALL of them?  Really?  Can you name a C compiler that was written by a
competent compiler writer?

(Your new .sig is D U L L!! ;-)


-- 
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690  
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh

pjh@mccc.UUCP (Pete Holsberg) (05/21/89)

In article <17635@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
=In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
=[in response to article <607@kl-cs.UUCP> by pc@cs.keele.ac.uk (Phil Cornes)]
=>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
=
=I have never observed one to do so.  There is no reason for a compiler
=to generate different code, as the expressions are semantically identical.

Perhaps I've asked the wrong question.  I saw a couple of simple test
programs that assigned 0 to each member of an array.  One used array
subscript notation, and the other, pointer notation.  I compiled these
on a 7300, a 3B2/400, and a 386 running Microport V/386, using a variety
of compilers (cc and gnu-cc on the 7300, fpcc on the 3B2, and cc and
Greenhills on the 386).  I ran each version and timed the execution. 
The subscript versions had different run times from the pointer versions
(some slower, some faster!).  I assumed - perhaps naively - that the
differences were caused by differences in code produced by the different
compilers (and of course the hardware differences).  Was that wrong? 
How does one account for the differences?




-- 
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690  
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh

henry@utzoo.uucp (Henry Spencer) (05/22/89)

In article <755@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>=>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
>=
>=Incompetent compiler writers.
>
>ALL of them?  Really?  Can you name a C compiler that was written by a
>competent compiler writer?

Sometimes I think the only one was Dennis Ritchie's original pdp11 compiler,
with the original PCC perhaps a borderline case.  And lest there be any
doubts about the matter, both of them convert "a[i]" to "*(a+i)" as they
parse, so the code for the two expressions is necessarily identical.  (I
went and looked at the compiler sources to be sure.)

The two expressions are semantically identical by the definition of C.
Any compiler which generates different code for them either is broken or
has outsmarted itself in trying to be clever.

>(Your new .sig is D U L L!! ;-)

Just wait for the next one. |->             <--- evil smile
-- 
Van Allen, adj: pertaining to  |     Henry Spencer at U of Toronto Zoology
deadly hazards to spaceflight. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

jamesa@arabian.Sun.COM (James D. Allen) (05/22/89)

In article <10299@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>In article <13234@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>... outlawing "i[a]" would have been a small step
>>towards making arrays higher-class citizens than they are.
>
>I don't think you can ever make the existing C arrays first-class
>objects without invalidating large amounts of existing correct code.
>There are efforts underway to find a suitable language extension
>that solves this problem (for the new class of objects provided by
>the extension).

One source of trouble is "hidden" array typedefs, such as `jmp_buf'.
(You have to "know" what a jmp_buf is to use it nontrivially, while
if it were "first-class" you wouldn't.)  But a logical array can be
promoted to a first-class citizen by just putting it in a structure:

		typedef struct {
			jmp_buf j;
		} first_class_jmpbuf;

Any idea why this wasn't done for jmp_buf's?

I think the "second-classedness" of arrays helps give C its elegant
syntax.  Any other examples of the "problems" it causes?

chris@mimsy.UUCP (Chris Torek) (05/22/89)

>>In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) asked
>>>why compilers produce different code for "a[i]" and "*(a+i)"?

>In article <17635@mimsy.UUCP> I noted that
>>I have never observed one to do so.  There is no reason for a compiler
>>to generate different code, as the expressions are semantically identical.

In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>Perhaps I've asked the wrong question.

Maybe not.

>I saw a couple of simple test programs that assigned 0 to each member
>of an array.  One used array subscript notation, and the other, pointer
>notation.  I compiled these on a 7300, a 3B2/400, and a 386 running
>Microport V/386, using a variety of compilers (cc and gnu-cc on the
>7300, fpcc on the 3B2, and cc and Greenhills on the 386).

I have none of these machines, and only gcc as a compiler.  The code
produced by

	GNU C version 1.35 (vax) compiled by GNU C version 1.35.

for both loops in

	int a[20];
	main(){int i;
	for(i=0;i<20;i++)a[i]=0;
	f();
	for(i=0;i<20;i++)*(a+i)=0;
	f();
	}

was identical.  (The lack of spacing in this example is due to me
typing it in with the `cat' editor :-) )

>I ran each version and timed the execution. The subscript versions
>had different run times from the pointer versions (some slower, some
>faster!).  I assumed - perhaps naively - that the differences were
>caused by differences in code produced by the different compilers
>(and of course the hardware differences).  Was that wrong? 
>How does one account for the differences?

Differing code sequences is one of two obvious possibilities, the other
being differing multi-user loads.  The latter seems less likely, especially
if the results are repeatable.  Why not compile to assembly and compare?

If a compiler produces better code for a[i] than for *(a+i) (or vice
versa), that compiler needs work.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

andre@targon.UUCP (andre) (05/22/89)

In article <18560@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:
>In <1176@mcrware.UUCP>, jejones@mcrware.UUCP (James Jones) writes:
>[pointer story]
>I disagree with this!  I assert that EVEN if the intermediate result
>goes negative, the final value will be correct, even on segmented 
>architectures.
Don't underestimate the intel approach to computing :-)
I have it on good authority that on the 386,
((address) 0x0010 - 0x0100) + 0x0100 != 0x0010
but instead it winds up somewhere at the top of memory :-(.

>Yes, the intermediate value is not a valid address, but I don't think that's 
>important.
If the intermediate result were put in an address register (on the '386)
(where else does an address, even a bogus one, belong?)
you will get a trap from the processor's 'MMU'.


-- 
~----~ |m    AAA         DDDD  It's not the kill, but the thrill of the chase.
~|d1|~@--   AA AAvv   vvDD  DD        Segment registers are for worms.
~----~  &  AAAAAAAvv vvDD  DD
~~~~~~ -- AAA   AAAvvvDDDDDD        Andre van Dalen, uunet!mcvax!targon!andre

tim@crackle.amd.com (Tim Olson) (05/22/89)

In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
| In article <17635@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
| =In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
| =[in response to article <607@kl-cs.UUCP> by pc@cs.keele.ac.uk (Phil Cornes)]
| =>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
| =
| =I have never observed one to do so.  There is no reason for a compiler
| =to generate different code, as the expressions are semantically identical.
| 
| Perhaps I've asked the wrong question.  I saw a couple of simple test
| programs that assigned 0 to each member of an array.  One used array
| subscript notation, and the other, pointer notation.  I compiled these
| on a 7300, a 3B2/400, and a 386 running Microport V/386, using a variety
| of compilers (cc and gnu-cc on the 7300, fpcc on the 3B2, and cc and
| Greenhills on the 386).  I ran each version and timed the execution. 
| The subscript versions had different run times from the pointer versions
| (some slower, some faster!).  I assumed - perhaps naively - that the
| differences were caused by differences in code produced by the different
| compilers (and of course the hardware differences).  Was that wrong? 
| How does one account for the differences?

If you wrote the routines like:

	
	int a[MAX];			int a[MAX];
	int i;				int i;
	for (i=0; i<MAX; ++i)		for (i=0; i<MAX; ++i)
		a[i] = 0;			*(a+i) = 0;

Then the code generated should probably be identical (and it was, on the
three machines I tried it on).  However, if instead you wrote them like:

	int a[MAX];			int a[MAX];
	int i;				int *p;
	for (i=0; i<MAX; ++i)		for (p=&a[0]; p<&a[MAX]; ++p)
		a[i] = 0;			*p=0;

Then you indeed might get different assembly language generated.  The
second pointer version has had a "loop induction" optimization performed
by hand. On some compiler/machine combinations, this will run faster,
because the scaling operation and base/offset addition have been
eliminated; on others it may run slower, because a specific addressing
mode cannot be used.


	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)

guy@auspex.auspex.com (Guy Harris) (05/23/89)

>Perhaps I've asked the wrong question.  I saw a couple of simple test
>programs that assigned 0 to each member of an array.  One used array
>subscript notation, and the other, pointer notation.  I compiled these
>on a 7300, a 3B2/400, and a 386 running Microport V/386, using a variety
>of compilers (cc and gnu-cc on the 7300, fpcc on the 3B2, and cc and
>Greenhills on the 386).  I ran each version and timed the execution. 
>The subscript versions had different run times from the pointer versions
>(some slower, some faster!).  I assumed - perhaps naively - that the
>differences were caused by differences in code produced by the different
>compilers (and of course the hardware differences).  Was that wrong? 
>How does one account for the differences?

Well, if the program that used subscript notation was something like:

	for (i = 0; i < LEN; i++)
		a[i] = 0;

and the program that used pointer notation was something like:

	p = &a[0];
	while (p < &a[LEN]) 
		*p++ = 0;

the answer has nothing whatsoever to do with the equivalence of "a[i]"
and "*(a + i)", since the latter program doesn't use the latter
construct, so you did ask the wrong question.

It has, instead, to do with the fact that the equivalence of the two
constructs in question is not as trivial as the equivalence of "a[i]"
and "*(a + i)", and therefore it may be less likely that the compilers
will generate the same code for them.  There may well be compilers that
*do* generate the same code for them - rewrite the first loop as:

	for (i = 0; i < LEN; i++)
		*(a + i) = 0;

and then note that on most architectures, this requires that the value
in "i" be multiplied by "sizeof a[0]" before being added to the address
represented by the address of "a[0]", and do a strength reduction on
that multiplication; you then find the induction variable not used, and
eliminate it, and by the time the smoke clears you have the loop in the
first example generating the same code as the loop in the second
example.  (I don't know whether there are any compilers that do this or
not.)

If the code generated for the two constructs is different, that could
account for performance differences.

karl@haddock.ima.isc.com (Karl Heuer) (05/23/89)

In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>In article <17635@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>I have never observed [a compiler to treat "a[i]" and "*(a+i)" differently].
>
>Perhaps I've asked the wrong question.  I saw a couple of simple test
>programs that assigned 0 to each member of an array.  One used array
>subscript notation, and the other, pointer notation.

By "pointer notation" do you mean only that the code used "*(a+i)" for "a[i]"?
Or are you talking about code that used "*p++" instead of "a[i++]"?  The
latter is an entirely different question!  (And it's usually what people are
testing when they write "array vs. pointer" tests.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

karl@haddock.ima.isc.com (Karl Heuer) (05/23/89)

In article <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) writes:
>Any idea why this wasn't done for jmp_buf's?

Originally?  Probably shortsightedness.  In ANSI C?  Backward compatibility.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

cudcv@warwick.ac.uk (Rob McMahon) (05/23/89)

In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>Perhaps I've asked the wrong question.  I saw a couple of simple test
>programs that assigned 0 to each member of an array.  One used array
>subscript notation, and the other, pointer notation ...  The subscript
>versions had different run times from the pointer versions (some slower, some
>faster!).  I assumed - perhaps naively - that the differences were caused by
>differences in code produced by the different compilers (and of course the
>hardware differences).

I'll lay odds that you're comparing

	int i;
	for (i = 0; i < MAX; i++)
		a[i] = 0;
with
	grimble *p;
	for (p = a; p < &a[MAX]; p++)
		*p = 0;

am I right?

Note that this is not comparing `a[i]' with `*(a+i)' at all, the second loop
simply has to increment a pointer, not scale an integer and add it to the
address of an array.  Compilers with strength reduction will make both
equivalent.  On machines with fiendish indexed addressing modes the first may
be as fast or faster, on other machines the second may be faster.

Rob
-- 
UUCP:   ...!mcvax!ukc!warwick!cudcv	PHONE:  +44 203 523037
JANET:  cudcv@uk.ac.warwick             ARPA:   cudcv@warwick.ac.uk
Rob McMahon, Computing Services, Warwick University, Coventry CV4 7AL, England

pjh@mccc.UUCP (Pete Holsberg) (05/23/89)

In article <17657@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
=>>In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) asked
=>>>why compilers produce different code for "a[i]" and "*(a+i)"?
=
=>In article <17635@mimsy.UUCP> I noted that
=>>I have never observed one to do so.  There is no reason for a compiler
=>>to generate different code, as the expressions are semantically identical.
=
=I have none of these machines, and only gcc as a compiler.  The code
=produced by
=
=	GNU C version 1.35 (vax) compiled by GNU C version 1.35.
=
=for both loops in
=
=	int a[20];
=	main(){int i;
=	for(i=0;i<20;i++)a[i]=0;
=	f();
=	for(i=0;i<20;i++)*(a+i)=0;
=	f();
=	}
=
=was identical.  

=Differing code sequences is one of two obvious possibilities, the other
=being differing multi-user loads.  The latter seems less likely, especially
=if the results are repeatable.  Why not compile to assembly and compare?

Thanks, Chris.  I will try that on the 386 machine, as that assembly language
is not as much an unknown as those for the 680x0 and the WE320x0!
-- 
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690  
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh

diamond@diamond.csl.sony.junet (Norman Diamond) (05/23/89)

In article <755@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:

>>Can you name a C compiler that was written by a
>>competent compiler writer?

In article <1989May21.205928.26064@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:

>Sometimes I think the only one was Dennis Ritchie's original pdp11 compiler,
>with the original PCC perhaps a borderline case.

Well, in terms of the original question ...
>And lest there be any
>doubts about the matter, both of them convert "a[i]" to "*(a+i)" as they
>parse, so the code for the two expressions is necessarily identical.  (I
>went and looked at the compiler sources to be sure.)
... yes it's encouraging to see that they were competent.

Now about PCC.  Do you really mean that all those bugs were inserted
into PCC after the original?  I suppose it's possible.  I don't have
the original.  But some of those bugs look pretty old.  How portable
is it to dereference the null pointer?  I'm amazed that the thing runs
(except of course for where it doesn't run).

--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.co.jp@relay.cs.net)
  The above opinions are my own.   |  Why are programmers criticized for
  If they're also your opinions,   |  re-implementing the wheel, when car
  you're infringing my copyright.  |  manufacturers are praised for it?

randolph@ektools.UUCP (Gary L. Randolph) (05/23/89)

In article <194@mole-end.UUCP> mat@mole-end.UUCP (Mark A Terribile) writes:
>Are you saying that  2[ "hello" ]  is not the same as  "hello"[ 2 ] ?  If
>so, you are wrong. ...
>
>If your compiler rejects  "hello"[ 2 ]  it is broken.

I agree with Mark T, but more importantly, this question was answered
a day or two ago by Andrew Koenig.  I am an avid reader of all of Andrew's
works (C & C++).  I have learned much from this reading.  He says,
as does the reference manual, that commutation applies; therefore IT DOES.
:-)

                         Gary L. Randolph
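
The commutation being discussed follows from a[i] being defined as *(a+i)
and from pointer addition being commutative.  A small sketch, with function
names invented for the example:

```c
/* Returns 1 when all four spellings of "element i of s" agree. */
int subscript_forms_agree(const char *s, int i)
{
    return s[i] == *(s + i)
        && *(s + i) == *(i + s)
        && *(i + s) == i[s];
}

/* The literal case from the thread: 2["hello"] is the same as "hello"[2]. */
char third_of_hello(void)
{
    return 2["hello"];
}
```

A compiler that rejects the second function would be rejecting a valid
expression, which is the point Mark Terribile makes above.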

pjh@mccc.UUCP (Pete Holsberg) (05/23/89)

In article <25711@amdcad.AMD.COM> tim@amd.com (Tim Olson) writes:
=However, if instead you wrote them like:
=
=	int a[MAX];			int a[MAX];
=	int i;				int *p;
=	for (i=0; i<MAX; ++i)		for (p=&a[0]; p<&a[MAX]; ++p)
=		a[i] = 0;			*p=0;
=
=Then you indeed might get different assembly language generated.  The
=second pointer version has had a "loop induction" optimization performed
=by hand. On some compiler/machine combinations, this will run faster,
=because the scaling operation and base/offset addition have been
=eliminated; on others it may run slower, because a specific addressing
=mode cannot be used.

Tim,
	Here's the actual code.  I think you've hit the nail on the head.

#define IMAX 10
#define LOOP 10000

main()
{
	int a[IMAX];
	register int * p;
	int v=0;
	
	while (v++ < LOOP)
		for (p=a; p < &a[IMAX];)
			*p++=v;
}

-- 
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690  
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh

pjh@mccc.UUCP (Pete Holsberg) (05/24/89)

In article <1677@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
=Well, if the program that used subscript notation was something like:
=
=	for (i = 0; i < LEN; i++)
=		a[i] = 0;
=
=and the program that used pointer notation was something like:
=
=	p = &a[0];
=	while (p < &a[LEN]) 
=		*p++ = 0;
=
=the answer has nothing whatsoever to do with the equivalence of "a[i]"
=and "*(a + i)", since the latter program doesn't use the latter
=construct, so you did ask the wrong question.

So it seems!

=It has, instead, to do with the fact that the equivalence of the two
=constructs in question is not as trivial as the equivalence of "a[i]"
=and "*(a + i)", and therefore it may be less likely that the compilers
=will generate the same code for them.  

OK, so even though the two pieces of code are doing the same job and one uses
index notation while the other uses pointer notation, the compiler is not 
likely to notice this.

=There may well be compilers that
=*do* generate the same code for them - rewrite the first loop as:
=
=	for (i = 0; i < LEN; i++)
=		*(a + i) = 0;
=
=and then note that on most architectures, this requires that the value
=in "i" be multiplied by "sizeof a[0]" before being added to the address
=represented by the address of "a[0]", and do a strength reduction on
                                                ^^^^^^^^^^^^^^^^^^
                                                could you explain this?

=that multiplication; you then find the induction variable not used, and
                                        ^^^^^^^^^^^^^^^^^^
                                        and this?

=eliminate it, and by the time the smoke clears you have the loop in the
=first example generating the same code as the loop in the second
=example.  (I don't know whether there are any compilers that do this or
=not.)
=
=If the code generated for the two constructs is different, that could
=account for performance differences.

I'll try it.  Thanks for the explanation.
-- 
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690  
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh
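
The two terms Pete asks about can be shown by doing the transformation by
hand.  The sketch below is an assumption about what such an optimizer does in
principle, not a description of any particular compiler; the function names
are invented, and the char * casts exist only to make the byte-level
arithmetic visible.

```c
#include <stddef.h>

/* Step 0: what "*(a + i) = 0" means at the address level.
   The index is multiplied by the element size on every iteration. */
void zero_scaled(int *a, size_t len)
{
    char *base = (char *)a;
    size_t i;

    for (i = 0; i < len; i++)
        *(int *)(base + i * sizeof a[0]) = 0;
}

/* Step 1: strength reduction.  The multiplication is replaced by a
   running byte offset that grows by sizeof a[0] each iteration. */
void zero_reduced(int *a, size_t len)
{
    char *base = (char *)a;
    size_t i, off = 0;

    for (i = 0; i < len; i++, off += sizeof a[0])
        *(int *)(base + off) = 0;
}

/* Step 2: induction variable elimination.  After step 1, i is used only
   by the loop test, so the test is rewritten in terms of the address
   and i disappears.  This is Guy Harris's pointer loop. */
void zero_walking(int *a, size_t len)
{
    int *p = a;
    int *end = a + len;

    while (p < end)
        *p++ = 0;
}
```

All three functions zero the same elements; only the amount of arithmetic
done per iteration changes.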

castong@bucsb.UUCP (Paul Castonguay) (05/24/89)

>
>> 	The statement that *(a+i) == *(i+a) is therefore invalid.
>
>No, the statement is true.
>-- 

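#include <stdio.h>
#include <stdlib.h>	/* declares malloc() and free() */

int main(void)
{
    int *a;
    int i = 1;

    a = malloc(4 * sizeof *a);	/* room for four ints, whatever their size */
    if (a == NULL)
        return 1;
    *a = 0;
    *(a+1) = 4;
    *(a+2) = 0;
    *(a+3) = 0;

    printf("*(a+i) = %d  ", *(a+i));
    printf("*(i+a) = %d\n", *(i+a));
    free(a);
    return 0;
}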


Output produced:

*(a+i) = 4  *(i+a) = 4


Does that not show that *(a+i) == *(i+a) ?

sar@datcon.UUCP (Simon A Reap) (05/25/89)

In article <155@titania.warwick.ac.uk> cudcv@warwick.ac.uk (Rob McMahon) writes:
>In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>>Perhaps I've asked the wrong question.  I saw a couple of simple test
>>programs that assigned 0 to each member of an array.  One used array
>>subscript notation, and the other, pointer notation ...  The subscript
>>versions had different run times from the pointer versions (some slower, some
>>faster!).  I assumed - perhaps naively - that the differences were caused by
>>differences in code produced by the different compilers (and of course the
>>hardware differences).
>
>I'll lay odds that you're comparing
>
>	int i;
>	for (i = 0; i < MAX; i++)
>		a[i] = 0;
>with
>	grimble *p;
>	for (p = a; p < &a[MAX]; p++)
>		*p = 0;
>
>am I right?
>

Compiling:
int a[20];
main(){int i;
for(i=0;i<20;i++)a[i]=0;
f();
for(i=0;i<20;i++)*(a+i)=0;
f();
}

on our Pyramid 9820 gives the following assembler in att and ucb universes:
	movw	$0x0,lr0
	br	L13
L15:
	movw	$0x0,_a[lr0*0x4]             ;body of loop for a[i]=0
	addw	$0x1,lr0                     ;
L13:
	cmpw	$0x14,lr0
	blt	L15
L14:
	call	_f
	movw	$0x0,lr0
	br	L16
L18:
	mova	_a[lr0*0x4],pr2              ;body of loop for *(a+i)=0
	movw	$0x0,(pr2)                   ;
	addw	$0x1,lr0                     ;
L16:
	cmpw	$0x14,lr0
	blt	L18
L17:
	call	_f
	ret	

Yup, different code.  But then again, what can one expect from a compiler
that doesn't understand i[a] (as in "hello"[2]) :-(  Such a pity on an
otherwise good machine.


-- 
Enjoy,
yerluvinunclesimon             Opinions are mine - my cat has her own ideas
Reach me at sar@datcon.co.uk, or ...!mcvax!ukc!pyrltd!datcon!sar

rec@elf115.uu.net (Roger Critchlow) (05/26/89)

In <105998@sun.Eng.Sun.COM>
jamesa@arabian.Sun.COM (James D. Allen) had written:
>                                         But a logical array can be
>promoted to a first-class citizen by just putting it in a structure:
>
>		typedef struct {
>			jmp_buf j;
>		} first_class_jmpbuf;
>
>Any idea why this wasn't done for jmp_buf's?

In <13269@haddock.ima.isc.com>
karl@haddock.ima.isc.com (Karl Heuer) responded:
>In article <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) writes:
>>Any idea why this wasn't done for jmp_buf's?
>Originally?  Probably shortsightedness.  In ANSI C?  Backward compatibility.

I believe that struct's were still second class objects when
jmp_buf was first declared.

The declaration of a jmp_buf as an array means that the jmp_buf
is passed by reference instead of by value.  This is essential
to many implementations of setjmp().  It's also a useful trick
to remember if you want user declared data objects to be passed
to your library routines by reference.

In <105998@sun.Eng.Sun.COM>
jamesa@arabian.Sun.COM (James D. Allen) continued:
>I think the "second-classedness" of arrays helps give C its elegant
>syntax.  Any other examples of the "problems" it causes?

I think of C arrays as syntactic sugar for initialized pointers.
Thus

	char foo[] = "I am an anonymous char *";

is an abbreviation for

	register char *const foo = "I am an anonymous char *";

I reason 'const' because the value of the pointer cannot be changed,
and 'register' because the address of the pointer cannot be taken.

-- rec@elf115.uu.net --

karl@haddock.ima.isc.com (Karl Heuer) (05/26/89)

In article <96@elf115.uu.net> rec@elf115.uu.net (Roger Critchlow) writes:
>In <13269@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>In article <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) writes:
>>>Any idea why this [enclosing in a struct] wasn't done for jmp_buf's?
>>
>>Originally?  Probably shortsightedness.
>
>I believe that struct's were still second class objects when
>jmp_buf was first declared.

But arrays were (and still are) third-class objects.  A struct would still
have been the better choice.

(Or do you mean to suggest that structs didn't exist at all?  I bet if you go
back that far, typedef didn't exist either.)

>The declaration of a jmp_buf as an array means that the jmp_buf
>is passed by reference instead of by value.  This is essential
>to many implementations of setjmp().

Of course it has to be passed by reference; the function needs to write in it,
after all.  But what *should* have been done was to typedef jmp_buf as a
struct, and use "&" explicitly.  That's what the rest of the library uses when
call by reference is required.

>I think of C arrays as syntactic sugar for initialized pointers.  Thus
>	char foo[] = "I am an anonymous char *";
>is an abbreviation for
>	register char *const foo = "I am an anonymous char *";

This is very wrong, but I'll let Chris Torek explain why, since he already has
it in his FAQ database.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
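
The difference between the array typedef and the struct typedef can be
sketched with stand-in types.  Everything here is hypothetical: the real
contents of jmp_buf are implementation-defined, and array_buf, struct_buf
and the save_* functions are names made up for the illustration.

```c
/* Hypothetical stand-ins for jmp_buf. */
typedef int array_buf[4];                    /* an array type, like jmp_buf */
typedef struct { int regs[4]; } struct_buf;  /* the struct alternative */

/* The parameter is adjusted to int *, so this writes the caller's buffer. */
void save_array(array_buf b)
{
    b[0] = 42;
}

/* A struct is copied on the way in; the caller's object is untouched. */
void save_struct_copy(struct_buf b)
{
    b.regs[0] = 42;
}

/* With an explicit pointer, the struct version writes the caller's object. */
void save_struct_ptr(struct_buf *b)
{
    b->regs[0] = 42;
}
```

With the struct typedef the caller writes save_struct_ptr(&buf), which is the
explicit "&" Karl says the rest of the library already uses.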

guy@auspex.auspex.com (Guy Harris) (05/27/89)

 >>I think the "second-classedness" of arrays helps give C its elegant
 >>syntax.  Any other examples of the "problems" it causes?
 >
 >I think of C arrays as syntactic sugar for initialized pointers.

In other words, your answer to his question is that one problem caused
by the "second-classedness" of arrays is that it leads people to think
of them, incorrectly, as pointers?  I'd certainly agree with that....

 >Thus
 >
 >	char foo[] = "I am an anonymous char *";
 >
 >is an abbreviation for
 >
 >	register char *const foo = "I am an anonymous char *";
 >
 >I reason 'const' because the value of the pointer cannot be changed,
 >and 'register' because the address of the pointer cannot be taken.

Well, unfortunately, there's no little thing you can add to the
declaration to straightforwardly reflect the fact that:

	foo.c:

		...

		char foo[] = "I am an array";

		...

	bar.c:

		...

		extern char *const foo;

		...

is wrong.  (And yes, "I did (the above); why isn't it working?" has
appeared as a question in this newsgroup in the past, so people really
*do* get the idea that it's supposed to work.)  If you really want to go
out of your way, I guess the "register" does that - but it also hints
that something gets stuffed into a register, which is wrong. 

Think of arrays as arrays, pointers as pointers, and array-valued
expressions as being converted, in most but *not* all contexts, to
pointer-valued expressions that point to the first element
of the array, and you won't go wrong.  That may be more *complicated*
than your model, but it has the advantage of reflecting reality....
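
Two of the contexts in which an array is *not* converted to a pointer are the
operands of sizeof and of unary &.  The sketch below uses made-up names
(as_array, as_pointer and the three functions) and relies on the string
"I am an array" occupying 14 bytes including its terminating '\0'.

```c
#include <stddef.h>

char as_array[] = "I am an array";         /* 14 bytes of storage */
const char *as_pointer = "I am an array";  /* one pointer to a string literal */

/* sizeof sees the whole array: 13 characters plus the terminating '\0'. */
size_t array_bytes(void)
{
    return sizeof as_array;
}

/* sizeof sees only the pointer object, whatever its size on this machine. */
size_t pointer_bytes(void)
{
    return sizeof as_pointer;
}

/* &as_array has type char (*)[14], a pointer to the whole array, so
   stepping it by one moves 14 bytes rather than 1. */
size_t step_of_array_pointer(void)
{
    char (*whole)[14] = &as_array;

    return (size_t)((char *)(whole + 1) - (char *)whole);
}
```

This is also why declaring the array as "extern char *const foo" in another
file fails: the other file would read the first bytes of the characters as
if they were a stored pointer.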