[comp.lang.c] What does Z["ack"] = 5 mean?

laba-3aw@web.berkeley.edu (Sam Shen) (10/05/88)

Exactly what does this mean:

main()
{
	char Z;

	Z["ack!"] = 5;
}

This doesn't look right to me.  However, cc doesn't complain
at all about it.  Lint says:

blah.c(5): warning: Z may be used before set

And finally GNU C, (gcc -Wall) says:

blah.c:2: warning: return-type defaults to `int'
blah.c:6: warning: control reaches end of non-void function

Worse yet, the executable produced by gcc core dumps.  Oh, by the way, this
is all on a Sun-3/50.

Sam Shen (laba-3aw@web.berkeley.edu)

earleh@eleazar.dartmouth.edu (Earle R. Horton) (10/05/88)

In article <14999@agate.BERKELEY.EDU> laba-3aw@web.berkeley.edu 
	(Sam Shen) writes:
>Exactly what does this mean:
>
>main()
>{
>	char Z;
>
>	Z["ack!"] = 5;
>}
>

Portion of core dump obtained by running this program on a VAX, and core
dumping before main() returns, follows:

\300^C\214\350\377^?^E^@^@^@^Eck!^@^@^@

                            ^-- There's your 5!

What happened here was that Z was initialized to (char)0 by the
compiler or loader on the VAX.  (BSD 4.3)  Then, Z["ack!"] was taken
to mean 0["ack!"] which of course means 0[<address of "ack!">].  The
equivalence sets the first character in "ack!" equal to 5, which is
what you see in the core dump as displayed by emacs.

This is of course grossly implementation dependent, and uses
questionable programming practices, to wit:  Casting of char to
pointer is highly questionable.  Said pointer defaulting to (char *)
is probably reliable, but not good programming practice.  Assuming you
can write on top of constant data may not work on all systems.  The
assumption that Z will be equal to '\0' if not explicitly initialized
is not portable.  Worse, you cannot cast a pointer ("ack!") to an
integer array subscript on some systems!

Good programming practice demands you say exactly what you mean, and
only write on variable data (usually).  Thus:

main()
{
	char Z[5] = "ack!";

	Z[0] = 5;
}

Earle R. Horton. 23 Fletcher Circle, Hanover, NH 03755
(603) 643-4109
Sorry, no fancy stuff, since this program limits my .signature to three

merlyn@intelob.intel.com (Randal L. Schwartz @ Stonehenge) (10/06/88)

In article <10283@dartvax.Dartmouth.EDU>, earleh@eleazar (Earle R. Horton) writes:
| What happened here was that Z was initialized to (char)0 by the
| compiler or loader on the VAX.  (BSD 4.3)  Then, Z["ack!"] was taken
| to mean 0["ack!"] which of course means 0[<address of "ack!">].  The
| equivalence sets the first character in "ack!" equal to 5, which is
| what you see in the core dump as displayed by emacs.
| 
| This is of course grossly implementation dependent, and uses
| questionable programming practices, to wit:  Casting of char to
| pointer is highly questionable.  Said pointer defaulting to (char *)
| is probably reliable, but not good programming practice.  Assuming you
| can write on top of constant data may not work on all systems.  The
| assumption that Z will be equal to '\0' if not explicitly initialized
| is not portable.  Worse, you cannot cast a pointer ("ack!") to an
| integer array subscript on some systems!

I disagree.  I don't have the C bible to quote, but as a fluent C-er,
I can safely say that somewhere it says

   A[B] is entirely equivalent to B[A], both being synonymous with
   *(A+B)

This means that

   Z["Ack!"]

is the same as

   "Ack!"[Z]

so Z is being used as a char cast into an integer (can you say,
"subscript"?) while "Ack!" provides a const char array reference
(L-value).

Now, you may not like the idea of the subscript transposed with the
array name, but it is legal C.

The bad part is assigning into the const char array.  Some Cs will
like it, and others won't.  Essentially, what you are doing is:

   "Ack!"[0] = 5

or

  *"Ack!" = '\5'

Enough said.  Flames about how "foo" is a const char array are
welcome.  I don't know enough ANSI to comment on legality in the "new"
C.
-- 
Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095
on contract to BiiN Technical Information Services (for now :-),
in a former Intel building in Hillsboro, Oregon, USA
<merlyn@intelob.intel.com> or ...!tektronix!inteloa[!intelob]!merlyn
Standard disclaimer: I *am* my employer!

kyriazis@rpics (George Kyriazis) (10/06/88)

In article <14999@agate.BERKELEY.EDU> laba-3aw@web.berkeley.edu (Sam Shen) writes:
>Exactly what does this mean:
>
>main()
>{
>	char Z;
>
>	Z["ack!"] = 5;
>}
>

	Something like that was posted a few months ago.  Remember that

		a[b]  == *(a+b)		[pointer arithmetic]
		*(a+b) == *(b+a)	[obvious]
		*(b+a) == b[a]		[!!!!]

	In other words  Z["ack!"] == "ack!"[Z]  (!!). Z happens to be '\0',
so what it does is change the first character of the constant "ack!" to 5.
In some systems you can't do that because constants are read-only.  I just
tried it on a SUN4 and it didn't complain at all..

  George Kyriazis
  kyriazis@turing.cs.rpi.edu
------------------------------

ron@ron.rutgers.edu (Ron Natalie) (10/06/88)

Your program is bogus only in the fact that Z is not set to anything
before it is referenced.  Automatic variables are NOT guaranteed to
be set to zero.

	A[B] is equivalent to *(A+B)

This means that in programs like:
	char	a[10];
	int	b;

	b = 1;

	a[b] = '5';
	b[a] = '5';

The last two lines are equivalent.
Note that strings are just character arrays.  So you can do

	"FOOBAR"[3] = 'E';

or even
	3["FOOBAR"] = 'E';

Although it's not clear to me what the point of modifying a string like
that is, because your example doesn't store a pointer to that string
anywhere, and hence it can never be referenced again.

-Ron

jfh@rpp386.Dallas.TX.US (The Beach Bum) (10/06/88)

In article <14999@agate.BERKELEY.EDU> laba-3aw@web.berkeley.edu (Sam Shen) writes:
>Exactly what does this mean:
>
>main()
>{
>	char Z;
>
>	Z["ack!"] = 5;
>}

The same as

	char Z;

	"ack!"[Z] = 5;

>This doesn't look right to me.  However, cc doesn't complain
>at all about it.  Lint says:
>
>blah.c(5): warning: Z may be used before set

Right.  It should be

	char Z = 0;

to give Z a reasonable value.

That would produce code to change the 'a' in "ack!" to a \005, which is ^E.

>Worse yet, the executable produced by gcc core dumps.  Oh, by the way, this
>is all on a Sun-3/50.

And well it should.
-- 
John F. Haugh II (jfh@rpp386.Dallas.TX.US)                   HASA, "S" Division

      "Why waste negative entropy on comments, when you could use the same
                   entropy to create bugs instead?" -- Steve Elias

dg@lakart.UUCP (David Goodenough) (10/07/88)

From article <14999@agate.BERKELEY.EDU>, by laba-3aw@web.berkeley.edu (Sam Shen):
> Exactly what does this mean:
> 
> main()
> {
> 	char Z;
> 
> 	Z["ack!"] = 5;
> }
> 
> This doesn't look right to me.

Given that a[b] is (in some compilers [1]) considered to be *(a+b), your
Z["ack!"] = 5 can be considered as *("ack!" + Z) = 5, which equates to
"ack!"[Z] = 5. OK so far - this will work on most machines where strings
do not become part of a read-only memory segment (Do any architectures
exist where strings become shared and read only??). Reading such a value
however, should be legal (at least that's what I interpret K&R to say).

[1] It is interesting to note that Greenhills CC (what we have here)
chokes on this sort of thing, generating the following:

"foo.c", line 5: Indexing not allowed
"foo.c", line 5: Type mismatch

Is this compiler broken W.R.T. dpANSI?? (not that I do this, it just happens
a lot in the obfuscated C contest :-)
-- 
	dg@lakart.UUCP - David Goodenough		+---+
							| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart@harvard.harvard.edu		  	  +---+

kenny@m.cs.uiuc.edu (10/08/88)

/* Written  2:54 pm  Oct  5, 1988 by merlyn@intelob.intel.com in m.cs.uiuc.edu:comp.lang.c */
Now, you may not like the idea of the subscript transposed with the
array name, but it is legal C.
/* End of text from m.cs.uiuc.edu:comp.lang.c */

Since K&R are somewhat unclear on the point, some compiler
implementors, notably Encore, have been lax about implementing
integer[pointer] and integer +- pointer.  It is unwise to depend on
either form's working in portable code; better to use pointer[integer]
and pointer +- integer.

Kevin

ark@alice.UUCP (Andrew Koenig) (10/09/88)

In article <4700019@m.cs.uiuc.edu>, kenny@m.cs.uiuc.edu writes:
> 
> /* Written  2:54 pm  Oct  5, 1988 by merlyn@intelob.intel.com in m.cs.uiuc.edu:comp.lang.c */
> Now, you may not like the idea of the subscript transposed with the
> array name, but it is legal C.
> /* End of text from m.cs.uiuc.edu:comp.lang.c */
> 
> Since K&R are somewhat unclear on the point, some compiler
> implementors, notably Encore, have been lax about implementing
> integer[pointer] and integer +- pointer.

First edition, page 186:

	A primary expression followed by an expression in
	square brackets is a primary expression.  The intuitive
	meaning is that of a subscript.  Usually, the primary
	expression has type ``pointer to ...'', the subscript
	expression is int, and the type of the result is ``...''.
	The expression E1[E2] is identical (by definition)
	to *((E1)+(E2)).  All the clues needed to understand
	this notation are contained in this section together
	with the discussion in sections 7.1, 7.2, and 7.4
	on identifiers, *, and + respectively; section 14.3 below
	summarizes the implications.

And on page 210, in section 14.3:

	Because of the conversion rules which apply to +, if E1
	is an array and E2 an integer, then E1[E2] refers to the
	E2-th member of E1.  Therefore, despite its asymmetric
	appearance, subscripting is a commutative operation.

I don't see how it's possible to be more explicit than that.
E1[E2] means precisely the same thing as *((E1)+(E2)) and is
therefore meaningful in every context where *((E1)+(E2)) is
meaningful.  In particular, it means the same as E2[E1].
I can't see how to draw any other conclusion from the above.

Indeed, some compiler writers may get it wrong.  But don't blame K&R.
-- 
				--Andrew Koenig
				  ark@europa.att.com

kenny@m.cs.uiuc.edu (10/10/88)

It would be clearer had K&R said specifically, in section 7.4, that
addition is commutative.  I'm not *blaming* K&R -- to me the intent
was clear -- but more than one compiler writer has misinterpreted the
passage.

kenny@m.cs.uiuc.edu (10/10/88)

	I owe Messrs. Kernighan and Ritchie an apology for suggesting
that they were less than clear in specifying the behavior of the
subscript operator; there is a passage that states specifically that
subscripting is commutative, despite its asymmetric appearance.  That
passage, however, is the *only* indication that pointer arithmetic is
commutative (the fact is not mentioned in the context of the addition
operator), which may account for its having been missed by several
compiler writers.

	In any case, K&R second edition, Harbison & Steele, and the
dpANS have all added text clarifying the point, so perhaps we can lay
it to rest, except to remember that a number of compilers do it wrong,
and therefore to avoid the construct in code that is expected to be
portable.  Consider it `a common bug,' though, rather than `a flaw in
the specification.'

tanner@cdis-1.uucp (Dr. T. Andrews) (10/11/88)

In article <4700019@m.cs.uiuc.edu>, kenny@m.cs.uiuc.edu writes:
) ...implementors, notably Encore, have been lax about implementing
) integer[pointer] and integer +- pointer.  It is unwise to depend on
) either form's working in portable code; ...

It is also possible that compiler writers will get the "for" loop
handling wrong.  It is unwise to depend on "for" loops in portable
code.  Use a "while" loop instead.
-- 
...!bikini.cis.ufl.edu!ki4pv!cdis-1!tanner  ...!bpa!cdin-1!cdis-1!tanner
or...  {allegra killer gatech!uflorida decvax!ucf-cs}!ki4pv!cdis-1!tanner

thomas@uplog.se (Thomas Hameenaho) (10/13/88)

In article <14999@agate.BERKELEY.EDU> laba-3aw@web.berkeley.edu (Sam Shen) writes:
#Exactly what does this mean:
#
#main()
#{
#	char Z;
#
#	Z["ack!"] = 5;
#}
#

#Worse yet, the executable produced by gcc core dumps.  Oh, by the way, this
#is all on a Sun-3/50.
#


The problem in the gcc case is that gcc puts strings in the text segment and
text is normally read-only.

Our gcc/cc for 68k also doesn't initialize Z to anything useful, it just grabs
what happens to be on the stack.
-- 
Real life:	Thomas Hameenaho		Email:	thomas@uplog.{se,uucp}
Snail mail:	TeleLOGIC Uppsala AB		Phone:	+46 18 189406
		Box 1218			Fax:	+46 18 132039
		S - 751 42 Uppsala, Sweden

mcdaniel@uicsrd.csrd.uiuc.edu (10/18/88)

Written  3:17 pm  Oct 14, 1988 by knudsen@ihlpl.ATT.COM in comp.lang.c:
> In article <6945@cdis-1.uucp>, tanner@cdis-1.uucp (Dr. T. Andrews) writes:
> > In article <4700019@m.cs.uiuc.edu>, kenny@m.cs.uiuc.edu writes:
> > ) ...implementors, notably Encore, have been lax about implementing
> > ) integer[pointer] and integer +- pointer.  It is unwise to depend on
> > ) either form's working in portable code; ...
> > It is also possible that compiler writers will get the "for" loop
> > handling wrong.  It is unwise to depend on "for" loops in portable
> > code.  Use a "while" loop instead.
>
>Could you elaborate more on what kind of errors are most likely
>in compiling for loops?  In testing?  Incrementing?

Um, I think (I hope) Dr. Andrews was being sarcastic.

The C language, as defined by the dpANS and K&R's second edition (and
arguably first edition), says that "i[p]" is equivalent to "p[i]" and
"i+p" is equivalent to "p+i", if "i" is an integer type and "p" is a
pointer type.  "i[p]" and "i+p" are as much a part of the language as
"for".  If a compiler does not accept such syntax, it's not a C
compiler; it's a compiler for a language that resembles C.

I don't think "i-p" is legal, though, Ken.

-- 
Tim, the Bizarre and Oddly-Dressed Enchanter
Center for Supercomputing Research and Development
at the University of Illinois at Urbana-Champaign

Internet, BITNET:  mcdaniel@uicsrd.csrd.uiuc.edu
UUCP:    {uunet,convex,pur-ee}!uiucuxc!uicsrd!mcdaniel
ARPANET: mcdaniel%uicsrd@uxc.cso.uiuc.edu
CSNET:   mcdaniel%uicsrd@uiuc.csnet

ts@cup.portal.com (Tim W Smith) (11/01/88)

Just because something is only mentioned in one place in K&R is no
excuse for a compiler writer to miss it!  Before someone writes a
compiler that other people are going to use, that person should at
least have read *ALL* of the book. 

Can anyone give a reasonable excuse for a compiler writer not to
be aware of A[B] and B[A] meaning the same thing?

						Tim Smith