[comp.lang.c] "abcdef"[3] == 3["abcdef"], but why?

TEITTINEN@cc.helsinki.fi (09/29/89)

As I was trying to figure out what the cryptic maze program (posted to
the net some days ago) does, I ran into a very interesting feature of C.
The program exploits the fact that the following equation is true:

         "abcdef"[3] == 3["abcdef"]   (both equal to 'd')

In fact, not only are the values the same: if the string "abcdef" were
replaced by a pointer to char, the expressions would refer to the same
memory location! This can't be a compiler-dependent feature (or bug)
because the maze program runs correctly on various machines.

Could someone explain to me what a C compiler does when it runs into
expression 3["abcdef"]? 

-- 
EARN: teittinen@finuh                        ! "Studying is the only way
Internet: teittinen@cc.helsinki.fi           !  to do nothing without
Marko Teittinen, student of computer science !  anyone blaming you" -me

kohli@gemed (Software Surfer) (09/30/89)

On 29 Sep 89 15:29:00 GMT, Marko Teittinen, student of computer
science, wrote:
>Could someone explain to me what a C compiler does when it runs into
>expression 3["abcdef"]? 
>

Look out-- this was posted earlier this year as one of
several feeble tests of C wizardry, so you'll get a lot
of postings.

The reason 3["abcdef"] is equivalent to "abcdef"[3] is
that "abcdef", an array of char, is converted to a char*
when used in an expression like this, and 3 is an int.

Expressions of the form ptr[offset] are evaluated as
*(ptr+offset), which, you'll note, is equivalent to *(offset+ptr),
which is what you'd expect from offset[ptr].


Jim Kohli

chris@mimsy.UUCP (Chris Torek) (09/30/89)

In article <781@cc.helsinki.fi> TEITTINEN@cc.helsinki.fi writes:
>Could someone explain to me what a C compiler does when it runs into
>expression 3["abcdef"]? 

3["abcdef"] is of the form `e1 [ e2 ]' (e1 and e2 are arbitrary
expressions).  Every C compiler converts this internally to

	*( (e1) + (e2) )

(actually, compilers need only work *as if* they had done this
conversion, although many really do it).  So we really have

	*( 3 + "abcdef" )

To find out what this means, if anything, first try evaluating the
expression `3+"abcdef"'.  This is

	<value: int: 3> + <object: array 7 of char: `abcdef\0'>

Apply the rule for array objects in rvalue contexts: change `array N
of T' to `pointer to T', whose value is the address of the first (0th)
element of the array:

	<value: int: 3> + <value: char *: points to `a' in `abcdef\0'>

We now have an expression of the form <int>+<pointer>, so we move to
the int'th element of the array that starts at *<pointer>.  Here we
move to the 3rd element of `abcdef\0', which is the letter `c'.  Now
we have

	<value: char *: points to `c' in `abcdef\0'>

Now we can put the indirection back:

	*( <value: char *: points to `c' in `abcdef\0'> )

The operand for `*' is a pointer, so we get the object to which the
pointer points:

	<object: char: `c'>

Since e1[e2] is equivalent to *(e1+e2), if we reverse the expressions,
the only change is in the order of the operands to `+'.  The result of
<pointer>+<int> is the same as that of <int>+<pointer>, so we wind up
with <object: char: `c'> again.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

bengsig@oracle.nl (Bjorn Engsig) (10/02/89)

Article <19883@mimsy.UUCP> by chris@mimsy.UUCP (Chris Torek) explains
how the compiler could/does handle 3["abcde"].  This was very well done,
except that 3["abcde"] actually evaluates to the char 'd' and not 'c'.

I'm sure that Chris knows that C arrays start with index 0, right Chris :-)
-- 
Bjorn Engsig, bengsig@oracle.nl, bengsig@oracle.com, mcvax!orcenl!bengsig

tyler@procase.UUCP (William B. Tyler) (10/03/89)

In article <781@cc.helsinki.fi> TEITTINEN@cc.helsinki.fi writes:
>
>As I was trying to figure out what the cryptic maze program (posted to
>net some days ago) does, I ran into a very interesting feature of C. The
>program exploiting the fact that the following equation is true
>
>         "abcdef"[3] == 3["abcdef"]   (both equal to 'd')
>
>In fact, not only the values are the same. If the string "abcdef" was
>replaced by a pointer to char, the expressions would refer to the same
>memory location! This can't be a compiler-dependent feature (or bug)
>because the maze program runs correctly on various machines.
>
>Could someone explain to me what a C compiler does when it runs into
>expression 3["abcdef"]? 

This is a standard, though little-known part of the C language.
The expression   a[b]  is defined to mean the same thing as
*(a+b).  The result you obtained follows naturally from the rules
for pointer arithmetic.

Bill Tyler
-- 
Bill Tyler				...(tolerant|hpda)!procase!tyler

chris@mimsy.UUCP (Chris Torek) (10/03/89)

In article <519.nlhp3@oracle.nl> bengsig@oracle.nl (Bjorn Engsig) writes:
>Article <19883@mimsy.UUCP> by chris@mimsy.UUCP (Chris Torek) explains
>how the compiler could/does handle 3["abcde"].  This was very well done,
>except that 3["abcde"] actually evaluates to the char 'd' and not 'c'.

Oh my.  How embarrassing.

>I'm sure that Chris nows that C arrays start with index 0, right Chris :-)

He does now :-)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris