kelly@nmtsun.nmt.edu (Sean Kelly) (04/27/89)
My CS instructor and I disagree about a certain moot point. I have a text book which says that *(a + i) and a[i] are equivalent, given an array a, and int index i ... each gives the value stored in a[i]. But he says that *(a + i) is non-standard and would not expect it do go far on all _real_ C compilers (_real_ meaning those compilers that are somewhat devoted to K & R or ANSI). He expects that many compilers would instead add the value of i to the pointer a, and then reference the item stored there. I say that the compiler's smart enough to realize what we're trying to achieve, and won't do something like * (char *) ( (int) a + i ) which he thinks it will probably do on most machines. It doesn't on our Suns nor our VAX. I don't have a copy of K&R's book, first or new edition, just _Programming_ _in_C_ by S. Kochan, which seems pretty valid. What do you think? -- Sean Kelly I'm not a number, I am a free man! kelly@nmtsun.nmt.edu --The Prisoner --
gwyn@smoke.BRL.MIL (Doug Gwyn) (04/27/89)
In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes: >He expects that many compilers would instead add the value of i to the >pointer a, and then reference the item stored there. In C, pointer arithmetic ALWAYS involves scaling by the size of the pointed-to objects. This is one of Dennis's really useful insights. It is so fundamental to C that I have to worry about an instructor who claims otherwise.
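The scaling rule can be checked directly. What follows is a minimal sketch in C, not code from the thread; the function names are invented for illustration. It measures how many bytes separate a and a + i, and contrasts that with the unscaled byte address the instructor expected.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes between a and a + i, measured through char pointers.
   Arithmetic on a long * is scaled by sizeof(long). */
size_t scaled_offset_in_bytes(const long *a, size_t i)
{
    const char *base = (const char *)a;
    const char *elem = (const char *)(a + i);
    return (size_t)(elem - base);
}

/* Returns 1 when a + i and &a[i] are the same address. */
int plus_matches_subscript(const long *a, size_t i)
{
    return a + i == &a[i];
}

/* Returns 1 when the unscaled address (base plus i bytes), which is what
   the instructor feared, differs from the address of a[i]. */
int unscaled_differs(const long *a, size_t i)
{
    const char *unscaled = (const char *)a + i;
    return unscaled != (const char *)&a[i];
}
```

Called with an array of at least i + 1 longs, the first function should report i * sizeof(long), assuming only that long is wider than one byte for the third check.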
krazy@claris.com (Jeff Erickson) (04/27/89)
From article <2459@nmtsun.nmt.edu>, by kelly@nmtsun.nmt.edu (Sean Kelly): > He expects that many compilers would instead add the value of i to the > pointer a, and then reference the item stored there. I say that the > compiler's smart enough to realize what we're trying to achieve, and > won't do something like "* (char *) ((int) a+i)" which he thinks it > will probably do on most machines. You're right. Your instructor is full of donkey doo-doo. In fact, since a[i] = *(a+i), and (a+i)=(i+a), you can actually write i[a] for a[i] and most compilers will take it! (Every one I've tried has, anyway.) I refer you to page 205 of K&R, second edition. "A pointer to an object in an array and a value of any integral type may be added. The latter is converted to an address offset by multiplying it by the size of the object to which the pointer points. The sum is the same type as the original pointer, and points to another object in the same array, appropriately offset from the original object. Thus, if P is a pointer to an object in an array, the expression P+1 is a pointer to the next object in the array." If I were you, I'd question your instructor's qualifications to his superior. This is one of *the* most useful features of C. He obviously isn't well-versed in the language he's trying to teach you. -- Jeff Erickson Claris Corporation | Birdie, birdie, in the sky, 408/987-7309 Applelink: Erickson4 | Why'd you do that in my eye? krazy@claris.com ames!claris!krazy | I won't fret, and I won't cry. "I'm a heppy, heppy ket!" | I'm just glad that cows don't fly.
cik@l.cc.purdue.edu (Herman Rubin) (04/27/89)
In article <10135@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes: > In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes: > >He expects that many compilers would instead add the value of i to the > >pointer a, and then reference the item stored there. > > In C, pointer arithmetic ALWAYS involves scaling by the size of the > pointed-to objects. This is one of Dennis's really useful insights. > It is so fundamental to C that I have to worry about an instructor > who claims otherwise. For the same operation, one way will be better on one machine, and a different way on another. There are machines with index operations, where the multiplication by the appropriate power of 2 is invisible hardware, there are machines where increment and decrement for addresses is invisible hardware, and machines where neither of these is the case. I suspect that the number of ways of doing this is comparable to the number of discussants of this on comp.lang.c. Now suppose I am doing some serious array operations, and I have to know whether one array buffer is longer than another. The elements are of type long. Do I have to do this multiplying and dividing by 4 all the time? Another example of "user-friendly" which turns out to be "user-inimical." -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
ark@alice.UUCP (Andrew Koenig) (04/27/89)
In article <2459@nmtsun.nmt.edu>, kelly@nmtsun.nmt.edu (Sean Kelly) writes: > My CS instructor and I disagree about a certain moot point. I have a text > book which says that > *(a + i) and a[i] > are equivalent, given an array a, and int index i ... each gives the > value stored in a[i]. But he says that > *(a + i) > is non-standard and would not expect it do go far on all _real_ C compilers You are right, *(a + i) is precisely equivalent to a[i]. Any compiler that gets that wrong is badly broken. Moreover, many programs say a + i instead of &a[i] so there is a fair premium on getting at least that part of it right. You might ask your instructor for an example of a compiler that doesn't get *(a + i) right. -- --Andrew Koenig ark@europa.att.com
kremer@cs.odu.edu (Lloyd Kremer) (04/27/89)
In article <2459@nmtsun.nmt.edu> kelly@nmtsun.nmt.edu (Sean Kelly) writes: >My CS instructor and I disagree about a certain moot point. I have a text >book which says that > > *(a + i) and a[i] > >are equivalent, given an array a, and int index i ... each gives the >value stored in a[i]. But he says that > > *(a + i) > >is non-standard and would not expect it do go far on all _real_ C compilers The expressions *(a + i) and a[i] are absolutely synonymous in every way. Either one could be defined as the other. This fact is one of the foundational pillars of the C Language. Any C compiler that does not agree with this lacks knowledge of the most basic fundamentals of the language and does not deserve to be called a C compiler. I am tempted to make analogous remarks about C instructors. An interesting corollary of this rule, often used in intentionally obfuscated code, is: a[i] == *(a + i) == *(i + a) == i[a] Ask your instructor what he thinks 1["hello"] will evaluate to! -- Lloyd Kremer Brooks Financial Systems ...!uunet!xanth!brooks!lloyd Have terminal...will hack!
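The chain of equalities Kremer gives can be written out as a small check. This is an illustrative sketch only; the function names are not from the thread, and the caller is assumed to pass an index that lies inside the array.

```c
#include <assert.h>

/* Reads element i in all four spellings; returns 1 when they agree. */
int subscript_forms_agree(const int *a, int i)
{
    return a[i] == *(a + i)
        && *(a + i) == *(i + a)
        && *(i + a) == i[a];
}

/* 1["hello"] means *(1 + "hello"), the second character of the string. */
char second_char_of_hello(void)
{
    return 1["hello"];
}
```

On a conforming compiler second_char_of_hello() should return 'e', which answers the question Kremer suggests putting to the instructor.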
henry@utzoo.uucp (Henry Spencer) (04/27/89)
In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes: >My CS instructor and I disagree about a certain moot point. I have a text >book which says that > > *(a + i) and a[i] > >are equivalent, given an array a, and int index i ... each gives the >value stored in a[i]. But he says that > > *(a + i) > >is non-standard and would not expect it do go far on all _real_ C compilers >(_real_ meaning those compilers that are somewhat devoted to K & R or ANSI). Your instructor needs to read a book about C, and pay attention to it. He's obviously confusing C with assembler. Your book, and you, are correct. -- Mars in 1980s: USSR, 2 tries, | Henry Spencer at U of Toronto Zoology 2 failures; USA, 0 tries. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
gwyn@smoke.BRL.MIL (Doug Gwyn) (04/28/89)
In article <1266@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) [the infamous proponent of assembly language] writes:
-In article <10135@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
-> In C, pointer arithmetic ALWAYS involves scaling by the size of the
-> pointed-to objects. This is one of Dennis's really useful insights.
-> It is so fundamental to C that I have to worry about an instructor
-> who claims otherwise.
-For the same operation, one way will be better on one machine, and a
-different way on another. There are machines with index operations, where
-the multiplication by the appropriate power of 2 is invisible hardware,
-there are machines where increment and decrement for addresses is invisible
-hardware, and machines where neither of these is the case. I suspect that
-the number of ways of doing this is comparable to the number of discussants
-of this on comp.lang.c.
-Now suppose I am doing some serious array operations, and I have to know
-whether one array buffer is longer than another. The elements are of type
-long. Do I have to do this multiplying and dividing by 4 all the time?
-Another example of "user-friendly" which turns out to be "user-inimical."
I can make no sense whatsoever out of your comment.
    long a[ASIZE], b[BSIZE], *ap = &a[aindex], *bp = &b[bindex];
    if ( ASIZE > BSIZE ) ...
    if ( sizeof a > sizeof b ) ...
    if ( aindex > bindex ) ...
    if ( ap > bp ) ...
You don't have to do any "multiplying and dividing by 4 all the time".
Neither does the compiler. There is virtually no sensible operation you
can attempt with arrays or pointers in C that requires you to deal with
such scaling; it's taken care of for you by the compiler.
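Gwyn's point that the programmer never scales by hand can be sketched as follows. The array sizes and all names here are assumptions made for the example, since the post leaves ASIZE and BSIZE unspecified.

```c
#include <assert.h>
#include <stddef.h>

#define ASIZE 100   /* hypothetical sizes, chosen only for this sketch */
#define BSIZE 40

long buf_a[ASIZE], buf_b[BSIZE];

/* Element counts come from sizeof ratios; no manual scaling by 4. */
size_t count_a(void) { return sizeof buf_a / sizeof buf_a[0]; }
size_t count_b(void) { return sizeof buf_b / sizeof buf_b[0]; }

/* Subtracting two pointers into the same array yields a distance
   in elements, not in bytes. */
ptrdiff_t elements_between(const long *first, const long *last)
{
    return last - first;
}
```

Comparing count_a() with count_b() answers "which buffer is longer" without the source ever mentioning the size of a long.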
jimb@hpmcaa.HP.COM (Jim Belesiu) (04/28/89)
I refer you to Kernighan and Ritchie's second edition of "The C Programming Language", p99. There you'll find it stated explicitly that a[i] is equivalent to *(a+i) where *a and a[] reference the same data type. Jim Belesiu
bill@twwells.uucp (T. William Wells) (04/28/89)
In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes:
: My CS instructor and I disagree about a certain moot point. I have a text
: book which says that
:
: *(a + i) and a[i]
:
: are equivalent, given an array a, and int index i
This is a fundamental identity in C. A failure to do this in a
compiler would be considered a *major*, as in withdraw the product,
BUG.
A failure to understand this marks one as not competent to program C.
: What do you think?
Get another CS instructor.
---
Bill { uunet | novavax } !twwells!bill
mat@mole-end.UUCP (Mark A Terribile) (04/28/89)
> ... I have a text book which says that > *(a + i) and a[i] > are equivalent, given an array a, and int index i ... each gives the > value stored in a[i]. But he says that > *(a + i) > is non-standard and would not expect it do go far on all _real_ C compilers > ... He expects that many compilers would instead add the value of i to the > pointer a, and then reference the item stored there. I say that the > compiler's smart enough to realize what we're trying to achieve, and > won't do something like > > * (char *) ( (int) a + i ) > > which he thinks it will probably do on most machines. ... Oy vey! Of course they are equivalent; that is how subscripting is *defined* in C. Further, any compiler that introduces the effect of spurious type conversions of pointer expressions is broken. And I can testify of my own knowledge that at least two C compiler families with which some or most of us have experience transform the parse tree for the subscripted expression into the parse tree for explicit indirection before trying to generate code. If THAT won't cause them to produce the same code for both forms, there's very little that will. K&R state quite literally that the two expressions are identical, and further that a[ i ] is the same as i[ i ] I verified it on the PDP-11 compiler and on a Z-80 compiler derived from the PDP-11 compiler; I also tried it on an early PCC. I haven't tried it lately on anything. Why don't you try it on your favorite machine? K&R say it should work! -- (This man's opinions are his own.) From mole-end Mark Terribile
guy@auspex.auspex.com (Guy Harris) (04/28/89)
>My CS instructor and I disagree about a certain moot point. I have a text >book which says that > > *(a + i) and a[i] > >are equivalent, given an array a, and int index i ... each gives the >value stored in a[i]. Your textbook is correct. >But he says that > > *(a + i) > >is non-standard and would not expect it do go far on all _real_ C compilers >(_real_ meaning those compilers that are somewhat devoted to K & R or ANSI). Your instructor is incorrect. >He expects that many compilers would instead add the value of i to the >pointer a, and then reference the item stored there. Yes, which gives the value stored in a[i]. "The pointer a" is really "the pointer-valued expression generated by the conversion of the array-valued expression 'a' into a pointer-valued expression that points to the first element of the array 'a'"; if you add "i" to that pointer-valued expression, you get a pointer to the "i"th element of the array "a". Dereference that pointer, and you get the "i"th element of the array "a", or "a[i]". >I say that the compiler's smart enough to realize what we're trying >to achieve, and won't do something like > > * (char *) ( (int) a + i ) > >which he thinks it will probably do on most machines. You are correct; he is incorrect. Perhaps he does not understand how pointer addition works in C? If you add an integral value N to a pointer, it doesn't increment the address in that pointer by N storage units (bytes on byte addressible machine, etc.), it can be thought of as incrementing the address by N objects of the type to which that pointer points. In C, pointers have types, and those types are significant.
ftw@masscomp.UUCP (Farrell Woods) (04/28/89)
In article <2459@nmtsun.nmt.edu> kelly@titan.nmt.edu (Sean Kelly) writes: >My CS instructor and I disagree about a certain moot point. I have a text >book which says that > *(a + i) and a[i] >are equivalent, given an array a, and int index i ... each gives the >value stored in a[i]. But he says that > *(a + i) >is non-standard and would not expect it do go far on all _real_ C compilers [deleted] >What do you think? I think your Sun and your Vax are better authorities on C than your instructor. So are you, for that matter. If the Sun and Vax compilers aren't "real" in your instructor's terms, which compilers are? What you describe is simple pointer math. Find a K&R 1 and have your instructor start reading at the paragraph beginning near the top of page 94. -- Farrell T. Woods Voice: (508) 392-2471 Concurrent Computer Corporation Domain: ftw@masscomp.com 1 Technology Way uucp: {backbones}!masscomp!ftw Westford, MA 01886 OS/2: Half an operating system
bill@twwells.uucp (T. William Wells) (04/29/89)
In article <9987@claris.com> krazy@claris.com (Jeff Erickson) writes:
: In fact, since a[i] = *(a+i), and (a+i)=(i+a), you can actually write i[a]
: for a[i] and most compilers will take it! (Every one I've tried has, anyway.)
Try Microsoft. I don't know if it is true of the latest version, but
one of about two years ago wouldn't take it.
I made the mistake (don't ask why!) of putting this in some code
Proximity shipped; we got many complaints from people with broken
compilers. And not only Microsoft though I don't recall which others.
I certainly remember the embarrassment!
---
Bill { uunet | novavax } !twwells!bill
bill@twwells.uucp (T. William Wells) (04/29/89)
In article <1266@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
: Now suppose I am doing some serious array operations, and I have to know
: whether one array buffer is longer than another. The elements are of type
: long. Do I have to do this multiplying and dividing by 4 all the time?
: Another example of "user-friendly" which turns out to be "user-inimical."
Oh bullshit, Mr. Rubin. As usual, you are wanting PL/I++++, not C. And
a compiler that reads programmer's minds. If you really don't want to
do the division, maintain integer indexes, not pointers.
If you think that pointer manipulation is going to give you the
better code, use them. If you think that index manipulation is going
to give you the better code, use those. And if you can't figure which
is better, how, pray tell, do you figure the compiler will figure it
out? C at least gives you a fighting chance by giving you a choice.
---
Bill { uunet | novavax } !twwells!bill
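The choice Wells describes, indexes or pointers, can be illustrated with two versions of the same loop. This is a sketch with invented function names; which form compiles to better code depends on the machine and compiler, and nothing here measures that.

```c
#include <assert.h>
#include <stddef.h>

/* Index form: the compiler scales i by sizeof(long) on each access. */
long sum_by_index(const long *v, size_t n)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Pointer form: p advances one element at a time until it reaches end. */
long sum_by_pointer(const long *v, size_t n)
{
    long total = 0;
    const long *p, *end = v + n;   /* one past the last element */
    for (p = v; p != end; p++)
        total += *p;
    return total;
}
```

Both functions compute the same sum; in neither does the source multiply or divide by the element size.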
mat@mole-end.UUCP (Mark A Terribile) (04/29/89)
> > *(a + i) and a[i] > > are equivalent, given an array a, and int index i ... [ ARE THEY?? ] > Oy vey! Of course ... that is how subscripting is *defined* in C. ... > K&R state quite literally that the two expressions are identical, and [that] > a[ i ] is the same as i[ i ] Of course, that should read a[ i ] is the same as i[ a ] Pardon my gaffe! Is this group c.beginners ? -- (This man's opinions are his own.) From mole-end Mark Terribile
d87-hho@nada.kth.se (Henrik Holmström) (05/01/89)
Do we need 28 follow-ups to a trivial question? You know the rule, if you see something obviously wrong or simple, don't just hit 'F'. Wait a day or two and see if someone else answered the question (or just mail the answer). Henrik Holmström
sater@cs.vu.nl (Hans van Staveren) (05/01/89)
In article <1513@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes: > > >My CS instructor and I disagree about a certain moot point. I have a text > >book which says that > > > > *(a + i) and a[i] > /* LOTS OF STUFF DELETED */ >You are correct; he is incorrect. Perhaps he does not understand how >pointer addition works in C? If you add an integral value N to a pointer, >it doesn't increment the address in that pointer by N storage units >(bytes on byte addressible machine, etc.), it can be thought of as >incrementing the address by N objects of the type to which that pointer >points. In C, pointers have types, and those types are significant. Just to show how old I am, let me tell you that in Unix V6 on the PDP 11, the only machine it ran on, the expressions a + i and i + a with a a pointer and i an integer were not equivalent. a + i worked as it does nowadays while i + a worked as this guy's instructor fears. I am even willing to admit I used this trick, but then in those days the way to get an unsigned was to declare it as a char* and casts were not invented yet. Language historians, take note! Hans van Staveren Vrije Universiteit Amsterdam, Holland
dg@lakart.UUCP (David Goodenough) (05/01/89)
From article <879@twwells.uucp>, by bill@twwells.uucp (T. William Wells): > In article <9987@claris.com> krazy@claris.com (Jeff Erickson) writes: > : In fact, since a[i] = *(a+i), and (a+i)=(i+a), you can actually write i[a] > : for a[i] and most compilers will take it! (Every one I've tried has, anyway.) > > Try Microsoft. I don't know if it is true of the latest version, but > one of about two years ago wouldn't take it. GreenHills (the compiler supplied with our machine) gets all bent out of shape about it too. Makes it a bear to compile obfuscated C programs, but not much of a handicap otherwise :-) It does get *(a + i) right though ..... -- dg@lakart.UUCP - David Goodenough +---+ IHS | +-+-+ ....... !harvard!xait!lakart!dg +-+-+ | AKA: dg%lakart.uucp@xait.xerox.com +---+
Tim_CDC_Roberts@cup.portal.com (05/01/89)
Ok, folks. In regards to "a[i] == *(a+i) == *(i+a) == i[a]", let me refer to the oft-used example 2["hello"]. I agree that this works and is equivalent to "hello"[2]. I've seen it in books and postings. My simple question is why? (Please don't submit 30 replies saying "because the book says so"...) Doesn't that equivalence imply that the pointer type is somehow "stronger" than the simple type? Is that, in fact, the case? Is a compiler forced to examine all of the elements in a pointer expression and establish the "master type" of the expression? If I mix two pointer types, as in char * c; long * ell; return c + ell; is this anarchy? Is it a syntax error? What is sizeof(*(c+ell))? Inquiring minds want to know. Tim_CDC_Roberts@cup.portal.com | Control Data... ...!sun!portal!cup.portal.com!tim_cdc_roberts | ...or it will control you.
kremer@cs.odu.edu (Lloyd Kremer) (05/02/89)
In article <17812@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes: >In regards to "a[i] == *(a+i) == *(i+a) == i[a]", let me >refer to the oft-used example 2["hello"]. > >I agree that this works and is equivalent to "hello"[2]. I've seen it >in books and postings. My simple question is why? >......... >Is a compiler forced to examine all of the >elements in a pointer expression and establish the "master type" of the >expression? If I mix two pointer types, as in > char * c; > long * ell; > return c + ell; >is this anarchy? Is it a syntax error? What is sizeof(*(c+ell))? Anarchy? Yes; adding two pointers has never been defined in C. Syntax error? I guess so. Lint says, "operands of + have incompatible types." Sizeof? The expression is not defined, so its size certainly is not. As to the conceptual implementation of a[i], the compiler sees a pointer a, and an int i. As has been shown, it does not matter which is in the brackets and which is outside. It does matter which is the pointer and which is the integer, but since C is a type-oriented language, it does know this. Many compilers immediately translate a[i] into *(a + i). (Yet another demonstration of their equivalence!) a+i is an address which is evaluated as: {machine address referenced by "a"} plus {"i" times sizeof(*a)}. a[i] or *(a + i) is then the object of type *a located at that address. Although 2["hello"] is cryptic, a compiler *should* get it right according to the language definition (old or new). If I observed a certain compiler to fail on it, my confidence in that compiler to perform properly in other areas would decrease by several orders of magnitude. Another of these "confidence-diminishing" tests is 'sizeof("string")'. The correct answer is 7. Compilers that say 'sizeof(char *)' are broken. -- Lloyd Kremer Brooks Financial Systems ...!uunet!xanth!brooks!lloyd Have terminal...will hack!
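Kremer's two tests, the address formula and sizeof("string"), can be sketched like this. The function names are illustrative assumptions, not anything from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the string literal as an array: six characters plus '\0'. */
size_t literal_size(void)
{
    return sizeof("string");
}

/* Size of a pointer to the same text; this is what a broken compiler
   would report for the literal. */
size_t pointer_size(void)
{
    const char *p = "string";
    return sizeof p;
}

/* Checks that a + i is the address of a plus i times sizeof(*a). */
int address_formula_holds(const int *a, size_t i)
{
    return (const char *)(a + i) == (const char *)a + i * sizeof *a;
}
```

literal_size() should return 7 on any conforming compiler, while pointer_size() varies by platform and only happens to equal 7 nowhere common.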
cs132046@brunix (Garrett Fitzgerald) (05/03/89)
Umm... I'm kind of getting lost here. Is 2["hello"] == 'e'? And why does it allow pointer/integer addition in this order? -------------------------------- Campus Crusade for Cthulhu--when you're tired of the lesser of two evils. Sarek of Vulcan, a.k.a. Garrett Fitzgerald cs132046@brunix or st902620@brownvm.bitnet
chris@mimsy.UUCP (Chris Torek) (05/03/89)
In article <17812@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:
>... let me refer to the oft-used example 2["hello"].
>I agree that this works and is equivalent to "hello"[2]. I've seen it
>in books and postings. My simple question is why?
The type of "hello" is array 6 of char (or char [6] if you prefer)
which, in all contexts except declarations and targets of sizeof(),
changes to pointer to char with the value being the address of the
first (zero'th) element of the array. So the types of the two
expressions
    "hello"[2] and 2["hello"]
are
    (char *) [ (int) ] and (int) [ (char *) ]
The [] syntax means `add the value of the object to the left to the
value of the object to the right, then dereference':
    * ( (char *) + (int) ) and * ( (int) + (char *) )
respectively. Addition is defined on two cases: addition of scalar
types with other scalar types (such as int+int, or double+int, or
char+long) and addition involving pointers. Both additions involve
pointers, so both follow these rules, which are:
    The result of <pointer to T> plus <integral expression whose
    value is N> is the address of the N'th object of type T `away
    from' the place where the pointer points, in the `increasing'
    direction if N is positive, and the `decreasing' direction if N
    is negative.
    The result of <integral expression whose value is N> plus
    <pointer to T> is the same as that of <pointer to T> plus
    <integral expression whose value is N>.
No other additions involving pointers are legal.
>Doesn't that equivalence imply that the pointer type is somehow
>"stronger" than the simple type?
You might think of it as such; without a definition of `strength'
there is no way to say.
>Is a compiler forced to examine all of the elements in a pointer
>expression and establish the "master type" of the expression?
The compiler must look at both types in any dyadic operation
(addition, subtraction, multiplication, division, -> selection,
. selection, etc.). The result of the lookup can be found in a table
in the language definition.
>If I mix two pointer types ... is this anarchy? Is it a syntax error?
If the operation is addition, it is a semantic error: there is no
definition for the result of addition of two pointers. (The
subtraction operator allows two operands which are both pointers, but
they must have the same type.)
>(Please don't submit 30 replies saying "because the book says so"...)
s/book/language definition/, and you have the answer above (but
without all the verbiage).
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
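Torek's rules can be exercised in a few lines. The sketch below uses invented names and a small local array; it covers integer-plus-pointer in both orders, a negative offset, and pointer subtraction, which is the only legal operation taking two pointer operands of the same type.

```c
#include <assert.h>
#include <stddef.h>

/* 2["hello"] is *(2 + "hello"): the third character, 'l'. */
char third_char_of_hello(void)
{
    return 2["hello"];
}

/* A negative N moves in the `decreasing' direction, in either operand order. */
int negative_offsets_agree(void)
{
    int v[5] = {0, 10, 20, 30, 40};
    int *mid = &v[3];
    return *(mid + -2) == 10 && *(-2 + mid) == 10 && mid[-2] == 10;
}

/* Pointer minus pointer (same type, same array) gives an element count.
   Writing p + q instead would be rejected by the compiler. */
ptrdiff_t distance(const int *p, const int *q)
{
    return q - p;
}
```

This also settles the value of 2["hello"]: it is 'l', since index 2 selects the third character.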
sho@pur-phy (Sho Kuwamoto) (05/03/89)
In article <17812@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes: >If I mix two pointer types, as in > char * c; > long * ell; > return c + ell; >is this anarchy? Is it a syntax error? What is sizeof(*(c+ell))? This should be a syntax error. Even long *a, *b; return(a+b) is illegal. However, long *a, *b; return(a-b); Is legit. If a and b are pointers to different types, it is probably still a syntax error. On the other hand, I could be wrong. I got mildly crisped last time I fielded a question... -Sho
jeffrey@algor2.UUCP (Jeffrey Kegler) (05/04/89)
In article <941@draken.nada.kth.se> d87-hho@nada.kth.se (Henrik Holmström) writes: >Do we need 28 follow-ups to a trivial question? You know the rule, if you >see something obviously wrong or simple, don't just hit 'F'. Wait a day >or two and see if someone else answered the question (or just mail the answer). > > Henrik Holmström While I sympathize with Mr Holmström, I hope his advice is not followed. For a start, if it were universally followed, you would get 28 answers 2 days late. More important, even when the question is "silly", and I already know the answer, the answers from 28 other people usually add to my knowledge. The only real alternative would be a moderated group where the moderator had a panel of C experts he called upon. With the number of C experts posting here the moderator would really have taken on a full time job. -- Jeffrey Kegler, President, Algorists, jeffrey@algor2.UU.NET or uunet!algor2!jeffrey 1762 Wainwright DR, Reston VA 22090
pc@cs.keele.ac.uk (Phil Cornes) (05/17/89)
From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com: > Ok, folks. In regards to "a[i] == *(a+i) == *(i+a) == i[a]", let me > refer to the oft-used example 2["hello"]. > I agree that this works and is equivalent to "hello"[2]. I've seen it > in books and postings. My simple question is why? C does not really support arrays, and the square bracket operator ([]) is just syntactic sugar to make you think that it does! This works quite well until you see things like "hello"[2] == 2["hello"] which only look odd if you continue to think of them as arrays and not pointers. > If I mix two pointer types, as in > char * c; > long * ell; > return c + ell; > is this anarchy? Is it a syntax error? What is sizeof(*(c+ell))? In this case the question doesn't make sense because the addition operator's function is undefined for two pointer type operands....
mesmo@Portia.Stanford.EDU (Chris Johnson) (05/18/89)
From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com: > Ok, folks. In regards to "a[i] == *(a+i) == *(i+a) == i[a]", let me > refer to the oft-used example 2["hello"]. > I agree that this works and is equivalent to "hello"[2]. I've seen it > in books and postings. My simple question is why? The supposed proof of a[i] == i[a] rests on the faulty assumption that (x+y) == (y+x) in all contexts; this is not correct. When "+" denotes simple (ie int/float/etc) arithmetic, the operation commutes; when it denotes pointer arithmetic, commutation is not legal/meaningful. The statement that *(a+i) == *(i+a) is therefore invalid. -- ============================================================================== Chris M Johnson === mesmo@portia.stanford.edu === "Grad school sucks rocks" "Imitation is the sincerest form of plagiarism" -- ALF ==============================================================================
gwyn@smoke.BRL.MIL (Doug Gwyn) (05/18/89)
In article <607@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes: >C does not really support arrays, and the square bracket operator ([]) is >just syntactic sugar to make you think that it does! Just in case this misleads anyone, it should be noted that C really does support arrays as distinct from pointers; however, pointers are fundamental to C while arrays are second-class objects with "crippled" semantics.
rob@kaa.eng.ohio-state.edu (Rob Carriere) (05/18/89)
In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes: > The supposed proof of a[i] == i[a] rests on the faulty > assumption that (x+y) == (y+x) in all contexts; this is > not correct. No, it doesn't. It relies on a direct statement in K&R I (pg 210, 1st par) "Therefore, despite its asymmetric appearance, subscripting is a commutative operation." > When "+" denotes simple (ie int/float/etc) arithmetic, the > operation commutes; when it denotes pointer arithmetic, > commutation is not legal/meaningful. There is no such statement in K&R; in fact, the paragraph from which the above quote came implies the opposite, and so does the section on the "+" operator (A7.4, pg188--189). SR
tim@crackle.amd.com (Tim Olson) (05/18/89)
In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes: | The supposed proof of a[i] == i[a] rests on the faulty | assumption that (x+y) == (y+x) in all contexts; this is | not correct. | | When "+" denotes simple (ie int/float/etc) arithmetic, the | operation commutes; when it denotes pointer arithmetic, | commutation is not legal/meaningful. | | The statement that *(a+i) == *(i+a) is therefore invalid. Why do you think that commutation is not legal for pointer arithmetic? It certainly is still associative: (pointer + 3) +5 <==> pointer + (3 + 5) K&R simply say that the "+" operator (as well as "*", "&", "|", and "^") is commutative and associative, without mentioning any restrictions. The (d)PANS says, in the constraint section for additive operators that "For addition, either both operands shall have arithmetic type, or one operand shall be a pointer to an object type and the other shall have integral type." It doesn't say that "... or the first operator shall be a pointer...", which certainly seems to mean that pointer addition is commutative. -- Tim Olson Advanced Micro Devices (tim@amd.com)
gwyn@smoke.BRL.MIL (Doug Gwyn) (05/18/89)
In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
- When "+" denotes simple (ie int/float/etc) arithmetic, the
- operation commutes; when it denotes pointer arithmetic,
- commutation is not legal/meaningful.
- The statement that *(a+i) == *(i+a) is therefore invalid.
100% wrong! If you don't know C any better than that, you should avoid
causing confusion and refrain from posting such misinformation.
chris@mimsy.UUCP (Chris Torek) (05/18/89)
In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes: >The supposed proof of a[i] == i[a] rests on the faulty >assumption that (x+y) == (y+x) in all contexts; this is >not correct. When "+" denotes simple (ie int/float/etc) >arithmetic, the operation commutes; when it denotes pointer >arithmetic, commutation is not legal/meaningful. The latter assertion is exactly backwards. Pointer arithmetic (in the forms pointer+integer and integer+pointer) is guaranteed to be commutative, while scalar addition is not: scalar addition of certain values is not commutative on certain peculiar architectures---things like negative zero or peculiar floating point values, for instance. >The statement that *(a+i) == *(i+a) is therefore invalid. Since pointer arithmetic is commutative, the above statement is wrong, and *(a+i) is equivalent to *(i+a), so that a[i] and i[a] denote the same object. See K&R (either edition) chapter 5. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
usenet@TSfR.UUCP (usenet) (05/18/89)
In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes: > When "+" denotes simple (ie int/float/etc) arithmetic, the > operation commutes; when it denotes pointer arithmetic, > commutation is not legal/meaningful. Umm, I don't follow. K&R says that `+' is a commutative operator (K&R 1, page 185) and I can't find any comment to the contrary in the section that discusses `+' in detail. -david parsons -orc@pell.uucp
ark@alice.UUCP (Andrew Koenig) (05/18/89)
In article <2336@Portia.Stanford.EDU>, mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
> When "+" denotes simple (ie int/float/etc) arithmetic, the
> operation commutes; when it denotes pointer arithmetic,
> commutation is not legal/meaningful.
Yes it is. Addition of integers and pointers is commutative.
> The statement that *(a+i) == *(i+a) is therefore invalid.
No, the statement is true.
--
--Andrew Koenig ark@europa.att.com
jejones@mcrware.UUCP (James Jones) (05/18/89)
A message asserts that surely (p + 3) + 5 == p + (3 + 5) where p is a pointer, and so it is, but...in general, it might not be. We turn once again to the canonical counterexample, segmented architectures, where it's not clear that (p - 5) + 6 == p + (-5 + 6) since p - 5 might fall off the end of the segment, and after that, all bets are likely to be off. That said, I hasten to add that I agree that p + i == i + p; any bogosity arising will arise no matter what order is used. James Jones
guy@auspex.auspex.com (Guy Harris) (05/18/89)
>C does not really support arrays, and the square bracket operator ([]) is
>just syntactic sugar to make you think that it does! This works quite well
>until you see things like "hello"[2] == 2["hello"] which only look odd if
>you continue to think of them as arrays and not pointers.
If one says "C does not really support arrays", one should be careful to
indicate what one means; arrays really *are* arrays, not pointers. E.g.,
on most C implementations,
    int a[33];
causes a block of "33*sizeof (int)" bytes to be allocated; if it has
static storage duration (i.e., either external or static), a symbol "a"
or "_a" or whatever will probably be defined, and will refer to the
first location in that block - *not* to a block of size "sizeof (int *)"
that contains a pointer to the block of size "33*sizeof (int)".
However, array-valued expressions are, in most (but *not* all!)
contexts, converted to pointer-valued expressions; this is why "[]" is
sort of syntactic sugar. "a[i]" gets turned into "*(a + i)"; the reason
this would work for the array "a" defined above is that in the context
of the expression "*(a + i)", the array-valued expression "a" gets
converted to a pointer-valued expression that points to the first
element of "a"; adding "i" to the value of that expression yields a
pointer that points to the "i"th element of "a", and dereferencing that
pointer yields the value of the "i"th element of "a".
The distinction between "arrays are pointers" and "array-valued
expressions get converted to pointer-valued expressions" is important;
the question "why doesn't this program work:
    foo.c:
        ...
        int a[33];
        ...
    bar.c:
        ...
        extern int *a;
        ...
" surfaces periodically in this group. If arrays and pointers really
were the same thing, that program might well work; the reason it doesn't
work is that arrays and pointers *aren't* the same thing.
Similarly, for the array "a" described above, "sizeof a" is "33*sizeof (int)", not "sizeof (int *)" (although some compiler writers may have been confused as well, and made their compilers give "sizeof (int *)" for "sizeof a").
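The difference between an array and the pointer it is converted to shows up in sizeof and in unary &. The sketch below is an illustration only; the function names are hypothetical and the array is the "int a[33]" from the text.

```c
#include <assert.h>
#include <stddef.h>

int a[33];   /* a block of 33 ints, not a pointer to one */

/* Another file must say "extern int a[];" to refer to this object.
   Declaring "extern int *a;" there would claim that the storage holds
   a pointer, which it does not. */

/* sizeof is one of the contexts where the array is NOT converted. */
size_t array_bytes(void)  { return sizeof a; }
size_t array_length(void) { return sizeof a / sizeof a[0]; }

/* In most other contexts "a" becomes a pointer to its first element. */
int *first_element(void)  { return a; }

/* &a has the same address as a but the type int (*)[33], so adding 1
   to it steps over the whole array rather than over one int. */
size_t whole_array_step(void)
{
    int (*pa)[33] = &a;
    assert((void *)pa == (void *)a);   /* same address, different type */
    return (size_t)((char *)(pa + 1) - (char *)pa);
}
```

A compiler that reports sizeof (int *) for "sizeof a" would make array_bytes() and whole_array_step() disagree, which is one way to spot the confusion described above.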
guy@auspex.auspex.com (Guy Harris) (05/18/89)
> When "+" denotes simple (ie int/float/etc) arithmetic, the
> operation commutes; when it denotes pointer arithmetic,
> commutation is not legal/meaningful.
Funny, X3J11 disagrees with you:
    3.3.6 Additive operators
    ...
    Semantics
    ...
    ...In other words, if the expression "P" points to the "i"th
    element of an array object, the expressions "(P)+N"
    (equivalently, "N+(P)")...
> The statement that *(a+i) == *(i+a) is therefore invalid.
The statement that "The statement that *(a+i) == *(i+a) is therefore
invalid" is therefore invalid. It may make life miserable for compiler
writers, but if so they should have lobbied X3J11; it's probably too late
now - go forth and fix your compiler, if it can't cope with "i[a]".
bph@buengc.BU.EDU (Blair P. Houghton) (05/19/89)
In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes: > > The supposed proof of a[i] == i[a] rests on the faulty > assumption that (x+y) == (y+x) in all contexts; this is > not correct. Oh yeah? > When "+" denotes simple (ie int/float/etc) arithmetic, the > operation commutes; when it denotes pointer arithmetic, > commutation is not legal/meaningful. > > The statement that *(a+i) == *(i+a) is therefore invalid. it implies that you were doing sometype *a, *i; something = a[i]; something_else = i[a]; So, like, tell me. When do you use pointers as indices? I.e., if one of the two variables, a or i, is an int, and the other is a pointer, then you have leave to say that pointer[int] == int[pointer] because *(pointer + int) == *(int + pointer) and because *((pointer or int) + (the other)) == (pointer or int)[the other]. --Blair "I still think I should be able to add pointers together."
pjh@mccc.UUCP (Pete Holsberg) (05/19/89)
In article <607@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes:
=From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com:
=> Ok, folks. In regards to "a[i] == *(a+i) == *(i+a) == i[a]", let me
=> refer to the oft-used example 2["hello"].
=> I agree that this works and is equivalent to "hello"[2]. I've seen it
=> in books and postings. My simple question is why?
=
=C does not really support arrays, and the square bracket operator ([]) is
=just syntactic sugar to make you think that it does! This works quite well
=until you see things like "hello"[2] == 2["hello"] which only look odd if
=you continue to think of them as arrays and not pointers.
Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
Thanks.
--
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690
mat@mole-end.UUCP (Mark A Terribile) (05/19/89)
> > Ok, folks. In regards to "a[i] == *(a+i) == *(i+a) == i[a]", let me > > refer to the oft-used example 2["hello"]. > > I agree that this works and is equivalent to "hello"[2]. I've seen it > > in books and postings. My simple question is why? For the reason that everyone has said: x [ y ] IS DEFINED AS *( x + y ) and if one expression is legal, so is the other one. > The supposed proof of a[i] == i[a] rests on the faulty > assumption that (x+y) == (y+x) in all contexts; this is > not correct. Are you saying that 2[ "hello" ] is not the same as "hello"[ 2 ] ? If so, you are wrong. The ordering of the operands does not matter. C has been this way from about the beginning and unless there is a specific item in the pANSI spec (I find none in K&R-II) it is allowed. (But see K&R-II, section A8.6.2: ``Therefore, despite its asymmetric appearance, subscripting is a commutative operation.'') > When "+" denotes simple (ie int/float/etc) arithmetic, the > operation commutes; when it denotes pointer arithmetic, > commutation is not legal/meaningful. It is. Using K&R-II again, over and over in discussing the addition of an integer to a pointer, they say ``one operand ... and the other operand ...'' *Never* are the first and second operands distinguished. There is a reason for this care. > The statement that *(a+i) == *(i+a) is therefore invalid. Not only *(a+i) == *(i+a) for a of any pointer type (excluding pointer-to-function) and i of an integral type, but *( a + i ) === * ( i + a ) ( a + i ) == ( i + a ) *( a + i ) == * ( i + a ) etc. If your compiler rejects "hello"[ 2 ] it is broken. (Have you tried it, by the way?) -- (This man's opinions are his own.) From mole-end Mark Terribile
byron@pyr.gatech.EDU (Byron A Jeff) (05/19/89)
In article <2336@Portia.Stanford.EDU> mesmo@Portia.Stanford.EDU (Chris Johnson) writes:
-From article <17812@cup.portal.com>, by Tim_CDC_Roberts@cup.portal.com:
-> Ok, folks. In regards to "a[i] == *(a+i) == *(i+a) == i[a]", let me
-> refer to the oft-used example 2["hello"].
-> I agree that this works and is equivalent to "hello"[2]. I've seen it
-> in books and postings. My simple question is why?
-
- The supposed proof of a[i] == i[a] rests on the faulty
- assumption that (x+y) == (y+x) in all contexts; this is
- not correct.
-
- When "+" denotes simple (ie int/float/etc) arithmetic, the
- operation commutes; when it denotes pointer arithmetic,
- commutation is not legal/meaningful.
-
- The statement that *(a+i) == *(i+a) is therefore invalid.
Try this program on for size:
    #include <stdio.h>

    int main(void)
    {
        char *p = "Goofy";
        /* third character via both operand orders, then the two comparisons */
        printf("%c %c %d %d\n", *(p+2), *(2+p), *(p+2) == *(2+p), 2+p == p+2);
        return 0;
    }
and its output:
    o o 1 1
Any other assertions you'd like to make?
---
-==============================================================================
- Chris M Johnson === mesmo@portia.stanford.edu === "Grad school sucks rocks"
- "Imitation is the sincerest form of plagiarism" -- ALF
-==============================================================================
--
Another random extraction from the mental bit stream of...
Byron A. Jeff
Georgia Tech, Atlanta GA 30332
Internet: byron@pyr.gatech.edu uucp: ...!gatech!pyr!byron
henry@utzoo.uucp (Henry Spencer) (05/19/89)
In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes: >Can you explain why compilers produce different code for "a[i]" and "*(a+i)"? Incompetent compiler writers. -- Subversion, n: a superset | Henry Spencer at U of Toronto Zoology of a subset. --J.J. Horning | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
Tim_CDC_Roberts@cup.portal.com (05/20/89)
In <1176@mcrware.UUCP>, jejones@mcrware.UUCP (James Jones) writes: >A message asserts that surely > > (p + 3) + 5 == p + (3 + 5) > >where p is a pointer, and so it is, but...in general, it might not be. >We turn once again to the canonical counterexample, segmented >architectures, where it's not clear that > > (p - 5) + 6 == p + (-5 + 6) > >since p - 5 might fall off the end of the segment, and after that, all >bets are likely to be off. I disagree with this! I assert that EVEN if the intermediate result goes negative, the final value will be correct, even on segmented architectures. It is true that it might be impossible or even dangerous to dereference the address (p - 5), but we aren't trying to DO that. Example: 32 bit system. Top 12 bits are a segment number, bottom 20 bits are an address. Lets say p is at offset 2 in segment 0x012. p = 0x01200002 p-5 = 0x011ffffd (p-5)+6 = 0x01200003 Yes, the intermediate value is not a valid address, but I don't think that's important. Tim_CDC_Roberts@cup.portal.com | Control Data... ...!sun!portal!cup.portal.com!tim_cdc_roberts | ...or it will control you.
guy@auspex.auspex.com (Guy Harris) (05/20/89)
>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
If the compiler doesn't produce equally good code for those two cases,
it's because the compiler writer wasn't doing as good a job as s/he
could. If it produces different but equally good code, I dunno;
possibly because the compiler writer didn't understand that "a[i]" is
equivalent to "*(a+i)", or decided for whatever reason to implement them
differently.
The existence of compilers that produce different code for those cases
does not, in any way, prove that the two expressions are inequivalent;
K&R First Edition points out that
...The expression E1[E2] is identical (by definition) to
*((E1)+(E2)).
and the December 7, 1988 ANSI C draft says that
...The definition of the subscript operator [] is that E1[E2] is
identical to (*(E1+(E2))).
so further discussion on whether they're equivalent in C is pointless -
they are, and that's that. If somebody wants to debate whether they
*should* be equivalent, they can, but they're then talking about D or P,
say, not C.
diamond@diamond.csl.sony.junet (Norman Diamond) (05/20/89)
In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
>Thanks.
Hear, hear. Yes, there are a lot of broken implementations, and there
are a lot more implementations which are not broken but just weird.
Yes, this is one of the ways in which many implementations are weird,
and I also wonder why. Anyone have any ideas?
--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.co.jp@relay.cs.net)
The above opinions are my own. | Why are programmers criticized for
If they're also your opinions, | re-implementing the wheel, when car
you're infringing my copyright. | manufacturers are praised for it?
chris@mimsy.UUCP (Chris Torek) (05/20/89)
In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes: [in response to article <607@kl-cs.UUCP> by pc@cs.keele.ac.uk (Phil Cornes)] >Can you explain why compilers produce different code for "a[i]" and "*(a+i)"? I have never observed one to do so. There is no reason for a compiler to generate different code, as the expressions are semantically identical. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
karl@haddock.ima.isc.com (Karl Heuer) (05/21/89)
In article <1657@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes: >[K&R and the pANS agree that a[i]==i[a],] so further discussion on whether >they're equivalent in C is pointless - they are, and that's that. If >somebody wants to debate whether they *should* be equivalent, they can, but >they're then talking about D or P, say, not C. Or maybe ANSI C in a nearby parallel universe. I thought it strange that X3J11 outlawed "x+ =1" ("+=" is now a single token), but permitted "i[a]" (on the grounds that they "saw no reason to forbid it"). Neither construct is ever used outside the IOCCC, and outlawing "i[a]" would have been a small step towards making arrays higher-class citizens than they are. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint (To anyone who's itching to say that i[hairy_array_expression] avoids a pair of parentheses: if you code that way, I spit on your grandmother's shadow.)
henry@utzoo.uucp (Henry Spencer) (05/21/89)
In article <18560@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes: >I disagree with this! I assert that EVEN if the intermediate result >goes negative, the final value will be correct, even on segmented >architectures. You are assuming that there will *be* a final value. You may get a trap the instant the intermediate result goes invalid, if pointer arithmetic is being done by special pointer-arithmetic instructions. Actually, even if you don't get a trap, pointer-arithmetic instructions may do almost anything when presented with an invalid operand. They don't have to act like integer instructions. -- Subversion, n: a superset | Henry Spencer at U of Toronto Zoology of a subset. --J.J. Horning | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
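One way to sidestep the whole question of invalid intermediate pointers is to fold the integer offsets together first and apply a single pointer operation. The following is a sketch under stated assumptions: shift_within is a hypothetical helper, and the caller is assumed to pass a p that points into (or one past the end of) the n-element array starting at base.

```c
#include <assert.h>
#include <stddef.h>

/* Move p back by "back" elements and forward by "forward" elements
   without ever forming a pointer outside base[0..n]. */
int *shift_within(int *base, size_t n, int *p, ptrdiff_t back, ptrdiff_t forward)
{
    ptrdiff_t net = forward - back;   /* plain integer arithmetic, e.g. 6 - 5 */
    ptrdiff_t pos = p - base;         /* valid because p points into the array */

    /* The final position must land inside the array or one past its end. */
    assert(pos + net >= 0 && (size_t)(pos + net) <= n);
    return p + net;                   /* the only pointer arithmetic on p */
}
```

With p at offset 2, back 5 and forward 6, this yields the element at offset 3 without ever computing p - 5, so there is no intermediate value for the hardware to trap on.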
gwyn@smoke.BRL.MIL (Doug Gwyn) (05/21/89)
In article <13234@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>I thought it strange that X3J11 outlawed "x+ =1" ("+=" is now a single
>token), but permitted "i[a]" (on the grounds that they "saw no reason
>to forbid it").
It was never clear why the old C language reference manual said "the
two parts of a compound assignment operator are separate tokens" when
the formal grammar showed them as indivisible units. That may have
simply been a description of the (somewhat sloppy) way the
implementation of the PCC lexer happened to work. Another possibility
is that it was desired to guarantee that such an operator could be
constructed via preprocessing. X3J11 allows the latter anyway.
It seems far more likely that "x+ =1" is a typo than that it is
intended. "i[a]" on the other hand has actually been intentionally used
by some programmers, although most of us certainly don't recommend it.
>... outlawing "i[a]" would have been a small step
>towards making arrays higher-class citizens than they are.
I don't think you can ever make the existing C arrays first-class
objects without invalidating large amounts of existing correct code.
There are efforts underway to find a suitable language extension
that solves this problem (for the new class of objects provided by
the extension).
pjh@mccc.UUCP (Pete Holsberg) (05/21/89)
In article <1989May19.154248.426@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: =In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes: =>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"? = =Incompetent compiler writers. ALL of them? Really? Can you name a C compiler that was written by a competent compiler writer? (Your new .sig is D U L L!! ;-) -- Pete Holsberg, Mercer County Community College, Trenton, NJ 08690 {backbone}!rutgers!njin!princeton!njsmu!mccc!pjh
pjh@mccc.UUCP (Pete Holsberg) (05/21/89)
In article <17635@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes: =In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes: =[in response to article <607@kl-cs.UUCP> by pc@cs.keele.ac.uk (Phil Cornes)] =>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"? = =I have never observed one to do so. There is no reason for a compiler =to generate different code, as the expressions are semantically identical. Perhaps I've asked the wrong question. I saw a couple of simple test programs that assigned 0 to each member of an array. One used array subscript notation, and the other, pointer notation. I compiled these on a 7300, a 3B2/400, and a 386 running Microport V/386, using a variety of compilers (cc and gnu-cc on the 7300, fpcc on the 3B2, and cc and Greenhills on the 386). I ran each version and timed the execution. The subscript versions had different run times from the pointer versions (some slower, some faster!). I assumed - perhaps naively - that the differences were caused by differences in code produced by the different compilers (and of course the hardware differences). Was that wrong? How does one account for the differences? -- Pete Holsberg, Mercer County Community College, Trenton, NJ 08690 {backbone}!rutgers!njin!princeton!njsmu!mccc!pjh
henry@utzoo.uucp (Henry Spencer) (05/22/89)
In article <755@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes: >=>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"? >= >=Incompetent compiler writers. > >ALL of them? Really? Can you name a C compiler that was written by a >competent compiler writer? Sometimes I think the only one was Dennis Ritchie's original pdp11 compiler, with the original PCC perhaps a borderline case. And lest there be any doubts about the matter, both of them convert "a[i]" to "*(a+i)" as they parse, so the code for the two expressions is necessarily identical. (I went and looked at the compiler sources to be sure.) The two expressions are semantically identical by the definition of C. Any compiler which generates different code for them either is broken or has outsmarted itself in trying to be clever. >(Your new .sig is D U L L!! ;-) Just wait for the next one. |-> <--- evil smile -- Van Allen, adj: pertaining to | Henry Spencer at U of Toronto Zoology deadly hazards to spaceflight. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
jamesa@arabian.Sun.COM (James D. Allen) (05/22/89)
In article <10299@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes: >In article <13234@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes: >>... outlawing "i[a]" would have been a small step >>towards making arrays higher-class citizens than they are. > >I don't think you can ever make the existing C arrays first-class >objects without invalidating large amounts of existing correct code. >There are efforts underway to find a suitable language extension >that solves this problem (for the new class of objects provided by >the extension). One source of trouble is "hidden" array typedefs, such as `jmp_buf'. (You have to "know" what a jmp_buf is to use it nontrivially, while if it were "first-class" you wouldn't.) But a logical array can be promoted to a first-class citizen by just putting it in a structure: typedef struct { jmp_buf j; } first_class_jmpbuf; Any idea why this wasn't done for jmp_buf's? I think the "second-classedness" of arrays helps give C its elegant syntax. Any other examples of the "problems" it causes?
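The struct-wrapping trick can be tried on an ordinary int array rather than on jmp_buf. This is a sketch only: VEC_LEN, wrapped_vec, wrap and doubled are names invented for the illustration, not anything from a standard header.

```c
#include <assert.h>
#include <string.h>

#define VEC_LEN 4                                 /* hypothetical size */

typedef struct { int v[VEC_LEN]; } wrapped_vec;   /* hypothetical wrapper */

/* Copy a bare array into the wrapper so it can travel by value. */
wrapped_vec wrap(const int *src)
{
    wrapped_vec w;
    assert(src != NULL);
    memcpy(w.v, src, sizeof w.v);
    return w;                  /* whole array returned by value */
}

/* The parameter is a copy; the caller's object is left untouched. */
wrapped_vec doubled(wrapped_vec w)
{
    int i;
    for (i = 0; i < VEC_LEN; i++)
        w.v[i] *= 2;
    return w;
}
```

Objects of type wrapped_vec can also be assigned with "=", which a bare array cannot; the cost is that every pass, return, and assignment copies all VEC_LEN elements.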
chris@mimsy.UUCP (Chris Torek) (05/22/89)
>>In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) asked
>>>why compilers produce different code for "a[i]" and "*(a+i)"?
>In article <17635@mimsy.UUCP> I noted that
>>I have never observed one to do so. There is no reason for a compiler
>>to generate different code, as the expressions are semantically identical.
In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>Perhaps I've asked the wrong question.
Maybe not.
>I saw a couple of simple test programs that assigned 0 to each member
>of an array. One used array subscript notation, and the other, pointer
>notation. I compiled these
>on a 7300, a 3B2/400, and a 386 running
>Microport V/386, using a variety of compilers (cc and gnu-cc on the
>7300, fpcc on the 3B2, and cc and Greenhills on the 386).
I have none of these machines, and only gcc as a compiler. The code
produced by
    GNU C version 1.35 (vax) compiled by GNU C version 1.35.
for both loops in
    int a[20];
    main(){int i;
    for(i=0;i<20;i++)a[i]=0;
    f();
    for(i=0;i<20;i++)*(a+i)=0;
    f();
    }
was identical. (The lack of spacing in this example is due to me typing
it in with the `cat' editor :-) )
>I ran each version and timed the execution. The subscript versions
>had different run times from the pointer versions (some slower, some
>faster!). I assumed - perhaps naively - that the differences were
>caused by differences in code produced by the different compilers
>(and of course the hardware differences). Was that wrong?
>How does one account for the differences?
Differing code sequences is one of two obvious possibilities, the other
being differing multi-user loads. The latter seems less likely, especially
if the results are repeatable. Why not compile to assembly and compare?
If a compiler produces better code for a[i] than for *(a+i) (or vice
versa), that compiler needs work.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
andre@targon.UUCP (andre) (05/22/89)
In article <18560@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:
>In <1176@mcrware.UUCP>, jejones@mcrware.UUCP (James Jones) writes:
>[pointer story]
>I disagree with this! I assert that EVEN if the intermediate result
>goes negative, the final value will be correct, even on segmented
>architectures.
Don't underestimate the Intel approach to computing :-)
I have it on good authority that on the 386,
    ((address) 0x0010 - 0x0100) + 0x0100 != 0x0010
but instead it winds up somewhere at the top of memory :-(.
>Yes, the intermediate value is not a valid address, but I don't think that's
>important.
If the intermediate result would be put in an address register (on the
'386) (where else does an address, even a bogus one, belong?) you will
get a trap from the processor's 'MMU'.
--
It's not the kill, but the thrill of the chase.
Segment registers are for worms.
Andre van Dalen, uunet!mcvax!targon!andre
tim@crackle.amd.com (Tim Olson) (05/22/89)
In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
| In article <17635@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
| =In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
| =[in response to article <607@kl-cs.UUCP> by pc@cs.keele.ac.uk (Phil Cornes)]
| =>Can you explain why compilers produce different code for "a[i]" and "*(a+i)"?
| =
| =I have never observed one to do so. There is no reason for a compiler
| =to generate different code, as the expressions are semantically identical.
|
| Perhaps I've asked the wrong question. I saw a couple of simple test
| programs that assigned 0 to each member of an array. One used array
| subscript notation, and the other, pointer notation. I compiled these
| on a 7300, a 3B2/400, and a 386 running Microport V/386, using a variety
| of compilers (cc and gnu-cc on the 7300, fpcc on the 3B2, and cc and
| Greenhills on the 386). I ran each version and timed the execution.
| The subscript versions had different run times from the pointer versions
| (some slower, some faster!). I assumed - perhaps naively - that the
| differences were caused by differences in code produced by the different
| compilers (and of course the hardware differences). Was that wrong?
| How does one account for the differences?
If you wrote the routines like:
    int a[MAX];                     int a[MAX];
    int i;                          int i;
    for (i=0; i<MAX; ++i)           for (i=0; i<MAX; ++i)
        a[i] = 0;                       *(a+i) = 0;
Then the code generated should probably be identical (and it was, on the
three machines I tried it on). However, if instead you wrote them like:
    int a[MAX];                     int a[MAX];
    int i;                          int *p;
    for (i=0; i<MAX; ++i)           for (p=&a[0]; p<&a[MAX]; ++p)
        a[i] = 0;                       *p=0;
Then you indeed might get different assembly language generated. The
second pointer version has had a "loop induction" optimization performed
by hand.
On some compiler/machine combinations, this will run faster, because the scaling operation and base/offset addition have been eliminated; on others it may run slower, because a specific addressing mode cannot be used. -- Tim Olson Advanced Micro Devices (tim@amd.com)
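The two loop styles can be put side by side as functions, which makes them easy to compile to assembly and compare on a given machine. A minimal sketch; fill_indexed and fill_stepped are hypothetical names, and nothing here predicts which one a particular compiler will make faster.

```c
#include <assert.h>
#include <stddef.h>

/* Subscript form: each store is base + scaled index. */
void fill_indexed(int *a, size_t n, int value)
{
    size_t i;
    assert(a != NULL);
    for (i = 0; i < n; ++i)
        a[i] = value;          /* identical in meaning to *(a + i) = value */
}

/* Pointer-stepping form: the induction variable is the address itself,
   so no scaling or base+offset addition is needed inside the loop. */
void fill_stepped(int *a, size_t n, int value)
{
    int *p;
    assert(a != NULL);
    for (p = a; p < a + n; ++p)
        *p = value;
}
```

Both functions leave the array in the same state; only the generated code may differ, which is the point being made above.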
guy@auspex.auspex.com (Guy Harris) (05/23/89)
>Perhaps I've asked the wrong question. I saw a couple of simple test
>programs that assigned 0 to each member of an array. One used array
>subscript notation, and the other, pointer notation. I compiled these
>on a 7300, a 3B2/400, and a 386 running Microport V/386, using a variety
>of compilers (cc and gnu-cc on the 7300, fpcc on the 3B2, and cc and
>Greenhills on the 386). I ran each version and timed the execution.
>The subscript versions had different run times from the pointer versions
>(some slower, some faster!). I assumed - perhaps naively - that the
>differences were caused by differences in code produced by the different
>compilers (and of course the hardware differences). Was that wrong?
>How does one account for the differences?
Well, if the program that used subscript notation was something like:
    for (i = 0; i < LEN; i++)
        a[i] = 0;
and the program that used pointer notation was something like:
    p = &a[0];
    while (p < &a[LEN])
        *p++ = 0;
the answer has nothing whatsoever to do with the equivalence of "a[i]"
and "*(a + i)", since the latter program doesn't use the latter
construct, so you did ask the wrong question.
It has, instead, to do with the fact that the equivalence of the two
constructs in question is not as trivial as the equivalence of "a[i]"
and "*(a + i)", and therefore it may be less likely that the compilers
will generate the same code for them. There may well be compilers that
*do* generate the same code for them - rewrite the first loop as:
    for (i = 0; i < LEN; i++)
        *(a + i) = 0;
and then note that on most architectures, this requires that the value
in "i" be multiplied by "sizeof a[0]" before being added to the address
represented by the address of "a[0]", and do a strength reduction on
that multiplication; you then find the induction variable not used, and
eliminate it, and by the time the smoke clears you have the loop in the
first example generating the same code as the loop in the second
example.
(I don't know whether there are any compilers that do this or not.) If the code generated for the two constructs is different, that could account for performance differences.
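The transformation described above can be written out by hand in three stages. This is an illustrative sketch of what an optimizer might do internally, not a claim about any particular compiler; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stage 1: *(a + i) = 0 with the scaling by sizeof a[0] spelled out. */
void zero_scaled(int *a, size_t n)
{
    size_t i;
    assert(a != NULL);
    for (i = 0; i < n; i++)
        *(int *)((char *)a + i * sizeof a[0]) = 0;
}

/* Stage 2: strength reduction. The multiplication becomes a running
   byte offset that is bumped by sizeof a[0] on each iteration. */
void zero_reduced(int *a, size_t n)
{
    size_t i, off = 0;
    assert(a != NULL);
    for (i = 0; i < n; i++, off += sizeof a[0])
        *(int *)((char *)a + off) = 0;
}

/* Stage 3: i now only counts iterations, so it is eliminated and the
   loop tests the address instead. This is the hand-written pointer loop. */
void zero_pointer(int *a, size_t n)
{
    int *p = a, *end;
    assert(a != NULL);
    end = a + n;
    while (p < end)
        *p++ = 0;
}
```

All three functions clear the same n elements; the stages differ only in how much address arithmetic remains inside the loop.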
karl@haddock.ima.isc.com (Karl Heuer) (05/23/89)
In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes: >In article <17635@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes: >>I have never observed [a compiler to treat "a[i]" and "*(a+i)" differently]. > >Perhaps I've asked the wrong question. I saw a couple of simple test >programs that assigned 0 to each member of an array. One used array >subscript notation, and the other, pointer notation. By "pointer notation" do you mean only that the code used "*(a+i)" for "a[i]"? Or are you talking about code that used "*p++" instead of "a[i++]"? The latter is an entirely different question! (And it's usually what people are testing when they write "array vs. pointer" tests.) Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
karl@haddock.ima.isc.com (Karl Heuer) (05/23/89)
In article <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) writes: >Any idea why this wasn't done for jmp_buf's? Originally? Probably shortsightedness. In ANSI C? Backward compatibility. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
cudcv@warwick.ac.uk (Rob McMahon) (05/23/89)
In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>Perhaps I've asked the wrong question. I saw a couple of simple test
>programs that assigned 0 to each member of an array. One used array
>subscript notation, and the other, pointer notation ... The subscript
>versions had different run times from the pointer versions (some slower, some
>faster!). I assumed - perhaps naively - that the differences were caused by
>differences in code produced by the different compilers (and of course the
>hardware differences).
I'll lay odds that you're comparing
    int i;
    for (i = 0; i < MAX; i++)
        a[i] = 0;
with
    grimble *p;
    for (p = a; p < &a[MAX]; p++)
        *p = 0;
am I right? Note that this is not comparing `a[i]' with `*(a+i)' at all,
the second loop simply has to increment a pointer, not scale an integer
and add it to the address of an array. Compilers with strength reduction
will make both equivalent. On machines with fiendish indexed addressing
modes the first may be as fast or faster, on other machines the second
may be faster.
Rob
--
UUCP: ...!mcvax!ukc!warwick!cudcv PHONE: +44 203 523037
JANET: cudcv@uk.ac.warwick ARPA: cudcv@warwick.ac.uk
Rob McMahon, Computing Services, Warwick University, Coventry CV4 7AL, England
pjh@mccc.UUCP (Pete Holsberg) (05/23/89)
In article <17657@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
=>>In article <749@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) asked
=>>>why compilers produce different code for "a[i]" and "*(a+i)"?
=
=>In article <17635@mimsy.UUCP> I noted that
=>>I have never observed one to do so. There is no reason for a compiler
=>>to generate different code, as the expressions are semantically identical.
=
=I have none of these machines, and only gcc as a compiler. The code
=produced by
=
= GNU C version 1.35 (vax) compiled by GNU C version 1.35.
=
=for both loops in
=
= int a[20];
= main(){int i;
= for(i=0;i<20;i++)a[i]=0;
= f();
= for(i=0;i<20;i++)*(a+i)=0;
= f();
= }
=
=was identical.
=Differing code sequences is one of two obvious possibilities, the other
=being differing multi-user loads. The latter seems less likely, especially
=if the results are repeatable. Why not compile to assembly and compare?
Thanks, Chris. I will try that on the 386 machine, as that assembly language
is not as much an unknown as those for the 680x0 and the WE320x0!
--
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh
diamond@diamond.csl.sony.junet (Norman Diamond) (05/23/89)
In article <755@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes: >>Can you name a C compiler that was written by a >>competent compiler writer? In article <1989May21.205928.26064@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: >Sometimes I think the only one was Dennis Ritchie's original pdp11 compiler, >with the original PCC perhaps a borderline case. Well, in terms of the original question ... >And lest there be any >doubts about the matter, both of them convert "a[i]" to "*(a+i)" as they >parse, so the code for the two expressions is necessarily identical. (I >went and looked at the compiler sources to be sure.) ... yes it's encouraging to see that they were competent. Now about PCC. Do you really mean that all those bugs were inserted into PCC after the original? I suppose it's possible. I don't have the original. But some of those bugs look pretty old. How portable is it to dereference the null pointer? I'm amazed that the thing runs (except of course for where it doesn't run). -- Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.co.jp@relay.cs.net) The above opinions are my own. | Why are programmers criticized for If they're also your opinions, | re-implementing the wheel, when car you're infringing my copyright. | manufacturers are praised for it?
randolph@ektools.UUCP (Gary L. Randolph) (05/23/89)
In article <194@mole-end.UUCP> mat@mole-end.UUCP (Mark A Terribile) writes:
>Are you saying that 2[ "hello" ] is not the same as "hello"[ 2 ] ? If
>so, you are wrong. ...
>
>If your compiler rejects "hello"[ 2 ] it is broken.
I agree with Mark T, but more importantly, this question was answered a day
or two ago by Andrew Koenig. I am an avid reader of all of Andrew's works
(C & C++), and I have learned a great deal from them. He says, as does the
reference manual, that commutation applies, therefore IT DOES. :-)
Gary L. Randolph
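A minimal sketch of why the commutation holds, assuming only the language rule that E1[E2] means *((E1)+(E2)). The helper name subscript_commutes is hypothetical and does not come from the thread:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every spelling of "element i of a"
 * designates the same object, 0 otherwise. */
int subscript_commutes(const char *a, size_t n)
{
    size_t i;

    assert(a != NULL);
    for (i = 0; i < n; i++) {
        /* a[i] is *(a + i); addition commutes, so i[a] is *(i + a). */
        if (&a[i] != a + i || &i[a] != i + a || &a[i] != &i[a])
            return 0;
    }
    return 1;
}
```

Called as subscript_commutes("hello", 6), it should return 1 on a conforming compiler; a compiler that rejects &i[a] outright is the broken case described above.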
pjh@mccc.UUCP (Pete Holsberg) (05/23/89)
In article <25711@amdcad.AMD.COM> tim@amd.com (Tim Olson) writes:
=However, if instead you wrote them like:
=
=    int a[MAX];                  int a[MAX];
=    int i;                       int *p;
=    for (i=0; i<MAX; ++i)        for (p=&a[0]; p<&a[MAX]; ++p)
=        a[i] = 0;                    *p=0;
=
=Then you indeed might get different assembly language generated. The
=second pointer version has had a "loop induction" optimization performed
=by hand. On some compiler/machine combinations, this will run faster,
=because the scaling operation and base/offset addition have been
=eliminated; on others it may run slower, because a specific addressing
=mode cannot be used.
Tim,
Here's the actual code. I think you've hit the nail on the head.
    #define IMAX 10
    #define LOOP 10000
    main()
    {
        int a[IMAX];
        register int * p;
        int v=0;
        while (v++ < LOOP)
            for (p=a; p < &a[IMAX];)
                *p++=v;
    }
--
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh
pjh@mccc.UUCP (Pete Holsberg) (05/24/89)
In article <1677@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
=Well, if the program that used subscript notation was something like:
=
= for (i = 0; i < LEN; i++)
= a[i] = 0;
=
=and the program that used pointer notation was something like:
=
= p = &a[0];
= while (p < &a[LEN])
= *p++ = 0;
=
=the answer has nothing whatsoever to do with the equivalence of "a[i]"
=and "*(a + i)", since the latter program doesn't use the latter
=construct, so you did ask the wrong question.
So it seems!
=It has, instead, to do with the fact that the equivalence of the two
=constructs in question is not as trivial as the equivalence of "a[i]"
=and "*(a + i)", and therefore it may be less likely that the compilers
=will generate the same code for them.
OK, so even though the two pieces of code are doing the same job and one uses
index notation while the other uses pointer notation, the compiler is not
likely to notice this.
=There may well be compilers that
=*do* generate the same code for them - rewrite the first loop as:
=
= for (i = 0; i < LEN; i++)
= *(a + i) = 0;
=
=and then note that on most architectures, this requires that the value
=in "i" be multiplied by "sizeof a[0]" before being added to the address
=represented by the address of "a[0]", and do a strength reduction on
^^^^^^^^^^^^^^^^^^
could you explain this?
=that multiplication; you then find the induction variable not used, and
^^^^^^^^^^^^^^^^^^
and this?
=eliminate it, and by the time the smoke clears you have the loop in the
=first example generating the same code as the loop in the second
=example. (I don't know whether there are any compilers that do this or
=not.)
=
=If the code generated for the two constructs is different, that could
=account for performance differences.
I'll try it. Thanks for the explanation.
--
Pete Holsberg, Mercer County Community College, Trenton, NJ 08690
{backbone}!rutgers!njin!princeton!njsmu!mccc!pjh
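On the two terms queried above, a hand-worked sketch of what such an optimizer might do. The three function names and the check function are hypothetical, LEN is given an illustrative value, and nothing here claims that any particular compiler performs these steps:

```c
#include <assert.h>
#include <stddef.h>

#define LEN 20   /* illustrative size, not taken from the thread */

/* Stage 0: what a[i] = 0 means on most machines: the address is
 * base + i * sizeof a[0], recomputed with a multiply every iteration. */
void clear_indexed(int *a)
{
    int i;
    for (i = 0; i < LEN; i++)
        *(int *)((char *)a + i * sizeof a[0]) = 0;
}

/* Stage 1: strength reduction. The multiply is replaced by a cheaper
 * running addition; off always equals i * sizeof a[0]. */
void clear_reduced(int *a)
{
    int i;
    size_t off = 0;
    for (i = 0; i < LEN; i++) {
        *(int *)((char *)a + off) = 0;
        off += sizeof a[0];
    }
}

/* Stage 2: induction variable elimination. After stage 1, i is used only
 * to stop the loop, so the test is rewritten on the address and i goes. */
void clear_pointer(int *a)
{
    int *p = a;
    int *end = a + LEN;
    while (p < end)
        *p++ = 0;
}

/* Checks that all three stages leave the array in the same state. */
void check_stages(void)
{
    int x[LEN], y[LEN], z[LEN];
    int i;

    for (i = 0; i < LEN; i++)
        x[i] = y[i] = z[i] = i + 1;
    clear_indexed(x);
    clear_reduced(y);
    clear_pointer(z);
    for (i = 0; i < LEN; i++)
        assert(x[i] == 0 && y[i] == 0 && z[i] == 0);
}
```

Stage 2 is the hand-written pointer loop from the quoted example; a compiler that gets from stage 0 to stage 2 on its own would emit the same code for both programs.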
castong@bucsb.UUCP (Paul Castonguay) (05/24/89)
> >> The statement that *(a+i) == *(i+a) is therefore invalid.
> >No, the statement is true.
>--
    #include<stdio.h>
    #include<stdlib.h>
    main()
    {
        int *a;
        int i=1;
        a = (int *)malloc(4 * sizeof(int));
        *a = 0;
        *(a+1) = 4;
        *(a+2) = 0;
        *(a+3) = 0;
        printf("*(a+i) = %d ", *(a+i));
        printf("*(i+a) = %d\n", *(i+a));
    }
Output produced: *(a+i) = 4 *(i+a) = 4
Does that not show that *(a+i) == *(i+a) ?
sar@datcon.UUCP (Simon A Reap) (05/25/89)
In article <155@titania.warwick.ac.uk> cudcv@warwick.ac.uk (Rob McMahon) writes:
>In article <756@mccc.UUCP> pjh@mccc.UUCP (Pete Holsberg) writes:
>>Perhaps I've asked the wrong question. I saw a couple of simple test
>>programs that assigned 0 to each member of an array. One used array
>>subscript notation, and the other, pointer notation ... The subscript
>>versions had different run times from the pointer versions (some slower, some
>>faster!). I assumed - perhaps naively - that the differences were caused by
>>differences in code produced by the different compilers (and of course the
>>hardware differences).
>
>I'll lay odds that you're comparing
>
>    int i;
>    for (i = 0; i < MAX; i++)
>        a[i] = 0;
>with
>    grimble *p;
>    for (p = a; p < &a[MAX]; p++)
>        *p = 0;
>
>am I right?
>
Compiling:
    int a[20];
    main(){int i;
    for(i=0;i<20;i++)a[i]=0;
    f();
    for(i=0;i<20;i++)*(a+i)=0;
    f();
    }
on our Pyramid 9820 gives the following assembler in att and ucb universes:
            movw $0x0,lr0
            br L13
    L15:
            movw $0x0,_a[lr0*0x4]   ;body of loop for a[i]=0
            addw $0x1,lr0           ;
    L13:
            cmpw $0x14,lr0
            blt L15
    L14:
            call _f
            movw $0x0,lr0
            br L16
    L18:
            mova _a[lr0*0x4],pr2    ;body of loop for *(a+i)=0
            movw $0x0,(pr2)         ;
            addw $0x1,lr0           ;
    L16:
            cmpw $0x14,lr0
            blt L18
    L17:
            call _f
            ret
Yup, different code. But then again, what can one expect from a compiler
that doesn't understand i[a] (as in "hello"[2]) :-( Such a pity on an
otherwise good machine.
--
Enjoy,
yerluvinunclesimon Opinions are mine - my cat has her own ideas
Reach me at sar@datcon.co.uk, or ...!mcvax!ukc!pyrltd!datcon!sar
rec@elf115.uu.net (Roger Critchlow) (05/26/89)
In <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) had written:
> But a logical array can be
>promoted to a first-class citizen by just putting it in a structure:
>
>    typedef struct {
>        jmp_buf j;
>    } first_class_jmpbuf;
>
>Any idea why this wasn't done for jmp_buf's?
In <13269@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) responded:
>In article <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) writes:
>>Any idea why this wasn't done for jmp_buf's?
>Originally? Probably shortsightedness. In ANSI C? Backward compatibility.
I believe that struct's were still second class objects when
jmp_buf was first declared.
The declaration of a jmp_buf as an array means that the jmp_buf
is passed by reference instead of by value. This is essential
to many implementations of setjmp(). It's also a useful trick
to remember if you want user declared data objects to be passed
to your library routines by reference.
In <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) continued:
>I think the "second-classedness" of arrays helps give C its elegant
>syntax. Any other examples of the "problems" it causes?
I think of C arrays as syntactic sugar for initialized pointers. Thus
    char foo[] = "I am an anonymous char *";
is an abbreviation for
    register char *const foo = "I am an anonymous char *";
I reason 'const' because the value of the pointer cannot be changed,
and 'register' because the address of the pointer cannot be taken.
--
rec@elf115.uu.net
--
karl@haddock.ima.isc.com (Karl Heuer) (05/26/89)
In article <96@elf115.uu.net> rec@elf115.uu.net (Roger Critchlow) writes:
>In <13269@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>In article <105998@sun.Eng.Sun.COM> jamesa@arabian.Sun.COM (James D. Allen) writes:
>>>Any idea why this [enclosing in a struct] wasn't done for jmp_buf's?
>>
>>Originally? Probably shortsightedness.
>
>I believe that struct's were still second class objects when
>jmp_buf was first declared.
But arrays were (and still are) third-class objects. A struct would still
have been the better choice. (Or do you mean to suggest that structs didn't
exist at all? I bet if you go back that far, typedef didn't exist either.)
>The declaration of a jmp_buf as an array means that the jmp_buf
>is passed by reference instead of by value. This is essential
>to many implementations of setjmp().
Of course it has to be passed by reference; the function needs to write in
it, after all. But what *should* have been done was to typedef jmp_buf as a
struct, and use "&" explicitly. That's what the rest of the library uses
when call by reference is required.
>I think of C arrays as syntactic sugar for initialized pointers. Thus
>    char foo[] = "I am an anonymous char *";
>is an abbreviation for
>    register char *const foo = "I am an anonymous char *";
This is very wrong, but I'll let Chris Torek explain why, since he already
has it in his FAQ database.
Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
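A small sketch of the passing behaviour under discussion. It uses a hypothetical array typedef buf_t as a stand-in for something shaped like jmp_buf, since the real jmp_buf's contents are implementation-specific; all names here are assumptions made for illustration:

```c
#include <assert.h>

typedef int buf_t[4];                    /* hypothetical array typedef */
typedef struct { buf_t b; } wrapped_t;   /* the same array inside a struct */

/* An array parameter is really a pointer, so the callee writes into the
 * caller's array without any "&" at the call site. */
void fill_array(buf_t b) { b[0] = 42; }

/* A struct parameter is copied, so the callee writes into its own copy. */
void fill_struct_by_value(wrapped_t w) { w.b[0] = 42; }

/* For the callee to write through, the caller passes "&" explicitly. */
void fill_struct_by_pointer(wrapped_t *w) { w->b[0] = 42; }

void check_passing(void)
{
    buf_t a = {0};
    wrapped_t w = {{0}};
    wrapped_t copy;

    fill_array(a);
    assert(a[0] == 42);          /* caller's array was modified */

    fill_struct_by_value(w);
    assert(w.b[0] == 0);         /* caller's struct is untouched */

    fill_struct_by_pointer(&w);
    assert(w.b[0] == 42);        /* modified through the explicit pointer */

    copy = w;                    /* structs can be assigned; bare arrays cannot */
    assert(copy.b[0] == 42);
}
```

The struct version makes the by-reference call visible at the call site, which is the convention argued for above.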
guy@auspex.auspex.com (Guy Harris) (05/27/89)
>>I think the "second-classedness" of arrays helps give C its elegant
>>syntax. Any other examples of the "problems" it causes?
>
>I think of C arrays as syntactic sugar for initialized pointers.
In other words, your answer to his question is that one problem caused by
the "second-classedness" of arrays is that it leads people to think of
them, incorrectly, as pointers? I'd certainly agree with that....
>Thus
>
>    char foo[] = "I am an anonymous char *";
>
>is an abbreviation for
>
>    register char *const foo = "I am an anonymous char *";
>
>I reason 'const' because the value of the pointer cannot be changed,
>and 'register' because the address of the pointer cannot be taken.
Well, unfortunately, there's no little thing you can add to the
declaration to straightforwardly reflect the fact that:
    foo.c:
        ...
        char foo[] = "I am an array";
        ...
    bar.c:
        ...
        extern char *const foo;
        ...
is wrong. (And yes, "I did (the above); why isn't it working?" has
appeared as a question in this newsgroup in the past, so people really *do*
get the idea that it's supposed to work.)
If you really want to go out of your way, I guess the "register" does that
- but it also hints that something gets stuffed into a register, which is
wrong.
Think of arrays as arrays, pointers as pointers, and array-valued
expressions as being converted, in most but *not* all contexts, to
pointer-valued expressions that point to the first element of the array,
and you won't go wrong. That may be more *complicated* than your model, but
it has the advantage of reflecting reality....
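A brief sketch of two contexts where the conversion does not happen, sizeof and unary &, which is where the "array is just a pointer" model visibly breaks. The object bar and the check function are illustrative assumptions, not code from the thread:

```c
#include <assert.h>
#include <string.h>

char foo[] = "I am an array";          /* storage for the characters themselves */
const char *bar = "I am a pointer";    /* a separate pointer object */

void check_array_vs_pointer(void)
{
    /* sizeof does not convert the array: it yields the whole array's size,
     * including the terminating '\0'. */
    assert(sizeof foo == strlen(foo) + 1);

    /* sizeof applied to a pointer gives the size of the pointer object. */
    assert(sizeof bar == sizeof(const char *));

    /* &foo points at the whole array: same address as foo, different type. */
    assert((void *)&foo == (void *)foo);
    assert(sizeof *&foo == sizeof foo);

    /* &bar is the address of the pointer object, not of the characters. */
    assert((const void *)&bar != (const void *)bar);

    /* Everywhere else foo converts to a pointer to its first element. */
    assert(foo + 1 == &foo[1]);
}
```

This is also why the mismatched extern declaration goes wrong: bar.c would read the first few bytes of the character data as if they were a stored address.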