rhm@druwy.ATT.COM (Roger Massey) (01/20/89)
No bugs in TC2.0 ?
This small model C program :
sub( cp )
unsigned char *cp;
{
    int i;
    i = (*cp++ << 8) + *cp++;
}
generates the following code fragment (I left out some of the .asm):
_TEXT   segment byte public 'CODE'
; ?debug L 1
_sub    proc    near
        push    bp
        mov     bp,sp
        sub     sp,2
        push    si
        mov     si,word ptr [bp+4]
; ?debug L 6
        mov     al,byte ptr [si]
        mov     ah,0
        mov     cl,8
        shl     ax,cl
        mov     dl,byte ptr [si]
        mov     dh,0
        add     ax,dx
        mov     word ptr [bp-2],ax
        inc     si
        inc     si
@1:
; ?debug L 7
        pop     si
        mov     sp,bp
        pop     bp
        ret
_sub    endp
_TEXT   ends
note that si (i.e. cp) is not incremented between references
but instead after both references.
Roger Massey
AT&T Denver
andrews@calgary.UUCP (Keith Andrews) (01/21/89)
In article <3785@druwy.ATT.COM>, rhm@druwy.ATT.COM (Roger Massey) writes:
> No bugs in TC2.0 ?
>
> This small model C program :
>
> sub( cp )
> unsigned char *cp;
> {
>     int i;
>
>     i = (*cp++ << 8) + *cp++;
> }
>
*** Omitted code showing that "cp" gets incremented twice at the end of the
generated code instead of after each reference ***
>
> Roger Massey
> AT&T Denver

Sorry, this isn't a bug. The order of evaluation of the ++ operators in an
expression is not guaranteed. As evidence, this is what lint has to say about
the above code fragment:

Script started on Fri Jan 20 09:21:51 1989
1: lint foo.c
foo.c(6): warning: cp evaluation order undefined
foo.c(6): warning: i set but not used in function sub
sub defined( foo.c(3) ), but never used
2: ^D
script done on Fri Jan 20 09:22:04 1989

Nevertheless, Turbo C likes to make noises about practically everything else,
I wonder why they don't issue warnings about this?

Keith Andrews
andrews@cpsc.UCalgary.CA
ralf@b.gp.cs.cmu.edu (Ralf Brown) (01/21/89)
In article <3785@druwy.ATT.COM> rhm@druwy.UUCP (MasseyR) writes:
}No bugs in TC2.0 ?
}
}This small model C program :
}
}sub( cp )
}unsigned char *cp;
}{
}    int i;
}    i = (*cp++ << 8) + *cp++;
}}
}
}generates the following code fragment (I left out some of the .asm):
[fragment omitted]
}
}note that si (i.e. cp) is not incremented between references
}but instead after both references.

That is not a bug. K&R explicitly state that the compiler may do the increment
any time between the value's use and the next sequence point (comma, end of
statement, etc.). You cannot rely on order-of-execution within an expression,
since the compiler is free to rearrange things as it sees fit. The code
fragment

    i = 4 ;
    printf("%d",i++ + i++ + i++) ;

has three legal results: 12 14 15 depending on how many of the increments are
deferred until after the additions.
--
{harvard,uunet,ucbvax}!b.gp.cs.cmu.edu!ralf -=-=- AT&T: (412)268-3053 (school)
ARPA: RALF@B.GP.CS.CMU.EDU |"Tolerance means excusing the mistakes others make.
FIDO: Ralf Brown at 129/31 | Tact means not noticing them." --Arthur Schnitzler
BITnet: RALF%B.GP.CS.CMU.EDU@CMUCCVMA -=-=- DISCLAIMER? I claimed something?
--
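
The three outcomes mentioned above can be reproduced without relying on
unspecified behavior by writing each schedule out as separate statements.
The sketch below is only an illustration: the function names are made up
here, and a real compiler is not limited to these three schedules (under
ANSI C the original expression is simply undefined, so other results are
possible too).

```c
/* Each helper spells out one possible schedule for  i++ + i++ + i++
   using fully sequenced statements. The names are hypothetical. */

/* All three increments deferred until after both additions. */
int all_deferred(int *i)
{
    int r = *i + *i + *i;   /* reads the old value three times */
    *i += 3;                /* side effects applied at the very end */
    return r;
}

/* First two increments applied after the first addition. */
int two_deferred(int *i)
{
    int r = *i + *i;        /* both operands see the old value */
    *i += 2;                /* two pending increments land here */
    r += *i;                /* third operand sees the updated value */
    *i += 1;
    return r;
}

/* Every increment applied immediately after its operand is read. */
int none_deferred(int *i)
{
    int r = *i;
    *i += 1;
    r += *i;
    *i += 1;
    r += *i;
    *i += 1;
    return r;
}
```

Starting from i = 4, these return 12, 14 and 15 respectively, and each
leaves i at 7.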
kneller@cgl.ucsf.edu (Don Kneller) (01/21/89)
In article <3785@druwy.ATT.COM> rhm@druwy.UUCP (MasseyR) writes:
>No bugs in TC2.0 ?
>
>sub( cp )
>unsigned char *cp;
>{
>    int i;
>
>    i = (*cp++ << 8) + *cp++;
>}
>
>[ ASM code removed which shows cp is not incremented between references
>  but instead after both references - dgk ]

This is perfectly valid behavior. Not exactly DWIM, but certainly not
disallowed. In essence, foo++ in an expression means to use the value of foo
and, sometime before proceeding to the next line, increment the value of foo.

We all know what you mean to say in the above expression, but you don't have
complete control over the order of evaluation! That is, you have no control
over which of (*cp++ << 8) or *cp++ is first to be evaluated. The compiler
makers are free to do whatever they please.

The take-home lesson (as such) is for C programmers to never depend on the
order of evaluation in expressions where the operators have equal precedence.
Currently C has no way of forcing the order. Only a few operators can force
explicit order (e.g. || && ,).

- don

P.S. I recently fell for (getchar() << 8) + getchar().
-----
Don Kneller
UUCP: ...ucbvax!ucsfcgl!kneller
INTERNET: kneller@cgl.ucsf.edu
BITNET: kneller@ucsfcgl.BITNET
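
The getchar() trap in the P.S. goes away once each read sits in its own
statement, because the end of a statement is a sequence point. A minimal
sketch, assuming a hypothetical helper named read_be16 that takes the
stream as a parameter (neither the name nor the -1 error convention comes
from the thread):

```c
#include <stdio.h>

/* Hypothetical helper: reads a 16-bit big-endian value from fp.
   Returns the value (0..65535), or -1 on EOF or read error. */
long read_be16(FILE *fp)
{
    int hi;
    int lo;

    hi = getc(fp);          /* first read completes before the second starts */
    if (hi == EOF)
        return -1;

    lo = getc(fp);          /* second read, in its own statement */
    if (lo == EOF)
        return -1;

    return ((long)hi << 8) | lo;
}
```

Calling read_be16(stdin) would be the equivalent of what
(getchar() << 8) + getchar() was meant to do.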
johnl@ima.ima.isc.com (John R. Levine) (01/21/89)
In article <3785@druwy.ATT.COM> rhm@druwy.UUCP (MasseyR) writes:
>No bugs in TC2.0 ?
> ...
>    i = (*cp++ << 8) + *cp++;

[ compiles as though he had written i = (*cp << 8) + *cp, cp += 2; ]

No bug here, consult your C manual. C only promises that a postfix ++ will be
executed before the next sequence point and the only sequence point here is
the end of the statement. Try this instead:

    i = (cp[0] << 8) + cp[1], cp += 2;
--
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 492 3869
{ bbn | spdcc | decvax | harvard | yale }!ima!johnl, Levine@YALE.something
You're never too old to have a happy childhood.
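
The suggested rewrite can be wrapped in a small function so that callers
cannot get the ordering wrong. This is only a sketch: next_be16 is a
hypothetical name, and passing the pointer by address is one design choice
among several.

```c
/* Hypothetical helper: returns the big-endian 16-bit value at *cpp
   and advances *cpp past the two bytes that were consumed. */
unsigned int next_be16(const unsigned char **cpp)
{
    const unsigned char *cp = *cpp;
    unsigned int v;

    v = ((unsigned int)cp[0] << 8) | cp[1];  /* indexing has no side effects */
    *cpp = cp + 2;                           /* advance once, in its own statement */
    return v;
}
```

With unsigned char *cp pointing at the data, i = next_be16(&cp); then
replaces the original statement.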
chris@mimsy.UUCP (Chris Torek) (01/21/89)
In article <16674@iuvax.cs.indiana.edu> bobmon@iuvax.cs.indiana.edu
(RAMontante) writes:
>I am cross-posting the following to comp.lang.c, because the language
>expertise is there. I am not convinced that you have any guarantee about
>just when the post-increments happen -- consider p.50 of K&R 1st edition,
>for example. So it's not a TC bug, just an implementor's choice.

This is correct. The pANS uses the notion of `sequence points' to decide when
a side effect (such as post-increment) must happen; there is no sequence point
within a simple assignment expression like the one quoted below.

Incidentally, the quoted assembly shows that TC2.0 (Turbo C, I suppose) misses
one possible optimisation.

>>    i = (*cp++ << 8) + *cp++;

cp is in the SI register at this point; the goal is to compute AX=[SI]<<8:

>>    mov     al,byte ptr [si]
>>    mov     ah,0
>>    mov     cl,8
>>    shl     ax,cl

A much better sequence is

        mov     al,0
        mov     ah,byte ptr [si]

A simple peephole optimiser should be able to catch this. More complex
analysis might even figure out that

        i = (cp[0] << 8) + cp[1]; cp += 2;

could be computed as

        mov     ah,byte ptr [si]
        inc     si
        mov     al,byte ptr [si]
        inc     si
        ; i is now in ax

while

        i = cp[0] + (cp[1] << 8);

is even more simple:

        mov     ax,word ptr [si]

but I would not expect most compilers to manage either of these. (If you
rearranged the source to read

        i = *(int *)cp;

I would expect the latter sequence.)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu
Path: uunet!mimsy!chris
naughton%wind@Sun.COM (Patrick Naughton) (01/21/89)
In article <3785@druwy.ATT.COM> rhm@druwy.UUCP (MasseyR) writes:
>No bugs in TC2.0 ?
>
>This small model C program :
>sub( cp )
>unsigned char *cp;
>{
>    int i;
>
>    i = (*cp++ << 8) + *cp++;

This is bogus code anyways... Remember that '+' is commutative. You are
saying that:

    i = (*cp++ << 8) + *cp++;

is the same as:

    i = *cp++ + (*cp++ << 8);

which it is obviously not. You should use:

    i = (*cp << 8) + *(cp+1);
    cp += 2;

...but the code you sent out did look like it showed up a compiler bug,
nonetheless.

-Patrick
______________________________________________________________________
Patrick J. Naughton                     ARPA: naughton@Sun.COM
Window Systems Group                    UUCP: ...!sun!naughton
Sun Microsystems, Inc.                  AT&T: (415) 336 - 1080
rjchen@phoenix.Princeton.EDU (Raymond Juimong Chen) (01/21/89)
<537@cs-spool.calgary.UUCP>, andrews@calgary.UUCP (Keith Andrews):
> Nevertheless, Turbo C likes to make noises about practically everything else,
> I wonder why they don't issue warnings about this?

Maybe because they don't know...

In an issue of Turbo Technix, an example in one of the articles actually
RELIED ON the order of evaluation to get the correct answer. The author
proceeded to explain that two results are reasonable (left-to-right and
right-to-left evaluation), and furthermore added that the left-to-right
evaluation is CORRECT!

I wrote them a letter about it, but by that time the magazine had gone
extinct.
--
Raymond Chen
UUCP: ...allegra!princeton!{phoenix|pucc}!rjchen
BITNET: rjchen@phoenix.UUCP, rjchen@pucc
ARPA: rjchen@phoenix.PRINCETON.EDU, rjchen@pucc.PRINCETON.EDU
"Say something, please! ('Yes' would be best.)" - The Doctor
abcscnge@csuna.UUCP (Scott "The Pseudo-Hacker" Neugroschl) (01/23/89)
In article <3785@druwy.ATT.COM> rhm@druwy.UUCP (MasseyR) writes:
[ statement about TC2.0 being bug free, in the context of a subroutine ]
>    i = (*cp++ << 8) + *cp++;

I can't find it in my copy of K&R, but I think that they (and most C compilers
I have seen) indicate that ++'ing the same variable multiple times in a single
expression is undefined. Hence, this statement cannot really be used as an
example of a TC bug.

NO FLAMES PLEASE!!! I would like any ANSI X3J11'ers or TC hacks or anyone who
can find documentation one way or the other to respond in a reasonable manner.
--
Scott "The Pseudo-Hacker" Neugroschl
UUCP: ...!sm.unisys.com!csun!csuna!abcscnge
-- "Beat me, whip me, make me code in Ada"
-- Disclaimers? We don't need no stinking disclaimers!!!
abcscnge@csuna.UUCP (Scott "The Pseudo-Hacker" Neugroschl) (01/23/89)
In article <15560@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>(If you rearranged the source to read
>
>    i = *(int *)cp;
>
>I would expect the latter sequence.)

The problem with this construct is that it is not portable. Many machines
(read 68000) require integer data to be aligned. Actually, I have in my
"standard library" of home brewed routines:

    int intat(p)
    unsigned char *p;
    {
        int x = 0;

        x = (int)(*p) * 0x100;
        x += *++p;
        return(x);
    }

    long longat(p)
    unsigned char *p;
    {
        return(((long)intat(p) * 0x10000L) + (long)intat(p+2));
    }

    /* Please note these are from memory, I think I handled sign extension
       woes but these are just skeletal for the idea */

Actually, these two routines aren't even completely portable because they
assume a big-endian architecture. They would have to be rewritten (or ifdefed)
for a little endian architecture.
--
Scott "The Pseudo-Hacker" Neugroschl
UUCP: ...!sm.unisys.com!csun!csuna!abcscnge
-- "Beat me, whip me, make me code in Ada"
-- Disclaimers? We don't need no stinking disclaimers!!!
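
A sketch of how readers like these might look with unsigned arithmetic, which
sidesteps the sign-extension worry, plus an alignment-safe way to fetch a
native int. All names here (be32_at, le32_at, native_int_at) are hypothetical.
Note that the shift-based readers depend only on the byte order of the data,
not on the byte order of the machine running them, since shifts operate on
values rather than on memory layout.

```c
#include <string.h>

/* Hypothetical helper: 32-bit value stored most significant byte first. */
unsigned long be32_at(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16)
         | ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}

/* Hypothetical helper: 32-bit value stored least significant byte first. */
unsigned long le32_at(const unsigned char *p)
{
    return ((unsigned long)p[3] << 24) | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[1] << 8)  |  (unsigned long)p[0];
}

/* Hypothetical helper: fetches an int in the machine's own byte order
   from an address that may be misaligned. Copying byte by byte avoids
   the alignment trap that *(int *)p can hit on machines like the 68000. */
int native_int_at(const unsigned char *p)
{
    int x;

    memcpy(&x, p, sizeof x);
    return x;
}
```

The first two are the choice for data with a defined external byte order
(files, network packets); the third only makes sense for data written by the
same kind of machine.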
jk0@clutx.clarkson.edu (Jason Coughlin) (01/23/89)
From article <537@cs-spool.calgary.UUCP>, by andrews@calgary.UUCP (Keith Andrews):
> Nevertheless, Turbo C likes to make noises about practically everything else,
> I wonder why they don't issue warnings about this?
>

I compiled something like this with Turbo C and 4.2 BSD cc:

    int hello(c)
    char c;
    {
        switch (c) {
        case 'a': .... return (b*a);
        case 'b': .... return (b*c);
        }
    }

This is JUST an example. TC didn't give me ANY warnings, but 4.2 BSD figured
out that my simple return at the end of the function was inconsistent.

>I wonder why they don't issue warnings like this?
--
Jason Coughlin (jk0@clutx, jk0@clutx.clarkson.edu)
ralf@b.gp.cs.cmu.edu (Ralf Brown) (01/23/89)
In article <2041@sun.soe.clarkson.edu> jk0@sun.soe!clutx.clarkson.edu.UUCP writes:
} I compiled something like this with Turbo C and 4.2 BSD cc:
[sample code omitted]
} This is JUST an example, but TC didn't give me ANY warnings, but
} 4.2 BSD figured out that my simple return at the end of the function
} was inconsistent.
>I wonder why they don't issue warnings like this?

Do you have warnings turned on (-w)? I put the code thru TC2.0, and it
definitely warned that a value should be returned at the end of the function.
Adding a

    return;

at the end of the function generates the different warning that "Both return
and return of a value used in function hello"
--
{harvard,uunet,ucbvax}!b.gp.cs.cmu.edu!ralf -=-=- AT&T: (412)268-3053 (school)
ARPA: RALF@B.GP.CS.CMU.EDU |"Tolerance means excusing the mistakes others make.
FIDO: Ralf Brown at 129/31 | Tact means not noticing them." --Arthur Schnitzler
BITnet: RALF%B.GP.CS.CMU.EDU@CMUCCVMA -=-=- DISCLAIMER? I claimed something?
--
colburn@sip7.SRC.Honeywell.COM (Mark H. Colburn) (02/08/89)
In article <3785@druwy.ATT.COM> rhm@druwy.UUCP (MasseyR) writes:
>    i = (*cp++ << 8) + *cp++;
>note that si (i.e. cp) is not incremented between references
>but instead after both references.

Note that this is not necessarily incorrect. It is undefined to include two
operations with side effects which reference the same variable. Among other
things it is bad coding style. Worse, it provides unexpected results on a
number of compilers. The code would be better written:

    i = (*cp << 8) + *(cp + 1);
    cp += 2;

One of the classic examples of side effects is given below:

    i = 10;
    i++ = 10 * i++;

What is the value of i going to be? 111? 120? Remember that C may entirely
reorder your expression before evaluating it. You shouldn't write code like
that.

Mark H. Colburn                         MN65-2300
colburn@SRC.Honeywell.COM
Systems Administration and Support
Honeywell Systems & Research Center