rbutterworth@watmath.UUCP (Ray Butterworth) (09/14/86)
/*
* The following two loops should be identical,
* the second test possibly generating more efficient code.
* i.e. ((c=exp1),(c!=exp2)) versus ((c=exp1)!=(exp2)).
* This is true on all the other compilers I have tried,
* regardless of sign-extension, byte order, word size, etc.
* But on the 4.2 cc, the second test rather than being more
* efficient, actually generates more code, and in fact generates
* incorrect code (the loop never terminates (well almost never)).
* In this particular case it seems to think that the type of
* (c=exp1) is the same as the type of (exp1) and not that of (c).
* According to both K&R and the proposed X3J11 this is wrong.
* (I realize that the "&0377" is actually redundant, but it is
* one way of invoking the bug in this contrived example.)
*/
main()
{
	char *data="see should_not_see";
	char stop='\345';
	char *p;
	char c;

	data[3]=stop;

	p = data;
	printf("Using comma operator\n");
	while ( (c=(*p++&0377)), (c!=stop) )
		printf("%3.3o ", c);
	printf("\n");

	p = data;
	printf("Using assignment!=\n");
	while ( (c=(*p++&0377))!=stop )
		printf("%3.3o ", c);
	printf("\n");

	return 0;
}
/*
*This is the correct code generated for the first loop:
L19:
cvtbl *-12(fp),r0
bicl2 $-256,r0
cvtlb r0,-13(fp)
incl -12(fp)
cmpb -13(fp),-5(fp)
jeql L20
*This is the incorrect code generated for the second (more efficient) loop:
L24:
movl -12(fp),r0
incl -12(fp)
cvtbl (r0),r0
bicl2 $-256,r0
cvtlb r0,-13(fp)
cvtbl -5(fp),r1
cmpl r0,r1
jeql L25
*/
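
To spell out the rule the comment above relies on: the value of an assignment expression is the value stored in the left operand, and it has the left operand's type. Below is a minimal sketch of that rule in isolation, not taken from the original program. The helper names are hypothetical, and it uses signed char explicitly to mimic the VAX, where plain char is signed. The result of converting 0345 to signed char is implementation-defined; the sketch assumes the usual two's complement wraparound.

```c
#include <stddef.h>

/*
 * Hypothetical helper: size of the assignment expression itself.
 * sizeof does not evaluate its operand, so c is never written here.
 * The expression (c = 0345) has the type of c, so this is sizeof(signed char).
 */
size_t assignment_expr_size(void)
{
	signed char c;
	return sizeof(c = 0345);
}

/*
 * Hypothetical helper: what the second loop's test should compute.
 * raw is converted to signed char when it is stored in c, and that
 * converted value is what gets compared with stop.
 */
int assigned_value_matches(int raw, signed char stop)
{
	signed char c;
	return (c = raw) == stop;
}

/*
 * Hypothetical helper: roughly what the incorrect 4.2 code does,
 * comparing the unconverted int on the right-hand side with stop.
 */
int raw_value_matches(int raw, signed char stop)
{
	return raw == stop;
}
```

Under those assumptions, assigned_value_matches(0345, stop) is true when stop holds '\345', because both sides went through the same conversion to signed char. raw_value_matches(0345, stop) is false, because 229 is being compared with a sign-extended negative value, which is why the second loop runs past the stop character.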
guy@sun.uucp (Guy Harris) (09/17/86)
Fixed - in part - in 4.3. The code it generates for the first loop, for
comparison; it's the same as in 4.2:

L19:
	cvtbl *-12(fp),r0
	bicl2 $-256,r0
	cvtlb r0,-13(fp)
	incl -12(fp)
	cmpb -13(fp),-5(fp)
	jeql L20

And the code for the second loop:

L24:
	movl -12(fp),r0
	incl -12(fp)
	cvtbl (r0),r0
	bicl2 $-256,r0
	cvtlb r0,-13(fp)
	cmpb -13(fp),-5(fp)
	jeql L25

Well, at least it's correct this time, but it's *still* less efficient;
moving the pointer into "r0" first is pointless. Turning on the optimizer
doesn't help; it does collapse the "cvtbl" and "bicl2" into a "movzbl", but
it doesn't figure out that

	movl -12(fp),r0
	incl -12(fp)
	movzbl (r0),r0

is better done as

	movzbl *-12(fp),r0
	incl -12(fp)

--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com (or guy@sun.arpa)