[net.lang.c] Bug in 4.2 cc code generator

rbutterworth@watmath.UUCP (Ray Butterworth) (09/14/86)

/*
 * The following two loops should be identical,
 * the second test possibly generating more efficient code.
 * i.e. ((c=exp1),(c!=exp2))  versus  ((c=exp1)!=(exp2)).

 * This is true on all the other compilers I have tried,
 * regardless of sign-extension, byte order, word size, etc.

 * But on the 4.2 cc, the second test, rather than being more
 * efficient, actually generates more code, and in fact generates
 * incorrect code (the loop never terminates; well, almost never).
 * In this particular case it seems to think that the type of
 * (c=exp1) is the same as the type of (exp1) and not that of (c).
 * According to both K&R and the proposed X3J11 this is wrong.

 * (I realize that the "&0377" is actually redundant, but it is
 * one way of invoking the bug in this contrived example.)
 */

#include <stdio.h>

main()
{
    char *data="see should_not_see";
    char stop='\345';
    char *p;
    char c;

    data[3]=stop;

    p = data;
    printf("Using comma operator\n");
    while ( (c=(*p++&0377)), (c!=stop) )
        printf("%3.3o ", c);
    printf("\n");

    p = data;
    printf("Using assignment!=\n");
    while ( (c=(*p++&0377))!=stop )
        printf("%3.3o ", c);
    printf("\n");

    return 0;
}
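
The rule the opening comment relies on (the value of an assignment expression has the type of its left operand) can be checked in isolation. The following is only a sketch: the function names are hypothetical and do not come from the posting, and it assumes the usual 8-bit char with wrap-around conversion from int. On a compiler that follows K&R and the X3J11 draft, the two forms should agree for every byte value.

```c
/* Hypothetical helpers, not part of the original test program. */

/* Comma form: store into c, then compare the stored char with stop. */
int differs_comma(int byte, char stop)
{
    char c;
    return ((c = (byte & 0377)), (c != stop));
}

/* Assignment form: (c = ...) has type char and the value now held in c,
 * so this comparison should give the same answer as the comma form. */
int differs_assign(int byte, char stop)
{
    char c;
    return ((c = (byte & 0377)) != stop);
}

/* sizeof does not evaluate its operand; it reports the size of the
 * expression's type, which is char here even though 0 is an int. */
int assign_expr_size(void)
{
    char c;
    return (int)sizeof(c = 0);
}

/* Returns 1 if both forms agree for all 256 byte values and the given stop. */
int forms_agree(char stop)
{
    int v;
    for (v = 0; v < 256; v++)
        if (differs_comma(v, stop) != differs_assign(v, stop))
            return 0;
    return 1;
}
```

Calling forms_agree('\345') exercises the same stop value as the program above; it is the high bit of the stop character that exposes the 4.2 problem.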

/*
 * This is the correct code generated for the first loop:
L19:
    cvtbl   *-12(fp),r0
    bicl2   $-256,r0
    cvtlb   r0,-13(fp)
    incl    -12(fp)
    cmpb    -13(fp),-5(fp)
    jeql    L20

 * This is the incorrect code generated for the second (more efficient) loop:
L24:
    movl    -12(fp),r0
    incl    -12(fp)
    cvtbl   (r0),r0
    bicl2   $-256,r0
    cvtlb   r0,-13(fp)
    cvtbl   -5(fp),r1
    cmpl    r0,r1
    jeql    L25
 */
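
Read as C, the two listings differ only in what gets compared. The sketch below is a rough model of that difference, not compiler output: the function names are hypothetical, char is modelled as signed char (as it is on the VAX), and the conversion to signed char is assumed to wrap. The first function compares the zero-extended longword against a sign-extended stop, as cmpl r0,r1 does; the second compares the two stored bytes, as cmpb does.

```c
/* Hypothetical model of the second listing (the incorrect code). */
int buggy_differs(int byte, signed char stop)
{
    int r0 = byte & 0377;   /* bicl2 $-256,r0 : value in 0..255        */
    int r1 = stop;          /* cvtbl -5(fp),r1: sign-extended, <0 here  */
    return r0 != r1;        /* cmpl r0,r1     : longword comparison     */
}

/* Hypothetical model of the first listing (the correct code). */
int correct_differs(int byte, signed char stop)
{
    signed char c = (signed char)(byte & 0377);  /* cvtlb r0,-13(fp)     */
    return c != stop;                            /* cmpb -13(fp),-5(fp)  */
}
```

With stop set to '\345' the buggy model reports "different" for every possible byte, because r0 is never negative while r1 always is. For a stop character without the high bit set the two models agree.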

guy@sun.uucp (Guy Harris) (09/17/86)

Fixed - in part - in 4.3.  For comparison, here is the code it generates for
the first loop; it's the same as in 4.2:

L19:
	cvtbl	*-12(fp),r0
	bicl2	$-256,r0
	cvtlb	r0,-13(fp)
	incl	-12(fp)
	cmpb	-13(fp),-5(fp)
	jeql	L20

And the code for the second loop:

L24:
	movl	-12(fp),r0
	incl	-12(fp)
	cvtbl	(r0),r0
	bicl2	$-256,r0
	cvtlb	r0,-13(fp)
	cmpb	-13(fp),-5(fp)
	jeql	L25

Well, at least it's correct this time, but it's *still* less efficient; moving
the pointer into "r0" first is pointless.  Turning on the optimizer doesn't
help; it does collapse the "cvtbl" and "bicl2" into a "movzbl", but it
doesn't figure out that

	movl	-12(fp),r0
	incl	-12(fp)
	movzbl	(r0),r0

is better done as

	movzbl	*-12(fp),r0
	incl	-12(fp)
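
In C terms, the two-instruction sequence is just "fetch a byte, zero-extend it, bump the pointer". A minimal sketch, assuming 8-bit chars; the function names are made up for illustration. The "&0377" form used in the test program and a cast to unsigned char should produce the same value, which is what a single movzbl computes:

```c
/* Hypothetical helpers illustrating the zero-extending fetch. */

/* The form used in the test program: mask off everything above the low byte. */
int fetch_masked(const char **pp)
{
    return *(*pp)++ & 0377;
}

/* The same fetch written as an unsigned char conversion. */
int fetch_unsigned(const char **pp)
{
    return (unsigned char)*(*pp)++;
}

/* Fetch only the first byte of s, zero-extended. */
int first_masked(const char *s)
{
    return fetch_masked(&s);
}

/* Returns 1 if both forms yield the same values and advance the
 * pointer identically over the first n bytes of s. */
int fetch_forms_agree(const char *s, int n)
{
    const char *p = s, *q = s;
    int i;
    for (i = 0; i < n; i++)
        if (fetch_masked(&p) != fetch_unsigned(&q) || p != q)
            return 0;
    return 1;
}
```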
-- 
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com (or guy@sun.arpa)