[comp.unix.questions] Wierd Compilers

dc@gcm.UUCP (11/02/87)

/* Program 1 */
main()
{
	int 	i = 500000;

	while(--i);
}

/* Program 2 */
main()
{
	int 	i = 500000;

	while(i--);
}

Does anyone have a guess why program one runs in half the time of program two?
BTW this is a SUN 3.

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (11/02/87)

In article <367@white.gcm>, dc@gcm (Dave Caswell) writes:
> 
> /* Program 1 */
> main()
> {
> 	int 	i = 500000;
> 
> 	while(--i);
> }
> 
> /* Program 2 */
> main()
> {
> 	int 	i = 500000;
> 
> 	while(i--);
> }
> 
> Does anyone have a guess why program one runs in half the time of program two?
> BTW this is a SUN 3.


Do a cc -S and you will see

Program one has an inter-loop like:

L14:
        subql   #0x1,a6@(-0x4)
        jeq     L15
        jra     L14

Program two has an inter-loop like:

L14:
        movl    a6@(-0x4),d0
        subql   #0x1,a6@(-0x4)
        tstl    d0
        jeq     L15
        jra     L14
L15:

  I actually go out of my way to write my loops like number one.

-- 

					Eddie Wyatt

e-mail: edw@ius1.cs.cmu.edu

tim@amdcad.AMD.COM (Tim Olson) (11/02/87)

In article <367@white.gcm> dc@gcm (Dave Caswell) writes:

| /* Program 1 */
| main()
| {
| 	int 	i = 500000;
| 
| 	while(--i);
| }
| 
| /* Program 2 */
| main()
| {
| 	int 	i = 500000;
| 
| 	while(i--);
| }
| 
| Does anyone have a guess why program one runs in half the time of program two?
| BTW this is a SUN 3.

Questions like this are easily answered by examining the
assembly-language generated by the compiler.  The difference is due to
the semantics of post-decrement vs pre-decrement.  The post-decrement
operator returns the value of i *before* it is decremented, while the
pre-decrement returns the value of i *after* it is decremented.  The
post-decrement compiles to something like:

	movl	#50000,a6@(-4)
.L13:
	movl	a6@(-4),d0		; copy the old value
	subql	#1,a6@(-4)		; now decrement i
	tstl	d0			; is the old value zero?
	beq	.L14
	bra	.L13
.L14:

While the pre-decrement compiles to:

	movl	#50000,a6@(-4)
.L15:
	subql	#1,a6@(-4)		; decrement i
	beq	.L16			; is it zero?
	bra	.L15
.L16:


This is yet another reason to use pre-decrement over post-decrement when
all that is wanted is the side-effect of decrementing, and the return
value is thrown away.

	-- Tim Olson
	Advanced Micro Devices
	(tim@amdcad.amd.com)

jfh@killer.UUCP (The Beach Bum) (11/03/87)

The answer is in how the compiler is going to generate the code.  Pre-
and Post- decrement require different code to be generated.

In article <367@white.gcm>, dc@gcm (Dave Caswell) writes:

> 
> /* Program 1 */
> main()
> {
> 	int 	i = 500000;
> 
> 	while(--i);
> }

[ code done by hand ... ]

main:
	link	#-4,a6
	movl	#500000,-4(a6)
1$:	subl	#1,-4(a6)
	bne	1$		| condition codes set as a side effect.
	rts
> 
> /* Program 2 */
> main()
> {
> 	int 	i = 500000;
> 
> 	while(i--);
> }

[ coded by hand again ... ]

main:
	link	#-4,a6
	movl	#50000,-4(a6)
1$:	movl	-4(a6),d0
	subl	#1,-4(a6)
	tstl	d0		| codes got trashed after the subl.

> Does anyone have a guess why program one runs in half the time of program two?
> BTW this is a SUN 3.

Sure, look at the assembler I am suggesting and the reason is obvious.  You
might want to look at the code your compiler actually generates.  In general,
don't post-anything when evaluating for side effects only.  _Unless_ you
really have to.  Micro-optimizations are a loser though.

- John.
-- 
John F. Haugh II		HECI Exploration Co. Inc.
UUCP:	...!ihnp4!killer!jfh	11910 Greenville Ave, Suite 600
"Don't Have an Oil Well?"	Dallas, TX. 75243
" ... Then Buy One!"		(214) 231-0993

thompson@dalcs.UUCP (Michael A. Thompson) (11/03/87)

In article <367@white.gcm> dc@gcm (Dave Caswell) writes:
>
>Does anyone have a guess why program one runs in half the time of program two?
>BTW this is a SUN 3.

	The same occures on our VAX 11/785 running BSD4.3:
Program 1		0.7u 0.0s 0:00 81% 1+5k 0+0io 1pf+0w
Program 1 w/-O		0.7u 0.0s 0:00 97% 1+5k 0+1io 1pf+0w
Program 2		1.1u 0.0s 0:01 97% 1+5k 0+1io 1pf+0w
Program 2 w/-O		0.9u 0.0s 0:01 71% 1+5k 0+1io 1pf+0w

	Here is the assembler output from our C compiler:
Program 1:                      Program 2:
	.			        .
	.			        .
	.			        .
L16:				L16:
	decl	-4(fp)		        movl    -4(fp),r0
	jeql	L17		        decl    -4(fp)
	jbr 	L16		        tstl    r0
L17:				        jeql    L17
	.			        jbr     L16
	.			L17:
	.			        .
				        .
				        .

Program 1 w/-O:                 Program 2 w/-O:
	.			        .
	.			        .
	.			        .
L16:decl	-4(fp)		L16:movl        -4(fp),r0
jneq	L16			decl    -4(fp)
	.			tstl    r0
	.			jneq    L16
	.			        .
				        .
				        .

	Without trying to go into why the compiler constructs it's
    loops this way, it is pretty obvious that program 2 is taking
    longer because it has more instructions to execute each time
    through the loop.
-- 
Michael A. Thompson, Dept. Math, Stats, & C.S., Dalhousie U., Halifax, N.S.
thompson@dalcs.uucp	From Bitnet or Uucp
thompson@cs.dal.cdn	From Bitnet or Cdn
thompson%dalcs.uucp@uunet.uu.net From Arpa

rick@seismo.CSS.GOV (Rick Adams) (11/04/87)

Even a mediocre compiler should be able to recognize that
	a++;
is equivalent to
	++a;
if there is no assignment, etc.

There is really no excuse for not making this trivial optimization. You should
not have to depend on the user doing it.

--rick

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (11/04/87)

In article <44177@beno.seismo.CSS.GOV>, rick@seismo.CSS.GOV (Rick Adams) writes:
> Even a mediocre compiler should be able to recognize that
> 	a++;
> is equivalent to
> 	++a;
> if there is no assignment, etc.
> 
> There is really no excuse for not making this trivial optimization. You should
> not have to depend on the user doing it.
> 
> --rick

But they are not equivalent in the context they where used  (refering back
to the original article).

i = 10

#1 while (i--) ;
#2 while (--i) ;

In number one the operation is executed 10 times.  In number two the
operation is executed 9 times.  This is because the value of
"--" applied to a var is not ignored.  #2 can be made totally equivalent to
#1 as follows:

i++;
while (--i) { /* my junk */ }

Which is what I do. 

-- 
That one looks Jewish -- And that one's a coon -- Who let all this riff raff
into the room -- There's one smoking a joint -- and Anoter with spots -- If
I had my way -- I'd have all of you shot ... Pink Floyd, "In the Flesh"
Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

tim@amdcad.AMD.COM (Tim Olson) (11/04/87)

In article <44177@beno.seismo.CSS.GOV> rick@seismo.CSS.GOV (Rick Adams) writes:
| Even a mediocre compiler should be able to recognize that
| 	a++;
| is equivalent to
| 	++a;
| if there is no assignment, etc.
| 
| There is really no excuse for not making this trivial optimization. You should
| not have to depend on the user doing it.

If a compiler made this "optimization" in the original programs to which
this discussion is referring, it would be wrong.  The return value of
the decrement operators *is* used, by the while statement.

#1:
	count = 5000;
	while (--count);	/* executes 4999 loops */
	
#2:
	count = 5000;
	while (count--);	/* executes 5000 loops */
	
	-- Tim Olson
	Advanced Micro Devices
	(tim@amdcad.amd.com)

jfh@killer.UUCP (11/04/87)

In article <44177@beno.seismo.CSS.GOV>, rick@seismo.CSS.GOV (Rick Adams) writes:
> Even a mediocre compiler should be able to recognize that
> 	a++;
> is equivalent to
> 	++a;
> if there is no assignment, etc.
> 
> There is really no excuse for not making this trivial optimization. You should
> not have to depend on the user doing it.
> 
> --rick

I really hadn't expected Rick to come up with that one.  Guess working for
the government was gone to his brain ... (Sorry, couldn't pass up a
anti-government dig ;-).

I have recapped the story for the comp.lang.c folks.

The code in question was

	while (a--)		vs.		while (--a)
		;					;

which is the same as

	while (a-- != 0)	vs.		while (--a != 0)
		;					;

the pre-decrement loop makes one less trip since the value of `a' is
decremented _prior_ to the test (not actually though).  Consider the
case where a == 1.  The --a will equal 0 and the test will fail.  In
the case of a--, the value _prior_ to the decrement is used and the
test will pass, so the loop gets executed.  To be the same, assuming
`a' is not used in the loop, you would have to code

	a++;
	while (--a)
		;

to get the same number of trips (assuming overflow was not a problem).
This would be a big win if the code in the loop was very small and
the value of `a' was large.  I wouldn't do it since the time savings
would be small, and the code a kluge.  Of course, documentation helps.

- John.
-- 
John F. Haugh II		HECI Exploration Co. Inc.
UUCP:	...!ihnp4!killer!jfh	11910 Greenville Ave, Suite 600
"Don't Have an Oil Well?"	Dallas, TX. 75243
" ... Then Buy One!"		(214) 231-0993

rick@seismo.CSS.GOV (Rick Adams) (11/05/87)

In Article: <18964@amdcad.AMD.COM> tim@amdcad.AMD.COM (Tim Olson) writes:
> This is yet another reason to use pre-decrement over post-decrement when
> all that is wanted is the side-effect of decrementing, and the return
> value is thrown away.

I was referring to this statement which specifically states that the
return value is thrown away. Of course the compiler can't do that
optimization if you are using the return value for something.

--rick

mdf@tut.cis.ohio-state.edu (Mark D. Freeman) (11/06/87)

In <44177@beno.seismo.CSS.GOV> rick@seismo.CSS.GOV (Rick Adams) writes:
>Even a mediocre compiler should be able to recognize that
>	a++;
>is equivalent to
>	++a;
>if there is no assignment, etc.
>
>There is really no excuse for not making this trivial optimization. You should
>not have to depend on the user doing it.

Forgive a possibly stupid question, but in what way is there any
difference between a++ and ++a in the generated code, even with the
optimiser disabled.

Note:  I am not arguing with you, merely asking for more information.




-- 
Mark D. Freeman							(614) 262-3703
StrongPoint Systems, Inc.			    mdf@tut.cis.ohio-state.edu
2440 Medary Avenue		 ...!cbosgd!osu-cis!tut.cis.ohio-state.edu!mdf
Columbus, OH  43202		    Guest account at The Ohio State University

daveb@geac.UUCP (11/06/87)

In article <18964@amdcad.AMD.COM> tim@amdcad.UUCP (Tim Olson) writes:
> [a discussion of the code generated by while(i--) and while (--i)
>This is yet another reason to use pre-decrement over post-decrement when
>all that is wanted is the side-effect of decrementing, and the return
>value is thrown away.
>
>	-- Tim Olson

  Er, don't you mean "when the return value **isn't** thrown away?
My compiler generates identical code for --i; and i--; when they're
independent statements, because they have the same semantics.
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

karl@haddock.ISC.COM (Karl Heuer) (11/06/87)

[Someone noted that "while (i--);" is slower than "while (--i);".  It was
pointed out that the former generates more instructions because of the extra
work to save the old value of the counter.  At least one person claimed that
the postdecrement operator should be avoided for this reason; apparently he
didn't notice that the value was being used, and that the two statements are
not semantically identical.  Rick Adams observed that, even if the value is
not being used, the user need not be concerned with this optimization.]

More than one person writes that an n-trip loop should be written
  ++n;  while (--n) ...;

I'm surprised that nobody has yet mentioned the idiom that I use:
  while (--n >= 0) ...;

This has the speed of predecrement, and avoids the kludge of having an extra
increment.  It requires n to be a signed type, but that's not usually a
problem.  (Also, it doesn't have identical semantics if n is negative, but
from the context I presume that this is not a concern.)

Of course, the other way is to write
  do ...; while (--n != 0);    [or]    do ...; while (--n > 0);
assuming n is known to be initially nonzero.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.c only.

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (11/07/87)

> 
> Forgive a possibly stupid question, but in what way is there any
> difference between a++ and ++a in the generated code, even with the
> optimiser disabled.
> 
> Note:  I am not arguing with you, merely asking for more information.
> 
> 


  In case 1 (a++) the value store in "a" may have to be saved before executing
the increment if the value of the expression is used.  In case case 2
(++a) the value stored in  "a" after the operation is preformed.
Hence no extra save must be preformed.

Example:

case 1	i = a++;

	save a in b
	increment a
	i gets  b


case 2	 i = ++a;

	increment a
	i gets a;

Note there is an optimization that can be made for the first case that is  :

	i get a
	increment a

But this optimization assumes that the order of the evaluation ++ and
assignment don't matter (however in some cases it does).

	a = 1;
	a = a++;

The unoptimized case would yields an "a" value of  1.  The optimized
case would yield an "a" value of 2.  This really shows that there
is an ambiguity in the language definition.  They should have  used
denotational semantics.
-- 
That one looks Jewish -- And that one's a coon -- Who let all this riff raff
into the room -- There's one smoking a joint -- and Anoter with spots -- If
I had my way -- I'd have all of you shot ... Pink Floyd, "In the Flesh"
Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

rbj@icst-cmr.arpa (Root Boy Jim) (12/15/87)

   From: Dave Caswell <dc@gcm>

BTW, that's `weird'.

   Does anyone have a guess why program 1 runs in half the time of program 2?

By now you have seen assembly code listings, with and without optimization,
which should make things clear. I'll attempt to address a few points
which nobody has made as of yet.

There seem to be four basic styles of while loops; one dimension is
pre-/post-decrement, the other is `while () ...;' vs `do ... while ();'

You can get n-1, n, n, and n+1 repetitions out of the constructs
`while (--n) ...;', `while (n--) ...;', `do ... while (--n)', and
`do ... while (n--);' respectively. 

I always thought that the `do ... while();' syntax was a gratuitous
addition to the language until someone pointed out that it was tailored
to the PDP-11 `sob' (subtract one and branch if *not* zero) instruction.

The VAX has (is?) an `sob' as well. So does the 680[012]0 as well, altho
it's called `dbra', a special case of the `DBcc', wher `cc' is one of
any sixteen branch conditions. Oddly, the SUN compilers don't seem to
make use of it, so what I am about to say next is irrelevant.

Another feature of the 680[12]0, but not the 68000, is what is called
`loop' mode. If a one word instruxion (oops! back to my old habits :-)
is followed by a DBcc that references it, the whole loop is executed
without refetching the two instructions (got it right this time :-).
Thus, if we could coax the compiler into using dbra, the best way to
copy register int n bytes from register char *p to register char *q
would be `do *q++ = *p++; while (--n);'.

   BTW this is a SUN 3.

So that would be a 68020. The 68020 may handle bigger loops in loop
mode as it has a bigger instruction prefetch queue.

	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688
	Isn't this my STOP?!

pac@munsell.UUCP (Paul Czarnecki) (12/22/87)

In article <10861@brl-adm.ARPA> rbj@icst-cmr.arpa (Root Boy Jim) writes:
>I always thought that the `do ... while();' syntax was a gratuitous

I had also thought that way.  But after 5 years of C I finally wrote
my first do{}while loop.  (I had to look it up!)

For one of my libraries, I treat a "picture" as a collection of
images.  They are held in a doubly linked list in the "picture"
structure.  If I want to "process" each of the images in a picture I
use this code:

        image_list_2 = image_list_1 = pic->image_list;

        /*
         * process each of the subimages 
         */

        do {
                retcode = process_image(image_list_2->image);
                if (retcode <= 0) {
                        return (EPROCESSFAILED);
                }

                image_list_2 = image_list_2->next;	/* next images */
        } while (image_list_1 != image_list_2);


There *are* some applications that a do{}while are natural for.  There
just aren't many of them.

					pZ
-- 
		       Paul Czarnecki -- Spam, spam, spam, Usenet, and spam
	{{harvard,ll-xn}!adelie,{decvax,allegra,talcott}!encore}!munsell!pz