dc@gcm.UUCP (11/02/87)
/* Program 1 */ main() { int i = 500000; while(--i); } /* Program 2 */ main() { int i = 500000; while(i--); } Does anyone have a guess why program one runs in half the time of program two? BTW this is a SUN 3.
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (11/02/87)
In article <367@white.gcm>, dc@gcm (Dave Caswell) writes: > > /* Program 1 */ > main() > { > int i = 500000; > > while(--i); > } > > /* Program 2 */ > main() > { > int i = 500000; > > while(i--); > } > > Does anyone have a guess why program one runs in half the time of program two? > BTW this is a SUN 3. Do a cc -S and you will see Program one has an inter-loop like: L14: subql #0x1,a6@(-0x4) jeq L15 jra L14 Program two has an inter-loop like: L14: movl a6@(-0x4),d0 subql #0x1,a6@(-0x4) tstl d0 jeq L15 jra L14 L15: I actually go out of my way to write my loops like number one. -- Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
tim@amdcad.AMD.COM (Tim Olson) (11/02/87)
In article <367@white.gcm> dc@gcm (Dave Caswell) writes: | /* Program 1 */ | main() | { | int i = 500000; | | while(--i); | } | | /* Program 2 */ | main() | { | int i = 500000; | | while(i--); | } | | Does anyone have a guess why program one runs in half the time of program two? | BTW this is a SUN 3. Questions like this are easily answered by examining the assembly-language generated by the compiler. The difference is due to the semantics of post-decrement vs pre-decrement. The post-decrement operator returns the value of i *before* it is decremented, while the pre-decrement returns the value of i *after* it is decremented. The post-decrement compiles to something like: movl #50000,a6@(-4) .L13: movl a6@(-4),d0 ; copy the old value subql #1,a6@(-4) ; now decrement i tstl d0 ; is the old value zero? beq .L14 bra .L13 .L14: While the pre-decrement compiles to: movl #50000,a6@(-4) .L15: subql #1,a6@(-4) ; decrement i beq .L16 ; is it zero? bra .L15 .L16: This is yet another reason to use pre-decrement over post-decrement when all that is wanted is the side-effect of decrementing, and the return value is thrown away. -- Tim Olson Advanced Micro Devices (tim@amdcad.amd.com)
jfh@killer.UUCP (The Beach Bum) (11/03/87)
The answer is in how the compiler is going to generate the code. Pre- and Post- decrement require different code to be generated. In article <367@white.gcm>, dc@gcm (Dave Caswell) writes: > > /* Program 1 */ > main() > { > int i = 500000; > > while(--i); > } [ code done by hand ... ] main: link #-4,a6 movl #500000,-4(a6) 1$: subl #1,-4(a6) bne 1$ | condition codes set as a side effect. rts > > /* Program 2 */ > main() > { > int i = 500000; > > while(i--); > } [ coded by hand again ... ] main: link #-4,a6 movl #50000,-4(a6) 1$: movl -4(a6),d0 subl #1,-4(a6) tstl d0 | codes got trashed after the subl. > Does anyone have a guess why program one runs in half the time of program two? > BTW this is a SUN 3. Sure, look at the assembler I am suggesting and the reason is obvious. You might want to look at the code your compiler actually generates. In general, don't post-anything when evaluating for side effects only. _Unless_ you really have to. Micro-optimizations are a loser though. - John. -- John F. Haugh II HECI Exploration Co. Inc. UUCP: ...!ihnp4!killer!jfh 11910 Greenville Ave, Suite 600 "Don't Have an Oil Well?" Dallas, TX. 75243 " ... Then Buy One!" (214) 231-0993
thompson@dalcs.UUCP (Michael A. Thompson) (11/03/87)
In article <367@white.gcm> dc@gcm (Dave Caswell) writes: > >Does anyone have a guess why program one runs in half the time of program two? >BTW this is a SUN 3. The same occures on our VAX 11/785 running BSD4.3: Program 1 0.7u 0.0s 0:00 81% 1+5k 0+0io 1pf+0w Program 1 w/-O 0.7u 0.0s 0:00 97% 1+5k 0+1io 1pf+0w Program 2 1.1u 0.0s 0:01 97% 1+5k 0+1io 1pf+0w Program 2 w/-O 0.9u 0.0s 0:01 71% 1+5k 0+1io 1pf+0w Here is the assembler output from our C compiler: Program 1: Program 2: . . . . . . L16: L16: decl -4(fp) movl -4(fp),r0 jeql L17 decl -4(fp) jbr L16 tstl r0 L17: jeql L17 . jbr L16 . L17: . . . . Program 1 w/-O: Program 2 w/-O: . . . . . . L16:decl -4(fp) L16:movl -4(fp),r0 jneq L16 decl -4(fp) . tstl r0 . jneq L16 . . . . Without trying to go into why the compiler constructs it's loops this way, it is pretty obvious that program 2 is taking longer because it has more instructions to execute each time through the loop. -- Michael A. Thompson, Dept. Math, Stats, & C.S., Dalhousie U., Halifax, N.S. thompson@dalcs.uucp From Bitnet or Uucp thompson@cs.dal.cdn From Bitnet or Cdn thompson%dalcs.uucp@uunet.uu.net From Arpa
rick@seismo.CSS.GOV (Rick Adams) (11/04/87)
Even a mediocre compiler should be able to recognize that a++; is equivalent to ++a; if there is no assignment, etc. There is really no excuse for not making this trivial optimization. You should not have to depend on the user doing it. --rick
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (11/04/87)
In article <44177@beno.seismo.CSS.GOV>, rick@seismo.CSS.GOV (Rick Adams) writes: > Even a mediocre compiler should be able to recognize that > a++; > is equivalent to > ++a; > if there is no assignment, etc. > > There is really no excuse for not making this trivial optimization. You should > not have to depend on the user doing it. > > --rick But they are not equivalent in the context they where used (refering back to the original article). i = 10 #1 while (i--) ; #2 while (--i) ; In number one the operation is executed 10 times. In number two the operation is executed 9 times. This is because the value of "--" applied to a var is not ignored. #2 can be made totally equivalent to #1 as follows: i++; while (--i) { /* my junk */ } Which is what I do. -- That one looks Jewish -- And that one's a coon -- Who let all this riff raff into the room -- There's one smoking a joint -- and Anoter with spots -- If I had my way -- I'd have all of you shot ... Pink Floyd, "In the Flesh" Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
tim@amdcad.AMD.COM (Tim Olson) (11/04/87)
In article <44177@beno.seismo.CSS.GOV> rick@seismo.CSS.GOV (Rick Adams) writes: | Even a mediocre compiler should be able to recognize that | a++; | is equivalent to | ++a; | if there is no assignment, etc. | | There is really no excuse for not making this trivial optimization. You should | not have to depend on the user doing it. If a compiler made this "optimization" in the original programs to which this discussion is referring, it would be wrong. The return value of the decrement operators *is* used, by the while statement. #1: count = 5000; while (--count); /* executes 4999 loops */ #2: count = 5000; while (count--); /* executes 5000 loops */ -- Tim Olson Advanced Micro Devices (tim@amdcad.amd.com)
jfh@killer.UUCP (11/04/87)
In article <44177@beno.seismo.CSS.GOV>, rick@seismo.CSS.GOV (Rick Adams) writes: > Even a mediocre compiler should be able to recognize that > a++; > is equivalent to > ++a; > if there is no assignment, etc. > > There is really no excuse for not making this trivial optimization. You should > not have to depend on the user doing it. > > --rick I really hadn't expected Rick to come up with that one. Guess working for the government was gone to his brain ... (Sorry, couldn't pass up a anti-government dig ;-). I have recapped the story for the comp.lang.c folks. The code in question was while (a--) vs. while (--a) ; ; which is the same as while (a-- != 0) vs. while (--a != 0) ; ; the pre-decrement loop makes one less trip since the value of `a' is decremented _prior_ to the test (not actually though). Consider the case where a == 1. The --a will equal 0 and the test will fail. In the case of a--, the value _prior_ to the decrement is used and the test will pass, so the loop gets executed. To be the same, assuming `a' is not used in the loop, you would have to code a++; while (--a) ; to get the same number of trips (assuming overflow was not a problem). This would be a big win if the code in the loop was very small and the value of `a' was large. I wouldn't do it since the time savings would be small, and the code a kluge. Of course, documentation helps. - John. -- John F. Haugh II HECI Exploration Co. Inc. UUCP: ...!ihnp4!killer!jfh 11910 Greenville Ave, Suite 600 "Don't Have an Oil Well?" Dallas, TX. 75243 " ... Then Buy One!" (214) 231-0993
rick@seismo.CSS.GOV (Rick Adams) (11/05/87)
In Article: <18964@amdcad.AMD.COM> tim@amdcad.AMD.COM (Tim Olson) writes: > This is yet another reason to use pre-decrement over post-decrement when > all that is wanted is the side-effect of decrementing, and the return > value is thrown away. I was referring to this statement which specifically states that the return value is thrown away. Of course the compiler can't do that optimization if you are using the return value for something. --rick
mdf@tut.cis.ohio-state.edu (Mark D. Freeman) (11/06/87)
In <44177@beno.seismo.CSS.GOV> rick@seismo.CSS.GOV (Rick Adams) writes: >Even a mediocre compiler should be able to recognize that > a++; >is equivalent to > ++a; >if there is no assignment, etc. > >There is really no excuse for not making this trivial optimization. You should >not have to depend on the user doing it. Forgive a possibly stupid question, but in what way is there any difference between a++ and ++a in the generated code, even with the optimiser disabled. Note: I am not arguing with you, merely asking for more information. -- Mark D. Freeman (614) 262-3703 StrongPoint Systems, Inc. mdf@tut.cis.ohio-state.edu 2440 Medary Avenue ...!cbosgd!osu-cis!tut.cis.ohio-state.edu!mdf Columbus, OH 43202 Guest account at The Ohio State University
daveb@geac.UUCP (11/06/87)
In article <18964@amdcad.AMD.COM> tim@amdcad.UUCP (Tim Olson) writes: > [a discussion of the code generated by while(i--) and while (--i) >This is yet another reason to use pre-decrement over post-decrement when >all that is wanted is the side-effect of decrementing, and the return >value is thrown away. > > -- Tim Olson Er, don't you mean "when the return value **isn't** thrown away? My compiler generates identical code for --i; and i--; when they're independent statements, because they have the same semantics. -- David Collier-Brown. {mnetor|yetti|utgpu}!geac!daveb Geac Computers International Inc., | Computer Science loses its 350 Steelcase Road,Markham, Ontario, | memory (if not its mind) CANADA, L3R 1B3 (416) 475-0525 x3279 | every 6 months.
karl@haddock.ISC.COM (Karl Heuer) (11/06/87)
[Someone noted that "while (i--);" is slower than "while (--i);". It was pointed out that the former generates more instructions because of the extra work to save the old value of the counter. At least one person claimed that the postdecrement operator should be avoided for this reason; apparently he didn't notice that the value was being used, and that the two statements are not semantically identical. Rick Adams observed that, even if the value is not being used, the user need not be concerned with this optimization.] More than one person writes that an n-trip loop should be written ++n; while (--n) ...; I'm surprised that nobody has yet mentioned the idiom that I use: while (--n >= 0) ...; This has the speed of predecrement, and avoids the kludge of having an extra increment. It requires n to be a signed type, but that's not usually a problem. (Also, it doesn't have identical semantics if n is negative, but from the context I presume that this is not a concern.) Of course, the other way is to write do ...; while (--n != 0); [or] do ...; while (--n > 0); assuming n is known to be initially nonzero. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint Followups to comp.lang.c only.
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (11/07/87)
> > Forgive a possibly stupid question, but in what way is there any > difference between a++ and ++a in the generated code, even with the > optimiser disabled. > > Note: I am not arguing with you, merely asking for more information. > > In case 1 (a++) the value store in "a" may have to be saved before executing the increment if the value of the expression is used. In case case 2 (++a) the value stored in "a" after the operation is preformed. Hence no extra save must be preformed. Example: case 1 i = a++; save a in b increment a i gets b case 2 i = ++a; increment a i gets a; Note there is an optimization that can be made for the first case that is : i get a increment a But this optimization assumes that the order of the evaluation ++ and assignment don't matter (however in some cases it does). a = 1; a = a++; The unoptimized case would yields an "a" value of 1. The optimized case would yield an "a" value of 2. This really shows that there is an ambiguity in the language definition. They should have used denotational semantics. -- That one looks Jewish -- And that one's a coon -- Who let all this riff raff into the room -- There's one smoking a joint -- and Anoter with spots -- If I had my way -- I'd have all of you shot ... Pink Floyd, "In the Flesh" Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
rbj@icst-cmr.arpa (Root Boy Jim) (12/15/87)
From: Dave Caswell <dc@gcm> BTW, that's `weird'. Does anyone have a guess why program 1 runs in half the time of program 2? By now you have seen assembly code listings, with and without optimization, which should make things clear. I'll attempt to address a few points which nobody has made as of yet. There seem to be four basic styles of while loops; one dimension is pre-/post-decrement, the other is `while () ...;' vs `do ... while ();' You can get n-1, n, n, and n+1 repetitions out of the constructs `while (--n) ...;', `while (n--) ...;', `do ... while (--n)', and `do ... while (n--);' respectively. I always thought that the `do ... while();' syntax was a gratuitous addition to the language until someone pointed out that it was tailored to the PDP-11 `sob' (subtract one and branch if *not* zero) instruction. The VAX has (is?) an `sob' as well. So does the 680[012]0 as well, altho it's called `dbra', a special case of the `DBcc', wher `cc' is one of any sixteen branch conditions. Oddly, the SUN compilers don't seem to make use of it, so what I am about to say next is irrelevant. Another feature of the 680[12]0, but not the 68000, is what is called `loop' mode. If a one word instruxion (oops! back to my old habits :-) is followed by a DBcc that references it, the whole loop is executed without refetching the two instructions (got it right this time :-). Thus, if we could coax the compiler into using dbra, the best way to copy register int n bytes from register char *p to register char *q would be `do *q++ = *p++; while (--n);'. BTW this is a SUN 3. So that would be a 68020. The 68020 may handle bigger loops in loop mode as it has a bigger instruction prefetch queue. (Root Boy) Jim Cottrell <rbj@icst-cmr.arpa> National Bureau of Standards Flamer's Hotline: (301) 975-5688 Isn't this my STOP?!
pac@munsell.UUCP (Paul Czarnecki) (12/22/87)
In article <10861@brl-adm.ARPA> rbj@icst-cmr.arpa (Root Boy Jim) writes: >I always thought that the `do ... while();' syntax was a gratuitous I had also thought that way. But after 5 years of C I finally wrote my first do{}while loop. (I had to look it up!) For one of my libraries, I treat a "picture" as a collection of images. They are held in a doubly linked list in the "picture" structure. If I want to "process" each of the images in a picture I use this code: image_list_2 = image_list_1 = pic->image_list; /* * process each of the subimages */ do { retcode = process_image(image_list_2->image); if (retcode <= 0) { return (EPROCESSFAILED); } image_list_2 = image_list_2->next; /* next images */ } while (image_list_1 != image_list_2); There *are* some applications that a do{}while are natural for. There just aren't many of them. pZ -- Paul Czarnecki -- Spam, spam, spam, Usenet, and spam {{harvard,ll-xn}!adelie,{decvax,allegra,talcott}!encore}!munsell!pz