chris@mimsy.UUCP (04/04/87)
>	while (wy--) {
>		j = wx;
>		while (j--) {

(The above is an inner loop.)  Such loops should usually (always?)
be written as

	while (--j >= 0)

---assuming j is signed.  Why?  They both do the same thing, but a
dumb compiler will turn the former into `move j to tmp; decrement j;
test tmp; branch if zero', while the same dumb compiler will turn
the latter into `decrement j; branch if negative'.  Details will
vary depending on condition codes, but the former is often four
instructions, and the latter two.  On a Vax, the second version is
sometimes a single instruction.

Of course, a smart compiler will generate the same code for both.
That is wonderful---if you have a smart compiler.  Better check!
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP: seismo!mimsy!chris
ARPA/CSNet: chris@mimsy.umd.edu
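A minimal sketch in C of the two loop forms compared above, assuming a signed counter that starts at zero or above. The function names `count_postdec` and `count_predec` are made up for illustration and do not come from the thread. For a non-negative start both forms run the body the same number of times and leave j at -1; for a negative start only the `--j >= 0` form exits at once.

```c
/* Count iterations of the post-decrement form: while (j--). */
int count_postdec(int n)
{
    int j = n;              /* assumed non-negative */
    int iterations = 0;

    while (j--)             /* tests the old value, then decrements */
        iterations++;
    return iterations;
}

/* Count iterations of the pre-decrement form: while (--j >= 0). */
int count_predec(int n)
{
    int j = n;
    int iterations = 0;

    while (--j >= 0)        /* decrements, then tests the sign */
        iterations++;
    return iterations;
}
```

Calling both with the same non-negative n should give the same count. Whether the generated code differs depends on the compiler, so the assembly output is still worth checking.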
mcvoy@uwvax.UUCP (04/05/87)
(Chris Torek) writes:
>>	while (wy--) {
>>		j = wx;
>>		while (j--) {
>
>(The above is an inner loop.)  Such loops should usually (always?)
>be written as
>
>	while (--j >= 0)
>
>---assuming j is signed.  Why?  They both do the same thing, but
> [good justification that (--j >= 0) is faster]

I'm not sure that it is worth everyone's time to be thinking about
this.  And I speak from a certain amount of experience, to wit: last
semester I had to write a fake TCP/IP from scratch and I wanted to
make a fast implementation.  Over the course of the semester I
gathered a directory full of t.{c,s} files where I was looking at
exactly this sort of thing.  I really wanted fast code.

As it turned out, my implementation did not gain much at all from
all this extra work - mainly because it was just not used.  The old
"90% of the time is spent in 10% of the code" saw applies.  I would
have been much better off profiling my code and rewriting the
bottlenecks.

Please don't take this the wrong way - Chris is a smart guy, and
he's right in a technical sense.  And I get a warm fuzzy feeling
from writing my code in an efficient manner too.  It's just that I
think it's misleading to appear worried about this sort of thing in
a general sense - it really belongs in the "profiling code" section,
not "general programming tips".

Kernighan and Plauger say "Premature Optimization is the root of all
evil".  I think this is a bit extreme, but I agree in principle.

Food for thought (?),

--larry
-- 
Larry McVoy 	        mcvoy@rsch.wisc.edu  or  uwvax!mcvoy

"It's a joke, son! I say, I say, a Joke!!"  --Foghorn Leghorn
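One rough way to act on the "measure first" advice is to time both loop forms before deciding whether the difference matters. The sketch below is a crude stand-in for a real profiler, using only the standard `clock()` function. The workload and the names `sum_postdec`, `sum_predec` and `time_loop` are hypothetical, chosen for illustration.

```c
#include <time.h>

/* Hypothetical workload: sum 0..n-1 using the post-decrement loop. */
long sum_postdec(int n)
{
    long sum = 0;
    int j = n;              /* assumed non-negative */

    while (j--)
        sum += j;
    return sum;
}

/* The same workload using the pre-decrement loop. */
long sum_predec(int n)
{
    long sum = 0;
    int j = n;

    while (--j >= 0)
        sum += j;
    return sum;
}

/* Return the CPU seconds consumed by `reps` calls of fn(n). */
double time_loop(long (*fn)(int), int n, int reps)
{
    volatile long sink = 0; /* keeps the calls from being optimized away */
    clock_t start = clock();
    int r;

    for (r = 0; r < reps; r++)
        sink += fn(n);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Comparing `time_loop(sum_postdec, n, reps)` with `time_loop(sum_predec, n, reps)` gives only a coarse indication. A profiler run on the whole program says more about where the time actually goes.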
chris@mimsy.UUCP (04/05/87)
In article <6134@mimsy.UUCP> I wrote:
>[use] while (--j >= 0) [rather than while (j--)]

I poked around today and discovered that Sun's compiler, at least,
will turn

	register short j;	/* but not `register int j' */
	while (--j != -1)
		...

into a `dbra' loop.  If you are willing to put machine dependent
source optimisations into your C code, this might be something to
consider (at least for inner loops).

Anyway, it is a good idea to profile, tune, compile to assembly
code, and sometimes even hand-tweak the results, in speed-critical
routines.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP: seismo!mimsy!chris
ARPA/CSNet: chris@mimsy.umd.edu
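A small sketch of the `short` counter form described above, written as portable C so its behavior can be checked on any machine. The function name `count_short_loop` and the guard against a negative start are assumptions added for illustration. The guard is there because `--j != -1` does not stop promptly if j starts below zero. Note also that a `short` counter limits the trip count to what a `short` can hold (at least 32767).

```c
/* Run the `--j != -1` form with a short counter and count the iterations. */
int count_short_loop(short n)
{
    short j = n;            /* the post declares this `register short j' */
    int iterations = 0;

    if (n < 0)              /* guard: a negative start would skip past -1 */
        return 0;
    while (--j != -1)       /* stops when j steps from 0 to -1 */
        iterations++;
    return iterations;
}
```

Whether a particular compiler turns this into a `dbra` loop is machine and compiler dependent; the sketch only shows that the source form runs the expected number of times.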
jon@eps2.UUCP (04/07/87)
In article <6139@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In article <6134@mimsy.UUCP> I wrote:
> >[use] while (--j >= 0) [rather than while (j--)]
>
> I poked around today and discovered that Sun's compiler, at least,
> will turn
>
>	register short j;	/* but not `register int j' */
>	while (--j != -1)
>		...
>
> into a `dbra' loop.  If you are willing to put machine dependent
> source optimisations into your C code, this might be something to
> consider (at least for inner loops).

On this Sun-3/160 running 3.2, the compiler won't generate the dbra.
However, the object code optimizer will.  But the only way I know of
to find this out is to adb the executable or a .o (I know I am not
as clever as you, am I missing something?  I didn't think you could
get an optimized .s).  One nice thing about the Green Hills compiler
that Intergraph uses was that you could look at optimized .s files.

Interestingly enough, this incredibly old Alcyon C compiler I
sometimes use generates the dbras.  Another way to get the dbra
instruction from the Sun optimizer is to use:

	register short i;

	for (i = 10; --i != -1;)	/* the Alcyon compiler will do it too */

And as we all know, the dbra is especially important on the 68010
and 68012 because with the right instruction in the loop, you can
get it into loop mode.

Jonathan Hue
DuPont Design Technologies/Via Visuals
leadsv!eps2!jon
david@sun.UUCP (04/08/87)
In article <75@eps2.UUCP> jon@eps2.UUCP (Jonathan Hue) writes:
>On this Sun-3/160 running 3.2, the compiler won't generate the dbra.  However,
>the object code optimizer will.  But the only way I know of to find this out
>is to adb the executable or a .o (I know I am not as clever as you, am I
>missing something?  I didn't think you could get an optimized .s).

	cc -O -S foo.c

In article <6139@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In article <6134@mimsy.UUCP> I wrote:
> >[use] while (--j >= 0) [rather than while (j--)]
>
> I poked around today and discovered that Sun's compiler, at least,
> will turn
>
>	register short j;	/* but not `register int j' */
>	while (--j != -1)
>		...
>
> into a `dbra' loop.  If you are willing to put machine dependent
> source optimisations into your C code, this might be something to
> consider (at least for inner loops).

Be a good citizen and hide it with macros...

#ifdef mc68000
typedef short LOOP_T;
#define LOOP_DECR(var) (--(var) != -1)
#else
typedef int LOOP_T;
#define LOOP_DECR(var) (--(var) >= 0)
#endif

	register LOOP_T j;

	while (LOOP_DECR(j))
		something;

which leads us to the mystic loop macro ...

#define LOOP(count, op) do {
	register LOOP_T _loop = (count);
	if (--_loop >= 0) do {
		op;
	} while (LOOP_DECR(_loop));
} while (0)

(end of line backslashes omitted for clarity)
-- 
David DiGiacomo, Sun Microsystems, Mt. View, CA  sun!david david@sun.com
Disclaimer: blah blah blah
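For anyone who wants to compile the macros from this post, here is a sketch with the line-continuation backslashes put back. The macro bodies follow the post; the helper `repeat_add` is a hypothetical caller added only to exercise `LOOP` and is not part of the original.

```c
#ifdef mc68000
typedef short LOOP_T;                   /* 16-bit counter, suits dbra */
#define LOOP_DECR(var) (--(var) != -1)
#else
typedef int LOOP_T;
#define LOOP_DECR(var) (--(var) >= 0)
#endif

/* Execute `op` exactly `count` times; does nothing if count <= 0. */
#define LOOP(count, op)                     \
    do {                                    \
        register LOOP_T _loop = (count);    \
        if (--_loop >= 0)                   \
            do {                            \
                op;                         \
            } while (LOOP_DECR(_loop));     \
    } while (0)

/* Hypothetical helper: add `value` to an accumulator `times` times. */
int repeat_add(int times, int value)
{
    int total = 0;

    LOOP(times, total += value);
    return total;
}
```

The initial `if (--_loop >= 0)` test is what keeps a zero or negative count from entering the `!= -1` loop on the 68000 branch. An `op` containing a top-level comma would be split into extra macro arguments, so it needs its own parentheses.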
mark@markshome (mark weiser) (04/09/87)
In article <75@eps2.UUCP> jon@eps2.UUCP (Jonathan Hue) writes:
>...On this Sun-3/160 running 3.2, the compiler won't generate the dbra.  However,
>the object code optimizer will.  But the only way I know of to find this out
>is to adb the executable or a .o (I know I am not as clever as you, am I
>missing something?  I didn't think you could get an optimized .s).

On this Sun-3/75 running SunOS 3.2, using -O and -S together shows
me the optimized assembly in the .s file.
-mark

Spoken: Mark Weiser
ARPA:   mark@mimsy.umd.edu
Phone:  +1-301-454-7817
After May 1, 1987: weiser@xerox.com