chris@mimsy.UUCP (Chris Torek) (06/29/88)
In article <7570@boring.cwi.nl> dik@cwi.nl (Dik T. Winter) writes: >(What is signum?) It has other names, but I prefer signum to avoid confusion with sine, etc. { -1 x < 0 signum(x) { 0 x = 0 { 1 x > 0 >Not so surprising for min and max (and minmax). However, they require a >sign extending shift, so that implementation is not possible on all >machines. No, they use clever tricks with the carry bit (these are for unsigned values). Indeed, this implementation is not possible on some machines (VAXen have subtract-with-carry but have no AND instruction; others have no subtract-with-carry), which is why you want some automatic way of converting unsigned-min(x,y) into actual code. >Also, if a branch takes only one cycle (with delay slot), >you do not gain anything (in general). On the 68020 the no-branch versions win in terms of total cycles. min() looks like this: | input: d0=x, d1=y, both unsigned; d2 is scratch | output: d0=min(x, y) sub.l d1,d0 | d0 = x-y subx.l d2,d2 | d2 = 111...111 if x<y, else 000...000 and.l d2,d0 | d0 = x-y if x<y, else 0 add.l d1,d0 | d0 = x-y+y=x if x<y, else y -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
leichter@yale.UUCP (Jerry Leichter) (07/05/88)
It really should be pointed out that these techniques are by no means new. I saw them used back around 1971 on a CDC 6600. On a 6600, branches were VERY expensive relative to almost anything else. (The basic unit was the minor cycle, 100ns. A Boolean operation was 3 minor cycles; an FP add was, I think, 8. A branch was 12 or 14, depending on whether it was taken or not, plus possible memory latency.) So techniques like this were very well developed. I don't recall ever seeing signum written out this way, though it's obvious enough, but everyone who knew about good coding practices knew how to do ABS or MAX with no branches. (Note that a 6600 is a ones-com- plement machine, so the code is simpler.) In fact, the FORTRAN compiler used these techniques for generating code! There was also the "take two 60-bit words thought of as 10 6-bit BCD numbers and add them without a loop" hack which will teach you why there are 26 letters in the English alphabet. (There have to be. You reserve 00, then lay out all the letters, then the digits. That puts 0 at the only place it can be to make this trick go through without pre- and post-processing.) (No, I won't reconstruct the whole thing here. The trick is to set things up so that all carries beyond 9 end up setting or clearing a fixed bit in the 6-bit representation. A mask extracts those bits, controlling where the carries have to be "added back in".) -- Jerry