hamilton@siberia.rtp.dg.com (Eric Hamilton) (06/15/91)
In article <1991Jun14.134338.4673@linus.mitre.org>, bs@frieda.mitre.org (Robert D. Silverman) writes: |> : ibig = absdx .le. dmax |> ^^^^^^^^^^^^^^^^^^^^^^^^ |> |> I would be very interested in seeing the assembler code that gets |> emitted for this line of Fortran. How can this statement get executed |> WITHOUT a branch?? |> |> I also have never seen a machine that has conditional assignment |> implemented as an instruction! (As indicated above). What machine is |> this? Could you show us the assembler syntax?? |> |> I ** believe ** that the above conditional assignment will get compiled |> into a test and branch. I don't know of any machine that provides |> instructions to do otherwise. |> Motorola 88000, for one. The trick is that the compare instruction puts the condition code bit vector into a general register instead of a dedicated condition code register, so that the bitfield instructions can be used on the condition codes. For example: cmp scratch,absdx,dmax ; Compare; condition codes to scratch extu ibig,scratch,1<le> ; Extract the le bit into ibig, ; right-adjusted. The condition code bit vector is also redundant, in the sense that if any bit is set, there is another bit, in this case greater-than, which is known to be clear, and vice versa. In many cases it turns out the redundancy is useful. It's certainly near-as-no-never-mind free. The ibig = absdx .le. dmax example just barely scratches the surface. For example: if( a condition b ) i = i + 1; can be rendered as: cmp rscratch,x,y extu rscratch,rscratch,1<condition> addu i,i,rscratch and lots more. I'll leave it to the compiler folks to post the most aggressively clever example. -- ---------------------------------------------------------------------- Eric Hamilton +1 919 248 6172 Data General Corporation hamilton@dg-rtp.rtp.dg.com 62 Alexander Drive ...!mcnc!rti!xyzzy!hamilton Research Triangle Park, NC 27709, USA
glew@pdx007.intel.com (Andy Glew) (06/17/91)
|> : ibig = absdx .le. dmax |> ^^^^^^^^^^^^^^^^^^^^^^^^ |> |> I would be very interested in seeing the assembler code that gets |> emitted for this line of Fortran. How can this statement get executed |> WITHOUT a branch?? The Intel386(tm) architecture has SETcc (set on condition) instructions that put 0/1 into a register depending on the condition codes. So you would have: cmp ebx,ecx ! assuming absdx in ebx, dmax in ecx setle al ! put 0/1 into al Set-conditional is rather common - I believe it is in the MIPS R2000 architecture (and I am sure I will be corrected if wrong), and the AMD 29K architecture. Minor differences according to whether 0/1 or 0/-1 is stored. |> I also have never seen a machine that has conditional assignment |> implemented as an instruction! (As indicated above). What machine is |> this? Could you show us the assembler syntax?? The Cydra-5 had a conditional assignment instruction (like the phi function in single assignment code). Semantics: dest := IF icr[N] THEN val1 ELSE val2. Cydra also conditionalized nearly all instructions, as does the Acorn Risk machine, although I do not know if the ARM has a conditional select. Note that with a bit of masking conditional select can be acheived on the 88K, as on most other machines. |> I ** believe ** that the above conditional assignment will get compiled |> into a test and branch. I don't know of any machine that provides |> instructions to do otherwise. I have seen some of the better x86 compilers compile this without a branch. -- Andy Glew, glew@ichips.intel.com Intel Corp., M/S JF1-19, 5200 NE Elam Young Parkway, Hillsboro, Oregon 97124-6497 This is a private posting; it does not indicate opinions or positions of Intel Corp.
mac@gold.kpc.com (Mike McNamara) (06/21/91)
|> : ibig = absdx .le. dmax |> ^^^^^^^^^^^^^^^^^^^^^^^^ |> |> I would be very interested in seeing the assembler code that gets |> emitted for this line of Fortran. How can this statement get executed |> WITHOUT a branch?? Firstly, the MIPS instruction set has SLT, SLTI, SLTU and SLTIU, which set to 1 the destination register if the first argument is less than the second. Otherwise the destination is set to 0. U requests an unsigned comparison; I indicates the second argument is a 16 bit immediate. Your code snippet does not contain declarations, but one is probably correct in assuming that you are implicitly allowing absdx and dmax to be typed as REALS, inwhich case the MIPS instructions would not work (integer only) Most vector machines, (Ardent Titan, Convex, etc) have a vector compare instruction that sets or clears bits in any specified register based upon the compare; further, other operations can be specified to operate "undermask" of that register. This facilitates vectorization of loops containing conditionals: DO I = 1,N IF ( X(I) .GT. Y(I) ) X(I) = Y(I) * Z(I) + K ENDIF ENDDO becomes sw N,<vlength> DVLDA X,0(addr_X) DVLDB Y,0(addr_Y) DVCGT [M],X,Y DVLDA Z,0(addr_Z) ! could be under mask as well... DVMA,mt X,X,Y,[K] DVST,mt X,0(addr_X) on an ardent titan. Ah, yes, Andy brings up the beloved cydrome machine. I introduced the intrinsic SELECT(EXP,A,B) to it's fortran compiler. *Amazing* speed ups were achieved especially on loops with internal conditional codes. e.g., vector intrinsics. I was able to get a 17x improvement in ATAN2. I find myself today still "thinking" in the predicated vliw paradigm. To be fair, the hardware model the cydra provided facilitates the software ideas in the control flow graph papers from IBM. Those of you at the most recent ASPLOS saw how the cydrome architecture implements software pipelining in hardware; the predicated operations greatly facilitate the control graph optimizations as well. The one incompleteness of the cydra was predicated exceptions. I.E., this instruction is PROMPTED: i.e., been moved out of it's containing block, so if it generates an exception, just set a bit in the destination register. When a non promoted instruction uses a value from a register that has the exception bits set, THEN generate the fault... Ah... musings... -mac -- +-----------+-----------------------------------------------------------------+ |mac@kpc.com| Increasing Software complexity lets us sell Mainframes as | | | personal computers. Carry on, X windows/Postscript/emacs/CASE!! | +-----------+-----------------------------------------------------------------+
dwells@fits.cx.nrao.edu (Don Wells) (06/22/91)
In article <MAC.91Jun21094148@gold.kpc.com> mac@gold.kpc.com (Mike McNamara) writes: ... Most vector machines, (Ardent Titan, Convex, etc) have a vector compare instruction that sets or clears bits in any specified register based upon the compare; further, other operations can be specified to operate "undermask" of that register. This facilitates vectorization of loops containing conditionals... In most such machines one can use the bits to mask the "compress" of a vector of values 1...n by one means or another. The resulting vector is a list of indicies of the loop for which the condition was satisfied (or not satisfied). In such machines the list can then be used as an index vector for load and store (gather, scatter) operations. This technique is especially advantageous when the truth ratio (probability of test being satisfied) is low, or when the amount of work (especially load/store) being done in the loop is large. This approach is a part of standard bag of tricks for high performance programming of conditionals on vector boxes. Some compilers, notably the Fujitsu, produce such code automatically for loops preceded by a suitable truth ratio compiler directive, based on a execution cost analysis. It is even practical to use the approach on concurrent vector machines like the Alliants (yes, a concurrent vector compress!). Perhaps in this era of supersonic RISC engines nobody cares anymore about these old vector programming tactics... -- Donald C. Wells Associate Scientist dwells@nrao.edu National Radio Astronomy Observatory +1-804-296-0277 Edgemont Road Fax= +1-804-296-0278 Charlottesville, Virginia 22903-2475 USA 78:31.1W, 38:02.2N
edwardm@hpcuhe.cup.hp.com (Edward McClanahan) (06/25/91)
|> : ibig = absdx .le. dmax |> ^^^^^^^^^^^^^^^^^^^^^^^^ |> |> I would be very interested in seeing the assembler code that gets |> emitted for this line of Fortran. How can this statement get executed |> WITHOUT a branch?? |> |> I also have never seen a machine that has conditional assignment |> implemented as an instruction! (As indicated above). What machine is |> this? Could you show us the assembler syntax?? |> |> I ** believe ** that the above conditional assignment will get compiled |> into a test and branch. I don't know of any machine that provides |> instructions to do otherwise. Just to add another example... HP's RISC CPU (and many others) use conditionals to NULLIFY the next instruction. Here is how the HP compiler 'could' translate the above statement: Let's assume the compiler has made the following register assignments: R19 - ibig R20 - absdx R21 - dmax The following three instruction sequence will store a ZERO in R19 (ibig) if absdx > dmax (assuming that 1 means TRUE; I'm not a FORTRAN person) and a ONE if absdx <= dmax: LDI 0,R19 ; Set ibig to 'default' of FALSE SUB,> R20,R21,R0 ; 'Compare' absdx to dmax and NULLIFY next instruction ; if absdx > dmax ; Result of subtraction is thrown away when sent to R0 LDI 1,R19 ; If not NULLIFIED by 'SUB,>', set ibig to TRUE Of course some may argue that this is really a branch... =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Edward McClanahan Hewlett Packard Company -or- edwardm@cup.hp.com Mail Stop 42UN 11000 Wolfe Road Phone: (408)447-5651 Cupertino, CA 95014 Fax: (408)447-5039
hrubin@pop.stat.purdue.edu (Herman Rubin) (06/25/91)
This is the old conditional skip, unused in most "modern" architectures. This works well on sequential machines, but has it been implemented efficiently on the faster non-sequential ones? -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet) {purdue,pur-ee}!l.cc!hrubin(UUCP)
mjs@hpfcso.FC.HP.COM (Marc Sabatella) (06/25/91)
>HP's RISC CPU (and many others) use conditionals to NULLIFY the next >instruction. Here is how the HP compiler 'could' translate the above >statement: >... >Of course some may argue that this is really a branch... No, I think part of the point is that a nullify is not a branch - in particular, it never interferes with prefetch, nor does it alter locality. -------------- Marc Sabatella (marc@hpmonk.fc.hp.com) Disclaimers: 2 + 2 = 3, for suitably small values of 2 Bill and Dave may not always agree with me
jdg@cs.man.ac.uk (Jim Garside) (06/26/91)
Since no one has mentioned it, the ARM - a modern RISC processor - has conditional execution of *all* its instructions. Jim Garside