mccalpin@vax1.udel.edu (MCCALPIN) (7 Jun 90 08:46:40 -0400) (06/08/90)
In article <1990Jun7.010349.2097@esegue.segue.boston.ma.us> James Larus writes:
>Let's make this argument more concrete.  Consider the example of summing
>the elements of a vector:
>	for i <- 1 to n do
>	    sum <- sum + A[i]
>....on a parallel computer it can [use] a parallel prefix algorithm that
>builds up a tree of additions.  The problem is that this optimization
>requires addition to be associative.  Integer and floating point addition
>isn't associative.  However, how many programmers have carefully analyzed
>their algorithm and convinced themselves that summing the elements from
>1 to N will produce the least error?

This code will already be modified by a number of current compilers.  I
have seen the following "interpretations" of the above on existing
compilers:

(1) Calculate the sum in 80-bit precision on the 8087 stack and truncate
    on completion of the loop.

(2) Add the two numbers in 80-bit precision, then truncate, store, and
    re-load the 'sum' variable at each iteration of the loop.

(3) Add the numbers in order in unnormalized 128-bit floating-point
    format.  (Cyber 205)

(4) Unroll the loop to some depth (typically 4) and calculate:  (MIPS?)

	for i <- 1 to n by 4 do
	    sum <- sum + A[i] + A[i+1] + A[i+2] + A[i+3]

(5) Stripmine the loop into groups of 64 and accumulate 64 partial sums
    separately:  (Cray 1)

	set temp[i] = 0.0, i = 1..64
	for isect <- 0 to n/64-1 do
	    for i <- 1 to 64 do
		temp[i] <- temp[i] + A[i + isect*64]
	    end_for
	end_for
	for i <- 1 to 64 do
	    sum <- sum + temp[i]
	end_for

So we might as well forget about the problems with the non-associativity
of FP addition, since the compilers are going to do the calculations in a
different order than the one we chose anyway!  Unless optimization or
vectorization is turned off, you cannot count on any two machines adding
that simple loop in the same order or with the same rounding/truncation
-- even if they nominally use the same floating-point rules and
precisions.
>Of course, some languages such as Fortran 90 permit a programmer to
>express his intent in a manner that gives the compiler information about
>when order matters.  Unfortunately, such languages are rare and only
>include a few common idioms.

Fortran-90 actually lets you express *less* about the actual order of the
calculations, by allowing you to specify that order does not change the
*mathematical* result.  The array syntax includes the implicit assumption
that data dependencies do not exist, so the calculations can proceed in
any order.  This really does nothing for the problem of associativity.
It does allow a much nicer interface to the data-parallel programming
paradigm.

>[I suppose there is merit to this suggestion, but the thought of an
>optimizer that second-guesses the programmer and attempts to generate
>code for what the program would have said if the source language was
>different fills me with considerable unease. -John]

But that is exactly what vectorizers do, and the better ones do a pretty
good job.  The bad ones, though....  (I guess I shouldn't mention any
brand names for fear of reprisals!)

On a more philosophical note, this direction is precisely the one being
taken by the proponents of higher-level programming languages and
environments.  In order for computers to be really helpful extensions of
the human intellect, we will have to give them increasing autonomy in
choosing exactly how to go about solving a problem.  I understand that
there is an active research program at Purdue on expert systems for
partial differential equations that is attempting to get the computer to
choose (or advise the user to choose) the discretization strategy and
algorithm to solve a specified partial differential equation.  That is a
system that would probably fill John with even more "considerable
unease"!
--
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.
pardo@cs.washington.edu (David Keppel) (06/08/90)
MCCALPIN <mccalpin@vax1.udel.edu> writes:
>[``ambiguous'' approach advocated by higher-level prog lang designers]

Many languages have operations whose behavior is explicitly undefined on
part of their domain.  For example, in C the code `a[i] = i++' has two
interpretations: `j = i++, a[j] = j' and `j = i++, a[i] = j'.

If we want to, we can define an iteration construct `forp i = 1..n' that
executes all iterations in parallel.  If the programmer needs restricted
semantics, then the programmer restricts them explicitly.  The compiler
does not need to second-guess the programmer; the only question is
whether the compiler exploits the information given by the programmer.
So, for instance, we can define `ford i = 1..n' to be a `forp' in which
all iterations are executed in an order that satisfies the data
dependencies but no other constraints; again, no second-guessing is
needed.

	;-D on  ( Second-guessing the compiler-compiler )  Pardo
--
pardo@cs.washington.edu
{rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo