anw@maths.nott.ac.uk (Dr A. N. Walker) (11/09/90)
In article <5085@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >By the way, as I've pointed out before, the _LACK_ of aliasing >should be the default - the presence of possible aliasing should >require the assertion. It makes sense to make the most commonly >desired configuration be the default. ^^^^^^^ The key word here is "desired". The programmer may "desire" that the program run like a bat out of hell and that what it does is what it seems to do. These desires sometimes conflict. Anyway, what's worrying me now is the prospect of a replay of this debate in respect of parallelisation. The following C fragment is erroneous: int i = 0; int f() { return i++; } ... { int j = (f(), 2) + (f(), 2); printf ("%d\n", i); } because the two invocations of "f" may be run in parallel, so that the two occurrences of "i++" may interfere with each other. But "f" doesn't know it's going to be invoked in parallel, and the rest of the program doesn't know that "f" is dangerous. In real life, this has rarely mattered so far. If there are two separate invocations of "f" on a serial processor, you will get 2 for "i" no matter how you re-order the expression for "j"; if (as in Fortran) you optimise out one of the calls, you get 1, and the programmer who complains can be pointed to some corner of the language definition. But if the two calls of "f" are handed over to different processors, you will usually get 2, but occasionally 1. Insidious bugs strike again. Of course, we never write code like that. But it is easy to write code like "g() + h()" where "g" and "h" happen (unknown to the writer) both to invoke "f". In languages with semaphores or similar, we can write "f" like sema s = 1; f() { int k; down s; k = i++; up s; return k; } but who is going to go to so much trouble [and why will we think of it in the first place]? What should users of C and Fortran do? Pray? -- Andy Walker, Maths Dept., Nott'm Univ., UK. anw@maths.nott.ac.uk
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/09/90)
In article <1990Nov8.183058.25045@maths.nott.ac.uk> anw@maths.nott.ac.uk (Dr A. N. Walker) writes: > int f() { return i++; } > { int j = (f(), 2) + (f(), 2); > printf ("%d\n", i); [ two f()s may run in parallel, so i's value is undefined ] One thing that helps detect this situation is pure functions, which depend only on their arguments and local variables initialized from their arguments. (Clearly subroutines of pure functions must be pure.) Once you add the ``pure'' keyword, the compiler can warn about parallel non-pure functions. It also helps vectorization if you can assert that a series of statements is pure. [ semaphores ] > but who is going to go to so much trouble [and why will we think of > it in the first place]? What should users of C and Fortran do? > Pray? Be careful with static and global variables? ---Dan
jlg@lanl.gov (Jim Giles) (11/10/90)
From article <1990Nov8.183058.25045@maths.nott.ac.uk>, by anw@maths.nott.ac.uk (Dr A. N. Walker): > [... lots of stuff about functions with side-effects in parallel ...] > [... ending with a solution involving explicit semaphores ...] > sema s = 1; > f() { int k; down s; k = i++; up s; return k; } > > but who is going to go to so much trouble [and why will we think of > it in the first place]? What should users of C and Fortran do? > Pray? Hard to tell. The problem, as stated, is beyond the language definitions of either language (and most other popular languages as well). Certainly the semaphore idea will work (since semaphores are presumed to be an extension available on all parallel machines). My own opinion is that functions with side-effects should be explicitly identified and the the compiler should be responsible for them. On a serial machine, the compiler would just be constrained against reordering or eliminating calls of such functions. On a parallel machine, the compiler would be required, not just to exclude concurrent operations on side-effect functions, but to enforce a specific order on them. This means that side-effects would be expensive - but we knew that already. J. Giles
jlg@lanl.gov (Jim Giles) (11/10/90)
From article <24674:Nov906:50:1890@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein): > In article <1990Nov8.183058.25045@maths.nott.ac.uk> anw@maths.nott.ac.uk (Dr A. N. Walker) writes: >> int f() { return i++; } >> { int j = (f(), 2) + (f(), 2); >> printf ("%d\n", i); > [ two f()s may run in parallel, so i's value is undefined ] > > One thing that helps detect this situation is pure functions, which > depend only on their arguments and local variables initialized from > their arguments. [...] Finally an agreement! Pretty much the same thing I said. However, I'm not too sure about your default - shouldn't 'pure' be the default and 'impure' require the extra declaration? :-) I'm only half kidding here - I haven't got an opinion about which should be the default, but it should not be selected by arbitrary means. Do an experiment to find the most common usage. Do another to find the most common expectations of programmers. Then pick you default. > [...] (Clearly subroutines of pure functions must be pure.) And how do you determine _this_ at compile time? :-) Again, only half kidding. This is another thing that the loader would be very good at enforcing. I can _imagine_ arcane schemes involving header files to do this earlier than load-time, but it again requires that the headers and the routines themselves actually match. J. Giles
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/10/90)
In article <5499@lanl.gov> jlg@lanl.gov (Jim Giles) writes: > From article <24674:Nov906:50:1890@kramden.acf.nyu.edu>, by brnstnd@kramden.acf.nyu.edu (Dan Bernstein): > > One thing that helps detect this situation is pure functions, which > > depend only on their arguments and local variables initialized from > > their arguments. [...] > Finally an agreement! Pretty much the same thing I said. Shall we get into an argument about who posted first? :-) (I did!) > However, > I'm not too sure about your default - shouldn't 'pure' be the > default and 'impure' require the extra declaration? :-) I'm pretty sure that impure is the right default, for various reasons. > > [...] (Clearly subroutines of pure functions must be pure.) > And how do you determine _this_ at compile time? :-) Again, only > half kidding. This is another thing that the loader would be very > good at enforcing. I can _imagine_ arcane schemes involving header > files to do this earlier than load-time, but it again requires that > the headers and the routines themselves actually match. Jim, can you please shed your paranoia about .h files that don't reflect c files? There's absolutely nothing arcane about putting purity into prototypes in header files, and enforcing these restrictions locally. It works perfectly. ---Dan